Open
Description
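from trp import trp2 as t2
from textractprettyprinter.t_pretty_print_layout import get_layout_csv_from_trp2

# job_results: presumably the Textract JSON response; here it includes a blank page
t_document = t2.TDocumentSchema().load(job_results)
layout_csv = get_layout_csv_from_trp2(t_document)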
>>>
AttributeError Traceback (most recent call last)
Cell In[3], line 1
----> 1 layout_csv = get_layout_csv_from_trp2(t_document)
File ~/.pyenv/versions/3.11.9/lib/python3.11/site-packages/textractprettyprinter/t_pretty_print_layout.py:263, in get_layout_csv_from_trp2(trp2_doc)
261 processed_ids = []
262 relationships: t2.TRelationship = page.get_relationships_for_type()
--> 263 blocks = [trp2_doc.get_block_by_id(id) for id in relationships.ids if relationships.ids]
264 layout_blocks = [
265 block for block in blocks if block.block_type in [
266 "LAYOUT_TITLE", "LAYOUT_HEADER", "LAYOUT_FOOTER", "LAYOUT_SECTION_HEADER", "LAYOUT_PAGE_NUMBER",
267 "LAYOUT_LIST", "LAYOUT_FIGURE", "LAYOUT_TABLE", "LAYOUT_KEY_VALUE", "LAYOUT_TEXT"
268 ]
269 ]
270 for idx, layout_block in enumerate(layout_blocks):
271 # for lists the output is special, because the LAYOUT_TEXTs do have a reference to the LAYOUT_LIST in the text
272 # so we grab the list and process all children
273 # probably could make this "easier" by keeping track of the len of CHILD relationships in LAYOUT_LIST
274 # but wanted to see if I can prepare the lists in lists, which may happen one point in the future...
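AttributeError: 'NoneType' object has no attribute 'ids'

The t_document contains a blank page, so relationships is set to None for that page.
The `if relationships.ids` filter on line 263 does not help, because relationships.ids is already accessed as the iterable before the filter runs.
get_layout_csv_from_trp2 is missing a handler for the edge case where relationships is None.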
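A minimal sketch of the kind of guard that would avoid the error. It does not import the library: `FakeRelationship`, `FakePage`, and `child_ids` are hypothetical stand-ins written for illustration, and they model only the behaviour visible in the traceback (`get_relationships_for_type()` returning None for a blank page). It is not the maintainers' fix.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeRelationship:
    # Hypothetical stand-in for t2.TRelationship; only `ids` is modelled.
    ids: Optional[List[str]] = None


@dataclass
class FakePage:
    # Hypothetical stand-in for a page block.
    relationships: Optional[FakeRelationship] = None

    def get_relationships_for_type(self) -> Optional[FakeRelationship]:
        # Mirrors the reported behaviour: None when the page is blank.
        return self.relationships


def child_ids(page: FakePage) -> List[str]:
    """Return the page's child block ids, or [] when there are none."""
    relationships = page.get_relationships_for_type()
    # Check for None before touching `.ids`, then for a missing/empty list.
    if relationships is None or not relationships.ids:
        return []
    return list(relationships.ids)


pages = [FakePage(FakeRelationship(ids=["a", "b"])), FakePage()]
print([child_ids(page) for page in pages])  # [['a', 'b'], []]
```

With a guard like this, a blank page would contribute an empty block list and the loop over layout blocks would simply skip it.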