[textractprettyprinter] does not return the last row of a table when using get_text_from_layout_json

I used the following code to extract information from documents, including text and tables:

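from textractcaller.t_call import call_textract, Textract_Features
from textractprettyprinter.t_pretty_print import get_text_from_layout_json

# byte_img (the image bytes) and textract_client (a boto3 Textract client)
# are created earlier in my script.
textract_json = call_textract(
    input_document=byte_img,
    features=[Textract_Features.TABLES, Textract_Features.LAYOUT, Textract_Features.FORMS],
    boto3_textract_client=textract_client,
)

# The result is indexed by page number; this document has a single page.
layout = get_text_from_layout_json(textract_json, exclude_figure_text=False)

# Text of page 1, or an empty string if no text was returned for that page.
full_text = layout.get(1, '')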


However, when I run it on the attached document (document_anonyme_1.jpg), the resulting text output (document_anonymise_1.txt) is missing the last row of the table: the row that contains "COPYRIGHT EOT ..." does not appear.
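To help tell whether the row is absent from the Textract response itself or only dropped when the layout text is built, the raw response could be inspected directly. The following is a minimal sketch, not part of any of the libraries above: `table_rows_from_response` and `rows_missing_from_text` are hypothetical helper names, and the sketch assumes the standard Textract block structure (TABLE blocks with CHILD relationships to CELL blocks, which in turn point to WORD blocks). It uses a small synthetic response so it runs without AWS access; with the real document, `textract_json` and `full_text` from the code above would be passed instead.

```python
from collections import defaultdict


def _child_ids(block):
    """Collect the Ids of all CHILD relationships of a block."""
    ids = []
    for rel in block.get("Relationships") or []:
        if rel["Type"] == "CHILD":
            ids.extend(rel["Ids"])
    return ids


def table_rows_from_response(response):
    """For each TABLE block, return its rows as lists of cell texts."""
    blocks = {b["Id"]: b for b in response.get("Blocks", [])}
    tables = []
    for block in response.get("Blocks", []):
        if block["BlockType"] != "TABLE":
            continue
        rows = defaultdict(dict)  # RowIndex -> {ColumnIndex: cell text}
        for cell_id in _child_ids(block):
            cell = blocks.get(cell_id)
            if cell is None or cell["BlockType"] != "CELL":
                continue
            words = [
                blocks[w]["Text"]
                for w in _child_ids(cell)
                if w in blocks and blocks[w]["BlockType"] == "WORD"
            ]
            rows[cell["RowIndex"]][cell["ColumnIndex"]] = " ".join(words)
        tables.append(
            [[cols[c] for c in sorted(cols)] for _, cols in sorted(rows.items())]
        )
    return tables


def rows_missing_from_text(tables, text):
    """Return rows with at least one non-empty cell whose text is not in `text`."""
    missing = []
    for rows in tables:
        for row in rows:
            if any(cell and cell not in text for cell in row):
                missing.append(row)
    return missing


# Synthetic stand-in for a Textract response (illustration only).
sample = {
    "Blocks": [
        {"Id": "t1", "BlockType": "TABLE",
         "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2", "c3", "c4"]}]},
        {"Id": "c1", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1,
         "Relationships": [{"Type": "CHILD", "Ids": ["w1"]}]},
        {"Id": "c2", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 2,
         "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
        {"Id": "c3", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 1,
         "Relationships": [{"Type": "CHILD", "Ids": ["w3", "w4"]}]},
        {"Id": "c4", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 2},
        {"Id": "w1", "BlockType": "WORD", "Text": "Item"},
        {"Id": "w2", "BlockType": "WORD", "Text": "Price"},
        {"Id": "w3", "BlockType": "WORD", "Text": "COPYRIGHT"},
        {"Id": "w4", "BlockType": "WORD", "Text": "EOT"},
    ]
}

tables = table_rows_from_response(sample)
print(tables)
print(rows_missing_from_text(tables, "Item Price"))
```

If the last row shows up in `table_rows_from_response(textract_json)` but is reported by `rows_missing_from_text(..., full_text)`, that would suggest the row is lost during layout linearization rather than by Textract. The substring check is deliberately crude, so differences in whitespace between the cell text and the layout text could produce false positives.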

Could you please help me resolve this issue?

For reference, I am using the following versions of the relevant packages:

amazon-textract-caller: 0.2.4

amazon-textract-prettyprinter: 0.1.10

amazon-textract-response-parser: 0.1.48

amazon-textract-textractor: 1.9.2
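
For anyone trying to reproduce this, the installed versions can be listed with the standard library. This is a small sketch; `installed_versions` is a hypothetical helper name, and the package names are the distribution names listed above.

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGES = [
    "amazon-textract-caller",
    "amazon-textract-prettyprinter",
    "amazon-textract-response-parser",
    "amazon-textract-textractor",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, if any."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = "not installed"
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver}")
```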

![Image](https://github.com/user-attachments/assets/b2b002ae-2ec9-4dd0-b78f-0fe574af4ec6)

[document_anonymise_1.txt](https://github.com/user-attachments/files/20832540/document_anonymise_1.txt)
