Amazon Textract

Text generation

Use this optical character recognition (OCR) model to extract text from images. The model takes images as input and returns the detected text as annotations, each located on the page by a bounding box. Bounding boxes are reported at the word level.


Intended Use

Amazon Textract extracts printed text, handwriting, and structured data such as forms and tables from scanned documents, going beyond basic OCR. Each extracted item is returned with bounding box coordinates and a confidence score, so users can locate it on the page and judge how far to trust it.
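The bounding boxes and confidence scores described above can be read from the response as in the following minimal sketch. It assumes the boto3 `textract` client and its `detect_document_text` call, with credentials and region already configured. The helper names `detect_text` and `word_boxes`, the page dimensions, and the hand-written `SAMPLE_RESPONSE` are illustrative and not part of the service. Textract reports box coordinates as fractions of the page size and confidence on a 0 to 100 scale.

```python
from typing import Any


def detect_text(image_bytes: bytes) -> dict[str, Any]:
    """Send one image (PNG or JPEG bytes) to Textract's synchronous OCR call."""
    import boto3  # imported lazily so the parsing helper below also works offline

    client = boto3.client("textract")  # assumes credentials and region are configured
    return client.detect_document_text(Document={"Bytes": image_bytes})


def word_boxes(
    response: dict[str, Any],
    page_width: int,
    page_height: int,
    min_confidence: float = 0.0,
) -> list[dict[str, Any]]:
    """Return WORD blocks as pixel-space boxes, dropping low-confidence words."""
    words = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "WORD":
            continue  # skip PAGE and LINE blocks
        if block.get("Confidence", 0.0) < min_confidence:
            continue
        box = block["Geometry"]["BoundingBox"]  # ratios of the page size
        words.append(
            {
                "text": block["Text"],
                "confidence": block["Confidence"],
                "left": round(box["Left"] * page_width),
                "top": round(box["Top"] * page_height),
                "width": round(box["Width"] * page_width),
                "height": round(box["Height"] * page_height),
            }
        )
    return words


# Hand-written stand-in for a real response (hypothetical values).
SAMPLE_RESPONSE = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "p1"},
        {
            "BlockType": "WORD",
            "Id": "w1",
            "Text": "Invoice",
            "Confidence": 99.1,
            "Geometry": {
                "BoundingBox": {"Left": 0.1, "Top": 0.2, "Width": 0.25, "Height": 0.05}
            },
        },
        {
            "BlockType": "WORD",
            "Id": "w2",
            "Text": "t0tal",
            "Confidence": 61.4,
            "Geometry": {
                "BoundingBox": {"Left": 0.4, "Top": 0.2, "Width": 0.1, "Height": 0.05}
            },
        },
    ]
}
```

Calling `word_boxes(SAMPLE_RESPONSE, page_width=1000, page_height=800, min_confidence=90.0)` keeps only the high-confidence word; the threshold of 90 is an arbitrary choice for illustration and should be tuned per document type.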


Performance

  • Custom Queries: Amazon Textract allows customization of its pretrained Queries feature to enhance accuracy for specific document types while retaining data control. Users can upload and annotate a minimum of ten sample documents through the AWS Console to tailor the Queries feature within hours.

  • Layout: Amazon Textract extracts various layout elements from documents, including paragraphs, titles, and headers, via the Analyze Document API. This feature can be used independently or in conjunction with other document analysis features.

  • Optical Character Recognition (OCR): Textract’s OCR uses machine learning to detect both printed and handwritten text in documents and images. It handles a range of fonts and styles and can recognize text that is noisy or distorted.

  • Form Extraction: Textract identifies and retains key-value pairs from documents automatically, preserving their context for easier database integration. Unlike traditional OCR, it maintains the relationship between keys and values without needing custom rules.

  • Table Extraction: The service extracts tabular data from documents such as financial reports, medical records, or inventory reports and preserves its row and column structure, so the data can be imported into a database without being rebuilt by hand.

  • Signature Detection: Textract detects signatures on various documents and images, including checks and loan forms, and provides the location and confidence scores of these signatures in the API response.

  • Query-Based Extraction: Textract enables data extraction using natural language queries, eliminating the need to understand document structure or format variations. It’s pre-trained on a diverse set of documents, reducing post-processing and manual review needs.

  • Analyze Lending: The Analyze Lending API automates the extraction and classification of information from mortgage loan documents. It uses preconfigured machine learning models to organize and process loan packages upon upload.

  • Invoices and Receipts: Textract leverages machine learning to extract key data from invoices and receipts, such as vendor names, item prices, and payment terms, despite varied layouts. This reduces the complexity of manual data extraction.

  • Identity Documents: Textract uses ML to extract and understand details from identity documents like passports and driver’s licenses, including implied information such as a name or address that appears without an explicit label. This supports automated ID verification, account creation, and similar workflows without relying on templates.
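As a rough illustration of the form and query features listed above, the sketch below walks the block graph that `analyze_document` returns when called with `FeatureTypes=["FORMS", "QUERIES"]` (plus a `QueriesConfig` for the queries). The helper names and the hand-written `SAMPLE_BLOCKS` are hypothetical; only the block types and relationship types (`KEY_VALUE_SET`, `QUERY`, `QUERY_RESULT`, `CHILD`, `VALUE`, `ANSWER`) come from the Textract response format.

```python
from typing import Any

Block = dict[str, Any]


def _text_of(block: Block, by_id: dict[str, Block]) -> str:
    """Join the text of a block's CHILD words; selection marks become [x] or [ ]."""
    parts = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                parts.append(child["Text"])
            elif child["BlockType"] == "SELECTION_ELEMENT":
                parts.append("[x]" if child["SelectionStatus"] == "SELECTED" else "[ ]")
    return " ".join(parts)


def extract_key_values(blocks: list[Block]) -> dict[str, str]:
    """Map each form key to its value by following KEY -> VALUE relationships."""
    by_id = {b["Id"]: b for b in blocks}
    pairs = {}
    for block in blocks:
        if block["BlockType"] != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue  # VALUE blocks are reached through their KEY block
        values = [
            _text_of(by_id[value_id], by_id)
            for rel in block.get("Relationships", [])
            if rel["Type"] == "VALUE"
            for value_id in rel["Ids"]
        ]
        pairs[_text_of(block, by_id)] = " ".join(values)
    return pairs


def extract_query_answers(blocks: list[Block]) -> dict[str, str]:
    """Map each query (by alias if set, else by its text) to its answer text."""
    by_id = {b["Id"]: b for b in blocks}
    answers = {}
    for block in blocks:
        if block["BlockType"] != "QUERY":
            continue
        label = block["Query"].get("Alias") or block["Query"]["Text"]
        for rel in block.get("Relationships", []):
            if rel["Type"] == "ANSWER":
                for result_id in rel["Ids"]:
                    answers[label] = by_id[result_id]["Text"]
    return answers


# Hand-written stand-in for response["Blocks"] (hypothetical values).
SAMPLE_BLOCKS = [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w2", "w3"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Name:"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Jane"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Doe"},
    {"Id": "q1", "BlockType": "QUERY",
     "Query": {"Text": "What is the total?", "Alias": "TOTAL"},
     "Relationships": [{"Type": "ANSWER", "Ids": ["r1"]}]},
    {"Id": "r1", "BlockType": "QUERY_RESULT", "Text": "42.00", "Confidence": 97.0},
]
```

With a real response, the same helpers would be called as `extract_key_values(response["Blocks"])` and `extract_query_answers(response["Blocks"])`. Keeping the parsing separate from the API call makes it possible to test against saved responses without contacting the service.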


Limitations

  • May struggle with highly stylized fonts or severe document degradation

  • Handwriting recognition accuracy can vary based on writing style

  • Performance may decrease with complex, multi-column layouts

  • Limited ability to understand document context or interpret extracted data

  • May have difficulty with non-Latin scripts or specialized notation


Citation

https://docs.aws.amazon.com/textract/