Classifying and Extracting Documents Using the Analyze Lending API in Amazon Textract

Overview of Amazon Textract

Amazon Textract is a fully managed machine learning service that automatically extracts data from handwritten, printed, or scanned documents. Its optimized machine learning models can identify forms, tables, and raw text (as words or lines), and you can query a document to fetch specific values. A newer add-on feature can also detect whether a document contains a signature.

Introduction

Amazon Textract now offers an API built specifically for mortgage loan documents, which classifies and analyzes a document package page by page. The Analyze Lending API was introduced to help lenders process mortgage loan applications faster and with less human error, which in turn improves the customer experience. Let's take a quick look at how it works.

Analyze Lending APIs

The Analyze Lending workflow has three APIs for processing mortgage-related documents: StartLendingAnalysis, GetLendingAnalysis, and GetLendingAnalysisSummary.

  • StartLendingAnalysis:

This API takes a package of mortgage-related documents as input and starts classifying and analyzing them.

  • GetLendingAnalysis:

Once the input documents have been processed and classified, GetLendingAnalysis returns the results. For each page, it reports the page number, the page classification (the document type detected on that page), and the fields extracted from that page. The extracted fields are returned as key-value pairs.

  • GetLendingAnalysisSummary:

After processing, you can also get a summary of the input package, in which the documents are grouped by document type. It reports the total pages scanned, pages classified, pages unclassified, document types detected, and documents with signatures.

Note: To use the Analyze Lending API, you must store the input document in an Amazon S3 bucket. The API runs as an asynchronous operation.
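Because the workflow is asynchronous, a caller typically starts a job and then polls until it finishes. The following is a minimal sketch of that pattern, not a production implementation. It assumes a boto3 Textract client is passed in as `client`; the helper name `run_lending_analysis` and its polling parameters are hypothetical and chosen only for illustration.

```python
import time


def run_lending_analysis(client, bucket, key, poll_seconds=5, max_polls=60):
    """Start an Analyze Lending job for an S3 object and wait for it to finish.

    `client` is assumed to be a boto3 Textract client (or any object that
    exposes the same two methods). Returns (job_id, final_status).
    """
    # StartLendingAnalysis is asynchronous: it returns a JobId right away.
    job = client.start_lending_analysis(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    job_id = job["JobId"]

    # Poll GetLendingAnalysisSummary until the job is no longer IN_PROGRESS.
    for _ in range(max_polls):
        summary = client.get_lending_analysis_summary(JobId=job_id)
        status = summary["JobStatus"]
        if status != "IN_PROGRESS":
            return job_id, status
        time.sleep(poll_seconds)

    raise TimeoutError(f"Job {job_id} still in progress after {max_polls} polls")
```

With boto3 installed and AWS credentials configured, this could be called as `run_lending_analysis(boto3.client("textract"), "my-bucket", "loan-package.pdf")`, where the bucket and object names are placeholders. For larger workloads, a completion notification is usually a better fit than polling in a loop.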

Add-on Feature of Textract: Signature Detection

Textract can analyze any given document and detect whether it contains a signature. If it does, Textract also returns the bounding box of the signature. Signatures can take different forms, such as handwritten signatures, e-signatures, and initials. Detection relies on a pre-trained machine learning model that has been trained on a wide variety of financial, tax, and insurance documents. This reduces the need for human reviewers, which helps customers save cost and avoid human error.
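Outside the lending workflow, signature detection can also be requested through AnalyzeDocument with the SIGNATURES feature type, and detected signatures come back as blocks of type SIGNATURE. The sketch below shows one way to pull those blocks out of a response. The helper name `find_signatures` is hypothetical, and `sample_response` is a hand-written stand-in with made-up values, not real Textract output.

```python
def find_signatures(response, min_confidence=0.0):
    """Return page, confidence, and bounding box for each SIGNATURE block."""
    signatures = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "SIGNATURE":
            continue
        if block.get("Confidence", 0.0) < min_confidence:
            continue
        signatures.append(
            {
                "page": block.get("Page", 1),
                "confidence": block.get("Confidence"),
                # BoundingBox values are ratios of the page width and height.
                "bounding_box": block["Geometry"]["BoundingBox"],
            }
        )
    return signatures


# Hand-written sample shaped like an AnalyzeDocument response (values made up).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Page": 1},
        {
            "BlockType": "LINE",
            "Page": 1,
            "Confidence": 99.1,
            "Text": "Pay to the order of",
        },
        {
            "BlockType": "SIGNATURE",
            "Page": 1,
            "Confidence": 87.5,
            "Geometry": {
                "BoundingBox": {
                    "Width": 0.25,
                    "Height": 0.08,
                    "Left": 0.62,
                    "Top": 0.78,
                }
            },
        },
    ]
}

signatures = find_signatures(sample_response, min_confidence=50.0)
for sig in signatures:
    print(sig["page"], sig["confidence"], sig["bounding_box"])
```

A real response would come from a call such as `client.analyze_document(Document={"S3Object": {"Bucket": ..., "Name": ...}}, FeatureTypes=["SIGNATURES"])` on a boto3 Textract client. The confidence threshold is a design choice: a higher value trades missed signatures for fewer false positives.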

Real-world example:

In document processing, we often encounter documents that must carry a signature. Such a document is considered valid only if a signature is present. For example, a cheque without a signature is invalid.

The following table lists the document types that the Analyze Lending API extracted successfully, along with whether a signature was present in each.

Table

Demo view of Analyze Lending in the Management Console

In this demo view of Analyze Lending in the management console, you can upload a package of documents.

Note: The document package must be less than 10 pages and 5 MB in size. Supported formats are JPEG, PNG, TIFF, and PDF.

To showcase the results of this API, I uploaded a sample PDF package containing 2 pages.

Page 1: A receipt for my purchase

Page 2: A dummy cheque of mine with a signature

Fig 1

In Fig 1 above, I have uploaded the sample package, and the summary shows that 2 pages were scanned in total. Textract detected 2 different document types in the uploaded package and found a signature on one page.
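The same kind of summary can be derived programmatically from a GetLendingAnalysisSummary response. The sketch below assumes the response follows the documented shape (DocumentMetadata plus Summary.DocumentGroups). The helper name `summarize_lending_job` is hypothetical, and `sample_summary` is a hand-written stand-in for the two-page demo package. Its document type labels are placeholders, not labels confirmed from a real job.

```python
def summarize_lending_job(summary_response):
    """Condense a GetLendingAnalysisSummary response into console-style totals."""
    groups = summary_response.get("Summary", {}).get("DocumentGroups", [])
    pages_by_type = {}
    signed_pages = set()
    for group in groups:
        # Each group holds one document type, possibly split into several documents.
        pages = [
            page
            for doc in group.get("SplitDocuments", [])
            for page in doc.get("Pages", [])
        ]
        pages_by_type[group["Type"]] = sorted(pages)
        signed_pages.update(
            sig["Page"] for sig in group.get("DetectedSignatures", [])
        )
    return {
        "total_pages": summary_response.get("DocumentMetadata", {}).get("Pages"),
        "pages_by_type": pages_by_type,
        "pages_with_signature": sorted(signed_pages),
    }


# Hand-written sample mirroring the two-page demo (type labels are placeholders).
sample_summary = {
    "JobStatus": "SUCCEEDED",
    "DocumentMetadata": {"Pages": 2},
    "Summary": {
        "DocumentGroups": [
            {
                "Type": "RECEIPT",
                "SplitDocuments": [{"Index": 1, "Pages": [1]}],
                "DetectedSignatures": [],
                "UndetectedSignatures": [],
            },
            {
                "Type": "CHECK",
                "SplitDocuments": [{"Index": 1, "Pages": [2]}],
                "DetectedSignatures": [{"Page": 2}],
                "UndetectedSignatures": [],
            },
        ]
    },
}

overview = summarize_lending_job(sample_summary)
print(overview)
```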

Fig 2

In Fig 2, each page has been classified into a document type: page 1 is classified as a receipt, and page 2 is classified as a check on which a signature is detected.
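The per-page view corresponds to the Results list returned by GetLendingAnalysis, where each entry carries a page number, a page classification, and the extractions for that page. The sketch below picks the highest-confidence page type for each page and flags pages with a detected signature. The helper name `classify_pages` is hypothetical, and `sample_results` is hand-written with made-up values and placeholder type labels.

```python
def classify_pages(results):
    """Return the best page type and a signature flag for each page result."""
    pages = []
    for result in results:
        page_types = result.get("PageClassification", {}).get("PageType", [])
        # Keep the prediction with the highest confidence, if there is one.
        best = max(page_types, key=lambda p: p.get("Confidence", 0.0), default=None)
        # Signatures are reported inside the lending document extraction.
        has_signature = any(
            extraction.get("LendingDocument", {}).get("SignatureDetections")
            for extraction in result.get("Extractions", [])
        )
        pages.append(
            {
                "page": result.get("Page"),
                "type": best["Value"] if best else None,
                "has_signature": has_signature,
            }
        )
    return pages


# Hand-written sample shaped like GetLendingAnalysis "Results" (values made up).
sample_results = [
    {
        "Page": 1,
        "PageClassification": {
            "PageType": [{"Value": "RECEIPT", "Confidence": 98.0}]
        },
        "Extractions": [{"ExpenseDocument": {}}],
    },
    {
        "Page": 2,
        "PageClassification": {
            "PageType": [{"Value": "CHECK", "Confidence": 97.0}]
        },
        "Extractions": [
            {
                "LendingDocument": {
                    "LendingFields": [],
                    "SignatureDetections": [{"Confidence": 90.0}],
                }
            }
        ],
    },
]

page_overview = classify_pages(sample_results)
for entry in page_overview:
    print(entry)
```

For multi-page packages, GetLendingAnalysis may return results in several batches, so a real caller would keep requesting with the returned NextToken until none is returned.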

Fig 3

Further down in the console, you can explore the document page by page.

Conclusion

In this blog, we have seen how the Analyze Lending API in Amazon Textract helps customers process mortgage documents in a single upload. In short, it can classify the pages, split the package by document type, and extract valuable information from loan documents. In addition, it can detect signatures on the documents.

FAQs

1. What are the languages supported in Textract?

ANS: – Currently English, Spanish, Italian, Portuguese, French, and German. Handwriting, invoices and receipts, identity documents, and query processing are supported in English only.

2. How do I access Amazon Textract in the management console?

ANS: – Go to https://aws.amazon.com/textract/ and sign in to the console.

3. Which regions support the Analyze Lending API and Signature Detection?

ANS: – They are available in US East (Ohio, N. Virginia), US West (N. California, Oregon), Asia Pacific (Mumbai, Seoul, Singapore, Sydney), Canada (Central), Europe (Frankfurt, Ireland, London, Paris), and AWS GovCloud (US).

Note: Support for new languages and regions may be added in the future.

WRITTEN BY Ganesh Raj

Ganesh Raj V works as a Sr. Research Associate at CloudThat. He is a highly analytical, creative, and passionate individual experienced in Data Science, Machine Learning algorithms, and Cloud Computing. In a quest to learn and work with recent technologies, he strives to stay updated on advanced technologies while solving problems efficiently and analytically.
