Form Recognizer
An extension to the Vision family of Azure Cognitive Services, Form Recognizer is an AI-powered document extraction service that extracts key-value pairs and table data from documents (PDF, JPG, or PNG). One of the key benefits of the service is that it is fully managed and does not require any manual labeling: simply train a custom model on a small training data set (either 5 filled-in forms, or 2 filled-in forms plus 1 empty form) and you are ready to go.
Note: The service was announced in May 2019 at Microsoft Build in limited public preview. To request access, fill in and submit the access request form (https://aka.ms/FormRecognizerRequestAccess). Once approved, you will receive an email with instructions for accessing the service.
Pre-Requisites
An Azure Subscription
Access to the Form Recognizer Public Preview (Request Access)
A Form Recognizer resource (created via the Azure Portal)
Postman (an API testing tool)
1. Create a Form Recognizer Resource
In order to create a Form Recognizer resource, we need to provide the following pieces of information: Name, Subscription, Location, Pricing Tier, and Resource Group.
A couple of things to note:
Form Recognizer is currently available at two price points:
Free (0 - 500 pages per month)
S0 (Custom $25 or Pre-built $5 per 1,000 pages)
The free tier is sufficient for this demo. For more information, check out the pricing page.
At the time of this post, the service can be deployed in two regions: West US 2 (westus2) and West Europe (westeurope).
As the service is currently in limited public preview, the resource can only be created by clicking the embedded Azure Portal link within the access confirmation email (i.e. it cannot currently be found by searching the marketplace).
2. Upload a Training Data Set to Azure Blob Storage
Once we have our Form Recognizer resource, we need to curate a data set to train the custom model. The source of the training data can either be local or reside externally within an Azure Blob Storage container. In this example we have used the Azure Storage Explorer to upload a training data set to a container. If you would like to use the same files as used in this demo, download the training set here. Alternatively, the documentation provides some guidelines on building a training data set for a custom model.
3. Generate a Shared Access Signature URI
In order to give the Form Recognizer resource access to our training data set, we must generate a Shared Access Signature URI. To do this in Azure Storage Explorer, right-click on the container and click “Get Shared Access Signature…”. For the purposes of the demo, the default property values (Access Policy, Start Time, Expiry Time, Permissions) can be left as is. Click “Create” and copy the URL; it will be needed in the next step.
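Before pasting the URL into the next step, it can help to sanity-check it. The sketch below is not part of the original walkthrough: the function name `inspect_sas_url` is hypothetical, and it assumes that training needs the read and list permissions on the container, which the service documentation should confirm. It relies only on the standard SAS query parameters (`sp` for permissions, `se` for expiry, `sig` for the signature).

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def inspect_sas_url(sas_url, now=None):
    """Return a list of problems found in a container SAS URL (empty if none)."""
    now = now or datetime.now(timezone.utc)
    query = parse_qs(urlsplit(sas_url).query)
    problems = []

    # "sig" carries the signature; without it the URL grants no access.
    if "sig" not in query:
        problems.append("missing signature (sig)")

    # "sp" lists the granted permissions, one letter each (r = read, l = list).
    # Assumption: training needs both read and list on the container.
    permissions = query.get("sp", [""])[0]
    for letter, name in (("r", "read"), ("l", "list")):
        if letter not in permissions:
            problems.append(f"missing {name} permission")

    # "se" is the expiry time in ISO 8601 form, normally UTC with a "Z" suffix.
    expiry_text = query.get("se", [""])[0]
    if not expiry_text:
        problems.append("missing expiry (se)")
    else:
        expiry = datetime.fromisoformat(expiry_text.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry <= now:
            problems.append("token has expired")
    return problems
```

An empty list means the URL at least looks usable; it does not prove the signature itself is valid, which only the storage service can verify.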
4. Train a Custom Model
In order for Form Recognizer to ingest and learn from our training data set, we must invoke the Train Model API. To do this via Postman, set up an HTTP request with the following parameters:
Method: POST
Endpoint: https://{{region}}.api.cognitive.microsoft.com/formrecognizer/v1.0-preview/custom/train
Headers
Content-Type: application/json
Ocp-Apim-Subscription-Key: {{subscription_key}}
Body: See sample below.
source: Specify a source path (local or Azure Blob Storage).
sourceFilter\prefix: A case sensitive prefix string to filter content at the source location.
sourceFilter\includeSubFolders: A boolean to indicate if sub folders should be included.
{
"source": "{{shared_access_signature}}",
"sourceFilter": {
"prefix": "{{prefix}}",
"includeSubFolders": true
}
}
Once complete, the Postman request should have the method set to POST and the Headers and Body populated. Click Send to begin training your custom model. Upon success, the service will return an HTTP 200 response that includes details about the training documents as well as a modelId.
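If you would rather script this step than use Postman, the same request can be sketched in Python with only the standard library. The function names `build_train_request` and `train_model` are hypothetical; the endpoint, headers, and body fields are the ones listed above. Treat it as a minimal sketch with no error handling rather than a reference client.

```python
import json
import urllib.request

# Endpoint pattern taken from the Train Model parameters above.
TRAIN_URL = (
    "https://{region}.api.cognitive.microsoft.com"
    "/formrecognizer/v1.0-preview/custom/train"
)


def build_train_request(region, subscription_key, sas_url,
                        prefix="", include_subfolders=True):
    """Build (but do not send) the POST request for the Train Model API."""
    body = {
        "source": sas_url,
        "sourceFilter": {
            "prefix": prefix,
            "includeSubFolders": include_subfolders,
        },
    }
    return urllib.request.Request(
        url=TRAIN_URL.format(region=region),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


def train_model(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `train_model(build_train_request("westus2", key, sas_url))` would send the request; the returned dictionary should contain the modelId needed in the next step.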
5. Analyze a Document
Now that we have successfully trained a custom model, we can use the Analyze Model API in conjunction with our new modelId to extract data from a new document. To do this via Postman, create a new HTTP request with the following parameters:
Method: POST
Endpoint: https://{{region}}.api.cognitive.microsoft.com/formrecognizer/v1.0-preview/custom/models/{{model_id}}/analyze
Headers
Content-Type: image/png
Ocp-Apim-Subscription-Key: {{subscription_key}}
Body: Binary upload of the image file.
If you would like to follow along with the demo in this post, download the image file to be uploaded as part of the POST body.
Note: You can optionally add a query parameter to limit the results by keys (e.g. …/analyze?keys=Trailing P/E).
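As with training, the analyze call can be sketched in Python using the standard library. The name `build_analyze_request` is hypothetical. The sketch assumes a PNG upload by default and passes a single `keys` value through as given, since the format for requesting several keys is not covered in this post.

```python
import urllib.request
from urllib.parse import quote, urlencode

# Endpoint pattern taken from the Analyze parameters above.
ANALYZE_URL = (
    "https://{region}.api.cognitive.microsoft.com"
    "/formrecognizer/v1.0-preview/custom/models/{model_id}/analyze"
)


def build_analyze_request(region, subscription_key, model_id, document_bytes,
                          content_type="image/png", keys=None):
    """Build (but do not send) the POST request for the analyze endpoint."""
    url = ANALYZE_URL.format(region=region, model_id=quote(model_id, safe=""))
    if keys:
        # Percent-encode spaces and slashes, e.g. "Trailing P/E".
        url += "?" + urlencode({"keys": keys}, quote_via=quote)
    return urllib.request.Request(
        url=url,
        data=document_bytes,  # raw bytes of the PDF, JPG, or PNG file
        headers={
            "Content-Type": content_type,
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )
```

To send it, read the file with `open(path, "rb").read()`, build the request, and pass it to `urllib.request.urlopen`; the response body is JSON.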
That’s it! You have successfully trained a custom model and used that model to analyze a new document.
To increase your familiarity with the rest of the APIs (Get Models, Get Model, Get Keys, Analyze Form, and Delete Model), see the API Reference section below.
Service Limitations
While the Form Recognizer service will inevitably improve over time, there are several limitations to be aware of (e.g. printed text only, English only, checkboxes and radio buttons not supported).
Content Types: PDF (application/pdf), JPG (image/jpeg), or PNG (image/png)
File Size: < 4 MB
Number of Pages: < 50
Minimum Image Size: 50 x 50
Maximum Image Size: 4200 x 4200
Supported Language(s): English
Character Type: Printed (not Handwritten)
Keys: Must appear above or to the left of the values (not below or to the right)
Checkboxes and Radio Buttons are not supported
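Several of these limits can be checked locally before a file is uploaded. The sketch below is illustrative only: `check_document` is a hypothetical helper, "4 MB" is read as 4 x 1024 x 1024 bytes, and image dimensions are assumed to be in pixels. The caller supplies the page count and dimensions, because reading them from a file would need a PDF or imaging library.

```python
ALLOWED_TYPES = {"application/pdf", "image/jpeg", "image/png"}
MAX_FILE_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" read as 4 MiB
MAX_PAGES = 50
MIN_SIDE, MAX_SIDE = 50, 4200     # assumption: dimensions are in pixels


def check_document(content_type, size_bytes, pages=None, width=None, height=None):
    """Return a list of limit violations (empty if none were found)."""
    problems = []
    if content_type.lower() not in ALLOWED_TYPES:
        problems.append(f"unsupported content type: {content_type}")
    if size_bytes >= MAX_FILE_BYTES:
        problems.append("file is not smaller than 4 MB")
    if pages is not None and pages >= MAX_PAGES:
        problems.append("document does not have fewer than 50 pages")
    if width is not None and height is not None:
        if width < MIN_SIDE or height < MIN_SIDE:
            problems.append("image smaller than 50 x 50")
        if width > MAX_SIDE or height > MAX_SIDE:
            problems.append("image larger than 4200 x 4200")
    return problems
```

The language, handwriting, key-position, and checkbox limitations concern the content of the form and cannot be checked this way.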
API Reference
The table below summarizes the Form Recognizer APIs. For more detail, check out the complete API reference documentation.
Method | Name | API | Description |
---|---|---|---|
POST | Train Model | custom/train | Create and train a custom model. |
POST | Analyze Form | custom/models/{id}/analyze[?keys] | Extract key-value pairs from a given document. |
GET | Get Models | custom/models | Get information about all trained custom models. |
GET | Get Model | custom/models/{id} | Get information about a model. |
DELETE | Delete Model | custom/models/{id} | Delete model artifacts. |
GET | Get Keys | custom/models/{id}/keys | Retrieve the keys for a model. |
POST | Analyze Receipt | prebuilt/receipt/asyncBatchAnalyze | Extract values from a given receipt document. |
GET | Get Receipt Result | prebuilt/receipt/operations/{operationId} | Retrieve the result of an Analyze Receipt operation. |
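The model management rows of the table translate naturally into a small lookup. The sketch below uses hypothetical names (`ROUTES`, `build_model_request`) and covers only Get Models, Get Model, Get Keys, and Delete Model. It assumes these calls need nothing beyond the subscription key header used earlier in this post.

```python
import urllib.request

BASE_URL = "https://{region}.api.cognitive.microsoft.com/formrecognizer/v1.0-preview/"

# Method and path for each operation, copied from the table above.
ROUTES = {
    "get_models": ("GET", "custom/models"),
    "get_model": ("GET", "custom/models/{id}"),
    "get_keys": ("GET", "custom/models/{id}/keys"),
    "delete_model": ("DELETE", "custom/models/{id}"),
}


def build_model_request(name, region, subscription_key, model_id=""):
    """Build (but do not send) a request for one of the model management APIs."""
    method, path = ROUTES[name]
    if "{id}" in path and not model_id:
        raise ValueError(f"{name} requires a model_id")
    url = BASE_URL.format(region=region) + path.format(id=model_id)
    return urllib.request.Request(
        url=url,
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method=method,
    )
```

For example, `build_model_request("get_keys", "westus2", key, model_id)` produces a GET request for the keys of a trained model, which can then be sent with `urllib.request.urlopen`.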
Additional Resources
Access Request (https://aka.ms/FormRecognizerRequestAccess)
Product Page (https://aka.ms/form-recognizer)
Documentation (https://aka.ms/form-recognizer/docs)
API Reference (https://aka.ms/form-recognizer/api)
Pricing (https://azure.microsoft.com/en-us/pricing/details/cognitive-services/form-recognizer/)