The administrator can perform field extraction training when a project has to be trained before the operators start working on it. Training by the administrator may also be required if auto-learning proves insufficient.
In the standard scenario, complete the following steps:
  1. Create a Document Definition.
  2. In the section properties of the Document Definition, select Allow field location training.
  3. Create the necessary fields in the section. Select Can have region in the properties of each field.
  4. Save and publish the Document Definition.
  5. Switch to Field Extraction Training Batches mode and create a new batch.
  6. Select the Document Definition that you have created.
  7. Select Default variant from the list of variants.
  8. Load document images and recognize them. We recommend loading 3 to 50 images for each kind of document.
  9. Adjust the positions of the fields.
  10. Set the following states for your documents:
    • Select some of the documents, right-click the selection, and click Set Document State → For training on the shortcut menu.
    • Select the remaining documents, right-click the selection, and click Set Document State → For testing on the shortcut menu.
    Note: We recommend using 60% of the documents in the training batch for training and the remaining 40% for testing.
  11. Train the program to extract the fields:
    • (Standalone) Click Fields Training → (Standalone) Train.
    • (Distributed) Click Fields Training → (Distributed) Train to start the training on the same machine where the Project Setup Station component is installed.
    • (Distributed) Right-click the batch and select (Distributed) Send for Training on the shortcut menu if you want the training to be performed on a Processing Station.
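The recommended 60/40 division between training and testing documents can be planned before you set the document states. The following Python snippet is only an illustrative sketch: it does not call any ABBYY FlexiCapture API, the function name split_for_training and the file names are hypothetical, and the states themselves still have to be set through the shortcut menu as described above.

```python
import random


def split_for_training(documents, train_share=0.6, seed=0):
    """Split a list of documents into (training, testing) lists.

    train_share=0.6 mirrors the recommended 60% / 40% division.
    The helper is hypothetical and independent of FlexiCapture.
    """
    if len(documents) < 2:
        raise ValueError("need at least two documents to form both sets")
    shuffled = list(documents)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Round to the nearest whole document, but keep both sets non-empty.
    n_train = round(len(shuffled) * train_share)
    n_train = max(1, min(len(shuffled) - 1, n_train))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical image names for one kind of document.
pages = [f"invoice_{i:02d}.tif" for i in range(1, 11)]
training, testing = split_for_training(pages)
print(len(training), len(testing))  # prints "6 4"
```

Shuffling before the split avoids placing all images from one scan session in the same set. Whether that matters depends on how the images were collected.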
For documents with variable field locations, you need to create variants and train a classifier to distinguish between them. For more information about documents with variable field locations and about variants, see Variable field locations on documents that belong to the same type.
Complete the following steps:
  1. Create a Document Definition.
  2. In the section properties of the Document Definition, select Allow field location training.
  3. Create the necessary fields in the section. Select Can have region in the properties of each field.
  4. Add section variants using one of the following three methods:
    • Create variants manually. To do this, click the Data Sets tab in the section properties and then click the View… button. Then click the Add… button to add variants.
    • Load variants from a database. To do this, click the Data Sets tab in the section properties and then click the Set Up… button. From the drop-down list, select Database as the data source.
    • Create variants using a script. To do this, click the Data Sets tab in the section properties and then click the Set Up… button. From the drop-down list, select Script as the data source.
  5. Save and publish the Document Definition.
  6. Train a classifier on the newly created variants:
    • Switch to Open Classifier Training Batches mode and load document images into a new batch.
    • Assign a reference class to each document, using variants as separate classes:
      • Click Set Class… → Add… → Add…
      • Select Specify variant.
      • Select a variant from the list.
    • Train a classifier by clicking Classification Training → Train.
Note: You can create a classifier in FlexiLayout Studio and then import it into ABBYY FlexiCapture. For details, see Exporting and importing classifiers.
  7. Train ABBYY FlexiCapture to detect the field regions on each variant:
    • Switch to Field Extraction Training Batches mode.
    • Create a new batch. Select your Document Definition and then select a variant to train.
    • Load document images and recognize them. We recommend loading 3 to 50 images for each kind of document.
    • Adjust the positions of the fields.
    • Set the following states for your documents:
      • Select some of the documents, right-click the selection, and click Set Document State → For training on the shortcut menu.
      • Select the remaining documents, right-click the selection, and click Set Document State → For testing on the shortcut menu.
    • Train the program to extract the fields:
      • (Standalone) Click Fields Training → (Standalone) Train.
      • (Distributed) Click Fields Training → (Distributed) Train to start the training on the same machine where the Project Setup Station component is installed.
      • (Distributed) Right-click the batch and select (Distributed) Send for Training on the shortcut menu if you want the training to be performed on a Processing Station.
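Because each variant is trained in its own batch, it can help to check beforehand that every variant has enough sample images. The sketch below assumes that "each kind of document" in the 3 to 50 recommendation can be read as "each variant"; that reading, the function name, and the sample data are all assumptions, and the snippet does not use any ABBYY FlexiCapture API.

```python
from collections import Counter

# Recommended number of images, taken from the steps above.
RECOMMENDED_MIN, RECOMMENDED_MAX = 3, 50


def variants_outside_recommendation(assignments, known_variants):
    """Return {variant: image_count} for variants outside the 3 to 50 range.

    assignments maps an image name to its variant name (hypothetical data);
    known_variants lists every variant, including those with no images yet.
    """
    counts = Counter(assignments.values())
    return {
        variant: counts[variant]  # Counter yields 0 for a missing variant
        for variant in known_variants
        if not RECOMMENDED_MIN <= counts[variant] <= RECOMMENDED_MAX
    }


# Hypothetical image-to-variant assignments.
assignments = {
    "a1.tif": "Vendor A", "a2.tif": "Vendor A", "a3.tif": "Vendor A",
    "b1.tif": "Vendor B",
}
problems = variants_outside_recommendation(
    assignments, ["Vendor A", "Vendor B", "Vendor C"]
)
print(problems)  # prints "{'Vendor B': 1, 'Vendor C': 0}"
```

In this example, Vendor B and Vendor C would need more images before a training batch is created for them.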
When working with the training results, you may need to check which variant was assigned to a document and edit it if necessary. To display the IDs of the variants on the form, create a service field. For details, see Enabling operators to change variants.
We recommend configuring auto-learning for field extraction. With auto-learning configured, the program automatically learns to extract fields as the operators work on the configured project.

(Distributed) Sending training batches to a Processing Station for training

(Distributed) As the training process can take a long time and consume a lot of computational resources, the administrator can choose to train batches on a Processing Station.
(Distributed) Before sending a training batch to a Processing Station, make sure that:
  • (Distributed) At least one Processing Station has been added on the Processing Server.
  • (Distributed) The project has been uploaded to the server.
(Distributed) To send a training batch to a Processing Station, right-click the batch and select (Distributed) Send for Training on the shortcut menu. Once the batch has been sent, a training task appears in the task queue in the Processing Server Monitor, and the training itself takes place on the Processing Station. Multiple training batches can be sent to a Processing Station for training. If required, the administrator can cancel a training task in the Processing Server Monitor. A training batch that has been sent for training is locked until its training completes, and the administrator cannot make any changes to it during that time.