- Upload documents you plan to use to train and test the classifier (each file must contain only one document image). To upload your documents, navigate to the Documents tab in the Skill Designer and click the Upload documents button located in the center of the designer, on the toolbar, or in the Actions pane. When documents are being uploaded, a progress indicator is displayed at the top of the Skill Designer, to the right of the bookmarks. The indicator tooltip contains information about the number of documents that still need to be uploaded and processed. You can open a document preview window by clicking the icon located to the left of a document’s name. The default width of this window is 35% of the browser window width. If needed, you can increase the width of the preview window up to 80% of the browser window’s width by dragging the left border of the preview window. User-specified widths are preserved until the browser cache is cleared.
- In your training documents, label all the fields from which data must be extracted, specifying their location on the image. You can also add validation rules and skill parameters if required. To label fields in your training documents, navigate to the Editor tab of the Skill Designer. You can do so either by clicking on the tab name or by selecting one or more documents from the list and clicking Label Fields and Create Business Rules in the Actions pane.
- Train your skill and analyze its extraction accuracy. Correct errors, if any. To begin training your skill, click the Train button in the Actions pane. Once training has been completed, the following is displayed next to the Train button: Completed. You can stop skill training by clicking Cancel under the Train button in the Actions pane.
- Repeat steps 1 through 3 until you are satisfied with the results.
- Publish your production skill to the Skill Catalog to make it available for use.
- If needed, you can further derive skills from your skill. Such skills inherit all the rules and fields from their base skills. Additionally, derived skills can be seamlessly updated to the latest version of their base skills, while preserving all the changes you have made.
Setting up a Document skill for processing structured documents
ABBYY Vantage offers a machine learning mode for processing structured documents, i.e., documents where the location of the fields is the same on each document instance. Examples of such documents include questionnaires, application forms, and tax return forms. This mode can handle even documents that have multiple variants, such as the IRS Form 1040 for different years, where the set and location of the fields differ slightly from year to year. Each of these variants represents a separate structured document, and you must upload a blank form for each. To set up a skill for processing structured documents, follow these steps:
- Create a new Document skill and turn on the Fixed-form documents toggle for that skill.
- Upload one sample blank form for each variant of your document. To do this, navigate to the Blank Form tab in the Skill Designer and click the Upload Blank Form button located in the center of the designer, on the toolbar, or in the Actions pane. If you don’t have a blank form, you can upload a completed form and mark it as a blank form.
Note: One skill can handle up to 10 different variants of one form (for example, variants of the IRS Form 1040 for different years).
- Label the fields from which data must be extracted.
- In the field settings, enable the Eliminate field background option for fields where the background may affect recognition.
- In the Actions pane, click Train.
- Click the Test Set tab and upload completed test documents. Make sure that all the fields are labeled correctly on each document. If any field locations don’t match those on any of the uploaded sample blank forms, add a sample blank form for that document variant.
- In the Actions pane, test your skill. When the operation completes, review the results. If you are not satisfied with the results, adjust the labeling and train the skill again.
- Publish your skill to the Skill Catalog to make it available for use.
Tabs displayed in the Skill Designer for a skill that processes structured documents
While processing your documents, you may discover that they should be more properly treated as semi-structured rather than structured documents. In this case:
- Click the icon to the right of the skill name in the Skill Designer.
- Turn off the Fixed-form documents toggle.
Note: All the labeled fields are preserved.
- Retrain your skill.
Working with tables and repeating groups
When processing structured documents, Vantage can handle tables and repeating groups if the maximum number of table rows or group instances is known in advance and the boundaries of the table or group are fixed. You must label every row that can occur on any variant of the form.
Note: Only rows with data are displayed in the processing results. Any empty rows are ignored.
If the number of table rows or group instances is not known in advance, you must use another Vantage technology. See Processing structured documents.
Tip: Currently, only tables with text values can be handled. If your table has columns with checkboxes or barcodes, use a repeating group instead.
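The effect of labeling every possible row while only filled rows reach the results can be pictured with a small sketch. The `LabeledRow` class and `rows_in_results` function are hypothetical names used for illustration only; they are not part of any Vantage API, and the sketch assumes a row counts as empty when all of its cells are blank.

```python
from dataclasses import dataclass, field


@dataclass
class LabeledRow:
    """One labeled row region on the blank form (hypothetical model)."""
    cells: dict[str, str] = field(default_factory=dict)

    def has_data(self) -> bool:
        # Assumption: a row is empty when every cell is blank or whitespace.
        return any(value.strip() for value in self.cells.values())


def rows_in_results(labeled_rows: list[LabeledRow]) -> list[LabeledRow]:
    """Keep only the rows that contain data; empty rows are ignored."""
    return [row for row in labeled_rows if row.has_data()]


# Three rows are labeled on the form, but only two were filled in.
rows = [
    LabeledRow({"item": "Paper", "qty": "2"}),
    LabeledRow({"item": "", "qty": ""}),
    LabeledRow({"item": "Ink", "qty": "1"}),
]
print(len(rows_in_results(rows)))  # 2
```

In this model the number of labeled rows is the fixed upper bound, which is why the maximum row count has to be known in advance.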
Setting up a recognition language
When processing a document, Vantage selects a processing language from a list of specified languages. When setting up a new skill, the following four languages are selected by default: English, French, German, and Spanish. The recognition language list can be modified by clicking the icon to the right of the skill name in the Skill Designer. Languages displayed in the settings dialog are sorted in alphabetical order. The selected languages are displayed at the top of the list. At least one language needs to be selected. To close the settings dialog, click either Save to keep your changes or Cancel to discard your changes.
Note: The number of selected languages may affect the document processing speed. If you are completely certain of which languages Vantage will encounter in your documents, we recommend that you restrict the selected languages to those specific ones.
You can modify a skill’s list of recognition languages even if the skill has already been trained. To apply your changes, upload the documents again and retrain the skill.
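The rules of the language settings dialog can be summarized in a short sketch. The names `DEFAULT_LANGUAGES`, `display_order`, and `validate_selection` are hypothetical and do not correspond to any Vantage API. The sketch also assumes that the selected languages are themselves listed alphabetically, which the description above does not state.

```python
# Default selection for a new skill, as described above.
DEFAULT_LANGUAGES = ("English", "French", "German", "Spanish")


def display_order(available, selected):
    """Return selected languages first, then the rest.

    Assumption: each group is sorted alphabetically.
    """
    chosen = set(selected)
    top = sorted(lang for lang in available if lang in chosen)
    rest = sorted(lang for lang in available if lang not in chosen)
    return top + rest


def validate_selection(selected):
    """Reject an empty selection: at least one language is required."""
    if not selected:
        raise ValueError("At least one recognition language must be selected")
    return list(selected)


print(display_order(["Italian", "English", "Dutch", "German"],
                    ["German", "English"]))
# ['English', 'German', 'Dutch', 'Italian']
```

Restricting the selection to the languages that actually occur in the documents keeps the list short, which matches the recommendation in the note above.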
Setting up Online Learning mode
Online learning is available for Document skills and for Classification skills. The Online Learning service collects documents into a training set, and then trains the skill further using the collected documents. In Document Skill Settings, the following Online Learning modes are available:
- Collect and learn. This is the default option.
- Collect only. Use this option if documents need to be collected without training the skill.

