Use Advanced Designer for semi-structured document sets that are too complex for the cloud Skill Designer alone — for example, sets with many varying layouts, outliers that need rule-based extraction, or insufficient training data for machine learning.

Pick a scenario

Scenario | When to use | Key activities
Many varying document variants | New templates keep appearing | Deep Learning + Fast Learning
Outliers needing rules | Most variants work with ML, but a few need rules | Classify + IF + Deep Learning + Extraction Rules
Limited training data | Not enough documents to train Deep Learning | Classify + Extraction Rules per variant
Each scenario below follows the same pattern; only the activities you add to the processing flow differ.
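
The scenario table can be read as a small decision procedure. The sketch below is only an illustration in plain Python, not part of Advanced Designer or any Vantage API: the `Scenario` class, the `choose_scenario` function, and the order in which the two checks run are assumptions. The 100-document threshold is the Deep Learning training minimum stated on this page.

```python
from dataclasses import dataclass
from typing import Tuple

# From this page: Deep Learning training requires at least 100 labeled documents.
MIN_DEEP_LEARNING_DOCS = 100


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container for a scenario name and its key activities."""
    name: str
    activities: Tuple[str, ...]


def choose_scenario(labeled_docs: int, has_rule_outliers: bool) -> Scenario:
    """Map two facts about a document set to one of the three scenarios."""
    # Too few labeled documents to train Deep Learning: fall back to rules.
    if labeled_docs < MIN_DEEP_LEARNING_DOCS:
        return Scenario("Limited training data",
                        ("Classify", "Extraction Rules per variant"))
    # Enough data, but some variants need hand-written rules.
    if has_rule_outliers:
        return Scenario("Outliers needing rules",
                        ("Classify", "IF", "Deep Learning", "Extraction Rules"))
    # Enough data and no rule-only outliers.
    return Scenario("Many varying document variants",
                    ("Deep Learning", "Fast Learning"))
```

For example, `choose_scenario(40, False)` returns the "Limited training data" scenario, because 40 labeled documents are below the training minimum.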

Common workflow

1. Create a Document skill

Open Advanced Designer and click Create Document Skill on the start page.

2. Upload documents

On the Documents tab, upload the documents you’ll use to set up the skill. For scenarios that include classification, add a roughly equal number of documents for each variant so the classifier has balanced training data.

3. Define fields and label

On the Fields tab, create and configure the fields the skill will extract. Label documents in the Reference section.

4. Add and configure activities

On the Activities tab, add the activities for your scenario (described below). Open each activity in the Activity Editor to configure and train it.

5. Test and publish

Click Test Skill Using Selected Documents to evaluate results. When the results are good enough, publish the skill.
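
Before uploading documents for a scenario that includes classification, it can help to check how evenly the variants are represented. The following is a minimal sketch in plain Python, run outside Advanced Designer. The `variant_balance` helper and the `max_ratio` cutoff of 1.5 are assumptions, since this page only asks for a "roughly equal" number of documents per variant and gives no numeric limit.

```python
from collections import Counter


def variant_balance(variant_labels, max_ratio=1.5):
    """Return (counts, balanced) for an iterable of per-document variant labels.

    `balanced` is True when the largest variant has at most `max_ratio`
    times as many documents as the smallest one. The cutoff is an
    illustrative assumption, not a documented Vantage limit.
    """
    counts = Counter(variant_labels)
    if not counts:
        # An empty set cannot be called balanced.
        return counts, False
    smallest = min(counts.values())
    largest = max(counts.values())
    return counts, largest <= max_ratio * smallest
```

With 10 documents of one variant and 30 of another, the helper reports the set as unbalanced, which suggests adding more samples of the smaller variant before training the classifier.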

Many varying document variants

Use this scenario when documents share a type but layouts vary widely — for example, invoices from many suppliers where new templates will keep appearing. Pair a Deep Learning activity with a Fast Learning activity:
  • Deep Learning generalizes to unforeseen variants. Training requires at least 100 labeled documents.
  • Fast Learning sharpens accuracy on the specific templates you’ve seen. It can also be trained continuously via the Online Learning feedback loop from manual review.
Figure: Document processing flow with Deep Learning followed by Fast Learning

Outliers that need rule-based extraction

Use this scenario when most variants work with Deep Learning + Fast Learning, but a few documents have nested tables or other features that machine learning won’t capture cleanly. Separate the outliers with a Classify activity, then branch the flow with an IF activity:
  • Use Classify By Company when the variants come from different companies whose name or address appears on the document — for example, bank statements from multiple banks.
  • Use Classify By Text and Image in all other cases. This multimodal classifier uses text, layout, and image patterns to distinguish variants.
After branching, send outlier documents through an Extraction Rules activity tailored to those documents.
Figure: Document processing flow with Classify, IF, Deep Learning, and Extraction Rules activities
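
The branching in this flow can be pictured as a simple routing decision. The sketch below models it in plain Python and is not how the flow is configured in Advanced Designer. It assumes the Classify activity has already assigned each document a variant label. The `route` and `split_batch` functions and the bank names in the usage note are hypothetical.

```python
DEEP_LEARNING_BRANCH = "Deep Learning"
EXTRACTION_RULES_BRANCH = "Extraction Rules"


def route(variant, outlier_variants):
    """Mimic the IF activity: outlier variants go to the rule-based branch."""
    if variant in outlier_variants:
        return EXTRACTION_RULES_BRANCH
    return DEEP_LEARNING_BRANCH


def split_batch(variants, outlier_variants):
    """Group a batch of variant labels by the branch each one would take."""
    branches = {DEEP_LEARNING_BRANCH: [], EXTRACTION_RULES_BRANCH: []}
    for variant in variants:
        branches[route(variant, outlier_variants)].append(variant)
    return branches
```

For instance, `split_batch(["Bank A", "Bank B", "Bank C"], {"Bank C"})` sends the two ordinary variants to the Deep Learning branch and the single outlier to the Extraction Rules branch.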

Limited training data

Use this scenario when you don’t have enough documents to train a Deep Learning activity, but you have enough domain knowledge to describe the extraction rules — for example, tax forms that vary by year. Split the documents with a Classify activity (Classify By Company or Classify By Text and Image), then route each variant through its own Extraction Rules activity. Optionally add a Fast Learning activity so Vantage can refine the skill over time.
Figure: Document processing flow with a Classify activity branching into multiple Extraction Rules activities
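
The idea of one rule set per variant can be illustrated with ordinary regular expressions. This is a conceptual stand-in only: the Extraction Rules activity is configured in the Activity Editor, and nothing here reflects its actual rule syntax. The variant names, the `total_tax` field, and the form wording in the patterns are all made up for the example.

```python
import re

# Hypothetical rule sets: the same field sits under different wording
# depending on the form's year, so each variant gets its own pattern.
RULES_BY_VARIANT = {
    "tax_form_2022": {
        "total_tax": re.compile(r"Line 24\s+Total tax\s+([\d,]+\.\d{2})"),
    },
    "tax_form_2023": {
        "total_tax": re.compile(r"Total tax \(line 22\):\s*([\d,]+\.\d{2})"),
    },
}


def extract(text, variant):
    """Apply the rule set for `variant`; fields with no match are set to None."""
    rules = RULES_BY_VARIANT.get(variant)
    if rules is None:
        raise KeyError(f"no extraction rules for variant {variant!r}")
    fields = {}
    for name, pattern in rules.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields
```

Keeping the rules in a dictionary keyed by variant mirrors the flow described above: classification picks the key, and only that variant's rules run on the document.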

Deep Learning activity

Generalize across many semi-structured layouts.

Fast Learning activity

Sharpen accuracy on known templates with manual review feedback.

Extraction Rules activity

Define rule-based extraction for outliers or low-data sets.

Classify activities

Branch the flow by document variant before extraction.