Tips for improving classifiers
If you are not satisfied with the results obtained with your classifier, try the following:
- Check whether the selected classification profile is appropriate.
- Adjust the recall/precision ratio.
- Check whether the reference classes have been assigned correctly.
- Use a larger number of sample documents. Make sure your training batch includes as many document variants as possible. The larger and more varied the training batch, the more document variants of the same class the classifier will detect.
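The effect of adjusting the recall/precision ratio can be illustrated outside the program with a minimal Python sketch. It assumes a hypothetical classifier that reports a confidence score for each page and accepts a class only when the score reaches a threshold. The scores, the labels, and the precision_recall helper are invented for illustration and are not part of the product.

```python
def precision_recall(scores, labels, threshold):
    """Compute precision and recall for one class at a confidence threshold.

    scores: confidence that each page belongs to the class (hypothetical values)
    labels: True if the page really belongs to the class
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)      # accepted, correct
    fp = sum(1 for s, y in pairs if s >= threshold and not y)  # accepted, wrong
    fn = sum(1 for s, y in pairs if s < threshold and y)       # missed
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# Made-up sample data: five pages scored against a single class.
scores = [0.95, 0.80, 0.65, 0.60, 0.30]
labels = [True, True, False, True, False]

for threshold in (0.5, 0.7):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

With this sample data, raising the threshold from 0.5 to 0.7 removes the one false match (precision rises from 0.75 to 1.00) but also drops one genuine page (recall falls from 1.00 to 0.67). This is the trade-off the recall/precision setting controls.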
Document Definition confidence
Document Definition sections mapped to document classes are matched against their document sections with a certain degree of confidence. The names of low-confidence sections are marked in red. If Automatically confirm section type when matched is enabled and the Document Definition section mapped to the selected document class has been successfully matched, the name of the section will not be highlighted in red. In this case, the class will be confirmed during Document Definition matching, even if the class was initially determined with low confidence.
More about the 'Automatically confirm section type when matched' option
The Automatically confirm section type when matched option is located on the General tab in the Document Definition section properties. Selecting this option speeds up manual verification, but it should be selected only for sections that can be matched exclusively to documents for which the given Document Definition was created. This can be a fixed Document Definition for a fixed section with identifiers, or a FlexiLayout section containing required elements. Operators will not need to confirm such matches manually. We do not recommend selecting this option for Document Definitions that can be matched to any document.
To resolve a low-confidence match manually, an operator can do one of the following:
- Click the Confirm Document Definition command on the shortcut menu of the page or document.
- Modify the low-confidence page (for example, by changing the section type or moving the page to another document).
- Change the Document Definition selected for the page or document.
Once an operator has corrected all errors so that no sections with low-confidence Document Definition matches remain, the “low-confidence” error is removed automatically.
Detecting errors in the classifier training batch
Classification errors are mostly caused by incorrectly assigned reference classes or by a lack of sample pages in the training batch. To detect these sorts of errors, you can ask the program to find pages that are similar to a misclassified page. To do this, right-click a misclassified page and then click one of the following three commands on the shortcut menu (these commands can also be accessed via the Classification Training menu at the top):
- Show Similar Pages looks for similar pages in the entire batch, regardless of the reference or the result class of the selected page.
- Show Similar Pages from Reference Class looks for similar pages with the same reference class as the reference class of the selected page.
- Show Similar Pages from Result Class looks for similar pages with the same reference class as the result class of the selected page.
The program will look for similar pages in all documents regardless of their state, whether they are marked as For Training, For Testing or Unused.
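The difference between the three commands can be sketched in Python as one similarity search with an optional filter on the reference class. This is only an illustration: how the program measures similarity is not described here, so the sketch assumes a simple word-overlap (Jaccard) measure, and the Page type, the similar_pages helper, and the sample pages are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    name: str
    words: frozenset       # stand-in for whatever features the program compares
    reference_class: str
    result_class: str


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; an assumed stand-in measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pages(selected, batch, min_similarity=0.5, reference_class=None):
    """Return pages similar to `selected`, optionally restricted by reference class."""
    return [
        p for p in batch
        if p is not selected
        and jaccard(p.words, selected.words) >= min_similarity
        and (reference_class is None or p.reference_class == reference_class)
    ]


# Hypothetical batch; p1 is the misclassified page (reference ID, result Invoice).
p1 = Page("p1", frozenset({"name", "date", "number", "total"}), "ID", "Invoice")
p2 = Page("p2", frozenset({"name", "date", "number"}), "ID", "ID")
p3 = Page("p3", frozenset({"total", "date", "number", "amount"}), "Invoice", "Invoice")
p4 = Page("p4", frozenset({"contract", "party"}), "Contract", "Contract")
batch = [p1, p2, p3, p4]

# Show Similar Pages: no class filter.
all_similar = similar_pages(p1, batch)
# Show Similar Pages from Reference Class: filter by the selected page's reference class.
from_reference = similar_pages(p1, batch, reference_class=p1.reference_class)
# Show Similar Pages from Result Class: filter by the selected page's result class.
from_result = similar_pages(p1, batch, reference_class=p1.result_class)

print([p.name for p in all_similar])
print([p.name for p in from_reference])
print([p.name for p in from_result])
```

In this sketch, the first search returns p2 and p3, the reference-class search returns only p2, and the result-class search returns only p3. Comparing the last two lists shows whether the misclassified page resembles its own class or the class it was confused with.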

Practical example
Suppose you spot a misclassified page in the confusion matrix and this page has ID for its reference class and Invoice for its result class.


