- The General
- The Recognition
-
Prefer settings from batch type. Select this option if you want to synchronize full-text recognition settings.
Note: Disabling the synchronization may lead to slower Document Definition matching.
- Languages. A correct language must be set for recognition to proceed without errors. This property defines both the language itself and other settings such as date format and currency.
-
Select a Recognition mode from the list:
- Fast mode. Colored and half-tone images are binarized (i.e. converted to black and white) prior to recognition. Fast recognition takes less time and provides mostly satisfactory results.
- Balanced mode. The program additionally takes into account image colors. Recognition in this mode is slower but delivers better quality.
Note: No further modifications are planned for the algorithms powering Balanced recognition mode, as the best possible speed and quality figures have been achieved.
- Normal mode is the default. It is also recommended when pre-recognition in Fast or Balanced mode results in multiple errors.
- Accurate mode is intended for extracting data from barely readable digital images or from poor-quality documents.
Note: Accurate mode takes the most time and is therefore recommended for use only on problem images/documents.
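To illustrate what binarization means in the Fast mode description, here is a minimal Python sketch that converts RGB pixels to grayscale and then to black and white using a fixed global threshold. The function names, the threshold value of 128, and the BT.601 luma weights are assumptions chosen for illustration; the algorithm the program actually uses is not documented here.

```python
def to_gray(r, g, b):
    """Convert one RGB pixel (0-255 per channel) to a gray level (0-255)."""
    # ITU-R BT.601 luma weights, a common choice for RGB-to-gray conversion.
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def binarize(gray_rows, threshold=128):
    """Map each gray pixel to 0 (black) or 1 (white) using a global threshold."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray_rows]


# Example: a 2x2 color image given as rows of (r, g, b) tuples.
image = [[(0, 0, 0), (255, 255, 255)],
         [(200, 200, 200), (30, 30, 30)]]
gray = [[to_gray(*px) for px in row] for row in image]
black_and_white = binarize(gray)
```

Color information is discarded at this step, which is why a mode that also takes image colors into account can be slower but more accurate.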
-
Advanced recognition settings…
- In the Correct page orientation, if page group, you may select one or several options to have a page rotated by 180°, 90° clockwise, or 90° counter-clockwise when its orientation is detected automatically.
- If necessary, specify in the Text type section how the blank form is printed (Typographic, Matrix printer, Typewriter), and select the Use pattern option to add a sample pattern.
- Barcodes. Parameters of barcode processing:
- Disable barcode extraction. Select this option if barcodes should not be found on images. This will speed up document recognition considerably.
- Extract 2D barcodes: Data Matrix, Aztec, QR Code. Select this option if your images contain barcodes of the specified types. If the option is not selected, Data Matrix, Aztec, and QR Code barcodes will not be found on images.
-
Extract post barcodes. Select this option if your images contain postal barcodes. If this option is not selected, postal barcodes will not be found on images.
Important! Extracting barcodes slows down recognition.
-
CJK pre-recognition
- Separated furigana mode. Select this option to improve recognition of phonetic annotations (furigana) in Japanese text.
-
Named entity recognition: Extract named entities – extraction of information using NLP methods.
Note: Requires an NLP module and a specific license type.
-
Vertical text extraction – Vertical text extraction parameters:
- Extract for all languages – Detects vertically-oriented text written in any of the supported languages.
- Do not extract – Prevents the detection of vertically-oriented text.
- Extract for CJK languages – Detects vertical text written in Chinese, Japanese or Korean.
-
Click the Advanced… button to configure correction of linear and nonlinear image distortions, specify the direction of the scanner's automatic feeder, etc.
Note: Select the Correct linear distortion option to specify parameters for stretching or compressing the image by height and width. Images are scaled by existing anchors (black square, cross or corner) as well as horizontal and vertical separators.
- Amount of Money – A combination of a numerical amount and a currency code or symbol. To avoid recognition errors for visually similar characters like 1, I, and i, or s and $, a regular expression is used that allows letters only in certain combinations that represent currency codes, either preceding or immediately following the numerical amount. The major currency codes are listed in Currencies.
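The Amount of Money rule described above can be approximated with the following Python sketch. It is not the regular expression used by the program: the currency list is a small hypothetical subset, and the number format and the optional single space between code and amount are assumptions made for illustration.

```python
import re

# Hypothetical subset of currency codes and symbols (assumption, not the product's list).
CURRENCY_CODES = ("USD", "EUR", "GBP", "JPY")
CURRENCY_SYMBOLS = ("$", "€", "£", "¥")

_currency = "|".join(re.escape(c) for c in CURRENCY_CODES + CURRENCY_SYMBOLS)
# Digits with optional "." or "," separated digit groups, e.g. 100, 1,250.00
_number = r"\d+(?:[.,]\d+)*"

# Letters are accepted only as a known currency code placed before or after the number.
AMOUNT_RE = re.compile(
    rf"(?:(?P<pre>{_currency})\s?(?P<num1>{_number})"
    rf"|(?P<num2>{_number})\s?(?P<post>{_currency}))"
)


def parse_amount(text):
    """Return (currency, number) if text is a valid amount of money, else None."""
    m = AMOUNT_RE.fullmatch(text.strip())
    if m is None:
        return None
    currency = m.group("pre") or m.group("post")
    number = m.group("num1") or m.group("num2")
    return currency, number


print(parse_amount("USD 1,250.00"))  # accepted
print(parse_amount("1I0 USD"))       # rejected: "I" is not allowed inside the number
```

Because the numeric part admits digits and separators only, a misread such as "1I0" or "s15" fails to match instead of being accepted as an amount.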
- The Assembly
- Use key fields equality assembling rule - enable this option if you want to check document assembly based on key fields. Then select a key field for each section in the Key Field column. When you input documents, only documents whose key field values match in every section will be considered correctly assembled. If the values do not match, an assembly error message will be displayed.
- Use standard assembly rules - enable this option if you want to perform a check of document assembly using the following standard rules:
- Disable sections order check - enable this option if you want to disable the checks for the order of sections in the document (e.g. if the order of sections does not affect document assembly). The program will still check that all the sections are present in the document, but their order will be ignored.
- Enable annex pages - enable this option if you want to process documents with annexes. If processing of documents with annexes is enabled, you can also select the option Detect annexes using preset document structure, without analyzing (fast) to enable faster detection of annexes on the basis of the preset document structure.
- Use custom assembly rules - enable this option if you want to perform a check of document assembly using a document assembly script. A custom assembly script can be executed either on its own or together with the standard assembly rules. To start editing the script, click the Edit Assembly Script… button. The Script Editor window will open.
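The logic behind the key field and section order checks described above can be sketched in Python as follows. This is an illustration only: the Section class, the function names, and the error messages are hypothetical and do not correspond to the program's scripting API.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical stand-in for a recognized document section."""
    name: str
    fields: dict = field(default_factory=dict)


def check_key_fields(sections, key_fields):
    """Return an error message if key field values differ between sections, else None.

    key_fields maps a section name to the name of its key field.
    """
    values = {s.name: s.fields.get(key_fields[s.name])
              for s in sections if s.name in key_fields}
    if len(set(values.values())) > 1:
        return f"Assembly error: key field values do not match: {values}"
    return None


def check_sections(sections, expected_order, check_order=True):
    """Check that all expected sections are present and, optionally, in order."""
    names = [s.name for s in sections]
    missing = [n for n in expected_order if n not in names]
    if missing:
        return f"Assembly error: missing sections: {missing}"
    if check_order:
        found = [n for n in names if n in expected_order]
        if found != list(expected_order):
            return "Assembly error: sections are out of order"
    return None


doc = [Section("Details", {"OrderNo": "A-17"}),
       Section("Header", {"Number": "A-17"})]
keys = {"Header": "Number", "Details": "OrderNo"}
print(check_key_fields(doc, keys))                                       # None
print(check_sections(doc, ["Header", "Details"]))                        # out of order
print(check_sections(doc, ["Header", "Details"], check_order=False))    # None
```

Passing check_order=False mirrors the Disable sections order check option: presence of every section is still verified, but the order is ignored.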
- The Rules
- The Export Destinations
- The Data Form
- The Data Text Settings group contains font settings for displaying recognized values.
- The Label Text Settings group contains settings for displaying the explanatory text (field names).
- The Data Sets
- The Event Handlers
- The .NET References
