Key New Features Overview - ABBYY Documentation

Comparing Documents

Feature	Description
New “Compare Documents” Module	To quickly verify a document’s integrity, the new “Compare Documents” module in ABBYY FineReader Engine detects content differences between two versions of the same document.
Comparison of bilingual documents	A new option of the “Compare Documents” module automatically detects that a document is bilingual and has a complex layout, and compares each column (and thus each language version) separately.
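As a rough illustration of what detecting content differences between two versions means, the sketch below compares two plain-text versions using Python's standard `difflib` module. It is not the API of the “Compare Documents” module; the function name `changed_lines` and the sample texts are assumptions made for this example.

```python
import difflib

def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return the removed ('- ') and added ('+ ') lines between two versions."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); drop them.
    return [line for line in diff if line.startswith(("- ", "+ "))]

old = "Payment is due in 30 days.\nSigned: A. Smith"
new = "Payment is due in 45 days.\nSigned: A. Smith"
for line in changed_lines(old, new):
    print(line)
```

A real comparison works on recognized documents with layout, so it also has to align pages and blocks before comparing text; this sketch covers only the final text comparison step.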

Input of Office formats in Linux and Windows

Feature	Description
Processing of Office documents	In addition to a broad set of image formats, FineReader Engine can now process input documents created in one of these Office document formats. Text documents: .doc, .docx, .rtf, .htm / .html, .txt, .odt. Tables: .xls, .xlsx, .ods. Presentations: .ppt, .pptx, .odp. Microsoft Office or LibreOffice must be installed on the system.

MRZ Capture

Feature	Description
Data capture from a Machine-Readable Zone (MRZ)	This feature automatically extracts data from the machine-readable zone (MRZ) of ID documents, speeding up the entry and verification of personal data during customer onboarding or verification processes.
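One way to verify data captured from an MRZ is to recompute its check digits. The sketch below implements the check digit scheme defined in ICAO Doc 9303 (repeating weights 7, 3, 1; digits keep their value, letters A to Z map to 10 to 35, and the filler `<` counts as 0). It is independent of FineReader Engine; the function name `mrz_check_digit` is an assumption made for this example.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the ICAO 9303 check digit for one MRZ field."""
    weights = (7, 3, 1)
    total = 0
    for position, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10
        elif char == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {char!r}")
        total += value * weights[position % 3]
    return total % 10

# Document number from the ICAO specimen passport.
print(mrz_check_digit("L898902C3"))  # 6
```

If the recomputed digit differs from the one printed in the MRZ, the field was probably misread and can be flagged for manual review.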

Improved Japanese OCR

Feature	Description
Leading recognition accuracy	With the new version of ABBYY FineReader Engine, Japanese OCR has been significantly improved, bringing recognition accuracy to a level previously unattainable for most solutions.

Improved Arabic OCR

Feature	Description
End-to-end recognition for Arabic on poor images	End-to-end recognition improves Arabic OCR on low-quality images, where general-purpose technology produces low-confidence results with many errors.

Improved Korean OCR

Feature	Description
Deep learning language model for Korean	A trained model for the Korean language selects the best word variant from the recognition hypotheses, or even generates a new one, based on the recognition context (the preceding and following words).
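The idea of choosing among recognition hypotheses by context can be sketched with a toy example. The code below is not ABBYY's model: the bigram counts, the scoring formula, and the names `context_score` and `pick_variant` are all assumptions, and English words stand in for Korean ones. It covers only selection among existing hypotheses, not the generation of new variants.

```python
import math

# Hypothetical bigram counts standing in for a trained language model.
BIGRAM_COUNTS = {
    ("the", "cat"): 50,
    ("cat", "sat"): 30,
    ("the", "cot"): 2,
    ("cot", "sat"): 1,
}

def context_score(prev_word: str, word: str, next_word: str) -> float:
    """Log-score of a word given its neighbours; +1 avoids log(0) for unseen pairs."""
    left = BIGRAM_COUNTS.get((prev_word, word), 0) + 1
    right = BIGRAM_COUNTS.get((word, next_word), 0) + 1
    return math.log(left) + math.log(right)

def pick_variant(prev_word: str, hypotheses: list[tuple[str, float]], next_word: str) -> str:
    """Pick the hypothesis with the best combined OCR confidence and context score.

    hypotheses: (word, ocr_confidence) pairs with confidence in (0, 1].
    """
    best = max(
        hypotheses,
        key=lambda h: math.log(h[1]) + context_score(prev_word, h[0], next_word),
    )
    return best[0]

# The recognizer slightly prefers "cot", but the context favours "cat".
print(pick_variant("the", [("cot", 0.55), ("cat", 0.45)], "sat"))  # cat
```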

New neural network-based OCR technologies

Feature	Description
Improvements in OCR technologies	Using neural network approaches, ABBYY FineReader Engine has been enhanced with processing of handwritten and handprinted Latin characters: a language model for consistent and accurate choice of word variants; end-to-end recognition for Latin scripts to process multilingual documents.
Machine learning barcode recognition technology	A new neural network model for barcode recognition detects the approximate region of a barcode, classifies it, and outputs the region together with the most likely barcode type.
New recognition mode	The new Accurate mode delivers the maximum quality of the output document at the cost of a reasonable slowdown in recognition speed. This mode is best suited for low-quality or photographed invoices, contracts, receipts, and ID cards.

OCR quality improvements for text near stamps and signatures

Feature	Description
Detecting text near stamps and signatures	Whenever an agreement contains stamps or signatures, the text nearby is recognized separately from them, thus improving the quality of the processed documents.

New licensing options

Feature	Description
Online License usage as Network and Standalone	The Developer’s Help for FineReader Engine 12 has been extended with additional information about the different ways to license the SDK, describing each type of licensing option in an easy-to-understand comparison table.
Using grace periods	With this new option, customers can continue to use an ABBYY FineReader Engine license for a limited time after its expiration date, effectively extending the license validity period.

ICR and OMR technologies in Linux and macOS versions

Feature	Description
Handwritten text and checkmark recognition	With ABBYY FineReader Engine 12, you can recognize handwritten and handprinted characters and checkmarks of various types. ICR and OMR technologies are implemented to extract data from handwritten documents and to support the development of new data extraction solutions.

Ability to run Engine in cloud environments

Feature	Description
New deployment options	A new licensing type allows deployment in virtual and cloud environments, so you can offer a broader range of solutions. The licensing mechanism requires an internet connection and supports proxy servers. Note: Applicable to FineReader Engine for Linux and Windows.

.NET Core wrapper in FRE for Windows

Feature	Description
New development framework	To make development teams more efficient when they use containers and other native environments, a popular approach to software development and deployment, ABBYY FineReader Engine now offers a pre-built .NET Core 6 wrapper.

New libraries in ABBYY FineReader Engine

Feature	Description
NeoML library usage	NeoML is an open-source end-to-end machine learning framework that allows you to build, train, and deploy Machine Learning models. This framework is used by engineers for computer vision and natural language processing tasks, including image preprocessing, classification, document layout analysis, OCR, and data extraction from structured and unstructured documents.
Embedded PDFium for processing PDFs	PDFium is a cross-platform native library conforming to PDF standards and controlling all operations related to PDF, including processing, parsing, rendering, and obtaining the output.

Enhanced Document Classification

Feature	Description
Document Classification using NLP and Machine Learning	With ABBYY FineReader Engine 12, incoming documents can be automatically sorted into different categories. Machine learning, OCR, and natural language processing technologies are employed to train the image-based and text-based classifiers on representative documents. The information obtained from training is then used during the classification step.
Text-based classifier: advanced security of training data	To train and optimize the text-based classifier, documents representing each document category must be imported. To protect the data contained in these documents, hashing algorithms are applied so that information cannot be recovered from the sample documents.
Enhanced Classification Demo Sample	ABBYY FineReader Engine is able to process PDFs, scanned or photographed document images, and documents in Office formats. To reflect this capability in the classification process, the provided pre-compiled Demo Sample for classification has been enhanced and now allows importing Office documents in addition to PDFs and image formats.
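The source does not say which hashing algorithms the text-based classifier uses. As a general illustration of the idea, the sketch below applies feature hashing: each word of a training document is replaced by a hashed bucket index, so the stored training features contain counts per bucket rather than the original words. The function name `hashed_features`, the bucket count, and the salt are assumptions made for this example, which makes no claim about the security strength of the real implementation.

```python
import hashlib
from collections import Counter

def hashed_features(text: str, buckets: int = 1 << 16, salt: bytes = b"demo-salt") -> Counter:
    """Map each token to a hashed bucket index and count occurrences per bucket."""
    counts: Counter = Counter()
    for token in text.lower().split():
        digest = hashlib.sha256(salt + token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % buckets
        counts[bucket] += 1  # only the bucket index is kept, not the token
    return counts

features = hashed_features("Invoice total due Invoice number")
print(sum(features.values()))  # 5 tokens counted in total
```

A classifier can be trained on such bucket counts directly, because the same word always lands in the same bucket.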

Code sample for command-line interface (CLI)

Feature	Description
Ready-to-use code sample	With this code sample, developers can efficiently use ABBYY FineReader Engine libraries and integrate document processing capabilities into command-line-based applications.

Implementation of PDF meta-data extractor

Feature	Description
Digitally-born PDF file processing	AuxInfo is a supplementary object of PDFium that provides metadata from a PDF file. The ABBYY R&D PDFTools team implemented its own AuxInfo object that works with PDFium.

Improved PDF processing

Feature	Description
Improvements for PDF with “mixed” contents	ABBYY FineReader Engine provides new capabilities for processing PDF documents that contain both image-only and digitally-born pages: adaptive recognition to improve and speed up PDF processing; a text layer quality classifier that preserves a good text layer in the output format; indication of digital signature presence in a PDF; a new content reuse mode for processing documents with mixed contents.
Using additional content in PDF	To make forming PDF contents more flexible, ABBYY FineReader Engine offers new options: opening PDF Portfolios and processing their contents; adding custom images to the output PDF and managing their positions.

Additional language support

Feature	Description
Farsi OCR	ABBYY FineReader Engine features updated and improved Farsi recognition options, opening up the door for more effective work with documents from Iran, Afghanistan and many other countries of the Middle East.
Georgian OCR	Georgian has been added as a new OCR language.
OCR for simple mathematical formulas	Character extraction for simple mathematical formulas improves recognition of scientific documents that contain single-line formulas within the text.
Technical preview for Burmese OCR	Burmese OCR was added as a technical preview to highlight future capabilities.
Special languages for Arabic and Japanese dates capture	FineReader Engine supports special languages for field recognition in FineReader Engine for Windows. The new version adds improved date recognition in Arabic and Japanese.
Technical preview for Bangla OCR	Bangla OCR was added as a technical preview to demonstrate potential functionality.

Improved document layout recreation

Feature	Description
Improved table reconstruction	With ABBYY FineReader Engine 12, tables extracted from documents retain their formatting more faithfully.
Detection and recreation of balanced columns	When a document contains balanced columns of text (e.g., contracts, scientific papers, or articles), the original structure now stays intact, simplifying document processing.
New “single-column” document model	The main improvements of the new algorithm are in the detection and analysis of tables and charts.
Enhanced table structure analysis	With the improved mechanism of document conversion, ABBYY FineReader Engine can detect tables with columns of numbers in the “Accounting” format.
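For readers unfamiliar with the “Accounting” number format, the sketch below parses cell values written in that style. It assumes the conventions commonly used by spreadsheet applications (negative amounts in parentheses, comma thousands separators, a leading currency symbol, and a lone dash for zero); the function name `parse_accounting` is hypothetical and the code is not part of FineReader Engine.

```python
from decimal import Decimal

def parse_accounting(cell: str) -> Decimal:
    """Parse a cell such as '$ 1,234.50', '(1,234.50)' or '-' into a Decimal."""
    # Drop the currency symbol, thousands separators, and padding spaces.
    cleaned = cell.replace("$", "").replace(",", "").replace(" ", "")
    if cleaned == "-":
        return Decimal("0")  # a lone dash denotes zero
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]  # parentheses denote a negative amount
    value = Decimal(cleaned)
    return -value if negative else value

for cell in ["$ 2,000.00", "(1,234.50)", "-"]:
    print(cell, "->", parse_accounting(cell))
```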

Internal process optimization for faster processing

Feature	Description
New scheme of the ILayout object iteration	A new scheme speeds up iteration over an ILayout object obtained by processing the document outside the main process. Note: Applicable to FineReader Engine for Linux and Windows.

New scanning options in FRE for Windows

Feature	Description
More scanning capabilities	ABBYY FineReader Engine 12 offers many device-based scanning features: automatic deletion of blank pages from the document; automatic page cropping; automatic skew correction; automatic detection of the color mode.

Online documentation

Feature	Description
Documentation available online	In addition to the built-in documentation, you can now use the online version providing “just in time” information about the features and capabilities of ABBYY FineReader Engine.

Latest .NET Framework versions in FRE for Windows

Feature	Description
.NET COM Interop wrappers support	The distribution package now includes .NET COM Interop wrappers for the following .NET Framework versions: 3.5 SP1, 4.6, 4.7, 4.8.

New export formats

Feature	Description
JSON	JSON (JavaScript Object Notation) is an open-standard, language-independent file format for transmitting data objects consisting of attribute–value pairs and arrays. FineReader Engine now supports exporting OCR results in JSON format.
New ALTO versions	ALTO (Analyzed Layout and Text Object) is an XML Schema that details technical metadata to describe the layout and content of physical text resources, such as the pages of a book or newspaper. The latest versions of this schema (4.0, 4.1, 4.2) are supported in FineReader Engine 12.
PDF/A-2b and PDF/A-3b	PDF/A is an ISO-standardized version of the Portable Document Format (PDF), specialized for use in archiving and the long-term preservation of electronic documents. Now, FineReader Engine supports all PDF/A conformance levels.
Doclang	Doclang is an AI‑optimized XML format designed to represent document structure in a semantically meaningful way for modern AI pipelines, including LLM ingestion, document RAG systems, and structured document processing workflows. Available as of v. 12.8.0.
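To show how an ALTO export can be consumed downstream, the sketch below reads the text lines from a minimal ALTO 4 document using Python's standard `xml.etree.ElementTree`. The sample XML is hand-written for this example and is not actual FineReader Engine output; the function name `alto_lines` is an assumption. The element and attribute names (`TextLine`, `String`, `CONTENT`) and the namespace come from the ALTO 4 schema.

```python
import xml.etree.ElementTree as ET

ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v4#"}

# Hand-written sample, not real Engine output.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page ID="p1" WIDTH="2480" HEIGHT="3508">
      <PrintSpace>
        <TextBlock ID="b1">
          <TextLine ID="l1">
            <String CONTENT="Hello" HPOS="100" VPOS="100" WIDTH="200" HEIGHT="40"/>
            <SP/>
            <String CONTENT="world" HPOS="320" VPOS="100" WIDTH="210" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

def alto_lines(xml_text: str) -> list[str]:
    """Return the text of every TextLine, joining its String elements with spaces."""
    root = ET.fromstring(xml_text)
    lines = []
    for text_line in root.iterfind(".//alto:TextLine", ALTO_NS):
        words = [s.get("CONTENT", "") for s in text_line.findall("alto:String", ALTO_NS)]
        lines.append(" ".join(words))
    return lines

print(alto_lines(SAMPLE))  # ['Hello world']
```

The `HPOS`, `VPOS`, `WIDTH`, and `HEIGHT` attributes carry the position of each word on the page, so the same traversal can be extended to recover layout coordinates.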

