Predefined Languages in ABBYY FineReader Engine

This section lists the internal names of the predefined languages supported in ABBYY FineReader Engine. Whether a predefined recognition language is available depends on whether the corresponding modules are present among the ABBYY FineReader Engine files. See the Installation section to find out which recognition languages correspond to which ABBYY FineReader Engine modules. ABBYY FineReader Engine provides core recognition languages for OCR and ICR with full built-in dictionary support. Some recognition languages are available only for OCR or do not have full built-in dictionary support; see the table below for details.

ABBYY FineReader Engine for Windows also provides a set of special recognition languages. These languages contain special language units (addresses, dates and times, personal names, etc.) and can be used for field recognition. See the list of special predefined languages for more information.

Internal name	Recognition language	Can be used for OCR	Full dictionary support available	Handwritten (++) or handprinted (+) text supported [1]	Can be used for text-based classification [2]	Can be used for BCR
Abkhaz	Abkhaz	+
Adyghe	Adyghe	+
Afrikaans	Afrikaans	+		+
Agul	Agul	+
Albanian	Albanian	+		+
Altaic	Altaic	+
Arabic	Arabic (Saudi Arabia)	+	+	[3]	+
ArmenianEastern	Armenian (Eastern)	+	+		+
ArmenianGrabar	Armenian (Grabar)	+	+		+
ArmenianWestern	Armenian (Western)	+	+		+
Awar	Avar	+
Aymara	Aymara	+		+
AzeriCyrillic	Azerbaijani (Cyrillic)	+
AzeriLatin	Azerbaijani (Latin)	+	+	+	+
Bangla	Bangla	+
Bashkir	Bashkir	+	+		+
Basic	Basic programming language	+
Basque	Basque	+		+
Belarusian	Belarusian	+
Bemba	Bemba	+		+
Blackfoot	Blackfoot	+		+
Breton	Breton	+		+
Bugotu	Bugotu	+		+
Bulgarian	Bulgarian	+	+	+	+
Burmese	Burmese	+
Buryat	Buryat	+		+
C++	C/C++ programming language	+
Catalan	Catalan	+	+		+
Chamorro	Chamorro	+		+
Chechen	Chechen	+
Chemistry	Simple chemical formulas	+
ChineseSimplified	Chinese Simplified	+				+
ChineseTraditional	Chinese Traditional	+				+
Chukcha	Chukcha	+
Chuvash	Chuvash	+
CMC7	For MICR (CMC-7) text type [4]	+
COBOL	COBOL programming language	+
Corsican	Corsican	+		+
CrimeanTatar	Crimean Tatar	+		+
Croatian	Croatian	+	+	+	+
Crow	Crow	+		+
Czech	Czech	+	+	+	+	+
Danish	Danish	+	+	+	+	+
Dargwa	Dargwa	+
Digits	Numbers	+		+
Dungan	Dungan	+
Dutch	Dutch (Netherlands)	+	+	+	+	+
DutchBelgian	Dutch (Belgium)	+	+	+	+
E13B	For MICR (E-13B) text type [4]	+
English	English	+	+	++ (including handwritten)	+	+
EskimoCyrillic	Eskimo (Cyrillic)	+
EskimoLatin	Eskimo (Latin)	+
Esperanto	Esperanto	+
Estonian	Estonian	+	+	+	+	+
Even	Even	+		+
Evenki	Evenki	+		+
Faeroese	Faeroese	+
Farsi	Farsi	+	+		+
Fijian	Fijian	+		+
Finnish	Finnish	+	+	+	+	+
Fortran	Fortran programming language	+
French	French	+	+	++ (including handwritten)	+	+
Frisian	Frisian	+		+
Friulian	Friulian	+		+
GaelicScottish	Scottish Gaelic [5]	+		+
Gagauz	Gagauz	+
Galician	Galician	+		+
Ganda	Ganda	+		+
Georgian	Georgian [6]	+
German	German	+	+	++ (including handwritten)	+	+
GermanLuxembourg	German (Luxembourg)	+		+
GermanNewSpelling	German (new spelling)	+	+	+	+
Greek	Greek	+	+	+	+	+
Guarani	Guarani	+		+
Hani	Hani	+		+
Hausa	Hausa	+
Hawaiian	Hawaiian	+		+
Hebrew	Hebrew	+	+		+
Hungarian	Hungarian	+	+	+	+	+
Icelandic	Icelandic	+
Ido	Ido	+		+
Indonesian	Indonesian	+	+	+	+	+
Ingush	Ingush	+
Interlingua	Interlingua	+		+
Irish	Irish [5]	+		+
Italian	Italian	+	+	+	+	+
Japanese	Japanese	+	+	++ (including handwritten)	+	+
JapaneseModern	Japanese (Modern)	+	+		+	+
Java	Java programming language	+
Kabardian	Kabardian	+
Kalmyk	Kalmyk	+
KarachayBalkar	Karachay-Balkar	+		+
Karakalpak	Karakalpak	+
Kasub	Kasub	+		+
Kawa	Kawa	+		+
Kazakh	Kazakh	+		+
Khakas	Khakas	+
Khanty	Khanty	+
Kikuyu	Kikuyu	+
Kirgiz	Kirghiz	+		+
Kongo	Kongo	+		+
Korean	Korean	+	+		+	+
KoreanHangul	Korean (Hangul)	+	+		+
Koryak	Koryak	+
Kpelle	Kpelle	+		+
Kumyk	Kumyk	+		+
Kurdish	Kurdish	+		+
Lak	Lak	+
Lappish	Sami (Lappish)	+		+
Latin	Latin	+	+	+	+
Latvian	Latvian	+	+	+	+
LatvianGothic	Latvian language written in Gothic script	+
Lezgin	Lezgin	+
Lithuanian	Lithuanian	+	+	+	+
Luba	Luba	+		+
Macedonian	Macedonian	+
Malagasy	Malagasy	+		+
Malay	Malay	+
Malinke	Malinke	+		+
Maltese	Maltese	+
Mansi	Mansi	+
Maori	Maori	+		+
Mathematical	Mathematical	+
Mari	Mari	+
Maya	Maya	+		+
Miao	Miao	+		+
Minankabaw	Minangkabau	+		+
Mohawk	Mohawk	+		+
Mongol	Mongol	+		+
Mordvin	Mordvin	+		+
Nahuatl	Nahuatl	+		+
Nenets	Nenets	+		+
Nivkh	Nivkh	+		+
Nogay	Nogay	+		+
Norwegian	NorwegianNynorsk and NorwegianBokmal	+	+	+	+	+
NorwegianBokmal	Norwegian (Bokmal)	+	+	+	+	+
NorwegianNynorsk	Norwegian (Nynorsk)	+	+	+	+	+
Nyanja	Nyanja	+		+
Occidental	Occidental	+
OcrA	For OCR-A text type	+
OcrB	For OCR-B text type	+
Ojibway	Ojibway	+		+
OldEnglish	Old English	+	+	+	+
OldFrench	Old French	+	+	+	+
OldGerman	Old German	+	+	+	+
OldItalian	Old Italian	+	+	+	+
OldSlavonic	Old Slavonic	+
OldSpanish	Old Spanish	+	+	+	+
Ossetic	Ossetian	+
Papiamento	Papiamento	+		+
Pascal	Pascal programming language	+
PidginEnglish	Tok Pisin	+		+
Polish	Polish	+	+	+	+	+
PortugueseBrazilian	Portuguese (Brazil)	+	+	+	+	+
PortugueseStandard	Portuguese (Portugal)	+	+	+	+	+
Provencal	Provencal	+
Quechua	Quechua	+		+
RhaetoRomanic	Rhaeto-Romanic	+		+
Romanian	Romanian	+	+	+	+
RomanianMoldavia	Romanian (Moldavia)	+		+
Romany	Romany	+		+
Ruanda	Ruanda	+		+
Rundi	Rundi	+		+
RussianOldSpelling	Russian (old spelling)	+	+		+
Russian	Russian	+	+	+	+	+
RussianWithAccent	Russian (with accents marking stress position)	+	+		+
Samoan	Samoan	+		+
Selkup	Selkup	+		+
SerbianCyrillic	Serbian (Cyrillic)	+		+
SerbianLatin	Serbian (Latin)	+		+
Shona	Shona	+
Sioux	Sioux (Dakota)	+		+
Slovak	Slovak	+	+	+	+
Slovenian	Slovenian	+	+	+	+
Somali	Somali	+		+
Sorbian	Sorbian	+
Sotho	Sotho	+		+
Spanish	Spanish	+	+	++ (including handwritten)	+	+
Sunda	Sunda	+
Swahili	Swahili	+		+
Swazi	Swazi	+		+
Swedish	Swedish	+	+	+	+	+
Tabassaran	Tabassaran	+
Tagalog	Tagalog	+		+
Tahitian	Tahitian	+		+
Tajik	Tajik	+		+
Tatar	Tatar	+	+		+
Thai	Thai	+	+		+
Tinpo	Jingpo	+		+
Tongan	Tongan	+		+
Tswana	Tswana	+		+
Tun	Tun	+		+
Turkish	Turkish	+	+	+	+	+
Turkmen	Turkmen	+
TurkmenLatin	Turkmen (Latin)	+		+
Tuvin	Tuvan	+		+
Udmurt	Udmurt	+
UighurCyrillic	Uighur (Cyrillic)	+
UighurLatin	Uighur (Latin)	+		+
Ukrainian	Ukrainian	+	+	+	+	+
UzbekCyrillic	Uzbek (Cyrillic)	+
UzbekLatin	Uzbek (Latin)	+		+
Vietnamese	Vietnamese	+	+		+
Visayan	Cebuano	+		+
Welsh	Welsh	+
Wolof	Wolof	+		+
Xhosa	Xhosa	+		+
Yakut	Yakut	+
Yiddish	Yiddish	+ [7]
Zapotec	Zapotec	+		+
Zulu	Zulu	+
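
The table above is tab-separated, and trailing empty cells are omitted from each row. The following is a minimal sketch, not part of the ABBYY FineReader Engine API, of how such rows could be loaded and queried in Python. The names `LanguageSupport`, `parse_row`, `supports_handprinted`, and `supports_handwritten` are hypothetical helpers introduced only for illustration, and `SAMPLE_ROWS` copies four rows from the table. It assumes that a `++` mark implies handprinted support as well as handwritten support.

```python
from dataclasses import dataclass

# Capability columns, in the order used by the table header.
COLUMNS = ("ocr", "full_dictionary", "handwriting", "text_classification", "bcr")


@dataclass(frozen=True)
class LanguageSupport:
    """One table row (hypothetical helper type, not an ABBYY API object)."""
    internal_name: str
    display_name: str
    ocr: str = ""
    full_dictionary: str = ""
    handwriting: str = ""
    text_classification: str = ""
    bcr: str = ""


def parse_row(line: str) -> LanguageSupport:
    """Split a tab-separated row and pad the omitted trailing cells."""
    cells = line.rstrip("\n").split("\t")
    name, display, flags = cells[0], cells[1], cells[2:]
    flags += [""] * (len(COLUMNS) - len(flags))
    return LanguageSupport(name, display, *(f.strip() for f in flags[:len(COLUMNS)]))


def supports_handprinted(lang: LanguageSupport) -> bool:
    # Assumption: both "+" and "++ ..." marks cover handprinted text.
    return lang.handwriting.startswith("+")


def supports_handwritten(lang: LanguageSupport) -> bool:
    # Only the "++" mark denotes handwritten text.
    return lang.handwriting.startswith("++")


# Four rows copied from the table above.
SAMPLE_ROWS = [
    "Afrikaans\tAfrikaans\t+\t\t+",
    "ChineseSimplified\tChinese Simplified\t+\t\t\t\t+",
    "English\tEnglish\t+\t+\t++ (including handwritten)\t+\t+",
    "Zulu\tZulu\t+",
]

languages = {row.internal_name: row for row in map(parse_row, SAMPLE_ROWS)}

# Internal names of the sample languages that can be used for BCR.
bcr_languages = sorted(n for n, lang in languages.items() if lang.bcr == "+")
```

Keeping the raw cell text, rather than converting it to booleans, preserves marks such as `++ (including handwritten)` and footnote references, so the helpers can distinguish handwritten from handprinted support.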

[1] Several languages support recognizing handwritten text: English, German, French, Japanese, and Spanish. Other languages marked in this column support only handprinted text. The same settings (IPageAnalysisParams::DetectHandwritten and IRecognizerParams::TextTypes = TT_Handwritten) enable recognition of handwritten or handprinted text, depending on which option the language supports.

[2] The classifier that uses only image characteristics can be used for documents in any language. The text-based classifiers (ClassifierTypeEnum::CT_Combined, ClassifierTypeEnum::CT_Text) are available only for recognized documents in languages that have full dictionary support.

[3] Arabic ICR is not supported. However, handprinted Arabic digits can be recognized. See Recognizing Handprinted Arabic Digits.

[4] To recognize a block with the MICR text type, use only languages with Latin characters, not combinations of Latin and CJK languages.

[5] FineReader Engine does not support some of the special symbols with diacritics in the Scottish Gaelic and Irish languages.

[6] The Nuskhuri and Mtavruli characters are recognized separately from each other, but both types of characters are saved in the Unicode strings for Nuskhuri.

[7] A few standard characters (veys בֿ, pasekh alef אַ, komets alef אָ, pasekh tsvey yudn ײַ, melupm vov וּ) are not supported in the predefined Yiddish language. To recognize these characters, create a new custom language, add the characters to it using the LetterSet property of the TextLanguage object (see Working with Languages), and then set the new language as the recognition language. For Windows, use the scenario described in Recognizing with Training and Training to recognize ligatures.

See also
LanguageIdEnum
Working with Languages