ABBYY FineReader Engine allows you to add words that contain spaces to a dictionary. This is useful for checking multi-word entries such as “New York.” We recommend using a dictionary of words with spaces during field-level recognition, where the fields (small image fragments that contain short pieces of text) are recognized using specific information about the kind of data they can contain. For example, the entry “New York” may be useful when you are recognizing addresses. To recognize words with spaces, do the following:
  1. Add the “space” character to the alphabet of the current language.
  2. Add the necessary words with spaces to the dictionary.
  3. Set the OneWordPerLine property of the RecognizerParams object to TRUE.
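As a rough illustration of why step 1 matters, the following C++ sketch models the assumption that a dictionary entry can only be matched if each of its characters belongs to the language alphabet. It does not use the Engine API and is not the Engine's actual checking logic; the function name `inAlphabet` and the sample alphabet are hypothetical.

```cpp
#include <string>

// Toy model, not Engine code: returns true if every character of `word`
// occurs somewhere in `alphabet`. An empty word is trivially accepted.
bool inAlphabet(const std::wstring& word, const std::wstring& alphabet) {
    for (wchar_t ch : word) {
        if (alphabet.find(ch) == std::wstring::npos) {
            return false;  // character is outside the alphabet
        }
    }
    return true;
}

// Hypothetical letters-only alphabet used for the illustration.
const std::wstring kLetters =
    L"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
```

In this model, `inAlphabet(L"New York", kLetters)` returns false because of the space, while `inAlphabet(L"New York", kLetters + L" ")` returns true once the space has been added to the alphabet.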
Below is a detailed description of this operation:
  1. Create a new text language on the basis of a predefined language. To do this, create a TextLanguage object using the CreateTextLanguage method of the LanguageDatabase object and copy the attributes of the predefined language.
  2. Add the “space” character to the BaseLanguage object within the TextLanguage object, using the LetterSet property of the BaseLanguage object.
  3. Create a new dictionary and add all the necessary words with spaces to this dictionary. You can use the Dictionary object to do this.
  4. Create a UserDictionaryDescription object. Assign the path to the new dictionary to the FileName property of this object.
  5. Add the UserDictionaryDescription object to the DictionaryDescriptions collection of the BaseLanguage object.
  6. In the RecognizerParams object of all text blocks, assign the previously created TextLanguage object to the TextLanguage property and the TRUE value to the OneWordPerLine property.
Below you can see a sample in which the “space” character has been added to the alphabet of the English language, and the word “New York” has been added to the dictionary.
C++ (COM)
// Create a LanguageDatabase object
FREngine::ILanguageDatabasePtr pLanguageDatabase = Engine->CreateLanguageDatabase();
// Create a new TextLanguage object
FREngine::ITextLanguagePtr pTextLanguage = pLanguageDatabase->CreateTextLanguage();
// Copy all attributes from the predefined English language
FREngine::ITextLanguagePtr pEnglishLanguage =
          Engine->PredefinedLanguages->Find( "English" )->TextLanguage;
pTextLanguage->CopyFrom( pEnglishLanguage );
pTextLanguage->InternalName = L"SampleTL";
// Bind new dictionary to the first (and only) BaseLanguage object within TextLanguage
FREngine::IBaseLanguagePtr pBaseLanguage = pTextLanguage->BaseLanguages->Item(0);
// Change the internal name of the base language to a user-defined name
pBaseLanguage->InternalName = L"SampleBL"; 
// Add the "space" character
_bstr_t alphabet = pBaseLanguage->GetLetterSet( FREngine::BLLS_Alphabet );
pBaseLanguage->put_LetterSet( FREngine::BLLS_Alphabet, alphabet + L" " );
 
// Create new dictionary
_bstr_t dictionaryFile = L"D:\\sample.amd";
FREngine::IDictionaryPtr pDictionary =
  pLanguageDatabase->CreateNewDictionary( dictionaryFile, FREngine::LI_EnglishUnitedStates );
pDictionary->Name = L"Sample";
// Add words with space to the dictionary
pDictionary->AddWord( "New York", 100 );
// Get the collection of dictionary descriptions and remove all items
FREngine::IDictionaryDescriptionsPtr pDictionaryDescriptions =
  pBaseLanguage->DictionaryDescriptions;
pDictionaryDescriptions->DeleteAll();
// Create a user dictionary description and add it to the collection
FREngine::IDictionaryDescriptionPtr dic =
  pDictionaryDescriptions->AddNew( FREngine::DT_UserDictionary );
// Specify the path to the dictionary which contains words with spaces
FREngine::IUserDictionaryDescriptionPtr userDic =
  dic->GetAsUserDictionaryDescription();
userDic->FileName = dictionaryFile;
FREngine::ILayoutPtr pLayout;
...
// Specify the properties of the RecognizerParams object of all text blocks
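// Iterate blocks in reverse order so that deleting a block does not shift the indices still to be visited
for( int i = pLayout->Blocks->Count - 1; i >= 0; i-- ) {
  FREngine::BlockTypeEnum blockType = pLayout->Blocks->Item( i )->Type;
  // Delete non-text blocks and set the recognition parameters of text blocks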
  if( blockType != FREngine::BT_Text ) {
    pLayout->Blocks->DeleteAt(i);
  } else {
    pLayout->Blocks->Item(i)->GetAsTextBlock()->RecognizerParams->TextLanguage = pTextLanguage;
    pLayout->Blocks->Item(i)->GetAsTextBlock()->RecognizerParams->OneWordPerLine = VARIANT_TRUE;
  }
}
...
C#
// Create a LanguageDatabase object
FREngine.ILanguageDatabase languageDatabase = engineLoader.Engine.CreateLanguageDatabase();
// Create a new TextLanguage object
FREngine.ITextLanguage textLanguage = languageDatabase.CreateTextLanguage();
// Copy all attributes from the predefined English language
FREngine.ITextLanguage englishLanguage = engineLoader.Engine.PredefinedLanguages.Find( "English" ).TextLanguage;
textLanguage.CopyFrom( englishLanguage );
textLanguage.InternalName = "SampleTL";
// Bind new dictionary to the first (and only) BaseLanguage object within TextLanguage
FREngine.IBaseLanguage baseLanguage = textLanguage.BaseLanguages[0];
// Change the internal name of the base language to a user-defined name
baseLanguage.InternalName = "SampleBL";
// Add the "space" character
string alphabet = baseLanguage.get_LetterSet( FREngine.BaseLanguageLetterSetEnum.BLLS_Alphabet );
baseLanguage.set_LetterSet( FREngine.BaseLanguageLetterSetEnum.BLLS_Alphabet, alphabet + " " );
// Create new dictionary
string dictionaryFilePath = "D:\\sample.amd";
FREngine.IDictionary dictionary = languageDatabase.CreateNewDictionary( dictionaryFilePath, FREngine.LanguageIdEnum.LI_EnglishUnitedStates );
dictionary.Name = "Sample";
// Add words with space to the dictionary
dictionary.AddWord( "New York", 100 );
// Get the collection of dictionary descriptions and remove all items
FREngine.IDictionaryDescriptions dictionaryDescriptions = baseLanguage.DictionaryDescriptions;
dictionaryDescriptions.DeleteAll();
// Create a user dictionary description and add it to the collection
FREngine.IDictionaryDescription dic = dictionaryDescriptions.AddNew(FREngine.DictionaryTypeEnum.DT_UserDictionary);
// Specify the path to the dictionary which contains words with spaces
FREngine.IUserDictionaryDescription userDic = dic.GetAsUserDictionaryDescription();
userDic.FileName = dictionaryFilePath;
FREngine.ILayout layout;
...
// Specify the properties of the RecognizerParams object of all text blocks
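// Iterate blocks in reverse order so that deleting a block does not shift the indices still to be visited
for( int i = layout.Blocks.Count - 1; i >= 0; i-- ) {
    FREngine.BlockTypeEnum blockType = layout.Blocks[i].Type;
    // Delete non-text blocks and set the recognition parameters of text blocks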
    if( blockType != FREngine.BlockTypeEnum.BT_Text ) {
        layout.Blocks.DeleteAt(i);
    } else {
        layout.Blocks[i].GetAsTextBlock().RecognizerParams.TextLanguage = textLanguage;
        layout.Blocks[i].GetAsTextBlock().RecognizerParams.OneWordPerLine = true;
    }
}
...
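The samples above append the space to the alphabet unconditionally. If the same setup code may run more than once, or if you are not sure whether the letter set already contains a space, a small guard can avoid adding it twice. The documentation does not say whether a duplicate character is harmful, so treat this as an optional precaution. The helper below is a hypothetical, Engine-independent C++ sketch that works on a plain `std::wstring`; `withCharacter` is not part of the FineReader Engine API.

```cpp
#include <string>

// Hypothetical helper, not part of the Engine API: returns `letterSet`
// unchanged if it already contains `ch`, otherwise returns a copy with
// `ch` appended at the end.
std::wstring withCharacter(const std::wstring& letterSet, wchar_t ch) {
    if (letterSet.find(ch) != std::wstring::npos) {
        return letterSet;  // already present, nothing to add
    }
    return letterSet + ch;
}
```

To use it with the C++ sample, the string returned by GetLetterSet would first have to be converted to `std::wstring`, and the result converted back before it is passed to put_LetterSet; that conversion is left out of this sketch.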

See also

Working with Languages
Working with Dictionaries
Field-Level Recognition