A search element allows you to set conditions for the type and properties of an object you wish to extract. As NLP activities work with unstructured text, search conditions specify the position of objects in relation to other text rather than their geometric relationship. You can also use auxiliary search elements to narrow down the search, specifying that the desired object can be found inside, before, or after such auxiliary elements.

Creating a search element

  • You can quickly create a search element by clicking one of the objects highlighted on the image. The new search element will be of the same type as the object you click. Use the Show Image Objects button on the toolbar to select the objects to be highlighted.

Highlighting objects

The following types of objects can be highlighted:
  • Person
  • Organization
  • Address
  • Location
  • Date
  • Duration
  • Money
  • Recognized words
Note: All of these objects except recognized words are highlighted by default.
  • You can also create a new search element by using the menu:
  1. Go to the Search Elements tab to the right of the document window.
  2. Click Create Element.
  3. Select the desired element type in the list that opens.
Once the element has been created, you need to set up its properties in the Properties pane (see Element properties for more information).
Note: The specified properties can also be viewed and edited in code format (see Code syntax for Extraction Rules activity for NLP for more information).

Search element types

When creating a search element, you need to specify its type, which will depend on the object you want to find. The available types of search elements are briefly described below.

Person

Names of people, for example: John Doe, Jane Smith.

Organization

Names of organizations, for example: ABBYY, Acme Corp.

Address

Addresses, for example: 123 Main St., Anytown AB 45678.

Location

Names of locations, for example: Anytown, Corporate Place.

Date

Dates in different formats, for example: November 14, 2009, 11/14/2009.

Duration

Time periods, for example: twelve (12) months, 4 days.

Money

Amounts of money, for example: $2670.00, 199 dollars 99 cents.

Note: The Person, Organization, Address, Location, Date, Duration, and Money elements correspond to the named entities that you can set up in the Named Entities (NER) activity and are extracted using the same technology.

Value from Dictionary

A word or phrase from a dictionary. The dictionary should be a plain TXT file with a list of search text variants, one variant per line of text.
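
As a rough illustration of the file format described above, the following Python sketch writes a list of search text variants to a plain TXT file, one variant per line. The function name `write_dictionary`, the sample variants, and the UTF-8 encoding are assumptions made for this example and are not taken from the product documentation.

```python
from pathlib import Path


def write_dictionary(path, variants):
    """Write search text variants to a plain TXT file, one variant per line."""
    seen = set()
    cleaned = []
    for variant in variants:
        # Trim surrounding whitespace so each line holds exactly one variant.
        text = variant.strip()
        # Skip empty entries and duplicates, keeping the original order.
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    # UTF-8 is an assumption; use the encoding your environment expects.
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned
```

For example, `write_dictionary("organizations.txt", ["Acme Corp", "Acme Corporation"])` would produce a two-line file that could serve as the dictionary for a Value from Dictionary element.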

Value from Regular Expression

A value that matches a regular expression you specified.
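
It can help to try a pattern on sample text before entering it in the element's properties. The sketch below uses Python's `re` module for this. The invoice number format (`INV-2023-00417`) and the names `INVOICE_PATTERN` and `find_values` are hypothetical, and the regular expression dialect supported by the product may differ from Python's, so treat this only as a way to reason about a pattern.

```python
import re

# Hypothetical pattern: "INV-", a four-digit year, a hyphen, and five digits.
INVOICE_PATTERN = re.compile(r"\bINV-\d{4}-\d{5}\b")


def find_values(text):
    """Return every substring of text that matches INVOICE_PATTERN."""
    return [match.group(0) for match in INVOICE_PATTERN.finditer(text)]
```

Calling `find_values` on a sentence that mentions two such invoice numbers returns both of them in the order they appear.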

Text

A keyword or phrase, with the option to look for all word forms or to allow for some recognition errors.

Group

A collection of nested search elements. Elements making up a group can be both simple and group elements. A group element has no properties of its own. Data will be extracted based on the settings of its nested search elements. Group elements can be used to enforce a logical hierarchy of elements, for easier debugging and navigation. For example, grouping together a person’s name, address, and date of birth will let you extract the data about each person in a consistent manner.

Repeating Group

This element is designed to look for repeating groups of elements. Repeating groups are intended for cases where an entity may have multiple instances, each with its own properties, but you do not know how many instances you are going to have. The properties of each instance are specified in the nested elements of the repeating group. For example, if you are processing résumés, you may want to create an “Education” repeating group with the following nested elements: “School_name”, “Degree”, “Start_date”, and “Graduation_date”. On the other hand, if the data you are looking for relates to different entities with different roles, a repeating group won’t be the right choice. For example, if you have only two parties to a contract, say, buyer and seller, create a “Party1_Buyer” group and a “Party2_Seller” group instead of one repeating “Party” group.
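
One way to picture the difference between the two cases above is to model the extracted data. In the Python sketch below, the "Education" repeating group becomes a list with an unknown number of items, while the two contract parties become two separately named fields. The class and field names are illustrative assumptions and do not represent the product's actual output format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Education:
    """One instance of the hypothetical "Education" repeating group."""
    school_name: str
    degree: Optional[str] = None
    start_date: Optional[str] = None
    graduation_date: Optional[str] = None


@dataclass
class Resume:
    # Repeating group: the number of instances is not known in advance.
    education: List[Education] = field(default_factory=list)


@dataclass
class Party:
    name: str
    address: Optional[str] = None


@dataclass
class Contract:
    # Two distinct roles: modeled as two plain groups, not a repeating group.
    party1_buyer: Party
    party2_seller: Party
```

A résumé may yield zero, one, or many `Education` instances, whereas a contract in this model always has exactly one buyer and one seller.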

Input field

This element allows you to use a field extracted by another activity as a building block for the rules. For example, if an Extraction Rules activity is preceded by a Segmentation activity, you may want to use some of the segments to narrow down the search.

Changing an element’s type, name, and position in the list

To change the type of an element:
  • Right-click an element and select Convert Element to on the shortcut menu.
  • Select an element in the list and click a highlighted object on the image. This will let you convert the selected search element to the type of the highlighted object. If you click on a highlighted recognized word, you can convert the search element to Text and at the same time add the selected word to the list of keywords for this search element.
  • For search elements that correspond to named entities, use the Entities property to change the type of the named entity.
Note: Changing the type of an element won’t convert non-group elements to group elements and vice versa.
To change the name of an element:
  • Right-click an element, select Rename on the shortcut menu, and enter a new name.
  • Select an element, click on its name (or press F2), and enter a new name.
An element name can contain English letters, numbers, and underscores. However, an element name cannot start with a number. Spaces, special symbols (.,:- \ /), and reserved names are not allowed.
To move elements in the list:
  • Drag elements up or down to change their position in the list.
  • Drag elements onto a group element to put them inside the group.
Note: Elements are searched top down. This means that elements used in the search conditions of another element must precede that element in the list of elements.
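
The naming rules and the top-down search order described in this section can be checked on paper before you build a long list of elements. The Python sketch below is a minimal illustration under stated assumptions: the set of reserved names is not listed here, so `reserved` is a caller-supplied placeholder, and the `(name, dependencies)` representation of an element is hypothetical.

```python
import re

# English letters, digits, and underscores; the first character is not a digit.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_name(name, reserved=frozenset()):
    """Check a candidate element name against the rules stated above.

    `reserved` is a placeholder: supply the reserved names you know of.
    """
    return _NAME_RE.fullmatch(name) is not None and name not in reserved


def find_order_problems(elements):
    """Find elements that reference an element not placed before them.

    `elements` is a list of (name, dependencies) pairs in list order, where
    `dependencies` holds the names used in that element's search conditions.
    Returns (element, dependency) pairs that break the top-down order.
    """
    seen = set()
    problems = []
    for name, dependencies in elements:
        for dependency in dependencies:
            if dependency not in seen:
                problems.append((name, dependency))
        seen.add(name)
    return problems
```

For instance, `is_valid_name("Start_date")` passes while `is_valid_name("2nd_party")` does not, and `find_order_problems([("Amount", ["Header"]), ("Header", [])])` reports that "Amount" refers to "Header" before "Header" appears in the list.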