Debugging the Extraction Rules activity includes the following steps:
  1. Compiling and matching the activity.
  2. Reviewing the errors and refining the rules to improve object extraction.
Repeat these steps until you are satisfied with the results.

Compiling and matching

The Extraction Rules activity is compiled automatically when you exit the activity editor or click Match or Test Activity. If any compilation errors occur, matching cannot proceed. Search elements with compilation errors are marked with an error icon. Hover over the icon to see a detailed description of the error. Compilation errors may occur in the following cases:
  • Element dependencies: Elements are searched top down. This means that elements used in the search conditions of another element must precede that element in the list of elements. For example, if Element A is referenced in the search conditions for Element B, then Element A must precede Element B in the list of elements. If you disable Element A or move it below Element B in the list of elements, a compilation error will occur, and Element B will be marked with an error icon.
  • Regular expressions: If the regular expression in a Value from Regular Expression search element is invalid, an error will occur in this element.
  • Dictionaries: If the program is unable to connect to the dictionary used by a Value from Dictionary search element, an error will occur in this element.
  • Code: If the code of a search element contains errors, an error will occur in this element (see Code syntax for Extraction Rules activity for NLP).
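The first two checks in the list above can be sketched outside the product to make the ordering rule concrete. The following is a minimal illustration only, not the product's compiler: the `Element` class, its fields, and `compile_check` are hypothetical names, and Python's `re` module stands in for the product's regular expression syntax, which may differ.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a search element (not a product API)."""
    name: str
    enabled: bool = True
    references: List[str] = field(default_factory=list)  # elements used in search conditions
    regex: Optional[str] = None  # pattern of a "Value from Regular Expression" element


def compile_check(elements: List[Element]) -> Dict[str, List[str]]:
    """Walk the list top down and return {element name: [error messages]}."""
    errors: Dict[str, List[str]] = {}
    seen = set()  # enabled elements that appear above the current one
    for el in elements:
        if not el.enabled:
            continue  # assumption: disabled elements are not checked themselves
        problems = []
        for ref in el.references:
            if ref not in seen:
                problems.append(f"'{ref}' must be enabled and precede '{el.name}'")
        if el.regex is not None:
            try:
                re.compile(el.regex)
            except re.error as exc:
                problems.append(f"invalid regular expression: {exc}")
        if problems:
            errors[el.name] = problems
        seen.add(el.name)
    return errors


# Element B references Element A, but A is listed below B; Date has an unclosed group.
elements = [
    Element("B", references=["A"]),
    Element("A"),
    Element("Date", regex=r"(\d{2}"),
]
for name, problems in compile_check(elements).items():
    print(name, problems)
```

Moving `A` above `B` and closing the group in the pattern makes `compile_check` return an empty dictionary, which mirrors the state in which matching can proceed.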
Matching refers to finding objects that meet the conditions specified in the properties of the search elements. The program goes down the Search Elements list and attempts to locate all the objects described by the elements, one by one. Matching is performed only for the active elements and fields.
To reduce matching time while debugging, you can temporarily disable the elements that do not affect the results for the element you are currently debugging. To change the state of an element, use its shortcut menu. You can also select multiple elements and change their state with one click.
The compilation and matching status of the Extraction Rules activity is displayed in the notification log (available by clicking the bell icon button in the upper right corner). You can navigate to the matching results by clicking the link in the appropriate notification.
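When deciding which elements can be disabled while debugging, it may help to list everything the target element depends on, directly or indirectly. The sketch below assumes you have written down the references between elements by hand as a dictionary; the element names and the `elements_to_keep` helper are hypothetical, and the sketch only models references made in search conditions, not any other way one element might influence another.

```python
def elements_to_keep(target, references):
    """Return the target plus every element it references, directly or indirectly."""
    keep = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        stack.extend(references.get(name, []))
    return keep


# Hypothetical element list (top-down order) and hand-recorded references.
all_elements = ["Currency", "Amount", "TotalKeyword", "Total", "DateKeyword", "InvoiceDate"]
references = {
    "Total": ["TotalKeyword", "Amount"],
    "Amount": ["Currency"],
    "InvoiceDate": ["DateKeyword"],
}

keep = elements_to_keep("Total", references)
can_disable = [name for name in all_elements if name not in keep]
print(can_disable)  # ['DateKeyword', 'InvoiceDate']
```

Disabling anything inside the `keep` set would trigger the element dependency compilation error described earlier, so only the remaining elements are candidates for temporary disabling.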

Reviewing and correcting errors

If a search element has not been found, check that you selected the correct element type and consider refining the conditions for a more reliable search. For example, auxiliary search elements can be added to help locate the element. Once the errors have been corrected, match the activity again, making sure that all the objects can be found on the problem pages and that the corrections have not interfered with the matching of objects on other pages.
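The final check, that a correction fixes the problem pages without breaking others, amounts to comparing two matching runs. The sketch below assumes you have noted, for each page, which elements were found before and after the correction; the page names, element names, and `compare_runs` function are hypothetical, and the data is recorded by hand rather than exported by the product.

```python
def compare_runs(before, after):
    """Compare two runs given as {page: set of found element names}.

    Returns (fixed, broken): elements newly found per page, and elements
    that were found before the correction but are missing after it.
    """
    fixed, broken = {}, {}
    for page in sorted(set(before) | set(after)):
        old = before.get(page, set())
        new = after.get(page, set())
        if new - old:
            fixed[page] = new - old
        if old - new:
            broken[page] = old - new
    return fixed, broken


before = {"page1": {"Total"}, "page2": {"Total", "InvoiceDate"}}
after = {"page1": {"Total", "InvoiceDate"}, "page2": {"Total"}}

fixed, broken = compare_runs(before, after)
print("fixed:", fixed)
print("broken:", broken)
```

A non-empty `broken` result indicates that the correction interfered with matching on a page that previously worked, so the conditions need further refinement before the change is kept.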