Preparatory steps
- Open the “Sick Note DE” activity in the Activity Editor.
- Select one of the documents from the document set.
- Make sure that the advanced mode for the element properties is enabled. To toggle this mode on or off, click the icon on the Properties pane.
- All uploaded documents have undergone pre-recognition, and it’s useful to see what objects were found on the image. Click the icon. If you don’t see this icon due to the size of your screen, click the icon and select Recognized Words. The corresponding objects will be highlighted on the document image. You can switch between various highlighted object types at any time. For example, switching to Recognized Lines can be helpful when looking for paragraphs, and switching to Separators will facilitate the configuration of a Separator search element.
- If a search element lies outside the search area, it will not be found. Enable the Show search area option in the document image context menu. The search area for each element will be highlighted in green when you evaluate the matching results.
Extracting the patient’s data
Let’s start by extracting the missing data for the patient. To do so, we need to create several search elements. We advise grouping all elements related to one entity. Elements are matched one after another, and not finding the top element will decrease the hypothesis quality for subsequent elements. At the same time, groups of search elements are processed independently of one another during matching, and an individual hypothesis is formulated for each group. Thus, you can control how elements influence one another. You can also evaluate matching results at a glance by checking whether the group elements have been found successfully. Lastly, grouping may help reduce matching time.
- Click Create Element and select the Group element from the drop-down list. Change its name to “PatientDataArea”.
- A new group search element is set to be required by default. If a required element is not found, the Activity Editor runs into an error and matching is aborted. This scenario lets activities be skipped if they are not suitable for a certain document. However, in this tutorial we are creating an activity to extract data from all incoming documents, so we want the group to be optional. In the Under what conditions section, change the Element is value to Optional.
- We want to find the paragraph that contains the patient’s name and address. In German documents, the paragraph we are looking for is always located in the field with the label “Name, Vorname … ”. We need to find this text on the document and use it as a reference to search for the data we want to extract.
a. Keywords can be found using the Static Text search element. Click Create Element and select the Static Text element from the drop-down list. Change its name to “kwPatientTitle”.
b. Enter the text “Name, Vorname” in the Text to find field on the Properties pane.
c. Click Match. When processing is finished, you will see the Tree of Hypotheses below the document. Make sure that Advanced Designer has successfully found the desired static text. A green dot next to the element name indicates that a corresponding element was successfully found on the document. If you click the element name in the Tree of Hypotheses, you will see a violet frame around the corresponding region on the document.
- Now let’s find the lower boundary of the cell which contains the patient’s name and address. We will do so using a Separator element.
a. Add a Separator element to the group and call it “SeparatorBottom”. Set its minimum length to 200.
b. Right-click the element and select Match Element in the context menu. You will see that the Tree of Hypotheses contains many green dots. They correspond to different separators that fit the search criteria. You can click on each dot to see the corresponding object on the image.
c. To narrow down the search criteria, specify the search area for the separator. Click Match to find the “kwPatientTitle” element, which will be used as an anchor element. In the Where to search section of the Properties pane, click Draw on Image. Select the “kwPatientTitle” element on the document, click the down arrow icon to specify the search area below the keyword, and then click the nearest icon to look for the separator nearest to the keyword. You can find a detailed description of the anchor elements in the documentation.
d. Click Match and check that Advanced Designer has found the separator below the “kwPatientTitle” element. You can check the hypothesis for each element by clicking its name in the Tree of Hypotheses section.
- A label and a separator are reliable reference elements for the patient’s data. However, if the print quality is too low, there is a chance the label text won’t be recognized or the separator won’t be found. To ensure good extraction results, we will search for a paragraph that lies between the label and the separator. A paragraph is a uniform block of text, meaning that it can successfully be found even if some of the boundary elements were not found.
a. Create a Paragraph search element and call it “NameAddressParagraph”.
b. Change Text alignment to Left.
c. The patient’s data occupies from two to five lines, so specify the Line count from 2 to 5.
d. Specify the search area for the paragraph. This time you should use the Add menu in the Where to search section. The element should be located below the “kwPatientTitle” element and above the “SeparatorBottom” element.
e. Click Match.
- Now we want to extract the patient’s data. Create a new group element called “PatientGroup”.
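The way the separator and the paragraph search area are derived from the “kwPatientTitle” anchor can be illustrated outside Advanced Designer. The Python below is only a sketch of that geometry, not Advanced Designer script: the `Rect` type, the function names, and all coordinates are hypothetical, “nearest” is assumed to mean the smallest vertical gap, and image coordinates are assumed to grow downward.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in image coordinates (y grows downward)."""
    left: int
    top: int
    right: int
    bottom: int


def nearest_separator_below(keyword: Rect, separators: List[Rect],
                            min_length: int = 200) -> Optional[Rect]:
    """Return the closest long-enough separator below the keyword, if any."""
    below = [
        s for s in separators
        if s.top >= keyword.bottom and (s.right - s.left) >= min_length
    ]
    # Assumption: "nearest" means the smallest vertical gap to the keyword.
    return min(below, key=lambda s: s.top - keyword.bottom, default=None)


def area_between(page: Rect, keyword: Rect, separator: Rect) -> Rect:
    """Page-wide search area below the keyword and above the separator."""
    return Rect(page.left, keyword.bottom, page.right, separator.top)


# Hypothetical coordinates for one page.
page = Rect(0, 0, 1000, 1400)
kw_patient_title = Rect(50, 100, 250, 120)
separators = [
    Rect(40, 60, 900, 62),    # above the keyword, ignored
    Rect(40, 150, 100, 152),  # below, but shorter than the minimum length
    Rect(40, 260, 900, 262),  # nearest qualifying separator
    Rect(40, 700, 900, 702),  # qualifies, but is farther away
]
separator_bottom = nearest_separator_below(kw_patient_title, separators)
print(area_between(page, kw_patient_title, separator_bottom))
```

Without the “below” and “nearest” constraints, every separator of sufficient length would be a valid hypothesis, which is why the unrestricted match shows many green dots.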
- The patient’s name can occupy one or two lines. To capture several instances of an element, we will use a repeating group.
a. Create a Repeating Group search element and call it “NameGroup”. Specify 2 as the maximum number of repetitions. Make the element optional.
b. We want to search for the lines that are part of the “NameAddressParagraph” paragraph. To specify the element’s region as the search area, click the code editor icon below the document image and paste the following script in the Search Conditions section of the Code Editor:
d. The text we are looking for may contain upper- and lower-case letters, as well as a set of punctuation marks that may occur in names. Configure two separate character sets. The first set should contain all Latin upper- and lower-case letters. To add characters with diacritical marks, change the Unicode subrange or paste the characters directly into the Selected characters field.
e. The other set should contain the following punctuation marks: ,-.()’. We don’t want the string to contain only punctuation marks, so set the Portion in text, % for the second set to 40%. This property defines the maximum allowed percentage of characters from a certain set.
Note: The default settings allow the string to contain up to 30% of characters not included in any set. This helps find strings even when some characters are recognized incorrectly or are not included in the set (such as characters with diacritical marks). You can adjust this setting by changing the Allowed errors value on the Properties pane.
f. Disable the Search for parts of words option.
g. Specify the search area for the “NameLine” element: below the “kwPatientTitle” element and nearest to it.
h. Click Match and review the Tree of Hypotheses. You will see that two character strings are found. However, the second string contains the patient’s address.
i. To exclude the address from the search results, we will check if the first string contains both the first and the last name. This can be done by adding a simple script search condition. Select the “NameLine” search element and open the Search Conditions code editor.
j. We assume that the first line contains a full name if it contains a comma and a whitespace. If it contains a full name, we don’t want to search for a second instance of the repeating group. Paste the following script in the editor:
- The patient’s name extracted in step 7 will be mapped to the “Name” field. We will also extract and map the patient’s address.
a. Inside the “PatientGroup”, create a Character String search element called “Address” with the same character set configuration as the “NameLine” element.
b. Specify the search area for the element using code: the address must be located below the “NameLine” or, in case this element was not found, below the first line of the “NameAddressParagraph” element.
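The character-set rules shared by “NameLine” and “Address”, together with the full-name test on the first line, can be modeled in a few lines of Python. This is a hedged illustration rather than the Advanced Designer script: the function names, the handful of umlaut characters added to the letter set, the way whitespace is ignored when computing percentages, and the sample strings are all assumptions.

```python
import string
from typing import List

# First set: Latin letters plus a few characters with diacritical marks.
LETTERS = set(string.ascii_letters) | set("ÄÖÜäöüß")
# Second set: punctuation marks that may occur in names.
PUNCTUATION = set(",-.()’")


def matches_character_sets(text: str, max_punct_share: float = 0.40,
                           allowed_errors: float = 0.30) -> bool:
    """Check the 40% punctuation limit and the 30% allowed-errors limit."""
    chars = [c for c in text if not c.isspace()]  # assumption: spaces ignored
    if not chars:
        return False
    punct = sum(c in PUNCTUATION for c in chars)
    unknown = sum(c not in LETTERS and c not in PUNCTUATION for c in chars)
    return (punct / len(chars) <= max_punct_share
            and unknown / len(chars) <= allowed_errors)


def contains_full_name(line: str) -> bool:
    """A line is treated as a full name if it has a comma and a whitespace."""
    return "," in line and any(c.isspace() for c in line)


def collect_name_lines(lines: List[str], max_repetitions: int = 2) -> List[str]:
    """Take up to two lines, stopping early once a full name is found."""
    found = []
    for line in lines[:max_repetitions]:
        if not matches_character_sets(line):
            break
        found.append(line)
        if contains_full_name(line):
            break  # no second instance of the repeating group is needed
    return found


print(collect_name_lines(["Mustermann, Max", "Musterstr. 12", "12345 Berlin"]))
print(collect_name_lines(["Mustermann-Schmidt,", "Maximilian", "Musterstr. 12"]))
```

In the first call the address line is never examined because the first line already holds a full name; in the second, the name spans two lines and both are collected.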

- Open the Manage Fields dialog, create the corresponding fields, and map them to search elements as follows:
| Name | Type | Search element |
|---|---|---|
| Name | Text field in the “Patient” group | NameLine |
| Address | Text field in the “Patient” group | Address |
- Delete the search elements that were automatically created for the new fields.
Extracting the type of sick note
The type of sick note field has two checkboxes. They are labeled as “Erstbescheinigung” and “Folgebescheinigung”. The task is to find the labels and then to check whether there are filled checkmarks next to them.
- Create a Group element called “TypeOfSickNoteGroup”. Make the element optional.
- To store the information about both checkmarks, create a Repeating Group search element and call it “PrimaryGroup”.
a. It is a good idea to restrict the search area for the element group. Specify the search area using code: to the right of the “PatientGroup” element and above the “DoctorAreaGroup” element (which will be created later on). **Note:** Always specify the “Exists” condition when using future elements.
c. Create an Object Collection search element called “Checkmark” with the following settings: Type: Checkmark, Checkmark state: Checked, Minimum height: 10, Maximum width: 20, Maximum height: 20. Specify that the element is located to the left of the “kwPrimary” element and nearest to it.
d. Click Match.
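The reason for guarding a constraint with an “Exists” condition can be shown with a small model: a constraint that refers to an element without a region cannot be applied safely. The Python below is an illustrative sketch, not Advanced Designer script. It assumes an element that has not been found (or not yet created) is represented as `None`, and the `Rect` type, function name, and coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in image coordinates (y grows downward)."""
    left: int
    top: int
    right: int
    bottom: int


def restricted_area(page: Rect, patient_group: Optional[Rect],
                    doctor_area_group: Optional[Rect]) -> Rect:
    """Right of PatientGroup and above DoctorAreaGroup, where they exist."""
    left, bottom = page.left, page.bottom
    if patient_group is not None:      # plays the role of the "Exists" check
        left = patient_group.right
    if doctor_area_group is not None:  # future element: may have no region
        bottom = doctor_area_group.top
    return Rect(left, page.top, page.right, bottom)


page = Rect(0, 0, 1000, 1400)
patient_group = Rect(50, 100, 500, 260)
print(restricted_area(page, patient_group, None))
print(restricted_area(page, patient_group, Rect(600, 900, 980, 1100)))
```

When the future element is missing, the sketch simply leaves that side of the search area at the page edge instead of failing.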
- Copy and paste the “PrimaryGroup” group. Rename the copied group to “SecondaryGroup”. This group will be required.
- Edit the “SecondaryGroup”.
a. Rename the “kwPrimary” element to “kwSecondary” and set the text to find to “Folgebescheinigung”. Specify the search area: below the “kwPrimary” element from the “PrimaryGroup”.
b. Specify the search area for the “Checkmark” element: to the left of “kwSecondary” and nearest to it.
c. The Object Collection search element finds a collection of all suitable objects within the search area. If the checkmarks are located on the same line, the “Checkmark” element of the “SecondaryGroup” may also find the Primary checkmark. To avoid this, exclude the primary checkmark (“Checkmark” element of the “PrimaryGroup”) from the search area for the “Checkmark” element from the “SecondaryGroup”.
d. Click Match.
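To see why the primary checkmark has to be excluded, it helps to model what an object collection does when both checkboxes sit on one line. The sketch below is plain Python, not Advanced Designer script; the `Rect` type, the function names, and the coordinates are hypothetical, and exclusion is modeled as dropping any candidate that overlaps the excluded region.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in image coordinates (y grows downward)."""
    left: int
    top: int
    right: int
    bottom: int


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share any interior area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def collect_checkmarks(candidates: List[Rect], keyword: Rect,
                       excluded: Optional[Rect] = None,
                       min_height: int = 10, max_width: int = 20,
                       max_height: int = 20) -> List[Rect]:
    """Collect every size-compatible object to the left of the keyword."""
    result = []
    for c in candidates:
        width, height = c.right - c.left, c.bottom - c.top
        if not (min_height <= height <= max_height and width <= max_width):
            continue
        if c.right > keyword.left:
            continue  # not to the left of the keyword
        if excluded is not None and overlaps(c, excluded):
            continue  # region excluded from the search area
        result.append(c)
    return result


# Both checkboxes on one line: [x] Erstbescheinigung [x] Folgebescheinigung
primary_mark = Rect(100, 500, 115, 515)
secondary_mark = Rect(320, 500, 335, 515)
kw_secondary = Rect(340, 498, 540, 518)
candidates = [primary_mark, secondary_mark, Rect(10, 400, 60, 450)]

print(len(collect_checkmarks(candidates, kw_secondary)))
print(len(collect_checkmarks(candidates, kw_secondary, excluded=primary_mark)))
```

Without the exclusion, the collection for the secondary label contains both checkmarks; with it, only the checkmark next to “Folgebescheinigung” remains.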

- Open the Manage Fields dialog, create the corresponding fields, and map them to search elements as follows:
| Name | Type | Search element |
|---|---|---|
| Type of Sick Note | Checkmark group | |
| Primary | Checkmark in the “Type of Sick Note” checkmark group | PrimaryGroup -> Checkmark |
| Secondary | Checkmark in the “Type of Sick Note” checkmark group | SecondaryGroup -> Checkmark |
- Delete the search elements that were automatically created for the new fields.
Extracting the doctor’s data
We now have to process the last block of data on these documents. It contains the doctor’s data and signature. We’ll first find the box which holds the data and then extract a paragraph with the doctor’s information and an image region containing the signature.
- Create a Group element called “DoctorAreaGroup”. Make the element optional.
- The box we’ll be looking for contains a label. To find it, create a Static Text element called “kwDoctorTitle” (text to find: “Unterschrift des Arztes”).
- Inside the “DoctorAreaGroup” group, create another group called “DataArea”.
- The box that contains the doctor’s information and signature is a combination of four separators. They are located around the “kwDoctorTitle” element. However, we should configure the elements in a way that allows the program to find them even if the “kwDoctorTitle” element wasn’t found. In the “DataArea” group, create four Separator search elements with the following properties:
| Name | Orientation | Minimum length | Search area |
|---|---|---|---|
| SeparatorRight | Vertical | 180 | Right of “kwDoctorTitle”, Nearest to the right page edge |
| SeparatorLeft | Vertical | 180 | Left of “kwDoctorTitle”, Left of “SeparatorRight” (in case “kwDoctorTitle” wasn’t found), Nearest to “SeparatorRight”, Below “SeparatorRight” (click the icon to the right of the separator name and select Top Boundary of Region), Exclude “SeparatorRight” |
| SeparatorBottom | Horizontal | 200 | Below “kwDoctorTitle” (with adjustment of -10 points), Right of “SeparatorLeft”, Left of “SeparatorRight”, Nearest to the bottom page edge (this setting will be useful in case “kwDoctorTitle” wasn’t found) |
| SeparatorTop | Horizontal | 200 | Above “kwDoctorTitle”, Right of “SeparatorLeft”, Nearest to “TypeOfSickNoteGroup”, Exclude “SeparatorBottom” |
- We could specify the search area for the doctor’s signature and doctor information manually with respect to the found separators. Instead, we will create a Region element that corresponds to the area bounded by the separators. Create a Region search element called “BoxRegion” and specify the search area: left of “SeparatorRight”, right of “SeparatorLeft”, above “SeparatorBottom”, and below “SeparatorTop”.
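Geometrically, the “BoxRegion” is the rectangle enclosed by the inner edges of the four separators. The following Python sketch shows that computation under stated assumptions: it is not Advanced Designer script, the `Rect` type and function name are hypothetical, the coordinates are invented, and image coordinates are assumed to grow downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in image coordinates (y grows downward)."""
    left: int
    top: int
    right: int
    bottom: int


def box_region(sep_left: Rect, sep_right: Rect,
               sep_top: Rect, sep_bottom: Rect) -> Rect:
    """Area right of the left separator, left of the right one,
    below the top separator, and above the bottom one."""
    return Rect(
        left=sep_left.right,
        top=sep_top.bottom,
        right=sep_right.left,
        bottom=sep_bottom.top,
    )


# Hypothetical separator positions around the doctor's box.
separator_left = Rect(600, 900, 602, 1100)
separator_right = Rect(980, 900, 982, 1100)
separator_top = Rect(600, 898, 982, 900)
separator_bottom = Rect(600, 1100, 982, 1102)

print(box_region(separator_left, separator_right, separator_top, separator_bottom))
```

Defining the region once means later elements can refer to a single rectangle instead of repeating four separate constraints.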
- Create a new group called “DoctorGroup”.
- To locate the doctor’s signature, create an Object Collection element with the following settings inside the “DoctorGroup”:
| Property | Value |
|---|---|
| Name | Signature |
| Type | Picture |
| Minimum width | 15 |
| Minimum height | 15 |
| Maximum width | 600 |
| Maximum height | 350 |
| Search Conditions section of the Code Editor | The signature may be partly located outside the box. To find the whole image, we will expand the search area by 100 dots in each direction: RSA: DoctorAreaGroup.DataArea.BoxRegion.Rect.GetInflated(100dot,100dot); |
- To extract the text information in the box, create a Paragraph element with the following settings:
| Property | Value |
|---|---|
| Name | DoctorInformation |
| Maximum line count | 6 |
| Search area | Above “kwDoctorTitle”, Exclude “Signature” |
| Search Conditions section of the Code Editor | RSA: DoctorAreaGroup.DataArea.BoxRegion.Rect; |
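The difference between the two search areas above (the inflated rectangle for “Signature” and the exact rectangle for “DoctorInformation”) can be illustrated with a small Python model. This is a sketch, not Advanced Designer script: the `inflated` helper only mirrors the described effect of expanding the area by 100 dots in each direction, the `Rect` type and coordinates are hypothetical, and dots are treated as plain pixel units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in image coordinates (y grows downward)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, other: "Rect") -> bool:
        """True if the other rectangle lies entirely inside this one."""
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def inflated(rect: Rect, dx: int, dy: int) -> Rect:
    """Grow the rectangle by dx on the left and right, dy on top and bottom."""
    return Rect(rect.left - dx, rect.top - dy, rect.right + dx, rect.bottom + dy)


box = Rect(602, 900, 980, 1100)
# A signature that starts inside the box but runs past its lower edge.
signature = Rect(650, 1000, 900, 1150)

print(box.contains(signature))
print(inflated(box, 100, 100).contains(signature))
```

The exact box rejects a signature that crosses the border, while the inflated area still covers it; the paragraph keeps the exact box because the doctor’s text is expected to stay inside it.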
- Click Match and make sure the elements are found correctly.

- Open the Manage Fields dialog, create the corresponding fields, and map them to search elements as follows:
| Name | Type | Search element |
|---|---|---|
| Doctor Information | Text field in the “Doctor” group | DoctorInformation |
| Signature | Image field in the “Doctor” group | Signature |
- Delete the search elements that were automatically created for the new fields.
