# Labeling guidelines

> Best practices for labeling fields in Advanced Designer: selection methods and per-type guidelines for structured, semi-structured, and unstructured documents.

To label a document, mark the regions that contain field values and tell the skill what data type each field holds. Before you start, pick the right selection method for the field shape, then follow the per-type guidelines for [structured](#structured-documents), [semi-structured](#semi-structured-documents), or [unstructured](#unstructured-documents) documents.

## Selection methods

| Method                                                                    | Best for                                                                                                |
| :------------------------------------------------------------------------ | :------------------------------------------------------------------------------------------------------ |
| Hover and click a word                                                    | Single-word fields                                                                                      |
| Drag a rectangle around words                                             | [Semi-structured documents](/vantage/documentation/advanced-designer/use-cases/process-semi-structured) |
| Click the first word, then drag (left mouse button held) to the last word | [Unstructured documents](/vantage/documentation/advanced-designer/use-cases/process-unstructured)       |

## Structured documents

[Structured documents](/vantage/documentation/advanced-designer/use-cases/extracting-from-docs#structured-documents) (such as pre-formatted forms) always contain the same information in the same locations. You only need to label a few sample documents because there's no layout variation.

* Specify each field's region accurately — field values alone aren't enough for training.
* Mark the entire **placeholder**, not the value inside it.
* If a field contains no value, mark the empty placeholder anyway.
* For multi-part fields, hold **Shift** to add more parts. All parts must be on the same page.
* For tables on a fixed form, label every row, including empty rows.
* If you add a new field after labeling, go back and label that field on every document in the training set.
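
The last guideline above amounts to a coverage check: every field should have a region on every document in the training set. The following is a minimal sketch of that check in Python. It does not use any Advanced Designer API; the `training_set` mapping and the field names are hypothetical stand-ins for however you track your own labeling progress.

```python
from typing import Dict, List, Set


def find_missing_labels(
    labeled: Dict[str, Set[str]], required_fields: Set[str]
) -> Dict[str, List[str]]:
    """Return, per document, the required fields that have no region yet."""
    missing: Dict[str, List[str]] = {}
    for doc_name, fields in labeled.items():
        gaps = sorted(required_fields - fields)
        if gaps:
            missing[doc_name] = gaps
    return missing


# Hypothetical training set: document name -> fields that already have a region.
training_set = {
    "form_001.pdf": {"Name", "Date", "Total"},
    "form_002.pdf": {"Name", "Date"},
}

# "Total" was added after form_002.pdf was labeled, so it shows up as a gap.
print(find_missing_labels(training_set, {"Name", "Date", "Total"}))
# {'form_002.pdf': ['Total']}
```

On structured documents an empty placeholder still counts as a region, so a field missing from a document here means a label is missing, not that the value is blank.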

## Semi-structured documents

[Semi-structured documents](/vantage/documentation/advanced-designer/use-cases/extracting-from-docs#semi-structured-documents) — bills, payment orders, invoices — contain similar fields, but field locations, sizes, and counts vary across documents.

* Specify each field's region accurately — field values alone aren't enough for training.
* Click the field's value (the word or words it contains); the region is created automatically.
* If a field contains no value, don't create a region for it.
* Don't mark partial words — the trainer learns on whole words only.
* For multi-part fields, hold **Shift** to add more parts. All parts must be on the same page.
* Do not instruct the program to find fields **inside** another field's region (whether an individual field like an address or a table cell like *Description*). To extract from a large region, chain activities: a semi-structured extraction activity to find the region, then an [NLP Extraction Rules activity](/vantage/documentation/advanced-designer/activities/nlp-extraction-rules/extraction-rules) or a script rule to pull specific fields from it.
* If you add a new field after labeling, go back and label that field on every document in the training set.
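
Two of the guidelines above are geometric and can be checked mechanically: a field's region should not sit inside another field's region, and all parts of a multi-part field must be on the same page. The sketch below assumes regions are available as simple bounding boxes. The `Region` class, its coordinates, and the field names are hypothetical and are not part of any Advanced Designer export format.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Region:
    field: str
    page: int
    left: float
    top: float
    right: float
    bottom: float


def contains(outer: Region, inner: Region) -> bool:
    """True if `inner` lies entirely within `outer` on the same page."""
    return (
        outer.page == inner.page
        and outer.left <= inner.left
        and outer.top <= inner.top
        and outer.right >= inner.right
        and outer.bottom >= inner.bottom
    )


def nested_pairs(regions: List[Region]) -> List[Tuple[str, str]]:
    """List (outer field, inner field) pairs where one field sits inside another."""
    return [
        (a.field, b.field)
        for a in regions
        for b in regions
        if a.field != b.field and contains(a, b)
    ]


def fields_spanning_pages(regions: List[Region]) -> List[str]:
    """List fields whose parts are spread over more than one page."""
    pages: Dict[str, Set[int]] = {}
    for r in regions:
        pages.setdefault(r.field, set()).add(r.page)
    return sorted(f for f, p in pages.items() if len(p) > 1)


# Hypothetical labeled regions for one invoice.
regions = [
    Region("Address", 1, 50, 100, 300, 160),
    Region("ZipCode", 1, 220, 140, 290, 158),  # sits inside Address
    Region("Total", 1, 400, 700, 480, 720),
    Region("Total", 2, 400, 80, 480, 100),     # second part on another page
]

print(nested_pairs(regions))           # [('Address', 'ZipCode')]
print(fields_spanning_pages(regions))  # ['Total']
```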

### Tables and repeating groups

For repeating data, decide between a **table** and a **repeating group**:

| Use this                                                   | When                                                                            |
| :--------------------------------------------------------- | :------------------------------------------------------------------------------ |
| **Table**                                                  | Tabular data with a common header and values that have no keywords next to them |
| **Repeating group** with the *Allow multiple items* option | Less-structured data where keywords sit next to the values                      |

If different documents are organized differently, pick the option that fits the majority.
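
The decision rule in the table, combined with the majority guideline, can be sketched as a small function. This is an illustration only: the two boolean observations per document and the function names are assumptions made for the example, and the sketch leaves ties and layouts the table doesn't cover undecided.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def classify_layout(
    has_common_header: bool, keywords_next_to_values: bool
) -> Optional[str]:
    """Apply the table above to one document's repeating data."""
    if keywords_next_to_values:
        return "repeating group"
    if has_common_header:
        return "table"
    return None  # Not covered by the guideline; decide manually.


def pick_for_document_set(
    observations: Iterable[Tuple[bool, bool]]
) -> Optional[str]:
    """Pick the option that fits the majority of documents."""
    votes = Counter(
        choice
        for choice in (classify_layout(h, k) for h, k in observations)
        if choice is not None
    )
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical sample: (has_common_header, keywords_next_to_values) per document.
sample = [(True, False), (True, False), (False, True)]
print(pick_for_document_set(sample))  # table
```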

To label a table, mark the first row's cells one at a time (each click creates a column), then click **Continue table from this row** and verify the rest of the table is labeled correctly.

<Tip>
  For large tables that span visually similar pages, you can delete most of the similar middle pages and label only the first page, the last page, and a few pages in between.
</Tip>
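
One way to choose which pages to keep is to spread the in-between pages evenly across the table. The sketch below does that. Even spacing and the default of two in-between pages are assumptions made for the example, since the tip only asks for "a few" pages.

```python
from typing import List


def pages_to_keep(page_count: int, in_between: int = 2) -> List[int]:
    """Return 1-based page numbers: first, last, and evenly spaced middle pages."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= in_between + 2:
        # Short table: nothing worth deleting.
        return list(range(1, page_count + 1))
    step = (page_count - 1) / (in_between + 1)
    middle = {round(1 + step * i) for i in range(1, in_between + 1)}
    return sorted({1, page_count} | middle)


print(pages_to_keep(10))  # [1, 4, 7, 10]
```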

## Unstructured documents

[Unstructured documents](/vantage/documentation/advanced-designer/use-cases/extracting-from-docs#unstructured-documents) — contracts, scientific articles, email messages — have no consistent structure.

* Specify each field's region accurately — field values alone aren't enough for training.
* For segments (fields trained by the [Segmentation activity](/vantage/documentation/advanced-designer/activities/segmentation/segmentation-activity)), include one or more whole paragraphs. A segment cannot include only part of a paragraph.
* Click the field's value (the word or words it contains); the region is created automatically.
* If a field contains no value, don't create a region for it.
* Don't mark partial words — the trainer learns on whole words only.
* If a word is followed by punctuation, adjust the region so the punctuation isn't enclosed.
* A field region may span pages (for example, a contract clause). Label the first part on the first page, then hold **Shift** while continuing on the next page.
* To label a field **inside** another field's region (for example, a field inside a segment), select the inner field and start labeling — the action creates a new region rather than selecting the outer one.

<Note>
  This is the opposite of the semi-structured guideline above: segments in unstructured documents are *designed* to contain inner fields, so labeling within them is intended. In semi-structured documents, the equivalent nesting creates training conflicts.
</Note>
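
The whole-word and punctuation guidelines above can be pictured as a cleanup step on a character span: grow the selection to whole words, then drop punctuation at either edge. The sketch below works on plain text and treats whitespace as the word boundary. That is an assumption made for the example; the word boundaries Advanced Designer uses on a recognized page may differ.

```python
import string
from typing import Tuple


def snap_to_words(text: str, start: int, end: int) -> Tuple[int, int]:
    """Expand [start, end) to whole words, then trim punctuation at the edges."""
    # Grow the span until it is bounded by whitespace or the text limits.
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    while end < len(text) and not text[end].isspace():
        end += 1
    # Drop leading and trailing punctuation so it is not enclosed in the region.
    while start < end and text[start] in string.punctuation:
        start += 1
    while end > start and text[end - 1] in string.punctuation:
        end -= 1
    return start, end


text = "The Lessee, John Smith, shall pay rent."
# A sloppy selection covering "ohn Smit" (partial words on both sides).
start, end = snap_to_words(text, 13, 21)
print(text[start:end])  # John Smith
```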

## Related topics

<CardGroup cols={2}>
  <Card title="Labeling documents" icon="tag" href="/vantage/documentation/advanced-designer/labeling/labeling">
    Reuse labeled documents from training sets, manual review, or FlexiCapture.
  </Card>

  <Card title="Importing from FlexiCapture" icon="file-import" href="/vantage/documentation/advanced-designer/labeling/import-from-flexi">
    Format and procedure for reusing FlexiCapture-labeled documents.
  </Card>

  <Card title="Document categories" icon="layer-group" href="/vantage/documentation/advanced-designer/use-cases/extracting-from-docs">
    Background on structured, semi-structured, unstructured, and mixed documents.
  </Card>

  <Card title="Segmentation activity" icon="scissors" href="/vantage/documentation/advanced-designer/activities/segmentation/segmentation-activity">
    Used for segment fields in unstructured documents.
  </Card>
</CardGroup>
