- 0x2028 — Line break symbol
- 0x2029 — Paragraph break symbol
- 0xFFFC — Object replacement character (denotes an embedded picture inside the text)
- 0x0009 — Tabulation
- 0x005E — Circumflex accent (^), used by ABBYY FineReader Engine as a replacement for unrecognized characters
- 0x00AC — Not sign (¬), used to denote a soft hyphen
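
As a minimal sketch of how these code points might be handled once the recognized text has been copied into an ordinary string, the Python snippet below converts them to plain-text equivalents. The function names and the `[picture]` placeholder are illustrative assumptions and are not part of the ABBYY FineReader Engine API.

```python
# Code points from the list above, as they would appear in a recognized-text string.
LINE_BREAK = "\u2028"
PARAGRAPH_BREAK = "\u2029"
OBJECT_REPLACEMENT = "\ufffc"   # embedded picture
UNRECOGNIZED = "\u005e"         # '^'
SOFT_HYPHEN = "\u00ac"          # code point as given in the list above


def to_plain_text(recognized: str) -> str:
    """Replace the special characters with plain-text equivalents (hypothetical helper)."""
    text = recognized.replace(PARAGRAPH_BREAK, "\n\n")
    text = text.replace(LINE_BREAK, "\n")
    text = text.replace(OBJECT_REPLACEMENT, "[picture]")  # placeholder chosen for this sketch
    text = text.replace(SOFT_HYPHEN, "")                  # drop soft hyphens, rejoining the word
    return text                                           # tabs (0x0009) are left untouched


def count_unrecognized(recognized: str) -> int:
    """Count '^' characters. This is an upper bound on unrecognized characters,
    because a genuine circumflex in the source image uses the same code point."""
    return recognized.count(UNRECOGNIZED)
```

For example, `to_plain_text("Hel\u00aclo\u2028wor^d")` yields `"Hello\nwor^d"`, and `count_unrecognized` reports one candidate for that string.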
Recognized text in the layout
Only text, table, and barcode blocks contain text after recognition. Other blocks have no text. The Text object provides access to the recognized text of text and table blocks, while the BarcodeText object provides access to the text of a barcode block. To access the recognized text of a block, do the following:
- For text blocks
- For table blocks
  - Receive the collection of table cells using the ITableBlock::Cells property.
  - Select the desired cell. Use the methods of the TableCells object.
  - Receive the block object of the cell (the ITableCell::Block property).
  - Check that the block is of type BT_Text (the IBlock::Type property) and receive the TextBlock object using the IBlock::GetAsTextBlock method.
  - Use the ITextBlock::Text property.
- For barcode blocks
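
The table-block steps above can be sketched as a simple traversal. The Python snippet below does not call the real engine: the `Stub...` classes are hypothetical stand-ins that only mirror the property and method names mentioned in the steps (Cells, Block, Type, GetAsTextBlock, Text), and the Text object is simplified to a plain string. It is meant to show the order of the checks, not the actual COM interfaces.

```python
from dataclasses import dataclass
from typing import List, Optional

BT_TEXT = "BT_Text"  # stand-in for the BT_Text block type constant


@dataclass
class StubTextBlock:
    Text: str  # simplification: the real Text object is not a plain string


@dataclass
class StubBlock:
    Type: str
    text_block: Optional[StubTextBlock] = None  # hypothetical storage for this sketch

    def GetAsTextBlock(self) -> Optional[StubTextBlock]:
        return self.text_block


@dataclass
class StubTableCell:
    Block: StubBlock


@dataclass
class StubTableBlock:
    Cells: List[StubTableCell]


def cell_texts(table: StubTableBlock) -> List[str]:
    """Collect the text of every table cell whose block is a text block."""
    texts = []
    for cell in table.Cells:              # collection of table cells
        block = cell.Block                # block object of the cell
        if block.Type != BT_TEXT:         # skip cells that do not hold a text block
            continue
        text_block = block.GetAsTextBlock()
        texts.append(text_block.Text)     # recognized text of the cell
    return texts
```

Checking the block type before calling GetAsTextBlock follows the order given in the steps; in this sketch a cell whose block is not of type BT_Text is simply skipped.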
