
Input Formats

ABBYY Vantage supports the following file formats for input:
  • PDF (*.pdf, including born-digital files, scans, and scans with text under the image)
  • GIF (*.gif)
  • JBIG2 (*.jb2, *.jbig2)
  • JPEG (*.jpg, *.jpeg)
  • JPEG 2000 (*.jp2, *.j2k, *.jpf, *.jpx, *.jpc)
  • PNG (*.png)
  • TIFF (*.tif, *.tiff). Note that TIFF files with alpha channels and JPEG compression are not supported.
  • BMP (*.bmp, *.dib, *.rle)
  • PCX (*.pcx, *.dcx)
  • DOCX (*.docx)
  • PPTX (*.pptx)
  • XLSX (*.xlsx)
Import of office file formats (DOCX, PPTX, XLSX) works in ABBYY Vantage Cloud environments only. Support for on-premises environments is planned for upcoming releases.
Important! ABBYY Vantage will not open images with height or width greater than 32512 pixels.
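The format list and the size limit above can be checked before a file is submitted. The following is a minimal sketch, not part of any ABBYY API: the function name check_input, its parameters, and the cloud flag are hypothetical, and the caller is assumed to already know the image dimensions in pixels.

```python
from pathlib import Path

# Extensions copied from the input format list above.
IMAGE_AND_PDF_EXTENSIONS = {
    ".pdf", ".gif", ".jb2", ".jbig2", ".jpg", ".jpeg",
    ".jp2", ".j2k", ".jpf", ".jpx", ".jpc", ".png",
    ".tif", ".tiff", ".bmp", ".dib", ".rle", ".pcx", ".dcx",
}
# Office formats are accepted in Vantage Cloud environments only.
OFFICE_EXTENSIONS = {".docx", ".pptx", ".xlsx"}
# Images wider or taller than this are not opened.
MAX_SIDE_PIXELS = 32512


def check_input(filename, width=None, height=None, cloud=True):
    """Hypothetical pre-flight check. Returns a list of problems (empty if none)."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext in OFFICE_EXTENSIONS:
        if not cloud:
            problems.append(f"{ext} import is available in Vantage Cloud only")
    elif ext not in IMAGE_AND_PDF_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    # Dimensions are optional; they are checked only when supplied.
    for name, value in (("width", width), ("height", height)):
        if value is not None and value > MAX_SIDE_PIXELS:
            problems.append(f"{name} {value} px exceeds {MAX_SIDE_PIXELS} px")
    return problems
```

For example, check_input("scan.tiff", 2480, 3508) returns an empty list, while check_input("report.docx", cloud=False) reports the Cloud-only restriction. The sketch looks only at the file extension, so it does not catch cases such as TIFF files with unsupported compression.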
Best practices for input images:
  • Scan documents at a resolution of at least 150 dpi; 300 dpi or higher is optimal.
  • Pass the original images to Vantage, without any editing or preprocessing on your side.
  • If you’re photographing documents:
    • Make sure the whole document fits into the image, with no corners cut off
    • If possible, place the document on a level surface
    • Avoid strong shadows and flash glare
    • Try to achieve an even focus, so that no parts of the document are blurred
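The resolution guidance above can be expressed as a small check. This is an illustrative sketch only: the function names effective_dpi and rate_resolution are hypothetical, and the physical page width is assumed to be known in inches.

```python
MIN_DPI = 150          # lowest resolution named in the best practices
RECOMMENDED_DPI = 300  # resolution described as optimal


def effective_dpi(pixels, inches):
    """Resolution implied by a pixel dimension and a physical size in inches."""
    return pixels / inches


def rate_resolution(dpi):
    """Classify a resolution against the two thresholds above."""
    if dpi < MIN_DPI:
        return "too low"
    if dpi < RECOMMENDED_DPI:
        return "acceptable"
    return "recommended"
```

For a Letter-size page (8.5 inches wide) scanned to 2550 pixels across, effective_dpi(2550, 8.5) gives 300.0, which rate_resolution classifies as "recommended".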

Output Formats

Field data can be exported into the following formats:
  • JSON
  • CSV
Document contents can be exported into the following formats:
  • JSON
  • XML
  • ALTOXML
  • TXT
  • DOCX
  • XLSX
  • PPTX
  • HTML
  • PDF with text layer
    • PDF/A-3a
    • PDF/A-3b
Document images can be exported into the following formats with no text content:
  • Image-only PDF/A-3b
  • TIFF
  • JPEG

PDF Restrictions

ABBYY Vantage cannot process secured PDF files (for example, password-protected PDFs or PDFs with content copying prohibited). To process such files, first remove the restrictions with a third-party PDF editing tool.
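Secured PDFs can often be spotted before submission. Both password protection and permission restrictions such as prohibited copying are recorded through an /Encrypt entry in the PDF file, so a rough check is to look for that marker. The sketch below is a heuristic under that assumption, not an ABBYY feature; the function name looks_secured is hypothetical, and a false positive is possible if the literal text "/Encrypt" appears elsewhere in an unsecured file.

```python
def looks_secured(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the data looks like a PDF with an /Encrypt entry."""
    # The "%PDF-" header is expected near the start of the file.
    is_pdf = b"%PDF-" in pdf_bytes[:1024]
    # Secured PDFs reference an encryption dictionary via /Encrypt.
    return is_pdf and b"/Encrypt" in pdf_bytes
```

A file on disk can be checked with looks_secured(pathlib.Path(path).read_bytes()). Files flagged this way are candidates for having their restrictions removed before being passed to Vantage.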