Introduction
The number of services and workers in an ABBYY Vantage installation depends on the load. ABBYY Vantage automatically scales the services and workers to optimize document processing. This guide describes the resources that ABBYY Vantage requires at different loads and gives the System Administrator recommendations on how to provide these resources.
Reference Configurations
Resource consumption depends on your document processing scenario: the type of documents being processed, the skill being used, and the page load (that is, the number of pages processed within a certain time period). The reference Highly available configuration was tested while processing 3-page and 50-page invoices using the default Process skill with the following loads:
- 50,000 pages per 8 hours
- 100,000 pages per 8 hours
- 150,000 pages per 8 hours
- 200,000 pages per 8 hours
The reference Without high availability configuration was tested with the following loads:
- 10,000 pages per 8 hours
- 30,000 pages per 8 hours
- 50,000 pages per 8 hours
The Without high availability configuration doesn’t support training skills with the Deep Learning activity.
- Import files.
- Recognize documents.
- Classify and determine document types.
- Extract data from documents.
- Export data to JSON.
Node Types
| Node type | CPU cores (for each node) | RAM, GB (for each node) | Disk size, GB |
|---|---|---|---|
| Service nodes | 12 | 48 | 120* |
| Worker nodes | 12 | 48 | 120 |
Storage Requirements
| Configuration | Storage | Storage location | Disk size, GB |
|---|---|---|---|
| Without high availability | Internal NFS | Service node | 500 (for processing every 10,000 pages per 8 hours) |
| Without high availability | External NFS | NFS server machine | 500 (for processing every 10,000 pages per 8 hours) |
| Highly available | External NFS | NFS server machine | 50 (for processing every 10,000 pages per 8 hours) |
| Highly available | Local persistent volume | First service node (from the inventory file) | 500 (for processing every 10,000 pages per 8 hours) |
We recommend using external storage if the load is greater than 10,000 pages per 8 hours.
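The per-load figures in the table above can be turned into a rough disk-size estimate. The following Python sketch is illustrative only: the function and dictionary names are hypothetical, the values are copied from the table, and rounding partial blocks of 10,000 pages up to the next full block is an assumption rather than documented behavior.

```python
# Disk size in GB for every 10,000 pages per 8 hours, copied from the table above.
GB_PER_10K_PAGES = {
    ("Without high availability", "Internal NFS"): 500,
    ("Without high availability", "External NFS"): 500,
    ("Highly available", "External NFS"): 50,
    ("Highly available", "Local persistent volume"): 500,
}


def required_disk_gb(configuration, storage, pages_per_8_hours):
    """Estimate the storage disk size in GB for a given page load."""
    if pages_per_8_hours <= 0:
        raise ValueError("pages_per_8_hours must be positive")
    gb_per_block = GB_PER_10K_PAGES[(configuration, storage)]
    # Assumption: a partial block of 10,000 pages counts as a full block.
    blocks = -(-pages_per_8_hours // 10_000)  # integer ceiling division
    return blocks * gb_per_block


print(required_disk_gb("Without high availability", "External NFS", 30_000))  # 1500
```

Treat the result as a starting point for capacity planning and validate it against your own document processing scenario.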
Performance Results
Depending on the page load, ABBYY Vantage required the following amount of resources to efficiently process documents in each configuration:
Highly Available Configuration
| Load (pages/8 hours) | Nodes for services (3-page invoices) | Nodes for services (50-page invoices) | Nodes for workers (3-page invoices) | Nodes for workers (50-page invoices) |
|---|---|---|---|---|
| 50,000 | 4 | 4 | 4 | 4 |
| 100,000 | 4 | 4 | 5 | 7 |
| 150,000 | 4 | 4 | 7 | 9 |
| 200,000 | 4 | 4 | 8 | 11 |
Disk I/O Operations
| Load (pages/8 hours) | Disk I/O operations/second (3-page invoices) | Disk I/O operations/second (50-page invoices) |
|---|---|---|
| 50,000 | 100 | 50 |
| 100,000 | 250 | 100 |
| 150,000 | 400 | 170 |
| 200,000 | 600 | 230 |
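Because loads are stated per 8-hour period while disk I/O is stated per second, it can help to express the loads as an average rate. The sketch below is a simple unit conversion that assumes the pages are spread evenly over the 8 hours; the function name is hypothetical.

```python
SECONDS_PER_8_HOURS = 8 * 60 * 60  # 28,800 seconds


def pages_per_second(pages_per_8_hours):
    """Convert a load in pages per 8 hours to an average pages-per-second rate."""
    return pages_per_8_hours / SECONDS_PER_8_HOURS


# Reference loads from the Highly available configuration tests.
for load in (50_000, 100_000, 150_000, 200_000):
    print(f"{load:>7,} pages/8 h = {pages_per_second(load):.2f} pages/s on average")
```

Real loads are rarely uniform, so peak rates may be noticeably higher than these averages.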
Without High Availability Configuration
| Load (pages/8 hours) | Nodes for services | Nodes for workers |
|---|---|---|
| 10,000 | 1 | 1* |
| 30,000 | 1 | 3 |
| 50,000 | 1 | 3 |
When scaling ABBYY Vantage, no increase in document processing time was noted.
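The reference results can be used as a lookup when planning a cluster. The Python sketch below copies the node counts from the tables above and returns the figures for the smallest tested load that covers a target load. Rounding up to the next tested load is an assumption made for this example, all names are hypothetical, and the footnote marker on the 10,000-page worker count is not represented.

```python
# Reference results copied from the tables above:
# tested load (pages per 8 hours) -> (service nodes, worker nodes).
HIGHLY_AVAILABLE = {
    3: {50_000: (4, 4), 100_000: (4, 5), 150_000: (4, 7), 200_000: (4, 8)},    # 3-page invoices
    50: {50_000: (4, 4), 100_000: (4, 7), 150_000: (4, 9), 200_000: (4, 11)},  # 50-page invoices
}
WITHOUT_HIGH_AVAILABILITY = {10_000: (1, 1), 30_000: (1, 3), 50_000: (1, 3)}


def reference_nodes(results, pages_per_8_hours):
    """Return (service_nodes, worker_nodes) for the smallest tested load covering the target."""
    for tested_load in sorted(results):
        if pages_per_8_hours <= tested_load:
            return results[tested_load]
    raise ValueError("load exceeds the largest tested reference load")


print(reference_nodes(HIGHLY_AVAILABLE[50], 120_000))        # (4, 9)
print(reference_nodes(WITHOUT_HIGH_AVAILABILITY, 30_000))    # (1, 3)
```

Loads above the largest tested value raise an error instead of extrapolating, because the tables give no data beyond those points.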
Managing Nodes
The System Administrator can add worker nodes to the cluster to handle a higher load. For more information on how to prepare a node, see System Requirements.
Adding a Worker Node
To add a worker node, follow these steps:
- Open an inventory file from the installation directory.
- In the `[abbyy_workers]` section, add an additional node by specifying its name and IP address.
- Run the installer:
- Run the following playbook:
