
Setting Up a Shared Folder Connection

ABBYY Vantage lets you use shared folders hosted on the Vantage server for importing and exporting documents and skills, as well as updating data catalogs. Before you can start using shared folders (NFS share), you need to set up a connection to those shared folders from a client computer. Perform the following steps on a client computer running Windows:
  1. Run Windows PowerShell as an Administrator.
  2. Install Windows NFS Client:
dism /online /Enable-Feature /FeatureName:ServicesForNFS-ClientOnly
dism /online /Enable-Feature /FeatureName:ClientForNFS-Infrastructure
  3. Configure a mapping of Windows users to Unix UIDs and GIDs according to your company's policies:
New-ItemProperty -Path "HKLM:\Software\Microsoft\ClientForNFS\CurrentVersion\Default" -Name "AnonymousGid" -Value 65532 -PropertyType DWord
New-ItemProperty -Path "HKLM:\Software\Microsoft\ClientForNFS\CurrentVersion\Default" -Name "AnonymousUid" -Value 65532 -PropertyType DWord
  4. Restart the NFS client:
nfsadmin client stop
nfsadmin client start
Once you have completed the above steps, you will be able to copy and use shared folder paths in Vantage, as well as open them in File Explorer.
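If you want to check the connection from the client, you can mount the share manually using the NFS client you just installed. The commands below are only a sketch: vantage-server and nfs-share are placeholders for the actual server address and export path shown in Vantage.
# Mount the NFS export as drive Z: using the anonymous UID/GID configured above
mount -o anon \\vantage-server\nfs-share Z:
# List the contents to confirm the share is accessible
dir Z:\
# Unmount the share when finished
umount Z: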

Setting Up a Database Connection

ABBYY Vantage uses databases hosted on external servers and may become inoperable if those servers fail. The system administrator can restore such databases on a different server and set up a connection to the new databases using Consul.
Before starting, make sure that the kubectl command line tool is installed and that a connection to the Kubernetes cluster has been established.
To set up a connection to a new database in the ABBYY Vantage settings:
  1. Access the Consul web interface by running the command below:
kubectl port-forward -n abbyy-infrastructure service/consul-ui 8500:80
Then navigate to http://localhost:8500/ui/dc1/kv/secret/.
  2. On the Key/Value tab that opens, select the correct Vantage environment.
  3. Select either the platform or the vantage project, and then the appropriate service that uses the database (for example, mail).
  4. Navigate to the database section, which every service contains.
  5. Open the SqlServer section.
  6. In the connectionString key (see the example connection string after this procedure):
    • Replace the old value of Server with the address of the new server
    • Specify the new database in the Database parameter
    • Specify the login credentials in the User Id and Password parameters
  7. Click Save.
  8. Restart the modified service by running the following commands, where label is the service label from the tables below:
label=mail
kubectl -n abbyy-vantage rollout restart $(kubectl -n abbyy-vantage get deployments -l app.kubernetes.io/component=$label -o name)
When a server address changes, this procedure has to be carried out for every database.
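For reference, the value of the connectionString key is a standard SQL Server connection string. The line below is only a sketch of the format described in step 6; the server address, database name, and credentials are placeholders.
Server=<new server address>;Database=<database name>;User Id=<login>;Password=<password>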

Database Services Reference

Below are tables listing all services that use databases, along with the label used to restart each service.

Platform Services

Consul Section Name | Service Label | Notes
api-gateway-registry | api-gateway-registry
api-registry | api-registry
auth-adminapi2 | auth-adminapi2
auth-identity | auth-identity
auth | auth-sts-identity, auth-adminapi2 | This database is used by two services
blob-storage | blob-storage
cron-service | cron-service
documentsetstorage | documentsetstorage
mail | mail
security-audit | security-audit
storage | storage | The database section is stored in the fileMetadata catalog
workflow-facade | workflow-facade
workflow-scheduler | workflow-scheduler

Vantage Services

Consul Section Name | Service Label
catalogstorage | catalogstorage
folderimport | folderimport
interactive-jobs | interactive-jobs
mailimport | mailimport
permissions | permissions
publicapi | publicapi
reporting | reporting
secretstorage | secretstorage
skill-monitor | skill-monitor
skillinfo | skillinfo
subscriptions | subscriptions
tokenmanagement | tokenmanagement
transactions | transactions
workspace | workspace
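For example, the auth database from the platform services table is used by two services, so after changing its connection string both of them must be restarted. A minimal sketch using the labels from the Notes row above and the same restart command as in the procedure:
# Restart both services that use the auth database
for label in auth-sts-identity auth-adminapi2; do
  kubectl -n abbyy-vantage rollout restart $(kubectl -n abbyy-vantage get deployments -l app.kubernetes.io/component=$label -o name)
done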

Setting Up GPU

Vantage allows you to use a GPU to train skills with the Deep Learning activity for extracting data from semi-structured documents.

System Requirements for GPU

  • Minimum virtual GPU RAM: 12 GB
  • 1 CPU and 4 GB RAM for each virtual GPU on host (e.g., a VM with a single virtual GPU of 12 GB must have at least 2 CPU and 8 GB RAM)

Virtual GPU

You can use a virtual GPU (vGPU) to share one physical GPU among several virtual machines, allowing Vantage resources to be used more efficiently. To set up vGPU:
  1. Copy the nVidia GRID driver package from the nVidia application hub to a virtual machine with a GPU and run the following commands:
apt-get update
apt-get install dkms
dpkg -i nvidia-linux-grid-535_535.54.03_amd64.deb
  2. Install the nVidia GPU operator onto the Kubernetes cluster:
    a. Place the license token file (generated in the nVidia application hub) in the $PWD/gpu/ folder before running the Vantage installer container.
    b. Add the -v $PWD/gpu:/ansible/files/gpu:ro parameter to the command for running the Vantage installer container:
docker run -it \
-v $PWD/kube:/root/.kube \
-v $PWD/ssh/ansible:/root/.ssh/ansible \
-v "//var/run/docker.sock:/var/run/docker.sock" \
-v $PWD/inventory:/ansible/inventories/k8s/inventory \
-v $PWD/env_specific.yml:/ansible/inventories/k8s/group_vars/all/env_specific.yml \
-v $PWD/ssl:/ansible/files/ssl:ro \
-v $PWD/gpu:/ansible/files/gpu:ro \
--privileged \
registry.local/vantage/vantage-k8s:2.7.1
    c. Add a GPU node to the inventory file in the [abbyy_workers] group. The name of the virtual machine with the GPU must contain “gpu”:
[abbyy_workers]
worker16-48-w01 ansible_host=10.10.10.27
worker16-48-w02 ansible_host=10.10.10.21
worker16-48-w03 ansible_host=10.10.10.20
worker2-12-a40-gpu01 ansible_host=10.10.10.60
    d. Add a node to the cluster by running the following playbook:
chmod 600 /root/.ssh/ansible
ansible-playbook -i inventories/k8s -v playbooks/4-Kubernetes-k8s.yml
  3. Set up vGPU by running the following playbook:
ansible-playbook -i inventories/k8s -v playbooks/setup-gpu-node.yml
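After the playbook completes, you can sanity-check the result. The commands below are a sketch, assuming the GPU operator runs in the gpu-operator namespace (as in the passthrough and testing steps below) and the GPU node is named as in the inventory example above.
# Confirm the GPU node has joined the cluster
kubectl get nodes -o wide | grep gpu
# Check that the GPU operator pods are running
kubectl -n gpu-operator get pods
# On the GPU node itself, confirm that the GRID driver sees the virtual GPU
nvidia-smi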

GPU Passthrough

You can set up GPU passthrough, which gives a virtual machine direct access to your GPU. To set up GPU passthrough:
  1. Run the Vantage installer container:
docker run -it \
-v $PWD/kube:/root/.kube \
-v $PWD/ssh/ansible:/root/.ssh/ansible \
-v "//var/run/docker.sock:/var/run/docker.sock" \
-v $PWD/inventory:/ansible/inventories/k8s/inventory \
-v $PWD/env_specific.yml:/ansible/inventories/k8s/group_vars/all/env_specific.yml \
-v $PWD/ssl:/ansible/files/ssl:ro \
--privileged \
registry.local/vantage/vantage-k8s:2.7.1
  2. Add a GPU node (e.g., worker2-12-a40-gpu01) to the inventory file in the [abbyy_workers] group:
[abbyy_workers]
worker16-48-w01 ansible_host=10.10.10.27
worker16-48-w02 ansible_host=10.10.10.21
worker16-48-w03 ansible_host=10.10.10.20
worker2-12-a40-gpu01 ansible_host=10.10.10.60
  3. Run the playbook:
ansible-playbook -i inventories/k8s -v playbooks/4-Kubernetes-k8s.yml
  4. Install the GPU operator Helm chart:
helm upgrade --install gpu-operator ansible/files/helm/charts/gpu-operator --create-namespace --debug -n gpu-operator
  5. Add a node taint:
kubectl taint nodes worker2-12-a40-gpu01 nvidia.com/gpu:NoSchedule
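To confirm that the node advertises a GPU resource and carries the taint, you can run checks like the sketch below (the node name is the example one from the inventory above):
# Show the allocatable GPU count reported by the device plugin
kubectl get node worker2-12-a40-gpu01 -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
# Confirm that the NoSchedule taint was applied
kubectl describe node worker2-12-a40-gpu01 | grep -A3 Taints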

Testing and Deploying GPU

For both vGPU and GPU passthrough modes, test the GPU operator installation as follows:
  1. Create a YAML file with the following contents:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
  namespace: gpu-operator
spec:
  restartPolicy: Never
  containers:
    - name: cuda-container
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
    - key: k8s.abbyy.com/techcore
      effect: NoSchedule
      value: "true"
  2. Apply the file by running this command:
kubectl apply -f filename
  3. Check the pod log. You should see a response containing Test PASSED:
kubectl -n gpu-operator logs gpu-pod
Expected output:
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
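Once the test has passed, you can delete the test pod:
kubectl -n gpu-operator delete pod gpu-pod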
To deploy the GPU worker:
  1. Add the following parameters to the env_specific.yml file:
techcore:
  use_gpu_workers: true
  use_nn_extraction_training_workers: true
  2. Do one of the following:
    • If Vantage is already installed, run the following playbook to deploy GPU workers:
ansible-playbook -i inventories/k8s -v playbooks/11-DeployWorkers-k8s.yml
    • If Vantage is not installed yet, GPU workers will be deployed during the installation.
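To check that the GPU workers were deployed onto the GPU node, you can list the pods together with the nodes they were scheduled on. The command below is a sketch: the abbyy-vantage namespace and the "gpu" name filter are assumptions that may differ in your environment.
# List worker pods and the nodes they run on
kubectl -n abbyy-vantage get pods -o wide | grep -i gpu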

Setting Up a Manual Review Inactivity Timeout

In manual review, a timeout is triggered if the operator takes no action on an open task for 15 minutes. The System Administrator can change the period of inactivity required to trigger a timeout using Consul. To configure the timeout:
  1. Access the Consul web interface by running:
kubectl port-forward -n abbyy-infrastructure service/consul-ui 8500:80
Then navigate to http://localhost:8500/ui/dc1/kv/secret/.
  2. Use the Key/Value tab to select the correct Vantage environment.
  3. Change the values of the following keys:
Key | Description
secret/abbyy-vantage/vantage/verification/interactiveJobsOptions/popTimeout | The minimum period of time a user is inactive before a task will be returned to the interactive task queue. Any interactive action (mouse movement, keyboard input, patch processing, etc.) resets the countdown. Default: 00:15:00 (15 minutes)
secret/abbyy-vantage/vantage/verification/interactiveJobsOptions/processingPopTimeout | The minimum period of user inactivity after which the task will be returned to the queue if long-running operations are in progress (applying a skill, turning pages, etc.). When a long-running operation starts, this value is set as the maximum allowable inactivity period. When the operation completes, the inactivity period resets to the popTimeout value. Default: 1.00:00:00 (24 hours)
  4. Click Save.
  5. Restart the verification and manualverification services:
kubectl -n abbyy-vantage rollout restart $(kubectl -n abbyy-vantage get deployments -l app.kubernetes.io/component=verification -o name)
kubectl -n abbyy-vantage rollout restart $(kubectl -n abbyy-vantage get deployments -l app.kubernetes.io/component=manualverification -o name)
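If you prefer the command line to the Consul web interface, the same key can be set with consul kv put. The sketch below assumes a Consul server pod named consul-server-0 in the abbyy-infrastructure namespace; adjust both to your installation, then restart the services as in step 5.
# Set the inactivity timeout to 30 minutes (pod name is an assumption)
kubectl -n abbyy-infrastructure exec consul-server-0 -- consul kv put secret/abbyy-vantage/vantage/verification/interactiveJobsOptions/popTimeout 00:30:00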

Updating the SSL Certificate

When the SSL certificate expires, you will need to replace it with a new one. You can do this either in Lens or from the Linux command line.

Using Lens

  1. Go to Config > Secrets and find all secrets called platform-wildcard.
  2. For each secret, find the Data subsection, click the Show icon, and update the values:
    • Enter the value of the new certificate in the tls.crt field
    • Enter the value of its key in the tls.key field
The certificate and the key must be in PEM format (base64-encoded ASCII contents; the key in PKCS#8). The certificate should start with -----BEGIN CERTIFICATE----- and the key with -----BEGIN PRIVATE KEY-----.
  3. Click Save.

Using Linux Command Line

  1. Make sure you have access to the Kubernetes cluster:
kubectl get nodes
  2. Place the certificate and key in PEM format into the current folder as cert.pem and key.pem.
Convert your CRT file to PEM format if needed. The files should have the following structure:
-----BEGIN CERTIFICATE-----
[your certificate]
-----END CERTIFICATE-----
-----BEGIN PRIVATE KEY-----
[your key]
-----END PRIVATE KEY-----
  3. Run the following commands to update the platform-wildcard secrets in every namespace and restart the ingress controller:
for i in `kubectl get secret --field-selector metadata.name=platform-wildcard -o custom-columns=:metadata.namespace -A --no-headers 2>/dev/null`; do kubectl patch secret platform-wildcard -p "{\"data\":{\"tls.key\":\"$(base64 < "./key.pem" | tr -d '\n')\", \"tls.crt\":\"$(base64 < "./cert.pem" | tr -d '\n')\"}}" -n $i; done
kubectl rollout restart deployment -n abbyy-infrastructure $(kubectl get deployment -n abbyy-infrastructure -o custom-columns=NAME:metadata.name --no-headers 2>/dev/null | grep ingress-nginx-controller)
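To confirm that the secrets now contain the new certificate, you can decode one of them and inspect its validity dates. A minimal check, assuming openssl is available and using the abbyy-infrastructure namespace as an example:
kubectl -n abbyy-infrastructure get secret platform-wildcard -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -subject -dates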