OCR AI – Installation

applies to version: 8.2.x; author: Filip Jawień

WEBCON BPS 8.2 significantly extends automatic document recognition and registration.

The release introduces a completely new recognition mechanism based on neural networks.

A basic use case for this mechanism is scanning a company’s physical documents and transferring their information into electronic WEBCON BPS forms. Working with the OCR engine, the system converts scanned images into text, then uses neural networks to recognize specific data and phrases.

Required components:

  • WEBCON BPS 8.2
  • ABBYY FineReader 11 with an active license – component responsible for generating the text layer. For OCR AI to work as intended, the text layer must be generated by ABBYY FineReader 11.
  • OCR AI Project provided by WEBCON – component responsible for recognizing and reading text, then entering the retrieved data into configured form fields.

Installation

To complete the installation, you need the folder containing the ABBYY FineReader 11 installation files. Place these files in a subfolder named ABBYY inside the WEBCON BPS installation folder.

ocr1

ocr2

After launching the WEBCON Business Process Suite Install wizard, choose the advanced installation, select the necessary WEBCON BPS elements, and additionally select ABBYY FineReader Engine 11.0.

finereader engine

When all WEBCON BPS components have been installed successfully, an additional window for ABBYY FineReader 11 appears on the screen.

To correctly install the FineReader 11 component, a product key is required.

There are two types of key:

  • Stand-alone – a USB key that must be plugged into the machine on which FineReader will be installed. This key supports one file processing service.
  • Network license – a USB key that must be plugged into the license server. This key supports multiple services.

ocr4

Enter a valid product key, as well as the install path for FineReader 11.

Additionally, for the full installation, select the “Install license server” and “Install hardlock key drivers” checkboxes.

The “Use network license” checkbox should be checked only when using a network license. It enables the bottom-most field, where the license server name must be entered.

Check that all fields have been filled out correctly, then click “OK” to proceed with the installation process.

During the installation, you will be asked to assign privileges for FineReader 11. If you agree to assign them now, the DCOM application opens. At the very least, privileges must be given to the service account; this is mandatory for FineReader 11 and WEBCON BPS to work together properly.

The license should activate automatically. If you need to manage FineReader 11 license keys manually, use the procedure described under License activation.

 

License activation

After FineReader 11 finishes installing, go to the folder where FineReader 11 was installed:

C:\Program Files\FineReader 11\Bin64

Then open the LicenseManager application.

ocr5

Check whether the installed license is activated. If the licenses field is empty, activate the license by clicking the “Activate license…” button.

ocr6

Choose the license type again and enter the product serial number. If the number is correct, the activated license becomes visible in the License window.

ocr7

How to configure WEBCON BPS to work with FineReader 11

After successfully installing FineReader 11, it must be correctly configured to work with WEBCON BPS.

In Designer Studio, navigate to the System settings tab, then click Services configuration and select OCR component – FineReader.

ocr8

Afterwards, click the Services configuration node, and then the Services node to expand the list of all services. Find the desired service and open its configuration, then select the OCR AI checkbox under Service roles.

ocr9

The Configuration tab is used to define the number of threads for each individual OCR function. For maximum effectiveness, it is recommended to set the number of threads to one more than the number of cores in the processor.
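As a rough illustration of this rule of thumb, the suggested value can be computed on the machine that hosts the service. This is only a sketch: `recommended_ocr_threads` is a hypothetical helper, not part of WEBCON BPS, and `os.cpu_count()` reports logical processors, which may differ from the physical core count the recommendation refers to.

```python
import os


def recommended_ocr_threads(core_count=None):
    """Return the core count plus one, following the rule of thumb in the text."""
    if core_count is None:
        # os.cpu_count() may return None on some platforms; fall back to 1.
        core_count = os.cpu_count() or 1
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    return core_count + 1


if __name__ == "__main__":
    print("Suggested threads per OCR function:", recommended_ocr_threads())
```

The resulting number still has to be entered by hand on the Configuration tab.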

ocr10

To fully utilize OCR AI functionality, an OCR AI project has to be added. To add one, go to the OCR AI Projects tab in System settings and choose the folder containing the project.

An OCR AI project is the component responsible for recognizing specific document fields. It acts as a kind of template that shows the system where to look for the chosen fields.

In order to run OCR on documents other than invoices, a new OCR AI project has to be created. If you wish to create a new project, please contact WEBCON.

ocr11

If the project loaded correctly, click the Save button. Once saved, the system is ready to run the recognition process on files using OCR AI.

Sample OCR WorkFlow

ocr12

The sample workflow consists of 7 steps:

  • Registration – start step. Used to fill in basic information and add the attachments intended for processing.
  • Awaiting for text layer – system step, waits for the FineReader component to process the files (add the text layer). When the processing is finished, the system will automatically move the workflow to the next step.
  • Awaiting for TAX ID recognition – system step, waits for files to be processed by the OCR AI component. In this particular case, the system looks only for the distinguisher (which here is set to Seller TAX ID). Based on this value, the system decides whether a custom OCR AI project has to be used in the following step. When the processing is finished, the system will automatically move the workflow to the next step.
  • Awaiting for OCR AI – system step, waits for files to be processed by OCR AI. When the processing is finished, the system will automatically move the workflow to the next step.
  • Verification – OCR verification step. Lets the user validate the results retrieved by OCR AI actions.
  • Awaiting for OCR AI learn – system step, waits for the learning process to complete. When this process is concluded, the system will automatically move the workflow to the next step.
  • Archive – positive final step.
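The step list above can be modelled as a simple ordered sequence in which system steps advance on their own and the other steps wait for a user. The sketch below is a simplified, linear illustration only: the `Step` class and `next_user_step` function are hypothetical, and the real workflow also has alternative paths (for example the learning paths) that are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    is_system: bool  # True when the workflow moves on automatically


# Step names taken from the list above; the linear order is a simplification.
STEPS = [
    Step("Registration", False),
    Step("Awaiting for text layer", True),
    Step("Awaiting for TAX ID recognition", True),
    Step("Awaiting for OCR AI", True),
    Step("Verification", False),
    Step("Awaiting for OCR AI learn", True),
    Step("Archive", False),
]


def next_user_step(current_name):
    """Skip over system steps and return the next step that waits for a user.

    The final step is returned unchanged because nothing follows it.
    """
    names = [step.name for step in STEPS]
    index = names.index(current_name)  # raises ValueError for unknown names
    if index == len(STEPS) - 1:
        return current_name
    index += 1
    while STEPS[index].is_system and index < len(STEPS) - 1:
        index += 1
    return STEPS[index].name
```

In this model, a document leaving Registration passes through the three system steps and stops at Verification, which is the first point where a user has to act.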

 

Actions

Generate text layer

This action adds files to the Text layer queue. In the example given below, the action processes all supported files (jpg, pdf and tiff) attached to the workflow document. After the text layer is generated, it overwrites the original files.

ocr13

The action can also process only selected files. Files can be selected by type, by the category they belong to, or by filtering with a regular expression.
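The selection criteria can be illustrated with a small filter. This is a sketch under stated assumptions, not the way WEBCON BPS implements the action: the `Attachment` class and `select_attachments` function are hypothetical, the regular expression is assumed to apply to the file name, and all criteria that are supplied are assumed to be combined.

```python
import re
from dataclasses import dataclass
from pathlib import PurePath

# File types named in the text; the exact extension spellings are an assumption.
SUPPORTED_EXTENSIONS = {".jpg", ".pdf", ".tiff"}


@dataclass
class Attachment:
    name: str
    category: str = ""


def select_attachments(attachments, category=None, name_pattern=None):
    """Return supported attachments that match every criterion provided."""
    regex = re.compile(name_pattern) if name_pattern else None
    selected = []
    for attachment in attachments:
        extension = PurePath(attachment.name).suffix.lower()
        if extension not in SUPPORTED_EXTENSIONS:
            continue  # unsupported file type
        if category is not None and attachment.category != category:
            continue  # wrong category
        if regex is not None and not regex.search(attachment.name):
            continue  # file name does not match the pattern
        selected.append(attachment)
    return selected
```

For example, `select_attachments(files, category="Invoices", name_pattern=r"^invoice_")` would keep only supported files in a category called Invoices whose names start with `invoice_` (both values are made up for illustration).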

After the text layer is generated, the action can overwrite the files, create new versions of them, or create new attachments, for which several properties can be configured, such as name, description and category.

This action also sets a priority for files. Priority values range from 1 to 10, where 1 is the highest priority. If the Night OCR option is selected, the priority is set to 11. Such files are processed only at night or after hours; the exact times are configured in the schedules section in system settings.
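The priority rules can be sketched as a small priority queue. The `TextLayerQueue` class below is hypothetical and only illustrates the ordering described in the text; how WEBCON BPS actually schedules the queue is not documented here, and the `after_hours` flag stands in for the schedule configured in system settings.

```python
import heapq
import itertools

NIGHT_PRIORITY = 11  # value assigned when the Night OCR option is selected


class TextLayerQueue:
    """Illustrative queue: lower number means higher priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps insertion order for ties

    def add(self, file_name, priority, night_ocr=False):
        if night_ocr:
            priority = NIGHT_PRIORITY
        elif not 1 <= priority <= 10:
            raise ValueError("priority must be between 1 and 10")
        heapq.heappush(self._heap, (priority, next(self._counter), file_name))

    def pop_next(self, after_hours=False):
        """Return the next file to process, or None if nothing is eligible."""
        if not self._heap:
            return None
        priority, _, file_name = self._heap[0]
        # Night OCR files have the largest priority number, so if one is at the
        # top of the heap, only Night OCR files remain in the queue.
        if priority == NIGHT_PRIORITY and not after_hours:
            return None
        heapq.heappop(self._heap)
        return file_name
```

With this model, a file added with priority 1 is returned before a file with priority 3, and a Night OCR file is returned only when `after_hours=True`.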

Seller TAX ID recognition

This action searches for a “distinguisher” field (in this case, Seller TAX ID). The value of the distinguisher field then tells the system which custom OCR AI project to use on the file for best results.

For this action, the distinguisher field is left empty, as the distinguisher itself should be recognized by a single custom project.

You may now map the recognized fields to the desired BPS form fields. If the TAX ID field is checked, every non-numerical character is removed from the TAX ID number.
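The effect of the TAX ID option can be pictured as stripping everything except digits. The function name `normalize_tax_id` and the sample value are made up for illustration; the snippet shows only the described behaviour, not the product's code.

```python
import re


def normalize_tax_id(raw_value):
    """Remove every character that is not an ASCII digit."""
    return re.sub(r"[^0-9]", "", raw_value)


if __name__ == "__main__":
    # Made-up sample value with a country prefix and separators.
    print(normalize_tax_id("PL 123-456-78-90"))
```

Normalizing the value this way means that the same TAX ID printed with different separators on different documents resolves to one distinguisher value.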

ocr14

This workflow uses two identical actions. The first one leads from the “Awaiting text layer” step to the “TAX ID recognition” step. The second one, which leads from the “Awaiting for Teach OCR AI” step to the “Awaiting for OCR AI recognition” step, is used to assess the effectiveness of OCR AI learning.

OCR AI recognition step

This action recognizes all the remaining fields. The distinguisher is set to Seller TAX ID, which has already been retrieved by the previous action (TAX ID recognition). The system chooses a custom OCR AI project based on the value of the distinguisher field.
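Conceptually, choosing a project by distinguisher resembles a lookup keyed by the recognized Seller TAX ID. The sketch below is an assumption-laden illustration: the project names, the `choose_project` function, and the fallback to a general project when no custom project matches are all hypothetical.

```python
# Hypothetical name for the project used when no custom project matches.
DEFAULT_PROJECT = "general-invoice-project"


def choose_project(distinguisher_value, custom_projects, default=DEFAULT_PROJECT):
    """Return the custom project registered for this value, else the default."""
    return custom_projects.get(distinguisher_value, default)


if __name__ == "__main__":
    # Made-up mapping from Seller TAX ID to a custom project name.
    projects = {"1234567890": "supplier-a-project"}
    print(choose_project("1234567890", projects))
    print(choose_project("9999999999", projects))
```

This is why the distinguisher has to be recognized first: without its value, there is nothing to look a custom project up by.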

ocr15

In this case, an OCR AI project that includes all the remaining fields should be used.

OCR AI learning actions

The example workflow uses 2 OCR AI learning actions:

  1. Learn OCR AI – used to teach the full OCR AI project in order to improve recognition effectiveness.
  2. Learn TAX ID – used to teach the Seller TAX ID project.

Every learning action in this example is an “on path” action. To choose the learning mode, the user has to choose a path with a correctly configured action.

 

OCR Verification step

The OCR Verification step is used to verify and correct data recognized by the OCR AI system.

To use the OCR Verification step, open the step configuration screen, go to the Properties window on the General tab, and check the “OCR Verification step” field.

ocr16

OCR Verification mode is available only when the attached file has been processed by OCR AI. If OCR AI recognition was skipped, verification mode is not available.

OCR Verification form

The OCR verification form offers additional features, such as a preview of the recognized text fragment and checkboxes for choosing which fields to use for learning.

ocr17

After you click the preview of the recognized text, the graphic viewer automatically shows that fragment of text enclosed in a red frame. This view lets you judge whether the recognized text is correct (not only the characters, but also the format and spacing).

ocr18

Revision of recognized data

If the system does not find the correct fragment of text, or finds an incorrect one, the values can be modified.

To enter a missing value, click the desired word. The text is then entered automatically into the most recently selected form field.

ocr19

After revising the fields, the user can specify which fields need relearning or improving.

ocr20

Checkboxes are available only after new values have been entered into the form fields.

If the configuration is correctly completed, the system is ready to work with OCR AI components.
