
Retrieve text data in PDF files with UiPath (PDF Activity, OCR, Anchor-based, CV)

When developing in UiPath Studio, you sometimes need to get the text of a PDF file and use it to register data in a system or to send an email.

This article explains how to install the PDF package, what you can do with the PDF activities, and how to extract text data from PDF files.

 


This site was created by translating a blog created in Japanese into English using the DeepL translation.

Please forgive me if some of the English text is a little strange.

PDF Package Installation Instructions

To use activities such as reading PDF text or getting the number of pages in a PDF, you need to install “UiPath.PDF Activities”.

Activity packages are configured per process, so install them as needed each time you create a new process.

 

Installation instructions for the PDF package
① With the target process open in Studio, click “Manage Packages”.

② Click “Official” in the pop-up window.

③ Enter “UiPath.PDF” in the search box and click [UiPath.PDF Activities].

④ Click “Install”.

⑤ Click “Save”.

⑥ Click “Project” in the lower left corner and make sure “UiPath.PDF Activities” is displayed.

 

 

What you can do with PDF Activities

Once the PDF activity package is installed, you can use its activities to read text from PDF files and to export PDF pages as images.

The PDF activities can be found in App Integration > PDF.

 

PDF activity list

Activity name | Activity behavior
Read PDF With OCR | Reads all characters from a specified PDF file using OCR technology and stores them in a string variable.
Extract Images From PDF | Extracts images from a specified PDF file.
Read PDF Text | Reads all characters from a specified PDF file and stores them in a string variable.
Manage PDF Password | Changes the password of a specified PDF file.
Get PDF Page Count | Provides the total number of pages in a PDF file.
Extract PDF Page Range | Extracts a specified range of pages from a PDF document.
Join PDF Files | Joins multiple PDF files stored in an array of strings into a single PDF file.
Export PDF Page As Image | Creates an image from a page in a specified PDF file.

 

Extract all text data from PDF files

To extract all the text in a PDF file, use the Read PDF Text, Get Full Text, or Read PDF With OCR activity.

 

Activity to extract all text in a PDF

Activity name | Activity location | Activity overview | Reading accuracy
Read PDF Text | App Integration > PDF | Reads all characters from a specified PDF file and stores them in a string variable. | Fairly good
Get Full Text | UI Automation > Text > Screen Scraping | Extracts a string and its information from an indicated UI element using the FullText screen scraping method. | Fairly good
Read PDF With OCR | App Integration > PDF | Reads all characters from a specified PDF file using OCR technology and stores them in a string variable. | Not so good (depends on the OCR engine used)

 

F-pen: PDFs exported from Office files are read almost accurately, but PDFs of handwritten paper documents are read with lower accuracy.
sea otter: The OCR engines available for free are quite inaccurate, so don’t have high expectations.
F-pen: Considering text-reading accuracy and process stability, prioritize the activities in this order: Read PDF Text > Get Full Text > Read PDF With OCR.
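
The priority order above can be pictured as a simple fallback chain. The following is only a minimal Python sketch of that idea, not UiPath code: the three extractor functions are hypothetical stand-ins for the Read PDF Text, Get Full Text, and Read PDF With OCR activities, and the file name is made up for illustration.

```python
from typing import Callable, Iterable, Optional


def extract_with_fallback(
    extractors: Iterable[Callable[[str], Optional[str]]], pdf_path: str
) -> Optional[str]:
    """Try each extractor in priority order and return the first non-empty text."""
    for extract in extractors:
        try:
            text = extract(pdf_path)
        except Exception:
            # A failing method (for example, a viewer window that is not open)
            # simply hands over to the next method in the list.
            continue
        if text and text.strip():
            return text
    return None


# Hypothetical stand-ins for the three activities (assumptions, not real APIs).
def read_pdf_text(path: str) -> str:
    return ""  # e.g. a scanned PDF with no text layer


def get_full_text(path: str) -> str:
    raise RuntimeError("PDF viewer window not found")


def read_pdf_with_ocr(path: str) -> str:
    return "Invoice No. 12345"


result = extract_with_fallback(
    [read_pdf_text, get_full_text, read_pdf_with_ocr], "sample.pdf"
)
print(result)
```

In a real workflow the same idea would typically be built with Try Catch and If activities around the three extraction activities.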

 

Read PDF Text

To read all the text in a PDF file, use the “Read PDF Text” activity.

“Read PDF Text” is located in App Integration > PDF.

 

Read PDF Text  Setting item

Setting location | Setting item | Setting contents
Properties > File | FileName | The path of the PDF file to be read.
Properties > File | Password | The password of the PDF file, if necessary.
Properties > Input | PreserveFormatting | If selected, this option maintains the formatting of the file after the extraction is completed.
Properties > Input | Range | The range of pages that you want to read.
Properties > Output | Text | The extracted string.
Properties > Common | DisplayName | The display name of the activity.
Properties > Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level.
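
To make the Range setting more concrete, here is a small Python sketch of how a page-range string could be expanded into page numbers. The accepted forms ("All", "7", "7-12", "2-5, 7, 15-End") are an assumption for illustration; check the activity documentation for the exact syntax that Range supports. The function name is hypothetical.

```python
from typing import List


def parse_page_range(spec: str, page_count: int) -> List[int]:
    """Expand a range string such as "2-5, 7, 15-End" into 1-based page numbers."""
    if spec.strip().lower() == "all":
        return list(range(1, page_count + 1))

    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_s, end_s = (p.strip() for p in part.split("-", 1))
            start = int(start_s)
            # "End" is treated as the last page of the document.
            end = page_count if end_s.lower() == "end" else int(end_s)
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.extend(range(start, end + 1))
    return pages


print(parse_page_range("2-3, 5, 7-End", 8))
```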

 

Sample Process
Reads all text in a PDF file and outputs the read text data to a log message.

・Read PDF Text  Properties

・Variables

・PDF to be read (export Excel file, no handwriting)

・Execution result  Log

F-pen: The PDF was created by exporting from Excel, so the text is read almost exactly as it should be.
sea otter: If the PDF file to be read contains handwriting, the reading accuracy will decrease.
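
Once the full text is in a string variable, the next step is usually to pick out the values needed for data registration or an email. The sketch below shows one way to do that with a regular expression in Python; the sample text, the field labels, and the function name are all made up for illustration and are not taken from the sample PDF.

```python
import re
from typing import Optional

# Hypothetical text, as it might come out of a text-extraction step.
extracted_text = """Invoice
Invoice No: INV-2021-0042
Date: 2021/03/15
Total: 12,800 JPY
"""


def find_field(text: str, label: str) -> Optional[str]:
    """Return the value that follows 'label:' on the same line, or None."""
    pattern = rf"^{re.escape(label)}\s*[:：]\s*(.+?)\s*$"
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1) if match else None


print(find_field(extracted_text, "Invoice No"))
print(find_field(extracted_text, "Total"))
```

In UiPath, the same kind of pattern could be applied to the output variable with the string and regular-expression features available in the workflow.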

 

Get Full Text

To read the text of a PDF file by specifying a UI element, use the “Get Full Text” activity.

“Get Full Text” requires a selector that is specified while the PDF file is open, so you need to have Adobe Acrobat Reader (or an alternative PDF viewer) installed.

 

“Get Full Text” is located in UI Automation > Text > Screen Scraping.

 

Get Full Text  Setting item

Setting location | Setting item | Setting contents
Properties > Output | Text | The string extracted from the indicated UI element.
Properties > Options | IgnoreHidden | If this check box is selected, string information from hidden UI elements is not extracted. By default, this check box is not selected.
Properties > Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level.
Properties > Selector | Target.Selector | Text property used to find a particular UI element when the activity is executed.
Properties > Selector | Target.TimeoutMS | Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown.
Properties > Selector | Target.WaitForReady | Before performing the actions, wait for the target to become ready.
Properties > Selector | Target.Element | Use the UiElement variable returned by another activity.
Properties > Selector | Target.ClippingRegion | Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom.
Properties > Common | DisplayName | The display name of the activity.
Properties > Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error.

 

Sample Process
Reads all the text of a pre-opened PDF file by specifying the UI elements and outputs the read text data to a log message.

・Get Full Text  Properties

・PDF file to be read

・Execution result  Log

F-pen: Since the PDF was created by exporting from Excel, Get Full Text reads the full text accurately.

 

Read PDF With OCR

To read all characters in a PDF file with OCR, use the “Read PDF With OCR” activity.

“Read PDF With OCR” is located in App Integration > PDF.

 

Read PDF With OCR  Setting item

Setting location | Setting item | Setting contents
Properties > Common | DisplayName | The display name of the activity.
Properties > File | FileName | The path of the PDF file to be read.
Properties > File | Password | The password of the PDF file, if necessary.
Properties > Input | DegreeOfParallelism | Specifies how many pages, if any, are analyzed in parallel.
Properties > Input | ImageDpi | The DPI used for the OCR process.
Properties > Input | Range | The range of pages that you want to read.
Properties > Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level.
Properties > Output | Text | The extracted string.

 

sea otter: If you use “Read PDF With OCR”, you need to set the OCR engine in the activity.
F-pen: The OCR engine “Microsoft OCR” requires no additional settings, but “Tesseract OCR (formerly Google OCR)” requires additional settings.

 

Additional configuration steps for Tesseract OCR
Tesseract OCR (formerly Google OCR) requires that you place the language files in a specified folder when using non-English languages. The procedure to add Japanese files is as follows.

①Download jpn.traineddata from the tesseract-ocr/tessdata page.

②Create a “tessdata” folder in the UiPath installation directory (*) and save the jpn.traineddata file in it.

F-pen: *The installation directory of the Enterprise version of UiPath is “C:\Program Files\UiPath\Studio”.
sea otter: *The installation directory of the Community version of UiPath is “C:\Users\[username]\AppData\Local\UiPath\app-xx.xx.xx\net461”. The xx.xx.xx part is the version number, so choose the folder with the highest version number among those that exist.

③If you restart UiPath (close the application and start it again), you will be able to use Japanese.
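
When several app-xx.xx.xx folders exist, the highest version has to be chosen by comparing the numbers, not the text (as text, "app-20.4.1" would sort after "app-20.10.2"). The following Python sketch shows that comparison; the folder names in the example are hypothetical, and the function name is an assumption for illustration.

```python
import re
from typing import Iterable, Optional


def latest_app_folder(names: Iterable[str]) -> Optional[str]:
    """Return the app-x.y.z folder name with the highest version, or None."""
    best_name = None
    best_version = None
    for name in names:
        match = re.fullmatch(r"app-(\d+(?:\.\d+)*)", name)
        if not match:
            continue  # skip folders that are not version folders
        # Compare versions as tuples of integers, e.g. (20, 10, 2).
        version = tuple(int(part) for part in match.group(1).split("."))
        if best_version is None or version > best_version:
            best_name, best_version = name, version
    return best_name


# Hypothetical folder listing for illustration.
folders = ["app-19.10.0", "app-20.4.1", "app-20.10.2", "packages"]
print(latest_app_folder(folders))
```

In practice the list of names would come from listing the UiPath folder under AppData\Local, for example with os.listdir.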

 

Sample Process 1
Read all text in PDF files with Microsoft OCR and output the read text data to log messages.

 

・Read PDF With OCR  Properties

・Microsoft OCR  Properties

・Variables

・PDF to be read

 

・Execution result  Log

sea otter: Microsoft OCR is not very accurate, so even for a PDF exported from Excel, a few words are read incorrectly.

 

Sample Process 2
Read all the text in a PDF file with Tesseract OCR and output the read text data to a log message.

 

・Read PDF With OCR  Properties

・Properties of Tesseract OCR

・Variables

・PDF to be read

・Execution result Log

F-pen: The reading accuracy of “Tesseract OCR” is not very good.

 

Partial text data extraction from PDF files

To extract part of the text in a PDF file, use the Get Text, Get Visible Text, Anchor Base, or CV Get Text activity.

 

Activity to extract some text from a PDF

Activity name | Activity location | Activity overview | Reading accuracy
Get Text | UI Automation > Element > Control | Extracts a text value from a specified UI element. | n
Get Visible Text | UI Automation > Text > Screen Scraping | Extracts a string and its information from an indicated UI element using the Native screen scraping method. | n
Anchor Base | UI Automation > Element > Find | A container that searches for a UI element by using other UI elements as anchors. | n
CV Get Text | Computer Vision | Extracts the text from a specified UI element. | n

*Y:yes,     n:neither,     N:no,    -:None

 

sea otter: Considering reading accuracy and process stability, you should prioritize the activities that work with UI elements.
F-pen: In order of priority: Get Text, Anchor Base (Element) > Get Visible Text > CV Get Text, Anchor Base (Image).

 

Get Text

To extract a text value from a specified UI element of a PDF, use the “Get Text” activity.

“Get Text” is located in UI Automation > Element > Control.

 

Get Text  Setting item

Setting location | Setting item | Setting contents
Properties > Output | Value | Enables you to store the text from the specified UI element in a variable, as well as make changes to the text with VB expressions.
Properties > Common | DisplayName | The display name of the activity.
Properties > Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error.
Properties > Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level.
Properties > Target | Target.Selector | Text property used to find a particular UI element when the activity is executed.
Properties > Target | Target.TimeoutMS | Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown.
Properties > Target | Target.WaitForReady | Before performing the actions, wait for the target to become ready.
Properties > Target | Target.Element | Use the UiElement variable returned by another activity.
Properties > Target | Target.ClippingRegion | Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom.

 

Sample Process
Read some text by specifying UI elements of a PDF file that has been opened in advance, and output the read text data to a log message.

・Get Text Properties

・Get Text  Selector

・Variables

・PDF to be read

・Execution result  Log

 

Get Visible Text

Use the “Get Visible Text” activity to extract a string and its information from a specified UI element of a PDF using the Native screen scraping method.

“Get Visible Text” is located in UI Automation > Text > Screen Scraping.

 

Get Visible Text  Setting item

Setting location | Setting item | Setting contents
Properties > Output | Text | The string extracted from the indicated UI element.
Properties > Output | WordsInfo | The screen coordinates of each word found in the indicated UI element.
Properties > Options | Separators | Specify the characters used as string separators.
Properties > Options | FormattedText | If this check box is selected, the screen layout of the scraped text is preserved.
Properties > Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level.
Properties > Input | Target.Selector | Text property used to find a particular UI element when the activity is executed.
Properties > Input | Target.TimeoutMS | Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown.
Properties > Input | Target.WaitForReady | Before performing the actions, wait for the target to become ready.
Properties > Input | Target.Element | Use the UiElement variable returned by another activity.
Properties > Input | Target.ClippingRegion | Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom.
Properties > Common | DisplayName | The display name of the activity.
Properties > Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error.

 

Sample Process
Reads some text by specifying UI elements in a PDF file that has been opened in advance, and outputs the read text data and word information to a log message.

・Get Visible Text  Properties

F-pen: “Get Visible Text” outputs both the text and the word information.
sea otter: WordsInfo is output as IEnumerable<TextInfo>, so a “For Each” activity is used to output each element.

 

・Get Visible Text  Selector

・For Each  Properties

F-pen: The TypeArgument of “For Each” is set to TextInfo, the type of each element.
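
Conceptually, the “For Each” loop walks over a collection of per-word records and logs each one. The Python sketch below mirrors that loop only as an illustration: the WordInfo class and its field names are hypothetical stand-ins and do not describe the actual members of UiPath's TextInfo type.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class WordInfo:
    """Hypothetical stand-in for one WordsInfo entry: a word and its screen rectangle."""
    text: str
    x: int
    y: int
    width: int
    height: int


def format_words(words: Iterable[WordInfo]) -> List[str]:
    """Build one log line per word, as a For Each + Log Message pair would."""
    return [f"{w.text} @ ({w.x}, {w.y}) {w.width}x{w.height}" for w in words]


# Made-up sample data for illustration.
sample_words = [
    WordInfo("Total", 10, 100, 50, 20),
    WordInfo("12,800", 80, 100, 60, 20),
]
for line in format_words(sample_words):
    print(line)
```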

 

・Variables

・PDF to be read

 

・Execution result  Log

 

 

 

Anchor Base

If you can’t locate the target element in the PDF directly, use the “Anchor Base” activity: specify another element or image as a landmark (anchor) and get the value of the element at a position relative to it.

“Anchor Base” is located in UI Automation > Element > Find.

 

Anchor Base  Setting item

Setting location | Setting item | Setting contents
Properties > Input | AnchorPosition | Specifies to which edge of the container the UI element is anchored.
Properties > Common | DisplayName | The display name of the activity.
Properties > Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error.
Properties > Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level.

 

Sample Process 1
Specify a UI element in a PDF file that has been opened in advance, read some text at a relative position, and output the read text data to a log message.

sea otter: On the left side of Anchor Base, place an activity that finds the UI element to be used as the anchor.
F-pen: On the right side of Anchor Base, place the activity for the element you want to retrieve, which is located relative to the anchor.
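
The anchor idea can be illustrated outside UiPath as a small geometric lookup: find the anchor first, then pick the nearest item in a given direction. The Python sketch below does this for "the item to the right of the anchor". It is only a conceptual illustration, not how Anchor Base is implemented; the Box class, the function name, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    """A piece of on-screen text with its bounding rectangle (hypothetical)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def find_right_of_anchor(boxes: List[Box], anchor_text: str) -> Optional[str]:
    """Return the text of the nearest box to the right of the anchor, or None."""
    anchor = next((b for b in boxes if b.text == anchor_text), None)
    if anchor is None:
        return None
    # Candidates start to the right of the anchor and overlap it vertically.
    candidates = [
        b for b in boxes
        if b is not anchor
        and b.left >= anchor.right
        and b.top < anchor.bottom
        and b.bottom > anchor.top
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda b: b.left - anchor.right).text


# Made-up layout: two label/value rows.
layout = [
    Box("Date", 10, 50, 50, 70),
    Box("2021/03/15", 80, 50, 160, 70),
    Box("Total", 10, 100, 60, 120),
    Box("12,800", 80, 100, 140, 120),
    Box("JPY", 150, 100, 180, 120),
]
print(find_right_of_anchor(layout, "Total"))
```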

 

・Anchor Base  Properties

・Find Element   Properties

・Find Element  Selector

・Get Text  Properties

・Variables

・PDF to be read

・Execution result  Log

 

 

Sample Process 2
Specify an image element in a PDF file that has been opened in advance, read some text at a relative position, and output the read text data to a log message.

sea otter: On the left side of Anchor Base, an activity that finds the anchor image is placed.

 

・Anchor Base  Properties

・Find Image  Properties

・Find Image  Selector

 

・Get Text  Properties

・Variables

・PDF to be read

・Execution result  Log

 

CV Get Text

If you can’t locate the element in the PDF on screen and can’t specify an image as an Anchor Base anchor either, use the “CV Get Text” activity. It uses AI Computer Vision to specify other images as landmarks and gets the value of the element at the relative position.

F-pen: “CV Get Text” can only be used within a “CV Screen Scope”.
sea otter: To use “CV Screen Scope”, you need to get an API key from UiPath Automation Cloud and set it in the ApiKey property.

 

“CV Get Text” and “CV Screen Scope” are located under Computer Vision.

 

CV Get Text  Setting item

Setting location | Setting item | Setting contents
Properties > Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error.
Properties > Common | DelayAfter | Delay time (in milliseconds) after executing the activity.
Properties > Common | DelayBefore | Delay time (in milliseconds) before the activity begins performing any operations.
Properties > Common | DisplayName | The display name of the activity.
Properties > Input | Descriptor | The on-screen coordinates of the Target and each Anchor that is used, if any.
Properties > Input | Method | Specifies what method you want to use to retrieve the text.
Properties > Input | TimeoutMS | Specifies the amount of time (in milliseconds) to wait for the activity to run before an error is thrown.
Properties > Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level.
Properties > Options | RefreshBefore | If selected, a Computer Vision screen analysis is carried out to make sure that any changes in the user interface since the last CV Screen Scope or Refresh activities are captured.
Properties > Output | Result | The retrieved text, stored in a String variable. This field supports only String variables.
Properties > Reusable Region | InputRegion | Receives the target of another CV activity stored in a Rectangle variable, using it as a target for this activity.
Properties > Reusable Region | OutputRegion | Saves the target of this activity as a Rectangle variable.

 

 

CV Screen Scope  Setting item

Setting location | Setting item | Setting contents
Properties > Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error.
Properties > Common | DelayBefore | Delay time (in milliseconds) before the activity begins performing any operations.
Properties > Common | DisplayName | The display name of the activity.
Properties > Input | CvMethod |
Properties > Target | ClippingRegion | Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom.
Properties > Target | Element | Use the UiElement variable returned by another activity.
Properties > Target | Selector | Text property used to find a particular UI element when the activity is executed.
Properties > Target | Timeout (milliseconds) | Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown.
Properties > Target | WaitForReady | Before performing the actions, wait for the target to become ready.
Properties > Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level.
Properties > Server (synced) | ApiKey | The API key used for authenticating to the Computer Vision server.
Properties > Server (synced) | URL | The URL of the server that runs the Computer Vision service. By default, this property is set to https://cv.uipath.com/.
Properties > Server (synced) | UseLocalServer | If checked, the local server is used for the analysis.
Activity body | The specified screen | The application you want to automate can be indicated to the CV Screen Scope activity by using the Indicate On Screen button in the body of the activity.
Activity body | Screen name | Screens can also be renamed by selecting them from the Screen Name drop-down and clicking the rename button.

 

Sample Process
Specify a screen of a PDF file that has been opened in advance, read some text at a position relative to the image in the screen, and output the read text data to a log message.

 

・CV Screen Scope  Properties

・CV Screen Scope  Selector

・APIkey acquisition screen of CV Screen Scope

F-pen: The “CV Screen Scope” API key can be obtained by logging in to UiPath Automation Cloud and clicking [Copy API key] in [Admin] -> [Licenses].
sea otter: The URL for “CV Screen Scope” is fixed at https://cv.uipath.com/.

 

・UiPathScreenOCR  Properties

・CV Get Text  Properties

・Variables

・PDF to be read

・Execution result  Log

 

 

Summary


 

The operator of this blog, F-penIT blog