Retrieve text data in PDF files with UiPath (PDF Activity, OCR, Anchor-based, CV)

In UiPath Studio development, there are times when you want to get the text of a PDF file and use it to register data in the system or send an email.

This article explains how to install the PDF package, what you can do with PDF Activities, and how to extract text data from PDF.


This site was created by translating a blog created in Japanese into English using the DeepL translation.

Please forgive me if some of the English text is a little strange

PDF Package Installation Instructions

For activities such as reading PDF text or getting the number of pages in a PDF, you will need to install “UiPath.PDF Activities”.

Activity packages are configured for each process, so install them as needed each time you create a new process.

Installation instructions for the PDF package
①With the target process open in Studio, click “Manage Packages”.

②Click on “Official” in the pop-up window.

③Enter “UiPath.PDF” in the search window and click [UiPath.PDF Activities].

④Click “Install”

⑤Click “Save”

⑥Click “Project” in the lower left corner, and make sure “UiPath.PDF Activities” is displayed.

What you can do with PDF Activities

Once the PDF activity package is installed, you can use its activities to read text from PDF files and export PDF pages as images.

The PDF activities can be found in App Integration > PDF.

PDF activity list

Activity Name	Activity behavior
Read PDF With OCR	Reads all characters from a specified PDF file by using OCR technology and stores them in a string variable.
Extract Images From PDF	Extracts images from a specified PDF file.
Read PDF Text	Reads all characters from a specified PDF file and stores them in a string variable.
Manage PDF Password	Changes the password of a specified PDF file.
Get PDF Page Count	Provides the total number of pages in a PDF file.
Extract PDF Page Range	Extracts a specified range of pages from a PDF document.
Join PDF Files	Joins multiple PDF files stored in an array of strings into a single PDF file.
Export PDF Page As Image	Creates an image from a page in a specified PDF file.

Extract all text data from PDF files

To extract all text in a PDF file, use the Read PDF Text, Get Full Text, and Read PDF with OCR activities.

Activity to extract all text in a PDF

Activity Name	Activity location	Activity Overview	Reading Accuracy
Read PDF Text	App Integration > PDF	Reads all characters from a specified PDF file and stores them in a string variable.	Fairly good
Get Full Text	UI Automation > Text > Screen Scraping	Extracts a string and its information from an indicated UI element using the FullText screen scraping method.	Fairly good
Read PDF With OCR	App Integration > PDF	Reads all characters from a specified PDF file by using OCR technology and stores them in a string variable.	Not so good (depends on the OCR engine used)

F-pen

PDF files exported from Office applications are read almost perfectly, but PDF files scanned from handwritten paper are read with lower accuracy.

sea otter

The OCR available for free is quite inaccurate, so don’t have high expectations.

F-pen

Considering text-reading accuracy and process stability, use the activities in this order of priority: Read PDF Text > Get Full Text > Read PDF With OCR.
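The priority order above can be pictured as a simple fallback chain. The following Python sketch is only an illustration of the idea, not UiPath code: the three functions are hypothetical stand-ins for the activities, and the file name is made up.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each extractor is a (name, function) pair. The functions are hypothetical
# stand-ins for the three activities; they are not UiPath APIs.
Extractor = Tuple[str, Callable[[str], str]]


def extract_with_fallback(
    pdf_path: str, extractors: Sequence[Extractor]
) -> Optional[Tuple[str, str]]:
    """Try extractors in priority order; return (name, text) of the first that yields text."""
    for name, extract in extractors:
        try:
            text = extract(pdf_path)
        except Exception:
            # A failed method (for example, a window that is not open) moves on to the next one.
            continue
        if text and text.strip():
            return name, text
    return None


# Stand-in implementations for illustration only.
def read_pdf_text(path: str) -> str:
    return ""  # pretend the PDF is a scan without a text layer


def get_full_text(path: str) -> str:
    raise RuntimeError("viewer window not found")


def read_pdf_with_ocr(path: str) -> str:
    return "Invoice 2024-001"


PRIORITY = [
    ("Read PDF Text", read_pdf_text),
    ("Get Full Text", get_full_text),
    ("Read PDF With OCR", read_pdf_with_ocr),
]

result = extract_with_fallback("sample.pdf", PRIORITY)
print(result)  # ('Read PDF With OCR', 'Invoice 2024-001')
```

In a real workflow the same idea would be built with Try Catch and If activities around the three activities.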

Read PDF Text

To read all the text in a PDF file, use the “Read PDF Text” activity.

“Read PDF Text” is located in App Integration > PDF.

Read PDF Text Setting item

Setting location		Setting item	Setting contents
Properties	File	FileName	The path of the PDF file to be read.
	File	Password	The password of the PDF file, if necessary.
	Input	PreserveFormatting	If selected, this option maintains the formatting of the file after the extraction is completed.
	Input	Range	The range of pages that you want to read.
	Output	Text	The extracted string.
	Common	DisplayName	The display name of the activity.
	Misc	Private	If selected, the values of variables and arguments are no longer logged at Verbose level.
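To give a feel for the Range property, here is a small Python sketch that expands a range string into page numbers. The accepted formats ("All", a single page such as "7", a span such as "7-12", and "End" for the last page) are my assumption about typical values, so please check the activity documentation for the exact rules. The function name is hypothetical.

```python
from typing import List


def expand_page_range(range_spec: str, page_count: int) -> List[int]:
    """Expand a range string such as "2-5, 7, 15-End" into sorted page numbers."""
    spec = range_spec.strip()
    if spec.lower() == "all":
        return list(range(1, page_count + 1))

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = (p.strip() for p in part.split("-", 1))
            start = int(start_text)
            # "End" is treated as the last page of the document.
            end = page_count if end_text.lower() == "end" else int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(expand_page_range("2-5, 7, 15-End", 16))  # [2, 3, 4, 5, 7, 15, 16]
```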

Read PDF Text

Sample Process
Reads all text in a PDF file and outputs the read text data to a log message.

・Read PDF Text Properties

・Variables

・PDF to be read (export Excel file, no handwriting)

・Execution result Log

F-pen

The PDF was created by exporting Excel, so it reads almost exactly as it should.

sea otter

If the PDF file to be read contains handwriting, the reading accuracy will decrease.
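Once the full text is in a string variable, the values you actually want to register in a system usually have to be picked out of it. The Python sketch below shows one way to do that with regular expressions. The sample text and the field labels are invented for illustration; in UiPath the same idea can be built with regular expressions on the output string.

```python
import re
from typing import Dict, Optional

# Invented sample of what an extracted text might look like.
SAMPLE_TEXT = """Invoice No: INV-2024-001
Date: 2024/04/01
Total: 12,800 JPY
"""

# One pattern per field; the labels are assumptions about the PDF layout.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"Invoice No:\s*(\S+)"),
    "date": re.compile(r"Date:\s*(\d{4}/\d{2}/\d{2})"),
    "total": re.compile(r"Total:\s*([\d,]+)"),
}


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Return the first match for each field, or None when the label is missing."""
    fields: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


print(extract_fields(SAMPLE_TEXT))
```

Returning None for a missing field makes it easy to detect a PDF whose layout differs from what the process expects.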

Get Full Text

To read the text of a PDF file that is open in a viewer by indicating its UI element, use the “Get Full Text” activity.

Because “Get Full Text” targets a selector in the window where the PDF file is open, you need to have Adobe Acrobat Reader (or an alternative PDF viewer) installed.

“Get Full Text” is located in UI Automation > Text > Screen Scraping.

Get Full Text Setting item

Setting location		Setting item	Setting Contents
Properties	Output	Text	The string extracted from the indicated UI element.
	Options	IgnoreHidden	If this check box is selected, string information from hidden UI elements is NOT extracted. By default, this check box is not selected.
	Misc	Private	If selected, the values of variables and arguments are no longer logged at Verbose level.
	Selector	Target.Selector	Text property used to find a particular UI element when the activity is executed.
		Target.TimeoutMS	Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown.
		Target.WaitForReady	Before performing the actions, wait for the target to become ready.
		Target.Element	Use the UiElement variable returned by another activity.
		Target.ClippingRegion	Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom.
	Common	DisplayName	The display name of the activity.
	Common	ContinueOnError	Specifies if the automation should continue even when the activity throws an error.

Get Full Text

Sample Process
Reads all the text of a pre-opened PDF file by specifying the UI elements and outputs the read text data to a log message.

・Get Full Text Properties

・PDF file to be read

・Execution result Log

F-pen

Since the PDF was created by exporting from Excel, the full text is read accurately.

Read PDF With OCR

Read all characters in a PDF file with OCR using the Read PDF With OCR activity.

“Read PDF With OCR” is located in App Integration > PDF.

Read PDF With OCR Setting item

Setting location		Setting item	Setting contents
Properties	Common	DisplayName	The display name of the activity.
	File	FileName	The path of the PDF file to be read.
	File	Password	The password of the PDF file, if necessary.
	Input	DegreeOfParallelism	Specifies how many pages, if any, are analyzed in parallel.
		ImageDpi	The DPI used for the OCR process.
		Range	The range of pages that you want to read.
	Misc	Private	If selected, the values of variables and arguments are no longer logged at Verbose level.
	Output	Text	The extracted string.
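The idea behind analyzing pages in parallel can be sketched in Python as follows. This is only a conceptual illustration: fake_ocr is a stand-in for a real OCR engine, the parameter name merely mirrors the property, and how the activity itself interprets its value is not covered here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def ocr_pages(
    pages: List[str],
    ocr_page: Callable[[str], str],
    degree_of_parallelism: int = 1,
) -> str:
    """Run OCR on several pages at once and join the results in page order."""
    workers = max(1, degree_of_parallelism)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the text comes back in page order.
        texts = list(pool.map(ocr_page, pages))
    return "\n".join(texts)


def fake_ocr(page_image: str) -> str:
    # Stand-in for a real OCR engine; it only echoes the page name.
    return f"text of {page_image}"


print(ocr_pages(["page1", "page2", "page3"], fake_ocr, degree_of_parallelism=2))
```

More workers shorten the total time for long documents, at the cost of more CPU and memory while the pages are being processed.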

Read PDF With OCR

sea otter

If you use “Read PDF With OCR”, you need to set the OCR engine in the activity.

F-pen

The OCR engine “Microsoft OCR” requires no additional settings, but “Tesseract OCR (formerly Google OCR)” requires additional settings.

Additional configuration steps for Tesseract OCR
Tesseract OCR (formerly Google OCR) requires that you place the language files in a specified folder when using non-English languages. The procedure to add Japanese files is as follows.

①Download jpn.traineddata from the tesseract-ocr/tessdata page.

②Create a “tessdata” folder in the UiPath installation directory (*) and save the jpn.traineddata file.

F-pen

*The installation directory of the enterprise version of UiPath is “C:\Program Files\UiPath\Studio”.

sea otter

*The installation directory of the Community version of UiPath is “C:\Users\[username]\AppData\Local\UiPath\app-xx.xx.xx\net461”. The xx.xx.xx part is the version number, so choose the folder with the highest version among those that exist.

③If you restart UiPath (close the application and start it again), you will be able to use Japanese.
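Choosing the folder with the highest version by eye is easy to get wrong (for example, 20.10 is newer than 20.4). The Python sketch below picks the newest app-xx.xx.xx folder by comparing the version numbers numerically, and builds the path where jpn.traineddata would go for the Community version. The function names are hypothetical, and the folder layout is taken from the note above.

```python
import re
from pathlib import Path
from typing import List, Optional

# Matches folder names such as "app-20.10.2" and captures the version part.
VERSION_DIR = re.compile(r"^app-(\d+(?:\.\d+)*)")


def latest_app_dir(dir_names: List[str]) -> Optional[str]:
    """Return the app-xx.xx.xx folder name with the highest version, if any."""
    best_name, best_version = None, None
    for name in dir_names:
        match = VERSION_DIR.match(name)
        if not match:
            continue
        # Compare versions as tuples of integers, not as strings.
        version = tuple(int(part) for part in match.group(1).split("."))
        if best_version is None or version > best_version:
            best_name, best_version = name, version
    return best_name


def tessdata_target(uipath_root: Path) -> Optional[Path]:
    """Build the expected jpn.traineddata path under the newest app folder."""
    if not uipath_root.is_dir():
        return None
    names = [p.name for p in uipath_root.iterdir() if p.is_dir()]
    latest = latest_app_dir(names)
    if latest is None:
        return None
    return uipath_root / latest / "net461" / "tessdata" / "jpn.traineddata"


print(latest_app_dir(["app-20.4.1", "app-20.10.2", "app-19.10.4", "packages"]))  # app-20.10.2
```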

Installing OCR Languages

Sample Process 1
Read all text in PDF files with Microsoft OCR and output the read text data to log messages.

・Read PDF With OCR Properties

・Microsoft OCR Properties

・Variables

・PDF to be read

・Execution result Log

sea otter

Microsoft OCR is not very accurate, so even for the PDF exported from Excel, a few words are read incorrectly.

Sample Process 2
Read all the text in a PDF file with Tesseract OCR and output the read text data to a log message.

・Read PDF With OCR Properties

・Properties of Tesseract OCR

・Variables

・PDF to be read

・Execution result Log

F-pen

The reading accuracy of “Tesseract OCR” is not very good.

Partial text data extraction from PDF files

To extract part of the text in a PDF file, use the Get Text, Get Visible Text, Anchor Base, and CV Get Text activities.

Activity to extract some text from a PDF

Activity Name	Activity location	Activity Overview	Reading Accuracy
Get Text	UI Automation > Element > Control	Extracts a text value from a specified UI element.	n
Get Visible Text	UI Automation > Text > Screen Scraping	Extracts a string and its information from an indicated UI element using the Native screen scraping method.	n
Anchor Base	UI Automation > Element > Find	A container that searches for a UI element by using other UI elements as anchors.	n
CV Get Text	Computer Vision	Extracts the text from a specified UI element.	n

*Y: good, n: neither good nor bad, N: poor, -: not applicable

sea otter

Considering reading accuracy and process stability, you should give priority to activities that target UI elements.

F-pen

In order of priority: Get Text, Anchor Base (Element) > Get Visible Text > CV Get Text, Anchor Base (Image).

Get Text

Extracting some text values from a specified UI element of a PDF can be done using the Get Text activity.

“Get Text” is located in UI Automation > Element > Control.

Get Text Setting item

Setting location		Setting item	Setting contents
Properties	Output	Value	Enables you to store the text from the specified UI element in a variable, as well as make changes to the text with VB expressions.
	Common	DisplayName	The display name of the activity.
	Common	ContinueOnError	Specifies if the automation should continue even when the activity throws an error.
	Misc	Private	If selected, the values of variables and arguments are no longer logged at Verbose level.
	Target	Target.Selector	Text property used to find a particular UI element when the activity is executed.
		Target.TimeoutMS	Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown.
		Target.WaitForReady	Before performing the actions, wait for the target to become ready.
		Target.Element	Use the UiElement variable returned by another activity.
		Target.ClippingRegion	Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom.

Get Text

Sample Process
Read some text by specifying UI elements of a PDF file that has been opened in advance, and output the read text data to a log message.

・Get Text Properties

・Get Text Selector

・Variables

・PDF to be read

・Execution result Log

Get Visible Text

Use the “Get Visible Text” activity to extract strings and their information from a specified UI element from a PDF using the Native screen scraping method.

“Get Visible Text” is located in UI Automation > Text > Screen Scraping.

Get Visible Text Setting item

Setting location		Setting item	Setting contents
Properties	Output	Text	The string extracted from the indicated UI element.
	Output	WordsInfo	The screen coordinates of each word found in the indicated UI element.
	Options	Separators	Specify the characters used as string separators.
	Options	FormattedText	If this check box is selected, the screen layout of the scraped text is preserved.
	Misc	Private	If selected, the values of variables and arguments are no longer logged at Verbose level.
	Input	Target.Selector	Text property used to find a particular UI element when the activity is executed.
		Target.TimeoutMS	Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown.
		Target.WaitForReady	Before performing the actions, wait for the target to become ready.
		Target.Element	Use the UiElement variable returned by another activity.
		Target.ClippingRegion	Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom.
	Common	DisplayName	The display name of the activity.
	Common	ContinueOnError	Specifies if the automation should continue even when the activity throws an error.

Get Visible Text

Sample Process
Reads some text by specifying UI elements in a PDF file that has been opened in advance, and outputs the read text data and word information to a log message.

・Get Visible Text Properties

F-pen

“Get Visible Text” is used to output text and word information.

sea otter

WordsInfo is output as IEnumerable<TextInfo> type, so I’m using “For Each” to output each element.

・Get Visible Text Selector

・For Each Properties

F-pen

In “For Each”, TypeArgument is set to TextInfo, the type of each element in WordsInfo.

・Variables

・PDF to be read

・Execution result Log
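The loop over WordsInfo in this sample can be pictured with the Python sketch below. WordInfo is a hypothetical stand-in for one TextInfo element; its field names are my assumptions for illustration and are not the actual members of the UiPath type.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class WordInfo:
    # Hypothetical stand-in for one TextInfo element; field names are assumptions.
    text: str
    x: int
    y: int
    width: int
    height: int


def format_word_logs(words: Iterable[WordInfo]) -> List[str]:
    """Build one log line per word, like a Log Message inside For Each."""
    return [f"{w.text} @ ({w.x}, {w.y}) size {w.width}x{w.height}" for w in words]


words = [WordInfo("Total", 10, 20, 40, 12), WordInfo("12,800", 60, 20, 50, 12)]
for line in format_word_logs(words):
    print(line)
```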

Anchor Base

If you can’t locate the element on the screen in the PDF, use the “Anchor Base” activity to specify another element or image as a landmark and get the value of the element in the relative position.

“Anchor Base” is located in UI Automation > Element > Find.

Anchor Base Setting item

Setting location		Setting item	Setting contents
Properties	Input	AnchorPosition	Specifies to which edge of the container the UI element is anchored.
	Common	DisplayName	The display name of the activity.
	Common	ContinueOnError	Specifies if the automation should continue even when the activity throws an error.
	Misc	Private	If selected, the values of variables and arguments are no longer logged at Verbose level.

Anchor Base

Sample Process 1
Specify a UI element in a PDF file that has been opened in advance, read some text at a relative position, and output the read text data to a log message.

sea otter

On the left side of the Anchor Base, we’ll place an activity that allows us to get the UI element that will be the anchor.

F-pen

On the right side of the Anchor Base, place the activity that retrieves the element you want, relative to the anchor UI element.

・Anchor Base Properties

・Find Element Properties

・Find Element Selector

・Get text Properties

・Variables

・PDF to be read

・Execution result Log
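The idea of an anchor can be sketched in Python as "find the label, then take the nearest element to its right on the same line". This is only a conceptual illustration with invented coordinates; it is not how UiPath implements Anchor Base, and the Box type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    # Hypothetical on-screen element: its text and bounding rectangle in pixels.
    text: str
    x: int
    y: int
    width: int
    height: int


def find_right_of_anchor(
    boxes: List[Box], anchor_text: str, y_tolerance: int = 5
) -> Optional[Box]:
    """Return the closest box to the right of the anchor on roughly the same line."""
    anchor = next((b for b in boxes if b.text == anchor_text), None)
    if anchor is None:
        return None
    anchor_right = anchor.x + anchor.width
    candidates = [
        b
        for b in boxes
        if b is not anchor
        and b.x >= anchor_right
        and abs(b.y - anchor.y) <= y_tolerance
    ]
    # The nearest candidate is the one with the smallest horizontal gap.
    return min(candidates, key=lambda b: b.x - anchor_right, default=None)


boxes = [
    Box("Invoice No", 10, 50, 80, 12),
    Box("INV-001", 100, 51, 60, 12),
    Box("Date", 10, 80, 40, 12),
    Box("2024/04/01", 100, 80, 70, 12),
]
print(find_right_of_anchor(boxes, "Date").text)  # 2024/04/01
```

Because the value is located relative to the label, the lookup keeps working even if the whole block moves to a different position on the page.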

Sample Process 2
Specify an image element in a PDF file that has been opened in advance, read some text at a relative position, and output the read text data to a log message.

sea otter

On the left side of the Anchor Base, I’ve placed an activity to find the anchor image.

・Anchor Base Properties

・Find Image Properties

・Find Image Selector

・Get Text Properties

・Variables

・PDF to be read

・Execution result Log

CV Get Text

If you can’t locate the element on the screen in the PDF, and you can’t specify an anchor-based image, use the “CV Get Text” activity, which uses AI Computer Vision to specify other images as landmarks and get the value of the element in the relative position.

F-pen

“CV Get Text” can only be used within the “CV Screen Scope”.

sea otter

To use “CV Screen Scope”, you need to get an API key from UiPath Automation Cloud and set it to a property.

“CV Get Text” and “CV Screen Scope” are located under Computer Vision.

CV Get Text Setting item

Setting location		Setting item	Setting contents
Properties	Common	ContinueOnError	Specifies if the automation should continue even when the activity throws an error.
		DelayAfter	Delay time (in milliseconds) after executing the activity.
		DelayBefore	Delay time (in milliseconds) before the activity begins performing any operations.
		DisplayName	The display name of the activity.
	Input	Descriptor	The on-screen coordinates of the Target and each Anchor that is used, if any.
		Method	Specifies what method you want to use to retrieve the text.
		TimeoutMS	Specifies the amount of time (in milliseconds) to wait for the activity to run before an error is thrown.
	Misc	Private	If selected, the values of variables and arguments are no longer logged at Verbose level.
	Options	RefreshBefore	If selected, a Computer Vision screen analysis is carried out to make sure that any changes in the user interface since the last CV Screen Scope or Refresh activities are captured.
	Output	Result	The retrieved text, stored in a String variable. This field supports only String variables.
	Reusable Region	InputRegion	Receives the target of another CV activity stored in a Rectangle variable, using it as a target for this activity.
	Reusable Region	OutputRegion	Saves the target of this activity as a Rectangle variable.

CV Get Text

CV Screen Scope Setting item

Setting location		Setting item	Setting contents
Properties	Common	ContinueOnError	Specifies if the automation should continue even when the activity throws an error.
		DelayBefore	Delay time (in milliseconds) before the activity begins performing any operations.
		DisplayName	The display name of the activity.
	Input	CvMethod	–
	Target	ClippingRegion	Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom.
		Element	Use the UiElement variable returned by another activity.
		Selector	Text property used to find a particular UI element when the activity is executed.
		Timeout (milliseconds)	Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown.
		Target.WaitForReady	Before performing the actions, wait for the target to become ready.
	Misc	Private	If selected, the values of variables and arguments are no longer logged at Verbose level.
	Server (synced)	ApiKey	The API key used for authenticating to the Computer Vision server.
		URL	The URL of the server that runs the Computer Vision service. By default, this property is set to https://cv.uipath.com/.
		UseLocalServer	If checked the local server will be used for the analysis.
Activity body		The specified screen	The application you want to automate can be indicated to the CV Screen Scope activity by using the Indicate On Screen button in the body of the activity.
Activity body		screen name	Screens can also be renamed by selecting them from the Screen Name drop-down and clicking the rename button

CV Screen Scope

Sample Process
Specify a screen of a PDF file that has been opened in advance, read some text at a position relative to the image in the screen, and output the read text data to a log message.

・CV Screen Scope Properties

・CV Screen Scope Selector

・APIkey acquisition screen of CV Screen Scope

F-pen

“CV Screen Scope” API key can be obtained by logging in to UiPath Automation Cloud and clicking [Copy API key] in [Admin] -> [Licenses].

sea otter

The URL for the CV Screen Scope is fixed at https://cv.uipath.com/.

・UiPathScreenOCR Properties

・CV Get Text Properties

・Variables

・PDF to be read

・Execution result Log

Summary

PDF activities become available once you install the UiPath.PDF Activities package.
To extract the full text of a PDF, use Read PDF Text, Get Full Text, or Read PDF With OCR.
To extract part of the text of a PDF, use Get Text, Get Visible Text, Anchor Base, or CV Get Text.


The operator of this blog, F-penIT blog

F-Pen

Japanese IT engineer with a wide range of experience in system development, cloud building, and service planning. In this blog, I will share my know-how on UiPath and certification. profile detail / twitter:@fpen17