In UiPath Studio development, we often extract text and tabular data from web pages.
However, the extraction does not always work the way you expect.
This article explains how to extract text data and tabular data from web pages (data scraping).
This site was created by translating a blog written in Japanese into English using DeepL.
Please forgive me if some of the English text is a little strange.
Get Text
Getting text data from a web page is done using the Get Text activity.
Get Text is found in UI Automation > Element > Control.
Get Text setting items
Setting location | Category | Setting item | Setting contents |
Properties | Output | Value | Enables you to store the text from the specified UI element in a variable, as well as make changes to the text with VB expressions. |
Properties | Common | DisplayName | The display name of the activity. |
Properties | Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error. |
Properties | Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level. |
Properties | Input | Target.Selector | Text property used to find a particular UI element when the activity is executed. |
Properties | Input | Target.TimeoutMS | Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown. |
Properties | Input | Target.WaitForReady | Before performing the actions, wait for the target to become ready. |
Properties | Input | Target.Element | Use the UiElement variable returned by another activity. |
Properties | Input | Target.ClippingRegion | Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom. |
Example: get the title of the top article on Yahoo Sports as text.
・Target web page (Yahoo! Sports Home)
・Get Text Properties
・Execution result
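UiPath workflows are assembled in Studio rather than written as code, so the following Python sketch is only an analogy for how Target.TimeoutMS and ContinueOnError interact. The names `get_text`, `find_headline`, and `SelectorNotFound` are hypothetical, and the 30000 ms default is an assumption about the usual setting, not something taken from this article.

```python
import time


class SelectorNotFound(Exception):
    """Hypothetical stand-in for UiPath's SelectorNotFoundException."""


def get_text(find_element, timeout_ms=30000, continue_on_error=False, poll_ms=50):
    """Poll find_element() until it returns text or timeout_ms elapses."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        text = find_element()
        if text is not None:
            return text
        if time.monotonic() >= deadline:
            if continue_on_error:
                return None  # swallow the error, like ContinueOnError = True
            raise SelectorNotFound(f"no element found within {timeout_ms} ms")
        time.sleep(poll_ms / 1000)


# Made-up page state: the headline only becomes available on the third lookup.
attempts = {"count": 0}


def find_headline():
    attempts["count"] += 1
    return "Top story title" if attempts["count"] >= 3 else None


headline = get_text(find_headline, timeout_ms=1000)
missing = get_text(lambda: None, timeout_ms=100, continue_on_error=True)
print(headline)
print(missing)
```

In this sketch a missing element raises an error once the timeout expires, unless the caller opts in to continuing, which is the same trade-off you make when setting ContinueOnError on the activity.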
Get Full Text
Extracting strings and their information from a web page is done using the Get Full Text activity.
Get Full Text is located in UI Automation > Text > Screen Scraping.
Get Full Text setting items
Setting location | Category | Setting item | Setting contents |
Properties | Output | Text | The string extracted from the indicated UI element. |
Properties | Options | IgnoreHidden | If this check box is selected, string information from hidden elements inside the indicated UI element is NOT extracted. |
Properties | Common | DisplayName | The display name of the activity. |
Properties | Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error. |
Properties | Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level. |
Properties | Input | Target.Selector | Text property used to find a particular UI element when the activity is executed. |
Properties | Input | Target.TimeoutMS | Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown. |
Properties | Input | Target.WaitForReady | Before performing the actions, wait for the target to become ready. |
Properties | Input | Target.Element | Use the UiElement variable returned by another activity. |
Properties | Input | Target.ClippingRegion | Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom. |
On the Yahoo Sports NFL Schedule page, use “Get Full Text” and “Get Text” on the week selector to get text data and output it to the log.
・Target web page (NFL Schedule)
・Get Full Text Properties
・Get Text Properties
・Execution result
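To make the IgnoreHidden option more concrete, here is a rough Python analogy using only the standard library. It is not how UiPath implements Get Full Text. The HTML fragment is made up, `get_full_text` and `FullTextParser` are hypothetical names, and "hidden" is simplified to mean a `hidden` attribute or an inline `display: none` style.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class FullTextParser(HTMLParser):
    """Collect text nodes, optionally skipping hidden subtrees."""

    def __init__(self, ignore_hidden):
        super().__init__()
        self.ignore_hidden = ignore_hidden
        self.hidden_depth = 0  # greater than 0 while inside a hidden subtree
        self.parts = []

    @staticmethod
    def _is_hidden(attrs):
        attrs = dict(attrs)
        style = (attrs.get("style") or "").replace(" ", "").lower()
        return "hidden" in attrs or "display:none" in style

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth or (self.ignore_hidden and self._is_hidden(attrs)):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.hidden_depth:
            self.parts.append(text)


def get_full_text(html, ignore_hidden=False):
    parser = FullTextParser(ignore_hidden)
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Made-up fragment: the last entry is hidden with an inline style.
SAMPLE = """
<div id="weeks">
  <span>Week 1</span>
  <span>Week 2</span>
  <span style="display: none">Preseason</span>
</div>
"""

all_text = get_full_text(SAMPLE)
visible_text = get_full_text(SAMPLE, ignore_hidden=True)
print(all_text)
print(visible_text)
```

With the option off, the hidden entry is part of the result; with it on, only the visible entries remain. This is the difference to look for when comparing the activity's output with and without IgnoreHidden.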
Data scraping
Data scraping of a single page
Data scraping is used to retrieve tabular data from web pages.
Data scraping is in the ribbon.
・Click on [Data Scraping] in the ribbon section
・Click [Next]
・Hover over the tabular data on the page and click it
・Click [Yes]
・Click [Finish].
・If you don’t want to retrieve data for multiple pages, click [No].
・Verify that the data scraping workflow has been created.
Extract Structured Data setting items
Setting location | Category | Setting item | Setting contents |
Properties | Input | ExtractMetadata | An XML string that enables you to define what data to extract from the indicated web page. |
Properties | Input | Target.Selector | Text property used to find a particular UI element when the activity is executed. |
Properties | Input | Target.TimeoutMS | Specifies the amount of time (in milliseconds) to wait for the activity to run before the SelectorNotFoundException error is thrown. |
Properties | Input | Target.WaitForReady | Before performing the actions, wait for the target to become ready. |
Properties | Input | Target.Element | Use the UiElement variable returned by another activity. |
Properties | Input | Target.ClippingRegion | Defines the clipping rectangle, in pixels, relative to the UiElement, in the following directions: left, top, right, bottom. |
Properties | Options | DelayBetweenPagesMS | The amount of time, in milliseconds, to wait until the next page is loaded. |
Properties | Options | MaxNumberOfResults | The maximum number of results to be extracted. |
Properties | Options | NextLinkSelector | The selector that identifies the link/button used to navigate to the next page. |
Properties | Options | SendWindowMessages | If selected, in the case where the data that is to be extracted spans multiple pages, the click that changes the page is executed by sending a specific message to the target application. |
Properties | Options | SimulateClick | If selected, in the case where the data that is to be extracted spans multiple pages, it simulates the click that changes the page by using the technology of the target application. |
Properties | Output | DataTable | The information extracted from the indicated web page. |
Properties | Common | DisplayName | The display name of the activity. |
Properties | Common | ContinueOnError | Specifies if the automation should continue even when the activity throws an error. |
Properties | Misc | Private | If selected, the values of variables and arguments are no longer logged at Verbose level. |
Example: export the Yahoo Sports MLB standings to a CSV file.
・Target site (MLB Standings)
・Extract Structured Data Properties
・Write CSV Properties
・Setting variables
・CSV file output as a result of execution
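The single-page flow above (extract a table, then write it to CSV) can be pictured with a small Python sketch. This is an analogy for the Extract Structured Data and Write CSV pair, not UiPath code. The table contents are invented, and `extract_table` and `to_csv` are hypothetical helper names.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_table(html):
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def to_csv(rows):
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up standings table, not real data.
SAMPLE = """
<table>
  <tr><th>Team</th><th>W</th><th>L</th></tr>
  <tr><td>Team A</td><td>10</td><td>5</td></tr>
  <tr><td>Team B</td><td>8</td><td>7</td></tr>
</table>
"""

rows = extract_table(SAMPLE)
print(to_csv(rows))
```

The list of rows plays the role of the DataTable output, and the CSV text is what Write CSV would save to a file.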
Data scraping of multiple pages (with repeating links)
When every page shows the same next-page link, data scraping can follow that link to collect tabular data across multiple pages.
Data scraping is in the ribbon.
・Click on [Data Scraping] in the ribbon section.
・Click [Next]
・Hover over the tabular data on the page and click it
・Click [Yes]
・Set [Maximum number of results (0 for all)] to 0, then click [Finish]
・Click [Yes]
・Click the link that moves to the next page (“>” in this example); the same link is reused on every page
・Verify that the data scraping workflow has been created.
Example: get the Boston Celtics schedule from the NBA pages and output it to a CSV file.
・Target site (Boston Celtics Schedule)
・Attach Browser Properties
・Extract Structured Data Properties
・Write CSV Properties
・CSV file of execution results
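As a rough mental model of how a repeating next-page link and the maximum number of results work together, consider this Python sketch. The pages are a made-up in-memory dictionary instead of a real site, and `scrape_all_pages` is a hypothetical function, not a UiPath API. Treating 0 as "no limit" follows the "0 for all" wording in the wizard.

```python
def scrape_all_pages(pages, start, max_results=0):
    """Follow each page's next link and collect rows.

    pages: dict mapping a page id to {"rows": [...], "next": id or None}
    max_results: 0 means no limit, mirroring "0 for all" in the wizard.
    """
    collected = []
    visited = set()  # guards against a next link that loops back
    current = start
    while current is not None and current not in visited:
        visited.add(current)
        page = pages[current]
        for row in page["rows"]:
            collected.append(row)
            if max_results and len(collected) >= max_results:
                return collected
        current = page["next"]  # None once there is no next-page link
    return collected


# Made-up schedule spread over three pages.
PAGES = {
    "page1": {"rows": [["Oct 1", "Team B"], ["Oct 3", "Team C"]], "next": "page2"},
    "page2": {"rows": [["Oct 5", "Team D"], ["Oct 8", "Team E"]], "next": "page3"},
    "page3": {"rows": [["Oct 9", "Team F"]], "next": None},
}

all_rows = scrape_all_pages(PAGES, "page1")
first_three = scrape_all_pages(PAGES, "page1", max_results=3)
print(len(all_rows), len(first_three))
```

Leaving the limit at a non-zero value stops the extraction partway through, which is why the steps above set it to 0 when the whole schedule is needed.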
Data scraping of multiple pages (no repeating links)
To retrieve multi-page tabular data when there is no repeating link, combine data scraping with a Click (or Select Item) activity inside a loop (repeat) activity.
Example: output the NFL results for week 1 through week 3 to a CSV file.
・Target site (NFL Schedule)
・Attach Browser Properties
・Extract Structured Data Properties
・Select Item properties
・Write CSV Properties
・Variables
・CSV file output as a result of execution
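The loop pattern for pages without a repeating link can be sketched in Python as follows. This is only an illustration of the idea (select an item, scrape, repeat, then write once). The data is invented, `select_week` and `scrape_weeks` are hypothetical helpers, and adding a "Week" column is my own assumption for keeping the merged rows distinguishable.

```python
import csv
import io


def select_week(site, week):
    """Stand-in for Select Item: return the table shown after choosing a week."""
    return site[week]


def scrape_weeks(site, weeks):
    """Scrape one table per week and merge them under a single header row."""
    header = None
    merged = []
    for week in weeks:
        table = select_week(site, week)
        if header is None:
            header = ["Week"] + table[0]  # keep the header only once
        for row in table[1:]:
            merged.append([week] + row)
    return [header] + merged if header else []


def to_csv(rows):
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up results, one small table per week.
SITE = {
    "Week 1": [["Away", "Home", "Result"], ["Team A", "Team B", "21-14"]],
    "Week 2": [["Away", "Home", "Result"], ["Team C", "Team A", "10-17"]],
    "Week 3": [["Away", "Home", "Result"], ["Team B", "Team C", "28-24"]],
}

table = scrape_weeks(SITE, ["Week 1", "Week 2", "Week 3"])
print(to_csv(table))
```

Writing the CSV once after the loop avoids repeating the header row for every week.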
Summary
- To retrieve text data from a web page, use the “Get Text” activity.
- Use the “Get Full Text” activity to extract strings and their information from a web page.
- Data scraping is used to retrieve tabular data from web pages.
- To retrieve data that spans multiple pages, specify the next-page link in data scraping, or combine data scraping with a Click (or Select Item) activity inside a loop (repeat) activity.

Japanese IT engineer with a wide range of experience in system development, cloud building, and service planning. In this blog, I will share my know-how on UiPath and certification. profile detail / twitter:@fpen17