Execute JavaScript code and render dynamic content to static HTML with Headless Chrome in the cloud.
We route HTTP requests via a worldwide proxy network according to the specified target geolocation.
Server-side rendering (SSR) is a technique in which the whole document is generated on the server. Whenever a request comes in, the server generates the entire document and returns its content to the client. The browser on the client machine simply displays that document without any further rendering.
"Base Fetcher" is suitable for processing server-side rendered pages where the HTML in the HTTP response contains all content.
Crawling a URL with the "Base Fetcher" takes fewer resources and works faster than rendering HTML with the "Chrome Fetcher."
Requesting static HTML pages is always cheaper ...
But...
Client-side rendering means rendering content in the browser using JavaScript. Instead of getting all of the content from the HTML document itself, you get a bare-bones HTML document with a JavaScript file that renders the rest of the site in the browser. Usually, additional AJAX calls go from the client to the server to refresh the page content or to receive extra data.
JavaScript frameworks like Angular, React, and Vue.js are widely used for building modern web applications. They consist of HTML plus JavaScript code, and the initial HTML does not contain the actual content: it loads dynamically after the JavaScript code runs. So scraping such HTML pages "as is" is useless in most cases.
The "Chrome Fetcher" uses a headless Chrome browser to render dynamic content and return it as static HTML. It renders websites the same way a real browser would.
Of course, we don't intend only to render JavaScript-driven web pages, but to perform tasks with them.
Performing actions brings you closer to the desired data.
Actions are performed by the scraper upon visiting a web page, simulating real-world human interaction with the page.
You can use the DFK API to execute simple actions after rendering a web page (see the sketch after this list):
"Input" action |
Specify Input CSS Selector and Input Text to perform search queries, or fill forms. |
"Click" action |
Click on an element with the specified CSS Selector. |
"Wait" action |
Wait for the specific DOM elements you want to manipulate. |
"Scroll" action |
Automatically scroll a page down to load more content, simulating user interaction with infinite scrolled pages. |
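For illustration, a chain of such actions might be described as in the following Python sketch. The action names match the list above, but the exact field names and nesting are assumptions; consult the DFK API reference for the real schema.

```python
# Hypothetical "actions" payload, for illustration only. The real field
# names and structure are defined by the DFK API reference.
actions = [
    {"input":  {"selector": "#search-box", "text": "wireless headphones"}},  # fill the search form
    {"click":  {"selector": "button[type=submit]"}},                         # submit it
    {"wait":   {"selector": ".results-list"}},                               # wait for results to render
    {"scroll": {"times": 3}},                                                # trigger infinite scroll
]
```

Actions like these would typically run in order: fill an input, click, wait for the resulting DOM elements, then scroll.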
Let machines do the grunt work and let humans do what they do best.
The proxy scraper online service from Dataflow Kit helps you get around content download restrictions imposed by specific websites.
Choose one of 100+ supported global locations to send your HTML scraping API requests from.
Or select "country-any" to use random geo-targets.
Render JavaScript web pages right from your application.
Just send an API request specifying the desired web page and parameters.
Easily integrate the DFK API with your applications using your favorite framework or language.
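For example, a minimal render request from Python might look like the sketch below. The endpoint path and payload fields are assumptions for illustration; see the official API reference for the exact request format.

```python
import requests

API_KEY = "YOUR_API_KEY"  # your personal DFK API key

# Hypothetical request layout: "type" selects the fetcher
# ("base" for static HTML, "chrome" to render JavaScript).
payload = {
    "type": "chrome",
    "url": "https://example.com/page",
}
resp = requests.post(
    "https://api.dataflowkit.com/v1/fetch",  # assumed endpoint path
    params={"api_key": API_KEY},
    json=payload,
    timeout=120,
)
resp.raise_for_status()
html = resp.text  # the rendered, static HTML
```

The same request body can also carry the "actions" list and the geo-target parameter sketched earlier.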
Store anything from a few records to a few hundred million, with the same low latency and high reliability, in our S3-compatible storage.
Besides, you can easily upload your data to the following cloud storage services:
The obvious next step after scraping a web page is to extract specific data from the rendered HTML.
Depending on the website, this may be a single HTML element such as an image, a piece of text, or a link. Or, for example, e-commerce sites list several products on a page as blocks of data grouped by repeating patterns.
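Once you have the rendered HTML, extracting such repeating blocks can be done with any HTML parser. A sketch using BeautifulSoup (a third-party Python library, not part of DFK) with hypothetical selectors:

```python
from bs4 import BeautifulSoup

html = "<html>...</html>"  # rendered HTML returned by the fetcher

soup = BeautifulSoup(html, "html.parser")
products = []
for block in soup.select(".product-card"):  # one repeating block per product
    products.append({
        "title": block.select_one(".title").get_text(strip=True),
        "price": block.select_one(".price").get_text(strip=True),
        "link": block.select_one("a")["href"],
    })
```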
Another web scraping task is extracting your prospects' emails and phone contacts from web pages for lead generation.
To automate such tasks, we offer a visual point-and-click web scraper.