Render dynamic content from web pages to static HTML with Headless Chrome in the cloud.
Just specify a target geolocation, and your requests are routed through a worldwide proxy network.
"Base Fetcher" is suitable for processing server-side rendered pages where the HTML in the HTTP response contains all content.
Crawling a URL with "Base Fetcher" uses fewer resources and is faster than rendering HTML with "Chrome Fetcher."
Requesting static HTML pages is always cheaper ...
applications. They consist of HTML + JS code. HTML initially does not contain all the actual
So scraping such pages 'as is' is useless in most cases.
"Chrome Fetcher" uses the headless Chrome browser to render dynamic content and return it as static HTML. It renders websites the same way a real browser would.
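One way to decide between the two fetchers is to check whether the raw HTTP response already contains the text you expect. The sketch below is only an illustration and is not part of the DFK API: the function name `choose_fetcher` and the labels `"base"` and `"chrome"` are hypothetical names chosen for this example.

```python
def choose_fetcher(raw_html: str, expected_text: str) -> str:
    """Return "base" if the raw HTML already contains the expected text,
    otherwise "chrome" (the page probably needs JavaScript rendering).

    The labels "base" and "chrome" are illustrative, not confirmed API values.
    """
    if expected_text.lower() in raw_html.lower():
        return "base"
    return "chrome"


# Server-side rendered page: the content is in the initial HTML.
ssr_page = "<html><body><h1>Price: $10</h1></body></html>"
# JavaScript-driven page: the initial HTML is an empty shell.
spa_page = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'

print(choose_fetcher(ssr_page, "Price"))
print(choose_fetcher(spa_page, "Price"))
```

A check like this lets you try the cheaper static request first and fall back to browser rendering only when the expected content is missing.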
Performing actions gets you closer to the desired data.
The scraper performs actions upon visiting a web page, simulating real-world human interaction with the page.
You can use the DFK API to execute simple actions after rendering a web page:
Specify an Input CSS Selector and Input Text to perform search queries or fill in forms.
Click on an element with the specified CSS Selector.
Wait for the specific DOM elements you want to manipulate.
Automatically scroll a page down to load more content, simulating user interaction with infinite-scroll pages.
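The four actions above could be combined into one sequence, for example to run a search and wait for the results. The following is a minimal sketch under stated assumptions: the function `build_actions` and every key in the dictionaries (`input`, `click`, `waitFor`, `scroll`, `selector`, `text`, `toBottom`) are hypothetical and only illustrate the shape such a sequence might take; the actual field names should come from the DFK API reference.

```python
import json


def build_actions(search_selector, search_text, button_selector, result_selector):
    """Assemble a search-then-wait action sequence as plain dictionaries.

    All keys used here are hypothetical placeholders, not confirmed
    DFK API field names.
    """
    return [
        # Type the query into the search box.
        {"input": {"selector": search_selector, "text": search_text}},
        # Submit the form by clicking the button.
        {"click": {"selector": button_selector}},
        # Wait until the results container appears in the DOM.
        {"waitFor": {"selector": result_selector}},
        # Scroll down to trigger loading of further results.
        {"scroll": {"toBottom": True}},
    ]


actions = build_actions("input#q", "headless chrome", "button[type=submit]", "div.results")
print(json.dumps(actions, indent=2))
```

Keeping the sequence as ordered data makes it easy to log, reuse, and serialize alongside the rest of a request.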
Let machines do the grunt work and let humans do what they do best.
The Dataflow Kit proxy pool helps you get around content download restrictions on specific websites.
Choose one of 100+ supported global locations to send your web scraping API requests from.
Or select "country-any" to use random geo-targets.
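A small helper can keep the geo-target value consistent across requests. This is a sketch under an assumption: only the value "country-any" appears in the text above, so the `country-<code>` pattern for a specific location and the helper name `geo_target` are hypothetical.

```python
import re


def geo_target(country_code=None):
    """Return a proxy geo-target string.

    "country-any" comes from the text above; the "country-<code>" pattern
    for a specific location is an assumption made for this example.
    """
    if country_code is None:
        return "country-any"
    code = country_code.strip().lower()
    if not re.fullmatch(r"[a-z]{2}", code):
        raise ValueError(f"expected a two-letter country code, got {country_code!r}")
    return f"country-{code}"


print(geo_target())
print(geo_target("US"))
```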
Just send an API request specifying the desired web page and parameters.
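As a rough illustration of such a request, the sketch below builds a JSON POST with Python's standard library. Everything specific in it is an assumption: the endpoint `https://api.example.com/fetch` is a placeholder, and the body fields `type`, `url`, and `proxy` as well as the bearer-token header are hypothetical, so check the DFK API documentation for the real endpoint, parameters, and authentication scheme.

```python
import json
import urllib.request


def build_fetch_request(api_url, api_key, page_url, fetcher_type="base", proxy="country-any"):
    """Build (but do not send) a JSON POST request for a page fetch.

    The body fields and the Authorization header are hypothetical
    placeholders, not the documented DFK API contract.
    """
    body = json.dumps({"type": fetcher_type, "url": page_url, "proxy": proxy}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_fetch_request(
    "https://api.example.com/fetch",  # placeholder endpoint
    "YOUR_API_KEY",
    "https://example.com/",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=30).read()
```

Building the request separately from sending it keeps the example safe to run offline and makes the payload easy to inspect.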
Easily integrate the DFK API with your applications using your favorite framework or language.
Dataflow Kit storage is designed according to the best industry practices. We use S3-compatible storage, giving you high availability and scalability.
Store from a few records to a few hundred million, with the same low latency and high reliability.
You can also easily upload your data to the following cloud storage services: