Turn websites into useful data

We help people extract information from websites with a simple point-and-click toolkit.

Dataflow Kit is an open source web scraping framework written in Go.


What types of data can Dataflow Kit extract?

E-commerce merchants collect product information, including prices, reviews, and ratings, from competitors’ retail sites for further analysis. This helps retailers attract new customers, increase sales, and go head-to-head against the competition.

Job board owners regularly aggregate, monitor, and refine large volumes of job postings, company profiles, and employee profiles to connect employers with job seekers. Most job boards publish this refined data on their own sites and keep links to the original websites.

Travel and hospitality companies gather hotel reviews, prices, and customer sentiment from multiple travel portals, then build business intelligence from this data.

Real estate agents scrape listing details such as property address and price. This helps agencies keep track of real estate listings or build a database of available properties for sellers and agents.

Data Extraction and Delivery Process

Open a web page

Behind the scenes, a headless Chrome browser renders JavaScript-driven web pages properly.

Click to select data

  • Optionally apply the trim, UPPER, lower, or Capitalize filters, or build a regular expression.
  • Choose a paginator type: "Next" link, "Infinite scroll", or "Load more" button.
  • Follow links to process detail pages.

Download results

  • Launch the crawler to follow links and extract content from the specified pages.
  • Select one of the available formats: CSV, Excel, JSON / JSON Lines, or XML.
  • Download the parsed data.

Facing unique challenges with web scraping?

We will work with you to build custom solutions to fit your needs. From web scraping to data conversion, our experts are here to help.

Open source

Dataflow Kit is open source, and we welcome all contributors who are interested in collaborating.

You can help with issues, feature development, releases, scripting, tests, benchmarking, documentation, or sample updates, or by sharing information about Dataflow Kit.

Please star the Dataflow Kit (DFK) GitHub repository.