Dataflow Kit Proxy Servers / Proxy IPs

Getting past rate limits.

Many large sites detect when multiple requests arrive from one IP address in a short amount of time. This usually indicates some sort of automated access, so the site blocks further requests from that client for a preset period of time.

To get around this type of restriction when crawling or processing data from certain sites, you need to spread your requests evenly across a number of proxy servers so that they arrive from different IP addresses.
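As a rough illustration of spreading requests evenly, the sketch below assigns URLs to proxies in round-robin order. The proxy addresses, the target URLs, and the `assign_proxies` helper are hypothetical placeholders, not part of Dataflow Kit; the code only plans which proxy handles which URL and sends no requests.

```python
from collections import Counter
from itertools import cycle

# Hypothetical proxy endpoints (assumption); replace with proxies you control.
PROXIES = [
    "http://proxy-a.example.com:8080",
    "http://proxy-b.example.com:8080",
    "http://proxy-c.example.com:8080",
]


def assign_proxies(urls, proxies):
    """Pair each URL with a proxy, cycling through the pool in order."""
    pool = cycle(proxies)
    return [(url, next(pool)) for url in urls]


# Hypothetical target pages (assumption).
urls = [f"https://example.com/page/{i}" for i in range(6)]
plan = assign_proxies(urls, PROXIES)

# Count how many URLs each proxy was given to confirm the spread is even.
usage = Counter(proxy for _, proxy in plan)

for url, proxy in plan:
    print(url, "via", proxy)
```

Round-robin is only one possible policy. Picking a proxy at random or tracking per-proxy cooldowns would also work, depending on how strict the target site's limits are.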

Getting around geo-IP based content restriction

If someone in Europe wants to extract data from a US-based website that is not accessible from their home country, they can request the US web pages through a proxy server located in the USA (one that has a US IP address). Because their traffic then appears to come from that US IP address, it gets past the restriction.

So proxy servers are often used to get around geo-IP based content restrictions.
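One simple way to apply this idea is to keep a lookup table of proxies by country and pick the one that matches the target site's region. The sketch below assumes a hypothetical `PROXIES_BY_COUNTRY` table and `proxy_for_country` helper with placeholder addresses; it is not Dataflow Kit's API.

```python
# Hypothetical mapping from ISO country code to a proxy located in that
# country (assumption); the addresses are placeholders.
PROXIES_BY_COUNTRY = {
    "US": "http://us-proxy.example.com:8080",
    "DE": "http://de-proxy.example.com:8080",
}


def proxy_for_country(country_code, proxies_by_country=PROXIES_BY_COUNTRY):
    """Return the proxy URL registered for a country, or raise LookupError."""
    code = country_code.upper()
    try:
        return proxies_by_country[code]
    except KeyError:
        raise LookupError(f"no proxy configured for country {code!r}") from None


# A client in Europe would route requests for US-only content through this proxy.
us_proxy = proxy_for_country("us")
print(us_proxy)
```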

What is a Proxy Server?

When you make an HTTP request to a site using a proxy server, instead of travelling directly to that site, your request first passes through the proxy server, and then on to your target site.

Thus, the proxy server is making the request on your behalf ("by proxy") and then passing the response from the target site back to you.

Dataflow Kit forwards web page fetch requests to proxy servers, and the proxies send back responses containing the downloaded web page content.

[Figure: passing a request to the target via a proxy server]

From the target site's perspective, there is no indication that the request is being proxied. It simply sees a normal web request coming in from the proxy server’s IP address.
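The request path described above can be sketched on the client side with Python's standard library. This is a minimal example, assuming a hypothetical proxy at `proxy.example.com:8080`; the `build_proxy_opener` and `fetch_via_proxy` helpers are illustrative names, and the code does not show how Dataflow Kit itself talks to its proxies.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, proxy_url, timeout=10):
    """Download url through the proxy and return the response body as bytes."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


# Hypothetical proxy address (assumption). Building the opener sends nothing;
# a request is made only when fetch_via_proxy is called.
PROXY_URL = "http://proxy.example.com:8080"
opener = build_proxy_opener(PROXY_URL)
```

With a real proxy address, a call such as `fetch_via_proxy("https://example.com/", PROXY_URL)` would return the page body, and the target site would see the proxy's IP address as the source of the request.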

Dataflow Kit proxies.

To get around content download restrictions on certain websites, Dataflow Kit offers proxy IPs.

Our default shared datacenter proxy servers are suitable for most sites and most request volumes. Using these proxies incurs an additional page request for each page processed.

If you would like to utilize private proxies for specific sites or individual crawls, please contact Dataflow Kit Support.

Additional notes:

Some popular sites always require proxies. These domains have proxies enabled globally and will incur additional page requests for all users.

Details on your account’s proxy usage will be available via our Account API, in your Dashboard, and in your monthly invoices.