How to set up a proxy for Crawlee
The Proxy Port SDK contains a proxy rotation package for Crawlee. With this package you don't need to worry about proxy rotation: it is handled automatically. If you want to set proxies manually, keep reading.
All your proxy needs are managed by the ProxyConfiguration class. You need to instantiate the ProxyConfiguration class and pass it to the crawler constructor.
You can set up a proxy from a predefined list:
import { CheerioCrawler, ProxyConfiguration } from 'crawlee';

const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: [
        // replace these with the URLs of your proxy servers
        'http://yourproxyserver-1.com',
        'http://yourproxyserver-2.com',
    ],
});

const crawler = new CheerioCrawler({
    proxyConfiguration,
    // ...
});
Or with a function that provides the proxy URL dynamically:
import { CheerioCrawler, ProxyConfiguration } from 'crawlee';

async function newUrlFunction(sessionId: string | number): Promise<string> {
    // getProxyFunction is a placeholder and must be defined by you
    return getProxyFunction();
}

const proxyConfiguration = new ProxyConfiguration({
    newUrlFunction,
});

const crawler = new CheerioCrawler({
    proxyConfiguration,
    // ...
});
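The getProxyFunction call above is left for you to define. As a minimal sketch of one way to do it, the following picks proxies round-robin from a fixed pool and keeps the same proxy for a given session. The pool contents, the sessionProxies map, and the optional sessionId parameter on getProxyFunction are assumptions made for illustration; they are not part of Crawlee or the Proxy Port SDK.

```typescript
// Hypothetical pool of proxy URLs; replace with your own proxy servers.
const proxyPool: string[] = [
    'http://yourproxyserver-1.com',
    'http://yourproxyserver-2.com',
];

// Remembers which proxy each session received, so a session keeps its proxy.
const sessionProxies = new Map<string, string>();
let nextIndex = 0;

// One possible getProxyFunction: round-robin over the pool, sticky per session.
function getProxyFunction(sessionId?: string | number): string {
    if (sessionId !== undefined) {
        const known = sessionProxies.get(String(sessionId));
        if (known !== undefined) return known;
    }
    const proxyUrl = proxyPool[nextIndex % proxyPool.length];
    nextIndex += 1;
    if (sessionId !== undefined) sessionProxies.set(String(sessionId), proxyUrl);
    return proxyUrl;
}

// Same signature as the newUrlFunction passed to ProxyConfiguration.
async function newUrlFunction(sessionId: string | number): Promise<string> {
    return getProxyFunction(sessionId);
}
```

Keeping one proxy per session means requests that share a session also share an IP address, which some sites expect. Note that the map in this sketch grows with every new session, so a long-running crawler would likely want to evict old entries.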
See also:
crawlee-proxyport - Proxy provider for Crawlee