Setting up a crawler is not that difficult. Our best practices allow us, for example, to set up a crawler for a few thousand corporate websites in just hours.
As input we need either company names or URLs.
- If we get the URLs, we know where to crawl, and the challenge is to find out which company each website belongs to.
- If we get a company name, we know whom we are crawling, and the challenge is to find the right website for that specific company.
For both challenges we have solutions such as reference crawling, which uses data crawled from other sources that are likely to link to these companies.
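To give a rough idea of how reference crawling can help find a company's website, here is a minimal sketch. It is not our production method. It assumes that links have already been collected from other sources that mention the company, and it simply counts how often each host appears whose domain name contains a word from the company name. The function names, the list of legal suffixes, and the naive domain handling are all illustrative assumptions.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical list of legal-form words to ignore in company names.
LEGAL_SUFFIXES = {"gmbh", "ag", "inc", "ltd", "llc", "corp", "co", "sa"}


def name_tokens(company_name):
    """Lowercase the company name and drop legal-form words."""
    tokens = re.findall(r"[a-z0-9]+", company_name.lower())
    return [t for t in tokens if t not in LEGAL_SUFFIXES]


def domain_label(host):
    """Return the label left of the top-level domain.

    Naive assumption: this does not handle suffixes such as "co.uk".
    """
    parts = host.split(".")
    return parts[-2] if len(parts) >= 2 else ""


def best_website(company_name, candidate_links):
    """Pick the host that reference pages link to most often and whose
    domain label contains a word from the company name."""
    tokens = name_tokens(company_name)
    votes = Counter()
    for link in candidate_links:
        host = (urlparse(link).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        label = domain_label(host)
        if label and any(tok in label for tok in tokens):
            votes[host] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Example links as they might be found on pages that mention the company.
links = [
    "https://www.acme.com/about",
    "http://acme.com",
    "https://twitter.com/acme",
    "https://acme-fans.org",
]
print(best_website("Acme GmbH", links))
```

In practice a simple vote like this is only a starting point. Very short name words can match unrelated domains, so a real setup would need additional checks.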
use our browser farm. If the target is in the Dark Web, we use our TOR Crawlers that know how to deal with Onion URLs.
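As a small illustration of how Onion URLs can be told apart from regular ones, the sketch below checks whether a URL's host ends in ".onion". The function name and the two route labels are hypothetical and only show the idea. They do not describe how our crawlers are actually selected.

```python
from urllib.parse import urlparse


def pick_crawler(url):
    """Return "tor" for Onion URLs and "standard" for everything else.

    The route names are illustrative assumptions.
    """
    host = (urlparse(url).hostname or "").lower()
    return "tor" if host.endswith(".onion") else "standard"


print(pick_crawler("http://exampleonionaddress.onion/page"))
print(pick_crawler("https://example.com"))
```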
You can always contact us, and we'll explain in more detail how that works. It's a great piece of technology.
(We do not reveal everything we do here, because this is special knowledge we acquired over time.)