Programmatically starting scraper runs


Is it possible to programmatically initiate a run via an API? Our service also uses GitHub OAuth, if that makes implementation easier.

Can I schedule my scraper to run twice a day?

Hey @jeffreyliu, this is a cool idea. Would you mind writing a little bit about what you’re hoping to achieve with this, and about your service?


Hey! Apologies for the delay, I just got around to this. We’re interested in being able to start runs in response to detected changes to websites, for example when a new dataset is published or when previously published information is updated. We’d also like to be able to run scrapes on a schedule, for example for streams or regularly updated data sources (weather, traffic, and so on). The scheduling/dispatch logic would live separately from the scraping, so it’d be nice just to have an exposed way to tell a specific scraper to run. Thanks!
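To make the "dispatch lives separately from scraping" idea concrete, here is a minimal Python sketch of the change-detection side. Everything in it is hypothetical: `ChangeDispatcher`, `fingerprint`, and the `trigger` callback are illustrative names, and the callback stands in for whatever run-starting mechanism the service might eventually expose. It assumes that comparing a SHA-256 hash of the fetched page body is a good enough signal that something changed.

```python
import hashlib
from typing import Callable, Dict


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest of a fetched page body."""
    return hashlib.sha256(content).hexdigest()


class ChangeDispatcher:
    """Hypothetical dispatcher: calls `trigger(scraper_name)` when content changes.

    `trigger` is an assumed callback; it is where a real "start a run"
    call would go if the service exposed one.
    """

    def __init__(self, trigger: Callable[[str], None]) -> None:
        self._trigger = trigger
        self._seen: Dict[str, str] = {}  # scraper name -> last digest

    def observe(self, scraper_name: str, content: bytes) -> bool:
        """Record the latest content; trigger a run if it differs from last time.

        The first observation of a scraper counts as a change.
        """
        digest = fingerprint(content)
        changed = self._seen.get(scraper_name) != digest
        self._seen[scraper_name] = digest
        if changed:
            self._trigger(scraper_name)
        return changed
```

The same `trigger` callback could just as well be called from a cron job or timer for the scheduled case, so the dispatcher never needs to know how a run is actually started.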


@jeffreyliu I’ve always imagined something like this as part of the API. It looks like we don’t already have an issue for it though. Can you please write one up?


@henare, can we do something like this to trigger a run of the scraper that belongs to ‘user-name’?

but not sure where to start…
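As one possible starting point, here is a rough Python sketch of what a client-side trigger call could look like if such an endpoint existed. Per the thread, no such API exists yet, so the URL layout (`/<owner>/<scraper>/runs`), the bearer-token header, the JSON body, and the host `scraper-host.example` are all assumptions made up for illustration. The function only builds the request; it does not send anything.

```python
import json
import urllib.request
from urllib.parse import quote


def build_run_request(
    base_url: str, owner: str, scraper: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a scraper run.

    The endpoint path, auth scheme, and body are hypothetical; adjust them
    to whatever the real API turns out to be.
    """
    url = "{}/{}/{}/runs".format(
        base_url.rstrip("/"), quote(owner, safe=""), quote(scraper, safe="")
    )
    body = json.dumps({"reason": "external-trigger"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer {}".format(api_key),  # assumed auth scheme
        },
    )


# Example: build a request for the scraper owned by 'user-name'.
request = build_run_request(
    "https://scraper-host.example", "user-name", "weather", "API_KEY"
)
```

Sending it would then be a matter of passing `request` to `urllib.request.urlopen`, once there is a real endpoint to point it at.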