Stop and resume a crawl

Because it’s based on Scrapy, Crowl lets you stop a crawl and resume it later.

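To stop a running crawl, press Ctrl+C once (this works on most UNIX systems).
Be sure to let the crawl finish shutting down safely: if it’s cut short, for example by a second Ctrl+C, which forces Scrapy to quit immediately, you won’t be able to resume.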
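
If the crawl is started from a script rather than a terminal, the same safe stop can be approximated by sending a single SIGINT and waiting for the process to exit. The sketch below is only an illustration for UNIX systems: `stop_crawl_safely` is a hypothetical helper, not part of Crowl, and it assumes the crawl was launched with `subprocess.Popen`.

```python
import signal
import subprocess


def stop_crawl_safely(proc, timeout=None):
    """Hypothetical helper: ask a crawl process to stop and wait for it.

    Sends SIGINT exactly once (the equivalent of a single Ctrl+C) and then
    blocks until the process has exited, so the shutdown is not cut short.
    Raises subprocess.TimeoutExpired if `timeout` seconds pass first.
    """
    proc.send_signal(signal.SIGINT)
    return proc.wait(timeout=timeout)


# Example usage (the command is the one shown in this page):
# crawl = subprocess.Popen(["python", "crowl.py", "--conf", "project.ini"])
# ... later ...
# stop_crawl_safely(crawl)
```

Leaving `timeout` unset waits for as long as the shutdown takes, which is the cautious choice when the crawl needs to be resumed afterwards.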

To resume a crawl, pass the output basename (project name + timestamp), which is logged at the end of the crawl, to the --resume command-line argument.
Here’s an example:

# Launch a crawl
python crowl.py --conf project.ini
# Stop it using ctrl+c

# Resume crawl
python crowl.py --conf project.ini --resume project_20200118-010101
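
When resuming from a script, it can help to check the basename before building the command. The following is a minimal sketch, not part of Crowl: `parse_basename` and `resume_command` are hypothetical helpers, and the timestamp format `%Y%m%d-%H%M%S` is an assumption inferred from the example above. The basename itself should always be copied from the crawl log rather than reconstructed.

```python
from datetime import datetime

# Assumed from the example basename "project_20200118-010101".
TIMESTAMP_FORMAT = "%Y%m%d-%H%M%S"


def parse_basename(basename):
    """Split an output basename into (project name, start datetime).

    Raises ValueError if the string does not look like
    "<project>_<timestamp>" in the assumed format.
    """
    project, _, stamp = basename.rpartition("_")
    if not project:
        raise ValueError(f"not a valid output basename: {basename!r}")
    return project, datetime.strptime(stamp, TIMESTAMP_FORMAT)


def resume_command(conf_path, basename):
    """Build the argument list for resuming a crawl."""
    parse_basename(basename)  # fail early on a mistyped basename
    return ["python", "crowl.py", "--conf", conf_path, "--resume", basename]


print(" ".join(resume_command("project.ini", "project_20200118-010101")))
```

The returned list can be passed directly to `subprocess.run`; building it as a list avoids shell quoting issues if the project name contains unusual characters.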
