As it’s based on Scrapy, Crowl offers the option to stop and resume a crawl.
To stop a running crawl, press Ctrl+C on most UNIX systems.
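Let the crawl shut down cleanly; otherwise, you won't be able to resume it. Scrapy treats a second interrupt as a forced shutdown, so send the signal only once and wait for the crawl to finish stopping.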
To resume a crawl, pass the output basename (project name + timestamp), which is logged at the end of the crawl, to the --resume command line argument.
Here’s an example:
    # Launch a crawl
    python crowl.py --conf project.ini
    # Stop it using Ctrl+C
    # Resume the crawl
    python crowl.py --conf project.ini --resume project_20200118-010101
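If you resume crawls from a script, it can help to check the basename before building the command. The following is a minimal sketch, not part of Crowl: the helper names `split_basename` and `resume_command` are hypothetical, and the `<project>_<YYYYMMDD-HHMMSS>` pattern is an assumption inferred from the example basename above.

```python
from datetime import datetime
from typing import List, Tuple


def split_basename(basename: str) -> Tuple[str, datetime]:
    """Split an output basename into (project name, crawl start time).

    Assumes the '<project>_<YYYYMMDD-HHMMSS>' pattern seen in the example;
    this is inferred, not a documented Crowl format.
    """
    # Split on the last underscore so project names may contain underscores.
    project, _, stamp = basename.rpartition("_")
    if not project:
        raise ValueError(f"no project name found in {basename!r}")
    # strptime raises ValueError if the timestamp part is malformed.
    return project, datetime.strptime(stamp, "%Y%m%d-%H%M%S")


def resume_command(conf_path: str, basename: str) -> List[str]:
    """Build the argument list for resuming a crawl (hypothetical helper)."""
    split_basename(basename)  # validate the basename before using it
    return ["python", "crowl.py", "--conf", conf_path, "--resume", basename]


if __name__ == "__main__":
    # Print the command rather than running it.
    print(" ".join(resume_command("project.ini", "project_20200118-010101")))
```

The returned list can be passed to `subprocess.run` as is. Validating first means a mistyped basename is rejected before the crawler is started.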