Installation Guide

Requirements

Python

Crowl requires Python 3. It works best on UNIX-like systems (Linux and macOS), but also runs on Windows.

To check which version of Python your system is running, open a terminal and execute the following:

python --version  

You should get something like Python 3.*. If that’s the case, you can now install Crowl.

If the output isn’t something like Python 3.*, try this:

python3 --version  

If this didn’t work either, please download and install the latest Python 3 version.
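
The same check can also be done from inside Python, which helps when you aren’t sure which interpreter a command actually resolves to. This is only a minimal sketch: the helper name is_supported is hypothetical and not part of Crowl.

```python
import sys


def is_supported(version_info=sys.version_info):
    """Return True when the given version tuple is Python 3 or later."""
    return tuple(version_info[:2]) >= (3, 0)


if __name__ == "__main__":
    # %-formatting is used on purpose so the script also runs under Python 2
    # and can report the problem instead of failing with a syntax error.
    print(sys.executable)
    print("Python %d.%d.%d" % tuple(sys.version_info[:3]))
    print("OK for Crowl" if is_supported() else "Python 3 is required")
```

Save it to a file and run it with both python and python3 to see which command points to a Python 3 interpreter.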

If python3 is installed but isn’t the default python interpreter, you have several options:

We recommend using virtual environments to keep each project’s dependencies separate and avoid conflicts.

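For instance, you can use pyenv.
Once pyenv and its pyenv-virtualenv plugin are installed, you’ll be able to quickly create environments (the Python version used below must already be installed through pyenv):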

mkdir crowltech  
cd crowltech    
pyenv virtualenv 3.6.4 crowltech  
pyenv local crowltech    
python --version  
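
If you would rather not install pyenv, Python 3 also ships with the venv module, which creates isolated environments too. It is a different tool from pyenv (it does not install Python versions), so treat this as an alternative sketch rather than the recommended setup. The function name create_env and the crowl-env directory name are assumptions made for the example.

```python
import os
import venv


def create_env(path, with_pip=True):
    """Create a virtual environment at `path` and return the path of its config file."""
    # venv.create builds the directory layout and, if requested, installs pip in it.
    venv.create(path, with_pip=with_pip)
    return os.path.join(path, "pyvenv.cfg")
```

Calling create_env("crowl-env") has the same effect as running python3 -m venv crowl-env. On UNIX-like systems, the environment is then activated with source crowl-env/bin/activate.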

Using an alias to set python3 as the default interpreter

You can replace Python 2 as the default Python interpreter on your system by using aliases.
On UNIX-like systems (Linux & macOS), edit your ~/.bash_profile file and add the following:

alias python=python3  
alias pip=pip3  

Save the changes, then run:

source ~/.bash_profile  

Using python3

We don’t recommend this, but if you don’t want to (or can’t) change your default Python interpreter, you can simply replace the python and pip commands with python3 and pip3 respectively.

A few more tips

You might find that Python can be very useful on a daily basis. Learn a few tips in this post.

Install Crowl

Download the source code

We recommend using git, as it makes upgrading much easier.
Simply clone the repository:

git clone https://gitlab.com/crowltech/crowl.git  
cd crowl  

Not using git

If you’re not comfortable using git, you can download a zip archive or a tar.gz archive directly.

From the command line:

wget https://gitlab.com/crowltech/crowl/-/archive/master/crowl-master.tar.gz  
tar -xzvf crowl-master.tar.gz  
mv crowl-master crowl
cd crowl  

Install dependencies

Once inside the crowl directory, install the dependencies using pip:

pip install -r requirements.txt  

This downloads and installs all Python dependencies.
You are now ready to start crawling.
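
If pip and python point to different Python installations (which can happen when Python 2 is still the default), the dependencies may end up installed for the wrong interpreter. One way to avoid this is to call pip through the interpreter itself. The following is a minimal sketch, assuming it is run from the crowl directory; pip_install_command and install_requirements are hypothetical helper names.

```python
import subprocess
import sys


def pip_install_command(requirements="requirements.txt"):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def install_requirements(requirements="requirements.txt"):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.run(pip_install_command(requirements), check=True)


if __name__ == "__main__":
    # Only print the command here; call install_requirements() to actually run it.
    print(" ".join(pip_install_command()))
```

This is equivalent to typing python -m pip install -r requirements.txt, with python being whichever interpreter runs the script.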

Start crawling

Simply launch Crowl from the command line:

python crowl.py -u https://www.crowl.tech/ -b crowl  

Here are the most useful options:

  • -u/--url: start URL (required). The starting point of your crawl.
  • -b/--database: project basename (required). This will be used to name the output file or database.
  • -l/--links: add this argument to store links.
  • -c/--content: add this argument to store webpage content.
  • -d/--depth: set the maximum crawl depth (default: 5).
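
To make the options above more concrete, here is how such a command line could be parsed with Python’s standard argparse module. This is an illustrative sketch built only from the list above, not Crowl’s actual source code; the build_parser name and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options (not Crowl's real code)."""
    parser = argparse.ArgumentParser(prog="crowl.py")
    parser.add_argument("-u", "--url", required=True, help="start URL")
    parser.add_argument("-b", "--database", required=True, help="project basename")
    # Flags without a value: present means True, absent means False.
    parser.add_argument("-l", "--links", action="store_true", help="store links")
    parser.add_argument("-c", "--content", action="store_true", help="store webpage content")
    parser.add_argument("-d", "--depth", type=int, default=5, help="maximum crawl depth")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-u", "https://www.crowl.tech/", "-b", "crowl", "-l", "-d", "3"]
    )
    print(args.url, args.database, args.links, args.content, args.depth)
```

In this sketch, -l and -c are simple switches, while -d takes an integer and falls back to 5 when omitted, as in the list above.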

Upgrade Crowl

If you installed Crowl using git, simply download the latest version:

git pull origin master  

If you didn’t use git, save your configuration files,
