Installing pip-based tools for bioinformatics

Loads of bioinformatics and data science tools are available through the `pip` package installer. In days gone by, it was possible to type something like `pip install mytool --user` to get a tool installed, but recent Ubuntu releases mark the system Python as externally managed, so this no longer works. When you try a command like that on Ubuntu 24, you will get an "externally-managed-environment" error and a suggestion to use a virtual environment (or "venv"). While there is a lot of in-depth documentation on using venv, I couldn't find a quick start guide specifically for bioinformaticians. In short, a virtual environment is an isolated environment with its own Python interpreter and its own set of installed packages, which greatly reduces the chance of problems from version mismatches. So here is a quick guide to installing the MultiQC tool, which is used extensively in genome sequence analysis.

Install multiqc via pip

The first step is to create a folder to hold your venvs (you will likely end up with several if you do a lot of analysis).

mkdir ~/venv

Next, create the multiqc venv.

python3 -m venv ~/venv/multiqc

Then activate this venv.

source ~/venv/multiqc/bin/activate

You will see that the command line prompt has changed to include something like (multiqc). Now we are clear to use pip3 to install the desired tool.

pip3 install multiqc

The tool should now be working. Run `multiqc .` in a folder containing analysis output to check. When you want to return to the base environment, type `deactivate`.
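It is easy to lose track of which environment a terminal is in, so a small check can help. The following is a minimal sketch rather than part of the recipe itself: `venv_status` is a hypothetical helper name, and it relies on the `VIRTUAL_ENV` variable that the venv `activate` script sets and `deactivate` removes.

```shell
# Hypothetical helper: report which venv (if any) is active in this shell.
# Relies on VIRTUAL_ENV, which the venv "activate" script exports.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active venv: $(basename "$VIRTUAL_ENV")"
    else
        echo "no venv active"
    fi
}

venv_status
```

After activating ~/venv/multiqc this should report the venv name multiqc, and after `deactivate` it should report that no venv is active.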

Next time you need to fire up the tool, use the following commands.

source ~/venv/multiqc/bin/activate

multiqc .
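If you only need to run a single command, activation is optional, because the executables that pip installs live in the venv's bin folder and can be called by their full path. Here is a minimal sketch, assuming the ~/venv/<tool>/bin/<tool> layout used in this post; `venv_run` and `VENV_ROOT` are hypothetical names introduced purely for illustration.

```shell
# Hypothetical helper: run a tool from its own venv without activating it.
# Assumes the layout used in this post: ~/venv/<tool>/bin/<tool>.
# VENV_ROOT is an optional override for the folder that holds the venvs.
venv_run() {
    local tool="${1:-}"
    if [ -z "$tool" ]; then
        echo "usage: venv_run <tool> [args...]" >&2
        return 1
    fi
    shift
    local exe="${VENV_ROOT:-$HOME/venv}/$tool/bin/$tool"
    if [ ! -x "$exe" ]; then
        echo "venv_run: $exe not found; create the venv first" >&2
        return 1
    fi
    # Call the venv's own copy of the tool, passing the remaining arguments.
    "$exe" "$@"
}

# Example: venv_run multiqc .
```

This could go in your ~/.bashrc if you switch between several tools in one session, although plain activation remains the simplest route.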


MultiQC gets updated frequently, so it is a good idea to check which version you have installed (`multiqc --version`). If you want to move to a newer version, deactivate the venv, delete the ~/venv/multiqc folder and reinstall using these instructions.
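The delete-and-reinstall routine can be wrapped up in a function so it is a single command next time. This is only a sketch under a few assumptions: `rebuild_venv` and `VENV_ROOT` are hypothetical names, and the tool name, venv folder name and pip package name are assumed to be identical, as they are for multiqc here.

```shell
# Hypothetical helper: delete and recreate the venv for one pip-installable tool.
# Assumes the tool name, venv folder name and pip package name are the same.
# VENV_ROOT is an optional override for the folder that holds the venvs.
rebuild_venv() {
    local tool="${1:-}"
    if [ -z "$tool" ]; then
        # Refuse to run without a name, so rm never touches the whole venv folder.
        echo "usage: rebuild_venv <tool>" >&2
        return 1
    fi
    local dir="${VENV_ROOT:-$HOME/venv}/$tool"
    rm -rf "$dir"                        # remove the old environment
    python3 -m venv "$dir" || return 1   # create a fresh one
    "$dir/bin/pip3" install "$tool"      # install the current release with the venv's pip
}

# Example: rebuild_venv multiqc
```

Run `deactivate` first if the venv is active in the current shell. A lighter alternative is to activate the venv and run pip3 install --upgrade multiqc, although a full rebuild also gives you a clean set of dependencies.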
