Managing virtual environments in Python

Writing software in Python is quick and easy: new projects don't require special setup, and missing packages can be installed with pip. But when maintaining multiple projects on the same machine, you quickly run into problems: some projects might require conflicting versions of the same library, or different Python versions. Virtual environments solve these problems.

What is a virtual environment?

At its core, a Python virtual environment is just a directory that isolates a Python project from the rest of the machine it is stored on: the directory contains its own Python interpreter (a copy of, or a link to, the one it was created from) and its own installed packages, so the project does not need to rely on globally installed versions. This lets developers maintain separate working environments for many different projects, without modifying the host installation and without projects affecting each other.
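
To make the isolation concrete, here is a minimal sketch of how a script can tell whether it is running inside a virtual environment. It assumes the environment was created with venv; the function name in_virtual_environment is made up for this example.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtual_environment())
```

Running this once with the system interpreter and once with an environment's interpreter should show the difference.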

Creating a virtual environment

The default tool for virtual environments is venv, mainly because it adheres to the current standards and is part of the standard library, so it is included out of the box in Python 3.3 and later versions.

To create a virtual environment, you first need to choose a name for its directory (where the interpreter and packages will be stored). It is common to name this directory venv as well, so that other users can tell from the name alone that it is a virtual environment directory managed by venv.

Creating a new virtual environment is as simple as:

python -m venv venv

This creates a directory named venv inside the current directory, containing the interpreter, the package directory and the executable scripts needed to isolate the project from the host system.
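
The same step can also be done from Python through the standard library's venv module, which is handy for looking at what ends up inside the directory. The following is only a sketch: create_environment is a helper name invented for this example, and it passes with_pip=False to keep the demo fast, whereas python -m venv bootstraps pip by default.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(parent: Path, name: str = "venv") -> Path:
    """Create a virtual environment under `parent` and return its path."""
    env_dir = parent / name
    # with_pip=False skips bootstrapping pip (an assumption made for speed);
    # the command-line tool installs pip unless told otherwise.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # Use a throwaway directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_environment(Path(tmp))
        for entry in sorted(env_dir.iterdir()):
            print(entry.name)
        # pyvenv.cfg records which installation the environment was created from.
        print((env_dir / "pyvenv.cfg").read_text())
```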

Activating the virtual environment

Just creating the virtual environment doesn't change which python and pip your shell uses. To get the benefits of venv, we must now activate the virtual environment. In a bash shell, this is done with source:

source venv/bin/activate

Or by running the batch file on Windows (this works in cmd.exe; PowerShell uses venv\Scripts\Activate.ps1 instead):

venv\Scripts\activate.bat

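Be careful to use source when activating the environment in bash: you cannot execute the activate file directly. Activation works by changing environment variables such as PATH in the current shell, and a script that is executed runs in a child process whose changes are lost when it exits. Sourcing the file applies the changes to the shell you are working in.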
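
The reason behind this rule is that a child process cannot change the environment of its parent. A small sketch shows the effect from Python instead of bash; the variable name DEMO_VENV_VAR and the helper set_in_child are invented for this illustration.

```python
import os
import subprocess
import sys

VAR = "DEMO_VENV_VAR"  # hypothetical variable name, used only for this demo


def set_in_child() -> str:
    """Set VAR inside a child process and return the value the child sees."""
    child_code = (
        f"import os; os.environ[{VAR!r}] = 'set-in-child'; "
        f"print(os.environ[{VAR!r}])"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("child sees:", set_in_child())
    # The change was made in the child only; the parent never sees it.
    print("parent sees:", os.environ.get(VAR))
```

An executed activate script would be in the position of the child here: it could adjust its own PATH, but the shell that launched it would stay unchanged.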

You need to activate the virtual environment every time you open a new terminal. To deactivate it, you can either close the terminal, or run:

deactivate

Managing dependencies with venv

Once the virtual environment is activated, using venv is seamless: the pip and python commands now resolve to the versions inside the virtual environment, without any additional setup. New users may be confused at first, because a fresh virtual environment contains little more than pip itself (and doesn't inherit system packages either), so the first step is usually to install some packages, even if they were already present on the host system.

Installing a package works as usual:

pip install requests

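The command looks and behaves the same, but packages are now installed into the site-packages directory inside the environment: typically venv/lib/pythonX.Y/site-packages on Linux and macOS (on some Linux systems venv/lib64 is a link to venv/lib), and venv\Lib\site-packages on Windows.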
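
If you are unsure where packages end up, you can ask the interpreter itself. This sketch uses the standard sysconfig module; the exact paths it prints depend on your platform and Python version.

```python
import sys
import sysconfig

# "purelib" is the directory pure-Python packages are installed into
# for the interpreter that is currently running.
site_packages = sysconfig.get_paths()["purelib"]

print("environment root:", sys.prefix)
print("packages are installed into:", site_packages)
```

Run with the environment activated, both paths should point inside the venv directory.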

Fixing dependency versions

Packages installed into a virtual environment keep their versions until you update them manually. While this works as intended on your local machine, it does not carry over when you share the code, for example by uploading it to a public Git repository. You could upload the entire venv directory, but a virtual environment is tied to the machine it was created on: it contains absolute paths and platform-specific files, so at best it would work for users with the same operating system and processor architecture as you. For example, Mac users can't use an environment created on Windows and vice versa.

That's not practical, so the convention is to leave the venv/ directory out of the repository, and instead provide a text file named requirements.txt that contains the names and versions of all packages required to run the project. This file can be quickly generated from your virtual environment:

pip freeze > requirements.txt

The only thing you need to do now is re-run this command when you install or update packages.
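
To get a feeling for what ends up in the file, here is a rough approximation of that output in Python, built on importlib.metadata (Python 3.9 or later is assumed for the type hints). It is not how pip implements freeze: pip treats editable installs and a few other cases specially, which this sketch ignores. The function name pinned_requirements is made up.

```python
from importlib.metadata import distributions


def pinned_requirements() -> list[str]:
    """Return 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        # Skip entries with broken or missing metadata.
        if name and dist.version:
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    for line in pinned_requirements():
        print(line)
```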

Users who want to install the project on another machine can first create and activate a new virtual environment, and then install all packages at once:

pip install -r requirements.txt
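
After installing, it can be useful to check that the environment really matches the file. The sketch below is one possible way to do that, not a pip feature: check_pins is an invented helper, it only understands plain name==version lines (as written by pip freeze) and skips everything else.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(lines: list[str]) -> list[str]:
    """Compare 'name==version' lines against the installed packages."""
    problems = []
    for raw in lines:
        # Drop comments and environment markers, keep the requirement itself.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # blank lines, options and unpinned entries are skipped
        name, _, wanted = line.partition("==")
        name, wanted = name.strip(), wanted.strip()
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed (wanted {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{name}: installed {installed}, wanted {wanted}")
    return problems


if __name__ == "__main__":
    # Assumes a requirements.txt in the current directory.
    with open("requirements.txt", encoding="utf-8") as handle:
        for problem in check_pins(handle.read().splitlines()):
            print(problem)
```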

Removing packages

Especially for longer-running projects, it is vital to be precise with package management. Every package in the virtual environment ends up in requirements.txt and will be installed on all users' machines when they set up the project, whether your code actually uses it or not. To reduce bloat, you should remove packages when you don't need them anymore:

pip uninstall requests

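This deletes the package's files from your local venv/ directory, and pip freeze will no longer list it as a requirement. Note that pip uninstall removes only the package you name: packages that were pulled in as its dependencies stay in the environment until you uninstall them as well.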
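
Finding out which packages your code still uses can be partly automated. As a starting point, this sketch collects the top-level module names a piece of source code imports, using the standard ast module. The helper name imported_modules is invented, and keep in mind that import names do not always match package names (for example, the PyYAML package is imported as yaml), so the result still needs a manual look.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level names of all absolute imports in `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import within the project itself.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


if __name__ == "__main__":
    example = "import os.path\nfrom requests import get\nfrom . import helpers\n"
    print(sorted(imported_modules(example)))
```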

Managing dependencies in virtual environments with venv can be a quick and easy solution, but it requires diligence on the part of the developer: while installing dependencies is easy, forgetting to uninstall obsolete ones can quickly bloat the project, and figuring out in hindsight which packages are still required is tedious. Some package managers seek to remedy this problem, the most popular ones being Poetry and PDM. While these tools provide further automation for dependency management, they are not widely adopted throughout the open source community at this point, so understanding venv remains a valuable skill.