Evaluating Python Dependency Managers

February 25, 2022

Introduction

Hello again! The last time we met (see Part 1), we discussed pain points around our Python ecosystem and how we were evaluating the performance of alternative tools. In Part 1, our focal problems were around fighting too tightly pinned dependencies, slow environment resolution times and debugging unresolvable dependency graphs. In this follow-up, we wanted to dive deeper into the ergonomics of each tool and how they could empower Recursion’s developers to build products more efficiently.

Performance

Refer to the README in the benchmarks repository for runtime details. The plots shown below represent the median resolution time for each tool over 10 trials with error bars representing the 95th-percentile of the collected time data. The bars are sorted by median time in increasing order.
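As a rough illustration of how such a summary could be computed, here is a minimal sketch. It assumes a nearest-rank percentile (the method used for the actual plots is not stated here), and `summarize_trials` is a hypothetical helper name, not something from the benchmarks repository.

```python
import statistics


def summarize_trials(times_s):
    """Return (median, 95th percentile) for a list of timings in seconds."""
    if not times_s:
        raise ValueError("need at least one timing")
    ordered = sorted(times_s)
    median = statistics.median(ordered)
    # Nearest-rank percentile (an assumption): the smallest sample with at
    # least 95% of all samples at or below it. Integer math avoids float error.
    rank = (95 * len(ordered) + 99) // 100
    p95 = ordered[rank - 1]
    return median, p95
```

With only 10 trials, the nearest-rank 95th percentile is simply the slowest trial, which is worth keeping in mind when reading the error bars.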

Building environments with Conda packages was among the slowest builds of all. We found that using Mamba with Conda packages was noticeably faster than using Conda, so we don’t think the source package location and format are significant factors. In general, both Conda and Mamba built environments faster when using only PyPI packages.

Poetry’s performance had more variance than other tools when resolving data and web benchmarks. We noticed that Poetry would sometimes hang or otherwise fail to resolve the data benchmark’s environment as we were developing the benchmark.

Among the pip-based tools, we saw that pip-compile was faster than pip-lock in most cases and the environment resolution was generally cheaper than downloading and installing packages.

Supporting this hypothesis is the fact that the smallest benchmark (utility), which has only 25 packages, shows the opposite trend. When there are fewer requirements to download, it is faster to install them than to resolve the environment. This raises the question of whether downloading and installing can be optimized. Perhaps an obvious solution would be to employ parallelism. There is a long-standing feature request regarding parallel downloads for pip. For Conda-based workflows, one of the first lines in the Mamba README features “parallel downloading of repository data and package files using multi-threading.”
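To make the parallelism idea concrete, here is a small sketch using a thread pool from the standard library. It is not how pip or Mamba implement downloads; `fake_download` is a hypothetical stand-in that sleeps instead of touching the network, and the package names are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(name, delay_s=0.05):
    """Stand-in for a network fetch: sleeps, then returns a fake filename."""
    time.sleep(delay_s)
    return f"{name}.whl"


def download_all(names, max_workers=8):
    """Fetch all packages concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_download, names))
```

Because each fetch mostly waits on I/O, running them in threads lets the waits overlap, so the total time approaches that of the slowest single download rather than the sum of all of them.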

Tool performance was our highest priority because slow environment resolution contributes significantly to long build times, leading to poor developer experience (DevEx). The rest of these criteria aim to capture ergonomics and productivity benefits.

Package Format

The Python wheel distribution package format[5] is currently the standard for building and distributing pip-compatible packages. A pure Python wheel contains only Python source code. Platform wheels package and distribute native binary libraries and executable files built for specific operating systems, which are needed to support code using native binary extensions (frequently compiled from C or C++ source code). If a package is written in pure Python, then it’s possible to create a platform-agnostic universal wheel package.
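The distinction between pure and platform wheels is visible in the wheel filename itself, which PEP 427 defines as name, version, an optional build tag, then Python, ABI and platform tags. The sketch below parses those fields; treating `none`/`any` as "pure Python" is a heuristic, and `example_pkg` is a hypothetical package name.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        # The optional build tag sits between the version and the Python tag.
        "build": parts[2] if len(parts) == 6 else None,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_pure_python(filename):
    """Heuristic: no ABI requirement and no platform restriction."""
    tags = parse_wheel_filename(filename)
    return tags["abi"] == "none" and tags["platform"] == "any"
```

For example, `is_pure_python("example_pkg-1.0.0-py3-none-any.whl")` is true, while a filename ending in `cp39-cp39-manylinux2014_x86_64.whl` is treated as a platform wheel.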

Conda’s packaging format and tooling were designed with building and linking native binary libraries in mind, and offer a standardized build framework. Conda also has tools to convert packages for use on multiple platforms[6]. Some packages (numpy and scipy, for example) also distribute optimized Conda packages for a range of platforms, which is a major advantage for Conda users.

Any tool, including Conda, that can install packages from PyPI can install pip-compatible packages. In contrast, Conda packages can only be installed by Conda or Mamba. One caveat is that Anaconda does not seem to handle packages with underscores in their name in a pip-compatible way, causing friction when attempting to translate between the two. Additionally, because Anaconda can host other languages, some Python packages have non-standard naming conventions (e.g., python-graphviz instead of graphviz).
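On the pip side, the underscore issue comes down to name normalization: PEP 503 treats runs of `-`, `_` and `.` as equivalent and compares names case-insensitively. The sketch below shows only that PyPI-side rule; it makes no claim about how Anaconda normalizes names, and the helper name is our own.

```python
import re


def normalize_name(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()
```

Under this rule, `normalize_name("python_graphviz")` and `normalize_name("python-graphviz")` both yield `python-graphviz`, so any translation layer between the two ecosystems has to decide which spelling the other side expects.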

Lockfile Support

A lockfile is an exact pinning of required packages, representing a known compatible/working development environment. For package development, lockfiles are more specific than requirements, but give us confidence that our dependency list can be resolved. For service development, they can and should be used to represent the state of a running service in production to increase parity between development and production environments.

Lockfiles are a boon for CI/CD because they produce consistent, reproducible environments. They also speed up automated builds and the onboarding of new developers by skipping environment resolution entirely. A developer only pays the cost of re-locking when a dependency is added or changed, preferably using automation to keep up with security patches. Many of the tools we evaluated that support lockfiles also allow splitting dependencies into, at minimum, required and development-only groups.
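One simple automated check that follows from this definition is verifying that a pip-style lockfile really is an exact pinning. The sketch below is deliberately naive: it assumes a requirements-style text format, ignores URL and editable requirements, and uses hypothetical package names in its example.

```python
def unpinned_requirements(lock_text):
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in lock_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        # Skip blank lines and option lines such as --hash=... or -r other.txt.
        if not line or line.startswith("-"):
            continue
        line = line.rstrip("\\").strip()  # drop line-continuation backslashes
        requirement = line.split(";", 1)[0].strip()  # drop environment markers
        if "==" not in requirement:
            unpinned.append(requirement)
    return unpinned
```

A CI step could fail the build whenever this returns a non-empty list, for instance when a file contains `beta>=2.0` next to properly pinned lines like `alpha==1.0.0`.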

The last thing to mention about having lockfiles in a common format is that we can leverage tools like Dependabot to flag packages pinned to versions with known CVEs (Common Vulnerabilities and Exposures). We can even shift security left by using tools like Safety, Jake, or Snyk in our build pipelines to prevent bad package versions from getting into our main branch.

Maintained

When evaluating open source software, we found it important that the community around a tool is actively engaged. We looked for answers to questions like: How frequent are releases? Are bugs being fixed? Are new features being fielded and vetted? How many issues are open? How many have been closed?

Automation Friendliness

Each of these tools uses its own file format, with different features that make it easier (or sometimes harder) to automate around. For the uninitiated, we encourage you to look at our benchmarks repo to see minimal examples of what each of these files looks like in practice for specifying dependencies[7]. Here is a brief overview of what we found useful in each of these formats:

The TOML formats used by tools like Poetry (pyproject.toml) and Pipenv (Pipfile) allow for users to separate dev dependencies from required dependencies. Poetry also allows for specifying optional or “extra” requirements, which is useful for building lightweight core packages with the option to do “full” or specific option installs (e.g., think of something like pip install apache-airflow[aws]). Separate files can be used (and constraints between them imposed) to mimic both the dev and optional behaviors with a tool like pip-tools’ pip-compile.

Poetry, Pipenv, and Conda/Mamba all allow users to specify the version of Python in their respective environment definition files. Additionally, Conda and Mamba allow for naming of the virtual environments, and Poetry supports a variety of additional project details. Poetry’s robust file format makes sense, as its primary function is to make deploying Python packages painless. Poetry deserves extra praise here for being the only tool we vetted that uses the agreed-upon PEP 518[8] standard of pyproject.toml.

Security Scannable

As mentioned in the lockfile support section, we looked at four popular tools that scan Python dependencies for vulnerabilities: Dependabot[9] (free for GitHub repositories, paid plans), Snyk[10] (free, paid plans), Safety[11] (free, paid plans) and Jake[12] (free, paid plans).

At time of writing, Dependabot works by scanning and analyzing git repositories for pip requirements, Pipenv TOML, Poetry TOML, pip-tools/pip-compile files and lockfiles. This means it will flag packages with known vulnerabilities. It does not support Conda environment YAML files.

Snyk’s tools can scan pip, Pipenv TOML and Poetry TOML files and lockfiles[13], and Snyk recommends using dephell[14] to convert Conda files[15]. Safety’s tools can scan pip files and lockfiles. At time of writing, Pipenv ships with a command line tool (pipenv check) that wraps Safety[16]. Other file formats are not explicitly supported[17][18]. Sonatype’s Jake tool works with all of the tools we evaluated.

DevEx/Debuggability

Our first post in this series made a strong case for clear and succinct output that guides users directly to the problem. Any environment resolution tool should be able to point users to the exact packages in conflict and any useful supplemental information (such as all available versions and their relevant requirements).
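To illustrate the kind of output we mean, the sketch below reports which packages are requested at incompatible versions and by whom. It is far simpler than a real resolver: it only understands exact `name==version` pins, and every package name in the example is hypothetical.

```python
from collections import defaultdict


def find_pin_conflicts(requirements):
    """Find packages pinned to different versions by different requirers.

    requirements maps a requirer name to a list of 'name==version' strings.
    Returns {package: {version: [requirers]}} for conflicting packages only.
    """
    pins = defaultdict(dict)
    for requirer, specs in requirements.items():
        for spec in specs:
            name, _, version = spec.partition("==")
            key = name.strip().lower()
            pins[key].setdefault(version.strip(), []).append(requirer)
    # A package is in conflict when more than one distinct version is requested.
    return {name: versions for name, versions in pins.items() if len(versions) > 1}
```

Given `{"app": ["shared==1.0"], "lib-a": ["shared==2.0"]}`, the result names `shared`, both requested versions, and the requirer behind each one, which is the minimum a developer needs to start untangling the conflict.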

Looking at a specific instance where the data benchmark’s requirements broke, most of the tools failed with a list of incompatible packages. Pipenv failed with a stacktrace from attempting to build compatible packages from source, but didn’t list which packages were in conflict. Poetry, on the other hand, appeared to hang without further output when running with the default verbosity settings.

In our experience, pip-tools gave straightforward output, and thus was the tool we felt most comfortable reaching for when debugging a hairy dependency tree.

Python Version Manager

Version managers install and switch between multiple Python versions on a single system without causing problems in the default system Python. They are also useful for local package development, where you may want to support multiple versions of Python. This is a nice-to-have feature, but it can be replaced with containerized development: the same result can be achieved by deploying in containers and using testing tools that support multiple Python versions, like tox or nox.

Virtual Environment Manager

Python virtual environment managers create self-contained Python installations, which can be convenient for setting up local development environments. This is useful when working on multiple projects with different dependencies. This was a nice-to-have feature, but not required since tools like the built-in Venv or Virtualenv exist for this sole purpose. If you are an end-user who needs to use a Python product in an isolated environment, consider a tool like pipx.
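For reference, the built-in module can also be driven from Python rather than the command line. This is a minimal sketch using the standard library's `venv.EnvBuilder`; the `create_env` helper and the target directory are our own illustrative choices, and pip is left out of the environment by default to keep creation fast.

```python
import tempfile
import venv
from pathlib import Path


def create_env(target_dir, with_pip=False):
    """Create a virtual environment and return the path to its pyvenv.cfg."""
    # clear=True wipes any existing directory contents before creating.
    builder = venv.EnvBuilder(with_pip=with_pip, clear=True)
    builder.create(target_dir)
    return Path(target_dir) / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / "env")
        print(cfg.read_text())
```

The `pyvenv.cfg` file records which base interpreter the environment was created from, which is handy when checking that a project is using the Python version you expect.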

Conclusion

Hopefully the information provided in this blog gives you a better sense of the landscape of Python tooling and provides good fodder for internal discussions where you work. In the end, we picked pip-tools’ pip-compile for requirement wrangling. We felt pip-tools was battle-hardened, performed well, fit nicely into our workflow, and "does one thing well." Users can choose between Conda and Pyenv in conjunction with Virtualenv or Venv for managing Python versions on their systems.

Ayla Khan is a Senior Software Engineer and Dan Maljovec is an Infrastructure Engineer at Recursion.

Footnotes

[1] Lockfiles supported with pip freeze and help from other tools

[2] Pipenv caught some flak for having a bit of an erratic release schedule for a time, but they have since switched to a calendar versioning scheme and appear to be keeping up since 2020

[3] Pipenv documentation is currently not available (cert issue + 403 response)

[4] Poetry could hang indefinitely without any error or warning messages, had the most variance in timing, and was pretty slow in most benchmarks

[5] https://www.python.org/dev/peps/pep-0427/

[6] https://docs.conda.io/projects/conda-build/en/latest/user-guide/tutorials/build-pkgs.html#converting-a-package-for-use-on-all-platforms

[7] https://github.com/recursionpharma/perp_public_benchmarks

[8] https://www.python.org/dev/peps/pep-0518/

[9] Dependabot - GitHub

[10] Snyk for Python - Snyk User Docs

[11] safety · PyPI

[12] Jake - GitHub

[13] Snyk for Python - Snyk User Docs: Features

[14] dephell/dephell: Python project management