June 10, 2022

Discovering a Dangerous Pattern in a Popular Python Package Manager

Note: Verizon Media is now known as Yahoo.

A drawing of a python with a vulnerability in its stomach.

Recently, I discovered a high-severity vulnerability in Pipenv, a popular package manager for Python.

Due to a flaw in how Pipenv parsed requirements files, an attacker could hide a specially crafted comment in a requirements.txt file to trigger arbitrary remote code execution on developers' machines. (Seriously, upgrade Pipenv.)

But while the vulnerability is interesting in its own right, how I found it is even better.

An Unexpected Discovery 

In December of 2021, I discovered the vulnerability in Pipenv while working on a tool I wrote in Python to manage and automate our static analysis pipeline. My tool had a few dependencies on other Python packages, which were listed in a requirements.txt file. 

However, I noticed that when I tried to install the requirements file with Pipenv, I kept getting an error referencing some line in the file that I had commented out. When I removed the comment from the requirements file, installing with Pipenv worked again.

So the problem was solved, right? We could just remove the comment.

For me, as a security engineer, it’s not enough just to fix it. 

My job is to identify and understand security risks in applications, and to give recommendations on how to remediate or mitigate those risks. Unexpected crashes in software can and do lead to really scary things. 

So I dove in to see if this Pipenv crash could be exploited by an attacker. In early January, I reported the bug, and a patch and advisory (CVE-2022-21668) were quickly released.

For those unfamiliar with the Python language ecosystem, there are a number of popular package management tools which Python developers use to define and manage the third-party packages that their code depends on. 

These tools automatically download and install these packages from an index server (like https://pypi.org/), a remote source code repository, or a location on a filesystem.

Pip, which has been bundled with Python since Python 3.4, is the de facto standard utility for installing packages in Python, but there are a number of other package managers that offer additional features.

Pipenv is one of those package managers (and seems to be the officially recommended option).

Like many of the Python package management tools, Pipenv supports installing a project’s dependencies from a requirements file.

A requirements file, usually named “requirements.txt”, is just a simple text file that lists the pip configuration options and the Python packages, one per line, that should be used to set up the development environment. The requirements file format was defined and popularized by pip, and is supported by most Python package management tools.
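To make the format concrete, here is a minimal sketch of how the lines of a requirements file break down. It is a simplification of my own, not pip's implementation: the classify_lines helper and the sample file are hypothetical, and it ignores features such as line continuations. The comment rule (a "#" at the start of a line, or preceded by whitespace, begins a comment) follows pip's documented behavior.

```python
import re

# A '#' at the start of a line, or preceded by whitespace, starts a comment.
COMMENT_RE = re.compile(r"(^|\s+)#.*$")


def classify_lines(text: str):
    """Return (kind, content) pairs for each meaningful line (hypothetical helper)."""
    entries = []
    for raw in text.splitlines():
        line = COMMENT_RE.sub("", raw).strip()
        if not line:
            continue  # blank line or comment-only line
        kind = "option" if line.startswith("-") else "requirement"
        entries.append((kind, line))
    return entries


# Illustrative file content, not taken from any real project.
SAMPLE = """\
# dependencies for a hypothetical tool
--index-url https://pypi.org/simple
requests==2.27.1
pipdeptree  # inspect the dependency tree
"""

for kind, content in classify_lines(SAMPLE):
    print(f"{kind:12} {content}")
```

The point to notice is that comments are removed before anything else is interpreted, so an option that only appears inside a comment should never be seen by the parser.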

In theory, all of these package managers should parse that file in the same way. However, Pipenv implemented its own requirements file parser. And that was causing a crash.

Crashing the Parser 

Even though the overall likelihood of exploitation is low to moderate depending on a range of factors, the impact of successful exploitation is severe.

Indeed, if an attacker is able to hide a malicious --index-url option in a requirements file that a victim installs with Pipenv, the attacker can embed arbitrary malicious code in packages served from their own index server, and that code will be executed on the victim's host during installation (remote code execution/RCE).

To demonstrate, we can fork the source of the popular pipdeptree package and add some extra code to be executed during build time (in setup.py) and at runtime (in pipdeptree.py), and then build a new source distribution for our modified pipdeptree package.

To set up our “malicious” index server, we can copy our source distribution to a new “index/pipenv” directory on our local machine, and use Python’s built-in http.server module to serve it by running “python3 -m http.server 8080 -b --directory 'index'”.

Lastly, we create our compromised requirements.txt file with the following content (note the custom index URL given with the abbreviated “-i” option in the comment on line 1):

Now, if we install the latest unpatched version of Pipenv (2021.11.23) and install our requirements file with “pipenv install -r requirements.txt”, we can see that our compromised version was installed from our malicious index server:

Looking at the Pipfile and Pipfile.lock files generated by Pipenv, we can also see that our malicious index server was, in fact, added:

The root cause is this vulnerable requirements file parsing code in the parse_indexes(line: str) function of the pipenv.utils module:

This function is called iteratively on each line of a requirements file, and uses the argparse module to find and process --index-url, --extra-index-url, and --trusted-host options (and variations thereof). However, it does not ignore these options when they appear in comments, or validate that these options appear on their own lines as required by the requirements file specification (see: https://pip.pypa.io/en/stable/reference/requirements-file-format/#global-options). 
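The pattern described above can be sketched in a few lines. To be clear, this is not Pipenv's actual code: the function names, the sample line, and the localhost URL are hypothetical, and the "stricter" variant is only one possible way to follow the specification (strip comments first, then honor options only when they start the line).

```python
import argparse
import re

# A '#' at the start of a line, or preceded by whitespace, starts a comment.
COMMENT_RE = re.compile(r"(^|\s+)#.*$")


def _build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="indexes", add_help=False)
    parser.add_argument("-i", "--index-url", dest="index")
    parser.add_argument("--extra-index-url", dest="extra", action="append")
    parser.add_argument("--trusted-host", dest="trusted", action="append")
    return parser


def naive_parse_indexes(line: str):
    """Flawed pattern: scan every token on the line, comments included."""
    args, _rest = _build_parser().parse_known_args(line.split())
    indexes = ([args.index] if args.index else []) + (args.extra or [])
    return indexes, args.trusted or []


def stricter_parse_indexes(line: str):
    """Drop comments first, then only honor options that begin the line."""
    stripped = COMMENT_RE.sub("", line).strip()
    if not stripped.startswith("-"):
        return [], []
    return naive_parse_indexes(stripped)


# Hypothetical comment line with an index option hidden inside it.
hidden = "# pinned for CI -i http://127.0.0.1:8080/simple"
print(naive_parse_indexes(hidden))     # the hidden index URL is picked up
print(stricter_parse_indexes(hidden))  # the comment is ignored
```

Because the naive version hands the whole tokenized line to argparse, anything that looks like an option is honored no matter where it sits on the line.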

Moreover, the options can also be abbreviated due to default behavior provided by the argparse.ArgumentParser object used to parse these options in the requirements file, so that --trusted-host and --t will be treated as equivalent by Pipenv, for example.
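The abbreviation behavior is easy to reproduce with argparse alone. The sketch below is illustrative only: the parse_trusted_hosts helper and the example.invalid host are made up. It shows that argparse accepts unambiguous prefixes of long options by default, and that passing allow_abbrev=False turns this off.

```python
import argparse


def parse_trusted_hosts(tokens, allow_abbrev):
    """Parse --trusted-host from tokens; return (hosts, unrecognized tokens)."""
    parser = argparse.ArgumentParser(
        prog="indexes", add_help=False, allow_abbrev=allow_abbrev
    )
    parser.add_argument("--trusted-host", action="append")
    args, rest = parser.parse_known_args(tokens)
    return args.trusted_host or [], rest


tokens = ["--t", "example.invalid"]

# Default argparse behavior: "--t" is expanded to "--trusted-host".
print(parse_trusted_hosts(tokens, allow_abbrev=True))

# With abbreviations disabled, "--t" is left unrecognized.
print(parse_trusted_hosts(tokens, allow_abbrev=False))
```

Disabling abbreviations narrows the set of strings that can be interpreted as an option, which makes hidden options harder to disguise.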

About the Author

Chris Passarello is a Principal Product Security Engineer at Yahoo. He works to improve the security of our products and systems by working closely with our development teams at key touchpoints during each stage of software development. The goal is to design and build systems with security in mind from the start, and to quickly find and fix security issues that come up during development or later on in production.

In addition to other activities like threat modeling and penetration testing, a large part of his time is spent reviewing code for security vulnerabilities.