+++
title = "Improving Python Dependency Management With pipx and Poetry"
date = "2021-09-19"
author = "Ceda EI"
tags = ["python", "development"]
keywords = ["python", "development"]
description = "My current dev setup with python, poetry and pipx"
showFullContent = false
+++

Over time, how I develop applications in Python has changed noticeably. I will
divide the topic into three sections and see how they tie into each other at
the end.

- Development
- Packaging
- Usage

## Development

Under development, the issues I will focus on are the following:

- Dependency Management
- Virtualenvs and managing them

Historically, the way to do dependency management was through
`requirements.txt`. I found `requirements.txt` hard to manage. In that setup,
adding a dependency and installing it was two steps:

- Add the package `bar` to `requirements.txt`
- Either do `pip install bar` or `pip install -r requirements.txt`

While focused on development, I would often forget one or both of these steps.
Also, the lack of a lock file was a small downside for me (it could be a much
larger downside for others). The separation between `pip` and
`requirements.txt` can also easily lead you to accidentally depend on packages
that are installed on your system or in your virtualenv but not listed in your
`requirements.txt`.

Managing virtualenvs was also difficult. Nothing ties a virtualenv to a
project, so you need a directory convention; otherwise, you can't tell which
virtualenv is being used for which project. You can use the same virtualenv
for multiple projects, but that partially defeats the point of virtualenvs and
makes `requirements.txt` more error-prone (higher chances of forgetting to add
packages to it). The approach generally used is one of the following two:

```
foo/
├── foo_src/
└── foo_venv/
```

or

```
foo_src/
└── venv/
```

I preferred the second one as the first one nests the source code one
directory deeper.

### A new standard - `pyproject.toml`

In [PEP 518](https://www.python.org/dev/peps/pep-0518/), Python standardized
the `pyproject.toml` file, which allows users to choose alternate build systems
for package generation.

One such project that provides an alternate build system is
[Poetry](https://python-poetry.org/). Poetry hits the nail on the head and
solves my major gripes with traditional tooling.

### Poetry and virtualenvs

Poetry manages the virtualenvs automatically and keeps track of which project
uses which virtualenv. Working on an existing project which uses poetry is as
simple as this:

```bash
$ git clone https://gitlab.com/ceda_ei/verlauf
$ cd verlauf
$ poetry install
```

The `poetry install` command sets up the virtualenv, installs all the required
dependencies inside it, and sets up any commands accordingly (I will get to
this soon). To activate the virtualenv, simply run:

```bash
. "$(poetry env info --path)/bin/activate"
```

I wrap this in a small function which lets me toggle it quickly:

```bash
function poet() {
    # Record that the virtualenv is now being toggled by hand.
    POET_MANUAL=1
    if [[ -v VIRTUAL_ENV ]]; then
        deactivate
    else
        . "$(poetry env info --path)/bin/activate"
    fi
}
```

Running `poet` activates the virtualenv if it is not active and deactivates it if
it is active. To make things even easier, I automatically activate and
deactivate the virtualenv as I enter and leave the project directory. To do
so, simply drop this in your `.bashrc`:

```bash
# Print the closest ancestor of $PWD (including $PWD itself) that contains $1.
function find_in_parent() {
    local path
    IFS="/" read -ra path <<<"$PWD"
    for ((i=${#path[@]}; i > 0; i--)); do
        local current_path=""
        for ((j=1; j<i; j++)); do
            current_path="$current_path/${path[j]}"
        done
        if [[ -e "${current_path}/$1" ]]; then
            echo "${current_path}/"
            return
        fi
    done
    return 1
}

# Runs before every prompt and preserves the exit status of the last command.
function auto_poet() {
    ret="$?"
    if [[ -v POET_MANUAL ]]; then
        return $ret
    fi
    if find_in_parent pyproject.toml &> /dev/null; then
        if [[ ! -v VIRTUAL_ENV ]]; then
            if BASE="$(poetry env info --path)"; then
                . "$BASE/bin/activate"
                PS1=""
            else
                # poetry could not report a virtualenv; fall back to manual mode.
                POET_MANUAL=1
            fi
        fi
    elif [[ -v VIRTUAL_ENV ]]; then
        # Outside any project tree: leave the virtualenv.
        deactivate
    fi
    return $ret
}

PROMPT_COMMAND="auto_poet;$PROMPT_COMMAND"
```

This ties in well with the `poet` function; if you use `poet` anytime in a bash
session, activation switches from automatic to manual and changing directories
no longer auto-toggles the virtualenv.

![auto_poet and poet in action](/images/auto_poet.webp)

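
For readers who find the bash loop hard to follow, the upward search done by `find_in_parent` can be sketched in Python. This is only an illustration of the idea, not a replacement for the shell function; the Python signature (a `start` directory passed explicitly) is my own assumption.

```python
from pathlib import Path
from typing import Optional


def find_in_parent(name: str, start: Path) -> Optional[Path]:
    """Return the closest directory at or above `start` that contains `name`."""
    # `start.parents` yields every ancestor, ending at the filesystem root.
    for directory in (start, *start.parents):
        if (directory / name).exists():
            return directory
    return None
```

Calling `find_in_parent("pyproject.toml", Path.cwd())` from anywhere inside a project returns the project root, and `None` outside of one.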
### Poetry and dependency management

Instead of using `requirements.txt`, poetry stores the dependencies inside
`pyproject.toml`. Poetry is stricter than `pip` in resolving
versioning issues. Dependencies and dev-dependencies are stored inside
`tool.poetry.dependencies` and `tool.poetry.dev-dependencies` respectively.
Here is an example of a `pyproject.toml` for a project I am working on.

```toml
[tool.poetry]
name = "bells"
version = "0.3.0"
description = "Bells is a program for keeping track of sound recordings."
authors = ["Ceda EI <ceda_ei@webionite.com>"]
license = "GPL-3.0"
readme = "README.md"
homepage = "https://gitlab.com/ceda_ei/bells.git"
repository = "https://gitlab.com/ceda_ei/bells.git"

[tool.poetry.dependencies]
python = ">=3.7,<3.11"
click = "^8.0.1"
questionary = "^1.10.0"
sounddevice = "^0.4.2"
SoundFile = "^0.10.3"
numpy = "^1.21.2"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

# I will talk about this section soon
[tool.poetry.scripts]
bells = "bells.__main__:main"
```

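
The `^` prefix in the versions above is a caret constraint: it allows updates that do not change the left-most non-zero component of the version. Here is a rough sketch of that rule for plain numeric versions. It is not Poetry's own implementation, and `caret_range` is a hypothetical helper name.

```python
from typing import Tuple


def caret_range(constraint: str) -> Tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for e.g. '^8.0.1'."""
    parts = [int(p) for p in constraint.lstrip("^").split(".")]
    # Bump the left-most non-zero component; if every given component is
    # zero, bump the last one that was written.
    bump = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[:bump] + [parts[bump] + 1]
    # Pad both bounds to three components for readability.
    lower = parts + [0] * (3 - len(parts))
    upper = upper + [0] * (3 - len(upper))
    return ".".join(map(str, lower)), ".".join(map(str, upper))


print(caret_range("^8.0.1"))  # ('8.0.1', '9.0.0')
print(caret_range("^0.4.2"))  # ('0.4.2', '0.5.0')
```

So `click = "^8.0.1"` accepts any 8.x release from 8.0.1 onwards, while `sounddevice = "^0.4.2"` stays within 0.4.x.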
One of the upsides of poetry is that you don't have to manage the dependencies
in the `pyproject.toml` file yourself. Poetry adds an `npm`-like interface for
adding and removing dependencies. To add a dependency to your project, simply
run `poetry add bar` and it will add it to your `pyproject.toml` file and
install it in the virtualenv as well. To remove a dependency, just run
`poetry remove bar`. For development dependencies, just add the `--dev` flag to
the commands (Poetry 1.2 and later prefer `--group dev`).

## Packaging

Since poetry replaces the build system, we can now configure the build using
poetry via `pyproject.toml`. Inside `pyproject.toml`, the `tool.poetry` section
stores all the build info needed: `tool.poetry` contains the metadata,
`tool.poetry.dependencies` contains the dependencies, and `tool.poetry.source`
contains private repository details (in case you don't want to use PyPI).

One of the options is `tool.poetry.scripts`. It contains scripts that the
project exposes. This replaces `console_scripts` in `entry_points` of
`setuptools`.

For example,

```toml
[tool.poetry.scripts]
foobar = "foo.bar:main"
```

This will add a script named `foobar` in your `PATH`. Running that is
equivalent to running the following script:

```python
from foo.bar import main

if __name__ == "__main__":
    main()
```

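
To make the `foo.bar:main` target more concrete, here is a sketch of what `foo/bar.py` could contain. The module, the greeting and the `--name` option are all hypothetical; the only real requirement is a callable named `main` that can be called without arguments.

```python
import argparse
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Entry point that a `foobar = "foo.bar:main"` script would call."""
    parser = argparse.ArgumentParser(prog="foobar")
    parser.add_argument("--name", default="world", help="who to greet")
    # With argv=None, argparse reads the real command line (sys.argv[1:]).
    args = parser.parse_args(argv)
    print(f"Hello, {args.name}!")
    return 0
```

Accepting an optional `argv` keeps the function easy to call from tests, e.g. `main(["--name", "poetry"])`, while the installed `foobar` command still calls it with no arguments. The generated wrapper typically uses the return value as the process exit code.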
For further details, check the
[reference](https://python-poetry.org/docs/pyproject/).

Poetry also removes the need for manually doing editable installs
(`pip install -e .`). The package is automatically installed as editable when
you run `poetry install`. Any scripts specified in `tool.poetry.scripts` are
automatically available in your `PATH` when you activate the `venv`.[^1]

To build the package, simply run `poetry build`. This will generate a wheel and
a tarball in the `dist` folder.

To publish the package to PyPI (or another repo), simply run `poetry publish`.
You can combine the build and publish into one command with
`poetry publish --build`.

![example of poetry build](/images/poetry_build.webp)

## Usage

This part is more user-facing than dev-facing. If you want to use two
packages globally that expose some scripts to the user (e.g. `awscli`,
`youtube-dl`, etc.), the general approach is to run something like
`pip install --user youtube-dl`. This installs the package at the user level and
exposes the script through `~/.local/bin/youtube-dl`. However, this installs
all the packages at the same user level. Hypothetically, if you have two
packages `foo` and `bar` which have conflicting dependencies, this causes an
issue. If you run,

```bash
$ pip install foo
$ pip install bar
$ bar # works
$ foo # breaks because of dependency mismatch
```

While installing `bar`, `pip` will install the dependencies for `bar`, which
will break `foo` after warning you.[^2]

To solve this, there is [`pipx`](https://github.com/pypa/pipx). Pipx installs
each package in a separate virtualenv without requiring the user to activate
said virtualenv before using the package.[^3]

In the same scenario as before, doing the following works just fine.

```bash
$ pipx install foo
$ pipx install bar
$ bar # works
$ foo # also works
```

In this scenario, both `bar` and `foo` are installed in separate virtualenvs so
the dependency conflict doesn't matter.

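
The difference between the two approaches can be shown with a toy model. This is only an illustration and has nothing to do with how `pip` or `pipx` actually resolve packages; the `libshared` dependency and its version pins are made up.

```python
from typing import Dict

# Hypothetical pins: both apps need the same library at incompatible versions.
REQUIREMENTS: Dict[str, Dict[str, str]] = {
    "foo": {"libshared": "1.0"},
    "bar": {"libshared": "2.0"},
}


def install(env: Dict[str, str], app: str) -> None:
    """Install `app` into `env`, overwriting whatever version was there."""
    env.update(REQUIREMENTS[app])


def works(env: Dict[str, str], app: str) -> bool:
    """An app works only if each of its pinned versions is what `env` holds."""
    return all(env.get(dep) == ver for dep, ver in REQUIREMENTS[app].items())


# `pip install --user`: one shared environment for everything.
shared: Dict[str, str] = {}
install(shared, "foo")
install(shared, "bar")

# `pipx install`: one environment per application.
isolated: Dict[str, Dict[str, str]] = {"foo": {}, "bar": {}}
for app, env in isolated.items():
    install(env, app)

print(works(shared, "foo"), works(shared, "bar"))  # False True
print(works(isolated["foo"], "foo"), works(isolated["bar"], "bar"))  # True True
```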
## Some more things from my bashrc

```bash
# Run a command outside the virtualenv, then restore the virtualenv afterwards.
function wrapper_no_poet() {
    local last_env
    if [[ -v VIRTUAL_ENV ]]; then
        last_env="$VIRTUAL_ENV"
        deactivate
    fi
    "$@"
    ret=$?
    if [[ -v last_env ]]; then
        . "$last_env/bin/activate"
    fi
    return $ret
}

alias wnp='wrapper_no_poet'
alias pm='POET_MANUAL=1'
```

Prefixing any command with `wnp` runs it outside the virtualenv if a virtualenv
is active. Running `pm` turns off automatic virtualenv activation.

[^1]: This also allows for a nice switch between the development and production
    versions of the app. Essentially, when the virtualenv is active, you are
    using the development script, while when it is deactivated, you are using
    the global (likely production) version.

[^2]: To be precise, it will warn you that it broke `foo` but will still
    continue with the installation.

[^3]: For development, poetry also provides `poetry run`, which runs a command
    inside the virtualenv without having to activate it.