Another daily hassel for python developers is to have their packages under control.
Python, unlike other programming languages lacks of solid package management tool.
pip provides some basic functionality, but still does not have necessary version and
dependency management capabilities, required for mission critical python projects.
As a consequence, there are several issues that might occur in your production environment.
In this article I will discuss the most critical issues and give some thoughts how they
can be avoided using poetry
.
Version divergence issue
Depending on team size the one pythong project is usually maintained by several developers. The scrum-master/team lead prepares tasks for the sprint/iteration (if you work in agile). and distributes the tasks among developers. For simplicity lets consider the following example project stucture:
my-flask-app
|--static/
|--css/
|--main.css
|--templates/
|--index.html
|--products.html
|--models.py
|--views.py
|--requirements.txt
where the requirements.txt would like:
Flask~=2.*
requests~=1.*
And now imagine, that before starting, the developer decides to run
pip install -r requirements.txt
to refresh the packages. The result of this comand will update the packages in his local environment.
Flask==2.0.1 -> Flask==2.2.3
requests==1.8.1 -> requests==1.9.2
Now, after developing the changes, he pushes his changes to CI server, the server builds the packages, runs the auto tests, stores the artifacts and notifies the QA team that the task is ready for testing. The QA engineer pulls the code, starts the manual regression tests and realises that some of the features are not working on his machine. He reports the problem and moves the ticket back to development. The developer tries to reproduce problem, but could not and reports back to QA engineer, with something like
"I followed all your instructions, but could not reproduce the problem,
could you please come over or provide more details?"
or even worse
"everything works on my machine, please check again"
Hopefully, after a few iterations of such a ping-pong in Jira they realise that
they need to check the packages they installed in their .venv
.
Apparently, the QA engineer still has the old packages:
Flask==2.0.1
requests==1.8.1
while the developer has
Flask==2.2.3
requests==1.9.2
and the newly developed feature is using the functionality of the new requests
lib,
thus breaks without the updates.
The QA engineer updates the packages, approves the changes and moves the ticket to DevOps
engineer for deployment.
The production was not affected, however a bunch of hours were wasted on figuring out the
root cause of the problem.
And, as usual, it could be worse. Particularly, in my own practise, I have seen clusters of production machines serving the customers, where each machine had slightly different sets of packages. This kind of setup works “fine”, until it breaks, with really long running consequences and some reputation lost. Therefore, it is tremendously important to keep your packages and their versions under control.
Control your package with Poetry
Poetry is a beautifull instrument that provides and extensive control over your python packages:
- controls the list of packages
- separates the dev dependancies from production dependancies
- resolves the packages dependencies
- stores the hash of all installed packages and their versions
Poetry leverages the pyproject.toml
file where the list of dependencies can be defined.
[tool.poetry]
name = "my-flask-app"
version = "0.3.0"
description = "My amazin flask app"
authors = [
"john.snow <john.snow21212230@gmai.com>",
"gourgen.mardirossian <gougou.mardirossian45112@gmail.com>"
]
[tool.poetry.dependencies]
Flask ~= "2.2.*"
requests ~= "1.9.*"
[tool.poetry.dev-dependencies]
black = "^21.6b0"
flake8 = "^3.9.2"
pylint = "^2.8.3"
pytest-mock = "^3.6.1"
Pympler = "^0.9"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"