Since 2016, the file setup.cfg can completely specify
Python package prerequisites.
setup.cfg specifies nearly all Python metadata, potentially reducing setup.py to a one-liner or eliminating it.
This helps security by allowing extremely fast recursive machine-parsing of prerequisites without installing packages first.
Generally, specify Python package prerequisites in
setup.cfg as much as possible.
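As a minimal sketch of why a declarative setup.cfg is easy to machine-parse, the standard-library configparser can read install_requires without executing or installing anything. The package name and requirements below are hypothetical, and the name-extracting regex is a simplification, not a full requirement-specifier parser.

```python
import configparser
import re

# Hypothetical setup.cfg content, for illustration only
EXAMPLE_CFG = """
[metadata]
name = mypkg
version = 0.1.0

[options]
python_requires = >=3.6
install_requires =
    numpy>=1.15
    requests
"""


def read_install_requires(cfg_text):
    """Return bare requirement names from [options] install_requires."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get("options", "install_requires", fallback="")
    names = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        # keep only the leading project name (simplified, assumes a plain name)
        names.append(re.match(r"[A-Za-z0-9._-]+", line).group(0))
    return names


print(read_install_requires(EXAMPLE_CFG))
```

Because nothing is imported or executed from the package itself, a scanner could apply this kind of parsing to many packages quickly.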
Python packages should minimize the size of their directed dependency graph for best package longevity with minimum maintenance effort. However, the most effective use of programmer/scientist/engineer time generally comes from reusing code wherever appropriate. How do we evaluate quality of prereqs? Modern Python code includes these factors:
- use setup.cfg as much as possible
- continuous integration
- code coverage tests
- PEP8 compliance
- Static type hinting
Long term archiving of Python software requires recording its direct and indirect dependencies.
This is commonly done with
pip freeze, but that flat list provides no direct sense of module hierarchy.
The techniques described below provide a detailed, zoomable hierarchical view of Python module dependencies.
Python dependency analysis
Python packages that use setup.py to specify prerequisites generally must be installed before their dependencies can be determined. That is, setup.py is recursively executed for each module to determine what modules are needed overall. This is bad for automated security analysis, which is slowed greatly by needing to install packages to determine prereqs. Modern Python packages solve this problem by specifying most package configuration in setup.cfg. setup.py can be as small as:
from setuptools import setup; setup()
There are of course cases where setup.py is more than a one-liner.
In general, problems can arise where
install_requires specifies more packages than strictly required to pass the CI unit test, or not enough (test fails).
Proper use of CI will usually resolve these issues before end users see them.
pipdeptree is the most practical solution to generate plots of Python directed dependency graphs.
This method assumes:
- self-test has adequate coverage to be meaningful for most users
- packages only used as convenience methods for some users are under
- strictly necessary modules are specified
- minimum Python version is specified
- CI-only requirements are specified
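The assumptions above could be checked mechanically with something like the following sketch. The section and key names assume setuptools' declarative setup.cfg layout, the extra named tests is assumed from the `.[tests]` install used in the script later in this article, and the sample file content is hypothetical.

```python
import configparser

# Hypothetical setup.cfg, for illustration only
SAMPLE = """
[options]
python_requires = >=3.6
install_requires =
    numpy

[options.extras_require]
tests =
    pytest
"""


def missing_fields(cfg_text, ci_extra="tests"):
    """List the assumed setup.cfg fields that are absent or empty."""
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)
    wanted = [
        ("options", "install_requires"),       # strictly necessary modules
        ("options", "python_requires"),        # minimum Python version
        ("options.extras_require", ci_extra),  # CI-only requirements
    ]
    missing = []
    for section, key in wanted:
        # fallback covers both a missing section and a missing key
        if not cfg.get(section, key, fallback="").strip():
            missing.append(f"[{section}] {key}")
    return missing


print(missing_fields(SAMPLE))
```

An empty list means every assumed field was found; it says nothing about whether the listed requirements are actually sufficient, which is what the CI self-test is for.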
The process below is targeted at packages used in “development mode”, that is, not installed into
site-packages, except for a link back to the code directory.
pip install virtualenv
In the Python package directory, create a new Python virtual environment, since
pipdeptree depends on having only the analyzed package and its dependencies installed.
virtualenv testdep
. testdep/bin/activate
pip install pipdeptree[graphviz]
Install the package you wish to examine (and whatever dependencies it automatically installs)
pip install -e .
Make a hierarchical dependency graph by running pipdeptree with no arguments.
This should be a very short tree (unless you are testing with a big package). Try it with a simple package you’ve made, seeing if the dependency list matches what you expect
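To illustrate what such a hierarchical view contains, here is a rough sketch that walks requirement metadata recursively. It is not how pipdeptree is implemented. It assumes Python 3.8+ for importlib.metadata, skips any requirement carrying an environment marker as a simplification, and uses hypothetical package data for the demonstration so it runs without installing anything.

```python
from importlib import metadata
import re


def direct_requirements(dist_name):
    """Names of the unconditional requirements of an installed distribution."""
    try:
        reqs = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []  # not installed, so nothing to expand
    names = []
    for req in reqs:
        if ";" in req:  # simplification: skip marker/extra-conditional entries
            continue
        names.append(re.match(r"[A-Za-z0-9._-]+", req).group(0))
    return names


def format_tree(name, get_requirements, depth=0, seen=None):
    """Return indented lines describing the dependency tree under name."""
    seen = set() if seen is None else seen
    lines = ["  " * depth + name]
    if name in seen:  # already expanded once; this also guards against cycles
        return lines
    seen.add(name)
    for child in get_requirements(name):
        lines.extend(format_tree(child, get_requirements, depth + 1, seen))
    return lines


# Hypothetical dependency data, so the example runs without any installs
fake = {"mypkg": ["numpy", "requests"], "requests": ["urllib3"]}
print("\n".join(format_tree("mypkg", lambda n: fake.get(n, []))))
```

Inside the virtual environment, passing direct_requirements as the second argument would walk the packages that are really installed.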
Directed Dependency Graph
Now you’re ready to create the directed dependency graph for the package. Install GraphViz with the command for your system (apt on Debian/Ubuntu Linux, Homebrew on macOS):
apt install graphviz
brew install graphviz
pipdeptree --graph-output svg > dep.svg
View the SVG in your web browser or image viewer software (e.g. IrfanView).
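For readers curious what a directed dependency graph looks like as data, the sketch below writes Graphviz DOT text from a plain dictionary. It only illustrates the graph format and is not pipdeptree's own output; the dependency data and the graph name are hypothetical.

```python
def to_dot(deps):
    """Render {package: [requirements]} as Graphviz DOT text."""
    lines = ["digraph dependencies {"]
    for pkg in sorted(deps):
        for req in sorted(deps[pkg]):
            # one directed edge per "package requires requirement" relation
            lines.append(f'    "{pkg}" -> "{req}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical dependency data, for illustration only
deps = {"mypkg": ["numpy", "requests"], "requests": ["urllib3"]}
print(to_dot(deps))
```

Saving the printed text to a file such as dep.dot and running dot -Tsvg dep.dot > dep.svg should render it with GraphViz.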
One-click Python dependency graph
Wrap up the previous discussion and scripts in this Bash script
#!/bin/bash
set -e

# optionally change to the package directory given as the first argument
[[ -n "$1" ]] && cd "$1"

virtualenv testdep  # it's OK if it already exists
. testdep/bin/activate

pip install pipdeptree[graphviz]
pip install -e .[tests]

pipdeptree --graph-output svg > dep.svg

# deactivate is a shell function defined by the activate script
deactivate

eog dep.svg &  # whatever your favorite image viewing program is
Other dependency graph modules
In my opinion, these modules are not yet ready to use because of the deficiencies noted in each section. They are included only for reference.
Note: to make Modulegraph useful, the output must be post-processed, as almost all of the output is system stdlib modules.
Modulegraph is an established, maintained tool for creating a
.dot dependency graph.
Its output is extremely verbose.
It’s necessary to post-process the
.dot output with
pydot to make use of it.
What if we instead preemptively excluded a list of known stdlib modules, removing say 98% of
modulegraph output from the start?
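One possible way to build such an exclusion list is sketched here, assuming Python 3.10 or newer, where sys.stdlib_module_names is available. The small fallback set and the sample module names are hypothetical stand-ins, and this filter is not part of modulegraph.

```python
import sys

# sys.stdlib_module_names exists on Python 3.10+; the fallback set is a tiny
# hypothetical stand-in so the sketch still runs on older interpreters
STDLIB = getattr(
    sys,
    "stdlib_module_names",
    frozenset({"os", "sys", "json", "re", "collections"}),
)


def drop_stdlib(module_names):
    """Keep only modules whose top-level package is not in the stdlib."""
    return [m for m in module_names if m.split(".")[0] not in STDLIB]


print(drop_stdlib(["os", "os.path", "json", "numpy", "numpy.linalg"]))
```

The same test on the top-level package name could be applied to node names while post-processing the .dot output.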
pip install modulegraph
Examine a file’s requirements, creating a .dot graph:
python -mmodulegraph file.py -q -g > graph.dot
dot -Tsvg graph.dot > graph.svg
Snakefood is in maintenance mode and is Python 2 only. A pull request adding Python 3 support was merged, but that version has not yet been uploaded to PyPI.
pip install hg+https://bitbucket.org/blais/snakefood