Streamlining Python Development: A Guide to a Modern Project Setup
- 1. Streamlining Python Development
A Guide to a Modern Project Setup
PyConDE / PyData 2024, April 22nd
Florian Wilhelm
© 2024
- 2. Mathematical Modelling
Modern Data Warehousing & Analytics
Personalisation & RecSys
Uncertainty Quantification & Causality
Python Data Stack
OSS Contributor & Creator of PyScaffold
Dr. Florian Wilhelm
• HEAD OF DATA SCIENCE
FlorianWilhelm.info
florian.wilhelm@inovex.de
FlorianWilhelm
- 3. ‣ Application Development (Web Platforms, Mobile
Apps, Smart Devices and Robotics, UI/UX
design, Backend Services)
‣ Data Management and Analytics (Business
Intelligence, Big Data, Searches, Data Science
and Deep Learning, Machine Perception and
Artificial Intelligence)
‣ Scalable IT-Infrastructures (IT Engineering, Cloud
Services, DevOps, Replatforming, Security)
‣ Training and Coaching (inovex Academy)
inovex is an innovation and quality-driven
IT project house with a focus on
digital transformation.
Using technology to
inspire our clients.
And ourselves.
Berlin · Karlsruhe · Pforzheim · Stuttgart · München · Köln · Hamburg · Erlangen
www.inovex.de
- 4. 1. Introduction:
a. What makes a good project setup?
b. How do we achieve it?
2. Streamlined Project Setup:
a. configuration with pyproject.toml
b. tooling with hatch, ruff, mypy, pytest, …
3. Conclusion
Agenda
- 7. 1. Conventions
a. project structure
b. code formatting, e.g., pep8, black, ruff
c. documentation, e.g., Sphinx, mkdocs
2. Automation
a. dependency & environment management
b. building & publishing
c. versioning, e.g., semantic versioning
d. testing, linting/formatting, type checking
3. Easy to Use!
Concrete Requirements for those Goals
- 8. Semantic Versioning
‣ tells developers what to
expect
‣ avoids dependency hell for
developers using your
software
‣ necessary for requirement
specifiers like ~= 2.21 or
^2.2.21 (Poetry only)
More Details: https://www.geeksforgeeks.org/introduction-semantic-versioning/ and https://semver.org/
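To make the two requirement specifiers above concrete, here is a minimal sketch that checks plain `X.Y.Z` versions against a compatible-release (`~=`) and a caret (`^`) constraint. The helper names `parse`, `compatible_release` and `caret` are hypothetical, and the logic is deliberately simplified: it ignores pre-releases, does not reject the invalid single-component form `~= 2`, and only covers the caret rule for major versions of 1 or higher. pip and Poetry implement the complete rules.

```python
def parse(version: str, width: int = 3) -> tuple[int, ...]:
    """Turn a plain numeric version such as '2.21' into a padded tuple (2, 21, 0)."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def compatible_release(candidate: str, spec: str) -> bool:
    """Simplified `~= spec`: at least `spec`, with all but its last given component fixed."""
    fixed = spec.count(".")  # '~= 2.21' fixes 1 component, '~= 2.2.21' fixes 2
    cand, base = parse(candidate), parse(spec)
    return cand >= base and cand[:fixed] == base[:fixed]


def caret(candidate: str, spec: str) -> bool:
    """Simplified Poetry-style `^spec` (major >= 1): at least `spec`, same major version."""
    cand, base = parse(candidate), parse(spec)
    return cand >= base and cand[0] == base[0]


if __name__ == "__main__":
    print(compatible_release("2.30.1", "2.21"))  # True: still within 2.*
    print(compatible_release("3.0.0", "2.21"))   # False: major version changed
    print(caret("2.9.0", "2.2.21"))              # True: below 3.0.0
    print(caret("2.2.20", "2.2.21"))             # False: older than the lower bound
```

Both specifiers only carry meaning if the package follows semantic versioning, because they assume that a release within the same major version does not break the API.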
- 9. This is not a talk about the best Package Management Tool
Source: An unbiased evaluation of environment management and packaging tools (https://www.inovex.de/de/blog/)
- 11. ‣ reproducibly building & publishing packages
‣ robust environment management with support for
custom scripts
‣ easy Python management, replacing pyenv
‣ easy semantic versioning based on Git tags
‣ sophisticated testing within various environments,
replacing tox
🐣 Hatch, the extensible Python project manager
Ofek Lev
- 12. ‣ folders for
∙ source files
∙ documentation
∙ tests
‣ human-readable information
∙ README.md
∙ …
‣ configuration files
∙ pyproject.toml
∙ …
Project Directory Structure
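As an illustration of the structure listed above, the following sketch creates an empty skeleton with separate folders for source files, documentation and tests, plus a README.md and a pyproject.toml. The function `scaffold`, the package name `my_project` and the placeholder file names are assumptions made for this example. The `src/` folder follows the src-layout mentioned in the conclusion.

```python
import tempfile
from pathlib import Path


def scaffold(root: Path, package: str = "my_project") -> list[Path]:
    """Create an empty src-layout skeleton under `root` and return the created files."""
    files = [
        root / "src" / package / "__init__.py",  # source files
        root / "docs" / "index.md",              # documentation
        root / "tests" / "test_placeholder.py",  # tests
        root / "README.md",                      # human-readable information
        root / "pyproject.toml",                 # configuration
    ]
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return files


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for created in scaffold(Path(tmp)):
            print(created.relative_to(tmp))
```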
- 13. ‣ defines the build system
‣ metadata about your project
for PyPI
‣ configuration for (almost) all
tools
∙ pytest
∙ mypy
∙ ruff
∙ coverage
All-in-One Configuration with pyproject.toml
- 14. Scripts in pyproject.toml for automation of tasks, e.g.
∙ running unit tests with or without coverage, debugging,
∙ building the documentation,
∙ running the linters, code checks, mypy,
∙ …
Automation with Scripts!
> hatch run test:cov
- 15. ‣ replaces tons of tools
‣ easy configuration via
pyproject.toml
‣ extremely fast
‣ over 700 built-in rules
Code Quality: Linting & Formatting
Ruff
flake8
autoflake
pydocstyle
…
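To give an impression of what the linter reports, the sketch below contains two issues covered by rules Ruff reimplements from Pyflakes and pycodestyle: F401 (unused import) and E711 (comparison to `None`). The function names are made up for this example.

```python
import os  # F401: `os` is imported but never used


def is_missing(value):
    # E711: comparison to None should be written as `value is None`
    return value == None


def is_missing_fixed(value):
    # The form the linter asks for: an identity check instead of equality.
    return value is None


if __name__ == "__main__":
    print(is_missing(None), is_missing_fixed(None))  # True True
    print(is_missing(0), is_missing_fixed(0))        # False False
```

Running `ruff check` on a file like this would list both findings, and `ruff format` handles the code formatting that black would otherwise take care of.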
- 16. Why mypy?
Type Checking: Are you my type?
static type checking finds many errors before
the code runs, often in edge cases.
type declarations act as machine-checked
documentation, thus enhancing the dev
experience.
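A small sketch of the kind of edge case meant here, with hypothetical function names. `find_user` may return `None`, and the annotation says so. If the `None` check in `greeting` were removed, mypy would reject the call to `name.title()` because `None` has no such attribute, without the code ever being run.

```python
from typing import Optional


def find_user(users: dict[int, str], user_id: int) -> Optional[str]:
    """Return the user's name, or None if the id is unknown."""
    return users.get(user_id)


def greeting(users: dict[int, str], user_id: int) -> str:
    name = find_user(users, user_id)
    # Edge case: unknown id. mypy insists that None is handled before using `name`.
    if name is None:
        return "Hello, stranger!"
    return f"Hello, {name.title()}!"


if __name__ == "__main__":
    users = {1: "ada"}
    print(greeting(users, 1))  # Hello, Ada!
    print(greeting(users, 2))  # Hello, stranger!
```

The signatures also document the functions: a reader sees at a glance that a lookup can fail, and the checker keeps that documentation in sync with the code.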
- 18. pytest
‣ de facto standard for unit testing
‣ powerful features like fixtures, etc.
‣ tons of useful plugins, e.g.:
∙ pytest-cov for coverage
∙ pytest-recording for mocking calls to external services
∙ pytest-sugar to make it easier on the eyes
Testing with pytest & hatch
hatch & tox
‣ isolated environments for testing different Python versions and
dependency combinations
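As a brief illustration of fixtures, the test below uses `tmp_path`, a fixture that ships with pytest and hands each test its own temporary directory. The function `write_report` is a made-up example of code under test.

```python
from pathlib import Path


def write_report(path: Path, lines: list[str]) -> int:
    """Write `lines` to `path`, one per line, and return how many were written."""
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def test_write_report(tmp_path: Path) -> None:
    # pytest injects `tmp_path` because the parameter name matches the fixture name.
    target = tmp_path / "report.txt"
    assert write_report(target, ["a", "b"]) == 2
    assert target.read_text(encoding="utf-8") == "a\nb\n"
```

pytest collects functions whose names start with `test_` and reports failing `assert` statements with the compared values. Run through a hatch environment, the same test can be executed against several Python versions without changing the test itself.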
- 20. ‣ Automatic and reproducible testing
‣ Publishing packages based on git tags
‣ Established branching strategy, e.g. GitHub Flow,
for efficient collaboration
‣ Scalability and Adaptability when needed
‣ Automated deployments, building of
documentation etc.
Automation with CI/CD
More Details: Data Science in Production: Packaging, Versioning and Continuous Integration (https://www.inovex.de/de/blog/)
- 21. Conclusion
‣ unified configuration in pyproject.toml
‣ standardized folder structure with
src-layout and useful README.md
‣ easy package management and
automation with hatch
‣ automated QA with ruff, pytest,
pre-commit, mypy, CI/CD
‣ proper documentation with mkdocs
‣ automation & conventions are key!
- 23. CHEERS TO THE COMMUNITY
Credits & Resources
‣ Ofek Lev, the creator of hatch, for his
awesome work in his spare time ❤
‣ Michael Hofmann from inovex who
made these awesome slides
- 24. © 2023
Thank you!
Dr. Florian Wilhelm
Head of Data Science
PyConDE / PyData 2024
inovex.de
florian.wilhelm@inovex.de
@inovexlife
@inovexgmbh