© 2024
Streamlining Python Development
A Guide to a Modern Project Setup
PyConDE / PyData 2024, April 22nd
Florian Wilhelm
Mathematical Modelling
Modern Data Warehousing & Analytics
Personalisation & RecSys
Uncertainty Quantification & Causality
Python Data Stack
OSS Contributor & Creator of PyScaffold
Dr. Florian Wilhelm
• HEAD OF DATA SCIENCE
FlorianWilhelm.info
florian.wilhelm@inovex.de
FlorianWilhelm
inovex is an innovation and quality-driven IT project house with a focus on digital transformation.
‣ Application Development (Web Platforms, Mobile Apps, Smart Devices and Robotics, UI/UX design, Backend Services)
‣ Data Management and Analytics (Business Intelligence, Big Data, Searches, Data Science and Deep Learning, Machine Perception and Artificial Intelligence)
‣ Scalable IT Infrastructures (IT Engineering, Cloud Services, DevOps, Replatforming, Security)
‣ Training and Coaching (inovex Academy)
Using technology to inspire our clients. And ourselves.
Berlin · Karlsruhe · Pforzheim · Stuttgart · München · Köln · Hamburg · Erlangen
www.inovex.de
Agenda
1. Introduction:
a. What makes a good project setup?
b. How do we achieve it?
2. Streamlined Project Setup:
a. configuration with pyproject.toml
b. tooling with hatch, ruff, mypy, pytest, …
3. Conclusion
Introduction
What makes a streamlined Python Project Setup?
1. efficient development
2. easy collaboration
3. seamless build & deployment
Concrete Requirements for those Goals
1. Conventions
a. project structure
b. code formatting, e.g., pep8, black, ruff
c. documentation, e.g., Sphinx, mkdocs
2. Automation
a. dependency & environment management
b. building & publishing
c. versioning, e.g., semantic versioning
d. testing, linting/formatting, type checking
3. Easy to Use!
Semantic Versioning
‣ tells developers what to expect
‣ avoids dependency hell for developers using your software
‣ necessary for requirement specifiers like ~= 2.21 or ^2.2.21 (Poetry only)
More Details: https://www.geeksforgeeks.org/introduction-semantic-versioning/ and https://semver.org/
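To make the two specifiers concrete, here is a minimal sketch of the version ranges they accept. It is a simplification, not how pip or Poetry implement matching: it handles plain numeric versions only, and the function names are made up for illustration.

```python
"""Sketch: which versions do `~=` (PEP 440) and `^` (Poetry) accept?"""


def parse(version: str) -> tuple[int, ...]:
    """Turn '2.2.21' into (2, 2, 21). Pre-releases etc. are not handled."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(candidate: str, spec: str) -> bool:
    """`~= X.Y`: at least X.Y, and equal to the spec once its last part is dropped."""
    cand, base = parse(candidate), parse(spec)
    if len(base) < 2:
        raise ValueError("`~=` needs at least two version parts, e.g. 2.21")
    prefix = base[:-1]
    return cand >= base and cand[: len(prefix)] == prefix


def caret(candidate: str, spec: str) -> bool:
    """`^X.Y.Z`: at least X.Y.Z, without changing the left-most non-zero part."""
    cand, base = parse(candidate), parse(spec)
    # Position of the first non-zero part, e.g. 0 for 2.2.21 and 1 for 0.3.1.
    pivot = next((i for i, n in enumerate(base) if n != 0), len(base) - 1)
    upper = base[:pivot] + (base[pivot] + 1,)  # exclusive upper bound
    return base <= cand < upper


print(compatible_release("2.30", "2.21"))  # same major version, newer minor
print(caret("3.0.0", "2.2.21"))            # next major version
```

Both specifiers only give useful guarantees if the dependency actually follows semantic versioning, which is the point of the slide.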
This is not a talk about the best Package Management Tool
Source: An unbiased evaluation of environment management and packaging tools (https://www.inovex.de/de/blog/)
Streamlined Project Setup
🐣 Hatch, the extensible Python project manager (by Ofek Lev)
‣ reproducibly building & publishing packages
‣ robust environment management with support for custom scripts
‣ easy Python management, replacing pyenv
‣ easy semantic versioning based on Git tags
‣ sophisticated testing within various environments, replacing tox
Project Directory Structure
‣ folders for
∙ source files
∙ documentation
∙ tests
‣ human-readable information
∙ README.md
∙ …
‣ configuration files
∙ pyproject.toml
∙ …
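As an illustration of such a structure, the following sketch creates an empty skeleton in a temporary directory. The concrete folder and file names (src, tests, docs, my_pkg, index.md) are assumptions for a src-layout project; in practice a template generates this for you.

```python
"""Sketch: create an empty src-layout project skeleton."""
import tempfile
from pathlib import Path


def create_skeleton(root: Path, package: str) -> list[Path]:
    """Create empty placeholder files for a src-layout project below `root`."""
    files = [
        root / "src" / package / "__init__.py",  # source files
        root / "tests" / "test_example.py",      # tests
        root / "docs" / "index.md",              # documentation
        root / "README.md",                      # human-readable information
        root / "pyproject.toml",                 # configuration
    ]
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return files


with tempfile.TemporaryDirectory() as tmp:
    created = create_skeleton(Path(tmp), "my_pkg")  # "my_pkg" is a hypothetical name
    layout = sorted(p.relative_to(tmp).as_posix() for p in created)

print("\n".join(layout))
```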
All-in-One Configuration with pyproject.toml
‣ defines the build system
‣ metadata about your project for PyPI
‣ configuration for (almost) all tools
∙ pytest
∙ mypy
∙ ruff
∙ coverage
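A minimal sketch of what such a file can look like, embedded in a short Python script that parses it to show the three roles listed above. The project name, version and the chosen option values are placeholders, not recommendations from the talk.

```python
"""Sketch: one pyproject.toml holding build system, metadata and tool settings."""
try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:  # older interpreters need the `tomli` backport
    import tomli as tomllib

PYPROJECT = """
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "my-package"            # hypothetical project name
version = "0.1.0"
description = "Example project"
requires-python = ">=3.10"
dependencies = []

[tool.pytest.ini_options]
testpaths = ["tests"]

[tool.mypy]
strict = true

[tool.ruff]
line-length = 88

[tool.coverage.run]
branch = true
"""

config = tomllib.loads(PYPROJECT)
build_backend = config["build-system"]["build-backend"]
project_name = config["project"]["name"]
configured_tools = sorted(config["tool"])

print(build_backend, project_name, configured_tools)
```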
Automation with Scripts!
Scripts in pyproject.toml for automation of tasks, e.g.
∙ running unit tests with or without coverage, debugging,
∙ building the documentation,
∙ running the linters, code checks, mypy,
∙ …
> hatch run test:cov
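The command above follows the pattern `hatch run <environment>:<script>`. The sketch below shows how scripts can be declared per environment in pyproject.toml and how a name like test:cov maps to a command. The script bodies are assumptions, and the `resolve` helper only illustrates the naming scheme; it is not how Hatch works internally.

```python
"""Sketch: Hatch environment scripts and the `<env>:<script>` naming."""
try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:  # older interpreters need the `tomli` backport
    import tomli as tomllib

PYPROJECT = """
[tool.hatch.envs.test]
dependencies = ["pytest", "pytest-cov"]

[tool.hatch.envs.test.scripts]
no-cov = "pytest {args}"
cov = "pytest --cov=my_package {args}"   # my_package is a hypothetical name

[tool.hatch.envs.docs]
dependencies = ["mkdocs"]

[tool.hatch.envs.docs.scripts]
build = "mkdocs build"
"""


def resolve(target: str, config: dict) -> str:
    """Look up the command behind an `<env>:<script>` target (illustration only)."""
    env, _, script = target.partition(":")
    return config["tool"]["hatch"]["envs"][env]["scripts"][script]


config = tomllib.loads(PYPROJECT)
print(resolve("test:cov", config))
print(resolve("docs:build", config))
```

Because every task has a short, memorable name, nobody on the team has to remember the underlying command-line flags.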
Code Quality: Linting & Formatting
Ruff
‣ replaces tons of tools, e.g., flake8, autoflake, pydocstyle, …
‣ easy configuration via pyproject.toml
‣ extremely fast
‣ over 700 built-in rules
Type Checking: Are you my type?
Why mypy?
‣ Static type checking finds many errors before the code runs, often in edge cases.
‣ Type declarations act as machine-checked documentation, thus enhancing the dev experience.
Mypy Example
> hatch run lint:typing
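The code shown on the original slide is not part of this transcript. As a stand-in, here is a small sketch of the kind of edge case a type checker catches: a lookup that may return None. The data and function names are made up for illustration.

```python
"""Sketch: an Optional return value that mypy forces the caller to handle."""
from typing import Optional

USERS = {1: "ada", 2: "grace"}  # hypothetical data


def find_user(user_id: int) -> Optional[str]:
    """Return the user name, or None if the id is unknown."""
    return USERS.get(user_id)


def greet(user_id: int) -> str:
    name = find_user(user_id)
    # Returning `"Hello " + name.title()` right away would be rejected by mypy,
    # because `name` may be None here. The explicit check narrows the type to str.
    if name is None:
        return "Hello stranger"
    return "Hello " + name.title()


print(greet(1))
print(greet(3))
```

Without the None check the code would still run fine for known ids and only fail at runtime for unknown ones, which is exactly the kind of error that is easy to miss in tests.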
Testing with pytest & hatch
pytest
‣ de facto standard for unit testing
‣ powerful features like fixtures, etc.
‣ tons of useful plugins, e.g.:
∙ pytest-cov for coverage
∙ pytest-recording for mocking calls to external services
∙ pytest-sugar to make it easier on the eyes
hatch & tox
‣ isolated environments for testing different Python versions and dependency combinations
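A minimal sketch of a pytest-style test that uses a fixture. The function under test is made up for illustration; `tmp_path` is one of pytest's built-in fixtures and is injected automatically when the test is run with pytest.

```python
"""Sketch: a small function and a pytest test using the built-in tmp_path fixture."""
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count the non-empty lines of a text file."""
    return sum(1 for line in path.read_text().splitlines() if line.strip())


def test_count_lines_ignores_blank_lines(tmp_path: Path) -> None:
    # pytest passes a fresh temporary directory as `tmp_path` for every test.
    sample = tmp_path / "sample.txt"
    sample.write_text("first\n\nsecond\n")
    assert count_lines(sample) == 2
```

Saved as a file under tests/, this is collected by pytest without any registration or boilerplate, since test discovery relies on the test_ naming convention.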
Automated QA with pre-commit
Avoiding human errors by automated checks on every git commit
Automation with CI/CD
‣ Automatic and reproducible testing
‣ Publishing packages based on Git tags
‣ Established branching strategy, e.g., GitHub Flow, for efficient collaboration
‣ Scalability and adaptability when needed
‣ Automated deployments, building of documentation, etc.
More Details: Data Science in Production: Packaging, Versioning and Continuous Integration (https://www.inovex.de/de/blog/)
Conclusion
‣ unified configuration in pyproject.toml
‣ standardized folder structure with src-layout and useful README.md
‣ easy package management and automation with hatch
‣ automated QA with ruff, pytest, pre-commit, mypy, CI/CD
‣ proper documentation with mkdocs
‣ automation & conventions are key!
https://github.com/FlorianWilhelm/the-hatchlor
Check out the Hatchlor!
⭐
CHEERS TO THE COMMUNITY
Credits & Resources
‣ Ofek Lev, the creator of hatch, for his awesome work in his spare time ❤
‣ Michael Hofmann from inovex who made these awesome slides
© 2023
Thank you!
Dr. Florian Wilhelm
Head of Data Science
PyConDE / PyData 2024
inovex.de
florian.wilhelm@inovex.de
@inovexlife
@inovexgmbh
