dominikhei/README.md

Hey there

I am a German student who is passionate about data engineering and infrastructure. I have work experience in data and software engineering. In my free time I like learning new skills and concepts and building completely overengineered projects, which are showcased on my GitHub.

Tech Stack:

Java Python MySQL

Apache Spark Apache Airflow dbt Trino

Docker AWS Terraform

Linux Git

Example projects, if you are interested in ...

Infrastructure / Data Platform / Data Engineering 🏗

Open source contributions 💡

| Project | Added | Link |
| --- | --- | --- |
| Apache Airflow | Functionality and respective unit tests to export and import roles, including permissions, using the Airflow CLI | Merged Pull-Request |
| Apache Airflow | Changed the Airflow docker-compose to easily ingest custom config files and added relevant documentation | Merged Pull-Request |
| PM4PY | Functionality to filter for a maximum coverage percentage of graph variants | Merged Pull-Request |

Pinned

  1. eartquake-streaming Public

    Distributed system on AWS that extracts earthquake data in real time using Apache Kafka and displays it on a load-balanced frontend

    HCL 3 1

  2. aws-elt Public

    ELT pipeline built on AWS that extracts data from two API endpoints and carries out transformations using dbt. The individual steps are scheduled by Apache Airflow.

    Python 3

  3. terraform-ecr-build-push-image Public

    Terraform provider to build Docker images and push them to AWS ECR

    Go

  4. Local-Data-LakeHouse Public

    Sample data lakehouse deployed in Docker containers using Apache Iceberg, MinIO, Trino and a Hive Metastore. Can be used for local testing.

    Dockerfile 51 8

  5. pm4py/pm4py-core Public

    Public repository for the PM4Py (Process Mining for Python) project.

    Python 686 272

  6. apache/airflow Public

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Python 35.7k 13.9k