End-to-end data platform for processing, analyzing, and visualizing fleet driver violation reports.
This project demonstrates a complete data engineering pipeline:
- file upload API
- ETL pipeline in Python
- PostgreSQL data warehouse
- BI dashboards in Metabase
- Dockerized infrastructure
Excel files
↓
FastAPI (/upload)
↓
ETL Pipeline (Python)
↓
PostgreSQL
↓
Metabase Dashboards
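
The hand-off from upload to ETL in the flow above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code: `store_upload`, `handle_upload`, and the `run_etl` callback are hypothetical names, and in the real service this logic would presumably sit behind the FastAPI `/upload` route and write into `services/etl/incoming/`.

```python
from pathlib import Path
from typing import Callable


def store_upload(incoming_dir: Path, filename: str, data: bytes) -> Path:
    """Write an uploaded report into the incoming directory and return its path."""
    # Keep only the final path component of the client-supplied name.
    safe_name = Path(filename).name
    if not safe_name.lower().endswith(".xls"):
        raise ValueError(f"expected a .xls report, got {safe_name!r}")
    incoming_dir.mkdir(parents=True, exist_ok=True)
    target = incoming_dir / safe_name
    target.write_bytes(data)
    return target


def handle_upload(
    incoming_dir: Path,
    filename: str,
    data: bytes,
    run_etl: Callable[[Path], int],
) -> dict:
    """Store the file, then immediately run the ETL step on it.

    `run_etl` is a hypothetical callback that takes the stored file path
    and returns the number of rows it loaded into the warehouse.
    """
    path = store_upload(incoming_dir, filename, data)
    rows_loaded = run_etl(path)
    return {"file": path.name, "rows_loaded": rows_loaded}
```

Passing the ETL step in as a callback keeps the sketch independent of Pandas and PostgreSQL; an API handler would call `handle_upload` with the uploaded bytes and the real pipeline function.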
- 📤 Upload legacy .xls reports via API
- 🔄 Automatic ETL processing after upload
- 🧹 Idempotent processing (same file is not imported twice)
- 🗄️ PostgreSQL as data warehouse
- 📊 Metabase dashboards:
  - drivers overview
  - violations per driver
  - driver profile (per company / file)
- 🔍 REST API:
  - /drivers
  - /drivers/{id}/profile
  - /upload
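
One common way to get the idempotent behaviour listed above is to key each import on a hash of the file content, so a re-uploaded or renamed copy is detected and skipped. The sketch below is an assumption about how this could work, not a description of this project's implementation: the `imported_files` table and the function names are hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs on its own.

```python
import hashlib
import sqlite3


def file_sha256(data: bytes) -> str:
    """Return the hex SHA-256 digest of the raw file content."""
    return hashlib.sha256(data).hexdigest()


def make_registry() -> sqlite3.Connection:
    """Create an in-memory registry (SQLite stand-in for the real warehouse)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE imported_files ("
        "sha256 TEXT PRIMARY KEY, filename TEXT NOT NULL)"
    )
    return conn


def register_file(conn: sqlite3.Connection, filename: str, data: bytes) -> bool:
    """Record the file and return True if it is new, False if already imported."""
    digest = file_sha256(data)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO imported_files (sha256, filename) VALUES (?, ?)",
                (digest, filename),
            )
    except sqlite3.IntegrityError:
        # The primary key already holds this digest: same content seen before.
        return False
    return True
```

The ETL step would then run only when `register_file` returns `True`. Letting the primary-key constraint reject duplicates, rather than checking with a separate `SELECT` first, avoids a race between two concurrent uploads of the same file.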
- Python 3.12
- FastAPI
- Pandas
- SQLAlchemy
- PostgreSQL
- Metabase
- Docker / Docker Compose
fleet-data-platform/
├── services/
│   ├── api/
│   │   └── app/
│   │       ├── main.py
│   │       ├── db.py
│   │       └── models.py
│   ├── etl/
│   │   ├── incoming/
│   │   ├── parsers/
│   │   │   └── legacy_xls_parser.py
│   │   ├── etl_pipeline.py
│   │   └── models.py
│   └── venv/
├── storage/
│   ├── postgres/
│   └── metabase/
├── docker-compose.yml
└── README.md
