Alexander Batrakov AlexBatrakov

Hi, I'm Alexander Batrakov 👋

Data Science · Data Analytics · Machine Learning · Software Engineering
Physics PhD researcher building reproducible data, ML, and software systems, with a focus on analytics workflows, model evaluation, backend-supported experiment platforms, and scientific software.

What I build

I focus on turning messy, ambiguous, or computation-heavy problems into structured, testable, and reproducible systems.

Data Analytics & Data Science: data quality, SQL marts, feature engineering, time-aware validation, statistical diagnostics.
Machine Learning: controlled experiments, baselines, multi-seed evaluation, calibration, error analysis, model serving.
Backend & Experiment Platforms: FastAPI services, database-backed run state, async workers, Docker/CI execution paths for reviewers.
Scientific & Numerical Software: Julia packages, external model-fitting workflows, residual diagnostics, parameter search, ODE-based simulation.

Recommended starting points

For Data Analytics / Data Science: Wearable Analytics — SQL marts, data quality, feature engineering, time-aware validation.
For Machine Learning / ML Engineering: PyTorch Pets Classifier — MLflow experiments, multi-seed evaluation, diagnostics, FastAPI/Docker deployment.
For Backend / Experiment Platforms: Evalynx — FastAPI, PostgreSQL, Redis/RQ, async run orchestration, CI smoke tests.

Selected projects

Wearable Analytics

Privacy-first analytics workflow on real Garmin data: sanitized pipelines, quality labels, SQL marts, feature engineering, leakage-aware modeling, and tests.

Signals: Data Analytics · Time Series · SQL · ML Evaluation
Evidence: 52,812 model configs · 150 tests · DuckDB/PostgreSQL marts · data quality reports
Stack: Python · pandas · scikit-learn · DuckDB · PostgreSQL · pytest

PyTorch Pets Classifier

Reproducible computer-vision workflow with MLflow tracking, multi-seed model selection, diagnostics, calibration, FastAPI serving, Docker, and an Azure deployment path.

Signals: Machine Learning · Model Reliability · Deployment
Evidence: 21 configs · 3-seed evaluation · 62 tests · FastAPI/Docker/Azure path
Stack: Python · PyTorch · MLflow · FastAPI · Docker · Azure

Evalynx

FastAPI/PostgreSQL/Redis control plane for reproducible computational runs: API submission, async workers, retries, metrics, artifacts, and CI smoke tests.

Signals: Backend · Run Orchestration · Experiment Platform
Evidence: 8 API endpoints · PostgreSQL run state · Redis/RQ workers · 29 tests
Stack: Python · FastAPI · PostgreSQL · Redis/RQ · SQLAlchemy · Docker Compose

Solo Wargame AI

Deterministic hidden-information simulator with legal-action masks, replay traces, fixed-seed benchmarks, heuristic/search/learned agents, and evaluation tooling.

Signals: Simulation · AI Agents · ML Evaluation
Evidence: deterministic replay · legal-action masks · fixed-seed benchmarks · 356 tests
Stack: Python · PyTorch · Gym-style wrappers · TOML · JSON/JSONL · SQLite · pytest

GravityToolsNext.jl

Julia framework for external model-fitting workflows with structured metrics, residual diagnostics, white-noise fitting, priors, and adaptive parameter search.

Signals: Statistical Diagnostics · Experiment Orchestration · Scientific Computing
Evidence: structured run artifacts · residual diagnostics · prior-aware search · docs/tests
Stack: Julia · TEMPO/TEMPO2 · HypothesisTests · KernelDensity · Optim · JLD2 · Docker

StructureSolver.jl

Julia scientific-computing package for ODE-based simulation workflows with equation-of-state abstractions, shooting methods, parameter scans, and sensitivity analysis.

Signals: Numerical Methods · Sensitivity Analysis · Julia
Evidence: ODE workflows · shooting methods · ForwardDiff sensitivities · 69 tests
Stack: Julia · DifferentialEquations.jl · ForwardDiff.jl · NLsolve.jl · Documenter.jl · GitHub Actions

What makes these projects reviewable

documented setup and reproducible run paths;
tests, CI checks, and structured artifacts;
baseline comparisons, metrics, diagnostics, and failure cases;
clear separation between data processing, modeling, evaluation, and serving;
public repositories with code, documentation, and project-specific README files.

How I work

Across projects, I emphasize:

explicit data contracts and clear interfaces;
reproducible runs and inspectable intermediate artifacts;
baselines, diagnostics, and conservative validation;
automated tests, documentation, CI, and reviewer-friendly execution paths;
translating complex domain problems into maintainable software.

Open to roles

Open to roles in Data Science, Data Analytics, Machine Learning Engineering, and Software Engineering where reproducibility, evaluation, and analytical systems matter.
