Binyam Sisay Bina-man

 ██████╗ ██╗███╗   ██╗██╗   ██╗ █████╗ ███╗   ███╗
 ██╔══██╗██║████╗  ██║╚██╗ ██╔╝██╔══██╗████╗ ████║
 ██████╔╝██║██╔██╗ ██║ ╚████╔╝ ███████║██╔████╔██║
 ██╔══██╗██║██║╚██╗██║  ╚██╔╝  ██╔══██║██║╚██╔╝██║
 ██████╔╝██║██║ ╚████║   ██║   ██║  ██║██║ ╚═╝ ██║
 ╚═════╝ ╚═╝╚═╝  ╚═══╝   ╚═╝   ╚═╝  ╚═╝╚═╝     ╚═╝
 Sr. Data Scientist · ML Systems · LLM Engineering · AdTech

Senior Data Scientist specializing in end-to-end ML system design, spanning quantile regression, temporal clustering, causal inference, anomaly detection, and LLM-powered agent pipelines. I build production systems for latency-sensitive AdTech environments, where statistical rigor and engineering precision drive measurable outcomes at scale.

⚡ ML Architecture · Pipeline as Neural Net

   EDA              EMBED             MODEL             SERVE           FEEDBACK
   ───              ─────             ─────             ─────           ────────

  Polars ●─────────● Word2Vec ───────● LightGBM ───────● Lambda ───────● Thompson
         ╲         ╱╲        ╲      ╱╲          ╲     ╱╲        ╲     ╱
          ╲       ╱  ╲        ╲    ╱  ╲           ╲   ╱  ╲        ╲   ╱
   Kafka  ●─────────● BERT    ─────── ● DCN    ───────── ● O(1)   ───● Elasticity
          ╱       ╲  ╱        ╱    ╲  ╱           ╱   ╲  ╱        ╱   ╲
         ╱         ╲╱        ╱      ╲╱            ╱    ╲╱         ╱     ╲
     DSP ●─────────● HMM     ───────● Iso.Forest ───────● RT infer──────● AutoLoop

          └─────────────────────────────────────────────────────────────────┘
                         hourly recalibration feedback arc

🤖 LLM Engineering · Agents · Retrieval

Agentic Pipeline Architecture

┌─────────────┐    ┌──────────────┐    ┌─────────────┐    ┌──────────────┐    ┌──────────────┐
│  User Query │───▶│   Planner    │───▶│  Tool Use   │───▶│  Reflection  │───▶│   Response   │
│             │    │  LangGraph   │    │  MCP · RAG  │    │ self-critique│    │   grounded   │
└─────────────┘    └──────────────┘    └─────────────┘    └──────────────┘    └──────────────┘
                          │                   │
                          ▼                   ▼
                   ┌────────────┐     ┌──────────────┐
                   │  Memory    │     │  Vector DB   │
                   │  store     │     │  FAISS·pgvec │
                   └────────────┘     └──────────────┘

Retrieval Stack

Layer           Method                  Detail
─────           ──────                  ──────
Embedding       Dense + sparse hybrid   BERT, Word2Vec, BM25 fusion
Indexing        HNSW approximate NN     200M+ document scale
Re-ranking      Cross-encoder           Precision boost post-retrieval
Entity linking  Custom taxonomy mapper  Publisher content → audience graph
Storage         FAISS · pgvector        On-prem and cloud portable
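
The embedding layer above fuses dense and BM25 results, but this README does not say which fusion rule is used. One common choice is reciprocal rank fusion (RRF). The sketch below is a minimal illustration under that assumption: the function name `rrf_fuse`, the constant `k=60`, and the sample doc IDs are all hypothetical, not taken from the actual pipeline.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc IDs with reciprocal rank fusion.

    Assumption: each retriever returns doc IDs ordered best-first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) to the document's score.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a dense retriever and a BM25 retriever.
dense = ["d3", "d1", "d2"]
sparse = ["d1", "d4", "d3"]
fused = rrf_fuse([dense, sparse])
```

RRF only needs ranks, so it avoids calibrating dense similarity scores against BM25 scores; a cross-encoder re-ranker can then be applied to the fused list.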

LLM Routing Strategy

Query complexity assessment
        │
        ├──▶ Simple retrieval   ──▶  Haiku  (fast · cheap)
        ├──▶ Structured output  ──▶  Sonnet (balanced)
        └──▶ Complex generation ──▶  Opus   (frontier)
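
The routing tree above can be sketched as a lookup from task kind to model tier. How query complexity is actually assessed is not described here, so the `assess` heuristic below (a `wants_json` flag and a word-count threshold) is purely a placeholder assumption, and the tier names are plain labels rather than real model identifiers.

```python
from enum import Enum


class TaskKind(Enum):
    SIMPLE_RETRIEVAL = "simple_retrieval"
    STRUCTURED_OUTPUT = "structured_output"
    COMPLEX_GENERATION = "complex_generation"


# Mapping taken from the routing diagram: task kind -> model tier label.
ROUTES = {
    TaskKind.SIMPLE_RETRIEVAL: "haiku",
    TaskKind.STRUCTURED_OUTPUT: "sonnet",
    TaskKind.COMPLEX_GENERATION: "opus",
}


def assess(query: str, wants_json: bool = False, long_query_words: int = 40) -> TaskKind:
    """Hypothetical complexity heuristic; a real system might use a classifier."""
    if wants_json:
        return TaskKind.STRUCTURED_OUTPUT
    if len(query.split()) > long_query_words:
        return TaskKind.COMPLEX_GENERATION
    return TaskKind.SIMPLE_RETRIEVAL


def route(query: str, wants_json: bool = False) -> str:
    """Return the model tier label for a query."""
    return ROUTES[assess(query, wants_json)]
```

Keeping the assessment step separate from the routing table means the heuristic can be swapped out without touching the tier mapping.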

Tools & Frameworks: LangGraph · LangSmith · PydanticAI · MCP · FAISS · pgvector · RAG

🏭 Production ML Pipeline

01 INGEST        02 FEATURES      03 MODEL         04 SERVE         05 FEEDBACK
─────────        ───────────      ─────────        ────────         ───────────
Polars · Kafka   Word2Vec         LightGBM QR      Lambda           Thompson MAB
Bidstream EDA    HMM states       DCN features     O(1) lookup      Auto-calibrate
DSP signals      GloVe embeds     Anomaly detect   Real-time        Price elasticity
     │                │                │                │                │
     └────────────────┴────────────────┴────────────────┴────────────────┘
                              Feedback loop (hourly recalibration)
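
The feedback stage above lists a Thompson Sampling multi-armed bandit. As a rough illustration of the technique (not the production implementation), the sketch below runs a Beta-Bernoulli bandit over a few candidate arms. The arm values, the success rates, and the binary reward are all made-up assumptions; a real floor optimizer would likely weight outcomes by revenue rather than a 0/1 signal.

```python
import random


class BetaBandit:
    """Thompson Sampling with a Beta(1, 1) prior per arm and binary rewards."""

    def __init__(self, arms, seed=None):
        self.arms = list(arms)
        self.alpha = {a: 1.0 for a in self.arms}  # 1 + observed successes
        self.beta = {a: 1.0 for a in self.arms}   # 1 + observed failures
        self.rng = random.Random(seed)

    def select(self):
        # Draw one sample from each arm's posterior and play the largest.
        draws = {a: self.rng.betavariate(self.alpha[a], self.beta[a]) for a in self.arms}
        return max(draws, key=draws.get)

    def update(self, arm, reward):
        if reward:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0


def simulate(true_rates, rounds=2000, seed=0):
    """Run the bandit against hypothetical per-arm success rates."""
    rng = random.Random(seed)
    bandit = BetaBandit(true_rates, seed=seed + 1)
    pulls = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = bandit.select()
        reward = rng.random() < true_rates[arm]
        bandit.update(arm, reward)
        pulls[arm] += 1
    return bandit, pulls


# Hypothetical arms (e.g. candidate floor prices) mapped to made-up success rates.
bandit, pulls = simulate({0.50: 0.2, 1.00: 0.6, 1.50: 0.3})
```

In an hourly recalibration loop, `update` would be called on logged outcomes and `select` would choose the arm for the next window; posteriors could also be decayed over time if the environment drifts.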

📊 Impact at a Glance

Metric                         Result
──────                         ──────
Bid request reduction          50%+
GCPM gain                      2×+
Pipeline speedup               10×
Directional decision accuracy  76%
Daily revenue lift             $44–$500
ID5 integration revenue        $10K+/day
Infra cost reduction           61% ($7.63 → $3/hr)

🧠 ML Competencies

Bid floor optimization    ████████████████████  95%
Quantile regression       ███████████████████   92%
LLM agents · LangGraph    ██████████████████    88%
Anomaly detection · IVT   ██████████████████    88%
Embedding · HNSW · RAG    █████████████████     87%
NLP · BERT · Word2Vec     █████████████████     86%
Thompson Sampling · MAB   ████████████████      83%
Hidden Markov Models      ████████████████      82%
Causal inference          ███████████████       80%

🔧 Tech Stack

Core ML

LLM & Agents

Data & Infrastructure

Cloud

🗂 Open Source Projects

📈 GitHub Analytics

Addis Ababa, ET · Open to remote · AdTech · ML Systems · LLM Engineering