Intelligent Learning Analytics & Agentic AI Study Coach

Unified Project: Milestone 1 + Milestone 2

Quick Navigation

Milestone 1 — ML-Based Student Performance Prediction
Milestone 2 — Agentic AI Study Coach
System Architecture
Tech Stack
Getting Started
Project Structure
Roadmap

Milestone 1 — ML-Based Student Performance Prediction

Overview

Milestone 1 implements a machine learning pipeline that predicts student academic performance from demographic, academic, and behavioral data.

The system provides three outputs:

Exam score prediction (regression)
Pass/Fail classification
Learner segmentation using clustering

A Streamlit dashboard lets users enter student details and view the resulting predictions interactively.

Live Demo:
https://predictive-learning-analytics-ml.streamlit.app/

Models & Performance

Model	Task	Performance
Linear Regression	Predict ExamScore	R² = 0.9397
Logistic Regression	Pass/Fail	Accuracy = 91.76%
K-Means Clustering	Learner Segmentation	Silhouette = 0.2112
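
For readers unfamiliar with the metrics in the table, the sketch below shows how R² and accuracy can be computed from predictions. It uses only the standard library, and the sample values are invented for illustration; they are not project results.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true, labels_pred):
    """Fraction of predictions that match the true label."""
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


# Illustrative values only (not taken from the project's dataset).
scores_true = [60.0, 70.0, 80.0, 90.0]
scores_pred = [62.0, 68.0, 81.0, 89.0]
print(round(r2_score(scores_true, scores_pred), 4))
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```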

Key Design Decisions

WritingScore used as the proxy target for ExamScore (no leakage risk)
Median-based Pass/Fail threshold (69.0)
Manual ordinal encoding for education-related features
Midpoint encoding for study hours
class_weight='balanced' instead of SMOTE
k=3 clustering aligned with interpretability requirement
Train-only scaling to avoid data leakage
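
Three of the decisions above (midpoint encoding, manual ordinal encoding, and the median-based threshold) can be sketched as follows. The bucket labels, the education ranking, the field names, and the choice to count a score equal to the median as a pass are all assumptions made for illustration; the project's actual mappings may differ.

```python
from statistics import median

# Hypothetical study-hour buckets mapped to a representative midpoint.
STUDY_HOURS_MIDPOINT = {"< 5": 2.5, "5 - 10": 7.5, "> 10": 12.5}

# Hypothetical manual ranking for an education-level feature.
EDUCATION_RANK = {
    "some high school": 0,
    "high school": 1,
    "some college": 2,
    "associate's degree": 3,
    "bachelor's degree": 4,
    "master's degree": 5,
}


def encode(record):
    """Replace categorical strings with numeric values."""
    return {
        "study_hours": STUDY_HOURS_MIDPOINT[record["study_hours"]],
        "parent_educ": EDUCATION_RANK[record["parent_educ"]],
    }


def pass_fail_labels(scores):
    """Label a score 1 (pass) if it is at or above the median, else 0."""
    threshold = median(scores)
    return threshold, [int(s >= threshold) for s in scores]


# Illustrative values only.
print(encode({"study_hours": "5 - 10", "parent_educ": "high school"}))
print(pass_fail_labels([55, 62, 69, 74, 88]))
```

A median split yields roughly balanced classes by construction, which fits the choice of class_weight='balanced' over synthetic oversampling.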

Dataset

30,640 student records
14 original features
11 engineered features

Source: Kaggle Students Exam Scores Extended Dataset

Preprocessing Pipeline

Data cleaning and normalization
Missing value imputation
Ordinal + one-hot encoding
Outlier handling (IQR method)
Train-test split (80/20)
Feature scaling (StandardScaler, fit on the training split only)
Model training and evaluation
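
The outlier and scaling steps can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: outliers are clipped to the IQR fences (the project may drop them instead), the helper names are hypothetical, and the scaler is a hand-rolled equivalent of standardization rather than the StandardScaler object itself. The key point is that the mean and standard deviation come from the training split only.

```python
import random
from statistics import mean, pstdev, quantiles


def iqr_bounds(values, k=1.5):
    """Return the lower and upper IQR fences."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_outliers(values, k=1.5):
    """Clip values that fall outside the IQR fences (assumed strategy)."""
    low, high = iqr_bounds(values, k)
    return [min(max(v, low), high) for v in values]


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_scaler(train_values):
    """Learn mean and population standard deviation from training data only."""
    return mean(train_values), pstdev(train_values)


def transform(values, mu, sigma):
    """Standardize values with statistics learned elsewhere."""
    return [(v - mu) / sigma for v in values]


# Illustrative values; 60.0 is a planted outlier.
hours = [2.5, 7.5, 7.5, 12.5, 7.5, 2.5, 7.5, 12.5, 7.5, 60.0]
cleaned = clip_outliers(hours)
train, test = train_test_split(cleaned)
mu, sigma = fit_scaler(train)          # the test split is never seen here
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)
```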

Milestone 2 — Agentic AI Study Coach

Overview

Milestone 2 transforms the ML system into a conversational agentic AI tutor.

Instead of static inputs, students interact through natural language, and the system dynamically decides:

What analysis to run
What knowledge to retrieve
Whether to generate a study plan or quiz
How to respond using ML + LLM reasoning

This is implemented using LangGraph-based multi-agent orchestration.

Key Features

1. Conversational AI Interface

Students interact via chat instead of form inputs.

2. LangGraph Agent System

A structured graph of specialized nodes:

Analyser Node (ML inference)
Retriever Node (RAG system)
Planner Node (study plan generation)
Quizzer Node (MCQ system)
End Node (response finalization)
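
To make the node graph concrete, here is a plain-Python stand-in for the routing idea. It does not use LangGraph's API; the intent labels, the routing rule, and the state fields are assumptions for illustration. Each node mutates a shared state and returns the name of the next node.

```python
def analyser(state):
    state["trace"].append("analyser")   # ML inference would run here
    return "retriever"


def retriever(state):
    state["trace"].append("retriever")  # knowledge lookup would run here
    # Hypothetical routing rule based on the detected intent.
    if state["intent"] == "plan":
        return "planner"
    if state["intent"] == "quiz":
        return "quizzer"
    return "end"


def planner(state):
    state["trace"].append("planner")
    return "end"


def quizzer(state):
    state["trace"].append("quizzer")
    return "end"


def end(state):
    state["trace"].append("end")        # response finalization
    return None


NODES = {
    "analyser": analyser,
    "retriever": retriever,
    "planner": planner,
    "quizzer": quizzer,
    "end": end,
}


def run(intent, entry="analyser"):
    """Walk the graph from the entry node until a node returns None."""
    state = {"intent": intent, "trace": []}
    current = entry
    while current is not None:
        current = NODES[current](state)
    return state["trace"]


print(run("plan"))
print(run("quiz"))
```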

3. Machine Learning Integration

Extracts student data from natural language
Runs all Milestone 1 models dynamically
Predicts:
- Exam score
- Pass/Fail status
- Learner category
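
As a rough illustration of turning a chat message into model inputs, the sketch below uses regular expressions. This is a simplified stand-in: the field names and patterns are hypothetical, and the project may rely on the LLM rather than regexes for this step.

```python
import re

# Hypothetical patterns; each captures one numeric value.
PATTERNS = {
    "study_hours": re.compile(r"(\d+(?:\.\d+)?)\s*hours?", re.IGNORECASE),
    "math_score": re.compile(r"(\d+(?:\.\d+)?)\s*(?:in|on)\s+math", re.IGNORECASE),
}


def extract_features(message):
    """Return the numeric fields that could be found in the message."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(message)
        if match:
            found[name] = float(match.group(1))
    return found


print(extract_features("I study 6 hours a week and scored 72 in math"))
```

Fields that are missing from the message are simply absent from the result, so a caller can ask a follow-up question before running the models.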

4. Retrieval-Augmented Generation (RAG)

FAISS vector database
Sentence-transformer embeddings (MiniLM)
Academic knowledge base
Optional Tavily web search fallback
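
The retrieval step boils down to ranking knowledge-base passages by similarity to the query. The sketch below shows that idea with bag-of-words vectors and cosine similarity, which is a deliberately simple substitute for MiniLM embeddings and a FAISS index; the sample passages are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


# Invented passages, not taken from the project's knowledge folder.
DOCS = [
    "Spaced repetition improves long-term retention of study material.",
    "The Pythagorean theorem relates the sides of a right triangle.",
    "Active reading means summarising each paragraph in your own words.",
]

print(retrieve("how do I remember study material", DOCS, k=1))
```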

5. Personalized Study Plans

7-day structured study plan
Generated from retrieved knowledge
Adapted to learner category:
- At-Risk → fundamentals + revision
- Average → balanced learning
- High Performer → advanced + challenge tasks
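
The category-based adaptation can be pictured as a lookup that steers each day's focus. The sketch below is an assumption-heavy skeleton: the function name, the rotation of focus areas across days, and the output format are hypothetical, and in the project the plan text comes from the LLM and retrieved knowledge.

```python
# Focus areas per learner category, following the list above.
FOCUS_BY_CATEGORY = {
    "At-Risk": ["fundamentals", "revision"],
    "Average": ["balanced learning"],
    "High Performer": ["advanced topics", "challenge tasks"],
}


def build_plan(category, topic, days=7):
    """Return one line per day, rotating through the category's focus areas."""
    focus = FOCUS_BY_CATEGORY[category]
    return [
        f"Day {day + 1}: {topic} ({focus[day % len(focus)]})"
        for day in range(days)
    ]


for line in build_plan("At-Risk", "algebra"):
    print(line)
```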

6. AI Quiz System

5 MCQs per session
Auto-generated from retrieved content
Automatic grading + explanations
Performance-based feedback
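
Grading and feedback can be sketched as below. The Question structure, the feedback thresholds, and the feedback wording are assumptions for illustration; question generation itself is left to the LLM and is not shown.

```python
from dataclasses import dataclass


@dataclass
class Question:
    prompt: str
    options: list
    answer: int          # index of the correct option
    explanation: str


def grade(questions, responses):
    """Compare chosen option indices with the answer key."""
    results = []
    for question, chosen in zip(questions, responses):
        results.append({
            "prompt": question.prompt,
            "correct": chosen == question.answer,
            "explanation": question.explanation,
        })
    score = sum(r["correct"] for r in results)
    return score, results


def feedback(score, total):
    """Hypothetical thresholds for performance-based feedback."""
    ratio = score / total
    if ratio >= 0.8:
        return "Strong result: try more advanced material."
    if ratio >= 0.5:
        return "Good progress: review the questions you missed."
    return "Revisit the fundamentals before the next quiz."


quiz = [
    Question("2 + 2 = ?", ["3", "4", "5", "6"], 1, "2 + 2 equals 4."),
    Question("3 * 3 = ?", ["6", "9", "12", "3"], 1, "3 times 3 equals 9."),
]
score, results = grade(quiz, [1, 0])
print(score, feedback(score, len(quiz)))
```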

7. Persistent Memory

PostgreSQL (Neon)
Stores:
- chat history
- agent state
- session data
Allows conversation resumption
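
The storage pattern behind conversation resumption can be sketched as follows. To keep the example self-contained it uses an in-memory SQLite database from the standard library instead of PostgreSQL; the table names, columns, and helper functions are hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Create the two tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_state ("
        "session_id TEXT PRIMARY KEY, state_json TEXT NOT NULL)"
    )
    return conn


def save_message(conn, session_id, role, content):
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def save_state(conn, session_id, state):
    """Overwrite the stored agent state for this session."""
    conn.execute(
        "INSERT OR REPLACE INTO agent_state (session_id, state_json) VALUES (?, ?)",
        (session_id, json.dumps(state)),
    )
    conn.commit()


def resume(conn, session_id):
    """Load chat history (oldest first) and the last saved agent state."""
    history = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    row = conn.execute(
        "SELECT state_json FROM agent_state WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    return history, (json.loads(row[0]) if row else {})


conn = open_store()
save_message(conn, "s1", "user", "Make me a study plan")
save_message(conn, "s1", "assistant", "Here is your 7-day plan")
save_state(conn, "s1", {"category": "Average"})
print(resume(conn, "s1"))
```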

8. Academic Guardrails

Blocks non-academic queries
Prevents cheating requests
Ensures safe tutoring behavior
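
A guardrail of this kind can be approximated with a simple rule-based check, shown below. The patterns, keyword list, and return strings are hypothetical, and the project's guardrails may instead use the LLM or a richer classifier.

```python
import re

# Hypothetical phrases that signal a cheating request.
CHEATING_PATTERNS = [
    r"\bdo my (homework|assignment)\b",
    r"\bexam answers\b",
    r"\bwrite my essay\b",
]

# Hypothetical vocabulary that marks a query as academic.
ACADEMIC_KEYWORDS = {
    "study", "exam", "quiz", "math", "algebra",
    "reading", "writing", "score", "plan", "revision",
}


def check(message):
    """Return 'allowed' or a short reason why the message is blocked."""
    text = message.lower()
    if any(re.search(pattern, text) for pattern in CHEATING_PATTERNS):
        return "blocked: cheating request"
    words = set(re.findall(r"[a-z]+", text))
    if not words & ACADEMIC_KEYWORDS:
        return "blocked: non-academic query"
    return "allowed"


print(check("Can you make me a study plan for algebra?"))
print(check("Please write my essay on Hamlet"))
print(check("What's the weather today?"))
```

Checking for cheating phrases first matters here, because a request such as "give me the exam answers" also contains academic vocabulary.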

System Architecture


        User (Streamlit UI)
                 |
                 v
   LangGraph Agent (Master Node)
                 |
                 v
+----------------------------------+
| Specialist Nodes                 |
| - Analyser (ML Models)           |
| - Retriever (RAG System)         |
| - Planner (Study Plan Generator) |
| - Quizzer (MCQ System)           |
| - End Node                       |
+----------------------------------+
                 |
                 v
   PostgreSQL (Persistent Memory)

Tech Stack

Layer	Technology
ML Models	Scikit-Learn
Agent Framework	LangGraph
LLM	Groq (Llama 3.3 70B)
Embeddings	all-MiniLM-L6-v2
Vector DB	FAISS
Backend	Python
UI	Streamlit
Database	PostgreSQL (Neon)
Deployment	Streamlit Cloud
Web Search	Tavily API

Getting Started

Clone Repository

git clone https://github.com/sathvik89/Predictive-Learning-Analytics_ML.git
cd Predictive-Learning-Analytics_ML

Create Environment

python -m venv venv
source venv/bin/activate   # Mac/Linux
venv\Scripts\activate      # Windows

Install Dependencies

pip install -r requirements.txt

Run Application

streamlit run app.py

Project Structure

predictive-learning-analytics
├── __pycache__
│   ├── app.cpython-313.pyc
│   └── styles.cpython-313.pyc
├── agent
│   ├── __init__.py
│   ├── __pycache__
│   ├── chat_history.py
│   ├── formatting.py
│   ├── graph.py
│   ├── guardrails.py
│   ├── ml_pipeline.py
│   ├── nodes.py
│   ├── rag.py
│   ├── session_context.py
│   └── state.py
├── app_errors.log
├── app.py
├── chat_history.db
├── Data
│   ├── processed
│   └── raw
├── knowledge
│   ├── academic_coaching.md
│   ├── algebra_geometry_trig.md
│   ├── math_foundations.md
│   ├── performance_intervention.md
│   ├── reading_comprehension.md
│   ├── README.md
│   ├── statistics_probability.md
│   ├── study_skills.md
│   └── writing_skills.md
├── models
│   ├── kmeans_model.pkl
│   ├── linear_model.pkl
│   ├── logistic_model.pkl
│   ├── scaler_clf.pkl
│   ├── scaler_cluster.pkl
│   └── scaler_reg.pkl
├── modules
│   ├── __pycache__
│   ├── components.py
│   ├── home.py
│   ├── icons.py
│   ├── model_loader.py
│   ├── performance.py
│   ├── predict.py
│   ├── sidebar.py
│   └── styling.py
├── notebooks
│   ├── AgenticAI_Practice_Roughbook.ipynb.ipynb
│   ├── Cleaned__Notebook.ipynb
│   └── GenAi_Project_Predictive_learning.ipynb
├── README.md
├── Report
│   └── GenAi_Final_Report_v2.pdf
├── requirements.txt
├── styles.py
└── venv
    ├── bin
    ├── etc
    ├── include
    ├── lib
    ├── pyvenv.cfg
    └── share

Roadmap

Completed

ML-based student prediction system
Streamlit dashboard
LangGraph agent system
RAG pipeline
Quiz system
Persistent memory

Future Work

User authentication system
Step-by-step adaptive quiz flow
Larger academic knowledge base
Student analytics dashboard
Mobile UI optimization
Personalized long-term learning tracking

Project Summary

This project demonstrates the evolution from a traditional machine learning pipeline into a fully agentic AI tutoring system. It integrates predictive modeling, retrieval-augmented generation, and multi-agent orchestration to create a system that not only analyzes student performance but actively supports learning through conversation, planning, and assessment.

Built for Gen AI Course — Milestone 1 & 2
Sathvik Koriginja | Anushka Tyagi | Apoorva Choudhary
