Vector compression for high-dimensional data

A MATH 5110 (Applied Linear Algebra) course project. We survey five ways to shrink high-dimensional embedding vectors — Johnson–Lindenstrauss random projection, truncated SVD, 1-bit sign quantization, scalar quantization, and TurboQuant — implement each from scratch in NumPy, and test them on a real retrieval task: a RAG index built from the MATH 5110 Quarto textbook, scored against full-precision ground truth. The central finding is that preserving distances is not the same as preserving ranking.

These are pedagogical implementations of the underlying linear algebra, not a reproduction of LLM KV-cache inference.
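
The gap between distances and ranking can be measured directly. The sketch below is an illustration on synthetic Gaussian data, not the project's code: the function names (`jl_matrix`, `svd_basis`, `distance_distortion`, `recall_at_k`) and the sizes are assumptions chosen for the example. It compresses vectors from 256 to 64 dimensions with a Gaussian random projection and with a truncated SVD basis, then reports distance distortion and recall@10 as two separate numbers.

```python
import numpy as np

def jl_matrix(d, k, rng):
    """Gaussian JL matrix; the 1/sqrt(k) scale keeps squared norms unbiased."""
    return rng.standard_normal((d, k)) / np.sqrt(k)

def svd_basis(X, k):
    """Top-k right singular vectors of X as a (d, k) projection basis."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def top_k(index, query, k):
    """Row indices of the k nearest neighbors of `query` by Euclidean distance."""
    dists = np.linalg.norm(index - query, axis=1)
    return np.argsort(dists)[:k]

def recall_at_k(truth, approx):
    """Fraction of the ground-truth neighbors that the compressed search returned."""
    return len(set(truth.tolist()) & set(approx.tolist())) / len(truth)

def distance_distortion(X, q, P):
    """Mean relative error of query-to-row distances after projecting with P."""
    exact = np.linalg.norm(X - q, axis=1)
    approx = np.linalg.norm(X @ P - q @ P, axis=1)
    return float(np.mean(np.abs(approx - exact) / exact))

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 256))   # synthetic stand-in for embedding vectors
q = rng.standard_normal(256)          # synthetic query
truth = top_k(X, q, 10)               # full-precision ground truth

results = {}
projections = {"jl": jl_matrix(256, 64, rng), "svd": svd_basis(X, 64)}
for name, P in projections.items():
    distortion = distance_distortion(X, q, P)
    recall = recall_at_k(truth, top_k(X @ P, q @ P, 10))
    results[name] = (distortion, recall)
    print(f"{name}: distortion={distortion:.3f}  recall@10={recall:.2f}")
```

Synthetic Gaussian data has no low-rank structure, so the numbers printed here say nothing about real embeddings; the point is only that the two metrics are computed independently and need not move together.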

Read this first

Paper (docs/paper.pdf): full write-up, theory → methods → results.
Presentation: slide deck.

Demo

A small web app (SvelteKit + FastAPI) searches the textbook index and compares compressed methods side by side.

Compare mode runs one query against every index at once, so you can watch the ranking drift as the bit budget shrinks.

Single-index mode shows the retrieved chunks in full for one method.
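
The idea behind compare mode can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the web app's backend: the helper names (`scalar_quantize`, `sign_quantize`, `compare`), the global min/max quantization range, and the synthetic unit vectors are all chosen for the example. One query is scored against a full-precision index and several quantized copies, and each result is reported as its top-k overlap with the full-precision ranking.

```python
import numpy as np

def scalar_quantize(X, bits):
    """Uniform scalar quantization to 2**bits levels over the global value range.

    Returns the dequantized array so it can be searched like the original.
    """
    lo, hi = float(X.min()), float(X.max())
    levels = 2**bits - 1
    codes = np.round((X - lo) / (hi - lo) * levels)  # integer codes in [0, levels]
    return codes / levels * (hi - lo) + lo

def sign_quantize(X):
    """1-bit quantization: keep only the sign of each coordinate."""
    return np.where(X >= 0, 1.0, -1.0)

def top_k(index, query, k):
    """Row indices of the k highest inner-product scores."""
    return np.argsort(-(index @ query))[:k]

def compare(indexes, query, k, reference="full"):
    """Run one query against every index; report top-k overlap with the reference."""
    truth = set(top_k(indexes[reference], query, k).tolist())
    return {
        name: len(truth & set(top_k(idx, query, k).tolist())) / k
        for name, idx in indexes.items()
    }

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 128))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # synthetic unit vectors
query = rng.standard_normal(128)

indexes = {
    "full": X,
    "8-bit": scalar_quantize(X, 8),
    "4-bit": scalar_quantize(X, 4),
    "2-bit": scalar_quantize(X, 2),
    "1-bit sign": sign_quantize(X),
}
overlaps = compare(indexes, query, k=10)
for name, overlap in overlaps.items():
    print(f"{name:>10}: top-10 overlap with full precision = {overlap:.1f}")
```

Storing the dequantized floats keeps the example short; a real index would store the integer codes and dequantize (or score directly on codes) at query time.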

Setup

Requires uv and Python 3.12+. Set embedding credentials in .env:

Azure OpenAI:

AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://YOUR-RESOURCE.openai.azure.com/

Then set embeddings.provider: azure and embeddings.azure_deployment (your deployment name) in python/config.yaml.

Direct OpenAI:

OPENAI_API_KEY=sk-...

Then set embeddings.provider: openai in python/config.yaml.
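
Before running the pipeline, it can help to confirm that the variables for the chosen provider are actually set. The helper below is a hypothetical convenience, not part of the repository: `missing_credentials` and `REQUIRED_ENV` are names invented for this sketch, and it only inspects the process environment, so it assumes `.env` has already been loaded by whatever mechanism the project uses.

```python
import os

# Variable names taken from the setup instructions above.
REQUIRED_ENV = {
    "azure": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"],
    "openai": ["OPENAI_API_KEY"],
}

def missing_credentials(provider, env=None):
    """Return the required variable names that are unset or empty for `provider`."""
    env = os.environ if env is None else env
    if provider not in REQUIRED_ENV:
        raise ValueError(f"unknown provider: {provider!r}")
    return [name for name in REQUIRED_ENV[provider] if not env.get(name)]

if __name__ == "__main__":
    for provider in REQUIRED_ENV:
        missing = missing_credentials(provider)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```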

Run

End-to-end pipeline (embeddings → compression study → figures):

uv sync
uv run python scripts/run_all.py

Search the RAG index from the command line:

uv run python scripts/search_class.py "What is the singular value decomposition?"

Web UI (SvelteKit + FastAPI) for searching the index and comparing index sizes side by side. Requires Bun:

uv sync --directory backend
cd frontend && bun install && cd ..
bun run dev   # API on :8010, UI on http://localhost:5173

Repo layout

Path	Purpose
`python/src/vector_linalg/`	Embeddings, compression, metrics, plots
`scripts/run_all.py`	End-to-end pipeline
`backend/`, `frontend/`	FastAPI search API + SvelteKit UI
`docs/paper.tex`, `docs/paper.pdf`	Final academic paper (source + PDF)

Token vectors are embedded with the OpenAI Embeddings API (text-embedding-3-small) and cached locally.
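
The repository's cache format is not described here, so the following is only one plausible shape for a local embedding cache. Everything in it is an assumption made for illustration: the `cached_embed` name, the one-file-per-text layout keyed by a SHA-256 hash, and the `embed_fn` callable, which stands in for a real Embeddings API call so that the sketch runs offline.

```python
import hashlib
import tempfile
from pathlib import Path

import numpy as np

def cached_embed(texts, embed_fn, cache_dir):
    """Embed each text, reusing a vector saved on disk when one exists.

    `embed_fn` maps one string to a sequence of floats (hypothetical stand-in
    for an API call). Vectors are stored as <sha256 of text>.npy in `cache_dir`.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    vectors = []
    for text in texts:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        path = cache_dir / f"{key}.npy"
        if path.exists():
            vec = np.load(path)          # cache hit: no call to embed_fn
        else:
            vec = np.asarray(embed_fn(text), dtype=np.float32)
            np.save(path, vec)           # cache miss: compute once, then store
        vectors.append(vec)
    return np.stack(vectors)

def toy_embed(text):
    """Deterministic toy embedding used only to exercise the cache."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]

with tempfile.TemporaryDirectory() as tmp:
    first = cached_embed(["singular value", "projection"], toy_embed, tmp)
    second = cached_embed(["singular value", "projection"], toy_embed, tmp)
    print(first.shape, np.array_equal(first, second))
```

Hashing the text means an edited chunk gets a new key and is re-embedded, while unchanged chunks are never sent to the API twice.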
