feat: add AI-powered "For You" feed personalization #2
Merged
🔎 Cloudflare preview: https://426e3711-feedreader.phatpham9.workers.dev
Reranks the feed by free-text reader interests using Cloudflare Workers AI, with graceful degradation to chronological order on any model or ranking failure. Settings live in the existing Reader settings dialog (disabled by default); personalization is opt-in per browser.

The backend ranks once per (interests, source filter, freshness) via the Cache API and re-projects the cached order onto a freshly-fetched item page each request, so pagination behaves exactly like /api/items (true offset/limit, real has_next) without re-invoking the LLM per page.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
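The re-projection step in the commit message above could look roughly like the following. This is a minimal sketch, not the code in this PR: the `Item` and `Page` shapes and the `projectRankedPage` name are hypothetical, and appending items that are missing from the cached ranking is an assumption about how newly published items might be handled.

```typescript
// Hypothetical shapes; the real item and response types are not shown in this PR.
interface Item { id: string; publishedAt: number; }
interface Page { items: Item[]; has_next: boolean; }

// Reorder freshly fetched items by a cached ranking, then apply offset/limit.
// Assumption: items absent from the cached ranking (for example, published
// after it was computed) keep their fetched order and follow the ranked ones.
function projectRankedPage(
  rankedIds: string[],
  fresh: Item[],
  offset: number,
  limit: number,
): Page {
  const rank = new Map<string, number>();
  rankedIds.forEach((id, i) => rank.set(id, i));

  const ordered = fresh
    .map((item, i) => ({ item, i }))
    .sort((a, b) => {
      const ra = rank.get(a.item.id) ?? Number.POSITIVE_INFINITY;
      const rb = rank.get(b.item.id) ?? Number.POSITIVE_INFINITY;
      if (ra !== rb) return ra < rb ? -1 : 1;
      return a.i - b.i; // tie (both unranked): preserve fetched order
    })
    .map((x) => x.item);

  return {
    items: ordered.slice(offset, offset + limit),
    has_next: offset + limit < ordered.length, // computed from real data, not guessed
  };
}
```

Because only the ID order is cached, each page request pays for a cheap reorder and slice rather than a model call.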
Resolves conflicts between the For You personalization endpoint (/api/personalize) and main's edge caching for /api/items plus the new /api/version and install-prompt-hide features. Kept both sets of functionality, sharing the latestSuccessAt helper and cardsToJson mapping between the items and personalize handlers.
…mbed-at-ingestion + similarity-at-query

Embeds each item once at ingestion time (Workers AI bge-base-en-v1.5) instead of sending the full candidate pool through an LLM on every personalize request. At query time, only the interests string is embedded and the pool is ranked by cosine similarity in-worker; the existing LLM ranker now runs only as an optional polish pass over the top similarity hits (FEEDREADER_PERSONALIZE_POLISH_POOL_SIZE, 0 disables it).

This also fixes the old all-or-nothing degraded fallback: the response now reports personalization: "llm" | "similarity" | "none" instead of a boolean, so a failed/disabled LLM polish step still serves a similarity-personalized order instead of falling all the way back to chronological.
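For readers unfamiliar with the similarity-at-query step, here is a minimal sketch of ranking a pool by cosine similarity. The `EmbeddedItem` shape and both function names are hypothetical; how the PR actually loads embeddings from D1 and obtains the query vector from Workers AI is not shown here.

```typescript
// Hypothetical shape: an item plus the embedding stored for it at ingestion time.
interface EmbeddedItem { id: string; embedding: number[]; }

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank the candidate pool against the embedded interests string,
// most similar first, and return item IDs only.
function rankBySimilarity(query: number[], pool: EmbeddedItem[]): string[] {
  return pool
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.embedding) }))
    .sort((x, y) => y.score - x.score)
    .map((x) => x.id);
}
```

For a pool of a few hundred items this is a single pass of arithmetic per item, which is why it can run in-worker on every request.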
Resolves conflicts between the embedding-retrieval personalization rewrite and main's weekly item-retention prune (which landed via a separate PR while this branch was in progress, and changed design mid-flight from a second cron trigger to a single hourly trigger with a wall-clock window check). Kept both features; core/test/fakeFeedRepository.ts needed a pruneOldItems stub added since FeedRepository requires it again.
Both files gained real content changes in this branch (personalization toggle/interests UI, personalization field handling) without bumping their ?v= query strings, so a CDN/browser cache could keep serving the pre-change script/styles after deploy.
Summary
- Each item is embedded once at ingestion time (Workers AI bge-base-en-v1.5) and stored in D1. At request time only the interests string is embedded; the ~500-item candidate pool is ranked by cosine similarity in-worker, so the core ranking step needs no per-request LLM call.
- The existing LLM ranker (CloudflareLlmRanker, 4-model fallback chain: Llama 3.1 8B → 3.3 70B → Mistral 7B → Llama 3.2 3B) now runs only as an optional "polish" pass over the top FEEDREADER_PERSONALIZE_POLISH_POOL_SIZE similarity hits (default 30; 0 disables it).
- /api/personalize reports personalization: "llm" | "similarity" | "none" instead of a flat degraded boolean, so a skipped/failed LLM polish step still serves a similarity-personalized order rather than falling all the way back to chronological.
- The ranked order is cached per (interests, source filter, source freshness) for 6h, with the same invalidation pattern as /api/items.

Why retrieve-then-rerank instead of a pure per-request LLM rerank
Sending the full candidate pool through an LLM on every request scales cost with requests, not data, and gives zero cache benefit across paraphrased interests strings ("rust, AI" vs "AI and rust"). Embedding once at ingestion amortizes that cost over the item's lifetime in the pool, and similarity ranking is naturally robust to paraphrasing.
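The tiered degradation described in the Summary ("llm", then "similarity", then "none") can be sketched as follows. This is an illustration under stated assumptions, not the PR's handler: `rankWithFallback`, `SimilarityRanker`, and `LlmPolisher` are hypothetical names, and the model fallback chain inside the LLM ranker is collapsed into a single `polish` call that either resolves or throws.

```typescript
type PersonalizationMode = "llm" | "similarity" | "none";
interface Ranked { order: string[]; personalization: PersonalizationMode; }

// Hypothetical signatures standing in for the embedding ranker and the LLM ranker.
type SimilarityRanker = (interests: string) => Promise<string[]>;
type LlmPolisher = (interests: string, topIds: string[]) => Promise<string[]>;

async function rankWithFallback(
  interests: string,
  chronological: string[],
  bySimilarity: SimilarityRanker,
  polish: LlmPolisher,
  polishPoolSize: number, // 0 disables the polish pass
): Promise<Ranked> {
  let similar: string[];
  try {
    similar = await bySimilarity(interests);
  } catch {
    // Embedding or similarity failed: fall back to chronological order.
    return { order: chronological, personalization: "none" };
  }
  if (polishPoolSize <= 0) {
    return { order: similar, personalization: "similarity" };
  }
  const head = similar.slice(0, polishPoolSize);
  const tail = similar.slice(polishPoolSize);
  try {
    // Only the top similarity hits are sent to the LLM; the tail keeps its order.
    const polished = await polish(interests, head);
    return { order: [...polished, ...tail], personalization: "llm" };
  } catch {
    // Polish failed: the similarity order is still a personalized result.
    return { order: similar, personalization: "similarity" };
  }
}
```

The point of the three-valued field is visible in the last `catch`: a failed polish pass loses only the refinement, not the personalization.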
Test plan
- npm run typecheck && npm test
- npm run db:migrate:local applies 0002_add_item_embeddings.sql cleanly
- wrangler dev --local: /api/personalize degrades gracefully to personalization: "none" without AI credentials; malformed body / missing interests / wrong method all return clean errors
- /internal/refresh/<source> succeeds even when embedding generation fails (ingestion never blocked on embed)