perf(contests): partial index for shadowban CTE on aggregate_user.score #813
dylanjeffers wants to merge 1 commit
Cold-cache `/v1/events/remix-contests?status=all` takes ~22s end-to-end (warm: ~100ms). The dominant cost is the sequential scan of `aggregate_user` introduced by the shadowban filter: `SELECT user_id FROM aggregate_user WHERE score < 0`.

`aggregate_user` has one row per user (millions), and only a small number have `score < 0`, so a partial index covers the filter cheaply. The same CTE is used in v1_event_comments, v1_fan_club_feed, v1_track_comments, and v1_track_comment_count, so all of them benefit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Superseded by #814. After discussion: we don't actually rely on … Leaving this open for now in case we want to fall back to the index approach, but the preferred fix is #814.
Update on the supersession: the revised approach in #814 keeps the shadow-ban semantics (banned users are still filtered from contest discovery and comment streams) but sources the ban list from …
`/v1/events/remix-contests?status=all` was hanging the contests page in production with ~22s cold-cache calls (warm: ~100ms). PR #803's shadowban filter added the CTE `low_abuse_score AS (SELECT user_id FROM aggregate_user WHERE score < 0)`.

`aggregate_user` has one row per user (millions of rows) and no index covering `score`, so the CTE ran a full sequential scan on every cold call. The pages then stayed in shared_buffers, which is why warm calls were fast.

The same CTE is reused in v1_event_comments, v1_fan_club_feed, v1_track_comments, and v1_track_comment_count, which all pay the same cost. The contests endpoint is hit hardest because status=all/status=ended keeps most events past the WHERE filter and the sort then forces a per-row LATERAL entry_count count, but the seq scan is the dominant fixed cost.

Fix: partial index on (user_id) WHERE score < 0. The shadowban set is a tiny fraction of users, so the index is tens of KB. CREATE INDEX CONCURRENTLY avoids blocking writes to aggregate_user while the index builds (a plain CREATE INDEX holds a write-blocking SHARE lock for the duration); the migration follows the existing 0197_playlists_albums_partial_idx.sql pattern (no BEGIN/COMMIT, IF NOT EXISTS for idempotency).

aggregate_user.score is the canonical shadowban signal: it is driven by the AAO `anti_abuse_blocked_users` admin list and the AAO score formula, and written back to aggregate_user.score by refresh_all_user_scores(). It is the correct table for this filter; the only issue was the missing index.

Closes #813.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Superseded by #814: same partial-index fix, carried forward there. Closing this one.
Problem
`/v1/events/remix-contests?limit=12&offset=0&status=all` is timing out / hanging the contests page in production.

Measured from my machine against `api.audius.co` (presumably hitting whatever replica is cold per request), across `status=all`, `status=ended`, and `status=active`. The contests page lands cold and hangs for ~22s.
Root cause
The shadowban filter PR (#803, api/v1_events_remix_contests.go:46-47) added the CTE `low_abuse_score AS (SELECT user_id FROM aggregate_user WHERE score < 0)`.
`aggregate_user` has one row per user (millions of rows). There is no index covering `score`; only `idx_aggregate_user_follower_count (user_id, follower_count)` exists (ddl/migrations/...0126). So the CTE runs a full sequential scan of `aggregate_user` on every cold call, then the pages stay in `shared_buffers` and warm calls are fast.

The same CTE is used in `v1_event_comments`, `v1_fan_club_feed`, `v1_track_comments`, and `v1_track_comment_count`, which all pay the same cost on cold cache. Contests is hit hardest because `status=all`/`status=ended` keep most events past the WHERE filter and the sort then forces a per-row LATERAL `entry_count` count, but the seq scan is the dominant fixed cost.

Fix
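As a minimal sketch of the migration this section describes, assuming PostgreSQL and reconstructed only from details stated in this thread (index name `idx_aggregate_user_score_negative`, key column `user_id`, predicate `score < 0`); the actual migration file may differ:

```sql
-- Sketch only: reconstructed from the PR description, not copied from the
-- actual migration file.
-- Deliberately not wrapped in BEGIN/COMMIT: CREATE INDEX CONCURRENTLY
-- cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_aggregate_user_score_negative
    ON aggregate_user (user_id)
    WHERE score < 0;  -- partial: only rows with a negative score are indexed
```

One caveat with this shape: if a concurrent build is interrupted, PostgreSQL can leave an invalid index behind under the same name, and `IF NOT EXISTS` would then skip the rebuild. Dropping the invalid index and re-running the statement is the usual remedy.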
Add a partial index, `idx_aggregate_user_score_negative` on `(user_id)` with the predicate `score < 0`, so the planner can resolve the shadowban set with an index scan instead of touching the heap for non-shadowbanned rows.

Only a small fraction of users have `score < 0` (shadowbanned only), so the partial form is dramatically smaller than a full btree on `score`; the size budget is tens of KB.

`CREATE INDEX CONCURRENTLY` is used so the migration does not block writes to `aggregate_user` while the index builds (a plain `CREATE INDEX` holds a write-blocking `SHARE` lock for the duration). Following the 0197_playlists_albums_partial_idx.sql pattern: not wrapped in `BEGIN`/`COMMIT`, and idempotent via `IF NOT EXISTS`.

Test plan
- `EXPLAIN ANALYZE` on `SELECT user_id FROM aggregate_user WHERE score < 0` shows `Index Only Scan using idx_aggregate_user_score_negative` instead of `Seq Scan`
- `/v1/events/remix-contests?status=all` cold on staging: expect sub-second
- `/v1/events/remix-contests`, `/v1/event-comments/...`, `/v1/fan-club-feed/...` and confirm shadowbanned authors still excluded
- `aggregate_user` write throughput; this index only updates on rows that flip `score` across zero, which is rare

🤖 Generated with Claude Code