TRT-2734: Move release metadata from BigQuery to PostgreSQL #3679
mstaeble wants to merge 1 commit into
-func (s *Server) getReleases(ctx context.Context, forceRefresh ...bool) ([]sippyv1.Release, error) {
-	if s.bigQueryClient != nil {
-		refresh := len(forceRefresh) > 0 && forceRefresh[0]
-		return api.GetReleases(ctx, s.bigQueryClient, refresh)
+// getReleases returns release data from PostgreSQL.
+func (s *Server) getReleases(ctx context.Context) ([]sippyv1.Release, error) {
+	if s.db != nil {
+		releases, err := api.GetReleasesFromDB(ctx, s.db)
This removes the cache. Reading from a small Postgres table is fast, but not nearly as fast as reading from Redis, and this page gets called a lot. With React calling it several times per UI transition, the difference may even be noticeable at a human timescale. Consider whether we want caching here.
Good call raising this. I traced all the callers to understand the full impact.
getReleases() is called by 8 HTTP handlers:
- /api/releases (page load, ~15 calls/hour)
- /api/component_readiness (every CR request)
- /api/component_readiness/test_details
- /api/component_readiness/views
- 4 triage/regression endpoints
The old Redis cache (8h TTL, key "Releases~") was valuable because it avoided a BigQuery round-trip on every one of these requests. That saved both latency and BQ query cost.
With PostgreSQL, the query is ~7ms for ~20 rows. The CR endpoints themselves take seconds, so 7ms is noise. There's no per-query cost like BQ.
More importantly, these callers are transitional. The CR handlers use release data for two things: resolving relative time strings like "ga-30d" into absolute dates, and generating HATEOAS links with correct time parameters. As we move CR fully to PostgreSQL:
- The CR test status queries will use pre-aggregated matviews per release (e.g., cr_base_agg_4.21), where the GA-based time window is baked into the matview at build time. No release metadata lookup needed at request time.
- For any queries that still need release dates, the lookup can fold into the main query as a JOIN against release_definitions rather than a separate round-trip.
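To make the first of those two uses concrete, here is a minimal sketch of resolving a relative string such as "ga-30d" against a release's GA date. The function names (`resolveRelative`, `mustDate`), the accepted grammar ("ga", "ga-Nd", "ga+Nd"), and the GA date used in `main` are assumptions for illustration, not Sippy's actual parser or data.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// mustDate parses a YYYY-MM-DD string and panics on bad input.
// Hypothetical helper that keeps the example short.
func mustDate(s string) time.Time {
	t, err := time.Parse("2006-01-02", s)
	if err != nil {
		panic(err)
	}
	return t
}

// resolveRelative turns an expression such as "ga-30d" into an absolute
// time, given the release's GA date. The grammar is an assumption made
// for this sketch.
func resolveRelative(expr string, ga time.Time) (time.Time, error) {
	if !strings.HasPrefix(expr, "ga") {
		return time.Time{}, fmt.Errorf("unsupported expression %q", expr)
	}
	rest := strings.TrimPrefix(expr, "ga")
	if rest == "" {
		return ga, nil // plain "ga" means the GA date itself
	}
	if !strings.HasSuffix(rest, "d") || (rest[0] != '-' && rest[0] != '+') {
		return time.Time{}, fmt.Errorf("unsupported offset %q", rest)
	}
	// Atoi accepts a leading sign, so "-30" and "+7" both parse.
	days, err := strconv.Atoi(strings.TrimSuffix(rest, "d"))
	if err != nil {
		return time.Time{}, fmt.Errorf("bad day count in %q: %w", expr, err)
	}
	return ga.AddDate(0, 0, days), nil
}

func main() {
	ga := mustDate("2025-06-01") // made-up GA date, not a real release
	start, err := resolveRelative("ga-30d", ga)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(start.Format("2006-01-02"))
}
```

Baking the window into a matview at build time, as described above, would amount to running this kind of resolution once per release instead of once per request.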
That leaves /api/releases as the only endpoint that needs a standalone query, at ~15 calls/hour. Adding a Redis cache layer for that adds complexity with no user-visible benefit. The HTTP round-trip overhead alone (~50-200ms) dwarfs the 7ms PG query.
If we see this become a bottleneck in practice, we can add caching for PG later, but I don't think it's warranted now.
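For reference, if caching did become warranted, it would not have to be Redis. Below is a rough sketch of an in-process TTL cache wrapped around a loader function. Everything here (`ttlCache`, `fakeClock`, `Release`, the context-free loader signature) is hypothetical and simplified for illustration. Only the 8h TTL value comes from the old Redis entry discussed above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Release is a hypothetical stand-in for the real release type.
type Release struct{ Name string }

// releasesTTL mirrors the 8h TTL of the old Redis entry.
const releasesTTL = 8 * time.Hour

// fakeClock is a controllable clock so the example is deterministic.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time          { return f.t }
func (f *fakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }

// ttlCache memoizes one loader result for a fixed duration.
type ttlCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	load    func() ([]Release, error)
	now     func() time.Time // swappable for tests
	value   []Release
	expires time.Time
}

func newTTLCache(ttl time.Duration, load func() ([]Release, error)) *ttlCache {
	return &ttlCache{ttl: ttl, load: load, now: time.Now}
}

// Get returns the cached value while it is fresh and reloads it otherwise.
// The lock is held during the load, so concurrent callers wait for a single
// query instead of each issuing their own.
func (c *ttlCache) Get() ([]Release, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.now().Before(c.expires) {
		return c.value, nil
	}
	v, err := c.load()
	if err != nil {
		return nil, err // errors are not cached
	}
	c.value, c.expires = v, c.now().Add(c.ttl)
	return v, nil
}

func main() {
	calls := 0
	clock := &fakeClock{}
	cache := newTTLCache(releasesTTL, func() ([]Release, error) {
		calls++ // stands in for the PostgreSQL query
		return []Release{{Name: "4.22"}}, nil
	})
	cache.now = clock.Now

	cache.Get()
	cache.Get()
	fmt.Println("loader calls within TTL:", calls)

	clock.Advance(releasesTTL)
	cache.Get()
	fmt.Println("loader calls after expiry:", calls)
}
```

The trade-off against Redis is that each server replica would hold its own copy and could serve data up to one TTL stale after the loader syncs new rows.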
Create a release_definitions table to store release metadata (GA dates, development start dates, previous release, capabilities, product, status) that was previously only available in BigQuery. This eliminates the BQ dependency for the /api/releases endpoint and removes the hardcoded releaseMetadata map from the PostgreSQL data provider, which required manual updates for each new release.

Key changes:
- Add ReleaseDefinition model with capability constants and HasCapability method
- Add release-definitions loader (--loader release-definitions) that fetches from BQ and syncs to PG via upsert
- getReleases() in the server prefers PG, falls back to BQ
- PG data provider QueryReleases() reads from release_definitions instead of deriving from prow_jobs + hardcoded map
- Seed data populates release_definitions for local development
- Fix stale "from big query" error messages in server.go

Ref: TRT-2734
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
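As a rough illustration of the first key change, a model with capability constants and a HasCapability method might look like the sketch below. The field names and capability values are assumptions invented for this example. The actual ReleaseDefinition in the PR is not shown in this conversation and may differ.

```go
package main

import "fmt"

// Capability names a feature a release supports.
// The constant values below are made up for illustration.
type Capability string

const (
	CapComponentReadiness Capability = "componentReadiness"
	CapPayloadTags        Capability = "payloadTags"
)

// ReleaseDefinition is a hypothetical, trimmed-down model. The commit
// message also mentions GA dates, development start dates, product, and
// status, which are omitted here.
type ReleaseDefinition struct {
	Release         string
	PreviousRelease string
	Capabilities    []Capability
}

// HasCapability reports whether the release lists the given capability.
func (r ReleaseDefinition) HasCapability(c Capability) bool {
	for _, have := range r.Capabilities {
		if have == c {
			return true
		}
	}
	return false
}

func main() {
	def := ReleaseDefinition{
		Release:         "4.22",
		PreviousRelease: "4.21",
		Capabilities:    []Capability{CapComponentReadiness},
	}
	fmt.Println(def.HasCapability(CapComponentReadiness))
	fmt.Println(def.HasCapability(CapPayloadTags))
}
```

Keeping capabilities as data on the row, rather than in a hardcoded map, is what lets a new release be added through the loader without a code change.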
@mstaeble: all tests passed!
@mstaeble: This pull request references TRT-2734, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.
Cherry-pick from PR openshift#3716 with conflict resolution:
- Use civil.Date for date-only fields (GA dates, development start dates)
- Use TIMESTAMP WITH TIME ZONE and DATE types in PostgreSQL
- RFC 3339 for API timestamps, YYYY-MM-DD for dates
- Resolve conflict in postgres provider (keep GetReleasesFromDB from PR openshift#3679)
- Add time.Time to civil.Date conversion in DefinitionToRelease

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
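The split between date-only fields and timestamps matters because the same instant can fall on different calendar days depending on the zone. The sketch below shows the two output formats using only the standard library. The commit itself uses civil.Date for the date-only fields, and the helper names here (`parseTimestamp`, `dateOnly`, `apiTimestamp`) are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// parseTimestamp reads an RFC 3339 string, keeping its UTC offset.
func parseTimestamp(s string) (time.Time, error) {
	return time.Parse(time.RFC3339, s)
}

// dateOnly renders the calendar date in the time's own zone as YYYY-MM-DD,
// the shape used for date-only fields such as GA dates.
func dateOnly(t time.Time) string {
	return t.Format("2006-01-02")
}

// apiTimestamp renders an instant as RFC 3339 in UTC.
func apiTimestamp(t time.Time) string {
	return t.UTC().Format(time.RFC3339)
}

func main() {
	// Made-up instant: late evening at UTC-5 is already the next day in UTC.
	t, err := parseTimestamp("2025-03-10T23:30:00-05:00")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("date:     ", dateOnly(t))
	fmt.Println("timestamp:", apiTimestamp(t))
}
```

Here the date-only rendering gives 2025-03-10 while the UTC timestamp is 2025-03-11T04:30:00Z, which is why a GA date stored as a timestamp can shift by a day unless it is converted to a date type deliberately.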
Summary
- release_definitions table in PostgreSQL to store release metadata (GA dates, development start dates, previous release, capabilities, product, status) previously only available in BigQuery
- release-definitions loader (--loader release-definitions) that fetches release rows from BQ and syncs them to PostgreSQL during the data load cycle
- /api/releases and the PG data provider now read from the release_definitions table instead of BigQuery or the hardcoded releaseMetadata map
- getReleases() in the server prefers PG, falls back to BQ

A follow-up PR will replace v1.Release with models.ReleaseDefinition across all internal consumers, remove QueryReleases/QueryReleaseDates from the DataProvider interface, and eliminate the remaining BQ release functions.

Test plan
- go build ./... passes
- go vet ./... passes
- make lint passes
- go test ./pkg/... ./cmd/... passes
- sippy serve with seed data serves /api/releases from PostgreSQL with correct GA dates, capabilities, and previous release chain

Staging verification
Deployed the branch to sippy-staging and ran the release-definitions loader as a one-off job.
Synced 36 release definitions from BigQuery to staging PostgreSQL (all OCP releases 3.11 through 5.0, OKD, ROSA, ARO, HyperShift, and CAPI entries).

Verified the following endpoints on sippy-staging.dptools.openshift.org:
- GET /api/releases
- GET /api/releases/health?release=4.22
- GET /api/component_readiness (4.21→4.22 with full params)
- GET /api/component_readiness/regressions?release=4.22

All release data served from PostgreSQL with no BigQuery calls.
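The "prefers PG, falls back to BQ" behaviour from the summary can be sketched roughly as follows. The `releaseSource` interface, `staticSource`, and `server` type are hypothetical stand-ins for this example. This sketch falls back only when PostgreSQL is not configured, which matches the `s.db != nil` check visible in the diff. Whether the PR also falls back when the PG query returns an error is not shown in this conversation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Release is a hypothetical stand-in for the real release type.
type Release struct{ Name string }

// releaseSource abstracts a backend that can list releases.
type releaseSource interface {
	Releases(ctx context.Context) ([]Release, error)
}

// staticSource is a fixed in-memory backend used for the example.
type staticSource []Release

func (s staticSource) Releases(context.Context) ([]Release, error) { return s, nil }

// server holds the optional backends; a nil field means "not configured".
type server struct {
	pg releaseSource
	bq releaseSource
}

var errNoSource = errors.New("no release source configured")

// getReleases prefers PostgreSQL and uses BigQuery only when PostgreSQL
// is absent.
func (s *server) getReleases(ctx context.Context) ([]Release, error) {
	if s.pg != nil {
		return s.pg.Releases(ctx)
	}
	if s.bq != nil {
		return s.bq.Releases(ctx)
	}
	return nil, errNoSource
}

// ctx is a background context shared by the example calls.
var ctx = context.Background()

func main() {
	both := &server{pg: staticSource{{Name: "from-pg"}}, bq: staticSource{{Name: "from-bq"}}}
	bqOnly := &server{bq: staticSource{{Name: "from-bq"}}}
	neither := &server{}

	for _, s := range []*server{both, bqOnly, neither} {
		releases, err := s.getReleases(ctx)
		fmt.Println(releases, err)
	}
}
```

Once the follow-up PR removes the remaining BQ release functions, the second branch would disappear and the function would reduce to the PG path plus the error case.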
Ref: TRT-2734
@coderabbitai ignore
🤖 Generated with Claude Code