feat: add boilerplate for validatorapi#446
Conversation
|
@varex83agent /review-pr |
varex83agent
left a comment
There was a problem hiding this comment.
Review: validatorapi boilerplate
Clean, well-documented boilerplate that faithfully establishes the validator-API surface. I verified the full route table against Charon v1.7.1 core/validatorapi/router.go: all 30 endpoints match on path, HTTP method, registration order, and the respond_404-vs-handler split (incl. the GET+POST combo on /eth/v1/beacon/states/{state_id}/validators and the v1-404 / v2-v3-handler pattern for attestations, aggregates and proposals). The errorResponse JSON shape ({code,message}), metric names/labels/buckets (== prometheus DefBuckets), the proxy path-label transform, and respond_404 → {code:404,message:"NotFound"} all match the Go reference. Type imports (eth2api::spec::phase0, crypto::types) resolve and the vise usage mirrors existing crates, so it should compile clean under -D warnings.
Bugs (must-fix): none. Nothing is wrong at runtime, and the router is correctly not mounted anywhere yet.
Minor / nit findings (all appropriate to defer or quick to fix — left to author discretion):
- Routing semantics divergence: in Charon a wrong-method request to a DV endpoint falls through gorilla-mux's catch-all and is proxied upstream, whereas axum's
MethodRouterreturns405and never reaches.fallback(). Worth a deliberate decision before handlers are wired. - The per-endpoint encoding allow-list (JSON vs JSON+SSZ) that Charon's
wrap()enforces (415) is not captured anywhere in the route table — easy to lose track of. error.rsdoc commenteth2types.go:20violates the Go-line-anchor rule this same PR adds toporting/SKILL.md.- Consistency:
#[async_trait]vs the native-> impl Future + Sendused by the siblingTrackertrait; manualDisplay/Errorimpl vs the crate-widethiserrorconvention; placeholderVersioned*Proposaltypes shadow real ones insigneddata.rs. - The
proxy_path_transformtest re-implements the transform inline instead of exercisingProxyLatencyTimer::start, and there are no route-table tests.
Forward-looking (track when wrap()/handlers/proxy land, not actionable on this diff): add a request timeout + body-size limit (Charon uses a 10s deadline) and a CatchPanicLayer before mounting (today every handler is todo!()); reproduce Charon's deadline-exceeded latency skip; replicate the proxy's basic-auth/Host forwarding and keep the upstream target operator-configured (no SSRF); bound the attacker-controlled vc_user_agent/path label cardinality.
Verdict: COMMENT. No bugs or merge-blockers — the route-table parity is solid. Findings are minor parity/consistency notes and nits, left open for author discretion given the deliberately-boilerplate scope.
| /// `_handler` will be threaded into Axum router state once request bodies | ||
| /// and responses are wired. `_builder_enabled` is consumed only by | ||
| /// `propose_block_v3`. | ||
| pub fn new_router<H: Handler>(_handler: H, _builder_enabled: bool) -> Router { |
There was a problem hiding this comment.
Routing-semantics divergence to flag (not a stub issue — it's a property of how routes are registered). Charon registers each route with gorilla-mux .Methods(...); on a method mismatch gorilla-mux does NOT short-circuit to 405 — it keeps iterating and the trailing r.PathPrefix("/").Handler(proxyHandler(...)) catch-all matches any method, so e.g. a GET /eth/v2/beacon/pool/attestations (registered POST-only) gets proxied upstream. In axum, .route(path, post(h)) builds a MethodRouter that returns 405 Method Not Allowed for an unregistered method on a matched path, and that 405 does not fall through to .fallback(proxy_handler) (the fallback only runs when no path matches). Net effect: wrong-verb requests to DV endpoints return 405 from Pluto but a real upstream response from Charon. Worth making this a deliberate, recorded decision before handlers are wired (e.g. route method-mismatches to the proxy to preserve parity).
| /// `_handler` will be threaded into Axum router state once request bodies | ||
| /// and responses are wired. `_builder_enabled` is consumed only by | ||
| /// `propose_block_v3`. | ||
| pub fn new_router<H: Handler>(_handler: H, _builder_enabled: bool) -> Router { |
There was a problem hiding this comment.
Charon's endpoint table carries a per-route Encodings []contentType field, and wrap() enforces it: JSON-only vs JSON+SSZ, returning 415 otherwise. Only propose_block, propose_block_v3, submit_proposal_v1/v2, submit_blinded_block_v1/v2 and submit_validator_registration allow SSZ; everything else is JSON-only. This router captures no encoding information anywhere (route, Handler trait, or types), so that parity-critical allow-list will have to be re-derived from the Go source when 415 handling is added. Consider attaching a per-endpoint encodings descriptor now so the SSZ/JSON distinction isn't silently forgotten.
| /// `_handler` will be threaded into Axum router state once request bodies | ||
| /// and responses are wired. `_builder_enabled` is consumed only by | ||
| /// `propose_block_v3`. | ||
| pub fn new_router<H: Handler>(_handler: H, _builder_enabled: bool) -> Router { |
There was a problem hiding this comment.
nit: new_router<H: Handler>(_handler: H, _builder_enabled: bool) moves H in and immediately drops it (the generic is used only as a bound), and _builder_enabled is ignored. Acceptable for boilerplate and the doc comment explains the intent, but it's a mild smell: callers must build a real handler that is then thrown away, and any Drop side effects would fire unexpectedly. When wiring this up, prefer stashing it via Router::with_state/Extension (even while handlers are todo!()) rather than taking-and-dropping.
| /// `_handler` will be threaded into Axum router state once request bodies | ||
| /// and responses are wired. `_builder_enabled` is consumed only by | ||
| /// `propose_block_v3`. | ||
| pub fn new_router<H: Handler>(_handler: H, _builder_enabled: bool) -> Router { |
There was a problem hiding this comment.
No tests build new_router and assert the route table — path/method registration, the eight respond_404 endpoints returning {code:404,message:"NotFound"}, the GET+POST dual-method on /validators, and fallback dispatch are all unverified. This is the parity-critical surface (one transposed path or wrong verb is silently wrong and only surfaces against a real VC). A table-driven test driving requests through axum's tower Service would lock in parity and guard the 405-vs-proxy divergence noted above. I manually verified all 30 endpoints + fallback match router.go, so the current table is correct — the gap is the absence of a regression guard.
|
|
||
| /// Gauge set to 1 when a request from the given user agent is observed. | ||
| #[metrics(labels = ["user_agent"])] | ||
| pub vc_user_agent: LabeledFamily<String, Gauge>, |
There was a problem hiding this comment.
Charon declares vc_user_agent as a ResetGaugeVec (capable of .Reset() to purge stale label series); here it's a plain LabeledFamily<String, Gauge> that observe_user_agent only ever .set(1) on, so every distinct (attacker-controlled) User-Agent stays pinned at 1 forever with no path to reset — an unbounded label-cardinality / memory growth vector. Note Charon's validatorapi never actually calls .Reset() in v1.7.1, so runtime behavior is currently equivalent and this is not a regression — but the type choice forecloses the reset behavior the Go type was built for. Recommend recording the intentional simplification, and when this is wired up, bounding cardinality (e.g. allow-list known VC user-agent prefixes) independently of Charon.
| } | ||
| } | ||
|
|
||
| impl fmt::Display for ApiError { |
There was a problem hiding this comment.
ApiError hand-rolls Display + Error::source (~30 lines incl. the map(|e| e.as_ref() as &(dyn ...)) cast), but the crate-wide convention is #[derive(thiserror::Error)] (used across core: qbft, consensus, dutydb, deadline, sigagg, types, …). This could be #[derive(thiserror::Error)] #[error("api error[status={status_code},msg={message}]")] with #[source] source: Option<Box<dyn Error + Send + Sync>>, dropping the manual impls. The Box<dyn Error + Send + Sync> source and the new/not_found/with_source builders are fine as-is — this is purely about not reimplementing what thiserror provides.
| /// registered in [`new_router`](super::router::new_router). All methods | ||
| /// return [`ApiError`] on failure; the router converts that into the | ||
| /// `errorResponse` JSON body. | ||
| #[async_trait] |
There was a problem hiding this comment.
Consider native async-fn-in-trait instead of #[async_trait], to match the sibling Tracker trait in this crate (crates/core/src/tracker/mod.rs), which uses fn ...(&self, ...) -> impl Future<Output = ...> + Send. Since Handler is Send + Sync + 'static and all methods take &self, the native form works without the proc-macro/boxing. If a dyn Handler (object safety) is needed later, that would justify keeping async-trait — worth confirming the intended usage.
|
|
||
| /// Versioned unsigned proposal payload. Placeholder. | ||
| #[derive(Debug, Clone)] | ||
| pub struct VersionedProposal {} |
There was a problem hiding this comment.
These placeholder VersionedProposal / VersionedSignedProposal structs shadow real same-named types already in this crate (crates/core/src/signeddata.rs:126 defines VersionedSignedProposal, wrapping versioned::VersionedSignedProposal). Having two distinct types with identical names in crates/core is confusing and will need reconciliation when these are filled in. The empty placeholder pattern itself is expected — but consider importing/aliasing the existing signeddata types here, or at least a doc note flagging the intended convergence.
|
|
||
| impl ApiError { | ||
| /// Builds a new `ApiError` with the given status and message. | ||
| #[must_use] |
There was a problem hiding this comment.
nit: #[must_use] on new and not_found is low-value (constructors returning a plain value are rarely dropped by accident). The one that matters is the consuming builder with_source(mut self, ...) -> Self, where dropping the result silently discards the attached source. Not wrong, just noise on the two constructors.
| ) -> Result<(), ApiError>; | ||
|
|
||
| /// `GET /eth/v1/node/version`. | ||
| async fn node_version(&self) -> Result<EthResponse<String>, ApiError>; |
There was a problem hiding this comment.
nit: node_version returns EthResponse<String>, but that envelope carries execution_optimistic / finalized / dependent_root — meaningful for state/duties responses, not for GET /eth/v1/node/version (whose body is just {data:{version}}), so those three fields will always be defaults/None here. Placeholder territory, so non-blocking, but worth deciding whether the version endpoint should use a lighter response shape.
No description provided.