Conversation
…ayloads and no quorum is reached
|
I see you updated files related to
|
|
✅ No conflicts with other open PRs targeting |
There was a problem hiding this comment.
Pull request overview
Risk Rating: MEDIUM — Changes alter quorum/termination behavior in the remote executable request aggregation path and user-visible error reporting.
This PR improves the remote executable client’s behavior when peer responses arrive but cannot reach an F+1 matching quorum (e.g., all payloads differ), returning an early, more accurate ConsensusFailed capability error instead of timing out with a misleading context deadline error.
Changes:
- Add early-detection logic for when an OK-payload quorum is mathematically unreachable and return a public
ConsensusFailederror immediately. - Ensure locally-produced
caperrors.Errorvalues are wrapped so callers can stillerrors.Asintocaperrors.Error. - Update/add tests to validate early “quorum unreachable” behavior and unwrap semantics.
Targeted areas for scrupulous human review:
ClientRequest.OnMessagequorum progression/termination logic (interaction between OK responses, error responses, and pending peers).- Correctness of
quorumStillPossibleacross all response mixes (OK payloads + errors) and ensuring early-fail triggers in all intended cases. - User-facing error formatting and code (
ConsensusFailed) consistency with downstream handling/metrics.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| core/capabilities/remote/executable/request/client_request.go | Adds quorum-unreachable detection and returns a public ConsensusFailed capability error early; adds wrapper for locally-produced cap errors. |
| core/capabilities/remote/executable/request/client_request_test.go | Updates message-validation test to expect early quorum-unreachable error instead of no response. |
| core/capabilities/remote/executable/request/client_request_internal_test.go | Adds unit tests for quorum feasibility logic and early error send behavior (including unwrap chain expectations). |
| core/capabilities/remote/executable/client_test.go | Updates client-level test expectations from timeout to early ConsensusFailed quorum-unreachable error. |
Comments suppressed due to low confidence (1)
core/capabilities/remote/executable/request/client_request.go:402
trySendQuorumUnreachableError()is only invoked on the OK-response path. If the quorum becomes mathematically unreachable after processing an error response (e.g., a mix of unique OK payloads plus some error responses, where the last few responses are errors), the request can still sit until expiry even thoughmaxMatchingResponseCount()+pending < requiredResponseConfirmationsis already true. Consider callingtrySendQuorumUnreachableError()(or an equivalent check) after handling non-OK messages as well, onceresponseReceived[sender]has been updated, so the early-fail behavior is not dependent on the last message being OK.
if err != nil {
return fmt.Errorf("failed to encode payload with metadata: %w", err)
}
c.sendResponse(clientResponse{Result: payload})
} else {
c.trySendQuorumUnreachableError()
}
} else {
c.lggr.Debugw("received error from peer", "error", msg.Error, "errorMsg", msg.ErrorMsg, "peer", sender)
if commoncap.ErrResponsePayloadNotAvailable.Is(errors.New(msg.ErrorMsg)) {
|




…ayloads and no quorum is reached
https://smartcontract-it.atlassian.net/browse/PLEX-3032
We noticed that the WR on andesite was not working as expected returning the following to the user
Which was wrong, at that time the engine got all responses but due to a bug on the action/WR all payloads were different, making impossible to aggregate.
This PR fixes this scenario making an early return even with payloads still incoming if quorum is unreachable, providing a better error to the user.
Requires
Supports