fix: handle empty error channel in returncode to prevent race condition #2576
/kind bug
The returncode property reads from the error channel after the WebSocket connection closes. For fast-exiting commands there is a race: the error channel message may not have been buffered yet, so read_channel() returns an empty string. The previous code passed this string directly to yaml.safe_load() and accessed err['status'] without checking for an empty result, so returncode could incorrectly be 0 even when the command failed with a non-zero exit code.
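To illustrate the kind of guard described above, here is a minimal sketch, not the actual diff in this PR. The helper name `parse_exit_code` is hypothetical, and the sketch assumes the error channel payload is a JSON-formatted status object with `status` and `details.causes[0].message` fields, mirroring the keys the existing code reads. It uses the standard-library `json` module instead of `yaml.safe_load()` only to stay self-contained.

```python
import json


def parse_exit_code(raw):
    """Return the exit code encoded in an error-channel payload.

    Returns None when the payload is empty, i.e. the status message
    has not been buffered yet, so the caller can retry instead of
    guessing a return code.
    """
    # Guard against the race: an empty read means "no status yet".
    if raw is None or not raw.strip():
        return None

    status = json.loads(raw)
    if status.get("status") == "Success":
        return 0

    # Assumed payload shape: the exit code is carried as a string in
    # the first cause's message.
    return int(status["details"]["causes"][0]["message"])
```

With a guard like this, a caller could leave its cached return code unset when `None` comes back and read the channel again, rather than treating the empty payload as a result.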
Fixes #2328 (pod_exec() yields inconsistent results)
Reproducible by running command=['false'] 10 times in a loop; previously this returned inconsistent exit codes (randomly 0 or 1).
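The reproduction loop can be sketched as a small harness. This is an illustrative sketch only: `collect_exit_codes` and `is_consistent` are hypothetical helpers, and `run_once` stands in for whatever callable executes command=['false'] in a pod and returns the observed return code.

```python
from collections import Counter


def collect_exit_codes(run_once, attempts=10):
    """Call run_once() `attempts` times and tally the exit codes."""
    return Counter(run_once() for _ in range(attempts))


def is_consistent(tally):
    """True when every attempt produced the same exit code."""
    return len(tally) == 1
```

For a command such as `false`, a consistent run would show a single key in the tally; a tally with both 0 and 1 indicates the race described above.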
Fixed a race condition in WSClient.returncode that caused pod exec commands with non-zero exit codes to intermittently return 0.