tun/mac: make server bypass route idempotent to avoid orphaned route#436
Open
luancamara wants to merge 1 commit into
Open
tun/mac: make server bypass route idempotent to avoid orphaned route#436luancamara wants to merge 1 commit into
luancamara wants to merge 1 commit into
Conversation
On macOS the server bypass host route (remote /32 via the physical default
gateway, installed when redirect-gateway is active) is added with a plain
"route add". If a previous session terminated uncleanly (sleep, network
change or process crash) the teardown ActionList never runs and a stale copy
of this route survives in the routing table.
On the next connection the "route add" fails because the route already
exists. Command::execute() detects "File exists" and throws; ActionList::
execute() records the route's mark as failed and remove_marked() then drops
the *paired delete* from the teardown list (tunsetup.hpp: "since we should
not undo failed actions, remove them"). The route is therefore never cleaned
up, and because the stale entry still points at the previous network's
gateway, once that gateway is no longer on-link every datagram to the server
fails with EADDRNOTAVAIL:
UDP send exception: send: Can't assign requested address
The client then loops RECONNECTING/RESOLVE until it times out, while the
server and all other users are unaffected.
Make the bypass route installation idempotent by issuing a best-effort
"route delete" immediately before the "route add". Deleting a non-existent
route is harmless ("not in table", non-fatal), while a stale orphan is
removed so the add always installs a fresh route bound to the current
gateway and stays managed for teardown. This mirrors the REPLACE semantics
already used by the Linux sitnl backend on EEXIST.
Signed-off-by: Luan Camara <luancamara@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
tun/mac: make server bypass route idempotent to avoid orphaned route
Summary
On macOS, the server bypass host route (the
remote /32route via the physicaldefault gateway that is installed when
redirect-gatewayis active) can be leftorphaned in the routing table after an unclean teardown. On the next
connection this orphan makes every datagram to the server fail with
EADDRNOTAVAIL, so the client can only time out — even though the server and allother clients are fine.
Observed in OpenVPN Connect 3.8.2 (macOS), whose core emits the exact log
signature below; reproduced against
OpenVPN/openvpn3master.Symptom
While disconnected, a stale host route to the server remains in the table:
Root cause
The bypass route is installed in
openvpn/tun/mac/client/tunsetup.hppwith a plain/sbin/route add. The create/destroy actions are built as a pair (sharedmark),and teardown runs the destroy (delete) action.
When a previous session ends uncleanly (laptop sleep, Wi‑Fi/network change, or a
crash), the teardown
ActionListnever runs, so the bypass route survives. On thenext connect:
Command::execute()(openvpn/common/process.hpp) seesroute addprintFile existsand throws"Route already exists, new route will be ignored".ActionList::execute()(openvpn/common/action.hpp) catches it and records theroute's
markinfailed_actions.tunsetup.hppthen callsremove_cmds->remove_marked(failed_actions, ...)— "since we should not undo failed actions, remove them" — which drops the
paired delete from the teardown list.
Net effect: the route is never cleaned up. Worse, the surviving entry still points at
the previous network's gateway, so once that gateway is no longer on-link, source
address selection for the server fails and
send()returnsEADDRNOTAVAILpermanently. The client loops
RECONNECTING/RESOLVEuntil timeout.This is local, per-machine state: other users (and the server) are unaffected, which
makes it easy to misdiagnose as a server/network problem.
Fix
Make the bypass-route installation idempotent: issue a best-effort
route deleteimmediately before theroute add.not in table,non-fatal —
Command::execute()only throws onFile exists).addalwaysinstalls a fresh route bound to the current gateway and the paired delete
stays in the teardown list.
This mirrors the
NLM_F_REPLACEsemantics the Linuxsitnlbackend already useswhen it hits
EEXIST(openvpn/tun/linux/client/sitnl.hpp), bringing macOS in line.The same
Action::Ptr d(a stateless, ref-countedCommand) is reused for thepre-delete and the teardown delete; running it at setup and teardown is safe. If
maintainers prefer a distinct object, the pre-delete can be built as a copy instead —
happy to adjust.
Scope / alternatives
This patch is intentionally scoped to the server bypass route, which is the entry that
strands a reachable physical gateway and triggers the failure. The same idempotent
pattern could be applied to the shared
add_del_route(..., ActionList&, ActionList&)choke point so all macOS routes self-heal on
EEXIST; that is a larger behavioralchange (and adds a benign
route deleteper route at connect), so I left it out ofthis minimal fix. I can extend it if preferred.
Testing
the orphan, switch networks, reconnect → without the patch the client loops on
EADDRNOTAVAIL; with the patch the stale route is replaced and the connection comesup normally.
route delete(logged asnot in table) before the bypassroute add.Based on
OpenVPN/openvpn3master @2c889af.