Resolve all same-machine streams (fix the 127.0.0.1 unicast lottery) #282
Open
cboulay wants to merge 2 commits into
Same-machine discovery relied solely on a unicast query to the shared multicast_port (127.0.0.1:16571). Because 127.0.0.1 is a unicast address rather than a multicast group, the OS delivers the datagram to only one of the responder sockets bound there, so only one local stream resolves, and which one depends on socket bind order. Restarting programs shuffles which stream is visible. This is the root cause behind issues #28, #92, #202 and #207.

Additionally probe the machine-local (loopback/unicast) addresses across the per-stream service-port range [base_port, base_port+port_range), where every stream owns a unique socket, so all local streams are discoverable regardless of bind order. This mirrors how KnownPeers are already sprayed and needs no OS routing changes.

- api_config: parse and expose machine_addresses() (the unicast subset of multicast.MachineAddresses).
- resolver_impl: add those addresses to the unicast target list.
- test: lsl_test_resolve_machine creates 3 local outlets under ResolveScope=machine and asserts all are resolved (fails as 1==3 pre-fix).
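As a rough illustration of the port-range probe described in this commit, the sketch below expands each machine-local address across the half-open range [base_port, base_port+port_range). The `Endpoint` alias and the `build_machine_targets` function are hypothetical names chosen for this example; they are not taken from resolver_impl, which may represent endpoints differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical endpoint type for illustration only: (address, port).
using Endpoint = std::pair<std::string, std::uint16_t>;

// Expand every machine-local address across the half-open service-port
// range [base_port, base_port + port_range). Each outlet is assumed to own
// exactly one port in that range, so probing all of them reaches every
// local stream regardless of bind order.
std::vector<Endpoint> build_machine_targets(
    const std::vector<std::string> &machine_addresses,
    std::uint16_t base_port, std::uint16_t port_range) {
    // The whole range must fit into the 16-bit port space.
    assert(static_cast<std::uint32_t>(base_port) + port_range <= 65536u);

    std::vector<Endpoint> targets;
    targets.reserve(machine_addresses.size() * port_range);
    for (const auto &addr : machine_addresses) {
        for (std::uint32_t offset = 0; offset < port_range; ++offset) {
            targets.emplace_back(
                addr, static_cast<std::uint16_t>(base_port + offset));
        }
    }
    return targets;
}
```

With the values quoted in this PR (one address, 127.0.0.1, a base port of 16572 and a range of 32), this yields 32 targets covering ports 16572 through 16603.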
A machine address that also appears in KnownPeers (or a repeated config entry) produced the same (address, port) target twice in the unicast list, so the resolver sent identical queries to the same socket. Sort + unique the multicast and unicast endpoint lists once, after they are built, so each endpoint is queried at most once per wave. This is independent of the existing UID-based dedup of results, which must stay: two genuinely different endpoints (e.g. a stream's multicast_port and its service port) can return the same stream, and only the result dedup collapses those.
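The two levels of deduplication described in this commit can be sketched as follows. This is a minimal illustration under assumed types: `Endpoint`, `Result`, `dedup_endpoints` and `dedup_results_by_uid` are hypothetical names, not the resolver's actual ones. The first function is the sort + unique pass over the target list; the second shows why the UID-based pass is still needed for results that arrive from different endpoints.

```cpp
#include <algorithm>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical endpoint type for illustration only: (address, port).
using Endpoint = std::pair<std::string, std::uint16_t>;

// Hypothetical query result: the stream's UID and the endpoint that answered.
struct Result {
    std::string uid;
    Endpoint source;
};

// Sort + unique: afterwards each (address, port) pair appears at most once,
// so no socket receives the same query twice per wave.
void dedup_endpoints(std::vector<Endpoint> &endpoints) {
    std::sort(endpoints.begin(), endpoints.end());
    endpoints.erase(std::unique(endpoints.begin(), endpoints.end()),
                    endpoints.end());
}

// UID-based dedup of results: two different endpoints may report the same
// stream, and only this pass collapses them. The first answer per UID wins.
std::vector<Result> dedup_results_by_uid(const std::vector<Result> &results) {
    std::set<std::string> seen;
    std::vector<Result> unique_results;
    for (const auto &r : results) {
        if (seen.insert(r.uid).second) unique_results.push_back(r);
    }
    return unique_results;
}
```

Endpoint dedup alone cannot replace the second pass: a stream that answers on both 16571 and its own service port produces two distinct endpoints that carry the same UID.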
Summary
Fixes the long-standing "only one of my local streams shows up, and which one shuffles when I restart things" failure. Same-machine discovery relied solely on a unicast query to the shared multicast_port (127.0.0.1:16571); the fix additionally probes the machine-local addresses across the per-stream service-port range, where every stream owns a unique socket.
Root cause
MachineAddresses (default {127.0.0.1}) is folded into the multicast address list and queried only at multicast_port (16571). But 127.0.0.1 is a unicast address, not a multicast group: the OS delivers a datagram sent to a shared port to only one of the responder sockets bound there. So when several outlets run on a machine, only one answers a discovery query, and which one depends on socket bind order. Restarting programs shuffles which stream is visible.

This is the mechanism behind a cluster of reports: #28 (Azure macOS, only one stream resolves; diagnosed in-thread as exactly this), #92 (the issue literally titled "Remove 127.0.0.1 from multicast addresses"), #202 (Windows→Linux, order-dependent), and #207 (Android, only one stream when no Wi-Fi/loopback only).
Fix
In addition to the 16571 query, probe each machine-local address across the per-stream service-port range [base_port, base_port+port_range) (16572–16603), where each outlet binds its own unique socket and answers LSL:shortinfo. This mirrors how KnownPeers are already sprayed and needs no OS routing changes.

- api_config: parse and expose machine_addresses() (the unicast subset of multicast.MachineAddresses).
- resolver_impl: add those addresses to the unicast target list in the resolver's constructor.

127.0.0.1 is intentionally left in the multicast list too (covered redundantly and harmlessly at 16571), so there is no behavior change for setups that relied on it, and #92's "remove 127.0.0.1 + add a loopback multicast route" workaround is no longer needed.
Test
New lsl_test_resolve_machine (own executable, like runtime_config, since it sets the api_config singleton): under ResolveScope=machine it stands up 3 local outlets and asserts all 3 resolve. Pre-fix this fails deterministically as 1 == 3 (the lottery); post-fix all are found.
Testing
Full suite green locally (macOS universal): new test passes; discovery (live UDP resolution, 724 assertions), network, tcpserver, streaminfo, and runtime_config are unaffected by the now-default loopback unicast wave.
Scope / relationship