Skip to content

init fix example#109

Merged
JimyMa merged 2 commits into
mainfrom
fix_tcp_peer_example
Jun 2, 2026
Merged

init fix example#109
JimyMa merged 2 commits into
mainfrom
fix_tcp_peer_example

Conversation

@JimyMa
Copy link
Copy Markdown
Contributor

@JimyMa JimyMa commented Jun 2, 2026

No description provided.

@JimyMa JimyMa requested a deployment to self-hosted-rdma June 2, 2026 07:23 — with GitHub Actions Waiting
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates topology discovery to support TCP-only deployments by returning an empty topology instead of throwing an error when no RDMA devices are available. It also restricts local resource inspection to NICs openable by libibverbs and handles initialization failures gracefully. The reviewer feedback highlights critical compatibility and stability issues: first, available_nic should be imported gracefully to prevent ImportError on older binaries; second, check_openable must handle cases where available_nic is unavailable; and third, the fallback dictionary returned when discover_topology fails should include standard keys like host and schema_version to avoid downstream KeyError crashes.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread dlslime/dlslime/peer_agent/_agent.py
Comment thread dlslime/dlslime/peer_agent/_agent.py
Comment thread dlslime/dlslime/peer_agent/_agent.py Outdated
@JimyMa JimyMa had a problem deploying to self-hosted-rdma June 2, 2026 07:31 — with GitHub Actions Failure
@JimyMa
Copy link
Copy Markdown
Contributor Author

JimyMa commented Jun 2, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the topology discovery and peer agent to support TCP-only deployments by allowing empty RDMA device lists instead of throwing errors. It also ensures that only openable RDMA devices are selected by checking them against available_nic(), and prevents initialization crashes by setting _shutdown_called earlier. Feedback suggests caching the results of available_nic() to avoid redundant system calls and improve performance.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +451 to +452
check_openable = resource is self._local_resource
openable = set(available_nic()) if check_openable else set()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Calling available_nic() on every invocation of _first_usable_resource_key when checking the local resource can be highly inefficient. available_nic() queries the system's RDMA devices (via ibv_get_device_list), which involves system calls and file I/O. Since the set of openable RDMA devices on a host is static during the lifetime of the agent, we can cache this set on the first call to avoid redundant system queries.

Suggested change
check_openable = resource is self._local_resource
openable = set(available_nic()) if check_openable else set()
check_openable = resource is self._local_resource
if check_openable:
if not hasattr(self, "_openable_nics_cache"):
self._openable_nics_cache = set(available_nic())
openable = self._openable_nics_cache
else:
openable = set()

@JimyMa JimyMa had a problem deploying to self-hosted-rdma June 2, 2026 07:37 — with GitHub Actions Failure
@JimyMa JimyMa temporarily deployed to self-hosted-rdma June 2, 2026 07:46 — with GitHub Actions Inactive
@JimyMa JimyMa merged commit 84e97a3 into main Jun 2, 2026
8 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant