
cuda: reset cuda context after reading memory size #23935

Open. 0cc4m wants to merge 2 commits into master from 0cc4m/cuda-get-memory-device-reset


0cc4m commented May 31, 2026

Overview

Alternative to #23604. It lets the router process in #21231 read CUDA memory sizes without keeping the permanent memory allocation that comes with an initialized CUDA context. Instead of using NVML, this checks whether the context is already initialized before calling cudaMemGetInfo. If it was not, the context is released again after the call.

I also tried ref-counting, as suggested in #23604 (comment), but that is harder to get right and introduces more edge cases.
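
A minimal sketch of the check-then-release flow from the overview, not the code in this PR. To keep it runnable without a GPU, the CUDA calls are replaced by a hypothetical `FakeCuda` stand-in; `context_is_active`, `mem_get_info`, `device_reset` and `get_device_memory` are illustrative names. In real code the probe could plausibly be `cuDevicePrimaryCtxGetState` and the release `cudaDeviceReset`, but that mapping is an assumption, since the description only names `cudaMemGetInfo`.

```cpp
#include <cstddef>

// Hypothetical stand-in for the CUDA calls so the sketch runs without a GPU.
// Assumed mapping (not confirmed by the PR): context_is_active ~ a primary
// context state query, mem_get_info ~ cudaMemGetInfo, device_reset ~ cudaDeviceReset.
struct FakeCuda {
    bool        ctx_active  = false;      // is a context currently initialized?
    int         resets      = 0;          // how many times the context was released
    std::size_t free_bytes  = 6ULL << 30; // pretend 6 GiB free
    std::size_t total_bytes = 8ULL << 30; // pretend 8 GiB total

    bool context_is_active() const { return ctx_active; }

    void mem_get_info(std::size_t * free, std::size_t * total) {
        ctx_active = true; // the memory query implicitly initializes a context
        *free  = free_bytes;
        *total = total_bytes;
    }

    void device_reset() {
        ctx_active = false;
        ++resets;
    }
};

// Read device memory, leaving the context state as it was before the call.
template <typename Api>
void get_device_memory(Api & api, std::size_t * free, std::size_t * total) {
    const bool was_active = api.context_is_active(); // check BEFORE the query
    api.mem_get_info(free, total);
    if (!was_active) {
        // The query created the context, so release it again.
        api.device_reset();
    }
}
```

The check has to happen before the query: afterwards the context is always active, so there would be no way to tell whether this call created it. A process that already uses the device keeps its context untouched, and only a "cold" caller such as the router pays for a temporary context.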

Requirements

0cc4m requested a review from a team as a code owner May 31, 2026 06:37
github-actions bot added the labels "Nvidia GPU" (issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) May 31, 2026
