[cli] Model load issue when running NPU models #829

Description

@SaiGayathri1999

Version used: 0.10.0+174be11ea7aeacd8d0d67b0ba1daebec615284b1

Device details:

 > foundry status
╭──────────────┬───────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Section      │ Metric        │ Value                                                                                                  │
├──────────────┼───────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ System       │ OS            │ Microsoft Windows 10.0.26300                                                                           │
│ System       │ Architecture  │ Arm64                                                                                                  │
│ System       │ CPU           │ Snapdragon(R) X 12-core X1E80100 @ 3.40 GHz (12 logical cores)                                         │
│ System       │ GPU           │ Qualcomm Incorporated Qualcomm(R) Adreno(TM) X1-85 GPU (—)                                             │
│ System       │ NPU           │ Qualcomm Technologies, Inc. Snapdragon(R) X Elite - X1E80100 - Qualcomm(R) Hexagon(TM) NPU             │
│ System       │ RAM           │ 8.5 GB available / 15.6 GB total                                                                       │
│ System       │ Disk          │ 579.6 GB free / 951.6 GB total (C:\)                                                                   │
│ System       │ .NET          │ .NET 9.0.16                                                                                            │
│ Service      │ State         │ Not running                                                                                            │
│ Service      │ CLI version   │ 0.10.0                                                                                                 │
│ Connectivity │ Local service │ Not reachable                                                                                          │
│ Warnings     │ Warning 1     │ Qualcomm Adreno GPU detected. If acceleration fails, try a CPU model variant or update the GPU driver. │
╰──────────────┴───────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Initially the CLI did not show any NPU models and did not register the QNN EP, so I uninstalled and reinstalled the WinML MSIX as suggested in #797, which fixed that.

Now, running a model kills the server:

 > foundry server start
■ Starting Foundry Local server

WebGpuExecutionProvider ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100%
QNNExecutionProvider    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100%

● success: Server ready (http://127.0.0.1:56549)

> foundry run qwen2.5-coder-7b-instruct-qnn-npu:1
● error: Daemon closed the connection while awaiting response to op 'model.load'.
Hint: Run 'foundry server status' to inspect, or 'foundry server logs' for details

> foundry server status
State    Not running
PID      11220
Started  2026-06-21 23:56:31Z
Uptime   44s
Web URLs http://127.0.0.1:56549
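
To help with triage, here is a minimal sketch for capturing the diagnostics the CLI hint points to in a single paste-able report. It uses only the two commands named in the hint (`foundry server status` and `foundry server logs`). The helper names `run_command` and `build_report` are hypothetical, and the timeout is there because it is an assumption, not a verified fact, that `foundry server logs` exits on its own.

```python
import shutil
import subprocess

# Commands taken from the hint printed by the CLI in the error above.
DIAGNOSTIC_COMMANDS = [
    ["foundry", "server", "status"],
    ["foundry", "server", "logs"],
]


def run_command(argv, timeout=60):
    """Run one command and return (exit_code, combined_output).

    exit_code is None when the executable is missing or the command
    does not finish within the timeout.
    """
    exe = shutil.which(argv[0])
    if exe is None:
        return None, f"{argv[0]!r} not found on PATH"
    try:
        proc = subprocess.run(
            [exe, *argv[1:]],
            capture_output=True,
            text=True,
            errors="replace",  # table borders may not decode in every console code page
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None, f"timed out after {timeout}s"
    return proc.returncode, (proc.stdout + proc.stderr).strip()


def build_report(commands, runner=run_command):
    """Run each command and join the results into one text report."""
    sections = []
    for argv in commands:
        code, output = runner(argv)
        sections.append(f"> {' '.join(argv)}\n[exit code: {code}]\n{output}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(build_report(DIAGNOSTIC_COMMANDS))
```

Running the script right after the failed `foundry run` should capture the server state and logs from the crash, before a restart changes them.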
