
Conversation

jchen10 (Contributor) commented Nov 28, 2025

Prepack Conv kernels with path-aware transpose decisions, store the transposed kernels for reuse, and add ComputeContextBase helpers for node access and GPU buffer unmapping.
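For context, a minimal hypothetical sketch of what path-aware prepacking can look like in a WebGPU Conv kernel is shown below. This is not the code from this PR; `NeedsTransposedKernel`, `TransposeKernelWeight`, and the `transposed_kernel_` member are placeholder names, and only the `OpKernel::PrePack` signature follows ORT's existing convention.

```cpp
// Hypothetical sketch, not the PR's implementation.
#include "core/framework/op_kernel.h"  // OpKernel::PrePack, Tensor, AllocatorPtr

namespace onnxruntime {
namespace webgpu {

// PrePack is invoked once per constant initializer before the first Run, so
// the weight transpose happens a single time instead of on every Compute.
Status Conv::PrePack(const Tensor& tensor, int input_idx, AllocatorPtr alloc,
                     bool& is_packed, PrePackedWeights* /*prepacked_weights*/) {
  is_packed = false;
  if (input_idx != 1) {  // only the weight (W) input is considered
    return Status::OK();
  }
  // Path-aware decision: transpose only if the Conv path that will run
  // actually wants a transposed kernel layout. NeedsTransposedKernel is a
  // placeholder for the real heuristic.
  if (!NeedsTransposedKernel(tensor.Shape())) {
    return Status::OK();
  }
  // Transpose W once and keep the result alive in the kernel instance so
  // later Compute calls reuse it.
  ORT_RETURN_IF_ERROR(TransposeKernelWeight(tensor, alloc, transposed_kernel_));
  is_packed = true;  // the original initializer can be released by ORT
  return Status::OK();
}

}  // namespace webgpu
}  // namespace onnxruntime
```

The general trade-off with prepacking is a one-time transpose at session initialization instead of a per-run transpose during inference.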

jchen10 (Contributor, Author) commented Nov 28, 2025

Perf data on LNL:

| model | variance (%) |
| --- | --- |
| sd-turbo-unet-fp16-demo-layernorm | -23.72% |
| modnet-fp32 | -22.99% |
| sd-turbo-text-encoder-fp16-demo-layernorm | -17.58% |
| efficientnet-lite-f16-demo | -15.28% |
| mobilenetv2-12-f16-demo | -14.18% |
| jina-clip-v1-version | -12.61% |
| gazenet | -12.22% |
| sdunet-v1.5-demo-layernorm | -11.43% |
| modnet-fp16 | -10.06% |
| resnet50-v1-f16-demo | -8.14% |
| florence-2-base-decoder-fp16 | -7.95% |
| movenet-singlepose-thunder-fp32 | -7.61% |
| jina-clip-v1-version-fp16 | -7.54% |
| depth-anything-base-fp32 | -7.45% |
| detr-resnet-50-fp16 | -6.55% |
| detr-resnet-50 | -6.33% |
| jina-clip-v1-text | -6.32% |
| movenet-singlepose-thunder-fp16 | -6.04% |
| mobileclip_s0_vision_fp32 | -5.14% |

jchen10 (Contributor, Author) commented Nov 28, 2025

@fs-eire @qjia7 @guschmue PTAL

PrePack Conv kernels with path-aware transpose decisions, store the transposed kernels for reuse, and add ComputeContextBase helpers for node access and GPU buffer unmapping.
guschmue added the ep:WebGPU (ort-web webgpu provider) label on Dec 2, 2025
jchen10 (Contributor, Author) commented Dec 3, 2025

Found the CI error log below. I'm not sure whether it's actually caused by this PR.

2025-12-02T20:34:21.9671092Z 2: [ FAILED ] CudaNhwcTypedTest/0.ConvNhwcBias, where TypeParam = float (186 ms)

2025-12-02T20:34:21.8768402Z 2: 2025-12-02 20:34:21.8759375 [E:onnxruntime:Conv, sequential_executor.cc:572 onnxruntime::ExecuteKernel] Non-zero status code returned while running Conv node. Name:'node1' Status Message: CUDA error cudaErrorNotSupported:operation not supported
2025-12-02T20:34:21.8769974Z 2: E:\_work\onnxruntime\onnxruntime\onnxruntime\test\providers\compare_provider_test_utils.cc(172): error: Value of: _tmp_status.IsOK()
2025-12-02T20:34:21.8770496Z 2:   Actual: false
2025-12-02T20:34:21.8770652Z 2: Expected: true
2025-12-02T20:34:21.8771227Z 2: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Conv node. Name:'node1' Status Message: CUDA error cudaErrorNotSupported:operation not supported
2025-12-02T20:34:21.8771728Z 2: 

jchen10 (Contributor, Author) commented Dec 3, 2025

Tried the case locally with CUDA EP. It didn't reproduce with this PR.
