Fix fill! #555
Your PR requires formatting changes to meet the project's style guidelines. The suggested changes:
diff --git a/src/array.jl b/src/array.jl
index 4d621cf..8ab4ba8 100644
--- a/src/array.jl
+++ b/src/array.jl
@@ -505,17 +505,17 @@ fill(v, dims...) = fill!(oneArray{typeof(v)}(undef, dims...), v)
fill(v, dims::Dims) = fill!(oneArray{typeof(v)}(undef, dims...), v)
function Base.fill!(A::oneDenseArray{T}, val) where T
- length(A) == 0 && return A
- val = convert(T, val)
- sizeof(T) == 0 && return A
-
- # execute! is async, so we need to allocate the pattern in USM memory
- # and keep it alive until the operation completes.
- buf = oneL0.host_alloc(context(A), sizeof(T), Base.datatype_alignment(T))
- unsafe_store!(convert(Ptr{T}, buf), val)
- unsafe_fill!(context(A), device(), pointer(A), convert(ZePtr{T}, buf), length(A))
- synchronize(global_queue(context(A), device()))
- oneL0.free(buf)
+ length(A) == 0 && return A
+ val = convert(T, val)
+ sizeof(T) == 0 && return A
+
+ # execute! is async, so we need to allocate the pattern in USM memory
+ # and keep it alive until the operation completes.
+ buf = oneL0.host_alloc(context(A), sizeof(T), Base.datatype_alignment(T))
+ unsafe_store!(convert(Ptr{T}, buf), val)
+ unsafe_fill!(context(A), device(), pointer(A), convert(ZePtr{T}, buf), length(A))
+ synchronize(global_queue(context(A), device()))
+ oneL0.free(buf)
A
end
diff --git a/test/level-zero.jl b/test/level-zero.jl
index ed7b283..3b13f34 100644
--- a/test/level-zero.jl
+++ b/test/level-zero.jl
@@ -271,22 +271,22 @@ let src = rand(Int, 1024)
synchronize(queue)
@test chk == src
- # FIX: Allocate pattern in USM Host Memory
- # Standard Host memory (stack/heap) is not accessible by discrete GPUs for fill patterns.
- # We must use USM Host Memory.
- pattern_val = 42
- pattern_buf = oneL0.host_alloc(ctx, sizeof(Int), Base.datatype_alignment(Int))
- unsafe_store!(convert(Ptr{Int}, pattern_buf), pattern_val)
+ # FIX: Allocate pattern in USM Host Memory
+ # Standard Host memory (stack/heap) is not accessible by discrete GPUs for fill patterns.
+ # We must use USM Host Memory.
+ pattern_val = 42
+ pattern_buf = oneL0.host_alloc(ctx, sizeof(Int), Base.datatype_alignment(Int))
+ unsafe_store!(convert(Ptr{Int}, pattern_buf), pattern_val)
execute!(queue) do list
- # Use the USM pointer (converted to ZePtr)
- append_fill!(list, pointer(dst), convert(ZePtr{Int}, pattern_buf), sizeof(Int), sizeof(src))
+ # Use the USM pointer (converted to ZePtr)
+ append_fill!(list, pointer(dst), convert(ZePtr{Int}, pattern_buf), sizeof(Int), sizeof(src))
append_barrier!(list)
append_copy!(list, pointer(chk), pointer(dst), sizeof(src))
end
synchronize(queue)
- oneL0.free(pattern_buf)
+ oneL0.free(pattern_buf)
@test all(isequal(42), chk)
12e6f0a to b85723b
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@            Coverage Diff             @@
##           master     #555      +/-   ##
==========================================
+ Coverage   79.24%   79.28%   +0.04%
==========================================
  Files          46       46
  Lines        3064     3070       +6
==========================================
+ Hits         2428     2434       +6
  Misses        636      636
maleadt left a comment:
That's curious; I developed this on an A770 discrete GPU where it worked fine.
b85723b to 7f022f5
There was a test failure on a Max 1100 GPU in oneAPI.jl/test/level-zero.jl (lines 274 to 281 in 010bd13), where the test passes an ordinary host array (`pattern = [42]`) to `zeCommandListAppendMemoryFill`. AFAIK, on discrete Intel GPUs (unlike integrated ones), standard host memory is often not directly accessible by the device command processor. I also fixed `fill!` to address the same issue.

In that vein, I will also add a GitHub Actions runner for that GPU.
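
For readers skimming the thread, the pattern used in both the `fill!` fix and the updated test can be sketched as a single helper. This is only an illustration assembled from the calls that appear in the diff of this PR; the helper name `fill_with_usm_pattern!` is hypothetical and not part of oneAPI.jl. It assumes that `context`, `device`, and `global_queue` are reachable as `oneAPI.context`, `oneAPI.device`, and `oneAPI.global_queue`, and it needs a working Intel GPU to run.

```julia
using oneAPI
using oneAPI.oneL0

# Hypothetical helper (not part of oneAPI.jl): fill `dst` with `val`, staging
# the fill pattern in USM host memory instead of ordinary Julia heap memory.
function fill_with_usm_pattern!(dst::oneArray{T}, val) where {T}
    # Nothing to do for empty arrays or zero-size element types.
    (length(dst) == 0 || sizeof(T) == 0) && return dst

    ctx = oneAPI.context(dst)                  # assumed accessor, as used in src/array.jl
    queue = oneAPI.global_queue(ctx, oneAPI.device())

    # Allocate one element of USM host memory and write the pattern into it.
    buf = oneL0.host_alloc(ctx, sizeof(T), Base.datatype_alignment(T))
    try
        unsafe_store!(convert(Ptr{T}, buf), convert(T, val))

        # The fill is asynchronous, so the pattern buffer must stay alive
        # until the queue has been synchronized.
        execute!(queue) do list
            append_fill!(list, pointer(dst), convert(ZePtr{T}, buf),
                         sizeof(T), length(dst) * sizeof(T))
        end
        synchronize(queue)
    finally
        oneL0.free(buf)                        # release the pattern buffer even on error
    end
    return dst
end
```

The `try`/`finally` is a design choice of this sketch rather than something the PR does: it frees the USM buffer even if submitting the fill throws. The synchronize-before-free ordering mirrors the diff.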