While porting an existing OpenCL application that previously ran on the standard Intel OpenCL platform targeting integrated GPUs to now run on an OpenCL platform built using the Construction Kit, I observed an exception.
After reducing the test case, the problem boiled down to this (full test case below):
So this program constructs a kernel object, lets the Program object go out of scope (which will call clReleaseProgram()), and then tries to recover the Program object using a getInfo() call on the kernel object.
This works with the Intel OpenCL platform but fails with the Construction Kit because the clGetKernelInfo call fails. With the Construction Kit, after the Program object goes out of scope, the backing cl_program object has an internal reference count of 1 (because it's still referenced by the cl_kernel) and an external reference count of 0 (because at that point, no application-visible references to the cl_program exist). Building a Program object again will try to call clRetainProgram(), which fails because of special handling for exactly this case.
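The state described above can be modelled without any OpenCL dependency. The following is a minimal sketch, assuming a simplified two-counter object; `RefCounted`, its member names, and `replayReport()` are hypothetical and are not taken from the Construction Kit sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a cl_program-like object (illustrative only).
struct RefCounted {
  std::uint32_t external = 1;  // application-visible references; creation counts as one
  std::uint32_t internal = 0;  // references held by other runtime objects, e.g. a kernel

  // Models clRetainProgram(): refuses to revive an object whose external count is zero.
  bool retainExternal() {
    if (external == 0) return false;
    ++external;
    return true;
  }

  // Models clReleaseProgram(): returns false on over-release instead of underflowing.
  bool releaseExternal() {
    if (external == 0) return false;
    --external;
    return true;
  }

  void retainInternal() { ++internal; }

  void releaseInternal() {
    assert(internal > 0);
    --internal;
  }

  // The object may be destroyed only when both counts have reached zero.
  bool destructible() const { return external == 0 && internal == 0; }
};

// Replays the sequence from the report; returns whether the final retain succeeded.
inline bool replayReport() {
  RefCounted program;               // program created: external=1, internal=0
  program.retainInternal();         // kernel created: internal=1
  program.releaseExternal();        // Program wrapper goes out of scope: external=0
  assert(!program.destructible());  // still kept alive by the kernel
  return program.retainExternal();  // the retain attempted when recovering the Program
}
```

In this model `replayReport()` returns `false`, which corresponds to the failure described in the report. A runtime that allowed "resurrection" would instead let the external count go from 0 back to 1 while the internal count is still nonzero.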
What the program is trying to do here seemed a little fishy, so I tried to understand what the expected behavior is, which proved surprisingly difficult. I found KhronosGroup/OpenCL-Docs#620, which seems to discuss exactly this topic, and if I understood the situation correctly, both behaviors (Intel and Construction Kit) are permitted by the current OpenCL spec. The Construction Kit seems to take an approach that isn't explicitly considered in that discussion. It's almost like the "variant (2)" mentioned there: it tracks internal references using a separate internal reference counter, and waits for both the application-visible reference counter and the internal reference counter to hit zero before deleting the object. However, it then explicitly disallows "resurrecting" objects from attached objects (i.e., going from internal=1 external=0 to internal=1 external=1 is forbidden).
(Out of interest, I also ran the referenced OpenCL object lifetime validation layer against this program, and it didn't point out a problem. To be fair, I don't fully understand whether it covers this situation; probably it doesn't.)
I think the test program is incorrect according to the OpenCL specification because for the cl_program object, the number of release calls is equal to the number of retain calls, thus it must be regarded as inaccessible. The fact that this object is still (internally) referenced by the cl_kernel object and thus (technically) "recoverable" isn't important. Do you agree with this?
Was there a specific reason for picking this approach to object lifetimes in the Construction Kit? What are the downsides of allowing "resurrecting" objects?
> I think the test program is incorrect according to the OpenCL specification because for the cl_program object, the number of release calls is equal to the number of retain calls, thus it must be regarded as inaccessible. The fact that this object is still (internally) referenced by the cl_kernel object and thus (technically) "recoverable" isn't important. Do you agree with this?
Yes, I agree. The OpenCL spec text cited in the docs issue you linked matches what you wrote:
> The object becomes inaccessible to host code when the number of release operations performed matches the number of retain operations plus the allocation of the object.
It seems reasonable to me to interpret "inaccessible to host code" as "the OpenCL implementation may actively prevent further access by host code".
> Was there a specific reason for picking this approach to object lifetimes in the Construction Kit? What are the downsides of allowing "resurrecting" objects?
I looked into the history of this. We used to have a single reference count, but changed it to avoid internal errors if the user calls clRelease*() more often than permitted. When we switched to separate internal and external reference counts, we did not permit a zero external reference count to become nonzero, in order to avoid threading problems such as the following: thread 1 sees that the external reference count is zero; thread 2 increases the external reference count; thread 1 proceeds to check the internal reference count and deletes the object; thread 2 then uses the deleted object. There are other ways to prevent this from being an issue, but this is one of the simplest.
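One way to enforce the "zero never becomes nonzero" rule under concurrency is a compare-and-swap loop on the external count. This is only a sketch under that assumption; `ExternalCount` and `ReleaseResult` are hypothetical names, and the Construction Kit may implement the rule differently.

```cpp
#include <atomic>
#include <cstdint>

// Outcome of a release call in this sketch (hypothetical type).
enum class ReleaseResult { OverRelease, StillReferenced, ReachedZero };

// Hypothetical external reference counter that can never go from zero to nonzero.
class ExternalCount {
 public:
  // Increments the count only if it is currently nonzero.
  bool tryRetain() {
    std::uint32_t seen = count_.load(std::memory_order_relaxed);
    while (seen != 0) {
      // On failure, compare_exchange_weak reloads `seen` with the current value.
      if (count_.compare_exchange_weak(seen, seen + 1,
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed)) {
        return true;
      }
    }
    return false;  // count was zero: the object is inaccessible to host code
  }

  // Decrements the count; rejects calls made when the count is already zero.
  ReleaseResult release() {
    std::uint32_t seen = count_.load(std::memory_order_relaxed);
    while (seen != 0) {
      if (count_.compare_exchange_weak(seen, seen - 1,
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed)) {
        // `seen` still holds the value before the decrement.
        return seen == 1 ? ReleaseResult::ReachedZero
                         : ReleaseResult::StillReferenced;
      }
    }
    return ReleaseResult::OverRelease;
  }

 private:
  std::atomic<std::uint32_t> count_{1};  // creation counts as one reference
};
```

Because a successful `tryRetain()` requires observing a nonzero count in the same atomic step that increments it, the thread that receives `ReachedZero` knows no later retain can succeed, so it can go on to check the internal count without racing against a "resurrecting" thread.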
Version
85dfbf7
Full test application: