-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create and initialize Device object: the first PR for rewriting DML backend #287
base: main
Are you sure you want to change the base?
Conversation
d6f6a3d
to
cdfe64e
Compare
@mingmingtasd Could we please either rename the old backend ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mingmingtasd
src/webnn/native/dml/DeviceDML.h
Outdated
bool useDebugLayer = false; | ||
}; | ||
|
||
class Device { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about replacing "Device" with something like "ExecutionContext"?
I'm pretty certain that this code originates from Dawn. In Dawn, a device object is called "logical" (meaning there is state attached to it). In D3D12, the device object is actually stateless. The reason Dawn used a logical device is because Vulkan required such abstraction. Since DML does not depend on Vulkan, we don't need to create your own fake-logical device like this. Instead, we really just need a "context" that flushes DML commands using a queue + device + recorder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks, I agree to rename it, but there is a ContextDM.cpp
, will we rename Device.cpp
as ExecutionContextDML.cpp
? Maybe a little confused...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we could re-use ContextDML for this. ExecutionContextDML
could be a concrete implementation of ContextBase
. We could construct a ExecutionContextDML
by passing the D3D12 device, DML device, and D3D12 queue to it. A provider-factory, AdapterDML
, would produce ExecutionContextDML
(and creates the devices and queue). The Backend
uses AdapterDML
to produce this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I deleted ExecutionContextDML
and moved the
ComPtr<IDMLDevice> mDevice;
ComPtr<ID3D12Device> mD3D12Device;
ComPtr<IDMLCommandRecorder> mCommandRecorder;
ComPtr<ID3D12CommandQueue> mCommandQueue;
ComPtr<ID3D12CommandAllocator> mCommandAllocator;
ComPtr<ID3D12GraphicsCommandList> mCommandList;
ComPtr<IDXGIAdapter1> mAdapter;
into ContextDML.cpp, and let Backend use Adapter
to create the context for DML.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still like ExecutionContext - "context" sounds to vague to me. But I don't feel strongly about that, you decide.
OK, I prefer to add a new temporary backend. @bbernhar And we can replace the old dml backend with the new backend when all ready. |
Add a temporary backend in main branch will confuse for the developer, you can draft a first PR to delete the DML backend implementation and disable the CI. Another approach is to rewrite the DML backend on your Repo that also can be reviewed by us. |
|
@fujunwei Seems fine to me too. |
9697ccc
to
19fef0d
Compare
@miaobin Could you help enable some simple ops like Sigmoid firstly for testing? Then you can give more feedbacks when we are re-writing the DML backend. Thanks! |
19fef0d
to
eaf00a2
Compare
Of course, I will write GraphDML based on the current PR. On one hand, I will move some content in Device.cpp to this layer, on the other hand, I will first implement add input, compile, compute and simple operators such as sigmoid. |
eaf00a2
to
6ab2258
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Left some comments. Mostly regarding: recorder vs execution-context split. The others are minor.
src/webnn/native/dml/ContextDML.h
Outdated
|
||
ComPtr<IDMLDevice> mDevice; | ||
ComPtr<ID3D12Device> mD3D12Device; | ||
ComPtr<IDMLCommandRecorder> mCommandRecorder; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about having CommandRecorderDML
?
CommandRecorderDML
would store theID3D12CommandAllocator
and be stored within this context. Then everything that needs the recorder but not the execution context, eg. InitializeOperator, ExecuteOperator is moved to CommandRecorderDML
. This keeps the logic of recording clearly and cleanly separated from execution. The context just asks for the recorded command-list from this instance to execute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean move these member variables into
CommandRecorderDML{
ComPtr<IDMLDevice> mDevice;
ComPtr<ID3D12Device> mD3D12Device;
ComPtr<IDMLCommandRecorder> mCommandRecorder;
ComPtr<ID3D12CommandQueue> mCommandQueue;
ComPtr<ID3D12CommandAllocator> mCommandAllocator;
ComPtr<ID3D12GraphicsCommandList> mCommandList;
}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, that's right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, updated.
src/webnn/native/dml/ContextDML.cpp
Outdated
} | ||
} | ||
WEBNN_RETURN_IF_FAILED( | ||
D3D12CreateDevice(mAdapter.Get(), D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&mD3D12Device))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we want more than one context instance per queue (with a specified device)? If so, the queue and device should be passed by parameter, and not created within the context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, thanks! But I think we only need one context instance per queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my other comment.
89f3982
to
1e937c9
Compare
src/webnn/native/dml/ContextDML.h
Outdated
|
||
namespace webnn::native::dml { | ||
|
||
struct CommandRecorderDML { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this into .cpp
? This class will become large-enough (for InitializeOperator, ExecuteOperator, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may mean InitializeGraph, ExecuteGraph, right? I think these will be in GraphDML.cpp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GraphDML would not record the actual commands into the command-list. It just builds the operator (graph) for the CommandRecorder. The CommandRecorder itself records these commands via InitializeGraph, ExecuteGraph, etc. This maintains a clear separation of duties and avoids any state being shared between them (data-flow is one-way).
I highly encourage we all think about such things during the rewrite: clear boundaries, data-flow, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the commandlist is only used after the graph building. I guess your suggestion for separation may be:
CommandRecorder -> IntializeGraph()/ExecuteGraph() ? That's why you suggested defining the CommandRecorder and ExecutionContext?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this into
.cpp
? This class will become large-enough (for InitializeOperator, ExecuteOperator, etc).
Will consider re-organizing codes when we add more functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
src/webnn/native/dml/ContextDML.h
Outdated
static ContextBase* Create(ComPtr<IDXGIAdapter1> adapter, bool useDebugLayer); | ||
~Context() override = default; | ||
|
||
CommandRecorderDML GetCommandRecorderDML() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to get the CommandRecorder? The context can just execute it directly. Also, this returns a (copy) of the recorder, which is most definitely not what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plan to add InitializeGraph, ExecuteGraph in GraphDML.cpp which will use the commandRecord.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my other comment.
src/webnn/native/dml/ContextDML.cpp
Outdated
D3D12_COMMAND_QUEUE_DESC commandQueueDesc{}; | ||
commandQueueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT; | ||
commandQueueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE; | ||
WEBNN_RETURN_IF_FAILED(mCommandRecorderDML.D3D12Device->CreateCommandQueue( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't have the context (itself) always create the [DML, D3D12] device and queue. Rather, we want the context to initialize and execute with any (device, queue) specified.
Also, the command allocator and list should be created by the recorder class you've defined. The recorder also accepts the (device, queue) as parameters to initialize it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand, could you explain the purpose? @bbernhar
These D3D related members will only be created once when we create one context, we don't always create them. And I think one context owns one CommandRecorder I defined and the context will create multiple graphs, then graphs will do InitializeGraph and ExecuteGraph. Am I right? @fujunwei
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The [DML, D3D12] device and queue does not have a 1:1 relationship with a context. You can have multiple contexts per {device, queue}. Chromium could for example, specify a {device, queue}, from GPUDevice, for a given context too. So, we cannot assume the context should always create a new one. In truth, the process only needs one queue (of each type).
In fact, if it does [create a new one], then we'll have to deal with complex cross-queue usage rules because resources shared between contexts wouldn't necessarily be compatible.
A CommandRecorderDML and the underlying IDMLCommandRecorder has 1:1 relationship, they should have the exact same lifetime.
Hopefully that makes more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, we should enum all adapters (integrateGPU, discreteGPU, maybe VPU) at once in the GPU process for all contexts like WebGPU's Backend::DiscoverAdapters.
For each adapter in DML backend, there is a device(DML, D3D12) as member variable like WebGPU's mD3d12Device.
The device in DML backend has 1 queue and 1 commandRecorder to recoder commands for all contexts, the commands come from initialize and execute Graph, am i right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fujunwei Only the queue is sharable. Since no other recorder will be provided to this context to use, it doesn't need to outlive it. Every context has exactly one "live" recorder.
The whole purpose of CommandRecorderDML was to keep recorder-state together (list AND allocator). The command recorder can (and eventually will) enqueue multiple command-lists. Similarly, the context might also need to produce a entirely new command-list via new recorder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, thanks for explain patiently.
1e937c9
to
81c59b5
Compare
src/webnn/native/dml/ContextDML.h
Outdated
|
||
struct CommandRecorder { | ||
ComPtr<IDMLDevice> device; | ||
ComPtr<ID3D12Device> D3D12Device; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please keep struct-field names consistent.
src/webnn/native/dml/ContextDML.cpp
Outdated
}; | ||
|
||
// static | ||
ContextBase* Context::Create(const ComPtr<IDMLDevice>& device, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not pass a modifiable object like this. The point of passing ComPtr by value is to know that ownership could be shared and this (passing by ComPtr&
) makes it ambiguous at the call-site. If you want to avoid the ref, wrap it in std::move
instead. For example, Context::Create(std::move(comptr))
and then new Context(std::move(comptr)), ...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, got it. I think maybe multiple graphs will share these Comptr, so I decide to copy.
81c59b5
to
9bfd240
Compare
PTAL, if there are no more significant comments, I intend to stop maintaining this PR and move to implement based on Mojo infrastrucure in otcshare/Chromium in this webnn_mojo branch to align with our team's main workflow now. Then you can go on reviewing codes there. Thanks! @bbernhar @fujunwei @miaobin |
src/webnn/native/dml/BackendDML.cpp
Outdated
} | ||
|
||
// Create the D3D device. | ||
if (FAILED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we disscusion, the device and command queue can't be created when creating context, they should be created when connecting DML backend.
src/webnn/native/dml/ContextDML.h
Outdated
|
||
namespace webnn::native::dml { | ||
|
||
struct CommandRecorder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1, Move it to cpp files.
2, We should comment which variables are sharable between different context (Device, Queue)
src/webnn/native/dml/ContextDML.cpp
Outdated
D3D12_COMMAND_QUEUE_DESC commandQueueDesc{}; | ||
commandQueueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT; | ||
commandQueueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE; | ||
WEBNN_RETURN_IF_FAILED(mCommandRecorderDML.D3D12Device->CreateCommandQueue( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, thanks for explain patiently.
9bfd240
to
aab6d9b
Compare
As you suggested , we are migrating to the chromium's mojo based branch. Thanks! @bbernhar |
PTAL, thanks! @bbernhar @fujunwei @miaobin