Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create and initialize Device object: the first PR for rewriting DML backend #287

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mingmingtasd
Copy link
Collaborator

PTAL, thanks! @bbernhar @fujunwei @miaobin

@mingmingtasd mingmingtasd force-pushed the dml_rewrite branch 2 times, most recently from d6f6a3d to cdfe64e Compare August 3, 2022 05:35
@bbernhar
Copy link
Contributor

bbernhar commented Aug 3, 2022

@mingmingtasd Could we please either rename the old backend (dml) or give the new backend another (temporary) name? That way, the git diffs will be accurate (= doesn't mix old and new). A plus, will also keep the bots green.

Copy link
Contributor

@bbernhar bbernhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mingmingtasd

src/webnn/native/dml/DeviceDML.cpp Outdated Show resolved Hide resolved
src/webnn/native/dml/DeviceDML.cpp Outdated Show resolved Hide resolved
src/webnn/native/dml/DeviceDML.cpp Outdated Show resolved Hide resolved
src/webnn/native/dml/DeviceDML.cpp Outdated Show resolved Hide resolved
src/webnn/native/dml/DeviceDML.cpp Outdated Show resolved Hide resolved
src/webnn/native/dml/DeviceDML.h Outdated Show resolved Hide resolved
src/webnn/native/dml/dml_platform.h Outdated Show resolved Hide resolved
src/webnn/native/dml/dml_platform.h Show resolved Hide resolved
bool useDebugLayer = false;
};

class Device {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about replacing "Device" with something like "ExecutionContext"?

I'm pretty certain that this code originates from Dawn. In Dawn, a device object is called "logical" (meaning there is state attached to it). In D3D12, the device object is actually stateless. The reason Dawn used a logical device is because Vulkan required such abstraction. Since DML does not depend on Vulkan, we don't need to create your own fake-logical device like this. Instead, we really just need a "context" that flushes DML commands using a queue + device + recorder.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks, I agree to rename it, but there is a ContextDM.cpp, will we rename Device.cpp as ExecutionContextDML.cpp? Maybe a little confused...

Copy link
Contributor

@bbernhar bbernhar Aug 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we could re-use ContextDML for this. ExecutionContextDML could be a concrete implementation of ContextBase. We could construct a ExecutionContextDML by passing the D3D12 device, DML device, and D3D12 queue to it. A provider-factory, AdapterDML, would produce ExecutionContextDML (and creates the devices and queue). The Backend uses AdapterDML to produce this context.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I deleted ExecutionContextDML and moved the

        ComPtr<IDMLDevice> mDevice;
        ComPtr<ID3D12Device> mD3D12Device;
        ComPtr<IDMLCommandRecorder> mCommandRecorder;
        ComPtr<ID3D12CommandQueue> mCommandQueue;
        ComPtr<ID3D12CommandAllocator> mCommandAllocator;
        ComPtr<ID3D12GraphicsCommandList> mCommandList;

        ComPtr<IDXGIAdapter1> mAdapter;

into ContextDML.cpp, and let Backend use Adapter to create the context for DML.

Copy link
Contributor

@bbernhar bbernhar Aug 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still like ExecutionContext - "context" sounds to vague to me. But I don't feel strongly about that, you decide.

@mingmingtasd
Copy link
Collaborator Author

mingmingtasd commented Aug 4, 2022

@mingmingtasd Could we please either rename the old backend (dml) or give the new backend another (temporary) name? That way, the git diffs will be accurate (= doesn't mix old and new). A plus, will also keep the bots green.

OK, I prefer to add a new temporary backend. @bbernhar And we can replace the old dml backend with the new backend when all ready.

@fujunwei
Copy link
Collaborator

fujunwei commented Aug 4, 2022

@mingmingtasd Could we please either rename the old backend (dml) or give the new backend another (temporary) name? That way, the git diffs will be accurate (= doesn't mix old and new). A plus, will also keep the bots green.

OK, I prefer to add a new temporary backend. @bbernhar And we can replace the old dml backend with the new backend when all ready.

Add a temporary backend in main branch will confuse for the developer, you can draft a first PR to delete the DML backend implementation and disable the CI. Another approach is to rewrite the DML backend on your Repo that also can be reviewed by us.

@mingmingtasd
Copy link
Collaborator Author

@mingmingtasd Could we please either rename the old backend (dml) or give the new backend another (temporary) name? That way, the git diffs will be accurate (= doesn't mix old and new). A plus, will also keep the bots green.

OK, I prefer to add a new temporary backend. @bbernhar And we can replace the old dml backend with the new backend when all ready.

Add a temporary backend in main branch will confuse for the developer, you can draft a first PR to delete the DML backend implementation and disable the CI. Another approach is to rewrite the DML backend on your Repo that also can be reviewed by us.

OK, agree, thanks! @fujunwei @bbernhar

@bbernhar
Copy link
Contributor

bbernhar commented Aug 4, 2022

@fujunwei Seems fine to me too.

@mingmingtasd mingmingtasd force-pushed the dml_rewrite branch 2 times, most recently from 9697ccc to 19fef0d Compare August 8, 2022 07:51
@mingmingtasd
Copy link
Collaborator Author

@miaobin Could you help enable some simple ops like Sigmoid firstly for testing? Then you can give more feedbacks when we are re-writing the DML backend. Thanks!

@miaobin
Copy link
Contributor

miaobin commented Aug 9, 2022

@miaobin Could you help enable some simple ops like Sigmoid firstly for testing? Then you can give more feedbacks when we are re-writing the DML backend. Thanks!

Of course, I will write GraphDML based on the current PR. On one hand, I will move some content in Device.cpp to this layer, on the other hand, I will first implement add input, compile, compute and simple operators such as sigmoid.

Copy link
Contributor

@bbernhar bbernhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Left some comments. Mostly regarding: recorder vs execution-context split. The others are minor.

src/webnn/native/dml/BackendDML.cpp Outdated Show resolved Hide resolved
src/webnn/native/dml/BackendDML.h Show resolved Hide resolved
src/webnn/native/dml/BackendDML.h Outdated Show resolved Hide resolved

ComPtr<IDMLDevice> mDevice;
ComPtr<ID3D12Device> mD3D12Device;
ComPtr<IDMLCommandRecorder> mCommandRecorder;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about having CommandRecorderDML?

CommandRecorderDML would store theID3D12CommandAllocator and be stored within this context. Then everything that needs the recorder but not the execution context, eg. InitializeOperator, ExecuteOperator is moved to CommandRecorderDML. This keeps the logic of recording clearly and cleanly separated from execution. The context just asks for the recorded command-list from this instance to execute.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean move these member variables into

CommandRecorderDML{
        ComPtr<IDMLDevice> mDevice;
        ComPtr<ID3D12Device> mD3D12Device;
        ComPtr<IDMLCommandRecorder> mCommandRecorder;
        ComPtr<ID3D12CommandQueue> mCommandQueue;
        ComPtr<ID3D12CommandAllocator> mCommandAllocator;
        ComPtr<ID3D12GraphicsCommandList> mCommandList;
}

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's right.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, updated.

}
}
WEBNN_RETURN_IF_FAILED(
D3D12CreateDevice(mAdapter.Get(), D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&mD3D12Device)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we want more than one context instance per queue (with a specified device)? If so, the queue and device should be passed by parameter, and not created within the context.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, thanks! But I think we only need one context instance per queue.

Copy link
Contributor

@bbernhar bbernhar Aug 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment.

src/webnn/native/dml/BackendDML.cpp Outdated Show resolved Hide resolved
src/webnn/native/dml/BackendDML.cpp Outdated Show resolved Hide resolved
src/webnn/native/dml/ContextDML.cpp Outdated Show resolved Hide resolved
@mingmingtasd mingmingtasd force-pushed the dml_rewrite branch 3 times, most recently from 89f3982 to 1e937c9 Compare August 18, 2022 05:53

namespace webnn::native::dml {

struct CommandRecorderDML {
Copy link
Contributor

@bbernhar bbernhar Aug 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this into .cpp? This class will become large-enough (for InitializeOperator, ExecuteOperator, etc).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may mean InitializeGraph, ExecuteGraph, right? I think these will be in GraphDML.cpp.

Copy link
Contributor

@bbernhar bbernhar Aug 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GraphDML would not record the actual commands into the command-list. It just builds the operator (graph) for the CommandRecorder. The CommandRecorder itself records these commands via InitializeGraph, ExecuteGraph, etc. This maintains a clear separation of duties and avoids any state being shared between them (data-flow is one-way).

I highly encourage we all think about such things during the rewrite: clear boundaries, data-flow, etc.

Copy link
Collaborator Author

@mingmingtasd mingmingtasd Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the commandlist is only used after the graph building. I guess your suggestion for separation may be:
CommandRecorder -> IntializeGraph()/ExecuteGraph() ? That's why you suggested defining the CommandRecorder and ExecutionContext?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this into .cpp? This class will become large-enough (for InitializeOperator, ExecuteOperator, etc).

Will consider re-organizing codes when we add more functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.

static ContextBase* Create(ComPtr<IDXGIAdapter1> adapter, bool useDebugLayer);
~Context() override = default;

CommandRecorderDML GetCommandRecorderDML() const {
Copy link
Contributor

@bbernhar bbernhar Aug 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to get the CommandRecorder? The context can just execute it directly. Also, this returns a (copy) of the recorder, which is most definitely not what we want.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to add InitializeGraph, ExecuteGraph in GraphDML.cpp which will use the commandRecord.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment.

D3D12_COMMAND_QUEUE_DESC commandQueueDesc{};
commandQueueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
commandQueueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE;
WEBNN_RETURN_IF_FAILED(mCommandRecorderDML.D3D12Device->CreateCommandQueue(
Copy link
Contributor

@bbernhar bbernhar Aug 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't have the context (itself) always create the [DML, D3D12] device and queue. Rather, we want the context to initialize and execute with any (device, queue) specified.

Also, the command allocator and list should be created by the recorder class you've defined. The recorder also accepts the (device, queue) as parameters to initialize it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, could you explain the purpose? @bbernhar
These D3D related members will only be created once when we create one context, we don't always create them. And I think one context owns one CommandRecorder I defined and the context will create multiple graphs, then graphs will do InitializeGraph and ExecuteGraph. Am I right? @fujunwei

Copy link
Contributor

@bbernhar bbernhar Aug 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The [DML, D3D12] device and queue does not have a 1:1 relationship with a context. You can have multiple contexts per {device, queue}. Chromium could for example, specify a {device, queue}, from GPUDevice, for a given context too. So, we cannot assume the context should always create a new one. In truth, the process only needs one queue (of each type).

In fact, if it does [create a new one], then we'll have to deal with complex cross-queue usage rules because resources shared between contexts wouldn't necessarily be compatible.

A CommandRecorderDML and the underlying IDMLCommandRecorder has 1:1 relationship, they should have the exact same lifetime.

Hopefully that makes more sense.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, we should enum all adapters (integrateGPU, discreteGPU, maybe VPU) at once in the GPU process for all contexts like WebGPU's Backend::DiscoverAdapters.

For each adapter in DML backend, there is a device(DML, D3D12) as member variable like WebGPU's mD3d12Device.

The device in DML backend has 1 queue and 1 commandRecorder to recoder commands for all contexts, the commands come from initialize and execute Graph, am i right?

Copy link
Contributor

@bbernhar bbernhar Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fujunwei Only the queue is sharable. Since no other recorder will be provided to this context to use, it doesn't need to outlive it. Every context has exactly one "live" recorder.

The whole purpose of CommandRecorderDML was to keep recorder-state together (list AND allocator). The command recorder can (and eventually will) enqueue multiple command-lists. Similarly, the context might also need to produce a entirely new command-list via new recorder.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, thanks for explain patiently.


struct CommandRecorder {
ComPtr<IDMLDevice> device;
ComPtr<ID3D12Device> D3D12Device;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please keep struct-field names consistent.

};

// static
ContextBase* Context::Create(const ComPtr<IDMLDevice>& device,
Copy link
Contributor

@bbernhar bbernhar Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not pass a modifiable object like this. The point of passing ComPtr by value is to know that ownership could be shared and this (passing by ComPtr&) makes it ambiguous at the call-site. If you want to avoid the ref, wrap it in std::move instead. For example, Context::Create(std::move(comptr)) and then new Context(std::move(comptr)), ...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, got it. I think maybe multiple graphs will share these Comptr, so I decide to copy.

@mingmingtasd
Copy link
Collaborator Author

mingmingtasd commented Aug 23, 2022

PTAL, if there are no more significant comments, I intend to stop maintaining this PR and move to implement based on Mojo infrastrucure in otcshare/Chromium in this webnn_mojo branch to align with our team's main workflow now. Then you can go on reviewing codes there. Thanks! @bbernhar @fujunwei @miaobin

}

// Create the D3D device.
if (FAILED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we disscusion, the device and command queue can't be created when creating context, they should be created when connecting DML backend.


namespace webnn::native::dml {

struct CommandRecorder {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1, Move it to cpp files.
2, We should comment which variables are sharable between different context (Device, Queue)

D3D12_COMMAND_QUEUE_DESC commandQueueDesc{};
commandQueueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
commandQueueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE;
WEBNN_RETURN_IF_FAILED(mCommandRecorderDML.D3D12Device->CreateCommandQueue(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, thanks for explain patiently.

@mingmingtasd
Copy link
Collaborator Author

As you suggested , we are migrating to the chromium's mojo based branch. Thanks! @bbernhar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants