[DML] Optimizations to improve DML memory utilization #221

Open
bbernhar opened this issue Mar 4, 2022 · 0 comments

bbernhar (Contributor) commented Mar 4, 2022

The following flame graphs are based on traces of MobileNetV2, ResNet50V2, and DirectMLSuperResolution.

Issue 1. Creating a large tensor resource takes 2x the wall-time due to a re-attempt (intel/GPGMM#180).

(flame graph: large tensor allocation, sub-allocation attempt followed by direct allocation)

The first attempt sub-allocates (blue), then fails and is followed by a direct allocation (reddish), making the whole operation about 2x slower. We should check the tensor size ahead of time and directly allocate when it cannot fit in a shared heap, rather than attempting sub-allocation and falling back; see the sketch below.
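A minimal sketch of the idea, assuming a fixed resource heap size; `kResourceHeapSize` and `ChooseAllocationMethod` are hypothetical names for illustration, not GPGMM API:

```cpp
#include <cstdint>

// Assumed: the sub-allocator manages fixed 4 MB resource heaps.
constexpr uint64_t kResourceHeapSize = 4ull * 1024 * 1024;

enum class AllocationMethod { kSubAllocate, kDirect };

// Decide the allocation path up front so a large tensor never pays for a
// sub-allocation attempt that is guaranteed to fail.
AllocationMethod ChooseAllocationMethod(uint64_t tensorSizeInBytes) {
    return (tensorSizeInBytes > kResourceHeapSize) ? AllocationMethod::kDirect
                                                   : AllocationMethod::kSubAllocate;
}
```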

Issue 2. Heaps backing resources are created on-demand (intel/GPGMM#110).

(flame graph: heap creation dominating tensor creation cost)

The biggest cost of tensor creation is creating the heap. We should pre-fetch the next heap so a subsequent request can be fulfilled without waiting on heap creation; see the sketch below.
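A minimal sketch of pre-fetching, assuming heap creation can run on a background thread; `Heap`, `CreateHeap`, and `HeapPool` are stand-ins for illustration (the real expensive call would be `ID3D12Device::CreateHeap`):

```cpp
#include <cstdint>
#include <future>
#include <memory>

// Stand-in for the real heap type, which would wrap an ID3D12Heap.
struct Heap {
    uint64_t sizeInBytes;
};

// Stand-in for the expensive creation call.
std::unique_ptr<Heap> CreateHeap(uint64_t sizeInBytes) {
    return std::make_unique<Heap>(Heap{sizeInBytes});
}

class HeapPool {
  public:
    explicit HeapPool(uint64_t heapSize) : mHeapSize(heapSize) {
        Prefetch();  // Warm the pool so the first request does not block.
    }

    // Hands out the pre-created heap and immediately starts creating the
    // next one in the background.
    std::unique_ptr<Heap> AcquireHeap() {
        std::unique_ptr<Heap> heap = mNextHeap.get();
        Prefetch();
        return heap;
    }

  private:
    void Prefetch() {
        mNextHeap = std::async(std::launch::async,
                               [size = mHeapSize] { return CreateHeap(size); });
    }

    uint64_t mHeapSize;
    std::future<std::unique_ptr<Heap>> mNextHeap;
};
```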

Issue 3. Tensors under-utilize memory.

Tensor allocations are 1 MB, whereas the heaps they occupy are 4 MB (4x waste). My goal is to move away from fixed/default heap sizes like we used in Dawn (now called PreferredResourceHeapSize) and instead grow them dynamically (intel/GPGMM#182); see the sketch below.

(screenshot: 1 MB tensor allocations inside 4 MB heaps)
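A minimal sketch of a geometric growth policy, assuming heap size should track demand rather than stay fixed; the floor, cap, and `GrowingHeapSizer` name are illustrative, not GPGMM API:

```cpp
#include <algorithm>
#include <cstdint>

// Assumed growth bounds: start small and double up to a cap, instead of
// always allocating a fixed PreferredResourceHeapSize.
constexpr uint64_t kMinHeapSize = 1ull * 1024 * 1024;   // 1 MB
constexpr uint64_t kMaxHeapSize = 64ull * 1024 * 1024;  // 64 MB

class GrowingHeapSizer {
  public:
    // Returns the size for the next heap, doubling on each call so heap
    // size tracks demand rather than wasting 4 MB on 1 MB of tensors.
    uint64_t NextHeapSize(uint64_t requestSizeInBytes) {
        // A heap must at least fit the request that triggered it.
        uint64_t size = std::max(mCurrentHeapSize, requestSizeInBytes);
        mCurrentHeapSize = std::min(mCurrentHeapSize * 2, kMaxHeapSize);
        return size;
    }

  private:
    uint64_t mCurrentHeapSize = kMinHeapSize;
};
```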

FYI @fujunwei, @huningxin, @RafaelCintron
