Customized cost functions for faster contractions #171

jcmgray · 2021-10-23T18:48:25Z

As part of some work on contracting some very large tensor networks we found that a good heuristic cost to optimize for is not just 'FLOPs' but also 'WRITE' - i.e. the size of the intermediate output tensors regardless of the cost to product it.

For example, an outer product with shapes like (d, 1), (1, d)->(d, d) has the same WRITE cost as matrix multiplication with shapes like (d, d), (d, d) -> (d, d) despite having a FLOPs cost d times lower.

If you take a cost as FLOPS + alpha * WRITE and optimize for this you can get some significant and fairly generic gains across hardware in terms of actual efficiency and contraction time. Relevant probably for e.g. #103.

Here (you can ignore what the contraction is and the 'index sorting' bit), we have FLOPS and WRITE on top and actual contraction time on bottom on a single cpu thread:

You can see there's about a 5x speedup.

Here's the same for a larger contraction on an A100 (big fast gpu) with a speedup of 8x:

In both cases choosing alpha=64 is very decent despite probably having different ratios of cpu cycles to memory speed etc etc. Generally there seems to be a window where neither the FLOPS or WRITE cost is compromised too much. EDIT: I should emphasize that the above contractions tend to produce very low arithmetic intensity contractions, so the speedups will not always be this large.

I've had this in my own branch of opt_einsum for a while (just for the dp optimizer, but which one can use to take any large contraction path/tree and iteratively 'reconfigure' subtrees of), but it would be nice to have it in the main opt_einsum and it also might be useful to others.

Currently the interface is like:

# use the default alpha (probably 64)
opt = DynamicProgramming(minimize='combo')

# use a specific alpha
opt = DynamicProgramming(minimize='combo-100')

@dgasmith, thoughts on including?

The text was updated successfully, but these errors were encountered:

jcmgray · 2021-10-23T19:05:04Z

Oh the other useful thing to note is that just taking the 'WRITE' cost - i.e. the sum of the sizes of all the intermediates - is the most relevant quantity for computing the space complexity of back propagating through a contraction (since roughly speaking all intermediates must be kept to compute gradients), which is often the more practical limit than actual time cost.

jcmgray · 2021-10-23T19:13:00Z

Mostly for fun, here is the actual effect of changing alpha on the contraction trees (/paths):

[Cost and size of contractions are node and edge width+color respectively.]

On the left is a alpha=0 contraction, with many low-efficiency contraction of a very large tensor with a very small tensor, and on the right is an alpha=64 contraction, where the effort is concentrated into fewer, more balanced and thus more efficient contractions.

dgasmith · 2021-10-24T20:04:52Z

This will be great to get in and seems to be an evolution of the different greedy approaches. We seem to have many choices that we wish to parametrize over time and give them simple names- an alternative approach could be leaning into the object-based count object such as:

cost_fn = CostFunction(alpha=64, ...)
cost = cost_fn(indices, removed, ...)

jcmgray · 2022-01-17T02:27:14Z

These are now available with #181.

jcmgray added the enhancement label Oct 23, 2021

jcmgray mentioned this issue Jan 15, 2022

more custom dp scores for faster contractions #181

Merged

3 tasks

jcmgray closed this as completed Jan 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Customized cost functions for faster contractions #171

Customized cost functions for faster contractions #171

jcmgray commented Oct 23, 2021 •

edited

Loading

jcmgray commented Oct 23, 2021

jcmgray commented Oct 23, 2021

dgasmith commented Oct 24, 2021

jcmgray commented Jan 17, 2022

Customized cost functions for faster contractions #171

Customized cost functions for faster contractions #171

Comments

jcmgray commented Oct 23, 2021 • edited Loading

jcmgray commented Oct 23, 2021

jcmgray commented Oct 23, 2021

dgasmith commented Oct 24, 2021

jcmgray commented Jan 17, 2022

jcmgray commented Oct 23, 2021 •

edited

Loading