Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler: Rework HaloSpot optimization #2477

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

georgebisbas
Copy link
Contributor

Fix for issue #2448, supplemented with some code edits

devito/passes/iet/mpi.py Show resolved Hide resolved
tests/test_mpi.py Show resolved Hide resolved
Copy link

codecov bot commented Nov 4, 2024

Codecov Report

Attention: Patch coverage is 98.21429% with 4 lines in your changes missing coverage. Please review.

Project coverage is 87.34%. Comparing base (279f074) to head (eb1435a).

Files with missing lines Patch % Lines
devito/mpi/halo_scheme.py 90.47% 1 Missing and 1 partial ⚠️
devito/passes/iet/mpi.py 96.72% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2477      +/-   ##
==========================================
+ Coverage   87.29%   87.34%   +0.05%     
==========================================
  Files         238      238              
  Lines       45366    45522     +156     
  Branches     4029     4030       +1     
==========================================
+ Hits        39602    39763     +161     
+ Misses       5083     5079       -4     
+ Partials      681      680       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

devito/ir/iet/nodes.py Show resolved Hide resolved
devito/ir/iet/nodes.py Outdated Show resolved Hide resolved
frozenset(self.halos),
frozenset(self.dims)))

def rebuild(self, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subclass Reconstructable and add rargs?

@@ -95,8 +127,16 @@ def __init__(self, exprs, ispace):
self._honored = frozendict(self._honored)

def __repr__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be attached to a mixin class for both HaloSpot and HaloSchemeEntry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes but where would you add it?

I would just drop all these repr it's a lot of code for what at the end of the day?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you print the IET, you see more useful details, it makes dev/debug life easier!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway, Ed has a point, this is redundant code which should somehow be factorized (a private method of HaloScheme, also called by HaloSpot.__repr__? not sure)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also thinking of this, but fmapper is the problem, we need to drop it from the entry? and thus be a class itself?
needs some thought, I may restructure more

tests/test_mpi.py Outdated Show resolved Hide resolved
tests/test_mpi.py Outdated Show resolved Hide resolved
tests/test_mpi.py Outdated Show resolved Hide resolved
tests/test_mpi.py Outdated Show resolved Hide resolved
devito/ir/iet/nodes.py Outdated Show resolved Hide resolved
@@ -28,7 +28,39 @@ class HaloLabel(Tag):
STENCIL = HaloLabel('stencil')


HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as Ed wrote below, this should inherit from Reconstructable

HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry:

def __init__(self, loc_indices, loc_dirs, halos, dims):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can it not inherit, alternatively, from EnrichedTuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THere are some 'getters', that look redundant ?
Reconstrcutable seems to work fine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that ok now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that look redundant

with EnrichedTuple, once you properly override __rargs__ and __rkwargs__, you probably don't need to override __init__ and __eq__

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fail to do so, so far,, the already existing code of DimensionTuple does not seem to help me....

@@ -95,8 +127,16 @@ def __init__(self, exprs, ispace):
self._honored = frozendict(self._honored)

def __repr__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes but where would you add it?

I would just drop all these repr it's a lot of code for what at the end of the day?

tests/test_mpi.py Outdated Show resolved Hide resolved
# a stopper to halo merging
# Loop over the functions in the HaloSpots
for f, v in hs1.fmapper.items():
# If no time accesses, skip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"E.g., if no time accesses, skip"

you have to say it's an example...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wdym by 'say it's an example' ?

return not any(d in hs.dimensions or dep.distance_mapper[d] is S.Infinity
for d in dep.cause)
# If the function is not in both HaloSpots, skip
if (*hs0.functions, *hs1.functions).count(f) < 2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.intersection(...), again as mentioned above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no needed anymore

devito/passes/iet/mpi.py Outdated Show resolved Hide resolved

# Precompute scopes to save time
scopes = {i: Scope([e.expr for e in v]) for i, v in MapNodes().visit(iet).items()}

cond_mapper = _create_cond_mapper(iet)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typically we use the word "make", so _make_cond_mapper(iet), but obviously I'm nitpicking here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


candidates = [i.dim._defines for i in iters[n:]]
for hs1 in halo_spots[i+1:]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you could avoid an indentation level but rather considering all pairs

from itertools import combinations

...

for hs0, hs1 in combinations(halo_schemes, 2):
    ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, are we sure that the current code works if there are 2 hoisting opportunities, instead of 1 ?

could we add a test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am working on this now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@georgebisbas georgebisbas force-pushed the halo_opt_revamp_II branch 4 times, most recently from ca5251a to 5951eaf Compare November 8, 2024 11:04
HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry:

def __init__(self, loc_indices, loc_dirs, halos, dims):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that ok now?

"""
iet = _drop_halospots(iet)
iet = _hoist_halospots(iet)
iet = _drop_reduction_halospots(iet)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is just renamed


for it, halo_spots in iter_mapper.items():

for hs0, hs1 in combinations(halo_spots, r=2):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two-level combo as proposed

iet = Transformer(mapper, nested=True).visit(iet)

# Clean up: de-nest HaloSpots if necessary
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no need for that anymore

tests/test_mpi.py Show resolved Hide resolved
HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry(Reconstructable):

__rkwargs__ = ('loc_indices', 'loc_dirs', 'halos', 'dims')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be __rargs__?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return cond0 != cond1


def motion_rules():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps define a motion_rules = _motion_rules()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be a function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I'd call this simply "rules"

def _rule0(...):
     """...."""
     <impl>

def _rule1(...):
     """...."""
     <impl>

rules = (_rule0, _rule1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry(Reconstructable):

__rkwargs__ = ('loc_indices', 'loc_dirs', 'halos', 'dims')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry:

def __init__(self, loc_indices, loc_dirs, halos, dims):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that look redundant

with EnrichedTuple, once you properly override __rargs__ and __rkwargs__, you probably don't need to override __init__ and __eq__

@@ -95,8 +127,16 @@ def __init__(self, exprs, ispace):
self._honored = frozendict(self._honored)

def __repr__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway, Ed has a point, this is redundant code which should somehow be factorized (a private method of HaloScheme, also called by HaloSpot.__repr__? not sure)

hse = HaloSchemeEntry(frozendict(loc_indices),
frozendict(loc_dirs),
hse0.halos, hse0.dims)
hse = hse0._rebuild(loc_indices=frozendict(loc_indices),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the cast to frozendict should be moved to inside HaloSchemeEntry.__init__

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be for all attributes?

devito/passes/iet/mpi.py Outdated Show resolved Hide resolved
for hs, v in cond_mapper.items()}
cond_mapper = _make_cond_mapper(iet)

mapper = {}

iter_mapper = MapNodes(Iteration, HaloSpot, 'immediate').visit(iet)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may introudce a private function to get this iter_mapper too, since you're doing the same exact filtering that appears in hoist_invariant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that I think about it, you can probably also move the cond_mapper inside such a new utility function, and do the conditional filtering in there directly, rather than repeating it in both passes (it's basically the same stuff repeated at the moment)

for hs, v in cond_mapper.items()}


def ensure_control_flow(hs0, hs1, cond_mapper):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private


def ensure_control_flow(hs0, hs1, cond_mapper):
"""
# If there are Conditionals involved, both `hs0` and `hs1` must be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop "#"

def ensure_control_flow(hs0, hs1, cond_mapper):
"""
# If there are Conditionals involved, both `hs0` and `hs1` must be
# within the same Conditional, otherwise we would break the control
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

control flow

return cond0 != cond1


def motion_rules():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I'd call this simply "rules"

def _rule0(...):
     """...."""
     <impl>

def _rule1(...):
     """...."""
     <impl>

rules = (_rule0, _rule1)

tests/test_mpi.py Outdated Show resolved Hide resolved
@pytest.mark.parallel(mode=1)
def test_halo_structure(self, mode):

mytest = TestTTI()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove, oneline TestTTI().tti_operator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if i is None or len(halo_spots) <= 1:
continue
# Drop pairs that have keys that are None
iter_mapper = {k: v for k, v in iter_mapper.items() if k is not None}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole section looks like a plain copy paste of 89-119 especially the loop next, iter_maper filtering and conditions, can't this be. merged or lifted in some common piece?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplified a bit

@@ -95,8 +127,16 @@ def __init__(self, exprs, ispace):
self._honored = frozendict(self._honored)

def __repr__(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also thinking of this, but fmapper is the problem, we need to drop it from the entry? and thus be a class itself?
needs some thought, I may restructure more

HaloSchemeEntry = namedtuple('HaloSchemeEntry', 'loc_indices loc_dirs halos dims')
class HaloSchemeEntry(Reconstructable):

__rkwargs__ = ('loc_indices', 'loc_dirs', 'halos', 'dims')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

hse = HaloSchemeEntry(frozendict(loc_indices),
frozendict(loc_dirs),
hse0.halos, hse0.dims)
hse = hse0._rebuild(loc_indices=frozendict(loc_indices),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

hse = HaloSchemeEntry(frozendict(loc_indices),
frozendict(loc_dirs),
hse0.halos, hse0.dims)
hse = hse0._rebuild(loc_indices=frozendict(loc_indices),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be for all attributes?

devito/passes/iet/mpi.py Outdated Show resolved Hide resolved
iter_mapper = {k: v for k, v in iter_mapper.items() if len(v) > 1}

for it, halo_spots in iter_mapper.items():

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dropped

if ensure_control_flow(hs0, hs1, cond_mapper):
continue

# If there are overlapping time accesses, skip
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cahnged, is it better?

devito/passes/iet/mpi.py Outdated Show resolved Hide resolved
devito/passes/iet/mpi.py Outdated Show resolved Hide resolved
@@ -524,7 +524,7 @@ def ForwardOperator(model, geometry, space_order=4, kernel='sls', time_order=2,

# Substitute spacing terms to reduce flops
return Operator(eqn + src_term + rec_term, subs=model.spacing_map,
name='Forward', **kwargs)
name='ViscoAcForward', **kwargs)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mloubout @FabioLuporini are you ok with this?

for d in hse.loc_indices:
md = hse.loc_indices[d]
if md in it.uindices:
if it.direction is Backward:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be moved into it as it.start ?

Copy link
Contributor

@EdCaunt EdCaunt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few minor comments. Looks very good to me


# *** Utilities

def filter_iter_mapper(iet):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: if these aren't going to be exposed, do they need to be private? Either way, they should probably all be consistent with one another

for d, v in loc_indices.items())


rules = [_rule0, _rule1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Making this a tuple would be safer


# Test two variants of receiver interpolation
nrec = 1
rec = SparseTimeFunction(name="rec", grid=grid, npoint=nrec, nt=tn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: You could remove some lines by having npoint=1, nt=30 in here. You can do similar stuff throughout this test (for example Grid(shape=(2,)).

rec_term3 = rec2.interpolate(expr=v2.forward)

# Test receiver interpolation 0, here we have a halo exchange hoisted
op2 = Operator([u_v] + [u_v2] + [u_tau] + [u_tau2] + rec_term0 + rec_term2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this and op3 be split off in their own tests, or is it necessary to have them in this test?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler MPI mpi-related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compiler: Redundant haloupdate
4 participants