
2018 06 12 F2F Meeting in Austin

Guillaume Mercier edited this page Jul 11, 2018 · 1 revision

Summary

Guillaume started by presenting a summary of the progress of the WG's work. He began with an outline of the discussions: first his own presentation, then Shinji's presentation of the Fujitsu extensions, and finally a short presentation on the PMI and its portability.

Guillaume then described the pace of the discussions: three meetings each month, scheduled to accommodate most timezones, even though most attendees are currently from Europe and Japan. He also outlined the problem with the JP/US meeting, which does not attract enough people and may (for now) have to be dropped due to lack of attendees.

The presentation then moved on to the three "pillars" previously covered:

  • Implicit topology detection with Hsplit
  • Explicit topology detection (Fujitsu’s extensions, MPIT, Topology on CW)
  • Discussions around mapping and binding

More details were then given on each of these tracks. First, Guillaume recalled the Hsplit methodology, which uses a hierarchical-communicator abstraction to represent actual hardware hierarchies. Use cases from Météo France (the French weather forecast service) were presented, with a focus on their need for dynamic binding (MPMD programs).

The mic was then passed to Shinji, who presented the Fujitsu extensions for MPI. He started by presenting the topology of the Tofu interconnect. He then explained how users on the K computer can allocate topological jobs as slices of the torus. Finally, he presented the extension that allows a given code to query its dimensions and coordinates (including hardware ones).

Finally, Guillaume introduced the last topic, "binding", presenting it as a more contentious one. Indeed, it is not easy to draw the frontier between what MPI should do and what should be left to third parties such as the PMI or any other launcher. One problem is that there is no common interface to specify bindings in mpirun: it keeps changing between MPI implementations, and even between consecutive versions of the same implementation. As a consequence, there is no portable way of launching topological MPI jobs (for example with scripts). Second, we posed the question of portability between machines, as topology is bound to be specific to a given piece of hardware. A similar question arises for containers, which require a given binary to run in a changing environment.

In the last part, Jean-Baptiste raised the question of containers in MPI as an example of the specificities of launching MPI jobs in a constrained environment, first outlining the problem of discovering the hardware environment dynamically, and then the strong ties between MPI binaries and a given version of the PMI. The PMI is the standard launch mechanism, and arguably MPI should never have to expose such interfaces. However, extended dynamism in MPI (e.g. sessions) and more complex launch scenarios (containers, constrained topologies) question the boundaries between MPI and the PMI. One of the examples was an isolated network of VMs running MPI in pods (similar to what can be found in Kubernetes) and the issues arising from PMI support: changing ABIs, compatibility between versions, and the need to back-connect to the PMI daemon.

This led to animated discussions about the need to address these questions as part of MPI. A consensus emerged that, as a WG, we cannot influence these decisions much, and they also span further than our topological aspects; however, we think it is of interest to keep discussing them in a more exploratory fashion.