UCF Workshop 2018
Pavel Shamis (Pasha) edited this page Oct 3, 2018 · 15 revisions
December 10-13
Austin, TX
- Async progress for protocols (Yossi) Progress various protocols, such as rendezvous, stream, disconnect, and RMA/AMO emulation, using a progress thread
- Thread safety, fine-grained locking (Yossi) Discuss what is needed in UCP and UCT to support better concurrency than a big global lock
- Support for shmem signaled put (Yossi) How to support the new OpenSHMEM primitive, put with signal
- Upstream (rdma-core) support status (Yossi) Using UCX with inbox drivers and the latest rdma-core
- XPMEM support for tag matching (Yossi) Use 1-copy for expected eager messages using the UCT tag-offload API
- Stream API and close protocol (Yossi) Using the stream API as a replacement for TCP, and considerations for closing/flushing a connection
- High availability, failover (Yossi) How to implement fabric error recovery by using multiple devices/ports
- UCP API v2.0 (Yossi) Things we would like to change/optimize/clean up in the next UCP API, and backward compatibility considerations
- UCP Active message API (Yossi) Discuss active message implementation at the UCP level
- UCT component architecture (Yossi) Split UCT into modules and load them dynamically, so that a missing dependency disables only the relevant transports
- Multi-binary support for various uarch (Pasha)
- Internal memcpy, DPDK style? (Pasha)
- MPICH + UCX - State of the union (ANL/Ken)
- OpenSHMEM context to worker mapping
- UCX + GPU: AMD and NVIDIA - AMD (Khaled, Brad) / NVIDIA (Akshay) 2hr
- Collectives (AMD, Khaled) 1hr
- UCX specification update, manpages (Brad) 1hr
- OSSS SHMEM with UCX update (Tony) 1hr
- UCT API freeze (LANL / Nathan) 1hr
- Regression and testing (LANL, Mellanox) 1-2hr (RoCE, iWARP, TCP, etc.)
- Datatypes + GPUs (NVIDIA)
- OpenMPI + UCX - State of the union (Mellanox/Yossi)
- UCX + task based (UTK? George?)
- UCX + ML (Python bindings) (NVIDIA / Akshay) 1hr