Help:Remote Direct Memory Access

Kogence supports multiple parallel processing frameworks including MPI, OpenMP and Charm++.

Message Passing Interface (MPI) is a standardized and portable message-passing specification designed to function on a wide variety of parallel computing architectures, and it has several well-tested and efficient implementations. On Kogence we currently support the Open MPI (not to be confused with OpenMP), MPICH, MVAPICH (an MPICH derivative) and Intel MPI (an MPICH derivative) libraries. All of these support various hardware network interfaces, either through the OpenFabrics Interfaces (OFI) or through natively implemented support that provides OS-bypass Remote Direct Memory Access (RDMA), keeping inter-process communication off the traditional TCP/IP stack for a significant performance boost. On Kogence, users do not have to worry about the details of the hardware and network infrastructure; we configure the appropriate transport layer for the best performance of their HPC applications.
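As a concrete illustration of the message-passing model, here is a minimal C++ sketch in which every process reports its rank. It assumes only a standard MPI installation (any of the implementations named above) and the usual mpicxx/mpirun wrapper names.

    // hello_mpi.cpp -- minimal MPI sketch: every rank reports itself.
    // Build and run (assuming the usual wrapper names):
    //   mpicxx hello_mpi.cpp -o hello_mpi
    //   mpirun -np 4 ./hello_mpi
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);                 // start the MPI runtime

        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // this process's ID
        MPI_Comm_size(MPI_COMM_WORLD, &size);   // total number of processes

        std::printf("Hello from rank %d of %d\n", rank, size);

        MPI_Finalize();                         // shut the runtime down
        return 0;
    }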

For example, the Intel MPI Library package contains the libfabric library, which is put to use when the mpivars.sh script executes. libfabric implements the OFI interface: OFI provides a common set of communication commands that application developers can call, and libfabric translates them into the various protocols that different hardware providers support. The hardware provider's protocol can be selected with export I_MPI_OFI_PROVIDER=<name> or export FI_PROVIDER=<name>, with the first option being preferred. <name> can be mlx, verbs, efa, tcp or psm2. For example, with <name>=tcp OFI translates commands to the standard TCP/IP protocol, which works on standard Ethernet hardware as well as on IPoIB (IP over InfiniBand) and IPoOPA (IP over the Intel Omni-Path network). With <name>=mlx, OFI translates commands to the UCX protocol that is needed for OS-bypass on Mellanox InfiniBand, while <name>=psm2 serves the Intel Omni-Path network. The verbs provider is needed for other OS-bypass network hardware (non-Mellanox InfiniBand, iWARP/RoCE, etc.).
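As a minimal sketch of how a provider choice plays out, the program below simply prints which MPI library handled the run. The environment variable names are the ones quoted above; the file name which_mpi.cpp is illustrative, and one would set the provider in the launching shell, e.g. export FI_PROVIDER=tcp followed by mpirun -np 2 ./which_mpi.

    // which_mpi.cpp -- print the MPI library that handled this run, e.g.
    // after `export FI_PROVIDER=tcp` (or `export I_MPI_OFI_PROVIDER=mlx`)
    // in the launching shell. Which providers actually work depends on
    // the node's network hardware.
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        char version[MPI_MAX_LIBRARY_VERSION_STRING];
        int len = 0;
        MPI_Get_library_version(version, &len);  // e.g. "Intel(R) MPI Library ..."

        int rank = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            std::printf("MPI library in use: %s\n", version);

        MPI_Finalize();
        return 0;
    }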

Similar to MPI, Charm++ is a machine-independent parallel programming system. Programs written using this system run unchanged on MIMD computing systems with or without shared memory. It provides high-level mechanisms and strategies that facilitate the development of even highly complex parallel applications. Charm++ does offer some benefits over MPI, such as relative ease of programming for developers, fault tolerance, and a separate machine layer, so that the application developer does not need to distinguish whether the application runs with multi-threading parallelism on shared-memory SMP architectures or with multi-processing parallelism on distributed-memory clusters. Charm++ can use MPI as the transport layer (aka Charm over MPI), or you can run MPI programs using the Charm++ Adaptive MPI (AMPI) library (aka MPI over Charm). Charm++ can also use the TCP/IP stack or other OS-bypass network protocols for its messaging needs. The communication protocols and infrastructures supported by Charm++ are UDP, MPI, OFI, UCX, InfiniBand, uGNI, and PAMI. Charm++ programs run on all of these platforms without changing the source.
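For comparison with the MPI example above, here is a minimal Charm++ sketch consisting of two files shown in one listing; the file names hello.ci and hello.C are illustrative, and the listing assumes a standard Charm++ installation whose charmc wrapper generates hello.decl.h and hello.def.h from the interface file.

    ---- hello.ci (Charm++ interface file) ----
    mainmodule hello {
      mainchare Main {
        entry Main(CkArgMsg *m);
      };
    };

    ---- hello.C ----
    #include "hello.decl.h"    // generated by: charmc hello.ci

    class Main : public CBase_Main {
     public:
      Main(CkArgMsg *m) {
        // CkNumPes() counts processing elements, whether they are threads
        // on a shared-memory SMP node or processes on a distributed cluster.
        CkPrintf("Hello from Charm++ on %d processing elements\n", CkNumPes());
        delete m;
        CkExit();              // terminate the program
      }
    };

    #include "hello.def.h"     // generated definitions

A typical build runs charmc hello.ci to generate the declaration and definition headers, then charmc -o hello hello.C, and the program is launched with ./charmrun +p4 ./hello; the same source runs on any of the communication layers listed above.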