The MVAPICH team is pleased to announce the release of MVAPICH2 2.0b,
MVAPICH2-X 2.0b (Hybrid MPI+PGAS (OpenSHMEM) with Unified
Communication Runtime) and OSU Micro-Benchmarks (OMB) 4.2.

Features, enhancements, and bug fixes for MVAPICH2 2.0b (since the
MVAPICH2 2.0a release) are listed below.

* Features and Enhancements (since 2.0a):
    - Based on MPICH-3.1b1
    - Multi-rail support for GPU communication
    - Non-blocking streams in asynchronous CUDA transfers for
      better overlap
    - Initialize GPU resources only when used by MPI transfer
    - Extended support for MPI-3 RMA in OFA-IB-CH3, OFA-IWARP-CH3, and
      OFA-RoCE-CH3
    - Additional MPIT counters and performance variables
    - Updated compiler wrappers to remove application dependency
      on network and other extra libraries
        - Thanks to Adam Moody from LLNL for the suggestion
    - Capability to checkpoint CH3 channel using the Hydra process manager
    - Optimized support for broadcast, reduce and other collectives
    - Tuning for IvyBridge architecture
    - Improved launch time for large-scale mpirun_rsh jobs
    - Introduced retry mechanism in mpirun_rsh for socket binding
    - Updated hwloc to version 1.7.2

* Bug Fixes (since 2.0a):
    - Consider list provided by MV2_IBA_HCA when scanning device list
    - Fix issues in Nemesis interface with --with-ch3-rank-bits=32
    - Better cleanup of XRC files in corner cases
    - Initialize using better defaults for ibv_modify_qp (initial ring)
    - Add unconditional check and addition of pthread library
    - MPI_Get_library_version updated with proper MVAPICH2 branding
        - Thanks to Jerome Vienne from TACC for the report

The MVAPICH2-X 2.0b software package provides support for hybrid
MPI+PGAS (UPC and OpenSHMEM) programming models with a unified
communication runtime for emerging exascale systems. It gives users
the flexibility to write applications using any of the following
programming models on that runtime: MPI, MPI+OpenMP, pure UPC, and
pure OpenSHMEM programs, as well as hybrid MPI(+OpenMP) + PGAS (UPC
and OpenSHMEM) programs.

Features, enhancements, and bug fixes for MVAPICH2-X 2.0b
(since MVAPICH2-X 2.0a) are as follows:

* Features and Enhancements (since 2.0a):
    - OpenSHMEM Features
        - Based on OpenSHMEM reference implementation 1.0e
        - Enhanced optimization of OpenSHMEM collectives (shmem_collect,
          shmem_fcollect, shmem_barrier, shmem_reduce, and shmem_broadcast)
        - Optimized shmalloc routine
    - UPC Features
        - Based on Berkeley UPC 2.18.0
        - Support for GUPC translator
    - MPI Features
        - Based on MVAPICH2 2.0b (OFA-IB-CH3 interface)
    - Unified Runtime Features
        - Based on MVAPICH2 2.0b (OFA-IB-CH3 interface). All the runtime
          features enabled by default in OFA-IB-CH3 interface of
          MVAPICH2 2.0b are available in MVAPICH2-X 2.0b

* Bug Fixes (since 2.0a):
    - OpenSHMEM Bug Fixes
        - Fixed synchronization issue in shmem_fence
        - Fixed an issue in shmem_collect that prevented the
          variable-length collect routine from working

New features, enhancements, and bug fixes for OSU Micro-Benchmarks
(OMB) 4.2 (since the OMB 4.1 release) are listed below.

* New Features & Enhancements
    - New OpenSHMEM benchmarks
        * osu_oshm_fcollect
    - Enable handling of GPU device buffers in all MPI collective benchmarks
    - Add device binding for OpenACC benchmarks

* Bug Fixes
    - Add upc_fence after memput in osu_upc_memput benchmark
    - Remove incorrect reuse of pSync in OpenSHMEM benchmarks
        - Thanks to Sayan Ghosh for the report
    - Correct CUDA configuration example in README
    - Fix several warnings

To download MVAPICH2 2.0b, MVAPICH2-X 2.0b, and OMB 4.2, along with
the associated user guides and quick start guide, or to access the
SVN repository, please visit the following URL:

http://mvapich.cse.ohio-state.edu