Monday, November 11, 2013

Announcing the Release of MVAPICH2 2.0b, MVAPICH2-X 2.0b and OSU Micro-Benchmarks (OMB) 4.2

The MVAPICH team is pleased to announce the release of MVAPICH2 2.0b,
MVAPICH2-X 2.0b (Hybrid MPI+PGAS (OpenSHMEM) with Unified
Communication Runtime) and OSU Micro-Benchmarks (OMB) 4.2.

Features, enhancements, and bug fixes for MVAPICH2 2.0b (since the
MVAPICH2 2.0a release) are listed below.

* Features and Enhancements (since 2.0a):
    - Based on MPICH-3.1b1
    - Multi-rail support for GPU communication
    - Non-blocking streams in asynchronous CUDA transfers for
      better overlap
    - Initialize GPU resources only when used by MPI transfer
    - Extended support for MPI-3 RMA in OFA-IB-CH3, OFA-IWARP-CH3, and
    - Additional MPIT counters and performance variables
    - Updated compiler wrappers to remove application dependency
      on network and other extra libraries
        - Thanks to Adam Moody from LLNL for the suggestion
    - Capability to checkpoint CH3 channel using the Hydra process manager
    - Optimized support for broadcast, reduce and other collectives
    - Tuning for IvyBridge architecture
    - Improved launch time for large-scale mpirun_rsh jobs
    - Introduced retry mechanism in mpirun_rsh for socket binding
    - Updated hwloc to version 1.7.2
* Bug Fixes (since 2.0a):
    - Consider list provided by MV2_IBA_HCA when scanning device list
    - Fix issues in Nemesis interface with --with-ch3-rank-bits=32
    - Better cleanup of XRC files in corner cases
    - Initialize using better defaults for ibv_modify_qp (initial ring)
    - Add unconditional check and addition of pthread library
    - MPI_Get_library_version updated with proper MVAPICH2 branding
        - Thanks to Jerome Vienne from TACC for the report
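
As an illustration of the "retry mechanism in mpirun_rsh for socket
binding" item above, here is a minimal sketch of the general
bind-with-retry pattern. It is not taken from the mpirun_rsh sources;
the function name, its parameters, and the exponential backoff policy
are all assumptions made for this example.

```python
import errno
import socket
import time


def bind_with_retry(host="127.0.0.1", port=0, attempts=5, delay=0.1):
    """Bind a listening TCP socket, retrying while the address is busy.

    Hypothetical helper: names and defaults are illustrative only and do
    not reflect how mpirun_rsh is actually implemented.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise  # only a busy address is worth retrying
            last_error = exc
            time.sleep(delay * (2 ** attempt))  # back off before retrying
    # Every attempt found the address in use; report the last failure.
    raise last_error
```

Passing port=0 asks the operating system for any free port, so a
launcher could call bind_with_retry() once per listening socket and
read the chosen port back with getsockname().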

The MVAPICH2-X 2.0b software package supports hybrid MPI+PGAS
(UPC and OpenSHMEM) programming models with a unified communication
runtime for emerging exascale systems. Users can write applications
with any of the following programming models on top of that single
runtime: MPI, MPI+OpenMP, pure UPC, and pure OpenSHMEM programs, as
well as hybrid MPI(+OpenMP) + PGAS (UPC and OpenSHMEM) programs.

Features, enhancements, and bug fixes for MVAPICH2-X 2.0b
(since MVAPICH2-X 2.0a) are as follows:

* Features and Enhancements (since 2.0a):
    - OpenSHMEM Features
        - Based on OpenSHMEM reference implementation 1.0e
        - Enhanced optimization of OpenSHMEM collectives (shmem_collect,
          shmem_fcollect, shmem_barrier, shmem_reduce, and shmem_broadcast)
        - Optimized shmalloc routine
    - UPC Features
        - Based on Berkeley UPC 2.18.0
        - Support for GUPC translator
    - MPI Features
        - Based on MVAPICH2 2.0b (OFA-IB-CH3 interface)
    - Unified Runtime Features
        - Based on MVAPICH2 2.0b (OFA-IB-CH3 interface). All the runtime
          features enabled by default in OFA-IB-CH3 interface of
          MVAPICH2 2.0b are available in MVAPICH2-X 2.0b
* Bug Fixes (since 2.0a):
    - OpenSHMEM Bug Fixes
        - Fixed synchronization issue in shmem_fence
        - Fixed issue in shmem_collect that prevented variable-length
          collects from working

New features and enhancements in OSU Micro-Benchmarks (OMB) 4.2 (since
the OMB 4.1 release) are listed below.

* New Features & Enhancements
    - New OpenSHMEM benchmarks
        - osu_oshm_fcollect
    - Enable handling of GPU device buffers in all MPI collective benchmarks
    - Add device binding for OpenACC benchmarks
* Bug Fixes
    - Add upc_fence after memput in osu_upc_memput benchmark
    - Removed incorrect reuse of pSync in OpenSHMEM benchmarks
        - Thanks to Sayan Ghosh for the report
    - Correct CUDA configuration example in README
    - Fix several warnings

To download MVAPICH2 2.0b, MVAPICH2-X 2.0b, and OMB 4.2, to obtain the
associated user guides and quick start guide, or to access the SVN
repository, please visit the following URL:

