Friday, August 30, 2013
Diagnostic Tools to diagnose Infiniband Fabric Information
There are a few diagnostic tools to diagnose InfiniBand fabric information. Use man to see the parameters for each of the tools:
- ibnodes - (Show InfiniBand nodes in topology)
- ibhosts - (Show InfiniBand host nodes in topology)
- ibswitches - (Show InfiniBand switch nodes in topology)
- ibnetdiscover - (Discover InfiniBand topology)
- ibchecknet - (Validate IB subnet and report errors)
- ibdiagnet - (Scans the fabric using directed route packets and extracts all the available information regarding its connectivity and devices)
- perfquery - (Find errors on a particular HCA or switch port, or on a number of them)
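As a quick first pass over the fabric, a minimal sketch might look like this (the output file name and the LID in the last example are illustrative):
# ibnetdiscover > /tmp/fabric-topology.out
# ibhosts
# ibswitches
# perfquery
Run without arguments, perfquery reports the port counters of the local HCA; pass a LID and port number (e.g. perfquery 2 1) to query a remote port.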
Thursday, August 29, 2013
New Intel® Enterprise Edition for Lustre* Software Designed to Simplify Big Data Management, Storage
NEWS HIGHLIGHTS from Intel
- Intel® Enterprise Edition for Lustre* software helps simplify configuration, monitoring, management and storage of high volumes of data.
- With Intel® Manager for Lustre* software, Intel is able to extend the reach of Lustre into new markets such as financial services, data analytics, pharmaceuticals, and oil and gas.
- When combined with the Intel® Distribution for Apache Hadoop* software, Hadoop users can access Lustre data files directly, saving time and resources.
- New software offering furthers Intel's commitment to drive new levels of performance and features through continuing contributions to the open source community.
Wednesday, August 28, 2013
Issues when compiling OpenMPI 1.6 and Intel 2013 XE on CentOS 5
I encountered the following error when compiling OpenMPI 1.6 with Intel 2013 XE on CentOS 5.3:
.....
opal_wrapper.c:(............): undefined reference to `__intel_sse2_strdup'
opal_wrapper.c:(............): undefined reference to `__intel_sse2_strncmp'
......
opal_wrapper.o:opal_wrapper.c:(.......): more undefined references to `__intel_sse2_strncmp' follow
......
There is one interesting forum discussion from Intel (Problem with Openmpi+c++ compiler) that helps explain how someone solved it. Basically, it could be due to mismatched libraries and binaries, especially if you have multiple versions of the Intel compilers installed. Do look at your $PATH and $LD_LIBRARY_PATH.
Even with correct editing of the $PATH and $LD_LIBRARY_PATH, I was not able to get away from the error. But when I used CentOS 6 with OpenMPI 1.6 and Intel XE 2013, I was spared from this error.
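As a hedged sketch of the kind of checks that help here (the grep patterns are illustrative), verify that the compiler and the runtime libraries being picked up come from the same Intel installation:
# which icc ifort mpicc
# echo $LD_LIBRARY_PATH | tr ':' '\n' | grep -i intel
# ldd $(which mpicc) | grep -i intel
If the paths point at different Intel versions, clean up $PATH and $LD_LIBRARY_PATH (for example, by sourcing only one version's compilervars.sh) before re-running configure and make.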
Monday, August 26, 2013
Announcing the Release of MVAPICH2 2.0a, MVAPICH2-X 2.0a and OSU Micro-Benchmarks (OMB) 4.1
The MVAPICH team is pleased to announce the release of MVAPICH2 2.0a, MVAPICH2-X 2.0a (Hybrid MPI+PGAS (OpenSHMEM) with Unified Communication Runtime) and OSU Micro-Benchmarks (OMB) 4.1.
Features, Enhancements, and Bug Fixes for MVAPICH2 2.0a (since MVAPICH2 1.9GA release) are listed here.
* Features and Enhancements (since 1.9GA):
- Based on MPICH-3.0.4
- Dynamic CUDA initialization. Support GPU device selection after MPI_Init
- Support for running on heterogeneous clusters with GPU and non-GPU nodes
- Supporting MPI-3 RMA atomic operations and flush operations with CH3-Gen2 interface
- Exposing internal performance variables to MPI-3 Tools information interface (MPIT)
- Enhanced MPI_Bcast performance
- Enhanced performance for large message MPI_Scatter and MPI_Gather
- Enhanced intra-node SMP performance
- Tuned SMP eager threshold parameters
- Reduced memory footprint
- Improved job-startup performance
- Warn and continue when ptmalloc fails to initialize
- Enable hierarchical SSH-based startup with Checkpoint-Restart
- Enable the use of Hydra launcher with Checkpoint-Restart
* Bug-Fixes (since 1.9GA):
- Fix data validation issue with MPI_Bcast
  - Thanks to Claudio J. Margulis from University of Iowa for the report
- Fix buffer alignment for large message shared memory transfers
- Fix a bug in One-Sided shared memory backed windows
- Fix a flow-control bug in UD transport
  - Thanks to Benjamin M. Auer from NASA for the report
- Fix bugs with MPI-3 RMA in Nemesis IB interface
- Fix issue with very large message (>2GB bytes) MPI_Bcast
  - Thanks to Lu Qiyue for the report
- Handle case where $HOME is not set during search for MV2 user config file
  - Thanks to Adam Moody from LLNL for the patch
- Fix a hang in connection setup with RDMA-CM
MVAPICH2-X 2.0a software package provides support for hybrid MPI+PGAS (UPC and OpenSHMEM) programming models with unified communication runtime for emerging exascale systems. This software package provides flexibility for users to write applications using the following programming models with a unified communication runtime: MPI, MPI+OpenMP, pure UPC, and pure OpenSHMEM programs as well as hybrid MPI(+OpenMP) + PGAS (UPC and OpenSHMEM) programs.
Features and enhancements for MVAPICH2-X 2.0a (since MVAPICH2-X 1.9GA) are as follows:
* Features and Enhancements (since 1.9GA):
- OpenSHMEM Features
  - Optimized OpenSHMEM Collectives (Improved performance for shmem_collect, shmem_barrier, shmem_reduce and shmem_broadcast)
- MPI Features
  - Based on MVAPICH2 2.0a (OFA-IB-CH3 interface)
- Unified Runtime Features
  - Based on MVAPICH2 2.0a (OFA-IB-CH3 interface). All the runtime features enabled by default in the OFA-IB-CH3 interface of MVAPICH2 2.0a are available in MVAPICH2-X 2.0a
New features and enhancements of OSU Micro-Benchmarks (OMB) 4.1 (since the OMB 4.0.1 release) are listed here.
* New Features & Enhancements
- New OpenSHMEM benchmarks
  * osu_oshm_barrier
  * osu_oshm_broadcast
  * osu_oshm_collect
  * osu_oshm_reduce
- New MPI-3 RMA Atomics benchmarks
  * osu_cas_flush
  * osu_fop_flush
For downloading MVAPICH2 2.0a, MVAPICH2-X 2.0a, OMB 4.1, associated user guides, quick start guide, and accessing the SVN, please visit the following URL:
http://mvapich.cse.ohio-state.edu/
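As a rough getting-started sketch (the exact tarball name and download path are assumptions; check the download page for the current ones):
# wget http://mvapich.cse.ohio-state.edu/download/mvapich2/mvapich2-2.0a.tgz
# tar -zxvf mvapich2-2.0a.tgz
# cd mvapich2-2.0a
# ./configure --prefix=/usr/local/mvapich2-2.0a
# make -j 8
# make install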
Friday, August 23, 2013
Open Compute Project
The Open Compute Project is the result of a group of Facebook engineers contributing back to the community what they have learned in building a highly energy-efficient Data Centre.
There are a few design specifications:
- Server Design Specification
- Storage Design Specification
- Data Center Design Specification
- Virtual IO Design Specification
- Hardware Management Specification
- Certification Standards
Tuesday, August 20, 2013
Building OpenMPI Libraries for 64-bit integers
There is an excellent article on how to build OpenMPI Libraries for 64-bit integers. For more detailed information, do look at How to build MPI libraries for 64-bit integers
The information on this website is taken from the above site.
Step 1: Check the integer size. Do the following:
# ompi_info -a | grep 'Fort integer size'
If the output is as below, you have to compile OpenMPI with 64-bit integers.
Fort integer size: 4
Intel Compilers
Step 2a: To compile OpenMPI with Intel Compilers and with 64-bits integers, do the following:
# ./configure --prefix=/usr/local/openmpi CXX=icpc CC=icc F77=ifort FC=ifort FFLAGS=-i8 FCFLAGS=-i8
# make -j 8
# make install
* GNU Compilers
Step 2b: To compile OpenMPI with GNU Compilers and with 64-bit integers, do the following:
# ./configure --prefix=/usr/local/openmpi CXX=g++ CC=gcc F77=gfortran FC=gfortran \
FFLAGS="-m64 -fdefault-integer-8" \
FCFLAGS="-m64 -fdefault-integer-8" \
CFLAGS=-m64 \
CXXFLAGS=-m64
# make -j 8
# make install
Step 3: Update your PATH and LD_LIBRARY_PATH in your .bashrc
export PATH=/usr/local/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH
Verify that the installation is correct
# ompi_info -a | grep 'Fort integer size'
Fort integer size: 8
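Beyond ompi_info, a hedged end-to-end sanity check (the test file name is illustrative) is to compile a tiny Fortran program and print the default integer width:
# cat > check.f90 << 'EOF'
program t
  integer :: i
  print *, 'integer bytes:', bit_size(i)/8
end program t
EOF
# mpif90 -fdefault-integer-8 check.f90 -o check && ./check
Use -i8 instead of -fdefault-integer-8 with the Intel compilers. Note that you still pass the integer-width flag when compiling your own code, as the wrapper compiler does not necessarily add it for you.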
Monday, August 12, 2013
Registering sufficient memory for OpenIB when using Mellanox HCA
If you encounter errors like "error registering openib memory", similar to what is written below, you may want to take a look at the Open MPI FAQ - I'm getting errors about "error registering openib memory"; what do I do?
WARNING: It appears that your OpenFabrics subsystem is configured to only allow registering part of your physical memory. This can cause MPI jobs to run with erratic performance, hang, and/or crash. This may be caused by your OpenFabrics vendor limiting the amount of physical memory that can be registered. You should investigate the relevant Linux kernel module parameters that control how much physical memory can be registered, and increase them to allow registering all physical memory on your machine.
See this Open MPI FAQ item for more information on these Linux kernel module parameters:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
Local host: node02
Registerable memory: 32768 MiB
Total memory: 65476 MiB
Your MPI job will continue, but may be behave poorly and/or hang.
The explanation and solution can be found at How to increase MTT Size in Mellanox HCA.
In summary, the error occurs when an application consumes a large amount of memory and not enough of it can be registered with RDMA; the application may then fail. There is a need to increase the MTT size, but increasing the MTT size has the downside of increasing the number of "cache misses" and increasing latency.
For a more detailed writeup, see Registering sufficient memory for OpenIB when using Mellanox HCA (linuxcluster.wordpress.com)
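As a hedged sketch of the fix described in the FAQ (the values are illustrative and must be sized to your own RAM), the maximum registerable memory on mlx4 hardware is 2^log_num_mtt * 2^log_mtts_per_seg * PAGE_SIZE. For the 64 GB node in the warning above, registering twice the RAM (128 GiB) with 4 KiB pages needs log_num_mtt=22 and log_mtts_per_seg=3, since 2^22 * 2^3 * 4096 = 128 GiB:
# echo "options mlx4_core log_num_mtt=22 log_mtts_per_seg=3" > /etc/modprobe.d/mlx4_core.conf
Reload the mlx4_core module or reboot for the new values to take effect.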
Thursday, August 8, 2013
Tracking Batch Jobs at Platform LSF
The content of this article is taken from http://users.cs.fiu.edu/~tho01/psg/3rdParty/lsf4_userGuide/07-tracking.html
1. Displaying All Job Status
# bjobs -u all
2. Report Reasons why a job is pending
# bjobs -p
3. Report Pending Reasons with host names for each conditions
# bjobs -lp
4. Detailed Report on a specific job
# bjobs -l 6653
5. Reasons why the job is suspended
# bjobs -s
6. Displaying the Output of a Running Job
# bpeek 12345
7. Killing Jobs
# bkill 12345
8. Stop the Job
# bstop 12345
9. Resume the job
# bresume 12345
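Putting several of these together, a hedged end-to-end sketch (the job script name and job ID are illustrative):
# bsub < myjob.lsf
Job <12345> is submitted to default queue <normal>.
# bjobs -l 12345
# bhist -l 12345
bhist displays the chronological event history of a job (submitted, started, suspended, resumed), complementing the point-in-time view from bjobs and the output view from bpeek.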
Tuesday, August 6, 2013
Virtual Memory PAGESIZE on CentOS
There is a very good write-up on Linux Virtual Memory PAGESIZE from Nixcraft
(Linux Find Out Virtual Memory PAGESIZE).
To get the Linux Virtual Memory PAGESIZE, do use the following command
# getconf PAGESIZE
4096
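As a small follow-on sketch, PAGESIZE can be combined with _PHYS_PAGES (also reported by getconf on glibc systems) to work out the total physical memory:
# echo $(( $(getconf PAGESIZE) * $(getconf _PHYS_PAGES) / 1024 / 1024 )) MB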
Sunday, August 4, 2013
Installing pdsh to issue commands to a group of nodes in parallel in CentOS
1. What is pdsh?
Pdsh is a high-performance, parallel remote shell utility. It uses a sliding window of threads to execute remote commands, conserving socket resources while allowing some connections to timeout if needed. It was originally written as a replacement for IBM's DSH on clusters at LLNL. More information can be found at PDSH Web site
2. Setup EPEL yum repository on CentOS 6. For more information, see Repository of CentOS 6 and Scientific Linux 6
3. Do a yum install
# yum install pdsh
To confirm installation:
# which pdsh
4. Configure user environment for PDSH
# vim /etc/profile.d
Edit the following:
# setup pdsh for cluster users
export PDSH_RCMD_TYPE='ssh'
export WCOLL='/etc/pdsh/machines'
5. Put the host name of the Compute Nodes
# vim /etc/pdsh/machines/
node1
node2
node3
.......
6. Make sure the nodes have completed their SSH key exchange. For more information, see Auto SSH Login without Password
7. Do Install Step 1 to Step 3 on ALL the client nodes.
B. USING PDSH
Run the command: pdsh [options]... command
1. To target all the nodes found in /etc/pdsh/machines. This assumes the files have been transferred already; do note that a parallel copy utility ships with pdsh (see the sketch after this list).
# pdsh -a "rpm -Uvh /root/htop-1.0.2-1.el6.rf.x86_64.rpm"
2. To exclude specific nodes, use the -x option
# pdsh -x host1,host2 "rpm -Uvh /root/htop-1.0.2-1.el6.rf.x86_64.rpm"
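Two more hedged examples (host names are illustrative): pdcp, the parallel copy utility mentioned above, handles the copy step that example 1 assumes has already happened, and -w targets an explicit list of nodes rather than excluding them:
# pdcp -a /root/htop-1.0.2-1.el6.rf.x86_64.rpm /root/
# pdsh -w node1,node2 "rpm -Uvh /root/htop-1.0.2-1.el6.rf.x86_64.rpm"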
Saturday, August 3, 2013
Using nvidia-smi to get information on GPU Cards
NVIDIA's System Management Interface (nvidia-smi) is a useful tool for manipulating and controlling the GPU cards. A few use cases are listed here.
1. Listing of NVIDIA GPU Cards
# nvidia-smi -L
GPU 0: Tesla M2070 (S/N: 03212xxxxxxxx)
GPU 1: Tesla M2070 (S/N: 03212yyyyyyyy)
2. Display GPU information
# nvidia-smi -i 0 -q

==============NVSMI LOG==============
Timestamp                       : Sun Jul 28 23:49:20 2013
Driver Version                  : 295.41
Attached GPUs                   : 2

GPU 0000:19:00.0
    Product Name                : Tesla M2070
    Display Mode                : Disabled
    Persistence Mode            : Disabled
    Driver Model
        Current                 : N/A
        Pending                 : N/A
    Serial Number               : 03212xxxxxxxx
    GPU UUID                    : GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx
    VBIOS Version               : 70.00.3E.00.03
    Inforom Version
        OEM Object              : 1.0
        ECC Object              : 1.0
        Power Management Object : 1.0
    PCI
        Bus                     : 0x19
        Device                  : 0x00
        Domain                  : 0x0000
        Device Id               : 0xxxxxxxxx
        Bus Id                  : 0000:19:00.0
        Sub System Id           : 0x083010DE
        GPU Link Info
            PCIe Generation
                Max             : 2
                Current         : 2
            Link Width
                Max             : 16x
                Current         : 16x
    Fan Speed                   : N/A
    Performance State           : P0
    Memory Usage
        Total                   : 6143 MB
        Used                    : 10 MB
        Free                    : 6132 MB
    Compute Mode                : Exclusive_Thread
    Utilization
        Gpu                     : 0 %
        Memory                  : 0 %
    Ecc Mode
        Current                 : Disabled
        Pending                 : Disabled
    ECC Errors
        Volatile
            Single Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
        Aggregate
            Single Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
    Temperature
        Gpu                     : N/A
    Power Readings
        Power Management        : N/A
        Power Draw              : N/A
        Power Limit             : N/A
    Clocks
        Graphics                : 573 MHz
        SM                      : 1147 MHz
        Memory                  : 1566 MHz
    Max Clocks
        Graphics                : 573 MHz
        SM                      : 1147 MHz
        Memory                  : 1566 MHz
    Compute Processes           : None
3. Display selected GPU Information (MEMORY, UTILIZATION, ECC, TEMPERATURE, POWER, CLOCK, COMPUTE, PIDS, PERFORMANCE)
# nvidia-smi -i 0 -q -d MEMORY,ECC

==============NVSMI LOG==============
Timestamp                       : Mon Jul 29 00:04:36 2013
Driver Version                  : 295.41
Attached GPUs                   : 2

GPU 0000:19:00.0
    Memory Usage
        Total                   : 6143 MB
        Used                    : 10 MB
        Free                    : 6132 MB
    Ecc Mode
        Current                 : Disabled
        Pending                 : Disabled
    ECC Errors
        Volatile
            Single Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
        Aggregate
            Single Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
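To poll several cards at once, a small hedged loop over the GPU IDs reported by nvidia-smi -L (the IDs are illustrative):
# for i in 0 1; do nvidia-smi -i $i -q -d TEMPERATURE,UTILIZATION; done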
Friday, August 2, 2013
Compiling MVAPICH2-1.9 with Intel and CUDA
Step 1: Download MVAPICH2 1.9 from http://mvapich.cse.ohio-state.edu/ . The current version at the point of writing is MVAPICH2 1.9.
Step 2: Compile MVAPICH2 with Intel and CUDA.
# tar -zxvf mvapich2-1.9.gz
# cd mvapich2-1.9
# mkdir buildmpi
# cd buildmpi
# ../configure --prefix=/usr/local/mvapich2-1.9-intel-cuda CC=icc CXX=icpc F77=ifort FC=ifort --with-cuda=/opt/cuda/ --with-cuda-include=/opt/cuda/include --with-cuda-libpath=/opt/cuda/lib64
# make -j8
# make install
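After installation, a hedged sketch of wiring the new build into your environment and confirming the configuration (the prefix matches the configure line above):
# export PATH=/usr/local/mvapich2-1.9-intel-cuda/bin:$PATH
# export LD_LIBRARY_PATH=/usr/local/mvapich2-1.9-intel-cuda/lib:$LD_LIBRARY_PATH
# mpiname -a
mpiname -a prints the MVAPICH2 version together with the configure options, so you can verify that the CUDA flags were picked up.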
Thursday, August 1, 2013
Turning off and on ECC RAM for NVIDIA GP-GPU Cards
From NVIDIA Developer site.
Turn off ECC (C2050 and later). ECC can cost you up to 10% in performance and hurts parallel scaling. You should verify that your GPUs are working correctly and not giving ECC errors, for example, before attempting this. You can turn this off on Fermi-based cards and later by running the following command for each GPU ID as root, followed by a reboot:
Extensive testing of AMBER on a wide range of hardware has established that ECC has little to no benefit on the reliability of AMBER simulations. This is part of the reason it is acceptable (see recommended hardware) to use GeForce gaming cards for AMBER simulations.
1. To Turn off the ECC RAM, just do a
# nvidia-smi -g 0 --ecc-config=0 (repeat with -g x for each GPU ID)
2. To Turn back on ECC RAM, just do
# nvidia-smi -g 0 --ecc-config=1 (repeat with -g x for each GPU ID)
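To confirm the change after the reboot (a hedged sketch; the -d ECC section matches the nvidia-smi output shown in the earlier post):
# nvidia-smi -i 0 -q -d ECC
Both the Current and Pending lines under Ecc Mode should read Disabled (or Enabled again after --ecc-config=1).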