Thursday, April 29, 2010

Installing LAMMPS using Intel Compilers, OpenMPI and FFTW

This is an entry on how I installed LAMMPS using the Intel compilers, OpenMPI and FFTW.
  1. If you are eligible, download the free non-commercial Intel compilers. See Free Non-Commercial Intel Compiler Download
  2. Build OpenMPI with Intel Compiler
  3. Install FFTW. Remember to install FFTW-2.1.x and not FFTW-3.x, or you will hit the error fft3d.h(164): catastrophic error: could not open source file "fftw.h". Read the LAMMPS "Getting Started" section for more information.
  4. When you are ready to compile, there are several makefiles to choose from in "$SOURCE/lammps-30Mar10/src/MAKE". I chose Makefile.openmpi. By default you do not need to edit Makefile.openmpi, but if you are a guru and want to edit the file, feel free to.
  5. Finally, go up to the parent directory and build by typing
    cd .. (ie $SOURCE/lammps-30Mar10/src)
    make openmpi -j (-j for parallel compilation)
  6. At the end of the compilation, you should see an lmp_openmpi binary in the src directory. You are almost done.
  7. Check that the executable is properly linked by running
    # ldd lmp_openmpi
  8. Remember to include /usr/local/lib in the LD_LIBRARY_PATH if libmpi_cxx.so.0 is located at /usr/local/lib
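
The check in steps 7 and 8 can be scripted. This is only a minimal sketch, assuming the usual ldd output format in which an unresolved library is printed as "name => not found". The helper name missing_libs is made up for this example.

```shell
#!/bin/sh
# missing_libs: read `ldd` output on stdin and print the name of every
# shared library that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use from the src directory (prints nothing if all libraries resolve):
#   ldd lmp_openmpi | missing_libs
#
# If libmpi_cxx.so.0 is listed and it lives in /usr/local/lib:
#   export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
```

An empty result means every library was resolved; any name printed is one that still needs its directory added to LD_LIBRARY_PATH.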

fft3d.h(164): catastrophic error: could not open source file "fftw.h"

I was compiling the LAMMPS Molecular Dynamics Simulator.
  1. Using $SOURCE/lammps-30Mar10/src
  2. I compiled using make.linux. Quite soon, I encountered the following error: fft3d.h(164): catastrophic error: could not open source file "fftw.h"
  3. I had compiled fftw3 and my Intel Math Kernel Library properly and was able to locate the header in my Intel Math Kernel Library. I had also set the library path correctly in LD_LIBRARY_PATH and /etc/ld.so.conf.d, but LAMMPS was still not able to locate the header. These settings only tell the runtime linker where to find shared libraries; they do not affect where the compiler looks for header files.
  4. I realised the crux of the problem was that LAMMPS requires FFTW-2.1.x. After configuring and compiling fftw-2.x, the problem went away.
Problem eliminated. My compilation is not done yet, but this initial fftw problem is settled for now :)
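
To see quickly which FFTW generation is installed under a given prefix, something like the sketch below may help. It assumes default builds, where FFTW 2.x installs fftw.h and FFTW 3.x installs fftw3.h into PREFIX/include. The function name fftw_flavour and the /usr/local prefix are only examples.

```shell
#!/bin/sh
# fftw_flavour PREFIX: report which FFTW headers exist under PREFIX/include.
# Assumption: a default FFTW 2.x build installs fftw.h and FFTW 3.x
# installs fftw3.h.
fftw_flavour() {
    inc="$1/include"
    if [ -f "$inc/fftw.h" ] && [ -f "$inc/fftw3.h" ]; then
        echo "both"
    elif [ -f "$inc/fftw.h" ]; then
        echo "fftw2"
    elif [ -f "$inc/fftw3.h" ]; then
        echo "fftw3"
    else
        echo "none"
    fi
}

# Example: fftw_flavour /usr/local
```

If this prints "fftw3" or "none" for the prefix that the LAMMPS makefile points at, the fftw.h error above is to be expected.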

CPMD consortium



The CPMD code is a parallelized plane wave/pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics.

Monday, April 26, 2010

UNIX Binary Gaussian 09 Revision A.02 Installation instructions

Taken from the README.BIN and modified for my environment. This deserves highlighting so that administrators can set it up quickly.

  1. Check that you have the correct versions of the OS and libraries for your machine, as listed in the G09 platform list on the website.
  2. Select or create a group (e.g. g09) in /etc/group which will own the Gaussian files. Users who will run Gaussian should either already be in this group, or should have it added to their list of groups.
  3. Create a directory to hold g09 and gv (for example, gaussian). You can do it with the command
    mkdir gaussian
  4. Mount the Gaussian CD using a command like this one
    mount /mnt/cdrom
  5. Copy the Gaussian binary archive (E64_930N.TGZ) from the CD into your newly created gaussian directory.
  6. Untar it by using the command
    tar -zxvf E64_930N.TGZ
  7. Change the group ownership of the g09 directory created in step 6.
    chgrp -Rv g09 g09
  8. Install
    cd g09
    ./bsd/install
  9. Set up the environment for user logins. Create a .login file in the user's home directory
    touch .login
    and place the contents below into it
    g09root=/usr/local/gaussian/ 
    GAUSS_SCRDIR=/scratch/$USER
    export g09root GAUSS_SCRDIR
    . $g09root/g09/bsd/g09.profile
  10. Source it from your .bash_profile
    source .login

Manual setup of TCP LINDA for Gaussian
To configure TCP Linda so that Gaussian can run in parallel across nodes, all you need to do is tweak the ntsnet and LindaLauncher files found in the g09 directory. For TCP Linda to work with Gaussian, just make sure that LINDA_PATH is correct.
  1. ntsnet is found at $g09root/ntsnet (where $g09root = /usr/local/gaussian/g09 in my installation)
  2. LindaLauncher is found in $g09root/linda8.2/opteron-linux/bin/LindaLauncher (where $g09root = /usr/local/gaussian/g09 in my installation)
  3. flc is found at $g09root/opteron-linux/bin/flc
  4. pmbuild is found at $g09root/opteron-linux/bin/pmbuild
  5. vntsnet is found at $g09root/opteron-linux/bin/vntsnet
LINDA_PATH=/usr/local/gaussian/g09/linda8.2/opteron-linux/

Auto-Install for Gaussian. This can also be found at Gaussian Installation Notes

# cd /usr/local/gaussian/g09
# ./bsd/install

Create a .tsnet.config file in your home directory and put the settings below into it.
# touch .tsnet.config

Tsnet.Appl.nodelist: n01 n02
Tsnet.Appl.verbose: True
Tsnet.Appl.veryverbose: True
Tsnet.Node.lindarsharg: ssh
Tsnet.Appl.useglobalconfig: True

Thursday, April 22, 2010

Advancing the Power of Visualization.

This is an interview by HPCwire with Steve Briggs, HPCD’s SVA product marketing manager, on visualisation from HP's point of view. It contains some interesting information.

Advancing the Power of Visualization –Coming Soon to Linux Clusters: 100 Million Pixels and More

GPFS Tuning Parameters

GPFS Tuning Parameters is a good wiki resource written by IBM on GPFS tuning. I am just repeating some of the useful tips I have learned from it.

To view the configuration parameters that have been changed from the default
mmlsconfig

To view the active value of any of these parameters you can run
mmfsadm dump config

To change any of these parameters, use mmchconfig. For example, to change the pagepool setting on all nodes:
mmchconfig pagepool=256M


1. Considerations for modifying the pagepool

A. Sequential I/O
The default pagepool size may be sufficient for sequential I/O workloads; however, a recommended value of 256MB is known to work well in many cases. To change the pagepool size (the optional -i makes the change take effect immediately as well as permanently):
mmchconfig pagepool=256M [-i]

If the file system blocksize is larger than the default (256K), the pagepool size should be scaled accordingly. For example, if 1M blocksize is used, the default 64M pagepool should be increased by 4 times to 256M. This allows the same number of buffers to be cached.
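
The scaling rule above is simple arithmetic. Here is a small sketch of it, using the 64M default pagepool and 256K default blocksize quoted in the text. The function name scaled_pagepool_mb is made up for this example.

```shell
#!/bin/sh
# scaled_pagepool_mb BLOCKSIZE_KB: scale the 64 MB default pagepool by the
# ratio of the file system blocksize to the 256 KB default blocksize, so
# that the same number of buffers can be cached.
scaled_pagepool_mb() {
    default_pagepool_mb=64
    default_blocksize_kb=256
    echo $(( default_pagepool_mb * $1 / default_blocksize_kb ))
}

# Example: a 1 MB (1024 KB) blocksize
#   scaled_pagepool_mb 1024    # 64 * 1024 / 256 = 256
#   mmchconfig pagepool=256M
```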


B. Random I/O
The default pagepool size will likely not be sufficient for Random IO or workloads involving a large number of small files. In some cases allocating 4GB, 8GB or more memory can improve workload performance.
mmchconfig pagepool=4000M


C. Random Direct IO
For database applications that use Direct IO, the pagepool is not used for any user data. Its main purpose in this case is for system metadata and caching the indirect blocks of the database files.



D. NSD Server
Assuming no applications or Filesystem Manager services are running on the NSD servers, the pagepool is only used transiently by the NSD worker threads to gather data from client nodes and write the data to disk. The NSD server does not cache any of the data. Each NSD worker just needs one pagepool buffer per operation, and the buffer can be potentially as large as the largest filesystem blocksize that the disks belong to. With the default NSD configuration, there will be 3 NSD worker threads per LUN (nsdThreadsPerDisk) that the node services. So the amount of memory needed in the pagepool will be 3*#LUNS*maxBlockSize. The target amount of space in the pagepool for NSD workers is controlled by nsdBufSpace which defaults to 30%. So the pagepool should be large enough so that 30% of it has enough buffers.
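
As a worked example of the sizing rule above: with 10 LUNs and a 1 MB maximum blocksize, the NSD workers need 3*10*1 = 30 MB, so the pagepool should be at least 30/0.30 = 100 MB. The sketch below does the same calculation. The function name nsd_pagepool_mb is made up, and the defaults of 3 threads per disk and 30% for nsdBufSpace are simply the ones quoted in the text.

```shell
#!/bin/sh
# nsd_pagepool_mb LUNS MAX_BLOCKSIZE_MB [THREADS_PER_DISK] [NSDBUFSPACE_PCT]
# Print the smallest pagepool (in MB, rounded up) whose nsdBufSpace share
# holds one buffer of MAX_BLOCKSIZE_MB for every NSD worker thread.
nsd_pagepool_mb() {
    luns=$1
    max_block_mb=$2
    threads=${3:-3}     # nsdThreadsPerDisk default quoted in the text
    pct=${4:-30}        # nsdBufSpace default quoted in the text
    worker_mb=$(( threads * luns * max_block_mb ))
    # Integer ceiling of worker_mb / (pct / 100)
    echo $(( (worker_mb * 100 + pct - 1) / pct ))
}

# Example: 10 LUNs, 1 MB maximum blocksize
#   nsd_pagepool_mb 10 1
```

This is only a lower bound for a dedicated NSD server. Any applications or file system manager services on the same node would need pagepool on top of it.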




For more information
  1. GPFS Tuning Parameters
  2. mmchconfig Command

Wednesday, April 21, 2010

NFS share on Linux client not immediately visible to other NFS clients

If you are using NFS as the shared file system, you may find that changes made to an NFS share on one Linux client are not immediately visible to other NFS clients. This is due to attribute caching on the NFS client side, which is controlled by the following mount options:
  1. acregmin=n. The minimum time (in seconds) that the NFS client caches attributes of a regular file before it requests fresh attribute information from a server. The default is 3.
  2. acregmax=n. The maximum time (in seconds) that the NFS client caches attributes of a regular file before it requests fresh attribute information from a server. The default is 60.
  3. acdirmin=n. The minimum time (in seconds) that the NFS client caches attributes of a directory before it requests fresh attribute information from a server. The default is 30.
  4. acdirmax=n. The maximum time (in seconds) that the NFS client caches attributes of a directory before it requests fresh attribute information from a server. The default is 60.
  5. actimeo=n. Sets all of acregmin, acregmax, acdirmin, and acdirmax to the same value.
For more information, see
  1. Configuring NFS Client for Performance
  2. Why are changes made on an NFS share on my Red Hat Enterprise Linux 5 client not immediately visible to other NFS clients?
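
As a rough illustration of how these options are passed, the sketch below prints (but does not run) a mount command that sets all four timers through actimeo. The server name, export path and mount point are hypothetical, and so is the helper name nfs_mount_cmd. Bear in mind that shorter cache times mean more attribute requests sent to the NFS server.

```shell
#!/bin/sh
# nfs_mount_cmd SECONDS SERVER:/EXPORT MOUNTPOINT
# Print (do not run) a mount command that sets all four attribute-cache
# timers to SECONDS through the actimeo option.
nfs_mount_cmd() {
    echo "mount -t nfs -o actimeo=$1 $2 $3"
}

# Example with hypothetical names; run the printed command as root:
#   nfs_mount_cmd 3 nfsserver:/export/home /home
```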

Tuesday, April 20, 2010

Moving HPC Applications to Cloud - The Practitioner Prospective

This is a very good summary presentation by Victoria Livschitz, CEO of Grid Dynamics, on some of the issues and challenges we will face when we unify Cloud and HPC into HPC-Cloud.

Read this: Moving HPC Applications to Cloud - The Practitioner Prospective

Thursday, April 15, 2010

Installing Cluster OpenMP* for Intel® Compilers



Overview
OpenMP* is a high level, pragma-based approach to parallel application programming. Cluster OpenMP is a simple means of extending OpenMP parallelism to 64-bit Intel® architecture-based clusters. It allows OpenMP code to run on clusters of Intel® Itanium® or Intel® 64 processors, with only slight modifications. 


Prerequisite
Cluster OpenMP use requires that you already have the latest version of the Intel® C++ Compiler for Linux* and/or the Intel® Fortran Compiler for Linux*.


Benefits of Cluster OpenMP
  1. Simplifies porting of serial or OpenMP code to clusters.
  2. Requires few source code modifications, which eases debugging.
  3. Allows slightly modified OpenMP code to run on more processors without requiring investment in expensive Symmetric Multiprocessing (SMP) hardware.
  4. Offers an alternative to MPI that is easier to learn and faster to implement.

How to Install Cluster OpenMP.
  1. Installing Cluster OpenMP is simple. First you have to install Intel Compilers. For more information, see Blog Entry Free Non-Commercial Intel Compiler Download
  2. After installation of the Compilers, download the Cluster OpenMP License File from Cluster OpenMP Download site
  3. Place the Cluster OpenMP License file at the License Directory. Usually it is at /opt/intel/licenses
  4. With the Cluster OpenMP license file in place, you can use either the "-cluster-openmp" or the "-cluster-openmp-profile" compiler option when compiling a program.

Wednesday, April 14, 2010

Install g77 on CentOS 5

gfortran, which is part of the GNU Compiler Collection (GCC), has replaced the g77 compiler, whose development stopped before GCC version 4.0.

If you still need g77, install it with the following yum command
yum install compat-gcc*

Tuesday, April 13, 2010

MPIRun and " You may set your LD_LIBRARY_PATH to have the location of the shared libraries ...... " issues

The Scenario:
I encountered this error while executing an mpirun. Running "pbsnodes -l" suggested that every node was online, so I thought my $LD_LIBRARY_PATH was causing the issue. After some exhaustive checking, I realised that communication with one of our nodes was broken. Here are the steps I took to solve the issue.

--------------------------------------------------------------------------
A daemon (pid 16704) died unexpectedly with status 127 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the  location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------

The error looks like an LD_LIBRARY_PATH problem, but it may not be.

Step 1: Check whether it is an LD_LIBRARY_PATH issue on your head node and compute nodes
First things first: check whether LD_LIBRARY_PATH is blank or holds the correct paths on both your head node and your compute nodes.
$ echo $LD_LIBRARY_PATH
/usr/local/lib:/opt/intel/Compiler/11.1/069/lib/intel64 .....
If everything looks normal, proceed to step 2.


Step 2: Check whether the mpirun can be executed cleanly.
$ mpirun -np 32 -hostfile hostfilename openmpi-with-intel-hello-world
where
  1. hostfilename contains all the compute node host name
  2. openmpi-with-intel-hello-world is the compiled mpi program

Step 3: If the error still remains.....
Modify the hostfilename so that it lists 1 compute node at a time and rerun the mpirun. You should be able to quickly identify whether the problem is not $LD_LIBRARY_PATH but a problematic compute node.
n01
n02
...
In my situation, the problem was a broken SSH key on one node, even though Torque showed all nodes as healthy.
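
Testing one node at a time can be automated. The following is only a sketch: the names probe_node and find_bad_nodes are made up, and it assumes the hostfile has one node name per line and that the hello-world binary from step 2 is in the current directory.

```shell
#!/bin/sh
# probe_node NODE: succeed if a single MPI rank can be launched on NODE.
# Assumption: the test binary from step 2 sits in the current directory.
probe_node() {
    mpirun -np 1 -host "$1" ./openmpi-with-intel-hello-world
}

# find_bad_nodes HOSTFILE: print every node in HOSTFILE for which
# probe_node fails. Blank lines are skipped.
find_bad_nodes() {
    while IFS= read -r node; do
        [ -n "$node" ] || continue
        # Redirect stdin so the probe cannot swallow the rest of the hostfile.
        probe_node "$node" </dev/null >/dev/null 2>&1 || echo "$node"
    done < "$1"
}

# Example: find_bad_nodes hostfilename
```

Any node name printed is one where the launch itself fails, which points at the node (or the SSH setup to it) rather than at LD_LIBRARY_PATH.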

Monday, April 12, 2010

A Hello World OpenMPI program with Intel

I compiled a simple parallel hello world program to test whether OpenMPI is working well with the Intel compilers, using the example from Compiler Examples at https://wiki.mst.edu/nic/how_to/compile/openmpi-intel-compile

Step 1: Ensure your OpenMPI is compiled with Intel. Read the Building OpenMPI with Intel Compiler (Ver 2) for more information


Step 2: Cut and paste the parallel program from https://wiki.mst.edu/nic/how_to/compile/openmpi-intel-compile into mpi_hello.cpp. Compile the C++ program with mpicxx
$ mpicxx -o openmpi-intel-hello mpi_hello.cpp


Step 3: Test on SMP Machine
$ mpirun -np 8 openmpi-intel-hello


Step 4: Test on Distributed Cluster
$ mpirun -np 8 -hostfile hostfile.file openmpi-intel-hello
You should see output something like
Returned: 0 Hello World! I am 1 of 8
Returned: 0 Hello World! I am 6 of 8
Returned: 0 Hello World! I am 3 of 8
Returned: 0 Hello World! I am 0 of 8
Returned: 0 Hello World! I am 2 of 8
Returned: 0 Hello World! I am 5 of 8
Returned: 0 Hello World! I am 4 of 8
Returned: 0 Hello World! I am 7 of 8

Sunday, April 11, 2010

PCI Utilities

The PCI Utilities are a collection of programs for inspecting and manipulating configuration of PCI devices, all based on a common portable library libpci which offers access to the PCI configuration space on a variety of operating systems.

The utilities include:
  1. lspci
  2. setpci
These utilities are installed by default on most distributions and are definitely a must-install on CentOS. lspci looks up each device's vendor and device IDs in the PCI ID database and gives us a more useful name.

If you wish to install them on a Red Hat derivative Linux, just do a
yum install pciutils

Thursday, April 8, 2010

Using Intel® MKL in VASP

The Using Intel® MKL in VASP guide is intended to help current VASP* (Vienna Ab-initio Simulation Package*) users get better benchmark performance by utilizing Intel® Math Kernel Library (Intel® MKL).
The guide contains configuration and setup notes.

Applications from VMware

ThinApp
VMware ThinApp virtualizes applications by encapsulating application files and registry into a single ThinApp package that can be deployed, managed and updated independently from the underlying OS.

Some of the key benefits according to NetApp:
  1. Simplify Windows 7 migration
  2. Eliminate application conflicts
  3. Consolidate application streaming servers
  4. Reduce desktop storage costs
  5. Increase mobility for end users


SpringSource tc Server
SpringSource tc Server provides enterprise users with the lightweight server they want paired with the operational management, advanced diagnostics, and mission-critical support capabilities businesses need. It is designed to be a drop-in replacement for Apache Tomcat 6, ensuring a seamless migration path for existing custom-built and commercial software applications already certified for Tomcat. One interesting point is that the SpringSource Tool Suite is free to download.

Wednesday, April 7, 2010

Torque Error - Address already in use (98) in scan_for_exiting, cannot bind to port 464 in client_to_svr - too many retries

pbs_mom;Svr;pbs_mom;LOG_ERROR:: Address already in use (98) in scan_for_exiting, cannot bind to port 464 in client_to_svr - too many retries

One cause for this is very high traffic on the network, which prevents the mom and the server from communicating properly. One common case is job scripts that incessantly run qstat. You may be surprised how often users submit scripts with this kind of qstat loop, which then cause the error.

Monday, April 5, 2010

Sunday, April 4, 2010

Placing user-contributed xCAT scripts in the xCAT directory

This is a continuation of blog entry User Contributed Script ported from xcat 1.x to xcat 2.x

Step 1: Placing addclusteruser in /opt/xcat/sbin
# cd /opt/xcat/sbin 
# wget https://xcat.svn.sourceforge.net/svnroot/xcat/xcat-contrib/admin_patch/xCAT-2-admin_patch-1.1/addclusteruser

Step 2: Placing gensshkeys in /opt/xcat/sbin
# cd /opt/xcat/sbin
# wget https://xcat.svn.sourceforge.net/svnroot/xcat/xcat-contrib/admin_patch/xCAT-2-admin_patch-1.1/gensshkeys

Step 3: Placing shfunctions1 in /opt/xcat/lib
#  cd /opt/xcat/lib
# wget https://xcat.svn.sourceforge.net/svnroot/xcat/xcat-contrib/admin_patch/xCAT-2-admin_patch-1.1/shfunctions1

To add users using addclusteruser
# addclusteruser
......

Assuming you have already exported the home directory to the other nodes, copy the account files to the compute nodes
# pscp /etc/passwd compute:/etc/
# pscp /etc/shadow compute:/etc/
# pscp /etc/group compute:/etc/

Thursday, April 1, 2010

Can't find fftw3f library when configuring Gromacs

GROMACS is a versatile package to perform molecular dynamics.

If you are installing GROMACS using the Installation Instructions from Gromacs and encounter "can't find fftw3f library", this is probably because FFTW was built with the wrong precision. Try reconfiguring FFTW with the "--enable-float" setting
./configure --enable-threads --enable-float
make
make install
and GROMACS should then configure and compile cleanly.
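
To confirm that the single-precision library is actually in place before reconfiguring GROMACS, a check along these lines may help. It is only a sketch: the helper name has_fftw3f is made up, and it assumes FFTW was installed under a prefix with its libraries in PREFIX/lib (some systems use lib64 instead).

```shell
#!/bin/sh
# has_fftw3f PREFIX: succeed if a single-precision FFTW 3 library
# (libfftw3f.a, libfftw3f.so, ...) exists under PREFIX/lib.
has_fftw3f() {
    for f in "$1"/lib/libfftw3f.*; do
        # If nothing matches, $f is the unexpanded pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example:
#   has_fftw3f /usr/local || echo "rebuild FFTW with --enable-float"
```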