Thursday, March 31, 2011
/usr/bin/ld cannot find -lf2c for CentOS 5
If you encounter the error "/usr/bin/ld: cannot find -lf2c", you are most likely missing the f2c package. Download the f2c-20031026-3.0.1.el5.x86_64.rpm package for CentOS 5 (RHEL 5) from ATrpms and install it
# wget http://dl.atrpms.net/el5-x86_64/atrpms/stable/f2c-20031026-3.0.1.el5.x86_64.rpm
# rpm -Uvh f2c-20031026-3.0.1.el5.x86_64.rpm
# ldconfig
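To confirm the diagnosis, you can look for the library file yourself. The sketch below mimics, in a simplified way, how the linker resolves -lf2c: it looks for libf2c.so and then libf2c.a in each directory it is given. The function name find_lib and the directory list are my own assumptions for illustration, not part of any tool.

```sh
# find_lib NAME DIR...
# Print the first libNAME.so or libNAME.a found in the given directories.
# Returns non-zero when nothing is found (hypothetical helper).
find_lib() {
    name=$1
    shift
    for dir in "$@"; do
        for ext in so a; do
            if [ -f "$dir/lib$name.$ext" ]; then
                echo "$dir/lib$name.$ext"
                return 0
            fi
        done
    done
    return 1
}

# Example: search some common library directories (an assumed list).
find_lib f2c /usr/lib64 /usr/local/lib64 /usr/lib /usr/local/lib \
    || echo "libf2c not found in the directories searched"
```

If nothing is printed except the "not found" message, installing the f2c package is the likely fix.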
Wednesday, March 30, 2011
Compiling MPI BLACS on CentOS 5
An interesting article from Linux Cluster on Compiling MPI BLACS on CentOS 5. BLACS is compiled with OpenMPI 1.4.x with g77 and gfortran.
For more information see Compiling BLACS on CentOS 5
Compiling LAPACK on CentOS 5
Download the lapack latest stable version (lapack-3.3.0.tgz) from http://www.netlib.org/lapack/
# cd /root
# tar -xzvf lapack-3.3.0.tgz
# cd /root/lapack-3.3.0
# cp make.inc.example make.inc
Edit make.inc. Assuming ATLAS was installed as described in Compiling ATLAS on CentOS 5, point BLASLIB to the ATLAS libraries
#BLASLIB = ../../blas$(PLAT).a
BLASLIB = /usr/local/atlas/lib/libf77blas.a /usr/local/atlas/lib/libatlas.a
Compile lapack package
# make
Copy the libraries to /usr/local/lapack/lib
# mkdir -p /usr/local/lapack/lib
# cp /root/lapack-3.3.0/*.a /usr/local/lapack/lib
# cd /usr/local/lapack/lib/
# chmod 555 *.a
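Before running make, it may be worth checking that the archives named on the BLASLIB line actually exist. This is only a sketch: the function names blaslib_value and check_blaslib are made up for illustration, and the parsing assumes a plain "BLASLIB = path path" line with no make variables such as $(PLAT) in it.

```sh
# blaslib_value MAKE_INC
# Print the value of the active (uncommented) BLASLIB line.
blaslib_value() {
    sed -n 's/^BLASLIB[[:space:]]*=[[:space:]]*//p' "$1"
}

# check_blaslib MAKE_INC
# Report every listed archive that does not exist.
# Returns non-zero if at least one archive is missing.
check_blaslib() {
    status=0
    for archive in $(blaslib_value "$1"); do
        if [ ! -f "$archive" ]; then
            echo "missing: $archive"
            status=1
        fi
    done
    return $status
}

# Typical use from the LAPACK source directory:
#   check_blaslib make.inc && make
```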
Tuesday, March 29, 2011
Compiling ATLAS on CentOS 5
This tutorial is to help you compile ATLAS (Automatically Tuned Linear Algebra Software) with gFortran. For those who are using Intel Compiler, you have the reliable Intel MKL (Math Kernel Library)
First things first, a comparison between ATLAS and MKL.
ATLAS
The Automatically Tuned Linear Algebra Software (ATLAS) provides a complete implementation of the BLAS API 3 and a subset of LAPACK 3. A large number of instruction-set specific optimizations are used throughout the library to achieve peak performance on a wide variety of hardware platforms.
ATLAS provides both C and Fortran interfaces.
ATLAS is available for all hardware platforms capable of running UNIX or UNIX-like operating systems as well as Windows (tm).
MKL
Intel's Math Kernel Library (MKL) implements a set of linear algebra, fast Fourier transforms and vector math functions. It includes LAPACK 3, BLAS 3 and extended BLAS and provides both C and Fortran interfaces.
MKL is available for Windows (tm) and Linux (x86/i686 and above) only.
Download the latest stable package from ATLAS (http://sourceforge.net/projects/math-atlas/files/Stable/). The stable version used here is atlas3.8.3.tar.gz. Do note that ATLAS does not like being configured in its original source location, hence the need to create the ATLAS_BUILD directory.
# cd /root
# tar -xzvf atlas3.8.3.tar.gz
# mkdir /root/ATLAS_BUILD
# cd /root/ATLAS_BUILD
# /root/ATLAS/configure
You will need to turn off CPU Throttling. For CentOS and Fedora, you will use
# /usr/bin/cpufreq-selector -g performance
For more information, you can see my blog entry Switching off CPU Throttling on CentOS or Fedora
Compile ATLAS
# make
# make check
# make ptcheck
# make time
# make install
By default, ATLAS is installed to /usr/local/atlas
Finally remember to add /usr/local/atlas/lib to your LD_LIBRARY_PATH
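One way to add the directory without piling up duplicate entries every time the line is sourced is sketched below. The helper name prepend_path is my own invention; only the /usr/local/atlas/lib path comes from the steps above.

```sh
# prepend_path DIR LIST
# Print the colon-separated LIST with DIR added at the front,
# unless DIR is already one of its entries.
prepend_path() {
    case ":$2:" in
        *":$1:"*) echo "$2" ;;
        *) echo "$1${2:+:$2}" ;;
    esac
}

LD_LIBRARY_PATH=$(prepend_path /usr/local/atlas/lib "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
```

Placing these lines in a shell start-up file such as ~/.bashrc would make the setting apply to new sessions.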
Friday, March 25, 2011
Switching off CPU Throttling on CentOS or Fedora
Under CentOS and Fedora, you can switch off CPU Throttling or "Dynamic Frequency Scaling" to maximise your CPU performance. For more information on CPU Throttling, you can also read Dynamic Frequency Scaling from Wikipedia. Just type the command
# /usr/bin/cpufreq-selector -g performance
For Debian-based systems, you may want to take a look at
Looking at Xorg High CPU Usage Issue for netbook
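To see whether the governor change took effect, you can read the governor of each CPU from sysfs. This is a sketch that assumes the kernel exposes cpufreq under /sys/devices/system/cpu, which requires a cpufreq driver to be loaded. The function name list_governors and its optional directory argument are assumptions added so the function can also be tried against a test directory.

```sh
# list_governors [CPU_ROOT]
# Print "cpuN governor" for every CPU that exposes a cpufreq governor.
# CPU_ROOT defaults to the usual sysfs location.
list_governors() {
    root=${1:-/sys/devices/system/cpu}
    for file in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$file" ] || continue
        cpu=${file#"$root"/}
        cpu=${cpu%%/*}
        echo "$cpu $(cat "$file")"
    done
}

list_governors
```

If nothing is printed, the system does not expose cpufreq information at that location.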
Wednesday, March 23, 2011
Installing Gromacs 4.0.x on CentOS 5.x
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers
Do note that this Gromacs Installation Guide is for Gromacs 4.0.x. For detailed instructions, see GROMACS Installation Instructions. For installation of FFTW, you may want to take a look at the blog entry Installing FFTW
Since I'm using FFTW and MPI (OpenMPI to be exact), and FFTW was configured with --prefix=/usr/local/fftw,
I've configured Gromacs with the following
# ./configure CPPFLAGS="-I/usr/local/fftw/include" LDFLAGS="-L/usr/local/fftw/lib" \
  --with-fft=fftw3 --enable-mpi --disable-float
Some notes (assuming you are using bash)
- CPPFLAGS="-I/usr/local/fftw/include"
- LDFLAGS="-L/usr/local/fftw/lib"
- To compile with FFTW version 3 "--with-fft=fftw3"
- To enable MPI "--enable-mpi"
- To select Double precision "--disable-float"
# make -j 8
where 8 is the number of cores.
# make mdrun
* If you have configured with "--enable-mpi"
# make install
* Installs all the binaries, libraries and shared data files
# make install-mdrun
* If you only want to install the mdrun executable (in the case of an MPI build)
# make links
* If you want to create links in /usr/local/bin to the installed GROMACS executables
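Rather than hard-coding 8, the core count for make -j can be looked up. The sketch below is an assumption on my part, not part of the GROMACS instructions: it tries nproc first (newer GNU coreutils), then getconf, and falls back to 1. The function name cpu_count is hypothetical.

```sh
# cpu_count: print the number of online processors.
# Tries nproc, then getconf _NPROCESSORS_ONLN, then falls back to 1.
cpu_count() {
    nproc 2>/dev/null \
        || getconf _NPROCESSORS_ONLN 2>/dev/null \
        || echo 1
}

# Show the make command that would be used on this machine.
echo "make -j $(cpu_count)"
```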
Tuesday, March 22, 2011
How to fix -fPIC errors
A very good article on fPIC error. See 3. HOWTO fix -fPIC errors by Gentoo Linux
You will want to read it if you have a problem like "relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC .libs/assert.o: could not read symbols: Bad value".
The article lists 4 cases of -fPIC errors
Case 1: Broken Compiler
At least GCC 3.4 is known to have a broken implementation of the -fvisibility-inlines-hidden flag. The use of this flag is therefore highly discouraged, reported bugs are usually marked as RESOLVED INVALID. See bug 108872 for an example of a typical error message caused by this flag.
Case 2: Broken `-fPIC' support checks in configure
Many configure tools check whether the compiler supports the -fPIC flag or not. They do so by compiling a minimalistic program with the -fPIC flag and checking stderr. If the compiler prints *any* warnings, it is assumed that the -fPIC flag is not supported by the compiler and is therefore abandoned. Unfortunately, if the user specifies a non-existing flag (i.e. C++-only flags in CFLAGS or flags introduced by newer versions of GCC but unknown to older ones), GCC prints a warning too, resulting in borkage.
To prevent this kind of breakage, the AMD64 profiles use a bashrc that filters out invalid flags in C[XX]FLAGS.
Case 3: Lack of `-fPIC' flag in the software to be built
This is the most common case. It is a real bug in the build system and should be fixed in the ebuild, preferably with a patch that is sent upstream. Assuming the error message looks like this:
Code Listing 6.1: A sample error message
.libs/assert.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
.libs/assert.o: could not read symbols: Bad value
This means that the file assert.o was not compiled with the -fPIC flag, which it should. When you fix this kind of error, make sure only objects that are used in shared libraries are compiled with -fPIC.
In this case, globally adding -fPIC to C[XX]FLAGS resolves the issue, although this practice is discouraged because the executables end up being PIC-enabled, too.
Case 4: Linking dynamically against static archives
Sometimes a package tries to build shared libraries using statically built archives which are not PIC-enabled. There are two main reasons why this happens:
Often it is the result of mixing USE=static and USE=-static. If a library package can be built statically by setting USE=static, it usually doesn't create a .so file but only a .a archive. However, when GCC is given the -l flag to link to said (dynamic or static) library, it falls back to the static archive when it can't find a shared lib. In this case, the preferred solution is to build the static library using the -fPIC flag too.
Sometimes it is also the case that a library isn't intended to be a shared library at all, e.g. because it makes heavy usage of global variables. In this case the solution is to turn the to-be-built shared library into a static one.
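When a build log is long, it helps to pull out just the object files that the linker says need -fPIC. The sketch below is my own addition and not from the Gentoo article. It assumes the error lines look like the sample message above, optionally prefixed by the linker name, and the function name non_pic_objects is hypothetical.

```sh
# non_pic_objects: read a build log on stdin and print, once each,
# the object files named in "recompile with -fPIC" relocation errors.
non_pic_objects() {
    grep 'recompile with -fPIC' \
        | sed -e 's/: relocation .*//' -e 's/^.*: //' \
        | sort -u
}

# Typical use:
#   make 2>&1 | tee build.log
#   non_pic_objects < build.log
```

The resulting list shows which objects (Case 3) or which static archive members (Case 4) were built without -fPIC.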
Monday, March 21, 2011
Resolving "specifies multiple packages" error when removing a package
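If you use the good old rpm -e to remove a package, you may encounter an error, in my case "blas specifies multiple packages". This happens when more than one installed package matches the name (for example, both the i386 and x86_64 builds), so rpm will not remove the package. To remove all the matching packages, use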
# rpm -e --nodeps --allmatches (package)
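Before removing everything with --allmatches, it can be useful to see which names are installed more than once. A minimal sketch, assuming input lines of the form "name arch"; the function name duplicated_names is hypothetical.

```sh
# duplicated_names: read "name arch" lines on stdin and print every
# package name that appears more than once, followed by the
# architectures it appears under.
duplicated_names() {
    awk '{ count[$1]++; archs[$1] = archs[$1] " " $2 }
         END { for (n in count) if (count[n] > 1) print n ":" archs[n] }' \
        | sort
}

# Typical use on an RPM-based system:
#   rpm -qa --qf '%{NAME} %{ARCH}\n' | duplicated_names
```

A name listed with the same architecture twice would point to two versions of the package rather than two architectures.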
Sunday, March 20, 2011
Resolving Single and Double Precision Discrepancy between pre-Nehalem Chipsets and Nehalem Chipsets
One of our researchers was running a job on an SMP machine with older Intel processors, Intel(R) Xeon(R) CPU X7460 @2.66GHz (code-named "Dunnington"), and we noticed that the single and double precision results agreed to about 5 decimal places.
For example:
0.623291xxxxxxx (Single Precision Code)
0.623290xxxxxxx (Double Precision Code)
One important thing to note is that the Intel Compiler used was 11.x
But if we run the same code on the newer Intel Nehalem Architecture, the discrepancy between the single and double precision results is quite large. The results already differ in the first decimal place.
For example:
0.523667xxxxx (Single Precision Code)
0.4353836xxxxx (Double Precision Code)
Similarly, the compiler used was the Intel Compiler 11.x
If we compare the results between the Dunnington Chipsets and the Nehalem Architecture, the discrepancy is really quite unacceptable.
Well, the solution is actually quite easy: update the Intel Compiler to the latest Intel® Parallel Studio XE 2011 for Linux*. The discrepancy should be eliminated and your results should be similar to those obtained on the older processors. The Intel® Parallel Studio XE 2011 for Linux* has the latest libraries for the Nehalem Architecture.
For more information on where to download, do look at the Free Non-Commercial Intel Compiler Download
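To put a number on "how many decimals agree", a small helper like the one below can compare two printed results. It is only a sketch of my own: it compares the digits as text, so it measures matching leading decimal digits and not numerical closeness, and it assumes both arguments are written with a decimal point. The function name matching_decimals is hypothetical.

```sh
# matching_decimals A B
# Print how many leading decimal digits A and B have in common.
# Prints 0 when the integer parts differ.
matching_decimals() {
    a_int=${1%%.*}; b_int=${2%%.*}
    a_frac=${1#*.}; b_frac=${2#*.}
    n=0
    if [ "$a_int" != "$b_int" ]; then
        echo 0
        return 0
    fi
    while [ -n "$a_frac" ] && [ -n "$b_frac" ]; do
        a_rest=${a_frac#?}; b_rest=${b_frac#?}
        a_digit=${a_frac%"$a_rest"}; b_digit=${b_frac%"$b_rest"}
        [ "$a_digit" = "$b_digit" ] || break
        n=$((n + 1))
        a_frac=$a_rest; b_frac=$b_rest
    done
    echo "$n"
}

matching_decimals 0.623291 0.623290    # values from the older processors
matching_decimals 0.523667 0.4353836   # values from Nehalem
```

With the example values above, the first call prints 5 and the second prints 0.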
Tuesday, March 15, 2011
Torque Resource Manager Server Parameters
Useful Information on Torque Resource Manager Server Parameters. Do take a look
Torque Resource Manager Appendix B: Server Parameters
Monday, March 14, 2011
Dealing with stuck jobs on Torque and MAUI
This is an add-on to the blog entry "Manually Deleting Torque and PBS jobs using MAUI"
1. Force the Torque Server or MOM to send an obituary of the job ID to the server
# qsig -s 0 job_id
2. Use the momctl command on the compute nodes where the job is listed. You can use tracejob to check which nodes the job has been sent to
# momctl -c job_id -h compute_node_1
3i. Setting the qmgr server setting mom_job_sync to True might help prevent jobs from hanging.
# qmgr -c "set server mom_job_sync = True"
3ii. To verify that the setting in 3i has taken effect, you can use the command
# qmgr -c "p s"
4. The final option. If all else fails, do a
# qdel -p job_id
For more information, see Adaptive Computing Website Section 11.1.7 Stuck Jobs
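When several jobs are stuck, it can be handy to print the commands from steps 1, 2 and 4 for each job so they can be reviewed and run one at a time, stopping as soon as the job disappears. The sketch below only prints the commands and does not run them. The function name stuck_job_commands is my own, and job_id and compute_node_1 are the same placeholders used above.

```sh
# stuck_job_commands JOB_ID NODE
# Print, in order of escalation, the commands from steps 1, 2 and 4.
# Nothing is executed; the output is meant to be reviewed first.
stuck_job_commands() {
    job=$1
    node=$2
    echo "qsig -s 0 $job"
    echo "momctl -c $job -h $node"
    echo "qdel -p $job"
}

stuck_job_commands job_id compute_node_1
```

The qdel -p line is printed last because it is the final option.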
Sunday, March 13, 2011
Manually Deleting Torque and PBS jobs using MAUI
Tracing Jobs
To trace a job with MAUI commands, including the nodes the job is running on, you can use the command
# showq -r
Alternatively, you can use the Torque tracejob command to trace the job activity
# tracejob job_id
Deleting Jobs
To delete a job with MAUI commands, you can use the commands,
# canceljob job_id
Alternatively, you can also use PBS commands to delete a job
# qdel job_id
PBS mom control
If you are unable to delete a stale job which has no process, you can use the momctl command
# momctl
The momctl command allows remote shutdown, reconfiguration, diagnostics, and querying of the pbs_mom daemon, so you can use it to diagnose the node. For more information on momctl, do look at momctl by http://www.clusterresources.com/:
Example 1: Diagnosis of pbs_mom
# momctl -h node1 -d 1
Example 2: Cycle the pbs_mom on node 1
# momctl -h node1 -C
Manually deleting the jobs
To manually delete the jobs, you should shutdown the pbs server
# service pbs_server stop
Remove the job spool files
# rm /var/spool/pbs/server_priv/jobs/111.host.SC
# rm /var/spool/pbs/server_priv/jobs/111.host.JB
Restart the pbs_server
# service pbs_server restart
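Before deleting anything from the spool directory, it is safer to list exactly which files belong to the job. A minimal sketch, assuming the spool layout shown above, where each job has a .SC and a .JB file. The function name job_spool_files and its optional directory argument are my own additions.

```sh
# job_spool_files JOB_ID [JOBS_DIR]
# Print the existing .SC and .JB spool files for one job.
# JOBS_DIR defaults to the directory used in the steps above.
job_spool_files() {
    dir=${2:-/var/spool/pbs/server_priv/jobs}
    for ext in SC JB; do
        if [ -f "$dir/$1.$ext" ]; then
            echo "$dir/$1.$ext"
        fi
    done
    return 0
}

# Typical use, with pbs_server stopped:
#   job_spool_files 111.host
```

Only the files printed for that job ID need to be removed; files belonging to other jobs are left alone.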
Thursday, March 10, 2011
VMware View for iPad is finally here!
VMware View Client for iPad is finally here.
VMware View Client for iPad makes it easy to access your Windows virtual desktop from your iPad with the best possible user experience on the Local Area Network (LAN) or across a Wide Area Network (WAN).
Monday, March 7, 2011
Fast Access to the last Directory accessed
To switch quickly between 2 directories, instead of typing the whole path, you can use this command
# cd -
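The shell keeps the previous directory in the OLDPWD variable, and cd - changes to it and prints its name. The short demonstration below uses two scratch directories created with mktemp, which are stand-ins of my own for whatever two paths you are working in.

```sh
# Create two scratch directories and visit them in turn.
cd "$(mktemp -d)" && first_dir=$PWD
cd "$(mktemp -d)" && second_dir=$PWD

# "cd -" returns to the previous directory (it also prints it,
# which is silenced here).
cd - >/dev/null
after_first_switch=$PWD

# A second "cd -" toggles back again.
cd - >/dev/null
after_second_switch=$PWD

echo "$after_first_switch"
echo "$after_second_switch"
```

The first line printed is the first directory and the second line is the second directory, showing that repeated cd - toggles between the two.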
Friday, March 4, 2011
Installing Centrify Express on CentOS 5
I tried installing Centrify Express 64-bits on CentOS 5.4 x86_64 and it was quite smooth
Prerequisites:
1. You have root account and password
2. In order for you to join the domain, you need an Active Directory account with permission to add computers to the domain
Download Centrify Express:
1. Go to Download Centrify Express
2. You may also wish to look at the Centrify Express Linux Quick Start Guide (pdf) and Centrify Express Admin Guide
Preparation for the Linux Box to join Centrify
1. Change the hostname of the Linux computer. See blog entry Changing the hostname on CentOS
2. Ensure your /etc/nsswitch.conf contains the following line
hosts: files dns
See the man page for nsswitch.conf for more information on configuring nsswitch
3. Ensure your resolv.conf includes a DNS server that can resolve SRV records for your domain
# less /etc/resolv.conf
You should get something like
search example.com
nameserver 192.168.1.5
4. Now you are ready to install
# mkdir centrify-suite
# mv centrify-suite-2011-rhel3-x86_64.tgz centrify-suite
# cd centrify-suite
# tar -zxvf centrify-suite-2011-rhel3-x86_64.tgz
# ./install-express.sh
Respond to the installation prompts (taken from the Centrify Admin Guide)
How do you want to proceed? (E|S|X|C|Q) [X]: Accept the default, X (for Express Edition), by pressing Enter.
Do you want to run adcheck to verify your AD environment? (Q|Y|N) [Y]: Accept the default answer, Y (to run adcheck), by pressing Enter.
Please enter the Active Directory domain to check: Enter the fully qualified name of your AD domain; for example, ad.example.com
Join an Active Directory domain? (Q|Y|N) [Y] Accept the default answer, Y, to join a domain.
Enter the Active Directory authorized user [administrator]:
Enter the password for the Active Directory user:
Press Enter to select the defaults for the following prompts:
Enter the computer name: [QA1.sales.acme.com]
Enter the container DN [Computers]:
Enter the name of the domain controller [auto detect]:
Reboot the computer after the installation (Q|Y|N) [Y]:
You will see summary text similar to the following:
You chose Centrify Suite Express Edition and entered the following:
Install CentrifyDC 4.4.0 package: Y
Install CentrifyDC-nis 4.4.0 package: N
Install CentrifyDC-openssh 4.3.1 package: Y
Install CentrifyDA 1.1.2 package: N
Run adcheck : Y
Join an Active Directory domain : Y
Active Directory domain to join : ad.example.com
Active Directory authorized user : administrator
computer name : computername.ad.example.com
container DN : Computers
domain controller name : auto detect
Reboot computer : Y
If the join does not succeed during the installation, you can still try to do a direct Active Directory domain join.
# adjoin ad.example.com -u admin_user --force
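The checks in preparation steps 2 and 3 can be scripted so they are easy to repeat on several machines. This is a sketch under my own assumptions: the function names hosts_uses_files_and_dns and nameservers are made up, and the script only inspects the two files. It does not test whether the listed DNS server really resolves SRV records for your domain.

```sh
# hosts_uses_files_and_dns NSSWITCH_FILE
# Succeeds when the "hosts:" line lists both "files" and "dns".
hosts_uses_files_and_dns() {
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) seen[$i] = 1 }
         END { exit !(("files" in seen) && ("dns" in seen)) }' "$1"
}

# nameservers RESOLV_FILE
# Print the address from each "nameserver" line.
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Typical use:
#   hosts_uses_files_and_dns /etc/nsswitch.conf || echo "fix the hosts: line"
#   nameservers /etc/resolv.conf
```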
Thursday, March 3, 2011
Encountering LW_ERROR_LDAP_INSUFFICIENT_ACCESS [LW_ERROR_LDAP_INSUFFICIENT_ACCESS] on Open Likewise
I was trying to join my Linux box to an MS Active Directory domain using Likewise Open from Likewise. Although I have permission to join computers to the MS Active Directory domain, the join failed when I used the command
# ./domainjoin-cli --logfile logfile join --ou my_OU my_AD_Domain Administrators
The stack trace that came out was
Stack Trace:
/builder/src-buildserver/Platform-6.0/src/linux/domainjoin/domainjoin-cli/src/main.c:937
/builder/src-buildserver/Platform-6.0/src/linux/domainjoin/domainjoin-cli/src/main.c:493
/builder/src-buildserver/Platform-6.0/src/linux/domainjoin/libdomainjoin/src/djmodule.c:332
/builder/src-buildserver/Platform-6.0/src/linux/domainjoin/libdomainjoin/src/djauthinfo.c:722
/builder/src-buildserver/Platform-6.0/src/linux/domainjoin/libdomainjoin/src/djauthinfo.c:1157
20110302130337:WARNING:Short domain name not specified. Defaulting to 'mydomain'
20110302130343:ERROR:LW_ERROR_LDAP_INSUFFICIENT_ACCESS [LW_ERROR_LDAP_INSUFFICIENT_ACCESS]
I suspect I may not have permission to set certain attributes (such as Description) on the computer account