Learned a new concept: the Processor Equivalent, or PE. It is a measure of the actual impact of a job's requested resources on the total resources available system-wide.
Taken from the Maui Admin Guide by Cluster Resources.
PE = MAX(ProcsRequestedByJob  / TotalConfiguredProcs,
         MemoryRequestedByJob / TotalConfiguredMemory,
         DiskRequestedByJob   / TotalConfiguredDisk,
         SwapRequestedByJob   / TotalConfiguredSwap) * TotalConfiguredProcs
Assume a homogeneous 100-node system with 4 processors and 1 GB of memory per node. A job is submitted requesting 2 processors and 768 MB of memory. The PE for this job would be calculated as:
PE = MAX(2/(100*4), 768/(100*1024)) * (100*4) = 3
This result makes sense: the job would consume 3/4 of the memory on a 4-processor node. The calculation works equally well on homogeneous or heterogeneous systems, and on uniprocessor or large-way SMP systems.
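The worked example above can be checked with a quick script. The numbers are the ones from the example; the disk and swap terms are omitted since the job requests neither:

```shell
# PE calculation for the example: 2 procs + 768 MB requested on a
# 100-node cluster of 4-processor / 1 GB nodes.
procs_req=2;  procs_tot=$((100 * 4))      # 400 configured processors
mem_req=768;  mem_tot=$((100 * 1024))     # 102400 MB configured memory
pe=$(awk -v pr="$procs_req" -v pt="$procs_tot" \
         -v mr="$mem_req"   -v mt="$mem_tot" '
  BEGIN {
    a = pr / pt; b = mr / mt              # per-resource fractions
    m = (a > b) ? a : b                   # MAX(...)
    print m * pt                          # scale back to processor units
  }')
echo "PE = $pe"
```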
Thursday, May 28, 2009
Wednesday, May 27, 2009
Installing Intel Compiler on a compute node
The compute nodes of an HPC cluster are often on a private network with no access to the internet, and they are often very lean installs. How do we install the Intel Compiler on a compute node?
After untarring the Intel Compiler, running install.sh and registering the serial number, you will notice that an essential prerequisite is missing. How do you find out which prerequisite it is?
Go to the Intel installation log file /tmp/intel.pset.root........ — in the log file you will see "missing g++". To install g++, do the following:
# yum install gcc-c++
Problem solved!
Creating a Yum Local Repository at CentOS
Sometimes you may want a local yum repository for your CentOS 5 server, since yum automatically resolves all dependencies. A local repository helps resolve those dependencies, especially if you do not have internet access.
Step 1: You will need a utility named createrepo.
# yum install createrepo
or
# rpm -Uvh createrepo-0.4.11-3.el5.noarch.rpm
Step 2: Copy the whole CentOS CD to a directory (E.g. /install/centos5.2/x86_64/CentOS)
Step 3: Run createrepo
# createrepo /install/centos5.2/x86_64/CentOS
(you will need to run the above command again, so that the repository metadata gets updated)
Step 4: Create a local-repo file
(You may want to remove CentOS-Base.repo and CentOS-Media.repo if you are not connected to a network)
# touch /etc/yum.repos.d/install-local.repo
Type the following:
[localrepo]
name=CentOS $releasever - My Local Repo
baseurl=file:///install/centos5.2/x86_64/CentOS
enabled=1
gpgcheck=0
gpgkey=file:///install/centos5.2/x86_64/RPM-GPG-KEY
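Step 4 can also be scripted with a here-document. The path below writes to the current directory purely for illustration (the real file belongs in /etc/yum.repos.d/), and note that baseurl must point at the same directory you ran createrepo against:

```shell
# Write the local repo definition (illustrative path; normally
# /etc/yum.repos.d/install-local.repo).
repo=./install-local.repo
cat > "$repo" <<'EOF'
[localrepo]
name=CentOS $releasever - My Local Repo
baseurl=file:///install/centos5.2/x86_64/CentOS
enabled=1
gpgcheck=0
EOF
grep -q '^\[localrepo\]' "$repo" && echo "repo file written"
```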
Further Information:
- Local YUM Repository by George Notaras
- Docs/Drafts/SoftwareManagementGuide from Fedora
Tin Hat Distro
Tin Hat is a Linux distribution derived from hardened Gentoo which aims to provide a very secure, stable and fast desktop environment that lives purely in RAM.
Tin Hat requires about 4 GB to run comfortably, 3 GB for the tmpfs root file system, and 1 GB for paging. If one wants to further reintroduce Gentoo's portage system and/or the kernel source tree, 4GB becomes a very tight squeeze.
For more information, go to Tin Hat
Tuesday, May 26, 2009
Building OpenMPI with Intel Compiler (Ver 2)
This is a follow up from Thursday, April 2, 2009 Blog Entry Building OpenMPI with Intel Compiler
Step 1: Download the OpenMPI software from http://www.open-mpi.org/ . The current stable version at the point of writing is OpenMPI 1.3.2
Step 2: Download and Install the Intel Compilers from Intel Website. More information can be taken from Free Non-Commercial Intel Compiler Download
Step 3: Add the Intel Directory Binary Path to the Bash Startup
In my ~/.bash_profile, I've added:
PATH=$PATH:/opt/intel/Compiler/11.0/081/bin/intel64
Then at the command prompt:
# source ~/.bash_profile
Step 4: Configuration Information
# gunzip -c openmpi-1.3.2.tar.gz | tar xf -
# cd openmpi-1.3.2
# ./configure --prefix=/usr/local CC=icc CXX=icpc F77=ifort FC=ifort
# make all install
Step 5: Setting PATH environment for OpenMPI
In my ~/.bash_profile, I've added:
export PATH=/usr/local/bin:${PATH}
export LD_LIBRARY_PATH=/opt/intel/Compiler/11.0/081/lib/intel64:${LD_LIBRARY_PATH}
(The LD_LIBRARY_PATH must point to the directory containing /opt/intel/Compiler/11.0/081/lib/intel64/libimf.so)
Step 6: mpicc ........
Step 7: Repeat the procedures on the Compute Nodes
Gnome HardInfo
HardInfo can gather information about your system's hardware and operating system, perform benchmarks, and generate printable reports either in HTML or in plain text formats.
To install you can
# apt-get install hardinfo
Monday, May 25, 2009
Saturday, May 23, 2009
Midnight Commander
Midnight Commander is my favourite orthodox file manager. Easy to use, and it works nicely over SSH or X11.
To install, you can use
# yum install mc
More Information: Midnight Commander Development Center
Thursday, May 21, 2009
IBM Cloud Initiatives
1. IBM Cloud Initiatives
2. IBM Cloud Research Project
- April 16, 2009: IBM Establishes First Cloud Computing Laboratory in Hong Kong
- April 16, 2009: Kogakuin University in Japan Manages Next-Gen Infrastructure with Public Cloud Services from IBM
- April 23, 2009: National Science Foundation Awards Millions to Fourteen Universities for Cloud Computing Research
Wednesday, May 20, 2009
Using Gnome Guake
Gnome Guake is a nifty little tool that helps you access your command line. Once you launch Guake, it is tucked nicely into your bottom-right-hand tray. Just press F12 and you have the command line. You can have tabs for several command-line screens, similar to Mozilla Firefox.
To install just
# apt-get install guake
For more information on Guake, you can look at Guake Website
Monday, May 18, 2009
NFS and "Permission Denied" Mount Problem at the Client
I've encountered a strange issue with the NFS mount today.
1. On the LINUX Client
# mount /nfs-server
mount: xxx.xxx.xxx.xxx:/nfs-server failed,
reason given by server: Permission denied
We checked the permissions in nfs-server:/etc/exports and everything was correct. We did a:
# showmount -e nfs-server
Everything seemed correct there too. "rpcinfo -p" also showed the correct information:
program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 615 status
100024 1 tcp 618 status
100021 1 udp 32771 nlockmgr
100021 3 udp 32771 nlockmgr
100021 4 udp 32771 nlockmgr
100021 1 tcp 37569 nlockmgr
100021 3 tcp 37569 nlockmgr
100021 4 tcp 37569 nlockmgr
On the Server side,
We checked /etc/exports and everything was correct. Next we restarted the NFS and portmap services, but the clients still showed the "Permission Denied" error.
We noticed that /proc/mounts did not have the NFS information. To resolve this issue, we manually added the NFS information to /proc/mounts:
nfs-server:/home /home nfs rw,vers=3,rsize=8192,wsize=8192,hard,intr,proto=tcp,timeo=14,retrans=2,sec=sys,addr=n00 0 0
We did the NFS Mount and everything was working!
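The entry above follows the standard six-field mount-table format (device, mount point, filesystem type, options, dump, pass). A quick way to pick the example line apart:

```shell
# Parse the mount-table entry from above into its fields.
line='nfs-server:/home /home nfs rw,vers=3,rsize=8192,wsize=8192,hard,intr,proto=tcp,timeo=14,retrans=2,sec=sys,addr=n00 0 0'
set -- $line     # word-split into positional parameters
summary="device=$1 mountpoint=$2 fstype=$3"
echo "$summary"
```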
Related Articles:
Using history effectively
Read an interesting article in the June 2009 issue of Linux Format. Thought I'd write it down.
- Solution 1: If you are using bash, type
# history
and you will see a list of the commands that have been typed.
- Solution 2: Press Ctrl+R and a "reverse-i-search" prompt is provided. Just type what you need to search.
- Solution 3: .bash_history can also show the list of commands used. Just
# less .bash_history
- Solution 4: To record the commands you use, you can use script. Before you start logging, just type script and the entire content of the session will be saved in a file called typescript. To end the session, just press Ctrl+D.
My favourite is just the good old history.
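Solution 3 works because the history file is plain text, so ordinary text tools like grep apply. A small sketch using a temporary file standing in for ~/.bash_history:

```shell
# Search a (stand-in) history file the way you would ~/.bash_history.
hist=$(mktemp)
printf 'ls -l\ncd /tmp\nyum install mc\n' > "$hist"
match=$(grep -n 'yum' "$hist")   # -n prefixes the matching line number
echo "$match"
rm -f "$hist"
```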
Sunday, May 17, 2009
Smile - Alternative Slideshow
Taken from an article I read in the June 09 edition of Linux Format.
If you want to explore an alternative slideshow tool, you may try Smile. Before you install it, you will need to settle some dependencies, namely:
- Mplayer
- SoX, with SoX support for OGG and MP3
- Mencoder
Saturday, May 16, 2009
Adding Date and Time to Bash History
Taken from HowToForge "Adding Date and Time to your Bash History"
# vim /etc/bashrc
Type:
export HISTTIMEFORMAT="%h/%d - %H:%M:%S "
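HISTTIMEFORMAT uses strftime-style format specifiers, so you can preview what the history timestamps will look like with date (assuming GNU date and an English locale):

```shell
# Preview the history timestamp format used above.
fmt="%h/%d - %H:%M:%S "
stamp=$(date +"$fmt")
echo "$stamp"    # e.g. May/16 - 09:30:00
```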
Friday, May 15, 2009
MAUI Installation in xCAT 2.x
What is MAUI?
Maui Cluster Scheduler (a.k.a. Maui Scheduler) is our first generation cluster scheduler. Maui is an advanced policy engine used to improve the manageability and efficiency of machines ranging from clusters of a few processors to multi-teraflop supercomputers.
Taken and modified from https://xcat.wiki.sourceforge.net/Maui
Step 1: Download the tarball
- Go to http://clusterresources.com/, create an account and download here:
- http://www.clusterresources.com/product/maui/index.php?
- Untar in /tmp
Step 2: Link the Torque directories
# cd /opt/torque
# ln -s x86_64/bin .
# ln -s x86_64/lib .
# ln -s x86_64/sbin .
# export PATH=$PATH:/opt/torque/x86_64/bin/
Step 3: Setup MAUI
# cd maui-3.2.6p21
# ./configure --prefix=/opt/maui --with-pbs=/opt/torque/
# make -j8
# make install
# cp /opt/xcat/share/xcat/netboot/add-on/torque/moab /etc/init.d/maui
(Edit /etc/init.d/maui so that all MOAB becomes MAUI and all moab becomes maui)
# service maui start
# chkconfig --level 345 maui on
Step 4: Configure MAUI
# touch /etc/profile.d/maui.sh
# vim /etc/profile.d/maui.sh
(Type: export PATH=$PATH:/opt/maui/bin)
# source /etc/profile.d/maui.sh
# vim /opt/maui/maui.cfg
Change: RMCFG[] TYPE=PBS@...@ to: RMCFG[] TYPE=PBS
# service maui restart
Now run:
# showq
You should see all of the processors. Next try running a job to make sure that maui picks it up.
Setting up Torque Server on xCAT 2.x
Modified from xCAT 2 Advanced Cookbook
Step 1: Setup Torque Server
# cd /tmp
# wget http://www.clusterresources.com/downloads/torque/torque-2.3.0.tar.gz
# tar zxvf torque-2.3.0.tar.gz
# cd torque-2.3.0
# CFLAGS=-D__TRR ./configure \
  --prefix=/opt/torque \
  --exec-prefix=/opt/torque/x86_64 \
  --enable-docs \
  --disable-gui \
  --with-server-home=/var/spool/pbs \
  --enable-syslog \
  --with-scp \
  --disable-rpp \
  --disable-spool
# make
# make install
Step 2: Configure Torque
# cd /opt/torque/x86_64/lib
# ln -s libtorque.so.2.0.0 libtorque.so.0
# echo "/opt/torque/x86_64/lib" >> /etc/ld.so.conf.d/torque.conf
# ldconfig
# cp -f /opt/xcat/share/xcat/netboot/add-on/torque/xpbsnodes /opt/torque/x86_64/bin/
# cp -f /opt/xcat/share/xcat/netboot/add-on/torque/pbsnodestat /opt/torque/x86_64/bin/
Create /etc/profile.d/torque.sh:
# vim /etc/profile.d/torque.sh
Type:
export PBS_DEFAULT=n00 (where n00 is the Head Node)
export PATH=/opt/torque/x86_64/bin:$PATH
Step 3. Define Nodes
# cd /var/spool/pbs/server_priv
# vim nodes
Type:
n01 np=8 (where np is the number of cores for the server)
Step 4: Setup and Start Service
# cp -f /opt/xcat/share/xcat/netboot/add-on/torque/pbs /etc/init.d/
# cp -f /opt/xcat/share/xcat/netboot/add-on/torque/pbs_mom /etc/init.d/
# cp -f /opt/xcat/share/xcat/netboot/add-on/torque/pbs_sched /etc/init.d/
# cp -f /opt/xcat/share/xcat/netboot/add-on/torque/pbs_server /etc/init.d/
# chkconfig --del pbs
# chkconfig --del pbs_mom
# chkconfig --del pbs_sched
# chkconfig --level 345 pbs_server on
# service pbs_server start
Step 5: Edit pbs_mom, pbs_sched, pbs_server
# vim /etc/init.d/pbs_sched
(Ensure your path is correct for BASE_PBS_PREFIX=/opt/torque)
# vim /etc/init.d/pbs_mom
(Ensure your path is correct for chmod 777 /var/spool/pbs/spool /var/spool/pbs/undelivered)
(Ensure your path is correct for chmod o+t /var/spool/pbs/spool /var/spool/pbs/undelivered)
# vim /etc/init.d/pbs_server
(Ensure your path is correct for BASE_PBS_PREFIX=/opt/torque)
(Ensure your path is correct for PBS_HOME=/var/spool/pbs)
Step 6: Install pbstop
# cp -f /opt/xcat/share/xcat/netboot/add-on/torque/pbstop /opt/torque/x86_64/bin/
# chmod 755 /opt/torque/x86_64/bin/pbstop
Step 7: Install Perl Curses for pbstop
# yum install perl-Curses
(You will need the RPMForge repository enabled. Do check Installing RPMForge)
Step 8: Create a Torque Default Queue
# qmgr
create queue dque
set queue dque queue_type = Execution
set queue dque enabled = True
set queue dque started = True
set server scheduling = True
set server default_queue = dque
set server log_events = 127
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.walltime = 00:01:00
set server scheduler_iteration = 60
set server node_pack = False
set server keep_completed = 300
Step 9: Setup Torque Clients (x86_64) (using manual installation)
Ensure all the /etc/hosts contains the head and compute node
# vim /etc/hosts
(include all the hosts)
# pscp -r /opt/torque compute:/opt/
# pscp -r /var/spool/pbs compute:/var/spool/
Change the permissions and sticky bit for /var/spool/pbs/spool and /var/spool/pbs/undelivered:
# chmod 777 /var/spool/pbs/spool /var/spool/pbs/undelivered
# chmod o+t /var/spool/pbs/spool /var/spool/pbs/undelivered
Step 10: Start pbs_mom for Torque Client Node
Go to the Head Node
# pscp /etc/init.d/pbs_mom compute:/etc/init.d/
Edit pbs_mom:
# vim /etc/init.d/pbs_mom
(Ensure your path is correct for BASE_PBS_PREFIX=/opt/torque)
(Ensure your path is correct for chmod 777 /var/spool/pbs/spool /var/spool/pbs/undelivered)
(Ensure your path is correct for chmod o+t /var/spool/pbs/spool /var/spool/pbs/undelivered)
# service pbs_mom start
Step 11: Start the pbs_mom, pbs_server, pbs_sched services at Head Node
# service pbs_mom start
# service pbs_sched start
# service pbs_server start
Step 12: Check the PBS is working
Go to Head Node
# pbstop
# pbsnodes -a
(You should see some of the nodes as "free")
Thursday, May 14, 2009
Setting NTP Client on CentOS
To set up your CentOS server as an NTP client, here is how:
# yum install ntp
# chkconfig --levels 235 ntpd on
# ntpdate 0.pool.ntp.org
# /etc/init.d/ntpd start
Connect to a running Screen
Need to work on two screens? Use screen... This article is taken from Linux Format, June 2009.
Step 1: Start a new screen session
# screen (You will be ushered into a new work terminal)
Step 2: Detach from the screen
Hold down Ctrl+A followed by D.
Step 3: To re-attach
# screen -r
Interesting Tutorials on Screen
Tuesday, May 12, 2009
Installing xCAT on Compute Nodes
Some possible errors faced:
Error: Interface setup failed: pumpSetupInterface failed.
Solution (Due to wrong or missing drivers used in the genimage):
- cd /opt/xcat/share/xcat/netboot/centos
- ./genimage -i eth0 -n tg3,e1000,bnx2 -o centos5.2 -p compute
Monday, May 11, 2009
Auto SSH Login without Password
The SSH daemon validates SSH client access either by Linux system verification via /etc/passwd, or by a public/private key cryptography approach.
By using the public/private key approach, we can SSH without a password.
In my write-up it is a root-to-root connection; you can use the same steps for user accounts.
Step 1: At the Host Machine
- Logon to the root home directory.
- Make sure the hidden .ssh directory has the permission 700. If not execute the command
chmod 700 .ssh
- Change Directory to .ssh directory by executing the command
cd .ssh
- Generate the public-private keys using the ssh-keygen command.
# ssh-keygen -t rsa
- The resulting files are id_rsa (the private key) and id_rsa.pub (the public key)
# ssh-copy-id -i ~/.ssh/id_rsa.pub remote-host
(ssh-copy-id appends the key to the remote host's .ssh/authorized_keys)
Step 2: At the Remote Machine, test it out
# ssh remote-host (It should log in automatically)
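What ssh-copy-id does in Step 1 is essentially an append with the permissions sshd requires. A local simulation, with hypothetical paths and a placeholder key string standing in for the real id_rsa.pub:

```shell
# Simulate ssh-copy-id: append a public key to authorized_keys with
# the permissions sshd requires. All paths and keys are stand-ins.
remote=$(mktemp -d)                         # stands in for the remote $HOME
pub='ssh-rsa AAAAB3placeholder root@host'   # stands in for id_rsa.pub
mkdir -p "$remote/.ssh"
chmod 700 "$remote/.ssh"
printf '%s\n' "$pub" >> "$remote/.ssh/authorized_keys"
chmod 600 "$remote/.ssh/authorized_keys"
keys=$(grep -c '^ssh-rsa' "$remote/.ssh/authorized_keys")
echo "keys installed: $keys"
rm -rf "$remote"
```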
Friday, May 8, 2009
Wednesday, May 6, 2009
References for xCAT 2.0
xCAT Summary Commands and Database Tables
Stateless GPFS
MAUI
- "xCAT How-to for MAUI" (pdf)
Torque
- Installing and setting up Torque for xCAT is documented in the xCAT2 Cookbook for LINUX (pdf)
Ganglia
- xCAT How-to for Ganglia (pdf)
xCAT Website
Tuesday, May 5, 2009
IBM - Install and configure General Parallel File System (GPFS) on xSeries
IBM - Install and configure General Parallel File System (GPFS) on xSeries is a good intermediate article on how to install GPFS on xSeries.
Saturday, May 2, 2009
whohas - Application Finder
whohas is a command line tool that allows you to query several package collections at once. It currently supports Arch Linux (and AUR), Debian, Fedora, Gentoo, Mandriva, openSUSE, Slackware (and linuxpackages.net), Source Mage Linux, Ubuntu, FreeBSD, NetBSD, OpenBSD, Fink, and MacPorts repositories.