Monday, March 30, 2015
Using ibdev2netdev to quickly identify ports
ibdev2netdev is a handy tool for quickly seeing which network interface (ib0, ib1, ...) each InfiniBand device port maps to, and whether that port is up.
[root@headnode-h99 ~]# ibdev2netdev
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Down)
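The mapping is easy to filter in a script. A minimal sketch, assuming one mapping per line in the form "<device> port <n> ==> <netdev> (<state>)" as in the output shown for this post; the function name ib_up_netdevs is made up for illustration:

```shell
# Print the network interface of every IB port reported as "Up".
# Assumes each input line looks like: mlx4_0 port 1 ==> ib0 (Up)
ib_up_netdevs() {
    awk '$4 == "==>" && $6 == "(Up)" { print $5 }'
}

# Usage on a node that has ibdev2netdev installed:
#   ibdev2netdev | ib_up_netdevs
```

Reading from standard input keeps the filter usable on saved output from other nodes as well.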
Tools for Performance Test for IB
ibportstate
- Enables querying of the logical link state and physical port state of an IB port.
- Displays information such as LinkSpeed, LinkWidth and extended link speed.
- Allows adjusting the link speed that is enabled on any IB port.
# ibportstate LID PortNumber
# Port info: Lid 15 port 1
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................15
SMLid:...........................1
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
LinkSpeedExtSupported:...........14.0625 Gbps
LinkSpeedExtEnabled:.............14.0625 Gbps
LinkSpeedExtActive:..............14.0625 Gbps
Mkey:............................<not displayed>
MkeyLeasePeriod:.................0
ProtectBits:.....................0
# MLNX ext Port info: Lid 15 port 1
StateChangeEnable:...............0x00
LinkSpeedSupported:..............0x01
LinkSpeedEnabled:................0x01
LinkSpeedActive:.................0x00
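When checking many ports it helps to pull out a single field rather than read the whole report. A minimal sketch, assuming each field sits on its own line in the "Name:....value" layout; the helper name ibportstate_field is hypothetical:

```shell
# Extract one field from ibportstate output read on stdin.
# Only the first match is printed, because the MLNX extended section
# repeats some field names (for example LinkSpeedActive).
ibportstate_field() {
    awk -v name="$1" '
        !found && index($0, name ":") == 1 {
            sub(/^[^:]*:\.*/, "")   # drop "Name:" and the dot padding
            print
            found = 1
        }'
}

# Usage, with the LID and port number from the example above:
#   ibportstate 15 1 | ibportstate_field LinkSpeedActive
```

Matching on the full name followed by a colon keeps LinkSpeedActive from also matching LinkSpeedExtActive.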
Friday, March 27, 2015
Leap Second on 30th June 2015 and effects on CentOS and RHEL
At 23:59:59 UTC on June 30, 2015, clocks will tick on to 23:59:60 before rolling over to midnight. The extra second allows the Earth's spin to catch up with atomic time.
Background - http://www.usatoday.com/story/tech/2015/01/08/computer-chaos-feares/21433363/
All of Red Hat Enterprise Linux 4, 5, 6 & 7 will be affected.
*Resolve Leap Second Issues in Red Hat Enterprise Linux
https://access.redhat.com/articles/15145
*Are we susceptible to a leap second event?
https://access.redhat.com/articles/199563
*Labs: Leap Second Issue Detector
https://access.redhat.com/labs/leapsecond/
Basic Configuration of Octopus 4.1.2 with OpenMPI on CentOS 6
Do take a look at the installation writeup by linuxcluster Basic Configuration of Octopus 4.1.2 with OpenMPI on CentOS 6
Saturday, March 21, 2015
Unable to Submit via Torque Submission Node - Socket_Connect Error for Torque 4.2.7
I am using Torque Server version 4.2.7 and was trying to configure a submission node. Here is a sample of my qmgr -c "p s" output. The firewall allows the necessary traffic.
# qmgr -c "p s"
..........
set server acl_hosts = submission_node.cluster.spms.ntu.edu.sg
set server acl_hosts += head_node.cluster.spms.ntu.edu.sg
set server submit_hosts = submission_node.cluster.spms.ntu.edu.sg
set server submit_hosts += head_node.cluster.spms.ntu.edu.sg
set server allow_node_submit = True
.......
After I ssh into the submission_node and try to submit a job as a normal user, I get the errors below. Yes, the submission_node has been configured as a conventional client.
socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
socket_connect error (VERIFY THAT trqauthd IS RUNNING)
Error in connection to trqauthd (15137)-[could not connect to unix socket /tmp/trqauthd-unix: 111]
Unable to communicate with head_node(10.10.10.20)
Communication failure.
qsub: cannot connect to server head_node (errno=15137) could not connect to trqauthd
The Torque 4.2.7 documentation says that on RH / CentOS the submission node must have the trqauthd init script in /etc/init.d. You can simply scp /etc/init.d/trqauthd to the submission node.
From the head_node
# scp -v /etc/init.d/trqauthd root@submission_node:/etc/init.d/
Create an /etc/hosts.equiv file on the head_node
# touch /etc/hosts.equiv
Put the host name of the Submission_Node in /etc/hosts.equiv on the head_node
submission_node
At the Submission_Node, start the trqauthd service
# service trqauthd start
Now try submitting as a normal user.
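Before retrying, it can save time to confirm that the socket named in the error message now exists. A minimal sketch; the default path is the one printed in the error above, and the function name trqauthd_socket_ok is made up for illustration:

```shell
# Report whether the trqauthd UNIX socket is present.
# Pass a different path as the first argument if your build uses another location.
trqauthd_socket_ok() {
    sock="${1:-/tmp/trqauthd-unix}"
    if [ -S "$sock" ]; then
        echo "trqauthd socket found: $sock"
    else
        echo "trqauthd socket missing: $sock (is trqauthd running?)" >&2
        return 1
    fi
}

# Usage on the submission node, as a normal user:
#   trqauthd_socket_ok && echo "hostname" | qsub
```

The usage line assumes qsub reads the job script from standard input when no script file is given.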
Tuesday, March 17, 2015
Where to download Intel Compiler?
I often have to google for a while before I can locate the download site for our purchased Intel Compiler. Here is the link in case I forget again. Just log on and you can access the Intel Compilers.
https://registrationcenter.intel.com/RegCenter/MyProducts.aspx
Monday, March 9, 2015
Enabling Predictive Cache Statistics (PCS) for Data OnTap 8.2p
* node1 is the controller currently primary to the aggregate/vol/LUN.
Step 1: Enable PCS
node1::> node run -node node1
node1::> options flexscale.enable on
node1::> options flexscale.enable
flexscale.enable pcs                  (you should see this)
node1::> options flexscale.pcs_size 330GB                  (based on 3 x 200GB SSD RAID4)
Step 2: Run your representative workload
Step 3: Collect data throughout the process
node1::> stats show -p flexscale-access
NetApp recommends issuing this command through an SSH connection and logging the output throughout the observation period because you want to capture and observe the peak performance of the system and the cache. This output can also be easily imported into spreadsheet software, graphed, and so on.
This process initially provides information on the “cold” state of the emulated cache. That is, no data is in the cache at the start of the test, and the cache is filled as the workload runs. The best time to observe the emulated cache is once it is filled, or “warmed”, as this will be the point when it enters a steady state. Filling the emulated cache can take a considerable amount of time and depends greatly on the workload.
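One way to log the output over SSH is to timestamp every line as it arrives, which makes later graphing easier. A minimal sketch; the function name timestamp_lines, the login admin@node1 and the log file name are placeholders, and the exact remote command depends on how you reach the node shell on your system:

```shell
# Prefix every line read on stdin with a UTC timestamp, so samples collected
# over a long observation period can be lined up and graphed later.
timestamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$line"
    done
}

# Usage (placeholders, adjust to your environment):
#   ssh admin@node1 "stats show -p flexscale-access" | timestamp_lines | tee -a pcs-stats.log
```

Appending with tee -a keeps earlier samples if the SSH session drops and has to be restarted during the warm-up period.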
Sunday, March 8, 2015
Using Tuned to tune CentOS 6 System
Tuned is a dynamic adaptive system tuning daemon. According to the manual page:
tuned is a dynamic adaptive system tuning daemon that tunes system settings dynamically depending on usage. For each hardware subsystem a specific monitoring plugin collects data periodically. This information is then used by tuning plugins to change system settings to lower or higher power saving modes in order to adapt to the current usage. Currently monitoring and tuning plugins for CPU, ethernet network and ATA harddisk devices are implemented.
Using Tuned
1. Installing tuned
# yum install tuned
2. To view a list of available tuning profiles
[root@myCentOS ~]# tuned-adm list
Available profiles:
- laptop-ac-powersave
- server-powersave
- laptop-battery-powersave
- desktop-powersave
- virtual-host
- virtual-guest
- enterprise-storage
- throughput-performance
- latency-performance
- spindown-disk
- default
3. Tuning to a specific profile
# tuned-adm profile latency-performance
Switching to profile 'latency-performance'
Applying deadline elevator: dm-0 dm-1 dm-2 sda [ OK ]
Applying ktune sysctl settings:
/etc/ktune.d/tunedadm.conf: [ OK ]
Calling '/etc/ktune.d/tunedadm.sh start': [ OK ]
Applying sysctl settings from /etc/sysctl.conf
Starting tuned: [ OK ]
4. Checking current tuned profile used and its status
# tuned-adm active
Current active profile: latency-performance
Service tuned: enabled, running
Service ktune: enabled, running
5. Turning off the tuned daemon
# tuned-adm off
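The steps above can be combined so that a profile is only switched when it is not already active. A minimal sketch, assuming the "Current active profile: <name>" line shown in step 4; the function name tuned_active_profile is hypothetical:

```shell
# Print the active profile name from `tuned-adm active` output read on stdin.
tuned_active_profile() {
    sed -n 's/^Current active profile: *//p'
}

# Usage: switch only when the wanted profile is not already active.
#   want=latency-performance
#   [ "$(tuned-adm active | tuned_active_profile)" = "$want" ] || tuned-adm profile "$want"
```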
References:
- Tuning Your System With Tuned (http://servicesblog.redhat.com)
Compiling Gromacs 5.0.4 on CentOS 6
Compiling Gromacs has never been easier now that it uses cmake. There are a few assumptions.
- Use MKL and Intel Compilers
- Use OpenMPI as the MPI-of-choice. The necessary PATH and LD_LIBRARY_PATH have been placed in .bashrc
- We will use single precision for speed, and build only mdrun with MPI enabled
# tar xfz gromacs-5.0.4.tar.gz
# cd gromacs-5.0.4
# mkdir build
# cd build
# /usr/local/cmake-3.1.3/bin/cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs-5.0.4 -DGMX_MPI=on -DGMX_FFT_LIBRARY=mkl -DGMX_DOUBLE=off -DGMX_BUILD_MDRUN_ONLY=on -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc
# make
# make check
# sudo make install
# source /usr/local/gromacs-5.0.4/bin/GMXRC
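For reference, the cmake options used here can be kept in a small helper so that each one is documented in one place. This is only a sketch of one way to organise them; the function name gmx_cmake_flags is made up, and the flag list simply mirrors the command in this post:

```shell
# Print the cmake options for this build, one per line.
#   CMAKE_INSTALL_PREFIX      install location (first argument)
#   CMAKE_C/CXX_COMPILER      Intel compilers
#   GMX_MPI=on                build with MPI support (OpenMPI here)
#   GMX_FFT_LIBRARY=mkl       use MKL for FFTs
#   GMX_BUILD_OWN_FFTW=ON     kept from the original command
#   GMX_DOUBLE=off            single precision
#   GMX_BUILD_MDRUN_ONLY=on   build only mdrun
#   REGRESSIONTEST_DOWNLOAD   fetch the regression tests used by "make check"
gmx_cmake_flags() {
    printf '%s\n' \
        "-DCMAKE_INSTALL_PREFIX=$1" \
        "-DCMAKE_C_COMPILER=icc" \
        "-DCMAKE_CXX_COMPILER=icpc" \
        "-DGMX_MPI=on" \
        "-DGMX_FFT_LIBRARY=mkl" \
        "-DGMX_BUILD_OWN_FFTW=ON" \
        "-DGMX_DOUBLE=off" \
        "-DGMX_BUILD_MDRUN_ONLY=on" \
        "-DREGRESSIONTEST_DOWNLOAD=ON"
}

# Usage from the build directory (none of the flags contain spaces,
# so plain word splitting of the output is safe here):
#   /usr/local/cmake-3.1.3/bin/cmake .. $(gmx_cmake_flags /usr/local/gromacs-5.0.4)
```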
Friday, March 6, 2015
FREAK (Factoring Attack on RSA-EXPORT Keys) Attack
The vulnerability allows attackers to intercept HTTPS connections between vulnerable clients and servers and force them to use ‘export-grade’ cryptography (weak export cipher suites), which can then be decrypted.
It is recommended to update to the latest software patches. OpenSSL (CVE-2015-0204): versions before 1.0.1k are vulnerable.
For non-OpenSSL, disable support for any export cipher suites and known insecure ciphers on your web server.
Solutions:
- Use the latest version of Chrome/IE/Mozilla instead of the Android Browser and Safari.
- Check if your site is vulnerable. SSL Labs - https://www.ssllabs.com/ssltest/
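On the server side, one quick local check is to expand the cipher string your web server is configured with and look for export suites in the result. A minimal sketch, relying on the OpenSSL convention that export suite names start with "EXP" (for example EXP-RC4-MD5); the function name list_export_ciphers and the cipher string in the usage line are illustrative only:

```shell
# List export-grade suites in a colon-separated OpenSSL cipher list read on stdin.
# Prints nothing (and still succeeds) when no export suite is present.
list_export_ciphers() {
    tr ':' '\n' | grep '^EXP' || true
}

# Usage: substitute the cipher string from your own server configuration.
#   openssl ciphers 'HIGH:MEDIUM:!aNULL' | list_export_ciphers
```

This only shows what the local OpenSSL build would enable for that string; it does not replace an external scan such as the SSL Labs test.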
References:
- FREAK Attack - https://freakattack.com/
- Graham Cluley - https://grahamcluley.com/2015/03/freak-attack-what-is-it-heres-what-you-need-to-know/
- Recommended Configuration - https://wiki.mozilla.org/Security/Server_Side_TLS#Recommended_configurations
do_vfs_lock: VFS is out of sync with lock manager for CentOS 5
If you are seeing "do_vfs_lock: VFS is out of sync with lock manager" messages on your screen or in your log files, read on.
According to the Red Hat site,
The message will be printed whenever there is locking contention (two or more processes trying to lock the same file) and the mount had nolock specified.
The RHEL-5 code prints the message unconditionally, while on the upstream code it is a debugging message, so it won't be seen on normal operation there.
Check the mount options in your /etc/fstab and remove the "nolock" option from the affected mounts.
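To find the affected entries quickly, the fstab can be scanned for NFS mounts that carry the option. A minimal sketch, assuming the standard fstab column order (device, mount point, type, options); the function name nfs_nolock_mounts is made up for illustration:

```shell
# Print the mount point of every NFS entry whose options include "nolock".
# Reads fstab-formatted lines on stdin and skips comment lines.
nfs_nolock_mounts() {
    awk '$1 !~ /^#/ && $3 ~ /^nfs/ && $4 ~ /(^|,)nolock(,|$)/ { print $2 }'
}

# Usage:
#   nfs_nolock_mounts < /etc/fstab
```

After editing /etc/fstab, the change applies the next time each file system is mounted.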