Friday, February 28, 2014
Amazon EC2 Console Improvements
EC2 has made some improvements to the console. Do take a look at the Amazon Web Services Blog. Some of the features are:
- Cloning Security Group Rules
You can now copy the rules from an existing security group to a new one by selecting the existing rule and choosing Copy to new from the Actions menu.
- Managing Outbound Rules in VPC Security Groups
You can now edit the outbound rules of a VPC Security Group from within the EC2 console (this operation was previously available from the VPC console).
- Deep Linking Across EC2 Resources
The new deep linking feature lets you easily locate and work with resources that are associated with one another. For example, you can move from an instance to one of its security groups with a single click.
- Compare Spot Prices Across AZs
The updated Spot Pricing History graph makes it easier for you to compare Spot prices across Availability Zones. Simply hover your cursor over the graph and observe the Spot prices across all of the Availability Zones in the Region.
- Tagging of Spot Requests
You can now add tags to requests for EC2 Spot instances.
Thursday, February 27, 2014
SGI UV way of doing Hadoop
I thought this video clip was quite interesting, as it proposes a large SMP-like machine, in this case the SGI UV, to solve "Big Analytics" issues. Do look at this interesting video from SGI - Hadoop-WHAT? How to find that needle of information in your own Big Data haystacks (HD)
Wednesday, February 26, 2014
Using mcelog to detect cpu and memory issues on CentOS 6
mcelog is a daemon that collects and decodes Machine Check Exception data on x86-64 machines
According to the mcelog website,
The mcelog daemon accounts memory and some other errors in various ways. mcelog --client can be used to query a running daemon. The daemon can also execute triggers when configurable error thresholds are exceeded. This is used to implement a range of automatic predictive failure analysis algorithms, including bad page offlining and automatic cache error handling. User-defined actions can also be configured.
For CentOS 6, mcelog is installed by default, but you can also install it with yum:
# yum install mcelog
CentOS has already configured a cron job to run an hourly check. You can take a look at /etc/cron.hourly/mcelog.cron. It should look something like this:
#!/bin/bash
# do not run if mcelogd is running
service mcelogd status >& /dev/null
[ $? -eq 0 ] && exit 0
# is mcelog supported?
/usr/sbin/mcelog --supported >& /dev/null
if [ $? -eq 1 ]; then exit 1; fi
/usr/sbin/mcelog --ignorenodev --filter >> /var/log/mcelog
To view the errors, do take a look at /var/log/mcelog:
# less /var/log/mcelog
To see the log in real time:
# tail -f /var/log/mcelog
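If you just want a quick count of recorded hardware events, you can grep the log. A minimal sketch, using an illustrative sample snippet written to /tmp (real entries live in /var/log/mcelog, and the exact fields vary by CPU):

```shell
# Write a sample mcelog entry to a temp file; these lines are an
# illustrative example of the log format, not from a real machine.
cat > /tmp/sample_mcelog <<'EOF'
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 8
MISC 908400800086 ADDR 6c05ca00
STATUS 8c0000400001009f MCGSTATUS 0
EOF
# Each event starts with a "Hardware event" line, so counting those
# lines gives the number of recorded events.
grep -c 'Hardware event' /tmp/sample_mcelog
# prints: 1
```

On a live system you would point the grep at /var/log/mcelog instead, or query the running daemon with mcelog --client.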
Tuesday, February 25, 2014
A peek into NTU's supercomputer and hybrid cloud
A writeup, "A peek into NTU's supercomputer and hybrid cloud", covering Enterprise NTU's efforts to deploy a hybrid cloud IT architecture.
Monday, February 24, 2014
NFS Share getting a (1) appended when adding NFS storage in vCenter
I seem to be getting a (1) appended to the NFS share name when I enter the NFS server, volume, and datastore name in the vCenter "Add Storage" dialog box. I was using vSphere 5.1 and vCenter 5.1.
There are 2 scenarios
Scenario 1 - Case Sensitive for the ServerName
Do look at NFS share getting a (1) appended when adding to a new host in existing datacenter. The issue is caused by different capitalization of the server name.
Scenario 2 - Missing "/" for the Storage Folder
For example (I have this on my existing host):
Server: 192.168.1.1
Folder: vol/vol1
DataStore Name: MyDataStore
Do note that the volume has already been mapped on other hosts. I realized the error was very simple: I had missed the "/" in front of vol/vol1. The correct settings are:
Server: 192.168.1.1
Folder: /vol/vol1
DataStore Name: MyDataStore
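The underlying cause in both scenarios is that vCenter treats specs that differ only in case or in a leading slash as different datastores. A minimal sketch of normalizing the spec before entering it (the values below are the ones from this post; the normalization rules are my own assumption about a sensible convention, not a VMware tool):

```shell
# Example values from the post above.
SERVER="192.168.1.1"
FOLDER="vol/vol1"

# Lower-case the server name so it matches the existing hosts' entry.
SERVER=$(echo "$SERVER" | tr 'A-Z' 'a-z')

# Force a leading slash on the folder path if it is missing.
case "$FOLDER" in
  /*) : ;;                 # already absolute, leave it alone
  *)  FOLDER="/$FOLDER" ;; # prepend the missing slash
esac

echo "$SERVER:$FOLDER"
# prints: 192.168.1.1:/vol/vol1
```

Entering the normalized form consistently on every host avoids the (1) suffix.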
Sunday, February 23, 2014
50% Enterprise to Use Hybrid Cloud by 2017
I read this article from InformationWeek: Gartner: 50% Of Enterprises Use Hybrid Cloud By 2017. Some excerpts from the article:
Gartner predicts that almost half of large enterprises will be engaged in a combined, public/private cloud operation, often described as "hybrid" cloud computing, four years from now.
......
......
VMware, IBM and Microsoft have all launched technology initiatives on the strength of the future prospects of hybrid cloud computing. Despite that, Bittman noted that "actual hybrid cloud computing deployments are rare." While three-fourths of those polled predicted hybrid deployments would occur in the next two years, Bittman scaled that optimism back to "nearly half by the end of 2017" for his prediction.
.......
.......
If IT finds public cloud use follows a pattern already established in the private cloud, then it becomes possible, in some cases, to gain the flexibility of adding resources when needed from the public cloud. There will still be limiting factors. Only some public clouds are likely to have a degree of compatibility with a given private cloud. Application-to-application integration may not be possible for many legacy apps. But public/private cloud operations will be a logical outcome of the path many organizations have already started down, Bittman concluded.
Tuesday, February 18, 2014
SPICE - Simple Protocol for Independent Computing Environments
According to wikipedia,
In computing, SPICE (the Simple Protocol for Independent Computing Environments) is a remote-display system built for virtual environments which allows users to view a computing "desktop" environment - not only on its computer-server machine, but also from anywhere on the Internet and using a wide variety of machine architectures.
There is a good writeup of SPICE usage experiences and performance, with comments on other remote protocols. Do read Taking SPICE for a Spin.
Sunday, February 16, 2014
Compiling GSL/4.1 from GIT on CentOS 5
GSL/4.1 is a code construction tool. It will generate code in all languages and for all purposes. To compile GSL, do the following:
Prerequisites
# yum install pcre
Compilation
# git clone git://github.com/imatix/gsl
# cd gsl/src
# make
# sudo make install
To show command-line help
# ./gsl
Friday, February 14, 2014
High Performance Capabilities for Windows Azure
Taken from HPCWire Weekly Updates (3 Feb 2014)........
The Windows Azure Cloud Service now includes two new compute-intensive virtual machine sizes. Known as A8 and A9, they are Azure’s most performant instances to date. The A8 instance comes with 8 Intel virtual processor cores and 56 GB of RAM, while A9 comes with 16 such cores and 112 GB of memory. The instance family also includes 40 Gbps InfiniBand networking for low-latency and high-throughput communication.
......
......
The new instance type actually employs two interconnect protocols. Traditional Ethernet is the link to Azure Storage, CDN, and other Windows Azure services or solutions, while a 40 Gbps InfiniBand network connects compute instances within the same Cloud Services deployment. Furthermore, the InfiniBand network employs remote direct memory access (RDMA) technology for maximum efficiency of parallel MPI applications, an enhancement that Microsoft first previewed more than a year ago, when it debuted its Big Compute strategy.
References
Wednesday, February 12, 2014
Open Source Enterprise-Ready Request Trackers
If you are looking for an open source, enterprise-ready request tracker, do take a look at RT: Request Tracker. The current version is 4.2.2.
According to the website, RT is a battle-tested issue tracking system which thousands of organizations use for bug tracking, help desk ticketing, customer service, workflow processes, change management, network operations, youth counselling and even more.
Friday, February 7, 2014
Using SR-IOV with Intel® Ethernet Server Adapters
The FAQ on Using SR-IOV with Intel® Ethernet Server Adapters
- Which Intel® Ethernet Adapters and Controllers support SR-IOV?
- Which hypervisors support SR-IOV on Intel® Ethernet Adapters?
- Which guest operating systems have virtual function drivers?
- Where can I get the virtual function (VF) drivers?
- How can I make SR-IOV work in Linux*?
- Can I use SR-IOV with Microsoft Hyper-V*?
- Can I use SR-IOV with VMware products*?
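One of the first questions on Linux is simply whether your adapter exposes virtual functions at all. A minimal sketch using the standard sysfs SR-IOV attributes (the interface name eth0 is an assumption; substitute your own NIC):

```shell
# Path to the PCI device behind a network interface; "eth0" is just
# an example name and will differ on your system.
DEV=/sys/class/net/eth0/device

# sriov_totalvfs is exposed by the kernel for SR-IOV-capable devices
# and reports how many virtual functions the hardware supports.
if [ -r "$DEV/sriov_totalvfs" ]; then
  echo "SR-IOV supported: $(cat "$DEV/sriov_totalvfs") VFs available"
else
  echo "SR-IOV not reported for this device"
fi
```

On a capable adapter you would then enable VFs by writing the desired count to sriov_numvfs in the same directory.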
Thursday, February 6, 2014
MPI fail due to insufficient space
If you encounter an error that causes your MPI run to fail, such as the one below, it is due to /tmp not having sufficient space:
[node1:25646] [[31090,0],0] ORTE_ERROR_LOG: Error in file orterun.c at line 543
Number of requested processors = 8
[node1:25663] opal_os_dirpath_create: Error: Unable to create the sub-directory (/tmp/openmpi-sessions-user1@node1_0) of (/tmp/openmpi-sessions-user1@node1_0/31075/0/0), mkdir failed [1]
[node1:25663] [[31075,0],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[node1:25663] [[31075,0],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[node1:25663] [[31075,0],0] ORTE_ERROR_LOG: Error in file ess_hnp_module.c at line 304
The solution is to clean up /tmp, or the partition where /tmp resides. You can run "df -h" at the console to verify your disk space:
# df -h
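You can also check the free space programmatically before launching a job. A minimal sketch (the 100 MB threshold is an arbitrary illustrative value, not an Open MPI requirement):

```shell
# Check free space on the filesystem holding the MPI session directory.
# Open MPI uses $TMPDIR if set, otherwise /tmp.
TMPDIR_PATH=${TMPDIR:-/tmp}

# df -P gives POSIX output; -m reports megabytes; field 4 is "Available".
FREE_MB=$(df -Pm "$TMPDIR_PATH" | awk 'NR==2 {print $4}')

if [ "$FREE_MB" -lt 100 ]; then
  echo "WARNING: only ${FREE_MB} MB free on ${TMPDIR_PATH}; clean up before running mpirun"
else
  echo "OK: ${FREE_MB} MB free on ${TMPDIR_PATH}"
fi
```

Dropping a check like this into your job script catches the problem before mpirun dies with the cryptic session-directory error above.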
Tuesday, February 4, 2014
Deprecated libguide.so replaced by libiomp5.so
If your program fails with an error such as:
"Error while loading shared libraries: libguide.so: cannot open shared object file: No such file or directory"
It could be because you are linking against the deprecated libguide.so, which may no longer be available in recent Intel compiler versions. To solve the issue, use libiomp5.so, which ships with more recent Intel compilers and is the replacement for libguide.so.
So do change the compilation flag from -lguide to -liomp5
libguide.so was not compatible with GCC's implementation of OpenMP. libiomp5.so is compatible and is now the default Intel OpenMP runtime library.
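In practice the fix is often a one-line change to your Makefile. A minimal sketch, using a hypothetical Makefile fragment written to /tmp for illustration:

```shell
# Create a hypothetical Makefile fragment that still links the
# deprecated libguide (illustrative only, not from a real project).
cat > /tmp/demo_makefile <<'EOF'
LDFLAGS = -lpthread -lguide
EOF

# Swap the deprecated -lguide flag for -liomp5 in place.
sed -i 's/-lguide/-liomp5/' /tmp/demo_makefile

cat /tmp/demo_makefile
# prints: LDFLAGS = -lpthread -liomp5
```

Run the same sed over your real Makefiles (after backing them up), then relink.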
References:
Saturday, February 1, 2014
Using mpirun --mca orte_base_help_aggregate 0 to debug error
If your mpirun dies without any error messages, you may want to read this from the Open MPI FAQ:
Debugging applications in parallel 7. My process dies without any output. Why?
If your application fails due to memory corruption, Open MPI may subsequently fail to output an error message before dying. Specifically, starting with v1.3, Open MPI attempts to aggregate error messages from multiple processes in an attempt to show unique error messages only once (vs. one for each MPI process -- which can be unwieldy, especially when running large MPI jobs).
However, this aggregation process requires allocating memory in the MPI process when it displays the error message. If the process' memory is already corrupted, Open MPI's attempt to allocate memory may fail and the process will simply die, possibly silently. When Open MPI does not attempt to aggregate error messages, most of its setup work is done during MPI_INIT and no memory is allocated during the "print the error" routine. It therefore almost always successfully outputs error messages in real time -- but at the expense that you'll potentially see the same error message for each MPI process that encountered the error.
Hence, the error message aggregation is usually a good thing, but sometimes it can mask a real error. You can disable Open MPI's error message aggregation with the orte_base_help_aggregate MCA parameter. For example:
$ mpirun --mca orte_base_help_aggregate 0 ...