
Pump up the Performance

High performance computing technologies for enterprises are here. We explain their architectures and interconnects to help you choose the best one for your needs and environment

Anindya Roy

Monday, October 09, 2006


High performance computing (HPC) is invariably linked to rocket science. We tend to think of it as a technology used only in fields such as weather forecasting, genome mapping and simulating chemical reactions. But that's not the complete picture. Just pick up our last six or seven issues, where we have talked about half a dozen different ways to use HPC in an enterprise. Today, HPC is no longer meant only for research labs and universities; it has become a key enabler for business applications.

As per top500.org's latest report, 51% of the top 500 supercomputers worldwide are being used across different industry verticals

Still not convinced? Let's do some number crunching. There is a website, www.top500.org, which ranks the top 500 supercomputers across the globe. It conducts a survey every six months. As per its latest report, 51% of the top 500 supercomputers are used across different industry verticals, while those used in academics and research together account for around 41%. The point we are trying to make here is that with declining hardware and software costs, and with a choice of more than one technology, the setup, management and application of HPC have become easier. It is gradually entering every vertical. Be it banking or finance, image processing/rendering or gaming and entertainment, automobiles or medical sciences, HPC is providing the much needed competitive edge to enterprises to do more in less time. In this article we take you through some of the latest technologies in this field.

HPC architectures
To begin with, let's talk about the HPC technology architecture. The most common architectures used today are MPP and Clusters.

MPP: Massively parallel processing (MPP) is quite similar to symmetric multiprocessing (SMP), which we see today in a normal multiprocessor server. Even a Hyper-Threading processor looks like an SMP system to the operating system, because it presents itself as two logical processors. In both MPP and SMP we have tightly bound processing units. This means the interconnect is more sophisticated and, in most cases, internal. The difference between the two is that in SMP systems all CPUs share the same memory, while in MPP systems each CPU has its own memory. This implies that the application must be divided in such a way that all executing segments can communicate with each other. Hence, MPP systems are difficult to program. But because of this architecture, MPP systems don't have the bottleneck that is present in SMP systems, where all the CPUs attempt to access the same memory at the same time.
Today, MPP is used in the core of a majority of high-end supercomputers. But there's a catch. You can't build an MPP-based HPC system out of commodity PCs. For that, you need to go to vendors such as IBM or Cray, and so the cost involved is pretty high. The benefits are fewer bottlenecks caused by the interconnect and an architecture that is less complex to manage.
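To see what "each CPU has its own memory" means for the programmer, here is a minimal Python sketch of the message-passing style. It is only an illustration under our own assumptions: threads stand in for processors, a queue stands in for the interconnect, and the names worker and distributed_sum are hypothetical. A real MPP system would use the vendor's message-passing library instead.

```python
import queue
import threading


def worker(rank, local_data, outbox):
    """Sum a private slice of the data and send the result as a message."""
    # local_data is this worker's own copy; nothing else is shared.
    partial = sum(local_data)
    outbox.put((rank, partial))


def distributed_sum(values, n_workers):
    """Split values into n_workers private slices and combine the partial sums."""
    outbox = queue.Queue()  # stands in for the interconnect
    threads = []
    for rank in range(n_workers):
        # Strided slicing copies the list, so each worker owns its slice.
        local_data = values[rank::n_workers]
        t = threading.Thread(target=worker, args=(rank, local_data, outbox))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    # The coordinator only ever sees messages, never a worker's memory.
    total = 0
    for _ in range(n_workers):
        _rank, partial = outbox.get()
        total += partial
    return total


if __name__ == "__main__":
    print(distributed_sum(list(range(1, 101)), 4))  # 5050
```

Notice that the splitting and the combining are both written by hand. That is the extra programming effort the shared-memory SMP model does not ask for.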

HPC solution providers
Cray : www.cray.com 
IBM : www.ibm.com/servers/deepcomputing 
Intel : www.intel.com/go/hpc 
IS : www.interactivesupercomputing.com 
NEC : www.hpce.nec.com 
SGI : www.sgi.com/products/servers 
SUN : www.sun.com/servers/hpc/index.jsp 

Clustering: The other technology is clustering, or rather high performance clustering. The major difference between MPP and clustering is that in clustering we have loosely bound processing units, referred to as nodes, and the interconnect is mostly external, such as a standard high speed LAN, Myrinet or InfiniBand. We will discuss these interconnect technologies later. A good thing about such an HPC system is that it can be built on commodity hardware and networking equipment, which brings down the cost. There is plenty of software and middleware available to build one. Clustered HPC systems are divided into SSI based and PVM based clusters. We have discussed them quite extensively in our previous articles. Here's a quick recap.

1. SSI based clusters: Single System Image (SSI) is a clustering technology that can make the nodes on a network work like a single virtual machine with multiple processors. The best thing about SSI is that your application needs no modification to run on the new virtual machine. However, because of this, there are certain drawbacks as well. SSI works very well when you run many tasks simultaneously on the virtual machine, for instance, converting hundreds of media files from one format to another. In such a situation, the SSI cluster will migrate the tasks evenly to all machines available in the cluster and complete the job significantly faster than a single machine would.

On the other hand, if you deploy a single job or thread that requires a large amount of number crunching, the SSI cluster will not give you any performance improvement. This is because it cannot divide a single task into multiple threads and spread them across the nodes of the cluster. One example of clustering middleware for SSI is openMosix. The charm of SSI based clusters is that you can deploy any standard (Linux) application on the cluster without modifying it. So, for enterprises that want to migrate an existing application (mainly batch processing applications) to a cluster, but don't want to invest in rebuilding it with PVM/MPI support, or that are using a third-party application whose code they can't access, SSI based clusters are the best solution.
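The difference between the two kinds of workload can be put into numbers with a rough model. The Python sketch below is our own simplification, not anything the clustering middleware provides. It assumes equal-sized independent tasks, ideal migration and no network overhead, and the function names are hypothetical.

```python
import math


def batch_time(n_tasks, task_seconds, n_nodes):
    """Estimated wall-clock time for n_tasks equal, independent tasks.

    Assumes ideal migration: tasks spread evenly, no network overhead.
    """
    rounds = math.ceil(n_tasks / n_nodes)
    return rounds * task_seconds


def single_job_time(job_seconds, n_nodes):
    """A single unsplittable job runs on one node, however many nodes exist."""
    return job_seconds


if __name__ == "__main__":
    print(batch_time(100, 60, 1))       # 6000 seconds on one machine
    print(batch_time(100, 60, 18))      # 360 seconds on 18 nodes
    print(single_job_time(6000, 18))    # 6000 seconds, no gain
```

Real clusters will fall short of these figures because migration and I/O take time, but the shape of the result holds: many small jobs scale with the node count, one big job does not.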

2. PVM clusters: Parallel Virtual Machine (PVM) is the other clustering technology. It differs from SSI in that you need to rebuild the application you want to run on the cluster with PVM/MPI support. This means that you cannot run an existing application on this cluster without modification.

A commonly used clustering middleware for such clusters is OSCAR. A major benefit of a PVM cluster is that if you are running a single application that needs huge number crunching capability, the application itself takes care of thread management and job migration between nodes.

What would you use a PVM cluster for? Scientific applications, for one, are best suited to PVM clusters. If you want to build a cluster that can do genome mapping, for example, then PVM is the best choice. Similarly, data modeling and forecasting jobs are also best run on such a cluster.

HPC at work

We tested an SSI framework based cluster. For this, we built an 18-node OpenMosix cluster and compared it against a standard dual Xeon 2.4 GHz processor-based server with 1GB RAM. The cost of this server was nearly equal to the cost of our cluster. We compared both using two different tests. The results we got were really exciting.
Test 1: Here, the server is fully loaded, while the cluster is under only 10% load (simultaneously converting 75 WAV files to OGG). The job took about 50% less time on the cluster.
Test 2: The cluster and the server are both fully loaded with the same load (tarring and zipping 55 MB files in batches of 100, 150 and so on up to 300). The cluster gave six times better performance.

Interconnects
After architecture, the next most important thing in an HPC system is the interconnect. Generally, if you choose an MPP based architecture, you don't need to bother about the interconnect, as it's already built into the system. But if you're going for a cluster based approach, you have to decide on the right interconnect to use. In the following portion, we discuss some of the key interconnect technologies for such loosely coupled systems.
Myrinet: Myrinet is a high-speed LAN system designed by Myricom. It is meant to be used as an interconnect between multiple machines to form computer clusters. One of the benefits of using Myrinet is that it has much less protocol overhead than standard interconnects such as Ethernet. As a result, it provides better throughput, less interference and lower latency. It is also one of the most popular interconnects for clusters.
A standard Myrinet setup consists of two fiber optic cables per node (one for upstream and the other for downstream), switches and a router with low overhead. A fourth generation Myrinet can give a speed of 10 Gbps. But this is not the only reason for its popularity. The other benefit is very low latency compared to a normal LAN. This low latency is achieved by a technique in which the application running on the cluster is aware of the NIC's firmware and can bypass the OS by sending messages directly to the network. Some other key features of Myrinet are heartbeat, flow control and error control on each link.

InfiniBand: InfiniBand is a point-to-point, bi-directional serial link used to connect processors with high speed peripherals such as disks. It supports several signaling rates. Initially, InfiniBand technology was used for connecting servers with remote storage and networking devices, and with other servers. Later, it was also adopted for inter-processor communication (IPC) in parallel clusters. The serial connection's base signaling rate is 2.5 Gbit/s in each direction per connection. InfiniBand also supports double and quad data rates of 5 and 10 Gbit/s respectively.

Links can also be aggregated in units of 4 or 12, called 4x or 12x. A quad-rate 12x link can carry 120 Gbit/s raw, or 96 Gbit/s of useful data. Other benefits include greater performance, lower latency, easier and faster sharing of data, built-in security and quality of service, and improved usability (the new form factor will be far easier to add, remove or upgrade than today's shared-bus I/O cards). But again, this is not a commodity product, and to deploy such a setup you need to hire specialists.
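The link figures above follow from simple arithmetic, which the short Python sketch below reproduces. The 2.5 Gbit/s base rate and the 4x/12x widths come from the text. The gap between raw and useful data is modeled here as 8b/10b line encoding (8 data bits carried in every 10 bits on the wire), which is our reading of the 120 versus 96 Gbit/s figures. The function name link_rates is hypothetical.

```python
BASE_LANE_GBITS = 2.5  # single-data-rate signaling per lane, from the text
DATA_BITS = 8          # assumption: 8b/10b encoding, 8 data bits ...
LINE_BITS = 10         # ... carried in every 10 bits on the wire


def link_rates(lanes, rate_multiplier):
    """Return (raw, useful) Gbit/s in each direction for an aggregated link.

    lanes: 1, 4 or 12; rate_multiplier: 1 (single), 2 (double) or 4 (quad).
    """
    raw = BASE_LANE_GBITS * rate_multiplier * lanes
    useful = raw * DATA_BITS / LINE_BITS
    return raw, useful


if __name__ == "__main__":
    print(link_rates(1, 1))    # (2.5, 2.0)
    print(link_rates(4, 2))    # (20.0, 16.0)
    print(link_rates(12, 4))   # (120.0, 96.0)
```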

Gigabit LAN: Now, this technology is known to everyone. Yes, it is the standard Gigabit Ethernet connection used in standard LANs. It is also used as a cluster interconnect. The devices required for this kind of topology are standard Gigabit switches/routers and Cat 5e (enhanced Category 5) UTP cables.

Being a technology that works with commodity products, it is one of the most common interconnects for small and mid-sized HPC systems. The cost of deploying such an interconnect is very low, and it can actually work on your existing infrastructure with minimal or no modification.
But compared to the other interconnects, it has drawbacks such as relatively high latency and a lack of QoS or HA built into the hardware.

Final verdict
Broadly speaking, there are two options for an HPC deployment. It can either be a specialized deployment or be made up of commodity hardware, software and interconnects. The decision is completely yours, and it depends on the type of work you want to do.
If you want to run common applications on top of a cluster, an SSI based commodity cluster will be fine for you. If you have a substantial amount of unutilized processing power on your network, a commodity cluster will also do.

But if you need to run specially designed apps (most likely a single job that requires a huge amount of processing power) with hardware-level failsafe features and rapid scalability, and you don't have the in-house expertise, then you should approach a vendor to do the deployment for you.

Setting up a commodity cluster

How much does it cost to set up a high performance cluster? The answer depends on the number of nodes you want to deploy. Here is what it cost us to deploy a 20 node cluster:
Item                    Configuration                                   Number   Unit Cost (Rs)   Total (Rs)
Nodes                   P4, 2.4GHz, 40 GB HDD, 256MB RAM and CD Drive   20       12,000           240,000
Switch                  24 port, gigabit                                1        25,000           25,000
Monitors                14” color                                       1        4,500            4,500
Keyboard                101 Standard                                    2        200              400
                                                                                 Sub Total        269,900

Option I: Low Cost
Angel Rack                                                              2        2,500            5,000
Power strips            15 amp                                          5        150              750
Ethernet cabling        50 m                                            1        500              500
                                                                                 Sub Total        6,250

Option II: High Cost
Server Racks            Installed                                       2        30,000           60,000
The cost does not include cooling and power solutions. Also, depending on the make of the rack used, your costs could go up by another half a lakh (Rs 50,000) or so for this setup. One monitor is always connected to the cluster manager machine while the other one is used for troubleshooting. To keep costs down, we did not use a KVM switch. Instead, we used rdesktop and SSH on Linux (rdesktop for Linux to Windows and SSH for Linux to Linux) for remote management. We used the Remote Desktop client on Windows for Windows to Windows and PuTTY for Windows to Linux management. Doing away with the KVM switch, however, meant a few trips to the cluster to physically connect the monitor and keyboard for troubleshooting.
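If you want to rework these numbers for a different node count or rack choice, the totals are easy to recompute. The Python sketch below simply copies the quantities and unit costs from the table above. The variable and function names are our own, and the two grand totals are derived by adding each rack option to the base sub total.

```python
# (item, quantity, unit cost in Rs), copied from the cost table
BASE_ITEMS = [
    ("Nodes", 20, 12000),
    ("Switch", 1, 25000),
    ("Monitors", 1, 4500),
    ("Keyboard", 2, 200),
]
LOW_COST_EXTRAS = [
    ("Angel rack", 2, 2500),
    ("Power strips, 15 amp", 5, 150),
    ("Ethernet cabling, 50 m", 1, 500),
]
HIGH_COST_EXTRAS = [
    ("Server racks, installed", 2, 30000),
]


def subtotal(items):
    """Sum quantity times unit cost over a list of line items."""
    return sum(qty * unit for _name, qty, unit in items)


if __name__ == "__main__":
    base = subtotal(BASE_ITEMS)
    print(base)                               # 269900
    print(base + subtotal(LOW_COST_EXTRAS))   # 276150
    print(base + subtotal(HIGH_COST_EXTRAS))  # 329900
```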
