Cluster and Reduce Network Downtime

An introduction to how clustering works

Anuj Jain

Tuesday, August 01, 2000


Clustering means linking several servers together, through special hardware and software, so that they appear as a single system to the clients accessing them. The servers share the entire load among themselves, so if one goes down, the remaining servers take up its share. This is useful for applications that require close to 100 percent uptime, such as Web sites.

Clustering helps keep your services online: if a service running on one server fails, it’s resumed on another server. Administration also becomes easier, since you manage a single entity rather than multiple servers.

Moreover, as the load increases on a cluster, you can scale it up by adding more processors or computers.

Clustering configurations

Cluster hardware configurations vary depending on the technology and the operating system used. They come in three flavors:

Shared Disk: This approach uses central I/O devices that are accessible to all computers within the cluster. The nodes rely on a common bus for disk access. Because all nodes can write to the disks at the same time, data integrity is difficult to maintain, so clustering software is required to keep the data coherent.

Shared disk clusters provide high system availability, because if one node goes down, the others are not affected. On the downside, these clusters suffer from the bottlenecks inherent in shared hardware, which can affect performance. Shared Disk clusters are typically used by Oracle and AIX.

Shared Nothing: In these clusters, there is no central data storage. All nodes work independently with their own disks, but can take over the disks of another node if that node ceases to function. They typically use a shared SCSI connection between the nodes. This type of clustering is not to be confused with the "shared disk" approach, since here multiple nodes never access the same disk concurrently. Shared Nothing cluster solutions include MSCS (Microsoft Cluster Server) for Win NT/2k.

Mirrored Disk: Mirroring involves replicating all the data from a primary storage device to a secondary storage device for availability purposes. Replication occurs while the primary system is online. If a failure occurs, the fail-over process (explained later) transfers control to the secondary system. However, some applications can lose some data during the fail-over process. The advantage of mirroring is that a disk failure neither brings down your network nor destroys the data that has already been mirrored. However, it may not be economical because of the redundant disks.
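
The data-loss caveat in mirroring is easier to see in a small sketch. The following assumes that replication lags slightly behind writes on the primary, which the description above doesn’t specify; the class and method names are hypothetical and don’t correspond to any particular product.

```python
class MirroredStore:
    """Primary store whose writes are copied to a secondary in the background."""

    def __init__(self) -> None:
        self.primary: list[str] = []
        self.secondary: list[str] = []
        self.pending: list[str] = []   # on the primary, not yet mirrored

    def write(self, record: str) -> None:
        self.primary.append(record)
        self.pending.append(record)

    def replicate(self) -> None:
        # Copy every outstanding record to the secondary.
        self.secondary.extend(self.pending)
        self.pending.clear()

    def fail_over(self) -> list[str]:
        # The secondary takes over with whatever it has received so far;
        # records still pending at that moment are lost.
        lost = list(self.pending)
        self.pending.clear()
        self.primary = self.secondary
        return lost


store = MirroredStore()
store.write("order-1")
store.replicate()
store.write("order-2")          # primary fails before this is mirrored
print(store.fail_over())        # ['order-2']
print(store.primary)            # ['order-1']
```

Everything that reached the secondary survives the failure; only the writes made after the last replication pass are at risk.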

Terminologies and concepts

Members of a cluster are referred to as nodes. The Cluster Service is a collection of software on each node that manages all cluster-specific activity. A Resource is an item managed by the Cluster Service. Resources may include physical hardware devices such as disk drives and network cards, or logical items such as logical disk volumes, TCP/IP addresses, entire applications, and databases. A resource is said to be online when it’s providing its service on a node. A group is a collection of resources to be managed as a single unit. Operations performed on a group affect all resources contained in it.

A Group can be owned by only one node at a time. You can’t have resources within a group owned by multiple nodes simultaneously. If a particular node fails, then its group can be failed over or moved to another node as an atomic unit. Each group has a cluster-wide policy about which node it’ll run on, and the system it’ll move to in case of failure.

In case a node fails, a fail-over process starts automatically and distributes the failed node’s workload to other nodes in the cluster. The implementation differs across operating systems. When the node recovers from failure, a fail-back process ensures that it gets back its load.
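
As a rough illustration of groups, ownership, fail-over, and fail-back, here is a minimal sketch. It assumes that each group’s policy is simply an ordered list of preferred nodes; the class names, node names, and resource names are all hypothetical and aren’t taken from any real cluster service.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Group:
    """A set of resources that moves between nodes as one unit."""
    name: str
    resources: list[str]
    preferred_nodes: list[str]      # assumed policy: most preferred first
    owner: Optional[str] = None     # only one node owns a group at a time


@dataclass
class Cluster:
    alive: set[str]
    groups: list[Group] = field(default_factory=list)

    def place(self, group: Group) -> None:
        # Pick the most preferred node that is currently alive.
        group.owner = next(
            (n for n in group.preferred_nodes if n in self.alive), None
        )

    def node_failed(self, node: str) -> None:
        # Fail-over: each group owned by the failed node moves as a whole.
        self.alive.discard(node)
        for g in self.groups:
            if g.owner == node:
                self.place(g)

    def node_recovered(self, node: str) -> None:
        # Fail-back: groups return to the recovered node if they prefer it.
        self.alive.add(node)
        for g in self.groups:
            self.place(g)


cluster = Cluster(alive={"node1", "node2"})
db = Group("database", ["disk", "ip-address", "db-service"], ["node1", "node2"])
cluster.groups.append(db)
cluster.place(db)
cluster.node_failed("node1")
print(db.owner)                 # node2
cluster.node_recovered("node1")
print(db.owner)                 # node1
```

Because ownership is stored on the group rather than on individual resources, the resources in a group can never end up split across nodes.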

Clustering in Windows 2000

In the Windows 2000 family, clustering is supported by the Advanced Server and Datacenter Server versions. There are two flavors of clustering, called MSCS (Microsoft Cluster Service) and NLB (Network Load Balancing). The first provides fail-over support for applications such as databases, messaging systems, and file/print services, while the second distributes load among nodes. MSCS can handle two-node clustering in Advanced Server and four nodes in Datacenter Server. NLB can go up to 32 nodes in each.

MSCS uses software "heartbeats" to detect failed applications or servers. When a node fails, the "shared nothing" clustering architecture lets MSCS automatically transfer ownership of resources from the failed node to a surviving node. If an individual application fails (and not a node), MSCS will typically try to restart it on the same node. If that also fails, then it moves the application’s resources to the other node.

NLB, as the name suggests, balances the load of incoming traffic across clusters of up to 32 nodes. One advantage of this setup is that you can add servers as your requirements grow.
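
The heartbeat and restart-then-move behavior can be sketched in a few lines. This is only an illustration of the idea, not how MSCS is implemented: the timeout value, the function names, and the `restart` callback are all assumptions.

```python
from typing import Callable

HEARTBEAT_TIMEOUT = 5.0   # seconds; an assumed value, not an MSCS default


def failed_nodes(last_heartbeat: dict[str, float], now: float) -> list[str]:
    """Return the nodes whose last heartbeat is older than the timeout."""
    return sorted(
        node for node, seen in last_heartbeat.items()
        if now - seen > HEARTBEAT_TIMEOUT
    )


def recover_application(restart: Callable[[], bool], max_restarts: int = 1) -> str:
    """Try restarting on the same node first; otherwise move to the other node."""
    for _ in range(max_restarts):
        if restart():
            return "restarted locally"
    return "moved to other node"


beats = {"node1": 100.0, "node2": 93.0}
print(failed_nodes(beats, now=101.0))        # ['node2']
print(recover_application(lambda: False))    # moved to other node
```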

Both these clustering technologies can be used in conjunction for higher availability. Take an example of a large Internet site. You could have a Web server farm with Network Load Balancing as the front end, while the back-end, say the database application, is handled by the cluster service.
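
To give a feel for how incoming traffic can be spread over a front-end farm, here is a minimal sketch that maps each client address to one node with a stable hash. It’s a simplified stand-in and not the actual NLB algorithm; the function name and host names are hypothetical, and only the 32-node limit comes from the text above.

```python
import zlib

MAX_NODES = 32   # the NLB limit mentioned above


def pick_node(client_ip: str, nodes: list[str]) -> str:
    """Map a client address to one node using a stable hash."""
    if not nodes:
        raise ValueError("no nodes available")
    if len(nodes) > MAX_NODES:
        raise ValueError("too many nodes in the cluster")
    return nodes[zlib.crc32(client_ip.encode()) % len(nodes)]


farm = ["web1", "web2", "web3"]
for ip in ["10.0.0.7", "10.0.0.8", "10.0.0.9"]:
    print(ip, "->", pick_node(ip, farm))
```

A given client always lands on the same node as long as the farm doesn’t change. Adding or removing a server changes the mapping for many clients, which is one reason real load balancers use more elaborate schemes.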

Clustering under NetWare

Novell introduced NCS (NetWare Cluster Services) for NetWare 5 last fall. With this service, you can create clusters of up to 32 nodes using the shared disk architecture. It requires NetWare support pack 4 or higher to run. You can’t mix NetWare 5.x versions in a cluster. All nodes must be configured with TCP/IP and be on the same subnet. Each server needs at least 64 MB RAM and must be part of the same NDS tree. In addition, each server must have at least one local disk device (not shared) to be used as volume SYS, and the NDS tree must be replicated on at least two servers in the cluster. The latest release of the Cluster Service includes fail-over for DHCP servers, which was not present earlier.
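
Several of these requirements lend themselves to a quick pre-installation check. The sketch below is not a Novell tool; it’s a hypothetical checklist in Python covering only the node count, subnet, RAM, NDS tree, and local SYS volume rules listed above, with made-up server names and addresses.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    ip: str
    ram_mb: int
    nds_tree: str
    has_local_sys_volume: bool


def check_cluster(servers: list[Server], subnet: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    net = ipaddress.ip_network(subnet)
    if len(servers) > 32:
        problems.append("more than 32 nodes")
    if len({s.nds_tree for s in servers}) > 1:
        problems.append("servers are in different NDS trees")
    for s in servers:
        if ipaddress.ip_address(s.ip) not in net:
            problems.append(f"{s.name}: not on subnet {subnet}")
        if s.ram_mb < 64:
            problems.append(f"{s.name}: less than 64 MB RAM")
        if not s.has_local_sys_volume:
            problems.append(f"{s.name}: no local disk for volume SYS")
    return problems


servers = [
    Server("FS1", "192.168.1.10", 128, "ACME", True),
    Server("FS2", "192.168.2.11", 32, "ACME", True),
]
for problem in check_cluster(servers, "192.168.1.0/24"):
    print(problem)
# FS2: not on subnet 192.168.1.0/24
# FS2: less than 64 MB RAM
```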
