Building Your Own Cloud Computing Network
Heard about this new tech but still wary of getting your hands dirty deploying it across your data center? We tell you what it is all about, and why and how you should deploy it.
Anindya Roy
Thursday, January 01, 2009
We talked about cloud computing on more than one occasion last year. It
has since become a buzzword, and a lot of big companies are talking about it
and deploying it in their enterprises. The main benefit of Cloud Computing
that we have focused on till now is that it gives more customizable and
granular control over the hardware when we go for a hosted application approach.
We have also talked about some of the pioneers who have deployed Cloud Computing
to provide such on-demand hosted services to their customers. The better
known ones are Amazon's EC2 and Google's AppEngine. Amazon was probably the
first to release its Cloud offerings to the public and hence has become
quite popular in this domain.
If you have read our old issues, none of this will be news to you. This
time we talk about something completely different. We thought: if Cloud is such
a great technology and gives such granular and detailed control over
Datacenter resources, then why don't enterprises deploy their own cloud setups
and reap those benefits, instead of buying them from a third-party
service provider? During our first level of research we figured out
that, yes, it is very much possible to deploy your own cloud in a data
center using just commodity hardware and open source middleware. The
benefits of such an approach are immense, and we shall talk about
some of them in this article. But before we go into details, let's
refresh ourselves on what Cloud computing is all about.
Cloud for Enterprises
There are thousands of documents which explain what Cloud Computing is, and it
is such a vast topic that understanding it in great detail requires a lot of
effort. But for those IT Managers and CIOs who don't have that much time to
invest in learning, we can explain it in a nutshell as 'a cluster of
Virtualization Servers.'
The easiest way to understand the cloud is to understand why a cloud is
required in the first place. Cloud computing has emerged in data centers
because of an inherent drawback of virtualization infrastructure.
There's a need for true on-the-fly, on-demand scalability, which no
virtualization platform can provide today. You might disagree with me on this
statement, because virtualization is generally supposed to be a great enabler of
on-the-fly resource allocation. But the point to note here is that a
Virtual machine (VM) cannot be scaled beyond the resources available on its Host
machine. If the resource requirement of a VM increases beyond the resources
available on the Host machine, one needs to migrate it to another Host
machine with the requisite resources. And to top it off, if the resource
requirement of the VM was temporary, there is no proactive mechanism which can
move the VM back to the old Host machine, thus freeing up the resources of the
new Host machine.
And this is where Cloud Computing comes in. It is also sometimes
referred to as Elastic Computing because it gives the ability to acquire and
release resources from a unified pool of hardware depending on requirement. To
understand it in detail, let's imagine a scenario where you need to build a web
server which will receive 1000 hits per day for 25 days in a month and 1,00,000
hits per day for the other five days.
Now if you want to get this server, you have to make sure that it can
take the maximum possible load. So if you want to host this server elsewhere,
you have to sign a service level agreement (SLA) for 1,00,000 hits per day,
whereas your actual average requirement is far less than this. So you end up
paying for something which you don't actually need.
Even if you want to run this server in house, you still have to buy a server
which can take a load of 1,00,000 hits per day rather than one which can only
take 1000 hits per day. That extra capacity sits idle for most of the month,
which is a waste of money.
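To put rough numbers on this scenario, here is a minimal Python sketch of the arithmetic. The hit counts come from the example above; the function names are our own, and the sketch assumes a 30-day month and that required server capacity scales linearly with hits per day, which is a simplification.

```python
# Hit counts taken from the scenario in the text (1,00,000 = 100,000).
QUIET_DAYS, QUIET_HITS = 25, 1_000
BUSY_DAYS, BUSY_HITS = 5, 100_000


def average_daily_hits(quiet_days, quiet_hits, busy_days, busy_hits):
    """Mean hits per day over the whole month."""
    total_hits = quiet_days * quiet_hits + busy_days * busy_hits
    return total_hits / (quiet_days + busy_days)


def peak_utilization(average_hits, peak_hits):
    """Fraction of a peak-sized server that an average day actually uses."""
    return average_hits / peak_hits


average = average_daily_hits(QUIET_DAYS, QUIET_HITS, BUSY_DAYS, BUSY_HITS)
utilization = peak_utilization(average, BUSY_HITS)

print(f"Average load: {average:.0f} hits/day")
print(f"Utilization of a peak-sized server: {utilization:.1%}")
```

Under these assumptions the average works out to 17,500 hits per day, so a server sized for the peak is used at about 17.5% of its capacity over the month.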
Now imagine a scenario where each and every server in your data center is
connected via a single middleware which converts them into a huge pool of
unified resources in terms of processors, RAM, etc., and you run your servers as
virtual machines on top of it. Depending upon the load at a given point of time,
VMs can just acquire the available resources in the pool (up to a certain defined
max limit), use them and release them when the job is done. That's what Elastic
Computing is all about. And if the complete cloud is utilized to its max,
all you have to do is plug in one or more new free servers depending on
the resource requirement, and they will automatically be added to the pool of
existing resources.
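The acquire-and-release idea can be illustrated with a toy model in Python. This is only a sketch of the concept as described above: the `ResourcePool` class, its method names and the use of CPU cores as the only resource are all our own assumptions, and it does not reflect how any particular cloud middleware is implemented.

```python
class ResourcePool:
    """Toy model of a unified pool of CPU cores contributed by many servers."""

    def __init__(self):
        self.total_cores = 0
        self.allocated = {}  # VM name -> cores currently held

    def add_server(self, cores):
        """Plugging in a new server simply enlarges the pool."""
        self.total_cores += cores

    @property
    def free_cores(self):
        """Cores in the pool that no VM currently holds."""
        return self.total_cores - sum(self.allocated.values())

    def acquire(self, vm, cores, vm_limit):
        """Grant up to `cores` more cores, capped by the VM's limit and the free pool."""
        held = self.allocated.get(vm, 0)
        granted = max(0, min(cores, vm_limit - held, self.free_cores))
        self.allocated[vm] = held + granted
        return granted

    def release(self, vm, cores):
        """Return cores to the pool once the load drops."""
        held = self.allocated.get(vm, 0)
        returned = min(cores, held)
        self.allocated[vm] = held - returned
        return returned


pool = ResourcePool()
pool.add_server(8)
pool.add_server(8)                       # two servers, 16 cores in the pool

pool.acquire("web", 2, vm_limit=12)      # quiet days: the VM holds 2 cores
pool.acquire("web", 20, vm_limit=12)     # busy days: grows, but only to its limit
print(pool.allocated["web"], pool.free_cores)

pool.release("web", 10)                  # load drops, cores go back to the pool
pool.add_server(8)                       # a new server adds to the free capacity
print(pool.allocated["web"], pool.free_cores)
```

The per-VM limit plays the role of the "certain defined max limit" mentioned above: a busy VM can grow, but it cannot starve the rest of the pool.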
So, with this thought, we are going to deploy our own Cloud Computing
Infrastructure and run some VMs on top of it. We will be using Amazon's EC2
client for connecting and using the Cloud we create. In other words, we can even
say that we are going to build an EC2 compatible cloud.
Here we go...
To begin with, let's first identify what we need to build a Cloud. Of
course we need some servers, which will be the nodes of the Cloud, and then we
need a Controller Server, which will manage the complete cloud. To connect all
of these, we will require a network, preferably a Gigabit network, and we will
need clustering and virtualization middleware.
All the software we are going to use here is open source or free and is
available for download. And if you are going to do a test setup of a cloud for
understanding and research, just two servers will be enough for you.
You don't need to worry about spending a huge amount of money on this
deployment. You can just do a test run in your backyard.
Installing the controller
The software which we are going to use as the Cloud middleware is called
Eucalyptus and is an Open Source project managed by UCSB. Eucalyptus stands for
“Elastic Utility Computing Architecture for Linking Your Programs To Useful
Systems”. A long name that!
There are two ways to deploy Eucalyptus. One is a lengthy
process where you set up multiple Linux machines, install Java and Xen on them,
and then install the client components of Eucalyptus. Then you set up a separate
Linux machine and install the controller components of Eucalyptus on it. Next you
configure and use it (if you still have energy left for doing that!).
Else, there is a simpler approach. Install a ROCKS clustering front end with
the Java and Xen rolls, then add the Eucalyptus Roll to it manually. Then let
ROCKS' TFTP server take over the node deployments and you are done. Once the
nodes are installed, you can just download Amazon's EC2 client and start using it.
Figure: After booting the central management machine with the Rocks 5.0 DVD,
type frontend to start installation.
I am sure you prefer the second option, and that's what I went with
too. Now let's look at deploying Eucalyptus using ROCKS. For those who
don't know what ROCKS is, here's a small intro: it is a Clustering/Grid
middleware, somewhat similar to OSCAR, and comes with a brilliant node monitoring
tool called Ganglia (which we have talked about before) and TFTP based remote
deployment support for installing nodes.
Installing ROCKS front end
Download the ROCKS 5.0 Boot, Core and OS Roll DVD from http://www.rocksclusters.org/wordpress/?page_id=82.
Select the right architecture before downloading. We went with the 64-bit
version, which is most likely the desired architecture for a Cloud. But a word
of caution here! Don't get carried away by the instinct to download the latest
and the greatest, because the ROCKS version 5.1 DVD is also available on the
same site and it doesn't work with Eucalyptus. And no document anywhere says
that. I did exactly that at first and wasted a complete day trying to deploy
Eucalyptus on Rocks 5.1.