De-mystifying Grid Technologies
Understand the concept of grids, the difference between grids and clusters and the emerging technologies in this field
Anindya Roy and Anadi Misra
Tuesday, April 03, 2007
Computation has changed drastically since the days of the first computer. In
the 60s and 70s, mainframes took charge of all processing and computation for
government, scientific and organizational needs. Thereafter, we saw the advent
of desktops or 'Micro Computers.' Almost in parallel, the concepts of networking
started to develop, and it didn't take long before clusters and grids were
implemented. In this article, we look at the extremes of computation achieved
by taking clusters a step further. Yes, we are talking about the Computational
Grid, still in its infancy yet very promising. Read on to find out what it is,
how it works, and most importantly, which way it is heading.
What is a Grid?
Its name and concept are derived from the electric power grid. Put briefly, a
grid is a way to share computational power and data storage over the Internet.
Just as with the electric grid, you don't have to worry about where the power
is coming from. Basically, the computational grid brings all the resources
under it into one entity. This collection of resources can then be used for
high-end computation and, with the storage of the participating systems
combined, provides a near-infinite but cheap storage option. While some might
define it as a 'collection of clusters' or in other ways, we would like to stick
to the definition above without tying it to any specific structure.
Now let us get down to a more elaborate definition. Grid computing can best
be defined as a form of distributed computing that works by sharing computing,
application, data, storage, or network resources across dynamic and
geographically dispersed organizations or computers. This is the reason we say
that a collection of clusters is not an appropriate definition. Clusters don't
work by bringing together systems or computers located geographically apart. We
will look at the differences between grids and clusters in detail a little
later.
Grid technologies promise to change the way organizations tackle complex
computational problems. However, the vision of large scale resource sharing is
not yet a reality in many areas. Grid computing is an evolving area of
computing, where the standards and technology needed to enable it are still
being developed.
Need for a grid
Science has advanced by leaps and bounds and has grown more dependent on
computational power for research and analysis. A decade ago, a single powerful
machine was enough to analyze whatever data, say, a pharma researcher had.
Things have changed a lot since then, specifically in areas such as medical
research, nuclear physics, and molecular studies. For example, the amount of
data that scientists download from satellites monitoring activity in the outer
layers of the atmosphere goes up to approximately 200 GB daily. Now you can
imagine the kind of processing power you would need to take in data recorded
over, say, a week (up to about 1,400 GB at that rate) and perform computations
on it. It has to be huge and powerful. This is one of the reasons scientists
demanded a system powerful enough, and with near-infinite storage, to easily
perform computation on the kind of data they accumulate. It is scenarios like
this that led to the need for the Computational Grid. The rest, as they say, is
history.
Grid architecture
Much like the electric grid from which the idea of the Computational Grid came,
the architecture is a layered one. Grid applications form the topmost layer;
these might be scientific, engineering, or commercial applications, or even web
portals. The next layer is that of the grid programming environments and tools.
This layer provides the libraries, runtime interfaces, compilers and, most
importantly, parallelization tools. Next comes a layer that is rather
vendor-specific in implementation, the Grid Middleware. This layer is in charge
of resource management, scheduling services, job submission, storage access,
and info services across the entire grid. The middleware can be further divided
into two sub-layers, which some treat as two separate layers. The first is the
User-level Middleware, which takes care of the first two of the tasks we
mentioned for middleware. The second is the Core Grid Middleware, which handles
the rest: job submission, storage access, and info services. Since the grid
uses the Internet as its communication, computation and in fact storage
infrastructure, and connects to clusters and grids across geographies, a
Security Layer becomes indispensable. Also referred to as the Security
Infrastructure, this layer provides authentication and secure communication.
The bottommost layer is the 'Grid Fabric', which is nothing but the existing
'network of networks' and its components: clusters running on various operating
systems, storage devices, databases and even specific devices such as sensors.
Grid Architecture (top layer first)
| Layer | What it covers |
| Grid application | Science, engineering, commercial applications, Web portals |
| Grid programming environments and tools | Languages, interfaces, libraries, compilers, parallelization tools |
| User-level middleware (resource aggregators) | Resource management and scheduling services |
| Core grid middleware | Job submission, storage access, info services, trading, accounting |
| Security infrastructure | Single sign-on, authentication, secure communication |
| Grid fabric | PCs, workstations, clusters, networks, software, databases, devices |
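To make the layering concrete, here is a minimal Python sketch that models the stack as an ordered list, top layer first. The layer names and services are taken from the table above; the data structure and the helper functions layer_for() and layers_below() are hypothetical and are not part of any real grid toolkit. Treating 'trading' and 'accounting' as two separate services is also an assumption about how to read the table.

```python
# Illustrative model of the layered grid architecture, top layer first.
# Names and services follow the article's table; the structure and the
# helper functions are hypothetical, not a real grid toolkit API.
GRID_LAYERS = [
    ("Grid application",
     ["science", "engineering", "commercial applications", "web portals"]),
    ("Grid programming environments and tools",
     ["languages", "interfaces", "libraries", "compilers",
      "parallelization tools"]),
    ("User-level middleware",
     ["resource management", "scheduling services"]),
    ("Core grid middleware",
     ["job submission", "storage access", "info services",
      "trading", "accounting"]),
    ("Security infrastructure",
     ["single sign-on", "authentication", "secure communication"]),
    ("Grid fabric",
     ["pcs", "workstations", "clusters", "networks", "software",
      "databases", "devices"]),
]


def layer_for(service):
    """Return the name of the layer that lists the given service, or None."""
    wanted = service.strip().lower()
    for name, services in GRID_LAYERS:
        if wanted in services:
            return name
    return None


def layers_below(layer_name):
    """Return the names of the layers beneath the given one, top to bottom."""
    names = [name for name, _ in GRID_LAYERS]
    return names[names.index(layer_name) + 1:]
```

For instance, layer_for("scheduling services") returns the user-level middleware, while layers_below("Security infrastructure") returns only the grid fabric, which reflects that everything above the fabric talks to it through the security layer.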
How it works
At the heart of the Grid is what we call the broker. We can describe the working
of the Grid at a rather abstract level as follows. Once a job is submitted to
the Grid, the broker discovers the resources that the user can access through
'Grid Information Servers.' It then negotiates with grid-enabled resources or
their 'Agents' using middleware services, maps the job to those resources (known
as scheduling in the Grid context), and then stages the data for processing and
the application to be run. This last step is referred to as 'Deployment' in the
Grid context. While the application runs, the broker monitors its progress and
takes care of changes in the Grid structure and resource failures. Finally, it
collects the results.
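The broker's workflow can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular grid middleware is implemented: the classes Resource, InformationServer and Broker, their fields, and the first-fit scheduling policy are all hypothetical, and the negotiation step is left out for brevity. A resource failure is simulated with a flag, and the broker reacts by rerunning the task on another healthy resource.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A grid-enabled resource (hypothetical model)."""
    name: str
    capacity: int          # number of tasks the scheduler may place here
    healthy: bool = True   # False simulates a resource failure

    def run(self, task):
        if not self.healthy:
            raise RuntimeError(f"{self.name} failed")
        return f"{task} done on {self.name}"


@dataclass
class InformationServer:
    """Stand-in for a Grid Information Server: user -> accessible resources."""
    access: dict = field(default_factory=dict)

    def discover(self, user):
        return list(self.access.get(user, []))


class Broker:
    def __init__(self, info_server):
        self.info_server = info_server

    def schedule(self, tasks, resources):
        """Map each task to the first resource with spare capacity."""
        remaining = {r.name: r.capacity for r in resources}
        plan = []
        for task in tasks:
            target = next((r for r in resources if remaining[r.name] > 0), None)
            if target is None:
                raise RuntimeError(f"not enough capacity for {task}")
            remaining[target.name] -= 1
            plan.append((task, target))
        return plan

    def submit(self, user, tasks):
        resources = self.info_server.discover(user)   # resource discovery
        plan = self.schedule(tasks, resources)        # scheduling
        results = {}
        for task, resource in plan:                   # deployment
            try:
                results[task] = resource.run(task)
            except RuntimeError:
                # Resource failure: rerun on any healthy resource.
                # For simplicity this fallback ignores capacity.
                backup = next((r for r in resources if r.healthy), None)
                if backup is None:
                    raise
                results[task] = backup.run(task)
        return results                                # result collection


# Usage: cluster-b is down, so the broker falls back to cluster-a.
cluster_a = Resource("cluster-a", capacity=2)
cluster_b = Resource("cluster-b", capacity=2, healthy=False)
server = InformationServer({"alice": [cluster_b, cluster_a]})
results = Broker(server).submit("alice", ["t1", "t2", "t3"])
```

A real broker would monitor jobs asynchronously and weigh cost, load and data locality when scheduling; the sketch only shows the order of the steps described above.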