Hot Technologies in Storage
Manu Priyam
Tuesday, July 03, 2007
Data Replication over WANs
Not only do you need a secure first copy, you also need an available second
copy of your data to restore the normal course of business in case disaster
strikes.
You never know the value of a backup until a disaster strikes and wipes out
the principal copy of your data. Many of us realized the value of investing in
backups at remote locations only after seeing the mishaps of recent times and
the devastation they caused to the unprepared. The traditional method of data
protection has been to back data up on tape and then physically move the tape to
a safe location, preferably far away from the main office. However, with the
proliferation of a wide range of disk-based automated backup systems, the stage
is set for technologies for data replication over WANs to mature. On one hand,
we are seeing advances in networking technologies: optical networks now support
storage networking protocols, flow control mechanisms, and efficient transport
capabilities, to name a few. On the other hand, in the storage world, data
de-duplication technologies are coming up that significantly reduce the volume
of data to be backed up.
Networking technologies
To send data to geographically dispersed storage, you need a resilient storage
networking infrastructure. For inter-datacenter data replication, you need a
network with low latency and minimal packet loss. The bandwidth of such a
network should also be scalable. With these requirements in mind, there are two
options for building a network to support data replication: Coarse/Dense
Wavelength Division Multiplexing (C/DWDM) and Synchronous Optical NETwork /
Synchronous Digital Hierarchy (SONET/SDH).
C/DWDM is a technology that maps data from different sources and protocols
onto a single optical fiber, with each signal carried on its own separate and
private light wavelength. It can be used to interconnect data centers via a
variety of storage protocols such as Fibre Channel, FICON, and ESCON. It has
been verified to support data replication over distances of up to several
hundred kilometers. C/DWDM provides bandwidth from one to several hundred
gigabits per second (Gbps).
SONET/SDH technology is based on Time Division Multiplexing (TDM). With this
technology, enterprise data centers can be interconnected over thousands of
kilometers for data replication and other storage-networking needs. Storage over
SONET/SDH is a reliable and readily available networking option.
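
To see why distance matters for latency, a rough back-of-the-envelope
calculation helps. The sketch below assumes that a signal travels through fiber
at roughly 200,000 km/s (about 5 microseconds per kilometer) and ignores
equipment and protocol delays. The function name and the sample distances are
illustrative only and do not come from any vendor specification.

```python
# Rough propagation delay over optical fiber.
# Assumption: signal speed in fiber is about 200,000 km/s
# (roughly two-thirds of the speed of light in a vacuum).
FIBER_SPEED_KM_PER_S = 200_000.0


def round_trip_ms(distance_km: float) -> float:
    """Return the round-trip propagation delay in milliseconds."""
    one_way_s = distance_km / FIBER_SPEED_KM_PER_S
    return 2 * one_way_s * 1000.0


if __name__ == "__main__":
    # Sample distances are hypothetical: metro, regional, long haul.
    for km in (100, 300, 3000):
        print(f"{km:>5} km: {round_trip_ms(km):.1f} ms round trip")
```

Under these assumptions, a write that must be acknowledged by a site 100 km
away waits at least 1 ms for propagation alone, and at 3,000 km at least 30 ms.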
Data de-duplication
Though data de-duplication technologies have been around for years, there is a
renewed focus on them now that they are being used by products in the
disk-based backup market. Data reduction makes disk a feasible medium for
long-term backup retention, at the same or lower cost than tape-based systems.
Moreover, data de-duplication addresses the issue of data replication for
disaster recovery. Because less data remains after de-duplication, the network
bandwidth required for replication drops significantly. This makes replication
possible even for smaller companies with lower budgets.
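
The effect on bandwidth can be estimated with simple arithmetic. The sketch
below is only an illustration: the backup size, link speed, and 20:1 reduction
ratio are hypothetical figures chosen for the example, and the calculation
ignores protocol overhead and link contention.

```python
def transfer_hours(data_gb: float, link_mbps: float,
                   reduction_ratio: float = 1.0) -> float:
    """Hours needed to send data_gb over a link after data reduction.

    reduction_ratio is a hypothetical figure: 20.0 means the data
    shrinks to one twentieth of its original size before it is sent.
    """
    if link_mbps <= 0 or reduction_ratio <= 0:
        raise ValueError("link_mbps and reduction_ratio must be positive")
    # Convert gigabytes to megabits using decimal units (1 GB = 8,000 Mb).
    megabits = data_gb * 8 * 1000 / reduction_ratio
    return megabits / link_mbps / 3600


if __name__ == "__main__":
    # Hypothetical case: a 500 GB backup over a 10 Mbps WAN link.
    print(f"no reduction:   {transfer_hours(500, 10):.1f} hours")
    print(f"20:1 reduction: {transfer_hours(500, 10, 20):.1f} hours")
```

With these assumed numbers, the unreduced backup needs about 111 hours on the
link, while the reduced one needs under 6 hours and fits in an overnight
window.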
There are two primary methods of data reduction found in disk-based backup
systems. One is byte-level delta data reduction, which compares versions of data
over time and stores only the differences at the byte level. The other is
block-level data de-duplication, in which the written data is read in blocks and
only the unique blocks are stored.
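
The block-level method described above can be sketched in a few lines. This is
a minimal illustration, not the way any particular product works: the
fixed 4 KB block size, the use of SHA-256 digests, and the names dedupe_blocks
and rebuild are all assumptions made for the example.

```python
import hashlib


def dedupe_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Returns (store, recipe): store maps a SHA-256 digest to the block bytes,
    recipe is the ordered list of digests needed to rebuild the data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep a block only the first time it is seen
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the stored unique blocks."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # Hypothetical data: three identical 4 KB blocks of "A" and one of "B".
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    store, recipe = dedupe_blocks(data)
    print(len(recipe), "blocks written,", len(store), "unique blocks stored")
    assert rebuild(store, recipe) == data
```

The digest table in this sketch is the kind of hash table that grows with the
number of unique blocks, which is why scaling it is a concern for block-level
systems.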
Byte-level delta data reduction outperforms block-level de-duplication in a
disk-based backup system, as it scales to larger amounts of data. It avoids the
hash table and restore fragmentation issues associated with block-level
de-duplication. It also processes the backup data after it has been written to
disk. Above all, it is content aware and optimized for your specific backup
application: it knows how each backup application operates, and it understands
file content and boundaries. All in all, this helps optimize the data reduction
process.
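
The idea of storing only byte-level differences between versions can be
sketched as follows. This is a simplified stand-in that uses Python's
general-purpose difflib module to find the matching regions; it is not content
aware and is not how any specific backup product computes its deltas. The names
make_delta and apply_delta and the sample data are hypothetical.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe new as copies from old plus the literal bytes that changed."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))      # reuse bytes old[i1:i2]
        elif j2 > j1:
            delta.append(("data", new[j1:j2]))  # keep only inserted or replaced bytes
        # A pure deletion adds nothing: the deleted bytes are simply not copied.
    return delta


def apply_delta(old: bytes, delta) -> bytes:
    """Rebuild the new version from the old version and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


if __name__ == "__main__":
    old = b"quarterly report: revenue 100"
    new = b"quarterly report: revenue 125"
    delta = make_delta(old, new)
    literal = sum(len(op[1]) for op in delta if op[0] == "data")
    print("literal bytes stored:", literal, "of", len(new))
    assert apply_delta(old, delta) == new
```

Only the changed bytes and a few copy instructions need to be kept for the
second version, so no per-block hash table is involved.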
Flavors of replication
There are several products available in the market for data replication over
WANs, and four flavors of replication to choose from. You can do it at the
application level, at the host level, in the storage arrays, or with a storage
networking appliance. The advantage of application-level replication is that
the application is fully aware of the replication, so DBAs have confidence in
data integrity. It supports both synchronous and asynchronous replication, and
it is not hardware dependent. The disadvantages are that application owners
themselves are responsible for recovery, that it is specific to a particular
application, and that it often does not protect application files.
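
The difference between synchronous and asynchronous replication mentioned above
can be illustrated with a toy model. In this sketch two in-memory lists stand in
for the primary and remote sites; the class and method names are hypothetical,
and a real implementation would involve network transfers, acknowledgments, and
error handling.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model: 'primary' and 'replica' are lists standing in for two sites."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = []
        self.replica = []
        self.pending = deque()  # writes not yet shipped to the replica

    def write(self, record: str) -> None:
        self.primary.append(record)
        if self.synchronous:
            # Synchronous: the write is not acknowledged until the replica has it.
            self.replica.append(record)
        else:
            # Asynchronous: acknowledge immediately and ship the write later.
            self.pending.append(record)

    def ship_pending(self) -> None:
        """Send queued writes to the replica in their original order."""
        while self.pending:
            self.replica.append(self.pending.popleft())

    def unreplicated(self) -> int:
        """Number of writes that would be lost if the primary failed now."""
        return len(self.primary) - len(self.replica)


if __name__ == "__main__":
    for mode in (True, False):
        vol = ReplicatedVolume(synchronous=mode)
        vol.write("txn-1")
        vol.write("txn-2")
        label = "synchronous" if mode else "asynchronous"
        print(label, "writes at risk:", vol.unreplicated())
```

In this model the synchronous volume never has writes at risk, at the price of
waiting for the remote site on every write, while the asynchronous volume
returns immediately but can lose whatever is still queued.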
With host-level replication, failover can be automated, whereas with
application-level replication the DBAs themselves need to bring the data up
after a failure. Host-level replication supports disparate hardware and can
facilitate many-to-one replication. On the downside, it depends on the OS, it
requires additional resources on the host, and the replication has to be
explicitly integrated with applications.
If you plan to replicate at the storage array level, be aware that it is
unlikely to support dissimilar hardware. Secondary copies are usable only with
point-in-time copies. It also requires integration with applications. You also
need to work on fabric extension, and there is the added complexity of keeping
everything in sync. On the other hand, array-level replication is agnostic to
applications and OS. It does not use any host resources, replicates all kinds of
data, and is easier to manage.
Although it is a little costlier, since a separate appliance and additional
hardware are required, replication in the fabric provides modern functionality
such as continuous data protection (CDP). This approach requires no array or
host resources. Understandably, it is also agnostic to applications and OS.
Besides, it is highly scalable.