The Zero Dataloss Myth

In previous posts I have focused on the technical side of running business applications (except my last post about the Joint Escalation Center). So let’s teleport to another level and have a look at business drivers.

What happens if you are an IT architect for an organization, and you ask your business people (your internal customers) how much data loss they can tolerate in case of a disaster? I bet the answer is always the same:

“zero!”

This relates to what is known in the industry as Recovery Point Objective (RPO).

Ask them how much downtime they can tolerate in case something bad happens. Again, the consistent answer:

“none!”

This is equivalent to Recovery Time Objective (RTO).

Now if you are in “Jukebox mode” (business asks, you provide, no questions asked) then you try to give them what they ask for (RPO = zero, RTO = zero). That makes many IT vendors and communication service providers happy, because it means you have to run expensive clustering software and synchronous data mirroring to a D/R site over pricey data connections.

If you are in “Consultative” mode, you try to figure out what the business really wants, not just what they ask for. And you wonder if their request is feasible at all, and if so, what the cost is of achieving these service levels.


Data Guard protecting from EMC block corruptions?

Today I was giving a training session to EMC colleagues on some Oracle fundamentals. One of the things that came up is something I have heard several times before: Oracle claims that EMC SRDF (a data mirroring function of EMC Symmetrix enterprise storage systems, used mainly for enterprise disaster recovery) cannot detect certain types of data corruption that Oracle Data Guard can. Ouch. The trouble with this statement is that it is half true (and half-truths are the most dangerous kind).

Oracle and Data Integrity: Data in, Garbage Out?

Stop Corruption

A trivial question:

What is the basic function of a storage system?
I would say the basic function of a storage system is to store digital data and get it back when you need it.

To be specific:
get the data back exactly the way you stored it.

You would probably say “Duh, of course!”

A storage system (as simple as a hard disk or as sophisticated as an EMC VMAX) is supposed to store data and give it back unmodified. But recent research shows that simple disk drives are not as reliable as you might think. Plenty of material is available that explains why and how often disk drives fail to return the correct information, often without reporting any error, as if the corrupted data were perfectly valid. See below for more references to this issue.
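To make the idea of silent corruption concrete, here is a minimal sketch (not taken from the original post) of how a checksum stored alongside a block can reveal that the data returned differs from the data written, even when the device reports no error. The function names, the sample block and the choice of SHA-256 are assumptions made purely for illustration; real storage systems and databases use their own block checksum formats.

```python
import hashlib
from typing import Tuple


def store(block: bytes) -> Tuple[bytes, str]:
    """Return the block together with a checksum computed at write time."""
    return block, hashlib.sha256(block).hexdigest()


def verify(block: bytes, checksum: str) -> bool:
    """Recompute the checksum at read time and compare it to the stored one."""
    return hashlib.sha256(block).hexdigest() == checksum


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Simulate silent corruption: flip one bit without raising any error."""
    damaged = bytearray(block)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


# Hypothetical block contents, used only for this illustration.
original = b"ACCOUNT=1042;BALANCE=00012500"
stored, checksum = store(original)
corrupted = flip_bit(stored, 5)

print(verify(stored, checksum))     # intact block passes the check
print(verify(corrupted, checksum))  # damaged block fails the check
```

Without the stored checksum, the corrupted block would look like any other valid read: same length, no I/O error, just different content.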


Data Guard or Storage based replication?

A comparison between Oracle (Active) Data Guard and EMC replication for disaster recovery purposes

Panic Button
This is an article I wrote a while ago for customers’ Database Administrators (DBAs) and application managers, to help them select the right Disaster Recovery tools for their business applications.
I have modified it slightly to reflect new insights and to make it more readable on the web.


Eliminate Hot Backup with EMC consistency technology

For many years, EMC customers have been using storage replication technology to create copies of entire databases. Storage cloning has many advantages over other mechanisms (file copy, tape restore, and the like). The most significant is that EMC storage can create near-instant copies of large applications without significant performance overhead. The reason is that the storage system uses its huge internal bandwidth and a couple of smart tricks to create the copy, thereby bypassing the host I/O layer.

Cloning

In other words, a server running a database does not have to move a single bit of data to create a copy of a multi-terabyte database.


Through the wormhole with Stretched Clusters

Last year, EMC announced a new virtualization product called VPLEX. VPLEX allows logical storage volumes to be accessible from multiple locations. It boldly goes beyond existing storage virtualization solutions (including those from EMC) in that it is not just a storage virtualization cluster but rather a storage federation platform. It allows one virtualized storage volume, built from one or more physical storage volumes, to be dynamically accessible from multiple locations, as if they were connected through a wormhole.

Wormhole in space

Stretched Clusters – Alien storage

In my previous posts I described how Oracle ASM can be used to build stretched clusters. I also pointed to some limitations of that scenario. But I am by no means the first to do so, and some of EMC’s competitors have attempted to build products, features and solutions to overcome some of the limitations of host mirroring.

A while ago, some guys I met from an EMC partner confronted me with the question of why EMC, the market leader in external storage and a premium Oracle technology partner, had not offered a solution for these limitations. They pointed to a number of products from competitors that (allegedly) solved the problem already. They also pointed to the architectural simplicity of these solutions.

Alien Storage

At that time I had no good answer (which does not happen to me very often). I was not aware of how these products worked, so I asked some questions about that. Around the same time I was also confronted by our enterprise customers, who started demanding an EMC solution for stretched clustering, so I started digging. Could it be that EMC had been overtaken by some of these alien storage start-up companies in continuously available storage solutions? It seemed to be the case.

Limitations of host-based mirroring for stretched clusters

For data mirroring, EMC SRDF is sometimes set up so that both servers write to one location only (the “far” server writes across dark fibre links to the local storage). EMC has similar tools (MirrorView, RecoverPoint, etc.) for storage platforms other than Symmetrix.

Figure: SRDF cluster with passive target


How does Oracle keep data consistent on filesystems?

People sometimes question how EMC can provide consistent snapshots on storage systems, given that file systems such as ext3 and UFS may keep written data in the file system (server) cache.