Big Ideas; Big Tech: Continuous Operations for Oracle RAC with EMC VPLEX

Here’s an EMC video on YouTube about Oracle RAC with EMC VPLEX. Very nice, check it out!

http://www.youtube.com/watch?v=DRtl6dU2P_E

Why clone databases for firefighting

As more and more customers are moving their mission-critical Oracle database workloads to virtualized infrastructure, I often get asked how to deal with Oracle’s requirement to reproduce issues on a physical environment (especially if they use VMware as virtualization platform – as mentioned in Oracle Support Note # 249212.1).

In some cases, database engineers are still reluctant to move to VMware for that specific reason. But the discussion is not new – I remember a few years ago I was speaking in Vienna to a group of customers and partners from Eastern Europe, and these were the days we still had VMware ESX 3.5 as state-of-the-art virtualization platform. Performance was a bit limited (4 virtual CPUs max, some I/O overhead and memory limitations) but for smaller workloads it was stable enough for mission-critical databases. So I discussed the “reproduce on physical in case of problems” issue and I stated that I had never heard of any customer who really had to do this. Immediately someone in the audience raised his hand and said, “well, I had to do that once!” – Duh, so much for my story…

Let’s say that very often I learn as much from my audience as (hopefully) the other way around ;-)

Later I heard of a few more occasions where customers actually were asked by Oracle support to “reproduce on physical” because of suspected problems with the VMware hypervisor. In all of the cases I am aware of, the root cause turned out to be elsewhere (Operating System or configuration) but having to create a copy in case of issues is a scary thought for many database administrators – as it could take a long time and if you have strict SLAs then this might bite back at you.

So what is my take on this?

Read more of this post

The Zero Dataloss Myth

In previous posts I have focused on the technical side of running business applications (except my last post about the Joint Escalation Center). So let’s teleport to another level and have a look at business drivers.

What happens if you are an IT architect for an organization, and you ask your business people (your internal customers) how much data loss they can tolerate in case of a disaster? I bet the answer is always the same:

“zero!”

This relates to what is known in the industry as Recovery Point Objective (RPO).

Ask them how much downtime they can tolerate in case something bad happens. Again, the consistent answer:

“none!”

This is equivalent to Recovery Time Objective (RTO).

Now if you are in “Jukebox mode” (business asks, you provide, no questions asked) then you try to give them what they ask for (RPO = zero, RTO = zero). Which makes many IT vendors and communication service providers happy, because this means you have to run expensive clustering software, and synchronous data mirroring to a D/R site using pricey data connections.

If you are in “Consultative” mode, you try to figure out what the business really wants, not just what they ask for. And you wonder if their request is feasible at all, and if so, what the cost is of achieving these service levels.

Read more of this post

Thank you, Larry Ellison

My colleague Vince Westin published this great post on his blog:

During his opening keynote at Oracle OpenWorld 2012, Larry Ellison launched the new Exadata X3. The new version appears to have some nice new capabilities, including caching writes to EFD, which are likely to improve the usability of Exadata for OLTP workloads. And he was nice enough to include the EMC Symmetrix VMAX 40K in detail on 30% of his slides as he announced the new Exadata. And for that, I give thanks. I am sure that Salesforce.com were similarly thankful when Larry focused so much of his time on their product in his keynote last year.

Read the rest of his post here.

The post provides a bunch of good reasons why EMC VMAX might be a better choice for customers that run high-performance mission-critical environments. A highly recommended read!

Oracle snapshots and clones with ZFS

Another Frequently Asked Question: Is there any disadvantage for a customer in using Oracle/SUN ZFS appliances to create database/application snapshots in comparison with EMC’s cloning/snapshot offerings?

Oracle marketing is pushing materials where they promote the ZFS storage appliance as the ultimate method for database cloning, especially when the source database is on Exadata. Essentially the idea is as follows: back up your primary DB to the ZFS appliance, then create snaps or clones off the backup for testing and development (more explanation in Oracle’s paper and video). Of course it is marketed as being much cheaper, easier and faster than using storage from an Enterprise Storage system such as those offered by EMC.

Oracle YouTube video

Oracle White paper

In order to understand the limitations of the ZFS appliance you need to know the fundamental workings of the ZFS filesystem. I recommend you look at the Wikipedia article on ZFS (here http://en.wikipedia.org/wiki/ZFS) and get familiar with its basic principles and features. The ZFS appliance is based on the same filesystem but due to it being an appliance, it’s a little bit different in behaviour.

So let’s see what a customer gets when he decides to go for the Sun appliance instead of EMC infrastructure (such as the Data Domain backup deduplication system or VNX storage system).

Read more of this post

Exadata Hybrid Columnar Compression (HCC) for (storage) dummies

Although EMC and Oracle have been long-time partners, the Exadata Database Machine is the exception to the rule and competes with EMC products directly. So I find myself more and more in situations where EMC offerings are compared directly with Exadata features and functions. Note that Oracle offers more competing products, including some storage offerings such as the ZFS storage appliance and the Axiom storage systems, but so far I haven’t seen a lot of pressure from those (except when these are bundled with Exadata).

Recently I have visited customers who asked me questions on how EMC technology for databases compares with, in particular, Oracle’s Hybrid Columnar Compression (HCC) on Exadata. And some of my colleagues, being storage aliens and typically not database experts, have been asking me what this Hybrid Compression thing is in the first place.

Read more of this post

Oracle RAC on VPLEX now certified

Last week EMC announced that Oracle RAC on VPLEX stretched clusters is now officially supported and certified by Oracle!

News Summary:

 

  • Oracle has certified that EMC® VPLEX™ METRO in a stretch cluster configuration can provide Oracle Real Application Clusters (Oracle RAC) customers with an easy-to-deploy, active/active solution, as they transform from single- to dual-site environments.
  • Having passed Oracle’s rigorous testing standards, the EMC VPLEX METRO solution can enable Oracle RAC to be easily configured over extended distances while enabling simultaneous access to the same data at both locations.

This is the final step in a process to help customers that have been asking for true active/active support over distance for their mission-critical Oracle Database business processes.

For those who are not yet familiar with this solution, here is a small summary:

  • Customers have been in search of ways to survive datacenter failures (i.e. “disasters”) without the need to recover and restart the databases, in such a way that any component failure or even complete site failure would not lead to database downtime
  • This was not possible before except when deploying complex configurations based on host mirroring using Oracle ASM or a 3rd party volume manager (note that competing storage virtualization products from other storage vendors also do not offer this full capability – even though their marketing might make it seem so)
  • EMC VPLEX offers this functionality, which is now completely certified and supported by Oracle, and the solution avoids risk by making the stretched cluster deployment as easy as a basic Oracle RAC install
  • The VPLEX solution offers additional benefits, including better performance, better recovery from issues such as component or link failures, and a complete solution for the whole application stack, not just Oracle
  • Note that AFAIK this solution should also work for IBM DB2 (but I haven’t confirmed)

The full news release can be found here: http://www.emc.com/about/news/press/2012/20120517-04.htm

A full series of blog posts on this solution can be found here: https://bartsjerps.wordpress.com/category/vplex/

The VPLEX witness (the final component of VPLEX that made this possible) was announced last year at EMC World 2011. Typically we see the start of market adoption between 1 and 1.5 years after bringing new technology to market. I am working with a few customers myself who are on the verge of starting a project with this, so hopefully by the end of the year we will have a set of good customer references!

Update: The new white paper can be found here: http://www.emc.com/collateral/software/white-papers/h8930-vplex-metro-oracle-rac-wp.pdf

Update 2: VPLEX support mentioned (briefly) on Oracle’s website: http://www.oracle.com/technetwork/database/enterprise-edition/tech-generic-linux-new-086754.html

Update 3: Demos available on EMC Demo Center:

EMC VPLEX Metro for Oracle RAC Solution Overview
Oracle RAC with VPLEX Metro Site Failure
Oracle RAC with VPLEX Metro Solution Overview
Oracle RAC with VPLEX Storage Failure

If you’re a frequent reader of my blog you might recognize familiar pictures there ;)

Data Guard protecting from EMC block corruptions?

Today I was giving training to fellow EMC colleagues on some Oracle fundamentals. One of the things that was mentioned is something I have heard several times before: Oracle is claiming that EMC SRDF (a data mirroring function of EMC Symmetrix enterprise storage systems, used mainly to provide enterprise disaster recovery) cannot detect certain types of data corruption that Oracle Data Guard can. Ouch. The trouble with this statement is that it is half-true (and half-truths are the most dangerous).

Read more of this post

Oracle and Data Integrity: Data in, Garbage Out?

A trivial question:

What is the basic function of a storage system?
I would say the trivial function of a storage system is to store digital data and get it back when you need it.

To be specific:
get the data back exactly the way you stored it.

You would probably say “Duh, of course!”

A storage system (as simple as a hard disk or as sophisticated as an EMC VMAX) is supposed to store data and give it back unmodified. But recent research shows that simple disk drives are not as reliable as you might think. Plenty of material is available that explains why and how often disk drives fail to return the correct information, often without reporting any error, as if the corrupted data were perfectly valid. See below for more references to this issue.
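To make the idea of "silent" corruption a bit more concrete, here is a minimal Python sketch. It is purely illustrative and not how any EMC or Oracle product implements integrity checking: the function names (`store`, `verify`, `flip_bit`) are made up for this example, and SHA-256 is simply a convenient checksum from the standard library. The point is only that a single flipped bit comes back looking like perfectly valid data unless something recorded at write time is compared again at read time.

```python
import hashlib
from typing import Tuple


def store(block: bytes) -> Tuple[bytes, str]:
    """Return the block plus a SHA-256 digest computed at write time."""
    return block, hashlib.sha256(block).hexdigest()


def verify(block: bytes, digest: str) -> bool:
    """True if the block read back still matches the digest from write time."""
    return hashlib.sha256(block).hexdigest() == digest


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Simulate silent corruption by flipping one bit (hypothetical helper)."""
    corrupted = bytearray(block)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


original = b"customer balance: 1000.00"
data, digest = store(original)

# The "disk" returns the damaged block without raising any error.
damaged = flip_bit(data, 3)

print("clean read verifies:  ", verify(data, digest))
print("damaged read verifies:", verify(damaged, digest))
```

Without the stored digest, nothing in the damaged block itself tells you it is wrong; it has the same length and is returned without any error.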

Read more of this post

POC: Piece Of Cake or Point Of Contradiction?

Every now and then I get involved in customer Proofs of Concept. A Proof of Concept (POC) is, according to Wikipedia, something like a demonstration of the feasibility of a certain idea, concept or theory.

Read more of this post