The IOPS race is over

emc-f1-carInfrastructure has always been a tough place to compete in. Unlike applications, databases or middleware, infrastructure components are fairly easy to replace with another make and model, and thus the vendors try to show off their product as better than the one from the competition.

In case of storage subsystems, the important metrics has always been performance related and IOPS (I/O operations per second) in particular.

I remember a period when competitors of our high-end arrays (EMC Symmetrix, these days usually just called EMC VMAX) tried to artificially boost their benchmark numbers by limiting the data access pattern to only a few megabytes per front-end IO port. This caused their array to handle all I/O in the small memory buffer cache of each I/O port – and none of the I/O’s would really be handled by either central cache memory or backend disks. This way they could boost their IOPS numbers much higher than ours. Of course no real world application would ever only store a few megabytes of data so the numbers were pure bogus – but marketing wise it was an interesting move to say the least.

With the introduction of the first Sun based Exadata (the Exadata V2) late 2009, Oracle also jumped on the IOPS race and claimed a staggering one million IOPS. Awesome! So the gold standard was now 1 million IOPS, and the other players had to play along with the “mine’s bigger than yours” vendor contest.
Continue reading

The public transport company needs new buses

Future-British-Bus-1A public transport company in a city called Galactic City, needs to replace its aging city buses with new ones. It asks three bus vendors what they have to offer and if they can do a live test to see if their claims about performance and efficiency holds up.

The transport company uses the city buses to move people between different locations in the city. The average trip distance is about 2 km. The vendors all prepare their buses for the test. The buses are the latest and greatest, with the most efficient and powerful engines and state of the art technology.

Continue reading

Getting the most out of your server resources

hearseespeak

As an advocate on database virtualization, I often challenge customers to consider if they are using their resources in an optimal way.

And so I usually claim, often in front of a skeptical audience, that physically deployed servers hardly ever reach an average utilization of more than 20 per cent (thereby wasting over 80% of the expensive database licenses, maintenance and options).

Magic is really only the utilization of the entire spectrum of the senses. Humans have cut themselves off from their senses. Now they see only a tiny portion of the visible spectrum, hear only the loudest of sounds, their sense of smell is shockingly poor and they can only distinguish the sweetest and sourest of tastes.

– Michael Scott, The Alchemyst

About one in three times, someone in the audience objects and says that they achieve much better utilization than my stake-in-the-ground 20 percent number, and so use it as a reason (valid or not) for not having to virtualize their databases, for example, with VMware.

Continue reading

Starting an Oracle database on physical server using VMware VMDK volumes

By now, we all know Oracle is fully supported on VMware. Anyone telling you it’s not supported is either lying to you, or doesn’t know what he is talking about (I keep wondering what’s worse).

VMware support includes Oracle RAC (if it’s version 11.2.0.2.0 or above).  However, Oracle may request to reproduce problems on physically deployed systems in case they suspect the problem is related to the hypervisor. The support note says:

Oracle will only provide support for issues that either are known to occur on the native OS, or can be demonstrated not to be as a result of running on VMware.

In case that happens, I recommend to contact VMWare support first because they might be familiar with the issue or can escalate the problem quickly. VMware support will take full ownership of the problem. Still, I have met numerous customers who are afraid of having to reproduce issues quickly and reliably on physical in case the escalation policy does not help. We need to get out of the virtual world, into reality, without making any other changes.  How do we do that?

Continue reading

Why clone databases for firefighting

clonesAs more and more customers are moving their mission-critical Oracle database workloads to virtualized infrastructure, I often get asked how to deal with Oracle’s requirement to reproduce issues on a physical environment (especially if they use VMware as virtualization platform – as mentioned in Oracle Support Note # 249212.1).

In some cases, database engineers are still reluctant to move to VMware for that specific reason. But the discussion is not new – I remember a few years ago I was speaking in Vienna to a group of customers and partners from Eastern Europe, and these were the days we still had VMware ESX 3.5 as state-of-the-art virtualization platform. Performance was a bit limited (4 virtual CPUs max, some I/O overhead and memory limitations) but for smaller workloads it was stable enough for mission critical databases. So I discussed the “reproduce on physical in case of problems” issue and I stated that I never heared of any customer who really had to do this because of some issues. Immediately someone in the audience raised his hand and said, “well, I had to do that once!” – Duh, so far for my story…

Let’s say that very often I learn as much from my audience as (hopefully) the other way around ;-)

Later I heard of a few more occasions where customers actually were asked by Oracle support to “reproduce on physical” because of suspected problems with the VMware hypervisor. In all of the cases I am aware of, the root cause turned out to be elsewhere (Operating System or configuration) but having to create a copy in case of issues is a scary thought for many database administrators – as it could take a long time and if you have strict SLAs then this might bite back at you.

So what is my take on this?

Continue reading

The Zero Dataloss Myth

In previous posts I have focused on the technical side of running business applications (except my last post about the Joint Escalation Center). So let’s teleport to another level and have a look at business drivers.

What happens if you are an IT architect for an organization, and you ask your business people (your internal customers) how much data loss they can tolerate in case of a disaster? I bet the answer is always the same:

“zero!”

This relates to what is known in the industry as Recovery Point Objective (RPO).

Ask them how much downtime they can tolerate in case something bad happens. Again, the consistent answer:

“none!”

This is equivalent to Recovery Time Objective (RTO).

Now if you are in “Jukebox mode” (business asks, you provide, no questions asked) then you try to give them what they ask for (RPO = zero, RTO = zero). Which makes many IT vendors and communication service providers happy, because this means you have to run expensive clustering software, and synchronous data mirroring to a D/R site using pricey data connections.

If you are in “Consultative” mode, you try to figure out what the business really wants, not just what they ask for. And you wonder if their request is feasible at all, and if so, what the cost is of achieving these service levels.

Continue reading

The EMC Oracle Joint Escalation Center

Oracle & EMCEMC and Oracle have supported each others products since 1995 and both spent millions of dollars in making them work together. EMC actually became famous in the late nineties because of our “Guilty until proven innocent” support mentality. We are known for the first company to give meaning to the concept of “Remote Support / Phone Home”, and the success stories still go around that EMC field engineers sometimes surprised customers with a visit in order to repair components (mostly disk drives), often before they were broken, and if they were actually broken the customers would not even notice (needless to say that replacements were done online).

Continue reading

Oracle snapshots and clones with ZFS

Another Frequently Asked Question: Is there any disadvantage for a customer in using Oracle/SUN ZFS appliances to create database/application snapshots in comparison with EMC’s cloning/snapshot offerings?

Oracle marketing is pushing materials where they promote the ZFS storage appliance as the ultimate method for database cloning, especially when the source database is on Exadata. Essentially the idea is as follows: backup your primary DB to the ZFS appliance, then create snaps or clones off the backup for testing and development (more explanation in Oracle’s paper and video). Of course it is marketed as being much cheaper, easier and faster than using storage from an Enterprise Storage system such as those offered by EMC.

Oracle Youtube video

Oracle White paper

In order to understand the limitations of the ZFS appliance you need to know the fundamental workings of the ZFS filesystem. I recommend you look at the Wikipedia article on ZFS (here http://en.wikipedia.org/wiki/ZFS) and get familiar with its basic principles and features. The ZFS appliance is based on the same filesystem but due to it being an appliance, it’s a little bit different in behaviour.

So let’s see what a customer gets when he decides to go for the Sun appliance instead of EMC infrastructure (such as the Data Domain backup deduplication  system or VNX storage system).

Continue reading

Exadata Hybrid Columnar Compression (HCC) for (storage) dummies

Columnar Basalt Landscape

Although EMC and Oracle have been long-time partners, the Exadata Database Machine is the exception to the rule and competes with EMC products directly. So I find myself more and more in situations where EMC offerings are compared directly with Exadata features and functions. Note that Oracle offers more competing products, including some storage offerings such as the ZFS storage appliance and the Axiom storage systems, but so far I haven’t seen a lot of pressure from those (except when these are bundled with Exadata).
Recently I have visited customers who asked me questions on how EMC technology for databases compares with, in particular, Oracle’s Hybrid Columnar Compression (HCC) on Exadata. And some of my colleagues, being storage aliens and typically not database experts, have been asking me what this Hybrid Compression thing is in the first place.

Continue reading

Save money by virtualizing Oracle

Approved

I wrote an internal EMC memo on licensing issues with Oracle on VMware as I get a lot of questions on this topic. But I’d like to expand the question a bit. After all, my blog is named “Dirty Cache” which could also be substituted with “Dirty Cash” – and as said, my mission is to lower cost and drive up service levels for my customers…

Here my internal memo (slightly edited for the blog and updated with a few corrections). Again, I want to make it clear that these are my own opinions based on (limited) customer experiences, I might be completely wrong and that’s why my blog has a disclaimer ;-)

Use this information at your own risk – don’t shoot the messenger.

Original question:

How should we license Oracle database on VMware?

Beefed up question:

How can we save money on licensing and other expenses by virtualizing Oracle?

Continue reading