The Quick and Dirty Deduplication Analyzer

The best thing about being me… There are so many “me”s.

— Agent Smith, The Matrix Reloaded

One of our customers reported less than optimal space savings on XtremIO running Oracle. In order to test various scenarios with Oracle I was in search of a deduplication analysis method or tool – only to find out that there was nothing available that qualified.

TL;DR: QDDA is an Open Source tool I wrote to analyze Linux files, devices or data streams for duplicate blocks and compression estimates. It can quickly give you an idea of how much storage savings you could get using a modern All-Flash Array like XtremIO. It is safe to use on production systems and allows quick analysis of various test scenarios giving direct results, and even works with files/devices that are in use. No registration or uploading of your confidential data is required.
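To give a feel for what such an analysis involves, here is a minimal sketch of block-level deduplication and compression estimation. It is not QDDA's actual implementation: the function name `analyze_blocks`, the 8 KiB block size, and the use of SHA-256 and zlib are all assumptions chosen for illustration, and a real array may use a different block size, hash and compression algorithm.

```python
import hashlib
import zlib


def analyze_blocks(data: bytes, block_size: int = 8192) -> dict:
    """Estimate dedupe and compression savings for a byte buffer.

    Hypothetical helper: block size, hash and compressor are assumptions,
    not the algorithms used by QDDA or by any particular array.
    """
    seen = set()          # digests of blocks encountered so far
    total_blocks = 0
    compressed_bytes = 0  # compressed size of the unique blocks only

    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        total_blocks += 1
        digest = hashlib.sha256(block).digest()
        if digest not in seen:
            # First time this block content is seen: it must be stored.
            seen.add(digest)
            compressed_bytes += len(zlib.compress(block))

    unique_blocks = len(seen)
    dedup_ratio = total_blocks / unique_blocks if unique_blocks else 0.0
    return {
        "total_blocks": total_blocks,
        "unique_blocks": unique_blocks,
        "dedup_ratio": dedup_ratio,
        "compressed_bytes": compressed_bytes,
    }


if __name__ == "__main__":
    # Three identical blocks plus one different block.
    sample = b"A" * 8192 * 3 + bytes(range(256)) * 32
    print(analyze_blocks(sample))
```

Reading a file or device in fixed-size chunks and feeding each chunk through the same loop would extend this to data that does not fit in memory.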


The IOPS race is over

Infrastructure has always been a tough place to compete in. Unlike applications, databases or middleware, infrastructure components are fairly easy to replace with another make and model, and thus the vendors try to show off their product as better than the one from the competition.

In the case of storage subsystems, the important metrics have always been performance related, and IOPS (I/O operations per second) in particular.

I remember a period when competitors of our high-end arrays (EMC Symmetrix, these days usually just called EMC VMAX) tried to artificially boost their benchmark numbers by limiting the data access pattern to only a few megabytes per front-end I/O port. This caused their array to handle all I/O in the small memory buffer cache of each I/O port – and none of the I/Os would really be handled by either central cache memory or backend disks. This way they could boost their IOPS numbers much higher than ours. Of course no real-world application would ever store only a few megabytes of data, so the numbers were pure bogus – but marketing-wise it was an interesting move, to say the least.

With the introduction of the first Sun based Exadata (the Exadata V2) late 2009, Oracle also jumped on the IOPS race and claimed a staggering one million IOPS. Awesome! So the gold standard was now 1 million IOPS, and the other players had to play along with the “mine’s bigger than yours” vendor contest.

Baking a cake: trading CPU for IO?

Sometimes I hear people claim that by using faster storage, you can save on database licenses. True or false?

The idea is that many database servers are suffering from IO wait – which actually means that the processors are waiting for data to be transferred to or from storage – and in the meantime, no useful work can be done. Given the expensive licenses that are needed for running commercial database software, usually licensed per CPU core, this then leads to loss of efficiency.
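The reasoning behind that claim can be written down as simple arithmetic. The sketch below is only an illustration of the claim as stated, under the simplifying assumption that every core is idle for the same fraction of time while waiting on I/O; the function name `effective_cores` and the example numbers are hypothetical.

```python
def effective_cores(licensed_cores: int, iowait_fraction: float) -> float:
    """Cores' worth of useful work left after I/O wait.

    Assumption for illustration: time spent in I/O wait is completely
    lost, and it is spread evenly over all licensed cores.
    """
    if not 0.0 <= iowait_fraction <= 1.0:
        raise ValueError("iowait_fraction must be between 0 and 1")
    return licensed_cores * (1.0 - iowait_fraction)


if __name__ == "__main__":
    # Hypothetical server: 16 licensed cores, 25% of the time in I/O wait.
    print(effective_cores(16, 0.25))
```

Under these assumptions, a 16-core server that spends a quarter of its time in I/O wait delivers the work of 12 cores while being licensed for 16.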

Let’s see if we can visualise the problem here with an everyday example – baking a cake.


Oracle ASM vs ZFS on XtremIO


In my previous post on ZFS I showed how ZFS causes fragmentation for Oracle database files. At the end I promised (sort of) to come back to the topic of how this affects database performance. In the meantime I have been busy with many other things, but ZFS issues still sneak up on me frequently. Eventually, I was forced to take another look at this because two separate customers asked for ZFS comparisons against ASM at the same time.

The account team for one of the two customers asked if I could perform some testing on their lab environment to show the performance difference between Oracle on ASM and on ZFS. As things happen in this business, things were already rolling before I could influence the prerequisites and the suggested test method. Promises were already made to the customer and I was asked to produce results yesterday.

Without knowledge on the lab environment, customer requirements or even details on the test environment they had set up. Typical day at the office.

In addition to that, ZFS requires a supported host OS – so Linux is out of the question (the status on kernel ZFS for Linux is still a bit unclear and certainly it would not be supported with Oracle). I had been using FreeBSD in my post on fragmentation – because that was my platform of choice at that point (my Solaris skills are, at best, rusty). Of course Oracle on FreeBSD is a no-go so back then, I used NFS to run the database on Linux and ZFS on BSD. Which implicitly solves some of the potential issues whilst creating some new ones, but alas.

Solaris x86

This time the idea was to run Oracle on Solaris (x86) that had both ZFS and ASM configured. How to perform a reasonable comparison that also shows the different behavior was unclear and when asking that question to the account team, the conference call line stayed surprisingly silent. All that they indicated up front is that the test tool on Oracle should be SLOB.


Getting the Best Oracle performance on XtremIO

(Blog repost from Virtual Storage Zone – Thanks to @cincystorage)

UPDATE: I’ll say it again because there seems to be some confusion: THIS IS A REPOST!

Original content is from the Virtual Storage Zone blog (not mine). Just reposted here because it’s interesting and related to Oracle, performance and EMC storage. Enjoy…

XtremIO is EMC’s all-flash scale-out storage array designed to deliver the full performance of flash. The array is designed for 4k random I/O, low latency, inline data reduction, and even distribution of data blocks. This even distribution of data blocks leads to maximum performance and minimal flash wear. You can find all sorts of information on the architecture of the array, but I haven’t seen much about achieving maximum performance from an Oracle database on XtremIO.

The nature of XtremIO ensures that any Oracle workload (OLTP, DSS, or Hybrid) will have high performance and low latency; however, we can maximize performance with some configuration options. Most of what I’ll be talking about is around RAC and ASM on Red Hat Linux 6.x in a Fibre Channel Storage Area Network.

Read the full blogpost here.


The public transport company needs new buses

A public transport company in a city called Galactic City needs to replace its aging city buses with new ones. It asks three bus vendors what they have to offer and if they can do a live test to see if their claims about performance and efficiency hold up.

The transport company uses the city buses to move people between different locations in the city. The average trip distance is about 2 km. The vendors all prepare their buses for the test. The buses are the latest and greatest, with the most efficient and powerful engines and state of the art technology.


Getting the most out of your server resources


As an advocate on database virtualization, I often challenge customers to consider if they are using their resources in an optimal way.

And so I usually claim, often in front of a skeptical audience, that physically deployed servers hardly ever reach an average utilization of more than 20 per cent (thereby wasting over 80% of the expensive database licenses, maintenance and options).

Magic is really only the utilization of the entire spectrum of the senses. Humans have cut themselves off from their senses. Now they see only a tiny portion of the visible spectrum, hear only the loudest of sounds, their sense of smell is shockingly poor and they can only distinguish the sweetest and sourest of tastes.

– Michael Scott, The Alchemyst

About one in three times, someone in the audience objects and says that they achieve much better utilization than my stake-in-the-ground 20 percent number, and so use it as a reason (valid or not) for not having to virtualize their databases, for example, with VMware.


Announcing my Openworld 2013 presentation material

Last Tuesday I had the privilege to present at Oracle Openworld 2013 together with Sam Marraccini (the guy with the big smile here in the pic) from EMC’s Flash products division. Sam introduced the various EMC Flash offerings we have, and I discussed some experiences and best practices from the field. We really got lots of interaction with the audience, and many questions (at one point I was looking at about 5 hands raised simultaneously) which caused me to run out of time finishing some of the best practices I planned to discuss at the end. But interaction is always better than just us talking so I got the feeling the session was successful – although I’d like to hear from people in the audience what their thoughts are (feel free to comment!)

When people started to take snapshots of the slides with their iPhones, we promised the audience to make the slides available ASAP. So here they are. They will probably also be available via Oracle’s OOW pages in time.

Linux Disk Alignment Reloaded

My all-time high post with the most pageviews is the one on Linux disk alignment: How to set disk alignment in Linux. In that post I showed an easy method on how to set and check disk alignment under Linux.
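As a rough illustration of what checking alignment means, the sketch below tests whether a partition's starting byte offset is a multiple of an alignment boundary. This is not the method from the original post: the function name `is_aligned`, the 512-byte sector size and the 1 MiB boundary are assumptions, and the right boundary depends on the storage system in use.

```python
def is_aligned(start_sector: int,
               sector_size: int = 512,
               boundary: int = 1024 * 1024) -> bool:
    """Return True if the partition start falls on the alignment boundary.

    Assumptions for illustration: 512-byte sectors and a 1 MiB boundary.
    """
    return (start_sector * sector_size) % boundary == 0


if __name__ == "__main__":
    # Sector 63 was the traditional DOS-style default start sector;
    # sector 2048 is the default used by current partitioning tools.
    for sector in (63, 2048):
        print(sector, is_aligned(sector))
```

The start sector of an existing partition can be read from the partition table listing of tools such as fdisk or parted.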

Oracle Exadata X3 Database In-Memory Machine: Timely Thoughtful Thoughts For The Thinking Technologist – Part I

Awesome post by Kevin! Recommended read if you are interested in Oracle Exadata.

Kevin Closson's Blog: Platforms, Databases and Storage

Oracle Exadata X3 Database In-Memory Machine – An Introduction
On October 1, 2012 Oracle issued a press release announcing the Oracle Exadata X3 Database In-Memory Machine. Well-chosen words, Oracle marketing, surgical indeed.

Words matter.
Games Are Games–Including Word Games
Oracle didn’t issue a press release about Exadata “In-Memory Database.” No, not “In-Memory Database” but “Database In-Memory” and the distinction is quite important. I gave some thought to that press release and then searched Google for what is known about Oracle and “in-memory” database technology. Here is what Google offered me:

Note: a right-click on the following photos will enlarge them.


With the exception of the paid search result about VoltDB, all of the links Google offered take one to information about Oracle’s TimesTen In-Memory Database, which is a true “in-memory” database. But this isn’t a blog post about semantics. No, not at all. Please read on.

Seemingly Silly…
