Announcing qdda 2.0

It’s almost a year since I blogged about qdda (the Quick & Dirty Dedupe Analyzer).

qdda is a tool that lets you scan any Linux disk or file (or multiple disks) and predicts the potential thin, dedupe and compression savings if you moved that disk/file to an All Flash array like DellEMC XtremIO or VMAX All-flash. In contrast to similar (usually vendor-based) tools, qdda can run completely independently. It does NOT require registration or sending a binary dataset back to the mothership (which would be a security risk). Anyone can inspect the source code and run it, so there are no hidden secrets.

It’s based upon the most widely deployed database engine, SQLite, and uses MD5 hashing and LZ4 compression to produce data reduction estimates.
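To illustrate the general idea (this is not qdda's actual code), here is a minimal Python sketch of hash-based data reduction estimation. Several things in it are assumptions made for the example: the 16 KiB block size, the function name `estimate_reduction`, the in-memory dict standing in for the SQLite database, and `zlib` standing in for LZ4 (which is not in the Python standard library).

```python
import hashlib
import zlib

BLOCK_SIZE = 16 * 1024  # assumed block size, for illustration only


def estimate_reduction(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Estimate thin, dedupe and compression savings for a byte buffer."""
    zero_block = bytes(block_size)
    unique = {}  # MD5 digest -> compressed size of the first block seen
    total = zero_blocks = 0
    for offset in range(0, len(data), block_size):
        # Pad a short trailing block with zeroes to a full block
        block = data[offset:offset + block_size].ljust(block_size, b"\0")
        total += 1
        if block == zero_block:
            zero_blocks += 1  # all-zero block: counted as thin savings
            continue
        digest = hashlib.md5(block).digest()
        if digest not in unique:
            # zlib is a stand-in here; the tool described above uses LZ4
            unique[digest] = len(zlib.compress(block))
    used = total - zero_blocks
    return {
        "total_blocks": total,
        "zero_blocks": zero_blocks,
        "used_blocks": used,
        "unique_blocks": len(unique),
        "dedupe_ratio": used / len(unique) if unique else 0.0,
        "compressed_bytes": sum(unique.values()),
    }


if __name__ == "__main__":
    sample = b"a" * BLOCK_SIZE * 4 + b"b" * BLOCK_SIZE * 2 + bytes(BLOCK_SIZE * 2)
    print(estimate_reduction(sample))
```

In the sample, four copies of one block, two copies of another and two zero blocks give six used blocks, two unique blocks and a dedupe ratio of 3.0.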

It took a while to follow up because I spent many evening hours almost completely rewriting the tool. A summary of changes:

  • Run completely as non-privileged user (i.e. ‘nobody’) to make it safe to run on production systems
  • Increased the hash to 60 bits so it scales to at least 80 Terabyte without compromising accuracy
  • Decreased the database space consumption by 50%
  • Multithreading so there are separate readers, workers and a single database updater which allows qdda to use multiple CPU cores
  • Many other large performance improvements (qdda has been shown to scan data at about 7GB/s on a fast server; the bottleneck was I/O, and in theory it could handle double that bandwidth before maxing out on database updates)
  • Very detailed embedded man page (manual). The qdda executable itself can show its own man page (on Linux with ‘man’ installed)
  • Improved standard reports and detailed reports with compression and dedupe histograms
  • Option to define your own custom array definitions
  • Removed system dependencies (SQLite, LZ4, and other libraries) so that qdda runs on almost any Linux system and can be downloaded as a single executable (no more need to install RPM packages)
  • Many other small improvements and additions
  • Completely moved to github – where you can also download the binary
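The claim in the list above that a 60-bit hash scales to 80 Terabyte can be sanity-checked with the birthday approximation. The sketch below is a back-of-the-envelope calculation, not taken from qdda itself: it assumes a 16 KiB block size, decimal terabytes and uniformly distributed hash values.

```python
HASH_BITS = 60
BLOCK_SIZE = 16 * 1024    # assumed block size in bytes
DATA_SIZE = 80 * 10**12   # 80 TB (decimal), from the list above


def expected_collisions(n_blocks: int, hash_bits: int) -> float:
    """Birthday approximation: expected colliding pairs among n random hashes."""
    return n_blocks * (n_blocks - 1) / (2 * 2**hash_bits)


n_blocks = DATA_SIZE // BLOCK_SIZE
collisions = expected_collisions(n_blocks, HASH_BITS)

print(f"blocks scanned:           {n_blocks:,}")
print(f"expected colliding pairs: {collisions:.1f}")
print(f"fraction of all blocks:   {collisions / n_blocks:.2e}")
```

Under these assumptions the result is roughly ten colliding pairs among nearly five billion blocks, far too few to move a dedupe ratio noticeably.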

Read the overview and animated demo on the project homepage here: https://github.com/outrunnl/qdda

HTML version of the detailed manual page: https://github.com/outrunnl/qdda/blob/master/doc/qdda.md

As qdda is licensed under the GPL, it offers no guarantee on anything. My recommendation is to use it for learning purposes or for a first what-if analysis; if you’re interested in data reduction numbers from the vendor, ask them for a formal analysis using their own tools. That said, I did a few comparison tests and the data reduction numbers were within 1% of the results from vendor-supported tools. The manpage has a section on accuracy explaining the differences.


Silly Little Oracle Benchmark – RPM edition

A while ago Kevin Closson announced a new release of the well-known SLOB kit.

SLOB is a simple but powerful toolkit that drives lots and lots of IO on a real Oracle database (so for performance testing of database platforms, it’s much better than synthetic IO tests).

A previous version was bundled with Outrun but required the entire Outrun distribution to work properly. With the new 2.3 version I created an RPM package that can be installed separately on any Enterprise Linux 6.x (64 bit) server.

The wiki page (including instructions) can be found here: SLOB RPM Package wiki

Thanks to Kevin for granting permission to redistribute this awesome toolkit!


Introducing Outrun for Oracle

Overview

If you want to get your hands dirty with Oracle database, the first thing you have to do is build a system that actually runs Oracle database. Unless you have done that several times before, chances are that this will take considerable time spent on trial-and-error, several reinstalls, fixing install problems and dependencies and so on. The time it takes for someone who is reasonably experienced on Linux, but has no prior Oracle knowledge, would probably range from a full working day (8 hours, best case) to many days. I have also witnessed people actually giving up.

Even for experienced users, doing the whole process manually over and over again is very time consuming, and deploying five or more systems by hand is a guarantee that each one of them is slightly different – and thus a candidate for subtle problems that happen on one but not the others. Virtualization and consolidation are all about consistency and making many components behave as if they were one.

There are literally dozens of web pages (such as blog posts) that contain detailed instructions on how to set up Oracle on a certain platform. Some examples:

The Gruff DBA – Oracle 12cR1 12.1.0.1 2-node RAC on CentOS 6.4 on VMware Workstation 9 – Introduction
Pythian – How to Install Oracle 12c RAC: A Step-by-Step Guide
Martin Bach – Installing Oracle 12.1.0.2 RAC on Oracle Linux 7-part 1

Even if you follow the guidelines in such articles, you are likely to run into problems due to running a different OS, a different Oracle version, network problems, and so on. Not to mention that the “best practices” provided by various vendors are often not honoured, because they tend to be overlooked due to information overload…

Some people have suggested using automated deployment tools such as Ansible (e.g. Frits Hoogland – Using Ansible for executing Oracle DBA tasks) but there are (as far as I know) no complete out-of-the-box solutions.

EMC has published several white papers and reference architectures with instructions on how to set up Oracle to run best on EMC. Still, some of the papers are not step-by-step manuals, so you have to extract configuration details manually from various (sometimes conflicting) sources and convert them into configuration file entries, commands, etc.

So a while ago I decided to take a different approach and build a virtual appliance that does all of these things for you, while still offering (limited) flexibility across different platforms and versions, and in configuration preferences.


Oracle ASM vs ZFS on VNX

In my last post on ZFS I shared results of a lab test where ZFS was configured on Solaris x86 and using XtremIO storage. A strange combination maybe but this is what a specific customer asked for.

Another customer requested a similar test with ZFS versus ASM but on Solaris/SPARC and on EMC VNX. Also very interesting as on VNX we’re using spinning disk (not all-flash) so the effects of fragmentation over time should be much more visible.

So with the support of the local administrators, I performed a test similar to the one before: start on ASM and get baseline random and sequential performance numbers, then move the tablespace (copy) to ZFS so you start off with as little fragmentation as possible. Then run random read/write followed by sequential read, multiple times, and see how the I/O behaves.

Oracle ASM vs ZFS on XtremIO

Background

In my previous post on ZFS I showed how ZFS causes fragmentation for Oracle database files. At the end I promised (sort of) to come back to the topic of how this affects database performance. In the meantime I have been busy with many other things, but ZFS issues still sneak up on me frequently. Eventually, I was forced to take another look at this because two separate customers asked for ZFS comparisons against ASM at the same time.

The account team for one of the two customers asked if I could perform some testing on their lab environment to show the performance difference between Oracle on ASM and on ZFS. As things happen in this business, things were already rolling before I could influence the prerequisites and the suggested test method. Promises were already made to the customer and I was asked to produce results yesterday.

Without knowledge of the lab environment, the customer requirements or even details of the test environment they had set up. Typical day at the office.

In addition to that, ZFS requires a supported host OS – so Linux is out of the question (the status on kernel ZFS for Linux is still a bit unclear and certainly it would not be supported with Oracle). I had been using FreeBSD in my post on fragmentation – because that was my platform of choice at that point (my Solaris skills are, at best, rusty). Of course Oracle on FreeBSD is a no-go so back then, I used NFS to run the database on Linux and ZFS on BSD. Which implicitly solves some of the potential issues whilst creating some new ones, but alas.

Solaris x86

This time the idea was to run Oracle on Solaris (x86) that had both ZFS and ASM configured. How to perform a reasonable comparison that also shows the different behavior was unclear, and when I put that question to the account team, the conference call line stayed surprisingly silent. All that they indicated up front was that the test tool on Oracle should be SLOB.


ZFS and Database fragmentation


Disk Fragmentation – O&O technologies.
Hope they don’t mind the free advertising

Yet another customer was asking me for advice on implementing the ZFS file system on EMC storage systems. Recently I did some hands-on testing with ZFS as an Oracle database file store so that I could form an opinion on the matter.

One discussion that comes up frequently is the fragmentation issue. ZFS uses a copy-on-write allocation mechanism, which basically means that every time you write to a block on disk (whether this is a newly allocated block or, very importantly, a previously allocated one being overwritten), ZFS will buffer the data and write it out to a completely new location on disk. In other words, it will never overwrite data in place. A lot of discussions can be found in the blogosphere and on forums debating whether this is really the case, how serious it is, what the impact on performance is, and what ZFS has done to either prevent or mitigate the issue (e.g. by using caching, smart disk allocation algorithms, etc.).
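The copy-on-write behaviour described above can be illustrated with a toy model. The Python sketch below is not how ZFS allocates space: the append-only allocator, the function names and the block counts are all hypothetical, and it ignores caching, free space reuse and record sizes. It only shows how never overwriting in place turns a contiguous file into many extents after random writes.

```python
import random


def count_extents(mapping: list[int]) -> int:
    """Count contiguous physical runs when reading the file in logical order."""
    extents = 1
    for prev, cur in zip(mapping, mapping[1:]):
        if cur != prev + 1:
            extents += 1
    return extents


def simulate_cow(n_blocks: int, n_writes: int, seed: int = 0) -> list[int]:
    """Toy copy-on-write: every overwrite relocates the block to fresh space."""
    rng = random.Random(seed)
    mapping = list(range(n_blocks))  # logical block -> physical block
    next_free = n_blocks             # hypothetical append-only allocator
    for _ in range(n_writes):
        logical = rng.randrange(n_blocks)
        mapping[logical] = next_free  # never overwrite data in place
        next_free += 1
    return mapping


if __name__ == "__main__":
    for writes in (0, 100, 1000, 10000):
        layout = simulate_cow(n_blocks=10000, n_writes=writes)
        print(f"{writes:>6} random writes -> {count_extents(layout)} extents")
```

A freshly copied file starts as a single extent; each random overwrite moves one logical block away from its neighbours, so a sequential read of the file has to visit more and more separate physical locations.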

In this post I attempt to prove how quickly database files on ZFS file systems get fragmented on disk. I will not make any comments on how this affects performance (I’ll save that for a future post). I also deliberately ignore ZFS caching and other optimizing features – the only thing I want to show right now is how much fragmentation is caused on physical disk by using ZFS for Oracle data files. Note that this is a deeply technical and lengthy article, so you might want to skip all the details and jump right to the conclusion at the bottom :-)
