The Quick and Dirty Deduplication Analyzer

The best thing about being me… There are so many “me”s.

— Agent Smith, The Matrix Reloaded

One of our customers reported less than optimal space savings on XtremIO running Oracle. To test various scenarios with Oracle, I went looking for a deduplication analysis method or tool, only to find that nothing available qualified.

TL;DR: QDDA is an Open Source tool I wrote to analyze Linux files, devices or data streams for duplicate blocks and compression estimates. It can quickly give you an idea of how much storage savings you could get using a modern All-Flash Array like XtremIO. It is safe to use on production systems and allows quick analysis of various test scenarios giving direct results, and even works with files/devices that are in use. No registration or uploading of your confidential data is required.
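To make the idea of analyzing data for duplicate blocks and compression estimates concrete, here is a minimal Python sketch. It is not how QDDA itself is implemented. The function name `analyze_blocks`, the 8 KiB block size, SHA-256 as the block fingerprint and zlib as the compressor are all assumptions chosen purely for illustration.

```python
import hashlib
import zlib


def analyze_blocks(data: bytes, block_size: int = 8192) -> dict:
    """Estimate dedupe and compression ratios for a byte string.

    Illustrative only: block size, hash and compressor are assumptions,
    not the choices made by QDDA or by any particular storage array.
    """
    seen = set()          # fingerprints of blocks encountered so far
    total_blocks = 0
    unique_bytes = 0      # raw size of all unique blocks
    compressed_bytes = 0  # compressed size of all unique blocks

    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        total_blocks += 1
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            continue  # duplicate block: would be stored only once
        seen.add(digest)
        unique_bytes += len(block)
        compressed_bytes += len(zlib.compress(block))

    unique_blocks = len(seen)
    return {
        "total_blocks": total_blocks,
        "unique_blocks": unique_blocks,
        "dedupe_ratio": total_blocks / unique_blocks if unique_blocks else 0.0,
        "compress_ratio": unique_bytes / compressed_bytes if compressed_bytes else 0.0,
    }


if __name__ == "__main__":
    # Three identical blocks plus one different block.
    sample = b"A" * 8192 * 3 + bytes(range(256)) * 32
    print(analyze_blocks(sample))
```

Reading a file or device in fixed-size chunks and feeding each chunk through the same loop would work the same way. Only fingerprints and counters are kept, never the data itself.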


Oracle ASM vs ZFS on VNX

In my last post on ZFS I shared results of a lab test where ZFS was configured on Solaris x86 and using XtremIO storage. A strange combination maybe but this is what a specific customer asked for.

Another customer requested a similar test with ZFS versus ASM but on Solaris/SPARC and on EMC VNX. Also very interesting as on VNX we’re using spinning disk (not all-flash) so the effects of fragmentation over time should be much more visible.

So with the support of the local administrators, I performed a test similar to the previous one: start on ASM and get baseline random and sequential performance numbers, then move the tablespace (copy) to ZFS so you start off with as little fragmentation as possible. Then run random read/write followed by sequential read, multiple times, and see how the I/O behaves.

Oracle ASM vs ZFS on XtremIO

Background

In my previous post on ZFS I showed how ZFS causes fragmentation for Oracle database files. At the end I promised (sort of) to come back to the topic of how this affects database performance. In the meantime I have been busy with many other things, but ZFS issues still sneak up on me frequently. Eventually, I was forced to take another look at this because two separate customers asked for ZFS comparisons against ASM at the same time.

The account team for one of the two customers asked if I could perform some testing on their lab environment to show the performance difference between Oracle on ASM and on ZFS. As things happen in this business, things were already rolling before I could influence the prerequisites and the suggested test method. Promises were already made to the customer and I was asked to produce results yesterday.

All this without any knowledge of the lab environment, the customer requirements or even details of the test environment they had set up. Typical day at the office.

In addition to that, ZFS requires a supported host OS – so Linux is out of the question (the status on kernel ZFS for Linux is still a bit unclear and certainly it would not be supported with Oracle). I had been using FreeBSD in my post on fragmentation – because that was my platform of choice at that point (my Solaris skills are, at best, rusty). Of course Oracle on FreeBSD is a no-go so back then, I used NFS to run the database on Linux and ZFS on BSD. Which implicitly solves some of the potential issues whilst creating some new ones, but alas.

Solaris x86

This time the idea was to run Oracle on Solaris (x86) that had both ZFS and ASM configured. How to perform a reasonable comparison that also shows the different behavior was unclear, and when I put that question to the account team, the conference call line stayed surprisingly silent. All they indicated up front was that the test tool on Oracle should be SLOB.


Fun with Linux UDEV and ASM: Using UDEV to create ASM disk volumes

Because of the many discussions and the confusion around partitioning, disk alignment and its brother issue, ASM disk management, here is an explanation of how to use UDEV. As an extra, I present a tool that manages some of this stuff for you.

The questions could be summarized as follows:

  • When do we have issues with disk alignment and why?
  • What methods are available to set alignment correctly and to verify?
  • Should we use ASMlib or are there alternatives? If so, which ones and how to manage those?

I’ve written 2 blog posts on the matter of alignment so I am not going to repeat myself on the details. The only thing you need to remember is that classic “MS-DOS” disk partitioning, by default, starts the first partition on the disk at the wrong offset (wrong in terms of optimal performance). The old partitioning scheme was invented when physical spinning rust was formatted with 63 sectors of 512 bytes each per disk track. Because you need some header information for boot block and partition table, the smart guys back then thought it was a good idea to start the first block of the first data partition on track 1 (instead of track 0). These days we have completely different physical disk geometries (and sometimes even different sector sizes, another interesting topic) but we still have the legacy of the old days.

If you’re not using an Intel X86_64 based operating system then chances are you have no alignment issues at all (the only exception I know is Solaris if you use “fdisk”, similar problem). If you use newer partition methods (GPT) then the issue is gone (but many BIOSes, boot methods and other tools cannot handle GPT). As MSDOS partitioning is limited to 2 TiB (http://en.wikipedia.org/wiki/Master_boot_record) it will probably be a thing of the past in a few years but for now we have to deal with it.

Wrong alignment causes some reads and writes to be broken in 2 pieces causing extra IOPS. I don’t have hard numbers but a long time ago I was told it could be an overhead of up to 20%. So we need to get rid of it.
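The arithmetic behind this can be sketched in a few lines of Python. The 4 KiB alignment boundary, the 8 KiB backend chunk size and start sector 2048 (a 1 MiB offset) are assumptions for illustration, and the function names are made up. The 63 sectors of 512 bytes and the 2 TiB limit come from the text above.

```python
SECTOR_BYTES = 512  # classic sector size


def partition_offset_bytes(start_sector: int) -> int:
    """Byte offset of a partition that starts at the given sector."""
    return start_sector * SECTOR_BYTES


def is_aligned(start_sector: int, boundary_bytes: int = 4096) -> bool:
    """True if the partition starts on a multiple of boundary_bytes."""
    return partition_offset_bytes(start_sector) % boundary_bytes == 0


def chunks_touched(offset_bytes: int, io_bytes: int, chunk_bytes: int) -> int:
    """Number of backend chunks a single I/O spans."""
    first = offset_bytes // chunk_bytes
    last = (offset_bytes + io_bytes - 1) // chunk_bytes
    return last - first + 1


# MBR stores sector numbers in 32 bits, hence the 2 TiB limit at 512 bytes/sector.
MBR_LIMIT_BYTES = 2**32 * SECTOR_BYTES

if __name__ == "__main__":
    for start in (63, 2048):  # legacy default vs. an assumed 1 MiB start
        offset = partition_offset_bytes(start)
        print(
            f"start sector {start}: offset {offset} bytes, "
            f"aligned={is_aligned(start)}, "
            f"8 KiB I/O touches {chunks_touched(offset, 8192, 8192)} chunk(s)"
        )
    print(f"MBR limit: {MBR_LIMIT_BYTES / 1024**4} TiB")
```

With the legacy start at sector 63 the offset is 32256 bytes, which is not a multiple of 4096, so an 8 KiB I/O at the start of the partition straddles two 8 KiB backend chunks instead of one.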

ASM storage configuration

ASM does not use OS file systems or volume managers but has its own way of managing volumes and files. It “eats” block devices and these block devices need to be read/write for the user/group that runs the ASM instance, as well as the user/group that runs Oracle database processes (a public secret is that ASM is out-of-band and databases write directly to ASM data chunks). ASM does not care what the name or device numbers are of a block device, neither does it care whether it is a full disk, a partition, or some other type of device as long as it behaves as a block device under Linux (and probably other UNIX flavors). It does not need partition tables at all but writes its own disk signatures to the volumes it gets.
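As a rough illustration of the requirements above (a block device, read/write for the right user and group), the following Python sketch inspects a path with standard `os.stat` calls. The function name `describe_device` and the example path `/dev/sdb1` are hypothetical, and which user and group should own the device depends entirely on your installation.

```python
import os
import stat


def describe_device(path: str) -> dict:
    """Report whether path is a block device and who may read/write it."""
    st = os.stat(path)
    mode = st.st_mode
    return {
        "is_block_device": stat.S_ISBLK(mode),
        "uid": st.st_uid,                  # numeric owner
        "gid": st.st_gid,                  # numeric group
        "mode": oct(stat.S_IMODE(mode)),   # permission bits, e.g. '0o660'
        "owner_rw": bool(mode & stat.S_IRUSR) and bool(mode & stat.S_IWUSR),
        "group_rw": bool(mode & stat.S_IRGRP) and bool(mode & stat.S_IWGRP),
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_dev.py /dev/sdb1
    for dev in sys.argv[1:]:
        print(dev, describe_device(dev))
```

A check like this only reads metadata, so it is harmless to run against devices that are in use.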

[ Warning: Lengthy technical content, Rated T, parental advisory required ]


Starting an Oracle database on physical server using VMware VMDK volumes

By now, we all know Oracle is fully supported on VMware. Anyone telling you it’s not supported is either lying to you, or doesn’t know what he is talking about (I keep wondering what’s worse).

VMware support includes Oracle RAC (if it’s version 11.2.0.2.0 or above). However, Oracle may ask you to reproduce problems on physically deployed systems if they suspect the problem is related to the hypervisor. The support note says:

Oracle will only provide support for issues that either are known to occur on the native OS, or can be demonstrated not to be as a result of running on VMware.

If that happens, I recommend contacting VMware support first because they might be familiar with the issue or can escalate the problem quickly. VMware support will take full ownership of the problem. Still, I have met numerous customers who are afraid of having to reproduce issues quickly and reliably on physical hardware in case the escalation policy does not help. We need to get out of the virtual world, into reality, without making any other changes. How do we do that?


Linux Disk Alignment Reloaded

My all-time most viewed post is the one on Linux disk alignment: How to set disk alignment in Linux. In that post I showed an easy method to set and check disk alignment under Linux.

ZFS and Database fragmentation

Disk Fragmentation – O&O technologies.
Hope they don’t mind the free advertising

Yet another customer was asking me for advice on implementing the ZFS file system on EMC storage systems. Recently I did some hands-on testing with ZFS as Oracle database file store so that I could get an opinion on the matter.

One discussion that comes up frequently is the fragmentation issue. ZFS uses a copy-on-write allocation mechanism, which basically means that every time you write to a block on disk (whether this is a newly allocated block or, very important, a previously allocated one being overwritten) ZFS will buffer the data and write it out to a completely new location on disk. In other words, it will never overwrite data in place. A lot of discussions can be found in the blogosphere and on forums debating whether this is really the case, how serious it is, what the impact on performance is, and what ZFS has done to either prevent or mitigate the issue (e.g. by using caching, smart disk allocation algorithms, etc.).
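The copy-on-write behaviour described above can be illustrated with a toy model in Python. This is a deliberately naive sketch and not how the ZFS allocator works: it assumes every overwrite goes to the next free block at the end of the disk, and it ignores write batching, free space reuse and caching. The function names are made up for this example.

```python
import random


def count_fragments(mapping: list) -> int:
    """Number of physically contiguous runs when reading in logical order."""
    fragments = 1
    for prev, cur in zip(mapping, mapping[1:]):
        if cur != prev + 1:
            fragments += 1  # physical jump between adjacent logical blocks
    return fragments


def simulate(num_blocks: int, num_writes: int, copy_on_write: bool, seed: int = 0) -> int:
    """Apply random single-block overwrites and return the fragment count."""
    rng = random.Random(seed)
    # Freshly copied file: logical block i lives at physical block i.
    mapping = list(range(num_blocks))
    next_free = num_blocks  # toy allocator: always append at the end
    for _ in range(num_writes):
        logical = rng.randrange(num_blocks)
        if copy_on_write:
            mapping[logical] = next_free  # new location, old block abandoned
            next_free += 1
        # Overwrite in place: the mapping does not change.
    return count_fragments(mapping)


if __name__ == "__main__":
    blocks, writes = 10_000, 5_000
    print("in place      :", simulate(blocks, writes, copy_on_write=False))
    print("copy-on-write :", simulate(blocks, writes, copy_on_write=True))
```

In this model an in-place file stays in one piece no matter how many random writes it receives, while the copy-on-write file breaks into more and more pieces. That matters for a later sequential read, which has to follow the logical order.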

In this post I attempt to prove how quickly database files on ZFS file systems get fragmented on disk. I will not make any comments on how this affects performance (I’ll save that for a future post). I also deliberately ignore ZFS caching and other optimizing features – the only thing I want to show right now is how much fragmentation is caused on physical disk by using ZFS for Oracle data files. Note that this is a deeply technical and lengthy article so you might want to skip all the details and jump right to the conclusion at the bottom :-)


Big Ideas; Big Tech: Continuous Operations for Oracle RAC with EMC VPLEX

Here’s an EMC video on Youtube about Oracle RAC with EMC VPLEX. Very nice, check it out!

http://www.youtube.com/watch?v=DRtl6dU2P_E

Managing REDO log performance


I have written before about managing database performance issues, and the topic is as hot and alive as ever, even with today’s fast processors, huge memory sizes and enormous bandwidth to storage and networks.

warning: Rated TG (Technical Guidance required) for sales guys and managers ;-)

A few recent conversations with customers showed other examples of miscommunication between IT teams, resulting in problems not being solved efficiently and quickly.
In this case, the problem was around Oracle REDO log sync times. Some customers had a whole bunch of questions for me about what EMC’s best practices are, how they enhance or replace Oracle’s best practices, and in general how they should configure REDO logs in the first place to get the best performance. The whole challenge is complicated by the fact that more and more organizations are using EMC’s FAST-VP for automated tiering and performance balancing of their applications, and some of the questions were around how FAST-VP improves (or messes up) REDO log performance.


Oracle RAC on VPLEX now certified

Last week EMC announced that Oracle RAC on VPLEX stretched clusters is now officially supported and certified by Oracle!

News Summary:

  • Oracle has certified that EMC® VPLEX™ METRO in a stretch cluster configuration can provide Oracle Real Application Clusters (Oracle RAC) customers with an easy-to-deploy, active/active solution, as they transform from single- to dual-site environments.
  • Having passed Oracle’s rigorous testing standards, the EMC VPLEX METRO solution can enable Oracle RAC to be easily configured over extended distances while enabling simultaneous access to the same data at both locations.

This is the final step in a process to help customers that have been asking for true active/active support over distance for their mission-critical Oracle Database business processes.

For those who are not yet familiar with this solution, here is a small summary:

  • Customers have been in search of ways to survive datacenter failures (i.e. “disasters”) without the need to recover and restart the databases, in such a way that any component failure or even complete site failure would not lead to database downtime
  • This was not possible before except when deploying complex configurations based on host mirroring using Oracle ASM or a 3rd party volume manager. (Note that competing storage virtualization products from other storage vendors also do not offer this full capability, even though their marketing might make it seem so.)
  • EMC VPLEX offers this functionality which is now completely certified and supported by Oracle, and the solution avoids risk by making the stretched cluster deployment as easy as a basic Oracle RAC install
  • The VPLEX solution offers additional benefits, including better performance and better recovery from issues such as component or link failures, and it offers a complete solution for the whole application stack, not just Oracle
  • Note that AFAIK this solution should also work for IBM DB2 (but I haven’t confirmed)

The full news release can be found here: http://www.emc.com/about/news/press/2012/20120517-04.htm

A full series of blog posts on this solution can be found here: https://bartsjerps.wordpress.com/category/vplex/

The VPLEX witness (the final component of VPLEX that made this possible) was announced last year at EMC World 2011. Typically we see the start of market adoption 1 to 1.5 years after bringing new technology to the market. I am working with a few customers myself who are on the edge of starting a project with this; hopefully by the end of the year we will have a set of good customer references!

Update: The new white paper can be found here: http://www.emc.com/collateral/software/white-papers/h8930-vplex-metro-oracle-rac-wp.pdf

Update 2: VPLEX support mentioned (briefly) on Oracle’s website: http://www.oracle.com/technetwork/database/enterprise-edition/tech-generic-linux-new-086754.html

Update 3: Demos available on EMC Demo Center:

EMC VPLEX Metro for Oracle RAC Solution Overview
Oracle RAC with VPLEX Metro Site Failure
Oracle RAC with VPLEX Metro Solution Overview
Oracle RAC with VPLEX Storage Failure

If you’re a frequent reader of my blog you might recognize familiar pictures there ;)