Managing REDO log performance


I have written before about managing database performance issues, and the topic is as hot and alive as ever, even with today’s fast processors, huge memory sizes and enormous bandwidth to storage and networks.

Warning: Rated TG (Technical Guidance required) for sales guys and managers ;-)

A few recent conversations with customers showed other examples of miscommunication between IT teams, resulting in problems not being solved efficiently and quickly.
In this case, the problem was around Oracle REDO log sync times. Some customers had a whole bunch of questions for me: what EMC’s best practices are, how they enhance or replace Oracle’s best practices, and in general how they should configure REDO logs in the first place to get the best performance. The whole challenge is complicated by the fact that more and more organizations are using EMC’s FAST-VP for automated tiering and performance balancing of their applications, and some of the questions were around how FAST-VP improves (or messes up) REDO log performance.


Oracle RAC on VPLEX now certified

Last week EMC announced that Oracle RAC on VPLEX stretched clusters is now officially supported and certified by Oracle!

News Summary:

 

  • Oracle has certified that EMC® VPLEX™ METRO in a stretch cluster configuration can provide Oracle Real Application Clusters (Oracle RAC) customers with an easy-to-deploy, active/active solution, as they transform from single- to dual-site environments.
  • Having passed Oracle’s rigorous testing standards, the EMC VPLEX METRO solution can enable Oracle RAC to be easily configured over extended distances while enabling simultaneous access to the same data at both locations.

This is the final step in a process to help customers that have been asking for true active/active support over distance for their mission-critical Oracle Database business processes.

For those who are not yet familiar with this solution, here is a small summary:

  • Customers have been in search of ways to survive datacenter failures (i.e. “disasters”) without the need to recover and restart the databases, in such a way that any component failure or even complete site failure would not lead to database downtime.
  • This was not possible before, except when deploying complex configurations based on host mirroring using Oracle ASM or a 3rd party volume manager. (Note that competing storage virtualization products from other storage vendors also do not offer this full capability, even though their marketing might make it seem so.)
  • EMC VPLEX offers this functionality, which is now completely certified and supported by Oracle, and the solution avoids risk by making the stretched cluster deployment as easy as a basic Oracle RAC install.
  • The VPLEX solution offers additional benefits, including better performance and better recovery from issues such as component or link failures, and it covers the whole application stack, not just Oracle.
  • Note that AFAIK this solution should also work for IBM DB2 (but I haven’t confirmed)

The full news release can be found here: http://www.emc.com/about/news/press/2012/20120517-04.htm

A full series of blog posts on this solution can be found here: http://bartsjerps.wordpress.com/category/vplex/

The VPLEX witness (the final component of VPLEX that made this possible) was announced last year at EMC World 2011. Typically we see the start of market adoption between 1 and 1.5 years after bringing new technology to the market. I am working with a few customers myself who are on the edge of starting a project with this. Hopefully by the end of the year we will have a set of good customer references!

Update: The new white paper can be found here: http://www.emc.com/collateral/software/white-papers/h8930-vplex-metro-oracle-rac-wp.pdf

 

Data Guard protecting from EMC block corruptions?

Today I was giving a training to fellow EMC colleagues on some Oracle fundamentals. One of the things that was mentioned is something I have heard several times before: Oracle is claiming that EMC SRDF (a data mirroring function of EMC Symmetrix enterprise storage systems, mainly used to provide enterprise disaster recovery) cannot detect certain types of data corruption where Oracle Data Guard can. Ouch. The trouble with this statement is that it is half-true (and half-truths are the most dangerous).

Performance – The database stack

As mentioned before, I frequently find myself in discussions around Oracle performance and how an Oracle database behaves on EMC storage. It turns out that often there is a lot of confusion on how the different layers interact with each other and very few people seem to understand the whole stack.

So I started a personal challenge to make a “one picture tells more than 1000 words” complete overview of the Oracle on EMC database stack.

I failed.

Turns out it’s nearly impossible to get everything in one picture without cutting corners.

So here is a simplified (and therefore incorrect) picture. It ignores certain complexities and is far from complete, and might even contain errors.


Wikipedia blackout

Just to inform you that tomorrow (Wednesday, Jan. 18th, 2012), some of the links on my blog might not work due to Wikipedia’s one-day blackout in protest against SOPA. I use Wikipedia a lot, both as a great resource to learn from myself and to point my readers to more information on certain topics.

I think Wikipedia touches on a real problem: governments (pushed by lobbyist groups) are pushing for an internet where you have to be cautious about what you say or publish. Best case, you might get blacklisted. Worst case? Figure it out for yourself.

I live in the Netherlands, and currently something similar is going on here, with organizations trying to restrict people from accessing certain information sources (in this case, the Pirate Bay). Whether the Pirate Bay (or any other source of information for that matter) violates the law or not is (IMO) a different discussion. But if people (or organizations) want to restrict access by ordering ISPs (Internet service providers, a.k.a. the mailman) to blacklist those sites (i.e. check your mail for offending content) instead of chasing the publishers of illegal materials of any kind, then we are well on our way to a different internet. An internet that is no longer free. I strongly oppose that.

So, Wikipedia (and others), you have my full support.

Click the “STOP SOPA” banner (top right hand corner) if you want to learn more.

More info:

https://www.eff.org/deeplinks/2011/12/fight-blacklist-toolkit-anti-sopa-activists

https://www.eff.org/search/site/sopa

Oracle and Data Integrity: Data in, Garbage Out?


A trivial question:

What is the basic function of a storage system?
I would say the trivial function of a storage system is to store digital data and get it back when you need it.

To be specific:
get the data back exactly the way you stored it.

You would probably say “Duh, of course!”

A storage system (as simple as a hard disk or as sophisticated as an EMC VMAX) is supposed to store data and give it back unmodified. But recent research shows that simple disk drives are not as reliable as you might think. Plenty of material is available that explains why and how often disk drives fail to return the correct information, often without any error, as if the corrupted data were perfectly valid. See below for more references to this issue.


Application processing at lightning performance – The hourglass view of access times

Even in these modern times, when lots of things are changing in the ICT world, some lessons from the past still hold true.

Previously, I discussed the I/O stack in a typical database environment. Although virtualization has complicated things a bit, the fundamental principles of performance tuning stay the same.

Recently I was browsing through old presentations of colleagues and found another interesting view on response times in an application stack. Again, I polished it up a bit and modified it to reflect a few innovations and personal insights.

The idea is as follows. We as humans have problems getting a feel for how fast modern microprocessors work. We talk in milliseconds, microseconds, nanoseconds. So in the comparison we assume a 1 Gigahertz processor (one clock cycle per nanosecond) and then scale up one nanosecond to match one second, because this fits better in a human’s view of the world. Then we compare various sorts of storage on the “indexed” timescale and see how they relate to each other.
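To get a feel for the scaling, here is a small sketch in Python. Note that the latency figures are my own rough, round-number assumptions for illustration only (they are not measurements and not the numbers from the picture), and the function names are made up for this example.

```python
# Scale real access times so that 1 nanosecond reads as 1 "human" second.
HUMAN_SECONDS_PER_NS = 1

# Illustrative round numbers in nanoseconds (assumptions, not measurements).
LATENCIES_NS = {
    "CPU cycle at 1 GHz": 1,
    "memory access": 100,
    "flash read": 100_000,
    "disk seek": 5_000_000,
}


def scaled_seconds(latency_ns):
    """Map a latency in nanoseconds onto the indexed timescale (seconds)."""
    return latency_ns * HUMAN_SECONDS_PER_NS


def humanize(seconds):
    """Express a number of seconds in the largest unit that fits."""
    for unit, size in (("days", 86_400), ("hours", 3_600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds} s"


if __name__ == "__main__":
    for name, latency_ns in LATENCIES_NS.items():
        print(f"{name:20} {humanize(scaled_seconds(latency_ns))}")
```

With these assumed figures, a single clock cycle is one second on the indexed scale, while a 5 millisecond disk seek stretches to almost two months.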


Managing Performance Expectations

Got this joke from a Dutch colleague (thanks Rohan ;-)

A customer is complaining that his shiny new storage system does not perform and (as usual) blames the storage vendor.

But sometimes you have to wonder if a customer uses a system for what it was designed for…


Save money by virtualizing Oracle


I wrote an internal EMC memo on licensing issues with Oracle on VMware as I get a lot of questions on this topic. But I’d like to expand the question a bit. After all, my blog is named “Dirty Cache” which could also be substituted with “Dirty Cash” – and as said, my mission is to lower cost and drive up service levels for my customers…

Here is my internal memo (slightly edited for the blog and updated with a few corrections). Again, I want to make it clear that these are my own opinions based on (limited) customer experiences. I might be completely wrong, and that’s why my blog has a disclaimer ;-)

Use this information at your own risk – don’t shoot the messenger.

Original question:

How should we license Oracle database on VMware?

Beefed up question:

How can we save money on licensing and other expenses by virtualizing Oracle?


Performance – The I/O stack


In my last post, I gave a highly simplified representation of “The I/O stack” of a database. In reality, it’s much more complex.

I found an old picture in which the whole I/O stack of a database was described, and I decided to dust it off, include some additional layers (application and middleware) and show how the I/O flows if you run on a virtual server with a hypervisor.

Also, the storage network can provide virtualization which in turn adds a layer of complexity.

