Pinpointing I/O bottlenecks on Linux

Does this story sound familiar?

The end users of a database application start complaining about poor system response and long running batch jobs. The DBA team starts investigating the problem. DBA’s look at their database tools such as Enterprise Manager, Automatic Workload Repository (AWR) reports, etc. They find that storage I/O response times are too high (such as an average of 50 milliseconds or more) and involve the storage team to resolve it.

The storage guys, in turn, look at their tooling – in case of EMC this could be Navisphere Analyzer, Symmetrix Performance Analyzer (SPA) or similar tools. They find completely normal response times – less than 10 milliseconds average.

The users still complain but the storage and database administrators point to each other to resolve the problem. There is no real progress in solving the problem though.

Two Way Communications

Continue reading