As storage systems move away from the typical RAID/LUN model for storage
assignment, IOPS for a specific application are tougher to nail down, and that
can mask a misbehaving application.
With the newer storage systems, when a LUN or volume is created it is striped
across all the elements that make up the storage tier.
So essentially you have access to the full I/O capability of the entire storage
tier, not just the subset of disks that make up a RAID group. Unfortunately,
providing a higher number of IOPS is useless unless they are delivered at
predictably low latency.
This leads to the question: what good are IOPS figures, and why does the
storage industry talk about them all the time? Personally I think
it’s partly a hang-up from the days of disk, when IOPS were such a limiting
factor, and partly a marketing thing, because multi-million IOPS results sound
impressive. I’m more concerned with what
we should be asking about than what we should not, so what does matter?
Set aside IOPS as a factor for now. The whole point of a flash array is that IOPS
effectively become an unlimited resource. Sure, there is always a real limit –
but it’s so high that it’s no longer necessary to worry about it.
Latency is now the critical factor that should be focused on because
this is what injects delay into your system. Latency means lost time; time that could have been spent busily
producing results, but is instead spent waiting for I/O resources.
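
To make "latency means lost time" concrete, here is a minimal sketch. It assumes a workload whose I/Os are dependent, so each one must finish before the next is issued. The function name and the numbers (10,000 I/Os, 0.5 ms and 2 ms) are made up for illustration and are not measurements from any real system.

```python
def io_wait_seconds(serial_ios: int, latency_ms: float) -> float:
    """Time spent waiting on storage when I/Os are issued one after another."""
    return serial_ios * latency_ms / 1000.0


# Hypothetical report that needs 10,000 dependent I/Os to complete.
SERIAL_IOS = 10_000

low_latency_wait = io_wait_seconds(SERIAL_IOS, 0.5)   # 0.5 ms per I/O
high_latency_wait = io_wait_seconds(SERIAL_IOS, 2.0)  # 2.0 ms per I/O

print(f"wait at 0.5 ms per I/O: {low_latency_wait:.1f} s")
print(f"wait at 2.0 ms per I/O: {high_latency_wait:.1f} s")
```

In this sketch the array could be advertising the same IOPS ceiling in both cases; the only thing that changed is how long each individual I/O takes, and that is what the user ends up waiting for.
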
Business requirements tend to be along the lines of needing to supply
trading reports faster, or reduce the time spent by call center operatives
waiting for their CRM screens to refresh. These almost always translate back
into latency requirements. After all, the key to solving any performance issue
is always to follow the time and find out where it is being spent.
Have you noticed that latency is the only one of our three fundamental
characteristics which is expressed solely in units of time?
I recently ran into this specific issue. A small change introduced a minor latency increase
to a claims processing system, and the result was that each claims representative
went from processing 40 claims a day to 38. That may not seem like a lot, but
multiplied across 400 reps over a week it translated to 4,000 fewer claims processed
a week, and since they get paid by the claims processed it was a significant hit
to profit. When looking at the system from
a strictly IOPS perspective, the numbers were the same before and after the change. So
yes, IOPS do matter, but don’t get distracted by them… it’s all about latency.
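
The arithmetic behind that example can be sketched in a few lines. The rep count and the claims per day come from the story above; the five-day work week is an assumption on my part (it is what makes the weekly figure come out to 4,000), and the function name is just for illustration.

```python
REPS = 400                   # claims representatives
CLAIMS_PER_DAY_BEFORE = 40   # per rep, before the change
CLAIMS_PER_DAY_AFTER = 38    # per rep, after the change
WORKDAYS_PER_WEEK = 5        # assumption: a five-day work week


def weekly_claims_lost(reps: int, before: int, after: int, workdays: int) -> int:
    """Claims no longer processed per week across all reps."""
    return reps * (before - after) * workdays


lost = weekly_claims_lost(
    REPS, CLAIMS_PER_DAY_BEFORE, CLAIMS_PER_DAY_AFTER, WORKDAYS_PER_WEEK
)
drop_pct = 100 * (CLAIMS_PER_DAY_BEFORE - CLAIMS_PER_DAY_AFTER) / CLAIMS_PER_DAY_BEFORE

print(f"claims lost per week: {lost}")
print(f"per-rep throughput drop: {drop_pct:.1f}%")
```

Two claims a day per rep looks like noise on its own, which is why it helps to multiply it out before deciding a latency change is too small to care about.
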