Pavilion’s perspective as vendors enter the NVMe game

NVMe Demands Storage System Re-Design, Not Retro-Fits

We are experiencing exciting times in the storage industry. During the week of May 1, 2018, Dell announced the NVMe-based PowerMax. At Pavilion we welcome these kinds of announcements, since they accelerate customer awareness of the fact that the future of storage is NVMe. Let the loud and expensive bullhorn sound…it’s fascinating to watch. Why? Because the transition to all-NVMe storage is being validated before our eyes.

You know you’re on to something important when the AFA storage incumbents make their retrofit product announcements. Rather than doing the difficult yet required work of re-designing storage to exploit the potential of the NVMe protocol, they unveil a transitory product retrofit…targeted as a price-premium forklift upgrade to existing customers amidst a blaze of trumpets. “Fastest All Flash Array” claims accompany the loud, brass symphony. It makes for great theater.

If one wants to predict the future of technology, the past provides a reliable harbinger. There are cycles that repeat…over and over. As profound, disruptive technologies develop, it takes time to establish industry standards. As these standards become established, legacy vendors respond, adapting as best they can by modifying existing platforms and leaving performance and efficiency opportunities behind. And it’s not just Dell/EMC. It is true of any storage provider whose design is built around dual-controller architectures and serial protocols.

Application Performance has Become the “Main Thing”

We’re continuing to see faster movement to the Cloud, both through migration of workloads to Cloud Providers and through adoption of Cloud delivery models by central IT functions. There had better be solid justification for an application to run on-prem nowadays, and we’re seeing application performance become front-and-center as a key criterion.

Yet the types of applications that require on-prem levels of performance have changed dramatically over the past decade. We are seeing an explosion of clustered databases running in highly distributed environments, ingesting and processing data sets an order of magnitude larger than ever before. We’re seeing High Performance Computing architectures, previously reserved for academic, government, and research institutions, expand into the commercial/enterprise world. We’re seeing streaming analytics become increasingly necessary because, at the speed of business, batch processing doesn’t cut it anymore.

And the really interesting aspect of these Modern Applications?  These are rack-scale designs where the unit of measure is the rack.  SAN has no place here.  Rather, the de-facto storage architecture for these highly performant, highly parallelized modern applications remains Direct Attached Storage.  Why?  Because Performance is NOW the “Main Thing.”

To get the maximum performance benefit from flash storage, Modern Applications run on “shared nothing” direct attached storage, despite the downsides of over-provisioning, CPU contention for required storage management services, and the operational challenges of managing data distributed across hundreds of nodes.

Many analyst firms agree, stating that SAN deployments are not growing, but instead shrinking, and Direct Attached Storage is growing.

Enter NVMe-over-Fabrics

The really cool thing about NVMe and NVMe-over-Fabrics (NVMe-oF) is that they unlock the incredible performance potential of Flash Media by giving it a protocol designed to take advantage of flash. NVMe-oF makes it possible to present hyper-fast storage from a shared, pooled resource AS IF it were local to a host, through high-speed, lossless protocols.

This means we can deliver storage as a service to a host at the performance of local attached SSDs.  An obvious question that follows is “Isn’t that expensive?” and our answer is “No”.

Yet the reason why NVMe is so disruptive (back to the theme), and why retro-fits just won’t cut it, is simple. Flash is so inherently fast that NVMe exposes 20+ year old bottlenecks that exist within every AFA design today. These are i) server-based dual-controller sub-system designs and ii) systems designed to continue supporting serial protocols.

Time to Re-Think Storage Design

Pavilion’s goal is to democratize hyper-low latency, high throughput storage at the same price as (or less than) Big Storage AFAs. This required us to completely re-think storage array design. We quickly realized that existing storage array designs look a lot like a server platform, whereas we believed that the design should be modeled on a switch instead, where controller bottlenecks are removed as much as possible.

Figure 1: Server-based dual-controller storage system design is a bottleneck with NVMe

Pavilion was recently featured as a Cool Vendor in Storage Technologies by Gartner.    One of the statements the Gartner analyst made in the report was “Customers that require simple, small and dense storage for low-latency workloads but may also have high-bandwidth workloads will find Pavilion Data Systems as a useful solution.”

We are focused on the Simple, Small and Dense adjectives used above. To illustrate what we DON’T mean when we use these terms, let’s examine Dell’s PowerMax announcement, self-proclaimed as “the fastest storage array”. (We assume that they are basing this on the 150 GB/s read bandwidth spec they have quoted.)

Rather than focusing on overall performance, let’s look at performance from a density standpoint. If you compare the PowerMax 8000 to Pavilion in terms of performance per rack unit, the comparison looks like this:

 

Metric | Pavilion Advantage vs PowerMax 8000
IOPS/Rack Unit | 10X
Bandwidth/Rack Unit | 15X
Physical Capacity/Rack Unit | 8X
IOPS/Watt | 11X

Figure 2: Comparing a Pavilion Array to PowerMax
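
For readers who want to run this kind of density comparison against their own numbers, here is a minimal Python sketch of the arithmetic behind a "per rack unit" ratio. The `ArraySpec` type, the function names, and every figure in it are hypothetical placeholders chosen for illustration; they are not Pavilion or PowerMax specifications and do not reproduce the ratios in Figure 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArraySpec:
    """Headline specs for one storage array (all values are placeholders)."""
    name: str
    iops: float            # aggregate I/O operations per second
    bandwidth_gbps: float  # aggregate read bandwidth in GB/s
    rack_units: int        # rack space the array occupies


def per_rack_unit(value: float, rack_units: int) -> float:
    """Normalize an aggregate metric by the rack space it consumes."""
    if rack_units <= 0:
        raise ValueError("rack_units must be positive")
    return value / rack_units


def density_advantage(a: ArraySpec, b: ArraySpec) -> dict:
    """Return how many times denser array `a` is than array `b`."""
    return {
        "iops_per_ru": per_rack_unit(a.iops, a.rack_units)
        / per_rack_unit(b.iops, b.rack_units),
        "bandwidth_per_ru": per_rack_unit(a.bandwidth_gbps, a.rack_units)
        / per_rack_unit(b.bandwidth_gbps, b.rack_units),
    }


# Hypothetical inputs: array_b has more total bandwidth but uses 4x the rack space.
array_a = ArraySpec("array_a", iops=1_000_000, bandwidth_gbps=20.0, rack_units=2)
array_b = ArraySpec("array_b", iops=1_000_000, bandwidth_gbps=40.0, rack_units=8)

ratios = density_advantage(array_a, array_b)
print(ratios)
```

In this made-up case the second array has twice the total bandwidth, yet the first delivers more bandwidth per rack unit, which is the distinction between "fastest overall" and "densest" that the comparison above is drawing.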

So, which platform is faster?    The point here is that while Pavilion and PowerMax are designed for completely different use cases, this is an illustration of what Pavilion’s end-to-end NVMe technology can deliver vs. a traditional array design where NVMe is bolted on in places, but not redesigned into the platform from the ground up.

We’re grateful to Dell for jumping into the NVMe pond.  We’re grateful to Pure for incorporating NVMe early within their designs.  And we’ll continue to be grateful for each retrofit announced, building both momentum and market awareness of this profoundly important technology development called NVMe.

But retrofits are transitory. NVMe exposes the limits of a great deal of legacy technology. Re-thinking storage is difficult work, yet we invite you to join the dialogue and explore the full performance improvements and efficiencies that flash media can offer your organization.
