DAS Aggravation or Disaggregation?

A disturbing trend is emerging in cloud-scale and enterprise accounts: NVMe SSD deployments are being capped at less than 3TB. The reason? Node recovery. In Direct-Attached Storage (DAS) deployments, recovering just 1TB of NVMe can take hours and drags down application performance! The result? We deploy complete nodes (servers, storage, network, memory) to accommodate capacity growth when all we really need are fatter SSDs. Modern applications were designed to scale out on DAS, with many servers processing sharded data in parallel. In most cases, customers have enjoyed the…
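To see why recovery time dominates the math, here is a minimal back-of-envelope sketch; the link speed and the fraction of bandwidth left over for rebuilds are assumptions chosen for illustration, not measurements from any particular deployment:

```python
# Back-of-envelope estimate of DAS node recovery time.
# All inputs are illustrative assumptions, not vendor measurements.

def recovery_hours(capacity_tb: float,
                   link_gbps: float = 10.0,       # assumed host network link
                   usable_fraction: float = 0.1   # assumed share of the link left
                                                  # after application traffic
                   ) -> float:
    """Hours to re-replicate capacity_tb over the network to a replacement node."""
    capacity_bits = capacity_tb * 8e12             # TB -> bits
    effective_bps = link_gbps * 1e9 * usable_fraction
    return capacity_bits / effective_bps / 3600

for tb in (1, 4, 16):
    print(f"{tb:>2} TB -> ~{recovery_hours(tb):.1f} hours")
```

Under these assumed numbers, 1TB already takes a couple of hours to rebuild, and fatter drives stretch that linearly, which is exactly why operators cap drive sizes instead.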

Award Pure a Participation Trophy

I was President of a Little League Baseball organization for several years. Every player under the age of 9 would get a “participation trophy” for showing up and trying. You see the same practice in most youth sports programs these days. I thought of this when Pure Storage announced its support for DirectFlash™ Fabric this week. The new feature lets the FlashArray//X connect to RDMA-capable Ethernet hosts over NVMe-oF. Just like Pavilion, they are targeting ‘modern stack’ applications that require massive parallelism, low latency, and high bandwidth. These applications rarely use shared storage since traditional SANs add…

Performance Density – A New Metric for Rack-Scale Design

I’m constantly amazed by the specsmanship of the data storage industry. Every month we hear about some new system that can achieve a gazillion IOPS or store hundreds of petabytes. We revel in our own glory, often without considering the consequences. The current examples are NVMe All-Flash Array and Software-Defined Storage (SDS) marketeers running amok. Our industry is consistently rewarded for storage capacity density: the number of TBs that can be crammed into a shelf or rack unit (RU). It is why drive makers constantly increase capacities for every form factor and Big Storage cleverly stuffs as many drives as possible…
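To make the contrast concrete, here is a minimal sketch of how capacity density and performance density could be computed side by side for a hypothetical shelf; the capacity, IOPS, bandwidth, and rack-unit figures are placeholders, not specs for any real product:

```python
# Capacity density vs. performance density for a hypothetical 2U shelf.
# All figures are placeholders for illustration only.

shelf = {
    "rack_units": 2,
    "capacity_tb": 368,        # assumed: 24 x ~15.36 TB NVMe SSDs
    "read_iops": 4_000_000,    # assumed aggregate 4K random read IOPS
    "bandwidth_gbps": 80,      # assumed aggregate throughput
}

capacity_density = shelf["capacity_tb"] / shelf["rack_units"]       # TB per RU
iops_density = shelf["read_iops"] / shelf["rack_units"]             # IOPS per RU
bandwidth_density = shelf["bandwidth_gbps"] / shelf["rack_units"]   # Gbps per RU

print(f"Capacity density:    {capacity_density:.0f} TB/RU")
print(f"Performance density: {iops_density:,.0f} IOPS/RU, "
      f"{bandwidth_density:.0f} Gbps/RU")
```

The point of the metric is simply to divide by rack units: a shelf that wins on TB/RU can still lose badly on IOPS/RU or Gbps/RU, and rack-scale design should weigh both.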

Set the Speed Dial for Pavilion Data

NVMe SSDs are the most expensive non-volatile storage in today’s data centers, and optimizing your ROI on NVMe can be tricky. Conventional wisdom says the highest performance and lowest latency come from installing the SSD directly in the server and scaling throughput or IOPS in parallel by adding more servers, each with its own NVMe SSD. However, the IOPS or throughput an SSD can deliver is directly proportional to the amount of NAND flash it contains and its power settings. If your workload requires a small amount of storage, as most databases do, but your need…
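A hedged sketch of that trade-off: if each DAS server contributes a fixed slice of IOPS and capacity, hitting an IOPS target can force you to buy far more flash than the dataset needs. The per-server numbers below are assumptions for illustration, not a real SKU:

```python
import math

# Assumed per-server DAS building block (illustrative only).
PER_SERVER_IOPS = 500_000      # assumed usable 4K random read IOPS per server
PER_SERVER_TB = 8              # assumed NVMe capacity per server

def das_servers_needed(target_iops: int, dataset_tb: float) -> dict:
    """Servers required when IOPS and capacity can only scale together."""
    for_iops = math.ceil(target_iops / PER_SERVER_IOPS)
    for_capacity = math.ceil(dataset_tb / PER_SERVER_TB)
    servers = max(for_iops, for_capacity)
    return {
        "servers": servers,
        "provisioned_tb": servers * PER_SERVER_TB,
        "stranded_tb": servers * PER_SERVER_TB - dataset_tb,
    }

# Example: a small but hot database -- 2 TB of data, 3M IOPS target.
print(das_servers_needed(target_iops=3_000_000, dataset_tb=2))
# -> 6 servers, 48 TB provisioned, 46 TB of flash the workload never touches.
```

Under these assumptions the performance requirement, not the data, dictates the spend, which is exactly the gap disaggregated NVMe-oF designs like Pavilion’s aim to close.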

Supercomputing Architecture for Any Data Center

Mapping the Human Genome was just the beginning for the life sciences discipline of bioinformatics. Now the goal is to compare every person’s genomic map against a perfectly healthy baseline to identify the DNA sequences that carry diseases like diabetes, asthma, migraine, and schizophrenia. However, even with the most powerful compute, network, and storage resources on the planet, full genomic map comparisons are not possible. The International HapMap Project offers a more efficient way to isolate genetic mutations, using massively parallel processing with Hadoop® MapReduce and Spark® analytics to obtain statistically significant comparisons. HapMap would not exist without…
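As a rough illustration of what a massively parallel comparison looks like in practice, here is a minimal PySpark sketch that tallies variant frequencies in a cohort and contrasts them with a healthy baseline; the HDFS paths and column names (sample_id, position, allele) are hypothetical, not the HapMap schema:

```python
# Minimal sketch: compare variant frequencies in a cohort against a baseline.
# Paths and column names (sample_id, position, allele) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("variant-frequency-sketch").getOrCreate()

def allele_frequencies(path: str):
    """Fraction of samples carrying each (position, allele) pair."""
    df = spark.read.parquet(path)   # columns: sample_id, position, allele
    n_samples = df.select("sample_id").distinct().count()
    return (df.groupBy("position", "allele")
              .agg((F.countDistinct("sample_id") / F.lit(n_samples)).alias("freq")))

cohort = (allele_frequencies("hdfs:///genomes/cohort/")        # hypothetical path
          .withColumnRenamed("freq", "cohort_freq"))
baseline = (allele_frequencies("hdfs:///genomes/baseline/")    # hypothetical path
            .withColumnRenamed("freq", "baseline_freq"))

# Variants markedly more common in the cohort than in the healthy baseline.
diffs = (cohort.join(baseline, ["position", "allele"], "left")
               .fillna({"baseline_freq": 0.0})
               .withColumn("delta", F.col("cohort_freq") - F.col("baseline_freq"))
               .filter(F.col("delta") > 0.05)
               .orderBy(F.col("delta").desc()))

diffs.show(20)
```

Each stage (the group-bys, the join, the filter) shards naturally across many servers, which is why this style of analytics leans so heavily on the underlying compute and storage architecture.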