Future-Proofing Your Cloud’s Flash Storage – Evaluate Your Cloud’s Needs

In today’s post we discuss why you should avoid flash array vendors that don’t provide the features your cloud needs. After all, why reward a vendor that doesn’t support de-duplication or non-disruptive updates at this late date?

By now everybody knows the advantages of flash over traditional spinning disk when it comes to delivering sheer IOPS, low latency and higher throughput. Every flash-based company now offers excellent performance. We need to stop talking so much about IOPS and low latency, because pretty much all the flash vendors deliver them, and the flash storage companies that talk the most about ‘millions’ of IOPS are often the ones not offering meaningful storage and software features coupled with that speed. It’s not enough to offer merely IOPS – the industry has matured in a very short time.

Let’s now turn our attention to some of the features. In our last blog post we saw how Plexxi changed the complexion of the virtualization/cloud network and storage problem. It’s important to look at what companies like Plexxi, SolidFire, Nutanix and others do in order to start imagining what we need to think about when building a cloud that is resilient and can withstand the intense pressures it will face over time. At the heart of this discussion will be the notion of future-proofing your architecture.

These features should act as your talking points with any flash vendor, and we should think outside of any narrow box and include a wide range of technologies that will better allow us to future-proof our cloud. Some of the points below serve as a way to converse about needs versus product. If a particular flash vendor doesn’t have these features and is not even considering them, consider what you are trying to achieve, not what they are trying to sell you. You want to architect your cloud for the future.

One basic question is: PCIe flash card or flash/SSD array? Flash and SSD arrays are being used within enterprise and cloud settings, but a considerable number of companies use PCIe flash cards inside their servers to run databases and applications. In two separate posts – Recommended Reading : Reference Architecture for Oracle RAC with Fusion IO’s ION Data Accelerator Software and Dell r720 Servers and Recommended Reading : Oracle (RAC) Cluster on Virident FlashMAX II and vShare Solution Brief – we started to look at how PCIe flash cards from Virident, Fusion IO and Micron have changed the flash landscape. This week Fusion IO delivered over 2.5 million sustained IOPS running Oracle database in a T10 P1 setting, with four Oracle 11g database nodes connecting to four ION nodes (12 RU total). Virident was also at Oracle World showing off a 4.8 TB PCIe flash card in conjunction with Oracle environments.

As you can see, PCIe flash cards are being used to power cloud platforms. Architecture matters, so it is wise to look at what others are doing rather than jump quickly one way or the other.

There is a sea change happening with regard to flash arrays and storage systems. Where last year there were only a few, this year you have lots of really fast, high-quality storage systems to choose from. Importantly, last year the main point of interest was speed; this year it is that speed coupled with cloud features:

  • Scale-out Native Clusters. Vendors that can cluster many arrays or nodes into one cluster with a contiguous view of storage have a huge advantage over those that cannot. Companies I am aware of that do this: SolidFire and Hitachi. This will become a must-have. Think about it: you have the choice to aggregate storage nodes into one contiguous view, as opposed to having to live with small data islands that may be insufficient.
  • Quality-of-Service. Allows you to limit or guarantee IOPS to your VMs. In highly virtualized environments or clouds this keeps VMs from over-consuming IOPS through minimum, maximum and (temporary) burst thresholds, and VMs can be guaranteed IOPS performance levels. This makes it possible to establish SLAs around storage performance and lets cloud providers monetize these guarantees. One company that does this extremely well is SolidFire. In a heavily virtualized cloud setting this will become a must-have.
  • In-line Data Reduction. What if you got in-line deduplication and compression with your flash storage? In other words, what if you could transparently and dramatically reduce your storage requirements? Some flash vendors have this: Skyera, Pure Storage, SolidFire, Nimbus Data, NexGen and Nutanix. One example from Pure Storage shows how moving to their flash array allowed data reduction technologies to decrease the storage by a factor of 9 to 1. This is a must-have today.
  • Thin Provisioning. In a highly virtualized cloud this is a pretty obvious feature that is a must-have today.
  • Redundancy. Array or cluster redundancy is extremely important. Whether it is an array built with HA in mind or arrays that are interconnected to offer redundancy, this is a must-have. You want high availability in your cloud. An example of this type of redundancy is Nimbus Data’s new Gemini arrays, which are completely redundant and hot-swap-everything.
  • Non-disruptive upgrades are absolutely critical in a cloud setting. Don’t let vendors tell you otherwise. Don’t put your admins in the position of having to do things that the storage system should be doing for you. If you need to upgrade to a new version of the storage operating system, how is this done without taking an outage? Some vendors do not offer this feature, and it is worthwhile to consider that the cloud you’re building cannot afford to take an outage, nor can you afford to architect around a missing feature – this becomes costly.
  • Native Authentication.  Connect accounts to storage volumes for both security and monetization reasons. Storage systems like SolidFire provide this feature within their storage system.
  • Snapshots, Clones and Replication.  Every flash vendor has these or should. Worth double-checking – you would be surprised, there are some that don’t.
  • Storage APIs. These are critical. Clouds are increasingly automated to the point of custom self-automation. You need to be able to script or program the storage system. Is it possible to programmatically execute all the storage system’s functions and integrate them into a custom automated cloud management system?
  • Cloud standards. Does it comply with OpenStack or CloudStack? Or are you using VMware storage APIs? Some vendors are seriously committed to standards – they are actively engaged in consortiums like the OpenStack consortium. Most vendors target VMware. Like other vendors, Hitachi is very focused on VMware – they highlight a number of VMware APIs that they inter-connect with.
  • Data Protection. Are arrays/nodes using RAID or RAID-like approaches to prevent data corruption? It is worth the time to look at what the different vendors are using.
  • Virtualization Support. What type of virtualization are you using? Does the flash storage system you are considering offer deep integration with it?
  • File System protocols. Does the flash storage system support NFS, FTP and SMB 3.0? Are you using these?
  • Cloud-scale Monitoring. Is it easy to monitor your cloud storage?
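
To make the Quality-of-Service item above more concrete, here is a minimal sketch of how minimum, maximum and burst thresholds might interact. It is not any vendor's API: QosPolicy, allocate_iops, the volume names and every number are assumptions made up for illustration, and a real scheduler would share spare capacity far more carefully than this first-come loop does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosPolicy:
    """Hypothetical per-volume QoS settings; field names are illustrative."""
    min_iops: int    # guaranteed floor
    max_iops: int    # sustained ceiling
    burst_iops: int  # temporary ceiling while a volume is allowed to burst


def allocate_iops(policies, demand, cluster_iops, bursting=()):
    """Grant IOPS per volume: guaranteed minimums first, then the remainder."""
    # Pass 1: every volume gets what it asks for, up to its guaranteed minimum.
    grants = {vol: min(demand[vol], p.min_iops) for vol, p in policies.items()}
    remaining = cluster_iops - sum(grants.values())
    if remaining < 0:
        raise ValueError("guaranteed minimums exceed cluster capacity")

    # Pass 2: hand out what is left, in dictionary order, without exceeding
    # each volume's ceiling (burst ceiling if bursting, otherwise max).
    for vol, p in policies.items():
        ceiling = p.burst_iops if vol in bursting else p.max_iops
        extra = min(min(demand[vol], ceiling) - grants[vol], remaining)
        extra = max(extra, 0)
        grants[vol] += extra
        remaining -= extra
    return grants


# Made-up volumes and numbers, purely for illustration.
policies = {
    "db-volume": QosPolicy(min_iops=5_000, max_iops=15_000, burst_iops=20_000),
    "web-volume": QosPolicy(min_iops=1_000, max_iops=3_000, burst_iops=5_000),
}
demand = {"db-volume": 12_000, "web-volume": 10_000}
grants = allocate_iops(policies, demand, cluster_iops=14_000)
print(grants)
```

The point of the sketch is the ordering: guarantees are honored before anyone receives extra IOPS, which is what lets a provider put an SLA (and a price) on the minimum.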

Some thoughts along the way to creating a future-proofed cloud –

  • TRY AND AVOID vendors that do not have non-disruptive upgrade in their array. Without it you put your storage administrators in a bad place.
  • TRY AND AVOID vendors that do not provide native clustering of their arrays/nodes into one contiguous storage space.
  • TRY AND AVOID vendors that do not provide inline de-duplication, compression and thin provisioning.
  • TRY AND AVOID vendors that do not have either redundancy in their array or redundancy within a cluster. They should have one of these at least. Some have both.
  • TRY AND AVOID vendors that do not provide snapshots and clones. Consider how important these are for quick point-in-time copies of volumes, or for virtualization with linked-clone virtual machines in VMware.
  • TRY AND AVOID vendors that do not focus on automation. Most vendors have storage APIs and CLIs that make them scriptable and lend themselves to automation.
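
To show why the inline de-duplication and compression item in the list above matters so much, here is a rough sketch of block-level data reduction using only Python's standard library. It assumes fixed 4 KB blocks, SHA-256 fingerprints and zlib compression; real arrays use their own block sizes, fingerprinting and compression schemes, and reduce_blocks is a name invented for this example.

```python
import hashlib
import zlib


def reduce_blocks(data: bytes, block_size: int = 4096):
    """Return (logical_bytes, stored_bytes, unique_blocks) after dedup + compression."""
    store = {}  # fingerprint -> compressed block, kept once per unique block
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in store:
            # Only previously unseen blocks consume space, and they are
            # compressed before being stored.
            store[fingerprint] = zlib.compress(block)
    stored = sum(len(compressed) for compressed in store.values())
    return len(data), stored, len(store)


# Ten logical blocks, but only two distinct ones (think cloned VM images).
sample = b"A" * 4096 * 8 + b"B" * 4096 * 2
logical, stored, unique = reduce_blocks(sample)
print(f"{logical} logical bytes, {unique} unique blocks, {stored} bytes stored")
```

Highly repetitive data such as this sample reduces far better than real workloads will, so treat any ratio it prints as an upper bound on optimism rather than a forecast.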

If you are trying to create a cloud that has to meet customer SLAs and/or you are trying to monetize storage performance:

Finally, watch out with regard to serviceability:

  • BEWARE and note that arrays/nodes that need to be serviced from the top create issues. Servicing a flash storage system from the top means it has to be pulled out of the rack. This may be unavoidable, but it is worth paying attention to.
  • BEWARE of how the SSDs or PCIe modules get pulled out – attention needs to be paid to safe removal and insertion of these modules. Consider the removal latches used by DIMMs as an example of making it easier.
  • PAY ATTENTION to servicing of the storage system – it should not be surprising how important this is.

All of these elements may reduce your vendor list, but you should try to get the best cloud storage you can at reasonable price points. Keep in mind that you might be tempted to buy an array that advertises millions of IOPS when, over the life of the array, you may only need mid or high hundreds of thousands of IOPS; even better, you can choose to grow your flash storage cluster incrementally as needed. The days of simply buying flash arrays that are “fast” have quickly come and gone – speed alone doesn’t help your cloud or your storage administrators. If you put your storage administrators in a position where they are without non-disruptive upgrades, they have to take an outage and possibly move the data from one array to another, upgrade and then move it back. Taking an outage adds to everyone’s workload and is completely avoidable. Remember, if you only pay attention to IOPS you will end up with, well, only IOPS. You may not be able to allocate these IOPS in a granular way, upgrade the hardware that gives you these IOPS without a major disruption, reduce duplicate copies of files and images, or non-disruptively grow these IOPS in one or more big clusters. But you will have IOPS. Today there are a number of vendors that support the critical storage features that keep you out of situations where you need to take outages simply to upgrade, or use up valuable storage simply because the array doesn’t do de-duplication.
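
The sizing argument above is simple arithmetic, and a few lines are enough to sketch it. The forecast, the 50,000 IOPS per node and the 20% headroom below are all made-up assumptions; substitute figures measured on your own workload and hardware.

```python
def nodes_needed(required_iops, iops_per_node, headroom_pct=20):
    """Smallest node count that serves required_iops plus spare headroom."""
    target = required_iops * (100 + headroom_pct)
    capacity = iops_per_node * 100
    return -(-target // capacity)  # integer ceiling division


# Hypothetical demand forecast and per-node performance.
forecast = {"year 1": 300_000, "year 2": 450_000, "year 3": 700_000}
plan = {year: nodes_needed(iops, iops_per_node=50_000)
        for year, iops in forecast.items()}

for year, nodes in plan.items():
    print(f"{year}: {forecast[year]:,} IOPS -> {nodes} nodes")
```

With a scale-out cluster you buy the year 1 node count now and add nodes later, instead of paying up front for millions of IOPS you may never use.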

Create a rudimentary cheat sheet like the one below to help you.

[Image: aimhigh, a cheat sheet comparing the features of two flash arrays]

In this comparison the array on the right has substantially more features than the other array. The humorous aspect is that the “cloud-focused” SSD array has better support for enterprise features than the “enterprise”-focused array. These are but a few of the considerations; there are many more criteria that should be examined, such as the data protection technology the vendor is using, service considerations, or whether the company in question has a focus on QA. We will spend more time looking at this in future posts.
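
If you would rather keep such a cheat sheet as a small script than as a picture, one possible sketch follows. The feature names come from the lists earlier in this post; the two arrays and what they support are invented placeholders, not the products in the comparison above.

```python
# Must-have features taken from the checklist in this post.
FEATURES = (
    "scale-out clustering",
    "quality of service",
    "inline data reduction",
    "thin provisioning",
    "redundancy",
    "non-disruptive upgrades",
    "snapshots and clones",
    "storage APIs",
)

# Hypothetical arrays: fill these sets in from vendor conversations.
arrays = {
    "enterprise array": {"thin provisioning", "redundancy", "snapshots and clones"},
    "cloud array": set(FEATURES),
}


def missing_features(arrays):
    """Map each array name to the must-have features it lacks."""
    return {name: [f for f in FEATURES if f not in supported]
            for name, supported in arrays.items()}


gaps = missing_features(arrays)
for name, lacking in gaps.items():
    have = len(FEATURES) - len(lacking)
    print(f"{name}: {have}/{len(FEATURES)} features")
    for feature in lacking:
        print(f"  missing: {feature}")
```

Keeping the checklist in one place makes it easy to add rows, such as data protection technology or serviceability, as your list of criteria grows.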

 
 


Go to more posts on storage and flash storage at http://digitalcld.com/cld/category/storage.