Why Lego Makes Sense in Toys, Software, Servers and Storage

Update : For my most recent post on the new class of flash arrays that fully support deduplication and a full range of storage features, see Recommendations for All-Flash Storage Arrays; Aiming Beyond Simply IOPS and Low Latency.

Legos are remarkable. You can build almost anything with them; however, that is not their only remarkable aspect. The underlying lesson is that from very small interoperable parts you can efficiently build very big things. The lego approach has found its way into software practices and into servers as well. Good software design emphasizes small generic functions that can be used extensively and flexibly by other functions.

On the server front, while very large servers still exist and serve many useful functions, the majority of data centers are building clouds from a large number of small servers. These servers are 1RU or 2RU at most. And yes, we see the larger 4RU and greater sized servers, but they are infrequent and very specific in function. For example, if you check out Hadoop in data centers, you will discover that it runs on these smaller servers in clustered groups. If you check out database-as-a-service, increasingly it runs on an architecture of small servers and small storage units. Software such as Mesos and Chef makes it easier to provision, manage and create clouds populated by hundreds and thousands of small servers. The sweet spot is the two-processor Intel server. Oracle’s Exadata architecture is a database poster child for this approach. A popular version of Exadata comprises 14 storage servers and 8 database servers. Going from an eighth rack to a full rack proceeds in a lego-like manner. The reality is that operating system software, application software and small servers have been combined to offer architectures that can do very big things. Increasingly, software plays an ever bigger role in everything.

In storage, the lego approach is a big win for enterprise and cloud providers because it offers extreme scale-out and flexibility when combined with software, storage features and fast underlying hardware.
Yes, you could get a 3RU or 18RU array architecture, but the win is in building architectures with smaller 1RU arrays that perform elastically, resiliently and flexibly like legos, and that combine into large storage architectures and scale-out storage spaces offering much more than is possible from a single large storage array. When I say more, I don’t necessarily mean IOPS: all these flash storage arrays provide more IOPS than are needed, and arguably hybrid arrays also provide excellent performance. More important, the architects can grow their architecture on an as-needed basis. They don’t have to buy huge blocks of expensive flash storage if they don’t need them.

Let’s look at an example of the lego model. I see SolidFire’s all-flash storage nodes as employing a lego-like model. A starter configuration includes five 1RU storage nodes with complete data protection across the storage nodes. The key point is that data protection doesn’t just happen at the node or array level; it takes place at the cluster level. Because of the operating system software, each node offers features that become more powerful as the nodes are aggregated.

First, you can add new nodes and aggregate the flash storage into a single view and a single pool. Five, six, seven, ten, or one hundred nodes can be pulled together to create large storage pools if desired. This is key: you can add new nodes as demand dictates, resulting in immediate capacity and increased performance. You add these nodes (or remove them) with no downtime and with minimal performance effects. This scale-out behavior allows you to go from five to one hundred nodes. SolidFire’s petabyte scale-out goes well beyond the 280 TB of some competing systems; it reaches 3.4 PB. Second, you can guarantee performance levels because SolidFire offers quality-of-service features. Third, each node can be upgraded non-disruptively. Fourth, all the usual data reduction suspects are available: thin provisioning, deduplication and compression. Fifth, as in most clouds, automation is key: REST APIs and an advanced user interface allow automation of the storage cloud. Sixth, failure happens, and these nodes provide redundancy so data is not lost. Seventh, real-time replication is offered to cope with potential disaster recovery scenarios. Eighth, these nodes come with snapshot and recovery software. Ninth, high availability is provided in a distributed manner. Tenth, these lego nodes offer encryption. Finally, and importantly, you can mix older and newer storage nodes in the same pool.

This lego-like model fits what has happened with servers and where enterprise and cloud designs, such as the new engineered systems, are heading. SolidFire is only one example of a flash storage vendor that is approaching this the right way. There are others that are adopting this lego model.
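
To make the "single view, single pool" idea concrete, here is a minimal sketch in Python. It is a toy model, not SolidFire's software or API: the Node and Cluster classes, the node names and the per-node capacity and IOPS figures are all hypothetical placeholders, and the pool simply sums its members, ignoring the capacity a real cluster would set aside for data protection.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One 1RU storage node. All figures used below are made-up placeholders."""
    name: str
    capacity_tb: int
    iops: int


@dataclass
class Cluster:
    """A pool that presents every member node as one aggregate view."""
    nodes: List[Node] = field(default_factory=list)

    def add(self, node: Node) -> None:
        # Adding a brick immediately grows the pool.
        self.nodes.append(node)

    def remove(self, name: str) -> None:
        # Removing a brick shrinks the pool; the other nodes are untouched.
        self.nodes = [n for n in self.nodes if n.name != name]

    @property
    def capacity_tb(self) -> int:
        # Naive sum: assumes no capacity is reserved for redundancy.
        return sum(n.capacity_tb for n in self.nodes)

    @property
    def iops(self) -> int:
        # Naive sum: assumes performance scales linearly with node count.
        return sum(n.iops for n in self.nodes)


# A hypothetical five-node starter configuration.
cluster = Cluster()
for i in range(1, 6):
    cluster.add(Node(f"node-{i}", capacity_tb=10, iops=50_000))
print(cluster.capacity_tb, cluster.iops)

# Mixing in a newer, larger node later grows the same pool.
cluster.add(Node("node-6", capacity_tb=20, iops=75_000))
print(cluster.capacity_tb, cluster.iops)
```

The point of the sketch is only the shape of the model: capacity and performance are properties of the cluster, not of any one box, so growth happens one small brick at a time.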

In the end, most clouds and many enterprises care about up-front costs, floor and rack space consumed, non-disruptive upgrades, power and cooling costs, performance, latency, capacity, data resiliency and high availability. Just as importantly, they care about enterprise features like extreme scale-out storage pools, quality-of-service tunability, deduplication and the ability to add arrays or nodes incrementally without incurring huge costs. The lego approach to storage delivers on this.

Go to more posts on storage and flash storage at http://digitalcld.com/cld/category/storage.


Recommended Reading : PostgreSQL Performance Benchmark Whitepaper : Comparing Joyent and AWS EC2

Joyent has announced a very interesting benchmark and an accompanying whitepaper. Joyent has optimized Postgres to run fast in their cloud. In head-to-head benchmark tests using standard Master/Slave Postgres configurations, according to Joyent, “Joyent’s virtual databases completed tasks up to 15X faster than a comparable Postgres multi-node virtualized database configurations running on Amazon Web Services.” It is a very interesting read: their performance was very good, even beating AWS SSD configurations.



Recommended Learning : 4clojure

If you are interested in learning Clojure, 4clojure is a nice site that can bring you up to speed. The site hosts a large number of problems to be solved, and you can work your way through them.


There is also a good book :



Sea Change : New Gartner Report on Flash Storage in 2013 - IBM, Pure Storage come in #1 and #2

In September of last year I made a small prediction: with a slew of new and more agile players competing in the flash storage array area, the newcomers would pose a big problem for the leaders at the time. Specifically, I pointed to strong new storage competitors and storage/software features as reasons why things would change. I said the odds were that the next Gartner Report would look very different from the last one. And it does. You can find the original post at Sea Change : For a Gartner Report on Flash Storage : Last Year was, well, last year.

We have been in the midst of a ‘sea change’ in the area of flash storage and specifically flash storage arrays. Realistically, the 2012 report was highly unusual in that many of the big players had not yet entered flash storage. It was also unusual because a number of smaller ones were just gaining traction (Kaminario, SolidFire, etc.). Gartner has just released the latest flash storage report, and in terms of market share IBM and Pure Storage have taken the #1 and #2 spots. EMC and NetApp did reasonably well, coming in at #4 and #5. SolidFire, which was not even on the map in 2012, has done a good job, coming in close to Cisco and ahead of HP.





Recommended: Flash Accelerating Oracle RAC in 16RU & VMware Power Savings with Local Flash

Fusion-io, Dell, and Mellanox have built a very small footprint (16RU), extremely fast, and price-sensitive reference architecture for Oracle RAC 12c. It is very impressive on a number of counts.  It further highlights the move to using local flash storage over remote flash storage arrays.  This solution aims at reducing hardware sprawl and power consumption by achieving higher performance in a smaller footprint.  From the description – “This reference architecture features four Fusion ION Accelerators and four Oracle RAC database nodes all connected through redundant Mellanox SX6036 switches. Each ION Accelerator in this reference architecture uses a Dell PowerEdge R720 with three 56 Gbps Mellanox ConnectX-3 InfiniBand cards and four industry-leading Fusion ioDrive2 2.4TB flash storage devices for a total of 9.6TB of all flash storage. Each of these ION Accelerator units fits in 2 rack units (2RU) and delivers well over 645,000 8K database IOPS and up to 12 GB/s sustained throughput. Data redundancy is maintained across pairs of ION Accelerators using synchronous writes, thus providing system and data high availability.  The Oracle RAC nodes consist of Dell PowerEdge R620 servers. Each two-socket server consumes a mere 1U of rack space and yet is capable of pulling 1.4 million 8K IOPS from the ION Accelerator storage layer as measured by the Flexible IO Tester utility.  The Oracle RAC nodes are connected to the ION Accelerators through redundant Mellanox switches. Each Oracle RAC node sees the ION Accelerator storage as simple multipath block storage. The multipath devices are aggregated by Oracle ASM to create a large and powerful diskgroup.  Expanding the size and performance of the database is as easy as adding more ION Accelerator devices to the ASM disk group.”  The links to the article and the Reference Architecture follow.



Increasingly, software vendors  see the virtue of putting their flash storage locally on the server.  For example, VMware engineers wrote recently that by replacing array shared storage with PCIe flash card storage on the servers they could substantially reduce power consumption and maintain high levels of performance.
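
Returning to the Fusion-io reference architecture quoted above, the per-unit figures can be rolled up with some simple arithmetic. The sketch below uses only numbers from the quoted description; the two roll-up rules are my own assumptions, not claims from the vendors: that usable capacity is half of raw because data is written synchronously across pairs of accelerators, and that IOPS add up linearly across the four accelerators.

```python
# Figures taken from the quoted reference architecture description.
ACCELERATORS = 4
DRIVES_PER_ACCELERATOR = 4
DRIVE_CAPACITY_GB = 2400        # one ioDrive2 2.4TB device
IOPS_PER_ACCELERATOR = 645_000  # "well over 645,000 8K database IOPS"
RU_PER_ACCELERATOR = 2
RAC_NODES = 4
RU_PER_RAC_NODE = 1


def summarize():
    """Roll the per-unit figures up to the whole configuration."""
    per_accelerator_gb = DRIVES_PER_ACCELERATOR * DRIVE_CAPACITY_GB
    raw_gb = ACCELERATORS * per_accelerator_gb
    return {
        "per_accelerator_gb": per_accelerator_gb,
        "raw_gb": raw_gb,
        # Assumption: mirrored pairs hold two copies, so usable is half of raw.
        "mirrored_gb": raw_gb // 2,
        # Assumption: a naive linear sum, treated as a rough lower bound.
        "aggregate_iops": ACCELERATORS * IOPS_PER_ACCELERATOR,
        # Servers and accelerators only; the quoted text does not itemize
        # the rest of the 16RU footprint.
        "compute_and_storage_ru": (ACCELERATORS * RU_PER_ACCELERATOR
                                   + RAC_NODES * RU_PER_RAC_NODE),
    }


for key, value in summarize().items():
    print(f"{key}: {value:,}")
```

Under those assumptions the four accelerators hold 38.4TB of raw flash in 8RU, and the database servers add another 4RU, which is the lego pattern again: small identical units stacked until the job is covered.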




Clojure and Messaging : Immutant 2

Immutant 2 is a customized version of the JBoss Application Server 7. It provides an integrated platform for web, messaging, scheduling, caching, daemons, transactions, and more. It repackages AS7 with additional modules that support Clojure applications, similar to what the TorqueBox project does for Ruby applications. It provides an API which leverages HornetQ, which is an implementation of JMS. JMS provides two primary destination types: queues and topics. Queues represent point-to-point destinations, and topics represent publish/subscribe destinations. If this is something you are interested in, it is worth examining because it provides messaging for non-trivial applications.
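
The difference between the two destination types is easy to see in a toy model. The sketch below is plain in-memory Python, not the Immutant, HornetQ or JMS API; the Queue and Topic classes and their method names are hypothetical and exist only to show the delivery semantics.

```python
from collections import deque


class Queue:
    """Point-to-point: each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def receive(self):
        # Whoever asks first gets the message; after that it is gone.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Publish/subscribe: every current subscriber gets its own copy."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


# Queue: two workers compete, only one of them sees the job.
jobs = Queue()
jobs.publish("resize-image-42")
print(jobs.receive())  # the first worker gets the job
print(jobs.receive())  # the second worker finds the queue empty

# Topic: both listeners are notified of the same event.
audit_log, dashboard = [], []
events = Topic()
events.subscribe(audit_log.append)
events.subscribe(dashboard.append)
events.publish("user-signed-up")
print(audit_log, dashboard)
```

Real brokers add durability, acknowledgements and transactions on top of this, but the choice between the two types still comes down to whether a message is work to be done once or news to be heard by everyone.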