Climbing Over the Walls You Have Built : Extending Your Corporate Network to the Cloud (Part 1)

The move to the cloud is on.  Increasingly, even companies that must comply with corporate and national privacy and security standards, such as HIPAA, are looking at ways to extend their company networks to include auto-scaling clouds while still abiding by those standards.  With the availability of sophisticated cloud providers such as AWS, Azure, Joyent and others, it is increasingly attractive for companies to burst out of their corporate networks and use these clouds transparently.

In thinking about this, the practical question is how to do it. A number of cloud providers continue to work on stretching on-premises infrastructure into the cloud. Microsoft Azure’s work illustrates why companies are looking at merging the on-premises datacenter with the cloud. Microsoft has provided an example of how a company could extend its on-premises datacenter with the Azure cloud.  Their interactive diagram lets you turn information layers on and off within the datacenter/cloud architecture. You can see the example in the image below.

cloudburst

The diagram above shows an example of how companies can now securely merge the Azure cloud with their company network. You can find more details on this at the link below.

microsoftazure

There are all sorts of challenges, but companies like Microsoft are increasingly delivering ways to securely extend corporate networks into auto-scaling clouds.

Another company that enables bursting to the cloud is Cloudian.  Their focus on an enterprise hybrid cloud lets corporate networks connect safely with clouds. Cloudian’s product, HyperStore, combined with the Amazon cloud, provides a next-generation hybrid IT cloud: a 100 percent S3-compliant hybrid cloud storage platform. Dynamic migration from on-premises physical storage to off-premises cloud storage allows near-infinite capacity scaling while meeting the security and cost requirements of enterprise environments. Service providers who offer multi-SLA storage services also benefit from this hybrid structure. You can read more about it :

screenshot_546

In the next extending-into-the-cloud post, we will look at extending Microsoft SQL Server into the cloud.  SQL Server 2016, when it arrives, will encrypt all data by default and is integrated with the R statistical programming language. More interestingly, it allows a stretch into the Azure cloud.  More on this in the next post.  In the post that follows, we will also discuss HIPAA cloud providers and whether they can remain relevant in the face of substantial improvements in merging on-premises networks with clouds.

Recommended Reading : PostgreSQL Performance Benchmark Whitepaper : Comparing Joyent and AWS EC2

Joyent has announced a very interesting benchmark and an accompanying whitepaper.  Joyent has optimized Postgres to run fast in their cloud and tested it in head-to-head benchmarks using standard Master/Slave Postgres configurations. According to Joyent: “Joyent’s virtual databases completed tasks up to 15X faster than a comparable Postgres multi-node virtualized database configurations running on Amazon Web Services.”  It is a very interesting read; their performance was very good, even beating AWS SSD configurations.

joyentwpaws

 

OpenStack Announcement : SolidFire/Dell/Red Hat Unleash SolidFire Agile Infrastructure (Flash Storage-Based) Cloud Reference Architecture

Very, very cool announcement today from SolidFire, Dell and Red Hat. Everyone interested in the next-generation datacenter should listen to this excellent keynote from SolidFire’s CEO.  They are way, way ahead of the rest of the flash storage crowd.  Some vendors, whose native operating systems don’t even support extreme scale-out and quality-of-service, announce their belated participation in the OpenStack Foundation with virtually no substantial meat. SolidFire, by contrast, delivered a strong announcement: a real-world, pre-tested, pre-validated Dell/Red Hat/SolidFire reference architecture for building a flash-based cloud. And they are not new to OpenStack; they have been supporting it for some time.

In this talk, two eBay engineers, Subbu Allamaraju (Chief Engineer, Cloud) and John Brogan (Cloud Storage Engineering), discussed with Dave Wright (CEO, SolidFire) the challenges that moved them to look at OpenStack and how eBay is using OpenStack today.  It is definitely worthwhile to listen to these knowledgeable eBay engineers explain why OpenStack is important.

Then Dave Wright discusses the newly created OpenStack cloud reference architecture.  You can find the SolidFire Agile Infrastructure (AI) Reference Architecture here :

soldfireAICloud

Dave goes into some detail on the reference architecture for a scale-out cloud. Two executives from Dell and Red Hat also joined him on stage to further discuss the new reference architecture.

Recommended Viewing : The BlueKai Playbook for Scaling to 10 Trillion Transactions a Month

Good talk on delivering a highly scalable solution. Ted Wallace, VP of Data Delivery at BlueKai, discusses how BlueKai scales to 10 trillion data transactions per month.  BlueKai provides data-driven marketing and as a result needs highly scalable solutions. He provides some good details: they use Aerospike to get high database performance, with average read/write response times between 1 and 2 ms. There are six Aerospike clusters with 6 to 10 servers in each of three geographically distributed data centers. They use standard Linux hardware with four Intel 800 GB SSDs in each server and 128 GB to 256 GB of RAM. Lots more details in the talk.  Select the image to go to the talk.

Aerospike_video
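
To put the headline number in perspective, a quick back-of-the-envelope conversion helps. The sketch below assumes a 30-day month and perfectly even traffic (both are my assumptions, not figures from the talk) and converts 10 trillion transactions per month into an average per-second rate. Real traffic will peak well above any such average.

```python
# Back-of-the-envelope: average rate implied by 10 trillion transactions/month.
# Assumptions (not from the talk): a 30-day month and perfectly even load.

TRANSACTIONS_PER_MONTH = 10 * 10**12
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds


def average_rate_per_second(transactions: int, seconds: int) -> float:
    """Return the mean number of transactions per second."""
    return transactions / seconds


rate = average_rate_per_second(TRANSACTIONS_PER_MONTH, SECONDS_PER_MONTH)
print(f"about {rate:,.0f} transactions per second on average")
```

Under those assumptions the average works out to a little under 3.9 million transactions per second, which gives a sense of why low-millisecond reads and writes matter at this scale.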

Architecture : SSD-Based Solutions Show Advantages In the NoSQL DB Tier (Video)

Today we look at the NoSQL database tier.  Some of this is taken from notes for a work-in-progress, An Introduction to Using High-Performance Flash-Based Storage in Constructing High Volume Transaction Architectures – A Manager’s Guide to Selecting Flash Storage.  This is not a complete look at Big Data, but rather a partial look at some of the things Aerospike, one of the more interesting NoSQL databases, is doing.

Aerospike and the NoSQL Database Tier.  As an alternative or in addition to the relational database tier, there is a NoSQL database tier. With the arrival in recent years of Big Data architectures, new elements for dealing with both structured and unstructured data have arrived, and with them some databases, like Aerospike, that offer an extremely high-performance solution in transaction-oriented environments.  This is quite a bit different from typical Hadoop implementations: one of Aerospike’s real differentiators is that it was built as an in-memory database.

Traditionally, this tier has run on spinning disks.  However, in the past few years, especially with the need for real-time information, there has been a move to SSDs and PCIe-based flash cards.  Aerospike’s NoSQL database provides a means to get those high-performance results. It is built to run in-memory or in-flash, on relatively low-cost clustered hardware with lots of memory, flash storage, or both.  It supports ACID properties and, as a NoSQL database, is built on a key-value store.

The example architecture below gives a partial glimpse into this tier: various transactions occur within applications, and Aerospike interacts with them. Note that in the App tier, Aerospike uses a Smart Client to communicate with the Aerospike cluster.

nosqlarch
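
The idea behind a cluster-aware (“smart”) client can be sketched in a few lines: the client hashes the key to a partition, looks up which node owns that partition, and talks to that node directly, with no proxy in between. The code below is a toy illustration of that routing idea only, not Aerospike’s client API. The class name, the node names, the SHA-1 hash, the partition count and the round-robin partition assignment are all assumptions made for the example.

```python
import hashlib

N_PARTITIONS = 4096  # illustrative partition count for this sketch


class ToySmartClient:
    """Hypothetical cluster-aware client that routes each key to its owner node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # Toy partition map: partitions are dealt out round-robin.
        # A real cluster would publish this map and update it as nodes join or leave.
        self.partition_map = {
            pid: self.nodes[pid % len(self.nodes)] for pid in range(N_PARTITIONS)
        }
        # In-process dicts stand in for the storage on each server.
        self.storage = {node: {} for node in self.nodes}

    def partition_for(self, key: str) -> int:
        """Hash the key and reduce it to a partition id."""
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % N_PARTITIONS

    def node_for(self, key: str) -> str:
        """Look up the node that owns the key's partition."""
        return self.partition_map[self.partition_for(key)]

    def put(self, key: str, value) -> None:
        self.storage[self.node_for(key)][key] = value

    def get(self, key: str):
        return self.storage[self.node_for(key)].get(key)


client = ToySmartClient(["node-a", "node-b", "node-c"])
client.put("user:42", {"segment": "sports"})
print(client.node_for("user:42"), client.get("user:42"))
```

Because the client itself holds the partition map, every read or write is a single hop to the owning node, which is one reason this style of client suits low-latency key-value workloads.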

Of course, the producing/consuming sources may vary dramatically: applications, web services, Hadoop clusters, mobile devices, weblogs, marketing data repositories and many more.   Aerospike is among the best-of-breed NoSQL databases. You can see an example of a typical deployment (from the Aerospike presentation below) :

aerospike100

And some of the Aerospike server deployments :

aerospike101

Aerospike supports ACID properties and a high-performance, clustered architecture.

aero_ssd

Of course, there are other databases, such as MongoDB, Cassandra and HBase, to name a few. Whether you choose a NoSQL database over a relational database depends wholly on what you are doing. The NoSQL database tier’s storage on these servers can use SSDs, flash PCIe cards and flash arrays.  Traditionally this tier has adopted a “shared-nothing” philosophy using spinning disks, SSDs or flash PCIe cards. Until recently, flash arrays seemed not only like overkill but also against the grain of the “shared-nothing” philosophy.  SSDs and cards like Micron’s P320h offer excellent performance and a price/performance advantage over arrays.  As flash prices drop, flash arrays are becoming a consideration in this tier, and there are a number of recent deployments leveraging flash arrays for the NoSQL DB tier.  Recently, Aerospike tested Micron’s P320h (SLC SSD) PCIe card.  It “blew away the competition” according to the people doing the testing.  You can read more here:

aero_ssd

More information on the P320h :

tomshwp320h

Note that Micron offers two versions of this card. One is a 2.5″ flash PCIe form factor, which is hot-swappable.  You can read more here :

digimicron

Competitors are not standing still: Virident, Fusion-IO and others have released, and continue to release, new cards that are worth looking at.

To understand what Aerospike is doing it is worth watching this video :

Aerospike_video

If you want to learn more, it is worth visiting Aerospike’s BrightTalk site :

brighttalk_aerospike

 


Go to more posts on storage and flash storage at http://digitalcld.com/cld/category/storage.


 

Recommended Viewing : Presentation Videos From O’Reilly Velocity 2013 Conference – Web Performance And Operations

The presentations from the O’Reilly Velocity 2013 Conference are available in video format.  If you don’t know what this conference is about :

  • Three days of concentrated focus on key aspects of web performance, operations and mobile performance.
  • Keynotes, tutorials and sessions
  • Experts, visionaries and industry leaders converge along with hundreds of web developers, sys admins and other web professionals all under one roof.

The slides :

velocity2013

In addition – a recent post on immutable servers :

cld_immut

 

Recommended Viewing : The Netflix Cloud and Cassandra

Netflix is doing some amazing things. If you use the service, you may know that it depends on Amazon Web Services, but their cloud practices transcend that dependency.  Adrian Cockcroft has delivered some really excellent talks explaining how they do what they do.

He also gave a nice talk on how they moved to Cassandra to do a lot of the heavy lifting.

He provided another very nice presentation at the Cassandra conference C*2012 about running Cassandra on AWS.

Interestingly, in the two Cassandra talks he discusses the use of SSDs to improve Cassandra performance. He talks about moving from 2 drives (1.7 TB) to 2 SSD volumes (2 TB) and shows results from a hard disk versus SSD comparison.  Netflix offers a number of Cassandra-related tools as open source, such as Priam (for Cassandra automation) and Astyanax (a client front-end into Cassandra), along with more projects (like Aegisthus, Zeno, Chaos Monkey, Zuul, Pytheas, etc).  Note that AppDynamics is used throughout these presentations.  One other project I’m aware of is STAASH, a non-JVM way of getting to the recipes in Astyanax.  You can follow all of this on the Netflix technical blog.

 

Also a post that may be of interest : Some Thoughts on Why We Want To Run Databases on Flash