Reference Architectures : MySQL Reference Architectures for Massively Scalable Web Infrastructure

Oracle has a whitepaper titled MySQL Reference Architectures for Massively Scalable Web Infrastructure.  It highlights reference architectures for small, medium, large and extra large (social media) deployments, and covers a variety of MySQL-related best practices. One has the feeling that if another company (Joyent or Percona) had written this, it would have included a number of other aspects, such as virtualizing MySQL.  It is a well-written overview of considerations that should be examined in deployments.


If you are looking for more detailed information, be aware that this whitepaper is a higher-level view of best practices around architectures.  A different, more detailed reference is an interesting benchmark of MySQL replication on multi-threaded slaves that shows a 5x performance improvement. It is a single data point on MySQL replication, but one that is of interest to me.


Recommended Reading : OLTP Performance of Exadata versus SSDs versus Violin Memory 6616 (flash) Array

Often, the question comes up of how solutions built around SSDs or Violin Memory arrays compare with Exadata.  A short time ago there was a nice benchmark that provided some answers.  Ashminder Ubhi provided some nice numbers for us to look at. A quick snapshot showing the big picture :


Figure 1 : Comparing three different approaches in a basic benchmark (SLOB OLTP)

The configurations in question were run using the SLOB benchmark, and the results might be surprising to some people.  Recommended reading for those interested in understanding performance differences under different workloads.


Talk : Scaling Pinterest

Pinterest is an interesting company which has had tremendous growth in a short time. In two years, they have gone from three engineers to forty engineers, and from one MySQL server to 88 MySQL databases, each of which has a slave (nice summary from  With that amazing growth, the talk covers the evolution of the software and hardware infrastructure.



Recommended Reading : NoSQL Benchmark Data Points, Aerospike 10x Faster

For those who are interested in NoSQL databases, there is an interesting NoSQL benchmark that Aerospike announced.   An article in InfoQ provides details on the benchmark.  I’ve actually spent some time with Aerospike and found it pretty nice.  I plan on spending some more time looking at this.  Thumbtack Technology has released two whitepapers with results.



Reference Architecture : All-Silicon Microsoft Fast Track Data Warehouse

If you are working with the Microsoft enterprise offerings, this may be of interest. The Microsoft Fast Track Data Warehouse guides are reference architectures that combine Microsoft SQL Server software with prescribed hardware configurations that have been tested and approved by Microsoft for data warehouse workloads.  More information can be found here.



Recommended Reading : In-Memory Databases and Flash

Recommended Reading.  There is a lot of chatter about in-memory databases. One of the nicest series I’ve seen on this topic is from the flashdba blog.  In part 1 there is a nice overview of storage : primary versus secondary and volatile versus persistent.


In part 2 the discussion moves to the databases themselves. There is a discussion of Exadata 3, Hana and flash memory.


In the third installment, the discussion is about why in-memory databases need flash.





Recommended Viewing & Reading : Cloud Analytics and Heatmaps

Recommended Viewing and Reading.  Doing cloud analytics and discovering latency issues is an interesting topic.  Joyent has tackled it using a combination of node.js tools and heatmaps, among many other tools.  The following is an excellent article from Brendan Gregg on heatmaps, and the other two videos demonstrate the use of heatmaps to do analytics on systems.







Recommended Reading : Comparing SmartOS Zones versus Linux KVM versus Linux Xen

RECOMMENDED READING.  To understand the performance differences between (Open)Solaris-based Zones and other forms of virtualization, it is worth reading Brendan Gregg’s article on the topic.  One thing that I have come to understand is that the center of the Solaris world has shifted to Joyent.  They are doing so many incredible things with their distribution that they are making Oracle Linux, Oracle Solaris and other forms of Linux look anemic.  From their SmartMachines to their SmartDataCenter to their Cloud Analytics to the fixes they have made to ZFS and Zones, Joyent is offering much more than Solaris and Linux do.

One interesting aspect: SmartOS, Joyent’s OpenSolaris-based distribution, offers both Zones and, amazingly, Linux’s KVM.  How KVM came to be ported to SmartOS (OpenSolaris) is another story entirely.  I encourage you to view the following Bryan Cantrill talk and the follow-on presentation.  In this talk Joyent announced that they had ported KVM to their Solaris-based distribution, SmartOS.  This was a remarkable feat, made more so by the fact that they used DTrace as the diagnostic vehicle to make much of the porting job easier, and that they were able to discover bugs in KVM whose fixes would help the Linux version as well.





A Simple Explanation Of Why Flash Matters.

Let’s start by saying that the title is a little deceptive.  I think that the traditional hard disk is over.  You can start by seeing it in PC and Apple computer sales.  Look at Apple’s laptops, which offer SSDs as replacements for hard disks, or at Sony, which offers SSDs or, at worst, traditional hard disks with a flash-based cache.  These are the consumer side of the flash revolution. Violin Memory provides extreme performance for the opposite end of the spectrum, the enterprise.  One of the key beneficiaries of enterprise flash is the database.  Figure 1 has six reasons that only begin to scratch the surface.

Figure 1 : Flash Matters. Here are six reasons that only begin to scratch the surface.


Lower Latency. The first point is that it takes less time to accomplish the same work on flash. When you lower the IO time-cost per transaction, you can complete more transactions in a given amount of time compared to hard disk technology.
Improved CPU Efficiency. As processes wait much less time for IO operations, CPU “wait states” dramatically decline, making for a more efficient CPU.
Increased Throughput. The combination of lower latency and improved CPU efficiency makes for higher throughput.  More transactions are pushed through in a given amount of time compared to traditional hard disk technology.
Reduced Licenses, Servers, Floor Space, Power & Cooling Costs.  Along the way, if we are dealing with a substantial number of servers, the more efficient CPUs in those servers are doing much more work (because of the improved throughput and lower latencies), and this reduces the need for more servers and database licenses. The result is that fewer servers and less datacenter floor space are needed.
Improved Predictability.  With enterprise flash arrays, such as Violin Memory’s 6000 Series, a lot of effort has gone into developing algorithms that aim for improved predictability, avoiding the infamous SSD Write-Cliff issues.
Redundant and Highly Available. It’s important that everything in the array hardware is redundant and provides enterprise-grade high availability.

This only scratches the surface.
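
The first four reasons are really one chain of arithmetic, and a back-of-the-envelope sketch makes it concrete. The Python below is only an illustration under a deliberately simple model: each worker handles one transaction at a time, spending some CPU time and some IO wait time on it, and each server has at least as many cores as workers. The function names and every number (1 ms of CPU, 5 ms of hard disk IO, 0.25 ms of flash IO, 16 workers per server, a 20,000 transactions-per-second target) are hypothetical values chosen for the example, not measurements from any benchmark mentioned here.

```python
import math


def per_worker_tps(cpu_ms, io_ms):
    """Transactions per second for one worker that handles one
    transaction at a time (CPU time plus IO wait per transaction)."""
    return 1000.0 / (cpu_ms + io_ms)


def cpu_busy_fraction(cpu_ms, io_ms):
    """Fraction of a worker's time spent computing rather than waiting on IO."""
    return cpu_ms / (cpu_ms + io_ms)


def servers_needed(target_tps, workers_per_server, cpu_ms, io_ms):
    """Servers required to reach target_tps under this simple model."""
    per_server_tps = workers_per_server * per_worker_tps(cpu_ms, io_ms)
    return math.ceil(target_tps / per_server_tps)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the arithmetic.
    CPU_MS = 1.0
    TARGET_TPS = 20000
    WORKERS_PER_SERVER = 16
    for name, io_ms in (("hard disk", 5.0), ("flash", 0.25)):
        print(
            name,
            "tps/worker=%.1f" % per_worker_tps(CPU_MS, io_ms),
            "cpu busy=%.0f%%" % (100 * cpu_busy_fraction(CPU_MS, io_ms)),
            "servers=%d" % servers_needed(TARGET_TPS, WORKERS_PER_SERVER, CPU_MS, io_ms),
        )
```

With these made-up inputs, cutting the IO wait raises per-worker throughput, raises the share of time the CPU spends doing useful work, and lowers the server count needed for the same target. Real systems overlap IO, queue requests and contend for cores, so treat this as a way to reason about the direction of the effect rather than its size.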