Sunday, September 13, 2009

Grid Storage

My organization has managed two EMC Clariion cx3-20s for three years. We have had some problems with their overall design, which I'll list below. I'll also list some vendors with the same problems and show some that use grid storage to avoid them.

EMC problems

LUN Tetris

We have a mix of LUNs of approximately two types:
  • Fast and small
    • Used only by Databases and Email
    • Currently 10T, not likely to grow fast
    • 146G 15k FC in RAID10
  • Fast enough and large
    • all other applications (Live VM images, Web roots, Home dirs, File svcs, Archives, etc)
    • Currently 40T, grows by about 10T a year
    • 1T 7200 SATA in RAID5
We have many LUNs of each of the two types above, and each one stripes across a number of disks. If visualized, the array would look like the end of a Tetris game, with differing colors and shapes. The variation in colors and shapes represents LUNs that differ in size, meta-LUN membership, RAID type, disk type, etc. Some of the empty space is unused space that is too small to be of use. When a new project comes along and requires some space, we analyze our Tetris game and consider the best way to accommodate the request.
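To make the Tetris problem concrete, here is a minimal sketch of the placement question we answer by hand for each request. The RAID group names and free-space figures are hypothetical, and it ignores meta-LUNs, which can stitch pieces together at the cost of even more odd shapes.

```python
def place_lun(free_by_group, request_gb):
    """Best-fit placement: pick the RAID group whose free space is the
    smallest that still fits the request. Returns None if nothing fits."""
    candidates = [
        (free_gb, name)
        for name, free_gb in free_by_group.items()
        if free_gb >= request_gb
    ]
    if not candidates:
        return None
    return min(candidates)[1]


# Hypothetical free space (in GB) left over in four RAID groups.
free_by_group = {"rg0": 300, "rg1": 450, "rg2": 200, "rg3": 150}
total_free = sum(free_by_group.values())

print(total_free)                      # total free space across all groups
print(place_lun(free_by_group, 400))   # fits in one group
print(place_lun(free_by_group, 500))   # fits nowhere, despite the total
```

The last request fails even though the array as a whole has more than twice the space free. That stranded space is the gap at the bottom of the Tetris board.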

Service Processor bottleneck

Most SANs have Service Processors (SPs), which are computers that run an OS: EMC runs FLARE, NetApp runs Data ONTAP (a BSD derivative), etc. The SPs can be thought of as servers that pass block change operations from clients to the block devices connected directly to them. In EMC's case, this connection is implemented as daisy-chained copper SCSI cables to several drawers of disks. The cx3-20 can hold eight drawers. We want to add an extra drawer this year, but that means paying extra to upgrade to a cx3-40, which is basically a new SP that can hold 16 drawers. So every few years you must upgrade the SP. In our case, EMC wants us to buy a cx4 instead of a cx3.
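The step in cost can be sketched in a few lines. The drawer limits are the two figures quoted above; the function name is mine, and I'm assuming the array is already at its eight-drawer limit, which is why one more drawer forces the upgrade.

```python
# Drawer limits per SP model, as quoted in the text above.
MAX_DRAWERS = {"cx3-20": 8, "cx3-40": 16}


def smallest_model_for(drawers):
    """Return the smallest SP model that can hold this many drawers,
    or None if neither model in the table is big enough."""
    for model, limit in sorted(MAX_DRAWERS.items(), key=lambda kv: kv[1]):
        if drawers <= limit:
            return model
    return None


print(smallest_model_for(8))   # today: still fits the current SP
print(smallest_model_for(9))   # one more drawer: needs the bigger SP
```

Capacity grows one drawer at a time, but the SP only grows in big, expensive jumps.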

Same old SAN

I'm probably oversimplifying the comparison of the products listed below, but since they have the same problems I listed above, to me they look like the same old SAN. I'm going to speak with sales reps from each of these companies so they can tell me about any other product they offer, and I can then update this page to list them as offering Grid Storage.

Grid Storage

There are new grid-storage-based systems which don't have these problems. The basic idea is that rather than having a smart SP and several dumb drawers, each drawer is smart and is known as a node. In IBM's XIV each node is an individual server made from commodity parts: 1 quad-core Intel CPU, 8G of RAM, 12 1T SATA drives and a stripped-down Linux-based OS. These servers are networked to speak with each other via 10G Ethernet instead of daisy-chained SCSI. Each portion of data written to any particular LUN is split across all of the disks, and the large stripe helps the SATA disks perform as well as the fast disks. Redundant portions are also written, so that one can lose up to three nodes in a six-node system. Relative to the problems posed in the beginning we have:
  • LUN Tetris: The only property of an XIV LUN is size. Every LUN has the same speed, which is fast. Every LUN is made of commodity SATA. To allocate space for a new project, keep a tally of the total free size and subtract the requested size.
  • SP Bottleneck: Since each node, or drawer, is an SP, storage and processing scale at the same rate. There is no sudden need to upgrade the SP during an expansion.
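The tally described in the LUN Tetris point above is about as simple as bookkeeping gets. This is a rough sketch of the idea only, not XIV's actual interface: the class name, method names and the 60T pool size are all made up for illustration.

```python
class GridPool:
    """One shared pool of capacity; a LUN's only property is its size."""

    def __init__(self, capacity_tb):
        self.capacity_tb = capacity_tb
        self.luns = {}  # LUN name -> size in TB

    def free_tb(self):
        return self.capacity_tb - sum(self.luns.values())

    def create_lun(self, name, size_tb):
        """Subtract the requested size from the tally; fail if it won't fit."""
        if name in self.luns:
            raise ValueError(f"LUN {name!r} already exists")
        if size_tb > self.free_tb():
            raise ValueError(f"only {self.free_tb()}T free, {size_tb}T requested")
        self.luns[name] = size_tb
        return self.free_tb()


pool = GridPool(capacity_tb=60)         # hypothetical usable capacity
pool.create_lun("databases", 10)        # our fast-and-small data
pool.create_lun("everything-else", 40)  # our fast-enough-and-large data
print(pool.free_tb())                   # what is left for new projects
```

There is no RAID type, disk type or placement to choose, so the only question for a new project is whether the number left in the tally is big enough.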
I am trying to build a list of vendors that use grid storage to serve block devices (IBM was the first vendor I found doing exactly this, so my description above is biased towards them). NEC's HydraStor and Isilon use grid storage, but they serve NFS volumes rather than block devices. Please post a comment if you know of storage vendors doing something similar.
