Saturday, May 23, 2009

MySQL RAID10 Dell 2960

I have two Dell 2960 servers, each with six 73G SAS drives. Each server will go in one of two data centers, provide MySQL service, and be configured for binary replication. I am configuring these servers with RAID 10 and looking at different options to optimize performance.
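
As an aside, the replication piece mostly comes down to a couple of my.cnf settings. A minimal sketch, with placeholder server IDs and log names rather than this setup's actual values:

  # my.cnf on the first server; give the second a unique server-id (e.g. 2)
  [mysqld]
  server-id = 1
  log-bin   = mysql-bin
  # then point the replica at the master with CHANGE MASTER TO ... and START SLAVE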

The integrated BIOS configuration menu of Dell's Perc 5/i hardware RAID controller can be accessed by holding Ctrl-r at boot. Dell's Getting Started With RAID explains the difference between write-back (the write is acknowledged once it is in cache) and write-through (acknowledged once it is on disk) caching. The BIOS gives you the following options (a * denotes the defaults):

  • Stripe Size: 8k, 16k, 32k, 64k*, 128k
  • Read Ahead: No Read Ahead*, Read Ahead, Adaptive Read Ahead
  • Write Cache: Write Through, Write Back*
It also offers options to enable things like write back with no battery, which doesn't seem like a good idea since any cached writes would be lost on a power failure. Note that it's recommended that you recondition the battery every six months. I found a cheat sheet from a DBA who explicitly sets the same hardware to all of the defaults, but he doesn't explain his reasoning.
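
If Dell's OpenManage tools happen to be installed (an assumption on my part; they aren't part of a stock install), you can sanity-check the controller's live settings from the running OS instead of rebooting into the BIOS:

  # List controller properties, then the virtual disks (stripe size, read/write policy)
  omreport storage controller
  omreport storage vdisk controller=0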

O'Reilly's High Performance MySQL discusses the RAID stripe chunk size and RAID cache in Chapter 7. The author contrasts the theory of maximizing the RAID stripe chunk size with how it plays out in practice, and doesn't leave me with a conclusion as to what would be appropriate for "general MySQL data", since I can't be sure enough of the content. I have had good performance with Zimbra's large MySQL database using 128k stripes on RAID 10 with an EMC Clariion. When creating the file system that would host MySQL, I followed a recommendation to set the stride to 32, since with 4k blocks and 128k RAID stripes each stripe holds 32 file system blocks (128k / 4k = 32):

mke2fs -j -L Z_DB -O dir_index -m 2 -J size=400 -b 4096 -R stride=32 /dev/sde2
This might be specific to Zimbra's MySQL database, but because my more generic MySQL service will be running on ext3, it will be important to set the stride relative to how the RAID is striped (see the arithmetic sketch below). The author warns that many RAID controllers don't work well with large stripe sizes because it wastes cache and it is unlikely that the content will fit neatly into the chunks. Perhaps 64k stripes perform better on average, and that is why they are the default unless you have a specific need. Upping it to 128k seems like too extreme a step, especially since the author does not seem to think it's a good idea, so I'm going to stick with 64k. The author's general advice on caching is to not waste cache on reads (the controller's read-ahead is unlikely to beat the database's own) and to save the cache for writes. Thus, I'm sticking with the defaults here too.
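
For reference, the stride value is just the RAID stripe size divided by the file system block size. With the 64k stripes and 4k blocks on this box:

  # stride = RAID stripe size / file system block size
  STRIPE_KB=64; BLOCK_KB=4
  echo $((STRIPE_KB / BLOCK_KB))   # 16, the value passed to -R stride= below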

The MySQL manual has a disk issues section which discusses mount and hdparm options (gulp... it [hdparm] "can crash a computer and make data on its disk inaccessible" if certain parameters are misused... gulp). It also mentions how delicate setting the stripe size can be: "The speed difference for striping is very dependent on the parameters. Depending on how you set the striping parameters and number of disks, you may get differences measured in orders of magnitude. You have to choose to optimize for random or sequential access." Once again, the stripe size will be difficult to guess for an any-DB-goes type of MySQL server.
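
Before touching any of the riskier hdparm tuning flags, it's worth grabbing a baseline with its built-in read benchmark, which is read-only and safe:

  # -T times cached reads, -t times buffered (actual) disk reads; run a few times on an idle box
  hdparm -tT /dev/sdY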

When creating the file system on top of these default RAID 10 settings it will be a good idea to:

  • mke2fs -j -L VAR -O dir_index -m 2 -J size=X -b 4096 -R stride=16 /dev/sdY
  • hdparm -m 16 -d 1 /dev/sdY
  • mount -o async -o noatime /dev/sdY /var
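
To make the mount options survive a reboot, they belong in /etc/fstab. A sketch, assuming the volume is labeled VAR as above and mounted on /var:

  # async is already implied by defaults; noatime must be added explicitly
  LABEL=VAR  /var  ext3  defaults,noatime  1 2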

The conclusion I come to is: use RAID 10 with all of the Perc 5/i's defaults. You might be able to find a better setting for the stripe size, but it will be difficult to determine, so sticking with the default of 64k is probably reasonable for your initial settings. Also, be aware of the RAID settings when creating your file system, tuning it with hdparm, and mounting it.
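
If you ever do need to justify deviating from the 64k default, sysbench's fileio test (my suggestion, not something from the sources above; sizes and times are placeholders) makes it easy to compare candidate stripe sizes with a random read/write workload:

  # Create the test files, run five minutes of random read/write I/O, then clean up
  sysbench --test=fileio --file-total-size=8G prepare
  sysbench --test=fileio --file-total-size=8G --file-test-mode=rndrw --max-time=300 --max-requests=0 run
  sysbench --test=fileio --file-total-size=8G cleanup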
