Monday, September 14, 2009

Thin Provisioning

According to page 3 of IBM's XIV Red Book, thin provisioning is "the capability to allocate storage to applications on a just-in-time and as needed basis". Wikipedia has more to say. Storage vendors make this sound much better than it is. Sure, you get more blocks when you need them; the problem is that you then have to get your file system to use them. Repeat: you don't just run 'df' before and after and say "oh I grew the LUN, I'm done". How does this behave on *nix file systems?

If you're using ext3 on top of LVM, then it seems that you don't even need thin provisioning from your SAN. You could just add a new LUN, add it to the LVM volume group that holds the volume you want to grow, extend the logical volume, and then use ext2online to grow your ext3 file system into the new space. I've done this a few times and it's worked fine, but it was, as Ben Rockwood said, sucky. Ben's blog post "ZFS and Thin Provisioning" describes how ZFS can make this process less painful. Aside from needing to know what you're doing when 'df' lies to you, this looks handy. If my SAN allocates the extra space easily and ZFS can just pick it up and run with it, then SAN-based thin provisioning seems worthwhile. Looks like I'm going to have to test this feature with ZFS. I'll post an update as I learn more.
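The LVM route described above can be sketched as a short script. This is only a dry run that prints the commands in order; it does not execute anything. The device, volume group and logical volume names are made up for illustration, and resize2fs is assumed here as the resize tool (it can grow a mounted ext3 file system on newer systems, where ext2online is no longer shipped).

```python
import shlex


def grow_ext3_on_lvm(new_lun, volume_group, logical_volume):
    """Return the commands, in order, that grow an ext3 file system on LVM.

    Nothing is executed; the caller decides whether to print or run them.
    """
    lv_path = f"/dev/{volume_group}/{logical_volume}"
    return [
        # Label the new LUN as an LVM physical volume.
        ["pvcreate", new_lun],
        # Add the physical volume to the existing volume group.
        ["vgextend", volume_group, new_lun],
        # Hand all free extents in the group to the logical volume.
        ["lvextend", "-l", "+100%FREE", lv_path],
        # Grow the file system to fill the enlarged logical volume.
        ["resize2fs", lv_path],
    ]


if __name__ == "__main__":
    # Hypothetical names: substitute your own LUN, volume group and volume.
    for cmd in grow_ext3_on_lvm("/dev/sdc", "vg_data", "lv_home"):
        print(shlex.join(cmd))
```

Only after the last step does 'df' show the new space, which is the point: growing the storage underneath is not the same as growing the file system.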

Update: Someone who used to work for XIV told me a little about how the thin provisioning system works. If I've understood correctly, the scenario is:

  • If a project requires x TB over the course of three years, but only y TB this year, then thin provision x TB such that y TB can be accessed now.
  • When you create a file system (this includes ext3) on top of that project's LUN, you will see x TB (even though only y TB is really there). Thus, the file system's metadata, inode tables included, will be laid out for blocks which are not physically there yet, and 'df' will lie to you.
  • As long as you have x TB available at some point in the future (perhaps in your total SAN), it will be allocated on demand and the file system won't have a problem.
The benefits I see of doing this are:
  • This can save you money if you're planning to purchase x TB within the next three years, but know that you can only afford y TB today
The problems I can see are:
  • If you don't get those extra disks before the user runs out of road (hey, you made the road look longer than it was), then you'll have problems.
  • When you physically reach x TB and fill it, you are back to the original problem: you will have to use ext2online or some other method to grow the file system.
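To make the scenario and both problems concrete, here is a toy model in Python. It is not how XIV or any real SAN works internally; the class names and the sizes (x = 10 TB promised, y = 3 TB physically present) are my own assumptions for illustration. It only shows the gap between what 'df' would report and what the pool can actually hold.

```python
class ThinPool:
    """Toy SAN pool: the physical capacity that actually exists."""

    def __init__(self, physical_tb):
        self.physical_tb = physical_tb
        self.used_tb = 0

    def physical_free(self):
        return self.physical_tb - self.used_tb

    def add_disks(self, tb):
        # Buying the extra disks later in the project.
        self.physical_tb += tb


class ThinVolume:
    """Toy thin LUN with a file system sized for the full virtual capacity."""

    def __init__(self, pool, virtual_tb):
        self.pool = pool
        self.virtual_tb = virtual_tb
        self.written_tb = 0

    def df_free(self):
        # What 'df' would report: it only knows the virtual size.
        return self.virtual_tb - self.written_tb

    def really_free(self):
        # The space that can actually be written right now.
        return min(self.df_free(), self.pool.physical_free())

    def write(self, tb):
        if tb > self.df_free():
            raise OSError("file system full: it has to be grown")
        if tb > self.pool.physical_free():
            raise OSError("pool exhausted: 'df' promised space that is not there")
        self.written_tb += tb
        self.pool.used_tb += tb


if __name__ == "__main__":
    pool = ThinPool(physical_tb=3)           # y TB bought this year
    vol = ThinVolume(pool, virtual_tb=10)    # x TB promised to the project
    vol.write(2)
    print("df says free:", vol.df_free(), "TB; really free:", vol.really_free(), "TB")
    try:
        vol.write(2)                         # more than the pool can back
    except OSError as err:
        print("write failed:", err)
    pool.add_disks(7)                        # the extra disks arrive
    vol.write(2)                             # now succeeds, no file system resize needed
```

Once the volume has written its full x TB, a further write fails with the "file system full" error, which is the second problem in the list: at that point you are back to growing the file system by hand.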
