Wednesday, April 28, 2010

SSD Experiment

I've been considering experimenting with an SSD drive. I already have a 1T Seagate Barracuda. My plan is to get a 160G Intel X25-M SSD. At the hardware layer, it looks like it would work with my Dell Precision 690 over the SATA-300 interface, and it includes a 3.5-inch bay adapter (a standard SATA drive is 3.5" while this is a 2.5" drive). At the software layer, I would configure the two drives as follows:
  • 160G SSD:
    * 96G system drive with the following partitions:
    /                     10G
    /home                 10G
    /tmp                   5G
    /usr                  10G
    /var                  61G
    
    * 64G of cache/swap: I would like to use the extra speed for cache, not storage. Ideas:

    • Swap: The simplest thing is to swapon it and then run a lot of VMs under KVM and see if performance in the VMs is tolerable. If it is, then it's much cheaper than buying more RAM (see the sketch after this list).

    • Disk Cache: If I were running ZFS I'd add the SSD as an L2ARC cache device.

    • Other types of Cache: Lots of ideas on this topic were recently posted on Slashdot.
  • 1T SATA:
    I would then use the 1T SATA drive for directories under /home and /var, to store VMs (as raw LVs) as well as media. I'd also keep an extra 20G of the SATA space free to play with new file systems.
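A minimal sketch of the swap idea from the list above, plus the ZFS variant. The device name /dev/sdb6 and the pool name "tank" are assumptions, not something I've set up yet:

mkswap /dev/sdb6      # assumption: the spare 64G SSD partition is /dev/sdb6
swapon /dev/sdb6
swapon -s             # confirm the SSD swap is active

# ZFS variant: attach the same partition as an L2ARC cache device
# (hypothetical pool name "tank")
zpool add tank cache /dev/sdb6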

postmortem of the apache.org crack

apache.org has posted a good model for the self-reporting of incidents.

Sunday, April 11, 2010

tiered ram

A friend of mine suggested that I could get more memory-related performance out of my servers by using fast disk for swap. Specifically, I could buy 1TB of SSD and have a dom0 Xen server swap onto it. E.g. the largest VM server in my organization's data center has 128G of RAM. If I needed to run more VMs, and some percentage of them could take a slight performance hit, then I could allocate 256G of SSD as swap space to each of up to four equally large servers, giving each RAM*2 of swap. I could then rely on Linux's paging algorithm to prioritize the VMs that really need real RAM, and run a lot more VMs for cheaper than if I had to buy 1TB of RAM.

Disclaimer: I have not tried this.
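If I were to try it, the mechanics per dom0 might look like this (the device name /dev/sdc1 and the priority value are assumptions):

mkswap -L ssd-swap /dev/sdc1    # this server's 256G SSD slice
swapon -p 10 /dev/sdc1          # higher priority than any disk-backed swap
swapon -s                       # verify; the kernel pages to -p 10 first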

Thursday, April 8, 2010

Gratuitous ARPs whenever you need them

Today I cloned a VM which had several IPs bound to eth0. When I brought up the clone, eth0 answered pings from my workstation (which is on a different network), but eth0:1, eth0:2, .. eth0:N did not. virt-clone had changed the MAC address as expected, but why would a MAC address change cause this problem? It turns out that the router between my workstation and dom0 had the old MAC address in its ARP cache, and the ARP cache timeout is set high enough that I'd have to wait it out in order to reach the other IPs. But why wasn't eth0 also affected? Apparently when you boot a server it sends a Gratuitous ARP for eth0 but not for eth0:1, etc. So the ARP cache entry for eth0 was updated but not those for the other IPs bound to eth0. What I wanted was to send a Gratuitous ARP for every IP bound to eth0 so that the cache would be refreshed. This is where arping is useful. E.g. to send a gratuitous ARP for the IP 192.168.6.212 whose gateway is 192.168.6.1, run:
/sbin/arping -s 192.168.6.212 -I eth0 192.168.6.1
From there it was just a matter of using a bash loop:
for x in $(ifconfig | grep 192 | awk '{print $2}' | sed 's/addr://g'); do
    /sbin/arping -c 1 -s "$x" -I eth0 192.168.6.1
done
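A variant that enumerates the addresses with iproute2 instead of grepping the ifconfig output for "192" (it should pick up the alias addresses as well):

for x in $(ip -4 -o addr show dev eth0 | awk '{print $4}' | cut -d/ -f1); do
    /sbin/arping -c 1 -s "$x" -I eth0 192.168.6.1
done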
I'm lucky my co-workers were able to help me understand this problem.

shaping-like behavior while doing an scp

While scp'ing a large file between two systems on the same network, I experienced something similar to traffic shaping, though I was not being shaped:
$ scp drupal1* user@host:/var/foo/
drupal1_data.img.gz                             2% 3122MB  33.6MB/s 
...
drupal1_data.img.gz                            20%  294MB   0.0KB/s -
stalled -Write failed: Broken pipe
lost connection
$
The ASCII illustration above isn't a direct paste but should convey what happened: I watched 33.6MB/s slow down to a complete stop. What's odd is that I was then able to run the scp again, maintain 33.6MB/s, and get it done. However, I still saw the issue above intermittently. I was going from Fedora 12 to RHEL5.5. I'm curious and want to look into this more. Someone suggested I would find relevant information at High Performance SSH/SCP - HPN-SSH

Update Dec 2011:
A solution to the above is posted at linuxsecure.blogspot.com. It comes down to adding:

net.ipv4.tcp_sack = 0
to /etc/sysctl.conf and then enacting it with "sysctl -p".
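For reference, checking the current value and loading the change looks like:

sysctl net.ipv4.tcp_sack        # 1 means SACK is enabled (the default)
echo 'net.ipv4.tcp_sack = 0' >> /etc/sysctl.conf
sysctl -p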

Monday, April 5, 2010

Python SSH with Paramiko quickly

A quick Python script to SSH into a server and run a list of commands. It uses Paramiko (doc). Fedora has a package:
yum install python-paramiko
Assumes you are using RSA Public Key Authentication:
#!/usr/bin/python
import paramiko, os, getpass

# Variables
username = 'you'
host = 'your.server.com'
port = 22
key = '~/.ssh/id_rsa'
msg = "Enter passphrase for key '" + key + "': "
private_key_pass = getpass.getpass(prompt=msg)
private_key_file = os.path.expanduser(key)

# Connections
pkey = paramiko.RSAKey.from_private_key_file(private_key_file,\
                                              private_key_pass)
transport = paramiko.Transport((host, port))
transport.connect(username = username, pkey = pkey)

# Commands
cmds = ['ls', 'ls /foo']
for cmd in cmds:
    channel = transport.open_session()
    channel.exec_command(cmd)
    # no stdout output is treated as failure; check stderr in that case
    output = channel.makefile('rb', -1).readlines()
    if output:
        print "Success:"
        print output
    else:
        print "Error:"
        print channel.makefile_stderr('rb', -1).readlines()
    channel.close()

transport.close()