Wednesday, January 30, 2008
standard tightening
I often run the following commands when I set up a new server:
su -
/usr/sbin/visudo
/usr/sbin/usermod -a -G wheel $USER
sed -i 's/PasswordAuthentication yes/PasswordAuthentication no/g' /etc/ssh/sshd_config
/etc/init.d/sshd restart
exit
sudo ls
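Before logging out of the root session it's worth verifying the sshd change actually took, so you don't lock yourself out. A minimal sketch of that check (the hostname "server" is a placeholder):
grep -i '^PasswordAuthentication' /etc/ssh/sshd_config
# From another host, this attempt should now be refused:
ssh -o PubkeyAuthentication=no -o PreferredAuthentications=password $USER@server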
Thursday, January 24, 2008
xfs rhel4
Problem
Make a RHEL4 system mount a partition which can support more directories than ext3's fixed inode limit will allow.
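You can confirm it's the inode ceiling rather than disk space that's the problem: df -h shows free space while df -i shows inodes exhausted. A sketch of the check (the mount point is a placeholder):
# mkdir fails with "No space left on device" even though space remains:
df -h /data   # shows plenty of free space
df -i /data   # shows IUse% at 100%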
Solution
Load a kernel module that adds XFS support. Note that we don't need a new kernel, just a new kernel module. There are RPMs for this. If you install them correctly this won't even require any downtime.
Details
Going to test by making an XFS USB thumb drive. First we install the XFS kernel module. There is a howto for doing this with kernel modules via RPMs. For details see faqs.org. I need three RPMs: xfsprogs, xfsprogs-devel, and kernel-module-xfs:
rpm -Uvh xfsprogs-[kernel-version][rpm-version].rpm
rpm -Uvh xfsprogs-devel-[kernel-version][rpm-version].rpm
rpm -ivh kernel-module-xfs-[kernel-version][rpm-version].rpm
Given what I'm running:
# uname -r
2.6.9-67.0.1.EL
#
and a bit of searching, I found a mirror which had the kernel-module-xfs for 2.6.9-67.0.1.EL. Note that xfsprogs and xfsprogs-devel don't have to match the kernel version exactly, but the kernel module must match the running kernel. After installing in the order above I'm able to load the kernel module and verify that I have the XFS mkfs:
# modprobe xfs
# lsmod | grep xfs
xfs                   526832  0
# which mkfs.xfs
/sbin/mkfs.xfs
#
Next I'll look at the partition on the thumb drive (/dev/sda1 as per dmesg) and determine that I can mount it:
# parted
(parted) select /dev/sda1
Using /dev/fd0
(parted) mklabel msdos
(parted) print
Disk geometry for /dev/fd0: 0.000-1.406 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
(parted) quit
# mount -t vfat /dev/sda1 /mnt/usb/
# umount /mnt/usb/
Then we format the partition for XFS:
# /sbin/mkfs.xfs -f -i size=512,maxpct=0 /dev/sda1
meta-data=/dev/sda1              isize=512    agcount=3, agsize=4096 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=12288, imaxpct=0
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=1200, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
#
Finally we verify that we can mount it:
# mount -t xfs /dev/sda1 /mnt/usb/
# mount | grep xfs
/dev/sda1 on /mnt/usb type xfs (rw)
#
After doing this you can see how many inodes it can handle and test it empirically. The following Perl script will attempt to make an arbitrary number of directories:
#!/usr/bin/perl
$num_dirs = 38000;
system "mkdir test";
for ($i = 0; $i < $num_dirs; $i++) {
    system "mkdir test/$i";
    print "$i\n";
}
You can then run it in one window while you watch it eat inodes in the other:
# df -i /mnt/usb/
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda1              86712   24138   62574   28% /mnt/usb
# ...
# df -i /mnt/usb/
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda1              86176   38007   48169   45% /mnt/usb
#
So you can fill up half a drive with nothing but empty dirs:
# df -h /mnt/usb/
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              44M   20M   24M  46% /mnt/usb
#
update
When I try to use XFS on an iSCSI LUN I get a kernel panic. All I have to do is mount the LUN, mkdir and then rmdir:
# rmdir 2

Message from syslogd@localhost at Tue Jan 29 10:38:34 2008 ...
kernel: Bad page state at free_hot_cold_page (in process 'iscsi-rx', page c1682a20)

Message from syslogd@localhost at Tue Jan 29 10:38:34 2008 ...
kernel: flags:0x20000084 mapping:00000000 mapcount:0 count:0

Message from syslogd@localhost at Tue Jan 29 10:38:34 2008 ...
kernel: Backtrace:

Message from syslogd@localhost at Tue Jan 29 10:38:35 2008 ...
kernel: Trying to fix it up, but a reboot is needed
#
Someone else has this problem too.
Tuesday, January 22, 2008
Fibre Channel I/O Calls
A colleague of mine came across the following fact regarding I/O and fabric switches:
- 2 Gbps FC can queue 254 I/O commands
- 4 Gbps FC can queue 2048 I/O commands
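On Linux you can check what queue depth the driver actually set for a given LUN; a sketch (the device name sdc is an assumption, and the sysfs path assumes a 2.6 kernel):
# per-LUN queue depth as negotiated by the HBA driver
cat /sys/block/sdc/device/queue_depth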
I wanted to see what kind of I/O numbers my own system produces, and /proc/diskstats is where to look:
# cat /proc/diskstats | grep " sdc "
   8   32 sdc 391348542 3811329 642958694 1765819166 212637694 1424571277 438970288 1314722135 1 366113251 3445284834
#
As per comp.os.linux.development these fields (starting after the device name) are:
Field  1 -- # of reads issued
Field  2 -- # of reads merged, field 6 -- # of writes merged
Field  3 -- # of sectors read
Field  4 -- # of milliseconds spent reading
Field  5 -- # of writes completed
Field  7 -- # of sectors written
Field  8 -- # of milliseconds spent writing
Field  9 -- # of I/Os currently in progress
Field 10 -- # of milliseconds spent doing I/Os
Field 11 -- weighted # of milliseconds spent doing I/Os
Or to put it another way:
391348542 reads issued (4)
3811329 reads merged (5)
642958694 sectors read (6)
1765819166 milliseconds spent reading (7)
212637694 writes completed (8)
1424571277 writes merged (9)
438970288 sectors written (10)
1314722135 milliseconds spent writing (11)
1 I/Os currently in progress (12)
366113251 milliseconds spent doing I/Os (13)
3445284834 weighted milliseconds spent doing I/Os (14)
Note that I've put the awk offsets in parentheses above. We can then take more readings and focus on the essential columns. E.g. we spend more time reading than writing:
# while [ 1 ]; do grep " sdc " /proc/diskstats | awk '{print $7 " " $11}'; sleep 1; done
1767053699 1323835167
1767053722 1323835217
1767054231 1323858000
1767054400 1323858477
1767054401 1323859097
1767054420 1323859106
1767055201 1323863662
1767055543 1323863671
1767055666 1323864799
1767056048 1323865700
#
If we look at them every quarter second we can see spikes in the number of I/Os in progress along with the number of reads and writes issued during that time (looking at a larger interval hides the spikes):
# while [ 1 ]; do grep " sdc " /proc/diskstats | awk '{print $12 "\t" $4 "\t" $8}'; sleep 0.25; done
1    391689249    213184077
1    391689253    213184467
4    391689253    213184912
4    391689253    213185311
4    391689253    213185780
1    391689257    213186170
1    391689257    213186558
2    391689258    213187017
68   391689271    213187319
1    391689271    213187801
2    391689313    213188219
1    391689338    213188481
2    391689379    213188863
44   391689379    213189282
32   391689384    213190180
3    391689400    213190569
3    391689400    213190971
1    391689405    213191429
3    391689407    213192172
#
We can check the math on the last few lines. Because our sampling interval misses events that occur in between, our numbers won't add up exactly, but we can see a general trend in some of them:
1    391689338    213188481
2    391689379    213188863
44   391689379    213189282
32   391689384    213190180
There were a lot more writes than reads in the samples above.
3    391689400    213190569
3    391689400    213190971
Nothing was read, but writes kept completing while 3 I/Os were in progress. I don't have anything conclusive from the above, but I do think I can use this over time to recognize trends on my system.
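Raw cumulative counters are awkward to eyeball, so here is a small sketch (device name sdc assumed, field positions as numbered above) that prints per-interval deltas instead:
#!/bin/sh
# Print reads and writes completed per second for sdc, as deltas of the
# cumulative counters in /proc/diskstats ($4 = reads issued, $8 = writes completed).
prev_r=0; prev_w=0
while true; do
    set -- $(grep ' sdc ' /proc/diskstats)
    r=$4; w=$8
    # The first iteration prints one bogus huge delta; ignore it.
    echo "reads/s: $((r - prev_r))  writes/s: $((w - prev_w))"
    prev_r=$r; prev_w=$w
    sleep 1
done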
Monday, January 7, 2008
RHEL5 Install via Ubuntu NFS
After booting a RHEL4 or 5 disc with "linux askmethod", getting the system online (the Celerra could ping it), and making sure the appropriate IP was in the ACL, I still couldn't mount the ISO images to install:
RPC timeout; that directory could not be mounted from the server
I ended up mounting the Celerra from an already-installed system on the same network, copying the ISOs over to it, and then unmounting the Celerra. I then made that system an NFS server, and now the five RHEL5 hosts have no problem mounting it for an NFS-based install. When setting up an NFS server make sure that portmap is running so that it listens on port 111 for the RPC calls that NFS needs.
# netstat -npl | egrep "111|2049"
tcp   0   0 0.0.0.0:2049   0.0.0.0:*   LISTEN   -
tcp   0   0 0.0.0.0:111    0.0.0.0:*   LISTEN   29969/portmap
udp   0   0 0.0.0.0:2049   0.0.0.0:*            -
udp   0   0 0.0.0.0:111    0.0.0.0:*            29969/portmap
It's annoying that I'm not sure why the Celerra won't serve this purpose, and that I don't have any log data to figure it out. If time allows I'll try again with tcpdump, but I need to get these hosts installed.
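For reference, the interim NFS server needs little more than an export and the right daemons; a minimal sketch (the path, subnet, and init-script names are assumptions, and service names vary by distro, e.g. nfs-kernel-server on Ubuntu):
# /etc/exports -- read-only is enough for install media
/srv/isos  192.168.1.0/24(ro,sync)

# then reload the export table and make sure portmap and nfsd are up:
exportfs -ra
/etc/init.d/portmap start
/etc/init.d/nfs start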
Update
The firewall admin noticed that the host that was booted from the RHEL5 CD was trying to make a UDP connection to port 1234 of the Celerra. Opening this seems to have fixed the problem.
nfs-common
Out of the box, Ubuntu will attempt NFS mounts but takes about 90 seconds per mount and doesn't work well. If you check /var/log/messages you'll see errors [1]. To fix this, install the nfs-common package:
http://packages.ubuntu.com/feisty/net/nfs-common
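On a typical Ubuntu system that's just (assuming apt and sudo are available):
sudo apt-get install nfs-common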
Footnote:
[1]
[4661210.004709] portmap: server localhost not responding, timed out
[4661210.004745] RPC: failed to contact portmap (errno -5).
[4661244.949461] portmap: server localhost not responding, timed out
[4661244.949496] RPC: failed to contact portmap (errno -5).
[4661244.949513] lockd_up: makesock failed, error=-5
[4661279.894214] portmap: server localhost not responding, timed out
[4661279.894248] RPC: failed to contact portmap (errno -5).
[4661279.894255] nfs: Starting lockd failed (do you have nfs-common installed?).
[4661279.894284] nfs: Continuing anyway, but this workaround will go away soon.
NFS: Stevens' TCP/IP Ch 29
Chapter 29 of Stevens' TCP/IP Illustrated, Volume 1: The Protocols explains NFS in terms of UDP and RPC. If you're making firewall rules for NFS you need to allow port 2049 for NFS itself and port 111 for RPC, and both ports can carry either TCP or UDP.
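As a sketch, the corresponding iptables rules might look like the following (the subnet is an assumption; note also that on NFSv3 the auxiliary mountd/lockd services grab dynamic RPC ports unless you pin them):
iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 111  -j ACCEPT
iptables -A INPUT -s 192.168.1.0/24 -p udp --dport 111  -j ACCEPT
iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 2049 -j ACCEPT
iptables -A INPUT -s 192.168.1.0/24 -p udp --dport 2049 -j ACCEPT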
Friday, January 4, 2008
Celerra Command Line Non-Troubleshooting
My colleague offers NFS service via an EMC Celerra NS 502G. I want to be able to troubleshoot it by grepping its logs for errors. Either it doesn't keep log files for the errors I've been encountering or I couldn't find them.
The problem
Originally I couldn't mount the host because port 2049 was not open to the client (so remember to check the network layer first with telnet). I then became curious and tried to mount from a host that had worked in the past but which I had specifically removed from the Celerra ACL:
mount -t nfs nas0.prd.domain.tld:/isos /mnt/isos/
mount: nas0.prd.domain.tld:/isos failed, reason given by server: Permission denied
My goal is to know where the Celerra logs these types of issues. I don't think it does, but I'm trying to prove a negative by searching, so I could have missed something.
Getting to the command line:
You can SSH to a Celerra as nasadmin. It's really just a GNU/Linux box:
[root@nas_cs0 root]# uname -a
Linux nas_cs0 2.4.20-28.5506.EMC #1 Tue Aug 8 22:16:20 EDT 2006 i686 unknown
[root@nas_cs0 root]#
It's got a 2 GHz Celeron and 512MB of RAM:
[root@nas_cs0 etc]# dmesg | grep -i cpu
Initializing CPU#0
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 128K
CPU: After generic, caps: bfebfbff 00000000 00000000 00000000
CPU: Common caps: bfebfbff 00000000 00000000 00000000
CPU: Intel(R) Celeron(R) CPU 2.00GHz stepping 09
[root@nas_cs0 etc]# free -m
             total       used       free     shared    buffers     cached
Mem:           503        469         33          0         77        185
-/+ buffers/cache:        207        295
Swap:          509        247        262
[root@nas_cs0 etc]#
It seems to be RPM-based, probably Red Hat:
[root@nas_cs0 var]# rpm -qa | wc -l
262
[root@nas_cs0 var]#
What files are useful from here?
You can look in /celerra/backendmonitor to see some of the configuration files. But where are the log files? One way to find log files on any box is to find everything that was modified after you reproduced the problem you're trying to debug:
# touch /tmp/x
# find / -type f -newer /tmp/x 2> /dev/null | grep -v proc
In the above I'm ignoring proc and standard error while looking for files newer than my timestamp file (since I just touched /tmp/x). This returns:
/var/log/pacct
/tmp/ch_globals.tmp
/nas/log/eventstore/slot_1/sys_log
I then hopped into /nas/log/ and tried to find files containing the IP of the host that couldn't NFS mount the system:
[root@nas_cs0 log]# find . -exec fgrep -q "123.456.7.89" '{}' \; -print 2> /dev/null
./nas_log.al
./cmd_log
./cel_api.log
[root@nas_cs0 log]#
All of the above just contained logs from when the host was added to the ACL.
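A variation on the timestamp trick that can help here (a sketch, with placeholder paths): bracket the event between two touch files so constantly-churning logs don't drown out the interesting ones.
touch /tmp/before
# ...reproduce the failing mount here...
touch /tmp/after
find / -type f -newer /tmp/before ! -newer /tmp/after 2> /dev/null | grep -v ^/proc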
Non-results
I wish I could end with "and then I found the log in ..." but I never found useful logs. Since I searched for files modified after the time of the error and found nothing, my position is that it's not logging these errors. It might have been easier to just buy a server with fibre cards and let it work as an NFS wrapper; then I'd have a more standard NFS server. At least it does iSCSI, but I haven't yet troubleshot it at this level of detail. I also found some comments on EMC's NFS implementation.
OpenMoko update
There will soon be a new OpenMoko developer phone. However, we still seem to be in phase 1. I'm waiting for phase 2 since I can't do without a reliable means of making phone calls. It would be fun to just get one to develop on, but I've got too much going on. I read yesterday that they'll make it "available to the mass market later this year". The year just started, so that might mean almost a whole year. It's been delayed before.
rhel5 nfs install
I have an NFS share with ISOs [1]. When installing a RHEL4 system the boot disk offers to use an NFS source. I am then able to bring up eth0, mount the NFS server, and complete the install with one CD. RHEL5 is different and doesn't have a KickStart CD. However, you can achieve the same effect by booting from the first CD with "linux askmethod". The installer will then prompt for the NFS server details.
Footnote:
[1]
# mount -t nfs nas0.prd.domain.tld:/isos /mnt/isos/
# ls /mnt/isos/datastore1/rhel*
/mnt/isos/datastore1/rhel4u4_i386:
RHEL4-U4-i386-ES-disc1.iso  RHEL4-U4-i386-ES-disc4.iso
RHEL4-U4-i386-ES-disc2.iso  RHEL4-U4-i386-ES-disc5.iso
RHEL4-U4-i386-ES-disc3.iso

/mnt/isos/datastore1/rhel4u4_x86_64:
RHEL4-U4-x86_64-ES-disc1.iso  RHEL4-U4-x86_64-ES-disc4.iso
RHEL4-U4-x86_64-ES-disc2.iso  RHEL4-U4-x86_64-ES-disc5.iso
RHEL4-U4-x86_64-ES-disc3.iso

/mnt/isos/datastore1/rhel5u1_i386:
rhel-5-server-i386-disc1.iso  rhel-5-server-i386-disc4.iso
rhel-5-server-i386-disc2.iso  rhel-5-server-i386-disc5.iso
rhel-5-server-i386-disc3.iso

/mnt/isos/datastore1/rhel5u1_x86_64:
rhel-5-client-x86_64-disc1.iso  rhel-5-client-x86_64-disc5.iso
rhel-5-client-x86_64-disc2.iso  rhel-5-client-x86_64-disc6.iso
rhel-5-client-x86_64-disc3.iso  rhel-5-client-x86_64-disc7.iso
rhel-5-client-x86_64-disc4.iso

/mnt/isos/datastore1/rhel-5-x86-64:
rhel-5-server-x86_64-disc1.iso  rhel-5-server-x86_64-disc4.iso
rhel-5-server-x86_64-disc2.iso  rhel-5-server-x86_64-disc5.iso
rhel-5-server-x86_64-disc3.iso
#
Wednesday, January 2, 2008
dreams of being cracked
I read about a new theory on dreaming:
Dreams are a sort of nighttime theater in which our brains screen realistic scenarios. This virtual reality simulates emergency situations and provides an arena for safe training: "The primary function of negative dreams is rehearsal for similar real events, so that threat recognition and avoidance happens faster and more automatically in comparable real situations." Dreaming helps us recognize dangers more quickly and respond more efficiently.
...
The difference between the typical and optimal response could save your life. But making such a reaction swift and automatic takes practice. It's the reason martial arts students drill their movements over and over. Frequent rehearsal prepares them for that one decisive moment, ensuring that their response in an actual life-or-death situation is the one they practiced. Dreams may do the same thing.
It even offers a method for the brain to select what to dream about:
The dreaming brain scans emotional memories. When it detects a memory trace with a strong negative emotion, it constructs a nightmare around that theme. The more traumatic the event, the more intense the nightmare. The brain's system for detecting threats is sensitive and flexible: Anything the brain tags with a strong negative charge gets thrown into the threat bin and dredged up at night.
...
Even a single exposure to a life-threatening situation can plunge a person into an inferno of post-traumatic nightmares, dreams in which the threatening event—the attack, the rape, the war—is repeated over and over in every possible variation.
I had a dream that I had been socially engineered: that I had given a root password to someone I suddenly realized was not to be trusted. If this theory of dreaming is correct, it's good that it keeps up with modern fears, not just being chased by tigers.
don't chmod 777
I came across a document which said "In order to use all these tools, you have to change the chmod of wp-content folder to 777".
I very rarely do this and it's usually unnecessary. It's analogous to unlocking all the doors so that your friend can use one of them. I normally prefer to unlock only the necessary door ("That's just perfectly normal paranoia" -- Slartibartfast). If you're writing a doc you might just ask the user to 777 a directory, since you probably don't want to write long notes like this one explaining the different types of doors and the conditions under which you should unlock each of them. 777 will always work, especially if the user installing the software doesn't have root. If you do have root, then you should just chown or chgrp to the user or group which needs to write to the directory. This is normally apache, and this is what I recommend.
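For example, a minimal sketch (the path is an assumption, and the web server user is commonly apache on RHEL or www-data on Ubuntu):
# Instead of: chmod -R 777 wp-content
chown -R apache:apache /var/www/html/wp-content
chmod -R 775 /var/www/html/wp-content   # group-writable, not world-writable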
However, this doesn't mean the problem is solved and everything is secure. It means the back door is open and you could be at risk if the person you've asked to watch it is incompetent. Since WordPress is a popular and active project with a lot of developers, I'm going to err on the side of trusting them and endorse letting apache write to that part of the file system. You should then put a noexec .htaccess in that directory so that if something bad is uploaded it can't be run.
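A sketch of what I mean by that .htaccess (the path is an assumption, and php_flag only works under mod_php with AllowOverride permitting these directives):
cat > /var/www/html/wp-content/uploads/.htaccess <<'EOF'
# Nothing in this directory should execute as a script
php_flag engine off
Options -ExecCGI
RemoveHandler .php .phtml .php3
EOF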
Let me explain more about what I mean by not asking some incompetent person (or script) to watch the back door. Some web applications need to be able to write to the file system to be of any use; e.g. WordPress probably wouldn't be too handy without this feature. By the bug:feature ratio law, we also open ourselves to plenty of exploits. It all depends on an arms race between developers and crackers. If you are installing some dead project like "Jim's PHP-weekend photo gallery" and he hasn't put much work into thinking about possible exploits or updating his code, you are putting the server at risk. Someone could drop in some code to use your box as a spam relay, or they might even upload a PHP shell and try to root the box. I've seen it happen.
On web servers where many users have shell accounts, letting each user make the call on whether a project is secure enough to be hosted with the apache write option is not a good idea. You basically need to support apache reads only. Your best bet if they want to do writes is to force them through a database like MySQL. This works well for text (the majority of cases) but is not a good idea for attachments. If a good open source project had a generalized attachment solution which focused on security and provided an API, and other projects then adopted it, I think admins and users would be happier.
NFS intr
I've been using the NFS intr option:
If an NFS file operation has a major timeout and it is hard mounted, then allow signals to interrupt the file operation and cause it to return EINTR to the calling program. The default is to not allow file operations to be interrupted.
The Linux Network Administrators Guide or wlug has more information.
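In practice it's just a mount option; a sketch of an fstab line using it (the server and paths are placeholders):
nas0.prd.domain.tld:/isos  /mnt/isos  nfs  rw,hard,intr  0 0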
But like others I've had problems when the network share went away. RedHat's fix has normally worked for me in the past, but I recently had it not work. I ended up doing a lazy umount (the -l option) to get off of it without hurting the server.
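For reference, a sketch of that recovery (the mount point is a placeholder):
fuser -vm /mnt/isos   # see which processes are stuck on the hung mount
umount -l /mnt/isos   # lazy unmount: detach now, clean up when references drop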