Tuesday, October 30, 2007
redhat hang
If your RedHat system gets really overloaded and hangs, here are some things that might help:
- SysRq: SysRq is a key combo that the kernel will respond to regardless of whatever else it is doing, unless it is completely locked up.
- Hangwatch: Hangwatch periodically polls /proc/loadavg and echoes a user-defined set of characters into /proc/sysrq-trigger if a user-defined load threshold is exceeded.
- NMI watchdog: A Non-Maskable Interrupt (NMI) is an interrupt the CPU cannot ignore. Enabling the NMI watchdog turns on the kernel's built-in lockup detector: by generating periodic NMIs, the kernel can monitor whether any CPU has locked up and print debugging messages as needed.
Labels:
linux
Thursday, October 25, 2007
five-minute monitoring
A system independent of my mail server will text my phone if any of the ports on the mail server are not accessible. It checks every 5 minutes, which is about how long it took to whip this thing up.
me@workstation:~$ crontab -l
*/5 * * * * /bin/sh /home/me/code/shell/monitor.sh > /dev/null 2>&1
me@workstation:~$ cat /home/me/code/shell/monitor.sh
#!/bin/sh
EMAIL="my_number@cell_phone_company.com";
HOST="mailserver.domain.tld";
PORTS="25,80,110,143,443,587,993,995";
CMD=`/usr/bin/nmap -p $PORTS $HOST | grep tcp | grep -v open`;
if [ -n "$CMD" ]  # if output of command has non-zero length
then
    echo $CMD | /bin/mail $EMAIL;
else
    echo "$PORTS are open on $HOST";
fi
me@workstation:~$
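The same check can be sketched in Python without parsing nmap output. This is only an illustration, not the script I run: the function name closed_ports and the timeout value are made up for the example, and it only tests whether a TCP connection is accepted.

```python
import socket


def closed_ports(host, ports, timeout=3.0):
    """Return the ports on host that did not accept a TCP connection."""
    failed = []
    for port in ports:
        try:
            # A completed connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Refused, timed out, or the name did not resolve.
            failed.append(port)
    return failed
```

Calling closed_ports("mailserver.domain.tld", [25, 80, 110, 143, 443, 587, 993, 995]) from cron and mailing the result when the list is non-empty would mirror the shell version.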
Wednesday, October 24, 2007
DNS MX hacks
In a previous post I talked about mail gateway load balancing by having two MX records in BIND. I mentioned that I'd want the more powerful server chosen a larger percentage of the time.
You would think I could just add another instance of the same host:
MX 10 mta0
MX 10 mta1
MX 10 mta1
but this doesn't work. BIND just ignores the duplicate entry as redundant. I could make mta2 a CNAME for mta1 and then add mta2 as a third MX record. I've done some tests in a test environment and this works. However, this is a hack, and pointing an MX record at a CNAME is theoretically not permitted (RFC 1034 section 3.6.2).
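To see what the alias trick would buy, here is a rough sketch of the arithmetic. It assumes every equal-preference MX record is equally likely to be listed first and that senders try the first record; real resolvers and MTAs may behave differently. The function name expected_share is made up for this example.

```python
from collections import Counter
from fractions import Fraction


def expected_share(mx_targets, aliases=None):
    """Share of first-choice picks per real host.

    Assumes each equal-preference MX target is equally likely to come first.
    aliases maps an alias name to the real host it points at.
    """
    aliases = aliases or {}
    counts = Counter(aliases.get(name, name) for name in mx_targets)
    total = sum(counts.values())
    return {host: Fraction(n, total) for host, n in counts.items()}


# mta2 is the hypothetical CNAME for mta1 described above.
shares = expected_share(["mta0", "mta1", "mta2"], aliases={"mta2": "mta1"})
print(shares)
```

Under those assumptions mta1 would be picked first two thirds of the time and mta0 one third.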
mail gateway load balancing
I have a dedicated mail gateway (mta0) which filters spam. It's been overworked so I set up a second (mta1). mta0 is the master and stores spam definitions and user preferences in a MySQL DB. mta1 is a slave and receives these content updates from mta0.
In order to get both systems sharing the load I simply add a second MX record for mta1. Here are the relevant portions of my zone file before:
$ grep -n MX domain.tld.zone
16: MX 10 mta0.domain.tld.
359:mail MX 10 mta0.domain.tld.
$
and after:
$ grep -n MX domain.tld.zone
16: MX 10 mta0.domain.tld.
17: MX 10 mta1.domain.tld.
360:mail MX 10 mta0.domain.tld.
361:mail MX 10 mta1.domain.tld.
$
BIND will automatically rotate the order of the two MX records from one lookup to the next. E.g. note how mta0 and mta1 take turns on top in consecutive queries:
$ dig @dns.domain.tld domain.tld +short MX
10 mta0.domain.tld.
10 mta1.domain.tld.
$ dig @dns.domain.tld domain.tld +short MX
10 mta1.domain.tld.
10 mta0.domain.tld.
$
It then just takes a little time for your DNS updates to propagate. You can test your changes by using mxtoolbox.com or by sending mail from hosts like gmail and yahoo and seeing which mta relayed it by viewing the full headers. Before you drop a second email hub into service be sure that it sends mail where it should. It would be a shame if half of your mail was lost. Use the following test and make sure you get the email where you'd expect. You might need to adjust your spam filter to let the test message below through:
telnet mta1 25
HELO workstation.domain.tld
MAIL FROM: me@domain.tld
RCPT TO:me@domain.tld
DATA
test
.
One nice thing about this is that you can add the second system without any downtime. mta0 does not need to be brought offline; it's just a matter of waiting for DNS to propagate. Since mta1 has twice as much CPU and RAM as mta0 I'm going to look into weighting the records so that mta1 gets more of the load.
Labels:
mail dns
Saturday, October 20, 2007
/lib/modules space hack
I made / too small for xubuntu and had trouble upgrading my kernel from 2.6.20-15 to 2.6.20-16.
dpkg: error processing /var/cache/apt/archives/linux-restricted-modules-2.6.20-16-generic_2.6.20.5-16.29_i386.deb (--unpack):
 failed in buffer_write(fd) (9, ret=-1): backend dpkg-deb during `./lib/linux-restricted-modules/2.6.20-16-generic/nvidia_legacy/nv-kernel.o': No space left on device
The issue was that /lib/modules/2.6.20-15-generic was taking up too much space (~100M):
$ df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1   236M  214M  9.4M   96%   /
varrun      375M  100K  375M   1%    /var/run
varlock     375M  0     375M   0%    /var/lock
procbususb  375M  100K  375M   1%    /proc/bus/usb
udev        375M  100K  375M   1%    /dev
devshm      375M  0     375M   0%    /dev/shm
lrm         375M  33M   342M   9%    /lib/modules/2.6.20-15-generic/volatile
/dev/sda7   20G   1.4G  18G    8%    /home
/dev/sda5   4.6G  2.3G  2.1G   53%   /usr
/dev/sda6   1.9G  800M  982M   45%   /var
My klugey fix (this system wasn't too important) was to "rm -r /lib/modules/2.6.20-15-generic" (after I tarred it up on another partition). There were complaints about not being able to remove volatile, but I was counting on that and left it; everything else made way. I did this while running the 2.6.20-15 kernel; I figured that what I would need for this operation was in RAM. The upgrade worked, so I was then able to boot off of the 2.6.20-16 kernel. I then removed the 2.6.20-15 kernel. I guess I only have room for one at a time.
$ sudo dpkg -r linux-image-2.6.20-15-generic
Password:
(Reading database ... 102721 files and directories currently installed.)
Removing linux-image-2.6.20-15-generic ...
Running postrm hook script /sbin/update-grub.
You shouldn't call /sbin/update-grub. Please call /usr/sbin/update-grub instead!
Searching for GRUB installation directory ... found: /boot/grub
Testing for an existing GRUB menu.lst file ... found: /boot/grub/menu.lst
Searching for splash image ... none found, skipping ...
Found kernel: /boot/vmlinuz-2.6.20-16-generic
Found kernel: /boot/memtest86+.bin
Updating /boot/grub/menu.lst ... done
The link /vmlinuz.old is a damaged link
Removing symbolic link vmlinuz.old
Unless you used the optional flag in lilo, you may need to re-run your boot loader[lilo]
The link /initrd.img.old is a damaged link
Removing symbolic link initrd.img.old
Unless you used the optional flag in lilo, you may need to re-run your boot loader[lilo]
Thursday, October 18, 2007
load watch
If you want to keep your eye on a system and you don't even have cron you can keep this script running. It wakes up every 10 minutes and sends me an email if the load is more than six:
#!/usr/bin/env python
# Filename: load_watch.py
# Description: emails me if load too high
# Supported Language(s): Python 2.5.1
# --------------------------------------------------------
import commands
import os
import time
import smtplib

def main():
    while(1):
        frequency = 60 * 10 # ten minutes
        warn_load = 6.0
        # first field of /proc/loadavg is the one-minute load average
        loadcmd = 'cut -d " " -f1 /proc/loadavg'
        loadavg = float(commands.getoutput(loadcmd))
        if loadavg > warn_load:
            send_warning(loadavg)
        time.sleep(frequency)
        print loadavg

def send_warning(loadavg):
    """emails high load as subject"""
    print "warning"
    domain = 'domain.tld'
    smtpServer = 'mail.' + domain
    fromAddr = 'load_watch@mail.' + domain
    toAddr = 'me@' + domain
    msg = ""
    msg += "To: " + toAddr + "\n"
    msg += "From: " + fromAddr + "\n"
    msg += "Subject: " + 'Mail Load: %s' % loadavg
    msg += "\n\n"
    server = smtplib.SMTP(smtpServer)
    server.set_debuglevel(0)
    server.sendmail(fromAddr, toAddr, msg)
    server.quit() # close the SMTP connection after each warning

if __name__=="__main__":
    main()
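The script above is Python 2 and shells out to cut. As a hedged aside, the load check itself can be done in current Python by reading the file directly. The helper names parse_loadavg and should_warn are invented for this sketch, and the sample strings only imitate the /proc/loadavg layout.

```python
def parse_loadavg(text):
    """Return the one-minute load average from the text of /proc/loadavg."""
    # The first whitespace-separated field is the one-minute average.
    return float(text.split()[0])


def should_warn(text, warn_load=6.0):
    """True when the one-minute load is above the warning threshold."""
    return parse_loadavg(text) > warn_load


def read_load(path="/proc/loadavg"):
    """Read and parse the live value (Linux only)."""
    with open(path) as handle:
        return parse_loadavg(handle.read())


print(should_warn("6.25 4.10 3.00 2/345 6789\n"))
print(should_warn("0.50 0.40 0.30 1/200 1234\n"))
```

On Unix systems os.getloadavg() returns the same three averages without touching /proc.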
Wednesday, October 17, 2007
ping keep shell
If you don't want an SSH session to time out you can leave the following command running.
ping -i 30 $host
This command sends a single ping to $host every 30 seconds by using ping's interval option.
Tuesday, October 16, 2007
blackberry spoofing?
I have a colleague who does user support with blackberry devices. We ended up looking at the headers of a message from one of the new devices that he was testing. He was told that the new device uses a different protocol. The first header looked something like this:
Received: from mail.domain.tld (HELO domain.tld) ([123.456.78.9])
  by as16.bis.na.blackberry.com with ESMTP; 11 Oct 2007 20:29:46 +0000
Note that there's no message ID and I have nothing in my logs from this transaction. A normal message sent to google looks like this:
Received: from domain.tld (mail.domain.tld [123.456.78.9])
  by mx.google.com with ESMTP id i35si14940528wxd.2007.10.16.11.05.19;
  Tue, 16 Oct 2007 11:05:19 -0700 (PDT)
Note the message ID and that I can confirm an SMTP handshake in my logs:
14:05:20.14 2 SMTP-25607(gmail.com) [12046375] sent to [66.249.83.27:25], got:250 2.0.0 OK 1192557920 i35si14940528wxd
In the case of the blackberry the first header really came from them. I suspect that the device connected to their server over the cellular network to send the mail. Their server then wrote that header to say it was from us, not them. So this first header's account of what supposedly happened is misleading as far as I can tell.
Friday, October 12, 2007
DNS Math
; I'm posting this one in Elisp.
;
; Someone I work with entered 200710110333 instead of
; 2007101103 for a serial number field of the SOA RR on
; our root DNS server. Once this high number propagated
; DNS updates with the correct date broke.
;
; The problem is that the following is true:

(< 2007101201 200710110333)

; so today's updates didn't propagate to our other servers.
;
; According to:
; http://www.zytrax.com/books/dns/ch9/serial.html
;
; "perhaps ritual suicide is the best option" it also says:
;
;; The SOA serial number is an unsigned 32-bit field with
;; a maximum value of 2**31, which gives a range of 0 to
;; 4294967295, but the maximum increment to such a number
;; is 2**(31 - 1) or 2147483647, incrementing the number
;; by the maximum would give the same number.
;
; Did I read that math right? I think the key word here
; is "unsigned". Checking up on this:

(insert-string (expt 2 32))
(insert-string (expt 2 31))
(insert-string (expt 2 30))

; Inserts 4294967296, 2147483648 and 1073741824 into the
; buffer respectively. Also, the notation is bad, I think they
; mean (2**31) - 1 not 2**(31 - 1). More precisely one of
; the two:

(- (expt 2 31) 1)
(- (expt 2 32) 1)

; I guess I'll compute both and use this to solve my problem.
; Let's assume that the following is true:
;
;; An unsigned 32-bit field with a maximum value of 2**31
;
; Actually let's check the RFC:
;
;; http://www.faqs.org/rfcs/rfc1982.html
;
; which defines SERIAL as:
;
;; The unsigned 32 bit version number of the original copy of
;; the zone. Zone transfers preserve this value. This value
;; wraps and should be compared using sequence space arithmetic.
;
; It is "the maximum is always one less than a power of two."
; The DNS-Pro book then says:
;
;; Using the maximum increment, the serial number fix is a two-step
;; process. First, add 2147483647 to the erroneous value, for example,
;; 2008022800 + 2147483647 = 4155506447, restart BIND or reload the zone,
;; and make absolutely sure the zone has transferred to all the slave
;; servers. Second, set the SOA serial number for the zone to the correct
;; value and restart BIND or reload the zone again. The zone will
;; transfer to the slave because the serial number has wrapped through
;; zero and is greater that the previous value of 4155506447! RFC 1982
;; contains all the gruesome details of serial number comparison
;; algorithms if you are curious about such things.
;
; OK so what's the real_error value? It wrapped by 2^32:

(mod 200710110333 (expt 2 32))

; Which makes sense since
;
; me@workstation:~> dig @nameserver mta1.domain.tld
; ...
; domain.tld. 3600 IN SOA
; nameserver.domain.tld. hostmaster.domain.tld.
; 3141614717 1800 900 86400 3600
; me@workstation:~>

; So, I'm going to set my root server's SOA SN to 3141614717

(mod 200710110333 (expt 2 32))

; To get everyone back in sync. dig verified that they're
; back in sync in less than an hour:
;
;; for x in dns1 ... dnsN;
;; do dig @$x domain.tld SOA +short;
;; done
;
; I'm then free to set it to:

(let ((today 2007101201))
  (- (expt 2 32) 1 (abs (- today (- (expt 2 31) 1)))))

; or greater than 0 but less than:

(let ((error_value 3141614717))
  (mod (+ (- (expt 2 31) 1) error_value) (expt 2 32)))

; For fun my colleague and I went with 666 and then did
; an "rndc reload" and then set it to today and reload
; again.
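For readers who would rather not evaluate Elisp, here is a small Python sketch of the comparison rule from RFC 1982 applied to the numbers in this post. The function names serial_gt and serial_add are my own labels for the example, not anything BIND exposes.

```python
SERIAL_BITS = 32
MOD = 2 ** SERIAL_BITS          # serials wrap at 2**32
HALF = 2 ** (SERIAL_BITS - 1)   # 2**31


def serial_gt(a, b):
    """True if serial a counts as newer than b in sequence space arithmetic."""
    return (a < b and b - a > HALF) or (a > b and a - b < HALF)


def serial_add(serial, n):
    """Add n to a serial; RFC 1982 only defines this for 0 <= n <= 2**31 - 1."""
    if not 0 <= n <= HALF - 1:
        raise ValueError("increment out of range")
    return (serial + n) % MOD


stored = 200710110333 % MOD     # what the bad entry became on the wire
today = 2007101201

print(stored)                               # the value dig reported
print(serial_gt(today, stored))             # is today's serial newer than the bad one?
step = serial_add(stored, HALF - 1)         # the book's maximum-increment step
print(step, serial_gt(step, stored))        # the intermediate serial
print(serial_gt(666, stored), serial_gt(today, 666))  # the 666 detour
```

Any intermediate serial that is newer than the bad value and older than today's date works, which is why 666 was as good a choice as the maximum increment.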
mail arriving out of order?
If your users complain about mail arriving out of order you can tell them the story below. But if they're complaining that this happens a lot, it's probably a symptom of your MTA being overloaded.
The mail protocol makes no guarantee that your messages will arrive in order.
If the mail server for foo.com tries to relay message-0 sent at 2:00 to mta0.domain.tld and mta0 doesn't have the resources, it can and will refuse the SMTP connection. foo.com will then put message-0 back in its queue, not to be resent for as long as root@foo.com sees fit to configure it. Let's assume this value is 1 hour. foo.com could then try to send message-1 (this is not message-0, which is still in the queue) at 2:05. When it tries to get its SMTP connection this time, mta0 has resources at that moment and accepts the message for delivery. Then at 3:00 foo.com tries to make another connection for message-0 to mta0, which again has resources and accepts the message. So, in the end message-0 sent at 2:00 arrives at 3:00 and message-1 sent at 2:05 arrives at 2:05. All of this is perfectly legal.
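The timeline above can be replayed with a toy model. This is only a sketch under simplifying assumptions: times are minutes after midnight, the receiving MTA is busy at exactly the listed minutes, and the sender retries each message after a fixed delay. The function name delivery_times is made up for the example.

```python
def delivery_times(sent_at, busy_at, retry_delay):
    """Return the minute each message is finally accepted.

    sent_at: dict of message name -> minute of first attempt
    busy_at: set of minutes at which the receiving MTA refuses connections
    retry_delay: minutes the sender waits before retrying a message
    """
    arrivals = {}
    for name, first_try in sent_at.items():
        attempt = first_try
        while attempt in busy_at:
            attempt += retry_delay  # message goes back in the sender's queue
        arrivals[name] = attempt
    return arrivals


arrivals = delivery_times(
    sent_at={"message-0": 120, "message-1": 125},  # 2:00 and 2:05
    busy_at={120},                                 # mta0 overloaded at 2:00
    retry_delay=60,                                # sender retries after 1 hour
)
print(sorted(arrivals, key=arrivals.get))
```

With these inputs message-1 is accepted at minute 125 and message-0 at minute 180, so the later message arrives first.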
Wednesday, October 10, 2007
This is a handy command for testing:
echo -e "sent on `date`" | mail -s "test: `hostname`" foo@domain.tld
It will send mail quickly from the command line and there's no need to muck about with any clients. After running this you can go see how your message is doing in /var/spool/mqueue etc.
Tuesday, October 9, 2007
tar -C
Would you believe that I got this far without knowing the tar -C option?
If you do the following:
tar xzf foo.tar.gz -C /
then the contents of foo.tar.gz are extracted under / (so if the tarball holds a top-level foo directory, you end up with /foo). Note that the -C means change to the specified directory so that the contents of your extract end up there. As the man page says:
-C, --directory DIR
    change to directory DIR
I normally just put the tarball where I want it extracted and then extract it. My problem tonight was that the partition where I wanted to extract it to was not big enough for both the tarball and its files. I had to extract it to that directory from a different partition. Of course I had to wait until I saw a disk full error to realize this. If I had known about it I would probably be home now and not waiting for a 5G tarball to uncompress into 15G. I'll probably remember it now.
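If you're scripting this instead of typing it, Python's tarfile module has a rough equivalent of -C: the path argument to extractall. The sketch below assumes a gzip-compressed tarball; extract_to is just a name I picked for the example.

```python
import tarfile


def extract_to(tarball, dest):
    """Extract a .tar.gz into dest without copying the tarball there first."""
    with tarfile.open(tarball, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Pythons offer extraction filters; "data" rejects unsafe members.
            tar.extractall(path=dest, filter="data")
        else:
            tar.extractall(path=dest)
```

As with tar -C, the tarball can stay on a different partition from the directory it is unpacked into.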
Monday, October 8, 2007
tcpdump
If you're debugging network services between two hosts don't forget to let tcpdump help you.
E.g. if host1 can't SSH to host2 and you think an external firewall is blocking you, try this:
1. Have host2 display TCP info on port 22 and display only results containing host1:
host2$ tcpdump -i eth0 tcp port 22 | grep host1
2. Try to SSH from host1 to host2. If an external firewall is blocking them you should see nothing from the command above. However, if you see something like the following, then the external firewall isn't blocking you and it's an issue between the two hosts:
host2$ tcpdump -i eth0 tcp port 22 | grep host1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
14:08:50.412007 IP host1.domain.tld.54403 > host2.domain.tld.ssh: S 2707949880:2707949880(0) win 5840
14:08:53.411702 IP host1.domain.tld.54403 > host2.domain.tld.ssh: S 2707949880:2707949880(0) win 5840
14:08:59.411422 IP host1.domain.tld.54403 > host2.domain.tld.ssh: S 2707949880:2707949880(0) win 5840
In the case above host1 keeps resending the same SYN because host2 is unable to get back to host1 to complete the TCP handshake. In this case you could try reaching host1 from host2 and debug the resulting issue (e.g. broken netmask on host2). A healthy looking tcpdump from host2 would look like this:
host2$ tcpdump -i eth0 tcp port 22 | grep host1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
14:11:12.101381 IP host1.domain.tld.54432 > host2.domain.tld.ssh: S 1120004832:1120004832(0) win 5840
14:11:12.101391 IP host2.domain.tld.ssh > host1.domain.tld.54432: S 320695995:320695995(0) ack 1120004833 win 5792
14:11:12.101498 IP host1.domain.tld.54432 > host2.domain.tld.ssh: . ack 1 win 183
14:11:12.107969 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1:24(23) ack 1 win 1448
14:11:12.108063 IP host1.domain.tld.54432 > host2.domain.tld.ssh: . ack 24 win 183
14:11:12.108165 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 1:23(22) ack 24 win 183
14:11:12.108173 IP host2.domain.tld.ssh > host1.domain.tld.54432: . ack 23 win 1448
14:11:12.108341 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 23:663(640) ack 24 win 183
14:11:12.108347 IP host2.domain.tld.ssh > host1.domain.tld.54432: . ack 663 win 1768
14:11:12.109143 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 24:664(640) ack 663 win 1768
14:11:12.109313 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 663:687(24) ack 664 win 223
14:11:12.111342 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 664:816(152) ack 687 win 1768
14:11:12.113385 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 687:831(144) ack 816 win 223
14:11:12.119580 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 816:1280(464) ack 831 win 1768
14:11:12.121828 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 831:847(16) ack 1280 win 263
14:11:12.162117 IP host2.domain.tld.ssh > host1.domain.tld.54432: . ack 847 win 1768
14:11:12.162204 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 847:895(48) ack 1280 win 263
14:11:12.162219 IP host2.domain.tld.ssh > host1.domain.tld.54432: . ack 895 win 1768
14:11:12.162262 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1280:1328(48) ack 895 win 1768
14:11:12.163845 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 895:959(64) ack 1328 win 263
14:11:12.164001 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1328:1392(64) ack 959 win 1768
14:11:12.166947 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 959:1055(96) ack 1392 win 263
14:11:12.168083 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1392:1456(64) ack 1055 win 1768
14:11:12.170456 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 1055:1151(96) ack 1456 win 263
14:11:12.170493 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1456:1520(64) ack 1151 win 1768
14:11:12.170635 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 1151:1519(368) ack 1520 win 263
14:11:12.171024 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1520:1840(320) ack 1519 win 2088
14:11:12.180209 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 1519:2159(640) ack 1840 win 263
14:11:12.183403 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1840:1872(32) ack 2159 win 2408
--
14:11:12.183659 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 2159:2223(64) ack 1872 win 263
14:11:12.187702 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1872:1920(48) ack 2223 win 2408
14:11:12.187923 IP host1.domain.tld.54432 > host2.domain.tld.ssh: P 2223:2671(448) ack 1920 win 263
14:11:12.201571 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1920:1968(48) ack 2671 win 2728
14:11:12.201596 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 1968:2080(112) ack 2671 win 2728
14:11:12.201691 IP host1.domain.tld.54432 > host2.domain.tld.ssh: . ack 2080 win 263
14:11:12.229982 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 2080:2144(64) ack 2671 win 2728
14:11:12.239626 IP host2.domain.tld.ssh > host1.domain.tld.54432: P 2144:2208(64) ack 2671 win 2728
14:11:12.239719 IP host1.domain.tld.54432 > host2.domain.tld.ssh: . ack 2208 win 263
If you read the above you can see host2 answering host1's SYN with a SYN-ACK (ack 1120004833) and host1 ack'ing that to establish the connection. There are plenty of tcpdump examples on the Interblag.
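The reasoning in this post (nothing seen, SYN seen but unanswered, or SYN-ACK sent) can be written down as a small classifier. This sketch assumes the older tcpdump line format shown above, where the flags appear as a bare S, P or dot; newer tcpdump versions print flags differently, so the regular expression would need adjusting. The name diagnose and its three return strings are invented for the example.

```python
import re

# Matches lines such as:
# "14:08:50.412007 IP host1.54403 > host2.ssh: S 2707949880:2707949880(0) win 5840"
PACKET = re.compile(r" IP (\S+) > (\S+): (\S+) ")


def diagnose(lines):
    """Classify a capture as 'no-packets', 'syn-unanswered' or 'syn-ack-sent'."""
    saw_syn = False
    for line in lines:
        match = PACKET.search(line)
        if not match:
            continue  # tcpdump banner lines and grep separators
        flags = match.group(3)
        if "S" in flags:
            if " ack " in line:
                return "syn-ack-sent"  # the server answered the SYN
            saw_syn = True
    return "syn-unanswered" if saw_syn else "no-packets"
```

Feeding it the first capture above gives "syn-unanswered" and the second gives "syn-ack-sent"; an empty capture gives "no-packets", which is the case where an external firewall is the likely suspect.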
Labels:
network
Wednesday, October 3, 2007
nice
I experimented with nice values and top and I'm pasting my results here.
When running top note the value of PR vs NI:
PR:
The priority number is calculated by the kernel and is used to determine the order in which processes are scheduled. The kernel takes many factors into consideration when calculating this number, and it is not unusual to see large fluctuations in this number over the lifetime of a process.
NI:
This column reflects the "nice" setting of each process. A process's nice value is inherited from its parent. Most user processes run at a nice of 0, indicating normal priority. Users have the option of starting a process with a positive nice value to allow the system to reduce the priority given to that process. This is normally done for long-running CPU-bound jobs to keep them from interfering with interactive processes. The Unix command "nice" controls setting this value. Only root can set a nice value lower than the current value. Nice values can be negative. On Linux they range from -20 to 19. The nice value influences the priority value calculated by the Unix scheduler.
To see these values in action I'll start two CPU intensive processes:
someguy@machine:~$ dd if=/dev/urandom of=/dev/null &
[1] 31406
someguy@machine:~$ dd if=/dev/urandom of=/dev/null &
[2] 31415
someguy@machine:~$
I see them both at the top of the stack:
  PID USER     PR  NI VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
31406 someguy  25   0 9200 772 620 R  100  0.0 3:48.62 dd
31415 someguy  25   0 9204 772 620 R  100  0.0 3:28.74 dd
Note their priority of 25 and their nice value of 0; let's see how they change as we renice them. Remember, the nicer a program the lower its priority. The less nice a program the higher its priority. For example:
- Nice value of 19 is letting people walk on you
- Nice value of 5 is holding the door
- Nice value of -20 is pushing people out of the way
To start a program with a given nice value:
nice -n -10 xmms
If it's already running:
renice 19 -p $pid
In this case I'll move one way up and one way down:
lower 406: renice 19 -p 31406
raise 415: renice -19 -p 31415
The results of the first:
someguy@machine:~$ renice 19 -p 31406
31406: old priority 0, new priority 19
someguy@machine:~$
  PID USER     PR  NI VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
31415 someguy  25   0 9204 772 620 R  100  0.0 6:50.54 dd
31406 someguy  39  19 9200 772 620 R  100  0.0 7:10.48 dd
So its PR number went up (from 25 to 39) along with its nice value. The PR number moves in the same direction as the nice number: the higher the number, the lower the actual scheduling priority. Now, I raise the other:
someguy@machine:~$ sudo renice -19 -p 31415
Password:
31415: old priority 0, new priority -19
someguy@machine:~$
  PID USER     PR  NI VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
31406 someguy  39  19 9200 772 620 R  100  0.0 8:46.89 dd
31415 someguy   6 -19 9204 772 620 R  100  0.0 8:26.92 dd
So, I've raised the priority of 31415. I could maximize it:
someguy@machine:~$ sudo renice -20 -p 31415
31415: old priority -19, new priority -20
someguy@machine:~$
  PID USER     PR  NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
31406 someguy  39  19 9200 772 620 R  100  0.0 17:00.51 dd
31415 someguy   5 -20 9204 772 620 R  100  0.0 16:41.13 dd
Note that I still can't get its PR below 5, even with the lowest possible nice value. Note also that these are the two most CPU intensive processes so they both take the top of the stack, even with the most extreme nice values. If I had a third it should be sandwiched between the two.
someguy@machine:~$ dd if=/dev/urandom of=/dev/null &
[1] 32089
someguy@machine:~$
  PID USER     PR  NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
32089 someguy  25   0 9200 768 620 R  100  0.0  0:21.44 dd
31415 someguy   5 -20 9204 772 620 R  100  0.0 18:29.59 dd
31406 someguy  39  19 9200 772 620 R  100  0.0 18:48.88 dd
Now it's scheduling them round robin. The order in top keeps changing: 15, 6, 89 :: 15, 89, 6 :: 6, 89, 15 :: 89, 6, 15 :: 6, 15, 89. But 15 is on top 2/3 of the time. Interesting. It's still letting the other jobs at the resources. Now, I'll start a fourth CPU hog with a nice value from the start:
someguy@machine:~$ sudo nice -n -20 dd if=/dev/urandom of=/dev/null &
[2] 32288
someguy@machine:~$
I predict that 88 and 15 will stay on top most of the time. Note that it was started with a negative nice value, which required sudo, so it's running as root:
  PID USER     PR  NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
31415 someguy   5 -20 9204 772 620 R  100  0.0 30:48.50 dd
32288 root      5 -20 9200 772 620 R  100  0.0  4:59.89 dd
31406 someguy  39  19 9200 772 620 R   99  0.0 31:05.53 dd
32089 someguy  25   0 9200 768 620 R   99  0.0 12:39.09 dd
Now I'll kill 88 and 15. Then let's try raising the importance of 6 and introducing a new process with a priority just below it:
renice -20 -p 31406 && nice -n -19 dd if=/dev/urandom of=/dev/null &
someguy@machine:~$ sudo su -
root@machine:~# renice -20 -p 31406 && nice -n -19 dd if=/dev/urandom of=/dev/null &
[1] 32656
root@machine:~# 31406: old priority 19, new priority -20
root@machine:~#
top shows:
  PID USER     PR  NI VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
31406 someguy   5 -20 9200 772 620 R  100  0.0 39:28.19 dd
32089 someguy  25   0 9200 768 620 R  100  0.0 21:02.10 dd
32658 root      6 -19 9200 768 620 R  100  0.0  0:22.50 dd
Wasn't that fun? Remember when running top that you can see how all of the CPUs are working by pressing 1.
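A program can also adjust its own niceness instead of being wrapped in nice or renice. Here is a minimal Python sketch for Unix systems; the helper names current_nice and be_nicer are made up for the example. As with renice, an unprivileged process can only move toward a higher nice value.

```python
import os


def current_nice():
    """Return this process's nice value (the NI column in top)."""
    return os.getpriority(os.PRIO_PROCESS, 0)  # 0 means the calling process


def be_nicer(increment=5):
    """Raise this process's nice value by increment and return the new value."""
    return os.nice(increment)


before = current_nice()
after = be_nicer(1)
print(before, after)
```

This is handy for long-running CPU-bound scripts that should stay out of the way of interactive work.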
printing
No, I don't mean having your program print something to the screen:
printf("Hello, world...\n");
I mean sending a PS file (if you're lucky) to a mechanical device which puts its content on paper. What good is that? You can't grep paper. You can't pipe programs to or from it or even copy and paste with it. I guess it works when the power is out, but you can't read it in the dark. A wise man once told me "the only time I've had to print something is when I had to hand it to an idiot". Since this post is about doing something that's not cool I'll describe a non-cool way to approach the issue since it relates to a non-cool part of my job and I'm just going to log it here. Sorry it's lame. I hope the anecdote made this post worth reading on some level. Anyway, if you're configuring a RedHat box that wasn't minimally installed you can add printers remotely by doing:
ssh -X server
Once you're in you can do:
sudo /usr/sbin/system-config-printer-gui
<sarcasm>love it!</sarcasm> Note that the GUI requires you to have something present for the printer path. This seems like a bug since there are times when I want to print to a network printer like a LaserJet 4000 and this field simply doesn't apply. If this happens you can create the printer with the GUI and then go in and edit the file that it generates (which is /etc/cups/printers.conf) and delete that path.