A colleague of mine came across the following facts regarding I/O and fabric switches:
- 2 Gbps FC can queue 254 I/O commands
- 4 Gbps FC can queue 2048 I/O commands
We wonder if moving a certain server from a 2G switch (McData DS-24M2) to a 4G switch (Cisco MDS-9124) will improve performance. To determine this I'd need to see how many I/O commands are outstanding at different points in time.
I've blogged about seeing I/O with /proc/diskstats before. Let's look at it more closely with awk. Note that I don't have anything conclusive below, just some observations; I do think I can use this over time to recognize trends on my system, though.
Here's what proc says about a particular LUN:
# grep " sdc " /proc/diskstats
8 32 sdc 391348542 3811329 642958694 1765819166 212637694
1424571277 438970288 1314722135 1 366113251 3445284834
#
As per comp.os.linux.development, these fields (starting after the device name) are:
Field 1 -- # of reads issued
Field 2 -- # of reads merged
Field 3 -- # of sectors read
Field 4 -- # of milliseconds spent reading
Field 5 -- # of writes completed
Field 6 -- # of writes merged
Field 7 -- # of sectors written
Field 8 -- # of milliseconds spent writing
Field 9 -- # of I/Os currently in progress
Field 10 -- # of milliseconds spent doing I/Os
Field 11 -- weighted # of milliseconds spent doing I/Os
Or to put it another way:
391348542 reads issued (4)
3811329 reads merged (5)
642958694 sectors read (6)
1765819166 milliseconds spent reading (7)
212637694 writes completed (8)
1424571277 writes merged (9)
438970288 sectors written (10)
1314722135 milliseconds spent writing (11)
1 I/Os currently in progress (12)
366113251 milliseconds spent doing I/Os (13)
3445284834 weighted milliseconds spent doing I/Os (14)
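Counting columns by hand is error-prone, so here's a small awk sketch that prints each field next to a label (the label names are just my shorthand for the kernel descriptions above; sdc is the same LUN):
# awk '$3 == "sdc" {
    split("reads_issued reads_merged sectors_read ms_reading writes_completed writes_merged sectors_written ms_writing ios_in_progress ms_doing_io weighted_ms_io", label, " ")
    for (i = 1; i <= 11; i++) printf "%-18s %s\n", label[i], $(i + 3)
  }' /proc/diskstats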
Note that I've put the awk field offsets in parentheses above (awk counts the major number, minor number, and device name as $1 through $3, so kernel field 1 lands in $4). We can then take more readings and focus on essential columns. E.g. we spend more time reading than writing:
# while [ 1 ]; do grep " sdc " /proc/diskstats |
    awk '{print $7 " " $11}'; sleep 1; done
1767053699 1323835167
1767053722 1323835217
1767054231 1323858000
1767054400 1323858477
1767054401 1323859097
1767054420 1323859106
1767055201 1323863662
1767055543 1323863671
1767055666 1323864799
1767056048 1323865700
#
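Field 10 ($13) is also worth sampling: it counts milliseconds the device spent doing I/O, so its per-second change is roughly what iostat reports as %util. A minimal sketch, again assuming sdc:
# while true; do grep " sdc " /proc/diskstats; sleep 1; done |
    awk '{ if (NR > 1) printf "%.1f%% busy\n", ($13 - t) / 10; t = $13 }'
(Dividing the millisecond delta by 10 turns it into a percentage of a 1000 ms interval; the sleep makes each interval slightly longer than a second, so treat the figure as approximate.)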
If we look at the counters every quarter second we can see spikes in the number of I/Os in progress ($12), along with the reads issued ($4) and writes completed ($8) during that time (looking at a larger interval hides the spikes):
# while [ 1 ]; do grep " sdc " /proc/diskstats |
    awk '{print $12 "\t" $4 "\t" $8}'; sleep 0.25; done
1 391689249 213184077
1 391689253 213184467
4 391689253 213184912
4 391689253 213185311
4 391689253 213185780
1 391689257 213186170
1 391689257 213186558
2 391689258 213187017
68 391689271 213187319
1 391689271 213187801
2 391689313 213188219
1 391689338 213188481
2 391689379 213188863
44 391689379 213189282
32 391689384 213190180
3 391689400 213190569
3 391689400 213190971
1 391689405 213191429
3 391689407 213192172
#
We can check the math on the last few lines. Because our sampling interval misses events that occur between readings, the numbers won't add up exactly, but we can see a general trend in some of them:
1 391689338 213188481
2 391689379 213188863
44 391689379 213189282
32 391689384 213190180
There were a lot more writes than reads in the samples above: reads issued grew by 46 (391689384 - 391689338) while writes completed grew by 1699 (213190180 - 213188481).
3 391689400 213190569
3 391689400 213190971
Nothing new was read, but 402 writes completed (213190971 - 213190569) while 3 I/Os were in flight at each sample.
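Checking these deltas by hand gets tedious, so here's a small wrapper of my own that prints per-interval changes instead of raw counters (reads issued, writes completed, and the in-flight count from the loops above):
# while true; do grep " sdc " /proc/diskstats; sleep 1; done |
    awk '{ if (NR > 1) printf "%d\t%d\t%d\n", $4 - r, $8 - w, $12; r = $4; w = $8 }'
The first sample is skipped since there's nothing to diff against, and $12 is printed as-is because it's a gauge (I/Os in flight right now) rather than a cumulative counter.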
I don't have anything conclusive from the above, but I do think I can use this over time to recognize trends on my system.