One thorn in the side of any Linux systems administrator is tracking down which processes are using the most disk I/O. When your load is climbing and 'top' says that you are mostly just "waiting" around, you need some way to find out what is exhausting the resource. I have yet to find a foolproof method that will identify those processes with certainty, but I have run across several methods that will at least give you some information to go on. I will start with the basics and move towards some more advanced methods.
The most basic method is to list processes in the D state which, according to the man page for 'ps', is the "Uninterruptible sleep" state and is usually I/O related. The problem here is that you are listing processes that are waiting in that state, not necessarily the processes that are causing the issue.
# ps -eo pid,user,wchan=WIDE-WCHAN-COLUMN -o s,cmd|awk ' $4 ~ /D/ {print $0}'
The above creates a formatted output for the 'ps' command and prints the lines where the state column ('S', the fourth field) is "D".
Another common practice is to find more information about those processes with the 'lsof' command. This one-liner identifies the processes as above, and also lists the files opened by each process:
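Because a single 'ps' snapshot only catches whatever happens to be blocked at that instant, it can help to sample several times and tally the results. The following is a minimal sketch of that idea rather than a definitive tool. The function names filter_dstate, tally_commands and sample_dstate are made up for this example, and it assumes a procps-style 'ps' that accepts '-o pid= -o s= -o comm='.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the names are not from any package.

# Keep only rows whose second field (the process state) starts with D.
filter_dstate() {
    awk '$2 ~ /^D/'
}

# Tally rows by command name (field 3 onward), most frequent first.
tally_commands() {
    awk '{ $1 = ""; $2 = ""; sub(/^ +/, ""); count[$0]++ }
         END { for (c in count) print count[c], c }' | sort -nr
}

# Take N one-second samples of the process table (default 10) and
# report how many times each command was seen in the D state.
sample_dstate() {
    samples=${1:-10}
    i=0
    while [ "$i" -lt "$samples" ]; do
        ps -e -o pid= -o s= -o comm= | filter_dstate
        sleep 1
        i=$((i + 1))
    done | tally_commands
}
```

Calling 'sample_dstate 30' would watch for roughly thirty seconds. A command that shows up in most of the samples is a better suspect than one caught in a single snapshot, although it is still only a process that is waiting on I/O, not proof of who is causing it.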
# for i in `ps -eo pid,user,wchan=WIDE-WCHAN-COLUMN -o s,cmd | \
awk '$4 ~ /D/ {print $1}'`; do echo "----------"$i"----------"; \
lsof -p $i; done
As you can see, we are just capturing the output of the first command and running 'lsof' against the PID of each process in the resulting list, which provides a detailed picture of what those processes are doing. Or at least what they are doing with files, which usually reside on disk, which is what we are talking about anyway.
And finally, the most detailed, accurate method for tracking down an I/O issue that I have found is to enable block_dump debugging. I show how to enable block_dump in the commands below. When this flag is set, Linux reports all disk read and write operations that take place. The output of block_dump is written to the kernel log, and it can be retrieved using "dmesg". When you use block_dump and your kernel logging level also includes kernel debugging messages, you probably want to turn off klogd; otherwise the output of block_dump will be logged, causing disk activity that is not normally there. Here is the process for getting useful information from the block_dump debugging messages:
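The full 'lsof' listing also includes sockets, pipes and directories, which can bury the on-disk files you actually care about. As a rough sketch, the machine-readable output of 'lsof -F tn' (one field per line, prefixed with 'f' for the descriptor, 't' for the type and 'n' for the name) can be filtered down to regular files. The function name regular_files is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads "lsof -F tn" output on stdin and prints
# the names of entries whose type is REG (regular file).
regular_files() {
    awk '/^f/ { type = "" }                  # a new descriptor starts; forget the old type
         /^t/ { type = substr($0, 2) }       # remember the type of this descriptor
         /^n/ && type == "REG" { print substr($0, 2) }'
}
```

For a process in the D state with a PID of, say, 4021 (a made-up number), 'lsof -p 4021 -F tn | regular_files' would list only the regular files it has open.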
# /etc/init.d/syslog stop
# echo 1 > /proc/sys/vm/block_dump
# sleep 60
# dmesg | awk '/(READ|WRITE|dirtied)/ {process[$1]++} END {for (x in process) \
print process[x],x}' |sort -nr |awk '{print $2 " " $1}' | \
head -n 10
... list of 10 most offensive processes ...
# echo 0 > /proc/sys/vm/block_dump
# /etc/init.d/syslog start
That is just about the best you can hope for: a list of the processes that have touched the disk most often. Adjust the 'sleep' command to set your sample window, or change or remove the 'head' to adjust the size of the output.
Not to be forgotten are the tools 'iostat' and 'hdparm'. These utilities will go a long way in helping you understand how much I/O your disks can handle and help you make the decision on whether or not additional spindles are needed.
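One wrinkle worth knowing about: on systems where 'dmesg' prefixes each line with a timestamp such as "[ 1234.567890]", the process name is no longer the first field, so the counting above can go astray. Here is a hedged sketch of the same count that strips such a prefix first. The function name count_block_dump is invented for this example, and it assumes block_dump lines of the form "name(pid): READ block ..." with no spaces in the process name.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints
# "count name(pid)" for block_dump entries, busiest first.
count_block_dump() {
    # Drop a leading "[ timestamp ]" prefix if one is present.
    sed 's/^\[[^]]*] *//' |
    awk '/(READ|WRITE|dirtied)/ { sub(/:$/, "", $1); hits[$1]++ }
         END { for (p in hits) print hits[p], p }' |
    sort -nr
}
```

It drops into the procedure above in place of the longer pipeline, for example 'dmesg | count_block_dump | head -n 10', and only makes sense on kernels that still provide /proc/sys/vm/block_dump.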
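As a small illustration of putting 'iostat' to work, the sketch below reads the extended report from 'iostat -dx' and prints the devices whose utilization is at or above a threshold. It assumes that the header row starts with "Device" and that %util is the last column, which is true of the sysstat versions I am aware of but worth checking against your own output. The function name busy_devices and the default threshold of 80 are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reads "iostat -dx" output on stdin and prints
# "device %util" for devices at or above the given threshold.
busy_devices() {
    threshold=${1:-80}
    awk -v limit="$threshold" '
        /^Device/ { in_table = 1; next }   # header row starts a device table
        NF == 0   { in_table = 0; next }   # a blank line ends it
        in_table && $NF + 0 >= limit { print $1, $NF }
    '
}
```

Something like 'LC_ALL=C iostat -dx 5 2 | busy_devices 90' would flag disks that are close to saturated. Keep in mind that the first report 'iostat' prints is an average since boot, so the second one is the interesting one.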
Thanks to all of the people who provided their insight on this subject and for the countless internet blurbs discussing the same. One final thought: I would love to be updated with any other methods people are using to track down I/O related issues in the wild; knowledge is power, yada yada yada, did I mention the lobster bisque?