I frequently have Plesk customers who need to know what is consuming the space on their server. The Plesk interface provides a breakdown by domain, but not by the different components within a domain (email, web content, databases, etc.).

I recently devised these one-liners to help me and my customers identify the specific pieces of each domain that contribute to the disk usage:

# for i in `ls -1 /var/qmail/mailnames`; do
        echo "### $i  ###";
        DBDIRS="";
        DATABASES=`mysql -D psa -e "select db.name  from data_bases db,domains d \
            WHERE d.id=db.dom_id AND d.name='$i';" |grep -v ^name$`
        for n in $DATABASES; do
                DBDIRS="$DBDIRS /var/lib/mysql/$n";
        done
        du -hcs /var/qmail/mailnames/$i /var/www/vhosts/$i $DBDIRS;
done

To find the largest mailboxes (the top 50, with sizes in MB) I use this:

# cd /var/qmail/mailnames ; du -hs --block-size=1024K */* | \
    sort -nr | head -50 |sed s/[[:space:]]/'\/'/ |awk -F\/ '{print $1"M\t"$3"@"$2}'

Any others that you use commonly?

I have spent the last couple of days working with a customer
who runs a Joomla site under Plesk.  Asking around, it didn't seem like
there was a standard way to get these two products to behave
nicely together.  The customer wanted to be able to install components
via the Joomla admin interface and update those files via FTP.  Here is
what I ended up doing.  Does anyone have any additions or advice?

1) Change the Umask in '/etc/proftpd.conf' to '002'

2) Add the 'apache' user to the 'psacln' group

3) Update directory permissions:

cd /home/httpd/vhosts/<domain.com>

chown -R <username>:psacln httpdocs
chmod -R g+w httpdocs
find httpdocs -type d -exec chmod g+s {} \;


4) Joomla also complains about some PHP settings, sometimes including
not being able to write to '/var/lib/php/session'.  To fix the issues I
used the following in the 'vhost.conf' file for the domain:

<Directory /home/httpd/vhosts/<domain>/httpdocs>
php_admin_flag magic_quotes_gpc on
php_admin_flag display_errors on
php_admin_value session.save_path /tmp
</Directory>

find more large files


There has to be a better way.  That's what I thought when I wrote the previous "find large files" entry.  After a little digging and some 'find' magic I came up with the following:

 

# find / \( -path '/proc' -o -path '/sys' -o -path '/dev' \) -prune -o \
  -type f -size +1024k -printf "%s %h/%f\n"

 

 Then you can 'sort', 'head' and 'awk' the results down to what you want.  This avoids the problem of scanning the directories you are excluding anyway.  Hope it helps.
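As a rough sketch of that last step, the function below reads the "size path" lines produced by the '-printf' above and prints the N largest files with sizes in MB.  The name 'top_files' is my own, not part of any tool, and it assumes file names do not contain newlines.

```shell
# top_files N: read "<bytes> <path>" lines on stdin, print the N largest in MB.
# Hypothetical helper name; the input format matches find's -printf "%s %h/%f\n".
top_files() {
    sort -rn -k1,1 | head -n "${1:-10}" |
        awk '{
            size = $1
            sub(/^[0-9]+ /, "")          # strip the size, leaving the full path in $0
            printf "%.1fMB %s\n", size / 1048576, $0
        }'
}

# Example (run as root to see everything):
# find / \( -path '/proc' -o -path '/sys' -o -path '/dev' \) -prune -o \
#   -type f -size +1024k -printf "%s %h/%f\n" | top_files 20
```

Stripping the size with 'sub' rather than printing '$2' keeps paths that contain spaces intact.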

find large files


There are probably tons of ways to find large files on a system but this is the one I use currently.  It is only practical on a system that does not have high I/O load or an excessive number of files. 


# find / -type f -printf "%s %h/%f\n" 2> /dev/null | egrep -v \
   '^[[:digit:]]+ (<EXCLUSIONS>)[[:print:]]*$' | sort -rn -k1 | head -n 10 

 

The <EXCLUSIONS> are defined as a pipe (|) separated list in the form of a regular expression.  For example, the exclusion list:


/proc
/sys
/var/lib/mysql
/home/httpd/vhosts/*/statistics 

 

Becomes an <EXCLUSIONS> definition of: 


/proc|/sys|/var/lib/mysql|/home/httpd/vhosts/[[:print:]]*/statistics

 

Just replace the <EXCLUSIONS> placeholder within the parentheses () with your definition, following the example above.  The locations in the <EXCLUSIONS> list are still traversed by 'find' but are filtered out of the resulting report.  You can also clean up the output to display the file sizes in MB by piping it through something like


awk '{ print $1/1048576 "MB" " " $2}'

 

Now that I think about it, we should be able to complete the entire process from within awk and avoid the 'egrep'.  Maybe another time.  Most likely some of the largest files you find will be log files, and the best way to deal with log files is to set up proper log rotation.
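Here is one way the awk-only idea might look.  Treat it as a sketch: the function name 'filter_sizes' and the awk variable 'excl' are mine, and it assumes the same "size path" input and the same pipe-separated exclusion list as above.

```shell
# filter_sizes EXCLUSIONS: drop "<bytes> <path>" lines whose path starts with
# one of the excluded prefixes and print the rest with sizes in MB.
# Hypothetical helper; EXCLUSIONS is a regex alternation such as '/proc|/sys'.
filter_sizes() {
    awk -v excl="^($1)" '
        { size = $1; sub(/^[0-9]+ /, "") }     # $0 is now just the path
        $0 !~ excl { printf "%.1fMB %s\n", size / 1048576, $0 }
    '
}

# Example:
# find / -type f -printf "%s %h/%f\n" 2> /dev/null | \
#   filter_sizes '/proc|/sys|/var/lib/mysql' | sort -rn | head -n 10
```

As with the 'egrep' version, the excluded directories are still traversed by 'find'; only the report is filtered.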

 

bandwidth usage calculation


Here are some nasty one-liners that pull data from log files to calculate outgoing bandwidth for a variety of services.  I will add more as I create them.  Feel free to comment on any improvements you think can be made:

HTTP traffic:

# for day in `seq -w 01 31`; do echo -n "January $day: "; egrep "$day/Jan" \
   access_log* |awk '{ SUM += $10 } END {print SUM/1024/1024, "M"}' ;done

FTP traffic:

# for day in `seq 01 31`; do echo -n "January $day: "; egrep \
   " Jan[[:space:]]{1,2}$day " xferlog* | awk '  $12 ~ /o/  { SUM +=$8 } \
   END {print SUM/1024/1024, "M"}'; done

SMTP traffic (sendmail):

# for day in `seq 01 31`; do echo -n "January $day: "; egrep \
   "Jan[[:space:]]{1,2}$day " maillog* | grep 'relay=localhost' | \
   awk ' $8 ~ /size=/ {print $8}' |awk -F= '{SUM += $2} END \
   {print SUM/1024,"K"}'; done
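The loops above reread the logs once for every day of the month.  A single pass that groups by day is another option.  The sketch below covers only the HTTP case and assumes the common/combined Apache log format, where the fourth field looks like '[15/Jan/2008:10:00:00' and the tenth field is the response size in bytes.  The function name 'http_bytes_by_day' is mine.

```shell
# http_bytes_by_day [FILE...]: sum the response sizes per day in one pass.
# Hypothetical helper; reads stdin when no files are given.
http_bytes_by_day() {
    awk '
        $10 ~ /^[0-9]+$/ {                 # skip "-" sizes (e.g. 304 responses)
            day = substr($4, 2, 11)        # "[15/Jan/2008:..." -> "15/Jan/2008"
            sum[day] += $10
        }
        END { for (d in sum) printf "%s %.2f M\n", d, sum[d] / 1024 / 1024 }
    ' "$@" | sort
}

# Example:
# http_bytes_by_day access_log*
```

The final 'sort' orders by day of month as text, which is fine within a single month but is not chronological across months.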

 

tracing long i/o wait


One thorn in the side of any Linux systems administrator is tracking down which processes are using the most disk I/O.  When your load is going up and 'top' says that you are mostly just "waiting" around, you need to find out what is exhausting the resource.  I have yet to find a foolproof method that will identify those processes with certainty, but I have run across several methods that will at least give you some information to go on.  I will start with the basics and move towards some more advanced methods.

The most basic method is to list processes in the D state which, according to the man page for 'ps', is the "Uninterruptible sleep" state and usually IO related.  The problem here is that you are listing processes that are in that state, not necessarily processes that are causing the issue.

 

# ps -eo pid,user,wchan=WIDE-WCHAN-COLUMN -o s,cmd|awk ' $4 ~ /D/ {print $0}'

 

The above creates a formatted output for the 'ps' command and prints the lines where the state ('S') column is "D".
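Because a single listing only shows which processes happen to be in the D state at that instant, one option is to sample 'ps' repeatedly and count how often each process shows up.  The sketch below is just that idea; the function name 'tally_dstate' is mine, and it expects "pid state command" lines on stdin.

```shell
# tally_dstate: read "<pid> <state> <command>" lines on stdin and count how
# many times each pid/command pair was seen in state D, most frequent first.
# Hypothetical helper name.
tally_dstate() {
    awk '
        $2 ~ /^D/ { seen[$1 " " $3]++ }
        END { for (k in seen) print seen[k], k }
    ' | sort -rn
}

# Example: sample once a second for 30 seconds
# for n in $(seq 1 30); do ps -eo pid= -o s= -o comm=; sleep 1; done | tally_dstate
```

A process that appears in most of the samples is a better lead than one that was caught in D only once.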

Another common practice is to find more information about those processes with the 'lsof' command.  This one-liner identifies the processes as above, but also lists the files opened by each process:

 


# for i in `ps -eo pid,user,wchan=WIDE-WCHAN-COLUMN -o s,cmd | \
   awk '$4 ~ /D/ {print $1}'`; do echo "----------"$i"----------"; \
   lsof -p $i; done

 

As you can see we are just capturing the output of the first command, and running 'lsof' against the PID of each of the processes in the resulting list which will provide a detailed picture of what those processes are doing.  At least what they are doing with files, which happen to usually reside on the disk, which is what we are talking about anyway.

And finally, the most detailed and accurate method I have found for tracking down an I/O issue is to enable block_dump debugging. I show how to enable block_dump in the commands below.  When this flag is set, Linux reports all disk read and write operations that take place.  The output of block_dump is written to the kernel output, and it can be retrieved using "dmesg". When you use block_dump and your kernel logging level also includes kernel debugging messages, you probably want to turn off klogd; otherwise the output of block_dump will be logged, causing disk activity that is not normally there.  Here is the process for getting useful information from the block_dump debugging messages:

 

# /etc/init.d/syslog stop
# echo 1 > /proc/sys/vm/block_dump
# sleep 60
# dmesg | awk '/(READ|WRITE|dirtied)/ {process[$1]++} END {for (x in process) \
   print process[x],x}' |sort -nr |awk '{print $2 " " $1}' | \
   head -n 10
... list of 10 most offensive processes ...
# echo 0 > /proc/sys/vm/block_dump
# /etc/init.d/syslog start

 

That is just about the best you can hope for, a list of the processes that have touched the disk most often. Adjust the 'sleep' command to set your sample window or update/remove the 'head' to adjust the size of the output.

Not to be forgotten are the tools 'iostat' and 'hdparm'.  These utilities will go a long way in helping you understand how much I/O your disks can handle and help you make the decision on whether or not additional spindles are needed.

Thanks to all of the people who provided their insight on this subject and for the countless internet blurbs discussing the same.  One final thought: I would love to hear about any other methods people are using to track down I/O related issues in the wild; knowledge is power, yada yada yada, did I mention the lobster bisque?

 

 

touchpad madness


I have a Compaq Presario C500 class laptop (C501NR). Very weak machine, but very stable and well built. The perfect Linux laptop. I made my way through 'ndiswrapper' for the wireless and '915resolution' for the widescreen, but one problem that had eluded me was the damn touch pad. Whenever I typed I would inevitably hit the touch pad with my thumbs, causing the cursor to move and disrupting my work. Here is the quick solution:

1) install synaptics
2) run the command:
$ synclient TapButton1=0

Now tapping the touch pad does not count as a mouse click and you can type all over. Just add it to whatever startup script is associated with your X session and you should be all set.

Have an external mouse? Disable the touch pad completely with the following:

$ synclient TouchpadOff=1

Again, this will have to be run each time you start an X session.

first post


Another attempt at a web log. Regular updates, useful information? We shall see.