When working as a systems administrator, you’ll sooner or later have to solve a file system full error in a hurry. Here are a few commands and hints to help you get out of it quickly on a UNIX-like system.
df command
The df (disk free) command reports how much of each file system is used. You can combine it with the sort command to list the fullest file systems last:
kattoo@roadrunner ~/Downloads $ df | sort -n -k 5,5
Filesystem 1K-blocks Used Available Use% Mounted on
shm 2029696 0 2029696 0% /dev/shm
/dev/sda6 1968588 3232 1865356 1% /tmp
udev 10240 228 10012 3% /dev
/dev/sda12 367628876 28899532 338729344 8% /files
/dev/sda7 1989820 246300 1743520 13% /var
/dev/sda9 1999932 260996 1738936 14% /usr/portage
/dev/sda5 244176 52264 191912 22% /
/dev/sda8 9990188 3161340 6828848 32% /usr
/dev/sda11 99947736 32910756 67036980 33% /files/vservers
/dev/sda10 3989912 1699372 2290540 43% /usr/portage/distfiles
The sort arguments are:
- -n : sort numerically
- -k 5,5 : sort by the 5th field only (the Use% column)
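If your df supports it (GNU and the BSDs do), the -h option prints human-readable sizes instead of 1K blocks; the Use% column still sorts numerically:
df -h | sort -n -k 5,5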
du command
Once you’ve found out that a specific file system is full, you need to quickly figure out which directories are the biggest.
You can use the du (disk usage) command, which reports the space used by a directory and its sub-directories. I usually run something like:
kattoo@roadrunner /usr/portage $ du -sk * | sort -n | tail -20
5711 x11-misc
6056 app-misc
6224 dev-ruby
6320 app-dicts
7219 profiles
7428 net-analyzer
7686 app-text
8112 media-plugins
8777 kde-base
8883 media-libs
8943 sys-apps
9764 dev-util
9986 media-sound
10955 dev-libs
11295 dev-python
11479 dev-java
11645 net-misc
17553 dev-perl
112556 metadata
1705976 distfiles
This will show the 20 largest files or directories. By default, du displays the size of every directory and sub-directory; the -s switch makes it display only the grand total for each argument.
Based on the results, you may want to repeat this command in the biggest sub-directories until you’ve narrowed things down enough to find the culprit.
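When drilling down from / itself, one handy option is -x (supported by GNU and BSD du), which keeps du on a single file system so other mount points don’t pollute the numbers:
# 20 biggest directories on the root file system only
du -xk / | sort -n | tail -20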
find command
The file system full error is more often than not due to a faulty script or program whose logs are running wild because of a bug. This usually leads to a huge file somewhere in a directory, hogging all the available space on the file system.
An easy way to spot it is the find command. For example, the following snippet will search for files bigger than 10 MB in the current directory and its sub-directories, and print the 10 largest ones:
kattoo@roadrunner /usr/portage $ find . -type f -size +10000000c -exec ls -l \{} \; | sort -n -k 5,5 | tail -10
-rw-rw-r-- 1 portage portage 23416703 Oct 1 10:07 ./distfiles/samba-3.0.37.tar.gz
-rw-rw-r-- 1 portage portage 28749072 Jan 18 2008 ./distfiles/extremetuxracer-0.4.tar.gz
-rw-rw-r-- 1 portage portage 46905557 Oct 16 17:20 ./distfiles/firefox-3.5.4.source.tar.bz2
-rw-rw-r-- 1 portage portage 46914620 Dec 2 05:32 ./distfiles/firefox-3.5.6.source.tar.bz2
-rw-rw-r-- 1 portage portage 50380241 Oct 2 11:47 ./distfiles/VirtualBox-3.0.8-53138-Linux_amd64.run
-rw-rw-r-- 1 portage portage 50595281 Nov 17 10:44 ./distfiles/VirtualBox-3.0.12-54655-Linux_amd64.run
-rw-rw-r-- 1 portage portage 59368714 Aug 8 02:10 ./distfiles/gcc-4.3.4.tar.bz2
-rw-rw-r-- 1 portage portage 61494822 Sep 10 00:34 ./distfiles/linux-2.6.31.tar.bz2
-rw-rw-r-- 1 portage portage 125384668 Oct 1 13:58 ./distfiles/qt-x11-opensource-src-4.5.3.tar.gz
-rw-rw-r-- 1 portage portage 314942420 Jun 18 2008 ./distfiles/sauerbraten_2008_06_17_ctf_edition_linux.tar.bz2
kattoo@roadrunner /usr/portage $
I also like to filter out already compressed files, so as to collect the biggest files that I could still compress to save some space, with something like:
find . -type f -size +10000000c \! -name "*.Z" \! -name "*.gz" \! -name "*.bz2" -exec ls -l \{} \;
Tip: set this as an alias by adding the following to your .bashrc (if you’re using bash … otherwise check your shell’s documentation):
alias bigfiles='find . -type f -size +10000000c \! -name "*.Z" \! -name "*.gz" \! -name "*.bz2" -exec ls -l \{} \; | sort -n -k 5,5 | tail -10'
This will give you the 10 largest not-yet-compressed files in a single handy command.
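Note that GNU find also accepts a more readable size syntax; this is a GNU extension, so keep +10000000c where portability matters:
find . -type f -size +10M \! -name "*.Z" \! -name "*.gz" \! -name "*.bz2" -exec ls -l \{} \;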
Caveats
“Argument list too long” error (some shells report it as “Too many arguments”):
If you are working in a directory holding a huge number of files, you might get this error when using a star expansion (like in the du -sk * example above).
This is actually not the du command complaining: the shell expands the star into an argument list which exceeds the system limit on command line length (ARG_MAX, which you can check with getconf ARG_MAX). When this happens, you are usually better off using the find command as explained above.
Another possibility is to pipe the output of find to the xargs command. Basically, xargs takes everything on its standard input and passes it as arguments to the specified command, splitting the list into chunks small enough to fit within the limit.
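For example, here is the earlier du run rewritten with find and xargs. Note that -mindepth, -maxdepth, -print0 and -0 (the last two keep file names containing spaces or newlines intact) are extensions found in GNU and BSD versions, not strict POSIX:
# same idea as "du -sk *", but immune to the argument list limit
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 du -sk | sort -n | tail -20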
Deleting / compressing a file which is still open
Also beware of this: if you delete or compress a file which is still held open by a process, the space it uses won’t be freed until the process actually closes the file (I’ll explain why in a different post, this is an interesting topic on its own 🙂 ).
If you deleted or compressed the file (so the big file disappeared or was replaced by a compressed version) but the space doesn’t get freed (which you can check with df), you can bet that a process is still holding the file open. You can spot this with tools like lsof or fuser. These tools vary greatly between Unix variants. On IBM’s AIX, for instance, fuser has a handy -d option to spot files on the file system with a link count of 0, and it reports the PIDs of the attached processes.
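On Linux, for example, lsof can show deleted-but-still-open files directly; its +L1 option lists open files with a link count of less than 1, along with the processes holding them:
lsof +L1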
Better check those tools’ man pages before you run into this situation!
A word to the wise
Those recipes will help you find a way out when you are already facing a file system full problem. The best, of course, is to avoid it in the first place … The following ideas are a good start:
- Set alerts to get an early warning, letting you deal with the problem before applications start crashing. Tools like Nagios would be your friends here, but home-made scripts running from cron and sending emails might be enough (see the sketch after this list).
- Check the trends: you can use tools like Cacti to graph the space usage of your file systems over time. This will let you anticipate when you’ll need to add more disks, and whether your log rotation and/or file archiving policies are adequate.
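As a starting point, here is a minimal sketch of such a home-made alert script, meant to run from cron; the 90% threshold and the root recipient are placeholders to adapt, and it assumes a mail command is available:
#!/bin/sh
# Email a warning about file systems that are THRESHOLD percent full or more.
THRESHOLD=90   # alert level in percent (placeholder, adjust to taste)
MAILTO=root    # alert recipient (placeholder)

# df -P prints one line per file system; awk's $5+0 strips the trailing '%'.
FULL=$(df -P | awk -v t="$THRESHOLD" 'NR > 1 && $5 + 0 >= t { print $6 " is at " $5 }')

if [ -n "$FULL" ]; then
    echo "$FULL" | mail -s "File system(s) almost full on $(hostname)" "$MAILTO"
fi
Run it from cron every few minutes or hours, and you’ll get your early warning before the applications start failing.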
More ideas? Tips to share? Hit the comments!