This Perl one-liner is intended to print the 10 most frequent client IP addresses in an Apache log file. It can easily be recycled to count anything, though.
kattoo@roadrunner /files/toSave/home/kattoo/Downloads $ perl -ane '$c{$F[0]}++; END {print map {$_ . "\t->\t" . $c{$_} . "\n"} (sort {$c{$b} <=> $c{$a}} keys %c)[0..9]}' sakana.fr-28-02-2010.log
XX.XX.XX.XXX -> 163
XX.XXX.XXX.XXX -> 150
XXX.XXX.XX.XX -> 136
XXX.XX.XXX.XXX -> 134
XX.XXX.XXX.XX -> 116
XX.XXX.XXX.XXX -> 110
XX.XX.XXX.XX -> 104
XX.XXX.XXX.XXX -> 90
XX.XXX.X.XXX -> 75
XXX.XXX.XX.XXX -> 72
kattoo@roadrunner /files/toSave/home/kattoo/Downloads $
Short explanation of the command-line switches:
- -a: turns on autosplit mode; each input line is split on whitespace and the fields are available in the @F array, so $F[0] is the first field (the client address in the common Apache log formats)
- -e: specifies the code to execute
- -n: wraps that code in a loop that runs once per input line, reading from the files named on the command line, or from standard input if none are given
This is equivalent to Nils’ approach in awk but you know, I’ve always been partial to Perl 🙂
Feel free to suggest any improvements. I know there must be many!
Suggested reading
- Programming Perl (3rd Edition) (on Amazon): The reference guide for Perl programming, written by Perl Gods (Larry Wall, Tom Christiansen, Jon Orwant).
- Use command-line Perl to make UNIX administration easier (on techrepublic): Short intro to Perl one-liners
{$_ . "\t->\t" . $c{$_} . "\n"}
->
{"$_\t->\t$c{$_}\n"}
That is all.
awk '{ print $1 }' access.log | uniq -c
Iain and dblackshell, thanks for your input!
Stephane
dblackshell is really close. The problem is that the awk line doesn't sort, and uniq -c only counts duplicates that sit on adjacent lines.
You need this:
awk '{print $1}' access_log | sort | uniq -c | sort -g | tail -n 10
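The pipeline above lists the ten busiest addresses with the most frequent one last. A possible variant that prints the most frequent first, like the Perl output, swaps sort -g and tail for sort -rn and head. This is only a sketch; top_ips_awk is a made-up function name for this illustration.

```shell
# top_ips_awk is a hypothetical name used only for this illustration.
top_ips_awk() {
  awk '{print $1}' "$@" |   # keep only the first field (the client address)
    sort |                  # group identical addresses on adjacent lines
    uniq -c |               # collapse each group into "count address"
    sort -rn |              # highest count first
    head -n 10              # keep the ten most frequent
}
```

Usage would be top_ips_awk access_log. Note that uniq -c pads the count with leading spaces, so the columns differ slightly from the Perl output.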
Another case where the comments are worth more than the post itself… Thanks to all 😀
Stephane
The Perl way is still valuable, showing very nice usage of map, sort and an array range.