Perl: Counting occurrences of IP addresses in Apache logs

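This Perl one-liner prints the 10 most frequent client IP addresses in an Apache log file, assuming the client address is the first whitespace-separated field on each line, as in the common and combined log formats. It is easily adapted to count any other field, though.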

kattoo@roadrunner /files/toSave/home/kattoo/Downloads $ perl -ane '$c{$F[0]}++; END {print map {$_ . "\t->\t" . $c{$_} . "\n"} (sort {$c{$b} <=> $c{$a}} keys %c)[0..9]}' sakana.fr-28-02-2010.log 
XX.XX.XX.XXX	->	163
XX.XXX.XXX.XXX	->	150
XXX.XXX.XX.XX	->	136
XXX.XX.XXX.XXX	->	134
XX.XXX.XXX.XX	->	116
XX.XXX.XXX.XXX	->	110
XX.XX.XXX.XX	->	104
XX.XXX.XXX.XXX	->	90
XX.XXX.X.XXX	->	75
XXX.XXX.XX.XXX	->	72
kattoo@roadrunner /files/toSave/home/kattoo/Downloads $ 

Short explanation of the command-line switches:

  • -a: turns on autosplit mode; each input line is split on whitespace and the resulting fields are available in the @F array
  • -e: specifies the code to execute
  • -n: wraps that code in a loop that runs once per input line, reading from the files named on the command line, or from standard input if none are given

This is equivalent to Nils’ approach in awk but you know, I’ve always been partial to Perl 🙂

Feel free to suggest improvements… I know there must be many!

8 thoughts on “Perl: Counting occurrences of IP addresses in Apache logs”

  1. {$_ . "\t->\t" . $c{$_} . "\n"}

    ->

    {"$_\t->\t$c{$_}\n"}

    That is all.

  2. dblackshell is really close. The problem is that the awk line doesn’t sort.

    You need this:

    awk '{print $1}' access_log | sort | uniq -c | sort -g | tail -n 10
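
The pipeline above lists the ten largest counts in ascending order. To get them most-frequent-first, like the Perl output, one possible variant reverses the numeric sort and takes the head instead of the tail. This is only a sketch: top_ips_awk is a made-up function name, and it uses sort -n rather than -g, which should be enough for plain integer counts.

```shell
# top_ips_awk is a hypothetical helper name.
# Usage: top_ips_awk FILE...   (or pipe log lines on standard input)
top_ips_awk() {
  # Extract the first field, count duplicates, then list the
  # ten largest counts in descending order.
  awk '{print $1}' "$@" | sort | uniq -c | sort -rn | head -n 10
}
```

Each output line holds the count first and then the address, as uniq -c prints them.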
