Real-time log file and traffic analysis (Apache web server)

Top 10 user agents in the current log
~$: awk -F '"' '{print $6}' access.log | sort | uniq -c | sort -nr | head

Top 10 user agents for a certain date and hour (adjust the grep pattern to the timestamp you need)
~$: grep '12/Sep/2011:15' access.log | awk -F '"' '{print $6}' | sort | uniq -c | sort -nr | head
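
The grep pattern above is typed by hand. A minimal sketch for building it from the clock instead, assuming the log uses Apache's default timestamp format and the same time zone as the shell. The function names current_hour_pattern and top_agents_for_hour are made up for this example.

```shell
# Print the grep pattern for the current hour, e.g. 12/Sep/2011:15.
# LC_ALL=C forces English month abbreviations, as Apache writes them.
current_hour_pattern() {
    LC_ALL=C date '+%d/%b/%Y:%H'
}

# top_agents_for_hour LOG [PATTERN]  (hypothetical helper)
# PATTERN defaults to the current hour; -F treats it as a fixed string.
top_agents_for_hour() {
    grep -F -- "${2:-$(current_hour_pattern)}" "$1" |
        awk -F '"' '{print $6}' | sort | uniq -c | sort -nr | head
}
```

Usage: top_agents_for_hour access.log for the current hour, or top_agents_for_hour access.log '12/Sep/2011:15' for a fixed one.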

Top 20 referrers from the last 5000 hits
~$: tail -n 5000 access.log | awk '{print $11}' | tr -d '"' | sort | uniq -c | sort -rn | head -n 20
~$: tail -n 5000 access.log | awk '{freq[$11]++} END {for (x in freq) {print freq[x], x}}' | tr -d '"' | sort -rn | head -n 20
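
Field 11 is only the referrer when the request line has exactly three space-separated parts. A sketch of an alternative that splits on double quotes, as the user agent commands do, assuming the combined log format (where the referrer is then field 4) and no escaped quotes inside the request. The name top_referrers is made up for this example.

```shell
# top_referrers LOG [HITS] [ROWS]  (hypothetical helper)
# HITS defaults to 5000 trailing lines, ROWS to 20 result rows.
top_referrers() {
    tail -n "${2:-5000}" "$1" |
        awk -F '"' '{print $4}' |   # 4th quote-delimited field = referrer
        sort | uniq -c | sort -rn | head -n "${3:-20}"
}
```

Usage: top_referrers access.log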

Top 20 IPs from the last 5000 hits
~$: tail -n 5000 access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 20
~$: tail -n 5000 access.log | awk '{freq[$1]++} END {for (x in freq) {print freq[x], x}}' | sort -rn | head -n 20

Top 20 URLs from the last 5000 hits
~$: tail -n 5000 ./access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -n 20
~$: tail -n 5000 ./access.log | awk '{freq[$7]++} END {for (x in freq) {print freq[x], x}}' | sort -rn | head -n 20
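
The IP and URL commands differ only in the field number, so they can be folded into one function. A minimal sketch, assuming whitespace-separated fields as in the commands above. The name top_field and its argument order are made up for this example.

```shell
# top_field LOG FIELD [HITS] [ROWS]  (hypothetical helper)
# Counts the values of whitespace-separated field FIELD in the last HITS lines
# (default 5000) and prints the ROWS most frequent ones (default 20).
top_field() {
    tail -n "${3:-5000}" "$1" |
        awk -v f="$2" '{freq[$f]++} END {for (x in freq) print freq[x], x}' |
        sort -rn | head -n "${4:-20}"
}
```

Usage: top_field ./access.log 1 for IPs, top_field ./access.log 7 for URLs.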

Top 20 URLs requested from a certain IP from the last 5000 hits
~$: IP=1.2.3.4; tail -n 5000 ./access.log | awk -v ip="$IP" '$1 == ip {print $7}' | sort | uniq -c | sort -rn | head -n 20
~$: IP=1.2.3.4; tail -n 5000 ./access.log | awk -v ip="$IP" '$1 == ip {freq[$7]++} END {for (x in freq) {print freq[x], x}}' | sort -rn | head -n 20

Real-time analysis with apachetop
HowTo: monitor your website in real time with apachetop
~$: /usr/sbin/apachetop -f /path/to/your/log/access.log
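
Where apachetop is not installed, a plain loop can approximate a live view by re-running one of the pipelines above at a fixed interval. A rough sketch only; the name live_top_ips and its arguments are made up for this example, and the variables it sets are global to the shell.

```shell
# live_top_ips LOG [INTERVAL_SECONDS] [ROUNDS]  (hypothetical helper)
# Re-runs the "top 20 IPs from the last 5000 hits" pipeline every
# INTERVAL_SECONDS (default 5). ROUNDS=0 (default) repeats until Ctrl-C.
live_top_ips() {
    log=$1
    interval=${2:-5}
    rounds=${3:-0}
    i=0
    while [ "$rounds" -eq 0 ] || [ "$i" -lt "$rounds" ]; do
        printf '== %s ==\n' "$(date)"
        tail -n 5000 "$log" | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 20
        i=$((i + 1))
        # Sleep between rounds, but not after the final one.
        if [ "$rounds" -eq 0 ] || [ "$i" -lt "$rounds" ]; then
            sleep "$interval"
        fi
    done
}
```

Usage: live_top_ips /path/to/your/log/access.log 10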