Helpful awk command examples when working with Apache access logs

There are several utilities that analyze access logs; one such utility is Webalizer. But I have always believed that getting to know your access log files is more appropriate when you want to do a more in-depth analysis of something specific.

There are several commands available, such as grep, cut and sed, that you can use in different scenarios depending on what kind of information you actually need, but in this post I will touch on the awk command and working with access logs. awk is a pattern-directed scanning and processing language, and a very powerful one indeed.

The awk manual can be read on most systems by running man awk.

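Apache access log entries are written one request per line, and the columns of each entry are whitespace delimited. In the commonly used combined log format, an entry holds the client IP address, the identity and user fields, the timestamp, the request line, the status code, the response size, the referer and the user agent.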
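To make the column numbering concrete, here is a small sketch. The entry is a made-up example in the combined log format (the IP address, path and user agent are hypothetical), and the loop simply prints every whitespace-delimited field next to its awk field number.

```shell
# Hypothetical sample entry in the combined log format (made up for illustration).
entry='192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"'

# Print each whitespace-delimited field together with its awk field number.
printf '%s\n' "$entry" | awk '{ for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }'
```

With this layout the client IP address is $1, the requested path is $7 and the status code is $9. Note that the timestamp and the request line are each spread over several fields because they contain spaces.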

Each of these columns can be extracted individually using awk’s field variables ($1, $2 and so on).
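As a sketch of what such extraction commands can look like, the block below writes a few made-up entries to a temporary file and prints one column at a time. The log contents are hypothetical and the field numbers assume the combined log format.

```shell
# Hypothetical sample log in the combined log format (entries are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /missing.png HTTP/1.1" 404 209 "http://example.com/index.html" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2023:13:57:40 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "-"
EOF

awk '{ print $1 }' "$log"       # client IP address
awk '{ print $4, $5 }' "$log"   # timestamp, split over two fields
awk '{ print $7 }' "$log"       # requested path
awk '{ print $9 }' "$log"       # status code
awk '{ print $10 }' "$log"      # response size in bytes
```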

 

By default awk splits each line on whitespace; we can change the delimiter with the -F option.
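One delimiter that is handy for access logs is the double quote, because the request line, referer and user agent are quoted. The following is a sketch on a made-up entry, assuming the combined log format and no escaped quotes inside the quoted values.

```shell
# Hypothetical sample log in the combined log format (entry is made up).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "http://example.com/" "Mozilla/5.0"
EOF

# With a double quote as the delimiter, the quoted values become whole fields.
awk -F'"' '{ print $2 }' "$log"   # full request line
awk -F'"' '{ print $4 }' "$log"   # referer
awk -F'"' '{ print $6 }' "$log"   # user agent
```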

Let’s work through some scenarios now. Say you want a list of the unique IP addresses in your logs: print the first column and remove the duplicates.
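A minimal sketch of this, run against a made-up log (the entries are hypothetical). The first command relies on sort -u to drop duplicates, the second does the same inside awk and keeps the addresses in first-seen order.

```shell
# Hypothetical sample log in the combined log format (entries are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /missing.png HTTP/1.1" 404 209 "-" "Mozilla/5.0"
192.0.2.10 - - [10/Oct/2023:13:56:12 +0000] "GET /about.html HTTP/1.1" 200 1500 "-" "Mozilla/5.0"
EOF

# Print the IP column, then sort and remove duplicates.
awk '{ print $1 }' "$log" | sort -u

# awk-only alternative: print an address only the first time it is seen.
awk '!seen[$1]++ { print $1 }' "$log"
```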

Or, if you want to see which IP addresses have been accessing a specific resource, you can filter the entries on the request path and print the IP address column.
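Here are two possible ways to write that filter, shown on a made-up log with /index.html as a hypothetical resource. Both test only the path field ($7 in the combined log format), so an entry whose referer merely mentions the resource is not counted.

```shell
# Hypothetical sample log in the combined log format (entries are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /missing.png HTTP/1.1" 404 209 "http://example.com/index.html" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2023:13:57:40 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "-"
EOF

# Variant 1: exact match on the requested path.
awk '$7 == "/index.html" { print $1 }' "$log" | sort -u

# Variant 2: regular expression match, which also catches paths that only start with the resource name.
awk '$7 ~ /^\/index\.html/ { print $1 }' "$log" | sort -u
```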

Check if the requests are coming from automated scripts

When checking for automated scripts, we look for an empty user agent value, since these scripts generally don’t send any user agent information.
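A sketch of such a check follows. It assumes the combined log format, where the user agent is the sixth field when splitting on double quotes, and it treats both an empty string and a lone "-" as a missing user agent; the sample entries are made up.

```shell
# Hypothetical sample log in the combined log format (entries are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2023:13:57:40 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "-"
198.51.100.7 - - [10/Oct/2023:13:58:02 +0000] "GET /about.html HTTP/1.1" 200 1500 "-" ""
EOF

# Split on double quotes; $6 is the user agent and $1 starts with the client IP address.
awk -F'"' '$6 == "" || $6 == "-" { split($1, head, " "); print head[1] }' "$log"
```

Keep in mind that a missing user agent is only a hint; plenty of automated clients send a browser-like value.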

To check how many times a resource has been requested

awk can count the entries whose request path matches the resource.
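As a sketch, the first command below counts the requests for one hypothetical resource (/index.html), and the second tallies every requested path so the busiest resources come first. The sample entries are made up and the field number assumes the combined log format.

```shell
# Hypothetical sample log in the combined log format (entries are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /about.html HTTP/1.1" 200 1500 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2023:13:57:40 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "-"
EOF

# Count requests for a single resource; "+ 0" prints 0 instead of an empty line when nothing matches.
awk '$7 == "/index.html" { count++ } END { print count + 0 }' "$log"

# Tally requests per resource and list the most requested first.
awk '{ hits[$7]++ } END { for (path in hits) print hits[path], path }' "$log" | sort -rn
```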

 

Identify issues with your web resources

Generally we will be looking for 404 errors. We can get this kind of report from Google Analytics as well, but Google won’t list internal linking resource errors. You can also use developer tools to check whether 404 errors are being produced on a certain page. Let’s use awk to do that now.

The approach is to check column 9 and pattern match it against the string 404, pipe the matching entries to another awk command that prints the required data, and then pipe that output to the sort command.
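The pipeline described above can be sketched as follows. The sample entries are made up, and printing the status code and path as the "required data" is my assumption; pick whichever columns you need.

```shell
# Hypothetical sample log in the combined log format (entries are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /old-page.html HTTP/1.1" 404 209 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2023:13:57:40 +0000] "GET /missing.png HTTP/1.1" 404 209 "http://example.com/index.html" "Mozilla/5.0"
EOF

# First awk keeps entries whose status column matches 404,
# second awk prints status and path, sort orders the result.
awk '$9 ~ /404/' "$log" | awk '{ print $9, $7 }' | sort
```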

A similar result can be produced without using two awk commands, by putting the pattern match and the print action into a single awk command.
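A sketch of the single-command form, again on made-up entries and again assuming that the status code and path are the columns of interest:

```shell
# Hypothetical sample log in the combined log format (entries are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /old-page.html HTTP/1.1" 404 209 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2023:13:57:40 +0000] "GET /missing.png HTTP/1.1" 404 209 "http://example.com/index.html" "Mozilla/5.0"
EOF

# Pattern and action in one awk program: match status 404, print status and path.
awk '$9 ~ /404/ { print $9, $7 }' "$log" | sort
```

The pattern before the braces selects the entries and the action inside the braces formats them, so the second awk process is not needed.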

There is so much more you can do with awk. All you have to do is understand how the command works and combine it with other commands, such as sed, to customize your output.

If you are using the awk command to achieve other things, please do leave a comment.
