
How to Get Unique IP Address from Log File

Website administrators often need to find the unique IP addresses that send traffic to their sites, usually to identify high-value traffic sources or to spot spam traffic. In this article, we will learn how to get the unique IP addresses from a log file.



In this article, we will extract the unique IP addresses from an Apache server’s log file. The same method also works on log files from other servers.

Here is a sample log entry from Apache.

XXX.64.70.XXX - - [26/Mar/2021:00:28:23 -0700] "GET / HTTP/1.1" 403 4609 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16"

In most log files, the first value in each entry is the IP address from which the request was received. To study the format of your log file, print its latest entry with the following tail command.

$ tail -n 1 /home/ubuntu/apache_log

To get the unique IP addresses, we combine the awk, sort and uniq commands. Suppose your Apache log file is at /home/ubuntu/apache_log. The following awk command extracts the first value (column) from each entry.

$ awk '{ print $1 } ' /home/ubuntu/apache_log

In the above awk command, $1 matters because it specifies which column holds the IP address on each line of the input file. By default, awk splits each line into columns on whitespace. If the IP address is in the 3rd column instead of the 1st, replace $1 with $3.
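If you are not sure which column holds the IP address, one option is to print every field of a single entry next to its column number. The snippet below is a minimal sketch of that idea. The sample entry is hypothetical; in practice you would feed it the output of the tail command shown earlier.

```shell
# Hypothetical one-line sample; in practice use: tail -n 1 /home/ubuntu/apache_log
sample='10.0.0.2 - - [26/Mar/2021:00:28:23 -0700] "GET / HTTP/1.1" 403 4609'

# Print each whitespace-separated field prefixed with its column number.
# NF is awk's built-in count of fields on the current line.
numbered=$(printf '%s\n' "$sample" | awk '{ for (i = 1; i <= NF; i++) print i ": " $i }')
printf '%s\n' "$numbered"
# The first line printed is "1: 10.0.0.2", so the IP address is in column 1.
```

The number shown next to the IP address is the one to use after the $ sign in the awk command.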

Every request that a browser sends while loading a web page (HTML, CSS, JS, images, etc.) is logged as a separate entry, so the log file will contain multiple entries for each IP address.

The output of the above awk command will therefore repeat each IP address many times. The uniq command only removes duplicate lines that are next to each other, so we first pipe the output of awk to the sort command, which places identical IP addresses on adjacent lines.

$ awk '{ print $1 } ' /home/ubuntu/apache_log | sort

Finally, we pipe the output of the sort command to the uniq command, which collapses adjacent duplicate lines and leaves one line per IP address.

$ awk '{ print $1 } ' /home/ubuntu/apache_log | sort | uniq

The above command will give you the list of unique IP addresses in your web server’s log file.
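The sort step is needed because uniq only compares neighbouring lines. The sketch below illustrates this on a small, hypothetical three-line log written to a temporary file. It also shows sort -u, a standard option that sorts and removes duplicates in one step and can be used in place of sort | uniq.

```shell
# Hypothetical sample log; replace "$log_file" with your own file,
# e.g. /home/ubuntu/apache_log
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.2 - - [26/Mar/2021:00:28:23 -0700] "GET / HTTP/1.1" 200 4609
10.0.0.1 - - [26/Mar/2021:00:28:24 -0700] "GET /style.css HTTP/1.1" 200 512
10.0.0.2 - - [26/Mar/2021:00:28:25 -0700] "GET /app.js HTTP/1.1" 200 2048
EOF

# Without sort, the two 10.0.0.2 lines are not adjacent, so uniq keeps both.
without_sort=$(awk '{ print $1 }' "$log_file" | uniq)
printf '%s\n' "$without_sort"   # prints 3 lines: 10.0.0.2, 10.0.0.1, 10.0.0.2

# sort -u sorts and drops duplicates in a single command.
with_sort=$(awk '{ print $1 }' "$log_file" | sort -u)
printf '%s\n' "$with_sort"      # prints 2 lines: 10.0.0.1, 10.0.0.2

rm -f "$log_file"
```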

If you want the number of unique IP addresses hitting your website, pipe the above output to the wc command with the -l option, which counts lines, as shown below.

$ awk '{ print $1 } ' /home/ubuntu/apache_log | sort | uniq | wc -l
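If you also want to know how many requests each IP address made, for example to spot the heaviest traffic sources, a common variation is to pass the -c option to uniq, which prefixes each line with its number of occurrences. The following is a sketch on a hypothetical sample log, not a replacement for the command above. The limit of 10 results is an arbitrary choice.

```shell
# Hypothetical sample log; replace "$log_file" with your own file,
# e.g. /home/ubuntu/apache_log
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.2 - - [26/Mar/2021:00:28:23 -0700] "GET / HTTP/1.1" 200 4609
10.0.0.1 - - [26/Mar/2021:00:28:24 -0700] "GET /style.css HTTP/1.1" 200 512
10.0.0.2 - - [26/Mar/2021:00:28:25 -0700] "GET /app.js HTTP/1.1" 200 2048
EOF

# uniq -c counts occurrences of each IP address,
# sort -rn orders the counts from highest to lowest,
# head -n 10 keeps the ten busiest addresses.
top_ips=$(awk '{ print $1 }' "$log_file" | sort | uniq -c | sort -rn | head -n 10)
printf '%s\n' "$top_ips"

rm -f "$log_file"
```

Each output line holds a count followed by an IP address. The amount of leading whitespace before the count differs between systems.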

If the first column of your server log does not contain the IP address, update the above awk command accordingly. For example, if the IP address is the 3rd column of each entry, update the awk command as shown.

$ awk '{ print $3 } ' /home/ubuntu/apache_log
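To avoid editing the awk program every time the column changes, you can pass the column number in as an awk variable with the -v option. The sketch below assumes a hypothetical log format in which the IP address is the 3rd column. The variable names ip_column and col are illustrative only.

```shell
# Hypothetical log format with the IP address in the 3rd column
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
2021-03-26 00:28:23 10.0.0.2 GET /
2021-03-26 00:28:24 10.0.0.1 GET /style.css
2021-03-26 00:28:25 10.0.0.2 GET /app.js
EOF

ip_column=3   # assumed position of the IP address; adjust for your log

# -v col=... sets an awk variable; $col then refers to that column.
unique_ips=$(awk -v col="$ip_column" '{ print $col }' "$log_file" | sort | uniq)
printf '%s\n' "$unique_ips"     # prints 10.0.0.1 and 10.0.0.2

# Count the unique addresses; tr strips padding that some wc versions add.
unique_count=$(printf '%s\n' "$unique_ips" | wc -l | tr -d ' ')
printf '%s\n' "$unique_count"   # prints 2

rm -f "$log_file"
```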

In this article, we have learnt a simple way to get the unique IP addresses from a log file. You can adapt it to your own log format by modifying the awk command.
