Website administrators often need to find the unique IP addresses that send traffic to their websites. This is usually done to identify high-value traffic sources and to spot spam traffic. In this article, we will learn how to get unique IP addresses from a log file.
How to Get Unique IP Address from Log File
For this article, we will get unique IP addresses from an Apache server’s log file. You can use the same method on log files from other servers.
Here is a sample log entry from Apache.
XXX.64.70.XXX - - [26/Mar/2021:00:28:23 -0700] "GET / HTTP/1.1" 403 4609 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16"
In most log files, the first value in each log entry is the IP address from which the request was received. You can use the following tail command to get the latest entry of your log file, which you can use to study its format.
$ tail -n 1 /home/ubuntu/apache_log
To get unique IP addresses, we use a combination of the awk, sort and uniq commands. Let us say your Apache log file is at /home/ubuntu/apache_log. Here is the awk command to extract the first value (column) from each entry.
$ awk '{ print $1 } ' /home/ubuntu/apache_log
In the above awk command, $1 is very important since it specifies the column number of the IP address on each line of your input file. By default, awk splits each line into columns on whitespace. For example, if the IP address is the 3rd column instead of the 1st, then you need to replace $1 with $3 above.
Each request sent to your web server (HTML, CSS, JS, images, etc.) is logged as a separate entry, so loading a single page in a client browser usually produces multiple entries for the same IP address.
So the above awk command’s output will contain multiple entries for each IP address. We will use the sort command to sort the IP addresses so that identical addresses end up on adjacent lines. This matters because uniq only removes repeated lines that are next to each other. To do this, we pipe the output of the awk command to the sort command.
$ awk '{ print $1 } ' /home/ubuntu/apache_log | sort
Finally, we use uniq command to get a unique list of IP addresses. We will pipe the output of sort command to uniq command for our purpose.
$ awk '{ print $1 } ' /home/ubuntu/apache_log | sort | uniq
The above command will give you the list of unique IP addresses in your web server’s log file.
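To try the pipeline without touching a real log, it can be run against a small sample file. The sketch below is only an illustration: the three entries are made up, the addresses come from ranges reserved for documentation, and the variable names are arbitrary. It also shows sort -u, which sorts and removes duplicates in one step and should give the same list as sort followed by uniq.

```shell
# Create a throwaway sample log; the entries are made up for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [26/Mar/2021:00:28:23 -0700] "GET / HTTP/1.1" 200 4609 "-" "Mozilla/5.0"
198.51.100.7 - - [26/Mar/2021:00:28:24 -0700] "GET /style.css HTTP/1.1" 200 1200 "-" "Mozilla/5.0"
192.0.2.10 - - [26/Mar/2021:00:28:25 -0700] "GET /app.js HTTP/1.1" 200 830 "-" "Mozilla/5.0"
EOF

# Extract the first column, sort it, and drop the repeated lines.
unique_ips=$(awk '{ print $1 }' "$log" | sort | uniq)
echo "$unique_ips"

# sort -u sorts and de-duplicates in one step.
unique_ips_short=$(awk '{ print $1 }' "$log" | sort -u)

# Remove the temporary file.
rm -f "$log"
```

Here the sample contains two distinct addresses, so the printed list has two lines even though the file has three entries.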
If you want to get a count of unique IP addresses hitting your website, then you can pipe the above output to wc command as shown below.
$ awk '{ print $1 } ' /home/ubuntu/apache_log | sort | uniq | wc -l
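When the goal is to identify heavy or suspicious traffic sources, a request count per address can be more useful than the bare list. The following is a minimal sketch that goes one step beyond the commands above: uniq -c prefixes each address with the number of times it appears, and sort -rn puts the largest counts first. The sample entries and the variable names are made up for illustration.

```shell
# Create a throwaway sample log; the entries are made up for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [26/Mar/2021:00:28:23 -0700] "GET / HTTP/1.1" 200 4609 "-" "Mozilla/5.0"
198.51.100.7 - - [26/Mar/2021:00:28:24 -0700] "GET / HTTP/1.1" 200 4609 "-" "Mozilla/5.0"
192.0.2.10 - - [26/Mar/2021:00:28:25 -0700] "GET /style.css HTTP/1.1" 200 1200 "-" "Mozilla/5.0"
192.0.2.10 - - [26/Mar/2021:00:28:26 -0700] "GET /app.js HTTP/1.1" 200 830 "-" "Mozilla/5.0"
EOF

# Count how many entries each address has, then list the busiest first.
per_ip=$(awk '{ print $1 }' "$log" | sort | uniq -c | sort -rn)
echo "$per_ip"

# Remove the temporary file.
rm -f "$log"
```

Each output line holds a count followed by an address. On a real log, piping the result to head -n 10 would limit it to the ten busiest addresses.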
If your server log’s first column does not contain IP address, then update the above awk command accordingly. For example, if IP address is the 3rd column of each entry, then update awk command as shown.
$ awk '{ print $3 } ' /home/ubuntu/apache_log
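As an illustration of the third-column case, the sketch below assumes a hypothetical log format of date, time, client IP, method and path. This format is invented for the example and is not one of Apache's standard formats. The rest of the pipeline stays the same, only the column number changes.

```shell
# Hypothetical log format (an assumption): date, time, client IP, method, path.
log=$(mktemp)
cat > "$log" <<'EOF'
2021-03-26 00:28:23 192.0.2.10 GET /
2021-03-26 00:28:24 198.51.100.7 GET /style.css
2021-03-26 00:28:25 192.0.2.10 GET /app.js
EOF

# The IP address is the third whitespace-separated field, so print $3.
unique_count=$(awk '{ print $3 }' "$log" | sort | uniq | wc -l)

# Some wc implementations pad the number with spaces; strip them.
unique_count=$(echo "$unique_count" | tr -d ' ')
echo "Unique IP addresses: $unique_count"

# Remove the temporary file.
rm -f "$log"
```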
In this article, we have learnt a simple way to get unique IP addresses from a log file. You can customize it as per your requirement, depending on your log format, by modifying the awk command.
Sreeram has more than 10 years of experience in web development, Python, Linux, SQL and database programming.