Advanced Parser

This project was inspired by an in-class log parser I had to build using Bash. I wanted to create a more comprehensive Python-based log analysis system that parses server log files, displays key metrics, and identifies suspicious activity. The system extracts and summarizes metrics such as connections per IP, per-user login attempts (both successful and failed), and data transfer statistics (uploads and downloads), then organizes the results by date.
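As a rough illustration of the date-based aggregation described above, here is a minimal sketch. The log line format, the event names (LOGIN_OK, LOGIN_FAIL, UPLOAD, DOWNLOAD), and the function name summarize are all assumptions made for this example, not the project's actual format or code.

```python
from collections import Counter, defaultdict

# Hypothetical log format (an assumption, not the project's real format):
#   <date> <time> <ip> <user> <event> [bytes]
SAMPLE_LOG = """\
2024-03-01 08:00:01 10.0.0.5 alice LOGIN_OK
2024-03-01 08:00:09 10.0.0.7 bob LOGIN_FAIL
2024-03-01 08:01:30 10.0.0.5 alice UPLOAD 2048
2024-03-02 09:15:00 10.0.0.7 bob LOGIN_FAIL
2024-03-02 09:20:00 10.0.0.5 alice DOWNLOAD 512
"""


def new_day():
    """Create the empty set of counters kept for each date."""
    return {
        "connections": Counter(),     # keyed by IP
        "login_ok": Counter(),        # keyed by (IP, user)
        "login_fail": Counter(),      # keyed by (IP, user)
        "upload_bytes": Counter(),    # keyed by IP
        "download_bytes": Counter(),  # keyed by IP
    }


def summarize(log_text):
    """Group per-IP and per-user metrics by date."""
    report = defaultdict(new_day)
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        date, _time, ip, user, event = parts[:5]
        day = report[date]
        day["connections"][ip] += 1
        if event == "LOGIN_OK":
            day["login_ok"][(ip, user)] += 1
        elif event == "LOGIN_FAIL":
            day["login_fail"][(ip, user)] += 1
        elif event in ("UPLOAD", "DOWNLOAD"):
            # Only count transfers that carry a numeric byte size.
            if len(parts) >= 6 and parts[5].isdigit():
                key = "upload_bytes" if event == "UPLOAD" else "download_bytes"
                day[key][ip] += int(parts[5])
    return report


report = summarize(SAMPLE_LOG)
for date in sorted(report):
    print(date, dict(report[date]["connections"]))
```

Keying the outer dictionary by date keeps each day's counters separate, so a daily report can be printed by iterating over the sorted dates.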

The suspicious activity module identifies potential security threats by looking at failed login attempts and anomalous behavior. It uses geopy and real-time IP geolocation to determine the origin of suspicious IP addresses. It then checks for signs of “impossible travel,” where login attempts occur from geographically distant locations within an unrealistic time frame.

Here is the general output of the parser when run. It shows the statistics for the day, including Connection Counts per IP, Login Success/Fail Counts per IP/User, Upload Stats per IP, and Download Stats per IP.

At the end of the log report, the program produces a Suspicious Activity report that covers each user with anomalous behavior, such as a high number of failed logins. For any user deemed suspicious, the program automatically tries to find their location. While attackers may spoof their location, this can still give a general idea of whether the activity is legitimate.
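One way the failed-login flagging could work is sketched below. The threshold of 5 failures, the input shape (a list of (user, IP) pairs), and the function name flag_suspicious_users are assumptions for illustration; the project's actual cutoff and data structures are not shown here.

```python
from collections import Counter

# Hypothetical cutoff; the real project's threshold is not stated.
FAILED_LOGIN_THRESHOLD = 5


def flag_suspicious_users(failed_logins, threshold=FAILED_LOGIN_THRESHOLD):
    """Return users whose failed-login count meets or exceeds the threshold.

    failed_logins is an iterable of (user, ip) tuples, one per failed attempt.
    The result maps each flagged user to their failure count and source IPs.
    """
    counts = Counter()
    ips_by_user = {}
    for user, ip in failed_logins:
        counts[user] += 1
        ips_by_user.setdefault(user, set()).add(ip)
    return {
        user: {"failures": n, "ips": sorted(ips_by_user[user])}
        for user, n in counts.items()
        if n >= threshold
    }


# Example: bob fails six times from two IPs, alice fails once.
attempts = (
    [("bob", "10.0.0.7")] * 4
    + [("bob", "203.0.113.9")] * 2
    + [("alice", "10.0.0.5")]
)
flagged = flag_suspicious_users(attempts)
print(flagged)
```

Collecting the source IPs alongside the count means the flagged users can be handed straight to a location lookup step.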

Here is a section of the code that highlights how impossible travel is checked. If the code has reached this point, it has already determined that the time between logins is suspicious. It then takes the suspicious username and compares the locations of the IPs that user attempted to log in from. If the distance between them is greater than 500 km (~310 miles), it reports that the user moved too far in too little time.
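For readers without the original snippet, the check described above can be approximated with the sketch below. It is not the project's code: it swaps in a standard-library haversine formula for the distance calculation, takes coordinates directly instead of looking up IPs, and assumes a one-hour time window. Only the 500 km threshold comes from the description; the names haversine_km and is_impossible_travel are hypothetical.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

MAX_DISTANCE_KM = 500            # threshold stated in the description
MAX_WINDOW = timedelta(hours=1)  # hypothetical "too little time" window
EARTH_RADIUS_KM = 6371.0         # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def is_impossible_travel(first, second):
    """Each login is (timestamp, (lat, lon)) for the same user.

    True when the logins are close in time but far apart in distance.
    """
    (t1, loc1), (t2, loc2) = first, second
    close_in_time = abs(t2 - t1) <= MAX_WINDOW
    return close_in_time and haversine_km(loc1, loc2) > MAX_DISTANCE_KM


# Example coordinates (approximate city centers).
NEW_YORK = (40.7128, -74.0060)
LONDON = (51.5074, -0.1278)
PHILADELPHIA = (39.9526, -75.1652)

start = datetime(2024, 3, 1, 12, 0)
later = start + timedelta(minutes=30)

print(is_impossible_travel((start, NEW_YORK), (later, LONDON)))
print(is_impossible_travel((start, NEW_YORK), (later, PHILADELPHIA)))
```

The haversine formula treats the Earth as a sphere, which is accurate enough for a coarse 500 km cutoff, especially since IP-based locations are themselves only approximate.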