Apache log file format analysis by R

Ask Time:2012-09-22T17:49:06         Author:furianpandit

I am trying to analyze web log files in R. I am comfortable dealing with the dates and byte counts, wherever numeric data is present, but I fail to deal with the strings.

From the log file (which is in CSV format), I want to identify each particular user (with the help of the IP address and user agent) and that user's total time spent on the web page.

Paul Hiemstra :

There are numerous libraries to do this kind of analysis, although I could find none in R. A Google search for "parse apache logfile" yielded a library in Perl, and "python parse apache logfile" yields the Scratchy library. Both rely on regular expressions to parse the contents of the file.

From here there are two ways to deal with the Apache logfile:

1. Call Perl or Python from R, either using a direct link or using a system call (this is simpler).
2. Take the idea from the Perl or Python library and use it to implement R versions of the functions. This will take a lot of time.

You refer to a CSV file, but I think the libraries above work with the original Apache log text file, so I'd use that and not your CSV file.

In addition, this SO post mentions an answer by @doug where he states that he has created some functions to create visualizations of Apache logfile data, parsed by Python. Maybe you could send him a message or mail and see if he is willing to share the code.
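To illustrate the regular-expression approach those libraries take, here is a minimal Python sketch. It is not the code of either library mentioned above. It assumes the log is in Apache's "combined" format, and the names `LOG_PATTERN`, `parse_line`, and `summarize_visitors` are made up for this example. It treats each (IP, user agent) pair as one user and approximates time on site as the span between that user's first and last request, which is only a rough estimate.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Timestamps look like 10/Oct/2000:13:55:36 -0700
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    # Apache writes "-" when no bytes were sent
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record


def summarize_visitors(lines):
    """Group requests by (ip, agent) and report hits, bytes, and seconds on site."""
    visitors = defaultdict(
        lambda: {"hits": 0, "bytes": 0, "first": None, "last": None}
    )
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip lines that are not in the assumed format
        stats = visitors[(record["ip"], record["agent"])]
        stats["hits"] += 1
        stats["bytes"] += record["bytes"]
        when = record["time"]
        if stats["first"] is None or when < stats["first"]:
            stats["first"] = when
        if stats["last"] is None or when > stats["last"]:
            stats["last"] = when
    # Time on site is approximated as last request minus first request
    for stats in visitors.values():
        stats["seconds"] = (stats["last"] - stats["first"]).total_seconds()
    return dict(visitors)
```

The summary could then be written out as a CSV file and loaded in R with `read.csv`, which matches the "call Python from R" option. Note that this pattern does not handle escaped quotes inside the request or user agent fields, and a different `LogFormat` setting would need a different pattern.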