Downloading log files

The best place to analyze log files is the local computer, not the server:
1) Log files grow fast and can fill up the server's disk.
2) Queries might be CPU-intensive.
If you are like me, you prefer that the log files appear on the local computer automatically.

Server setup

At the moment "X-1" (where "X" is the time I start work), the server logs are truncated, and the collected data is moved to a separate file.

Note that the setup consists of two parts: the scheduling ("at the moment...") and the rotation itself ("the server logs..."). Both tasks are solved using standard tools.

The latter task is called "log rotation", and the usual tool is "logrotate". In the simplest case, you already have the config file "/etc/logrotate.d/httpd", which defines how to rotate web server logs and has an entry for the default web server.

By analogy, add entries for your sites. You'll get something like this:

/home/uucode/logs/*log ...other sites... {
    daily
    missingok
    notifempty
    sharedscripts
    postrotate
        /bin/kill -USR1 `cat /var/run/httpd.pid 2>/dev/null` 2> /dev/null || true
    endscript
}

Logrotate is executed from cron. On modern Linux systems this is arranged in two steps: the script /etc/cron.daily/logrotate runs logrotate, and the file /etc/crontab specifies when the scripts in the directory /etc/cron.daily/ are run.

Set the time "X-1" in /etc/crontab. Don't forget about timezones!
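To avoid getting the time zone arithmetic wrong, the conversion can be done in a few lines of Python. This is only a sketch: the function name `crontab_time` is made up for illustration, and it assumes both clocks can be described by fixed UTC offsets, so it ignores daylight saving time changes.

```python
from datetime import datetime, timedelta, timezone


def crontab_time(local_hour, local_minute, local_utc_offset, server_utc_offset):
    """Convert a local wall-clock time into (minute, hour) on the server's clock.

    The offsets are hours relative to UTC (assumed fixed, no DST handling).
    The result is in the order used by crontab entries: minute first, then hour.
    """
    local_tz = timezone(timedelta(hours=local_utc_offset))
    server_tz = timezone(timedelta(hours=server_utc_offset))
    # The date is arbitrary; only the time of day matters here.
    local = datetime(2007, 1, 1, local_hour, local_minute, tzinfo=local_tz)
    remote = local.astimezone(server_tz)
    return remote.minute, remote.hour
```

For example, if "X-1" is 08:00 on a local computer at UTC+1 and the server runs at UTC-5, `crontab_time(8, 0, 1, -5)` gives the minute and hour fields to put in /etc/crontab.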

Local setup

I haven't found a standard tool, so I wrote a Python script. Here is what it does:

* Download the latest rotated log "xxx.1" from the server and save it as "xxx". (Obviously, I use rsync and key authentication.)

* If the local files "xxx" and "xxx.1" are the same, it means that the log was already downloaded, and there is nothing to do. Exit.

* Otherwise,
  - rename "xxx.1" to something like "xxx-2007MMDD-HHMM"
  - copy "xxx" to "xxx.1"

As a result, we always have the latest rotated log "xxx", a copy of it ("xxx.1") to simplify programming, and a set of timestamped old logs.
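The steps above can be sketched roughly as follows. This is not the original script, only a minimal illustration of the same logic. The function names (`download`, `same_content`, `archive_if_new`), the rsync options, and the choice to stamp the archived file with the current time are all assumptions.

```python
import shutil
import subprocess
from datetime import datetime
from pathlib import Path
from typing import Optional


def download(remote: str, local: Path) -> None:
    """Fetch the rotated log with rsync (SSH key authentication is assumed)."""
    subprocess.run(["rsync", "-az", remote, str(local)], check=True)


def same_content(a: Path, b: Path, chunk: int = 1 << 20) -> bool:
    """Compare two files byte by byte without loading them fully into memory."""
    if a.stat().st_size != b.stat().st_size:
        return False
    with a.open("rb") as fa, b.open("rb") as fb:
        while True:
            block = fa.read(chunk)
            if block != fb.read(chunk):
                return False
            if not block:  # both files ended at the same point
                return True


def archive_if_new(local: Path, now: Optional[datetime] = None) -> bool:
    """Archive the previous copy if `local` holds a new log.

    Returns False when "xxx" and "xxx.1" are identical (already downloaded).
    """
    previous = local.with_name(local.name + ".1")
    if previous.exists():
        if same_content(local, previous):
            return False
        # Assumption: the archive name uses the time of archiving.
        stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M")
        previous.rename(local.with_name(f"{local.name}-{stamp}"))
    # Keep a copy of the newest log as "xxx.1" for the next comparison.
    shutil.copy2(local, previous)
    return True
```

A cron job would first call `download("user@host:/path/to/xxx.1", Path("xxx"))` and then `archive_if_new(Path("xxx"))`; the host and paths here are placeholders.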

The script is executed from cron at the moment "X-45min".

At the moment "X" I start work and can check the new logs.
