Check logfiles for recent entries only

wordpress meta

title: 'Check Logfiles For Recent Entries Only'
date: '2015-03-15T09:57:53-05:00'
status: publish
permalink: /check-logfiles-for-recent-entries-only
author: admin
excerpt: ''
type: post
id: 853
category:
    - Logging
    - Python
tag: []
post_format: []

Frequently I have a cron job that checks log files for specific entries, but I want to avoid being notified about something that was already checked. For example, I want my 10-minute cron job to look only at entries from the last 10 minutes.

Here is what I did in Python.

from datetime import datetime, timedelta

## Get time right now. ie cron job execution
#now = datetime(2015,3,15,8,55,00)
now = datetime.now()

## How long back to check. Making it 11 mins because cron runs every 10 mins
checkBack = 11

lines = []

print("log entries newer than " + now.strftime('%b %d %H:%M:%S') + " minus " + str(checkBack) + " minutes")

with open('/var/log/syslog', 'r') as f:
    for line in f:
      ## Linux syslog format like this:
      ## Mar 15 08:50:01 EP45-DS3L postfix/sendmail[6492]: fatal
      ## Brain dead log has no year, so borrow the current one. This hack will not work close to year ticking over
      ## "Mar  1" has a double space vs "Mar 15". Collapse the whitespace
      ## so strptime %d always sees a single separator.
      myDate = str(now.year) + " " + " ".join(line[:15].split())

      try:
        lt = datetime.strptime(myDate,'%Y %b %d %H:%M:%S')
      except ValueError:
        ## Line does not start with a syslog timestamp, skip it
        continue
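      ## Keep only entries inside the look-back window. The upper bound drops
      ## timestamps that land in the future (e.g. December lines read in January).
      if now - timedelta(minutes=checkBack) < lt <= now:
          # print(myDate + " --- diff: " + str(now - lt))
          lines.append(line)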

if lines:
    # lines keep their trailing newline, so join on the empty string
    message = ''.join(lines)
    # do some grepping for my specific errors here..
    # send message per mail...
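
The year problem mentioned in the comments can probably be worked around by assuming that no log entry comes from the future: if a timestamp parsed with the current year lands after now, it most likely belongs to the previous year. Here is a minimal sketch of that idea for Python 3. The names `parse_syslog_time` and `recent_entries` are made up for this example, the one day tolerance for clock skew is an assumption, and the sample lines are invented for the demo.

```python
from datetime import datetime, timedelta

def parse_syslog_time(line, now):
    """Return the datetime for a syslog line, or None if it has no timestamp."""
    # Collapse the space padding in "Mar  1" so the format always matches
    stamp = " ".join(line[:15].split())
    try:
        parsed = datetime.strptime(str(now.year) + " " + stamp, "%Y %b %d %H:%M:%S")
    except ValueError:
        return None
    # Assumption: a timestamp more than a day ahead of now is really last year's
    if parsed > now + timedelta(days=1):
        try:
            parsed = parsed.replace(year=now.year - 1)
        except ValueError:
            # e.g. Feb 29 does not exist in the previous year
            return None
    return parsed

def recent_entries(lines, now, minutes):
    """Yield the lines whose timestamp falls within the last `minutes` minutes."""
    oldest = now - timedelta(minutes=minutes)
    for line in lines:
        when = parse_syslog_time(line, now)
        if when is not None and oldest < when <= now:
            yield line

# Invented sample: the cron job runs five minutes after New Year
sample = [
    "Dec 31 23:55:01 EP45-DS3L postfix/sendmail[6492]: fatal\n",
    "Jan  1 00:03:10 EP45-DS3L postfix/sendmail[6493]: fatal\n",
    "Dec 31 20:00:00 EP45-DS3L postfix/sendmail[6400]: fatal\n",
]
now = datetime(2016, 1, 1, 0, 5, 0)
for line in recent_entries(sample, now, 11):
    print(line, end="")
```

With the sample above, the first two lines are printed and the 20:00 entry is dropped. Passing `now` in as a parameter, instead of calling `datetime.now()` inside the function, keeps the check easy to test with a fixed time.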

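Just for reference, here is an older test that does not use the year. It does a plain string compare, and I have not tested it well enough. It will fail when the month ticks over, because as a string "Apr" is not bigger than "Mar". Midnight within the same month is fine, since the day is compared before the time, but single digit days are a problem: syslog pads them with a space ("Mar  1") while strftime %d pads them with a zero ("Mar 01").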

from datetime import datetime, timedelta
now = datetime.now()
lookback = timedelta(minutes=5)

## Linux syslog format "Mar 15 07:30:10 ..."
## Caveat: strftime %d zero pads ("Mar 01") but syslog space pads ("Mar  1")
oldest = (now - lookback).strftime('%b %d %H:%M:%S')

with open('/var/log/syslog', 'r') as f:
    for line in f:
        if line[:15] > oldest:
          print("entry: " + line[:15] + " --- " + line[16:50])
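
The doubts about the string comparison are easy to confirm with a few literal timestamps. This is a small Python 3 check rather than a fix. The timestamps are made up for illustration, and `as_time` is a hypothetical helper that borrows a fixed year so the values can be parsed.

```python
from datetime import datetime

# Month names compare alphabetically, not chronologically: 'A' sorts before 'M'
month_rollover = "Apr 01 00:00:00" > "Mar 31 23:59:59"

# Midnight within one month works, because the day is compared before the time
midnight = "Mar 16 00:00:00" > "Mar 15 23:59:00"

# syslog writes "Mar  9", strftime('%d') writes "Mar 01": a space sorts before '0'
space_padding = "Mar  9 12:00:00" > "Mar 01 00:00:00"

def as_time(stamp, year=2015):
    """Parse a syslog style timestamp, borrowing a year (an assumption)."""
    normalized = " ".join(stamp.split())  # collapse the space padding
    return datetime.strptime(str(year) + " " + normalized, "%Y %b %d %H:%M:%S")

# Comparing parsed datetimes gives the chronological answer in both cases
parsed_rollover = as_time("Apr  1 00:00:00") > as_time("Mar 31 23:59:59")
parsed_padding = as_time("Mar  9 12:00:00") > as_time("Mar  1 00:00:00")

print(month_rollover, midnight, space_padding)   # False True False
print(parsed_rollover, parsed_padding)           # True True
```

So the string version is only safe for entries within the same month on days 10 and up, which is why the datetime version further up is the one to use.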