SteamCode: October 2010

DSLs (Domain Specific Languages) are small languages built on top of general purpose languages to accomplish specific tasks. Examples of DSLs are things like rake/make for building executables, bc (the Unix/Linux calculator), and macro systems like the C/C++ precompiler and M4. Ruby, because of its syntax provides a great platform for building DSLs.

Recently, at work, we ended up needing to monitor a log file for a particular string. Our OPS group implemented this, but it turned out that we were getting quite a few false positives from the string by itself. We could however do a database check on a couple of items in the string to remove these false positives. The logging software we use didn't really allow this so we ended up putting together a .NET application (Windows box) that would be called when the string was found, it would do the database check, and then do the notifications as necessary. I started thinking that it would be nice if you could match a string and then provide some code to do whatever you'd like in a monitoring type system. Since I'd been looking at DSLs, this seemed like it might be a natural application for them.

Let's start out with a simple program to provide us a log file to monitor ...


while true
    v = rand
    if v < 0.1
        s = "Error"
    elsif v < 0.2
        s = "Warn"
    else
        s = "Info"
    end
    puts "#{Time.new}: #{s}"
    $stdout.flush
    sleep 1
end

So essentially, do forever, get a random number, 1/10 of the time put out an Error, 1/10 of the time put out a Warning, and the rest of the time put out an Info. These will all be preceded by the date/time. Obviously, in a real system, you'd just use your own logs. For this, I just ran this and redirected the output to mylog.txt.

OK, here's the DSL we'd like to implement. I put it in monitor.mon and when we run the program, we'll "interpret" this file. Here it is ...


monitor "Er.*r" do |line, match|
    puts "Error in: #{line} "
end

monitor "Warn" do |line, m|
    puts "Warning in: Match: #{m[0]}: #{line} "
end

monitor "Info"

So in the first item to monitor, we look for something like an "Error" (the ".*" is in there really just to show this can really be a regular expression. The second thing to monitor is the word "Warn", and finally the work "Info". In the first two cases we print a message, with the 2nd one actually showing the "match". We don't give a block to the 3rd one, but the default is to print a message too. Now even though we've done something pretty simple here, there is nothing preventing us from doing whatever we'd like in them that we can do in Ruby. Here, I'm thinking of things like emailing using possibly mailit) or database access as in our above problem using Sequel. At this point, we can "shell out" to the general purpose programming language of Ruby.

So, now we need to interpret the above file. Here's the code for this


require 'file/tail'

class MonitorLog 

    def initialize(filename, number)
        @filename = filename
        @number = number
        @monitors = []
    end

    # Add a new monitor (regex/block).
    def add_monitor(monitor)
        @monitors << monitor
    end

    # Monitor the log using the filename passed in. For each
    # line we'll check for a monitor match.
    def monitor
        File::Tail::Logfile.open(@filename) do |log|
            log.backward(@number).tail do |line| 
                @monitors.each do |m|
                    m.match line
                end
            end
        end
    end

end

# The Monitor class which is a type of Regexp with a block given for checking.
class Monitor

    # Create a new monitor with a regex and a block.
    def initialize(regex, block)
        @regex = Regexp.new regex
        @block = block
    end

    # Match a line. If we match, then we'll call the block
    # with the line and the MatchData.
    def match(line)
        m = @regex.match(line)
        @block.call(line, m) if m != nil
    end
end

# This will normally come from the monitor file where there will be a regular
# expression and a block to execute when it is found. If no block is given,
# we'll just print the line.
def monitor(regex, &block)
    if block_given?
        $monitor_log.add_monitor Monitor.new(regex, block)
    else
        $monitor_log.add_monitor Monitor.new(regex, lambda { |l, m| puts "Match: #{l}" })
    end
end

if __FILE__ == $PROGRAM_NAME

require 'getoptlong'

def usage
    puts "Usage #$0 [-n number] [-f filename] [-m monitor_filename]"
end

# Set up the command line options
opts = GetoptLong.new(
    ["--number", "-n", GetoptLong::REQUIRED_ARGUMENT],
    ["--filename", "-f", GetoptLong::REQUIRED_ARGUMENT],
    ["--monitor_file", "-m", GetoptLong::REQUIRED_ARGUMENT],
    ["--verbose", "-v", GetoptLong::NO_ARGUMENT],
    ["--help", "-h", GetoptLong::NO_ARGUMENT]
    )

# Set the default values for the options
number = 10
filename = 'monitor.log'
monitor_filename = 'monitor.mon'
$verbose = false

# Parse the command line options. If we find one we don't recognize
# an exception will be thrown and we'll rescue with a usage.
begin
    opts.each do | opt, arg|
        case opt
        when "--number"
            number = arg.to_i
        when "--filename"
            filename = arg
        when "--monitor_filename"
            monitor_filename = arg
        when "--verbose"
            $verbose = true
        when "--help"
            usage
            exit
        end
    end
rescue
    usage
    exit
end


# Create the montior log. We make it global as it's used for the monitor()
# creation method above.
$monitor_log = MonitorLog.new(filename, number)

# Load in the monitor file DSL.
load(monitor_filename)

# Start monitoring.
$monitor_log.monitor

end

Our first class is MonitorLog which takes a filename to monitor and a number the number of lines to start back in the file. We also have an array for the different monitors we define in the monitor.mon file. There's a add_monitor which takes a monitor and adds it to the array and finally, the monitor (I know too many monitor variables) which uses the file-tail gem. This method will simply tail the file and for each line call each of the monitors. The Monitor class contains a regular expression to match and a block to execute on a match. In the match method, we check the regular expression and if it matches, we'll call the block with the line we matched and the MatchData. Here, if we had a more interesting regex than I've shown so far, you could grab out things like login IDs or similar from the MatchData if you needed it for something rather than reparsing the line.

Now comes the "interesting" piece, the monitor (I know, I know). Note that this matches with the monitor.mon name and indeed this is what gets called when that file is loaded (see below). The two parameters are the regex and potentially a block. If the block is given we create a Monitor using them and if there's no block, as in the Info then we'll create a lambda that will print out "Match" and the line.

The rest of the code is the "main" program, getting the command line arguments, creating a new $monitor_log (made global for use in the monitor method, here, I'd be interested in a better way to do this), loading the monitor.mon file, and finally starting the monitoring.

Here's how to run this code ruby Monitor.rb -n 20 -f mylog.txt -m monitor.mon.

There's quite a bit missing here to make this a truely useful application. Certainly adding email would get you a long way towards this and is really easy to do. The slightly more difficult thing is to make this work with rotating log files or even multiple log files and these are left as an exercise for the reader ;-).

Let me know if you have questions or comments.

I saw a comment from one of my Facebook friends that October 2010 had five Fridays, Saturdays, and Sundays and that this occurred only once every 823 years. Now this didn't sound exactly right to me and it didn't take much time to figure out why. Take a look at the following calendar


   Su  Mo  Tu  We  Th  Fr  Sa
                        1   2
    3   4   5   6   7   8   9
   10  11  12  13  14  15  16
   17  18  19  20  21  22  23
   24  25  26  27  28  29  30
   31

You can see pretty easily from this, that a) we'll have the five Fr/Sa/Su anytime we start October on a Friday and b) no other configuration will have this. Now just thinking about this off the top of my head, I figured that this should happen about ever 7 years (disregarding leap years).

Now as it happens, I'd also just finished a short ruby method for Zeller's Congruence that I'd written from an article on Programming Praxis. So, I combined the two and came up with the following ...


# Zeller’s Congruence is a simple mathematical method for determining the day
# of the week for a given date.
# 
# In the 1880s, Christian Zeller, a German mathematician, noticed that, if you
# run a year from March through February, the cumulative number of days in each
# month forms a nearly straight line; that works because February, which would
# normally perturb the straight line, is moved to the end. He worked out the
# formula ⌊(13m−1)/5⌋ to give the number of weekdays that the start of the
# month moves each month, where m is the month number.
# 
# Then it is easy to calculate the day of the week for any given day: add the
# day of the month, the offset for the number of months since March, an offset
# for each year, and additional offsets for leap years and leap centuries
# (remembering to subtract one year for dates in January and February), taking
# the whole thing mod 7. It’s fun to work out the arithmetic yourself, but if
# you don’t want to take the time, the whole formula is shown in the solution.
# 
# Your task is to write a function that uses Zeller’s Congruence to calculate
# the day of the week for any given date. When you are finished, you are
# welcome to read or run a suggested solution, or to post your own solution or
# discuss the exercise in the comments below.#
#
# f = k + floor(13m - 1) / 5) + d + floor(d / 4) + floor(c / 4) - 2c mod 7
#
#   Su  Mo  Tu  We  Th  Fr  Sa
#                        1   2
#    3   4   5   6   7   8   9
#   10  11  12  13  14  15  16
#   17  18  19  20  21  22  23
#   24  25  26  27  28  29  30
#   31
#
#
#
def zeller(year, month, day)
    months = %w[march april may june july august september october november december january february]
    weekdays = %w[Sunday Monday Tuesday Wednesday Thursday Friday Saturday]
    k = day
    m = months.index(month.downcase) + 1
    y = (m <= 10) ? year : year-1
    d = y % 100
    c = y / 100
    f = (k + (((13*m) - 1) / 5).floor + d + (d/4).floor + (c/4).floor - (2*c)) % 7
    weekdays[f]
end

1900.upto(2300) do |y|
    puts "Five Friday, Saturday, and Sundays for October, #{y}" if zeller(y, "october", 1) == "Friday"
end

Here, we'll just go through the years from 1900 to 2300 or about 400 years and see how many times the five Fr/Sa/Su pattern occurs. If we run this and pipe it through wc -l, we end up with 56. Interestingly enough, 400 / 7 is 57 which is what we originally calculated above ignoring the leap years.

Let me know if you have any questions or comments.

SteamCode

Saturday, October 16, 2010

Log Monitoring with a Ruby DSL

Tuesday, October 12, 2010

Zeller's Congruence and Five Fr/Sa/Su in October

Followers

Blog Archive

About Me