Tuesday, November 1, 2011

Ruby Distance Calculation

I was just looking at a clustering algorithm and needed a distance calculation between two points in n-dimensional space. If you remember your high school geometry (or even earlier) the distance between two points in the xy plane is the sqrt((x1 - x2)**2 + (y1-y2)**2). Where the two points are represented by (x1, y1) and (x2, y2). For more dimensions, just add values inside the sqrt as in (z1-z2)**2 for a third dimension. Given that here's a one line function that I came up with


def distance(a, b)
Math.sqrt(a.zip(b).inject(0) { |d, c| d + (c[0] - c[1]) ** 2 })
end


It's a bit tricky, so let's go through it. First we have the Math.sqrt which will just take the square root of whatever is inside it. Next the a.zip(b) will take two arrays (here the input parameters) and take the first values of each of the arrays, create a new array from them and then add that to the output array (see also this post for more information on zip). Same with the second values of the arrays and so on. For example if we have [x1, y1, z1].zip([x2, y2, z2]) this will return [[x1, x2], [y1, y2], [z1, z2]] (hopefully, this looks like something we can use to you). Next we're going to use inject to run through this array with a starting value of 0 (d) and add to it the square of the difference of the two values in the array. Finally as noted before, take the square root and return.

If you have any problems with this, break it up into multiple lines and check the intermediate values. As always, let me know if you have any questions.

1 comment:

  1. It might also be useful to decide if you need a distance function, or a metric in euclidean space.

    I think you'd find that if you just need to compare ("this is closer than that from a common point"), you might just use the sum-of-squares-of-differences and not the square-root of it. As a metric, it will need much less computation. Then when you need the distance, you have its square and just run sqrt then.

    ReplyDelete