March 6th, 2014
The other day, I was writing some Ruby code to go through a list, apply a function to it, and return the first element that the function didn’t return nil for. A good example would be matching a RegExp, and returning the MatchData for the first result element of the array that matched.
There is a Enumerable#grep function, which almost does this, but not quite. Other than that, there didn’t seem to be a function.
Now, you could do this with a map, followed by a find, but that seems inefficient since you only want the first non-nil result. Probably a first world programming problem, but it got me thinking if there was a way to do this efficiently, simply and idiomatically.
Hoogle is a great tool for looking for Haskell functions based on their type signature. I figured if there was a Haskell function for this, I wouldn’t be going crazy and it might give me an idea of what the name is called. In this case, we want a function that takes a list of something, a function that converts that something into either None or Maybe something else (equivalent to returning nil or something in Ruby, but in a statically-typed way). The Haskell signature is therefore [a] -> (a -> Maybe B) -> Maybe B.
Search for that on Hoogle, and you don’t get any useful results. So am I crazy for wanting it? Well, no.
Because Haskell is lazy, map followed by find is efficient, because the map will only be run as many times as needed to get output from find.
This reminded me of a more general problem with Ruby – its reduce function is strict (not lazy), so it always consumes the whole list, even if you don’t need it to. So, since this list function can be written as a reduce, if we had a lazy reduce, we would have our solution.
Let’s take a simple definition of reduce, and rewrite it to be left associative and lazy.
def reduce(initial, &block) each do |x| initial = block.call(initial, x) end initial end
Okay, so how do we make this lazy? The block normally gets the initial value (or memo value), and the next value, and returns a new memo value. Instead, we’ll get the next value in the list, and a block we can call that reduces the rest of the list (and returns initial on an empty list.)
def reduce_left(initial, &block) again = lambda do |list| if list.empty? initial else thunk = lambda do again.call(list.rest) end block.call(list.first, thunk) end end again.call(self) end
Calling list.rest like this is inefficient, so we’ll rewrite it to use an Enumerator in a minute. Firstly, let’s see how we could use it.
Without further ado, the first non-nil result, or nil is:
def map_first(&block) reduce_left(nil) do |x, rest| block.call(x) || rest.call end end
Note this has the short-circuiting OR structure we would expect.
An efficient implementation of reduce_left
We should use an Enumerator, so it would look like this:
def reduce_left(initial, &block) e = self.each again = lambda do block.call(e.next, again) end again.call rescue StopIteration initial end
And it’s even less code and fewer lambdas. Thus concludes the blog post.