Intro to lazy evaluation and functional DP

Aug 16

post haskell project-euler

Functional languages are assumed to be inferior at tackling problems that are best solved with dynamic programming. This belief is largely based on the absence of state and mutation from pure languages. It is true that DP often requires a global state (the memoization table), and continuous updates to the said state.

With lazy evaluation, however, a declarative solution's elegance far supersedes its imperative counterparts.

A dynamic programming approach to any problem involves finding solutions to smaller sub-problems of similar nature, and then combining them to obtain the answer.

Consider the Fibonacci sequence for example:

fib 0 = 1
fib 1 = 2
fib n = fib (n - 1) + fib (n - 2)

Most imperative solutions to computing the Nth Fibonacci number would involve a table of some sort, where each number i is mapped to Fib_i. The first two items in the table are set to 0 and 1 (or 1 and 2), and the program then iterates over the size of the table, setting fib[i] to fib[i - 1] + fib[i - 2]. One such solution is discussed here.

Project Euler #2

This is the question posed in problem 2 of the Project Euler archive:

A new term in the Fibonacci sequence is generated by adding the previous two.
Starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

Considering the terms in the Fibonacci sequence whose values do not exceed 
four million, find the sum of the even-valued terms

Over the remainder of this post, we'll go over two solutions to this challenge - an obvious solution using python, and a slightly more elegant approach in Haskell.

The goal is to understand lazy evaluation and functional DP. Prior knowledge of Haskell will help, but isn't necessary.

Imperative solution

There is not much to go over with this one:

def sum_even_fibs_below(n: int) -> int:
    fibs_cache = list(range(1, 40))
    for i in range(3, len(fibs_cache)):
       fibs_cache[i] = fibs_cache[i - 1] + fibs_cache[i - 2]
    sieve = lambda x: x % 2 == 0 and x < n
    return sum(filter(sieve, fibs_cache))

if __name__ == '__main__':
    print(sum_even_fibs_below(4000000))

First, we have fibs_cache - a standard DP table that we update in O(n) time. Then we filter out entries that are even (or larger than 4M), and sum over the list.

Mark the presence of a "state" object in the python program - namely, the fibs_cache list. We update the items in this cache within a loop.

The rest of the operations in the code are stateless and referentially transparent. To implement memoization, we're going to leverage two features of Haskell:

Lazy evaluation.
Infinite lists.

The second is really just a natural consequence of the first.

Lazy evaluation

Haskell is a lazily evaluated language, wherein no value is evaluated until it's needed. To understand what this means, consider the following snippet:

let a = get_a () -- takes 1 seconds to compute
    b = get_b () -- takes 2 seconds to compute
    c = get_c () -- takes 3 seconds to compute
in print (a + b)-- takes 1 + 2 = 3 seconds

When we call print, the expression a + b must be evaluated so that it can be written to stdout. Before the runtime encounters the call to print, however, a, b and c remain "frozen".

Since we made the assumption that get_a and get_b require 1 and 2 seconds to compute, a + b will block the thread for a total of 3 seconds, whereas get_c will never be visited.

Here we notice Haskell's ability to leave chunks of code unevaluated until necessary. This will turn out to be our cornerstone in tackling dynamic programming problems with a functional arsenal.

Infinite lists

Lazy evaluation enables a neat trick - infinite lists. Fire up the ghci REPL and enter the following expression:

Prelude> [1, 2, ..]

You will notice that the REPL infinitely prints contents of this list until halted. If you bind to a variable, however, the REPL won't evaluate it right away:

Prelude> xs = [1, 2 ..]

We can use xs like a regular linked list:

Prelude> xs !! 4
5
Prelude> takeWhile (< 5) xs
[1,2,3,4]

To understand the "laziness" better, let's strip away the syntactic sugar. Without list comprehensions, we can write xs this way:

xs :: [Int]
xs = 1 : next xs
  where
    next (x : rest) = (x + 1) : next rest
    next [] = []

We tell Haskell that xs is a linked list where the first item is 1, and the rest of the items are given by the function call: next xs. Since Haskell is lazy, it won't make the effort to actually evaluate next xs. This self reference of xs is therefore considered safe.

However, when we ask for the nth element in xs, the runtime will start iterating over the list, evaluating the values and chasing pointers until it reaches the desired slot.

Visually, an infinite list can be thought of as a linked list where the head is some value, and the body is a "frozen" computation which returns the rest of the list upon thawing. Calling tail xs will take a step forward and return 2 -> [ ... ]. Calling tail $ tail xs will take 2 steps forward, and return 3 -> [...].

To get comfortable, try working this out on paper.

An infinite list in Python

Python is not lazy, but we can emulate infinite lists with some effort. This is purely to build an intuition for the subject, and is not idiomatic Python code.

We start by defining make_list - a function that returns an infinite list. Later, we're going to define rest and index_list, to mirror Haskell's tail and (!!) functions.

def make_list(first_item):
    """
    Create an infinite list where every item
    is greater than the last by 1 
    """
    xs = [lambda: first_item]
    xs.append(rest(xs))
    return xs

To get around the lack of self-references in the language, we exploit the fact that non-primitives are stored by reference. i.e - The second element of xs can store a reference to xs.

Every element in xs is a function. This is to prevent the interpreter from eagerly evaluating the values returned by the function. The first element is simply the head of the list, and the second element is a function that, when called, returns the body of xs.

Next, we define rest like so:

def rest(xs):
    """
    Returns the body of a linked list, minus the head.
    """
    return lambda: [lambda: xs[0]() + 1, rest(xs[1])]

It looks gnarly at first, but becomes easier to grasp once you realize that we're really just using lambda to "freeze" a value. The interpreter won't evaluate a frozen value unless we "thaw" it using the call operator - ().

At any given moment, our list always contains two items:

A "frozen" head.
A "frozen" body, containing an unevaluated recursive call to rest. This will again return a 2-element list of the same shape.

You could imagine the function rest as a taking a step forward in our linked-list:

rest [1, <...>] = [2, <...>]
rest [2, <...>] = [3, <...>]

Next, we define two simple helpers:

def head(xs):
    """Returns the head of an infinite list"""
    return xs[0]()

def index_list(xs, i):
    """
    Get the ith item in an infinite list
    """
    curr = xs
    for _ in range(i):
        curr = rest(curr)()
    return head(curr)

With that, we have an infinite list similar to the one we wrote in Haskell.

xs = make_list(1)
print(index_list(xs, 100)) # 101

An infinite list to draw Fibonacci numbers from

We've established that infinite lists are really just regular values that are left unevaluated thanks to Haskell's lazy nature.

By now, an infinite list that stores Fibonacci numbers should be easy to digest:

fibs :: [Int]
fibs = 1 : 2 : next fibs
  where
    next (x : xs) = (x + head xs) : next xs
    next _ = []

If this is still confusing, try working out the evaluation of fibs !! 4 on paper. You'll see that the fibs list is nearly identical to xs.

In the problem, we are asked to find out the sum of all even fibonacci numbers that are less than 4000000. Try taking a stab at it before reading ahead.

You'll see that all we need to do now is translate the Python solution to Haskell.

Keep pulling items out of fibs until we hit a number larger than 4 million.
Filter out all the odd numbers.
Sum over the remaining numbers.

Here is the full solution, around 12 lines of haskell:

fibs :: [Int]
fibs = 1 : 2 : next fibs
  where
    next (x : xs) = (x + head xs) : next xs
    next _ = []

sumEvenFibsBelow :: Int -> Int
sumEvenFibsBelow n = sum evenFibsBelowN
  where
    evenFibsBelowN = filter even fibsBelowN
    fibsBelowN = takeWhile (< n) fibs

main :: IO ()
main = print $ sumEvenFibsBelow 4000000

Of course, fibonacci series is only a basic introduction to dynamic programming. Other problems of similar nature can also be modelled on top of lazy lists. As an exercise, you could try attempting this leetcode problem using memoization and infinite lists.