TinselCity What you see is the application



Y combinator in Javascript

25 Feb 2010
Posted by gonzalo

When programming, there're problems we find almost everyday. Then there're some problems we find just once in a while. Lastly, some of them will nerver or rarely cross our path. It's not that the Y combinator doesn't have practical uses, it's just that it's only needed in very specific contexts.

The problems with recursivity

Recursivity is a problem in itself for a number of inexperienced (and some experienced) programmers. But even for those of us who understand it, recursivity still has some hard to fix details in certain situations.

First and foremost is optimization. We tend to relish in the elegance of a recursive solution but a lot of times we fail to take into account the cost of using a recursive algorithm. Specially in some particular cases. Right now my poor Athlon 64 3500+ is spending a few minutes applying 95% of its power to obtain the 50th term in the Fibonacci sequence recursively. The mathematical operations on each step is just a simple sum, but to make that calculation the number of calls grows exponentially with each term of the series. For the fiftieth term around 40 billion calls are made.

This can be solved in various ways, of course. But there're details which sometimes can't be solved so easily. What if the language itself doesn't support a certain behavior or construction? Let's take a look at recursivity over anonymous functions.

How do we refer to a function from within itself, when this function doesn't have a name? Let's have this example:

    // generates a function which can be used to calculate Fibonacci numbers
    function FibonacciGenerator() {
        return function(n) {
            return (n<2)? n : _this_function_(n-1) + _this_function_(n-2);
        };
    }
    var Fibonacci = FibonacciGenerator();
    alert(Fibonacci(50));

What reference do we use instead of the _this_function_ placeholder? Sure, most of you will have guessed the easy answer to that in Javascript: arguments.callee! And you're right. But while that works it does have some bothersome details. First there's the uncertain future of arguments.callee in the language and then there's the fact that using it does have an impact on performance. In some JS engines the hit is quite noticeable.

You may be thinking of another solution: Giving a name to the anonymous function, like so:

    function FibonacciGenerator() {
        return function f(n) {
            return (n<2)? n : f(n-1) + f(n-2);
        };
    }

which is valid... but doesn't work in some JS engines.

So, forgetting for a moment those easy solutions, let's find a completely different way of implementing recursivity.

The function as a value

First of all, let's be aware that we're dealing with a functional system, where functions are first order entities, values which can be used as any other value (a number, an array...). We can, for example, pass a function as an argument to another function.

Ahá! So we could do something like this:

    function FibonacciGenerator(f) {
        return function(n) {
            return (n<2)? n : f(f,n-1) + f(f,n-2);
        };
    }

This way, we're keeping the reference to the function unsolved until we pass it to FibonacciGenerator. This way we could try to do something like this:

    var Fibonacci = FibonacciGenerator(Fibonacci);

Which obviously doesn't work, because when we pass Fibonacci to FibonacciGenerator it still doesn't hold the reference to our function. Tsk!

Anyway, before continuing let's make a small change to the code above. This f(f, n-1) is a bit bothersome, because our Fibonacci function has an arity of one (is a single argument function) and this introduces functions which take 2 arguments. The easy solution to this is writing it simply as f(f)(n-1). f takes a first argument and returns a function which takes the second argument.

So, we write the above as:

    function FibonacciGenerator(f) {
        return function(n) {
            return (n<2)? n : f(f)(n-1) + f(f)(n-2);
        };
    }

Now, back to our expression: Fibonacci = FibonacciGenerator(Fibonacci). Let's take a good look at this. This expression, written in a more mathematical way, is of the form x = f(x), where x is the value we want. If you remember your calculus, this is the definition of a fixed point for function f. This an interesting insight into what the Y combinator is. We still have some way to go before you realize this completely, but for now what we can see is that what we wanted when we wrote that

    var Fibonacci = FibonacciGenerator(Fibonacci);

is just a fixed point for FibonacciGenerator. In this case, FibonacciGenerator is a function which receives functions and returns functions, which just makes it a little more confusing than what you may have learnt in Calculus but it's actually the same. So, from an immediate point of view we're looking for a fixed point to FibonacciGenerator. From a more general point of view, we want a way to make that search into an algorithm which will abstract the mechanism of recursion.

The Y combinator

The definition for the Y combinator is exactly the solution we're seeking. The Y combinator is defined as the combinator (which is just a name for a function that takes and returns functions) that takes any function and returns the fixed point for that function. (Actually, not of any function, but better not make things to complicated if we don't want to make it fun).

So, Y is defined such that:

    Y(f) = f(Y(f));

Nothing new here: Y is a function which takes f as argument and returns its fixed point. There're numerous texts available on-line that explain the derivation of Y. Some are quite nice and understandable. Some even don't use Lisp to do it! Gasp! WE have already made an important part of the way to here, so I hope you can follow me to the end without too much confusion :)

From the last definition of FibonacciGenerator we have, let's extract the reference to f(f):

    function FibonacciGenerator(f) {
        return function (n) {
            var g = function(h,n) {
                return (n<2)? n : h(n-1) + h(n-2);
            };
            return g(f(f),n);
        };
    }

And to make this a bit more operative, we, again, separate the arguments to the most inner function (g):

    function FibonacciGenerator(f) {
        return function (n) {
            var g = function(h) {
                return function(n) {
                    return (n<2)? n : h(n-1) + h(n-2);
                };
            };
            return g(f(f),n);
        };
    }

Now, g is a function which does not have any external references beyond its own argument (h). We can, then, extract it from FibonacciGenerator:

    var g = function(h) {
        return function(n) {
            return (n<2)? n : h(n-1) + h(n-2);
        };
    };
 
    function FibonacciGenerator(f) {
        return function(n) {
            return g(f(f),n);
        };
    }

Pause! What do we have here? On one hand we've got g, a function which is not recursive and somehow represents quite well the algorithm of the original Fibonacci problem. That is to say, if we pass it a certain function h it returns a function that for a number n it calculates h(n-1) + h(n-2).

On the other hand we have FibonacciGenerator, which basically contains the mechanism to obtain the fixed point for the previous g function. don't loose track of the target: That fixed point is exactly the function we're looking for.

With a couple of transformations on FibonacciGenerator, we could write it as:

    function FibonacciGenerator(f) {
        function U(g) { return g(g) }
        function S(h) {
            return function(x) {
                return f(h(h))(x);
            }
        }
        return U(S);
    }

This doesn't contain any external references, so the name falls short for what it does. This function calculates the fixed point for any given function we pass, not just for our g above. So we should change its name, and while we're at it, let's write it in a way a bit more solid:

    var Y = function(f) {
        return function(fn) {
            return fn(fn);
        }(function(fn) {
            return f(function() {
                return fn(fn).apply(null, arguments);
            });
        });
    }
Anybody missing?

It's easy getting lost in the way before reaching this point, so let's recap explaining the content of Y and what it does.

Y is a function that receives a parameter f (another function). Internally it defines function(fn) { return fn(fn); } which incidentally is know as the U combinator and is a combinator which simply applies a function to itself (U(f) = f(f)). Y then applies U to a certain function and returns the result. The function it applies U to is the one you see there. In general terms what this does is apply the mechanism of recursivity to f, the argument to Y. The returned result is the fixed point for f.

Now, the only thing that remains is how to define those functions that we can pass to Y. Going back up there, we see that they have this shape:

    function(h) {
        return function(n) {
            return (n<2)? n : h(n-1) + h(n-2);
        };
    };

That is, it is a high order function that, receiving some appropriate function as argument, returns the Fibonacci function we've been searching for all this time. So finally we can do:

    var calculaFibonacci = Y(function(h) {
        return function(n) {
            return (n<2)? n : h(n-1) + h(n-2);
        };
    });
Ok, so what's all this for?

If Y didn't have any other use out of a theoretical context (or better said, a combinatory context), it would still be interesting at least as an exercise. Still, while it does not have a wide application to everyday problems, it does have some. Y provides a way to shift the control of recursivity to our program's space instead of relying in the language's own mechanisms. This control is not something that we often need, but sometimes it can be useful.

For example, having control over the mechanism of recursion means that we could have an algorithm that refines the approximation of a certain calculation, and controlling recursion we could limit the precision or the time spent in it externally, from our control combinator.

Another use could be inserting in the process of recursion a way to cache previously calculated results. In a naive calculation of Fibonacci(7), we'd have to calculate F(6) and F(5), and for F(6) we'd have to calculate F(5) -again- and F(4). F(5) requires calculation of F(4) again and F(3)... As you see there're a lot of calculations which are repeated several times (an exponential number of times as I said in the beginning). We could memoize those calls in our combinator, needing no changes in the original Fibonacci algorithm. So optimizing 1 combinator we'd speed up all recursive methods.

To be honest, in this case we'd need to modify our Y combinator a bit. As an example here's what a memoizing combinator could look like:

    function Ym(F,cache) {
        cache = cache || {}; // If we don't have one yet, we initialize a cache.
        return function(arg) {
            if (cache[arg]) return cache[arg];
            var result = (F(function(n){
                return (Ym(F,cache))(n);
            }))(arg);
            cache[arg] = result;
            return result;
        };
    }

Just test this with:

    var Fibonacci1 = Y(function(h) {
        return function(n) {
            return (n<2)? n : h(n-1) + h(n-2);
        };
    });
    var Fibonacci2 = Ym(function(h) {
        return function(n) {
            return (n<2)? n : h(n-1) + h(n-2);
        };
    });
    var d = new Date();
    var r = Fibonacci1(50);
    alert(r + ": " + (new Date() -d));
    var d = new Date();
    var r = Fibonacci2(50);
    alert(r + ": " + (new Date() -d));
Tags:


Ver mi perfil en debug_mode=ON



Follow genezeta on Twitter