Lacking Natural Simplicity

Random musings on books, code, and tabletop games.

Lisp-style trampolines in Common Lisp, C, Ada, Oberon-2, and Revised Oberon

Are you familiar with lisp-style trampolines? A trampoline is a loop that iteratively invokes functions that return functions. The previous link will lead you through CONS Should Not CONS Its Arguments, Part II: Cheney on the M.T.A. (PDF version; see p. 17 in the original publication, but that is the first page of the PDF that link eventually leads to), which, while saying you should just go ahead and convert to Continuation-passing style form (CPS), mentions in passing No assembly required: compiling standard ML to C, (see p. 168 in the original publication, which is the page 8 of the PDF that link eventually leads to) which leads you to RABBIT: A Compiler for SCHEME, where the concept is discussed under the name the "SCHEME UUO handler" (see p. 23–24).

Why is this useful? It allows you to compile a language that requires proper tail call optimization to one that does not provide that. For instance, if you wanted to compile Scheme, which requires proper tail call optimization, to Common Lisp, which does not require proper tail call optimization, you can't just translate Scheme functions directly into Common Lisp functions, because tail calls allocate stack space, and eventually the stack will run out of space.

Here's an example that will run forever in any standard confirming Scheme, forever.scm:

;;; Recurse forever, because with tail call optimization, the stack
;;; never runs out!
(define i 0)

(define (f)
  (set! i (+ 1 i))
  (display "call #") (display i) (newline)
  (f))

(f)

Here's not_forever.lisp, the same thing in Common Lisp:

;;; Recurse until the stack space runs out.
(defparameter i 0)

(defun f ()
  (incf i)
  (format t "call #~d~%" i)
  (f))

(f)

Now, some Common Lisp implementations don't do tail call optimization, and some do. Some don't do tail call optimization unless you compile the functions in question.

So, for instance, if I load that file into GNU CLISP 2.49.92, the function executes about 4668 times and then CLISP dies with the error message:

*** - Lisp stack overflow. RESET

However if I compile that file in CLISP with the Common Lisp function compile-file and then load the resulting .fas file into CLISP, it will run forever, because CLISP does tail call optimization when it compiles code.

Furthermore, if I load that file into SBCL it will run forever, because SBCL does tail call optimization by default.

ECL is another Common Lisp system where if you load that file into an interactive session it will die with stack overflow, but if you compile that file into an executable it will run forever.

So, suppose you wanted to translate the Scheme code into Common Lisp. You'd use a trampoline to make sure the stack doesn't overflow.

Here's trampoline.lisp, a trampoline in Common Lisp that runs through three functions and then stops, for simplicity:

;;; Demonstrate lisp-style trampolines.
(defun baz ()
  (format t "baz~%")
  nil)

(defun bar ()
  (format t "bar~%")
  #'baz)

(defun foo ()
  (format t "foo~%")
  #'bar)

(let ((f #'foo))
  (loop for i from 1 while (not (null f))
        do (setf f (funcall f))))

Here's trampoline_forever.lisp, a trampoline in Common Lisp that runs forever:

;;; Recurse forever without running out of stack space.
(defun baz ()
  (format t "baz~%")
  #'foo)

(defun bar ()
  (format t "bar~%")
  #'baz)

(defun foo ()
  (format t "foo~%")
  #'bar)

(let ((f #'foo))
  (loop for i from 1 while (not (null f))
        do (progn
             (format t "trampoline call #~s~%" i)
             (setf f (funcall f)))))

Of course, you can do the same things in C. First, here's not_forever.c, a program in C that will (usually) die with a stack overflow:

/* Recurse until stack space runs out.
   Unless the compiler does tail-call optimization. */
#include <stdio.h>

static int i;                 /* Number of times f has been called. */

void f (void)
{
  i++;
  printf ("call #%d\n", i);
  f ();
}

int main (int argc, char **argv)
{
  i = 0;
  f ();
}

I say usually, because tail call optimization is not required by the standard, and many C compilers do not do it. For instance, gcc doesn't do tail call optimization unless you specify -foptimize-sibling-calls or -O2, -O3, or -Os. If I don't specify any of those options, on my system that program dies with the error Segmentation fault: 11 after call #523932.

Here's trampoline.c, the limited trampoline in C:

/* Demonstrate lisp-style trampolines. */
#include <stdio.h>

typedef void* (*trampoline)(void);

void *
baz (void)
{
  printf ("baz\n");
  return NULL;
}

void *
bar (void)
{
  printf ("bar\n");
  return baz;
}

void *
foo (void)
{
  printf ("foo\n");
  return bar;
}


int
main (int argc, char **argv)
{
  trampoline t = foo;
  while (t)
    t = t ();
  return 0;
}

Notice this works by converting pointers to functions to pointers to void — it doesn't even require any explicit casting!

And here's trampoline_forever.c, the trampoline that runs forever:

/* Recurse forever without running out of stack spacc. */
#include <stdio.h>

typedef void* (*trampoline)(void);

void *foo (void);               /* Forward declaration. */

void *
baz (void)
{
  printf ("baz\n");
  return foo;
}

void *
bar (void)
{
  printf ("bar\n");
  return baz;
}

void *
foo (void)
{
  printf ("foo\n");
  return bar;
}


int
main (int argc, char **argv)
{
  trampoline t = foo;
  while (t)
    t = t ();
  return 0;
}

So, here's where C's weak typing lets it get away with things that more strongly typed languages don't. Notice the declaration of the trampoline type:

type def void* (*trampoline)(void);

Notice how it returns a void *, instead of something more specific? That's because if it tried to return something more specific, it would have to a recursive type: that is to say, while defining the type trampoline, you would use a reference to the type while defining the type. It would look something like this:

typedef trampoline* (*trampoline)(void);

and that results in gcc issuing the following error:

error: unknown type name 'trampoline'

Very few traditional programming languages allow this. It isn't a problem in Scheme or Common Lisp because those languages use strong dynamic typing, where the types are checked at runtime.

So how do you do this in languages with strong static typing?

Well, let's try this in some of the Oberon programming language dialects. Oberon was designed and implemented by Niklaus Wirth (NW1, NW2) as a simplification and generalization of his earlier languages Pascal, Modula, and Modula-2. (Here's The Programming Language Oberon (1990), the original Oberon language report, in PDF for reference.) I find the original Oberon admirable for its simplicity, strong typing, understandable syntax, and its introduction of Type Extensions (which organizes record types in a inheritance hierarchy, which with the use of procedure variables enables object oriented programming in a particularly straightforward and flexible way) but struggle with its minimalism and how its standard libraries differ in paradigm from the standard Unix libraries, since Oberon was used to implement a new operating system, the Oberon System with its own completely unique API.

Oberon has a number of dialects. I'm most fond of Oberon-2, which was the second language in the Oberon family, developed by Hanspeter Mössenböck and Niklaus Wirth. It is a little less minimalist than Oberon, and among a few other things adds type-bound procedures to the record hierarchy provided by Type Extensions, providing a appealingly simple and direct design for object-oriented programming that was later adopted by the Ada programming language in a more complicated and confusing manner, as might be expected by Ada's plethora of design goals and constraints. (Here's a couple of papers that mention it: Object-oriented programming through type extension in Ada 9X (ADAOO1PDF) and Integrating Object-Oriented Programming and Protected Objects in Ada 95 (ADAOO2PDF). I wish I knew of a reference that discussed explicitly the process of choosing Type Extensions for Ada and how they were adopted and adapted in Ada.)

Here's a copy of the Oberon-2 language report in PDF (O2PDF) and HTML (O2HTML), for reference.

Anyway, Oberon-2 has procedure types and procedure variables, so one would think it would be simple to implement trampolines in Oberon-2, without messing about with pointers. It turns out to be more complicated than one would think.

I'm using Vishap Oberon, a free and open source Oberon-2 compiler, by the way.

First, here's NotForever.Mod, the standard program with a recursive function procedure that will overflow the stack.

MODULE NotForever;              (* Recurse until stack space runs out. *)
  IMPORT Out;
  VAR i: LONGINT;               (* Number of times f has been called. *)

  PROCEDURE f;
  BEGIN
    INC(i);
    Out.String ("call #"); Out.Int (i, 0); Out.Ln;
    f;
  END f;

BEGIN
  i := 0;
  f;
END NotForever.

On my system, this program dies with with the error Segmentation fault: 11 after call #524008.

Now on to trampolines. In theory we should be able to declare a type that is a function procedure that returns other function procedures. Here's the first attempt at the limited trampoline, TrampolineBroken.Mod.

MODULE TrampolineBroken;              (* Fail to demonstrate lisp-style trampolines. *)
  IMPORT Out;

  TYPE Thunk = PROCEDURE (): Thunk; (* This is an error. *)

  VAR next: Thunk;

  PROCEDURE baz (): Thunk;
  BEGIN
    Out.String ("baz"); Out.Ln;
    next := NIL;
  END baz;

  PROCEDURE bar (): Thunk;
  BEGIN
    Out.String ("bar"); Out.Ln;
    next := baz;
  END bar;

  PROCEDURE foo (): Thunk;
  BEGIN
    Out.String ("foo"); Out.Ln;
    next := bar;
  END foo;

BEGIN
  next := foo;
  WHILE next # NIL DO
    next := next ();
  END;
END TrampolineBroken.

Unfortunately, trying to compile this dies with the following error message:

TrampolineBroken.Mod  Compiling TrampolineBroken.

   4:   TYPE Thunk = PROCEDURE (): Thunk;
                             ^
    pos   126  err 244  cyclic type definition not allowed

Module compilation failed.

As mentioned above, this is a case of a recursive type. Well, drat.

At this point the immediate reaction is to look at the C version and try to hack up something analogous using functionality from Oberon-2's SYSTEM module, but that way lies madness, difficulty, and type errors. Instead, you have to step back and think about things from another viewpoint. The problem is that we can't declare a type for a function procedure that returns another function procedure of its type, because that is recursive. Instead of trying for a recursive type, what if we switched to storing the next procedure to be run in a global variable, next, and having each procedure in the chain set that to the procedure that should run next? That should work!

Here's Trampoline.Mod, a version that works!

MODULE Trampoline;              (* Demonstrate lisp-style trampolines. *)
  IMPORT Out;

  TYPE Thunk = PROCEDURE ();

  VAR next: Thunk;              (* Next procedure to be called. *)

  PROCEDURE baz ();
  BEGIN
    Out.String ("baz"); Out.Ln;
    next := NIL;
  END baz;

  PROCEDURE bar ();
  BEGIN
    Out.String ("bar"); Out.Ln;
    next := baz;
  END bar;

  PROCEDURE foo ();
  BEGIN
    Out.String ("foo"); Out.Ln;
    next := bar;
  END foo;

BEGIN
  next := foo;
  WHILE next # NIL DO
    next ();
  END;
END Trampoline.

And here's TrampolineForever.Mod, which also works!

MODULE TrampolineForever;
  (* Recurse forever without running out of stack space.  *)
  IMPORT Out;

  TYPE Thunk = PROCEDURE ();

  VAR
    next: Thunk;         (* Next procedure to be called. *)
    i: INTEGER;          (* Number of times through the trampoline. *)

  PROCEDURE ^foo;        (* Forward declaration. *)

  PROCEDURE baz ();
  BEGIN
    Out.String ("baz"); Out.Ln;
    next := foo;
  END baz;

  PROCEDURE bar ();
  BEGIN
    Out.String ("bar"); Out.Ln;
    next := baz;
  END bar;

  PROCEDURE foo ();
  BEGIN
    Out.String ("foo"); Out.Ln;
    next := bar;
  END foo;

BEGIN
  i := 0;
  next := foo;
  WHILE next # NIL DO
    INC (i);
    Out.String ("call #"); Out.Int (i, 0); Out.Ln;
    next ();
  END;
END TrampolineForever.

Wirth has continued to work on Oberon, producing an even more minimalist revision, often know as Oberon-07, or Revised Oberon. (Here's the The Programming Language Oberon-07 (Revised Oberon) in PDF, for reference.) Unfortunately, he removed forward declarations and the LONGINT type, which means we have to make some minor changes.

I'm using OBNC (OBNC1, OBNC2) for Revised Oberon.

Here's the Revised Oberon NotForever.Mod, with LONGINT replaced by INTEGER:

MODULE NotForever;           (* Recurse until stack space runs out. *)
  IMPORT Out;
  VAR i: INTEGER;            (* Number of times f has been called. *)
    (* Alas, no more LONGINT. *)

  PROCEDURE f;
  BEGIN
    INC(i);
    Out.String ("call #"); Out.Int (i, 0); Out.Ln;
    f;
  END f;

BEGIN
  i := 0;
  f;
END NotForever.

Here's the Revised Oberon Trampoline.Mod:

MODULE Trampoline;              (* Demonstrate lisp-style trampolines. *)
  IMPORT Out;

  TYPE Thunk = PROCEDURE ();

  VAR next: Thunk;              (* Next procedure to be called. *)

  PROCEDURE baz ();
  BEGIN
    Out.String ("baz"); Out.Ln;
    next := NIL;
  END baz;

  PROCEDURE bar ();
  BEGIN
    Out.String ("bar"); Out.Ln;
    next := baz;
  END bar;

  PROCEDURE foo ();
  BEGIN
    Out.String ("foo"); Out.Ln;
    next := bar;
  END foo;

BEGIN
  next := foo;
  WHILE next # NIL DO
    next ();
  END;
END Trampoline.

Here's the Revised Oberon TrampolineForever.Mod, with a workaround for the removal of forward declarations of procedures:

MODULE TrampolineForever;
  (* Recurse forever without running out of stack space.  *)
  IMPORT Out;

  TYPE Thunk = PROCEDURE ();

  VAR
    forward: Thunk;      (* Forward declaration. *)
    next: Thunk;         (* Next procedure to be called. *)
    i: INTEGER;          (* Number of times through the trampoline. *)

  PROCEDURE baz ();
  BEGIN
    Out.String ("baz"); Out.Ln;
    next := forward;
  END baz;

  PROCEDURE bar ();
  BEGIN
    Out.String ("bar"); Out.Ln;
    next := baz;
  END bar;

  PROCEDURE foo ();
  BEGIN
    Out.String ("foo"); Out.Ln;
    next := bar;
  END foo;

BEGIN
  forward := foo;
  i := 0;
  next := foo;
  WHILE next # NIL DO
    INC (i);
    Out.String ("call #"); Out.Int (i, 0); Out.Ln;
    next ();
  END;
END TrampolineForever.

Note that with forward declarations removed, we just declare a procedure variable, forward, initialize it before starting the trampoline, and refer to it instead of foo in procedure baz.

And of course, since we mentioned Ada above, we should do a version in that. I'm using GNAT.

Here's not_forever.adb:

with Ada.Text_IO; use Ada.Text_IO;
procedure not_forever is        -- recurse until stack space runs out.
  type Unsigned is mod 2**64;   -- wrap to 0 when maximum value is execeeded.
  i: Unsigned := 0;             -- Number of times f has been called.

  procedure f is
  begin
    i := i + 1;
    f;
  end f;

begin
  f;
exception
  when STORAGE_ERROR =>
  Put ("STORAGE_ERROR raised with i = "); Put (i'Image); New_Line;
end not_forever;

Since Ada has exceptions, we actually catch the exception that happens when the stack runs out of space:

STORAGE_ERROR raised with i =  262002

Again, Ada would have the same problem with recursive types as the Oberon dialects. Don't look at the C version and wander off into forest of Ada.Unchecked_Conversion because that's unsafe, or the thicket of System.Address_To_Access_Conversions, because that one's also unsafe and more complicated (and the simple approach didn't work, when I tried it). Instead, do the same thing as we did in the Oberon dialects, and move to a global variable instead of returning the values from the functions.

Here's trampoline.adb:

with Ada.Text_IO; use Ada.Text_IO;
procedure trampoline is         -- Demonstrate lisp-style trampolines.
    type Thunk is access procedure;

    Next: Thunk := null;        -- Next procedure to be called.

    procedure baz is
    begin
        Put_Line ("baz");
        Next := null;
    end baz;

    procedure bar is
    begin
        Put_Line ("bar");
        Next := baz'Access;
    end bar;

    procedure foo is
    begin
        Put_Line ("foo");
        Next := bar'Access;
    end foo;

begin
    next := foo'Access;
    while Next /= null loop
        Next.all;
    end loop;
end trampoline;

And here's trampoline_forever.adb:

with Ada.Text_IO; use Ada.Text_IO;
procedure trampoline_forever is
  -- Recurse forever without running out of stack space.
  type Unsigned is mod 2**64;   -- wrap to 0 when maximum value is execeeded.
  i: Unsigned := 0;             -- Number of times through the trampoline.

  type Thunk is access procedure;

  Next: Thunk := null;          -- Next procedure to be called.

  procedure foo;        -- forward declaration.

  procedure baz is
  begin
    Put_Line ("baz");
    Next := foo'access;
  end baz;

  procedure bar is
  begin
    Put_Line ("bar");
    Next := baz'Access;
  end bar;

  procedure foo is
  begin
    Put_Line ("foo");
    Next := bar'Access;
  end foo;

begin
  next := foo'Access;
  while Next /= null loop
    i := i + 1;
    Put ("call #"); Put (i'Image); New_Line;
    Next.all;
  end loop;
exception
  when STORAGE_ERROR =>
  Put ("STORAGE_ERROR raised with i = "); Put (i'Image); New_Line;
end trampoline_forever;

Of course, languages with more sophisticated type systems have other ways of dealing with things, but I haven't investigated them. I did stumble across an example in OCaml (OCAML1, OCAML2).


If you want to play around with this, the code is in a repository at Github.

Print Friendly and PDF

Comments

Comments powered by Disqus