The (tiny) Placo Programming Language
A tiny, basic programming language is something every software developer should build at least once. Here's mine, for now at least.
It's certainly not a fast language, with a tree-walking interpreter implemented in a dynamically-typed language. It's not terribly ergonomic, either. It's only data types are numbers and functions. The functions are, at least, proper lexical closures. I made sure of that.
Mostly, Placo is a demonstration of a minimal foundation for a familiar-looking language. It has all the classic steps of an interpreter: tokenizer, parser, semantics. The tokenizer and parser even operate on nice stream abastractions (specifically, a generator for the tokens) instead of materializing a big input string or token list.
Without further ado, a code sample:
def fac-impl(i acc k)
if i = 1 then
acc
else
k(i - 1 acc * i k)
end
end
def factorial(i)
fac-impl(i 1 fac-impl)
end
def main()
print(factorial(5))
end
There are a couple standout features.
Perhaps most obviously, no commas between arguments of a function.
Commas aren't necessary for disambiguating expression boundaries!
If this was in any way a serious language, I would require commas for ergonomic reasons;
it's a little less obvious to the eye than the parser
what's going on with k(i - 1 acc * i k).
The other odd feature is that fac-impl isn't quite directly recursive.
The lexical scope at the time a function is defined doesn't include the function itself,
and that scope is all that's available inside the function.
Functional programming languages, of which Placo is almost one, tend to have seperate
let and let rec, syntax, where the latter avoids this problem.
I didn't bother here.
I might someday, but it's honestly kind of fun to do the little continuation-passing
transform manually.
One of the little challenges I set myself for this project was to implement both
the evaluator and a pretty printer using the same tree-walking/induction framework.
I wanted to supply an AST and a big struct full of callbacks
to a general-purpose visit function.
Each AST node callback should receive as arguments the results of visiting its children.
This turned out to be a bit tricky, since I'm casually letting `print` be a side effect.
The node callbacks for both the evaluator and printer return (and therefore accept as arguments)
functions that take some kind of context
and perform side effects before returning their final result, if any.
I don't know if it's quite correct, but I've taken to calling these functions "thunks".
A "thunk" in functional programming usage tends not to take any arguments at all,
but I think Placo's "thunks" are at least in the same spirit.
A nullary function still runs in a context, and that's all I'm passing in.
For the evaluator, the thunks returned from the callbacks take a good old-fashioned environment that maps names to fully evaluated values (functions or numbers), perform side effects, and return a function or number. The interesting case is evaluating closures. When the definition of a function is evaluated, the environment at that definition is passed to the definition thunk and squirrelled away in a Scheme procedure that is returned from the thunk. That procedure, directly corresponding to a function in the Placo source, is probably stored in an environment, and is later passed to the evaluator for a function call. The call evaluates the called expression using the call-site environment, most likely looking up the previously created procedure. It also evaluates its argument thunks in the same call-site environment. The argument values are passed straight to the aforementioned procedure representing the Placo function. That procedure loads the arguments into a new environment that maps them to the argument names and defaults to the environment captured at the definition. This combined environment is the one (finally) passed to the thunk for the function's body expression, thus providing it both the arguments of the call and the lexical context.
; excerpt of eval.rkt
; these two functions are the interesting part
; of the evaluator/visitor system
(define (eval-visit-fundef arg-names body-thunk)
; n.b. arg-names is not a thunk, just a syntactic list of strings.
; just need to capture lexical context, i.e. env-at-def
; we end up storing this in a regular scheme procedure. lol.
(lambda (env-at-def)
(lambda (arg-values)
(let ([actual-env (bind-env-names arg-names
arg-values
env-at-def)])
(body-thunk actual-env)))))
(define (eval-visit-funcall fun-thunk arg-thunks)
(lambda (env)
; the fun to be called is an expression
; that needs to be evaluated in the current call-site env
; arg-values are the results of evaluating
; the argument thunks in the same call-site env
(let ((fun (fun-thunk env))
(arg-values (map (lambda (value-thunk)
(value-thunk env))
arg-thunks)))
(if (procedure? fun)
(fun arg-values)
(error (format "trying to call non-function ~a" fun))))))
For the printer, it's a little simpler. The visitor-results take a nesting depth and output port (Scheme/Racket's abstraction of a character stream) and simply writes the representation of the given AST node to the port. The visitor-result functions of nested syntactic objects, for instance the body of an if statement, are invoked by the visitor-result of the parent with the same output port but incremented depth.
; excerpt of visit.rkt
; the least boring of the print functions.
; the output is... not especially pretty.
; "pretty-printer" is a figure of speech, ok?
(define (print-ifexpr cond-thunk true-thunk else-thunk-or-false)
(lambda (out depth)
; (indent n) just returns a string consisting of
; an appropriate number of 2*n spaces
(display (format "~aif(" (indent depth)) out)
(cond-thunk out depth)
(display ") then\n" out)
(true-thunk out (incr depth))
(if (not (false? else-thunk-or-false))
(begin
(display (format "~aelse\n" (indent depth)) out)
(else-thunk-or-false out (incr depth)))
(void))
(display (format "~aend\n" (indent depth)) out)))
This was a good project to show I can put together the basics of a
programming language without too much trouble.
I have a few ideas for other features to add.
Besides creature-comforts like strings, I might use it to learn how to implement
control operators like call-with-current-continuation.
But for now I'm letting it stand as a good exercise and an
example of the good clean fun you can have while programming.
(Look, you gotta ship eventually.)