As part of the popshop project I designed and implemented a concatenative (FORTH-like) language I called Popcode. It has lists (of commands), which among other things makes loops possible. The first concatenative language was Forth and nowadays the best known is probably Joy. They don’t have looping constructs, only recursion.
Basic principles
Concatenative languages are based on (normally) postfix notation rather than infix notation. Thus to add 2 and 3 and write the result the code is
2 3 + write
rather than
write(2+3)
which is typical of hierarchical infix languages (like Python, Java, C etc).
Concatenative languages work on an invisible value stack, where computation takes place. The program above pushes 2 then 3 on the value stack, removes them and pushes their sum, then pops it and writes it.
Advantages
The main advantage of concatenative languages is they are much easier to implement – no parsing, and neither compiling nor recursive hierarchical tree evaluation. You implement the stack operations individually, and Bob’s your uncle.
Larger concatenative programs are difficult for humans to write but not for computers to generate. Popcode, for example was designed to be an intermediate notation between the Lucid compiler and the Lucid interpreter. Also, Popcode had a purely iterative implementation (no recursion) and this greatly simplified storage management since a stack is easy to mark.
Popcode, like pyLucid, uses the data types (numbers, strings, words, signs, and lists) of Pop-2. Pop-2 itself is dead but it’s successor, POP-11 is alive and well on Linux.
Loops
But how do you write a loop? Not something like
while dup 0 > 1 swap – end
because while and end don’t work as commands (what would they do?). Forth and Joy use recursion instead, which isn’t always easy..
Here’s how Popcode uses command lists to implement looping.
First there are two registers, reg and block (both hold lists). The first contains the commands waiting to be executed, and the second is the original value of reg, a sort of backup in case we need another iteration. There is also a control stack used to store programs (command lists) that are temporarily suspended
The simplest loops use the commands loop and break. The loop command takes us back to the beginning of block for another iteration, while break immediately exits it.
Here’s how they work. When a loop is executed, the value of block is (re)assigned to reg. When a break is executed, the suspended values of block and reg are popped from the control stack and (re)stored in their registers.
That’s all there is to it. No, wait – how do we start loops in the first place? The simplest way is with the do command, which causes a program to be executed. Lists are normally treated as such – they can be (eg) appended and written out, and when the Popmachine encounters one while working through reg, it is pushed onto the value stack.
However, when a do command is invoked, the current block and reg lists are pushed onto the control stack. The top of the value stack (which should be a list) is popped and assigned to both block and reg. Execution is resumed.
The code
1 [dup write 1 + ‘, ‘ write loop] do
is a counter which will output 1, 2, 3, … indefinitely
Conditional commands
Unfortunately break and loop are almost useless because they act unconditionally. For everyday use we need their conditional versions, breakc and loopc. Both expect a truth value on top of the value stack. They consume it and act only if it is the value true, otherwise no further action is taken.
Here is code that sums the squares of the numbers from 1 to 100.
0 1 [dup 2 npush dup * + 1 unpush dup > 100 breakc 1 + loop] do pop
Here the code 2 npush pushes the top of the stack two levels deep while 1 unpush brings to the top the value one level deep.
Combinators
There are other operations that, like do, can cause argument lists to be run. (Joy calls these operations “combinators”).
One useful combinator is if. It expects three items on the stack, from lower to higher a truth value, a program list corresponding to the value true, and above it a program list corresponding to the value false. The operation if consumes the truth value and executes the corresponding list, discarding (without executing) the other list. Thus
[dup 0 > [] [0 swap -] if]
computes the absolute value of the top of the stack.
Repeat and for
Popcode also has versions of looping constructs found in conventional languages, specifically repeat and for.
The repeat command allows you to repeat a program a specified number of times. For example
[ ‘Happy Birthday’ writeln] 3 repeat
will output three Happy Birthdays.
The command is implemented using the control stack. When a repeat command is encountered, the machine first examines the top of the stack (the count). If it’s 0 the count and the program under it are popped and no other action is taken.
Otherwise the rest of the current program is suspended (pushed on the control stack) and the count is popped off the value stack and decremented. The program now on the top of the stack is popped and becomes the current program.
Meanwhile the program and the decremented count, for example
[[ ‘Happy Birthday’ writeln] 2 repeat]
is pushed on the control stack and execution proceeds.
The command for takes instead of a count a list of values to be run through. On each iteration, one of these values is placed on top of the value stack. Thus for example
[ ‘Happy Birthday, ‘ write writeln] [Tom Dick Harry] for
will output three personalized Happy Birthday greetings.The command for is implemented by a similar copy-and-push scheme.
Defining commands
The ability to invoke subcomputations allows us to expand the instruction set by defining commands in terms of program lists. Suppose we want the word abs to act like a built in absolute value command. We define it as above with the command def:
[dup 0 > [] [0 swap -] if] ” abs def
Notice that
” abs
pushes the word abs on the stack without trying to execute it as a command.
When the machine encounters abs as a command, is suspends the current computation, fetches the program associated with it, and begins its execution.
Commands can be defined recursively, i.e. the defining program can invoke the command being defined. Here is a definition of a command that computes the factorial of the number on the top of the stack:
[dup 2 < [1][dup 1 – fac *] if] ” fac def
Variables
Part of the fun of concatenative languages is that extending them is very simple. No need to modify a parser extend a recursive evaluator, or hack a code generator.
One simple extension allows words to be used as variables/storage locations. The extension implements two commands, fetch and store which allow us to associate words with assignable values (there is no necessary connection between a word’s meaning as a command and the assignable value associated with it).
The command fetch expects a word on the top of the stack, consumes it, and pushes the value associated with it. The command store expects a word on the top of the stack, consumes it and the value underneath it, and associates this value with the word.
Here is the definition of a command inc that increments a stored value by 1:
[dup fetch 1 + swap store] ” inc def
Here is code for the sum of the first 100 squares using fetch and store.
0 ” sumsq store 1 [dup >100 breakc dup dup * ” sumsq fetch +
” sumsq store 1 + loop] pop ” sumsq fetch
Iterators
The possibilities for extending Popcode are endless, an the implementations are usually simple – a few lines of Python added to the interpreter.
One possibility (I’ll skip the details) is iterators, which when called repeatedly leave the terms of a sequence on the top of the stack. For example, odd could when called leave the next odd number on the stack. If we wanted finite sequences, the iterator could leave the word more on top of the value of there is more, or done if there isn’t any more. It’s not hard to write a loop construct foreach which expects a block of code and an iterator and loops through the value.
It seems similar techniques and conventions could make a form of OO available but I haven’t worked out the details.
Extensions
I worked out a general scheme for extending the Pop machine using the module import facility of Sloth, the C configuration tool. Each extension is packaged in a module. The procedure files implement the commands, one procedure for each command. The body (startup code) has calls to a procedure Enterop(w,f) which enters function pointer p as the semantics of command w (w a word or sign). (Actually it’s a bit trickier but never mind). Enterop is defined in the basic machine module.
The result is that extending the Pop machine is as easy as including the names of desired extension modules in the import list.
It was just such an extended machine that evaluated compiled Lucid programs. The extensions operated the time ‘register’, a stack for storing suspended time coordinates, searching and fetching from the warehouse, and so on.
The Lucid compiler turned Lucid source code into Popcode for this extended machine.
Epilogue
In the end all this ingenuity came to nought. When I first implemented the pyLucid interpreter I didn’t use the pop machine because Python has its own garbage collector. Instead I used a straight forward recursive demand driven evaluation scheme. It worked well but the recursions were shockingly deep – the Python stack would grow thousands of levels deep for relatively simple Lucid programs.
As a result I decided the Pop machine was worth a try after all. It’s implementation is purely iterative, no recursion. (Though I found a hidden recursion in my legacy C code, that I had to work hard to remove.)
So I implemented the Pop machine in Python and gave it a try. Sure enough, the Python stack didn’t overflow although the Pop machine control stack got rather big. But unfortunately the Pop machine turned out to be pretty slow. Also, the Popcode version of the interpreter was also significantly more complex so I reluctantly abandoned it. Sigh!
So in the end the only thing that came out of the Pop machine was this blog post. Maybe it could be of use to some of you. The Python implementation of the Pop machine is mothballed but you never know.
readagain loopc break
The block style for providing command list arguments to functions reminds me a lot of the concatenative language Factor. (i.e. factorcode.org)
I did some playing around with constructs to get a bare bones concatenative language, which I called FURPHY (you can google for it if you want). In that, looping fell out very naturally: just end a word definition with RECURSE and break it with a conditional return ?R somewhere. This does rely on tail call optimisation making the definition into a loop, or it will get impractical. I also came up with an even more slimmed down variant that didn’t use immediate words at all, basically an over-simplified Forth that needs kludges to do a lot of things; in that, all new definitions form infinite loops and need explicit return code. In the latter variant, absolutely any finite source is acceptable, storage space permitting!