Programming a Problem-Oriented Language

Note: The colorforth.com website referred to in this article has since gone away, but for now it can still be found via the Internet Archive Wayback Machine. I think that the "Programming a Problem-Oriented Language" document is so important that I have also hosted a copy on this site.

Several posts ago, during a discussion of why I am interested in bare-metal operating systems development, I linked to Charles Moore's biography on his ColorForth.com web site. At the time I was interested in looking at a variety of stack-based languages for inspiration, including ColorForth. Sure enough, ColorForth had some interesting ideas, but what stopped me short was when I followed the link to his draft book Programming a Problem-Oriented Language.

This is a fascinating document. Chuck wrote it in 1970, when FORTH was new and software was a new country to explore. As a draft for a book that was never published, it is a bit rough around the edges. Chuck has a straightforward, even dogmatic, style and there are certainly some aspects of the book which are inadvertently amusing when taken in today's context:

you probably aren't in a position to pick a language. Your installation probably has reduced your choice to nil.

All of that fades into insignificance compared to the rich vein of brilliant ideas buried just below the surface. I have read the document through from start to finish three times over the last couple of weeks, and have got something valuable from it every time. What particularly fascinates me is how much it predicts and pre-dates so many of today's trendy issues in software development. Here are just a few examples:

Charles Moore on "do the simplest thing", "YAGNI", "Minimum Viable Product" etc.

So to offer guidance when the trade-offs become obscure, I am going to define the Basic Principle:
Keep it Simple
As the number of capabilities you add to a program increases, the complexity of the program increases exponentially. The problem of maintaining compatibility among these capabilities, to say nothing of some sort of internal consistency in the program, can easily get out of hand. You can avoid this if you apply the Basic Principle. You may be acquainted with an operating system that ignored the Basic Principle.
It is very hard to apply. All the pressures, internal and external, conspire to add features to your program. After all, it only takes a half-dozen instructions; so why not? The only opposing pressure is the Basic Principle, and if you ignore it, there is no opposing pressure.
...
The Basic Principle has a corollary:
Do Not Speculate!
Do not put code in your program that might be used. Do not leave hooks on which you can hang extensions. The things you might want to do are infinite; that means that each one has 0 probability of realization. If you need an extension later, you can code it later - and probably do a better job than if you did it now. And if someone else adds the extension, will they notice the hooks you left? Will you document that aspect of your program?

Charles Moore on "craftsmanship", "mastery", "polyglot", "10000 hours", "practice", "small methods" etc.

The Basic Principle has another corollary:
Do It Yourself!
Now we get down to the nitty-gritty. This is our first clash with the establishment. The conventional approach, enforced to a greater or lesser extent, is that you shall use a standard subroutine. I say that you should write your own subroutines. Before you can write your own subroutine, you have to know how. This means, to be practical, that you have written it before; which makes it difficult to get started. But give it a try. After writing the same subroutine a dozen times on as many computers and languages, you'll be pretty good at it. If you don't plan to be programming that long, you won't be interested in this book.
What sort of subroutines do you write for yourself? I have acquired respect for SQRT subroutines. They're tricky things; seem to attract a lot of talent. You can use the library routine to good advantage. Input subroutines now. They seem to have crawled out from under a rock. I somehow can't agree that the last word was said 15 years ago when FORMAT statements were invented.
As I will detail later, the input routine is the most important code in your program. After all, no one sees your program; but everyone sees your input. To abdicate to a system subroutine that hasn't the slightest interest in your particular problem is foolish. The same can be said for output subroutine and disk-access subroutine.
Moreover, the task is not that great as to deter you. Although it takes hundreds of instructions to write a general purpose subroutine, you can do what you need with tens of instructions. In fact, I would advise against writing a subroutine longer than a hundred instructions.
So if you want to read double-precision, complex integers; don't rely on the COBOL input subroutine, or wait till the manufacturer revises it. It's a lot easier to write your own.
But suppose everyone wrote their own subroutines? Isn't that a step backward; away from the millennium when our programs are machine independent, when we all write in the same language, maybe even on the same computer? Let me take a stand: I can't solve the problems of the world. With luck, I can write a good program.

Charles Moore on "scripting", "hard and soft layers", "interactivity" etc.

I'm going to tell you how to write a program. It is a specific program; that is, a program with a specific structure and capabilities. In particular, it is a program that can be expanded from simple to complex along a well defined path, to handle a wide range of problems, likewise varying from simple to complex. One of the problems it considers is exactly the problem of complexity. How can you control your program so that it doesn't grow more complicated than your application warrants? First I'll define "input", and mention some general rules of programming that apply to all programs, whether they have input or not. Actually we will be almost exclusively concerned with input, so I've not much to say about programs lacking input.
By admitting input, a program acquires a control language by which a user can guide the program through a maze of possibilities. Naturally, this increases the flexibility of the program, it also requires a more complex application to justify it. However it is possible to achieve a considerable simplification of the program, by recognising that it needs a control language as a tool of implementation.
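
To make the idea of input as a control language a little more concrete, here is a minimal sketch in Python. It is my own illustration, not code from the book: the function name `run`, the word names and the use of a Python dictionary are all assumptions. Each word in the input either names a routine to run or is a number to push on a stack, so the input steers the program.

```python
# Hypothetical sketch, not Moore's code: input words drive the program.
def run(source):
    stack = []

    # The fixed vocabulary of this tiny control language (names are assumptions).
    words = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
    }

    for token in source.split():
        if token in words:
            words[token]()            # a known word: execute its routine
        else:
            stack.append(int(token))  # otherwise treat the token as a number
    return stack


print(run("2 3 + dup *"))  # [25]
```

The whole "parser" is a split on whitespace and a dictionary lookup, which is roughly the scale of input routine the quote seems to have in mind.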

Charles Moore on Domain-Specific Languages

The next step is a problem-oriented-language. By permitting the program to dynamically modify its control language, we mark a qualitative change in capability. We also change our attention from the program to the language it implements. This is an important, and dangerous, diversion. For it's easy to lose sight of the problem amidst the beauty of the solution.
In a sense, our program has evolved into a meta-language, which describes a language we apply to the application. But having mentioned meta-language, I want to explain why I won't use the term again. You see things get pretty complicated, particularly on a philosophic level. To precisely describe our situation requires not 2 levels of language - language and meta-language - but at least 4 levels. To distinguish between these levels requires subtle arguments that promote not clarity but confusion. Moreover, the various levels can often be interchanged in practice, which reduces the philosophic arguments to hair-splitting.
A problem-oriented-language can express any problem I've encountered. And remember, we're not concerned with the language, but with the program that makes the language work. By modifying the language we can apply the same program to many applications. However there are a class of extensions to the language that constitute another qualitative change. They don't increase the capacity of the program, but they increase the capability of the language. That is, they make the language more expressive. We will consider some such extensions in Chapter 8. I gathered them together chiefly because they share the common property that I don't quite comprehend their potential.
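
The step from a fixed control language to one the program can modify at run time can be sketched in a few lines of Python. Again this is only my illustration under stated assumptions: the `Interpreter` class and its method names are hypothetical, and the `: name ... ;` definition syntax is borrowed from FORTH convention rather than taken from the quoted passage.

```python
# Hypothetical sketch: the input itself can add new words to the dictionary.
class Interpreter:
    def __init__(self):
        self.stack = []
        # Built-in words; everything else is defined from the input.
        self.words = {
            "+": lambda: self.stack.append(self.stack.pop() + self.stack.pop()),
            "*": lambda: self.stack.append(self.stack.pop() * self.stack.pop()),
            "dup": lambda: self.stack.append(self.stack[-1]),
        }

    def run(self, source):
        tokens = iter(source.split())
        for token in tokens:
            if token == ":":
                # FORTH-style definition: ": name body... ;"
                name = next(tokens)
                body = []
                for word in tokens:   # consume the body up to the closing ";"
                    if word == ";":
                        break
                    body.append(word)
                self.words[name] = self._define(" ".join(body))
            elif token in self.words:
                self.words[token]()
            else:
                self.stack.append(int(token))
        return self.stack

    def _define(self, text):
        # A new word simply re-interprets its stored source when executed.
        return lambda: self.run(text)


interp = Interpreter()
print(interp.run(": square dup * ; 7 square"))  # [49]
```

Once `square` exists it is indistinguishable from a built-in word, so later definitions can build on it, which is one small reading of "the same program applied to many applications". A real system would compile definitions rather than re-parse text, and would report a missing `;` instead of silently reading to the end of the input.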

As well as the philosophy, the document also goes into great depth explaining not just how but why to build an extremely compact and efficient language in the FORTH style. In a few more paragraphs he also adds input and output, storage, virtual memory, multi-user and multi-tasking features which allow his tiny little language to also be its own operating system. If you are at all interested in language and system programming this is invaluable reading. It strips away all the mystique and scariness.

To cap it all, Chuck ends by talking through, step by step, how to bootstrap a complete system, from scratch. Really from scratch. Toggle in just enough machine instructions to read some input from a keyboard and run it, then build output, disk storage and so on in the language you have just created. Computers may not usually have lights and switches on the front panel to enter data directly into memory any more, but that's where he starts, and everything else is built up from there. The more I read it the more I appreciate the elegant ultra-simplicity, and the more I want to make this kind of system to run on the Raspberry Pi.

