More FORTH-like inspirations

As I slowly build my bare-metal operating system and language for the Raspberry Pi, I like to keep looking around for more sources of inspiration. I found a list of stack-based languages somewhere which (among many others) pointed me at Raven, an interesting combination of FORTH, Python and Perl. A bit more digging on the author's site also turned up three other FORTH-like stack-based languages. "enth" seems to be a fairly standard FORTH, "flux" is an interpretation of the ideas in Charles Moore's ColorFORTH, and "reforth" is an attempt to harmonise the best parts of the other three.

Of the four languages, Raven appears the most developed in a practical sense: it has a wide range of capabilities and would be useful for a broad spectrum of tasks. The one which intrigues me most, though, is "flux". I had read about ColorFORTH on the official web site, and even tried to download and run it years ago, but have always been pretty much baffled. There's not really enough information either on the site or in the code to explain what it is all about. Even with a reasonable understanding of FORTH and other problem-oriented languages (POLs), ColorFORTH is a bit of an oddity with its use of colour as syntax.

Although the author (presumably Sean Pringle) takes some pains to explain that flux is not a clone of ColorFORTH, the detail in the description of colours, colour transitions, and their meaning compared with the FORTH state model gave me a much clearer understanding of the ideas behind ColorFORTH. Now I feel that I know at least enough about such things to begin to have opinions...

For me the use of colours as syntax tokens is a bit of a red herring. The main advantages claimed seem to hinge on (a) visibility of the different word states, and (b) conciseness by burying the state transitions in the spaces between words. The benefit of (a) is largely moot these days. Most text editors used for software development are easily configured to highlight and colour the code based on arbitrary rules. While the conciseness of (b) is a good thing, there are potentially other forms of syntax which could give almost as much benefit, by using visible tokens such as prefix characters.

My main concern with ColorFORTH is not actually about the use of colour, but the way that it moves away from the almost-syntaxless nature of FORTH toward a language with several syntax elements (the colours) which affect, and thus pin down, the behaviour of the other words. ColorFORTH is considerably less useful as a pure POL than FORTH. The use of the colours and their corresponding invisible tokens is "part of the language" rather than part of the problem domain.

What I hope to learn from this for my own language projects is that syntax need not be directly visible and, conversely, that just because syntax is invisible does not mean that it is free of constraints.

My current ELIUS syntax has three (or four?) syntax elements and a very simple state model. Incoming characters are either whitespace, [, ], or anything else. [ and ] act as text delimiters: [ starts a text block, and ] ends it. Inside text blocks, characters are just text; outside text blocks, they are split on whitespace into "words". While this is simple, it is also still constraining. The use of [ and ] for quoting is hardly common parlance. For full flexibility the language probably needs a way to dynamically adjust the characters, character sequences or other indicators (colour?) used for these syntactical tokens.
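To make that state model concrete, here is a minimal sketch in Python. It is not the actual ELIUS parser: the function name tokenize and the ("word", ...) / ("text", ...) token shapes are invented for illustration, and it assumes that text blocks do not nest and that a stray ] outside a text block is treated as an ordinary word character.

```python
def tokenize(source):
    """Split source into ("word", s) and ("text", s) tokens.

    Assumptions (not taken from ELIUS itself): text blocks do not nest,
    and a "]" seen outside a text block is just part of a word.
    """
    tokens = []
    word = []    # characters of the word currently being collected
    text = None  # characters of the open text block, or None when outside one

    def flush_word():
        # Emit the pending word, if any, and start a fresh one.
        if word:
            tokens.append(("word", "".join(word)))
            word.clear()

    for ch in source:
        if text is not None:
            # Inside a text block: everything is text until the closing "]".
            if ch == "]":
                tokens.append(("text", "".join(text)))
                text = None
            else:
                text.append(ch)
        elif ch == "[":
            flush_word()
            text = []
        elif ch.isspace():
            flush_word()
        else:
            word.append(ch)

    if text is not None:
        raise ValueError("unterminated text block")
    flush_word()
    return tokens
```

Note that whitespace inside a text block survives untouched, so a block such as [Good ] keeps its trailing space, while whitespace outside only ever separates words.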

As a thought experiment, consider some simple ELIUS (pseudo) code to print a time-appropriate greeting:

[Morning] [Afternoon] time 12 < ifelse [Good ] print print

Choosing a simplified colour syntax, where black and white are words and spaces, and all other colours represent text (allowing different colours to be used to differentiate strings), might give something like:

Morning Afternoon time 12 < ifelse Good print print

I'm not proposing to do this, though. Even in this simple example, details such as the trailing space in [Good ] become harder to see without the delimiters, and the complexity of typing, importing, and exporting such code outweighs for me any benefits of using colour.

Supporting run-time changes to the minimal syntax might still be a good idea. Imagine adding punctuation such as , and . as individual tokens which don't need to be surrounded by spaces to be recognized, and thus can occupy their usual places adjacent to words and numbers. Or imagine adding single and double quotes as non-nestable text delimiters, or an "escape" character such as \.
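One way to picture this is a tokenizer driven by a small, mutable table of syntax rules instead of hard-coded characters. The following is only a sketch under stated assumptions: the Syntax class, its field names, the tokenize_with function and the ("punct", ...) token shape are all hypothetical, and the table is only consulted as each character is read, so changing it between calls is the only kind of "run-time" adjustment shown.

```python
from dataclasses import dataclass, field


@dataclass
class Syntax:
    """Hypothetical, mutable description of the syntax characters."""
    delimiters: dict = field(default_factory=lambda: {"[": "]"})  # opener -> closer
    punctuation: set = field(default_factory=set)  # single-character tokens
    escape: str = ""  # empty string means "no escape character"


def tokenize_with(source, syntax):
    """Tokenize source according to whatever rules syntax currently holds."""
    tokens = []
    word = []       # characters of the word being collected
    text = None     # characters of the open text block, or None when outside one
    closer = None   # character that will end the open text block
    escaped = False

    def flush_word():
        if word:
            tokens.append(("word", "".join(word)))
            word.clear()

    for ch in source:
        if escaped:
            # An escaped character is taken literally, wherever we are.
            (word if text is None else text).append(ch)
            escaped = False
        elif syntax.escape and ch == syntax.escape:
            escaped = True
        elif text is not None:
            # Non-nestable: only the matching closer ends the block.
            if ch == closer:
                tokens.append(("text", "".join(text)))
                text = None
            else:
                text.append(ch)
        elif ch in syntax.delimiters:
            flush_word()
            text, closer = [], syntax.delimiters[ch]
        elif ch in syntax.punctuation:
            # Punctuation is a token by itself, with no spaces needed.
            flush_word()
            tokens.append(("punct", ch))
        elif ch.isspace():
            flush_word()
        else:
            word.append(ch)

    if text is not None:
        raise ValueError("unterminated text block")
    flush_word()
    return tokens  # a dangling escape at the very end is silently ignored


# Example: add quotes, two punctuation characters and an escape character.
syntax = Syntax(
    delimiters={"[": "]", '"': '"', "'": "'"},
    punctuation={",", "."},
    escape="\\",
)
tokens = tokenize_with('say "hi, there", ok.', syntax)
```

The cost of the flexibility is a few table lookups per character. The sketch also exposes a design problem that would need thought: once . is a punctuation token, a number such as 3.14 is split into three tokens unless the rules say otherwise.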

I guess I need to keep thinking about how to make the basic language parser flexible but still keep it efficient.


    • I took a look at that a few weeks ago. I love the idea of it because it’s so barmy (software video?) but I have this distinct feeling that either I’d never do anything with it or I’d get buried in the depths of yet another device.

      I also have a philosophical issue with FIG FORTH, ANSI FORTH and all the other attempts to standardize the language. To me FORTH at its heart is a language for creating languages which make sense in a particular situation (internal DSLs if you like that kind of terminology). I can’t help feeling that attempts to standardize anything beyond a few words are missing the point – all they are doing is forcing the language (and by inference its implementors) into building a language for the imaginary domain of computer science. Why should a FORTH implementation be required to implement any words which are not used for the application and its domain?

      That’s why I am trying so hard to minimize the “language” part of my system, to make it as domain-agnostic and syntax-agnostic as possible.

        • The “backwards notation” is only really used for passing parameters around on a stack. When you get towards more domain-specific stuff the amount of backwards stack manipulation usually decreases enormously. When you have single very-high-level domain abstractions, it’s common to just string them together: open-input-valve wait-for-tank-full close-and-disconnect or whatever. If you really want parameter passing but don’t like backwards, how about “om”? A similarly simple language which goes “forwards” instead!

  1. Hmmm, having looked into it a bit further I guess it’s also the stack-based programming (in addition to the prefix or postfix notation) that I can’t be bothered to learn ;-)
    I guess it’s probably one of those things where you just get used to what you’re familiar with…
