CORNELIUS code changes again

The one thing that has had me scratching my head more than any other as I have chugged along with the development of CORNELIUS (my bare-metal operating system and language for the Raspberry Pi) is the difference between the code needed when building an application and the code needed when running it. For example, when building and testing code it's vital to be able to create new words, save work, print diagnostics, check that things work as expected and so on. Once an application is built and working, however, all that code is no longer needed.

The classical solution to this is embodied in the notion of a compiler. A compiler reads a specification of code in one language and converts it to another language for execution. Most often the input (source code) to the compiler is written in a "high level" language such as C or Java, and the output is some form of executable machine code. Of course there are subtleties to this. Some compiler output is machine code for a "virtual machine" rather than any particular processor, and the machine code produced by the compiler is often in a slightly abstract form, so it can be "relocated" to run at different locations in memory.

Once the source code has been compiled to a machine code executable, there is (in theory) no longer any need for the high-level code and the tools used to produce and manage it (editors, compilers, debuggers and so on). In practice, of course, things are hardly ever right first time, and even if they are, they will probably need to be changed later. Just throwing code and tools away after compilation would be crazy, but we still have no need of them while the generated machine code is running.

This is where the idea of cross-compilation comes in. When cross-compiling, the source code and tools stay on a "development" machine, and only the generated code is transferred to the "target" machine for execution. This is a very powerful technique, and allows software to be developed using high-level languages for machines without the resources to run the compiler and tools. This approach is how most micro-controllers and embedded computers are programmed.

The Raspberry Pi is unusual in that it is an embeddable computer with enough power and memory to compile its own software. If you want to wait an hour or two, you can happily recompile the whole Raspbian Linux Operating System on a Raspberry Pi.

For my CORNELIUS development I have to use a cross-compilation approach at the moment. Although the Raspberry Pi hardware is powerful enough, the CORNELIUS system as it stands has no way to run anything as complex as a text editor or a C compiler.

This is pretty much what I have been doing so far: write the key bits of the system in C, cross-compile them to ARM code to run on the bare Raspberry Pi, and transfer the result over using an SD card. As the system grows beyond a simple proof of concept, it is becoming harder and harder to keep a clear separation between the things which are used to build the core, and the core itself. Part of this is because of the nature of C as a programming language. Most of the "code" is actually data - the bytes and words which make up the dictionary entries, string pool and variables for the minimal system - but describing such data in C ends up requiring either code to populate the values, or some serious wrangling with CPP defines.

I have tried separating out this "building" code into separate source files which are not included in the generated executable, but this has eventually fallen foul of the same problems. So I have decided to be bold, and experiment with a different approach.

I have always hoped that I could define as much as possible of the ELIUS language and the CORN runtime nucleus in ELIUS, but there has always been a "catch-22". I can write the code for the defining words in ELIUS, but without already having defining words in place, I can't use them. So, rather than either including the C-code defining words in the deployed language, or laboriously hand-crafting the dictionary and string pool entries so they can be used, I now plan to write a separate ELIUS "compiler" which can understand and execute some basic ELIUS defining words, just enough to convert ELIUS word definitions to the appropriate heap, dictionary and pool structure. This "bootstrap" code should enable me to write almost all the core language code in ELIUS, and manage it using comfortable tools such as a text editor and "make" to include the generated data in the deployed executable.

I'm writing an initial version of the bootstrap compiler in Ruby and using it to read both ELIUS definitions and settings for system constants and variables, and to generate equivalent C code. This is not really where I want to end up, but it's an important step in making sure that the idea makes sense before I go too far with it.
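
To make the idea a little more concrete, here is a minimal Ruby sketch of what such a bootstrap step might look like. It is not the actual CORNELIUS bootstrap compiler. The Forth-style ": name body ;" definition syntax, the names parse_definitions and generate_c, the word_names and dictionary arrays, and the -1 end-of-definition marker are all assumptions made up for illustration; the real ELIUS syntax and the real heap, dictionary and pool layouts may look quite different.

```ruby
# Hypothetical sketch only: the ": name body ;" syntax and the generated
# C layout are assumptions for illustration, not the real ELIUS formats.

# Parse Forth-style colon definitions into { name => [body words] }.
def parse_definitions(source)
  words = {}
  tokens = source.split
  until tokens.empty?
    token = tokens.shift
    raise "expected ':' but found '#{token}'" unless token == ":"
    name = tokens.shift
    raise "definition is missing a name" if name.nil? || name == ";"
    body = []
    loop do
      t = tokens.shift
      raise "definition of '#{name}' is missing ';'" if t.nil?
      break if t == ";"
      body << t
    end
    words[name] = body
  end
  words
end

# Emit C source: a pool of word names plus a flat array of word indexes.
# `primitives` lists words assumed to be implemented already in C.
def generate_c(words, primitives)
  names = primitives + words.keys
  index = names.each_with_index.to_h # word name => position in the pool
  lines = ["/* Generated by the bootstrap sketch; do not edit. */"]
  # String#inspect happens to produce a valid C literal for plain ASCII names.
  pool = names.map(&:inspect).join(", ")
  lines << "static const char *const word_names[] = { #{pool} };"
  lines << "static const int dictionary[] = {"
  words.each do |name, body|
    cells = body.map do |w|
      index.fetch(w) { raise "unknown word '#{w}' in '#{name}'" }
    end
    # -1 is an assumed end-of-definition marker.
    lines << "  #{(cells + [-1]).join(', ')},"
  end
  lines << "};"
  lines.join("\n") + "\n"
end

source = ": SQUARE DUP * ; : QUAD SQUARE SQUARE ;"
puts generate_c(parse_definitions(source), ["DUP", "*"])
```

The point of the sketch is the division of labour: the definitions live in a plain text file, the Ruby script runs on the development machine as one more "make" step, and only the generated C data ends up in the deployed executable.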
