I'm trying to make a concerted push on a first working version of CORNELIUS at the moment, and it's throwing up all sorts of interesting aspects of software development.
For the first version I am building the minimum OS and language in C. However, I am trying very hard to minimise the amount of C code involved, so that (a) I can move some or all of it to assembler for maximum performance and minimum size, and (b) I can get the maximum benefit and portability from building the great majority of the system in its own language.
Today I spent some time separating out as much as I can from the core code, so that I could get a good feel for just how much is really needed. Fairly obviously this involved moving out any diagnostic tooling (code to dump out dictionary, pool and stack data in a human-readable form). The bit that surprised me, though, was when I started to move out the modelling of the Pi memory to the unit tests and a platform-specific startup file.
Ever since the beginnings of this project, the Pi memory has been emulated as statically-allocated C arrays. At the start it was several different arrays for the different areas of the memory map, then each area in turn moved into a single contiguous array of bytes. I hadn't thought much at that point about how this might work when actually running on a bare-metal Raspberry Pi. The closer I get to an actual deployable version, though, the more significant this seems. On the real Raspberry Pi platform, all I/O is through memory-mapped access; even the GPU mailboxes are mapped to addresses in memory. This is fine, except that the I/O addresses are far away from the working RAM in which the OS code and data will be stored. It does not seem to make much sense to allocate a huge amount of empty memory just to allow the CORNELIUS code to read and write to I/O registers.
With this in mind, it began to appear that the allocation of the memory space did not sit well with the OS code, but should instead be provided by the underlying hardware. On the real Pi this is no problem: the memory is there, and all the devices are mapped where they should be. When writing code and running unit tests on a separate development platform, though, this is a bit more tricky. Strange as it may seem, my generic Windows PC does not have GPIO pins, timers and mailboxes in the appropriate places, and writing to arbitrary memory locations is not likely to end well.
Luckily, one of the nice things about my code is that all memory access goes through four basic functions which read and write 8-bit bytes and 32-bit words. All that is required is that these functions do something sensible on whatever platform is being used to run the code. As a first step:
- On a Raspberry Pi they should access the memory as transparently as possible
- On a development machine they should access the memory in the allocated space, but reject accesses outside it to catch bugs
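To make those two cases concrete, here is a rough sketch of what the four access functions might look like. None of these names come from the real CORNELIUS source: `mem_read8`, `mem_write8`, `mem_read32`, `mem_write32`, the `CORNELIUS_BARE_METAL` switch, the `EMU_RAM_SIZE` value and the `emu_faults` counter are all made up for illustration. The development version stores words little-endian, on the assumption that this matches how the Pi is normally run.

```c
#include <stdint.h>

#ifdef CORNELIUS_BARE_METAL  /* hypothetical build switch for the real Pi */

/* On the Pi: pass every access straight through to the real address. */
uint8_t  mem_read8(uint32_t addr)               { return *(volatile uint8_t *)(uintptr_t)addr; }
void     mem_write8(uint32_t addr, uint8_t v)   { *(volatile uint8_t *)(uintptr_t)addr = v; }
uint32_t mem_read32(uint32_t addr)              { return *(volatile uint32_t *)(uintptr_t)addr; }
void     mem_write32(uint32_t addr, uint32_t v) { *(volatile uint32_t *)(uintptr_t)addr = v; }

#else

/* On a development machine: a fixed block of emulated RAM (size is an assumption). */
#define EMU_RAM_SIZE 0x10000u

static uint8_t  emu_ram[EMU_RAM_SIZE];
static unsigned emu_faults;  /* number of rejected out-of-range accesses */

/* True if the len bytes starting at addr all lie inside the emulated RAM. */
static int in_ram(uint32_t addr, uint32_t len)
{
    return addr <= EMU_RAM_SIZE - len;
}

uint8_t mem_read8(uint32_t addr)
{
    if (!in_ram(addr, 1)) { emu_faults++; return 0; }
    return emu_ram[addr];
}

void mem_write8(uint32_t addr, uint8_t v)
{
    if (!in_ram(addr, 1)) { emu_faults++; return; }
    emu_ram[addr] = v;
}

uint32_t mem_read32(uint32_t addr)
{
    if (!in_ram(addr, 4)) { emu_faults++; return 0; }
    /* Assemble the word little-endian, whatever the host byte order is. */
    return (uint32_t)emu_ram[addr]
         | ((uint32_t)emu_ram[addr + 1] << 8)
         | ((uint32_t)emu_ram[addr + 2] << 16)
         | ((uint32_t)emu_ram[addr + 3] << 24);
}

void mem_write32(uint32_t addr, uint32_t v)
{
    if (!in_ram(addr, 4)) { emu_faults++; return; }
    emu_ram[addr]     = (uint8_t)(v & 0xFFu);
    emu_ram[addr + 1] = (uint8_t)((v >> 8) & 0xFFu);
    emu_ram[addr + 2] = (uint8_t)((v >> 16) & 0xFFu);
    emu_ram[addr + 3] = (uint8_t)((v >> 24) & 0xFFu);
}

#endif
```

In this sketch a rejected access just bumps a counter and reads back as zero, so a unit test can check that nothing strayed outside the allocated space. Aborting on the spot would be an equally reasonable choice.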
Once I have this basic level of memory management in place, the next step is to model some devices. The Raspberry Pi version still passes straight through to real hardware, but the development platform can start to recognize certain addresses and treat them separately. Writing to a UART register might print a character on the development machine, and reading from a clock address might return a value calculated from the development machine system clock.
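As a sketch of how the development build might recognise device addresses, something like the following could sit in front of the plain RAM access. The function names `dev_write32` and `dev_read32` are hypothetical, and the two addresses are my assumption of where the UART data register and the low word of the system timer live on the original Pi; they should be checked against the peripheral documentation before being relied on.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Assumed register addresses (original Pi layout); verify before use. */
#define UART0_DR     0x20201000u  /* UART data register */
#define SYSTIMER_CLO 0x20003004u  /* low 32 bits of the microsecond counter */

/* Returns 1 if a modelled device claimed the write, 0 if the caller
   should fall back to ordinary RAM handling. */
int dev_write32(uint32_t addr, uint32_t value)
{
    switch (addr) {
    case UART0_DR:
        /* Writing to the UART prints the low byte on the host console. */
        putchar((int)(value & 0xFFu));
        fflush(stdout);
        return 1;
    default:
        return 0;
    }
}

/* Returns 1 and fills *value if a modelled device claimed the read. */
int dev_read32(uint32_t addr, uint32_t *value)
{
    switch (addr) {
    case SYSTIMER_CLO: {
        /* Derive a microsecond count from the host clock. time() only has
           one-second resolution, so this is deliberately crude. */
        uint64_t us = (uint64_t)time(NULL) * 1000000u;
        *value = (uint32_t)us;  /* keep the low 32 bits, as the register would */
        return 1;
    }
    default:
        return 0;
    }
}
```

Returning a "claimed" flag keeps the device models separate from the RAM code: the memory access functions can try the devices first and only fall through to the emulated array (or a rejection) when nothing claims the address.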
As it progresses further the opportunities expand even more. I have mentioned before how impressed I was with the graphical UI for the PiFace board, so something similar for the built-in GPIO pins would be cool too. There's even the possibility of separating out the emulated hardware from the emulated OS. It would be pretty simple to just send a notification over a network socket when out-of-RAM memory is accessed, and let a separate program handle all the display stuff. Sure, this is bound to introduce some lag, but for a large number of human-timescale projects (such as Morse code) this would not be a problem.