An OS in under a week
Throughout my studies at ISEN Lille, I’ve been asked to complete various projects. During my fourth year, we got to choose from a handful of projects.
I’m not going to lie. Most of them were boring. Did I want to make ANOTHER web app that would be tossed in the bin the second it was completed? The answer was no, never, not again, please don’t force me.
A friend and I had an idea. Looking back, it was a stupid idea. Highly unachievable, too difficult, especially given the timeframe.
TL;DR: It’s a really complex task. Even a basic OS, without networking or window management, is challenging. Careful architectural choices and deep knowledge of x86_64 assembly are required, as well as a death wish.
Behold the OPERATING SYSTEM
Yes. A fully-fledged OS was the idea:
- Not a janky web-based fake OS that runs in your browser and promises tons of ~~useless~~ AWESOME features, most of which you’ll never use anyway.
- Not a poorly assembled Linux distro (I use Arch, btw)
- Not some poor imitation of an OS that runs on a microcontroller
A real, x86_64 OS.
The basic feature list was, unbeknownst to us, a bit presumptuous:
- A bootloader
- x86_64
- Multi-tasking (not multithreading, we already knew it was a nightmare to implement)
- Some kind of stdlib
All of that, in under a month.
Let’s get started
But where do we start? Where did Microsoft start? (Answer: in a garage, but neither my friend nor I had a garage at hand. Our biggest mistake.) The real answer is: we start with Google, and a pack of energy drinks.
Google takes us to OSDev. (Big up to these guys, by the way)
OSDev tells us that we need a year. Perfect, we only have a month!
Oh god, oh f*, what have we got ourselves into?
You have been fairly warned of the hard work ahead, but if you are still interested then proceed forward into the realm of the operating system programmer. Prepare yourself for occasional bouts of confusion, discouragement, and for some of us…temporary insanity. In time, and with enough dedication, you will find yourself among the elite few who have contributed to a working operating system. If you do get discouraged along the way, refresh yourself with the content of this book. Hopefully it will remind you why you started such an insane journey in the first place.
From https://wiki.osdev.org/Getting_Started
Having worked on ISEN projects for a few years, I can assure you that reading other students’ code had already lowered my sanity level.
Anyway, we had to get working.
Chapter 1: The ḃ̵͕o̴͉̦̣̱̖̮͔̼̓ơ̸͕͕̗̲̫̪̝̱̆̓̓̈̔̓͆̊̚t̷͎̟͔̹̘̊͝͠ļ̷̗̞̻̻͂́o̸̢̟̤͖̹̩̹̭͔͂͐͒̊͜a̷̢͚͕̘̩̗͕̿̈͂̃ḑ̷̧̢̮͕̊̇͗ę̶͈̙̖͈͐̽̎̐̐̿͐̓͘r̵͙͛̎̈́̎̃͠
Basically, a computer, after the POST sequence, is a blank canvas. Devices are not initialized (or not totally, at least). We have a few basic data structures ready for us to joyfully parse and use.
Not much more.
How do we take this blank canvas to ~~a fully fledged operating system~~ an unstable nightmare that only works for 10 minutes, just enough for the presentation?
With a bootloader
When a computer is turned off, its software—including operating systems, application code, and data—remains stored on non-volatile memory. When the computer is powered on, it typically does not have an operating system or its loader in random-access memory (RAM). The computer first executes a relatively small program stored in read-only memory (ROM, and later EEPROM, NOR flash) along with some needed data, to initialize CPU and motherboard, to initialize RAM (especially on x86 systems), to access the nonvolatile device (usually block device, e.g. NAND flash) or devices from which the operating system programs and data can be loaded into RAM. The small program that starts this sequence is known as a bootstrap loader, bootstrap or boot loader. Often, multiple-stage boot loaders are used, during which several programs of increasing complexity load one after the other in a process of chain loading.
From https://en.wikipedia.org/wiki/Booting#Modern_boot_loaders
Basically:
- Read data structures: where’s my RAM, where are my devices
- Make sure everything is ready and initialized
- Load the OS
- Execute the OS, and pass it the data it needs to work
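The first and last steps of that list can be sketched in C. This is only an illustration: `mem_region`, `boot_info` and both helper functions are hypothetical names made up for this example. They are loosely inspired by the kind of memory map a firmware hands over, and are neither the EDK2 types nor the project's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region types; real firmware memory maps define many more. */
enum region_type { REGION_USABLE = 1, REGION_RESERVED = 2 };

/* Hypothetical, simplified memory-map entry. */
struct mem_region {
    uint64_t base;          /* physical start address */
    uint64_t length;        /* size in bytes */
    enum region_type type;
};

/* Hypothetical structure the bootloader fills in and hands to the kernel. */
struct boot_info {
    const struct mem_region *regions;
    size_t region_count;
    uint64_t framebuffer_base;  /* physical address of the framebuffer */
};

/* "Where's my RAM": sum the bytes the kernel is allowed to use. */
uint64_t usable_bytes(const struct boot_info *info)
{
    assert(info != NULL);
    uint64_t total = 0;
    for (size_t i = 0; i < info->region_count; i++) {
        if (info->regions[i].type == REGION_USABLE)
            total += info->regions[i].length;
    }
    return total;
}

/* Largest usable region, one plausible place to load the OS; NULL if none. */
const struct mem_region *largest_usable(const struct boot_info *info)
{
    assert(info != NULL);
    const struct mem_region *best = NULL;
    for (size_t i = 0; i < info->region_count; i++) {
        const struct mem_region *r = &info->regions[i];
        if (r->type == REGION_USABLE &&
            (best == NULL || r->length > best->length))
            best = r;
    }
    return best;
}
```

The point of a single `boot_info`-style structure is that the kernel never has to talk to the firmware again: everything it needs to know arrives as one pointer at its entry point.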
We did ours with UEFI (Unified Extensible Firmware Interface), the successor to the BIOS (Basic Input/Output System). We also did a BIOS bootloader, just in case.
We made it with EDK2. Contrary to alternatives, this is an “official” SDK (Software Development Kit) endorsed by Intel, HP, Microsoft, insert some other big names here.
It’s a fully featured dev environment. Aka, a nightmare to work with.
Thanks to an unknown divinity, a Docker version is available for compilation.
We only went for the basics.
The R̸̭̀͝ḁ̶̊̃͒̔́n̵̙̣̹̂̉̽̿d̷͓̿ò̶̟̼̼͒̈͋m̷͕̬͈̈́́̏́ ̶̳̕Ä̷̺̯͔͊͠͠c̷̻̦̠̩̒̐̏͑c̷̣̹̣̻̮̈́̒̽̇̽ẽ̶̬̦̗͙̥̍̎ș̶̦̺̯̉͑̔͒s̴̯̣̈́͋̃́ ̵̣̫̊̇̊M̴͕͔̂͜ͅě̷͔̗̯m̸͉̙̒o̴̡̫̺̝̥̓̓͝͝r̴̥̯̜̤̈̊y̵̧̥̮͖̝͌
(I wish I could make this zalgo much worse, but that would make half the page unreadable. Just so you know.)
TL;DR: It wasn’t basic at all.
Basically, memory in a PC is complicated. When I say complicated, I mean that getting something to work reliably makes you understand why Microsoft and Apple are multi-billion-dollar companies with a ton of employees.
Just for information, the Intel Software Developer Manual (our bible) is 5082 pages long. And all pages are filled to the brim with technical details.
In the 60s, Tom Kilburn and his colleagues wrote a paper about virtual memory. For this project, we had to understand virtual memory.
Now, if you are still reading, you also need to understand virtual memory.
The TL;DR of the TL;DR of the TL;DR of Virtual Memory
Let’s say you have 16GB of RAM on your computer. Two sticks of neat DDR4 (DDR5 if you’ve built a new computer recently) 8GB RAM.
You’re playing the latest game from, let’s say, a French game company, which happens to be a buggy mess.
The game crashes. You’re given a “Memory Violation 0xFFFF800001230f07”. Let’s take this address:
- 0xFFFF800001230f07 = 18446603336240271111
- 18446603336240271111 bytes ≈ 16383.875 pebibytes
If you were working with physical memory, that would mean you have more than 16383 pebibytes of RAM. Don’t try to understand this number, it’s ridiculous.
So why do we have an error at that address?
Because of virtual memory!
The OS decides how addresses are translated. Which means we can use addresses that wouldn’t be conceivable in physical memory.
Ranges of virtual memory are automatically translated to physical memory using tables, which are walked by the MMU (Memory Management Unit).
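To make the "tables" part a little more concrete, here is a minimal sketch of how the address from the crash message splits into table indices. It assumes standard x86_64 4-level paging with 4 KiB pages (four 9-bit indices plus a 12-bit offset); the names `va_parts`, `is_canonical` and `split_va` are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The five pieces of a virtual address under 4-level paging, 4 KiB pages. */
struct va_parts {
    uint16_t pml4;    /* bits 47..39: index into the top-level table */
    uint16_t pdpt;    /* bits 38..30 */
    uint16_t pd;      /* bits 29..21 */
    uint16_t pt;      /* bits 20..12 */
    uint16_t offset;  /* bits 11..0: byte offset inside the 4 KiB page */
};

/* With 48-bit virtual addresses, bits 63..47 must be all 0s or all 1s. */
bool is_canonical(uint64_t va)
{
    uint64_t top = va >> 47;  /* the 17 upper bits */
    return top == 0 || top == 0x1FFFF;
}

/* Split a canonical virtual address into its table indices and offset. */
struct va_parts split_va(uint64_t va)
{
    assert(is_canonical(va));
    struct va_parts p;
    p.offset = (uint16_t)(va & 0xFFF);
    p.pt     = (uint16_t)((va >> 12) & 0x1FF);
    p.pd     = (uint16_t)((va >> 21) & 0x1FF);
    p.pdpt   = (uint16_t)((va >> 30) & 0x1FF);
    p.pml4   = (uint16_t)((va >> 39) & 0x1FF);
    return p;
}
```

For `0xFFFF800001230f07`, this gives entry 256 of the top-level table (the first entry of the upper half of the address space), then entries 0, 9 and 48 of the lower tables, and byte `0xf07` inside the page. The huge number is just a path through four small tables.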
If you want more info on this, check out OSDev: https://wiki.osdev.org/Paging
Chapter 2: The kernel
Once we have built our bootloader and our memory tables, we have an OS that is able to make pages (memory allocations) for each process.
We also have to build our Graphical Interface.
The graphical interface
If you’re reading this, you’re probably using an OS that has a graphical interface: Windows, Linux, macOS, Android, the thing that runs on iPhones (I forgot). For the sake of simplicity, I will totally ignore the concept of graphics cards, because I don’t want to speak on a subject I don’t fully understand.
(For our experienced readers, it’s the FrameBuffer, provided by UEFI)
Basically, the graphical interface is a lie. The computer doesn’t comprehend the concept of a window.
What you have is a grid of pixels. Probably something along the lines of 1920×1080, or some variation.
What do we make of it?
- Draw shapes
- Draw text
- Draw shapes
- Draw text
That’s pretty much what we did. We neither wanted nor had the time to develop a real interface, so we made a CLI.
And when you only have a 2D grid of pixels, it can get really hard, really quickly, if you don’t carefully consider what’s happening.
Let’s say you want to display text.
Hello world is simple:
- Take the text
- Use a char-to-bitmap array (a bitmap font) to convert each character to a bitmap
- Get the position of the cursor
- Copy the bitmap to the framebuffer, at the location of the cursor.
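Those four steps fit in a few lines of C. This is a sketch under assumptions, not the project's code: the `framebuffer` struct, the 8×8 glyph size, the 32-bit pixel format and the hand-made `glyph_letter_h` are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GLYPH_WIDTH  8
#define GLYPH_HEIGHT 8

/* Hypothetical framebuffer description; 32-bit pixels are assumed. */
struct framebuffer {
    uint32_t *pixels;
    uint32_t width, height;  /* visible size, in pixels */
    uint32_t stride;         /* pixels per scanline, may exceed width */
};

/* One hand-made glyph for 'H': one byte per row, most significant bit is
 * the leftmost pixel. A real font table holds one of these per character. */
static const uint8_t glyph_letter_h[GLYPH_HEIGHT] = {
    0x66, 0x66, 0x66, 0x7E, 0x66, 0x66, 0x66, 0x00
};

/* Copy one glyph into the character cell (cell_x, cell_y).
 * Pixels that fall outside the framebuffer are skipped. */
void draw_glyph(const struct framebuffer *fb,
                const uint8_t glyph[GLYPH_HEIGHT],
                uint32_t cell_x, uint32_t cell_y,
                uint32_t fg, uint32_t bg)
{
    assert(fb != NULL && fb->pixels != NULL);
    for (uint32_t row = 0; row < GLYPH_HEIGHT; row++) {
        for (uint32_t col = 0; col < GLYPH_WIDTH; col++) {
            uint32_t x = cell_x * GLYPH_WIDTH + col;
            uint32_t y = cell_y * GLYPH_HEIGHT + row;
            if (x >= fb->width || y >= fb->height)
                continue;
            int lit = glyph[row] & (0x80u >> col);
            fb->pixels[(size_t)y * fb->stride + x] = lit ? fg : bg;
        }
    }
}
```

Working in character cells rather than raw pixels means the cursor can simply be a (column, row) pair, which keeps the rest of the CLI code away from pixel arithmetic.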
Now, you want to write Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas pulvinar, massa sit amet efficitur faucibus, sem metus mattis tellus, vel blandit eros lorem at mi. Cras orci eros, tristique vitae lectus eget, consectetur semper diam. Integer tincidunt eros non eros tincidunt, sit amet luctus magna scelerisque. Curabitur in rutrum magna. Fusce semper eget lorem fringilla porta. Fusce quis condimentum nulla. Proin ultricies tortor ut nunc condimentum, non tincidunt lectus aliquet. Nam nisi nisi, porta non diam non, lacinia ultricies nunc.
You have to handle the fact that the text is too long, so you need line breaks, and you need to compute where those line breaks should be. You also need to handle deleting characters and moving the cursor, all while maintaining a coherent state.
It’s not that simple. Not excruciatingly complex, but not really simple.
IO
Basically, IO was not my part. Interfacing with drivers is hard.
You can do it the easy way: IO provided by UEFI. Or the hard way: interfacing with the drivers yourself.
Of course, the second option is the best.
Loading apps
Now that we can use memory, IO, framebuffer, let’s make an app.
Loading an app is something that requires careful consideration. We chose to use ELF binaries, to support relocations and to allow us to use a basic bare-bones GCC toolchain.
So first, you need to load your executable into memory. You also need to provide it with the pointers to your “library”. Finally, you need to make the memory tables that will allow your app to work.
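The very first part of "load your executable" is checking that the file really is an ELF binary for the right machine and finding where execution starts. A minimal sketch, assuming a little-endian host and the standard 64-byte ELF64 header layout; the struct, constant and function names are invented for this example, and segment loading and relocations are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF64 file header (64 bytes), field names shortened for the example. */
struct elf64_header {
    uint8_t  ident[16];  /* magic, class, endianness, ... */
    uint16_t type;       /* executable, shared object, ... */
    uint16_t machine;    /* target architecture */
    uint32_t version;
    uint64_t entry;      /* virtual address of the entry point */
    uint64_t phoff;      /* file offset of the program headers */
    uint64_t shoff;
    uint32_t flags;
    uint16_t ehsize, phentsize, phnum, shentsize, shnum, shstrndx;
};

enum {
    ELF_CLASS_64       = 2,   /* ident[4] */
    ELF_DATA_LSB       = 1,   /* ident[5]: little-endian */
    ELF_TYPE_EXEC      = 2,
    ELF_TYPE_DYN       = 3,   /* position-independent executable */
    ELF_MACHINE_X86_64 = 62
};

/* Validate the header of an in-memory ELF image. On success, store the
 * entry point in *entry_out and return true. */
bool elf64_entry(const uint8_t *image, size_t size, uint64_t *entry_out)
{
    assert(image != NULL && entry_out != NULL);
    struct elf64_header h;
    if (size < sizeof h)
        return false;
    memcpy(&h, image, sizeof h);  /* avoids unaligned access */
    if (memcmp(h.ident, "\x7f" "ELF", 4) != 0)
        return false;
    if (h.ident[4] != ELF_CLASS_64 || h.ident[5] != ELF_DATA_LSB)
        return false;
    if (h.type != ELF_TYPE_EXEC && h.type != ELF_TYPE_DYN)
        return false;
    if (h.machine != ELF_MACHINE_X86_64)
        return false;
    *entry_out = h.entry;
    return true;
}
```

A real loader would then walk the program headers starting at `phoff`, map each loadable segment at its virtual address and apply relocations before jumping to the entry point.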
Memory management was always the worst part of the project.
You also need to set up a bunch of timers to allow for multitasking.
Once you’re done, you just need a context switch to go from the OS to the app.
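The decision made on each timer tick can be sketched as a round-robin pick. This is a hypothetical illustration, not the project's scheduler: the `task` struct and `schedule_next` are made up, and the context switch itself (saving registers, swapping stacks and page tables) needs assembly and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum task_state { TASK_UNUSED, TASK_READY, TASK_RUNNING };

/* Hypothetical task slot; a real one also holds registers and page tables. */
struct task {
    enum task_state state;
    uint64_t saved_rsp;  /* stack pointer saved at the last switch (unused here) */
};

/* Round-robin: starting after `current`, pick the next READY task.
 * Returns the index of the task that should run next; if nothing else
 * is ready, the current task simply keeps running. */
size_t schedule_next(struct task *tasks, size_t count, size_t current)
{
    assert(tasks != NULL && count > 0 && current < count);
    for (size_t step = 1; step <= count; step++) {
        size_t i = (current + step) % count;
        if (tasks[i].state == TASK_READY) {
            if (tasks[current].state == TASK_RUNNING)
                tasks[current].state = TASK_READY;
            tasks[i].state = TASK_RUNNING;
            return i;
        }
    }
    return current;
}
```

In a design like this, the timer interrupt handler would call `schedule_next` and, if the returned index differs from the current one, perform the context switch to that task.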
Wrapping up
At the end of the day (and the project), we had a functional, albeit buggy, OS that was ready enough to launch a few apps, pass them the IO library, graphical library, etc.
That totalled somewhere around 3,000 lines of code, IIRC.
Source code: GitHub. Just remember it was a school project, a few years back, made in a hurry…