Design of the Operating System

Purpose of this Article

As a beginner in operating system development it is difficult to find your way through the material that is out there. There are no definite guides in implementing an operating system because most of the authors have thought themselves by reading Intel’s developer manuals. A lot of the books are therefore opinionated. Operating Systems become a very personal subject all of a sudden.

In general free tutorials teaching os development in a concise manner that go beyond a hello world boot loader are scarce. James Molloy wrote an excellent set of articles that even explain paging and a heap implementation from scratch line by line! James articles do explain the implementation of more concepts than any other article on the internet.

The problem is that some other sites on the internet attack his implementation mentioning problems in James’s code. James also mentions concepts but does not thouroughly explain why he wants to adhere to those concepts. For example he says kmalloc should not be called after identity mapping but before a heap is initialized. He never says what errors occur should this concept be ignored!

On the other hand you have the heaps of academic books that explain broad concept in nicely written and well organized fashion! The problem is that academic material is too high level to implement anything. Implementation deatils are left out of the picture. A student could leave a course with a broad understanding of what an operating system does but at the same time that student is not able to implement any of the concepts because the understanding is too high level. In a sense the student knows everything and nothing at all at the same time! The knowledge becomes useless beyond passing an exam. Only the most brilliant minds will be able to transform an academic book into a working operating system. Us mere mortals, we need material that gets us started with a basic implementation so we can learn the steps it takes to write an operating sytem.

I think the problem is that in order to be able to write an operating system, you have to be able to find a way to dive down from the level of abstraction displayed in academic books to the very low level tutorials of the web. Articles or books that provide those intermediary steps do not exist. You have to wade through the heaps of opinionated articles and find your own why. You have to establish a plan of what to implement in what order. In order to create an implementation plan, you have to manifest all the concepts in your brain into a concrete architecture not knowing yet if the architecture will hold up. You should not let yourself get discouraged by opinionated, elitist posts on the internet and keep on working on your own implementation.

This article outlines my personal architecture and ideas of how an operating system could be implemented. It is heavily influenced by James Molloys articles and tries to solve problems that might arise due to James implementation as mentioned on the OSDev wiki.

I have no idea if all that is written in this post is correct or not. If I find myself in a situation that I cannot solve based on my current understanding, I will update this architecture with changed architecture which does not have the problems of the architecture before it.

Besides James Molloy’s influence, the architecture is heavily influenced by Unix concepts, because almost all books on operating systems are based on the Unix operating system, I almost exclusively think in Unix terms when I think about operating systems at this point. Although things like fork before exec do not intuitively make sense to me, (why fork? Why not just exec?) I can see however that an implementation of fork and exec is possible and leads to new processes. So I will implement this concept. Especially since it aligns with James Molloys ideas.

Overall Design

The operating system will use paging. It will not swap to a hard drive in the early versions. If all physical memory is used, no process will receive more memory. A process has to be able to deal with this situation. It will use paging to secure the kernel frames from being written to by user mode code and also to facilitate the creation of new processes by copying physical frames for isolation of processes while maintaining the same virtual address space for the copied process. It will also use paging to map the operating system’s frames to the bottom of every running process.

The kernel’s page directory and page tables are stored in identity mapped frames, although I still do not 100% understand why identity mapping is needed. That is not true! The page directory and page tables have to be managed by the heap and they have to belong to the init process. They should be copied (not mapped) on fork(). Because fork() will copy everything above the last used kernel frame, the page directory and page tables have to be located above the last identity mapped kernel frame.

The kernel’s frames are located at the bottom of the physical memory starting from 0 (Where does GRUB put the kernel? Also where is the stack placed by GRUB?). Also code placed by the BIOS is contained in the low frames of the memory map. See http://www.cs.bham.ac.uk/~exr/lectures/opsys/10_11/lectures/os-dev.pdf – Figure 3.4) The kernel’s frames and the BIOS’s frames have to be marked as occupied in the bitmap of used frames to save them from being used twice. This is how the kernel prevents code from overriding it in the early stages of the boot process.

Before activating a heap, the kernel will just use static memory (placement memory system). It will start with an address (read from GRUB’s multiboot information) and it will move that pointer (placement_address) up the address space whenever it needs memory. This memory is never returned as long as the OS runs and hence no heap is necessary for this simple placement memory system.

Calls to fork() will map (not copy) all lower frames into the newly forked processes memory space. Overall fork will map

  • The frames created by the BIOS
    • contains the interrupt vector table
  • The kernel code installed by the bootloader GRUB
  • The modules that where placed into memory by GRUB above the kernel
  • The kernel’s placement memory created via the placement memory system
    • The placement memory contains the bitmap of used frames

Program Flow and the Init Process

When the OS starts, it eventually creates the init process. (It also creates an entry for the init process in its list of running processes. This list is the array of process descriptor data structures. That list is also updated by fork(). Because the list is maintained in the kernel memory section at the bottom of the memory map, it has to be static in size, because once the kernel initializes init’s heap, its memory remains fixed to keep it from growing up the address space into any other area of the memory trashing processes along the way. The kernel can therefore only run a fixed amount of processes. Maximum 16 for the start maybe?)

The program flow starts after the bootloader (GRUB) has set the CPU’s instruction pointer onto the kernel code that it has placed into memory. The CPU starts to execute the kernel code. The kernel at this early stage has no processes running, it is the only part executing.

The kernel will

  • Set GDT and interrupt tables
  • Create its own stack
  • Prepare the array of process descriptors
  • Create the bitmap of frames. All frames are free yet.
  • Prepare frames (above the static kernel memory) and put Page Directory and Page Tables into those frames (The page table entries do not point to frames yet!)
  • Identity map the kernels frames and mark them used in the bitmap of frames.
  • Assign frames to the Page Table Entries prepared above and marks the frames as used in the bitmap of frames
  • Initialize the heap
  • Start the init process and hands the program flow over to the init process

Once the init process has received the program flow, it is taking care of all further tasks. It will read the harddrive to find a configuration file. The configuration file describes, which process init should start. In most cases, the new process will be the console process (user space) which reads from stdin and outputs to stdout and can fork() and exec() new processes via system calls.

At some point the program flow has to go into a scheduler which assigns processing time to all running processes. It will run the init process, the console process and all forked and execed processes in their respective time slices.

Back to the init process that just took over the flow from the kernel. The init process will inherit and use the operating system’s stack and its first Page Directory and Page Tables. In the future, whenever init forks itself (maybe to prepare a call to exec later on), the kernel’s/init’s stack will be copied to the new processes by in-memory copying the frames that the kernel stack occupies. The newly copied stack will then be available under the same virtual memory address as the kernel’s/init’s stack (no pointers have to be moved around). The new process has all pointers functioning because the virtual memory looks the same as init’s virtual memory. The frames underneath the copied stack are copies of init’s frames and the new process cannot affect init’s original stack. James Molloy copies frames which is fine but then proceeds to moves pointers around, I think this is not necessary. OSDev wiki also say that it is harmful.

During fork() the new process will not make a copy of the frames that the operating system uses (except the stack as explained above) but it will just map them into the new processes virtual address space. Mapping means that some of the Page Table Entries will receive a pointer to the kernel’s physical memory. That pointer is copied from the parent process as is without any changes.

Overview of Frames

The operating system uses frames for:

  • The frames where GRUB loaded the kernel sections into.
  • The frames that where allocated by the OS before a heap was activated (= identity mapped frames)
  • The frames used to manage running process information. This is the fixed size array of process descriptor data structures.
  • The frames used by the bitmap of free or occupied physical frames.
  • The frames used by the operating systems Page Directory and Page Tables (more or less owned by the init process).
  • Its own stack (more or less owned by the init process also).
  • The frames used by the heap (more or less owned by the init process).

The latter three types of frames are passed over by the operating system to the init process. The reason is that every process should have

  • it’s own virtual memory address space (Page Directory and Page Tables)
  • it’s own stack
  • It’s own heap

By assigning those three objects to the init process, they can be in-memory copied for new, forked processes and they are not merely mapped.

Because the criteria for fork() to decide whether to copy or to fork is the last frame used by the OS before the heap was activated. Everything below that frame is mapped instead of copied. Everything above that frame is copied (cloned, duplicated). The stack, heap and the Page Directory and Page Tables have to be place above the last used OS frame in the memory map so they are copied and not mapped.

Also frames used by the Page Directory and Page Table will not be statically allocated by the OS! Instead they should be allocated via the heap. The reason is that a process has to have its own virtual address space and has to be able to grow or shrink its own Page Directory and Page Tables (= virtual address space) via the heap. That means that before adding the first Page Directory and Page Tables by the OS for the init proces, a heap implementation has to be activated so init immediately starts to behave like a normal process. init has to be a fully functional template for all other processes forked from it.

The reason for mapping the OS frames is that the operating systems code, static resource data and the memory that it allocated before the heap is active will remain unchanged as long as the OS is running. Because those frames are static, there is no reason to make physical copies of those frames. How does fork() know which frames to copy and which to map? All frames from 0 to the last frame allocated before the heap was activated are mapped. Those are the frames the operating system uses. All other frames (Stack, the processes data) are copied.

Process Descriptors, List of Process Information

What is stored in a ProcessDescriptor? A ProcessDescriptor is created for each running process. The ProcessDescriptors are stored in the kernels static memory, hence their number must be limited to a maximum value (16 for the beginning). Each Process Descriptor contains:

  • The process identifier PID, a numeric value identifying the process amongst all processes. 0 is not a valid PID because fork returns 0 in the child process after fork.
  • A pointer to a physical address. That physical address contains the processe’s Page Directory Table. This pointer is needed to enlarge the memory managment structures Page Directory and Page Table if the process requests more memory. Also the memory management structures have to be in-memory copied during a fork().
  • A pointer to the heap.
  • A pointer to the stack.
  • A place in memory where registers can be stored to conserve the CPU state.

Leave a Reply