Building OpenCV on macOS

Preparation

These installation instructions are taken from: https://blogs.wcode.org/2014/10/howto-install-build-and-use-opencv-macosx-10-10/

Download the sources from https://opencv.org/releases/

Extract the zip to /Users/bischowg/Downloads/opencv-4.1.2

Create two new folders inside of the openCV directory, one called StaticLibs and the other SharedLibs.

Build the Static Libraries

Open cmake-gui (type cmake-gui into the console)

Click Browse Source and navigate to your openCV folder.

Click Browse Build and navigate to your StaticLib Folder.

Click the configure button.

You will be asked how you would like to generate the files.

Choose Unix Makefiles from the drop-down menu and click OK.

CMake will perform some tests and a set of red boxes will appear in the CMake window.

You will need to adjust the following options:

Uncheck BUILD_SHARED_LIBS

Uncheck BUILD_TESTS

Add an SDK path to CMAKE_OSX_SYSROOT,
it will look something like this: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk
-> On my system: /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk

Add x86_64 to CMAKE_OSX_ARCHITECTURES; this tells CMake to compile for the current system architecture.

Uncheck WITH_1394

Uncheck WITH_FFMPEG

Click Configure again, then Click Generate.

When the application has finished generating, Open Terminal and type the following commands.

– cd <path/to/your/opencv/staticlibs/folder/>
– make (this will take a while)
– sudo make install

Enter your password.
This will install the static libraries on your computer.
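For reference, the same static-library configuration can be driven entirely from the terminal instead of cmake-gui. The source and SDK paths below are the ones assumed in the steps above; adjust them for your system:

```
cd /Users/bischowg/Downloads/opencv-4.1.2
mkdir -p StaticLibs
cd StaticLibs

# configure with the same options that were set in cmake-gui
cmake -G "Unix Makefiles" \
      -DBUILD_SHARED_LIBS=OFF \
      -DBUILD_TESTS=OFF \
      -DCMAKE_OSX_SYSROOT=/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk \
      -DCMAKE_OSX_ARCHITECTURES=x86_64 \
      -DWITH_1394=OFF \
      -DWITH_FFMPEG=OFF \
      ..

make
sudo make install
```

For the shared-library build, pass -DBUILD_SHARED_LIBS=ON and use the SharedLibs folder instead.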

Build the Shared Libraries

Open cmake-gui (type cmake-gui into the console)

Click Browse Source and navigate to your openCV folder.

Click Browse Build and navigate to your SharedLib Folder.

Click the configure button.

You will be asked how you would like to generate the files.

Choose Unix Makefiles from the drop-down menu and click OK.

CMake will perform some tests and a set of red boxes will appear in the CMake window.

You will need to adjust the following options:

Check BUILD_SHARED_LIBS

Uncheck BUILD_TESTS

Add an SDK path to CMAKE_OSX_SYSROOT, it will look something like this:
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk
-> On my system: /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk

Add x86_64 to CMAKE_OSX_ARCHITECTURES; this tells CMake to compile for the current system architecture.

Uncheck WITH_1394

Uncheck WITH_FFMPEG

Click Configure again, then Click Generate.

When the application has finished generating, Open Terminal.

– cd <path/to/your/opencv/SharedLibs/folder/>
– make (this will take a while)
– sudo make install

Enter your password.
This will install the shared libraries on your computer.

Problem During Building the Example Application
OpenCV 4.x+ requires enabled C++11 support

Into the CMakeLists.txt add:

set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

Then call

cmake .
make
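For context, a complete minimal CMakeLists.txt for an OpenCV example application could look like the following sketch. The target name DisplayImage and the source file name are assumptions; find_package(OpenCV) relies on the configuration files installed by the make install step above:

```
cmake_minimum_required(VERSION 3.1)
project(DisplayImage)

# OpenCV 4.x+ requires enabled C++11 support
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# locate the OpenCV installation performed earlier
find_package(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})

add_executable(DisplayImage DisplayImage.cpp)
target_link_libraries(DisplayImage ${OpenCV_LIBS})
```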

Sequelize

Introduction

Sequelize describes itself on the sequelize homepage:

Sequelize is a promise-based Node.js ORM for Postgres, MySQL, MariaDB, SQLite and Microsoft SQL Server. It features solid transaction support, relations, eager and lazy loading, read replication and more.

Sequelize is an ORM (Object Relational Mapper) which allows you to interface with a SQL database without writing SQL but purely using the domain objects of your node application.

Sequelize shields you from SQL to the point where it will automatically generate and execute table creation statements and all the insert, update, select and delete statements for you! You do not have to create tables in your database server; Sequelize will do it for you.

While reading this article, you can look at this repository. It is not good code by any means but maybe it helps you to see a running example which you can modify and do your own tests on.

Defining Models

A model is an object that has attributes which will be stored in the columns of a database table. Models can be connected to each other using associations.

An example would be an application that manages bank accounts. There is a model for an account, each account has amounts of money in them and money is transferred between accounts by transactions. You could model Accounts, Amounts and Transactions. Sequelize will allow you to perform CRUD for those objects without writing a single line of SQL.

The model folder and index.js

In order for Sequelize to manage your objects, you have to define the models first so Sequelize knows about their structure and can generate SQL for you.

This approach is taken from https://github.com/sequelize/express-example/blob/master/models and it works very well.

One way of managing your models is to have a separate folder that stores all your model definitions. A model definition is a Node module that exports a model. A model optionally has a function that defines its associations to the other models.

Inside that model folder, you will also have an index.js file. This index.js file will first scan for all the models defined in the models folder, set them up and then export an object called db. db will be the object your Node application uses to interface with Sequelize. db contains a connection to the database and all the model definitions that you can store in the database.

index.js will perform the following steps to set up models:

  • It will connect to the database and store the connection into the db object
  • It will scan the model folder and load all model definitions it finds
  • It will call the associate() function on each model (if defined) so that a model can define its associations to the other models
TypeError: defineCall is not a function

One very important thing to remember is the following: the error “TypeError: defineCall is not a function” is thrown if your model folder contains any files other than valid Sequelize model modules or the index.js file! If you put any non-Sequelize code into the model folder, or if you comment out the entire contents of one of your model module files, Sequelize will get confused and throw the “defineCall” error! So do not comment out any of your model modules and do not put any other files into the model folder!

An example Model-Module (models/account.model.js) for accounts is:

module.exports = (sequelize, DataTypes) => {

    var Account = sequelize.define('Account', {
        'id': {
            type: DataTypes.INTEGER(11),
            allowNull: false,
            primaryKey: true,
            autoIncrement: true
        },
        'name': {
            type: DataTypes.STRING(255)
        }
    });

    Account.associate = function (models) {

        // optional, foreign key is stored in the source model
        // (= Account has foreign key to RealAccount)
        models.Account.belongsTo(models.RealAccount, { foreignKey: 'realaccountid', targetKey: 'id' });

        models.Account.hasMany(models.Amount, { foreignKey: 'virtualaccountid', targetKey: 'id' });
    };

    return Account;
};

This module defines an Account model and its associations to a RealAccount model and several Amount models.

The index.js file looks like this:

// inspired by https://github.com/sequelize/express-example/blob/master/models

var fs = require('fs');
var path = require('path');
var Sequelize = require('sequelize');

var basename = path.basename(__filename);

const sqlConfig = {
    user: 'root',
    password: 'test',
    server: '127.0.0.1:3306',
    database: 'cash'
}

var sequelizeConnection = new Sequelize(sqlConfig.database, sqlConfig.user, sqlConfig.password, {
    host: 'localhost',
    port: 3306,
    dialect: 'mysql',
    logging: false,
    pool: {
        max: 5,
        min: 0,
        idle: 10000
    }
});

var db = {
    Sequelize: Sequelize,
    sequelize: sequelizeConnection
};

// collect all model files in the models folder to automatically load all the defined models
fs.readdirSync(__dirname)
    .filter(file => {
        return (file.indexOf('.') !== 0) && (file !== basename) && (file.slice(-3) === '.js');
    })
    .forEach(file => {
        var model = db.sequelize['import'](path.join(__dirname, file));
        db[model.name] = model;
    });

// if a model has an associate method, call it.
// The associate method will define the relationships between the models.
Object.keys(db).forEach(modelName => {
    if (db[modelName].associate) {
        db[modelName].associate(db);
    }
});

module.exports = db;

 

Using the db object

To store and load into the database, you have to use the db object. Import the db object:

var db = require('../model/index.js');

Now the db object is ready.

For Testing: Erase and recreate all the tables for all models

Note: Never have this in your production code! All data will be lost! This is only useful for testing!

During the testing phase, it is useful to let Sequelize erase all tables and recreate them for you. To do so, execute sync() with the force parameter:

// delete the entire database
await db.sequelize.sync({
    force: true
});
Inserting a Model

The functions that operate on models are added to a services.js file that lives in its own services folder, separate from the model folder.

It is a mistake to place the services.js file into the model folder, as Sequelize will get confused and produce the “TypeError: defineCall is not a function” error.

The module to insert a model looks like this:

async function addAmountByIds(db, realAccountId, virtualAccountId, amount) {
    return db.Amount.create({
        amount: amount,
        realaccountid: realAccountId,
        virtualaccountid: virtualAccountId
    });
}

module.exports = {

    addAmountByIds: addAmountByIds

};

To use this method:

var services = require('../persistence_services/services');
var db = require('../persistence/index.js');

var realAccount = await services.addAmountByIds(db, 1, 2, 299);

You can see that the function addAmountByIds() is declared async because Sequelize is inherently asynchronous and uses Promises for everything.

addAmountByIds() will not return the created Amount object instantly; it will return a Promise. The Promise is your handle to an asynchronous flow of operation which creates the Amount object in the database asynchronously. As soon as that flow finishes, the Promise resolves with the actual result, which is the created Amount object.

Declaring addAmountByIds() async makes it a non-blocking call: when you call it, it immediately returns a Promise and your normal program flow continues while the database operation proceeds in the background on Node's event loop.

This way of dealing with asynchronicity is powerful, but my brain just can’t handle it. I cannot write a working application dealing with hundreds of Promises instead of real objects. Because of my own imperfection, the examples will always call the service functions with the await keyword.

The await keyword turns the non-blocking calls into plain old blocking functions. The program flow will pause after calling the service function until the real object is returned from the Promise. That way, you are sure that the result returned is a valid object that is now persisted into your database. You can write sequential code and solve your problems without thinking about asynchronous control flow.

then() – chaining

An alternative way of dealing with asynchronous Promises is to chain Promises together with calls to the then() function.

The callback passed to then() is called as soon as the Promise resolves with the real object. Besides await, this is another way to continue only once the result is ready. In my humble opinion, then() chains are a very awkward way of writing sequential code; I have yet to see high-quality, easily readable code built from then() chains.
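To illustrate the difference, here is a minimal sketch contrasting a then() chain with the await style. fetchUser() is a hypothetical stand-in for any Promise-returning Sequelize call:

```javascript
// hypothetical stand-in for a Promise-returning database call
function fetchUser(id) {
    return Promise.resolve({ id: id, name: 'Alice' });
}

// style 1: then() chaining - each callback runs when the previous Promise resolves
fetchUser(1)
    .then(user => user.name)
    .then(name => console.log(name));

// style 2: async/await - the same flow written as sequential code
async function printUserName() {
    const user = await fetchUser(1); // pauses this function until the Promise resolves
    console.log(user.name);
}
printUserName();
```

Both styles produce the same result; the await version simply reads top to bottom like blocking code.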

Accessing Associations via Properties

If you have objects associated to each other such as a BlogPost and all Comments people left on that BlogPost, you can access all Comments if you have a reference to the BlogPost object just by calling a property on the object.

This means you do not have to explicitly query for the Comments; instead, Sequelize will lazy load the Comments for you in the background.

var comments = await blogPost.getComments();

Again, the Promise is resolved by the await keyword and the comments variable contains the loaded comments.

Associations

How do you connect a BlogPost to a Comment for example?

Remember that index.js will call the associate() function of each model during initialization? The associate() function is where you define the associations for each model.

You should consult the Sequelize documentation to find the type of association that best describes the situation you want to model.

As an example, if a BlogPost should be associated to many Comments, define:

BlogPost.associate = function (models) {

    models.BlogPost.hasMany(models.Comment, {
        foreignKey: 'blogpostid',
        targetKey: 'id'
    })

};
Not creating Objects in Associations

I have not fully understood why, but calling setters will automatically insert new objects into the database unless you use the save: false option:

transaction.setSource(targetAccountObject, { save: false });
Updating Objects

If you want to change the amount in an account, you have to change the value and trigger an update:

amount.amount += 299;
await amount.save();

Here, the amount is increased by 299 and await is used to wait for the save() call, which updates the object in the database.

Another way is to use update():

await amountObject.update({
    amount: newAmount
});
Querying Objects

The findByPk() convenience method allows you to query by primary key.

var result = await db.RealAccount.findByPk(amountObject.realaccountid);

General queries are executed using findAll() and a where clause:

var result = await db.Amount.findAll({
    where: {
        virtualaccountid: targetAccountObject.id
    }
});

 

Creating an Angular Application

Generating the Application using ng new

Angular uses the ng tool to generate all sorts of code for you. In fact, ng is used to generate all of the Angular building blocks, from individual components to the entire application.

First install the current long term support version of Node.js (which includes npm) using nvm:

nvm install --lts
nvm use --lts

Alternatively use the latest release:

$ nvm ls
v8.11.2
        v8.12.0
         v9.3.0
       v10.14.2
       v10.15.3
       v10.16.0
        v11.4.0
        v12.0.0
        v12.2.0
       v12.13.1
       v13.11.0
       v14.17.0
->     v14.17.3
        v16.1.0
         system

Now install and use the latest version:

$ nvm install v16.1.0
$ nvm use v16.1.0

ng is added to your system by installing the Angular cli globally using the node package manager npm:

npm install -g @angular/cli

You can check the angular CLI version:

ng version

At the time of this writing, the version Angular CLI: 12.1.3 is current.

The global ng installation is used to generate a new application:

ng new <APPLICATION_NAME> --style=scss --routing

This will create a folder called <APPLICATION_NAME> in the current working directory containing the new project. It will use the Sass preprocessor for CSS stylesheets with the SCSS syntax, and it will automatically set up routing.

You can also go through an interactive process where the Angular CLI asks you about all options before creating the project.

ng new <APPLICATION_NAME> --interactive

Once a project has been generated using the global version of the Angular CLI, you change into the project folder, and from there on out the local version of the Angular CLI, as specified in the package.json, is used. This means that even when the global Angular CLI is updated to a newer version, your application will not break, because it locally uses the version specified in the package.json file.

This helps project stability: you can update the global Angular CLI version for new projects while keeping the local Angular CLI version, preventing the Angular application from breaking due to version differences.

Additional Dependencies

For the NgRx store architecture:

npm install @ngrx/store --save
npm install @ngrx/effects --save

Starting the Application

You can open the application folder using Visual Studio Code. Within Visual Studio Code, open a terminal using the Terminal menu in the menu bar and the New Terminal item within it.

To start the application, type npm start which will internally call ng serve.

npm run start

You can also call ng serve directly

ng serve

The application is running at http://localhost:4200/

Adding a Module

Angular is particularly useful because it has a rigid scheme for organizing code, which benefits structure and ultimately the quality of your application code in the long run.

That scheme consists of the use of Typescript Modules which contain components.

There are two kinds of modules: the app module and feature modules.

Initially, ng generates a module which is the app module containing the app component, which is used as a starting point for the application (called bootstrapping).

Starting from the main property inside the angular.json file, a main.ts TypeScript file is configured. Webpack will execute this main.ts / main.js file when starting the application.

Inside main.ts, a call to bootstrapModule() is made and the app module (AppModule) is specified as a parameter.

Looking into app/app.module.ts, you can see the bootstrap property of the @NgModule decorator. It contains the AppComponent. That means that when Angular starts the app module, it will initialize the AppComponent first and use it as an entry point into the application.

Feature modules can be added to the application to extend it with new functionality. For each separate feature, you create a new feature module so that the feature modules stay independent from each other (strong cohesion, weak coupling). A module will contain all components, services, interceptors, decorators, pipes, model classes and everything else needed to make a feature work as an independent unit.

Move common functionality such as utilities into their own modules to reuse them from several modules.

The angular CLI allows you to add a module:

ng generate module <MODULE_NAME>

To generate a module that houses all user-related code, execute

ng generate module user

Note that the name was chosen to be user and not user-module or similar. The Angular CLI will automatically generate a user folder and a user.module.ts file; the CLI appends the module suffix to the generated files for you!

Lazy Loaded Feature Modules

Documentation is: https://angular.io/guide/lazy-loading-ngmodules

A word of advice: lazy loaded feature modules have been updated in newer versions of Angular. This article shows console output from version 12. If your output differs, consider migrating to the newest Angular version.

When you want a lazy loaded module, do not import the module into the app module; instead use the router. The idea behind a lazy loaded module is to only load it into memory when the router navigates to the module.

Therefore the router is used to load the module when a certain route is visited. To set up the lazy loading, update the file app-routing.module.ts:

const routes: Routes = [
    {
        path: 'projects',
        loadChildren: () =>
            import('./projects/projects.module').then((mod) => mod.ProjectsModule),
    },
];

Here, once the path ‘projects’ is followed, the router will execute the import() function which loads the lazy loaded projects module in this case.

The question is: which component will be rendered when the path is visited? The code in the app-routing.module.ts file does not specify a component to load within the projects module! The projects module itself will again contain a routing configuration file called projects-routing.module.ts which specifies all of its routes and components.

The file looks like this:

import { NgModule } from '@angular/core';
import { Routes, RouterModule } from '@angular/router';

import { ProjectContainerComponent } from './components/project-container/project-container.component';

const routes: Routes = [
    {
        path: '',
        component: ProjectContainerComponent,
    },
];

@NgModule({
    imports: [RouterModule.forChild(routes)],
    exports: [RouterModule],
})
export class ProjectsRoutingModule {}

One last change is necessary: in the lazy loaded feature module, import the ProjectsRoutingModule from the projects-routing.module.ts file and add it to the imports of the feature module so it partakes in the routing:

import { NgModule } from '@angular/core';
import { CommonModule } from '@angular/common';
import { ProjectContainerComponent } from './components/project-container/project-container.component';

import { ProjectsRoutingModule } from './projects-routing.module';

@NgModule({
    declarations: [
        ProjectContainerComponent
    ],
    imports: [
        CommonModule, ProjectsRoutingModule
    ]
})
export class ProjectsModule { }

When you start your app, you should see a listing of chunks. Chunks are the files containing the application code that are ultimately downloaded to the client to run your app. You should see your lazy loaded module listed as a Lazy Chunk File, as opposed to the Initial Chunk Files for eagerly loaded modules.

Initial Chunk Files                    | Names         |      Size
vendor.js                              | vendor        |   2.39 MB
polyfills.js                           | polyfills     | 128.55 kB
runtime.js                             | runtime       |  12.51 kB
main.js                                | main          |   9.50 kB
styles.css                             | styles        | 118 bytes

                                       | Initial Total |   2.54 MB

Lazy Chunk Files                       | Names         |      Size
src_app_projects_projects_module_ts.js | -             |   5.81 kB

Adding a Service

To add a service into a module, you can use the Angular CLI.

ng generate module auth
cd src/app/auth
ng generate service services/auth

This will create the auth.service.ts file inside the auth module.

import { Injectable } from '@angular/core';

@Injectable({
    providedIn: 'root'
})
export class AuthService {

    constructor() { }
}

@Injectable with providedIn: 'root' means that the provider for this service is added to the root provider. This is documented here: https://angular.io/guide/providers

A provider is responsible for retrieving an instance for a dependency used in dependency injection.

Building an Angular Example App

We will build an application called ‘cash’ that helps you organize your finances. The code is available on github.

The cash application manages accounts.

Real Accounts mirror an existing account that you own within your bank institution.

Virtual Accounts are accounts that you can create within cash and which do not exist in your bank institution.

Virtual Accounts can have zero or one real accounts connected to them.
That means virtual accounts can exist that are not backed by a real account. A real account can be connected to at most one virtual account;
it is not possible to connect a real account to two virtual accounts.

If a virtual account is connected to a real account, the real account is hidden by the virtual account. The amount of money that is stored in the real account is now accessible only via its virtual account.

The way the cash application is structured is that whenever you add a real account to the cash application, cash will automatically create a virtual account for you and connect it to the real account.

You basically only work with virtual accounts when you use the cash application. The layer of virtual accounts hides the layer of real accounts beneath it. You only interface with virtual accounts.

You can create any number of virtual accounts. For example, if you want to structure your income, or if you want to save up some money for a new PC, you can create a virtual account for your savings and transfer money between virtual accounts onto your savings account.

That way your income is subtracted by the amount of money you want to save and you can clearly see how much money there is left to spend after putting away the saved amount of money.

Money is transferred between accounts in transactions. A transaction starts at at most one virtual account and ends at at most one other virtual account. A transaction transfers a non-negative amount of money and has a date as well as a description. Transactions where source and target are the same virtual account are not allowed.

For a transaction, there are the following cases:
A) Source Virtual Account (SVA) and Target Virtual Account (TVA) both are not backed by real accounts.
B) SVA is backed by a real account but TVA is not.
C) SVA is not backed by a real account but TVA is.
D) SVA and TVA are both backed by real accounts.

There are special cases:

Incoming amounts of money are transactions without a source account.
Expenses are transactions that have no target.

E) There is no SVA. The transaction adds income to your bank accounts.
F) There is no TVA. The transaction denotes an expense you made.

If SVA and TVA are both backed by real accounts (case D), then the money of the transaction is also transferred between the real accounts.

If SVA and TVA are both not backed (case A) the real accounts are not altered.

If there is a sequence of transactions of cases B – A* – C, then the money that made it from the real source account to the real target account is also transferred between the real accounts.

B – A* – C means that the sequence starts with a transaction of type B, then there are arbitrary many transactions of type A, then the sequence ends with a transaction of type C. That means, money travels over several hops over virtual accounts and in a global perspective between real accounts.
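The six cases can be captured in a small classification function. This is a sketch, not code from the repository; the backed flag and the account shape are assumptions made for illustration:

```javascript
// Classify a transaction into the cases A-F described above.
// sva / tva are the source and target virtual accounts (or null for
// income/expense), and account.backed tells whether a real account
// backs the virtual one.
function classifyTransaction(sva, tva) {
    if (!sva) return 'E'; // income: no source virtual account
    if (!tva) return 'F'; // expense: no target virtual account
    if (sva.backed && tva.backed) return 'D'; // money also moves between real accounts
    if (sva.backed) return 'B';
    if (tva.backed) return 'C';
    return 'A'; // purely virtual transfer, real accounts untouched
}
```

A sequence B – A* – C would then show up as a run of these labels over consecutive transactions.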

The amount of money in an account over a time frame can be displayed as a graph.

Transactions are stored in a transaction log which you can open and inspect at any time.

After every transaction, ‘cash’ will compute the amount of money in all real accounts and the money in all virtual accounts. The two amounts of money have to be the same at all times.

Let’s implement this as an Angular application. It might be a bit ambitious, but it is a useful example that you can really make use of in your real life.

CRUD for Accounts

The first step will be CRUD for accounts. CRUD stands for Create, Retrieve, Update and Delete. I will use the acronym for an application that provides a user interface and backend code that allows the user to manage items (Create, Retrieve, Update and Delete). The user interface will consist of forms that allow the user to input data for accounts or edit existing accounts. The backend code will persist the accounts to a SQL datastore using Express and the excellent Sequelize SQL-ORM mapper.

First a SQL schema is created to manage real and virtual accounts. Then an API is created for the account CRUD operations in the backend. Once the backend and data storage are functional, a Form is needed to add new accounts. A list is needed to display existing accounts for editing, inspection and deletion.

Accessing Virtual Account via the API

First, let’s create a service that covers the API calls. Creating services is done via the Angular CLI.

ng generate service <service_name>
ng generate service Account
ng generate service AccountDetails

If you want an AccountService, you have to generate it with the name ‘Account’. ng will append the Service suffix for you.

A Buddy System Memory Allocator Explained

Introduction

When paging is used by an OS, the OS can provide pages to a process that needs memory. Pages are often 4 kB in size (the size can vary). If a process needs only a few bytes, handing out an entire page is a waste of space.

To use memory more efficiently, another layer of memory management on top of paging is used by operating systems. That layer can be implemented by using a heap. A heap provides the malloc() and free() primitives which allocate memory of a certain size and also allow an application to return memory to the operating system when it does not need it any more.

One well-known algorithm for heaps is the buddy system described by Donald E. Knuth in his first Volume of The Art of Computer Programming.

This article contains my notes on the Buddy System Allocator implementation by evanw. The implementation can manage a linear address space starting at an arbitrary location. It speeds up operations by maintaining a binary tree of the buddy memory areas. The tree allows for fast split and merge operations because it naturally encodes the buddy relationship between memory areas by storing buddies as siblings in the binary tree.

Content

This is an explanation of the allocator code available here:
https://github.com/evanw/buddy-malloc/blob/master/buddy-malloc.c

The allocator manages a linear address space with malloc() and free() operations to allow the reservation and the return of memory.

The underlying algorithm is the buddy memory system.
Buddies are a pair of two adjacent memory regions.

Start and End of the memory space

The allocator will work on a linear block of memory which can be extended when more memory is needed.

The start of the linear block is marked by base_ptr:

static uint8_t *base_ptr;

The current end of the linear block is marked by max_ptr (a.k.a. the break):

static uint8_t *max_ptr;

The method

static int update_max_ptr(uint8_t *new_value)

is used to advance the max_ptr in order to reserve additional memory for the heap. update_max_ptr() uses the brk() system call to retrieve more memory from the operating system. brk() stands for break which is a common term used for the current end of the heap.

The circular linked list datatype

Again this is the element definition:

typedef struct list_t {
    struct list_t *prev, *next;
} list_t;

List operations are:

static void list_init(list_t *list) – initializes a list so the element points to itself with both prev and next pointers which denotes an empty list
static void list_push(list_t *list, list_t *entry) – adds an element to the end of the list
static void list_remove(list_t *entry) – removes an element from a list
static list_t *list_pop(list_t *list) – removes the element at the end of the list
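The four list operations can be sketched as follows. This is a reconstruction from the descriptions above, not the original code from the repository:

```c
#include <stddef.h>

/* Element of a circular doubly linked list. */
typedef struct list_t {
    struct list_t *prev, *next;
} list_t;

/* An empty list is a single element that points to itself. */
static void list_init(list_t *list) {
    list->prev = list;
    list->next = list;
}

/* Add entry at the end of the list (just before the head element). */
static void list_push(list_t *list, list_t *entry) {
    list_t *prev = list->prev;
    entry->prev = prev;
    entry->next = list;
    prev->next = entry;
    list->prev = entry;
}

/* Unlink entry from the list it is currently in. */
static void list_remove(list_t *entry) {
    list_t *prev = entry->prev;
    list_t *next = entry->next;
    prev->next = next;
    next->prev = prev;
}

/* Remove and return the element at the end of the list,
 * or NULL if the list is empty. */
static list_t *list_pop(list_t *list) {
    list_t *back = list->prev;
    if (back == list) return NULL; /* only the head element is left */
    list_remove(back);
    return back;
}
```

Note how the head element never stores payload; it only anchors the ring, which matches the "first element is just the list_t element" remark below.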

There is one list per bucket to manage all the free elements that have the bucket’s size.
The buckets are an array of list_t structures.

// array of lists
static list_t buckets[BUCKET_COUNT];

buckets[0] —> circular linked list of free blocks of whole memory space
buckets[1] —> circular linked list of free blocks of 1/2 memory space
buckets[2] —> circular linked list of free blocks of 1/4 memory space
buckets[3] —> circular linked list of free blocks of 1/8 memory space
buckets[4] —> circular linked list of free blocks of 1/2^4 memory space
buckets[5] —> circular linked list of free blocks of 1/2^5 memory space

The list_t structures in the bucket are part of their respective linked list, but they are treated specially because they are merely elements to manage those lists, not elements that denote a free block of memory.

If a bucket of size 2 ^ 4 currently manages five blocks of free memory, the circular linked list actually contains six elements.
The first element in the list is just the list_t element that is used to point to the list.

The linearized bit tree

The linearized bit tree is a binary tree, where every node has zero or two children.

Linearized means that the tree is stored in memory as an array of bits.
It is not stored as nodes that store pointers to other nodes to form a tree.

The tree stores a bit per node (except for it’s leafs (therefore BUCKET_COUNT – 1) because leaves cannot be split).
A node stands for a block of memory.
The state of that block of memory is encoded in the bit.
A value of 0 means that the block was not split.
A value of 1 means that the block was split.

The array that stores the linearized binary bit tree

The array that stores the tree is called node_is_split:

static uint8_t node_is_split[(1 << (BUCKET_COUNT - 1)) / 8];

BUCKET_COUNT is the number of buckets, which are lists of free memory blocks of certain sizes. The sizes are powers of two.
BUCKET_COUNT - 1 is used because the leaves (the smallest possible blocks) can never be split, so that information does not have to be managed.

1 << (BUCKET_COUNT - 1) is the number of memory areas that the tree has to keep track of at most.
The division by 8 yields the number of bytes that are required to store all the bits.
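With a small, hypothetical configuration this arithmetic can be checked directly; for BUCKET_COUNT = 6 the bitmap needs (1 << 5) / 8 = 4 bytes:

```c
#include <stdint.h>

/* Hypothetical configuration, just to illustrate the sizing formula. */
#define BUCKET_COUNT 6

/* One bit per inner node: 1 << (BUCKET_COUNT - 1) = 32 bits = 4 bytes. */
static uint8_t node_is_split[(1 << (BUCKET_COUNT - 1)) / 8];
```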

Finding children, parents and siblings

Because there are no pointers between nodes and a node is represented by a single bit,
how does the tree know where the child nodes of a parent node are in the bit array?

The answer is that there is a convention on how to compute the indexes of the left and right children given the index of a parent:

Left child: index = index * 2 + 1;
Right child: index = index * 2 + 2;

Example: the root node has an index of 0.
Using the formula above, its left child has an index of 1 and its right child has an index of 2.

The left root child has an index of 1.
Its left child has an index of 3, its right child has an index of 4.

The right root child has an index of 2.
Its left child has an index of 5, its right child has an index of 6.

Going from a child to a parent is achieved by the formula:
index = (index - 1) / 2;

For example, the child with index 6 has a parent with an index of (6 - 1) / 2 = 5 / 2 = 2.
Remember that this formula is computed on integer datatypes, where the fractional part is truncated without any rounding. Therefore 5 / 2 results in the value 2, not 2.5 or 3.

The child with the index 2 has a parent with the index 0.

Going from a node to its sibling is achieved by:
index = ((index - 1) ^ 1) + 1;

^ 1 inverts the last bit in the number

Nodes 5 and 6 are siblings.

Putting 5 into the formula yields (binary numbers are flagged with a 'b' suffix):
((5 - 1) ^ 1) + 1 = (4 ^ 1) + 1 = (100b ^ 1b) + 1b = 101b + 1b = 110b = 6

Putting 6 into the formula yields:
((6 - 1) ^ 1) + 1 = (5 ^ 1) + 1 = (101b ^ 1b) + 1b = 100b + 1b = 101b = 5
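The three index formulas can be wrapped in tiny helper functions; a sketch (the helper names are mine, not from the original source):

```c
#include <stddef.h>

/* Index of the left child of node i in the linearized tree. */
static size_t left_child(size_t i)  { return i * 2 + 1; }

/* Index of the right child of node i. */
static size_t right_child(size_t i) { return i * 2 + 2; }

/* Index of the parent of node i (integer division truncates). */
static size_t parent(size_t i)      { return (i - 1) / 2; }

/* Index of the sibling of node i: ^ 1 flips the last bit. */
static size_t sibling(size_t i)     { return ((i - 1) ^ 1) + 1; }
```

For example, sibling(5) yields 6 and sibling(6) yields 5, matching the calculations above.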

The meaning of the bit

The blocks of memory are represented as nodes in the bit tree.
The bit can either be 0 or 1. The question is what is encoded by the bit?

A value of 0 means that the block was not split.
A value of 1 means that the block was split.

If a block of memory was split, the respective node in the tree has two children: a left child and a right child.

Converting an index to a pointer to the respective block

An index in the tree denotes a memory block.
The operation that computes a pointer to that block is:

static uint8_t *ptr_for_node(size_t index, size_t bucket) {
     return base_ptr + ((index - (1 << bucket) + 1) << (MAX_ALLOC_LOG2 - bucket));
}

The first parameter is the index; the second parameter is the bucket that the memory block denoted by the index belongs to.

The formula that the function uses is:

base_ptr + ((index - (1 << bucket) + 1) << (MAX_ALLOC_LOG2 - bucket))

It computes an amount of bytes (an offset), then adds that amount to the absolute base_ptr to obtain an absolute pointer again.

The offset is the number of bytes located before the start of the block denoted by index. This is the definition of an offset. Adding the offset to base_ptr yields an absolute pointer to the start of the block.

The amount of bytes (offset) is computed by:

(index - (1 << bucket) + 1) << (MAX_ALLOC_LOG2 - bucket)

The binary left shift operator is the same as repeatedly multiplying by two.
The number that is multiplied by two is:

[1] index - (1 << bucket) + 1

The number of times to multiply by two is

[2] MAX_ALLOC_LOG2 - bucket

Looking at formula [1], it computes the zero-based index of the block among all its siblings at that level of the binary tree.

Given the tree

          -------------------------
bucket 0: |           0           |
          -------------------------
bucket 1: |     1     |     2     |
          -------------------------
bucket 2: |  3  |  4  |  5  |  6  |
          -------------------------
bucket 3: | 7| 8| 9|10|11|12|13|14|

when using the index 12 (in bucket 3) in the formula:
index - (1 << bucket) + 1
we get
12 - (1 << 3) + 1 = 12 - 8 + 1 = 5.
5 is the zero-based index of node 12 inside its level of the tree. (Count 7, 8, 9, 10, 11, 12 using zero-based indexes and you arrive at node 12 when you have counted from 0 to 5. 5 is node 12's zero-based index inside its level of the tree.)

The formula basically works by subtracting the number of nodes that precede the node's level in the tree. For node 12, there are 7 nodes in the levels before it, so 12 - 7 = 12 - 8 + 1 = 5.

Once formula [1] has determined the zero-based index among all siblings, formula [2] determines the size of a block on that level of the tree (= bucket), and this block size is multiplied by the zero-based index.

The size of a block at level/bucket n in the tree is 2 ^ (MAX_ALLOC_LOG2 - n).

(MAX_ALLOC_LOG2 - bucket) is larger the higher up in the tree the node is located. A larger number results in a larger block of bytes, since the number is used as an exponent. This makes sense, since a block higher up in the tree is twice as large as its direct children. The root node is as large as the entire memory area.

MAX_ALLOC_LOG2 is the exponent of the largest block used by the allocator's configuration. The root node in the tree, which is the largest block, has a size of 2 ^ MAX_ALLOC_LOG2 bytes. If MAX_ALLOC_LOG2 is set to 32 for example, the largest block is 2 ^ 32 bytes (4 GiB). Bucket 0 stores those blocks.

Blocks in bucket 1 are 2 ^ (MAX_ALLOC_LOG2 - 1) bytes in size.
Blocks in bucket n are 2 ^ (MAX_ALLOC_LOG2 - n) bytes in size.

This is exactly what formula [2] computes.

The number computed by [2] is used as an exponent of 2 to compute the size of the block at that level of the tree in bytes.

The bitwise shift between the results of [1] and [2] multiplies the block size by the zero-based index. The result is the offset to the start of the memory block.

Adding that offset to the absolute address stored in base_ptr yields the absolute address of the block of free memory.

The inverse operation is:

static size_t node_for_ptr(uint8_t *ptr, size_t bucket) {
    return ((ptr - base_ptr) >> (MAX_ALLOC_LOG2 - bucket)) + (1 << bucket) - 1;
}

This first computes the offset of the ptr (ptr - base_ptr).
This offset is an amount of bytes, which is then repeatedly divided by 2 by bitwise right shifting.
The number of right shifts is (MAX_ALLOC_LOG2 - bucket), which takes the size of blocks at the bucket level into account.

We arrive at the zero-based index of the node relative to its level.
The last step is to add the number of nodes in all previous levels of the tree, which is achieved by (1 << bucket) - 1.

You can see that these are all the steps of the opposite operation (outlined above), in reverse order.
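Both conversions can be exercised with a small, hypothetical configuration (MAX_ALLOC_LOG2 = 4, so the whole arena is 16 bytes and bucket 3 holds 2-byte blocks). Using the tree diagram above, node 12 in bucket 3 lies at offset 5 << 1 = 10:

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_ALLOC_LOG2 4              /* hypothetical: arena of 16 bytes */

static uint8_t arena[1 << MAX_ALLOC_LOG2];
static uint8_t *base_ptr = arena;

/* Index in the tree -> pointer to the block's start. */
static uint8_t *ptr_for_node(size_t index, size_t bucket) {
    return base_ptr + ((index - (1 << bucket) + 1) << (MAX_ALLOC_LOG2 - bucket));
}

/* Pointer to a block -> index in the tree (the inverse operation). */
static size_t node_for_ptr(uint8_t *ptr, size_t bucket) {
    return ((ptr - base_ptr) >> (MAX_ALLOC_LOG2 - bucket)) + (1 << bucket) - 1;
}
```

Note that node_for_ptr() needs the bucket as a second argument because the same address can be the start of blocks at several levels of the tree.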

Operations for managing ‘node split or not’ concerns

The operation parent_is_split() figures out if the parent of a node (NOT THE NODE ITSELF) is split or not.

static int parent_is_split(size_t index) {
    index = (index - 1) / 2;
    return (node_is_split[index / 8] >> (index % 8)) & 1;
}

The first line of the function body computes the parent index of the node.
The second line is a straightforward bitmap operation: it reads the value at a specific index of the bitmap.
The & 1 operation returns 1 if the bit is set and hence the parent node is split.
It returns 0 otherwise.

The operation flip_parent_is_split() toggles the parent’s split status.

static void flip_parent_is_split(size_t index) {
    index = (index - 1) / 2;
    node_is_split[index / 8] ^= 1 << (index % 8);
}
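Both helpers can be tried out in isolation against a small bitmap; a host-runnable sketch with a hypothetical bitmap size:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical bitmap: enough bits for the small example tree above. */
static uint8_t node_is_split[4];

/* Is the parent of the node at `index` split? */
static int parent_is_split(size_t index) {
    index = (index - 1) / 2;                       /* step up to the parent */
    return (node_is_split[index / 8] >> (index % 8)) & 1;
}

/* Toggle the parent's split bit. */
static void flip_parent_is_split(size_t index) {
    index = (index - 1) / 2;
    node_is_split[index / 8] ^= 1 << (index % 8);
}
```

Because nodes 5 and 6 are siblings, flipping the split bit via one of them is visible through the other: they share the parent node 2.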

Operations: malloc() and free()

This is the meat and potatoes of the implementation! Finally!

The Header of Blocks

There is a header at the start of each memory block. The header is always 8 bytes in size. A memory block can either be free or used.

If the memory block is free, it is part of the circular linked list of a bucket. In that case, the header contains a list_t struct. That struct stores a pointer to its predecessor free block (4 bytes) and to its successor free block (4 bytes).

If the memory block is used, malloc() stores the size of that block in the first 4 bytes of the header. The following 4 bytes are basically unused, although they still contain the 4-byte successor pointer from the list_t struct that the header contained while the block was on the free list. The size is a power of two. Given a pointer to the memory block, stepping back 8 bytes reaches the header, from which the size of the block can be retrieved. Given the size value, it is possible to compute the bucket the block belongs to, and from the bucket it is possible to compute the node index in the tree using:

ptr = (uint8_t *)ptr - HEADER_SIZE;
bucket = bucket_for_request(*(size_t *)ptr + HEADER_SIZE);
i = node_for_ptr((uint8_t *)ptr, bucket);

Once the node index is known, the block can be merged with its buddy during a call to free().

How Memory is Laid Out in the Linearized Binary Bit Tree

The implementation does not start with a root node spanning the entire available memory since, according to the author, the first request for memory would then immediately reserve the entire memory space and split memory blocks repeatedly until the first allocation can be served.

This approach is expensive, especially if the application only requests small blocks of memory. Instead, another approach is chosen.

The tree starts out with the smallest possible block size. That means its root node starts at the bucket with the highest index. The bucket with the highest index manages the smallest possible memory blocks. As a consequence, the used bits in the linearized binary tree do not start at index 0 but rather somewhere towards the end or the middle of the bitmap.

If memory is requested that does not fit into the smallest memory block, the tree is grown: the root is placed at the next bucket, which has a smaller index than the bucket before. That is why the operation is called lower_bucket_limit(). lower_bucket_limit() puts the root onto the next smaller bucket and, by doing so, doubles the amount of memory that the tree manages. lower_bucket_limit() adds free memory areas into buckets and provides more space to allocate. In the linearized bitmap, the tree uses more bits towards the start of the bitmap. That means the tree continuously grows towards the first bit in the bitmap and, if it shrinks, it shrinks back towards the end of the bitmap.

How Memory is Reserved When Growing the Heap

When the tree cannot satisfy a request for memory, it has to grow. In order to grow, at some point max_ptr has to be moved to reserve more memory. The function update_max_ptr() does exactly that. It calls the operating system using the brk() call. If brk() manages to reserve the memory, max_ptr is moved to the newly allocated border.

Implementation of malloc()

The idea of malloc() is that it receives an amount of bytes to reserve and returns a pointer to the address at which the bytes are available as a contiguous block. The pointer can later be used in a call to free(), and free() is then able to return the block back onto the free list of its corresponding bucket.

malloc() starts off by initializing the allocator if malloc() has never been called before.

malloc() then looks at the amount of bytes it has to reserve. It computes the smallest power of two that is at least as large as the amount of bytes to reserve. The computed power of two corresponds to a certain bucket. malloc() grows the tree until its root is located at a bucket that can serve the requested amount of bytes.
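The mapping from a request size to a bucket can be sketched as follows (the constants are hypothetical, and the real allocator also adds the header size to the request before doing this):

```c
#include <stddef.h>

#define MIN_ALLOC_LOG2 4                 /* hypothetical: smallest block 16 bytes */
#define MAX_ALLOC_LOG2 10                /* hypothetical: largest block 1024 bytes */
#define BUCKET_COUNT (MAX_ALLOC_LOG2 - MIN_ALLOC_LOG2 + 1)

/* Smallest bucket (i.e. highest index) whose block size can hold `request`
   bytes; assumes request does not exceed the largest block. */
static size_t bucket_for_request(size_t request) {
    size_t bucket = BUCKET_COUNT - 1;    /* start at the smallest block size */
    size_t size = (size_t)1 << MIN_ALLOC_LOG2;
    while (size < request) {             /* double until the block fits */
        bucket--;
        size *= 2;
    }
    return bucket;
}
```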

malloc() contains a while loop which iterates over all buckets from the highest index towards the lower indexes. That means it iterates from small buckets to ever larger ones until it is able to locate a matching block to serve the request.

For each bucket index, it will call lower_bucket_limit() to grow the tree to the current bucket size.

NOTE: TODO: I have not understood the rest of the method completely. It kind of walks up the tree, then splits nodes down again until it finds a matching block. If a matching block is found, it writes the size of the block into the header, overwriting the linked list struct that was stored in the header before. It then returns a pointer to just after the header to the user as the return value of malloc(). The user can then use the space and return the block later by passing the pointer as a parameter to free().


Implementation of free()

TODO

Connecting to the Uhlenbrock Intellibox I using a USB Serial Cable

The cable I used is the one from CableCreation (link). It is a 2 meter cable with nice golden contacts and an FTDI chipset; FTDI supposedly is the most capable manufacturer of serial equipment.

The cable is detected as a 2xx chipset, so I downloaded the 2xx Windows driver for the cable from their drivers page. I installed the 64-bit build of version 2.12.28 of the Windows driver.

After installing the drivers, use the Windows Device Manager and unfold the COM port section of the device tree to check which COM port is created when plugging in the cable.

In your model train software package, when adding the Uhlenbrock Intellibox I, use that COM port as the connection option. Your PC should now be able to send commands from the model train software through the USB-to-serial cable to the Uhlenbrock Intellibox I.

My first test was to control some of the switches and to control one of the engines. It worked just fine. As model trains tend to have hundreds of options to control, it is hard to say if every last option will work as planned for you but the first test looked promising at least.

“–Kurzschluss Booster–“

The Intellibox I from Uhlenbrock shows "–Kurzschluss Booster–" (booster short circuit). What to do?

Before you hunt for short circuits on your layout, first disconnect the boosters and the Intellibox I from your layout and connect them to a very small system that only connects the boosters to the Intellibox I but nothing else. No trains, signals, rails, rolling stock or anything else. If this small subsystem still shows the error, you know that there is no short circuit caused by your layout or anything on it but the settings in either the boosters or the Intellibox are incorrect.

If the small subsystem still shows the error, try the solution outlined here. For me, solution option A did the trick. It tells you to reset the Sonderoption (special option) 25 to the value 0 and the Sonderoption 907 to the value 1.

The peculiar thing is that the error is described as the Intellibox I sending DCC commands to a Märklin Booster 6017, which signals a short circuit back to the Intellibox because it cannot correctly process the DCC signals. My dad swears that he used the system in Motorola/DCC mixed operation for two years and the problem never occurred. I do not know what to think about all this, but solution A solved it, at least for now.

The following text is taken from the Uhlenbrock FAQ and is not mine! I copied it in case the Uhlenbrock FAQ goes away. The Uhlenbrock FAQ is awesome and spot on, and it would be a shame if it were lost one day. Translated from German:

Problem/Question:
When the Intellibox is operated with a Märklin 6017 booster, the following error message appears after switching on the layout:
  "--Kurzschluss Booster--".
Note:
Case A - You only use Motorola decoders

It is likely that, although you only use Motorola decoders, you have nevertheless called up a DCC locomotive at least once "just to try it out".

As a consequence, the Intellibox has automatically set Sonderoption 25 to the value 1 and Sonderoption 907 to the value 5. This means that from now on, when the layout is switched on, a DCC signal is also transmitted, which the 6017 booster interprets as a short circuit.
Solution:
  Please reset Sonderoption 25 to the value 0
  Please reset Sonderoption 907 to the value 1

If a short-circuit error message still occurs despite this change, please also change the value of Sonderoption 36 to a value larger than the factory setting of 20, e.g. to the value 100.
Note:
Case B - You also run DCC decoders in mixed operation.
Solution:
Sonderoption 901 has probably not been increased as described in the Intellibox manual. Sonderoption 901 should be set to a value of 3 to 5.

If a short-circuit error message still occurs despite this change, please also change the value of Sonderoption 36 to a value larger than the factory setting of 20, e.g. to the value 100.

Likewise, set DIP switches 1 and 2 on the Märklin booster to "on".

Quick guide: Resetting a Sonderoption to its factory setting

Select the Sonderoption, move the cursor to the right-hand side and press the "C" key until the factory setting of the Sonderoption appears in the display again. Then press the [enter] key.


Test or It Did Not Happen

What if a programmer would strictly act by the following simple rule:

Untested code does not exist!

What I mean by 'does not exist' is: what if the programmer just pretended that the untested code is not available to him. He acts as if the code literally is not there! In other words, he ignores it when he downloads the code from his source repository. And he will not perform a code review on the untested code, even if a code review was assigned to him, because the code does not exist in his mind! The programmer would just act as if the code was not there. He could not see it, smell it, use it, modify it, compile it, ship it nor deploy it.

Besides probably getting into trouble with his boss, this would make the programmer's life substantially easier.

I assume that this programmer writes tests for all of his own code, so he is able to validate that his own code does work; in other words, his own code does in fact exist! Now, I know this is not a very realistic assumption, because code is often hard to test and often there are time limitations or financial limitations. Would you work for free writing tests if those tests are not paid for by the customer?

Now imagine the programmer is responsible for an application that others contribute code to. If other programmers commit erroneous code, the programmer is the one who has to fix those bugs, not the one who committed the buggy features. At this point, if the programmer receives untested code, he would just act as if this code did not exist. If he is assigned the code review, he will say: "Which code do you mean? I can't perform a code review of code that does not exist! I do not see any code in this merge request you assigned to me! Please check your code again and let me know when your tests are ready!". As a consequence, the untested code won't be merged!

If someone else is dumb enough to merge the code, the programmer could just say: "Well, the code that is supposed to implement the much requested feature X obviously does not exist, since there are no unit tests! I will have to implement feature X again! Such a shame!". He would then take the time to write the code and test the code. He is then convinced that his code works and can ship quality code to the customer.

How do you even come up with this concept of "code does not exist"? Well, think about it. If someone sends you code and there is no unit test for it, how do you know that the code works? It is possible that the author of the untested code did not run the code even once! It is possible that the code does not even compile! It is possible that the code is not even a real computer program! Parts of the code might be missing! Without a test, you actually know nothing about the characters in the text file you have received! Not only can you not trust the code, there is no proof of the functionality of the code! The code does not even exist!

I am sure I am breaking a few rules of logic here and there, but just imagine how much better your life would be, if you could just pretend untested code did not exist!

Splint – C Static Code Analysis

Introduction

Static code analysis tools such as PMD, FindBugs and Checkstyle for Java, or FxCop and StyleCop for C#, are a great way to learn about the language you are using, to form a uniform style for all team members and, the original reason, to improve the code and get rid of bugs before they make it to production.

Static means that the code is checked and not the running program. Static code analysis is limited because it does not deal with code that processes real data at runtime. Static code analysis can however detect a lot of things that could easily cause a lot of trouble such as missing break statements, uninitialized or unused variables and the like.

Splint is a free, static code analyzer for C that can be easily installed on Ubuntu and is very easy to use after installation.

Installing Splint

Use the apt package manager:

sudo apt-get install splint

Using Splint

Execute splint and pass the files to check as parameters.

splint main.c

You should also listen to your compiler. -Wall -Wextra will make gcc output a lot of warnings. You should decide in your team which warnings you want the compiler to output, and then, across those selected warnings, you should have a strict zero-warning policy. What is a warning worth if it is ignored? It is better to suppress warnings that your team decides not to fix and then fix all the relevant warnings. I am not talking about compiler errors, because code that causes compiler errors won't compile in the first place.
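As a small illustration of the kind of defect these tools catch before runtime, the following function compiles and runs fine but draws warnings from splint and from gcc with -Wall -Wextra (the code is a made-up example, not from any real project):

```c
#include <stdio.h>

/* `gcc -Wall -Wextra` warns about the unused parameter `mode` and the
   unused variable `tmp`; splint additionally flags the ignored return
   value of printf. None of this stops compilation. */
int print_greeting(int mode) {
    int tmp = 42;               /* warning: unused variable */
    printf("hello\n");          /* splint: return value ignored */
    return 0;
}
```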

Multiboot Memory Map

Read this article to learn about common pitfalls.

The multiboot specification contains a memory map. The memory map has some characteristics that are hard to understand.

A trick for debugging is to type c when GRUB shows the selection screen to enter a console mode. Once in console mode, type lsmmap to tell GRUB to print the multiboot memory map.

The Memory Map Contains More Addresses than are Physically Available!

The memory map describes more addresses than are actually available. Some of the addresses in the memory map are not backed by physical memory! Those areas are marked as reserved.

For example if you start qemu and emulate a machine with 1G of RAM,

qemu-system-i386 -m 1G -cdrom image.iso

the memory map will contain entries with addresses for up to 4G. The memory areas in the memory map that are not backed by physical memory are marked as reserved. That means if you ignore all reserved areas in the memory map and only concentrate on the available memory areas, you will have no problems. The areas marked as available are always backed by physical memory, as far as I can tell from my personal experiments.

Available Memory Is Not The Same As Free Memory!

The kernel can read and use that memory map to find available regions of physical memory. Available means usable by the BIOS, the bootloader and the operating system! All these components make use of the available memory, so some of the available memory will already be occupied by the time your operating system takes over control from the bootloader! "Available" and "not occupied" or "free" are two different things! The information that the multiboot memory map gives you is about theoretically available memory, not about free memory. That memory might already be filled with important data!

Lower 1 Megabyte

For GRUB, a memory area is available if you have access to it and can write into it. The lower megabyte of RAM typically contains things like the video RAM at 0xB8000, placed there by the BIOS. You can choose to store data in this section of memory, but being video memory, it will write those characters to the screen! You should just not use this part of the memory although it is available RAM! GRUB does not care! GRUB will just tell you that this area of memory is available.

The lowest megabyte also contains the interrupt vectors, amongst other things.

A good approach is to treat the first megabyte of RAM as memory that is in use and to just start using the available memory from 1 MB upwards.

Dealing with memory occupied by GRUB Modules

If you tell GRUB to load modules, those modules are loaded into available memory areas by GRUB. GRUB still says those memory areas are available! You can decide to reuse that space and just write new data over the modules rendering the modules unusable. GRUB does not care!

That means you have to take a close look at the modules section of the multiboot information data structure to learn which locations in available memory are occupied by the modules, in order to not accidentally write data into those sections.

Creating an ISO with GRUB and Your kernel as an elf File

Source Material and Usage

The following article contains a very good explanation about how to create an iso image that contains the GRUB bootloader to load your own kernel which has the elf file format.

Just type make in the folder. It will run the makefile that creates an iso image. You should create a cross compiler for 32-bit x86 code using the System V ABI for the kernel binary to be created correctly. See this article.

You can then run that iso image on Ubuntu with either VirtualBox, bochs or qemu.

For qemu, first install qemu:

sudo apt-get install qemu
sudo apt-get install qemu-system-i386

Then start qemu using the iso file:

qemu-system-i386 -m 2G -cdrom image.iso

For VirtualBox, you can install VirtualBox via the Ubuntu activities search box. Then in the graphical user interface create a virtual machine and configure it to have your iso loaded in the cdrom drive and start it.

Bootloader and Kernel

The GRUB bootloader implements the Multiboot specification. The multiboot specification describes how a bootloader and a kernel can interact with each other. Any bootloader implementing the Multiboot specification can load any operating system that also adheres to the Multiboot specification.

What does it mean for an operating system to be multiboot compliant? The specification, in section "3.1 OS image format", states that the operating system kernel binary must contain a Multiboot header structure in its first 8192 bytes. The structure must be contained in the text segment. Only if a multiboot bootloader finds this structure when scanning the text segment will it recognize the binary as a kernel and, for example, list it in the list of bootable operating systems.

Do not confuse the Multiboot header structure with the Multiboot Information Data structure. The Multiboot header is part of the OS kernel binary and is a service for the bootloader by the kernel, whereas the Multiboot Information Data structure is passed to the kernel main function as a parameter by the bootloader and is a service by the bootloader for the operating system.

Multiboot Information Data Structure

The Multiboot specification under section “3.3 Boot information format” says

Upon entry to the operating system, the EBX register contains the physical address of a Multiboot information data structure, through which the boot loader communicates vital information to the operating system. The operating system can use or ignore any parts of the structure as it chooses; all information passed by the boot loader is advisory only.

The Multiboot Information Data structure is a way to transfer information from the bootloader into the kernel. This way, the kernel can learn for example how much physical memory is available.

To use it, you must define the structure. The easiest way to define the structure is to insert the official header file into your codebase. The bottom of the multiboot specification contains the header file in raw text form along with a small example operating system that shows how to use that structure.
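A sketch of walking the memory map, using the entry layout from the specification's multiboot.h. One pitfall the iteration has to respect: each entry's size field does not count itself, so you advance by size + 4. The buffer here is a fabricated stand-in for what the bootloader would pass via mmap_addr / mmap_length:

```c
#include <stdint.h>
#include <string.h>

/* Entry layout as in the specification's multiboot.h. */
struct __attribute__((packed)) multiboot_mmap_entry {
    uint32_t size;   /* size of the rest of the entry, NOT counting this field */
    uint64_t addr;   /* start address of the region */
    uint64_t len;    /* length of the region in bytes */
    uint32_t type;   /* 1 = available; treat everything else as reserved */
};

/* Sum the bytes of all available regions in a memory map buffer.
   In a real kernel, buf and buf_len come from the multiboot info
   structure; here they are passed in explicitly for testing. */
static uint64_t available_bytes(const uint8_t *buf, uint32_t buf_len) {
    uint64_t total = 0;
    const uint8_t *p = buf;
    while (p < buf + buf_len) {
        struct multiboot_mmap_entry e;
        memcpy(&e, p, sizeof e);          /* copy to avoid alignment issues */
        if (e.type == 1)
            total += e.len;
        p += e.size + sizeof(e.size);     /* size does not count itself */
    }
    return total;
}
```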

Kernel Start Address – At which address will the kernel be loaded by GRUB?

There is no fixed value, it depends on what the elf file instructs GRUB to do.

The kernel binary created from the article is an elf file. The elf file format contains the address at which it expects to be loaded. After loading, the elf file should reside at the requested address in virtual/physical memory so that all absolute addresses in the application actually point to the correct locations.

GRUB is capable of loading elf binary files. If it loads an elf binary, it will load it at the physical address that the elf file requests. The linker script that creates the elf file looks like this:

OUTPUT_FORMAT(elf32-i386)
ENTRY(start)
SECTIONS
 {
   . = 1M;
   .text BLOCK(4K) : ALIGN(4K)
   {
       *(.multiboot)
       *(.text)
   }
   .data : { *(.data) }
   .bss  : { *(.bss)  }
 }

It first tells the linker to create an elf binary. Then, inside the SECTIONS block, it lets the text segment start at 1M, which is one megabyte = 0x00100000. The elf file will now specify that it wants the text segment to be loaded at 1M. The text segment contains the kernel's executable code. So the kernel is loaded to 1M by GRUB in this example. You could also choose another physical address to load your kernel to.

Kernel End Address – How does one know how large the kernel is and where usable memory starts

The kernel itself is a binary and is placed into memory. As the binary has a certain size in bytes, it will take up a certain amount of memory. So far we know where the kernel binary starts in memory (defined by the linker, contained in the elf binary).

How do we figure out where the kernel binary ends and where free space after the kernel starts? Another thing to keep in mind is that the kernel binary keeps getting larger the more features, and therefore code, you add to your kernel codebase. It is inconvenient to make an assumption about an upper bound on the kernel's size.

Also, how does a bootloader know how many sectors to copy from the elf binary into RAM? If the bootloader copies too many sectors, that is not a problem aside from a waste of space. If the bootloader copies too few sectors, only part of the kernel is available in RAM, which has fatal consequences. Your functions will work just fine until, in the midst of execution, your code exhibits incorrect behaviour, such as not outputting the log statements you expect, or fails completely. The reason is that the instruction pointer just moves into parts of the memory that should contain more kernel code but have not been loaded by the bootloader.

GRUB as a bootloader determines how large your kernel is by looking at the metadata in the elf file. You do not have to worry about GRUB. If you write your own custom bootloader, that is a problem you have to solve. Also, if your kernel is a flat binary file and not an elf binary with metadata, how does GRUB or your custom bootloader know how many sectors to load into RAM to load the entire kernel?

A problem you have to solve in your kernel (not in the bootloader) is to figure out where the first address is that can be used to store data in RAM (= placement address).

After the kernel boots, you will be in a state where paging is disabled and no heap is set up. In this phase you will use placement memory, which means you point an unsigned byte pointer at the start address (= placement address), and whenever you require n bytes of memory, you increment the placement address pointer by n. The problem is that this approach is so simple and basic that it lacks a lot of features that a heap has. For example, you cannot free memory with the placement memory system, because you have no metadata about where objects start and where they end. From this lack of features it follows that one way to deal with the situation is to accept that the kernel will never free memory it has allocated in the phase before paging and a heap have been activated.
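The placement scheme can be sketched on the host with a static buffer standing in for the free RAM after the kernel image (the names are mine; in a real kernel, placement_address would start at the end of the kernel):

```c
#include <stdint.h>
#include <stddef.h>

/* Host-side stand-in for the free RAM that follows the kernel image. */
static uint8_t arena[4096];

/* The placement address: starts at the arena and only ever moves forward. */
static uintptr_t placement_address = (uintptr_t)arena;

/* Hand out n bytes by bumping the pointer. There is no header and no
   bookkeeping, so blocks handed out this way can never be freed. */
static void *kmalloc_placement(size_t n) {
    void *block = (void *)placement_address;
    placement_address += n;
    return block;
}
```

Successive allocations are simply adjacent: each returned pointer lies exactly n bytes after the previous one.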

How does the kernel learn about the placement address? The kernel code can use a variable whose address is set by the linker script. If the linker script places that symbol after the kernel binary and all kernel segments, the kernel code knows an address where placement memory can start. The linker can set the address correctly because it constructs the binaries and segments and hence knows their sizes. An example of such a linker script is James Molloy's linker script. Check out the end label in the linker script: end is the address where the placement memory could start.

/* Link.ld -- Linker script for the kernel - ensure everything goes in the */
/*            Correct place.  */
/*            Original file taken from Bran's Kernel Development */
/*            tutorials: http://www.osdever.net/bkerndev/index.php. */

ENTRY(start)
SECTIONS
{

    .text 0x100000 :
    {
        code = .; _code = .; __code = .;
        *(.text)
        . = ALIGN(4096);
    }

    .data :
    {
        data = .; _data = .; __data = .;
        *(.data)
        *(.rodata)
        . = ALIGN(4096);
    }

    .bss :
    {
        bss = .; _bss = .; __bss = .;
        *(.bss)
        . = ALIGN(4096);
    }

    end = .; _end = .; __end = .;
}

The kernel code can now use C code to make use of the end label:

// end is defined in the linker script.
extern u32int end;
u32int placement_address = (u32int)&end;

If you look closely, the end variable’s value is not used at all! It is the end variable’s address that is used to retrieve the end of the kernel!

Working with GRUB Modules

GRUB can, besides loading your kernel, put so-called modules into memory. Modules are files (code binaries or arbitrary files) that the kernel can use to provide additional functionality, to read configuration from, or for anything else. A module could be a binary program, for example a hello-world elf binary that you want to run to test your kernel’s elf loader. It could be a module that teaches the kernel how to read a FAT12 file system. It could also be a prepared image file of a filesystem that the kernel can use during its early stages of operation.

If you make use of GRUB’s module loader feature, it is not enough to know where the kernel binary ends; you also need to know where in memory GRUB has put the modules and how much memory those modules occupy. GRUB chooses the memory addresses for the modules, and there is no well-known location where it puts them, so you have to retrieve the addresses from GRUB. You can learn that information from the multiboot information data (the mods_addr and mods_count fields). (Example code is at the bottom of the multiboot specification.)

Knowing which parts of memory are occupied by your kernel’s binary, the placement memory and the modules is important, because you must not overwrite that memory if your OS is to run stably. The OS has to have a way to mark that memory as occupied. As the example OS built throughout these articles will use paging, the most straightforward way is to maintain a bitmap of occupied physical frames, as outlined in James Molloy’s article about paging.
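The frame bitmap idea can be sketched as follows. This is a host-side illustration, not James Molloy's actual code; the bitmap size and the names set_frame and test_frame are made up for this sketch:

```c
#include <stdint.h>

#define FRAME_SIZE 4096u
#define NUM_FRAMES 1024u /* covers 4 MiB of physical memory in this sketch */

/* One bit per physical frame: 1 = occupied, 0 = free. */
static uint32_t frame_bitmap[NUM_FRAMES / 32];

/* Mark the frame containing physical address addr as occupied. */
static void set_frame(uint32_t addr) {
    uint32_t frame = addr / FRAME_SIZE;
    frame_bitmap[frame / 32] |= 1u << (frame % 32);
}

/* Return 1 if the frame containing addr is occupied, else 0. */
static int test_frame(uint32_t addr) {
    uint32_t frame = addr / FRAME_SIZE;
    return (frame_bitmap[frame / 32] >> (frame % 32)) & 1u;
}
```

At boot, the kernel would walk over its own image, the placement memory and the module ranges and call set_frame for every frame they touch, so the physical memory manager never hands those frames out again.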

How does the kernel learn the end address of the modules from the multiboot loader? The information can be read from the multiboot information data. To pass that structure to the kernel, the assembler boot code pushes the ebx register onto the stack, because the multiboot loader stored the address of the multiboot information data in ebx.

Let’s implement this idea using GRUB and a custom kernel! In this test, the module is a plain ASCII text file that contains the sentence “This is a plain text file module test.”. Create a file called “test” in the folder that contains the Makefile. Into the file “test” enter the following text:

This is a plain text file module test.

Update the Makefile to copy the “test” file into the boot folder. You can use another folder if you want.

CP := cp
RM := rm -rf
MKDIR := mkdir -pv

BIN = kernel
CFG = grub.cfg
ISO_PATH := iso
BOOT_PATH := $(ISO_PATH)/boot
GRUB_PATH := $(BOOT_PATH)/grub

#GCC := gcc
GCC := ~/dev/cross/install/bin/i386-elf-gcc

#LD := ld
LD := ~/dev/cross/install/bin/i386-elf-ld


.PHONY: all

all: bootloader kernel linker modules iso
  @echo Make has completed.

bootloader: boot.asm
  nasm -f elf32 boot.asm -o boot.o

kernel: kernel.c
  $(GCC) -m32 -c kernel.c -o kernel.o

linker: linker.ld boot.o kernel.o
  $(LD) -m elf_i386 -T linker.ld -o kernel boot.o kernel.o

iso: kernel
  $(MKDIR) $(GRUB_PATH)
  $(CP) $(BIN) $(BOOT_PATH)
  $(CP) $(CFG) $(GRUB_PATH)
  grub-file --is-x86-multiboot $(BOOT_PATH)/$(BIN)
  grub-mkrescue -o image.iso $(ISO_PATH)

modules:
  $(MKDIR) $(GRUB_PATH)
  $(CP) test $(BOOT_PATH)

.PHONY: clean
clean:
  $(RM) *.o $(BIN) *iso

If you build using the command “make”, the “test” file will be part of the created iso image. (It is not part of the kernel itself but part of the iso image).

Let GRUB know about the module so it is loaded alongside your kernel. To configure GRUB, add a “module” line to the menuentry in the grub.cfg file and pass the path where the “test” file is located in the iso image (/boot/test in this example).

# timeout in seconds, -1 waits indefinitely without timing out ever
set timeout=-1

# first entry is the default entry to boot after a timeout
set default=0

# custom kernel
# https://www.gnu.org/software/grub/manual/grub/grub.html#menuentry
menuentry "The worst kernel ever" {
        multiboot /boot/kernel
        module /boot/test /boot/test
}

An explanation of all parameters allowed by the menuentry is contained in https://www.gnu.org/software/grub/manual/grub/grub.html#menuentry.

In your assembler file, which eventually calls the kernel’s main function, push ebx and eax in this order onto the stack right before calling main. The multiboot loader writes the address of the multiboot information data structure into ebx and the multiboot magic number into eax. Pushing data onto the stack makes those bytes available as parameters to the called function in your C code! This behaviour is defined by the Application Binary Interface (ABI), here the cdecl calling convention on 32-bit x86.

bits 32

section .multiboot               ;according to multiboot spec
        dd 0x1BADB002            ;set magic number for bootloader
        dd 0x0                   ;set flags
        dd - (0x1BADB002 + 0x0)  ;set checksum

section .text
global start
extern main                      ;defined in the C file

start:
        cli                      ;block interrupts
        mov esp, stack_space     ;set stack pointer

        
        push   ebx             ;Push the pointer to the Multiboot information structure.
        push   eax             ;Push the magic value.

        call main                ; call main
        hlt                      ;halt the CPU

section .bss
resb 8192                        ;8KB for stack
stack_space:

Now update your kernel’s main function. An example is contained at the bottom of the multiboot specification. I will only list excerpts from there, because quite a bit of code is involved for printing strings (the example writes directly into the VGA text buffer).

void main(unsigned long magic, unsigned long addr) {
    
  multiboot_info_t *mbi;

  terminal_buffer = (unsigned short *)VGA_ADDRESS;

  // clear_screen();
  cls();

  vga_index = 0;
  // print_string("Hello World!", WHITE_COLOR);
  printf("Hello World!\n\n");

  /* Am I booted by a Multiboot-compliant boot loader? */
  if (magic != MULTIBOOT_BOOTLOADER_MAGIC) {
    vga_index = 160;
    // print_string("Invalid magic number", RED);
    printf("Invalid magic number!\n");

    return;
  }

  /* Set MBI to the address of the Multiboot information structure. */
  mbi = (multiboot_info_t *)addr;

  /* Are mods_* valid? */
  if (CHECK_FLAG(mbi->flags, 3)) {

    module_t *mod;
    int i;
    int j;

    printf("mods_count = %d, mods_addr = 0x%x\n", mbi->mods_count,
           mbi->mods_addr);

    // mod is a module_t pointer, so mod++ already advances by sizeof(module_t)
    for (i = 0, mod = (module_t *)mbi->mods_addr; i < mbi->mods_count;
         i++, mod++) {
      printf(" mod_start = 0x%x, mod_end = 0x%x, string = %s\n", mod->mod_start,
             mod->mod_end, (char *)mod->string);

      // output the first characters from the test module
      // mod_start is a physical address stored as an integer, so cast it
      char *character = (char *)mod->mod_start;
      for (j = 0; j < 37; j++) {
        // putchar(&mod->mod_start);
        putchar((*character));
        character++;
      }

      printf("\n");
    }
  } else {
    printf("No mods found!\n");
  }

  /* Is the section header table of ELF valid? */
  if (CHECK_FLAG(mbi->flags, 5)) {
    elf_section_header_table_t *elf_sec = &(mbi->u.elf_sec);

    printf("elf_sec: num = %d, size = 0x%x,"
           " addr = 0x%x, shndx = 0x%x\n",
           elf_sec->num, elf_sec->size, elf_sec->addr, elf_sec->shndx);
  }

  /* Are mmap_* valid? */
  if (CHECK_FLAG(mbi->flags, 6)) {

    memory_map_t *mmap;

    printf("mmap_addr = 0x%x, mmap_length = 0x%x\n", mbi->mmap_addr,
           mbi->mmap_length);

    for (mmap = (memory_map_t *)mbi->mmap_addr;
         (unsigned long)mmap < mbi->mmap_addr + mbi->mmap_length;
         mmap = (memory_map_t *)((unsigned long)mmap + mmap->size +
                                 sizeof(mmap->size))) {

      printf(" size = 0x%x, base_addr = 0x%x%x,"
             " length = 0x%x%x, type = 0x%x",
             mmap->size, mmap->base_addr_high, mmap->base_addr_low,
             mmap->length_high, mmap->length_low, mmap->type);

      // https://www.gnu.org/software/grub/manual/multiboot/multiboot.html#Boot-modules
      //
      // ‘type’ is the variety of address range represented, where a value of 1
      // indicates available RAM, value of 3 indicates usable memory holding
      // ACPI information, value of 4 indicates reserved memory which needs to
      // be preserved on hibernation, value of 5 indicates a memory which is
      // occupied by defective RAM modules and all other values currently
      // indicated a reserved area.

      switch (mmap->type) {

      case 1:
        printf("Available RAM\n");
        break;

      case 3:
        printf("Usable memory holding ACPI information\n");
        break;

      case 4:
        printf("Reserved memory which needs to be preserved on hibernation\n");
        break;

      case 5:
        printf("Defective RAM\n");
        break;

      default:
        printf("Reserved Area\n");
        break;
      }
    }
  }

  vga_index = 80;
  // print_string("Goodbye World!", WHITE_COLOR);
  printf("Goodbye World!\n");

  return;
}
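The memory-map walk in main above can also be used to total the available RAM. The sketch below runs on a host against a hand-built array; the entry type mmap_entry_t is a simplified stand-in for the real multiboot memory_map_t (which splits base address and length into high/low 32-bit halves):

```c
#include <stdint.h>

/* Simplified stand-in for a multiboot memory map entry. */
typedef struct {
    uint64_t base_addr;
    uint64_t length;
    uint32_t type; /* 1 = available RAM, everything else is reserved */
} mmap_entry_t;

/* Sum the lengths of all entries marked as available RAM (type 1). */
static uint64_t total_available_ram(const mmap_entry_t *entries, int count) {
    uint64_t total = 0;
    for (int i = 0; i < count; i++) {
        if (entries[i].type == 1) {
            total += entries[i].length;
        }
    }
    return total;
}
```

A kernel would typically run a loop like this once at boot to size its physical frame bitmap.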

You can see that the main function no longer has a void parameter list; it receives the magic number as its first parameter and a pointer to the multiboot information data structure as its second parameter.

The kernel’s main function uses these parameters to output data about the modules and the memory map. Our goal was to use the custom module, the plain ASCII text file “test”. The main function above does cheat quite a bit: it loops over all available modules and outputs a hard-coded 37 bytes from each one, which only makes sense because we know the only module is our “test” file containing the sentence “This is a plain text file module test.”!
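A less cheaty loop would derive the module length from mod_start and mod_end instead of hard-coding a byte count. This is a host-side sketch: the struct below is a simplified stand-in for module_t (the real multiboot struct stores the addresses as 32-bit integers), and print_module is a made-up name:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for the multiboot module descriptor;
 * mod_end points one byte past the last byte of the module. */
typedef struct {
    uintptr_t mod_start;
    uintptr_t mod_end;
} module_t;

/* Print every byte of the module and return the number of bytes printed. */
static size_t print_module(const module_t *mod) {
    const char *data = (const char *)mod->mod_start;
    size_t length = mod->mod_end - mod->mod_start;
    fwrite(data, 1, length, stdout);
    return length;
}
```

With this approach the loop works for modules of any size, not only for our 37-byte excerpt of the test sentence.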

When you build this project and start the iso using qemu, you will see the text contained in “test” printed.

As a general reminder: This is not production ready code! This code is merely here to illustrate concepts! It is not good code by any stretch of the imagination! Do not use this code in any project you plan to use for production purposes! Learn the concepts, write your own code, write tests to verify your code and prevent it from regression errors, let your colleagues or peers review your code and tests and only then use your own code for any important purpose! Do yourself a favor and cover your own back.

The Stack

The multiboot specification in “3.2 Machine State” states that the register ESP has an undefined value after booting and then adds:

The OS image must create its own stack as soon as it needs one.

Maybe GRUB will initialize a valid stack, but according to the specification it does not have to. The kernel should therefore always create a stack for itself. This is advisable because if your OS is loaded by a multiboot bootloader other than GRUB, that bootloader might not set a stack pointer, and your OS has to be prepared for that situation.

Writing the ISO to a bootable USB Stick

In order to use your operating system on a real machine outside an emulator, you have to get the machine to boot your operating system. The most straightforward way is to boot from USB.

You need a USB stick that contains no relevant data, as the USB stick will be erased in the process. You also need your operating system packaged as an ISO. In the machine’s BIOS settings, make sure that USB is one of the entries in its boot order.

Creating a bootable USB stick cannot be achieved by manually copying the ISO file onto it, as the ISO would then merely sit on the stick like a regular file in the filesystem. Instead, you can use a tool that correctly lays out the ISO on the USB stick so that the machine recognizes the stick as bootable media.

On Ubuntu you can use the Startup Disk Creator application, which comes preinstalled with the standard Ubuntu installation.

The Startup Disk Creator is a little picky in the sense that it will not accept the custom iso image directly. I had to rename the image to give it the .img extension, so instead of image.iso I had to rename it to image.img:

mv image.iso image.img

After you have your image.img file, load it in Startup Disk Creator, select the USB stick to write to, and start the process. The USB stick can now be used to boot your operating system.

If you prefer to create the USB stick using only the command line, take a look at this post; it may contain a command-line-only solution.