Building an OS: The workflow!

Over the past few months, I’ve been working on a project which has inspired me to think about how a complete Operating System is built from the ground up. Luckily for me, this process is pretty well documented by the Fedora Project.

The project I’ve been working on does require a bit of thought around enterprise Linux versions run by a community. There is the ever amazing CentOS, Scientific Linux and a few others who have been around the block a time or two. The work that they have done has been immense and very helpful to many, including me.

For my project, the work was about building a fully binary compatible, enterprise-ready, community version of Linux, very similar to what CentOS and others have done. The inevitable question is ‘why?’, which I will address in future posts. Suffice it to say, the work we’ve been doing has paid off in both an individual and a community sense.

In the beginning of this project, it was clear that we needed some tools to make things work the way we wanted. Luckily, there were tools out there to do a good portion of our work: koji, mock and of course Linux to bring it all together. But other tools seemed to be missing and I went on a quest…

The first tool that seemed to be missing was a way to import src.rpms from the most popular upstream vendor. These packages needed to be rebuilt by koji in some fashion, but just taking the srpms and rebuilding them had been done before, and seems to be the preferred way to date. In my mind however, it seemed that we were missing a step. Enter skein.

While skein is still very green and will need quite a bit more work, it accomplishes the goal of extracting the srpms into two parts. This tool basically sets up two things: a git repository on (for now), along with a location and verifiable way to store the archive found inside the srpm, called a lookaside cache. If one looks at the way the Fedora Project maintains their source, this process is very similar.
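To make the lookaside cache idea a bit more concrete, here is a minimal sketch of the "verifiable way to store the archive" part. It is not skein's actual code. The function names, the directory layout and the choice of SHA-256 are all my own assumptions for illustration; the only idea taken from above is that the archive lives outside git and a checksum recorded alongside the spec file lets anyone verify it later.

```python
import hashlib
import shutil
from pathlib import Path


def store_in_lookaside(archive: Path, package: str, cache_root: Path) -> str:
    """Copy `archive` into the cache and return a "<checksum>  <filename>" line.

    Hypothetical helper: the layout and hash algorithm are assumptions.
    """
    # Hash the file in chunks so large tarballs are not read into memory at once.
    digest = hashlib.sha256()
    with archive.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    checksum = digest.hexdigest()

    # Assumed layout: <root>/<package>/<filename>/<checksum>/<filename>
    dest_dir = cache_root / package / archive.name / checksum
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(archive, dest_dir / archive.name)

    # This line is what would be committed to git next to the spec file.
    return f"{checksum}  {archive.name}"


def verify_from_lookaside(sources_line: str, package: str, cache_root: Path) -> bool:
    """Check that the cached archive still matches the recorded checksum."""
    checksum, filename = sources_line.split(None, 1)
    cached = cache_root / package / filename / checksum / filename
    if not cached.is_file():
        return False
    return hashlib.sha256(cached.read_bytes()).hexdigest() == checksum
```

The git repository then only needs to carry the spec file, any patches and that one checksum line, which keeps large binary tarballs out of version control.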

Once the srpm is imported with skein, it can be built with koji. At the moment, this process is fairly manual, but the plan is to improve skein to also allow building from the repositories. A somewhat more automatic way to build would be to use a git hook. Luckily, github provides several ways to accomplish this, including a custom URL to which an HTTP POST can be sent. When that POST arrives, koji would download the spec file and source from the appropriate locations and build a srpm.
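As a rough idea of what the receiving end of that hook might do, here is a small sketch that turns the body of a push notification into a koji build command. None of this exists in skein yet. I am assuming the POST body is JSON carrying the pushed commit in `after` and the repository address in `repository.clone_url`, and that the koji hub is configured to allow builds from that host. The function name and the build target are hypothetical.

```python
import json


def koji_build_command(payload_body: str, target: str) -> list:
    """Translate a push notification into a `koji build` argument list.

    Assumes the payload has `after` (commit id) and `repository.clone_url`.
    """
    payload = json.loads(payload_body)
    clone_url = payload["repository"]["clone_url"]
    commit = payload["after"]

    # Koji takes SCM URLs of the form <scheme>://<host>/<path>#<revision>;
    # prefixing "git+" tells it to fetch over https using git.
    scm_url = f"git+{clone_url}#{commit}"

    # Returned as a list so it can be handed to subprocess.run() unchanged.
    return ["koji", "build", target, scm_url]
```

Building from an exact commit id rather than a branch name means the resulting RPM can always be traced back to the sources that produced it.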

Koji completes its task by building the binary RPM(s) and appropriately tagging the successful builds. Once complete, mash can be used to generate custom repositories to prepare for composing actual iso images. Mash is a command-line tool, again used by Fedora.

Once the repositories are generated by mash, pungi takes over. The process of building an iso is actually very simple: a kickstart file, some repositories and pungi are all that is needed to create a fully installable DVD or multi-CD iso image.
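For a feel of how little goes into that kickstart file, here is a sketch that renders one from a set of repositories and a package list. The repository name, path and package selection in the usage example are made up; only the `repo` line and the `%packages` / `%end` section are standard kickstart syntax. A real compose would likely need more than this.

```python
def render_kickstart(repos: dict, packages: list) -> str:
    """Render a minimal kickstart: one `repo` line per repository,
    followed by a %packages section. Hypothetical helper for illustration.
    """
    lines = [f"repo --name={name} --baseurl={url}" for name, url in repos.items()]
    lines.append("")
    lines.append("%packages")
    lines.extend(packages)
    lines.append("%end")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; point baseurl at wherever mash wrote its output.
    print(render_kickstart(
        {"my-os": "file:///srv/mash/my-os/x86_64/os"},
        ["@core", "kernel"],
    ))
```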

Here’s a bit of my excellent artwork to better describe the process.

A couple of things to note about this process: while it is starting to become clear how to build an OS from an upstream vendor, there are parts that still haven’t been addressed. Currently, we can import with skein, rebuild the SRPM and build the binary RPMs with koji. We have yet to build enough binary RPMs to actually construct a buildroot, but we are getting very close.

Automating the builds with git hooks and a skein build process is a nice big step toward making our own Operating System possible.

The other big piece of the puzzle is dependency resolution. Now this has been mostly resolved by tools and APIs like Yum and RPM, but I still feel very much like a n00b when working with them. My hope is to figure out that process in the next week or so, and update skein to make building faster and easier overall.
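While I work that out, here is the general shape of the problem as I currently understand it: given which packages need which other packages at build time, find an order in which everything can be built. The sketch below is a plain topological sort and does not use Yum or RPM at all. The function name and the package names in the example are hypothetical, and real BuildRequires data would have to come from the spec files.

```python
def build_order(build_requires: dict) -> list:
    """Return package names in an order where each one comes after the
    packages it needs at build time. Raises ValueError on a cycle.
    """
    # Only dependencies we build ourselves constrain the order; anything
    # not listed as a key is assumed to be available already.
    pending = {
        pkg: {dep for dep in deps if dep in build_requires} - {pkg}
        for pkg, deps in build_requires.items()
    }
    order = []
    while pending:
        # Everything with no unmet dependencies can be built in this round.
        ready = sorted(pkg for pkg, deps in pending.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle among: " + ", ".join(sorted(pending)))
        for pkg in ready:
            order.append(pkg)
            del pending[pkg]
        # Mark this round's packages as satisfied for the rest.
        for deps in pending.values():
            deps.difference_update(ready)
    return order


if __name__ == "__main__":
    # Made-up package names and dependencies, for illustration only.
    print(build_order({
        "libbase": [],
        "libmid": ["libbase"],
        "tool": ["libmid", "libbase"],
        "app": ["tool", "something-external"],
    }))
```

The cycle check matters more than it looks: bootstrapping a buildroot tends to involve packages that need each other to build, and those have to be broken by hand before an ordering like this exists.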