In this multi-part article, I'm going to address the problem of keeping your Linux servers up to date. More often than not, a systems administration team that takes over the servers of a startup finds an existing datacenter built around a bunch of servers with the needed software installed, often in a chaotically configured, "barely functioning" setup put together by the first team of developers, who were also doing the DevOps work.
This situation gets in the way of keeping the production systems, or even the development systems, updated, and it leads to long hours of work. Sysadmins often find systems with long uptimes that haven't been updated for years. From the business point of view, as long as the systems are up and running and providing their services, it does not matter whether they are up to date. Only when something bad happens (a password database leak, a security compromise that leads to stolen data, or developers finding that their programs need newer software than the production operating systems can provide) does the management team rush the systems administrators to upgrade as soon as possible. This rush, in turn, pushes poorly tested solutions into the production environment.
To simplify and streamline the update process without buying new hardware, we approached the problem in two steps. First, we virtualized the application layer using Linux Containers (LXC). This type of virtualization let us create "virtual machines" without losing performance or burdening the servers with a hardware emulation layer, as is the case with VMware, Red Hat's KVM, Oracle VirtualBox or Xen. Second, we "virtualized" the hardware nodes by using a partition image file as the root filesystem.
The upside of the "root filesystem in a file" approach is that we can replace the root filesystem with a new one just by rebooting the machine, and we can revert the change quickly with another reboot. Also, because the root filesystem is a file, the operating system installed inside it can be upgraded and tested out-of-band and then published to all servers automatically. When any of those servers needs to be upgraded, it only has to be rebooted. This approach is already common in the embedded world, where vendors publish "image files" to be uploaded to routers, out-of-band management boards (DRAC, iLO, AMT), etc.
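The swap-and-revert idea can be sketched as a small shell helper. This is only a minimal sketch, not the tooling used in this setup: the function names stage_image and rollback_image, the .prev suffix and the directory argument are assumptions made for illustration; only the file name hn-root.img comes from this article. It also assumes the partition that holds the image is mounted read-write.

```shell
#!/bin/sh
# Sketch: stage a new root image and keep the old one for rollback.
# stage_image, rollback_image and the ".prev" suffix are hypothetical names.

stage_image() {
    # $1 = directory that holds hn-root.img, $2 = path to the new image
    hostpart=$1
    new_image=$2
    [ -f "$new_image" ] || { echo "no such image: $new_image" >&2; return 1; }
    # Keep the current image so that the next reboot can be undone.
    if [ -f "$hostpart/hn-root.img" ]; then
        mv -f "$hostpart/hn-root.img" "$hostpart/hn-root.img.prev" || return 1
    fi
    # Note: across filesystems mv copies the file, so this is not atomic.
    mv -f "$new_image" "$hostpart/hn-root.img"
}

rollback_image() {
    # $1 = directory that holds hn-root.img and hn-root.img.prev
    hostpart=$1
    [ -f "$hostpart/hn-root.img.prev" ] || { echo "nothing to roll back to" >&2; return 1; }
    # The image that is being rolled back is overwritten.
    mv -f "$hostpart/hn-root.img.prev" "$hostpart/hn-root.img"
}
```

In both cases the change only takes effect at the next reboot, because the running system keeps using the image it was booted from.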
In my setup, I used a source-based Linux distribution, Gentoo Linux. One may ask, "Why would anyone want to compile everything from source when you can easily install binary packages?" When installing a server, a sysadmin often ends up compiling software from source anyway, for various reasons: outdated packages, custom build features, patches to be applied, etc. Couple that with constant system updates and the sysadmin is "really working to earn his money". Often, not just one package needs to be compiled from source but many of its dependencies as well. On a binary distribution, compiling from source leads to multiple problems, like the well-known "dependency hell" or having to replace system components (python, perl) with unsupported versions. So I would rather build everything from source using automation tools: Gentoo's package manager, Portage (which resembles FreeBSD's ports).
In this first part of the article, I'll talk about the host system, not the virtualized containers and virtual machines.
When designing the host system root image, also known as a "hardware node's" root image file, we found that it needs to satisfy some requirements:
Below is a diagram of the system's host storage device partitions and the root image file:
Genkernel is a script from the Gentoo project that helps Linux users compile their kernels; it should also work outside of the Gentoo distribution (if you want a binary distribution on which genkernel is guaranteed to work, take a look at a Gentoo derivative such as Sabayon Linux or Calculate Linux). To be able to boot from the partition image, I modified genkernel's default/linuxrc file and added the logic described in the above diagram. You can clone the modified genkernel from GitHub at https://github.com/psihozefir/genkernel.git.
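To give an idea of what such boot logic involves, here is a minimal sketch of how an initramfs script could read the two parameters used in the GRUB menu entry later in this article (real_root and raw_loop_root_host_partition) and derive the mounts it needs. This is not the code from the modified linuxrc: the helper names cmdline_value and plan_loop_root and the /host mount point are assumptions, and the script only prints the mount commands instead of running them.

```shell
#!/bin/sh
# Sketch only: NOT the code from the modified genkernel linuxrc.
# cmdline_value, plan_loop_root and /host are hypothetical names.

# Print the value of a key=value parameter from a kernel command line.
# Assumes the command line contains no shell glob characters.
cmdline_value() {
    # $1 = parameter name, $2 = command line (e.g. the contents of /proc/cmdline)
    for word in $2; do
        case "$word" in
            "$1"=*)
                value=${word#"$1"=}
                printf '%s\n' "$value"
                return 0
                ;;
        esac
    done
    return 1
}

# Print the mount commands that would make the image file the new root.
plan_loop_root() {
    # $1 = kernel command line
    host=$(cmdline_value raw_loop_root_host_partition "$1") || return 1
    image=$(cmdline_value real_root "$1") || return 1
    # First the partition that stores the image, then the image itself.
    echo "mount $host /host"
    echo "mount -o loop,ro /host$image /newroot"
}
```

Calling plan_loop_root "$(cat /proc/cmdline)" on a machine booted with those parameters would print the two mount commands; a real initramfs would run them and then switch to the new root.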
Also, the init script that cleans the hn-root.img file of unneeded files and settings will be available soon. In the meantime, you can do the cleanup manually. This script is needed so that you can put the same image file on multiple servers without causing confusion on your datacenter network and in your management applications (like Nagios or Icinga, or the nodes' administration panels).
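Until that script is published, a manual cleanup could look roughly like the sketch below. The list of files is an assumption based on what usually identifies a single host (SSH host keys, the machine ID, persistent network interface rules, old logs); it is not the list used by the upcoming init script, and the function name clean_image_root is hypothetical. The function expects the mount point of a loop-mounted copy of the image.

```shell
#!/bin/sh
# Sketch: strip host-specific files from a mounted copy of the image.
# The file list is an assumption, not the author's cleanup script.

clean_image_root() {
    # $1 = mount point of the loop-mounted image (hypothetical, e.g. /mnt/hn-root)
    root=$1
    [ -d "$root/etc" ] || { echo "not a root filesystem: $root" >&2; return 1; }
    # SSH host keys: every server should generate its own.
    rm -f "$root"/etc/ssh/ssh_host_*
    # The machine ID and persistent network interface names are per host.
    rm -f "$root/etc/machine-id" "$root/var/lib/dbus/machine-id"
    rm -f "$root/etc/udev/rules.d/70-persistent-net.rules"
    # Logs left over from the machine where the image was built.
    if [ -d "$root/var/log" ]; then
        find "$root/var/log" -type f -exec rm -f {} +
    fi
    return 0
}
```

Run it against the mount point of the image, never against / on a live system.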
To install a system into a root partition image, follow the steps below (this procedure will wipe the storage drive clean, so back up everything before continuing):
1. Using parted, create a new GPT label if you haven't partitioned your storage using GPT yet; this step will destroy all the data stored on the storage device;
2. Create a very small partition to host the Grub2 boot loader (128 MB), format it as FAT32 (or FAT16 if the mkfs tool complains about the size of the allocation table) and label it EFI-BOOT (notice the case);
3. Create another partition that fills up the rest of the storage space, format it with any Linux filesystem (Btrfs is recommended) and label it hostpart, the label referenced by LABEL=hostpart in the configuration below;
4. On a separate machine (it can be a workstation), create a 15 GB partition (it can be bigger or smaller, as you see fit; a bigger partition takes longer to copy to all servers and a smaller one may fill up more quickly) and install a Linux distribution of your choice; once the installation is done, create an image of that partition using the dd command; note that the /boot directory is inside this partition, so grub will access the kernel and the initramfs through a "grub2 loop device", which is different from the kernel's /dev/loop0 device;
5. Recompile the kernel using genkernel --menuconfig all; if you only want to generate a compatible initramfs, run genkernel initramfs instead;
6. In the /boot directory, the file "kernel" should be a symlink to the actual kernel file and the file "initramfs" should be a symlink to the actual initramfs file;
7. Loop mount the partition image file and, in /boot/loader/grub/grub.cfg, create a new menu entry (my partition image file is formatted as reiserfs, so I added the reiserfs module; you'll need to load the ext2 module if you have an ext2, ext3 or ext4 image file):
menuentry "GNU/Linux in a file" {
insmod part_gpt
insmod fat
insmod reiserfs
insmod ext2
insmod gzio
set root="hd0,gpt2"
loopback loop ($root)/hn-root.img
echo "Loading Linux…"
linux (loop)/boot/kernel root=/dev/ram0 real_root=/hn-root.img raw_loop_root_host_partition=LABEL=hostpart ro
echo "Loading initial ramdisk…"
initrd (loop)/boot/initramfs
}
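8. In the /etc/fstab file inside the image, add entries for the boot loader partition and the host partition:
LABEL=EFI-BOOT /boot/loader vfat noauto,noatime 1 2
LABEL=hostpart /hostpart auto noatime 0 1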
9. Unmount the image file and copy it to the destination server; install grub on the server's host storage drive: grub2-install --target=x86_64-efi --boot-directory=/boot/loader --efi-directory=/boot/loader; you'll need to chroot from a live USB Linux (System Rescue CD, for example) in order to install grub, but that is beyond the scope of this article.
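Steps 1 to 4 of the procedure can be sketched as a dry-run script that only prints the commands, so that nothing is wiped by accident. Treat it as a sketch under stated assumptions rather than a tested installer: the function names, the 1MiB and 129MiB offsets, the boot flag on the first partition and the example device names are assumptions; the EFI-BOOT and hostpart labels are the ones referenced by the LABEL= entries in this article.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for steps 1 to 4, runs nothing.
# part_name, print_host_plan and print_image_plan are hypothetical names.

# Partition N of a disk: /dev/sda -> /dev/sda1, /dev/nvme0n1 -> /dev/nvme0n1p1
part_name() {
    case "$1" in
        *[0-9]) printf '%sp%s\n' "$1" "$2" ;;
        *)      printf '%s%s\n' "$1" "$2" ;;
    esac
}

# Steps 1 to 3: GPT label, 128 MB boot partition, host partition on the rest.
print_host_plan() {
    # $1 = whole-disk device of the server (all data on it will be lost)
    disk=$1
    efi=$(part_name "$disk" 1)
    host=$(part_name "$disk" 2)
    # The names given to mkpart are GPT partition names; the filesystem
    # labels that LABEL= refers to are set by mkfs (-n and -L).
    # Use -F 16 for mkfs.vfat if it complains about the partition size.
    cat <<EOF
parted -s $disk mklabel gpt
parted -s $disk mkpart EFI-BOOT fat32 1MiB 129MiB
parted -s $disk set 1 boot on
parted -s $disk mkpart hostpart 129MiB 100%
mkfs.vfat -F 32 -n EFI-BOOT $efi
mkfs.btrfs -L hostpart $host
EOF
}

# Step 4: image the partition that holds the installed system.
print_image_plan() {
    # $1 = source partition on the build machine, $2 = output image file
    echo "dd if=$1 of=$2 bs=4M"
}
```

For example, print_host_plan /dev/sdX followed by print_image_plan /dev/sdY3 hn-root.img prints the full command list for review; once the device names are checked, the commands can be run by hand as root.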
Another advantage of using image files is that they make switching Linux distributions really easy. Just put an hn-root-new.img containing another Linux distribution in place and you're done!
In the next part of this article I'll talk about the XtreemFS cluster storage, and in the third part I'll talk about virtualization of systems (using LXC and KVM) and applications (using Docker). When OpenStack becomes available in one of our images, I'll also write a fourth part, detailing how we use OpenStack.
This setup is currently a work in progress and some parts are not even developed yet. So, stay tuned!