Gentoo supposedly has one of the fastest boot-ups in the Linux world. I say “supposedly” because at the moment I have nothing to compare it with and have to rely on what has been posted in bug #69579. As fast and as advanced as our rc-scripts might be, there is always some room for improvement.
I have been looking through the Gentoo Bugzilla, searching for bugs about boot speed improvement. Fortunately there is the meta-bug mentioned in the previous paragraph which makes things easier to track. I haven’t had much time to play with the various modifications proposed there, but I’ve done some preliminary analyses and tests.
As you can see in comment #25, I don’t have much faith in running rc-scripts in parallel. Not that this is not a useful feature - it certainly is, especially when you’re doing things like waiting for DHCP-provided network configuration or mounting some network filesystems. But on my system, running these scripts in parallel saves me not much more than 1 second. And it breaks the cool new boot icons I’ve been coding into splashutils for the past few days. The odds are definitely against setting RC_PARALLEL_STARTUP to “yes” :)
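To illustrate why parallel startup mostly pays off when scripts spend their time waiting, here is a rough back-of-the-envelope model in Python. Everything in it is an assumption made for illustration: the script names and timings are invented, not measured, dependencies between scripts are ignored, and a single CPU is assumed, so CPU-bound work cannot overlap while idle waits can.

```python
# Hypothetical per-script timings in seconds: (name, cpu_busy, idle_wait).
# These numbers are invented for illustration, not measured on a real system.
SCRIPTS = [
    ("hostname", 0.2, 0.0),
    ("syslog", 0.5, 0.1),
    ("net.eth0", 0.3, 4.0),  # mostly waiting for a DHCP lease
    ("netmount", 0.2, 2.0),  # mostly waiting for a network filesystem
]


def sequential_time(scripts):
    """Total time when every script runs to completion before the next starts."""
    return sum(busy + wait for _, busy, wait in scripts)


def parallel_lower_bound(scripts):
    """Best case on one CPU when dependencies are ignored.

    CPU work has to be serialized, idle waits can overlap, so the boot
    can finish no sooner than the larger of the total CPU work and the
    single slowest script.
    """
    total_busy = sum(busy for _, busy, _ in scripts)
    slowest = max(busy + wait for _, busy, wait in scripts)
    return max(total_busy, slowest)


if __name__ == "__main__":
    print(f"sequential:           {sequential_time(SCRIPTS):.1f} s")
    print(f"parallel lower bound: {parallel_lower_bound(SCRIPTS):.1f} s")

    # Same scripts with no waiting at all: nothing is left to overlap.
    no_wait = [(name, busy, 0.0) for name, busy, _ in SCRIPTS]
    print(f"no waits, sequential: {sequential_time(no_wait):.1f} s")
    print(f"no waits, parallel:   {parallel_lower_bound(no_wait):.1f} s")
```

In this toy model the gain comes entirely from overlapping the idle waits. With no DHCP or network mounts to wait for, the two figures coincide, which would be consistent with a saving of only about a second on a system like the one described above.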
As far as the other changes are concerned, I’m not going to comment on most of them just yet; I still need to do some tests. While I doubt the speed-up introduced by them will be dramatic, I think it’s worthwhile to integrate any optimizations that don’t break anything. After all, a large number of small speed-ups might very well result in a noticeably faster boot.
However, what I’m really looking for is a relatively huge speed increase (I’m talking 100% or more here). These things, if available, usually come at a price. Well, I would be willing to pay that price, even if getting the speed-up would mean resorting to some dirty tricks and hacks. What got my attention in these matters is the idea of using software suspend to make things faster.
The original idea proposed in one of the comments of our meta-bug was to make a snapshot of the system state at a late stage of the boot process and then use it to resume the system after reboot. While it sounds quite easy, it’s in fact pretty tricky on the technical side. The main problem I can see at this point is keeping the in-kernel data structs in sync with the filesystems, so that nothing breaks and no data from the hdd gets sent straight to /dev/null. I have yet to work out whether this is even practically possible (comments, anyone?).
Since rewriting half of the software suspend code to make this new idea work is not what I originally had in mind, I began looking for other possible solutions. What I have come up with so far is making a snapshot of the system instead of the normal shutdown sequence, with some potential “cleaning up” (i.e. killing processes we wouldn’t want to run after we “boot up” with our snapshot) before doing so. This idea appears to work - I have already written some code and successfully tested it a few times. The speed improvement is huge - huge enough to make me notice how slow waiting for the BIOS to finish the POST is. My normal boot sequence takes some 48 seconds (from GRUB to the login prompt). With the new software suspend modifications in place, it’s down to 5-7 secs. I call that a promising result :)
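To put the 48 seconds versus 5-7 seconds into perspective, here is a small Python sketch of the arithmetic. The timings are the ones quoted above; the helper names are made up for this example, and reading a “100% speed increase” as “at least twice as fast” is an assumption about what that phrase means.

```python
# Timings quoted in the text, in seconds (GRUB to login prompt).
NORMAL_BOOT_S = 48.0
RESUME_BEST_S = 5.0
RESUME_WORST_S = 7.0


def speedup_factor(before_s, after_s):
    """How many times faster the new boot is compared to the old one."""
    return before_s / after_s


def time_saved_percent(before_s, after_s):
    """Share of the original boot time that no longer has to be waited out."""
    return 100.0 * (before_s - after_s) / before_s


if __name__ == "__main__":
    for resume_s in (RESUME_WORST_S, RESUME_BEST_S):
        factor = speedup_factor(NORMAL_BOOT_S, resume_s)
        saved = time_saved_percent(NORMAL_BOOT_S, resume_s)
        print(f"resume in {resume_s:.0f} s: {factor:.1f}x faster, "
              f"{saved:.0f}% of the boot time saved")
```

Under that reading, a 100% speed increase would only require getting the boot down to 24 seconds. The snapshot approach lands at roughly 7 to 10 times faster, well past that target.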
The code still needs MUCH MORE testing and extending, but I think it’s something worth spending time on. Some may argue that this is cheating - we are not going through the normal boot sequence after all. On one hand they are right, but on the other - if the end result is the same, who cares what is really going on? ;)