Free Software programmer
This blog existed before my current employment, and obviously reflects my own opinions and not theirs.
This work is licensed under a Creative Commons Attribution 2.1 Australia License.
Thu, 09 Aug 2007
So, two weeks of going through the kvm kernel code, reading Intel docs, and dozens of patches later, this week was supposed to be the start of implementing "kvm-lite", which I'm supposed to be presenting at the KVM Forum at the end of the month (yeah, I love pressure).
Of course, other things (such as a couple of lguest bug reports) stole some time, but just tonight I got it to the stage where it flips into the guest and back (multiple times). Now, since I haven't even hacked a console together for the guest, it doesn't get far, but from here to booting should be less painful than those early steps.
What's interesting is that by mangling the lguest code into this different context I revisit the code with a little more x86 knowledge. Indeed, while copying the segment handling code into kvm-lite, I discovered (and wrote a test for) a nasty bug. The guest can tell us to change a GDT entry it's currently using, and we'll fault when we try to restore the guest segment registers. I handle the simple case of marking a currently used entry not-present, but not the more obscure cases which can also cause a fault, such as changing the stack segment descriptor to a code segment.
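To make the two cases concrete, here is a rough sketch of the kind of check involved. It is not the lguest or kvm-lite code: the struct, the macro names and desc_ok_for_stack() are hypothetical, and only the bit positions come from the x86 descriptor layout. It rejects an entry that is not present, and one that has been turned into a code (or read-only) segment, since neither can be loaded into the stack segment register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An x86 segment descriptor as two 32-bit words (low word first). */
struct gdt_entry {
    uint32_t a;  /* limit 15:0, base 15:0 */
    uint32_t b;  /* base 23:16, type and flags, limit 19:16, base 31:24 */
};

/* Bits in the high word of a code/data descriptor. */
#define DESC_WRITABLE (1u << 9)   /* data segment: writes allowed */
#define DESC_CODE     (1u << 11)  /* set for code, clear for data */
#define DESC_S        (1u << 12)  /* set for code/data, clear for system */
#define DESC_PRESENT  (1u << 15)  /* segment present */

/*
 * Hypothetical helper: would loading this entry into the stack segment
 * register be refused by the CPU for one of the reasons discussed above?
 * This only covers present/type checks; a complete check would also
 * have to compare privilege levels.
 */
static bool desc_ok_for_stack(const struct gdt_entry *d)
{
    assert(d != NULL);

    if (!(d->b & DESC_PRESENT))
        return false;           /* the "simple case": marked not-present */
    if (!(d->b & DESC_S))
        return false;           /* system descriptor, not code/data */
    if (d->b & DESC_CODE)
        return false;           /* the obscure case: now a code segment */
    return (d->b & DESC_WRITABLE) != 0;  /* stack must be writable data */
}
```

For instance, a flat read/write data descriptor (high word 0x00cf9300) passes, while the matching code descriptor (0x00cf9b00) and a not-present copy of the data descriptor (0x00cf1300) are both rejected.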
The problem is made worse by the user-modifiable registers of kvm-lite (or anything which wants to offer guest restore, such as future lguest). With lguest, we know that the segments were OK when we last ran the guest: we only have to be careful when executing the two hypercalls which modify the GDT. With kvm-lite we also have to be suspicious of userspace-supplied GDT entries, as they can crash the host.
The solution was rather simple, if in some ways less than elegant. We catch faults in the switcher and return to the host: because we didn't enter the guest, the trap number is not updated and so we can tell the switcher faulted. We kill the guest that caused it.
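The "trap number is not updated" trick can be modelled in plain C. This is only a sketch under assumptions: the real switcher is assembly, and everything named here (struct guest, TRAP_NONE, fake_switcher(), run_guest_once() as written) is made up for illustration. In particular, using 256 as the "nothing happened" value is an assumption; it is chosen because x86 trap vectors only run from 0 to 255.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed sentinel: not a valid x86 vector (those are 0..255). */
#define TRAP_NONE 256u

/* Hypothetical per-guest state, reduced to what the trick needs. */
struct guest {
    unsigned int trapnum;  /* only written on the way out of the guest */
    bool dead;
};

/*
 * Stand-in for the assembly switcher. If restoring guest state faults
 * (fault_on_entry), the fault is caught and we go straight back to the
 * host without ever entering the guest, so trapnum is left alone.
 * Otherwise the guest runs and eventually exits with guest_trap.
 */
static void fake_switcher(struct guest *g, bool fault_on_entry,
                          unsigned int guest_trap)
{
    if (fault_on_entry)
        return;
    g->trapnum = guest_trap;
}

/* Returns true if the guest really ran, false if it had to be killed. */
static bool run_guest_once(struct guest *g, bool fault_on_entry,
                           unsigned int guest_trap)
{
    assert(g != NULL);

    g->trapnum = TRAP_NONE;  /* poison before every switch */
    fake_switcher(g, fault_on_entry, guest_trap);

    if (g->trapnum == TRAP_NONE) {
        /* Never updated: the switcher itself faulted. Kill the guest. */
        g->dead = true;
        return false;
    }
    return true;
}
```

A normal run, say run_guest_once(&g, false, 14), leaves trapnum at 14 and the guest alive; with fault_on_entry set, trapnum stays at TRAP_NONE and the guest is marked dead while the host carries on.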
This also gives us some insulation against other such bugs: rather than causing a triple fault and host reboot (or even a re-install for poor Ron!), it just causes the problematic guest to die.