Fri, 28 Jul 2006

paravirt_ops, Xen and VMWare

Xen is a hypervisor: it sits under a (slightly) modified operating system and allows it to share the real hardware with other operating systems. This is called "paravirtualization", because the OS knows it's not running directly on the hardware, and helps the hypervisor along a little. Naturally, Xen has patches for Linux.

Chris Wright has been doing excellent work (first for OSDL, then Red Hat) laboriously preparing the long series of patches to merge the Xen code into the mainline kernel. Then along came VMWare with a proposal for an ABI which all operating systems and hypervisors could use, called VMI. They use this currently, and they have a version which supports Xen as well. In the wings are other contenders, such as Microsoft and the L4 work.

So, knowing that just about every place in the kernel where we support multiple implementations of an API at runtime uses an "ops" struct, it makes sense to use one here; in fact, ppc64 already does this to support the same kernel on native hardware and under a hypervisor. No kernel programmer is likely to be surprised by this approach, aka "paravirt_ops".
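For readers who haven't met the "ops" struct idiom, here is a minimal userspace sketch of it. Everything in it is hypothetical: the `demo_` names, the three operations and the fake "hypervisor" backend are made up for illustration and are not the real paravirt_ops layout. The point is only that callers go through one table of function pointers, and the table is chosen at runtime.

```c
#include <stdio.h>

/* Hypothetical, heavily simplified ops table (not the real kernel struct). */
struct demo_paravirt_ops {
    const char *name;
    void (*irq_disable)(void);
    void (*irq_enable)(void);
    int  (*irqs_enabled)(void);
};

/* "Native" backend: a plain variable stands in for the CPU interrupt flag. */
static int demo_native_if = 1;
static void demo_native_irq_disable(void)  { demo_native_if = 0; }
static void demo_native_irq_enable(void)   { demo_native_if = 1; }
static int  demo_native_irqs_enabled(void) { return demo_native_if; }

/* "Hypervisor" backend: assumes a mask word shared with the hypervisor. */
static int demo_hv_event_mask = 0;
static void demo_hv_irq_disable(void)  { demo_hv_event_mask = 1; }
static void demo_hv_irq_enable(void)   { demo_hv_event_mask = 0; }
static int  demo_hv_irqs_enabled(void) { return !demo_hv_event_mask; }

static const struct demo_paravirt_ops demo_native_ops = {
    .name         = "native",
    .irq_disable  = demo_native_irq_disable,
    .irq_enable   = demo_native_irq_enable,
    .irqs_enabled = demo_native_irqs_enabled,
};

static const struct demo_paravirt_ops demo_hv_ops = {
    .name         = "demo-hypervisor",
    .irq_disable  = demo_hv_irq_disable,
    .irq_enable   = demo_hv_irq_enable,
    .irqs_enabled = demo_hv_irqs_enabled,
};

/* Selected once at "boot"; the rest of the code never names a backend. */
static const struct demo_paravirt_ops *demo_ops = &demo_native_ops;

/* A caller that works unchanged on either backend. */
static void demo_critical_section(void)
{
    demo_ops->irq_disable();
    printf("[%s] interrupts enabled: %d\n",
           demo_ops->name, demo_ops->irqs_enabled());
    demo_ops->irq_enable();
}
```

Switching backends is then a single assignment such as `demo_ops = &demo_hv_ops;` before any caller runs. The cost is that every operation becomes an indirect call.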

So, with this plan, I agreed to help with the merge. There are some performance issues, shown up by lmbench, from doing an indirect call where native hardware needs only a single instruction. We solve this by extending the infrastructure we already have for binary patching in the kernel. It turns out that the interrupt operations dominate other paravirtual-sensitive instructions by a couple of orders of magnitude, so patching them is sufficient.
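To make the patching idea concrete, here is a rough sketch of what replacing a call site with the native instruction could look like. It only rewrites bytes in an ordinary buffer and never executes them. The helper `demo_patch_site` and the NOP-padding policy are my assumptions for illustration, not the kernel's actual patching code; the only facts relied on are the x86 encodings of `cli` (0xFA), `sti` (0xFB) and `nop` (0x90).

```c
#include <stddef.h>
#include <string.h>

/* Single-byte x86 encodings. */
#define DEMO_X86_NOP 0x90
#define DEMO_X86_CLI 0xFA   /* disable interrupts */
#define DEMO_X86_STI 0xFB   /* enable interrupts */

/*
 * Hypothetical helper: overwrite the bytes of a call site with a shorter
 * native instruction sequence and pad the remainder with NOPs, so the
 * site keeps its original length.
 *
 * Returns 0 on success, or -1 if the replacement does not fit, in which
 * case the site is left untouched and the indirect call stays in place.
 */
static int demo_patch_site(unsigned char *site, size_t site_len,
                           const unsigned char *insn, size_t insn_len)
{
    if (insn_len > site_len)
        return -1;

    memcpy(site, insn, insn_len);
    memset(site + insn_len, DEMO_X86_NOP, site_len - insn_len);
    return 0;
}
```

For example, a six-byte site holding an indirect call (`FF 15` followed by a 32-bit address) would become `FA 90 90 90 90 90` after patching in `cli`. A real implementation also has to know where every such site is and must patch before the code can run concurrently; none of that is shown here.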

The kernel summit helped convince both the Xen and VMWare people that this approach was most likely to be merged, so we have a Mercurial patch queue and everyone who wants it has commit access. It's been pretty good for allowing everyone to hack away (after we sorted out some tools issues...).
