Fri, 22 Dec 2006

virtbench and lhype

So, with some help from Tony Breeds, virtbench now runs reasonably well.

The purpose of virtbench is to provide low-level benchmarks that virtualization developers can use to optimize Linux on their systems. It's designed to be hypervisor-agnostic and very simple to run (Tony is writing the scripts for Xen now). It runs a statically-linked standalone "virtclient" binary in each virtual machine, and the server binary coordinates the testing. At the moment it consists of 10 tests, but it will grow significantly as I test more things and get feedback. It also has a local mode, where each virtclient is launched as a process on the local machine (inter-guest tests are then TCP between processes rather than between machines).

So I compared lhype against local performance on my 1533MHz AMD Athlon. Since the benchmarks are mostly chosen to measure things where we expect virtualization to be slow, we expect to be measurably slower than native. Indeed, the naivety of lhype shows up almost immediately: I refused to optimize before this benchmarking.

Time for one context switch via pipe    108 times slower
Time for one Copy-on-Write fault          3 times slower
Time to exec client once                 26 times slower
Time for one fork/exit/wait              81 times slower
Time to walk random 64 MB                 1 times slower
Time to walk linear 64 MB                 1 times slower
Time to read from disk (16 kB)            2 times slower
Time for one disk read                    0 times slower
Time for one syscall                     35 times slower
Time for inter-guest pingpong             8 times slower
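
For a sense of what the "one syscall" and "context switch via pipe" rows measure, here is a minimal sketch in C. It is not virtbench's code: the function names, the choice of getppid() as the trivial system call, and the iteration counts are all assumptions made for illustration. The pipe test bounces one byte between a parent and a forked child, so each round trip costs roughly two switches; on a multi-core machine the two processes may land on different CPUs unless you pin them, in which case it measures wakeup latency more than a true context switch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average nanoseconds for one trivial system call (getppid is an assumption). */
static uint64_t time_syscall(unsigned long iters)
{
    assert(iters > 0);
    uint64_t start = now_ns();
    for (unsigned long i = 0; i < iters; i++)
        (void)getppid();
    return (now_ns() - start) / iters;
}

/* Average nanoseconds for one switch, measured as half a pipe round trip.
 * Returns 0 if the pipes or the fork could not be set up. */
static uint64_t time_pipe_switch(unsigned long iters)
{
    int to_child[2], to_parent[2];
    char byte = 'x';

    assert(iters > 0);
    if (pipe(to_child) != 0 || pipe(to_parent) != 0)
        return 0;

    pid_t pid = fork();
    if (pid < 0)
        return 0;
    if (pid == 0) {
        /* Child: echo every byte back until the parent closes its end. */
        close(to_child[1]);
        close(to_parent[0]);
        while (read(to_child[0], &byte, 1) == 1)
            if (write(to_parent[1], &byte, 1) != 1)
                break;
        _exit(0);
    }
    close(to_child[0]);
    close(to_parent[1]);

    uint64_t start = now_ns();
    for (unsigned long i = 0; i < iters; i++) {
        if (write(to_child[1], &byte, 1) != 1 ||
            read(to_parent[0], &byte, 1) != 1)
            break;
    }
    uint64_t elapsed = now_ns() - start;

    close(to_child[1]);   /* child sees EOF and exits */
    close(to_parent[0]);
    waitpid(pid, NULL, 0);
    return elapsed / (2 * iters);
}
```

Calling time_syscall(100000) and time_pipe_switch(1000) once natively and once inside a guest, then dividing the two results, gives a "times slower" figure of the kind shown in the table.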

The "disk read" cases are wrong, because disk reads aren't synchronous in lhype. The memory walks are about the same (we would only expect them to differ with overcommit), but the worst results are the context switch, the fork/exit/wait, and the syscall. The context switch slowness I expected, because lhype throws away all the pagetables when the toplevel changes (and then immediately faults them all back in). A better solution would be to keep track of multiple toplevels, or at least to pre-fault the mappings for the stack pointer and program counter back in. The system call overhead is much higher than I expected: it goes into the hypervisor and is reflected back out, but 35 times slower? Fortunately, this is the easiest to fix, too: we should point the system call trap straight at the guest. That means some trickier handling for the system call return (which could be done without a hypercall).
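
The "keep track of multiple toplevels" idea can be modelled in a few lines of C. This is a user-space sketch, not lhype code: the struct names, the cache size of four, and the least-recently-used eviction policy are all assumptions, and the `populated` counter merely stands in for a real shadow pagetable. The point is only that switching back to a toplevel seen recently finds its mappings intact instead of faulting everything in again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SHADOWS 4    /* hypothetical number of toplevels kept around */

struct shadow {
    uint32_t guest_toplevel;  /* guest address of its top-level pagetable, 0 = unused */
    uint64_t last_used;       /* logical time of the last switch to this slot */
    unsigned int populated;   /* stand-in for the entries faulted in so far */
};

struct shadow_cache {
    struct shadow slot[NR_SHADOWS];
    uint64_t clock;
};

/* Called when the guest changes its toplevel. Returns the shadow to use:
 * a cached one with its mappings intact, or the least recently used slot,
 * emptied and reassigned. */
static struct shadow *switch_toplevel(struct shadow_cache *c,
                                      uint32_t guest_toplevel)
{
    struct shadow *victim = &c->slot[0];

    assert(guest_toplevel != 0);
    c->clock++;
    for (size_t i = 0; i < NR_SHADOWS; i++) {
        struct shadow *s = &c->slot[i];
        if (s->guest_toplevel == guest_toplevel) {
            s->last_used = c->clock;
            return s;             /* hit: nothing to fault back in */
        }
        if (s->last_used < victim->last_used)
            victim = s;           /* remember the oldest slot seen so far */
    }
    /* Miss: throw away only the oldest shadow, not all of them. */
    victim->guest_toplevel = guest_toplevel;
    victim->last_used = c->clock;
    victim->populated = 0;
    return victim;
}
```

With two processes ping-ponging over a pipe, both toplevels stay cached, so the cost of a switch would no longer include refilling the pagetables from scratch. A real implementation would also have to invalidate a cached shadow when the guest modifies the pagetables it mirrors, which this sketch ignores.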
