Thu, 14 Jun 2007

Virtio I/O Draft III

So my work with trying to create a generic virtual I/O layer continues, with fascinating asides like the conversation with DaveM. My very first attempt (which never saw public release) was a low-level ring-buffer interface. My second attempt (draft I) was an interface to register input and output buffers and "used" pointers: in the "interrupt" you scanned the used pointers to see what had been used up.
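To make the draft I idea concrete, here is a minimal sketch of a "register buffers, then scan the used pointers" interface. It is not the actual draft I code: every name in it (`vdev`, `vdev_register`, `vdev_host_use`, `vdev_scan_used`) is made up for illustration, and the host side is simulated by a plain function call.

```c
#include <assert.h>
#include <stddef.h>

#define VDEV_MAX_BUFS 8

/* Hypothetical slot: a registered buffer plus the "used" count the host fills in. */
struct vdev_buf {
    void  *data;   /* NULL means the slot is free */
    size_t len;    /* capacity of the buffer */
    size_t used;   /* 0 while pending, bytes consumed once the host is done */
};

struct vdev {
    struct vdev_buf bufs[VDEV_MAX_BUFS];
};

/* Register a buffer; returns its slot index, or -1 if every slot is taken. */
static int vdev_register(struct vdev *d, void *data, size_t len)
{
    assert(data != NULL);
    for (int i = 0; i < VDEV_MAX_BUFS; i++) {
        if (d->bufs[i].data == NULL) {
            d->bufs[i].data = data;
            d->bufs[i].len = len;
            d->bufs[i].used = 0;
            return i;
        }
    }
    return -1;
}

/* Simulated host: mark a slot as used up. */
static void vdev_host_use(struct vdev *d, int slot, size_t used)
{
    d->bufs[slot].used = used;
}

/* The "interrupt": scan every used pointer, collect finished slots, free them. */
static int vdev_scan_used(struct vdev *d, int *done, int max)
{
    int n = 0;

    for (int i = 0; i < VDEV_MAX_BUFS && n < max; i++) {
        if (d->bufs[i].data != NULL && d->bufs[i].used != 0) {
            done[n++] = i;
            d->bufs[i].data = NULL;   /* slot may be registered again */
        }
    }
    return n;
}
```

Note that in this toy the scan reports slots in index order, whatever order the host finished them in.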

I liked this, because the virtio-using drivers looked much like normal drivers. Unfortunately, it had the fatal flaw that delivery wasn't in-order. After some toying around, I moved to a callback model (draft II): each buffer has an associated callback which gets called and says what length was used. To keep the locking sane, I moved the lock into the virtio subsystem: driver callbacks are called with the lock held. This is much less Linux-driver-like, but seemed to work.
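A rough sketch of the draft II shape, under heavy assumptions: the names (`vio`, `vio_add_buf`, `vio_complete`, `rx_done`) are hypothetical, and a plain `locked` flag stands in for the real spinlock so the example stays single-threaded. The point is only that the subsystem, not the driver, takes the lock before running the per-buffer callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VIO_MAX_BUFS 4

struct vio;
typedef void (*vio_callback)(struct vio *v, void *priv, size_t used_len);

struct vio_buf {
    vio_callback cb;   /* NULL means the slot is free */
    void *priv;        /* driver-private pointer handed back to the callback */
};

struct vio {
    bool locked;       /* stand-in for a real spinlock (assumption) */
    struct vio_buf bufs[VIO_MAX_BUFS];
};

/* Queue a buffer with its callback; returns the slot, or -1 if full. */
static int vio_add_buf(struct vio *v, vio_callback cb, void *priv)
{
    assert(cb != NULL);
    for (int i = 0; i < VIO_MAX_BUFS; i++) {
        if (v->bufs[i].cb == NULL) {
            v->bufs[i].cb = cb;
            v->bufs[i].priv = priv;
            return i;
        }
    }
    return -1;
}

/* Completion path: the subsystem takes the lock, then calls the driver. */
static void vio_complete(struct vio *v, int slot, size_t used_len)
{
    struct vio_buf buf = v->bufs[slot];

    assert(!v->locked);          /* re-entry would deadlock on a real spinlock */
    v->locked = true;
    v->bufs[slot].cb = NULL;     /* release the slot before calling out */
    buf.cb(v, buf.priv, used_len);
    v->locked = false;
}

/* Example driver callback: account for the bytes, note whether the lock was held. */
struct rx_stats {
    size_t bytes;
    bool saw_lock;
};

static void rx_done(struct vio *v, void *priv, size_t used_len)
{
    struct rx_stats *s = priv;

    s->bytes += used_len;
    s->saw_lock = v->locked;
}
```

The driver never touches the lock itself, which is what makes it look less like an ordinary Linux driver.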

Then I tried to NAPIfy the net driver, and the difference between virtio and "normal" hardware bit me on the ass. NAPI assumes that the interrupt and the information about what happened are separate: you can disable the interrupt and still poll for incoming packets. You can't do this if you're relying on callback "interrupts" to tell you about used buffers. So I switched to a "get_inbuf()" method: instead of the callback being passed information about the used buffers, it (or any other code) can ask for them one at a time.
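Here is a toy illustration of why the "ask for them one at a time" model suits NAPI. Only the name `get_inbuf` comes from the text above; its signature, the `vq` ring, and the other helpers are assumptions for the sketch, not the real virtio or NAPI API. The interrupt handler just masks further interrupts, and the poll loop pulls used buffers until it runs out of budget or work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VQ_RING_SIZE 16

struct vq_used {
    int id;        /* non-negative buffer id */
    size_t len;    /* bytes the host used */
};

struct vq {
    struct vq_used used[VQ_RING_SIZE];
    unsigned int head;     /* next slot the host writes */
    unsigned int tail;     /* next slot the driver reads */
    bool irq_enabled;
    unsigned int irqs;     /* "interrupts" actually delivered */
};

static void vq_init(struct vq *q)
{
    q->head = q->tail = 0;
    q->irq_enabled = true;
    q->irqs = 0;
}

/* "Interrupt" handler: mask further interrupts and let the poll loop take over. */
static void vq_interrupt(struct vq *q)
{
    q->irqs++;
    q->irq_enabled = false;
}

/* Simulated host: publish a used buffer, interrupting only if unmasked. */
static bool vq_host_complete(struct vq *q, int id, size_t len)
{
    if (q->head - q->tail == VQ_RING_SIZE)
        return false;                      /* ring full */
    q->used[q->head % VQ_RING_SIZE] = (struct vq_used){ id, len };
    q->head++;
    if (q->irq_enabled)
        vq_interrupt(q);
    return true;
}

/* Fetch the next used buffer, or -1 if none. Works with interrupts masked. */
static int get_inbuf(struct vq *q, size_t *len)
{
    if (q->tail == q->head)
        return -1;
    struct vq_used u = q->used[q->tail % VQ_RING_SIZE];
    q->tail++;
    *len = u.len;
    return u.id;
}

/* NAPI-style poll: consume up to budget buffers, unmask when the ring is drained. */
static int vq_poll(struct vq *q, int budget)
{
    int done = 0;
    size_t len;

    assert(budget > 0);
    while (done < budget && get_inbuf(q, &len) >= 0)
        done++;
    if (done < budget)
        q->irq_enabled = true;             /* ran dry: back to interrupt mode */
    return done;
}
```

With per-buffer callbacks as the only source of completion information, masking the interrupt would also hide the data; here the two are separate. A real driver would also have to recheck the ring after unmasking to close the race with the host, which this single-threaded sketch ignores.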

But now the draft-II centralized locking hit me: the net device's poll function stops the input callbacks, but the virtio output callback code will still try to grab the lock. So I reverted the lock centralization as well, and draft II has almost entirely vanished.

The moral here is that Linux driver infrastructure is optimized for real hardware: interrupts, status registers, DMA and such. If you're designing an I/O mechanism and you do something "alternative", your drivers won't fit the infrastructure: they'll be foreign-looking and complicated, and possibly buggy and sub-optimal as well.

