[K42-discussion] Linux Dynamic Upgrade
Andrew Baumann
andrewb at cse.unsw.edu.au
Tue Oct 3 14:36:53 EST 2006
Hi,
On Tuesday 03 October 2006 10:04, Christopher Yeoh wrote:
> Will Schmidt (LTC) has been looking at doing a Linux implementation of the
> K42 dynamic upgrade. For those who are interested, here's a forward of
> some email about it....
>
> From: Will Schmidt <will_schmidt at vnet.ibm.com>
[snip]
> This is a work in progress.. A sample of dynamic upgrade for Linux.
> This is loosely based on the papers written about K42's
> implementation.
It's awesome that you're trying this stuff out in Linux. We had thought about
this a fair bit before the LCA paper, but unfortunately never got any hacking
done. I'd like to help as much as I can; if you have any questions about
details of the K42 implementation etc, feel free to ask.
> The loopback code was chosen, as it seemed like it would be a
> straightforward place to get a demo going.
Agreed, I think a demo is the quickest way to show what can be done with
this approach. People are going to complain that the patch is ugly or has a
lot of hacks, but the point is that if you were going to add this kind of
thing properly, you would implement a lot of it at a lower level (e.g. in the
module system instead of hacking on top).
> To me, the most likely scenario will involve a bug being discovered, code
> getting fixed, modules being rebuilt, and then trying to load the new
> module on top of the old one...
That's one scenario, but it's also worth keeping in mind that one thing
you can do with this approach that you can't with hot patching (DKM, kprobes,
etc.) is what it was originally designed for in K42: hot-swapping. Because
you change the functions only for a specific instance of a module (e.g. a
single loopback filesystem, rather than all loopback filesystems), you can do
interesting adaptation and optimisation by swapping to a different
implementation on the fly.
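As a rough illustration of per-instance swapping, here is a minimal userspace
sketch. It is not K42 or Linux code: struct dev_ops, struct dev_instance,
hot_swap() and the two transfer functions are names invented for the example.
The only point is that each instance carries its own pointer to a function
table, so one instance can be switched without touching the others.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-instance operations table (not a real kernel type). */
struct dev_ops {
    int (*transfer)(int block);
};

/* Each instance holds its own ops pointer, so it can be swapped alone. */
struct dev_instance {
    const struct dev_ops *ops;
};

/* Two interchangeable implementations; the return value only identifies
 * which one ran. */
static int generic_transfer(int block) { (void)block; return 1; }
static int tuned_transfer(int block)   { (void)block; return 2; }

static const struct dev_ops generic_ops = { generic_transfer };
static const struct dev_ops tuned_ops   = { tuned_transfer };

/* Swap the implementation used by one instance only. A real kernel would
 * need synchronisation here; this sketch is single-threaded. */
static void hot_swap(struct dev_instance *d, const struct dev_ops *new_ops)
{
    assert(new_ops != NULL && new_ops->transfer != NULL);
    d->ops = new_ops;
}

/* All calls go through the instance's current table. */
static int do_transfer(struct dev_instance *d, int block)
{
    return d->ops->transfer(block);
}
```

With two instances that both start on generic_ops, calling
hot_swap(&a, &tuned_ops) changes what do_transfer(&a, ...) runs, while
do_transfer(&b, ...) still reaches generic_transfer().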
> And the real change comes next.. I've got two new _fops structures
> involved. The first is a preserved_lo_fops, which contains pointers
> back to the original lo_fops functions; and second is a switcher_fops,
> which points to a controlling function, which directs the calls between
> the new and old versions.
>
> For the switching logic, in this case I'm just using counters, with
> arbitrary threshold values, to determine when to call the new version of
> the function, and another random counter to trigger when to update the
> fops pointer to bypass the fops_switcher completely and call the 'new'
> functions directly. This is where some fancier RCU sort of code could
> be involved.
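To make the quoted scheme concrete, here is a minimal userspace sketch of a
counter-driven switcher. It is an assumption-laden stand-in, not the actual
patch: struct file_ops replaces the kernel's struct file_operations, the
thresholds USE_NEW_AFTER and BYPASS_AFTER are arbitrary, and new_fops,
active_fops and dev_read() are invented names. There is no locking or RCU.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct file_operations (one entry only). */
struct file_ops {
    int (*read)(void);
};

/* Return values only identify which version ran. */
static int old_read(void) { return 1; }
static int new_read(void) { return 2; }
static int switcher_read(void);

static const struct file_ops preserved_lo_fops = { old_read };
static const struct file_ops new_fops          = { new_read };
static const struct file_ops switcher_fops     = { switcher_read };

/* The table callers currently dispatch through. */
static const struct file_ops *active_fops = &switcher_fops;

/* Arbitrary thresholds, as in the quoted description. */
enum { USE_NEW_AFTER = 3, BYPASS_AFTER = 5 };
static int calls;

/* Controlling function: routes each call to the old or new version and
 * eventually repoints active_fops so later calls skip the switcher. */
static int switcher_read(void)
{
    calls++;
    if (calls > BYPASS_AFTER)
        active_fops = &new_fops;
    if (calls > USE_NEW_AFTER)
        return new_fops.read();
    return preserved_lo_fops.read();
}

/* Entry point used by callers. */
static int dev_read(void)
{
    assert(active_fops != NULL && active_fops->read != NULL);
    return active_fops->read();
}
```

In this sketch the first three calls reach old_read(), later calls reach
new_read(), and from the sixth call onward active_fops points straight at
new_fops.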
That works for now, but one of the key things we wanted to support was updates
that changed the data structures maintained by a module. I think that's one
of the main advantages of this approach over hot patching with kprobes. I
guess this kind of change isn't likely for the loop device, but have you
thought about that at all? It means you need to not just switch the functions
from the old code to the new versions, but in between doing the switch you
need to run a state transformer function to change the data structures of the
module, and to do that safely you need to know that none of the old functions
are executing.
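A minimal sketch of that ordering, in single-threaded userspace C. Everything
here is hypothetical: the old and new state layouts, the 512-byte sector size,
the active_calls field and try_upgrade() are made up for the example and are
not taken from K42 or from the patch. It only shows the sequence: check that no
old code is running, transform the state, then mark the instance as upgraded.

```c
#include <assert.h>

/* Hypothetical old and new data layouts for one module instance. */
struct old_state { int sectors; };
struct new_state { long bytes; int generation; };

struct module_inst {
    int active_calls;  /* calls currently executing in the old code */
    int version;       /* 1 = old code and layout, 2 = new */
    union {
        struct old_state o;
        struct new_state n;
    } st;
};

/* State transformer: build the new layout from the old one.
 * Assumes 512-byte sectors purely for illustration. */
static struct new_state transform(const struct old_state *o)
{
    struct new_state n;
    n.bytes = (long)o->sectors * 512;
    n.generation = 1;
    return n;
}

/* Returns 0 on success, -1 if old code is still running. */
static int try_upgrade(struct module_inst *m)
{
    assert(m->version == 1);
    if (m->active_calls != 0)
        return -1;                 /* not quiescent: unsafe to transform */
    struct new_state n = transform(&m->st.o);  /* copy before overwriting */
    m->st.n = n;
    m->version = 2;                /* new functions may now be used */
    return 0;
}
```

A real implementation would also have to stop new calls from entering the old
code between the quiescence check and the switch; this sketch ignores that.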
That is where things get tricky, because in Linux there isn't a clean
mechanism to tell when threads are executing in the module's code... the only
approach I've seen used is walking all the kernel stacks, which feels wrong
to me. What we discussed before LCA is adding an entry/exit counter to all
the exported functions of a module. Of course, those counters would need to
be in the base kernel before you tried loading an update.
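The entry/exit counter idea might look roughly like this. It is a userspace
sketch using C11 atomics, not kernel code: in_module, lo_work_exported(),
lo_do_work() and module_quiescent() are invented names, and a kernel version
would use the kernel's own atomic primitives.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-module count of calls currently inside the module. */
static atomic_int in_module;

/* Records what the counter looked like during a call (for demonstration). */
static int observed_during_call;

/* Hypothetical function body that does the real work. */
static int lo_do_work(int x)
{
    observed_during_call = atomic_load(&in_module);
    return x + 1;
}

/* Exported wrapper: count the call on entry, uncount it on exit. */
static int lo_work_exported(int x)
{
    atomic_fetch_add(&in_module, 1);
    int r = lo_do_work(x);
    int prev = atomic_fetch_sub(&in_module, 1);
    assert(prev > 0);              /* every exit must match an entry */
    return r;
}

/* True when no call is executing inside the module. */
static int module_quiescent(void)
{
    return atomic_load(&in_module) == 0;
}
```

Seeing the counter at zero is only meaningful if new entries are held off
while the update runs; otherwise a call can start right after the check.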
Again, it's really cool that you're trying this out in Linux. Let me know what
you think and if/how I can help.
Andrew