Sun, 17 Jun 2007

Talloc: References Considered Harmful?

So everyone knows I'm a big fan of talloc, the hierarchical allocator. The more you use it, the nicer it gets. So much so that I wouldn't code a new C project without it ("if it uses free(), it needs talloc").

A long-standing question in talloc circles has been how talloc_reference() should interact with talloc_free(). (See posts like this one to samba-technical.)

The normal problem runs like this: code A does "X = talloc(PARENT, ...); somefn(X); talloc_free(X)". This code should still be correct if somefn() uses talloc_reference() to hold a reference to X, and yet in general we don't know whether the "talloc_free()" meant to unlink X from PARENT or whatever somefn() attached it to. One solution is to insist that everyone use "talloc_unlink()" which explicitly says what parent to free the node from. But that's awkward, and can be difficult.
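To make the ambiguity concrete, here is a toy model in plain C. It is not talloc and not how talloc is implemented: struct node, node_add_owner() and node_unlink() are hypothetical names invented for this sketch. It only shows why an unlink that names its owner is unambiguous, where a bare "free" would have to guess which owner the caller meant.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OWNERS 4

/* Toy stand-in for an allocation that several owners can hold.
 * In talloc terms: one parent plus any number of references. */
struct node {
    const void *owners[MAX_OWNERS];
    size_t nowners;
    int destroyed;              /* set once the last owner lets go */
};

/* Record another owner (the original parent, or a later reference). */
void node_add_owner(struct node *n, const void *owner)
{
    assert(n->nowners < MAX_OWNERS);
    n->owners[n->nowners] = owner;
    n->nowners++;
}

/* Drop exactly the named owner. Returns 0 on success, -1 if that
 * owner was not holding the node. The node is only destroyed when
 * no owners remain. */
int node_unlink(struct node *n, const void *owner)
{
    for (size_t i = 0; i < n->nowners; i++) {
        if (n->owners[i] == owner) {
            n->nowners--;
            n->owners[i] = n->owners[n->nowners];
            if (n->nowners == 0)
                n->destroyed = 1;
            return 0;
        }
    }
    return -1;
}
```

With two owners (say PARENT and whatever somefn() attached X to), node_unlink(&x, parent) leaves the node alive for the other owner, and only the second unlink destroys it. A parent-less free has no such argument to go on, which is the whole problem.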

I came up with an algorithm which gets this "correct", in the sense that code using talloc_free() never frees X before code using talloc_unlink() would, and never after the worst-case talloc_unlink(). But one important feature of talloc's destructors is that they are reliable, unlike garbage collection: this algorithm means that X's destructor (which you might be relying on for cleanup) might not get called when you expect, depending on who else references it.

But this leads us back to examine the original case: why is "somefn()" taking a reference? If the object is going to be destroyed, as in the case of our code, presumably it should no longer be used by the somefn() logic anyway (some destructor should deregister it). And if that is the case, somefn() doesn't need a reference...
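A rough sketch of the "destructor deregisters it" alternative, again as a toy in plain C rather than real talloc code. Everything here (struct obj, obj_new(), obj_free(), the registry array) is made up for illustration; with real talloc the hook would be installed with talloc_set_destructor(). The point is that somefn() remembers the object without holding a reference, and the destructor removes it from somefn()'s bookkeeping when the caller frees it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy object with a destructor hook (hypothetical, not talloc). */
struct obj {
    int id;
    void (*destructor)(struct obj *);
};

#define MAX_REGISTERED 8
static struct obj *registry[MAX_REGISTERED];

/* Destructor: forget the object so somefn()'s logic never sees it again. */
static void deregister(struct obj *o)
{
    for (size_t i = 0; i < MAX_REGISTERED; i++)
        if (registry[i] == o)
            registry[i] = NULL;
}

/* Remember the object WITHOUT taking a reference; rely on the
 * destructor to drop it. Returns 0 on success, -1 if the table is full. */
int somefn(struct obj *o)
{
    assert(o != NULL);
    for (size_t i = 0; i < MAX_REGISTERED; i++) {
        if (registry[i] == NULL) {
            registry[i] = o;
            o->destructor = deregister;
            return 0;
        }
    }
    return -1;
}

struct obj *obj_new(int id)
{
    struct obj *o = malloc(sizeof(*o));
    if (o) {
        o->id = id;
        o->destructor = NULL;
    }
    return o;
}

/* Free always destroys, and always runs the destructor first. */
void obj_free(struct obj *o)
{
    if (!o)
        return;
    if (o->destructor)
        o->destructor(o);
    free(o);
}

/* How many objects somefn() currently knows about. */
int registered_count(void)
{
    int n = 0;
    for (size_t i = 0; i < MAX_REGISTERED; i++)
        if (registry[i] != NULL)
            n++;
    return n;
}
```

The caller's "X = obj_new(...); somefn(X); obj_free(X)" stays correct: the free happens exactly when the caller asks for it, the destructor runs at that moment, and nothing is left pointing at freed memory.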

I spoke briefly with Tridge about it, and he said "well, removing references does seem to remove bugs". So perhaps if you think you need a reference you should look deeper...

