[K42-discussion] race condition in clustered object call on unoptimized build
Raymond Fingas
fingas at eecg.toronto.edu
Sat Feb 11 08:54:20 EST 2006
Well, changing from a real object to a default object (or vice versa...)
is supposed to be a valid thing to do, even if one or more threads are
running. So that isn't really an option. I'm not sure it's a serious
problem, though, since the odds of hitting the race are low, there is an
easy workaround (an optimized build such as partDeb or noDeb), and there
is an obvious proper solution (modify the clustered object calling
algorithm to avoid multiple reads of the LTT). The proper solution may
well be possible by modifying the DREF macro, or it may require modifying
the compiler (ouch!). At least that is my thought... It's definitely a bug
in K42, though, and in one of the well-tested, expected-to-work places.
Here is an example of the problem that should be easy to understand.
Consider threads A and B, both running on the same VP. A is trying to call
a clustered object when it is interrupted by B, which calls the same
clustered object:
Thread A                              Thread B
--------                              --------
DREF(co)->foo()
  r11=*co        (ltt)
  r11=*r11       (this)
  r11=112(*r11)  (f'n pointer)
  mtctr r11
                                      DREF(co)->foo()
                                        r11=*co        (ltt)
                                        r11=*r11       (this)
                                        r11=112(*r11)  (f'n pointer)
                                        mtctr r11
                                        r11=*co        (ltt)
                                        r3=*r11        (this)
                                        bctrl
                                        (calls default object and
                                         installs local representative)
  r11=*co        (ltt)
  r3=*r11        (this)
  bctrl
  (calls default object with local
   representative this pointer)
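
To make the interleaving concrete, here is a rough, single-threaded
simulation of it in C++. This is only a sketch and not K42 code: Object,
ltt_entry, default_foo, local_foo and the call_* helpers are made-up
stand-ins, the vtable is reduced to one function pointer, and the
"interrupt" by B is modelled as a callback that runs between A's two
reads. It also sketches the single-read variant, on the assumption that
a second pass through the default object's miss handling is harmless.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for illustration; none of these are K42 names.
struct Object;
// A "method" reports whether it was handed a this pointer of its own kind.
using Method = bool (*)(Object *self);

struct Object {
    Method foo;       // stands in for the vtable slot loaded via 112(...)
    bool is_default;  // true for the default object, false for a local rep
};

static Object *ltt_entry = nullptr;  // one slot of the local translation table

static bool local_foo(Object *self) { return !self->is_default; }
static Object local_rep{local_foo, false};

// Miss handling: the default object installs the local representative.
static bool default_foo(Object *self) {
    bool matched = self->is_default;  // false if called with a local rep
    ltt_entry = &local_rep;
    return matched;
}
static Object default_obj{default_foo, true};

// Unoptimized-style call: the LTT slot is read twice, once to fetch the
// function pointer and once more to fetch the this pointer.
static bool call_double_read(const std::function<void()> &preempt) {
    Method fn = ltt_entry->foo;  // first read (function pointer -> ctr)
    preempt();                   // thread B may run here
    Object *self = ltt_entry;    // second read (this -> r3)
    return fn(self);
}

// Single-read call: function pointer and this come from the same load.
static bool call_single_read(const std::function<void()> &preempt) {
    Object *self = ltt_entry;    // only read of the LTT slot
    Method fn = self->foo;
    preempt();
    return fn(self);
}

static bool run_scenario(bool single_read) {
    ltt_entry = &default_obj;    // object not yet instantiated on this VP
    auto nothing = [] {};
    // Thread B performs a complete, uninterrupted call.
    auto thread_b = [&] { call_double_read(nothing); };
    bool matched = single_read ? call_single_read(thread_b)
                               : call_double_read(thread_b);
    assert(ltt_entry == &local_rep);  // local rep is installed either way
    return matched;
}

// Returns false: A runs the default object's code on a local rep pointer.
bool race_with_double_read() { return run_scenario(false); }
// Returns true: A's function pointer and this pointer stay consistent.
bool race_with_single_read() { return run_scenario(true); }
```

In this model race_with_double_read() reports the mismatch from the trace
above, while race_with_single_read() does not, because A keeps using the
pointer it loaded before B ran.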
On Thu, 9 Feb 2006, Dilma DaSilva wrote:
>
> This sounds like a nasty race condition.
>
> Could we mark the entry on the LTT as invalid when changing it from a
> real object to the default one, and only reuse entries after the
> generation count indicates that no "old thread" could be in the midst
> of checking?
>
> I'm probably missing something ... I wish I could jump into this now
> :-)