[K42-discussion] DirLinuxFSVolatile deadlock

Dilma DaSilva dilma at watson.ibm.com
Mon Sep 25 10:34:37 EST 2006


Patrick Bozeman writes:
 > 
 > A few things are wrong here.  The main reason I noticed this was that it
 > causes a deadlock.  Notice that frame 8 is NameTreeLinuxFS::_getStatus.  If
 > you look at the code for that function, it has previously called
 >     rc = DREF(rootDirLinuxRef)->lookup(pathName, namelen,
 >                        remainder, remainderLen,
 >                        dirLinuxRef, nhLock);
 > 
 > which performs a read lock on the name holder lock on behalf of
 > NameTreeLinuxFS and returns it as nhLock.  NameTreeLinuxFS releases this
 > lock a little ways after the call to DirLinuxFSVolatile::getStatus.
 > 
 > Of course, this leads to deadlock because the very next line after the
 > temporary assert grabs a write lock on the name holder.
 > 

Patrick, as I understand it, the frames in the stack trace you showed us
are not all operating on the same object.
_getStatus calls getStatus on an object, which ends up calling
eliminateStaleDir, which in turn calls locked_doDetachInvalidDir on
the PARENT directory. So the name holder lock that we're trying
to acquire for write is not the one already being held for read.
Am I missing something?
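
To make the distinction concrete, here is a toy sketch (not K42 code).
ToyRWLock, ToyDir and writeSucceedsWhileReading are hypothetical names
invented for illustration, and the lock is a single-threaded model
rather than K42's real lock type. It only shows that asking for a write
lock while holding a read lock blocks when both requests hit the same
lock object, and goes through when the write lock belongs to a
different object such as the parent directory.

```cpp
#include <cassert>

// Toy, single-threaded model of a reader/writer lock (an assumption,
// not K42's lock implementation).
struct ToyRWLock {
    int readers = 0;
    bool writer = false;

    bool tryRead() {
        if (writer) return false;
        ++readers;
        return true;
    }
    void releaseRead() { assert(readers > 0); --readers; }

    // A writer needs exclusive access: no readers and no other writer.
    bool tryWrite() {
        if (writer || readers > 0) return false;
        writer = true;
        return true;
    }
    void releaseWrite() { assert(writer); writer = false; }
};

// Hypothetical directory object carrying its own name holder lock.
struct ToyDir {
    ToyRWLock nhLock;
    ToyDir *parent = nullptr;
};

// Models the sequence under discussion: the caller holds a read lock on
// `held`, then a deeper frame asks for a write lock on `target`.
// Returns true when the write lock is available, false when the request
// would have to wait on the read lock that is still held.
bool writeSucceedsWhileReading(ToyDir &held, ToyDir &target) {
    bool gotRead = held.nhLock.tryRead();
    (void)gotRead;
    bool gotWrite = target.nhLock.tryWrite();
    if (gotWrite) target.nhLock.releaseWrite();
    held.nhLock.releaseRead();
    return gotWrite;
}
```

In this model, holding the read lock on a child and write-locking its
parent succeeds, while read-locking and write-locking the same ToyDir
does not. Which of the two cases the real stack trace corresponds to is
exactly the question above.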

 > On a related note, it seems that children.remove(ObjRef, NameHolderInfo) is
 > leaking memory.  In DentryListHash::remove(name, len, nhi) the hash
 > entry is deleted at the end, but in DentryListHash::remove(ref, nhi) it
 > is not.

I'm looking at DentryList.C (version 1.21), and I see that
DentryListHash::remove(ref, nhi) does delete the entry, at line 337,
at the point where the entry is found.
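
As a rough illustration of the pattern (again a toy, not the code in
DentryList.C), both removal paths below free the entry, one by name and
one by ref at the point where the match is found. ToyEntry,
ToyDentryList and the live counter are hypothetical, added only so that
a leak would show up as a nonzero count.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical entry type; `live` counts entries not yet freed.
struct ToyEntry {
    static int live;
    std::string name;
    int ref;
    ToyEntry(std::string n, int r) : name(std::move(n)), ref(r) { ++live; }
    ~ToyEntry() { assert(live > 0); --live; }
};
int ToyEntry::live = 0;

// Toy stand-in for a dentry list that owns its entries.
struct ToyDentryList {
    std::list<ToyEntry *> entries;

    void add(const std::string &name, int ref) {
        entries.push_back(new ToyEntry(name, ref));
    }

    // Remove by name: unlink the entry and free it.
    bool removeByName(const std::string &name) {
        for (auto it = entries.begin(); it != entries.end(); ++it) {
            if ((*it)->name == name) {
                delete *it;
                entries.erase(it);
                return true;
            }
        }
        return false;
    }

    // Remove by ref: free the entry right where it is found.
    bool removeByRef(int ref) {
        for (auto it = entries.begin(); it != entries.end(); ++it) {
            if ((*it)->ref == ref) {
                delete *it;
                entries.erase(it);
                return true;
            }
        }
        return false;
    }

    ~ToyDentryList() {
        for (ToyEntry *e : entries) delete e;
    }
};
```

If removeByRef skipped the delete, ToyEntry::live would stay above zero
after the list was emptied, which is the kind of leak being asked about.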

Thanks!

dilma


