[K42-discussion] lib-emu header patch (has become procfs and ProcessLinuxServer interaction)

Patrick Bozeman PEBozeman at lbl.gov
Thu Sep 28 08:01:35 EST 2006


It sounds like you might not like the direction I have gone with the 
process linux FD work.  Let me know if you think I should approach this 
differently.

The following discusses the high-level requirements for process-specific 
fd support as needed by procfs.  I then talk about two alternate 
designs, and then provide an example snippet from the approach I ended 
up taking.  Note that I'll be happy to change to another design if you 
guys think another approach would be better.

At a high level, I need to do 3 basic things with procfs fd support.
    1) Enumerate the fds within a process (to figure out what dirents
       need to be in the /proc/<pid>/fd directory).

    2) Make sure that an fd is still valid so that I know when to tickle
       the cache validation logic.

    3) Look up the client-specific path associated with a particular
       fd (keeping hardlinks and file/dir renames in mind).

Since procfs is running as a separate server, I need a way to do this 
for arbitrary processes in spite of the fact that the bulk of the 
information about FDs is stored in the per-process _FD static object.  
For example, if I am populating the dirents for /proc/320/fd, I need to 
make a cobj call to the process linux client associated with pid 320.

My original thinking, which I didn't implement, was that I would ask the 
ProcessLinuxServer to return ProcessLinux references that I would make 
the proc related calls against.  This would have included the FD 
operations above, but also things like getName(), getEnvironment(), 
etc.  I wrote some initial ProcessLinuxServer iteration code that 
returned process linux references rather than pids.  I had some problems 
using the references I was being returned and asked the k42 discuss list 
for help.  In that thread, I was told that the Watson guys talked about 
returning ProcessLinux references to proc and that they were against 
it.  They advocated the second approach, which I ended up using, and 
which I describe next.

The second option, the one I implemented, is to route process related 
calls through the ProcessLinuxServer rather than returning process 
references to procfs.  This leads to calls like,

class ProcessLinuxServer {
    <snip>

    virtual SysStatus _getName(__in pid_t about,
                               __outbuf(*:nameLen) char *name,
                               __in uval nameLen, __XHANDLE xhandle);

    virtual SysStatus _getFirstFD(__in pid_t about, __inout uval& fd,
                                  __XHANDLE xhandle);

    virtual SysStatus _getNextFD(__in pid_t about, __inout uval& fd,
                                 __XHANDLE xhandle);

    virtual SysStatus _getFDStatus(__in pid_t about, __in uval fd,
                                   __out FileLinux::Stat& stat,
                                   __XHANDLE xhandle);

    virtual SysStatus _getFDPath(__in pid_t about, __in uval fd,
                                 __outbuf(__rc:bufsize) char *buf,
                                 __in uval bufsize, __XHANDLE xhandle);


    <snip>
}


being handled by ProcessLinuxServer.  The call stack after one of these 
calls looks as follows (well, it's not really a call stack since there 
are 3 cobj calls here, but you get the idea):

    ProcessLinuxServer::_getFDStatus(__in pid_t about, __in uval fd,
                                     __out FileLinux::Stat& stat,
                                     __XHANDLE xhandle)
       // lookup pid and get client stub
       // stub is for a ProcessLinuxClient
       stub._getFDStatus(fd, stat);
          _FD::GetStatus(fd, stat);
             FileTable->getStatus(fd, stat);
                // lookup fref by fd
                DREF(fref)->getStatus(&stat);
               
         
I follow the same pattern for all the other info gathered by proc, i.e. 
procfs makes an info or enumeration call to ProcessLinuxServer with a 
pid plus the arguments ProcessLinuxServer will use to call the 
corresponding ProcessLinuxClient call.  ProcessLinuxServer converts the 
pid to a stub, makes the call to the stub, and then ProcessLinuxClient 
performs the requested operation, usually resulting in more cobj calls.
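
As a rough illustration of how procfs could drive the enumeration calls 
in this pattern, here is a minimal, self-contained sketch.  Everything 
in it is a stand-in: MockProcessLinuxServer, listFdDirents and the 
simplified SysStatus/uval typedefs are hypothetical, and the first/next 
semantics (lowest open fd first, failure once the fds are exhausted) 
are an assumption, not something the interface above specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Simplified stand-ins for K42 types (assumptions, not the real definitions).
using SysStatus = long;          // assumed: negative means failure
using uval = std::uintptr_t;
using Pid = int;

// Hypothetical mock of the server side: it keeps a set of open fds per pid.
class MockProcessLinuxServer {
public:
    void addFD(Pid pid, uval fd) { fds_[pid].insert(fd); }

    // Assumed semantics: store the lowest open fd in `fd`, fail if there is none.
    SysStatus getFirstFD(Pid about, uval& fd) const {
        auto it = fds_.find(about);
        if (it == fds_.end() || it->second.empty()) return -1;
        fd = *it->second.begin();
        return 0;
    }

    // Assumed semantics: replace `fd` with the next open fd above it,
    // fail once the end is reached.
    SysStatus getNextFD(Pid about, uval& fd) const {
        auto it = fds_.find(about);
        if (it == fds_.end()) return -1;
        auto next = it->second.upper_bound(fd);
        if (next == it->second.end()) return -1;
        fd = *next;
        return 0;
    }

private:
    std::map<Pid, std::set<uval>> fds_;
};

// Build the entry names for /proc/<pid>/fd by walking first/next.
std::vector<std::string> listFdDirents(const MockProcessLinuxServer& server,
                                       Pid pid) {
    std::vector<std::string> names;
    uval fd = 0;
    for (SysStatus rc = server.getFirstFD(pid, fd); rc >= 0;
         rc = server.getNextFD(pid, fd)) {
        names.push_back(std::to_string(fd));
    }
    return names;
}

// Usage: three open fds in pid 320 become three directory entries.
void exampleUsage() {
    MockProcessLinuxServer server;
    server.addFD(320, 0);
    server.addFD(320, 1);
    server.addFD(320, 7);
    assert(listFdDirents(server, 320).size() == 3);
}
```

The caller only ever holds a pid and an fd cursor, so procfs never needs 
a ProcessLinux reference of its own.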

On a related note, I have an issue with all of these kinds of calls 
which I was going to ask about once the pre-requisite patches had been 
approved.  However, since we have jumped forward to this code, I might 
as well ask it now.

I can't use an auto lock in these kinds of calls because after I call 
the client's doWhatever function, one of the later cobj objects 
inevitably tries to call back to the ProcessLinuxServer with a call to 
getCredsPointer, which also tries to lock the task lock.  So, I'm 
explicitly dropping the taskLock prior to calling the ProcessLinuxClient.

I am hoping that by calling stub.setOH() the cobj layer will 
guarantee that my object won't be destroyed until this thread has 
completed and I return from stub._doWhatever().  Hopefully, this is 
where the K42 existence locks come into play, and I am safe to release 
the taskLock here.  Unfortunately, I still know very little about this 
aspect of K42.

/* virtual */ SysStatus
ProcessLinuxServer::_getFDStatus(__in pid_t about, __in uval fd,
                                 __out FileLinux::Stat& stat,
                                 __XHANDLE xhandle)
{
    task_struct *linux_task;

    // FIXME: this is a temporary hack until I figure out what to do.
    // The problem is that when we call stub._getFDStatus below,
    // ProcessLinuxServer is likely to be reentered via a call to
    // getCredsPointer, at which point we deadlock.
    taskLock.acquire();

    if (about) {
        linux_task = locked_find_task_by_pid(about);
    } else {
        linux_task = (task_struct*)(XHandleTrans::GetClientData(xhandle));
    }

    // not a Linux process
    if (!linux_task) {
        // FIXME: allocate error code
        taskLock.release();
        return _SERROR(10100, 0, ESRCH);
    }

    StubProcessLinuxClient stub(StubObj::UNINITIALIZED);
    if (!linux_task->callbackOH.valid()) {
        // FIXME: allocate error code
        taskLock.release();
        return _SERROR(10101, 0, ESRCH);
    }

    stub.setOH(linux_task->callbackOH);
    taskLock.release();

    return stub._getFDStatus(fd, stat);
}
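
The locking shape of the function above (look up under the lock, copy 
out what is needed, release, then make the outgoing call) can be 
sketched in standard C++.  This is only an analogy under stated 
assumptions: MockServer, Handle, getCreds and the callback parameter are 
hypothetical stand-ins, std::mutex stands in for taskLock, and the 
sketch says nothing about whether the cobj layer keeps the target object 
alive after the lock is dropped.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>

using Pid = int;
using Handle = int;  // hypothetical stand-in for an object handle like callbackOH

class MockServer {
public:
    void addTask(Pid pid, Handle handle) {
        std::lock_guard<std::mutex> guard(taskLock_);
        tasks_[pid] = handle;
    }

    // The kind of call a client makes back into the server while it is
    // servicing a request.  It takes the same (non-recursive) lock.
    int getCreds(Pid pid) {
        std::lock_guard<std::mutex> guard(taskLock_);
        return tasks_.count(pid) ? 1000 + pid : -1;
    }

    // Routed call: the lock covers only the lookup, never the client call.
    int getFDStatus(Pid about, const std::function<int(Handle)>& clientCall) {
        Handle handle = 0;
        {
            std::lock_guard<std::mutex> guard(taskLock_);
            auto it = tasks_.find(about);
            if (it == tasks_.end()) return -1;  // no such process
            handle = it->second;                // copy out while locked
        }  // guard destroyed here, so the lock is released on every path
        // Safe to re-enter getCreds() from inside clientCall now.
        return clientCall(handle);
    }

private:
    std::mutex taskLock_;
    std::map<Pid, Handle> tasks_;
};

// Usage: the client callback re-enters the server without deadlocking.
int exampleReentrantCall() {
    MockServer server;
    server.addTask(320, 7);
    return server.getFDStatus(320, [&server](Handle h) {
        return server.getCreds(320) + h;
    });
}
```

Confining the guard to an inner block keeps the convenience of an auto 
lock (no release is forgotten on the error returns) while still letting 
go of the lock before the client is called.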

p.s. The lookup and validation code in that function is duplicated in 
several places (in each of my new calls) and I plan on refactoring it 
prior to submitting the patch, so expect the submitted patch to 
look a little different.
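
One possible shape for that refactoring, shown as a hedged sketch rather 
than the actual patch: a single helper performs the lookup and 
validation, and each routed call reduces to "resolve, then forward".  
The names lookupClientOH, MockServer and the toy ObjectHandle are 
hypothetical, the plain negative errno return replaces _SERROR, and the 
arithmetic in the two calls is a placeholder for the real stub calls.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <mutex>

using SysStatus = long;  // assumed: negative means failure
using Pid = int;

// Toy stand-in for an object handle; id < 0 means "not valid".
struct ObjectHandle {
    int id;
    explicit ObjectHandle(int i = -1) : id(i) {}
    bool valid() const { return id >= 0; }
};

class MockServer {
public:
    void addTask(Pid pid, ObjectHandle oh) {
        std::lock_guard<std::mutex> guard(taskLock_);
        tasks_[pid] = oh;
    }

    // Shared lookup and validation, written once.
    SysStatus lookupClientOH(Pid about, ObjectHandle& oh) {
        std::lock_guard<std::mutex> guard(taskLock_);
        auto it = tasks_.find(about);
        if (it == tasks_.end()) return -ESRCH;     // not a Linux process
        if (!it->second.valid()) return -ESRCH;    // no callback registered
        oh = it->second;
        return 0;
    }

    // Each routed call is now just: resolve the handle, then forward.
    SysStatus getFDStatus(Pid about, int fd, int& statOut) {
        ObjectHandle oh;
        SysStatus rc = lookupClientOH(about, oh);
        if (rc < 0) return rc;
        statOut = oh.id * 100 + fd;  // placeholder for stub._getFDStatus(fd, stat)
        return 0;
    }

    SysStatus getFirstFD(Pid about, int& fd) {
        ObjectHandle oh;
        SysStatus rc = lookupClientOH(about, oh);
        if (rc < 0) return rc;
        fd = 0;                      // placeholder for stub._getFirstFD(fd)
        return 0;
    }

private:
    std::mutex taskLock_;
    std::map<Pid, ObjectHandle> tasks_;
};

// Usage: a valid task resolves, an unknown pid fails in the shared helper.
void exampleUsage() {
    MockServer server;
    server.addTask(320, ObjectHandle(3));
    int stat = 0;
    assert(server.getFDStatus(320, 4, stat) == 0);
    assert(server.getFDStatus(1, 4, stat) == -ESRCH);
}
```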






