[K42-discussion] Debugging help

David Tam tamda at eecg.toronto.edu
Thu Nov 10 05:40:24 EST 2005


I need some debugging help and I'm wondering if anyone has any clue as to
what my bug might be.

I've been attempting to run SPECjbb2000 + J9 JVM on K42 with my user-level
thread migration patch enabled.  My kitchsrc was last updated on Nov 2nd.

Everything runs fine on a 4-CPU system (k10), but I hit a gdb
breakpoint in the kernel when running on an 8-CPU system (k0 with only 8
CPUs enabled).

The k42 console reports the following message ~145 times and then
hits a gdb breakpoint in the kernel.

	Giving back 0x10 pages (Y > 0x80000)

where Y ranges from 0x80001 to 0x80005 inclusive.

gdb tells me that I triggered the assert in FRPA::startPutPage()
because rc=0x800000000b2c0110.

FRPA::startPutPage() {
...
..
.
    // FIXME, pass in blocking info here...
    rc = convertAddressWriteTo(physAddr, addr, rr);
    tassertMsg(_SUCCESS(rc), "rc 0x%lx\n", rc);
...
..
.
}

Upon further investigation of the source code of convertAddressWriteTo(),
I find that it always returns 0.
Therefore, it should be impossible for the tassertMsg() to be triggered.

virtual SysStatus convertAddressWriteTo(uval physAddr, uval &vaddr,
                                        IORestartRequests *rr=0) {
    vaddr = physAddr;
    return 0;
}
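
One caveat on the reasoning above: convertAddressWriteTo() is declared
virtual, so the body quoted here runs only if the dynamic type of the
object does not override it. The following is a minimal sketch of that
effect, not K42 code. The class names BaseFR and DerivedFR, the typedefs,
and the error value -1 are all hypothetical and chosen for illustration.

```cpp
#include <cstdint>

// Assumptions for this sketch only: a signed 64-bit status type and an
// unsigned 64-bit machine word, standing in for SysStatus and uval.
typedef std::int64_t  SysStatus;
typedef std::uint64_t uval;

// Hypothetical base class with the same shape as the method quoted above.
struct BaseFR {
    virtual ~BaseFR() {}

    virtual SysStatus convertAddressWriteTo(uval physAddr, uval &vaddr) {
        vaddr = physAddr;
        return 0;               // this body can never fail
    }

    // The caller lives in the base class but dispatches virtually.
    SysStatus startPutPage(uval physAddr) {
        uval addr = 0;
        return convertAddressWriteTo(physAddr, addr);
    }
};

// Hypothetical subclass whose override can report an error.
struct DerivedFR : BaseFR {
    SysStatus convertAddressWriteTo(uval, uval &) override {
        return -1;              // made-up failure code
    }
};

// Calls through a base reference, as startPutPage() does through `this`.
SysStatus callThroughBase(BaseFR &fr, uval physAddr) {
    return fr.startPutPage(physAddr);
}
```

With a BaseFR object, callThroughBase() returns 0. With a DerivedFR
object the same call returns the negative value, even though the code
being stepped through is in the base class. In gdb, "set print object on"
followed by "print *this" in the startPutPage() frame should show the
dynamic type of the object.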



Perhaps there is memory corruption caused by my changes to the
user-level scheduler code (kitchsrc/lib/libc/scheduler/*)?

Any guesses, hints, or suggestions are welcome.
Thanks.



=========

Here is some more information about that frame.

(gdb) info frame
Stack level 3, frame at 0xc000000002ef4860:
 pc = 0xc00000000221eafc
    in FRPA::startPutPage(unsigned long, unsigned long, IORestartRequests*)
    (/homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/FRPA.C:136); 
    saved pc 0xc0000000022c3b58
 called by frame at 0xc000000002ef4910, caller of frame at 0xc000000002ef47d0
 source language c++.
 Arglist at 0xc000000002ef4860, args: this=0x8002000020623600, 
    physAddr=877690880, objOffset=5795840, rr=0x80020000208c6d80
 Locals at 0xc000000002ef4860, Previous frame's sp in r1
 Saved registers:
  r30 at 0xc000000002ef4900, r31 at 0xc000000002ef4908,
  lr at 0xc000000002ef4920
(gdb) 


Local variables:
(gdb) info local
size = 4096
addr = 11460608
rc = -9223372036667342576
(gdb) 
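
The two renderings of rc (0x800000000b2c0110 on the console and
-9223372036667342576 in gdb) are the same 64-bit pattern read as unsigned
and as signed. The sketch below checks that, and also checks that the
value fails the (rc) >= 0 test shown as failedexpr in the backtrace. The
helper names asSigned, isSuccess, and checkReportedValue are made up for
this example and are not K42 identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Reinterpret the raw pattern printed with "%lx" as a signed 64-bit value,
// which is how gdb displays rc.
constexpr std::int64_t asSigned(std::uint64_t raw) {
    return static_cast<std::int64_t>(raw);
}

// Mirrors the failed expression reported by the assert: (rc) >= 0.
constexpr bool isSuccess(std::int64_t rc) {
    return rc >= 0;
}

// Confirms that the console value and the gdb value agree, and that the
// value is treated as a failure because its top bit is set.
void checkReportedValue() {
    const std::int64_t rc = asSigned(0x800000000b2c0110ULL);
    assert(rc == -9223372036667342576LL);
    assert(!isSuccess(rc));
}
```

So the assert fires only because bit 63 of rc is set. The same locals
also show addr (11460608) differing from physAddr (877690880), whereas
the body of convertAddressWriteTo() quoted above would have made them
equal, assuming gdb is reporting current values for both.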

A gdb "backtrace" reports the following:
(gdb) bt
#0  breakpoint () at libksup.C:49
#1  0xc0000000023b0ae4 in raiseError() ()
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/lib/libc/sys/TAssert.C:50
#2  0xc0000000023b0c48 in errorWithMsg(char const*, char const*, unsigned long, char const*, ...) (
    failedexpr=0xc000000002474b98 "(__builtin_expect(((rc)>=0),1))", 
    fname=0xc000000002474bb8 "/homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/FRPA.C", lineno=136, fmt=0xc000000002474d10 "rc 0x%lx\n")
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/lib/libc/sys/TAssert.C:108
#3  0xc00000000221eafc in FRPA::startPutPage(unsigned long, unsigned long, IORestartRequests*) (this=0x8002000020623600, physAddr=877690880, 
    objOffset=5795840, rr=0x80020000208c6d80)
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/FRPA.C:136
#4  0xc0000000022c3b58 in FSFRSwap::startPutPage(unsigned long, FRComputation**, unsigned long, unsigned long&, unsigned long volatile*, IORestartRequests*) (
    this=0xc00000000549b300, physAddr=877690880, ref=0x8000000010008f20, 
    offset=11460608, blockID=@0xc000000002ef4a30, context=0x8002000000328070, 
    rr=0x80020000208c6d80)
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/bilge/FSFRSwap.C:227
#5  0xc00000000222414c in FRComputation::putPageInternal(unsigned long, unsigned long, unsigned long, IORestartRequests*) (this=0x8002000000328000, 
    physAddr=877690880, offset=11460608, async=1, rr=0x80020000208c6d80)
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/FRComputation.C:200
#6  0xc00000000222431c in FRComputation::startPutPage(unsigned long, unsigned long, IORestartRequests*) (this=0x8002000000328000, physAddr=877690880, 
    offset=11460608, rr=0x80020000208c6d80)
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/FRComputation.C:242
#7  0xc0000000021b0788 in FCMDefault::resumeIO() (this=0x8002000020811a00)
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/FCMDefault.C:808
#8  0xc0000000021b2424 in IORestartRequests::notify() (this=0x80020000208c6d80)
    at IORestartRequests.H:103
#9  0xc000000002230ee8 in IORestartRequests::NotifyAll(IORestartRequests*) (
    qcopy=0x0) at IORestartRequests.H:119
#10 0xc000000002240874 in KernelPagingTransport::ioComplete() (
    this=0x800200000030a400)
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/KernelPagingTransport.C:171
#11 0xc00000000221cd48 in FRVA::_ioComplete(unsigned long, unsigned long, long)
    (this=0x8002000020623600, vaddr=1100586164224, fileOffset=5541888, rc=0)
    at /homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/FRVA.C:114
#12 0xc000000002250148 in XFRVA::__ioCompleteEmm(unsigned long) (
    this=0x8002000000406f00, callerID=4294967301) at XFRVA.C:130
#13 0xc00000000239a018 in DispatcherDefault_InvokeXObjMethod ()
    at CObjRootMediator.H:102
#14 0xc000000002399eb0 in DispatcherDefault_PPCServerOnThread ()
    at CObjRootMediator.H:102
(gdb) 

=============================

k42console output
-----------------
	Giving back 0x10 pages (0x80001 > 0x80000)
	Giving back 0x10 pages (0x80002 > 0x80000)
	Giving back 0x10 pages (0x80003 > 0x80000)
	Giving back 0x10 pages (0x80004 > 0x80000)
	Giving back 0x10 pages (0x80004 > 0x80000)
...
..
.
(~145 times)
	Giving back 0x10 pages (0x80001 > 0x80000)
	Giving back 0x10 pages (0x80001 > 0x80000)
	Giving back 0x10 pages (0x80001 > 0x80000)
	Giving back 0x10 pages (0x80001 > 0x80000)
ERROR: file "/homes/kix/tamdavid/k42-20050520/kitchsrc/os/kernel/mem/FRPA.C", line 136
rc 0x800000000b2c0110
GDB got trap: Program Interrupt
vector=0x700, sr=0xa00000000002b032, pc=0xc0000000022afb34 lr=0xc0000000023b0ae4
Kernel Connecting to GDB via thinwire channel
(use kvictim to find gdb target machine and port)


-- 
David Tam <tamda at eecg.toronto.edu>
Graduate Student, ECE Dept, University of Toronto
http://www.eecg.toronto.edu/~tamda



