From trini at kernel.crashing.org  Wed Feb  1 02:08:24 2006
From: trini at kernel.crashing.org (Tom Rini)
Date: Tue, 31 Jan 2006 08:08:24 -0700
Subject: Maple fails to boot current git
In-Reply-To: <1138679592.4934.1.camel@localhost.localdomain>
References: <20060130171759.GE22672@smtp.west.cox.net>
	<20060130231118.GA19671@localhost.localdomain>
	<1138679592.4934.1.camel@localhost.localdomain>
Message-ID: <20060131150824.GO22672@smtp.west.cox.net>

On Tue, Jan 31, 2006 at 02:53:11PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2006-01-31 at 12:11 +1300, David Gibson wrote:
> > On Mon, Jan 30, 2006 at 10:17:59AM -0700, Tom Rini wrote:
> > > Hello, trying to boot my maple board (ppc64_defconfig +
> > > CONFIG_PPC_EARLY_DEBUG_MAPLE=y) fails as follows (the "dirty" is
> > > #define DEBUG in kernel/prom_parse.c and platforms/maple/time.c):
> > 
> > Crud.  Our Maple is stuffed at the moment (doesn't complete the CPU
> > init script, so PIBS never even comes up on the 970), so I can't
> > really investigate.
> 
> Well, the RTC problem definitely looks like a bogus or missing "ranges"
> property or the fact that the parser doesn't recognize "ht" as a PCI
> bus. You may want to try updating prom_parse.c to treat "ht" as a PCI
> bus and see if that helps.

With the following, I get parent bus is pci now, but still:
OF: ** translation for device /ht at 0/isa at 4/rtc at 900 **
OF: bus is isa (na=2, ns=1) on /ht at 0/isa at 4
OF: translating address: 00000001 00000900
OF: parent bus is pci (na=3, ns=2) on /ht at 0
OF: walking ranges...
OF: not found !
Maple: Unable to translate RTC address
Maple: No device node for RTC, assuming legacy address (0x70)

diff --git a/arch/powerpc/kernel/prom_parse.c b/arch/powerpc/kernel/prom_parse.c
index a8099c8..6006201 100644
--- a/arch/powerpc/kernel/prom_parse.c
+++ b/arch/powerpc/kernel/prom_parse.c
@@ -1,4 +1,4 @@
-#undef DEBUG
+#define DEBUG
 
 #include <linux/kernel.h>
 #include <linux/string.h>
@@ -113,8 +113,10 @@ static unsigned int of_bus_default_get_f
 
 static int of_bus_pci_match(struct device_node *np)
 {
-	/* "vci" is for the /chaos bridge on 1st-gen PCI powermacs */
-	return !strcmp(np->type, "pci") || !strcmp(np->type, "vci");
+	/* "vci" is for the /chaos bridge on 1st-gen PCI powermacs, "ht"
+	 * is the maple board. */
+	return !strcmp(np->type, "pci") || !strcmp(np->type, "vci") ||
+		!strcmp(np->type, "ht");
 }
 
 static void of_bus_pci_count_cells(struct device_node *np,
@@ -239,6 +241,16 @@ static struct of_bus of_busses[] = {
 		.translate = of_bus_pci_translate,
 		.get_flags = of_bus_pci_get_flags,
 	},
+	/* HT */
+	{
+		.name = "ht",
+		.addresses = "assigned-addresses",
+		.match = of_bus_pci_match,
+		.count_cells = of_bus_pci_count_cells,
+		.map = of_bus_pci_map,
+		.translate = of_bus_pci_translate,
+		.get_flags = of_bus_pci_get_flags,
+	},
 	/* ISA */
 	{
 		.name = "isa",
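
For readers following the trace above, here is a rough userspace sketch of
what one "walking ranges" step amounts to. It is not the prom_parse.c code:
struct range_entry, translate_one_level() and XLATE_FAIL are made up for
illustration, and the cell layout is flattened into plain integers. The
space value 1 is taken from the first cell of the address in the trace
(00000001), which appears to select I/O space in the ISA binding.

```c
#include <stddef.h>
#include <stdint.h>

#define XLATE_FAIL UINT64_MAX   /* hypothetical "not found" marker */

/* One flattened entry of a "ranges" table (illustrative layout only). */
struct range_entry {
	uint32_t space;        /* child address-space flag, e.g. 1 for I/O */
	uint32_t child_base;   /* start of the window in the child bus */
	uint64_t parent_base;  /* where that window lands in the parent bus */
	uint32_t size;         /* window length in bytes */
};

/*
 * Translate one child address through one level of ranges.
 * Returns the parent address, or XLATE_FAIL if no entry covers it.
 */
uint64_t translate_one_level(const struct range_entry *ranges, size_t n,
			     uint32_t space, uint32_t addr)
{
	for (size_t i = 0; i < n; i++) {
		const struct range_entry *r = &ranges[i];

		if (r->space != space)
			continue;               /* wrong address space */
		if (addr < r->child_base)
			continue;               /* below the window */
		if (addr - r->child_base >= r->size)
			continue;               /* above the window */
		return r->parent_base + (addr - r->child_base);
	}
	return XLATE_FAIL;                      /* "not found" */
}
```

With a single hypothetical entry mapping I/O offsets 0..0xffff onto an
arbitrary parent base of 0xf4000000, translate_one_level(table, 1, 1, 0x900)
yields 0xf4000900. With an empty table, or no entry for that space, it
returns XLATE_FAIL, which is the situation the "not found !" line reports.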


-- 
Tom Rini
http://gate.crashing.org/~trini/


From trini at kernel.crashing.org  Wed Feb  1 02:11:17 2006
From: trini at kernel.crashing.org (Tom Rini)
Date: Tue, 31 Jan 2006 08:11:17 -0700
Subject: [PATCH 2.6.16-rc1] Fix booting Maple boards (was: Re: LINUXPPC64
	Maple fails to boot current git)
In-Reply-To: <1138662630.3417.26.camel@brick.watson.ibm.com>
References: <20060130171759.GE22672@smtp.west.cox.net>
	<1138662630.3417.26.camel@brick.watson.ibm.com>
Message-ID: <20060131151117.GP22672@smtp.west.cox.net>

On Mon, Jan 30, 2006 at 06:10:30PM -0500, Michal Ostrowski wrote:

> I saw something similar on a JS-20 w SLOF.  The last message you see is
> related to the RTC driver, but the next thing to run after that is
> console_init(), which was where my system was dying.
> 
> Dropping the "#ifdef CONFIG_ISA" statements in
> arch/powerpc/kernel/legacy_serial.c appears to fix things, and I've been
> told that a patch to this effect has been posted (though I've yet to see
> it).

The following gets my Maple booting again, and I _think_ is testing what
was intended.

---

When looking for legacy serial ports, condition poking of "ISA" areas
on CONFIG_GENERIC_ISA_DMA, rather than CONFIG_ISA as some boards (such
as the Maple) have no ISA slots, but do have ISA serial ports.

Signed-off-by: Tom Rini <trini at kernel.crashing.org>

 arch/powerpc/kernel/legacy_serial.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/legacy_serial.c b/arch/powerpc/kernel/legacy_serial.c
index f970ace..3dd7b39 100644
--- a/arch/powerpc/kernel/legacy_serial.c
+++ b/arch/powerpc/kernel/legacy_serial.c
@@ -134,7 +134,7 @@ static int __init add_legacy_soc_port(st
 	return add_legacy_port(np, -1, UPIO_MEM, addr, addr, NO_IRQ, flags);
 }
 
-#ifdef CONFIG_ISA
+#ifdef CONFIG_GENERIC_ISA_DMA
 static int __init add_legacy_isa_port(struct device_node *np,
 				      struct device_node *isa_brg)
 {
@@ -276,7 +276,7 @@ void __init find_legacy_serial_ports(voi
 		of_node_put(soc);
 	}
 
-#ifdef CONFIG_ISA
+#ifdef CONFIG_GENERIC_ISA_DMA
 	/* First fill our array with ISA ports */
 	for (np = NULL; (np = of_find_node_by_type(np, "serial"));) {
 		struct device_node *isa = of_get_parent(np);

-- 
Tom Rini
http://gate.crashing.org/~trini/


From linas at austin.ibm.com  Wed Feb  1 07:22:14 2006
From: linas at austin.ibm.com (linas)
Date: Tue, 31 Jan 2006 14:22:14 -0600
Subject: creating PCI-related sysfs entries
Message-ID: <20060131202214.GZ19465@austin.ibm.com>


Hi, 

I want to create some sysfs entries in order to report on the 
status of PCI slots.  (If you are guessing that this pertains
to the PCI error recovery code, you'd be right).  I'm having 
trouble figuring out the best way to do this.

There are existing entries at /sys/bus/pci/slots/... but these
are for hotplug slots only; none of the soldered-onto-the-MB
devices show up here.  Is this intentional, or is this a bug/
oversight/not-yet-implemented thing?

I also want to report some roll-up system-wide statistics;
both /sys/module and /sys/class seem reasonable. My code
does not compile as a module. Suggestions?

Yes, I'm going to RTFM shortly after I hit the send key,
assuming I find the FM. 

--linas


From greg at kroah.com  Wed Feb  1 07:34:56 2006
From: greg at kroah.com (Greg KH)
Date: Tue, 31 Jan 2006 12:34:56 -0800
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131202214.GZ19465@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
Message-ID: <20060131203456.GA23819@kroah.com>

On Tue, Jan 31, 2006 at 02:22:14PM -0600, linas wrote:
> 
> Hi, 
> 
> I want to create some sysfs entries in order to report on the 
> status of PCI slots.  (If you are guessing that this pertains
> to the PCI error recovery code, you'd be right).  I'm having 
> trouble figuring out the best way to do this.
> 
> There are existing entries at /sys/bus/pci/slots/... but these
> are for hotplug slots only; none of the soldered-onto-the-MB
> devices show up here.  Is this intentional, or is this a bug/
> oversight/not-yet-implemented thing?

Not implemented, as it's up to a pci hotplug controller driver to
provide those slots.  It sounds like your driver needs to be expanded :)

> I also want to report some roll-up system-wide statistics;
> both /sys/module and /sys/class seem reasonable. My code
> does not compile as a module. Suggestions?

What kind of statistics?  Is this driver related?  PCI bus related?
Device related?

thanks,

greg k-h


From linas at austin.ibm.com  Wed Feb  1 08:08:05 2006
From: linas at austin.ibm.com (linas)
Date: Tue, 31 Jan 2006 15:08:05 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131203456.GA23819@kroah.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
Message-ID: <20060131210805.GA19465@austin.ibm.com>

On Tue, Jan 31, 2006 at 12:34:56PM -0800, Greg KH was heard to remark:
> On Tue, Jan 31, 2006 at 02:22:14PM -0600, linas wrote:
> > 
> > I want to create some sysfs entries in order to report on the 
> > status of PCI slots.  (If you are guessing that this pertains
> > to the PCI error recovery code, you'd be right).  I'm having 
> > trouble figuring out the best way to do this.
> > 
> > There are existing entries at /sys/bus/pci/slots/... but these
> > are for hotplug slots only; none of the soldered-onto-the-MB
> > devices show up here.  Is this intentional, or is this a bug/
> > oversight/not-yet-implemented thing?
> 
> Not implemented, as it's up to a pci hotplug controller driver to
> provide those slots.  It sounds like your driver needs to be expanded :)

Hmm. But these slots are not hot-pluggable; should the arch
use the hotplug infrastructure even on those slots?

I note that /sys/devices/pciXXXX does have all of the pci 
slots listed, so perhaps that is where I can place per-slot data.

> > I also want to report some roll-up system-wide statistics;
> > both /sys/module and /sys/class seem reasonable. My code
> > does not compile as a module. Suggestions?
> 
> What kind of statistics?  Is this driver related?  PCI bus related?
> Device related?

Related to the PCI error recovery. I'm not sure how to conceptually
peg this: one could say that it is the driver for a specific type
of pci-host bridge, although the code is not currently structured 
as such. Should I try to restructure it as such? If so, I'm not 
clear on how to proceed; I can't say I've clearly seen a kernel
abstraction of a pci-host bridge device onto which to staple myself.

I wanted to report a few read-only statistics, and a few writeable 
parameters:

Read-only:
-- total number of PCI device resets due to detected errors
-- total number of "false positives" (probable errors that weren't)
-- some other misc related stats.

Most, but not all, of these statistics could be obtained by 
totalling up the per-slot statistics.

Writable: 
-- Number of reset tries to perform before concluding that the 
   device is hopelessly dead.  Resets are disruptive and intensive,
   and I don't want to get stuck in an inf loop on a dead device.
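
To make the read-only/writable split concrete, here is a minimal userspace
sketch of the show/store shape such attributes usually take. It does not use
the kernel's sysfs API: struct slot_stats, resets_show(),
max_reset_tries_store() and the upper bound of 100 are all hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-slot bookkeeping. */
struct slot_stats {
	unsigned int resets;           /* read-only: resets due to errors */
	unsigned int false_positives;  /* read-only: probable errors that weren't */
	unsigned int max_reset_tries;  /* writable: give up after this many */
};

/* "show": format one value into buf, return the number of bytes written. */
int resets_show(const struct slot_stats *s, char *buf, size_t len)
{
	return snprintf(buf, len, "%u\n", s->resets);
}

/*
 * "store": parse a decimal value and reject anything that is not a
 * number in 1..100 (an arbitrary sanity range). Returns count on
 * success or -EINVAL, leaving the old value in place on failure.
 */
int max_reset_tries_store(struct slot_stats *s, const char *buf, size_t count)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(buf, &end, 10);
	if (errno || end == buf || (*end != '\0' && *end != '\n'))
		return -EINVAL;
	if (v == 0 || v > 100)
		return -EINVAL;

	s->max_reset_tries = (unsigned int)v;
	return (int)count;
}
```

The store side is where the care goes: a bad write must not leave the slot
with a limit of zero or an absurdly large one.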


Linas.


From greg at kroah.com  Wed Feb  1 08:26:24 2006
From: greg at kroah.com (Greg KH)
Date: Tue, 31 Jan 2006 13:26:24 -0800
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131210805.GA19465@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
Message-ID: <20060131212624.GA10513@kroah.com>

On Tue, Jan 31, 2006 at 03:08:05PM -0600, linas wrote:
> On Tue, Jan 31, 2006 at 12:34:56PM -0800, Greg KH was heard to remark:
> > On Tue, Jan 31, 2006 at 02:22:14PM -0600, linas wrote:
> > > 
> > > I want to create some sysfs entries in order to report on the 
> > > status of PCI slots.  (If you are guessing that this pertains
> > > to the PCI error recovery code, you'd be right).  I'm having 
> > > trouble figuring out the best way to do this.
> > > 
> > > There are existing entries at /sys/bus/pci/slots/... but these
> > > are for hotplug slots only; none of the soldered-onto-the-MB
> > > devices show up here.  Is this intentional, or is this a bug/
> > > oversight/not-yet-implemented thing?
> > 
> > Not implemented, as it's up to a pci hotplug controller driver to
> > provide those slots.  It sounds like your driver needs to be expanded :)
> 
> Hmm. But these slots are not hot-pluggable; should the arch
> use the hotplug infrastructure even on those slots?

Why not?  It's a good place to put them, right?

> I note that /sys/devices/pciXXXX does have all of the pci 
> slots listed, so perhaps that is where I can place per-slot data.

That's only because your arch might happen to have 1 device per slot,
which is not true for other arches.  And I bet it's also not true for
your non-virtual boxes...

> > > I also want to report some roll-up system-wide statistics;
> > > both /sys/module and /sys/class seem reasonable. My code
> > > does not compile as a module. Suggestions?
> > 
> > What kind of statistics?  Is this driver related?  PCI bus related?
> > Device related?
> 
> Related to the PCI error recovery. I'm not sure how to conceptually
> peg this: one could say that it is the driver for a specific type
> of pci-host bridge, although the code is not currently structured 
> as such. Should I try to restructure it as such? If so, I'm not 
> clear on how to proceed; I can't say I've clearly seen a kernel
> abstraction of a pci-host bridge device onto which to staple myself.

People have suggested that they create such a driver for a long time.
Why not just do that?

> I wanted to report a few read-only statistics, and a few writeable 
> parameters:
> 
> Read-only:
> -- total number of PCI device resets due to detected errors
> -- total number of "false positives" (probable errors that weren't)
> -- some other misc related stats.

These are all "per slot" right?

> Most, but not all, of these statistics could be obtained by 
> totalling up the per-slot statistics.
> 
> Writable: 
> -- Number of reset tries to perform before concluding that the 
>    device is hopelessly dead.  Resets are disruptive and intensive,
>    and I don't want to get stuck in an inf loop on a dead device.

Why would you want to change this value?  Just pick one at build time.
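
A build-time limit could look roughly like the sketch below. Everything in
it is hypothetical (MAX_RESET_TRIES, reset_with_limit(), and the fake device
used to exercise it); it only illustrates a bounded retry loop, not the
actual recovery code.

```c
#include <stdbool.h>

#define MAX_RESET_TRIES 3   /* hypothetical limit, fixed at build time */

/* Callback that attempts one reset; returns true if the device came back. */
typedef bool (*reset_fn)(void *dev);

/*
 * Try to reset a device at most MAX_RESET_TRIES times.
 * Returns the attempt number that succeeded, or -1 if the device
 * should be considered dead.
 */
int reset_with_limit(reset_fn try_reset, void *dev)
{
	for (int attempt = 1; attempt <= MAX_RESET_TRIES; attempt++) {
		if (try_reset(dev))
			return attempt;
	}
	return -1;
}

/* Stand-in device for demonstration: recovers on the Nth reset, if ever. */
struct fake_dev {
	int calls;        /* how many resets were attempted */
	int succeeds_on;  /* attempt number that works; 0 means never */
};

bool fake_reset(void *dev)
{
	struct fake_dev *d = dev;

	return ++d->calls == d->succeeds_on;
}
```

A device that never recovers costs exactly MAX_RESET_TRIES resets and then
the loop ends, so there is no way to spin forever on dead hardware.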

thanks,

greg k-h


From benh at kernel.crashing.org  Wed Feb  1 08:31:34 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 01 Feb 2006 08:31:34 +1100
Subject: Maple fails to boot current git
In-Reply-To: <20060131150824.GO22672@smtp.west.cox.net>
References: <20060130171759.GE22672@smtp.west.cox.net>
	<20060130231118.GA19671@localhost.localdomain>
	<1138679592.4934.1.camel@localhost.localdomain>
	<20060131150824.GO22672@smtp.west.cox.net>
Message-ID: <1138743094.4934.11.camel@localhost.localdomain>

On Tue, 2006-01-31 at 08:08 -0700, Tom Rini wrote:
> On Tue, Jan 31, 2006 at 02:53:11PM +1100, Benjamin Herrenschmidt wrote:
> > On Tue, 2006-01-31 at 12:11 +1300, David Gibson wrote:
> > > On Mon, Jan 30, 2006 at 10:17:59AM -0700, Tom Rini wrote:
> > > > Hello, trying to boot my maple board (ppc64_defconfig +
> > > > CONFIG_PPC_EARLY_DEBUG_MAPLE=y) fails as follows (the "dirty" is
> > > > #define DEBUG in kernel/prom_parse.c and platforms/maple/time.c):
> > > 
> > > Crud.  Our Maple is stuffed at the moment (doesn't complete the CPU
> > > init script, so PIBS never even comes up on the 970), so I can't
> > > really investigate.
> > 
> > Well, the RTC problem definitely looks like a bogus or lack of "ranges"
> > property or the fact that the parser doesn't recognize "ht" as a PCI
> > bus. You may want to try updating prom_parse.c to treat "ht" as a PCI
> > bus and see if that helps.
> 
> With the following, I get parent bus is pci now, but still:
> OF: ** translation for device /ht at 0/isa at 4/rtc at 900 **
> OF: bus is isa (na=2, ns=1) on /ht at 0/isa at 4
> OF: translating address: 00000001 00000900
> OF: parent bus is pci (na=3, ns=2) on /ht at 0
> OF: walking ranges...
> OF: not found !
> Maple: Unable to translate RTC address
> Maple: No device node for RTC, assuming legacy address (0x70)

Can you send me the device-tree dump ?

Ben.


From markh at osdl.org  Wed Feb  1 08:33:00 2006
From: markh at osdl.org (Mark Haverkamp)
Date: Tue, 31 Jan 2006 13:33:00 -0800
Subject: iommu_alloc failure and panic
In-Reply-To: <43DF691E.1020008@emulex.com>
References: <200601310118.k0V1Il7Z018408@falcon30.maxeymade.com>
	<43DF691E.1020008@emulex.com>
Message-ID: <1138743180.15732.15.camel@markh3.pdx.osdl.net>

On Tue, 2006-01-31 at 08:41 -0500, James Smart wrote:
> >> 2) The emulex driver has been prone to problems in the past where it's
> >> been very aggressive at starting DMA operations, and I think it can
> >> be avoided with tuning. What I don't know is if it's because of this,
> >> or simply because of the large number of targets you have. Cc:ing James
> >> Smart.
> 
> I don't have data points for the 2.6 kernel, but I can comment on what I
> have seen on the 2.4 kernel.
> 
> The issue that I saw on the 2.4 kernel was that the pci dma alloc routine
> was inappropriately allocating from the dma s/g maps. On systems with less
> than 4Gig of memory, or on those with no iommu (emt64), the checks around
> adapter-supported dma masks were off (I'm going to be loose in terms to not
> describe it in detail). The result was, although the adapter could support
> a fully 64bit address and/or although the physical dma address would be under
> 32-bits, the logic forced allocation from the mapped dma pool. On some
> systems, this pool was originally only 16MB. Around 2.4.30, the swiotlb was
> introduced, which reduced the issue, but unfortunately, still never solved the
> allocation logic. It fails less as the swiotlb simply had more space.
> As far as I know, this problem doesn't exist in the 2.6 kernel. I'd have to
> go look at the dma map functions to make sure.
> 
> Why was the lpfc driver prone to the dma map exhaustion failures ? Due to the
> default # of commands per lun and max sg segments reported by the driver to
> the scsi midlayer, the scsi mid-layer's preallocation of dma maps for commands
> for each lun, and the fact that our FC configs were usually large, had lots
> of luns, and replicated the resources for each path to the same storage.
> 
> Ultimately, what I think is the real issue here is the way the scsi mid-layer
> is preallocating dma maps for the luns. 16000 luns is a huge number.
> Multiply this by a max sg segment count of 64 by the driver, and a number
> between 3 and 30 commands per lun, and you can see the numbers. Scsi does do
> some interesting allocation algorithms once it hits an allocation failure.
> One side effect of this is that it is fairly efficient at allocating the
> bulk of the dma pool.

James,

Thanks for the information.  I tried loading the lpfc driver with
lpfc_lun_queue_depth=1 and haven't seen iommu_alloc failures.  I'm still
curious why the alloc failures lead to a panic though.

Mark.


> 
> -- james s
-- 
Mark Haverkamp <markh at osdl.org>


From grundler at parisc-linux.org  Wed Feb  1 09:48:52 2006
From: grundler at parisc-linux.org (Grant Grundler)
Date: Tue, 31 Jan 2006 15:48:52 -0700
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131210805.GA19465@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
Message-ID: <20060131224852.GA25579@colo.lackof.org>

On Tue, Jan 31, 2006 at 03:08:05PM -0600, linas wrote:
> Related to the PCI error recovery. I'm not sure how to conceptually
> peg this: one could say that it is the driver for a specific type
> of pci-host bridge, although the code is not currently structured 
> as such. Should I try to restructure it as such? If so, I'm not 
> clear on how to proceed; I can't say I've clearly seen a kernel
> abstraction of a pci-host bridge device onto which to staple myself.

AFAIK, no pci-host device abstraction exists.
Each arch deals with pci-host bridges as it sees fit.

But access methods to some PCI features are abstracted:
o method access to CFG space
o method to register IRQs
o advertise MMIO/IO Port routing.

Sounds like you want to add another method for error recovery
stats/control.

grant


From James.Smart at Emulex.Com  Wed Feb  1 00:41:50 2006
From: James.Smart at Emulex.Com (James Smart)
Date: Tue, 31 Jan 2006 08:41:50 -0500
Subject: iommu_alloc failure and panic
In-Reply-To: <200601310118.k0V1Il7Z018408@falcon30.maxeymade.com>
References: <200601310118.k0V1Il7Z018408@falcon30.maxeymade.com>
Message-ID: <43DF691E.1020008@emulex.com>


>> 2) The emulex driver has been prone to problems in the past where it's
>> been very aggressive at starting DMA operations, and I think it can
>> be avoided with tuning. What I don't know is if it's because of this,
>> or simply because of the large number of targets you have. Cc:ing James
>> Smart.

I don't have data points for the 2.6 kernel, but I can comment on what I
have seen on the 2.4 kernel.

The issue that I saw on the 2.4 kernel was that the pci dma alloc routine
was inappropriately allocating from the dma s/g maps. On systems with less
than 4Gig of memory, or on those with no iommu (emt64), the checks around
adapter-supported dma masks were off (I'm going to be loose in terms to not
describe it in detail). The result was, although the adapter could support
a fully 64bit address and/or although the physical dma address would be under
32-bits, the logic forced allocation from the mapped dma pool. On some
systems, this pool was originally only 16MB. Around 2.4.30, the swiotlb was
introduced, which reduced the issue, but unfortunately, still never solved the
allocation logic. It fails less as the swiotlb simply had more space.
As far as I know, this problem doesn't exist in the 2.6 kernel. I'd have to
go look at the dma map functions to make sure.

Why was the lpfc driver prone to the dma map exhaustion failures ? Due to the
default # of commands per lun and max sg segments reported by the driver to
the scsi midlayer, the scsi mid-layer's preallocation of dma maps for commands
for each lun, and the fact that our FC configs were usually large, had lots
of luns, and replicated the resources for each path to the same storage.

Ultimately, what I think is the real issue here is the way the scsi mid-layer
is preallocating dma maps for the luns. 16000 luns is a huge number.
Multiply this by a max sg segment count of 64 by the driver, and a number
between 3 and 30 commands per lun, and you can see the numbers. Scsi does do
some interesting allocation algorithms once it hits an allocation failure.
One side effect of this is that it is fairly efficient at allocating the
bulk of the dma pool.
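
To put numbers on that multiplication, here is a trivial sketch. It assumes
the preallocation really does scale as the plain product of the three
figures named above; prealloc_sg_entries() is a made-up helper, and what
each entry costs in mapping space is platform-dependent and not modelled.

```c
#include <stdint.h>

/*
 * Upper bound on preallocated scatter/gather entries, taken as the
 * plain product luns * sg segments per command * commands per lun.
 */
uint64_t prealloc_sg_entries(uint64_t luns, uint64_t sg_per_cmd,
			     uint64_t cmds_per_lun)
{
	return luns * sg_per_cmd * cmds_per_lun;
}
```

For 16000 luns and 64 segments, 3 commands per lun gives 3,072,000 entries
and 30 commands per lun gives 30,720,000.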

-- james s


From olh at suse.de  Wed Feb  1 19:26:21 2006
From: olh at suse.de (Olaf Hering)
Date: Wed, 1 Feb 2006 09:26:21 +0100
Subject: [PATCH] ppc64: per cpu data optimisations
In-Reply-To: <20060111021644.GC4767@krispykreme>
References: <20060111021644.GC4767@krispykreme>
Message-ID: <20060201082621.GA29274@suse.de>

On Wed, Jan 11, Anton Blanchard wrote:


Anton, this causes trouble if you have sles10 installed and if runlevel
6 is your default runlevel (aka reboot in a loop). What's wrong with the
patch?
See https://bugzilla.novell.com/show_bug.cgi?id=145459 for details.

There are 2 other bugs which are also seen on other archs;
I will start looking at them now.

-- 
short story of a lazy sysadmin:
 alias appserv=wotan


From linas at austin.ibm.com  Thu Feb  2 08:30:18 2006
From: linas at austin.ibm.com (linas)
Date: Wed, 1 Feb 2006 15:30:18 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131212624.GA10513@kroah.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131212624.GA10513@kroah.com>
Message-ID: <20060201213018.GG14705@austin.ibm.com>

On Tue, Jan 31, 2006 at 01:26:24PM -0800, Greg KH was heard to remark:
> On Tue, Jan 31, 2006 at 03:08:05PM -0600, linas wrote:
> > 
> > ... the PCI error recovery. I'm not sure how to conceptually
> > peg this: one could say that it is the driver for a specific type
> > of pci-host bridge, although the code is not currently structured 
> > as such. Should I try to restructure it as such? If so, I'm not 
> > clear on how to proceed; I can't say I've clearly seen a kernel
> > abstraction of a pci-host bridge device onto which to staple myself.
> 
> People have suggested that they create such a driver for a long time.
> Why not just do that?

OK. Let me get this straight, then. Create a generic 
struct pci_host_bridge, which encapsulates  some (all?)
of the functions that Grant Grundler mentions in his email:

Grant Grundler <grundler at parisc-linux.org>:
<> Each arch deals with pci-host bridges as it sees fit.
<>
<>But access methods to some PCI features are abstracted:
<>o method access to CFG space
<>o method to register IRQs
<>o advertise MMIO/IO Port routing.

At the risk of over-engineering, maybe there should be a 
struct bus_host_bridge, and struct pci_host_bridge would 
derive from that?
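
A rough userspace sketch of that layering, just to show the shape. None of
these names exist in the kernel as written here: the two structs, the single
read_config method, the config_reads counter and the demo implementation are
all hypothetical, and "derive" is done the usual C way, by embedding the
base struct and recovering the outer one with offsetof().

```c
#include <stddef.h>
#include <stdint.h>

struct bus_host_bridge;

/* Methods a generic host bridge might expose (only one shown). */
struct bus_host_bridge_ops {
	int (*read_config)(struct bus_host_bridge *br, unsigned int devfn,
			   int where, uint32_t *val);
};

/* Hypothetical bus-independent base. */
struct bus_host_bridge {
	const char *name;
	const struct bus_host_bridge_ops *ops;
};

/* Hypothetical PCI flavour, "derived" by embedding the base. */
struct pci_host_bridge {
	struct bus_host_bridge base;
	unsigned int config_reads;   /* toy per-bridge statistic */
};

/* Recover the derived struct from a pointer to its embedded base. */
#define to_pci_host_bridge(b) \
	((struct pci_host_bridge *)((char *)(b) - \
				    offsetof(struct pci_host_bridge, base)))

/* Demo method: pretends no device answered and counts the access. */
int demo_read_config(struct bus_host_bridge *br, unsigned int devfn,
		     int where, uint32_t *val)
{
	struct pci_host_bridge *phb = to_pci_host_bridge(br);

	(void)devfn;
	(void)where;
	*val = 0xffffffffu;
	phb->config_reads++;
	return 0;
}

const struct bus_host_bridge_ops demo_ops = {
	.read_config = demo_read_config,
};
```

Callers would only ever hold a struct bus_host_bridge pointer and go through
ops; per-slot error statistics would then hang off the derived struct.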

--linas

p.s. rest of message:

> > I wanted to report a few read-only statistics, and a few writeable 
> > parameters:
> > 
> > Read-only:
> > -- total number of PCI device resets due to detected errors
> > -- total number of "false positives" (probable errors that weren't)
> > -- some other misc related stats.
> 
> These are all "per slot" right?

Right. I'll keep them that way.

> > Writable: 
> > -- Number of reset tries to perform before concluding that the 
> >    device is hopelessly dead.  Resets are disruptive and intensive,
> >    and I don't want to get stuck in an inf loop on a dead device.
> 
> Why would you want to change this value?  Just pick one at build time.

OK.

--linas


From linas at austin.ibm.com  Thu Feb  2 08:35:46 2006
From: linas at austin.ibm.com (linas)
Date: Wed, 1 Feb 2006 15:35:46 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131224852.GA25579@colo.lackof.org>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131224852.GA25579@colo.lackof.org>
Message-ID: <20060201213546.GH14705@austin.ibm.com>

On Tue, Jan 31, 2006 at 03:48:52PM -0700, Grant Grundler was heard to remark:
> On Tue, Jan 31, 2006 at 03:08:05PM -0600, linas wrote:
> > Related to the PCI error recovery. I'm not sure how to conceptually
> > peg this: one could say that it is the driver for a specific type
> > of pci-host bridge, although the code is not currently structured 
> > as such. Should I try to restructure it as such? If so, I'm not 
> > clear on how to proceed; I can't say I've clearly seen a kernel
> > abstraction of a pci-host bridge device onto which to staple myself.
> 
> AFAIK, no pci-host device abstraction exists.
> Each arch deals with pci-host bridges as it sees fit.
> 
> But access methods to some PCI features are abstracted:
> o method access to CFG space
> o method to register IRQs
> o advertise MMIO/IO Port routing.
> 
> Sounds like you want to add another method for error recovery
> stats/control.

Actually, the "recovery" part is already (mostly) in mainline,
See Documentation/pci-error-recovery.txt

What's hanging out are patches to specific device drivers, which have
been submitted, but haven't been accepted.

Another issue is that there's no implementation at this time for 
any arch other than powerpc, although the latest pci express bridges
support this function in principle.

--linas


From linas at austin.ibm.com  Thu Feb  2 11:19:06 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Wed, 1 Feb 2006 18:19:06 -0600
Subject: [PATCH 1/2]: PowerPC/PCI Hotplug build break
In-Reply-To: <1138833335.6933.5.camel@sinatra.austin.ibm.com>
References: <1138833335.6933.5.camel@sinatra.austin.ibm.com>
Message-ID: <20060202001906.GA24916@austin.ibm.com>


Please apply ASAP:

Build break: Building PCI hotplug on PowerPC results in 
a build break, due to failure to export symbols.

Reported today by Dave Jones <davej at redhat.com>:
drivers/pci/hotplug/rpaphp.ko needs unknown symbol pcibios_add_pci_devices

This patch fixes the break in the arch/powerpc tree.
Next patch fixes same problem in drivers/pci tree

Signed-off-by: Linas Vepstas <linas at austin.ibm.com>

---
 pci_dlpar.c |    3 +++
 1 files changed, 3 insertions(+)

Index: linux-2.6.16-rc1-git5/arch/powerpc/platforms/pseries/pci_dlpar.c
===================================================================
--- linux-2.6.16-rc1-git5.orig/arch/powerpc/platforms/pseries/pci_dlpar.c	2006-02-01 18:06:12.380829512 -0600
+++ linux-2.6.16-rc1-git5/arch/powerpc/platforms/pseries/pci_dlpar.c	2006-02-01 18:11:41.040673750 -0600
@@ -58,6 +58,7 @@
 
 	return find_bus_among_children(pdn->phb->bus, dn);
 }
+EXPORT_SYMBOL_GPL(pcibios_find_pci_bus);
 
 /**
  * pcibios_remove_pci_devices - remove all devices under this bus
@@ -106,6 +107,7 @@
 		}
 	}
 }
+EXPORT_SYMBOL_GPL(pcibios_fixup_new_pci_devices);
 
 static int
 pcibios_pci_config_bridge(struct pci_dev *dev)
@@ -172,3 +174,4 @@
 			pcibios_pci_config_bridge(dev);
 	}
 }
+EXPORT_SYMBOL_GPL(pcibios_add_pci_devices);


From linas at austin.ibm.com  Thu Feb  2 11:21:09 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Wed, 1 Feb 2006 18:21:09 -0600
Subject: [PATCH 2/2]: PowerPC/PCI Hotplug build break
Message-ID: <20060202002109.GB24916@austin.ibm.com>


Please apply ASAP:

Build break: Building PCI hotplug on PowerPC results in
a build break, due to failure to export symbols.

Reported today by Dave Jones <davej at redhat.com>:
drivers/pci/hotplug/rpaphp.ko needs unknown symbol pcibios_add_pci_devices

This patch fixes same problem in drivers/pci tree
Previous patch fixes the break in the arch/powerpc tree.

Signed-off-by: Linas Vepstas <linas at austin.ibm.com>
---
 rpaphp_slot.c |    1 +
 1 files changed, 1 insertion(+)

Index: linux-2.6.16-rc1-git5/drivers/pci/hotplug/rpaphp_slot.c
===================================================================
--- linux-2.6.16-rc1-git5.orig/drivers/pci/hotplug/rpaphp_slot.c	2006-02-01 18:06:06.022722369 -0600
+++ linux-2.6.16-rc1-git5/drivers/pci/hotplug/rpaphp_slot.c	2006-02-01 18:11:46.049970222 -0600
@@ -159,6 +159,7 @@
 	dbg("%s - Exit: rc[%d]\n", __FUNCTION__, retval);
 	return retval;
 }
+EXPORT_SYMBOL_GPL(rpaphp_deregister_slot);
 
 int rpaphp_register_slot(struct slot *slot)
 {


From grundler at parisc-linux.org  Thu Feb  2 16:52:43 2006
From: grundler at parisc-linux.org (Grant Grundler)
Date: Wed, 1 Feb 2006 22:52:43 -0700
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060201213546.GH14705@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131224852.GA25579@colo.lackof.org>
	<20060201213546.GH14705@austin.ibm.com>
Message-ID: <20060202055243.GA12588@colo.lackof.org>

On Wed, Feb 01, 2006 at 03:35:46PM -0600, linas wrote:
> > Sounds like you want to add another method for error recovery
> > stats/control.
> 
> Actually, the "recovery" part is already (mostly) in mainline,
> See Documentation/pci-error-recovery.txt

Yes - I've reviewed a few of the times you submitted it.

What I meant was, you want to formalize error recovery methods
and make it a peer to the other resources access methods I listed.

...
> Another issue is that there's no implementation at this time for 
> any arch other than powerpc,

Well, some ia64 chipsets have some limited support but it's
really up to the respective companies to drive that.

> although the latest pci express bridges support this function in principle.

"Nguyen, Tom L" <tom.l.nguyen at intel.com> has proposed patches
to support PCI-e AER (Advanced Error Reporting):
	http://lkml.org/lkml/2005/3/11/269

I've cc'd him in case he has an interest in resurrecting those patches
and adapting them to the current framework (and vice versa).

grant


From linas at austin.ibm.com  Fri Feb  3 03:36:36 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Thu, 2 Feb 2006 10:36:36 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060202055243.GA12588@colo.lackof.org>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131224852.GA25579@colo.lackof.org>
	<20060201213546.GH14705@austin.ibm.com>
	<20060202055243.GA12588@colo.lackof.org>
Message-ID: <20060202163636.GD24916@austin.ibm.com>

On Wed, Feb 01, 2006 at 10:52:43PM -0700, Grant Grundler was heard to remark:
> On Wed, Feb 01, 2006 at 03:35:46PM -0600, linas wrote:
> > > Sounds like you want to add another method for error recovery
> > > stats/control.
> > 
> > Actually, the "recovery" part is already (mostly) in mainline,
> > See Documentation/pci-error-recovery.txt
> 
> What I meant was, you want to formalize error recovery methods
> and make it a peer to the other resources access methods I listed.

Hmm. Not sure what you mean by "a peer". pci config-space i/o is done
through callbacks in the pci bus->ops structure. The PCI error recovery
is done via callbacks in the pci dev structure.  Was there something 
you'd like to see done differently?

Given GregKH's remarks, it sounded like there was some interest in 
a "struct bus_host_bridge" abstraction, and I'd be willing
to take a shot at that, provided there is general interest and 
general agreement.  I'm not quite sure what such a struct might 
contain, just yet, I'm just imagining it might be non-empty.

> "Nguyen, Tom L" <tom.l.nguyen at intel.com> has proposed patches
> to support PCI-e AER (Advanced Error Reporting):

I kept looking at AER, and could not figure out what to do 
with it. 

--linas


From jfaslist at yahoo.fr  Fri Feb  3 04:03:06 2006
From: jfaslist at yahoo.fr (jfaslist)
Date: Thu, 02 Feb 2006 18:03:06 +0100
Subject: Maple freezing on PCI Target-Abort
Message-ID: <43E23B4A.4020402@yahoo.fr>

Hi,
We have designed our own IBM970fx motherboard which is an (almost) clone 
of the IBM Maple reference kit.
We are seeing that whenever a PIO read PCI cycle bound to the PCI bus 
that is across the AMD8111 is ended w/ a target-abort, the whole system 
freezes. The device signaling the TA is a PCI-VME bridge. It does so as 
the address passed is invalid.

When the system hangs, using the service processor, I can access some 
AMD8111, CPC925 registers from which I can draw the following conclusions:

1- The AMD8111 secondary status tells me the AMD8111 got a TA
2- The CPC925 status/command register (0cf8070010) tells me that the TA 
error was forwarded to the CPC925.
3- The CPC925 APIEXCP register tells me that a DERR exception was signaled.

 From what I can read on the CPC925 and IBM970 cpu user manual, the DERR 
is the bus error that is returned to the CPU by the CPC925 to let it 
know that the cycle ended w/ an error. I have the following questions:

-What exception vector is taking care of a DERR excp? From what I can 
see it seems to be the "machine check" vector. But that seems a bit 
drastic to me. After all this is just a PCI target abort.
-I expect that the normal behavior would be for the kernel to send a 
signal termination to the user process which caused the PIO READ PCI 
cycle (from a previously mmap()'ed VMA address). Is it  doable on this 
platform?  Since a READ operation is coupled by nature, I think this is 
the only acceptable way.

I have tried to set the MSR[RI] bit before doing the PCI cycle, but it 
didn't change anything. Also, in our design we disconnect the 
CPC925 checkstop pin from the 970 machine check pin (see page 39 of the 
CPC925 user's manual). So a DERR shouldn't cause a machine check, I would 
think.

I realize that these questions are very H/W related but couldn't find 
the answer in IBM doc.

Thanks for the help,

-- 

Best regards,
_______________________________________
jean-francois simon - themis computer
5, rue irene joliot curie
38330 eybens - france
+33 (0)4 76 14 77 85

	


From grundler at parisc-linux.org  Fri Feb  3 06:39:02 2006
From: grundler at parisc-linux.org (Grant Grundler)
Date: Thu, 2 Feb 2006 12:39:02 -0700
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060202163636.GD24916@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131224852.GA25579@colo.lackof.org>
	<20060201213546.GH14705@austin.ibm.com>
	<20060202055243.GA12588@colo.lackof.org>
	<20060202163636.GD24916@austin.ibm.com>
Message-ID: <20060202193902.GA5424@colo.lackof.org>

On Thu, Feb 02, 2006 at 10:36:36AM -0600, Linas Vepstas wrote:
> Hmm. Not sure what you mean by "a peer".

Just at the same level of the architecture - i.e. a server like the others.

> pci config-space i/o is done through callbacks in the pci bus->ops
> structure. The PCI error recovery is done via callbacks in the pci dev
> structure.  Was there something you'd like to see done differently?

No. Each set of callbacks serves a different purpose.
The services/resources at the bus level are different from 
those at the device level.

My guess is error handling/containment can abstract at the "bus" level
since we can't always guarantee "one device per slot" (think
multifunction devices).

> Given GregKH's remarks, it sounded like there was some interest in 
> a "struct bus_host_bridge" abstraction, and I'd be willing
> to take a shot at that, provided there is general interest and 
> general agreement.  I'm not quite sure what such a struct might 
> contain, just yet, I'm just imagining it might be non-empty.

Yes, I agree. I don't have a better idea other than what I already 
pointed out.

> I kept looking at AER, and could not figure out what to do 
> with it. 

I haven't either - other folks in HP "own" that.

grant


From linas at austin.ibm.com  Fri Feb  3 07:46:24 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Thu, 2 Feb 2006 14:46:24 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060202193902.GA5424@colo.lackof.org>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131224852.GA25579@colo.lackof.org>
	<20060201213546.GH14705@austin.ibm.com>
	<20060202055243.GA12588@colo.lackof.org>
	<20060202163636.GD24916@austin.ibm.com>
	<20060202193902.GA5424@colo.lackof.org>
Message-ID: <20060202204624.GM24916@austin.ibm.com>

On Thu, Feb 02, 2006 at 12:39:02PM -0700, Grant Grundler was heard to remark:
> 
> My guess is error handling/containment can abstract at the "bus" level
> since we can't always guarantee "one device per slot" (think
> multifunction devices).

:-) Yes, testing with multi-function cards exposed bugs, but the
code should work fine with them.  In particular, the design allows
multi-function devices to "vote" how they want to be reset, with
the dumbest voter getting their way.
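
[Editorial sketch, not part of the original message.] The voting described
above can be illustrated in user-space C roughly as follows. The enum mirrors
the result codes listed in the documentation patch later in this archive, but
the numeric values, the vote_severity() and merge_votes() helpers, and the
decision to rank DISCONNECT above NEED_RESET are all assumptions made for
illustration; this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the result codes named in pci-error-recovery.txt.
 * The numeric values are illustrative, not the kernel's. */
enum pci_ers_result {
	PCI_ERS_RESULT_NONE,
	PCI_ERS_RESULT_CAN_RECOVER,
	PCI_ERS_RESULT_NEED_RESET,
	PCI_ERS_RESULT_DISCONNECT,
	PCI_ERS_RESULT_RECOVERED,
};

/* Hypothetical helper: rank a vote by how drastic the requested action is.
 * Ranking DISCONNECT above NEED_RESET is an assumption of this sketch. */
int vote_severity(enum pci_ers_result r)
{
	switch (r) {
	case PCI_ERS_RESULT_DISCONNECT:
		return 3;
	case PCI_ERS_RESULT_NEED_RESET:
		return 2;
	case PCI_ERS_RESULT_CAN_RECOVER:
		return 1;
	default:
		return 0;	/* NONE and RECOVERED request nothing */
	}
}

/* Hypothetical helper: combine the votes of all functions on a card.
 * The most drastic request wins, so a single NEED_RESET vote overrides
 * any number of CAN_RECOVER votes. */
enum pci_ers_result merge_votes(const enum pci_ers_result *votes, size_t n)
{
	enum pci_ers_result outcome = PCI_ERS_RESULT_NONE;
	size_t i;

	assert(votes != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (vote_severity(votes[i]) > vote_severity(outcome))
			outcome = votes[i];
	return outcome;
}
```

With three functions voting CAN_RECOVER, NEED_RESET and CAN_RECOVER, the
merged outcome in this sketch is NEED_RESET, so the whole card is reset.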

The bus disconnect is reported to all functions on all affected
cards/slots. This allows all instances of a device driver to react
appropriately.  However, for card setup/init, typically, you 
want to have only one driver instance do that. You'll notice 
in the sym53cxx2 patch I just sent, there's a 

+       if (PCI_FUNC(pdev->devfn) == 0)
+               sym_reset_scsi_bus(np, 0);

so that the other instances don't fall over each other resetting.
There's similar code in the e100, e1000 and ixgb drivers; I tested 
multi-function versions of these. (not sure about ixgb).
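
[Editorial sketch, not part of the original message.] The function-0 check
can be tried outside the kernel as below. The PCI_DEVFN/PCI_FUNC macros use
the same bit layout as the kernel's (5-bit slot, 3-bit function); everything
prefixed fake_ is a hypothetical stand-in for the real struct pci_dev and
driver routines, introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* devfn packs a 5-bit slot number and a 3-bit function number. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* Hypothetical stand-in for struct pci_dev. */
struct fake_pci_dev {
	unsigned int devfn;
};

/* Counts how often the card-wide, one-shot reset was performed. */
int fake_bus_resets;

/* Hypothetical stand-in for a card-wide reset such as a SCSI bus reset. */
void fake_reset_bus(void)
{
	fake_bus_resets++;
}

/* Hypothetical per-instance recovery callback: every function runs it,
 * but only function 0 performs the card-wide reset. */
void fake_slot_reset(const struct fake_pci_dev *pdev)
{
	assert(pdev != NULL);
	if (PCI_FUNC(pdev->devfn) == 0)
		fake_reset_bus();
	/* per-function re-initialization would go here */
}
```

Calling fake_slot_reset() once for function 0 and once for function 1 of the
same slot performs the card-wide reset exactly once.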

> Yes, I agree. I don't have a better idea other than what I already 
> pointed out.

Hmm. well, I may have lost the thread of what that was. 

--linas


From geoffrey.levand at am.sony.com  Fri Feb  3 09:47:12 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Thu, 02 Feb 2006 14:47:12 -0800
Subject: [PATCH] spufs split off platform code
In-Reply-To: <200601280457.08170.arnd@arndb.de>
References: <200601280457.08170.arnd@arndb.de>
Message-ID: <43E28BF0.8060700@am.sony.com>

Arnd Bergmann wrote:
> I guess that the "spc" device type can be removed now, I don't think
> that
> any systems are left that have not been converted.
> 
> Do you have "spe" type nodes at all? Is there anything that you need to
> do different about them?


Yes, spc can be removed.  I think we can arrange it so some of the
create_spu code can go back to generic code.  Still investigating...


>>+void spu_free_irqs(struct spu *spu)
>>+{
>>+	int irq_base;
>>+
>>+	if(!spu->priv_data) {
>>+		pr_debug("null priv_data in %p\n", spu);
>>+		return;
>>+	}
> 
> 
> It may be just me, but I don't like this bit of coding style:
> You are trying to deal with priv_data being either allocated
> or not allocated at this point. Better make sure that you have
> freed the structure before returning an error from any function
> that would allocate it on success. Then get rid of the check
> here.


Yes, it really doesn't add any value does it.
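
[Editorial sketch, not part of the original message.] One way to read the
suggestion above: if the allocating function undoes its own work before
returning an error, the teardown path never sees a half-built object and
needs no NULL check. All names below (demo_spu, demo_priv, demo_create and
so on) are hypothetical and do not come from the spufs code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical private data and owning structure. */
struct demo_priv {
	int irq_base;
};

struct demo_spu {
	struct demo_priv *priv_data;
};

/* Hypothetical setup step that can fail; stands in for IRQ setup. */
int demo_setup_irqs(struct demo_spu *spu, int irq_base)
{
	if (irq_base < 0)
		return -1;
	spu->priv_data->irq_base = irq_base;
	return 0;
}

/* Either succeeds with priv_data allocated, or fails with it freed. */
int demo_create(struct demo_spu *spu, int irq_base)
{
	assert(spu != NULL);
	spu->priv_data = malloc(sizeof(*spu->priv_data));
	if (!spu->priv_data)
		return -1;
	if (demo_setup_irqs(spu, irq_base) != 0) {
		/* Undo the allocation before reporting the failure. */
		free(spu->priv_data);
		spu->priv_data = NULL;
		return -1;
	}
	return 0;
}

/* Only reachable after a successful demo_create(), so priv_data is
 * known to be valid and is used without a check. */
void demo_destroy(struct demo_spu *spu)
{
	spu->priv_data->irq_base = 0;
	free(spu->priv_data);
	spu->priv_data = NULL;
}
```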


>>+struct spu_priv_data;
>>+struct spu_phys {
>>+	unsigned long addr;
>>+	unsigned long size;
>>+};
>>
>> struct spu {
>>+	struct spu_priv_data *priv_data; /* opaque */
>> 	char *name;
> 
> 
> If you want priv_data to point to different types of data structures
> depending on the context, I find it easier to understand if there is
> a simple void pointer and the actual struct definitions have different
> type names.


Yes, a good tip.
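
[Editorial sketch, not part of the original message.] The void-pointer
arrangement suggested above might look roughly like this: the generic
structure carries an untyped pointer, and each platform defines its own
private structure under a distinct type name. The structure and function
names are hypothetical, chosen only to show the pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Generic structure: priv_data is interpreted by platform code only. */
struct demo_spu {
	void *priv_data;
	const char *name;
};

/* Hypothetical per-platform private data, each with its own type name. */
struct platform_a_spu_priv {
	int irq_base;
};

struct platform_b_spu_priv {
	unsigned long hv_handle;
};

/* Platform A code converts the void pointer back to its own type. */
int platform_a_irq_base(const struct demo_spu *spu)
{
	const struct platform_a_spu_priv *priv = spu->priv_data;

	assert(priv != NULL);
	return priv->irq_base;
}

/* Platform B does the same with a different structure. */
unsigned long platform_b_handle(const struct demo_spu *spu)
{
	const struct platform_b_spu_priv *priv = spu->priv_data;

	assert(priv != NULL);
	return priv->hv_handle;
}
```

The trade-off is that the compiler no longer checks which platform's
structure sits behind the pointer, so each accessor must only be called
from the platform code that set it.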


I'm looking into pushing these differences down into the lower level
platform code.  Hopefully it will simplify these parts.

-Geoff


From linas at austin.ibm.com  Fri Feb  3 11:06:02 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Thu, 2 Feb 2006 18:06:02 -0600
Subject: [PATCH]: Documentation: Updated PCI Error Recovery
Message-ID: <20060203000602.GQ24916@austin.ibm.com>


I'm not sure who I'm addressing this patch to: Linus, maybe?

Please apply. Fingers crossed, I hope this may make it into 2.6.16.

--linas

This patch is a cleanup/restructuring/clarification of the
PCI error handling doc. It should look rather professional
at this point.

Signed-off-by: Linas Vepstas <linas at austin.ibm.com>

--
 pci-error-recovery.txt |  462 ++++++++++++++++++++++++++++++++-----------------
 1 files changed, 306 insertions(+), 156 deletions(-)

Index: linux-2.6.16-rc1-git5/Documentation/pci-error-recovery.txt
===================================================================
--- linux-2.6.16-rc1-git5.orig/Documentation/pci-error-recovery.txt	2006-02-01 17:09:01.000000000 -0600
+++ linux-2.6.16-rc1-git5/Documentation/pci-error-recovery.txt	2006-02-02 18:04:57.714942210 -0600
@@ -1,246 +1,396 @@
 
                        PCI Error Recovery
                        ------------------
-                         May 31, 2005
+                        February 2, 2006
 
-               Current document maintainer:
-           Linas Vepstas <linas at austin.ibm.com>
+                 Current document maintainer:
+             Linas Vepstas <linas at austin.ibm.com>
 
 
-Some PCI bus controllers are able to detect certain "hard" PCI errors
-on the bus, such as parity errors on the data and address busses, as
-well as SERR and PERR errors.  These chipsets are then able to disable
-I/O to/from the affected device, so that, for example, a bad DMA
-address doesn't end up corrupting system memory.  These same chipsets
-are also able to reset the affected PCI device, and return it to
-working condition.  This document describes a generic API form
-performing error recovery.
-
-The core idea is that after a PCI error has been detected, there must
-be a way for the kernel to coordinate with all affected device drivers
-so that the pci card can be made operational again, possibly after
-performing a full electrical #RST of the PCI card.  The API below
-provides a generic API for device drivers to be notified of PCI
-errors, and to be notified of, and respond to, a reset sequence.
-
-Preliminary sketch of API, cut-n-pasted-n-modified email from
-Ben Herrenschmidt, circa 5 april 2005
+Many PCI bus controllers are able to detect a variety of hardware
+PCI errors on the bus, such as parity errors on the data and address 
+busses, as well as SERR and PERR errors.  Some of the more advanced
+chipsets are able to deal with these errors; these include PCI-E chipsets, 
+and the PCI-host bridges found on IBM Power4 and Power5-based pSeries
+boxes. A typical action taken is to disconnect the affected device, 
+halting all I/O to it.  The goal of a disconnection is to avoid system
+corruption; for example, to halt system memory corruption due to DMA's 
+to "wild" addresses. Typically, a reconnection mechanism is also
+offered, so that the affected PCI device(s) are reset and put back
+into working condition. The reset phase requires coordination 
+between the affected device drivers and the PCI controller chip.
+This document describes a generic API for notifying device drivers
+of a bus disconnection, and then performing error recovery.
+This API is currently implemented in the 2.6.16 and later kernels.
+
+Reporting and recovery is performed in several steps. First, when
+a PCI hardware error has resulted in a bus disconnect, that event
+is reported as soon as possible to all affected device drivers,
+including multiple instances of a device driver on multi-function 
+cards. This allows device drivers to avoid deadlocking in spinloops,
+waiting for some i/o-space register to change, when it never will.
+It also gives the drivers a chance to defer incoming I/O as 
+needed. 
+
+Next, recovery is performed in several stages. Most of the complexity
+is forced by the need to handle multi-function devices, that is,
+devices that have multiple device drivers associated with them.
+In the first stage, each driver is allowed to indicate what type
+of reset it desires, the choices being a simple re-enabling of I/O 
+or requesting a hard reset (a full electrical #RST of the PCI card).
+If any driver requests a full reset, that is what will be done. 
+
+After a full reset and/or a re-enabling of I/O, all drivers are
+again notified, so that they may then perform any device setup/config 
+that may be required.  After these have all completed, a final 
+"resume normal operations" event is sent out.
+
+The biggest reason for choosing a kernel-based implementation rather 
+than a user-space implementation was the need to deal with bus 
+disconnects of PCI devices attached to storage media, and, in particular, 
+disconnects from devices holding the root file system.  If the root 
+file system is disconnected, a user-space mechanism would have to go 
+through a large number of contortions to complete recovery. Almost all 
+of the current Linux file systems are not tolerant of disconnection 
+from/reconnection to their underlying block device. By contrast, 
+bus errors are easy to manage in the device driver. Indeed, most
+device drivers already handle very similar recovery procedures;
+for example, the SCSI-generic layer already provides significant 
+mechanisms for dealing with SCSI bus errors and SCSI bus resets.
+
+
+Detailed Design
+---------------
+Design and implementation details below, based on a chain of
+public email discussions with Ben Herrenschmidt, circa 5 April 2005.
 
 The error recovery API support is exposed to the driver in the form of
 a structure of function pointers pointed to by a new field in struct
-pci_driver. The absence of this pointer in pci_driver denotes an
-"non-aware" driver, behaviour on these is platform dependant.
-Platforms like ppc64 can try to simulate pci hotplug remove/add.
-
-The definition of "pci_error_token" is not covered here. It is based on
-Seto's work on the synchronous error detection. We still need to define
-functions for extracting infos out of an opaque error token. This is
-separate from this API.
+pci_driver. A driver that fails to provide the structure is "non-aware",
+and the actual recovery steps taken are platform dependent.  The 
+arch/powerpc implementation will simulate a PCI hotplug remove/add.
 
 This structure has the form:
-
 struct pci_error_handlers
 {
-	int (*error_detected)(struct pci_dev *dev, pci_error_token error);
+	int (*error_detected)(struct pci_dev *dev, enum pci_channel_state);
 	int (*mmio_enabled)(struct pci_dev *dev);
-	int (*resume)(struct pci_dev *dev);
 	int (*link_reset)(struct pci_dev *dev);
 	int (*slot_reset)(struct pci_dev *dev);
+	void (*resume)(struct pci_dev *dev);
+};
+
+The possible channel states are:
+enum pci_channel_state {
+	pci_channel_io_normal,  /* I/O channel is in normal state */
+	pci_channel_io_frozen,  /* I/O to channel is blocked */
+	pci_channel_io_perm_failure, /* PCI card is dead */
+};
+
+Possible return values are:
+enum pci_ers_result {
+	PCI_ERS_RESULT_NONE,        /* no result/none/not supported in device driver */
+	PCI_ERS_RESULT_CAN_RECOVER, /* Device driver can recover without slot reset */
+	PCI_ERS_RESULT_NEED_RESET,  /* Device driver wants slot to be reset. */
+	PCI_ERS_RESULT_DISCONNECT,  /* Device has completely failed, is unrecoverable */
+	PCI_ERS_RESULT_RECOVERED,   /* Device driver is fully recovered and operational */
 };
 
-A driver doesn't have to implement all of these callbacks. The
-only mandatory one is error_detected(). If a callback is not
-implemented, the corresponding feature is considered unsupported.
-For example, if mmio_enabled() and resume() aren't there, then the
-driver is assumed as not doing any direct recovery and requires
+A driver does not have to implement all of these callbacks; however,
+if it implements any, it must implement error_detected(). If a callback 
+is not implemented, the corresponding feature is considered unsupported.
+For example, if mmio_enabled() and resume() aren't there, then it
+is assumed that the driver is not doing any direct recovery and requires
 a reset. If link_reset() is not implemented, the card is assumed as
-not caring about link resets, in which case, if recover is supported,
-the core can try recover (but not slot_reset() unless it really did
-reset the slot). If slot_reset() is not supported, link_reset() can
-be called instead on a slot reset.
-
-At first, the call will always be :
-
-	1) error_detected()
-
-	Error detected. This is sent once after an error has been detected. At
-this point, the device might not be accessible anymore depending on the
-platform (the slot will be isolated on ppc64). The driver may already
-have "noticed" the error because of a failing IO, but this is the proper
-"synchronisation point", that is, it gives a chance to the driver to
-cleanup, waiting for pending stuff (timers, whatever, etc...) to
-complete; it can take semaphores, schedule, etc... everything but touch
-the device. Within this function and after it returns, the driver
+not care about link resets. Typically a driver will want to know about
+a slot_reset().
+
+The actual steps taken by a platform to recover from a PCI error
+event will be platform-dependent, but will follow the general 
+sequence described below.
+
+STEP 0: Error Event
+-------------------
+A PCI bus error is detected by the PCI hardware.  On powerpc, the slot 
+is isolated, in that all I/O is blocked: all reads return 0xffffffff, 
+all writes are ignored.
+
+
+STEP 1: Notification
+--------------------
+Platform calls the error_detected() callback on every instance of 
+every driver affected by the error.
+
+At this point, the device might not be accessible anymore, depending on 
+the platform (the slot will be isolated on powerpc). The driver may 
+already have "noticed" the error because of a failing I/O, but this 
+is the proper "synchronization point", that is, it gives the driver 
+a chance to cleanup, waiting for pending stuff (timers, whatever, etc...) 
+to complete; it can take semaphores, schedule, etc... everything but 
+touch the device. Within this function and after it returns, the driver
 shouldn't do any new IOs. Called in task context. This is sort of a
 "quiesce" point. See note about interrupts at the end of this doc.
 
-	Result codes:
-		- PCIERR_RESULT_CAN_RECOVER:
-		  Driever returns this if it thinks it might be able to recover
+All drivers participating in this system must implement this call.
+The driver must return one of the following result codes:
+		- PCI_ERS_RESULT_CAN_RECOVER:
+		  Driver returns this if it thinks it might be able to recover
 		  the HW by just banging IOs or if it wants to be given
-		  a chance to extract some diagnostic informations (see
-		  below).
-		- PCIERR_RESULT_NEED_RESET:
-		  Driver returns this if it thinks it can't recover unless the
-		  slot is reset.
-		- PCIERR_RESULT_DISCONNECT:
-		  Return this if driver thinks it won't recover at all,
-		  (this will detach the driver ? or just leave it
-		  dangling ? to be decided)
-
-So at this point, we have called error_detected() for all drivers
-on the segment that had the error. On ppc64, the slot is isolated. What
-happens now typically depends on the result from the drivers. If all
-drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would
-re-enable IOs on the slot (or do nothing special if the platform doesn't
-isolate slots) and call 2). If not and we can reset slots, we go to 4),
-if neither, we have a dead slot. If it's an hotplug slot, we might
-"simulate" reset by triggering HW unplug/replug though.
+		  a chance to extract some diagnostic information (see
+		  mmio_enabled, below).
+		- PCI_ERS_RESULT_NEED_RESET:
+		  Driver returns this if it can't recover without a hard
+		  slot reset.
+		- PCI_ERS_RESULT_DISCONNECT:
+		  Driver returns this if it doesn't want to recover at all.
+
+The next step taken will depend on the result codes returned by the 
+drivers. 
+
+If all drivers on the segment/slot return PCI_ERS_RESULT_CAN_RECOVER, 
+then the platform should re-enable IOs on the slot (or do nothing in
+particular, if the platform doesn't isolate slots), and recovery 
+proceeds to STEP 2 (MMIO Enable).
+
+If any driver requested a slot reset (by returning PCI_ERS_RESULT_NEED_RESET), 
+then recovery proceeds to STEP 4 (Slot Reset).
+
+If the platform is unable to recover the slot, the next step 
+is STEP 6 (Permanent Failure).
 
->>> Current ppc64 implementation assumes that a device driver will
->>> *not* schedule or semaphore in this routine; the current ppc64
+>>> The current powerpc implementation assumes that a device driver will
+>>> *not* schedule or semaphore in this routine; the current powerpc
 >>> implementation uses one kernel thread to notify all devices;
->>> thus, of one device sleeps/schedules, all devices are affected.
+>>> thus, if one device sleeps/schedules, all devices are affected.
 >>> Doing better requires complex multi-threaded logic in the error
 >>> recovery implementation (e.g. waiting for all notification threads
 >>> to "join" before proceeding with recovery.)  This seems excessively
 >>> complex and not worth implementing.
 
->>> The current ppc64 implementation doesn't much care if the device
->>> attempts i/o at this point, or not.  I/O's will fail, returning
+>>> The current powerpc implementation doesn't much care if the device
+>>> attempts I/O at this point, or not.  I/O's will fail, returning
 >>> a value of 0xff on read, and writes will be dropped. If the device
 >>> driver attempts more than 10K I/O's to a frozen adapter, it will
 >>> assume that the device driver has gone into an infinite loop, and
->>> it will panic the the kernel.
+>>> it will panic the kernel. There doesn't seem to be any other
+>>> way of stopping a device driver that insists on spinning on I/O.
 
-	2) mmio_enabled()
+STEP 2: MMIO Enabled
+--------------------
+The platform re-enables MMIO to the device (but typically not the 
+DMA), and then calls the mmio_enabled() callback on all affected 
+device drivers.
 
-	This is the "early recovery" call. IOs are allowed again, but DMA is
+This is the "early recovery" call. IOs are allowed again, but DMA is
 not (hrm... to be discussed, I prefer not), with some restrictions. This
 is NOT a callback for the driver to start operations again, only to
 peek/poke at the device, extract diagnostic information, if any, and
 eventually do things like trigger a device local reset or some such,
-but not restart operations. This is sent if all drivers on a segment
-agree that they can try to recover and no automatic link reset was
-performed by the HW. If the platform can't just re-enable IOs without
-a slot reset or a link reset, it doesn't call this callback and goes
-directly to 3) or 4). All IOs should be done _synchronously_ from
-within this callback, errors triggered by them will be returned via
-the normal pci_check_whatever() api, no new error_detected() callback
-will be issued due to an error happening here. However, such an error
-might cause IOs to be re-blocked for the whole segment, and thus
-invalidate the recovery that other devices on the same segment might
-have done, forcing the whole segment into one of the next states,
-that is link reset or slot reset.
+but not restart operations. This callback is made if all drivers on 
+a segment agree that they can try to recover and if no automatic link reset 
+was performed by the HW. If the platform can't just re-enable IOs without
+a slot reset or a link reset, it won't call this callback, and instead
+will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset).
+
+>>> The following is proposed; no platform implements this yet:
+>>> Proposal: All I/O's should be done _synchronously_ from within 
+>>> this callback, errors triggered by them will be returned via
+>>> the normal pci_check_whatever() API, no new error_detected() 
+>>> callback will be issued due to an error happening here. However, 
+>>> such an error might cause IOs to be re-blocked for the whole 
+>>> segment, and thus invalidate the recovery that other devices 
+>>> on the same segment might have done, forcing the whole segment 
+>>> into one of the next states, that is, link reset or slot reset.
 
-	Result codes:
-		- PCIERR_RESULT_RECOVERED
+The driver should return one of the following result codes:
+		- PCI_ERS_RESULT_RECOVERED
 		  Driver returns this if it thinks the device is fully
-		  functionnal and thinks it is ready to start
+		  functional and thinks it is ready to start
 		  normal driver operations again. There is no
 		  guarantee that the driver will actually be
 		  allowed to proceed, as another driver on the
 		  same segment might have failed and thus triggered a
 		  slot reset on platforms that support it.
 
-		- PCIERR_RESULT_NEED_RESET
+		- PCI_ERS_RESULT_NEED_RESET
 		  Driver returns this if it thinks the device is not
 		  recoverable in it's current state and it needs a slot
 		  reset to proceed.
 
-		- PCIERR_RESULT_DISCONNECT
+		- PCI_ERS_RESULT_DISCONNECT
 		  Same as above. Total failure, no recovery even after
 		  reset driver dead. (To be defined more precisely)
 
->>> The current ppc64 implementation does not implement this callback.
-
-	3) link_reset()
-
-	This is called after the link has been reset. This is typically
-a PCI Express specific state at this point and is done whenever a
-non-fatal error has been detected that can be "solved" by resetting
-the link. This call informs the driver of the reset and the driver
-should check if the device appears to be in working condition.
-This function acts a bit like 2) mmio_enabled(), in that the driver
-is not supposed to restart normal driver I/O operations right away.
-Instead, it should just "probe" the device to check it's recoverability
-status. If all is right, then the core will call resume() once all
-drivers have ack'd link_reset().
+The next step taken depends on the results returned by the drivers.
+If all drivers returned PCI_ERS_RESULT_RECOVERED, then the platform
+proceeds to either STEP 3 (Link Reset) or to STEP 5 (Resume Operations).
+
+If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform
+proceeds to STEP 4 (Slot Reset)
+
+>>> The current powerpc implementation does not implement this callback.
+
+
+STEP 3: Link Reset
+------------------
+The platform resets the link, and then calls the link_reset() callback
+on all affected device drivers.  This is a PCI-Express specific state 
+and is done whenever a non-fatal error has been detected that can be 
+"solved" by resetting the link. This call informs the driver of the 
+reset and the driver should check to see if the device appears to be 
+in working condition.
+
+The driver is not supposed to restart normal driver I/O operations 
+at this point.  It should limit itself to "probing" the device to 
+check its recoverability status. If all is right, then the platform
+will call resume() once all drivers have ack'd link_reset().
 
 	Result codes:
-		(identical to mmio_enabled)
+		(identical to STEP 2 (MMIO Enabled))
 
->>> The current ppc64 implementation does not implement this callback.
+The platform then proceeds to either STEP 4 (Slot Reset) or STEP 5
+(Resume Operations).
 
-	4) slot_reset()
+>>> The current powerpc implementation does not implement this callback.
 
-	This is called after the slot has been soft or hard reset by the
-platform.  A soft reset consists of asserting the adapter #RST line
-and then restoring the PCI BARs and PCI configuration header. If the
-platform supports PCI hotplug, then it might instead perform a hard
-reset by toggling power on the slot off/on. This call gives drivers
-the chance to re-initialize the hardware (re-download firmware, etc.),
-but drivers shouldn't restart normal I/O processing operations at
-this point.  (See note about interrupts; interrupts aren't guaranteed
-to be delivered until the resume() callback has been called). If all
-device drivers report success on this callback, the patform will call
-resume() to complete the error handling and let the driver restart
-normal I/O processing.
+
+STEP 4: Slot Reset
+------------------
+The platform performs a soft or hard reset of the device, and then
+calls the slot_reset() callback.
+
+A soft reset consists of asserting the adapter #RST line and then 
+restoring the PCI BAR's and PCI configuration header to a state
+that is equivalent to what it would be after a fresh system 
+power-on followed by power-on BIOS/system firmware initialization.
+If the platform supports PCI hotplug, then the reset might be 
+performed by toggling the slot electrical power off/on. 
+
+It is important for the platform to restore the PCI config space
+to the "fresh poweron" state, rather than the "last state". After
+a slot reset, the device driver will almost always use its standard
+device initialization routines, and an unusual config space setup
+may result in hung devices, kernel panics, or silent data corruption.
+
+This call gives drivers the chance to re-initialize the hardware 
+(re-download firmware, etc.).  At this point, the driver may assume
+that the card is in a fresh state and is fully functional. In
+particular, interrupt generation should work normally.
+
+Drivers should not yet restart normal I/O processing operations 
+at this point.  If all device drivers report success on this 
+callback, the platform will call resume() to complete the sequence,
+and let the driver restart normal I/O processing.
 
 A driver can still return a critical failure for this function if
 it can't get the device operational after reset.  If the platform
-previously tried a soft reset, it migh now try a hard reset (power
+previously tried a soft reset, it might now try a hard reset (power
 cycle) and then call slot_reset() again.  It the device still can't
 be recovered, there is nothing more that can be done;  the platform
 will typically report a "permanent failure" in such a case.  The
 device will be considered "dead" in this case.
 
+Drivers for multi-function cards will need to coordinate among
+themselves as to which driver instance will perform any "one-shot"
+or global device initialization. For example, the Symbios sym53cxx2 
+driver performs device init only from PCI function 0:
+
++       if (PCI_FUNC(pdev->devfn) == 0)
++               sym_reset_scsi_bus(np, 0);
+
 	Result codes:
-		- PCIERR_RESULT_DISCONNECT
+		- PCI_ERS_RESULT_DISCONNECT
 		Same as above.
 
->>> The current ppc64 implementation does not try a power-cycle reset
->>> if the driver returned PCIERR_RESULT_DISCONNECT. However, it should.
-
-	5) resume()
+Platform proceeds either to STEP 5 (Resume Operations) or STEP 6 (Permanent 
+Failure).
 
-	This is called if all drivers on the segment have returned
-PCIERR_RESULT_RECOVERED from one of the 3 prevous callbacks.
-That basically tells the driver to restart activity, tht everything
-is back and running. No result code is taken into account here. If
-a new error happens, it will restart a new error handling process.
-
-That's it. I think this covers all the possibilities. The way those
-callbacks are called is platform policy. A platform with no slot reset
-capability for example may want to just "ignore" drivers that can't
+>>> The current powerpc implementation does not try a
+>>> power-cycle reset if the driver returned PCI_ERS_RESULT_DISCONNECT.
+>>> However, it probably should.
+
+
+STEP 5: Resume Operations
+-------------------------
+The platform will call the resume() callback on all affected device 
+drivers if all drivers on the segment have returned
+PCI_ERS_RESULT_RECOVERED from one of the 3 previous callbacks.
+The goal of this callback is to tell the driver to restart activity, 
+that everything is back and running. This callback does not return
+a result code.
+
+At this point, if a new error happens, the platform will restart 
+a new error recovery sequence.
+
+STEP 6: Permanent Failure
+-------------------------
+A "permanent failure" has occurred, and the platform cannot recover
+the device.  The platform will call error_detected() with a 
+pci_channel_state value of pci_channel_io_perm_failure. 
+
+The device driver should, at this point, assume the worst. It should
+cancel all pending I/O, refuse all new I/O, returning -EIO to 
+higher layers. The device driver should then clean up all of its 
+memory and remove itself from kernel operations, much as it would
+during system shutdown.
+
+The platform will typically notify the system operator of the 
+permanent failure in some way.  If the device is hotplug-capable,
+the operator will probably want to remove and replace the device.
+Note, however, not all failures are truly "permanent". Some are
+caused by over-heating, some by a poorly seated card. Many 
+PCI error events are caused by software bugs, e.g. DMAs to
+wild addresses or bogus split transactions due to programming 
+errors. See the discussion in powerpc/eeh-pci-error-recovery.txt
+for additional detail on real-life experience of the causes of 
+software errors.
+
+
+Conclusion; General Remarks
+---------------------------
+The way those callbacks are called is platform policy. A platform with 
+no slot reset capability may want to just "ignore" drivers that can't
 recover (disconnect them) and try to let other cards on the same segment
 recover. Keep in mind that in most real life cases, though, there will
 be only one driver per segment.
 
-Now, there is a note about interrupts. If you get an interrupt and your
+Now, a note about interrupts. If you get an interrupt and your
 device is dead or has been isolated, there is a problem :)
-
-After much thinking, I decided to leave that to the platform. That is,
-the recovery API only precies that:
+The current policy is to turn this into a platform policy.
+That is, the recovery API only requires that:
 
  - There is no guarantee that interrupt delivery can proceed from any
 device on the segment starting from the error detection and until the
-restart callback is sent, at which point interrupts are expected to be
+resume callback is sent, at which point interrupts are expected to be
 fully operational.
 
- - There is no guarantee that interrupt delivery is stopped, that is, ad
-river that gets an interrupts after detecting an error, or that detects
-and error within the interrupt handler such that it prevents proper
+ - There is no guarantee that interrupt delivery is stopped, that is, 
+a driver that gets an interrupt after detecting an error, or that detects
+an error within the interrupt handler such that it prevents proper
 ack'ing of the interrupt (and thus removal of the source) should just
-return IRQ_NOTHANDLED. It's up to the platform to deal with taht
-condition, typically by masking the irq source during the duration of
+return IRQ_NOTHANDLED. It's up to the platform to deal with that
+condition, typically by masking the IRQ source during the duration of
 the error handling. It is expected that the platform "knows" which
 interrupts are routed to error-management capable slots and can deal
-with temporarily disabling that irq number during error processing (this
+with temporarily disabling that IRQ number during error processing (this
 isn't terribly complex). That means some IRQ latency for other devices
 sharing the interrupt, but there is simply no other way. High end
 platforms aren't supposed to share interrupts between many devices
 anyway :)
 
+>>> Implementation details for the powerpc platform are discussed in
+>>> the file Documentation/powerpc/eeh-pci-error-recovery.txt
+
+>>> As of this writing, there are six device drivers with patches
+>>> implementing error recovery. Not all of these patches are in
+>>> mainline yet. These may be used as "examples":
+>>>
+>>> drivers/scsi/ipr.c
+>>> drivers/scsi/sym53c8xx_2
+>>> drivers/net/e100.c
+>>> drivers/net/e1000
+>>> drivers/net/ixgb
+>>> drivers/net/s2io.c
 
-Revised: 31 May 2005 Linas Vepstas <linas at austin.ibm.com>
+The End
+-------
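
For readers following the STEP 4 to STEP 6 flow in the patch above, here
is a minimal user-space sketch of the decision the platform makes once
every driver has answered slot_reset(). It is not kernel code: the enum
and function names are made up for illustration and only mirror the
PCI_ERS_RESULT_* values named in the text. It also assumes the strictest
policy (one failing driver sends the whole segment to permanent failure),
whereas the document explicitly leaves that choice to the platform.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel result type; only the values the text names. */
enum ers_result {
	ERS_RESULT_RECOVERED,	/* mirrors PCI_ERS_RESULT_RECOVERED */
	ERS_RESULT_DISCONNECT	/* mirrors PCI_ERS_RESULT_DISCONNECT */
};

/* Hypothetical labels for the two possible follow-up steps. */
enum next_step {
	STEP_RESUME,		/* STEP 5: call resume() on every driver */
	STEP_PERM_FAILURE	/* STEP 6: error_detected() with perm_failure */
};

/*
 * Decide what follows slot_reset(): resume only if every driver on the
 * segment reported success.  Assumed policy: a single DISCONNECT fails
 * the whole segment.
 */
static enum next_step after_slot_reset(const enum ers_result *results,
				       size_t n)
{
	size_t i;

	assert(results != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (results[i] != ERS_RESULT_RECOVERED)
			return STEP_PERM_FAILURE;
	}
	return STEP_RESUME;
}
```

A platform that prefers to disconnect only the failing driver and let the
rest of the segment recover would filter the results instead of returning
early.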


From benh at kernel.crashing.org  Fri Feb  3 12:42:37 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 03 Feb 2006 12:42:37 +1100
Subject: Maple freezing on PCI Target-Abort
In-Reply-To: <43E23B4A.4020402@yahoo.fr>
References: <43E23B4A.4020402@yahoo.fr>
Message-ID: <1138930958.4934.102.camel@localhost.localdomain>


> -What exception vector is taking care of a DERR excp? From what I can 
> see it seems to be the "machine check" vector. But that seems a bit 
> drastic to me. After all this is just a PCI target abort.

I would expect a machine check yes.

> -I expect that the normal behavior would be for the kernel to send a 
> signal termination to the user process which caused the PIO READ PCI 
> cycle (from a previously mmap()'ed VMA address). Is it  doable on this 
> platform?  Since a READ operation is coupled by nature, I think this is 
> the only acceptable way.

It should SIGBUS except if the problem occurred in the kernel. I don't
know why it's not doing so, maybe you are hitting an issue/errata or
misconfiguration of the 925 ?

> I have tried to set the MSR[RI] bit before doing the PCI cycle, but it 
> didn't change anything. Also on our design we disconnect the
> CPC925 checkstop pin from the 970 machine check pin (see page 39 of
> cpc925 user's manual). So a DERR shouldn't cause a machine check I would
> think.
> 
> I realize that these questions are very H/W related but couldn't find 
> the answer in IBM doc.


From benh at kernel.crashing.org  Fri Feb  3 12:45:03 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 03 Feb 2006 12:45:03 +1100
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131210805.GA19465@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
Message-ID: <1138931103.4934.105.camel@localhost.localdomain>

On Tue, 2006-01-31 at 15:08 -0600, linas wrote:

> Hmm. But these slots are not hot-pluggable; should the arch
> use the hotplug infrastructure even on those slots?

If those are EEH slots, they should probably be treated as hotpluggable...
after all, didn't we discuss back then that one strategy we could use
for recovery is simulating an unplug/replug to the driver along with a slot
hard reset?
 

From benh at kernel.crashing.org  Fri Feb  3 12:56:01 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 03 Feb 2006 12:56:01 +1100
Subject: [PATCH 2.6.16-rc1] Fix booting Maple boards (was: Re:
	LINUXPPC64 Maple fails to boot current git)
In-Reply-To: <20060131151117.GP22672@smtp.west.cox.net>
References: <20060130171759.GE22672@smtp.west.cox.net>
	<1138662630.3417.26.camel@brick.watson.ibm.com>
	<20060131151117.GP22672@smtp.west.cox.net>
Message-ID: <1138931761.4934.113.camel@localhost.localdomain>


> When looking for legacy serial ports, condition poking of "ISA" areas
> on CONFIG_GENERIC_ISA_DMA, rather than CONFIG_ISA as some boards (such
> as the Maple) have no ISA slots, but do have ISA serial ports.

Hrm... not sure ISA_DMA has anything to do with that at all.. in fact
it's more like "has legacy devices". I don't remember adding the ifdef
CONFIG_ISA in the first place, maybe I did... it's a bit dodgy I'd say.
Indeed, lots of machines have ISA devices (a superIO typically) without
having ISA slots...

Ben.

> Signed-off-by: Tom Rini <trini at kernel.crashing.org>
> 
>  arch/powerpc/kernel/legacy_serial.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/legacy_serial.c b/arch/powerpc/kernel/legacy_serial.c
> index f970ace..3dd7b39 100644
> --- a/arch/powerpc/kernel/legacy_serial.c
> +++ b/arch/powerpc/kernel/legacy_serial.c
> @@ -134,7 +134,7 @@ static int __init add_legacy_soc_port(st
>  	return add_legacy_port(np, -1, UPIO_MEM, addr, addr, NO_IRQ, flags);
>  }
>  
> -#ifdef CONFIG_ISA
> +#ifdef CONFIG_GENERIC_ISA_DMA
>  static int __init add_legacy_isa_port(struct device_node *np,
>  				      struct device_node *isa_brg)
>  {
> @@ -276,7 +276,7 @@ void __init find_legacy_serial_ports(voi
>  		of_node_put(soc);
>  	}
>  
> -#ifdef CONFIG_ISA
> +#ifdef CONFIG_GENERIC_ISA_DMA
>  	/* First fill our array with ISA ports */
>  	for (np = NULL; (np = of_find_node_by_type(np, "serial"));) {
>  		struct device_node *isa = of_get_parent(np);
> 


From benh at kernel.crashing.org  Fri Feb  3 12:53:21 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 03 Feb 2006 12:53:21 +1100
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060131212624.GA10513@kroah.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131212624.GA10513@kroah.com>
Message-ID: <1138931602.4934.110.camel@localhost.localdomain>


> That's only because your arch might happen to have 1 device per slot,
> which is not true for other arches.  And I bet it's also not true for
> your non-virtual boxes...

Even that is not true since we can have multi-function devices or
devices with p2p bridges but the basic entity where the error management
infos is available to us is indeed the physical slot.

> People have suggested that they create such a driver for a long time.
> Why not just do that?

Depends if he wants per domain statistics or really per slot ... we do
have per-slot control on most of IBM machines, thus I would rather have
these info there (though if he also wants consolidated "global" stats,
then yes, a host controller driver might be the way to go).


From linas at austin.ibm.com  Fri Feb  3 13:03:41 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Thu, 2 Feb 2006 20:03:41 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <1138931103.4934.105.camel@localhost.localdomain>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<1138931103.4934.105.camel@localhost.localdomain>
Message-ID: <20060203020341.GR24916@austin.ibm.com>

On Fri, Feb 03, 2006 at 12:45:03PM +1100, Benjamin Herrenschmidt was heard to remark:
> On Tue, 2006-01-31 at 15:08 -0600, linas wrote:
> 
> > Hmm. But these slots are not hot-pluggable; should the arch
> > use the hotplug infrastructure even on those slots?
> 
> If those are EEH slots, they should probably be treated as hotpluggable...
> after all, didn't we discuss back then that one strategy we could use
> for recovery is simulating an unplug/replug to the driver along with a slot
> hard reset?

Yes, and EEH does do that (in mainline, 10K times in a row, 
last I tried). This email was in reference to the 
layout of /sys/bus/pci/slots which seems to have only hotplug
slots in there; I am not yet sure why. It's possible John Rose
can shed some rapid insight?

--linas


From gregkh at suse.de  Fri Feb  3 14:48:41 2006
From: gregkh at suse.de (Greg KH)
Date: Thu, 2 Feb 2006 19:48:41 -0800
Subject: [PATCH]: Documentation: Updated PCI Error Recovery
In-Reply-To: <20060203000602.GQ24916@austin.ibm.com>
References: <20060203000602.GQ24916@austin.ibm.com>
Message-ID: <20060203034841.GA14169@suse.de>

On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote:
> 
> I'm not sure who I'm addressing this patch to: Linus, maybe?

As it's PCI related, I'll take it, like the other PCI stuff, and put it
into my trees, which go into -mm, and then into Linus's tree.

I'll add this to my queue.

thanks,

greg k-h


From boutcher at cs.umn.edu  Fri Feb  3 18:18:50 2006
From: boutcher at cs.umn.edu (Dave C Boutcher)
Date: Fri, 3 Feb 2006 01:18:50 -0600
Subject: [PATCH 0/3] powerpc minor fixes to the rtas_percpu_suspend_me routine
Message-ID: <17379.986.599275.637898@hound.rchland.ibm.com>


A series of small fixes to the rtas_percpu_suspend_me routine
for problems discovered since it was pushed to 2.6.16-rc1.

Dave Boutcher


From boutcher at cs.umn.edu  Fri Feb  3 18:18:36 2006
From: boutcher at cs.umn.edu (Dave C Boutcher)
Date: Fri, 3 Feb 2006 01:18:36 -0600
Subject: [PATCH 3/3] powerpc remove useless call to touch_softlockup_watchdog
Message-ID: <17379.972.53558.75428@hound.rchland.ibm.com>


It turns out that we can't stop the watchdog from
triggering here.  If we touch the timer (which just uses the current jiffies
value) before we enable interrupts, it does nothing because jiffies
are not mass-updated until after we enable interrupts.  If we touch the
timer after we enable interrupts, it's too late because the softlockup
watchdog will already have triggered.  The touch_softlockup_watchdog
call removed below does nothing.

Signed-off-by: Dave Boutcher <sleddog at us.ibm.com>

---

 arch/powerpc/kernel/rtas.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

14caae1e3b5508ce8798618f9f952f14e7c6d41a
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 4038ac1..1ecfcf8 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -598,10 +598,6 @@ static void rtas_percpu_suspend_me(void 
 	}
 
 out:
-	/* before we restore interrupts, make sure we don't
-	 * generate a spurious soft lockup errors
-	 */
-	touch_softlockup_watchdog();
 	local_irq_restore(flags);
 	return;
 }
-- 
1.1.4.g7310


From boutcher at cs.umn.edu  Fri Feb  3 18:18:39 2006
From: boutcher at cs.umn.edu (Dave C Boutcher)
Date: Fri, 3 Feb 2006 01:18:39 -0600
Subject: [PATCH 2/3] powerpc prod all processors after ibm,suspend-me
Message-ID: <17379.975.326033.286493@hound.rchland.ibm.com>


We need to prod everyone here since this is the only CPU that is
guaranteed to be running after the ibm,suspend-me RTAS call returns.

Signed-off-by: Dave Boutcher <sleddog at us.ibm.com>

---

 arch/powerpc/kernel/rtas.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

9d615a50c077f82926732c8b9f366bebe50a4660
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 107bd86..4038ac1 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -565,6 +565,7 @@ static int ibm_suspend_me_token = RTAS_U
 #ifdef CONFIG_PPC_PSERIES
 static void rtas_percpu_suspend_me(void *info)
 {
+	int i;
 	long rc;
 	long flags;
 	struct rtas_suspend_me_data *data =
@@ -589,6 +590,8 @@ static void rtas_percpu_suspend_me(void 
 		data->waiting = 0;
 		data->args->args[data->args->nargs] =
 			rtas_call(ibm_suspend_me_token, 0, 1, NULL);
+		for_each_cpu(i)
+			plpar_hcall_norets(H_PROD,i);
 	} else {
 		data->waiting = -EBUSY;
 		printk(KERN_ERR "Error on H_Join hypervisor call\n");
-- 
1.1.4.g7310


From boutcher at cs.umn.edu  Fri Feb  3 18:18:46 2006
From: boutcher at cs.umn.edu (Dave C Boutcher)
Date: Fri, 3 Feb 2006 01:18:46 -0600
Subject: [PATCH 1/3] powerpc return correct rtas status from ibm,suspend-me
Message-ID: <17379.982.159401.407606@hound.rchland.ibm.com>


Correctly return the status from the RTAS call.  rtas_call() hands back
the status as its return value, not through the outputs buffer.

Signed-off-by: Dave Boutcher <sleddog at us.ibm.com>

---

 arch/powerpc/kernel/rtas.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

a0f3095607ff19d730f2ed5181bd37df231d4015
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 7fe4a5c..107bd86 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -587,8 +587,8 @@ static void rtas_percpu_suspend_me(void 
 
 	if (rc == H_Continue) {
 		data->waiting = 0;
-		rtas_call(ibm_suspend_me_token, 0, 1,
-			  data->args->args);
+		data->args->args[data->args->nargs] =
+			rtas_call(ibm_suspend_me_token, 0, 1, NULL);
 	} else {
 		data->waiting = -EBUSY;
 		printk(KERN_ERR "Error on H_Join hypervisor call\n");
-- 
1.1.4.g7310


From michael at ellerman.id.au  Fri Feb  3 19:05:14 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 03 Feb 2006 19:05:14 +1100
Subject: [PATCH] powerpc: Don't start secondary CPUs in a UP && KEXEC kernel
Message-ID: <20060203080536.DA5AF68A10@ozlabs.org>

Because smp_release_cpus() is built for SMP || KEXEC, it's not safe to
unconditionally call it from setup_system(). On a UP && KEXEC kernel we'll
start up the secondary CPUs which will then go berserk and we die.

Simple fix is to conditionally call smp_release_cpus() in setup_system(). With
that in place we don't need the dummy definition of smp_release_cpus() because
all call sites are #ifdef'ed either SMP or KEXEC.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/kernel/setup_64.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: kdump/arch/powerpc/kernel/setup_64.c
===================================================================
--- kdump.orig/arch/powerpc/kernel/setup_64.c
+++ kdump/arch/powerpc/kernel/setup_64.c
@@ -311,8 +311,6 @@ void smp_release_cpus(void)
 
 	DBG(" <- smp_release_cpus()\n");
 }
-#else
-#define smp_release_cpus()
 #endif /* CONFIG_SMP || CONFIG_KEXEC */
 
 /*
@@ -470,10 +468,12 @@ void __init setup_system(void)
 	check_smt_enabled();
 	smp_setup_cpu_maps();
 
+#ifdef CONFIG_SMP
 	/* Release secondary cpus out of their spinloops at 0x60 now that
 	 * we can map physical -> logical CPU ids
 	 */
 	smp_release_cpus();
+#endif
 
 	printk("Starting Linux PPC64 %s\n", system_utsname.version);
 

From michael at ellerman.id.au  Fri Feb  3 19:05:47 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 03 Feb 2006 19:05:47 +1100
Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel
Message-ID: <20060203080609.403CA68A1F@ozlabs.org>

It's possible for prom_init to allocate the flat device tree inside the
kdump crash kernel region. If this happens, when we load the kdump kernel we
overwrite the flattened device tree, which is bad.

We could make prom_init try and avoid allocating inside the crash kernel
region, but then we run into issues if the crash kernel region uses all the
space inside the RMO. The easiest solution is to move the flat device tree
once we're running in the kernel.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/kernel/prom.c     |   27 +++++++++++++++++++++++++++
 arch/powerpc/kernel/setup_64.c |    3 +++
 include/asm-powerpc/prom.h     |    2 ++
 3 files changed, 32 insertions(+)

Index: kdump/arch/powerpc/kernel/prom.c
===================================================================
--- kdump.orig/arch/powerpc/kernel/prom.c
+++ kdump/arch/powerpc/kernel/prom.c
@@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n
 
 	return 0;
 }
+
+#ifdef CONFIG_KEXEC
+/* We may have allocated the flat device tree inside the crash kernel region
+ * in prom_init. If so we need to move it out into regular memory. */
+void kdump_move_device_tree(void)
+{
+	unsigned long start, end;
+	struct boot_param_header *new;
+
+	start = __pa((unsigned long)initial_boot_params);
+	end = start + initial_boot_params->totalsize;
+
+	if (end < crashk_res.start || start > crashk_res.end)
+		return;
+
+	new = (struct boot_param_header*)
+		__va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE));
+
+	memcpy(new, initial_boot_params, initial_boot_params->totalsize);
+
+	initial_boot_params = new;
+
+	DBG("Flat device tree blob moved to %p\n", initial_boot_params);
+
+	/* XXX should we unreserve the old DT? */
+}
+#endif /* CONFIG_KEXEC */
Index: kdump/arch/powerpc/kernel/setup_64.c
===================================================================
--- kdump.orig/arch/powerpc/kernel/setup_64.c
+++ kdump/arch/powerpc/kernel/setup_64.c
@@ -398,6 +398,9 @@ void __init setup_system(void)
 {
 	DBG(" -> setup_system()\n");
 
+#ifdef CONFIG_KEXEC
+	kdump_move_device_tree();
+#endif
 	/*
 	 * Unflatten the device-tree passed by prom_init or kexec
 	 */
Index: kdump/include/asm-powerpc/prom.h
===================================================================
--- kdump.orig/include/asm-powerpc/prom.h
+++ kdump/include/asm-powerpc/prom.h
@@ -222,5 +222,7 @@ extern int of_address_to_resource(struct
 extern int of_pci_address_to_resource(struct device_node *dev, int bar,
 				      struct resource *r);
 
+extern void kdump_move_device_tree(void);
+
 #endif /* __KERNEL__ */
 #endif /* _POWERPC_PROM_H */
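
As a side note on the range test in kdump_move_device_tree() above, the
following is a small stand-alone sketch of an overlap check on half-open
ranges. It is only an illustration, not the code from the patch: the
struct and function names are hypothetical. The patch compares an
exclusive "end" against the bounds of crashk_res (a struct resource, whose
end is inclusive), which errs slightly on the side of moving the tree;
the sketch uses half-open ranges on both sides to keep the comparison
symmetric.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open range [start, end): 'end' is one past the last byte. */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Two half-open ranges overlap exactly when each one starts before the
 * other one ends.  Ranges that merely touch do not overlap.
 */
static bool ranges_overlap(struct range a, struct range b)
{
	assert(a.start <= a.end && b.start <= b.end);

	return a.start < b.end && b.start < a.end;
}
```

With this form, a device tree blob that ends exactly where the crash
region begins is reported as not overlapping and would be left in place.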


From benh at kernel.crashing.org  Fri Feb  3 20:07:37 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 03 Feb 2006 20:07:37 +1100
Subject: creating PCI-related sysfs entries
In-Reply-To: <20060203020341.GR24916@austin.ibm.com>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<1138931103.4934.105.camel@localhost.localdomain>
	<20060203020341.GR24916@austin.ibm.com>
Message-ID: <1138957657.4934.124.camel@localhost.localdomain>

On Thu, 2006-02-02 at 20:03 -0600, Linas Vepstas wrote:

> Yes, and EEH does do that (in mainline, 10K times in a row, 
> last I tried). This email was in reference to the 
> layout of /sys/bus/pci/slots which seems to have only hotplug
> slots in there; I am not yet sure why. It's possible John Rose
> can shed some rapid insight?

Ok... also, about this "max number of resets" thing, it would be useful
in fact to have a rate limit instead ... a network card that for some
reason needs to be reset about once a day is still fairly usable and it
would be nice if the system didn't consider it dead after 10 days ...

Also, it might be useful to have an entry to force a retry on a card
that has been considered dead...

Ben.
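
To make the rate-limit idea above concrete, here is a rough user-space
sketch. It is not from the thread or from the EEH code: the structure, the
function names and the RESET_HISTORY constant are all hypothetical. It
keeps the timestamps of the last few resets in a ring buffer and refuses
a new reset only if that many resets already happened within the window,
so a card that resets once a day is never declared dead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RESET_HISTORY 4	/* hypothetical: resets tolerated per window */

struct reset_limiter {
	unsigned long window;			/* window length, in seconds */
	unsigned long stamps[RESET_HISTORY];	/* times of recent resets */
	size_t count;				/* valid entries in stamps[] */
	size_t next;				/* ring-buffer write position */
};

static void reset_limiter_init(struct reset_limiter *rl, unsigned long window)
{
	assert(rl != NULL);

	rl->window = window;
	rl->count = 0;
	rl->next = 0;
}

/*
 * Ask whether a reset may be attempted at time 'now' (seconds, never
 * decreasing).  Returns false if RESET_HISTORY resets already happened
 * within the last 'window' seconds; otherwise records the reset and
 * returns true.
 */
static bool reset_allowed(struct reset_limiter *rl, unsigned long now)
{
	assert(rl != NULL);

	if (rl->count == RESET_HISTORY) {
		/* Once the ring is full, 'next' indexes the oldest stamp. */
		unsigned long oldest = rl->stamps[rl->next];

		if (now - oldest < rl->window)
			return false;	/* too many resets too quickly */
	}

	rl->stamps[rl->next] = now;
	rl->next = (rl->next + 1) % RESET_HISTORY;
	if (rl->count < RESET_HISTORY)
		rl->count++;
	return true;
}
```

A refused attempt is deliberately not recorded, so a later forced retry
(the second suggestion above) would only need to wait for the window to
pass or to re-initialize the limiter.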


From galak at kernel.crashing.org  Sat Feb  4 01:25:08 2006
From: galak at kernel.crashing.org (Kumar Gala)
Date: Fri, 3 Feb 2006 08:25:08 -0600
Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump
	kernel
In-Reply-To: <20060203080609.403CA68A1F@ozlabs.org>
References: <20060203080609.403CA68A1F@ozlabs.org>
Message-ID: <8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org>


On Feb 3, 2006, at 2:05 AM, Michael Ellerman wrote:

> It's possible for prom_init to allocate the flat device tree inside  
> the
> kdump crash kernel region. If this happens, when we load the kdump  
> kernel we
> overwrite the flattened device tree, which is bad.
>
> We could make prom_init try and avoid allocating inside the crash  
> kernel
> region, but then we run into issues if the crash kernel region uses  
> all the
> space inside the RMO. The easiest solution is to move the flat  
> device tree
> once we're running in the kernel.
>
> Signed-off-by: Michael Ellerman <michael at ellerman.id.au>

Doesn't setup_32.c need a similar change?

- k

> ---
>
>  arch/powerpc/kernel/prom.c     |   27 +++++++++++++++++++++++++++
>  arch/powerpc/kernel/setup_64.c |    3 +++
>  include/asm-powerpc/prom.h     |    2 ++
>  3 files changed, 32 insertions(+)
>
> Index: kdump/arch/powerpc/kernel/prom.c
> ===================================================================
> --- kdump.orig/arch/powerpc/kernel/prom.c
> +++ kdump/arch/powerpc/kernel/prom.c
> @@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n
>
>  	return 0;
>  }
> +
> +#ifdef CONFIG_KEXEC
> +/* We may have allocated the flat device tree inside the crash  
> kernel region
> + * in prom_init. If so we need to move it out into regular memory. */
> +void kdump_move_device_tree(void)
> +{
> +	unsigned long start, end;
> +	struct boot_param_header *new;
> +
> +	start = __pa((unsigned long)initial_boot_params);
> +	end = start + initial_boot_params->totalsize;
> +
> +	if (end < crashk_res.start || start > crashk_res.end)
> +		return;
> +
> +	new = (struct boot_param_header*)
> +		__va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE));
> +
> +	memcpy(new, initial_boot_params, initial_boot_params->totalsize);
> +
> +	initial_boot_params = new;
> +
> +	DBG("Flat device tree blob moved to %p\n", initial_boot_params);
> +
> +	/* XXX should we unreserve the old DT? */
> +}
> +#endif /* CONFIG_KEXEC */
> Index: kdump/arch/powerpc/kernel/setup_64.c
> ===================================================================
> --- kdump.orig/arch/powerpc/kernel/setup_64.c
> +++ kdump/arch/powerpc/kernel/setup_64.c
> @@ -398,6 +398,9 @@ void __init setup_system(void)
>  {
>  	DBG(" -> setup_system()\n");
>
> +#ifdef CONFIG_KEXEC
> +	kdump_move_device_tree();
> +#endif
>  	/*
>  	 * Unflatten the device-tree passed by prom_init or kexec
>  	 */
> Index: kdump/include/asm-powerpc/prom.h
> ===================================================================
> --- kdump.orig/include/asm-powerpc/prom.h
> +++ kdump/include/asm-powerpc/prom.h
> @@ -222,5 +222,7 @@ extern int of_address_to_resource(struct
>  extern int of_pci_address_to_resource(struct device_node *dev, int  
> bar,
>  				      struct resource *r);
>
> +extern void kdump_move_device_tree(void);
> +
>  #endif /* __KERNEL__ */
>  #endif /* _POWERPC_PROM_H */


From trini at kernel.crashing.org  Sat Feb  4 01:47:13 2006
From: trini at kernel.crashing.org (Tom Rini)
Date: Fri, 3 Feb 2006 07:47:13 -0700
Subject: LINUXPPC64 Maple fails to boot current git)
In-Reply-To: <1138931761.4934.113.camel@localhost.localdomain>
References: <20060130171759.GE22672@smtp.west.cox.net>
	<1138662630.3417.26.camel@brick.watson.ibm.com>
	<20060131151117.GP22672@smtp.west.cox.net>
	<1138931761.4934.113.camel@localhost.localdomain>
Message-ID: <20060203144713.GE3800@smtp.west.cox.net>

On Fri, Feb 03, 2006 at 12:56:01PM +1100, Benjamin Herrenschmidt wrote:
> 
> > When looking for legacy serial ports, condition poking of "ISA" areas
> > on CONFIG_GENERIC_ISA_DMA, rather than CONFIG_ISA as some boards (such
> > as the Maple) have no ISA slots, but do have ISA serial ports.
> 
> Hrm... not sure ISA_DMA has anything to do with that at all.. in fact
> it's more like "has legacy devices". I don't remember adding the ifdef
> CONFIG_ISA in the first place, maybe I did... it's a bit dodgy I'd say.
> Indeed, lots of machines have ISA devices (a superIO typically) without
> having ISA slots...

Olaf says that he sent a patch to Andrew, who should be passing it along
if not already, to just remove the #ifdefs there.

-- 
Tom Rini
http://gate.crashing.org/~trini/


From ericvh at gmail.com  Sat Feb  4 01:54:41 2006
From: ericvh at gmail.com (Eric Van Hensbergen)
Date: Fri,  3 Feb 2006 08:54:41 -0600 (CST)
Subject: [patch 0/3] systemsim patch cleanup
Message-ID: <20060203145441.6EC0A5A8075@localhost.localdomain>

These are a set of code cleanups based on Arnd's systemsim patch-set sent out
on January 14th.  This patch attempts to clean-up some of the issues with the
bogus network and bogus disk facilities of systemsim -- but is largely
cosmetic.

We had looked at incorporating the bogus devices into the IBM-maintained 
virtualization drivers in the past, but at the time it didn't look like there 
was a good match in the veth or the vscsi code -- the call-thru's would not 
integrate as nicely as they did with the hvc console code.

The bogus disk and bogus network drivers are largely a stop-gap measure for
systems the simulator doesn't have complete device models for.  More
complete device models are already in the plans for systemsim-cell, which
will likely eventually replace the need for the "bogus" drivers.  

As such, I'll maintain the existing bogus drivers out-of-tree in my git 
repository on kernel.org (/pub/scm/linux/kernel/git/ericvh/systemsim.git)
Unless there are any objections, I'll continue cc:'ing the ppc64-dev list on 
modifications to the patches.

   -eric


From ericvh at gmail.com  Sat Feb  4 01:56:17 2006
From: ericvh at gmail.com (Eric Van Hensbergen)
Date: Fri,  3 Feb 2006 08:56:17 -0600 (CST)
Subject: [patch 3/3] systemsim: new systemsim default configuration
Message-ID: <20060203145617.D6FCD5A809C@localhost.localdomain>

Subject: [PATCH] systemsim: clean up default configuration

Signed-off-by: Eric Van Hensbergen <bergevan at us.ibm.com>

---

 arch/powerpc/configs/systemsim_defconfig |  125 +++++++-----------------------
 1 files changed, 28 insertions(+), 97 deletions(-)

72e13e73b5998b853a9bd20e8c425486818ed09a
diff --git a/arch/powerpc/configs/systemsim_defconfig b/arch/powerpc/configs/systemsim_defconfig
index 59f1d0f..f7daa08 100644
--- a/arch/powerpc/configs/systemsim_defconfig
+++ b/arch/powerpc/configs/systemsim_defconfig
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 
-# Fri Jan 13 09:33:18 2006
+# Linux kernel version: 2.6.16-rc1
+# Thu Feb  2 15:18:13 2006
 #
 CONFIG_PPC64=y
 CONFIG_64BIT=y
@@ -18,7 +18,6 @@ CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
 CONFIG_ARCH_MAY_HAVE_PC_FDC=y
 CONFIG_PPC_OF=y
 CONFIG_PPC_UDBG_16550=y
-# CONFIG_CRASH_DUMP is not set
 CONFIG_GENERIC_TBSYNC=y
 
 #
@@ -57,7 +56,7 @@ CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 # CONFIG_CPUSETS is not set
 CONFIG_INITRAMFS_SOURCE=""
-CONFIG_CC_OPTIMIZE_FOR_SIZE=y
+# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
 # CONFIG_EMBEDDED is not set
 CONFIG_KALLSYMS=y
 # CONFIG_KALLSYMS_ALL is not set
@@ -100,11 +99,11 @@ CONFIG_IOSCHED_NOOP=y
 CONFIG_IOSCHED_AS=y
 CONFIG_IOSCHED_DEADLINE=y
 CONFIG_IOSCHED_CFQ=y
-# CONFIG_DEFAULT_AS is not set
+CONFIG_DEFAULT_AS=y
 # CONFIG_DEFAULT_DEADLINE is not set
 # CONFIG_DEFAULT_CFQ is not set
-CONFIG_DEFAULT_NOOP=y
-CONFIG_DEFAULT_IOSCHED="noop"
+# CONFIG_DEFAULT_NOOP is not set
+CONFIG_DEFAULT_IOSCHED="anticipatory"
 
 #
 # Platform support
@@ -116,7 +115,7 @@ CONFIG_PPC_MULTIPLATFORM=y
 CONFIG_PPC_PSERIES=y
 # CONFIG_PPC_PMAC is not set
 CONFIG_PPC_MAPLE=y
-CONFIG_PPC_CELL=y
+# CONFIG_PPC_CELL is not set
 CONFIG_PPC_SYSTEMSIM=y
 CONFIG_SYSTEMSIM_IDLE=y
 CONFIG_XICS=y
@@ -126,9 +125,8 @@ CONFIG_PPC_RTAS=y
 CONFIG_RTAS_ERROR_LOGGING=y
 CONFIG_RTAS_PROC=y
 # CONFIG_RTAS_FLASH is not set
-CONFIG_MMIO_NVRAM=y
+# CONFIG_MMIO_NVRAM is not set
 CONFIG_MPIC_BROKEN_U3=y
-CONFIG_CELL_IIC=y
 CONFIG_IBMVIO=y
 # CONFIG_IBMEBUS is not set
 # CONFIG_PPC_MPC106 is not set
@@ -136,11 +134,6 @@ CONFIG_IBMVIO=y
 # CONFIG_WANT_EARLY_SERIAL is not set
 
 #
-# Cell Broadband Engine options
-#
-CONFIG_SPU_FS=m
-
-#
 # Kernel options
 #
 # CONFIG_HZ_100 is not set
@@ -157,6 +150,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
 # CONFIG_IOMMU_VMERGE is not set
 # CONFIG_HOTPLUG_CPU is not set
 # CONFIG_KEXEC is not set
+# CONFIG_CRASH_DUMP is not set
 # CONFIG_IRQ_ALL_CPUS is not set
 # CONFIG_PPC_SPLPAR is not set
 CONFIG_EEH=y
@@ -299,6 +293,7 @@ CONFIG_BRIDGE_NETFILTER=y
 # Core Netfilter Configuration
 #
 # CONFIG_NETFILTER_NETLINK is not set
+# CONFIG_NETFILTER_XTABLES is not set
 
 #
 # IP: Netfilter Configuration
@@ -315,91 +310,11 @@ CONFIG_IP_NF_TFTP=m
 CONFIG_IP_NF_AMANDA=m
 # CONFIG_IP_NF_PPTP is not set
 CONFIG_IP_NF_QUEUE=m
-CONFIG_IP_NF_IPTABLES=m
-CONFIG_IP_NF_MATCH_LIMIT=m
-# CONFIG_IP_NF_MATCH_IPRANGE is not set
-CONFIG_IP_NF_MATCH_MAC=m
-CONFIG_IP_NF_MATCH_PKTTYPE=m
-CONFIG_IP_NF_MATCH_MARK=m
-CONFIG_IP_NF_MATCH_MULTIPORT=m
-CONFIG_IP_NF_MATCH_TOS=m
-CONFIG_IP_NF_MATCH_RECENT=m
-CONFIG_IP_NF_MATCH_ECN=m
-CONFIG_IP_NF_MATCH_DSCP=m
-CONFIG_IP_NF_MATCH_AH_ESP=m
-CONFIG_IP_NF_MATCH_LENGTH=m
-CONFIG_IP_NF_MATCH_TTL=m
-CONFIG_IP_NF_MATCH_TCPMSS=m
-CONFIG_IP_NF_MATCH_HELPER=m
-CONFIG_IP_NF_MATCH_STATE=m
-CONFIG_IP_NF_MATCH_CONNTRACK=m
-CONFIG_IP_NF_MATCH_OWNER=m
-# CONFIG_IP_NF_MATCH_PHYSDEV is not set
-# CONFIG_IP_NF_MATCH_ADDRTYPE is not set
-# CONFIG_IP_NF_MATCH_REALM is not set
-# CONFIG_IP_NF_MATCH_SCTP is not set
-# CONFIG_IP_NF_MATCH_DCCP is not set
-# CONFIG_IP_NF_MATCH_COMMENT is not set
-# CONFIG_IP_NF_MATCH_HASHLIMIT is not set
-# CONFIG_IP_NF_MATCH_STRING is not set
-# CONFIG_IP_NF_MATCH_POLICY is not set
-CONFIG_IP_NF_FILTER=m
-CONFIG_IP_NF_TARGET_REJECT=m
-CONFIG_IP_NF_TARGET_LOG=m
-CONFIG_IP_NF_TARGET_ULOG=m
-CONFIG_IP_NF_TARGET_TCPMSS=m
-# CONFIG_IP_NF_TARGET_NFQUEUE is not set
-CONFIG_IP_NF_NAT=m
-CONFIG_IP_NF_NAT_NEEDED=y
-CONFIG_IP_NF_TARGET_MASQUERADE=m
-CONFIG_IP_NF_TARGET_REDIRECT=m
-# CONFIG_IP_NF_TARGET_NETMAP is not set
-# CONFIG_IP_NF_TARGET_SAME is not set
-CONFIG_IP_NF_NAT_SNMP_BASIC=m
-CONFIG_IP_NF_NAT_IRC=m
-CONFIG_IP_NF_NAT_FTP=m
-CONFIG_IP_NF_NAT_TFTP=m
-CONFIG_IP_NF_NAT_AMANDA=m
-CONFIG_IP_NF_MANGLE=m
-CONFIG_IP_NF_TARGET_TOS=m
-CONFIG_IP_NF_TARGET_ECN=m
-CONFIG_IP_NF_TARGET_DSCP=m
-CONFIG_IP_NF_TARGET_MARK=m
-# CONFIG_IP_NF_TARGET_CLASSIFY is not set
-# CONFIG_IP_NF_TARGET_TTL is not set
-# CONFIG_IP_NF_RAW is not set
-CONFIG_IP_NF_ARPTABLES=m
-CONFIG_IP_NF_ARPFILTER=m
-CONFIG_IP_NF_ARP_MANGLE=m
 
 #
 # IPv6: Netfilter Configuration (EXPERIMENTAL)
 #
 # CONFIG_IP6_NF_QUEUE is not set
-CONFIG_IP6_NF_IPTABLES=m
-CONFIG_IP6_NF_MATCH_LIMIT=m
-CONFIG_IP6_NF_MATCH_MAC=m
-CONFIG_IP6_NF_MATCH_RT=m
-CONFIG_IP6_NF_MATCH_OPTS=m
-CONFIG_IP6_NF_MATCH_FRAG=m
-CONFIG_IP6_NF_MATCH_HL=m
-CONFIG_IP6_NF_MATCH_MULTIPORT=m
-CONFIG_IP6_NF_MATCH_OWNER=m
-CONFIG_IP6_NF_MATCH_MARK=m
-CONFIG_IP6_NF_MATCH_IPV6HEADER=m
-CONFIG_IP6_NF_MATCH_AHESP=m
-CONFIG_IP6_NF_MATCH_LENGTH=m
-CONFIG_IP6_NF_MATCH_EUI64=m
-# CONFIG_IP6_NF_MATCH_PHYSDEV is not set
-# CONFIG_IP6_NF_MATCH_POLICY is not set
-CONFIG_IP6_NF_FILTER=m
-CONFIG_IP6_NF_TARGET_LOG=m
-# CONFIG_IP6_NF_TARGET_REJECT is not set
-# CONFIG_IP6_NF_TARGET_NFQUEUE is not set
-CONFIG_IP6_NF_MANGLE=m
-CONFIG_IP6_NF_TARGET_MARK=m
-# CONFIG_IP6_NF_TARGET_HL is not set
-# CONFIG_IP6_NF_RAW is not set
 
 #
 # DECnet: Netfilter Configuration
@@ -443,6 +358,11 @@ CONFIG_IPDDP_ENCAP=y
 CONFIG_IPDDP_DECAP=y
 # CONFIG_X25 is not set
 # CONFIG_LAPB is not set
+
+#
+# TIPC Configuration (EXPERIMENTAL)
+#
+# CONFIG_TIPC is not set
 CONFIG_NET_DIVERT=y
 # CONFIG_ECONET is not set
 CONFIG_WAN_ROUTER=m
@@ -555,6 +475,7 @@ CONFIG_MTD_CFI_I2=y
 # CONFIG_MTD_RAM is not set
 # CONFIG_MTD_ROM is not set
 # CONFIG_MTD_ABSENT is not set
+# CONFIG_MTD_OBSOLETE_CHIPS is not set
 
 #
 # Mapping drivers for chip access
@@ -707,7 +628,6 @@ CONFIG_SYSTEMSIM_NET=y
 # CONFIG_SK98LIN is not set
 # CONFIG_TIGON3 is not set
 # CONFIG_BNX2 is not set
-# CONFIG_SPIDER_NET is not set
 # CONFIG_MV643XX_ETH is not set
 
 #
@@ -815,7 +735,7 @@ CONFIG_HW_CONSOLE=y
 CONFIG_SERIAL_8250=y
 # CONFIG_SERIAL_8250_CONSOLE is not set
 CONFIG_SERIAL_8250_NR_UARTS=4
-CONFIG_SERIAL_8250_RUNTIME_UARTS=2
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
 # CONFIG_SERIAL_8250_EXTENDED is not set
 
 #
@@ -826,7 +746,10 @@ CONFIG_SERIAL_CORE=y
 CONFIG_UNIX98_PTYS=y
 CONFIG_LEGACY_PTYS=y
 CONFIG_LEGACY_PTY_COUNT=256
+CONFIG_HVC_DRIVER=y
 # CONFIG_HVC_CONSOLE is not set
+CONFIG_HVC_FSS=y
+CONFIG_HVC_RTAS=y
 # CONFIG_HVCS is not set
 
 #
@@ -864,6 +787,12 @@ CONFIG_LEGACY_PTY_COUNT=256
 # CONFIG_I2C is not set
 
 #
+# SPI support
+#
+# CONFIG_SPI is not set
+# CONFIG_SPI_MASTER is not set
+
+#
 # Dallas's 1-wire bus
 #
 # CONFIG_W1 is not set
@@ -1057,6 +986,7 @@ CONFIG_UNIXWARE_DISKLABEL=y
 CONFIG_SGI_PARTITION=y
 # CONFIG_ULTRIX_PARTITION is not set
 CONFIG_SUN_PARTITION=y
+# CONFIG_KARMA_PARTITION is not set
 # CONFIG_EFI_PARTITION is not set
 
 #
@@ -1137,6 +1067,7 @@ CONFIG_DEBUG_SPINLOCK_SLEEP=y
 # CONFIG_DEBUG_INFO is not set
 # CONFIG_DEBUG_FS is not set
 # CONFIG_DEBUG_VM is not set
+CONFIG_FORCED_INLINING=y
 # CONFIG_RCU_TORTURE_TEST is not set
 # CONFIG_DEBUG_STACKOVERFLOW is not set
 # CONFIG_DEBUG_STACK_USAGE is not set
-- 
1.0.GIT


From ericvh at gmail.com  Sat Feb  4 01:55:06 2006
From: ericvh at gmail.com (Eric Van Hensbergen)
Date: Fri,  3 Feb 2006 08:55:06 -0600 (CST)
Subject: [patch 1/3] systemsim: cleanup systemsim network patch
Message-ID: <20060203145506.1E0405A807B@localhost.localdomain>

Subject: [PATCH] systemsim: clean-up systemsim network patch

Incorporate some of the LKML feedback, clean up naming conventions, and fix
a bogus free in the close routine.

Signed-off-by: Eric Van Hensbergen <bergevan at us.ibm.com>

---

 drivers/net/systemsim_net.c |  113 ++++++++++++++++++++++---------------------
 1 files changed, 57 insertions(+), 56 deletions(-)

79e30c5718a29c6de20e45f00bc1b458b359c29c
diff --git a/drivers/net/systemsim_net.c b/drivers/net/systemsim_net.c
index babc1fb..0a4cea9 100644
--- a/drivers/net/systemsim_net.c
+++ b/drivers/net/systemsim_net.c
@@ -60,32 +60,32 @@
 #include <linux/version.h>
 #include <asm/systemsim.h>
 
-#define MAMBO_BOGUS_NET_PROBE   119
-#define MAMBO_BOGUS_NET_SEND    120
-#define MAMBO_BOGUS_NET_RECV    121
+#define SYSTEMSIM_NET_PROBE   119
+#define SYSTEMSIM_NET_SEND    120
+#define SYSTEMSIM_NET_RECV    121
 
-static inline int MamboBogusNetProbe(int devno, void *buf)
+static inline int systemsim_bogusnet_probe(int devno, void *buf)
 {
-	return callthru2(MAMBO_BOGUS_NET_PROBE,
+	return callthru2(SYSTEMSIM_NET_PROBE,
 			 (unsigned long)devno, (unsigned long)buf);
 }
 
-static inline int MamboBogusNetSend(int devno, void *buf, ulong size)
+static inline int systemsim_bogusnet_send(int devno, void *buf, ulong size)
 {
-	return callthru3(MAMBO_BOGUS_NET_SEND,
+	return callthru3(SYSTEMSIM_NET_SEND,
 			 (unsigned long)devno,
 			 (unsigned long)buf, (unsigned long)size);
 }
 
-static inline int MamboBogusNetRecv(int devno, void *buf, ulong size)
+static inline int systemsim_bogusnet_recv(int devno, void *buf, ulong size)
 {
-	return callthru3(MAMBO_BOGUS_NET_RECV,
+	return callthru3(SYSTEMSIM_NET_RECV,
 			 (unsigned long)devno,
 			 (unsigned long)buf, (unsigned long)size);
 }
 
 static irqreturn_t
-mambonet_interrupt(int irq, void *dev_instance, struct pt_regs *regs);
+systemsim_net_intr(int irq, void *dev_instance, struct pt_regs *regs);
 
 #define INIT_BOTTOM_HALF(x,y,z) INIT_WORK(x, y, (void*)z)
 #define SCHEDULE_BOTTOM_HALF(x) schedule_delayed_work(x, 1)
@@ -100,18 +100,18 @@ struct netdev_private {
 	struct net_device_stats stats;
 };
 
-static int mambonet_probedev(int devno, void *buf)
+static int systemsim_net_probedev(int devno, void *buf)
 {
-	struct device_node *mambo;
+	struct device_node *systemsim;
 	struct device_node *net;
 	unsigned int *reg;
 
-	mambo = find_path_device("/mambo");
+	systemsim = find_path_device("/systemsim");
 
-	if (mambo == NULL) {
+	if (systemsim == NULL) {
 		return -1;
 	}
-	net = find_path_device("/mambo/bogus-net at 0");
+	net = find_path_device("/systemsim/bogus-net at 0");
 	if (net == NULL) {
 		return -1;
 	}
@@ -121,20 +121,20 @@ static int mambonet_probedev(int devno, 
 		return -1;
 	}
 
-	return MamboBogusNetProbe(devno, buf);
+	return systemsim_bogusnet_probe(devno, buf);
 }
 
-static int mambonet_send(int devno, void *buf, ulong size)
+static int systemsim_net_send(int devno, void *buf, ulong size)
 {
-	return MamboBogusNetSend(devno, buf, size);
+	return systemsim_bogusnet_send(devno, buf, size);
 }
 
-static int mambonet_recv(int devno, void *buf, ulong size)
+static int systemsim_net_recv(int devno, void *buf, ulong size)
 {
-	return MamboBogusNetRecv(devno, buf, size);
+	return systemsim_bogusnet_recv(devno, buf, size);
 }
 
-static int mambonet_start_xmit(struct sk_buff *skb, struct net_device *dev)
+static int systemsim_net_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct netdev_private *priv = (struct netdev_private *)dev->priv;
 	int devno = priv->devno;
@@ -142,7 +142,7 @@ static int mambonet_start_xmit(struct sk
 	skb->dev = dev;
 
 	/* we might need to checksum or something */
-	mambonet_send(devno, skb->data, skb->len);
+	systemsim_net_send(devno, skb->data, skb->len);
 
 	dev->last_rx = jiffies;
 	priv->stats.rx_bytes += skb->len;
@@ -155,7 +155,7 @@ static int mambonet_start_xmit(struct sk
 	return (0);
 }
 
-static int mambonet_poll(struct net_device *dev, int *budget)
+static int systemsim_net_poll(struct net_device *dev, int *budget)
 {
 	struct netdev_private *np = dev->priv;
 	int devno = np->devno;
@@ -166,7 +166,7 @@ static int mambonet_poll(struct net_devi
 	int max_frames = min(*budget, dev->quota);
 	int ret = 0;
 
-	while ((ns = mambonet_recv(devno, buffer, 1600)) > 0) {
+	while ((ns = systemsim_net_recv(devno, buffer, 1600)) > 0) {
 		if ((skb = dev_alloc_skb(ns + 2)) != NULL) {
 			skb->dev = dev;
 			skb_reserve(skb, 2);	/* 16 byte align the IP
@@ -209,12 +209,12 @@ static int mambonet_poll(struct net_devi
 	return ret;
 }
 
-static void mambonet_timer(struct net_device *dev)
+static void systemsim_net_timer(struct net_device *dev)
 {
 	int budget = 16;
 	struct netdev_private *priv = (struct netdev_private *)dev->priv;
 
-	mambonet_poll(dev, &budget);
+	systemsim_net_poll(dev, &budget);
 
 	if (!priv->closing) {
 		SCHEDULE_BOTTOM_HALF(&priv->poll_task);
@@ -228,7 +228,7 @@ static struct net_device_stats *get_stat
 }
 
 static irqreturn_t
-mambonet_interrupt(int irq, void *dev_instance, struct pt_regs *regs)
+systemsim_net_intr(int irq, void *dev_instance, struct pt_regs *regs)
 {
 	struct net_device *dev = dev_instance;
 	if (netif_rx_schedule_prep(dev)) {
@@ -237,7 +237,7 @@ mambonet_interrupt(int irq, void *dev_in
 	return IRQ_HANDLED;
 }
 
-static int mambonet_open(struct net_device *dev)
+static int systemsim_net_open(struct net_device *dev)
 {
 	struct netdev_private *priv;
 	int ret = 0;
@@ -245,29 +245,30 @@ static int mambonet_open(struct net_devi
 	priv = dev->priv;
 
 	/*
-	 * we can't start polling in mambonet_init, because I don't think
+	 * we can't start polling in systemsim_net_init, because I don't think
 	 * workqueues are usable that early. so start polling now.
 	 */
 
 	if (dev->irq) {
-		ret = request_irq(dev->irq, &mambonet_interrupt, 0,
+		ret = request_irq(dev->irq, &systemsim_net_intr, 0,
 				  dev->name, dev);
 
 		if (ret == 0) {
 			netif_start_queue(dev);
 		} else {
-			printk(KERN_ERR "mambonet: request irq failed\n");
+			printk(KERN_ERR "systemsim net: request irq failed\n");
 		}
 
-		MamboBogusNetProbe(priv->devno, NULL);	/* probe with NULL to activate interrupts */
+		/* probe with NULL to activate interrupts */
+		systemsim_bogusnet_probe(priv->devno, NULL);	
 	} else {
-		mambonet_timer(dev);
+		systemsim_net_timer(dev);
 	}
 
 	return ret;
 }
 
-static int mambonet_close(struct net_device *dev)
+static int systemsim_net_close(struct net_device *dev)
 {
 	struct netdev_private *priv;
 
@@ -282,30 +283,29 @@ static int mambonet_close(struct net_dev
 		KILL_BOTTOM_HALF(&priv->poll_task);
 	}
 
-	kfree(priv);
-
 	return 0;
 }
 
-static struct net_device_stats mambonet_stats;
+static struct net_device_stats systemsim_net_stats;
 
-static struct net_device_stats *mambonet_get_stats(struct net_device *dev)
+static struct net_device_stats *systemsim_net_get_stats(struct net_device *dev)
 {
-	return &mambonet_stats;
+	return &systemsim_net_stats;
 }
 
-static int mambonet_set_mac_address(struct net_device *dev, void *p)
+static int systemsim_net_set_mac_address(struct net_device *dev, void *p)
 {
 	return -EOPNOTSUPP;
 }
-static int mambonet_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
+static int systemsim_net_ioctl(struct net_device *dev, struct ifreq *ifr,
+			       int cmd)
 {
 	return -EOPNOTSUPP;
 }
 static int nextdevno = 0;	/* running count of device numbers */
 
 /* Initialize the rest of the device. */
-int __init do_mambonet_probe(struct net_device *dev)
+int __init do_systemsim_net_probe(struct net_device *dev)
 {
 	struct netdev_private *priv;
 	int devno = nextdevno++;
@@ -313,7 +313,7 @@ int __init do_mambonet_probe(struct net_
 
 	printk("eth%d: bogus network driver initialization\n", devno);
 
-	irq = mambonet_probedev(devno, dev->dev_addr);
+	irq = systemsim_net_probedev(devno, dev->dev_addr);
 
 	if (irq < 0) {
 		printk("No IRQ retreived\n");
@@ -328,14 +328,14 @@ int __init do_mambonet_probe(struct net_
 
 	dev->irq = irq;
 	dev->mtu = MAMBO_MTU;
-	dev->open = mambonet_open;
-	dev->poll = mambonet_poll;
+	dev->open = systemsim_net_open;
+	dev->poll = systemsim_net_poll;
 	dev->weight = 16;
-	dev->stop = mambonet_close;
-	dev->hard_start_xmit = mambonet_start_xmit;
-	dev->get_stats = mambonet_get_stats;
-	dev->set_mac_address = mambonet_set_mac_address;
-	dev->do_ioctl = mambonet_ioctl;
+	dev->stop = systemsim_net_close;
+	dev->hard_start_xmit = systemsim_net_start_xmit;
+	dev->get_stats = systemsim_net_get_stats;
+	dev->set_mac_address = systemsim_net_set_mac_address;
+	dev->do_ioctl = systemsim_net_ioctl;
 
 	dev->priv = kmalloc(sizeof(struct netdev_private), GFP_KERNEL);
 	if (dev->priv == NULL)
@@ -348,14 +348,14 @@ int __init do_mambonet_probe(struct net_
 	dev->get_stats = get_stats;
 
 	if (dev->irq == 0) {
-		INIT_BOTTOM_HALF(&priv->poll_task, (void *)mambonet_timer,
+		INIT_BOTTOM_HALF(&priv->poll_task, (void *)systemsim_net_timer,
 				 (void *)dev);
 	}
 
 	return (0);
 };
 
-struct net_device *__init mambonet_probe(int unit)
+struct net_device *__init systemsim_net_probe(int unit)
 {
 	struct net_device *dev = alloc_etherdev(0);
 	int err;
@@ -366,7 +366,7 @@ struct net_device *__init mambonet_probe
 	sprintf(dev->name, "eth%d", unit);
 	netdev_boot_setup_check(dev);
 
-	err = do_mambonet_probe(dev);
+	err = do_systemsim_net_probe(dev);
 
 	if (err)
 		goto out;
@@ -382,11 +382,12 @@ struct net_device *__init mambonet_probe
 	return ERR_PTR(err);
 }
 
-int __init init_mambonet(void)
+int __init init_systemsim_net(void)
 {
-	mambonet_probe(0);
+	systemsim_net_probe(0);
 	return 0;
 }
 
-module_init(init_mambonet);
+module_init(init_systemsim_net);
+MODULE_DESCRIPTION("Systemsim Network Driver");
 MODULE_LICENSE("GPL");
-- 
1.0.GIT


From ericvh at gmail.com  Sat Feb  4 01:55:36 2006
From: ericvh at gmail.com (Eric Van Hensbergen)
Date: Fri,  3 Feb 2006 08:55:36 -0600 (CST)
Subject: [patch 2/3] systemsim: cleanup systemsim block driver patch
Message-ID: <20060203145536.CB9C35A8098@localhost.localdomain>

Subject: [PATCH] systemsim: clean up systemsim block driver

Clean up the systemsim block driver and integrate some of the suggestions
from LKML.

Signed-off-by: Eric Van Hensbergen <bergevan at us.ibm.com>

---

 drivers/block/systemsim_bd.c |  159 ++++++++++++++++++++++++------------------
 1 files changed, 91 insertions(+), 68 deletions(-)

ea40711c3a573b917cade94c1bdca659e4f3f905
diff --git a/drivers/block/systemsim_bd.c b/drivers/block/systemsim_bd.c
index deecfb8..bec453e 100644
--- a/drivers/block/systemsim_bd.c
+++ b/drivers/block/systemsim_bd.c
@@ -11,7 +11,7 @@
  *    written by Pavel Machek and Steven Whitehouse
  *
  *  Some code is from the IBM Full System Simulator Group in ARL
- *  Author: PAtrick Bohrer <IBM Austin Research Lab>
+ *  Author: Patrick Bohrer <IBM Austin Research Lab>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -43,7 +43,7 @@
 #include <linux/ioctl.h>
 #include <linux/blkdev.h>
 #include <net/sock.h>
-
+#include <asm/prom.h>
 #include <asm/systemsim.h>
 
 #include <linux/devfs_fs_kernel.h>
@@ -52,21 +52,21 @@
 #include <asm/types.h>
 
 #define MAJOR_NR 42
-#define MAX_MBD 128
+#define MAX_SYSTEMSIM_BD 128
 
-#define MBD_SET_BLKSIZE _IO( 0xab, 1 )
-#define MBD_SET_SIZE    _IO( 0xab, 2 )
-#define MBD_SET_SIZE_BLOCKS     _IO( 0xab, 7 )
-#define MBD_DISCONNECT  _IO( 0xab, 8 )
+#define SYSTEMSIM_BD_SET_BLKSIZE _IO( 0xab, 1 )
+#define SYSTEMSIM_BD_SET_SIZE    _IO( 0xab, 2 )
+#define SYSTEMSIM_BD_SET_SIZE_BLOCKS     _IO( 0xab, 7 )
+#define SYSTEMSIM_BD_DISCONNECT  _IO( 0xab, 8 )
 
-struct mbd_device {
+struct systemsim_bd_device {
 	int initialized;
 	int refcnt;
 	int flags;
 	struct gendisk *disk;
 };
 
-static struct mbd_device mbd_dev[MAX_MBD];
+static struct systemsim_bd_device systemsim_bd_dev[MAX_SYSTEMSIM_BD];
 
 #define BD_INFO_SYNC   0
 #define BD_INFO_STATUS 1
@@ -79,7 +79,7 @@ static struct mbd_device mbd_dev[MAX_MBD
 #define BOGUS_DISK_INFO  118
 
 static inline int
-MamboBogusDiskRead(int devno, void *buf, ulong sect, ulong nrsect)
+systemsim_disk_read(int devno, void *buf, ulong sect, ulong nrsect)
 {
 	return callthru3(BOGUS_DISK_READ, (unsigned long)buf,
 			 (unsigned long)sect,
@@ -87,34 +87,34 @@ MamboBogusDiskRead(int devno, void *buf,
 }
 
 static inline int
-MamboBogusDiskWrite(int devno, void *buf, ulong sect, ulong nrsect)
+systemsim_disk_write(int devno, void *buf, ulong sect, ulong nrsect)
 {
 	return callthru3(BOGUS_DISK_WRITE, (unsigned long)buf,
 			 (unsigned long)sect,
 			 (unsigned long)((nrsect << 16) | devno));
 }
 
-static inline int MamboBogusDiskInfo(int op, int devno)
+static inline int systemsim_disk_info(int op, int devno)
 {
 	return callthru2(BOGUS_DISK_INFO, (unsigned long)op,
 			 (unsigned long)devno);
 }
 
-static int mbd_init_disk(int devno)
+static int systemsim_bd_init_disk(int devno)
 {
-	struct gendisk *disk = mbd_dev[devno].disk;
+	struct gendisk *disk = systemsim_bd_dev[devno].disk;
 	unsigned int sz;
 
 	/* check disk configured */
-	if (!MamboBogusDiskInfo(BD_INFO_STATUS, devno)) {
+	if (!systemsim_disk_info(BD_INFO_STATUS, devno)) {
 		printk(KERN_ERR
 		       "Attempting to open bogus disk before initializaiton\n");
 		return 0;
 	}
 
-	mbd_dev[devno].initialized++;
+	systemsim_bd_dev[devno].initialized++;
 
-	sz = MamboBogusDiskInfo(BD_INFO_DEVSZ, devno);
+	sz = systemsim_disk_info(BD_INFO_DEVSZ, devno);
 
 	printk("Initializing disk %d with devsz %u\n", devno, sz);
 
@@ -123,7 +123,7 @@ static int mbd_init_disk(int devno)
 	return 1;
 }
 
-static void do_mbd_request(request_queue_t * q)
+static void do_systemsim_bd_request(request_queue_t * q)
 {
 	int result = 0;
 	struct request *req;
@@ -133,14 +133,14 @@ static void do_mbd_request(request_queue
 
 		switch (rq_data_dir(req)) {
 		case READ:
-			result = MamboBogusDiskRead(minor,
-						    req->buffer, req->sector,
-						    req->current_nr_sectors);
-			break;
-		case WRITE:
-			result = MamboBogusDiskWrite(minor,
+			result = systemsim_disk_read(minor,
 						     req->buffer, req->sector,
 						     req->current_nr_sectors);
+			break;
+		case WRITE:
+			result = systemsim_disk_write(minor,
+						      req->buffer, req->sector,
+						      req->current_nr_sectors);
 		};
 
 		if (result)
@@ -150,108 +150,131 @@ static void do_mbd_request(request_queue
 	}
 }
 
-static int mbd_release(struct inode *inode, struct file *file)
+static int systemsim_bd_release(struct inode *inode, struct file *file)
 {
-	struct mbd_device *lo;
+	struct systemsim_bd_device *lo;
 	int dev;
 
 	if (!inode)
 		return -ENODEV;
 	dev = inode->i_bdev->bd_disk->first_minor;
-	if (dev >= MAX_MBD)
+	if (dev >= MAX_SYSTEMSIM_BD)
 		return -ENODEV;
-	if (MamboBogusDiskInfo(BD_INFO_SYNC, dev) < 0) {
-		printk(KERN_ALERT "mbd_release: unable to sync\n");
+	if (systemsim_disk_info(BD_INFO_SYNC, dev) < 0) {
+		printk(KERN_ALERT "systemsim_bd_release: unable to sync\n");
 	}
-	lo = &mbd_dev[dev];
+	lo = &systemsim_bd_dev[dev];
 	if (lo->refcnt <= 0)
-		printk(KERN_ALERT "mbd_release: refcount(%d) <= 0\n",
+		printk(KERN_ALERT "systemsim_bd_release: refcount(%d) <= 0\n",
 		       lo->refcnt);
 	lo->refcnt--;
 	return 0;
 }
 
-static int mbd_revalidate(struct gendisk *disk)
+static int systemsim_bd_revalidate(struct gendisk *disk)
 {
 	int devno = disk->first_minor;
 
-	mbd_init_disk(devno);
+	systemsim_bd_init_disk(devno);
 
 	return 0;
 }
 
-static int mbd_open(struct inode *inode, struct file *file)
+static int systemsim_bd_open(struct inode *inode, struct file *file)
 {
 	int dev;
 
 	if (!inode)
 		return -EINVAL;
 	dev = inode->i_bdev->bd_disk->first_minor;
-	if (dev >= MAX_MBD)
+	if (dev >= MAX_SYSTEMSIM_BD)
 		return -ENODEV;
 
 	check_disk_change(inode->i_bdev);
 
-	if (!mbd_dev[dev].initialized)
-		if (!mbd_init_disk(dev))
+	if (!systemsim_bd_dev[dev].initialized)
+		if (!systemsim_bd_init_disk(dev))
 			return -ENODEV;
 
-	mbd_dev[dev].refcnt++;
+	systemsim_bd_dev[dev].refcnt++;
 	return 0;
 }
 
-static struct block_device_operations mbd_fops = {
+static struct block_device_operations systemsim_bd_fops = {
       owner:THIS_MODULE,
-      open:mbd_open,
-      release:mbd_release,
-	/* media_changed:      mbd_check_change, */
-      revalidate_disk:mbd_revalidate,
+      open:systemsim_bd_open,
+      release:systemsim_bd_release,
+	/* media_changed:      systemsim_bd_check_change, */
+      revalidate_disk:systemsim_bd_revalidate,
 };
 
-static spinlock_t mbd_lock = SPIN_LOCK_UNLOCKED;
+static spinlock_t systemsim_bd_lock = SPIN_LOCK_UNLOCKED;
 
-static int __init mbd_init(void)
+static int __init systemsim_bd_init(void)
 {
+	struct device_node *systemsim;
 	int err = -ENOMEM;
 	int i;
 
-	for (i = 0; i < MAX_MBD; i++) {
+	systemsim = find_path_device("/systemsim");
+
+	if (systemsim == NULL) {
+		printk("NO SYSTEMSIM BOGUS DISK DETECTED\n");
+		return -1;
+	}
+
+	/*
+	 * We could detect which disks are configured in openfirmware
+	 * but I think this unnecessarily limits us from being able to 
+	 * hot-plug bogus disks during run-time.
+	 *
+	 */
+
+	for (i = 0; i < MAX_SYSTEMSIM_BD; i++) {
 		struct gendisk *disk = alloc_disk(1);
 		if (!disk)
 			goto out;
-		mbd_dev[i].disk = disk;
+		systemsim_bd_dev[i].disk = disk;
 		/*
 		 * The new linux 2.5 block layer implementation requires
 		 * every gendisk to have its very own request_queue struct.
 		 * These structs are big so we dynamically allocate them.
 		 */
-		disk->queue = blk_init_queue(do_mbd_request, &mbd_lock);
+		disk->queue =
+		    blk_init_queue(do_systemsim_bd_request, &systemsim_bd_lock);
 		if (!disk->queue) {
 			put_disk(disk);
 			goto out;
 		}
 	}
 
-	if (register_blkdev(MAJOR_NR, "mbd")) {
+	if (register_blkdev(MAJOR_NR, "systemsim_bd")) {
 		err = -EIO;
 		goto out;
 	}
 #ifdef MODULE
-	printk("mambo bogus disk: registered device at major %d\n", MAJOR_NR);
+	printk("systemsim bogus disk: registered device at major %d\n",
+	       MAJOR_NR);
 #else
-	printk("mambo bogus disk: compiled in with kernel\n");
+	printk("systemsim bogus disk: compiled in with kernel\n");
 #endif
 
+	/* 
+	 * left device name alone for now as too much depends on it
+	 * external to the kernel
+	 *
+	 */
+
 	devfs_mk_dir("mambobd");
-	for (i = 0; i < MAX_MBD; i++) {	/* load defaults */
-		struct gendisk *disk = mbd_dev[i].disk;
-		mbd_dev[i].initialized = 0;
-		mbd_dev[i].refcnt = 0;
-		mbd_dev[i].flags = 0;
+	for (i = 0; i < MAX_SYSTEMSIM_BD; i++) {	/* load defaults */
+		struct gendisk *disk = systemsim_bd_dev[i].disk;
+		systemsim_bd_dev[i].initialized = 0;
+		systemsim_bd_dev[i].refcnt = 0;
+		systemsim_bd_dev[i].flags = 0;
 		disk->major = MAJOR_NR;
 		disk->first_minor = i;
-		disk->fops = &mbd_fops;
-		disk->private_data = &mbd_dev[i];
+		disk->fops = &systemsim_bd_fops;
+		disk->private_data = &systemsim_bd_dev[i];
 		sprintf(disk->disk_name, "mambobd%d", i);
 		sprintf(disk->devfs_name, "mambobd%d", i);
 		set_capacity(disk, 0x7ffffc00ULL << 1);	/* 2 TB */
@@ -261,25 +284,25 @@ static int __init mbd_init(void)
 	return 0;
       out:
 	while (i--) {
-		if (mbd_dev[i].disk->queue)
-			blk_cleanup_queue(mbd_dev[i].disk->queue);
-		put_disk(mbd_dev[i].disk);
+		if (systemsim_bd_dev[i].disk->queue)
+			blk_cleanup_queue(systemsim_bd_dev[i].disk->queue);
+		put_disk(systemsim_bd_dev[i].disk);
 	}
 	return -EIO;
 }
 
-static void __exit mbd_cleanup(void)
+static void __exit systemsim_bd_cleanup(void)
 {
 	devfs_remove("mambobd");
 
-	if (unregister_blkdev(MAJOR_NR, "mbd") != 0)
-		printk("mbd: cleanup_module failed\n");
+	if (unregister_blkdev(MAJOR_NR, "systemsim_bd") != 0)
+		printk("systemsim_bd: cleanup_module failed\n");
 	else
-		printk("mbd: module cleaned up.\n");
+		printk("systemsim_bd: module cleaned up.\n");
 }
 
-module_init(mbd_init);
-module_exit(mbd_cleanup);
+module_init(systemsim_bd_init);
+module_exit(systemsim_bd_cleanup);
 
-MODULE_DESCRIPTION("Mambo Block Device");
+MODULE_DESCRIPTION("Systemsim Block Device");
 MODULE_LICENSE("GPL");
-- 
1.0.GIT


From jfaslist at yahoo.fr  Sat Feb  4 02:58:36 2006
From: jfaslist at yahoo.fr (jfaslist)
Date: Fri, 03 Feb 2006 16:58:36 +0100
Subject: Maple freezing on PCI Target-Abort
In-Reply-To: <1138930958.4934.102.camel@localhost.localdomain>
References: <43E23B4A.4020402@yahoo.fr>
	<1138930958.4934.102.camel@localhost.localdomain>
Message-ID: <43E37DAC.4030606@yahoo.fr>

Hi,
Yes, we are going to dig into all this CPC925 and Processor Interface 
initialization.
Note that I checked that both MSR_ME and MSR_RI were set prior to 
triggering the PCI Target-Abort.

-MSR_ME: If not set, the CPU will "checkstop" on a machine check.
-MSR_RI: So that the exception is recoverable.
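
A minimal sketch of the check described above, not code from the kernel
tree. The bit positions follow the usual Linux PowerPC definitions
(MSR_ME is bit 12 and MSR_RI is bit 1, counting from the least
significant bit). The helper name msr_allows_recovery() is made up for
illustration, and how the MSR value is obtained is left out.

```c
#include <stdint.h>

/* Bit positions as in the Linux PowerPC headers (LSB = bit 0). */
#define MSR_ME (1ULL << 12)	/* Machine check Enable */
#define MSR_RI (1ULL << 1)	/* Recoverable Interrupt */

/*
 * Hypothetical helper: returns 1 only when both MSR[ME] and MSR[RI]
 * are set in the given MSR image, 0 otherwise.
 */
static int msr_allows_recovery(uint64_t msr)
{
	return (msr & (MSR_ME | MSR_RI)) == (MSR_ME | MSR_RI);
}
```

With only one of the two bits set the helper returns 0, which matches
the two cases listed above: no ME means checkstop, no RI means the
exception is not recoverable.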

Regarding MSR_RI, this should always be set, I think?
Thanks
-jfs


Benjamin Herrenschmidt wrote:

>>-What exception vector is taking care of a DERR excp? From what I can 
>>see it seems to be the "machine check" vector. But that seems a bit 
>>drastic to me. After all this is just a PCI target abort.
>>    
>>
>
>I would expect a machine check yes.
>
>  
>
>>-I expect that the normal behavior would be for the kernel to send a 
>>signal termination to the user process which caused the PIO READ PCI 
>>cycle (from a previously mmap()'ed VMA address). Is it  doable on this 
>>platform?  Since a READ operation is coupled by nature, I think this is 
>>the only acceptable way.
>>    
>>
>
>It should SIGBUS except if the problem occurred in the kernel. I don't
>know why it's not doing so, maybe you are hitting an issue/errata or
>misconfiguration of the 925 ?
>
>  
>
>>I have tried to set the MSR[RI] bit before doing the PCI cycle, but it 
>>didn't change anything. Also on our design we disconnect the 
>>CPC925 checkstop pin from the 970 machine check pin.(see page 39 of 
>>cpc925 user's manual). So a DERR shouldn't cause a machine check I would 
>>think.
>>
>>I realize that these questions are very H/W related but couldn't find 
>>the answer in IBM doc.




From ahuja at austin.ibm.com  Wed Feb  1 06:11:54 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Tue, 31 Jan 2006 13:11:54 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <20060126204432.GG19465@austin.ibm.com>
References: <43CFC094.8000709@austin.ibm.com>
	<20060126204432.GG19465@austin.ibm.com>
Message-ID: <43DFB67A.5080508@austin.ibm.com>

Yes,

It is probably a good idea to have a #define for it, but since purr is
only available on the power5 architecture, none of the other
architectures really need this code, and maybe I should enclose it
for power5 setups only.


>>+static ssize_t show_dispatchedcycles(struct sys_device *, char *);
>>+static ssize_t show_offline_cpu_cycles(struct sys_device *, char *);
>>+
>>+static SYSDEV_ATTR(offline_cpu_cycles, 0444, show_offline_cpu_cycles, NULL);
>>+static SYSDEV_ATTR(cpu_dispatched_cycles, 0444, show_dispatchedcycles, NULL);
>>    
>>
>
>I think you need a #ifdef CONFIG_PPC64 around the above.
>  
>
>>-	if (cpu_has_feature(CPU_FTR_SMT))
>>+	if (cpu_has_feature(CPU_FTR_SMT)) {
>> 		sysdev_create_file(s, &attr_purr);
>>+		sysdev_create_file(s, &attr_offline_cpu_cycles);
>>+		sysdev_create_file(s, &attr_cpu_dispatched_cycles);
>>+	}
>>    
>>
>
>Shouldn't this be CPU_FTR_PURR not FTR_SMT ? (and also in the next
>section too).
>
>  
>

Yes, the original was FTR_SMT. I overlooked it. Thanks for pointing it out.


>>+/* Defined in setup.c */
>>+extern u64 offline_cpu_total_tb;
>>+extern u64 offline_cpu_total_cpu_util;
>>+extern u64 offline_cpu_total_krncycles;
>>+extern u64 offline_cpu_total_idle;
>>    
>>
>
>These should be in a header file, probably arch/powerpc/kernel/setup.h
>
>  
>
>>+static ssize_t show_offline_cpu_cycles(struct sys_device *dev, char *buf)
>>    
>>
>
>#ifdef CONFIG_PPC64 surrounding the above ....
>
>--linas
>  
>


Okay, I can move it around, if it's okay with everyone else.

Thanks for the comments.


From linas at austin.ibm.com  Sat Feb  4 03:58:30 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Fri, 3 Feb 2006 10:58:30 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <1138957657.4934.124.camel@localhost.localdomain>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<1138931103.4934.105.camel@localhost.localdomain>
	<20060203020341.GR24916@austin.ibm.com>
	<1138957657.4934.124.camel@localhost.localdomain>
Message-ID: <20060203165829.GS24916@austin.ibm.com>

On Fri, Feb 03, 2006 at 08:07:37PM +1100, Benjamin Herrenschmidt was heard to remark:
> On Thu, 2006-02-02 at 20:03 -0600, Linas Vepstas wrote:
> 
> > Yes, and EEH does do that (in mainline, 10K times in a row, 
> > last I tried). This email was in reference to the 
> > layout of /sys/bus/pci/slots which seems to have only hotplug
> > slots in there; I am not yet sure why. It's possible John Rose 
> > can shed some rapid insight? 
> 
> Ok... also, about this "max number of resets" thing, it would be useful
> in fact to have a rate limit rather ... a network card that for some
> reason needs to be reset about once a day is still fairly usable and it
> would be nice if the system didn't consider it dead after 10 days ...

Yes, I've often thought about this. Only two designs come to mind:

1) a timer pops every 8 hours, and decrements the failure count by 1.
   Thus, anything less than 3 resets a day would be acceptable.

2) Store the jiffies of the last reset. Increment the fail count only if
   previous jiffies is less than 8 hours ago. Set fail count to 1 if
   previous jiffies is more than 48 hours ago. Advantage over 1: no
   timers.
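
The two designs above can be sketched roughly as follows. This is a
userspace model under stated assumptions, not EEH code: time is counted
in plain seconds instead of jiffies, and struct reset_state,
note_reset(), decay_tick() and the MAX_FAILS threshold are all
hypothetical names and values picked for illustration. Only the 8 hour
and 48 hour windows come from the description above.

```c
#include <stdint.h>

#define HOUR		3600ULL		/* seconds; stand-in for jiffies math */
#define RECENT_WINDOW	(8 * HOUR)	/* from design 2 */
#define STALE_WINDOW	(48 * HOUR)	/* from design 2 */
#define MAX_FAILS	5		/* hypothetical "dead" threshold */

/* Hypothetical per-device bookkeeping; not the real EEH structure. */
struct reset_state {
	uint64_t last_reset;		/* time of the previous reset */
	unsigned int fail_count;	/* resets charged against the device */
};

/*
 * Design 2: record a reset at time 'now'. Returns 1 if the device
 * should now be treated as dead, 0 otherwise.
 */
static int note_reset(struct reset_state *st, uint64_t now)
{
	uint64_t age = now - st->last_reset;

	if (st->fail_count == 0 || age > STALE_WINDOW)
		st->fail_count = 1;	/* first failure, or a long quiet spell */
	else if (age < RECENT_WINDOW)
		st->fail_count++;	/* failures are clustering */
	/* between 8 and 48 hours: the count is left unchanged */

	st->last_reset = now;
	return st->fail_count > MAX_FAILS;
}

/* Design 1: body of a timer that fires every 8 hours. */
static void decay_tick(struct reset_state *st)
{
	if (st->fail_count > 0)
		st->fail_count--;
}
```

In this model a card that resets about once a day never gets past a
count of 1 under design 2, since every reset lands in the 8 to 48 hour
band where the count is not touched.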

Any preferences?

> Also, it might be useful to have an entry to force a retry on a card
> that has been considered dead...

Actually, hotplug remove/add or dlpar remove/add can be used to 
clear the count. (and that's how I do my test cases) The problem 
is that the documentation for this is buried somewhere where it
cannot be found.

Actually, this is one of my bigger/biggest concerns: the info 
about any of this is unfindable. I'd like to hype it up a bit,
but am not sure how.

--linas


From linas at austin.ibm.com  Sat Feb  4 04:08:34 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Fri, 3 Feb 2006 11:08:34 -0600
Subject: creating PCI-related sysfs entries
In-Reply-To: <1138931602.4934.110.camel@localhost.localdomain>
References: <20060131202214.GZ19465@austin.ibm.com>
	<20060131203456.GA23819@kroah.com>
	<20060131210805.GA19465@austin.ibm.com>
	<20060131212624.GA10513@kroah.com>
	<1138931602.4934.110.camel@localhost.localdomain>
Message-ID: <20060203170834.GT24916@austin.ibm.com>

On Fri, Feb 03, 2006 at 12:53:21PM +1100, Benjamin Herrenschmidt was heard to remark:
> 
> > People have suggested that they create such a driver for a long time.
> > Why not just do that?
> 
> if he also wants consolidated "global" stats,
> then yes, a host controller driver might be the way to go).

I've had trouble parsing these suggestions. I can certainly 
hack up some pci-host structure so I can publish a few stats
(the goal is to eliminate /proc/ppc64/eeh).  By "hack" I mean
something that would live with either the rpaphp code or the 
powerpc code. However, this is a different type of activity 
than the idea "define a generic architecture-neutral pci-host 
bridge structure".

Maybe I should just do the first, and if it ignites anyone's
imagination, we can talk about the second.

--linas


From info at schihei.de  Sat Feb  4 05:04:42 2006
From: info at schihei.de (Heiko J Schick)
Date: Fri, 3 Feb 2006 19:04:42 +0100
Subject: kernel debugging tool
In-Reply-To: <0ITB0074BC0LA6@mmp2.samsung.com>
References: <0ITB0074BC0LA6@mmp2.samsung.com>
Message-ID: <07030771-257D-4204-A0C4-1833B9F9FBD3@schihei.de>

Hello,

you can also use XMON or KDB, which are kernel debuggers.
XMON is normally included in PowerPC kernels. I think
for KDB you have to patch your kernel, but that could be
wrong.

If you dump out the crash instruction and compare it
with the assembler output of your GCC, you can quickly
find the source code line which caused the kernel panic.

Perhaps the following links help, too:
http://urbanmyth.org/linux/oops/
http://www-128.ibm.com/developerworks/library/l-kprobes.html?ca=dgr-lnxw42Kprobe
http://www-128.ibm.com/developerworks/linux/library/l-kdbug/

Sometimes also very useful, too. :)


On Jan 19, 2006, at 1:00 AM, Hyo Jung Song wrote:

>    We are interested in the Cell BE (Broadband Engine) Linux patch
> (found in
> http://www.bsc.es/projects/deepcomputing/linuxoncell/cbexdev.html).
>    We want to debug kernel sources sometimes. How can we do it?
>    I believe you guys debugged kernel source code for CBE and you
>    used some tools.
>    Could you please give us some tips for this? Thank you.
>
>
>
>    Hyo Jung Song
>    Senior Engineer
>    Samsung Electronics
>    tel. 82-2-3416-0355
>
> -----Original Message-----
> From: Cell Support [mailto:cell_support at bsc.es]
> Sent: Wednesday, January 18, 2006 11:27 PM
> To: hjsong at samsung.com
> Cc: cell_support at bsc.es
> Subject: Re: Fwd: kernel debugging tool
>
> Dear Hyo,
>
> we don't develop Linux patches for Cell BE. We get them from public
> kernel mailing lists and post them to help
> people build a kernel that works with Cell BE. This avoids having to
> go through kernel mailing lists to
> find the correct patch files that fit a specific kernel release. Hope
> this helps people.
>
> We think you should post your question to a Linux kernel mailing list.
> Regarding ppc64 kernel development,
> linuxppc64-dev at ozlabs.org is the right place
> (https://ozlabs.org/mailman/listinfo/linuxppc64-dev). But
> you can also send this to the http://www.kernel.org mailing lists.
> Kernel developers can probably help you
> because they are always debugging their new code.
>
> Hope this helps.
>
> Regards,
>
>> Sender : hjsong at samsung.com
>> Date   : 2006-01-17 10:13
>> Title  : kernel debugging tool
>>
>> Hi.
>>
>> We are interested in the CBE Linux patch.
>> We want to debug kernel sources sometimes. How can we do it?
>> I believe you guys debugged kernel source code for CBE and you used
>> some tools.
>> Could you please give us some tips for this? Thank you.
>>
>>
>>
>> Hyo Jung Song
>> Senior Engineer
>> Samsung Electronics
>> tel. 82-2-3416-0355
>
>
>
> _______________________________________________
> Linuxppc64-dev mailing list
> Linuxppc64-dev at ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc64-dev


From info at schihei.de  Sat Feb  4 04:55:19 2006
From: info at schihei.de (Heiko J Schick)
Date: Fri, 3 Feb 2006 18:55:19 +0100
Subject: kernel debugging tool
In-Reply-To: <0ITB0074BC0LA6@mmp2.samsung.com>
References: <0ITB0074BC0LA6@mmp2.samsung.com>
Message-ID: <759A45E8-8A3E-47CD-B3C9-880C0EBDC25B@schihei.de>

Hello,

you can also use XMON or KDB, which are kernel debuggers.
XMON is normally included in PowerPC kernels. I think
for KDB you have to patch your kernel, but that could be
wrong.

Sometimes also very useful, too. :)



------------------------------------------------------------------------
  heiko j schick             tel1:        +49 (0) 7031 438635
  theodor-storm-weg 14       tel2:        +49 (0) 7431 971370
  71101 schoenaich           mobil-tel:   +49 (0) 172 9365733
                             email:       info at schihei.de
                             homepage:    http://www.schihei.de/
                             icq:         29165160
                             pgp-key id:  0x899AD7DC
------------------------------------------------------------------------


From haren at us.ibm.com  Sat Feb  4 06:03:11 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Fri, 03 Feb 2006 11:03:11 -0800
Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump
	kernel
In-Reply-To: <20060203080609.403CA68A1F@ozlabs.org>
References: <20060203080609.403CA68A1F@ozlabs.org>
Message-ID: <43E3A8EF.8000609@us.ibm.com>

Michael Ellerman wrote:

>It's possible for prom_init to allocate the flat device tree inside the
>kdump crash kernel region. If this happens, when we load the kdump kernel we
>overwrite the flattened device tree, which is bad.
>
>We could make prom_init try and avoid allocating inside the crash kernel
>region, but then we run into issues if the crash kernel region uses all the
>space inside the RMO. The easiest solution is to move the flat device tree
>once we're running in the kernel.
>
>Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
>---
>
> arch/powerpc/kernel/prom.c     |   27 +++++++++++++++++++++++++++
> arch/powerpc/kernel/setup_64.c |    3 +++
> include/asm-powerpc/prom.h     |    2 ++
> 3 files changed, 32 insertions(+)
>
>Index: kdump/arch/powerpc/kernel/prom.c
>===================================================================
>--- kdump.orig/arch/powerpc/kernel/prom.c
>+++ kdump/arch/powerpc/kernel/prom.c
>@@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n
> 
> 	return 0;
> }
>+
>+#ifdef CONFIG_KEXEC
>+/* We may have allocated the flat device tree inside the crash kernel region
>+ * in prom_init. If so we need to move it out into regular memory. */
>+void kdump_move_device_tree(void)
>  
>
Should be void __init kdump_move_device_tree(void)

>+{
>+	unsigned long start, end;
>+	struct boot_param_header *new;
>+
>+	start = __pa((unsigned long)initial_boot_params);
>+	end = start + initial_boot_params->totalsize;
>+
>+	if (end < crashk_res.start || start > crashk_res.end)
>+		return;
>+
>+	new = (struct boot_param_header*)
>+		__va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE));
>+
>+	memcpy(new, initial_boot_params, initial_boot_params->totalsize);
>  
>
We are touching the second kernel's memory here, and the kexec boot will not
initialize this region. So, reset this memory:
    memset((void *)initial_boot_params, 0, initial_boot_params->totalsize);
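
As a rough userspace illustration of the overlap test from the patch
combined with the reset suggested here, something along these lines
could work. It is only a sketch: struct region stands in for crashk_res,
malloc() stands in for lmb_alloc(), and move_out_of_region() is a
made-up name. The old pointer and the size are saved before the copy so
that the memset clears the old location, not the new one.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reserved address range such as crashk_res. */
struct region {
	uintptr_t start;
	uintptr_t end;		/* inclusive, like struct resource */
};

/*
 * If the 'size' bytes at *blob touch region r, copy them to freshly
 * allocated memory, zero the old copy and update *blob.
 * Returns 1 if the blob was moved, 0 if it was left alone, -1 on
 * allocation failure.
 */
static int move_out_of_region(void **blob, size_t size, const struct region *r)
{
	void *old = *blob;
	uintptr_t start = (uintptr_t)old;
	uintptr_t end = start + size;
	void *copy;

	/* Same test as in the patch: nothing to do if the ranges are disjoint. */
	if (end < r->start || start > r->end)
		return 0;

	copy = malloc(size);
	if (!copy)
		return -1;

	memcpy(copy, old, size);
	memset(old, 0, size);	/* leave the reserved region clean */
	*blob = copy;
	return 1;
}
```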

Thanks
Haren


From cfriesen at nortel.com  Sat Feb  4 10:21:00 2006
From: cfriesen at nortel.com (Christopher Friesen)
Date: Fri, 03 Feb 2006 17:21:00 -0600
Subject: how to limit memory with 2.6.10 on ppc64 machine?
Message-ID: <43E3E55C.90504@nortel.com>


I'm running 2.6.10 on a ppc64 machine with 4GB of memory.

We're debugging an issue and would like to try and see if disabling the 
U3 DART makes the problem go away.  Unfortunately, this particular blade 
is unstable if not all the memory banks are populated.

After some frustration I looked at the code and realized that the "mem=" 
functionality is not supported for ppc64 on this particular kernel.

Can anyone give me some advice on the simplest way to limit this thing 
to under 2GB of memory so that the DART is not allocated/used?

Does anyone know when support for "mem=" was added?  I know it is there 
in the current git version, but the "powerpc" consolidation means 
everything is all different now.

Thanks,

Chris


From benh at kernel.crashing.org  Sat Feb  4 11:12:54 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sat, 04 Feb 2006 11:12:54 +1100
Subject: Maple freezing on PCI Target-Abort
In-Reply-To: <43E37DAC.4030606@yahoo.fr>
References: <43E23B4A.4020402@yahoo.fr>
	<1138930958.4934.102.camel@localhost.localdomain>
	<43E37DAC.4030606@yahoo.fr>
Message-ID: <1139011975.8543.4.camel@localhost.localdomain>

On Fri, 2006-02-03 at 16:58 +0100, jfaslist wrote:
> Hi,
> Yes, we are going to dig into all this CPC925 and Processor Interface 
> initialization.
> Note that I checked that both MSR_ME and MSR_RI were set prior to 
> triggering the PCI Target-Abort.
> 
> -MSR_ME: If not set the CPU will "checkstop" on a machine check.
> -MSR_RI: So that the exception is recoverable.
> 
> Regarding MSR_RI, this should always be set, I think?

Yes, MSR:RI is always set by the kernel except in the rare code path
where taking an exception is actually unsafe (like in some of the
exception handling code itself)
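
To make the two bits concrete, here is a small sketch of the check being
discussed. The bit positions match the kernel's MSR_ME_LG (12) and
MSR_RI_LG (1) definitions; the helper name is made up, and it is assumed
to be handed the MSR value that was saved when the exception was taken.

```c
#include <stdint.h>

/* Bit positions as in the kernel's MSR_ME_LG and MSR_RI_LG. */
#define MSR_ME (1ULL << 12)	/* Machine Check Enable */
#define MSR_RI (1ULL << 1)	/* Recoverable Interrupt */

/*
 * Hypothetical helper: returns 1 if a machine check taken with this
 * saved MSR value could be handled and returned from. Without ME the
 * CPU checkstops; without RI the interrupted state cannot be trusted.
 */
static int machine_check_recoverable(uint64_t saved_msr)
{
	return (saved_msr & MSR_ME) != 0 && (saved_msr & MSR_RI) != 0;
}
```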

Ben.


From michael at ellerman.id.au  Sat Feb  4 11:54:02 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Sat, 4 Feb 2006 11:54:02 +1100
Subject: how to limit memory with 2.6.10 on ppc64 machine?
In-Reply-To: <43E3E55C.90504@nortel.com>
References: <43E3E55C.90504@nortel.com>
Message-ID: <200602041154.12710.michael@ellerman.id.au>

On Sat, 4 Feb 2006 10:21, Christopher Friesen wrote:
> I'm running 2.6.10 on a ppc64 machine with 4GB of memory.
>
> We're debugging an issue and would like to try and see if disabling the
> U3 DART makes the problem go away.  Unfortunately, this particular blade
> is unstable if not all the memory banks are populated.
>
> After some frustration I looked at the code and realized that the "mem="
> functionality is not supported for ppc64 on this particular kernel.
>
> Can anyone give me some advice on the simplest way to limit this thing
> to under 2GB of memory so that the DART is not allocated/used?
>
> Does anyone know when support for "mem=" was added?  I know it is there
> in the current git version, but the "powerpc" consolidation means
> everything is all different now.

From memory (harhar) the mem= support was merged in 2.6.11, so the original 
patch should _probably_ apply on a vanilla 2.6.10 tree, try it:

http://patchwork.ozlabs.org/linuxppc64/patch?id=724
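
For anyone unfamiliar with the option, "mem=" takes a size with an
optional K, M or G suffix. The userspace sketch below shows roughly how
such a value can be pulled out of a command line. It is modelled on the
idea behind the kernel's memparse() but is not copied from it or from
the patch above; parse_size() and mem_limit_from_cmdline() are made-up
names.

```c
#include <stdlib.h>
#include <string.h>

/* Parse a size such as "2G", "1920M" or "4096" into bytes. */
static unsigned long long parse_size(const char *s)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Return the limit given by a "mem=" option on the command line,
 * or 0 if there is none. Options are assumed to be space separated.
 */
static unsigned long long mem_limit_from_cmdline(const char *cmdline)
{
	const char *p = cmdline;

	while ((p = strstr(p, "mem=")) != NULL) {
		/* Only accept "mem=" at the start of an option. */
		if (p == cmdline || p[-1] == ' ')
			return parse_size(p + 4);
		p += 4;
	}
	return 0;
}
```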

cheers

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From akpm at osdl.org  Sat Feb  4 15:25:31 2006
From: akpm at osdl.org (Andrew Morton)
Date: Fri, 3 Feb 2006 20:25:31 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was:  msi support)
In-Reply-To: <20060203201441.194be500.pj@sgi.com>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com>
	<20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
	<20060203201441.194be500.pj@sgi.com>
Message-ID: <20060203202531.27d685fa.akpm@osdl.org>

Paul Jackson <pj at sgi.com> wrote:
>
> The following patch seems to be breaking my ia64 sn2_defconfig
>  build of 2.6.16-rc1-mm5:
> 
>      gregkh-pci-altix-msi-support-git-ia64-fix.patch
> 
>  I'm guessing you should remove it for now.
> 
> 
>  Details:
>  ========
> 
>  When I try to build an ia64 sn2_defconfig 2.6.16-rc1-mm5, the
>  build fails:
> 
>      arch/ia64/sn/pci/tioce_provider.c:699:49: macro "ATE_MAKE" passed 3 arguments, but takes just 2
>      arch/ia64/sn/pci/tioce_provider.c: In function `tioce_reserve_m32':
>      arch/ia64/sn/pci/tioce_provider.c:699: error: `ATE_MAKE' undeclared (first use in this function)
> 
>  If I remove the patch:
> 
>      gregkh-pci-altix-msi-support-git-ia64-fix.patch
> 
>  then it compiles fine.

OK.  I autodrop several of Greg's MSI patches because a) they had bugs
which broke stuff a while ago and b) they don't apply and I'm lazy.  So it
looks like you've found a fix for a patch which isn't actually in -mm any
more.  I sent that fix to Greg the other day.

>  It seems that someone added a patchset to change the ATE_MAKE()
>  macro from 2 to 3 args, then someone added this above fix patch
>  for a missed change, then someone reverted it all back to 2 args,
>  but leaving this fix patch.
> 
>  I guess it means Andrew should remove the above patch.

I'll do that, thanks.


From akpm at osdl.org  Sat Feb  4 15:27:42 2006
From: akpm at osdl.org (Andrew Morton)
Date: Fri, 3 Feb 2006 20:27:42 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was:  msi support)
In-Reply-To: <20060203202531.27d685fa.akpm@osdl.org>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com>
	<20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
	<20060203201441.194be500.pj@sgi.com>
	<20060203202531.27d685fa.akpm@osdl.org>
Message-ID: <20060203202742.1e514fcc.akpm@osdl.org>

Andrew Morton <akpm at osdl.org> wrote:
>
> So it
>  looks like you've found a fix for a patch which isn't actually in -mm any
>  more.  I sent that fix to Greg the other day.

Actually, gregkh-pci-altix-msi-support-git-ia64-fix.patch fixes
git-ia64.patch when gregkh-pci-altix-msi-support.patch is also applied, so
it's not presently useful to either Greg or Tony.  I'll take care of it,
somehow..


From pj at sgi.com  Sat Feb  4 15:14:41 2006
From: pj at sgi.com (Paul Jackson)
Date: Fri, 3 Feb 2006 20:14:41 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was:  msi support)
In-Reply-To: <20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com>
	<20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
Message-ID: <20060203201441.194be500.pj@sgi.com>

Andrew,

The following patch seems to be breaking my ia64 sn2_defconfig
build of 2.6.16-rc1-mm5:

    gregkh-pci-altix-msi-support-git-ia64-fix.patch

I'm guessing you should remove it for now.


Details:
========

When I try to build an ia64 sn2_defconfig 2.6.16-rc1-mm5, the
build fails:

    arch/ia64/sn/pci/tioce_provider.c:699:49: macro "ATE_MAKE" passed 3 arguments, but takes just 2
    arch/ia64/sn/pci/tioce_provider.c: In function `tioce_reserve_m32':
    arch/ia64/sn/pci/tioce_provider.c:699: error: `ATE_MAKE' undeclared (first use in this function)

If I remove the patch:

    gregkh-pci-altix-msi-support-git-ia64-fix.patch

then it compiles fine.

It seems that someone added a patchset to change the ATE_MAKE()
macro from 2 to 3 args, then someone added this above fix patch
for a missed change, then someone reverted it all back to 2 args,
but leaving this fix patch.

I guess it means Andrew should remove the above patch.

But I really do not know what is going on here.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj at sgi.com> 1.925.600.0401


From maule at sgi.com  Sat Feb  4 15:42:34 2006
From: maule at sgi.com (Mark Maule)
Date: Fri, 3 Feb 2006 22:42:34 -0600
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was:  msi support)
In-Reply-To: <20060203202742.1e514fcc.akpm@osdl.org>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com>
	<20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
	<20060203201441.194be500.pj@sgi.com>
	<20060203202531.27d685fa.akpm@osdl.org>
	<20060203202742.1e514fcc.akpm@osdl.org>
Message-ID: <20060204044234.GA31134@sgi.com>

On Fri, Feb 03, 2006 at 08:27:42PM -0800, Andrew Morton wrote:
> Andrew Morton <akpm at osdl.org> wrote:
> >
> > So it
> >  looks like you've found a fix for a patch which isn't actually in -mm any
> >  more.  I sent that fix to Greg the other day.
> 
> Actually, gregkh-pci-altix-msi-support-git-ia64-fix.patch fixes
> git-ia64.patch when gregkh-pci-altix-msi-support.patch is also applied, so
> it's not presently useful to either Greg or Tony.  I'll take care of it,
> somehow..
> 

I think what happened here is that I submitted a patchset for msi
abstractions (and others posted a couple of subsequent bugfix incrementals),
but these were not taken into the 2.6.16 base 'cause of their invasiveness.
These patches touched the tioce_provider.c file.

Then I submitted another patch which touched the tioce_provider.c file, and
it looks like I probably based this file on the previous msi versions which
were being held back, so in order for everything to build, you need all of
the msi patches applied first.

What's the preferred way to handle this ... fix the current ia64 build and
then resubmit the msi patches relative to that base?

Mark


From akpm at osdl.org  Sat Feb  4 16:08:07 2006
From: akpm at osdl.org (Andrew Morton)
Date: Fri, 3 Feb 2006 21:08:07 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was:  msi support)
In-Reply-To: <20060204044234.GA31134@sgi.com>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com>
	<20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
	<20060203201441.194be500.pj@sgi.com>
	<20060203202531.27d685fa.akpm@osdl.org>
	<20060203202742.1e514fcc.akpm@osdl.org>
	<20060204044234.GA31134@sgi.com>
Message-ID: <20060203210807.56a48888.akpm@osdl.org>

Mark Maule <maule at sgi.com> wrote:
>
> On Fri, Feb 03, 2006 at 08:27:42PM -0800, Andrew Morton wrote:
> > Andrew Morton <akpm at osdl.org> wrote:
> > >
> > > So it
> > >  looks like you've found a fix for a patch which isn't actually in -mm any
> > >  more.  I sent that fix to Greg the other day.
> > 
> > Actually, gregkh-pci-altix-msi-support-git-ia64-fix.patch fixes
> > git-ia64.patch when gregkh-pci-altix-msi-support.patch is also applied, so
> > it's not presently useful to either Greg or Tony.  I'll take care of it,
> > somehow..
> > 
> 
> I think what happened here is that I submitted a patchset for msi
> abstractions (and others posted a couple of subsequent bugfix incrementals),
> but these were not taken into the 2.6.16 base 'cause of their invasiveness.
> These patches touched the tioce_provider.c file.
> 
> Then I submitted another patch which touched the tioce_provider.c file, and
> it looks like I probably based this file on the previous msi versions which
> were being held back, so in order for everything to build, you need all of
> the msi patches applied first.
> 
> What's the preferred way to handle this ... fix the current ia64 build and
> then resubmit the msi patches relative to that base?
> 

umm, tricky.  This situation doesn't arise very often.

What you could do is to prepare the patches against Tony's latest tree. 
Then I can put them in -mm and Greg can drop them.  Once Tony merges up
with Linus I transfer the patches to Greg.

Or we put the patches into Tony's tree.

Either way - they'll be the same patches.  But it does mean that the
patches won't be merged into mainline until Tony merges up.  If that's a
problem then we'll need to think again.


From benh at kernel.crashing.org  Sun Feb  5 09:56:02 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sun, 05 Feb 2006 09:56:02 +1100
Subject: [PATCH] powerpc: Fix PowerMac sound i2c
In-Reply-To: <87y80rkmk7.fsf@briny.internal.ondioline.org>
References: <1136695956.30123.44.camel@localhost.localdomain>
	<87y80rkmk7.fsf@briny.internal.ondioline.org>
Message-ID: <1139093763.5634.3.camel@localhost.localdomain>

On Sat, 2006-02-04 at 18:26 +0000, Paul Collins wrote:
> Hi Ben,
> 
> Benjamin Herrenschmidt <benh at kernel.crashing.org> writes:
> 
> > My patch reworking the PowerMac i2c code broke the sound drivers as they
> > used to rely on some broken behaviour of i2c-keywest that is gone now.
> > This patch should fix them (tested on a g5 with alsa only). It might
> > also fix an oops if the alsa driver hits an unsupported chip.
> 
> Applied to Linus's current git tree, this patch makes ALSA sound on my
> PowerBook5,4 work again.  The second patch does not work because the
> i2c wrapper (I assume that's what i2c_smbus_write_i2c_block_data is)
> has apparently not yet returned.
> 
> It would be nice to have this fix in 2.6.16 if possible.

The second patch is the one that should go in, but it relies on an i2c
fix (undoing some Bunk damage) that is still staged in Greg's tree... I
don't know what's up with that, I'll ask around.

Ben.


From haren at us.ibm.com  Sun Feb  5 13:25:14 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Sat, 04 Feb 2006 18:25:14 -0800
Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump
	kernel
In-Reply-To: <8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org>
References: <20060203080609.403CA68A1F@ozlabs.org>
	<8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org>
Message-ID: <43E5620A.6060503@us.ibm.com>

Kumar Gala wrote:

>On Feb 3, 2006, at 2:05 AM, Michael Ellerman wrote:
>
>  
>
>>It's possible for prom_init to allocate the flat device tree inside the
>>kdump crash kernel region. If this happens, when we load the kdump kernel we
>>overwrite the flattened device tree, which is bad.
>>
>>We could make prom_init try and avoid allocating inside the crash kernel
>>region, but then we run into issues if the crash kernel region uses all the
>>space inside the RMO. The easiest solution is to move the flat device tree
>>once we're running in the kernel.
>>
>>Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
>>    
>>
>
>Doesn't setup_32.c need a similar change?
>  
>
At present kdump will not be supported on ppc32.
If that changes, kdump_move_device_tree() can be called at the beginning of
unflatten_device_tree() to support both ppc32 and ppc64, instead of making
changes in setup_64.c and setup_32.c.  The extern declaration can then be
removed from asm-powerpc/prom.h and the function can be static.

Michael, what do you think about adding a printk to tell the user that the
device tree has been moved to a new location? The console messages from
prom_init still report the old addresses.

Thanks
Haren

>- k
>
>  
>
>>---
>>
>> arch/powerpc/kernel/prom.c     |   27 +++++++++++++++++++++++++++
>> arch/powerpc/kernel/setup_64.c |    3 +++
>> include/asm-powerpc/prom.h     |    2 ++
>> 3 files changed, 32 insertions(+)
>>
>>Index: kdump/arch/powerpc/kernel/prom.c
>>===================================================================
>>--- kdump.orig/arch/powerpc/kernel/prom.c
>>+++ kdump/arch/powerpc/kernel/prom.c
>>@@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n
>>
>> 	return 0;
>> }
>>+
>>+#ifdef CONFIG_KEXEC
>>+/* We may have allocated the flat device tree inside the crash kernel region
>>+ * in prom_init. If so we need to move it out into regular memory. */
>>+void kdump_move_device_tree(void)
>>+{
>>+	unsigned long start, end;
>>+	struct boot_param_header *new;
>>+
>>+	start = __pa((unsigned long)initial_boot_params);
>>+	end = start + initial_boot_params->totalsize;
>>+
>>+	if (end < crashk_res.start || start > crashk_res.end)
>>+		return;
>>+
>>+	new = (struct boot_param_header*)
>>+		__va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE));
>>+
>>+	memcpy(new, initial_boot_params, initial_boot_params->totalsize);
>>+
>>+	initial_boot_params = new;
>>+
>>+	DBG("Flat device tree blob moved to %p\n", initial_boot_params);
>>+
>>+	/* XXX should we unreserve the old DT? */
>>+}
>>+#endif /* CONFIG_KEXEC */
>>Index: kdump/arch/powerpc/kernel/setup_64.c
>>===================================================================
>>--- kdump.orig/arch/powerpc/kernel/setup_64.c
>>+++ kdump/arch/powerpc/kernel/setup_64.c
>>@@ -398,6 +398,9 @@ void __init setup_system(void)
>> {
>> 	DBG(" -> setup_system()\n");
>>
>>+#ifdef CONFIG_KEXEC
>>+	kdump_move_device_tree();
>>+#endif
>> 	/*
>> 	 * Unflatten the device-tree passed by prom_init or kexec
>> 	 */
>>Index: kdump/include/asm-powerpc/prom.h
>>===================================================================
>>--- kdump.orig/include/asm-powerpc/prom.h
>>+++ kdump/include/asm-powerpc/prom.h
>>@@ -222,5 +222,7 @@ extern int of_address_to_resource(struct
>> extern int of_pci_address_to_resource(struct device_node *dev, int bar,
>> 				      struct resource *r);
>>
>>+extern void kdump_move_device_tree(void);
>>+
>> #endif /* __KERNEL__ */
>> #endif /* _POWERPC_PROM_H */


From paul at briny.ondioline.org  Sun Feb  5 05:26:16 2006
From: paul at briny.ondioline.org (Paul Collins)
Date: Sat, 04 Feb 2006 18:26:16 +0000
Subject: [PATCH] powerpc: Fix PowerMac sound i2c
In-Reply-To: <1136695956.30123.44.camel@localhost.localdomain> (Benjamin
	Herrenschmidt's message of "Sun, 08 Jan 2006 15:52:36 +1100")
References: <1136695956.30123.44.camel@localhost.localdomain>
Message-ID: <87y80rkmk7.fsf@briny.internal.ondioline.org>

Hi Ben,

Benjamin Herrenschmidt <benh at kernel.crashing.org> writes:

> My patch reworking the PowerMac i2c code broke the sound drivers as they
> used to rely on some broken behaviour of i2c-keywest that is gone now.
> This patch should fix them (tested on a g5 with alsa only). It might
> also fix an oops if the alsa driver hits an unsupported chip.

Applied to Linus's current git tree, this patch makes ALSA sound on my
PowerBook5,4 work again.  The second patch does not work because the
i2c wrapper (I assume that's what i2c_smbus_write_i2c_block_data is)
has apparently not yet returned.

It would be nice to have this fix in 2.6.16 if possible.
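
For readers following along, the helpers added by the patch quoted below
fill the SMBus block buffer by hand: block[0] carries the length and the
payload follows. The userspace sketch here shows that layout together
with the big-endian packing the volume code uses. struct block_buf,
put_be() and pack_block() are made-up names, BLOCK_MAX mirrors the
kernel's I2C_SMBUS_BLOCK_MAX (32), and the length check is an addition
that the patch itself does not have.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_MAX 32	/* same value as the kernel's I2C_SMBUS_BLOCK_MAX */

/* Hypothetical stand-in for the block member of union i2c_smbus_data. */
struct block_buf {
	uint8_t block[BLOCK_MAX + 2];	/* block[0] holds the length */
};

/* Store the low 'bytes' bytes of val at dst, most significant byte first. */
static void put_be(uint8_t *dst, uint32_t val, int bytes)
{
	int i;

	for (i = 0; i < bytes; i++)
		dst[i] = (val >> ((bytes - i - 1) * 8)) & 0xff;
}

/*
 * Lay out 'len' payload bytes the way the patch does before calling
 * i2c_smbus_xfer(). Returns 0 on success, -1 if len does not fit.
 */
static int pack_block(struct block_buf *buf, const uint8_t *values, int len)
{
	if (len < 1 || len > BLOCK_MAX)
		return -1;

	buf->block[0] = (uint8_t)len;
	memcpy(&buf->block[1], values, (size_t)len);
	return 0;
}
```

A master volume update would then be two put_be() calls of 3 bytes each
into a 6-byte array, followed by one pack_block() with len 6.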

> Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>
>
> Index: linux-work/sound/ppc/tumbler.c
> ===================================================================
> --- linux-work.orig/sound/ppc/tumbler.c	2005-11-24 17:19:14.000000000 +1100
> +++ linux-work/sound/ppc/tumbler.c	2006-01-08 15:18:09.000000000 +1100
> @@ -137,6 +137,22 @@ static int send_init_client(pmac_keywest
>  	return 0;
>  }
>  
> +static int tumbler_write_block(struct i2c_client *client, u8 reg, int len,
> +			       u8 *values)
> +{
> +        union i2c_smbus_data data;
> +        int err;
> +
> +        data.block[0] = len;
> +	memcpy(&data.block[1], values, len);
> +        err = i2c_smbus_xfer(client->adapter, client->addr, client->flags,
> +                             I2C_SMBUS_WRITE, reg, I2C_SMBUS_I2C_BLOCK_DATA,
> +                             &data);
> +        return err;
> +}
> +
> +
> +
>  
>  static int tumbler_init_client(pmac_keywest_t *i2c)
>  {
> @@ -239,8 +255,7 @@ static int tumbler_set_master_volume(pma
>  	block[4] = (right_vol >> 8)  & 0xff;
>  	block[5] = (right_vol >> 0)  & 0xff;
>    
> -	if (i2c_smbus_write_block_data(mix->i2c.client, TAS_REG_VOL,
> -				       6, block) < 0) {
> +	if (tumbler_write_block(mix->i2c.client, TAS_REG_VOL, 6, block) < 0) {
>  		snd_printk("failed to set volume \n");
>  		return -EINVAL;
>  	}
> @@ -340,8 +355,7 @@ static int tumbler_set_drc(pmac_tumbler_
>  		val[1] = 0;
>  	}
>  
> -	if (i2c_smbus_write_block_data(mix->i2c.client, TAS_REG_DRC,
> -				       2, val) < 0) {
> +	if (tumbler_write_block(mix->i2c.client, TAS_REG_DRC, 2, val) < 0) {
>  		snd_printk("failed to set DRC\n");
>  		return -EINVAL;
>  	}
> @@ -376,8 +390,7 @@ static int snapper_set_drc(pmac_tumbler_
>  	val[4] = 0x60;
>  	val[5] = 0xa0;
>  
> -	if (i2c_smbus_write_block_data(mix->i2c.client, TAS_REG_DRC,
> -				       6, val) < 0) {
> +	if (tumbler_write_block(mix->i2c.client, TAS_REG_DRC, 6, val) < 0) {
>  		snd_printk("failed to set DRC\n");
>  		return -EINVAL;
>  	}
> @@ -481,8 +494,8 @@ static int tumbler_set_mono_volume(pmac_
>  	vol = info->table[vol];
>  	for (i = 0; i < info->bytes; i++)
>  		block[i] = (vol >> ((info->bytes - i - 1) * 8)) & 0xff;
> -	if (i2c_smbus_write_block_data(mix->i2c.client, info->reg,
> -				       info->bytes, block) < 0) {
> +	if (tumbler_write_block(mix->i2c.client, info->reg,
> +				  info->bytes, block) < 0) {
>  		snd_printk("failed to set mono volume %d\n", info->index);
>  		return -EINVAL;
>  	}
> @@ -611,7 +624,7 @@ static int snapper_set_mix_vol1(pmac_tum
>  		for (j = 0; j < 3; j++)
>  			block[i * 3 + j] = (vol >> ((2 - j) * 8)) & 0xff;
>  	}
> -	if (i2c_smbus_write_block_data(mix->i2c.client, reg, 9, block) < 0) {
> +	if (tumbler_write_block(mix->i2c.client, reg, 9, block) < 0) {
>  		snd_printk("failed to set mono volume %d\n", reg);
>  		return -EINVAL;
>  	}
> Index: linux-work/sound/oss/dmasound/tas_common.h
> ===================================================================
> --- linux-work.orig/sound/oss/dmasound/tas_common.h	2005-11-24 17:19:14.000000000 +1100
> +++ linux-work/sound/oss/dmasound/tas_common.h	2006-01-08 15:33:29.000000000 +1100
> @@ -157,6 +157,21 @@ tas_mono_to_stereo(uint mono)
>  	return mono | (mono<<8);
>  }
>  
> +static int tas_write_block(struct i2c_client *client, u8 reg, int len, u8 *vals)
> +{
> +        union i2c_smbus_data data;
> +        int err;
> +
> +        data.block[0] = len;
> +	memcpy(&data.block[1], vals, len);
> +        err = i2c_smbus_xfer(client->adapter, client->addr, client->flags,
> +                             I2C_SMBUS_WRITE, reg, I2C_SMBUS_I2C_BLOCK_DATA,
> +                             &data);
> +        return err;
> +}
> +
> +
> +
>  /*
>   * Todo: make these functions a bit more efficient !
>   */
> @@ -178,10 +193,8 @@ tas_write_register(	struct tas_data_t *s
>  	if (write_mode & WRITE_SHADOW)
>  		memcpy(self->shadow[reg_num],data,reg_width);
>  	if (write_mode & WRITE_HW) {
> -		rc=i2c_smbus_write_block_data(self->client,
> -					      reg_num,
> -					      reg_width,
> -					      data);
> +		rc = tas_write_block(self->client, reg_num,
> +				     reg_width, data);
>  		if (rc < 0) {
>  			printk("tas: I2C block write failed \n");  
>  			return rc; 
> @@ -199,10 +212,8 @@ tas_sync_register(	struct tas_data_t *se
>  
>  	if (reg_width==0 || self==NULL)
>  		return -EINVAL;
> -	rc=i2c_smbus_write_block_data(self->client,
> -				      reg_num,
> -				      reg_width,
> -				      self->shadow[reg_num]);
> +	rc = tas_write_block(self->client, reg_num,
> +			     reg_width, self->shadow[reg_num]);
>  	if (rc < 0) {
>  		printk("tas: I2C block write failed \n");
>  		return rc;
> Index: linux-work/sound/ppc/pmac.c
> ===================================================================
> --- linux-work.orig/sound/ppc/pmac.c	2005-12-19 16:13:48.000000000 +1100
> +++ linux-work/sound/ppc/pmac.c	2006-01-08 15:37:10.000000000 +1100
> @@ -74,7 +74,7 @@ static int snd_pmac_dbdma_alloc(pmac_t *
>  
>  static void snd_pmac_dbdma_free(pmac_t *chip, pmac_dbdma_t *rec)
>  {
> -	if (rec) {
> +	if (rec->space) {
>  		unsigned int rsize = sizeof(struct dbdma_cmd) * (rec->size + 1);
>  
>  		dma_free_coherent(&chip->pdev->dev, rsize, rec->space, rec->dma_base);
> @@ -895,6 +895,7 @@ static int __init snd_pmac_detect(pmac_t
>  	chip->can_capture = 1;
>  	chip->num_freqs = ARRAY_SIZE(awacs_freqs);
>  	chip->freq_table = awacs_freqs;
> +	chip->pdev = NULL;
>  
>  	chip->control_mask = MASK_IEPC | MASK_IEE | 0x11; /* default */
>  

-- 
Dag vijandelijk luchtschip de huismeester is dood


From bdc at carlstrom.com  Sun Feb  5 17:10:48 2006
From: bdc at carlstrom.com (Brian D. Carlstrom)
Date: 5 Feb 2006 06:10:48 -0000
Subject: G5 fan problems return moving to 2.6.15 with dual processor 2.7GHz
	machine
Message-ID: <20060205061048.7261.qmail@electricrain.com>

I've been having problems with overheating on two of my three dual
processor 2.7GHz machines running Fedora Core 4's 2.6.14 kernels since
November. Because of a pending deadline and the time of year, I simply
opened the window and let nature cool the machines. 

In early January, I saw the therm_pm72.c fix in 2.6.15
    http://ozlabs.org/pipermail/linuxppc64-dev/2006-January/007299.html
    [PATCH] powerpc: more g5 overtemp problem fix
I tried Fedora's updates-testing 2.6.15 kernel to get the fix.
That caused the fans to blow full blast like the old days. Even so, it
was better than leaving the window open, which had its own problems:
storm winds closing the windows, unhappy facilities managers, and
"helpful" co-workers closing the windows for me.

Finally last week I had some time to work on this. My first step was to
backport the therm_pm72.c fix to 2.6.14, and that worked like a charm,
allowing humans with hearing to inhabit the office again. I'm running
CPU simulations 24/7 on these machines, and without the fix they were
powering off once a day or more, although I'd set them up to reboot
instead with a /sbin/critical_overtemp script that called
"reboot -f".

However, even after reporting the problems with the 2.6.15
updates-testing kernel, Fedora Core released the 2.6.15 kernel update
anyway. I decided to try and debug what is going on since other people
are going to start seeing this issue.

Looking at the dmesg output change between 2.6.14 and 2.6.15, both
start with the following:

    PowerMac G5 Thermal control driver 1.2b2
    Detected fan controls:
      0: PWM fan, id 1, location: BACKSIDE,SYS CTRLR FAN
      1: RPM fan, id 2, location: DRIVE BAY
      2: PWM fan, id 2, location: SLOT,PCI FAN
      3: RPM fan, id 3, location: CPU A INTAKE
      4: RPM fan, id 4, location: CPU A EXHAUST
      5: RPM fan, id 5, location: CPU B INTAKE
      6: RPM fan, id 6, location: CPU B EXHAUST
      7: RPM fan, id 1, location: CPU A PUMP
      8: RPM fan, id 0, location: CPU B PUMP

However, 2.6.14 has the following additional line, which I've come to
expect on the 2.5GHz and 2.7GHz machines, although not on the 2.0GHz
machines of course:
    Liquid cooling pumps detected, using new algorithm !

I decided to do a little more debugging before reporting this. I built
the driver with "#define DEBUG" and added some additional DBG tracing
messages (marked "XXX bdc" below). Here is the output with therm_pm72
built into the kernel, not as a module:

Feb  4 12:19:06 youngmc kernel: Detected fan controls:
Feb  4 12:19:06 youngmc kernel:   0: PWM fan, id 1, location: BACKSIDE,SYS CTRLR FAN
Feb  4 12:19:06 youngmc kernel:   1: RPM fan, id 2, location: DRIVE BAY
Feb  4 12:19:06 youngmc kernel:   2: PWM fan, id 2, location: SLOT,PCI FAN
Feb  4 12:19:06 youngmc kernel:   3: RPM fan, id 3, location: CPU A INTAKE
Feb  4 12:19:06 youngmc kernel:   4: RPM fan, id 4, location: CPU A EXHAUST
Feb  4 12:19:06 youngmc kernel:   5: RPM fan, id 5, location: CPU B INTAKE
Feb  4 12:19:06 youngmc kernel:   6: RPM fan, id 6, location: CPU B EXHAUST
Feb  4 12:19:06 youngmc kernel:   7: RPM fan, id 1, location: CPU A PUMP
Feb  4 12:19:06 youngmc kernel:   8: RPM fan, id 0, location: CPU B PUMP
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=monid
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=dvi
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=vga
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=crt2
Feb  4 12:19:06 youngmc kernel: Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
Feb  4 12:19:06 youngmc kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Feb  4 12:19:06 youngmc kernel: ide0: Found Apple K2 ATA-6 controller, bus ID 3, irq 39
Feb  4 12:19:06 youngmc kernel: hda: PIONEER DVD-RW DVR-109, ATAPI CD/DVD-ROM drive
Feb  4 12:19:06 youngmc kernel: hda: Enabling Ultra DMA 4
Feb  4 12:19:06 youngmc kernel: ide0 at 0xd000080083656000-0xd000080083656007,0xd000080083656160 on irq 39
Feb  4 12:19:06 youngmc kernel: hda: ATAPI 32X DVD-ROM DVD-R CD-R/RW drive, 2000kB Cache, UDMA(66)
Feb  4 12:19:06 youngmc kernel: Uniform CD-ROM driver Revision: 3.20
Feb  4 12:19:06 youngmc kernel: ide-floppy driver 0.99.newide
Feb  4 12:19:06 youngmc kernel: usbcore: registered new driver libusual
Feb  4 12:19:06 youngmc kernel: usbcore: registered new driver hiddev
Feb  4 12:19:06 youngmc kernel: usbcore: registered new driver usbhid
Feb  4 12:19:06 youngmc kernel: drivers/usb/input/hid-core.c: v2.6:USB HID core driver
Feb  4 12:19:06 youngmc kernel: mice: PS/2 mouse device common for all mice
Feb  4 12:19:06 youngmc kernel: /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:19:06 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=mac-io 0
Feb  4 12:19:06 youngmc kernel: Found K2
Feb  4 12:19:06 youngmc kernel: Found KeyWest i2c on "mac-io", 1 channel, stepping: 4 bits

I'm guessing I should have seen a "found U3-0", but I see a suspicious message here:
    /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !
that I do not see in the working 2.6.14 boot.

I was wondering if the change from module to builtin was causing the
problem (grasping at straws I guess), so I also tried building it as a
module. I get similar results:

Feb  4 12:59:26 youngmc kernel: /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !
Feb  4 12:59:26 youngmc kernel: Found KeyWest i2c on "mac-io", 1 channel, stepping: 4 bits
...
Feb  4 12:59:27 youngmc kernel: Detected fan controls:
Feb  4 12:59:27 youngmc kernel:   0: PWM fan, id 1, location: BACKSIDE,SYS CTRLR FAN
Feb  4 12:59:27 youngmc kernel:   1: RPM fan, id 2, location: DRIVE BAY
Feb  4 12:59:27 youngmc kernel:   2: PWM fan, id 2, location: SLOT,PCI FAN
Feb  4 12:59:27 youngmc kernel:   3: RPM fan, id 3, location: CPU A INTAKE
Feb  4 12:59:27 youngmc kernel:   4: RPM fan, id 4, location: CPU A EXHAUST
Feb  4 12:59:27 youngmc kernel:   5: RPM fan, id 5, location: CPU B INTAKE
Feb  4 12:59:27 youngmc kernel:   6: RPM fan, id 6, location: CPU B EXHAUST
Feb  4 12:59:27 youngmc kernel:   7: RPM fan, id 1, location: CPU A PUMP
Feb  4 12:59:28 youngmc kernel:   8: RPM fan, id 0, location: CPU B PUMP
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=monid
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=dvi
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=vga
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=crt2
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach
Feb  4 12:59:28 youngmc kernel: XXX bdc therm_pm72_attach adapter->name=mac-io 0
Feb  4 12:59:28 youngmc kernel: Found K2

Now this "/u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !"
warning I'm seeing in both cases looked familiar. In fact, I was on a
thread about it when the 2.7GHz machines first came out:

    http://patchwork.ozlabs.org/linuxppc64/patch?id=1982

The code that this patch applied to has moved to a new location,
arch/powerpc/kernel/prom_init.c, but logically it still seems like it
should cover my case. The code says:

    if (u3_rev < 0x35 || u3_rev > 0x39)
        return;

and my u3_rev looks to be 0x35
    $ hexdump /proc/device-tree/u3 at 0,f8000000/device-rev
    0000000 0000 0035
    0000004

Unfortunately it looks like I need to use prom_print to add debugging,
which I'm guessing only goes to the console, which I'm not near right
now.

Before going further, is there something obvious that the Fedora
2.6.15 kernel is doing wrong, given that the 2.6.14 kernel works and
the 2.6.15 seems to have a regression? I'm willing to do some more
debugging or try a more up-to-date kernel to help resolve this issue.

One last note, my dual processor 2.0GHz and 2.5GHz machines are running
fine with 2.6.15...

-bri


From benh at kernel.crashing.org  Sun Feb  5 20:06:24 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sun, 05 Feb 2006 20:06:24 +1100
Subject: G5 fan problems return moving to 2.6.15 with dual processor
	2.7GHz machine
In-Reply-To: <20060205061048.7261.qmail@electricrain.com>
References: <20060205061048.7261.qmail@electricrain.com>
Message-ID: <1139130385.5634.14.camel@localhost.localdomain>


> I'm guessing I should have seen a "found U3-0", but I see a suspicious message here:
>     /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !
> that I do not see in the working 2.6.14 boot.
> 
> I was wondering if the change from module to builtin was causing the
> problem (grasping at straws I guess), so I also tried building it as a
> module. I get similar results:
> 
> Feb  4 12:59:26 youngmc kernel: /u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !
> Feb  4 12:59:26 youngmc kernel: Found KeyWest i2c on "mac-io", 1 channel, stepping: 4 bits

That looks like a fix in prom_init.c that is missing from 2.6.15.. I'll
have to double check... Apple indeed seems to have a "bug" in the
device-tree of some of those machines. prom_init.c has some code to fix
it up, but there have been several versions of the fix and maybe that
broke some way...

Ah... now I'm reading the rest of the message and see that you figured
that out already ;)

> Now this "/u3 at 0,f8000000/i2c at f8001000: Missing interrupt or address !"
> warning I'm seeing in both cases looked familiar. In fact, I was on a
> thread about it when the 2.7GHz machines first came out:
> 
>     http://patchwork.ozlabs.org/linuxppc64/patch?id=1982
> 
> The code that this patch applied to has moved to a new location,
> arch/powerpc/kernel/prom_init.c, but logically it still seems like it
> should cover my case. The code says:
> 
>     if (u3_rev < 0x35 || u3_rev > 0x39)
>         return;
> 
> and my u3_rev looks to be 0x35
>     $ hexdump /proc/device-tree/u3 at 0,f8000000/device-rev
>     0000000 0000 0035
>     0000004
> 
> Unfortunately it looks like I need to use prom_print to add debugging,
> which I'm guessing only goes to the console, which I'm not near right
> now.
> 
> Before going further, is there something obvious that the Fedora
> 2.6.15 kernel is doing wrong, given that the 2.6.14 kernel works and
> the 2.6.15 seems to have a regression? I'm willing to do some more
> debugging or try a more up-to-date kernel to help resolve this issue.
> 
> One last note, my dual processor 2.0GHz and 2.5GHz machines are running
> fine with 2.6.15...

Might be something in that prom_init.c fix that broke... it would be
really nice if you could give a try with the console and find out what
it is ... Unfortunately, I don't have access to one of these machines
with the "problem" at the moment...

Cheers,
Ben.


From galak at kernel.crashing.org  Mon Feb  6 03:41:18 2006
From: galak at kernel.crashing.org (Kumar Gala)
Date: Sun, 5 Feb 2006 10:41:18 -0600
Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump
	kernel
In-Reply-To: <43E5620A.6060503@us.ibm.com>
References: <20060203080609.403CA68A1F@ozlabs.org>
	<8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org>
	<43E5620A.6060503@us.ibm.com>
Message-ID: <2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org>


On Feb 4, 2006, at 8:25 PM, Haren Myneni wrote:

> Kumar Gala wrote:
>
>> On Feb 3, 2006, at 2:05 AM, Michael Ellerman wrote:
>>
>>
>>> It's possible for prom_init to allocate the flat device tree  
>>> inside  the
>>> kdump crash kernel region. If this happens, when we load the  
>>> kdump  kernel we
>>> overwrite the flattened device tree, which is bad.
>>>
>>> We could make prom_init try and avoid allocating inside the  
>>> crash  kernel
>>> region, but then we run into issues if the crash kernel region  
>>> uses  all the
>>> space inside the RMO. The easiest solution is to move the flat   
>>> device tree
>>> once we're running in the kernel.
>>>
>>> Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
>>>
>>
>> Doesn't setup_32.c need a similar change?
>>
> At present kdump will not be supported on ppc32.
> In case, kdump_move_device_tree() can be called at the beginning of  
> unflatten_device_tree() to support both ppc32 and 64 instead of  
> making changes in setup_64.c and setup_32.c.  The extern definition  
> can be removed from asm-powerpc/prom.h and this function can be  
> static.
>
> Michael, what do you think if we have some printk to tell the user  
> that device_tree is moved to new location. Because, the console  
> messages from prom_init are saying about old addresses.

What's the issue with kdump on ppc32?  One of the reasons we merged  
ppc32 and ppc64 was to avoid such issues going forward?

- k

>>> ---
>>>
>>> arch/powerpc/kernel/prom.c     |   27 +++++++++++++++++++++++++++
>>> arch/powerpc/kernel/setup_64.c |    3 +++
>>> include/asm-powerpc/prom.h     |    2 ++
>>> 3 files changed, 32 insertions(+)
>>>
>>> Index: kdump/arch/powerpc/kernel/prom.c
>>> ===================================================================
>>> --- kdump.orig/arch/powerpc/kernel/prom.c
>>> +++ kdump/arch/powerpc/kernel/prom.c
>>> @@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n
>>>
>>> 	return 0;
>>> }
>>> +
>>> +#ifdef CONFIG_KEXEC
>>> +/* We may have allocated the flat device tree inside the crash   
>>> kernel region
>>> + * in prom_init. If so we need to move it out into regular  
>>> memory. */
>>> +void kdump_move_device_tree(void)
>>> +{
>>> +	unsigned long start, end;
>>> +	struct boot_param_header *new;
>>> +
>>> +	start = __pa((unsigned long)initial_boot_params);
>>> +	end = start + initial_boot_params->totalsize;
>>> +
>>> +	if (end < crashk_res.start || start > crashk_res.end)
>>> +		return;
>>> +
>>> +	new = (struct boot_param_header*)
>>> +		__va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE));
>>> +
>>> +	memcpy(new, initial_boot_params, initial_boot_params->totalsize);
>>> +
>>> +	initial_boot_params = new;
>>> +
>>> +	DBG("Flat device tree blob moved to %p\n", initial_boot_params);
>>> +
>>> +	/* XXX should we unreserve the old DT? */
>>> +}
>>> +#endif /* CONFIG_KEXEC */
>>> Index: kdump/arch/powerpc/kernel/setup_64.c
>>> ===================================================================
>>> --- kdump.orig/arch/powerpc/kernel/setup_64.c
>>> +++ kdump/arch/powerpc/kernel/setup_64.c
>>> @@ -398,6 +398,9 @@ void __init setup_system(void)
>>> {
>>> 	DBG(" -> setup_system()\n");
>>>
>>> +#ifdef CONFIG_KEXEC
>>> +	kdump_move_device_tree();
>>> +#endif
>>> 	/*
>>> 	 * Unflatten the device-tree passed by prom_init or kexec
>>> 	 */
>>> Index: kdump/include/asm-powerpc/prom.h
>>> ===================================================================
>>> --- kdump.orig/include/asm-powerpc/prom.h
>>> +++ kdump/include/asm-powerpc/prom.h
>>> @@ -222,5 +222,7 @@ extern int of_address_to_resource(struct
>>> extern int of_pci_address_to_resource(struct device_node *dev,  
>>> int  bar,
>>> 				      struct resource *r);
>>>
>>> +extern void kdump_move_device_tree(void);
>>> +
>>> #endif /* __KERNEL__ */
>>> #endif /* _POWERPC_PROM_H */


From david at gibson.dropbear.id.au  Mon Feb  6 13:18:53 2006
From: david at gibson.dropbear.id.au (David Gibson)
Date: Mon, 6 Feb 2006 13:18:53 +1100
Subject: Hugepages need clear_user_highpage() not clear_highpage()
Message-ID: <20060206021853.GC10708@localhost.localdomain>

When hugepages are newly allocated to a file in mm/hugetlb.c, we clear
them with a call to clear_highpage() on each of the subpages.  We
should be using clear_user_highpage(): on powerpc, at least,
clear_highpage() doesn't correctly mark the page as icache dirty so if
the page is executed shortly after it's possible to get strange
results.

This is a bugfix and should go into 2.6.16.

Signed-off-by: David Gibson <dwg at au1.ibm.com>

Index: working-2.6/mm/hugetlb.c
===================================================================
--- working-2.6.orig/mm/hugetlb.c	2006-02-06 12:58:07.000000000 +1100
+++ working-2.6/mm/hugetlb.c	2006-02-06 12:58:19.000000000 +1100
@@ -107,7 +107,7 @@ struct page *alloc_huge_page(struct vm_a
 	set_page_count(page, 1);
 	page[1].mapping = (void *)free_huge_page;
 	for (i = 0; i < (HPAGE_SIZE/PAGE_SIZE); ++i)
-		clear_highpage(&page[i]);
+		clear_user_highpage(&page[i], addr);
 	return page;
 }
 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


From david at gibson.dropbear.id.au  Mon Feb  6 13:24:53 2006
From: david at gibson.dropbear.id.au (David Gibson)
Date: Mon, 6 Feb 2006 13:24:53 +1100
Subject: powerpc: Cleanup, consolidating icache dirtying logic
Message-ID: <20060206022453.GA19390@localhost.localdomain>

The code to mark a page as icache dirty (so that it will later be
icache-dcache flushed when we try to execute from it) is duplicated in
three places: flush_dcache_page() does this marking and nothing else,
but clear_user_page() and copy_user_page() duplicate it, since those
functions make the page icache dirty themselves.

This patch makes those other functions call flush_dcache_page()
instead, so the logic's all in one place.  This will make life less
confusing if we ever need to tweak the details of the lazy icache
flush mechanism.

 arch/powerpc/mm/mem.c |   14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

Signed-off-by: David Gibson <dwg at au1.ibm.com>

Index: working-2.6/arch/powerpc/mm/mem.c
===================================================================
--- working-2.6.orig/arch/powerpc/mm/mem.c	2006-02-06 12:58:07.000000000 +1100
+++ working-2.6/arch/powerpc/mm/mem.c	2006-02-06 13:20:29.000000000 +1100
@@ -435,17 +435,12 @@ void clear_user_page(void *page, unsigne
 {
 	clear_page(page);
 
-	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
-		return;
 	/*
 	 * We shouldnt have to do this, but some versions of glibc
 	 * require it (ld.so assumes zero filled pages are icache clean)
 	 * - Anton
 	 */
-
-	/* avoid an atomic op if possible */
-	if (test_bit(PG_arch_1, &pg->flags))
-		clear_bit(PG_arch_1, &pg->flags);
+	flush_dcache_page(pg);
 }
 EXPORT_SYMBOL(clear_user_page);
 
@@ -469,12 +464,7 @@ void copy_user_page(void *vto, void *vfr
 		return;
 #endif
 
-	if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
-		return;
-
-	/* avoid an atomic op if possible */
-	if (test_bit(PG_arch_1, &pg->flags))
-		clear_bit(PG_arch_1, &pg->flags);
+	flush_dcache_page(pg);
 }
 
 void flush_icache_user_range(struct vm_area_struct *vma, struct page *page,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


From haren at us.ibm.com  Mon Feb  6 13:46:05 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Sun, 05 Feb 2006 18:46:05 -0800
Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump
	kernel
In-Reply-To: <2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org>
References: <20060203080609.403CA68A1F@ozlabs.org>
	<8FC7251A-6C37-4B4B-9120-0845616D0E60@kernel.crashing.org>
	<43E5620A.6060503@us.ibm.com>
	<2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org>
Message-ID: <43E6B86D.4070108@us.ibm.com>

Kumar Gala wrote:

>
> On Feb 4, 2006, at 8:25 PM, Haren Myneni wrote:
>
>> Kumar Gala wrote:
>>
>>> On Feb 3, 2006, at 2:05 AM, Michael Ellerman wrote:
>>>
>>>
>>>> It's possible for prom_init to allocate the flat device tree  
>>>> inside  the
>>>> kdump crash kernel region. If this happens, when we load the  
>>>> kdump  kernel we
>>>> overwrite the flattened device tree, which is bad.
>>>>
>>>> We could make prom_init try and avoid allocating inside the  crash  
>>>> kernel
>>>> region, but then we run into issues if the crash kernel region  
>>>> uses  all the
>>>> space inside the RMO. The easiest solution is to move the flat   
>>>> device tree
>>>> once we're running in the kernel.
>>>>
>>>> Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
>>>>
>>>
>>> Doesn't setup_32.c need a similar change?
>>>
>> At present kdump will not be supported on ppc32.
>> In case, kdump_move_device_tree() can be called at the beginning of  
>> unflatten_device_tree() to support both ppc32 and 64 instead of  
>> making changes in setup_64.c and setup_32.c.  The extern definition  
>> can be removed from asm-powerpc/prom.h and this function can be  static.
>>
>> Michael, what do you think if we have some printk to tell the user  
>> that device_tree is moved to new location. Because, the console  
>> messages from prom_init are saying about old addresses.
>
>
> What's the issue with kdump on ppc32?  One of the reasons we merged  
> ppc32 and ppc64 was to avoid such issues going forward?

The main issue is with the user-level kexec-tools, which do not have 
kdump support for PPC32. One of the needed changes is a cleanup/merge 
like the one that happened in the kernel. At present, normal kexec support 
is included for gamecube/ppc32. Any help is appreciated.

Thanks
Haren

>
> - k
>
>>>> ---
>>>>
>>>> arch/powerpc/kernel/prom.c     |   27 +++++++++++++++++++++++++++
>>>> arch/powerpc/kernel/setup_64.c |    3 +++
>>>> include/asm-powerpc/prom.h     |    2 ++
>>>> 3 files changed, 32 insertions(+)
>>>>
>>>> Index: kdump/arch/powerpc/kernel/prom.c
>>>> ===================================================================
>>>> --- kdump.orig/arch/powerpc/kernel/prom.c
>>>> +++ kdump/arch/powerpc/kernel/prom.c
>>>> @@ -1913,3 +1913,30 @@ int prom_update_property(struct device_n
>>>>
>>>>     return 0;
>>>> }
>>>> +
>>>> +#ifdef CONFIG_KEXEC
>>>> +/* We may have allocated the flat device tree inside the crash   
>>>> kernel region
>>>> + * in prom_init. If so we need to move it out into regular  
>>>> memory. */
>>>> +void kdump_move_device_tree(void)
>>>> +{
>>>> +    unsigned long start, end;
>>>> +    struct boot_param_header *new;
>>>> +
>>>> +    start = __pa((unsigned long)initial_boot_params);
>>>> +    end = start + initial_boot_params->totalsize;
>>>> +
>>>> +    if (end < crashk_res.start || start > crashk_res.end)
>>>> +        return;
>>>> +
>>>> +    new = (struct boot_param_header*)
>>>> +        __va(lmb_alloc(initial_boot_params->totalsize, PAGE_SIZE));
>>>> +
>>>> +    memcpy(new, initial_boot_params, initial_boot_params->totalsize);
>>>> +
>>>> +    initial_boot_params = new;
>>>> +
>>>> +    DBG("Flat device tree blob moved to %p\n", initial_boot_params);
>>>> +
>>>> +    /* XXX should we unreserve the old DT? */
>>>> +}
>>>> +#endif /* CONFIG_KEXEC */
>>>> Index: kdump/arch/powerpc/kernel/setup_64.c
>>>> ===================================================================
>>>> --- kdump.orig/arch/powerpc/kernel/setup_64.c
>>>> +++ kdump/arch/powerpc/kernel/setup_64.c
>>>> @@ -398,6 +398,9 @@ void __init setup_system(void)
>>>> {
>>>>     DBG(" -> setup_system()\n");
>>>>
>>>> +#ifdef CONFIG_KEXEC
>>>> +    kdump_move_device_tree();
>>>> +#endif
>>>>     /*
>>>>      * Unflatten the device-tree passed by prom_init or kexec
>>>>      */
>>>> Index: kdump/include/asm-powerpc/prom.h
>>>> ===================================================================
>>>> --- kdump.orig/include/asm-powerpc/prom.h
>>>> +++ kdump/include/asm-powerpc/prom.h
>>>> @@ -222,5 +222,7 @@ extern int of_address_to_resource(struct
>>>> extern int of_pci_address_to_resource(struct device_node *dev,  
>>>> int  bar,
>>>>                       struct resource *r);
>>>>
>>>> +extern void kdump_move_device_tree(void);
>>>> +
>>>> #endif /* __KERNEL__ */
>>>> #endif /* _POWERPC_PROM_H */


From michael at ellerman.id.au  Mon Feb  6 14:04:49 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Mon, 6 Feb 2006 14:04:49 +1100
Subject: [PATCH] powerpc: Don't overwrite flat device tree with kdump
	kernel
In-Reply-To: <2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org>
References: <20060203080609.403CA68A1F@ozlabs.org>
	<43E5620A.6060503@us.ibm.com>
	<2E5AE9B2-F10D-4CC4-B8C9-FBC929C76B3D@kernel.crashing.org>
Message-ID: <200602061404.52478.michael@ellerman.id.au>

On Mon, 6 Feb 2006 03:41, Kumar Gala wrote:
> On Feb 4, 2006, at 8:25 PM, Haren Myneni wrote:
> > Kumar Gala wrote:
> >> Doesn't setup_32.c need a similar change?
> >
> > At present kdump will not be supported on ppc32.
> > In case, kdump_move_device_tree() can be called at the beginning of
> > unflatten_device_tree() to support both ppc32 and 64 instead of
> > making changes in setup_64.c and setup_32.c.  The extern definition
> > can be removed from asm-powerpc/prom.h and this function can be
> > static.
>
> What's the issue with kdump on ppc32?  One of the reasons we merged
> ppc32 and ppc64 was to avoid such issues going forward?

Currently in mainline all we have is hooks for ppc32 kexec, no actual 
implementation. Apparently it exists somewhere, but I haven't seen it.

cheers

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From michael at ellerman.id.au  Mon Feb  6 17:34:14 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Mon, 06 Feb 2006 17:34:14 +1100
Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
Message-ID: <20060206063434.22D37689F3@ozlabs.org>

Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using
LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked.

This can explode if we take the decrementer interrupt while we're in a module,
because the toc pointer in r2 will be the module's toc pointer.

Instead do an immediate load. I'm not sure if we really need the trickery in
here, what do people think?

 arch/powerpc/kernel/head_64.S |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

Index: iseries/arch/powerpc/kernel/head_64.S
===================================================================
--- iseries.orig/arch/powerpc/kernel/head_64.S
+++ iseries/arch/powerpc/kernel/head_64.S
@@ -752,8 +752,13 @@ decrementer_iSeries_masked:
 	li	r11,1
 	ld	r12,PACALPPACAPTR(r13)
 	stb	r11,LPPACADECRINT(r12)
-	LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy)
-	lwz	r12,ADDROFF(tb_ticks_per_jiffy)(r12)
+	/* We have no TOC pointer in here, so we can't load tb_ticks_per_jiffy
+	 * using the usual macros. We want to be fast, so we assume the top
+	 * word of the lppaca pointer is the same as the top word of
+	 * &tb_ticks_per_jiffy.
+	 */
+	oris    r12,r12,tb_ticks_per_jiffy at h
+	lwz	r12,tb_ticks_per_jiffy at l(r12)
 	mtspr	SPRN_DEC,r12
 	/* fall through */
 

From david at gibson.dropbear.id.au  Mon Feb  6 17:42:47 2006
From: david at gibson.dropbear.id.au (David Gibson)
Date: Mon, 6 Feb 2006 17:42:47 +1100
Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
In-Reply-To: <20060206063434.22D37689F3@ozlabs.org>
References: <20060206063434.22D37689F3@ozlabs.org>
Message-ID: <20060206064247.GA31631@localhost.localdomain>

On Mon, Feb 06, 2006 at 05:34:14PM +1100, Michael Ellerman wrote:
> Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using
> LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked.
> 
> This can explode if we take the decrementer interrupt while we're in a module,
> because the toc pointer in r2 will be the module's toc pointer.
> 
> Instead do an immediate load. I'm not sure if we really need the trickery in
> here, what do people think?
> 
>  arch/powerpc/kernel/head_64.S |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> Index: iseries/arch/powerpc/kernel/head_64.S
> ===================================================================
> --- iseries.orig/arch/powerpc/kernel/head_64.S
> +++ iseries/arch/powerpc/kernel/head_64.S
> @@ -752,8 +752,13 @@ decrementer_iSeries_masked:
>  	li	r11,1
>  	ld	r12,PACALPPACAPTR(r13)
>  	stb	r11,LPPACADECRINT(r12)
> -	LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy)
> -	lwz	r12,ADDROFF(tb_ticks_per_jiffy)(r12)
> +	/* We have no TOC pointer in here, so we can't load tb_ticks_per_jiffy
> +	 * using the usual macros. We want to be fast, so we assume the top
> +	 * word of the lppaca pointer is the same as the top word of
> +	 * &tb_ticks_per_jiffy.
> +	 */

you need a
	clrrdi	r12,r12,32
here, because r12's low word may not contain zero.

> +	oris    r12,r12,tb_ticks_per_jiffy at h
> +	lwz	r12,tb_ticks_per_jiffy at l(r12)
>  	mtspr	SPRN_DEC,r12
>  	/* fall through */
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


From david at gibson.dropbear.id.au  Mon Feb  6 17:44:38 2006
From: david at gibson.dropbear.id.au (David Gibson)
Date: Mon, 6 Feb 2006 17:44:38 +1100
Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
In-Reply-To: <20060206064247.GA31631@localhost.localdomain>
References: <20060206063434.22D37689F3@ozlabs.org>
	<20060206064247.GA31631@localhost.localdomain>
Message-ID: <20060206064438.GB31631@localhost.localdomain>


On Mon, Feb 06, 2006 at 05:42:47PM +1100, David Gibson wrote:
> On Mon, Feb 06, 2006 at 05:34:14PM +1100, Michael Ellerman wrote:
> > Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using
> > LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked.
> > 
> > This can explode if we take the decrementer interrupt while we're in a module,
> > because the toc pointer in r2 will be the module's toc pointer.
> > 
> > Instead do an immediate load. I'm not sure if we really need the trickery in
> > here, what do people think?
> > 
> >  arch/powerpc/kernel/head_64.S |    9 +++++++--
> >  1 files changed, 7 insertions(+), 2 deletions(-)
> > 
> > Index: iseries/arch/powerpc/kernel/head_64.S
> > ===================================================================
> > --- iseries.orig/arch/powerpc/kernel/head_64.S
> > +++ iseries/arch/powerpc/kernel/head_64.S
> > @@ -752,8 +752,13 @@ decrementer_iSeries_masked:
> >  	li	r11,1
> >  	ld	r12,PACALPPACAPTR(r13)
> >  	stb	r11,LPPACADECRINT(r12)
> > -	LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy)
> > -	lwz	r12,ADDROFF(tb_ticks_per_jiffy)(r12)
> > +	/* We have no TOC pointer in here, so we can't load tb_ticks_per_jiffy
> > +	 * using the usual macros. We want to be fast, so we assume the top
> > +	 * word of the lppaca pointer is the same as the top word of
> > +	 * &tb_ticks_per_jiffy.
> > +	 */
> 
> you need a
> 	clrrdi	r12,r12,32
> here, because r12's low word may not contain zero.
> 
> > +	oris    r12,r12,tb_ticks_per_jiffy@h

Oh, and that needs to be @ha, because the load offset below is treated
as signed.

> > +	lwz	r12,tb_ticks_per_jiffy@l(r12)
> >  	mtspr	SPRN_DEC,r12
> >  	/* fall through */
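
A small worked example of why @ha is needed, as a sketch in C. The
displacement in the lwz is sign-extended, so whenever bit 15 of the
address is set the low part subtracts from the base, and the high part
has to be rounded up by one to compensate. The function names and the
sample address are invented for illustration.

```c
#include <stdint.h>

/* sym@h: bits 16..31 of the address, taken as-is. */
static uint16_t part_h(uint64_t addr)
{
	return (uint16_t)(addr >> 16);
}

/* sym@ha: bits 16..31, bumped by one when the low half is >= 0x8000. */
static uint16_t part_ha(uint64_t addr)
{
	return (uint16_t)((addr + 0x8000) >> 16);
}

/* sym@l as the load instruction sees it: a signed 16-bit displacement. */
static int16_t part_l(uint64_t addr)
{
	return (int16_t)(uint16_t)addr;
}

/* Address reached from a base whose low 32 bits are already zero. */
static uint64_t reach(uint64_t base, uint16_t high, int16_t low)
{
	return (base | ((uint64_t)high << 16)) + (uint64_t)(int64_t)low;
}
```

For a made-up address 0xC00000000060F234 the low half 0xF234 is read as
-0x0DCC. Using @h lands exactly 0x10000 below the intended word; using
@ha (0x0061 instead of 0x0060) lands on it. When the low half is below
0x8000 the two agree.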

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


From olof at lixom.net  Mon Feb  6 17:53:54 2006
From: olof at lixom.net (Olof Johansson)
Date: Mon, 6 Feb 2006 00:53:54 -0600
Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
In-Reply-To: <20060206064247.GA31631@localhost.localdomain>
References: <20060206063434.22D37689F3@ozlabs.org>
	<20060206064247.GA31631@localhost.localdomain>
Message-ID: <20060206065354.GA7626@pb15.lixom.net>

On Mon, Feb 06, 2006 at 05:42:47PM +1100, David Gibson wrote:
> On Mon, Feb 06, 2006 at 05:34:14PM +1100, Michael Ellerman wrote:
> > Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using
> > LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked.
> > 
> > This can explode if we take the decrementer interrupt while we're in a module,
> > because the toc pointer in r2 will be the module's toc pointer.
> > 
> > Instead do an immediate load. I'm not sure if we really need the trickery in
> > here, what do people think?
> > 
> >  arch/powerpc/kernel/head_64.S |    9 +++++++--
> >  1 files changed, 7 insertions(+), 2 deletions(-)
> > 
> > Index: iseries/arch/powerpc/kernel/head_64.S
> > ===================================================================
> > --- iseries.orig/arch/powerpc/kernel/head_64.S
> > +++ iseries/arch/powerpc/kernel/head_64.S
> > @@ -752,8 +752,13 @@ decrementer_iSeries_masked:
> >  	li	r11,1
> >  	ld	r12,PACALPPACAPTR(r13)
> >  	stb	r11,LPPACADECRINT(r12)
> > -	LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy)
> > -	lwz	r12,ADDROFF(tb_ticks_per_jiffy)(r12)
> > +	/* We have no TOC pointer in here, so we can't load tb_ticks_per_jiffy
> > +	 * using the usual macros. We want to be fast, so we assume the top
> > +	 * word of the lppaca pointer is the same as the top word of
> > +	 * &tb_ticks_per_jiffy.
> > +	 */
> > +	oris    r12,r12,tb_ticks_per_jiffy@h
> 
> you need a
> 	clrrdi	r12,r12,32
> here, because r12's low word may not contain zero.

Or do:
	oris	r12,r11,tb_ticks_per_jiffy@ha

since r11 only contains '1'. It's a bit obfuscated though, it depends on
if the saved single extra instruction is that precious or not. :-)


-Olof


From olof at lixom.net  Mon Feb  6 17:55:47 2006
From: olof at lixom.net (Olof Johansson)
Date: Mon, 6 Feb 2006 00:55:47 -0600
Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
In-Reply-To: <20060206065354.GA7626@pb15.lixom.net>
References: <20060206063434.22D37689F3@ozlabs.org>
	<20060206064247.GA31631@localhost.localdomain>
	<20060206065354.GA7626@pb15.lixom.net>
Message-ID: <20060206065547.GB7626@pb15.lixom.net>

On Mon, Feb 06, 2006 at 12:53:54AM -0600, Olof Johansson wrote:
> Or do:
> 	oris	r12,r11,tb_ticks_per_jiffy@ha
> 
> since r11 only contains '1'. It's a bit obfuscated though, it depends on
> if the saved single extra instruction is that precious or not. :-)

DOH. That obviously won't work. Never mind.


-Olof


From wli at holomorphy.com  Mon Feb  6 19:19:39 2006
From: wli at holomorphy.com (William Lee Irwin III)
Date: Mon, 6 Feb 2006 00:19:39 -0800
Subject: Hugepages need clear_user_highpage() not clear_highpage()
In-Reply-To: <20060206021853.GC10708@localhost.localdomain>
References: <20060206021853.GC10708@localhost.localdomain>
Message-ID: <20060206081939.GA6789@holomorphy.com>

On Mon, Feb 06, 2006 at 01:18:53PM +1100, David Gibson wrote:
> When hugepages are newly allocated to a file in mm/hugetlb.c, we clear
> them with a call to clear_highpage() on each of the subpages.  We
> should be using clear_user_highpage(): on powerpc, at least,
> clear_highpage() doesn't correctly mark the page as icache dirty so if
> the page is executed shortly after it's possible to get strange
> results.
> This is a bugfix and should go into 2.6.16.
> Signed-off-by: David Gibson <dwg at au1.ibm.com>

Not sure how this got past the usual crapfilters. Sorry about that.

Acked-by: William Irwin <wli at holomorphy.com>


-- wli


From bgill at freescale.com  Tue Feb  7 07:26:31 2006
From: bgill at freescale.com (Becky Bruce)
Date: Mon, 6 Feb 2006 14:26:31 -0600 (CST)
Subject: [PATCH] documentation: add bus-frequency property to SOC node
Message-ID: <Pine.LNX.4.61.0602061423450.18645@cde-tx32-ldt329.sps.mot.com>

Updated SOC node definition in documentation to include bus-frequency
property. Also extended mdio example to match specification.

Signed-off-by: Becky Bruce <becky.bruce at freescale.com>
Signed-off-by: Kumar Gala <galak at gate.crashing.org>

---
commit 3441bf59c7e1dc3823f9be57838a2536c78f6f8f
tree 2901a0e19418f1fe904ff0d041c630b3af048961
parent 66c490c9b00c52cd0f1e088ad689c9148e46f49e
author Becky Bruce <becky.bruce at freescale.com> Thu, 02 Feb 2006 15:41:11 -0600
committer Becky Bruce <becky.bruce at freescale.com> Thu, 02 Feb 2006 15:41:11 -0600

 Documentation/powerpc/booting-without-of.txt |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt
index 1284498..54e5f9b 100644
--- a/Documentation/powerpc/booting-without-of.txt
+++ b/Documentation/powerpc/booting-without-of.txt
@@ -880,6 +880,10 @@ address which can extend beyond that lim
     - device_type : Should be "soc"
     - ranges : Should be defined as specified in 1) to describe the
       translation of SOC addresses for memory mapped SOC registers.
+    - bus-frequency: Contains the bus frequency for the SOC node.
+      Typically, the value of this field is filled in by the boot
+      loader. 
+
 
   Recommended properties:
 
@@ -919,6 +923,7 @@ SOC.
 		device_type = "soc";
 		ranges = <00000000 e0000000 00100000>
 		reg = <e0000000 00003000>;
+		bus-frequency = <0>;
 	}
 
 
@@ -1170,6 +1175,8 @@ platforms are moved over to use the flat
 
 	mdio@24520 {
 		reg = <24520 20>;
+		device_type = "mdio"; 
+		compatible = "gianfar";
 
 		ethernet-phy@0 {
 			......
@@ -1317,6 +1324,7 @@ not necessary as they are usually the sa
 		device_type = "soc";
 		ranges = <00000000 e0000000 00100000>
 		reg = <e0000000 00003000>;
+		bus-frequency = <0>;
 
 		mdio@24520 {
 			reg = <24520 20>;


From mikey at neuling.org  Tue Feb  7 10:58:21 2006
From: mikey at neuling.org (Michael Neuling)
Date: Tue, 7 Feb 2006 10:58:21 +1100
Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down
Message-ID: <20060207105821.bfd5ea21.mikey@neuling.org>

Paulus,

We call unregister_vpa but we don't check to see if the hypervisor
supports this.

Please apply.

Signed-off-by: Michael Neuling <mikey at neuling.org>
Acked-by: Anton Blanchard <anton at samba.org>
--
 arch/powerpc/platforms/pseries/setup.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- linux-2.6-powerpc.orig/arch/powerpc/platforms/pseries/setup.c
+++ linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c
@@ -585,7 +585,7 @@ static int pSeries_pci_probe_mode(struct
 static void pseries_kexec_cpu_down(int crash_shutdown, int secondary)
 {
 	/* Don't risk a hypervisor call if we're crashing */
-	if (!crash_shutdown) {
+	if (firmware_has_feature(FW_FEATURE_SPLPAR) && !crash_shutdown) {
 		unsigned long vpa = __pa(get_lppaca());
 
 		if (unregister_vpa(hard_smp_processor_id(), vpa)) {


From sfr at canb.auug.org.au  Tue Feb  7 10:59:06 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 7 Feb 2006 10:59:06 +1100
Subject: [PATCH] powerpc: wire up the *at system calls
Message-ID: <20060207105906.04a22df3.sfr@canb.auug.org.au>


Signed-off-by: Stephen Rothwell <sfr at canb.auug.org.au>
---

 arch/powerpc/kernel/systbl.S |   13 +++++++++++++
 include/asm-powerpc/unistd.h |   15 ++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletions(-)

This depends on the patch that creates all the compat wrappers.
-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

d02d8208813d8cae2c814a85734a1a31fed2f3ac
diff --git a/arch/powerpc/kernel/systbl.S b/arch/powerpc/kernel/systbl.S
index 007b15e..fe16d9c 100644
--- a/arch/powerpc/kernel/systbl.S
+++ b/arch/powerpc/kernel/systbl.S
@@ -323,3 +323,16 @@ SYSCALL(spu_run)
 SYSCALL(spu_create)
 COMPAT_SYS(pselect6)
 COMPAT_SYS(ppoll)
+COMPAT_SYS(openat)
+COMPAT_SYS(mkdirat)
+COMPAT_SYS(mknodat)
+COMPAT_SYS(fchownat)
+COMPAT_SYS(futimesat)
+COMPAT_SYS(newfstatat)
+COMPAT_SYS(unlinkat)
+COMPAT_SYS(renameat)
+COMPAT_SYS(linkat)
+COMPAT_SYS(symlinkat)
+COMPAT_SYS(readlinkat)
+COMPAT_SYS(fchmodat)
+COMPAT_SYS(faccessat)
diff --git a/include/asm-powerpc/unistd.h b/include/asm-powerpc/unistd.h
index a40cdff..d05b85e 100644
--- a/include/asm-powerpc/unistd.h
+++ b/include/asm-powerpc/unistd.h
@@ -300,8 +300,21 @@
 #define __NR_spu_create		279
 #define __NR_pselect6		280
 #define __NR_ppoll		281
+#define __NR_openat		282
+#define __NR_mkdirat		283
+#define __NR_mknodat		284
+#define __NR_fchownat		285
+#define __NR_futimesat		286
+#define __NR_newfstatat		287
+#define __NR_unlinkat		288
+#define __NR_renameat		289
+#define __NR_linkat		290
+#define __NR_symlinkat		291
+#define __NR_readlinkat		292
+#define __NR_fchmodat		293
+#define __NR_faccessat		294
 
-#define __NR_syscalls		282
+#define __NR_syscalls		295
 
 #ifdef __KERNEL__
 #define __NR__exit __NR_exit
-- 
1.1.5
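
The numbering in the unistd.h hunk can be sanity-checked mechanically.
The sketch below copies the first and last new numbers from the patch
under SKETCH_ names (the prefix is only there to avoid clashing with real
headers) and checks that the table size is the highest number plus one.

```c
/* Values copied from the unistd.h hunk; SKETCH_ names are illustrative. */
#define SKETCH_NR_ppoll		281
#define SKETCH_NR_openat	282
#define SKETCH_NR_faccessat	294
#define SKETCH_NR_syscalls	295

/* Number of *at entries added, matching the 13 new lines in systbl.S. */
enum { SKETCH_NEW_AT_CALLS = SKETCH_NR_faccessat - SKETCH_NR_openat + 1 };

_Static_assert(SKETCH_NR_openat == SKETCH_NR_ppoll + 1,
	       "new numbers continue directly after ppoll");
_Static_assert(SKETCH_NEW_AT_CALLS == 13,
	       "one number per new table entry");
_Static_assert(SKETCH_NR_syscalls == SKETCH_NR_faccessat + 1,
	       "table size is the highest number plus one");

/* A number is dispatchable only if it indexes inside the table. */
static int sketch_syscall_in_table(int nr)
{
	return nr >= 0 && nr < SKETCH_NR_syscalls;
}
```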


From michael at ellerman.id.au  Tue Feb  7 11:07:58 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 7 Feb 2006 11:07:58 +1100
Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
In-Reply-To: <20060206063434.22D37689F3@ozlabs.org>
References: <20060206063434.22D37689F3@ozlabs.org>
Message-ID: <200602071108.01571.michael@ellerman.id.au>

On Mon, 6 Feb 2006 17:34, Michael Ellerman wrote:
> Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using
> LOAD_REG_ADDRBASE, which uses the toc pointer, in
> decrementer_iSeries_masked.
>
> This can explode if we take the decrementer interrupt while we're in a
> module, because the toc pointer in r2 will be the module's toc pointer.
>
> Instead do an immediate load. I'm not sure if we really need the trickery
> in here, what do people think?

I think we answered that question pretty thoroughly, I'll post an updated and 
simplified version soon.

cheers

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From michael at ellerman.id.au  Tue Feb  7 11:21:13 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 7 Feb 2006 11:21:13 +1100
Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down
In-Reply-To: <20060207105821.bfd5ea21.mikey@neuling.org>
References: <20060207105821.bfd5ea21.mikey@neuling.org>
Message-ID: <200602071121.17076.michael@ellerman.id.au>

On Tue, 7 Feb 2006 10:58, Michael Neuling wrote:
> Index: linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c
> ===================================================================
> --- linux-2.6-powerpc.orig/arch/powerpc/platforms/pseries/setup.c
> +++ linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c
> @@ -585,7 +585,7 @@ static int pSeries_pci_probe_mode(struct
>  static void pseries_kexec_cpu_down(int crash_shutdown, int secondary)
>  {
>  	/* Don't risk a hypervisor call if we're crashing */
> -	if (!crash_shutdown) {
> +	if (firmware_has_feature(FW_FEATURE_SPLPAR) && !crash_shutdown) {
>  		unsigned long vpa = __pa(get_lppaca());
>
>  		if (unregister_vpa(hard_smp_processor_id(), vpa)) {

Is SPLPAR the right test? I would have thought LPAR?

cheers

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From mikey at neuling.org  Tue Feb  7 11:58:16 2006
From: mikey at neuling.org (Michael Neuling)
Date: Tue, 7 Feb 2006 11:58:16 +1100
Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down
In-Reply-To: <200602071121.17076.michael@ellerman.id.au>
References: <20060207105821.bfd5ea21.mikey@neuling.org>
	<200602071121.17076.michael@ellerman.id.au>
Message-ID: <20060207115816.c2314f86.mikey@neuling.org>

> Is SPLPAR the right test? I would have thought LPAR?

I missed your patch which added this but you're right.

Revised patch attached.  Now depends on MPE's patches from here:
http://patchwork.ozlabs.org/linuxppc64/patch?id=4088

--
We call unregister_vpa but we don't check to see if the hypervisor
supports this.

Signed-off-by: Michael Neuling <mikey at neuling.org>

 arch/powerpc/platforms/pseries/setup.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- linux-2.6-powerpc.orig/arch/powerpc/platforms/pseries/setup.c
+++ linux-2.6-powerpc/arch/powerpc/platforms/pseries/setup.c
@@ -585,7 +585,7 @@ static int pSeries_pci_probe_mode(struct
 static void pseries_kexec_cpu_down(int crash_shutdown, int secondary)
 {
 	/* Don't risk a hypervisor call if we're crashing */
-	if (!crash_shutdown) {
+	if (firmware_has_feature(FW_FEATURE_LPAR) && !crash_shutdown) {
 		unsigned long vpa = __pa(get_lppaca());
 
 		if (unregister_vpa(hard_smp_processor_id(), vpa)) {


From michael at ellerman.id.au  Tue Feb  7 13:26:14 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 07 Feb 2006 13:26:14 +1100
Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
Message-ID: <20060207022639.1F591689DD@ozlabs.org>

Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using
LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked.

This can explode if we take the decrementer interrupt while we're in a module,
because the toc pointer in r2 will be the module's toc pointer.
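
The patch below replaces the TOC-relative load with LOAD_REG_IMMEDIATE,
which builds the full 64-bit address from immediates encoded in the
instructions themselves, so r2 is never consulted. The macro body is not
shown in this thread; the sketch assumes the usual five-instruction
sequence (lis, ori, rldicr, oris, ori) and models it in C.

```c
#include <stdint.h>

/* Model of building a 64-bit constant from four 16-bit immediates. */
static uint64_t load_reg_immediate(uint64_t expr)
{
	uint16_t highest = (uint16_t)(expr >> 48);
	uint16_t higher  = (uint16_t)(expr >> 32);
	uint16_t h       = (uint16_t)(expr >> 16);
	uint16_t l       = (uint16_t)expr;
	uint64_t reg;

	/* lis reg,highest: shifted immediate, sign-extended to 64 bits */
	reg = (uint64_t)(int64_t)(int32_t)((uint32_t)highest << 16);
	/* ori reg,reg,higher */
	reg |= higher;
	/* rldicr reg,reg,32,31: move into the top word, clear the low word */
	reg <<= 32;
	/* oris reg,reg,h */
	reg |= (uint64_t)h << 16;
	/* ori reg,reg,l */
	reg |= l;
	return reg;
}
```

Because every step is an OR of an unsigned piece (no signed displacement
is involved), the @h/@ha question from the earlier review does not arise
here, and the load can use a displacement of 0.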

 arch/powerpc/kernel/head_64.S |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Index: iseries/arch/powerpc/kernel/head_64.S
===================================================================
--- iseries.orig/arch/powerpc/kernel/head_64.S
+++ iseries/arch/powerpc/kernel/head_64.S
@@ -749,11 +749,12 @@ iSeries_secondary_smp_loop:
 
 	.globl decrementer_iSeries_masked
 decrementer_iSeries_masked:
+	/* We may not have a valid TOC pointer in here. */
 	li	r11,1
 	ld	r12,PACALPPACAPTR(r13)
 	stb	r11,LPPACADECRINT(r12)
-	LOAD_REG_ADDRBASE(r12,tb_ticks_per_jiffy)
-	lwz	r12,ADDROFF(tb_ticks_per_jiffy)(r12)
+	LOAD_REG_IMMEDIATE(r12, tb_ticks_per_jiffy)
+	lwz	r12,0(r12)
 	mtspr	SPRN_DEC,r12
 	/* fall through */
 

From ntl at pobox.com  Tue Feb  7 14:39:53 2006
From: ntl at pobox.com (Nathan Lynch)
Date: Mon, 6 Feb 2006 21:39:53 -0600
Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down
In-Reply-To: <20060207115816.c2314f86.mikey@neuling.org>
References: <20060207105821.bfd5ea21.mikey@neuling.org>
	<200602071121.17076.michael@ellerman.id.au>
	<20060207115816.c2314f86.mikey@neuling.org>
Message-ID: <20060207033953.GH18730@localhost.localdomain>

Michael Neuling wrote:
> > Is SPLPAR the right test? I would have thought LPAR?
> 
> I missed your patch which added this but you're right.

Actually I think the original patch is correct.  VPAs come into play
only when the hypervisor supports the SPLPAR option.
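
A toy model of the distinction, in case it helps. The feature bits and
function names below are stand-ins (the real FW_FEATURE_* values and
unregister_vpa() are not reproduced here); the point is only that a
partition with LPAR but without SPLPAR should skip the call.

```c
#include <stdbool.h>

/* Hypothetical bit assignments, for illustration only. */
#define SKETCH_FEAT_LPAR	(1UL << 0)
#define SKETCH_FEAT_SPLPAR	(1UL << 1)

static unsigned long sketch_fw_features;	/* stand-in feature word */
static int sketch_unregister_calls;		/* counts simulated hcalls */

static bool sketch_has_feature(unsigned long feature)
{
	return (sketch_fw_features & feature) != 0;
}

/* Mirrors the guard in the original patch: SPLPAR and not crashing. */
static void sketch_kexec_cpu_down(bool crash_shutdown)
{
	if (sketch_has_feature(SKETCH_FEAT_SPLPAR) && !crash_shutdown)
		sketch_unregister_calls++;	/* stands in for unregister_vpa() */
}
```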


From paulus at samba.org  Tue Feb  7 14:41:39 2006
From: paulus at samba.org (Paul Mackerras)
Date: Tue, 7 Feb 2006 14:41:39 +1100
Subject: merge these lists?
Message-ID: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>

A lot of messages seem to get cross-posted to both linuxppc-dev and
linuxppc64-dev these days, since we are all working in the one tree.
Rather than having to cross-post, I propose that we create a single
powerpc-dev at ozlabs.org list to replace linuxppc-dev and
linuxppc64-dev.  (The linuxppc-embedded list would continue as at
present.)

If we do this, we would set up the new list with the union of the
subscribers of the old lists, and make emails sent to linuxppc-dev and
linuxppc64-dev go to the new list, so it should be painless.

Thoughts? Comments? Objections?

Paul.


From jk at ozlabs.org  Tue Feb  7 14:45:45 2006
From: jk at ozlabs.org (Jeremy Kerr)
Date: Tue, 7 Feb 2006 14:45:45 +1100
Subject: merge these lists?
In-Reply-To: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
Message-ID: <200602071445.45805.jk@ozlabs.org>

Paul,

> If we do this, we would set up the new list with the union of the
> subscribers of the old lists, and make emails sent to linuxppc-dev
> and linuxppc64-dev go to the new list, so it should be painless.
>
> Thoughts? Comments? Objections?

How about the patchwork lists? Should I look at merging those too?


Jeremy


From haren at us.ibm.com  Tue Feb  7 14:50:03 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Mon, 06 Feb 2006 19:50:03 -0800
Subject: [PATCH] Fix in free initrd when overlapped with crashkernel region
Message-ID: <43E818EB.7010003@us.ibm.com>


The reserved crashkernel region can overlap the initrd, because the
bootloader chooses the initrd location. When the initrd region is freed,
the memory reserved for the second kernel is no longer contiguous.
kexec_load can then oops because there is no contiguous memory to write
the second kernel into, or the freed memory could be used by the first
kernel itself and so may not be part of the dump. For example, on
powerpc, the initrd is located at 36MB and the crashkernel region starts
at 32MB. kexec_load panicked because it wrote into non-allocated memory
(after 36MB). The same issue could show up on other archs.

One possibility is to move the initrd outside of the crashkernel region.
But the initrd region will be freed anyway before the system is up.
This patch fixes the issue by freeing only the parts of the initrd that
are not part of crashkernel memory when the two overlap.
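
The patch itself is in the attachment, so the following is not its code,
only a sketch of the idea in C: subtract the crashkernel range from the
initrd range and free what is left. The struct, the function name, and
the 64MB crashkernel size used in the usage note are assumptions; the
thread only gives the 32MB start and the 36MB initrd location.

```c
#include <stddef.h>

struct sketch_range {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
};

/*
 * Store the parts of 'initrd' that lie outside 'crashk' in out[0..1]
 * and return how many pieces there are (0, 1 or 2).
 */
static size_t sketch_free_outside(struct sketch_range initrd,
				  struct sketch_range crashk,
				  struct sketch_range out[2])
{
	size_t n = 0;

	/* No overlap: the whole initrd can be freed as before. */
	if (initrd.end <= crashk.start || initrd.start >= crashk.end) {
		if (initrd.start < initrd.end)
			out[n++] = initrd;
		return n;
	}
	/* Piece below the crashkernel region, if any. */
	if (initrd.start < crashk.start)
		out[n++] = (struct sketch_range){ initrd.start, crashk.start };
	/* Piece above the crashkernel region, if any. */
	if (initrd.end > crashk.end)
		out[n++] = (struct sketch_range){ crashk.end, initrd.end };
	return n;
}
```

With a crashkernel region of [32MB, 96MB) and an initrd at [36MB, 40MB),
the function returns no pieces, so nothing inside the reserved region is
handed back to the allocator.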

Thanks
Haren

Signed-off-by: Haren Myneni <haren at us.ibm.com>


-------------- next part --------------
A non-text attachment was scrubbed...
Name: kdump-initrd-overlap-fix.patch
Type: text/x-patch
Size: 1723 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060206/cb4e1a08/attachment.bin 

From olof at lixom.net  Tue Feb  7 15:12:43 2006
From: olof at lixom.net (Olof Johansson)
Date: Mon, 6 Feb 2006 22:12:43 -0600
Subject: merge these lists?
In-Reply-To: <200602071445.45805.jk@ozlabs.org>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
	<200602071445.45805.jk@ozlabs.org>
Message-ID: <20060207041243.GC7626@pb15.lixom.net>

On Tue, Feb 07, 2006 at 02:45:45PM +1100, Jeremy Kerr wrote:
> Paul,
> 
> > If we do this, we would set up the new list with the union of the
> > subscribers of the old lists, and make emails sent to linuxppc-dev
> > and linuxppc64-dev go to the new list, so it should be painless.
> >
> > Thoughts? Comments? Objections?
> 
> How about the patchwork lists? Should I look at merging those too?

I get a feeling that our maintainers might not be using them much any
more (most patches since August of last year are still "New"), but I
find it convenient to search for a patch that you know has gone by but
can't find in your list mbox.

I would appreciate either a merge, or a new archive started for the new
list. It's useful.


-Olof


From michael at ellerman.id.au  Tue Feb  7 15:29:49 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 7 Feb 2006 15:29:49 +1100
Subject: merge these lists?
In-Reply-To: <20060207041243.GC7626@pb15.lixom.net>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
	<200602071445.45805.jk@ozlabs.org>
	<20060207041243.GC7626@pb15.lixom.net>
Message-ID: <200602071529.52073.michael@ellerman.id.au>

On Tue, 7 Feb 2006 15:12, Olof Johansson wrote:
> On Tue, Feb 07, 2006 at 02:45:45PM +1100, Jeremy Kerr wrote:
> > Paul,
> >
> > > If we do this, we would set up the new list with the union of the
> > > subscribers of the old lists, and make emails sent to linuxppc-dev
> > > and linuxppc64-dev go to the new list, so it should be painless.
> > >
> > > Thoughts? Comments? Objections?
> >
> > How about the patchwork lists? Should I look at merging those too?
>
> I get a feeling that our maintainers might not be using them much any
> more (most patches since August of last year are still "New"), but I
> find it convenient to search for a patch that you know has gone by but
> can't find in your list mbox.
>
> I would appreciate either a merge, or a new archive started for the new
> list. It's useful.

And while Jk has nothing else to do .. I'd like to be able to manage my own
patches, i.e. set them as obsolete etc.

Oh and yeah I think we should merge the lists :D

cheers

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From ntl at pobox.com  Tue Feb  7 15:44:23 2006
From: ntl at pobox.com (Nathan Lynch)
Date: Mon, 6 Feb 2006 22:44:23 -0600
Subject: [PATCH] avoid timer interrupt replay effect when onlining cpu
Message-ID: <20060207044422.GI18730@localhost.localdomain>


When a cpu is hotplug-onlined, if we don't set per_cpu(last_jiffy) to
something sane, timer_interrupt will execute its while loop for every
tick missed since the cpu was last online (or since the system was
booted, if we're adding a new cpu).  This can cause weird hangs, ssh
sessions dropping, and we can even go xmon if we take a global IPI at
the wrong time.

Signed-off-by: Nathan Lynch <ntl at pobox.com>


--- powerpc-timer_interrupt-replay.orig/arch/powerpc/kernel/smp.c
+++ powerpc-timer_interrupt-replay/arch/powerpc/kernel/smp.c
@@ -540,6 +540,9 @@ int __devinit start_secondary(void *unus
 	if (smp_ops->take_timebase)
 		smp_ops->take_timebase();
 
+	if (system_state > SYSTEM_BOOTING)
+		per_cpu(last_jiffy, cpu) = get_tb();
+
 	spin_lock(&call_lock);
 	cpu_set(cpu, cpu_online_map);
 	spin_unlock(&call_lock);
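
For a feel of the replay effect, a simplified C model of the catch-up
loop follows. It is not the kernel's timer_interrupt(); it only keeps the
part described above, one loop iteration per jiffy elapsed since
last_jiffy. The function name and the tick values in the usage note are
invented.

```c
#include <stdint.h>

/*
 * Run the catch-up loop once and return how many iterations it took.
 * 'now' plays the role of the current timebase value.
 */
static uint64_t sketch_timer_iterations(uint64_t *last_jiffy, uint64_t now,
					uint64_t ticks_per_jiffy)
{
	uint64_t iterations = 0;

	while (now - *last_jiffy >= ticks_per_jiffy) {
		*last_jiffy += ticks_per_jiffy;
		iterations++;	/* per-tick work would happen here */
	}
	return iterations;
}
```

With 1000 timebase ticks per jiffy and a cpu brought online at timebase
5000000, a stale last_jiffy of 0 costs 5001 iterations on the first
interrupt one jiffy later; after setting last_jiffy to the current
timebase at online time, the same interrupt costs one iteration.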


From mikey at neuling.org  Tue Feb  7 15:55:39 2006
From: mikey at neuling.org (Michael Neuling)
Date: Tue, 7 Feb 2006 15:55:39 +1100
Subject: [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down
In-Reply-To: <20060207033953.GH18730@localhost.localdomain>
References: <20060207105821.bfd5ea21.mikey@neuling.org>
	<200602071121.17076.michael@ellerman.id.au>
	<20060207115816.c2314f86.mikey@neuling.org>
	<20060207033953.GH18730@localhost.localdomain>
Message-ID: <20060207155539.8e2130b7.mikey@neuling.org>

> > > Is SPLPAR the right test? I would have thought LPAR?
> > 
> > I missed your patch which added this but you're right.
> 
> Actually I think the original patch is correct.  VPAs come into play
> only when the hypervisor supports the SPLPAR option.

My bad.  Looking at the PAPR you're right. 

Original patch is good.  Second patch is bogus.

Mikey


From michael at ellerman.id.au  Tue Feb  7 16:02:33 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 7 Feb 2006 16:02:33 +1100
Subject: [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
In-Reply-To: <20060207022639.1F591689DD@ozlabs.org>
References: <20060207022639.1F591689DD@ozlabs.org>
Message-ID: <200602071602.36387.michael@ellerman.id.au>

On Tue, 7 Feb 2006 13:26, Michael Ellerman wrote:
> Since 404849bbd2bfd62e05b36f4753f6e1af6050a824 we've been using
> LOAD_REG_ADDRBASE, which uses the toc pointer, in
> decrementer_iSeries_masked.
>
> This can explode if we take the decrementer interrupt while we're in a
> module, because the toc pointer in r2 will be the module's toc pointer.

Ooops ...

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>

From michael at ellerman.id.au  Tue Feb  7 16:03:11 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 7 Feb 2006 16:03:11 +1100
Subject: [PATCH] powerpc: Fix !SMP build of rtas.c
In-Reply-To: <20060131061807.0104468A53@ozlabs.org>
References: <20060131061807.0104468A53@ozlabs.org>
Message-ID: <200602071603.14043.michael@ellerman.id.au>

On Tue, 31 Jan 2006 17:17, Michael Ellerman wrote:
> arch/powerpc/kernel/rtas.c is getting hvcall.h via spinlock.h, but when
> we're building for UP we don't include spinlock.h.
>
>  arch/powerpc/kernel/rtas.c |    1 +
>  1 files changed, 1 insertion(+)

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>

From michael at ellerman.id.au  Tue Feb  7 16:03:52 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 7 Feb 2006 16:03:52 +1100
Subject: [PATCH] powerpc: ibmveth: Harden driver initilisation for kexec
In-Reply-To: <20060131041055.5623C68A46@ozlabs.org>
References: <20060131041055.5623C68A46@ozlabs.org>
Message-ID: <200602071603.55743.michael@ellerman.id.au>

On Tue, 31 Jan 2006 15:10, Michael Ellerman wrote:
> After a kexec the veth driver will fail when trying to register with the
> Hypervisor because the previous kernel has not unregistered.
>
> So if the registration fails, we unregister and then try again.

Sorry this is missing:

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>

From michael at ellerman.id.au  Tue Feb  7 16:22:00 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 07 Feb 2006 16:22:00 +1100
Subject: [PATCH] powerpc: Make BUG_ON & WARN_ON play nice with compile-time
	optimisations
Message-ID: <20060207052220.917C668A92@ozlabs.org>

Currently if you do BUG_ON(0) you'll still get a trap instruction in your
object, although it'll never trigger. That's ok, but a bit ugly, it'd be nice
if the compiler could completely eliminate any trace of the BUG_ON.

So update the BUG_ON & WARN_ON macros to make this possible. From the comment
in the patch:

 The if statement in BUG_ON and WARN_ON gives the compiler a chance to do
 compile-time optimisation and possibly elide the entire block. The check
 for !__builtin_constant_p(x) has the opposite effect, if we must do the
 test at runtime then we avoid a spurious compare and branch by ensuring
 the if condition is always true.

I've confirmed it works in both cases, if the condition is false at compile
time we get no code emitted for the BUG statement. If the condition needs to
be evaluated at runtime we get the same code we used to, ie. only one test
in the trap instruction.

It's not clear from the patch due to the whitespace changes, but there are no
changes to the inline asm whatsoever.

For consideration for 2.6.17 I guess.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 include/asm-powerpc/bug.h |   46 +++++++++++++++++++++++++++++-----------------
 1 files changed, 29 insertions(+), 17 deletions(-)

Index: iseries/include/asm-powerpc/bug.h
===================================================================
--- iseries.orig/include/asm-powerpc/bug.h
+++ iseries/include/asm-powerpc/bug.h
@@ -39,25 +39,37 @@ struct bug_entry *find_bug(unsigned long
 		: : "i" (__LINE__), "i" (__FILE__), "i" (__FUNCTION__)); \
 } while (0)
 
-#define BUG_ON(x) do {						\
-	__asm__ __volatile__(					\
-		"1:	"PPC_TLNEI"	%0,0\n"			\
-		".section __bug_table,\"a\"\n"			\
-		"\t"PPC_LONG"	1b,%1,%2,%3\n"		\
-		".previous"					\
-		: : "r" ((long)(x)), "i" (__LINE__),		\
-		    "i" (__FILE__), "i" (__FUNCTION__));	\
+/*
+ * The if statement in BUG_ON and WARN_ON gives the compiler a chance to do
+ * compile-time optimisation and possibly elide the entire block. The check
+ * for !__builtin_constant_p(x) has the opposite effect, if we must do the
+ * test at runtime then we avoid a spurious compare and branch by ensuring
+ * the if condition is always true.
+ */
+
+#define BUG_ON(x) do {							\
+	if (!__builtin_constant_p(x) || (x)) {				\
+		__asm__ __volatile__(					\
+			"1:	"PPC_TLNEI"	%0,0\n"			\
+			".section __bug_table,\"a\"\n"			\
+			"\t"PPC_LONG"	1b,%1,%2,%3\n"			\
+			".previous"					\
+			: : "r" ((long)(x)), "i" (__LINE__),		\
+			    "i" (__FILE__), "i" (__FUNCTION__));	\
+	}								\
 } while (0)
 
-#define WARN_ON(x) do {						\
-	__asm__ __volatile__(					\
-		"1:	"PPC_TLNEI"	%0,0\n"			\
-		".section __bug_table,\"a\"\n"			\
-		"\t"PPC_LONG"	1b,%1,%2,%3\n"		\
-		".previous"					\
-		: : "r" ((long)(x)),				\
-		    "i" (__LINE__ + BUG_WARNING_TRAP),		\
-		    "i" (__FILE__), "i" (__FUNCTION__));	\
+#define WARN_ON(x) do {							\
+	if (!__builtin_constant_p(x) || (x)) {				\
+		__asm__ __volatile__(					\
+			"1:	"PPC_TLNEI"	%0,0\n"			\
+			".section __bug_table,\"a\"\n"			\
+			"\t"PPC_LONG"	1b,%1,%2,%3\n"			\
+			".previous"					\
+			: : "r" ((long)(x)),				\
+			    "i" (__LINE__ + BUG_WARNING_TRAP),		\
+			    "i" (__FILE__), "i" (__FUNCTION__));	\
+	}								\
 } while (0)
 
 #define HAVE_ARCH_BUG


From sfr at canb.auug.org.au  Tue Feb  7 17:40:17 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 7 Feb 2006 17:40:17 +1100
Subject: [PATCH] compat: add compat functions for *at syscalls
In-Reply-To: <20060206.160140.59716704.davem@davemloft.net>
References: <20060207105631.39a1080c.sfr@canb.auug.org.au>
	<20060206.160140.59716704.davem@davemloft.net>
Message-ID: <20060207174017.5e3b0ce0.sfr@canb.auug.org.au>

On Mon, 06 Feb 2006 16:01:40 -0800 (PST) "David S. Miller" <davem at davemloft.net> wrote:
>
> From: Stephen Rothwell <sfr at canb.auug.org.au>
> Date: Tue, 7 Feb 2006 10:56:31 +1100
> 
> > This adds compat version of all the remaining *at syscalls
> > so that the "dfd" arguments can be properly sign extended.
> > 
> > Signed-off-by: Stephen Rothwell <sfr at canb.auug.org.au>
> 
> I do the sign extension with tiny stubs in arch/sparc64/kernel/sys32.S
> so that the arg frobbing does not consume a stack frame, which is what
> happens if you do this in C code.
> 
> We need to revisit this at some point and make a way for all
> compat platforms to do this with a portable table of some kind
> that expands a bunch of macros defined by the platform.

How about the following (modifying Linus' suggestion and copying what
sparc64 already does)?

The assumption is that all arguments have been zero extended by the compat
syscall entry code, so we just sign extend those that need it.
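
To illustrate why "dfd" needs this, here is a minimal C sketch of the two
extensions. SKETCH_AT_FDCWD mirrors the value of AT_FDCWD (-100); the
helper names zero_extend32 and sign_extend32 are hypothetical and only
model what the entry code and the wrappers do to a register. The narrowing
cast relies on the usual two's complement behaviour of GCC and Clang.

```c
#include <stdint.h>

/* Same value as AT_FDCWD: "relative to the current directory". */
#define SKETCH_AT_FDCWD (-100)

/* Models the compat entry code: the upper 32 bits are cleared. */
static uint64_t zero_extend32(uint32_t reg)
{
	return (uint64_t)reg;
}

/* Models a wrapper (extsw on powerpc, sra on sparc64): bit 31 is
 * copied into the upper 32 bits. */
static int64_t sign_extend32(uint64_t reg)
{
	return (int64_t)(int32_t)(uint32_t)reg;
}
```

Without the second step a 64-bit sys_mkdirat would see 4294967196 rather
than -100 and would not recognise the value as AT_FDCWD.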

I am not sure of the sparc64 code below, s390 doesn't seem to follow our
"all arguments are zero extended" assumption and x86_64 may not need any
of these wrappers anyway.

It may be that we would be better following Linus's suggestion of
generating stubs for all of the compat syscalls.

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

Subject: [PATCH] compat: introduce kernel/compat_wrapper.S

and the necessary compat_wrapper.h with implementations
for powerpc and sparc64.

compat_wrapper.S builds wrappers for those syscalls that
require sign extension for some of their arguments.

Signed-off-by: Stephen Rothwell <sfr at canb.auug.org.au>

---

 arch/sparc64/kernel/systbls.S        |    6 +++---
 include/asm-ia64/compat_wrapper.h    |   15 +++++++++++++++
 include/asm-mips/compat_wrapper.h    |   15 +++++++++++++++
 include/asm-parisc/compat_wrapper.h  |   15 +++++++++++++++
 include/asm-powerpc/compat_wrapper.h |   28 ++++++++++++++++++++++++++++
 include/asm-s390/compat_wrapper.h    |   15 +++++++++++++++
 include/asm-sparc64/compat_wrapper.h |   33 +++++++++++++++++++++++++++++++++
 include/asm-x86_64/compat_wrapper.h  |   15 +++++++++++++++
 include/linux/compat.h               |   22 ++++++++++++++++++++++
 kernel/Makefile                      |    2 +-
 kernel/compat_wrapper.S              |   18 ++++++++++++++++++
 11 files changed, 180 insertions(+), 4 deletions(-)
 create mode 100644 include/asm-ia64/compat_wrapper.h
 create mode 100644 include/asm-mips/compat_wrapper.h
 create mode 100644 include/asm-parisc/compat_wrapper.h
 create mode 100644 include/asm-powerpc/compat_wrapper.h
 create mode 100644 include/asm-s390/compat_wrapper.h
 create mode 100644 include/asm-sparc64/compat_wrapper.h
 create mode 100644 include/asm-x86_64/compat_wrapper.h
 create mode 100644 kernel/compat_wrapper.S

1cffeae9ae628af849952cf90fbfca1d98befb97
diff --git a/arch/sparc64/kernel/systbls.S b/arch/sparc64/kernel/systbls.S
index 2881faf..a2cc631 100644
--- a/arch/sparc64/kernel/systbls.S
+++ b/arch/sparc64/kernel/systbls.S
@@ -77,9 +77,9 @@ sys_call_table32:
 /*270*/	.word sys32_io_submit, sys_io_cancel, compat_sys_io_getevents, sys32_mq_open, sys_mq_unlink
 	.word compat_sys_mq_timedsend, compat_sys_mq_timedreceive, compat_sys_mq_notify, compat_sys_mq_getsetattr, compat_sys_waitid
 /*280*/	.word sys_ni_syscall, sys_add_key, sys_request_key, sys_keyctl, compat_sys_openat
-	.word sys_mkdirat, sys_mknodat, sys_fchownat, compat_sys_futimesat, compat_sys_newfstatat
-/*285*/	.word sys_unlinkat, sys_renameat, sys_linkat, sys_symlinkat, sys_readlinkat
-	.word sys_fchmodat, sys_faccessat, compat_sys_pselect6, compat_sys_ppoll
+	.word compat_sys_mkdirat, compat_sys_mknodat, compat_sys_fchownat, compat_sys_futimesat, compat_sys_newfstatat
+/*285*/	.word compat_sys_unlinkat, compat_sys_renameat, compat_sys_linkat, compat_sys_symlinkat, compat_sys_readlinkat
+	.word compat_sys_fchmodat, compat_sys_faccessat, compat_sys_pselect6, compat_sys_ppoll
 
 #endif /* CONFIG_COMPAT */
 
diff --git a/include/asm-ia64/compat_wrapper.h b/include/asm-ia64/compat_wrapper.h
new file mode 100644
index 0000000..f82befc
--- /dev/null
+++ b/include/asm-ia64/compat_wrapper.h
@@ -0,0 +1,15 @@
+/*
+ * Definitions used to generate the sign extending stubs
+ * for compat syscalls
+ */
+
+#define ARG1
+#define ARG2
+#define ARG3
+#define ARG4
+#define ARG5
+#define ARG6
+
+#define compat_fn1(fn, arg)
+
+#define compat_fn2(fn, arg1, arg2)
diff --git a/include/asm-mips/compat_wrapper.h b/include/asm-mips/compat_wrapper.h
new file mode 100644
index 0000000..f82befc
--- /dev/null
+++ b/include/asm-mips/compat_wrapper.h
@@ -0,0 +1,15 @@
+/*
+ * Definitions used to generate the sign extending stubs
+ * for compat syscalls
+ */
+
+#define ARG1
+#define ARG2
+#define ARG3
+#define ARG4
+#define ARG5
+#define ARG6
+
+#define compat_fn1(fn, arg)
+
+#define compat_fn2(fn, arg1, arg2)
diff --git a/include/asm-parisc/compat_wrapper.h b/include/asm-parisc/compat_wrapper.h
new file mode 100644
index 0000000..f82befc
--- /dev/null
+++ b/include/asm-parisc/compat_wrapper.h
@@ -0,0 +1,15 @@
+/*
+ * Definitions used to generate the sign extending stubs
+ * for compat syscalls
+ */
+
+#define ARG1
+#define ARG2
+#define ARG3
+#define ARG4
+#define ARG5
+#define ARG6
+
+#define compat_fn1(fn, arg)
+
+#define compat_fn2(fn, arg1, arg2)
diff --git a/include/asm-powerpc/compat_wrapper.h b/include/asm-powerpc/compat_wrapper.h
new file mode 100644
index 0000000..9bc0669
--- /dev/null
+++ b/include/asm-powerpc/compat_wrapper.h
@@ -0,0 +1,28 @@
+/*
+ * Definitions used to generate the sign extending stubs
+ * for compat syscalls
+ *
+ * Copyright (C) 2006 Stephen Rothwell, IBM Corp
+ */
+
+#define ARG1	%r3
+#define ARG2	%r4
+#define ARG3	%r5
+#define ARG4	%r6
+#define ARG5	%r7
+#define ARG6	%r8
+
+#define compat_fn1(fn, arg)		\
+	.text;				\
+	.global	.compat_sys_ ## fn;	\
+.compat_sys_ ## fn:			\
+	extsw	arg, arg;		\
+	b	.sys_ ## fn
+
+#define compat_fn2(fn, arg1, arg2)	\
+	.text;				\
+	.global	.compat_sys_ ## fn;	\
+.compat_sys_ ## fn:			\
+	extsw	arg1, arg1;		\
+	extsw	arg2, arg2;		\
+	b	.sys_ ## fn
diff --git a/include/asm-s390/compat_wrapper.h b/include/asm-s390/compat_wrapper.h
new file mode 100644
index 0000000..f82befc
--- /dev/null
+++ b/include/asm-s390/compat_wrapper.h
@@ -0,0 +1,15 @@
+/*
+ * Definitions used to generate the sign extending stubs
+ * for compat syscalls
+ */
+
+#define ARG1
+#define ARG2
+#define ARG3
+#define ARG4
+#define ARG5
+#define ARG6
+
+#define compat_fn1(fn, arg)
+
+#define compat_fn2(fn, arg1, arg2)
diff --git a/include/asm-sparc64/compat_wrapper.h b/include/asm-sparc64/compat_wrapper.h
new file mode 100644
index 0000000..42afb2c
--- /dev/null
+++ b/include/asm-sparc64/compat_wrapper.h
@@ -0,0 +1,33 @@
+/*
+ * Definitions used to generate the sign extending stubs
+ * for compat syscalls
+ *
+ * Copyright (C) 2006 Stephen Rothwell, IBM Corp
+ * Based on arch/sparc64/kernel/sys32.S
+ */
+
+#define ARG1	%o0
+#define ARG2	%o1
+#define ARG3	%o2
+#define ARG4	%o3
+#define ARG5	%o4
+#define ARG6	%o5
+
+#define compat_fn1(fn, arg)			\
+	.text;					\
+	.align	32;				\
+	.globl	compat_sys_ ## fn;		\
+compat_sys_ ## fn:				\
+	sethi	%hi(sys_ ## fn), %g1;		\
+	jmpl	%g1 + %lo(sys_ ## fn), %g0;	\
+	sra	arg, 0, arg
+
+#define compat_fn2(fn, arg1, arg2)		\
+	.text;					\
+	.align	32;				\
+	.globl	compat_sys_ ## fn;		\
+compat_sys_ ## fn:				\
+	sethi	%hi(sys_ ## fn), %g1;		\
+	sra	arg1, 0, arg1;			\
+	jmpl	%g1 + %lo(sys_ ## fn), %g0;	\
+	sra	arg2, 0, arg2
diff --git a/include/asm-x86_64/compat_wrapper.h b/include/asm-x86_64/compat_wrapper.h
new file mode 100644
index 0000000..f82befc
--- /dev/null
+++ b/include/asm-x86_64/compat_wrapper.h
@@ -0,0 +1,15 @@
+/*
+ * Definitions used to generate the sign extending stubs
+ * for compat syscalls
+ */
+
+#define ARG1
+#define ARG2
+#define ARG3
+#define ARG4
+#define ARG5
+#define ARG6
+
+#define compat_fn1(fn, arg)
+
+#define compat_fn2(fn, arg1, arg2)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 2d7e7f1..b501201 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -168,6 +168,28 @@ asmlinkage long compat_sys_newfstatat(un
 				      int flag);
 asmlinkage long compat_sys_openat(unsigned int dfd, const char __user *filename,
 				   int flags, int mode);
+asmlinkage long compat_sys_mkdirat(unsigned int dfd,
+		const char __user * pathname, int mode);
+asmlinkage long compat_sys_mknodat(unsigned int dfd,
+		const char __user *filename, int mode, unsigned dev);
+asmlinkage long compat_sys_fchownat(unsigned int dfd,
+		const char __user *filename, uid_t user, gid_t group, int flag);
+asmlinkage long compat_sys_unlinkat(unsigned int dfd,
+		const char __user *pathname, int flag);
+asmlinkage long compat_sys_renameat(unsigned int olddfd,
+		const char __user *oldname, unsigned int newdfd,
+		const char __user *newname);
+asmlinkage long compat_sys_linkat(unsigned int olddfd,
+		const char __user *oldname, unsigned int newdfd,
+		const char __user *newname);
+asmlinkage long compat_sys_symlinkat(const char __user *oldname,
+		unsigned int newdfd, const char __user *newname);
+asmlinkage long compat_sys_readlinkat(unsigned int dfd,
+		const char __user *path, char __user *buf, int bufsiz);
+asmlinkage long compat_sys_fchmodat(unsigned int dfd,
+		const char __user *filename, mode_t mode);
+asmlinkage long compat_sys_faccessat(unsigned int dfd,
+		const char __user *filename, int mode);
 
 #endif /* CONFIG_COMPAT */
 #endif /* _LINUX_COMPAT_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index 4ae0fbd..a0679c4 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -22,7 +22,7 @@ obj-$(CONFIG_KALLSYMS) += kallsyms.o
 obj-$(CONFIG_PM) += power/
 obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
 obj-$(CONFIG_KEXEC) += kexec.o
-obj-$(CONFIG_COMPAT) += compat.o
+obj-$(CONFIG_COMPAT) += compat.o compat_wrapper.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_IKCONFIG) += configs.o
 obj-$(CONFIG_STOP_MACHINE) += stop_machine.o
diff --git a/kernel/compat_wrapper.S b/kernel/compat_wrapper.S
new file mode 100644
index 0000000..da009eb
--- /dev/null
+++ b/kernel/compat_wrapper.S
@@ -0,0 +1,18 @@
+/*
+ * Copyright (C) 2006 Stephen Rothwell, IBM Corp
+ *
+ * this file will generate compat_ wrapper functions for
+ * syscalls that need sign extension for some of their arguments
+ */
+#include <asm/compat_wrapper.h>
+
+compat_fn1(mkdirat, ARG1)
+compat_fn1(mknodat, ARG1)
+compat_fn1(fchownat, ARG1)
+compat_fn1(unlinkat, ARG1)
+compat_fn2(renameat, ARG1, ARG3)
+compat_fn2(linkat, ARG1, ARG3)
+compat_fn1(symlinkat, ARG2)
+compat_fn1(readlinkat, ARG1)
+compat_fn1(fchmodat, ARG1)
+compat_fn1(faccessat, ARG1)
-- 
1.1.5

From paulus at samba.org  Tue Feb  7 19:47:39 2006
From: paulus at samba.org (Paul Mackerras)
Date: Tue, 7 Feb 2006 19:47:39 +1100
Subject: [PATCH] Fix in free initrd when overlapped with crashkernel region
In-Reply-To: <43E818EB.7010003@us.ibm.com>
References: <43E818EB.7010003@us.ibm.com>
Message-ID: <17384.24235.960221.979322@cargo.ozlabs.ibm.com>

Haren Myneni writes:

> --- 2616-rc2.orig/include/linux/kexec.h	2006-02-06 19:08:01.000000000 -0800
> +++ 2616-rc2/include/linux/kexec.h	2006-02-06 19:06:37.000000000 -0800
> @@ -6,6 +6,7 @@
>  #include <linux/list.h>
>  #include <linux/linkage.h>
>  #include <linux/compat.h>
> +#include <linux/ioport.h>
>  #include <asm/kexec.h>

What's this hunk for?

Paul.


From hch at lst.de  Tue Feb  7 21:56:43 2006
From: hch at lst.de (Christoph Hellwig)
Date: Tue, 7 Feb 2006 11:56:43 +0100
Subject: merge these lists?
In-Reply-To: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
Message-ID: <20060207105643.GA22234@lst.de>

On Tue, Feb 07, 2006 at 02:41:39PM +1100, Paul Mackerras wrote:
> A lot of messages seem to get cross-posted to both linuxppc-dev and
> linuxppc64-dev these days, since we are all working in the one tree.
> Rather than having to cross-post, I propose that we create a single
> powerpc-dev at ozlabs.org list to replace linuxppc-dev and
> linuxppc64-dev.  (The linuxppc-embedded list would continue as at
> present.)

Why not just kill linuxppc64-dev and leave linuxppc-dev?  It's probably not
worth removing the well-known and widely used address just for the
sake of it.


From galak at kernel.crashing.org  Wed Feb  8 01:35:23 2006
From: galak at kernel.crashing.org (Kumar Gala)
Date: Tue, 7 Feb 2006 08:35:23 -0600
Subject: merge these lists?
In-Reply-To: <20060207105643.GA22234@lst.de>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
	<20060207105643.GA22234@lst.de>
Message-ID: <B21A08E2-3D64-45B8-8319-872B1C1D31BC@kernel.crashing.org>


On Feb 7, 2006, at 4:56 AM, Christoph Hellwig wrote:

> On Tue, Feb 07, 2006 at 02:41:39PM +1100, Paul Mackerras wrote:
>> A lot of messages seem to get cross-posted to both linuxppc-dev and
>> linuxppc64-dev these days, since we are all working in the one tree.
>> Rather than having to cross-post, I propose that we create a single
>> powerpc-dev at ozlabs.org list to replace linuxppc-dev and
>> linuxppc64-dev.  (The linuxppc-embedded list would continue as at
>> present.)
>
> Why not just kill linuxppc64-dev and leave linuxppc-dev?  It's probably not
> worth removing the well-known and widely used address just for the
> sake of it.

I agree.  Let's just kill linuxppc64-dev and direct people at  
linuxppc-dev.

- kumar


From galak at kernel.crashing.org  Wed Feb  8 01:36:35 2006
From: galak at kernel.crashing.org (Kumar Gala)
Date: Tue, 7 Feb 2006 08:36:35 -0600
Subject: merge these lists?
In-Reply-To: <200602071445.45805.jk@ozlabs.org>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
	<200602071445.45805.jk@ozlabs.org>
Message-ID: <6FED97E8-FD6B-4CEC-983F-5CA149A50E79@kernel.crashing.org>


On Feb 6, 2006, at 9:45 PM, Jeremy Kerr wrote:

> Paul,
>
>> If we do this, we would set up the new list with the union of the
>> subscribers of the old lists, and make emails sent to linuxppc-dev
>> and linuxppc64-dev go to the new list, so it should be painless.
>>
>> Thoughts? Comments? Objections?
>
> How about the patchwork lists? Should I look at merging those too?

Hmm, how about a merged patchwork starting after 2.6.16?

- kumar


From linas at austin.ibm.com  Wed Feb  8 03:43:05 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 7 Feb 2006 10:43:05 -0600
Subject: merge these lists?
In-Reply-To: <B21A08E2-3D64-45B8-8319-872B1C1D31BC@kernel.crashing.org>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
	<20060207105643.GA22234@lst.de>
	<B21A08E2-3D64-45B8-8319-872B1C1D31BC@kernel.crashing.org>
Message-ID: <20060207164305.GI24916@austin.ibm.com>

On Tue, Feb 07, 2006 at 08:35:23AM -0600, Kumar Gala was heard to remark:
> 
> I agree.  Let's just kill linuxppc64-dev and direct people at  
> linuxppc-dev.

Can a sysadmin merge the subscription lists?
I didn't even know that there was a linuxppc-dev list; 
I thought the merge of these two lists occurred a year ago,
when it was moved to ozlabs :-/

--linas


From jschopp at austin.ibm.com  Wed Feb  8 03:45:38 2006
From: jschopp at austin.ibm.com (Joel Schopp)
Date: Tue, 07 Feb 2006 10:45:38 -0600
Subject: merge these lists?
In-Reply-To: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
Message-ID: <43E8CEB2.4020009@austin.ibm.com>

> A lot of messages seem to get cross-posted to both linuxppc-dev and
> linuxppc64-dev these days, since we are all working in the one tree.
> Rather than having to cross-post, I propose that we create a single
> powerpc-dev at ozlabs.org list to replace linuxppc-dev and
> linuxppc64-dev.  (The linuxppc-embedded list would continue as at
> present.)
> 
> If we do this, we would set up the new list with the union of the
> subscribers of the old lists, and make emails sent to linuxppc-dev and
> linuxppc64-dev go to the new list, so it should be painless.
> 
> Thoughts? Comments? Objections?

Marvelous idea.  I like the new name too, matches the kernel tree and 
all that.  A few points to make sure we get right, most of which have 
already been mentioned:

-New archives should have links to the old archives.
-Old list addresses should automagically send to new list.
-The subscribers to both lists should automagically be subscribed to the 
new list.


From jschopp at austin.ibm.com  Wed Feb  8 04:06:25 2006
From: jschopp at austin.ibm.com (Joel Schopp)
Date: Tue, 07 Feb 2006 11:06:25 -0600
Subject: [PATCH] avoid timer interrupt replay effect when onlining cpu
In-Reply-To: <20060207044422.GI18730@localhost.localdomain>
References: <20060207044422.GI18730@localhost.localdomain>
Message-ID: <43E8D391.2090208@austin.ibm.com>

> Signed-off-by: Nathan Lynch <ntl at pobox.com>
> 
> 
> --- powerpc-timer_interrupt-replay.orig/arch/powerpc/kernel/smp.c
> +++ powerpc-timer_interrupt-replay/arch/powerpc/kernel/smp.c
> @@ -540,6 +540,9 @@ int __devinit start_secondary(void *unus
>  	if (smp_ops->take_timebase)
>  		smp_ops->take_timebase();
>  
> +	if (system_state > SYSTEM_BOOTING)
> +		per_cpu(last_jiffy, cpu) = get_tb();
> +
>  	spin_lock(&call_lock);
>  	cpu_set(cpu, cpu_online_map);
>  	spin_unlock(&call_lock);

Yep, this bug has been seen in SUSE & Redhat distro kernels and this 
patch fixes it.
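
For anyone wondering where the replay comes from, here is a minimal sketch.
It assumes the timer interrupt handler catches up by accounting one jiffy
per tb_ticks_per_jiffy elapsed since the per-cpu last_jiffy; the function
ticks_to_replay is a hypothetical model of that loop, not the kernel code.

```c
#include <stdint.h>

/*
 * Hypothetical model of the catch-up loop: returns how many jiffies one
 * timer interrupt would account for, given the current timebase value
 * and the last_jiffy recorded for this cpu.
 */
static unsigned long ticks_to_replay(uint64_t tb_now, uint64_t last_jiffy,
				     uint64_t tb_ticks_per_jiffy)
{
	unsigned long ticks = 0;

	while (tb_now - last_jiffy >= tb_ticks_per_jiffy) {
		last_jiffy += tb_ticks_per_jiffy;
		ticks++;
	}
	return ticks;
}
```

With a stale last_jiffy from before the cpu went offline the loop runs once
for every jiffy missed, whereas setting last_jiffy to get_tb() when the cpu
comes online leaves nothing to replay.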

While we are here, is there any reason we still have 
next_jiffy_update_tb in the paca?  It isn't used anywhere anymore.

Acked-by: Joel Schopp <jschopp at austin.ibm.com>


From bdc at carlstrom.com  Wed Feb  8 06:26:17 2006
From: bdc at carlstrom.com (Brian D. Carlstrom)
Date: Tue, 7 Feb 2006 11:26:17 -0800
Subject: G5 fan problems return moving to 2.6.15 with dual processor
	2.7GHz machine
In-Reply-To: <1139130385.5634.14.camel@localhost.localdomain>
References: <20060205061048.7261.qmail@electricrain.com>
	<1139130385.5634.14.camel@localhost.localdomain>
Message-ID: <17384.62553.442011.514155@zot.electricrain.com>

Benjamin Herrenschmidt writes:
 > Might be something in that prom_init.c fix that broke... it would be
 > really nice if you could give a try with the console and find out what
 > it is ... Unfortunately, I don't have access to one of these machines
 > with the "problem" at the moment...

Well, I added several prom_printf calls to prom_init.c's
fixup_device_tree routine. I assumed I would spot these scrolling by
during boot before what appears to be the video mode switch. However, I
didn't see anything, but I wasn't sure if it wasn't just going by too
fast.

I tried using PROM_BUG to halt the output, but that just resulted in
returning to an OpenFirmware prompt, although with a white background
instead of the usual black background when I go there from yaboot with
'o'.

I also tried putting a "while (1) ;" after one of my prom_printf, in
case the illegal instruction used by PROM_BUG was causing the output to
be lost, since it was clearing the screen to display the OpenFirmware
prompt. However then I just got a pure white screen. So clearly in both
cases it was running my changed code, but I see no output.

I tried reviewing some OpenFirmware doc, looking at their talk of
debugging via serial and telnet, but that all seemed to be a dead end,
although I learned much more about the device tree. :)

Clearly I could theoretically debug by moving the while(1); around to
see what branches are being taken, but since I'm away from the machines
today, I figured I'd ask how I'm expected to use prom_printf, before
returning to debugging tomorrow. Sorry my lack of ppc experience is
showing.

-bri


From geoffrey.levand at am.sony.com  Wed Feb  8 07:37:32 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Tue, 07 Feb 2006 12:37:32 -0800
Subject: __setup_cpu_be problem
Message-ID: <43E9050C.2000300@am.sony.com>

Arnd,

It seems HID6 is a hypervisor resource...  Can we just have 
'.cpu_setup = __setup_cpu_power4', and you set up your
page sizes somewhere else?

-Geoff

struct cpu_spec	cpu_specs[] = {
	{	/* Cell Broadband Engine */
		.cpu_setup		= __setup_cpu_be,
	},

_GLOBAL(__setup_cpu_be)
        /* Set large page sizes LP=0: 16MB, LP=1: 64KB */
        addi    r3, 0,  0
        ori     r3, r3, HID6_LB
        sldi    r3, r3, 32
        nor     r3, r3, r3
        mfspr   r4, SPRN_HID6
        and     r4, r4, r3
        addi    r3, 0, 0x02000
        sldi    r3, r3, 32
        or      r4, r4, r3
        mtspr   SPRN_HID6, r4
	blr
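
For reference, the register arithmetic above can be written as a minimal C
sketch. The function name sketch_hid6_update is hypothetical, and the
HID6_LB field mask is passed in as a parameter because its value is not
shown here; the sketch only models the value written back, not the
mfspr/mtspr accesses themselves.

```c
#include <stdint.h>

/*
 * Models what __setup_cpu_be computes:
 *   mask = ~(HID6_LB << 32)        (addi, ori, sldi, nor)
 *   new  = (HID6 & mask)           (mfspr, and)
 *          | (0x02000 << 32)       (addi, sldi, or)
 */
static uint64_t sketch_hid6_update(uint64_t hid6, uint16_t hid6_lb)
{
	uint64_t mask = ~((uint64_t)hid6_lb << 32);
	uint64_t set = (uint64_t)0x02000 << 32;

	return (hid6 & mask) | set;
}
```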


From heiko.carstens at de.ibm.com  Tue Feb  7 20:31:54 2006
From: heiko.carstens at de.ibm.com (Heiko Carstens)
Date: Tue, 7 Feb 2006 10:31:54 +0100
Subject: [PATCH] compat: add compat functions for *at syscalls
In-Reply-To: <20060207174017.5e3b0ce0.sfr@canb.auug.org.au>
References: <20060207105631.39a1080c.sfr@canb.auug.org.au>
	<20060206.160140.59716704.davem@davemloft.net>
	<20060207174017.5e3b0ce0.sfr@canb.auug.org.au>
Message-ID: <20060207093154.GA9311@osiris.boeblingen.de.ibm.com>

> How about the following (modifying Linus' suggestion and copying what
> sparc64 already does)?
> 
> The assumption is that all arguments have been zero extended by the compat
> syscall entry code, so we just sign extend those that need it.
> 
> I am not sure of the sparc64 code below, s390 doesn't seem to follow our
> "all arguments are zero extended" assumption and x86_64 may not need any
> of these wrappers anyway.

On s390 we already do sign extension for int/long and zero extension for
the unsigned parameters, though I wasn't aware that we should do zero
extension for _all_ parameters of the compat system calls, regardless of
their type.
In addition we must do pointer conversion to 64 bit, since pointers from
compat tasks have the most significant bit set (to distinguish between 24-
and 31-bit addressing mode).

Therefore I think Linus' suggestion with having something like

	compat_fn6(sys_waitif, SARG, UARG, UARG, SARG, UARG);

would be better. Just that we would need something for pointers as well.
And to make things just a bit more complicated: only the first five
parameters are in registers. Number six and the following are already on
the stack. E.g. the compat wrapper for the futex syscall would need extra
assembly code to do conversion on the stack.

Maybe having defines like SARG1..SARG6 that would define assembly code
instead of the register would do the job.
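
As a rough C model of the three per-argument conversions discussed above
(the real thing would be assembly generated from the defines), consider the
sketch below. The names sketch_sarg, sketch_uarg and sketch_parg are
hypothetical, and the 0x7fffffff pointer mask is an assumption about what
"pointer conversion" means for 31-bit tasks.

```c
#include <stdint.h>

/* Signed argument: copy bit 31 into the upper 32 bits. */
static int64_t sketch_sarg(uint64_t reg)
{
	return (int64_t)(int32_t)(uint32_t)reg;
}

/* Unsigned argument: clear the upper 32 bits. */
static uint64_t sketch_uarg(uint64_t reg)
{
	return reg & 0xffffffffULL;
}

/* Pointer argument (assumed 31-bit): also clear the addressing-mode bit. */
static uint64_t sketch_parg(uint64_t reg)
{
	return reg & 0x7fffffffULL;
}
```

An argument that is already on the stack would need the same conversion
applied to a load/store pair rather than to a register.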


Thanks,
Heiko


From heiko.carstens at de.ibm.com  Wed Feb  8 00:29:49 2006
From: heiko.carstens at de.ibm.com (Heiko Carstens)
Date: Tue, 7 Feb 2006 14:29:49 +0100
Subject: [PATCH] compat: add compat functions for *at syscalls
In-Reply-To: <20060207093154.GA9311@osiris.boeblingen.de.ibm.com>
References: <20060207105631.39a1080c.sfr@canb.auug.org.au>
	<20060206.160140.59716704.davem@davemloft.net>
	<20060207174017.5e3b0ce0.sfr@canb.auug.org.au>
	<20060207093154.GA9311@osiris.boeblingen.de.ibm.com>
Message-ID: <20060207132949.GB9311@osiris.boeblingen.de.ibm.com>

> Therefore I think Linus' suggestion with having something like
> 
> 	compat_fn6(sys_waitif, SARG, UARG, UARG, SARG, UARG);
> 
> would be better. Just that we would need something for pointers as well.
> And to make things just a bit more complicated: only the first five
> parameters are in registers. Number six and the following are already on
> the stack. E.g. the compat wrapper for the futex syscall would need extra
> assembly code to do conversion on the stack.
> 
> Maybe having defines like SARG1..SARG6 that would define assembly code
> instead of the register would do the job.

Ah, stupid me... the SARG define defines assembly code of course. Just
that we would need different defines for arguments that are in registers
or on the stack. Is s390 the only architecture that has argument six on
the stack?

Heiko


From greg at kroah.com  Wed Feb  8 09:21:44 2006
From: greg at kroah.com (Greg KH)
Date: Tue, 7 Feb 2006 14:21:44 -0800
Subject: [PATCH]: Documentation: Updated PCI Error Recovery
In-Reply-To: <20060203000602.GQ24916@austin.ibm.com>
References: <20060203000602.GQ24916@austin.ibm.com>
Message-ID: <20060207222144.GA15622@kroah.com>

On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote:
> 
> I'm not sure who I'm addressing this patch to: Linus, maybe?
> 
> Please apply. Fingers crossed, I hope this may make it into 2.6.16.

This does not apply to the current tree, what kernel did you do it
against?

Care to respin it against the latest -git release?

thanks,

greg k-h


From akpm at osdl.org  Wed Feb  8 09:30:52 2006
From: akpm at osdl.org (Andrew Morton)
Date: Tue, 7 Feb 2006 14:30:52 -0800
Subject: [PATCH]: Documentation: Updated PCI Error Recovery
In-Reply-To: <20060207222144.GA15622@kroah.com>
References: <20060203000602.GQ24916@austin.ibm.com>
	<20060207222144.GA15622@kroah.com>
Message-ID: <20060207143052.19978ca7.akpm@osdl.org>

Greg KH <greg at kroah.com> wrote:
>
> On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote:
> > 
> > I'm not sure who I'm addressing this patch to: Linus, maybe?
> > 
> > Please apply. Fingers crossed, I hope this may make it into 2.6.16.
> 
> This does not apply to the current tree, what kernel did you do it
> against?
> 
> Care to respin it against the latest -git release?
> 

err, I already merged it.  Saw "documentation" and leapt on it ;)


From greg at kroah.com  Wed Feb  8 09:39:56 2006
From: greg at kroah.com (Greg KH)
Date: Tue, 7 Feb 2006 14:39:56 -0800
Subject: [PATCH]: Documentation: Updated PCI Error Recovery
In-Reply-To: <20060207143052.19978ca7.akpm@osdl.org>
References: <20060203000602.GQ24916@austin.ibm.com>
	<20060207222144.GA15622@kroah.com>
	<20060207143052.19978ca7.akpm@osdl.org>
Message-ID: <20060207223956.GA19009@kroah.com>

On Tue, Feb 07, 2006 at 02:30:52PM -0800, Andrew Morton wrote:
> Greg KH <greg at kroah.com> wrote:
> >
> > On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote:
> > > 
> > > I'm not sure who I'm addressing this patch to: Linus, maybe?
> > > 
> > > Please apply. Fingers crossed, I hope this may make it into 2.6.16.
> > 
> > This does not apply to the current tree, what kernel did you do it
> > against?
> > 
> > Care to respin it against the latest -git release?
> > 
> 
> err, I already merged it.  Saw "documentation" and leapt on it ;)

Ah, nevermind then...  For some reason patch didn't say it looked like
it had already been applied, otherwise I would have caught that...

thanks,

greg k-h


From slpratt at austin.ibm.com  Wed Feb  8 09:45:48 2006
From: slpratt at austin.ibm.com (Steven Pratt)
Date: Tue, 07 Feb 2006 16:45:48 -0600
Subject: make install fails
Message-ID: <43E9231C.4000004@austin.ibm.com>

Hey, does anyone know why you can no longer do "make install" on ppc 
kernels on recent releases?

Steve


From akpm at osdl.org  Wed Feb  8 09:53:47 2006
From: akpm at osdl.org (Andrew Morton)
Date: Tue, 7 Feb 2006 14:53:47 -0800
Subject: [PATCH]: Documentation: Updated PCI Error Recovery
In-Reply-To: <20060207223956.GA19009@kroah.com>
References: <20060203000602.GQ24916@austin.ibm.com>
	<20060207222144.GA15622@kroah.com>
	<20060207143052.19978ca7.akpm@osdl.org>
	<20060207223956.GA19009@kroah.com>
Message-ID: <20060207145347.72c0a77e.akpm@osdl.org>

Greg KH <greg at kroah.com> wrote:
>
> On Tue, Feb 07, 2006 at 02:30:52PM -0800, Andrew Morton wrote:
> > Greg KH <greg at kroah.com> wrote:
> > >
> > > On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote:
> > > > 
> > > > I'm not sure who I'm addressing this patch to: Linus, maybe?
> > > > 
> > > > Please apply. Fingers crossed, I hope this may make it into 2.6.16.
> > > 
> > > This does not apply to the current tree, what kernel did you do it
> > > against?
> > > 
> > > Care to respin it against the latest -git release?
> > > 
> > 
> > err, I already merged it.  Saw "documentation" and leapt on it ;)
> 
> Ah, nevermind then...  For some reason patch didn't say it looked like
> it had already been applied, otherwise I would have caught that...
> 

It could be all the newly-added trailing whitespace I chopped off.
`patch -p1 -R -l --dry-run'.


From haren at us.ibm.com  Wed Feb  8 10:01:30 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Tue, 07 Feb 2006 15:01:30 -0800
Subject: [PATCH] Fix in free initrd when overlapped with crashkernel region
In-Reply-To: <17384.24235.960221.979322@cargo.ozlabs.ibm.com>
References: <43E818EB.7010003@us.ibm.com>
	<17384.24235.960221.979322@cargo.ozlabs.ibm.com>
Message-ID: <43E926CA.3000601@us.ibm.com>

Paul Mackerras wrote:

>Haren Myneni writes:
>
>  
>
>>--- 2616-rc2.orig/include/linux/kexec.h	2006-02-06 19:08:01.000000000 -0800
>>+++ 2616-rc2/include/linux/kexec.h	2006-02-06 19:06:37.000000000 -0800
>>@@ -6,6 +6,7 @@
>> #include <linux/list.h>
>> #include <linux/linkage.h>
>> #include <linux/compat.h>
>>+#include <linux/ioport.h>
>> #include <asm/kexec.h>
>>    
>>
>
>What's this hunk for?
>
>Paul.
>  
>

crashk_res is declared extern in kexec.h with type "struct resource",
which is defined in linux/ioport.h.
Everywhere else this variable is used, ioport.h gets included through
some other header file.  For initramfs.c, however, we need either to
include ioport.h explicitly or to include it in kexec.h. I chose the
latter. A comment would probably make this clearer.

Paul, would you prefer that I repost the patch with the comment?

Thanks
Haren



From geoffrey.levand at am.sony.com  Wed Feb  8 10:10:47 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Tue, 07 Feb 2006 15:10:47 -0800
Subject: [PATCH] fix prom_init undefined error
Message-ID: <43E928F7.1050307@am.sony.com>

Paul,

This patch fixes a build error when CONFIG_PPC_OF=n,
CONFIG_PPC_MULTIPLATFORM=y.  It makes the conditionals
consistent in arch/powerpc/kernel/Makefile and head_64.S
to both be on CONFIG_PPC_OF.

  arch/powerpc/kernel/head_64.o: In function `.__boot_from_prom':
  linux/arch/powerpc/kernel/head_64.S:(.text+0x8158): undefined reference to `.prom_init'

obj-$(CONFIG_PPC_OF) += prom_init.o


Signed-off-by: Geoff Levand <geoffrey.levand at am.sony.com>

--

--- powerpc.git.orig/arch/powerpc/kernel/head_64.S	2006-02-07 13:18:14.000000000 -0800
+++ powerpc.git/arch/powerpc/kernel/head_64.S	2006-02-07 14:51:15.000000000 -0800
@@ -1515,7 +1515,7 @@
  *
  */
 _GLOBAL(__start_initialization_multiplatform)
-#ifdef CONFIG_PPC_MULTIPLATFORM
+#ifdef CONFIG_PPC_OF
 	/*
 	 * Are we booted from a PROM Of-type client-interface ?
 	 */
@@ -1542,7 +1542,7 @@
 	bl	.__mmu_off
 	b	.__after_prom_start
 
-#ifdef CONFIG_PPC_MULTIPLATFORM
+#ifdef CONFIG_PPC_OF
 _STATIC(__boot_from_prom)
 	/* Save parameters */
 	mr	r31,r3


From greg at kroah.com  Wed Feb  8 10:19:27 2006
From: greg at kroah.com (Greg KH)
Date: Tue, 7 Feb 2006 15:19:27 -0800
Subject: [PATCH]: Documentation: Updated PCI Error Recovery
In-Reply-To: <20060207145347.72c0a77e.akpm@osdl.org>
References: <20060203000602.GQ24916@austin.ibm.com>
	<20060207222144.GA15622@kroah.com>
	<20060207143052.19978ca7.akpm@osdl.org>
	<20060207223956.GA19009@kroah.com>
	<20060207145347.72c0a77e.akpm@osdl.org>
Message-ID: <20060207231927.GB19648@kroah.com>

On Tue, Feb 07, 2006 at 02:53:47PM -0800, Andrew Morton wrote:
> Greg KH <greg at kroah.com> wrote:
> >
> > On Tue, Feb 07, 2006 at 02:30:52PM -0800, Andrew Morton wrote:
> > > Greg KH <greg at kroah.com> wrote:
> > > >
> > > > On Thu, Feb 02, 2006 at 06:06:02PM -0600, Linas Vepstas wrote:
> > > > > 
> > > > > I'm not sure who I'm addressing this patch to: Linus, maybe?
> > > > > 
> > > > > Please apply. Fingers crossed, I hope this may make it into 2.6.16.
> > > > 
> > > > This does not apply to the current tree, what kernel did you do it
> > > > against?
> > > > 
> > > > Care to respin it against the latest -git release?
> > > > 
> > > 
> > > err, I already merged it.  Saw "documentation" and leapt on it ;)
> > 
> > Ah, nevermind then...  For some reason patch didn't say it looked like
> > it had already been applied, otherwise I would have caught that...
> > 
> 
> It could be all the newly-added trailing whitespace I chopped off.
> `patch -p1 -R -l --dry-run'.

Yup, that was it, quilt would have stripped them off for me too.  Linas,
please don't do this anymore...

thanks,

greg k-h


From haren at us.ibm.com  Wed Feb  8 10:47:03 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Tue, 07 Feb 2006 15:47:03 -0800
Subject: [PATCH] Trivial fix to set the proper timeout value for kdump
Message-ID: <43E93177.5020601@us.ibm.com>


The panic CPU waits forever, because of a very large timeout value, if some
CPU does not respond to an IPI.
This patch fixes the issue: the panic CPU now waits at most 10 seconds and
then goes ahead with the kdump boot.
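
The idea of a bounded wait can be sketched in plain C as below. This is not
the attached patch; the demo_ names, the callback and the 1 ms polling step
are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "how many CPUs have answered the IPI so far". */
typedef int (*demo_responders_fn)(int elapsed_ms);

/* Poll once per (simulated) millisecond until every other CPU has answered
 * or the budget runs out.  Returns true if all answered in time; either
 * way the caller then carries on with the kdump boot. */
bool demo_wait_for_cpus(demo_responders_fn responders, int expected,
			int timeout_ms)
{
	int elapsed;

	for (elapsed = 0; elapsed < timeout_ms; elapsed++) {
		if (responders(elapsed) >= expected)
			return true;
		/* a real implementation would delay about 1 ms here */
	}
	return responders(elapsed) >= expected;
}

/* Test doubles: one CPU never answers; all three answer after 5 ms. */
int demo_one_cpu_stuck(int elapsed_ms)
{
	(void)elapsed_ms;
	return 2;
}

int demo_all_answer(int elapsed_ms)
{
	return elapsed_ms >= 5 ? 3 : 0;
}
```

With a 10000 ms budget the stuck case gives up after the budget instead of
spinning forever, and the healthy case returns as soon as all CPUs answer.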

Thanks
Haren

Signed-off-by: Haren Myneni <haren at us.ibm.com>


-------------- next part --------------
A non-text attachment was scrubbed...
Name: kdump-timeout-value-fix.patch
Type: text/x-patch
Size: 616 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060207/2b4e3870/attachment.bin 

From davem at davemloft.net  Wed Feb  8 09:57:25 2006
From: davem at davemloft.net (David S. Miller)
Date: Tue, 07 Feb 2006 14:57:25 -0800 (PST)
Subject: [PATCH] compat: add compat functions for *at syscalls
In-Reply-To: <20060207132949.GB9311@osiris.boeblingen.de.ibm.com>
References: <20060207174017.5e3b0ce0.sfr@canb.auug.org.au>
	<20060207093154.GA9311@osiris.boeblingen.de.ibm.com>
	<20060207132949.GB9311@osiris.boeblingen.de.ibm.com>
Message-ID: <20060207.145725.22157385.davem@davemloft.net>

From: Heiko Carstens <heiko.carstens at de.ibm.com>
Date: Tue, 7 Feb 2006 14:29:49 +0100

> Ah, stupid me... the SARG define defines assembly code of course. Just
> that we would need different defines for arguments that are in registers
> or on the stack. Is s390 the only architecture that has argument six on
> the stack?

If I remember correctly, o32 mips binaries put arg 6 on the stack
too.


From sfr at canb.auug.org.au  Wed Feb  8 11:01:50 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Wed, 8 Feb 2006 11:01:50 +1100
Subject: merge these lists?
In-Reply-To: <20060207164305.GI24916@austin.ibm.com>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
	<20060207105643.GA22234@lst.de>
	<B21A08E2-3D64-45B8-8319-872B1C1D31BC@kernel.crashing.org>
	<20060207164305.GI24916@austin.ibm.com>
Message-ID: <20060208110150.5d9d1936.sfr@canb.auug.org.au>

On Tue, 7 Feb 2006 10:43:05 -0600 linas at austin.ibm.com (Linas Vepstas) wrote:
>
> On Tue, Feb 07, 2006 at 08:35:23AM -0600, Kumar Gala was heard to remark:
> > 
> > I agree.  Let's just kill linuxppc64-dev and direct people at  
> > linuxppc-dev.
> 
> Can a sysadmin merge the subscription lists?

Yes, "a sysadmin" could do that.  However, those who are
subscribed with different addresses on each list will end
up subscribed twice, and those who have changed their preferences on
the abandoned list will have to fix them as well.

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From sfr at canb.auug.org.au  Wed Feb  8 11:07:18 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Wed, 8 Feb 2006 11:07:18 +1100
Subject: Membership stats (Was: Re: merge these lists?)
In-Reply-To: <20060208110150.5d9d1936.sfr@canb.auug.org.au>
References: <17384.5875.790692.127762@cargo.ozlabs.ibm.com>
	<20060207105643.GA22234@lst.de>
	<B21A08E2-3D64-45B8-8319-872B1C1D31BC@kernel.crashing.org>
	<20060207164305.GI24916@austin.ibm.com>
	<20060208110150.5d9d1936.sfr@canb.auug.org.au>
Message-ID: <20060208110718.57e9f9f5.sfr@canb.auug.org.au>

On Wed, 8 Feb 2006 11:01:50 +1100 Stephen Rothwell <sfr at canb.auug.org.au> wrote:
>
> Yes, "a sysadmin" could do that.  However, those who are
> subscribed with different addresses on each list will end
> up subscribed twice, and those who have changed their preferences on
> the abandoned list will have to fix them as well.

Just for interest:

	members of linuxppc-dev		473
	members of linuxppc64-dev	264
	common				 98

But, as I said, "common" above does not count those who have different
addresses subscribed to each list.
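
As a rough check, assuming "common" counts addresses subscribed to both
lists, a merged list would carry this many distinct addresses:

```latex
473 + 264 - 98 = 639
```

The number of distinct people may be lower, for the reason given above.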
-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From benh at kernel.crashing.org  Wed Feb  8 14:56:56 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 08 Feb 2006 14:56:56 +1100
Subject: G5 fan problems return moving to 2.6.15 with dual processor
	2.7GHz machine
In-Reply-To: <17384.62553.442011.514155@zot.electricrain.com>
References: <20060205061048.7261.qmail@electricrain.com>
	<1139130385.5634.14.camel@localhost.localdomain>
	<17384.62553.442011.514155@zot.electricrain.com>
Message-ID: <1139371016.8187.1.camel@localhost.localdomain>

On Tue, 2006-02-07 at 11:26 -0800, Brian D. Carlstrom wrote:
> Benjamin Herrenschmidt writes:
>  > Might be something in that prom_init.c fix that broke... it would be
>  > really nice if you could give a try with the console and find out what
>  > it is ... Unfortunately, I don't have access to one of these machines
>  > with the "problem" at the moment...
> 
> Well, I added several prom_printf calls to prom_init.c's
> fixup_device_tree routine. I assumed I would spot these scrolling by
> during boot before what appears to be the video mode switch. However, I
> didn't see anything, but I wasn't sure if it wasn't just going by too
> fast.
> 
> I tried using PROM_BUG to halt the output, but that just resulted in
> returning to an OpenFirmware prompt, although with a white background
> instead of the usual black background when I go there from yaboot with
> 'o'.
> 
> I also tried putting a "while (1) ;" after one of my prom_printf, in
> case the illegal instruction used by PROM_BUG was causing the output to
> be lost, since it was clearing the screen to display the OpenFirmware
> prompt. However then I just got a pure white screen. So clearly in both
> cases it was running my changed code, but I see no output.
> 
> I tried reviewing some OpenFirmware doc, looking at their talk of
> debugging via serial and telnet, but that all seemed to be a dead end,
> although I learned much more about the device tree. :)
> 
> Clearly I could theoretically debug by moving the while(1); around to
> see what branches are being taken, but since I'm away from the machines
> today, I figured I'd ask how I'm expected to use prom_printf, before
> returning to debugging tomorrow. Sorry my lack of ppc experience is
> showing.

prom_printf should work ... try booting manually (from the OF command
line) and maybe comment out the code that opens the displays... (it may
be clearing the screen).... 

Ben.


From benh at kernel.crashing.org  Wed Feb  8 14:59:21 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 08 Feb 2006 14:59:21 +1100
Subject: [PATCH] fix prom_init undefined error
In-Reply-To: <43E928F7.1050307@am.sony.com>
References: <43E928F7.1050307@am.sony.com>
Message-ID: <1139371162.8187.3.camel@localhost.localdomain>

On Tue, 2006-02-07 at 15:10 -0800, Geoff Levand wrote:
> Paul,
> 
> This patch fixes a build error when CONFIG_PPC_OF=n,
> CONFIG_PPC_MULTIPLATFORM=y.  It makes the conditionals
> consistent in arch/powerpc/kernel/Makefile and head_64.S
> to both be on CONFIG_PPC_OF.
> 
>   arch/powerpc/kernel/head_64.o: In function `.__boot_from_prom':
>   linux/arch/powerpc/kernel/head_64.S:(.text+0x8158): undefined reference to `.prom_init'
> 
> obj-$(CONFIG_PPC_OF) += prom_init.o

With ARCH=powerpc, CONFIG_PPC_OF should always be set. It's supposed to
be set when the device-tree accessors exist, which they always do.

Besides, I'll be removing support for !MULTIPLATFORM too :) (Except for
iSeries at least for a little while). Look at the patch I posted that
removes _machine for an idea of where things are going.

Why do you want CONFIG_PPC_OF not set ?

Ben.


From benh at kernel.crashing.org  Wed Feb  8 16:42:51 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 08 Feb 2006 16:42:51 +1100
Subject: [PATCH] powerpc: Thermal control for dual core G5s
Message-ID: <1139377372.8187.16.camel@localhost.localdomain>

This patch adds a windfarm module, windfarm_pm112, for the dual core G5s
(both 2 and 4 core models), keeping the machine from getting into
vacuum-cleaner mode ;) For proper credits, the patch was initially
written by Paul Mackerras, and slightly reworked by me to add overtemp
handling among others. The patch also removes the sysfs attributes from
windfarm_pm81 and windfarm_pm91 and instead adds code to the windfarm
core to automagically expose attributes for sensors & controls.
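
The automatic attributes rely on embedding an attribute structure in each
control and sensor and recovering the owner with container_of() in a shared
show routine. A user-space sketch of that pattern follows; the demo_ names
are made up and the structures are cut down, so treat it as an illustration
rather than the driver code.

```c
#include <stddef.h>
#include <stdio.h>

/* User-space version of the kernel's container_of(): step back from a
 * member pointer to the structure that embeds it. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_attr {
	const char *name;
};

/* Cut-down control with an embedded attribute. */
struct demo_control {
	const char *name;
	int value;
	struct demo_attr attr;
};

/* One show routine serves every control: it is handed only the attribute
 * and recovers the control that owns it. */
int demo_show_control(struct demo_attr *attr, char *buf, size_t len)
{
	struct demo_control *ct =
		demo_container_of(attr, struct demo_control, attr);

	return snprintf(buf, len, "%d\n", ct->value);
}

/* Registration fills in the embedded attribute from the control itself. */
void demo_register_control(struct demo_control *ct)
{
	ct->attr.name = ct->name;
}
```

Because the attribute lives inside the control, no lookup table is needed
and each driver gets its sysfs file simply by registering the control.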

Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>

Index: linux-work/drivers/macintosh/Kconfig
===================================================================
--- linux-work.orig/drivers/macintosh/Kconfig	2006-01-12 16:33:08.000000000 +1100
+++ linux-work/drivers/macintosh/Kconfig	2006-02-07 13:45:57.000000000 +1100
@@ -187,6 +187,14 @@ config WINDFARM_PM91
 	  This driver provides thermal control for the PowerMac9,1
           which is the recent (SMU based) single CPU desktop G5
 
+config WINDFARM_PM112
+	tristate "Support for thermal management on PowerMac11,2"
+	depends on WINDFARM && I2C && PMAC_SMU
+	select I2C_PMAC_SMU
+	help
+	  This driver provides thermal control for the PowerMac11,2
+	  which are the recent dual and quad G5 machines using the
+	  970MP dual-core processor.
 
 config ANSLCD
 	tristate "Support for ANS LCD display"
Index: linux-work/drivers/macintosh/Makefile
===================================================================
--- linux-work.orig/drivers/macintosh/Makefile	2005-11-09 11:49:03.000000000 +1100
+++ linux-work/drivers/macintosh/Makefile	2006-02-07 13:45:57.000000000 +1100
@@ -35,3 +35,8 @@ obj-$(CONFIG_WINDFARM_PM91)     += windf
 				   windfarm_smu_sensors.o \
 				   windfarm_lm75_sensor.o windfarm_pid.o \
 				   windfarm_cpufreq_clamp.o windfarm_pm91.o
+obj-$(CONFIG_WINDFARM_PM112)	+= windfarm_pm112.o windfarm_smu_sat.o \
+				   windfarm_smu_controls.o \
+				   windfarm_smu_sensors.o \
+				   windfarm_max6690_sensor.o \
+				   windfarm_lm75_sensor.o windfarm_pid.o
Index: linux-work/drivers/macintosh/windfarm.h
===================================================================
--- linux-work.orig/drivers/macintosh/windfarm.h	2005-11-09 11:49:03.000000000 +1100
+++ linux-work/drivers/macintosh/windfarm.h	2006-02-07 13:45:57.000000000 +1100
@@ -14,6 +14,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/notifier.h>
+#include <linux/device.h>
 
 /* Display a 16.16 fixed point value */
 #define FIX32TOPRINT(f)	((f) >> 16),((((f) & 0xffff) * 1000) >> 16)
@@ -39,6 +40,7 @@ struct wf_control {
 	char			*name;
 	int			type;
 	struct kref		ref;
+	struct device_attribute	attr;
 };
 
 #define WF_CONTROL_TYPE_GENERIC		0
@@ -87,6 +89,7 @@ struct wf_sensor {
 	struct wf_sensor_ops	*ops;
 	char			*name;
 	struct kref		ref;
+	struct device_attribute	attr;
 };
 
 /* Same lifetime rules as controls */
Index: linux-work/drivers/macintosh/windfarm_core.c
===================================================================
--- linux-work.orig/drivers/macintosh/windfarm_core.c	2005-11-09 11:49:03.000000000 +1100
+++ linux-work/drivers/macintosh/windfarm_core.c	2006-02-07 13:45:57.000000000 +1100
@@ -55,6 +55,10 @@ static unsigned int wf_overtemp;
 static unsigned int wf_overtemp_counter;
 struct task_struct *wf_thread;
 
+static struct platform_device wf_platform_device = {
+	.name	= "windfarm",
+};
+
 /*
  * Utilities & tick thread
  */
@@ -156,6 +160,40 @@ static void wf_control_release(struct kr
 		kfree(ct);
 }
 
+static ssize_t wf_show_control(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct wf_control *ctrl = container_of(attr, struct wf_control, attr);
+	s32 val = 0;
+	int err;
+
+	err = ctrl->ops->get_value(ctrl, &val);
+	if (err < 0)
+		return err;
+	return sprintf(buf, "%d\n", val);
+}
+
+/* This is really only for debugging... */
+static ssize_t wf_store_control(struct device *dev,
+				struct device_attribute *attr,
+				const char *buf, size_t count)
+{
+	struct wf_control *ctrl = container_of(attr, struct wf_control, attr);
+	int val;
+	int err;
+	char *endp;
+
+	val = simple_strtoul(buf, &endp, 0);
+	while (endp < buf + count && (*endp == ' ' || *endp == '\n'))
+		++endp;
+	if (endp - buf < count)
+		return -EINVAL;
+	err = ctrl->ops->set_value(ctrl, val);
+	if (err < 0)
+		return err;
+	return count;
+}
+
 int wf_register_control(struct wf_control *new_ct)
 {
 	struct wf_control *ct;
@@ -172,6 +210,13 @@ int wf_register_control(struct wf_contro
 	kref_init(&new_ct->ref);
 	list_add(&new_ct->link, &wf_controls);
 
+	new_ct->attr.attr.name = new_ct->name;
+	new_ct->attr.attr.owner = THIS_MODULE;
+	new_ct->attr.attr.mode = 0644;
+	new_ct->attr.show = wf_show_control;
+	new_ct->attr.store = wf_store_control;
+	device_create_file(&wf_platform_device.dev, &new_ct->attr);
+
 	DBG("wf: Registered control %s\n", new_ct->name);
 
 	wf_notify(WF_EVENT_NEW_CONTROL, new_ct);
@@ -246,6 +291,19 @@ static void wf_sensor_release(struct kre
 		kfree(sr);
 }
 
+static ssize_t wf_show_sensor(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct wf_sensor *sens = container_of(attr, struct wf_sensor, attr);
+	s32 val = 0;
+	int err;
+
+	err = sens->ops->get_value(sens, &val);
+	if (err < 0)
+		return err;
+	return sprintf(buf, "%d.%03d\n", FIX32TOPRINT(val));
+}
+
 int wf_register_sensor(struct wf_sensor *new_sr)
 {
 	struct wf_sensor *sr;
@@ -262,6 +320,13 @@ int wf_register_sensor(struct wf_sensor 
 	kref_init(&new_sr->ref);
 	list_add(&new_sr->link, &wf_sensors);
 
+	new_sr->attr.attr.name = new_sr->name;
+	new_sr->attr.attr.owner = THIS_MODULE;
+	new_sr->attr.attr.mode = 0444;
+	new_sr->attr.show = wf_show_sensor;
+	new_sr->attr.store = NULL;
+	device_create_file(&wf_platform_device.dev, &new_sr->attr);
+
 	DBG("wf: Registered sensor %s\n", new_sr->name);
 
 	wf_notify(WF_EVENT_NEW_SENSOR, new_sr);
@@ -395,10 +460,6 @@ int wf_is_overtemp(void)
 }
 EXPORT_SYMBOL_GPL(wf_is_overtemp);
 
-static struct platform_device wf_platform_device = {
-	.name	= "windfarm",
-};
-
 static int __init windfarm_core_init(void)
 {
 	DBG("wf: core loaded\n");
Index: linux-work/drivers/macintosh/windfarm_max6690_sensor.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-work/drivers/macintosh/windfarm_max6690_sensor.c	2006-02-07 16:05:23.000000000 +1100
@@ -0,0 +1,169 @@
+/*
+ * Windfarm PowerMac thermal control.  MAX6690 sensor.
+ *
+ * Copyright (C) 2005 Paul Mackerras, IBM Corp. <paulus at samba.org>
+ *
+ * Use and redistribute under the terms of the GNU GPL v2.
+ */
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/i2c.h>
+#include <linux/i2c-dev.h>
+#include <asm/prom.h>
+#include <asm/pmac_low_i2c.h>
+
+#include "windfarm.h"
+
+#define VERSION "0.1"
+
+/* This currently only exports the external temperature sensor,
+   since that's all the control loops need. */
+
+/* Some MAX6690 register numbers */
+#define MAX6690_INTERNAL_TEMP	0
+#define MAX6690_EXTERNAL_TEMP	1
+
+struct wf_6690_sensor {
+	struct i2c_client	i2c;
+	struct wf_sensor	sens;
+};
+
+#define wf_to_6690(x)	container_of((x), struct wf_6690_sensor, sens)
+#define i2c_to_6690(x)	container_of((x), struct wf_6690_sensor, i2c)
+
+static int wf_max6690_attach(struct i2c_adapter *adapter);
+static int wf_max6690_detach(struct i2c_client *client);
+
+static struct i2c_driver wf_max6690_driver = {
+	.driver = {
+		.name		= "wf_max6690",
+	},
+	.attach_adapter	= wf_max6690_attach,
+	.detach_client	= wf_max6690_detach,
+};
+
+static int wf_max6690_get(struct wf_sensor *sr, s32 *value)
+{
+	struct wf_6690_sensor *max = wf_to_6690(sr);
+	s32 data;
+
+	if (max->i2c.adapter == NULL)
+		return -ENODEV;
+
+	/* chip gets initialized by firmware */
+	data = i2c_smbus_read_byte_data(&max->i2c, MAX6690_EXTERNAL_TEMP);
+	if (data < 0)
+		return data;
+	*value = data << 16;
+	return 0;
+}
+
+static void wf_max6690_release(struct wf_sensor *sr)
+{
+	struct wf_6690_sensor *max = wf_to_6690(sr);
+
+	if (max->i2c.adapter) {
+		i2c_detach_client(&max->i2c);
+		max->i2c.adapter = NULL;
+	}
+	kfree(max);
+}
+
+static struct wf_sensor_ops wf_max6690_ops = {
+	.get_value	= wf_max6690_get,
+	.release	= wf_max6690_release,
+	.owner		= THIS_MODULE,
+};
+
+static void wf_max6690_create(struct i2c_adapter *adapter, u8 addr)
+{
+	struct wf_6690_sensor *max;
+	char *name = "u4-temp";
+
+	max = kzalloc(sizeof(struct wf_6690_sensor), GFP_KERNEL);
+	if (max == NULL) {
+		printk(KERN_ERR "windfarm: Couldn't create MAX6690 sensor %s: "
+		       "no memory\n", name);
+		return;
+	}
+
+	max->sens.ops = &wf_max6690_ops;
+	max->sens.name = name;
+	max->i2c.addr = addr >> 1;
+	max->i2c.adapter = adapter;
+	max->i2c.driver = &wf_max6690_driver;
+	strncpy(max->i2c.name, name, I2C_NAME_SIZE-1);
+
+	if (i2c_attach_client(&max->i2c)) {
+		printk(KERN_ERR "windfarm: failed to attach MAX6690 sensor\n");
+		goto fail;
+	}
+
+	if (wf_register_sensor(&max->sens)) {
+		i2c_detach_client(&max->i2c);
+		goto fail;
+	}
+
+	return;
+
+ fail:
+	kfree(max);
+}
+
+static int wf_max6690_attach(struct i2c_adapter *adapter)
+{
+	struct device_node *busnode, *dev = NULL;
+	struct pmac_i2c_bus *bus;
+	const char *loc;
+	u32 *reg;
+
+	bus = pmac_i2c_adapter_to_bus(adapter);
+	if (bus == NULL)
+		return -ENODEV;
+	busnode = pmac_i2c_get_bus_node(bus);
+
+	while ((dev = of_get_next_child(busnode, dev)) != NULL) {
+		if (!device_is_compatible(dev, "max6690"))
+			continue;
+		loc = get_property(dev, "hwsensor-location", NULL);
+		reg = (u32 *) get_property(dev, "reg", NULL);
+		if (!loc || !reg)
+			continue;
+		printk("found max6690, loc=%s reg=%x\n", loc, *reg);
+		if (strcmp(loc, "BACKSIDE"))
+			continue;
+		wf_max6690_create(adapter, *reg);
+	}
+
+	return 0;
+}
+
+static int wf_max6690_detach(struct i2c_client *client)
+{
+	struct wf_6690_sensor *max = i2c_to_6690(client);
+
+	max->i2c.adapter = NULL;
+	wf_unregister_sensor(&max->sens);
+
+	return 0;
+}
+
+static int __init wf_max6690_sensor_init(void)
+{
+	return i2c_add_driver(&wf_max6690_driver);
+}
+
+static void __exit wf_max6690_sensor_exit(void)
+{
+	i2c_del_driver(&wf_max6690_driver);
+}
+
+module_init(wf_max6690_sensor_init);
+module_exit(wf_max6690_sensor_exit);
+
+MODULE_AUTHOR("Paul Mackerras <paulus at samba.org>");
+MODULE_DESCRIPTION("MAX6690 sensor objects for PowerMac thermal control");
+MODULE_LICENSE("GPL");
Index: linux-work/drivers/macintosh/windfarm_pid.c
===================================================================
--- linux-work.orig/drivers/macintosh/windfarm_pid.c	2005-11-09 11:49:03.000000000 +1100
+++ linux-work/drivers/macintosh/windfarm_pid.c	2006-02-07 13:45:57.000000000 +1100
@@ -88,8 +88,8 @@ EXPORT_SYMBOL_GPL(wf_cpu_pid_init);
 
 s32 wf_cpu_pid_run(struct wf_cpu_pid_state *st, s32 new_power, s32 new_temp)
 {
-	s64	error, integ, deriv, prop;
-	s32	target, sval, adj;
+	s64	integ, deriv, prop;
+	s32	error, target, sval, adj;
 	int	i, hlen = st->param.history_len;
 
 	/* Calculate error term */
@@ -117,7 +117,7 @@ s32 wf_cpu_pid_run(struct wf_cpu_pid_sta
 		integ += st->errors[(st->index + hlen - i) % hlen];
 	integ *= st->param.interval;
 	integ *= st->param.gr;
-	sval = st->param.tmax - ((integ >> 20) & 0xffffffff);
+	sval = st->param.tmax - (s32)(integ >> 20);
 	adj = min(st->param.ttarget, sval);
 
 	DBG("integ: %lx, sval: %lx, adj: %lx\n", integ, sval, adj);
@@ -129,7 +129,7 @@ s32 wf_cpu_pid_run(struct wf_cpu_pid_sta
 	deriv *= st->param.gd;
 
 	/* Calculate proportional term */
-	prop = (new_temp - adj);
+	prop = st->last_delta = (new_temp - adj);
 	prop *= st->param.gp;
 
 	DBG("deriv: %lx, prop: %lx\n", deriv, prop);
Index: linux-work/drivers/macintosh/windfarm_pid.h
===================================================================
--- linux-work.orig/drivers/macintosh/windfarm_pid.h	2005-11-09 11:49:03.000000000 +1100
+++ linux-work/drivers/macintosh/windfarm_pid.h	2006-02-07 13:45:57.000000000 +1100
@@ -72,6 +72,7 @@ struct wf_cpu_pid_state {
 	int	index; 				/* index of current power */
 	int	tindex; 			/* index of current temp */
 	s32	target;				/* current target value */
+	s32	last_delta;			/* last Tactual - Ttarget */
 	s32	powers[WF_PID_MAX_HISTORY];	/* power history buffer */
 	s32	errors[WF_PID_MAX_HISTORY];	/* error history buffer */
 	s32	temps[2];			/* temp. history buffer */
Index: linux-work/drivers/macintosh/windfarm_pm112.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-work/drivers/macintosh/windfarm_pm112.c	2006-02-08 16:28:38.000000000 +1100
@@ -0,0 +1,698 @@
+/*
+ * Windfarm PowerMac thermal control.
+ * Control loops for machines with SMU and PPC970MP processors.
+ *
+ * Copyright (C) 2005 Paul Mackerras, IBM Corp. <paulus at samba.org>
+ * Copyright (C) 2006 Benjamin Herrenschmidt, IBM Corp.
+ *
+ * Use and redistribute under the terms of the GNU GPL v2.
+ */
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/device.h>
+#include <linux/platform_device.h>
+#include <linux/reboot.h>
+#include <asm/prom.h>
+#include <asm/smu.h>
+
+#include "windfarm.h"
+#include "windfarm_pid.h"
+
+#define VERSION "0.2"
+
+#define DEBUG
+#undef LOTSA_DEBUG
+
+#ifdef DEBUG
+#define DBG(args...)	printk(args)
+#else
+#define DBG(args...)	do { } while(0)
+#endif
+
+#ifdef LOTSA_DEBUG
+#define DBG_LOTS(args...)	printk(args)
+#else
+#define DBG_LOTS(args...)	do { } while(0)
+#endif
+
+/* define this to force CPU overtemp to 60 degrees, useful for testing
+ * the overtemp code
+ */
+#undef HACKED_OVERTEMP
+
+/* We currently only handle 2 chips, 4 cores... */
+#define NR_CHIPS	2
+#define NR_CORES	4
+#define NR_CPU_FANS	(3 * NR_CHIPS)
+
+/* Controls and sensors */
+static struct wf_sensor *sens_cpu_temp[NR_CORES];
+static struct wf_sensor *sens_cpu_power[NR_CORES];
+static struct wf_sensor *hd_temp;
+static struct wf_sensor *slots_power;
+static struct wf_sensor *u4_temp;
+
+static struct wf_control *cpu_fans[NR_CPU_FANS];
+static char *cpu_fan_names[NR_CPU_FANS] = {
+	"cpu-rear-fan-0",
+	"cpu-rear-fan-1",
+	"cpu-front-fan-0",
+	"cpu-front-fan-1",
+	"cpu-pump-0",
+	"cpu-pump-1",
+};
+static struct wf_control *cpufreq_clamp;
+
+/* Second pump isn't required (and isn't actually present) */
+#define CPU_FANS_REQD		(NR_CPU_FANS - 2)
+#define FIRST_PUMP		4
+#define LAST_PUMP		5
+
+/* We keep a temperature history for average calculation of 180s */
+#define CPU_TEMP_HIST_SIZE	180
+
+/* Scale factor for fan speed, *100 */
+static int cpu_fan_scale[NR_CPU_FANS] = {
+	100,
+	100,
+	97,		/* inlet fans run at 97% of exhaust fan */
+	97,
+	100,		/* updated later */
+	100,		/* updated later */
+};
+
+static struct wf_control *backside_fan;
+static struct wf_control *slots_fan;
+static struct wf_control *drive_bay_fan;
+
+/* PID loop state */
+static struct wf_cpu_pid_state cpu_pid[NR_CORES];
+static u32 cpu_thist[CPU_TEMP_HIST_SIZE];
+static int cpu_thist_pt;
+static s64 cpu_thist_total;
+static s32 cpu_all_tmax = 100 << 16;
+static int cpu_last_target;
+static struct wf_pid_state backside_pid;
+static int backside_tick;
+static struct wf_pid_state slots_pid;
+static int slots_started;
+static struct wf_pid_state drive_bay_pid;
+static int drive_bay_tick;
+
+static int nr_cores;
+static int have_all_controls;
+static int have_all_sensors;
+static int started;
+
+static int failure_state;
+#define FAILURE_SENSOR		1
+#define FAILURE_FAN		2
+#define FAILURE_PERM		4
+#define FAILURE_LOW_OVERTEMP	8
+#define FAILURE_HIGH_OVERTEMP	16
+
+/* Overtemp values */
+#define LOW_OVER_AVERAGE	0
+#define LOW_OVER_IMMEDIATE	(10 << 16)
+#define LOW_OVER_CLEAR		((-10) << 16)
+#define HIGH_OVER_IMMEDIATE	(14 << 16)
+#define HIGH_OVER_AVERAGE	(10 << 16)
+#define HIGH_OVER_IMMEDIATE	(14 << 16)
+
+
+/* Implementation... */
+static int create_cpu_loop(int cpu)
+{
+	int chip = cpu / 2;
+	int core = cpu & 1;
+	struct smu_sdbp_header *hdr;
+	struct smu_sdbp_cpupiddata *piddata;
+	struct wf_cpu_pid_param pid;
+	struct wf_control *main_fan = cpu_fans[0];
+	s32 tmax;
+	int fmin;
+
+	/* Get PID params from the appropriate SAT */
+	hdr = smu_sat_get_sdb_partition(chip, 0xC8 + core, NULL);
+	if (hdr == NULL) {
+		printk(KERN_WARNING"windfarm: can't get CPU PID fan config\n");
+		return -EINVAL;
+	}
+	piddata = (struct smu_sdbp_cpupiddata *)&hdr[1];
+
+	/* Get FVT params to get Tmax; if not found, assume default */
+	hdr = smu_sat_get_sdb_partition(chip, 0xC4 + core, NULL);
+	if (hdr) {
+		struct smu_sdbp_fvt *fvt = (struct smu_sdbp_fvt *)&hdr[1];
+		tmax = fvt->maxtemp << 16;
+	} else
+		tmax = 95 << 16;	/* default to 95 degrees C */
+
+	/* We keep a global tmax for overtemp calculations */
+	if (tmax < cpu_all_tmax)
+		cpu_all_tmax = tmax;
+
+	/*
+	 * Darwin has a minimum fan speed of 1000 rpm for the 4-way and
+	 * 515 for the 2-way.  That appears to be overkill, so for now,
+	 * impose a minimum of 750 or 515.
+	 */
+	fmin = (nr_cores > 2) ? 750 : 515;
+
+	/* Initialize PID loop */
+	pid.interval = 1;	/* seconds */
+	pid.history_len = piddata->history_len;
+	pid.gd = piddata->gd;
+	pid.gp = piddata->gp;
+	pid.gr = piddata->gr / piddata->history_len;
+	pid.pmaxadj = (piddata->max_power << 16) - (piddata->power_adj << 8);
+	pid.ttarget = tmax - (piddata->target_temp_delta << 16);
+	pid.tmax = tmax;
+	pid.min = main_fan->ops->get_min(main_fan);
+	pid.max = main_fan->ops->get_max(main_fan);
+	if (pid.min < fmin)
+		pid.min = fmin;
+
+	wf_cpu_pid_init(&cpu_pid[cpu], &pid);
+	return 0;
+}
+
+static void cpu_max_all_fans(void)
+{
+	int i;
+
+	/* We max all CPU fans in case of a sensor error. We also do the
+	 * cpufreq clamping now, even if it's supposedly done later by the
+	 * generic code anyway, we do it earlier here to react faster
+	 */
+	if (cpufreq_clamp)
+		wf_control_set_max(cpufreq_clamp);
+	for (i = 0; i < NR_CPU_FANS; ++i)
+		if (cpu_fans[i])
+			wf_control_set_max(cpu_fans[i]);
+}
+
+static int cpu_check_overtemp(s32 temp)
+{
+	int new_state = 0;
+	s32 t_avg, t_old;
+
+	/* First check for immediate overtemps */
+	if (temp >= (cpu_all_tmax + LOW_OVER_IMMEDIATE)) {
+		new_state |= FAILURE_LOW_OVERTEMP;
+		if ((failure_state & FAILURE_LOW_OVERTEMP) == 0)
+			printk(KERN_ERR "windfarm: Overtemp due to immediate CPU"
+			       " temperature !\n");
+	}
+	if (temp >= (cpu_all_tmax + HIGH_OVER_IMMEDIATE)) {
+		new_state |= FAILURE_HIGH_OVERTEMP;
+		if ((failure_state & FAILURE_HIGH_OVERTEMP) == 0)
+			printk(KERN_ERR "windfarm: Critical overtemp due to"
+			       " immediate CPU temperature !\n");
+	}
+
+	/* We calculate a history of max temperatures and use that for the
+	 * overtemp management
+	 */
+	t_old = cpu_thist[cpu_thist_pt];
+	cpu_thist[cpu_thist_pt] = temp;
+	cpu_thist_pt = (cpu_thist_pt + 1) % CPU_TEMP_HIST_SIZE;
+	cpu_thist_total -= t_old;
+	cpu_thist_total += temp;
+	t_avg = cpu_thist_total / CPU_TEMP_HIST_SIZE;
+
+	DBG_LOTS("t_avg = %d.%03d (out: %d.%03d, in: %d.%03d)\n",
+		 FIX32TOPRINT(t_avg), FIX32TOPRINT(t_old), FIX32TOPRINT(temp));
+
+	/* Now check for average overtemps */
+	if (t_avg >= (cpu_all_tmax + LOW_OVER_AVERAGE)) {
+		new_state |= FAILURE_LOW_OVERTEMP;
+		if ((failure_state & FAILURE_LOW_OVERTEMP) == 0)
+			printk(KERN_ERR "windfarm: Overtemp due to average CPU"
+			       " temperature !\n");
+	}
+	if (t_avg >= (cpu_all_tmax + HIGH_OVER_AVERAGE)) {
+		new_state |= FAILURE_HIGH_OVERTEMP;
+		if ((failure_state & FAILURE_HIGH_OVERTEMP) == 0)
+			printk(KERN_ERR "windfarm: Critical overtemp due to"
+			       " average CPU temperature !\n");
+	}
+
+	/* Now handle overtemp conditions. We don't currently use the windfarm
+	 * overtemp handling core as it's not fully suited to the needs of these
+	 * new machines. This will be fixed later.
+	 */
+	if (new_state) {
+		/* High overtemp -> immediate shutdown */
+		if (new_state & FAILURE_HIGH_OVERTEMP)
+			machine_power_off();
+		if ((failure_state & new_state) != new_state)
+			cpu_max_all_fans();
+		failure_state |= new_state;
+	} else if ((failure_state & FAILURE_LOW_OVERTEMP) &&
+		   (temp < (cpu_all_tmax + LOW_OVER_CLEAR))) {
+		printk(KERN_ERR "windfarm: Overtemp condition cleared !\n");
+		failure_state &= ~FAILURE_LOW_OVERTEMP;
+	}
+
+	return failure_state & (FAILURE_LOW_OVERTEMP | FAILURE_HIGH_OVERTEMP);
+}
+
+static void cpu_fans_tick(void)
+{
+	int err, cpu;
+	s32 greatest_delta = 0;
+	s32 temp, power, t_max = 0;
+	int i, t, target = 0;
+	struct wf_sensor *sr;
+	struct wf_control *ct;
+	struct wf_cpu_pid_state *sp;
+
+	DBG_LOTS(KERN_DEBUG);
+	for (cpu = 0; cpu < nr_cores; ++cpu) {
+		/* Get CPU core temperature */
+		sr = sens_cpu_temp[cpu];
+		err = sr->ops->get_value(sr, &temp);
+		if (err) {
+			DBG("\n");
+			printk(KERN_WARNING "windfarm: CPU %d temperature "
+			       "sensor error %d\n", cpu, err);
+			failure_state |= FAILURE_SENSOR;
+			cpu_max_all_fans();
+			return;
+		}
+
+		/* Keep track of highest temp */
+		t_max = max(t_max, temp);
+
+		/* Get CPU power */
+		sr = sens_cpu_power[cpu];
+		err = sr->ops->get_value(sr, &power);
+		if (err) {
+			DBG("\n");
+			printk(KERN_WARNING "windfarm: CPU %d power "
+			       "sensor error %d\n", cpu, err);
+			failure_state |= FAILURE_SENSOR;
+			cpu_max_all_fans();
+			return;
+		}
+
+		/* Run PID */
+		sp = &cpu_pid[cpu];
+		t = wf_cpu_pid_run(sp, power, temp);
+
+		if (cpu == 0 || sp->last_delta > greatest_delta) {
+			greatest_delta = sp->last_delta;
+			target = t;
+		}
+		DBG_LOTS("[%d] P=%d.%.3d T=%d.%.3d ",
+		    cpu, FIX32TOPRINT(power), FIX32TOPRINT(temp));
+	}
+	DBG_LOTS("fans = %d, t_max = %d.%03d\n", target, FIX32TOPRINT(t_max));
+
+	/* Darwin limits decrease to 20 per iteration */
+	if (target < (cpu_last_target - 20))
+		target = cpu_last_target - 20;
+	cpu_last_target = target;
+	for (cpu = 0; cpu < nr_cores; ++cpu)
+		cpu_pid[cpu].target = target;
+
+	/* Handle possible overtemps */
+	if (cpu_check_overtemp(t_max))
+		return;
+
+	/* Set fans */
+	for (i = 0; i < NR_CPU_FANS; ++i) {
+		ct = cpu_fans[i];
+		if (ct == NULL)
+			continue;
+		err = ct->ops->set_value(ct, target * cpu_fan_scale[i] / 100);
+		if (err) {
+			printk(KERN_WARNING "windfarm: fan %s reports "
+			       "error %d\n", ct->name, err);
+			failure_state |= FAILURE_FAN;
+			break;
+		}
+	}
+}
+
+/* Backside/U4 fan */
+static struct wf_pid_param backside_param = {
+	.interval	= 5,
+	.history_len	= 2,
+	.gd		= 48 << 20,
+	.gp		= 5 << 20,
+	.gr		= 0,
+	.itarget	= 64 << 16,
+	.additive	= 1,
+};
+
+static void backside_fan_tick(void)
+{
+	s32 temp;
+	int speed;
+	int err;
+
+	if (!backside_fan || !u4_temp)
+		return;
+	if (!backside_tick) {
+		/* first time; initialize things */
+		backside_param.min = backside_fan->ops->get_min(backside_fan);
+		backside_param.max = backside_fan->ops->get_max(backside_fan);
+		wf_pid_init(&backside_pid, &backside_param);
+		backside_tick = 1;
+	}
+	if (--backside_tick > 0)
+		return;
+	backside_tick = backside_pid.param.interval;
+
+	err = u4_temp->ops->get_value(u4_temp, &temp);
+	if (err) {
+		printk(KERN_WARNING "windfarm: U4 temp sensor error %d\n",
+		       err);
+		failure_state |= FAILURE_SENSOR;
+		wf_control_set_max(backside_fan);
+		return;
+	}
+	speed = wf_pid_run(&backside_pid, temp);
+	DBG_LOTS("backside PID temp=%d.%.3d speed=%d\n",
+		 FIX32TOPRINT(temp), speed);
+
+	err = backside_fan->ops->set_value(backside_fan, speed);
+	if (err) {
+		printk(KERN_WARNING "windfarm: backside fan error %d\n", err);
+		failure_state |= FAILURE_FAN;
+	}
+}
+
+/* Drive bay fan */
+static struct wf_pid_param drive_bay_prm = {
+	.interval	= 5,
+	.history_len	= 2,
+	.gd		= 30 << 20,
+	.gp		= 5 << 20,
+	.gr		= 0,
+	.itarget	= 40 << 16,
+	.additive	= 1,
+};
+
+static void drive_bay_fan_tick(void)
+{
+	s32 temp;
+	int speed;
+	int err;
+
+	if (!drive_bay_fan || !hd_temp)
+		return;
+	if (!drive_bay_tick) {
+		/* first time; initialize things */
+		drive_bay_prm.min = drive_bay_fan->ops->get_min(drive_bay_fan);
+		drive_bay_prm.max = drive_bay_fan->ops->get_max(drive_bay_fan);
+		wf_pid_init(&drive_bay_pid, &drive_bay_prm);
+		drive_bay_tick = 1;
+	}
+	if (--drive_bay_tick > 0)
+		return;
+	drive_bay_tick = drive_bay_pid.param.interval;
+
+	err = hd_temp->ops->get_value(hd_temp, &temp);
+	if (err) {
+		printk(KERN_WARNING "windfarm: drive bay temp sensor "
+		       "error %d\n", err);
+		failure_state |= FAILURE_SENSOR;
+		wf_control_set_max(drive_bay_fan);
+		return;
+	}
+	speed = wf_pid_run(&drive_bay_pid, temp);
+	DBG_LOTS("drive_bay PID temp=%d.%.3d speed=%d\n",
+		 FIX32TOPRINT(temp), speed);
+
+	err = drive_bay_fan->ops->set_value(drive_bay_fan, speed);
+	if (err) {
+		printk(KERN_WARNING "windfarm: drive bay fan error %d\n", err);
+		failure_state |= FAILURE_FAN;
+	}
+}
+
+/* PCI slots area fan */
+/* This makes the fan speed proportional to the power consumed */
+static struct wf_pid_param slots_param = {
+	.interval	= 1,
+	.history_len	= 2,
+	.gd		= 0,
+	.gp		= 0,
+	.gr		= 0x1277952,
+	.itarget	= 0,
+	.min		= 1560,
+	.max		= 3510,
+};
+
+static void slots_fan_tick(void)
+{
+	s32 power;
+	int speed;
+	int err;
+
+	if (!slots_fan || !slots_power)
+		return;
+	if (!slots_started) {
+		/* first time; initialize things */
+		wf_pid_init(&slots_pid, &slots_param);
+		slots_started = 1;
+	}
+
+	err = slots_power->ops->get_value(slots_power, &power);
+	if (err) {
+		printk(KERN_WARNING "windfarm: slots power sensor error %d\n",
+		       err);
+		failure_state |= FAILURE_SENSOR;
+		wf_control_set_max(slots_fan);
+		return;
+	}
+	speed = wf_pid_run(&slots_pid, power);
+	DBG_LOTS("slots PID power=%d.%.3d speed=%d\n",
+		 FIX32TOPRINT(power), speed);
+
+	err = slots_fan->ops->set_value(slots_fan, speed);
+	if (err) {
+		printk(KERN_WARNING "windfarm: slots fan error %d\n", err);
+		failure_state |= FAILURE_FAN;
+	}
+}
+
+static void set_fail_state(void)
+{
+	int i;
+
+	if (cpufreq_clamp)
+		wf_control_set_max(cpufreq_clamp);
+	for (i = 0; i < NR_CPU_FANS; ++i)
+		if (cpu_fans[i])
+			wf_control_set_max(cpu_fans[i]);
+	if (backside_fan)
+		wf_control_set_max(backside_fan);
+	if (slots_fan)
+		wf_control_set_max(slots_fan);
+	if (drive_bay_fan)
+		wf_control_set_max(drive_bay_fan);
+}
+
+static void pm112_tick(void)
+{
+	int i, last_failure;
+
+	if (!started) {
+		started = 1;
+		for (i = 0; i < nr_cores; ++i) {
+			if (create_cpu_loop(i) < 0) {
+				failure_state = FAILURE_PERM;
+				set_fail_state();
+				break;
+			}
+		}
+		DBG_LOTS("cpu_all_tmax=%d.%03d\n", FIX32TOPRINT(cpu_all_tmax));
+
+#ifdef HACKED_OVERTEMP
+		cpu_all_tmax = 60 << 16;
+#endif
+	}
+
+	/* Permanent failure, bail out */
+	if (failure_state & FAILURE_PERM)
+		return;
+	/* Clear all failure bits except low overtemp which will be eventually
+	 * cleared by the control loop itself
+	 */
+	last_failure = failure_state;
+	failure_state &= FAILURE_LOW_OVERTEMP;
+	cpu_fans_tick();
+	backside_fan_tick();
+	slots_fan_tick();
+	drive_bay_fan_tick();
+
+	DBG_LOTS("last_failure: 0x%x, failure_state: %x\n",
+		 last_failure, failure_state);
+
+	/* Check for failures. Any failure causes cpufreq clamping */
+	if (failure_state && last_failure == 0 && cpufreq_clamp)
+		wf_control_set_max(cpufreq_clamp);
+	if (failure_state == 0 && last_failure && cpufreq_clamp)
+		wf_control_set_min(cpufreq_clamp);
+
+	/* That's it for now, we might want to deal with other failures
+	 * differently in the future though
+	 */
+}
+
+static void pm112_new_control(struct wf_control *ct)
+{
+	int i, max_exhaust;
+
+	if (cpufreq_clamp == NULL && !strcmp(ct->name, "cpufreq-clamp")) {
+		if (wf_get_control(ct) == 0)
+			cpufreq_clamp = ct;
+	}
+
+	for (i = 0; i < NR_CPU_FANS; ++i) {
+		if (!strcmp(ct->name, cpu_fan_names[i])) {
+			if (cpu_fans[i] == NULL && wf_get_control(ct) == 0)
+				cpu_fans[i] = ct;
+			break;
+		}
+	}
+	if (i >= NR_CPU_FANS) {
+		/* not a CPU fan, try the others */
+		if (!strcmp(ct->name, "backside-fan")) {
+			if (backside_fan == NULL && wf_get_control(ct) == 0)
+				backside_fan = ct;
+		} else if (!strcmp(ct->name, "slots-fan")) {
+			if (slots_fan == NULL && wf_get_control(ct) == 0)
+				slots_fan = ct;
+		} else if (!strcmp(ct->name, "drive-bay-fan")) {
+			if (drive_bay_fan == NULL && wf_get_control(ct) == 0)
+				drive_bay_fan = ct;
+		}
+		return;
+	}
+
+	for (i = 0; i < CPU_FANS_REQD; ++i)
+		if (cpu_fans[i] == NULL)
+			return;
+
+	/* work out pump scaling factors */
+	max_exhaust = cpu_fans[0]->ops->get_max(cpu_fans[0]);
+	for (i = FIRST_PUMP; i <= LAST_PUMP; ++i)
+		if ((ct = cpu_fans[i]) != NULL)
+			cpu_fan_scale[i] =
+				ct->ops->get_max(ct) * 100 / max_exhaust;
+
+	have_all_controls = 1;
+}
+
+static void pm112_new_sensor(struct wf_sensor *sr)
+{
+	unsigned int i;
+
+	if (have_all_sensors)
+		return;
+	if (!strncmp(sr->name, "cpu-temp-", 9)) {
+		i = sr->name[9] - '0';
+		if (sr->name[10] == 0 && i < NR_CORES &&
+		    sens_cpu_temp[i] == NULL && wf_get_sensor(sr) == 0)
+			sens_cpu_temp[i] = sr;
+
+	} else if (!strncmp(sr->name, "cpu-power-", 10)) {
+		i = sr->name[10] - '0';
+		if (sr->name[11] == 0 && i < NR_CORES &&
+		    sens_cpu_power[i] == NULL && wf_get_sensor(sr) == 0)
+			sens_cpu_power[i] = sr;
+	} else if (!strcmp(sr->name, "hd-temp")) {
+		if (hd_temp == NULL && wf_get_sensor(sr) == 0)
+			hd_temp = sr;
+	} else if (!strcmp(sr->name, "slots-power")) {
+		if (slots_power == NULL && wf_get_sensor(sr) == 0)
+			slots_power = sr;
+	} else if (!strcmp(sr->name, "u4-temp")) {
+		if (u4_temp == NULL && wf_get_sensor(sr) == 0)
+			u4_temp = sr;
+	} else
+		return;
+
+	/* check if we have all the sensors we need */
+	for (i = 0; i < nr_cores; ++i)
+		if (sens_cpu_temp[i] == NULL || sens_cpu_power[i] == NULL)
+			return;
+
+	have_all_sensors = 1;
+}
+
+static int pm112_wf_notify(struct notifier_block *self,
+			   unsigned long event, void *data)
+{
+	switch (event) {
+	case WF_EVENT_NEW_SENSOR:
+		pm112_new_sensor(data);
+		break;
+	case WF_EVENT_NEW_CONTROL:
+		pm112_new_control(data);
+		break;
+	case WF_EVENT_TICK:
+		if (have_all_controls && have_all_sensors)
+			pm112_tick();
+	}
+	return 0;
+}
+
+static struct notifier_block pm112_events = {
+	.notifier_call = pm112_wf_notify,
+};
+
+static int wf_pm112_probe(struct device *dev)
+{
+	wf_register_client(&pm112_events);
+	return 0;
+}
+
+static int wf_pm112_remove(struct device *dev)
+{
+	wf_unregister_client(&pm112_events);
+	/* should release all sensors and controls */
+	return 0;
+}
+
+static struct device_driver wf_pm112_driver = {
+	.name = "windfarm",
+	.bus = &platform_bus_type,
+	.probe = wf_pm112_probe,
+	.remove = wf_pm112_remove,
+};
+
+static int __init wf_pm112_init(void)
+{
+	struct device_node *cpu;
+
+	if (!machine_is_compatible("PowerMac11,2"))
+		return -ENODEV;
+
+	/* Count the number of CPU cores */
+	nr_cores = 0;
+	for (cpu = NULL; (cpu = of_find_node_by_type(cpu, "cpu")) != NULL; )
+		++nr_cores;
+
+	printk(KERN_INFO "windfarm: initializing for dual-core desktop G5\n");
+	driver_register(&wf_pm112_driver);
+	return 0;
+}
+
+static void __exit wf_pm112_exit(void)
+{
+	driver_unregister(&wf_pm112_driver);
+}
+
+module_init(wf_pm112_init);
+module_exit(wf_pm112_exit);
+
+MODULE_AUTHOR("Paul Mackerras <paulus at samba.org>");
+MODULE_DESCRIPTION("Thermal control for PowerMac11,2");
+MODULE_LICENSE("GPL");
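A note for anyone reading cpu_fans_tick() in the new windfarm_pm112.c above: the common fan target may rise freely, but it is only allowed to fall by 20 per tick, and each fan then gets target * scale / 100. Below is a minimal standalone sketch of just that arithmetic. The names limit_decrease() and scaled_speed() are made up for illustration and do not exist in the driver.

```c
/* Illustrative only: mirrors the arithmetic in cpu_fans_tick(), not the driver. */
#define MAX_DECREASE 20	/* per-tick limit quoted in the patch comment */

/* Allow any increase, but cap how far the target may drop in one tick. */
static int limit_decrease(int last_target, int target)
{
	if (target < last_target - MAX_DECREASE)
		target = last_target - MAX_DECREASE;
	return target;
}

/* Scale the common target for one fan; scale_percent is a percentage. */
static int scaled_speed(int target, int scale_percent)
{
	return target * scale_percent / 100;
}
```

The asymmetry means a sudden load spike is answered at once, while the fans wind down gradually afterwards.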
Index: linux-work/drivers/macintosh/windfarm_smu_controls.c
===================================================================
--- linux-work.orig/drivers/macintosh/windfarm_smu_controls.c	2006-01-12 16:33:08.000000000 +1100
+++ linux-work/drivers/macintosh/windfarm_smu_controls.c	2006-02-07 13:45:57.000000000 +1100
@@ -24,7 +24,7 @@
 
 #include "windfarm.h"
 
-#define VERSION "0.3"
+#define VERSION "0.4"
 
 #undef DEBUG
 
@@ -34,6 +34,8 @@
 #define DBG(args...)	do { } while(0)
 #endif
 
+static int smu_supports_new_fans_ops = 1;
+
 /*
  * SMU fans control object
  */
@@ -59,23 +61,49 @@ static int smu_set_fan(int pwm, u8 id, u
 
 	/* Fill SMU command structure */
 	cmd.cmd = SMU_CMD_FAN_COMMAND;
-	cmd.data_len = 14;
+
+	/* The SMU has an "old" and a "new" way of setting the fan speed.
+	 * Unfortunately, I found no reliable way to know which one works
+	 * on a given machine model. After some investigation it appears
+	 * that MacOS X just tries the new one and, if it fails, falls back
+	 * to the old one ... Ugh.
+	 */
+ retry:
+	if (smu_supports_new_fans_ops) {
+		buffer[0] = 0x30;
+		buffer[1] = id;
+		*((u16 *)(&buffer[2])) = value;
+		cmd.data_len = 4;
+	} else {
+		if (id > 7)
+			return -EINVAL;
+		/* Fill argument buffer */
+		memset(buffer, 0, 16);
+		buffer[0] = pwm ? 0x10 : 0x00;
+		buffer[1] = 0x01 << id;
+		*((u16 *)&buffer[2 + id * 2]) = value;
+		cmd.data_len = 14;
+	}
+
 	cmd.reply_len = 16;
 	cmd.data_buf = cmd.reply_buf = buffer;
 	cmd.status = 0;
 	cmd.done = smu_done_complete;
 	cmd.misc = &comp;
 
-	/* Fill argument buffer */
-	memset(buffer, 0, 16);
-	buffer[0] = pwm ? 0x10 : 0x00;
-	buffer[1] = 0x01 << id;
-	*((u16 *)&buffer[2 + id * 2]) = value;
-
 	rc = smu_queue_cmd(&cmd);
 	if (rc)
 		return rc;
 	wait_for_completion(&comp);
+
+	/* Handle fallback (see comment above) */
+	if (cmd.status != 0 && smu_supports_new_fans_ops) {
+		printk(KERN_WARNING "windfarm: SMU failed new fan command "
+		       "falling back to old method\n");
+		smu_supports_new_fans_ops = 0;
+		goto retry;
+	}
+
 	return cmd.status;
 }
 
@@ -158,19 +186,29 @@ static struct smu_fan_control *smu_fan_c
 
 	/* Names used on desktop models */
 	if (!strcmp(l, "Rear Fan 0") || !strcmp(l, "Rear Fan") ||
-	    !strcmp(l, "Rear fan 0") || !strcmp(l, "Rear fan"))
+	    !strcmp(l, "Rear fan 0") || !strcmp(l, "Rear fan") ||
+	    !strcmp(l, "CPU A EXHAUST"))
 		fct->ctrl.name = "cpu-rear-fan-0";
-	else if (!strcmp(l, "Rear Fan 1") || !strcmp(l, "Rear fan 1"))
+	else if (!strcmp(l, "Rear Fan 1") || !strcmp(l, "Rear fan 1") ||
+		 !strcmp(l, "CPU B EXHAUST"))
 		fct->ctrl.name = "cpu-rear-fan-1";
 	else if (!strcmp(l, "Front Fan 0") || !strcmp(l, "Front Fan") ||
-		 !strcmp(l, "Front fan 0") || !strcmp(l, "Front fan"))
+		 !strcmp(l, "Front fan 0") || !strcmp(l, "Front fan") ||
+		 !strcmp(l, "CPU A INTAKE"))
 		fct->ctrl.name = "cpu-front-fan-0";
-	else if (!strcmp(l, "Front Fan 1") || !strcmp(l, "Front fan 1"))
+	else if (!strcmp(l, "Front Fan 1") || !strcmp(l, "Front fan 1") ||
+		 !strcmp(l, "CPU B INTAKE"))
 		fct->ctrl.name = "cpu-front-fan-1";
-	else if (!strcmp(l, "Slots Fan") || !strcmp(l, "Slots fan"))
+	else if (!strcmp(l, "CPU A PUMP"))
+		fct->ctrl.name = "cpu-pump-0";
+	else if (!strcmp(l, "Slots Fan") || !strcmp(l, "Slots fan") ||
+		 !strcmp(l, "EXPANSION SLOTS INTAKE"))
 		fct->ctrl.name = "slots-fan";
-	else if (!strcmp(l, "Drive Bay") || !strcmp(l, "Drive bay"))
+	else if (!strcmp(l, "Drive Bay") || !strcmp(l, "Drive bay") ||
+		 !strcmp(l, "DRIVE BAY A INTAKE"))
 		fct->ctrl.name = "drive-bay-fan";
+	else if (!strcmp(l, "BACKSIDE"))
+		fct->ctrl.name = "backside-fan";
 
 	/* Names used on iMac models */
 	if (!strcmp(l, "System Fan") || !strcmp(l, "System fan"))
@@ -223,7 +261,8 @@ static int __init smu_controls_init(void
 
 	/* Look for RPM fans */
 	for (fans = NULL; (fans = of_get_next_child(smu, fans)) != NULL;)
-		if (!strcmp(fans->name, "rpm-fans"))
+		if (!strcmp(fans->name, "rpm-fans") ||
+		    device_is_compatible(fans, "smu-rpm-fans"))
 			break;
 	for (fan = NULL;
 	     fans && (fan = of_get_next_child(fans, fan)) != NULL;) {
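To make the two fan command layouts in the smu_set_fan() hunk above easier to compare, here is a rough standalone sketch of how the argument buffer is filled in each case. build_fan_cmd() is a hypothetical helper, not part of the patch. It writes the 16-bit value big-endian by hand, which should match what the u16 store produces on big-endian PowerPC, and it zeroes the buffer in both cases for simplicity. The id bound in the old path is this sketch's own, tighter choice, so that the value always lands inside the 14 bytes that are sent.

```c
#include <stdint.h>
#include <string.h>

#define FAN_BUF_SIZE 16

/*
 * Hypothetical helper: fills buf for the "new" or "old" fan command and
 * returns the number of data bytes to send, or -1 for an unusable id.
 */
static int build_fan_cmd(uint8_t buf[FAN_BUF_SIZE], int use_new, int pwm,
			 uint8_t id, uint16_t value)
{
	memset(buf, 0, FAN_BUF_SIZE);
	if (use_new) {
		/* New style: sub-command 0x30, fan id, then the value. */
		buf[0] = 0x30;
		buf[1] = id;
		buf[2] = value >> 8;
		buf[3] = value & 0xff;
		return 4;
	}
	/* Sketch-only bound: keeps the value inside the 14 bytes sent. */
	if (id > 5)
		return -1;
	/* Old style: mode byte, fan bitmask, value in the slot for this id. */
	buf[0] = pwm ? 0x10 : 0x00;
	buf[1] = 0x01 << id;
	buf[2 + id * 2] = value >> 8;
	buf[3 + id * 2] = value & 0xff;
	return 14;
}
```

A caller would try use_new = 1 first and retry with use_new = 0 if the SMU reports a non-zero status, which is the fallback the patch implements with its retry label.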
Index: linux-work/drivers/macintosh/windfarm_smu_sensors.c
===================================================================
--- linux-work.orig/drivers/macintosh/windfarm_smu_sensors.c	2006-01-12 16:33:08.000000000 +1100
+++ linux-work/drivers/macintosh/windfarm_smu_sensors.c	2006-02-07 13:45:57.000000000 +1100
@@ -220,14 +220,29 @@ static struct smu_ad_sensor *smu_ads_cre
 	    !strcmp(l, "CPU T-Diode")) {
 		ads->sens.ops = &smu_cputemp_ops;
 		ads->sens.name = "cpu-temp";
+		if (cpudiode == NULL) {
+			DBG("wf: cpudiode partition (%02x) not found\n",
+			    SMU_SDB_CPUDIODE_ID);
+			goto fail;
+		}
 	} else if (!strcmp(c, "current-sensor") &&
 		   !strcmp(l, "CPU Current")) {
 		ads->sens.ops = &smu_cpuamp_ops;
 		ads->sens.name = "cpu-current";
+		if (cpuvcp == NULL) {
+			DBG("wf: cpuvcp partition (%02x) not found\n",
+			    SMU_SDB_CPUVCP_ID);
+			goto fail;
+		}
 	} else if (!strcmp(c, "voltage-sensor") &&
 		   !strcmp(l, "CPU Voltage")) {
 		ads->sens.ops = &smu_cpuvolt_ops;
 		ads->sens.name = "cpu-voltage";
+		if (cpuvcp == NULL) {
+			DBG("wf: cpuvcp partition (%02x) not found\n",
+			    SMU_SDB_CPUVCP_ID);
+			goto fail;
+		}
 	} else if (!strcmp(c, "power-sensor") &&
 		   !strcmp(l, "Slots Power")) {
 		ads->sens.ops = &smu_slotspow_ops;
@@ -365,29 +380,22 @@ smu_cpu_power_create(struct wf_sensor *v
 	return NULL;
 }
 
-static int smu_fetch_param_partitions(void)
+static void smu_fetch_param_partitions(void)
 {
 	struct smu_sdbp_header *hdr;
 
 	/* Get CPU voltage/current/power calibration data */
 	hdr = smu_get_sdb_partition(SMU_SDB_CPUVCP_ID, NULL);
-	if (hdr == NULL) {
-		DBG("wf: cpuvcp partition (%02x) not found\n",
-		    SMU_SDB_CPUVCP_ID);
-		return -ENODEV;
+	if (hdr != NULL) {
+		cpuvcp = (struct smu_sdbp_cpuvcp *)&hdr[1];
+		/* Keep version around */
+		cpuvcp_version = hdr->version;
 	}
-	cpuvcp = (struct smu_sdbp_cpuvcp *)&hdr[1];
-	/* Keep version around */
-	cpuvcp_version = hdr->version;
 
 	/* Get CPU diode calibration data */
 	hdr = smu_get_sdb_partition(SMU_SDB_CPUDIODE_ID, NULL);
-	if (hdr == NULL) {
-		DBG("wf: cpudiode partition (%02x) not found\n",
-		    SMU_SDB_CPUDIODE_ID);
-		return -ENODEV;
-	}
-	cpudiode = (struct smu_sdbp_cpudiode *)&hdr[1];
+	if (hdr != NULL)
+		cpudiode = (struct smu_sdbp_cpudiode *)&hdr[1];
 
 	/* Get slots power calibration data if any */
 	hdr = smu_get_sdb_partition(SMU_SDB_SLOTSPOW_ID, NULL);
@@ -398,23 +406,18 @@ static int smu_fetch_param_partitions(vo
 	hdr = smu_get_sdb_partition(SMU_SDB_DEBUG_SWITCHES_ID, NULL);
 	if (hdr != NULL)
 		debugswitches = (u8 *)&hdr[1];
-
-	return 0;
 }
 
 static int __init smu_sensors_init(void)
 {
 	struct device_node *smu, *sensors, *s;
 	struct smu_ad_sensor *volt_sensor = NULL, *curr_sensor = NULL;
-	int rc;
 
 	if (!smu_present())
 		return -ENODEV;
 
 	/* Get parameters partitions */
-	rc = smu_fetch_param_partitions();
-	if (rc)
-		return rc;
+	smu_fetch_param_partitions();
 
 	smu = of_find_node_by_type(NULL, "smu");
 	if (smu == NULL)
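The windfarm_smu_sensors.c change above stops treating a missing calibration partition as fatal for the whole module and instead refuses only the sensors that need it. A small sketch of that idea, with made-up names (struct calib, sensor_can_create() and the enum are illustrative, not from the driver):

```c
#include <stddef.h>

struct calib { int version; };

/* Hypothetical stand-ins for the cpuvcp and cpudiode partition pointers. */
static const struct calib *cpuvcp_data;
static const struct calib *cpudiode_data;

enum sensor_kind {
	SENSOR_TEMP,
	SENSOR_CURRENT,
	SENSOR_VOLTAGE,
	SENSOR_SLOTS_POWER,
};

/* Return 0 if the sensor can be created, -1 if its calibration data is missing. */
static int sensor_can_create(enum sensor_kind kind)
{
	switch (kind) {
	case SENSOR_TEMP:
		return cpudiode_data ? 0 : -1;
	case SENSOR_CURRENT:
	case SENSOR_VOLTAGE:
		return cpuvcp_data ? 0 : -1;
	default:
		return 0;	/* no check is added for other sensors in this hunk */
	}
}
```

With this shape, a machine that lacks one partition still gets every sensor that does not depend on it.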
Index: linux-work/drivers/macintosh/windfarm_smu_sat.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-work/drivers/macintosh/windfarm_smu_sat.c	2006-02-07 16:18:32.000000000 +1100
@@ -0,0 +1,418 @@
+/*
+ * Windfarm PowerMac thermal control.  SMU "satellite" controller sensors.
+ *
+ * Copyright (C) 2005 Paul Mackerras, IBM Corp. <paulus at samba.org>
+ *
+ * Released under the terms of the GNU GPL v2.
+ */
+
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/wait.h>
+#include <linux/i2c.h>
+#include <linux/i2c-dev.h>
+#include <asm/semaphore.h>
+#include <asm/prom.h>
+#include <asm/smu.h>
+#include <asm/pmac_low_i2c.h>
+
+#include "windfarm.h"
+
+#define VERSION "0.2"
+
+#define DEBUG
+
+#ifdef DEBUG
+#define DBG(args...)	printk(args)
+#else
+#define DBG(args...)	do { } while(0)
+#endif
+
+/* If the cache is older than 800ms we'll refetch it */
+#define MAX_AGE		msecs_to_jiffies(800)
+
+struct wf_sat {
+	int			nr;
+	atomic_t		refcnt;
+	struct semaphore	mutex;
+	unsigned long		last_read; /* jiffies when cache last updated */
+	u8			cache[16];
+	struct i2c_client	i2c;
+	struct device_node	*node;
+};
+
+static struct wf_sat *sats[2];
+
+struct wf_sat_sensor {
+	int		index;
+	int		index2;		/* used for power sensors */
+	int		shift;
+	struct wf_sat	*sat;
+	struct wf_sensor sens;
+};
+
+#define wf_to_sat(c)	container_of(c, struct wf_sat_sensor, sens)
+#define i2c_to_sat(c)	container_of(c, struct wf_sat, i2c)
+
+static int wf_sat_attach(struct i2c_adapter *adapter);
+static int wf_sat_detach(struct i2c_client *client);
+
+static struct i2c_driver wf_sat_driver = {
+	.driver = {
+		.name		= "wf_smu_sat",
+	},
+	.attach_adapter	= wf_sat_attach,
+	.detach_client	= wf_sat_detach,
+};
+
+/*
+ * XXX i2c_smbus_read_i2c_block_data doesn't pass the requested
+ * length down to the low-level driver, so we use this, which
+ * works well enough with the SMU i2c driver code...
+ */
+static int sat_read_block(struct i2c_client *client, u8 command,
+			  u8 *values, int len)
+{
+	union i2c_smbus_data data;
+	int err;
+
+	data.block[0] = len;
+	err = i2c_smbus_xfer(client->adapter, client->addr, client->flags,
+			     I2C_SMBUS_READ, command, I2C_SMBUS_I2C_BLOCK_DATA,
+			     &data);
+	if (!err)
+		memcpy(values, data.block, len);
+	return err;
+}
+
+struct smu_sdbp_header *smu_sat_get_sdb_partition(unsigned int sat_id, int id,
+						  unsigned int *size)
+{
+	struct wf_sat *sat;
+	int err, len;
+	unsigned int i;
+	u8 *buf;
+	u8 data[4];
+
+	/* TODO: Add the resulting partition to the device-tree */
+
+	if (sat_id > 1 || (sat = sats[sat_id]) == NULL)
+		return NULL;
+
+	err = i2c_smbus_write_word_data(&sat->i2c, 8, id << 8);
+	if (err) {
+		printk(KERN_ERR "smu_sat_get_sdb_part wr error %d\n", err);
+		return NULL;
+	}
+
+	len = i2c_smbus_read_word_data(&sat->i2c, 9);
+	if (len < 0) {
+		printk(KERN_ERR "smu_sat_get_sdb_part rd len error\n");
+		return NULL;
+	}
+	if (len == 0) {
+		printk(KERN_ERR "smu_sat_get_sdb_part no partition %x\n", id);
+		return NULL;
+	}
+
+	len = le16_to_cpu(len);
+	len = (len + 3) & ~3;
+	buf = kmalloc(len, GFP_KERNEL);
+	if (buf == NULL)
+		return NULL;
+
+	for (i = 0; i < len; i += 4) {
+		err = sat_read_block(&sat->i2c, 0xa, data, 4);
+		if (err) {
+			printk(KERN_ERR "smu_sat_get_sdb_part rd err %d\n",
+			       err);
+			goto fail;
+		}
+		buf[i] = data[1];
+		buf[i+1] = data[0];
+		buf[i+2] = data[3];
+		buf[i+3] = data[2];
+	}
+#ifdef DEBUG
+	DBG(KERN_DEBUG "sat %d partition %x:", sat_id, id);
+	for (i = 0; i < len; ++i)
+		DBG(" %x", buf[i]);
+	DBG("\n");
+#endif
+
+	if (size)
+		*size = len;
+	return (struct smu_sdbp_header *) buf;
+
+ fail:
+	kfree(buf);
+	return NULL;
+}
+
+/* refresh the cache */
+static int wf_sat_read_cache(struct wf_sat *sat)
+{
+	int err;
+
+	err = sat_read_block(&sat->i2c, 0x3f, sat->cache, 16);
+	if (err)
+		return err;
+	sat->last_read = jiffies;
+#ifdef LOTSA_DEBUG
+	{
+		int i;
+		DBG(KERN_DEBUG "wf_sat_get: data is");
+		for (i = 0; i < 16; ++i)
+			DBG(" %.2x", sat->cache[i]);
+		DBG("\n");
+	}
+#endif
+	return 0;
+}
+
+static int wf_sat_get(struct wf_sensor *sr, s32 *value)
+{
+	struct wf_sat_sensor *sens = wf_to_sat(sr);
+	struct wf_sat *sat = sens->sat;
+	int i, err;
+	s32 val;
+
+	if (sat->i2c.adapter == NULL)
+		return -ENODEV;
+
+	down(&sat->mutex);
+	if (time_after(jiffies, (sat->last_read + MAX_AGE))) {
+		err = wf_sat_read_cache(sat);
+		if (err)
+			goto fail;
+	}
+
+	i = sens->index * 2;
+	val = ((sat->cache[i] << 8) + sat->cache[i+1]) << sens->shift;
+	if (sens->index2 >= 0) {
+		i = sens->index2 * 2;
+		/* 4.12 * 8.8 -> 12.20; shift right 4 to get 16.16 */
+		val = (val * ((sat->cache[i] << 8) + sat->cache[i+1])) >> 4;
+	}
+
+	*value = val;
+	err = 0;
+
+ fail:
+	up(&sat->mutex);
+	return err;
+}
+
+static void wf_sat_release(struct wf_sensor *sr)
+{
+	struct wf_sat_sensor *sens = wf_to_sat(sr);
+	struct wf_sat *sat = sens->sat;
+
+	if (atomic_dec_and_test(&sat->refcnt)) {
+		if (sat->i2c.adapter) {
+			i2c_detach_client(&sat->i2c);
+			sat->i2c.adapter = NULL;
+		}
+		if (sat->nr >= 0)
+			sats[sat->nr] = NULL;
+		kfree(sat);
+	}
+	kfree(sens);
+}
+
+static struct wf_sensor_ops wf_sat_ops = {
+	.get_value	= wf_sat_get,
+	.release	= wf_sat_release,
+	.owner		= THIS_MODULE,
+};
+
+static void wf_sat_create(struct i2c_adapter *adapter, struct device_node *dev)
+{
+	struct wf_sat *sat;
+	struct wf_sat_sensor *sens;
+	u32 *reg;
+	char *loc, *type;
+	u8 addr, chip, core;
+	struct device_node *child;
+	int shift, cpu, index;
+	char *name;
+	int vsens[2], isens[2];
+
+	reg = (u32 *) get_property(dev, "reg", NULL);
+	if (reg == NULL)
+		return;
+	addr = *reg;
+	DBG(KERN_DEBUG "wf_sat: creating sat at address %x\n", addr);
+
+	sat = kzalloc(sizeof(struct wf_sat), GFP_KERNEL);
+	if (sat == NULL)
+		return;
+	sat->nr = -1;
+	sat->node = of_node_get(dev);
+	atomic_set(&sat->refcnt, 0);
+	init_MUTEX(&sat->mutex);
+	sat->i2c.addr = (addr >> 1) & 0x7f;
+	sat->i2c.adapter = adapter;
+	sat->i2c.driver = &wf_sat_driver;
+	strncpy(sat->i2c.name, "smu-sat", I2C_NAME_SIZE-1);
+
+	if (i2c_attach_client(&sat->i2c)) {
+		printk(KERN_ERR "windfarm: failed to attach smu-sat to i2c\n");
+		goto fail;
+	}
+
+	vsens[0] = vsens[1] = -1;
+	isens[0] = isens[1] = -1;
+	child = NULL;
+	while ((child = of_get_next_child(dev, child)) != NULL) {
+		reg = (u32 *) get_property(child, "reg", NULL);
+		type = get_property(child, "device_type", NULL);
+		loc = get_property(child, "location", NULL);
+		if (reg == NULL || type == NULL || loc == NULL)
+			continue;
+
+		/* the cooked sensors are between 0x30 and 0x37 */
+		if (*reg < 0x30 || *reg > 0x37)
+			continue;
+		index = *reg - 0x30;
+
+		/* expect location to be CPU [AB][01] ... */
+		if (strncmp(loc, "CPU ", 4) != 0)
+			continue;
+		chip = loc[4] - 'A';
+		core = loc[5] - '0';
+		if (chip > 1 || core > 1) {
+			printk(KERN_ERR "wf_sat_create: don't understand "
+			       "location %s for %s\n", loc, child->full_name);
+			continue;
+		}
+		cpu = 2 * chip + core;
+		if (sat->nr < 0)
+			sat->nr = chip;
+		else if (sat->nr != chip) {
+			printk(KERN_ERR "wf_sat_create: can't cope with "
+			       "multiple CPU chips on one SAT (%s)\n", loc);
+			continue;
+		}
+
+		if (strcmp(type, "voltage-sensor") == 0) {
+			name = "cpu-voltage";
+			shift = 4;
+			vsens[core] = index;
+		} else if (strcmp(type, "current-sensor") == 0) {
+			name = "cpu-current";
+			shift = 8;
+			isens[core] = index;
+		} else if (strcmp(type, "temp-sensor") == 0) {
+			name = "cpu-temp";
+			shift = 10;
+		} else
+			continue;	/* hmmm shouldn't happen */
+
+		/* the +16 is enough for "cpu-voltage-n" */
+		sens = kzalloc(sizeof(struct wf_sat_sensor) + 16, GFP_KERNEL);
+		if (sens == NULL) {
+			printk(KERN_ERR "wf_sat_create: couldn't create "
+			       "%s sensor %d (no memory)\n", name, cpu);
+			continue;
+		}
+		sens->index = index;
+		sens->index2 = -1;
+		sens->shift = shift;
+		sens->sat = sat;
+		atomic_inc(&sat->refcnt);
+		sens->sens.ops = &wf_sat_ops;
+		sens->sens.name = (char *) (sens + 1);
+		snprintf(sens->sens.name, 16, "%s-%d", name, cpu);
+
+		if (wf_register_sensor(&sens->sens)) {
+			atomic_dec(&sat->refcnt);
+			kfree(sens);
+		}
+	}
+
+	/* make the power sensors */
+	for (core = 0; core < 2; ++core) {
+		if (vsens[core] < 0 || isens[core] < 0)
+			continue;
+		cpu = 2 * sat->nr + core;
+		sens = kzalloc(sizeof(struct wf_sat_sensor) + 16, GFP_KERNEL);
+		if (sens == NULL) {
+			printk(KERN_ERR "wf_sat_create: couldn't create power "
+			       "sensor %d (no memory)\n", cpu);
+			continue;
+		}
+		sens->index = vsens[core];
+		sens->index2 = isens[core];
+		sens->shift = 0;
+		sens->sat = sat;
+		atomic_inc(&sat->refcnt);
+		sens->sens.ops = &wf_sat_ops;
+		sens->sens.name = (char *) (sens + 1);
+		snprintf(sens->sens.name, 16, "cpu-power-%d", cpu);
+
+		if (wf_register_sensor(&sens->sens)) {
+			atomic_dec(&sat->refcnt);
+			kfree(sens);
+		}
+	}
+
+	if (sat->nr >= 0)
+		sats[sat->nr] = sat;
+
+	return;
+
+ fail:
+	kfree(sat);
+}
+
+static int wf_sat_attach(struct i2c_adapter *adapter)
+{
+	struct device_node *busnode, *dev = NULL;
+	struct pmac_i2c_bus *bus;
+
+	bus = pmac_i2c_adapter_to_bus(adapter);
+	if (bus == NULL)
+		return -ENODEV;
+	busnode = pmac_i2c_get_bus_node(bus);
+
+	while ((dev = of_get_next_child(busnode, dev)) != NULL)
+		if (device_is_compatible(dev, "smu-sat"))
+			wf_sat_create(adapter, dev);
+	return 0;
+}
+
+static int wf_sat_detach(struct i2c_client *client)
+{
+	struct wf_sat *sat = i2c_to_sat(client);
+
+	/* XXX TODO */
+
+	sat->i2c.adapter = NULL;
+	return 0;
+}
+
+static int __init sat_sensors_init(void)
+{
+	int err;
+
+	err = i2c_add_driver(&wf_sat_driver);
+	if (err < 0)
+		return err;
+	return 0;
+}
+
+static void __exit sat_sensors_exit(void)
+{
+	i2c_del_driver(&wf_sat_driver);
+}
+
+module_init(sat_sensors_init);
+/*module_exit(sat_sensors_exit); Uncomment when cleanup is implemented */
+
+MODULE_AUTHOR("Paul Mackerras <paulus at samba.org>");
+MODULE_DESCRIPTION("SMU satellite sensors for PowerMac thermal control");
+MODULE_LICENSE("GPL");
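The fixed-point handling in wf_sat_get() above is compact, so here is a standalone sketch of the same conversions. The shifts are the ones set in wf_sat_create() (4 for voltage, 8 for current, 10 for temperature), and the result is always 16.16. The function names are invented for this example, and the 64-bit intermediate in the power calculation is an addition of the sketch to avoid signed overflow in portable C; the driver multiplies in 32 bits.

```c
#include <stdint.h>

/* Combine two cache bytes (high byte first) into the raw 16-bit reading. */
static int32_t sat_raw(const uint8_t *cache, int index)
{
	return (cache[index * 2] << 8) + cache[index * 2 + 1];
}

/* Shift a raw reading up to 16.16 fixed point. */
static int32_t sat_to_fix16(int32_t raw, int shift)
{
	return raw << shift;
}

/* 4.12 volts * 8.8 amps gives 12.20; dropping 4 bits leaves 16.16 watts. */
static int32_t sat_power_fix16(int32_t raw_volts, int32_t raw_amps)
{
	return (int32_t)(((int64_t)raw_volts * raw_amps) >> 4);
}
```

As a worked case, a raw voltage of 0x1800 is 1.5 V in 4.12, a raw current of 0x0a00 is 10 A in 8.8, and their product shifted right by 4 is 15 << 16, i.e. 15 W in 16.16.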
Index: linux-work/include/asm-powerpc/smu.h
===================================================================
--- linux-work.orig/include/asm-powerpc/smu.h	2006-01-13 16:55:09.000000000 +1100
+++ linux-work/include/asm-powerpc/smu.h	2006-02-07 13:45:57.000000000 +1100
@@ -521,6 +521,11 @@ struct smu_sdbp_cpupiddata {
 extern struct smu_sdbp_header *smu_get_sdb_partition(int id,
 					unsigned int *size);
 
+/* Get "sdb" partition data from an SMU satellite */
+extern struct smu_sdbp_header *smu_sat_get_sdb_partition(unsigned int sat_id,
+					int id, unsigned int *size);
+
+
 #endif /* __KERNEL__ */
 
 
Index: linux-work/drivers/macintosh/windfarm_pm91.c
===================================================================
--- linux-work.orig/drivers/macintosh/windfarm_pm91.c	2005-11-09 11:49:03.000000000 +1100
+++ linux-work/drivers/macintosh/windfarm_pm91.c	2006-02-08 16:34:39.000000000 +1100
@@ -458,45 +458,6 @@ static void wf_smu_slots_fans_tick(struc
 
 
 /*
- * ****** Attributes ******
- *
- */
-
-#define BUILD_SHOW_FUNC_FIX(name, data)				\
-static ssize_t show_##name(struct device *dev,                  \
-			   struct device_attribute *attr,       \
-			   char *buf)	                        \
-{								\
-	ssize_t r;						\
-	s32 val = 0;                                            \
-	data->ops->get_value(data, &val);                       \
-	r = sprintf(buf, "%d.%03d", FIX32TOPRINT(val)); 	\
-	return r;						\
-}                                                               \
-static DEVICE_ATTR(name,S_IRUGO,show_##name, NULL);
-
-
-#define BUILD_SHOW_FUNC_INT(name, data)				\
-static ssize_t show_##name(struct device *dev,                  \
-			   struct device_attribute *attr,       \
-			   char *buf)	                        \
-{								\
-	s32 val = 0;                                            \
-	data->ops->get_value(data, &val);                       \
-	return sprintf(buf, "%d", val);  			\
-}                                                               \
-static DEVICE_ATTR(name,S_IRUGO,show_##name, NULL);
-
-BUILD_SHOW_FUNC_INT(cpu_fan, fan_cpu_main);
-BUILD_SHOW_FUNC_INT(hd_fan, fan_hd);
-BUILD_SHOW_FUNC_INT(slots_fan, fan_slots);
-
-BUILD_SHOW_FUNC_FIX(cpu_temp, sensor_cpu_temp);
-BUILD_SHOW_FUNC_FIX(cpu_power, sensor_cpu_power);
-BUILD_SHOW_FUNC_FIX(hd_temp, sensor_hd_temp);
-BUILD_SHOW_FUNC_FIX(slots_power, sensor_slots_power);
-
-/*
  * ****** Setup / Init / Misc ... ******
  *
  */
@@ -581,10 +542,8 @@ static void wf_smu_new_control(struct wf
 		return;
 
 	if (fan_cpu_main == NULL && !strcmp(ct->name, "cpu-rear-fan-0")) {
-		if (wf_get_control(ct) == 0) {
+		if (wf_get_control(ct) == 0)
 			fan_cpu_main = ct;
-			device_create_file(wf_smu_dev, &dev_attr_cpu_fan);
-		}
 	}
 
 	if (fan_cpu_second == NULL && !strcmp(ct->name, "cpu-rear-fan-1")) {
@@ -603,17 +562,13 @@ static void wf_smu_new_control(struct wf
 	}
 
 	if (fan_hd == NULL && !strcmp(ct->name, "drive-bay-fan")) {
-		if (wf_get_control(ct) == 0) {
+		if (wf_get_control(ct) == 0)
 			fan_hd = ct;
-			device_create_file(wf_smu_dev, &dev_attr_hd_fan);
-		}
 	}
 
 	if (fan_slots == NULL && !strcmp(ct->name, "slots-fan")) {
-		if (wf_get_control(ct) == 0) {
+		if (wf_get_control(ct) == 0)
 			fan_slots = ct;
-			device_create_file(wf_smu_dev, &dev_attr_slots_fan);
-		}
 	}
 
 	if (fan_cpu_main && (fan_cpu_second || fan_cpu_third) && fan_hd &&
@@ -627,31 +582,23 @@ static void wf_smu_new_sensor(struct wf_
 		return;
 
 	if (sensor_cpu_power == NULL && !strcmp(sr->name, "cpu-power")) {
-		if (wf_get_sensor(sr) == 0) {
+		if (wf_get_sensor(sr) == 0)
 			sensor_cpu_power = sr;
-			device_create_file(wf_smu_dev, &dev_attr_cpu_power);
-		}
 	}
 
 	if (sensor_cpu_temp == NULL && !strcmp(sr->name, "cpu-temp")) {
-		if (wf_get_sensor(sr) == 0) {
+		if (wf_get_sensor(sr) == 0)
 			sensor_cpu_temp = sr;
-			device_create_file(wf_smu_dev, &dev_attr_cpu_temp);
-		}
 	}
 
 	if (sensor_hd_temp == NULL && !strcmp(sr->name, "hd-temp")) {
-		if (wf_get_sensor(sr) == 0) {
+		if (wf_get_sensor(sr) == 0)
 			sensor_hd_temp = sr;
-			device_create_file(wf_smu_dev, &dev_attr_hd_temp);
-		}
 	}
 
 	if (sensor_slots_power == NULL && !strcmp(sr->name, "slots-power")) {
-		if (wf_get_sensor(sr) == 0) {
+		if (wf_get_sensor(sr) == 0)
 			sensor_slots_power = sr;
-			device_create_file(wf_smu_dev, &dev_attr_slots_power);
-		}
 	}
 
 	if (sensor_cpu_power && sensor_cpu_temp &&
@@ -720,40 +667,26 @@ static int wf_smu_remove(struct device *
 	 * with that except by adding locks all over... I'll do that
 	 * eventually but heh, who ever rmmod this module anyway ?
 	 */
-	if (sensor_cpu_power) {
-		device_remove_file(wf_smu_dev, &dev_attr_cpu_power);
+	if (sensor_cpu_power)
 		wf_put_sensor(sensor_cpu_power);
-	}
-	if (sensor_cpu_temp) {
-		device_remove_file(wf_smu_dev, &dev_attr_cpu_temp);
+	if (sensor_cpu_temp)
 		wf_put_sensor(sensor_cpu_temp);
-	}
-	if (sensor_hd_temp) {
-		device_remove_file(wf_smu_dev, &dev_attr_hd_temp);
+	if (sensor_hd_temp)
 		wf_put_sensor(sensor_hd_temp);
-	}
-	if (sensor_slots_power) {
-		device_remove_file(wf_smu_dev, &dev_attr_slots_power);
+	if (sensor_slots_power)
 		wf_put_sensor(sensor_slots_power);
-	}
 
 	/* Release all controls */
-	if (fan_cpu_main) {
-		device_remove_file(wf_smu_dev, &dev_attr_cpu_fan);
+	if (fan_cpu_main)
 		wf_put_control(fan_cpu_main);
-	}
 	if (fan_cpu_second)
 		wf_put_control(fan_cpu_second);
 	if (fan_cpu_third)
 		wf_put_control(fan_cpu_third);
-	if (fan_hd) {
-		device_remove_file(wf_smu_dev, &dev_attr_hd_fan);
+	if (fan_hd)
 		wf_put_control(fan_hd);
-	}
-	if (fan_slots) {
-		device_remove_file(wf_smu_dev, &dev_attr_slots_fan);
+	if (fan_slots)
 		wf_put_control(fan_slots);
-	}
 	if (cpufreq_clamp)
 		wf_put_control(cpufreq_clamp);
 
Index: linux-work/drivers/macintosh/windfarm_pm81.c
===================================================================
--- linux-work.orig/drivers/macintosh/windfarm_pm81.c	2006-01-13 16:55:07.000000000 +1100
+++ linux-work/drivers/macintosh/windfarm_pm81.c	2006-02-08 16:35:28.000000000 +1100
@@ -538,45 +538,6 @@ static void wf_smu_cpu_fans_tick(struct 
 	}
 }
 
-
-/*
- * ****** Attributes ******
- *
- */
-
-#define BUILD_SHOW_FUNC_FIX(name, data)				\
-static ssize_t show_##name(struct device *dev,                  \
-			   struct device_attribute *attr,       \
-			   char *buf)	                        \
-{								\
-	ssize_t r;						\
-	s32 val = 0;                                            \
-	data->ops->get_value(data, &val);                       \
-	r = sprintf(buf, "%d.%03d", FIX32TOPRINT(val)); 	\
-	return r;						\
-}                                                               \
-static DEVICE_ATTR(name,S_IRUGO,show_##name, NULL);
-
-
-#define BUILD_SHOW_FUNC_INT(name, data)				\
-static ssize_t show_##name(struct device *dev,                  \
-			   struct device_attribute *attr,       \
-			   char *buf)	                        \
-{								\
-	s32 val = 0;                                            \
-	data->ops->get_value(data, &val);                       \
-	return sprintf(buf, "%d", val);  			\
-}                                                               \
-static DEVICE_ATTR(name,S_IRUGO,show_##name, NULL);
-
-BUILD_SHOW_FUNC_INT(cpu_fan, fan_cpu_main);
-BUILD_SHOW_FUNC_INT(sys_fan, fan_system);
-BUILD_SHOW_FUNC_INT(hd_fan, fan_hd);
-
-BUILD_SHOW_FUNC_FIX(cpu_temp, sensor_cpu_temp);
-BUILD_SHOW_FUNC_FIX(cpu_power, sensor_cpu_power);
-BUILD_SHOW_FUNC_FIX(hd_temp, sensor_hd_temp);
-
 /*
  * ****** Setup / Init / Misc ... ******
  *
@@ -654,17 +615,13 @@ static void wf_smu_new_control(struct wf
 		return;
 
 	if (fan_cpu_main == NULL && !strcmp(ct->name, "cpu-fan")) {
-		if (wf_get_control(ct) == 0) {
+		if (wf_get_control(ct) == 0)
 			fan_cpu_main = ct;
-			device_create_file(wf_smu_dev, &dev_attr_cpu_fan);
-		}
 	}
 
 	if (fan_system == NULL && !strcmp(ct->name, "system-fan")) {
-		if (wf_get_control(ct) == 0) {
+		if (wf_get_control(ct) == 0)
 			fan_system = ct;
-			device_create_file(wf_smu_dev, &dev_attr_sys_fan);
-		}
 	}
 
 	if (cpufreq_clamp == NULL && !strcmp(ct->name, "cpufreq-clamp")) {
@@ -683,10 +640,8 @@ static void wf_smu_new_control(struct wf
 	}
 
 	if (fan_hd == NULL && !strcmp(ct->name, "drive-bay-fan")) {
-		if (wf_get_control(ct) == 0) {
+		if (wf_get_control(ct) == 0)
 			fan_hd = ct;
-			device_create_file(wf_smu_dev, &dev_attr_hd_fan);
-		}
 	}
 
 	if (fan_system && fan_hd && fan_cpu_main && cpufreq_clamp)
@@ -699,24 +654,18 @@ static void wf_smu_new_sensor(struct wf_
 		return;
 
 	if (sensor_cpu_power == NULL && !strcmp(sr->name, "cpu-power")) {
-		if (wf_get_sensor(sr) == 0) {
+		if (wf_get_sensor(sr) == 0)
 			sensor_cpu_power = sr;
-			device_create_file(wf_smu_dev, &dev_attr_cpu_power);
-		}
 	}
 
 	if (sensor_cpu_temp == NULL && !strcmp(sr->name, "cpu-temp")) {
-		if (wf_get_sensor(sr) == 0) {
+		if (wf_get_sensor(sr) == 0)
 			sensor_cpu_temp = sr;
-			device_create_file(wf_smu_dev, &dev_attr_cpu_temp);
-		}
 	}
 
 	if (sensor_hd_temp == NULL && !strcmp(sr->name, "hd-temp")) {
-		if (wf_get_sensor(sr) == 0) {
+		if (wf_get_sensor(sr) == 0)
 			sensor_hd_temp = sr;
-			device_create_file(wf_smu_dev, &dev_attr_hd_temp);
-		}
 	}
 
 	if (sensor_cpu_power && sensor_cpu_temp && sensor_hd_temp)
@@ -794,32 +743,20 @@ static int wf_smu_remove(struct device *
 	 * with that except by adding locks all over... I'll do that
 	 * eventually but heh, who ever rmmod this module anyway ?
 	 */
-	if (sensor_cpu_power) {
-		device_remove_file(wf_smu_dev, &dev_attr_cpu_power);
+	if (sensor_cpu_power)
 		wf_put_sensor(sensor_cpu_power);
-	}
-	if (sensor_cpu_temp) {
-		device_remove_file(wf_smu_dev, &dev_attr_cpu_temp);
+	if (sensor_cpu_temp)
 		wf_put_sensor(sensor_cpu_temp);
-	}
-	if (sensor_hd_temp) {
-		device_remove_file(wf_smu_dev, &dev_attr_hd_temp);
+	if (sensor_hd_temp)
 		wf_put_sensor(sensor_hd_temp);
-	}
 
 	/* Release all controls */
-	if (fan_cpu_main) {
-		device_remove_file(wf_smu_dev, &dev_attr_cpu_fan);
+	if (fan_cpu_main)
 		wf_put_control(fan_cpu_main);
-	}
-	if (fan_hd) {
-		device_remove_file(wf_smu_dev, &dev_attr_hd_fan);
+	if (fan_hd)
 		wf_put_control(fan_hd);
-	}
-	if (fan_system) {
-		device_remove_file(wf_smu_dev, &dev_attr_sys_fan);
+	if (fan_system)
 		wf_put_control(fan_system);
-	}
 	if (cpufreq_clamp)
 		wf_put_control(cpufreq_clamp);
 

From torvalds at osdl.org  Wed Feb  8 17:07:30 2006
From: torvalds at osdl.org (Linus Torvalds)
Date: Tue, 7 Feb 2006 22:07:30 -0800 (PST)
Subject: [PATCH] powerpc: Thermal control for dual core G5s
In-Reply-To: <1139377372.8187.16.camel@localhost.localdomain>
References: <1139377372.8187.16.camel@localhost.localdomain>
Message-ID: <Pine.LNX.4.64.0602072206290.2458@g5.osdl.org>


On Wed, 8 Feb 2006, Benjamin Herrenschmidt wrote:
>
> This patch adds a windfarm module, windfarm_pm112, for the dual core G5s
> (both 2 and 4 core models), keeping the machine from getting into
> vacuum-cleaner mode ;)

This seems to introduce a new warning..

  arch/powerpc/platforms/83xx/Kconfig:10:
	warning: 'select' used by config symbol 'MPC834x_SYS' refer to undefined symbol 'DEFAULT_UIMAGE'

  drivers/macintosh/Kconfig:193:
	warning: 'select' used by config symbol 'WINDFARM_PM112' refer to undefined symbol 'I2C_PMAC_SMU'


Hmm?

		Linus


From benh at kernel.crashing.org  Wed Feb  8 17:38:51 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 08 Feb 2006 17:38:51 +1100
Subject: [PATCH] powerpc: Thermal control for dual core G5s
In-Reply-To: <Pine.LNX.4.64.0602072206290.2458@g5.osdl.org>
References: <1139377372.8187.16.camel@localhost.localdomain>
	<Pine.LNX.4.64.0602072206290.2458@g5.osdl.org>
Message-ID: <1139380731.5003.1.camel@localhost.localdomain>

On Tue, 2006-02-07 at 22:07 -0800, Linus Torvalds wrote:
> 
> On Wed, 8 Feb 2006, Benjamin Herrenschmidt wrote:
> >
> > This patch adds a windfarm module, windfarm_pm112, for the dual core G5s
> > (both 2 and 4 core models), keeping the machine from getting into
> > vacuum-cleaner mode ;)
> 
> This seems to introduce a new warning..
> 
>   arch/powerpc/platforms/83xx/Kconfig:10:
> 	warning: 'select' used by config symbol 'MPC834x_SYS' refer to undefined symbol 'DEFAULT_UIMAGE'

Former is not mine...

>   drivers/macintosh/Kconfig:193:
> 	warning: 'select' used by config symbol 'WINDFARM_PM112' refer to undefined symbol 'I2C_PMAC_SMU'

OK, it looks like I forgot to update the Kconfig for the new i2c driver; it
should select I2C_POWERMAC instead. Do you want a new patch, or can you
just fix it there?

Ben.


From galak at kernel.crashing.org  Thu Feb  9 01:52:33 2006
From: galak at kernel.crashing.org (Kumar Gala)
Date: Wed, 8 Feb 2006 08:52:33 -0600
Subject: [PATCH] powerpc: Thermal control for dual core G5s
In-Reply-To: <Pine.LNX.4.64.0602072206290.2458@g5.osdl.org>
References: <1139377372.8187.16.camel@localhost.localdomain>
	<Pine.LNX.4.64.0602072206290.2458@g5.osdl.org>
Message-ID: <87E547D2-A8FC-4A89-89A4-60313C9647B0@kernel.crashing.org>


On Feb 8, 2006, at 12:07 AM, Linus Torvalds wrote:

>
>
> On Wed, 8 Feb 2006, Benjamin Herrenschmidt wrote:
>>
>> This patch adds a windfarm module, windfarm_pm112, for the dual  
>> core G5s
>> (both 2 and 4 core models), keeping the machine from getting into
>> vacuum-cleaner mode ;)
>
> This seems to introduce a new warning..
>
>   arch/powerpc/platforms/83xx/Kconfig:10:
> 	warning: 'select' used by config symbol 'MPC834x_SYS' refer to  
> undefined symbol 'DEFAULT_UIMAGE'

That's my fault.  Paul didn't push a simple build system update.  I'll  
ask him to do so.

http://ozlabs.org/pipermail/linuxppc-dev/2006-January/020980.html

- kumar


From dwmw2 at infradead.org  Thu Feb  9 05:01:00 2006
From: dwmw2 at infradead.org (David Woodhouse)
Date: Wed, 08 Feb 2006 18:01:00 +0000
Subject: PPC64 boot failure with 2.6.15
In-Reply-To: <1138660828.12601.21.camel@localhost.localdomain>
References: <200601251821.47557.pat@computer-refuge.org>
	<200601252307.45741.pat@computer-refuge.org>
	<200601260051.00902.pat@computer-refuge.org>
	<200601262025.44777.pat@computer-refuge.org>
	<1138660828.12601.21.camel@localhost.localdomain>
Message-ID: <1139421660.4183.15.camel@pmac.infradead.org>

On Tue, 2006-01-31 at 09:40 +1100, Benjamin Herrenschmidt wrote:
> Interesting... best would be to try to bisect to find out what
> specific patch broke it but I understand that's not easy with those
> old kernels that were maintained with bitkeeper... Maybe you could try
> to spot which daily bk snapshot broke it if they are still available
> somewhere ?

What's wrong with just using 'git bisect' on the BK->git converted tree?

-- 
dwmw2


From linas at austin.ibm.com  Thu Feb  9 05:29:13 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Wed, 8 Feb 2006 12:29:13 -0600
Subject: [PATCH]: Documentation: Updated PCI Error Recovery
In-Reply-To: <20060207231927.GB19648@kroah.com>
References: <20060203000602.GQ24916@austin.ibm.com>
	<20060207222144.GA15622@kroah.com>
	<20060207143052.19978ca7.akpm@osdl.org>
	<20060207223956.GA19009@kroah.com>
	<20060207145347.72c0a77e.akpm@osdl.org>
	<20060207231927.GB19648@kroah.com>
Message-ID: <20060208182913.GQ24916@austin.ibm.com>

On Tue, Feb 07, 2006 at 03:19:27PM -0800, Greg KH was heard to remark:
> > It could be all the newly-added trailing whitespace I chopped off.
> 
> Yup, that was it, quilt would have stripped them off for me too.  Linas,
> please don't do this anymore...

Sorry; I'm usually good about that in code, but the Pavlovian
reaction didn't trip on docs.

--linas


From geoffrey.levand at am.sony.com  Thu Feb  9 14:08:48 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Wed, 08 Feb 2006 19:08:48 -0800
Subject: [PATCH] fix prom_init undefined error
In-Reply-To: <1139371162.8187.3.camel@localhost.localdomain>
References: <1139371162.8187.3.camel@localhost.localdomain>
Message-ID: <43EAB240.8030107@am.sony.com>

Benjamin Herrenschmidt wrote:
> On Tue, 2006-02-07 at 15:10 -0800, Geoff Levand wrote:
> 
>>Paul,
>>
>>This patch fixes a build error when CONFIG_PPC_OF=n,
>>CONFIG_PPC_MULTIPLATFORM=y.  It makes the conditionals
>>consistent in arch/powerpc/kernel/Makefile and head_64.S
>>to both be on CONFIG_PPC_OF.
>>
>>  arch/powerpc/kernel/head_64.o: In function `.__boot_from_prom':
>>  linux/arch/powerpc/kernel/head_64.S:(.text+0x8158): undefined
> 
> reference to `.prom_init'
> 
>>obj-$(CONFIG_PPC_OF) += prom_init.o
> 
> 
> With ARCH=powerpc, CONFIG_PPC_OF should always be set. It's supposed to
> be set when the device-tree accessors exist which they always do.

OK, that makes things clear.  

> Besides, I'll be removing support for !MULTIPLATFORM too :) (Except for
> iSeries at least for a little while). Look at the patch I posted that
> removes _machine for an idea of where things are going. 

I'll try and rework things when you get rid of MULTIPLATFORM.  When can 
we expect it?

-Geoff


From benh at kernel.crashing.org  Thu Feb  9 14:44:45 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 09 Feb 2006 14:44:45 +1100
Subject: [PATCH] fix prom_init undefined error
In-Reply-To: <43EAB240.8030107@am.sony.com>
References: <1139371162.8187.3.camel@localhost.localdomain>
	<43EAB240.8030107@am.sony.com>
Message-ID: <1139456685.5003.33.camel@localhost.localdomain>


> > Besides, I'll be removing support for !MULTIPLATFORM too :) (Except for
> > iSeries at least for a little while). Look at the patch I posted that
> > removes _machine for an idea of where things are going. 
> 
> I'll try and rework things when you get rid of MULTIPLATFORM.  When can 
> we expect it?

Well, I posted a first patch that removes _machine and brings the
platform "probe" closer between 32 and 64 bits a few weeks ago,
before I went on vacation... I'll revive that patch today or tomorrow
and have it merged in powerpc.git. At which point I'll tackle, in no
specific order, removing the pre-parsed interrupt stuff in device_node
(and implementing a proper generic OF interrupt tree parser), making the
early init code between 32 and 64 bits even more similar (see the discussion
about boot code I posted a month or two ago), etc...

I'm sorry I can't promise any timeframe at this point though due to
personal constraints (just had a baby).

Ben.


From bdc at carlstrom.com  Thu Feb  9 17:02:37 2006
From: bdc at carlstrom.com (Brian D. Carlstrom)
Date: Wed, 8 Feb 2006 22:02:37 -0800
Subject: G5 fan problems return moving to 2.6.15 with dual processor
	2.7GHz machine
In-Reply-To: <1139371016.8187.1.camel@localhost.localdomain>
References: <20060205061048.7261.qmail@electricrain.com>
	<1139130385.5634.14.camel@localhost.localdomain>
	<17384.62553.442011.514155@zot.electricrain.com>
	<1139371016.8187.1.camel@localhost.localdomain>
Message-ID: <17386.56061.78892.44180@zot.electricrain.com>

Benjamin Herrenschmidt writes:
 > prom_printf should work ... try booting manually (from the OF command
 > line) and maybe comment out the code that opens the displays... (it
 > may be clearing the screen)....

I tried commenting out prom_check_displays and that does prevent the
clearing of the screen, but there is still no visible prom_printf output. The
last output seems to be from yaboot; it's certainly not one of the
prom_print messages from prom_init.c. For good measure, I also tried
adding "video=ofonly". Still no prom_printf output visible.

However, when I rebooted back to my 2.6.14 kernel, I saw the usual
prom_printf messages from prom_init without any changes. I reviewed the
prom_init.c diffs between 2.6.14 and 2.6.15, but they are large enough
that it's not easy to spot an obvious problem.

In any case, I didn't have much time to really look at this today, just
enough to try disabling prom_check_displays. I'll have to look more on
Friday.

-bri


From michael at ellerman.id.au  Thu Feb  9 17:03:27 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 09 Feb 2006 17:03:27 +1100
Subject: [PATCH 1/3] powerpc: Clean up pSeries firmware feature initialisation
Message-ID: <1139465007.297357.792110844862.qpush@concordia>

Clean up fw_feature_init in platforms/pseries/setup.c. Clean up white space
and replace the while loop with a for loop - which seems clearer to me.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---
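
For anyone reading along, here is a minimal userspace sketch of the loop
shape the patch moves to: walking a property that packs NUL-terminated
strings back to back and OR-ing in a bit for each name found in a table.
The table contents, the bit values and the name parse_features() are
made up for illustration; this is not the kernel code, and it assumes
the last string in the buffer is NUL-terminated within len.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table for illustration; not the kernel's feature table. */
struct feature {
	unsigned long val;
	const char *name;
};

static const struct feature feature_table[] = {
	{ 1UL << 0, "hcall-pft" },
	{ 1UL << 1, "hcall-tce" },
	{ 1UL << 2, "hcall-dabr" },
};

#define NUM_FEATURES (sizeof(feature_table) / sizeof(feature_table[0]))

/*
 * Walk a buffer of NUL-terminated strings packed back to back and
 * return the OR of the bits whose names appear in it.  Assumes the
 * final string is terminated inside the first len bytes.
 */
unsigned long parse_features(const char *prop, int len)
{
	unsigned long features = 0;
	const char *s;
	size_t i;

	/* Outer loop: step from one string to the next. */
	for (s = prop; s < prop + len; s += strlen(s) + 1) {
		/* Inner loop: check the string against the table. */
		for (i = 0; i < NUM_FEATURES; i++) {
			if (strcmp(feature_table[i].name, s))
				continue;

			/* we have a match */
			features |= feature_table[i].val;
			break;
		}
	}

	return features;
}
```

Unknown names are simply skipped, and a zero-length property yields an
empty mask, since the outer loop never runs.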

 arch/powerpc/platforms/pseries/setup.c |   45 +++++++++++++++------------------
 1 files changed, 21 insertions(+), 24 deletions(-)

Index: to-merge/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/setup.c
+++ to-merge/arch/powerpc/platforms/pseries/setup.c
@@ -263,48 +263,45 @@ static int __init pSeries_init_panel(voi
 arch_initcall(pSeries_init_panel);
 
 
-/* Build up the ppc64_firmware_features bitmask field
- * using contents of device-tree/ibm,hypertas-functions.
- * Ultimately this functionality may be moved into prom.c prom_init().
+/* Build up the firmware features bitmask using the contents of
+ * device-tree/ibm,hypertas-functions.  Ultimately this functionality may
+ * be moved into prom.c prom_init().
  */
 static void __init fw_feature_init(void)
 {
-	struct device_node * dn;
-	char * hypertas;
-	unsigned int len;
+	struct device_node *dn;
+	char *hypertas, *s;
+	int len, i;
 
 	DBG(" -> fw_feature_init()\n");
 
-	ppc64_firmware_features = 0;
 	dn = of_find_node_by_path("/rtas");
 	if (dn == NULL) {
-		printk(KERN_ERR "WARNING ! Cannot find RTAS in device-tree !\n");
+		printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n");
 		goto no_rtas;
 	}
 
 	hypertas = get_property(dn, "ibm,hypertas-functions", &len);
-	if (hypertas) {
-		while (len > 0){
-			int i, hypertas_len;
+	if (hypertas == NULL)
+		goto no_hypertas;
+
+	for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) {
+		for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) {
 			/* check value against table of strings */
-			for(i=0; i < FIRMWARE_MAX_FEATURES ;i++) {
-				if ((firmware_features_table[i].name) &&
-				    (strcmp(firmware_features_table[i].name,hypertas))==0) {
-					/* we have a match */
-					ppc64_firmware_features |= 
-						(firmware_features_table[i].val);
-					break;
-				} 
-			}
-			hypertas_len = strlen(hypertas);
-			len -= hypertas_len +1;
-			hypertas+= hypertas_len +1;
+			if (!firmware_features_table[i].name ||
+			    strcmp(firmware_features_table[i].name, s))
+				continue;
+
+			/* we have a match */
+			ppc64_firmware_features |=
+				firmware_features_table[i].val;
+			break;
 		}
 	}
 
+no_hypertas:
 	of_node_put(dn);
 no_rtas:
-
 	DBG(" <- fw_feature_init()\n");
 }
 

From michael at ellerman.id.au  Thu Feb  9 17:03:33 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 09 Feb 2006 17:03:33 +1100
Subject: [PATCH 2/3] powerpc: Move pSeries firmware feature setup into
	platforms/pseries
In-Reply-To: <1139465007.297357.792110844862.qpush@concordia>
Message-ID: <20060209060356.5606C679F6@ozlabs.org>

Currently we have some stuff in firmware.h and kernel/firmware.c that is
#ifdef CONFIG_PPC_PSERIES. Move it all into platforms/pseries.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/kernel/firmware.c            |   25 -------
 arch/powerpc/platforms/pseries/Makefile   |    3 
 arch/powerpc/platforms/pseries/firmware.c |  104 ++++++++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/firmware.h |   17 ++++
 arch/powerpc/platforms/pseries/setup.c    |   46 -------------
 include/asm-powerpc/firmware.h            |    9 --
 6 files changed, 124 insertions(+), 80 deletions(-)

Index: to-merge/arch/powerpc/kernel/firmware.c
===================================================================
--- to-merge.orig/arch/powerpc/kernel/firmware.c
+++ to-merge/arch/powerpc/kernel/firmware.c
@@ -18,28 +18,3 @@
 #include <asm/firmware.h>
 
 unsigned long ppc64_firmware_features;
-
-#ifdef CONFIG_PPC_PSERIES
-firmware_feature_t firmware_features_table[FIRMWARE_MAX_FEATURES] = {
-	{FW_FEATURE_PFT,		"hcall-pft"},
-	{FW_FEATURE_TCE,		"hcall-tce"},
-	{FW_FEATURE_SPRG0,		"hcall-sprg0"},
-	{FW_FEATURE_DABR,		"hcall-dabr"},
-	{FW_FEATURE_COPY,		"hcall-copy"},
-	{FW_FEATURE_ASR,		"hcall-asr"},
-	{FW_FEATURE_DEBUG,		"hcall-debug"},
-	{FW_FEATURE_PERF,		"hcall-perf"},
-	{FW_FEATURE_DUMP,		"hcall-dump"},
-	{FW_FEATURE_INTERRUPT,		"hcall-interrupt"},
-	{FW_FEATURE_MIGRATE,		"hcall-migrate"},
-	{FW_FEATURE_PERFMON,		"hcall-perfmon"},
-	{FW_FEATURE_CRQ,		"hcall-crq"},
-	{FW_FEATURE_VIO,		"hcall-vio"},
-	{FW_FEATURE_RDMA,		"hcall-rdma"},
-	{FW_FEATURE_LLAN,		"hcall-lLAN"},
-	{FW_FEATURE_BULK,		"hcall-bulk"},
-	{FW_FEATURE_XDABR,		"hcall-xdabr"},
-	{FW_FEATURE_MULTITCE,		"hcall-multi-tce"},
-	{FW_FEATURE_SPLPAR,		"hcall-splpar"},
-};
-#endif
Index: to-merge/arch/powerpc/platforms/pseries/Makefile
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/Makefile
+++ to-merge/arch/powerpc/platforms/pseries/Makefile
@@ -1,5 +1,6 @@
 obj-y			:= pci.o lpar.o hvCall.o nvram.o reconfig.o \
-			   setup.o iommu.o ras.o rtasd.o pci_dlpar.o
+			   setup.o iommu.o ras.o rtasd.o pci_dlpar.o \
+			   firmware.o
 obj-$(CONFIG_SMP)	+= smp.o
 obj-$(CONFIG_IBMVIO)	+= vio.o
 obj-$(CONFIG_XICS)	+= xics.o
Index: to-merge/arch/powerpc/platforms/pseries/firmware.c
===================================================================
--- /dev/null
+++ to-merge/arch/powerpc/platforms/pseries/firmware.c
@@ -0,0 +1,104 @@
+/*
+ *  pSeries firmware setup code.
+ *
+ *  Portions from arch/powerpc/platforms/pseries/setup.c:
+ *   Copyright (C) 1995  Linus Torvalds
+ *   Adapted from 'alpha' version by Gary Thomas
+ *   Modified by Cort Dougan (cort at cs.nmt.edu)
+ *   Modified by PPC64 Team, IBM Corp
+ *
+ *  Portions from arch/powerpc/kernel/firmware.c
+ *   Copyright (C) 2001 Ben. Herrenschmidt (benh at kernel.crashing.org)
+ *   Modifications for ppc64:
+ *    Copyright (C) 2003 Dave Engebretsen <engebret at us.ibm.com>
+ *    Copyright (C) 2005 Stephen Rothwell, IBM Corporation
+ *
+ *  Copyright 2006 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#undef DEBUG
+
+#include <asm/firmware.h>
+#include <asm/prom.h>
+
+#ifdef DEBUG
+#define DBG(fmt...) udbg_printf(fmt)
+#else
+#define DBG(fmt...)
+#endif
+
+typedef struct {
+    unsigned long val;
+    char * name;
+} firmware_feature_t;
+
+static __initdata firmware_feature_t
+firmware_features_table[FIRMWARE_MAX_FEATURES] = {
+	{FW_FEATURE_PFT,		"hcall-pft"},
+	{FW_FEATURE_TCE,		"hcall-tce"},
+	{FW_FEATURE_SPRG0,		"hcall-sprg0"},
+	{FW_FEATURE_DABR,		"hcall-dabr"},
+	{FW_FEATURE_COPY,		"hcall-copy"},
+	{FW_FEATURE_ASR,		"hcall-asr"},
+	{FW_FEATURE_DEBUG,		"hcall-debug"},
+	{FW_FEATURE_PERF,		"hcall-perf"},
+	{FW_FEATURE_DUMP,		"hcall-dump"},
+	{FW_FEATURE_INTERRUPT,		"hcall-interrupt"},
+	{FW_FEATURE_MIGRATE,		"hcall-migrate"},
+	{FW_FEATURE_PERFMON,		"hcall-perfmon"},
+	{FW_FEATURE_CRQ,		"hcall-crq"},
+	{FW_FEATURE_VIO,		"hcall-vio"},
+	{FW_FEATURE_RDMA,		"hcall-rdma"},
+	{FW_FEATURE_LLAN,		"hcall-lLAN"},
+	{FW_FEATURE_BULK,		"hcall-bulk"},
+	{FW_FEATURE_XDABR,		"hcall-xdabr"},
+	{FW_FEATURE_MULTITCE,		"hcall-multi-tce"},
+	{FW_FEATURE_SPLPAR,		"hcall-splpar"},
+};
+
+/* Build up the firmware features bitmask using the contents of
+ * device-tree/ibm,hypertas-functions.  Ultimately this functionality may
+ * be moved into prom.c prom_init().
+ */
+void __init fw_feature_init(void)
+{
+	struct device_node *dn;
+	char *hypertas, *s;
+	int len, i;
+
+	DBG(" -> fw_feature_init()\n");
+
+	dn = of_find_node_by_path("/rtas");
+	if (dn == NULL) {
+		printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n");
+		goto no_rtas;
+	}
+
+	hypertas = get_property(dn, "ibm,hypertas-functions", &len);
+	if (hypertas == NULL)
+		goto no_hypertas;
+
+	for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) {
+		for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) {
+			/* check value against table of strings */
+			if (!firmware_features_table[i].name ||
+			    strcmp(firmware_features_table[i].name, s))
+				continue;
+
+			/* we have a match */
+			ppc64_firmware_features |=
+				firmware_features_table[i].val;
+			break;
+		}
+	}
+
+no_hypertas:
+	of_node_put(dn);
+no_rtas:
+	DBG(" <- fw_feature_init()\n");
+}
Index: to-merge/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/setup.c
+++ to-merge/arch/powerpc/platforms/pseries/setup.c
@@ -60,7 +60,6 @@
 #include <asm/time.h>
 #include <asm/nvram.h>
 #include "xics.h"
-#include <asm/firmware.h>
 #include <asm/pmc.h>
 #include <asm/mpic.h>
 #include <asm/ppc-pci.h>
@@ -70,6 +69,7 @@
 
 #include "plpar_wrappers.h"
 #include "ras.h"
+#include "firmware.h"
 
 #ifdef DEBUG
 #define DBG(fmt...) udbg_printf(fmt)
@@ -262,50 +262,6 @@ static int __init pSeries_init_panel(voi
 }
 arch_initcall(pSeries_init_panel);
 
-
-/* Build up the firmware features bitmask using the contents of
- * device-tree/ibm,hypertas-functions.  Ultimately this functionality may
- * be moved into prom.c prom_init().
- */
-static void __init fw_feature_init(void)
-{
-	struct device_node *dn;
-	char *hypertas, *s;
-	int len, i;
-
-	DBG(" -> fw_feature_init()\n");
-
-	dn = of_find_node_by_path("/rtas");
-	if (dn == NULL) {
-		printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n");
-		goto no_rtas;
-	}
-
-	hypertas = get_property(dn, "ibm,hypertas-functions", &len);
-	if (hypertas == NULL)
-		goto no_hypertas;
-
-	for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) {
-		for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) {
-			/* check value against table of strings */
-			if (!firmware_features_table[i].name ||
-			    strcmp(firmware_features_table[i].name, s))
-				continue;
-
-			/* we have a match */
-			ppc64_firmware_features |=
-				firmware_features_table[i].val;
-			break;
-		}
-	}
-
-no_hypertas:
-	of_node_put(dn);
-no_rtas:
-	DBG(" <- fw_feature_init()\n");
-}
-
-
 static  void __init pSeries_discover_pic(void)
 {
 	struct device_node *np;
Index: to-merge/include/asm-powerpc/firmware.h
===================================================================
--- to-merge.orig/include/asm-powerpc/firmware.h
+++ to-merge/include/asm-powerpc/firmware.h
@@ -89,15 +89,6 @@ static inline unsigned long firmware_has
 		(FW_FEATURE_POSSIBLE & ppc64_firmware_features & feature);
 }
 
-#ifdef CONFIG_PPC_PSERIES
-typedef struct {
-    unsigned long val;
-    char * name;
-} firmware_feature_t;
-
-extern firmware_feature_t firmware_features_table[];
-#endif
-
 extern void system_reset_fwnmi(void);
 extern void machine_check_fwnmi(void);
 
Index: to-merge/arch/powerpc/platforms/pseries/firmware.h
===================================================================
--- /dev/null
+++ to-merge/arch/powerpc/platforms/pseries/firmware.h
@@ -0,0 +1,17 @@
+/*
+ * Copyright 2006 IBM Corporation.
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _PSERIES_FIRMWARE_H
+#define _PSERIES_FIRMWARE_H
+
+#include <asm/firmware.h>
+
+extern void __init fw_feature_init(void);
+
+#endif /* _PSERIES_FIRMWARE_H */


From michael at ellerman.id.au  Thu Feb  9 17:03:35 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 09 Feb 2006 17:03:35 +1100
Subject: [PATCH 3/3] powerpc: Replace platform_is_lpar() with a firmware
	feature
In-Reply-To: <1139465007.297357.792110844862.qpush@concordia>
Message-ID: <20060209060359.2F789679F7@ozlabs.org>

It has been decreed that platform numbers are evil, so as a step in that
direction, replace platform_is_lpar() with a FW_FEATURE_LPAR bit.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---
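
To illustrate the idea, here is a small standalone sketch of a feature
bit that is set once at probe time and then tested through a mask,
rather than comparing platform numbers at every call site. All DEMO_*
and demo_* names are hypothetical, and the check is a simplification of
what firmware_has_feature() does, not a copy of it.

```c
/* Hypothetical feature bits, for illustration only. */
#define DEMO_FEATURE_DABR	(1UL << 0)
#define DEMO_FEATURE_LPAR	(1UL << 1)

/* Features this build could ever see; anything else always tests false. */
#define DEMO_FEATURE_POSSIBLE	(DEMO_FEATURE_DABR | DEMO_FEATURE_LPAR)

static unsigned long demo_firmware_features;

/* Probe time: record "running under a hypervisor" as a feature bit. */
void demo_probe(int is_lpar)
{
	demo_firmware_features = 0;
	if (is_lpar)
		demo_firmware_features |= DEMO_FEATURE_LPAR;
}

/* Callers test a bit; they never look at a platform number. */
int demo_has_feature(unsigned long feature)
{
	return (DEMO_FEATURE_POSSIBLE & demo_firmware_features & feature) != 0;
}
```

Because the "possible" mask is a compile-time constant, a test for a bit
that is not in it can be folded to 0 by the compiler.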

 arch/powerpc/mm/hash_utils_64.c         |    4 ++--
 arch/powerpc/oprofile/op_model_power4.c |    3 ++-
 arch/powerpc/platforms/iseries/setup.c  |   10 +++++++---
 arch/powerpc/platforms/pseries/iommu.c  |    2 +-
 arch/powerpc/platforms/pseries/setup.c  |   11 +++++++----
 arch/powerpc/platforms/pseries/smp.c    |    2 +-
 arch/powerpc/platforms/pseries/xics.c   |    3 ++-
 include/asm-powerpc/firmware.h          |    7 ++++---
 include/asm-powerpc/processor.h         |    1 -
 9 files changed, 26 insertions(+), 17 deletions(-)

Index: to-merge/include/asm-powerpc/firmware.h
===================================================================
--- to-merge.orig/include/asm-powerpc/firmware.h
+++ to-merge/include/asm-powerpc/firmware.h
@@ -41,6 +41,7 @@
 #define FW_FEATURE_MULTITCE	(1UL<<19)
 #define FW_FEATURE_SPLPAR	(1UL<<20)
 #define FW_FEATURE_ISERIES	(1UL<<21)
+#define FW_FEATURE_LPAR		(1UL<<22)
 
 enum {
 #ifdef CONFIG_PPC64
@@ -51,10 +52,10 @@ enum {
 		FW_FEATURE_MIGRATE | FW_FEATURE_PERFMON | FW_FEATURE_CRQ |
 		FW_FEATURE_VIO | FW_FEATURE_RDMA | FW_FEATURE_LLAN |
 		FW_FEATURE_BULK | FW_FEATURE_XDABR | FW_FEATURE_MULTITCE |
-		FW_FEATURE_SPLPAR,
+		FW_FEATURE_SPLPAR | FW_FEATURE_LPAR,
 	FW_FEATURE_PSERIES_ALWAYS = 0,
-	FW_FEATURE_ISERIES_POSSIBLE = FW_FEATURE_ISERIES,
-	FW_FEATURE_ISERIES_ALWAYS = FW_FEATURE_ISERIES,
+	FW_FEATURE_ISERIES_POSSIBLE = FW_FEATURE_ISERIES | FW_FEATURE_LPAR,
+	FW_FEATURE_ISERIES_ALWAYS = FW_FEATURE_ISERIES | FW_FEATURE_LPAR,
 	FW_FEATURE_POSSIBLE =
 #ifdef CONFIG_PPC_PSERIES
 		FW_FEATURE_PSERIES_POSSIBLE |
Index: to-merge/arch/powerpc/mm/hash_utils_64.c
===================================================================
--- to-merge.orig/arch/powerpc/mm/hash_utils_64.c
+++ to-merge/arch/powerpc/mm/hash_utils_64.c
@@ -421,7 +421,7 @@ void __init htab_initialize(void)
 
 	htab_hash_mask = pteg_count - 1;
 
-	if (platform_is_lpar()) {
+	if (firmware_has_feature(FW_FEATURE_LPAR)) {
 		/* Using a hypervisor which owns the htab */
 		htab_address = NULL;
 		_SDR1 = 0; 
@@ -515,7 +515,7 @@ void __init htab_initialize(void)
 
 void htab_initialize_secondary(void)
 {
-	if (!platform_is_lpar())
+	if (!firmware_has_feature(FW_FEATURE_LPAR))
 		mtspr(SPRN_SDR1, _SDR1);
 }
 
Index: to-merge/arch/powerpc/oprofile/op_model_power4.c
===================================================================
--- to-merge.orig/arch/powerpc/oprofile/op_model_power4.c
+++ to-merge/arch/powerpc/oprofile/op_model_power4.c
@@ -10,6 +10,7 @@
 #include <linux/oprofile.h>
 #include <linux/init.h>
 #include <linux/smp.h>
+#include <asm/firmware.h>
 #include <asm/ptrace.h>
 #include <asm/system.h>
 #include <asm/processor.h>
@@ -232,7 +233,7 @@ static unsigned long get_pc(struct pt_re
 	mmcra = mfspr(SPRN_MMCRA);
 
 	/* Were we in the hypervisor? */
-	if (platform_is_lpar() && (mmcra & MMCRA_SIHV))
+	if (firmware_has_feature(FW_FEATURE_LPAR) && (mmcra & MMCRA_SIHV))
 		/* function descriptor madness */
 		return *((unsigned long *)hypervisor_bucket);
 
Index: to-merge/arch/powerpc/platforms/pseries/iommu.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/iommu.c
+++ to-merge/arch/powerpc/platforms/pseries/iommu.c
@@ -582,7 +582,7 @@ void iommu_init_early_pSeries(void)
 		return;
 	}
 
-	if (platform_is_lpar()) {
+	if (firmware_has_feature(FW_FEATURE_LPAR)) {
 		if (firmware_has_feature(FW_FEATURE_MULTITCE)) {
 			ppc_md.tce_build = tce_buildmulti_pSeriesLP;
 			ppc_md.tce_free	 = tce_freemulti_pSeriesLP;
Index: to-merge/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/setup.c
+++ to-merge/arch/powerpc/platforms/pseries/setup.c
@@ -246,7 +246,7 @@ static void __init pSeries_setup_arch(vo
 		ppc_md.idle_loop = default_idle;
 	}
 
-	if (platform_is_lpar())
+	if (firmware_has_feature(FW_FEATURE_LPAR))
 		ppc_md.enable_pmcs = pseries_lpar_enable_pmcs;
 	else
 		ppc_md.enable_pmcs = power4_enable_pmcs;
@@ -326,7 +326,7 @@ static void __init pSeries_init_early(vo
 
 	fw_feature_init();
 	
-	if (platform_is_lpar())
+	if (firmware_has_feature(FW_FEATURE_LPAR))
 		hpte_init_lpar();
 	else {
 		hpte_init_native();
@@ -334,7 +334,7 @@ static void __init pSeries_init_early(vo
 			     get_property(of_chosen, "linux,iommu-off", NULL));
 	}
 
-	if (platform_is_lpar())
+	if (firmware_has_feature(FW_FEATURE_LPAR))
 		find_udbg_vterm();
 
 	if (firmware_has_feature(FW_FEATURE_DABR))
@@ -390,6 +390,9 @@ static int __init pSeries_probe(int plat
 	 * it here ...
 	 */
 
+	if (platform == PLATFORM_PSERIES_LPAR)
+		ppc64_firmware_features |= FW_FEATURE_LPAR;
+
 	return 1;
 }
 
@@ -529,7 +532,7 @@ static void pseries_shared_idle(void)
 
 static int pSeries_pci_probe_mode(struct pci_bus *bus)
 {
-	if (platform_is_lpar())
+	if (firmware_has_feature(FW_FEATURE_LPAR))
 		return PCI_PROBE_DEVTREE;
 	return PCI_PROBE_NORMAL;
 }
Index: to-merge/arch/powerpc/platforms/pseries/smp.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/smp.c
+++ to-merge/arch/powerpc/platforms/pseries/smp.c
@@ -443,7 +443,7 @@ void __init smp_init_pSeries(void)
 	smp_ops->cpu_die = pSeries_cpu_die;
 
 	/* Processors can be added/removed only on LPAR */
-	if (platform_is_lpar())
+	if (firmware_has_feature(FW_FEATURE_LPAR))
 		pSeries_reconfig_notifier_register(&pSeries_smp_nb);
 #endif
 
Index: to-merge/arch/powerpc/platforms/pseries/xics.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/xics.c
+++ to-merge/arch/powerpc/platforms/pseries/xics.c
@@ -20,6 +20,7 @@
 #include <linux/gfp.h>
 #include <linux/radix-tree.h>
 #include <linux/cpu.h>
+#include <asm/firmware.h>
 #include <asm/prom.h>
 #include <asm/io.h>
 #include <asm/pgtable.h>
@@ -536,7 +537,7 @@ nextnode:
 		of_node_put(np);
 	}
 
-	if (platform_is_lpar())
+	if (firmware_has_feature(FW_FEATURE_LPAR))
 		ops = &pSeriesLP_ops;
 	else {
 #ifdef CONFIG_SMP
Index: to-merge/arch/powerpc/platforms/iseries/setup.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/iseries/setup.c
+++ to-merge/arch/powerpc/platforms/iseries/setup.c
@@ -303,8 +303,6 @@ static void __init iSeries_init_early(vo
 {
 	DBG(" -> iSeries_init_early()\n");
 
-	ppc64_firmware_features = FW_FEATURE_ISERIES;
-
 	ppc64_interrupt_controller = IC_ISERIES;
 
 #if defined(CONFIG_BLK_DEV_INITRD)
@@ -710,7 +708,13 @@ void __init iSeries_init_IRQ(void) { }
 
 static int __init iseries_probe(int platform)
 {
-	return PLATFORM_ISERIES_LPAR == platform;
+	if (PLATFORM_ISERIES_LPAR != platform)
+		return 0;
+
+	ppc64_firmware_features |= FW_FEATURE_ISERIES;
+	ppc64_firmware_features |= FW_FEATURE_LPAR;
+
+	return 1;
 }
 
 struct machdep_calls __initdata iseries_md = {
Index: to-merge/include/asm-powerpc/processor.h
===================================================================
--- to-merge.orig/include/asm-powerpc/processor.h
+++ to-merge/include/asm-powerpc/processor.h
@@ -52,7 +52,6 @@
 #ifdef __KERNEL__
 #define platform_is_pseries()	(_machine == PLATFORM_PSERIES || \
 				 _machine == PLATFORM_PSERIES_LPAR)
-#define platform_is_lpar()	(!!(_machine & PLATFORM_LPAR))
 
 #if defined(CONFIG_PPC_MULTIPLATFORM)
 extern int _machine;


From ntl at pobox.com  Thu Feb  9 18:23:32 2006
From: ntl at pobox.com (Nathan Lynch)
Date: Thu, 9 Feb 2006 01:23:32 -0600
Subject: [PATCH 1/3] powerpc: Clean up pSeries firmware feature
	initialisation
In-Reply-To: <1139465007.297357.792110844862.qpush@concordia>
References: <1139465007.297357.792110844862.qpush@concordia>
Message-ID: <20060209072331.GK18730@localhost.localdomain>

Michael Ellerman wrote:

...

>  	DBG(" -> fw_feature_init()\n");
>  
> -	ppc64_firmware_features = 0;
>  	dn = of_find_node_by_path("/rtas");
>  	if (dn == NULL) {
> -		printk(KERN_ERR "WARNING ! Cannot find RTAS in device-tree !\n");
> +		printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n");
>  		goto no_rtas;
>  	}
>  
>  	hypertas = get_property(dn, "ibm,hypertas-functions", &len);
> -	if (hypertas) {
> -		while (len > 0){
> -			int i, hypertas_len;
> +	if (hypertas == NULL)
> +		goto no_hypertas;

...

> +no_hypertas:
>  	of_node_put(dn);
>  no_rtas:
> -
>  	DBG(" <- fw_feature_init()\n");
>  }

of_node_put can handle a null pointer fine, so you could get away with
just one label here.
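
The single-label shape suggested here can be sketched in plain C. This is only a userspace illustration under stated assumptions: struct node, node_put() and feature_init() are hypothetical stand-ins, and node_put() merely mimics the NULL-tolerant behaviour described above for of_node_put().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct device_node; only a refcount. */
struct node {
	int refcount;
};

/* Mimics the NULL-tolerant behaviour described for of_node_put(). */
static void node_put(struct node *n)
{
	if (n != NULL)
		n->refcount--;
}

/*
 * Returns 0 on success, -1 if the node or the property is missing.
 * Every path leaves through the same label.
 */
static int feature_init(struct node *dn, const char *prop)
{
	int ret = -1;

	if (dn == NULL)
		goto out;	/* nothing was found, nothing to drop */

	dn->refcount++;		/* models the reference taken by the lookup */
	if (prop == NULL)
		goto out;	/* node found, property missing */

	ret = 0;
out:
	node_put(dn);		/* safe even when dn is NULL */
	return ret;
}
```

Because the put is a no-op for NULL, the "node missing" and "property missing" paths can share one exit without leaking or underflowing the reference count.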


From cfriesen at nortel.com  Fri Feb 10 03:03:22 2006
From: cfriesen at nortel.com (Christopher Friesen)
Date: Thu, 09 Feb 2006 10:03:22 -0600
Subject: question on ptep_clear_flush_dirty() for ppc64
Message-ID: <43EB67CA.7000207@nortel.com>


I notice that (at least for 2.6.10) ptep_clear_flush_dirty() for ppc64 
simply does ptep_test_and_clear_dirty(), then calls flush_tlb_pending().

I want to call ptep_clear_flush_dirty() for a large number of pages 
(tens of thousands) in an optimal manner--would it be legal for me to 
call ptep_test_and_clear_dirty() for each page, then call 
flush_tlb_pending() once at the end?  Are there any implications for SMP 
machines?

The reason I ask is that in a small experiment this increased the 
speed of a certain task by about 25%, which is significant 
in our application.  I just wanted to make sure it was safe.

Thanks,

Chris


From benh at kernel.crashing.org  Fri Feb 10 10:04:40 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 10 Feb 2006 10:04:40 +1100
Subject: G5 fan problems return moving to 2.6.15 with dual processor
	2.7GHz machine
In-Reply-To: <17386.56061.78892.44180@zot.electricrain.com>
References: <20060205061048.7261.qmail@electricrain.com>
	<1139130385.5634.14.camel@localhost.localdomain>
	<17384.62553.442011.514155@zot.electricrain.com>
	<1139371016.8187.1.camel@localhost.localdomain>
	<17386.56061.78892.44180@zot.electricrain.com>
Message-ID: <1139526280.5003.40.camel@localhost.localdomain>

On Wed, 2006-02-08 at 22:02 -0800, Brian D. Carlstrom wrote:
> Benjamin Herrenschmidt writes:
>  > prom_printf should work ... try booting manually (from the OF command
>  > line) and maybe comment out the code that opens the displays... (it
>  > may be clearing the screen)....
> 
> I tried commenting out prom_check_displays and that does prevent the
> clearing of the screen, but still no visible prom_printf output. The
> last output seems to be from yaboot; it's certainly not one of the
> prom_print messages from prom_init.c. For good measure, I also tried
> adding "video=ofonly". Still no prom_printf output visible.
> 
> However, when I rebooted back to my 2.6.14 kernel, I saw the usual
> prom_printf messages from prom_init without any changes. I reviewed the
> prom_init.c diffs between 2.6.14 and 2.6.15 but they are large enough
> that it's not easy to spot an obvious problem.
> 
> In any case, I didn't have much time to really look at this today, just
> enough to try disabling prom_check_displays. I'll have to look more
> Friday.

That is strange... 

Ben.


From olof at lixom.net  Fri Feb 10 10:25:12 2006
From: olof at lixom.net (Olof Johansson)
Date: Thu, 9 Feb 2006 17:25:12 -0600
Subject: [PATCH 2/3] powerpc: Move pSeries firmware feature setup into
	platforms/pseries
In-Reply-To: <20060209060356.5606C679F6@ozlabs.org>
References: <1139465007.297357.792110844862.qpush@concordia>
	<20060209060356.5606C679F6@ozlabs.org>
Message-ID: <20060209232511.GN4833@pb15.lixom.net>

On Thu, Feb 09, 2006 at 05:03:33PM +1100, Michael Ellerman wrote:
> Currently we have some stuff in firmware.h and kernel/firmware.c that is
> #ifdef CONFIG_PPC_PSERIES. Move it all into platforms/pseries.

I suggest renaming it to something like fw_set_hv_features()
while you're at it, since all features it parses and sets are
hypervisor-related.

There are other, not yet fully merged hypervisor guest ports that might
want to share this code (Xen, rhype) for non-pseries machines, so there's
a chance that a move into platforms will just need to be undone down the
road. However, since that code isn't merged yet, let's worry about that
then instead of now.


-Olof


From michael at ellerman.id.au  Fri Feb 10 10:48:46 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 10 Feb 2006 10:48:46 +1100
Subject: [PATCH 1/3] powerpc: Clean up pSeries firmware feature
	initialisation
In-Reply-To: <20060209072331.GK18730@localhost.localdomain>
References: <1139465007.297357.792110844862.qpush@concordia>
	<20060209072331.GK18730@localhost.localdomain>
Message-ID: <200602101048.50034.michael@ellerman.id.au>

On Thu, 9 Feb 2006 18:23, Nathan Lynch wrote:
> Michael Ellerman wrote:
> > +no_hypertas:
> >  	of_node_put(dn);
> >  no_rtas:
> > -
> >  	DBG(" <- fw_feature_init()\n");
> >  }
>
> of_node_put can handle a null pointer fine, so you could get away with
> just one label here.

Nice, I'll fix it up and resend.

-- 
Michael Ellerman
IBM OzLabs

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From benh at kernel.crashing.org  Fri Feb 10 13:54:27 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 10 Feb 2006 13:54:27 +1100
Subject: question on ptep_clear_flush_dirty() for ppc64
In-Reply-To: <43EB67CA.7000207@nortel.com>
References: <43EB67CA.7000207@nortel.com>
Message-ID: <1139540068.5003.72.camel@localhost.localdomain>

On Thu, 2006-02-09 at 10:03 -0600, Christopher Friesen wrote:
> I notice that (at least for 2.6.10) ptep_clear_flush_dirty() for ppc64 
> simply does ptep_test_and_clear_dirty(), then calls flush_tlb_pending().
> 
> I want to call ptep_clear_flush_dirty() for a large number of pages 
> (tens of thousands) in an optimal manner--would it be legal for me to 
> call ptep_test_and_clear_dirty() for each page, then call 
> flush_tlb_pending() once at the end?  Are there any implications for SMP 
> machines?
> 
> The reason I ask is that in a small experiment this increased the 
> speed of a certain task by about 25%, which is significant 
> in our application.  I just wanted to make sure it was safe.

If you do that, just beware that if any "new" dirtying happens between
the update of the linux PTE and the flush_tlb_pending(), it will not be
lost... this is not a problem if you only "use" the result of the
function (the collected dirty bits) after you flush_tlb_pending() since
you will have those already marked dirty.

I don't think you need the page table lock, though I'm a bit tired at
the moment and may be missing something, and I don't think you need to
disable preemption as a context switch will call flush_tlb_pending()...
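
As a rough illustration of the batching being discussed, here is a userspace model in C. It is a sketch under stated assumptions, not kernel code: model_test_and_clear_dirty() and model_flush_pending() are hypothetical stand-ins for ptep_test_and_clear_dirty() and flush_tlb_pending(), and nothing here models SMP, the hash table, or locking. It only shows the ordering: clear and collect per page, flush once, and use the collected bits only afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

#define PTE_DIRTY 0x1u

/* Hypothetical model: counts how many flushes were issued. */
static unsigned int flush_count;

static void model_flush_pending(void)
{
	flush_count++;
}

/* Clears the dirty bit and reports whether it was set. */
static bool model_test_and_clear_dirty(unsigned int *pte)
{
	bool was_dirty = (*pte & PTE_DIRTY) != 0;

	*pte &= ~PTE_DIRTY;
	return was_dirty;
}

/*
 * Collect dirty state for n entries into dirty[], flushing once at the
 * end.  Callers should look at dirty[] only after this returns.
 */
static size_t collect_dirty_batched(unsigned int *ptes, bool *dirty, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		dirty[i] = model_test_and_clear_dirty(&ptes[i]);
		if (dirty[i])
			count++;
	}
	model_flush_pending();	/* one flush for the whole batch */
	return count;
}
```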

Ben.


From benh at kernel.crashing.org  Fri Feb 10 14:01:15 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 10 Feb 2006 14:01:15 +1100
Subject: __setup_cpu_be problem
In-Reply-To: <43E9050C.2000300@am.sony.com>
References: <43E9050C.2000300@am.sony.com>
Message-ID: <1139540476.5003.74.camel@localhost.localdomain>

On Tue, 2006-02-07 at 12:37 -0800, Geoff Levand wrote:
> Arnd,
> 
> It seems HID6 is a hypervisor resource...  Can we just have 
> '.cpu_setup = __setup_cpu_power4', and you setup your
> page sizes somewhere else?

Or better test if MSR:HV is set ? :) But yeah, he shouldn't have to set
the page sizes there anyway, I would expect the firmware to do it and
pass the right sizes via the device-tree since that's what the kernel
expects. (Though you really want LP0 to be 16M and not 1M as the kernel
can't really deal with the latter properly anyway with the current page
table layouts).

> -Geoff
> 
> struct cpu_spec	cpu_specs[] = {
> 	{	/* Cell Broadband Engine */
> 		.cpu_setup		= __setup_cpu_be,
> 	},
> 
> _GLOBAL(__setup_cpu_be)
>         /* Set large page sizes LP=0: 16MB, LP=1: 64KB */
>         addi    r3, 0,  0
>         ori     r3, r3, HID6_LB
>         sldi    r3, r3, 32
>         nor     r3, r3, r3
>         mfspr   r4, SPRN_HID6
>         and     r4, r4, r3
>         addi    r3, 0, 0x02000
>         sldi    r3, r3, 32
>         or      r4, r4, r3
>         mtspr   SPRN_HID6, r4
> 	blr
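
For readers less used to the assembly quoted above, the register arithmetic in __setup_cpu_be can be written out in C. This is a sketch of the bit manipulation only; it does not touch any SPR. The value of HID6_LB used here, (0x0F << 12), is an assumption that should be checked against asm/reg.h, and hid6_set_large_pages() is a hypothetical name.

```c
#include <stdint.h>

/* Assumed value of HID6_LB; check asm/reg.h before relying on it. */
#define HID6_LB (0x0FULL << 12)

/* Takes the old HID6 value and returns the value the asm would write. */
static uint64_t hid6_set_large_pages(uint64_t hid6)
{
	uint64_t clear = ~(HID6_LB << 32);	/* addi/ori/sldi/nor */
	uint64_t set = 0x02000ULL << 32;	/* addi/sldi */

	return (hid6 & clear) | set;		/* and/or before mtspr */
}
```

In other words, the routine clears the HID6_LB field in the upper word of HID6 and then ORs in 0x02000 shifted into the same word, leaving every other bit as it was.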


From arnd at arndb.de  Fri Feb 10 15:21:15 2006
From: arnd at arndb.de (arnd at arndb.de)
Date: Fri, 10 Feb 2006 05:21:15 +0100
Subject: AW: Re: __setup_cpu_be problem
Message-ID: <2812322.110611139545275893.JavaMail.servlet@kundenserver>

>
>On Tue, 2006-02-07 at 12:37 -0800, Geoff Levand wrote:
>> Arnd,
>> 
>> It seems HID6 is a hypervisor resource...  Can we just have 
>> '.cpu_setup = __setup_cpu_power4', and you setup your
>> page sizes somewhere else?
>
>Or better test if MSR:HV is set ? :) But yeah, he shouldn't have to set
>the page sizes there anyway, I would expect the firmware to do it and
>pass the right sizes via the device-tree since that's what the kernel
>expects. (Though you really want LP0 to be 16M and not 1M as the kernel
>can't really deal with the latter properly anyway with the current page
>table layouts).
>

[/me is using webmail from some distant location in .nz, sorry if
 the mail gets messed up]

Doing it in the firmware sounds like the right solution to me.
I would however not want to do that if the current firmware
sets the wrong page sizes.

I know that Hartmut wanted me to provide him with the right device
tree information that he needs to create to say that the page
sizes are 16M, 64k and 4k. Maybe we can find a combined solution
for these problems. Using __setup_cpu_power4 should be ok.

We could probably do a fallback in the cell setup to see if
the properties are in the device tree and do our own HID6 
setup stuff if not, normally expecting that the firmware settings
match the device tree.

Geoff, if your firmware does not already have the properties
for large page sizes, could you add them?

Ben, could you point Hartmut (and maybe Geoff) to the documentation
on what the device tree needs to look like?

Hartmut, can you find out the value of HID6 when you enter the kernel
from the firmware?

     Arnd <><


From benh at kernel.crashing.org  Fri Feb 10 15:35:15 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 10 Feb 2006 15:35:15 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
Message-ID: <1139546116.5003.81.camel@localhost.localdomain>

> [/me is using webmail from some distant location in .nz, sorry if
>  the mail gets messed up]
> 
> Doing it in the firmware sounds like the right solution to me.
> I would however not want to do that if the current firmware
> sets the wrong page sizes.
> 
> I know that Hartmut wanted me to provide him with the right device
> tree information that he needs to create to say that the page
> sizes are 16M, 64k and 4k. Maybe we can find a combined solution
> for these problems. Using __setup_cpu_power4 should be ok.

I don't completely understand your statement ... sorry

> We could probably do a fallback in the cell setup to see if
> the properties are in the device tree and do our own HID6 
> setup stuff if not, normally expecting that the firmware settings
> match the device tree.

We should not touch HID6 at all ... we should assume the firmware set it
appropriately and has set up matching page size entries in the
device-tree. I don't think we need to support changing that value
especially since the kernel doesn't quite support 1M large page sizes
anyway.

> Geoff, if your firmware does not already have the properties
> for large page sizes, could you add them?
> 
> Ben, could you point Hartmut (and maybe Geoff) to the documentation
> on what the device tree needs to look like?

I'm not sure we published that yet :) I would suggest looking at what
the kernel does to parse these instead in hash_utils.c until I get a
formal IBM approval for the spec to be published.

> Hartmut, can you find out the value of HID6 when you enter the kernel
> from the firmware?

Ben.


From michael at ellerman.id.au  Fri Feb 10 15:47:32 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 10 Feb 2006 15:47:32 +1100
Subject: [PATCH 1/2] powerpc: Clean up pSeries firmware feature initialisation
Message-ID: <1139546852.877687.10893684750.qpush@concordia>

Clean up fw_feature_init in platforms/pseries/setup.c. Clean up white space
and replace the while loop with a for loop - which seems clearer to me.
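
The for loop walks a property that holds several NUL-terminated strings packed back to back. A standalone C sketch of that walk follows; the feature table and its bit values are made up for the example, parse_features() is a hypothetical name, and the sketch assumes the last byte of the buffer is a NUL so that strlen() stays within len bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature table; bit values are made up for the example. */
struct feature {
	unsigned long val;
	const char *name;
};

static const struct feature table[] = {
	{ 0x1, "hcall-pft" },
	{ 0x2, "hcall-tce" },
	{ 0x4, "hcall-dabr" },
};

/* Walk a buffer of back-to-back NUL-terminated strings, len bytes long. */
static unsigned long parse_features(const char *prop, int len)
{
	unsigned long features = 0;
	const char *s;
	size_t i;

	for (s = prop; s < prop + len; s += strlen(s) + 1) {
		for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
			if (strcmp(table[i].name, s))
				continue;	/* not this entry */
			features |= table[i].val;	/* we have a match */
			break;
		}
	}
	return features;
}
```

Strings that are not in the table are skipped, so unknown firmware functions leave the mask unchanged.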

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/platforms/pseries/setup.c |   48 +++++++++++++++------------------
 1 files changed, 22 insertions(+), 26 deletions(-)

Index: to-merge/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/setup.c
+++ to-merge/arch/powerpc/platforms/pseries/setup.c
@@ -263,48 +263,44 @@ static int __init pSeries_init_panel(voi
 arch_initcall(pSeries_init_panel);
 
 
-/* Build up the ppc64_firmware_features bitmask field
- * using contents of device-tree/ibm,hypertas-functions.
- * Ultimately this functionality may be moved into prom.c prom_init().
+/* Build up the firmware features bitmask using the contents of
+ * device-tree/ibm,hypertas-functions.  Ultimately this functionality may
+ * be moved into prom.c prom_init().
  */
 static void __init fw_feature_init(void)
 {
-	struct device_node * dn;
-	char * hypertas;
-	unsigned int len;
+	struct device_node *dn;
+	char *hypertas, *s;
+	int len, i;
 
 	DBG(" -> fw_feature_init()\n");
 
-	ppc64_firmware_features = 0;
 	dn = of_find_node_by_path("/rtas");
 	if (dn == NULL) {
-		printk(KERN_ERR "WARNING ! Cannot find RTAS in device-tree !\n");
-		goto no_rtas;
+		printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n");
+		goto out;
 	}
 
 	hypertas = get_property(dn, "ibm,hypertas-functions", &len);
-	if (hypertas) {
-		while (len > 0){
-			int i, hypertas_len;
+	if (hypertas == NULL)
+		goto out;
+
+	for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) {
+		for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) {
 			/* check value against table of strings */
-			for(i=0; i < FIRMWARE_MAX_FEATURES ;i++) {
-				if ((firmware_features_table[i].name) &&
-				    (strcmp(firmware_features_table[i].name,hypertas))==0) {
-					/* we have a match */
-					ppc64_firmware_features |= 
-						(firmware_features_table[i].val);
-					break;
-				} 
-			}
-			hypertas_len = strlen(hypertas);
-			len -= hypertas_len +1;
-			hypertas+= hypertas_len +1;
+			if (!firmware_features_table[i].name ||
+			    strcmp(firmware_features_table[i].name, s))
+				continue;
+
+			/* we have a match */
+			ppc64_firmware_features |=
+				firmware_features_table[i].val;
+			break;
 		}
 	}
 
+out:
 	of_node_put(dn);
-no_rtas:
-
 	DBG(" <- fw_feature_init()\n");
 }
 

From michael at ellerman.id.au  Fri Feb 10 15:47:36 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 10 Feb 2006 15:47:36 +1100
Subject: [PATCH 2/2] powerpc: Move pSeries firmware feature setup into
	platforms/pseries
In-Reply-To: <1139546852.877687.10893684750.qpush@concordia>
Message-ID: <20060210044802.B8D1E67B20@ozlabs.org>

Currently we have some stuff in firmware.h and kernel/firmware.c that is
#ifdef CONFIG_PPC_PSERIES. Move it all into platforms/pseries.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/kernel/firmware.c            |   25 -------
 arch/powerpc/platforms/pseries/Makefile   |    3 
 arch/powerpc/platforms/pseries/firmware.c |  103 ++++++++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/firmware.h |   17 ++++
 arch/powerpc/platforms/pseries/setup.c    |   45 -------------
 include/asm-powerpc/firmware.h            |    9 --
 6 files changed, 123 insertions(+), 79 deletions(-)

Index: to-merge/arch/powerpc/kernel/firmware.c
===================================================================
--- to-merge.orig/arch/powerpc/kernel/firmware.c
+++ to-merge/arch/powerpc/kernel/firmware.c
@@ -18,28 +18,3 @@
 #include <asm/firmware.h>
 
 unsigned long ppc64_firmware_features;
-
-#ifdef CONFIG_PPC_PSERIES
-firmware_feature_t firmware_features_table[FIRMWARE_MAX_FEATURES] = {
-	{FW_FEATURE_PFT,		"hcall-pft"},
-	{FW_FEATURE_TCE,		"hcall-tce"},
-	{FW_FEATURE_SPRG0,		"hcall-sprg0"},
-	{FW_FEATURE_DABR,		"hcall-dabr"},
-	{FW_FEATURE_COPY,		"hcall-copy"},
-	{FW_FEATURE_ASR,		"hcall-asr"},
-	{FW_FEATURE_DEBUG,		"hcall-debug"},
-	{FW_FEATURE_PERF,		"hcall-perf"},
-	{FW_FEATURE_DUMP,		"hcall-dump"},
-	{FW_FEATURE_INTERRUPT,		"hcall-interrupt"},
-	{FW_FEATURE_MIGRATE,		"hcall-migrate"},
-	{FW_FEATURE_PERFMON,		"hcall-perfmon"},
-	{FW_FEATURE_CRQ,		"hcall-crq"},
-	{FW_FEATURE_VIO,		"hcall-vio"},
-	{FW_FEATURE_RDMA,		"hcall-rdma"},
-	{FW_FEATURE_LLAN,		"hcall-lLAN"},
-	{FW_FEATURE_BULK,		"hcall-bulk"},
-	{FW_FEATURE_XDABR,		"hcall-xdabr"},
-	{FW_FEATURE_MULTITCE,		"hcall-multi-tce"},
-	{FW_FEATURE_SPLPAR,		"hcall-splpar"},
-};
-#endif
Index: to-merge/arch/powerpc/platforms/pseries/Makefile
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/Makefile
+++ to-merge/arch/powerpc/platforms/pseries/Makefile
@@ -1,5 +1,6 @@
 obj-y			:= pci.o lpar.o hvCall.o nvram.o reconfig.o \
-			   setup.o iommu.o ras.o rtasd.o pci_dlpar.o
+			   setup.o iommu.o ras.o rtasd.o pci_dlpar.o \
+			   firmware.o
 obj-$(CONFIG_SMP)	+= smp.o
 obj-$(CONFIG_IBMVIO)	+= vio.o
 obj-$(CONFIG_XICS)	+= xics.o
Index: to-merge/arch/powerpc/platforms/pseries/firmware.c
===================================================================
--- /dev/null
+++ to-merge/arch/powerpc/platforms/pseries/firmware.c
@@ -0,0 +1,103 @@
+/*
+ *  pSeries firmware setup code.
+ *
+ *  Portions from arch/powerpc/platforms/pseries/setup.c:
+ *   Copyright (C) 1995  Linus Torvalds
+ *   Adapted from 'alpha' version by Gary Thomas
+ *   Modified by Cort Dougan (cort at cs.nmt.edu)
+ *   Modified by PPC64 Team, IBM Corp
+ *
+ *  Portions from arch/powerpc/kernel/firmware.c
+ *   Copyright (C) 2001 Ben. Herrenschmidt (benh at kernel.crashing.org)
+ *   Modifications for ppc64:
+ *    Copyright (C) 2003 Dave Engebretsen <engebret at us.ibm.com>
+ *    Copyright (C) 2005 Stephen Rothwell, IBM Corporation
+ *
+ *  Copyright 2006 IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#undef DEBUG
+
+#include <asm/firmware.h>
+#include <asm/prom.h>
+
+#ifdef DEBUG
+#define DBG(fmt...) udbg_printf(fmt)
+#else
+#define DBG(fmt...)
+#endif
+
+typedef struct {
+    unsigned long val;
+    char * name;
+} firmware_feature_t;
+
+static __initdata firmware_feature_t
+firmware_features_table[FIRMWARE_MAX_FEATURES] = {
+	{FW_FEATURE_PFT,		"hcall-pft"},
+	{FW_FEATURE_TCE,		"hcall-tce"},
+	{FW_FEATURE_SPRG0,		"hcall-sprg0"},
+	{FW_FEATURE_DABR,		"hcall-dabr"},
+	{FW_FEATURE_COPY,		"hcall-copy"},
+	{FW_FEATURE_ASR,		"hcall-asr"},
+	{FW_FEATURE_DEBUG,		"hcall-debug"},
+	{FW_FEATURE_PERF,		"hcall-perf"},
+	{FW_FEATURE_DUMP,		"hcall-dump"},
+	{FW_FEATURE_INTERRUPT,		"hcall-interrupt"},
+	{FW_FEATURE_MIGRATE,		"hcall-migrate"},
+	{FW_FEATURE_PERFMON,		"hcall-perfmon"},
+	{FW_FEATURE_CRQ,		"hcall-crq"},
+	{FW_FEATURE_VIO,		"hcall-vio"},
+	{FW_FEATURE_RDMA,		"hcall-rdma"},
+	{FW_FEATURE_LLAN,		"hcall-lLAN"},
+	{FW_FEATURE_BULK,		"hcall-bulk"},
+	{FW_FEATURE_XDABR,		"hcall-xdabr"},
+	{FW_FEATURE_MULTITCE,		"hcall-multi-tce"},
+	{FW_FEATURE_SPLPAR,		"hcall-splpar"},
+};
+
+/* Build up the firmware features bitmask using the contents of
+ * device-tree/ibm,hypertas-functions.  Ultimately this functionality may
+ * be moved into prom.c prom_init().
+ */
+void __init fw_feature_init(void)
+{
+	struct device_node *dn;
+	char *hypertas, *s;
+	int len, i;
+
+	DBG(" -> fw_feature_init()\n");
+
+	dn = of_find_node_by_path("/rtas");
+	if (dn == NULL) {
+		printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n");
+		goto out;
+	}
+
+	hypertas = get_property(dn, "ibm,hypertas-functions", &len);
+	if (hypertas == NULL)
+		goto out;
+
+	for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) {
+		for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) {
+			/* check value against table of strings */
+			if (!firmware_features_table[i].name ||
+			    strcmp(firmware_features_table[i].name, s))
+				continue;
+
+			/* we have a match */
+			ppc64_firmware_features |=
+				firmware_features_table[i].val;
+			break;
+		}
+	}
+
+out:
+	of_node_put(dn);
+	DBG(" <- fw_feature_init()\n");
+}
Index: to-merge/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/setup.c
+++ to-merge/arch/powerpc/platforms/pseries/setup.c
@@ -60,7 +60,6 @@
 #include <asm/time.h>
 #include <asm/nvram.h>
 #include "xics.h"
-#include <asm/firmware.h>
 #include <asm/pmc.h>
 #include <asm/mpic.h>
 #include <asm/ppc-pci.h>
@@ -70,6 +69,7 @@
 
 #include "plpar_wrappers.h"
 #include "ras.h"
+#include "firmware.h"
 
 #ifdef DEBUG
 #define DBG(fmt...) udbg_printf(fmt)
@@ -262,49 +262,6 @@ static int __init pSeries_init_panel(voi
 }
 arch_initcall(pSeries_init_panel);
 
-
-/* Build up the firmware features bitmask using the contents of
- * device-tree/ibm,hypertas-functions.  Ultimately this functionality may
- * be moved into prom.c prom_init().
- */
-static void __init fw_feature_init(void)
-{
-	struct device_node *dn;
-	char *hypertas, *s;
-	int len, i;
-
-	DBG(" -> fw_feature_init()\n");
-
-	dn = of_find_node_by_path("/rtas");
-	if (dn == NULL) {
-		printk(KERN_ERR "WARNING! Cannot find RTAS in device-tree!\n");
-		goto out;
-	}
-
-	hypertas = get_property(dn, "ibm,hypertas-functions", &len);
-	if (hypertas == NULL)
-		goto out;
-
-	for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) {
-		for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) {
-			/* check value against table of strings */
-			if (!firmware_features_table[i].name ||
-			    strcmp(firmware_features_table[i].name, s))
-				continue;
-
-			/* we have a match */
-			ppc64_firmware_features |=
-				firmware_features_table[i].val;
-			break;
-		}
-	}
-
-out:
-	of_node_put(dn);
-	DBG(" <- fw_feature_init()\n");
-}
-
-
 static  void __init pSeries_discover_pic(void)
 {
 	struct device_node *np;
Index: to-merge/include/asm-powerpc/firmware.h
===================================================================
--- to-merge.orig/include/asm-powerpc/firmware.h
+++ to-merge/include/asm-powerpc/firmware.h
@@ -89,15 +89,6 @@ static inline unsigned long firmware_has
 		(FW_FEATURE_POSSIBLE & ppc64_firmware_features & feature);
 }
 
-#ifdef CONFIG_PPC_PSERIES
-typedef struct {
-    unsigned long val;
-    char * name;
-} firmware_feature_t;
-
-extern firmware_feature_t firmware_features_table[];
-#endif
-
 extern void system_reset_fwnmi(void);
 extern void machine_check_fwnmi(void);
 
Index: to-merge/arch/powerpc/platforms/pseries/firmware.h
===================================================================
--- /dev/null
+++ to-merge/arch/powerpc/platforms/pseries/firmware.h
@@ -0,0 +1,17 @@
+/*
+ * Copyright 2006 IBM Corporation.
+ *
+ *  This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _PSERIES_FIRMWARE_H
+#define _PSERIES_FIRMWARE_H
+
+#include <asm/firmware.h>
+
+extern void __init fw_feature_init(void);
+
+#endif /* _PSERIES_FIRMWARE_H */


From geoffrey.levand at am.sony.com  Sat Feb 11 03:44:26 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Fri, 10 Feb 2006 08:44:26 -0800
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139546116.5003.81.camel@localhost.localdomain>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<1139546116.5003.81.camel@localhost.localdomain>
Message-ID: <43ECC2EA.5040505@am.sony.com>

Benjamin Herrenschmidt wrote:
>> [/me is using webmail from some distant location in .nz, sorry if
>>  the mail gets messed up]
>> 
>> Doing it in the firmware sounds like the right solution to me.
>> I would however not want to do that if the current firmware
>> sets the wrong page sizes.


Then we should treat support for changing page sizes as a workaround
for a firmware bug.


>> I know that Hartmut wanted me to provide him with the right device
>> tree information that he needs to create to say that the page
>> sizes are 16M, 64k and 4k. Maybe we can find a combined solution
>> for these problems. Using __setup_cpu_power4 should be ok.
> 
> I don't completely understand your statement ... sorry
> 
>> We could probably do a fallback in the cell setup to see if
>> the properties are in the device tree and do our own HID6 
>> setup stuff if not, normally expecting that the firmware settings
>> match the device tree.
> 
> We should not touch HID6 at all ... we should assume the firmware set it
> appropriately and has set up matching page size entries in the
> device-tree. I don't think we need to support changing that value
> especially since the kernel doesn't quite support 1M large page sizes
> anyway.


This seems reasonable.  In the general case we shouldn't change the
sizes.


>> Ben, could you point Hartmut (and maybe Geoff) to the documentation
>> on what the device tree needs to look like?
> 
> I'm not sure we published that yet :) I would suggest looking at what
> the kernel does to parse these instead in hash_utils.c until I get a
> formal IBM approval for the spec to be published.


It seems we can work with the info in hash_utils_64.c, but it would be
good to get the documentation.


-Geoff


From ahuja at austin.ibm.com  Sat Feb 11 07:45:08 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Fri, 10 Feb 2006 14:45:08 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <43DFB67A.5080508@austin.ibm.com>
References: <43CFC094.8000709@austin.ibm.com>	<20060126204432.GG19465@austin.ibm.com>
	<43DFB67A.5080508@austin.ibm.com>
Message-ID: <43ECFB54.9050907@austin.ibm.com>

I have made some changes to the patch. I will post another patch later
with enhancements and further modifications. For now, I would like to
resubmit this one for review, and for it to be forwarded if there are no
further comments.

Thanks,
Manish Ahuja
--------------------------------------------------------------------

The correctness of time accounting is important to users who want an
accurate feel for how the system is performing. With virtualization
and the addition of abstraction layers between the hardware and the OS,
we have introduced changes that currently prevent us from measuring
performance accurately. Any tool that collects metrics today risks
collecting bogus values.

POWER5 machines have a per-hardware-thread register which counts at a
rate which is proportional to the percentage of cycles on which the
cpu dispatches an instruction for this thread (if the thread gets all
the dispatch cycles it counts at the same rate as the timebase
register).  This register is also context-switched by the hypervisor.
Thus it gives a fine-grained measure of the actual cpu usage by the
thread over time.
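
To make the relationship concrete, here is a small C sketch of how two samples of such a register could be turned into a utilisation figure. It is an illustration only: dispatch_percent() is a hypothetical helper, the sample values are supplied by the caller rather than read from hardware, and it assumes intervals short enough that the multiplication by 100 does not overflow 64 bits.

```c
#include <stdint.h>

/*
 * Percentage of dispatch cycles a thread received over an interval,
 * from the change in the per-thread register (dpurr) relative to the
 * change in the timebase (dtb).  Returns 0 if no timebase ticks elapsed.
 */
static unsigned int dispatch_percent(uint64_t purr_start, uint64_t purr_end,
				     uint64_t tb_start, uint64_t tb_end)
{
	uint64_t dpurr = purr_end - purr_start;
	uint64_t dtb = tb_end - tb_start;

	if (dtb == 0)
		return 0;
	return (unsigned int)((dpurr * 100) / dtb);
}
```

A thread that received every dispatch cycle would see both counters advance by the same amount and report 100; one that received half of them would report 50.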

This patch builds on a patch submitted earlier. It provides a
framework and data that allow other tools to report measurements
accurately.

This patch calculates the amount of real physical time spent by the
processor in each of the USER and SYS modes. It does so by trapping
entries into and exits from the kernel. The calculated values
are available from /sys/devices/system/cpu/cpuX/dispatched_cycles.
These values are calculated during interrupts & context switches.

To be able to correctly report all cycles, it is important to be
able to track all the cycles that are given to lpars that are
either offline or have been removed since the system started.
All such cycles are calculated and stored in
/sys/devices/system/cpu/cpuX/offline_cpu_cycles

A few tools are in the works that will exploit the values being
calculated. Example output looks as follows.

%user   %sys    %wait   %idle
------  ------  ------  ------
00.90    0.09    0.00   99.01
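
A tool could derive percentages like the ones above from cycle counts
sampled at two points in time. The sketch below assumes the counts have
already been split into four buckets; the struct layout and names are
hypothetical and do not reflect the actual format of the sysfs files.

```c
#include <stdint.h>

/* Hypothetical cumulative cycle counters for one cpu. */
struct cycle_sample {
    uint64_t user, sys, wait, idle;
};

/* Percentages for the interval between two samples. */
struct cpu_pct {
    double user, sys, wait, idle;
};

static struct cpu_pct cycles_to_pct(const struct cycle_sample *prev,
                                    const struct cycle_sample *cur)
{
    struct cpu_pct p = { 0.0, 0.0, 0.0, 0.0 };
    /* Per-bucket deltas over the sampling interval. */
    uint64_t du = cur->user - prev->user;
    uint64_t ds = cur->sys  - prev->sys;
    uint64_t dw = cur->wait - prev->wait;
    uint64_t di = cur->idle - prev->idle;
    uint64_t total = du + ds + dw + di;

    if (total == 0)
        return p;       /* no cycles accounted in this interval */

    p.user = 100.0 * (double)du / (double)total;
    p.sys  = 100.0 * (double)ds / (double)total;
    p.wait = 100.0 * (double)dw / (double)total;
    p.idle = 100.0 * (double)di / (double)total;
    return p;
}
```

Working on deltas rather than raw counters keeps the result independent
of how long the system has been up.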

This patch also keeps track of exact user/kernel times for each process
and updates them accordingly to be used by tools like CKRM.

I am working with the performance group to measure the impact
of this patch. I will add those numbers as soon as they
become available.

Signed-off-by: Manish Ahuja <ahuja at austin.ibm.com>


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cpu_acct.patch.2
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060210/9d849630/attachment.txt 

From olof at lixom.net  Sat Feb 11 10:23:25 2006
From: olof at lixom.net (Olof Johansson)
Date: Fri, 10 Feb 2006 17:23:25 -0600
Subject: [PATCH] rename fw_feature_init()
Message-ID: <20060210232325.GA4795@pb15.lixom.net>

Hi,

fw_feature_init() on pSeries is really just a setup of hypervisor
features. Rename the function accordingly.

Signed-off-by: Olof Johansson <olof at lixom.net>
---

 firmware.c |   18 +++++++++---------
 firmware.h |    2 +-
 setup.c    |    2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

Index: powerpc-git/arch/powerpc/platforms/pseries/firmware.c
===================================================================
--- powerpc-git.orig/arch/powerpc/platforms/pseries/firmware.c
+++ powerpc-git/arch/powerpc/platforms/pseries/firmware.c
@@ -35,10 +35,10 @@
 typedef struct {
     unsigned long val;
     char * name;
-} firmware_feature_t;
+} hypervisor_feature_t;
 
-static __initdata firmware_feature_t
-firmware_features_table[FIRMWARE_MAX_FEATURES] = {
+static __initdata hypervisor_feature_t
+hypervisor_features_table[FIRMWARE_MAX_FEATURES] = {
 	{FW_FEATURE_PFT,		"hcall-pft"},
 	{FW_FEATURE_TCE,		"hcall-tce"},
 	{FW_FEATURE_SPRG0,		"hcall-sprg0"},
@@ -65,13 +65,13 @@ firmware_features_table[FIRMWARE_MAX_FEA
  * device-tree/ibm,hypertas-functions.  Ultimately this functionality may
  * be moved into prom.c prom_init().
  */
-void __init fw_feature_init(void)
+void __init hypervisor_feature_init(void)
 {
 	struct device_node *dn;
 	char *hypertas, *s;
 	int len, i;
 
-	DBG(" -> fw_feature_init()\n");
+	DBG(" -> hypervisor_feature_init()\n");
 
 	dn = of_find_node_by_path("/rtas");
 	if (dn == NULL) {
@@ -86,18 +86,18 @@ void __init fw_feature_init(void)
 	for (s = hypertas; s < hypertas + len; s += strlen(s) + 1) {
 		for (i = 0; i < FIRMWARE_MAX_FEATURES; i++) {
 			/* check value against table of strings */
-			if (!firmware_features_table[i].name ||
-			    strcmp(firmware_features_table[i].name, s))
+			if (!hypervisor_features_table[i].name ||
+			    strcmp(hypervisor_features_table[i].name, s))
 				continue;
 
 			/* we have a match */
 			ppc64_firmware_features |=
-				firmware_features_table[i].val;
+				hypervisor_features_table[i].val;
 			break;
 		}
 	}
 
 out:
 	of_node_put(dn);
-	DBG(" <- fw_feature_init()\n");
+	DBG(" <- hypervisor_feature_init()\n");
 }
Index: powerpc-git/arch/powerpc/platforms/pseries/firmware.h
===================================================================
--- powerpc-git.orig/arch/powerpc/platforms/pseries/firmware.h
+++ powerpc-git/arch/powerpc/platforms/pseries/firmware.h
@@ -12,6 +12,6 @@
 
 #include <asm/firmware.h>
 
-extern void __init fw_feature_init(void);
+extern void __init hypervisor_feature_init(void);
 
 #endif /* _PSERIES_FIRMWARE_H */
Index: powerpc-git/arch/powerpc/platforms/pseries/setup.c
===================================================================
--- powerpc-git.orig/arch/powerpc/platforms/pseries/setup.c
+++ powerpc-git/arch/powerpc/platforms/pseries/setup.c
@@ -324,7 +324,7 @@ static void __init pSeries_init_early(vo
 
 	DBG(" -> pSeries_init_early()\n");
 
-	fw_feature_init();
+	hypervisor_feature_init();
 	
 	if (platform_is_lpar())
 		hpte_init_lpar();
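
For readers unfamiliar with the loop touched by this rename: it walks a
buffer of NUL-separated strings and ORs in the bit for every name found in
the table. A minimal userspace sketch of that pattern follows. The bit
values and the names demo_feature, demo_table and parse_features are made
up for illustration; only the three "hcall-*" strings come from the patch
context. It also assumes the last string in the buffer is NUL-terminated
within len.

```c
#include <stddef.h>
#include <string.h>

struct demo_feature {
    unsigned long val;      /* hypothetical bit value */
    const char *name;
};

static const struct demo_feature demo_table[] = {
    { 1UL << 0, "hcall-pft" },
    { 1UL << 1, "hcall-tce" },
    { 1UL << 2, "hcall-sprg0" },
};

/* Walk a NUL-separated string list of len bytes and collect feature bits. */
static unsigned long parse_features(const char *list, size_t len)
{
    unsigned long features = 0;
    const char *s;
    size_t i;

    for (s = list; s < list + len; s += strlen(s) + 1) {
        for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
            /* check value against table of strings */
            if (strcmp(demo_table[i].name, s) != 0)
                continue;

            /* we have a match */
            features |= demo_table[i].val;
            break;
        }
    }
    return features;
}
```

Unknown names are simply skipped, so a list containing entries the table
does not know about still yields the bits for the ones it does.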


From olof at lixom.net  Sat Feb 11 10:49:03 2006
From: olof at lixom.net (Olof Johansson)
Date: Fri, 10 Feb 2006 17:49:03 -0600
Subject: [PATCH] Update {g5,pseries,ppc64}_defconfig
Message-ID: <20060210234903.GB4795@pb15.lixom.net>

Hi,

For powerpc.git (post-2.6.16):

Update defconfigs for g5, pseries and generic ppc64. Default choices
for everything, with the following exceptions:

 * Enable WINDFARM_PM112 on g5 and ppc64.
 * Increase CONFIG_NR_CPUS to 4 on g5_defconfig.


Signed-off-by: Olof Johansson <olof at lixom.net>

Index: powerpc-git/arch/powerpc/configs/g5_defconfig
===================================================================
--- powerpc-git.orig/arch/powerpc/configs/g5_defconfig
+++ powerpc-git/arch/powerpc/configs/g5_defconfig
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 2.6.15-rc5
-# Tue Dec 20 15:59:30 2005
+# Linux kernel version: 2.6.16-rc2
+# Fri Feb 10 17:33:08 2006
 #
 CONFIG_PPC64=y
 CONFIG_64BIT=y
@@ -16,6 +16,10 @@ CONFIG_COMPAT=y
 CONFIG_SYSVIPC_COMPAT=y
 CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
 CONFIG_ARCH_MAY_HAVE_PC_FDC=y
+CONFIG_PPC_OF=y
+# CONFIG_PPC_UDBG_16550 is not set
+CONFIG_GENERIC_TBSYNC=y
+# CONFIG_DEFAULT_UIMAGE is not set
 
 #
 # Processor support
@@ -26,13 +30,12 @@ CONFIG_PPC_FPU=y
 CONFIG_ALTIVEC=y
 CONFIG_PPC_STD_MMU=y
 CONFIG_SMP=y
-CONFIG_NR_CPUS=2
+CONFIG_NR_CPUS=4
 
 #
 # Code maturity level options
 #
 CONFIG_EXPERIMENTAL=y
-CONFIG_CLEAN_COMPILE=y
 CONFIG_LOCK_KERNEL=y
 CONFIG_INIT_ENV_ARG_LIMIT=32
 
@@ -47,8 +50,6 @@ CONFIG_POSIX_MQUEUE=y
 # CONFIG_BSD_PROCESS_ACCT is not set
 CONFIG_SYSCTL=y
 # CONFIG_AUDIT is not set
-CONFIG_HOTPLUG=y
-CONFIG_KOBJECT_UEVENT=y
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 # CONFIG_CPUSETS is not set
@@ -58,8 +59,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
 CONFIG_KALLSYMS=y
 # CONFIG_KALLSYMS_ALL is not set
 # CONFIG_KALLSYMS_EXTRA_PASS is not set
+CONFIG_HOTPLUG=y
 CONFIG_PRINTK=y
 CONFIG_BUG=y
+CONFIG_ELF_CORE=y
 CONFIG_BASE_FULL=y
 CONFIG_FUTEX=y
 CONFIG_EPOLL=y
@@ -68,8 +71,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
 CONFIG_CC_ALIGN_LABELS=0
 CONFIG_CC_ALIGN_LOOPS=0
 CONFIG_CC_ALIGN_JUMPS=0
+CONFIG_SLAB=y
 # CONFIG_TINY_SHMEM is not set
 CONFIG_BASE_SMALL=0
+# CONFIG_SLOB is not set
 
 #
 # Loadable module support
@@ -112,13 +117,12 @@ CONFIG_PPC_PMAC=y
 CONFIG_PPC_PMAC64=y
 # CONFIG_PPC_MAPLE is not set
 # CONFIG_PPC_CELL is not set
-CONFIG_PPC_OF=y
 CONFIG_U3_DART=y
 CONFIG_MPIC=y
 # CONFIG_PPC_RTAS is not set
 # CONFIG_MMIO_NVRAM is not set
+CONFIG_MPIC_BROKEN_U3=y
 # CONFIG_PPC_MPC106 is not set
-CONFIG_GENERIC_TBSYNC=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_TABLE=y
 # CONFIG_CPU_FREQ_DEBUG is not set
@@ -151,6 +155,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
 CONFIG_IOMMU_VMERGE=y
 # CONFIG_HOTPLUG_CPU is not set
 CONFIG_KEXEC=y
+# CONFIG_CRASH_DUMP is not set
 CONFIG_IRQ_ALL_CPUS=y
 # CONFIG_NUMA is not set
 CONFIG_ARCH_SELECT_MEMORY_MODEL=y
@@ -202,6 +207,7 @@ CONFIG_NET=y
 #
 # Networking options
 #
+# CONFIG_NETDEBUG is not set
 CONFIG_PACKET=y
 # CONFIG_PACKET_MMAP is not set
 CONFIG_UNIX=y
@@ -239,6 +245,7 @@ CONFIG_NETFILTER=y
 # Core Netfilter Configuration
 #
 # CONFIG_NETFILTER_NETLINK is not set
+# CONFIG_NETFILTER_XTABLES is not set
 
 #
 # IP: Netfilter Configuration
@@ -255,65 +262,6 @@ CONFIG_IP_NF_TFTP=m
 CONFIG_IP_NF_AMANDA=m
 # CONFIG_IP_NF_PPTP is not set
 CONFIG_IP_NF_QUEUE=m
-CONFIG_IP_NF_IPTABLES=m
-CONFIG_IP_NF_MATCH_LIMIT=m
-CONFIG_IP_NF_MATCH_IPRANGE=m
-CONFIG_IP_NF_MATCH_MAC=m
-CONFIG_IP_NF_MATCH_PKTTYPE=m
-CONFIG_IP_NF_MATCH_MARK=m
-CONFIG_IP_NF_MATCH_MULTIPORT=m
-CONFIG_IP_NF_MATCH_TOS=m
-CONFIG_IP_NF_MATCH_RECENT=m
-CONFIG_IP_NF_MATCH_ECN=m
-CONFIG_IP_NF_MATCH_DSCP=m
-CONFIG_IP_NF_MATCH_AH_ESP=m
-CONFIG_IP_NF_MATCH_LENGTH=m
-CONFIG_IP_NF_MATCH_TTL=m
-CONFIG_IP_NF_MATCH_TCPMSS=m
-CONFIG_IP_NF_MATCH_HELPER=m
-CONFIG_IP_NF_MATCH_STATE=m
-CONFIG_IP_NF_MATCH_CONNTRACK=m
-CONFIG_IP_NF_MATCH_OWNER=m
-CONFIG_IP_NF_MATCH_ADDRTYPE=m
-CONFIG_IP_NF_MATCH_REALM=m
-CONFIG_IP_NF_MATCH_SCTP=m
-# CONFIG_IP_NF_MATCH_DCCP is not set
-CONFIG_IP_NF_MATCH_COMMENT=m
-CONFIG_IP_NF_MATCH_CONNMARK=m
-CONFIG_IP_NF_MATCH_CONNBYTES=m
-CONFIG_IP_NF_MATCH_HASHLIMIT=m
-CONFIG_IP_NF_MATCH_STRING=m
-CONFIG_IP_NF_FILTER=m
-CONFIG_IP_NF_TARGET_REJECT=m
-CONFIG_IP_NF_TARGET_LOG=m
-CONFIG_IP_NF_TARGET_ULOG=m
-CONFIG_IP_NF_TARGET_TCPMSS=m
-CONFIG_IP_NF_TARGET_NFQUEUE=m
-CONFIG_IP_NF_NAT=m
-CONFIG_IP_NF_NAT_NEEDED=y
-CONFIG_IP_NF_TARGET_MASQUERADE=m
-CONFIG_IP_NF_TARGET_REDIRECT=m
-CONFIG_IP_NF_TARGET_NETMAP=m
-CONFIG_IP_NF_TARGET_SAME=m
-CONFIG_IP_NF_NAT_SNMP_BASIC=m
-CONFIG_IP_NF_NAT_IRC=m
-CONFIG_IP_NF_NAT_FTP=m
-CONFIG_IP_NF_NAT_TFTP=m
-CONFIG_IP_NF_NAT_AMANDA=m
-CONFIG_IP_NF_MANGLE=m
-CONFIG_IP_NF_TARGET_TOS=m
-CONFIG_IP_NF_TARGET_ECN=m
-CONFIG_IP_NF_TARGET_DSCP=m
-CONFIG_IP_NF_TARGET_MARK=m
-CONFIG_IP_NF_TARGET_CLASSIFY=m
-CONFIG_IP_NF_TARGET_TTL=m
-CONFIG_IP_NF_TARGET_CONNMARK=m
-CONFIG_IP_NF_TARGET_CLUSTERIP=m
-CONFIG_IP_NF_RAW=m
-CONFIG_IP_NF_TARGET_NOTRACK=m
-CONFIG_IP_NF_ARPTABLES=m
-CONFIG_IP_NF_ARPFILTER=m
-CONFIG_IP_NF_ARP_MANGLE=m
 
 #
 # DCCP Configuration (EXPERIMENTAL)
@@ -324,6 +272,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
 # SCTP Configuration (EXPERIMENTAL)
 #
 # CONFIG_IP_SCTP is not set
+
+#
+# TIPC Configuration (EXPERIMENTAL)
+#
+# CONFIG_TIPC is not set
 # CONFIG_ATM is not set
 # CONFIG_BRIDGE is not set
 # CONFIG_VLAN_8021Q is not set
@@ -342,7 +295,6 @@ CONFIG_LLC=y
 # QoS and/or fair queueing
 #
 # CONFIG_NET_SCHED is not set
-CONFIG_NET_CLS_ROUTE=y
 
 #
 # Network testing
@@ -545,13 +497,7 @@ CONFIG_SCSI_SATA_SVW=y
 # CONFIG_SCSI_IPR is not set
 # CONFIG_SCSI_QLOGIC_FC is not set
 # CONFIG_SCSI_QLOGIC_1280 is not set
-CONFIG_SCSI_QLA2XXX=y
-# CONFIG_SCSI_QLA21XX is not set
-# CONFIG_SCSI_QLA22XX is not set
-# CONFIG_SCSI_QLA2300 is not set
-# CONFIG_SCSI_QLA2322 is not set
-# CONFIG_SCSI_QLA6312 is not set
-# CONFIG_SCSI_QLA24XX is not set
+# CONFIG_SCSI_QLA_FC is not set
 # CONFIG_SCSI_LPFC is not set
 # CONFIG_SCSI_DC395x is not set
 # CONFIG_SCSI_DC390T is not set
@@ -614,7 +560,6 @@ CONFIG_IEEE1394_SBP2=m
 CONFIG_IEEE1394_ETH1394=m
 CONFIG_IEEE1394_DV1394=m
 CONFIG_IEEE1394_RAWIO=y
-# CONFIG_IEEE1394_CMP is not set
 
 #
 # I2O device support
@@ -630,6 +575,7 @@ CONFIG_THERM_PM72=y
 CONFIG_WINDFARM=y
 CONFIG_WINDFARM_PM81=y
 CONFIG_WINDFARM_PM91=y
+CONFIG_WINDFARM_PM112=y
 
 #
 # Network device support
@@ -682,6 +628,7 @@ CONFIG_E1000=y
 # CONFIG_R8169 is not set
 # CONFIG_SIS190 is not set
 # CONFIG_SKGE is not set
+# CONFIG_SKY2 is not set
 # CONFIG_SK98LIN is not set
 CONFIG_TIGON3=m
 # CONFIG_BNX2 is not set
@@ -861,8 +808,7 @@ CONFIG_I2C_ALGOBIT=y
 # CONFIG_I2C_I801 is not set
 # CONFIG_I2C_I810 is not set
 # CONFIG_I2C_PIIX4 is not set
-CONFIG_I2C_KEYWEST=y
-CONFIG_I2C_PMAC_SMU=y
+CONFIG_I2C_POWERMAC=y
 # CONFIG_I2C_NFORCE2 is not set
 # CONFIG_I2C_PARPORT_LIGHT is not set
 # CONFIG_I2C_PROSAVAGE is not set
@@ -895,6 +841,12 @@ CONFIG_I2C_PMAC_SMU=y
 # CONFIG_I2C_DEBUG_CHIP is not set
 
 #
+# SPI support
+#
+# CONFIG_SPI is not set
+# CONFIG_SPI_MASTER is not set
+
+#
 # Dallas's 1-wire bus
 #
 # CONFIG_W1 is not set
@@ -961,7 +913,6 @@ CONFIG_FB_RADEON_I2C=y
 # CONFIG_FB_KYRO is not set
 # CONFIG_FB_3DFX is not set
 # CONFIG_FB_VOODOO1 is not set
-# CONFIG_FB_CYBLA is not set
 # CONFIG_FB_TRIDENT is not set
 # CONFIG_FB_VIRTUAL is not set
 
@@ -1008,9 +959,10 @@ CONFIG_SND_OSSEMUL=y
 CONFIG_SND_MIXER_OSS=m
 CONFIG_SND_PCM_OSS=m
 CONFIG_SND_SEQUENCER_OSS=y
+# CONFIG_SND_DYNAMIC_MINORS is not set
+CONFIG_SND_SUPPORT_OLD_API=y
 # CONFIG_SND_VERBOSE_PRINTK is not set
 # CONFIG_SND_DEBUG is not set
-CONFIG_SND_GENERIC_DRIVER=y
 
 #
 # Generic devices
@@ -1024,6 +976,8 @@ CONFIG_SND_GENERIC_DRIVER=y
 #
 # PCI devices
 #
+# CONFIG_SND_AD1889 is not set
+# CONFIG_SND_ALS4000 is not set
 # CONFIG_SND_ALI5451 is not set
 # CONFIG_SND_ATIIXP is not set
 # CONFIG_SND_ATIIXP_MODEM is not set
@@ -1032,39 +986,38 @@ CONFIG_SND_GENERIC_DRIVER=y
 # CONFIG_SND_AU8830 is not set
 # CONFIG_SND_AZT3328 is not set
 # CONFIG_SND_BT87X is not set
-# CONFIG_SND_CS46XX is not set
+# CONFIG_SND_CA0106 is not set
+# CONFIG_SND_CMIPCI is not set
 # CONFIG_SND_CS4281 is not set
+# CONFIG_SND_CS46XX is not set
 # CONFIG_SND_EMU10K1 is not set
 # CONFIG_SND_EMU10K1X is not set
-# CONFIG_SND_CA0106 is not set
-# CONFIG_SND_KORG1212 is not set
-# CONFIG_SND_MIXART is not set
-# CONFIG_SND_NM256 is not set
-# CONFIG_SND_RME32 is not set
-# CONFIG_SND_RME96 is not set
-# CONFIG_SND_RME9652 is not set
-# CONFIG_SND_HDSP is not set
-# CONFIG_SND_HDSPM is not set
-# CONFIG_SND_TRIDENT is not set
-# CONFIG_SND_YMFPCI is not set
-# CONFIG_SND_AD1889 is not set
-# CONFIG_SND_ALS4000 is not set
-# CONFIG_SND_CMIPCI is not set
 # CONFIG_SND_ENS1370 is not set
 # CONFIG_SND_ENS1371 is not set
 # CONFIG_SND_ES1938 is not set
 # CONFIG_SND_ES1968 is not set
-# CONFIG_SND_MAESTRO3 is not set
 # CONFIG_SND_FM801 is not set
+# CONFIG_SND_HDA_INTEL is not set
+# CONFIG_SND_HDSP is not set
+# CONFIG_SND_HDSPM is not set
 # CONFIG_SND_ICE1712 is not set
 # CONFIG_SND_ICE1724 is not set
 # CONFIG_SND_INTEL8X0 is not set
 # CONFIG_SND_INTEL8X0M is not set
+# CONFIG_SND_KORG1212 is not set
+# CONFIG_SND_MAESTRO3 is not set
+# CONFIG_SND_MIXART is not set
+# CONFIG_SND_NM256 is not set
+# CONFIG_SND_PCXHR is not set
+# CONFIG_SND_RME32 is not set
+# CONFIG_SND_RME96 is not set
+# CONFIG_SND_RME9652 is not set
 # CONFIG_SND_SONICVIBES is not set
+# CONFIG_SND_TRIDENT is not set
 # CONFIG_SND_VIA82XX is not set
 # CONFIG_SND_VIA82XX_MODEM is not set
 # CONFIG_SND_VX222 is not set
-# CONFIG_SND_HDA_INTEL is not set
+# CONFIG_SND_YMFPCI is not set
 
 #
 # ALSA PowerMac devices
@@ -1136,13 +1089,16 @@ CONFIG_USB_STORAGE_DPCM=y
 CONFIG_USB_STORAGE_SDDR09=y
 CONFIG_USB_STORAGE_SDDR55=y
 CONFIG_USB_STORAGE_JUMPSHOT=y
+# CONFIG_USB_STORAGE_ALAUDA is not set
 # CONFIG_USB_STORAGE_ONETOUCH is not set
+# CONFIG_USB_LIBUSUAL is not set
 
 #
 # USB Input Devices
 #
 CONFIG_USB_HID=y
 CONFIG_USB_HIDINPUT=y
+# CONFIG_USB_HIDINPUT_POWERBOOK is not set
 CONFIG_HID_FF=y
 CONFIG_HID_PID=y
 CONFIG_LOGITECH_FF=y
@@ -1159,6 +1115,7 @@ CONFIG_USB_HIDDEV=y
 # CONFIG_USB_YEALINK is not set
 # CONFIG_USB_XPAD is not set
 # CONFIG_USB_ATI_REMOTE is not set
+# CONFIG_USB_ATI_REMOTE2 is not set
 # CONFIG_USB_KEYSPAN_REMOTE is not set
 # CONFIG_USB_APPLETOUCH is not set
 
@@ -1207,6 +1164,7 @@ CONFIG_USB_SERIAL_GENERIC=y
 # CONFIG_USB_SERIAL_AIRPRIME is not set
 # CONFIG_USB_SERIAL_ANYDATA is not set
 CONFIG_USB_SERIAL_BELKIN=m
+# CONFIG_USB_SERIAL_WHITEHEAT is not set
 CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
 # CONFIG_USB_SERIAL_CP2101 is not set
 CONFIG_USB_SERIAL_CYPRESS_M8=m
@@ -1288,6 +1246,10 @@ CONFIG_USB_EZUSB=y
 #
 
 #
+# EDAC - error detection and reporting (RAS)
+#
+
+#
 # File systems
 #
 CONFIG_EXT2_FS=y
@@ -1317,6 +1279,7 @@ CONFIG_XFS_EXPORT=y
 CONFIG_XFS_SECURITY=y
 CONFIG_XFS_POSIX_ACL=y
 # CONFIG_XFS_RT is not set
+# CONFIG_OCFS2_FS is not set
 # CONFIG_MINIX_FS is not set
 # CONFIG_ROMFS_FS is not set
 CONFIG_INOTIFY=y
@@ -1357,6 +1320,7 @@ CONFIG_HUGETLBFS=y
 CONFIG_HUGETLB_PAGE=y
 CONFIG_RAMFS=y
 # CONFIG_RELAYFS_FS is not set
+# CONFIG_CONFIGFS_FS is not set
 
 #
 # Miscellaneous filesystems
@@ -1426,6 +1390,7 @@ CONFIG_MSDOS_PARTITION=y
 # CONFIG_SGI_PARTITION is not set
 # CONFIG_ULTRIX_PARTITION is not set
 # CONFIG_SUN_PARTITION is not set
+# CONFIG_KARMA_PARTITION is not set
 # CONFIG_EFI_PARTITION is not set
 
 #
@@ -1481,10 +1446,6 @@ CONFIG_CRC32=y
 CONFIG_LIBCRC32C=m
 CONFIG_ZLIB_INFLATE=y
 CONFIG_ZLIB_DEFLATE=m
-CONFIG_TEXTSEARCH=y
-CONFIG_TEXTSEARCH_KMP=m
-CONFIG_TEXTSEARCH_BM=m
-CONFIG_TEXTSEARCH_FSM=m
 
 #
 # Instrumentation Support
@@ -1497,24 +1458,31 @@ CONFIG_OPROFILE=y
 # Kernel hacking
 #
 # CONFIG_PRINTK_TIME is not set
-CONFIG_DEBUG_KERNEL=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_KERNEL=y
 CONFIG_LOG_BUF_SHIFT=17
 CONFIG_DETECT_SOFTLOCKUP=y
 # CONFIG_SCHEDSTATS is not set
 # CONFIG_DEBUG_SLAB is not set
+CONFIG_DEBUG_MUTEXES=y
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
 # CONFIG_DEBUG_INFO is not set
 CONFIG_DEBUG_FS=y
 # CONFIG_DEBUG_VM is not set
+CONFIG_FORCED_INLINING=y
 # CONFIG_RCU_TORTURE_TEST is not set
 # CONFIG_DEBUG_STACKOVERFLOW is not set
 # CONFIG_DEBUG_STACK_USAGE is not set
 # CONFIG_DEBUGGER is not set
 CONFIG_IRQSTACKS=y
 CONFIG_BOOTX_TEXT=y
+# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
+# CONFIG_PPC_EARLY_DEBUG_G5 is not set
+# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
+# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
+# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
 
 #
 # Security options
Index: powerpc-git/arch/powerpc/configs/ppc64_defconfig
===================================================================
--- powerpc-git.orig/arch/powerpc/configs/ppc64_defconfig
+++ powerpc-git/arch/powerpc/configs/ppc64_defconfig
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 2.6.15-rc5
-# Tue Dec 20 15:59:38 2005
+# Linux kernel version: 2.6.16-rc2
+# Fri Feb 10 17:32:14 2006
 #
 CONFIG_PPC64=y
 CONFIG_64BIT=y
@@ -16,6 +16,10 @@ CONFIG_COMPAT=y
 CONFIG_SYSVIPC_COMPAT=y
 CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
 CONFIG_ARCH_MAY_HAVE_PC_FDC=y
+CONFIG_PPC_OF=y
+CONFIG_PPC_UDBG_16550=y
+CONFIG_GENERIC_TBSYNC=y
+# CONFIG_DEFAULT_UIMAGE is not set
 
 #
 # Processor support
@@ -33,7 +37,6 @@ CONFIG_NR_CPUS=32
 # Code maturity level options
 #
 CONFIG_EXPERIMENTAL=y
-CONFIG_CLEAN_COMPILE=y
 CONFIG_LOCK_KERNEL=y
 CONFIG_INIT_ENV_ARG_LIMIT=32
 
@@ -48,8 +51,6 @@ CONFIG_POSIX_MQUEUE=y
 # CONFIG_BSD_PROCESS_ACCT is not set
 CONFIG_SYSCTL=y
 # CONFIG_AUDIT is not set
-CONFIG_HOTPLUG=y
-CONFIG_KOBJECT_UEVENT=y
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_CPUSETS=y
@@ -59,8 +60,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
 CONFIG_KALLSYMS=y
 CONFIG_KALLSYMS_ALL=y
 # CONFIG_KALLSYMS_EXTRA_PASS is not set
+CONFIG_HOTPLUG=y
 CONFIG_PRINTK=y
 CONFIG_BUG=y
+CONFIG_ELF_CORE=y
 CONFIG_BASE_FULL=y
 CONFIG_FUTEX=y
 CONFIG_EPOLL=y
@@ -69,8 +72,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
 CONFIG_CC_ALIGN_LABELS=0
 CONFIG_CC_ALIGN_LOOPS=0
 CONFIG_CC_ALIGN_JUMPS=0
+CONFIG_SLAB=y
 # CONFIG_TINY_SHMEM is not set
 CONFIG_BASE_SMALL=0
+# CONFIG_SLOB is not set
 
 #
 # Loadable module support
@@ -113,7 +118,6 @@ CONFIG_PPC_PMAC=y
 CONFIG_PPC_PMAC64=y
 CONFIG_PPC_MAPLE=y
 # CONFIG_PPC_CELL is not set
-CONFIG_PPC_OF=y
 CONFIG_XICS=y
 CONFIG_U3_DART=y
 CONFIG_MPIC=y
@@ -124,8 +128,8 @@ CONFIG_RTAS_FLASH=m
 # CONFIG_MMIO_NVRAM is not set
 CONFIG_MPIC_BROKEN_U3=y
 CONFIG_IBMVIO=y
+# CONFIG_IBMEBUS is not set
 # CONFIG_PPC_MPC106 is not set
-CONFIG_GENERIC_TBSYNC=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_TABLE=y
 # CONFIG_CPU_FREQ_DEBUG is not set
@@ -158,6 +162,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
 CONFIG_IOMMU_VMERGE=y
 CONFIG_HOTPLUG_CPU=y
 CONFIG_KEXEC=y
+# CONFIG_CRASH_DUMP is not set
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_EEH=y
@@ -178,6 +183,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y
 CONFIG_SPARSEMEM_EXTREME=y
 # CONFIG_MEMORY_HOTPLUG is not set
 CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_MIGRATION=y
 # CONFIG_PPC_64K_PAGES is not set
 # CONFIG_SCHED_SMT is not set
 CONFIG_PROC_DEVICETREE=y
@@ -221,6 +227,7 @@ CONFIG_NET=y
 #
 # Networking options
 #
+# CONFIG_NETDEBUG is not set
 CONFIG_PACKET=y
 # CONFIG_PACKET_MMAP is not set
 CONFIG_UNIX=y
@@ -260,6 +267,7 @@ CONFIG_NETFILTER=y
 CONFIG_NETFILTER_NETLINK=y
 CONFIG_NETFILTER_NETLINK_QUEUE=m
 CONFIG_NETFILTER_NETLINK_LOG=m
+# CONFIG_NETFILTER_XTABLES is not set
 
 #
 # IP: Netfilter Configuration
@@ -277,65 +285,6 @@ CONFIG_IP_NF_TFTP=m
 CONFIG_IP_NF_AMANDA=m
 # CONFIG_IP_NF_PPTP is not set
 CONFIG_IP_NF_QUEUE=m
-CONFIG_IP_NF_IPTABLES=m
-CONFIG_IP_NF_MATCH_LIMIT=m
-CONFIG_IP_NF_MATCH_IPRANGE=m
-CONFIG_IP_NF_MATCH_MAC=m
-CONFIG_IP_NF_MATCH_PKTTYPE=m
-CONFIG_IP_NF_MATCH_MARK=m
-CONFIG_IP_NF_MATCH_MULTIPORT=m
-CONFIG_IP_NF_MATCH_TOS=m
-CONFIG_IP_NF_MATCH_RECENT=m
-CONFIG_IP_NF_MATCH_ECN=m
-CONFIG_IP_NF_MATCH_DSCP=m
-CONFIG_IP_NF_MATCH_AH_ESP=m
-CONFIG_IP_NF_MATCH_LENGTH=m
-CONFIG_IP_NF_MATCH_TTL=m
-CONFIG_IP_NF_MATCH_TCPMSS=m
-CONFIG_IP_NF_MATCH_HELPER=m
-CONFIG_IP_NF_MATCH_STATE=m
-CONFIG_IP_NF_MATCH_CONNTRACK=m
-CONFIG_IP_NF_MATCH_OWNER=m
-CONFIG_IP_NF_MATCH_ADDRTYPE=m
-CONFIG_IP_NF_MATCH_REALM=m
-CONFIG_IP_NF_MATCH_SCTP=m
-CONFIG_IP_NF_MATCH_DCCP=m
-CONFIG_IP_NF_MATCH_COMMENT=m
-CONFIG_IP_NF_MATCH_CONNMARK=m
-CONFIG_IP_NF_MATCH_CONNBYTES=m
-CONFIG_IP_NF_MATCH_HASHLIMIT=m
-CONFIG_IP_NF_MATCH_STRING=m
-CONFIG_IP_NF_FILTER=m
-CONFIG_IP_NF_TARGET_REJECT=m
-CONFIG_IP_NF_TARGET_LOG=m
-CONFIG_IP_NF_TARGET_ULOG=m
-CONFIG_IP_NF_TARGET_TCPMSS=m
-CONFIG_IP_NF_TARGET_NFQUEUE=m
-CONFIG_IP_NF_NAT=m
-CONFIG_IP_NF_NAT_NEEDED=y
-CONFIG_IP_NF_TARGET_MASQUERADE=m
-CONFIG_IP_NF_TARGET_REDIRECT=m
-CONFIG_IP_NF_TARGET_NETMAP=m
-CONFIG_IP_NF_TARGET_SAME=m
-CONFIG_IP_NF_NAT_SNMP_BASIC=m
-CONFIG_IP_NF_NAT_IRC=m
-CONFIG_IP_NF_NAT_FTP=m
-CONFIG_IP_NF_NAT_TFTP=m
-CONFIG_IP_NF_NAT_AMANDA=m
-CONFIG_IP_NF_MANGLE=m
-CONFIG_IP_NF_TARGET_TOS=m
-CONFIG_IP_NF_TARGET_ECN=m
-CONFIG_IP_NF_TARGET_DSCP=m
-CONFIG_IP_NF_TARGET_MARK=m
-CONFIG_IP_NF_TARGET_CLASSIFY=m
-CONFIG_IP_NF_TARGET_TTL=m
-CONFIG_IP_NF_TARGET_CONNMARK=m
-CONFIG_IP_NF_TARGET_CLUSTERIP=m
-CONFIG_IP_NF_RAW=m
-CONFIG_IP_NF_TARGET_NOTRACK=m
-CONFIG_IP_NF_ARPTABLES=m
-CONFIG_IP_NF_ARPFILTER=m
-CONFIG_IP_NF_ARP_MANGLE=m
 
 #
 # DCCP Configuration (EXPERIMENTAL)
@@ -346,6 +295,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
 # SCTP Configuration (EXPERIMENTAL)
 #
 # CONFIG_IP_SCTP is not set
+
+#
+# TIPC Configuration (EXPERIMENTAL)
+#
+# CONFIG_TIPC is not set
 # CONFIG_ATM is not set
 # CONFIG_BRIDGE is not set
 # CONFIG_VLAN_8021Q is not set
@@ -364,7 +318,6 @@ CONFIG_LLC=y
 # QoS and/or fair queueing
 #
 # CONFIG_NET_SCHED is not set
-CONFIG_NET_CLS_ROUTE=y
 
 #
 # Network testing
@@ -572,13 +525,7 @@ CONFIG_SCSI_IPR_TRACE=y
 CONFIG_SCSI_IPR_DUMP=y
 # CONFIG_SCSI_QLOGIC_FC is not set
 # CONFIG_SCSI_QLOGIC_1280 is not set
-CONFIG_SCSI_QLA2XXX=y
-CONFIG_SCSI_QLA21XX=m
-CONFIG_SCSI_QLA22XX=m
-CONFIG_SCSI_QLA2300=m
-CONFIG_SCSI_QLA2322=m
-CONFIG_SCSI_QLA6312=m
-CONFIG_SCSI_QLA24XX=m
+# CONFIG_SCSI_QLA_FC is not set
 CONFIG_SCSI_LPFC=m
 # CONFIG_SCSI_DC395x is not set
 # CONFIG_SCSI_DC390T is not set
@@ -642,8 +589,6 @@ CONFIG_IEEE1394_SBP2=m
 CONFIG_IEEE1394_ETH1394=m
 CONFIG_IEEE1394_DV1394=m
 CONFIG_IEEE1394_RAWIO=y
-CONFIG_IEEE1394_CMP=m
-CONFIG_IEEE1394_AMDTP=m
 
 #
 # I2O device support
@@ -659,6 +604,7 @@ CONFIG_THERM_PM72=y
 CONFIG_WINDFARM=y
 CONFIG_WINDFARM_PM81=y
 CONFIG_WINDFARM_PM91=y
+CONFIG_WINDFARM_PM112=y
 
 #
 # Network device support
@@ -731,6 +677,7 @@ CONFIG_E1000=y
 # CONFIG_R8169 is not set
 # CONFIG_SIS190 is not set
 # CONFIG_SKGE is not set
+# CONFIG_SKY2 is not set
 # CONFIG_SK98LIN is not set
 # CONFIG_VIA_VELOCITY is not set
 CONFIG_TIGON3=y
@@ -853,6 +800,7 @@ CONFIG_HW_CONSOLE=y
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_NR_UARTS=4
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
 # CONFIG_SERIAL_8250_EXTENDED is not set
 
 #
@@ -880,6 +828,7 @@ CONFIG_HVCS=m
 # CONFIG_WATCHDOG is not set
 # CONFIG_RTC is not set
 CONFIG_GEN_RTC=y
+# CONFIG_GEN_RTC_X is not set
 # CONFIG_DTLK is not set
 # CONFIG_R3964 is not set
 # CONFIG_APPLICOM is not set
@@ -923,8 +872,7 @@ CONFIG_I2C_AMD8111=y
 # CONFIG_I2C_I801 is not set
 # CONFIG_I2C_I810 is not set
 # CONFIG_I2C_PIIX4 is not set
-CONFIG_I2C_KEYWEST=y
-CONFIG_I2C_PMAC_SMU=y
+CONFIG_I2C_POWERMAC=y
 # CONFIG_I2C_NFORCE2 is not set
 # CONFIG_I2C_PARPORT_LIGHT is not set
 # CONFIG_I2C_PROSAVAGE is not set
@@ -957,6 +905,12 @@ CONFIG_I2C_PMAC_SMU=y
 # CONFIG_I2C_DEBUG_CHIP is not set
 
 #
+# SPI support
+#
+# CONFIG_SPI is not set
+# CONFIG_SPI_MASTER is not set
+
+#
 # Dallas's 1-wire bus
 #
 # CONFIG_W1 is not set
@@ -1028,7 +982,6 @@ CONFIG_FB_RADEON_I2C=y
 # CONFIG_FB_KYRO is not set
 # CONFIG_FB_3DFX is not set
 # CONFIG_FB_VOODOO1 is not set
-# CONFIG_FB_CYBLA is not set
 # CONFIG_FB_TRIDENT is not set
 # CONFIG_FB_VIRTUAL is not set
 
@@ -1073,9 +1026,10 @@ CONFIG_SND_OSSEMUL=y
 CONFIG_SND_MIXER_OSS=m
 CONFIG_SND_PCM_OSS=m
 CONFIG_SND_SEQUENCER_OSS=y
+# CONFIG_SND_DYNAMIC_MINORS is not set
+CONFIG_SND_SUPPORT_OLD_API=y
 # CONFIG_SND_VERBOSE_PRINTK is not set
 # CONFIG_SND_DEBUG is not set
-CONFIG_SND_GENERIC_DRIVER=y
 
 #
 # Generic devices
@@ -1089,6 +1043,8 @@ CONFIG_SND_GENERIC_DRIVER=y
 #
 # PCI devices
 #
+# CONFIG_SND_AD1889 is not set
+# CONFIG_SND_ALS4000 is not set
 # CONFIG_SND_ALI5451 is not set
 # CONFIG_SND_ATIIXP is not set
 # CONFIG_SND_ATIIXP_MODEM is not set
@@ -1097,39 +1053,38 @@ CONFIG_SND_GENERIC_DRIVER=y
 # CONFIG_SND_AU8830 is not set
 # CONFIG_SND_AZT3328 is not set
 # CONFIG_SND_BT87X is not set
-# CONFIG_SND_CS46XX is not set
+# CONFIG_SND_CA0106 is not set
+# CONFIG_SND_CMIPCI is not set
 # CONFIG_SND_CS4281 is not set
+# CONFIG_SND_CS46XX is not set
 # CONFIG_SND_EMU10K1 is not set
 # CONFIG_SND_EMU10K1X is not set
-# CONFIG_SND_CA0106 is not set
-# CONFIG_SND_KORG1212 is not set
-# CONFIG_SND_MIXART is not set
-# CONFIG_SND_NM256 is not set
-# CONFIG_SND_RME32 is not set
-# CONFIG_SND_RME96 is not set
-# CONFIG_SND_RME9652 is not set
-# CONFIG_SND_HDSP is not set
-# CONFIG_SND_HDSPM is not set
-# CONFIG_SND_TRIDENT is not set
-# CONFIG_SND_YMFPCI is not set
-# CONFIG_SND_AD1889 is not set
-# CONFIG_SND_ALS4000 is not set
-# CONFIG_SND_CMIPCI is not set
 # CONFIG_SND_ENS1370 is not set
 # CONFIG_SND_ENS1371 is not set
 # CONFIG_SND_ES1938 is not set
 # CONFIG_SND_ES1968 is not set
-# CONFIG_SND_MAESTRO3 is not set
 # CONFIG_SND_FM801 is not set
+# CONFIG_SND_HDA_INTEL is not set
+# CONFIG_SND_HDSP is not set
+# CONFIG_SND_HDSPM is not set
 # CONFIG_SND_ICE1712 is not set
 # CONFIG_SND_ICE1724 is not set
 # CONFIG_SND_INTEL8X0 is not set
 # CONFIG_SND_INTEL8X0M is not set
+# CONFIG_SND_KORG1212 is not set
+# CONFIG_SND_MAESTRO3 is not set
+# CONFIG_SND_MIXART is not set
+# CONFIG_SND_NM256 is not set
+# CONFIG_SND_PCXHR is not set
+# CONFIG_SND_RME32 is not set
+# CONFIG_SND_RME96 is not set
+# CONFIG_SND_RME9652 is not set
 # CONFIG_SND_SONICVIBES is not set
+# CONFIG_SND_TRIDENT is not set
 # CONFIG_SND_VIA82XX is not set
 # CONFIG_SND_VIA82XX_MODEM is not set
 # CONFIG_SND_VX222 is not set
-# CONFIG_SND_HDA_INTEL is not set
+# CONFIG_SND_YMFPCI is not set
 
 #
 # ALSA PowerMac devices
@@ -1201,13 +1156,16 @@ CONFIG_USB_STORAGE=m
 # CONFIG_USB_STORAGE_SDDR09 is not set
 # CONFIG_USB_STORAGE_SDDR55 is not set
 # CONFIG_USB_STORAGE_JUMPSHOT is not set
+# CONFIG_USB_STORAGE_ALAUDA is not set
 # CONFIG_USB_STORAGE_ONETOUCH is not set
+# CONFIG_USB_LIBUSUAL is not set
 
 #
 # USB Input Devices
 #
 CONFIG_USB_HID=y
 CONFIG_USB_HIDINPUT=y
+# CONFIG_USB_HIDINPUT_POWERBOOK is not set
 # CONFIG_HID_FF is not set
 CONFIG_USB_HIDDEV=y
 # CONFIG_USB_AIPTEK is not set
@@ -1221,6 +1179,7 @@ CONFIG_USB_HIDDEV=y
 # CONFIG_USB_YEALINK is not set
 # CONFIG_USB_XPAD is not set
 # CONFIG_USB_ATI_REMOTE is not set
+# CONFIG_USB_ATI_REMOTE2 is not set
 # CONFIG_USB_KEYSPAN_REMOTE is not set
 # CONFIG_USB_APPLETOUCH is not set
 
@@ -1307,6 +1266,10 @@ CONFIG_INFINIBAND_IPOIB=m
 #
 
 #
+# EDAC - error detection and reporting (RAS)
+#
+
+#
 # File systems
 #
 CONFIG_EXT2_FS=y
@@ -1340,6 +1303,7 @@ CONFIG_XFS_EXPORT=y
 CONFIG_XFS_SECURITY=y
 CONFIG_XFS_POSIX_ACL=y
 # CONFIG_XFS_RT is not set
+# CONFIG_OCFS2_FS is not set
 # CONFIG_MINIX_FS is not set
 # CONFIG_ROMFS_FS is not set
 CONFIG_INOTIFY=y
@@ -1379,6 +1343,7 @@ CONFIG_HUGETLBFS=y
 CONFIG_HUGETLB_PAGE=y
 CONFIG_RAMFS=y
 # CONFIG_RELAYFS_FS is not set
+# CONFIG_CONFIGFS_FS is not set
 
 #
 # Miscellaneous filesystems
@@ -1449,6 +1414,7 @@ CONFIG_MSDOS_PARTITION=y
 # CONFIG_SGI_PARTITION is not set
 # CONFIG_ULTRIX_PARTITION is not set
 # CONFIG_SUN_PARTITION is not set
+# CONFIG_KARMA_PARTITION is not set
 # CONFIG_EFI_PARTITION is not set
 
 #
@@ -1504,10 +1470,6 @@ CONFIG_CRC32=y
 CONFIG_LIBCRC32C=m
 CONFIG_ZLIB_INFLATE=y
 CONFIG_ZLIB_DEFLATE=m
-CONFIG_TEXTSEARCH=y
-CONFIG_TEXTSEARCH_KMP=m
-CONFIG_TEXTSEARCH_BM=m
-CONFIG_TEXTSEARCH_FSM=m
 
 #
 # Instrumentation Support
@@ -1520,18 +1482,20 @@ CONFIG_OPROFILE=y
 # Kernel hacking
 #
 # CONFIG_PRINTK_TIME is not set
-CONFIG_DEBUG_KERNEL=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_KERNEL=y
 CONFIG_LOG_BUF_SHIFT=17
 CONFIG_DETECT_SOFTLOCKUP=y
 # CONFIG_SCHEDSTATS is not set
 # CONFIG_DEBUG_SLAB is not set
+CONFIG_DEBUG_MUTEXES=y
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
 # CONFIG_DEBUG_INFO is not set
 CONFIG_DEBUG_FS=y
 # CONFIG_DEBUG_VM is not set
+CONFIG_FORCED_INLINING=y
 # CONFIG_RCU_TORTURE_TEST is not set
 CONFIG_DEBUG_STACKOVERFLOW=y
 CONFIG_DEBUG_STACK_USAGE=y
@@ -1540,6 +1504,11 @@ CONFIG_XMON=y
 # CONFIG_XMON_DEFAULT is not set
 CONFIG_IRQSTACKS=y
 CONFIG_BOOTX_TEXT=y
+# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
+# CONFIG_PPC_EARLY_DEBUG_G5 is not set
+# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
+# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
+# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
 
 #
 # Security options
Index: powerpc-git/arch/powerpc/configs/pseries_defconfig
===================================================================
--- powerpc-git.orig/arch/powerpc/configs/pseries_defconfig
+++ powerpc-git/arch/powerpc/configs/pseries_defconfig
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 2.6.15-rc5
-# Tue Dec 20 15:59:40 2005
+# Linux kernel version: 2.6.16-rc2
+# Fri Feb 10 17:33:32 2006
 #
 CONFIG_PPC64=y
 CONFIG_64BIT=y
@@ -16,6 +16,10 @@ CONFIG_COMPAT=y
 CONFIG_SYSVIPC_COMPAT=y
 CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
 CONFIG_ARCH_MAY_HAVE_PC_FDC=y
+CONFIG_PPC_OF=y
+CONFIG_PPC_UDBG_16550=y
+# CONFIG_GENERIC_TBSYNC is not set
+# CONFIG_DEFAULT_UIMAGE is not set
 
 #
 # Processor support
@@ -33,7 +37,6 @@ CONFIG_NR_CPUS=128
 # Code maturity level options
 #
 CONFIG_EXPERIMENTAL=y
-CONFIG_CLEAN_COMPILE=y
 CONFIG_LOCK_KERNEL=y
 CONFIG_INIT_ENV_ARG_LIMIT=32
 
@@ -49,8 +52,6 @@ CONFIG_POSIX_MQUEUE=y
 CONFIG_SYSCTL=y
 CONFIG_AUDIT=y
 CONFIG_AUDITSYSCALL=y
-CONFIG_HOTPLUG=y
-CONFIG_KOBJECT_UEVENT=y
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_CPUSETS=y
@@ -60,8 +61,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
 CONFIG_KALLSYMS=y
 CONFIG_KALLSYMS_ALL=y
 # CONFIG_KALLSYMS_EXTRA_PASS is not set
+CONFIG_HOTPLUG=y
 CONFIG_PRINTK=y
 CONFIG_BUG=y
+CONFIG_ELF_CORE=y
 CONFIG_BASE_FULL=y
 CONFIG_FUTEX=y
 CONFIG_EPOLL=y
@@ -70,8 +73,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
 CONFIG_CC_ALIGN_LABELS=0
 CONFIG_CC_ALIGN_LOOPS=0
 CONFIG_CC_ALIGN_JUMPS=0
+CONFIG_SLAB=y
 # CONFIG_TINY_SHMEM is not set
 CONFIG_BASE_SMALL=0
+# CONFIG_SLOB is not set
 
 #
 # Loadable module support
@@ -113,7 +118,6 @@ CONFIG_PPC_PSERIES=y
 # CONFIG_PPC_PMAC is not set
 # CONFIG_PPC_MAPLE is not set
 # CONFIG_PPC_CELL is not set
-CONFIG_PPC_OF=y
 CONFIG_XICS=y
 # CONFIG_U3_DART is not set
 CONFIG_MPIC=y
@@ -123,8 +127,8 @@ CONFIG_RTAS_PROC=y
 CONFIG_RTAS_FLASH=m
 # CONFIG_MMIO_NVRAM is not set
 CONFIG_IBMVIO=y
+# CONFIG_IBMEBUS is not set
 # CONFIG_PPC_MPC106 is not set
-# CONFIG_GENERIC_TBSYNC is not set
 # CONFIG_CPU_FREQ is not set
 # CONFIG_WANT_EARLY_SERIAL is not set
 
@@ -145,6 +149,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
 CONFIG_IOMMU_VMERGE=y
 CONFIG_HOTPLUG_CPU=y
 CONFIG_KEXEC=y
+# CONFIG_CRASH_DUMP is not set
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_EEH=y
@@ -165,6 +170,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y
 CONFIG_SPARSEMEM_EXTREME=y
 # CONFIG_MEMORY_HOTPLUG is not set
 CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_MIGRATION=y
 CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y
 # CONFIG_PPC_64K_PAGES is not set
 CONFIG_SCHED_SMT=y
@@ -209,6 +215,7 @@ CONFIG_NET=y
 #
 # Networking options
 #
+# CONFIG_NETDEBUG is not set
 CONFIG_PACKET=y
 # CONFIG_PACKET_MMAP is not set
 CONFIG_UNIX=y
@@ -248,6 +255,7 @@ CONFIG_NETFILTER=y
 CONFIG_NETFILTER_NETLINK=y
 CONFIG_NETFILTER_NETLINK_QUEUE=m
 CONFIG_NETFILTER_NETLINK_LOG=m
+# CONFIG_NETFILTER_XTABLES is not set
 
 #
 # IP: Netfilter Configuration
@@ -265,65 +273,6 @@ CONFIG_IP_NF_TFTP=m
 CONFIG_IP_NF_AMANDA=m
 # CONFIG_IP_NF_PPTP is not set
 CONFIG_IP_NF_QUEUE=m
-CONFIG_IP_NF_IPTABLES=m
-CONFIG_IP_NF_MATCH_LIMIT=m
-CONFIG_IP_NF_MATCH_IPRANGE=m
-CONFIG_IP_NF_MATCH_MAC=m
-CONFIG_IP_NF_MATCH_PKTTYPE=m
-CONFIG_IP_NF_MATCH_MARK=m
-CONFIG_IP_NF_MATCH_MULTIPORT=m
-CONFIG_IP_NF_MATCH_TOS=m
-CONFIG_IP_NF_MATCH_RECENT=m
-CONFIG_IP_NF_MATCH_ECN=m
-CONFIG_IP_NF_MATCH_DSCP=m
-CONFIG_IP_NF_MATCH_AH_ESP=m
-CONFIG_IP_NF_MATCH_LENGTH=m
-CONFIG_IP_NF_MATCH_TTL=m
-CONFIG_IP_NF_MATCH_TCPMSS=m
-CONFIG_IP_NF_MATCH_HELPER=m
-CONFIG_IP_NF_MATCH_STATE=m
-CONFIG_IP_NF_MATCH_CONNTRACK=m
-CONFIG_IP_NF_MATCH_OWNER=m
-CONFIG_IP_NF_MATCH_ADDRTYPE=m
-CONFIG_IP_NF_MATCH_REALM=m
-CONFIG_IP_NF_MATCH_SCTP=m
-# CONFIG_IP_NF_MATCH_DCCP is not set
-CONFIG_IP_NF_MATCH_COMMENT=m
-CONFIG_IP_NF_MATCH_CONNMARK=m
-CONFIG_IP_NF_MATCH_CONNBYTES=m
-CONFIG_IP_NF_MATCH_HASHLIMIT=m
-CONFIG_IP_NF_MATCH_STRING=m
-CONFIG_IP_NF_FILTER=m
-CONFIG_IP_NF_TARGET_REJECT=m
-CONFIG_IP_NF_TARGET_LOG=m
-CONFIG_IP_NF_TARGET_ULOG=m
-CONFIG_IP_NF_TARGET_TCPMSS=m
-CONFIG_IP_NF_TARGET_NFQUEUE=m
-CONFIG_IP_NF_NAT=m
-CONFIG_IP_NF_NAT_NEEDED=y
-CONFIG_IP_NF_TARGET_MASQUERADE=m
-CONFIG_IP_NF_TARGET_REDIRECT=m
-CONFIG_IP_NF_TARGET_NETMAP=m
-CONFIG_IP_NF_TARGET_SAME=m
-CONFIG_IP_NF_NAT_SNMP_BASIC=m
-CONFIG_IP_NF_NAT_IRC=m
-CONFIG_IP_NF_NAT_FTP=m
-CONFIG_IP_NF_NAT_TFTP=m
-CONFIG_IP_NF_NAT_AMANDA=m
-CONFIG_IP_NF_MANGLE=m
-CONFIG_IP_NF_TARGET_TOS=m
-CONFIG_IP_NF_TARGET_ECN=m
-CONFIG_IP_NF_TARGET_DSCP=m
-CONFIG_IP_NF_TARGET_MARK=m
-CONFIG_IP_NF_TARGET_CLASSIFY=m
-CONFIG_IP_NF_TARGET_TTL=m
-CONFIG_IP_NF_TARGET_CONNMARK=m
-CONFIG_IP_NF_TARGET_CLUSTERIP=m
-CONFIG_IP_NF_RAW=m
-CONFIG_IP_NF_TARGET_NOTRACK=m
-CONFIG_IP_NF_ARPTABLES=m
-CONFIG_IP_NF_ARPFILTER=m
-CONFIG_IP_NF_ARP_MANGLE=m
 
 #
 # DCCP Configuration (EXPERIMENTAL)
@@ -334,6 +283,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
 # SCTP Configuration (EXPERIMENTAL)
 #
 # CONFIG_IP_SCTP is not set
+
+#
+# TIPC Configuration (EXPERIMENTAL)
+#
+# CONFIG_TIPC is not set
 # CONFIG_ATM is not set
 # CONFIG_BRIDGE is not set
 # CONFIG_VLAN_8021Q is not set
@@ -352,7 +306,6 @@ CONFIG_LLC=y
 # QoS and/or fair queueing
 #
 # CONFIG_NET_SCHED is not set
-CONFIG_NET_CLS_ROUTE=y
 
 #
 # Network testing
@@ -550,13 +503,7 @@ CONFIG_SCSI_IPR_TRACE=y
 CONFIG_SCSI_IPR_DUMP=y
 # CONFIG_SCSI_QLOGIC_FC is not set
 # CONFIG_SCSI_QLOGIC_1280 is not set
-CONFIG_SCSI_QLA2XXX=y
-CONFIG_SCSI_QLA21XX=m
-CONFIG_SCSI_QLA22XX=m
-CONFIG_SCSI_QLA2300=m
-CONFIG_SCSI_QLA2322=m
-CONFIG_SCSI_QLA6312=m
-CONFIG_SCSI_QLA24XX=m
+# CONFIG_SCSI_QLA_FC is not set
 CONFIG_SCSI_LPFC=m
 # CONFIG_SCSI_DC395x is not set
 # CONFIG_SCSI_DC390T is not set
@@ -678,6 +625,7 @@ CONFIG_E1000=y
 # CONFIG_R8169 is not set
 # CONFIG_SIS190 is not set
 # CONFIG_SKGE is not set
+# CONFIG_SKY2 is not set
 # CONFIG_SK98LIN is not set
 # CONFIG_VIA_VELOCITY is not set
 CONFIG_TIGON3=y
@@ -803,6 +751,7 @@ CONFIG_HW_CONSOLE=y
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_NR_UARTS=4
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
 # CONFIG_SERIAL_8250_EXTENDED is not set
 
 #
@@ -909,6 +858,12 @@ CONFIG_I2C_ALGOBIT=y
 # CONFIG_I2C_DEBUG_CHIP is not set
 
 #
+# SPI support
+#
+# CONFIG_SPI is not set
+# CONFIG_SPI_MASTER is not set
+
+#
 # Dallas's 1-wire bus
 #
 # CONFIG_W1 is not set
@@ -976,7 +931,6 @@ CONFIG_FB_RADEON_I2C=y
 # CONFIG_FB_KYRO is not set
 # CONFIG_FB_3DFX is not set
 # CONFIG_FB_VOODOO1 is not set
-# CONFIG_FB_CYBLA is not set
 # CONFIG_FB_TRIDENT is not set
 # CONFIG_FB_VIRTUAL is not set
 
@@ -1061,12 +1015,15 @@ CONFIG_USB_STORAGE=y
 # CONFIG_USB_STORAGE_SDDR09 is not set
 # CONFIG_USB_STORAGE_SDDR55 is not set
 # CONFIG_USB_STORAGE_JUMPSHOT is not set
+# CONFIG_USB_STORAGE_ALAUDA is not set
+# CONFIG_USB_LIBUSUAL is not set
 
 #
 # USB Input Devices
 #
 CONFIG_USB_HID=y
 CONFIG_USB_HIDINPUT=y
+# CONFIG_USB_HIDINPUT_POWERBOOK is not set
 # CONFIG_HID_FF is not set
 CONFIG_USB_HIDDEV=y
 # CONFIG_USB_AIPTEK is not set
@@ -1080,6 +1037,7 @@ CONFIG_USB_HIDDEV=y
 # CONFIG_USB_YEALINK is not set
 # CONFIG_USB_XPAD is not set
 # CONFIG_USB_ATI_REMOTE is not set
+# CONFIG_USB_ATI_REMOTE2 is not set
 # CONFIG_USB_KEYSPAN_REMOTE is not set
 # CONFIG_USB_APPLETOUCH is not set
 
@@ -1167,6 +1125,10 @@ CONFIG_INFINIBAND_IPOIB=m
 #
 
 #
+# EDAC - error detection and reporting (RAS)
+#
+
+#
 # File systems
 #
 CONFIG_EXT2_FS=y
@@ -1200,6 +1162,7 @@ CONFIG_XFS_EXPORT=y
 CONFIG_XFS_SECURITY=y
 CONFIG_XFS_POSIX_ACL=y
 # CONFIG_XFS_RT is not set
+# CONFIG_OCFS2_FS is not set
 # CONFIG_MINIX_FS is not set
 # CONFIG_ROMFS_FS is not set
 CONFIG_INOTIFY=y
@@ -1240,6 +1203,7 @@ CONFIG_HUGETLBFS=y
 CONFIG_HUGETLB_PAGE=y
 CONFIG_RAMFS=y
 # CONFIG_RELAYFS_FS is not set
+# CONFIG_CONFIGFS_FS is not set
 
 #
 # Miscellaneous filesystems
@@ -1351,10 +1315,6 @@ CONFIG_CRC32=y
 CONFIG_LIBCRC32C=m
 CONFIG_ZLIB_INFLATE=y
 CONFIG_ZLIB_DEFLATE=m
-CONFIG_TEXTSEARCH=y
-CONFIG_TEXTSEARCH_KMP=m
-CONFIG_TEXTSEARCH_BM=m
-CONFIG_TEXTSEARCH_FSM=m
 
 #
 # Instrumentation Support
@@ -1367,18 +1327,20 @@ CONFIG_OPROFILE=y
 # Kernel hacking
 #
 # CONFIG_PRINTK_TIME is not set
-CONFIG_DEBUG_KERNEL=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_KERNEL=y
 CONFIG_LOG_BUF_SHIFT=17
 CONFIG_DETECT_SOFTLOCKUP=y
 # CONFIG_SCHEDSTATS is not set
 # CONFIG_DEBUG_SLAB is not set
+CONFIG_DEBUG_MUTEXES=y
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
 # CONFIG_DEBUG_INFO is not set
 CONFIG_DEBUG_FS=y
 # CONFIG_DEBUG_VM is not set
+CONFIG_FORCED_INLINING=y
 # CONFIG_RCU_TORTURE_TEST is not set
 CONFIG_DEBUG_STACKOVERFLOW=y
 CONFIG_DEBUG_STACK_USAGE=y
@@ -1387,6 +1349,11 @@ CONFIG_XMON=y
 CONFIG_XMON_DEFAULT=y
 CONFIG_IRQSTACKS=y
 # CONFIG_BOOTX_TEXT is not set
+# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
+# CONFIG_PPC_EARLY_DEBUG_G5 is not set
+# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
+# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
+# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
 
 #
 # Security options


From arnd at arndb.de  Sun Feb 12 15:52:25 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Sun, 12 Feb 2006 05:52:25 +0100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139546116.5003.81.camel@localhost.localdomain>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<1139546116.5003.81.camel@localhost.localdomain>
Message-ID: <200602120552.26164.arnd@arndb.de>

On Friday 10 February 2006 05:35, Benjamin Herrenschmidt wrote:
> 
> > Doing it in the firmware sounds like the right solution to me.
> > I would however not want to do that if the current firmware
> > sets the wrong page sizes.
> > 
> > I know that Hartmut wanted me to provide him with the right device
> > tree information that he needs to create to say that the page
> > sizes are 16M, 64k and 4k. Maybe we can find a combined solution
> > for these problems. Using __setup_cpu_power4 should be ok.
> 
> I don't completely understand your statement ... sorry

The current firmware on the Cell blades neither sets up the HID6
register nor has the correct tables in the device tree.

Since I'm still currently sitting in a garden in NZ instead of the
Böblingen lab, I can't find out what the HID6 power-on defaults
are. We might get away with just leaving the default there, but that
might prevent us from using 16M and/or 64k pages, and there are
definitely some applications which depend on 16M hugetlb mappings
on Cell.

The two problems we are facing currently are:
- If HID6 defaults to disabling 16M large pages, the kernel will
  get the wrong information from the CPU features and applications
  that use them will break. The firmware should add the setup of HID6
  _now_, but we also should be prepared for users of old firmware
  that want to upgrade their kernel without upgrading the firmware
  at the same time.
- We want to use 64k pages in the future, so the firmware needs to
  add the 'ibm,segment-page-sizes' property ASAP, preferably at
  the same time they start setting up HID6. I currently have a
  hack for the kernel to override that, but we're in the process
  of eliminating all the special hacks that won't make it into
  the mainline kernel.

> > We could probably do a fallback in the cell setup to see if
> > the properties are in the device tree and do our own HID6 
> > setup stuff if not, normally expecting that the firmware settings
> > match the device tree.
> 
> We should not touch HID6 at all ... we should assume the firmware set it
> appropriately and have setup matching page size entries in the
> device-tree. I don't think we need to support changing that value
> especially since the kernel doesn't quite support 1M large page sizes
> anyway.

Yes, 1M mappings are probably not of much use to us, and other OSs
already do whatever they like ;-).

> > Geoff, if your firmware does not already have the properties
> > for large page sizes, could you add them?
> > 
> > Ben, could you point Hartmut (and maybe Geoff) to the documentation
> > for what the device tree needs to look like?
> 
> I'm not sure we published that yet :) I would suggest looking at what
> the kernel does to parse these instead in hash_utils.c until I get a
> formal IBM approval for the spec to be published.

Then please try to at least send the spec or a link to Hartmut's IBM
internal address (hpenner at de.ibm.com). I already pointed him to the
Linux code when it was initially merged, but he argued that reverse
engineering that code is not good enough to be sure of getting the
property right, and that not having it in there is better than having
incorrect properties.
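
For illustration only, a minimal Python sketch of decoding such a
property. The cell layout is an assumption inferred from the Linux
parser in hash_utils.c, not taken from the unpublished spec: 32-bit
big-endian cells, where each segment entry is (page shift, SLB
encoding, pair count) followed by that many (page shift, PTE encoding)
pairs. The function name and the example values are hypothetical.

```python
import struct


def decode_segment_page_sizes(blob):
    """Decode an 'ibm,segment-page-sizes'-style blob (layout is assumed).

    Returns one dict per segment entry. 'pte_enc' is the encoding from the
    pair whose shift matches the segment's own shift, or None if no such
    pair is present.
    """
    if len(blob) % 4:
        raise ValueError("property length is not a multiple of 4 bytes")
    cells = struct.unpack(">%dI" % (len(blob) // 4), blob)

    entries = []
    i = 0
    while i < len(cells):
        if i + 3 > len(cells):
            raise ValueError("truncated segment header")
        shift, slb_enc, npairs = cells[i:i + 3]
        i += 3
        if i + 2 * npairs > len(cells):
            raise ValueError("truncated page-size pairs")
        pte_enc = None
        for _ in range(npairs):
            pair_shift, pair_enc = cells[i], cells[i + 1]
            i += 2
            if pair_shift == shift:
                pte_enc = pair_enc
        entries.append({
            "shift": shift,
            "size": 1 << shift,      # page size in bytes
            "slb_enc": slb_enc,
            "pte_enc": pte_enc,
        })
    return entries


if __name__ == "__main__":
    # Made-up example: a 4k entry with no pairs and a 16M entry with one pair.
    example = struct.pack(">8I", 12, 0, 0, 24, 0x100, 1, 24, 0)
    for entry in decode_segment_page_sizes(example):
        print(entry)
```

A firmware author could run something like this against a dumped
property to check that the cell counts add up, but it is no substitute
for the spec.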

	Arnd <><


From benh at kernel.crashing.org  Mon Feb 13 08:24:22 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 13 Feb 2006 08:24:22 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <200602120552.26164.arnd@arndb.de>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<1139546116.5003.81.camel@localhost.localdomain>
	<200602120552.26164.arnd@arndb.de>
Message-ID: <1139779462.5247.30.camel@localhost.localdomain>


> The current firmware on the Cell blades neither sets up the HID6
> register nor has the correct tables in the device tree.
> 
> Since I'm still currently sitting in a garden in NZ instead of the
> Böblingen lab, I can't find out what the HID6 power-on defaults
> are. We might get away with just leaving the default there, but that
> might prevent us from using 16M and/or 64k pages, and there are
> definitely some applications which depend on 16M hugetlb mappings
> on Cell.

Yes, however, how widely distributed and "frozen" is this current
Cell firmware? I mean, do we really need to add a workaround to the
kernel instead of just fixing the firmware here?

> The two problems we are facing currently are:
> - If HID6 defaults to disabling 16M large pages, the kernel will
>   get the wrong information from the CPU features and applications
>   that use them will break. The firmware should add the setup of HID6
>   _now_, but we also should be prepared for users of old firmware
>   that want to upgrade their kernel without upgrading the firmware
>   at the same time.

Do we really need to support old/broken firmware? It's not like we have
a released product out in the field...

> - We want to use 64k pages in the future, so the firmware needs to
>   add the 'ibm,segment-page-sizes' property ASAP, preferably at
>   the same time they start setting up HID6. I currently have a
>   hack for the kernel to override that, but we're in the process
>   of eliminating all the special hacks that won't make it into
>   the mainline kernel.

The only things you need are this property and the new
ibm,pa-features property, for which I need to dig out the latest spec.
The problem is that the kernel will currently not enable 64k pages on
any processor, because a feature bit is (intentionally) missing from
the cputable. That bit will be extracted from ibm,pa-features, at least
on pSeries. It's the bit indicating that L=1 works for cache-inhibited
mappings.

> Yes, 1M mappings are probably not of much use to us, and other OSs
> already do whatever they like ;-).

Sure. Note that the firmware can still set HID6 to 1M pages and put the
appropriate entries in the device tree for 1M large pages. Linux won't
be able to use them as-is, but at least the device-tree info will
be sane. I don't want to enter a debate about whether we should be able
to change HID6 etc. right now. It's more a firmware configuration issue
as far as I'm concerned.

> Then please try to at least send the spec or a link to Hartmut's IBM
> internal address (hpenner at de.ibm.com). I already pointed him to the
> linux code when it was initially merged, but he argued that reverse
> engineering that code is not good enough to be sure to get the
> property right and not having it in there is better than having incorrect
> properties.

Will do
Ben.


From benh at kernel.crashing.org  Mon Feb 13 09:11:44 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 13 Feb 2006 09:11:44 +1100
Subject: [PATCH] Update {g5,pseries,ppc64}_defconfig
In-Reply-To: <20060210234903.GB4795@pb15.lixom.net>
References: <20060210234903.GB4795@pb15.lixom.net>
Message-ID: <1139782304.5247.42.camel@localhost.localdomain>

On Fri, 2006-02-10 at 17:49 -0600, Olof Johansson wrote:
> Hi,
> 
> For powerpc.git (post-2.6.16):
> 
> Update defconfigs for g5, pseries and generic ppc64. Default choices
> for everything, with the following exceptions:
> 
>  * Enable WINDFARM_PM112 on g5 and ppc64.
>  * Increase CONFIG_NR_CPUS to 4 on g5_defconfig

You probably also want to make tg3 built-in...
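
tg3 is the driver built by CONFIG_TIGON3, which the g5_defconfig in the
quoted patch leaves as =m. A rough sketch for spotting that kind of
thing: a small Python helper, not part of the patch, that parses two
.config-style texts and lists the options whose values differ. The
helper names are made up for illustration.

```python
import re

# "CONFIG_FOO=y", "CONFIG_FOO=m", "CONFIG_NR_CPUS=4", ...
_SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
# "# CONFIG_FOO is not set"
_UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; explicitly unset options map to 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET_RE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET_RE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(old, new):
    """Return {option: (old_value, new_value)} for every option that differs.

    A value of None means the option does not appear in that file at all.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    old = parse_config("CONFIG_NR_CPUS=2\nCONFIG_TIGON3=m\n")
    new = parse_config("CONFIG_NR_CPUS=4\nCONFIG_TIGON3=y\n")
    for name, (before, after) in diff_configs(old, new).items():
        print("%s: %s -> %s" % (name, before, after))
```

Feeding it the old and new defconfig contents gives a quicker overview
than reading the unified diff, at the cost of losing the section
comments.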

Ben.

> 
> Signed-off-by: Olof Johansson <olof at lixom.net>
> 
> Index: powerpc-git/arch/powerpc/configs/g5_defconfig
> ===================================================================
> --- powerpc-git.orig/arch/powerpc/configs/g5_defconfig
> +++ powerpc-git/arch/powerpc/configs/g5_defconfig
> @@ -1,7 +1,7 @@
>  #
>  # Automatically generated make config: don't edit
> -# Linux kernel version: 2.6.15-rc5
> -# Tue Dec 20 15:59:30 2005
> +# Linux kernel version: 2.6.16-rc2
> +# Fri Feb 10 17:33:08 2006
>  #
>  CONFIG_PPC64=y
>  CONFIG_64BIT=y
> @@ -16,6 +16,10 @@ CONFIG_COMPAT=y
>  CONFIG_SYSVIPC_COMPAT=y
>  CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
>  CONFIG_ARCH_MAY_HAVE_PC_FDC=y
> +CONFIG_PPC_OF=y
> +# CONFIG_PPC_UDBG_16550 is not set
> +CONFIG_GENERIC_TBSYNC=y
> +# CONFIG_DEFAULT_UIMAGE is not set
>  
>  #
>  # Processor support
> @@ -26,13 +30,12 @@ CONFIG_PPC_FPU=y
>  CONFIG_ALTIVEC=y
>  CONFIG_PPC_STD_MMU=y
>  CONFIG_SMP=y
> -CONFIG_NR_CPUS=2
> +CONFIG_NR_CPUS=4
>  
>  #
>  # Code maturity level options
>  #
>  CONFIG_EXPERIMENTAL=y
> -CONFIG_CLEAN_COMPILE=y
>  CONFIG_LOCK_KERNEL=y
>  CONFIG_INIT_ENV_ARG_LIMIT=32
>  
> @@ -47,8 +50,6 @@ CONFIG_POSIX_MQUEUE=y
>  # CONFIG_BSD_PROCESS_ACCT is not set
>  CONFIG_SYSCTL=y
>  # CONFIG_AUDIT is not set
> -CONFIG_HOTPLUG=y
> -CONFIG_KOBJECT_UEVENT=y
>  CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  # CONFIG_CPUSETS is not set
> @@ -58,8 +59,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
>  CONFIG_KALLSYMS=y
>  # CONFIG_KALLSYMS_ALL is not set
>  # CONFIG_KALLSYMS_EXTRA_PASS is not set
> +CONFIG_HOTPLUG=y
>  CONFIG_PRINTK=y
>  CONFIG_BUG=y
> +CONFIG_ELF_CORE=y
>  CONFIG_BASE_FULL=y
>  CONFIG_FUTEX=y
>  CONFIG_EPOLL=y
> @@ -68,8 +71,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
>  CONFIG_CC_ALIGN_LABELS=0
>  CONFIG_CC_ALIGN_LOOPS=0
>  CONFIG_CC_ALIGN_JUMPS=0
> +CONFIG_SLAB=y
>  # CONFIG_TINY_SHMEM is not set
>  CONFIG_BASE_SMALL=0
> +# CONFIG_SLOB is not set
>  
>  #
>  # Loadable module support
> @@ -112,13 +117,12 @@ CONFIG_PPC_PMAC=y
>  CONFIG_PPC_PMAC64=y
>  # CONFIG_PPC_MAPLE is not set
>  # CONFIG_PPC_CELL is not set
> -CONFIG_PPC_OF=y
>  CONFIG_U3_DART=y
>  CONFIG_MPIC=y
>  # CONFIG_PPC_RTAS is not set
>  # CONFIG_MMIO_NVRAM is not set
> +CONFIG_MPIC_BROKEN_U3=y
>  # CONFIG_PPC_MPC106 is not set
> -CONFIG_GENERIC_TBSYNC=y
>  CONFIG_CPU_FREQ=y
>  CONFIG_CPU_FREQ_TABLE=y
>  # CONFIG_CPU_FREQ_DEBUG is not set
> @@ -151,6 +155,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
>  CONFIG_IOMMU_VMERGE=y
>  # CONFIG_HOTPLUG_CPU is not set
>  CONFIG_KEXEC=y
> +# CONFIG_CRASH_DUMP is not set
>  CONFIG_IRQ_ALL_CPUS=y
>  # CONFIG_NUMA is not set
>  CONFIG_ARCH_SELECT_MEMORY_MODEL=y
> @@ -202,6 +207,7 @@ CONFIG_NET=y
>  #
>  # Networking options
>  #
> +# CONFIG_NETDEBUG is not set
>  CONFIG_PACKET=y
>  # CONFIG_PACKET_MMAP is not set
>  CONFIG_UNIX=y
> @@ -239,6 +245,7 @@ CONFIG_NETFILTER=y
>  # Core Netfilter Configuration
>  #
>  # CONFIG_NETFILTER_NETLINK is not set
> +# CONFIG_NETFILTER_XTABLES is not set
>  
>  #
>  # IP: Netfilter Configuration
> @@ -255,65 +262,6 @@ CONFIG_IP_NF_TFTP=m
>  CONFIG_IP_NF_AMANDA=m
>  # CONFIG_IP_NF_PPTP is not set
>  CONFIG_IP_NF_QUEUE=m
> -CONFIG_IP_NF_IPTABLES=m
> -CONFIG_IP_NF_MATCH_LIMIT=m
> -CONFIG_IP_NF_MATCH_IPRANGE=m
> -CONFIG_IP_NF_MATCH_MAC=m
> -CONFIG_IP_NF_MATCH_PKTTYPE=m
> -CONFIG_IP_NF_MATCH_MARK=m
> -CONFIG_IP_NF_MATCH_MULTIPORT=m
> -CONFIG_IP_NF_MATCH_TOS=m
> -CONFIG_IP_NF_MATCH_RECENT=m
> -CONFIG_IP_NF_MATCH_ECN=m
> -CONFIG_IP_NF_MATCH_DSCP=m
> -CONFIG_IP_NF_MATCH_AH_ESP=m
> -CONFIG_IP_NF_MATCH_LENGTH=m
> -CONFIG_IP_NF_MATCH_TTL=m
> -CONFIG_IP_NF_MATCH_TCPMSS=m
> -CONFIG_IP_NF_MATCH_HELPER=m
> -CONFIG_IP_NF_MATCH_STATE=m
> -CONFIG_IP_NF_MATCH_CONNTRACK=m
> -CONFIG_IP_NF_MATCH_OWNER=m
> -CONFIG_IP_NF_MATCH_ADDRTYPE=m
> -CONFIG_IP_NF_MATCH_REALM=m
> -CONFIG_IP_NF_MATCH_SCTP=m
> -# CONFIG_IP_NF_MATCH_DCCP is not set
> -CONFIG_IP_NF_MATCH_COMMENT=m
> -CONFIG_IP_NF_MATCH_CONNMARK=m
> -CONFIG_IP_NF_MATCH_CONNBYTES=m
> -CONFIG_IP_NF_MATCH_HASHLIMIT=m
> -CONFIG_IP_NF_MATCH_STRING=m
> -CONFIG_IP_NF_FILTER=m
> -CONFIG_IP_NF_TARGET_REJECT=m
> -CONFIG_IP_NF_TARGET_LOG=m
> -CONFIG_IP_NF_TARGET_ULOG=m
> -CONFIG_IP_NF_TARGET_TCPMSS=m
> -CONFIG_IP_NF_TARGET_NFQUEUE=m
> -CONFIG_IP_NF_NAT=m
> -CONFIG_IP_NF_NAT_NEEDED=y
> -CONFIG_IP_NF_TARGET_MASQUERADE=m
> -CONFIG_IP_NF_TARGET_REDIRECT=m
> -CONFIG_IP_NF_TARGET_NETMAP=m
> -CONFIG_IP_NF_TARGET_SAME=m
> -CONFIG_IP_NF_NAT_SNMP_BASIC=m
> -CONFIG_IP_NF_NAT_IRC=m
> -CONFIG_IP_NF_NAT_FTP=m
> -CONFIG_IP_NF_NAT_TFTP=m
> -CONFIG_IP_NF_NAT_AMANDA=m
> -CONFIG_IP_NF_MANGLE=m
> -CONFIG_IP_NF_TARGET_TOS=m
> -CONFIG_IP_NF_TARGET_ECN=m
> -CONFIG_IP_NF_TARGET_DSCP=m
> -CONFIG_IP_NF_TARGET_MARK=m
> -CONFIG_IP_NF_TARGET_CLASSIFY=m
> -CONFIG_IP_NF_TARGET_TTL=m
> -CONFIG_IP_NF_TARGET_CONNMARK=m
> -CONFIG_IP_NF_TARGET_CLUSTERIP=m
> -CONFIG_IP_NF_RAW=m
> -CONFIG_IP_NF_TARGET_NOTRACK=m
> -CONFIG_IP_NF_ARPTABLES=m
> -CONFIG_IP_NF_ARPFILTER=m
> -CONFIG_IP_NF_ARP_MANGLE=m
>  
>  #
>  # DCCP Configuration (EXPERIMENTAL)
> @@ -324,6 +272,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
>  # SCTP Configuration (EXPERIMENTAL)
>  #
>  # CONFIG_IP_SCTP is not set
> +
> +#
> +# TIPC Configuration (EXPERIMENTAL)
> +#
> +# CONFIG_TIPC is not set
>  # CONFIG_ATM is not set
>  # CONFIG_BRIDGE is not set
>  # CONFIG_VLAN_8021Q is not set
> @@ -342,7 +295,6 @@ CONFIG_LLC=y
>  # QoS and/or fair queueing
>  #
>  # CONFIG_NET_SCHED is not set
> -CONFIG_NET_CLS_ROUTE=y
>  
>  #
>  # Network testing
> @@ -545,13 +497,7 @@ CONFIG_SCSI_SATA_SVW=y
>  # CONFIG_SCSI_IPR is not set
>  # CONFIG_SCSI_QLOGIC_FC is not set
>  # CONFIG_SCSI_QLOGIC_1280 is not set
> -CONFIG_SCSI_QLA2XXX=y
> -# CONFIG_SCSI_QLA21XX is not set
> -# CONFIG_SCSI_QLA22XX is not set
> -# CONFIG_SCSI_QLA2300 is not set
> -# CONFIG_SCSI_QLA2322 is not set
> -# CONFIG_SCSI_QLA6312 is not set
> -# CONFIG_SCSI_QLA24XX is not set
> +# CONFIG_SCSI_QLA_FC is not set
>  # CONFIG_SCSI_LPFC is not set
>  # CONFIG_SCSI_DC395x is not set
>  # CONFIG_SCSI_DC390T is not set
> @@ -614,7 +560,6 @@ CONFIG_IEEE1394_SBP2=m
>  CONFIG_IEEE1394_ETH1394=m
>  CONFIG_IEEE1394_DV1394=m
>  CONFIG_IEEE1394_RAWIO=y
> -# CONFIG_IEEE1394_CMP is not set
>  
>  #
>  # I2O device support
> @@ -630,6 +575,7 @@ CONFIG_THERM_PM72=y
>  CONFIG_WINDFARM=y
>  CONFIG_WINDFARM_PM81=y
>  CONFIG_WINDFARM_PM91=y
> +CONFIG_WINDFARM_PM112=y
>  
>  #
>  # Network device support
> @@ -682,6 +628,7 @@ CONFIG_E1000=y
>  # CONFIG_R8169 is not set
>  # CONFIG_SIS190 is not set
>  # CONFIG_SKGE is not set
> +# CONFIG_SKY2 is not set
>  # CONFIG_SK98LIN is not set
>  CONFIG_TIGON3=m
>  # CONFIG_BNX2 is not set
> @@ -861,8 +808,7 @@ CONFIG_I2C_ALGOBIT=y
>  # CONFIG_I2C_I801 is not set
>  # CONFIG_I2C_I810 is not set
>  # CONFIG_I2C_PIIX4 is not set
> -CONFIG_I2C_KEYWEST=y
> -CONFIG_I2C_PMAC_SMU=y
> +CONFIG_I2C_POWERMAC=y
>  # CONFIG_I2C_NFORCE2 is not set
>  # CONFIG_I2C_PARPORT_LIGHT is not set
>  # CONFIG_I2C_PROSAVAGE is not set
> @@ -895,6 +841,12 @@ CONFIG_I2C_PMAC_SMU=y
>  # CONFIG_I2C_DEBUG_CHIP is not set
>  
>  #
> +# SPI support
> +#
> +# CONFIG_SPI is not set
> +# CONFIG_SPI_MASTER is not set
> +
> +#
>  # Dallas's 1-wire bus
>  #
>  # CONFIG_W1 is not set
> @@ -961,7 +913,6 @@ CONFIG_FB_RADEON_I2C=y
>  # CONFIG_FB_KYRO is not set
>  # CONFIG_FB_3DFX is not set
>  # CONFIG_FB_VOODOO1 is not set
> -# CONFIG_FB_CYBLA is not set
>  # CONFIG_FB_TRIDENT is not set
>  # CONFIG_FB_VIRTUAL is not set
>  
> @@ -1008,9 +959,10 @@ CONFIG_SND_OSSEMUL=y
>  CONFIG_SND_MIXER_OSS=m
>  CONFIG_SND_PCM_OSS=m
>  CONFIG_SND_SEQUENCER_OSS=y
> +# CONFIG_SND_DYNAMIC_MINORS is not set
> +CONFIG_SND_SUPPORT_OLD_API=y
>  # CONFIG_SND_VERBOSE_PRINTK is not set
>  # CONFIG_SND_DEBUG is not set
> -CONFIG_SND_GENERIC_DRIVER=y
>  
>  #
>  # Generic devices
> @@ -1024,6 +976,8 @@ CONFIG_SND_GENERIC_DRIVER=y
>  #
>  # PCI devices
>  #
> +# CONFIG_SND_AD1889 is not set
> +# CONFIG_SND_ALS4000 is not set
>  # CONFIG_SND_ALI5451 is not set
>  # CONFIG_SND_ATIIXP is not set
>  # CONFIG_SND_ATIIXP_MODEM is not set
> @@ -1032,39 +986,38 @@ CONFIG_SND_GENERIC_DRIVER=y
>  # CONFIG_SND_AU8830 is not set
>  # CONFIG_SND_AZT3328 is not set
>  # CONFIG_SND_BT87X is not set
> -# CONFIG_SND_CS46XX is not set
> +# CONFIG_SND_CA0106 is not set
> +# CONFIG_SND_CMIPCI is not set
>  # CONFIG_SND_CS4281 is not set
> +# CONFIG_SND_CS46XX is not set
>  # CONFIG_SND_EMU10K1 is not set
>  # CONFIG_SND_EMU10K1X is not set
> -# CONFIG_SND_CA0106 is not set
> -# CONFIG_SND_KORG1212 is not set
> -# CONFIG_SND_MIXART is not set
> -# CONFIG_SND_NM256 is not set
> -# CONFIG_SND_RME32 is not set
> -# CONFIG_SND_RME96 is not set
> -# CONFIG_SND_RME9652 is not set
> -# CONFIG_SND_HDSP is not set
> -# CONFIG_SND_HDSPM is not set
> -# CONFIG_SND_TRIDENT is not set
> -# CONFIG_SND_YMFPCI is not set
> -# CONFIG_SND_AD1889 is not set
> -# CONFIG_SND_ALS4000 is not set
> -# CONFIG_SND_CMIPCI is not set
>  # CONFIG_SND_ENS1370 is not set
>  # CONFIG_SND_ENS1371 is not set
>  # CONFIG_SND_ES1938 is not set
>  # CONFIG_SND_ES1968 is not set
> -# CONFIG_SND_MAESTRO3 is not set
>  # CONFIG_SND_FM801 is not set
> +# CONFIG_SND_HDA_INTEL is not set
> +# CONFIG_SND_HDSP is not set
> +# CONFIG_SND_HDSPM is not set
>  # CONFIG_SND_ICE1712 is not set
>  # CONFIG_SND_ICE1724 is not set
>  # CONFIG_SND_INTEL8X0 is not set
>  # CONFIG_SND_INTEL8X0M is not set
> +# CONFIG_SND_KORG1212 is not set
> +# CONFIG_SND_MAESTRO3 is not set
> +# CONFIG_SND_MIXART is not set
> +# CONFIG_SND_NM256 is not set
> +# CONFIG_SND_PCXHR is not set
> +# CONFIG_SND_RME32 is not set
> +# CONFIG_SND_RME96 is not set
> +# CONFIG_SND_RME9652 is not set
>  # CONFIG_SND_SONICVIBES is not set
> +# CONFIG_SND_TRIDENT is not set
>  # CONFIG_SND_VIA82XX is not set
>  # CONFIG_SND_VIA82XX_MODEM is not set
>  # CONFIG_SND_VX222 is not set
> -# CONFIG_SND_HDA_INTEL is not set
> +# CONFIG_SND_YMFPCI is not set
>  
>  #
>  # ALSA PowerMac devices
> @@ -1136,13 +1089,16 @@ CONFIG_USB_STORAGE_DPCM=y
>  CONFIG_USB_STORAGE_SDDR09=y
>  CONFIG_USB_STORAGE_SDDR55=y
>  CONFIG_USB_STORAGE_JUMPSHOT=y
> +# CONFIG_USB_STORAGE_ALAUDA is not set
>  # CONFIG_USB_STORAGE_ONETOUCH is not set
> +# CONFIG_USB_LIBUSUAL is not set
>  
>  #
>  # USB Input Devices
>  #
>  CONFIG_USB_HID=y
>  CONFIG_USB_HIDINPUT=y
> +# CONFIG_USB_HIDINPUT_POWERBOOK is not set
>  CONFIG_HID_FF=y
>  CONFIG_HID_PID=y
>  CONFIG_LOGITECH_FF=y
> @@ -1159,6 +1115,7 @@ CONFIG_USB_HIDDEV=y
>  # CONFIG_USB_YEALINK is not set
>  # CONFIG_USB_XPAD is not set
>  # CONFIG_USB_ATI_REMOTE is not set
> +# CONFIG_USB_ATI_REMOTE2 is not set
>  # CONFIG_USB_KEYSPAN_REMOTE is not set
>  # CONFIG_USB_APPLETOUCH is not set
>  
> @@ -1207,6 +1164,7 @@ CONFIG_USB_SERIAL_GENERIC=y
>  # CONFIG_USB_SERIAL_AIRPRIME is not set
>  # CONFIG_USB_SERIAL_ANYDATA is not set
>  CONFIG_USB_SERIAL_BELKIN=m
> +# CONFIG_USB_SERIAL_WHITEHEAT is not set
>  CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
>  # CONFIG_USB_SERIAL_CP2101 is not set
>  CONFIG_USB_SERIAL_CYPRESS_M8=m
> @@ -1288,6 +1246,10 @@ CONFIG_USB_EZUSB=y
>  #
>  
>  #
> +# EDAC - error detection and reporting (RAS)
> +#
> +
> +#
>  # File systems
>  #
>  CONFIG_EXT2_FS=y
> @@ -1317,6 +1279,7 @@ CONFIG_XFS_EXPORT=y
>  CONFIG_XFS_SECURITY=y
>  CONFIG_XFS_POSIX_ACL=y
>  # CONFIG_XFS_RT is not set
> +# CONFIG_OCFS2_FS is not set
>  # CONFIG_MINIX_FS is not set
>  # CONFIG_ROMFS_FS is not set
>  CONFIG_INOTIFY=y
> @@ -1357,6 +1320,7 @@ CONFIG_HUGETLBFS=y
>  CONFIG_HUGETLB_PAGE=y
>  CONFIG_RAMFS=y
>  # CONFIG_RELAYFS_FS is not set
> +# CONFIG_CONFIGFS_FS is not set
>  
>  #
>  # Miscellaneous filesystems
> @@ -1426,6 +1390,7 @@ CONFIG_MSDOS_PARTITION=y
>  # CONFIG_SGI_PARTITION is not set
>  # CONFIG_ULTRIX_PARTITION is not set
>  # CONFIG_SUN_PARTITION is not set
> +# CONFIG_KARMA_PARTITION is not set
>  # CONFIG_EFI_PARTITION is not set
>  
>  #
> @@ -1481,10 +1446,6 @@ CONFIG_CRC32=y
>  CONFIG_LIBCRC32C=m
>  CONFIG_ZLIB_INFLATE=y
>  CONFIG_ZLIB_DEFLATE=m
> -CONFIG_TEXTSEARCH=y
> -CONFIG_TEXTSEARCH_KMP=m
> -CONFIG_TEXTSEARCH_BM=m
> -CONFIG_TEXTSEARCH_FSM=m
>  
>  #
>  # Instrumentation Support
> @@ -1497,24 +1458,31 @@ CONFIG_OPROFILE=y
>  # Kernel hacking
>  #
>  # CONFIG_PRINTK_TIME is not set
> -CONFIG_DEBUG_KERNEL=y
>  CONFIG_MAGIC_SYSRQ=y
> +CONFIG_DEBUG_KERNEL=y
>  CONFIG_LOG_BUF_SHIFT=17
>  CONFIG_DETECT_SOFTLOCKUP=y
>  # CONFIG_SCHEDSTATS is not set
>  # CONFIG_DEBUG_SLAB is not set
> +CONFIG_DEBUG_MUTEXES=y
>  # CONFIG_DEBUG_SPINLOCK is not set
>  # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
>  # CONFIG_DEBUG_KOBJECT is not set
>  # CONFIG_DEBUG_INFO is not set
>  CONFIG_DEBUG_FS=y
>  # CONFIG_DEBUG_VM is not set
> +CONFIG_FORCED_INLINING=y
>  # CONFIG_RCU_TORTURE_TEST is not set
>  # CONFIG_DEBUG_STACKOVERFLOW is not set
>  # CONFIG_DEBUG_STACK_USAGE is not set
>  # CONFIG_DEBUGGER is not set
>  CONFIG_IRQSTACKS=y
>  CONFIG_BOOTX_TEXT=y
> +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
> +# CONFIG_PPC_EARLY_DEBUG_G5 is not set
> +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
> +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
> +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
>  
>  #
>  # Security options
> Index: powerpc-git/arch/powerpc/configs/ppc64_defconfig
> ===================================================================
> --- powerpc-git.orig/arch/powerpc/configs/ppc64_defconfig
> +++ powerpc-git/arch/powerpc/configs/ppc64_defconfig
> @@ -1,7 +1,7 @@
>  #
>  # Automatically generated make config: don't edit
> -# Linux kernel version: 2.6.15-rc5
> -# Tue Dec 20 15:59:38 2005
> +# Linux kernel version: 2.6.16-rc2
> +# Fri Feb 10 17:32:14 2006
>  #
>  CONFIG_PPC64=y
>  CONFIG_64BIT=y
> @@ -16,6 +16,10 @@ CONFIG_COMPAT=y
>  CONFIG_SYSVIPC_COMPAT=y
>  CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
>  CONFIG_ARCH_MAY_HAVE_PC_FDC=y
> +CONFIG_PPC_OF=y
> +CONFIG_PPC_UDBG_16550=y
> +CONFIG_GENERIC_TBSYNC=y
> +# CONFIG_DEFAULT_UIMAGE is not set
>  
>  #
>  # Processor support
> @@ -33,7 +37,6 @@ CONFIG_NR_CPUS=32
>  # Code maturity level options
>  #
>  CONFIG_EXPERIMENTAL=y
> -CONFIG_CLEAN_COMPILE=y
>  CONFIG_LOCK_KERNEL=y
>  CONFIG_INIT_ENV_ARG_LIMIT=32
>  
> @@ -48,8 +51,6 @@ CONFIG_POSIX_MQUEUE=y
>  # CONFIG_BSD_PROCESS_ACCT is not set
>  CONFIG_SYSCTL=y
>  # CONFIG_AUDIT is not set
> -CONFIG_HOTPLUG=y
> -CONFIG_KOBJECT_UEVENT=y
>  CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_CPUSETS=y
> @@ -59,8 +60,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
>  CONFIG_KALLSYMS=y
>  CONFIG_KALLSYMS_ALL=y
>  # CONFIG_KALLSYMS_EXTRA_PASS is not set
> +CONFIG_HOTPLUG=y
>  CONFIG_PRINTK=y
>  CONFIG_BUG=y
> +CONFIG_ELF_CORE=y
>  CONFIG_BASE_FULL=y
>  CONFIG_FUTEX=y
>  CONFIG_EPOLL=y
> @@ -69,8 +72,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
>  CONFIG_CC_ALIGN_LABELS=0
>  CONFIG_CC_ALIGN_LOOPS=0
>  CONFIG_CC_ALIGN_JUMPS=0
> +CONFIG_SLAB=y
>  # CONFIG_TINY_SHMEM is not set
>  CONFIG_BASE_SMALL=0
> +# CONFIG_SLOB is not set
>  
>  #
>  # Loadable module support
> @@ -113,7 +118,6 @@ CONFIG_PPC_PMAC=y
>  CONFIG_PPC_PMAC64=y
>  CONFIG_PPC_MAPLE=y
>  # CONFIG_PPC_CELL is not set
> -CONFIG_PPC_OF=y
>  CONFIG_XICS=y
>  CONFIG_U3_DART=y
>  CONFIG_MPIC=y
> @@ -124,8 +128,8 @@ CONFIG_RTAS_FLASH=m
>  # CONFIG_MMIO_NVRAM is not set
>  CONFIG_MPIC_BROKEN_U3=y
>  CONFIG_IBMVIO=y
> +# CONFIG_IBMEBUS is not set
>  # CONFIG_PPC_MPC106 is not set
> -CONFIG_GENERIC_TBSYNC=y
>  CONFIG_CPU_FREQ=y
>  CONFIG_CPU_FREQ_TABLE=y
>  # CONFIG_CPU_FREQ_DEBUG is not set
> @@ -158,6 +162,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
>  CONFIG_IOMMU_VMERGE=y
>  CONFIG_HOTPLUG_CPU=y
>  CONFIG_KEXEC=y
> +# CONFIG_CRASH_DUMP is not set
>  CONFIG_IRQ_ALL_CPUS=y
>  CONFIG_PPC_SPLPAR=y
>  CONFIG_EEH=y
> @@ -178,6 +183,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y
>  CONFIG_SPARSEMEM_EXTREME=y
>  # CONFIG_MEMORY_HOTPLUG is not set
>  CONFIG_SPLIT_PTLOCK_CPUS=4
> +CONFIG_MIGRATION=y
>  # CONFIG_PPC_64K_PAGES is not set
>  # CONFIG_SCHED_SMT is not set
>  CONFIG_PROC_DEVICETREE=y
> @@ -221,6 +227,7 @@ CONFIG_NET=y
>  #
>  # Networking options
>  #
> +# CONFIG_NETDEBUG is not set
>  CONFIG_PACKET=y
>  # CONFIG_PACKET_MMAP is not set
>  CONFIG_UNIX=y
> @@ -260,6 +267,7 @@ CONFIG_NETFILTER=y
>  CONFIG_NETFILTER_NETLINK=y
>  CONFIG_NETFILTER_NETLINK_QUEUE=m
>  CONFIG_NETFILTER_NETLINK_LOG=m
> +# CONFIG_NETFILTER_XTABLES is not set
>  
>  #
>  # IP: Netfilter Configuration
> @@ -277,65 +285,6 @@ CONFIG_IP_NF_TFTP=m
>  CONFIG_IP_NF_AMANDA=m
>  # CONFIG_IP_NF_PPTP is not set
>  CONFIG_IP_NF_QUEUE=m
> -CONFIG_IP_NF_IPTABLES=m
> -CONFIG_IP_NF_MATCH_LIMIT=m
> -CONFIG_IP_NF_MATCH_IPRANGE=m
> -CONFIG_IP_NF_MATCH_MAC=m
> -CONFIG_IP_NF_MATCH_PKTTYPE=m
> -CONFIG_IP_NF_MATCH_MARK=m
> -CONFIG_IP_NF_MATCH_MULTIPORT=m
> -CONFIG_IP_NF_MATCH_TOS=m
> -CONFIG_IP_NF_MATCH_RECENT=m
> -CONFIG_IP_NF_MATCH_ECN=m
> -CONFIG_IP_NF_MATCH_DSCP=m
> -CONFIG_IP_NF_MATCH_AH_ESP=m
> -CONFIG_IP_NF_MATCH_LENGTH=m
> -CONFIG_IP_NF_MATCH_TTL=m
> -CONFIG_IP_NF_MATCH_TCPMSS=m
> -CONFIG_IP_NF_MATCH_HELPER=m
> -CONFIG_IP_NF_MATCH_STATE=m
> -CONFIG_IP_NF_MATCH_CONNTRACK=m
> -CONFIG_IP_NF_MATCH_OWNER=m
> -CONFIG_IP_NF_MATCH_ADDRTYPE=m
> -CONFIG_IP_NF_MATCH_REALM=m
> -CONFIG_IP_NF_MATCH_SCTP=m
> -CONFIG_IP_NF_MATCH_DCCP=m
> -CONFIG_IP_NF_MATCH_COMMENT=m
> -CONFIG_IP_NF_MATCH_CONNMARK=m
> -CONFIG_IP_NF_MATCH_CONNBYTES=m
> -CONFIG_IP_NF_MATCH_HASHLIMIT=m
> -CONFIG_IP_NF_MATCH_STRING=m
> -CONFIG_IP_NF_FILTER=m
> -CONFIG_IP_NF_TARGET_REJECT=m
> -CONFIG_IP_NF_TARGET_LOG=m
> -CONFIG_IP_NF_TARGET_ULOG=m
> -CONFIG_IP_NF_TARGET_TCPMSS=m
> -CONFIG_IP_NF_TARGET_NFQUEUE=m
> -CONFIG_IP_NF_NAT=m
> -CONFIG_IP_NF_NAT_NEEDED=y
> -CONFIG_IP_NF_TARGET_MASQUERADE=m
> -CONFIG_IP_NF_TARGET_REDIRECT=m
> -CONFIG_IP_NF_TARGET_NETMAP=m
> -CONFIG_IP_NF_TARGET_SAME=m
> -CONFIG_IP_NF_NAT_SNMP_BASIC=m
> -CONFIG_IP_NF_NAT_IRC=m
> -CONFIG_IP_NF_NAT_FTP=m
> -CONFIG_IP_NF_NAT_TFTP=m
> -CONFIG_IP_NF_NAT_AMANDA=m
> -CONFIG_IP_NF_MANGLE=m
> -CONFIG_IP_NF_TARGET_TOS=m
> -CONFIG_IP_NF_TARGET_ECN=m
> -CONFIG_IP_NF_TARGET_DSCP=m
> -CONFIG_IP_NF_TARGET_MARK=m
> -CONFIG_IP_NF_TARGET_CLASSIFY=m
> -CONFIG_IP_NF_TARGET_TTL=m
> -CONFIG_IP_NF_TARGET_CONNMARK=m
> -CONFIG_IP_NF_TARGET_CLUSTERIP=m
> -CONFIG_IP_NF_RAW=m
> -CONFIG_IP_NF_TARGET_NOTRACK=m
> -CONFIG_IP_NF_ARPTABLES=m
> -CONFIG_IP_NF_ARPFILTER=m
> -CONFIG_IP_NF_ARP_MANGLE=m
>  
>  #
>  # DCCP Configuration (EXPERIMENTAL)
> @@ -346,6 +295,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
>  # SCTP Configuration (EXPERIMENTAL)
>  #
>  # CONFIG_IP_SCTP is not set
> +
> +#
> +# TIPC Configuration (EXPERIMENTAL)
> +#
> +# CONFIG_TIPC is not set
>  # CONFIG_ATM is not set
>  # CONFIG_BRIDGE is not set
>  # CONFIG_VLAN_8021Q is not set
> @@ -364,7 +318,6 @@ CONFIG_LLC=y
>  # QoS and/or fair queueing
>  #
>  # CONFIG_NET_SCHED is not set
> -CONFIG_NET_CLS_ROUTE=y
>  
>  #
>  # Network testing
> @@ -572,13 +525,7 @@ CONFIG_SCSI_IPR_TRACE=y
>  CONFIG_SCSI_IPR_DUMP=y
>  # CONFIG_SCSI_QLOGIC_FC is not set
>  # CONFIG_SCSI_QLOGIC_1280 is not set
> -CONFIG_SCSI_QLA2XXX=y
> -CONFIG_SCSI_QLA21XX=m
> -CONFIG_SCSI_QLA22XX=m
> -CONFIG_SCSI_QLA2300=m
> -CONFIG_SCSI_QLA2322=m
> -CONFIG_SCSI_QLA6312=m
> -CONFIG_SCSI_QLA24XX=m
> +# CONFIG_SCSI_QLA_FC is not set
>  CONFIG_SCSI_LPFC=m
>  # CONFIG_SCSI_DC395x is not set
>  # CONFIG_SCSI_DC390T is not set
> @@ -642,8 +589,6 @@ CONFIG_IEEE1394_SBP2=m
>  CONFIG_IEEE1394_ETH1394=m
>  CONFIG_IEEE1394_DV1394=m
>  CONFIG_IEEE1394_RAWIO=y
> -CONFIG_IEEE1394_CMP=m
> -CONFIG_IEEE1394_AMDTP=m
>  
>  #
>  # I2O device support
> @@ -659,6 +604,7 @@ CONFIG_THERM_PM72=y
>  CONFIG_WINDFARM=y
>  CONFIG_WINDFARM_PM81=y
>  CONFIG_WINDFARM_PM91=y
> +CONFIG_WINDFARM_PM112=y
>  
>  #
>  # Network device support
> @@ -731,6 +677,7 @@ CONFIG_E1000=y
>  # CONFIG_R8169 is not set
>  # CONFIG_SIS190 is not set
>  # CONFIG_SKGE is not set
> +# CONFIG_SKY2 is not set
>  # CONFIG_SK98LIN is not set
>  # CONFIG_VIA_VELOCITY is not set
>  CONFIG_TIGON3=y
> @@ -853,6 +800,7 @@ CONFIG_HW_CONSOLE=y
>  CONFIG_SERIAL_8250=y
>  CONFIG_SERIAL_8250_CONSOLE=y
>  CONFIG_SERIAL_8250_NR_UARTS=4
> +CONFIG_SERIAL_8250_RUNTIME_UARTS=4
>  # CONFIG_SERIAL_8250_EXTENDED is not set
>  
>  #
> @@ -880,6 +828,7 @@ CONFIG_HVCS=m
>  # CONFIG_WATCHDOG is not set
>  # CONFIG_RTC is not set
>  CONFIG_GEN_RTC=y
> +# CONFIG_GEN_RTC_X is not set
>  # CONFIG_DTLK is not set
>  # CONFIG_R3964 is not set
>  # CONFIG_APPLICOM is not set
> @@ -923,8 +872,7 @@ CONFIG_I2C_AMD8111=y
>  # CONFIG_I2C_I801 is not set
>  # CONFIG_I2C_I810 is not set
>  # CONFIG_I2C_PIIX4 is not set
> -CONFIG_I2C_KEYWEST=y
> -CONFIG_I2C_PMAC_SMU=y
> +CONFIG_I2C_POWERMAC=y
>  # CONFIG_I2C_NFORCE2 is not set
>  # CONFIG_I2C_PARPORT_LIGHT is not set
>  # CONFIG_I2C_PROSAVAGE is not set
> @@ -957,6 +905,12 @@ CONFIG_I2C_PMAC_SMU=y
>  # CONFIG_I2C_DEBUG_CHIP is not set
>  
>  #
> +# SPI support
> +#
> +# CONFIG_SPI is not set
> +# CONFIG_SPI_MASTER is not set
> +
> +#
>  # Dallas's 1-wire bus
>  #
>  # CONFIG_W1 is not set
> @@ -1028,7 +982,6 @@ CONFIG_FB_RADEON_I2C=y
>  # CONFIG_FB_KYRO is not set
>  # CONFIG_FB_3DFX is not set
>  # CONFIG_FB_VOODOO1 is not set
> -# CONFIG_FB_CYBLA is not set
>  # CONFIG_FB_TRIDENT is not set
>  # CONFIG_FB_VIRTUAL is not set
>  
> @@ -1073,9 +1026,10 @@ CONFIG_SND_OSSEMUL=y
>  CONFIG_SND_MIXER_OSS=m
>  CONFIG_SND_PCM_OSS=m
>  CONFIG_SND_SEQUENCER_OSS=y
> +# CONFIG_SND_DYNAMIC_MINORS is not set
> +CONFIG_SND_SUPPORT_OLD_API=y
>  # CONFIG_SND_VERBOSE_PRINTK is not set
>  # CONFIG_SND_DEBUG is not set
> -CONFIG_SND_GENERIC_DRIVER=y
>  
>  #
>  # Generic devices
> @@ -1089,6 +1043,8 @@ CONFIG_SND_GENERIC_DRIVER=y
>  #
>  # PCI devices
>  #
> +# CONFIG_SND_AD1889 is not set
> +# CONFIG_SND_ALS4000 is not set
>  # CONFIG_SND_ALI5451 is not set
>  # CONFIG_SND_ATIIXP is not set
>  # CONFIG_SND_ATIIXP_MODEM is not set
> @@ -1097,39 +1053,38 @@ CONFIG_SND_GENERIC_DRIVER=y
>  # CONFIG_SND_AU8830 is not set
>  # CONFIG_SND_AZT3328 is not set
>  # CONFIG_SND_BT87X is not set
> -# CONFIG_SND_CS46XX is not set
> +# CONFIG_SND_CA0106 is not set
> +# CONFIG_SND_CMIPCI is not set
>  # CONFIG_SND_CS4281 is not set
> +# CONFIG_SND_CS46XX is not set
>  # CONFIG_SND_EMU10K1 is not set
>  # CONFIG_SND_EMU10K1X is not set
> -# CONFIG_SND_CA0106 is not set
> -# CONFIG_SND_KORG1212 is not set
> -# CONFIG_SND_MIXART is not set
> -# CONFIG_SND_NM256 is not set
> -# CONFIG_SND_RME32 is not set
> -# CONFIG_SND_RME96 is not set
> -# CONFIG_SND_RME9652 is not set
> -# CONFIG_SND_HDSP is not set
> -# CONFIG_SND_HDSPM is not set
> -# CONFIG_SND_TRIDENT is not set
> -# CONFIG_SND_YMFPCI is not set
> -# CONFIG_SND_AD1889 is not set
> -# CONFIG_SND_ALS4000 is not set
> -# CONFIG_SND_CMIPCI is not set
>  # CONFIG_SND_ENS1370 is not set
>  # CONFIG_SND_ENS1371 is not set
>  # CONFIG_SND_ES1938 is not set
>  # CONFIG_SND_ES1968 is not set
> -# CONFIG_SND_MAESTRO3 is not set
>  # CONFIG_SND_FM801 is not set
> +# CONFIG_SND_HDA_INTEL is not set
> +# CONFIG_SND_HDSP is not set
> +# CONFIG_SND_HDSPM is not set
>  # CONFIG_SND_ICE1712 is not set
>  # CONFIG_SND_ICE1724 is not set
>  # CONFIG_SND_INTEL8X0 is not set
>  # CONFIG_SND_INTEL8X0M is not set
> +# CONFIG_SND_KORG1212 is not set
> +# CONFIG_SND_MAESTRO3 is not set
> +# CONFIG_SND_MIXART is not set
> +# CONFIG_SND_NM256 is not set
> +# CONFIG_SND_PCXHR is not set
> +# CONFIG_SND_RME32 is not set
> +# CONFIG_SND_RME96 is not set
> +# CONFIG_SND_RME9652 is not set
>  # CONFIG_SND_SONICVIBES is not set
> +# CONFIG_SND_TRIDENT is not set
>  # CONFIG_SND_VIA82XX is not set
>  # CONFIG_SND_VIA82XX_MODEM is not set
>  # CONFIG_SND_VX222 is not set
> -# CONFIG_SND_HDA_INTEL is not set
> +# CONFIG_SND_YMFPCI is not set
>  
>  #
>  # ALSA PowerMac devices
> @@ -1201,13 +1156,16 @@ CONFIG_USB_STORAGE=m
>  # CONFIG_USB_STORAGE_SDDR09 is not set
>  # CONFIG_USB_STORAGE_SDDR55 is not set
>  # CONFIG_USB_STORAGE_JUMPSHOT is not set
> +# CONFIG_USB_STORAGE_ALAUDA is not set
>  # CONFIG_USB_STORAGE_ONETOUCH is not set
> +# CONFIG_USB_LIBUSUAL is not set
>  
>  #
>  # USB Input Devices
>  #
>  CONFIG_USB_HID=y
>  CONFIG_USB_HIDINPUT=y
> +# CONFIG_USB_HIDINPUT_POWERBOOK is not set
>  # CONFIG_HID_FF is not set
>  CONFIG_USB_HIDDEV=y
>  # CONFIG_USB_AIPTEK is not set
> @@ -1221,6 +1179,7 @@ CONFIG_USB_HIDDEV=y
>  # CONFIG_USB_YEALINK is not set
>  # CONFIG_USB_XPAD is not set
>  # CONFIG_USB_ATI_REMOTE is not set
> +# CONFIG_USB_ATI_REMOTE2 is not set
>  # CONFIG_USB_KEYSPAN_REMOTE is not set
>  # CONFIG_USB_APPLETOUCH is not set
>  
> @@ -1307,6 +1266,10 @@ CONFIG_INFINIBAND_IPOIB=m
>  #
>  
>  #
> +# EDAC - error detection and reporting (RAS)
> +#
> +
> +#
>  # File systems
>  #
>  CONFIG_EXT2_FS=y
> @@ -1340,6 +1303,7 @@ CONFIG_XFS_EXPORT=y
>  CONFIG_XFS_SECURITY=y
>  CONFIG_XFS_POSIX_ACL=y
>  # CONFIG_XFS_RT is not set
> +# CONFIG_OCFS2_FS is not set
>  # CONFIG_MINIX_FS is not set
>  # CONFIG_ROMFS_FS is not set
>  CONFIG_INOTIFY=y
> @@ -1379,6 +1343,7 @@ CONFIG_HUGETLBFS=y
>  CONFIG_HUGETLB_PAGE=y
>  CONFIG_RAMFS=y
>  # CONFIG_RELAYFS_FS is not set
> +# CONFIG_CONFIGFS_FS is not set
>  
>  #
>  # Miscellaneous filesystems
> @@ -1449,6 +1414,7 @@ CONFIG_MSDOS_PARTITION=y
>  # CONFIG_SGI_PARTITION is not set
>  # CONFIG_ULTRIX_PARTITION is not set
>  # CONFIG_SUN_PARTITION is not set
> +# CONFIG_KARMA_PARTITION is not set
>  # CONFIG_EFI_PARTITION is not set
>  
>  #
> @@ -1504,10 +1470,6 @@ CONFIG_CRC32=y
>  CONFIG_LIBCRC32C=m
>  CONFIG_ZLIB_INFLATE=y
>  CONFIG_ZLIB_DEFLATE=m
> -CONFIG_TEXTSEARCH=y
> -CONFIG_TEXTSEARCH_KMP=m
> -CONFIG_TEXTSEARCH_BM=m
> -CONFIG_TEXTSEARCH_FSM=m
>  
>  #
>  # Instrumentation Support
> @@ -1520,18 +1482,20 @@ CONFIG_OPROFILE=y
>  # Kernel hacking
>  #
>  # CONFIG_PRINTK_TIME is not set
> -CONFIG_DEBUG_KERNEL=y
>  CONFIG_MAGIC_SYSRQ=y
> +CONFIG_DEBUG_KERNEL=y
>  CONFIG_LOG_BUF_SHIFT=17
>  CONFIG_DETECT_SOFTLOCKUP=y
>  # CONFIG_SCHEDSTATS is not set
>  # CONFIG_DEBUG_SLAB is not set
> +CONFIG_DEBUG_MUTEXES=y
>  # CONFIG_DEBUG_SPINLOCK is not set
>  # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
>  # CONFIG_DEBUG_KOBJECT is not set
>  # CONFIG_DEBUG_INFO is not set
>  CONFIG_DEBUG_FS=y
>  # CONFIG_DEBUG_VM is not set
> +CONFIG_FORCED_INLINING=y
>  # CONFIG_RCU_TORTURE_TEST is not set
>  CONFIG_DEBUG_STACKOVERFLOW=y
>  CONFIG_DEBUG_STACK_USAGE=y
> @@ -1540,6 +1504,11 @@ CONFIG_XMON=y
>  # CONFIG_XMON_DEFAULT is not set
>  CONFIG_IRQSTACKS=y
>  CONFIG_BOOTX_TEXT=y
> +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
> +# CONFIG_PPC_EARLY_DEBUG_G5 is not set
> +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
> +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
> +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
>  
>  #
>  # Security options
> Index: powerpc-git/arch/powerpc/configs/pseries_defconfig
> ===================================================================
> --- powerpc-git.orig/arch/powerpc/configs/pseries_defconfig
> +++ powerpc-git/arch/powerpc/configs/pseries_defconfig
> @@ -1,7 +1,7 @@
>  #
>  # Automatically generated make config: don't edit
> -# Linux kernel version: 2.6.15-rc5
> -# Tue Dec 20 15:59:40 2005
> +# Linux kernel version: 2.6.16-rc2
> +# Fri Feb 10 17:33:32 2006
>  #
>  CONFIG_PPC64=y
>  CONFIG_64BIT=y
> @@ -16,6 +16,10 @@ CONFIG_COMPAT=y
>  CONFIG_SYSVIPC_COMPAT=y
>  CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
>  CONFIG_ARCH_MAY_HAVE_PC_FDC=y
> +CONFIG_PPC_OF=y
> +CONFIG_PPC_UDBG_16550=y
> +# CONFIG_GENERIC_TBSYNC is not set
> +# CONFIG_DEFAULT_UIMAGE is not set
>  
>  #
>  # Processor support
> @@ -33,7 +37,6 @@ CONFIG_NR_CPUS=128
>  # Code maturity level options
>  #
>  CONFIG_EXPERIMENTAL=y
> -CONFIG_CLEAN_COMPILE=y
>  CONFIG_LOCK_KERNEL=y
>  CONFIG_INIT_ENV_ARG_LIMIT=32
>  
> @@ -49,8 +52,6 @@ CONFIG_POSIX_MQUEUE=y
>  CONFIG_SYSCTL=y
>  CONFIG_AUDIT=y
>  CONFIG_AUDITSYSCALL=y
> -CONFIG_HOTPLUG=y
> -CONFIG_KOBJECT_UEVENT=y
>  CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_CPUSETS=y
> @@ -60,8 +61,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
>  CONFIG_KALLSYMS=y
>  CONFIG_KALLSYMS_ALL=y
>  # CONFIG_KALLSYMS_EXTRA_PASS is not set
> +CONFIG_HOTPLUG=y
>  CONFIG_PRINTK=y
>  CONFIG_BUG=y
> +CONFIG_ELF_CORE=y
>  CONFIG_BASE_FULL=y
>  CONFIG_FUTEX=y
>  CONFIG_EPOLL=y
> @@ -70,8 +73,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
>  CONFIG_CC_ALIGN_LABELS=0
>  CONFIG_CC_ALIGN_LOOPS=0
>  CONFIG_CC_ALIGN_JUMPS=0
> +CONFIG_SLAB=y
>  # CONFIG_TINY_SHMEM is not set
>  CONFIG_BASE_SMALL=0
> +# CONFIG_SLOB is not set
>  
>  #
>  # Loadable module support
> @@ -113,7 +118,6 @@ CONFIG_PPC_PSERIES=y
>  # CONFIG_PPC_PMAC is not set
>  # CONFIG_PPC_MAPLE is not set
>  # CONFIG_PPC_CELL is not set
> -CONFIG_PPC_OF=y
>  CONFIG_XICS=y
>  # CONFIG_U3_DART is not set
>  CONFIG_MPIC=y
> @@ -123,8 +127,8 @@ CONFIG_RTAS_PROC=y
>  CONFIG_RTAS_FLASH=m
>  # CONFIG_MMIO_NVRAM is not set
>  CONFIG_IBMVIO=y
> +# CONFIG_IBMEBUS is not set
>  # CONFIG_PPC_MPC106 is not set
> -# CONFIG_GENERIC_TBSYNC is not set
>  # CONFIG_CPU_FREQ is not set
>  # CONFIG_WANT_EARLY_SERIAL is not set
>  
> @@ -145,6 +149,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
>  CONFIG_IOMMU_VMERGE=y
>  CONFIG_HOTPLUG_CPU=y
>  CONFIG_KEXEC=y
> +# CONFIG_CRASH_DUMP is not set
>  CONFIG_IRQ_ALL_CPUS=y
>  CONFIG_PPC_SPLPAR=y
>  CONFIG_EEH=y
> @@ -165,6 +170,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y
>  CONFIG_SPARSEMEM_EXTREME=y
>  # CONFIG_MEMORY_HOTPLUG is not set
>  CONFIG_SPLIT_PTLOCK_CPUS=4
> +CONFIG_MIGRATION=y
>  CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y
>  # CONFIG_PPC_64K_PAGES is not set
>  CONFIG_SCHED_SMT=y
> @@ -209,6 +215,7 @@ CONFIG_NET=y
>  #
>  # Networking options
>  #
> +# CONFIG_NETDEBUG is not set
>  CONFIG_PACKET=y
>  # CONFIG_PACKET_MMAP is not set
>  CONFIG_UNIX=y
> @@ -248,6 +255,7 @@ CONFIG_NETFILTER=y
>  CONFIG_NETFILTER_NETLINK=y
>  CONFIG_NETFILTER_NETLINK_QUEUE=m
>  CONFIG_NETFILTER_NETLINK_LOG=m
> +# CONFIG_NETFILTER_XTABLES is not set
>  
>  #
>  # IP: Netfilter Configuration
> @@ -265,65 +273,6 @@ CONFIG_IP_NF_TFTP=m
>  CONFIG_IP_NF_AMANDA=m
>  # CONFIG_IP_NF_PPTP is not set
>  CONFIG_IP_NF_QUEUE=m
> -CONFIG_IP_NF_IPTABLES=m
> -CONFIG_IP_NF_MATCH_LIMIT=m
> -CONFIG_IP_NF_MATCH_IPRANGE=m
> -CONFIG_IP_NF_MATCH_MAC=m
> -CONFIG_IP_NF_MATCH_PKTTYPE=m
> -CONFIG_IP_NF_MATCH_MARK=m
> -CONFIG_IP_NF_MATCH_MULTIPORT=m
> -CONFIG_IP_NF_MATCH_TOS=m
> -CONFIG_IP_NF_MATCH_RECENT=m
> -CONFIG_IP_NF_MATCH_ECN=m
> -CONFIG_IP_NF_MATCH_DSCP=m
> -CONFIG_IP_NF_MATCH_AH_ESP=m
> -CONFIG_IP_NF_MATCH_LENGTH=m
> -CONFIG_IP_NF_MATCH_TTL=m
> -CONFIG_IP_NF_MATCH_TCPMSS=m
> -CONFIG_IP_NF_MATCH_HELPER=m
> -CONFIG_IP_NF_MATCH_STATE=m
> -CONFIG_IP_NF_MATCH_CONNTRACK=m
> -CONFIG_IP_NF_MATCH_OWNER=m
> -CONFIG_IP_NF_MATCH_ADDRTYPE=m
> -CONFIG_IP_NF_MATCH_REALM=m
> -CONFIG_IP_NF_MATCH_SCTP=m
> -# CONFIG_IP_NF_MATCH_DCCP is not set
> -CONFIG_IP_NF_MATCH_COMMENT=m
> -CONFIG_IP_NF_MATCH_CONNMARK=m
> -CONFIG_IP_NF_MATCH_CONNBYTES=m
> -CONFIG_IP_NF_MATCH_HASHLIMIT=m
> -CONFIG_IP_NF_MATCH_STRING=m
> -CONFIG_IP_NF_FILTER=m
> -CONFIG_IP_NF_TARGET_REJECT=m
> -CONFIG_IP_NF_TARGET_LOG=m
> -CONFIG_IP_NF_TARGET_ULOG=m
> -CONFIG_IP_NF_TARGET_TCPMSS=m
> -CONFIG_IP_NF_TARGET_NFQUEUE=m
> -CONFIG_IP_NF_NAT=m
> -CONFIG_IP_NF_NAT_NEEDED=y
> -CONFIG_IP_NF_TARGET_MASQUERADE=m
> -CONFIG_IP_NF_TARGET_REDIRECT=m
> -CONFIG_IP_NF_TARGET_NETMAP=m
> -CONFIG_IP_NF_TARGET_SAME=m
> -CONFIG_IP_NF_NAT_SNMP_BASIC=m
> -CONFIG_IP_NF_NAT_IRC=m
> -CONFIG_IP_NF_NAT_FTP=m
> -CONFIG_IP_NF_NAT_TFTP=m
> -CONFIG_IP_NF_NAT_AMANDA=m
> -CONFIG_IP_NF_MANGLE=m
> -CONFIG_IP_NF_TARGET_TOS=m
> -CONFIG_IP_NF_TARGET_ECN=m
> -CONFIG_IP_NF_TARGET_DSCP=m
> -CONFIG_IP_NF_TARGET_MARK=m
> -CONFIG_IP_NF_TARGET_CLASSIFY=m
> -CONFIG_IP_NF_TARGET_TTL=m
> -CONFIG_IP_NF_TARGET_CONNMARK=m
> -CONFIG_IP_NF_TARGET_CLUSTERIP=m
> -CONFIG_IP_NF_RAW=m
> -CONFIG_IP_NF_TARGET_NOTRACK=m
> -CONFIG_IP_NF_ARPTABLES=m
> -CONFIG_IP_NF_ARPFILTER=m
> -CONFIG_IP_NF_ARP_MANGLE=m
>  
>  #
>  # DCCP Configuration (EXPERIMENTAL)
> @@ -334,6 +283,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
>  # SCTP Configuration (EXPERIMENTAL)
>  #
>  # CONFIG_IP_SCTP is not set
> +
> +#
> +# TIPC Configuration (EXPERIMENTAL)
> +#
> +# CONFIG_TIPC is not set
>  # CONFIG_ATM is not set
>  # CONFIG_BRIDGE is not set
>  # CONFIG_VLAN_8021Q is not set
> @@ -352,7 +306,6 @@ CONFIG_LLC=y
>  # QoS and/or fair queueing
>  #
>  # CONFIG_NET_SCHED is not set
> -CONFIG_NET_CLS_ROUTE=y
>  
>  #
>  # Network testing
> @@ -550,13 +503,7 @@ CONFIG_SCSI_IPR_TRACE=y
>  CONFIG_SCSI_IPR_DUMP=y
>  # CONFIG_SCSI_QLOGIC_FC is not set
>  # CONFIG_SCSI_QLOGIC_1280 is not set
> -CONFIG_SCSI_QLA2XXX=y
> -CONFIG_SCSI_QLA21XX=m
> -CONFIG_SCSI_QLA22XX=m
> -CONFIG_SCSI_QLA2300=m
> -CONFIG_SCSI_QLA2322=m
> -CONFIG_SCSI_QLA6312=m
> -CONFIG_SCSI_QLA24XX=m
> +# CONFIG_SCSI_QLA_FC is not set
>  CONFIG_SCSI_LPFC=m
>  # CONFIG_SCSI_DC395x is not set
>  # CONFIG_SCSI_DC390T is not set
> @@ -678,6 +625,7 @@ CONFIG_E1000=y
>  # CONFIG_R8169 is not set
>  # CONFIG_SIS190 is not set
>  # CONFIG_SKGE is not set
> +# CONFIG_SKY2 is not set
>  # CONFIG_SK98LIN is not set
>  # CONFIG_VIA_VELOCITY is not set
>  CONFIG_TIGON3=y
> @@ -803,6 +751,7 @@ CONFIG_HW_CONSOLE=y
>  CONFIG_SERIAL_8250=y
>  CONFIG_SERIAL_8250_CONSOLE=y
>  CONFIG_SERIAL_8250_NR_UARTS=4
> +CONFIG_SERIAL_8250_RUNTIME_UARTS=4
>  # CONFIG_SERIAL_8250_EXTENDED is not set
>  
>  #
> @@ -909,6 +858,12 @@ CONFIG_I2C_ALGOBIT=y
>  # CONFIG_I2C_DEBUG_CHIP is not set
>  
>  #
> +# SPI support
> +#
> +# CONFIG_SPI is not set
> +# CONFIG_SPI_MASTER is not set
> +
> +#
>  # Dallas's 1-wire bus
>  #
>  # CONFIG_W1 is not set
> @@ -976,7 +931,6 @@ CONFIG_FB_RADEON_I2C=y
>  # CONFIG_FB_KYRO is not set
>  # CONFIG_FB_3DFX is not set
>  # CONFIG_FB_VOODOO1 is not set
> -# CONFIG_FB_CYBLA is not set
>  # CONFIG_FB_TRIDENT is not set
>  # CONFIG_FB_VIRTUAL is not set
>  
> @@ -1061,12 +1015,15 @@ CONFIG_USB_STORAGE=y
>  # CONFIG_USB_STORAGE_SDDR09 is not set
>  # CONFIG_USB_STORAGE_SDDR55 is not set
>  # CONFIG_USB_STORAGE_JUMPSHOT is not set
> +# CONFIG_USB_STORAGE_ALAUDA is not set
> +# CONFIG_USB_LIBUSUAL is not set
>  
>  #
>  # USB Input Devices
>  #
>  CONFIG_USB_HID=y
>  CONFIG_USB_HIDINPUT=y
> +# CONFIG_USB_HIDINPUT_POWERBOOK is not set
>  # CONFIG_HID_FF is not set
>  CONFIG_USB_HIDDEV=y
>  # CONFIG_USB_AIPTEK is not set
> @@ -1080,6 +1037,7 @@ CONFIG_USB_HIDDEV=y
>  # CONFIG_USB_YEALINK is not set
>  # CONFIG_USB_XPAD is not set
>  # CONFIG_USB_ATI_REMOTE is not set
> +# CONFIG_USB_ATI_REMOTE2 is not set
>  # CONFIG_USB_KEYSPAN_REMOTE is not set
>  # CONFIG_USB_APPLETOUCH is not set
>  
> @@ -1167,6 +1125,10 @@ CONFIG_INFINIBAND_IPOIB=m
>  #
>  
>  #
> +# EDAC - error detection and reporting (RAS)
> +#
> +
> +#
>  # File systems
>  #
>  CONFIG_EXT2_FS=y
> @@ -1200,6 +1162,7 @@ CONFIG_XFS_EXPORT=y
>  CONFIG_XFS_SECURITY=y
>  CONFIG_XFS_POSIX_ACL=y
>  # CONFIG_XFS_RT is not set
> +# CONFIG_OCFS2_FS is not set
>  # CONFIG_MINIX_FS is not set
>  # CONFIG_ROMFS_FS is not set
>  CONFIG_INOTIFY=y
> @@ -1240,6 +1203,7 @@ CONFIG_HUGETLBFS=y
>  CONFIG_HUGETLB_PAGE=y
>  CONFIG_RAMFS=y
>  # CONFIG_RELAYFS_FS is not set
> +# CONFIG_CONFIGFS_FS is not set
>  
>  #
>  # Miscellaneous filesystems
> @@ -1351,10 +1315,6 @@ CONFIG_CRC32=y
>  CONFIG_LIBCRC32C=m
>  CONFIG_ZLIB_INFLATE=y
>  CONFIG_ZLIB_DEFLATE=m
> -CONFIG_TEXTSEARCH=y
> -CONFIG_TEXTSEARCH_KMP=m
> -CONFIG_TEXTSEARCH_BM=m
> -CONFIG_TEXTSEARCH_FSM=m
>  
>  #
>  # Instrumentation Support
> @@ -1367,18 +1327,20 @@ CONFIG_OPROFILE=y
>  # Kernel hacking
>  #
>  # CONFIG_PRINTK_TIME is not set
> -CONFIG_DEBUG_KERNEL=y
>  CONFIG_MAGIC_SYSRQ=y
> +CONFIG_DEBUG_KERNEL=y
>  CONFIG_LOG_BUF_SHIFT=17
>  CONFIG_DETECT_SOFTLOCKUP=y
>  # CONFIG_SCHEDSTATS is not set
>  # CONFIG_DEBUG_SLAB is not set
> +CONFIG_DEBUG_MUTEXES=y
>  # CONFIG_DEBUG_SPINLOCK is not set
>  # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
>  # CONFIG_DEBUG_KOBJECT is not set
>  # CONFIG_DEBUG_INFO is not set
>  CONFIG_DEBUG_FS=y
>  # CONFIG_DEBUG_VM is not set
> +CONFIG_FORCED_INLINING=y
>  # CONFIG_RCU_TORTURE_TEST is not set
>  CONFIG_DEBUG_STACKOVERFLOW=y
>  CONFIG_DEBUG_STACK_USAGE=y
> @@ -1387,6 +1349,11 @@ CONFIG_XMON=y
>  CONFIG_XMON_DEFAULT=y
>  CONFIG_IRQSTACKS=y
>  # CONFIG_BOOTX_TEXT is not set
> +# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
> +# CONFIG_PPC_EARLY_DEBUG_G5 is not set
> +# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
> +# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
> +# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
>  
>  #
>  # Security options


From olof at lixom.net  Mon Feb 13 10:30:31 2006
From: olof at lixom.net (Olof Johansson)
Date: Sun, 12 Feb 2006 17:30:31 -0600
Subject: [PATCH] Update {g5,pseries,ppc64}_defconfig
In-Reply-To: <1139782304.5247.42.camel@localhost.localdomain>
References: <20060210234903.GB4795@pb15.lixom.net>
	<1139782304.5247.42.camel@localhost.localdomain>
Message-ID: <20060212233030.GA12445@pb15.lixom.net>

On Mon, Feb 13, 2006 at 09:11:44AM +1100, Benjamin Herrenschmidt wrote:

> You probably also want to make tg3 built-in...

Good point. New patch below.

---


Update the defconfigs for g5, pseries and generic ppc64. All new options
take their default values, with the following exceptions:

 * Enable CONFIG_WINDFARM_PM112 on g5 and ppc64.
 * Increase CONFIG_NR_CPUS to 4 in g5_defconfig.
 * Set CONFIG_TIGON3=y instead of =m in g5_defconfig.
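
For anyone who wants to confirm that a refreshed defconfig really carries the
exceptions listed above, here is a minimal sketch of such a check. It is not
part of the patch: the helper name parse_defconfig and the sample text are
made up for illustration, and the sample only reuses option values that appear
in the g5_defconfig hunks of this patch.

```python
def parse_defconfig(text):
    """Map option name -> value ('y', 'm', '4', ...), or None for 'is not set'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO: None
            name = line[len("# "):-len(" is not set")]
            options[name] = None
        elif line.startswith("CONFIG_") and "=" in line:
            # "CONFIG_FOO=y" -> CONFIG_FOO: "y"
            name, value = line.split("=", 1)
            options[name] = value
    return options


# Hypothetical excerpt of an updated g5_defconfig (values taken from the patch).
sample = """\
CONFIG_NR_CPUS=4
CONFIG_WINDFARM_PM112=y
CONFIG_TIGON3=y
# CONFIG_SKY2 is not set
"""

# The three exceptions named in the patch description, as they apply to g5.
expected = {
    "CONFIG_NR_CPUS": "4",
    "CONFIG_WINDFARM_PM112": "y",
    "CONFIG_TIGON3": "y",
}

opts = parse_defconfig(sample)
mismatches = {k: opts.get(k) for k, v in expected.items() if opts.get(k) != v}
print(mismatches or "all listed exceptions present")
```

In practice the sample string would be replaced by the contents of the real
file, for example arch/powerpc/configs/g5_defconfig read from a checked-out
tree.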


Signed-off-by: Olof Johansson <olof at lixom.net>


Index: powerpc-git/arch/powerpc/configs/g5_defconfig
===================================================================
--- powerpc-git.orig/arch/powerpc/configs/g5_defconfig
+++ powerpc-git/arch/powerpc/configs/g5_defconfig
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 2.6.15-rc5
-# Tue Dec 20 15:59:30 2005
+# Linux kernel version: 2.6.16-rc2
+# Fri Feb 10 17:33:08 2006
 #
 CONFIG_PPC64=y
 CONFIG_64BIT=y
@@ -16,6 +16,10 @@ CONFIG_COMPAT=y
 CONFIG_SYSVIPC_COMPAT=y
 CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
 CONFIG_ARCH_MAY_HAVE_PC_FDC=y
+CONFIG_PPC_OF=y
+# CONFIG_PPC_UDBG_16550 is not set
+CONFIG_GENERIC_TBSYNC=y
+# CONFIG_DEFAULT_UIMAGE is not set
 
 #
 # Processor support
@@ -26,13 +30,12 @@ CONFIG_PPC_FPU=y
 CONFIG_ALTIVEC=y
 CONFIG_PPC_STD_MMU=y
 CONFIG_SMP=y
-CONFIG_NR_CPUS=2
+CONFIG_NR_CPUS=4
 
 #
 # Code maturity level options
 #
 CONFIG_EXPERIMENTAL=y
-CONFIG_CLEAN_COMPILE=y
 CONFIG_LOCK_KERNEL=y
 CONFIG_INIT_ENV_ARG_LIMIT=32
 
@@ -47,8 +50,6 @@ CONFIG_POSIX_MQUEUE=y
 # CONFIG_BSD_PROCESS_ACCT is not set
 CONFIG_SYSCTL=y
 # CONFIG_AUDIT is not set
-CONFIG_HOTPLUG=y
-CONFIG_KOBJECT_UEVENT=y
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 # CONFIG_CPUSETS is not set
@@ -58,8 +59,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
 CONFIG_KALLSYMS=y
 # CONFIG_KALLSYMS_ALL is not set
 # CONFIG_KALLSYMS_EXTRA_PASS is not set
+CONFIG_HOTPLUG=y
 CONFIG_PRINTK=y
 CONFIG_BUG=y
+CONFIG_ELF_CORE=y
 CONFIG_BASE_FULL=y
 CONFIG_FUTEX=y
 CONFIG_EPOLL=y
@@ -68,8 +71,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
 CONFIG_CC_ALIGN_LABELS=0
 CONFIG_CC_ALIGN_LOOPS=0
 CONFIG_CC_ALIGN_JUMPS=0
+CONFIG_SLAB=y
 # CONFIG_TINY_SHMEM is not set
 CONFIG_BASE_SMALL=0
+# CONFIG_SLOB is not set
 
 #
 # Loadable module support
@@ -112,13 +117,12 @@ CONFIG_PPC_PMAC=y
 CONFIG_PPC_PMAC64=y
 # CONFIG_PPC_MAPLE is not set
 # CONFIG_PPC_CELL is not set
-CONFIG_PPC_OF=y
 CONFIG_U3_DART=y
 CONFIG_MPIC=y
 # CONFIG_PPC_RTAS is not set
 # CONFIG_MMIO_NVRAM is not set
+CONFIG_MPIC_BROKEN_U3=y
 # CONFIG_PPC_MPC106 is not set
-CONFIG_GENERIC_TBSYNC=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_TABLE=y
 # CONFIG_CPU_FREQ_DEBUG is not set
@@ -151,6 +155,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
 CONFIG_IOMMU_VMERGE=y
 # CONFIG_HOTPLUG_CPU is not set
 CONFIG_KEXEC=y
+# CONFIG_CRASH_DUMP is not set
 CONFIG_IRQ_ALL_CPUS=y
 # CONFIG_NUMA is not set
 CONFIG_ARCH_SELECT_MEMORY_MODEL=y
@@ -202,6 +207,7 @@ CONFIG_NET=y
 #
 # Networking options
 #
+# CONFIG_NETDEBUG is not set
 CONFIG_PACKET=y
 # CONFIG_PACKET_MMAP is not set
 CONFIG_UNIX=y
@@ -239,6 +245,7 @@ CONFIG_NETFILTER=y
 # Core Netfilter Configuration
 #
 # CONFIG_NETFILTER_NETLINK is not set
+# CONFIG_NETFILTER_XTABLES is not set
 
 #
 # IP: Netfilter Configuration
@@ -255,65 +262,6 @@ CONFIG_IP_NF_TFTP=m
 CONFIG_IP_NF_AMANDA=m
 # CONFIG_IP_NF_PPTP is not set
 CONFIG_IP_NF_QUEUE=m
-CONFIG_IP_NF_IPTABLES=m
-CONFIG_IP_NF_MATCH_LIMIT=m
-CONFIG_IP_NF_MATCH_IPRANGE=m
-CONFIG_IP_NF_MATCH_MAC=m
-CONFIG_IP_NF_MATCH_PKTTYPE=m
-CONFIG_IP_NF_MATCH_MARK=m
-CONFIG_IP_NF_MATCH_MULTIPORT=m
-CONFIG_IP_NF_MATCH_TOS=m
-CONFIG_IP_NF_MATCH_RECENT=m
-CONFIG_IP_NF_MATCH_ECN=m
-CONFIG_IP_NF_MATCH_DSCP=m
-CONFIG_IP_NF_MATCH_AH_ESP=m
-CONFIG_IP_NF_MATCH_LENGTH=m
-CONFIG_IP_NF_MATCH_TTL=m
-CONFIG_IP_NF_MATCH_TCPMSS=m
-CONFIG_IP_NF_MATCH_HELPER=m
-CONFIG_IP_NF_MATCH_STATE=m
-CONFIG_IP_NF_MATCH_CONNTRACK=m
-CONFIG_IP_NF_MATCH_OWNER=m
-CONFIG_IP_NF_MATCH_ADDRTYPE=m
-CONFIG_IP_NF_MATCH_REALM=m
-CONFIG_IP_NF_MATCH_SCTP=m
-# CONFIG_IP_NF_MATCH_DCCP is not set
-CONFIG_IP_NF_MATCH_COMMENT=m
-CONFIG_IP_NF_MATCH_CONNMARK=m
-CONFIG_IP_NF_MATCH_CONNBYTES=m
-CONFIG_IP_NF_MATCH_HASHLIMIT=m
-CONFIG_IP_NF_MATCH_STRING=m
-CONFIG_IP_NF_FILTER=m
-CONFIG_IP_NF_TARGET_REJECT=m
-CONFIG_IP_NF_TARGET_LOG=m
-CONFIG_IP_NF_TARGET_ULOG=m
-CONFIG_IP_NF_TARGET_TCPMSS=m
-CONFIG_IP_NF_TARGET_NFQUEUE=m
-CONFIG_IP_NF_NAT=m
-CONFIG_IP_NF_NAT_NEEDED=y
-CONFIG_IP_NF_TARGET_MASQUERADE=m
-CONFIG_IP_NF_TARGET_REDIRECT=m
-CONFIG_IP_NF_TARGET_NETMAP=m
-CONFIG_IP_NF_TARGET_SAME=m
-CONFIG_IP_NF_NAT_SNMP_BASIC=m
-CONFIG_IP_NF_NAT_IRC=m
-CONFIG_IP_NF_NAT_FTP=m
-CONFIG_IP_NF_NAT_TFTP=m
-CONFIG_IP_NF_NAT_AMANDA=m
-CONFIG_IP_NF_MANGLE=m
-CONFIG_IP_NF_TARGET_TOS=m
-CONFIG_IP_NF_TARGET_ECN=m
-CONFIG_IP_NF_TARGET_DSCP=m
-CONFIG_IP_NF_TARGET_MARK=m
-CONFIG_IP_NF_TARGET_CLASSIFY=m
-CONFIG_IP_NF_TARGET_TTL=m
-CONFIG_IP_NF_TARGET_CONNMARK=m
-CONFIG_IP_NF_TARGET_CLUSTERIP=m
-CONFIG_IP_NF_RAW=m
-CONFIG_IP_NF_TARGET_NOTRACK=m
-CONFIG_IP_NF_ARPTABLES=m
-CONFIG_IP_NF_ARPFILTER=m
-CONFIG_IP_NF_ARP_MANGLE=m
 
 #
 # DCCP Configuration (EXPERIMENTAL)
@@ -324,6 +272,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
 # SCTP Configuration (EXPERIMENTAL)
 #
 # CONFIG_IP_SCTP is not set
+
+#
+# TIPC Configuration (EXPERIMENTAL)
+#
+# CONFIG_TIPC is not set
 # CONFIG_ATM is not set
 # CONFIG_BRIDGE is not set
 # CONFIG_VLAN_8021Q is not set
@@ -342,7 +295,6 @@ CONFIG_LLC=y
 # QoS and/or fair queueing
 #
 # CONFIG_NET_SCHED is not set
-CONFIG_NET_CLS_ROUTE=y
 
 #
 # Network testing
@@ -545,13 +497,7 @@ CONFIG_SCSI_SATA_SVW=y
 # CONFIG_SCSI_IPR is not set
 # CONFIG_SCSI_QLOGIC_FC is not set
 # CONFIG_SCSI_QLOGIC_1280 is not set
-CONFIG_SCSI_QLA2XXX=y
-# CONFIG_SCSI_QLA21XX is not set
-# CONFIG_SCSI_QLA22XX is not set
-# CONFIG_SCSI_QLA2300 is not set
-# CONFIG_SCSI_QLA2322 is not set
-# CONFIG_SCSI_QLA6312 is not set
-# CONFIG_SCSI_QLA24XX is not set
+# CONFIG_SCSI_QLA_FC is not set
 # CONFIG_SCSI_LPFC is not set
 # CONFIG_SCSI_DC395x is not set
 # CONFIG_SCSI_DC390T is not set
@@ -614,7 +560,6 @@ CONFIG_IEEE1394_SBP2=m
 CONFIG_IEEE1394_ETH1394=m
 CONFIG_IEEE1394_DV1394=m
 CONFIG_IEEE1394_RAWIO=y
-# CONFIG_IEEE1394_CMP is not set
 
 #
 # I2O device support
@@ -630,6 +575,7 @@ CONFIG_THERM_PM72=y
 CONFIG_WINDFARM=y
 CONFIG_WINDFARM_PM81=y
 CONFIG_WINDFARM_PM91=y
+CONFIG_WINDFARM_PM112=y
 
 #
 # Network device support
@@ -682,8 +628,9 @@ CONFIG_E1000=y
 # CONFIG_R8169 is not set
 # CONFIG_SIS190 is not set
 # CONFIG_SKGE is not set
+# CONFIG_SKY2 is not set
 # CONFIG_SK98LIN is not set
-CONFIG_TIGON3=m
+CONFIG_TIGON3=y
 # CONFIG_BNX2 is not set
 # CONFIG_MV643XX_ETH is not set
 
@@ -861,8 +808,7 @@ CONFIG_I2C_ALGOBIT=y
 # CONFIG_I2C_I801 is not set
 # CONFIG_I2C_I810 is not set
 # CONFIG_I2C_PIIX4 is not set
-CONFIG_I2C_KEYWEST=y
-CONFIG_I2C_PMAC_SMU=y
+CONFIG_I2C_POWERMAC=y
 # CONFIG_I2C_NFORCE2 is not set
 # CONFIG_I2C_PARPORT_LIGHT is not set
 # CONFIG_I2C_PROSAVAGE is not set
@@ -895,6 +841,12 @@ CONFIG_I2C_PMAC_SMU=y
 # CONFIG_I2C_DEBUG_CHIP is not set
 
 #
+# SPI support
+#
+# CONFIG_SPI is not set
+# CONFIG_SPI_MASTER is not set
+
+#
 # Dallas's 1-wire bus
 #
 # CONFIG_W1 is not set
@@ -961,7 +913,6 @@ CONFIG_FB_RADEON_I2C=y
 # CONFIG_FB_KYRO is not set
 # CONFIG_FB_3DFX is not set
 # CONFIG_FB_VOODOO1 is not set
-# CONFIG_FB_CYBLA is not set
 # CONFIG_FB_TRIDENT is not set
 # CONFIG_FB_VIRTUAL is not set
 
@@ -1008,9 +959,10 @@ CONFIG_SND_OSSEMUL=y
 CONFIG_SND_MIXER_OSS=m
 CONFIG_SND_PCM_OSS=m
 CONFIG_SND_SEQUENCER_OSS=y
+# CONFIG_SND_DYNAMIC_MINORS is not set
+CONFIG_SND_SUPPORT_OLD_API=y
 # CONFIG_SND_VERBOSE_PRINTK is not set
 # CONFIG_SND_DEBUG is not set
-CONFIG_SND_GENERIC_DRIVER=y
 
 #
 # Generic devices
@@ -1024,6 +976,8 @@ CONFIG_SND_GENERIC_DRIVER=y
 #
 # PCI devices
 #
+# CONFIG_SND_AD1889 is not set
+# CONFIG_SND_ALS4000 is not set
 # CONFIG_SND_ALI5451 is not set
 # CONFIG_SND_ATIIXP is not set
 # CONFIG_SND_ATIIXP_MODEM is not set
@@ -1032,39 +986,38 @@ CONFIG_SND_GENERIC_DRIVER=y
 # CONFIG_SND_AU8830 is not set
 # CONFIG_SND_AZT3328 is not set
 # CONFIG_SND_BT87X is not set
-# CONFIG_SND_CS46XX is not set
+# CONFIG_SND_CA0106 is not set
+# CONFIG_SND_CMIPCI is not set
 # CONFIG_SND_CS4281 is not set
+# CONFIG_SND_CS46XX is not set
 # CONFIG_SND_EMU10K1 is not set
 # CONFIG_SND_EMU10K1X is not set
-# CONFIG_SND_CA0106 is not set
-# CONFIG_SND_KORG1212 is not set
-# CONFIG_SND_MIXART is not set
-# CONFIG_SND_NM256 is not set
-# CONFIG_SND_RME32 is not set
-# CONFIG_SND_RME96 is not set
-# CONFIG_SND_RME9652 is not set
-# CONFIG_SND_HDSP is not set
-# CONFIG_SND_HDSPM is not set
-# CONFIG_SND_TRIDENT is not set
-# CONFIG_SND_YMFPCI is not set
-# CONFIG_SND_AD1889 is not set
-# CONFIG_SND_ALS4000 is not set
-# CONFIG_SND_CMIPCI is not set
 # CONFIG_SND_ENS1370 is not set
 # CONFIG_SND_ENS1371 is not set
 # CONFIG_SND_ES1938 is not set
 # CONFIG_SND_ES1968 is not set
-# CONFIG_SND_MAESTRO3 is not set
 # CONFIG_SND_FM801 is not set
+# CONFIG_SND_HDA_INTEL is not set
+# CONFIG_SND_HDSP is not set
+# CONFIG_SND_HDSPM is not set
 # CONFIG_SND_ICE1712 is not set
 # CONFIG_SND_ICE1724 is not set
 # CONFIG_SND_INTEL8X0 is not set
 # CONFIG_SND_INTEL8X0M is not set
+# CONFIG_SND_KORG1212 is not set
+# CONFIG_SND_MAESTRO3 is not set
+# CONFIG_SND_MIXART is not set
+# CONFIG_SND_NM256 is not set
+# CONFIG_SND_PCXHR is not set
+# CONFIG_SND_RME32 is not set
+# CONFIG_SND_RME96 is not set
+# CONFIG_SND_RME9652 is not set
 # CONFIG_SND_SONICVIBES is not set
+# CONFIG_SND_TRIDENT is not set
 # CONFIG_SND_VIA82XX is not set
 # CONFIG_SND_VIA82XX_MODEM is not set
 # CONFIG_SND_VX222 is not set
-# CONFIG_SND_HDA_INTEL is not set
+# CONFIG_SND_YMFPCI is not set
 
 #
 # ALSA PowerMac devices
@@ -1136,13 +1089,16 @@ CONFIG_USB_STORAGE_DPCM=y
 CONFIG_USB_STORAGE_SDDR09=y
 CONFIG_USB_STORAGE_SDDR55=y
 CONFIG_USB_STORAGE_JUMPSHOT=y
+# CONFIG_USB_STORAGE_ALAUDA is not set
 # CONFIG_USB_STORAGE_ONETOUCH is not set
+# CONFIG_USB_LIBUSUAL is not set
 
 #
 # USB Input Devices
 #
 CONFIG_USB_HID=y
 CONFIG_USB_HIDINPUT=y
+# CONFIG_USB_HIDINPUT_POWERBOOK is not set
 CONFIG_HID_FF=y
 CONFIG_HID_PID=y
 CONFIG_LOGITECH_FF=y
@@ -1159,6 +1115,7 @@ CONFIG_USB_HIDDEV=y
 # CONFIG_USB_YEALINK is not set
 # CONFIG_USB_XPAD is not set
 # CONFIG_USB_ATI_REMOTE is not set
+# CONFIG_USB_ATI_REMOTE2 is not set
 # CONFIG_USB_KEYSPAN_REMOTE is not set
 # CONFIG_USB_APPLETOUCH is not set
 
@@ -1207,6 +1164,7 @@ CONFIG_USB_SERIAL_GENERIC=y
 # CONFIG_USB_SERIAL_AIRPRIME is not set
 # CONFIG_USB_SERIAL_ANYDATA is not set
 CONFIG_USB_SERIAL_BELKIN=m
+# CONFIG_USB_SERIAL_WHITEHEAT is not set
 CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
 # CONFIG_USB_SERIAL_CP2101 is not set
 CONFIG_USB_SERIAL_CYPRESS_M8=m
@@ -1288,6 +1246,10 @@ CONFIG_USB_EZUSB=y
 #
 
 #
+# EDAC - error detection and reporting (RAS)
+#
+
+#
 # File systems
 #
 CONFIG_EXT2_FS=y
@@ -1317,6 +1279,7 @@ CONFIG_XFS_EXPORT=y
 CONFIG_XFS_SECURITY=y
 CONFIG_XFS_POSIX_ACL=y
 # CONFIG_XFS_RT is not set
+# CONFIG_OCFS2_FS is not set
 # CONFIG_MINIX_FS is not set
 # CONFIG_ROMFS_FS is not set
 CONFIG_INOTIFY=y
@@ -1357,6 +1320,7 @@ CONFIG_HUGETLBFS=y
 CONFIG_HUGETLB_PAGE=y
 CONFIG_RAMFS=y
 # CONFIG_RELAYFS_FS is not set
+# CONFIG_CONFIGFS_FS is not set
 
 #
 # Miscellaneous filesystems
@@ -1426,6 +1390,7 @@ CONFIG_MSDOS_PARTITION=y
 # CONFIG_SGI_PARTITION is not set
 # CONFIG_ULTRIX_PARTITION is not set
 # CONFIG_SUN_PARTITION is not set
+# CONFIG_KARMA_PARTITION is not set
 # CONFIG_EFI_PARTITION is not set
 
 #
@@ -1481,10 +1446,6 @@ CONFIG_CRC32=y
 CONFIG_LIBCRC32C=m
 CONFIG_ZLIB_INFLATE=y
 CONFIG_ZLIB_DEFLATE=m
-CONFIG_TEXTSEARCH=y
-CONFIG_TEXTSEARCH_KMP=m
-CONFIG_TEXTSEARCH_BM=m
-CONFIG_TEXTSEARCH_FSM=m
 
 #
 # Instrumentation Support
@@ -1497,24 +1458,31 @@ CONFIG_OPROFILE=y
 # Kernel hacking
 #
 # CONFIG_PRINTK_TIME is not set
-CONFIG_DEBUG_KERNEL=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_KERNEL=y
 CONFIG_LOG_BUF_SHIFT=17
 CONFIG_DETECT_SOFTLOCKUP=y
 # CONFIG_SCHEDSTATS is not set
 # CONFIG_DEBUG_SLAB is not set
+CONFIG_DEBUG_MUTEXES=y
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
 # CONFIG_DEBUG_INFO is not set
 CONFIG_DEBUG_FS=y
 # CONFIG_DEBUG_VM is not set
+CONFIG_FORCED_INLINING=y
 # CONFIG_RCU_TORTURE_TEST is not set
 # CONFIG_DEBUG_STACKOVERFLOW is not set
 # CONFIG_DEBUG_STACK_USAGE is not set
 # CONFIG_DEBUGGER is not set
 CONFIG_IRQSTACKS=y
 CONFIG_BOOTX_TEXT=y
+# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
+# CONFIG_PPC_EARLY_DEBUG_G5 is not set
+# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
+# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
+# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
 
 #
 # Security options
Index: powerpc-git/arch/powerpc/configs/ppc64_defconfig
===================================================================
--- powerpc-git.orig/arch/powerpc/configs/ppc64_defconfig
+++ powerpc-git/arch/powerpc/configs/ppc64_defconfig
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 2.6.15-rc5
-# Tue Dec 20 15:59:38 2005
+# Linux kernel version: 2.6.16-rc2
+# Fri Feb 10 17:32:14 2006
 #
 CONFIG_PPC64=y
 CONFIG_64BIT=y
@@ -16,6 +16,10 @@ CONFIG_COMPAT=y
 CONFIG_SYSVIPC_COMPAT=y
 CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
 CONFIG_ARCH_MAY_HAVE_PC_FDC=y
+CONFIG_PPC_OF=y
+CONFIG_PPC_UDBG_16550=y
+CONFIG_GENERIC_TBSYNC=y
+# CONFIG_DEFAULT_UIMAGE is not set
 
 #
 # Processor support
@@ -33,7 +37,6 @@ CONFIG_NR_CPUS=32
 # Code maturity level options
 #
 CONFIG_EXPERIMENTAL=y
-CONFIG_CLEAN_COMPILE=y
 CONFIG_LOCK_KERNEL=y
 CONFIG_INIT_ENV_ARG_LIMIT=32
 
@@ -48,8 +51,6 @@ CONFIG_POSIX_MQUEUE=y
 # CONFIG_BSD_PROCESS_ACCT is not set
 CONFIG_SYSCTL=y
 # CONFIG_AUDIT is not set
-CONFIG_HOTPLUG=y
-CONFIG_KOBJECT_UEVENT=y
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_CPUSETS=y
@@ -59,8 +60,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
 CONFIG_KALLSYMS=y
 CONFIG_KALLSYMS_ALL=y
 # CONFIG_KALLSYMS_EXTRA_PASS is not set
+CONFIG_HOTPLUG=y
 CONFIG_PRINTK=y
 CONFIG_BUG=y
+CONFIG_ELF_CORE=y
 CONFIG_BASE_FULL=y
 CONFIG_FUTEX=y
 CONFIG_EPOLL=y
@@ -69,8 +72,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
 CONFIG_CC_ALIGN_LABELS=0
 CONFIG_CC_ALIGN_LOOPS=0
 CONFIG_CC_ALIGN_JUMPS=0
+CONFIG_SLAB=y
 # CONFIG_TINY_SHMEM is not set
 CONFIG_BASE_SMALL=0
+# CONFIG_SLOB is not set
 
 #
 # Loadable module support
@@ -113,7 +118,6 @@ CONFIG_PPC_PMAC=y
 CONFIG_PPC_PMAC64=y
 CONFIG_PPC_MAPLE=y
 # CONFIG_PPC_CELL is not set
-CONFIG_PPC_OF=y
 CONFIG_XICS=y
 CONFIG_U3_DART=y
 CONFIG_MPIC=y
@@ -124,8 +128,8 @@ CONFIG_RTAS_FLASH=m
 # CONFIG_MMIO_NVRAM is not set
 CONFIG_MPIC_BROKEN_U3=y
 CONFIG_IBMVIO=y
+# CONFIG_IBMEBUS is not set
 # CONFIG_PPC_MPC106 is not set
-CONFIG_GENERIC_TBSYNC=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_TABLE=y
 # CONFIG_CPU_FREQ_DEBUG is not set
@@ -158,6 +162,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
 CONFIG_IOMMU_VMERGE=y
 CONFIG_HOTPLUG_CPU=y
 CONFIG_KEXEC=y
+# CONFIG_CRASH_DUMP is not set
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_EEH=y
@@ -178,6 +183,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y
 CONFIG_SPARSEMEM_EXTREME=y
 # CONFIG_MEMORY_HOTPLUG is not set
 CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_MIGRATION=y
 # CONFIG_PPC_64K_PAGES is not set
 # CONFIG_SCHED_SMT is not set
 CONFIG_PROC_DEVICETREE=y
@@ -221,6 +227,7 @@ CONFIG_NET=y
 #
 # Networking options
 #
+# CONFIG_NETDEBUG is not set
 CONFIG_PACKET=y
 # CONFIG_PACKET_MMAP is not set
 CONFIG_UNIX=y
@@ -260,6 +267,7 @@ CONFIG_NETFILTER=y
 CONFIG_NETFILTER_NETLINK=y
 CONFIG_NETFILTER_NETLINK_QUEUE=m
 CONFIG_NETFILTER_NETLINK_LOG=m
+# CONFIG_NETFILTER_XTABLES is not set
 
 #
 # IP: Netfilter Configuration
@@ -277,65 +285,6 @@ CONFIG_IP_NF_TFTP=m
 CONFIG_IP_NF_AMANDA=m
 # CONFIG_IP_NF_PPTP is not set
 CONFIG_IP_NF_QUEUE=m
-CONFIG_IP_NF_IPTABLES=m
-CONFIG_IP_NF_MATCH_LIMIT=m
-CONFIG_IP_NF_MATCH_IPRANGE=m
-CONFIG_IP_NF_MATCH_MAC=m
-CONFIG_IP_NF_MATCH_PKTTYPE=m
-CONFIG_IP_NF_MATCH_MARK=m
-CONFIG_IP_NF_MATCH_MULTIPORT=m
-CONFIG_IP_NF_MATCH_TOS=m
-CONFIG_IP_NF_MATCH_RECENT=m
-CONFIG_IP_NF_MATCH_ECN=m
-CONFIG_IP_NF_MATCH_DSCP=m
-CONFIG_IP_NF_MATCH_AH_ESP=m
-CONFIG_IP_NF_MATCH_LENGTH=m
-CONFIG_IP_NF_MATCH_TTL=m
-CONFIG_IP_NF_MATCH_TCPMSS=m
-CONFIG_IP_NF_MATCH_HELPER=m
-CONFIG_IP_NF_MATCH_STATE=m
-CONFIG_IP_NF_MATCH_CONNTRACK=m
-CONFIG_IP_NF_MATCH_OWNER=m
-CONFIG_IP_NF_MATCH_ADDRTYPE=m
-CONFIG_IP_NF_MATCH_REALM=m
-CONFIG_IP_NF_MATCH_SCTP=m
-CONFIG_IP_NF_MATCH_DCCP=m
-CONFIG_IP_NF_MATCH_COMMENT=m
-CONFIG_IP_NF_MATCH_CONNMARK=m
-CONFIG_IP_NF_MATCH_CONNBYTES=m
-CONFIG_IP_NF_MATCH_HASHLIMIT=m
-CONFIG_IP_NF_MATCH_STRING=m
-CONFIG_IP_NF_FILTER=m
-CONFIG_IP_NF_TARGET_REJECT=m
-CONFIG_IP_NF_TARGET_LOG=m
-CONFIG_IP_NF_TARGET_ULOG=m
-CONFIG_IP_NF_TARGET_TCPMSS=m
-CONFIG_IP_NF_TARGET_NFQUEUE=m
-CONFIG_IP_NF_NAT=m
-CONFIG_IP_NF_NAT_NEEDED=y
-CONFIG_IP_NF_TARGET_MASQUERADE=m
-CONFIG_IP_NF_TARGET_REDIRECT=m
-CONFIG_IP_NF_TARGET_NETMAP=m
-CONFIG_IP_NF_TARGET_SAME=m
-CONFIG_IP_NF_NAT_SNMP_BASIC=m
-CONFIG_IP_NF_NAT_IRC=m
-CONFIG_IP_NF_NAT_FTP=m
-CONFIG_IP_NF_NAT_TFTP=m
-CONFIG_IP_NF_NAT_AMANDA=m
-CONFIG_IP_NF_MANGLE=m
-CONFIG_IP_NF_TARGET_TOS=m
-CONFIG_IP_NF_TARGET_ECN=m
-CONFIG_IP_NF_TARGET_DSCP=m
-CONFIG_IP_NF_TARGET_MARK=m
-CONFIG_IP_NF_TARGET_CLASSIFY=m
-CONFIG_IP_NF_TARGET_TTL=m
-CONFIG_IP_NF_TARGET_CONNMARK=m
-CONFIG_IP_NF_TARGET_CLUSTERIP=m
-CONFIG_IP_NF_RAW=m
-CONFIG_IP_NF_TARGET_NOTRACK=m
-CONFIG_IP_NF_ARPTABLES=m
-CONFIG_IP_NF_ARPFILTER=m
-CONFIG_IP_NF_ARP_MANGLE=m
 
 #
 # DCCP Configuration (EXPERIMENTAL)
@@ -346,6 +295,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
 # SCTP Configuration (EXPERIMENTAL)
 #
 # CONFIG_IP_SCTP is not set
+
+#
+# TIPC Configuration (EXPERIMENTAL)
+#
+# CONFIG_TIPC is not set
 # CONFIG_ATM is not set
 # CONFIG_BRIDGE is not set
 # CONFIG_VLAN_8021Q is not set
@@ -364,7 +318,6 @@ CONFIG_LLC=y
 # QoS and/or fair queueing
 #
 # CONFIG_NET_SCHED is not set
-CONFIG_NET_CLS_ROUTE=y
 
 #
 # Network testing
@@ -572,13 +525,7 @@ CONFIG_SCSI_IPR_TRACE=y
 CONFIG_SCSI_IPR_DUMP=y
 # CONFIG_SCSI_QLOGIC_FC is not set
 # CONFIG_SCSI_QLOGIC_1280 is not set
-CONFIG_SCSI_QLA2XXX=y
-CONFIG_SCSI_QLA21XX=m
-CONFIG_SCSI_QLA22XX=m
-CONFIG_SCSI_QLA2300=m
-CONFIG_SCSI_QLA2322=m
-CONFIG_SCSI_QLA6312=m
-CONFIG_SCSI_QLA24XX=m
+# CONFIG_SCSI_QLA_FC is not set
 CONFIG_SCSI_LPFC=m
 # CONFIG_SCSI_DC395x is not set
 # CONFIG_SCSI_DC390T is not set
@@ -642,8 +589,6 @@ CONFIG_IEEE1394_SBP2=m
 CONFIG_IEEE1394_ETH1394=m
 CONFIG_IEEE1394_DV1394=m
 CONFIG_IEEE1394_RAWIO=y
-CONFIG_IEEE1394_CMP=m
-CONFIG_IEEE1394_AMDTP=m
 
 #
 # I2O device support
@@ -659,6 +604,7 @@ CONFIG_THERM_PM72=y
 CONFIG_WINDFARM=y
 CONFIG_WINDFARM_PM81=y
 CONFIG_WINDFARM_PM91=y
+CONFIG_WINDFARM_PM112=y
 
 #
 # Network device support
@@ -731,6 +677,7 @@ CONFIG_E1000=y
 # CONFIG_R8169 is not set
 # CONFIG_SIS190 is not set
 # CONFIG_SKGE is not set
+# CONFIG_SKY2 is not set
 # CONFIG_SK98LIN is not set
 # CONFIG_VIA_VELOCITY is not set
 CONFIG_TIGON3=y
@@ -853,6 +800,7 @@ CONFIG_HW_CONSOLE=y
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_NR_UARTS=4
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
 # CONFIG_SERIAL_8250_EXTENDED is not set
 
 #
@@ -880,6 +828,7 @@ CONFIG_HVCS=m
 # CONFIG_WATCHDOG is not set
 # CONFIG_RTC is not set
 CONFIG_GEN_RTC=y
+# CONFIG_GEN_RTC_X is not set
 # CONFIG_DTLK is not set
 # CONFIG_R3964 is not set
 # CONFIG_APPLICOM is not set
@@ -923,8 +872,7 @@ CONFIG_I2C_AMD8111=y
 # CONFIG_I2C_I801 is not set
 # CONFIG_I2C_I810 is not set
 # CONFIG_I2C_PIIX4 is not set
-CONFIG_I2C_KEYWEST=y
-CONFIG_I2C_PMAC_SMU=y
+CONFIG_I2C_POWERMAC=y
 # CONFIG_I2C_NFORCE2 is not set
 # CONFIG_I2C_PARPORT_LIGHT is not set
 # CONFIG_I2C_PROSAVAGE is not set
@@ -957,6 +905,12 @@ CONFIG_I2C_PMAC_SMU=y
 # CONFIG_I2C_DEBUG_CHIP is not set
 
 #
+# SPI support
+#
+# CONFIG_SPI is not set
+# CONFIG_SPI_MASTER is not set
+
+#
 # Dallas's 1-wire bus
 #
 # CONFIG_W1 is not set
@@ -1028,7 +982,6 @@ CONFIG_FB_RADEON_I2C=y
 # CONFIG_FB_KYRO is not set
 # CONFIG_FB_3DFX is not set
 # CONFIG_FB_VOODOO1 is not set
-# CONFIG_FB_CYBLA is not set
 # CONFIG_FB_TRIDENT is not set
 # CONFIG_FB_VIRTUAL is not set
 
@@ -1073,9 +1026,10 @@ CONFIG_SND_OSSEMUL=y
 CONFIG_SND_MIXER_OSS=m
 CONFIG_SND_PCM_OSS=m
 CONFIG_SND_SEQUENCER_OSS=y
+# CONFIG_SND_DYNAMIC_MINORS is not set
+CONFIG_SND_SUPPORT_OLD_API=y
 # CONFIG_SND_VERBOSE_PRINTK is not set
 # CONFIG_SND_DEBUG is not set
-CONFIG_SND_GENERIC_DRIVER=y
 
 #
 # Generic devices
@@ -1089,6 +1043,8 @@ CONFIG_SND_GENERIC_DRIVER=y
 #
 # PCI devices
 #
+# CONFIG_SND_AD1889 is not set
+# CONFIG_SND_ALS4000 is not set
 # CONFIG_SND_ALI5451 is not set
 # CONFIG_SND_ATIIXP is not set
 # CONFIG_SND_ATIIXP_MODEM is not set
@@ -1097,39 +1053,38 @@ CONFIG_SND_GENERIC_DRIVER=y
 # CONFIG_SND_AU8830 is not set
 # CONFIG_SND_AZT3328 is not set
 # CONFIG_SND_BT87X is not set
-# CONFIG_SND_CS46XX is not set
+# CONFIG_SND_CA0106 is not set
+# CONFIG_SND_CMIPCI is not set
 # CONFIG_SND_CS4281 is not set
+# CONFIG_SND_CS46XX is not set
 # CONFIG_SND_EMU10K1 is not set
 # CONFIG_SND_EMU10K1X is not set
-# CONFIG_SND_CA0106 is not set
-# CONFIG_SND_KORG1212 is not set
-# CONFIG_SND_MIXART is not set
-# CONFIG_SND_NM256 is not set
-# CONFIG_SND_RME32 is not set
-# CONFIG_SND_RME96 is not set
-# CONFIG_SND_RME9652 is not set
-# CONFIG_SND_HDSP is not set
-# CONFIG_SND_HDSPM is not set
-# CONFIG_SND_TRIDENT is not set
-# CONFIG_SND_YMFPCI is not set
-# CONFIG_SND_AD1889 is not set
-# CONFIG_SND_ALS4000 is not set
-# CONFIG_SND_CMIPCI is not set
 # CONFIG_SND_ENS1370 is not set
 # CONFIG_SND_ENS1371 is not set
 # CONFIG_SND_ES1938 is not set
 # CONFIG_SND_ES1968 is not set
-# CONFIG_SND_MAESTRO3 is not set
 # CONFIG_SND_FM801 is not set
+# CONFIG_SND_HDA_INTEL is not set
+# CONFIG_SND_HDSP is not set
+# CONFIG_SND_HDSPM is not set
 # CONFIG_SND_ICE1712 is not set
 # CONFIG_SND_ICE1724 is not set
 # CONFIG_SND_INTEL8X0 is not set
 # CONFIG_SND_INTEL8X0M is not set
+# CONFIG_SND_KORG1212 is not set
+# CONFIG_SND_MAESTRO3 is not set
+# CONFIG_SND_MIXART is not set
+# CONFIG_SND_NM256 is not set
+# CONFIG_SND_PCXHR is not set
+# CONFIG_SND_RME32 is not set
+# CONFIG_SND_RME96 is not set
+# CONFIG_SND_RME9652 is not set
 # CONFIG_SND_SONICVIBES is not set
+# CONFIG_SND_TRIDENT is not set
 # CONFIG_SND_VIA82XX is not set
 # CONFIG_SND_VIA82XX_MODEM is not set
 # CONFIG_SND_VX222 is not set
-# CONFIG_SND_HDA_INTEL is not set
+# CONFIG_SND_YMFPCI is not set
 
 #
 # ALSA PowerMac devices
@@ -1201,13 +1156,16 @@ CONFIG_USB_STORAGE=m
 # CONFIG_USB_STORAGE_SDDR09 is not set
 # CONFIG_USB_STORAGE_SDDR55 is not set
 # CONFIG_USB_STORAGE_JUMPSHOT is not set
+# CONFIG_USB_STORAGE_ALAUDA is not set
 # CONFIG_USB_STORAGE_ONETOUCH is not set
+# CONFIG_USB_LIBUSUAL is not set
 
 #
 # USB Input Devices
 #
 CONFIG_USB_HID=y
 CONFIG_USB_HIDINPUT=y
+# CONFIG_USB_HIDINPUT_POWERBOOK is not set
 # CONFIG_HID_FF is not set
 CONFIG_USB_HIDDEV=y
 # CONFIG_USB_AIPTEK is not set
@@ -1221,6 +1179,7 @@ CONFIG_USB_HIDDEV=y
 # CONFIG_USB_YEALINK is not set
 # CONFIG_USB_XPAD is not set
 # CONFIG_USB_ATI_REMOTE is not set
+# CONFIG_USB_ATI_REMOTE2 is not set
 # CONFIG_USB_KEYSPAN_REMOTE is not set
 # CONFIG_USB_APPLETOUCH is not set
 
@@ -1307,6 +1266,10 @@ CONFIG_INFINIBAND_IPOIB=m
 #
 
 #
+# EDAC - error detection and reporting (RAS)
+#
+
+#
 # File systems
 #
 CONFIG_EXT2_FS=y
@@ -1340,6 +1303,7 @@ CONFIG_XFS_EXPORT=y
 CONFIG_XFS_SECURITY=y
 CONFIG_XFS_POSIX_ACL=y
 # CONFIG_XFS_RT is not set
+# CONFIG_OCFS2_FS is not set
 # CONFIG_MINIX_FS is not set
 # CONFIG_ROMFS_FS is not set
 CONFIG_INOTIFY=y
@@ -1379,6 +1343,7 @@ CONFIG_HUGETLBFS=y
 CONFIG_HUGETLB_PAGE=y
 CONFIG_RAMFS=y
 # CONFIG_RELAYFS_FS is not set
+# CONFIG_CONFIGFS_FS is not set
 
 #
 # Miscellaneous filesystems
@@ -1449,6 +1414,7 @@ CONFIG_MSDOS_PARTITION=y
 # CONFIG_SGI_PARTITION is not set
 # CONFIG_ULTRIX_PARTITION is not set
 # CONFIG_SUN_PARTITION is not set
+# CONFIG_KARMA_PARTITION is not set
 # CONFIG_EFI_PARTITION is not set
 
 #
@@ -1504,10 +1470,6 @@ CONFIG_CRC32=y
 CONFIG_LIBCRC32C=m
 CONFIG_ZLIB_INFLATE=y
 CONFIG_ZLIB_DEFLATE=m
-CONFIG_TEXTSEARCH=y
-CONFIG_TEXTSEARCH_KMP=m
-CONFIG_TEXTSEARCH_BM=m
-CONFIG_TEXTSEARCH_FSM=m
 
 #
 # Instrumentation Support
@@ -1520,18 +1482,20 @@ CONFIG_OPROFILE=y
 # Kernel hacking
 #
 # CONFIG_PRINTK_TIME is not set
-CONFIG_DEBUG_KERNEL=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_KERNEL=y
 CONFIG_LOG_BUF_SHIFT=17
 CONFIG_DETECT_SOFTLOCKUP=y
 # CONFIG_SCHEDSTATS is not set
 # CONFIG_DEBUG_SLAB is not set
+CONFIG_DEBUG_MUTEXES=y
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
 # CONFIG_DEBUG_INFO is not set
 CONFIG_DEBUG_FS=y
 # CONFIG_DEBUG_VM is not set
+CONFIG_FORCED_INLINING=y
 # CONFIG_RCU_TORTURE_TEST is not set
 CONFIG_DEBUG_STACKOVERFLOW=y
 CONFIG_DEBUG_STACK_USAGE=y
@@ -1540,6 +1504,11 @@ CONFIG_XMON=y
 # CONFIG_XMON_DEFAULT is not set
 CONFIG_IRQSTACKS=y
 CONFIG_BOOTX_TEXT=y
+# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
+# CONFIG_PPC_EARLY_DEBUG_G5 is not set
+# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
+# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
+# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
 
 #
 # Security options
Index: powerpc-git/arch/powerpc/configs/pseries_defconfig
===================================================================
--- powerpc-git.orig/arch/powerpc/configs/pseries_defconfig
+++ powerpc-git/arch/powerpc/configs/pseries_defconfig
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
-# Linux kernel version: 2.6.15-rc5
-# Tue Dec 20 15:59:40 2005
+# Linux kernel version: 2.6.16-rc2
+# Fri Feb 10 17:33:32 2006
 #
 CONFIG_PPC64=y
 CONFIG_64BIT=y
@@ -16,6 +16,10 @@ CONFIG_COMPAT=y
 CONFIG_SYSVIPC_COMPAT=y
 CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
 CONFIG_ARCH_MAY_HAVE_PC_FDC=y
+CONFIG_PPC_OF=y
+CONFIG_PPC_UDBG_16550=y
+# CONFIG_GENERIC_TBSYNC is not set
+# CONFIG_DEFAULT_UIMAGE is not set
 
 #
 # Processor support
@@ -33,7 +37,6 @@ CONFIG_NR_CPUS=128
 # Code maturity level options
 #
 CONFIG_EXPERIMENTAL=y
-CONFIG_CLEAN_COMPILE=y
 CONFIG_LOCK_KERNEL=y
 CONFIG_INIT_ENV_ARG_LIMIT=32
 
@@ -49,8 +52,6 @@ CONFIG_POSIX_MQUEUE=y
 CONFIG_SYSCTL=y
 CONFIG_AUDIT=y
 CONFIG_AUDITSYSCALL=y
-CONFIG_HOTPLUG=y
-CONFIG_KOBJECT_UEVENT=y
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_CPUSETS=y
@@ -60,8 +61,10 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
 CONFIG_KALLSYMS=y
 CONFIG_KALLSYMS_ALL=y
 # CONFIG_KALLSYMS_EXTRA_PASS is not set
+CONFIG_HOTPLUG=y
 CONFIG_PRINTK=y
 CONFIG_BUG=y
+CONFIG_ELF_CORE=y
 CONFIG_BASE_FULL=y
 CONFIG_FUTEX=y
 CONFIG_EPOLL=y
@@ -70,8 +73,10 @@ CONFIG_CC_ALIGN_FUNCTIONS=0
 CONFIG_CC_ALIGN_LABELS=0
 CONFIG_CC_ALIGN_LOOPS=0
 CONFIG_CC_ALIGN_JUMPS=0
+CONFIG_SLAB=y
 # CONFIG_TINY_SHMEM is not set
 CONFIG_BASE_SMALL=0
+# CONFIG_SLOB is not set
 
 #
 # Loadable module support
@@ -113,7 +118,6 @@ CONFIG_PPC_PSERIES=y
 # CONFIG_PPC_PMAC is not set
 # CONFIG_PPC_MAPLE is not set
 # CONFIG_PPC_CELL is not set
-CONFIG_PPC_OF=y
 CONFIG_XICS=y
 # CONFIG_U3_DART is not set
 CONFIG_MPIC=y
@@ -123,8 +127,8 @@ CONFIG_RTAS_PROC=y
 CONFIG_RTAS_FLASH=m
 # CONFIG_MMIO_NVRAM is not set
 CONFIG_IBMVIO=y
+# CONFIG_IBMEBUS is not set
 # CONFIG_PPC_MPC106 is not set
-# CONFIG_GENERIC_TBSYNC is not set
 # CONFIG_CPU_FREQ is not set
 # CONFIG_WANT_EARLY_SERIAL is not set
 
@@ -145,6 +149,7 @@ CONFIG_FORCE_MAX_ZONEORDER=13
 CONFIG_IOMMU_VMERGE=y
 CONFIG_HOTPLUG_CPU=y
 CONFIG_KEXEC=y
+# CONFIG_CRASH_DUMP is not set
 CONFIG_IRQ_ALL_CPUS=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_EEH=y
@@ -165,6 +170,7 @@ CONFIG_HAVE_MEMORY_PRESENT=y
 CONFIG_SPARSEMEM_EXTREME=y
 # CONFIG_MEMORY_HOTPLUG is not set
 CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_MIGRATION=y
 CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y
 # CONFIG_PPC_64K_PAGES is not set
 CONFIG_SCHED_SMT=y
@@ -209,6 +215,7 @@ CONFIG_NET=y
 #
 # Networking options
 #
+# CONFIG_NETDEBUG is not set
 CONFIG_PACKET=y
 # CONFIG_PACKET_MMAP is not set
 CONFIG_UNIX=y
@@ -248,6 +255,7 @@ CONFIG_NETFILTER=y
 CONFIG_NETFILTER_NETLINK=y
 CONFIG_NETFILTER_NETLINK_QUEUE=m
 CONFIG_NETFILTER_NETLINK_LOG=m
+# CONFIG_NETFILTER_XTABLES is not set
 
 #
 # IP: Netfilter Configuration
@@ -265,65 +273,6 @@ CONFIG_IP_NF_TFTP=m
 CONFIG_IP_NF_AMANDA=m
 # CONFIG_IP_NF_PPTP is not set
 CONFIG_IP_NF_QUEUE=m
-CONFIG_IP_NF_IPTABLES=m
-CONFIG_IP_NF_MATCH_LIMIT=m
-CONFIG_IP_NF_MATCH_IPRANGE=m
-CONFIG_IP_NF_MATCH_MAC=m
-CONFIG_IP_NF_MATCH_PKTTYPE=m
-CONFIG_IP_NF_MATCH_MARK=m
-CONFIG_IP_NF_MATCH_MULTIPORT=m
-CONFIG_IP_NF_MATCH_TOS=m
-CONFIG_IP_NF_MATCH_RECENT=m
-CONFIG_IP_NF_MATCH_ECN=m
-CONFIG_IP_NF_MATCH_DSCP=m
-CONFIG_IP_NF_MATCH_AH_ESP=m
-CONFIG_IP_NF_MATCH_LENGTH=m
-CONFIG_IP_NF_MATCH_TTL=m
-CONFIG_IP_NF_MATCH_TCPMSS=m
-CONFIG_IP_NF_MATCH_HELPER=m
-CONFIG_IP_NF_MATCH_STATE=m
-CONFIG_IP_NF_MATCH_CONNTRACK=m
-CONFIG_IP_NF_MATCH_OWNER=m
-CONFIG_IP_NF_MATCH_ADDRTYPE=m
-CONFIG_IP_NF_MATCH_REALM=m
-CONFIG_IP_NF_MATCH_SCTP=m
-# CONFIG_IP_NF_MATCH_DCCP is not set
-CONFIG_IP_NF_MATCH_COMMENT=m
-CONFIG_IP_NF_MATCH_CONNMARK=m
-CONFIG_IP_NF_MATCH_CONNBYTES=m
-CONFIG_IP_NF_MATCH_HASHLIMIT=m
-CONFIG_IP_NF_MATCH_STRING=m
-CONFIG_IP_NF_FILTER=m
-CONFIG_IP_NF_TARGET_REJECT=m
-CONFIG_IP_NF_TARGET_LOG=m
-CONFIG_IP_NF_TARGET_ULOG=m
-CONFIG_IP_NF_TARGET_TCPMSS=m
-CONFIG_IP_NF_TARGET_NFQUEUE=m
-CONFIG_IP_NF_NAT=m
-CONFIG_IP_NF_NAT_NEEDED=y
-CONFIG_IP_NF_TARGET_MASQUERADE=m
-CONFIG_IP_NF_TARGET_REDIRECT=m
-CONFIG_IP_NF_TARGET_NETMAP=m
-CONFIG_IP_NF_TARGET_SAME=m
-CONFIG_IP_NF_NAT_SNMP_BASIC=m
-CONFIG_IP_NF_NAT_IRC=m
-CONFIG_IP_NF_NAT_FTP=m
-CONFIG_IP_NF_NAT_TFTP=m
-CONFIG_IP_NF_NAT_AMANDA=m
-CONFIG_IP_NF_MANGLE=m
-CONFIG_IP_NF_TARGET_TOS=m
-CONFIG_IP_NF_TARGET_ECN=m
-CONFIG_IP_NF_TARGET_DSCP=m
-CONFIG_IP_NF_TARGET_MARK=m
-CONFIG_IP_NF_TARGET_CLASSIFY=m
-CONFIG_IP_NF_TARGET_TTL=m
-CONFIG_IP_NF_TARGET_CONNMARK=m
-CONFIG_IP_NF_TARGET_CLUSTERIP=m
-CONFIG_IP_NF_RAW=m
-CONFIG_IP_NF_TARGET_NOTRACK=m
-CONFIG_IP_NF_ARPTABLES=m
-CONFIG_IP_NF_ARPFILTER=m
-CONFIG_IP_NF_ARP_MANGLE=m
 
 #
 # DCCP Configuration (EXPERIMENTAL)
@@ -334,6 +283,11 @@ CONFIG_IP_NF_ARP_MANGLE=m
 # SCTP Configuration (EXPERIMENTAL)
 #
 # CONFIG_IP_SCTP is not set
+
+#
+# TIPC Configuration (EXPERIMENTAL)
+#
+# CONFIG_TIPC is not set
 # CONFIG_ATM is not set
 # CONFIG_BRIDGE is not set
 # CONFIG_VLAN_8021Q is not set
@@ -352,7 +306,6 @@ CONFIG_LLC=y
 # QoS and/or fair queueing
 #
 # CONFIG_NET_SCHED is not set
-CONFIG_NET_CLS_ROUTE=y
 
 #
 # Network testing
@@ -550,13 +503,7 @@ CONFIG_SCSI_IPR_TRACE=y
 CONFIG_SCSI_IPR_DUMP=y
 # CONFIG_SCSI_QLOGIC_FC is not set
 # CONFIG_SCSI_QLOGIC_1280 is not set
-CONFIG_SCSI_QLA2XXX=y
-CONFIG_SCSI_QLA21XX=m
-CONFIG_SCSI_QLA22XX=m
-CONFIG_SCSI_QLA2300=m
-CONFIG_SCSI_QLA2322=m
-CONFIG_SCSI_QLA6312=m
-CONFIG_SCSI_QLA24XX=m
+# CONFIG_SCSI_QLA_FC is not set
 CONFIG_SCSI_LPFC=m
 # CONFIG_SCSI_DC395x is not set
 # CONFIG_SCSI_DC390T is not set
@@ -678,6 +625,7 @@ CONFIG_E1000=y
 # CONFIG_R8169 is not set
 # CONFIG_SIS190 is not set
 # CONFIG_SKGE is not set
+# CONFIG_SKY2 is not set
 # CONFIG_SK98LIN is not set
 # CONFIG_VIA_VELOCITY is not set
 CONFIG_TIGON3=y
@@ -803,6 +751,7 @@ CONFIG_HW_CONSOLE=y
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_NR_UARTS=4
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
 # CONFIG_SERIAL_8250_EXTENDED is not set
 
 #
@@ -909,6 +858,12 @@ CONFIG_I2C_ALGOBIT=y
 # CONFIG_I2C_DEBUG_CHIP is not set
 
 #
+# SPI support
+#
+# CONFIG_SPI is not set
+# CONFIG_SPI_MASTER is not set
+
+#
 # Dallas's 1-wire bus
 #
 # CONFIG_W1 is not set
@@ -976,7 +931,6 @@ CONFIG_FB_RADEON_I2C=y
 # CONFIG_FB_KYRO is not set
 # CONFIG_FB_3DFX is not set
 # CONFIG_FB_VOODOO1 is not set
-# CONFIG_FB_CYBLA is not set
 # CONFIG_FB_TRIDENT is not set
 # CONFIG_FB_VIRTUAL is not set
 
@@ -1061,12 +1015,15 @@ CONFIG_USB_STORAGE=y
 # CONFIG_USB_STORAGE_SDDR09 is not set
 # CONFIG_USB_STORAGE_SDDR55 is not set
 # CONFIG_USB_STORAGE_JUMPSHOT is not set
+# CONFIG_USB_STORAGE_ALAUDA is not set
+# CONFIG_USB_LIBUSUAL is not set
 
 #
 # USB Input Devices
 #
 CONFIG_USB_HID=y
 CONFIG_USB_HIDINPUT=y
+# CONFIG_USB_HIDINPUT_POWERBOOK is not set
 # CONFIG_HID_FF is not set
 CONFIG_USB_HIDDEV=y
 # CONFIG_USB_AIPTEK is not set
@@ -1080,6 +1037,7 @@ CONFIG_USB_HIDDEV=y
 # CONFIG_USB_YEALINK is not set
 # CONFIG_USB_XPAD is not set
 # CONFIG_USB_ATI_REMOTE is not set
+# CONFIG_USB_ATI_REMOTE2 is not set
 # CONFIG_USB_KEYSPAN_REMOTE is not set
 # CONFIG_USB_APPLETOUCH is not set
 
@@ -1167,6 +1125,10 @@ CONFIG_INFINIBAND_IPOIB=m
 #
 
 #
+# EDAC - error detection and reporting (RAS)
+#
+
+#
 # File systems
 #
 CONFIG_EXT2_FS=y
@@ -1200,6 +1162,7 @@ CONFIG_XFS_EXPORT=y
 CONFIG_XFS_SECURITY=y
 CONFIG_XFS_POSIX_ACL=y
 # CONFIG_XFS_RT is not set
+# CONFIG_OCFS2_FS is not set
 # CONFIG_MINIX_FS is not set
 # CONFIG_ROMFS_FS is not set
 CONFIG_INOTIFY=y
@@ -1240,6 +1203,7 @@ CONFIG_HUGETLBFS=y
 CONFIG_HUGETLB_PAGE=y
 CONFIG_RAMFS=y
 # CONFIG_RELAYFS_FS is not set
+# CONFIG_CONFIGFS_FS is not set
 
 #
 # Miscellaneous filesystems
@@ -1351,10 +1315,6 @@ CONFIG_CRC32=y
 CONFIG_LIBCRC32C=m
 CONFIG_ZLIB_INFLATE=y
 CONFIG_ZLIB_DEFLATE=m
-CONFIG_TEXTSEARCH=y
-CONFIG_TEXTSEARCH_KMP=m
-CONFIG_TEXTSEARCH_BM=m
-CONFIG_TEXTSEARCH_FSM=m
 
 #
 # Instrumentation Support
@@ -1367,18 +1327,20 @@ CONFIG_OPROFILE=y
 # Kernel hacking
 #
 # CONFIG_PRINTK_TIME is not set
-CONFIG_DEBUG_KERNEL=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_KERNEL=y
 CONFIG_LOG_BUF_SHIFT=17
 CONFIG_DETECT_SOFTLOCKUP=y
 # CONFIG_SCHEDSTATS is not set
 # CONFIG_DEBUG_SLAB is not set
+CONFIG_DEBUG_MUTEXES=y
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
 # CONFIG_DEBUG_KOBJECT is not set
 # CONFIG_DEBUG_INFO is not set
 CONFIG_DEBUG_FS=y
 # CONFIG_DEBUG_VM is not set
+CONFIG_FORCED_INLINING=y
 # CONFIG_RCU_TORTURE_TEST is not set
 CONFIG_DEBUG_STACKOVERFLOW=y
 CONFIG_DEBUG_STACK_USAGE=y
@@ -1387,6 +1349,11 @@ CONFIG_XMON=y
 CONFIG_XMON_DEFAULT=y
 CONFIG_IRQSTACKS=y
 # CONFIG_BOOTX_TEXT is not set
+# CONFIG_PPC_EARLY_DEBUG_LPAR is not set
+# CONFIG_PPC_EARLY_DEBUG_G5 is not set
+# CONFIG_PPC_EARLY_DEBUG_RTAS is not set
+# CONFIG_PPC_EARLY_DEBUG_MAPLE is not set
+# CONFIG_PPC_EARLY_DEBUG_ISERIES is not set
 
 #
 # Security options


From anton at samba.org  Mon Feb 13 14:48:35 2006
From: anton at samba.org (Anton Blanchard)
Date: Mon, 13 Feb 2006 14:48:35 +1100
Subject: [PATCH] powerpc: Fix runlatch performance issues
Message-ID: <20060213034835.GB7922@krispykreme>


The runlatch SPR can take a lot of time to write. My original runlatch
code would set it on every exception entry even though most of the time
this was not required. It would also continually set it in the idle
loop, which is an issue on an SMT capable processor.

Now we cache the runlatch value in a threadinfo bit, and only check for
it in decrementer and hardware interrupt exceptions as well as the idle
loop. Booted on POWER3, POWER5 and iseries, and compile tested on pmac32.
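
For readers who want the idea in isolation, here is a minimal userspace
sketch of the caching scheme, not the kernel code itself. The names
fake_ctrl, spr_writes, tif_runlatch and slow_spr_write are hypothetical
stand-ins for the CTRL SPR, a counter of expensive writes, the
TIF_RUNLATCH thread flag and mtspr; only CTRL_RUNLATCH is a name taken
from the patch, and its value here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit value, for illustration only. */
#define CTRL_RUNLATCH 0x1UL

unsigned long fake_ctrl;   /* simulated control register contents */
unsigned long spr_writes;  /* number of (expensive) register writes */
bool tif_runlatch;         /* cached state, playing the role of TIF_RUNLATCH */

/* Stand-in for the slow mtspr: store the value and count the write. */
void slow_spr_write(unsigned long val)
{
	fake_ctrl = val;
	spr_writes++;
}

/* Set the runlatch only if the cached flag says it is currently off. */
void runlatch_on(void)
{
	if (!tif_runlatch) {
		/* The cache and the register must agree before we flip them. */
		assert((fake_ctrl & CTRL_RUNLATCH) == 0);
		slow_spr_write(fake_ctrl | CTRL_RUNLATCH);
		tif_runlatch = true;
	}
}

/* Clear the runlatch only if the cached flag says it is currently on. */
void runlatch_off(void)
{
	if (tif_runlatch) {
		assert((fake_ctrl & CTRL_RUNLATCH) != 0);
		tif_runlatch = false;
		slow_spr_write(fake_ctrl & ~CTRL_RUNLATCH);
	}
}
```

Calling runlatch_on() repeatedly, as an exception path or the idle loop
would, costs one register write in this sketch instead of one per call;
the other calls only test the cached flag.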

Signed-off-by: Anton Blanchard <anton at samba.org>
---

Index: build/arch/powerpc/kernel/head_64.S
===================================================================
--- build.orig/arch/powerpc/kernel/head_64.S	2006-02-11 14:50:46.000000000 +1100
+++ build/arch/powerpc/kernel/head_64.S	2006-02-13 13:11:22.000000000 +1100
@@ -321,7 +321,6 @@ exception_marker:
 label##_pSeries:					\
 	HMT_MEDIUM;					\
 	mtspr	SPRN_SPRG1,r13;		/* save r13 */	\
-	RUNLATCH_ON(r13);				\
 	EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common)
 
 #define STD_EXCEPTION_ISERIES(n, label, area)		\
@@ -329,7 +328,6 @@ label##_pSeries:					\
 label##_iSeries:					\
 	HMT_MEDIUM;					\
 	mtspr	SPRN_SPRG1,r13;		/* save r13 */	\
-	RUNLATCH_ON(r13);				\
 	EXCEPTION_PROLOG_ISERIES_1(area);		\
 	EXCEPTION_PROLOG_ISERIES_2;			\
 	b	label##_common
@@ -339,7 +337,6 @@ label##_iSeries:					\
 label##_iSeries:							\
 	HMT_MEDIUM;							\
 	mtspr	SPRN_SPRG1,r13;		/* save r13 */			\
-	RUNLATCH_ON(r13);						\
 	EXCEPTION_PROLOG_ISERIES_1(PACA_EXGEN);				\
 	lbz	r10,PACAPROCENABLED(r13);				\
 	cmpwi	0,r10,0;						\
@@ -392,6 +389,7 @@ label##_common:						\
 label##_common:						\
 	EXCEPTION_PROLOG_COMMON(trap, PACA_EXGEN);	\
 	DISABLE_INTS;					\
+	bl	.ppc64_runlatch_on;			\
 	addi	r3,r1,STACK_FRAME_OVERHEAD;		\
 	bl	hdlr;					\
 	b	.ret_from_except_lite
@@ -409,7 +407,6 @@ __start_interrupts:
 _machine_check_pSeries:
 	HMT_MEDIUM
 	mtspr	SPRN_SPRG1,r13		/* save r13 */
-	RUNLATCH_ON(r13)
 	EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
 
 	. = 0x300
@@ -436,7 +433,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_SLB)
 data_access_slb_pSeries:
 	HMT_MEDIUM
 	mtspr	SPRN_SPRG1,r13
-	RUNLATCH_ON(r13)
 	mfspr	r13,SPRN_SPRG3		/* get paca address into r13 */
 	std	r3,PACA_EXSLB+EX_R3(r13)
 	mfspr	r3,SPRN_DAR
@@ -462,7 +458,6 @@ data_access_slb_pSeries:
 instruction_access_slb_pSeries:
 	HMT_MEDIUM
 	mtspr	SPRN_SPRG1,r13
-	RUNLATCH_ON(r13)
 	mfspr	r13,SPRN_SPRG3		/* get paca address into r13 */
 	std	r3,PACA_EXSLB+EX_R3(r13)
 	mfspr	r3,SPRN_SRR0		/* SRR0 is faulting address */
@@ -493,7 +488,6 @@ instruction_access_slb_pSeries:
 	.globl	system_call_pSeries
 system_call_pSeries:
 	HMT_MEDIUM
-	RUNLATCH_ON(r9)
 	mr	r9,r13
 	mfmsr	r10
 	mfspr	r13,SPRN_SPRG3
@@ -577,7 +571,6 @@ slb_miss_user_pseries:
 system_reset_fwnmi:
 	HMT_MEDIUM
 	mtspr	SPRN_SPRG1,r13		/* save r13 */
-	RUNLATCH_ON(r13)
 	EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common)
 
 	.globl machine_check_fwnmi
@@ -585,7 +578,6 @@ system_reset_fwnmi:
 machine_check_fwnmi:
 	HMT_MEDIUM
 	mtspr	SPRN_SPRG1,r13		/* save r13 */
-	RUNLATCH_ON(r13)
 	EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
 
 #ifdef CONFIG_PPC_ISERIES
@@ -896,7 +888,6 @@ unrecov_fer:
 	.align	7
 	.globl data_access_common
 data_access_common:
-	RUNLATCH_ON(r10)		/* It wont fit in the 0x300 handler */
 	mfspr	r10,SPRN_DAR
 	std	r10,PACA_EXGEN+EX_DAR(r13)
 	mfspr	r10,SPRN_DSISR
@@ -1044,6 +1035,7 @@ hardware_interrupt_common:
 	EXCEPTION_PROLOG_COMMON(0x500, PACA_EXGEN)
 hardware_interrupt_entry:
 	DISABLE_INTS
+	bl	.ppc64_runlatch_on
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	.do_IRQ
 	b	.ret_from_except_lite
Index: build/arch/powerpc/kernel/process.c
===================================================================
--- build.orig/arch/powerpc/kernel/process.c	2006-02-11 14:50:46.000000000 +1100
+++ build/arch/powerpc/kernel/process.c	2006-02-13 13:12:45.000000000 +1100
@@ -888,3 +888,35 @@ void dump_stack(void)
 	show_stack(current, NULL);
 }
 EXPORT_SYMBOL(dump_stack);
+
+#ifdef CONFIG_PPC64
+void ppc64_runlatch_on(void)
+{
+	unsigned long ctrl;
+
+	if (cpu_has_feature(CPU_FTR_CTRL) && !test_thread_flag(TIF_RUNLATCH)) {
+		HMT_medium();
+
+		ctrl = mfspr(SPRN_CTRLF);
+		ctrl |= CTRL_RUNLATCH;
+		mtspr(SPRN_CTRLT, ctrl);
+
+		set_thread_flag(TIF_RUNLATCH);
+	}
+}
+
+void ppc64_runlatch_off(void)
+{
+	unsigned long ctrl;
+
+	if (cpu_has_feature(CPU_FTR_CTRL) && test_thread_flag(TIF_RUNLATCH)) {
+		HMT_medium();
+
+		clear_thread_flag(TIF_RUNLATCH);
+
+		ctrl = mfspr(SPRN_CTRLF);
+		ctrl &= ~CTRL_RUNLATCH;
+		mtspr(SPRN_CTRLT, ctrl);
+	}
+}
+#endif
Index: build/arch/powerpc/platforms/iseries/setup.c
===================================================================
--- build.orig/arch/powerpc/platforms/iseries/setup.c	2006-02-11 14:50:46.000000000 +1100
+++ build/arch/powerpc/platforms/iseries/setup.c	2006-02-11 14:50:55.000000000 +1100
@@ -648,6 +648,7 @@ static void yield_shared_processor(void)
 	 * here and let the timer_interrupt code sort out the actual time.
 	 */
 	get_lppaca()->int_dword.fields.decr_int = 1;
+	ppc64_runlatch_on();
 	process_iSeries_events();
 }
 
Index: build/include/asm-powerpc/reg.h
===================================================================
--- build.orig/include/asm-powerpc/reg.h	2006-02-11 14:50:46.000000000 +1100
+++ build/include/asm-powerpc/reg.h	2006-02-11 14:50:55.000000000 +1100
@@ -615,27 +615,9 @@
 #define proc_trap()	asm volatile("trap")
 
 #ifdef CONFIG_PPC64
-static inline void ppc64_runlatch_on(void)
-{
-	unsigned long ctrl;
-
-	if (cpu_has_feature(CPU_FTR_CTRL)) {
-		ctrl = mfspr(SPRN_CTRLF);
-		ctrl |= CTRL_RUNLATCH;
-		mtspr(SPRN_CTRLT, ctrl);
-	}
-}
-
-static inline void ppc64_runlatch_off(void)
-{
-	unsigned long ctrl;
-
-	if (cpu_has_feature(CPU_FTR_CTRL)) {
-		ctrl = mfspr(SPRN_CTRLF);
-		ctrl &= ~CTRL_RUNLATCH;
-		mtspr(SPRN_CTRLT, ctrl);
-	}
-}
+
+extern void ppc64_runlatch_on(void);
+extern void ppc64_runlatch_off(void);
 
 extern unsigned long scom970_read(unsigned int address);
 extern void scom970_write(unsigned int address, unsigned long value);
@@ -645,15 +627,6 @@ extern void scom970_write(unsigned int a
 #define __get_SP()	({unsigned long sp; \
 			asm volatile("mr %0,1": "=r" (sp)); sp;})
 
-#else /* __ASSEMBLY__ */
-
-#define RUNLATCH_ON(REG)			\
-BEGIN_FTR_SECTION				\
-	mfspr	(REG),SPRN_CTRLF;		\
-	ori	(REG),(REG),CTRL_RUNLATCH;	\
-	mtspr	SPRN_CTRLT,(REG);		\
-END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
-
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_REG_H */
Index: build/include/asm-powerpc/thread_info.h
===================================================================
--- build.orig/include/asm-powerpc/thread_info.h	2006-02-11 14:50:46.000000000 +1100
+++ build/include/asm-powerpc/thread_info.h	2006-02-11 14:50:55.000000000 +1100
@@ -113,7 +113,7 @@ static inline struct thread_info *curren
 #define TIF_POLLING_NRFLAG	4	/* true if poll_idle() is polling
 					   TIF_NEED_RESCHED */
 #define TIF_32BIT		5	/* 32 bit binary */
-/* #define SPARE		6 */
+#define TIF_RUNLATCH		6	/* Is the runlatch enabled? */
 #define TIF_ABI_PENDING		7	/* 32/64 bit switch needed */
 #define TIF_SYSCALL_AUDIT	8	/* syscall auditing active */
 #define TIF_SINGLESTEP		9	/* singlestepping active */
@@ -131,7 +131,7 @@ static inline struct thread_info *curren
 #define _TIF_NEED_RESCHED	(1<<TIF_NEED_RESCHED)
 #define _TIF_POLLING_NRFLAG	(1<<TIF_POLLING_NRFLAG)
 #define _TIF_32BIT		(1<<TIF_32BIT)
-/* #define _SPARE		(1<<SPARE) */
+#define _TIF_RUNLATCH		(1<<TIF_RUNLATCH)
 #define _TIF_ABI_PENDING	(1<<TIF_ABI_PENDING)
 #define _TIF_SYSCALL_AUDIT	(1<<TIF_SYSCALL_AUDIT)
 #define _TIF_SINGLESTEP		(1<<TIF_SINGLESTEP)


From anton at samba.org  Mon Feb 13 18:11:13 2006
From: anton at samba.org (Anton Blanchard)
Date: Mon, 13 Feb 2006 18:11:13 +1100
Subject: [PATCH] ppc64: remove HMT support
Message-ID: <20060213071113.GD7922@krispykreme>


Hi,

HMT support is currently broken and needs to be reworked to play nicely
with the SMT scheduler. Remove the bit-rotted code for the time being.

I also updated an incorrect comment: we enter __secondary_hold with the
physical cpu id in r3.

Anton

Signed-off-by: Anton Blanchard <anton at samba.org>
---

Index: build/arch/powerpc/kernel/head_64.S
===================================================================
--- build.orig/arch/powerpc/kernel/head_64.S	2006-02-13 18:08:18.000000000 +1100
+++ build/arch/powerpc/kernel/head_64.S	2006-02-13 18:08:19.000000000 +1100
@@ -139,7 +139,7 @@ _GLOBAL(__secondary_hold)
 	ori	r24,r24,MSR_RI
 	mtmsrd	r24			/* RI on */
 
-	/* Grab our linux cpu number */
+	/* Grab our physical cpu number */
 	mr	r24,r3
 
 	/* Tell the master cpu we're here */
@@ -153,11 +153,6 @@ _GLOBAL(__secondary_hold)
 	cmpdi	0,r4,1
 	bne	100b
 
-#ifdef CONFIG_HMT
-	SET_REG_IMMEDIATE(r4, .hmt_init)
-	mtctr	r4
-	bctr
-#else
 #ifdef CONFIG_SMP
 	LOAD_REG_IMMEDIATE(r4, .pSeries_secondary_smp_init)
 	mtctr	r4
@@ -166,7 +161,6 @@ _GLOBAL(__secondary_hold)
 #else
 	BUG_OPCODE
 #endif
-#endif
 
 /* This value is used to mark exception frames on the stack. */
 	.section ".toc","aw"
@@ -1810,22 +1804,6 @@ _STATIC(start_here_multiplatform)
 	ori	r6,r6,MSR_RI
 	mtmsrd	r6			/* RI on */
 
-#ifdef CONFIG_HMT
-	/* Start up the second thread on cpu 0 */
-	mfspr	r3,SPRN_PVR
-	srwi	r3,r3,16
-	cmpwi	r3,0x34			/* Pulsar  */
-	beq	90f
-	cmpwi	r3,0x36			/* Icestar */
-	beq	90f
-	cmpwi	r3,0x37			/* SStar   */
-	beq	90f
-	b	91f			/* HMT not supported */
-90:	li	r3,0
-	bl	.hmt_start_secondary
-91:
-#endif
-
 	/* The following gets the stack and TOC set up with the regs */
 	/* pointing to the real addr of the kernel stack.  This is   */
 	/* all done to support the C function call below which sets  */
@@ -1939,77 +1917,8 @@ _STATIC(start_here_common)
 
 	bl .start_kernel
 
-_GLOBAL(hmt_init)
-#ifdef CONFIG_HMT
-	LOAD_REG_IMMEDIATE(r5, hmt_thread_data)
-	mfspr	r7,SPRN_PVR
-	srwi	r7,r7,16
-	cmpwi	r7,0x34			/* Pulsar  */
-	beq	90f
-	cmpwi	r7,0x36			/* Icestar */
-	beq	91f
-	cmpwi	r7,0x37			/* SStar   */
-	beq	91f
-	b	101f
-90:	mfspr	r6,SPRN_PIR
-	andi.	r6,r6,0x1f
-	b	92f
-91:	mfspr	r6,SPRN_PIR
-	andi.	r6,r6,0x3ff
-92:	sldi	r4,r24,3
-	stwx	r6,r5,r4
-	bl	.hmt_start_secondary
-	b	101f
-
-__hmt_secondary_hold:
-	LOAD_REG_IMMEDIATE(r5, hmt_thread_data)
-	clrldi	r5,r5,4
-	li	r7,0
-	mfspr	r6,SPRN_PIR
-	mfspr	r8,SPRN_PVR
-	srwi	r8,r8,16
-	cmpwi	r8,0x34
-	bne	93f
-	andi.	r6,r6,0x1f
-	b	103f
-93:	andi.	r6,r6,0x3f
-
-103:	lwzx	r8,r5,r7
-	cmpw	r8,r6
-	beq	104f
-	addi	r7,r7,8
-	b	103b
-
-104:	addi	r7,r7,4
-	lwzx	r9,r5,r7
-	mr	r24,r9
-101:
-#endif
-	mr	r3,r24
-	b	.pSeries_secondary_smp_init
-
-#ifdef CONFIG_HMT
-_GLOBAL(hmt_start_secondary)
-	LOAD_REG_IMMEDIATE(r4,__hmt_secondary_hold)
-	clrldi	r4,r4,4
-	mtspr	SPRN_NIADORM, r4
-	mfspr	r4, SPRN_MSRDORM
-	li	r5, -65
-	and	r4, r4, r5
-	mtspr	SPRN_MSRDORM, r4
-	lis	r4,0xffef
-	ori	r4,r4,0x7403
-	mtspr	SPRN_TSC, r4
-	li	r4,0x1f4
-	mtspr	SPRN_TST, r4
-	mfspr	r4, SPRN_HID0
-	ori	r4, r4, 0x1
-	mtspr	SPRN_HID0, r4
-	mfspr	r4, SPRN_CTRLF
-	oris	r4, r4, 0x40
-	mtspr	SPRN_CTRLT, r4
-	blr
-#endif
+	/* Not reached */
+	BUG_OPCODE
 
 /*
  * We put a few things here that have to be page-aligned.
Index: build/arch/powerpc/kernel/prom_init.c
===================================================================
--- build.orig/arch/powerpc/kernel/prom_init.c	2006-02-13 15:04:15.000000000 +1100
+++ build/arch/powerpc/kernel/prom_init.c	2006-02-13 18:08:19.000000000 +1100
@@ -205,14 +205,6 @@ static cell_t __initdata regbuf[1024];
 
 #define MAX_CPU_THREADS 2
 
-/* TO GO */
-#ifdef CONFIG_HMT
-struct {
-	unsigned int pir;
-	unsigned int threadid;
-} hmt_thread_data[NR_CPUS];
-#endif /* CONFIG_HMT */
-
 /*
  * Error results ... some OF calls will return "-1" on error, some
  * will return 0, some will return either. To simplify, here are
@@ -1319,10 +1311,6 @@ static void __init prom_hold_cpus(void)
 	 */
 	*spinloop = 0;
 
-#ifdef CONFIG_HMT
-	for (i = 0; i < NR_CPUS; i++)
-		RELOC(hmt_thread_data)[i].pir = 0xdeadbeef;
-#endif
 	/* look for cpus */
 	for (node = 0; prom_next_node(&node); ) {
 		type[0] = 0;
@@ -1389,32 +1377,6 @@ static void __init prom_hold_cpus(void)
 		/* Reserve cpu #s for secondary threads.   They start later. */
 		cpuid += cpu_threads;
 	}
-#ifdef CONFIG_HMT
-	/* Only enable HMT on processors that provide support. */
-	if (__is_processor(PV_PULSAR) || 
-	    __is_processor(PV_ICESTAR) ||
-	    __is_processor(PV_SSTAR)) {
-		prom_printf("    starting secondary threads\n");
-
-		for (i = 0; i < NR_CPUS; i += 2) {
-			if (!cpu_online(i))
-				continue;
-
-			if (i == 0) {
-				unsigned long pir = mfspr(SPRN_PIR);
-				if (__is_processor(PV_PULSAR)) {
-					RELOC(hmt_thread_data)[i].pir = 
-						pir & 0x1f;
-				} else {
-					RELOC(hmt_thread_data)[i].pir = 
-						pir & 0x3ff;
-				}
-			}
-		}
-	} else {
-		prom_printf("Processor is not HMT capable\n");
-	}
-#endif
 
 	if (cpuid > NR_CPUS)
 		prom_printf("WARNING: maximum CPUs (" __stringify(NR_CPUS)
Index: build/arch/powerpc/platforms/pseries/Kconfig
===================================================================
--- build.orig/arch/powerpc/platforms/pseries/Kconfig	2005-11-05 20:51:08.000000000 +1100
+++ build/arch/powerpc/platforms/pseries/Kconfig	2006-02-13 18:09:21.000000000 +1100
@@ -9,13 +9,6 @@ config PPC_SPLPAR
 	  processors, that is, which share physical processors between
 	  two or more partitions.
 
-config HMT
-	bool "Hardware multithreading"
-	depends on SMP && PPC_PSERIES && BROKEN
-	help
-	  This option enables hardware multithreading on RS64 cpus.
-	  pSeries systems p620 and p660 have such a cpu type.
-
 config EEH
 	bool "PCI Extended Error Handling (EEH)" if EMBEDDED
 	depends on PPC_PSERIES


From utz.bacher at de.ibm.com  Tue Feb 14 07:36:25 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Mon, 13 Feb 2006 21:36:25 +0100 (CET)
Subject: [FYI/PATCH 2/3] fix IIC device tree interpretation for Cell
Message-ID: <Pine.LNX.4.62.0602132134010.19276@tuxmkge1.boeblingen.de.ibm.com>

This patch applies on top of Arnd's posting (patch id 4188) from 1/18 (on
top of 2.6.15.4).
It fixes the Linux interpretation of the Cell SLOF device tree IIC target
IDs and is recommended for running on a Cell blade today.
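
As an illustration, the new target-ID computation from the patch below can
be written as a small standalone helper. This is only a sketch: the function
name iic_target_id is made up for this example, and the values it produces
follow from the expression in the patch, not from the hardware documentation.

```c
/*
 * Sketch of the target-ID expression used in the patch below.
 * "cpu" stands in for np[0] / np[1]; iic_target_id is a hypothetical name.
 */
static unsigned int iic_target_id(unsigned int cpu)
{
	/* bit 1 of the CPU number is moved up to bit 4 of the target ID */
	unsigned int high = (cpu & 2) << 3;
	/* bit 0 of the CPU number selects 0xf (odd) or 0xe (even) */
	unsigned int low = (cpu & 1) ? 0xf : 0xe;

	return high + low;
}
```

With this expression, CPU numbers 0 to 3 give 0x0e, 0x0f, 0x1e and 0x1f.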

Cc: Jens Osterkamp <Jens.Osterkamp at de.ibm.com>
Cc: Arnd Bergmann <arndb at de.ibm.com>
From: Gerhard Stenzel <gerhard.stenzel at de.ibm.com>
Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Index: linux-2.6.15.4/arch/powerpc/platforms/cell/interrupt.c
===================================================================
--- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/interrupt.c
+++ linux-2.6.15.4/arch/powerpc/platforms/cell/interrupt.c
@@ -254,13 +254,13 @@
                 iic = &per_cpu(iic, np[0]);
                 iic->regs = __ioremap(regs[0], sizeof(struct iic_regs),
                                       _PAGE_NO_CACHE);
-               iic->target_id = (np[0] << 4) + 0xe;
+               iic->target_id = ((np[0] & 2) << 3) + ((np[0] & 1) ? 0xf : 0xe);
                 printk("IIC for CPU %d at %lx mapped to %p\n", np[0], regs[0], iic->regs);

                 iic = &per_cpu(iic, np[1]);
                 iic->regs = __ioremap(regs[2], sizeof(struct iic_regs),
                                       _PAGE_NO_CACHE);
-               iic->target_id = (np[1] << 3) + 0xe;
+               iic->target_id = ((np[1] & 2) << 3) + ((np[1] & 1) ? 0xf : 0xe);
                 printk("IIC for CPU %d at %lx mapped to %p\n", np[1], regs[2], iic->regs);

                 found++;


From utz.bacher at de.ibm.com  Tue Feb 14 07:33:42 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Mon, 13 Feb 2006 21:33:42 +0100 (CET)
Subject: [FYI/PATCH 1/3] reenable CONFIG_GEN_RTC for Cell
Message-ID: <Pine.LNX.4.62.0602132131360.19276@tuxmkge1.boeblingen.de.ibm.com>

This patch applies on top of Arnd's posting (patch id 4182) from 1/18 (on
top of 2.6.15.4).
It re-enables CONFIG_GEN_RTC, which allows the clock to be set on Cell
blades, and is recommended for running on such a system today.

Cc: Arnd Bergmann <arndb at de.ibm.com>
From: Gerhard Stenzel <gerhard.stenzel at de.ibm.com>
Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Index: linux-2.6.15.4/arch/powerpc/configs/cell_defconfig
===================================================================
--- linux-2.6.15.4.orig/arch/powerpc/configs/cell_defconfig
+++ linux-2.6.15.4/arch/powerpc/configs/cell_defconfig
@@ -654,7 +654,7 @@
  # CONFIG_PCIPCWATCHDOG is not set
  # CONFIG_WDTPCI is not set
  # CONFIG_RTC is not set
-# CONFIG_GEN_RTC is not set
+CONFIG_GEN_RTC=y
  # CONFIG_DTLK is not set
  # CONFIG_R3964 is not set
  # CONFIG_APPLICOM is not set


From HPENNER at de.ibm.com  Tue Feb 14 05:19:10 2006
From: HPENNER at de.ibm.com (Hartmut Penner)
Date: Mon, 13 Feb 2006 19:19:10 +0100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139779462.5247.30.camel@localhost.localdomain>
Message-ID: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>

Hello
      the initial value set by FW of the HID6 is 0x00010034_00000000.

I would like to support large pages in the firmware, but I need to know
exactly which properties I have to set.
I looked at the Linux code, but I am still not quite sure what values to put
into ibm,segment-page-sizes.
Could somebody enlighten me on how to find out? I am in Rochester right now;
is there somebody here I could talk to about this?

      regards, Hartmut


From:     Benjamin Herrenschmidt <benh at kernel.crashing.org>
Date:     02/12/06 10:24 PM
To:       Arnd Bergmann <arnd at arndb.de>
cc:       Masato.Noguchi at jp.sony.com, linuxppc64-dev at ozlabs.org, geoffrey.levand at am.sony.com, Hartmut Penner/Germany/IBM at IBMDE
Subject:  Re: AW: Re: __setup_cpu_be problem


> The current firmware on the Cell blades does neither the setup of
> the HID6 register nor have the correct tables in the device tree.
>
> Since I'm still currently sitting in a garden in NZ instead of the
> Böblingen lab, I can't find out what the HID6 power-on defaults
> are. We might get away with just leaving the default there, but that
> might prevent us from using 16M and/or 64k pages and there are
> definitely some applications which depend on 16M hugetlb mappings
> on Cell.

Yes, however, how widely distributed and "frozen" is this current
Cell firmware ? I mean, do we really need to add a workaround to the
kernel instead of just fixing the firmware here ?

> The two problems we are facing currently are:
> - If HID6 defaults to disabling 16M large pages, the kernel will
>   get the wrong information from the CPU features and applications
>   that use it break. The firmware should add the setup of HID6
>   _now_, but we also should be prepared for users of old firmware
>   that want to upgrade their kernel without upgrading the firmware
>   at the same time.

Do we really need to support old/broken firmware ? It's not like we had
a released product all over the field...

> - We want to use 64k pages in the future, so the firmware needs to
>   add the 'ibm,segment-page-sizes' property ASAP, preferably at
>   the same time they start setting up HID6. I currently have a
>   hack for the kernel to override that, but we're in the process
>   of eliminating all the special hacks that won't make it into
>   the mainline kernel.

The only things you need are this property and the new
ibm,pa-features property, for which I need to dig out the latest spec. The
problem is that the kernel will currently not enable 64k pages on any
processor because a feature bit is (intentionally) missing from the
cputable. That bit will be extracted from ibm,pa-features, at least on
pSeries. It's the bit indicating that L=1 works for cache-inhibited
mappings.

> Yes, 1M mappings are probably not of much use to us, and other OSs
> already do whatever they like ;-).

Sure. Note that the firmware can still set HID6 to 1M pages and put the
appropriate entries in the device-tree for 1M large pages. Linux won't
be able to use them as-is, but at least the device-tree info will
be sane. I don't want to enter a debate about whether we should be able to
change HID6 etc. right now. It's more a firmware configuration issue
as far as I'm concerned.

> Then please try to at least send the spec or a link to Hartmut's IBM
> internal address (hpenner at de.ibm.com). I already pointed him to the
> linux code when it was initially merged, but he argued that reverse
> engineering that code is not good enough to be sure to get the
> property right and not having it in there is better than having incorrect
> properties.

Will do
Ben.


From arnd at arndb.de  Tue Feb 14 09:17:31 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 13 Feb 2006 23:17:31 +0100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
References: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
Message-ID: <200602132317.32034.arnd@arndb.de>

On Monday 13 February 2006 19:19, Hartmut Penner wrote:
>       the initial value set by FW of the HID6 is 0x00010034_00000000.

Ok, good: Both large page sizes are set to 16M, neither is 1M or 64k.

That means that we can just rip out the HID6 setup from the kernel
without losing the ability for 16M pages. Geoff, please submit a
patch to replace __setup_cpu_be with __setup_cpu_power4 if that
solves your problem.
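
For reference, the HID6 update that __setup_cpu_be performs can be sketched
in C as below. The 0x02000 constant and the clear-then-set sequence come from
the routine in cpu_setup_power4.S; the numeric value of HID6_LB (0x0F << 12,
applied to the upper word) is an assumption from memory of reg.h and should
be checked against the header. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: HID6_LB is the 4-bit field 0x0F << 12; verify against reg.h. */
#define HID6_LB 0x0000f000ULL

/* Mirrors the assembly: clear the LB field, then set LP=0: 16MB, LP=1: 64KB. */
static uint64_t hid6_set_16m_64k(uint64_t hid6)
{
	hid6 &= ~(HID6_LB << 32);	/* nor/and pair in the assembly */
	hid6 |= 0x02000ULL << 32;	/* value loaded by "addi r3, 0, 0x02000" */
	return hid6;
}

/* Extracts the LB field under the same assumption about its position. */
static unsigned int hid6_lb_field(uint64_t hid6)
{
	return (unsigned int)((hid6 >> 44) & 0xf);
}
```

Under that assumption, the firmware value 0x00010034_00000000 has an LB field
of 0, and the routine would change it to 0x00012034_00000000 (LB field 2).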

The new 64k page support has never worked so far on Cell because
of missing spufs code for this, so we don't get a regression either
way. We still need the firmware changes (HID6 setup and the device
tree properties) in order to support 64k pages, but we don't need
to worry about breaking stuff in the process.

	Arnd <><


From arnd at arndb.de  Tue Feb 14 09:24:44 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 13 Feb 2006 23:24:44 +0100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139779462.5247.30.camel@localhost.localdomain>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<200602120552.26164.arnd@arndb.de>
	<1139779462.5247.30.camel@localhost.localdomain>
Message-ID: <200602132324.45433.arnd@arndb.de>

On Sunday 12 February 2006 22:24, Benjamin Herrenschmidt wrote:
> > The current firmware on the Cell blades does neither the setup of
> > the HID6 register nor have the correct tables in the device tree.
> > 
> > Since I'm still currently sitting in a garden in NZ instead of the
> > Böblingen lab, I can't find out what the HID6 power-on defaults
> > are. We might get away with just leaving the default there, but that
> > might prevent us from using 16M and/or 64k pages and there are
> > definitely some applications which depend on 16M hugetlb mappings
> > on Cell. 
> 
> Yes, however, how widely distributed and "frozen" is this current
> Cell firmware ? I mean, do we really need to add a workaround to the
> kernel instead of just fixing the firmware here ?

The firmware update procedure is a little tricky, so our firmware
people decided to do as few updates as possible, which means we won't
have small 'hotfix' updates going to the customer.

> > The two problems we are facing currently are:
> > - If HID6 defaults to disabling 16M large pages, the kernel will
> >   get the wrong information from the CPU features and applications
> >   that use it break. The firmware should add the setup of HID6
> >   _now_, but we also should be prepared for users of old firmware
> >   that want to upgrade their kernel without upgrading the firmware
> >   at the same time.
> 
> Do we really need to support old/broken firmware ? It's not like we had
> a released product all over the field...

Basically, we do want to support old firmware that went out in our
customer shipments, but as I wrote in the other mail, we don't need
to worry about that in this case. Also, the requirement is only to
be able to boot with the mainline kernel; for a production setup, users
of the currently shipping hardware would also need other patches, e.g.
to work around performance errata in the CPU stepping.

I expect that for the systems that ship in larger quantities (Mercury,
Sony and IBM ones in the foreseeable future) we can do without ugly
hacks of that sort.

	Arnd <><


From benh at kernel.crashing.org  Tue Feb 14 09:40:49 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Tue, 14 Feb 2006 09:40:49 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <200602132317.32034.arnd@arndb.de>
References: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
	<200602132317.32034.arnd@arndb.de>
Message-ID: <1139870450.5237.34.camel@localhost.localdomain>

On Mon, 2006-02-13 at 23:17 +0100, Arnd Bergmann wrote:
> On Monday 13 February 2006 19:19, Hartmut Penner wrote:
> >       the initial value set by FW of the HID6 is 0x00010034_00000000.
> 
> Ok, good: Both large page sizes are set to 16M, neither is 1M or 64k.

That should be changed. One should be set to 64K and the other to 16M.
At this point, it's not yet clear how the kernel will make use of 64K
pages; it requires a feature bit that is never set (indicating that
cache-inhibited L pages are supported). It will be provided by the
ibm,pa-features property in the long run, but last I looked, it wasn't yet
implemented by any firmware.

> That means that we can just rip out the HID6 setup from the kernel
> without losing the ability for 16M pages. Geoff, please submit a
> patch to replace __setup_cpu_be with __setup_cpu_power4 if that
> solves your problem.
>
> The new 64k page support has never worked so far on Cell because
> of missing spufs code for this, so we don't get a regression either
> way. We still need the firmware changes (HID6 setup and the device
> tree properties) in order to support 64k pages, but we don't need
> to worry about breaking stuff in the process.

Ben.


From utz.bacher at de.ibm.com  Tue Feb 14 12:58:34 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Tue, 14 Feb 2006 02:58:34 +0100 (CET)
Subject: [FYI/PATCH 3/3] increase direct mapping sizes for spufs
Message-ID: <Pine.LNX.4.62.0602140253300.32569@tuxmkge1.boeblingen.de.ibm.com>

This patch applies on top of Arnd's postings (patch ids 4192, 4185, 4190)
from 1/17 (on top of 2.6.15.4).
It maps 16k instead of 4k for each problem-state mapped subarea. The mfc
mapping contains the Multisource Synchronization Area, the MFC Command
Parameter Area and the MFC Command Queue Control Area; the cntl mapping
contains the SPU Control Area while the signal1 and signal2 mappings
contain the relevant Signal-Notification Area.
This allows libspe to build on direct problem state mapping and is
recommended for running on a Cell blade today. The code may change in the
near future.
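
The offset check that goes with the larger mapping can be sketched as
follows. This is a minimal illustration, not the spufs code: the macro and
function names are hypothetical, and 0x4000 is simply the 16k size proposed
here. Note that a test of the form "offset > size" would still accept an
offset equal to the size, which is why the comparison below is strict.

```c
#include <stdbool.h>

/* Hypothetical name; 16k is the per-subarea size proposed in this patch. */
#define PS_MAP_SIZE 0x4000UL

/* An offset is usable only if it lies strictly below the mapping size. */
static bool ps_offset_valid(unsigned long offset)
{
	return offset < PS_MAP_SIZE;
}
```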

Cc: Mark Nutter <mnutter at us.ibm.com>
Cc: Arnd Bergmann <arndb at de.ibm.com>
From: Ulrich Weigand <Ulrich.Weigand at de.ibm.com>
Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Index: linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/context.c
===================================================================
--- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/spufs/context.c
+++ linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/context.c
@@ -116,13 +116,13 @@
  	if (ctx->local_store)
  		unmap_mapping_range(ctx->local_store, 0, LS_SIZE, 1);
  	if (ctx->mfc)
-		unmap_mapping_range(ctx->mfc, 0, 0x1000, 1);
+		unmap_mapping_range(ctx->mfc, 0, 0x4000, 1);
  	if (ctx->cntl)
-		unmap_mapping_range(ctx->cntl, 0, 0x1000, 1);
+		unmap_mapping_range(ctx->cntl, 0, 0x4000, 1);
  	if (ctx->signal1)
-		unmap_mapping_range(ctx->signal1, 0, 0x1000, 1);
+		unmap_mapping_range(ctx->signal1, 0, 0x4000, 1);
  	if (ctx->signal2)
-		unmap_mapping_range(ctx->signal2, 0, 0x1000, 1);
+		unmap_mapping_range(ctx->signal2, 0, 0x4000, 1);
  }

  int spu_acquire_runnable(struct spu_context *ctx)
Index: linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/file.c
===================================================================
--- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/spufs/file.c
+++ linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/file.c
@@ -158,7 +158,7 @@
  	int ret;

  	offset += vma->vm_pgoff << PAGE_SHIFT;
-	if (offset > 0x1000)
+	if (offset >= 0x4000)
  		goto out;

  	ret = spu_acquire_runnable(ctx);


From geoffrey.levand at am.sony.com  Tue Feb 14 13:08:01 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Mon, 13 Feb 2006 18:08:01 -0800
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <200602132317.32034.arnd@arndb.de>
References: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
	<200602132317.32034.arnd@arndb.de>
Message-ID: <43F13B81.9020804@am.sony.com>

Arnd Bergmann wrote:
> That means that we can just rip out the HID6 setup from the kernel
> without losing the ability for 16M pages. Geoff, please submit a
> patch to replace __setup_cpu_be with __setup_cpu_power4 if that
> solves your problem.

This patch removes the incorrect and unneeded Cell processor setup 
routine __setup_cpu_be.  __setup_cpu_be improperly accesses the
hypervisor page size configuration at SPR HID6.  The correct behavior
is for the firmware or hypervisor to set up the correct page size
configuration and pass those settings to the kernel in the device-tree.


Signed-off-by: Geoff Levand <geoffrey.levand at am.sony.com>

--


diff --git a/arch/powerpc/kernel/cpu_setup_power4.S b/arch/powerpc/kernel/cpu_setup_power4.S
index b61d86e..5c96481 100644
--- a/arch/powerpc/kernel/cpu_setup_power4.S
+++ b/arch/powerpc/kernel/cpu_setup_power4.S
@@ -76,20 +76,6 @@ _GLOBAL(__970_cpu_preinit)
 _GLOBAL(__setup_cpu_power4)
 	blr
 
-_GLOBAL(__setup_cpu_be)
-        /* Set large page sizes LP=0: 16MB, LP=1: 64KB */
-        addi    r3, 0,  0
-        ori     r3, r3, HID6_LB
-        sldi    r3, r3, 32
-        nor     r3, r3, r3
-        mfspr   r4, SPRN_HID6
-        and     r4, r4, r3
-        addi    r3, 0, 0x02000
-        sldi    r3, r3, 32
-        or      r4, r4, r3
-        mtspr   SPRN_HID6, r4
-	blr
-
 _GLOBAL(__setup_cpu_ppc970)
 	mfspr	r0,SPRN_HID0
 	li	r11,5			/* clear DOZE and SLEEP */
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 3191be7..19fc380 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -33,7 +33,6 @@ EXPORT_SYMBOL(cur_cpu_spec);
 #ifdef CONFIG_PPC64
 extern void __setup_cpu_power3(unsigned long offset, struct cpu_spec* spec);
 extern void __setup_cpu_power4(unsigned long offset, struct cpu_spec* spec);
-extern void __setup_cpu_be(unsigned long offset, struct cpu_spec* spec);
 #else
 extern void __setup_cpu_603(unsigned long offset, struct cpu_spec* spec);
 extern void __setup_cpu_604(unsigned long offset, struct cpu_spec* spec);
@@ -270,7 +269,7 @@ struct cpu_spec	cpu_specs[] = {
 			PPC_FEATURE_CELL | PPC_FEATURE_HAS_ALTIVEC_COMP,
 		.icache_bsize		= 128,
 		.dcache_bsize		= 128,
-		.cpu_setup		= __setup_cpu_be,
+		.cpu_setup		= __setup_cpu_power4,
 		.platform		= "ppc-cell-be",
 	},
 	{	/* default match */


From arndb at de.ibm.com  Tue Feb 14 14:46:18 2006
From: arndb at de.ibm.com (Arnd Bergmann)
Date: Tue, 14 Feb 2006 04:46:18 +0100
Subject: [FYI/PATCH 3/3] increase direct mapping sizes for spufs
In-Reply-To: <Pine.LNX.4.62.0602140253300.32569@tuxmkge1.boeblingen.de.ibm.com>
References: <Pine.LNX.4.62.0602140253300.32569@tuxmkge1.boeblingen.de.ibm.com>
Message-ID: <200602140446.19422.arndb@de.ibm.com>

On Tuesday 14 February 2006 02:58, Utz Bacher wrote:
> This patch applies on top of Arnd's postings (patch ids 4192, 4185, 4190)
> from 1/17 (on top of 2.6.15.4).
> It maps 16k instead of 4k for each problem-state mapped subarea. The mfc
> mapping contains the Multisource Synchronization Area, the MFC Command
> Parameter Area and the MFC Command Queue Control Area; the cntl mapping
> contains the SPU Control Area while the signal1 and signal2 mappings
> contain the relevant Signal-Notification Area.
> This allows libspe to build on direct problem state mapping and is
> recommended for running on a Cell blade today. The code may change in the
> near future.
> 
> Cc: Mark Nutter <mnutter at us.ibm.com>
> Cc: Arnd Bergmann <arndb at de.ibm.com>
> From: Ulrich Weigand <Ulrich.Weigand at de.ibm.com>
> Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Nack.

Both the intent and the implementation are flawed. Please keep the
size of each problem state mapping to one page. Your description
is not completely clear on the actual problem. I assume that the code
that I posted earlier had the wrong start address for the MFC page;
if that's what happened, please just fix the start address.

Last time we discussed this, the understanding was that the Multisource
Synchronization Area does not need to be exposed to user space. If a 
need for that has now come up, we should add a new file for it that
also allows synchronizing with file operations. Alternatively, we
could implement that as a 'fsync' file operation on the mfc file.

	Arnd <><


From sfr at canb.auug.org.au  Tue Feb 14 18:32:59 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Tue, 14 Feb 2006 18:32:59 +1100
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <17393.16261.768862.724265@cargo.ozlabs.ibm.com>
References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com>
Message-ID: <20060214183259.28a6a501.sfr@canb.auug.org.au>

Hi Manish,

Paul has asked me to have a look at this patch and to also consider what
has been done for similar work in s390.   I will compare this to s390
tomorrow, but for now here are some preliminary comments:

> Index: linux-2.6.15-rc6/arch/powerpc/kernel/asm-offsets.c
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/asm-offsets.c	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/asm-offsets.c	2006-01-17 15:39:03.000000000 -0800
> @@ -144,6 +144,10 @@
>  	DEFINE(LPPACASRR1, offsetof(struct lppaca, saved_srr1));
>  	DEFINE(LPPACAANYINT, offsetof(struct lppaca, int_dword.any_int));
>  	DEFINE(LPPACADECRINT, offsetof(struct lppaca, int_dword.fields.decr_int));
> +	DEFINE(PACA_STARTB, offsetof(struct paca_struct, start_tb));
> +	DEFINE(PACA_CDFLAG, offsetof(struct paca_struct, cdflag));
> +	DEFINE(PACA_DELTATB, offsetof(struct paca_struct, delta_tb));

Why not PACA_START_TB and PACA_DELTA_TB?  Also, start_tb and delta_tb don't really
store time base values, but PURR values.

> Index: linux-2.6.15-rc6/arch/powerpc/kernel/entry_64.S
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/entry_64.S	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/entry_64.S	2006-01-17 15:39:03.000000000 -0800
> @@ -520,7 +520,19 @@
>  	 * r13 is our per cpu area, only restore it if we are returning to
>  	 * userspace
>  	 */
> +
>  	beq	1f
> +BEGIN_FTR_SECTION
> +	li	r10,0
> +	stb	r10,PACA_CDFLAG(r13)

cdflag gets cleared here but is not set or used anywhere else.

> Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c	2006-01-17 21:20:25.000000000 -0800
> @@ -243,6 +243,7 @@
>  	struct thread_struct *new_thread, *old_thread;
>  	unsigned long flags;
>  	struct task_struct *last;
> +	struct paca_struct *lpaca;

This could have been declared below (near pd)

>  
>  #ifdef CONFIG_SMP
>  	/* avoid complexity of lazy save/restore of fpu
> @@ -313,19 +314,34 @@
>  	new_thread = &new->thread;
>  	old_thread = &current->thread;
>  
> -#ifdef CONFIG_PPC64
> -	/*
> -	 * Collect processor utilization data per process
> -	 */
> -	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
> -		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
> -		long unsigned start_tb, current_tb;
> -		start_tb = old_thread->start_tb;
> -		cu->current_tb = current_tb = mfspr(SPRN_PURR);
> -		old_thread->accum_tb += (current_tb - start_tb);
> -		new_thread->start_tb = current_tb;
> +
> +/* Collect cpu_util utilization data per process and per processor wise */
> +	if (cpu_has_feature(CPU_FTR_PURR)) {
> +		struct cpu_usage *pd = &__get_cpu_var(cpu_usage_array);

Was there some good reason to change this variable name from cu to pd?

> +		long unsigned start_cpu_util, current_cpu_util;
> +
> +		if ( old_thread->start_cpu_util )
> +			pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
> +		else
> +		   	old_thread->start_cpu_util = pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);

Probably better would be:
	pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
	if (old_thread->start_cpu_util == 0)
		old_thread->start_cpu_util = current_cpu_util;
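
To see the effect of that restructuring outside the kernel, here is a small
user-space sketch. The struct and function names are hypothetical stand-ins
for the ones in the patch, and the PURR value is passed in as a parameter
instead of being read with mfspr(SPRN_PURR).

```c
/* Hypothetical stand-ins for the per-thread and per-cpu fields in the patch. */
struct thread_sample {
	unsigned long start_cpu_util;	/* PURR value when switched in */
	unsigned long total;		/* accumulated PURR delta */
};

struct cpu_sample {
	unsigned long current_cpu_util;	/* last PURR value seen on this cpu */
};

/* purr_now replaces the mfspr(SPRN_PURR) read so the logic can be tested. */
static void account_switch(struct cpu_sample *pd,
			   struct thread_sample *old_thread,
			   struct thread_sample *new_thread,
			   unsigned long purr_now)
{
	unsigned long current_cpu_util;

	/* Sample once, then fix up a thread that has no start value yet. */
	pd->current_cpu_util = current_cpu_util = purr_now;
	if (old_thread->start_cpu_util == 0)
		old_thread->start_cpu_util = current_cpu_util;

	old_thread->total += current_cpu_util - old_thread->start_cpu_util;
	new_thread->start_cpu_util = current_cpu_util;
}
```

A thread seen for the first time accumulates a delta of zero, and later
switches add the difference between the two samples.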

> +
> +		/* store delta_tb & mftb into cpu_util data array for    *
> +		 * later easy access otherwise you have to do run_on_cpu *
> +		 * which is expensive             			 */

Comment style should be:

	/* store delta_tb & mftb into cpu_util data array for
	 * later easy access otherwise you have to do run_on_cpu
	 * which is expensive
	 */

> +
> +		lpaca = get_paca();
> +		pd->collected_krntb = lpaca->delta_tb;
> +		pd->collected_timebase = mftb();
> +
> +		start_cpu_util = old_thread->start_cpu_util;
> +		old_thread->total_dp += (current_cpu_util - start_cpu_util);
> +
> +		/* collect time from entry into kernel to now and account it *
> +		 * in process kernel time 				     */

Comment style again.

> +
> +		old_thread->proc_stime += (current_cpu_util - lpaca->start_tb);
> +		new_thread->start_cpu_util = current_cpu_util;
>  	}
> -#endif
>  
>  	local_irq_save(flags);
>  	last = _switch(old_thread, new_thread);
> Index: linux-2.6.15-rc6/arch/powerpc/kernel/setup_64.c
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/setup_64.c	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/setup_64.c	2006-02-10 11:51:28.197401840 -0800
> @@ -851,3 +851,153 @@

> +static void collect_cpu_deltas(int cpu)

> +static void post_cpu_deltas(int cpu)

Should those two be #ifdef CONFIG_HOTPLUG_CPU ?

> +		/* Initialize the global variables to zero */
> +		offline_cpu_total_tb = 0;
> +		offline_cpu_total_cpu_util = 0;
> +		offline_cpu_total_krncycles = 0;
> +		offline_cpu_total_idle = 0;

You don't need to set these to zero explicitly.

> Index: linux-2.6.15-rc6/arch/powerpc/kernel/sysfs.c
> ===================================================================
> --- linux-2.6.15-rc6.orig/arch/powerpc/kernel/sysfs.c	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/arch/powerpc/kernel/sysfs.c	2006-02-10 12:36:02.375372096 -0800
> @@ -232,8 +240,11 @@
>  	if (cur_cpu_spec->num_pmcs >= 8)
>  		sysdev_create_file(s, &attr_pmc8);
>  
> -	if (cpu_has_feature(CPU_FTR_SMT))
> +	if (cpu_has_feature(CPU_FTR_PURR)) {
>  		sysdev_create_file(s, &attr_purr);

This will mean that the "purr" file doesn't exist in some cases where it
used to (even if it was useless).  Not sure if that is a problem for any
user mode utilities.

> Index: linux-2.6.15-rc6/include/asm-powerpc/processor.h
> ===================================================================
> --- linux-2.6.15-rc6.orig/include/asm-powerpc/processor.h	2005-12-18 16:36:54.000000000 -0800
> +++ linux-2.6.15-rc6/include/asm-powerpc/processor.h	2006-01-17 21:31:17.000000000 -0800
> @@ -177,6 +177,9 @@
>  #ifdef CONFIG_PPC64
>  	unsigned long	start_tb;	/* Start purr when proc switched in */
>  	unsigned long	accum_tb;	/* Total accumilated purr for process */
> +	unsigned long   start_cpu_util;	/* Start cpu_util when proc switch in */
> +	unsigned long   total_dp ;	/* Total delta cpu_util accum for proc */
> +	unsigned long   proc_stime;	/* Was pad,Now process cpu_util stime */

total_dp and proc_stime are not used anywhere, and start_tb and accum_tb are no longer used.

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/


From mohan at in.ibm.com  Tue Feb 14 23:07:48 2006
From: mohan at in.ibm.com (Mohan Kumar M)
Date: Tue, 14 Feb 2006 17:37:48 +0530
Subject: kexec tools gcc 4.1.0 issue
Message-ID: <1139918867.8472.100.camel@explorer.in.ibm.com>

Hi,

The latest kexec-tools for PPC64 with the purgatory patch
(ppc64-kdump-purgatory-backup-support.patch) does not work with gcc
version 4.1.0 due to a change in object file generation.

Here is the patch to fix this issue.

This patch is created on top of the following level of
kexec-tools:

- kexec-tools-1.101.tar.gz (from eric biederman's site or 
from lse site)
- kexec-tools-1.101-kdump6.patch (consolidated patch posted
on
http://lse.sourceforge.net/kdump/patches/1.101-kdump6/kexec-tools-1.101-kdump6.patch)

Review and suggestions are welcome.

Note:
Resending the patch since it was not delivered to both the fastboot and
linuxppc64-dev mailing lists.

Regards,
Mohan.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: kexec-ppc-gcc410-fix.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060214/03a7bcc7/attachment.txt 

From mohan at in.ibm.com  Tue Feb 14 23:30:03 2006
From: mohan at in.ibm.com (Mohan Kumar M)
Date: Tue, 14 Feb 2006 18:00:03 +0530
Subject: kexec tools gcc warnings cleanup
Message-ID: <1139918870.8472.102.camel@explorer.in.ibm.com>

Clean up the warnings generated by GCC 4.1.0 when compiling kexec-tools.

This patch is created on top of the following level of
kexec-tools:

- kexec-tools-1.101.tar.gz (from eric biederman's site or 
from lse site)
- kexec-tools-1.101-kdump6.patch (consolidated patch posted
on
http://lse.sourceforge.net/kdump/patches/1.101-kdump6/kexec-tools-1.101-kdump6.patch)

Review and suggestions are welcome.

Note:
Resending the patch since it was not delivered to both the fastboot and
linuxppc64-dev mailing lists.

Regards,
Mohan.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: kexec-gcc-cleanup.patch
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060214/2d50a242/attachment.txt 

From geoffrey.levand at am.sony.com  Wed Feb 15 05:22:08 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Tue, 14 Feb 2006 10:22:08 -0800
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <200602132324.45433.arnd@arndb.de>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<200602120552.26164.arnd@arndb.de>
	<1139779462.5247.30.camel@localhost.localdomain>
	<200602132324.45433.arnd@arndb.de>
Message-ID: <43F21FD0.507@am.sony.com>

Arnd Bergmann wrote:
> On Sunday 12 February 2006 22:24, Benjamin Herrenschmidt wrote:
>> > The current firmware on the Cell blades does neither the setup of
>> > the HID6 register nor have the correct tables in the device tree.
>> > 
>> > Since I'm still currently sitting in a garden in NZ instead of the
>> > Böblingen lab, I can't find out what the HID6 power-on defaults
>> > are. We might get away with just leaving the default there, but that
>> > might prevent us from using 16M and/or 64k pages and there are
>> > definitely some applications which depend on 16M hugetlb mappings
>> > on Cell. 
>> 
>> Yes, however, how widely distributed and "frozen" is this current
>> Cell firmware ? I mean, do we really need to add a workaround to the
>> kernel instead of just fixing the firmware here ?
> 
> The firmware update procedure is a little tricky, so our firmware
> people decided to do as few updates as possible, which means we won't
> have small 'hotfix' updates going to the customer.
> 
>> > The two problems we are facing currently are:
>> > - If HID6 defaults to disabling 16M large pages, the kernel will
>> >   get the wrong information from the CPU features and applications
>> >   that use it break. The firmware should add the setup of HID6
>> >   _now_, but we also should be prepared for users of old firmware
>> >   that want to upgrade their kernel without upgrading the firmware
>> >   at the same time.
>> 
>> Do we really need to support old/broken firmware ? It's not like we had
>> a released product all over the field...
> 
> Basically, we do want to support old firmware that went out in our
> customer shippings, but as I wrote in the other mail, we don't need
> to worry about that in this case. Also, the requirement is only to
> be able to boot with the mainline kernel, for production setup, users
> of the currently shipping hardware would also need other patches e.g.
> to work around performance errata in the CPU stepping.

Sorry about changing my mind on this Ben, but after reading the Book 4
docs on page sizes I see that each partition can have independent
page size settings.  I made the wrong assumption that all partitions
needed the same size setting.  Based on this, and on Arnd's comments,
I think in general we will need to set up page sizes in the kernel.
This is particularly true if we set up 16M + 64k pages for the spufs,
since the cpu default is 16M + 16M, which is probably what most
firmware will use.

At any rate, I don't think we need to worry about it so much now,
since those settings can be handled inside the platform code.  If
it makes sense later we can have an interface to access the page
size settings.

-Geoff


From benh at kernel.crashing.org  Wed Feb 15 08:22:09 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Wed, 15 Feb 2006 08:22:09 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <43F21FD0.507@am.sony.com>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<200602120552.26164.arnd@arndb.de>
	<1139779462.5247.30.camel@localhost.localdomain>
	<200602132324.45433.arnd@arndb.de>  <43F21FD0.507@am.sony.com>
Message-ID: <1139952130.7903.24.camel@localhost.localdomain>


> 
> Sorry about changing my mind on this Ben, but after reading the Book 4
> docs on page sizes I see that each partition can have independent
> page size settings.  I made the wrong assumption that all partitions
> needed the same size setting.  Based on this, and on Arnd's comments,
> I think in general we will need to setup page sizes in the kernel.
> This is particularly true if we setup 16M + 64k pages for the spufs,
> since the cpu default is 16M + 16M, which is probably what most
> firmware will use.
> 
> At any rate, I don't think we need to worry about it so much now,
> since those settings can be handled inside the platform code.  If
> it makes sense later we can have an interface to access the page
> size settings.

Well, I'll have to look more closely at the initialisation then. The
kernel currently assumes that non-legacy page sizes (that is something
other than 4k and 16M) are completely described at boot by the
device-tree, and that would imply HID6 has already been set up.

If we want to do something differently, that means that we need a "hook"
for the platform code to fill the page size description array. However,
currently, there is no platform hook between that array being filled
from the device-tree and the memory management being initialized based
on those data.

An option would be to let platform probe() functions fill the table and
set a variable telling the later hash init code to ignore the
device-tree description of page sizes ...
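
A minimal sketch of that last option, written as plain user-space C so
it can be compiled on its own. Everything in it is an assumption made
for illustration: the psize_defs table, the psize_set_by_platform flag
and the function names are hypothetical, and the device-tree scan is
replaced by a stand-in that reports only the legacy sizes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page size indices; the names are illustrative only. */
enum { PSIZE_4K, PSIZE_64K, PSIZE_16M, PSIZE_COUNT };

struct psize_def {
	unsigned int shift;	/* log2 of the page size, 0 = not available */
};

static struct psize_def psize_defs[PSIZE_COUNT];

/* Set by a platform probe() that has filled psize_defs itself. */
static bool psize_set_by_platform;

/* Platform probe(): describe the sizes the platform code selected. */
static void platform_probe_set_psizes(void)
{
	psize_defs[PSIZE_4K].shift = 12;
	psize_defs[PSIZE_64K].shift = 16;
	psize_defs[PSIZE_16M].shift = 24;
	psize_set_by_platform = true;
}

/* Stand-in for the device-tree scan: only the legacy sizes are found. */
static void psize_scan_device_tree(void)
{
	psize_defs[PSIZE_4K].shift = 12;
	psize_defs[PSIZE_64K].shift = 0;
	psize_defs[PSIZE_16M].shift = 24;
}

/* Later hash init: keep the platform's table if it provided one. */
static void hash_init_psizes(void)
{
	if (!psize_set_by_platform)
		psize_scan_device_tree();
	/* 4k must always be available, whoever filled the table. */
	assert(psize_defs[PSIZE_4K].shift == 12);
}
```

Calling platform_probe_set_psizes() before hash_init_psizes() leaves the
64k entry in place; without it the table falls back to whatever the
device-tree stand-in reports.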

Ben.


From olof at lixom.net  Wed Feb 15 08:27:46 2006
From: olof at lixom.net (Olof Johansson)
Date: Tue, 14 Feb 2006 15:27:46 -0600
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139952130.7903.24.camel@localhost.localdomain>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<200602120552.26164.arnd@arndb.de>
	<1139779462.5247.30.camel@localhost.localdomain>
	<200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com>
	<1139952130.7903.24.camel@localhost.localdomain>
Message-ID: <20060214212746.GA6291@pb15.lixom.net>

On Wed, Feb 15, 2006 at 08:22:09AM +1100, Benjamin Herrenschmidt wrote:

> Well, I'll have to look more closely at the initialisation then. The
> kernel currently assumes that non-legacy page sizes (that is something
> other than 4k and 16M) are completely described at boot by the
> device-tree, and that would imply HID6 has already been set up.

Isn't this something that should be configured in the hypervisor /
partition firmware on the machine then, instead of hacked into the
kernel? The hypervisor would of course switch HID contents when
dispatching different partitions, if needed.


-Olof


From d.herrendoerfer at de.ibm.com  Wed Feb 15 01:45:07 2006
From: d.herrendoerfer at de.ibm.com (Dirk Herrendoerfer)
Date: Tue, 14 Feb 2006 15:45:07 +0100
Subject: libspe-1.0.1
Message-ID: <8b812ffd71fa2e5bedb463af40cf3184@de.ibm.com>

This is the current snapshot of libspe.
It is an update to version 1.0, conforming to the JSRE SPE
Runtime Management Library documentation version 1.1.

New in this release is the availability of direct problem state mapping,
and PPE-initiated DMA.

   D. Herrendoerfer


-------------- next part --------------
A non-text attachment was scrubbed...
Name: libspe-1.0.1.tar.gz
Type: application/x-gzip
Size: 40465 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060214/8422c1a3/attachment.bin 

From hollis at penguinppc.org  Wed Feb 15 09:31:49 2006
From: hollis at penguinppc.org (Hollis Blanchard)
Date: Tue, 14 Feb 2006 16:31:49 -0600
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <43F21FD0.507@am.sony.com>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<200602120552.26164.arnd@arndb.de>
	<1139779462.5247.30.camel@localhost.localdomain>
	<200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com>
Message-ID: <1139956309.780.254385845@webmail.messagingengine.com>

On Tue, 2006-02-14 at 10:22 -0800, Geoff Levand wrote:
> 
> Sorry about changing my mind on this Ben, but after reading the Book 4
> docs on page sizes I see that each partition can have independent
> page size settings.  I made the wrong assumption that all partitions
> needed the same size setting.  Based on this, and on Arnd's comments,
> I think in general we will need to setup page sizes in the kernel. 

On Tue, 2006-02-14 at 15:27 -0600, Olof Johansson wrote: 
> 
> Isn't this something that should be configured in the hypervisor /
> partition firmware on the machine then, instead of hacked into the
> kernel? The hypervisor would of course switch HID contents when
> dispatching different partitions, if needed.

I agree with Olof; I don't follow the original leap of logic.

If every partition can have independent page size settings, and
especially if HID6 is a hypervisor-privileged resource as mentioned
earlier, then the hypervisor needs to set it. Only the hypervisor can
restore each partition's different HID6 value when it switches between
them...
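
To make the argument concrete, here is a rough sketch of a hypervisor
dispatch path that keeps one HID6 value per partition. It is a toy
model under stated assumptions: hid6_reg is an ordinary variable
standing in for the privileged register, and struct partition and
hv_switch_partition() are hypothetical names, not taken from any real
hypervisor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the hypervisor-privileged register; not real hardware access. */
static uint64_t hid6_reg;

struct partition {
	uint64_t saved_hid6;	/* page size selection belonging to this partition */
};

/*
 * Hypervisor dispatch: remember the outgoing partition's value (if any)
 * and load the incoming partition's value.  A guest kernel cannot do
 * this for itself because it never runs at the switch point.
 */
static void hv_switch_partition(struct partition *prev, struct partition *next)
{
	assert(next != NULL);
	if (prev)
		prev->saved_hid6 = hid6_reg;
	hid6_reg = next->saved_hid6;
}
```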

-Hollis


From geoffrey.levand at am.sony.com  Wed Feb 15 10:14:47 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Tue, 14 Feb 2006 15:14:47 -0800
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139956309.780.254385845@webmail.messagingengine.com>
References: <1139956309.780.254385845@webmail.messagingengine.com>
Message-ID: <43F26467.3060407@am.sony.com>

Hollis Blanchard wrote:
> On Tue, 2006-02-14 at 10:22 -0800, Geoff Levand wrote:
>> 
>> Sorry about changing my mind on this Ben, but after reading the Book 4
>> docs on page sizes I see that each partition can have independent
>> page size settings.  I made the wrong assumption that all partitions
>> needed the same size setting.  Based on this, and on Arnd's comments,
>> I think in general we will need to setup page sizes in the kernel. 
> 
> On Tue, 2006-02-14 at 15:27 -0600, Olof Johansson wrote: 
>> 
>> Isn't this something that should be configured in the hypervisor /
>> partition firmware on the machine then, instead of hacked into the
>> kernel? The hypervisor would of course switch HID contents when
>> dispatching different partitions, if needed.
> 
> I agree with Olof; I don't follow the original leap of logic.
> 
> If every partition can have independent page size settings, and
> especially if HID6 is a hypervisor-privileged resource as mentioned
> earlier, then the hypervisor needs to set it. Only the hypervisor can
> restore each partition's different HID6 value when it switches between
> them...
> 

I guess what I am thinking of are cases like when the firmware has no
clue and just uses defaults, or when the firmware or hypervisor
expect the kernel to set what sizes work best for it.  In these
cases a change needs to be initiated by the kernel.

-Geoff


From olof at lixom.net  Wed Feb 15 10:25:47 2006
From: olof at lixom.net (Olof Johansson)
Date: Tue, 14 Feb 2006 17:25:47 -0600
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <43F26467.3060407@am.sony.com>
References: <1139956309.780.254385845@webmail.messagingengine.com>
	<43F26467.3060407@am.sony.com>
Message-ID: <20060214232547.GB6291@pb15.lixom.net>

On Tue, Feb 14, 2006 at 03:14:47PM -0800, Geoff Levand wrote:
> Hollis Blanchard wrote:

> I guess what I am thinking of are cases like when the firmware has no
> clue and just uses defaults, or when the firmware or hypervisor
> expect the kernel to set what sizes work best for it.  In these
> cases a change needs to be initiated by the kernel.

But then give the firmware a clue, and fix it. For the partitioned
case, I'm sure you have ways for an alpha partition to define the
characteristics of a guest partition, and/or a small controller image
running in your hypervisor for similar purposes.

If the kernel needs to set "what works best for it", then you should
look into some of the ELF header flag stuff that IBM pSeries firmware
architects seem to love these days; it seems to be the preferred way
for the OS to tell firmware/hypervisor what it wants.

There should be no need to introduce yet another interface for this. There
are plenty of them already.


-Olof


From hollis at penguinppc.org  Wed Feb 15 10:42:28 2006
From: hollis at penguinppc.org (Hollis Blanchard)
Date: Tue, 14 Feb 2006 17:42:28 -0600
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <20060214232547.GB6291@pb15.lixom.net>
References: <1139956309.780.254385845@webmail.messagingengine.com>
	<43F26467.3060407@am.sony.com> <20060214232547.GB6291@pb15.lixom.net>
Message-ID: <1139960548.9067.254390313@webmail.messagingengine.com>


On Tue, 14 Feb 2006 17:25:47 -0600, "Olof Johansson" <olof at lixom.net>
said:
> On Tue, Feb 14, 2006 at 03:14:47PM -0800, Geoff Levand wrote:
> > I guess what I am thinking of are cases like when the firmware has no
> > clue and just uses defaults, or when the firmware or hypervisor
> > expect the kernel to set what sizes work best for it.  In these
> > cases a change needs to be initiated by the kernel.
> 
> But then give the firmware a clue, and fix it. For the partitioned
> case, I'm sure you have ways for an alpha partition to define the
> characteristics of a guest partition, and/or a small controller image
> running in your hypervisor for similar purposes.
> 
> If the kernel needs to set "what works best for it", then you should
> look into some of the ELF header flag stuff that IBM pSeries firmware
> architects seem to love these days; it seems to be the preferred way
> for the OS to tell firmware/hypervisor what it wants.

The solution used with the IBM pSeries hypervisor (look for "fake_elf"
in prom_init.c, in particular the "rpa_note" part of it) is considered
poor by some kernel developers. Implementing something more
fine-grained, like a "capabilities" hcall/rtas method/whatever would
allow for much more flexibility, which makes sense since the information
we want to communicate will undoubtedly grow on future platforms.

In this case, an hcall requesting two page sizes would allow the
hypervisor to validate the request and implement it as needed on
differing hardware, whether it's via HID6 or some other
hypervisor-privileged mechanism.
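
A minimal sketch of what such a call could look like on the hypervisor
side, assuming a made-up interface: h_set_large_page_sizes(), the
HV_OK / HV_BAD_PARAM return codes and the list of supported shifts are
all hypothetical and only serve to show the validate-then-apply idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes, not taken from any real hcall ABI. */
#define HV_OK		0
#define HV_BAD_PARAM	(-1)

/* Large page shifts this imaginary hardware can select: 64k and 16M. */
static const unsigned int hv_supported_shifts[] = { 16, 24 };

struct hv_partition {
	unsigned int lp_shift[2];	/* the two large page sizes in use */
};

static int hv_shift_supported(unsigned int shift)
{
	size_t i;
	size_t n = sizeof(hv_supported_shifts) / sizeof(hv_supported_shifts[0]);

	for (i = 0; i < n; i++)
		if (hv_supported_shifts[i] == shift)
			return 1;
	return 0;
}

/*
 * Validate the guest's request before touching any state.  How the
 * sizes are then programmed (HID6 or something else) stays private to
 * the hypervisor.
 */
static int h_set_large_page_sizes(struct hv_partition *p,
				  unsigned int first, unsigned int second)
{
	assert(p != NULL);
	if (!hv_shift_supported(first) || !hv_shift_supported(second))
		return HV_BAD_PARAM;
	p->lp_shift[0] = first;
	p->lp_shift[1] = second;
	return HV_OK;
}
```

A rejected request leaves the partition's previous settings untouched,
so the guest can fall back to the defaults it booted with.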

-Hollis


From geoffrey.levand at am.sony.com  Wed Feb 15 10:43:35 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Tue, 14 Feb 2006 15:43:35 -0800
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <20060214232547.GB6291@pb15.lixom.net>
References: <1139956309.780.254385845@webmail.messagingengine.com>
	<43F26467.3060407@am.sony.com>
	<20060214232547.GB6291@pb15.lixom.net>
Message-ID: <43F26B27.4080208@am.sony.com>

Olof Johansson wrote:
> On Tue, Feb 14, 2006 at 03:14:47PM -0800, Geoff Levand wrote:
>> Hollis Blanchard wrote:
> 
>> I guess what I am thinking of are cases like when the firmware has no
>> clue and just uses defaults, or when the firmware or hypervisor
>> expect the kernel to set what sizes work best for it.  In these
>> cases a change needs to be initiated by the kernel.
> 
> But then give the firmware a clue, and fix it. For the partitioned
> case, I'm sure you have ways for an alpha partition to define the
> characteristics of a guest partition, and/or a small controller image
> running in your hypervisor for similar purposes.
> 
> If the kernel needs to set "what works best for it", then you should
> look into some of the ELF header flag stuff that IBM pSeries firmware
> architects seem to love these days; it seems to be the preferred way
> for the OS to tell firmware/hypervisor what it wants.
> 
> There should be no need to introduce yet another interface for this. There
> are plenty of them already.
> 

I wish the part where I wrote 'I don't think we need to worry about it
so much now' didn't get cut from the discussion, since I am in agreement
with you to try to avoid some new mechanisms...

-Geoff


From olof at lixom.net  Wed Feb 15 10:45:40 2006
From: olof at lixom.net (Olof Johansson)
Date: Tue, 14 Feb 2006 17:45:40 -0600
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <43F26B27.4080208@am.sony.com>
References: <1139956309.780.254385845@webmail.messagingengine.com>
	<43F26467.3060407@am.sony.com>
	<20060214232547.GB6291@pb15.lixom.net>
	<43F26B27.4080208@am.sony.com>
Message-ID: <20060214234539.GC6291@pb15.lixom.net>

On Tue, Feb 14, 2006 at 03:43:35PM -0800, Geoff Levand wrote:

> I wish the part where I wrote 'I don't think we need to worry about it
> so much now' didn't get cut from the discussion, since I am in agreement
> with you to try to avoid some new mechanisms...

Ok, sounds good. :)


-Olof


From geoffrey.levand at am.sony.com  Wed Feb 15 11:22:46 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Tue, 14 Feb 2006 16:22:46 -0800
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139960548.9067.254390313@webmail.messagingengine.com>
References: <1139956309.780.254385845@webmail.messagingengine.com>
	<43F26467.3060407@am.sony.com>
	<20060214232547.GB6291@pb15.lixom.net>
	<1139960548.9067.254390313@webmail.messagingengine.com>
Message-ID: <43F27456.9080105@am.sony.com>

Hollis Blanchard wrote:
> On Tue, 14 Feb 2006 17:25:47 -0600, "Olof Johansson" <olof at lixom.net>
> said:
>> On Tue, Feb 14, 2006 at 03:14:47PM -0800, Geoff Levand wrote:
>> > I guess what I am thinking of are cases like when the firmware has no
>> > clue and just uses defaults, or when the firmware or hypervisor
>> > expect the kernel to set what sizes work best for it.  In these
>> > cases a change needs to be initiated by the kernel.
>> 
>> But then give the firmware a clue, and fix it. For the partitioned
>> case, I'm sure you have ways for an alpha partition to define the
>> characteristics of a guest partition, and/or a small controller image
>> running in your hypervisor for similar purposes.
>> 
>> If the kernel needs to set "what works best for it", then you should
>> look into some of the ELF header flag stuff that IBM pSeries firmware
>> architects seem to love these days; it seems to be the preferred way
>> for the OS to tell firmware/hypervisor what it wants.
> 
> The solution used with the IBM pSeries hypervisor (look for "fake_elf"
> in prom_init.c, in particular the "rpa_note" part of it) is considered
> poor by some kernel developers. Implementing something more
> fine-grained, like a "capabilities" hcall/rtas method/whatever would
> allow for much more flexibility, which makes sense since the information
> we want to communicate will undoubtedly grow on future platforms.

Certainly looks clunky...

> In this case, an hcall requesting two page sizes would allow the
> hypervisor to validate the request and implement it as needed on
> differing hardware, whether it's via HID6 or some other
> hypervisor-privileged mechanism.

That seems a better way.  Do you have any ideas on what other
'capabilities' are or would be desirable?

-Geoff


From jimix at watson.ibm.com  Wed Feb 15 22:21:57 2006
From: jimix at watson.ibm.com (Jimi Xenidis)
Date: Wed, 15 Feb 2006 06:21:57 -0500
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
References: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
Message-ID: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com>


On Feb 13, 2006, at 1:19 PM, Hartmut Penner wrote:

> I would like to support the large pages in the Firmware, but need  
> to know
> exactly what properties I have to set.

Why? What would you gain from using Large Pages?
Is your FW that big?
Are you thinking of using Large Pages in IO space? Cuz I don't think you
can.
-JX


From paulus at samba.org  Wed Feb 15 22:30:49 2006
From: paulus at samba.org (Paul Mackerras)
Date: Wed, 15 Feb 2006 22:30:49 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com>
References: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
	<68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com>
Message-ID: <17395.4329.1338.898562@cargo.ozlabs.ibm.com>

Jimi Xenidis writes:

> Are you thinking of using Large Pages in IO space? Cuz I don't think you
> can.

Why not?

Paul.


From jimix at watson.ibm.com  Wed Feb 15 23:11:33 2006
From: jimix at watson.ibm.com (Jimi Xenidis)
Date: Wed, 15 Feb 2006 07:11:33 -0500
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <17395.4329.1338.898562@cargo.ozlabs.ibm.com>
References: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
	<68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com>
	<17395.4329.1338.898562@cargo.ozlabs.ibm.com>
Message-ID: <1B5EC317-861F-46D2-AA05-AEC16DFE4737@watson.ibm.com>


On Feb 15, 2006, at 6:30 AM, Paul Mackerras wrote:

> Jimi Xenidis writes:
>
>> Are you thinking of using Large Pages in IO space? Cuz I don't think you
>> can.
>
> Why not?

 From the 970 User manual:
   To avoid accidental large/small page translation aliasing, the 970FX
   implements a HID4 bit (HID4[61]) to disable the large page facility and
   does not permit cache inhibited accesses to an address in a large page.

I'm not 100% sure, but I believe this affects P4, and maybe even P5.
WRT Cell, I believe the BPA_Map can be mapped with large pages but
I'm not sure about "real" devices.
-JX


From ahuja at austin.ibm.com  Wed Feb 15 12:05:55 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Tue, 14 Feb 2006 19:05:55 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <20060214183259.28a6a501.sfr@canb.auug.org.au>
References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com>
	<20060214183259.28a6a501.sfr@canb.auug.org.au>
Message-ID: <43F27E73.30709@austin.ibm.com>

Stephen Rothwell wrote:

>Why not PACA_START_TB and PACA_DELTA_TB?  Also, start_tb and delta_tb don't really
>store time base values, but PURR values.
>  
>

Stephen,

Thanks for the review. I will address all points or make appropriate
changes where required. Just a quick note before I head out for the day;
I will send another detailed response a bit later.

On why these are called tb and not purr: I presume when I dropped the
last patch, we weren't exhaustively tracking anything other than purr,
and Paul M suggested that I use "tb" instead of purr. I would personally
prefer purr, as it makes reading the code easier: it suggests exactly
what is being tracked.

I can try and change it back to purr if Paul M agrees to it.

Thanks,
Manish.


From olof at lixom.net  Thu Feb 16 00:58:56 2006
From: olof at lixom.net (Olof Johansson)
Date: Wed, 15 Feb 2006 07:58:56 -0600
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com>
References: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
	<68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com>
Message-ID: <20060215135856.GE6291@pb15.lixom.net>

On Wed, Feb 15, 2006 at 06:21:57AM -0500, Jimi Xenidis wrote:
> 
> On Feb 13, 2006, at 1:19 PM, Hartmut Penner wrote:
> 
> > I would like to support the large pages in the Firmware, but need  
> > to know
> > exactly what properties I have to set.
> 
> Why? What would you gain from using Large Pages?
> Is your FW that big?

I read that as he wants to have firmware configure which large pages to
use, and use the architected manners in which FW tells the OS which
pagesizes are available and what fields to set to select them, not
necessarily use them to map firmware memory?

> Are you thinking of using Large Pages in IO space? Cuz I don't think you
> can.

POWER5+ can use large I/O pages, at least 64K. Other processors might
also, but I don't know about Cell.

The problem with I/O pages on PPC 2.01 was when the page size was only
selected in the SLB entry. Since it's not a hypervisor resource, the OS
could break isolation requirements by mapping a 16M I/O page that allowed
access to other partitions' I/O space right after its own. That's
probably why PPC970 has the HID bits to disable it.

This changed in PPC 2.02, where the L bit was introduced in the PTE
entry as well. So, there the HV has a chance to verify it being set
properly before allowing hash table insertions, which should allow for
16MB I/O pages also in a partitioned environment. I'm not sure if it's
actually used anywhere or not.
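
As a rough illustration of the kind of check a hypervisor could make
before accepting such a hash table insertion: the sketch below assumes
each partition owns a single contiguous I/O window, which is a
simplification, and struct io_window and hv_io_page_allowed() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous range of real addresses owned by a partition. */
struct io_window {
	uint64_t base;
	uint64_t size;
};

/*
 * Return 1 if the page [ra, ra + 2^shift) is aligned to its own size
 * and lies entirely inside the partition's window, 0 otherwise.  The
 * comparisons are ordered so that nothing overflows.
 */
static int hv_io_page_allowed(const struct io_window *w, uint64_t ra,
			      unsigned int shift)
{
	uint64_t len;

	assert(w != NULL && shift < 64);
	len = (uint64_t)1 << shift;
	if (ra & (len - 1))
		return 0;	/* not aligned to the page size */
	if (ra < w->base)
		return 0;	/* starts below the window */
	if (len > w->size || ra - w->base > w->size - len)
		return 0;	/* runs past the end of the window */
	return 1;
}
```

With a 1M window, a 4k page inside it passes while a 16M page starting
at the same base is refused, which is the isolation problem described
above for the case where nobody gets to check the page size.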


-Olof


From utz.bacher at de.ibm.com  Tue Feb 14 07:38:50 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Mon, 13 Feb 2006 21:38:50 +0100 (CET)
Subject: [FYI/PATCH 3/3] increase direct mapping sizes for spufs
Message-ID: <Pine.LNX.4.62.0602132136340.19276@tuxmkge1.boeblingen.de.ibm.com>

This patch applies on top of Arnd's postings (patch ids 4192, 4185, 4190)
from 1/17 (on top of 2.6.15.4).
It maps 16k instead of 4k for each problem-state mapped subarea. The mfc
mapping contains the Multisource Synchronization Area, the MFC Command
Parameter Area and the MFC Command Queue Control Area; the cntl mapping
contains the SPU Control Area while the signal1 and signal2 mapping
contain the relevant Signal-Notification Area.
This allows libspe to build on direct problem state mapping and is
recommended for running on a Cell blade today. The code may well change
in the near future.

Cc: Mark Nutter <mnutter at us.ibm.com>
Cc: Arnd Bergmann <arndb at de.ibm.com>
From: Ulrich Weigand <Ulrich.Weigand at de.ibm.com>
Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Index: linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/context.c
===================================================================
--- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/spufs/context.c
+++ linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/context.c
@@ -116,13 +116,13 @@
         if (ctx->local_store)
                 unmap_mapping_range(ctx->local_store, 0, LS_SIZE, 1);
         if (ctx->mfc)
-               unmap_mapping_range(ctx->mfc, 0, 0x1000, 1);
+               unmap_mapping_range(ctx->mfc, 0, 0x4000, 1);
         if (ctx->cntl)
-               unmap_mapping_range(ctx->cntl, 0, 0x1000, 1);
+               unmap_mapping_range(ctx->cntl, 0, 0x4000, 1);
         if (ctx->signal1)
-               unmap_mapping_range(ctx->signal1, 0, 0x1000, 1);
+               unmap_mapping_range(ctx->signal1, 0, 0x4000, 1);
         if (ctx->signal2)
-               unmap_mapping_range(ctx->signal2, 0, 0x1000, 1);
+               unmap_mapping_range(ctx->signal2, 0, 0x4000, 1);
  }

  int spu_acquire_runnable(struct spu_context *ctx)
Index: linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/file.c
===================================================================
--- linux-2.6.15.4.orig/arch/powerpc/platforms/cell/spufs/file.c
+++ linux-2.6.15.4/arch/powerpc/platforms/cell/spufs/file.c
@@ -158,7 +158,7 @@
         int ret;

         offset += vma->vm_pgoff << PAGE_SHIFT;
-       if (offset > 0x1000)
+       if (offset >= 0x4000)
                 goto out;

         ret = spu_acquire_runnable(ctx);
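
The change to the bounds check in the file.c hunk above can be tried in
isolation. This is a minimal stand-alone sketch: the two conditions are
copied from the removed and added lines of the hunk, while the helper
names and the self-check function are made up for the example.

```c
#include <assert.h>

#define PS_AREA_OLD	0x1000UL	/* 4k area before the patch */
#define PS_AREA_NEW	0x4000UL	/* 16k area after the patch */

/* Old test: only offsets strictly greater than 0x1000 were rejected. */
static int offset_ok_old(unsigned long offset)
{
	return !(offset > PS_AREA_OLD);
}

/* New test: valid offsets are 0 .. 0x3fff. */
static int offset_ok_new(unsigned long offset)
{
	return !(offset >= PS_AREA_NEW);
}

static void offset_check_examples(void)
{
	/* The old comparison accepted the first byte past the 4k area. */
	assert(offset_ok_old(0x1000UL));
	/* The new comparison accepts the last byte of the 16k area ... */
	assert(offset_ok_new(0x3fffUL));
	/* ... and rejects the first byte past it. */
	assert(!offset_ok_new(0x4000UL));
}
```

So besides enlarging the area, the switch from ">" to ">=" also makes
the limit exclusive.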


From olof at lixom.net  Thu Feb 16 02:02:09 2006
From: olof at lixom.net (Olof Johansson)
Date: Wed, 15 Feb 2006 09:02:09 -0600
Subject: [PATCH] [2.6.16] powerpc: Fix OOPS in lparcfg on G5
Message-ID: <20060215150209.GF6291@pb15.lixom.net>

Hi,

Bugfix, so please consider for 2.6.16:


Hit the following with LTP with a ppc64_defconfig kernel on a G5:

Unable to handle kernel paging request for data at address 0x00000030
Faulting instruction address: 0xc00000000001f6d0
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32 POWERMAC
Modules linked in:
NIP: C00000000001F6D0 LR: C00000000001F6CC CTR: 0000000000000000
REGS: c000000054853790 TRAP: 0300   Not tainted  (2.6.16-rc3-mm1)
MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 24000444  XER: 00000000
DAR: 0000000000000030, DSISR: 0000000040000000
TASK = c00000005fb65810[4820] 'proc01' THREAD: c000000054850000 CPU: 1
GPR00: C00000000001F6CC C000000054853A10 C00000000079FBB0 C0000000007A32E8
GPR04: C0000000004AE220 0000000000000000 0000000000000020 0000000000000000
GPR08: C000000000610178 0000000000000072 C00000005FFFEE62 0000000000000092
GPR12: 0000000000000002 C0000000005BA100 00000000100D0000 0000000010116C88
GPR16: 00000000100D0000 00000000FFFF9008 0000000000000000 0000000000000000
GPR20: 000000001001B5D8 000000000FF58224 C000000054853E08 C00000000F44A330
GPR24: C00000005E47B700 0000000010016FB4 0000000000000000 C00000000F44A300
GPR28: 0000000000000000 0000000000000000 C0000000004AE220 0000000000000000
NIP [C00000000001F6D0] .of_find_property+0x30/0xa8
LR [C00000000001F6CC] .of_find_property+0x2c/0xa8
Call Trace:
[C000000054853A10] [C00000000001F6CC] .of_find_property+0x2c/0xa8 (unreliable)
[C000000054853AA0] [C00000000001F758] .get_property+0x10/0x34
[C000000054853B10] [C00000000001D3C8] .lparcfg_data+0x11c/0x6c8
[C000000054853C20] [C0000000000DC78C] .seq_read+0x198/0x418
[C000000054853CF0] [C0000000000B2634] .vfs_read+0xd0/0x1b0
[C000000054853D90] [C0000000000B32FC] .sys_read+0x4c/0x8c
[C000000054853E30] [C0000000000086F8] syscall_exit+0x0/0x40


It happens since the lookup of the /rtas device node is never checked for
success and just passed into get_property.

It doesn't make sense to create the lparcfg proc entry on non-LPAR
systems at all. On LPAR systems, there will always be an RTAS so the
lookup will always succeed.
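
For comparison, the more local alternative would be to check the lookup
result before using it. The sketch below is not what this patch does
and is not kernel code: it is a user-space toy with a one-node device
tree, and every name in it (toy_find_node_by_path, toy_get_property,
lookup_rtas_property, the property names) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct device_node {
	const char *path;
	const char *prop_name;
	const char *prop_value;
};

/* Toy device tree for a machine that has no /rtas node. */
static struct device_node toy_tree[] = {
	{ "/cpus", "name", "cpus" },
};

static struct device_node *toy_find_node_by_path(const char *path)
{
	size_t i;

	for (i = 0; i < sizeof(toy_tree) / sizeof(toy_tree[0]); i++)
		if (strcmp(toy_tree[i].path, path) == 0)
			return &toy_tree[i];
	return NULL;	/* no such node */
}

static const char *toy_get_property(const struct device_node *np,
				    const char *name)
{
	/* Passing NULL here is the analogue of the oops above. */
	assert(np != NULL);
	return strcmp(np->prop_name, name) == 0 ? np->prop_value : NULL;
}

/* Guarded lookup: give up early when the node does not exist. */
static const char *lookup_rtas_property(const char *name)
{
	struct device_node *rtas = toy_find_node_by_path("/rtas");

	if (!rtas)
		return NULL;
	return toy_get_property(rtas, name);
}
```

Skipping the proc entry entirely on non-LPAR systems, as the patch
does, avoids having to repeat such a check at every use.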


Signed-off-by: Olof Johansson <olof at lixom.net>


Index: linux/arch/powerpc/kernel/lparcfg.c
===================================================================
--- linux.orig/arch/powerpc/kernel/lparcfg.c
+++ linux/arch/powerpc/kernel/lparcfg.c
@@ -565,6 +565,9 @@ int __init lparcfg_init(void)
 	struct proc_dir_entry *ent;
 	mode_t mode = S_IRUSR | S_IRGRP | S_IROTH;
 
+	if (!platform_is_lpar())
+		return 0;
+
 	/* Allow writing if we have FW_FEATURE_SPLPAR */
 	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
 		lparcfg_fops.write = lparcfg_write;


From jimix at watson.ibm.com  Wed Feb 15 22:29:20 2006
From: jimix at watson.ibm.com (Jimi Xenidis)
Date: Wed, 15 Feb 2006 06:29:20 -0500
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1139956309.780.254385845@webmail.messagingengine.com>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<200602120552.26164.arnd@arndb.de>
	<1139779462.5247.30.camel@localhost.localdomain>
	<200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com>
	<1139956309.780.254385845@webmail.messagingengine.com>
Message-ID: <7EC0BED8-30BC-4F52-92AE-19C7CEC3AD47@watson.ibm.com>

It is important we consider the cases where the hypervisor is present
and not present.
There is also the problem of different Hypervisors.
I do not think FW without Hypervisor has any business choosing the
page sizes for an OS.
For Hypervisor machines, as discussed below, it needs to be negotiated.
There are plenty of things that need to be negotiated like this, and
it is likely that each hypervisor will do this differently.

I guess for Hypervisors we'll wait and see.
-JX

On Feb 14, 2006, at 5:31 PM, Hollis Blanchard wrote:

> On Tue, 2006-02-14 at 10:22 -0800, Geoff Levand wrote:
>>
>> Sorry about changing my mind on this Ben, but after reading the  
>> Book 4
>> docs on page sizes I see that each partition can have independent
>> page size settings.  I made the wrong assumption that all partitions
>> needed the same size setting.  Based on this, and on Arnd's comments,
>> I think in general we will need to setup page sizes in the kernel.
>
> On Tue, 2006-02-14 at 15:27 -0600, Olof Johansson wrote:
>>
>> Isn't this something that should be configured in the hypervisor /
>> partition firmware on the machine then, instead of hacked into the
>> kernel? The hypervisor would of course switch HID contents when
>> dispatching different partitions, if needed.
>
> I agree with Olof; I don't follow the original leap of logic.
>
> If every partition can have independent page size settings, and
> especially if HID6 is a hypervisor-privileged resource as mentioned
> earlier, then the hypervisor needs to set it. Only the hypervisor can
> restore each partition's different HID6 value when it switches between
> them...
>
> -Hollis


From mohan at in.ibm.com  Wed Feb 15 23:57:22 2006
From: mohan at in.ibm.com (Mohan Kumar M)
Date: Wed, 15 Feb 2006 18:27:22 +0530
Subject: kexec tools gcc 4.1.0 issue
In-Reply-To: <1139918867.8472.100.camel@explorer.in.ibm.com>
References: <1139918867.8472.100.camel@explorer.in.ibm.com>
Message-ID: <20060215125722.GA15333@in.ibm.com>

Hi,

One more patch is required to solve the gcc 4.1.0 issue with kexec-tools.
When users run the ./configure script without running autoconf, the
-mcall-aixdesc flag is not added to EXTRA_CFLAGS.

This patch adds the flag to the configure script as well, so that even if
the user does not run autoconf, -mcall-aixdesc is added to EXTRA_CFLAGS.


This patch is required in addition to
kexec-ppc-gcc410-fix.patch. When users run the ./configure
script without running autoconf, the "-mcall-aixdesc" flag will
not be included in EXTRA_CFLAGS. This patch adds this
flag to EXTRA_CFLAGS in the "configure" script also.

Signed-off-by: Mohan <mohan at in.ibm.com>
---

 configure |    3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

diff -puN configure~kexec-ppc-gcc410-fix-2 configure
--- kexec-tools-1.101/configure~kexec-ppc-gcc410-fix-2	2006-02-15 18:01:30.000000000 +0530
+++ kexec-tools-1.101-mohan/configure	2006-02-15 18:03:18.000000000 +0530
@@ -1413,8 +1413,9 @@ fi
 EXTRA_CFLAGS=""
 
 # Check whether ppc64. Add -m64 for building 64-bit binary
+# Add -mcall-aixdesc to generate dot-symbols as in gcc 3.3.3
 if test "$ARCH" = ppc64; then
-  EXTRA_CFLAGS="$EXTRA_CFLAGS -m64"
+  EXTRA_CFLAGS="$EXTRA_CFLAGS -m64 -mcall-aixdesc"
 fi;
 
 # Check whether --with-objdir or --without-objdir was given.
_

On Tue, Feb 14, 2006 at 05:37:48PM +0530, Mohan Kumar M wrote:
> Hi,
> 
> Latest kexec tools for PPC64 with purgatory patch
> (ppc64-kdump-purgatory-backup-support.patch) was not working with gcc
> version 4.1.0 due to the change in object file generation.
> 
> Here is the patch to fix this issue.
> 
> This patch is created on top of the following level of
> kexec-tools:
> 
> - kexec-tools-1.101.tar.gz (from eric biederman's site or 
> from lse site)
> - kexec-tools-1.101-kdump6.patch (consolidated patch posted
> on
> http://lse.sourceforge.net/kdump/patches/1.101-kdump6/kexec-tools-1.101-kdump6.patch)
> 
> Review and suggestions are welcome.
> 
> Note:
> Resending the patch since it was not delivered to the fastboot and
> linuxppc64-dev mailing lists.
> 
> Regards,
> Mohan.


From benh at kernel.crashing.org  Thu Feb 16 09:27:09 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 16 Feb 2006 09:27:09 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <1B5EC317-861F-46D2-AA05-AEC16DFE4737@watson.ibm.com>
References: <OFB15C3FFF.405484C2-ONC1257114.0053DD64-41257114.0064A127@de.ibm.com>
	<68BEC61B-70ED-4057-A669-DE5EC38B6180@watson.ibm.com>
	<17395.4329.1338.898562@cargo.ozlabs.ibm.com>
	<1B5EC317-861F-46D2-AA05-AEC16DFE4737@watson.ibm.com>
Message-ID: <1140042429.4054.6.camel@localhost.localdomain>

On Wed, 2006-02-15 at 07:11 -0500, Jimi Xenidis wrote:
> On Feb 15, 2006, at 6:30 AM, Paul Mackerras wrote:
> 
> > Jimi Xenidis writes:
> >
> >> Are you thinking of using large pages in IO space? Cuz I don't think you
> >> can.
> >
> > Why not?
> 
> From the 970 User Manual:
>    To avoid accidental large/small page translation aliasing, the
>    970FX implements a HID4 bit (HID4[61]) to disable the large page
>    facility and does not permit cache inhibited accesses to an address
>    in a large page.
> 
> I'm not 100% sure but I believe this affects P4, and maybe even P5.
> WRT Cell, I believe the BPA_Map can be mapped with large pages but
> I'm not sure about "real" devices.

AS 2.03 lifts this limitation, and from GS DD2.1 onward, L=1 can be
cache inhibited (this is a requirement for the kernel to be able to use
64k HW pages btw). I think Cell works that way too but that remains to
be confirmed.

Ben.


From benh at kernel.crashing.org  Thu Feb 16 09:36:19 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Thu, 16 Feb 2006 09:36:19 +1100
Subject: AW: Re: __setup_cpu_be problem
In-Reply-To: <7EC0BED8-30BC-4F52-92AE-19C7CEC3AD47@watson.ibm.com>
References: <2812322.110611139545275893.JavaMail.servlet@kundenserver>
	<200602120552.26164.arnd@arndb.de>
	<1139779462.5247.30.camel@localhost.localdomain>
	<200602132324.45433.arnd@arndb.de> <43F21FD0.507@am.sony.com>
	<1139956309.780.254385845@webmail.messagingengine.com>
	<7EC0BED8-30BC-4F52-92AE-19C7CEC3AD47@watson.ibm.com>
Message-ID: <1140042979.4054.14.camel@localhost.localdomain>

On Wed, 2006-02-15 at 06:29 -0500, Jimi Xenidis wrote:
> It is important we consider the cases where the hypervisor is present  
> and not present.
> There is also the problem of different Hypervisors.
> I do not think FW without Hypervisor has any  business choosing the  
> page sizes for an OS.
> For Hypervisor machines, as discussed below, it needs to be negotiated.
> There are plenty of things that need to be negotiated like this, and  
> it is likely that each hypervisor will do this differently.

Page sizes are normally not "chosen" in that the architecture was
written with the intent that a given CPU model supports a given range of
page sizes and that gets exposed via the device-tree.

What is causing the current "situation" is that Cell was designed
slightly differently :) It supports 2 large page size encodings but 3
actual large page sizes. The matching of one of the encodings to one of
the large page sizes is done in software via HID6. This doesn't
quite fit in anything that has been defined by our firmware stuff, thus
my initial idea to try to have Cell based firmwares pick the encodings
that make sense for linux, populate the device-tree accordingly and
forget about it (that is 64k and 16M). However, I suppose there might be
applications where 1M makes sense, non-linux OSes or even future
versions of linux that get "fixed" to handle 1M large pages....

Thus if we want that configurable, the question is "where".

I'm not a big fan of having yet another mechanism for detecting page sizes
in the hash code though. I'd really like us to stick to the current
mechanism via the device-tree. Thus if we want a way to select the page
sizes on CPUs like Cell, it should be done before we retrieve the
device-tree from OF, so that the firmware, when instructed to change it,
can appropriately update the device-tree properties.
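
The "2 encodings, 3 sizes" constraint described above can be modelled in a
few lines of user-space C. This is only an illustrative sketch: struct
lp_config, lp_config_set() and lp_encoding_for() are hypothetical names, and
nothing here touches HID6, the firmware or the device-tree.

```c
#include <stddef.h>

/* Page sizes expressed as shifts: 64K = 2^16, 1M = 2^20, 16M = 2^24. */
enum { SHIFT_64K = 16, SHIFT_1M = 20, SHIFT_16M = 24 };

#define NUM_LP_ENCODINGS 2

/* Hypothetical model of the software-selected mapping: each of the two
 * large page encodings is bound to one of the three supported sizes. */
struct lp_config {
	unsigned int shift[NUM_LP_ENCODINGS];
};

static int lp_shift_valid(unsigned int shift)
{
	return shift == SHIFT_64K || shift == SHIFT_1M || shift == SHIFT_16M;
}

/* Bind both encodings. Returns 0 on success, -1 on an unsupported size. */
int lp_config_set(struct lp_config *cfg, unsigned int s0, unsigned int s1)
{
	if (cfg == NULL || !lp_shift_valid(s0) || !lp_shift_valid(s1))
		return -1;
	cfg->shift[0] = s0;
	cfg->shift[1] = s1;
	return 0;
}

/* Return the encoding index that serves a page size, or -1 if that size
 * is not selected in the current configuration. */
int lp_encoding_for(const struct lp_config *cfg, unsigned int shift)
{
	int i;

	for (i = 0; i < NUM_LP_ENCODINGS; i++)
		if (cfg->shift[i] == shift)
			return i;
	return -1;
}
```

With the 64k/16M choice suggested for linux, a lookup for 1M finds no
encoding, which is the case a configurable selection would have to cover.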

The simple way I think is an nvram OF option in /options, along with
other OF environment variables. The more complicated way would be a
specific OF or rtas call (I'd rather avoid HV calls from prom_init but
if we have to ...).

Ben.


From ntl at pobox.com  Thu Feb 16 12:47:41 2006
From: ntl at pobox.com (Nathan Lynch)
Date: Wed, 15 Feb 2006 19:47:41 -0600
Subject: [PATCH] [2.6.16] powerpc: Fix OOPS in lparcfg on G5
In-Reply-To: <20060215150209.GF6291@pb15.lixom.net>
References: <20060215150209.GF6291@pb15.lixom.net>
Message-ID: <20060216014741.GD3293@localhost.localdomain>

Olof Johansson wrote:
> Hi,
> 
> Bugfix, so please consider for 2.6.16:
> 
> 
> Hit the following with LTP with a ppc64_defconfig kernel on a G5:
> 
> Unable to handle kernel paging request for data at address 0x00000030
> Faulting instruction address: 0xc00000000001f6d0
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=32 POWERMAC
> Modules linked in:
> NIP: C00000000001F6D0 LR: C00000000001F6CC CTR: 0000000000000000
> REGS: c000000054853790 TRAP: 0300   Not tainted  (2.6.16-rc3-mm1)
> MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 24000444  XER: 00000000
> DAR: 0000000000000030, DSISR: 0000000040000000
> TASK = c00000005fb65810[4820] 'proc01' THREAD: c000000054850000 CPU: 1
> GPR00: C00000000001F6CC C000000054853A10 C00000000079FBB0 C0000000007A32E8
> GPR04: C0000000004AE220 0000000000000000 0000000000000020 0000000000000000
> GPR08: C000000000610178 0000000000000072 C00000005FFFEE62 0000000000000092
> GPR12: 0000000000000002 C0000000005BA100 00000000100D0000 0000000010116C88
> GPR16: 00000000100D0000 00000000FFFF9008 0000000000000000 0000000000000000
> GPR20: 000000001001B5D8 000000000FF58224 C000000054853E08 C00000000F44A330
> GPR24: C00000005E47B700 0000000010016FB4 0000000000000000 C00000000F44A300
> GPR28: 0000000000000000 0000000000000000 C0000000004AE220 0000000000000000
> NIP [C00000000001F6D0] .of_find_property+0x30/0xa8
> LR [C00000000001F6CC] .of_find_property+0x2c/0xa8
> Call Trace:
> [C000000054853A10] [C00000000001F6CC] .of_find_property+0x2c/0xa8 (unreliable)
> [C000000054853AA0] [C00000000001F758] .get_property+0x10/0x34
> [C000000054853B10] [C00000000001D3C8] .lparcfg_data+0x11c/0x6c8
> [C000000054853C20] [C0000000000DC78C] .seq_read+0x198/0x418
> [C000000054853CF0] [C0000000000B2634] .vfs_read+0xd0/0x1b0
> [C000000054853D90] [C0000000000B32FC] .sys_read+0x4c/0x8c
> [C000000054853E30] [C0000000000086F8] syscall_exit+0x0/0x40
> 
> 
> It happens since the lookup of the /rtas device node is never checked for
> success and just passed into get_property.
> 
> It doesn't make sense to create the lparcfg proc entry on non-LPAR
> systems at all.

Despite the lparcfg name, I think there are apps which depend on it
even on non-lpar systems; we should still create the file on non-lpar
Power4, for example.


From michael at ellerman.id.au  Thu Feb 16 14:13:48 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 16 Feb 2006 14:13:48 +1100
Subject: [PATCH 0/3] powerpc: Bug fixes for 2.6.16
Message-ID: <1140059628.718206.692588263539.qpush@concordia>

This is a series of three bug fixes which I think should go in for 2.6.16.

The first makes UP kernels work again; we were unconditionally starting
secondary cpus. Paulus, you said you didn't like this much, but I think it's
the best option for 2.6.16. I have a patch that cleans this stuff up in the
works, but it'll take a bit longer.

The second patch makes UP to SMP kexec work again. This was supposed to work in
the past but was never tested and got busted somewhere along the line.

The third fixes a long-standing bug on pSeries machines, where if secondary
threads have different logical/physical ids we fail to spin them up correctly.
We don't normally hit this because the logical/physical ids are the same.

Built for pSeries, iSeries and pmac32. Booted on P5 LPAR, Power3 and iSeries.


From michael at ellerman.id.au  Thu Feb 16 14:13:50 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 16 Feb 2006 14:13:50 +1100
Subject: [PATCH 1/3] powerpc: Don't start secondary CPUs in a UP && KEXEC
	kernel
In-Reply-To: <1140059628.718206.692588263539.qpush@concordia>
Message-ID: <20060216031415.8F491679F2@ozlabs.org>

Because smp_release_cpus() is built for SMP || KEXEC, it's not safe to
unconditionally call it from setup_system(). On a UP && KEXEC kernel we'll
start up the secondary CPUs which will then go berserk and we die.

Simple fix is to conditionally call smp_release_cpus() in setup_system(). With
that in place we don't need the dummy definition of smp_release_cpus() because
all call sites are #ifdef'ed either SMP or KEXEC.
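
The guard pattern can be illustrated outside the kernel with a minimal,
self-contained sketch. It models a UP && KEXEC build by defining only
CONFIG_KEXEC; the counter and released_count() are hypothetical additions for
illustration, and the function bodies are stand-ins, not the real code in
setup_64.c.

```c
/* Model a UP && KEXEC build: CONFIG_KEXEC is set, CONFIG_SMP is not. */
#define CONFIG_KEXEC 1

/* Hypothetical counter standing in for "secondary CPUs were released". */
static int secondaries_released;

#if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
/* Stand-in for smp_release_cpus(): compiled for SMP || KEXEC. */
void smp_release_cpus(void)
{
	secondaries_released++;
}
#endif

/* Stand-in for setup_system(): release secondaries only on SMP builds,
 * so a UP && KEXEC kernel never starts them during boot. */
void setup_system(void)
{
#ifdef CONFIG_SMP
	smp_release_cpus();
#endif
}

/* Hypothetical accessor used to observe the effect. */
int released_count(void)
{
	return secondaries_released;
}
```

In this model setup_system() leaves the counter at zero, while the function
itself remains available to a kexec path that is built under CONFIG_KEXEC.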

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/kernel/setup_64.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Index: to-merge/arch/powerpc/kernel/setup_64.c
===================================================================
--- to-merge.orig/arch/powerpc/kernel/setup_64.c
+++ to-merge/arch/powerpc/kernel/setup_64.c
@@ -311,8 +311,6 @@ void smp_release_cpus(void)
 
 	DBG(" <- smp_release_cpus()\n");
 }
-#else
-#define smp_release_cpus()
 #endif /* CONFIG_SMP || CONFIG_KEXEC */
 
 /*
@@ -473,10 +471,12 @@ void __init setup_system(void)
 	check_smt_enabled();
 	smp_setup_cpu_maps();
 
+#ifdef CONFIG_SMP
 	/* Release secondary cpus out of their spinloops at 0x60 now that
 	 * we can map physical -> logical CPU ids
 	 */
 	smp_release_cpus();
+#endif
 
 	printk("Starting Linux PPC64 %s\n", system_utsname.version);
 

From michael at ellerman.id.au  Thu Feb 16 14:13:51 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 16 Feb 2006 14:13:51 +1100
Subject: [PATCH 2/3] powerpc: Make UP -> SMP kexec work again
In-Reply-To: <1140059628.718206.692588263539.qpush@concordia>
Message-ID: <20060216031417.4024067AA0@ozlabs.org>

For UP to SMP kexec to work we need to jump into pSeries_secondary_smp_init
even on a UP + KEXEC kernel. The secondary cpus will not find their hw_cpu_id
in the paca and so they'll jump into kexec_wait, ready for a kexec.
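
The decision the secondary cpus make can be sketched as a small user-space
model. The names paca_model, secondary_action and secondary_entry() are
hypothetical, and the real logic lives in assembly in head_64.S; this only
shows the "not found in the paca, so wait for kexec" outcome.

```c
#include <stddef.h>

/* Hypothetical, simplified paca: only the field the lookup needs. */
struct paca_model {
	unsigned int hw_cpu_id;
};

enum secondary_action {
	SECONDARY_START,	/* hw id found: continue normal bringup */
	SECONDARY_KEXEC_WAIT	/* hw id not found: park until a kexec */
};

/* Search the paca array for this cpu's hardware id and decide what the
 * secondary cpu should do next. */
enum secondary_action secondary_entry(const struct paca_model *pacas,
				      size_t npacas, unsigned int hw_id)
{
	size_t i;

	for (i = 0; i < npacas; i++)
		if (pacas[i].hw_cpu_id == hw_id)
			return SECONDARY_START;
	return SECONDARY_KEXEC_WAIT;
}
```

The model assumes a UP kernel whose paca array only describes hardware id 0,
so any other hardware id ends up in the waiting state.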

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/kernel/head_64.S |    4 +---
 1 files changed, 1 insertion(+), 3 deletions(-)

Index: to-merge/arch/powerpc/kernel/head_64.S
===================================================================
--- to-merge.orig/arch/powerpc/kernel/head_64.S
+++ to-merge/arch/powerpc/kernel/head_64.S
@@ -155,8 +155,7 @@ _GLOBAL(__secondary_hold)
 	SET_REG_IMMEDIATE(r4, .hmt_init)
 	mtctr	r4
 	bctr
-#else
-#ifdef CONFIG_SMP
+#elif defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
 	LOAD_REG_IMMEDIATE(r4, .pSeries_secondary_smp_init)
 	mtctr	r4
 	mr	r3,r24
@@ -164,7 +163,6 @@ _GLOBAL(__secondary_hold)
 #else
 	BUG_OPCODE
 #endif
-#endif
 
 /* This value is used to mark exception frames on the stack. */
 	.section ".toc","aw"


From michael at ellerman.id.au  Thu Feb 16 14:13:53 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 16 Feb 2006 14:13:53 +1100
Subject: [PATCH 3/3] powerpc: Fix bug in spinup of renumbered secondary threads
In-Reply-To: <1140059628.718206.692588263539.qpush@concordia>
Message-ID: <20060216031418.C2E0467B51@ozlabs.org>

If the logical and physical cpu ids of a secondary thread don't match, we will
fail to spin the thread up on pSeries machines due to a bug in pseries/smp.c.

We call the RTAS "start-cpu" method with the physical cpu id, the address of
pSeries_secondary_smp_init and the value to pass that function in r3. Currently
we pass "lcpu", the logical cpu id, but pSeries_secondary_smp_init expects
the physical cpu id in r3.

We should be passing pcpu instead.
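
Why the bug stays hidden when the ids match can be seen in a small model.
This is a sketch under assumptions: cpu_map_entry and
find_logical_by_physical() are hypothetical names that stand in for the
lookup the entry point performs on the value it receives in r3.

```c
#include <stddef.h>

/* Hypothetical logical <-> physical cpu id mapping entry. */
struct cpu_map_entry {
	unsigned int lcpu;	/* logical id used by the kernel */
	unsigned int pcpu;	/* physical (hardware) id */
};

/* Model of the entry point's lookup: treat r3 as a physical id and
 * return the matching logical id, or -1 if no cpu has that physical id. */
int find_logical_by_physical(const struct cpu_map_entry *map, size_t n,
			     unsigned int r3)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (map[i].pcpu == r3)
			return (int)map[i].lcpu;
	return -1;
}
```

For a renumbered thread with lcpu 1 and pcpu 4, passing 4 resolves to
logical cpu 1, while passing 1 matches nothing. When lcpu equals pcpu both
values give the same answer, which is why the problem is rarely hit.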

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/platforms/pseries/smp.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: to-merge/arch/powerpc/platforms/pseries/smp.c
===================================================================
--- to-merge.orig/arch/powerpc/platforms/pseries/smp.c
+++ to-merge/arch/powerpc/platforms/pseries/smp.c
@@ -292,7 +292,7 @@ static inline int __devinit smp_startup_
 	if (start_cpu == RTAS_UNKNOWN_SERVICE)
 		return 1;
 
-	status = rtas_call(start_cpu, 3, 1, NULL, pcpu, start_here, lcpu);
+	status = rtas_call(start_cpu, 3, 1, NULL, pcpu, start_here, pcpu);
 	if (status != 0) {
 		printk(KERN_ERR "start-cpu failed: %i\n", status);
 		return 0;


From olof at lixom.net  Thu Feb 16 14:40:44 2006
From: olof at lixom.net (Olof Johansson)
Date: Wed, 15 Feb 2006 21:40:44 -0600
Subject: [PATCH] [2.6.16] powerpc: Fix OOPS in lparcfg on G5
In-Reply-To: <20060216014741.GD3293@localhost.localdomain>
References: <20060215150209.GF6291@pb15.lixom.net>
	<20060216014741.GD3293@localhost.localdomain>
Message-ID: <20060216034044.GK6291@pb15.lixom.net>

On Wed, Feb 15, 2006 at 07:47:41PM -0600, Nathan Lynch wrote:

> Despite the lparcfg name, I think there are apps which depend on it
> even on non-lpar systems; we should still create the file on non-lpar
> Power4, for example.

Hrm, ok. Thanks Nathan.

Paulus, please apply for 2.6.16.


Thanks,

Olof

---


Fall back gracefully when reading /proc/ppc64/lparcfg if the /rtas
device node can't be found.
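
The guard-then-fallback shape of the fix can be shown in a self-contained
sketch. The types node_model and prop_model and both functions are
hypothetical stand-ins, not the kernel's device-tree API, and returning the
first cell of the property is a simplification made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device node: a flat list of named properties. */
struct prop_model {
	const char *name;
	const int *value;
};

struct node_model {
	const struct prop_model *props;
	size_t nprops;
};

/* Stand-in for a property lookup that dereferences its node argument,
 * so the caller must not pass NULL. */
static const int *get_property_model(const struct node_model *node,
				     const char *name)
{
	size_t i;

	for (i = 0; i < node->nprops; i++)
		if (strcmp(node->props[i].name, name) == 0)
			return node->props[i].value;
	return NULL;
}

/* Only look the property up when the node exists; otherwise, or when the
 * property is missing, use the caller's fallback value. */
int potential_processors(const struct node_model *rtas_node, int fallback)
{
	const int *lrdrp = NULL;

	if (rtas_node)
		lrdrp = get_property_model(rtas_node, "ibm,lrdr-capacity");

	if (lrdrp == NULL)
		return fallback;
	return lrdrp[0];	/* simplification: first cell only */
}
```

Initialising the pointer to NULL means a missing node and a missing property
take the same fallback path, so one check covers both cases.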

Signed-off-by: Olof Johansson <olof at lixom.net>


Index: powerpc-git/arch/powerpc/kernel/lparcfg.c
===================================================================
--- powerpc-git.orig/arch/powerpc/kernel/lparcfg.c
+++ powerpc-git/arch/powerpc/kernel/lparcfg.c
@@ -341,7 +341,7 @@ static int lparcfg_data(struct seq_file 
 	const char *system_id = "";
 	unsigned int *lp_index_ptr, lp_index = 0;
 	struct device_node *rtas_node;
-	int *lrdrp;
+	int *lrdrp = NULL;
 
 	rootdn = find_path_device("/");
 	if (rootdn) {
@@ -362,7 +362,9 @@ static int lparcfg_data(struct seq_file 
 	seq_printf(m, "partition_id=%d\n", (int)lp_index);
 
 	rtas_node = find_path_device("/rtas");
-	lrdrp = (int *)get_property(rtas_node, "ibm,lrdr-capacity", NULL);
+	if (rtas_node)
+		lrdrp = (int *)get_property(rtas_node, "ibm,lrdr-capacity",
+		                            NULL);
 
 	if (lrdrp == NULL) {
 		partition_potential_processors = vdso_data->processorCount;


From latten at austin.ibm.com  Thu Feb 16 10:31:26 2006
From: latten at austin.ibm.com (Joy Latten)
Date: Wed, 15 Feb 2006 17:31:26 -0600
Subject: problem booting
Message-ID: <1140046286.3137.160.camel@faith.austin.ibm.com>

Al Viro recommended I send this problem to linuxppc64-dev.

I have Rawhide installed on a pseries lpar. It is working fine.
The Rawhide kernel is vmlinuz-2.6.15-1.1948_FC5.

I installed lspp.8 from Steve Grubb. When I rebooted my machine, I 
received the below kernel panic.

I have seen something similar when downloading a vanilla kernel from 
kernel.org and using the default config file in arch/powerpc/configs/
ppc64_defconfig. I usually turn on selinux and ipsec protocols and
ensure ibmveth and ibmvscsi are included in my kernel. I do not use
initrd. A co-worker gave me a .config that seems to get past my
problems, so I concluded that perhaps my config was missing something
the lpar needed.
I have included his config, which works OK for me, in this email. I apologize
for such a large email. The only thing I change is his use of initrd. I
do not use initrd. Perhaps I should...
I will next try compiling a kernel.org kernel with the Rawhide config
and see whether it works.

My gcc version is gcc version 4.1.0 20060213 (Red Hat 4.1.0-0.25)

Oh, I have also been using arch/powerpc/boot/zImage for my kernel.
Please advise if I should be using vmlinux instead. Thanks. Let me know if
there are any questions.

Regards,
Joy Latten

---------------------------------------------------------------------

boot: 2.6.15-1.1941.4
Please wait, loading kernel...
   Elf32 kernel loaded...
Loading ramdisk...
ramdisk loaded at 02200000, size: 1117 Kbytes
OF stdout device is: /vdevice/vty at 30000000
command line: ro console=hvc0 root=LABEL=/1
memory layout at init:
  memory_limit : 00000000 (16 MB aligned)
  alloc_bottom : 02318000
  alloc_top    : 08000000
  alloc_top_hi : 88000000
  rmo_top      : 08000000
  ram_top      : 88000000
Looking for displays
instantiating rtas at 0x077d7000 ... done
00000000 : boot cpu     00000000
00000002 : starting cpu hw idx 00000002... done
00000004 : starting cpu hw idx 00000004... done
00000006 : starting cpu hw idx 00000006... done
WARNING: maximum CPUs (4) exceeded: ignoring extras
copying OF device tree ...
Building dt strings...
Building dt structure...
Device tree strings 0x02619000 -> 0x02619f49
Device tree struct  0x0261a000 -> 0x02621000
Calling quiesce ...
returning from prom_init
DEFAULT CATCH!, exception-handler=fff00300
at   %SRR0: 0000000000c3c21c   %SRR1: 8000000000003002
Call History
------------
@  - c3c1b0
find-method  - c467f4
(poplocals)  - c3a718
$call-method  - c468ac
(poplocals)  - c3a718
key-fillq  - c46e24
?xoff  - c46f20
(poplocals)  - c3a718
(stdout-write)  - c4754c
(type)  - c475d8
_syscatch  - c4d43c
_exception  - c4cf00
<excp>  - c39834
_syscatch  - c4d3a0
_syscatch  - c4d3a0
invalid pointer - 1800000000864

Client's Fix Pt Regs:
 00 00080000000001f4 ffffffffff2581d4 00000000deadbeef fffffffffffffffc
 04 0000000000000000 0000000000000000 000003fe007d0000 0000000000c03010
 08 0000000008000000 000000000000003a 00000000003ff000 0000000000000008
 0c 0000000000004000 0000000000000000 0000000000000000 0000000000000000
 10 0000000000db3710 0000000000db3710 0000000000c465f4 0000000000c467f4
 14 0000000000000000 0000000001bfff81 0000000001ef46f0 0000000000117400
 18 0000000000c13000 0000000000c38000 0000000000c14f40 0000000000c16fc0
 1c 0000000000c20000 0000000000c3fd20 0000000000c11f98 0000000000c10fd0
Special Regs:
    %IV: 00000300     %CR: 82000082    %XER: 00000000  %DSISR: 08000000
  %SRR0: 0000000000c3c21c   %SRR1: 8000000000003002
    %LR: 0000000000c3c1b0    %CTR: 0000000000000000
   %DAR: ffffffffff2581d4
Virtual PID = 0
PFW: Unable to send error log!
 ofdbg
0 >


-------------- next part --------------
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.15
# Wed Feb  8 15:35:32 2006
#
CONFIG_PPC64=y
CONFIG_64BIT=y
CONFIG_PPC_MERGE=y
CONFIG_MMU=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_PPC=y
CONFIG_EARLY_PRINTK=y
CONFIG_COMPAT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y

#
# Processor support
#
# CONFIG_POWER4_ONLY is not set
CONFIG_POWER3=y
CONFIG_POWER4=y
CONFIG_PPC_FPU=y
CONFIG_ALTIVEC=y
CONFIG_PPC_STD_MMU=y
CONFIG_SMP=y
CONFIG_NR_CPUS=128

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
# CONFIG_IKCONFIG is not set
CONFIG_CPUSETS=y
CONFIG_INITRAMFS_SOURCE="/boot/initrd-2.6.15.cpio"
CONFIG_INITRAMFS_ROOT_UID=0
CONFIG_INITRAMFS_ROOT_GID=0
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"

#
# Platform support
#
CONFIG_PPC_MULTIPLATFORM=y
# CONFIG_PPC_ISERIES is not set
# CONFIG_EMBEDDED6xx is not set
# CONFIG_APUS is not set
CONFIG_PPC_PSERIES=y
CONFIG_PPC_PMAC=y
CONFIG_PPC_PMAC64=y
CONFIG_PPC_MAPLE=y
CONFIG_PPC_CELL=y
CONFIG_PPC_OF=y
CONFIG_XICS=y
CONFIG_U3_DART=y
CONFIG_MPIC=y
CONFIG_PPC_RTAS=y
CONFIG_RTAS_ERROR_LOGGING=y
CONFIG_RTAS_PROC=y
CONFIG_RTAS_FLASH=y
CONFIG_MMIO_NVRAM=y
CONFIG_MPIC_BROKEN_U3=y
CONFIG_CELL_IIC=y
CONFIG_IBMVIO=y
# CONFIG_PPC_MPC106 is not set
CONFIG_GENERIC_TBSYNC=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
CONFIG_CPU_FREQ_DEBUG=y
CONFIG_CPU_FREQ_STAT=m
CONFIG_CPU_FREQ_STAT_DETAILS=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=m
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_CPU_FREQ_PMAC64=y
# CONFIG_WANT_EARLY_SERIAL is not set

#
# Kernel options
#
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_FORCE_MAX_ZONEORDER=13
CONFIG_IOMMU_VMERGE=y
CONFIG_HOTPLUG_CPU=y
# CONFIG_KEXEC is not set
CONFIG_IRQ_ALL_CPUS=y
CONFIG_PPC_SPLPAR=y
CONFIG_EEH=y
CONFIG_SCANLOG=y
CONFIG_LPARCFG=y
CONFIG_NUMA=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
# CONFIG_DISCONTIGMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_HAVE_MEMORY_PRESENT=y
# CONFIG_SPARSEMEM_STATIC is not set
CONFIG_SPARSEMEM_EXTREME=y
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID=y
# CONFIG_PPC_64K_PAGES is not set
CONFIG_SCHED_SMT=y
CONFIG_PROC_DEVICETREE=y
# CONFIG_CMDLINE_BOOL is not set
CONFIG_PM=y
CONFIG_PM_LEGACY=y
CONFIG_PM_DEBUG=y
# CONFIG_SECCOMP is not set
CONFIG_ISA_DMA_API=y

#
# Bus options
#
CONFIG_GENERIC_ISA_DMA=y
CONFIG_PPC_I8259=y
# CONFIG_PPC_INDIRECT_PCI is not set
CONFIG_PCI=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCI_LEGACY_PROC=y
# CONFIG_PCI_DEBUG is not set

#
# PCCARD (PCMCIA/CardBus) support
#
CONFIG_PCCARD=y
# CONFIG_PCMCIA_DEBUG is not set
CONFIG_PCMCIA=y
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_PCMCIA_IOCTL=y
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=y
CONFIG_PD6729=m
CONFIG_I82092=m
CONFIG_PCCARD_NONSTATIC=y

#
# PCI Hotplug Support
#
CONFIG_HOTPLUG_PCI=y
# CONFIG_HOTPLUG_PCI_FAKE is not set
# CONFIG_HOTPLUG_PCI_CPCI is not set
CONFIG_HOTPLUG_PCI_SHPC=m
CONFIG_HOTPLUG_PCI_SHPC_POLL_EVENT_MODE=y
# CONFIG_HOTPLUG_PCI_SHPC_PHPRM_LEGACY is not set
CONFIG_HOTPLUG_PCI_RPA=m
CONFIG_HOTPLUG_PCI_RPA_DLPAR=m
CONFIG_KERNEL_START=0xc000000000000000

#
# Networking
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=y
CONFIG_NET_KEY=m
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_ASK_IP_FIB_HASH=y
# CONFIG_IP_FIB_TRIE is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_FWMARK=y
CONFIG_IP_ROUTE_MULTIPATH=y
# CONFIG_IP_ROUTE_MULTIPATH_CACHED is not set
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_DIAG=m
CONFIG_INET_TCP_DIAG=m
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_BIC=y

#
# IP: Virtual Server Configuration
#
CONFIG_IP_VS=m
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m

#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
CONFIG_IPV6=m
CONFIG_IPV6_PRIVACY=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
CONFIG_INET6_TUNNEL=m
CONFIG_IPV6_TUNNEL=m
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_BRIDGE_NETFILTER=y

#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m

#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_CT_ACCT=y
CONFIG_IP_NF_CONNTRACK_MARK=y
CONFIG_IP_NF_CONNTRACK_EVENTS=y
CONFIG_IP_NF_CONNTRACK_NETLINK=m
CONFIG_IP_NF_CT_PROTO_SCTP=m
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IRC=m
CONFIG_IP_NF_NETBIOS_NS=m
CONFIG_IP_NF_TFTP=m
CONFIG_IP_NF_AMANDA=m
CONFIG_IP_NF_PPTP=m
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_MATCH_PHYSDEV=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_REALM=m
CONFIG_IP_NF_MATCH_SCTP=m
CONFIG_IP_NF_MATCH_DCCP=m
CONFIG_IP_NF_MATCH_COMMENT=m
CONFIG_IP_NF_MATCH_CONNMARK=m
CONFIG_IP_NF_MATCH_CONNBYTES=m
CONFIG_IP_NF_MATCH_HASHLIMIT=m
CONFIG_IP_NF_MATCH_STRING=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_TARGET_NFQUEUE=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_SAME=m
CONFIG_IP_NF_NAT_SNMP_BASIC=m
CONFIG_IP_NF_NAT_IRC=m
CONFIG_IP_NF_NAT_FTP=m
CONFIG_IP_NF_NAT_TFTP=m
CONFIG_IP_NF_NAT_AMANDA=m
CONFIG_IP_NF_NAT_PPTP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_TARGET_CONNMARK=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_TARGET_NOTRACK=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m

#
# IPv6: Netfilter Configuration (EXPERIMENTAL)
#
CONFIG_IP6_NF_QUEUE=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_LIMIT=m
CONFIG_IP6_NF_MATCH_MAC=m
CONFIG_IP6_NF_MATCH_RT=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_MULTIPORT=m
CONFIG_IP6_NF_MATCH_OWNER=m
CONFIG_IP6_NF_MATCH_MARK=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_AHESP=m
CONFIG_IP6_NF_MATCH_LENGTH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_PHYSDEV=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_LOG=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_TARGET_NFQUEUE=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_TARGET_MARK=m
CONFIG_IP6_NF_TARGET_HL=m
CONFIG_IP6_NF_RAW=m

#
# Bridge: Netfilter Configuration
#
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_ULOG=m

#
# DCCP Configuration (EXPERIMENTAL)
#
CONFIG_IP_DCCP=m
CONFIG_INET_DCCP_DIAG=m

#
# DCCP CCIDs Configuration (EXPERIMENTAL)
#
CONFIG_IP_DCCP_CCID3=m
CONFIG_IP_DCCP_TFRC_LIB=m

#
# DCCP Kernel Hacking
#
# CONFIG_IP_DCCP_DEBUG is not set
CONFIG_IP_DCCP_UNLOAD_HACK=y

#
# SCTP Configuration (EXPERIMENTAL)
#
CONFIG_IP_SCTP=m
# CONFIG_SCTP_DBG_MSG is not set
# CONFIG_SCTP_DBG_OBJCNT is not set
# CONFIG_SCTP_HMAC_NONE is not set
# CONFIG_SCTP_HMAC_SHA1 is not set
CONFIG_SCTP_HMAC_MD5=y
CONFIG_ATM=m
CONFIG_ATM_CLIP=m
# CONFIG_ATM_CLIP_NO_ICMP is not set
CONFIG_ATM_LANE=m
# CONFIG_ATM_MPOA is not set
CONFIG_ATM_BR2684=m
# CONFIG_ATM_BR2684_IPFILTER is not set
CONFIG_BRIDGE=m
CONFIG_VLAN_8021Q=m
# CONFIG_DECNET is not set
CONFIG_LLC=y
# CONFIG_LLC2 is not set
CONFIG_IPX=m
# CONFIG_IPX_INTERN is not set
CONFIG_ATALK=m
CONFIG_DEV_APPLETALK=y
CONFIG_IPDDP=m
CONFIG_IPDDP_ENCAP=y
CONFIG_IPDDP_DECAP=y
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
CONFIG_NET_DIVERT=y
# CONFIG_ECONET is not set
CONFIG_WAN_ROUTER=m

#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CLK_JIFFIES=y
# CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set
# CONFIG_NET_SCH_CLK_CPU is not set

#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_HFSC=m
CONFIG_NET_SCH_ATM=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_NETEM=m
CONFIG_NET_SCH_INGRESS=m

#
# Classification
#
CONFIG_NET_CLS=y
CONFIG_NET_CLS_BASIC=m
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
CONFIG_CLS_U32_PERF=y
CONFIG_CLS_U32_MARK=y
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
CONFIG_NET_EMATCH=y
CONFIG_NET_EMATCH_STACK=32
CONFIG_NET_EMATCH_CMP=m
CONFIG_NET_EMATCH_NBYTE=m
CONFIG_NET_EMATCH_U32=m
CONFIG_NET_EMATCH_META=m
CONFIG_NET_EMATCH_TEXT=m
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_CLS_POLICE=y
CONFIG_NET_CLS_IND=y
CONFIG_NET_ESTIMATOR=y

#
# Network testing
#
CONFIG_NET_PKTGEN=m
# CONFIG_HAMRADIO is not set
CONFIG_IRDA=m

#
# IrDA protocols
#
CONFIG_IRLAN=m
CONFIG_IRNET=m
CONFIG_IRCOMM=m
# CONFIG_IRDA_ULTRA is not set

#
# IrDA options
#
CONFIG_IRDA_CACHE_LAST_LSAP=y
CONFIG_IRDA_FAST_RR=y
# CONFIG_IRDA_DEBUG is not set

#
# Infrared-port device drivers
#

#
# SIR device drivers
#
CONFIG_IRTTY_SIR=m

#
# Dongle support
#
CONFIG_DONGLE=y
CONFIG_ESI_DONGLE=m
CONFIG_ACTISYS_DONGLE=m
CONFIG_TEKRAM_DONGLE=m
CONFIG_LITELINK_DONGLE=m
CONFIG_MA600_DONGLE=m
CONFIG_GIRBIL_DONGLE=m
CONFIG_MCP2120_DONGLE=m
CONFIG_OLD_BELKIN_DONGLE=m
CONFIG_ACT200L_DONGLE=m

#
# Old SIR device drivers
#

#
# Old Serial dongle support
#

#
# FIR device drivers
#
CONFIG_USB_IRDA=m
CONFIG_SIGMATEL_FIR=m
CONFIG_NSC_FIR=m
CONFIG_WINBOND_FIR=m
CONFIG_SMC_IRCC_FIR=m
CONFIG_ALI_FIR=m
CONFIG_VLSI_FIR=m
CONFIG_VIA_FIR=m
CONFIG_BT=m
CONFIG_BT_L2CAP=m
CONFIG_BT_SCO=m
CONFIG_BT_RFCOMM=m
CONFIG_BT_RFCOMM_TTY=y
CONFIG_BT_BNEP=m
CONFIG_BT_BNEP_MC_FILTER=y
CONFIG_BT_BNEP_PROTO_FILTER=y
CONFIG_BT_CMTP=m
CONFIG_BT_HIDP=m

#
# Bluetooth device drivers
#
CONFIG_BT_HCIUSB=m
CONFIG_BT_HCIUSB_SCO=y
CONFIG_BT_HCIUART=m
CONFIG_BT_HCIUART_H4=y
CONFIG_BT_HCIUART_BCSP=y
CONFIG_BT_HCIBCM203X=m
CONFIG_BT_HCIBPA10X=m
CONFIG_BT_HCIBFUSB=m
CONFIG_BT_HCIDTL1=m
CONFIG_BT_HCIBT3C=m
CONFIG_BT_HCIBLUECARD=m
CONFIG_BT_HCIBTUART=m
CONFIG_BT_HCIVHCI=m
CONFIG_IEEE80211=m
CONFIG_IEEE80211_DEBUG=y
CONFIG_IEEE80211_CRYPT_WEP=m
CONFIG_IEEE80211_CRYPT_CCMP=m
CONFIG_IEEE80211_CRYPT_TKIP=m

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_DEBUG_DRIVER is not set

#
# Connector - unified userspace <-> kernelspace linker
#
CONFIG_CONNECTOR=m

#
# Memory Technology Devices (MTD)
#
CONFIG_MTD=m
# CONFIG_MTD_DEBUG is not set
CONFIG_MTD_CONCAT=m
CONFIG_MTD_PARTITIONS=y
CONFIG_MTD_REDBOOT_PARTS=m
CONFIG_MTD_REDBOOT_DIRECTORY_BLOCK=-1
# CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED is not set
# CONFIG_MTD_REDBOOT_PARTS_READONLY is not set
CONFIG_MTD_CMDLINE_PARTS=y

#
# User Modules And Translation Layers
#
CONFIG_MTD_CHAR=m
CONFIG_MTD_BLOCK=m
CONFIG_MTD_BLOCK_RO=m
CONFIG_FTL=m
CONFIG_NFTL=m
CONFIG_NFTL_RW=y
CONFIG_INFTL=m
CONFIG_RFD_FTL=m

#
# RAM/ROM/Flash chip drivers
#
CONFIG_MTD_CFI=m
CONFIG_MTD_JEDECPROBE=m
CONFIG_MTD_GEN_PROBE=m
# CONFIG_MTD_CFI_ADV_OPTIONS is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
# CONFIG_MTD_CFI_I4 is not set
# CONFIG_MTD_CFI_I8 is not set
CONFIG_MTD_CFI_INTELEXT=m
CONFIG_MTD_CFI_AMDSTD=m
CONFIG_MTD_CFI_AMDSTD_RETRY=3
CONFIG_MTD_CFI_STAA=m
CONFIG_MTD_CFI_UTIL=m
CONFIG_MTD_RAM=m
CONFIG_MTD_ROM=m
CONFIG_MTD_ABSENT=m

#
# Mapping drivers for chip access
#
CONFIG_MTD_COMPLEX_MAPPINGS=y
# CONFIG_MTD_PHYSMAP is not set
CONFIG_MTD_PCI=m
# CONFIG_MTD_PLATRAM is not set

#
# Self-contained MTD device drivers
#
CONFIG_MTD_PMC551=m
# CONFIG_MTD_PMC551_BUGFIX is not set
# CONFIG_MTD_PMC551_DEBUG is not set
# CONFIG_MTD_SLRAM is not set
# CONFIG_MTD_PHRAM is not set
CONFIG_MTD_MTDRAM=m
CONFIG_MTDRAM_TOTAL_SIZE=4096
CONFIG_MTDRAM_ERASE_SIZE=128
# CONFIG_MTD_BLKMTD is not set
CONFIG_MTD_BLOCK2MTD=m

#
# Disk-On-Chip Device Drivers
#
CONFIG_MTD_DOC2000=m
# CONFIG_MTD_DOC2001 is not set
CONFIG_MTD_DOC2001PLUS=m
CONFIG_MTD_DOCPROBE=m
CONFIG_MTD_DOCECC=m
# CONFIG_MTD_DOCPROBE_ADVANCED is not set
CONFIG_MTD_DOCPROBE_ADDRESS=0

#
# NAND Flash Device Drivers
#
CONFIG_MTD_NAND=m
# CONFIG_MTD_NAND_VERIFY_WRITE is not set
CONFIG_MTD_NAND_IDS=m
# CONFIG_MTD_NAND_DISKONCHIP is not set
# CONFIG_MTD_NAND_NANDSIM is not set

#
# OneNAND Flash Device Drivers
#
# CONFIG_MTD_ONENAND is not set

#
# Parallel port support
#
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_SERIAL=m
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
CONFIG_PARPORT_PC_PCMCIA=m
CONFIG_PARPORT_NOT_PC=y
# CONFIG_PARPORT_GSC is not set
CONFIG_PARPORT_1284=y

#
# Plug and Play support
#

#
# Block devices
#
CONFIG_BLK_DEV_FD=m
CONFIG_PARIDE=m
CONFIG_PARIDE_PARPORT=m

#
# Parallel IDE high-level drivers
#
CONFIG_PARIDE_PD=m
CONFIG_PARIDE_PCD=m
CONFIG_PARIDE_PF=m
CONFIG_PARIDE_PT=m
CONFIG_PARIDE_PG=m

#
# Parallel IDE protocol modules
#
CONFIG_PARIDE_ATEN=m
CONFIG_PARIDE_BPCK=m
CONFIG_PARIDE_COMM=m
CONFIG_PARIDE_DSTR=m
CONFIG_PARIDE_FIT2=m
CONFIG_PARIDE_FIT3=m
CONFIG_PARIDE_EPAT=m
CONFIG_PARIDE_EPATC8=y
CONFIG_PARIDE_EPIA=m
CONFIG_PARIDE_FRIQ=m
CONFIG_PARIDE_FRPW=m
CONFIG_PARIDE_KBIC=m
CONFIG_PARIDE_KTTI=m
CONFIG_PARIDE_ON20=m
CONFIG_PARIDE_ON26=m
# CONFIG_BLK_CPQ_DA is not set
CONFIG_BLK_CPQ_CISS_DA=m
CONFIG_CISS_SCSI_TAPE=y
CONFIG_BLK_DEV_DAC960=m
CONFIG_BLK_DEV_UMEM=m
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_SX8=m
CONFIG_BLK_DEV_UB=m
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
CONFIG_BLK_DEV_INITRD=y
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
CONFIG_ATA_OVER_ETH=m

#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECS=m
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
CONFIG_BLK_DEV_IDEFLOPPY=y
CONFIG_BLK_DEV_IDESCSI=m
CONFIG_IDE_TASK_IOCTL=y

#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
CONFIG_BLK_DEV_SL82C105=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_AEC62XX=y
CONFIG_BLK_DEV_ALI15X3=y
# CONFIG_WDC_ALI15X3 is not set
CONFIG_BLK_DEV_AMD74XX=y
CONFIG_BLK_DEV_CMD64X=y
CONFIG_BLK_DEV_TRIFLEX=y
CONFIG_BLK_DEV_CY82C693=y
CONFIG_BLK_DEV_CS5520=y
CONFIG_BLK_DEV_CS5530=y
CONFIG_BLK_DEV_HPT34X=y
# CONFIG_HPT34X_AUTODMA is not set
CONFIG_BLK_DEV_HPT366=y
# CONFIG_BLK_DEV_SC1200 is not set
CONFIG_BLK_DEV_PIIX=y
CONFIG_BLK_DEV_IT821X=y
# CONFIG_BLK_DEV_NS87415 is not set
CONFIG_BLK_DEV_PDC202XX_OLD=y
# CONFIG_PDC202XX_BURST is not set
CONFIG_BLK_DEV_PDC202XX_NEW=y
CONFIG_PDC202XX_FORCE=y
CONFIG_BLK_DEV_SVWKS=y
CONFIG_BLK_DEV_SIIMAGE=y
CONFIG_BLK_DEV_SLC90E66=y
# CONFIG_BLK_DEV_TRM290 is not set
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_BLK_DEV_IDE_PMAC=y
CONFIG_BLK_DEV_IDE_PMAC_ATA100FIRST=y
CONFIG_BLK_DEV_IDEDMA_PMAC=y
CONFIG_BLK_DEV_IDE_PMAC_BLINK=y
# CONFIG_IDE_ARM is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set

#
# SCSI device support
#
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_CHR_DEV_OSST=m
CONFIG_BLK_DEV_SR=m
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y
CONFIG_CHR_DEV_SCH=m

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y

#
# SCSI Transport Attributes
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m

#
# SCSI low-level drivers
#
CONFIG_ISCSI_TCP=m
CONFIG_BLK_DEV_3W_XXXX_RAID=m
CONFIG_SCSI_3W_9XXX=m
CONFIG_SCSI_ACARD=m
CONFIG_SCSI_AACRAID=m
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=4
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
# CONFIG_AIC7XXX_DEBUG_ENABLE is not set
CONFIG_AIC7XXX_DEBUG_MASK=0
# CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set
CONFIG_SCSI_AIC7XXX_OLD=m
CONFIG_SCSI_AIC79XX=m
CONFIG_AIC79XX_CMDS_PER_DEVICE=4
CONFIG_AIC79XX_RESET_DELAY_MS=15000
# CONFIG_AIC79XX_ENABLE_RD_STRM is not set
# CONFIG_AIC79XX_DEBUG_ENABLE is not set
CONFIG_AIC79XX_DEBUG_MASK=0
# CONFIG_AIC79XX_REG_PRETTY_PRINT is not set
CONFIG_MEGARAID_NEWGEN=y
CONFIG_MEGARAID_MM=m
CONFIG_MEGARAID_MAILBOX=m
CONFIG_MEGARAID_SAS=m
CONFIG_SCSI_SATA=m
CONFIG_SCSI_SATA_AHCI=m
CONFIG_SCSI_SATA_SVW=m
CONFIG_SCSI_ATA_PIIX=m
CONFIG_SCSI_SATA_MV=m
CONFIG_SCSI_SATA_NV=m
CONFIG_SCSI_PDC_ADMA=m
CONFIG_SCSI_SATA_QSTOR=m
CONFIG_SCSI_SATA_PROMISE=m
CONFIG_SCSI_SATA_SX4=m
CONFIG_SCSI_SATA_SIL=m
CONFIG_SCSI_SATA_SIL24=m
CONFIG_SCSI_SATA_SIS=m
CONFIG_SCSI_SATA_ULI=m
CONFIG_SCSI_SATA_VIA=m
CONFIG_SCSI_SATA_VITESSE=m
CONFIG_SCSI_SATA_INTEL_COMBINED=y
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
CONFIG_SCSI_GDTH=m
CONFIG_SCSI_IPS=m
CONFIG_SCSI_IBMVSCSI=y
CONFIG_SCSI_INITIO=m
CONFIG_SCSI_INIA100=m
CONFIG_SCSI_PPA=m
CONFIG_SCSI_IMM=m
# CONFIG_SCSI_IZIP_EPP16 is not set
# CONFIG_SCSI_IZIP_SLOW_CTR is not set
CONFIG_SCSI_SYM53C8XX_2=m
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
# CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set
CONFIG_SCSI_IPR=m
CONFIG_SCSI_IPR_TRACE=y
CONFIG_SCSI_IPR_DUMP=y
# CONFIG_SCSI_QLOGIC_FC is not set
CONFIG_SCSI_QLOGIC_1280=m
CONFIG_SCSI_QLA2XXX=y
CONFIG_SCSI_QLA21XX=m
CONFIG_SCSI_QLA22XX=m
CONFIG_SCSI_QLA2300=m
CONFIG_SCSI_QLA2322=m
CONFIG_SCSI_QLA6312=m
CONFIG_SCSI_QLA24XX=m
CONFIG_SCSI_LPFC=m
CONFIG_SCSI_DC395x=m
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_DEBUG is not set

#
# PCMCIA SCSI adapter support
#
# CONFIG_PCMCIA_FDOMAIN is not set
CONFIG_PCMCIA_QLOGIC=m
CONFIG_PCMCIA_SYM53C500=m

#
# Multi-device support (RAID and LVM)
#
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID5=m
CONFIG_MD_RAID6=m
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
CONFIG_BLK_DEV_DM=m
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_MIRROR=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
CONFIG_DM_MULTIPATH_EMC=m

#
# Fusion MPT device support
#
CONFIG_FUSION=y
CONFIG_FUSION_SPI=m
CONFIG_FUSION_FC=m
CONFIG_FUSION_SAS=m
CONFIG_FUSION_MAX_SGE=40
CONFIG_FUSION_CTL=m
CONFIG_FUSION_LAN=m

#
# IEEE 1394 (FireWire) support
#
CONFIG_IEEE1394=m

#
# Subsystem Options
#
# CONFIG_IEEE1394_VERBOSEDEBUG is not set
CONFIG_IEEE1394_OUI_DB=y
CONFIG_IEEE1394_EXTRA_CONFIG_ROMS=y
CONFIG_IEEE1394_CONFIG_ROM_IP1394=y
# CONFIG_IEEE1394_EXPORT_FULL_API is not set

#
# Device Drivers
#
CONFIG_IEEE1394_PCILYNX=m
CONFIG_IEEE1394_OHCI1394=m

#
# Protocol Drivers
#
CONFIG_IEEE1394_VIDEO1394=m
CONFIG_IEEE1394_SBP2=m
# CONFIG_IEEE1394_SBP2_PHYS_DMA is not set
CONFIG_IEEE1394_ETH1394=m
CONFIG_IEEE1394_DV1394=m
CONFIG_IEEE1394_RAWIO=m
CONFIG_IEEE1394_CMP=m
CONFIG_IEEE1394_AMDTP=m

#
# I2O device support
#
# CONFIG_I2O is not set

#
# Macintosh device drivers
#
CONFIG_ADB_PMU=y
CONFIG_PMAC_SMU=y
CONFIG_THERM_PM72=y
CONFIG_WINDFARM=y
CONFIG_WINDFARM_PM81=y
CONFIG_WINDFARM_PM91=y

#
# Network device support
#
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_BONDING=m
CONFIG_EQUALIZER=m
CONFIG_TUN=m

#
# ARCnet devices
#
# CONFIG_ARCNET is not set

#
# PHY device support
#
CONFIG_PHYLIB=m

#
# MII PHY device drivers
#
CONFIG_MARVELL_PHY=m
CONFIG_DAVICOM_PHY=m
CONFIG_QSEMI_PHY=m
CONFIG_LXT_PHY=m
CONFIG_CICADA_PHY=m

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
CONFIG_HAPPYMEAL=m
CONFIG_SUNGEM=m
CONFIG_CASSINI=m
CONFIG_NET_VENDOR_3COM=y
CONFIG_VORTEX=m
CONFIG_TYPHOON=m

#
# Tulip family network device support
#
CONFIG_NET_TULIP=y
CONFIG_DE2104X=m
CONFIG_TULIP=m
# CONFIG_TULIP_MWI is not set
CONFIG_TULIP_MMIO=y
# CONFIG_TULIP_NAPI is not set
CONFIG_DE4X5=m
CONFIG_WINBOND_840=m
CONFIG_DM9102=m
CONFIG_ULI526X=m
CONFIG_PCMCIA_XIRCOM=m
# CONFIG_HP100 is not set
CONFIG_IBMVETH=m
CONFIG_NET_PCI=y
CONFIG_PCNET32=m
CONFIG_AMD8111_ETH=m
CONFIG_AMD8111E_NAPI=y
CONFIG_ADAPTEC_STARFIRE=m
CONFIG_ADAPTEC_STARFIRE_NAPI=y
CONFIG_B44=m
CONFIG_FORCEDETH=m
CONFIG_DGRS=m
# CONFIG_EEPRO100 is not set
CONFIG_E100=m
CONFIG_FEALNX=m
CONFIG_NATSEMI=m
CONFIG_NE2K_PCI=m
CONFIG_8139CP=m
CONFIG_8139TOO=m
# CONFIG_8139TOO_PIO is not set
# CONFIG_8139TOO_TUNE_TWISTER is not set
CONFIG_8139TOO_8129=y
# CONFIG_8139_OLD_RX_RESET is not set
CONFIG_SIS900=m
CONFIG_EPIC100=m
CONFIG_SUNDANCE=m
# CONFIG_SUNDANCE_MMIO is not set
CONFIG_VIA_RHINE=m
CONFIG_VIA_RHINE_MMIO=y
CONFIG_NET_POCKET=y
CONFIG_DE600=m
CONFIG_DE620=m

#
# Ethernet (1000 Mbit)
#
CONFIG_ACENIC=m
# CONFIG_ACENIC_OMIT_TIGON_I is not set
CONFIG_DL2K=m
CONFIG_E1000=m
CONFIG_E1000_NAPI=y
CONFIG_NS83820=m
CONFIG_HAMACHI=m
CONFIG_YELLOWFIN=m
CONFIG_R8169=m
CONFIG_R8169_NAPI=y
CONFIG_R8169_VLAN=y
CONFIG_SIS190=m
CONFIG_SKGE=m
# CONFIG_SK98LIN is not set
CONFIG_VIA_VELOCITY=m
CONFIG_TIGON3=m
CONFIG_BNX2=m
# CONFIG_MV643XX_ETH is not set

#
# Ethernet (10000 Mbit)
#
CONFIG_CHELSIO_T1=m
CONFIG_IXGB=m
CONFIG_IXGB_NAPI=y
CONFIG_S2IO=m
CONFIG_S2IO_NAPI=y

#
# Token Ring devices
#
CONFIG_TR=y
CONFIG_IBMOL=m
CONFIG_3C359=m
# CONFIG_TMS380TR is not set

#
# Wireless LAN (non-hamradio)
#
CONFIG_NET_RADIO=y

#
# Obsolete Wireless cards support (pre-802.11)
#
# CONFIG_STRIP is not set
CONFIG_PCMCIA_WAVELAN=m
CONFIG_PCMCIA_NETWAVE=m

#
# Wireless 802.11 Frequency Hopping cards support
#
# CONFIG_PCMCIA_RAYCS is not set

#
# Wireless 802.11b ISA/PCI cards support
#
# CONFIG_IPW2100 is not set
# CONFIG_IPW2200 is not set
CONFIG_AIRO=m
CONFIG_HERMES=m
CONFIG_APPLE_AIRPORT=m
CONFIG_PLX_HERMES=m
CONFIG_TMD_HERMES=m
CONFIG_NORTEL_HERMES=m
CONFIG_PCI_HERMES=m
CONFIG_ATMEL=m
CONFIG_PCI_ATMEL=m

#
# Wireless 802.11b Pcmcia/Cardbus cards support
#
CONFIG_PCMCIA_HERMES=m
CONFIG_PCMCIA_SPECTRUM=m
CONFIG_AIRO_CS=m
CONFIG_PCMCIA_ATMEL=m
CONFIG_PCMCIA_WL3501=m

#
# Prism GT/Duette 802.11(a/b/g) PCI/Cardbus support
#
CONFIG_PRISM54=m
CONFIG_HOSTAP=m
CONFIG_HOSTAP_FIRMWARE=y
CONFIG_HOSTAP_PLX=m
CONFIG_HOSTAP_PCI=m
CONFIG_HOSTAP_CS=m
CONFIG_NET_WIRELESS=y

#
# PCMCIA network device support
#
CONFIG_NET_PCMCIA=y
CONFIG_PCMCIA_3C589=m
CONFIG_PCMCIA_3C574=m
CONFIG_PCMCIA_FMVJ18X=m
CONFIG_PCMCIA_PCNET=m
CONFIG_PCMCIA_NMCLAN=m
CONFIG_PCMCIA_SMC91C92=m
CONFIG_PCMCIA_XIRC2PS=m
CONFIG_PCMCIA_AXNET=m

#
# Wan interfaces
#
# CONFIG_WAN is not set

#
# ATM drivers
#
# CONFIG_ATM_DUMMY is not set
CONFIG_ATM_TCP=m
CONFIG_ATM_LANAI=m
CONFIG_ATM_ENI=m
# CONFIG_ATM_ENI_DEBUG is not set
# CONFIG_ATM_ENI_TUNE_BURST is not set
# CONFIG_ATM_FIRESTREAM is not set
# CONFIG_ATM_ZATM is not set
CONFIG_ATM_IDT77252=m
# CONFIG_ATM_IDT77252_DEBUG is not set
# CONFIG_ATM_IDT77252_RCV_ALL is not set
CONFIG_ATM_IDT77252_USE_SUNI=y
# CONFIG_ATM_AMBASSADOR is not set
# CONFIG_ATM_HORIZON is not set
CONFIG_ATM_FORE200E_MAYBE=m
# CONFIG_ATM_FORE200E_PCA is not set
CONFIG_ATM_HE=m
# CONFIG_ATM_HE_USE_SUNI is not set
CONFIG_FDDI=y
# CONFIG_DEFXX is not set
CONFIG_SKFP=m
# CONFIG_HIPPI is not set
CONFIG_PLIP=m
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
# CONFIG_PPP_BSDCOMP is not set
CONFIG_PPP_MPPE=m
CONFIG_PPPOE=m
CONFIG_PPPOATM=m
CONFIG_SLIP=m
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLIP_SMART=y
# CONFIG_SLIP_MODE_SLIP6 is not set
CONFIG_NET_FC=y
# CONFIG_SHAPER is not set
CONFIG_NETCONSOLE=m
CONFIG_NETPOLL=y
# CONFIG_NETPOLL_RX is not set
CONFIG_NETPOLL_TRAP=y
CONFIG_NET_POLL_CONTROLLER=y

#
# ISDN subsystem
#
CONFIG_ISDN=m

#
# Old ISDN4Linux
#
CONFIG_ISDN_I4L=m
CONFIG_ISDN_PPP=y
CONFIG_ISDN_PPP_VJ=y
CONFIG_ISDN_MPP=y
CONFIG_IPPP_FILTER=y
# CONFIG_ISDN_PPP_BSDCOMP is not set
CONFIG_ISDN_AUDIO=y
CONFIG_ISDN_TTY_FAX=y

#
# ISDN feature submodules
#
CONFIG_ISDN_DIVERSION=m

#
# ISDN4Linux hardware drivers
#

#
# Passive cards
#
CONFIG_ISDN_DRV_HISAX=m

#
# D-channel protocol features
#
CONFIG_HISAX_EURO=y
CONFIG_DE_AOC=y
CONFIG_HISAX_NO_SENDCOMPLETE=y
CONFIG_HISAX_NO_LLC=y
CONFIG_HISAX_NO_KEYPAD=y
CONFIG_HISAX_1TR6=y
CONFIG_HISAX_NI1=y
CONFIG_HISAX_MAX_CARDS=8

#
# HiSax supported cards
#
CONFIG_HISAX_16_3=y
CONFIG_HISAX_S0BOX=y
CONFIG_HISAX_AVM_A1_PCMCIA=y
CONFIG_HISAX_ELSA=y
CONFIG_HISAX_DIEHLDIVA=y
CONFIG_HISAX_SEDLBAUER=y
CONFIG_HISAX_NICCY=y
CONFIG_HISAX_BKM_A4T=y
CONFIG_HISAX_SCT_QUADRO=y
CONFIG_HISAX_GAZEL=y
CONFIG_HISAX_W6692=y
CONFIG_HISAX_HFC_SX=y
# CONFIG_HISAX_DEBUG is not set

#
# HiSax PCMCIA card service modules
#
CONFIG_HISAX_SEDLBAUER_CS=m
CONFIG_HISAX_ELSA_CS=m
CONFIG_HISAX_AVM_A1_CS=m
CONFIG_HISAX_TELES_CS=m

#
# HiSax sub driver modules
#
CONFIG_HISAX_ST5481=m
# CONFIG_HISAX_HFCUSB is not set
CONFIG_HISAX_HFC4S8S=m
CONFIG_HISAX_FRITZ_PCIPNP=m
CONFIG_HISAX_HDLC=y

#
# Active cards
#

#
# CAPI subsystem
#
CONFIG_ISDN_CAPI=m
CONFIG_ISDN_DRV_AVMB1_VERBOSE_REASON=y
CONFIG_ISDN_CAPI_MIDDLEWARE=y
CONFIG_ISDN_CAPI_CAPI20=m
CONFIG_ISDN_CAPI_CAPIFS_BOOL=y
CONFIG_ISDN_CAPI_CAPIFS=m
CONFIG_ISDN_CAPI_CAPIDRV=m

#
# CAPI hardware drivers
#

#
# Active AVM cards
#
CONFIG_CAPI_AVM=y
CONFIG_ISDN_DRV_AVMB1_B1PCI=m
CONFIG_ISDN_DRV_AVMB1_B1PCIV4=y
CONFIG_ISDN_DRV_AVMB1_B1PCMCIA=m
CONFIG_ISDN_DRV_AVMB1_AVM_CS=m
CONFIG_ISDN_DRV_AVMB1_T1PCI=m
CONFIG_ISDN_DRV_AVMB1_C4=m

#
# Active Eicon DIVA Server cards
#
CONFIG_CAPI_EICON=y
CONFIG_ISDN_DIVAS=m
CONFIG_ISDN_DIVAS_BRIPCI=y
CONFIG_ISDN_DIVAS_PRIPCI=y
CONFIG_ISDN_DIVAS_DIVACAPI=m
CONFIG_ISDN_DIVAS_USERIDI=m
CONFIG_ISDN_DIVAS_MAINT=m

#
# Telephony Support
#
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
# CONFIG_INPUT_TSDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_SERIAL=m
CONFIG_MOUSE_VSXXXAA=m
CONFIG_INPUT_JOYSTICK=y
CONFIG_JOYSTICK_ANALOG=m
CONFIG_JOYSTICK_A3D=m
CONFIG_JOYSTICK_ADI=m
CONFIG_JOYSTICK_COBRA=m
CONFIG_JOYSTICK_GF2K=m
CONFIG_JOYSTICK_GRIP=m
CONFIG_JOYSTICK_GRIP_MP=m
CONFIG_JOYSTICK_GUILLEMOT=m
CONFIG_JOYSTICK_INTERACT=m
CONFIG_JOYSTICK_SIDEWINDER=m
CONFIG_JOYSTICK_TMDC=m
CONFIG_JOYSTICK_IFORCE=m
CONFIG_JOYSTICK_IFORCE_USB=y
CONFIG_JOYSTICK_IFORCE_232=y
CONFIG_JOYSTICK_WARRIOR=m
CONFIG_JOYSTICK_MAGELLAN=m
CONFIG_JOYSTICK_SPACEORB=m
CONFIG_JOYSTICK_SPACEBALL=m
CONFIG_JOYSTICK_STINGER=m
CONFIG_JOYSTICK_TWIDJOY=m
CONFIG_JOYSTICK_DB9=m
CONFIG_JOYSTICK_GAMECON=m
CONFIG_JOYSTICK_TURBOGRAFX=m
CONFIG_JOYSTICK_JOYDUMP=m
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_GUNZE=m
CONFIG_TOUCHSCREEN_ELO=m
CONFIG_TOUCHSCREEN_MTOUCH=m
CONFIG_TOUCHSCREEN_MK712=m
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_PCSPKR is not set
CONFIG_INPUT_UINPUT=m

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
CONFIG_GAMEPORT=m
CONFIG_GAMEPORT_NS558=m
CONFIG_GAMEPORT_L4=m
CONFIG_GAMEPORT_EMU10K1=m
CONFIG_GAMEPORT_FM801=m

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_NONSTANDARD=y
CONFIG_ROCKETPORT=m
# CONFIG_CYCLADES is not set
# CONFIG_DIGIEPCA is not set
# CONFIG_MOXA_SMARTIO is not set
# CONFIG_ISI is not set
# CONFIG_SYNCLINK is not set
# CONFIG_SYNCLINKMP is not set
CONFIG_N_HDLC=m
# CONFIG_SPECIALIX is not set
# CONFIG_SX is not set
CONFIG_STALDRV=y

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_CS=m
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_SERIAL_8250_RSA=y

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SERIAL_PMACZILOG=m
CONFIG_SERIAL_ICOM=m
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_PRINTER=m
CONFIG_LP_CONSOLE=y
CONFIG_PPDEV=m
CONFIG_TIPAR=m
CONFIG_HVC_CONSOLE=y
CONFIG_HVCS=y

#
# IPMI
#
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m

#
# Watchdog Cards
#
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
CONFIG_WATCHDOG_RTAS=m

#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m
CONFIG_WDT_501_PCI=y

#
# USB-based Watchdog Cards
#
CONFIG_USBPCWATCHDOG=m
# CONFIG_RTC is not set
CONFIG_GEN_RTC=y
# CONFIG_GEN_RTC_X is not set
CONFIG_DTLK=m
CONFIG_R3964=m
# CONFIG_APPLICOM is not set

#
# Ftape, the floppy tape device driver
#
CONFIG_AGP=y
CONFIG_AGP_UNINORTH=y
CONFIG_DRM=m
CONFIG_DRM_TDFX=m
CONFIG_DRM_R128=m
CONFIG_DRM_RADEON=m
CONFIG_DRM_MGA=m
CONFIG_DRM_SIS=m
CONFIG_DRM_VIA=m
CONFIG_DRM_SAVAGE=m

#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
CONFIG_CARDMAN_4000=m
CONFIG_CARDMAN_4040=m
# CONFIG_RAW_DRIVER is not set
CONFIG_HANGCHECK_TIMER=m

#
# TPM devices
#
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set

#
# I2C support
#
CONFIG_I2C=y
CONFIG_I2C_CHARDEV=m

#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=y
CONFIG_I2C_ALGOPCF=m
CONFIG_I2C_ALGOPCA=m

#
# I2C Hardware Bus support
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
# CONFIG_I2C_I801 is not set
# CONFIG_I2C_I810 is not set
# CONFIG_I2C_PIIX4 is not set
CONFIG_I2C_ISA=m
CONFIG_I2C_KEYWEST=y
CONFIG_I2C_PMAC_SMU=y
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_PARPORT=m
CONFIG_I2C_PARPORT_LIGHT=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
# CONFIG_SCx200_ACB is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
CONFIG_I2C_STUB=m
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set
CONFIG_I2C_VOODOO3=m
CONFIG_I2C_PCA_ISA=m

#
# Miscellaneous I2C Chip support
#
CONFIG_SENSORS_DS1337=m
CONFIG_SENSORS_DS1374=m
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
CONFIG_SENSORS_PCA9539=m
CONFIG_SENSORS_PCF8591=m
CONFIG_SENSORS_RTC8564=m
CONFIG_SENSORS_MAX6875=m
CONFIG_RTC_X1205_I2C=m
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set

#
# Dallas's 1-wire bus
#
CONFIG_W1=m
CONFIG_W1_MATROX=m
CONFIG_W1_DS9490=m
CONFIG_W1_DS9490_BRIDGE=m
CONFIG_W1_THERM=m
CONFIG_W1_SMEM=m
CONFIG_W1_DS2433=m
CONFIG_W1_DS2433_CRC=y

#
# Hardware Monitoring support
#
CONFIG_HWMON=m
CONFIG_HWMON_VID=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ADM9240=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_FSCHER=m
CONFIG_SENSORS_FSCPOS=m
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_GL520SM=m
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM63=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_LM92=m
CONFIG_SENSORS_MAX1619=m
CONFIG_SENSORS_PC87360=m
CONFIG_SENSORS_SIS5595=m
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_SMSC47B397=m
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83792D=m
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Misc devices
#

#
# Multimedia Capabilities Port drivers
#

#
# Multimedia devices
#
CONFIG_VIDEO_DEV=m

#
# Video For Linux
#

#
# Video Adapters
#
CONFIG_VIDEO_BT848=m
CONFIG_VIDEO_BT848_DVB=y
CONFIG_VIDEO_SAA6588=m
CONFIG_VIDEO_BWQCAM=m
CONFIG_VIDEO_CQCAM=m
CONFIG_VIDEO_W9966=m
CONFIG_VIDEO_CPIA=m
CONFIG_VIDEO_CPIA_PP=m
CONFIG_VIDEO_CPIA_USB=m
CONFIG_VIDEO_SAA5246A=m
CONFIG_VIDEO_SAA5249=m
CONFIG_TUNER_3036=m
# CONFIG_VIDEO_STRADIS is not set
# CONFIG_VIDEO_ZORAN is not set
CONFIG_VIDEO_SAA7134=m
CONFIG_VIDEO_SAA7134_ALSA=m
CONFIG_VIDEO_SAA7134_DVB=m
CONFIG_VIDEO_SAA7134_DVB_ALL_FRONTENDS=y
CONFIG_VIDEO_MXB=m
CONFIG_VIDEO_DPC=m
CONFIG_VIDEO_HEXIUM_ORION=m
CONFIG_VIDEO_HEXIUM_GEMINI=m
CONFIG_VIDEO_CX88=m
CONFIG_VIDEO_CX88_DVB=m
CONFIG_VIDEO_CX88_DVB_ALL_FRONTENDS=y
CONFIG_VIDEO_EM28XX=m
CONFIG_VIDEO_OVCAMCHIP=m
CONFIG_VIDEO_AUDIO_DECODER=m
CONFIG_VIDEO_DECODER=m

#
# Radio Adapters
#
CONFIG_RADIO_GEMTEK_PCI=m
CONFIG_RADIO_MAXIRADIO=m
CONFIG_RADIO_MAESTRO=m

#
# Digital Video Broadcasting Devices
#
CONFIG_DVB=y
CONFIG_DVB_CORE=m

#
# Supported SAA7146 based PCI Adapters
#
CONFIG_DVB_AV7110=m
CONFIG_DVB_AV7110_OSD=y
CONFIG_DVB_BUDGET=m
CONFIG_DVB_BUDGET_CI=m
CONFIG_DVB_BUDGET_AV=m
CONFIG_DVB_BUDGET_PATCH=m

#
# Supported USB Adapters
#
CONFIG_DVB_USB=m
# CONFIG_DVB_USB_DEBUG is not set
CONFIG_DVB_USB_A800=m
CONFIG_DVB_USB_DIBUSB_MB=m
CONFIG_DVB_USB_DIBUSB_MC=m
CONFIG_DVB_USB_UMT_010=m
CONFIG_DVB_USB_CXUSB=m
CONFIG_DVB_USB_DIGITV=m
CONFIG_DVB_USB_VP7045=m
CONFIG_DVB_USB_VP702X=m
CONFIG_DVB_USB_NOVA_T_USB2=m
CONFIG_DVB_USB_DTT200U=m
CONFIG_DVB_TTUSB_BUDGET=m
CONFIG_DVB_TTUSB_DEC=m
CONFIG_DVB_CINERGYT2=m
CONFIG_DVB_CINERGYT2_TUNING=y
CONFIG_DVB_CINERGYT2_STREAM_URB_COUNT=32
CONFIG_DVB_CINERGYT2_STREAM_BUF_SIZE=512
CONFIG_DVB_CINERGYT2_QUERY_INTERVAL=250
CONFIG_DVB_CINERGYT2_ENABLE_RC_INPUT_DEVICE=y
CONFIG_DVB_CINERGYT2_RC_QUERY_INTERVAL=100

#
# Supported FlexCopII (B2C2) Adapters
#
CONFIG_DVB_B2C2_FLEXCOP=m
CONFIG_DVB_B2C2_FLEXCOP_PCI=m
CONFIG_DVB_B2C2_FLEXCOP_USB=m
# CONFIG_DVB_B2C2_FLEXCOP_DEBUG is not set

#
# Supported BT878 Adapters
#
CONFIG_DVB_BT8XX=m

#
# Supported Pluto2 Adapters
#
CONFIG_DVB_PLUTO2=m

#
# Supported DVB Frontends
#

#
# Customise DVB Frontends
#

#
# DVB-S (satellite) frontends
#
CONFIG_DVB_STV0299=m
CONFIG_DVB_CX24110=m
CONFIG_DVB_TDA8083=m
CONFIG_DVB_TDA80XX=m
CONFIG_DVB_MT312=m
CONFIG_DVB_VES1X93=m
CONFIG_DVB_S5H1420=m

#
# DVB-T (terrestrial) frontends
#
CONFIG_DVB_SP8870=m
CONFIG_DVB_SP887X=m
CONFIG_DVB_CX22700=m
CONFIG_DVB_CX22702=m
CONFIG_DVB_L64781=m
CONFIG_DVB_TDA1004X=m
CONFIG_DVB_NXT6000=m
CONFIG_DVB_MT352=m
CONFIG_DVB_DIB3000MB=m
CONFIG_DVB_DIB3000MC=m

#
# DVB-C (cable) frontends
#
CONFIG_DVB_ATMEL_AT76C651=m
CONFIG_DVB_VES1820=m
CONFIG_DVB_TDA10021=m
CONFIG_DVB_STV0297=m

#
# ATSC (North American/Korean Terrestrial DTV) frontends
#
CONFIG_DVB_NXT2002=m
CONFIG_DVB_NXT200X=m
CONFIG_DVB_OR51211=m
CONFIG_DVB_OR51132=m
CONFIG_DVB_BCM3510=m
CONFIG_DVB_LGDT330X=m
CONFIG_VIDEO_SAA7146=m
CONFIG_VIDEO_SAA7146_VV=m
CONFIG_VIDEO_VIDEOBUF=m
CONFIG_VIDEO_TUNER=m
CONFIG_VIDEO_BUF=m
CONFIG_VIDEO_BUF_DVB=m
CONFIG_VIDEO_BTCX=m
CONFIG_VIDEO_IR=m
CONFIG_VIDEO_TVEEPROM=m

#
# Graphics support
#
CONFIG_FB=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_MACMODES=y
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y
CONFIG_FB_CIRRUS=m
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
CONFIG_FB_OF=y
# CONFIG_FB_CONTROL is not set
# CONFIG_FB_PLATINUM is not set
# CONFIG_FB_VALKYRIE is not set
# CONFIG_FB_CT65550 is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
CONFIG_VIDEO_SELECT=y
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
CONFIG_FB_RIVA=m
# CONFIG_FB_RIVA_I2C is not set
# CONFIG_FB_RIVA_DEBUG is not set
CONFIG_FB_MATROX=m
CONFIG_FB_MATROX_MILLENIUM=y
CONFIG_FB_MATROX_MYSTIQUE=y
CONFIG_FB_MATROX_G=y
CONFIG_FB_MATROX_I2C=m
CONFIG_FB_MATROX_MAVEN=m
CONFIG_FB_MATROX_MULTIHEAD=y
# CONFIG_FB_RADEON_OLD is not set
CONFIG_FB_RADEON=y
CONFIG_FB_RADEON_I2C=y
# CONFIG_FB_RADEON_DEBUG is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
CONFIG_FB_SAVAGE=m
CONFIG_FB_SAVAGE_I2C=y
CONFIG_FB_SAVAGE_ACCEL=y
# CONFIG_FB_SIS is not set
CONFIG_FB_NEOMAGIC=m
CONFIG_FB_KYRO=m
CONFIG_FB_3DFX=m
CONFIG_FB_3DFX_ACCEL=y
CONFIG_FB_VOODOO1=m
CONFIG_FB_CYBLA=m
CONFIG_FB_TRIDENT=m
CONFIG_FB_TRIDENT_ACCEL=y
# CONFIG_FB_VIRTUAL is not set

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y

#
# Logo configuration
#
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
# CONFIG_LOGO_LINUX_VGA16 is not set
CONFIG_LOGO_LINUX_CLUT224=y
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_BACKLIGHT_CLASS_DEVICE=m
CONFIG_BACKLIGHT_DEVICE=y
CONFIG_LCD_CLASS_DEVICE=m
CONFIG_LCD_DEVICE=y

#
# Sound
#
CONFIG_SOUND=m

#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_AC97_BUS=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_SEQUENCER_OSS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
CONFIG_SND_GENERIC_DRIVER=y

#
# Generic devices
#
CONFIG_SND_MPU401_UART=m
CONFIG_SND_OPL3_LIB=m
CONFIG_SND_VX_LIB=m
CONFIG_SND_DUMMY=m
CONFIG_SND_VIRMIDI=m
CONFIG_SND_MTPAV=m
# CONFIG_SND_SERIAL_U16550 is not set
CONFIG_SND_MPU401=m

#
# PCI devices
#
CONFIG_SND_ALI5451=m
CONFIG_SND_ATIIXP=m
CONFIG_SND_ATIIXP_MODEM=m
CONFIG_SND_AU8810=m
CONFIG_SND_AU8820=m
CONFIG_SND_AU8830=m
CONFIG_SND_AZT3328=m
CONFIG_SND_BT87X=m
# CONFIG_SND_BT87X_OVERCLOCK is not set
CONFIG_SND_CS46XX=m
CONFIG_SND_CS46XX_NEW_DSP=y
CONFIG_SND_CS4281=m
CONFIG_SND_EMU10K1=m
CONFIG_SND_EMU10K1X=m
CONFIG_SND_CA0106=m
CONFIG_SND_KORG1212=m
CONFIG_SND_MIXART=m
CONFIG_SND_NM256=m
CONFIG_SND_RME32=m
CONFIG_SND_RME96=m
CONFIG_SND_RME9652=m
CONFIG_SND_HDSP=m
CONFIG_SND_HDSPM=m
CONFIG_SND_TRIDENT=m
CONFIG_SND_YMFPCI=m
CONFIG_SND_AD1889=m
CONFIG_SND_ALS4000=m
CONFIG_SND_CMIPCI=m
CONFIG_SND_ENS1370=m
CONFIG_SND_ENS1371=m
CONFIG_SND_ES1938=m
CONFIG_SND_ES1968=m
CONFIG_SND_MAESTRO3=m
CONFIG_SND_FM801=m
CONFIG_SND_FM801_TEA575X=m
CONFIG_SND_ICE1712=m
CONFIG_SND_ICE1724=m
CONFIG_SND_INTEL8X0=m
CONFIG_SND_INTEL8X0M=m
CONFIG_SND_SONICVIBES=m
CONFIG_SND_VIA82XX=m
CONFIG_SND_VIA82XX_MODEM=m
CONFIG_SND_VX222=m
CONFIG_SND_HDA_INTEL=m

#
# ALSA PowerMac devices
#
CONFIG_SND_POWERMAC=m
CONFIG_SND_POWERMAC_AUTO_DRC=y

#
# USB devices
#
CONFIG_SND_USB_AUDIO=m
CONFIG_SND_USB_USX2Y=m

#
# PCMCIA devices
#

#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set

#
# USB support
#
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_BANDWIDTH is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_SUSPEND is not set
# CONFIG_USB_OTG is not set

#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
CONFIG_USB_EHCI_SPLIT_ISO=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
CONFIG_USB_ISP116X_HCD=m
CONFIG_USB_OHCI_HCD=m
# CONFIG_USB_OHCI_BIG_ENDIAN is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_UHCI_HCD=m
CONFIG_USB_SL811_HCD=m
CONFIG_USB_SL811_CS=m

#
# USB Device Class drivers
#
# CONFIG_OBSOLETE_OSS_USB_DRIVER is not set
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m

#
# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support'
#

#
# may also be needed; see USB_STORAGE Help for more information
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_DPCM=y
CONFIG_USB_STORAGE_USBAT=y
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y

#
# USB Input Devices
#
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT=y
CONFIG_HID_FF=y
CONFIG_HID_PID=y
CONFIG_LOGITECH_FF=y
CONFIG_THRUSTMASTER_FF=y
CONFIG_USB_HIDDEV=y
CONFIG_USB_AIPTEK=m
CONFIG_USB_WACOM=m
CONFIG_USB_ACECAD=m
CONFIG_USB_KBTAB=m
CONFIG_USB_POWERMATE=m
CONFIG_USB_MTOUCH=m
CONFIG_USB_ITMTOUCH=m
CONFIG_USB_EGALAX=m
# CONFIG_USB_YEALINK is not set
CONFIG_USB_XPAD=m
CONFIG_USB_ATI_REMOTE=m
CONFIG_USB_KEYSPAN_REMOTE=m
CONFIG_USB_APPLETOUCH=m

#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m

#
# USB Multimedia devices
#
CONFIG_USB_DABUSB=m
CONFIG_USB_VICAM=m
CONFIG_USB_DSBR=m
CONFIG_USB_IBMCAM=m
CONFIG_USB_KONICAWC=m
CONFIG_USB_OV511=m
CONFIG_USB_SE401=m
CONFIG_USB_SN9C102=m
CONFIG_USB_STV680=m
CONFIG_USB_W9968CF=m
CONFIG_USB_PWC=m

#
# USB Network Adapters
#
CONFIG_USB_CATC=m
CONFIG_USB_KAWETH=m
CONFIG_USB_PEGASUS=m
CONFIG_USB_RTL8150=m
CONFIG_USB_USBNET=m
CONFIG_USB_NET_AX8817X=m
CONFIG_USB_NET_CDCETHER=m
CONFIG_USB_NET_GL620A=m
CONFIG_USB_NET_NET1080=m
CONFIG_USB_NET_PLUSB=m
CONFIG_USB_NET_RNDIS_HOST=m
CONFIG_USB_NET_CDC_SUBSET=m
CONFIG_USB_ALI_M5632=y
CONFIG_USB_AN2720=y
CONFIG_USB_BELKIN=y
CONFIG_USB_ARMLINUX=y
CONFIG_USB_EPSON2888=y
CONFIG_USB_NET_ZAURUS=m
CONFIG_USB_ZD1201=m
CONFIG_USB_MON=y

#
# USB port drivers
#
CONFIG_USB_USS720=m

#
# USB Serial Converter support
#
CONFIG_USB_SERIAL=m
CONFIG_USB_SERIAL_GENERIC=y
CONFIG_USB_SERIAL_AIRPRIME=m
CONFIG_USB_SERIAL_ANYDATA=m
CONFIG_USB_SERIAL_BELKIN=m
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
CONFIG_USB_SERIAL_CP2101=m
CONFIG_USB_SERIAL_CYPRESS_M8=m
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
CONFIG_USB_SERIAL_KEYSPAN=m
CONFIG_USB_SERIAL_KEYSPAN_MPR=y
CONFIG_USB_SERIAL_KEYSPAN_USA28=y
CONFIG_USB_SERIAL_KEYSPAN_USA28X=y
CONFIG_USB_SERIAL_KEYSPAN_USA28XA=y
CONFIG_USB_SERIAL_KEYSPAN_USA28XB=y
CONFIG_USB_SERIAL_KEYSPAN_USA19=y
CONFIG_USB_SERIAL_KEYSPAN_USA18X=y
CONFIG_USB_SERIAL_KEYSPAN_USA19W=y
CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y
CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y
CONFIG_USB_SERIAL_KEYSPAN_USA49W=y
CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
CONFIG_USB_SERIAL_PL2303=m
CONFIG_USB_SERIAL_HP4X=m
CONFIG_USB_SERIAL_SAFE=m
CONFIG_USB_SERIAL_SAFE_PADDED=y
CONFIG_USB_SERIAL_TI=m
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
CONFIG_USB_SERIAL_OPTION=m
CONFIG_USB_SERIAL_OMNINET=m
CONFIG_USB_EZUSB=y

#
# USB Miscellaneous drivers
#
CONFIG_USB_EMI62=m
# CONFIG_USB_EMI26 is not set
CONFIG_USB_AUERSWALD=m
CONFIG_USB_RIO500=m
CONFIG_USB_LEGOTOWER=m
CONFIG_USB_LCD=m
CONFIG_USB_LED=m
# CONFIG_USB_CYTHERM is not set
CONFIG_USB_PHIDGETKIT=m
CONFIG_USB_PHIDGETSERVO=m
CONFIG_USB_IDMOUSE=m
CONFIG_USB_SISUSBVGA=m
CONFIG_USB_SISUSBVGA_CON=y
CONFIG_USB_LD=m
CONFIG_USB_TEST=m

#
# USB DSL modem support
#
CONFIG_USB_ATM=m
CONFIG_USB_SPEEDTOUCH=m
CONFIG_USB_CXACRU=m
CONFIG_USB_XUSBATM=m

#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set

#
# MMC/SD Card support
#
CONFIG_MMC=m
# CONFIG_MMC_DEBUG is not set
CONFIG_MMC_BLOCK=m
CONFIG_MMC_WBSD=m

#
# InfiniBand support
#
CONFIG_INFINIBAND=m
CONFIG_INFINIBAND_USER_MAD=m
CONFIG_INFINIBAND_USER_ACCESS=m
CONFIG_INFINIBAND_MTHCA=m
# CONFIG_INFINIBAND_MTHCA_DEBUG is not set
CONFIG_INFINIBAND_IPOIB=m
# CONFIG_INFINIBAND_IPOIB_DEBUG is not set
CONFIG_INFINIBAND_SRP=m

#
# SN Devices
#

#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
# CONFIG_EXT2_FS_XIP is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=m
# CONFIG_REISERFS_CHECK is not set
CONFIG_REISERFS_PROC_INFO=y
CONFIG_REISERFS_FS_XATTR=y
CONFIG_REISERFS_FS_POSIX_ACL=y
CONFIG_REISERFS_FS_SECURITY=y
CONFIG_JFS_FS=m
CONFIG_JFS_POSIX_ACL=y
CONFIG_JFS_SECURITY=y
# CONFIG_JFS_DEBUG is not set
# CONFIG_JFS_STATISTICS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_XFS_FS=m
CONFIG_XFS_EXPORT=y
CONFIG_XFS_QUOTA=y
CONFIG_XFS_SECURITY=y
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
CONFIG_MINIX_FS=m
CONFIG_ROMFS_FS=m
CONFIG_INOTIFY=y
CONFIG_QUOTA=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_DNOTIFY=y
CONFIG_AUTOFS_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=m

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="ascii"
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_RAMFS=y
CONFIG_RELAYFS_FS=m

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
CONFIG_AFFS_FS=m
CONFIG_HFS_FS=m
CONFIG_HFSPLUS_FS=m
CONFIG_BEFS_FS=m
# CONFIG_BEFS_DEBUG is not set
CONFIG_BFS_FS=m
CONFIG_EFS_FS=m
# CONFIG_JFFS_FS is not set
CONFIG_JFFS2_FS=m
CONFIG_JFFS2_FS_DEBUG=0
CONFIG_JFFS2_FS_WRITEBUFFER=y
CONFIG_JFFS2_SUMMARY=y
# CONFIG_JFFS2_COMPRESSION_OPTIONS is not set
CONFIG_JFFS2_ZLIB=y
CONFIG_JFFS2_RTIME=y
# CONFIG_JFFS2_RUBIN is not set
CONFIG_CRAMFS=m
CONFIG_VXFS_FS=m
# CONFIG_HPFS_FS is not set
CONFIG_QNX4FS_FS=m
CONFIG_SYSV_FS=m
CONFIG_UFS_FS=m
# CONFIG_UFS_FS_WRITE is not set

#
# Network File Systems
#
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFS_DIRECTIO=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=m
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_RPCSEC_GSS_SPKM3=m
# CONFIG_SMB_FS is not set
CONFIG_CIFS=m
# CONFIG_CIFS_STATS is not set
CONFIG_CIFS_XATTR=y
CONFIG_CIFS_POSIX=y
# CONFIG_CIFS_EXPERIMENTAL is not set
CONFIG_NCP_FS=m
CONFIG_NCPFS_PACKET_SIGNING=y
CONFIG_NCPFS_IOCTL_LOCKING=y
CONFIG_NCPFS_STRONG=y
CONFIG_NCPFS_NFS_NS=y
CONFIG_NCPFS_OS2_NS=y
CONFIG_NCPFS_SMALLDOS=y
CONFIG_NCPFS_NLS=y
CONFIG_NCPFS_EXTRAS=y
CONFIG_CODA_FS=m
# CONFIG_CODA_FS_OLD_API is not set
# CONFIG_AFS_FS is not set
CONFIG_9P_FS=m

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
# CONFIG_ATARI_PARTITION is not set
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
# CONFIG_LDM_PARTITION is not set
CONFIG_SGI_PARTITION=y
# CONFIG_ULTRIX_PARTITION is not set
CONFIG_SUN_PARTITION=y
CONFIG_EFI_PARTITION=y

#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_UTF8=m

#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
CONFIG_CRC32=y
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m

#
# Instrumentation Support
#
CONFIG_PROFILING=y
CONFIG_OPROFILE=m
# CONFIG_KPROBES is not set

#
# Kernel hacking
#
# CONFIG_PRINTK_TIME is not set
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_LOG_BUF_SHIFT=17
CONFIG_DETECT_SOFTLOCKUP=y
# CONFIG_SCHEDSTATS is not set
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_FS=y
# CONFIG_DEBUG_VM is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_DEBUGGER=y
CONFIG_XMON=y
CONFIG_XMON_DEFAULT=y
CONFIG_IRQSTACKS=y
CONFIG_BOOTX_TEXT=y

#
# Security options
#
CONFIG_KEYS=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y
CONFIG_SECURITY=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_CAPABILITIES=y
# CONFIG_SECURITY_ROOTPLUG is not set
# CONFIG_SECURITY_SECLVL is not set
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
CONFIG_KEYS_COMPAT=y

#
# Cryptographic options
#
CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_AES=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_CRC32C=m
# CONFIG_CRYPTO_TEST is not set

#
# Hardware crypto devices
#

From ahuja at austin.ibm.com  Thu Feb 16 15:44:02 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Wed, 15 Feb 2006 22:44:02 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <20060214183259.28a6a501.sfr@canb.auug.org.au>
References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com>
	<20060214183259.28a6a501.sfr@canb.auug.org.au>
Message-ID: <43F40312.2020800@austin.ibm.com>

Paulus, Stephen:

I have rebuilt this patch with suggestions and against 2.6.16-git3-rc3. 
This should apply cleanly.

Stephen,

Answering some of your queries.

>Why not PACA_START_TB and PACA_DELTA_TB?  Also, start_tb and delta_tb don't really
>store time base values, but PURR values.
>  
>
When I dropped the earlier patch, we were tracking only PURRs, but since 
the PURR is in a sense a function of the timebase, one of the comments was 
that I should use "tb" instead of "purr". It made sense then, but now, with 
so many things being tracked, I would ideally like to change tb/*_cpu_util 
to purr, as that adds quite a bit of readability.

>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/entry_64.S
>>===================================================================
>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/entry_64.S	2005-12-18 16:36:54.000000000 -0800
>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/entry_64.S	2006-01-17 15:39:03.000000000 -0800
>>@@ -520,7 +520,19 @@
>> 	 * r13 is our per cpu area, only restore it if we are returning to
>> 	 * userspace
>> 	 */
>>+
>> 	beq	1f
>>+BEGIN_FTR_SECTION
>>+	li	r10,0
>>+	stb	r10,PACA_CDFLAG(r13)
>>    
>>
>
>cdflag get set here but not set or used anywhere else.
>
>  
>
I have a segment of code that uses this functionality. I pulled it out, 
since somewhere my math wasn't adding up. I left it to be dropped as a 
patch later. But if you wish, I can take this out now and add it later.

>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
>>===================================================================
>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c	2005-12-18 16:36:54.000000000 -0800
>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c	2006-01-17 21:20:25.000000000 -0800
>>@@ -243,6 +243,7 @@
>> 	struct thread_struct *new_thread, *old_thread;
>> 	unsigned long flags;
>> 	struct task_struct *last;
>>+	struct paca_struct *lpaca;
>>    
>>
>
>This could have been declared below (near pd)
>
>  
>

Yes... But it seems fine there..


#ifdef CONFIG_SMP /* avoid complexity of lazy save/restore of fpu

>>@@ -313,19 +314,34 @@
>> 	new_thread = &new->thread;
>> 	old_thread = &current->thread;
>> 
>>-#ifdef CONFIG_PPC64
>>-	/*
>>-	 * Collect processor utilization data per process
>>-	 */
>>-	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
>>-		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
>>-		long unsigned start_tb, current_tb;
>>-		start_tb = old_thread->start_tb;
>>-		cu->current_tb = current_tb = mfspr(SPRN_PURR);
>>-		old_thread->accum_tb += (current_tb - start_tb);
>>-		new_thread->start_tb = current_tb;
>>+
>>+/* Collect cpu_util utilization data per process and per processor wise */
>>+	if (cpu_has_feature(CPU_FTR_PURR)) {
>>+		struct cpu_usage *pd = &__get_cpu_var(cpu_usage_array);
>>    
>>
>
>Was there some good reason to change this variable name from cu to pd?
>  
>

Not really... except that pd stands for purr data and I liked that abbreviation more.

>>+		long unsigned start_cpu_util, current_cpu_util;
>>+
>>+		if ( old_thread->start_cpu_util )
>>+			pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
>>+		else
>>+		   	old_thread->start_cpu_util = pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
>>    
>>
>
>Probably better would be:
>	pd->current_cpu_util = current_cpu_util = mfspr(SPRN_PURR);
>	if (old_thread->start_cpu_util == 0)
>		old_thread->start_cpu_util = current_cpu_util;
>
>  
>

Yeah, that should have been obvious. Changed as requested.
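
For readers following along, here is a rough user-space sketch of the accounting step as suggested above. It is not the code from the patch: the struct is a cut-down stand-in for the thread fields, account_switch() is a made-up name, and purr_now stands in for the value read with mfspr(SPRN_PURR).

```c
/* Simplified stand-in for the per-thread fields discussed in the patch. */
struct thread_acct {
	unsigned long start_cpu_util;	/* PURR value when the thread was switched in */
	unsigned long total_dp;		/* accumulated PURR delta for the thread */
};

/*
 * Hypothetical helper: purr_now replaces mfspr(SPRN_PURR) so that the
 * logic can be exercised outside the kernel.
 */
static void account_switch(struct thread_acct *old_t,
			   struct thread_acct *new_t,
			   unsigned long purr_now)
{
	/* A thread that never recorded a start value starts counting now. */
	if (old_t->start_cpu_util == 0)
		old_t->start_cpu_util = purr_now;

	/* Charge the outgoing thread for the time since it was switched in. */
	old_t->total_dp += purr_now - old_t->start_cpu_util;

	/* The incoming thread starts its interval at the same reading. */
	new_t->start_cpu_util = purr_now;
}
```

Reading the register once and reusing the value, as suggested, keeps the outgoing and incoming threads on the same timestamp, so no cycles fall between the two intervals.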


>>+
>>+		/* store delta_tb & mftb into cpu_util data array for    *
>>+		 * later easy access otherwise you have to do run_on_cpu *
>>+		 * which is expensive             			 */
>>    
>>
>
>Comment style should be:
>
>	/* store delta_tb & mftb into cpu_util data array for
>	 * later easy access otherwise you have to do run_on_cpu
>	 * which is expensive
>	 */
>
>  
>

Changed as requested.

>>+
>>+		lpaca = get_paca();
>>+		pd->collected_krntb = lpaca->delta_tb;
>>+		pd->collected_timebase = mftb();
>>+
>>+		start_cpu_util = old_thread->start_cpu_util;
>>+		old_thread->total_dp += (current_cpu_util - start_cpu_util);
>>+
>>+		/* collect time from entry into kernel to now and account it *
>>+		 * in process kernel time 				     */
>>    
>>
>
>Comment style again.
>
>  
>


Changed as requested.


>>+
>>+		old_thread->proc_stime += (current_cpu_util - lpaca->start_tb);
>>+		new_thread->start_cpu_util = current_cpu_util;
>> 	}
>>-#endif
>> 
>> 	local_irq_save(flags);
>> 	last = _switch(old_thread, new_thread);
>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/setup_64.c
>>===================================================================
>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/setup_64.c	2005-12-18 16:36:54.000000000 -0800
>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/setup_64.c	2006-02-10 11:51:28.197401840 -0800
>>@@ -851,3 +851,153 @@
>>    
>>
>
>  
>
>>+static void collect_cpu_deltas(int cpu)
>>    
>>
>
>  
>
>>+static void post_cpu_deltas(int cpu)
>>    
>>
>
>Should those two be #ifdef CONFIG_HOTPLUG_CPU ?
>
>  
>

Yeah, they should be and are now rightly so.

>>+		/* Initialize the global variables to zero */
>>+		offline_cpu_total_tb = 0;
>>+		offline_cpu_total_cpu_util = 0;
>>+		offline_cpu_total_krncycles = 0;
>>+		offline_cpu_total_idle = 0;
>>    
>>
>
>You don't need to set these to zero explicitly.
>
>  
>

Ok .. But since they are done.. No harm done..

>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/sysfs.c
>>===================================================================
>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/sysfs.c	2005-12-18 16:36:54.000000000 -0800
>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/sysfs.c	2006-02-10 12:36:02.375372096 -0800
>>@@ -232,8 +240,11 @@
>> 	if (cur_cpu_spec->num_pmcs >= 8)
>> 		sysdev_create_file(s, &attr_pmc8);
>> 
>>-	if (cpu_has_feature(CPU_FTR_SMT))
>>+	if (cpu_has_feature(CPU_FTR_PURR)) {
>> 		sysdev_create_file(s, &attr_purr);
>>    
>>
>
>This will mean that the "purr" file doesn't exist in some cases where it
>used to (even if it was useless).  Not sure if that is a problem for any
>user mode utilities.
>
>  
>
I truly doubt it. But if there is such a utility, then it shouldn't 
really see purr if it's not a POWER5 system.

>>Index: linux-2.6.15-rc6/include/asm-powerpc/processor.h
>>===================================================================
>>--- linux-2.6.15-rc6.orig/include/asm-powerpc/processor.h	2005-12-18 16:36:54.000000000 -0800
>>+++ linux-2.6.15-rc6/include/asm-powerpc/processor.h	2006-01-17 21:31:17.000000000 -0800
>>@@ -177,6 +177,9 @@
>> #ifdef CONFIG_PPC64
>> 	unsigned long	start_tb;	/* Start purr when proc switched in */
>> 	unsigned long	accum_tb;	/* Total accumilated purr for process */
>>+	unsigned long   start_cpu_util;	/* Start cpu_util when proc switch in */
>>+	unsigned long   total_dp ;	/* Total delta cpu_util accum for proc */
>>+	unsigned long   proc_stime;	/* Was pad,Now process cpu_util stime */
>>    
>>
>
>total_dp and proc_stime are not used anywhere and start_tb accum_tb are no longer used.
>  
>
total_dp & proc_stime are being used.

I think I made a mistake while porting from 2.6.11.8 to 2.6.15 and 
changed things. I have gone ahead and deleted these values.


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cpu_patch-git3-rc3
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060215/4944d874/attachment.txt 

From sharada at in.ibm.com  Fri Feb 17 01:15:06 2006
From: sharada at in.ibm.com (R Sharada)
Date: Thu, 16 Feb 2006 19:45:06 +0530
Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar
Message-ID: <20060216141506.GA5064@in.ibm.com>

Hello,
	The htab-size calculated for kexec/kdump on 2.6 kernels was broken,
leading to a wrong value being exported to /proc/device-tree.
Here is a patch fixing it up.

This has been tested on 2.6.16-rc2

Thanks and Regards,
Sharada

We export a value linux,htab-size to /proc/device-tree so that kexec-tools
can use that to exclude the htab region when trying to make space for the
kexec segments. htab-size was earlier calculated in export_htab_values using
ppc64_pft_size. ppc64_pft_size no longer holds a valid size for all machines.
So, define a new variable htab_size in hash_utils_64.c which is initialized
to the htab size value obtained in htab_initialize. Use this variable to set
the htab-size in export_htab_values().

Signed-off-by: R Sharada <sharada at in.ibm.com>
---


diff -puN arch/powerpc/mm/hash_utils_64.c~kdump-save-htab-size arch/powerpc/mm/hash_utils_64.c
--- linux-2.6.16-rc2-htab/arch/powerpc/mm/hash_utils_64.c~kdump-save-htab-size	2006-02-16 18:29:54.000000000 +0530
+++ linux-2.6.16-rc2-htab-sharada/arch/powerpc/mm/hash_utils_64.c	2006-02-16 19:21:57.000000000 +0530
@@ -95,6 +95,10 @@ int mmu_virtual_psize = MMU_PAGE_4K;
 int mmu_huge_psize = MMU_PAGE_16M;
 unsigned int HPAGE_SHIFT;
 #endif
+#ifdef CONFIG_KEXEC
+#define HASH_GROUP_SIZE 0x80    /* size of each hash group, asm/mmu.h */
+unsigned long htab_size;
+#endif
 
 /* There are definitions of page sizes arrays to be used when none
  * is provided by the firmware.
@@ -445,6 +449,9 @@ void __init htab_initialize(void)
 
 		/* Set SDR1 */
 		mtspr(SPRN_SDR1, _SDR1);
+#ifdef CONFIG_KEXEC
+		htab_size = (htab_hash_mask + 1) * HASH_GROUP_SIZE;
+#endif
 	}
 
 	mode_rw = _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX;
diff -puN arch/powerpc/kernel/machine_kexec_64.c~kdump-save-htab-size arch/powerpc/kernel/machine_kexec_64.c
--- linux-2.6.16-rc2-htab/arch/powerpc/kernel/machine_kexec_64.c~kdump-save-htab-size	2006-02-16 18:29:54.000000000 +0530
+++ linux-2.6.16-rc2-htab-sharada/arch/powerpc/kernel/machine_kexec_64.c	2006-02-16 18:53:06.000000000 +0530
@@ -26,8 +26,6 @@
 #include <asm/prom.h>
 #include <asm/smp.h>
 
-#define HASH_GROUP_SIZE 0x80	/* size of each hash group, asm/mmu.h */
-
 int default_machine_kexec_prepare(struct kimage *image)
 {
 	int i;
@@ -61,7 +59,7 @@ int default_machine_kexec_prepare(struct
 	 */
 	if (htab_address) {
 		low = __pa(htab_address);
-		high = low + (htab_hash_mask + 1) * HASH_GROUP_SIZE;
+		high = low + htab_size;
 
 		for (i = 0; i < image->nr_segments; i++) {
 			begin = image->segment[i].mem;
@@ -294,7 +292,7 @@ void default_machine_kexec(struct kimage
 }
 
 /* Values we need to export to the second kernel via the device tree. */
-static unsigned long htab_base, htab_size, kernel_end;
+static unsigned long htab_base, kernel_end;
 
 static struct property htab_base_prop = {
 	.name = "linux,htab-base",
@@ -332,7 +330,6 @@ static void __init export_htab_values(vo
 	htab_base = __pa(htab_address);
 	prom_add_property(node, &htab_base_prop);
 
-	htab_size = 1UL << ppc64_pft_size;
 	prom_add_property(node, &htab_size_prop);
 
  out:
--- linux-2.6.16-rc2-htab/include/asm-powerpc/mmu.h~kdump-save-htab-size	2006-02-16 18:57:43.000000000 +0530
+++ linux-2.6.16-rc2-htab-sharada/include/asm-powerpc/mmu.h	2006-02-16 19:21:29.000000000 +0530
@@ -113,6 +113,9 @@ typedef struct {
 
 extern hpte_t *htab_address;
 extern unsigned long htab_hash_mask;
+#ifdef CONFIG_KEXEC
+extern unsigned long htab_size;
+#endif
 
 /*
  * Page size definition
_


From olof at lixom.net  Fri Feb 17 01:26:54 2006
From: olof at lixom.net (Olof Johansson)
Date: Thu, 16 Feb 2006 08:26:54 -0600
Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar
In-Reply-To: <20060216141506.GA5064@in.ibm.com>
References: <20060216141506.GA5064@in.ibm.com>
Message-ID: <20060216142654.GL6291@pb15.lixom.net>

On Thu, Feb 16, 2006 at 07:45:06PM +0530, R Sharada wrote:
> Hello,
> 	The htab-size calculated for kexec/kdump on 2.6 kernels was broken,
> leading to wrong value being exported to /proc/device-tree
> Here is a patch fixing it up.

Why don't you use htab_hash_mask in kexec instead of introducing a
global variable?

I.e. do htab_size = (htab_hash_mask + 1) * HASH_GROUP_SIZE in
export_htab_values()?

Saves a new global and a bunch of #ifdefs.
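
As a standalone illustration of the arithmetic, here is a small sketch. HASH_GROUP_SIZE and the two formulas are taken from the patch above; the helper names and the sample values are made up for the example, and none of this is kernel code.

```c
#define HASH_GROUP_SIZE 0x80UL	/* size of each hash group, as in the patch */

/* Size derived from the hash mask: number of groups times bytes per group. */
static unsigned long htab_size_from_mask(unsigned long htab_hash_mask)
{
	return (htab_hash_mask + 1) * HASH_GROUP_SIZE;
}

/* The old calculation: only meaningful if pft_size is a valid log2 size. */
static unsigned long htab_size_from_pft(unsigned int pft_size)
{
	return 1UL << pft_size;
}
```

With a mask of 0x7fff (32768 groups) the first helper gives 0x400000 bytes, the same as the second helper with a pft_size of 22. If pft_size were left at 0 (an assumption about the failing case, not something stated in the report), the second helper would return 1, while the mask-based value stays correct.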


Thanks,

-Olof


From d.herrendoerfer at de.ibm.com  Fri Feb 17 03:45:10 2006
From: d.herrendoerfer at de.ibm.com (Dirk Herrendoerfer)
Date: Thu, 16 Feb 2006 17:45:10 +0100
Subject: [FYI/PATCH] Missing SPUFS context initializer
Message-ID: <22731be5999078b4021bf72c7909d9a4@de.ibm.com>

This patch adds a missing initialization in the spufs context.
Without this patch, unmapping of the mfc file will result in a kernel 
oops.
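
To show the general failure mode, here is a user-space sketch under stated assumptions. The struct, alloc_ctx_sketch() and release_mfc() are invented for illustration; the real spufs allocation and unmap paths are not shown in this mail, so this only mirrors the idea that a cleanup path which tests a pointer field needs that field set to NULL at allocation time.

```c
#include <stdlib.h>

/* Hypothetical cut-down context; only the fields needed for the example. */
struct ctx_sketch {
	void *mfc;	/* mapping handle, NULL when nothing is mapped */
	int tagwait;
};

static struct ctx_sketch *alloc_ctx_sketch(void)
{
	/* malloc() does not zero memory, so every field is set explicitly. */
	struct ctx_sketch *ctx = malloc(sizeof(*ctx));

	if (ctx == NULL)
		return NULL;
	ctx->mfc = NULL;	/* the kind of initialization the patch adds */
	ctx->tagwait = 0;
	return ctx;
}

/* Cleanup path that trusts mfc to be either NULL or a valid pointer. */
static int release_mfc(struct ctx_sketch *ctx)
{
	if (ctx->mfc == NULL)
		return 0;	/* nothing was ever mapped */
	ctx->mfc = NULL;	/* a real implementation would tear down the mapping here */
	return 1;
}
```

Without the explicit NULL assignment, release_mfc() would test whatever bytes happened to be in the allocation and could go on to use a garbage pointer.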

Index: linux/arch/powerpc/platforms/cell/spufs/context.c
===================================================================
--- linux.orig/arch/powerpc/platforms/cell/spufs/context.c
+++ linux/arch/powerpc/platforms/cell/spufs/context.c
@@ -51,6 +51,7 @@ struct spu_context *alloc_spu_context(vo
         ctx->ibox_fasync = NULL;
         ctx->wbox_fasync = NULL;
         ctx->mfc_fasync = NULL;
+       ctx->mfc = NULL;
         ctx->tagwait = 0;
         ctx->state = SPU_STATE_SAVED;
         ctx->local_store = NULL;


From michael at ellerman.id.au  Fri Feb 17 08:56:49 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 17 Feb 2006 08:56:49 +1100
Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar
In-Reply-To: <20060216142654.GL6291@pb15.lixom.net>
References: <20060216141506.GA5064@in.ibm.com>
	<20060216142654.GL6291@pb15.lixom.net>
Message-ID: <200602170856.52733.michael@ellerman.id.au>

On Fri, 17 Feb 2006 01:26, Olof Johansson wrote:
> On Thu, Feb 16, 2006 at 07:45:06PM +0530, R Sharada wrote:
> > Hello,
> > 	The htab-size calculated for kexec/kdump on 2.6 kernels was broken,
> > leading to wrong value being exported to /proc/device-tree
> > Here is a patch fixing it up.
>
> Why you don't use htab_hash_mask in kexec instead of introducing a
> global variable?

Separation of concerns? We currently do the calculation in the kexec code, but 
that's gotten out of sync with the htab code, so just do it in one place and 
export it.

> Saves a new global and a bunch of #ifdefs.

I agree the patch could be a bit simpler, eg (totally untested):

Index: to-merge/arch/powerpc/kernel/machine_kexec_64.c
===================================================================
--- to-merge.orig/arch/powerpc/kernel/machine_kexec_64.c
+++ to-merge/arch/powerpc/kernel/machine_kexec_64.c
@@ -26,8 +26,6 @@
 #include <asm/prom.h>
 #include <asm/smp.h>
 
-#define HASH_GROUP_SIZE 0x80	/* size of each hash group, asm/mmu.h */
-
 int default_machine_kexec_prepare(struct kimage *image)
 {
 	int i;
@@ -61,7 +59,7 @@ int default_machine_kexec_prepare(struct
 	 */
 	if (htab_address) {
 		low = __pa(htab_address);
-		high = low + (htab_hash_mask + 1) * HASH_GROUP_SIZE;
+		high = low + htab_size_bytes;
 
 		for (i = 0; i < image->nr_segments; i++) {
 			begin = image->segment[i].mem;
@@ -294,7 +292,7 @@ void default_machine_kexec(struct kimage
 }
 
 /* Values we need to export to the second kernel via the device tree. */
-static unsigned long htab_base, htab_size, kernel_end;
+static unsigned long htab_base, kernel_end;
 
 static struct property htab_base_prop = {
 	.name = "linux,htab-base",
@@ -305,7 +303,7 @@ static struct property htab_base_prop = 
 static struct property htab_size_prop = {
 	.name = "linux,htab-size",
 	.length = sizeof(unsigned long),
-	.value = (unsigned char *)&htab_size,
+	.value = (unsigned char *)&htab_size_bytes,
 };
 
 static struct property kernel_end_prop = {
@@ -331,8 +329,6 @@ static void __init export_htab_values(vo
 
 	htab_base = __pa(htab_address);
 	prom_add_property(node, &htab_base_prop);
-
-	htab_size = 1UL << ppc64_pft_size;
 	prom_add_property(node, &htab_size_prop);
 
  out:
Index: to-merge/arch/powerpc/mm/hash_utils_64.c
===================================================================
--- to-merge.orig/arch/powerpc/mm/hash_utils_64.c
+++ to-merge/arch/powerpc/mm/hash_utils_64.c
@@ -88,6 +88,7 @@ static unsigned long _SDR1;
 struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
 
 hpte_t *htab_address;
+unsigned long htab_size_bytes;
 unsigned long htab_hash_mask;
 int mmu_linear_psize = MMU_PAGE_4K;
 int mmu_virtual_psize = MMU_PAGE_4K;
@@ -399,7 +400,7 @@ void create_section_mapping(unsigned lon
 
 void __init htab_initialize(void)
 {
-	unsigned long table, htab_size_bytes;
+	unsigned long table;
 	unsigned long pteg_count;
 	unsigned long mode_rw;
 	unsigned long base = 0, size = 0;
Index: to-merge/include/asm-powerpc/mmu.h
===================================================================
--- to-merge.orig/include/asm-powerpc/mmu.h
+++ to-merge/include/asm-powerpc/mmu.h
@@ -112,6 +112,7 @@ typedef struct {
 } hpte_t;
 
 extern hpte_t *htab_address;
+extern unsigned long htab_size_bytes;
 extern unsigned long htab_hash_mask;
 
 /*

From olof at lixom.net  Fri Feb 17 09:11:42 2006
From: olof at lixom.net (Olof Johansson)
Date: Thu, 16 Feb 2006 16:11:42 -0600
Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar
In-Reply-To: <200602170856.52733.michael@ellerman.id.au>
References: <20060216141506.GA5064@in.ibm.com>
	<20060216142654.GL6291@pb15.lixom.net>
	<200602170856.52733.michael@ellerman.id.au>
Message-ID: <20060216221142.GA4772@pb15.lixom.net>

On Fri, Feb 17, 2006 at 08:56:49AM +1100, Michael Ellerman wrote:
> On Fri, 17 Feb 2006 01:26, Olof Johansson wrote:
> > On Thu, Feb 16, 2006 at 07:45:06PM +0530, R Sharada wrote:
> > > Hello,
> > > 	The htab-size calculated for kexec/kdump on 2.6 kernels was broken,
> > > leading to wrong value being exported to /proc/device-tree
> > > Here is a patch fixing it up.
> >
> > Why you don't use htab_hash_mask in kexec instead of introducing a
> > global variable?
> 
> Separation of concerns? We currently do the calculation in the kexec code, but 
> that's gotten out of sync with the htab code, so just do it in one place and 
> export it.

Eh, it's not like the hash group size is likely to change anytime soon,
given that it's fixed in the architecture. But sure, it avoids exposing
knowledge of it to kexec.

> > Saves a new global and a bunch of #ifdefs.
> 
> I agree the patch could be a bit simpler, eg (totally untested):

That looks considerably better. No more complaints from me. :)


-Olof


From michael at ellerman.id.au  Fri Feb 17 09:24:05 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Fri, 17 Feb 2006 09:24:05 +1100
Subject: [PATCH] kdump ppc64: fix htab-size for non-lpar
In-Reply-To: <20060216221142.GA4772@pb15.lixom.net>
References: <20060216141506.GA5064@in.ibm.com>
	<200602170856.52733.michael@ellerman.id.au>
	<20060216221142.GA4772@pb15.lixom.net>
Message-ID: <200602170924.09133.michael@ellerman.id.au>

On Fri, 17 Feb 2006 09:11, Olof Johansson wrote:
> On Fri, Feb 17, 2006 at 08:56:49AM +1100, Michael Ellerman wrote:
> > On Fri, 17 Feb 2006 01:26, Olof Johansson wrote:
> > > On Thu, Feb 16, 2006 at 07:45:06PM +0530, R Sharada wrote:
> > > > Hello,
> > > > 	The htab-size calculated for kexec/kdump on 2.6 kernels was broken,
> > > > leading to wrong value being exported to /proc/device-tree
> > > > Here is a patch fixing it up.
> > >
> > > Why you don't use htab_hash_mask in kexec instead of introducing a
> > > global variable?
> >
> > Separation of concerns? We currently do the calculation in the kexec
> > code, but that's gotten out of sync with the htab code, so just do it in
> > one place and export it.
>
> Eh, it's not like the hash group size is likely to change anytime soon,
> given that it's fixed in the architecture. But sure, it avoids exposing
> knowledge of it to kexec.

Yeah sure, it's possible we could change the meaning of htab_hash_mask in 
future, although I agree it's unlikely.

> That looks considerably better. No more complaints from me. :)

Cool, I'll get it tested and send it up.

cheers

-- 
Michael Ellerman
IBM OzLabs

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From dwg at au1.ibm.com  Thu Feb 16 20:10:27 2006
From: dwg at au1.ibm.com (David Gibson)
Date: Thu, 16 Feb 2006 20:10:27 +1100
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <43F40312.2020800@austin.ibm.com>
References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com>
	<20060214183259.28a6a501.sfr@canb.auug.org.au>
	<43F40312.2020800@austin.ibm.com>
Message-ID: <20060216091027.GA826@localhost.localdomain>

On Wed, Feb 15, 2006 at 10:44:02PM -0600, Manish Ahuja wrote:
[snip]
> >>Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
> >>===================================================================
> >>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c	2005-12-18 
> >>16:36:54.000000000 -0800
> >>+++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c	2006-01-17 
> >>21:20:25.000000000 -0800
> >>@@ -243,6 +243,7 @@
> >>	struct thread_struct *new_thread, *old_thread;
> >>	unsigned long flags;
> >>	struct task_struct *last;
> >>+	struct paca_struct *lpaca;
> >>   
> >>
> >
> >This could have been declared below (near pd)
> 
> Yes... But it seems fine there..

Actually, I've been trying to get rid of lpaca locals everywhere.
Using get_paca() directly is barely more verbose, and usually clearer.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


From david at gibson.dropbear.id.au  Fri Feb 17 14:54:11 2006
From: david at gibson.dropbear.id.au (David Gibson)
Date: Fri, 17 Feb 2006 14:54:11 +1100
Subject: powerpc: Fix accidentally-working typo in __pud_free_tlb
Message-ID: <20060217035411.GB21696@localhost.localdomain>

Andrew, Paulus, please apply.

One of the parameters to the __pud_free_tlb() macro for powerpc is
incorrect (see patch).  We get away with it by accident, because in the
one place the macro is called, the second parameter is a variable
named "pud".

Nonetheless, this should be fixed for 2.6.16.
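
For anyone wondering how this can work by accident, here is a tiny standalone sketch of the same mistake. FREE_BUGGY, FREE_FIXED and free_cache() are made-up stand-ins, not the kernel macros.

```c
/* Hypothetical stand-in for pgtable_free_cache(); just returns its argument. */
static int free_cache(int table)
{
	return table;
}

/*
 * Buggy: the parameter is named pmd but the body uses pud, so "pud" in
 * the expansion refers to whatever variable of that name is in scope at
 * the call site, not to the macro argument.
 */
#define FREE_BUGGY(tlb, pmd)	free_cache(pud)

/* Fixed: the body uses the macro's own parameter. */
#define FREE_FIXED(tlb, pud)	free_cache(pud)

static int call_with_pud(void)
{
	int pud = 7;

	return FREE_BUGGY(0, pud);	/* works, but only because of the name */
}

static int call_with_other_name(void)
{
	int pud = 7, other = 9;

	(void)other;			/* the buggy macro never looks at it */
	return FREE_BUGGY(0, other);	/* silently uses pud (7), not other (9) */
}

static int call_fixed(void)
{
	int pud = 7, other = 9;

	(void)pud;
	return FREE_FIXED(0, other);	/* uses the argument, as intended */
}
```

The buggy form compiles cleanly as long as the caller has a variable called pud in scope, which is why nothing flagged it.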

Signed-off-by: David Gibson <dwg at au1.ibm.com>

Index: working-2.6/include/asm-powerpc/pgalloc.h
===================================================================
--- working-2.6.orig/include/asm-powerpc/pgalloc.h	2006-01-16 13:02:29.000000000 +1100
+++ working-2.6/include/asm-powerpc/pgalloc.h	2006-02-17 14:48:13.000000000 +1100
@@ -146,7 +146,7 @@ extern void pgtable_free_tlb(struct mmu_
 	pgtable_free_tlb(tlb, pgtable_free_cache(pmd, \
 		PMD_CACHE_NUM, PMD_TABLE_SIZE-1))
 #ifndef CONFIG_PPC_64K_PAGES
-#define __pud_free_tlb(tlb, pmd)	\
+#define __pud_free_tlb(tlb, pud)	\
 	pgtable_free_tlb(tlb, pgtable_free_cache(pud, \
 		PUD_CACHE_NUM, PUD_TABLE_SIZE-1))
 #endif /* CONFIG_PPC_64K_PAGES */

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


From utz.bacher at de.ibm.com  Fri Feb 17 16:36:23 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Fri, 17 Feb 2006 06:36:23 +0100 (CET)
Subject: [FYI/PATCH 1/4] add syscall declarations used by spufs
Message-ID: <Pine.LNX.4.62.0602170634330.7683@tuxmkge1.boeblingen.de.ibm.com>

This adds some syscall declarations used for spufs that were missing. It
applies on 2.6.15 and 2.6.15.4. I didn't see that Arnd posted it already;
this is for folks who want to build something right now, while Arnd might
revisit this some time when he's back.

From: Arnd Bergmann <arndb at de.ibm.com>
Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Index: linux-2.6.16-rc/include/linux/syscalls.h
===================================================================
--- linux-2.6.16-rc.orig/include/linux/syscalls.h
+++ linux-2.6.16-rc/include/linux/syscalls.h
@@ -511,7 +511,21 @@ asmlinkage long sys_ioprio_set(int which
  asmlinkage long sys_ioprio_get(int which, int who);
  asmlinkage long sys_set_mempolicy(int mode, unsigned long __user *nmask,
  					unsigned long maxnode);
+asmlinkage long sys_mbind(unsigned long start, unsigned long len,
+				unsigned long mode,
+				unsigned long __user *nmask,
+				unsigned long maxnode,
+				unsigned flags);
+asmlinkage long sys_get_mempolicy(int __user *policy,
+				unsigned long __user *nmask,
+				unsigned long maxnode,
+				unsigned long addr, unsigned long flags);

+asmlinkage long sys_inotify_init(void);
+asmlinkage long sys_inotify_add_watch(int fd, const char __user *path,
+					u32 mask);
+asmlinkage long sys_inotify_rm_watch(int fd, u32 wd);
+
  asmlinkage long sys_spu_run(int fd, __u32 __user *unpc,
  				 __u32 __user *ustatus);
  asmlinkage long sys_spu_create(const char __user *name,
Index: linux-2.6.16-rc/fs/inotify.c
===================================================================
--- linux-2.6.16-rc.orig/fs/inotify.c
+++ linux-2.6.16-rc/fs/inotify.c
@@ -33,6 +33,7 @@
  #include <linux/list.h>
  #include <linux/writeback.h>
  #include <linux/inotify.h>
+#include <linux/syscalls.h>

  #include <asm/ioctls.h>


From utz.bacher at de.ibm.com  Fri Feb 17 16:39:09 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Fri, 17 Feb 2006 06:39:09 +0100 (CET)
Subject: [FYI/PATCH 4/4] Idle code for IBM Full System Simulator
Message-ID: <Pine.LNX.4.62.0602170638220.7683@tuxmkge1.boeblingen.de.ibm.com>

Improve system simulator idle loop. The patch applies on 2.6.15 and
2.6.15.4.

Cc: Arnd Bergmann <arndb at de.ibm.com>
From: Sidney Manning <sid at us.ibm.com>
Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Index: linux/arch/powerpc/kernel/setup_64.c
===================================================================
--- linux.orig/arch/powerpc/kernel/setup_64.c
+++ linux/arch/powerpc/kernel/setup_64.c
@@ -647,10 +647,10 @@ void __init setup_arch(char **cmdline_p)
  	conswitchp = &dummy_con;
  #endif

-	ppc_md.setup_arch();
-
  	setup_systemsim_idle();

+	ppc_md.setup_arch();
+
  	/* Use the default idle loop if the platform hasn't provided one. */
  	if (NULL == ppc_md.idle_loop) {
  		ppc_md.idle_loop = default_idle;


From utz.bacher at de.ibm.com  Fri Feb 17 16:37:23 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Fri, 17 Feb 2006 06:37:23 +0100 (CET)
Subject: [FYI/PATCH 2/4] enable control-c for IBM Full System Simulator
Message-ID: <Pine.LNX.4.62.0602170636270.7683@tuxmkge1.boeblingen.de.ibm.com>

Enable control-C for the system simulator console. This patch applies on
2.6.15 and 2.6.15.4.

Cc: Arnd Bergmann <arndb at de.ibm.com>
From: Sidney Manning <sid at us.ibm.com>
Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Index: linux/drivers/char/tty_io.c
===================================================================
--- linux.orig/drivers/char/tty_io.c
+++ linux/drivers/char/tty_io.c
@@ -1838,7 +1838,9 @@ retry_open:
  		if (driver) {
  			/* Don't let /dev/console block */
  			filp->f_flags |= O_NONBLOCK;
+#ifndef CONFIG_PPC_SYSTEMSIM
  			noctty = 1;
+#endif
  			goto got_driver;
  		}
  		up(&tty_sem);


From paulus at samba.org  Fri Feb 17 19:39:13 2006
From: paulus at samba.org (Paul Mackerras)
Date: Fri, 17 Feb 2006 19:39:13 +1100
Subject: [FYI/PATCH 2/4] enable control-c for IBM Full System Simulator
In-Reply-To: <Pine.LNX.4.62.0602170636270.7683@tuxmkge1.boeblingen.de.ibm.com>
References: <Pine.LNX.4.62.0602170636270.7683@tuxmkge1.boeblingen.de.ibm.com>
Message-ID: <17397.35761.56383.60273@cargo.ozlabs.ibm.com>

Utz Bacher writes:

> +#ifndef CONFIG_PPC_SYSTEMSIM
>   			noctty = 1;
> +#endif

Why is this awful hack necessary?

Paul.


From utz.bacher at de.ibm.com  Fri Feb 17 16:38:17 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Fri, 17 Feb 2006 06:38:17 +0100 (CET)
Subject: [FYI/PATCH 3/4] Build fixes for IBM Full System Simulator
Message-ID: <Pine.LNX.4.62.0602170637271.7683@tuxmkge1.boeblingen.de.ibm.com>

This patch applies on 2.6.15 and 2.6.15.4 and changes defconfig to more
reasonable values and sets flags for cross compiling. The patch is not
intended for inclusion, but useful for building stuff for the IBM Full
System Simulator.

Cc: Arnd Bergmann <arndb at de.ibm.com>
From: Sidney Manning <sid at us.ibm.com>
Signed-off-by: Utz Bacher <utz.bacher at de.ibm.com>

Index: linux/arch/powerpc/Makefile
===================================================================
--- linux.orig/arch/powerpc/Makefile
+++ linux/arch/powerpc/Makefile
@@ -105,6 +105,11 @@ ifndef CONFIG_FSL_BOOKE
  CFLAGS		+= -mstring
  endif

+
+ifneq ($(CROSS_COMPILE),)
+cpu-as-$(CONFIG_PPC_CELL)	+= -Wa,-mcellppu
+endif
+
  cpu-as-$(CONFIG_PPC64BRIDGE)	+= -Wa,-mppc64bridge
  cpu-as-$(CONFIG_4xx)		+= -Wa,-m405
  cpu-as-$(CONFIG_6xx)		+= -Wa,-maltivec
Index: linux/arch/powerpc/configs/cbesim_defconfig
===================================================================
--- linux.orig/arch/powerpc/configs/cbesim_defconfig
+++ linux/arch/powerpc/configs/cbesim_defconfig
@@ -1,7 +1,7 @@
  #
  # Automatically generated make config: don't edit
  # Linux kernel version: 2.6.15
-# Mon Jan  9 12:38:39 2006
+# Wed Jan 18 12:44:06 2006
  #
  CONFIG_PPC64=y
  CONFIG_64BIT=y
@@ -27,7 +27,6 @@ CONFIG_PPC_FPU=y
  CONFIG_ALTIVEC=y
  CONFIG_PPC_STD_MMU=y
  CONFIG_SMP=y
-# CONFIG_BE_DD1 is not set
  CONFIG_NR_CPUS=4

  #
@@ -45,8 +44,12 @@ CONFIG_LOCALVERSION=""
  CONFIG_LOCALVERSION_AUTO=y
  CONFIG_SWAP=y
  CONFIG_SYSVIPC=y
+# CONFIG_POSIX_MQUEUE is not set
  # CONFIG_BSD_PROCESS_ACCT is not set
  CONFIG_SYSCTL=y
+# CONFIG_AUDIT is not set
+CONFIG_HOTPLUG=y
+CONFIG_KOBJECT_UEVENT=y
  # CONFIG_IKCONFIG is not set
  # CONFIG_CPUSETS is not set
  CONFIG_INITRAMFS_SOURCE=""
@@ -55,7 +58,6 @@ CONFIG_CC_OPTIMIZE_FOR_SIZE=y
  CONFIG_KALLSYMS=y
  # CONFIG_KALLSYMS_ALL is not set
  # CONFIG_KALLSYMS_EXTRA_PASS is not set
-CONFIG_HOTPLUG=y
  CONFIG_PRINTK=y
  CONFIG_BUG=y
  CONFIG_BASE_FULL=y
@@ -92,11 +94,11 @@ CONFIG_IOSCHED_NOOP=y
  CONFIG_IOSCHED_AS=y
  CONFIG_IOSCHED_DEADLINE=y
  CONFIG_IOSCHED_CFQ=y
-# CONFIG_DEFAULT_AS is not set
+CONFIG_DEFAULT_AS=y
  # CONFIG_DEFAULT_DEADLINE is not set
  # CONFIG_DEFAULT_CFQ is not set
-CONFIG_DEFAULT_NOOP=y
-CONFIG_DEFAULT_IOSCHED="noop"
+# CONFIG_DEFAULT_NOOP is not set
+CONFIG_DEFAULT_IOSCHED="anticipatory"

  #
  # Platform support
@@ -110,6 +112,8 @@ CONFIG_PPC_MULTIPLATFORM=y
  # CONFIG_PPC_MAPLE is not set
  CONFIG_PPC_CELL=y
  CONFIG_PPC_OF=y
+CONFIG_PPC_SYSTEMSIM=y
+CONFIG_SYSTEMSIM_IDLE=y
  # CONFIG_U3_DART is not set
  CONFIG_MPIC=y
  CONFIG_PPC_RTAS=y
@@ -127,7 +131,9 @@ CONFIG_CELL_IIC=y
  #
  # Cell Broadband Engine options
  #
+# CONFIG_BE_DD2 is not set
  CONFIG_SPU_FS=m
+CONFIG_SPUFS_MMAP=y

  #
  # Kernel options
@@ -145,7 +151,7 @@ CONFIG_BINFMT_MISC=y
  CONFIG_FORCE_MAX_ZONEORDER=13
  # CONFIG_IOMMU_VMERGE is not set
  CONFIG_KEXEC=y
-# CONFIG_IRQ_ALL_CPUS is not set
+CONFIG_IRQ_ALL_CPUS=y
  # CONFIG_NUMA is not set
  CONFIG_ARCH_SELECT_MEMORY_MODEL=y
  CONFIG_ARCH_FLATMEM_ENABLE=y
@@ -193,7 +199,168 @@ CONFIG_KERNEL_START=0xc000000000000000
  #
  # Networking
  #
-# CONFIG_NET is not set
+CONFIG_NET=y
+
+#
+# Networking options
+#
+CONFIG_PACKET=y
+# CONFIG_PACKET_MMAP is not set
+CONFIG_UNIX=y
+CONFIG_XFRM=y
+# CONFIG_XFRM_USER is not set
+# CONFIG_NET_KEY is not set
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+# CONFIG_IP_ADVANCED_ROUTER is not set
+CONFIG_IP_FIB_HASH=y
+# CONFIG_IP_PNP is not set
+CONFIG_NET_IPIP=y
+# CONFIG_NET_IPGRE is not set
+# CONFIG_IP_MROUTE is not set
+# CONFIG_ARPD is not set
+CONFIG_SYN_COOKIES=y
+# CONFIG_INET_AH is not set
+# CONFIG_INET_ESP is not set
+# CONFIG_INET_IPCOMP is not set
+CONFIG_INET_TUNNEL=y
+CONFIG_INET_DIAG=y
+CONFIG_INET_TCP_DIAG=y
+# CONFIG_TCP_CONG_ADVANCED is not set
+CONFIG_TCP_CONG_BIC=y
+
+#
+# IP: Virtual Server Configuration
+#
+# CONFIG_IP_VS is not set
+CONFIG_IPV6=y
+# CONFIG_IPV6_PRIVACY is not set
+CONFIG_INET6_AH=m
+CONFIG_INET6_ESP=m
+CONFIG_INET6_IPCOMP=m
+CONFIG_INET6_TUNNEL=m
+CONFIG_IPV6_TUNNEL=m
+CONFIG_NETFILTER=y
+# CONFIG_NETFILTER_DEBUG is not set
+
+#
+# Core Netfilter Configuration
+#
+# CONFIG_NETFILTER_NETLINK is not set
+
+#
+# IP: Netfilter Configuration
+#
+CONFIG_IP_NF_CONNTRACK=y
+# CONFIG_IP_NF_CT_ACCT is not set
+# CONFIG_IP_NF_CONNTRACK_MARK is not set
+# CONFIG_IP_NF_CONNTRACK_EVENTS is not set
+CONFIG_IP_NF_CT_PROTO_SCTP=y
+CONFIG_IP_NF_FTP=m
+CONFIG_IP_NF_IRC=m
+# CONFIG_IP_NF_NETBIOS_NS is not set
+CONFIG_IP_NF_TFTP=m
+CONFIG_IP_NF_AMANDA=m
+# CONFIG_IP_NF_PPTP is not set
+CONFIG_IP_NF_QUEUE=m
+CONFIG_IP_NF_IPTABLES=m
+CONFIG_IP_NF_MATCH_LIMIT=m
+CONFIG_IP_NF_MATCH_IPRANGE=m
+CONFIG_IP_NF_MATCH_MAC=m
+CONFIG_IP_NF_MATCH_PKTTYPE=m
+CONFIG_IP_NF_MATCH_MARK=m
+CONFIG_IP_NF_MATCH_MULTIPORT=m
+CONFIG_IP_NF_MATCH_TOS=m
+CONFIG_IP_NF_MATCH_RECENT=m
+CONFIG_IP_NF_MATCH_ECN=m
+CONFIG_IP_NF_MATCH_DSCP=m
+CONFIG_IP_NF_MATCH_AH_ESP=m
+CONFIG_IP_NF_MATCH_LENGTH=m
+CONFIG_IP_NF_MATCH_TTL=m
+CONFIG_IP_NF_MATCH_TCPMSS=m
+CONFIG_IP_NF_MATCH_HELPER=m
+CONFIG_IP_NF_MATCH_STATE=m
+CONFIG_IP_NF_MATCH_CONNTRACK=m
+CONFIG_IP_NF_MATCH_OWNER=m
+CONFIG_IP_NF_MATCH_ADDRTYPE=m
+CONFIG_IP_NF_MATCH_REALM=m
+CONFIG_IP_NF_MATCH_SCTP=m
+# CONFIG_IP_NF_MATCH_DCCP is not set
+CONFIG_IP_NF_MATCH_COMMENT=m
+CONFIG_IP_NF_MATCH_HASHLIMIT=m
+# CONFIG_IP_NF_MATCH_STRING is not set
+CONFIG_IP_NF_FILTER=m
+CONFIG_IP_NF_TARGET_REJECT=m
+CONFIG_IP_NF_TARGET_LOG=m
+CONFIG_IP_NF_TARGET_ULOG=m
+CONFIG_IP_NF_TARGET_TCPMSS=m
+# CONFIG_IP_NF_TARGET_NFQUEUE is not set
+CONFIG_IP_NF_NAT=m
+CONFIG_IP_NF_NAT_NEEDED=y
+CONFIG_IP_NF_TARGET_MASQUERADE=m
+CONFIG_IP_NF_TARGET_REDIRECT=m
+CONFIG_IP_NF_TARGET_NETMAP=m
+CONFIG_IP_NF_TARGET_SAME=m
+CONFIG_IP_NF_NAT_SNMP_BASIC=m
+CONFIG_IP_NF_NAT_IRC=m
+CONFIG_IP_NF_NAT_FTP=m
+CONFIG_IP_NF_NAT_TFTP=m
+CONFIG_IP_NF_NAT_AMANDA=m
+CONFIG_IP_NF_MANGLE=m
+CONFIG_IP_NF_TARGET_TOS=m
+CONFIG_IP_NF_TARGET_ECN=m
+CONFIG_IP_NF_TARGET_DSCP=m
+CONFIG_IP_NF_TARGET_MARK=m
+CONFIG_IP_NF_TARGET_CLASSIFY=m
+# CONFIG_IP_NF_TARGET_TTL is not set
+CONFIG_IP_NF_RAW=m
+CONFIG_IP_NF_TARGET_NOTRACK=m
+CONFIG_IP_NF_ARPTABLES=m
+CONFIG_IP_NF_ARPFILTER=m
+CONFIG_IP_NF_ARP_MANGLE=m
+
+#
+# IPv6: Netfilter Configuration (EXPERIMENTAL)
+#
+# CONFIG_IP6_NF_QUEUE is not set
+# CONFIG_IP6_NF_IPTABLES is not set
+
+#
+# DCCP Configuration (EXPERIMENTAL)
+#
+# CONFIG_IP_DCCP is not set
+
+#
+# SCTP Configuration (EXPERIMENTAL)
+#
+# CONFIG_IP_SCTP is not set
+# CONFIG_ATM is not set
+# CONFIG_BRIDGE is not set
+# CONFIG_VLAN_8021Q is not set
+# CONFIG_DECNET is not set
+# CONFIG_LLC2 is not set
+# CONFIG_IPX is not set
+# CONFIG_ATALK is not set
+# CONFIG_X25 is not set
+# CONFIG_LAPB is not set
+# CONFIG_NET_DIVERT is not set
+# CONFIG_ECONET is not set
+# CONFIG_WAN_ROUTER is not set
+
+#
+# QoS and/or fair queueing
+#
+# CONFIG_NET_SCHED is not set
+CONFIG_NET_CLS_ROUTE=y
+
+#
+# Network testing
+#
+# CONFIG_NET_PKTGEN is not set
+# CONFIG_HAMRADIO is not set
+# CONFIG_IRDA is not set
+# CONFIG_BT is not set
+# CONFIG_IEEE80211 is not set

  #
  # Device Drivers
@@ -210,6 +377,7 @@ CONFIG_FW_LOADER=y
  #
  # Connector - unified userspace <-> kernelspace linker
  #
+# CONFIG_CONNECTOR is not set

  #
  # Memory Technology Devices (MTD)
@@ -236,17 +404,73 @@ CONFIG_FW_LOADER=y
  # CONFIG_BLK_DEV_COW_COMMON is not set
  CONFIG_BLK_DEV_LOOP=y
  # CONFIG_BLK_DEV_CRYPTOLOOP is not set
+# CONFIG_BLK_DEV_NBD is not set
  # CONFIG_BLK_DEV_SX8 is not set
  CONFIG_BLK_DEV_RAM=y
  CONFIG_BLK_DEV_RAM_COUNT=16
  CONFIG_BLK_DEV_RAM_SIZE=131072
  CONFIG_BLK_DEV_INITRD=y
+CONFIG_BLK_DEV_SYSTEMSIM=y
  # CONFIG_CDROM_PKTCDVD is not set
+# CONFIG_ATA_OVER_ETH is not set

  #
  # ATA/ATAPI/MFM/RLL support
  #
-# CONFIG_IDE is not set
+CONFIG_IDE=y
+CONFIG_BLK_DEV_IDE=y
+
+#
+# Please see Documentation/ide.txt for help/info on IDE drives
+#
+# CONFIG_BLK_DEV_IDE_SATA is not set
+CONFIG_BLK_DEV_IDEDISK=y
+CONFIG_IDEDISK_MULTI_MODE=y
+# CONFIG_BLK_DEV_IDECD is not set
+# CONFIG_BLK_DEV_IDETAPE is not set
+# CONFIG_BLK_DEV_IDEFLOPPY is not set
+# CONFIG_IDE_TASK_IOCTL is not set
+
+#
+# IDE chipset support/bugfixes
+#
+CONFIG_IDE_GENERIC=y
+CONFIG_BLK_DEV_IDEPCI=y
+CONFIG_IDEPCI_SHARE_IRQ=y
+# CONFIG_BLK_DEV_OFFBOARD is not set
+CONFIG_BLK_DEV_GENERIC=y
+# CONFIG_BLK_DEV_OPTI621 is not set
+# CONFIG_BLK_DEV_SL82C105 is not set
+CONFIG_BLK_DEV_IDEDMA_PCI=y
+# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
+CONFIG_IDEDMA_PCI_AUTO=y
+# CONFIG_IDEDMA_ONLYDISK is not set
+CONFIG_BLK_DEV_AEC62XX=y
+# CONFIG_BLK_DEV_ALI15X3 is not set
+# CONFIG_BLK_DEV_AMD74XX is not set
+# CONFIG_BLK_DEV_CMD64X is not set
+# CONFIG_BLK_DEV_TRIFLEX is not set
+# CONFIG_BLK_DEV_CY82C693 is not set
+# CONFIG_BLK_DEV_CS5520 is not set
+# CONFIG_BLK_DEV_CS5530 is not set
+# CONFIG_BLK_DEV_HPT34X is not set
+# CONFIG_BLK_DEV_HPT366 is not set
+# CONFIG_BLK_DEV_SC1200 is not set
+# CONFIG_BLK_DEV_PIIX is not set
+# CONFIG_BLK_DEV_IT821X is not set
+# CONFIG_BLK_DEV_NS87415 is not set
+# CONFIG_BLK_DEV_PDC202XX_OLD is not set
+# CONFIG_BLK_DEV_PDC202XX_NEW is not set
+# CONFIG_BLK_DEV_SVWKS is not set
+CONFIG_BLK_DEV_SIIMAGE=y
+# CONFIG_BLK_DEV_SLC90E66 is not set
+# CONFIG_BLK_DEV_TRM290 is not set
+# CONFIG_BLK_DEV_VIA82CXXX is not set
+# CONFIG_IDE_ARM is not set
+CONFIG_BLK_DEV_IDEDMA=y
+# CONFIG_IDEDMA_IVB is not set
+CONFIG_IDEDMA_AUTO=y
+# CONFIG_BLK_DEV_HD is not set

  #
  # SCSI device support
@@ -282,12 +506,92 @@ CONFIG_BLK_DEV_INITRD=y
  #
  # Network device support
  #
+CONFIG_NETDEVICES=y
+# CONFIG_DUMMY is not set
+# CONFIG_BONDING is not set
+# CONFIG_EQUALIZER is not set
+# CONFIG_TUN is not set
+
+#
+# ARCnet devices
+#
+# CONFIG_ARCNET is not set
+
+#
+# PHY device support
+#
+# CONFIG_PHYLIB is not set
+
+#
+# Ethernet (10 or 100Mbit)
+#
+CONFIG_NET_ETHERNET=y
+CONFIG_MII=y
+# CONFIG_HAPPYMEAL is not set
+# CONFIG_SUNGEM is not set
+# CONFIG_CASSINI is not set
+# CONFIG_NET_VENDOR_3COM is not set
+
+#
+# Tulip family network device support
+#
+# CONFIG_NET_TULIP is not set
+# CONFIG_HP100 is not set
+CONFIG_SYSTEMSIM_NET=y
+# CONFIG_NET_PCI is not set
+
+#
+# Ethernet (1000 Mbit)
+#
+# CONFIG_ACENIC is not set
+# CONFIG_DL2K is not set
+CONFIG_E1000=y
+# CONFIG_E1000_NAPI is not set
+# CONFIG_NS83820 is not set
+# CONFIG_HAMACHI is not set
+# CONFIG_YELLOWFIN is not set
+# CONFIG_R8169 is not set
+# CONFIG_SIS190 is not set
+# CONFIG_SKGE is not set
+# CONFIG_SK98LIN is not set
+# CONFIG_TIGON3 is not set
+# CONFIG_BNX2 is not set
+# CONFIG_MV643XX_ETH is not set
+
+#
+# Ethernet (10000 Mbit)
+#
+# CONFIG_CHELSIO_T1 is not set
+# CONFIG_IXGB is not set
+# CONFIG_S2IO is not set
+
+#
+# Token Ring devices
+#
+# CONFIG_TR is not set
+
+#
+# Wireless LAN (non-hamradio)
+#
+# CONFIG_NET_RADIO is not set
+
+#
+# Wan interfaces
+#
+# CONFIG_WAN is not set
+# CONFIG_FDDI is not set
+# CONFIG_HIPPI is not set
+# CONFIG_PPP is not set
+# CONFIG_SLIP is not set
+# CONFIG_SHAPER is not set
+# CONFIG_NETCONSOLE is not set
  # CONFIG_NETPOLL is not set
  # CONFIG_NET_POLL_CONTROLLER is not set

  #
  # ISDN subsystem
  #
+# CONFIG_ISDN is not set

  #
  # Telephony Support
@@ -367,7 +671,7 @@ CONFIG_UNIX98_PTYS=y
  # CONFIG_LEGACY_PTYS is not set
  CONFIG_HVC_DRIVER=y
  CONFIG_HVC_FSS=y
-# CONFIG_HVC_RTAS is not set
+CONFIG_HVC_RTAS=y

  #
  # IPMI
@@ -414,7 +718,57 @@ CONFIG_WATCHDOG_RTAS=y
  #
  # I2C support
  #
-# CONFIG_I2C is not set
+CONFIG_I2C=y
+# CONFIG_I2C_CHARDEV is not set
+
+#
+# I2C Algorithms
+#
+CONFIG_I2C_ALGOBIT=y
+# CONFIG_I2C_ALGOPCF is not set
+# CONFIG_I2C_ALGOPCA is not set
+
+#
+# I2C Hardware Bus support
+#
+# CONFIG_I2C_ALI1535 is not set
+# CONFIG_I2C_ALI1563 is not set
+# CONFIG_I2C_ALI15X3 is not set
+# CONFIG_I2C_AMD756 is not set
+# CONFIG_I2C_AMD8111 is not set
+# CONFIG_I2C_I801 is not set
+# CONFIG_I2C_I810 is not set
+# CONFIG_I2C_PIIX4 is not set
+# CONFIG_I2C_NFORCE2 is not set
+# CONFIG_I2C_PARPORT_LIGHT is not set
+# CONFIG_I2C_PROSAVAGE is not set
+# CONFIG_I2C_SAVAGE4 is not set
+# CONFIG_SCx200_ACB is not set
+# CONFIG_I2C_SIS5595 is not set
+# CONFIG_I2C_SIS630 is not set
+# CONFIG_I2C_SIS96X is not set
+# CONFIG_I2C_STUB is not set
+# CONFIG_I2C_VIA is not set
+# CONFIG_I2C_VIAPRO is not set
+# CONFIG_I2C_VOODOO3 is not set
+# CONFIG_I2C_PCA_ISA is not set
+
+#
+# Miscellaneous I2C Chip support
+#
+# CONFIG_SENSORS_DS1337 is not set
+# CONFIG_SENSORS_DS1374 is not set
+# CONFIG_SENSORS_EEPROM is not set
+# CONFIG_SENSORS_PCF8574 is not set
+# CONFIG_SENSORS_PCA9539 is not set
+# CONFIG_SENSORS_PCF8591 is not set
+# CONFIG_SENSORS_RTC8564 is not set
+# CONFIG_SENSORS_MAX6875 is not set
+# CONFIG_RTC_X1205_I2C is not set
+# CONFIG_I2C_DEBUG_CORE is not set
+# CONFIG_I2C_DEBUG_ALGO is not set
+# CONFIG_I2C_DEBUG_BUS is not set
+# CONFIG_I2C_DEBUG_CHIP is not set

  #
  # Dallas's 1-wire bus
@@ -443,6 +797,7 @@ CONFIG_WATCHDOG_RTAS=y
  #
  # Digital Video Broadcasting Devices
  #
+# CONFIG_DVB is not set

  #
  # Graphics support
@@ -484,7 +839,14 @@ CONFIG_USB_ARCH_HAS_OHCI=y
  #
  # InfiniBand support
  #
-# CONFIG_INFINIBAND is not set
+CONFIG_INFINIBAND=y
+CONFIG_INFINIBAND_USER_MAD=m
+CONFIG_INFINIBAND_USER_ACCESS=m
+CONFIG_INFINIBAND_MTHCA=m
+CONFIG_INFINIBAND_MTHCA_DEBUG=y
+CONFIG_INFINIBAND_IPOIB=m
+CONFIG_INFINIBAND_IPOIB_DEBUG=y
+CONFIG_INFINIBAND_IPOIB_DEBUG_DATA=y

  #
  # SN Devices
@@ -496,10 +858,16 @@ CONFIG_USB_ARCH_HAS_OHCI=y
  CONFIG_EXT2_FS=y
  # CONFIG_EXT2_FS_XATTR is not set
  # CONFIG_EXT2_FS_XIP is not set
-# CONFIG_EXT3_FS is not set
+CONFIG_EXT3_FS=y
+CONFIG_EXT3_FS_XATTR=y
+# CONFIG_EXT3_FS_POSIX_ACL is not set
+# CONFIG_EXT3_FS_SECURITY is not set
+CONFIG_JBD=y
+# CONFIG_JBD_DEBUG is not set
+CONFIG_FS_MBCACHE=y
  # CONFIG_REISERFS_FS is not set
  # CONFIG_JFS_FS is not set
-# CONFIG_FS_POSIX_ACL is not set
+CONFIG_FS_POSIX_ACL=y
  # CONFIG_XFS_FS is not set
  # CONFIG_MINIX_FS is not set
  # CONFIG_ROMFS_FS is not set
@@ -513,14 +881,20 @@ CONFIG_DNOTIFY=y
  #
  # CD-ROM/DVD Filesystems
  #
-# CONFIG_ISO9660_FS is not set
-# CONFIG_UDF_FS is not set
+CONFIG_ISO9660_FS=m
+CONFIG_JOLIET=y
+# CONFIG_ZISOFS is not set
+CONFIG_UDF_FS=m
+CONFIG_UDF_NLS=y

  #
  # DOS/FAT/NT Filesystems
  #
-# CONFIG_MSDOS_FS is not set
-# CONFIG_VFAT_FS is not set
+CONFIG_FAT_FS=m
+CONFIG_MSDOS_FS=m
+CONFIG_VFAT_FS=m
+CONFIG_FAT_DEFAULT_CODEPAGE=437
+CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
  # CONFIG_NTFS_FS is not set

  #
@@ -534,7 +908,6 @@ CONFIG_HUGETLBFS=y
  CONFIG_HUGETLB_PAGE=y
  CONFIG_RAMFS=y
  # CONFIG_RELAYFS_FS is not set
-# CONFIG_CONFIGFS_FS is not set

  #
  # Miscellaneous filesystems
@@ -554,6 +927,35 @@ CONFIG_RAMFS=y
  # CONFIG_UFS_FS is not set

  #
+# Network File Systems
+#
+CONFIG_NFS_FS=m
+CONFIG_NFS_V3=y
+CONFIG_NFS_V3_ACL=y
+# CONFIG_NFS_V4 is not set
+# CONFIG_NFS_DIRECTIO is not set
+CONFIG_NFSD=m
+CONFIG_NFSD_V2_ACL=y
+CONFIG_NFSD_V3=y
+CONFIG_NFSD_V3_ACL=y
+# CONFIG_NFSD_V4 is not set
+CONFIG_NFSD_TCP=y
+CONFIG_LOCKD=m
+CONFIG_LOCKD_V4=y
+CONFIG_EXPORTFS=m
+CONFIG_NFS_ACL_SUPPORT=m
+CONFIG_NFS_COMMON=y
+CONFIG_SUNRPC=m
+# CONFIG_RPCSEC_GSS_KRB5 is not set
+# CONFIG_RPCSEC_GSS_SPKM3 is not set
+# CONFIG_SMB_FS is not set
+# CONFIG_CIFS is not set
+# CONFIG_NCP_FS is not set
+# CONFIG_CODA_FS is not set
+# CONFIG_AFS_FS is not set
+# CONFIG_9P_FS is not set
+
+#
  # Partition Types
  #
  CONFIG_PARTITION_ADVANCED=y
@@ -576,7 +978,46 @@ CONFIG_EFI_PARTITION=y
  #
  # Native Language Support
  #
-# CONFIG_NLS is not set
+CONFIG_NLS=m
+CONFIG_NLS_DEFAULT="iso8859-1"
+# CONFIG_NLS_CODEPAGE_437 is not set
+# CONFIG_NLS_CODEPAGE_737 is not set
+# CONFIG_NLS_CODEPAGE_775 is not set
+# CONFIG_NLS_CODEPAGE_850 is not set
+# CONFIG_NLS_CODEPAGE_852 is not set
+# CONFIG_NLS_CODEPAGE_855 is not set
+# CONFIG_NLS_CODEPAGE_857 is not set
+# CONFIG_NLS_CODEPAGE_860 is not set
+# CONFIG_NLS_CODEPAGE_861 is not set
+# CONFIG_NLS_CODEPAGE_862 is not set
+# CONFIG_NLS_CODEPAGE_863 is not set
+# CONFIG_NLS_CODEPAGE_864 is not set
+# CONFIG_NLS_CODEPAGE_865 is not set
+# CONFIG_NLS_CODEPAGE_866 is not set
+# CONFIG_NLS_CODEPAGE_869 is not set
+# CONFIG_NLS_CODEPAGE_936 is not set
+# CONFIG_NLS_CODEPAGE_950 is not set
+# CONFIG_NLS_CODEPAGE_932 is not set
+# CONFIG_NLS_CODEPAGE_949 is not set
+# CONFIG_NLS_CODEPAGE_874 is not set
+# CONFIG_NLS_ISO8859_8 is not set
+# CONFIG_NLS_CODEPAGE_1250 is not set
+# CONFIG_NLS_CODEPAGE_1251 is not set
+# CONFIG_NLS_ASCII is not set
+CONFIG_NLS_ISO8859_1=m
+CONFIG_NLS_ISO8859_2=m
+CONFIG_NLS_ISO8859_3=m
+CONFIG_NLS_ISO8859_4=m
+CONFIG_NLS_ISO8859_5=m
+CONFIG_NLS_ISO8859_6=m
+CONFIG_NLS_ISO8859_7=m
+CONFIG_NLS_ISO8859_9=m
+CONFIG_NLS_ISO8859_13=m
+CONFIG_NLS_ISO8859_14=m
+CONFIG_NLS_ISO8859_15=m
+# CONFIG_NLS_KOI8_R is not set
+# CONFIG_NLS_KOI8_U is not set
+# CONFIG_NLS_UTF8 is not set

  #
  # Library routines
@@ -614,6 +1055,7 @@ CONFIG_DEBUG_FS=y
  # CONFIG_DEBUG_STACKOVERFLOW is not set
  # CONFIG_DEBUG_STACK_USAGE is not set
  # CONFIG_DEBUGGER is not set
+# CONFIG_XMON is not set
  CONFIG_IRQSTACKS=y
  # CONFIG_BOOTX_TEXT is not set


From segher at kernel.crashing.org  Fri Feb 17 21:25:42 2006
From: segher at kernel.crashing.org (Segher Boessenkool)
Date: Fri, 17 Feb 2006 11:25:42 +0100
Subject: [PATCH] Fix some MPIC + HT APIC buglets
Message-ID: <4df057680bcb35ef633804504735670c@kernel.crashing.org>

Do disable, not enable, the HT APIC IRQ in the function that is
supposed to.
Enable the MPIC IRQ before enabling the downstream APIC IRQ; this avoids
potentially losing an interrupt.
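
As a rough sketch of the first fix: judging from the patch, bit 0 of the
word read from fixup->base + 4 acts as a mask bit, so setting it
disables the IRQ and clearing it enables it.  The helper names and the
HT_IRQ_MASK_BIT constant below are invented for illustration and operate
on a plain value rather than on real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption inferred from the patch: bit 0 set means the IRQ is masked. */
#define HT_IRQ_MASK_BIT	1U

/* Disable: set the mask bit (the "tmp |= 1" side of the patch). */
uint32_t ht_irq_disable(uint32_t reg)
{
	return reg | HT_IRQ_MASK_BIT;
}

/* Enable: clear the mask bit (the "tmp &= ~1U" form). */
uint32_t ht_irq_enable(uint32_t reg)
{
	return reg & ~HT_IRQ_MASK_BIT;
}

/* Report whether the mask bit is currently set. */
int ht_irq_is_masked(uint32_t reg)
{
	return (reg & HT_IRQ_MASK_BIT) != 0;
}
```

Under that assumption, a disable path that used the clearing form would
leave the IRQ enabled, which is the buglet described.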

Signed-off-by: Segher Boessenkool <segher at kernel.crashing.org>
---

Index: linux/arch/powerpc/sysdev/mpic.c
===================================================================
--- linux.orig/arch/powerpc/sysdev/mpic.c
+++ linux/arch/powerpc/sysdev/mpic.c
@@ -234,7 +234,7 @@
         spin_lock_irqsave(&mpic->fixup_lock, flags);
         writeb(0x10 + 2 * fixup->index, fixup->base + 2);
         tmp = readl(fixup->base + 4);
-       tmp &= ~1U;
+       tmp |= 1;
         writel(tmp, fixup->base + 4);
         spin_unlock_irqrestore(&mpic->fixup_lock, flags);
  }
@@ -446,14 +446,15 @@
  #ifdef CONFIG_MPIC_BROKEN_U3
         struct mpic *mpic = mpic_from_irq(irq);
         unsigned int src = irq - mpic->irq_offset;
+#endif /* CONFIG_MPIC_BROKEN_U3 */
+
+       mpic_enable_irq(irq);

+#ifdef CONFIG_MPIC_BROKEN_U3
         if (mpic_is_ht_interrupt(mpic, src))
                 mpic_startup_ht_interrupt(mpic, src, irq_desc[irq].status);
-
  #endif /* CONFIG_MPIC_BROKEN_U3 */

-       mpic_enable_irq(irq);
-
         return 0;
  }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch-mpic-fixes
Type: application/octet-stream
Size: 1116 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060217/33614aca/attachment.obj 

From segher at kernel.crashing.org  Fri Feb 17 21:30:30 2006
From: segher at kernel.crashing.org (Segher Boessenkool)
Date: Fri, 17 Feb 2006 11:30:30 +0100
Subject: [PATCH] Don't re-assign PCI resources on Maple
Message-ID: <7b18da50a94a66a434941526bfc5fbfd@kernel.crashing.org>

Maple firmware does not need PCI resource allocation, and in fact, it can
cause problems in some strange cases.

Signed-off-by: Segher Boessenkool <segher at kernel.crashing.org>
---

Index: linux/arch/powerpc/platforms/maple/pci.c
===================================================================
--- linux.orig/arch/powerpc/platforms/maple/pci.c
+++ linux/arch/powerpc/platforms/maple/pci.c
@@ -435,8 +435,8 @@
                         PCI_DN(np)->busno = 0xf0;
         }

-       /* Tell pci.c to use the common resource allocation mecanism */
-       pci_probe_only = 0;
+       /* Tell pci.c to not change any resource allocations.  */
+       pci_probe_only = 1;

         /* Allow all IO */
         io_page_mask = -1;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch-maple-pci-setup
Type: application/octet-stream
Size: 663 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060217/1acd9b58/attachment.obj 

From hch at lst.de  Sat Feb 18 05:32:54 2006
From: hch at lst.de (Christoph Hellwig)
Date: Fri, 17 Feb 2006 19:32:54 +0100
Subject: [FYI/PATCH 3/4] Build fixes for IBM Full System Simulator
In-Reply-To: <Pine.LNX.4.62.0602170637271.7683@tuxmkge1.boeblingen.de.ibm.com>
References: <Pine.LNX.4.62.0602170637271.7683@tuxmkge1.boeblingen.de.ibm.com>
Message-ID: <20060217183254.GA3951@lst.de>

> +
> +ifneq ($(CROSS_COMPILE),)
> +cpu-as-$(CONFIG_PPC_CELL)	+= -Wa,-mcellppu
> +endif

the CROSS_COMPILE setting is wrong.  cross-compilation should not
affect selection of assembler flags.

> +
>   cpu-as-$(CONFIG_PPC64BRIDGE)	+= -Wa,-mppc64bridge
>   cpu-as-$(CONFIG_4xx)		+= -Wa,-m405
>   cpu-as-$(CONFIG_6xx)		+= -Wa,-maltivec


From rolandd at cisco.com  Sat Feb 18 11:55:32 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:55:32 -0800
Subject: [PATCH 00/22] [RFC] IBM eHCA InfiniBand adapter driver
Message-ID: <20060218005532.13620.79663.stgit@localhost.localdomain>

Here's a series of patches that add an InfiniBand adapter driver
for IBM eHCA hardware.  Please look it over with an eye towards issues
that need to be addressed before merging this upstream.

This patch series is somewhat unusual in that I am not the original
author of this driver -- I am just sending it for review for the
authors, who are apparently not able to post patches themselves due to
internal issues at IBM.  However they are cc'ed and will respond to
comments in this thread.

In fact I have some issues with the code myself that need to be
addressed before this driver is mergeable.  I've included most of them
in the individual patches, although I have some general comments too.
However I would like to get some early feedback for the ehca authors
from the wider community.  In particular I think it's important to run
this past the ppc64 experts, since I'm not sure what the standards for
this sort of pSeries driver are.

Anyway, my general comments:

 - The #ifs that test EHCA_USERDRIVER and __KERNEL__ should be killed.
   We know that this is kernel code, so there's no reason to include
   userspace compatibility junk.

 - Many of the comments look like they are for some automatic
   documentation system that is not quite kerneldoc.  They should be
   fixed to be real kerneldoc comments.

 - In general there is a huge amount of code in large inline functions
   in .h files.  Things should be reorganized to cut this down to a
   sane amount.

Thanks,
  Roland


From rolandd at cisco.com  Sat Feb 18 11:57:10 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:10 -0800
Subject: [PATCH 03/22] pHype specific stuff
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005709.13620.77409.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

It's not clear what the connection between hcp_phyp.c and hcp_phyp.h
really is -- they don't seem to be very closely related.

Again, hcp_phyp.h has some rather large functions that belong in
a .c file and maybe shouldn't be inlined (although maybe the
generated assembly ends up being small because it's just
fiddling registers around).

For a change, hipz_galpa_load() and hipz_galpa_store() actually
look simple enough that they could probably become inline functions
in a header (and just kill hcp_phyp.c).  This would also make the
comments about them being inline in ehca_galpa.h true.
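
Roughly what I have in mind, as a userspace sketch only: the struct
mirrors the single fw_handle field of struct h_galpa, but the _sketch
names are made up, and the plain volatile accesses stand in for whatever
MMIO accessors the driver really uses (which I am not reproducing here).

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors struct h_galpa from the patch (u64 written as uint64_t). */
struct h_galpa_sketch {
	uint64_t fw_handle;	/* address where the register page is mapped */
};

/* Store value at a byte offset into the galpa page. */
static inline void galpa_store_sketch(struct h_galpa_sketch galpa,
				      uint32_t offset, uint64_t value)
{
	volatile uint64_t *reg =
		(volatile uint64_t *)(uintptr_t)(galpa.fw_handle + offset);

	*reg = value;
}

/* Load the value at a byte offset in the galpa page. */
static inline uint64_t galpa_load_sketch(struct h_galpa_sketch galpa,
					 uint32_t offset)
{
	volatile uint64_t *reg =
		(volatile uint64_t *)(uintptr_t)(galpa.fw_handle + offset);

	return *reg;
}

/* Demo: back the "register page" with ordinary memory and round-trip. */
uint64_t galpa_roundtrip_demo(void)
{
	static uint64_t fake_page[4];
	struct h_galpa_sketch g = { (uint64_t)(uintptr_t)fake_page };

	galpa_store_sketch(g, 8, 0x1234);
	return galpa_load_sketch(g, 8);
}
```

Each helper is two statements, so keeping them in a header costs little
and would let the separate .c file go away.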

Is ehca_galpa.h needed at all, or can it be folded into another
file?  Why is its abstraction needed?
---

 drivers/infiniband/hw/ehca/ehca_galpa.h |   74 +++++++
 drivers/infiniband/hw/ehca/hcp_phyp.c   |   81 +++++++
 drivers/infiniband/hw/ehca/hcp_phyp.h   |  338 +++++++++++++++++++++++++++++++
 3 files changed, 493 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_galpa.h b/drivers/infiniband/hw/ehca/ehca_galpa.h
new file mode 100644
index 0000000..d64115c
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_galpa.h
@@ -0,0 +1,74 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  pSeries interface definitions
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_galpa.h,v 1.6 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_GALPA_H__
+#define __EHCA_GALPA_H__
+
+/* eHCA page (mapped into p-memory)
+    resource to access eHCA register pages in CPU address space
+*/
+struct h_galpa {
+	u64 fw_handle;
+	/* for pSeries this is a 64bit memory address where
+	   I/O memory is mapped into CPU address space (kv) */
+};
+
+/**
+   resource to access eHCA address space registers, all types
+*/
+struct h_galpas {
+	u32 pid;		/*PID of userspace galpa checking */
+	struct h_galpa user;	/* user space accessible resource,
+				   set to 0 if unused */
+	struct h_galpa kernel;	/* kernel space accessible resource,
+				   set to 0 if unused */
+};
+/** @brief store value at offset into galpa, will be inline function
+ */
+void hipz_galpa_store(struct h_galpa galpa, u32 offset, u64 value);
+
+/** @brief return value from offset in galpa, will be inline function
+ */
+u64 hipz_galpa_load(struct h_galpa galpa, u32 offset);
+
+#endif				/* __EHCA_GALPA_H__ */
diff --git a/drivers/infiniband/hw/ehca/hcp_phyp.c b/drivers/infiniband/hw/ehca/hcp_phyp.c
new file mode 100644
index 0000000..129e61b
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hcp_phyp.c
@@ -0,0 +1,81 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *   load store abstraction for ehca register access
+ *
+ *  Authors:  Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hcp_phyp.c,v 1.10 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+
+#define DEB_PREFIX "PHYP"
+
+#ifdef __KERNEL__
+#include "ehca_kernel.h"
+#include "hipz_hw.h"
+/* #include "hipz_structs.h" */
+/* TODO: still necessary */
+#include "ehca_classes.h"
+#else				/* !__KERNEL__ */
+#include "ehca_utools.h"
+#include "ehca_galpa.h"
+#endif
+
+#ifndef EHCA_USERDRIVER		/* TODO: is this correct */
+
+u64 hipz_galpa_load(struct h_galpa galpa, u32 offset)
+{
+	u64 addr = galpa.fw_handle + offset;
+	u64 out;
+	EDEB_EN(7, "addr=%lx offset=%x ", addr, offset);
+	out = *(u64 *) addr;
+	EDEB_EX(7, "addr=%lx value=%lx", addr, out);
+	return out;
+}
+
+void hipz_galpa_store(struct h_galpa galpa, u32 offset, u64 value)
+{
+	u64 addr = galpa.fw_handle + offset;
+	EDEB(7, "addr=%lx offset=%x value=%lx", addr,
+	     offset, value);
+	*(u64 *) addr = value;
+#ifdef EHCA_USE_HCALL
+	/* hipz_galpa_load(galpa, offset); */
+	/* synchronize explicitly */
+#endif
+}
+
+#endif				/* EHCA_USERDRIVER */
diff --git a/drivers/infiniband/hw/ehca/hcp_phyp.h b/drivers/infiniband/hw/ehca/hcp_phyp.h
new file mode 100644
index 0000000..c82fb4b
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hcp_phyp.h
@@ -0,0 +1,338 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Firmware calls
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *           Waleri Fomin <fomin at de.ibm.com>
+ *           Gerd Bayer <gerd.bayer at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hcp_phyp.h,v 1.16 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __HCP_PHYP_H__
+#define __HCP_PHYP_H__
+
+#ifndef EHCA_USERDRIVER
+inline static int hcall_map_page(u64 physaddr, u64 * mapaddr)
+{
+	*mapaddr = (u64)(ioremap(physaddr, 4096));
+
+	EDEB(7, "ioremap physaddr=%lx mapaddr=%lx", physaddr, *mapaddr);
+	return 0;
+}
+
+inline static int hcall_unmap_page(u64 mapaddr)
+{
+	EDEB(7, "mapaddr=%lx", mapaddr);
+	iounmap((void *)(mapaddr));
+	return 0;
+}
+#else
+int hcall_map_page(u64 physaddr, u64 * mapaddr);
+int hcall_unmap_page(u64 mapaddr);
+#endif
+
+struct hcall {
+	u64 regs[11];
+};
+
+/**
+ * @brief returns time to wait in msecs for the given long busy error code
+ */
+inline static u32 getLongBusyTimeSecs(int longBusyRetCode)
+{
+	switch (longBusyRetCode) {
+	case H_LongBusyOrder1msec:
+		return 1;
+	case H_LongBusyOrder10msec:
+		return 10;
+	case H_LongBusyOrder100msec:
+		return 100;
+	case H_LongBusyOrder1sec:
+		return 1000;
+	case H_LongBusyOrder10sec:
+		return 10000;
+	case H_LongBusyOrder100sec:
+		return 100000;
+	default:
+		return 1;
+	}			/* eof switch */
+}
+
+inline static long plpar_hcall_7arg_7ret(unsigned long opcode,
+					 unsigned long arg1,    /* <R4  */
+					 unsigned long arg2,	/* <R5  */
+					 unsigned long arg3,	/* <R6  */
+					 unsigned long arg4,	/* <R7  */
+					 unsigned long arg5,	/* <R8  */
+					 unsigned long arg6,	/* <R9  */
+					 unsigned long arg7,	/* <R10 */
+					 unsigned long *out1,	/* <R4  */
+					 unsigned long *out2,	/* <R5  */
+					 unsigned long *out3,	/* <R6  */
+					 unsigned long *out4,	/* <R7  */
+					 unsigned long *out5,	/* <R8  */
+					 unsigned long *out6,	/* <R9  */
+					 unsigned long *out7	/* <R10 */
+    )
+{
+	struct hcall hcall_in = {
+		.regs[0] = opcode,
+		.regs[1] = arg1,
+		.regs[2] = arg2,
+		.regs[3] = arg3,
+		.regs[4] = arg4,
+		.regs[5] = arg5,
+		.regs[6] = arg6,
+		.regs[7] = arg7	/*,
+				   .regs[8]=arg8 */
+	};
+	struct hcall hcall = hcall_in;
+	int i;
+	long ret;
+	int sleep_msecs;
+	EDEB(7, "HCALL77_IN r3=%lx r4=%lx r5=%lx r6=%lx r7=%lx r8=%lx"
+	     " r9=%lx r10=%lx r11=%lx", hcall.regs[0], hcall.regs[1],
+	     hcall.regs[2], hcall.regs[3], hcall.regs[4], hcall.regs[5],
+	     hcall.regs[6], hcall.regs[7], hcall.regs[8]);
+
+	/* if phype returns LongBusyXXX,
+	 * we retry several times, but not forever */
+	for (i = 0; i < 5; i++) {
+		__asm__ __volatile__("mr 3,%10\n"
+				     "mr 4,%11\n"
+				     "mr 5,%12\n"
+				     "mr 6,%13\n"
+				     "mr 7,%14\n"
+				     "mr 8,%15\n"
+				     "mr 9,%16\n"
+				     "mr 10,%17\n"
+				     "mr 11,%18\n"
+				     "mr 12,%19\n"
+				     ".long 0x44000022\n"
+				     "mr %0,3\n"
+				     "mr %1,4\n"
+				     "mr %2,5\n"
+				     "mr %3,6\n"
+				     "mr %4,7\n"
+				     "mr %5,8\n"
+				     "mr %6,9\n"
+				     "mr %7,10\n"
+				     "mr %8,11\n"
+				     "mr %9,12\n":"=r"(hcall.regs[0]),
+				     "=r"(hcall.regs[1]), "=r"(hcall.regs[2]),
+				     "=r"(hcall.regs[3]), "=r"(hcall.regs[4]),
+				     "=r"(hcall.regs[5]), "=r"(hcall.regs[6]),
+				     "=r"(hcall.regs[7]), "=r"(hcall.regs[8]),
+				     "=r"(hcall.regs[9])
+				     :"r"(hcall.regs[0]), "r"(hcall.regs[1]),
+				     "r"(hcall.regs[2]), "r"(hcall.regs[3]),
+				     "r"(hcall.regs[4]), "r"(hcall.regs[5]),
+				     "r"(hcall.regs[6]), "r"(hcall.regs[7]),
+				     "r"(hcall.regs[8]), "r"(hcall.regs[9])
+				     :"r0", "r2", "r3", "r4", "r5", "r6", "r7",
+				     "r8", "r9", "r10", "r11", "r12", "cc",
+				     "xer", "ctr", "lr", "cr0", "cr1", "cr5",
+				     "cr6", "cr7");
+
+		EDEB(7, "HCALL77_OUT r3=%lx r4=%lx r5=%lx r6=%lx r7=%lx r8=%lx"
+		     " r9=%lx r10=%lx r11=%lx", hcall.regs[0], hcall.regs[1],
+		     hcall.regs[2], hcall.regs[3], hcall.regs[4], hcall.regs[5],
+		     hcall.regs[6], hcall.regs[7], hcall.regs[8]);
+		ret = hcall.regs[0];
+		*out1 = hcall.regs[1];
+		*out2 = hcall.regs[2];
+		*out3 = hcall.regs[3];
+		*out4 = hcall.regs[4];
+		*out5 = hcall.regs[5];
+		*out6 = hcall.regs[6];
+		*out7 = hcall.regs[7];
+
+		if (!H_isLongBusy(ret)) {
+			if (ret<0) {
+				EDEB_ERR(4, "HCALL77_IN r3=%lx r4=%lx r5=%lx r6=%lx "
+					 "r7=%lx r8=%lx r9=%lx r10=%lx",
+					 opcode, arg1, arg2, arg3,
+					 arg4, arg5, arg6, arg7);
+				EDEB_ERR(4,
+					 "HCALL77_OUT r3=%lx r4=%lx r5=%lx "
+					 "r6=%lx r7=%lx r8=%lx r9=%lx r10=%lx ",
+					 hcall.regs[0], hcall.regs[1],
+					 hcall.regs[2], hcall.regs[3],
+					 hcall.regs[4], hcall.regs[5],
+					 hcall.regs[6], hcall.regs[7]);
+			}
+			return ret;
+		}
+
+		sleep_msecs = getLongBusyTimeSecs(ret);
+		EDEB(7, "Got LongBusy return code from phype. "
+		       "Sleep %dmsecs and retry...", sleep_msecs);
+		msleep_interruptible(sleep_msecs);
+		hcall = hcall_in;
+	}			/* eof for */
+	EDEB_ERR(4, "HCALL77_OUT ret=H_Busy");
+	return H_Busy;
+}
+
+inline static long plpar_hcall_9arg_9ret(unsigned long opcode,
+					 unsigned long arg1,	/* <R4  */
+					 unsigned long arg2,	/* <R5  */
+					 unsigned long arg3,	/* <R6  */
+					 unsigned long arg4,	/* <R7  */
+					 unsigned long arg5,	/* <R8  */
+					 unsigned long arg6,	/* <R9  */
+					 unsigned long arg7,	/* <R10 */
+					 unsigned long arg8,	/* <R11 */
+					 unsigned long arg9,	/* <R12 */
+					 unsigned long *out1,	/* <R4  */
+					 unsigned long *out2,	/* <R5  */
+					 unsigned long *out3,	/* <R6  */
+					 unsigned long *out4,	/* <R7  */
+					 unsigned long *out5,	/* <R8  */
+					 unsigned long *out6,	/* <R9  */
+					 unsigned long *out7,	/* <R10 */
+					 unsigned long *out8,	/* <R11 */
+					 unsigned long *out9	/* <R12 */
+    )
+{
+	struct hcall hcall_in = {
+		.regs[0] = opcode,
+		.regs[1] = arg1,
+		.regs[2] = arg2,
+		.regs[3] = arg3,
+		.regs[4] = arg4,
+		.regs[5] = arg5,
+		.regs[6] = arg6,
+		.regs[7] = arg7,
+		.regs[8] = arg8,
+		.regs[9] = arg9,
+	};
+	struct hcall hcall = hcall_in;
+	int i;
+	long ret;
+	int sleep_msecs;
+	EDEB(7,"HCALL99_IN  r3=%lx r4=%lx r5=%lx r6=%lx r7=%lx r8=%lx r9=%lx"
+	     " r10=%lx r11=%lx r12=%lx",
+	     hcall.regs[0], hcall.regs[1], hcall.regs[2], hcall.regs[3],
+	     hcall.regs[4], hcall.regs[5], hcall.regs[6], hcall.regs[7],
+	     hcall.regs[8], hcall.regs[9]);
+
+	/* if phype returns LongBusyXXX, we retry several times, but not forever */
+	for (i = 0; i < 5; i++) {
+		__asm__ __volatile__("mr 3,%10\n"
+				     "mr 4,%11\n"
+				     "mr 5,%12\n"
+				     "mr 6,%13\n"
+				     "mr 7,%14\n"
+				     "mr 8,%15\n"
+				     "mr 9,%16\n"
+				     "mr 10,%17\n"
+				     "mr 11,%18\n"
+				     "mr 12,%19\n"
+				     ".long 0x44000022\n"
+				     "mr %0,3\n"
+				     "mr %1,4\n"
+				     "mr %2,5\n"
+				     "mr %3,6\n"
+				     "mr %4,7\n"
+				     "mr %5,8\n"
+				     "mr %6,9\n"
+				     "mr %7,10\n"
+				     "mr %8,11\n"
+				     "mr %9,12\n":"=r"(hcall.regs[0]),
+				     "=r"(hcall.regs[1]), "=r"(hcall.regs[2]),
+				     "=r"(hcall.regs[3]), "=r"(hcall.regs[4]),
+				     "=r"(hcall.regs[5]), "=r"(hcall.regs[6]),
+				     "=r"(hcall.regs[7]), "=r"(hcall.regs[8]),
+				     "=r"(hcall.regs[9])
+				     :"r"(hcall.regs[0]), "r"(hcall.regs[1]),
+				     "r"(hcall.regs[2]), "r"(hcall.regs[3]),
+				     "r"(hcall.regs[4]), "r"(hcall.regs[5]),
+				     "r"(hcall.regs[6]), "r"(hcall.regs[7]),
+				     "r"(hcall.regs[8]), "r"(hcall.regs[9])
+				     :"r0", "r2", "r3", "r4", "r5", "r6", "r7",
+				     "r8", "r9", "r10", "r11", "r12", "cc",
+				     "xer", "ctr", "lr", "cr0", "cr1", "cr5",
+				     "cr6", "cr7");
+
+		EDEB(7,"HCALL99_OUT r3=%lx r4=%lx r5=%lx r6=%lx r7=%lx r8=%lx "
+		     "r9=%lx r10=%lx r11=%lx r12=%lx", hcall.regs[0],
+		     hcall.regs[1], hcall.regs[2], hcall.regs[3], hcall.regs[4],
+		     hcall.regs[5], hcall.regs[6], hcall.regs[7], hcall.regs[8],
+		     hcall.regs[9]);
+		ret = hcall.regs[0];
+		*out1 = hcall.regs[1];
+		*out2 = hcall.regs[2];
+		*out3 = hcall.regs[3];
+		*out4 = hcall.regs[4];
+		*out5 = hcall.regs[5];
+		*out6 = hcall.regs[6];
+		*out7 = hcall.regs[7];
+		*out8 = hcall.regs[8];
+		*out9 = hcall.regs[9];
+
+		if (!H_isLongBusy(ret)) {
+			if (ret<0) {
+				EDEB_ERR(4, "HCALL99_IN r3=%lx r4=%lx r5=%lx r6=%lx "
+					 "r7=%lx r8=%lx r9=%lx r10=%lx "
+					 "r11=%lx r12=%lx",
+					 opcode, arg1, arg2, arg3,
+					 arg4, arg5, arg6, arg7,
+					 arg8, arg9);
+				EDEB_ERR(4,
+					 "HCALL99_OUT r3=%lx r4=%lx r5=%lx "
+					 "r6=%lx r7=%lx r8=%lx r9=%lx r10=%lx "
+					 "r11=%lx r12=%lx",
+					 hcall.regs[0], hcall.regs[1],
+					 hcall.regs[2], hcall.regs[3],
+					 hcall.regs[4], hcall.regs[5],
+					 hcall.regs[6], hcall.regs[7],
+					 hcall.regs[8], hcall.regs[9]);
+			}
+			return ret;
+		}
+		sleep_msecs = getLongBusyTimeSecs(ret);
+		EDEB(7, "Got LongBusy return code from phype. "
+		     "Sleep %dmsecs and retry...", sleep_msecs);
+		msleep_interruptible(sleep_msecs);
+		hcall = hcall_in;
+	}			/* eof for */
+	EDEB_ERR(4, "HCALL99_OUT ret=H_Busy");
+	return H_Busy;
+}
+
+#endif
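
A note on the retry logic in the two plpar_hcall_* wrappers above: the
helper named getLongBusyTimeSecs() actually yields milliseconds, and the
loop gives up after five attempts. The stand-alone sketch below restates
that policy outside the kernel so it can be exercised without a
hypervisor. Everything prefixed sk_ is hypothetical, and the numeric
return codes are assumed to match the usual pSeries values (long-busy
codes 9900 to 9905, H_Busy = 1); check them against the real headers
before relying on them.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed return codes; the names are illustrative, not the driver's. */
enum {
	SK_H_SUCCESS           = 0,
	SK_H_BUSY              = 1,
	SK_H_LONG_BUSY_1_MSEC  = 9900,
	SK_H_LONG_BUSY_10_MSEC = 9901,
	SK_H_LONG_BUSY_1_SEC   = 9903,
	SK_H_LONG_BUSY_100_SEC = 9905
};

static int sk_is_long_busy(long rc)
{
	return rc >= SK_H_LONG_BUSY_1_MSEC && rc <= SK_H_LONG_BUSY_100_SEC;
}

/* Delay in milliseconds for a long-busy code; 1 ms for anything else. */
static unsigned int sk_long_busy_msecs(long rc)
{
	unsigned int ms = 1;
	long step;

	if (!sk_is_long_busy(rc))
		return 1;
	/* each successive code means ten times the previous delay */
	for (step = SK_H_LONG_BUSY_1_MSEC; step < rc; step++)
		ms *= 10;
	return ms;
}

typedef long (*sk_call_fn)(void *ctx);
typedef void (*sk_sleep_fn)(unsigned int msecs, void *ctx);

/* Call until the result is not long-busy, at most max_tries times. */
static long sk_retry_long_busy(sk_call_fn call, sk_sleep_fn sleep_ms,
			       void *ctx, int max_tries)
{
	int i;

	assert(call != NULL && sleep_ms != NULL);
	for (i = 0; i < max_tries; i++) {
		long rc = call(ctx);

		if (!sk_is_long_busy(rc))
			return rc;
		sleep_ms(sk_long_busy_msecs(rc), ctx);
	}
	return SK_H_BUSY;
}

/* Stand-in for the hypervisor: busy a few times, then success. */
struct sk_fake {
	int busy_left;		/* long-busy replies still to give */
	unsigned int slept;	/* total milliseconds requested */
};

static long sk_fake_call(void *ctx)
{
	struct sk_fake *f = ctx;

	if (f->busy_left > 0) {
		f->busy_left--;
		return SK_H_LONG_BUSY_10_MSEC;
	}
	return SK_H_SUCCESS;
}

static void sk_fake_sleep(unsigned int msecs, void *ctx)
{
	struct sk_fake *f = ctx;

	f->slept += msecs;
}
```

Passing the call and the sleep as function pointers is only there to
make the policy testable; in the driver the same loop could live in one
out-of-line function shared by both wrappers.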


From rolandd at cisco.com  Sat Feb 18 11:57:04 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:04 -0800
Subject: [PATCH 01/22] Add powerpc-specific clear_cacheline(),
	which just compiles to "dcbz".
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005704.13620.88286.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

This is horribly non-portable.  How much of a performance difference
does it make?  How does it do on ppc64 systems where the cacheline
size is not 32?
---

 drivers/infiniband/hw/ehca/ehca_asm.h |   58 +++++++++++++++++++++++++++++++++
 1 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_asm.h b/drivers/infiniband/hw/ehca/ehca_asm.h
new file mode 100644
index 0000000..6a09ac5
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_asm.h
@@ -0,0 +1,58 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Some helper macros with assembler instructions
+ *
+ *  Authors: Khadija Souissi <souissik at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_asm.h,v 1.7 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+
+#ifndef __EHCA_ASM_H__
+#define __EHCA_ASM_H__
+
+#if defined(CONFIG_PPC_PSERIES) || defined (__PPC64__) || defined (__PPC__)
+
+#define clear_cacheline(adr) __asm__ __volatile("dcbz 0,%0"::"r"(adr))
+
+#elif defined(CONFIG_ARCH_S390)
+#error "unsupported yet"
+#else
+#error "invalid platform"
+#endif
+
+#endif /* __EHCA_ASM_H__ */
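
To make the cacheline-size question concrete: dcbz zeroes the entire
cache block containing the given address, and the block size depends on
the CPU (32 bytes on many 32-bit parts, larger on some 64-bit ones). A
caller that assumes 32 bytes can therefore wipe more memory than
intended. The sketch below is a plain C model of that behaviour with the
block size as an explicit parameter. sk_clear_cacheline() is a
hypothetical name and is not proposed as a replacement for the macro; it
only illustrates what gets zeroed for a given line size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: zero the whole block that
 * contains addr, as dcbz would, for a caller-supplied block size.
 */
static void sk_clear_cacheline(void *addr, size_t line_size)
{
	uintptr_t start;

	/* line_size must be a non-zero power of two */
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

	/* round down to the start of the block, then clear all of it */
	start = (uintptr_t)addr & ~((uintptr_t)line_size - 1);
	memset((void *)start, 0, line_size);
}
```

With a 128-byte line, clearing an address 130 bytes into an aligned
buffer zeroes bytes 128 through 255, not just the 32 bytes a caller
written for a 32-byte line might expect.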


From rolandd at cisco.com  Sat Feb 18 11:57:07 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:07 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005707.13620.20538.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

This is a very large file with way too much code for a .h file.
The functions look too big to be inlined also.  Is there any way
for this code to move to a .c file?
---

 drivers/infiniband/hw/ehca/hcp_if.h | 2022 +++++++++++++++++++++++++++++++++++
 1 files changed, 2022 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/hcp_if.h b/drivers/infiniband/hw/ehca/hcp_if.h
new file mode 100644
index 0000000..70bf77f
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hcp_if.h
@@ -0,0 +1,2022 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Firmware Infiniband Interface code for POWER
+ *
+ *  Authors: Gerd Bayer <gerd.bayer at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *           Waleri Fomin <fomin at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hcp_if.h,v 1.62 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __HCP_IF_H__
+#define __HCP_IF_H__
+
+#include "ehca_tools.h"
+#include "hipz_structs.h"
+#include "ehca_classes.h"
+
+#ifndef EHCA_USE_HCALL
+#include "hcz_queue.h"
+#include "hcz_mrmw.h"
+#include "hcz_emmio.h"
+#include "sim_prom.h"
+#endif
+#include "hipz_fns.h"
+#include "hcp_sense.h"
+#include "ehca_irq.h"
+
+#ifndef CONFIG_PPC64
+#ifndef Z_SERIES
+#warning "included with wrong target, this is a p file"
+#endif
+#endif
+
+#ifdef EHCA_USE_HCALL
+
+#ifndef EHCA_USERDRIVER
+#include "hcp_phyp.h"
+#else
+#include "testbench/hcallbridge.h"
+#endif
+#endif
+
+inline static int hcp_galpas_ctor(struct h_galpas *galpas,
+				  u64 paddr_kernel, u64 paddr_user)
+{
+	int rc = 0;
+
+	rc = hcall_map_page(paddr_kernel, &galpas->kernel.fw_handle);
+	if (rc != 0)
+		return (rc);
+
+	galpas->user.fw_handle = paddr_user;
+
+	EDEB(7, "paddr_kernel=%lx paddr_user=%lx galpas->kernel=%lx"
+	     " galpas->user=%lx",
+	     paddr_kernel, paddr_user, galpas->kernel.fw_handle,
+	     galpas->user.fw_handle);
+
+	return (rc);
+}
+
+inline static int hcp_galpas_dtor(struct h_galpas *galpas)
+{
+	int rc = 0;
+
+	if (galpas->kernel.fw_handle != 0)
+		rc = hcall_unmap_page(galpas->kernel.fw_handle);
+
+	if (rc != 0)
+		return (rc);
+
+	galpas->user.fw_handle = galpas->kernel.fw_handle = 0;
+
+	return rc;
+}
+
+/**
+ * hipz_h_alloc_resource_eq - Allocate EQ resources in HW and FW, initialize
+ * resources, create the empty EQPT (ring).
+ *
+ * @eq_handle:         eq handle for this queue
+ * @act_nr_of_entries: actual number of queue entries
+ * @act_pages:         actual number of queue pages
+ * @eq_ist:            used by hcp_H_XIRR() call
+ */
+inline static u64 hipz_h_alloc_resource_eq(const struct
+						      ipz_adapter_handle
+						      hcp_adapter_handle,
+						      struct ehca_pfeq *pfeq,
+						      const u32 neq_control,
+						      const u32
+						      number_of_entries,
+						      struct ipz_eq_handle
+						      *eq_handle,
+						      u32 * act_nr_of_entries,
+						      u32 * act_pages,
+						      u32 * eq_ist)
+{
+	u64 retcode;
+	u64 dummy;
+	u64 act_nr_of_entries_out = 0;
+	u64 act_pages_out         = 0;
+	u64 eq_ist_out            = 0;
+	u64 allocate_controls     = 0;
+	u32 x = (u64)(&x);
+
+	EDEB_EN(7, "pfeq=%p hcp_adapter_handle=%lx  neq_control=%x"
+		   " number_of_entries=%x",
+		   pfeq, hcp_adapter_handle.handle, neq_control,
+		   number_of_entries);
+
+#ifndef EHCA_USE_HCALL
+	retcode = simp_h_alloc_resource_eq(hcp_adapter_handle, pfeq,
+					   neq_control,
+					   number_of_entries,
+					   eq_handle,
+					   act_nr_of_entries,
+					   act_pages, eq_ist);
+#else
+
+	/* resource type */
+	allocate_controls = 3ULL;
+
+	/* ISN is associated */
+	if (neq_control != 1) {
+		allocate_controls = (1ULL << (63 - 7)) | allocate_controls;
+	}
+
+	/* notification event queue */
+	if (neq_control == 1) {
+		allocate_controls = (1ULL << 63) | allocate_controls;
+	}
+
+	retcode = plpar_hcall_7arg_7ret(H_ALLOC_RESOURCE,
+					hcp_adapter_handle.handle, /* r4 */
+					allocate_controls,	   /* r5 */
+					number_of_entries,	   /* r6 */
+					0, 0, 0, 0,
+					&eq_handle->handle,	   /* r4 */
+					&dummy,	                   /* r5 */
+					&dummy,	                   /* r6 */
+					&act_nr_of_entries_out,	   /* r7 */
+					&act_pages_out,	           /* r8 */
+					&eq_ist_out,               /* r9 */
+					&dummy);
+
+	*act_nr_of_entries = (u32) act_nr_of_entries_out;
+	*act_pages         = (u32) act_pages_out;
+	*eq_ist            = (u32) eq_ist_out;
+
+#endif /* EHCA_USE_HCALL */
+
+	if (retcode == H_NOT_ENOUGH_RESOURCES) {
+		EDEB_ERR(4, "Not enough resource - retcode=%lx ", retcode);
+	}
+
+	EDEB_EX(7, "act_nr_of_entries=%x act_pages=%x eq_ist=%x",
+		*act_nr_of_entries, *act_pages, *eq_ist);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_reset_event(const struct ipz_adapter_handle
+						hcp_adapter_handle,
+						struct ipz_eq_handle eq_handle,
+						const u64 event_mask)
+{
+	u64 retcode = 0;
+	u64 dummy;
+
+	EDEB_EN(7, "eq_handle=%lx, adapter_handle=%lx  event_mask=%lx",
+		   eq_handle.handle, hcp_adapter_handle.handle, event_mask);
+
+#ifndef EHCA_USE_HCALL
+	/* TODO: Not implemented yet */
+#else
+
+	retcode = plpar_hcall_7arg_7ret(H_RESET_EVENTS,
+					hcp_adapter_handle.handle, /* r4 */
+					eq_handle.handle,	   /* r5 */
+					event_mask,	           /* r6 */
+					0, 0, 0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif
+	EDEB(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+/**
+ * hipz_h_alloc_resource_cq - Allocate CQ resources in HW and FW, initialize
+ * resources, create the empty CQPT (ring).
+ *
+ * @eq_handle:         eq handle to use for this cq
+ * @cq_handle:         cq handle for this queue
+ * @act_nr_of_entries: actual number of queue entries
+ * @act_pages:         actual number of queue pages
+ * @galpas:            contain logical address of priv. storage and
+ *                     log_user_storage
+ */
+static inline u64 hipz_h_alloc_resource_cq(const struct
+						      ipz_adapter_handle
+						      hcp_adapter_handle,
+						      struct ehca_pfcq *pfcq,
+						      const struct ipz_eq_handle
+						      eq_handle,
+						      const u32 cq_token,
+						      const u32
+						      number_of_entries,
+						      struct ipz_cq_handle
+						      *cq_handle,
+						      u32 * act_nr_of_entries,
+						      u32 * act_pages,
+						      struct h_galpas *galpas)
+{
+	u64 retcode = 0;
+	u64 dummy;
+	u64 act_nr_of_entries_out;
+	u64 act_pages_out;
+	u64 g_la_privileged_out;
+	u64 g_la_user_out;
+	/* stack location is a unique identifier for a process from beginning
+	 * to end of this frame */
+	u32 x = (u64)(&x);
+
+	EDEB_EN(7, "pfcq=%p hcp_adapter_handle=%lx eq_handle=%lx cq_token=%x"
+		" number_of_entries=%x",
+		pfcq, hcp_adapter_handle.handle, eq_handle.handle,
+		cq_token, number_of_entries);
+
+#ifndef EHCA_USE_HCALL
+	retcode = simp_h_alloc_resource_cq(hcp_adapter_handle,
+					   pfcq,
+					   eq_handle,
+					   cq_token,
+					   number_of_entries,
+					   cq_handle,
+					   act_nr_of_entries,
+					   act_pages, galpas);
+#else
+	retcode = plpar_hcall_7arg_7ret(H_ALLOC_RESOURCE,
+					hcp_adapter_handle.handle, /* r4  */
+					2,	                   /* r5  */
+					eq_handle.handle,	   /* r6  */
+					cq_token,	           /* r7  */
+					number_of_entries,	   /* r8  */
+					0, 0,
+					&cq_handle->handle,	   /* r4  */
+					&dummy,	                   /* r5  */
+					&dummy,	                   /* r6  */
+					&act_nr_of_entries_out,	   /* r7  */
+					&act_pages_out,	           /* r8  */
+					&g_la_privileged_out,	   /* r9  */
+					&g_la_user_out);	   /* r10 */
+
+	*act_nr_of_entries = (u32) act_nr_of_entries_out;
+	*act_pages = (u32) act_pages_out;
+
+	if (retcode == 0) {
+		hcp_galpas_ctor(galpas, g_la_privileged_out, g_la_user_out);
+	}
+#endif /* EHCA_USE_HCALL */
+
+	if (retcode == H_NOT_ENOUGH_RESOURCES) {
+		EDEB_ERR(4, "Not enough resources. retcode=%lx", retcode);
+	}
+
+	EDEB_EX(7, "cq_handle=%lx act_nr_of_entries=%x act_pages=%x",
+		cq_handle->handle, *act_nr_of_entries, *act_pages);
+
+	return retcode;
+}
+
+#define H_ALL_RES_QP_Enhanced_QP_Operations EHCA_BMASK_IBM(9,11)
+#define H_ALL_RES_QP_QP_PTE_Pin EHCA_BMASK_IBM(12,12)
+#define H_ALL_RES_QP_Service_Type EHCA_BMASK_IBM(13,15)
+#define H_ALL_RES_QP_LL_RQ_CQE_Posting EHCA_BMASK_IBM(18,18)
+#define H_ALL_RES_QP_LL_SQ_CQE_Posting EHCA_BMASK_IBM(19,21)
+#define H_ALL_RES_QP_Signalling_Type EHCA_BMASK_IBM(22,23)
+#define H_ALL_RES_QP_UD_Address_Vector_L_Key_Control EHCA_BMASK_IBM(31,31)
+#define H_ALL_RES_QP_Resource_Type EHCA_BMASK_IBM(56,63)
+
+#define H_ALL_RES_QP_Max_Outstanding_Send_Work_Requests EHCA_BMASK_IBM(0,15)
+#define H_ALL_RES_QP_Max_Outstanding_Receive_Work_Requests EHCA_BMASK_IBM(16,31)
+#define H_ALL_RES_QP_Max_Send_SG_Elements EHCA_BMASK_IBM(32,39)
+#define H_ALL_RES_QP_Max_Receive_SG_Elements EHCA_BMASK_IBM(40,47)
+
+#define H_ALL_RES_QP_Act_Outstanding_Send_Work_Requests EHCA_BMASK_IBM(16,31)
+#define H_ALL_RES_QP_Act_Outstanding_Receive_Work_Requests EHCA_BMASK_IBM(48,63)
+#define H_ALL_RES_QP_Act_Send_SG_Elements EHCA_BMASK_IBM(8,15)
+#define H_ALL_RES_QP_Act_Receeive_SG_Elements EHCA_BMASK_IBM(24,31)
+
+#define H_ALL_RES_QP_Send_Queue_Size_pages EHCA_BMASK_IBM(0,31)
+#define H_ALL_RES_QP_Receive_Queue_Size_pages EHCA_BMASK_IBM(32,63)
+
+/* direct access qp controls */
+#define DAQP_CTRL_ENABLE 0x01
+#define DAQP_CTRL_SEND_COMPLETION 0x20
+#define DAQP_CTRL_RECV_COMPLETION 0x40
+
+/**
+ * hipz_h_alloc_resource_qp - Allocate QP resources in HW and FW,
+ * initialize resources, create empty QPPTs (2 rings).
+ *
+ * @h_galpas: used to access HCA resident QP attributes
+ */
+static inline u64 hipz_h_alloc_resource_qp(const struct
+						      ipz_adapter_handle
+						      adapter_handle,
+						      struct ehca_pfqp *pfqp,
+						      const u8 servicetype,
+						      const u8 daqp_ctrl,
+						      const u8 signalingtype,
+						      const u8 ud_av_l_key_ctl,
+						      const struct ipz_cq_handle send_cq_handle,
+						      const struct ipz_cq_handle receive_cq_handle,
+						      const struct ipz_eq_handle async_eq_handle,
+						      const u32 qp_token,
+						      const struct ipz_pd pd,
+						      const u16 max_nr_send_wqes,
+						      const u16 max_nr_receive_wqes,
+						      const u8 max_nr_send_sges,
+						      const u8 max_nr_receive_sges,
+						      const u32 ud_av_l_key,
+						      struct ipz_qp_handle *qp_handle,
+						      u32 * qp_nr,
+						      u16 * act_nr_send_wqes,
+						      u16 * act_nr_receive_wqes,
+						      u8 * act_nr_send_sges,
+						      u8 * act_nr_receive_sges,
+						      u32 * nr_sq_pages,
+						      u32 * nr_rq_pages,
+						      struct h_galpas *h_galpas)
+{
+	u64 retcode = H_Success;
+	u64 allocate_controls;
+	u64 max_r10_reg;
+	u64 dummy         = 0;
+	u64 qp_nr_out     = 0;
+	u64 r6_out        = 0;
+	u64 r7_out        = 0;
+	u64 r8_out        = 0;
+	u64 g_la_user_out = 0;
+	u64 r11_out       = 0;
+
+	EDEB_EN(7, "pfqp=%p adapter_handle=%lx servicetype=%x signalingtype=%x"
+		" ud_av_l_key=%x send_cq_handle=%lx receive_cq_handle=%lx"
+		" async_eq_handle=%lx qp_token=%x  pd=%x max_nr_send_wqes=%x"
+		" max_nr_receive_wqes=%x max_nr_send_sges=%x"
+		" max_nr_receive_sges=%x ud_av_l_key=%x galpa.pid=%x",
+		pfqp, adapter_handle.handle, servicetype, signalingtype,
+		ud_av_l_key, send_cq_handle.handle,
+		receive_cq_handle.handle, async_eq_handle.handle, qp_token,
+		pd.value, max_nr_send_wqes, max_nr_receive_wqes,
+		max_nr_send_sges, max_nr_receive_sges, ud_av_l_key,
+		h_galpas->pid);
+
+#ifndef EHCA_USE_HCALL
+	retcode = simp_h_alloc_resource_qp(adapter_handle,
+					   pfqp,
+					   servicetype,
+					   signalingtype,
+					   ud_av_l_key_ctl,
+					   send_cq_handle,
+					   receive_cq_handle,
+					   async_eq_handle,
+					   qp_token,
+					   pd,
+					   max_nr_send_wqes,
+					   max_nr_receive_wqes,
+					   max_nr_send_sges,
+					   max_nr_receive_sges,
+					   ud_av_l_key,
+					   qp_handle,
+					   qp_nr,
+					   act_nr_send_wqes,
+					   act_nr_receive_wqes,
+					   act_nr_send_sges,
+					   act_nr_receive_sges,
+					   nr_sq_pages, nr_rq_pages, h_galpas);
+
+#else
+	allocate_controls =
+		EHCA_BMASK_SET(H_ALL_RES_QP_Enhanced_QP_Operations,
+			       (daqp_ctrl & DAQP_CTRL_ENABLE) ? 1 : 0)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_QP_PTE_Pin, 0)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_Service_Type, servicetype)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_Signalling_Type, signalingtype)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_LL_RQ_CQE_Posting,
+				 (daqp_ctrl & DAQP_CTRL_RECV_COMPLETION) ? 1 : 0)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_LL_SQ_CQE_Posting,
+				 (daqp_ctrl & DAQP_CTRL_SEND_COMPLETION) ? 1 : 0)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_UD_Address_Vector_L_Key_Control,
+				 ud_av_l_key_ctl)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_Resource_Type, 1);
+
+	max_r10_reg =
+		EHCA_BMASK_SET(H_ALL_RES_QP_Max_Outstanding_Send_Work_Requests,
+			       max_nr_send_wqes)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_Max_Outstanding_Receive_Work_Requests,
+				 max_nr_receive_wqes)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_Max_Send_SG_Elements,
+				 max_nr_send_sges)
+		| EHCA_BMASK_SET(H_ALL_RES_QP_Max_Receive_SG_Elements,
+				 max_nr_receive_sges);
+
+
+	retcode = plpar_hcall_9arg_9ret(H_ALLOC_RESOURCE,
+					adapter_handle.handle,	 /* r4  */
+					allocate_controls,	 /* r5  */
+					send_cq_handle.handle,	 /* r6  */
+					receive_cq_handle.handle,/* r7  */
+					async_eq_handle.handle,	 /* r8  */
+					((u64) qp_token << 32)
+					| pd.value,              /* r9  */
+					max_r10_reg,	         /* r10 */
+					ud_av_l_key,	         /* r11 */
+					0,
+					&qp_handle->handle,	 /* r4  */
+					&qp_nr_out,	         /* r5  */
+					&r6_out,	         /* r6  */
+					&r7_out,	         /* r7  */
+					&r8_out,	         /* r8  */
+					&dummy,	                 /* r9  */
+					&g_la_user_out,	         /* r10 */
+					&r11_out,
+					&dummy);
+
+	/* extract outputs */
+	*qp_nr = (u32) qp_nr_out;
+	*act_nr_send_wqes = (u16)
+		EHCA_BMASK_GET(H_ALL_RES_QP_Act_Outstanding_Send_Work_Requests,
+			       r6_out);
+	*act_nr_receive_wqes = (u16)
+		EHCA_BMASK_GET(H_ALL_RES_QP_Act_Outstanding_Receive_Work_Requests,
+			       r6_out);
+	*act_nr_send_sges =
+		(u8) EHCA_BMASK_GET(H_ALL_RES_QP_Act_Send_SG_Elements,
+				    r7_out);
+	*act_nr_receive_sges =
+		(u8) EHCA_BMASK_GET(H_ALL_RES_QP_Act_Receeive_SG_Elements,
+				    r7_out);
+	*nr_sq_pages =
+		(u32) EHCA_BMASK_GET(H_ALL_RES_QP_Send_Queue_Size_pages,
+				     r8_out);
+	*nr_rq_pages =
+		(u32) EHCA_BMASK_GET(H_ALL_RES_QP_Receive_Queue_Size_pages,
+				     r8_out);
+	if (retcode == 0) {
+		hcp_galpas_ctor(h_galpas, g_la_user_out, g_la_user_out);
+	}
+#endif /* EHCA_USE_HCALL */
+
+	if (retcode == H_NOT_ENOUGH_RESOURCES) {
+		EDEB_ERR(4, "Not enough resources. retcode=%lx",
+			 retcode);
+	}
+
+	EDEB_EX(7, "qp_nr=%x act_nr_send_wqes=%x"
+		" act_nr_receive_wqes=%x act_nr_send_sges=%x"
+		" act_nr_receive_sges=%x nr_sq_pages=%x"
+		" nr_rq_pages=%x galpa.user=%lx galpa.kernel=%lx",
+		*qp_nr, *act_nr_send_wqes, *act_nr_receive_wqes,
+		*act_nr_send_sges, *act_nr_receive_sges, *nr_sq_pages,
+		*nr_rq_pages, h_galpas->user.fw_handle,
+		h_galpas->kernel.fw_handle);
+
+	return (retcode);
+}
+
+static inline u64 hipz_h_query_port(const struct ipz_adapter_handle
+					       hcp_adapter_handle,
+					       const u8 port_id,
+					       struct query_port_rblock
+					       *query_port_response_block)
+{
+	u64 retcode = H_Success;
+	u64 dummy;
+	u64 r_cb;
+
+	EDEB_EN(7, "hcp_adapter_handle=%lx port_id %x",
+		hcp_adapter_handle.handle, port_id);
+
+	if ((((u64)query_port_response_block) & 0xfff) != 0) {
+		EDEB_ERR(4, "response block not page aligned");
+		retcode = H_Parameter;
+		return (retcode);
+	}
+
+#ifndef EHCA_USE_HCALL
+	retcode = 0;
+#else
+	r_cb = ehca_kv_to_g(query_port_response_block);
+
+	retcode = plpar_hcall_7arg_7ret(H_QUERY_PORT,
+					hcp_adapter_handle.handle, /* r4 */
+					port_id,	           /* r5 */
+					r_cb,	                   /* r6 */
+					0, 0, 0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif /* EHCA_USE_HCALL */
+
+	EDEB(7, "offset0=%x offset1=%x offset2=%x offset3=%x",
+	     ((u32 *) query_port_response_block)[0],
+	     ((u32 *) query_port_response_block)[1],
+	     ((u32 *) query_port_response_block)[2],
+	     ((u32 *) query_port_response_block)[3]);
+	EDEB(7, "offset4=%x offset5=%x offset6=%x offset7=%x",
+	     ((u32 *) query_port_response_block)[4],
+	     ((u32 *) query_port_response_block)[5],
+	     ((u32 *) query_port_response_block)[6],
+	     ((u32 *) query_port_response_block)[7]);
+	EDEB(7, "offset8=%x offset9=%x offseta=%x offsetb=%x",
+	     ((u32 *) query_port_response_block)[8],
+	     ((u32 *) query_port_response_block)[9],
+	     ((u32 *) query_port_response_block)[10],
+	     ((u32 *) query_port_response_block)[11]);
+	EDEB(7, "offsetc=%x offsetd=%x offsete=%x offsetf=%x",
+	     ((u32 *) query_port_response_block)[12],
+	     ((u32 *) query_port_response_block)[13],
+	     ((u32 *) query_port_response_block)[14],
+	     ((u32 *) query_port_response_block)[15]);
+	EDEB(7, "offset31=%x offset35=%x offset36=%x",
+	     ((u32 *) query_port_response_block)[32],
+	     ((u32 *) query_port_response_block)[36],
+	     ((u32 *) query_port_response_block)[37]);
+	EDEB(7, "offset200=%x offset201=%x offset202=%x "
+	     "offset203=%x",
+	     ((u32 *) query_port_response_block)[0x200],
+	     ((u32 *) query_port_response_block)[0x201],
+	     ((u32 *) query_port_response_block)[0x202],
+	     ((u32 *) query_port_response_block)[0x203]);
+
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_query_hca(const struct ipz_adapter_handle
+					      hcp_adapter_handle,
+					      struct query_hca_rblock
+					      *query_hca_rblock)
+{
+	u64 retcode = 0;
+	u64 dummy;
+	u64 r_cb;
+	EDEB_EN(7, "hcp_adapter_handle=%lx", hcp_adapter_handle.handle);
+
+	if ((((u64)query_hca_rblock) & 0xfff) != 0) {
+		EDEB_ERR(4, "response block not page aligned");
+		retcode = H_Parameter;
+		return (retcode);
+	}
+
+#ifndef EHCA_USE_HCALL
+	retcode = 0;
+#else
+	r_cb = ehca_kv_to_g(query_hca_rblock);
+
+	retcode = plpar_hcall_7arg_7ret(H_QUERY_HCA,
+					hcp_adapter_handle.handle, /* r4 */
+					r_cb,                      /* r5 */
+					0, 0, 0, 0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif /* EHCA_USE_HCALL */
+
+	EDEB(7, "offset0=%x offset1=%x offset2=%x offset3=%x",
+	     ((u32 *) query_hca_rblock)[0],
+	     ((u32 *) query_hca_rblock)[1],
+	     ((u32 *) query_hca_rblock)[2], ((u32 *) query_hca_rblock)[3]);
+	EDEB(7, "offset4=%x offset5=%x offset6=%x offset7=%x",
+	     ((u32 *) query_hca_rblock)[4],
+	     ((u32 *) query_hca_rblock)[5],
+	     ((u32 *) query_hca_rblock)[6], ((u32 *) query_hca_rblock)[7]);
+	EDEB(7, "offset8=%x offset9=%x offseta=%x offsetb=%x",
+	     ((u32 *) query_hca_rblock)[8],
+	     ((u32 *) query_hca_rblock)[9],
+	     ((u32 *) query_hca_rblock)[10], ((u32 *) query_hca_rblock)[11]);
+	EDEB(7, "offsetc=%x offsetd=%x offsete=%x offsetf=%x",
+	     ((u32 *) query_hca_rblock)[12],
+	     ((u32 *) query_hca_rblock)[13],
+	     ((u32 *) query_hca_rblock)[14], ((u32 *) query_hca_rblock)[15]);
+	EDEB(7, "offset136=%x offset192=%x offset204=%x",
+	     ((u32 *) query_hca_rblock)[32],
+	     ((u32 *) query_hca_rblock)[48], ((u32 *) query_hca_rblock)[51]);
+	EDEB(7, "offset231=%x offset235=%x",
+	     ((u32 *) query_hca_rblock)[57], ((u32 *) query_hca_rblock)[58]);
+	EDEB(7, "offset200=%x offset201=%x offset202=%x offset203=%x",
+	     ((u32 *) query_hca_rblock)[0x200],
+	     ((u32 *) query_hca_rblock)[0x201],
+	     ((u32 *) query_hca_rblock)[0x202],
+	     ((u32 *) query_hca_rblock)[0x203]);
+
+	EDEB_EX(7, "retcode=%lx hcp_adapter_handle=%lx",
+		retcode, hcp_adapter_handle.handle);
+
+	return retcode;
+}
+
+/**
+ * hipz_h_register_rpage - hcp_if.h internal function for all
+ * hcp_H_REGISTER_RPAGE calls.
+ *
+ * @logical_address_of_page: kv transformation to GX address in this routine
+ */
+static inline u64 hipz_h_register_rpage(const struct
+						   ipz_adapter_handle
+						   hcp_adapter_handle,
+						   const u8 pagesize,
+						   const u8 queue_type,
+						   const u64 resource_handle,
+						   const u64
+						   logical_address_of_page,
+						   u64 count)
+{
+	u64 retcode = 0;
+	u64 dummy;
+
+	EDEB_EN(7, "hcp_adapter_handle=%lx pagesize=%x queue_type=%x"
+		" resource_handle=%lx logical_address_of_page=%lx count=%lx",
+		hcp_adapter_handle.handle, pagesize, queue_type,
+		resource_handle, logical_address_of_page, count);
+
+#ifndef EHCA_USE_HCALL
+	EDEB_ERR(4, "Not implemented");
+#else
+	retcode = plpar_hcall_7arg_7ret(H_REGISTER_RPAGES,
+					hcp_adapter_handle.handle,  /* r4  */
+					queue_type | pagesize << 8, /* r5  */
+					resource_handle,	    /* r6  */
+					logical_address_of_page,    /* r7  */
+					count,	                    /* r8  */
+					0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif /* EHCA_USE_HCALL */
+
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_register_rpage_eq(const struct
+						      ipz_adapter_handle
+						      hcp_adapter_handle,
+						      const struct ipz_eq_handle
+						      eq_handle,
+						      struct ehca_pfeq *pfeq,
+						      const u8 pagesize,
+						      const u8 queue_type,
+						      const u64
+						      logical_address_of_page,
+						      const u64 count)
+{
+	u64 retcode = 0;
+
+	EDEB_EN(7, "pfeq=%p hcp_adapter_handle=%lx eq_handle=%lx pagesize=%x"
+		" queue_type=%x logical_address_of_page=%lx count=%lx",
+		pfeq, hcp_adapter_handle.handle, eq_handle.handle, pagesize,
+		queue_type, logical_address_of_page, count);
+
+#ifndef EHCA_USE_HCALL
+	retcode =
+		simp_h_register_rpage_eq(hcp_adapter_handle, eq_handle, pfeq,
+					 pagesize, queue_type,
+					 logical_address_of_page, count);
+#else
+	if (count != 1) {
+		EDEB_ERR(4, "Page counter=%lx", count);
+		return (H_Parameter);
+	}
+	retcode = hipz_h_register_rpage(hcp_adapter_handle,
+					pagesize,
+					queue_type,
+					eq_handle.handle,
+					logical_address_of_page, count);
+#endif /* EHCA_USE_HCALL */
+
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u32 hipz_request_interrupt(struct ehca_irq_info *irq_info,
+					 irqreturn_t(*handler)
+					 (int, void *, struct pt_regs *))
+{
+
+	int ret = 0;
+
+	EDEB_EN(7, "ist=0x%x", irq_info->ist);
+
+#ifdef EHCA_USE_HCALL
+#ifndef EHCA_USERDRIVER
+	ret = ibmebus_request_irq(NULL, irq_info->ist, handler,
+				  SA_INTERRUPT, "ehca", (void *)irq_info);
+
+	if (ret < 0)
+		EDEB_ERR(4, "Can't map interrupt handler.");
+#else
+	struct hcall_irq_info hirq = {.irq = irq_info->irq,
+				      .ist = irq_info->ist,
+				      .pid = irq_info->pid};
+
+	hirq = hirq;
+	ret = hcall_reg_eqh(&hirq, ehca_interrupt_eq);
+#endif /* EHCA_USERDRIVER */
+#endif /* EHCA_USE_HCALL  */
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+static inline void hipz_free_interrupt(struct ehca_irq_info *irq_info)
+{
+#ifdef EHCA_USE_HCALL
+#ifndef EHCA_USERDRIVER
+	ibmebus_free_irq(NULL, irq_info->ist, (void *)irq_info);
+#endif
+#endif
+}
+
+static inline u32 hipz_h_query_int_state(const struct ipz_adapter_handle
+					 hcp_adapter_handle,
+					 struct ehca_irq_info *irq_info)
+{
+	u32 rc = 0;
+	u64 dummy = 0;
+
+	EDEB_EN(7, "ist=0x%x", irq_info->ist);
+
+#ifdef EHCA_USE_HCALL
+#ifdef EHCA_USERDRIVER
+	/* TODO: Not implemented yet */
+#else
+	rc = plpar_hcall_7arg_7ret(H_QUERY_INT_STATE,
+				   hcp_adapter_handle.handle, /* r4 */
+				   irq_info->ist,             /* r5 */
+				   0, 0, 0, 0, 0,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy);
+
+	if ((rc != H_Success) && (rc != H_Busy))
+		EDEB_ERR(4, "Could not query interrupt state.");
+#endif
+#endif
+	EDEB_EX(7, "interrupt state: %x", rc);
+
+	return rc;
+}
+
+static inline u64 hipz_h_register_rpage_cq(const struct
+						      ipz_adapter_handle
+						      hcp_adapter_handle,
+						      const struct ipz_cq_handle
+						      cq_handle,
+						      struct ehca_pfcq *pfcq,
+						      const u8 pagesize,
+						      const u8 queue_type,
+						      const u64
+						      logical_address_of_page,
+						      const u64 count,
+						      const struct h_galpa gal)
+{
+	u64 retcode = 0;
+
+	EDEB_EN(7, "pfcq=%p hcp_adapter_handle=%lx cq_handle=%lx pagesize=%x"
+		" queue_type=%x  logical_address_of_page=%lx count=%lx",
+		pfcq, hcp_adapter_handle.handle, cq_handle.handle, pagesize,
+		queue_type, logical_address_of_page, count);
+
+#ifndef EHCA_USE_HCALL
+	retcode =
+		simp_h_register_rpage_cq(hcp_adapter_handle, cq_handle, pfcq,
+					 pagesize, queue_type,
+					 logical_address_of_page, count, gal);
+#else
+	if (count != 1) {
+		EDEB_ERR(4, "Page counter=%lx", count);
+		return (H_Parameter);
+	}
+
+	retcode =
+		hipz_h_register_rpage(hcp_adapter_handle, pagesize, queue_type,
+				      cq_handle.handle, logical_address_of_page,
+				      count);
+#endif /* EHCA_USE_HCALL */
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_register_rpage_qp(const struct
+						      ipz_adapter_handle
+						      hcp_adapter_handle,
+						      const struct ipz_qp_handle
+						      qp_handle,
+						      struct ehca_pfqp *pfqp,
+						      const u8 pagesize,
+						      const u8 queue_type,
+						      const u64
+						      logical_address_of_page,
+						      const u64 count,
+						      const struct h_galpa
+						      galpa)
+{
+	u64 retcode = 0;
+
+	EDEB_EN(7, "pfqp=%p hcp_adapter_handle=%lx qp_handle=%lx pagesize=%x"
+		" queue_type=%x  logical_address_of_page=%lx count=%lx",
+		pfqp, hcp_adapter_handle.handle, qp_handle.handle, pagesize,
+		queue_type, logical_address_of_page, count);
+
+#ifndef EHCA_USE_HCALL
+	retcode = simp_h_register_rpage_qp(hcp_adapter_handle,
+					   qp_handle,
+					   pfqp,
+					   pagesize,
+					   queue_type,
+					   logical_address_of_page,
+					   count, galpa);
+#else
+	if (count != 1) {
+		EDEB_ERR(4, "Page counter=%lx", count);
+		return (H_Parameter);
+	}
+
+	retcode = hipz_h_register_rpage(hcp_adapter_handle,
+					pagesize,
+					queue_type,
+					qp_handle.handle,
+					logical_address_of_page, count);
+#endif /* EHCA_USE_HCALL */
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_remove_rpt_cq(const struct
+						  ipz_adapter_handle
+						  hcp_adapter_handle,
+						  const struct ipz_cq_handle
+						  cq_handle,
+						  struct ehca_pfcq *pfcq)
+{
+	u64 retcode = 0;
+
+	EDEB_EN(7, "pfcq=%p hcp_adapter_handle=%lx  cq_handle=%lx",
+		pfcq, hcp_adapter_handle.handle, cq_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	retcode = simp_h_remove_rpt_cq(hcp_adapter_handle, cq_handle, pfcq);
+#else
+	/* TODO: hcall not implemented */
+#endif
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_remove_rpt_eq(const struct
+						  ipz_adapter_handle
+						  hcp_adapter_handle,
+						  const struct ipz_eq_handle
+						  eq_handle,
+						  struct ehca_pfeq *pfeq)
+{
+	u64 retcode = 0;
+
+	EDEB_EN(7, "hcp_adapter_handle=%lx eq_handle=%lx",
+		hcp_adapter_handle.handle, eq_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	retcode = simp_h_remove_rpt_eq(hcp_adapter_handle, eq_handle, pfeq);
+#else
+	/* TODO: hcall not implemented */
+#endif
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_remove_rpt_qp(const struct
+						  ipz_adapter_handle
+						  hcp_adapter_handle,
+						  const struct ipz_qp_handle
+						  qp_handle,
+						  struct ehca_pfqp *pfqp)
+{
+	u64 retcode = 0;
+
+	EDEB_EN(7, "pfqp=%p hcp_adapter_handle=%lx qp_handle=%lx",
+		pfqp, hcp_adapter_handle.handle, qp_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	retcode = simp_h_remove_rpt_qp(hcp_adapter_handle, qp_handle, pfqp);
+#else
+	/* TODO: hcall not implemented */
+#endif
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_disable_and_get_wqe(const struct
+							ipz_adapter_handle
+							hcp_adapter_handle,
+							const struct
+							ipz_qp_handle qp_handle,
+							struct ehca_pfqp *pfqp,
+							void **log_addr_next_sq_wqe_tb_processed,
+							void **log_addr_next_rq_wqe_tb_processed,
+							int dis_and_get_function_code)
+{
+	u64 retcode = 0;
+	u64 dummy, dummy1, dummy2;
+
+	EDEB_EN(7, "pfqp=%p hcp_adapter_handle=%lx function=%x qp_handle=%lx",
+		pfqp, hcp_adapter_handle.handle, dis_and_get_function_code,
+		qp_handle.handle);
+
+	if (log_addr_next_sq_wqe_tb_processed==NULL) {
+		log_addr_next_sq_wqe_tb_processed = (void**)&dummy1;
+	}
+	if (log_addr_next_rq_wqe_tb_processed==NULL) {
+		log_addr_next_rq_wqe_tb_processed = (void**)&dummy2;
+	}
+#ifndef EHCA_USE_HCALL
+	retcode =
+		simp_h_disable_and_get_wqe(hcp_adapter_handle, qp_handle, pfqp,
+					   log_addr_next_sq_wqe_tb_processed,
+					   log_addr_next_rq_wqe_tb_processed);
+#else
+
+	retcode = plpar_hcall_7arg_7ret(H_DISABLE_AND_GETC,
+					hcp_adapter_handle.handle, /* r4 */
+					dis_and_get_function_code, /* r5 */
+					/* function codes:
+					 * 1 - disable QP, return SQ and RQ WQE pointers
+					 * 2 - return SQ WQE pointer
+					 * 3 - return RQ count */
+					qp_handle.handle,	   /* r6 */
+					0, 0, 0, 0,
+					(void*)log_addr_next_sq_wqe_tb_processed, /* r4 */
+					(void*)log_addr_next_rq_wqe_tb_processed, /* r5 */
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif /* EHCA_USE_HCALL */
+	EDEB_EX(7, "retcode=%lx ladr_next_sq_wqe_out=%p"
+		" ladr_next_rq_wqe_out=%p", retcode,
+		*log_addr_next_sq_wqe_tb_processed,
+		*log_addr_next_rq_wqe_tb_processed);
+
+	return retcode;
+}
+
+enum hcall_sigt {
+	HCALL_SIGT_NO_CQE = 0,
+	HCALL_SIGT_BY_WQE = 1,
+	HCALL_SIGT_EVERY = 2
+};
+
+static inline u64 hipz_h_modify_qp(const struct ipz_adapter_handle
+					      hcp_adapter_handle,
+					      const struct ipz_qp_handle
+					      qp_handle, struct ehca_pfqp *pfqp,
+					      const u64 update_mask,
+					      struct hcp_modify_qp_control_block
+					      *mqpcb,
+					      struct h_galpa gal)
+{
+	u64 retcode = 0;
+	u64 invalid_attribute_identifier = 0;
+	u64 rc_attrib_mask = 0;
+	u64 dummy;
+	u64 r_cb;
+	EDEB_EN(7, "pfqp=%p hcp_adapter_handle=%lx qp_handle=%lx"
+		   " update_mask=%lx qp_state=%x mqpcb=%p",
+		   pfqp, hcp_adapter_handle.handle, qp_handle.handle,
+		   update_mask, mqpcb->qp_state, mqpcb);
+
+#ifndef EHCA_USE_HCALL
+	simp_h_modify_qp(hcp_adapter_handle, qp_handle, pfqp, update_mask,
+			 mqpcb, gal);
+#else
+	r_cb = ehca_kv_to_g(mqpcb);
+	retcode = plpar_hcall_7arg_7ret(H_MODIFY_QP,
+					hcp_adapter_handle.handle,     /* r4 */
+					qp_handle.handle,	       /* r5 */
+					update_mask,	               /* r6 */
+					r_cb,	                       /* r7 */
+					0, 0, 0,
+					&invalid_attribute_identifier, /* r4 */
+					&dummy,	                       /* r5 */
+					&dummy,	                       /* r6 */
+					&dummy,                        /* r7 */
+					&dummy,	                       /* r8 */
+					&rc_attrib_mask,               /* r9 */
+					&dummy);
+#endif
+	if (retcode == H_NOT_ENOUGH_RESOURCES) {
+		EDEB_ERR(4, "Insufficient resources retcode=%lx", retcode);
+	}
+
+	EDEB_EX(7, "retcode=%lx invalid_attribute_identifier=%lx"
+		" invalid_attribute_MASK=%lx", retcode,
+		invalid_attribute_identifier, rc_attrib_mask);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_query_qp(const struct ipz_adapter_handle
+					     hcp_adapter_handle,
+					     const struct ipz_qp_handle
+					     qp_handle, struct ehca_pfqp *pfqp,
+					     struct hcp_modify_qp_control_block
+					     *qqpcb, struct h_galpa gal)
+{
+	u64 retcode = 0;
+	u64 dummy;
+	u64 r_cb;
+	EDEB_EN(7, "hcp_adapter_handle=%lx qp_handle=%lx",
+		hcp_adapter_handle.handle, qp_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	simp_h_query_qp(hcp_adapter_handle, qp_handle, qqpcb, gal);
+#else
+	r_cb = ehca_kv_to_g(qqpcb);
+	EDEB(7, "r_cb=%lx", r_cb);
+
+	retcode = plpar_hcall_7arg_7ret(H_QUERY_QP,
+					hcp_adapter_handle.handle, /* r4 */
+					qp_handle.handle,          /* r5 */
+					r_cb,	                   /* r6 */
+					0, 0, 0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+
+#endif
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_destroy_qp(const struct ipz_adapter_handle
+					       hcp_adapter_handle,
+					       struct ehca_qp *qp)
+{
+	u64 retcode = 0;
+	u64 dummy;
+	u64 ladr_next_sq_wqe_out;
+	u64 ladr_next_rq_wqe_out;
+
+	EDEB_EN(7, "qp=%p ipz_qp_handle=%lx adapter_handle=%lx",
+		qp, qp->ipz_qp_handle.handle, hcp_adapter_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	retcode =
+		simp_h_destroy_qp(hcp_adapter_handle, qp,
+				  qp->ehca_qp_core.galpas.user);
+#else
+
+	retcode = hcp_galpas_dtor(&qp->ehca_qp_core.galpas);
+
+	retcode = plpar_hcall_7arg_7ret(H_DISABLE_AND_GETC,
+					hcp_adapter_handle.handle, /* r4 */
+					/* function code */
+					1,	                   /* r5 */
+					qp->ipz_qp_handle.handle,  /* r6 */
+					0, 0, 0, 0,
+					&ladr_next_sq_wqe_out,     /* r4 */
+					&ladr_next_rq_wqe_out,     /* r5 */
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+	if (retcode == H_Hardware) {
+		EDEB_ERR(4, "HCA not operational. retcode=%lx", retcode);
+	}
+
+	retcode = plpar_hcall_7arg_7ret(H_FREE_RESOURCE,
+					hcp_adapter_handle.handle, /* r4 */
+					qp->ipz_qp_handle.handle,  /* r5 */
+					0, 0, 0, 0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif /* EHCA_USE_HCALL */
+
+	if (retcode == H_Resource) {
+		EDEB_ERR(4, "Resource still in use. retcode=%lx", retcode);
+	}
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_define_aqp0(const struct ipz_adapter_handle
+						hcp_adapter_handle,
+						const struct ipz_qp_handle
+						qp_handle, struct h_galpa gal,
+						u32 port)
+{
+	u64 retcode = 0;
+	u64 dummy;
+
+	EDEB_EN(7, "port=%x ipz_qp_handle=%lx adapter_handle=%lx",
+		port, qp_handle.handle, hcp_adapter_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	/* TODO: not implemented yet */
+#else
+
+	retcode = plpar_hcall_7arg_7ret(H_DEFINE_AQP0,
+					hcp_adapter_handle.handle, /* r4 */
+					qp_handle.handle,	   /* r5 */
+					port,                      /* r6 */
+					0, 0, 0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+
+#endif /* EHCA_USE_HCALL */
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_define_aqp1(const struct ipz_adapter_handle
+						hcp_adapter_handle,
+						const struct ipz_qp_handle
+						qp_handle, struct h_galpa gal,
+						u32 port, u32 * pma_qp_nr,
+						u32 * bma_qp_nr)
+{
+	u64 retcode = 0;
+	u64 dummy;
+	u64 pma_qp_nr_out;
+	u64 bma_qp_nr_out;
+
+	EDEB_EN(7, "port=%x qp_handle=%lx adapter_handle=%lx",
+		port, qp_handle.handle, hcp_adapter_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	/* TODO: not implemented yet */
+#else
+
+	retcode = plpar_hcall_7arg_7ret(H_DEFINE_AQP1,
+					hcp_adapter_handle.handle, /* r4 */
+					qp_handle.handle,	   /* r5 */
+					port,	                   /* r6 */
+					0, 0, 0, 0,
+					&pma_qp_nr_out,            /* r4 */
+					&bma_qp_nr_out,            /* r5 */
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+	*pma_qp_nr = (u32) pma_qp_nr_out;
+	*bma_qp_nr = (u32) bma_qp_nr_out;
+
+#endif
+	if (retcode == H_ALIAS_EXIST) {
+		EDEB_ERR(4, "AQP1 already exists. retcode=%lx", retcode);
+	}
+
+	EDEB_EX(7, "retcode=%lx pma_qp_nr=%i bma_qp_nr=%i",
+		retcode, (int)*pma_qp_nr, (int)*bma_qp_nr);
+
+	return retcode;
+}
+
+/* TODO: Don't use ib_* types in this file */
+static inline u64 hipz_h_attach_mcqp(const struct ipz_adapter_handle
+						hcp_adapter_handle,
+						const struct ipz_qp_handle
+						qp_handle, struct h_galpa gal,
+						u16 mcg_dlid, union ib_gid dgid)
+{
+	u64 retcode = 0;
+	u64 dummy;
+
+	EDEB_EN(7, "qp_handle=%lx adapter_handle=%lx\nMCG_DGID ="
+		" %d.%d.%d.%d.%d.%d.%d.%d."
+		" %d.%d.%d.%d.%d.%d.%d.%d\n",
+		qp_handle.handle, hcp_adapter_handle.handle,
+		dgid.raw[0], dgid.raw[1],
+		dgid.raw[2], dgid.raw[3],
+		dgid.raw[4], dgid.raw[5],
+		dgid.raw[6], dgid.raw[7],
+		dgid.raw[0 + 8], dgid.raw[1 + 8],
+		dgid.raw[2 + 8], dgid.raw[3 + 8],
+		dgid.raw[4 + 8], dgid.raw[5 + 8],
+		dgid.raw[6 + 8], dgid.raw[7 + 8]);
+
+#ifndef EHCA_USE_HCALL
+	/* TODO: not implemented yet */
+#else
+	retcode = plpar_hcall_7arg_7ret(H_ATTACH_MCQP,
+					hcp_adapter_handle.handle, /* r4 */
+					qp_handle.handle,          /* r5 */
+					mcg_dlid,                  /* r6 */
+					dgid.global.interface_id,  /* r7 */
+					dgid.global.subnet_prefix, /* r8 */
+					0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif /* EHCA_USE_HCALL */
+	if (retcode == H_NOT_ENOUGH_RESOURCES) {
+		EDEB_ERR(4, "Not enough resources. retcode=%lx", retcode);
+	}
+
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_detach_mcqp(const struct ipz_adapter_handle
+						hcp_adapter_handle,
+						const struct ipz_qp_handle
+						qp_handle, struct h_galpa gal,
+						u16 mcg_dlid, union ib_gid dgid)
+{
+	u64 retcode = 0;
+	u64 dummy;
+
+	EDEB_EN(7, "qp_handle=%lx adapter_handle=%lx\nMCG_DGID ="
+		" %d.%d.%d.%d.%d.%d.%d.%d."
+		" %d.%d.%d.%d.%d.%d.%d.%d\n",
+		qp_handle.handle, hcp_adapter_handle.handle,
+		dgid.raw[0], dgid.raw[1],
+		dgid.raw[2], dgid.raw[3],
+		dgid.raw[4], dgid.raw[5],
+		dgid.raw[6], dgid.raw[7],
+		dgid.raw[0 + 8], dgid.raw[1 + 8],
+		dgid.raw[2 + 8], dgid.raw[3 + 8],
+		dgid.raw[4 + 8], dgid.raw[5 + 8],
+		dgid.raw[6 + 8], dgid.raw[7 + 8]);
+#ifndef EHCA_USE_HCALL
+	/* TODO: not implemented yet */
+#else
+	retcode = plpar_hcall_7arg_7ret(H_DETACH_MCQP,
+					hcp_adapter_handle.handle, /* r4 */
+					qp_handle.handle,	   /* r5 */
+					mcg_dlid,	           /* r6 */
+					dgid.global.interface_id,  /* r7 */
+					dgid.global.subnet_prefix, /* r8 */
+					0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif /* EHCA_USE_HCALL */
+	EDEB(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_destroy_cq(const struct ipz_adapter_handle
+					       hcp_adapter_handle,
+					       struct ehca_cq *cq,
+					       u8 force_flag)
+{
+	u64 retcode = 0;
+	u64 dummy;
+
+	EDEB_EN(7, "cq->pf=%p cq=%p ipz_cq_handle=%lx adapter_handle=%lx",
+		&cq->pf, cq, cq->ipz_cq_handle.handle, hcp_adapter_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	simp_h_destroy_cq(hcp_adapter_handle, cq,
+			  cq->ehca_cq_core.galpas.kernel);
+#else
+	retcode = hcp_galpas_dtor(&cq->ehca_cq_core.galpas);
+	if (retcode != 0) {
+		EDEB_ERR(4, "Could not destruct cq->galpas");
+		return (H_Resource);
+	}
+
+	retcode = plpar_hcall_7arg_7ret(H_FREE_RESOURCE,
+					hcp_adapter_handle.handle, /* r4 */
+					cq->ipz_cq_handle.handle,  /* r5 */
+					force_flag!=0 ? 1L : 0L,   /* r6 */
+					0, 0, 0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+#endif
+
+	if (retcode == H_Resource) {
+		EDEB_ERR(4, "Resource in use. retcode=%lx", retcode);
+	}
+
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+static inline u64 hipz_h_destroy_eq(const struct ipz_adapter_handle
+					       hcp_adapter_handle,
+					       struct ehca_eq *eq)
+{
+	u64 retcode = 0;
+	u64 dummy;
+
+	EDEB_EN(7, "eq->pf=%p eq=%p ipz_eq_handle=%lx adapter_handle=%lx",
+		&eq->pf, eq, eq->ipz_eq_handle.handle,
+		hcp_adapter_handle.handle);
+
+#ifndef EHCA_USE_HCALL
+	/* TODO: not implemented yet */
+#else
+
+	retcode = hcp_galpas_dtor(&eq->galpas);
+	if (retcode != 0) {
+		EDEB_ERR(4, "Could not destruct eq->galpas");
+		return (H_Resource);
+	}
+
+	retcode = plpar_hcall_7arg_7ret(H_FREE_RESOURCE,
+					hcp_adapter_handle.handle, /* r4 */
+					eq->ipz_eq_handle.handle,  /* r5 */
+					0, 0, 0, 0, 0,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy,
+					&dummy);
+
+#endif
+	if (retcode == H_Resource) {
+		EDEB_ERR(4, "Resource in use. retcode=%lx ", retcode);
+	}
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return retcode;
+}
+
+/**
+ * hipz_h_alloc_resource_mr - Allocate MR resources in HW and FW, initialize
+ * resources.
+ *
+ * @pfmr:        platform specific for MR
+ * @pfshca:      platform specific for SHCA
+ * @vaddr:       Memory Region I/O Virtual Address
+ * @length:      Memory Region Length
+ * @access_ctrl: Memory Region Access Controls
+ * @pd:          Protection Domain
+ * @mr_handle:   Memory Region Handle
+ */
+static inline u64 hipz_h_alloc_resource_mr(const struct ipz_adapter_handle
+						      hcp_adapter_handle,
+						      struct ehca_pfmr *pfmr,
+						      struct ehca_pfshca
+						      *pfshca,
+						      const u64 vaddr,
+						      const u64 length,
+						      const u32 access_ctrl,
+						      const struct ipz_pd pd,
+						      struct ipz_mrmw_handle
+						      *mr_handle,
+						      u32 * lkey,
+						      u32 * rkey)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 lkey_out;
+	u64 rkey_out;
+
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p vaddr=%lx length=%lx"
+		" access_ctrl=%x pd=%x pfshca=%p",
+		hcp_adapter_handle.handle, pfmr, vaddr, length, access_ctrl,
+		pd.value, pfshca);
+
+#ifndef EHCA_USE_HCALL
+	rc = simp_hcz_h_alloc_resource_mr(hcp_adapter_handle,
+					  pfmr,
+					  pfshca,
+					  vaddr,
+					  length,
+					  access_ctrl,
+					  pd,
+					  (struct hcz_mrmw_handle *)mr_handle,
+					  lkey, rkey);
+	EDEB_EX(7, "rc=%lx mr_handle.mrwpte=%p mr_handle.page_index=%x"
+		" lkey=%x rkey=%x",
+		rc, mr_handle->mrwpte, mr_handle->page_index, *lkey, *rkey);
+#else
+
+	rc = plpar_hcall_7arg_7ret(H_ALLOC_RESOURCE,
+				   hcp_adapter_handle.handle,        /* r4 */
+				   5,                                /* r5 */
+				   vaddr,                            /* r6 */
+				   length,                           /* r7 */
+				   ((((u64) access_ctrl) << 32ULL)), /* r8 */
+				   pd.value,                         /* r9 */
+				   0,
+				   &mr_handle->handle,               /* r4 */
+				   &dummy,                           /* r5 */
+				   &lkey_out,                        /* r6 */
+				   &rkey_out,                        /* r7 */
+				   &dummy,
+				   &dummy,
+				   &dummy);
+	*lkey = (u32) lkey_out;
+	*rkey = (u32) rkey_out;
+
+	EDEB_EX(7, "rc=%lx mr_handle=%lx lkey=%x rkey=%x",
+		rc, mr_handle->handle, *lkey, *rkey);
+#endif /* EHCA_USE_HCALL */
+
+	return rc;
+}
+
+/**
+ * hipz_h_register_rpage_mr - Register MR resource page in HW and FW.
+ *
+ * @pfmr:       platform specific for MR
+ * @pfshca:     platform specific for SHCA
+ * @queue_type: must be zero for MR
+ */
+static inline u64 hipz_h_register_rpage_mr(const struct ipz_adapter_handle
+						      hcp_adapter_handle,
+						      const struct ipz_mrmw_handle
+						      *mr_handle,
+						      struct ehca_pfmr *pfmr,
+						      struct ehca_pfshca *pfshca,
+						      const u8 pagesize,
+						      const u8 queue_type,
+						      const u64
+						      logical_address_of_page,
+						      const u64 count)
+{
+	u64 rc = H_Success;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle.mrwpte=%p"
+		" mr_handle.page_index=%x pagesize=%x queue_type=%x"
+		" logical_address_of_page=%lx count=%lx pfshca=%p",
+		hcp_adapter_handle.handle, pfmr, mr_handle->mrwpte,
+		mr_handle->page_index, pagesize, queue_type,
+		logical_address_of_page, count, pfshca);
+
+	rc = simp_hcz_h_register_rpage_mr(hcp_adapter_handle,
+					  (struct hcz_mrmw_handle *)mr_handle,
+					  pfmr,
+					  pfshca,
+					  pagesize,
+					  queue_type,
+					  logical_address_of_page, count);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle=%lx pagesize=%x"
+		" queue_type=%x logical_address_of_page=%lx count=%lx",
+		hcp_adapter_handle.handle, pfmr, mr_handle->handle, pagesize,
+		queue_type, logical_address_of_page, count);
+
+	if ((count > 1) && (logical_address_of_page & 0xfff)) {
+		ehca_catastrophic("ERROR: logical_address_of_page "
+				  "not on a 4k boundary");
+		rc = H_Parameter;
+	} else {
+		rc = hipz_h_register_rpage(hcp_adapter_handle, pagesize,
+					   queue_type, mr_handle->handle,
+					   logical_address_of_page, count);
+	}
+#endif /* EHCA_USE_HCALL */
+	EDEB_EX(7, "rc=%lx", rc);
+
+	return rc;
+}
+
+/**
+ * hipz_h_query_mr - Query MR in HW and FW.
+ *
+ * @pfmr:             platform specific for MR
+ * @mr_handle:        Memory Region Handle
+ * @mr_local_length:  Local MR Length
+ * @mr_local_vaddr:   Local MR I/O Virtual Address
+ * @mr_remote_length: Remote MR Length
+ * @mr_remote_vaddr:  Remote MR I/O Virtual Address
+ * @access_ctrl:      Memory Region Access Controls
+ * @pd:               Protection Domain
+ * @lkey:             L_Key
+ * @rkey:             R_Key
+ */
+static inline u64 hipz_h_query_mr(const struct ipz_adapter_handle
+					     hcp_adapter_handle,
+					     struct ehca_pfmr *pfmr,
+					     const struct ipz_mrmw_handle
+					     *mr_handle,
+					     u64 * mr_local_length,
+					     u64 * mr_local_vaddr,
+					     u64 * mr_remote_length,
+					     u64 * mr_remote_vaddr,
+					     u32 * access_ctrl,
+					     struct ipz_pd *pd,
+					     u32 * lkey,
+					     u32 * rkey)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 acc_ctrl_pd_out;
+	u64 r9_out;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle.mrwpte=%p"
+		" mr_handle.page_index=%x",
+		hcp_adapter_handle.handle, pfmr, mr_handle->mrwpte,
+		mr_handle->page_index);
+
+	rc = simp_hcz_h_query_mr(hcp_adapter_handle,
+				 pfmr,
+				 mr_handle,
+				 mr_local_length,
+				 mr_local_vaddr,
+				 mr_remote_length,
+				 mr_remote_vaddr, access_ctrl, pd, lkey, rkey);
+
+	EDEB_EX(7, "rc=%lx mr_local_length=%lx mr_local_vaddr=%lx"
+		" mr_remote_length=%lx mr_remote_vaddr=%lx access_ctrl=%x"
+		" pd=%x lkey=%x rkey=%x",
+		rc, *mr_local_length, *mr_local_vaddr, *mr_remote_length,
+		*mr_remote_vaddr, *access_ctrl, pd->value, *lkey, *rkey);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle=%lx",
+		hcp_adapter_handle.handle, pfmr, mr_handle->handle);
+
+
+	rc = plpar_hcall_7arg_7ret(H_QUERY_MR,
+				   hcp_adapter_handle.handle, /* r4 */
+				   mr_handle->handle,         /* r5 */
+				   0, 0, 0, 0, 0,
+				   mr_local_length,           /* r4 */
+				   mr_local_vaddr,            /* r5 */
+				   mr_remote_length,          /* r6 */
+				   mr_remote_vaddr,           /* r7 */
+				   &acc_ctrl_pd_out,          /* r8 */
+				   &r9_out,
+				   &dummy);
+
+	*access_ctrl = acc_ctrl_pd_out >> 32;
+	pd->value = (u32) acc_ctrl_pd_out;
+	*lkey = (u32) (r9_out >> 32);
+	*rkey = (u32) (r9_out & (0xffffffff));
+
+	EDEB_EX(7, "rc=%lx mr_local_length=%lx mr_local_vaddr=%lx"
+		" mr_remote_length=%lx mr_remote_vaddr=%lx access_ctrl=%x"
+		" pd=%x lkey=%x rkey=%x",
+		rc, *mr_local_length, *mr_local_vaddr, *mr_remote_length,
+		*mr_remote_vaddr, *access_ctrl, pd->value, *lkey, *rkey);
+#endif /* EHCA_USE_HCALL */
+
+	return rc;
+}
+
+/**
+ * hipz_h_free_resource_mr - Free MR resources in HW and FW.
+ *
+ * @pfmr:      platform specific for MR
+ * @mr_handle: Memory Region Handle
+ */
+static inline u64 hipz_h_free_resource_mr(const struct ipz_adapter_handle
+						     hcp_adapter_handle,
+						     struct ehca_pfmr *pfmr,
+						     const struct ipz_mrmw_handle
+						     *mr_handle)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle.mrwpte=%p"
+		" mr_handle.page_index=%x",
+		hcp_adapter_handle.handle, pfmr, mr_handle->mrwpte,
+		mr_handle->page_index);
+
+	rc = simp_hcz_h_free_resource_mr(hcp_adapter_handle, pfmr, mr_handle);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p mr_handle=%lx",
+		hcp_adapter_handle.handle, pfmr, mr_handle->handle);
+
+	rc = plpar_hcall_7arg_7ret(H_FREE_RESOURCE,
+				   hcp_adapter_handle.handle, /* r4 */
+				   mr_handle->handle,         /* r5 */
+				   0, 0, 0, 0, 0,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy);
+#endif /* EHCA_USE_HCALL */
+	EDEB_EX(7, "rc=%lx", rc);
+
+	return rc;
+}
+
+/**
+ * hipz_h_reregister_pmr - Reregister MR in HW and FW.
+ *
+ * @pfmr:        platform specific for MR
+ * @pfshca:      platform specific for SHCA
+ * @mr_handle:   Memory Region Handle
+ * @vaddr_in:    Memory Region I/O Virtual Address
+ * @length:      Memory Region Length
+ * @access_ctrl: Memory Region Access Controls
+ * @pd:          Protection Domain
+ * @mr_addr_cb:  Logical Address of MR Control Block
+ * @vaddr_out:   Memory Region I/O Virtual Address
+ * @lkey:        L_Key
+ * @rkey:        R_Key
+ *
+ */
+static inline u64 hipz_h_reregister_pmr(const struct ipz_adapter_handle
+						   hcp_adapter_handle,
+						   struct ehca_pfmr *pfmr,
+						   struct ehca_pfshca *pfshca,
+						   const struct ipz_mrmw_handle
+						   *mr_handle,
+						   const u64 vaddr_in,
+						   const u64 length,
+						   const u32 access_ctrl,
+						   const struct ipz_pd pd,
+						   const u64 mr_addr_cb,
+						   u64 * vaddr_out,
+						   u32 * lkey,
+						   u32 * rkey)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 lkey_out;
+	u64 rkey_out;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p pfshca=%p"
+		" mr_handle.mrwpte=%p mr_handle.page_index=%x vaddr_in=%lx"
+		" length=%lx access_ctrl=%x pd=%x mr_addr_cb=%lx",
+		hcp_adapter_handle.handle, pfmr, pfshca, mr_handle->mrwpte,
+		mr_handle->page_index, vaddr_in, length, access_ctrl,
+		pd.value, mr_addr_cb);
+
+	rc = simp_hcz_h_reregister_pmr(hcp_adapter_handle, pfmr, pfshca,
+				       mr_handle, vaddr_in, length, access_ctrl,
+				       pd, mr_addr_cb, vaddr_out, lkey, rkey);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p pfshca=%p mr_handle=%lx "
+		"vaddr_in=%lx length=%lx access_ctrl=%x pd=%x mr_addr_cb=%lx",
+		hcp_adapter_handle.handle, pfmr, pfshca, mr_handle->handle,
+		vaddr_in, length, access_ctrl, pd.value, mr_addr_cb);
+
+	rc = plpar_hcall_7arg_7ret(H_REREGISTER_PMR,
+				   hcp_adapter_handle.handle, /* r4 */
+				   mr_handle->handle,	      /* r5 */
+				   vaddr_in,	              /* r6 */
+				   length,                    /* r7 */
+				   /* r8 */
+				   ((((u64) access_ctrl) << 32ULL) | pd.value),
+				   mr_addr_cb,                /* r9 */
+				   0,
+				   &dummy,                    /* r4 */
+				   vaddr_out,                 /* r5 */
+				   &lkey_out,                 /* r6 */
+				   &rkey_out,                 /* r7 */
+				   &dummy,
+				   &dummy,
+				   &dummy);
+	*lkey = (u32) lkey_out;
+	*rkey = (u32) rkey_out;
+#endif /* EHCA_USE_HCALL */
+
+	EDEB_EX(7, "rc=%lx vaddr_out=%lx lkey=%x rkey=%x",
+		rc, *vaddr_out, *lkey, *rkey);
+	return rc;
+}
+
+/**  @brief
+     as defined in carols hcall document
+*/
+
+/**
+ * hipz_h_register_smr - Register shared MR in HW and FW.
+ *
+ * @pfmr:           platform specific for new shared MR
+ * @orig_pfmr:      platform specific for original MR
+ * @pfshca:         platform specific for SHCA
+ * @orig_mr_handle: Memory Region Handle of original MR
+ * @vaddr_in:       Memory Region I/O Virtual Address of new shared MR
+ * @access_ctrl:    Memory Region Access Controls of new shared MR
+ * @pd:             Protection Domain of new shared MR
+ * @mr_handle:      Memory Region Handle of new shared MR
+ * @lkey:           L_Key of new shared MR
+ * @rkey:           R_Key of new shared MR
+ */
+static inline u64 hipz_h_register_smr(const struct ipz_adapter_handle
+						 hcp_adapter_handle,
+						 struct ehca_pfmr *pfmr,
+						 struct ehca_pfmr *orig_pfmr,
+						 struct ehca_pfshca *pfshca,
+						 const struct ipz_mrmw_handle
+						 *orig_mr_handle,
+						 const u64 vaddr_in,
+						 const u32 access_ctrl,
+						 const struct ipz_pd pd,
+						 struct ipz_mrmw_handle
+						 *mr_handle,
+						 u32 * lkey,
+						 u32 * rkey)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 lkey_out;
+	u64 rkey_out;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmr=%p orig_pfmr=%p pfshca=%p"
+		" orig_mr_handle.mrwpte=%p orig_mr_handle.page_index=%x"
+		" vaddr_in=%lx access_ctrl=%x pd=%x",
+		hcp_adapter_handle.handle, pfmr, orig_pfmr, pfshca,
+		orig_mr_handle->mrwpte, orig_mr_handle->page_index,
+		vaddr_in, access_ctrl, pd.value);
+
+	rc = simp_hcz_h_register_smr(hcp_adapter_handle, pfmr, orig_pfmr,
+				     pfshca, orig_mr_handle, vaddr_in,
+				     access_ctrl, pd,
+				     (struct hcz_mrmw_handle *)mr_handle, lkey,
+				     rkey);
+	EDEB_EX(7, "rc=%lx mr_handle.mrwpte=%p mr_handle.page_index=%x"
+		" lkey=%x rkey=%x",
+		rc, mr_handle->mrwpte, mr_handle->page_index, *lkey, *rkey);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx orig_pfmr=%p pfshca=%p"
+		" orig_mr_handle=%lx vaddr_in=%lx access_ctrl=%x pd=%x",
+		hcp_adapter_handle.handle, orig_pfmr, pfshca,
+		orig_mr_handle->handle, vaddr_in, access_ctrl, pd.value);
+
+
+	rc = plpar_hcall_7arg_7ret(H_REGISTER_SMR,
+				   hcp_adapter_handle.handle,        /* r4 */
+				   orig_mr_handle->handle,           /* r5 */
+				   vaddr_in,                         /* r6 */
+				   ((((u64) access_ctrl) << 32ULL)), /* r7 */
+				   pd.value,                         /* r8 */
+				   0, 0,
+				   &mr_handle->handle,               /* r4 */
+				   &dummy,                           /* r5 */
+				   &lkey_out,                        /* r6 */
+				   &rkey_out,                        /* r7 */
+				   &dummy,
+				   &dummy,
+				   &dummy);
+	*lkey = (u32) lkey_out;
+	*rkey = (u32) rkey_out;
+
+	EDEB_EX(7, "rc=%lx mr_handle=%lx lkey=%x rkey=%x",
+		rc, mr_handle->handle, *lkey, *rkey);
+#endif /* EHCA_USE_HCALL */
+
+	return rc;
+}
+
+static inline u64 hipz_h_alloc_resource_mw(const struct ipz_adapter_handle
+						      hcp_adapter_handle,
+						      struct ehca_pfmw *pfmw,
+						      struct ehca_pfshca *pfshca,
+						      const struct ipz_pd pd,
+						      struct ipz_mrmw_handle *mw_handle,
+						      u32 * rkey)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 rkey_out;
+
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p pd=%x pfshca=%p",
+		    hcp_adapter_handle.handle, pfmw, pd.value, pfshca);
+
+#ifndef EHCA_USE_HCALL
+
+	rc = simp_hcz_h_alloc_resource_mw(hcp_adapter_handle, pfmw, pfshca, pd,
+					  (struct hcz_mrmw_handle *)mw_handle,
+					  rkey);
+	EDEB_EX(7, "rc=%lx mw_handle.mrwpte=%p mw_handle.page_index=%x rkey=%x",
+		rc, mw_handle->mrwpte, mw_handle->page_index, *rkey);
+#else
+	rc = plpar_hcall_7arg_7ret(H_ALLOC_RESOURCE,
+				   hcp_adapter_handle.handle, /* r4 */
+				   6,                         /* r5 */
+				   pd.value,                  /* r6 */
+				   0, 0, 0, 0,
+				   &mw_handle->handle,        /* r4 */
+				   &dummy,                    /* r5 */
+				   &dummy,                    /* r6 */
+				   &rkey_out,                 /* r7 */
+				   &dummy,
+				   &dummy,
+				   &dummy);
+	*rkey = (u32) rkey_out;
+
+	EDEB_EX(7, "rc=%lx mw_handle=%lx rkey=%x",
+		rc, mw_handle->handle, *rkey);
+#endif /* EHCA_USE_HCALL */
+	return rc;
+}
+
+static inline u64 hipz_h_query_mw(const struct ipz_adapter_handle
+					     hcp_adapter_handle,
+					     struct ehca_pfmw *pfmw,
+					     const struct ipz_mrmw_handle
+					     *mw_handle,
+					     u32 * rkey,
+					     struct ipz_pd *pd)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 pd_out;
+	u64 rkey_out;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p mw_handle.mrwpte=%p"
+		" mw_handle.page_index=%x",
+		hcp_adapter_handle.handle, pfmw, mw_handle->mrwpte,
+		mw_handle->page_index);
+
+	rc = simp_hcz_h_query_mw(hcp_adapter_handle, pfmw, mw_handle, rkey, pd);
+
+	EDEB_EX(7, "rc=%lx rkey=%x pd=%x", rc, *rkey, pd->value);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p mw_handle=%lx",
+		hcp_adapter_handle.handle, pfmw, mw_handle->handle);
+
+	rc = plpar_hcall_7arg_7ret(H_QUERY_MW,
+				   hcp_adapter_handle.handle, /* r4 */
+				   mw_handle->handle,         /* r5 */
+				   0, 0, 0, 0, 0,
+				   &dummy,                    /* r4 */
+				   &dummy,                    /* r5 */
+				   &dummy,                    /* r6 */
+				   &rkey_out,                 /* r7 */
+				   &pd_out,                   /* r8 */
+				   &dummy,
+				   &dummy);
+	*rkey = (u32) rkey_out;
+	pd->value = (u32) pd_out;
+
+	EDEB_EX(7, "rc=%lx rkey=%x pd=%x", rc, *rkey, pd->value);
+#endif /* EHCA_USE_HCALL */
+
+	return rc;
+}
+
+static inline u64 hipz_h_free_resource_mw(const struct ipz_adapter_handle
+						     hcp_adapter_handle,
+						     struct ehca_pfmw *pfmw,
+						     const struct ipz_mrmw_handle
+						     *mw_handle)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+
+#ifndef EHCA_USE_HCALL
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p mw_handle.mrwpte=%p"
+		" mw_handle.page_index=%x",
+		hcp_adapter_handle.handle, pfmw, mw_handle->mrwpte,
+		mw_handle->page_index);
+
+	rc = simp_hcz_h_free_resource_mw(hcp_adapter_handle, pfmw, mw_handle);
+#else
+	EDEB_EN(7, "hcp_adapter_handle=%lx pfmw=%p mw_handle=%lx",
+		hcp_adapter_handle.handle, pfmw, mw_handle->handle);
+
+	rc = plpar_hcall_7arg_7ret(H_FREE_RESOURCE,
+				   hcp_adapter_handle.handle, /* r4 */
+				   mw_handle->handle,         /* r5 */
+				   0, 0, 0, 0, 0,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy);
+#endif /* EHCA_USE_HCALL */
+	EDEB_EX(7, "rc=%lx", rc);
+
+	return rc;
+}
+
+static inline u64 hipz_h_error_data(const struct ipz_adapter_handle
+					       adapter_handle,
+					       const u64 ressource_handle,
+					       void *rblock,
+					       unsigned long *byte_count)
+{
+	u64 rc = H_Success;
+	u64 dummy;
+	u64 r_cb;
+
+	EDEB_EN(7, "adapter_handle=%lx ressource_handle=%lx  rblock=%p",
+		adapter_handle.handle, ressource_handle, rblock);
+
+	if ((((u64)rblock) & 0xfff) != 0) {
+		EDEB_ERR(4, "rblock not page aligned.");
+		rc = H_Parameter;
+		return rc;
+	}
+
+	r_cb = ehca_kv_to_g(rblock);
+
+	rc = plpar_hcall_7arg_7ret(H_ERROR_DATA,
+				   adapter_handle.handle,
+				   ressource_handle,
+				   r_cb,
+				   0, 0, 0, 0,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy,
+				   &dummy);
+
+	EDEB_EX(7, "rc=%lx", rc);
+
+	return rc;
+}
+
+#endif /* __HCP_IF_H__ */
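Reviewer note: several wrappers above carry two 32-bit values in one 64-bit hcall register (access_ctrl and pd in hipz_h_reregister_pmr(), lkey and rkey in hipz_h_query_mr()) and reject buffers that are not on a 4k boundary. A minimal user-space sketch of just that arithmetic follows. The helper names are hypothetical and are not part of the patch; the sketch only assumes the "high word / low word" split that the shifts and casts in the code above already show.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of these exist in ehca. */

/* Pack two 32-bit values into one 64-bit register image:
 * hi goes into the upper word, lo into the lower word. */
static inline uint64_t pack_u32_pair(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Recover the upper 32 bits of a register image. */
static inline uint32_t unpack_hi(uint64_t reg)
{
	return (uint32_t)(reg >> 32);
}

/* Recover the lower 32 bits of a register image. */
static inline uint32_t unpack_lo(uint64_t reg)
{
	return (uint32_t)(reg & 0xffffffffULL);
}

/* Nonzero if addr lies on a 4 KiB boundary (low 12 bits clear),
 * mirroring the "& 0xfff" tests in the wrappers. */
static inline int is_4k_aligned(uint64_t addr)
{
	return (addr & 0xfffULL) == 0;
}
```

Named helpers like these would keep the casts in one place, at the cost of a little indirection in the wrappers.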


From rolandd at cisco.com  Sat Feb 18 11:57:14 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:14 -0800
Subject: [PATCH 04/22] OF adapter probing
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005712.13620.82908.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

hipz_probe_adapters() looks a little funny -- it seems to bail out
of all the remaining adapters if one of them isn't quite right.
---
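To make the remark above concrete: one possible alternative is to skip a node that lacks a location code and keep scanning, rather than leaving the loop. The sketch below is user-space C with a hypothetical stand-in for the device-tree node (struct fake_node and probe_skip_bad() are made up for illustration). It does not use the Open Firmware API, and whether skipping is the right policy for ehca is an open question, not something this patch decides.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device-tree node. A NULL loc_code models
 * a node that has no ibm,loc-code property. */
struct fake_node {
	const char *loc_code;
};

/*
 * Store the location code of every usable node in adapter_list and
 * return how many were stored. A node without a location code is
 * skipped; it does not end the scan. At most list_len entries are
 * written.
 */
static int probe_skip_bad(const struct fake_node *nodes, size_t n_nodes,
			  const char **adapter_list, size_t list_len)
{
	size_t i;
	int num = 0;

	for (i = 0; i < n_nodes && (size_t)num < list_len; i++) {
		if (nodes[i].loc_code == NULL)
			continue;	/* skip this one, keep scanning */
		adapter_list[num++] = nodes[i].loc_code;
	}

	return num;
}
```

With three nodes where the middle one has no location code, this variant reports two adapters, while a loop that leaves on the first bad node would report one.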

 drivers/infiniband/hw/ehca/hcp_sense.c |  144 ++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/ehca/hcp_sense.h |  136 ++++++++++++++++++++++++++++++
 2 files changed, 280 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/hcp_sense.c b/drivers/infiniband/hw/ehca/hcp_sense.c
new file mode 100644
index 0000000..83fa4a3
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hcp_sense.c
@@ -0,0 +1,144 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  ehca detection and query code for POWER
+ *
+ *  Authors: Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hcp_sense.c,v 1.10 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#define DEB_PREFIX "snse"
+
+#include "ehca_kernel.h"
+#include "ehca_tools.h"
+
+int hipz_count_adapters(void)
+{
+	int num = 0;
+	struct device_node *dn = NULL;
+
+	EDEB_EN(7, "");
+
+	while ((dn = of_find_node_by_name(dn, "lhca"))) {
+		num++;
+	}
+
+	of_node_put(dn);
+
+	if (num == 0) {
+		EDEB_ERR(4, "No lhca node name was found in the"
+			 " Open Firmware device tree.");
+		return -ENODEV;
+	}
+
+	EDEB(6, " ... found %x adapter(s)", num);
+
+	EDEB_EX(7, "num=%x", num);
+
+	return num;
+}
+
+int hipz_probe_adapters(char **adapter_list)
+{
+	int ret = 0;
+	int num = 0;
+	struct device_node *dn = NULL;
+	char *loc;
+
+	EDEB_EN(7, "adapter_list=%p", adapter_list);
+
+	while ((dn = of_find_node_by_name(dn, "lhca"))) {
+		loc = get_property(dn, "ibm,loc-code", NULL);
+		if (loc == NULL) {
+			EDEB_ERR(4, "No ibm,loc-code property for"
+				 " lhca Open Firmware device tree node.");
+			ret = -ENODEV;
+			goto probe_adapters0;
+		}
+
+		adapter_list[num] = loc;
+		EDEB(6, " ... found adapter[%x] with loc-code: %s", num, loc);
+		num++;
+	}
+
+probe_adapters0:
+	of_node_put(dn);
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+u64 hipz_get_adapter_handle(char *adapter)
+{
+	struct device_node *dn = NULL;
+	char *loc;
+	u64 *u64data = NULL;
+	u64 ret = 0;
+
+	EDEB_EN(7, "adapter=%p", adapter);
+
+	while ((dn = of_find_node_by_name(dn, "lhca"))) {
+		loc = get_property(dn, "ibm,loc-code", NULL);
+		if (loc == NULL) {
+			EDEB_ERR(4, "No ibm,loc-code property for"
+				 " lhca Open Firmware device tree node.");
+			goto get_adapter_handle0;
+		}
+
+		if (strcmp(loc, adapter) == 0) {
+			u64data =
+			    (u64 *) get_property(dn, "ibm,hca-handle", NULL);
+			break;
+		}
+	}
+
+	if (u64data == NULL) {
+		EDEB_ERR(4, "No ibm,hca-handle property for"
+			 " lhca Open Firmware device tree node with"
+			 " ibm,loc-code: %s.", adapter);
+		goto get_adapter_handle0;
+	}
+
+	ret = *u64data;
+
+get_adapter_handle0:
+	of_node_put(dn);
+
+	EDEB_EX(7, "ret=%lx", ret);
+
+	return ret;
+}
diff --git a/drivers/infiniband/hw/ehca/hcp_sense.h b/drivers/infiniband/hw/ehca/hcp_sense.h
new file mode 100644
index 0000000..a49040b
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hcp_sense.h
@@ -0,0 +1,136 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  ehca detection and query code for POWER
+ *
+ *  Authors: Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hcp_sense.h,v 1.11 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef HCP_SENSE_H
+#define HCP_SENSE_H
+
+int hipz_count_adapters(void);
+int hipz_probe_adapters(char **adapter_list);
+u64 hipz_get_adapter_handle(char *adapter);
+
+/* query hca response block */
+struct query_hca_rblock {
+	u32 cur_reliable_dg;
+	u32 cur_qp;
+	u32 cur_cq;
+	u32 cur_eq;
+	u32 cur_mr;
+	u32 cur_mw;
+	u32 cur_ee_context;
+	u32 cur_mcast_grp;
+	u32 cur_qp_attached_mcast_grp;
+	u32 reserved1;
+	u32 cur_ipv6_qp;
+	u32 cur_eth_qp;
+	u32 cur_hp_mr;
+	u32 reserved2[3];
+	u32 max_rd_domain;
+	u32 max_qp;
+	u32 max_cq;
+	u32 max_eq;
+	u32 max_mr;
+	u32 max_hp_mr;
+	u32 max_mw;
+	u32 max_mrwpte;
+	u32 max_special_mrwpte;
+	u32 max_rd_ee_context;
+	u32 max_mcast_grp;
+	u32 max_qps_attached_all_mcast_grp;
+	u32 max_qps_attached_mcast_grp;
+	u32 max_raw_ipv6_qp;
+	u32 max_raw_ethy_qp;
+	u32 internal_clock_frequency;
+	u32 max_pd;
+	u32 max_ah;
+	u32 max_cqe;
+	u32 max_wqes_wq;
+	u32 max_partitions;
+	u32 max_rr_ee_context;
+	u32 max_rr_qp;
+	u32 max_rr_hca;
+	u32 max_act_wqs_ee_context;
+	u32 max_act_wqs_qp;
+	u32 max_sge;
+	u32 max_sge_rd;
+	u32 memory_page_size_supported;
+	u64 max_mr_size;
+	u32 local_ca_ack_delay;
+	u32 num_ports;
+	u32 vendor_id;
+	u32 vendor_part_id;
+	u32 hw_ver;
+	u64 node_guid;
+	u64 hca_cap_indicators;
+	u32 data_counter_register_size;
+	u32 max_shared_rq;
+	u32 max_isns_eq;
+	u32 max_neq;
+} __attribute__ ((packed));
+
+/* query port response block */
+struct query_port_rblock {
+	u32 state;
+	u32 bad_pkey_cntr;
+	u32 lmc;
+	u32 lid;
+	u32 subnet_timeout;
+	u32 qkey_viol_cntr;
+	u32 sm_sl;
+	u32 sm_lid;
+	u32 capability_mask;
+	u32 init_type_reply;
+	u32 pkey_tbl_len;
+	u32 gid_tbl_len;
+	u64 gid_prefix;
+	u32 port_nr;
+	u16 pkey_entries[16];
+	u8  reserved1[32];
+	u32 trent_size;
+	u32 trbuf_size;
+	u64 max_msg_sz;
+	u32 max_mtu;
+	u32 vl_cap;
+	u8  reserved2[1900];
+	u64 guid_entries[255];
+} __attribute__ ((packed));
+
+#endif
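Reviewer note: the packed response blocks above have to match a fixed layout byte for byte, so a compile-time check is a cheap guard against a field being added or resized by accident. The sketch below is a user-space copy of struct query_port_rblock using <stdint.h> types. The struct name with the _copy suffix is hypothetical, and the 4096-byte bound is an assumption (one 4k page); it is not stated anywhere in this patch. The offset and size follow from adding up the field widths.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space copy of struct query_port_rblock, for layout checks only. */
struct query_port_rblock_copy {
	uint32_t state;
	uint32_t bad_pkey_cntr;
	uint32_t lmc;
	uint32_t lid;
	uint32_t subnet_timeout;
	uint32_t qkey_viol_cntr;
	uint32_t sm_sl;
	uint32_t sm_lid;
	uint32_t capability_mask;
	uint32_t init_type_reply;
	uint32_t pkey_tbl_len;
	uint32_t gid_tbl_len;
	uint64_t gid_prefix;
	uint32_t port_nr;
	uint16_t pkey_entries[16];
	uint8_t  reserved1[32];
	uint32_t trent_size;
	uint32_t trbuf_size;
	uint64_t max_msg_sz;
	uint32_t max_mtu;
	uint32_t vl_cap;
	uint8_t  reserved2[1900];
	uint64_t guid_entries[255];
} __attribute__ ((packed));

/* reserved2 pads the fixed part so that the GUID table starts at 2048. */
_Static_assert(offsetof(struct query_port_rblock_copy, guid_entries) == 2048,
	       "guid_entries must start at byte 2048");

/* Assumption: the whole block has to fit into one 4096-byte page. */
_Static_assert(sizeof(struct query_port_rblock_copy) <= 4096,
	       "response block must fit into one 4k page");
```

In the kernel the same idea would normally be written with BUILD_BUG_ON() inside a function instead of C11 _Static_assert.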


From rolandd at cisco.com  Sat Feb 18 11:57:17 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:17 -0800
Subject: [PATCH 05/22] HW register abstractions
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005717.13620.85161.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

Does hipz_structs.h really need a whole file to hold 5 #defines?
---
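A side note for readers of hipz_hw.h: the register fields are declared as EHCA_BMASK_IBM(from,to) ranges, and the macro itself lives in ehca_tools.h, which is not part of this patch. The sketch below assumes, as the name suggests, IBM bit numbering on a 64-bit register, where bit 0 is the most significant bit and bit 63 the least significant. The helpers are hypothetical and only illustrate that numbering; they are not the driver's macros.

```c
#include <stdint.h>

/* Hypothetical helpers. Assumed convention: IBM bit numbering on a
 * 64-bit register, bit 0 = most significant, bit 63 = least significant. */

/* Left shift needed to place a field whose last bit is IBM bit 'to'. */
static inline unsigned int ibm_shift(unsigned int to)
{
	return 63u - to;
}

/* Right-aligned mask that is as wide as the field from..to. */
static inline uint64_t ibm_width_mask(unsigned int from, unsigned int to)
{
	unsigned int width = to - from + 1;

	return width >= 64 ? ~0ULL : ((1ULL << width) - 1);
}

/* Build a register image with 'value' stored in field from..to.
 * Bits of 'value' that do not fit into the field are dropped. */
static inline uint64_t ibm_field_set(unsigned int from, unsigned int to,
				     uint64_t value)
{
	return (value & ibm_width_mask(from, to)) << ibm_shift(to);
}

/* Extract field from..to out of a register image. */
static inline uint64_t ibm_field_get(unsigned int from, unsigned int to,
				     uint64_t reg)
{
	return (reg >> ibm_shift(to)) & ibm_width_mask(from, to);
}
```

Under that assumption a field declared as (48,63), such as QPx_SQAdder, is simply the low 16 bits of the register, and a field declared as (0,0), such as QPx_C_Enabled, is the top bit.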

 drivers/infiniband/hw/ehca/hipz_fns.h      |   83 ++++++
 drivers/infiniband/hw/ehca/hipz_fns_core.h |  123 +++++++++
 drivers/infiniband/hw/ehca/hipz_hw.h       |  382 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/ehca/hipz_structs.h  |   54 ++++
 4 files changed, 642 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/hipz_fns.h b/drivers/infiniband/hw/ehca/hipz_fns.h
new file mode 100644
index 0000000..4231b65
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hipz_fns.h
@@ -0,0 +1,83 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  HW abstraction register functions
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hipz_fns.h,v 1.15 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __HIPZ_FNS_H__
+#define __HIPZ_FNS_H__
+
+#include "hipz_structs.h"
+#include "ehca_classes.h"
+#include "hipz_hw.h"
+#ifndef EHCA_USE_HCALL
+#include "sim_gal.h"
+#endif
+
+#include "hipz_fns_core.h"
+
+#define hipz_galpa_store_eq(gal,offset,value)\
+	hipz_galpa_store(gal,EQTEMM_OFFSET(offset),value)
+#define hipz_galpa_load_eq(gal,offset)\
+	hipz_galpa_load(gal,EQTEMM_OFFSET(offset))
+
+#define hipz_galpa_store_qped(gal,offset,value)\
+	hipz_galpa_store(gal,QPEDMM_OFFSET(offset),value)
+#define hipz_galpa_load_qped(gal,offset)\
+	hipz_galpa_load(gal,QPEDMM_OFFSET(offset))
+
+#define hipz_galpa_store_mrmw(gal,offset,value)\
+	hipz_galpa_store(gal,MRMWMM_OFFSET(offset),value)
+#define hipz_galpa_load_mrmw(gal,offset)\
+	hipz_galpa_load(gal,MRMWMM_OFFSET(offset))
+
+inline static void hipz_load_FEC(struct ehca_cq_core *cq_core, u32 * count)
+{
+	uint64_t reg = 0;
+	EDEB_EN(7, "cq_core=%p", cq_core);
+	{
+		struct h_galpa gal = cq_core->galpas.kernel;
+		reg = hipz_galpa_load_cq(gal, CQx_FEC);
+		*count = EHCA_BMASK_GET(CQx_FEC_CQE_cnt, reg);
+	}
+	EDEB_EX(7, "cq_core=%p CQx_FEC=%lx", cq_core, reg);
+}
+
+#endif /* __HIPZ_FNS_H__ */
diff --git a/drivers/infiniband/hw/ehca/hipz_fns_core.h b/drivers/infiniband/hw/ehca/hipz_fns_core.h
new file mode 100644
index 0000000..a60b808
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hipz_fns_core.h
@@ -0,0 +1,123 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  HW abstraction register functions
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hipz_fns_core.h,v 1.10 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __HIPZ_FNS_CORE_H__
+#define __HIPZ_FNS_CORE_H__
+
+#include "ehca_galpa.h"
+#include "hipz_hw.h"
+
+#define hipz_galpa_store_cq(gal,offset,value)\
+	hipz_galpa_store(gal,CQTEMM_OFFSET(offset),value)
+#define hipz_galpa_load_cq(gal,offset)\
+	hipz_galpa_load(gal,CQTEMM_OFFSET(offset))
+
+#define hipz_galpa_store_qp(gal,offset,value)\
+	hipz_galpa_store(gal,QPTEMM_OFFSET(offset),value)
+#define hipz_galpa_load_qp(gal,offset)\
+	hipz_galpa_load(gal,QPTEMM_OFFSET(offset))
+
+inline static void hipz_update_SQA(struct ehca_qp_core *qp_core, u16 nr_wqes)
+{
+	struct h_galpa gal;
+
+	EDEB_EN(7, "qp_core=%p", qp_core);
+	gal = qp_core->galpas.kernel;
+	/*  ringing doorbell :-) */
+	hipz_galpa_store_qp(gal, QPx_SQA, EHCA_BMASK_SET(QPx_SQAdder, nr_wqes));
+	EDEB_EX(7, "qp_core=%p QPx_SQA = %i", qp_core, nr_wqes);
+}
+
+inline static void hipz_update_RQA(struct ehca_qp_core *qp_core, u16 nr_wqes)
+{
+	struct h_galpa gal;
+
+	EDEB_EN(7, "qp_core=%p", qp_core);
+	gal = qp_core->galpas.kernel;
+	/*  ringing doorbell :-) */
+	hipz_galpa_store_qp(gal, QPx_RQA, EHCA_BMASK_SET(QPx_RQAdder, nr_wqes));
+	EDEB_EX(7, "qp_core=%p QPx_RQA = %i", qp_core, nr_wqes);
+}
+
+inline static void hipz_update_FECA(struct ehca_cq_core *cq_core, u32 nr_cqes)
+{
+	struct h_galpa gal;
+
+	EDEB_EN(7, "cq_core=%p", cq_core);
+	gal = cq_core->galpas.kernel;
+	hipz_galpa_store_cq(gal, CQx_FECA,
+			    EHCA_BMASK_SET(CQx_FECAdder, nr_cqes));
+	EDEB_EX(7, "cq_core=%p CQx_FECA = %i", cq_core, nr_cqes);
+}
+
+inline static void hipz_set_CQx_N0(struct ehca_cq_core *cq_core, u32 value)
+{
+	struct h_galpa gal;
+	u64 CQx_N0_reg = 0;
+
+	EDEB_EN(7, "cq_core=%p event on solicited completion -- write CQx_N0",
+		cq_core);
+	gal = cq_core->galpas.kernel;
+	hipz_galpa_store_cq(gal, CQx_N0,
+			    EHCA_BMASK_SET(CQx_N0_generate_solicited_comp_event,
+					   value));
+	CQx_N0_reg = hipz_galpa_load_cq(gal, CQx_N0);
+	EDEB_EX(7, "cq_core=%p loaded CQx_N0=%lx", cq_core,(unsigned long)CQx_N0_reg);
+}
+
+inline static void hipz_set_CQx_N1(struct ehca_cq_core *cq_core, u32 value)
+{
+	struct h_galpa gal;
+	u64 CQx_N1_reg = 0;
+
+	EDEB_EN(7, "cq_core=%p event on completion -- write CQx_N1",
+		cq_core);
+	gal = cq_core->galpas.kernel;
+	hipz_galpa_store_cq(gal, CQx_N1,
+			    EHCA_BMASK_SET(CQx_N1_generate_comp_event, value));
+	CQx_N1_reg = hipz_galpa_load_cq(gal, CQx_N1);
+	EDEB_EX(7, "cq_core=%p loaded CQx_N1=%lx", cq_core,(unsigned long)CQx_N1_reg);
+}
+
+#endif /* __HIPZ_FNS_CORE_H__ */
diff --git a/drivers/infiniband/hw/ehca/hipz_hw.h b/drivers/infiniband/hw/ehca/hipz_hw.h
new file mode 100644
index 0000000..6fa005b
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hipz_hw.h
@@ -0,0 +1,382 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  eHCA register definitions
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Waleri Fomin <fomin at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hipz_hw.h,v 1.7 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __HIPZ_HW_H__
+#define __HIPZ_HW_H__
+
+#ifdef __KERNEL__
+#include "ehca_tools.h"
+#include "ehca_kernel.h"
+#else				/*  !__KERNEL__ */
+#include "ehca_utools.h"
+#endif
+
+/** @brief Queue Pair Table Entry Memory Map
+ */
+struct hipz_QPTEMM {
+	u64 QPx_HCR;
+#define QPx_HCR_PKEY_Mode EHCA_BMASK_IBM(1,2)
+#define QPx_HCR_Special_QP_Mode EHCA_BMASK_IBM(6,7)
+	u64 QPx_C;
+#define QPx_C_Enabled EHCA_BMASK_IBM(0,0)
+#define QPx_C_Disabled EHCA_BMASK_IBM(1,1)
+#define QPx_C_Req_State EHCA_BMASK_IBM(16,23)
+#define QPx_C_Res_State EHCA_BMASK_IBM(25,31)
+#define QPx_C_disable_ETE_check EHCA_BMASK_IBM(7,7)
+	u64 QPx_HERR;
+	u64 QPx_AER;
+/* 0x20*/
+	u64 QPx_SQA;
+#define QPx_SQAdder EHCA_BMASK_IBM(48,63)
+	u64 QPx_SQC;
+	u64 QPx_RQA;
+#define QPx_RQAdder EHCA_BMASK_IBM(48,63)
+	u64 QPx_RQC;
+/* 0x40*/
+	u64 QPx_ST;
+	u64 QPx_PMSTATE;
+#define  QPx_PMSTATE_BITS  EHCA_BMASK_IBM(30,31)
+	u64 QPx_PMFA;
+	u64 QPx_PKEY;
+#define QPx_PKEY_value EHCA_BMASK_IBM(48,63)
+/* 0x60*/
+	u64 QPx_PKEYA;
+#define QPx_PKEYA_index0 EHCA_BMASK_IBM(0,15)
+#define QPx_PKEYA_index1 EHCA_BMASK_IBM(16,31)
+#define QPx_PKEYA_index2 EHCA_BMASK_IBM(32,47)
+#define QPx_PKEYA_index3 EHCA_BMASK_IBM(48,63)
+	u64 QPx_PKEYB;
+#define QPx_PKEYB_index4 EHCA_BMASK_IBM(0,15)
+#define QPx_PKEYB_index5 EHCA_BMASK_IBM(16,31)
+#define QPx_PKEYB_index6 EHCA_BMASK_IBM(32,47)
+#define QPx_PKEYB_index7 EHCA_BMASK_IBM(48,63)
+	u64 QPx_PKEYC;
+#define QPx_PKEYC_index8 EHCA_BMASK_IBM(0,15)
+#define QPx_PKEYC_index9 EHCA_BMASK_IBM(16,31)
+#define QPx_PKEYC_index10 EHCA_BMASK_IBM(32,47)
+#define QPx_PKEYC_index11 EHCA_BMASK_IBM(48,63)
+	u64 QPx_PKEYD;
+#define QPx_PKEYD_index12 EHCA_BMASK_IBM(0,15)
+#define QPx_PKEYD_index13 EHCA_BMASK_IBM(16,31)
+#define QPx_PKEYD_index14 EHCA_BMASK_IBM(32,47)
+#define QPx_PKEYD_index15 EHCA_BMASK_IBM(48,63)
+/* 0x80*/
+	u64 QPx_QKEY;
+#define QPx_QKEY_value EHCA_BMASK_IBM(32,63)
+	u64 QPx_DQP;
+#define QPx_DQP_number EHCA_BMASK_IBM(40,63)
+	u64 QPx_DLIDP;
+#define QPx_DLID_PRIMARY EHCA_BMASK_IBM(48,63)
+#define QPx_DLIDP_GRH    EHCA_BMASK_IBM(31,31)
+	u64 QPx_PORTP;
+#define QPx_PORT_Primary EHCA_BMASK_IBM(57,63)
+/* 0xa0*/
+	u64 QPx_SLIDP;
+#define QPx_SLIDP_p_path EHCA_BMASK_IBM(48,63)
+#define QPx_SLIDP_lmc    EHCA_BMASK_IBM(37,39)
+	u64 QPx_SLIDPP;
+#define QPx_SLID_PRIM_PATH EHCA_BMASK_IBM(57,63)
+	u64 QPx_DLIDA;
+#define QPx_DLIDA_GRH    EHCA_BMASK_IBM(31,31)
+	u64 QPx_PORTA;
+#define QPx_PORT_Alternate EHCA_BMASK_IBM(57,63)
+/* 0xc0*/
+	u64 QPx_SLIDA;
+	u64 QPx_SLIDPA;
+	u64 QPx_SLVL;
+#define QPx_SLVL_BITS  EHCA_BMASK_IBM(56,59)
+#define QPx_SLVL_VL    EHCA_BMASK_IBM(60,63)
+	u64 QPx_IPD;
+#define QPx_IPD_max_static_rate EHCA_BMASK_IBM(56,63)
+/* 0xe0*/
+	u64 QPx_MTU;
+#define QPx_MTU_size EHCA_BMASK_IBM(56,63)
+	u64 QPx_LATO;
+#define QPx_LATO_BITS EHCA_BMASK_IBM(59,63)
+	u64 QPx_RLIMIT;
+#define QPx_RETRY_COUNT EHCA_BMASK_IBM(61,63)
+	u64 QPx_RNRLIMIT;
+#define QPx_RNR_RETRY_COUNT EHCA_BMASK_IBM(61,63)
+/* 0x100*/
+	u64 QPx_T;
+	u64 QPx_SQHP;
+	u64 QPx_SQPTP;
+	u64 QPx_NSPSN;
+#define QPx_NSPSN_value EHCA_BMASK_IBM(40,63)
+/* 0x120*/
+	u64 QPx_NSPSNHWM;
+#define QPx_NSPSNHWM_value EHCA_BMASK_IBM(40,63)
+	u64 reserved1;
+	u64 QPx_SDSI;
+	u64 QPx_SDSBC;
+/* 0x140*/
+	u64 QPx_SQWSIZE;
+#define QPx_SQWSIZE_value EHCA_BMASK_IBM(61,63)
+	u64 QPx_SQWTS;
+	u64 QPx_LSN;
+	u64 QPx_NSSN;
+/* 0x160 */
+	u64 QPx_MOR;
+#define QPx_MOR_value EHCA_BMASK_IBM(48,63)
+	u64 QPx_COR;
+	u64 QPx_SQSIZE;
+#define QPx_SQSIZE_value EHCA_BMASK_IBM(60,63)
+	u64 QPx_ERC;
+/* 0x180*/
+	u64 QPx_RNRRC;
+#define QPx_RNRRESP_value EHCA_BMASK_IBM(59,63)
+	u64 QPx_ERNRWT;
+	u64 QPx_RNRRESP;
+#define QPx_RNRRESP_WTR EHCA_BMASK_IBM(59,63)
+	u64 QPx_LMSNA;
+/* 0x1a0 */
+	u64 QPx_SQHPC;
+	u64 QPx_SQCPTP;
+	u64 QPx_SIGT;
+	u64 QPx_WQECNT;
+/* 0x1c0*/
+
+	u64 QPx_RQHP;
+	u64 QPx_RQPTP;
+	u64 QPx_RQSIZE;
+#define QPx_RQSIZE_value EHCA_BMASK_IBM(60,63)
+	u64 QPx_NRR;
+#define QPx_NRR_value EHCA_BMASK_IBM(61,63)
+/* 0x1e0*/
+	u64 QPx_RDMAC;
+#define QPx_RDMAC_value EHCA_BMASK_IBM(61,63)
+	u64 QPx_NRPSN;
+#define QPx_NRPSN_value EHCA_BMASK_IBM(40,63)
+	u64 QPx_LAPSN;
+#define QPx_LAPSN_value EHCA_BMASK_IBM(40,63)
+	u64 QPx_LCR;
+/* 0x200*/
+	u64 QPx_RWC;
+	u64 QPx_RWVA;
+	u64 QPx_RDSI;
+	u64 QPx_RDSBC;
+/* 0x220*/
+	u64 QPx_RQWSIZE;
+#define QPx_RQWSIZE_value EHCA_BMASK_IBM(61,63)
+	u64 QPx_CRMSN;
+	u64 QPx_RDD;
+#define QPx_RDD_VALUE  EHCA_BMASK_IBM(32,63)
+	u64 QPx_LARPSN;
+#define QPx_LARPSN_value EHCA_BMASK_IBM(40,63)
+/* 0x240*/
+	u64 QPx_PD;
+	u64 QPx_SCQN;
+	u64 QPx_RCQN;
+	u64 QPx_AEQN;
+/* 0x260*/
+	u64 QPx_AAELOG;
+	u64 QPx_RAM;
+	u64 QPx_RDMAQE0;
+	u64 QPx_RDMAQE1;
+/* 0x280*/
+	u64 QPx_RDMAQE2;
+	u64 QPx_RDMAQE3;
+	u64 QPx_NRPSNHWM;
+#define QPx_NRPSNHWM_value EHCA_BMASK_IBM(40,63)
+/* 0x298*/
+	u64 reserved[(0x400 - 0x298) / 8];
+/* 0x400 extended data */
+	u64 reserved_ext[(0x500 - 0x400) / 8];
+/* 0x500 */
+	u64 reserved2[(0x1000 - 0x500) / 8];
+/* 0x1000      */
+};
+
+#define QPTEMM_OFFSET(x) offsetof(struct hipz_QPTEMM,x)
+
+/** @brief MRMWPT Entry Memory Map
+ */
+struct hipz_MRMWMM {
+	/* 0x00 */
+	u64 MRx_HCR;
+#define MRx_HCR_LPARID_VALID EHCA_BMASK_IBM(0,0)
+
+	u64 MRx_C;
+	u64 MRx_HERR;
+	u64 MRx_AER;
+	/* 0x20 */
+	u64 MRx_PP;
+	u64 reserved1;
+	u64 reserved2;
+	u64 reserved3;
+	/* 0x40 */
+	u64 reserved4[(0x200 - 0x40) / 8];
+	/* 0x200 */
+	u64 MRx_CTL[64];
+
+};
+
+#define MRMWMM_OFFSET(x) offsetof(struct hipz_MRMWMM,x)
+
+/** @brief QP Extended Data Memory Map (QPEDMM)
+ */
+struct hipz_QPEDMM {
+	/* 0x00 */
+	u64 reserved0[(0x400) / 8];
+	/* 0x400 */
+	u64 QPEDx_PHH;
+#define QPEDx_PHH_TClass EHCA_BMASK_IBM(4,11)
+#define QPEDx_PHH_HopLimit EHCA_BMASK_IBM(56,63)
+#define QPEDx_PHH_FlowLevel EHCA_BMASK_IBM(12,31)
+	u64 QPEDx_PPSGP;
+#define QPEDx_PPSGP_PPPidx EHCA_BMASK_IBM(0,63)
+	/* 0x410 */
+	u64 QPEDx_PPSGU;
+#define QPEDx_PPSGU_PPPSGID EHCA_BMASK_IBM(0,63)
+	u64 QPEDx_PPDGP;
+	/* 0x420 */
+	u64 QPEDx_PPDGU;
+	u64 QPEDx_APH;
+	/* 0x430 */
+	u64 QPEDx_APSGP;
+	u64 QPEDx_APSGU;
+	/* 0x440 */
+	u64 QPEDx_APDGP;
+	u64 QPEDx_APDGU;
+	/* 0x450 */
+	u64 QPEDx_APAV;
+	u64 QPEDx_APSAV;
+	/* 0x460  */
+	u64 QPEDx_HCR;
+	u64 reserved1[4];
+	/* 0x488 */
+	u64 QPEDx_RRL0;
+	/* 0x490 */
+	u64 QPEDx_RRRKEY0;
+	u64 QPEDx_RRVA0;
+	/* 0x4A0 */
+	u64 reserved2;
+	u64 QPEDx_RRL1;
+	/* 0x4B0 */
+	u64 QPEDx_RRRKEY1;
+	u64 QPEDx_RRVA1;
+	/* 0x4C0 */
+	u64 reserved3;
+	u64 QPEDx_RRL2;
+	/* 0x4D0 */
+	u64 QPEDx_RRRKEY2;
+	u64 QPEDx_RRVA2;
+	/* 0x4E0 */
+	u64 reserved4;
+	u64 QPEDx_RRL3;
+	/* 0x4F0 */
+	u64 QPEDx_RRRKEY3;
+	u64 QPEDx_RRVA3;
+};
+
+#define QPEDMM_OFFSET(x) offsetof(struct hipz_QPEDMM,x)
+
+/** @brief CQ Table Entry Memory Map
+ */
+struct hipz_CQTEMM {
+	u64 CQx_HCR;
+#define CQx_HCR_LPARID_valid EHCA_BMASK_IBM(0,0)
+	u64 CQx_C;
+#define CQx_C_Enable EHCA_BMASK_IBM(0,0)
+#define CQx_C_Disable_Complete EHCA_BMASK_IBM(1,1)
+#define CQx_C_Error_Reset EHCA_BMASK_IBM(23,23)
+	u64 CQx_HERR;
+	u64 CQx_AER;
+/* 0x20  */
+	u64 CQx_PTP;
+	u64 CQx_TP;
+#define CQx_FEC_CQE_cnt EHCA_BMASK_IBM(32,63)
+	u64 CQx_FEC;
+	u64 CQx_FECA;
+#define CQx_FECAdder EHCA_BMASK_IBM(32,63)
+/* 0x40  */
+	u64 CQx_EP;
+#define CQx_EP_Event_Pending EHCA_BMASK_IBM(0,0)
+#define CQx_EQ_number EHCA_BMASK_IBM(0,15)
+#define CQx_EQ_CQtoken EHCA_BMASK_IBM(32,63)
+	u64 CQx_EQ;
+/* 0x50  */
+	u64 reserved1;
+	u64 CQx_N0;
+#define CQx_N0_generate_solicited_comp_event EHCA_BMASK_IBM(0,0)
+/* 0x60  */
+	u64 CQx_N1;
+#define CQx_N1_generate_comp_event EHCA_BMASK_IBM(0,0)
+	u64 reserved2[(0x1000 - 0x68) / 8];
+/* 0x1000 */
+};
+
+#define CQTEMM_OFFSET(x) offsetof(struct hipz_CQTEMM,x)
+
+/** @brief EQ Table Entry Memory Map
+ */
+struct hipz_EQTEMM {
+	u64 EQx_HCR;
+#define EQx_HCR_LPARID_valid EHCA_BMASK_IBM(0,0)
+#define EQx_HCR_ENABLE_PSB EHCA_BMASK_IBM(8,8)
+	u64 EQx_C;
+#define EQx_C_Enable EHCA_BMASK_IBM(0,0)
+#define EQx_C_Error_Reset EHCA_BMASK_IBM(23,23)
+#define EQx_C_Comp_Event EHCA_BMASK_IBM(17,17)
+
+	u64 EQx_HERR;
+	u64 EQx_AER;
+/* 0x20 */
+	u64 EQx_PTP;
+	u64 EQx_TP;
+	u64 EQx_SSBA;
+	u64 EQx_PSBA;
+
+/* 0x40 */
+	u64 EQx_CEC;
+	u64 EQx_MEQL;
+	u64 EQx_XISBI;
+	u64 EQx_XISC;
+/* 0x60 */
+	u64 EQx_IT;
+
+};
+#define EQTEMM_OFFSET(x) offsetof(struct hipz_EQTEMM,x)
+
+#endif
diff --git a/drivers/infiniband/hw/ehca/hipz_structs.h b/drivers/infiniband/hw/ehca/hipz_structs.h
new file mode 100644
index 0000000..bd2dcad
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/hipz_structs.h
@@ -0,0 +1,54 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Infiniband Firmware structure definition
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: hipz_structs.h,v 1.8 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __HIPZ_STRUCTS_H__
+#define __HIPZ_STRUCTS_H__
+
+/* access control defines for MR/MW */
+#define HIPZ_ACCESSCTRL_L_WRITE  0x00800000
+#define HIPZ_ACCESSCTRL_R_WRITE  0x00400000
+#define HIPZ_ACCESSCTRL_R_READ   0x00200000
+#define HIPZ_ACCESSCTRL_R_ATOMIC 0x00100000
+#define HIPZ_ACCESSCTRL_MW_BIND  0x00080000
+
+#endif /* __HIPZ_STRUCTS_H__ */


From rolandd at cisco.com  Sat Feb 18 11:57:27 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:27 -0800
Subject: [PATCH 10/22] ehca IRQ handling
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005727.13620.58832.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

Where is the irq_count field of struct ehca_irq_info ever used?
I couldn't find it used anywhere, so it can be deleted.

The logic in ehca_interrupt_eq() is too convoluted for me to
follow: there are two nested while () {} loops inside a
do {} while () loop, and ehca_poll_eq() is called in three
different places.  Is there any way to untangle this?
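
One possible shape for the untangled loop, as a rough userspace sketch
only.  Everything prefixed mock_ is a hypothetical stand-in (for
ehca_poll_eq(), hipz_h_query_int_state() and the EQE dispatch), and the
completion bit value is only assumed to follow IBM bit numbering.
Whether the extra ehca_poll_eq() after the interrupt state query is
needed to close a race is a question for the driver authors; this
sketch simply drains the queue again whenever the state query reports
more work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware event queue: a fixed array of
 * entries plus a read cursor.  This is not the driver's struct ehca_eq. */
struct mock_eq {
	const uint64_t *entries;
	size_t count;
	size_t next;
	int int_state_reads;	/* how many times "pending" is still reported */
};

/* Stand-in for ehca_poll_eq(): next entry, or NULL once the queue is empty. */
static const uint64_t *mock_poll_eq(struct mock_eq *eq)
{
	if (eq->next >= eq->count)
		return NULL;
	return &eq->entries[eq->next++];
}

/* Stand-in for hipz_h_query_int_state(): nonzero while work is pending. */
static int mock_query_int_state(struct mock_eq *eq)
{
	if (eq->int_state_reads > 0) {
		eq->int_state_reads--;
		return 1;
	}
	return 0;
}

/* Illustrative completion flag (assumed: bit 1 counted from the MSB). */
#define MOCK_EQE_COMPLETION (UINT64_C(1) << 62)

struct mock_stats {
	int completions;
	int others;
};

/* Stand-in for the completion / parse_identifier() dispatch. */
static void mock_handle_eqe(uint64_t eqe, struct mock_stats *st)
{
	if (eqe & MOCK_EQE_COMPLETION)
		st->completions++;
	else
		st->others++;
}

/* One polling site, one state query: drain until the queue is empty,
 * then repeat only while the interrupt state says more is pending. */
int mock_drain_eq(struct mock_eq *eq, int hw_level, struct mock_stats *st)
{
	const uint64_t *eqe;
	int handled = 0;
	int pending;

	do {
		while ((eqe = mock_poll_eq(eq)) != NULL) {
			mock_handle_eqe(*eqe, st);
			handled++;
		}
		pending = (hw_level >= 2) ? mock_query_int_state(eq) : 0;
	} while (pending);

	return handled;
}
```

In the real handler the dispatch step would also have to check the
idr_find() result for NULL before reset_eq_pending() touches the CQ,
since that function dereferences its argument.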
---

 drivers/infiniband/hw/ehca/ehca_irq.c |  436 +++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_irq.h |   90 +++++++
 2 files changed, 526 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c
new file mode 100644
index 0000000..1bba58e
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -0,0 +1,436 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Functions for EQs, NEQs and interrupts
+ *
+ *  Authors: Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_irq.c,v 1.64 2006/02/15 08:15:25 schickhj Exp $
+ */
+
+#include "ehca_kernel.h"
+#include "ehca_irq.h"
+
+#define DEB_PREFIX "eirq"
+
+#include "ehca_kernel.h"
+#include "ehca_classes.h"
+#include "ehca_tools.h"
+#include "ehca_eq.h"
+#include "ehca_irq.h"
+#include "hcp_if.h"
+
+#define EQE_COMPLETION_EVENT   EHCA_BMASK_IBM(1,1)
+#define EQE_CQ_QP_NUMBER       EHCA_BMASK_IBM(8,31)
+#define EQE_EE_IDENTIFIER      EHCA_BMASK_IBM(2,7)
+#define EQE_CQ_NUMBER          EHCA_BMASK_IBM(8,31)
+#define EQE_QP_NUMBER          EHCA_BMASK_IBM(8,31)
+#define EQE_QP_TOKEN           EHCA_BMASK_IBM(32,63)
+#define EQE_CQ_TOKEN           EHCA_BMASK_IBM(32,63)
+
+#define NEQE_COMPLETION_EVENT  EHCA_BMASK_IBM(1,1)
+#define NEQE_EVENT_CODE        EHCA_BMASK_IBM(2,7)
+#define NEQE_PORT_NUMBER       EHCA_BMASK_IBM(8,15)
+#define NEQE_PORT_AVAILABILITY EHCA_BMASK_IBM(16,16)
+
+#define ERROR_DATA_LENGTH      EHCA_BMASK_IBM(52,63)
+
+static inline void comp_event_callback(struct ehca_cq *cq)
+{
+	unsigned long spl_flags = 0;
+
+	EDEB_EN(7, "cq=%p", cq);
+
+	if (cq->ib_cq.comp_handler == NULL)
+		return;
+
+	spin_lock_irqsave(&cq->cb_lock, spl_flags);
+	cq->ib_cq.comp_handler(&cq->ib_cq, cq->ib_cq.cq_context);
+	spin_unlock_irqrestore(&cq->cb_lock, spl_flags);
+
+	EDEB_EX(7, "cq=%p", cq);
+
+	return;
+}
+
+int ehca_error_data(struct ehca_shca *shca,
+				  u64 ressource)
+{
+
+	unsigned long ret = 0;
+	u64 *rblock;
+	unsigned long block_count;
+
+	EDEB_EN(7, "ressource=%lx", ressource);
+
+	rblock = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (rblock == NULL) {
+		EDEB_ERR(4, "Cannot allocate rblock memory.");
+		ret = -ENOMEM;
+		goto error_data1;
+	}
+
+	memset(rblock, 0, PAGE_SIZE);
+
+	ret = hipz_h_error_data(shca->ipz_hca_handle,
+				ressource,
+				rblock,
+				&block_count);
+
+	if (ret == H_R_STATE) {
+		EDEB_ERR(4, "No error data is available: %lx.", ressource);
+	}
+	else if (ret == H_Success) {
+		int length;
+
+		length = EHCA_BMASK_GET(ERROR_DATA_LENGTH, rblock[0]);
+
+		if (length > PAGE_SIZE)
+			length = PAGE_SIZE;
+
+		EDEB_ERR(4, "Error data is available: %lx.", ressource);
+		EDEB_ERR(4, "EHCA ----- error data begin "
+			 "---------------------------------------------------");
+		EDEB_DMP(4, rblock, length, "ressource=%lx", ressource);
+		EDEB_ERR(4, "EHCA ----- error data end "
+			 "-----------------------------------------------------");
+	}
+	else {
+		EDEB_ERR(4, "Error data could not be fetched: %lx", ressource);
+	}
+
+	kfree(rblock);
+
+      error_data1:
+	return ret;
+
+}
+
+static void qp_event_callback(struct ehca_shca *shca,
+					  u64 eqe,
+					  enum ib_event_type event_type)
+{
+	struct ib_event event;
+	struct ehca_qp *qp;
+	u32 token = EHCA_BMASK_GET(EQE_QP_TOKEN, eqe);
+
+	EDEB_EN(7, "eqe=%lx", eqe);
+
+	down_read(&ehca_qp_idr_sem);
+	qp = idr_find(&ehca_qp_idr, token);
+	up_read(&ehca_qp_idr_sem);
+
+	if (qp == NULL)
+		return;
+
+	if (event_type == IB_EVENT_QP_FATAL)
+		EDEB_ERR(4, "QP 0x%x (ressource=%lx) has errors.",
+			 qp->ib_qp.qp_num, qp->ipz_qp_handle.handle);
+
+	ehca_error_data(shca, qp->ipz_qp_handle.handle);
+
+	if (qp->ib_qp.event_handler == NULL)
+		return;
+
+	event.device     = &shca->ib_device;
+	event.event      = event_type;
+	event.element.qp = &qp->ib_qp;
+
+	qp->ib_qp.event_handler(&event, qp->ib_qp.qp_context);
+
+	EDEB_EX(7, "qp=%p", qp);
+
+	return;
+}
+
+static void cq_event_callback(struct ehca_shca *shca,
+					  u64 eqe)
+{
+	struct ehca_cq *cq;
+	u32 token = EHCA_BMASK_GET(EQE_CQ_TOKEN, eqe);
+
+	EDEB_EN(7, "eqe=%lx", eqe);
+
+	down_read(&ehca_cq_idr_sem);
+	cq = idr_find(&ehca_cq_idr, token);
+	up_read(&ehca_cq_idr_sem);
+
+	if (cq == NULL)
+		return;
+
+	EDEB_ERR(4, "CQ 0x%x (ressource=%lx) has errors.",
+		 cq->cq_number, cq->ipz_cq_handle.handle);
+
+	ehca_error_data(shca, cq->ipz_cq_handle.handle);
+
+	EDEB_EX(7, "cq=%p", cq);
+
+	return;
+}
+
+static void parse_identifier(struct ehca_shca *shca, u64 eqe)
+{
+	u8 identifier = EHCA_BMASK_GET(EQE_EE_IDENTIFIER, eqe);
+
+	EDEB_EN(7, "shca=%p eqe=%lx", shca, eqe);
+
+	switch (identifier) {
+	case 0x02:		/* path migrated */
+		qp_event_callback(shca, eqe, IB_EVENT_PATH_MIG);
+		break;
+	case 0x03:		/* communication established */
+		qp_event_callback(shca, eqe, IB_EVENT_COMM_EST);
+		break;
+	case 0x04:		/* send queue drained */
+		qp_event_callback(shca, eqe, IB_EVENT_SQ_DRAINED);
+		break;
+	case 0x05:		/* QP error */
+	case 0x06:		/* QP error */
+		qp_event_callback(shca, eqe, IB_EVENT_QP_FATAL);
+		break;
+	case 0x07:		/* CQ error */
+	case 0x08:		/* CQ error */
+		cq_event_callback(shca, eqe);
+		break;
+	case 0x09:		/* MRMWPTE error */
+	case 0x0A:		/* port event */
+	case 0x0B:		/* MR access error */
+	case 0x0C:		/* EQ error */
+	case 0x0D:		/* P/Q_Key mismatch */
+	case 0x10:		/* sampling complete */
+	case 0x11:		/* unaffiliated access error */
+	case 0x12:		/* path migrating error */
+	case 0x13:		/* interface trace stopped */
+	case 0x14:		/* first error capture info available */
+	default:
+		EDEB_ERR(4, "Unknown identifier: %x on %s.",
+			 identifier, shca->ib_device.name);
+		break;
+	}
+
+	EDEB_EX(7, "eqe=%lx identifier=%x", eqe, identifier);
+
+	return;
+}
+
+static void parse_ec(struct ehca_shca *shca, u64 eqe)
+{
+	struct ib_event event;
+	u8 ec   = EHCA_BMASK_GET(NEQE_EVENT_CODE, eqe);
+	u8 port = EHCA_BMASK_GET(NEQE_PORT_NUMBER, eqe);
+
+	EDEB_EN(7, "shca=%p eqe=%lx", shca, eqe);
+
+	switch (ec) {
+	case 0x30:		/* port availability change */
+		if (EHCA_BMASK_GET(NEQE_PORT_AVAILABILITY, eqe)) {
+			EDEB(4, "%s: port %x is active.",
+			     shca->ib_device.name, port);
+			event.device = &shca->ib_device;
+			event.event = IB_EVENT_PORT_ACTIVE;
+			event.element.port_num = port;
+			shca->sport[port - 1].port_state = IB_PORT_ACTIVE;
+			ib_dispatch_event(&event);
+		} else {
+			EDEB(4, "%s: port %x is inactive.",
+			     shca->ib_device.name, port);
+			event.device = &shca->ib_device;
+			event.event = IB_EVENT_PORT_ERR;
+			event.element.port_num = port;
+			shca->sport[port - 1].port_state = IB_PORT_DOWN;
+			ib_dispatch_event(&event);
+		}
+		break;
+	case 0x31:
+		/* port configuration change      */
+		/* disruptive change is caused by */
+		/* LID, PKEY or SM change         */
+		EDEB(4, "EHCA disruptive port %x "
+		     "configuration change.", port);
+
+		EDEB(4, "%s: port %x is inactive.",
+		     shca->ib_device.name, port);
+		event.device = &shca->ib_device;
+		event.event = IB_EVENT_PORT_ERR;
+		event.element.port_num = port;
+		shca->sport[port - 1].port_state = IB_PORT_DOWN;
+		ib_dispatch_event(&event);
+
+		EDEB(4, "%s: port %x is active.",
+		     shca->ib_device.name, port);
+		event.device = &shca->ib_device;
+		event.event = IB_EVENT_PORT_ACTIVE;
+		event.element.port_num = port;
+		shca->sport[port - 1].port_state = IB_PORT_ACTIVE;
+		ib_dispatch_event(&event);
+		break;
+	case 0x32:		/* adapter malfunction */
+	case 0x33:		/* trace stopped */
+	default:
+		EDEB_ERR(4, "Unknown event code: %x on %s.",
+			 ec, shca->ib_device.name);
+		break;
+	}
+
+	EDEB_EX(7, "eqe=%lx ec=%x", eqe, ec);
+
+	return;
+}
+
+static inline void reset_eq_pending(struct ehca_cq *cq)
+{
+	u64 CQx_EP = 0;
+	struct h_galpa gal = cq->ehca_cq_core.galpas.kernel;
+
+	EDEB_EN(7, "cq=%p", cq);
+
+	hipz_galpa_store_cq(gal, CQx_EP, 0x0);
+	CQx_EP = hipz_galpa_load(gal, CQTEMM_OFFSET(CQx_EP));
+	EDEB(7, "CQx_EP=%lx", CQx_EP);
+
+	EDEB_EX(7, "cq=%p", cq);
+
+	return;
+}
+
+void ehca_interrupt_eq(void *data)
+{
+	struct ehca_irq_info *irq_info;
+	struct ehca_shca *shca;
+	struct ehca_eqe *eqe;
+	int int_state;
+
+	EDEB_EN(7, "data=%p", data);
+
+	irq_info = (struct ehca_irq_info *)data;
+	shca = to_shca(eq);
+
+	do {
+		eqe = (struct ehca_eqe *)ehca_poll_eq(shca, &shca->eq);
+
+		if ((shca->hw_level >= 2) && (eqe != NULL))
+			int_state = 1;
+		else
+			int_state = 0;
+
+		while ((int_state == 1) || (eqe != 0)) {
+			while (eqe) {
+				u64 eqe_value = eqe->entry;
+
+				EDEB(7, "eqe_value=%lx", eqe_value);
+
+				/* TODO: better structure */
+				if (EHCA_BMASK_GET(EQE_COMPLETION_EVENT,
+						   eqe_value)) {
+					u32 token;
+					struct ehca_cq *cq;
+
+					EDEB(7, "... completion event");
+					token = EHCA_BMASK_GET(EQE_CQ_TOKEN,
+							       eqe_value);
+					down_read(&ehca_cq_idr_sem);
+					cq = idr_find(&ehca_cq_idr, token);
+					up_read(&ehca_cq_idr_sem);
+					if (cq != NULL) {
+						reset_eq_pending(cq);
+						comp_event_callback(cq);
+					}
+				} else {
+					EDEB(7, "... non completion event");
+					parse_identifier(shca, eqe_value);
+				}
+				eqe =
+				    (struct ehca_eqe *)ehca_poll_eq(shca,
+								    &shca->eq);
+			}
+
+			/* TODO: do we need hw_level  */
+			if (shca->hw_level >= 2)
+				int_state =
+				    hipz_h_query_int_state(shca->ipz_hca_handle,
+							   irq_info);
+			eqe = (struct ehca_eqe *)ehca_poll_eq(shca, &shca->eq);
+
+		}
+	} while (int_state != 0);
+
+	EDEB_EX(7, "shca=%p", shca);
+
+	return;
+}
+
+void ehca_interrupt_neq(void *data)
+{
+	struct ehca_irq_info *irq_info;
+	struct ehca_shca *shca;
+	struct ehca_eqe *eqe;
+	u64 ret = H_Success;
+
+	EDEB_EN(7, "data=%p", data);
+
+	irq_info = (struct ehca_irq_info *)data;
+	shca = to_shca(neq);
+	eqe = (struct ehca_eqe *)ehca_poll_eq(shca, &shca->neq);
+
+	while (eqe) {
+		if (!EHCA_BMASK_GET(NEQE_COMPLETION_EVENT, eqe->entry))
+			parse_ec(shca, eqe->entry);
+
+		eqe = (struct ehca_eqe *)ehca_poll_eq(shca, &shca->neq);
+	}
+
+	ret = hipz_h_reset_event(shca->ipz_hca_handle,
+				 shca->neq.ipz_eq_handle, 0xFFFFFFFFFFFFFFFF);
+
+	if (ret != H_Success)
+		EDEB_ERR(4, "Can't clear notification events.");
+
+	EDEB_EX(7, "shca=%p", shca);
+
+	return;
+}
+
+irqreturn_t ehca_interrupt(int irq, void *dev_id, struct pt_regs *regs)
+{
+	struct ehca_irq_info *info = (struct ehca_irq_info *)dev_id;
+
+	EDEB_EN(7, "dev_id=%p", dev_id);
+
+	queue_work(info->wq, info->work);
+
+	EDEB_EX(7, "");
+
+	return IRQ_HANDLED;
+}
diff --git a/drivers/infiniband/hw/ehca/ehca_irq.h b/drivers/infiniband/hw/ehca/ehca_irq.h
new file mode 100644
index 0000000..43b2e3e
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_irq.h
@@ -0,0 +1,90 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Function definitions and structs for EQs, NEQs and interrupts
+ *
+ *  Authors: Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_irq.h,v 1.25 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_IRQ_H
+#define __EHCA_IRQ_H
+
+
+struct ehca_shca;
+
+#include <asm/atomic.h>
+#include <asm/types.h>
+
+#ifndef EHCA_USERDRIVER
+#include <linux/interrupt.h>
+#endif
+
+#ifndef __KERNEL__
+#define NO_IRQ (-1)
+#include <linux/version.h>
+#include <errno.h>
+#endif
+
+#ifndef EHCA_USERDRIVER
+#define to_shca(queue) container_of(irq_info->eq,     \
+				    struct ehca_shca, \
+				    queue)
+#else
+extern struct ehca_module ehca_module;
+#define to_shca(queue) list_entry(ehca_module.shca_list.next, \
+				  struct ehca_shca, shca_list)
+#endif
+
+struct ehca_irq_info {
+	__u32 ist;
+	__u32 irq;
+	void *eq;
+
+	atomic_t irq_count;
+	struct workqueue_struct *wq;
+	struct work_struct *work;
+
+	pid_t pid;
+};
+
+void ehca_interrupt_eq(void *data);
+void ehca_interrupt_neq(void *data);
+irqreturn_t ehca_interrupt(int irq, void *dev_id, struct pt_regs *regs);
+irqreturn_t ehca_interrupt_direct(int irq, void *dev_id, struct pt_regs *regs);
+int ehca_error_data(struct ehca_shca *shca, u64 ressource);
+
+#endif


From rolandd at cisco.com  Sat Feb 18 11:57:21 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:21 -0800
Subject: [PATCH 07/22] Hypercall definitions
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005721.13620.84990.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

Do these defines belong in the ehca driver, or should they be put
somewhere in generic hypercall support?
---

 drivers/infiniband/hw/ehca/ehca_common.h |  115 ++++++++++++++++++++++++++++++
 1 files changed, 115 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_common.h b/drivers/infiniband/hw/ehca/ehca_common.h
new file mode 100644
index 0000000..922f010
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_common.h
@@ -0,0 +1,115 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  hcad local defines
+ *
+ *  Authors:  Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_common.h,v 1.15 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_COMMON_H__
+#define __EHCA_COMMON_H__
+
+#ifdef CONFIG_PPC64
+#include <asm/hvcall.h>
+
+#define H_PARTIAL_STORE   16
+#define H_PAGE_REGISTERED 15
+#define H_IN_PROGRESS     14
+#define H_PARTIAL          5
+#define H_NOT_AVAILABLE    3
+#define H_Closed           2
+#define H_ADAPTER_PARM         -17
+#define H_RH_PARM              -18
+#define H_RCQ_PARM             -19
+#define H_SCQ_PARM             -20
+#define H_EQ_PARM              -21
+#define H_RT_PARM              -22
+#define H_ST_PARM              -23
+#define H_SIGT_PARM            -24
+#define H_TOKEN_PARM           -25
+#define H_MLENGTH_PARM         -27
+#define H_MEM_PARM             -28
+#define H_MEM_ACCESS_PARM      -29
+#define H_ATTR_PARM            -30
+#define H_PORT_PARM            -31
+#define H_MCG_PARM             -32
+#define H_VL_PARM              -33
+#define H_TSIZE_PARM           -34
+#define H_TRACE_PARM           -35
+
+#define H_MASK_PARM            -37
+#define H_MCG_FULL             -38
+#define H_ALIAS_EXIST          -39
+#define H_P_COUNTER            -40
+#define H_TABLE_FULL           -41
+#define H_ALT_TABLE            -42
+#define H_MR_CONDITION         -43
+#define H_NOT_ENOUGH_RESOURCES -44
+#define H_R_STATE              -45
+#define H_RESCINDEND           -46
+
+/* H call defines to be moved to kernel */
+#define H_RESET_EVENTS         0x15C
+#define H_ALLOC_RESOURCE       0x160
+#define H_FREE_RESOURCE        0x164
+#define H_MODIFY_QP            0x168
+#define H_QUERY_QP             0x16C
+#define H_REREGISTER_PMR       0x170
+#define H_REGISTER_SMR         0x174
+#define H_QUERY_MR             0x178
+#define H_QUERY_MW             0x17C
+#define H_QUERY_HCA            0x180
+#define H_QUERY_PORT           0x184
+#define H_MODIFY_PORT          0x188
+#define H_DEFINE_AQP1          0x18C
+#define H_GET_TRACE_BUFFER     0x190
+#define H_DEFINE_AQP0          0x194
+#define H_RESIZE_MR            0x198
+#define H_ATTACH_MCQP          0x19C
+#define H_DETACH_MCQP          0x1A0
+#define H_CREATE_RPT           0x1A4
+#define H_REMOVE_RPT           0x1A8
+#define H_REGISTER_RPAGES      0x1AC
+#define H_DISABLE_AND_GETC     0x1B0
+#define H_ERROR_DATA           0x1B4
+#define H_GET_HCA_INFO         0x1B8
+#define H_GET_PERF_COUNT       0x1BC
+#define H_MANAGE_TRACE         0x1C0
+#define H_QUERY_INT_STATE      0x1E4
+#endif
+
+#endif /* __EHCA_COMMON_H__ */


From rolandd at cisco.com  Sat Feb 18 11:57:25 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:25 -0800
Subject: [PATCH 09/22] ehca classes
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005725.13620.32014.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

The fact that ehca_cq_delete and ehca_qp_delete return an int seems
a little silly, given that the functions can never fail.

The code in ehca_classes.c seems like a misuse of the kmem_cache API;
rather than wrapping kmem_cache_alloc() and doing extra initialization,
why not just use the kmem_cache's constructor to do this?
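
To make both remarks concrete, here is a minimal user-space sketch, not
kernel code. It only models the idea: the cache itself carries a
constructor callback, so no per-type *_new() wrapper is needed, and the
free path returns void because it cannot fail. All names (demo_cache,
demo_cq, demo_cq_ctor, ...) are hypothetical and malloc() stands in for
the slab allocator. The sketch assumes, as with slab constructors, that
the constructor runs when an object is first created and not on every
allocation, so callers must hand objects back in their constructed state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_FREE 8

/* Hypothetical stand-in for struct ehca_cq; ints model the spinlocks. */
struct demo_cq {
	int lock_ready;		/* models spin_lock_init(&me->spinlock) */
	int cb_lock_ready;	/* models spin_lock_init(&me->cb_lock)  */
	unsigned int token;
};

/* Hypothetical mini-cache: object size, constructor, small free list. */
struct demo_cache {
	size_t size;
	void (*ctor)(void *obj);
	void *free_list[DEMO_MAX_FREE];
	int nr_free;
	int ctor_calls;
};

/* One-time initialisation that the *_new() wrappers currently repeat. */
static void demo_cq_ctor(void *obj)
{
	struct demo_cq *cq = obj;

	memset(cq, 0, sizeof(*cq));
	cq->lock_ready = 1;
	cq->cb_lock_ready = 1;
}

static void *demo_cache_alloc(struct demo_cache *c)
{
	void *obj;

	/* Recycled objects are already constructed; hand them out as-is. */
	if (c->nr_free > 0)
		return c->free_list[--c->nr_free];

	obj = malloc(c->size);
	if (obj == NULL)
		return NULL;
	if (c->ctor) {
		c->ctor(obj);
		c->ctor_calls++;
	}
	return obj;
}

/* Nothing here can fail, so there is no status to return. */
static void demo_cache_free(struct demo_cache *c, void *obj)
{
	assert(obj != NULL);
	if (c->nr_free < DEMO_MAX_FREE)
		c->free_list[c->nr_free++] = obj;
	else
		free(obj);
}
```

With this shape a caller allocates straight from the cache and gets an
object whose locks are already initialised. The cost is that fields
such as token are no longer zeroed on every allocation, so the owner
has to reset them before freeing.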
---

 drivers/infiniband/hw/ehca/ehca_classes.c         |  191 +++++++++++
 drivers/infiniband/hw/ehca/ehca_classes.h         |  369 +++++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_classes_core.h    |   73 ++++
 drivers/infiniband/hw/ehca/ehca_classes_pSeries.h |  256 +++++++++++++++
 4 files changed, 889 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.c b/drivers/infiniband/hw/ehca/ehca_classes.c
new file mode 100644
index 0000000..9819788
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_classes.c
@@ -0,0 +1,191 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  struct initialisations and allocation
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_classes.c,v 1.21 2006/02/06 16:20:38 schickhj Exp $
+ */
+
+#define DEB_PREFIX "clas"
+#include "ehca_kernel.h"
+
+#include "ehca_classes.h"
+
+struct ehca_pd *ehca_pd_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_pd *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_pd, SLAB_KERNEL);
+	if (me == NULL)
+		return NULL;
+
+	memset(me, 0, sizeof(struct ehca_pd));
+
+	return me;
+}
+
+void ehca_pd_delete(struct ehca_pd *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_pd, me);
+}
+
+struct ehca_cq *ehca_cq_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_cq *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_cq, SLAB_KERNEL);
+	if (me == NULL)
+		return NULL;
+
+	memset(me, 0, sizeof(struct ehca_cq));
+	spin_lock_init(&me->spinlock);
+	spin_lock_init(&me->cb_lock);
+
+	return me;
+}
+
+int ehca_cq_delete(struct ehca_cq *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_cq, me);
+
+	return H_Success;
+}
+
+struct ehca_qp *ehca_qp_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_qp *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_qp, SLAB_KERNEL);
+	if (me == NULL)
+		return NULL;
+
+	memset(me, 0, sizeof(struct ehca_qp));
+	spin_lock_init(&me->spinlock_s);
+	spin_lock_init(&me->spinlock_r);
+
+	return me;
+}
+
+int ehca_qp_delete(struct ehca_qp *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_qp, me);
+
+	return H_Success;
+}
+
+struct ehca_av *ehca_av_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_av *me;
+
+	me  = kmem_cache_alloc(ehca_module.cache_av, SLAB_KERNEL);
+	if (me == NULL)
+		return NULL;
+
+	memset(me, 0, sizeof(struct ehca_av));
+
+	return me;
+}
+
+int ehca_av_delete(struct ehca_av *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_av, me);
+
+	return H_Success;
+}
+
+struct ehca_mr *ehca_mr_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_mr *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_mr, SLAB_KERNEL);
+	if (me) {
+		memset(me, 0, sizeof(struct ehca_mr));
+		spin_lock_init(&me->mrlock);
+		EDEB_EX(7, "ehca_mr=%p sizeof(ehca_mr_t)=%x", me,
+			(u32) sizeof(struct ehca_mr));
+	} else {
+		EDEB_ERR(3, "alloc failed");
+	}
+
+	return me;
+}
+
+void ehca_mr_delete(struct ehca_mr *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_mr, me);
+}
+
+struct ehca_mw *ehca_mw_new(void)
+{
+	extern struct ehca_module ehca_module;
+	struct ehca_mw *me;
+
+	me = kmem_cache_alloc(ehca_module.cache_mw, SLAB_KERNEL);
+	if (me) {
+		memset(me, 0, sizeof(struct ehca_mw));
+		spin_lock_init(&me->mwlock);
+		EDEB_EX(7, "ehca_mw=%p sizeof(ehca_mw_t)=%x", me,
+			(u32) sizeof(struct ehca_mw));
+	} else {
+		EDEB_ERR(3, "alloc failed");
+	}
+
+	return me;
+}
+
+void ehca_mw_delete(struct ehca_mw *me)
+{
+	extern struct ehca_module ehca_module;
+
+	kmem_cache_free(ehca_module.cache_mw, me);
+}
+
diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
new file mode 100644
index 0000000..1d72aaf
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -0,0 +1,369 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  struct definitions for hcad internal structures
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_classes.h,v 1.80 2006/02/06 16:20:38 schickhj Exp $
+ */
+
+#ifndef __EHCA_CLASSES_H__
+#define __EHCA_CLASSES_H__
+
+#include "ehca_kernel.h"
+#include "ipz_pt_fn.h"
+
+#include <linux/list.h>
+
+struct ehca_module;
+struct ehca_qp;
+struct ehca_cq;
+struct ehca_eq;
+struct ehca_mr;
+struct ehca_mw;
+struct ehca_pd;
+struct ehca_av;
+
+#ifndef CONFIG_PPC64
+#ifndef Z_SERIES
+#error "no series defined"
+#endif
+#endif
+
+#ifdef CONFIG_PPC64
+#include "ehca_classes_pSeries.h"
+#endif
+
+#ifdef Z_SERIES
+#include "ehca_classes_zSeries.h"
+#endif
+
+#include <rdma/ib_verbs.h>
+#include <rdma/ib_user_verbs.h>
+
+#include "ehca_irq.h"
+
+#include "ehca_classes_core.h"
+
+/** @brief HCAD class
+ *
+ * contains HCAD specific data
+ *
+ */
+struct ehca_module {
+	struct list_head shca_list;
+	spinlock_t shca_lock;
+
+	kmem_cache_t *cache_pd;
+	kmem_cache_t *cache_cq;
+	kmem_cache_t *cache_qp;
+	kmem_cache_t *cache_av;
+	kmem_cache_t *cache_mr;
+	kmem_cache_t *cache_mw;
+
+	struct ehca_pfmodule pf; /* platform specific part of HCA */
+};
+
+/** @brief EQ class
+ */
+struct ehca_eq {
+	u32 length;		    /* length of EQ */
+	struct ipz_queue ipz_queue; /* EQ in kv     */
+	struct ipz_eq_handle ipz_eq_handle;
+	struct ehca_irq_info irq_info;
+	struct work_struct work;
+	struct h_galpas galpas;
+	int    is_initialized;
+
+	struct ehca_pfeq pf; /* platform specific part of EQ */
+
+	spinlock_t spinlock;
+};
+
+/** static port
+ */
+struct ehca_sport {
+	struct ib_cq *ibcq_aqp1; /* CQ for AQP1 */
+	struct ib_qp *ibqp_aqp1; /* QP for AQP1 */
+	enum ib_port_state port_state;
+};
+
+/** @brief HCA class "static HCA"
+ *
+ * contains HCA specific data per HCA (or vHCA?)
+ * per instance reported by firmware
+ *
+ */
+struct ehca_shca {
+	struct ib_device ib_device;
+	struct ibmebus_dev *ibmebus_dev;
+	u8 num_ports;
+	int hw_level;
+	struct list_head shca_list;
+	struct ipz_adapter_handle ipz_hca_handle; /* firmware HCA handle     */
+	struct ehca_bridge_handle bridge;
+	struct ehca_sport sport[2];
+	struct ehca_eq eq;	/* event queue                        */
+	struct ehca_eq neq;	/* notification event queue           */
+	struct ehca_mr *maxmr;	/* internal max MR (for kernel users) */
+	struct ehca_pd *pd;	/* internal pd (for kernel users)     */
+	struct ehca_pfshca pf;	/* platform specific part of HCA      */
+	struct h_galpas galpas;
+};
+
+/** @brief protection domain
+ */
+struct ehca_pd {
+	struct ib_pd ib_pd;	/* gen2 pd, must always be first in ehca_pd */
+	struct ipz_pd fw_pd;
+	struct ehca_pfpd pf;
+};
+
+/** @brief QP class
+ */
+struct ehca_qp {
+	struct ib_qp ib_qp;	/* gen2 qp, must always be first in ehca_qp */
+	struct ehca_qp_core ehca_qp_core;	/* common fields for
+						   user/kernel space */
+	u32 token;
+	spinlock_t spinlock_s;
+	spinlock_t spinlock_r;
+	u32 sq_max_inline_data_size; /* max inline data size that can be sent */
+	struct ipz_qp_handle ipz_qp_handle; /* QP handle for h-calls            */
+	struct ehca_pfqp pf;	     /* platform specific part of QP     */
+	struct ib_qp_init_attr init_attr;
+	/* adr mapping for s/r queues and fw handle bw kernel&user space */
+	u64 uspace_squeue;
+	u64 uspace_rqueue;
+	u64 uspace_fwh;
+	struct ehca_cq* send_cq;
+	unsigned int sqerr_purgeflag;
+	struct list_head list_entries;
+};
+
+#define QP_HASHTAB_LEN 7
+/** @brief CQ class
+ */
+struct ehca_cq {
+	struct ib_cq ib_cq;	/* gen2 cq, must always be first
+				   in ehca_cq */
+	struct ehca_cq_core ehca_cq_core; /* common fields for
+					     user/kernel space */
+	spinlock_t spinlock;
+	u32 cq_number;
+	u32 token;
+	u32 nr_of_entries;
+	/* fw specific data common for p+z */
+	struct ipz_cq_handle ipz_cq_handle;	/* CQ handle for h-calls */
+	/* pf specific code */
+	struct ehca_pfcq pf;            /* platform specific part of CQ */
+	spinlock_t cb_lock;             /* completion event handler */
+	/* adr mapping for queue and fw handle bw kernel&user space */
+	u64 uspace_queue;
+	u64 uspace_fwh;
+	struct list_head qp_hashtab[QP_HASHTAB_LEN];
+};
+
+
+/** @brief MR flags
+ */
+enum ehca_mr_flag {
+	EHCA_MR_FLAG_FMR = 0x80000000,	 /* FMR, created with ehca_alloc_fmr */
+	EHCA_MR_FLAG_MAXMR = 0x40000000, /* max-MR                           */
+	EHCA_MR_FLAG_USER = 0x20000000	 /* user space TODO...necessary????. */
+};
+
+/** @brief MR class
+ */
+struct ehca_mr {
+	union {
+		struct ib_mr ib_mr;	/* must always be first in ehca_mr */
+		struct ib_fmr ib_fmr;	/* must always be first in ehca_mr */
+	} ib;
+
+	spinlock_t mrlock;
+
+	/* !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+	 * !!! ehca_mr_deletenew() memsets from flags to end of structure
+	 * !!! DON'T move flags or insert another field before.
+	 * !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! */
+
+	enum ehca_mr_flag flags;
+	u32 num_pages;		/* number of MR pages */
+	int acl;		/* ACL (stored here for usage in reregister) */
+	u64 *start;		/* virtual start address (stored here for */
+	/* usage in reregister) */
+	u64 size;		/* size (stored here for usage in reregister) */
+	u32 fmr_page_size;	/* page size for FMR */
+	u32 fmr_max_pages;	/* max pages for FMR */
+	u32 fmr_max_maps;	/* max outstanding maps for FMR */
+	u32 fmr_map_cnt;	/* map counter for FMR */
+	/* fw specific data */
+	struct ipz_mrmw_handle ipz_mr_handle;	/* MR handle for h-calls */
+	struct h_galpas galpas;
+	/* data for userspace bridge */
+	u32 nr_of_pages;
+	void *pagearray;
+
+	struct ehca_pfmr pf;	/* platform specific part of MR */
+};
+
+/** @brief MW class
+ */
+struct ehca_mw {
+	struct ib_mw ib_mw;	/* gen2 mw, must always be first in ehca_mw */
+	spinlock_t mwlock;
+
+	u8 never_bound;		/* indication MW was never bound */
+	struct ipz_mrmw_handle ipz_mw_handle;	/* MW handle for h-calls */
+	struct h_galpas galpas;
+
+	struct ehca_pfmw pf;	/* platform specific part of MW */
+};
+
+/** @brief MR page info type
+ */
+enum ehca_mr_pgi_type {
+	EHCA_MR_PGI_PHYS   = 1,  /* type of ehca_reg_phys_mr,
+				  * ehca_rereg_phys_mr,
+				  * ehca_reg_internal_maxmr */
+	EHCA_MR_PGI_USER   = 2,  /* type of ehca_reg_user_mr */
+	EHCA_MR_PGI_FMR    = 3   /* type of ehca_map_phys_fmr */
+};
+
+/** @brief MR page info
+ */
+struct ehca_mr_pginfo {
+	enum ehca_mr_pgi_type type;
+	u64 num_pages;
+	u64 page_count;
+
+	/* type EHCA_MR_PGI_PHYS section */
+	int num_phys_buf;
+	struct ib_phys_buf *phys_buf_array;
+	u64 next_buf;
+	u64 next_page;
+
+	/* type EHCA_MR_PGI_USER section */
+	struct ib_umem *region;
+	struct ib_umem_chunk *next_chunk;
+	u64 next_nmap;
+
+	/* type EHCA_MR_PGI_FMR section */
+	u64 *page_list;
+	u64 next_listelem;
+};
+
+
+/** @brief address vector suitable for a ud enqueue request
+ */
+struct ehca_av {
+	struct ib_ah ib_ah;	/* gen2 ah, must always be first in ehca_av */
+	struct ehca_ud_av av;
+};
+
+/** @brief user context
+ */
+struct ehca_ucontext {
+	struct ib_ucontext ib_ucontext;
+};
+
+struct ehca_module *ehca_module_new(void);
+
+int ehca_module_delete(struct ehca_module *me);
+
+int ehca_eq_ctor(struct ehca_eq *eq);
+
+int ehca_eq_dtor(struct ehca_eq *eq);
+
+struct ehca_shca *ehca_shca_new(void);
+
+int ehca_shca_delete(struct ehca_shca *me);
+
+struct ehca_sport *ehca_sport_new(struct ehca_shca *anchor);	/*anchor?? */
+
+struct ehca_cq *ehca_cq_new(void);
+
+int ehca_cq_delete(struct ehca_cq *me);
+
+struct ehca_av *ehca_av_new(void);
+
+int ehca_av_delete(struct ehca_av *me);
+
+struct ehca_pd *ehca_pd_new(void);
+
+void ehca_pd_delete(struct ehca_pd *me);
+
+struct ehca_qp *ehca_qp_new(void);
+
+int ehca_qp_delete(struct ehca_qp *me);
+
+struct ehca_mr *ehca_mr_new(void);
+
+void ehca_mr_delete(struct ehca_mr *me);
+
+struct ehca_mw *ehca_mw_new(void);
+
+void ehca_mw_delete(struct ehca_mw *me);
+
+extern struct rw_semaphore ehca_qp_idr_sem;
+extern struct rw_semaphore ehca_cq_idr_sem;
+extern struct idr ehca_qp_idr;
+extern struct idr ehca_cq_idr;
+
+/*
+ * resp structs for comm bw user and kernel space
+ */
+struct ehca_create_cq_resp {
+	u32 cq_number;
+	u32 token;
+	struct ehca_cq_core ehca_cq_core;
+};
+
+struct ehca_create_qp_resp {
+	u32 qp_num;
+	u32 token;
+	struct ehca_qp_core ehca_qp_core;
+};
+
+/*
+ * helper funcs to link send cq and qp
+ */
+int ehca_cq_assign_qp(struct ehca_cq *cq, struct ehca_qp *qp);
+int ehca_cq_unassign_qp(struct ehca_cq *cq, unsigned int qp_num);
+struct ehca_qp* ehca_cq_get_qp(struct ehca_cq *cq, int qp_num);
+
+#endif				/* __EHCA_CLASSES_H__ */
diff --git a/drivers/infiniband/hw/ehca/ehca_classes_core.h b/drivers/infiniband/hw/ehca/ehca_classes_core.h
new file mode 100644
index 0000000..5e864b3
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_classes_core.h
@@ -0,0 +1,73 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  core struct definitions for hcad internal structures and
+ *  to be used/compiled commonly in user and kernel space
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_classes_core.h,v 1.12 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_CLASSES_CORE_H__
+#define __EHCA_CLASSES_CORE_H__
+
+#include "ipz_pt_fn_core.h"
+#include "ehca_galpa.h"
+
+/** @brief qp core contains common fields for user/kernel space
+ */
+struct ehca_qp_core {
+	/* kernel space: enum ib_qp_type, user space: enum ibv_qp_type */
+	int qp_type;
+	int dummy1; /* 8 byte alignment */
+	struct ipz_queue ipz_squeue;
+	struct ipz_queue ipz_rqueue;
+	struct h_galpas galpas;
+	unsigned int qkey;
+	int dummy2; /* 8 byte alignment */
+	/* qp_num assigned by ehca: sqp0/1 may have got different numbers */
+	unsigned int real_qp_num;
+};
+
+/** @brief cq core contains common fields for user/kernel space
+ */
+struct ehca_cq_core {
+	struct ipz_queue ipz_queue;
+	struct h_galpas galpas;
+};
+
+#endif /* __EHCA_CLASSES_CORE_H__ */
diff --git a/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
new file mode 100644
index 0000000..8f86137
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
@@ -0,0 +1,256 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  pSeries interface definitions
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_classes_pSeries.h,v 1.24 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_CLASSES_PSERIES_H__
+#define __EHCA_CLASSES_PSERIES_H__
+
+#include "ehca_galpa.h"
+#include "ipz_pt_fn.h"
+
+
+struct ehca_pfmodule {
+};
+
+struct ehca_pfshca {
+};
+
+struct ehca_pfqp {
+	struct ipz_qpt sqpt;
+	struct ipz_qpt rqpt;
+	struct ehca_bridge_handle bridge;
+};
+
+struct ehca_pfcq {
+	struct ipz_qpt qpt;
+	struct ehca_bridge_handle bridge;
+	u32 cqnr;
+};
+
+struct ehca_pfeq {
+	struct ipz_qpt qpt;
+	struct ehca_bridge_handle bridge;
+	struct h_galpa galpa;
+	u32 eqnr;
+};
+
+struct ehca_pfpd {
+};
+
+struct ehca_pfmr {
+	struct ehca_bridge_handle bridge;
+};
+struct ehca_pfmw {
+};
+
+struct ipz_adapter_handle {
+	u64 handle;
+};
+
+struct ipz_cq_handle {
+	u64 handle;
+};
+
+struct ipz_eq_handle {
+	u64 handle;
+};
+
+struct ipz_qp_handle {
+	u64 handle;
+};
+struct ipz_mrmw_handle {
+	u64 handle;
+};
+
+struct ipz_pd {
+	u32 value;
+};
+
+struct hcp_modify_qp_control_block {
+	u32 qkey;                      /* 00 */
+	u32 rdd;                       /* reliable datagram domain */
+	u32 send_psn;                  /* 02 */
+	u32 receive_psn;               /* 03 */
+	u32 prim_phys_port;            /* 04 */
+	u32 alt_phys_port;             /* 05 */
+	u32 prim_p_key_idx;            /* 06 */
+	u32 alt_p_key_idx;             /* 07 */
+	u32 rdma_atomic_ctrl;          /* 08 */
+	u32 qp_state;                  /* 09 */
+	u32 reserved_10;               /* 10 */
+	u32 rdma_nr_atomic_resp_res;   /* 11 */
+	u32 path_migration_state;      /* 12 */
+	u32 rdma_atomic_outst_dest_qp; /* 13 */
+	u32 dest_qp_nr;                /* 14 */
+	u32 min_rnr_nak_timer_field;   /* 15 */
+	u32 service_level;             /* 16 */
+	u32 send_grh_flag;             /* 17 */
+	u32 retry_count;               /* 18 */
+	u32 timeout;                   /* 19 */
+	u32 path_mtu;                  /* 20 */
+	u32 max_static_rate;           /* 21 */
+	u32 dlid;                      /* 22 */
+	u32 rnr_retry_count;           /* 23 */
+	u32 source_path_bits;          /* 24 */
+	u32 traffic_class;             /* 25 */
+	u32 hop_limit;                 /* 26 */
+	u32 source_gid_idx;            /* 27 */
+	u32 flow_label;                /* 28 */
+	u32 reserved_29;               /* 29 */
+	union {                        /* 30 */
+		u64 dw[2];
+		u8 byte[16];
+	} dest_gid;
+	u32 service_level_al;          /* 34 */
+	u32 send_grh_flag_al;          /* 35 */
+	u32 retry_count_al;            /* 36 */
+	u32 timeout_al;                /* 37 */
+	u32 max_static_rate_al;        /* 38 */
+	u32 dlid_al;                   /* 39 */
+	u32 rnr_retry_count_al;        /* 40 */
+	u32 source_path_bits_al;       /* 41 */
+	u32 traffic_class_al;          /* 42 */
+	u32 hop_limit_al;              /* 43 */
+	u32 source_gid_idx_al;         /* 44 */
+	u32 flow_label_al;             /* 45 */
+	u32 reserved_46;               /* 46 */
+	u32 reserved_47;               /* 47 */
+	union {                        /* 48 */
+		u64 dw[2];
+		u8 byte[16];
+	} dest_gid_al;
+	u32 max_nr_outst_send_wr;      /* 52 */
+	u32 max_nr_outst_recv_wr;      /* 53 */
+	u32 disable_ete_credit_check;  /* 54 */
+	u32 qp_number;                 /* 55 */
+	u64 send_queue_handle;         /* 56 */
+	u64 recv_queue_handle;         /* 58 */
+	u32 actual_nr_sges_in_sq_wqe;  /* 60 */
+	u32 actual_nr_sges_in_rq_wqe;  /* 61 */
+	u32 qp_enable;                 /* 62 */
+	u32 curr_srq_limit;            /* 63 */
+	u64 qp_aff_asyn_ev_log_reg;    /* 64 */
+	u64 shared_rq_hndl;            /* 66 */
+	u64 trigg_doorbell_qp_hndl;    /* 68 */
+	u32 reserved_70_127[58];       /* 70 */
+};
+
+#define MQPCB_MASK_QKEY                         EHCA_BMASK_IBM(0,0)
+#define MQPCB_MASK_SEND_PSN                     EHCA_BMASK_IBM(2,2)
+#define MQPCB_MASK_RECEIVE_PSN                  EHCA_BMASK_IBM(3,3)
+#define MQPCB_MASK_PRIM_PHYS_PORT               EHCA_BMASK_IBM(4,4)
+#define MQPCB_PRIM_PHYS_PORT                    EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_ALT_PHYS_PORT                EHCA_BMASK_IBM(5,5)
+#define MQPCB_MASK_PRIM_P_KEY_IDX               EHCA_BMASK_IBM(6,6)
+#define MQPCB_PRIM_P_KEY_IDX                    EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_ALT_P_KEY_IDX                EHCA_BMASK_IBM(7,7)
+#define MQPCB_MASK_RDMA_ATOMIC_CTRL             EHCA_BMASK_IBM(8,8)
+#define MQPCB_MASK_QP_STATE                     EHCA_BMASK_IBM(9,9)
+#define MQPCB_QP_STATE                          EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_RDMA_NR_ATOMIC_RESP_RES      EHCA_BMASK_IBM(11,11)
+#define MQPCB_MASK_PATH_MIGRATION_STATE         EHCA_BMASK_IBM(12,12)
+#define MQPCB_MASK_RDMA_ATOMIC_OUTST_DEST_QP    EHCA_BMASK_IBM(13,13)
+#define MQPCB_MASK_DEST_QP_NR                   EHCA_BMASK_IBM(14,14)
+#define MQPCB_MASK_MIN_RNR_NAK_TIMER_FIELD      EHCA_BMASK_IBM(15,15)
+#define MQPCB_MASK_SERVICE_LEVEL                EHCA_BMASK_IBM(16,16)
+#define MQPCB_MASK_SEND_GRH_FLAG                EHCA_BMASK_IBM(17,17)
+#define MQPCB_MASK_RETRY_COUNT                  EHCA_BMASK_IBM(18,18)
+#define MQPCB_MASK_TIMEOUT                      EHCA_BMASK_IBM(19,19)
+#define MQPCB_MASK_PATH_MTU                     EHCA_BMASK_IBM(20,20)
+#define MQPCB_PATH_MTU                          EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_MAX_STATIC_RATE              EHCA_BMASK_IBM(21,21)
+#define MQPCB_MAX_STATIC_RATE                   EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_DLID                         EHCA_BMASK_IBM(22,22)
+#define MQPCB_DLID                              EHCA_BMASK_IBM(16,31)
+#define MQPCB_MASK_RNR_RETRY_COUNT              EHCA_BMASK_IBM(23,23)
+#define MQPCB_RNR_RETRY_COUNT                   EHCA_BMASK_IBM(29,31)
+#define MQPCB_MASK_SOURCE_PATH_BITS             EHCA_BMASK_IBM(24,24)
+#define MQPCB_SOURCE_PATH_BITS                  EHCA_BMASK_IBM(25,31)
+#define MQPCB_MASK_TRAFFIC_CLASS                EHCA_BMASK_IBM(25,25)
+#define MQPCB_TRAFFIC_CLASS                     EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_HOP_LIMIT                    EHCA_BMASK_IBM(26,26)
+#define MQPCB_HOP_LIMIT                         EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_SOURCE_GID_IDX               EHCA_BMASK_IBM(27,27)
+#define MQPCB_SOURCE_GID_IDX                    EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_FLOW_LABEL                   EHCA_BMASK_IBM(28,28)
+#define MQPCB_FLOW_LABEL                        EHCA_BMASK_IBM(12,31)
+#define MQPCB_MASK_DEST_GID                     EHCA_BMASK_IBM(30,30)
+#define MQPCB_MASK_SERVICE_LEVEL_AL             EHCA_BMASK_IBM(31,31)
+#define MQPCB_SERVICE_LEVEL_AL                  EHCA_BMASK_IBM(28,31)
+#define MQPCB_MASK_SEND_GRH_FLAG_AL             EHCA_BMASK_IBM(32,32)
+#define MQPCB_SEND_GRH_FLAG_AL                  EHCA_BMASK_IBM(31,31)
+#define MQPCB_MASK_RETRY_COUNT_AL               EHCA_BMASK_IBM(33,33)
+#define MQPCB_RETRY_COUNT_AL                    EHCA_BMASK_IBM(29,31)
+#define MQPCB_MASK_TIMEOUT_AL                   EHCA_BMASK_IBM(34,34)
+#define MQPCB_TIMEOUT_AL                        EHCA_BMASK_IBM(27,31)
+#define MQPCB_MASK_MAX_STATIC_RATE_AL           EHCA_BMASK_IBM(35,35)
+#define MQPCB_MAX_STATIC_RATE_AL                EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_DLID_AL                      EHCA_BMASK_IBM(36,36)
+#define MQPCB_DLID_AL                           EHCA_BMASK_IBM(16,31)
+#define MQPCB_MASK_RNR_RETRY_COUNT_AL           EHCA_BMASK_IBM(37,37)
+#define MQPCB_RNR_RETRY_COUNT_AL                EHCA_BMASK_IBM(29,31)
+#define MQPCB_MASK_SOURCE_PATH_BITS_AL          EHCA_BMASK_IBM(38,38)
+#define MQPCB_SOURCE_PATH_BITS_AL               EHCA_BMASK_IBM(25,31)
+#define MQPCB_MASK_TRAFFIC_CLASS_AL             EHCA_BMASK_IBM(39,39)
+#define MQPCB_TRAFFIC_CLASS_AL                  EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_HOP_LIMIT_AL                 EHCA_BMASK_IBM(40,40)
+#define MQPCB_HOP_LIMIT_AL                      EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_SOURCE_GID_IDX_AL            EHCA_BMASK_IBM(41,41)
+#define MQPCB_SOURCE_GID_IDX_AL                 EHCA_BMASK_IBM(24,31)
+#define MQPCB_MASK_FLOW_LABEL_AL                EHCA_BMASK_IBM(42,42)
+#define MQPCB_FLOW_LABEL_AL                     EHCA_BMASK_IBM(12,31)
+#define MQPCB_MASK_DEST_GID_AL                  EHCA_BMASK_IBM(44,44)
+#define MQPCB_MASK_MAX_NR_OUTST_SEND_WR         EHCA_BMASK_IBM(45,45)
+#define MQPCB_MAX_NR_OUTST_SEND_WR              EHCA_BMASK_IBM(16,31)
+#define MQPCB_MASK_MAX_NR_OUTST_RECV_WR         EHCA_BMASK_IBM(46,46)
+#define MQPCB_MAX_NR_OUTST_RECV_WR              EHCA_BMASK_IBM(16,31)
+#define MQPCB_MASK_DISABLE_ETE_CREDIT_CHECK     EHCA_BMASK_IBM(47,47)
+#define MQPCB_DISABLE_ETE_CREDIT_CHECK          EHCA_BMASK_IBM(31,31)
+#define MQPCB_QP_NUMBER                         EHCA_BMASK_IBM(8,31)
+#define MQPCB_MASK_QP_ENABLE                    EHCA_BMASK_IBM(48,48)
+#define MQPCB_QP_ENABLE                         EHCA_BMASK_IBM(31,31)
+#define MQPCB_MASK_CURR_SQR_LIMIT               EHCA_BMASK_IBM(49,49)
+#define MQPCB_CURR_SQR_LIMIT                    EHCA_BMASK_IBM(15,31)
+#define MQPCB_MASK_QP_AFF_ASYN_EV_LOG_REG       EHCA_BMASK_IBM(50,50)
+#define MQPCB_MASK_SHARED_RQ_HNDL               EHCA_BMASK_IBM(51,51)
+
+#endif /* __EHCA_CLASSES_PSERIES_H__ */


From rolandd at cisco.com  Sat Feb 18 11:57:39 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:39 -0800
Subject: [PATCH 12/22] ehca low-level verbs
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005739.13620.15633.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

What is ehca_init_module()?  It is declared but never defined.
---

 drivers/infiniband/hw/ehca/ehca_iverbs.h |  163 ++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_qes.h    |  274 ++++++++++++++++++++++++++++++
 2 files changed, 437 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
new file mode 100644
index 0000000..b1319a9
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
@@ -0,0 +1,163 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Function definitions for internal functions
+ *
+ *  Authors: Heiko J Schick <schickhj at de.ibm.com>
+ *           Khadija Souissi <souissik at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_iverbs.h,v 1.32 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __EHCA_IVERBS_H__
+#define __EHCA_IVERBS_H__
+
+#include "ehca_classes.h"
+/** ehca internal verb for test use
+ */
+void ehca_init_module(void);
+
+int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props);
+int ehca_query_port(struct ib_device *ibdev,
+		    u8 port, struct ib_port_attr *props);
+int ehca_query_pkey(struct ib_device *ibdev, u8 port, u16 index, u16 * pkey);
+int ehca_query_gid(struct ib_device *ibdev, u8 port,
+		   int index, union ib_gid *gid);
+int ehca_modify_port(struct ib_device *ibdev,
+		     u8 port, int port_modify_mask,
+		     struct ib_port_modify *props);
+
+struct ib_pd *ehca_alloc_pd(struct ib_device *device,
+			    struct ib_ucontext *context,
+			    struct ib_udata *udata);
+
+int ehca_dealloc_pd(struct ib_pd *pd);
+
+struct ib_ah *ehca_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr);
+int ehca_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr);
+int ehca_query_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr);
+int ehca_destroy_ah(struct ib_ah *ah);
+
+struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe,
+			     struct ib_ucontext *context,
+			     struct ib_udata *udata);
+int ehca_resize_cq(struct ib_cq *cq, int cqe);
+
+int ehca_destroy_cq(struct ib_cq *cq);
+
+int ehca_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc);
+
+int ehca_peek_cq(struct ib_cq *cq, int wc_cnt);
+
+int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify);
+
+struct ib_qp *ehca_create_qp(struct ib_pd *pd,
+			     struct ib_qp_init_attr *init_attr,
+			     struct ib_udata *udata);
+
+u64 ehca_define_sqp(struct ehca_shca *shca, struct ehca_qp *ibqp,
+			       struct ib_qp_init_attr *qp_init_attr);
+
+int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask);
+
+int ehca_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
+		  int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
+
+int ehca_destroy_qp(struct ib_qp *qp);
+
+int ehca_post_send(struct ib_qp *qp,
+		   struct ib_send_wr *send_wr, struct ib_send_wr **bad_send_wr);
+
+int ehca_post_recv(struct ib_qp *qp,
+		   struct ib_recv_wr *recv_wr, struct ib_recv_wr **bad_recv_wr);
+
+struct ib_mr *ehca_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
+
+struct ib_mr *ehca_reg_phys_mr(struct ib_pd *pd,
+			       struct ib_phys_buf *phys_buf_array,
+			       int num_phys_buf,
+			       int mr_access_flags, u64 *iova_start);
+
+struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd,
+			       struct ib_umem *region,
+			       int mr_access_flags, struct ib_udata *udata);
+
+int ehca_rereg_phys_mr(struct ib_mr *mr,
+		       int mr_rereg_mask,
+		       struct ib_pd *pd,
+		       struct ib_phys_buf *phys_buf_array,
+		       int num_phys_buf, int mr_access_flags, u64 *iova_start);
+
+int ehca_query_mr(struct ib_mr *mr, struct ib_mr_attr *mr_attr);
+
+int ehca_dereg_mr(struct ib_mr *mr);
+
+struct ib_mw *ehca_alloc_mw(struct ib_pd *pd);
+
+int ehca_bind_mw(struct ib_qp *qp,
+		 struct ib_mw *mw, struct ib_mw_bind *mw_bind);
+
+int ehca_dealloc_mw(struct ib_mw *mw);
+
+struct ib_fmr *ehca_alloc_fmr(struct ib_pd *pd,
+			      int mr_access_flags,
+			      struct ib_fmr_attr *fmr_attr);
+
+int ehca_map_phys_fmr(struct ib_fmr *fmr,
+		      u64 *page_list, int list_len, u64 iova);
+
+int ehca_unmap_fmr(struct list_head *fmr_list);
+
+int ehca_dealloc_fmr(struct ib_fmr *fmr);
+
+int ehca_attach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid);
+
+int ehca_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid);
+
+struct ib_ucontext *ehca_alloc_ucontext(struct ib_device *device,
+					struct ib_udata *udata);
+int ehca_dealloc_ucontext(struct ib_ucontext *context);
+
+int ehca_mmap(struct ib_ucontext *context, struct vm_area_struct *vma);
+
+int ehca_poll_eqs(void *data);
+
+int ehca_mmap_nopage(u64 foffset, u64 length, void **mapped, struct vm_area_struct **vma);
+int ehca_mmap_register(u64 physical, void **mapped, struct vm_area_struct **vma);
+int ehca_munmap(unsigned long addr, size_t len);
+
+#endif
diff --git a/drivers/infiniband/hw/ehca/ehca_qes.h b/drivers/infiniband/hw/ehca/ehca_qes.h
new file mode 100644
index 0000000..e9420e3
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_qes.h
@@ -0,0 +1,274 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Hardware request structures
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_qes.h,v 1.9 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+
+#ifndef _EHCA_QES_H_
+#define _EHCA_QES_H_
+
+/** DON'T include any kernel related files here!!!
+ * This file is used commonly in user and kernel space!!!
+ */
+
+/**
+ * virtual scatter gather entry to specify remote addresses with length
+ */
+struct ehca_vsgentry {
+	u64 vaddr;
+	u32 lkey;
+	u32 length;
+};
+
+#define GRH_FLAG_MASK        EHCA_BMASK_IBM(7,7)
+#define GRH_IPVERSION_MASK   EHCA_BMASK_IBM(0,3)
+#define GRH_TCLASS_MASK      EHCA_BMASK_IBM(4,12)
+#define GRH_FLOWLABEL_MASK   EHCA_BMASK_IBM(13,31)
+#define GRH_PAYLEN_MASK      EHCA_BMASK_IBM(32,47)
+#define GRH_NEXTHEADER_MASK  EHCA_BMASK_IBM(48,55)
+#define GRH_HOPLIMIT_MASK    EHCA_BMASK_IBM(56,63)
+
+/**
+ * Unreliable Datagram Address Vector Format
+ * see IBTA Vol1 chapter 8.3 Global Routing Header
+ */
+struct ehca_ud_av {
+	u8 sl;
+	u8 lnh;
+	u16 dlid;
+	u8 reserved1;
+	u8 reserved2;
+	u8 reserved3;
+	u8 slid_path_bits;
+	u8 reserved4;
+	u8 ipd;
+	u8 reserved5;
+	u8 pmtu;
+	u32 reserved6;
+	u64 reserved7;
+	union {
+		struct {
+			u64 word_0; /* always set to 6  */
+			/*should be 0x1B for IB transport */
+			u64 word_1;
+			u64 word_2;
+			u64 word_3;
+			u64 word_4;
+		} grh;
+		struct {
+			u32 wd_0;
+			u32 wd_1;
+			/* DWord_1 --> SGID */
+
+			u32 sgid_wd3;
+			/* bits 127 - 96       */
+
+			u32 sgid_wd2;
+			/* bits  95 - 64 */
+			/* DWord_2 */
+
+			u32 sgid_wd1;
+			/* bits  63 - 32 */
+
+			u32 sgid_wd0;
+			/* bits  31 -  0 */
+			/* DWord_3 --> DGID */
+
+			u32 dgid_wd3;
+			/* bits 127 - 96 */
+
+			u32 dgid_wd2;
+			/* bits  95 - 64 */
+			/* DWord_4 */
+			u32 dgid_wd1;
+			/* bits  63 - 32 */
+
+			u32 dgid_wd0;
+			/* bits  31 -  0    */
+		} grh_l;
+	};
+};
+
+/* maximum number of sg entries allowed in a WQE */
+#define MAX_WQE_SG_ENTRIES 252
+
+#define  WQE_OPTYPE_SEND             0x80
+#define  WQE_OPTYPE_RDMAREAD         0x40
+#define  WQE_OPTYPE_RDMAWRITE        0x20
+#define  WQE_OPTYPE_CMPSWAP          0x10
+#define  WQE_OPTYPE_FETCHADD         0x08
+#define  WQE_OPTYPE_BIND             0x04
+
+#define  WQE_WRFLAG_REQ_SIGNAL_COM   0x80
+#define  WQE_WRFLAG_FENCE            0x40
+#define  WQE_WRFLAG_IMM_DATA_PRESENT 0x20
+#define  WQE_WRFLAG_SOLIC_EVENT      0x10
+
+#define  WQEF_CACHE_HINT             0x80
+#define  WQEF_CACHE_HINT_RD_WR       0x40
+#define  WQEF_TIMED_WQE              0x20
+#define  WQEF_PURGE                  0x08
+
+#define MW_BIND_ACCESSCTRL_R_WRITE   0x40
+#define MW_BIND_ACCESSCTRL_R_READ    0x20
+#define MW_BIND_ACCESSCTRL_R_ATOMIC  0x10
+
+struct ehca_wqe {
+	u64 work_request_id;
+	u8 optype;
+	u8 wr_flag;
+	u16 pkeyi;
+	u8 wqef;
+	u8 nr_of_data_seg;
+	u16 wqe_provided_slid;
+	u32 destination_qp_number;
+	u32 resync_psn_sqp;
+	u32 local_ee_context_qkey;
+	u32 immediate_data;
+	union {
+		struct {
+			u64 remote_virtual_adress;
+			u32 rkey;
+			u32 reserved;
+			u64 atomic_1st_op_dma_len;
+			u64 atomic_2nd_op;
+			struct ehca_vsgentry sg_list[MAX_WQE_SG_ENTRIES];
+
+		} nud;
+		struct {
+			u64 ehca_ud_av_ptr;
+			u64 reserved1;
+			u64 reserved2;
+			u64 reserved3;
+			struct ehca_vsgentry sg_list[MAX_WQE_SG_ENTRIES];
+		} ud_avp;
+		struct {
+			struct ehca_ud_av ud_av;
+			struct ehca_vsgentry sg_list[MAX_WQE_SG_ENTRIES -
+						     2];
+		} ud_av;
+		struct {
+			u64 reserved0;
+			u64 reserved1;
+			u64 reserved2;
+			u64 reserved3;
+			struct ehca_vsgentry sg_list[MAX_WQE_SG_ENTRIES];
+		} all_rcv;
+
+		struct {
+			u64 reserved;
+			u32 rkey;
+			u32 old_rkey;
+			u64 reserved1;
+			u64 reserved2;
+			u64 virtual_address;
+			u32 reserved3;
+			u32 length;
+			u32 reserved4;
+			u16 reserved5;
+			u8 reserved6;
+			u8 lr_ctl;
+			u32 lkey;
+			u32 reserved7;
+			u64 reserved8;
+			u64 reserved9;
+			u64 reserved10;
+			u64 reserved11;
+		} bind;
+		struct {
+			u64 reserved12;
+			u64 reserved13;
+			u32 size;
+			u32 start;
+		} inline_data;
+	} u;
+
+};
+
+#define WC_SEND_RECEIVE EHCA_BMASK_IBM(0,0)
+#define WC_IMM_DATA     EHCA_BMASK_IBM(1,1)
+#define WC_GRH_PRESENT  EHCA_BMASK_IBM(2,2)
+#define WC_SE_BIT       EHCA_BMASK_IBM(3,3)
+
+struct ehca_cqe {
+	u64 work_request_id;
+	u8 optype;
+	u8 w_completion_flags;
+	u16 reserved1;
+	u32 nr_bytes_transferred;
+	u32 immediate_data;
+	u32 local_qp_number;
+	u8 freed_resource_count;
+	u8 service_level;
+	u16 wqe_count;
+	u32 qp_token;
+	u32 qkey_ee_token;
+	u32 remote_qp_number;
+	u16 dlid;
+	u16 rlid;
+	u16 reserved2;
+	u16 pkey_index;
+	u32 cqe_timestamp;
+	u32 wqe_timestamp;
+	u8 wqe_timestamp_valid;
+	u8 reserved3;
+	u8 reserved4;
+	u8 cqe_flags;
+	u32 status;
+};
+
+struct ehca_eqe {
+	u64 entry;
+};
+
+struct ehca_mrte {
+	u64 starting_va;
+	u64 length; /* length of memory region in bytes */
+	u32 pd;
+	u8 key_instance;
+	u8 pagesize;
+	u8 mr_control;
+	u8 local_remote_access_ctrl;
+	u8 reserved[0x20 - 0x18];
+	u64 at_pointer[4];
+};
+#endif /*_EHCA_QES_H_*/
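
A note on the `MAX_WQE_SG_ENTRIES - 2` in the ud_av member of struct ehca_wqe: it looks like it compensates for the embedded address vector being 32 bytes (two scatter/gather entries) larger than the headers of the other union members. The userspace sketch below checks that reading. The demo_* names are hypothetical stand-ins that only copy the field widths from the patch, the 32-byte header is simplified to four u64 words, and the sizes assume natural alignment with no extra padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_SG 252		/* mirrors MAX_WQE_SG_ENTRIES */

struct demo_vsgentry {		/* mirrors struct ehca_vsgentry */
	uint64_t vaddr;
	uint32_t lkey;
	uint32_t length;
};

struct demo_ud_av {		/* same field widths as struct ehca_ud_av */
	uint8_t  sl, lnh;
	uint16_t dlid;
	uint8_t  reserved1, reserved2, reserved3, slid_path_bits;
	uint8_t  reserved4, ipd, reserved5, pmtu;
	uint32_t reserved6;
	uint64_t reserved7;
	union {
		uint64_t grh[5];	/* five u64 words */
		uint32_t grh_l[10];	/* the same bytes as ten u32 words */
	} u;
};

/* Simplified 32-byte header, as in the nud, ud_avp and all_rcv members. */
struct demo_hdr32 {
	uint64_t hdr[4];
	struct demo_vsgentry sg_list[DEMO_MAX_SG];
};

/* Header that embeds the address vector, as in the ud_av member. */
struct demo_hdr_av {
	struct demo_ud_av ud_av;
	struct demo_vsgentry sg_list[DEMO_MAX_SG - 2];
};

/* Returns 1 if both layouts end at the same offset. */
int demo_layout_ok(void)
{
	return sizeof(struct demo_hdr32) == sizeof(struct demo_hdr_av);
}

void demo_layout_check(void)
{
	assert(sizeof(struct demo_vsgentry) == 16);
	assert(sizeof(struct demo_ud_av) == 64);
	assert(demo_layout_ok());
}
```

If that reading is right, a comment next to the `- 2` saying so would make the union easier to review.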


From rolandd at cisco.com  Sat Feb 18 11:57:23 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:23 -0800
Subject: [PATCH 08/22] Generic ehca headers
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005723.13620.10389.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

The defines of TRUE and FALSE look rather useless.  Why are they needed?

What is struct ehca_cache for?  It doesn't seem to be used anywhere.

ehca_kv_to_g() looks completely horrible.  The whole idea of using
vmalloc()ed kernel memory to do DMA seems unacceptable to me.

It's usual to include all <linux/> headers before all <asm/> headers.
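
One concrete problem with the vmalloc test in ehca_kv_to_g(): the bitwise AND matches every address that has all of VMALLOC_START's bits set, not only addresses inside the vmalloc area. Here is a small userspace sketch of the difference. The DEMO_* constants are assumptions chosen for illustration (the real values depend on architecture and kernel version), and the demo_* functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define DEMO_VMALLOC_START 0xD000000000000000ULL
#define DEMO_VMALLOC_END   (DEMO_VMALLOC_START + 0x0000080000000000ULL)

/* The style of test used in ehca_kv_to_g(). */
bool demo_mask_test(uint64_t adr)
{
	return (adr & DEMO_VMALLOC_START) == DEMO_VMALLOC_START;
}

/* A plain range check against the same area. */
bool demo_range_test(uint64_t adr)
{
	return adr >= DEMO_VMALLOC_START && adr < DEMO_VMALLOC_END;
}

void demo_vmalloc_check(void)
{
	/* An address inside the assumed area passes both tests. */
	assert(demo_mask_test(0xD000000000001000ULL));
	assert(demo_range_test(0xD000000000001000ULL));
	/* An address above the area still passes the mask test. */
	assert(demo_mask_test(0xF000000000000000ULL));
	assert(!demo_range_test(0xF000000000000000ULL));
}
```

With constants like these, anything mapped above the vmalloc area would also be handed to vmalloc_to_page(). That is separate from the broader question of whether vmalloc()ed memory should be used for DMA at all.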
---

 drivers/infiniband/hw/ehca/ehca_flightrecorder.h |   74 ++++
 drivers/infiniband/hw/ehca/ehca_kernel.h         |  135 +++++++
 drivers/infiniband/hw/ehca/ehca_tools.h          |  431 ++++++++++++++++++++++
 3 files changed, 640 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_flightrecorder.h b/drivers/infiniband/hw/ehca/ehca_flightrecorder.h
new file mode 100644
index 0000000..7c631ad
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_flightrecorder.h
@@ -0,0 +1,74 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  flightrecorder macros
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_flightrecorder.h,v 1.5 2006/02/06 10:17:34 schickhj Exp $
+ */
+/*****************************************************************************/
+#ifndef EHCA_FLIGHTRECORDER_H
+#define EHCA_FLIGHTRECORDER_H
+
+#define ED_EXTEND1(x,ar1...) \
+	unsigned long __EDEB_R2=(const unsigned long)x-0;ED_EXTEND2(ar1)
+#define ED_EXTEND2(x,ar1...) \
+	unsigned long __EDEB_R3=(const unsigned long)x-0;ED_EXTEND3(ar1)
+#define ED_EXTEND3(x,ar1...) \
+	unsigned long __EDEB_R4=(const unsigned long)x-0;ED_EXTEND4(ar1)
+#define ED_EXTEND4(x,ar1...)
+
+#define EHCA_FLIGHTRECORDER_SIZE 65536
+
+extern atomic_t ehca_flightrecorder_index;
+extern unsigned long ehca_flightrecorder[EHCA_FLIGHTRECORDER_SIZE];
+
+/* Not nice, but -O2 optimized */
+
+#define ED_FLIGHT_LOG(x,ar1...) {                                            \
+	u32 flight_offset = ((u32)					     \
+	atomic_add_return(4, &ehca_flightrecorder_index))		     \
+	% EHCA_FLIGHTRECORDER_SIZE;					     \
+	unsigned long *flight_trline = &ehca_flightrecorder[flight_offset];  \
+	unsigned long __EDEB_R1 = (unsigned long) x-0; ED_EXTEND1(ar1)	     \
+	flight_trline[0]=__EDEB_R1,flight_trline[1]=__EDEB_R2,		     \
+	flight_trline[2]=__EDEB_R3,flight_trline[3]=__EDEB_R4; }
+
+#define EHCA_FLIGHTRECORDER_BACKLOG 60
+
+void ehca_flight_to_printk(void);
+
+#endif
diff --git a/drivers/infiniband/hw/ehca/ehca_kernel.h b/drivers/infiniband/hw/ehca/ehca_kernel.h
new file mode 100644
index 0000000..f119149
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_kernel.h
@@ -0,0 +1,135 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  generalized functions for code shared between kernel and userspace
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_kernel.h,v 1.39 2006/02/06 11:45:10 schickhj Exp $
+ */
+
+#ifndef _EHCA_KERNEL_H_
+#define _EHCA_KERNEL_H_
+
+#define FALSE (1==0)
+#define TRUE (1==1)
+
+#define big_little_target 0	/* needed for simulation */
+#include <linux/version.h>
+
+#include <linux/types.h>
+#include "ehca_common.h"
+#include "ehca_kernel.h"
+
+/**
+ * Handle to be used for address translation mechanisms, currently a placeholder.
+ */
+struct ehca_bridge_handle {
+	int handle;
+};
+
+inline static int ehca_adr_bad(void *adr)
+{
+	return (adr == 0);
+}
+
+#ifdef EHCA_USERDRIVER
+/* userspace replacement for kernel functions */
+#include "ehca_usermain.h"
+#else				/* USERDRIVER */
+/* kernel includes */
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/kernel.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+#include <asm/current.h>
+#include <asm/io.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/sched.h>
+#include <linux/err.h>
+#include <linux/kthread.h>
+#include <linux/mman.h>
+#include <linux/delay.h>
+#include <asm/processor.h>
+#include <asm/ibmebus.h>
+#include <linux/pci.h>
+#include <linux/idr.h>
+#include <linux/rwsem.h>
+
+struct ehca_cache {
+	kmem_cache_t *cache;
+	int size;
+};
+
+#ifdef __powerpc64__
+#include <linux/spinlock.h>
+#include <asm/abs_addr.h>
+#include <asm/prom.h>
+#else
+#endif
+
+#include <ehca_tools.h>
+
+#include <asm/pgtable.h>
+
+
+/**
+ * ehca_kv_to_g - Converts a kernel virtual address to visible addresses
+ * (i.e. a physical/absolute address).
+ */
+inline static u64 ehca_kv_to_g(void *adr)
+{
+	u64 raddr;
+#ifndef CONFIG_PPC64
+	raddr = virt_to_phys((u64)adr);
+#else
+	/* we need to find not only the physical address
+	 * but the absolute to account for memory segmentation */
+	raddr = virt_to_abs((u64)adr);
+#endif
+	if (((u64)adr & VMALLOC_START) == VMALLOC_START) {
+		raddr = phys_to_abs((page_to_pfn(vmalloc_to_page(adr)) <<
+				     PAGE_SHIFT));
+	}
+	return (raddr);
+}
+
+#endif /* USERDRIVER */
+#include <linux/types.h>
+
+
+#endif /* _EHCA_KERNEL_H_ */
diff --git a/drivers/infiniband/hw/ehca/ehca_tools.h b/drivers/infiniband/hw/ehca/ehca_tools.h
new file mode 100644
index 0000000..915a0b7
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_tools.h
@@ -0,0 +1,431 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  auxiliary functions
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *           Khadija Souissi <souissik at de.ibm.com>
+ *           Waleri Fomin <fomin at de.ibm.com>
+ *           Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_tools.h,v 1.43 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+
+#ifndef EHCA_TOOLS_H
+#define EHCA_TOOLS_H
+
+#include "ehca_flightrecorder.h"
+#include "ehca_common.h"
+
+#define flightlog_value() mftb()
+
+#ifndef sizeofmember
+#define sizeofmember(TYPE, MEMBER) (sizeof( ((TYPE *)0)->MEMBER))
+#endif
+
+#define EHCA_EDEB_TRACE_MASK_SIZE 32
+extern u8 ehca_edeb_mask[EHCA_EDEB_TRACE_MASK_SIZE];
+#define EDEB_ID_TO_U32(str4) (str4[3] | (str4[2] << 8) | (str4[1] << 16) | \
+			      (str4[0] << 24))
+
+inline static u64 ehca_edeb_filter(const u32 level,
+				   const u32 id, const u32 line)
+{
+	u64 ret = 0;
+	u32 filenr = 0;
+	u32 filter_level = 9;
+	u32 dynamic_level = 0;
+	/* This code is written for the gcc -O2 optimizer, which should collapse
+	 * it to two single ints. filter_level is the first level kicked out by
+	 * the compiler, i.e. only levels below filter_level can be traced. */
+	if (id == EDEB_ID_TO_U32("ehav")) {
+		filenr = 0x01;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("clas")) {
+		filenr = 0x02;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("cqeq")) {
+		filenr = 0x03;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("shca")) {
+		filenr = 0x05;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("eirq")) {
+		filenr = 0x06;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("lMad")) {
+		filenr = 0x07;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("mcas")) {
+		filenr = 0x08;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("mrmw")) {
+		filenr = 0x09;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("vpd ")) {
+		filenr = 0x0a;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("e_qp")) {
+		filenr = 0x0b;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("uqes")) {
+		filenr = 0x0c;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("PHYP")) {
+		filenr = 0x0d;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("snse")) {
+		filenr = 0x0e;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("iptz")) {
+		filenr = 0x0f;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("spta")) {
+		filenr = 0x10;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("simp")) {
+		filenr = 0x11;
+		filter_level = 8;
+	}
+	if (id == EDEB_ID_TO_U32("reqs")) {
+		filenr = 0x12;
+		filter_level = 8;
+	}
+
+	if (filenr >= sizeof(ehca_edeb_mask)) {
+		filenr = 0;
+	}
+
+	if (filenr == 0) {
+		filter_level = 9;
+	}			/* default */
+	ret = filenr * 0x10000 + line;
+	if (filter_level <= level) {
+		return (ret | 0x100000000); /* this is the flag to not trace */
+	}
+	dynamic_level = ehca_edeb_mask[filenr];
+	if (likely(dynamic_level <= level)) {
+		ret = ret | 0x100000000;
+	}
+	return ret;
+}
+
+#ifdef EHCA_USE_HCALL_KERNEL
+#ifdef CONFIG_PPC_PSERIES
+
+#include <asm/paca.h>
+
+/**
+ * IS_EDEB_ON - Checks if debug is on for the given level.
+ */
+#define IS_EDEB_ON(level) \
+    ((ehca_edeb_filter(level, EDEB_ID_TO_U32(DEB_PREFIX), __LINE__) & 0x100000000)==0)
+
+#define EDEB_P_GENERIC(level,idstring,format,args...) \
+do { \
+	u64 ehca_edeb_filterresult =					\
+		ehca_edeb_filter(level, EDEB_ID_TO_U32(DEB_PREFIX), __LINE__);\
+	if ((ehca_edeb_filterresult & 0x100000000) == 0)		\
+		printk("PU%04x %08x:%s " idstring " "format "\n",	\
+		       get_paca()->paca_index, (u32)(ehca_edeb_filterresult), \
+		       __func__,  ##args);				\
+	if (unlikely(ehca_edeb_mask[0x1e]!=0))				\
+		ED_FLIGHT_LOG((((u64)(get_paca()->paca_index)<< 32) |	\
+			       ((u64)(ehca_edeb_filterresult & 0xffffffff)) << 40 | \
+			       (flightlog_value()&0xffffffff)), args);	\
+} while (1==0)
+
+#elif CONFIG_ARCH_S390
+
+#include <asm/smp.h>
+#define EDEB_P_GENERIC(level,idstring,format,args...) \
+do { \
+	u64 ehca_edeb_filterresult =					\
+		ehca_edeb_filter(level, EDEB_ID_TO_U32(DEB_PREFIX), __LINE__);\
+	if ((ehca_edeb_filterresult & 0x100000000) == 0)		\
+		printk("PU%04x %08x:%s " idstring " "format "\n",	\
+			smp_processor_id(), (u32)(ehca_edeb_filterresult), \
+			__func__,  ##args); \
+} while (1==0)
+
+#elif REAL_HCALL
+
+#define EDEB_P_GENERIC(level,idstring,format,args...) \
+do { \
+	u64 ehca_edeb_filterresult =					\
+		ehca_edeb_filter(level, EDEB_ID_TO_U32(DEB_PREFIX), __LINE__); \
+	if ((ehca_edeb_filterresult & 0x100000000) == 0)		\
+		printk("%08x:%s " idstring " "format "\n",	\
+			(u32)(ehca_edeb_filterresult), \
+			__func__,  ##args); \
+} while (1==0)
+
+#endif
+#else
+
+#define IS_EDEB_ON(level) (1)
+
+#define EDEB_P_GENERIC(level,idstring,format,args...) \
+do { \
+	printk("%s " idstring " "format "\n",	\
+	       __func__,  ##args);		\
+} while (1==0)
+
+#endif
+
+/**
+ * EDEB - Trace output macro.
+ * @level tracelevel
+ * @format optional format string, use "" if not desired
+ * @args printf like arguments for trace, use %Lx for u64, %x for u32
+ *       %p for pointer
+ */
+#define EDEB(level,format,args...) \
+	EDEB_P_GENERIC(level,"",format,##args)
+#define EDEB_ERR(level,format,args...) \
+	EDEB_P_GENERIC(level,"HCAD_ERROR ",format,##args)
+#define EDEB_EN(level,format,args...) \
+	EDEB_P_GENERIC(level,">>>",format,##args)
+#define EDEB_EX(level,format,args...) \
+	EDEB_P_GENERIC(level,"<<<",format,##args)
+
+/**
+ * EDEB macro to dump a memory block, whose length is n*8 bytes.
+ * Each line has the following layout:
+ * <format string> adr=X ofs=Y <8 bytes hex> <8 bytes hex>
+ */
+
+#define EDEB_DMP(level,adr,len,format,args...) \
+	do {				       \
+		unsigned int x;			      \
+		unsigned int l = (unsigned int)(len); \
+		unsigned char *deb = (unsigned char*)(adr);	\
+		for (x = 0; x < l; x += 16) { \
+		        EDEB(level, format " adr=%p ofs=%04x %016lx %016lx", \
+			     ##args, deb, x, *((u64 *)&deb[0]), *((u64 *)&deb[8])); \
+			deb += 16; \
+		} \
+	} while (0)
+
+#define LOCATION __FILE__  " "
+
+/* define a bitmask, little endian version */
+#define EHCA_BMASK(pos,length) (((pos)<<16)+(length))
+/* define a bitmask, the ibm way... */
+#define EHCA_BMASK_IBM(from,to) (((63-(to))<<16)+((to)-(from)+1))
+/* internal function, don't use */
+#define EHCA_BMASK_SHIFTPOS(mask) (((mask)>>16)&0xffff)
+/* internal function, don't use */
+#define EHCA_BMASK_MASK(mask) (0xffffffffffffffffULL >> ((64-(mask))&0xffff))
+/* return value shifted and masked by mask
+ * variable |= EHCA_BMASK_SET(MY_MASK, 0x4711) ORs the bits into variable
+ * variable &= ~EHCA_BMASK_SET(MY_MASK, -1) clears the bits of the mask
+ * in variable
+ */
+#define EHCA_BMASK_SET(mask,value) \
+	((EHCA_BMASK_MASK(mask) & ((u64)(value)))<<EHCA_BMASK_SHIFTPOS(mask))
+/* extract a parameter from value by mask
+ * param = EHCA_BMASK_GET(MY_MASK, value)
+ */
+#define EHCA_BMASK_GET(mask,value) \
+	( EHCA_BMASK_MASK(mask)& (((u64)(value))>>EHCA_BMASK_SHIFTPOS(mask)))
+
+/**
+ * ehca_fixme - Dummy function which will be removed in production code
+ * to find all todos by compiler.
+ */
+void ehca_fixme(void);
+
+extern void exit(int);
+inline static void ehca_catastrophic(char *str)
+{
+#ifndef  EHCA_USERDRIVER
+	printk(KERN_ERR "HCAD_ERROR %s\n", str);
+	ehca_flight_to_printk();
+#else
+	exit(1);
+#endif
+}
+
+#define PARANOIA_MODE
+#ifdef PARANOIA_MODE
+
+#define EHCA_CHECK_ADR_P(adr)					\
+	if (unlikely(adr==0)) {					\
+		EDEB_ERR(4, "adr=%p check failed line %i", adr,	\
+			 __LINE__);				\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_ADR(adr)					\
+	if (unlikely(adr==0)) {					\
+		EDEB_ERR(4, "adr=%p check failed line %i", adr,	\
+			 __LINE__);				\
+		return -EFAULT; }
+
+#define EHCA_CHECK_DEVICE_P(device)				\
+	if (unlikely(device==0)) {				\
+		EDEB_ERR(4, "device=%p check failed", device);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_DEVICE(device)				\
+	if (unlikely(device==0)) {				\
+		EDEB_ERR(4, "device=%p check failed", device);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_PD(pd)				\
+	if (unlikely(pd==0)) {				\
+		EDEB_ERR(4, "pd=%p check failed", pd);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_PD_P(pd)				\
+	if (unlikely(pd==0)) {				\
+		EDEB_ERR(4, "pd=%p check failed", pd);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_AV(av)				\
+	if (unlikely(av==0)) {				\
+		EDEB_ERR(4, "av=%p check failed", av);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_AV_P(av)				\
+	if (unlikely(av==0)) {				\
+		EDEB_ERR(4, "av=%p check failed", av);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_CQ(cq)				\
+	if (unlikely(cq==0)) {				\
+		EDEB_ERR(4, "cq=%p check failed", cq);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_CQ_P(cq)				\
+	if (unlikely(cq==0)) {				\
+		EDEB_ERR(4, "cq=%p check failed", cq);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_EQ(eq)				\
+	if (unlikely(eq==0)) {				\
+		EDEB_ERR(4, "eq=%p check failed", eq);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_EQ_P(eq)				\
+	if (unlikely(eq==0)) {				\
+		EDEB_ERR(4, "eq=%p check failed", eq);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_QP(qp)				\
+	if (unlikely(qp==0)) {				\
+		EDEB_ERR(4, "qp=%p check failed", qp);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_QP_P(qp)				\
+	if (unlikely(qp==0)) {				\
+		EDEB_ERR(4, "qp=%p check failed", qp);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_MR(mr)				\
+	if (unlikely(mr==0)) {				\
+		EDEB_ERR(4, "mr=%p check failed", mr);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_MR_P(mr)				\
+	if (unlikely(mr==0)) {				\
+		EDEB_ERR(4, "mr=%p check failed", mr);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_MW(mw)				\
+	if (unlikely(mw==0)) {				\
+		EDEB_ERR(4, "mw=%p check failed", mw);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_MW_P(mw)				\
+	if (unlikely(mw==0)) {				\
+		EDEB_ERR(4, "mw=%p check failed", mw);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_CHECK_FMR(fmr)					\
+	if (unlikely(fmr==0)) {					\
+		EDEB_ERR(4, "fmr=%p check failed", fmr);	\
+		return -EFAULT; }
+
+#define EHCA_CHECK_FMR_P(fmr)					\
+	if (unlikely(fmr==0)) {					\
+		EDEB_ERR(4, "fmr=%p check failed", fmr);	\
+		return ERR_PTR(-EFAULT); }
+
+#define EHCA_REGISTER_PD(device,pd)
+#define EHCA_REGISTER_AV(pd,av)
+#define EHCA_DEREGISTER_PD(PD)
+#define EHCA_DEREGISTER_AV(av)
+#else
+#define EHCA_CHECK_DEVICE_P(device)
+
+#define EHCA_CHECK_PD(pd)
+#define EHCA_REGISTER_PD(device,pd)
+#define EHCA_DEREGISTER_PD(PD)
+#endif
+
+/**
+ * ehca2ib_return_code - Returns ib return code corresponding to the given
+ * ehca return code.
+ */
+static inline int ehca2ib_return_code(u64 ehca_rc)
+{
+	switch (ehca_rc) {
+	case H_Success:
+		return 0;
+	case H_Busy:
+		return -EBUSY;
+	case H_NoMem:
+		return -ENOMEM;
+	default:
+		return -EINVAL;
+	}
+}
+
+#endif /* EHCA_TOOLS_H */
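
Note on the EHCA_CHECK_* macros above: each one expands to a bare
"if (...) { ...; return ...; }", so a call site written as
"EHCA_CHECK_PD(pd);" inside an unbraced if/else would attach the else to
the macro's hidden if. A minimal sketch of the usual do { } while (0)
wrapper follows. CHECK_PTR, LOG_ERR and use_pd are hypothetical names
used only for illustration, and LOG_ERR merely stands in for EDEB_ERR;
none of this is part of the patch.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for EDEB_ERR(); prints to stderr instead. */
#define LOG_ERR(fmt, ...) fprintf(stderr, fmt "\n", __VA_ARGS__)

/*
 * do { } while (0) turns the whole check into one statement, so the
 * macro behaves like a function call when followed by ';'.
 */
#define CHECK_PTR(name, ptr)					\
	do {							\
		if ((ptr) == NULL) {				\
			LOG_ERR("%s=%p check failed",		\
				(name), (void *)(ptr));		\
			return -EFAULT;				\
		}						\
	} while (0)

/* Hypothetical caller: the else pairs with "if (verbose)" as written. */
static int use_pd(void *pd, int verbose)
{
	if (verbose)
		CHECK_PTR("pd", pd);
	else
		return 1;
	return 0;
}
```

With the original macro shape the same caller would not compile, because
the ';' after the expanded block ends the if statement before the else.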


From rolandd at cisco.com  Sat Feb 18 11:57:45 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:45 -0800
Subject: [PATCH 15/22] ehca queue pair handling
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005745.13620.43256.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>


---
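
Note on get_modqp_statetrans() in this patch: it returns -EINVAL through
an enum type, and the caller tests "statetrans < 0" on an enum variable.
Whether an enum is signed is implementation-defined in C, so that test
may never be true. A minimal standalone sketch of the same lookup
returning a plain int is below. The enum names and values are
illustrative stand-ins, not the IB core definitions, and
lookup_statetrans is a hypothetical name.

```c
#include <errno.h>

/* Illustrative stand-ins for the IB QP states; values are arbitrary. */
enum qp_state {
	QPS_RESET, QPS_INIT, QPS_RTR, QPS_RTS, QPS_SQD, QPS_SQE, QPS_ERR
};

/* Same set of transitions as enum ib_qp_statetrans in the patch. */
enum qp_statetrans {
	QPST_ANY2RESET, QPST_ANY2ERR, QPST_RESET2INIT, QPST_INIT2RTR,
	QPST_INIT2INIT, QPST_RTR2RTS, QPST_RTS2SQD, QPST_RTS2RTS,
	QPST_SQD2RTS, QPST_SQE2RTS, QPST_SQD2SQD, QPST_MAX
};

/*
 * Returns a qp_statetrans value on success or -EINVAL for a transition
 * that is not allowed. The int return type keeps the sign of the error.
 */
static int lookup_statetrans(int from, int to)
{
	switch (to) {
	case QPS_RESET:
		return QPST_ANY2RESET;
	case QPS_ERR:
		return QPST_ANY2ERR;
	case QPS_INIT:
		if (from == QPS_RESET)
			return QPST_RESET2INIT;
		if (from == QPS_INIT)
			return QPST_INIT2INIT;
		break;
	case QPS_RTR:
		if (from == QPS_INIT)
			return QPST_INIT2RTR;
		break;
	case QPS_RTS:
		if (from == QPS_RTR)
			return QPST_RTR2RTS;
		if (from == QPS_RTS)
			return QPST_RTS2RTS;
		if (from == QPS_SQD)
			return QPST_SQD2RTS;
		if (from == QPS_SQE)
			return QPST_SQE2RTS;
		break;
	case QPS_SQD:
		if (from == QPS_RTS)
			return QPST_RTS2SQD;
		break;
	default:
		break;	/* SQE cannot be requested via modify qp */
	}
	return -EINVAL;
}
```

A caller would store the result in an int and check it for a negative
value before using it as an index.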

 drivers/infiniband/hw/ehca/ehca_qp.c | 1528 ++++++++++++++++++++++++++++++++++
 1 files changed, 1528 insertions(+), 0 deletions(-)
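
Note on prepare_sqe_rts() in this patch: it walks the send queue from the
bad WQE up to the entry marked 0xff, sets the purge flag on each entry
and wraps at the end of the queue. A minimal standalone sketch of that
walk over an array is below. struct entry, purge_from and the FLAG_PURGE
value are hypothetical stand-ins (the real WQEF_PURGE is defined
elsewhere in the driver), and the bound on the loop is an addition that
the patch does not have.

```c
#include <stddef.h>
#include <stdint.h>

#define SENTINEL   0xff	/* marks the next free entry, as in the patch */
#define FLAG_PURGE 0x08	/* placeholder value, not the real WQEF_PURGE */

/* Hypothetical, much reduced stand-in for struct ehca_wqe. */
struct entry {
	uint8_t optype;
	uint8_t flags;
	uint8_t nr_of_data_seg;
};

/*
 * Starting at index start, mark every entry as purged until the
 * sentinel entry is reached, wrapping around at the end of the ring.
 * Returns the number of entries marked. The walk stops after n entries
 * even if no sentinel exists (an added safeguard).
 */
static int purge_from(struct entry *ring, size_t n, size_t start)
{
	size_t i = start;
	size_t cnt = 0;

	while (cnt < n &&
	       ring[i].optype != SENTINEL && ring[i].flags != SENTINEL) {
		ring[i].nr_of_data_seg = 0;	/* suppress data access */
		ring[i].flags = FLAG_PURGE;
		cnt++;
		if (++i >= n)
			i = 0;			/* wrap to ring start */
	}
	if (cnt < n)
		ring[i].flags = 0;	/* clear the flag on the sentinel */
	return (int)cnt;
}
```

For a four-entry ring with the sentinel at index 1, a walk that starts
at index 3 marks entries 3 and 0 and leaves entry 2 untouched.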

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
new file mode 100644
index 0000000..e5b1b80
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -0,0 +1,1528 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  QP functions
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *           Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_qp.c,v 1.159 2006/02/15 15:01:24 nguyen Exp $
+ */
+
+
+#define DEB_PREFIX "e_qp"
+
+#include "ehca_kernel.h"
+
+#include "ehca_classes.h"
+#include "ehca_tools.h"
+#include "hcp_if.h"
+#include "ehca_qes.h"
+
+#include "ehca_iverbs.h"
+#include <linux/module.h>
+#include <linux/err.h>
+
+#include <asm/io.h>
+#include <asm/uaccess.h>
+
+/** @brief attributes not supported by query qp
+ */
+#define QP_ATTR_QUERY_NOT_SUPPORTED (IB_QP_MAX_DEST_RD_ATOMIC | \
+				     IB_QP_MAX_QP_RD_ATOMIC   | \
+				     IB_QP_ACCESS_FLAGS       | \
+				     IB_QP_EN_SQD_ASYNC_NOTIFY)
+
+/** @brief ehca (internal) qp state values
+ */
+enum ehca_qp_state {
+	EHCA_QPS_RESET = 1,
+	EHCA_QPS_INIT = 2,
+	EHCA_QPS_RTR = 3,
+	EHCA_QPS_RTS = 5,
+	EHCA_QPS_SQD = 6,
+	EHCA_QPS_SQE = 8,
+	EHCA_QPS_ERR = 128
+};
+
+/** @brief qp state transitions as defined by IB Arch Rel 1.1 page 431
+ */
+enum ib_qp_statetrans {
+	IB_QPST_ANY2RESET,
+	IB_QPST_ANY2ERR,
+	IB_QPST_RESET2INIT,
+	IB_QPST_INIT2RTR,
+	IB_QPST_INIT2INIT,
+	IB_QPST_RTR2RTS,
+	IB_QPST_RTS2SQD,
+	IB_QPST_RTS2RTS,
+	IB_QPST_SQD2RTS,
+	IB_QPST_SQE2RTS,
+	IB_QPST_SQD2SQD,
+	IB_QPST_MAX	/* nr of transitions, this must be last!!! */
+};
+
+/** @brief returns ehca qp state corresponding to given ib qp state
+ */
+static inline enum ehca_qp_state ib2ehca_qp_state(enum ib_qp_state ib_qp_state)
+{
+	switch (ib_qp_state) {
+	case IB_QPS_RESET:
+		return EHCA_QPS_RESET;
+	case IB_QPS_INIT:
+		return EHCA_QPS_INIT;
+	case IB_QPS_RTR:
+		return EHCA_QPS_RTR;
+	case IB_QPS_RTS:
+		return EHCA_QPS_RTS;
+	case IB_QPS_SQD:
+		return EHCA_QPS_SQD;
+	case IB_QPS_SQE:
+		return EHCA_QPS_SQE;
+	case IB_QPS_ERR:
+		return EHCA_QPS_ERR;
+	default:
+		EDEB_ERR(4, "invalid ib_qp_state=%x", ib_qp_state);
+		return -EINVAL;
+	}
+}
+
+/** @brief returns ib qp state corresponding to given ehca qp state
+ */
+static inline enum ib_qp_state ehca2ib_qp_state(enum ehca_qp_state
+						ehca_qp_state)
+{
+	switch (ehca_qp_state) {
+	case EHCA_QPS_RESET:
+		return IB_QPS_RESET;
+	case EHCA_QPS_INIT:
+		return IB_QPS_INIT;
+	case EHCA_QPS_RTR:
+		return IB_QPS_RTR;
+	case EHCA_QPS_RTS:
+		return IB_QPS_RTS;
+	case EHCA_QPS_SQD:
+		return IB_QPS_SQD;
+	case EHCA_QPS_SQE:
+		return IB_QPS_SQE;
+	case EHCA_QPS_ERR:
+		return IB_QPS_ERR;
+	default:
+		EDEB_ERR(4,"invalid ehca_qp_state=%x",ehca_qp_state);
+		return -EINVAL;
+	}
+}
+
+/** @brief qp type
+ * used as index for req_attr and opt_attr of struct ehca_modqp_statetrans
+ */
+enum ehca_qp_type {
+	QPT_RC = 0,
+	QPT_UC = 1,
+	QPT_UD = 2,
+	QPT_SQP = 3,
+	QPT_MAX
+};
+
+/** @brief returns ehca qp type corresponding to ib qp type
+ */
+static inline enum ehca_qp_type ib2ehcaqptype(enum ib_qp_type ibqptype)
+{
+	switch (ibqptype) {
+	case IB_QPT_SMI:
+	case IB_QPT_GSI:
+		return QPT_SQP;
+	case IB_QPT_RC:
+		return QPT_RC;
+	case IB_QPT_UC:
+		return QPT_UC;
+	case IB_QPT_UD:
+		return QPT_UD;
+	default:
+		EDEB_ERR(4,"Invalid ibqptype=%x", ibqptype);
+		return -EINVAL;
+	}
+}
+
+static inline enum ib_qp_statetrans get_modqp_statetrans(int ib_fromstate,
+							 int ib_tostate)
+{
+	int index = -EINVAL;
+	switch (ib_tostate) {
+	case IB_QPS_RESET:
+		index = IB_QPST_ANY2RESET;
+		break;
+	case IB_QPS_INIT:
+		if (ib_fromstate == IB_QPS_RESET) {
+			index = IB_QPST_RESET2INIT;
+		} else if (ib_fromstate == IB_QPS_INIT) {
+			index = IB_QPST_INIT2INIT;
+		}
+		break;
+	case IB_QPS_RTR:
+		if (ib_fromstate == IB_QPS_INIT) {
+			index = IB_QPST_INIT2RTR;
+		}
+		break;
+	case IB_QPS_RTS:
+		if (ib_fromstate == IB_QPS_RTR) {
+			index = IB_QPST_RTR2RTS;
+		} else if (ib_fromstate == IB_QPS_RTS) {
+			index = IB_QPST_RTS2RTS;
+		} else if (ib_fromstate == IB_QPS_SQD) {
+			index = IB_QPST_SQD2RTS;
+		} else if (ib_fromstate == IB_QPS_SQE) {
+			index = IB_QPST_SQE2RTS;
+		}
+		break;
+	case IB_QPS_SQD:
+		if (ib_fromstate == IB_QPS_RTS) {
+			index = IB_QPST_RTS2SQD;
+		}
+		break;
+	case IB_QPS_SQE:
+		/* not allowed via mod qp */
+		break;
+	case IB_QPS_ERR:
+		index = IB_QPST_ANY2ERR;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return index;
+}
+
+/** @brief ehca service types
+ */
+enum ehca_service_type {
+	ST_RC = 0,
+	ST_UC = 1,
+	ST_RD = 2,
+	ST_UD = 3
+};
+
+/** @brief returns hcp service type corresponding to given ib qp type
+ * used by create_qp()
+ */
+static inline int ibqptype2servicetype(enum ib_qp_type ibqptype)
+{
+	switch (ibqptype) {
+	case IB_QPT_SMI:
+	case IB_QPT_GSI:
+		return ST_UD;
+	case IB_QPT_RC:
+		return ST_RC;
+	case IB_QPT_UC:
+		return ST_UC;
+	case IB_QPT_UD:
+		return ST_UD;
+	case IB_QPT_RAW_IPV6:
+		return -EINVAL;
+	case IB_QPT_RAW_ETY:
+		return -EINVAL;
+	default:
+		EDEB_ERR(4, "Invalid ibqptype=%x", ibqptype);
+		return -EINVAL;
+	}
+}
+
+/* init_qp_queues - Initializes/constructs r/squeue and registers queue pages.
+ * returns 0  if successful,
+ *        -EXXXX if not
+ */
+static inline int init_qp_queues(struct ipz_adapter_handle ipz_hca_handle,
+				 struct ehca_qp *my_qp,
+				 int nr_sq_pages,
+				 int nr_rq_pages,
+				 int swqe_size,
+				 int rwqe_size,
+				 int nr_send_sges, int nr_receive_sges)
+{
+	int ret = -EINVAL;
+	int cnt = 0;
+	void *vpage = NULL;
+	u64 rpage = 0;
+	int ipz_rc = -1;
+	u64 hipz_rc = H_Parameter;
+
+	ipz_rc = ipz_queue_ctor(&my_qp->ehca_qp_core.ipz_squeue,
+				nr_sq_pages,
+				EHCA_PAGESIZE, swqe_size, nr_send_sges);
+	if (!ipz_rc) {
+		EDEB_ERR(4, "Cannot allocate page for squeue. ipz_rc=%x",
+			 ipz_rc);
+		ret = -EBUSY;
+		return ret;
+	}
+
+	ipz_rc = ipz_queue_ctor(&my_qp->ehca_qp_core.ipz_rqueue,
+				nr_rq_pages,
+				EHCA_PAGESIZE, rwqe_size, nr_receive_sges);
+	if (!ipz_rc) {
+		EDEB_ERR(4, "Cannot allocate page for rqueue. ipz_rc=%x",
+			 ipz_rc);
+		ret = -EBUSY;
+		goto init_qp_queues0;
+	}
+	/* register SQ pages */
+	for (cnt = 0; cnt < nr_sq_pages; cnt++) {
+		vpage = ipz_QPageit_get_inc(&my_qp->ehca_qp_core.ipz_squeue);
+		if (!vpage) {
+			EDEB_ERR(4, "SQ ipz_QPageit_get_inc() "
+				 "failed p_vpage= %p", vpage);
+			ret = -EINVAL;
+			goto init_qp_queues1;
+		}
+		rpage = ehca_kv_to_g(vpage);
+
+		hipz_rc = hipz_h_register_rpage_qp(ipz_hca_handle,
+						   my_qp->ipz_qp_handle,
+						   &my_qp->pf, 0, 0, /*TODO*/
+						   rpage, 1,
+						   my_qp->ehca_qp_core.galpas.kernel);
+		if (hipz_rc < H_Success) {
+			EDEB_ERR(4, "SQ hipz_qp_register_rpage() failed "
+				 "rc=%lx", hipz_rc);
+			ret = ehca2ib_return_code(hipz_rc);
+			goto init_qp_queues1;
+		}
+		/* for sq no need to check hipz_rc against
+		   e.g. H_PAGE_REGISTERED */
+	}
+
+	ipz_QEit_reset(&my_qp->ehca_qp_core.ipz_squeue);
+
+	/* register RQ pages */
+	for (cnt = 0; cnt < nr_rq_pages; cnt++) {
+		vpage = ipz_QPageit_get_inc(&my_qp->ehca_qp_core.ipz_rqueue);
+		if (!vpage) {
+			EDEB_ERR(4,"RQ ipz_QPageit_get_inc() "
+				 "failed p_vpage = %p", vpage);
+			hipz_rc = H_Resource;
+			ret = -EINVAL;
+			goto init_qp_queues1;
+		}
+
+		rpage = ehca_kv_to_g(vpage);
+
+		hipz_rc = hipz_h_register_rpage_qp(ipz_hca_handle,
+						   my_qp->ipz_qp_handle,
+						   &my_qp->pf, 0, 1, /*TODO*/
+						   rpage, 1,
+						   my_qp->ehca_qp_core.galpas.
+						   kernel);
+		if (hipz_rc < H_Success) {
+			EDEB_ERR(4, "RQ hipz_qp_register_rpage() failed "
+			     "rc=%lx", hipz_rc);
+			ret = ehca2ib_return_code(hipz_rc);
+			goto init_qp_queues1;
+		}
+		if (cnt == (nr_rq_pages - 1)) {	/* last page! */
+			if (hipz_rc != H_Success) {
+				EDEB_ERR(4,"RQ hipz_qp_register_rpage() "
+					 "hipz_rc= %lx ", hipz_rc);
+				ret = ehca2ib_return_code(hipz_rc);
+				goto init_qp_queues1;
+			}
+			vpage = ipz_QPageit_get_inc(&my_qp->ehca_qp_core.ipz_rqueue);
+			if (vpage != NULL) {
+				EDEB_ERR(4,"ipz_QPageit_get_inc() "
+					 "should not succeed vpage=%p",
+					 vpage);
+				ret = -EINVAL;
+				goto init_qp_queues1;
+			}
+		} else {
+			if (hipz_rc != H_PAGE_REGISTERED) {
+				EDEB_ERR(4,"RQ hipz_qp_register_rpage() "
+					 "hipz_rc= %lx ", hipz_rc);
+				ret = ehca2ib_return_code(hipz_rc);
+				goto init_qp_queues1;
+			}
+		}
+	}
+
+	ipz_QEit_reset(&my_qp->ehca_qp_core.ipz_rqueue);
+
+	return 0;
+
+ init_qp_queues1:
+	ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_rqueue);
+ init_qp_queues0:
+	ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_squeue);
+	return ret;
+}
+
+
+struct ib_qp *ehca_create_qp(struct ib_pd *pd,
+			     struct ib_qp_init_attr *init_attr,
+			     struct ib_udata *udata)
+{
+	static int da_msg_size[]={ 128, 256, 512, 1024, 2048, 4096 };
+	int ret = -EINVAL;
+	int servicetype = 0;
+	int sigtype = 0;
+
+	struct ehca_qp *my_qp = NULL;
+	struct ehca_pd *my_pd = NULL;
+	struct ehca_shca *shca = NULL;
+	struct ehca_cq *recv_ehca_cq = NULL;
+	struct ehca_cq *send_ehca_cq = NULL;
+	struct ib_ucontext *context = NULL;
+	u64 hipz_rc = H_Parameter;
+	int max_send_sge;
+	int max_recv_sge;
+	/* h_call's out parameters */
+	u16 act_nr_send_wqes = 0, act_nr_receive_wqes = 0;
+	u8 act_nr_send_sges = 0, act_nr_receive_sges = 0;
+	u32 qp_nr = 0,
+		nr_sq_pages = 0, swqe_size = 0, rwqe_size = 0, nr_rq_pages = 0;
+	u8 daqp_completion;
+	u8 isdaqp;
+	EDEB_EN(7,"pd=%p init_attr=%p", pd, init_attr);
+
+	EHCA_CHECK_PD_P(pd);
+	EHCA_CHECK_ADR_P(init_attr);
+
+	if (init_attr->sq_sig_type != IB_SIGNAL_REQ_WR &&
+	    init_attr->sq_sig_type != IB_SIGNAL_ALL_WR) {
+		EDEB_ERR(4, "init_attr->sq_sig_type=%x not allowed",
+			 init_attr->sq_sig_type);
+		return ERR_PTR(-EINVAL);
+	}
+
+	/* save daqp completion bits */
+	daqp_completion = init_attr->qp_type & 0x60;
+	/* save daqp bit */
+	isdaqp = (init_attr->qp_type & 0x80) ? 1 : 0;
+	init_attr->qp_type = init_attr->qp_type & 0x1F;
+
+	if (init_attr->qp_type != IB_QPT_UD &&
+	    init_attr->qp_type != IB_QPT_SMI &&
+	    init_attr->qp_type != IB_QPT_GSI &&
+	    init_attr->qp_type != IB_QPT_UC &&
+	    init_attr->qp_type != IB_QPT_RC) {
+		EDEB_ERR(4,"wrong QP Type=%x",init_attr->qp_type);
+		return ERR_PTR(-EINVAL);
+	}
+	if (init_attr->qp_type != IB_QPT_RC && isdaqp != 0) {
+		EDEB_ERR(4,"unsupported LL QP Type=%x",init_attr->qp_type);
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (pd->uobject && udata != NULL) {
+		context = pd->uobject->context;
+	}
+
+	my_qp = ehca_qp_new();
+	if (!my_qp) {
+		EDEB_ERR(4, "pd=%p not enough memory to alloc qp", pd);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	my_pd = container_of(pd, struct ehca_pd, ib_pd);
+
+	shca = container_of(pd->device, struct ehca_shca, ib_device);
+	recv_ehca_cq = container_of(init_attr->recv_cq, struct ehca_cq, ib_cq);
+	send_ehca_cq = container_of(init_attr->send_cq, struct ehca_cq, ib_cq);
+
+	my_qp->init_attr = *init_attr;
+
+	do {
+		if (!idr_pre_get(&ehca_qp_idr, GFP_KERNEL)) {
+			ret = -ENOMEM;
+			EDEB_ERR(4, "Can't reserve idr resources.");
+			goto create_qp_exit0;
+		}
+
+		down_write(&ehca_qp_idr_sem);
+		ret = idr_get_new(&ehca_qp_idr, my_qp, &my_qp->token);
+		up_write(&ehca_qp_idr_sem);
+
+	} while (ret == -EAGAIN);
+
+	if (ret) {
+		ret = -ENOMEM;
+		EDEB_ERR(4, "Can't allocate new idr entry.");
+		goto create_qp_exit0;
+	}
+
+	servicetype = ibqptype2servicetype(init_attr->qp_type);
+	if (servicetype < 0) {
+		ret = -EINVAL;
+		EDEB_ERR(4, "Invalid qp_type=%x", init_attr->qp_type);
+		goto create_qp_exit1;	/* idr entry exists, remove it */
+	}
+
+	if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) {
+		sigtype = HCALL_SIGT_EVERY;
+	} else {
+		sigtype = HCALL_SIGT_BY_WQE;
+	}
+
+	/* UD_AV CIRCUMVENTION */
+	max_send_sge=init_attr->cap.max_send_sge;
+	max_recv_sge=init_attr->cap.max_recv_sge;
+	if (IB_QPT_UD == init_attr->qp_type ||
+	    IB_QPT_GSI == init_attr->qp_type ||
+	    IB_QPT_SMI == init_attr->qp_type) {
+		max_send_sge += 2;
+		max_recv_sge += 2;
+	}
+
+	EDEB(7, "isdaqp=%x daqp_completion=%x", isdaqp, daqp_completion);
+
+	hipz_rc = hipz_h_alloc_resource_qp(shca->ipz_hca_handle,
+					   &my_qp->pf,
+					   servicetype,
+					   isdaqp | daqp_completion,
+					   sigtype, 0, /* no ud ad lkey ctrl */
+					   send_ehca_cq->ipz_cq_handle,
+					   recv_ehca_cq->ipz_cq_handle,
+					   shca->eq.ipz_eq_handle,
+					   my_qp->token,
+					   my_pd->fw_pd,
+					   (u16) init_attr->cap.max_send_wr + 1, /* fixme(+1 ??) */
+					   (u16) init_attr->cap.max_recv_wr + 1, /* fixme(+1 ??) */
+					   (u8) max_send_sge,
+					   (u8) max_recv_sge,
+					   0, /* ignored if ud ad lkey ctrl is 0 */
+					   &my_qp->ipz_qp_handle,
+					   &qp_nr,
+					   &act_nr_send_wqes,
+					   &act_nr_receive_wqes,
+					   &act_nr_send_sges,
+					   &act_nr_receive_sges,
+					   &nr_sq_pages,
+					   &nr_rq_pages,
+					   &my_qp->ehca_qp_core.galpas);
+	if (hipz_rc != H_Success) {
+		EDEB_ERR(4, "h_alloc_resource_qp() failed rc=%lx", hipz_rc);
+		ret = ehca2ib_return_code(hipz_rc);
+		goto create_qp_exit1;
+	}
+
+	/* store real qp_num as we got from ehca */
+	my_qp->ehca_qp_core.real_qp_num = qp_nr;
+
+	switch (init_attr->qp_type) {
+	case IB_QPT_RC:
+		if (isdaqp == 0) {
+			swqe_size = offsetof(struct ehca_wqe,
+					     u.nud.sg_list[(act_nr_send_sges)]);
+			rwqe_size = offsetof(struct ehca_wqe,
+					     u.nud.sg_list[(act_nr_receive_sges)]);
+		} else { /* for daqp we need to use msg size, not wqe size */
+			swqe_size = da_msg_size[max_send_sge];
+			rwqe_size = da_msg_size[max_recv_sge];
+			act_nr_send_sges = 1;
+			act_nr_receive_sges = 1;
+		}
+		break;
+	case IB_QPT_UC:
+		swqe_size = offsetof(struct ehca_wqe,
+				     u.nud.sg_list[(act_nr_send_sges)]);
+		rwqe_size = offsetof(struct ehca_wqe,
+				     u.nud.sg_list[(act_nr_receive_sges)]);
+		break;
+
+	case IB_QPT_UD:
+	case IB_QPT_GSI:
+	case IB_QPT_SMI:
+		/* UD circumvention */
+		act_nr_receive_sges -= 2;
+		act_nr_send_sges -= 2;
+		swqe_size = offsetof(struct ehca_wqe,
+				     u.ud_av.sg_list[(act_nr_send_sges)]);
+		rwqe_size = offsetof(struct ehca_wqe,
+				     u.ud_av.sg_list[(act_nr_receive_sges)]);
+
+		if (IB_QPT_GSI == init_attr->qp_type ||
+		    IB_QPT_SMI == init_attr->qp_type) {
+			act_nr_send_wqes = init_attr->cap.max_send_wr;
+			act_nr_receive_wqes = init_attr->cap.max_recv_wr;
+			act_nr_send_sges = init_attr->cap.max_send_sge;
+			act_nr_receive_sges = init_attr->cap.max_recv_sge;
+			qp_nr = (init_attr->qp_type == IB_QPT_SMI) ? 0 : 1;
+		}
+
+		break;
+
+	default:
+		break;
+	}
+
+	/* initializes r/squeue and registers queue pages */
+	ret = init_qp_queues(shca->ipz_hca_handle, my_qp,
+			     nr_sq_pages, nr_rq_pages,
+			     swqe_size, rwqe_size,
+			     act_nr_send_sges, act_nr_receive_sges);
+	if (ret != 0) {
+		EDEB_ERR(4,"Couldn't initialize r/squeue and pages ret=%x",
+			 ret);
+		goto create_qp_exit2;
+	}
+
+	my_qp->ib_qp.pd = &my_pd->ib_pd;
+	my_qp->ib_qp.device = my_pd->ib_pd.device;
+
+	my_qp->ib_qp.recv_cq = init_attr->recv_cq;
+	my_qp->ib_qp.send_cq = init_attr->send_cq;
+
+	my_qp->ib_qp.qp_num = qp_nr;
+	my_qp->ib_qp.qp_type = init_attr->qp_type;
+
+	my_qp->ehca_qp_core.qp_type = init_attr->qp_type;
+	my_qp->ib_qp.srq = init_attr->srq;
+
+	my_qp->ib_qp.qp_context = init_attr->qp_context;
+	my_qp->ib_qp.event_handler = init_attr->event_handler;
+
+	init_attr->cap.max_inline_data = 0; /* not supported? */
+	init_attr->cap.max_recv_sge = act_nr_receive_sges;
+	init_attr->cap.max_recv_wr = act_nr_receive_wqes;
+	init_attr->cap.max_send_sge = act_nr_send_sges;
+	init_attr->cap.max_send_wr = act_nr_send_wqes;
+
+	/* TODO : define_apq0() not supported yet */
+	if (init_attr->qp_type == IB_QPT_GSI) {
+		if ((hipz_rc = ehca_define_sqp(shca, my_qp, init_attr))) {
+			EDEB_ERR(4,  "ehca_define_sqp() failed rc=%lx", hipz_rc);
+			ret = ehca2ib_return_code(hipz_rc);
+			goto create_qp_exit3;
+		}
+	}
+
+	if (init_attr->send_cq != NULL) {
+		struct ehca_cq *cq = container_of(init_attr->send_cq,
+						  struct ehca_cq, ib_cq);
+		ret = ehca_cq_assign_qp(cq, my_qp);
+		if (ret != 0) {
+			EDEB_ERR(4, "Couldn't assign qp to send_cq ret=%x", ret);
+			goto create_qp_exit3;
+		}
+		my_qp->send_cq = cq;
+	}
+
+	/* copy queues, galpa data to user space */
+	if (context != NULL && udata != NULL) {
+		struct ehca_create_qp_resp resp;
+		struct vm_area_struct * vma;
+		resp.qp_num = qp_nr;
+		resp.token = my_qp->token;
+		resp.ehca_qp_core = my_qp->ehca_qp_core;
+
+		ehca_mmap_nopage(((u64) (my_qp->token) << 32) | 0x22000000,
+				 my_qp->ehca_qp_core.ipz_rqueue.queue_length,
+				 ((void**)&resp.ehca_qp_core.ipz_rqueue.queue),
+				 &vma);
+		my_qp->uspace_rqueue = (u64)resp.ehca_qp_core.ipz_rqueue.queue;
+		ehca_mmap_nopage(((u64) (my_qp->token) << 32) | 0x23000000,
+				 my_qp->ehca_qp_core.ipz_squeue.queue_length,
+				 ((void**)&resp.ehca_qp_core.ipz_squeue.queue),
+				 &vma);
+		my_qp->uspace_squeue = (u64)resp.ehca_qp_core.ipz_squeue.queue;
+		ehca_mmap_register(my_qp->ehca_qp_core.galpas.user.fw_handle,
+				   ((void**)&resp.ehca_qp_core.galpas.kernel.fw_handle),
+				   &vma);
+		my_qp->uspace_fwh = (u64)resp.ehca_qp_core.galpas.kernel.fw_handle;
+
+		if (ib_copy_to_udata(udata, &resp, sizeof resp)) {
+			EDEB_ERR(4, "Copy to udata failed");
+			ret = -EINVAL;
+			goto create_qp_exit3;
+		}
+	}
+
+	EDEB_EX(7, "ehca_qp=%p qp_num=%x, token=%x",
+		my_qp, qp_nr, my_qp->token);
+	return (&my_qp->ib_qp);
+
+ create_qp_exit3:
+	ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_rqueue);
+	ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_squeue);
+
+ create_qp_exit2:
+	hipz_h_destroy_qp(shca->ipz_hca_handle, my_qp);
+
+ create_qp_exit1:
+	down_write(&ehca_qp_idr_sem);
+	idr_remove(&ehca_qp_idr, my_qp->token);
+	up_write(&ehca_qp_idr_sem);
+
+ create_qp_exit0:
+	ehca_qp_delete(my_qp);
+	EDEB_EX(4, "failed ret=%x", ret);
+	return ERR_PTR(ret);
+
+}
+
+/** called by internal_modify_qp() at trans sqe -> rts:
+ * set purge bit of bad wqe and subsequent wqes to avoid reentering sqe
+ * @return total number of bad wqes in bad_wqe_cnt
+ */
+static int prepare_sqe_rts(struct ehca_qp *my_qp, struct ehca_shca *shca,
+			   int *bad_wqe_cnt)
+{
+	int ret = 0;
+	u64 hipz_rc = H_Success;
+	struct ipz_queue *squeue = NULL;
+	void *bad_send_wqe_p = NULL;
+	void *bad_send_wqe_v = NULL;
+	void *squeue_start_p = NULL;
+	void *squeue_end_p = NULL;
+	void *squeue_start_v = NULL;
+	void *squeue_end_v = NULL;
+	struct ehca_wqe *wqe = NULL;
+	int qp_num = my_qp->ib_qp.qp_num;
+
+	EDEB_EN(7, "ehca_qp=%p qp_num=%x ", my_qp, qp_num);
+
+	/* get send wqe pointer */
+	hipz_rc = hipz_h_disable_and_get_wqe(shca->ipz_hca_handle,
+					     my_qp->ipz_qp_handle, &my_qp->pf,
+					     &bad_send_wqe_p, NULL, 2);
+	if (hipz_rc != H_Success) {
+		EDEB_ERR(4, "hipz_h_disable_and_get_wqe() failed "
+			 "ehca_qp=%p qp_num=%x hipz_rc=%lx",
+			 my_qp, qp_num, hipz_rc);
+		ret = ehca2ib_return_code(hipz_rc);
+		goto prepare_sqe_rts_exit1;
+	}
+	bad_send_wqe_p = (void*)((u64)bad_send_wqe_p & (~(1L<<63)));
+	EDEB(7, "qp_num=%x bad_send_wqe_p=%p", qp_num, bad_send_wqe_p);
+	/* convert wqe pointer to vadr */
+	bad_send_wqe_v = abs_to_virt((u64)bad_send_wqe_p);
+	EDEB_DMP(6, bad_send_wqe_v, 32, "qp_num=%x bad_wqe", qp_num);
+
+	squeue = &my_qp->ehca_qp_core.ipz_squeue;
+	squeue_start_p = (void*)ehca_kv_to_g(squeue->queue);
+	squeue_end_p = squeue_start_p+squeue->queue_length;
+	squeue_start_v = abs_to_virt((u64)squeue_start_p);
+	squeue_end_v = abs_to_virt((u64)squeue_end_p);
+	EDEB(6, "qp_num=%x squeue_start_v=%p squeue_end_v=%p",
+	     qp_num, squeue_start_v, squeue_end_v);
+
+	/* loop sets wqe's purge bit */
+	wqe = (struct ehca_wqe*)bad_send_wqe_v;
+	*bad_wqe_cnt = 0;
+	while (wqe->optype != 0xff && wqe->wqef != 0xff) {
+		EDEB_DMP(6, wqe, 32, "qp_num=%x wqe", qp_num);
+		wqe->nr_of_data_seg = 0; /* suppress data access */
+		wqe->wqef = WQEF_PURGE; /* WQE to be purged */
+		wqe = (struct ehca_wqe*)((u8*)wqe+squeue->qe_size);
+		*bad_wqe_cnt = (*bad_wqe_cnt)+1;
+		if ((void*)wqe >= squeue_end_v) {
+			wqe = squeue_start_v;
+		}
+	} /* eof while wqe */
+	/* bad wqe will be reprocessed and ignored when poll_cq() is called,
+	   i.e. nr of wqes with flush error status is one less */
+	EDEB(6, "qp_num=%x flusherr_wqe_cnt=%x", qp_num, (*bad_wqe_cnt)-1);
+	wqe->wqef = 0;
+
+ prepare_sqe_rts_exit1:
+
+	EDEB_EX(7, "ehca_qp=%p qp_num=%x ret=%x", my_qp, qp_num, ret);
+	return ret;
+}
+
+/** @brief internal modify qp with circumvention to handle aqp0 properly
+ * smi_reset2init indicates if this is an internal reset-to-init-call for
+ * smi. This flag must always be zero if called from ehca_modify_qp()!
+ * This internal func was introduced to avoid recursion of ehca_modify_qp()!
+ */
+static int internal_modify_qp(struct ib_qp *ibqp,
+			      struct ib_qp_attr *attr,
+			      int attr_mask, int smi_reset2init)
+{
+	enum ib_qp_state qp_cur_state = 0, qp_new_state = 0;
+	int cnt = 0, qp_attr_idx = 0, retcode = 0;
+
+	int statetrans;	/* int, so the negative error check can fire */
+	struct hcp_modify_qp_control_block *mqpcb = NULL;
+	struct ehca_qp *my_qp = NULL;
+	struct ehca_shca *shca = NULL;
+	u64 update_mask = 0;
+	u64 hipz_rc = H_Success;
+	int bad_wqe_cnt = 0;
+	int squeue_locked = 0;
+	unsigned long spl_flags = 0;
+
+	my_qp = container_of(ibqp, struct ehca_qp, ib_qp);
+	shca = container_of(ibqp->pd->device, struct ehca_shca, ib_device);
+
+	EDEB_EN(7, "ehca_qp=%p qp_num=%x ibqp_type=%x "
+		"new qp_state=%x attribute_mask=%x",
+		my_qp, ibqp->qp_num, ibqp->qp_type,
+		attr->qp_state, attr_mask);
+
+	/* do query_qp to obtain current attr values */
+	mqpcb = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (mqpcb == NULL) {
+		retcode = -ENOMEM;
+		EDEB_ERR(4, "Could not get zeroed page for mqpcb "
+			 "ehca_qp=%p qp_num=%x ", my_qp, ibqp->qp_num);
+		goto modify_qp_exit0;
+	}
+	memset(mqpcb, 0, PAGE_SIZE);
+
+	hipz_rc = hipz_h_query_qp(shca->ipz_hca_handle,
+				  my_qp->ipz_qp_handle,
+				  &my_qp->pf,
+				  mqpcb, my_qp->ehca_qp_core.galpas.kernel);
+	if (hipz_rc != H_Success) {
+		EDEB_ERR(4, "hipz_h_query_qp() failed "
+			 "ehca_qp=%p qp_num=%x hipz_rc=%lx",
+			 my_qp, ibqp->qp_num, hipz_rc);
+		retcode = ehca2ib_return_code(hipz_rc);
+		goto modify_qp_exit1;
+	}
+	EDEB(7, "ehca_qp=%p qp_num=%x ehca_qp_state=%x",
+	     my_qp, ibqp->qp_num, mqpcb->qp_state);
+
+	qp_cur_state = ehca2ib_qp_state(mqpcb->qp_state);
+
+	if (qp_cur_state == -EINVAL) {	/* invalid qp state */
+		retcode = -EINVAL;
+		EDEB_ERR(4, "Invalid current ehca_qp_state=%x "
+			 "ehca_qp=%p qp_num=%x",
+			 mqpcb->qp_state, my_qp, ibqp->qp_num);
+		goto modify_qp_exit1;
+	}
+	/* circumvention to set aqp0 initial state to init
+	   as expected by IB spec */
+	if (smi_reset2init == 0 &&
+	    ibqp->qp_type == IB_QPT_SMI &&
+	    qp_cur_state == IB_QPS_RESET &&
+	    (attr_mask & IB_QP_STATE)
+	    && attr->qp_state == IB_QPS_INIT) { /* RESET -> INIT */
+		struct ib_qp_attr smiqp_attr = {
+			.qp_state = IB_QPS_INIT,
+			.port_num = my_qp->init_attr.port_num,
+			.pkey_index = 0,
+			.qkey = 0
+		};
+		int smiqp_attr_mask = IB_QP_STATE | IB_QP_PORT |
+			IB_QP_PKEY_INDEX | IB_QP_QKEY;
+		int smirc = internal_modify_qp(
+			ibqp, &smiqp_attr, smiqp_attr_mask, 1);
+		if (smirc != 0) {
+			EDEB_ERR(4, "SMI RESET -> INIT failed. "
+				 "ehca_modify_qp() rc=%x", smirc);
+			retcode = smirc;
+			goto modify_qp_exit1;
+		}
+		qp_cur_state = IB_QPS_INIT;
+		EDEB(7, "SMI RESET -> INIT succeeded");
+	}
+	/* check that the caller-supplied current state matches the real one */
+	if (attr_mask & IB_QP_CUR_STATE) {
+		if (qp_cur_state != attr->cur_qp_state) {
+			retcode = -EINVAL;
+			EDEB_ERR(4, "Invalid IB_QP_CUR_STATE "
+				 "attr->cur_qp_state=%x <> "
+				 "actual cur_qp_state=%x. "
+				 "ehca_qp=%p qp_num=%x",
+				 attr->cur_qp_state, qp_cur_state,
+				 my_qp, ibqp->qp_num);
+			goto modify_qp_exit1;
+		}
+	}
+
+	EDEB(7,	"ehca_qp=%p qp_num=%x current qp_state=%x "
+	     "new qp_state=%x attribute_mask=%x",
+	     my_qp, ibqp->qp_num, qp_cur_state, attr->qp_state, attr_mask);
+
+	qp_new_state = attr_mask & IB_QP_STATE ? attr->qp_state : qp_cur_state;
+	if (!smi_reset2init &&
+	    !ib_modify_qp_is_ok(qp_cur_state, qp_new_state, ibqp->qp_type,
+				attr_mask)) {
+		retcode = -EINVAL;
+		EDEB_ERR(4, "Invalid qp transition new_state=%x cur_state=%x "
+			 "ehca_qp=%p qp_num=%x attr_mask=%x",
+			 qp_new_state, qp_cur_state, my_qp, ibqp->qp_num,
+			 attr_mask);
+		goto modify_qp_exit1;
+	}
+
+	if ((mqpcb->qp_state = ib2ehca_qp_state(qp_new_state))) {
+		update_mask = EHCA_BMASK_SET(MQPCB_MASK_QP_STATE, 1);
+	} else {
+		retcode = -EINVAL;
+		EDEB_ERR(4, "Invalid new qp state=%x "
+			 "ehca_qp=%p qp_num=%x",
+			 qp_new_state, my_qp, ibqp->qp_num);
+		goto modify_qp_exit1;
+	}
+
+	/* retrieve state transition struct to get req and opt attrs */
+	statetrans = get_modqp_statetrans(qp_cur_state, qp_new_state);
+	if (statetrans < 0) {
+		retcode = -EINVAL;
+		EDEB_ERR(4, "<INVALID STATE CHANGE> qp_cur_state=%x "
+			 "new_qp_state=%x State_xsition=%x "
+			 "ehca_qp=%p qp_num=%x",
+			 qp_cur_state, qp_new_state,
+			 statetrans, my_qp, ibqp->qp_num);
+		goto modify_qp_exit1;
+	}
+
+	qp_attr_idx = ib2ehcaqptype(ibqp->qp_type);
+
+	if (qp_attr_idx < 0) {
+		retcode = qp_attr_idx;
+		EDEB_ERR(4, "Invalid QP type=%x ehca_qp=%p qp_num=%x",
+			 ibqp->qp_type, my_qp, ibqp->qp_num);
+		goto modify_qp_exit1;
+	}
+
+	EDEB(7, "ehca_qp=%p qp_num=%x <VALID STATE CHANGE> qp_state_xsit=%x",
+	     my_qp, ibqp->qp_num, statetrans);
+
+	/* sqe -> rts: set purge bit of bad wqe before actual trans */
+	if ((my_qp->ehca_qp_core.qp_type == IB_QPT_UD
+	     || my_qp->ehca_qp_core.qp_type == IB_QPT_GSI
+	     || my_qp->ehca_qp_core.qp_type == IB_QPT_SMI)
+	    && statetrans == IB_QPST_SQE2RTS) {
+		/* mark next free wqe if kernel */
+		if (my_qp->uspace_squeue == 0) {
+			struct ehca_wqe *wqe = NULL;
+			/* lock send queue */
+			spin_lock_irqsave(&my_qp->spinlock_s, spl_flags);
+			squeue_locked = 1;
+			/* mark next free wqe */
+			wqe=(struct ehca_wqe*)
+				my_qp->ehca_qp_core.ipz_squeue.current_q_addr;
+			wqe->optype = wqe->wqef = 0xff;
+			EDEB(7, "qp_num=%x next_free_wqe=%p",
+			     ibqp->qp_num, wqe);
+		}
+		retcode = prepare_sqe_rts(my_qp, shca, &bad_wqe_cnt);
+		if (retcode != 0) {
+			EDEB_ERR(4, "prepare_sqe_rts() failed "
+				 "ehca_qp=%p qp_num=%x ret=%x",
+				 my_qp, ibqp->qp_num, retcode);
+			goto modify_qp_exit2;
+		}
+	}
+
+	/* enable RDMA_Atomic_Control if reset->init and reliable connection;
+	   this is necessary since gen2 does not provide that flag,
+	   but pHyp requires it */
+	if (statetrans == IB_QPST_RESET2INIT &&
+	    (ibqp->qp_type == IB_QPT_RC || ibqp->qp_type == IB_QPT_UC)) {
+		mqpcb->rdma_atomic_ctrl = 3;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_RDMA_ATOMIC_CTRL, 1);
+	}
+	/* circ. pHyp requires #RDMA/Atomic Responder Resources for UC INIT -> RTR */
+	if (statetrans == IB_QPST_INIT2RTR &&
+	    (ibqp->qp_type == IB_QPT_UC) &&
+	    !(attr_mask & IB_QP_MAX_DEST_RD_ATOMIC)) {
+		mqpcb->rdma_nr_atomic_resp_res = 1; /* default to 1 */
+		update_mask |=
+			EHCA_BMASK_SET(MQPCB_MASK_RDMA_NR_ATOMIC_RESP_RES, 1);
+	}
+
+	if (attr_mask & IB_QP_PKEY_INDEX) {
+		mqpcb->prim_p_key_idx = attr->pkey_index;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_P_KEY_IDX, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x "
+		     "IB_QP_PKEY_INDEX update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_PORT) {
+		if (attr->port_num < 1 || attr->port_num > shca->num_ports) {
+			retcode = -EINVAL;
+			EDEB_ERR(4, "Invalid port=%x. "
+				 "ehca_qp=%p qp_num=%x num_ports=%x",
+				 attr->port_num, my_qp, ibqp->qp_num,
+				 shca->num_ports);
+			goto modify_qp_exit2;
+		}
+		mqpcb->prim_phys_port = attr->port_num;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_PHYS_PORT, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_PORT update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_QKEY) {
+		mqpcb->qkey = attr->qkey;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_QKEY, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_QKEY update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_AV) {
+		mqpcb->dlid = attr->ah_attr.dlid;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DLID, 1);
+		mqpcb->source_path_bits = attr->ah_attr.src_path_bits;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SOURCE_PATH_BITS, 1);
+		mqpcb->service_level = attr->ah_attr.sl;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SERVICE_LEVEL, 1);
+		mqpcb->max_static_rate = attr->ah_attr.static_rate;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_MAX_STATIC_RATE, 1);
+
+		/* only if GRH is set may SOURCE_GID_IDX and DEST_GID be
+		 * updated; otherwise pHyp will return H_ATTR_PARM!!!
+		 */
+		if (attr->ah_attr.ah_flags == IB_AH_GRH) {
+			mqpcb->send_grh_flag = 1 << 31;
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_SEND_GRH_FLAG, 1);
+			mqpcb->source_gid_idx = attr->ah_attr.grh.sgid_index;
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_SOURCE_GID_IDX, 1);
+
+			for (cnt = 0; cnt < 16; cnt++) {
+				mqpcb->dest_gid.byte[cnt] =
+					attr->ah_attr.grh.dgid.raw[cnt];
+			}
+
+			update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DEST_GID, 1);
+			mqpcb->flow_label = attr->ah_attr.grh.flow_label;
+			update_mask |= EHCA_BMASK_SET(MQPCB_MASK_FLOW_LABEL, 1);
+			mqpcb->hop_limit = attr->ah_attr.grh.hop_limit;
+			update_mask |= EHCA_BMASK_SET(MQPCB_MASK_HOP_LIMIT, 1);
+			mqpcb->traffic_class = attr->ah_attr.grh.traffic_class;
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_TRAFFIC_CLASS, 1);
+		}
+
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_AV update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+
+	if (attr_mask & IB_QP_PATH_MTU) {
+		mqpcb->path_mtu = attr->path_mtu;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PATH_MTU, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_PATH_MTU update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_TIMEOUT) {
+		mqpcb->timeout = attr->timeout;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_TIMEOUT, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_TIMEOUT update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_RETRY_CNT) {
+		mqpcb->retry_count = attr->retry_cnt;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_RETRY_COUNT, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_RETRY_CNT update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_RNR_RETRY) {
+		mqpcb->rnr_retry_count = attr->rnr_retry;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_RNR_RETRY_COUNT, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_RNR_RETRY update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_RQ_PSN) {
+		mqpcb->receive_psn = attr->rq_psn;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_RECEIVE_PSN, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_RQ_PSN update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_MAX_DEST_RD_ATOMIC) {
+	        /*  @TODO CHECK THIS with our spec */
+		mqpcb->rdma_nr_atomic_resp_res = attr->max_dest_rd_atomic;
+		update_mask |=
+			EHCA_BMASK_SET(MQPCB_MASK_RDMA_NR_ATOMIC_RESP_RES, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_MAX_DEST_RD_ATOMIC "
+		     "update_mask=%lx", my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_MAX_QP_RD_ATOMIC) {
+	        /*  @TODO CHECK THIS with our spec */
+		mqpcb->rdma_atomic_outst_dest_qp = attr->max_rd_atomic;
+		update_mask |=
+			EHCA_BMASK_SET
+			(MQPCB_MASK_RDMA_ATOMIC_OUTST_DEST_QP, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x IB_QP_MAX_QP_RD_ATOMIC "
+		     "update_mask=%lx", my_qp, ibqp->qp_num, update_mask);
+	}
+	if (attr_mask & IB_QP_ALT_PATH) {
+		mqpcb->dlid_al = attr->alt_ah_attr.dlid;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DLID_AL, 1);
+		mqpcb->source_path_bits_al = attr->alt_ah_attr.src_path_bits;
+		update_mask |=
+			EHCA_BMASK_SET(MQPCB_MASK_SOURCE_PATH_BITS_AL, 1);
+		mqpcb->service_level_al = attr->alt_ah_attr.sl;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SERVICE_LEVEL_AL, 1);
+		mqpcb->max_static_rate_al = attr->alt_ah_attr.static_rate;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_MAX_STATIC_RATE_AL, 1);
+
+		/* only if GRH is TRUE we might consider SOURCE_GID_IDX and DEST_GID
+		 * otherwise phype will return H_ATTR_PARM!!!
+		 */
+		if (attr->alt_ah_attr.ah_flags == IB_AH_GRH) {
+			mqpcb->send_grh_flag_al = 1 << 31;
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_SEND_GRH_FLAG_AL, 1);
+			mqpcb->source_gid_idx_al =
+				attr->alt_ah_attr.grh.sgid_index;
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_SOURCE_GID_IDX_AL, 1);
+
+			for (cnt = 0; cnt < 16; cnt++) {
+				mqpcb->dest_gid_al.byte[cnt] =
+					attr->alt_ah_attr.grh.dgid.raw[cnt];
+			}
+
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_DEST_GID_AL, 1);
+			mqpcb->flow_label_al = attr->alt_ah_attr.grh.flow_label;
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_FLOW_LABEL_AL, 1);
+			mqpcb->hop_limit_al = attr->alt_ah_attr.grh.hop_limit;
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_HOP_LIMIT_AL, 1);
+			mqpcb->traffic_class_al =
+				attr->alt_ah_attr.grh.traffic_class;
+			update_mask |=
+				EHCA_BMASK_SET(MQPCB_MASK_TRAFFIC_CLASS_AL, 1);
+		}
+
+		EDEB(7, "ehca_qp=%p qp_num=%x "
+		     "IB_QP_ALT_PATH update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+
+	if (attr_mask & IB_QP_MIN_RNR_TIMER) {
+		mqpcb->min_rnr_nak_timer_field = attr->min_rnr_timer;
+		update_mask |=
+			EHCA_BMASK_SET(MQPCB_MASK_MIN_RNR_NAK_TIMER_FIELD, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x "
+		     "IB_QP_MIN_RNR_TIMER update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+
+	if (attr_mask & IB_QP_SQ_PSN) {
+		mqpcb->send_psn = attr->sq_psn;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SEND_PSN, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x "
+		     "IB_QP_SQ_PSN update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+
+	if (attr_mask & IB_QP_DEST_QPN) {
+		mqpcb->dest_qp_nr = attr->dest_qp_num;
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DEST_QP_NR, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x "
+		     "IB_QP_DEST_QPN update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+	}
+
+	if (attr_mask & IB_QP_PATH_MIG_STATE) {
+		mqpcb->path_migration_state = attr->path_mig_state;
+		update_mask |=
+			EHCA_BMASK_SET(MQPCB_MASK_PATH_MIGRATION_STATE, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x "
+		     "IB_QP_PATH_MIG_STATE update_mask=%lx", my_qp,
+		     ibqp->qp_num, update_mask);
+	}
+
+	if (attr_mask & IB_QP_CAP) {
+		mqpcb->max_nr_outst_send_wr = attr->cap.max_send_wr+1;
+		update_mask |=
+			EHCA_BMASK_SET(MQPCB_MASK_MAX_NR_OUTST_SEND_WR, 1);
+		mqpcb->max_nr_outst_recv_wr = attr->cap.max_recv_wr+1;
+		update_mask |=
+			EHCA_BMASK_SET(MQPCB_MASK_MAX_NR_OUTST_RECV_WR, 1);
+		EDEB(7, "ehca_qp=%p qp_num=%x "
+		     "IB_QP_CAP update_mask=%lx",
+		     my_qp, ibqp->qp_num, update_mask);
+		/* TODO no support for max_send/recv_sge??? */
+	}
+
+	EDEB_DMP(7, mqpcb, 4*70, "ehca_qp=%p qp_num=%x", my_qp, ibqp->qp_num);
+
+	hipz_rc = hipz_h_modify_qp(shca->ipz_hca_handle,
+				   my_qp->ipz_qp_handle,
+				   &my_qp->pf,
+				   update_mask,
+				   mqpcb, my_qp->ehca_qp_core.galpas.kernel);
+
+	if (hipz_rc != H_Success) {
+		retcode = ehca2ib_return_code(hipz_rc);
+		EDEB_ERR(4, "hipz_h_modify_qp() failed rc=%lx "
+			 "ehca_qp=%p qp_num=%x",
+			 hipz_rc, my_qp, ibqp->qp_num);
+		goto modify_qp_exit2;
+	}
+
+	if ((my_qp->ehca_qp_core.qp_type == IB_QPT_UD
+	     || my_qp->ehca_qp_core.qp_type == IB_QPT_GSI
+	     || my_qp->ehca_qp_core.qp_type == IB_QPT_SMI)
+	    && statetrans == IB_QPST_SQE2RTS) {
+		/* ring the doorbell so the hardware reprocesses the wqes */
+		iosync(); /* serialize GAL register access */
+		hipz_update_SQA(&my_qp->ehca_qp_core, bad_wqe_cnt-1);
+		EDEB(6, "doorbell for %x wqes", bad_wqe_cnt);
+	}
+
+	if (statetrans == IB_QPST_RESET2INIT ||
+	    statetrans == IB_QPST_INIT2INIT) {
+		mqpcb->qp_enable = TRUE;
+		mqpcb->qp_state = EHCA_QPS_INIT;
+		update_mask = 0;
+		update_mask = EHCA_BMASK_SET(MQPCB_MASK_QP_ENABLE, 1);
+
+		EDEB(7, "ehca_qp=%p qp_num=%x "
+		     "RESET_2_INIT needs an additional enable "
+		     "-> update_mask=%lx", my_qp, ibqp->qp_num, update_mask);
+
+		hipz_rc = hipz_h_modify_qp(shca->ipz_hca_handle,
+					   my_qp->ipz_qp_handle,
+					   &my_qp->pf,
+					   update_mask,
+					   mqpcb,
+					   my_qp->ehca_qp_core.galpas.kernel);
+
+		if (hipz_rc != H_Success) {
+			retcode = ehca2ib_return_code(hipz_rc);
+			EDEB_ERR(4, "ENABLE in context of "
+				 "RESET_2_INIT failed! "
+				 "Maybe you didn't get a LID "
+				 "hipz_rc=%lx ehca_qp=%p qp_num=%x",
+				 hipz_rc, my_qp, ibqp->qp_num);
+			goto modify_qp_exit2;
+		}
+	}
+
+	if (statetrans == IB_QPST_ANY2RESET) {
+		ipz_QEit_reset(&my_qp->ehca_qp_core.ipz_rqueue);
+		ipz_QEit_reset(&my_qp->ehca_qp_core.ipz_squeue);
+	}
+
+	if (attr_mask & IB_QP_QKEY) {
+		my_qp->ehca_qp_core.qkey = attr->qkey;
+	}
+
+ modify_qp_exit2:
+	if (squeue_locked) { /* this means: sqe -> rts */
+		spin_unlock_irqrestore(&my_qp->spinlock_s, spl_flags);
+		my_qp->sqerr_purgeflag = 1;
+	}
+
+ modify_qp_exit1:
+	kfree(mqpcb);
+
+ modify_qp_exit0:
+	EDEB_EX(7, "ehca_qp=%p qp_num=%x ibqp_type=%x retcode=%x",
+		my_qp, ibqp->qp_num, ibqp->qp_type, retcode);
+	return retcode;
+}
+
+int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask)
+{
+	int ret = 0;
+	struct ehca_qp *my_qp = NULL;
+
+	EHCA_CHECK_ADR(ibqp);
+	EHCA_CHECK_ADR(attr);
+	EHCA_CHECK_ADR(ibqp->device);
+
+	my_qp = container_of(ibqp, struct ehca_qp, ib_qp);
+
+	EDEB_EN(7, "ehca_qp=%p qp_num=%x ibqp_type=%x attr_mask=%x",
+		my_qp, ibqp->qp_num, ibqp->qp_type, attr_mask);
+
+	ret = internal_modify_qp(ibqp, attr, attr_mask, 0);
+
+	EDEB_EX(7, "ehca_qp=%p qp_num=%x ibqp_type=%x ret=%x",
+		my_qp, ibqp->qp_num, ibqp->qp_type, ret);
+	return ret;
+}
+
+int ehca_query_qp(struct ib_qp *qp,
+		  struct ib_qp_attr *qp_attr,
+		  int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr)
+{
+	struct ehca_qp *my_qp = NULL;
+	struct ehca_shca *shca = NULL;
+	struct hcp_modify_qp_control_block *qpcb = NULL;
+
+	struct ipz_adapter_handle adapter_handle;
+	int cnt = 0, retcode = 0;
+	u64 hipz_rc = H_Success;
+
+	EHCA_CHECK_ADR(qp);
+	EHCA_CHECK_ADR(qp_attr);
+	EHCA_CHECK_DEVICE(qp->device);
+
+	my_qp = container_of(qp, struct ehca_qp, ib_qp);
+
+	EDEB_EN(7, "ehca_qp=%p qp_num=%x "
+		"qp_attr=%p qp_attr_mask=%x qp_init_attr=%p",
+		my_qp, qp->qp_num, qp_attr, qp_attr_mask, qp_init_attr);
+
+	shca = container_of(qp->device, struct ehca_shca, ib_device);
+	adapter_handle = shca->ipz_hca_handle;
+
+	if (qp_attr_mask & QP_ATTR_QUERY_NOT_SUPPORTED) {
+		retcode = -EINVAL;
+		EDEB_ERR(4,"Invalid attribute mask "
+			 "ehca_qp=%p qp_num=%x qp_attr_mask=%x ",
+			 my_qp, qp->qp_num, qp_attr_mask);
+		goto query_qp_exit0;
+	}
+
+	qpcb = kmalloc(EHCA_PAGESIZE, GFP_KERNEL );
+
+	if (qpcb == NULL) {
+		retcode = -ENOMEM;
+		EDEB_ERR(4,"Out of memory for qpcb "
+			 "ehca_qp=%p qp_num=%x", my_qp, qp->qp_num);
+		goto query_qp_exit0;
+	}
+	memset(qpcb, 0, sizeof(*qpcb));
+
+	hipz_rc = hipz_h_query_qp(adapter_handle,
+				  my_qp->ipz_qp_handle,
+				  &my_qp->pf,
+				  qpcb, my_qp->ehca_qp_core.galpas.kernel);
+
+	if (hipz_rc != H_Success) {
+		retcode = ehca2ib_return_code(hipz_rc);
+		EDEB_ERR(4,"hipz_h_query_qp() failed "
+			 "ehca_qp=%p qp_num=%x hipz_rc=%lx",
+			 my_qp, qp->qp_num, hipz_rc);
+		goto query_qp_exit1;
+	}
+
+	qp_attr->cur_qp_state = ehca2ib_qp_state(qpcb->qp_state);
+	qp_attr->qp_state = qp_attr->cur_qp_state;
+	if (qp_attr->cur_qp_state == -EINVAL) {
+		retcode = -EINVAL;
+		EDEB_ERR(4,"Got invalid ehca_qp_state=%x "
+			 "ehca_qp=%p qp_num=%x",
+			 qpcb->qp_state, my_qp, qp->qp_num);
+		goto query_qp_exit1;
+	}
+
+	if (qp_attr->qp_state == IB_QPS_SQD) {
+		qp_attr->sq_draining = TRUE;
+	}
+
+	qp_attr->qkey = qpcb->qkey;
+	qp_attr->path_mtu = qpcb->path_mtu;
+	qp_attr->path_mig_state = qpcb->path_migration_state;
+	qp_attr->rq_psn = qpcb->receive_psn;
+	qp_attr->sq_psn = qpcb->send_psn;
+	qp_attr->min_rnr_timer = qpcb->min_rnr_nak_timer_field;
+	qp_attr->cap.max_send_wr = qpcb->max_nr_outst_send_wr-1;
+	qp_attr->cap.max_recv_wr = qpcb->max_nr_outst_recv_wr-1;
+	/* UD_AV CIRCUMVENTION */
+	if (my_qp->ehca_qp_core.qp_type == IB_QPT_UD) {
+		qp_attr->cap.max_send_sge =
+			qpcb->actual_nr_sges_in_sq_wqe - 2;
+		qp_attr->cap.max_recv_sge =
+			qpcb->actual_nr_sges_in_rq_wqe - 2;
+	} else {
+		qp_attr->cap.max_send_sge =
+			qpcb->actual_nr_sges_in_sq_wqe;
+		qp_attr->cap.max_recv_sge =
+			qpcb->actual_nr_sges_in_rq_wqe;
+	}
+
+	qp_attr->cap.max_inline_data = my_qp->sq_max_inline_data_size;
+	qp_attr->dest_qp_num = qpcb->dest_qp_nr;
+
+	qp_attr->pkey_index =
+		EHCA_BMASK_GET(MQPCB_PRIM_P_KEY_IDX, qpcb->prim_p_key_idx);
+
+	qp_attr->port_num =
+		EHCA_BMASK_GET(MQPCB_PRIM_PHYS_PORT, qpcb->prim_phys_port);
+
+	qp_attr->timeout = qpcb->timeout;
+	qp_attr->retry_cnt = qpcb->retry_count;
+	qp_attr->rnr_retry = qpcb->rnr_retry_count;
+
+	qp_attr->alt_pkey_index =
+		EHCA_BMASK_GET(MQPCB_PRIM_P_KEY_IDX, qpcb->alt_p_key_idx);
+
+	qp_attr->alt_port_num = qpcb->alt_phys_port;
+	qp_attr->alt_timeout = qpcb->timeout_al;
+
+	/* primary av */
+	qp_attr->ah_attr.sl = qpcb->service_level;
+
+	if (qpcb->send_grh_flag) {
+		qp_attr->ah_attr.ah_flags = IB_AH_GRH;
+	}
+
+	qp_attr->ah_attr.static_rate = qpcb->max_static_rate;
+	qp_attr->ah_attr.dlid = qpcb->dlid;
+	qp_attr->ah_attr.src_path_bits = qpcb->source_path_bits;
+	qp_attr->ah_attr.port_num = qp_attr->port_num;
+
+	/* primary GRH */
+	qp_attr->ah_attr.grh.traffic_class = qpcb->traffic_class;
+	qp_attr->ah_attr.grh.hop_limit = qpcb->hop_limit;
+	qp_attr->ah_attr.grh.sgid_index = qpcb->source_gid_idx;
+	qp_attr->ah_attr.grh.flow_label = qpcb->flow_label;
+
+	for (cnt = 0; cnt < 16; cnt++) {
+		qp_attr->ah_attr.grh.dgid.raw[cnt] =
+			qpcb->dest_gid.byte[cnt];
+	}
+
+	/* alternate AV */
+	qp_attr->alt_ah_attr.sl = qpcb->service_level_al;
+	if (qpcb->send_grh_flag_al) {
+		qp_attr->alt_ah_attr.ah_flags = IB_AH_GRH;
+	}
+
+	qp_attr->alt_ah_attr.static_rate = qpcb->max_static_rate_al;
+	qp_attr->alt_ah_attr.dlid = qpcb->dlid_al;
+	qp_attr->alt_ah_attr.src_path_bits = qpcb->source_path_bits_al;
+
+	/* alternate GRH */
+	qp_attr->alt_ah_attr.grh.traffic_class = qpcb->traffic_class_al;
+	qp_attr->alt_ah_attr.grh.hop_limit = qpcb->hop_limit_al;
+	qp_attr->alt_ah_attr.grh.sgid_index = qpcb->source_gid_idx_al;
+	qp_attr->alt_ah_attr.grh.flow_label = qpcb->flow_label_al;
+
+	for (cnt = 0; cnt < 16; cnt++) {
+		qp_attr->alt_ah_attr.grh.dgid.raw[cnt] =
+			qpcb->dest_gid_al.byte[cnt];
+	}
+
+	/* return init attributes given in ehca_create_qp */
+	if (qp_init_attr != NULL) {
+		*qp_init_attr = my_qp->init_attr;
+	}
+
+	EDEB(7,	"ehca_qp=%p qp_number=%x dest_qp_number=%x "
+	     "dlid=%x path_mtu=%x dest_gid=%lx_%lx "
+	     "service_level=%x qp_state=%x",
+	     my_qp, qpcb->qp_number, qpcb->dest_qp_nr,
+	     qpcb->dlid, qpcb->path_mtu,
+	     qpcb->dest_gid.dw[0], qpcb->dest_gid.dw[1],
+	     qpcb->service_level, qpcb->qp_state);
+
+	EDEB_DMP(7, qpcb, 4*70, "ehca_qp=%p qp_num=%x", my_qp, qp->qp_num);
+
+ query_qp_exit1:
+	kfree(qpcb);
+
+ query_qp_exit0:
+	EDEB_EX(7, "ehca_qp=%p qp_num=%x retcode=%x",
+		my_qp, qp->qp_num, retcode);
+	return retcode;
+}
+
+int ehca_destroy_qp(struct ib_qp *ibqp)
+{
+	struct ehca_qp *my_qp = NULL;
+	struct ehca_shca *shca = NULL;
+	struct ehca_pfqp *qp_pf = NULL;
+	u32 qp_num = 0;
+	int retcode = 0;
+	u64 hipz_ret = H_Success;
+	u8 port_num = 0;
+	enum ib_qp_type	qp_type;
+
+	EHCA_CHECK_ADR(ibqp);
+
+	my_qp = container_of(ibqp, struct ehca_qp, ib_qp);
+	qp_num = ibqp->qp_num;
+	qp_pf = &my_qp->pf;
+
+	shca = container_of(ibqp->device, struct ehca_shca, ib_device);
+
+	EDEB_EN(7, "ehca_qp=%p qp_num=%x", my_qp, ibqp->qp_num);
+
+	if (my_qp->send_cq != NULL) {
+		retcode = ehca_cq_unassign_qp(my_qp->send_cq,
+					      my_qp->ehca_qp_core.real_qp_num);
+		if (retcode != 0) {
+			EDEB_ERR(4, "Couldn't unassign qp from send_cq "
+				 "ret=%x qp_num=%x cq_num=%x",
+				 retcode, my_qp->ib_qp.qp_num,
+				 my_qp->send_cq->cq_number);
+			goto destroy_qp_exit0;
+		}
+	}
+
+	down_write(&ehca_qp_idr_sem);
+	idr_remove(&ehca_qp_idr, my_qp->token);
+	up_write(&ehca_qp_idr_sem);
+
+	/* un-mmap if vma alloc */
+	if (my_qp->uspace_rqueue != 0) {
+		struct ehca_qp_core *qp_core = &my_qp->ehca_qp_core;
+		retcode = ehca_munmap(my_qp->uspace_rqueue,
+				      qp_core->ipz_rqueue.queue_length);
+		retcode = ehca_munmap(my_qp->uspace_squeue,
+				      qp_core->ipz_squeue.queue_length);
+		retcode = ehca_munmap(my_qp->uspace_fwh, 4096);
+	}
+
+	hipz_ret = hipz_h_destroy_qp(shca->ipz_hca_handle, my_qp);
+	if (hipz_ret != H_Success) {
+		EDEB_ERR(4, "hipz_h_destroy_qp() failed "
+			 "rc=%lx ehca_qp=%p qp_num=%x",
+			 hipz_ret, qp_pf, qp_num);
+		goto destroy_qp_exit0;
+	}
+
+	port_num = my_qp->init_attr.port_num;
+	qp_type  = my_qp->init_attr.qp_type;
+
+	/* TODO: later with IB_QPT_SMI */
+	if (qp_type == IB_QPT_GSI) {
+		struct ib_event event;
+
+		EDEB(4, "EHCA port %x is inactive.", port_num);
+		event.device = &shca->ib_device;
+		event.event = IB_EVENT_PORT_ERR;
+		event.element.port_num = port_num;
+		shca->sport[port_num - 1].port_state = IB_PORT_DOWN;
+		ib_dispatch_event(&event);
+	}
+
+	ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_rqueue);
+	ipz_queue_dtor(&my_qp->ehca_qp_core.ipz_squeue);
+	ehca_qp_delete(my_qp);
+
+ destroy_qp_exit0:
+	retcode = ehca2ib_return_code(hipz_ret);
+	EDEB_EX(7,"ret=%x", retcode);
+	return retcode;
+}
+
+/* eof ehca_qp.c */


From rolandd at cisco.com  Sat Feb 18 11:57:19 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:19 -0800
Subject: [PATCH 06/22] Queue handling
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005719.13620.95136.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

Code like

	#ifndef __PPC64__
		void * dummy1;              /* make sure we use the same thing on 32 bit */
	#endif

looks _very_ suspicious.  Much better to make sure that the
structures are laid out the same no matter what the word size
of the architecture is rather than relying on fragile hacks
like this.
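
One way to get there, sketched below in plain C and not tested against
the driver: keep queue positions as fixed-width values so that no member
changes size between 32-bit and 64-bit builds, and let the compiler
check the layout. The struct name example_queue, its current_offset and
queue_base members, and the helper example_queue_size() are made up for
illustration; kernel code would presumably use u32/u64 and
BUILD_BUG_ON() instead of <stdint.h> and _Static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: no pointers, only fixed-width members, so the
 * structure has the same size and member offsets for any word size.
 */
struct example_queue {
	uint64_t current_offset; /* byte offset of the current queue entry */
	uint64_t queue_base;     /* queue start address kept as a 64-bit value */
	uint32_t qe_size;        /* queue entry size */
	uint32_t act_nr_of_sg;
	uint32_t queue_length;   /* queue length allocated in bytes */
	uint32_t pagesize;
	uint32_t toggle_state;   /* toggle flag */
	uint32_t reserved;       /* explicit padding to a multiple of 8 bytes */
};

/* Compile-time checks: the build fails if the layout ever drifts. */
_Static_assert(sizeof(struct example_queue) == 40,
	       "example_queue must be 40 bytes");
_Static_assert(offsetof(struct example_queue, queue_base) == 8,
	       "queue_base must sit at offset 8");
_Static_assert(offsetof(struct example_queue, qe_size) == 16,
	       "qe_size must sit at offset 16");

/* Illustrative helper: report the size that the assertions pin down. */
static size_t example_queue_size(void)
{
	return sizeof(struct example_queue);
}
```

With a layout like this the dummy members are not needed, and a
mismatch shows up as a build error rather than as silent corruption.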
---

 drivers/infiniband/hw/ehca/ipz_pt_fn.c      |  137 ++++++++++++++++++++++
 drivers/infiniband/hw/ehca/ipz_pt_fn.h      |  165 +++++++++++++++++++++++++++
 drivers/infiniband/hw/ehca/ipz_pt_fn_core.h |  152 +++++++++++++++++++++++++
 3 files changed, 454 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
new file mode 100644
index 0000000..d6c490c
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
@@ -0,0 +1,137 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  internal queue handling
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ipz_pt_fn.c,v 1.16 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#define DEB_PREFIX "iptz"
+
+#include "ehca_kernel.h"
+#include "ehca_tools.h"
+#include "ipz_pt_fn.h"
+
+extern int ehca_hwlevel;
+
+void *ipz_QPageit_get_inc(struct ipz_queue *queue)
+{
+	void *retvalue = NULL;
+	u8 *EOF_last_page = queue->queue + queue->queue_length;
+
+	retvalue = queue->current_q_addr;
+	queue->current_q_addr += queue->pagesize;
+	if (queue->current_q_addr > EOF_last_page) {
+		queue->current_q_addr -= queue->pagesize;
+		retvalue = NULL;
+	}
+
+	if ((((u64)retvalue) % EHCA_PAGESIZE) != 0) {
+		EDEB(4,  "ERROR!! not at PAGE-Boundary");
+		return (NULL);
+	}
+	EDEB(7, "queue=%p retvalue=%p", queue, retvalue);
+	return (retvalue);
+}
+
+void *ipz_QEit_EQ_get_inc(struct ipz_queue *queue)
+{
+	void *retvalue = NULL;
+	u8 *last_entry_in_q = queue->queue + queue->queue_length
+	    - queue->qe_size;
+
+	retvalue = queue->current_q_addr;
+	queue->current_q_addr += queue->qe_size;
+	if (queue->current_q_addr > last_entry_in_q) {
+		queue->current_q_addr = queue->queue;
+		queue->toggle_state = (~queue->toggle_state) & 1;
+	}
+
+	EDEB(7, "queue=%p retvalue=%p new current_q_addr=%p qe_size=%x",
+	     queue, retvalue, queue->current_q_addr, queue->qe_size);
+
+	return (retvalue);
+}
+
+int ipz_queue_ctor(struct ipz_queue *queue,
+		   const u32 nr_of_pages,
+		   const u32 pagesize, const u32 qe_size, const u32 nr_of_sg)
+{
+	EDEB_EN(7,  "nr_of_pages=%x pagesize=%x qe_size=%x",
+		nr_of_pages, pagesize, qe_size);
+	queue->queue_length = nr_of_pages * pagesize;
+	queue->queue = vmalloc(queue->queue_length);
+	if (queue->queue == 0) {
+		EDEB(4,  "ERROR!! didn't get the memory");
+		return (FALSE);
+	}
+	if ((((u64)queue->queue) & (EHCA_PAGESIZE - 1)) != 0) {
+		EDEB(4,  "ERROR!! QUEUE doesn't start at "
+		     "page boundary");
+		vfree(queue->queue);
+		return (FALSE);
+	}
+
+	memset(queue->queue, 0, queue->queue_length);
+	queue->current_q_addr = queue->queue;
+	queue->qe_size = qe_size;
+	queue->act_nr_of_sg = nr_of_sg;
+	queue->pagesize = pagesize;
+	queue->toggle_state = 1;
+	EDEB_EX(7,  "queue_length=%x queue=%p qe_size=%x"
+		" act_nr_of_sg=%x", queue->queue_length, queue->queue,
+		queue->qe_size, queue->act_nr_of_sg);
+	return TRUE;
+}
+
+int ipz_queue_dtor(struct ipz_queue *queue)
+{
+	EDEB_EN(7,  "ipz_queue pointer=%p", queue);
+	if (queue == NULL) {
+		return (FALSE);
+	}
+	if (queue->queue == NULL) {
+		return (FALSE);
+	}
+	EDEB(7,  "destructing a queue with the following "
+	     "properties:\n act_nr_of_sg=%x pagesize=%x qe_size=%x",
+	     queue->act_nr_of_sg, queue->pagesize, queue->qe_size);
+	vfree(queue->queue);
+
+	EDEB_EX(7,  "queue freed!");
+	return TRUE;
+}
diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.h b/drivers/infiniband/hw/ehca/ipz_pt_fn.h
new file mode 100644
index 0000000..2e197db
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.h
@@ -0,0 +1,165 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  internal queue handling
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ipz_pt_fn.h,v 1.11 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __IPZ_PT_FN_H__
+#define __IPZ_PT_FN_H__
+
+#include "ipz_pt_fn_core.h"
+#include "ehca_qes.h"
+
+#define EHCA_PAGESIZE   4096UL
+#define EHCA_PT_ENTRIES 512UL
+
+/** @brief generic page table
+ */
+struct ipz_pt {
+	u64 entries[EHCA_PT_ENTRIES];
+};
+
+/** @brief generic page
+ */
+struct ipz_page {
+	u8 entries[EHCA_PAGESIZE];
+};
+
+/** @brief page table for a queue, only to be used in pf
+ */
+struct ipz_qpt {
+	/* queue page tables (kv), use u64 because we know the element length */
+	u64 *qpts;
+	u32 allocated_qpts_entries;
+	u32 nr_of_PTEs; /*  number of page table entries PTE iterators */
+	u64 *current_pte_addr;
+};
+
+/** @brief constructor for a ipz_queue_t, placement new for ipz_queue_t,
+    new for all dependent data structures
+
+    all QP Tables are the same
+    flow:
+    -# allocate+pin queue
+    @see ipz_qpt_ctor()
+    @returns true if ok, false if out of memory
+ */
+int ipz_queue_ctor(struct ipz_queue *queue, const u32 nr_of_pages,
+		   const u32 pagesize,
+		   const u32 qe_size, /* queue entry size*/
+		   const u32 nr_of_sg);
+
+/** @brief destructor for a ipz_queue_t
+    -# free queue
+    @see ipz_queue_ctor()
+    @returns true if ok, false if queue was NULL-ptr or free failed
+*/
+int ipz_queue_dtor(struct ipz_queue *queue);
+
+/** @brief constructor for a ipz_qpt_t,
+ * placement new for struct ipz_queue, new for all dependent data structures
+ *
+ *  all QP Tables are the same,
+ *  flow:
+ *  -# allocate+pin queue
+ *  -# initialise ptcb
+ *  -# allocate+pin PTs
+ *  -# link PTs to a ring, according to HCA Arch, set bit62 if needed
+ *  -# the ring must have room for exactly nr_of_PTEs
+ *  @see ipz_qpt_ctor()
+ */
+void ipz_qpt_ctor(struct ipz_qpt *qpt,
+		  struct ehca_bridge_handle bridge,
+		  const u32 nr_of_QEs,
+		  const u32 pagesize,
+		  const u32 qe_size,
+		  const u8 lowbyte, const u8 toggle,
+		  u32 * act_nr_of_QEs,
+		  u32 * act_nr_of_pages);
+
+/**  @brief return current Queue Entry, increment Queue Entry iterator by one
+     step in struct ipz_queue, will wrap in ringbuffer
+     @returns address (kv) of Queue Entry BEFORE increment
+     @warning don't use in parallel with ipz_QPageit_get_inc()
+     @warning unpredictable results may occur if steps>act_nr_of_queue_entries
+
+     fix EQ page problems
+ */
+void *ipz_QEit_EQ_get_inc(struct ipz_queue *queue);
+
+/**  @brief return current Event Queue Entry, increment Queue Entry iterator
+     by one step in struct ipz_queue if valid, will wrap in ringbuffer
+     @returns address (kv) of Queue Entry BEFORE increment
+     @returns 0 and does not increment, if wrong valid state
+     @warning don't use in parallel with ipz_QPageit_get_inc()
+     @warning unpredictable results may occur if steps>act_nr_of_queue_entries
+ */
+inline static void *ipz_QEit_EQ_get_inc_valid(struct ipz_queue *queue)
+{
+	void *retvalue = ipz_QEit_get(queue);
+	u32 qe = *(u8 *) retvalue;
+	EDEB(7, "ipz_QEit_EQ_get_inc_valid qe=%x", qe);
+	if ((qe >> 7) == (queue->toggle_state & 1)) {
+		/* this is a good one */
+		ipz_QEit_EQ_get_inc(queue);
+	} else {
+		retvalue = NULL;
+	}
+	return (retvalue);
+}
+
+/**
+     @returns address (GX) of first queue entry
+ */
+inline static u64 ipz_qpt_get_firstpage(struct ipz_qpt *qpt)
+{
+	return (be64_to_cpu(qpt->qpts[0]));
+}
+
+/**
+     @returns address (kv) of first page of queue page table
+ */
+inline static void *ipz_qpt_get_qpt(struct ipz_qpt *qpt)
+{
+	return (qpt->qpts);
+}
+
+#endif /* __IPZ_PT_FN_H__ */
diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn_core.h b/drivers/infiniband/hw/ehca/ipz_pt_fn_core.h
new file mode 100644
index 0000000..1b9a114
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn_core.h
@@ -0,0 +1,152 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  internal queue handling
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ipz_pt_fn_core.h,v 1.12 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef __IPZ_PT_FN_CORE_H__
+#define __IPZ_PT_FN_CORE_H__
+
+#ifdef __KERNEL__
+#include "ehca_tools.h"
+#else /* some replacements for kernel stuff */
+#include "ehca_utools.h"
+#endif
+
+#include "ehca_qes.h"
+
+/** @brief generic queue in linux kernel virtual memory (kv)
+ */
+struct ipz_queue {
+#ifndef __PPC64__
+	void * dummy1;              /* make sure we use the same thing on 32 bit */
+#endif
+	u8 *current_q_addr;         /* current queue entry */
+#ifndef __PPC64__
+	void * dummy2;
+#endif
+	u8 *queue;                  /* points to first queue entry */
+	u32 qe_size;                /* queue entry size */
+	u32 act_nr_of_sg;
+	u32 queue_length;           /* queue length allocated in bytes */
+	u32 pagesize;
+	u32 toggle_state;           /* toggle flag - per page */
+	u32 dummy3;                 /* 64 bit alignment*/
+};
+
+/**  @brief return current Queue Entry
+     @returns address (kv) of Queue Entry
+ */
+static inline void *ipz_QEit_get(struct ipz_queue *queue)
+{
+	return (queue->current_q_addr);
+}
+
+/**  @brief return current Queue Page , increment Queue Page iterator from
+     page to page in struct ipz_queue, last increment will return 0! and
+     NOT wrap
+     @returns address (kv) of Queue Page
+     @warning don't use in parallel with ipz_QEit_get_inc()
+ */
+void *ipz_QPageit_get_inc(struct ipz_queue *queue);
+
+/**  @brief return current Queue Entry, increment Queue Entry iterator by one
+     step in struct ipz_queue, will wrap in ringbuffer
+     @returns address (kv) of Queue Entry BEFORE increment
+     @warning don't use in parallel with ipz_QPageit_get_inc()
+     @warning unpredictable results may occur if steps>act_nr_of_queue_entries
+ */
+static inline void *ipz_QEit_get_inc(struct ipz_queue *queue)
+{
+	void *retvalue = 0;
+	u8 *last_entry_in_q = queue->queue + queue->queue_length
+	    - queue->qe_size;
+
+	retvalue = queue->current_q_addr;
+	queue->current_q_addr += queue->qe_size;
+	if (queue->current_q_addr > last_entry_in_q) {
+		queue->current_q_addr = queue->queue;
+		/* toggle the valid flag */
+		queue->toggle_state = (~queue->toggle_state) & 1;
+	}
+
+	EDEB(7, "queue=%p retvalue=%p new current_q_addr=%p qe_size=%x",
+	     queue, retvalue, queue->current_q_addr, queue->qe_size);
+
+	return (retvalue);
+}
+
+/**  @brief return current Queue Entry, increment Queue Entry iterator by one
+     step in struct ipz_queue, will wrap in ringbuffer
+     @returns address (kv) of Queue Entry BEFORE increment
+     @returns 0 and does not increment, if wrong valid state
+     @warning don't use in parallel with ipz_QPageit_get_inc()
+     @warning unpredictable results may occur if steps>act_nr_of_queue_entries
+ */
+static inline void *ipz_QEit_get_inc_valid(struct ipz_queue *queue)
+{
+	void *retvalue = ipz_QEit_get(queue);
+#ifdef USERSPACE_DRIVER
+
+	u32 qe =
+	    ((struct ehca_cqe *)(ehca_ktou((struct ehca_cqe *)retvalue)))->
+	    cqe_flags;
+#else
+	u32 qe = ((struct ehca_cqe *)retvalue)->cqe_flags;
+#endif
+	if ((qe >> 7) == (queue->toggle_state & 1)) {
+		/* this is a good one */
+		ipz_QEit_get_inc(queue);
+	} else
+		retvalue = 0;
+	return (retvalue);
+}
+
+/**  @brief returns and resets Queue Entry iterator
+     @returns address (kv) of first Queue Entry
+ */
+static inline void *ipz_QEit_reset(struct ipz_queue *queue)
+{
+	queue->current_q_addr = queue->queue;
+	return (queue->queue);
+}
+
+#endif /* __IPZ_PT_FN_CORE_H__ */

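The wrap-and-toggle scheme used by ipz_QEit_get_inc() and
ipz_QEit_get_inc_valid() above can be sketched in plain userspace C. This
is only a minimal model under stated assumptions, not the driver code:
struct ring and the ring_* functions are hypothetical names, and the
sketch assumes the first byte of every entry carries the flags, with bit
7 as the valid bit (modelled on the cqe_flags >> 7 test in the patch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of struct ipz_queue (names are assumptions). */
struct ring {
	uint8_t *queue;         /* first entry */
	uint8_t *current;       /* current entry */
	uint32_t qe_size;       /* bytes per entry */
	uint32_t queue_length;  /* total bytes in the ring */
	uint32_t toggle_state;  /* valid-bit value expected on this pass */
};

/* Return the current entry, then step; wrap and flip the toggle at the end. */
static void *ring_get_inc(struct ring *r)
{
	uint8_t *last;
	void *ret;

	assert(r->qe_size != 0 && r->queue_length >= r->qe_size);
	last = r->queue + r->queue_length - r->qe_size;

	ret = r->current;
	r->current += r->qe_size;
	if (r->current > last) {
		r->current = r->queue;
		/* entries written on the next pass carry the opposite valid bit */
		r->toggle_state = (~r->toggle_state) & 1;
	}
	return ret;
}

/*
 * Return the current entry only if its valid bit matches the toggle state.
 * Assumption: byte 0 of each entry is the flags byte, bit 7 is the valid bit.
 */
static void *ring_get_inc_valid(struct ring *r)
{
	uint8_t flags = *r->current;

	if ((uint32_t)(flags >> 7) != (r->toggle_state & 1))
		return NULL;	/* stale entry, do not advance */
	return ring_get_inc(r);
}
```

Because the expected valid bit flips on every wrap, an entry left over from
the previous pass is rejected without the consumer ever having to clear it.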

From rolandd at cisco.com  Sat Feb 18 11:57:41 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:41 -0800
Subject: [PATCH 13/22] HCA query functions
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005741.13620.93906.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>


---

 drivers/infiniband/hw/ehca/ehca_hca.c |  321 +++++++++++++++++++++++++++++++++
 1 files changed, 321 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
new file mode 100644
index 0000000..af05a5c
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -0,0 +1,321 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  HCA query functions
+ *
+ *  Authors: Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_hca.c,v 1.46 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#undef DEB_PREFIX
+#define DEB_PREFIX "shca"
+
+#include "ehca_kernel.h"
+#include "ehca_tools.h"
+
+#include "hcp_if.h"		/* TODO: later via hipz_* header file */
+
+#define TO_MAX_INT(dest, src)				\
+	do {						\
+		(dest) = ((src) >= INT_MAX) ?		\
+			INT_MAX : (src);		\
+	} while (0)
+
+int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props)
+{
+	int ret = 0;
+	struct ehca_shca *shca;
+	struct query_hca_rblock *rblock;
+
+	EDEB_EN(7, "");
+	EHCA_CHECK_DEVICE(ibdev);
+
+	memset(props, 0, sizeof(struct ib_device_attr));
+	shca = container_of(ibdev, struct ehca_shca, ib_device);
+
+	rblock = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (rblock == NULL) {
+		EDEB_ERR(4, "Can't allocate rblock memory.");
+		ret = -ENOMEM;
+		goto query_device0;
+	}
+
+	memset(rblock, 0, PAGE_SIZE);
+
+	if (hipz_h_query_hca(shca->ipz_hca_handle, rblock) != H_Success) {
+		EDEB_ERR(4, "Can't query device properties");
+		ret = -EINVAL;
+		goto query_device1;
+	}
+	props->fw_ver              = rblock->hw_ver;
+	/* TODO: memcpy(&props->sys_image_guid, ...); */
+	props->max_mr_size         = rblock->max_mr_size;
+	/* TODO: props->page_size_cap        */
+	props->vendor_id           = rblock->vendor_id >> 8;
+	props->vendor_part_id      = rblock->vendor_part_id >> 16;
+	props->hw_ver              = rblock->hw_ver;
+	TO_MAX_INT(props->max_qp, (rblock->max_qp - rblock->cur_qp));
+	/* TODO: props->max_qp_wr  =         */
+	/* TODO: props->device_cap_flags     */
+	props->max_sge             = rblock->max_sge;
+	props->max_sge_rd          = rblock->max_sge_rd;
+	TO_MAX_INT(props->max_cq, (rblock->max_cq - rblock->cur_cq));
+	props->max_cqe             = rblock->max_cqe;
+	TO_MAX_INT(props->max_mr, (rblock->max_mr - rblock->cur_mr));
+	TO_MAX_INT(props->max_pd, rblock->max_pd);
+	/* TODO: props->max_qp_rd_atom       */
+	/* TODO: props->max_qp_init_rd_atom  */
+	/* TODO: props->atomic_cap           */
+	/* TODO: props->max_ee               */
+	/* TODO: props->max_rdd              */
+	props->max_mw              = rblock->max_mw;
+	TO_MAX_INT(props->max_mw, (rblock->max_mw - rblock->cur_mw));
+	props->max_raw_ipv6_qp     = rblock->max_raw_ipv6_qp;
+	props->max_raw_ethy_qp     = rblock->max_raw_ethy_qp;
+	props->max_mcast_grp       = rblock->max_mcast_grp;
+	props->max_mcast_qp_attach = rblock->max_qps_attached_mcast_grp;
+	props->max_total_mcast_qp_attach = rblock->max_qps_attached_all_mcast_grp;
+
+	TO_MAX_INT(props->max_ah, rblock->max_ah);
+
+	props->max_fmr             = rblock->max_mr;
+	/* TODO: props->max_map_per_fmr      */
+
+	/* TODO: props->max_srq              */
+	/* TODO: props->max_srq_wr           */
+	/* TODO: props->max_srq_sge          */
+	props->max_srq             = 0;
+	props->max_srq_wr          = 0;
+	props->max_srq_sge         = 0;
+
+	/* TODO: props->max_pkeys            */
+	props->max_pkeys           = 16;
+
+	props->local_ca_ack_delay  = rblock->local_ca_ack_delay;
+
+      query_device1:
+	kfree(rblock);
+
+      query_device0:
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+int ehca_query_port(struct ib_device *ibdev,
+		    u8 port, struct ib_port_attr *props)
+{
+	int ret = 0;
+	struct ehca_shca *shca;
+	struct query_port_rblock *rblock;
+
+	EDEB_EN(7, "port=%x", port);
+	EHCA_CHECK_DEVICE(ibdev);
+
+	memset(props, 0, sizeof(struct ib_port_attr));
+	shca = container_of(ibdev, struct ehca_shca, ib_device);
+
+	rblock = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (rblock == NULL) {
+		EDEB_ERR(4, "Can't allocate rblock memory.");
+		ret = -ENOMEM;
+		goto query_port0;
+	}
+
+	memset(rblock, 0, PAGE_SIZE);
+
+	if (hipz_h_query_port(shca->ipz_hca_handle, port, rblock) != H_Success) {
+		EDEB_ERR(4, "Can't query port properties");
+		ret = -EINVAL;
+		goto query_port1;
+	}
+
+	props->state = rblock->state;
+
+	switch (rblock->max_mtu) {
+	case 0x1:
+		props->active_mtu = props->max_mtu = IB_MTU_256;
+		break;
+	case 0x2:
+		props->active_mtu = props->max_mtu = IB_MTU_512;
+		break;
+	case 0x3:
+		props->active_mtu = props->max_mtu = IB_MTU_1024;
+		break;
+	case 0x4:
+		props->active_mtu = props->max_mtu = IB_MTU_2048;
+		break;
+	case 0x5:
+		props->active_mtu = props->max_mtu = IB_MTU_4096;
+		break;
+	default:
+		EDEB_ERR(4, "Unknown MTU size: %x.", rblock->max_mtu);
+	}
+
+	props->gid_tbl_len     = rblock->gid_tbl_len;
+	/* TODO: props->port_cap_flags */
+	props->max_msg_sz      = rblock->max_msg_sz;
+	props->bad_pkey_cntr   = rblock->bad_pkey_cntr;
+	props->qkey_viol_cntr  = rblock->qkey_viol_cntr;
+	props->pkey_tbl_len    = rblock->pkey_tbl_len;
+	props->lid             = rblock->lid;
+	props->sm_lid          = rblock->sm_lid;
+	props->lmc             = rblock->lmc;
+	/* TODO: max_vl_num            */
+	props->sm_sl           = rblock->sm_sl;
+	props->subnet_timeout  = rblock->subnet_timeout;
+	props->init_type_reply = rblock->init_type_reply;
+
+	/* TODO: props->active_width   */
+	props->active_width    = IB_WIDTH_12X;
+	/* TODO: props->active_speed   */
+
+	/* TODO: props->phys_state     */
+
+      query_port1:
+	kfree(rblock);
+
+      query_port0:
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+int ehca_query_pkey(struct ib_device *ibdev, u8 port, u16 index, u16 *pkey)
+{
+	int ret = 0;
+	struct ehca_shca *shca;
+	struct query_port_rblock *rblock;
+
+	EDEB_EN(7, "port=%x index=%x", port, index);
+	EHCA_CHECK_DEVICE(ibdev);
+
+	if (index >= 16) {
+		EDEB_ERR(4, "Invalid index: %x.", index);
+		ret = -EINVAL;
+		goto query_pkey0;
+	}
+
+	shca = container_of(ibdev, struct ehca_shca, ib_device);
+
+	rblock = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (rblock == NULL) {
+		EDEB_ERR(4,  "Can't allocate rblock memory.");
+		ret = -ENOMEM;
+		goto query_pkey0;
+	}
+
+	memset(rblock, 0, PAGE_SIZE);
+
+	if (hipz_h_query_port(shca->ipz_hca_handle, port, rblock) != H_Success) {
+		EDEB_ERR(4, "Can't query port properties");
+		ret = -EINVAL;
+		goto query_pkey1;
+	}
+
+	memcpy(pkey, &rblock->pkey_entries[index], sizeof(u16));
+
+      query_pkey1:
+	kfree(rblock);
+
+      query_pkey0:
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+int ehca_query_gid(struct ib_device *ibdev, u8 port,
+		   int index, union ib_gid *gid)
+{
+	int ret = 0;
+	struct ehca_shca *shca;
+	struct query_port_rblock *rblock;
+
+	EDEB_EN(7, "port=%x index=%x", port, index);
+	EHCA_CHECK_DEVICE(ibdev);
+
+	if (index < 0 || index > 255) {
+		EDEB_ERR(4, "Invalid index: %x.", index);
+		ret = -EINVAL;
+		goto query_gid0;
+	}
+
+	shca = container_of(ibdev, struct ehca_shca, ib_device);
+
+	rblock = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (rblock == NULL) {
+		EDEB_ERR(4, "Can't allocate rblock memory.");
+		ret = -ENOMEM;
+		goto query_gid0;
+	}
+
+	memset(rblock, 0, PAGE_SIZE);
+
+	if (hipz_h_query_port(shca->ipz_hca_handle, port, rblock) != H_Success) {
+		EDEB_ERR(4, "Can't query port properties");
+		ret = -EINVAL;
+		goto query_gid1;
+	}
+
+	memcpy(&gid->raw[0], &rblock->gid_prefix, sizeof(u64));
+	memcpy(&gid->raw[8], &rblock->guid_entries[index], sizeof(u64));
+
+      query_gid1:
+	kfree(rblock);
+
+      query_gid0:
+	EDEB_EX(7, "ret=%x GID=%lx%lx", ret,
+		*(u64 *) & gid->raw[0],
+		*(u64 *) & gid->raw[8]);
+
+	return ret;
+}
+
+int ehca_modify_port(struct ib_device *ibdev,
+		     u8 port, int port_modify_mask,
+		     struct ib_port_modify *props)
+{
+	int ret = 0;
+
+	EDEB_EN(7, "port=%x", port);
+	EHCA_CHECK_DEVICE(ibdev);
+
+	/* TODO: implementation */
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}

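Two of the conversions in ehca_query_device() and ehca_query_port() above
can be written as small functions instead of a macro and a switch. The
sketch below is illustrative only: clamp_to_int(), remaining() and
mtu_code_to_bytes() are hypothetical helpers, the 64-bit counter width is
an assumption about the rblock fields, and the underflow guard in
remaining() is an addition that the patch itself does not have. The MTU
mapping relies on the IB encoding 1..5 = 256..4096 bytes, as in the switch
on rblock->max_mtu.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Clamp an unsigned firmware counter to the int range used by ib_device_attr. */
static int clamp_to_int(uint64_t value)
{
	return value >= (uint64_t)INT_MAX ? INT_MAX : (int)value;
}

/* Resources still available: max - cur, reported as 0 if cur exceeds max. */
static int remaining(uint64_t max, uint64_t cur)
{
	if (cur >= max)
		return 0;
	return clamp_to_int(max - cur);
}

/* Decode an IB MTU code (1..5) into bytes; 0 means "unknown code". */
static unsigned int mtu_code_to_bytes(unsigned int code)
{
	if (code < 1 || code > 5)
		return 0;
	return 128u << code;	/* 1 -> 256, 2 -> 512, ..., 5 -> 4096 */
}

/* Small self-check of the helpers above. */
static void query_helpers_selfcheck(void)
{
	assert(clamp_to_int(7) == 7);
	assert(remaining(10, 3) == 7);
	assert(mtu_code_to_bytes(3) == 1024);
}
```

Functions evaluate their arguments exactly once and are type checked,
which avoids the usual pitfalls of a statement-like macro such as
TO_MAX_INT.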

From rolandd at cisco.com  Sat Feb 18 11:57:52 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:52 -0800
Subject: [PATCH 18/22] ehca address vectors, multicast groups,
	protection domains
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005752.13620.3255.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>


---

 drivers/infiniband/hw/ehca/ehca_av.c    |  258 +++++++++++++++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_mcast.c |  194 +++++++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_pd.c    |  100 ++++++++++++
 3 files changed, 552 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_av.c b/drivers/infiniband/hw/ehca/ehca_av.c
new file mode 100644
index 0000000..f5382c2
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_av.c
@@ -0,0 +1,258 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  address vector functions
+ *
+ *  Authors: Reinhard Ernst <rernst at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_av.c,v 1.28 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+
+#define DEB_PREFIX "ehav"
+
+#include "ehca_kernel.h"
+#include "ehca_tools.h"
+#include "ehca_iverbs.h"
+#include "hcp_if.h"
+
+struct ib_ah *ehca_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr)
+{
+	extern int ehca_static_rate;
+	int retcode = 0;
+	struct ehca_av *av = NULL;
+
+	EHCA_CHECK_PD_P(pd);
+	EHCA_CHECK_ADR_P(ah_attr);
+
+	EDEB_EN(7,"pd=%p ah_attr=%p", pd, ah_attr);
+
+	av = ehca_av_new();
+	if (!av) {
+		EDEB_ERR(4,"Out of memory pd=%p ah_attr=%p", pd, ah_attr);
+		retcode = -ENOMEM;
+		goto create_ah_exit0;
+	}
+
+	av->av.sl = ah_attr->sl;
+	av->av.dlid = ntohs(ah_attr->dlid);
+	av->av.slid_path_bits = ah_attr->src_path_bits;
+
+	if (ehca_static_rate < 0) {
+	        av->av.ipd = ah_attr->static_rate;
+	} else {
+	        av->av.ipd = ehca_static_rate;
+	}
+
+	av->av.lnh = ah_attr->ah_flags;
+	av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_IPVERSION_MASK, 6);
+	av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_TCLASS_MASK,
+					    ah_attr->grh.traffic_class);
+	av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_FLOWLABEL_MASK,
+					    ah_attr->grh.flow_label);
+	av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_HOPLIMIT_MASK,
+					    ah_attr->grh.hop_limit);
+	av->av.grh.word_0 |= EHCA_BMASK_SET(GRH_NEXTHEADER_MASK, 0x1B);
+	/* IB transport */
+	av->av.grh.word_0 = be64_to_cpu(av->av.grh.word_0);
+	/* set sgid in grh.word_1 */
+	if (ah_attr->ah_flags & IB_AH_GRH) {
+		int rc = 0;
+		struct ib_port_attr port_attr;
+		union ib_gid gid;
+		memset(&port_attr, 0, sizeof(port_attr));
+		rc = ehca_query_port(pd->device, ah_attr->port_num,
+				     &port_attr);
+		if (rc != 0) { /* invalid port number */
+			retcode = -EINVAL;
+			EDEB_ERR(4, "Invalid port number "
+				 "ehca_query_port() returned %x "
+				 "pd=%p ah_attr=%p", rc, pd, ah_attr);
+			goto create_ah_exit1;
+		}
+		memset(&gid, 0, sizeof(gid));
+		rc = ehca_query_gid(pd->device,
+				    ah_attr->port_num,
+				    ah_attr->grh.sgid_index, &gid);
+		if (rc != 0) {
+			retcode = -EINVAL;
+			EDEB_ERR(4, "Failed to retrieve sgid "
+				 "ehca_query_gid() returned %x "
+				 "pd=%p ah_attr=%p", rc, pd, ah_attr);
+			goto create_ah_exit1;
+		}
+		memcpy(&av->av.grh.word_1, &gid, sizeof(gid));
+	}
+	/* for the time being we use a hard coded PMTU of 2048 Bytes */
+	av->av.pmtu = 4; /* TODO */
+
+	/* dgid comes in grh.word_3 */
+	memcpy(&av->av.grh.word_3, &ah_attr->grh.dgid,
+	       sizeof(ah_attr->grh.dgid));
+
+	EHCA_REGISTER_AV(device, pd);
+
+	EDEB_EX(7,"pd=%p ah_attr=%p av=%p", pd, ah_attr, av);
+	return (&av->ib_ah);
+
+ create_ah_exit1:
+	ehca_av_delete(av);
+
+ create_ah_exit0:
+	EDEB_EX(7,"retcode=%x pd=%p ah_attr=%p", retcode, pd, ah_attr);
+	return ERR_PTR(retcode);
+}
+
+int ehca_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr)
+{
+	struct ehca_av *av = NULL;
+	struct ehca_ud_av new_ehca_av;
+	int ret = 0;
+
+	EHCA_CHECK_AV(ah);
+	EHCA_CHECK_ADR(ah_attr);
+
+	EDEB_EN(7,"ah=%p ah_attr=%p", ah, ah_attr);
+
+	memset(&new_ehca_av, 0, sizeof(new_ehca_av));
+	new_ehca_av.sl = ah_attr->sl;
+	new_ehca_av.dlid = ntohs(ah_attr->dlid);
+	new_ehca_av.slid_path_bits = ah_attr->src_path_bits;
+	new_ehca_av.ipd = ah_attr->static_rate;
+	new_ehca_av.lnh = EHCA_BMASK_SET(GRH_FLAG_MASK,
+					 ((ah_attr->ah_flags & IB_AH_GRH) > 0));
+	new_ehca_av.grh.word_0 = EHCA_BMASK_SET(GRH_TCLASS_MASK,
+						ah_attr->grh.traffic_class);
+	new_ehca_av.grh.word_0 |= EHCA_BMASK_SET(GRH_FLOWLABEL_MASK,
+						 ah_attr->grh.flow_label);
+	new_ehca_av.grh.word_0 |= EHCA_BMASK_SET(GRH_HOPLIMIT_MASK,
+						 ah_attr->grh.hop_limit);
+	new_ehca_av.grh.word_0 |= EHCA_BMASK_SET(GRH_NEXTHEADER_MASK, 0x1b);
+	new_ehca_av.grh.word_0 = be64_to_cpu(new_ehca_av.grh.word_0);
+
+	/* set sgid in grh.word_1 */
+	if (ah_attr->ah_flags & IB_AH_GRH) {
+		int rc = 0;
+		struct ib_port_attr port_attr;
+		union ib_gid gid;
+		memset(&port_attr, 0, sizeof(port_attr));
+		rc = ehca_query_port(ah->device, ah_attr->port_num,
+				     &port_attr);
+		if (rc != 0) { /* invalid port number */
+			ret = -EINVAL;
+			EDEB_ERR(4, "Invalid port number "
+				 "ehca_query_port() returned %x "
+				 "ah=%p ah_attr=%p port_num=%x",
+				 rc, ah, ah_attr, ah_attr->port_num);
+			goto modify_ah_exit1;
+		}
+		memset(&gid, 0, sizeof(gid));
+		rc = ehca_query_gid(ah->device,
+				    ah_attr->port_num,
+				    ah_attr->grh.sgid_index, &gid);
+		if (rc != 0) {
+			ret = -EINVAL;
+			EDEB_ERR(4,
+				 "Failed to retrieve sgid "
+				 "ehca_query_gid() returned %x "
+				 "ah=%p ah_attr=%p port_num=%x "
+				 "sgid_index=%x",
+				 rc, ah, ah_attr, ah_attr->port_num,
+				 ah_attr->grh.sgid_index);
+			goto modify_ah_exit1;
+		}
+		memcpy(&new_ehca_av.grh.word_1, &gid, sizeof(gid));
+	}
+
+	new_ehca_av.pmtu = 4; /* TODO: see comment in create_ah() */
+
+	memcpy(&new_ehca_av.grh.word_3, &ah_attr->grh.dgid,
+	       sizeof(ah_attr->grh.dgid));
+
+	av = container_of(ah, struct ehca_av, ib_ah);
+	av->av = new_ehca_av;
+
+ modify_ah_exit1:
+	EDEB_EX(7,"ret=%x ah=%p ah_attr=%p", ret, ah, ah_attr);
+
+	return ret;
+}
+
+int ehca_query_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr)
+{
+	int ret = 0;
+	struct ehca_av *av = NULL;
+
+	EHCA_CHECK_AV(ah);
+	EHCA_CHECK_ADR(ah_attr);
+
+	EDEB_EN(7,"ah=%p ah_attr=%p", ah, ah_attr);
+
+	av = container_of(ah, struct ehca_av, ib_ah);
+	memcpy(&ah_attr->grh.dgid, &av->av.grh.word_3,
+	       sizeof(ah_attr->grh.dgid));
+	ah_attr->sl = av->av.sl;
+
+	ah_attr->dlid = av->av.dlid;
+
+	ah_attr->src_path_bits = av->av.slid_path_bits;
+	ah_attr->static_rate = av->av.ipd;
+	ah_attr->ah_flags = EHCA_BMASK_GET(GRH_FLAG_MASK, av->av.lnh);
+	ah_attr->grh.traffic_class = EHCA_BMASK_GET(GRH_TCLASS_MASK,
+						    av->av.grh.word_0);
+	ah_attr->grh.hop_limit = EHCA_BMASK_GET(GRH_HOPLIMIT_MASK,
+						av->av.grh.word_0);
+	ah_attr->grh.flow_label = EHCA_BMASK_GET(GRH_FLOWLABEL_MASK,
+						 av->av.grh.word_0);
+
+	EDEB_EX(7,"ah=%p ah_attr=%p ret=%x", ah, ah_attr, ret);
+	return ret;
+}
+
+int ehca_destroy_ah(struct ib_ah *ah)
+{
+	int ret = 0;
+
+	EHCA_CHECK_AV(ah);
+	EHCA_DEREGISTER_AV(ah);
+
+	EDEB_EN(7,"ah=%p", ah);
+
+	ehca_av_delete(container_of(ah, struct ehca_av, ib_ah));
+
+	EDEB_EX(7,"ret=%x ah=%p", ret, ah);
+	return ret;
+}
diff --git a/drivers/infiniband/hw/ehca/ehca_mcast.c b/drivers/infiniband/hw/ehca/ehca_mcast.c
new file mode 100644
index 0000000..b49bcf6
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_mcast.c
@@ -0,0 +1,194 @@
+
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  mcast  functions
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *           Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_mcast.c,v 1.20 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#define DEB_PREFIX "mcas"
+
+#include "ehca_kernel.h"
+#include "ehca_classes.h"
+#include "ehca_tools.h"
+#include "hcp_if.h"
+#include "ehca_qes.h"
+#include <linux/module.h>
+#include <linux/err.h>
+#include "ehca_iverbs.h"
+
+#define MAX_MC_LID 0xFFFE
+#define MIN_MC_LID 0xC000	/* Multicast limits */
+#define EHCA_VALID_MULTICAST_GID(gid)  ((gid)[0] == 0xFF)
+#define EHCA_VALID_MULTICAST_LID(lid)  (((lid) >= MIN_MC_LID) && ((lid) <= MAX_MC_LID))
+
+int ehca_attach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
+{
+	struct ehca_qp *my_qp = NULL;
+	struct ehca_shca *shca = NULL;
+	union ib_gid my_gid;
+	u64 hipz_rc = H_Success;
+	int retcode = 0;
+
+	EHCA_CHECK_ADR(ibqp);
+	EHCA_CHECK_ADR(gid);
+
+	my_qp = container_of(ibqp, struct ehca_qp, ib_qp);
+
+	EHCA_CHECK_QP(my_qp);
+	if (ibqp->qp_type != IB_QPT_UD) {
+		EDEB_ERR(4, "invalid qp_type %x gid, retcode=%x",
+			 ibqp->qp_type, EINVAL);
+		return (-EINVAL);
+	}
+
+	shca = container_of(ibqp->pd->device, struct ehca_shca, ib_device);
+	EHCA_CHECK_ADR(shca);
+
+	if (!(EHCA_VALID_MULTICAST_GID(gid->raw))) {
+		EDEB_ERR(4, "gid is not valid multicast gid retcode=%x",
+			 EINVAL);
+		return (-EINVAL);
+	} else if ((lid < MIN_MC_LID) || (lid > MAX_MC_LID)) {
+		EDEB_ERR(4, "lid=%x is not valid multicast lid retcode=%x",
+			 lid, EINVAL);
+		return (-EINVAL);
+	}
+
+	memcpy(&my_gid.raw, gid->raw, sizeof(union ib_gid));
+
+	hipz_rc = hipz_h_attach_mcqp(shca->ipz_hca_handle,
+				     my_qp->ipz_qp_handle,
+				     my_qp->ehca_qp_core.galpas.kernel,
+				     lid, my_gid);
+	if (H_Success != hipz_rc) {
+		EDEB_ERR(4,
+			 "ehca_qp=%p qp_num=%x hipz_h_attach_mcqp() failed "
+			 "hipz_rc=%lx", my_qp, ibqp->qp_num, hipz_rc);
+	}
+	retcode = ehca2ib_return_code(hipz_rc);
+
+	EDEB_EX(7, "mcast attach retcode=%x\n"
+		   "ehca_qp=%p qp_num=%x  lid=%x\n"
+		   "my_gid=  %x %x %x %x\n"
+		   "         %x %x %x %x\n"
+		   "         %x %x %x %x\n"
+		   "         %x %x %x %x\n",
+		   retcode, my_qp, ibqp->qp_num, lid,
+		   my_gid.raw[0], my_gid.raw[1],
+		   my_gid.raw[2], my_gid.raw[3],
+		   my_gid.raw[4], my_gid.raw[5],
+		   my_gid.raw[6], my_gid.raw[7],
+		   my_gid.raw[8], my_gid.raw[9],
+		   my_gid.raw[10], my_gid.raw[11],
+		   my_gid.raw[12], my_gid.raw[13],
+		   my_gid.raw[14], my_gid.raw[15]);
+
+	return retcode;
+}
+
+int ehca_detach_mcast(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
+{
+	struct ehca_qp *my_qp = NULL;
+	struct ehca_shca *shca = NULL;
+	union ib_gid my_gid;
+	u64 hipz_rc = H_Success;
+	int retcode = 0;
+
+	EHCA_CHECK_ADR(ibqp);
+	EHCA_CHECK_ADR(gid);
+
+	my_qp = container_of(ibqp, struct ehca_qp, ib_qp);
+
+	EHCA_CHECK_QP(my_qp);
+	if (ibqp->qp_type != IB_QPT_UD) {
+		EDEB_ERR(4, "invalid qp_type %x gid, retcode=%x",
+			 ibqp->qp_type, EINVAL);
+		return (-EINVAL);
+	}
+
+	shca = container_of(ibqp->pd->device, struct ehca_shca, ib_device);
+	EHCA_CHECK_ADR(shca);
+
+	if (!(EHCA_VALID_MULTICAST_GID(gid->raw))) {
+		EDEB_ERR(4, "gid is not valid multicast gid retcode=%x",
+			 EINVAL);
+		return (-EINVAL);
+	} else if ((lid < MIN_MC_LID) || (lid > MAX_MC_LID)) {
+		EDEB_ERR(4, "lid=%x is not valid multicast lid retcode=%x",
+			 lid, EINVAL);
+		return (-EINVAL);
+	}
+
+	EDEB_EN(7, "dgid=%p qp_num=%x lid=%x",
+		gid, ibqp->qp_num, lid);
+
+	memcpy(&my_gid.raw, gid->raw, sizeof(union ib_gid));
+
+	hipz_rc = hipz_h_detach_mcqp(shca->ipz_hca_handle,
+				     my_qp->ipz_qp_handle,
+				     my_qp->ehca_qp_core.galpas.kernel,
+				     lid, my_gid);
+	if (H_Success != hipz_rc) {
+		EDEB_ERR(4,
+			 "ehca_qp=%p qp_num=%x hipz_h_detach_mcqp() failed "
+			 "hipz_rc=%lx", my_qp, ibqp->qp_num, hipz_rc);
+	}
+	retcode = ehca2ib_return_code(hipz_rc);
+
+	EDEB_EX(7, "mcast detach retcode=%x\n"
+		"ehca_qp=%p qp_num=%x  lid=%x\n"
+		"my_gid=  %x %x %x %x\n"
+		"         %x %x %x %x\n"
+		"         %x %x %x %x\n"
+		"         %x %x %x %x\n",
+		retcode, my_qp, ibqp->qp_num, lid,
+		my_gid.raw[0], my_gid.raw[1],
+		my_gid.raw[2], my_gid.raw[3],
+		my_gid.raw[4], my_gid.raw[5],
+		my_gid.raw[6], my_gid.raw[7],
+		my_gid.raw[8], my_gid.raw[9],
+		my_gid.raw[10], my_gid.raw[11],
+		my_gid.raw[12], my_gid.raw[13],
+		my_gid.raw[14], my_gid.raw[15]);
+
+	return retcode;
+}
diff --git a/drivers/infiniband/hw/ehca/ehca_pd.c b/drivers/infiniband/hw/ehca/ehca_pd.c
new file mode 100644
index 0000000..e110320
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_pd.c
@@ -0,0 +1,100 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  PD functions
+ *
+ *  Authors: Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_pd.c,v 1.25 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+
+#define DEB_PREFIX "vpd "
+
+#include "ehca_kernel.h"
+#include "ehca_tools.h"
+#include "ehca_iverbs.h"
+
+struct ib_pd *ehca_alloc_pd(struct ib_device *device,
+			    struct ib_ucontext *context, struct ib_udata *udata)
+{
+	struct ib_pd *mypd = NULL;
+	struct ehca_pd *pd = NULL;
+
+	EDEB_EN(7, "device=%p context=%p udata=%p", device, context, udata);
+
+	EHCA_CHECK_DEVICE_P(device);
+
+	pd = ehca_pd_new();
+	if (!pd) {
+		EDEB_ERR(4, "ERROR device=%p context=%p pd=%p "
+			 "out of memory", device, context, mypd);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	/* kernel pd if context == NULL,
+	 * user pd otherwise */
+	if (context == NULL) {
+		/* after init, kernel pds always reuse
+		 * the one created in ehca_shca_reopen()
+		 */
+		struct ehca_shca *shca = container_of(device, struct ehca_shca,
+						      ib_device);
+		pd->fw_pd.value = shca->pd->fw_pd.value;
+	} else {
+		pd->fw_pd.value = (u64)pd;
+	}
+
+	mypd = &pd->ib_pd;
+
+	EHCA_REGISTER_PD(device, pd);
+
+	EDEB_EX(7, "device=%p context=%p pd=%p", device, context, mypd);
+
+	return (mypd);
+}
+
+int ehca_dealloc_pd(struct ib_pd *pd)
+{
+	int ret = 0;
+	EDEB_EN(7, "pd=%p", pd);
+
+	EHCA_CHECK_PD(pd);
+	EHCA_DEREGISTER_PD(pd);
+	ehca_pd_delete(container_of(pd, struct ehca_pd, ib_pd));
+
+	EDEB_EX(7, "pd=%p", pd);
+	return ret;
+}

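The argument checks at the top of ehca_attach_mcast() and
ehca_detach_mcast() above reduce to two range tests. The sketch below
restates them as standalone functions, assuming the usual InfiniBand
conventions (multicast GIDs start with the byte 0xFF, multicast LIDs lie
in 0xC000..0xFFFE). valid_multicast_gid(), valid_multicast_lid() and
check_mcast_args() are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_MC_LID 0xC000u	/* first multicast LID */
#define MAX_MC_LID 0xFFFEu	/* last multicast LID; 0xFFFF is excluded */

/* A multicast GID is identified by its first byte being 0xFF. */
static bool valid_multicast_gid(const uint8_t raw[16])
{
	return raw[0] == 0xFF;
}

/* Both bounds are checked, lower against MIN and upper against MAX. */
static bool valid_multicast_lid(uint16_t lid)
{
	return lid >= MIN_MC_LID && lid <= MAX_MC_LID;
}

/* Returns 0 if both arguments are acceptable, -EINVAL otherwise. */
static int check_mcast_args(const uint8_t raw_gid[16], uint16_t lid)
{
	assert(raw_gid != NULL);

	if (!valid_multicast_gid(raw_gid))
		return -EINVAL;
	if (!valid_multicast_lid(lid))
		return -EINVAL;
	return 0;
}
```

Sharing one check between attach and detach would keep the two paths from
drifting apart, since both currently repeat the same open-coded tests.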

From rolandd at cisco.com  Sat Feb 18 11:57:50 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:50 -0800
Subject: [PATCH 17/22] Special QP functions
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005750.13620.62709.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

The wait for the port to become active when creating QP 1 seems
bizarre.  Why can't we just create QP 1 before the port is active?

What is the issue with creating QP 0?  Without QP 0, it's impossible
to run a subnet manager on top of ehca.
---

 drivers/infiniband/hw/ehca/ehca_sqp.c |  135 +++++++++++++++++++++++++++++++++
 1 files changed, 135 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c
new file mode 100644
index 0000000..bbad4cb
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_sqp.c
@@ -0,0 +1,135 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  SQP functions
+ *
+ *  Authors: Khadija Souissi <souissi at de.ibm.com>
+ *           Heiko J Schick <schickhj at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_sqp.c,v 1.35 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+
+#define DEB_PREFIX "e_qp"
+
+#include "ehca_kernel.h"
+#include "ehca_classes.h"
+#include "ehca_tools.h"
+#include "hcp_if.h"
+#include "ehca_qes.h"
+#include "ehca_iverbs.h"
+
+#include <linux/module.h>
+#include <linux/err.h>
+
+extern int ehca_create_aqp1(struct ehca_shca *shca, struct ehca_sport *sport);
+extern int ehca_destroy_aqp1(struct ehca_sport *sport);
+
+extern int ehca_port_act_time;
+
+/**
+ * ehca_define_sqp - TODO
+ *
+ * @ehca_qp:     : TODO adapter_handle, ipz_qp_handle, galpas.kernel
+ * @qp_init_attr : TODO for port number
+ */
+u64 ehca_define_sqp(struct ehca_shca *shca,
+			       struct ehca_qp *ehca_qp,
+			       struct ib_qp_init_attr *qp_init_attr)
+{
+
+	u32 pma_qp_nr = 0;
+	u32 bma_qp_nr = 0;
+	u64 ret = H_Success;
+	u8 port = qp_init_attr->port_num;
+	int counter = 0;
+
+	EDEB_EN(7, "port=%x qp_type=%x",
+		port, qp_init_attr->qp_type);
+
+	shca->sport[port - 1].port_state = IB_PORT_DOWN;
+
+	switch (qp_init_attr->qp_type) {
+	case IB_QPT_SMI:
+		/* TODO: function not supported yet */
+		/*
+		   ret = hipz_h_define_aqp0(shca->ipz_hca_handle,
+		   ehca_qp->ipz_qp_handle,
+		   ehca_qp->galpas.kernel,
+		   (u32)qp_init_attr->port_num);
+		 */
+		break;
+	case IB_QPT_GSI:
+		ret = hipz_h_define_aqp1(shca->ipz_hca_handle,
+					 ehca_qp->ipz_qp_handle,
+					 ehca_qp->ehca_qp_core.galpas.kernel,
+					 (u32) qp_init_attr->port_num,
+					 &pma_qp_nr, &bma_qp_nr);
+
+		if (ret != H_Success) {
+			EDEB_ERR(4, "Can't define AQP1 for port %x. rc=%lx",
+				    port, ret);
+			goto ehca_define_aqp1;
+		}
+		break;
+	default:
+		ret = H_Parameter;
+		goto ehca_define_aqp1;
+	}
+
+#ifndef EHCA_USERDRIVER
+	while ((shca->sport[port - 1].port_state != IB_PORT_ACTIVE) &&
+	       (counter < ehca_port_act_time)) {
+		EDEB(6, "... wait until port %x is active",
+			port);
+		msleep_interruptible(1000);
+		counter++;
+	}
+
+	if (shca->sport[port - 1].port_state != IB_PORT_ACTIVE) {
+		EDEB_ERR(4, "Port %x is not active.", port);
+		ret = H_Hardware;
+	}
+#else
+	if (shca->sport[port - 1].port_state != IB_PORT_ACTIVE) {
+		sleep(20);
+	}
+#endif
+
+      ehca_define_aqp1:
+	EDEB_EX(7, "ret=%lx", ret);
+
+	return ret;
+}


From rolandd at cisco.com  Sat Feb 18 11:57:37 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:37 -0800
Subject: [PATCH 11/22] ehca event queues
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005730.13620.53494.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

In ehca_poll_eqs(), is there any reason not to use list_for_each_entry()?

Since ehca_poll_eqs() defers all the work to a workqueue, is
there any reason for it to run in a kernel thread?  Why not just
make it a recurring timer?
---
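To illustrate the list_for_each_entry() question, here is a minimal
userspace sketch.  The list macros are cut-down stand-ins for the
kernel's <linux/list.h> helpers so that it compiles outside the kernel
(they rely on the GCC/Clang __typeof__ extension), and struct demo_shca
is a hypothetical reduction of struct ehca_shca.  Both loops visit the
same entries; the second simply needs no separate struct list_head
cursor and no explicit list_entry() call.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel list helpers (assumption: GCC/Clang). */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_for_each(pos, head) \
	for (pos = (head)->next; pos != (head); pos = pos->next)
#define list_for_each_entry(pos, head, member) \
	for (pos = list_entry((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void demo_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void demo_list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Hypothetical, cut-down stand-in for struct ehca_shca. */
struct demo_shca {
	int eq_is_initialized;
	struct list_head shca_list;
};

/* Open-coded form: list_for_each() plus list_entry(). */
static int demo_count_open_coded(struct list_head *head)
{
	struct list_head *entry;
	struct demo_shca *shca;
	int n = 0;

	list_for_each(entry, head) {
		shca = list_entry(entry, struct demo_shca, shca_list);
		if (shca->eq_is_initialized)
			n++;
	}
	return n;
}

/* Same walk with list_for_each_entry(). */
static int demo_count_entry_iter(struct list_head *head)
{
	struct demo_shca *shca;
	int n = 0;

	list_for_each_entry(shca, head, shca_list)
		if (shca->eq_is_initialized)
			n++;
	return n;
}

/* Builds a three-entry list (two with an initialized EQ) and counts them. */
int demo_count_ready(int use_entry_iter)
{
	struct list_head head;
	struct demo_shca a = { .eq_is_initialized = 1 };
	struct demo_shca b = { .eq_is_initialized = 0 };
	struct demo_shca c = { .eq_is_initialized = 1 };

	demo_list_init(&head);
	demo_list_add_tail(&a.shca_list, &head);
	demo_list_add_tail(&b.shca_list, &head);
	demo_list_add_tail(&c.shca_list, &head);

	return use_entry_iter ? demo_count_entry_iter(&head)
			      : demo_count_open_coded(&head);
}
```

demo_count_ready(0) and demo_count_ready(1) walk the same list with the
two styles and should agree.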

 drivers/infiniband/hw/ehca/ehca_eq.c |  242 ++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_eq.h |   78 +++++++++++
 2 files changed, 320 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_eq.c b/drivers/infiniband/hw/ehca/ehca_eq.c
new file mode 100644
index 0000000..e508edb
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_eq.c
@@ -0,0 +1,242 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Event queue handling
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Heiko J Schick <schickhj at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_eq.c,v 1.40 2006/02/06 16:20:38 schickhj Exp $
+ */
+
+#define DEB_PREFIX "e_eq"
+
+#include "ehca_eq.h"
+#include "ehca_kernel.h"
+#include "ehca_classes.h"
+#include "hcp_if.h"
+#include "ehca_iverbs.h"
+#include "ipz_pt_fn.h"
+#include "ehca_qes.h"
+#include "ehca_irq.h"
+
+/* TODO: should be defined in ehca_classes_pSeries.h */
+#define HIPZ_EQ_REGISTER_ORIG 0
+
+int ehca_create_eq(struct ehca_shca *shca,
+		   struct ehca_eq *eq,
+		   const enum ehca_eq_type type, const u32 length)
+{
+	extern struct workqueue_struct *ehca_wq;
+	u64 ret = H_Success;
+	u32 nr_pages = 0;
+	u32 i;
+	void *vpage = NULL;
+
+	EDEB_EN(7, "shca=%p eq=%p length=%x", shca, eq, length);
+	EHCA_CHECK_ADR(shca);
+	EHCA_CHECK_ADR(eq);
+
+	spin_lock_init(&eq->spinlock);
+	eq->is_initialized = 0;
+
+	if (type!=EHCA_EQ && type!=EHCA_NEQ) {
+		EDEB_ERR(4, "Invalid EQ type %x. eq=%p", type, eq);
+		return -EINVAL;
+	}
+	if (length==0) {
+		EDEB_ERR(4, "EQ length must not be zero. eq=%p", eq);
+		return -EINVAL;
+	}
+
+     	ret = hipz_h_alloc_resource_eq(shca->ipz_hca_handle,
+				       &eq->pf,
+				       type,
+				       length,
+				       &eq->ipz_eq_handle,
+				       &eq->length,
+				       &nr_pages, &eq->irq_info.ist);
+
+	if (ret != H_Success) {
+		EDEB_ERR(4, "Can't allocate EQ / NEQ. eq=%p", eq);
+		return -EINVAL;
+	}
+
+	ret = ipz_queue_ctor(&eq->ipz_queue, nr_pages,
+			     EHCA_PAGESIZE, sizeof(struct ehca_eqe), 0);
+	if (!ret) {
+		EDEB_ERR(4, "Can't allocate EQ pages. eq=%p", eq);
+		goto create_eq_exit1;
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		u64 rpage;
+
+		if (!(vpage = ipz_QPageit_get_inc(&eq->ipz_queue))) {
+			ret = H_Resource;
+			goto create_eq_exit2;
+		}
+
+		rpage = ehca_kv_to_g(vpage);
+		ret = hipz_h_register_rpage_eq(shca->ipz_hca_handle,
+					       eq->ipz_eq_handle,
+					       &eq->pf,
+					       0,
+					       HIPZ_EQ_REGISTER_ORIG, rpage, 1);
+
+		if (i == (nr_pages - 1)) {
+			/* last page */
+			vpage = ipz_QPageit_get_inc(&eq->ipz_queue);
+			if ((ret != H_Success) || (vpage != 0)) {
+				goto create_eq_exit2;
+			}
+		} else {
+			if ((ret != H_PAGE_REGISTERED) || (vpage == 0)) {
+				goto create_eq_exit2;
+			}
+		}
+	}
+
+	ipz_QEit_reset(&eq->ipz_queue);
+
+#ifndef EHCA_USERDRIVER
+	{
+	        pid_t pid = 0;
+		(eq->irq_info).pid = pid;
+		(eq->irq_info).eq = eq;
+		(eq->irq_info).wq = ehca_wq;
+		(eq->irq_info).work = &(eq->work);
+	}
+#endif
+
+	/* register interrupt handlers and initialize work queues */
+	if (type == EHCA_EQ) {
+		INIT_WORK(&(eq->work),
+			  ehca_interrupt_eq, (void *)&(eq->irq_info));
+		eq->is_initialized = 1;
+		hipz_request_interrupt(&(eq->irq_info), ehca_interrupt);
+	} else if (type == EHCA_NEQ) {
+		INIT_WORK(&(eq->work),
+			  ehca_interrupt_neq, (void *)&(eq->irq_info));
+		hipz_request_interrupt(&(eq->irq_info), ehca_interrupt);
+	}
+
+	EDEB_EX(7, "ret=%lx", ret);
+
+	return 0;
+
+      create_eq_exit2:
+	ipz_queue_dtor(&eq->ipz_queue);
+
+      create_eq_exit1:
+	hipz_h_destroy_eq(shca->ipz_hca_handle, eq);
+
+	EDEB_EX(7, "ret=%lx", ret);
+
+	return -EINVAL;
+}
+
+void *ehca_poll_eq(struct ehca_shca *shca, struct ehca_eq *eq)
+{
+	unsigned long flags = 0;
+	void *eqe = NULL;
+
+	EDEB_EN(7, "shca=%p  eq=%p", shca, eq);
+	EHCA_CHECK_ADR_P(shca);
+	EHCA_CHECK_EQ_P(eq);
+
+	spin_lock_irqsave(&eq->spinlock, flags);
+	eqe = ipz_QEit_EQ_get_inc_valid(&eq->ipz_queue);
+	spin_unlock_irqrestore(&eq->spinlock, flags);
+
+	EDEB_EX(7, "eq=%p eqe=%p", eq, eqe);
+
+	return eqe;
+}
+
+int ehca_poll_eqs(void *data)
+{
+	extern struct workqueue_struct *ehca_wq;
+	struct ehca_shca *shca;
+	struct ehca_module* module = data;
+	struct list_head *entry;
+
+	do {
+		spin_lock(&module->shca_lock);
+		list_for_each(entry, &module->shca_list) {
+			shca = list_entry(entry, struct ehca_shca, shca_list);
+
+			if (shca->eq.is_initialized && !kthread_should_stop())
+				queue_work(ehca_wq, &shca->eq.work);
+		}
+		spin_unlock(&module->shca_lock);
+
+		msleep_interruptible(1000);
+	}
+	while(!kthread_should_stop());
+
+	return 0;
+}
+
+int ehca_destroy_eq(struct ehca_shca *shca, struct ehca_eq *eq)
+{
+	unsigned long flags = 0;
+	u64 retcode = H_Success;
+
+	EDEB_EN(7, "shca=%p  eq=%p", shca, eq);
+	EHCA_CHECK_ADR(shca);
+	EHCA_CHECK_EQ(eq);
+
+	spin_lock_irqsave(&eq->spinlock, flags);
+	hipz_free_interrupt(&(eq->irq_info));
+
+	retcode = hipz_h_destroy_eq(shca->ipz_hca_handle, eq);
+
+	spin_unlock_irqrestore(&eq->spinlock, flags);
+
+	if (retcode != H_Success) {
+		EDEB_ERR(4, "Can't free EQ resources.");
+		return -EINVAL;
+	}
+	ipz_queue_dtor(&eq->ipz_queue);
+
+	EDEB_EX(7, "retcode=%lx", retcode);
+
+	return 0;
+}
+
diff --git a/drivers/infiniband/hw/ehca/ehca_eq.h b/drivers/infiniband/hw/ehca/ehca_eq.h
new file mode 100644
index 0000000..d09f21b
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_eq.h
@@ -0,0 +1,78 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Completion queue, event queue handling helper functions
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Heiko J Schick <schickhj at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_eq.h,v 1.10 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef EHCA_EQ_H
+#define EHCA_EQ_H
+
+#include "ehca_classes.h"
+#include "ehca_common.h"
+
+enum ehca_eq_type {
+	EHCA_EQ = 0, /* event queue              */
+	EHCA_NEQ     /* notification event queue */
+};
+
+/** @brief hcad internal create EQ
+ */
+int ehca_create_eq(struct ehca_shca *shca,
+		   struct ehca_eq *eq, /* struct contains eq to create */
+		   enum ehca_eq_type type,
+		   const u32 length);
+
+/** @brief destroy the eq
+ */
+int ehca_destroy_eq(struct ehca_shca *shca, struct ehca_eq *eq);
+
+/** @brief hcad internal poll EQ
+  - check if new EQE available,
+  - if yes, increment EQE pointer
+  - otherwise return 0
+  @returns pointer to EQE if a new valid EQE is available
+ */
+void *ehca_poll_eq(struct ehca_shca *shca, struct ehca_eq *eq);
+
+#endif /* EHCA_EQ_H */
+


From rolandd at cisco.com  Sat Feb 18 11:57:48 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:48 -0800
Subject: [PATCH 16/22] ehca post send/receive and poll CQ
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005748.13620.45620.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

There are an awful lot of magic numbers scattered around.  Probably
they should become enums somewhere.

The compatibility defines for using the kernel file in userspace
shouldn't go into the kernel.
---
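As a rough sketch of the enum suggestion: the values below are the
literals that appear in ehca_reqs.c (the ib_wc_opcode[] table indices
and the two masks tested against cqe->status), but every DEMO_* name is
hypothetical, and treating the status word as 32 bits wide is an
assumption.  The error mask stays a macro because 0x80000000 does not
fit in an int, which is the type of an enum constant in ISO C before C23.

```c
#include <stdint.h>

/* CQE optype values, taken from the ib_wc_opcode[] table indices. */
enum demo_cqe_optype {
	DEMO_CQE_OPTYPE_RECV               = 0x01,
	DEMO_CQE_OPTYPE_RECV_RDMA_WITH_IMM = 0x02,
	DEMO_CQE_OPTYPE_BIND_MW            = 0x04,
	DEMO_CQE_OPTYPE_FETCH_ADD          = 0x08,
	DEMO_CQE_OPTYPE_COMP_SWAP          = 0x10,
	DEMO_CQE_OPTYPE_RDMA_WRITE         = 0x20,
	DEMO_CQE_OPTYPE_RDMA_READ          = 0x40,
	DEMO_CQE_OPTYPE_SEND               = 0x80
};

/* CQE status bits; the purge bit is small enough to be an enum constant. */
enum demo_cqe_status_bits {
	DEMO_CQE_STATUS_PURGE = 0x10
};

/* Too large for an int-typed enum constant, so kept as an unsigned macro. */
#define DEMO_CQE_STATUS_ERROR 0x80000000U

int demo_cqe_is_purged(uint32_t status)
{
	return (status & DEMO_CQE_STATUS_PURGE) != 0;
}

int demo_cqe_has_error(uint32_t status)
{
	return (status & DEMO_CQE_STATUS_ERROR) != 0;
}

/* Returns 1 if optype is one of the values the table above knows about. */
int demo_optype_is_known(unsigned int optype)
{
	switch (optype) {
	case DEMO_CQE_OPTYPE_RECV:
	case DEMO_CQE_OPTYPE_RECV_RDMA_WITH_IMM:
	case DEMO_CQE_OPTYPE_BIND_MW:
	case DEMO_CQE_OPTYPE_FETCH_ADD:
	case DEMO_CQE_OPTYPE_COMP_SWAP:
	case DEMO_CQE_OPTYPE_RDMA_WRITE:
	case DEMO_CQE_OPTYPE_RDMA_READ:
	case DEMO_CQE_OPTYPE_SEND:
		return 1;
	default:
		return 0;
	}
}
```

With names like these, a test such as "cqe->status & 0x10" would read
as a purge-bit check without needing the trailing comment.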

 drivers/infiniband/hw/ehca/ehca_reqs.c      |  401 ++++++++++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_reqs_core.c |  420 +++++++++++++++++++++++++++
 2 files changed, 821 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
new file mode 100644
index 0000000..659e6ba
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -0,0 +1,401 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  post_send/recv, poll_cq, req_notify
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_reqs.c,v 1.41 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+
+#define DEB_PREFIX "reqs"
+
+#include "ehca_kernel.h"
+#include "ehca_classes.h"
+#include "ehca_tools.h"
+#include "hcp_if.h"
+#include "ehca_qes.h"
+#include "ehca_iverbs.h"
+
+/* include some inline service routines */
+#include "ehca_asm.h"
+#include "ehca_reqs_core.c"
+
+int ehca_post_send(struct ib_qp *qp,
+		   struct ib_send_wr *send_wr,
+		   struct ib_send_wr **bad_send_wr)
+{
+	struct ehca_qp *my_qp = NULL;
+	struct ib_send_wr *cur_send_wr = NULL;
+	struct ehca_wqe *wqe_p = NULL;
+	int wqe_cnt = 0;
+	int retcode = 0;
+	unsigned long spl_flags = 0;
+
+	EHCA_CHECK_ADR(qp);
+	my_qp = container_of(qp, struct ehca_qp, ib_qp);
+	EHCA_CHECK_QP(my_qp);
+	EHCA_CHECK_ADR(send_wr);
+	EDEB_EN(7, "ehca_qp=%p qp_num=%x send_wr=%p bad_send_wr=%p",
+		my_qp, qp->qp_num, send_wr, bad_send_wr);
+
+	/* LOCK the QUEUE */
+	spin_lock_irqsave(&my_qp->spinlock_s, spl_flags);
+
+	/* loop processes list of send reqs */
+	for (cur_send_wr = send_wr; cur_send_wr != NULL;
+	     cur_send_wr = cur_send_wr->next) {
+		void *start_addr =
+			my_qp->ehca_qp_core.ipz_squeue.current_q_addr;
+		/* get pointer to next free WQE */
+		wqe_p = ipz_QEit_get_inc(&my_qp->ehca_qp_core.ipz_squeue);
+		if (unlikely(wqe_p == NULL)) {
+			/* too many posted work requests: queue overflow */
+			if (bad_send_wr != NULL) {
+				*bad_send_wr = cur_send_wr;
+			}
+			if (wqe_cnt==0) {
+				retcode = -ENOMEM;
+				EDEB_ERR(4, "Too many posted WQEs qp_num=%x",
+					 qp->qp_num);
+			}
+			goto post_send_exit0;
+		}
+		/* write a SEND WQE into the QUEUE */
+		retcode = ehca_write_swqe(&my_qp->ehca_qp_core,
+					  wqe_p, cur_send_wr);
+		/* if something failed,
+		   reset the free entry pointer to the start value
+		*/
+		if (unlikely(retcode != 0)) {
+			my_qp->ehca_qp_core.ipz_squeue.current_q_addr =
+				start_addr;
+			*bad_send_wr = cur_send_wr;
+			if (wqe_cnt==0) {
+				retcode = -EINVAL;
+				EDEB_ERR(4, "Could not write WQE qp_num=%x",
+					 qp->qp_num);
+			}
+			goto post_send_exit0;
+		}
+		wqe_cnt++;
+		EDEB(7, "ehca_qp=%p qp_num=%x wqe_cnt=%d",
+		     my_qp, qp->qp_num, wqe_cnt);
+	} /* eof for cur_send_wr */
+
+ post_send_exit0:
+	/* UNLOCK the QUEUE */
+	spin_unlock_irqrestore(&my_qp->spinlock_s, spl_flags);
+	iosync(); /* serialize GAL register access */
+	hipz_update_SQA(&my_qp->ehca_qp_core, wqe_cnt);
+	EDEB_EX(7, "ehca_qp=%p qp_num=%x ret=%x wqe_cnt=%d",
+		my_qp, qp->qp_num, retcode, wqe_cnt);
+	return retcode;
+}
+
+int ehca_post_recv(struct ib_qp *qp,
+		   struct ib_recv_wr *recv_wr,
+		   struct ib_recv_wr **bad_recv_wr)
+{
+	struct ehca_qp *my_qp = NULL;
+	struct ib_recv_wr *cur_recv_wr = NULL;
+	struct ehca_wqe *wqe_p = NULL;
+	int wqe_cnt = 0;
+	int retcode = 0;
+	unsigned long spl_flags = 0;
+
+	EHCA_CHECK_ADR(qp);
+	my_qp = container_of(qp, struct ehca_qp, ib_qp);
+	EHCA_CHECK_QP(my_qp);
+	EHCA_CHECK_ADR(recv_wr);
+	EDEB_EN(7, "ehca_qp=%p qp_num=%x recv_wr=%p bad_recv_wr=%p",
+		my_qp, qp->qp_num, recv_wr, bad_recv_wr);
+
+	/* LOCK the QUEUE */
+	spin_lock_irqsave(&my_qp->spinlock_r, spl_flags);
+
+	/* loop processes list of recv reqs */
+	for (cur_recv_wr = recv_wr; cur_recv_wr != NULL;
+	     cur_recv_wr = cur_recv_wr->next) {
+		void *start_addr =
+			my_qp->ehca_qp_core.ipz_rqueue.current_q_addr;
+		/* get pointer to next free WQE */
+		wqe_p = ipz_QEit_get_inc(&my_qp->ehca_qp_core.ipz_rqueue);
+		if (unlikely(wqe_p == NULL)) {
+			/* too many posted work requests: queue overflow */
+			if (bad_recv_wr != NULL) {
+				*bad_recv_wr = cur_recv_wr;
+			}
+			if (wqe_cnt==0) {
+				retcode = -ENOMEM;
+				EDEB_ERR(4, "Too many posted WQEs qp_num=%x",
+					 qp->qp_num);
+			}
+			goto post_recv_exit0;
+		}
+		/* write a RECV WQE into the QUEUE */
+		retcode =
+			ehca_write_rwqe(&my_qp->ehca_qp_core, wqe_p, cur_recv_wr);
+		/* if something failed,
+		   reset the free entry pointer to the start value
+		*/
+		if (unlikely(retcode != 0)) {
+			my_qp->ehca_qp_core.ipz_rqueue.current_q_addr =
+				start_addr;
+			*bad_recv_wr = cur_recv_wr;
+			if (wqe_cnt==0) {
+				retcode = -EINVAL;
+				EDEB_ERR(4, "Could not write WQE qp_num=%x",
+					 qp->qp_num);
+			}
+			goto post_recv_exit0;
+		}
+		wqe_cnt++;
+		EDEB(7, "ehca_qp=%p qp_num=%x wqe_cnt=%d",
+		     my_qp, qp->qp_num, wqe_cnt);
+	} /* eof for cur_recv_wr */
+
+ post_recv_exit0:
+	spin_unlock_irqrestore(&my_qp->spinlock_r, spl_flags);
+	iosync(); /* serialize GAL register access */
+	hipz_update_RQA(&my_qp->ehca_qp_core, wqe_cnt);
+	EDEB_EX(7, "ehca_qp=%p qp_num=%x ret=%x wqe_cnt=%d",
+		my_qp, qp->qp_num, retcode, wqe_cnt);
+	return retcode;
+}
+
+/**
+ * Table converts ehca wc opcode to ib
+ * Since we use zero to indicate invalid opcode, each entry stores the ib
+ * opcode plus one; the value read from the table must be decremented!!!
+ */
+static const u8 ib_wc_opcode[256] = {
+	[0x01] = IB_WC_RECV+1,
+	[0x02] = IB_WC_RECV_RDMA_WITH_IMM+1,
+	[0x04] = IB_WC_BIND_MW+1,
+	[0x08] = IB_WC_FETCH_ADD+1,
+	[0x10] = IB_WC_COMP_SWAP+1,
+	[0x20] = IB_WC_RDMA_WRITE+1,
+	[0x40] = IB_WC_RDMA_READ+1,
+	[0x80] = IB_WC_SEND+1
+};
+
+/** @brief internal function to poll one entry of cq
+ */
+static inline int ehca_poll_cq_one(struct ib_cq *cq, struct ib_wc *wc)
+{
+	int retcode = 0;
+	struct ehca_cq *my_cq = container_of(cq, struct ehca_cq, ib_cq);
+	struct ehca_cqe *cqe = NULL;
+	int cqe_count = 0;
+
+	EDEB_EN(7, "ehca_cq=%p cq_num=%x wc=%p", my_cq, my_cq->cq_number, wc);
+
+ poll_cq_one_read_cqe:
+	cqe = (struct ehca_cqe *)
+		ipz_QEit_get_inc_valid(&my_cq->ehca_cq_core.ipz_queue);
+	if (cqe == NULL) {
+		retcode = -EAGAIN;
+		EDEB(7, "Completion queue is empty ehca_cq=%p cq_num=%x "
+		     "retcode=%x", my_cq, my_cq->cq_number, retcode);
+		goto  poll_cq_one_exit0;
+	}
+	cqe_count++;
+	if (unlikely(cqe->status & 0x10)) { /* purge bit set */
+		struct ehca_qp *qp=ehca_cq_get_qp(my_cq, cqe->local_qp_number);
+		int purgeflag = 0;
+		unsigned long spl_flags = 0;
+		if (qp==NULL) { /* should not happen */
+			EDEB_ERR(4, "cq_num=%x qp_num=%x "
+				 "could not find qp -> ignore cqe",
+				 my_cq->cq_number, cqe->local_qp_number);
+			EDEB_DMP(4, cqe, 64, "cq_num=%x qp_num=%x",
+				 my_cq->cq_number, cqe->local_qp_number);
+			/* ignore this purged cqe */
+			goto poll_cq_one_read_cqe;
+		}
+		spin_lock_irqsave(&qp->spinlock_s, spl_flags);
+		purgeflag = qp->sqerr_purgeflag;
+		spin_unlock_irqrestore(&qp->spinlock_s, spl_flags);
+		if (purgeflag!=0) {
+			EDEB(6, "Got CQE with purged bit qp_num=%x src_qp=%x",
+			     cqe->local_qp_number, cqe->remote_qp_number);
+			EDEB_DMP(6, cqe, 64, "qp_num=%x src_qp=%x",
+				 cqe->local_qp_number, cqe->remote_qp_number);
+			/* ignore this to avoid double cqes of bad wqe
+			   that caused sqe and turn off purge flag */
+			qp->sqerr_purgeflag = 0;
+			goto poll_cq_one_read_cqe;
+		}
+	}
+
+	/* tracing cqe */
+	if (IS_EDEB_ON(7)) {
+		EDEB(7, "Received COMPLETION ehca_cq=%p cq_num=%x -----",
+		     my_cq, my_cq->cq_number);
+		EDEB_DMP(7, cqe, 64, "ehca_cq=%p cq_num=%x",
+			 my_cq, my_cq->cq_number);
+		EDEB(7, "ehca_cq=%p cq_num=%x -------------------------",
+		     my_cq, my_cq->cq_number);
+	}
+
+	/* we got a completion! */
+	wc->wr_id = cqe->work_request_id;
+
+	/* eval ib_wc_opcode */
+	wc->opcode = ib_wc_opcode[cqe->optype]-1;
+	if (unlikely(wc->opcode == -1)) {
+		EDEB_ERR(4, "Invalid cqe->OPType=%x cqe->status=%x "
+			 "ehca_cq=%p cq_num=%x",
+			 cqe->optype, cqe->status, my_cq, my_cq->cq_number);
+		/* dump cqe for other infos */
+		EDEB_DMP(4, cqe, 64, "ehca_cq=%p cq_num=%x", my_cq, my_cq->cq_number);
+		/* update also queue adder to throw away this entry!!! */
+		goto poll_cq_one_exit0;
+	}
+	/* eval ib_wc_status */
+	if (unlikely(cqe->status & 0x80000000)) { /* complete with errors */
+		map_ib_wc_status(cqe->status, &wc->status);
+		wc->vendor_err = wc->status;
+	} else {
+		wc->status = IB_WC_SUCCESS;
+	}
+
+	wc->qp_num = cqe->local_qp_number;
+	wc->byte_len = ntohl(cqe->nr_bytes_transferred);
+	wc->pkey_index = cqe->pkey_index;
+	wc->slid = cqe->rlid;
+	wc->dlid_path_bits = cqe->dlid;
+	wc->src_qp = cqe->remote_qp_number;
+	wc->wc_flags = cqe->w_completion_flags;
+	wc->imm_data = cqe->immediate_data;
+	wc->sl = cqe->service_level;
+
+	if (wc->status != IB_WC_SUCCESS) {
+		EDEB(6, "ehca_cq=%p cq_num=%x WARNING unsuccessful cqe "
+		     "OPType=%x status=%x qp_num=%x src_qp=%x wr_id=%lx cqe=%p",
+		     my_cq, my_cq->cq_number, cqe->optype, cqe->status,
+		     cqe->local_qp_number, cqe->remote_qp_number,
+		     cqe->work_request_id, cqe);
+	}
+
+ poll_cq_one_exit0:
+	if (cqe_count>0) {
+		hipz_update_FECA(&my_cq->ehca_cq_core, cqe_count);
+	}
+
+	EDEB_EX(7, "retcode=%x ehca_cq=%p cq_number=%x wc=%p "
+		"status=%x opcode=%x qp_num=%x byte_len=%x",
+		retcode, my_cq, my_cq->cq_number, wc, wc->status,
+		wc->opcode, wc->qp_num, wc->byte_len);
+	return (retcode);
+}
+
+int ehca_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc)
+{
+	struct ehca_cq *my_cq = NULL;
+	int nr = 0;
+	struct ib_wc *current_wc = NULL;
+	int retcode = 0;
+	unsigned long spl_flags = 0;
+
+	EHCA_CHECK_CQ(cq);
+	EHCA_CHECK_ADR(wc);
+
+	my_cq = container_of(cq, struct ehca_cq, ib_cq);
+	EHCA_CHECK_CQ(my_cq);
+
+	EDEB_EN(7, "ehca_cq=%p cq_num=%x num_entries=%d wc=%p",
+		my_cq, my_cq->cq_number, num_entries, wc);
+
+	if (num_entries < 1) {
+		EDEB_ERR(4, "Invalid num_entries=%d ehca_cq=%p cq_num=%x",
+			 num_entries, my_cq, my_cq->cq_number);
+		retcode = -EINVAL;
+		goto poll_cq_exit0;
+	}
+
+	current_wc = wc;
+	spin_lock_irqsave(&my_cq->spinlock, spl_flags);
+	for (nr = 0; nr < num_entries; nr++) {
+		retcode = ehca_poll_cq_one(cq, current_wc);
+		if (0 != retcode) {
+			break;
+		}
+		current_wc++;
+	} /* eof for nr */
+	spin_unlock_irqrestore(&my_cq->spinlock, spl_flags);
+	if (-EAGAIN == retcode || 0 == retcode) {
+		retcode = nr;
+	}
+
+ poll_cq_exit0:
+	EDEB_EX(7, "ehca_cq=%p cq_num=%x retcode=%x wc=%p nr_entries=%d",
+		my_cq, my_cq->cq_number, retcode, wc, nr);
+	return (retcode);
+}
+
+int ehca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify cq_notify)
+{
+	struct ehca_cq *my_cq = NULL;
+	int retcode = 0;
+
+	EHCA_CHECK_CQ(cq);
+	my_cq = container_of(cq, struct ehca_cq, ib_cq);
+	EHCA_CHECK_CQ(my_cq);
+	EDEB_EN(7, "ehca_cq=%p cq_num=%x cq_notif=%x",
+		my_cq, my_cq->cq_number, cq_notify);
+
+	switch (cq_notify) {
+	case IB_CQ_SOLICITED:
+		hipz_set_CQx_N0(&my_cq->ehca_cq_core, 1);
+		break;
+	case IB_CQ_NEXT_COMP:
+		hipz_set_CQx_N1(&my_cq->ehca_cq_core, 1);
+		break;
+	default:
+		retcode = -EINVAL;
+	}
+
+	EDEB_EX(7, "ehca_cq=%p cq_num=%x retcode=%x",
+		my_cq, my_cq->cq_number, retcode);
+
+	return (retcode);
+}
+
+/* eof ehca_reqs.c */
diff --git a/drivers/infiniband/hw/ehca/ehca_reqs_core.c b/drivers/infiniband/hw/ehca/ehca_reqs_core.c
new file mode 100644
index 0000000..c0b7281
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_reqs_core.c
@@ -0,0 +1,420 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  post_send/recv, poll_cq, req_notify
+ *  Common code to be included statically in respective user/kernel
+ *  modules, i.e. ehca_ureqs.c/ehca_reqs.c
+ *  This module contains C code only. Including modules must include
+ *  all required header files.
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_reqs_core.c,v 1.40 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+/** The following block of defines
+ * maps kernel-space ib types to the corresponding ones in user space,
+ * so that the inline functions implemented below can be compiled and
+ * work in both user and kernel space.
+ * However, this ASSUMES that there are no functional differences between ib
+ * types in the kernel, e.g. ib_send_wr, and in user space, e.g. ibv_send_wr.
+ */
+
+#ifndef __KERNEL__
+#define ib_recv_wr ibv_recv_wr
+#define ib_send_wr ibv_send_wr
+#define ehca_av ehcau_av
+/* ib_wr_opcode */
+#define IB_WR_SEND IBV_WR_SEND
+#define IB_WR_SEND_WITH_IMM IBV_WR_SEND_WITH_IMM
+#define IB_WR_RDMA_WRITE IBV_WR_RDMA_WRITE
+#define IB_WR_RDMA_WRITE_WITH_IMM IBV_WR_RDMA_WRITE_WITH_IMM
+#define IB_WR_RDMA_READ IBV_WR_RDMA_READ
+/* ib_qp_type */
+#define IB_QPT_RC IBV_QPT_RC
+#define IB_QPT_UC IBV_QPT_UC
+#define IB_QPT_UD IBV_QPT_UD
+/* ib_wc_opcode */
+#define ib_wc_opcode ibv_wc_opcode
+#define IB_WC_SEND IBV_WC_SEND
+#define IB_WC_RDMA_WRITE IBV_WC_RDMA_WRITE
+#define IB_WC_RDMA_READ IBV_WC_RDMA_READ
+#define IB_WC_COMP_SWAP IBV_WC_COMP_SWAP
+#define IB_WC_FETCH_ADD IBV_WC_FETCH_ADD
+#define IB_WC_BIND_MW IBV_WC_BIND_MW
+#define IB_WC_RECV IBV_WC_RECV
+#define IB_WC_RECV_RDMA_WITH_IMM IBV_WC_RECV_RDMA_WITH_IMM
+/* ib_wc_status */
+#define ib_wc_status ibv_wc_status
+#define IB_WC_LOC_LEN_ERR IBV_WC_LOC_LEN_ERR
+#define IB_WC_LOC_QP_OP_ERR IBV_WC_LOC_QP_OP_ERR
+#define IB_WC_LOC_EEC_OP_ERR IBV_WC_LOC_EEC_OP_ERR
+#define IB_WC_LOC_PROT_ERR IBV_WC_LOC_PROT_ERR
+#define IB_WC_WR_FLUSH_ERR IBV_WC_WR_FLUSH_ERR
+#define IB_WC_MW_BIND_ERR IBV_WC_MW_BIND_ERR
+#define IB_WC_GENERAL_ERR IBV_WC_GENERAL_ERR
+#define IB_WC_REM_INV_REQ_ERR IBV_WC_REM_INV_REQ_ERR
+#define IB_WC_REM_ACCESS_ERR IBV_WC_REM_ACCESS_ERR
+#define IB_WC_REM_OP_ERR IBV_WC_REM_OP_ERR
+#define IB_WC_REM_INV_RD_REQ_ERR IBV_WC_REM_INV_RD_REQ_ERR
+#define IB_WC_RETRY_EXC_ERR IBV_WC_RETRY_EXC_ERR
+#define IB_WC_RNR_RETRY_EXC_ERR IBV_WC_RNR_RETRY_EXC_ERR
+#define IB_WC_REM_ABORT_ERR IBV_WC_REM_ABORT_ERR
+#define IB_WC_INV_EECN_ERR IBV_WC_INV_EECN_ERR
+#define IB_WC_INV_EEC_STATE_ERR IBV_WC_INV_EEC_STATE_ERR
+#define IB_WC_BAD_RESP_ERR IBV_WC_BAD_RESP_ERR
+#define IB_WC_FATAL_ERR IBV_WC_FATAL_ERR
+#define IB_WC_SUCCESS IBV_WC_SUCCESS
+/* ib_send_flags */
+#define IB_SEND_FENCE IBV_SEND_FENCE
+#define IB_SEND_SIGNALED IBV_SEND_SIGNALED
+#define IB_SEND_SOLICITED IBV_SEND_SOLICITED
+#define IB_SEND_INLINE IBV_SEND_INLINE
+#endif
+
+static inline int ehca_write_rwqe(struct ehca_qp_core *qp_core,
+				  struct ehca_wqe *wqe_p,
+				  struct ib_recv_wr *recv_wr)
+{
+	u8 cnt_ds;
+	if (unlikely((recv_wr->num_sge < 0) ||
+		     (recv_wr->num_sge > qp_core->ipz_rqueue.act_nr_of_sg))) {
+		EDEB_ERR(4, "Invalid number of WQE SGE. "
+			 "num_sge=%x max_nr_of_sg=%x",
+			 recv_wr->num_sge, qp_core->ipz_rqueue.act_nr_of_sg);
+		return (-EINVAL); /* invalid SG list length */
+	}
+
+	clear_cacheline(wqe_p);
+	clear_cacheline((u8 *) wqe_p + 32);
+	clear_cacheline((u8 *) wqe_p + 64);
+
+	wqe_p->work_request_id = be64_to_cpu(recv_wr->wr_id);
+	wqe_p->nr_of_data_seg = recv_wr->num_sge;
+
+	for (cnt_ds = 0; cnt_ds < recv_wr->num_sge; cnt_ds++) {
+		wqe_p->u.all_rcv.sg_list[cnt_ds].vaddr =
+		    be64_to_cpu(recv_wr->sg_list[cnt_ds].addr);
+		wqe_p->u.all_rcv.sg_list[cnt_ds].lkey =
+		    ntohl(recv_wr->sg_list[cnt_ds].lkey);
+		wqe_p->u.all_rcv.sg_list[cnt_ds].length =
+		    ntohl(recv_wr->sg_list[cnt_ds].length);
+	}
+
+	if (IS_EDEB_ON(7)) {
+		EDEB(7, "RECEIVE WQE written into queue qp_core=%p", qp_core);
+		EDEB_DMP(7, wqe_p, 16*(6 + wqe_p->nr_of_data_seg),
+			 "qp_core=%p", qp_core);
+	}
+
+	return (0);
+}
+
+/* internal use only
+   uncomment this line to enable trace output of GSI send wr */
+/* #define DEBUG_GSI_SEND_WR 1 */
+#if defined(__KERNEL__) && defined(DEBUG_GSI_SEND_WR)
+
+/* need ib_mad struct */
+#include <rdma/ib_mad.h>
+
+static void trace_send_wr_ud(const struct ib_send_wr *send_wr)
+{
+	int idx = 0;
+	int j = 0;
+	while (send_wr != NULL) {
+		struct ib_mad_hdr *mad_hdr = send_wr->wr.ud.mad_hdr;
+		struct ib_sge *sge = send_wr->sg_list;
+		EDEB(4, "send_wr#%x wr_id=%lx num_sge=%x "
+		     "send_flags=%x opcode=%x", idx, send_wr->wr_id,
+		     send_wr->num_sge, send_wr->send_flags, send_wr->opcode);
+		if (mad_hdr != NULL) {
+			EDEB(4, "send_wr#%x mad_hdr base_version=%x "
+			     "mgmt_class=%x class_version=%x method=%x "
+			     "status=%x class_specific=%x tid=%lx attr_id=%x "
+			     "resv=%x attr_mod=%x",
+			     idx, mad_hdr->base_version, mad_hdr->mgmt_class,
+			     mad_hdr->class_version, mad_hdr->method,
+			     mad_hdr->status, mad_hdr->class_specific,
+			     mad_hdr->tid, mad_hdr->attr_id, mad_hdr->resv,
+			     mad_hdr->attr_mod);
+		}
+		for (j = 0; j < send_wr->num_sge; j++) {
+#ifdef EHCA_USERDRIVER
+			u8 *data = (u8 *) sge->addr;
+#else
+			u8 *data = (u8 *) abs_to_virt(sge->addr);
+#endif
+			EDEB(4, "send_wr#%x sge#%x addr=%p length=%x lkey=%x",
+			     idx, j, data, sge->length, sge->lkey);
+			/* assume length is n*16 */
+			EDEB_DMP(4, data, sge->length, "send_wr#%x sge#%x", idx, j);
+			sge++;
+		} /* eof for j */
+		idx++;
+		send_wr = send_wr->next;
+	} /* eof while send_wr */
+}
+
+#endif /* __KERNEL__ && DEBUG_GSI_SEND_WR */
+
+static inline int ehca_write_swqe(struct ehca_qp_core *qp_core,
+				  struct ehca_wqe *wqe_p,
+				  const struct ib_send_wr *send_wr)
+{
+	u32 idx;
+	u64 dma_length;
+	struct ehca_av *my_av;
+	u32 remote_qkey = send_wr->wr.ud.remote_qkey;
+
+	clear_cacheline(wqe_p);
+	clear_cacheline((u8 *) wqe_p + 32);
+
+	if (unlikely((send_wr->num_sge < 0) ||
+		     (send_wr->num_sge > qp_core->ipz_squeue.act_nr_of_sg))) {
+		EDEB_ERR(4, "Invalid number of WQE SGE. "
+			 "num_sge=%x max_nr_of_sg=%x",
+			 send_wr->num_sge, qp_core->ipz_squeue.act_nr_of_sg);
+		return (-EINVAL); /* invalid SG list length */
+	}
+
+	wqe_p->work_request_id = be64_to_cpu(send_wr->wr_id);
+
+	switch (send_wr->opcode) {
+	case IB_WR_SEND:
+	case IB_WR_SEND_WITH_IMM:
+		wqe_p->optype = WQE_OPTYPE_SEND;
+		break;
+	case IB_WR_RDMA_WRITE:
+	case IB_WR_RDMA_WRITE_WITH_IMM:
+		wqe_p->optype = WQE_OPTYPE_RDMAWRITE;
+		break;
+	case IB_WR_RDMA_READ:
+		wqe_p->optype = WQE_OPTYPE_RDMAREAD;
+		break;
+	default:
+		EDEB_ERR(4, "Invalid opcode=%x", send_wr->opcode);
+		return (-EINVAL); /* invalid opcode */
+	}
+
+	wqe_p->wqef = (send_wr->opcode) & 0xF0;
+
+	wqe_p->wr_flag = 0;
+	if (send_wr->send_flags & IB_SEND_SIGNALED) {
+		wqe_p->wr_flag |= WQE_WRFLAG_REQ_SIGNAL_COM;
+	}
+
+	if (send_wr->opcode == IB_WR_SEND_WITH_IMM ||
+	    send_wr->opcode == IB_WR_RDMA_WRITE_WITH_IMM) {
+		/* this might not work as long as HW does not support it */
+		wqe_p->immediate_data = send_wr->imm_data;
+		wqe_p->wr_flag |= WQE_WRFLAG_IMM_DATA_PRESENT;
+	}
+
+	wqe_p->nr_of_data_seg = send_wr->num_sge;
+
+	switch (qp_core->qp_type) {
+#ifdef __KERNEL__
+	case IB_QPT_SMI:
+	case IB_QPT_GSI:
+#endif /* __KERNEL__ */
+		/* no break is intentional here */
+	case IB_QPT_UD:
+		/* IB 1.2 spec C10-15 compliance */
+		if (send_wr->wr.ud.remote_qkey & 0x80000000) {
+			remote_qkey = qp_core->qkey;
+		}
+		wqe_p->destination_qp_number =
+		    ntohl(send_wr->wr.ud.remote_qpn << 8);
+		wqe_p->local_ee_context_qkey = ntohl(remote_qkey);
+		if (send_wr->wr.ud.ah == NULL) {
+			EDEB_ERR(4, "wr.ud.ah is NULL. qp_core=%p", qp_core);
+			return (-EINVAL);
+		}
+		my_av = container_of(send_wr->wr.ud.ah, struct ehca_av, ib_ah);
+		wqe_p->u.ud_av.ud_av = my_av->av;
+
+		/* omitted check of IB_SEND_INLINE
+		   since HW does not support it */
+		for (idx = 0; idx < send_wr->num_sge; idx++) {
+			wqe_p->u.ud_av.sg_list[idx].vaddr =
+			    be64_to_cpu(send_wr->sg_list[idx].addr);
+			wqe_p->u.ud_av.sg_list[idx].lkey =
+			    ntohl(send_wr->sg_list[idx].lkey);
+			wqe_p->u.ud_av.sg_list[idx].length =
+			    ntohl(send_wr->sg_list[idx].length);
+		} /* eof for idx */
+#ifdef __KERNEL__
+		if (qp_core->qp_type == IB_QPT_SMI ||
+		    qp_core->qp_type == IB_QPT_GSI) {
+			wqe_p->u.ud_av.ud_av.pmtu = 1;
+		}
+		if (qp_core->qp_type == IB_QPT_GSI) {
+			wqe_p->pkeyi =
+			    ntohs(send_wr->wr.ud.pkey_index);
+#ifdef DEBUG_GSI_SEND_WR
+			trace_send_wr_ud(send_wr);
+#endif /* DEBUG_GSI_SEND_WR */
+		}
+#endif /* __KERNEL__ */
+		break;
+
+	case IB_QPT_UC:
+		if (send_wr->send_flags & IB_SEND_FENCE) {
+			wqe_p->wr_flag |= WQE_WRFLAG_FENCE;
+		}
+		/* no break is intentional here */
+	case IB_QPT_RC:
+		/*@@TODO atomic???*/
+		wqe_p->u.nud.remote_virtual_adress =
+		    be64_to_cpu(send_wr->wr.rdma.remote_addr);
+		wqe_p->u.nud.rkey = ntohl(send_wr->wr.rdma.rkey);
+
+		/* omitted checking of IB_SEND_INLINE
+		   since HW does not support it */
+		dma_length = 0;
+		for (idx = 0; idx < send_wr->num_sge; idx++) {
+			wqe_p->u.nud.sg_list[idx].vaddr =
+			    be64_to_cpu(send_wr->sg_list[idx].addr);
+			wqe_p->u.nud.sg_list[idx].lkey =
+			    ntohl(send_wr->sg_list[idx].lkey);
+			wqe_p->u.nud.sg_list[idx].length =
+			    ntohl(send_wr->sg_list[idx].length);
+			dma_length += send_wr->sg_list[idx].length;
+		} /* eof idx */
+		wqe_p->u.nud.atomic_1st_op_dma_len = be64_to_cpu(dma_length);
+
+		break;
+
+	default:
+		EDEB_ERR(4, "Invalid qptype=%x", qp_core->qp_type);
+		return (-EINVAL);
+	}
+
+	if (IS_EDEB_ON(7)) {
+		EDEB(7, "SEND WQE written into queue qp_core=%p ", qp_core);
+		EDEB_DMP(7, wqe_p, 16*(6 + wqe_p->nr_of_data_seg),
+			 "qp_core=%p", qp_core);
+	}
+	return (0);
+}
+
+/** @brief convert cqe_status to ib_wc_status
+ */
+static inline void map_ib_wc_status(u32 cqe_status,
+				    enum ib_wc_status *wc_status)
+{
+	if (unlikely(cqe_status & 0x80000000)) { /* complete with errors */
+		switch (cqe_status & 0x0000003F) {
+		case 0x01:
+		case 0x21:
+			*wc_status = IB_WC_LOC_LEN_ERR;
+			break;
+		case 0x02:
+		case 0x22:
+			*wc_status = IB_WC_LOC_QP_OP_ERR;
+			break;
+		case 0x03:
+		case 0x23:
+			*wc_status = IB_WC_LOC_EEC_OP_ERR;
+			break;
+		case 0x04:
+		case 0x24:
+			*wc_status = IB_WC_LOC_PROT_ERR;
+			break;
+		case 0x05:
+		case 0x25:
+			*wc_status = IB_WC_WR_FLUSH_ERR;
+			break;
+		case 0x06:
+			*wc_status = IB_WC_MW_BIND_ERR;
+			break;
+		case 0x07: /* remote error - look into bits 20:24 */
+			switch ((cqe_status & 0x0000F800) >> 11) {
+			case 0x0:
+				/* PSN Sequence Error!
+				   couldn't find a matching VAPI status! */
+				*wc_status = IB_WC_GENERAL_ERR;
+				break;
+			case 0x1:
+				*wc_status = IB_WC_REM_INV_REQ_ERR;
+				break;
+			case 0x2:
+				*wc_status = IB_WC_REM_ACCESS_ERR;
+				break;
+			case 0x3:
+				*wc_status = IB_WC_REM_OP_ERR;
+				break;
+			case 0x4:
+				*wc_status = IB_WC_REM_INV_RD_REQ_ERR;
+				break;
+			}
+			break;
+		case 0x08:
+			*wc_status = IB_WC_RETRY_EXC_ERR;
+			break;
+		case 0x09:
+			*wc_status = IB_WC_RNR_RETRY_EXC_ERR;
+			break;
+		case 0x0A:
+		case 0x2D:
+			*wc_status = IB_WC_REM_ABORT_ERR;
+			break;
+		case 0x0B:
+		case 0x2E:
+			*wc_status = IB_WC_INV_EECN_ERR;
+			break;
+		case 0x0C:
+		case 0x2F:
+			*wc_status = IB_WC_INV_EEC_STATE_ERR;
+			break;
+		case 0x0D:
+			*wc_status = IB_WC_BAD_RESP_ERR;
+			break;
+		case 0x10:
+			/* WQE purged */
+			*wc_status = IB_WC_WR_FLUSH_ERR;
+			break;
+		default:
+			*wc_status = IB_WC_FATAL_ERR;
+
+		}
+	} else {
+		*wc_status = IB_WC_SUCCESS;
+	}
+}
+


From rolandd at cisco.com  Sat Feb 18 11:57:59 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:59 -0800
Subject: [PATCH 21/22] ehca main file
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005759.13620.10968.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

What is ehca_show_flightrecorder() trying to do that snprintf() is
not fast enough?  If you need to pass a binary structure back to
userspace (with a kernel address in it??) then sysfs is not the right
place to put it.  Look at debugfs; or relayfs might make the most
sense for your flightrecorder stuff.
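
As a rough illustration of the text-based alternative, here is a minimal
user-space sketch. It assumes a hypothetical helper named
format_flight_info() (not part of this patch) that formats only the
recorder size and current index as text and never exposes a kernel
address. In the driver the same snprintf() call would sit in the sysfs
show method, with PAGE_SIZE as the buffer length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only: write the flight recorder
 * size and current index into buf as one line of text.
 * Returns the number of bytes stored, not counting the trailing NUL.
 */
static int format_flight_info(char *buf, size_t len,
			      unsigned long nr_entries, int cur_index)
{
	int n;

	assert(buf != NULL && len > 0);

	n = snprintf(buf, len, "size=%lu index=%d\n", nr_entries, cur_index);
	if (n < 0)
		return 0;		/* formatting error, nothing usable */
	if ((size_t) n >= len)
		return (int) len - 1;	/* output was truncated to fit */
	return n;
}
```

The recorder contents themselves are a better fit for debugfs or
relayfs, as noted above; this sketch only covers the small metadata part.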
---

 drivers/infiniband/hw/ehca/ehca_main.c | 1032 ++++++++++++++++++++++++++++++++
 1 files changed, 1032 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
new file mode 100644
index 0000000..2e2be06
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -0,0 +1,1032 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  module start stop, hca detection
+ *
+ *  Authors: Heiko J Schick <schickhj at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_main.c,v 1.137 2006/02/06 16:20:38 schickhj Exp $
+ */
+
+#define DEB_PREFIX "shca"
+
+#include "ehca_kernel.h"
+#include "ehca_tools.h"
+#include "ehca_classes.h"
+#include "ehca_iverbs.h"
+#include "ehca_eq.h"
+#include "ehca_mrmw.h"
+
+#include "hcp_sense.h"		/* TODO: later via hipz_* header file */
+#include "hcp_if.h"		/* TODO: later via hipz_* header file */
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_AUTHOR("Christoph Raisch <raisch at de.ibm.com>");
+MODULE_DESCRIPTION("IBM eServer HCA Driver");
+MODULE_VERSION("EHCA2_0047");
+
+#ifdef EHCA_USERDRIVER
+int ehca_open_aqp1     = 1;
+#else
+int ehca_open_aqp1     = 0;
+#endif
+int ehca_tracelevel    = -1;
+int ehca_hw_level      = 0;
+int ehca_nr_ports      = 2;
+int ehca_use_hp_mr     = 0;
+int ehca_port_act_time = 30;
+int ehca_poll_all_eqs  = 1;
+int ehca_static_rate   = -1;
+
+module_param_named(open_aqp1,     ehca_open_aqp1,     int, 0);
+module_param_named(tracelevel,    ehca_tracelevel,    int, 0);
+module_param_named(hw_level,      ehca_hw_level,      int, 0);
+module_param_named(nr_ports,      ehca_nr_ports,      int, 0);
+module_param_named(use_hp_mr,     ehca_use_hp_mr,     int, 0);
+module_param_named(port_act_time, ehca_port_act_time, int, 0);
+module_param_named(poll_all_eqs,  ehca_poll_all_eqs,  int, 0);
+module_param_named(static_rate,   ehca_static_rate,   int, 0);
+
+MODULE_PARM_DESC(open_aqp1,     "0 do not define AQP1 on startup (default), "
+			        "1 define AQP1 on startup");
+MODULE_PARM_DESC(tracelevel,    "0 maximum performance (no messages), "
+		                "9 maximum messages (no performance)");
+MODULE_PARM_DESC(hw_level,      "0 autosensing, "
+				"1 v. 0.20, "
+				"2 v. 0.21");
+MODULE_PARM_DESC(nr_ports,	"number of connected ports (default: 2)");
+MODULE_PARM_DESC(use_hp_mr,	"use high performance MRs, "
+				"0 no (default), "
+				"1 yes");
+MODULE_PARM_DESC(port_act_time, "time to wait for port activation "
+				"(default: 30 sec.)");
+MODULE_PARM_DESC(poll_all_eqs,	"polls all event queues periodically, "
+				"0 no, "
+				"1 yes (default)");
+MODULE_PARM_DESC(static_rate,	"set permanent static rate (default: disabled)");
+
+/* This external trace mask controls what will end up in the
+ * kernel ring buffer. A value of 6 means that everything between
+ * 0 and 5 will be stored.
+ */
+u8 ehca_edeb_mask[EHCA_EDEB_TRACE_MASK_SIZE]={6,6,6,6,
+					      6,6,6,6,
+					      6,6,6,6,
+					      6,6,6,6,
+					      6,6,6,6,
+					      6,6,6,6,
+					      6,6,6,6,
+					      6,6,1,0};
+		     /* offset 0x1e is flightrecorder */
+EXPORT_SYMBOL(ehca_edeb_mask);
+
+atomic_t ehca_flightrecorder_index = ATOMIC_INIT(1);
+unsigned long ehca_flightrecorder[EHCA_FLIGHTRECORDER_SIZE];
+EXPORT_SYMBOL(ehca_flightrecorder_index);
+EXPORT_SYMBOL(ehca_flightrecorder);
+
+DECLARE_RWSEM(ehca_qp_idr_sem);
+DECLARE_RWSEM(ehca_cq_idr_sem);
+DEFINE_IDR(ehca_qp_idr);
+DEFINE_IDR(ehca_cq_idr);
+
+struct ehca_module ehca_module;
+struct workqueue_struct *ehca_wq;
+struct task_struct  *ehca_kthread_eq;
+
+/**
+ * ehca_init_trace - TODO
+ */
+void ehca_init_trace(void)
+{
+	EDEB_EN(7, "");
+
+	if (ehca_tracelevel != -1) {
+		int i;
+		for (i = 0; i < EHCA_EDEB_TRACE_MASK_SIZE; i++)
+			ehca_edeb_mask[i] = ehca_tracelevel;
+	}
+
+	EDEB_EX(7, "");
+}
+
+/**
+ * ehca_init_flight - TODO
+ */
+void ehca_init_flight(void)
+{
+	EDEB_EN(7, "");
+
+	memset(ehca_flightrecorder, 0xFA,
+	       sizeof(unsigned long) * EHCA_FLIGHTRECORDER_SIZE);
+	atomic_set(&ehca_flightrecorder_index, 0);
+	ehca_flightrecorder[0] = 0x12345678abcdef0;
+
+	EDEB_EX(7, "");
+}
+
+/**
+ * ehca_flight_to_printk - TODO
+ */
+void ehca_flight_to_printk(void)
+{
+	int cur_offset = atomic_read(&ehca_flightrecorder_index);
+	int new_offset = cur_offset - (EHCA_FLIGHTRECORDER_BACKLOG * 4);
+	u32 flight_offset;
+	int i;
+
+	if (new_offset < 0)
+		new_offset = EHCA_FLIGHTRECORDER_SIZE + new_offset - 4;
+
+	printk(KERN_ERR
+	       "EHCA ----- flight recorder begin "
+	       "-------------------------------------------\n");
+
+	for (i = 0; i < EHCA_FLIGHTRECORDER_BACKLOG; i++) {
+		new_offset += 4;
+		flight_offset = (u32) new_offset % EHCA_FLIGHTRECORDER_SIZE;
+
+		printk(KERN_ERR "EHCA %02d: %.16lX %.16lX %.16lX %.16lX\n",
+		       i + 1,
+		       ehca_flightrecorder[flight_offset],
+		       ehca_flightrecorder[flight_offset + 1],
+		       ehca_flightrecorder[flight_offset + 2],
+		       ehca_flightrecorder[flight_offset + 3]);
+	}
+
+	printk(KERN_ERR
+	       "EHCA ----- flight recorder end "
+	       "---------------------------------------------\n");
+}
+
+#define EHCA_CACHE_CREATE(name)                                   \
+	ehca_module->cache_##name =                               \
+		kmem_cache_create("ehca_cache_"#name,             \
+				  sizeof(struct ehca_##name),     \
+				  0, SLAB_HWCACHE_ALIGN,          \
+				  NULL, NULL);                    \
+	if (ehca_module->cache_##name == NULL) {                  \
+		EDEB_ERR(4, "Cannot create "#name" SLAB cache."); \
+		return -ENOMEM;                                   \
+	}                                                         \
+
+/**
+ * ehca_caches_create - TODO
+ */
+int ehca_caches_create(struct ehca_module *ehca_module)
+{
+	EDEB_EN(7, "");
+
+	EHCA_CACHE_CREATE(pd);
+	EHCA_CACHE_CREATE(cq);
+	EHCA_CACHE_CREATE(qp);
+	EHCA_CACHE_CREATE(av);
+	EHCA_CACHE_CREATE(mw);
+	EHCA_CACHE_CREATE(mr);
+
+	EDEB_EX(7, "");
+
+	return 0;
+}
+
+#define EHCA_CACHE_DESTROY(name)                                               \
+	ret = kmem_cache_destroy(ehca_module->cache_##name);                   \
+	if (ret != 0) {                                                        \
+		EDEB_ERR(4, "Cannot destroy "#name" SLAB cache. ret=%x", ret); \
+		return ret;                                                    \
+	}                                                                      \
+
+/**
+ * ehca_caches_destroy - TODO
+ */
+int ehca_caches_destroy(struct ehca_module *ehca_module)
+{
+	int ret;
+
+	EDEB_EN(7, "");
+
+	EHCA_CACHE_DESTROY(pd);
+	EHCA_CACHE_DESTROY(cq);
+	EHCA_CACHE_DESTROY(qp);
+	EHCA_CACHE_DESTROY(av);
+	EHCA_CACHE_DESTROY(mw);
+	EHCA_CACHE_DESTROY(mr);
+
+	EDEB_EX(7, "");
+
+	return 0;
+}
+
+#define EHCA_HCAAVER  EHCA_BMASK_IBM(32,39)
+#define EHCA_REVID    EHCA_BMASK_IBM(40,63)
+
+/**
+ * ehca_sense_attributes - TODO
+ */
+int ehca_sense_attributes(struct ehca_shca *shca)
+{
+	int ret = -EINVAL;
+	u64 rc = H_Success;
+	struct query_hca_rblock *rblock;
+
+	EDEB_EN(7, "shca=%p", shca);
+
+	rblock = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (rblock == NULL) {
+		EDEB_ERR(4, "Cannot allocate rblock memory.");
+		ret = -ENOMEM;
+		goto num_ports0;
+	}
+
+	memset(rblock, 0, PAGE_SIZE);
+
+	rc = hipz_h_query_hca(shca->ipz_hca_handle, rblock);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "Cannot query device properties. rc=%lx", rc);
+		ret = -EPERM;
+		goto num_ports1;
+	}
+
+	if (ehca_nr_ports == 1)
+		shca->num_ports = 1;
+	else
+		shca->num_ports = (u8) rblock->num_ports;
+
+	EDEB(6, " ... found %x ports", rblock->num_ports);
+
+	if (ehca_hw_level == 0) {
+		u32 hcaaver;
+		u32 revid;
+
+		hcaaver = EHCA_BMASK_GET(EHCA_HCAAVER, rblock->hw_ver);
+		revid   = EHCA_BMASK_GET(EHCA_REVID, rblock->hw_ver);
+
+		EDEB(6, " ... hardware version=%x:%x",
+		     hcaaver, revid);
+
+		if ((hcaaver == 1) && (revid == 0))
+			shca->hw_level = 0;
+		else if ((hcaaver == 1) && (revid == 1))
+			shca->hw_level = 1;
+		else if ((hcaaver == 1) && (revid == 2))
+			shca->hw_level = 2;
+	}
+	EDEB(6, " ... hardware level=%x", shca->hw_level);
+
+	ret = 0;
+
+      num_ports1:
+	kfree(rblock);
+
+      num_ports0:
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+static int init_node_guid(struct ehca_shca* shca)
+{
+	int ret = 0;
+	struct query_hca_rblock *rblock;
+
+	EDEB_EN(7, "");
+
+	rblock = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (rblock == NULL) {
+		EDEB_ERR(4, "Can't allocate rblock memory.");
+		ret = -ENOMEM;
+		goto init_node_guid0;
+	}
+
+	memset(rblock, 0, PAGE_SIZE);
+
+	if (hipz_h_query_hca(shca->ipz_hca_handle, rblock) != H_Success) {
+		EDEB_ERR(4, "Can't query device properties");
+		ret = -EINVAL;
+		goto init_node_guid1;
+	}
+
+	memcpy(&shca->ib_device.node_guid, &rblock->node_guid, (sizeof(u64)));
+
+ init_node_guid1:
+	kfree(rblock);
+
+ init_node_guid0:
+	EDEB_EX(7, "node_guid=%lx ret=%x", shca->ib_device.node_guid, ret);
+
+	return ret;
+}
+
+int ehca_register_device(struct ehca_shca *shca)
+{
+	int ret = 0;
+
+	EDEB_EN(7, "shca=%p", shca);
+
+	ret = init_node_guid(shca);
+	if (ret != 0)
+		return ret;
+
+	strlcpy(shca->ib_device.name, "ehca%d", IB_DEVICE_NAME_MAX);
+	shca->ib_device.owner               = THIS_MODULE;
+
+	/* TODO: ABI ver later with define */
+	shca->ib_device.uverbs_abi_ver	    = 1;
+	shca->ib_device.uverbs_cmd_mask	    =
+		(1ull << IB_USER_VERBS_CMD_GET_CONTEXT)		|
+		(1ull << IB_USER_VERBS_CMD_QUERY_DEVICE)	|
+		(1ull << IB_USER_VERBS_CMD_QUERY_PORT)		|
+		(1ull << IB_USER_VERBS_CMD_ALLOC_PD)		|
+		(1ull << IB_USER_VERBS_CMD_DEALLOC_PD)		|
+		(1ull << IB_USER_VERBS_CMD_REG_MR)		|
+		(1ull << IB_USER_VERBS_CMD_DEREG_MR)		|
+		(1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL)	|
+		(1ull << IB_USER_VERBS_CMD_CREATE_CQ)		|
+		(1ull << IB_USER_VERBS_CMD_DESTROY_CQ)		|
+		(1ull << IB_USER_VERBS_CMD_CREATE_QP)		|
+		(1ull << IB_USER_VERBS_CMD_MODIFY_QP)		|
+		(1ull << IB_USER_VERBS_CMD_DESTROY_QP)		|
+		(1ull << IB_USER_VERBS_CMD_ATTACH_MCAST)	|
+		(1ull << IB_USER_VERBS_CMD_DETACH_MCAST);
+
+	shca->ib_device.node_type           = RDMA_NODE_IB_CA;
+	shca->ib_device.phys_port_cnt       = shca->num_ports;
+	shca->ib_device.dma_device          = &shca->ibmebus_dev->ofdev.dev;
+	shca->ib_device.query_device        = ehca_query_device;
+	shca->ib_device.query_port          = ehca_query_port;
+	shca->ib_device.query_gid           = ehca_query_gid;
+	shca->ib_device.query_pkey          = ehca_query_pkey;
+	/* shca->ib_device.modify_device    = ehca_modify_device    */
+	shca->ib_device.modify_port         = ehca_modify_port;
+	shca->ib_device.alloc_ucontext      = ehca_alloc_ucontext;
+	shca->ib_device.dealloc_ucontext    = ehca_dealloc_ucontext;
+	shca->ib_device.alloc_pd            = ehca_alloc_pd;
+	shca->ib_device.dealloc_pd          = ehca_dealloc_pd;
+	shca->ib_device.create_ah	    = ehca_create_ah;
+	/* shca->ib_device.modify_ah	    = ehca_modify_ah;	    */
+	shca->ib_device.query_ah	    = ehca_query_ah;
+	shca->ib_device.destroy_ah	    = ehca_destroy_ah;
+	shca->ib_device.create_qp	    = ehca_create_qp;
+	shca->ib_device.modify_qp	    = ehca_modify_qp;
+	shca->ib_device.query_qp	    = ehca_query_qp;
+	shca->ib_device.destroy_qp	    = ehca_destroy_qp;
+	shca->ib_device.post_send	    = ehca_post_send;
+	shca->ib_device.post_recv	    = ehca_post_recv;
+	shca->ib_device.create_cq	    = ehca_create_cq;
+	shca->ib_device.destroy_cq	    = ehca_destroy_cq;
+
+	/* TODO: disabled due to func signature conflict */
+	/* shca->ib_device.resize_cq	    = ehca_resize_cq;	    */
+
+	shca->ib_device.poll_cq		    = ehca_poll_cq;
+	/* shca->ib_device.peek_cq	    = ehca_peek_cq;	    */
+	shca->ib_device.req_notify_cq	    = ehca_req_notify_cq;
+	/* shca->ib_device.req_ncomp_notif  = ehca_req_ncomp_notif; */
+	shca->ib_device.get_dma_mr	    = ehca_get_dma_mr;
+	shca->ib_device.reg_phys_mr	    = ehca_reg_phys_mr;
+	shca->ib_device.reg_user_mr	    = ehca_reg_user_mr;
+	shca->ib_device.query_mr	    = ehca_query_mr;
+	shca->ib_device.dereg_mr	    = ehca_dereg_mr;
+	shca->ib_device.rereg_phys_mr	    = ehca_rereg_phys_mr;
+	shca->ib_device.alloc_mw	    = ehca_alloc_mw;
+	shca->ib_device.bind_mw		    = ehca_bind_mw;
+	shca->ib_device.dealloc_mw	    = ehca_dealloc_mw;
+	shca->ib_device.alloc_fmr	    = ehca_alloc_fmr;
+	shca->ib_device.map_phys_fmr	    = ehca_map_phys_fmr;
+	shca->ib_device.unmap_fmr	    = ehca_unmap_fmr;
+	shca->ib_device.dealloc_fmr	    = ehca_dealloc_fmr;
+	shca->ib_device.attach_mcast	    = ehca_attach_mcast;
+	shca->ib_device.detach_mcast	    = ehca_detach_mcast;
+	/* shca->ib_device.process_mad	    = ehca_process_mad;	    */
+	shca->ib_device.mmap		    = ehca_mmap;
+
+	ret = ib_register_device(&shca->ib_device);
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+/**
+ * ehca_create_aqp1 - TODO
+ *
+ * @shca: TODO
+ */
+static int ehca_create_aqp1(struct ehca_shca *shca, u32 port)
+{
+	struct ehca_sport *sport;
+	struct ib_cq *ibcq;
+	struct ib_qp *ibqp;
+	struct ib_qp_init_attr qp_init_attr;
+	int ret = 0;
+
+	EDEB_EN(7, "shca=%p port=%x", shca, port);
+
+	sport = &shca->sport[port - 1];
+
+	if (sport->ibcq_aqp1 != NULL) {
+		EDEB_ERR(4, "AQP1 CQ is already created.");
+		return -EPERM;
+	}
+
+	ibcq = ib_create_cq(&shca->ib_device, NULL, NULL, (void*)(-1), 10);
+	if (IS_ERR(ibcq)) {
+		EDEB_ERR(4, "Cannot create AQP1 CQ.");
+		return PTR_ERR(ibcq);
+	}
+	sport->ibcq_aqp1 = ibcq;
+
+	if (sport->ibqp_aqp1 != NULL) {
+		EDEB_ERR(4, "AQP1 QP is already created.");
+		ret = -EPERM;
+		goto create_aqp1;
+	}
+
+	memset(&qp_init_attr, 0, sizeof(struct ib_qp_init_attr));
+	qp_init_attr.send_cq          = ibcq;
+	qp_init_attr.recv_cq          = ibcq;
+	qp_init_attr.sq_sig_type      = IB_SIGNAL_ALL_WR;
+	qp_init_attr.cap.max_send_wr  = 100;
+	qp_init_attr.cap.max_recv_wr  = 100;
+	qp_init_attr.cap.max_send_sge = 2;
+	qp_init_attr.cap.max_recv_sge = 1;
+	qp_init_attr.qp_type          = IB_QPT_GSI;
+	qp_init_attr.port_num         = port;
+	qp_init_attr.qp_context       = NULL;
+	qp_init_attr.event_handler    = NULL;
+	qp_init_attr.srq              = NULL;
+
+	ibqp = ib_create_qp(&shca->pd->ib_pd, &qp_init_attr);
+	if (IS_ERR(ibqp)) {
+		EDEB_ERR(4, "Cannot create AQP1 QP.");
+		ret = PTR_ERR(ibqp);
+		goto create_aqp1;
+	}
+	sport->ibqp_aqp1 = ibqp;
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+
+      create_aqp1:
+	ib_destroy_cq(sport->ibcq_aqp1);
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+/**
+ * ehca_destroy_aqp1 - TODO
+ */
+static int ehca_destroy_aqp1(struct ehca_sport *sport)
+{
+	int ret = 0;
+
+	EDEB_EN(7, "sport=%p", sport);
+
+	ret = ib_destroy_qp(sport->ibqp_aqp1);
+	if (ret != 0) {
+		EDEB_ERR(4, "Cannot destroy AQP1 QP. ret=%x", ret);
+		goto destroy_aqp1;
+	}
+
+	ret = ib_destroy_cq(sport->ibcq_aqp1);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy AQP1 CQ. ret=%x", ret);
+
+      destroy_aqp1:
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+static ssize_t ehca_show_debug_level(struct device_driver *ddp, char *buf)
+{
+	int f;
+	int total = 0;
+	total += snprintf(buf + total, PAGE_SIZE - total, "%d",
+			  ehca_edeb_mask[0]);
+	for (f = 1; f < EHCA_EDEB_TRACE_MASK_SIZE; f++) {
+		total += snprintf(buf + total, PAGE_SIZE - total, ",%d",
+				  ehca_edeb_mask[f]);
+	}
+
+	total += snprintf(buf + total, PAGE_SIZE - total, "\n");
+
+	return total;
+}
+
+static ssize_t ehca_store_debug_level(struct device_driver *ddp,
+				      const char *buf, size_t count)
+{
+	int f;
+	for (f = 0; f < EHCA_EDEB_TRACE_MASK_SIZE && f * 2 < count; f++) {
+		u8 value = buf[f * 2] - '0';
+		if (value <= 9) {
+			ehca_edeb_mask[f] = value;
+		}
+	}
+	return count;
+}
+DRIVER_ATTR(debug_level, S_IRUSR | S_IWUSR,
+	    ehca_show_debug_level, ehca_store_debug_level);
+
+static ssize_t ehca_show_flightrecorder(struct device_driver *ddp,
+					char *buf)
+{
+	/* this is not style compliant, but snprintf is not fast enough */
+	u64 *lbuf = (u64 *) buf;
+	lbuf[0] = (u64) & ehca_flightrecorder;
+	lbuf[1] = EHCA_FLIGHTRECORDER_SIZE;
+	lbuf[2] = atomic_read(&ehca_flightrecorder_index);
+	return sizeof(u64) * 3;
+}
+DRIVER_ATTR(flightrecorder, S_IRUSR, ehca_show_flightrecorder, 0);
+
+void ehca_create_driver_sysfs(struct ibmebus_driver *drv)
+{
+	driver_create_file(&drv->driver, &driver_attr_debug_level);
+	driver_create_file(&drv->driver, &driver_attr_flightrecorder);
+}
+
+void ehca_remove_driver_sysfs(struct ibmebus_driver *drv)
+{
+	driver_remove_file(&drv->driver, &driver_attr_debug_level);
+	driver_remove_file(&drv->driver, &driver_attr_flightrecorder);
+}
+
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
+#define EHCA_RESOURCE_ATTR_H(name)                                         \
+static ssize_t  ehca_show_##name(struct device *dev,                       \
+				 struct device_attribute *attr,            \
+				 char *buf)
+#else
+#define EHCA_RESOURCE_ATTR_H(name)                                         \
+static ssize_t  ehca_show_##name(struct device *dev,                       \
+				 char *buf)
+#endif
+
+#define EHCA_RESOURCE_ATTR(name)                                           \
+EHCA_RESOURCE_ATTR_H(name)                                                 \
+{									   \
+	struct ehca_shca *shca;						   \
+	struct query_hca_rblock *rblock;				   \
+	int len;							   \
+									   \
+	shca = dev->driver_data;					   \
+									   \
+	rblock = kmalloc(PAGE_SIZE, GFP_KERNEL);			   \
+	if (rblock == NULL) {						   \
+		EDEB_ERR(4, "Can't allocate rblock memory.");		   \
+		return 0;						   \
+	}								   \
+									   \
+	memset(rblock, 0, PAGE_SIZE);					   \
+									   \
+	if (hipz_h_query_hca(shca->ipz_hca_handle, rblock) != H_Success) { \
+			EDEB_ERR(4, "Can't query device properties");	   \
+			kfree(rblock);					   \
+			return 0;					   \
+	}								   \
+									   \
+	if ((strcmp(#name, "num_ports") == 0) && (ehca_nr_ports == 1))	   \
+		len = snprintf(buf, 256, "1");				   \
+	else								   \
+		len = snprintf(buf, 256, "%d", rblock->name);		   \
+									   \
+	kfree(rblock);							   \
+									   \
+	if (len < 0)							   \
+		return 0;						   \
+	buf[len] = '\n';						   \
+	buf[len+1] = 0;							   \
+									   \
+	return len+1;							   \
+}									   \
+static DEVICE_ATTR(name, S_IRUGO, ehca_show_##name, NULL);
+
+EHCA_RESOURCE_ATTR(num_ports);
+EHCA_RESOURCE_ATTR(hw_ver);
+EHCA_RESOURCE_ATTR(max_eq);
+EHCA_RESOURCE_ATTR(cur_eq);
+EHCA_RESOURCE_ATTR(max_cq);
+EHCA_RESOURCE_ATTR(cur_cq);
+EHCA_RESOURCE_ATTR(max_qp);
+EHCA_RESOURCE_ATTR(cur_qp);
+EHCA_RESOURCE_ATTR(max_mr);
+EHCA_RESOURCE_ATTR(cur_mr);
+EHCA_RESOURCE_ATTR(max_mw);
+EHCA_RESOURCE_ATTR(cur_mw);
+EHCA_RESOURCE_ATTR(max_pd);
+EHCA_RESOURCE_ATTR(max_ah);
+
+static ssize_t ehca_show_adapter_handle(struct device *dev,
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
+					struct device_attribute *attr,
+#endif
+					char *buf)
+{
+	struct ehca_shca *shca = dev->driver_data;
+
+	return sprintf(buf, "%lx\n", shca->ipz_hca_handle.handle);
+
+}
+static DEVICE_ATTR(adapter_handle, S_IRUGO, ehca_show_adapter_handle, NULL);
+
+
+
+void ehca_create_device_sysfs(struct ibmebus_dev *dev)
+{
+	device_create_file(&dev->ofdev.dev, &dev_attr_adapter_handle);
+	device_create_file(&dev->ofdev.dev, &dev_attr_num_ports);
+	device_create_file(&dev->ofdev.dev, &dev_attr_hw_ver);
+	device_create_file(&dev->ofdev.dev, &dev_attr_max_eq);
+	device_create_file(&dev->ofdev.dev, &dev_attr_cur_eq);
+	device_create_file(&dev->ofdev.dev, &dev_attr_max_cq);
+	device_create_file(&dev->ofdev.dev, &dev_attr_cur_cq);
+	device_create_file(&dev->ofdev.dev, &dev_attr_max_qp);
+	device_create_file(&dev->ofdev.dev, &dev_attr_cur_qp);
+	device_create_file(&dev->ofdev.dev, &dev_attr_max_mr);
+	device_create_file(&dev->ofdev.dev, &dev_attr_cur_mr);
+	device_create_file(&dev->ofdev.dev, &dev_attr_max_mw);
+	device_create_file(&dev->ofdev.dev, &dev_attr_cur_mw);
+	device_create_file(&dev->ofdev.dev, &dev_attr_max_pd);
+	device_create_file(&dev->ofdev.dev, &dev_attr_max_ah);
+}
+
+void ehca_remove_device_sysfs(struct ibmebus_dev *dev)
+{
+	device_remove_file(&dev->ofdev.dev, &dev_attr_adapter_handle);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_num_ports);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_hw_ver);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_max_eq);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_cur_eq);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_max_cq);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_cur_cq);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_max_qp);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_cur_qp);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_max_mr);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_cur_mr);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_max_mw);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_cur_mw);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_max_pd);
+	device_remove_file(&dev->ofdev.dev, &dev_attr_max_ah);
+}
+
+/**
+ * ehca_probe - TODO
+ */
+static int __devinit ehca_probe(struct ibmebus_dev *dev,
+				const struct of_device_id *id)
+{
+	struct ehca_shca *shca;
+	u64 *handle;
+	struct ib_pd *ibpd;
+	int ret = 0;
+
+	EDEB_EN(7, "name=%s", dev->name);
+
+	handle = (u64 *)get_property(dev->ofdev.node, "ibm,hca-handle", NULL);
+	if (!handle) {
+		EDEB_ERR(4, "Cannot get eHCA handle for adapter: %s.",
+			 dev->ofdev.node->full_name);
+		return -ENODEV;
+	}
+
+	if (!(*handle)) {
+		EDEB_ERR(4, "Wrong eHCA handle for adapter: %s.",
+			 dev->ofdev.node->full_name);
+		return -ENODEV;
+	}
+
+	shca = (struct ehca_shca *)ib_alloc_device(sizeof(*shca));
+	if (shca == NULL) {
+		EDEB_ERR(4, "Cannot allocate shca memory.");
+		return -ENOMEM;
+	}
+
+	shca->ibmebus_dev = dev;
+	shca->ipz_hca_handle.handle = *handle;
+	dev->ofdev.dev.driver_data = shca;
+
+	ret = ehca_sense_attributes(shca);
+	if (ret < 0) {
+		EDEB_ERR(4, "Cannot sense eHCA attributes.");
+		goto probe1;
+	}
+
+	/* create event queues */
+	ret = ehca_create_eq(shca, &shca->eq, EHCA_EQ, 2048);
+	if (ret != 0) {
+		EDEB_ERR(4, "Cannot create EQ.");
+		goto probe1;
+	}
+
+	ret = ehca_create_eq(shca, &shca->neq, EHCA_NEQ, 513);
+	if (ret != 0) {
+		EDEB_ERR(4, "Cannot create NEQ.");
+		goto probe2;
+	}
+
+	/* create internal protection domain */
+	ibpd = ehca_alloc_pd(&shca->ib_device, (void*)(-1), 0);
+	if (IS_ERR(ibpd)) {
+		EDEB_ERR(4, "Cannot create internal PD.");
+		ret = PTR_ERR(ibpd);
+		goto probe3;
+	}
+
+	shca->pd = container_of(ibpd, struct ehca_pd, ib_pd);
+	shca->pd->ib_pd.device = &shca->ib_device;
+
+	/* create internal max MR */
+	if (shca->maxmr == 0) {
+		struct ehca_mr *e_maxmr = 0;
+		ret = ehca_reg_internal_maxmr(shca, shca->pd, &e_maxmr);
+		if (ret != 0) {
+			EDEB_ERR(4, "Cannot create internal MR. ret=%x", ret);
+			goto probe4;
+		}
+		shca->maxmr = e_maxmr;
+	}
+
+	ret = ehca_register_device(shca);
+	if (ret != 0) {
+		EDEB_ERR(4, "Cannot register Infiniband device.");
+		goto probe5;
+	}
+
+	/* create AQP1 for port 1 */
+	if (ehca_open_aqp1 == 1) {
+		shca->sport[0].port_state = IB_PORT_DOWN;
+		ret = ehca_create_aqp1(shca, 1);
+		if (ret != 0) {
+			EDEB_ERR(4, "Cannot create AQP1 for port 1.");
+			goto probe6;
+		}
+	}
+
+	/* create AQP1 for port 2 */
+	if ((ehca_open_aqp1 == 1) && (shca->num_ports == 2)) {
+		shca->sport[1].port_state = IB_PORT_DOWN;
+		ret = ehca_create_aqp1(shca, 2);
+		if (ret != 0) {
+			EDEB_ERR(4, "Cannot create AQP1 for port 2.");
+			goto probe7;
+		}
+	}
+
+	ehca_create_device_sysfs(dev);
+
+	spin_lock(&ehca_module.shca_lock);
+	list_add(&shca->shca_list, &ehca_module.shca_list);
+	spin_unlock(&ehca_module.shca_lock);
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return 0;
+
+ probe7:
+	ret = ehca_destroy_aqp1(&shca->sport[0]);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy AQP1 for port 1. ret=%x", ret);
+
+ probe6:
+	ib_unregister_device(&shca->ib_device);
+
+ probe5:
+	ret = ehca_dereg_internal_maxmr(shca);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy internal MR. ret=%x", ret);
+
+ probe4:
+	ret = ehca_dealloc_pd(&shca->pd->ib_pd);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy internal PD. ret=%x", ret);
+
+ probe3:
+	ret = ehca_destroy_eq(shca, &shca->neq);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy NEQ. ret=%x", ret);
+
+ probe2:
+	ret = ehca_destroy_eq(shca, &shca->eq);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy EQ. ret=%x", ret);
+
+ probe1:
+	ib_dealloc_device(&shca->ib_device);
+
+	EDEB_EX(4, "ret=%x", ret);
+
+	return -EINVAL;
+}
+
+static int __devexit ehca_remove(struct ibmebus_dev *dev)
+{
+	struct ehca_shca *shca = dev->ofdev.dev.driver_data;
+	int ret;
+
+	EDEB_EN(7, "shca=%p", shca);
+
+	ehca_remove_device_sysfs(dev);
+
+	if (ehca_open_aqp1 == 1) {
+		int i;
+
+		for (i = 0; i < shca->num_ports; i++) {
+			ret = ehca_destroy_aqp1(&shca->sport[i]);
+			if (ret != 0)
+				EDEB_ERR(4, "Cannot destroy AQP1 for port %x."
+					 " ret=%x", i + 1, ret);
+		}
+	}
+
+	ib_unregister_device(&shca->ib_device);
+
+	ret = ehca_dereg_internal_maxmr(shca);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy internal MR. ret=%x", ret);
+
+	ret = ehca_dealloc_pd(&shca->pd->ib_pd);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy internal PD. ret=%x", ret);
+
+	ret = ehca_destroy_eq(shca, &shca->eq);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy EQ. ret=%x", ret);
+
+	ret = ehca_destroy_eq(shca, &shca->neq);
+	if (ret != 0)
+		EDEB_ERR(4, "Cannot destroy NEQ. ret=%x", ret);
+
+	spin_lock(&ehca_module.shca_lock);
+	list_del(&shca->shca_list);
+	spin_unlock(&ehca_module.shca_lock);
+
+	ib_dealloc_device(&shca->ib_device);
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+static struct of_device_id ehca_device_table[] =
+{
+	{
+		.name       = "lhca",
+		.compatible = "IBM,lhca",
+	},
+	{},
+};
+
+static struct ibmebus_driver ehca_driver = {
+	.name     = "ehca",
+	.id_table = ehca_device_table,
+	.probe    = ehca_probe,
+	.remove   = ehca_remove,
+};
+
+/**
+ * ehca_module_init - eHCA initialization routine.
+ */
+int __init ehca_module_init(void)
+{
+	int ret = 0;
+
+	printk(KERN_INFO "eHCA Infiniband Device Driver "
+	                 "(Rel.: EHCA2_0047)\n");
+	EDEB_EN(7, "");
+
+	idr_init(&ehca_qp_idr);
+	idr_init(&ehca_cq_idr);
+
+	INIT_LIST_HEAD(&ehca_module.shca_list);
+	spin_lock_init(&ehca_module.shca_lock);
+
+	ehca_init_trace();
+	ehca_init_flight();
+
+	ehca_wq = create_workqueue("ehca");
+	if (ehca_wq == NULL) {
+		EDEB_ERR(4, "Cannot create workqueue.");
+		ret = -ENOMEM;
+		goto module_init0;
+	}
+
+	if ((ret = ehca_caches_create(&ehca_module)) != 0) {
+		ehca_catastrophic("Cannot create SLAB caches");
+		ret = -ENOMEM;
+		goto module_init1;
+	}
+
+	if ((ret = ibmebus_register_driver(&ehca_driver)) != 0) {
+		ehca_catastrophic("Cannot register eHCA device driver");
+		ret = -EINVAL;
+		goto module_init2;
+	}
+
+	ehca_create_driver_sysfs(&ehca_driver);
+
+	if (ehca_poll_all_eqs != 1) {
+		EDEB_ERR(4, "WARNING!!!");
+		EDEB_ERR(4, "It is possible to lose interrupts.");
+
+		return 0;
+	}
+
+	ehca_kthread_eq = kthread_create(ehca_poll_eqs, &ehca_module,
+					 "ehca_poll_eqs");
+	if (IS_ERR(ehca_kthread_eq)) {
+		EDEB_ERR(4, "Cannot create kthread_eq");
+		ret = PTR_ERR(ehca_kthread_eq);
+		goto module_init3;
+	}
+
+	wake_up_process(ehca_kthread_eq);
+
+	EDEB_EX(7, "ret=%x", ret);
+
+	return 0;
+
+ module_init3:
+	ehca_remove_driver_sysfs(&ehca_driver);
+	ibmebus_unregister_driver(&ehca_driver);
+
+ module_init2:
+	ehca_caches_destroy(&ehca_module);
+
+ module_init1:
+	destroy_workqueue(ehca_wq);
+
+ module_init0:
+	EDEB_EX(7, "ret=%x", ret);
+
+	return ret;
+}
+
+/**
+ * ehca_module_exit - eHCA exit routine.
+ */
+void __exit ehca_module_exit(void)
+{
+	EDEB_EN(7, "");
+
+	if (ehca_poll_all_eqs == 1)
+		kthread_stop(ehca_kthread_eq);
+
+	ehca_remove_driver_sysfs(&ehca_driver);
+	ibmebus_unregister_driver(&ehca_driver);
+
+	if (ehca_caches_destroy(&ehca_module) != 0)
+		ehca_catastrophic("Cannot destroy SLAB caches");
+
+	destroy_workqueue(ehca_wq);
+
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,15)
+	idr_destroy_ext(&ehca_cq_idr);
+	idr_destroy_ext(&ehca_qp_idr);
+#else
+	idr_destroy(&ehca_cq_idr);
+	idr_destroy(&ehca_qp_idr);
+#endif
+
+	EDEB_EX(7, "");
+}
+
+module_init(ehca_module_init);
+module_exit(ehca_module_exit);

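A note on the error path in ehca_probe() above: the unwind labels store the
return values of the cleanup calls in ret, and the function then returns
-EINVAL in every case, so the reason for the original failure is lost. The
following is only a minimal userspace sketch of one way to keep it, not code
from the driver. struct probe_sim, step_create(), step_destroy() and
probe_sketch() are hypothetical stand-ins for the EQ/NEQ/PD setup and teardown
steps.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for three of the probe steps (EQ, NEQ, PD). */
enum { STEP_EQ, STEP_NEQ, STEP_PD, NUM_STEPS };

struct probe_sim {
	int fail_at;		/* step that should fail, or -1 for none */
	int fail_code;		/* negative errno returned by that step */
	int live[NUM_STEPS];	/* 1 while the step's resource exists */
	int cleanup_errors;	/* teardown failures, counted separately */
};

static int step_create(struct probe_sim *sim, int step)
{
	if (sim->fail_at == step)
		return sim->fail_code;
	sim->live[step] = 1;
	return 0;
}

static int step_destroy(struct probe_sim *sim, int step)
{
	assert(sim->live[step]);	/* only undo what was set up */
	sim->live[step] = 0;
	return 0;
}

static int probe_sketch(struct probe_sim *sim)
{
	int ret;

	ret = step_create(sim, STEP_EQ);
	if (ret)
		goto out;
	ret = step_create(sim, STEP_NEQ);
	if (ret)
		goto undo_eq;
	ret = step_create(sim, STEP_PD);
	if (ret)
		goto undo_neq;
	return 0;

undo_neq:
	/* Teardown results are counted, never stored in ret. */
	if (step_destroy(sim, STEP_NEQ))
		sim->cleanup_errors++;
undo_eq:
	if (step_destroy(sim, STEP_EQ))
		sim->cleanup_errors++;
out:
	return ret;	/* still the error from the step that failed */
}
```

With fail_at = STEP_PD and fail_code = -ENOMEM, probe_sketch() tears down the
NEQ and EQ stand-ins in reverse order and returns -ENOMEM rather than a fixed
code.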

From rolandd at cisco.com  Sat Feb 18 11:57:43 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:43 -0800
Subject: [PATCH 14/22] ehca completion queue handling
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005743.13620.29456.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>


---

 drivers/infiniband/hw/ehca/ehca_cq.c |  416 ++++++++++++++++++++++++++++++++++
 1 files changed, 416 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c b/drivers/infiniband/hw/ehca/ehca_cq.c
new file mode 100644
index 0000000..ebee9c3
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_cq.c
@@ -0,0 +1,416 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  Completion queue handling
+ *
+ *  Authors: Waleri Fomin <fomin at de.ibm.com>
+ *           Reinhard Ernst <rernst at de.ibm.com>
+ *           Heiko J Schick <schickhj at de.ibm.com>
+ *           Hoang-Nam Nguyen <hnguyen at de.ibm.com>
+ *
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_cq.c,v 1.61 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#define DEB_PREFIX "e_cq"
+
+#include "ehca_kernel.h"
+#include "ehca_common.h"
+#include "ehca_iverbs.h"
+#include "ehca_classes.h"
+#include "ehca_irq.h"
+#include "hcp_if.h"
+#include <linux/err.h>
+#include <asm/uaccess.h>
+
+#define HIPZ_CQ_REGISTER_ORIG 0
+
+int ehca_cq_assign_qp(struct ehca_cq *cq, struct ehca_qp *qp)
+{
+	unsigned int qp_num = qp->ehca_qp_core.real_qp_num;
+	unsigned int key = qp_num%QP_HASHTAB_LEN;
+	unsigned long spl_flags = 0;
+	spin_lock_irqsave(&cq->spinlock, spl_flags);
+	list_add(&qp->list_entries, &cq->qp_hashtab[key]);
+	spin_unlock_irqrestore(&cq->spinlock, spl_flags);
+	EDEB(7, "cq_num=%x real_qp_num=%x", cq->cq_number, qp_num);
+	return 0;
+}
+
+int ehca_cq_unassign_qp(struct ehca_cq *cq, unsigned int real_qp_num)
+{
+	int ret = -EINVAL;
+	unsigned int key = real_qp_num%QP_HASHTAB_LEN;
+	struct list_head *iter = NULL;
+	struct ehca_qp *qp = NULL;
+	unsigned long spl_flags = 0;
+	spin_lock_irqsave(&cq->spinlock, spl_flags);
+	list_for_each(iter, &cq->qp_hashtab[key]) {
+		qp = list_entry(iter, struct ehca_qp, list_entries);
+		if (qp->ehca_qp_core.real_qp_num == real_qp_num) {
+			list_del(iter);
+			EDEB(7, "removed qp from cq cq_num=%x real_qp_num=%x",
+			     cq->cq_number, real_qp_num);
+			ret = 0;
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&cq->spinlock, spl_flags);
+	if (ret!=0) {
+		EDEB_ERR(4, "qp not found cq_num=%x real_qp_num=%x",
+			 cq->cq_number, real_qp_num);
+	}
+	return ret;
+}
+
+struct ehca_qp* ehca_cq_get_qp(struct ehca_cq *cq, int real_qp_num)
+{
+	struct ehca_qp *ret = NULL;
+	unsigned int key = real_qp_num%QP_HASHTAB_LEN;
+	struct list_head *iter = NULL;
+	struct ehca_qp *qp = NULL;
+	list_for_each(iter, &cq->qp_hashtab[key]) {
+		qp = list_entry(iter, struct ehca_qp, list_entries);
+		if (qp->ehca_qp_core.real_qp_num == real_qp_num) {
+			ret = qp;
+			break;
+		}
+	}
+	return ret;
+}
+
+struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe,
+			     struct ib_ucontext *context,
+			     struct ib_udata *udata)
+{
+	struct ib_cq *cq = NULL;
+	struct ehca_cq *my_cq = NULL;
+	u32 number_of_entries = cqe;
+	struct ehca_shca *shca = NULL;
+	struct ipz_adapter_handle adapter_handle;
+	struct ipz_eq_handle eq_handle;
+	struct ipz_cq_handle *cq_handle_ref = NULL;
+	u32 act_nr_of_entries = 0;
+	u32 act_pages = 0;
+	u32 counter = 0;
+	void *vpage = NULL;
+	u64 rpage = 0;
+	struct h_galpa gal;
+	u64 CQx_FEC = 0;
+	u64 hipz_rc = H_Success;
+	int ipz_rc = 0;
+	int ret = 0;
+	const u32 additional_cqe=20;
+	int i= 0;
+
+	EHCA_CHECK_DEVICE_P(device);
+	EDEB_EN(7,  "device=%p cqe=%x context=%p",
+		device, cqe, context);
+	/* The CQ's maximum depth is 4GB-64, but we need 20 additional
+	 * entries as a buffer for receiving error CQEs.
+	 */
+	if (cqe>=0xFFFFFFFF-64-additional_cqe) {
+		return ERR_PTR(-EINVAL);
+	}
+	number_of_entries += additional_cqe;
+
+	my_cq = ehca_cq_new();
+	if (my_cq == NULL) {
+		cq = ERR_PTR(-ENOMEM);
+		EDEB_ERR(4,
+			 "Out of memory for ehca_cq struct "
+			 "device=%p", device);
+		goto create_cq_exit0;
+	}
+	cq = &my_cq->ib_cq;
+
+	shca = container_of(device, struct ehca_shca, ib_device);
+	adapter_handle = shca->ipz_hca_handle;
+	eq_handle = shca->eq.ipz_eq_handle;
+	cq_handle_ref = &my_cq->ipz_cq_handle;
+
+	do {
+		if (!idr_pre_get(&ehca_cq_idr, GFP_KERNEL)) {
+			cq = ERR_PTR(-ENOMEM);
+			EDEB_ERR(4,
+				 "Can't reserve idr resources. "
+				 "device=%p", device);
+			goto create_cq_exit1;
+		}
+
+		down_write(&ehca_cq_idr_sem);
+		ret = idr_get_new(&ehca_cq_idr, my_cq, &my_cq->token);
+		up_write(&ehca_cq_idr_sem);
+
+	} while (ret == -EAGAIN);
+
+	if (ret) {
+		cq = ERR_PTR(-ENOMEM);
+		EDEB_ERR(4,
+			 "Can't allocate new idr entry. "
+			 "device=%p", device);
+		goto create_cq_exit1;
+	}
+
+	hipz_rc = hipz_h_alloc_resource_cq(adapter_handle,
+					   &my_cq->pf,
+					   eq_handle,
+					   my_cq->token,
+					   number_of_entries,
+					   cq_handle_ref,
+					   &act_nr_of_entries,
+					   &act_pages,
+					   &my_cq->ehca_cq_core.galpas);
+	if (hipz_rc != H_Success) {
+		EDEB_ERR(4,
+			 "hipz_h_alloc_resource_cq() failed "
+			 "hipz_rc=%lx device=%p", hipz_rc, device);
+		cq = ERR_PTR(ehca2ib_return_code(hipz_rc));
+		goto create_cq_exit2;
+	}
+
+	ipz_rc =
+		ipz_queue_ctor(&my_cq->ehca_cq_core.ipz_queue, act_pages,
+			       EHCA_PAGESIZE, sizeof(struct ehca_cqe), 0);
+	if (!ipz_rc) {
+		EDEB_ERR(4,
+			 "ipz_queue_ctor() failed "
+			 "ipz_rc=%x device=%p", ipz_rc, device);
+		cq = ERR_PTR(-EINVAL);
+		goto create_cq_exit3;
+	}
+
+	for (counter = 0; counter < act_pages; counter++) {
+		vpage = ipz_QPageit_get_inc(&my_cq->ehca_cq_core.ipz_queue);
+		if (!vpage) {
+			EDEB_ERR(4, "ipz_QPageit_get_inc() "
+				 "returns NULL device=%p", device);
+			cq = ERR_PTR(-EAGAIN);
+			goto create_cq_exit4;
+		}
+		rpage = ehca_kv_to_g(vpage);
+
+		hipz_rc = hipz_h_register_rpage_cq(adapter_handle,
+						   my_cq->ipz_cq_handle,
+						   &my_cq->pf,
+						   0,
+						   HIPZ_CQ_REGISTER_ORIG,
+						   rpage,
+						   1,
+						   my_cq->ehca_cq_core.galpas.
+						   kernel);
+
+		if (hipz_rc < H_Success) {
+			EDEB_ERR(4, "hipz_h_register_rpage_cq() failed "
+				 "ehca_cq=%p cq_num=%x hipz_rc=%lx "
+				 "counter=%i act_pages=%i",
+				 my_cq, my_cq->cq_number,
+				 hipz_rc, counter, act_pages);
+			cq = ERR_PTR(-EINVAL);
+			goto create_cq_exit4;
+		}
+
+		if (counter == (act_pages - 1)) {
+			vpage = ipz_QPageit_get_inc(
+				&my_cq->ehca_cq_core.ipz_queue);
+			if ((hipz_rc != H_Success) || (vpage != 0)) {
+				EDEB_ERR(4, "Registration of pages not "
+					 "complete ehca_cq=%p cq_num=%x "
+					 "hipz_rc=%lx",
+					 my_cq, my_cq->cq_number, hipz_rc);
+				cq = ERR_PTR(-EAGAIN);
+				goto create_cq_exit4;
+			}
+		} else {
+			if (hipz_rc != H_PAGE_REGISTERED) {
+				EDEB_ERR(4, "Registration of page failed "
+					 "ehca_cq=%p cq_num=%x hipz_rc=%lx "
+					 "counter=%i act_pages=%i",
+					 my_cq, my_cq->cq_number,
+					 hipz_rc, counter, act_pages);
+				cq = ERR_PTR(-ENOMEM);
+				goto create_cq_exit4;
+			}
+		}
+	}
+
+	ipz_QEit_reset(&my_cq->ehca_cq_core.ipz_queue);
+
+	gal = my_cq->ehca_cq_core.galpas.kernel;
+	CQx_FEC = hipz_galpa_load(gal, CQTEMM_OFFSET(CQx_FEC));
+	EDEB(8, "ehca_cq=%p cq_num=%x CQx_FEC=%lx",
+	     my_cq, my_cq->cq_number, CQx_FEC);
+
+	my_cq->ib_cq.cqe = my_cq->nr_of_entries =
+		act_nr_of_entries-additional_cqe;
+	my_cq->cq_number = (my_cq->ipz_cq_handle.handle) & 0xffff;
+
+	for (i=0; i<QP_HASHTAB_LEN; i++) {
+		INIT_LIST_HEAD(&my_cq->qp_hashtab[i]);
+	}
+
+	if (context) {
+		struct ehca_create_cq_resp resp;
+		struct vm_area_struct * vma;
+		resp.cq_number = my_cq->cq_number;
+		resp.token = my_cq->token;
+		resp.ehca_cq_core = my_cq->ehca_cq_core;
+
+		ehca_mmap_nopage(((u64) (my_cq->token) << 32) | 0x12000000,
+				 my_cq->ehca_cq_core.ipz_queue.queue_length,
+				 ((void**)&resp.ehca_cq_core.ipz_queue.queue),
+				 &vma);
+		my_cq->uspace_queue = (u64)resp.ehca_cq_core.ipz_queue.queue;
+		ehca_mmap_register(my_cq->ehca_cq_core.galpas.user.fw_handle,
+				   ((void**)&resp.ehca_cq_core.galpas.kernel.fw_handle),
+				   &vma);
+		my_cq->uspace_fwh = (u64)resp.ehca_cq_core.galpas.kernel.fw_handle;
+		if (ib_copy_to_udata(udata, &resp, sizeof(resp))) {
+			cq = ERR_PTR(-EFAULT);
+			goto create_cq_exit4;
+		}
+	}
+
+	EDEB_EX(7,"retcode=%p ehca_cq=%p cq_num=%x cq_size=%x",
+		cq, my_cq, my_cq->cq_number, act_nr_of_entries);
+	return cq;
+
+ create_cq_exit4:
+	ipz_queue_dtor(&my_cq->ehca_cq_core.ipz_queue);
+
+ create_cq_exit3:
+	hipz_rc = hipz_h_destroy_cq(adapter_handle, my_cq, 1);
+	EDEB(3, "hipz_h_destroy_cq() ehca_cq=%p cq_num=%x hipz_rc=%lx",
+	     my_cq, my_cq->cq_number, hipz_rc);
+
+ create_cq_exit2:
+	/* dereg idr */
+	down_write(&ehca_cq_idr_sem);
+	idr_remove(&ehca_cq_idr, my_cq->token);
+	up_write(&ehca_cq_idr_sem);
+
+ create_cq_exit1:
+	/* free cq struct */
+	ehca_cq_delete(my_cq);
+
+ create_cq_exit0:
+	EDEB_EX(7,  "An error has occurred retcode=%p ", cq);
+	return cq;
+}
+
+int ehca_destroy_cq(struct ib_cq *cq)
+{
+	u64 hipz_rc = H_Success;
+	int retcode = 0;
+	struct ehca_cq *my_cq = NULL;
+	int cq_num = 0;
+	struct ib_device *device = NULL;
+	struct ehca_shca *shca = NULL;
+	struct ipz_adapter_handle adapter_handle;
+
+	EHCA_CHECK_CQ(cq);
+	my_cq = container_of(cq, struct ehca_cq, ib_cq);
+	cq_num = my_cq->cq_number;
+	device = cq->device;
+	EHCA_CHECK_DEVICE(device);
+	shca = container_of(device, struct ehca_shca, ib_device);
+	adapter_handle = shca->ipz_hca_handle;
+	EDEB_EN(7, "ehca_cq=%p cq_num=%x",
+		my_cq, my_cq->cq_number);
+
+	down_write(&ehca_cq_idr_sem);
+	idr_remove(&ehca_cq_idr, my_cq->token);
+	up_write(&ehca_cq_idr_sem);
+
+	/* un-mmap if vma alloc */
+	if (my_cq->uspace_queue!=0) {
+		struct ehca_cq_core *cq_core = &my_cq->ehca_cq_core;
+		retcode = ehca_munmap(my_cq->uspace_queue,
+				      cq_core->ipz_queue.queue_length);
+		retcode = ehca_munmap(my_cq->uspace_fwh, 4096);
+	}
+
+	hipz_rc = hipz_h_destroy_cq(adapter_handle, my_cq, 0);
+	if (hipz_rc == H_R_STATE) {
+		/* cq in err: read err data and destroy it forcibly */
+		EDEB(4, "ehca_cq=%p cq_num=%x resource=%lx in err state. "
+		     "Try to delete it forcibly.",
+		     my_cq, my_cq->cq_number, my_cq->ipz_cq_handle.handle);
+		ehca_error_data(shca, my_cq->ipz_cq_handle.handle);
+		hipz_rc = hipz_h_destroy_cq(adapter_handle, my_cq, 1);
+		if (hipz_rc == H_Success) {
+			EDEB(4, "ehca_cq=%p cq_num=%x deleted successfully.",
+			     my_cq, my_cq->cq_number);
+		}
+	}
+	if (hipz_rc != H_Success) {
+		EDEB_ERR(4,"hipz_h_destroy_cq() failed "
+			 "hipz_rc=%lx ehca_cq=%p cq_num=%x",
+			 hipz_rc, my_cq, my_cq->cq_number);
+		retcode = ehca2ib_return_code(hipz_rc);
+		goto destroy_cq_exit0;/*@TODO*/
+	}
+	ipz_queue_dtor(&my_cq->ehca_cq_core.ipz_queue);
+	ehca_cq_delete(my_cq);
+
+ destroy_cq_exit0:
+	EDEB_EX(7, "ehca_cq=%p cq_num=%x retcode=%x ",
+		my_cq, cq_num, retcode);
+	return retcode;
+}
+
+int ehca_resize_cq(struct ib_cq *cq, int cqe)
+{
+	int retcode = 0;
+	struct ehca_cq *my_cq = NULL;
+
+	if (unlikely(NULL == cq)) {
+		EDEB_ERR(4, "cq is NULL");
+		return -EFAULT;
+	}
+
+	my_cq = container_of(cq, struct ehca_cq, ib_cq);
+	EDEB_EN(7, "ehca_cq=%p cq_num=%x",
+		my_cq, my_cq->cq_number);
+	/*TODO proper resize still needs to be done*/
+	if (cqe > cq->cqe) {
+		retcode = -EINVAL;
+	}
+	EDEB_EX(7, "ehca_cq=%p cq_num=%x",
+		my_cq, my_cq->cq_number);
+	return retcode;
+}
+
+/* eof ehca_cq.c */

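For readers of ehca_cq_assign_qp(), ehca_cq_unassign_qp() and ehca_cq_get_qp()
above: each CQ keeps its QPs in QP_HASHTAB_LEN buckets, keyed by
real_qp_num % QP_HASHTAB_LEN. The following is a minimal userspace sketch of
that scheme, not the driver code. It uses plain singly linked buckets instead
of struct list_head and omits the spinlock. The value 8 for QP_HASHTAB_LEN and
the names qp_sketch, cq_sketch and cq_*() are assumptions made for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define QP_HASHTAB_LEN 8	/* assumed value; the driver defines its own */

struct qp_sketch {
	unsigned int real_qp_num;
	struct qp_sketch *next;		/* next QP in the same bucket */
};

struct cq_sketch {
	struct qp_sketch *qp_hashtab[QP_HASHTAB_LEN];
};

/* Put the QP at the head of the bucket selected by its number. */
static void cq_assign_qp(struct cq_sketch *cq, struct qp_sketch *qp)
{
	unsigned int key;

	assert(cq != NULL && qp != NULL);
	key = qp->real_qp_num % QP_HASHTAB_LEN;
	qp->next = cq->qp_hashtab[key];
	cq->qp_hashtab[key] = qp;
}

/* Walk one bucket only; return NULL if the number is not registered. */
static struct qp_sketch *cq_get_qp(struct cq_sketch *cq,
				   unsigned int real_qp_num)
{
	struct qp_sketch *qp = cq->qp_hashtab[real_qp_num % QP_HASHTAB_LEN];

	while (qp != NULL && qp->real_qp_num != real_qp_num)
		qp = qp->next;
	return qp;
}

/* Unlink the QP from its bucket; -EINVAL if it was never assigned. */
static int cq_unassign_qp(struct cq_sketch *cq, unsigned int real_qp_num)
{
	struct qp_sketch **link =
		&cq->qp_hashtab[real_qp_num % QP_HASHTAB_LEN];

	for (; *link != NULL; link = &(*link)->next) {
		if ((*link)->real_qp_num == real_qp_num) {
			*link = (*link)->next;
			return 0;
		}
	}
	return -EINVAL;
}
```

QP numbers 3 and 11 land in the same bucket when the table has 8 entries, so
they exercise the collision path in both lookup and removal.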

From rolandd at cisco.com  Sat Feb 18 11:57:57 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:57 -0800
Subject: [PATCH 20/22] ehca userspace verbs
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005757.13620.13628.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>


---

 drivers/infiniband/hw/ehca/ehca_uverbs.c |  376 ++++++++++++++++++++++++++++++
 1 files changed, 376 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_uverbs.c b/drivers/infiniband/hw/ehca/ehca_uverbs.c
new file mode 100644
index 0000000..f813e9c
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_uverbs.c
@@ -0,0 +1,376 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  userspace support verbs
+ *
+ *  Authors: Heiko J Schick <schickhj at de.ibm.com>
+ *           Christoph Raisch <raisch at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_uverbs.c,v 1.29 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#undef DEB_PREFIX
+#define DEB_PREFIX "uver"
+
+#include "ehca_kernel.h"
+#include "ehca_tools.h"
+#include "ehca_classes.h"
+#include "ehca_iverbs.h"
+#include "ehca_eq.h"
+#include "ehca_mrmw.h"
+
+#include "hcp_sense.h"		/* TODO: later via hipz_* header file */
+#include "hcp_if.h"		/* TODO: later via hipz_* header file */
+
+struct ib_ucontext *ehca_alloc_ucontext(struct ib_device *device,
+					struct ib_udata *udata)
+{
+	struct ehca_ucontext *my_context = NULL;
+	EHCA_CHECK_ADR_P(device);
+	EDEB_EN(7, "device=%p name=%s", device, device->name);
+	my_context = kmalloc(sizeof *my_context, GFP_KERNEL);
+	if (NULL == my_context) {
+		EDEB_ERR(4, "Out of memory device=%p", device);
+		return ERR_PTR(-ENOMEM);
+	}
+	memset(my_context, 0, sizeof(*my_context));
+	EDEB_EX(7, "device=%p ucontext=%p", device, my_context);
+	return &my_context->ib_ucontext;
+}
+
+int ehca_dealloc_ucontext(struct ib_ucontext *context)
+{
+	struct ehca_ucontext *my_context = NULL;
+	EHCA_CHECK_ADR(context);
+	EDEB_EN(7, "ucontext=%p", context);
+	my_context = container_of(context, struct ehca_ucontext, ib_ucontext);
+	kfree(my_context);
+	EDEB_EX(7, "ucontext=%p", context);
+	return 0;
+}
+
+struct page *ehca_nopage(struct vm_area_struct *vma,
+			 unsigned long address, int *type)
+{
+	struct page *mypage = 0;
+	u64 fileoffset = vma->vm_pgoff << PAGE_SHIFT;
+	u32 idr_handle = fileoffset >> 32;
+	u32 q_type = (fileoffset >> 28) & 0xF;	  /* CQ, QP,...        */
+	u32 rsrc_type = (fileoffset >> 24) & 0xF; /* sq,rq,cmnd_window */
+
+	EDEB_EN(7,
+		"vm_start=%lx vm_end=%lx vm_page_prot=%lx vm_fileoff=%lx",
+		vma->vm_start, vma->vm_end, vma->vm_page_prot, fileoffset);
+
+
+	if (q_type == 1) { /* CQ */
+		struct ehca_cq *cq;
+
+		down_read(&ehca_cq_idr_sem);
+		cq = idr_find(&ehca_cq_idr, idr_handle);
+		up_read(&ehca_cq_idr_sem);
+
+		/* make sure this mmap really belongs to the authorized user */
+		if (cq == 0) {
+			EDEB_ERR(4, "cq is NULL ret=NOPAGE_SIGBUS");
+			return NOPAGE_SIGBUS;
+		}
+		if (rsrc_type == 2) {
+			void *vaddr;
+			EDEB(6, "cq=%p cq queuearea", cq);
+			vaddr = address - vma->vm_start
+			    + cq->ehca_cq_core.ipz_queue.queue;
+			EDEB(6, "queue=%p vaddr=%p",
+			     cq->ehca_cq_core.ipz_queue.queue, vaddr);
+			mypage = vmalloc_to_page(vaddr);
+		}
+	} else if (q_type == 2) { /* QP */
+		struct ehca_qp *qp;
+
+		down_read(&ehca_qp_idr_sem);
+		qp = idr_find(&ehca_qp_idr, idr_handle);
+		up_read(&ehca_qp_idr_sem);
+
+		/* make sure this mmap really belongs to the authorized user */
+		if (qp == NULL) {
+			EDEB_ERR(4, "qp is NULL ret=NOPAGE_SIGBUS");
+			return NOPAGE_SIGBUS;
+		}
+		if (rsrc_type == 2) {	/* rqueue */
+			void *vaddr;
+			EDEB(6, "qp=%p qp rqueuearea", qp);
+			vaddr = address - vma->vm_start
+			    + qp->ehca_qp_core.ipz_rqueue.queue;
+			EDEB(6, "rqueue=%p vaddr=%p",
+			     qp->ehca_qp_core.ipz_rqueue.queue, vaddr);
+			mypage = vmalloc_to_page(vaddr);
+		} else if (rsrc_type == 3) {	/* squeue */
+			void *vaddr;
+			EDEB(6, "qp=%p qp squeuearea", qp);
+			vaddr = address - vma->vm_start
+			    + qp->ehca_qp_core.ipz_squeue.queue;
+			EDEB(6, "squeue=%p vaddr=%p",
+			     qp->ehca_qp_core.ipz_squeue.queue, vaddr);
+			mypage = vmalloc_to_page(vaddr);
+		}
+	}
+	if (mypage == 0) {
+		EDEB_ERR(4, "Invalid page adr==NULL ret=NOPAGE_SIGBUS");
+		return NOPAGE_SIGBUS;
+	}
+	get_page(mypage);
+	EDEB_EX(7, "page adr=%p", mypage);
+	return mypage;
+}
+
+static struct vm_operations_struct ehcau_vm_ops = {
+	.nopage = ehca_nopage,
+};
+
+/* TODO: better error output messages !!!
+   NO RETURN WITHOUT ERROR
+ */
+int ehca_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
+{
+	u64 fileoffset = vma->vm_pgoff << PAGE_SHIFT;
+
+
+	u32 idr_handle = fileoffset >> 32;
+	u32 q_type = (fileoffset >> 28) & 0xF;	  /* CQ, QP,...        */
+	u32 rsrc_type = (fileoffset >> 24) & 0xF; /* sq,rq,cmnd_window */
+	int ret = -EFAULT;	/* assume the worst             */
+	u64 vsize = 0;		/* must be calculated/set below */
+	u64 physical = 0;	/* must be calculated/set below */
+
+	EDEB_EN(7, "vm_start=%lx vm_end=%lx vm_page_prot=%lx vm_fileoff=%lx",
+		vma->vm_start, vma->vm_end, vma->vm_page_prot, fileoffset);
+
+	if (q_type == 1) { /* CQ */
+		struct ehca_cq *cq;
+
+		down_read(&ehca_cq_idr_sem);
+		cq = idr_find(&ehca_cq_idr, idr_handle);
+		up_read(&ehca_cq_idr_sem);
+
+		/* make sure this mmap really belongs to the authorized user */
+		if (cq == 0)
+			return -EINVAL;
+		if (cq->ib_cq.uobject == 0)
+			return -EINVAL;
+		if (cq->ib_cq.uobject->context != context)
+			return -EINVAL;
+		if (rsrc_type == 1) {	/* galpa fw handle */
+			EDEB(6, "cq=%p cq triggerarea", cq);
+			vma->vm_flags |= VM_RESERVED;
+			vsize = vma->vm_end - vma->vm_start;
+			if (vsize != 4096) {
+				EDEB_ERR(4, "invalid vsize=%lx",
+					 vma->vm_end - vma->vm_start);
+				ret = -EINVAL;
+				goto mmap_exit0;
+			}
+
+			physical = cq->ehca_cq_core.galpas.user.fw_handle;
+			vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+			vma->vm_flags |= VM_IO | VM_RESERVED;
+
+			EDEB(6, "vsize=%lx physical=%lx", vsize,
+			     physical);
+			ret =
+			    remap_pfn_range(vma, vma->vm_start,
+					    physical >> PAGE_SHIFT, vsize,
+					    vma->vm_page_prot);
+			if (ret != 0) {
+				EDEB_ERR(4,
+					 "Error: remap_pfn_range() returned %x!",
+					 ret);
+				ret = -ENOMEM;
+			}
+			goto mmap_exit0;
+		} else if (rsrc_type == 2) {	/* cq queue_addr */
+			EDEB(6, "cq=%p cq q_addr", cq);
+			/* vma->vm_page_prot =
+			 * pgprot_noncached(vma->vm_page_prot); */
+			vma->vm_flags |= VM_RESERVED;
+			vma->vm_ops = &ehcau_vm_ops;
+			ret = 0;
+			goto mmap_exit0;
+		} else {
+			EDEB_ERR(4, "bad resource type %x", rsrc_type);
+			ret = -EINVAL;
+			goto mmap_exit0;
+		}
+	} else if (q_type == 2) { /* QP */
+		struct ehca_qp *qp;
+
+		down_read(&ehca_qp_idr_sem);
+		qp = idr_find(&ehca_qp_idr, idr_handle);
+		up_read(&ehca_qp_idr_sem);
+
+		/* make sure this mmap really belongs to the authorized user */
+		if (qp == NULL || qp->ib_qp.uobject == NULL ||
+		    qp->ib_qp.uobject->context != context) {
+			EDEB(6, "qp=%p context=%p: not owned by caller",
+			     qp, context);
+			ret = -EINVAL;
+			goto mmap_exit0;
+		}
+		if (rsrc_type == 1) {	/* galpa fw handle */
+			EDEB(6, "qp=%p qp triggerarea", qp);
+			vma->vm_flags |= VM_RESERVED;
+			vsize = vma->vm_end - vma->vm_start;
+			if (vsize != 4096) {
+				EDEB_ERR(4, "invalid vsize=%lx",
+					 vma->vm_end - vma->vm_start);
+				ret = -EINVAL;
+				goto mmap_exit0;
+			}
+
+			physical = qp->ehca_qp_core.galpas.user.fw_handle;
+			vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+			vma->vm_flags |= VM_IO | VM_RESERVED;
+
+			EDEB(6, "vsize=%lx physical=%lx", vsize,
+			     physical);
+			ret =
+			    remap_pfn_range(vma, vma->vm_start,
+					    physical >> PAGE_SHIFT, vsize,
+					    vma->vm_page_prot);
+			if (ret != 0) {
+				EDEB_ERR(4,
+					 "Error: remap_pfn_range() returned %x!",
+					 ret);
+				ret = -ENOMEM;
+			}
+			goto mmap_exit0;
+		} else if (rsrc_type == 2) {	/* qp rqueue_addr */
+			EDEB(6, "qp=%p qp rqueue_addr", qp);
+			vma->vm_flags |= VM_RESERVED;
+			vma->vm_ops = &ehcau_vm_ops;
+			ret = 0;
+			goto mmap_exit0;
+		} else if (rsrc_type == 3) {	/* qp squeue_addr */
+			EDEB(6, "qp=%p qp squeue_addr", qp);
+			vma->vm_flags |= VM_RESERVED;
+			vma->vm_ops = &ehcau_vm_ops;
+			ret = 0;
+			goto mmap_exit0;
+		} else {
+			EDEB_ERR(4, "bad resource type %x",
+				 rsrc_type);
+			ret = -EINVAL;
+			goto mmap_exit0;
+		}
+	} else {
+		EDEB_ERR(4, "bad queue type %x", q_type);
+		ret = -EINVAL;
+		goto mmap_exit0;
+	}
+
+      mmap_exit0:
+	EDEB_EX(7, "ret=%x", ret);
+	return ret;
+}
+
+int ehca_mmap_nopage(u64 foffset, u64 length, void **mapped, struct vm_area_struct **vma)
+{
+	down_write(&current->mm->mmap_sem);
+	*mapped=(void*)
+		do_mmap(NULL,0,
+			length,
+			PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS,
+			foffset);
+	up_write(&current->mm->mmap_sem);
+	if (*mapped) {
+		*vma = find_vma(current->mm,(u64)*mapped);
+		if (*vma) {
+			(*vma)->vm_flags |= VM_RESERVED;
+			(*vma)->vm_ops = &ehcau_vm_ops;
+		} else {
+			EDEB_ERR(4,"couldn't find queue vma queue=%p",
+				 *mapped);
+		}
+	} else {
+		EDEB_ERR(4,"couldn't create mmap length=%lx",length);
+	}
+	EDEB(7,"mapped=%p",*mapped);
+	return 0;
+}
+
+int ehca_mmap_register(u64 physical, void **mapped, struct vm_area_struct **vma)
+{
+	int ret;
+	unsigned long vsize;
+	ehca_mmap_nopage(0,4096,mapped,vma);
+	(*vma)->vm_flags |= VM_RESERVED;
+	vsize = (*vma)->vm_end - (*vma)->vm_start;
+	if (vsize != 4096) {
+		EDEB_ERR(4, "invalid vsize=%lx",
+			 (*vma)->vm_end - (*vma)->vm_start);
+		ret = -EINVAL;
+		return ret;
+	}
+
+	(*vma)->vm_page_prot = pgprot_noncached((*vma)->vm_page_prot);
+	(*vma)->vm_flags |= VM_IO | VM_RESERVED;
+
+	EDEB(6, "vsize=%lx physical=%lx", vsize,
+	     physical);
+	ret =
+		remap_pfn_range((*vma), (*vma)->vm_start,
+				physical >> PAGE_SHIFT, vsize,
+				(*vma)->vm_page_prot);
+	if (ret != 0) {
+		EDEB_ERR(4,
+			 "Error: remap_pfn_range() returned %x!",
+			 ret);
+		ret = -ENOMEM;
+	}
+	return ret;
+
+}
+
+int ehca_munmap(unsigned long addr, size_t len) {
+	int ret=0;
+	struct mm_struct *mm = current->mm;
+	if (mm!=0) {
+		down_write(&mm->mmap_sem);
+		ret = do_munmap(mm, addr, len);
+		up_write(&mm->mmap_sem);
+	}
+	return ret;
+}
+
+/* eof ehca_uverbs.c */

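For reference, ehca_nopage() and ehca_mmap() above both decode the mmap file
offset the same way: the idr handle sits in bits 63..32, the queue type
(1 = CQ, 2 = QP) in bits 31..28, and the resource type (1 = galpa fw handle,
2 = CQ queue or QP receive queue, 3 = QP send queue) in bits 27..24. The
following is a minimal sketch of that convention as a pair of helpers. The
patch open-codes the shifts, so ehca_mmap_encode(), ehca_mmap_decode(),
struct ehca_mmap_key and the enum names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the values the patch uses as bare numbers. */
enum ehca_q_type { EHCA_Q_CQ = 1, EHCA_Q_QP = 2 };
enum ehca_rsrc_type {
	EHCA_RSRC_FW_HANDLE = 1,	/* galpa fw handle */
	EHCA_RSRC_QUEUE = 2,		/* CQ queue or QP receive queue */
	EHCA_RSRC_SQUEUE = 3		/* QP send queue */
};

struct ehca_mmap_key {
	uint32_t idr_handle;
	uint32_t q_type;
	uint32_t rsrc_type;
};

/* Build the 64-bit file offset from its three fields. */
static uint64_t ehca_mmap_encode(uint32_t idr_handle, uint32_t q_type,
				 uint32_t rsrc_type)
{
	assert(q_type <= 0xF && rsrc_type <= 0xF);	/* 4 bits each */
	return ((uint64_t)idr_handle << 32) |
	       ((uint64_t)q_type << 28) |
	       ((uint64_t)rsrc_type << 24);
}

/* Split a file offset the same way ehca_nopage() and ehca_mmap() do. */
static struct ehca_mmap_key ehca_mmap_decode(uint64_t fileoffset)
{
	struct ehca_mmap_key key = {
		.idr_handle = (uint32_t)(fileoffset >> 32),
		.q_type = (uint32_t)((fileoffset >> 28) & 0xF),
		.rsrc_type = (uint32_t)((fileoffset >> 24) & 0xF),
	};

	return key;
}
```

Under these assumptions, ehca_mmap_encode(token, EHCA_Q_CQ, EHCA_RSRC_QUEUE)
gives the same value as the ((u64) token << 32) | 0x12000000 expression that
ehca_create_cq() passes to ehca_mmap_nopage().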

From rolandd at cisco.com  Sat Feb 18 11:57:54 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:57:54 -0800
Subject: [PATCH 19/22] ehca memory regions
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005754.13620.41418.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>

Nearly all the inline functions in ehca_mrmw.h look too big to
be inlined.  Why can't they just be static functions in ehca_mrmw.c?
---
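Note on the question above: one candidate for a plain static function
is the access-flag check that the patch below repeats in
ehca_reg_phys_mr(), ehca_reg_user_mr(), ehca_rereg_phys_mr() and
ehca_alloc_fmr(). A minimal sketch of such a helper follows. The
SKETCH_* names and sketch_access_flags_ok() are hypothetical and not
part of the driver; the enum only stands in for the IB_ACCESS_* bits so
that the example builds outside the kernel.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the IB_ACCESS_* bits; only the
 * relationship between the bits matters for this sketch.
 */
enum sketch_access_flags {
	SKETCH_ACCESS_LOCAL_WRITE   = 1 << 0,
	SKETCH_ACCESS_REMOTE_WRITE  = 1 << 1,
	SKETCH_ACCESS_REMOTE_READ   = 1 << 2,
	SKETCH_ACCESS_REMOTE_ATOMIC = 1 << 3,
};

/*
 * Remote Write Access requires Local Write Access.
 * Remote Atomic Access requires Local Write Access.
 * Returns true when the combination of flags is acceptable.
 */
static bool sketch_access_flags_ok(int flags)
{
	int needs_local_write = SKETCH_ACCESS_REMOTE_WRITE |
				SKETCH_ACCESS_REMOTE_ATOMIC;

	if ((flags & needs_local_write) &&
	    !(flags & SKETCH_ACCESS_LOCAL_WRITE))
		return false;
	return true;
}
```

With a helper of this shape, each entry point could reduce its check to
a single call followed by the existing -EINVAL error path.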

 drivers/infiniband/hw/ehca/ehca_mrmw.c | 1711 ++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/ehca/ehca_mrmw.h |  739 ++++++++++++++
 2 files changed, 2450 insertions(+), 0 deletions(-)
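
For readers checking the arithmetic: the registration paths in
ehca_mrmw.c reject a region that is empty or whose end would wrap past
the top of the 64-bit address space, then count the pages it touches.
The sketch below restates both steps as standalone C. The names
SKETCH_PAGE_SIZE, sketch_range_ok() and sketch_num_pages() are
hypothetical, and the 4k page size is an assumption taken from the
"pagesize currently hardcoded to 4k" comments in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4k pages, as the comments in the patch state. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Mirrors the size/iova check in the patch: an empty region is
 * rejected, and so is one where iova + size would exceed 2^64 - 1.
 */
static bool sketch_range_ok(uint64_t iova, uint64_t size)
{
	return size != 0 && (UINT64_MAX - size) >= iova;
}

/*
 * Number of pages touched by a region of 'size' bytes starting at
 * 'iova': the offset into the first page is added to the size and the
 * sum is rounded up to whole pages. Sizes within two pages of 2^64
 * would make the sum itself wrap; this sketch assumes they do not
 * occur.
 */
static uint64_t sketch_num_pages(uint64_t iova, uint64_t size)
{
	return ((iova % SKETCH_PAGE_SIZE) + size + SKETCH_PAGE_SIZE - 1) /
	       SKETCH_PAGE_SIZE;
}
```

For example, a 2-byte region starting at offset 4095 straddles a page
boundary and therefore counts as two pages, while a 1-byte region at
the same address counts as one.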

diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c
new file mode 100644
index 0000000..d756082
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -0,0 +1,1711 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  MR/MW functions
+ *
+ *  Authors: Dietmar Decker <ddecker at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_mrmw.c,v 1.86 2006/02/07 07:51:13 decker Exp $
+ */
+
+#undef DEB_PREFIX
+#define DEB_PREFIX "mrmw"
+
+#include "ehca_kernel.h"
+#include "ehca_iverbs.h"
+#include "hcp_if.h"
+#include "ehca_mrmw.h"
+
+extern int ehca_use_hp_mr;
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+struct ib_mr *ehca_get_dma_mr(struct ib_pd *pd, int mr_access_flags)
+{
+	struct ib_mr *ib_mr;
+	int retcode = 0;
+	struct ehca_mr *e_maxmr = 0;
+	struct ehca_pd *e_pd;
+	struct ehca_shca *shca;
+
+	EDEB_EN(7, "pd=%p mr_access_flags=%x", pd, mr_access_flags);
+
+	EHCA_CHECK_PD_P(pd);
+	e_pd = container_of(pd, struct ehca_pd, ib_pd);
+	shca = container_of(pd->device, struct ehca_shca, ib_device);
+
+	if (shca->maxmr) {
+		e_maxmr = ehca_mr_new();
+		if (!e_maxmr) {
+			EDEB_ERR(4, "out of memory");
+			ib_mr = ERR_PTR(-ENOMEM);
+			goto get_dma_mr_exit0;
+		}
+
+		retcode = ehca_reg_maxmr(shca, e_maxmr,
+					 (u64 *)KERNELBASE,
+					 mr_access_flags, e_pd,
+					 &e_maxmr->ib.ib_mr.lkey,
+					 &e_maxmr->ib.ib_mr.rkey);
+		if (retcode != 0) {
+			ib_mr = ERR_PTR(retcode);
+			goto get_dma_mr_exit0;
+		}
+		ib_mr = &e_maxmr->ib.ib_mr;
+	} else {
+		EDEB_ERR(4, "no internal max-MR exists!");
+		ib_mr = ERR_PTR(-EINVAL);
+		goto get_dma_mr_exit0;
+	}
+
+      get_dma_mr_exit0:
+	if (IS_ERR(ib_mr) == 0)
+		EDEB_EX(7, "ib_mr=%p lkey=%x rkey=%x",
+			ib_mr, ib_mr->lkey, ib_mr->rkey);
+	else
+		EDEB_EX(4, "rc=%lx pd=%p mr_access_flags=%x ",
+			PTR_ERR(ib_mr), pd, mr_access_flags);
+	return (ib_mr);
+} /* end ehca_get_dma_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+struct ib_mr *ehca_reg_phys_mr(struct ib_pd *pd,
+			       struct ib_phys_buf *phys_buf_array,
+			       int num_phys_buf,
+			       int mr_access_flags,
+			       u64 *iova_start)
+{
+	struct ib_mr *ib_mr = 0;
+	int retcode = 0;
+	struct ehca_mr *e_mr = 0;
+	struct ehca_shca *shca = 0;
+	struct ehca_pd *e_pd = 0;
+	u64 size = 0;
+	struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0};
+	u32 num_pages_mr = 0;
+
+	EDEB_EN(7, "pd=%p phys_buf_array=%p num_phys_buf=%x "
+		"mr_access_flags=%x iova_start=%p", pd, phys_buf_array,
+		num_phys_buf, mr_access_flags, iova_start);
+
+	EHCA_CHECK_PD_P(pd);
+	if ((num_phys_buf <= 0) || ehca_adr_bad(phys_buf_array)) {
+		EDEB_ERR(4, "bad input values: num_phys_buf=%x "
+			 "phys_buf_array=%p", num_phys_buf, phys_buf_array);
+		ib_mr = ERR_PTR(-EINVAL);
+		goto reg_phys_mr_exit0;
+	}
+	if (((mr_access_flags & IB_ACCESS_REMOTE_WRITE) &&
+	     !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)) ||
+	    ((mr_access_flags & IB_ACCESS_REMOTE_ATOMIC) &&
+	     !(mr_access_flags & IB_ACCESS_LOCAL_WRITE))) {
+		/* Remote Write Access requires Local Write Access */
+		/* Remote Atomic Access requires Local Write Access */
+		EDEB_ERR(4, "bad input values: mr_access_flags=%x",
+			 mr_access_flags);
+		ib_mr = ERR_PTR(-EINVAL);
+		goto reg_phys_mr_exit0;
+	}
+
+	/* check physical buffer list and calculate size */
+	retcode = ehca_mr_chk_buf_and_calc_size(phys_buf_array, num_phys_buf,
+						iova_start, &size);
+	if (retcode != 0) {
+		ib_mr = ERR_PTR(retcode);
+		goto reg_phys_mr_exit0;
+	}
+	if ((size == 0) ||
+	    ((0xFFFFFFFFFFFFFFFF - size) < (u64)iova_start)) {
+		EDEB_ERR(4, "bad input values: size=%lx iova_start=%p",
+			 size, iova_start);
+		ib_mr = ERR_PTR(-EINVAL);
+		goto reg_phys_mr_exit0;
+	}
+
+	e_pd = container_of(pd, struct ehca_pd, ib_pd);
+	shca = container_of(pd->device, struct ehca_shca, ib_device);
+
+	e_mr = ehca_mr_new();
+	if (!e_mr) {
+		EDEB_ERR(4, "out of memory");
+		ib_mr = ERR_PTR(-ENOMEM);
+		goto reg_phys_mr_exit0;
+	}
+
+	/* determine number of MR pages */
+	/* pagesize currently hardcoded to 4k ... TODO.. */
+	num_pages_mr =
+	    ((((u64)iova_start % PAGE_SIZE) + size +
+	      PAGE_SIZE - 1) / PAGE_SIZE);
+
+	/* register MR on HCA */
+	if (ehca_mr_is_maxmr(size, iova_start)) {
+		e_mr->flags |= EHCA_MR_FLAG_MAXMR;
+		retcode = ehca_reg_maxmr(shca, e_mr, iova_start,
+					 mr_access_flags, e_pd,
+					 &e_mr->ib.ib_mr.lkey,
+					 &e_mr->ib.ib_mr.rkey);
+		if (retcode != 0) {
+			ib_mr = ERR_PTR(retcode);
+			goto reg_phys_mr_exit1;
+		}
+	} else {
+		pginfo.type           = EHCA_MR_PGI_PHYS;
+		pginfo.num_pages      = num_pages_mr;
+		pginfo.num_phys_buf   = num_phys_buf;
+		pginfo.phys_buf_array = phys_buf_array;
+
+		retcode = ehca_reg_mr(shca, e_mr, iova_start, size,
+				      mr_access_flags, e_pd, &pginfo,
+				      &e_mr->ib.ib_mr.lkey,
+				      &e_mr->ib.ib_mr.rkey);
+		if (retcode != 0) {
+			ib_mr = ERR_PTR(retcode);
+			goto reg_phys_mr_exit1;
+		}
+	}
+
+	/* successful registration of all pages */
+	ib_mr = &e_mr->ib.ib_mr;
+	goto reg_phys_mr_exit0;
+
+      reg_phys_mr_exit1:
+	ehca_mr_delete(e_mr);
+      reg_phys_mr_exit0:
+	if (IS_ERR(ib_mr) == 0)
+		EDEB_EX(7, "ib_mr=%p lkey=%x rkey=%x",
+			ib_mr, ib_mr->lkey, ib_mr->rkey);
+	else
+		EDEB_EX(4, "rc=%lx pd=%p phys_buf_array=%p "
+			"num_phys_buf=%x mr_access_flags=%x iova_start=%p",
+			PTR_ERR(ib_mr), pd, phys_buf_array,
+			num_phys_buf, mr_access_flags, iova_start);
+	return (ib_mr);
+} /* end ehca_reg_phys_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd,
+			       struct ib_umem *region,
+			       int mr_access_flags,
+			       struct ib_udata *udata)
+{
+	struct ib_mr *ib_mr = 0;
+	struct ehca_mr *e_mr = 0;
+	struct ehca_shca *shca = 0;
+	struct ehca_pd *e_pd = 0;
+	struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0};
+	int retcode = 0;
+	u32 num_pages_mr = 0;
+
+	EDEB_EN(7, "pd=%p region=%p mr_access_flags=%x udata=%p",
+		pd, region, mr_access_flags, udata);
+
+	EHCA_CHECK_PD_P(pd);
+	if (ehca_adr_bad(region)) {
+		EDEB_ERR(4, "bad input values: region=%p", region);
+		ib_mr = ERR_PTR(-EINVAL);
+		goto reg_user_mr_exit0;
+	}
+	if (((mr_access_flags & IB_ACCESS_REMOTE_WRITE) &&
+	     !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)) ||
+	    ((mr_access_flags & IB_ACCESS_REMOTE_ATOMIC) &&
+	     !(mr_access_flags & IB_ACCESS_LOCAL_WRITE))) {
+		/* Remote Write Access requires Local Write Access */
+		/* Remote Atomic Access requires Local Write Access */
+		EDEB_ERR(4, "bad input values: mr_access_flags=%x",
+			 mr_access_flags);
+		ib_mr = ERR_PTR(-EINVAL);
+		goto reg_user_mr_exit0;
+	}
+	EDEB(7, "user_base=%lx virt_base=%lx length=%lx offset=%x page_size=%x "
+	     "chunk_list.next=%p",
+	     region->user_base, region->virt_base, region->length,
+	     region->offset, region->page_size, region->chunk_list.next);
+	if (region->page_size != PAGE_SIZE) {
+		/* @TODO large page support */
+		EDEB_ERR(4, "large pages not supported, region->page_size=%x",
+			 region->page_size);
+		ib_mr = ERR_PTR(-EINVAL);
+		goto reg_user_mr_exit0;
+	}
+
+	if ((region->length == 0) ||
+	    ((0xFFFFFFFFFFFFFFFF - region->length) < region->virt_base)) {
+		EDEB_ERR(4, "bad input values: length=%lx virt_base=%lx",
+			 region->length, region->virt_base);
+		ib_mr = ERR_PTR(-EINVAL);
+		goto reg_user_mr_exit0;
+	}
+
+	e_pd = container_of(pd, struct ehca_pd, ib_pd);
+	shca = container_of(pd->device, struct ehca_shca, ib_device);
+
+	e_mr = ehca_mr_new();
+	if (!e_mr) {
+		EDEB_ERR(4, "out of memory");
+		ib_mr = ERR_PTR(-ENOMEM);
+		goto reg_user_mr_exit0;
+	}
+
+	/* determine number of MR pages */
+	/* pagesize currently hardcoded to 4k ...TODO... */
+	num_pages_mr =
+	    (((region->virt_base % PAGE_SIZE) + region->length +
+	      PAGE_SIZE - 1) / PAGE_SIZE);
+
+	/* register MR on HCA */
+	pginfo.type       = EHCA_MR_PGI_USER;
+	pginfo.num_pages  = num_pages_mr;
+	pginfo.region     = region;
+	pginfo.next_chunk = list_prepare_entry(pginfo.next_chunk,
+					       (&region->chunk_list),
+					       list);
+
+	retcode = ehca_reg_mr(shca, e_mr, (u64 *)region->virt_base,
+			      region->length, mr_access_flags, e_pd, &pginfo,
+			      &e_mr->ib.ib_mr.lkey, &e_mr->ib.ib_mr.rkey);
+	if (retcode != 0) {
+		ib_mr = ERR_PTR(retcode);
+		goto reg_user_mr_exit1;
+	}
+
+	/* successful registration of all pages */
+	ib_mr = &e_mr->ib.ib_mr;
+	goto reg_user_mr_exit0;
+
+      reg_user_mr_exit1:
+	ehca_mr_delete(e_mr);
+      reg_user_mr_exit0:
+	if (IS_ERR(ib_mr) == 0)
+		EDEB_EX(7, "ib_mr=%p lkey=%x rkey=%x",
+			ib_mr, ib_mr->lkey, ib_mr->rkey);
+	else
+		EDEB_EX(4, "rc=%lx pd=%p region=%p mr_access_flags=%x "
+			"udata=%p",
+			PTR_ERR(ib_mr), pd, region, mr_access_flags, udata);
+	return (ib_mr);
+} /* end ehca_reg_user_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_rereg_phys_mr(struct ib_mr *mr,
+		       int mr_rereg_mask,
+		       struct ib_pd *pd,
+		       struct ib_phys_buf *phys_buf_array,
+		       int num_phys_buf,
+		       int mr_access_flags,
+		       u64 *iova_start)
+{
+	int retcode = 0;
+	struct ehca_shca *shca = 0;
+	struct ehca_mr *e_mr = 0;
+	u64 new_size = 0;
+	u64 *new_start = 0;
+	u32 new_acl = 0;
+	struct ehca_pd *new_pd = 0;
+	u32 tmp_lkey = 0;
+	u32 tmp_rkey = 0;
+	unsigned long sl_flags;
+	u64 num_pages_mr = 0;
+	struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0};
+
+	EDEB_EN(7, "mr=%p mr_rereg_mask=%x pd=%p phys_buf_array=%p "
+		"num_phys_buf=%x mr_access_flags=%x iova_start=%p",
+		mr, mr_rereg_mask, pd, phys_buf_array, num_phys_buf,
+		mr_access_flags, iova_start);
+
+	if (!(mr_rereg_mask & IB_MR_REREG_TRANS)) {
+		/*@TODO not supported, because PHYP rereg hCall needs pages*/
+		/*@TODO: We will follow this with Tom ....*/
+		EDEB_ERR(4, "rereg without IB_MR_REREG_TRANS not supported yet,"
+			 " mr_rereg_mask=%x", mr_rereg_mask);
+		retcode = -EINVAL;
+		goto rereg_phys_mr_exit0;
+	}
+
+	EHCA_CHECK_MR(mr);
+	e_mr = container_of(mr, struct ehca_mr, ib.ib_mr);
+	if (mr_rereg_mask & IB_MR_REREG_PD) {
+		EHCA_CHECK_PD(pd);
+	}
+
+	if ((mr_rereg_mask &
+	     ~(IB_MR_REREG_TRANS | IB_MR_REREG_PD | IB_MR_REREG_ACCESS)) ||
+	    (mr_rereg_mask == 0)) {
+		retcode = -EINVAL;
+		goto rereg_phys_mr_exit0;
+	}
+
+	shca = container_of(mr->device, struct ehca_shca, ib_device);
+
+	/* check other parameters */
+	if (e_mr == shca->maxmr) {
+		/* should be impossible, however reject to be sure */
+		EDEB_ERR(3, "rereg internal max-MR impossible, mr=%p "
+			 "shca->maxmr=%p mr->lkey=%x",
+			 mr, shca->maxmr, mr->lkey);
+		retcode = -EINVAL;
+		goto rereg_phys_mr_exit0;
+	}
+	if (mr_rereg_mask & IB_MR_REREG_TRANS) { /* transl., i.e. addr/size */
+		if (e_mr->flags & EHCA_MR_FLAG_FMR) {
+			EDEB_ERR(4, "not supported for FMR, mr=%p flags=%x",
+				 mr, e_mr->flags);
+			retcode = -EINVAL;
+			goto rereg_phys_mr_exit0;
+		}
+		if (ehca_adr_bad(phys_buf_array) || num_phys_buf <= 0) {
+			EDEB_ERR(4, "bad input values: mr_rereg_mask=%x "
+				 "phys_buf_array=%p num_phys_buf=%x",
+				 mr_rereg_mask, phys_buf_array, num_phys_buf);
+			retcode = -EINVAL;
+			goto rereg_phys_mr_exit0;
+		}
+	}
+	if ((mr_rereg_mask & IB_MR_REREG_ACCESS) &&	/* change ACL */
+	    (((mr_access_flags & IB_ACCESS_REMOTE_WRITE) &&
+	      !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)) ||
+	     ((mr_access_flags & IB_ACCESS_REMOTE_ATOMIC) &&
+	      !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)))) {
+		/* Remote Write Access requires Local Write Access */
+		/* Remote Atomic Access requires Local Write Access */
+		EDEB_ERR(4, "bad input values: mr_rereg_mask=%x "
+			 "mr_access_flags=%x", mr_rereg_mask, mr_access_flags);
+		retcode = -EINVAL;
+		goto rereg_phys_mr_exit0;
+	}
+
+	/* set requested values dependent on rereg request */
+	spin_lock_irqsave(&e_mr->mrlock, sl_flags); /* get lock @TODO for MR*/
+	new_start = e_mr->start;  /* new == old address */
+	new_size  = e_mr->size;	  /* new == old length */
+	new_acl   = e_mr->acl;	  /* new == old access control */
+	new_pd    = container_of(mr->pd,struct ehca_pd,ib_pd); /*new == old PD*/
+
+	if (mr_rereg_mask & IB_MR_REREG_TRANS) {
+		new_start = iova_start;	/* change address */
+		/* check physical buffer list and calculate size */
+		retcode = ehca_mr_chk_buf_and_calc_size(phys_buf_array,
+							num_phys_buf,
+							iova_start, &new_size);
+		if (retcode != 0)
+			goto rereg_phys_mr_exit1;
+		if ((new_size == 0) ||
+		    ((0xFFFFFFFFFFFFFFFF - new_size) < (u64)iova_start)) {
+			EDEB_ERR(4, "bad input values: new_size=%lx "
+				 "iova_start=%p", new_size, iova_start);
+			retcode = -EINVAL;
+			goto rereg_phys_mr_exit1;
+		}
+		num_pages_mr = ((((u64)new_start % PAGE_SIZE) +
+				 new_size + PAGE_SIZE - 1) / PAGE_SIZE);
+		pginfo.type           = EHCA_MR_PGI_PHYS;
+		pginfo.num_pages      = num_pages_mr;
+		pginfo.num_phys_buf   = num_phys_buf;
+		pginfo.phys_buf_array = phys_buf_array;
+	}
+	if (mr_rereg_mask & IB_MR_REREG_ACCESS)
+		new_acl = mr_access_flags;
+	if (mr_rereg_mask & IB_MR_REREG_PD)
+		new_pd = container_of(pd, struct ehca_pd, ib_pd);
+
+	EDEB(7, "mr=%p new_start=%p new_size=%lx new_acl=%x new_pd=%p "
+	     "num_pages_mr=%lx",
+	     e_mr, new_start, new_size, new_acl, new_pd, num_pages_mr);
+
+	retcode = ehca_rereg_mr(shca, e_mr, new_start, new_size, new_acl,
+				new_pd, &pginfo, &tmp_lkey, &tmp_rkey);
+	if (retcode != 0)
+		goto rereg_phys_mr_exit1;
+
+	/* successful reregistration */
+	if (mr_rereg_mask & IB_MR_REREG_PD)
+		mr->pd = pd;
+	mr->lkey = tmp_lkey;
+	mr->rkey = tmp_rkey;
+
+      rereg_phys_mr_exit1:
+	spin_unlock_irqrestore(&e_mr->mrlock, sl_flags); /* free spin lock */
+      rereg_phys_mr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "mr=%p mr_rereg_mask=%x pd=%p phys_buf_array=%p "
+			"num_phys_buf=%x mr_access_flags=%x iova_start=%p",
+			mr, mr_rereg_mask, pd, phys_buf_array, num_phys_buf,
+			mr_access_flags, iova_start);
+	else
+		EDEB_EX(4, "retcode=%x mr=%p mr_rereg_mask=%x pd=%p "
+			"phys_buf_array=%p num_phys_buf=%x mr_access_flags=%x "
+			"iova_start=%p",
+			retcode, mr, mr_rereg_mask, pd, phys_buf_array,
+			num_phys_buf, mr_access_flags, iova_start);
+
+	return (retcode);
+} /* end ehca_rereg_phys_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_query_mr(struct ib_mr *mr, struct ib_mr_attr *mr_attr)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_shca *shca = 0;
+	struct ehca_mr *e_mr = 0;
+	struct ipz_pd fwpd;		/* Firmware PD */
+	u32 access_ctrl = 0;
+	u64 tmp_remote_size = 0;
+	u64 tmp_remote_len = 0;
+
+	unsigned long sl_flags;
+
+	EDEB_EN(7, "mr=%p mr_attr=%p", mr, mr_attr);
+
+	EHCA_CHECK_MR(mr);
+	e_mr = container_of(mr, struct ehca_mr, ib.ib_mr);
+	if (ehca_adr_bad(mr_attr)) {
+		EDEB_ERR(4, "bad input values: mr_attr=%p", mr_attr);
+		retcode = -EINVAL;
+		goto query_mr_exit0;
+	}
+	if ((e_mr->flags & EHCA_MR_FLAG_FMR)) {
+		EDEB_ERR(4, "not supported for FMR, mr=%p e_mr=%p "
+			 "e_mr->flags=%x", mr, e_mr, e_mr->flags);
+		retcode = -EINVAL;
+		goto query_mr_exit0;
+	}
+
+	shca = container_of(mr->device, struct ehca_shca, ib_device);
+	memset(mr_attr, 0, sizeof(struct ib_mr_attr));
+	spin_lock_irqsave(&e_mr->mrlock, sl_flags); /* get spin lock @TODO?? */
+
+	rc = hipz_h_query_mr(shca->ipz_hca_handle, &e_mr->pf,
+			     &e_mr->ipz_mr_handle, &mr_attr->size,
+			     &mr_attr->device_virt_addr, &tmp_remote_size,
+			     &tmp_remote_len, &access_ctrl, &fwpd,
+			     &mr_attr->lkey, &mr_attr->rkey);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "hipz_mr_query failed, rc=%lx mr=%p "
+			 "hca_hndl=%lx mr_hndl=%lx lkey=%x",
+			 rc, mr, shca->ipz_hca_handle.handle,
+			 e_mr->ipz_mr_handle.handle, mr->lkey);
+		retcode = ehca_mrmw_map_rc_query_mr(rc);
+		goto query_mr_exit1;
+	}
+	ehca_mrmw_reverse_map_acl(&access_ctrl, &mr_attr->mr_access_flags);
+	mr_attr->pd = mr->pd;
+
+      query_mr_exit1:
+	spin_unlock_irqrestore(&e_mr->mrlock, sl_flags); /* free spin lock */
+      query_mr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "pd=%p device_virt_addr=%lx size=%lx "
+			"mr_access_flags=%x lkey=%x rkey=%x",
+			mr_attr->pd, mr_attr->device_virt_addr,
+			mr_attr->size, mr_attr->mr_access_flags,
+			mr_attr->lkey, mr_attr->rkey);
+	else
+		EDEB_EX(4, "retcode=%x mr=%p mr_attr=%p", retcode, mr, mr_attr);
+	return (retcode);
+} /* end ehca_query_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_dereg_mr(struct ib_mr *mr)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_shca *shca = 0;
+	struct ehca_mr *e_mr = 0;
+
+	EDEB_EN(7, "mr=%p", mr);
+
+	EHCA_CHECK_MR(mr);
+	e_mr = container_of(mr, struct ehca_mr, ib.ib_mr);
+	shca = container_of(mr->device, struct ehca_shca, ib_device);
+
+	if ((e_mr->flags & EHCA_MR_FLAG_FMR)) {
+		EDEB_ERR(4, "not supported for FMR, mr=%p e_mr=%p "
+			 "e_mr->flags=%x", mr, e_mr, e_mr->flags);
+		retcode = -EINVAL;
+		goto dereg_mr_exit0;
+	} else if (e_mr == shca->maxmr) {
+		/* should be impossible, however reject to be sure */
+		EDEB_ERR(3, "dereg internal max-MR impossible, mr=%p "
+			 "shca->maxmr=%p mr->lkey=%x",
+			 mr, shca->maxmr, mr->lkey);
+		retcode = -EINVAL;
+		goto dereg_mr_exit0;
+	}
+
+	/*@TODO: BUSY: MR still has bound window(s) */
+	rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, &e_mr->pf,
+				     &e_mr->ipz_mr_handle);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "hipz_free_mr failed, rc=%lx shca=%p e_mr=%p"
+			 " hca_hndl=%lx mr_hndl=%lx mr->lkey=%x",
+			 rc, shca, e_mr, shca->ipz_hca_handle.handle,
+			 e_mr->ipz_mr_handle.handle, mr->lkey);
+		retcode = ehca_mrmw_map_rc_free_mr(rc);
+		goto dereg_mr_exit0;
+	}
+
+	/* successful deregistration */
+	ehca_mr_delete(e_mr);
+
+      dereg_mr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "");
+	else
+		EDEB_EX(4, "retcode=%x mr=%p", retcode, mr);
+	return (retcode);
+} /* end ehca_dereg_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+struct ib_mw *ehca_alloc_mw(struct ib_pd *pd)
+{
+	struct ib_mw *ib_mw = 0;
+	u64 rc = H_Success;
+	struct ehca_shca *shca = 0;
+	struct ehca_mw *e_mw = 0;
+	struct ehca_pd *e_pd = 0;
+
+	EDEB_EN(7, "pd=%p", pd);
+
+	EHCA_CHECK_PD_P(pd);
+	e_pd = container_of(pd, struct ehca_pd, ib_pd);
+	shca = container_of(pd->device, struct ehca_shca, ib_device);
+
+	e_mw = ehca_mw_new();
+	if (!e_mw) {
+		ib_mw = ERR_PTR(-ENOMEM);
+		goto alloc_mw_exit0;
+	}
+
+	rc = hipz_h_alloc_resource_mw(shca->ipz_hca_handle, &e_mw->pf,
+				      &shca->pf, e_pd->fw_pd,
+				      &e_mw->ipz_mw_handle, &e_mw->ib_mw.rkey);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "hipz_mw_allocate failed, rc=%lx shca=%p "
+			 "hca_hndl=%lx mw=%p", rc, shca,
+			 shca->ipz_hca_handle.handle, e_mw);
+		ib_mw = ERR_PTR(ehca_mrmw_map_rc_alloc(rc));
+		goto alloc_mw_exit1;
+	}
+	/* save R_Key in local copy */
+	/*@TODO?????    mw->rkey = *rkey_p; */
+
+	/* successful MW allocation */
+	ib_mw = &e_mw->ib_mw;
+	goto alloc_mw_exit0;
+
+      alloc_mw_exit1:
+	ehca_mw_delete(e_mw);
+      alloc_mw_exit0:
+	if (IS_ERR(ib_mw) == 0)
+		EDEB_EX(7, "ib_mw=%p rkey=%x", ib_mw, ib_mw->rkey);
+	else
+		EDEB_EX(4, "rc=%lx pd=%p", PTR_ERR(ib_mw), pd);
+	return (ib_mw);
+} /* end ehca_alloc_mw() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_bind_mw(struct ib_qp *qp,
+		 struct ib_mw *mw,
+		 struct ib_mw_bind *mw_bind)
+{
+	int retcode = 0;
+
+	/*@TODO: not supported up to now */
+	EDEB_ERR(4, "bind MW currently not supported by HCAD");
+	retcode = -EPERM;
+	goto bind_mw_exit0;
+
+      bind_mw_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "qp=%p mw=%p mw_bind=%p", qp, mw, mw_bind);
+	else
+		EDEB_EX(4, "rc=%x qp=%p mw=%p mw_bind=%p",
+			retcode, qp, mw, mw_bind);
+	return (retcode);
+} /* end ehca_bind_mw() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_dealloc_mw(struct ib_mw *mw)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_shca *shca = 0;
+	struct ehca_mw *e_mw = 0;
+
+	EDEB_EN(7, "mw=%p", mw);
+
+	EHCA_CHECK_MW(mw);
+	e_mw = container_of(mw, struct ehca_mw, ib_mw);
+	shca = container_of(mw->device, struct ehca_shca, ib_device);
+
+	rc = hipz_h_free_resource_mw(shca->ipz_hca_handle, &e_mw->pf,
+				     &e_mw->ipz_mw_handle);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "hipz_free_mw failed, rc=%lx shca=%p mw=%p "
+			 "rkey=%x hca_hndl=%lx mw_hndl=%lx",
+			 rc, shca, mw, mw->rkey, shca->ipz_hca_handle.handle,
+			 e_mw->ipz_mw_handle.handle);
+		retcode = ehca_mrmw_map_rc_free_mw(rc);
+		goto dealloc_mw_exit0;
+	}
+	/* successful deallocation */
+	ehca_mw_delete(e_mw);
+
+      dealloc_mw_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "");
+	else
+		EDEB_EX(4, "retcode=%x mw=%p", retcode, mw);
+	return (retcode);
+} /* end ehca_dealloc_mw() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+struct ib_fmr *ehca_alloc_fmr(struct ib_pd *pd,
+			      int mr_access_flags,
+			      struct ib_fmr_attr *fmr_attr)
+{
+	struct ib_fmr *ib_fmr = 0;
+	struct ehca_shca *shca = 0;
+	struct ehca_mr *e_fmr = 0;
+	int retcode = 0;
+	struct ehca_pd *e_pd = 0;
+	u32 tmp_lkey = 0;
+	u32 tmp_rkey = 0;
+	struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0};
+
+	EDEB_EN(7, "pd=%p mr_access_flags=%x fmr_attr=%p",
+		pd, mr_access_flags, fmr_attr);
+
+	EHCA_CHECK_PD_P(pd);
+	if (ehca_adr_bad(fmr_attr)) {
+		EDEB_ERR(4, "bad input values: fmr_attr=%p", fmr_attr);
+		ib_fmr = ERR_PTR(-EINVAL);
+		goto alloc_fmr_exit0;
+	}
+
+	EDEB(7, "max_pages=%x max_maps=%x page_shift=%x",
+	     fmr_attr->max_pages, fmr_attr->max_maps, fmr_attr->page_shift);
+
+	/* check other parameters */
+	if (((mr_access_flags & IB_ACCESS_REMOTE_WRITE) &&
+	     !(mr_access_flags & IB_ACCESS_LOCAL_WRITE)) ||
+	    ((mr_access_flags & IB_ACCESS_REMOTE_ATOMIC) &&
+	     !(mr_access_flags & IB_ACCESS_LOCAL_WRITE))) {
+		/* Remote Write Access requires Local Write Access */
+		/* Remote Atomic Access requires Local Write Access */
+		EDEB_ERR(4, "bad input values: mr_access_flags=%x",
+			 mr_access_flags);
+		ib_fmr = ERR_PTR(-EINVAL);
+		goto alloc_fmr_exit0;
+	}
+	if (mr_access_flags & IB_ACCESS_MW_BIND) {
+		EDEB_ERR(4, "bad input values: mr_access_flags=%x",
+			 mr_access_flags);
+		ib_fmr = ERR_PTR(-EINVAL);
+		goto alloc_fmr_exit0;
+	}
+	if ((fmr_attr->max_pages == 0) || (fmr_attr->max_maps == 0)) {
+		EDEB_ERR(4, "bad input values: fmr_attr->max_pages=%x "
+			 "fmr_attr->max_maps=%x fmr_attr->page_shift=%x",
+			 fmr_attr->max_pages, fmr_attr->max_maps,
+			 fmr_attr->page_shift);
+		ib_fmr = ERR_PTR(-EINVAL);
+		goto alloc_fmr_exit0;
+	}
+	if ((1 << fmr_attr->page_shift) != PAGE_SIZE) {
+		/* pagesize currently hardcoded to 4k ... */
+		EDEB_ERR(4, "unsupported fmr_attr->page_shift=%x",
+			 fmr_attr->page_shift);
+		ib_fmr = ERR_PTR(-EINVAL);
+		goto alloc_fmr_exit0;
+	}
+
+	e_pd = container_of(pd, struct ehca_pd, ib_pd);
+	shca = container_of(pd->device, struct ehca_shca, ib_device);
+
+	e_fmr = ehca_mr_new();
+	if (e_fmr == 0) {
+		ib_fmr = ERR_PTR(-ENOMEM);
+		goto alloc_fmr_exit0;
+	}
+	e_fmr->flags |= EHCA_MR_FLAG_FMR;
+
+	/* register MR on HCA */
+	retcode = ehca_reg_mr(shca, e_fmr, 0,
+			      fmr_attr->max_pages * PAGE_SIZE,
+			      mr_access_flags, e_pd, &pginfo,
+			      &tmp_lkey, &tmp_rkey);
+	if (retcode != 0) {
+		ib_fmr = ERR_PTR(retcode);
+		goto alloc_fmr_exit1;
+	}
+
+	/* successful registration of all pages */
+	e_fmr->fmr_page_size = 1 << fmr_attr->page_shift;
+	e_fmr->fmr_max_pages = fmr_attr->max_pages; /* pagesize hardcoded 4k */
+	e_fmr->fmr_max_maps = fmr_attr->max_maps;
+	e_fmr->fmr_map_cnt = 0;
+	ib_fmr = &e_fmr->ib.ib_fmr;
+	goto alloc_fmr_exit0;
+
+      alloc_fmr_exit1:
+	ehca_mr_delete(e_fmr);
+      alloc_fmr_exit0:
+	if (IS_ERR(ib_fmr) == 0)
+		EDEB_EX(7, "ib_fmr=%p tmp_lkey=%x tmp_rkey=%x",
+			ib_fmr, tmp_lkey, tmp_rkey);
+	else
+		EDEB_EX(4, "rc=%lx pd=%p mr_access_flags=%x "
+			"fmr_attr=%p", PTR_ERR(ib_fmr), pd,
+			mr_access_flags, fmr_attr);
+	return (ib_fmr);
+} /* end ehca_alloc_fmr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_map_phys_fmr(struct ib_fmr *fmr,
+		      u64 *page_list,
+		      int list_len,
+		      u64 iova)
+{
+	int retcode = 0;
+	struct ehca_shca *shca = 0;
+	struct ehca_mr *e_fmr = 0;
+	struct ehca_pd *e_pd = 0;
+	struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0};
+	u32 tmp_lkey = 0;
+	u32 tmp_rkey = 0;
+	/*@TODO unsigned long sl_flags; */
+
+	EDEB_EN(7, "fmr=%p page_list=%p list_len=%x iova=%lx",
+		fmr, page_list, list_len, iova);
+
+	EHCA_CHECK_FMR(fmr);
+	e_fmr = container_of(fmr, struct ehca_mr, ib.ib_fmr);
+	shca = container_of(fmr->device, struct ehca_shca, ib_device);
+	e_pd = container_of(fmr->pd, struct ehca_pd, ib_pd);
+
+	if (!(e_fmr->flags & EHCA_MR_FLAG_FMR)) {
+		EDEB_ERR(4, "not a FMR, e_fmr=%p e_fmr->flags=%x",
+			 e_fmr, e_fmr->flags);
+		retcode = -EINVAL;
+		goto map_phys_fmr_exit0;
+	}
+	retcode = ehca_fmr_check_page_list(e_fmr, page_list, list_len);
+	if (retcode != 0)
+		goto map_phys_fmr_exit0;
+	if (iova % PAGE_SIZE) {
+		/* only whole-numbered pages */
+		EDEB_ERR(4, "bad iova, iova=%lx", iova);
+		retcode = -EINVAL;
+		goto map_phys_fmr_exit0;
+	}
+	if (e_fmr->fmr_map_cnt >= e_fmr->fmr_max_maps) {
+		/* HCAD does not limit the maps, however trace this anyway */
+		EDEB(6, "map limit exceeded, fmr=%p e_fmr->fmr_map_cnt=%x "
+		     "e_fmr->fmr_max_maps=%x",
+		     fmr, e_fmr->fmr_map_cnt, e_fmr->fmr_max_maps);
+	}
+
+	pginfo.type      = EHCA_MR_PGI_FMR;
+	pginfo.num_pages = list_len;
+	pginfo.page_list = page_list;
+
+	/* @TODO spin_lock_irqsave(&e_fmr->mrlock, sl_flags); */
+
+	retcode = ehca_rereg_mr(shca, e_fmr, (u64 *)iova,
+				list_len * PAGE_SIZE,
+				e_fmr->acl, e_pd, &pginfo,
+				&tmp_lkey, &tmp_rkey);
+	if (retcode != 0) {
+		/* @TODO spin_unlock_irqrestore(&fmr->mrlock, sl_flags); */
+		goto map_phys_fmr_exit0;
+	}
+	/* successful reregistration */
+	e_fmr->fmr_map_cnt++;
+	/* @TODO spin_unlock_irqrestore(&fmr->mrlock, sl_flags); */
+
+	e_fmr->ib.ib_fmr.lkey = tmp_lkey;
+	e_fmr->ib.ib_fmr.rkey = tmp_rkey;
+
+      map_phys_fmr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "lkey=%x rkey=%x",
+			e_fmr->ib.ib_fmr.lkey, e_fmr->ib.ib_fmr.rkey);
+	else
+		EDEB_EX(4, "retcode=%x fmr=%p page_list=%p list_len=%x  "
+			"iova=%lx",
+			retcode, fmr, page_list, list_len, iova);
+	return (retcode);
+} /* end ehca_map_phys_fmr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_unmap_fmr(struct list_head *fmr_list)
+{
+	int retcode = 0;
+	struct ib_fmr *ib_fmr;
+	struct ehca_shca *shca = 0;
+	struct ehca_shca *prev_shca = 0;
+	struct ehca_mr *e_fmr = 0;
+	u32 num_fmr = 0;
+	u32 unmap_fmr_cnt = 0;
+	/* @TODO unsigned long sl_flags; */
+
+	EDEB_EN(7, "fmr_list=%p", fmr_list);
+
+	/* check all FMR belong to same SHCA, and check internal flag */
+	list_for_each_entry(ib_fmr, fmr_list, list) {
+		prev_shca = shca;
+		shca = container_of(ib_fmr->device, struct ehca_shca,
+				    ib_device);
+		EHCA_CHECK_FMR(ib_fmr);
+		e_fmr = container_of(ib_fmr, struct ehca_mr, ib.ib_fmr);
+		if ((shca != prev_shca) && (prev_shca != 0)) {
+			EDEB_ERR(4, "SHCA mismatch, shca=%p prev_shca=%p "
+				 "e_fmr=%p", shca, prev_shca, e_fmr);
+			retcode = -EINVAL;
+			goto unmap_fmr_exit0;
+		}
+		if (!(e_fmr->flags & EHCA_MR_FLAG_FMR)) {
+			EDEB_ERR(4, "not a FMR, e_fmr=%p e_fmr->flags=%x",
+				 e_fmr, e_fmr->flags);
+			retcode = -EINVAL;
+			goto unmap_fmr_exit0;
+		}
+		num_fmr++;
+	}
+
+	/* loop over all FMRs to unmap */
+	list_for_each_entry(ib_fmr, fmr_list, list) {
+		unmap_fmr_cnt++;
+		e_fmr = container_of(ib_fmr, struct ehca_mr, ib.ib_fmr);
+		shca = container_of(ib_fmr->device, struct ehca_shca,
+				    ib_device);
+		/*@TODO??? spin_lock_irqsave(&fmr->mrlock, sl_flags); */
+		retcode = ehca_unmap_one_fmr(shca, e_fmr);
+		/*@TODO???? spin_unlock_irqrestore(&fmr->mrlock, sl_flags); */
+		if (retcode != 0) {
+			/* unmap failed, stop unmapping of rest of FMRs */
+			EDEB_ERR(4, "unmap of one FMR failed, stop rest, "
+				 "e_fmr=%p num_fmr=%x unmap_fmr_cnt=%x lkey=%x",
+				 e_fmr, num_fmr, unmap_fmr_cnt,
+				 e_fmr->ib.ib_fmr.lkey);
+			goto unmap_fmr_exit0;
+		}
+	}
+
+      unmap_fmr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "num_fmr=%x", num_fmr);
+	else
+		EDEB_EX(4, "retcode=%x fmr_list=%p num_fmr=%x unmap_fmr_cnt=%x",
+			retcode, fmr_list, num_fmr, unmap_fmr_cnt);
+	return (retcode);
+} /* end ehca_unmap_fmr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_dealloc_fmr(struct ib_fmr *fmr)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_shca *shca = 0;
+	struct ehca_mr *e_fmr = 0;
+
+	EDEB_EN(7, "fmr=%p", fmr);
+
+	EHCA_CHECK_FMR(fmr);
+	e_fmr = container_of(fmr, struct ehca_mr, ib.ib_fmr);
+	shca = container_of(fmr->device, struct ehca_shca, ib_device);
+
+	if (!(e_fmr->flags & EHCA_MR_FLAG_FMR)) {
+		EDEB_ERR(4, "not a FMR, e_fmr=%p e_fmr->flags=%x",
+			 e_fmr, e_fmr->flags);
+		retcode = -EINVAL;
+		goto free_fmr_exit0;
+	}
+
+	rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, &e_fmr->pf,
+				     &e_fmr->ipz_mr_handle);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "hipz_free_mr failed, rc=%lx e_fmr=%p "
+			 "hca_hndl=%lx fmr_hndl=%lx fmr->lkey=%x",
+			 rc, e_fmr, shca->ipz_hca_handle.handle,
+			 e_fmr->ipz_mr_handle.handle, fmr->lkey);
+		retcode = ehca_mrmw_map_rc_free_mr(rc);
+		goto free_fmr_exit0;
+	}
+	/* successful deregistration */
+	ehca_mr_delete(e_fmr);
+
+      free_fmr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "");
+	else
+		EDEB_EX(4, "retcode=%x fmr=%p", retcode, fmr);
+	return (retcode);
+} /* end ehca_dealloc_fmr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_reg_mr(struct ehca_shca *shca,
+		struct ehca_mr *e_mr,
+		u64 *iova_start,
+		u64 size,
+		int acl,
+		struct ehca_pd *e_pd,
+		struct ehca_mr_pginfo *pginfo,
+		u32 *lkey,
+		u32 *rkey)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_pfmr *pfmr = &e_mr->pf;
+	u32 hipz_acl = 0;
+
+	EDEB_EN(7, "shca=%p e_mr=%p iova_start=%p size=%lx acl=%x e_pd=%p "
+		"pginfo=%p num_pages=%lx", shca, e_mr, iova_start, size, acl,
+		e_pd, pginfo, pginfo->num_pages);
+
+	ehca_mrmw_map_acl(acl, &hipz_acl);
+	ehca_mrmw_set_pgsize_hipz_acl(&hipz_acl);
+	if (ehca_use_hp_mr == 1)
+		hipz_acl |= 0x00000001;
+
+	rc = hipz_h_alloc_resource_mr(shca->ipz_hca_handle, pfmr, &shca->pf,
+				      (u64)iova_start, size, hipz_acl,
+				      e_pd->fw_pd, &e_mr->ipz_mr_handle,
+				      lkey, rkey);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "hipz_alloc_mr failed, rc=%lx hca_hndl=%lx "
+			 "mr_hndl=%lx", rc, shca->ipz_hca_handle.handle,
+			 e_mr->ipz_mr_handle.handle);
+		retcode = ehca_mrmw_map_rc_alloc(rc);
+		goto ehca_reg_mr_exit0;
+	}
+
+	retcode = ehca_reg_mr_rpages(shca, e_mr, pginfo);
+	if (retcode != 0)
+		goto ehca_reg_mr_exit1;
+
+	/* successful registration */
+	e_mr->num_pages = pginfo->num_pages;
+	e_mr->start = iova_start;
+	e_mr->size = size;
+	e_mr->acl = acl;
+	goto ehca_reg_mr_exit0;
+
+      ehca_reg_mr_exit1:
+	rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, pfmr,
+				     &e_mr->ipz_mr_handle);
+	if (rc != H_Success) {
+		EDEB(1, "rc=%lx shca=%p e_mr=%p iova_start=%p "
+		     "size=%lx acl=%x e_pd=%p lkey=%x pginfo=%p num_pages=%lx",
+		     rc, shca, e_mr, iova_start, size, acl,
+		     e_pd, *lkey, pginfo, pginfo->num_pages);
+		ehca_catastrophic("internal error in ehca_reg_mr, "
+				  "not recoverable");
+	}
+      ehca_reg_mr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "retcode=%x lkey=%x rkey=%x", retcode, *lkey, *rkey);
+	else
+		EDEB_EX(4, "retcode=%x shca=%p e_mr=%p iova_start=%p "
+			"size=%lx acl=%x e_pd=%p pginfo=%p num_pages=%lx",
+			retcode, shca, e_mr, iova_start,
+			size, acl, e_pd, pginfo, pginfo->num_pages);
+	return (retcode);
+} /* end ehca_reg_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_reg_mr_rpages(struct ehca_shca *shca,
+		       struct ehca_mr *e_mr,
+		       struct ehca_mr_pginfo *pginfo)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_pfmr *pfmr = &e_mr->pf;
+	u32 rnum = 0;
+	u64 rpage = 0;
+	u32 i;
+	u64 *kpage = NULL;
+
+	EDEB_EN(7, "shca=%p e_mr=%p pginfo=%p num_pages=%lx",
+		shca, e_mr, pginfo, pginfo->num_pages);
+
+	kpage = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (kpage == 0) {
+		EDEB_ERR(4, "kpage alloc failed");
+		retcode = -ENOMEM;
+		goto ehca_reg_mr_rpages_exit0;
+	}
+	memset(kpage, 0, PAGE_SIZE);
+
+	/* max 512 pages per shot */
+	for (i = 0; i < ((pginfo->num_pages + 512 - 1) / 512); i++) {
+
+		if (i == ((pginfo->num_pages + 512 - 1) / 512) - 1) {
+			rnum = pginfo->num_pages % 512; /* last shot */
+			if (rnum == 0)
+				rnum = 512;      /* last shot is full */
+		} else
+			rnum = 512;
+
+		if (rnum > 1) {
+			retcode = ehca_set_pagebuf(e_mr, pginfo, rnum, kpage);
+			if (retcode) {
+				EDEB_ERR(4, "ehca_set_pagebuf bad rc, "
+					 "retcode=%x rnum=%x kpage=%p",
+					 retcode, rnum, kpage);
+				retcode = -EFAULT;
+				goto ehca_reg_mr_rpages_exit1;
+			}
+			rpage = ehca_kv_to_g(kpage);
+			if (rpage == 0) {
+				EDEB_ERR(4, "kpage=%p i=%x", kpage, i);
+				retcode = -EFAULT;
+				goto ehca_reg_mr_rpages_exit1;
+			}
+		} else {  /* rnum==1 */
+			retcode = ehca_set_pagebuf_1(e_mr, pginfo, &rpage);
+			if (retcode) {
+				EDEB_ERR(4, "ehca_set_pagebuf_1 bad rc, "
+					 "retcode=%x i=%x", retcode, i);
+				retcode = -EFAULT;
+				goto ehca_reg_mr_rpages_exit1;
+			}
+		}
+
+		EDEB(9, "i=%x rnum=%x rpage=%lx", i, rnum, rpage);
+
+		rc = hipz_h_register_rpage_mr(shca->ipz_hca_handle,
+					      &e_mr->ipz_mr_handle, pfmr,
+					      &shca->pf,
+					      0, /* pagesize hardcoded to 4k */
+					      0, rpage, rnum);
+
+		if (i == ((pginfo->num_pages + 512 - 1) / 512) - 1) {
+			/* check for 'registration complete'==H_Success */
+			/* and for 'page registered'==H_PAGE_REGISTERED */
+			if (rc != H_Success) {
+				EDEB_ERR(4, "last hipz_reg_rpage_mr failed, "
+					 "rc=%lx e_mr=%p i=%x hca_hndl=%lx "
+					 "mr_hndl=%lx lkey=%x", rc, e_mr, i,
+					 shca->ipz_hca_handle.handle,
+					 e_mr->ipz_mr_handle.handle,
+					 e_mr->ib.ib_mr.lkey);
+				retcode = ehca_mrmw_map_rc_rrpg_last(rc);
+				break;
+			} else
+				retcode = 0;
+		} else if (rc != H_PAGE_REGISTERED) {
+			EDEB_ERR(4, "hipz_reg_rpage_mr failed, rc=%lx e_mr=%p "
+				 "i=%x lkey=%x hca_hndl=%lx mr_hndl=%lx",
+				 rc, e_mr, i, e_mr->ib.ib_mr.lkey,
+				 shca->ipz_hca_handle.handle,
+				 e_mr->ipz_mr_handle.handle);
+			retcode = ehca_mrmw_map_rc_rrpg_notlast(rc);
+			break;
+		} else
+			retcode = 0;
+	} /* end for(i) */
+
+
+       ehca_reg_mr_rpages_exit1:
+	kfree(kpage);
+       ehca_reg_mr_rpages_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "retcode=%x", retcode);
+	else
+		EDEB_EX(4, "retcode=%x shca=%p e_mr=%p pginfo=%p "
+			"num_pages=%lx",
+			retcode, shca, e_mr, pginfo, pginfo->num_pages);
+	return (retcode);
+} /* end ehca_reg_mr_rpages() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+inline int ehca_rereg_mr_rereg1(struct ehca_shca *shca,
+				struct ehca_mr *e_mr,
+				u64 *iova_start,
+				u64 size,
+				u32 acl,
+				struct ehca_pd *e_pd,
+				struct ehca_mr_pginfo *pginfo,
+				u32 *lkey,
+				u32 *rkey)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_pfmr *pfmr = &e_mr->pf;
+	u64 iova_start_out = 0;
+	u32 hipz_acl = 0;
+	u64 *kpage = NULL;
+	u64 rpage = 0;
+	struct ehca_mr_pginfo pginfo_save;
+
+	EDEB_EN(7, "shca=%p e_mr=%p iova_start=%p size=%lx acl=%x "
+		"e_pd=%p pginfo=%p num_pages=%lx", shca, e_mr,
+		iova_start, size, acl, e_pd, pginfo, pginfo->num_pages);
+
+	ehca_mrmw_map_acl(acl, &hipz_acl);
+	ehca_mrmw_set_pgsize_hipz_acl(&hipz_acl);
+
+	kpage = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	if (kpage == 0) {
+		EDEB_ERR(4, "kpage alloc failed");
+		retcode = -ENOMEM;
+		goto ehca_rereg_mr_rereg1_exit0;
+	}
+	memset(kpage, 0, PAGE_SIZE);
+
+	pginfo_save = *pginfo;
+	retcode = ehca_set_pagebuf(e_mr, pginfo, pginfo->num_pages, kpage);
+	if (retcode != 0) {
+		EDEB_ERR(4, "set pagebuf failed, e_mr=%p pginfo=%p type=%x "
+			 "num_pages=%lx kpage=%p",
+			 e_mr, pginfo, pginfo->type, pginfo->num_pages, kpage);
+		goto ehca_rereg_mr_rereg1_exit1;
+	}
+	rpage = ehca_kv_to_g(kpage);
+	if (rpage == 0) {
+		EDEB_ERR(4, "kpage=%p", kpage);
+		retcode = -EFAULT;
+		goto ehca_rereg_mr_rereg1_exit1;
+	}
+	rc = hipz_h_reregister_pmr(shca->ipz_hca_handle, pfmr, &shca->pf,
+				   &e_mr->ipz_mr_handle, (u64)iova_start,
+				   size, hipz_acl, e_pd->fw_pd, rpage,
+				   &iova_start_out, lkey, rkey);
+	if (rc != H_Success) {
+		/* reregistration unsuccessful,                 */
+		/* try it again with the 3 hCalls,              */
+		/* e.g. this is required in case H_MR_CONDITION */
+		/* (MW bound or MR is shared)                   */
+		EDEB(6, "hipz_h_reregister_pmr failed (Rereg1), rc=%lx "
+		     "e_mr=%p", rc, e_mr);
+		*pginfo = pginfo_save;
+		retcode = -EAGAIN;
+	} else if ((u64 *)iova_start_out != iova_start) {
+		EDEB_ERR(4, "PHYP changed iova_start in rereg_pmr, "
+			 "iova_start=%p iova_start_out=%lx e_mr=%p "
+			 "mr_handle=%lx lkey=%x", iova_start, iova_start_out,
+			 e_mr, e_mr->ipz_mr_handle.handle, e_mr->ib.ib_mr.lkey);
+		retcode = -EFAULT;
+	} else {
+		/* successful reregistration */
+		/* note: start and start_out are identical for eServer HCAs */
+		e_mr->num_pages = pginfo->num_pages;
+		e_mr->start     = iova_start;
+		e_mr->size      = size;
+		e_mr->acl       = acl;
+	}
+
+       ehca_rereg_mr_rereg1_exit1:
+	kfree(kpage);
+       ehca_rereg_mr_rereg1_exit0:
+	if ((retcode == 0) || (retcode == -EAGAIN))
+		EDEB_EX(7, "retcode=%x rc=%lx lkey=%x rkey=%x pginfo=%p "
+			"num_pages=%lx",
+			retcode, rc, *lkey, *rkey, pginfo, pginfo->num_pages);
+	else
+		EDEB_EX(4, "retcode=%x rc=%lx lkey=%x rkey=%x pginfo=%p "
+			"num_pages=%lx",
+			retcode, rc, *lkey, *rkey, pginfo, pginfo->num_pages);
+	return (retcode);
+} /* end ehca_rereg_mr_rereg1() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_rereg_mr(struct ehca_shca *shca,
+		  struct ehca_mr *e_mr,
+		  u64 *iova_start,
+		  u64 size,
+		  int acl,
+		  struct ehca_pd *e_pd,
+		  struct ehca_mr_pginfo *pginfo,
+		  u32 *lkey,
+		  u32 *rkey)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_pfmr *pfmr = &e_mr->pf;
+	int Rereg1Hcall = TRUE;	 /* TRUE: use hipz_h_reregister_pmr directly */
+	int Rereg3Hcall = FALSE; /* TRUE: use 3 hipz calls for reregistration */
+	struct ehca_bridge_handle save_bridge;
+
+	EDEB_EN(7, "shca=%p e_mr=%p iova_start=%p size=%lx acl=%x "
+		"e_pd=%p pginfo=%p num_pages=%lx", shca, e_mr,
+		iova_start, size, acl, e_pd, pginfo, pginfo->num_pages);
+
+	/* first determine reregistration hCall(s) */
+	if ((pginfo->num_pages > 512) || (e_mr->num_pages > 512) ||
+	    (pginfo->num_pages > e_mr->num_pages)) {
+		EDEB(7, "Rereg3 case, pginfo->num_pages=%lx "
+		     "e_mr->num_pages=%x", pginfo->num_pages, e_mr->num_pages);
+		Rereg1Hcall = FALSE;
+		Rereg3Hcall = TRUE;
+	}
+
+	if (e_mr->flags & EHCA_MR_FLAG_MAXMR) {	/* check for max-MR */
+		Rereg1Hcall = FALSE;
+		Rereg3Hcall = TRUE;
+		e_mr->flags &= ~EHCA_MR_FLAG_MAXMR;
+		EDEB(4, "Rereg MR for max-MR! e_mr=%p", e_mr);
+	}
+
+	if (Rereg1Hcall) {
+		retcode = ehca_rereg_mr_rereg1(shca, e_mr, iova_start, size,
+					       acl, e_pd, pginfo, lkey, rkey);
+		if (retcode != 0) {
+			if (retcode == -EAGAIN)
+				Rereg3Hcall = TRUE;
+			else
+				goto ehca_rereg_mr_exit0;
+		}
+	}
+
+	if (Rereg3Hcall) {
+		struct ehca_mr save_mr;
+
+		/* first deregister old MR */
+		rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, pfmr,
+					     &e_mr->ipz_mr_handle);
+		if (rc != H_Success) {
+			EDEB_ERR(4, "hipz_free_mr failed, rc=%lx e_mr=%p "
+				 "hca_hndl=%lx mr_hndl=%lx mr->lkey=%x",
+				 rc, e_mr, shca->ipz_hca_handle.handle,
+				 e_mr->ipz_mr_handle.handle,
+				 e_mr->ib.ib_mr.lkey);
+			retcode = ehca_mrmw_map_rc_free_mr(rc);
+			goto ehca_rereg_mr_exit0;
+		}
+		/* clean ehca_mr_t, without changing struct ib_mr and lock */
+		save_bridge = pfmr->bridge;
+		save_mr = *e_mr;
+		ehca_mr_deletenew(e_mr);
+
+		/* set some MR values */
+		e_mr->flags = save_mr.flags;
+		pfmr->bridge = save_bridge;
+		e_mr->fmr_page_size = save_mr.fmr_page_size;
+		e_mr->fmr_max_pages = save_mr.fmr_max_pages;
+		e_mr->fmr_max_maps = save_mr.fmr_max_maps;
+		e_mr->fmr_map_cnt = save_mr.fmr_map_cnt;
+
+		retcode = ehca_reg_mr(shca, e_mr, iova_start, size, acl,
+				      e_pd, pginfo, lkey, rkey);
+		if (retcode != 0) {
+			u32 offset = offsetof(struct ehca_mr, flags);
+			memcpy(&e_mr->flags, &(save_mr.flags),
+			       sizeof(struct ehca_mr) - offset);
+			goto ehca_rereg_mr_exit0;
+		}
+	}
+
+      ehca_rereg_mr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "retcode=%x shca=%p e_mr=%p iova_start=%p size=%lx "
+			"acl=%x e_pd=%p pginfo=%p num_pages=%lx lkey=%x "
+			"rkey=%x Rereg1Hcall=%x Rereg3Hcall=%x",
+			retcode, shca, e_mr, iova_start, size, acl, e_pd,
+			pginfo, pginfo->num_pages, *lkey, *rkey, Rereg1Hcall,
+			Rereg3Hcall);
+	else
+		EDEB_EX(4, "retcode=%x shca=%p e_mr=%p iova_start=%p size=%lx "
+			"acl=%x e_pd=%p pginfo=%p num_pages=%lx lkey=%x "
+			"rkey=%x Rereg1Hcall=%x Rereg3Hcall=%x",
+			retcode, shca, e_mr, iova_start, size, acl, e_pd,
+			pginfo, pginfo->num_pages, *lkey, *rkey, Rereg1Hcall,
+			Rereg3Hcall);
+
+	return (retcode);
+} /* end ehca_rereg_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_unmap_one_fmr(struct ehca_shca *shca,
+		       struct ehca_mr *e_fmr)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_pfmr *pfmr = &e_fmr->pf;
+	int Rereg1Hcall = TRUE;	 /* TRUE: use hipz_mr_reregister directly */
+	int Rereg3Hcall = FALSE; /* TRUE: use 3 hipz calls for unmapping */
+	struct ehca_bridge_handle save_bridge;
+	struct ehca_pd *e_pd = NULL;
+	struct ehca_mr save_fmr;
+	u32 tmp_lkey = 0;
+	u32 tmp_rkey = 0;
+	struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0};
+
+	EDEB_EN(7, "shca=%p e_fmr=%p", shca, e_fmr);
+
+	/* first check if reregistration hCall can be used for unmap */
+	if (e_fmr->fmr_max_pages > 512) {
+		Rereg1Hcall = FALSE;
+		Rereg3Hcall = TRUE;
+	}
+
+	e_pd = container_of(e_fmr->ib.ib_fmr.pd, struct ehca_pd, ib_pd);
+
+	if (Rereg1Hcall) {
+		/* note: after using rereg hcall with len=0,            */
+		/* rereg hcall must be used again for registering pages */
+		u64 start_out = 0;
+		rc = hipz_h_reregister_pmr(shca->ipz_hca_handle, pfmr,
+					   &shca->pf, &e_fmr->ipz_mr_handle, 0,
+					   0, 0, e_pd->fw_pd, 0, &start_out,
+					   &tmp_lkey, &tmp_rkey);
+		if (rc != H_Success) {
+			/* should not happen, because length checked above, */
+			/* FMRs are not shared and no MW bound to FMRs      */
+			EDEB_ERR(4, "hipz_reregister_pmr failed (Rereg1), "
+				 "rc=%lx e_fmr=%p hca_hndl=%lx mr_hndl=%lx "
+				 "lkey=%x", rc, e_fmr,
+				 shca->ipz_hca_handle.handle,
+				 e_fmr->ipz_mr_handle.handle,
+				 e_fmr->ib.ib_fmr.lkey);
+			Rereg3Hcall = TRUE;
+		} else {
+			/* successful reregistration */
+			e_fmr->start = 0;
+			e_fmr->size = 0;
+		}
+	}
+
+	if (Rereg3Hcall) {
+		struct ehca_mr save_mr;
+
+		/* first free old FMR */
+		rc = hipz_h_free_resource_mr(shca->ipz_hca_handle, pfmr,
+					     &e_fmr->ipz_mr_handle);
+		if (rc != H_Success) {
+			EDEB_ERR(4, "hipz_free_mr failed, rc=%lx e_fmr=%p "
+				 "hca_hndl=%lx mr_hndl=%lx lkey=%x", rc, e_fmr,
+				 shca->ipz_hca_handle.handle,
+				 e_fmr->ipz_mr_handle.handle,
+				 e_fmr->ib.ib_fmr.lkey);
+			retcode = ehca_mrmw_map_rc_free_mr(rc);
+			goto ehca_unmap_one_fmr_exit0;
+		}
+		/* clean ehca_mr_t, without changing lock */
+		save_bridge = pfmr->bridge;
+		save_fmr = *e_fmr;
+		ehca_mr_deletenew(e_fmr);
+
+		/* set some MR values */
+		e_fmr->flags = save_fmr.flags;
+		pfmr->bridge = save_bridge;
+		e_fmr->fmr_page_size = save_fmr.fmr_page_size;
+		e_fmr->fmr_max_pages = save_fmr.fmr_max_pages;
+		e_fmr->fmr_max_maps = save_fmr.fmr_max_maps;
+		e_fmr->fmr_map_cnt = save_fmr.fmr_map_cnt;
+		e_fmr->acl = save_fmr.acl;
+
+		pginfo.type      = EHCA_MR_PGI_FMR;
+		pginfo.num_pages = 0;
+		retcode = ehca_reg_mr(shca, e_fmr, 0,
+				      (e_fmr->fmr_max_pages *
+				       e_fmr->fmr_page_size),
+				      e_fmr->acl, e_pd, &pginfo, &tmp_lkey,
+				      &tmp_rkey);
+		if (retcode != 0) {
+			u32 offset = offsetof(struct ehca_mr, flags);
+			memcpy(&e_fmr->flags, &(save_fmr.flags),
+			       sizeof(struct ehca_mr) - offset);
+			goto ehca_unmap_one_fmr_exit0;
+		}
+	}
+
+      ehca_unmap_one_fmr_exit0:
+	EDEB_EX(7, "retcode=%x tmp_lkey=%x tmp_rkey=%x fmr_max_pages=%x "
+		"Rereg1Hcall=%x Rereg3Hcall=%x", retcode, tmp_lkey, tmp_rkey,
+		e_fmr->fmr_max_pages, Rereg1Hcall, Rereg3Hcall);
+	return (retcode);
+} /* end ehca_unmap_one_fmr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_reg_smr(struct ehca_shca *shca,
+		 struct ehca_mr *e_origmr,
+		 struct ehca_mr *e_newmr,
+		 u64 *iova_start,
+		 int acl,
+		 struct ehca_pd *e_pd,
+		 u32 *lkey,
+		 u32 *rkey)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_pfmr *pfmr = &e_newmr->pf;
+	u32 hipz_acl = 0;
+
+	EDEB_EN(7,"shca=%p e_origmr=%p e_newmr=%p iova_start=%p acl=%x e_pd=%p",
+		shca, e_origmr, e_newmr, iova_start, acl, e_pd);
+
+	ehca_mrmw_map_acl(acl, &hipz_acl);
+	ehca_mrmw_set_pgsize_hipz_acl(&hipz_acl);
+
+	rc = hipz_h_register_smr(shca->ipz_hca_handle, pfmr, &e_origmr->pf,
+				 &shca->pf, &e_origmr->ipz_mr_handle,
+				 (u64)iova_start, hipz_acl, e_pd->fw_pd,
+				 &e_newmr->ipz_mr_handle, lkey, rkey);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "hipz_reg_smr failed, rc=%lx shca=%p e_origmr=%p "
+			 "e_newmr=%p iova_start=%p acl=%x e_pd=%p hca_hndl=%lx "
+			 "mr_hndl=%lx lkey=%x", rc, shca, e_origmr, e_newmr,
+			 iova_start, acl, e_pd, shca->ipz_hca_handle.handle,
+			 e_origmr->ipz_mr_handle.handle,
+			 e_origmr->ib.ib_mr.lkey);
+		retcode = ehca_mrmw_map_rc_reg_smr(rc);
+		goto ehca_reg_smr_exit0;
+	}
+	/* successful registration */
+	e_newmr->num_pages = e_origmr->num_pages;
+	e_newmr->start = iova_start;
+	e_newmr->size = e_origmr->size;
+	e_newmr->acl = acl;
+	goto ehca_reg_smr_exit0;
+
+      ehca_reg_smr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "retcode=%x lkey=%x rkey=%x",
+			retcode, *lkey, *rkey);
+	else
+		EDEB_EX(4, "retcode=%x shca=%p e_origmr=%p e_newmr=%p "
+			"iova_start=%p acl=%x e_pd=%p", retcode,
+			shca, e_origmr, e_newmr, iova_start, acl, e_pd);
+	return (retcode);
+} /* end ehca_reg_smr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_reg_internal_maxmr(
+	struct ehca_shca *shca,
+	struct ehca_pd *e_pd,
+	struct ehca_mr **e_maxmr)
+{
+	int retcode = 0;
+	struct ehca_mr *e_mr = NULL;
+	u64 *iova_start = NULL;
+	u64 size_maxmr = 0;
+	struct ehca_mr_pginfo pginfo={0,0,0,0,0,0,0,0,0,0,0,0};
+	struct ib_phys_buf ib_pbuf;
+	u32 num_pages_mr = 0;
+
+	EDEB_EN(7, "shca=%p e_pd=%p e_maxmr=%p", shca, e_pd, e_maxmr);
+
+	if (ehca_adr_bad(shca) || ehca_adr_bad(e_pd) || ehca_adr_bad(e_maxmr)) {
+		EDEB_ERR(4, "bad input values: shca=%p e_pd=%p e_maxmr=%p",
+			 shca, e_pd, e_maxmr);
+		retcode = -EINVAL;
+		goto ehca_reg_internal_maxmr_exit0;
+	}
+
+	e_mr = ehca_mr_new();
+	if (!e_mr) {
+		EDEB_ERR(4, "out of memory");
+		retcode = -ENOMEM;
+		goto ehca_reg_internal_maxmr_exit0;
+	}
+	e_mr->flags |= EHCA_MR_FLAG_MAXMR;
+
+	/* register internal max-MR on HCA */
+	size_maxmr = (u64)high_memory - PAGE_OFFSET;
+	EDEB(9, "high_memory=%p PAGE_OFFSET=%lx", high_memory, PAGE_OFFSET);
+	iova_start = (u64 *)KERNELBASE;
+	ib_pbuf.addr = 0;
+	ib_pbuf.size = size_maxmr;
+	num_pages_mr =
+		((((u64)iova_start % PAGE_SIZE) + size_maxmr +
+		  PAGE_SIZE - 1) / PAGE_SIZE);
+
+	pginfo.type           = EHCA_MR_PGI_PHYS;
+	pginfo.num_pages      = num_pages_mr;
+	pginfo.num_phys_buf   = 1;
+	pginfo.phys_buf_array = &ib_pbuf;
+
+	retcode = ehca_reg_mr(shca, e_mr, iova_start, size_maxmr, 0, e_pd,
+			      &pginfo, &e_mr->ib.ib_mr.lkey,
+			      &e_mr->ib.ib_mr.rkey);
+	if (retcode != 0) {
+		EDEB_ERR(4, "reg of internal max MR failed, e_mr=%p "
+			 "iova_start=%p size_maxmr=%lx num_pages_mr=%x",
+			 e_mr, iova_start, size_maxmr, num_pages_mr);
+		goto ehca_reg_internal_maxmr_exit1;
+	}
+
+	/* successful registration of all pages */
+	e_mr->ib.ib_mr.device = e_pd->ib_pd.device;
+	e_mr->ib.ib_mr.pd = &e_pd->ib_pd;
+	e_mr->ib.ib_mr.uobject = NULL;
+	atomic_inc(&(e_pd->ib_pd.usecnt));
+	atomic_set(&(e_mr->ib.ib_mr.usecnt), 0);
+	*e_maxmr = e_mr;
+	goto ehca_reg_internal_maxmr_exit0;
+
+      ehca_reg_internal_maxmr_exit1:
+	ehca_mr_delete(e_mr);
+      ehca_reg_internal_maxmr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "*e_maxmr=%p lkey=%x rkey=%x",
+			*e_maxmr, (*e_maxmr)->ib.ib_mr.lkey,
+			(*e_maxmr)->ib.ib_mr.rkey);
+	else
+		EDEB_EX(4, "retcode=%x shca=%p e_pd=%p e_maxmr=%p",
+			retcode, shca, e_pd, e_maxmr);
+	return (retcode);
+} /* end ehca_reg_internal_maxmr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_reg_maxmr(struct ehca_shca *shca,
+		   struct ehca_mr *e_newmr,
+		   u64 *iova_start,
+		   int acl,
+		   struct ehca_pd *e_pd,
+		   u32 *lkey,
+		   u32 *rkey)
+{
+	int retcode = 0;
+	u64 rc = H_Success;
+	struct ehca_pfmr *pfmr = &e_newmr->pf;
+	struct ehca_mr *e_origmr = shca->maxmr;
+	u32 hipz_acl = 0;
+
+	EDEB_EN(7,"shca=%p e_origmr=%p e_newmr=%p iova_start=%p acl=%x e_pd=%p",
+		shca, e_origmr, e_newmr, iova_start, acl, e_pd);
+
+	ehca_mrmw_map_acl(acl, &hipz_acl);
+	ehca_mrmw_set_pgsize_hipz_acl(&hipz_acl);
+
+	rc = hipz_h_register_smr(shca->ipz_hca_handle, pfmr, &e_origmr->pf,
+				 &shca->pf, &e_origmr->ipz_mr_handle,
+				 (u64)iova_start, hipz_acl, e_pd->fw_pd,
+				 &e_newmr->ipz_mr_handle, lkey, rkey);
+	if (rc != H_Success) {
+		EDEB_ERR(4, "hipz_reg_smr failed, rc=%lx e_origmr=%p "
+			 "hca_hndl=%lx mr_hndl=%lx lkey=%x",
+			 rc, e_origmr, shca->ipz_hca_handle.handle,
+			 e_origmr->ipz_mr_handle.handle,
+			 e_origmr->ib.ib_mr.lkey);
+		retcode = ehca_mrmw_map_rc_reg_smr(rc);
+		goto ehca_reg_maxmr_exit0;
+	}
+	/* successful registration */
+	e_newmr->num_pages = e_origmr->num_pages;
+	e_newmr->start = iova_start;
+	e_newmr->size = e_origmr->size;
+	e_newmr->acl = acl;
+
+      ehca_reg_maxmr_exit0:
+	EDEB_EX(7, "retcode=%x lkey=%x rkey=%x", retcode, *lkey, *rkey);
+	return (retcode);
+} /* end ehca_reg_maxmr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+int ehca_dereg_internal_maxmr(struct ehca_shca *shca)
+{
+	int retcode = 0;
+	struct ehca_mr *e_maxmr = NULL;
+	struct ib_pd *ib_pd = NULL;
+
+	EDEB_EN(7, "shca=%p shca->maxmr=%p", shca, shca->maxmr);
+
+	if (shca->maxmr == NULL) {
+		EDEB_ERR(4, "bad call, shca=%p", shca);
+		retcode = -EINVAL;
+		goto ehca_dereg_internal_maxmr_exit0;
+	}
+
+	e_maxmr = shca->maxmr;
+	ib_pd = e_maxmr->ib.ib_mr.pd;
+	shca->maxmr = NULL; /* remove internal max-MR indication from SHCA */
+
+	retcode = ehca_dereg_mr(&e_maxmr->ib.ib_mr);
+	if (retcode != 0) {
+		EDEB_ERR(3, "dereg internal max-MR failed, "
+			 "retcode=%x e_maxmr=%p shca=%p lkey=%x",
+			 retcode, e_maxmr, shca, e_maxmr->ib.ib_mr.lkey);
+		shca->maxmr = e_maxmr;
+		goto ehca_dereg_internal_maxmr_exit0;
+	}
+
+	atomic_dec(&ib_pd->usecnt);
+
+      ehca_dereg_internal_maxmr_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "");
+	else
+		EDEB_EX(4, "retcode=%x shca=%p shca->maxmr=%p",
+			retcode, shca, shca->maxmr);
+	return (retcode);
+} /* end ehca_dereg_internal_maxmr() */
diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.h b/drivers/infiniband/hw/ehca/ehca_mrmw.h
new file mode 100644
index 0000000..4df4b5b
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.h
@@ -0,0 +1,739 @@
+/*
+ *  IBM eServer eHCA Infiniband device driver for Linux on POWER
+ *
+ *  MR/MW declarations and inline functions
+ *
+ *  Authors: Dietmar Decker <ddecker at de.ibm.com>
+ *
+ *  Copyright (c) 2005 IBM Corporation
+ *
+ *  All rights reserved.
+ *
+ *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
+ *  BSD.
+ *
+ * OpenIB BSD License
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * Redistributions of source code must retain the above copyright notice, this
+ * list of conditions and the following disclaimer.
+ *
+ * Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials
+ * provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+ * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ *  $Id: ehca_mrmw.h,v 1.59 2006/02/06 10:17:34 schickhj Exp $
+ */
+
+#ifndef _EHCA_MRMW_H_
+#define _EHCA_MRMW_H_
+
+#undef DEB_PREFIX
+#define DEB_PREFIX "mrmw"
+
+#include "hipz_structs.h"
+
+
+int ehca_reg_mr(struct ehca_shca *shca,
+		struct ehca_mr *e_mr,
+		u64 *iova_start,
+		u64 size,
+		int acl,
+		struct ehca_pd *e_pd,
+		struct ehca_mr_pginfo *pginfo,
+		u32 *lkey,  /**<OUT*/
+		u32 *rkey); /**<OUT*/
+
+int ehca_reg_mr_rpages(struct ehca_shca *shca,
+		       struct ehca_mr *e_mr,
+		       struct ehca_mr_pginfo *pginfo);
+
+int ehca_rereg_mr(struct ehca_shca *shca,
+		  struct ehca_mr *e_mr,
+		  u64 *iova_start,
+		  u64 size,
+		  int mr_access_flags,
+		  struct ehca_pd *e_pd,
+		  struct ehca_mr_pginfo *pginfo,
+		  u32 *lkey,  /**<OUT*/
+		  u32 *rkey); /**<OUT*/
+
+int ehca_unmap_one_fmr(struct ehca_shca *shca,
+		       struct ehca_mr *e_fmr);
+
+int ehca_reg_smr(struct ehca_shca *shca,
+		 struct ehca_mr *e_origmr,
+		 struct ehca_mr *e_newmr,
+		 u64 *iova_start,
+		 int acl,
+		 struct ehca_pd *e_pd,
+		 u32 *lkey,  /**<OUT*/
+		 u32 *rkey); /**<OUT*/
+
+/** @brief register internal max-MR to internal SHCA
+ */
+int ehca_reg_internal_maxmr(struct ehca_shca *shca,  /**<IN*/
+			    struct ehca_pd *e_pd,    /**<IN*/
+			    struct ehca_mr **maxmr); /**<OUT*/
+
+int ehca_reg_maxmr(struct ehca_shca *shca,
+		   struct ehca_mr *e_newmr,
+		   u64 *iova_start,
+		   int acl,
+		   struct ehca_pd *e_pd,
+		   u32 *lkey,
+		   u32 *rkey);
+
+int ehca_dereg_internal_maxmr(struct ehca_shca *shca);
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief check physical buffer array of MR verbs for validity and
+ calculate MR size
+*/
+static inline int ehca_mr_chk_buf_and_calc_size(
+	struct ib_phys_buf *phys_buf_array, /**<IN*/
+	int num_phys_buf,                   /**<IN*/
+	u64 *iova_start,                    /**<IN*/
+	u64 *size)                          /**<OUT*/
+{
+	struct ib_phys_buf *pbuf = phys_buf_array;
+	u64 size_count = 0;
+	u32 i;
+
+	if (num_phys_buf == 0) {
+		EDEB_ERR(4, "bad phys buf array len, num_phys_buf=0");
+		return (-EINVAL);
+	}
+	/* check first buffer */
+	if (((u64)iova_start & ~PAGE_MASK) != (pbuf->addr & ~PAGE_MASK)) {
+		EDEB_ERR(4, "iova_start/addr mismatch, iova_start=%p "
+			 "pbuf->addr=%lx pbuf->size=%lx",
+			 iova_start, pbuf->addr, pbuf->size);
+		return (-EINVAL);
+	}
+	if (((pbuf->addr + pbuf->size) % PAGE_SIZE) &&
+	    (num_phys_buf > 1)) {
+		EDEB_ERR(4, "addr/size mismatch in 1st buf, pbuf->addr=%lx "
+			 "pbuf->size=%lx", pbuf->addr, pbuf->size);
+		return (-EINVAL);
+	}
+
+	for (i = 0; i < num_phys_buf; i++) {
+		if ((i > 0) && (pbuf->addr % PAGE_SIZE)) {
+			EDEB_ERR(4, "bad address, i=%x pbuf->addr=%lx "
+				 "pbuf->size=%lx", i, pbuf->addr, pbuf->size);
+			return (-EINVAL);
+		}
+		if (((i > 0) &&	/* not 1st */
+		     (i < (num_phys_buf - 1)) &&	/* not last */
+		     (pbuf->size % PAGE_SIZE)) || (pbuf->size == 0)) {
+			EDEB_ERR(4, "bad size, i=%x pbuf->size=%lx",
+				 i, pbuf->size);
+			return (-EINVAL);
+		}
+		size_count += pbuf->size;
+		pbuf++;
+	}
+
+	*size = size_count;
+	return (0);
+} /* end ehca_mr_chk_buf_and_calc_size() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief check page list of map FMR verb for validity
+*/
+static inline int ehca_fmr_check_page_list(
+	struct ehca_mr *e_fmr, /**<IN*/
+	u64 *page_list,        /**<IN*/
+	int list_len)          /**<IN*/
+{
+	u32 i;
+	u64 *page = NULL;
+
+	if (ehca_adr_bad(page_list)) {
+		EDEB_ERR(4, "bad page_list, page_list=%p fmr=%p",
+			 page_list, e_fmr);
+		return (-EINVAL);
+	}
+
+	if ((list_len == 0) || (list_len > e_fmr->fmr_max_pages)) {
+		EDEB_ERR(4, "bad list_len, list_len=%x e_fmr->fmr_max_pages=%x "
+			 "fmr=%p", list_len, e_fmr->fmr_max_pages, e_fmr);
+		return (-EINVAL);
+	}
+
+	/* each page must be aligned */
+	page = page_list;
+	for (i = 0; i < list_len; i++) {
+		if (*page % PAGE_SIZE) {
+			EDEB_ERR(4, "bad page, i=%x *page=%lx page=%p "
+				 "fmr=%p", i, *page, page, e_fmr);
+			return (-EINVAL);
+		}
+		page++;
+	}
+
+	return (0);
+} /* end ehca_fmr_check_page_list() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief setup page buffer from page info
+ */
+static inline int ehca_set_pagebuf(struct ehca_mr *e_mr,
+				   struct ehca_mr_pginfo *pginfo,
+				   u32 number,
+				   u64 *kpage) /**<OUT*/
+{
+	int retcode = 0;
+	struct ib_umem_chunk *prev_chunk = NULL;
+	struct ib_umem_chunk *chunk      = NULL;
+	struct ib_phys_buf *pbuf         = NULL;
+	u64 *fmrlist = NULL;
+	u64 numpg  = 0;
+	u64 pgaddr = 0;
+	u32 i = 0;
+	u32 j = 0;
+
+
+	EDEB_EN(7, "pginfo=%p type=%x num_pages=%lx next_buf=%lx next_page=%lx "
+		"number=%x kpage=%p page_count=%lx next_listelem=%lx "
+		"region=%p next_chunk=%p next_nmap=%lx",
+		pginfo, pginfo->type, pginfo->num_pages, pginfo->next_buf,
+		pginfo->next_page, number, kpage, pginfo->page_count,
+		pginfo->next_listelem, pginfo->region, pginfo->next_chunk,
+		pginfo->next_nmap);
+
+	if (pginfo->type == EHCA_MR_PGI_PHYS) {
+		/* loop over desired phys_buf_array entries */
+		while (i < number) {
+			pbuf  = pginfo->phys_buf_array + pginfo->next_buf;
+			numpg = ((pbuf->size + PAGE_SIZE - 1) / PAGE_SIZE);
+			while (pginfo->next_page < numpg) {
+				/* sanity check */
+				if (pginfo->page_count >= pginfo->num_pages) {
+					EDEB_ERR(4, "page_count >= num_pages, "
+						 "page_count=%lx num_pages=%lx "
+						 "i=%x", pginfo->page_count,
+						 pginfo->num_pages, i);
+					retcode = -EFAULT;
+					goto ehca_set_pagebuf_exit0;
+				}
+				*kpage = phys_to_abs((pbuf->addr & PAGE_MASK)
+						     + (pginfo->next_page *
+							PAGE_SIZE));
+				if ((*kpage == 0) && (pbuf->addr != 0)) {
+					EDEB_ERR(4, "pbuf->addr=%lx"
+						 " pbuf->size=%lx"
+						 " next_page=%lx",
+						 pbuf->addr, pbuf->size,
+						 pginfo->next_page);
+					retcode = -EFAULT;
+					goto ehca_set_pagebuf_exit0;
+				}
+				(pginfo->next_page)++;
+				(pginfo->page_count)++;
+				kpage++;
+				i++;
+				if (i >= number) break;
+			}
+			if (pginfo->next_page >= numpg) {
+				(pginfo->next_buf)++;
+				pginfo->next_page = 0;
+			}
+		}
+	} else if (pginfo->type == EHCA_MR_PGI_USER) {
+		/* loop over desired chunk entries */
+		/* (@TODO: add support for large pages) */
+		chunk      = pginfo->next_chunk;
+		prev_chunk = pginfo->next_chunk;
+		list_for_each_entry_continue(chunk,
+					     (&(pginfo->region->chunk_list)),
+					     list) {
+			EDEB(9, "chunk->page_list[0]=%lx",
+			     (u64)sg_dma_address(&chunk->page_list[0]));
+			for (i = pginfo->next_nmap; i < chunk->nmap; i++) {
+				pgaddr = ( page_to_pfn(chunk->page_list[i].page)
+					   << PAGE_SHIFT );
+				*kpage = phys_to_abs(pgaddr);
+				EDEB(9,"pgaddr=%lx *kpage=%lx", pgaddr, *kpage);
+				if (*kpage == 0) {
+					EDEB_ERR(4, "chunk->page_list[i]=%lx"
+						 " i=%x mr=%p",
+						 (u64)sg_dma_address(
+							 &chunk->page_list[i]),
+						 i, e_mr);
+					retcode = -EFAULT;
+					goto ehca_set_pagebuf_exit0;
+				}
+				(pginfo->page_count)++;
+				(pginfo->next_nmap)++;
+				kpage++;
+				j++;
+				if (j >= number) break;
+			}
+			if ( (pginfo->next_nmap >= chunk->nmap) &&
+			     (j >= number) ) {
+				pginfo->next_nmap = 0;
+				prev_chunk = chunk;
+				break;
+			} else if (pginfo->next_nmap >= chunk->nmap) {
+				pginfo->next_nmap = 0;
+				prev_chunk = chunk;
+			} else if (j >= number)
+				break;
+			else
+				prev_chunk = chunk;
+		}
+		pginfo->next_chunk =
+			list_prepare_entry(prev_chunk,
+					   (&(pginfo->region->chunk_list)),
+					   list);
+	} else if (pginfo->type == EHCA_MR_PGI_FMR) {
+		/* loop over desired page_list entries */
+		fmrlist = pginfo->page_list + pginfo->next_listelem;
+		for (i = 0; i < number; i++) {
+			*kpage = phys_to_abs(*fmrlist);
+			if (*kpage == 0) {
+				EDEB_ERR(4, "*fmrlist=%lx fmrlist=%p"
+					 " next_listelem=%lx", *fmrlist,
+					 fmrlist, pginfo->next_listelem);
+				retcode = -EFAULT;
+				goto ehca_set_pagebuf_exit0;
+			}
+			(pginfo->next_listelem)++;
+			(pginfo->page_count)++;
+			fmrlist++;
+			kpage++;
+		}
+	} else {
+		EDEB_ERR(4, "bad pginfo->type=%x", pginfo->type);
+		retcode = -EFAULT;
+		goto ehca_set_pagebuf_exit0;
+	}
+
+      ehca_set_pagebuf_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "retcode=%x e_mr=%p pginfo=%p type=%x num_pages=%lx "
+			"next_buf=%lx next_page=%lx number=%x kpage=%p "
+			"page_count=%lx i=%x next_listelem=%lx region=%p "
+			"next_chunk=%p next_nmap=%lx",
+			retcode, e_mr, pginfo, pginfo->type, pginfo->num_pages,
+			pginfo->next_buf, pginfo->next_page, number, kpage,
+			pginfo->page_count, i, pginfo->next_listelem,
+			pginfo->region, pginfo->next_chunk, pginfo->next_nmap);
+	else
+		EDEB_EX(4, "retcode=%x e_mr=%p pginfo=%p type=%x num_pages=%lx "
+			"next_buf=%lx next_page=%lx number=%x kpage=%p "
+			"page_count=%lx i=%x next_listelem=%lx region=%p "
+			"next_chunk=%p next_nmap=%lx",
+			retcode, e_mr, pginfo, pginfo->type, pginfo->num_pages,
+			pginfo->next_buf, pginfo->next_page, number, kpage,
+			pginfo->page_count, i, pginfo->next_listelem,
+			pginfo->region, pginfo->next_chunk, pginfo->next_nmap);
+	return (retcode);
+} /* end ehca_set_pagebuf() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief setup 1 page from page info page buffer
+ */
+static inline int ehca_set_pagebuf_1(struct ehca_mr *e_mr,
+				     struct ehca_mr_pginfo *pginfo,
+				     u64 *rpage) /**<OUT*/
+{
+	int retcode = 0;
+	struct ib_phys_buf *tmp_pbuf = 0;
+	u64 *tmp_fmrlist = 0;
+	struct ib_umem_chunk *chunk = 0;
+	struct ib_umem_chunk *prev_chunk = 0;
+	u64 pgaddr = 0;
+
+	EDEB_EN(7, "pginfo=%p type=%x num_pages=%lx next_buf=%lx next_page=%lx "
+		"rpage=%p page_count=%lx next_listelem=%lx region=%p "
+		"next_chunk=%p next_nmap=%lx",
+		pginfo, pginfo->type, pginfo->num_pages, pginfo->next_buf,
+		pginfo->next_page, rpage, pginfo->page_count,
+		pginfo->next_listelem, pginfo->region, pginfo->next_chunk,
+		pginfo->next_nmap);
+
+	if (pginfo->type == EHCA_MR_PGI_PHYS) {
+		/* sanity check */
+		if (pginfo->page_count >= pginfo->num_pages) {
+			EDEB_ERR(4, "page_count >= num_pages, "
+				 "page_count=%lx num_pages=%lx",
+				 pginfo->page_count, pginfo->num_pages);
+			retcode = -EFAULT;
+			goto ehca_set_pagebuf_1_exit0;
+		}
+		tmp_pbuf = pginfo->phys_buf_array + pginfo->next_buf;
+		*rpage = phys_to_abs(((tmp_pbuf->addr & PAGE_MASK) +
+				      (pginfo->next_page * PAGE_SIZE)));
+		if ((*rpage == 0) && (tmp_pbuf->addr != 0)) {
+			EDEB_ERR(4, "tmp_pbuf->addr=%lx"
+				 " tmp_pbuf->size=%lx next_page=%lx",
+				 tmp_pbuf->addr, tmp_pbuf->size,
+				 pginfo->next_page);
+			retcode = -EFAULT;
+			goto ehca_set_pagebuf_1_exit0;
+		}
+		(pginfo->next_page)++;
+		(pginfo->page_count)++;
+		if (pginfo->next_page >= tmp_pbuf->size / PAGE_SIZE) {
+			(pginfo->next_buf)++;
+			pginfo->next_page = 0;
+		}
+	} else if (pginfo->type == EHCA_MR_PGI_USER) {
+		chunk      = pginfo->next_chunk;
+		prev_chunk = pginfo->next_chunk;
+		list_for_each_entry_continue(chunk,
+					     (&(pginfo->region->chunk_list)),
+					     list) {
+			pgaddr = ( page_to_pfn(chunk->page_list[
+						       pginfo->next_nmap].page)
+				   << PAGE_SHIFT );
+			*rpage = phys_to_abs(pgaddr);
+			EDEB(9,"pgaddr=%lx *rpage=%lx", pgaddr, *rpage);
+			if (*rpage == 0) {
+				EDEB_ERR(4, "chunk->page_list[]=%lx next_nmap=%lx "
+					 "mr=%p", (u64)sg_dma_address(
+						 &chunk->page_list[
+							 pginfo->next_nmap]),
+					 pginfo->next_nmap, e_mr);
+				retcode = -EFAULT;
+				goto ehca_set_pagebuf_1_exit0;
+			}
+			(pginfo->page_count)++;
+			(pginfo->next_nmap)++;
+			if (pginfo->next_nmap >= chunk->nmap) {
+				pginfo->next_nmap = 0;
+				prev_chunk = chunk;
+			}
+			break;
+		}
+		pginfo->next_chunk =
+			list_prepare_entry(prev_chunk,
+					   (&(pginfo->region->chunk_list)),
+					   list);
+	} else if (pginfo->type == EHCA_MR_PGI_FMR) {
+		tmp_fmrlist = pginfo->page_list + pginfo->next_listelem;
+		*rpage = phys_to_abs(*tmp_fmrlist);
+		if (*rpage == 0) {
+			EDEB_ERR(4, "*tmp_fmrlist=%lx tmp_fmrlist=%p"
+				 " next_listelem=%lx", *tmp_fmrlist,
+				 tmp_fmrlist, pginfo->next_listelem);
+			retcode = -EFAULT;
+			goto ehca_set_pagebuf_1_exit0;
+		}
+		(pginfo->next_listelem)++;
+		(pginfo->page_count)++;
+	} else {
+		EDEB_ERR(4, "bad pginfo->type=%x", pginfo->type);
+		retcode = -EFAULT;
+		goto ehca_set_pagebuf_1_exit0;
+	}
+
+      ehca_set_pagebuf_1_exit0:
+	if (retcode == 0)
+		EDEB_EX(7, "retcode=%x e_mr=%p pginfo=%p type=%x num_pages=%lx "
+			"next_buf=%lx next_page=%lx rpage=%p page_count=%lx "
+			"next_listelem=%lx region=%p next_chunk=%p "
+			"next_nmap=%lx",
+			retcode, e_mr, pginfo, pginfo->type, pginfo->num_pages,
+			pginfo->next_buf, pginfo->next_page, rpage,
+			pginfo->page_count, pginfo->next_listelem,
+			pginfo->region, pginfo->next_chunk, pginfo->next_nmap);
+	else
+		EDEB_EX(4, "retcode=%x e_mr=%p pginfo=%p type=%x num_pages=%lx "
+			"next_buf=%lx next_page=%lx rpage=%p page_count=%lx "
+			"next_listelem=%lx region=%p next_chunk=%p "
+			"next_nmap=%lx",
+			retcode, e_mr, pginfo, pginfo->type, pginfo->num_pages,
+			pginfo->next_buf, pginfo->next_page, rpage,
+			pginfo->page_count, pginfo->next_listelem,
+			pginfo->region, pginfo->next_chunk, pginfo->next_nmap);
+	return (retcode);
+} /* end ehca_set_pagebuf_1() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief check MR if it is a max-MR, i.e. uses whole memory
+    in case it's a max-MR TRUE is returned, else FALSE
+*/
+static inline int ehca_mr_is_maxmr(u64 size,
+				   u64 *iova_start)
+{
+	/* an MR is treated as a max-MR only if it meets both conditions: */
+	if ((size == ((u64)high_memory - PAGE_OFFSET)) &&
+	    (iova_start == (void*)KERNELBASE)) {
+		EDEB(6, "this is a max-MR");
+		return (TRUE);
+	} else
+		return (FALSE);
+} /* end ehca_mr_is_maxmr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+/** @brief map access control for MR/MW.
+    This routine is used for MR and MW.
+*/
+static inline void ehca_mrmw_map_acl(int ib_acl,    /**<IN*/
+				     u32 *hipz_acl) /**<OUT*/
+{
+	*hipz_acl = 0;
+	if (ib_acl & IB_ACCESS_REMOTE_READ)
+		*hipz_acl |= HIPZ_ACCESSCTRL_R_READ;
+	if (ib_acl & IB_ACCESS_REMOTE_WRITE)
+		*hipz_acl |= HIPZ_ACCESSCTRL_R_WRITE;
+	if (ib_acl & IB_ACCESS_REMOTE_ATOMIC)
+		*hipz_acl |= HIPZ_ACCESSCTRL_R_ATOMIC;
+	if (ib_acl & IB_ACCESS_LOCAL_WRITE)
+		*hipz_acl |= HIPZ_ACCESSCTRL_L_WRITE;
+	if (ib_acl & IB_ACCESS_MW_BIND)
+		*hipz_acl |= HIPZ_ACCESSCTRL_MW_BIND;
+} /* end ehca_mrmw_map_acl() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief sets page size in hipz access control for MR/MW.
+ */
+static inline void ehca_mrmw_set_pgsize_hipz_acl(
+	u32 *hipz_acl) /**<INOUT HIPZ access control */
+{
+	/* @TODO page size of 4k currently hardcoded ... */
+	return;
+} /* end ehca_mrmw_set_pgsize_hipz_acl() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief reverse map access control for MR/MW.
+    This routine is used for MR and MW.
+*/
+static inline void ehca_mrmw_reverse_map_acl(
+	const u32 *hipz_acl, /**<IN*/
+	int *ib_acl)	     /**<OUT*/
+{
+	*ib_acl = 0;
+	if (*hipz_acl & HIPZ_ACCESSCTRL_R_READ)
+		*ib_acl |= IB_ACCESS_REMOTE_READ;
+	if (*hipz_acl & HIPZ_ACCESSCTRL_R_WRITE)
+		*ib_acl |= IB_ACCESS_REMOTE_WRITE;
+	if (*hipz_acl & HIPZ_ACCESSCTRL_R_ATOMIC)
+		*ib_acl |= IB_ACCESS_REMOTE_ATOMIC;
+	if (*hipz_acl & HIPZ_ACCESSCTRL_L_WRITE)
+		*ib_acl |= IB_ACCESS_LOCAL_WRITE;
+	if (*hipz_acl & HIPZ_ACCESSCTRL_MW_BIND)
+		*ib_acl |= IB_ACCESS_MW_BIND;
+} /* end ehca_mrmw_reverse_map_acl() */
+
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief map HIPZ rc to IB retcodes for MR/MW allocations
+    Used for hipz_mr_reg_alloc and hipz_mw_alloc.
+*/
+static inline int ehca_mrmw_map_rc_alloc(const u64 rc)
+{
+	switch (rc) {
+	case H_Success:	             /* successful completion */
+		return (0);
+	case H_ADAPTER_PARM:         /* invalid adapter handle */
+	case H_RT_PARM:              /* invalid resource type */
+	case H_NOT_ENOUGH_RESOURCES: /* insufficient resources */
+	case H_MLENGTH_PARM:         /* invalid memory length */
+	case H_MEM_ACCESS_PARM:      /* invalid access controls */
+	case H_Constrained:          /* resource constraint */
+		return (-EINVAL);
+	case H_Busy:                 /* long busy */
+		return (-EBUSY);
+	default:
+		return (-EINVAL);
+	}
+} /* end ehca_mrmw_map_rc_alloc() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief map HIPZ rc to IB retcodes for MR register rpage
+    Used for hipz_h_register_rpage_mr at registering last page
+*/
+static inline int ehca_mrmw_map_rc_rrpg_last(const u64 rc)
+{
+	switch (rc) {
+	case H_Success:         /* registration complete */
+		return (0);
+	case H_PAGE_REGISTERED:	/* page registered */
+	case H_ADAPTER_PARM:    /* invalid adapter handle */
+	case H_RH_PARM:         /* invalid resource handle */
+/*	case H_QT_PARM:            invalid queue type */
+	case H_Parameter:       /* invalid logical address, */
+		                /* or count zero or greater 512 */
+	case H_TABLE_FULL:      /* page table full */
+	case H_Hardware:        /* HCA not operational */
+		return (-EINVAL);
+	case H_Busy:            /* long busy */
+		return (-EBUSY);
+	default:
+		return (-EINVAL);
+	}
+} /* end ehca_mrmw_map_rc_rrpg_last() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief map HIPZ rc to IB retcodes for MR register rpage
+    Used for hipz_h_register_rpage_mr at registering one page, but not last page
+*/
+static inline int ehca_mrmw_map_rc_rrpg_notlast(const u64 rc)
+{
+	switch (rc) {
+	case H_PAGE_REGISTERED:	/* page registered */
+		return (0);
+	case H_Success:         /* registration complete */
+	case H_ADAPTER_PARM:    /* invalid adapter handle */
+	case H_RH_PARM:         /* invalid resource handle */
+/*	case H_QT_PARM:            invalid queue type */
+	case H_Parameter:       /* invalid logical address, */
+		                /* or count zero or greater 512 */
+	case H_TABLE_FULL:      /* page table full */
+	case H_Hardware:        /* HCA not operational */
+		return (-EINVAL);
+	case H_Busy:            /* long busy */
+		return (-EBUSY);
+	default:
+		return (-EINVAL);
+	}
+} /* end ehca_mrmw_map_rc_rrpg_notlast() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief map HIPZ rc to IB retcodes for MR query
+    Used for hipz_mr_query.
+*/
+static inline int ehca_mrmw_map_rc_query_mr(const u64 rc)
+{
+	switch (rc) {
+	case H_Success:	             /* successful completion */
+		return (0);
+	case H_ADAPTER_PARM:         /* invalid adapter handle */
+	case H_RH_PARM:              /* invalid resource handle */
+		return (-EINVAL);
+	case H_Busy:                 /* long busy */
+		return (-EBUSY);
+	default:
+		return (-EINVAL);
+	}
+} /* end ehca_mrmw_map_rc_query_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief map HIPZ rc to IB retcodes for freeing MR resource
+    Used for hipz_h_free_resource_mr
+*/
+static inline int ehca_mrmw_map_rc_free_mr(const u64 rc)
+{
+	switch (rc) {
+	case H_Success:	     /* resource freed */
+		return (0);
+	case H_ADAPTER_PARM: /* invalid adapter handle */
+	case H_RH_PARM:      /* invalid resource handle */
+	case H_R_STATE:      /* invalid resource state */
+	case H_Hardware:     /* HCA not operational */
+		return (-EINVAL);
+	case H_Resource:     /* Resource in use */
+	case H_Busy:         /* long busy */
+		return (-EBUSY);
+	default:
+		return (-EINVAL);
+	}
+} /* end ehca_mrmw_map_rc_free_mr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief map HIPZ rc to IB retcodes for freeing MW resource
+    Used for hipz_h_free_resource_mw
+*/
+static inline int ehca_mrmw_map_rc_free_mw(const u64 rc)
+{
+	switch (rc) {
+	case H_Success:	     /* resource freed */
+		return (0);
+	case H_ADAPTER_PARM: /* invalid adapter handle */
+	case H_RH_PARM:      /* invalid resource handle */
+	case H_R_STATE:      /* invalid resource state */
+	case H_Hardware:     /* HCA not operational */
+		return (-EINVAL);
+	case H_Resource:     /* Resource in use */
+	case H_Busy:         /* long busy */
+		return (-EBUSY);
+	default:
+		return (-EINVAL);
+	}
+} /* end ehca_mrmw_map_rc_free_mw() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief map HIPZ rc to IB retcodes for SMR registrations
+    Used for hipz_h_register_smr.
+*/
+static inline int ehca_mrmw_map_rc_reg_smr(const u64 rc)
+{
+	switch (rc) {
+	case H_Success:	             /* successful completion */
+		return (0);
+	case H_ADAPTER_PARM:         /* invalid adapter handle */
+	case H_RH_PARM:              /* invalid resource handle */
+	case H_MEM_PARM:             /* invalid MR virtual address */
+	case H_MEM_ACCESS_PARM:      /* invalid access controls */
+	case H_NOT_ENOUGH_RESOURCES: /* insufficient resources */
+		return (-EINVAL);
+	case H_Busy:                 /* long busy */
+		return (-EBUSY);
+	default:
+		return (-EINVAL);
+	}
+} /* end ehca_mrmw_map_rc_reg_smr() */
+
+/*----------------------------------------------------------------------*/
+/*----------------------------------------------------------------------*/
+
+/** @brief MR destructor and constructor
+    used in Reregister MR verb, memsets ehca_mr_t to 0,
+    except struct ib_mr and spinlock
+ */
+static inline void ehca_mr_deletenew(struct ehca_mr *mr)
+{
+	u32 offset = (u64)(&mr->flags) - (u64)mr;
+	memset(&mr->flags, 0, sizeof(*mr) - offset);
+} /* end ehca_mr_deletenew() */
+
+#endif /*_EHCA_MRMW_H_*/
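
A note on ehca_mr_deletenew() above: the offset it computes by subtracting
pointers is the value offsetof() would give. What follows is only a minimal
userspace sketch of the same "clear everything from flags onward" idea.
struct sketch_mr and its field names are hypothetical stand-ins, since the
real struct ehca_mr layout is not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the first two fields model the parts that must
 * survive (struct ib_mr and the spinlock), the rest gets cleared. */
struct sketch_mr {
	int ib_header;		/* stands in for struct ib_mr */
	int lock;		/* stands in for the spinlock */
	unsigned int flags;	/* first field to be cleared */
	unsigned long size;
	void *pages;
};

/* Zero every byte from 'flags' to the end of the structure. */
static void sketch_mr_deletenew(struct sketch_mr *mr)
{
	size_t offset = offsetof(struct sketch_mr, flags);

	memset((char *)mr + offset, 0, sizeof(*mr) - offset);
}
```

The function depends on flags being the first member after the preserved
ones, so the field order is part of the contract either way.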


From rolandd at cisco.com  Sat Feb 18 11:58:02 2006
From: rolandd at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 16:58:02 -0800
Subject: [PATCH 22/22] ehca Makefile/Kconfig changes
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <20060218005801.13620.38625.stgit@localhost.localdomain>

From: Roland Dreier <rolandd at cisco.com>


---

 drivers/infiniband/Kconfig         |    2 ++
 drivers/infiniband/Makefile        |    1 +
 drivers/infiniband/hw/ehca/Kbuild  |    8 ++++++++
 drivers/infiniband/hw/ehca/Kconfig |    6 ++++++
 4 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index bdf0891..2b3ad03 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -31,6 +31,8 @@ config INFINIBAND_USER_ACCESS
 
 source "drivers/infiniband/hw/mthca/Kconfig"
 
+source "drivers/infiniband/hw/ehca/Kconfig"
+
 source "drivers/infiniband/ulp/ipoib/Kconfig"
 
 source "drivers/infiniband/ulp/srp/Kconfig"
diff --git a/drivers/infiniband/Makefile b/drivers/infiniband/Makefile
index a43fb34..eb7788f 100644
--- a/drivers/infiniband/Makefile
+++ b/drivers/infiniband/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_INFINIBAND)		+= core/
 obj-$(CONFIG_INFINIBAND_MTHCA)		+= hw/mthca/
+obj-$(CONFIG_INFINIBAND_EHCA)		+= hw/ehca/
 obj-$(CONFIG_INFINIBAND_IPOIB)		+= ulp/ipoib/
 obj-$(CONFIG_INFINIBAND_SRP)		+= ulp/srp/
diff --git a/drivers/infiniband/hw/ehca/Kbuild b/drivers/infiniband/hw/ehca/Kbuild
new file mode 100644
index 0000000..7b610b1
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/Kbuild
@@ -0,0 +1,8 @@
+obj-$(CONFIG_INFINIBAND_EHCA) += hcad_mod.o 
+
+hcad_mod-objs = ehca_main.o ehca_hca.o ipz_pt_fn.o ehca_classes.o ehca_av.o \
+	ehca_pd.o ehca_mrmw.o ehca_cq.o ehca_sqp.o ehca_qp.o hcp_sense.o \
+	ehca_eq.o ehca_irq.o hcp_phyp.o ehca_mcast.o ehca_reqs.o \
+	ehca_uverbs.o
+
+CFLAGS +=-DP_SERIES -DEHCA_USE_HCALL -DEHCA_USE_HCALL_KERNEL
diff --git a/drivers/infiniband/hw/ehca/Kconfig b/drivers/infiniband/hw/ehca/Kconfig
new file mode 100644
index 0000000..b875649
--- /dev/null
+++ b/drivers/infiniband/hw/ehca/Kconfig
@@ -0,0 +1,6 @@
+config INFINIBAND_EHCA
+       tristate "eHCA support"
+       depends on IBMEBUS && INFINIBAND
+       ---help---
+       This is a low level device driver for the IBM
+       GX based Host channel adapters (HCAs)
\ No newline at end of file


From greg at kroah.com  Sat Feb 18 12:54:13 2006
From: greg at kroah.com (Greg KH)
Date: Fri, 17 Feb 2006 17:54:13 -0800
Subject: [PATCH 04/22] OF adapter probing
In-Reply-To: <20060218005712.13620.82908.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005712.13620.82908.stgit@localhost.localdomain>
Message-ID: <20060218015413.GA17653@kroah.com>

On Fri, Feb 17, 2006 at 04:57:14PM -0800, Roland Dreier wrote:
> +int hipz_count_adapters(void)
> +{
> +	int num = 0;
> +	struct device_node *dn = NULL;
> +
> +	EDEB_EN(7, "");
> +
> +	while ((dn = of_find_node_by_name(dn, "lhca"))) {
> +		num++;
> +	}

The { } are not needed here.

> +
> +	of_node_put(dn);
> +
> +	if (num == 0) {
> +		EDEB_ERR(4, "No lhca node name was found in the"
> +			 " Open Firmware device tree.");
> +		return -ENODEV;
> +	}
> +
> +	EDEB(6, " ... found %x adapter(s)", num);
> +
> +	EDEB_EX(7, "num=%x", num);
> +
> +	return num;
> +}
> +
> +int hipz_probe_adapters(char **adapter_list)
> +{
> +	int ret = 0;
> +	int num = 0;
> +	struct device_node *dn = NULL;
> +	char *loc;
> +
> +	EDEB_EN(7, "adapter_list=%p", adapter_list);
> +
> +	while ((dn = of_find_node_by_name(dn, "lhca"))) {
> +		loc = get_property(dn, "ibm,loc-code", NULL);
> +		if (loc == NULL) {
> +			EDEB_ERR(4, "No ibm,loc-code property for"
> +				 " lhca Open Firmware device tree node.");
> +			ret = -ENODEV;
> +			goto probe_adapters0;
> +		}
> +
> +		adapter_list[num] = loc;
> +		EDEB(6, " ... found adapter[%x] with loc-code: %s", num, loc);
> +		num++;
> +	}
> +
> +      probe_adapters0:
> +	of_node_put(dn);

Please use tabs everywhere.

Hm, wait, that's a label.  Put it where it belongs, over on the left
please.
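
To make both points concrete, here is a rough userspace sketch of the shape
being asked for: no braces around the single-statement loop body, and the
goto label starting in column 0. It is not the kernel code. struct node,
the nodes[] table, its loc-code strings and find_next_node() are all
hypothetical stand-ins for struct device_node and of_find_node_by_name().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node. */
struct node {
	const char *name;
	const char *loc_code;	/* NULL models a missing "ibm,loc-code" */
};

/* Made-up device tree contents, for illustration only. */
static struct node nodes[] = {
	{ "lhca", "U1.1-P1" },
	{ "isa",  NULL },
	{ "lhca", "U1.2-P1" },
};

/* Hypothetical iterator: next node called @name after @from, or NULL. */
static struct node *find_next_node(struct node *from, const char *name)
{
	size_t i = from ? (size_t)(from - nodes) + 1 : 0;

	for (; i < sizeof(nodes) / sizeof(nodes[0]); i++)
		if (strcmp(nodes[i].name, name) == 0)
			return &nodes[i];
	return NULL;
}

static int count_adapters(void)
{
	struct node *n = NULL;
	int num = 0;

	/* single statement in the body, so no braces */
	while ((n = find_next_node(n, "lhca")))
		num++;

	return num ? num : -ENODEV;
}

static int probe_adapters(const char **list)
{
	struct node *n = NULL;
	int num = 0;
	int ret;

	while ((n = find_next_node(n, "lhca"))) {
		if (!n->loc_code) {
			ret = -ENODEV;
			goto out;
		}
		list[num++] = n->loc_code;
	}
	ret = num;
out:
	return ret;
}
```
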

thanks,

greg k-h


From greg at kroah.com  Sat Feb 18 12:58:08 2006
From: greg at kroah.com (Greg KH)
Date: Fri, 17 Feb 2006 17:58:08 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218005707.13620.20538.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
Message-ID: <20060218015808.GB17653@kroah.com>

On Fri, Feb 17, 2006 at 04:57:07PM -0800, Roland Dreier wrote:
> From: Roland Dreier <rolandd at cisco.com>
> 
> This is a very large file with way too much code for a .h file.
> The functions look too big to be inlined also.  Is there any way
> for this code to move to a .c file?

Roland, your comments are fine, but what about the original author's
descriptions of what each patch is?

Come on, IBM allows developers to post code to lkml, just look at the
archives for proof.  For them to use a proxy like this is very strange,
and also, there is no Signed-off-by: record from the original authors,
which is not ok.

And why aren't you using the standard firmware interface in the kernel?

> +#ifndef CONFIG_PPC64
> +#ifndef Z_SERIES
> +#warning "included with wrong target, this is a p file"
> +#endif
> +#endif

It's a "p" file?  What's that?

Is this even needed?

thanks,

greg k-h


From rdreier at cisco.com  Sat Feb 18 13:04:56 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Fri, 17 Feb 2006 18:04:56 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218015808.GB17653@kroah.com> (Greg KH's message of "Fri,
	17 Feb 2006 17:58:08 -0800")
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com>
Message-ID: <aday809bewn.fsf@cisco.com>

    Greg> Roland, your comments are fine, but what about the original
    Greg> author's descriptions of what each patch is?

This is actually me breaking up a giant driver into pieces small
enough to post to lkml without hitting the 100 KB limit.

This is just an RFC -- I assume the driver is going to get merged in
the end as one big git changeset with a changelog like "add driver for
IBM eHCA InfiniBand adapters".

    Greg> Come on, IBM allows developers to post code to lkml, just
    Greg> look at the archives for proof.  For them to use a proxy
    Greg> like this is very strange, and also, there is no
    Greg> Signed-off-by: record from the original authors, which is
    Greg> not ok.

Well, the eHCA guys tell me that they can't post patches to lkml.

You're right that the final merge will have to have an IBM
Signed-off-by: line but as I said this is just an RFC.  There are many
reasons beyond patch format issues that make this stuff unmergeable as-is.

    Greg> And why aren't you using the standard firmware interface in
    Greg> the kernel?

This is actually stuff to talk to the firmware that sits below the
kernel on IBM ppc64 machines, not an interface to load device firmware
from userspace.

 - R.


From apgo at patchbomb.org  Sat Feb 18 21:08:49 2006
From: apgo at patchbomb.org (Arthur Othieno)
Date: Sat, 18 Feb 2006 05:08:49 -0500
Subject: [PATCH] powerpc: ARCH=powerpc build fix for CONFIG_SYSVIPC=n ||
	CONFIG_SYSCTL=n
Message-ID: <20060218100849.GA1869@krypton>

When using a default config generated by just `make menuconfig'
(i.e. none of arch/powerpc/configs/*), linking .tmp_vmlinux1 barfs with:

  arch/powerpc/kernel/built-in.o: In function `.sys_call_table':
  : undefined reference to `.compat_sys_ipc'
  arch/powerpc/kernel/built-in.o: In function `.sys_call_table':
  : undefined reference to `.compat_sys_sysctl'
  make: *** [.tmp_vmlinux1] Error 1

These definitions are wrapped in #ifdef CONFIG_{SYSVIPC,SYSCTL} respectively,
so the symbols are missing when either option is off.  Fix up by adding stubs
that just return -ENOSYS when CONFIG_SYSVIPC=n || CONFIG_SYSCTL=n.

Signed-off-by: Arthur Othieno <apgo at patchbomb.org>

---

Paulus, any chance this can go in before 2.6.16 is Out There(tm) ?

 arch/powerpc/kernel/sys_ppc32.c |   17 ++++++++++++++---
 1 files changed, 14 insertions(+), 3 deletions(-)

122877b2f58236c61f87797c2908a9ab1e3e451d
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index 475249d..272beb3 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -440,7 +440,13 @@ long compat_sys_ipc(u32 call, u32 first,
 
 	return -ENOSYS;
 }
-#endif
+#else
+long compat_sys_ipc(u32 call, u32 first, u32 second, u32 third, compat_uptr_t ptr,
+	       u32 fifth)
+{
+	return -ENOSYS;
+}
+#endif /* CONFIG_SYSVIPC */
 
 /* Note: it is necessary to treat out_fd and in_fd as unsigned ints, 
  * with the corresponding cast to a signed int to insure that the 
@@ -818,7 +824,6 @@ asmlinkage long compat_sys_umask(u32 mas
 	return sys_umask((int)mask);
 }
 
-#ifdef CONFIG_SYSCTL
 struct __sysctl_args32 {
 	u32 name;
 	int nlen;
@@ -829,6 +834,7 @@ struct __sysctl_args32 {
 	u32 __unused[4];
 };
 
+#ifdef CONFIG_SYSCTL
 asmlinkage long compat_sys_sysctl(struct __sysctl_args32 __user *args)
 {
 	struct __sysctl_args32 tmp;
@@ -868,7 +874,12 @@ asmlinkage long compat_sys_sysctl(struct
 	}
 	return error;
 }
-#endif
+#else
+asmlinkage long compat_sys_sysctl(struct __sysctl_args32 __user *args)
+{
+	return -ENOSYS;
+}
+#endif /* CONFIG_SYSCTL */
 
 unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
 			  unsigned long prot, unsigned long flags,
-- 
1.1.5


From heiko.carstens at de.ibm.com  Sat Feb 18 21:59:36 2006
From: heiko.carstens at de.ibm.com (Heiko Carstens)
Date: Sat, 18 Feb 2006 11:59:36 +0100
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218015808.GB17653@kroah.com>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com>
Message-ID: <20060218105936.GD9216@osiris.boeblingen.de.ibm.com>

> Come on, IBM allows developers to post code to lkml, just look at the
> archives for proof.  For them to use a proxy like this is very strange,

Things aren't always that easy at IBM. You should know best :)

Heiko


From hch at infradead.org  Sat Feb 18 23:17:53 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 12:17:53 +0000
Subject: [PATCH 01/22] Add powerpc-specific clear_cacheline(),
	which just compiles to "dcbz".
In-Reply-To: <20060218005704.13620.88286.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005704.13620.88286.stgit@localhost.localdomain>
Message-ID: <20060218121753.GC911@infradead.org>

On Fri, Feb 17, 2006 at 04:57:04PM -0800, Roland Dreier wrote:
> From: Roland Dreier <rolandd at cisco.com>
> 
> This is horribly non-portable.

Yes.  If this is needed it should go to an asm/ header, not in a driver.


From hch at infradead.org  Sat Feb 18 23:19:13 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 12:19:13 +0000
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218005707.13620.20538.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
Message-ID: <20060218121913.GD911@infradead.org>

On Fri, Feb 17, 2006 at 04:57:07PM -0800, Roland Dreier wrote:
> From: Roland Dreier <rolandd at cisco.com>
> 
> This is a very large file with way too much code for a .h file.
> The functions look too big to be inlined also.  Is there any way
> for this code to move to a .c file?
> ---
> 
>  drivers/infiniband/hw/ehca/hcp_if.h | 2022 +++++++++++++++++++++++++++++++++++

> +#include "ehca_tools.h"
> +#include "hipz_structs.h"
> +#include "ehca_classes.h"
> +
> +#ifndef EHCA_USE_HCALL
> +#include "hcz_queue.h"
> +#include "hcz_mrmw.h"
> +#include "hcz_emmio.h"
> +#include "sim_prom.h"
> +#endif
> +#include "hipz_fns.h"
> +#include "hcp_sense.h"
> +#include "ehca_irq.h"
> +
> +#ifndef CONFIG_PPC64
> +#ifndef Z_SERIES
> +#warning "included with wrong target, this is a p file"
> +#endif
> +#endif
> +
> +#ifdef EHCA_USE_HCALL
> +
> +#ifndef EHCA_USERDRIVER
> +#include "hcp_phyp.h"
> +#else
> +#include "testbench/hcallbridge.h"
> +#endif
> +#endif

the ifdefs should all go away and the build system should make sure it's
only built for the right platforms.


From hch at infradead.org  Sat Feb 18 23:20:11 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 12:20:11 +0000
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <aday809bewn.fsf@cisco.com>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
Message-ID: <20060218122011.GE911@infradead.org>

On Fri, Feb 17, 2006 at 06:04:56PM -0800, Roland Dreier wrote:
>     Greg> Roland, your comments are fine, but what about the original
>     Greg> author's descriptions of what each patch is?
> 
> This is actually me breaking up a giant driver into pieces small
> enough to post to lkml without hitting the 100 KB limit.
> 
> This is just an RFC -- I assume the driver is going to get merged in
> the end as one big git changeset with a changelog like "add driver for
> IBM eHCA InfiniBand adapters".
> 
>     Greg> Come on, IBM allows developers to post code to lkml, just
>     Greg> look at the archives for proof.  For them to use a proxy
>     Greg> like this is very strange, and also, there is no
>     Greg> Signed-off-by: record from the original authors, which is
>     Greg> not ok.
> 
> Well, the eHCA guys tell me that they can't post patches to lkml.

Then they lie.  And not posting to lkml is a good reason not to merge
an otherwise perfect driver.  (which this one is far from)


From hch at infradead.org  Sat Feb 18 23:23:17 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 12:23:17 +0000
Subject: [PATCH 03/22] pHype specific stuff
In-Reply-To: <20060218005709.13620.77409.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005709.13620.77409.stgit@localhost.localdomain>
Message-ID: <20060218122317.GF911@infradead.org>

> +u64 hipz_galpa_load(struct h_galpa galpa, u32 offset)
> +{
> +	u64 addr = galpa.fw_handle + offset;
> +	u64 out;
> +	EDEB_EN(7, "addr=%lx offset=%x ", addr, offset);
> +	out = *(u64 *) addr;

why does this cast a u64 to a pointer?

> +#ifndef EHCA_USERDRIVER
> +inline static int hcall_map_page(u64 physaddr, u64 * mapaddr)
> +{
> +	*mapaddr = (u64)(ioremap(physaddr, 4096));
> +
> +	EDEB(7, "ioremap physaddr=%lx mapaddr=%lx", physaddr, *mapaddr);
> +	return 0;

ioremap returns void __iomem * and casting that to any integer type is
wrong.

> +inline static int hcall_unmap_page(u64 mapaddr)
> +{
> +	EDEB(7, "mapaddr=%lx", mapaddr);
> +	iounmap((void *)(mapaddr));
> +	return 0;

ditto for iounmap and casting back.

guys, please run this driver through sparse, thanks.
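
roughly what I mean, as a userspace sketch only: keep the mapping as a
pointer for its whole lifetime and never store it in a u64.  fake_ioremap(),
fake_iounmap(), fake_readq() and struct galpa below are hypothetical
stand-ins backed by a plain array, and __iomem is defined empty here because
outside the kernel there is no sparse annotation to check it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In the kernel __iomem is a sparse annotation; here it expands to nothing. */
#ifndef __iomem
#define __iomem
#endif

/* Hypothetical register window standing in for real device memory. */
static uint64_t fake_regs[512];

static void __iomem *fake_ioremap(uint64_t physaddr, size_t size)
{
	(void)physaddr;
	return size <= sizeof(fake_regs) ? (void __iomem *)fake_regs : NULL;
}

static void fake_iounmap(void __iomem *addr)
{
	(void)addr;
}

static uint64_t fake_readq(const void __iomem *addr)
{
	return *(const volatile uint64_t *)addr;
}

/* The handle carries the mapping as a pointer, not as an integer. */
struct galpa {
	void __iomem *base;
};

static int galpa_map(struct galpa *g, uint64_t physaddr)
{
	g->base = fake_ioremap(physaddr, 4096);
	return g->base ? 0 : -1;
}

/* @offset is a byte offset and is assumed to be 8-byte aligned. */
static uint64_t galpa_load(const struct galpa *g, uint32_t offset)
{
	return fake_readq((const char __iomem *)g->base + offset);
}

static void galpa_unmap(struct galpa *g)
{
	fake_iounmap(g->base);
	g->base = NULL;
}
```

with the type kept intact, sparse has something to complain about when a
mapped address is dereferenced directly instead of going through an accessor.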

> +	/* if phype returns LongBusyXXX,
> +	 * we retry several times, but not forever */
> +	for (i = 0; i < 5; i++) {
> +		__asm__ __volatile__("mr 3,%10\n"
> +				     "mr 4,%11\n"
> +				     "mr 5,%12\n"

assembly code under drivers/ is not acceptable.  please create
an <asm/ehca.h> for it or something similar.
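
the bounded retry itself can stay in plain C in the driver once the asm sits
behind a function.  a minimal sketch, assuming the arch side exposes some
hcall wrapper: hcall_fn, hcall_retry(), fake_hcall() and the SKETCH_*
constants are hypothetical names and values for illustration, not the real
hypervisor interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical return codes and retry budget, for illustration only. */
#define SKETCH_H_SUCCESS	0
#define SKETCH_H_LONG_BUSY	1
#define SKETCH_MAX_RETRIES	5

/* Assumed shape of an arch-provided hcall wrapper. */
typedef int64_t (*hcall_fn)(uint64_t opcode, uint64_t arg);

/* Retry while the hypervisor reports long busy, but not forever. */
static int64_t hcall_retry(hcall_fn do_hcall, uint64_t opcode, uint64_t arg)
{
	int64_t rc = SKETCH_H_LONG_BUSY;
	int i;

	for (i = 0; i < SKETCH_MAX_RETRIES; i++) {
		rc = do_hcall(opcode, arg);
		if (rc != SKETCH_H_LONG_BUSY)
			break;
	}
	return rc;
}

/* Test double: reports long busy for the first 'busy_left' calls. */
static int busy_left;
static int calls;

static int64_t fake_hcall(uint64_t opcode, uint64_t arg)
{
	(void)opcode;
	(void)arg;
	calls++;
	if (busy_left > 0) {
		busy_left--;
		return SKETCH_H_LONG_BUSY;
	}
	return SKETCH_H_SUCCESS;
}
```
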


From hch at infradead.org  Sat Feb 18 23:29:10 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 12:29:10 +0000
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218122631.GA30535@granada.merseine.nu>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
Message-ID: <20060218122910.GA1521@infradead.org>

On Sat, Feb 18, 2006 at 02:26:31PM +0200, Muli Ben-Yehuda wrote:
> I don't speak for IBM or the authors, but there are perfectly
> reasonable reasons to ask someone else to post a patch on your behalf
> - including but not limited to only being able to use Lotus Notes
> with one's IBM email. I'm sure you've all seen the travesties that
> Notes inflicts on inline patches.

sure.  and there's free webmail accounts that take about 10 minutes to
set up as well as various people offering shell access to linux machines
if you ask nicely.  so this really is not an issue.  I think this is more
about ibm politics (especially in boeblingen) sometimes making it pretty
hard to post things.  But that doesn't mean it's impossible, it just means
they didn't try hard enough.


From arjan at infradead.org  Sat Feb 18 23:32:35 2006
From: arjan at infradead.org (Arjan van de Ven)
Date: Sat, 18 Feb 2006 13:32:35 +0100
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218122631.GA30535@granada.merseine.nu>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
Message-ID: <1140265955.4035.19.camel@laptopd505.fenrus.org>

On Sat, 2006-02-18 at 14:26 +0200, Muli Ben-Yehuda wrote:
> On Sat, Feb 18, 2006 at 12:20:11PM +0000, Christoph Hellwig wrote:
> 
> > > Well, the eHCA guys tell me that they can't post patches to lkml.
> > 
> > Then they lie.  And not posting to lkml is a good reason not to merge
> > an otherwise perfect driver.  (which this one is far from)
> 
> I don't speak for IBM or the authors, but there are perfectly
> reasonable reasons to ask someone else to post a patch on your behalf
> - including but not limited to only being able to use Lotus Notes
> with one's IBM email. I'm sure you've all seen the travesties that
> Notes inflicts on inline patches.

there are ways around that with webmail etc.

The bigger issue is: if people can't be bothered to do those steps, why
would they be bothered to do this for maintenance and bugfixes etc etc?
Basically it's now already a de-facto unmaintained driver....


From info at schihei.de  Sat Feb 18 23:46:10 2006
From: info at schihei.de (Heiko J Schick)
Date: Sat, 18 Feb 2006 13:46:10 +0100
Subject: [openib-general] [PATCH 04/22] OF adapter probing
In-Reply-To: <20060218005712.13620.82908.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005712.13620.82908.stgit@localhost.localdomain>
Message-ID: <ADDBC190-7388-4904-9ECB-489F1D199AB1@schihei.de>

Hello Roland,

sorry, this file is not used anymore. The functions

	int hipz_count_adapters(void);
	int hipz_probe_adapters(char **adapter_list);
	u64 hipz_get_adapter_handle(char *adapter);

are nowadays handled by the IBMEBUS [1] bus device driver.

[1]: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d7a301033f1990188f65abf4fe8e5b90ef0e3888

Regards,
	Heiko

On Feb 18, 2006, at 1:57 AM, Roland Dreier wrote:

> From: Roland Dreier <rolandd at cisco.com>
>
> hipz_probe_adapters() looks a little funny -- it seems to bail out
> of all the remaining adapters if one of them isn't quite right.
> ---
>
>  drivers/infiniband/hw/ehca/hcp_sense.c |  144 ++++++++++++++++++++++++++++++++
>  drivers/infiniband/hw/ehca/hcp_sense.h |  136 ++++++++++++++++++++++++++++++
>  2 files changed, 280 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/infiniband/hw/ehca/hcp_sense.c b/drivers/infiniband/hw/ehca/hcp_sense.c
> new file mode 100644
> index 0000000..83fa4a3
> --- /dev/null
> +++ b/drivers/infiniband/hw/ehca/hcp_sense.c
> @@ -0,0 +1,144 @@
> +/*
> + *  IBM eServer eHCA Infiniband device driver for Linux on POWER
> + *
> + *  ehca detection and query code for POWER
> + *
> + *  Authors: Heiko J Schick <schickhj at de.ibm.com>
> + *
> + *  Copyright (c) 2005 IBM Corporation
> + *
> + *  All rights reserved.
> + *
> + *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
> + *  BSD.
> + *
> + * OpenIB BSD License
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions are met:
> + *
> + * Redistributions of source code must retain the above copyright notice, this
> + * list of conditions and the following disclaimer.
> + *
> + * Redistributions in binary form must reproduce the above copyright notice,
> + * this list of conditions and the following disclaimer in the documentation
> + * and/or other materials
> + * provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
> + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> + * POSSIBILITY OF SUCH DAMAGE.
> + *
> + *  $Id: hcp_sense.c,v 1.10 2006/02/06 10:17:34 schickhj Exp $
> + */
> +
> +#define DEB_PREFIX "snse"
> +
> +#include "ehca_kernel.h"
> +#include "ehca_tools.h"
> +
> +int hipz_count_adapters(void)
> +{
> +	int num = 0;
> +	struct device_node *dn = NULL;
> +
> +	EDEB_EN(7, "");
> +
> +	while ((dn = of_find_node_by_name(dn, "lhca"))) {
> +		num++;
> +	}
> +
> +	of_node_put(dn);
> +
> +	if (num == 0) {
> +		EDEB_ERR(4, "No lhca node name was found in the"
> +			 " Open Firmware device tree.");
> +		return -ENODEV;
> +	}
> +
> +	EDEB(6, " ... found %x adapter(s)", num);
> +
> +	EDEB_EX(7, "num=%x", num);
> +
> +	return num;
> +}
> +
> +int hipz_probe_adapters(char **adapter_list)
> +{
> +	int ret = 0;
> +	int num = 0;
> +	struct device_node *dn = NULL;
> +	char *loc;
> +
> +	EDEB_EN(7, "adapter_list=%p", adapter_list);
> +
> +	while ((dn = of_find_node_by_name(dn, "lhca"))) {
> +		loc = get_property(dn, "ibm,loc-code", NULL);
> +		if (loc == NULL) {
> +			EDEB_ERR(4, "No ibm,loc-code property for"
> +				 " lhca Open Firmware device tree node.");
> +			ret = -ENODEV;
> +			goto probe_adapters0;
> +		}
> +
> +		adapter_list[num] = loc;
> +		EDEB(6, " ... found adapter[%x] with loc-code: %s", num, loc);
> +		num++;
> +	}
> +
> +      probe_adapters0:
> +	of_node_put(dn);
> +
> +	EDEB_EX(7, "ret=%x", ret);
> +
> +	return ret;
> +}
> +
> +u64 hipz_get_adapter_handle(char *adapter)
> +{
> +	struct device_node *dn = NULL;
> +	char *loc;
> +	u64 *u64data = NULL;
> +	u64 ret = 0;
> +
> +	EDEB_EN(7, "adapter=%p", adapter);
> +
> +	while ((dn = of_find_node_by_name(dn, "lhca"))) {
> +		loc = get_property(dn, "ibm,loc-code", NULL);
> +		if (loc == NULL) {
> +			EDEB_ERR(4, "No ibm,loc-code property for"
> +				 " lhca Open Firmware device tree node.");
> +			goto get_adapter_handle0;
> +		}
> +
> +		if (strcmp(loc, adapter) == 0) {
> +			u64data =
> +			    (u64 *) get_property(dn, "ibm,hca-handle", NULL);
> +			break;
> +		}
> +	}
> +
> +	if (u64data == NULL) {
> +		EDEB_ERR(4, "No ibm,hca-handle property for"
> +			 " lhca Open Firmware device tree node with"
> +			 " ibm,loc-code: %s.", adapter);
> +		goto get_adapter_handle0;
> +	}
> +
> +	ret = *u64data;
> +
> +      get_adapter_handle0:
> +	of_node_put(dn);
> +
> +	EDEB_EX(7, "ret=%lx",ret);
> +
> +	return ret;
> +}
> diff --git a/drivers/infiniband/hw/ehca/hcp_sense.h b/drivers/infiniband/hw/ehca/hcp_sense.h
> new file mode 100644
> index 0000000..a49040b
> --- /dev/null
> +++ b/drivers/infiniband/hw/ehca/hcp_sense.h
> @@ -0,0 +1,136 @@
> +/*
> + *  IBM eServer eHCA Infiniband device driver for Linux on POWER
> + *
> + *  ehca detection and query code for POWER
> + *
> + *  Authors: Heiko J Schick <schickhj at de.ibm.com>
> + *
> + *  Copyright (c) 2005 IBM Corporation
> + *
> + *  All rights reserved.
> + *
> + *  This source code is distributed under a dual license of GPL v2.0 and OpenIB
> + *  BSD.
> + *
> + * OpenIB BSD License
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions are met:
> + *
> + * Redistributions of source code must retain the above copyright notice, this
> + * list of conditions and the following disclaimer.
> + *
> + * Redistributions in binary form must reproduce the above copyright notice,
> + * this list of conditions and the following disclaimer in the documentation
> + * and/or other materials
> + * provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
> + * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> + * POSSIBILITY OF SUCH DAMAGE.
> + *
> + *  $Id: hcp_sense.h,v 1.11 2006/02/06 10:17:34 schickhj Exp $
> + */
> +
> +#ifndef HCP_SENSE_H
> +#define HCP_SENSE_H
> +
> +int hipz_count_adapters(void);
> +int hipz_probe_adapters(char **adapter_list);
> +u64 hipz_get_adapter_handle(char *adapter);
> +
> +/* query hca response block */
> +struct query_hca_rblock {
> +	u32 cur_reliable_dg;
> +	u32 cur_qp;
> +	u32 cur_cq;
> +	u32 cur_eq;
> +	u32 cur_mr;
> +	u32 cur_mw;
> +	u32 cur_ee_context;
> +	u32 cur_mcast_grp;
> +	u32 cur_qp_attached_mcast_grp;
> +	u32 reserved1;
> +	u32 cur_ipv6_qp;
> +	u32 cur_eth_qp;
> +	u32 cur_hp_mr;
> +	u32 reserved2[3];
> +	u32 max_rd_domain;
> +	u32 max_qp;
> +	u32 max_cq;
> +	u32 max_eq;
> +	u32 max_mr;
> +	u32 max_hp_mr;
> +	u32 max_mw;
> +	u32 max_mrwpte;
> +	u32 max_special_mrwpte;
> +	u32 max_rd_ee_context;
> +	u32 max_mcast_grp;
> +	u32 max_qps_attached_all_mcast_grp;
> +	u32 max_qps_attached_mcast_grp;
> +	u32 max_raw_ipv6_qp;
> +	u32 max_raw_ethy_qp;
> +	u32 internal_clock_frequency;
> +	u32 max_pd;
> +	u32 max_ah;
> +	u32 max_cqe;
> +	u32 max_wqes_wq;
> +	u32 max_partitions;
> +	u32 max_rr_ee_context;
> +	u32 max_rr_qp;
> +	u32 max_rr_hca;
> +	u32 max_act_wqs_ee_context;
> +	u32 max_act_wqs_qp;
> +	u32 max_sge;
> +	u32 max_sge_rd;
> +	u32 memory_page_size_supported;
> +	u64 max_mr_size;
> +	u32 local_ca_ack_delay;
> +	u32 num_ports;
> +	u32 vendor_id;
> +	u32 vendor_part_id;
> +	u32 hw_ver;
> +	u64 node_guid;
> +	u64 hca_cap_indicators;
> +	u32 data_counter_register_size;
> +	u32 max_shared_rq;
> +	u32 max_isns_eq;
> +	u32 max_neq;
> +} __attribute__ ((packed));
> +
> +/* query port response block */
> +struct query_port_rblock {
> +	u32 state;
> +	u32 bad_pkey_cntr;
> +	u32 lmc;
> +	u32 lid;
> +	u32 subnet_timeout;
> +	u32 qkey_viol_cntr;
> +	u32 sm_sl;
> +	u32 sm_lid;
> +	u32 capability_mask;
> +	u32 init_type_reply;
> +	u32 pkey_tbl_len;
> +	u32 gid_tbl_len;
> +	u64 gid_prefix;
> +	u32 port_nr;
> +	u16 pkey_entries[16];
> +	u8  reserved1[32];
> +	u32 trent_size;
> +	u32 trbuf_size;
> +	u64 max_msg_sz;
> +	u32 max_mtu;
> +	u32 vl_cap;
> +	u8  reserved2[1900];
> +	u64 guid_entries[255];
> +} __attribute__ ((packed));
> +
> +#endif


From mulix at mulix.org  Sat Feb 18 23:26:31 2006
From: mulix at mulix.org (Muli Ben-Yehuda)
Date: Sat, 18 Feb 2006 14:26:31 +0200
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218122011.GE911@infradead.org>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
Message-ID: <20060218122631.GA30535@granada.merseine.nu>

On Sat, Feb 18, 2006 at 12:20:11PM +0000, Christoph Hellwig wrote:

> > Well, the eHCA guys tell me that they can't post patches to lkml.
> 
> Then they lie.  And not posting to lkml is a good reason not to merge
> an otherwise perfect driver.  (which this one is far from)

I don't speak for IBM or the authors, but there are perfectly
reasonable reasons to ask someone else to post a patch on your behalf
- including but not limited to only being able to use Lotus Notes
with one's IBM email. I'm sure you've all seen the travesties that
Notes inflicts on inline patches.

Cheers,
Muli
-- 
Muli Ben-Yehuda
http://www.mulix.org | http://mulix.livejournal.com/


From mostrows at watson.ibm.com  Sun Feb 19 01:24:44 2006
From: mostrows at watson.ibm.com (Michal Ostrowski)
Date: Sat, 18 Feb 2006 09:24:44 -0500
Subject: [PATCH] Fix race condition in hvc console.
Message-ID: <E1FAT0u-0004no-0P@heater.watson.ibm.com>


tty_schedule_flip() would schedule a thread that would call flush_to_ldisc().
If tty_buffer_request_room() gets called prior to that thread running --
which is likely in this loop in hvc_poll(), it would set the active flag
in the tty buffer and consequently flush_to_ldisc() would ignore it.

The result is that input on the hvc console is not processed.

This fix calls tty_flip_buffer_push (and flags the tty as
"low_latency").  The push to the ldisc thus happens synchronously.

Signed-off-by: Michal Ostrowski <mostrows at watson.ibm.com>

---

 drivers/char/hvc_console.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

1d719e2972f0c02d62a428aa84ca60793ad79666
diff --git a/drivers/char/hvc_console.c b/drivers/char/hvc_console.c
index 1994a92..67f368f 100644
--- a/drivers/char/hvc_console.c
+++ b/drivers/char/hvc_console.c
@@ -335,6 +335,8 @@ static int hvc_open(struct tty_struct *t
 	} /* else count == 0 */
 
 	tty->driver_data = hp;
+	tty->low_latency = 1; /* Makes flushes to ldisc synchronous. */
+
 	hp->tty = tty;
 	/* Save for request_irq outside of spin_lock. */
 	irq = hp->irq;
@@ -633,9 +635,6 @@ static int hvc_poll(struct hvc_struct *h
 			tty_insert_flip_char(tty, buf[i], 0);
 		}
 
-		if (count)
-			tty_schedule_flip(tty);
-
 		/*
 		 * Account for the total amount read in one loop, and if above
 		 * 64 bytes, we do a quick schedule loop to let the tty grok
@@ -656,6 +655,10 @@ static int hvc_poll(struct hvc_struct *h
  bail:
 	spin_unlock_irqrestore(&hp->lock, flags);
 
+	if (read_total) {
+		tty_flip_buffer_push(tty);
+	}
+	
 	return poll_mask;
 }
 
-- 
1.1.4.g0b63-dirty
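
The race described in the changelog above can be pictured with a toy
userspace model. This is only a sketch built from that description, not the
tty layer itself: every toy_* name is hypothetical, and the assumption that
scheduling a flip clears the "active" flag is mine. The loop asks for room
once more before it finds there is nothing left to read, so the buffer is
marked active again by the time the deferred flush runs.

```c
#include <assert.h>
#include <string.h>

/* Toy model of the behaviour described above; this is not kernel code. */
struct toy_tty {
	char data[64];
	int used;          /* bytes queued by the driver */
	int active;        /* buffer is still being filled */
	int flush_pending; /* a deferred flush has been scheduled */
	int delivered;     /* bytes handed to the toy line discipline */
};

/* Asking for room marks the buffer as in use by the driver. */
static void toy_request_room(struct toy_tty *t)
{
	t->active = 1;
}

/* The flush skips a buffer that is still marked active. */
static void toy_flush(struct toy_tty *t)
{
	if (t->active)
		return;
	t->delivered += t->used;
	t->used = 0;
}

/* Deferred path: release the buffer and let a worker flush it later. */
static void toy_schedule_flip(struct toy_tty *t)
{
	t->active = 0;
	t->flush_pending = 1;
}

/* Synchronous path: release the buffer and flush it right away. */
static void toy_push_sync(struct toy_tty *t)
{
	t->active = 0;
	toy_flush(t);
}

/* Returns the number of bytes that reached the toy line discipline. */
static int toy_poll(struct toy_tty *t, const char *input, int sync_push)
{
	size_t len = strlen(input);
	size_t read_total = 0;

	assert(len < sizeof(t->data));
	for (;;) {
		toy_request_room(t);    /* runs even when nothing is left */
		if (read_total == len)
			break;
		t->data[t->used++] = input[read_total++];
		if (!sync_push)
			toy_schedule_flip(t);
	}
	if (sync_push && read_total)
		toy_push_sync(t);

	/* The deferred worker only gets to run after the loop. */
	if (t->flush_pending) {
		t->flush_pending = 0;
		toy_flush(t);
	}
	return t->delivered;
}
```

In this model the deferred variant delivers nothing for a two-character
input, while the variant that pushes synchronously after the loop delivers
both characters.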


From sid at us.ibm.com  Sun Feb 19 01:51:43 2006
From: sid at us.ibm.com (Sidney Manning)
Date: Sat, 18 Feb 2006 08:51:43 -0600
Subject: [FYI/PATCH 3/4] Build fixes for IBM Full System Simulator
In-Reply-To: <20060217183254.GA3951@lst.de>
Message-ID: <OF22F321CD.5D2DB891-ON85257119.004FD6D6-86257119.0051C260@us.ibm.com>


This patch is not intended for mainline inclusion.  It is intended to cover
up an assembler bug that is unique to the cross toolchain we use to compile
the kernel for the simulator: "Fatal error: Neither Power nor PowerPC
opcodes were selected."  The build was selecting both -mno-altivec and
-maltivec, and that combination caused the build-time error above; passing
-mcellppu overrides both.


Sidney Manning -- IBM-STI Design Center Austin, TX
sid at us.ibm.com -- (512) 838-1125, TL/678-1125


Christoph Hellwig <hch at lst.de> wrote on 02/17/2006 12:32 PM:
To: Utz Bacher <utz.bacher at de.ibm.com>
cc: linuxppc64-dev at ozlabs.org, Sidney Manning/Austin/IBM at IBMUS, arndb at de.ibm.com
Subject: Re: [FYI/PATCH 3/4] Build fixes for IBM Full System Simulator

> +
> +ifneq ($(CROSS_COMPILE),)
> +cpu-as-$(CONFIG_PPC_CELL)         += -Wa,-mcellppu
> +endif

the CROSS_COMPILE setting is wrong.  cross-compilation should not
affect selection of assembler flags.

> +
>   cpu-as-$(CONFIG_PPC64BRIDGE)           += -Wa,-mppc64bridge
>   cpu-as-$(CONFIG_4xx)                         += -Wa,-m405
>   cpu-as-$(CONFIG_6xx)                         += -Wa,-maltivec


From hch at lst.de  Sun Feb 19 02:09:20 2006
From: hch at lst.de (Christoph Hellwig)
Date: Sat, 18 Feb 2006 16:09:20 +0100
Subject: [openib-general] [PATCH 08/22] Generic ehca headers
In-Reply-To: <20060218005723.13620.10389.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005723.13620.10389.stgit@localhost.localdomain>
Message-ID: <20060218150920.GA23817@lst.de>

On Fri, Feb 17, 2006 at 04:57:23PM -0800, Roland Dreier wrote:
> From: Roland Dreier <rolandd at cisco.com>
> 
> The defines of TRUE and FALSE look rather useless.  Why are they needed?
> 
> What is struct ehca_cache for?  It doesn't seem to be used anywhere.
> 
> ehca_kv_to_g() looks completely horrible.  The whole idea of using
> vmalloc()ed kernel memory to do DMA seems unacceptable to me.

When you want to do scatter-gather dma on kernel-virtual contiguous
areas, allocate the pages individually and map them into kva using
vmap().  Then dma can be performed using dma_map_page, or in case
you have lots of pages, dma_map_sg after creating an S/G list.


From rdreier at cisco.com  Sun Feb 19 03:02:33 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Sat, 18 Feb 2006 08:02:33 -0800
Subject: [openib-general] [PATCH 04/22] OF adapter probing
In-Reply-To: <ADDBC190-7388-4904-9ECB-489F1D199AB1@schihei.de> (Heiko J.
	Schick's message of "Sat, 18 Feb 2006 13:46:10 +0100")
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005712.13620.82908.stgit@localhost.localdomain>
	<ADDBC190-7388-4904-9ECB-489F1D199AB1@schihei.de>
Message-ID: <ada4q2wbqp2.fsf@cisco.com>

    Heiko> Hello Roland, sorry, this file is not used anymore. The
    Heiko> functions

OK, please delete it from the svn tree.

Thanks,
  Roland


From rdreier at cisco.com  Sun Feb 19 03:32:28 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Sat, 18 Feb 2006 08:32:28 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <1140265955.4035.19.camel@laptopd505.fenrus.org> (Arjan van de
	Ven's message of "Sat, 18 Feb 2006 13:32:35 +0100")
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
Message-ID: <adazmkoaaqr.fsf@cisco.com>

    Arjan> The bigger issue is: if people can't be bothered to do
    Arjan> those steps, why would they be bothered to do this for
    Arjan> maintenance and bugfixes etc etc?  Basically it's now
    Arjan> already a de-facto unmaintained driver....

I don't think that's really a fair statement.  The IBM people have
been active and responsive in maintaining their driver in the
openib.org svn tree.  However, they asked me to post their driver for
review because it would be difficult for them to do it.

IBM people: can you clarify the restrictions you have?  Why do you
feel you can't post your own driver for review?  Will you be able to
post smaller patches to lkml in the future if the driver is merged?

Thanks,
  Roland


From arjan at infradead.org  Sun Feb 19 04:02:42 2006
From: arjan at infradead.org (Arjan van de Ven)
Date: Sat, 18 Feb 2006 18:02:42 +0100
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <adazmkoaaqr.fsf@cisco.com>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
	<adazmkoaaqr.fsf@cisco.com>
Message-ID: <1140282163.6514.7.camel@laptopd505.fenrus.org>

On Sat, 2006-02-18 at 08:32 -0800, Roland Dreier wrote:
>     Arjan> The bigger issue is: if people can't be bothered to do
>     Arjan> those steps, why would they be bothered to do this for
>     Arjan> maintenance and bugfixes etc etc?  Basically it's now
>     Arjan> already a de-facto unmaintained driver....
> 
> I don't think that's really a fair statement.

It's a concern at least; if they're just having trouble posting really
big files that's one thing.. if they're not allowed to post at all
that's another.

> IBM people: can you clarify the restrictions you have?  Why do you
> feel you can't post your own driver for review?  Will you be able to
> post smaller patches to lkml in the future if the driver is merged?

And can you respond to questions and user questions on lkml?


From greg at kroah.com  Sun Feb 19 05:15:09 2006
From: greg at kroah.com (Greg KH)
Date: Sat, 18 Feb 2006 10:15:09 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <adazmkoaaqr.fsf@cisco.com>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
	<adazmkoaaqr.fsf@cisco.com>
Message-ID: <20060218181509.GA892@kroah.com>

On Sat, Feb 18, 2006 at 08:32:28AM -0800, Roland Dreier wrote:
>     Arjan> The bigger issue is: if people can't be bothered to do
>     Arjan> those steps, why would they be bothered to do this for
>     Arjan> maintenance and bugfixes etc etc?  Basically it's now
>     Arjan> already a de-facto unmaintained driver....
> 
> I don't think that's really a fair statement.  The IBM people have
> been active and responsive in maintaining their driver in the
> openib.org svn tree.  However, they asked me to post their driver for
> review because it would be difficult for them to do it.

Checking stuff into a private svn tree is vastly different from posting
to lkml in public.  In fact, it looks like the svn tree is so far ahead
of the in-kernel stuff, that most people are just using it instead of
the in-kernel code.

I know at least one company has asked a distro to just accept the svn
snapshot over the in-kernel IB code, which makes me wonder if the
in-kernel stuff is even useful to people?  Why have it, if companies
insist on using the out-of-tree stuff instead?

thanks,

greg k-h


From hch at infradead.org  Sun Feb 19 05:19:32 2006
From: hch at infradead.org (Christoph Hellwig)
Date: Sat, 18 Feb 2006 18:19:32 +0000
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218181509.GA892@kroah.com>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
	<adazmkoaaqr.fsf@cisco.com> <20060218181509.GA892@kroah.com>
Message-ID: <20060218181932.GA6410@infradead.org>

On Sat, Feb 18, 2006 at 10:15:09AM -0800, Greg KH wrote:
> On Sat, Feb 18, 2006 at 08:32:28AM -0800, Roland Dreier wrote:
> >     Arjan> The bigger issue is: if people can't be bothered to do
> >     Arjan> those steps, why would they be bothered to do this for
> >     Arjan> maintenance and bugfixes etc etc?  Basically it's now
> >     Arjan> already a de-facto unmaintained driver....
> > 
> > I don't think that's really a fair statement.  The IBM people have
> > been active and responsive in maintaining their driver in the
> > openib.org svn tree.  However, they asked me to post their driver for
> > review because it would be difficult for them to do it.
> 
> Checking stuff into a private svn tree is vastly different from posting
> to lkml in public.  In fact, it looks like the svn tree is so far ahead
> of the in-kernel stuff, that most people are just using it instead of
> the in-kernel code.
> 
> I know at least one company has asked a distro to just accept the svn
> snapshot over the in-kernel IB code, which makes me wonder if the
> in-kernel stuff is even useful to people?  Why have it, if companies
> insist on using the out-of-tree stuff instead?

The openib tree isn't private.  It's mostly just a staging area for
development.  Any company that wants it included into a distro release
is completely clueless.


From rdreier at cisco.com  Sun Feb 19 05:52:58 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Sat, 18 Feb 2006 10:52:58 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218181509.GA892@kroah.com> (Greg KH's message of "Sat, 18
	Feb 2006 10:15:09 -0800")
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
	<adazmkoaaqr.fsf@cisco.com> <20060218181509.GA892@kroah.com>
Message-ID: <adavevca48l.fsf@cisco.com>

    Greg> Checking stuff into a private svn tree is vastly different
    Greg> from posting to lkml in public.  In fact, it looks like the
    Greg> svn tree is so far ahead of the in-kernel stuff, that most
    Greg> people are just using it instead of the in-kernel code.

It's not a private svn tree -- the IBM ehca development is available
to anyone via svn at https://openib.org/svn/gen2/trunk/src/linux-kernel/infiniband/hw/ehca

    Greg> I know at least one company has asked a distro to just
    Greg> accept the svn snapshot over the in-kernel IB code, which
    Greg> makes me wonder if the in-kernel stuff is even useful to
    Greg> people?  Why have it, if companies insist on using the
    Greg> out-of-tree stuff instead?

The IB driver stack is still in its early stages, so although I'm
pushing for things to be merged as fast as possible, the unfortunate
fact is that lots of things that people want to use (including the IBM
ehca driver) are not upstream and are not ready to go upstream yet.
But that doesn't mean we should give up on merging them.

Distro politics are just distro politics -- and there will always be
pressure on distros to ship stuff that's not upstream yet.

 - R.


From greg at kroah.com  Sun Feb 19 06:53:27 2006
From: greg at kroah.com (Greg KH)
Date: Sat, 18 Feb 2006 11:53:27 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <adavevca48l.fsf@cisco.com>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
	<adazmkoaaqr.fsf@cisco.com> <20060218181509.GA892@kroah.com>
	<adavevca48l.fsf@cisco.com>
Message-ID: <20060218195327.GA1382@kroah.com>

On Sat, Feb 18, 2006 at 10:52:58AM -0800, Roland Dreier wrote:
>     Greg> Checking stuff into a private svn tree is vastly different
>     Greg> from posting to lkml in public.  In fact, it looks like the
>     Greg> svn tree is so far ahead of the in-kernel stuff, that most
>     Greg> people are just using it instead of the in-kernel code.
> 
> It's not a private svn tree -- the IBM ehca development is available
> to anyone via svn at https://openib.org/svn/gen2/trunk/src/linux-kernel/infiniband/hw/ehca

Sorry, I didn't mean to say "private", but rather, "separate".
Doing kernel development in a separate development tree from the
mainline kernel is very problematic, as has been documented many times
in the past.

> Distro politics are just distro politics -- and there will always be
> pressure on distros to ship stuff that's not upstream yet.

Luckily the distros know better than to accept this anymore, as they
have been burned too many times in the past...

thanks,

greg k-h


From rdreier at cisco.com  Sun Feb 19 08:31:52 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Sat, 18 Feb 2006 13:31:52 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218195327.GA1382@kroah.com> (Greg KH's message of "Sat,
	18 Feb 2006 11:53:27 -0800")
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005707.13620.20538.stgit@localhost.localdomain>
	<20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
	<adazmkoaaqr.fsf@cisco.com> <20060218181509.GA892@kroah.com>
	<adavevca48l.fsf@cisco.com> <20060218195327.GA1382@kroah.com>
Message-ID: <adar7609wvr.fsf@cisco.com>

    Greg> Sorry, I didn't mean to say "private", but rather,
    Greg> "separate".  Doing kernel development in a separate
    Greg> development tree from the mainline kernel is very
    Greg> problematic, as has been documented many times in the past.

As a general rule I agree with that.  However, the openib svn tree
we're talking about is not some project that is off in space never
merging with the kernel; as Christoph said, it's really just a staging
area for stuff that isn't ready for upstream yet.

Perhaps it would be more politically correct to use git to develop
kernel code, but in the end that's really just a technical difference
that shouldn't matter.

    Roland> Distro politics are just distro politics -- and there will
    Roland> always be pressure on distros to ship stuff that's not
    Roland> upstream yet.

    Greg> Luckily the distros know better than to accept this anymore,
    Greg> as they have been burned too many times in the past...

OK, that's great.  But now I don't understand your original point.
You say there are people putting pressure on distros to ship what's in
openib svn rather than the upstream kernel, but if the distros are
going to ignore them, what does it matter?

And this thread started with me trying to help the IBM people make
progress towards merging a big chunk of that svn tree upstream.  That
should make you happy, right?

 - R.


From sfr at canb.auug.org.au  Sun Feb 19 10:23:35 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Sun, 19 Feb 2006 10:23:35 +1100
Subject: [PATCH] Fix race condition in hvc console.
In-Reply-To: <E1FAT0u-0004no-0P@heater.watson.ibm.com>
References: <E1FAT0u-0004no-0P@heater.watson.ibm.com>
Message-ID: <20060219102335.37cda813.sfr@canb.auug.org.au>

Hi Michal,

On Sat, 18 Feb 2006 09:24:44 -0500 Michal Ostrowski <mostrows at watson.ibm.com> wrote:
>
> +	if (read_total) {
> +		tty_flip_buffer_push(tty);
> +	}

A small nit: please don't add these unnecessary '{}' pairs.

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/


From greg at kroah.com  Sun Feb 19 10:29:34 2006
From: greg at kroah.com (Greg KH)
Date: Sat, 18 Feb 2006 15:29:34 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <adar7609wvr.fsf@cisco.com>
References: <20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
	<adazmkoaaqr.fsf@cisco.com> <20060218181509.GA892@kroah.com>
	<adavevca48l.fsf@cisco.com> <20060218195327.GA1382@kroah.com>
	<adar7609wvr.fsf@cisco.com>
Message-ID: <20060218232934.GA2624@kroah.com>

On Sat, Feb 18, 2006 at 01:31:52PM -0800, Roland Dreier wrote:
>     Greg> Sorry, I didn't mean to say "private", but rather,
>     Greg> "separate".  Doing kernel development in a separate
>     Greg> development tree from the mainline kernel is very
>     Greg> problematic, as has been documented many times in the past.
> 
> As a general rule I agree with that.  However, the openib svn tree
> we're talking about is not some project that is off in space never
> merging with the kernel; as Christoph said, it's really just a staging
> area for stuff that isn't ready for upstream yet.
> 
> Perhaps it would be more politically correct to use git to develop
> kernel code, but in the end that's really just a technical difference
> that shouldn't matter.

Yes, that doesn't matter.  But it seems that the svn tree is vastly
different from the in-kernel code.  So much so that some companies feel
that the in-kernel stuff just isn't worth running at all.

>     Roland> Distro politics are just distro politics -- and there will
>     Roland> always be pressure on distros to ship stuff that's not
>     Roland> upstream yet.
> 
>     Greg> Luckily the distros know better than to accept this anymore,
>     Greg> as they have been burned too many times in the past...
> 
> OK, that's great.  But now I don't understand your original point.
> You say there are people putting pressure on distros to ship what's in
> openib svn rather than the upstream kernel, but if the distros are
> going to ignore them, what does it matter?

It takes a _lot_ of effort to ignore them, as it's very difficult to do
so.  Especially when companies try to play the different distros off of
each other, but that's not an issue that the mainline kernel developers
need to worry about :)

> And this thread started with me trying to help the IBM people make
> progress towards merging a big chunk of that svn tree upstream.  That
> should make you happy, right?

Yes, that does make me happy.  But it doesn't make me happy to see IBM
not being able to participate in kernel development by posting and
defending their own code to lkml.  I thought IBM knew better than
that...

thanks,

greg k-h


From rdreier at cisco.com  Sun Feb 19 11:09:31 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Sat, 18 Feb 2006 16:09:31 -0800
Subject: [PATCH 02/22] Firmware interface code for IB device.
In-Reply-To: <20060218232934.GA2624@kroah.com> (Greg KH's message of "Sat,
	18 Feb 2006 15:29:34 -0800")
References: <20060218015808.GB17653@kroah.com> <aday809bewn.fsf@cisco.com>
	<20060218122011.GE911@infradead.org>
	<20060218122631.GA30535@granada.merseine.nu>
	<1140265955.4035.19.camel@laptopd505.fenrus.org>
	<adazmkoaaqr.fsf@cisco.com> <20060218181509.GA892@kroah.com>
	<adavevca48l.fsf@cisco.com> <20060218195327.GA1382@kroah.com>
	<adar7609wvr.fsf@cisco.com> <20060218232934.GA2624@kroah.com>
Message-ID: <adamzgo9pl0.fsf@cisco.com>

    Greg> Yes, that doesn't matter.  But it seems that the svn tree is
    Greg> vastly different from the in-kernel code.  So much so that
    Greg> some companies feel that the in-kernel stuff just isn't
    Greg> worth running at all.

I don't want to belabor this issue... but the svn tree is not vastly
different than what's in the kernel.  It has some things that aren't
upstream yet, and which are important to some people.  For example,
the IBM ehca driver we're talking about, as well as the PathScale
driver, SDP (sockets direct protocol), etc.  It just takes time for
this new code to get to the point where both the developers of the new
stuff feel it's ready to be merged, and the kernel community agrees
that it should be merged.

    Greg> Yes, that does make me happy.  But it doesn't make me happy
    Greg> to see IBM not being able to participate in kernel
    Greg> development by posting and defending their own code to lkml.
    Greg> I thought IBM knew better than that...

Agreed.  But let's not get sidetracked on that internal IBM issue.
The ehca developers have assured me that they can and will participate
in the thread reviewing their driver.  It seems like it's better for
me to help them work around their internal problems by acting as a
proxy, than for me to delay merging their driver just because someone
in IBM management is clueless.

 - R.


From paulus at samba.org  Sun Feb 19 22:52:31 2006
From: paulus at samba.org (Paul Mackerras)
Date: Sun, 19 Feb 2006 22:52:31 +1100
Subject: [PATCH] powerpc: ARCH=powerpc build fix for CONFIG_SYSVIPC=n ||
	CONFIG_SYSCTL=n
In-Reply-To: <20060218100849.GA1869@krypton>
References: <20060218100849.GA1869@krypton>
Message-ID: <17400.23551.904754.47979@cargo.ozlabs.ibm.com>

Arthur Othieno writes:

> --- a/arch/powerpc/kernel/sys_ppc32.c
> +++ b/arch/powerpc/kernel/sys_ppc32.c
> @@ -440,7 +440,13 @@ long compat_sys_ipc(u32 call, u32 first,
>  
>  	return -ENOSYS;
>  }
> -#endif
> +#else
> +long compat_sys_ipc(u32 call, u32 first, u32 second, u32 third, compat_uptr_t ptr,
> +	       u32 fifth)
> +{
> +	return -ENOSYS;
> +}
> +#endif /* CONFIG_SYSVIPC */

Can't we just add a couple of cond_syscall lines to kernel/sys_ni.c
instead?
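
For readers unfamiliar with the mechanism being suggested: a cond_syscall
entry makes a symbol fall back to a "not implemented" stub unless a real
definition is linked in.  The following is a minimal userspace sketch of
that idea, not the kernel's macro.  It assumes GCC or Clang on an ELF
target, and every name ending in _demo is hypothetical.

```c
#include <errno.h>

/* Fallback for any system call that is not compiled in. */
long sys_ni_syscall_demo(void)
{
	return -ENOSYS;
}

/*
 * Hypothetical stand-in for cond_syscall(): emit NAME as a weak alias of
 * the fallback.  A strong definition of NAME elsewhere in the link would
 * override the alias; without one, callers reach the fallback.
 */
#define cond_syscall_demo(name) \
	long name(void) __attribute__((weak, alias("sys_ni_syscall_demo")))

cond_syscall_demo(compat_sys_ipc_demo);
cond_syscall_demo(compat_sys_sysctl_demo);
```

With no strong definition present, calling compat_sys_ipc_demo() returns
-ENOSYS, which is the behaviour the hand-written #else stub in the quoted
patch provides for one function at a time.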

Paul.


From laforge at gnumonks.org  Sun Feb 19 22:45:32 2006
From: laforge at gnumonks.org (Harald Welte)
Date: Sun, 19 Feb 2006 12:45:32 +0100
Subject: PowerMac11,2 sound questions
Message-ID: <20060219114532.GA30498@sunbeam.de.gnumonks.org>

Hi!

Since I recently got a Quad G5, and paulus/benh were too fast for me to
hack on the fan control, I was looking for something else that is
missing.

Apparently there is no sound support for those machines yet.  Apple
seems to call the sound architecture of those boxes 'onyx', and a quick
look at
http://darwinsource.opendarwin.org/10.4.5.ppc/AppleOnboardAudio-256.2.5/AppleOnboardAudio/
revealed that all onyx specific bits are not present in the source code
:(

Does anyone have more information on what needs to be done / what is
missing for getting sound support on those devices?

[yes, I'm well aware of the long-standing
i2s/infrastructure/ubuntu-bounty/... discussion, but that's not what I'm
asking about]

Thanks!

-- 
- Harald Welte <laforge at gnumonks.org>          	        http://gnumonks.org/
============================================================================
"Privacy in residential applications is a desirable marketing option."
                                                  (ETSI EN 300 175-7 Ch. A6)

From benh at kernel.crashing.org  Mon Feb 20 09:14:53 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Mon, 20 Feb 2006 09:14:53 +1100
Subject: PowerMac11,2 sound questions
In-Reply-To: <20060219114532.GA30498@sunbeam.de.gnumonks.org>
References: <20060219114532.GA30498@sunbeam.de.gnumonks.org>
Message-ID: <1140387293.32374.39.camel@localhost.localdomain>

On Sun, 2006-02-19 at 12:45 +0100, Harald Welte wrote:
> Hi!
> 
> Since I recently got a Quad G5, and paulus/benh were too fast for me to
> hack on the fan control, I was looking for something else that is
> missing.
> 
> Apparently there is no sound support for those machines yet.  Apple
> seems to call the sound architecture of those boxes 'onyx', and a quick
> look at
> http://darwinsource.opendarwin.org/10.4.5.ppc/AppleOnboardAudio-256.2.5/AppleOnboardAudio/
> revealed that all onyx specific bits are not present in the source code
> :(
> 
> Does anyone have more information on what needs to be done / what is
> missing for getting sound support on those devices?
> 
> [yes, I'm well aware of the long-standing
> i2s/infrastructure/ubuntu-bounty/... discussion, but that's not what I'm
> asking about]

Those machines have dual codecs (Onyx + Topaz). Onyx is a PCM3052 (TI
afaik) and I have a spec. Topaz is a CS84xx (Darwin knows at least 3
models: CS8406, CS8416, CS8420), spec available online.

The main issue right now is that the current driver can't properly
handle multiple codecs and multiple i2s busses, along with all the
various bits & pieces that are already barely working and need serious
rework.

I've had plans for some time to rewrite the sound driver (at least for
newer architectures based on layoutID) but haven't had time yet to
seriously begin work on it. Among the things that need to be done are
proper usage of platform-do-* functions for things like GPIO
manipulations (Ben Collins did some work on that already), and better
"objectisation" of the whole driver so we can properly instantiate sound
busses (I think up to 2 i2s busses can be used, maybe 3) and codecs,
with a generic callback system for things like clock changes (when
using digital inputs, the bus clocking and other codecs must adapt to
changes of the digital input clock) etc...
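
As a rough illustration of the "generic callback system" idea, here is a
minimal sketch in plain C.  It is not taken from any existing driver: the
struct layout, the limit of four codecs, and every name ending in _demo
are assumptions made for the example.

```c
#include <stddef.h>

#define MAX_CODECS_DEMO 4	/* arbitrary limit for the sketch */

struct codec_demo;

/* Per-codec operations; a codec fills in only what it cares about. */
struct codec_ops_demo {
	void (*clock_changed)(struct codec_demo *codec, unsigned int rate_hz);
};

struct codec_demo {
	const char *name;
	const struct codec_ops_demo *ops;
	unsigned int rate_hz;	/* rate the codec currently runs at */
};

/* One instance per i2s bus, so several busses can coexist. */
struct i2s_bus_demo {
	struct codec_demo *codecs[MAX_CODECS_DEMO];
	size_t ncodecs;
};

/* Attach a codec to a bus; returns 0 on success, -1 if the bus is full. */
int bus_attach_codec_demo(struct i2s_bus_demo *bus, struct codec_demo *codec)
{
	if (bus->ncodecs >= MAX_CODECS_DEMO)
		return -1;
	bus->codecs[bus->ncodecs++] = codec;
	return 0;
}

/* Tell every codec on the bus that the bus clock has changed. */
void bus_clock_changed_demo(struct i2s_bus_demo *bus, unsigned int rate_hz)
{
	size_t i;

	for (i = 0; i < bus->ncodecs; i++) {
		struct codec_demo *codec = bus->codecs[i];

		if (codec->ops && codec->ops->clock_changed)
			codec->ops->clock_changed(codec, rate_hz);
	}
}

/* Simplest possible callback: just follow the new rate. */
static void follow_clock_demo(struct codec_demo *codec, unsigned int rate_hz)
{
	codec->rate_hz = rate_hz;
}

const struct codec_ops_demo follow_clock_ops_demo = {
	.clock_changed = follow_clock_demo,
};
```

Two codecs attached to the same bus object would both see a call to
bus_clock_changed_demo(), which is the kind of adaptation described above
for digital input clock changes.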


Ben.


From huangjq at cn.ibm.com  Mon Feb 20 12:14:54 2006
From: huangjq at cn.ibm.com (Jin Qi Huang)
Date: Mon, 20 Feb 2006 09:14:54 +0800
Subject: Kernel oops then panic when perform a soft reset on ppc64 box
Message-ID: <OF8B342B2E.F48C311C-ON4825711B.00049833-4825711B.0006DA3E@cn.ibm.com>

Hi all,

When I perform a soft reset on a ppc64 box from the HMC console, the kernel
oopses and then panics. Here is the procedure to reproduce it:
1. machine hardware environment:
# cat /proc/cpuinfo 
processor       : 0
cpu             : POWER4 (gp)
clock           : 1002.296504MHz
revision        : 3.2

processor       : 1
cpu             : POWER4 (gp)
clock           : 1002.296504MHz
revision        : 3.2

timebase        : 125287063
machine         : CHRP IBM,7028-6C4

2.  machine software environment:
# uname -a
Linux mcptest4 2.6.5-279 #2 SMP Thu Feb 9 21:21:11 UTC 2006 ppc64 ppc64 
ppc64 GNU/Linux

3. on HMC console perform a soft reset:
$ chsysstate -m plinuxt4 -r lpar -n lpar1 -o reset

4. the HMC virtual terminal shows the kernel oops and panic messages:
Oops: System Reset, sig: 0 [#1]
SMP NR_CPUS=32 PSERIES LPAR 
NIP: C000000000013B5C XER: 0000000020000000 LR: C000000000013B9C
REGS: c00000000053fad0 TRAP: 0100   Not tainted  (2.6.5-279 )
MSR: 8000000000009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK: c0000000005d3a20[0] 'swapper' THREAD: c00000000053c000 CPU: 0
GPR00: 0000000000000010 C00000000053FD50 C00000000071EAB8 C0000000BB1CD800
GPR04: 0000000000000007 0000000000000000 C00000000053FC30 0000000000000000
GPR08: 0000000000000000 0000000000000000 C00000000071D008 C00000000053C000
GPR12: 0000000042004028 C000000000541000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000230000 0000000000000000 0000000000000000 0000000003A00000
GPR24: C000000000541000 C00000000071D008 C000000000539AF0 0000000000008000
GPR28: 0000000000000010 0000000000000008 C00000000053C000 C00000000053C010
NIP [c000000000013b5c] .default_idle+0x64/0xac
LR [c000000000013b9c] .default_idle+0xa4/0xac
Call Trace:
[c00000000053fd50] [c000000000013b9c] .default_idle+0xa4/0xac (unreliable)
[c00000000053fde0] [c00000000001398c] .cpu_idle+0x38/0x50
[c00000000053fe50] [c00000000000c49c] .rest_init+0x64/0x7c
[c00000000053fed0] [c0000000004ee5dc] .start_kernel+0x2b4/0x330
[c00000000053ff90] [c00000000000c394] .__setup_cpu_power3+0x0/0x4
 <0>Fatal exception: panic in 5 seconds
et, sig: 0 [#2]
SMP NR_CPUS=32 PSERIES LPAR 
NIP: C000000000013B5C XER: 0000000020000000 LR: C000000000013B9C
REGS: c0000000bff07b80 TRAP: 0100   Not tainted  (2.6.5-279 )
MSR: 8000000000009032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK: c00000000397c9b0[0] 'swapper' THREAD: c0000000bff04000 CPU: 1
GPR00: 0000000000000010 C0000000BFF07E00 C00000000071EAB8 C0000000BC181000
GPR04: 0000000000000007 0000000000000000 C0000000BFF07CE0 0000000000000000
GPR08: 0000000000000000 0000000000000000 C00000000071D008 C0000000BFF04000
GPR12: 0000000044004028 C000000000543000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000C00000 0000000000000000 0000000000000001
GPR24: 0000000000000001 0000000000000010 0000000000000568 000000000000041C
GPR28: 0000000000000010 0000000000000008 C0000000BFF04000 C0000000BFF04010
NIP [c000000000013b5c] .default_idle+0x64/0xac
LR [c000000000013b9c] .default_idle+0xa4/0xac
Call Trace:
[c0000000bff07e00] [c000000000013b9c] .default_idle+0xa4/0xac (unreliable)
[c0000000bff07e90] [c00000000001398c] .cpu_idle+0x38/0x50
[c0000000bff07f00] [c00000000003ed78] .start_secondary+0x148/0x1a8
[c0000000bff07f90] [c00000000000c03c] .enable_64b_mode+0x0/0x28
 <0>Fatal exception: panic in 5 seconds
Kernel panic: Fatal exception
In idle task - not syncing

Judging from the kernel code, when the user performs a soft reset, a system
reset exception is raised, the exception handler SystemResetException is
invoked, and the kernel dies. Must the system die when it receives a soft
reset? Thanks!

--
Regards,
Jin Qi Huang

From paulus at samba.org  Mon Feb 20 13:03:31 2006
From: paulus at samba.org (Paul Mackerras)
Date: Mon, 20 Feb 2006 13:03:31 +1100
Subject: Kernel oops then panic when perform a soft reset on ppc64 box
In-Reply-To: <OF8B342B2E.F48C311C-ON4825711B.00049833-4825711B.0006DA3E@cn.ibm.com>
References: <OF8B342B2E.F48C311C-ON4825711B.00049833-4825711B.0006DA3E@cn.ibm.com>
Message-ID: <17401.9075.295712.950980@cargo.ozlabs.ibm.com>

Jin Qi Huang writes:

> When I perform a soft reset on a ppc64 box from the HMC console, the kernel
> oopses and then panics. Here is the procedure to reproduce it:

That's normal, what did you expect it to do?

Paul.


From huangjq at cn.ibm.com  Mon Feb 20 13:34:16 2006
From: huangjq at cn.ibm.com (Jin Qi Huang)
Date: Mon, 20 Feb 2006 10:34:16 +0800
Subject: Kernel oops then panic when perform a soft reset on ppc64 box
In-Reply-To: <17401.9075.295712.950980@cargo.ozlabs.ibm.com>
Message-ID: <OFC9FDB870.888063B3-ON4825711B.000DCC87-4825711B.000E1E58@cn.ibm.com>

Hi Paul,
Would you please give me some detailed information about what happens when
we perform a soft reset, and why the system must die? I am new to the
POWER architecture. Thanks!

--
Regards,
Jin Qi Huang


Paul Mackerras <paulus at samba.org> 
2006-02-20 10:03

To
Jin Qi Huang/China/Contr/IBM at IBMCN
cc
linuxppc64-dev at ozlabs.org
Subject
Re: Kernel oops then panic when perform a soft reset on ppc64 box


Jin Qi Huang writes:

> When I perform a soft reset on a ppc64 box from the HMC console, the kernel
> oopses and then panics. Here is the procedure to reproduce it:

That's normal, what did you expect it to do?

Paul.


From david at gibson.dropbear.id.au  Mon Feb 20 14:05:56 2006
From: david at gibson.dropbear.id.au (David Gibson)
Date: Mon, 20 Feb 2006 14:05:56 +1100
Subject: powerpc: Fixup for STRICT_MM_TYPECHECKS
Message-ID: <20060220030556.GC24457@localhost.localdomain>

Paulus, please apply (for post 2.6.16, I guess).

Currently ARCH=powerpc will not compile when STRICT_MM_TYPECHECKS is
turned on and CONFIG_64K_PAGES is turned off.  The patch below
corrects the problem.

Signed-off-by: David Gibson <dwg at au1.ibm.com>

Index: working-2.6/include/asm-powerpc/pgtable-4k.h
===================================================================
--- working-2.6.orig/include/asm-powerpc/pgtable-4k.h	2006-01-16 13:02:29.000000000 +1100
+++ working-2.6/include/asm-powerpc/pgtable-4k.h	2006-02-20 13:53:57.000000000 +1100
@@ -62,9 +62,14 @@
 /* shift to put page number into pte */
 #define PTE_RPN_SHIFT	(17)
 
-#define __real_pte(e,p)		((real_pte_t)(e))
-#define __rpte_to_pte(r)	(r)
-#define __rpte_to_hidx(r,index)	(pte_val((r)) >> 12)
+#ifdef STRICT_MM_TYPECHECKS
+#define __real_pte(e,p)		((real_pte_t){(e)})
+#define __rpte_to_pte(r)	((r).pte)
+#else
+#define __real_pte(e,p)		(e)
+#define __rpte_to_pte(r)	(__pte(r))
+#endif
+#define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> 12)
 
 #define pte_iterate_hashed_subpages(rpte, psize, va, index, shift)       \
 	do {							         \

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
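
A note on why the strict and non-strict cases need different macros, as a
minimal standalone sketch.  It assumes that strict type checking wraps
each page-table type in a one-member struct while the relaxed build uses
plain integers; the _demo names and the STRICT_TYPECHECKS_DEMO switch are
hypothetical and only mirror the shape of the patch above.

```c
#include <stdint.h>

#define STRICT_TYPECHECKS_DEMO 1	/* remove to build the relaxed variant */

#ifdef STRICT_TYPECHECKS_DEMO
/* Strict: distinct struct types, so mixing them up is a compile error. */
typedef struct { uint64_t pte; } pte_demo_t;
typedef struct { pte_demo_t pte; } real_pte_demo_t;
#define pte_val_demo(x)		((x).pte)
#define make_pte_demo(x)	((pte_demo_t){ (x) })
#define real_pte_demo(e)	((real_pte_demo_t){ (e) })
#define rpte_to_pte_demo(r)	((r).pte)
#else
/* Relaxed: everything is the same integer type, so casts are no-ops. */
typedef uint64_t pte_demo_t;
typedef pte_demo_t real_pte_demo_t;
#define pte_val_demo(x)		(x)
#define make_pte_demo(x)	(x)
#define real_pte_demo(e)	(e)
#define rpte_to_pte_demo(r)	(make_pte_demo(r))
#endif

/* Written once in terms of the accessors, so it works in both variants. */
#define rpte_to_hidx_demo(r)	(pte_val_demo(rpte_to_pte_demo(r)) >> 12)

uint64_t hidx_demo(uint64_t raw)
{
	pte_demo_t pte = make_pte_demo(raw);
	real_pte_demo_t rpte = real_pte_demo(pte);

	return rpte_to_hidx_demo(rpte);
}
```

In the strict variant a cast such as (real_pte_demo_t)(e) would be
rejected by the compiler, because C does not allow casting to a struct
type; a compound literal is needed instead.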


From sfr at canb.auug.org.au  Mon Feb 20 15:32:26 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 20 Feb 2006 15:32:26 +1100
Subject: [PATCH] Fix compile for CONFIG_SYSVIPC=n or CONFIG_SYSCTL=n
In-Reply-To: <17400.23551.904754.47979@cargo.ozlabs.ibm.com>
References: <20060218100849.GA1869@krypton>
	<17400.23551.904754.47979@cargo.ozlabs.ibm.com>
Message-ID: <20060220153226.30ee4b13.sfr@canb.auug.org.au>

The compat syscalls are added to sys_ni.c since they are not defined
if the above CONFIG options are off. Also, nfs would not build with
CONFIG_SYSCTL off.

Noticed by Arthur Othieno.

Signed-off-by: Stephen Rothwell <sfr at canb.auug.org.au>
---

 include/linux/nfs_fs.h |    2 +-
 kernel/sys_ni.c        |    2 ++
 2 files changed, 3 insertions(+), 1 deletions(-)

On Sun, 19 Feb 2006 22:52:31 +1100 Paul Mackerras <paulus at samba.org> wrote:
>
> Arthur Othieno writes:
> 
> > --- a/arch/powerpc/kernel/sys_ppc32.c
> > +++ b/arch/powerpc/kernel/sys_ppc32.c
> > @@ -440,7 +440,13 @@ long compat_sys_ipc(u32 call, u32 first,
> >  
> >  	return -ENOSYS;
> >  }
> > -#endif
> > +#else
> > +long compat_sys_ipc(u32 call, u32 first, u32 second, u32 third, compat_uptr_t ptr,
> > +	       u32 fifth)
> > +{
> > +	return -ENOSYS;
> > +}
> > +#endif /* CONFIG_SYSVIPC */
> 
> Can't we just add a couple of cond_syscall lines to kernel/sys_ni.c
> instead?

Linus, can we have this applied for 2.6.16?  It presumably affects sparc64
(at least for CONFIG_SYSVIPC) as well as powerpc.  The NFS fix would
affect all architectures, I think?

This has been compile tested with the CONFIG options on and off for powerpc.
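
For anyone wondering about the NFS part: a stub macro that expands to
"do { } while(0)" is a statement, so it cannot stand in for a function
whose return value is used.  The sketch below shows the difference under
that assumption (that some caller checks the return value); all names
ending in _demo are hypothetical.

```c
/* Expression stub: usable wherever an int result is expected. */
#define register_sysctl_demo()		0

/* Statement stub: fine only where the result is never used. */
#define unregister_sysctl_demo()	do { } while (0)

int init_demo(void)
{
	/*
	 * This assignment would fail to compile if register_sysctl_demo()
	 * expanded to do { } while (0), because a statement has no value.
	 */
	int err = register_sysctl_demo();

	if (err)
		return err;
	return 0;
}

void exit_demo(void)
{
	unregister_sysctl_demo();
}
```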

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

c1a27bc400a1412c7c758775bb695e8b98d1c0c3
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 547d649..b4dc6e2 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -398,7 +398,7 @@ extern struct inode_operations nfs_symli
 extern int nfs_register_sysctl(void);
 extern void nfs_unregister_sysctl(void);
 #else
-#define nfs_register_sysctl() do { } while(0)
+#define nfs_register_sysctl() 0
 #define nfs_unregister_sysctl() do { } while(0)
 #endif
 
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 17313b9..1067090 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -104,6 +104,8 @@ cond_syscall(sys_setreuid16);
 cond_syscall(sys_setuid16);
 cond_syscall(sys_vm86old);
 cond_syscall(sys_vm86);
+cond_syscall(compat_sys_ipc);
+cond_syscall(compat_sys_sysctl);
 
 /* arch-specific weak syscall entries */
 cond_syscall(sys_pciconfig_read);
-- 
1.2.1


From michael at ellerman.id.au  Mon Feb 20 19:07:31 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Mon, 20 Feb 2006 19:07:31 +1100
Subject: [PATCH] powerpc: Initialise hvlpevent_queue.lock correctly
Message-ID: <20060220080757.74C78679F6@ozlabs.org>

When I changed the hvlpevent_queue code to use a spinlock instead of a custom
atomic (719d1cd86780c156f954fc34f34481adac197aec) I didn't initialise the
lock anywhere, oops.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/platforms/iseries/lpevents.c |    2 ++
 1 files changed, 2 insertions(+)

Index: iseries/arch/powerpc/platforms/iseries/lpevents.c
===================================================================
--- iseries.orig/arch/powerpc/platforms/iseries/lpevents.c
+++ iseries/arch/powerpc/platforms/iseries/lpevents.c
@@ -184,6 +184,8 @@ void setup_hvlpevent_queue(void)
 {
 	void *eventStack;
 
+	spin_lock_init(&hvlpevent_queue.lock);
+
 	/* Allocate a page for the Event Stack. */
 	eventStack = alloc_bootmem_pages(LpEventStackSize);
 	memset(eventStack, 0, LpEventStackSize);


From apgo at patchbomb.org  Tue Feb 21 00:53:15 2006
From: apgo at patchbomb.org (Arthur Othieno)
Date: Mon, 20 Feb 2006 08:53:15 -0500
Subject: [PATCH] Fix compile for CONFIG_SYSVIPC=n or CONFIG_SYSCTL=n
In-Reply-To: <20060220153226.30ee4b13.sfr@canb.auug.org.au>
References: <20060218100849.GA1869@krypton>
	<17400.23551.904754.47979@cargo.ozlabs.ibm.com>
	<20060220153226.30ee4b13.sfr@canb.auug.org.au>
Message-ID: <20060220135315.GA24943@krypton>

On Mon, Feb 20, 2006 at 03:32:26PM +1100, Stephen Rothwell wrote:
> The compat syscalls are added to sys_ni.c since they are not defined
> if the above CONFIG options are off. Also, nfs would not build with
> CONFIG_SYSCTL off.
> 
> Noticed by Arthur Othieno.
> 
> Signed-off-by: Stephen Rothwell <sfr at canb.auug.org.au>

Looks good, thanks ;-)

Acked-by: Arthur Othieno <apgo at patchbomb.org>

> ---
> 
>  include/linux/nfs_fs.h |    2 +-
>  kernel/sys_ni.c        |    2 ++
>  2 files changed, 3 insertions(+), 1 deletions(-)
> 
> On Sun, 19 Feb 2006 22:52:31 +1100 Paul Mackerras <paulus at samba.org> wrote:
> >
> > Arthur Othieno writes:
> > 
> > > --- a/arch/powerpc/kernel/sys_ppc32.c
> > > +++ b/arch/powerpc/kernel/sys_ppc32.c
> > > @@ -440,7 +440,13 @@ long compat_sys_ipc(u32 call, u32 first,
> > >  
> > >  	return -ENOSYS;
> > >  }
> > > -#endif
> > > +#else
> > > +long compat_sys_ipc(u32 call, u32 first, u32 second, u32 third, compat_uptr_t ptr,
> > > +	       u32 fifth)
> > > +{
> > > +	return -ENOSYS;
> > > +}
> > > +#endif /* CONFIG_SYSVIPC */
> > 
> > Can't we just add a couple of cond_syscall lines to kernel/sys_ni.c
> > instead?
> 
> Linus, can we have this applied for 2.6.16?  It presumably affects sparc64
> (at least for CONFIG_SYSVIPC) as well as powerpc.  The NFS fix would
> affect all architectures, I think?
> 
> This has been compile tested with the CONFIG options on and off for powerpc.
> 
> -- 
> Cheers,
> Stephen Rothwell                    sfr at canb.auug.org.au
> http://www.canb.auug.org.au/~sfr/
> 
> c1a27bc400a1412c7c758775bb695e8b98d1c0c3
> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
> index 547d649..b4dc6e2 100644
> --- a/include/linux/nfs_fs.h
> +++ b/include/linux/nfs_fs.h
> @@ -398,7 +398,7 @@ extern struct inode_operations nfs_symli
>  extern int nfs_register_sysctl(void);
>  extern void nfs_unregister_sysctl(void);
>  #else
> -#define nfs_register_sysctl() do { } while(0)
> +#define nfs_register_sysctl() 0
>  #define nfs_unregister_sysctl() do { } while(0)
>  #endif
>  
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index 17313b9..1067090 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -104,6 +104,8 @@ cond_syscall(sys_setreuid16);
>  cond_syscall(sys_setuid16);
>  cond_syscall(sys_vm86old);
>  cond_syscall(sys_vm86);
> +cond_syscall(compat_sys_ipc);
> +cond_syscall(compat_sys_sysctl);
>  
>  /* arch-specific weak syscall entries */
>  cond_syscall(sys_pciconfig_read);
> -- 
> 1.2.1


From anton at samba.org  Tue Feb 21 01:59:05 2006
From: anton at samba.org (Anton Blanchard)
Date: Tue, 21 Feb 2006 01:59:05 +1100
Subject: [PATCH 01/22] Add powerpc-specific clear_cacheline(),
	which just compiles to "dcbz".
In-Reply-To: <20060218005704.13620.88286.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005704.13620.88286.stgit@localhost.localdomain>
Message-ID: <20060220145904.GA19895@krispykreme>


Hi,

> This is horribly non-portable.  How much of a performance difference
> does it make?  How does it do on ppc64 systems where the cacheline
> size is not 32?

Yes, if anything we should catch cacheline aligned, multiple cacheline
sized zeroing in memset. 
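
A minimal sketch of what such a check could look like, in portable C.  It
is not the powerpc memset: the 128-byte line size is an assumption (real
code would use the size reported for the CPU), zero_line_demo() is a
plain-C stand-in for a dcbz-style instruction, and all _demo names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHELINE_BYTES_DEMO 128u	/* assumed; varies between CPUs */

/* Counts fast-path line clears so the sketch can be observed. */
unsigned long lines_zeroed_demo;

/* Stand-in for an instruction that zeroes one whole cache line. */
static void zero_line_demo(unsigned char *line)
{
	memset(line, 0, CACHELINE_BYTES_DEMO);
	lines_zeroed_demo++;
}

void *memset_demo(void *dst, int c, size_t len)
{
	unsigned char *p = dst;
	size_t off;

	/* Fast path: zero fill, line-aligned start, whole number of lines. */
	if (c == 0 && len != 0 &&
	    (uintptr_t)p % CACHELINE_BYTES_DEMO == 0 &&
	    len % CACHELINE_BYTES_DEMO == 0) {
		for (off = 0; off < len; off += CACHELINE_BYTES_DEMO)
			zero_line_demo(p + off);
		return dst;
	}

	/* Everything else takes the ordinary byte-wise path. */
	return memset(dst, c, len);
}
```

Callers keep using one memset-style entry point and never need to know
the line size, which is the portability concern raised in the quoted
review comment.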

Anton


From anton at samba.org  Tue Feb 21 02:09:53 2006
From: anton at samba.org (Anton Blanchard)
Date: Tue, 21 Feb 2006 02:09:53 +1100
Subject: [PATCH 03/22] pHype specific stuff
In-Reply-To: <20060218005709.13620.77409.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005709.13620.77409.stgit@localhost.localdomain>
Message-ID: <20060220150953.GB19895@krispykreme>


Hi,

> +inline static u32 getLongBusyTimeSecs(int longBusyRetCode)
> +{
> +	switch (longBusyRetCode) {
> +	case H_LongBusyOrder1msec:
> +		return 1;
> +	case H_LongBusyOrder10msec:
> +		return 10;
> +	case H_LongBusyOrder100msec:
> +		return 100;
> +	case H_LongBusyOrder1sec:
> +		return 1000;
> +	case H_LongBusyOrder10sec:
> +		return 10000;
> +	case H_LongBusyOrder100sec:
> +		return 100000;
> +	default:
> +		return 1;
> +	}			/* eof switch */
> +}

Since this actually returns milliseconds it might be worth making it
obvious in the function name. Also no need to use studly caps for the
function name and variable. We will fix the studly caps H_LongBusy*
stuff another day :)
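
Something along these lines is what I have in mind, as a sketch only.
The enum is a placeholder for the real long-busy return codes, and every
name ending in _demo is made up for the example.

```c
/* Placeholder return codes; the real values live with the hcall defines. */
enum long_busy_rc_demo {
	LONG_BUSY_1_MSEC_DEMO,
	LONG_BUSY_10_MSEC_DEMO,
	LONG_BUSY_100_MSEC_DEMO,
	LONG_BUSY_1_SEC_DEMO,
	LONG_BUSY_10_SEC_DEMO,
	LONG_BUSY_100_SEC_DEMO,
};

/* The name says what the unit is: the result is in milliseconds. */
unsigned int long_busy_to_msecs_demo(int long_busy_rc)
{
	switch (long_busy_rc) {
	case LONG_BUSY_1_MSEC_DEMO:
		return 1;
	case LONG_BUSY_10_MSEC_DEMO:
		return 10;
	case LONG_BUSY_100_MSEC_DEMO:
		return 100;
	case LONG_BUSY_1_SEC_DEMO:
		return 1000;
	case LONG_BUSY_10_SEC_DEMO:
		return 10000;
	case LONG_BUSY_100_SEC_DEMO:
		return 100000;
	default:
		return 1;	/* unknown code: same fallback as the quoted patch */
	}
}
```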

> +inline static long plpar_hcall_7arg_7ret(unsigned long opcode,
> +inline static long plpar_hcall_9arg_9ret(unsigned long opcode,

These belong in arch/powerpc/platforms/pseries/hvCall.S

Anton


From anton at samba.org  Tue Feb 21 02:12:15 2006
From: anton at samba.org (Anton Blanchard)
Date: Tue, 21 Feb 2006 02:12:15 +1100
Subject: [PATCH 07/22] Hypercall definitions
In-Reply-To: <20060218005721.13620.84990.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005721.13620.84990.stgit@localhost.localdomain>
Message-ID: <20060220151215.GC19895@krispykreme>


Hi,

> Do these defines belong in the ehca driver, or should they be put
> somewhere in generic hypercall support?

Agreed, I think they should go into include/asm-powerpc/hvcall.h

Anton


From anton at samba.org  Tue Feb 21 02:22:13 2006
From: anton at samba.org (Anton Blanchard)
Date: Tue, 21 Feb 2006 02:22:13 +1100
Subject: [PATCH 21/22] ehca main file
In-Reply-To: <20060218005759.13620.10968.stgit@localhost.localdomain>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005759.13620.10968.stgit@localhost.localdomain>
Message-ID: <20060220152213.GD19895@krispykreme>

 
Hi,

> What is ehca_show_flightrecorder() trying to do that snprintf() is
> not fast enough?  If you need to pass a binary structure back to
> userspace (with a kernel address in it??) then sysfs is not the right
> place to put it.  Look at debugfs; or relayfs might make the most
> sense for your flightrecorder stuff.

I agree debugfs or relayfs would be better suited. Of course as the
driver matures this form of debug is probably not required at all.

> +#include "hcp_sense.h"		/* TODO: later via hipz_* header file */
> +#include "hcp_if.h"		/* TODO: later via hipz_* header file */

I count 88 TODOs in the driver, it would be nice to get rid of some of
them like the two above, so we can concentrate on the important TODOs :)

> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
> +#define EHCA_RESOURCE_ATTR_H(name)                                         \
> +static ssize_t  ehca_show_##name(struct device *dev,                       \
> +				 struct device_attribute *attr,            \
> +				 char *buf)
> +#else
> +#define EHCA_RESOURCE_ATTR_H(name)                                         \
> +static ssize_t  ehca_show_##name(struct device *dev,                       \
> +				 char *buf)
> +#endif

No need for kernel version ifdefs.

Anton


From rdreier at cisco.com  Tue Feb 21 03:52:55 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 20 Feb 2006 08:52:55 -0800
Subject: [PATCH 21/22] ehca main file
In-Reply-To: <20060220152213.GD19895@krispykreme> (Anton Blanchard's message
	of "Tue, 21 Feb 2006 02:22:13 +1100")
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060218005759.13620.10968.stgit@localhost.localdomain>
	<20060220152213.GD19895@krispykreme>
Message-ID: <adabqx27z14.fsf@cisco.com>

    Anton> No need for kernel version ifdefs.

Sorry, I tried to strip these out before posting the patch, but I
missed one.

Anyway, totally agree on the ifdefs and I will be double-extra-sure
that the final version doesn't include them.

 - R.


From rdreier at cisco.com  Tue Feb 21 03:55:24 2006
From: rdreier at cisco.com (Roland Dreier)
Date: Mon, 20 Feb 2006 08:55:24 -0800
Subject: [PATCH 00/22] [RFC] IBM eHCA InfiniBand adapter driver
In-Reply-To: <OF994D8D1D.24198E91-ONC125711B.00528887-C125711B.0052E575@de.ibm.com>
	(Christoph Raisch's message of "Mon, 20 Feb 2006 16:06:19 +0100")
References: <OF994D8D1D.24198E91-ONC125711B.00528887-C125711B.0052E575@de.ibm.com>
Message-ID: <ada7j7q7ywz.fsf@cisco.com>

    Christoph> I guess posting 22 new patch files (diff against NIL)
    Christoph> each week is sort of a DoS attack on the mailing list
    Christoph> and we'll end up in people's spam folders pretty
    Christoph> quickly...  So what's the recommended way to proceed
    Christoph> here?

I don't think there's any other way to proceed.  For each version, you
should carefully note down the feedback that you received and how you
are responding to each suggestion, and include that with the patch
file.  But it's too much to expect for people to keep context for a
patch under review, so even though it generates a lot of email, I
think that including the whole series is the only way to go.

Perhaps the list admins disagree with me though ;)

 - R.


From arndb at de.ibm.com  Tue Feb 21 04:26:25 2006
From: arndb at de.ibm.com (Arnd Bergmann)
Date: Mon, 20 Feb 2006 18:26:25 +0100
Subject: [FYI/PATCH 2/4] enable control-c for IBM Full System Simulator
In-Reply-To: <17397.35761.56383.60273@cargo.ozlabs.ibm.com>
References: <Pine.LNX.4.62.0602170636270.7683@tuxmkge1.boeblingen.de.ibm.com>
	<17397.35761.56383.60273@cargo.ozlabs.ibm.com>
Message-ID: <200602201826.25489.arndb@de.ibm.com>

On Friday 17 February 2006 09:39, Paul Mackerras wrote:
> Utz Bacher writes:
> 
> > +#ifndef CONFIG_PPC_SYSTEMSIM
> >                       noctty = 1;
> > +#endif
> 
> Why is this awful hack necessary?


It's not. It's just a workaround to boot systemsim without
any sort of /sbin/init logic that sets ctty.

I actually thought we had removed that hack earlier.

	Arnd <><


From RAISCH at de.ibm.com  Tue Feb 21 02:06:19 2006
From: RAISCH at de.ibm.com (Christoph Raisch)
Date: Mon, 20 Feb 2006 16:06:19 +0100
Subject: [PATCH 00/22] [RFC] IBM eHCA InfiniBand adapter driver
In-Reply-To: <20060218005532.13620.79663.stgit@localhost.localdomain>
Message-ID: <OF994D8D1D.24198E91-ONC125711B.00528887-C125711B.0052E575@de.ibm.com>


Roland,
as you already stated we really have a problem that we're not able to send
"large" pieces of code to the kernel mailing list.
It's perfectly ok for us to send patches to the openib.org mailing list and
svn.
This is something we are still trying to resolve with legal.
So thank you Roland for acting as a proxy here...
We have the ok to contribute to any ehca related discussion on kernel
mailing-list and ppc64-mailing list, and are absolutely willing to do so!

Adding a new driver for complex new hardware isn't the regular Linux
development case, especially if there's no base code in the Linux kernel to
patch against...
In our case this patch resulted in 22 postings.
Some people already noticed that there's still quite some road ahead of
us... but we're absolutely willing to work on that, and we had to start
somewhere.
Some comments will result in modifications to all files.
I guess posting 22 new patch files (diff against NIL) each week is sort of
a DoS attack on the mailing list and we'll end up in people's spam folders
pretty quickly...
So what's the recommended way to proceed here?


Gruss / Regards . . . Christoph Raisch

christoph raisch, HCAD teamlead

Roland Dreier wrote on 18.02.2006 01:55:32:

> Here's a series of patches that add an InfiniBand adapter driver
> for IBM eHCA hardware.  Please look it over with an eye towards issues
> that need to be addressed before merging this upstream.
>


From arnd at arndb.de  Tue Feb 21 05:32:31 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Mon, 20 Feb 2006 19:32:31 +0100
Subject: [openib-general] Re: [PATCH 21/22] ehca main file
In-Reply-To: <43FA7677.3040901@de.ibm.com>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>
	<20060220152213.GD19895@krispykreme> <43FA7677.3040901@de.ibm.com>
Message-ID: <200602201932.31739.arnd@arndb.de>

On Tuesday 21 February 2006 03:09, Heiko J Schick wrote:
>  >>+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
>  >>+#define EHCA_RESOURCE_ATTR_H(name)                                         \
>  >>+static ssize_t  ehca_show_##name(struct device *dev,                       \
>  >>+				 struct device_attribute *attr,            \
>  >>+				 char *buf)
>  >>+#else
>  >>+#define EHCA_RESOURCE_ATTR_H(name)                                         \
>  >>+static ssize_t  ehca_show_##name(struct device *dev,                       \
>  >>+				 char *buf)
>  >>+#endif
>  >
>  >
>  > No need for kernel version ifdefs.
> 
> The point is that our module has to run on Linux 2.6.5-7.244 (SuSE SLES 9 SP3), too.
> This was the reason why we've included the ifdefs. We can change the ifdefs to
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,5) to mark that this code is used for
> Linux 2.6.5 compatibility.

That only makes sense as long as you have a common source code for both
that also is under your control. As soon as the driver enters the mainline
kernel, it is no longer helpful to have these checks in it, because other
people will start making changes to the driver that you don't want to
have in the 2.6.5 version.

You cannot avoid forking the code in the long term, but fortunately the
need to backport fixes to the old version should also decrease over time.

	Arnd <><


From spoole at lanl.gov  Tue Feb 21 04:43:51 2006
From: spoole at lanl.gov (Stephen Poole)
Date: Mon, 20 Feb 2006 10:43:51 -0700
Subject: [openib-general] Re: [PATCH 00/22] [RFC] IBM eHCA InfiniBand 
	adapter driver
In-Reply-To: <OF994D8D1D.24198E91-ONC125711B.00528887-C125711B.0052E575@de.ibm.com>
References: <OF994D8D1D.24198E91-ONC125711B.00528887-C125711B.0052E575@de.ibm.com>
Message-ID: <a0623090bc01fafbc5685@[192.168.0.12]>

If every open source company was being sued for $3B I think many 
companies would be a bit timid. :-) IBM has been working this issue 
at all levels. It will happen when IBM Legal has figured out all of 
the necessary paths in order to cover any potential lawsuits. 
Unfortunately, the open source path has been muddied by some folks.

Steve...

At 4:06 PM +0100 2/20/06, Christoph Raisch wrote:
>Roland,
>as you already stated we really have a problem that we're not able to send
>"large" pieces of code to the kernel mailing list.
>It's perfectly ok for us to send patches to the openib.org mailing list and
>svn.
>This is something we are still trying to resolve with legal.
>So thank you Roland for acting as a proxy here...
>We have the ok to contribute to any ehca related discussion on kernel
>mailing-list and ppc64-mailing list, and are absolutely willing to do so!
>
>Adding a new driver for complex new hardware isn't the regular Linux
>development case, especially if there's no base code in the Linux kernel to
>patch against...
>In our case this patch resulted in 22 postings.
>Some people already noticed that there's still quite some road ahead of
>us... but we're absolutely willing to work on that, and we had to start
>somewhere.
>Some comments will result in modifications to all files.
>I guess posting 22 new patch files (diff against NIL) each week is sort of
>a DoS attack on the mailing list and we'll end up in people's spam folders
>pretty quickly...
>So what's the recommended way to proceed here?
>
>
>Gruss / Regards . . . Christoph Raisch
>
>christoph raisch, HCAD teamlead
>
>Roland Dreier wrote on 18.02.2006 01:55:32:
>
>>  Here's a series of patches that add an InfiniBand adapter driver
>>  for IBM eHCA hardware.  Please look it over with an eye towards issues
>>  that need to be addressed before merging this upstream.
>>
>


-- 
Steve Poole (spoole at lanl.gov)				Office: 505.665.9662
Los Alamos National Laboratory					Cell: 505.699.3807
CCN - Special Projects / Advanced Development			Fax: 505.665.7793
P.O. Box 1663, MS B255
Los Alamos, NM. 87545
03149801S


From schihei at de.ibm.com  Tue Feb 21 13:09:59 2006
From: schihei at de.ibm.com (Heiko J Schick)
Date: Tue, 21 Feb 2006 03:09:59 +0100
Subject: [openib-general] Re: [PATCH 21/22] ehca main file
In-Reply-To: <20060220152213.GD19895@krispykreme>
References: <20060218005532.13620.79663.stgit@localhost.localdomain>	<20060218005759.13620.10968.stgit@localhost.localdomain>
	<20060220152213.GD19895@krispykreme>
Message-ID: <43FA7677.3040901@de.ibm.com>

Hello Anton,

thanks for your help!

 >>+#include "hcp_sense.h"		/* TODO: later via hipz_* header file */
 >>+#include "hcp_if.h"		/* TODO: later via hipz_* header file */
 >
 >
 > I count 88 TODOs in the driver, it would be nice to get rid of some of
 > them like the two above, so we can concentrate on the important TODOs :)

We will remove the TODOs as soon as possible.

 >>+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,12)
 >>+#define EHCA_RESOURCE_ATTR_H(name)                                         \
 >>+static ssize_t  ehca_show_##name(struct device *dev,                       \
 >>+				 struct device_attribute *attr,            \
 >>+				 char *buf)
 >>+#else
 >>+#define EHCA_RESOURCE_ATTR_H(name)                                         \
 >>+static ssize_t  ehca_show_##name(struct device *dev,                       \
 >>+				 char *buf)
 >>+#endif
 >
 >
 > No need for kernel version ifdefs.

The point is that our module has to run on Linux 2.6.5-7.244 (SuSE SLES 9 SP3), too.
This was the reason why we've included the ifdefs. We can change the ifdefs to
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,5) to mark that this code is used for
Linux 2.6.5 compatibility.
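
As a rough user-space illustration of how such a version check evaluates
(a sketch only, not part of the driver): KERNEL_VERSION takes three
comma-separated arguments and packs them into one integer. The macro is
redefined here with the encoding used by 2.6-era <linux/version.h>, and
LINUX_VERSION_CODE is pinned to 2.6.5 as an assumption for the example.
The helper name ehca_show_has_attr_arg() is hypothetical.

```c
/* Same encoding as the 2.6-era <linux/version.h>, redefined here so the
 * sketch builds outside the kernel tree. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption for this example: pretend we are building against 2.6.5.
 * In a real build this value comes from the kernel headers. */
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 6, 5)

/* Hypothetical helper: returns 1 when the branch with the extra
 * struct device_attribute * argument would be selected by the #if
 * from the patch, and 0 when the older two-argument form is used. */
static int ehca_show_has_attr_arg(void)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 12)
	return 1;
#else
	return 0;
#endif
}
```

With the 2.6.5 value assumed above, the #else branch is chosen, which is
the SLES 9 case described here.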

Regards,
	Heiko


From utz.bacher at de.ibm.com  Tue Feb 21 04:33:39 2006
From: utz.bacher at de.ibm.com (Utz Bacher)
Date: Mon, 20 Feb 2006 18:33:39 +0100
Subject: [FYI/PATCH 2/4] enable control-c for IBM Full System Simulator
In-Reply-To: <200602201826.25489.arndb@de.ibm.com>
Message-ID: <OF98E428D5.BA475B45-ONC125711B.005FFCC5-C125711B.00608323@de.ibm.com>

Arnd Bergmann wrote on 20.02.2006 18:26:25:
> On Friday 17 February 2006 09:39, Paul Mackerras wrote:
> > Utz Bacher writes:
> > 
> > > +#ifndef CONFIG_PPC_SYSTEMSIM
> > >                       noctty = 1;
> > > +#endif
> > 
> > Why is this awful hack necessary?
> 
> 
> It's not. It's just a workaround to boot systemsim without
> any sort of /sbin/init logic that sets ctty.
> 
> I actually thought we had removed that hack earlier.

The idea was to keep the system simulator environment very small, with no
login etc. It shouldn't really go into a proper kernel. What we failed to
point out (probably for all four patches), and I take the blame for that, is
that this is meant to be used when packages for such environments are built
ready for use, as our friends at http://www.bsc.es/ do; it is not something
that should go anywhere else, such as the mainline kernel, and it should
eventually disappear.

Utz

:wq

From ahuja at austin.ibm.com  Tue Feb 21 10:03:46 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Mon, 20 Feb 2006 17:03:46 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <20060216091027.GA826@localhost.localdomain>
References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com>
	<20060214183259.28a6a501.sfr@canb.auug.org.au>
	<43F40312.2020800@austin.ibm.com>
	<20060216091027.GA826@localhost.localdomain>
Message-ID: <43FA4AD2.3090503@austin.ibm.com>

David Gibson wrote:

>On Wed, Feb 15, 2006 at 10:44:02PM -0600, Manish Ahuja wrote:
>[snip]
>>>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
>>>>===================================================================
>>>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c	2005-12-18 16:36:54.000000000 -0800
>>>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c	2006-01-17 21:20:25.000000000 -0800
>>>>@@ -243,6 +243,7 @@
>>>>	struct thread_struct *new_thread, *old_thread;
>>>>	unsigned long flags;
>>>>	struct task_struct *last;
>>>>+	struct paca_struct *lpaca;
>>>
>>>This could have been declared below (near pd)
>>
>>Yes... But it seems fine there..
>
>Actually, I've been trying to get rid of lpaca locals everywhere.
>Using get_paca() directly is barely more verbose, and usually clearer.

I can change it accordingly..

-Manish


From sharada at in.ibm.com  Tue Feb 21 16:44:49 2006
From: sharada at in.ibm.com (R Sharada)
Date: Tue, 21 Feb 2006 11:14:49 +0530
Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear
Message-ID: <20060221054448.GA1695@in.ibm.com>

Hello,
	kexec on Power4 (non-lpar) was breaking because of a spinlock 
recursion problem in native_hpte_clear. This patch fixes the recursion by 
changing the call to tlbie() in native_hpte_clear to call __tlbie() (as per
Milton's suggestion).
	native_hpte_clear and slot2va still do not support clearing of 
large pages (>4K pages). I do not know the large page support code well
enough to fix that at the moment. If anyone has any ideas or can help fix
the hpte_clear code to add support for large pages, that would be appreciated.

	With this patch, I am able to kexec boot on Power4 non-lpar.
	Please review, provide comments, and consider for acceptance

Thanks and Regards,
Sharada


native_hpte_clear has a spin_lock recursion problem: native_tlbie_lock is
taken twice. Fix this by changing the tlbie() call in native_hpte_clear to
__tlbie(). It still supports only 4K pages for now.
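
To illustrate the pattern (a user-space sketch, not kernel code): it
assumes, as the description above implies, that the locked wrapper takes
the same lock the caller already holds, while the double-underscore
variant expects the caller to hold it. All toy_* names are hypothetical
stand-ins, and the toy lock only records a second acquisition where a
real non-recursive spinlock would spin forever.

```c
/* Toy non-recursive lock: taking it while already held is the bug. */
struct toy_lock {
	int held;
	int recursion_detected;
};

static struct toy_lock toy_tlbie_lock;
static int toy_flushes;

static void toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		l->recursion_detected = 1; /* a real spinlock would deadlock here */
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = 0;
}

/* Unlocked worker (stands in for __tlbie): caller must hold the lock. */
static void toy_flush_nolock(unsigned long va)
{
	(void)va;
	toy_flushes++;
}

/* Locked wrapper (stands in for tlbie): takes the lock itself. */
static void toy_flush(unsigned long va)
{
	toy_lock_acquire(&toy_tlbie_lock);
	toy_flush_nolock(va);
	toy_lock_release(&toy_tlbie_lock);
}

/* Flushes n entries while holding the lock, as the clear routine does.
 * use_wrapper != 0 models the old, buggy call. Returns 1 if the lock
 * was acquired recursively, 0 otherwise. */
static int toy_clear_all(int n, int use_wrapper)
{
	int i;

	toy_tlbie_lock.recursion_detected = 0;
	toy_lock_acquire(&toy_tlbie_lock);
	for (i = 0; i < n; i++) {
		if (use_wrapper)
			toy_flush((unsigned long)i);
		else
			toy_flush_nolock((unsigned long)i);
	}
	toy_lock_release(&toy_tlbie_lock);
	return toy_tlbie_lock.recursion_detected;
}
```

Calling toy_clear_all(n, 1) reports the recursion; toy_clear_all(n, 0),
which mirrors the patched code path, does not.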


Signed-off-by: R Sharada <sharada at in.ibm.com>
---


diff -puN arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear arch/powerpc/mm/hash_native_64.c
--- linux-2.6.16-rc4-tlbie/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear	2006-02-20 22:01:49.000000000 +0530
+++ linux-2.6.16-rc4-tlbie-sharada/arch/powerpc/mm/hash_native_64.c	2006-02-20 22:05:31.000000000 +0530
@@ -383,6 +383,7 @@ static void native_hpte_clear(void)
 	hpte_t *hptep = htab_address;
 	unsigned long hpte_v;
 	unsigned long pteg_count;
+	unsigned long va;
 
 	pteg_count = htab_hash_mask + 1;
 
@@ -405,10 +406,12 @@ static void native_hpte_clear(void)
 
 		if (hpte_v & HPTE_V_VALID) {
 			hptep->v = 0;
-			tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K, 0);
+			va = slot2va(hpte_v, slot);
+			__tlbie(va, MMU_PAGE_4K);
 		}
 	}
 
+	asm volatile("eieio; tlbsync; ptesync":::"memory");
 	spin_unlock(&native_tlbie_lock);
 	local_irq_restore(flags);
 }
_


From michael at ellerman.id.au  Tue Feb 21 17:22:55 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 21 Feb 2006 17:22:55 +1100
Subject: [PATCH] powerpc: Only calculate htab_size in one place for kexec
Message-ID: <20060221062320.6EFB5679F5@ozlabs.org>

For kexec we need to know the size of the htab.

Currently we calculate the size once in the htab code, and then twice more in
the kexec code, once using htab_hash_mask and once using ppc64_pft_size.
On some machines the ppc64_pft_size calculation is broken because
ppc64_pft_size is not set.

So we need to fix the second calculation, but better still we should just
calculate the size once and use it everywhere else.
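
A small user-space sketch of the two kexec-side calculations being
replaced, for illustration only. The expressions are taken from the lines
removed by the diff below; the function names are hypothetical, and the
64MB table size used in the usage note is an arbitrary assumption.

```c
/* Size of each hash group (PTEG) in bytes, as in the removed #define. */
#define HASH_GROUP_SIZE 0x80UL

/* First kexec-side calculation: derive the table size from the mask. */
static unsigned long htab_size_from_hash_mask(unsigned long htab_hash_mask)
{
	return (htab_hash_mask + 1) * HASH_GROUP_SIZE;
}

/* Second kexec-side calculation: derive it from the log2 size.
 * If ppc64_pft_size was never set (0), this yields 1 byte. */
static unsigned long htab_size_from_pft_size(unsigned long ppc64_pft_size)
{
	return 1UL << ppc64_pft_size;
}
```

For a 64MB table (pft size 26, mask (1UL << 19) - 1) both functions
agree; with an unset pft size of 0 the second one returns 1, which is
the breakage described above.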

Tested on Power5 LPAR, Power4 non-LPAR and Power3.

Kexec is broken on some non-LPAR machines without this, so I think it should
go upstream for 2.6.16.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/kernel/machine_kexec_64.c |   10 +++-------
 arch/powerpc/mm/hash_utils_64.c        |    3 ++-
 include/asm-powerpc/mmu.h              |    1 +
 3 files changed, 6 insertions(+), 8 deletions(-)

Index: to-merge/arch/powerpc/kernel/machine_kexec_64.c
===================================================================
--- to-merge.orig/arch/powerpc/kernel/machine_kexec_64.c
+++ to-merge/arch/powerpc/kernel/machine_kexec_64.c
@@ -26,8 +26,6 @@
 #include <asm/prom.h>
 #include <asm/smp.h>
 
-#define HASH_GROUP_SIZE 0x80	/* size of each hash group, asm/mmu.h */
-
 int default_machine_kexec_prepare(struct kimage *image)
 {
 	int i;
@@ -61,7 +59,7 @@ int default_machine_kexec_prepare(struct
 	 */
 	if (htab_address) {
 		low = __pa(htab_address);
-		high = low + (htab_hash_mask + 1) * HASH_GROUP_SIZE;
+		high = low + htab_size_bytes;
 
 		for (i = 0; i < image->nr_segments; i++) {
 			begin = image->segment[i].mem;
@@ -294,7 +292,7 @@ void default_machine_kexec(struct kimage
 }
 
 /* Values we need to export to the second kernel via the device tree. */
-static unsigned long htab_base, htab_size, kernel_end;
+static unsigned long htab_base, kernel_end;
 
 static struct property htab_base_prop = {
 	.name = "linux,htab-base",
@@ -305,7 +303,7 @@ static struct property htab_base_prop = 
 static struct property htab_size_prop = {
 	.name = "linux,htab-size",
 	.length = sizeof(unsigned long),
-	.value = (unsigned char *)&htab_size,
+	.value = (unsigned char *)&htab_size_bytes,
 };
 
 static struct property kernel_end_prop = {
@@ -331,8 +329,6 @@ static void __init export_htab_values(vo
 
 	htab_base = __pa(htab_address);
 	prom_add_property(node, &htab_base_prop);
-
-	htab_size = 1UL << ppc64_pft_size;
 	prom_add_property(node, &htab_size_prop);
 
  out:
Index: to-merge/arch/powerpc/mm/hash_utils_64.c
===================================================================
--- to-merge.orig/arch/powerpc/mm/hash_utils_64.c
+++ to-merge/arch/powerpc/mm/hash_utils_64.c
@@ -88,6 +88,7 @@ static unsigned long _SDR1;
 struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
 
 hpte_t *htab_address;
+unsigned long htab_size_bytes;
 unsigned long htab_hash_mask;
 int mmu_linear_psize = MMU_PAGE_4K;
 int mmu_virtual_psize = MMU_PAGE_4K;
@@ -399,7 +400,7 @@ void create_section_mapping(unsigned lon
 
 void __init htab_initialize(void)
 {
-	unsigned long table, htab_size_bytes;
+	unsigned long table;
 	unsigned long pteg_count;
 	unsigned long mode_rw;
 	unsigned long base = 0, size = 0;
Index: to-merge/include/asm-powerpc/mmu.h
===================================================================
--- to-merge.orig/include/asm-powerpc/mmu.h
+++ to-merge/include/asm-powerpc/mmu.h
@@ -112,6 +112,7 @@ typedef struct {
 } hpte_t;
 
 extern hpte_t *htab_address;
+extern unsigned long htab_size_bytes;
 extern unsigned long htab_hash_mask;
 
 /*


From johnrose at austin.ibm.com  Wed Feb 22 07:55:41 2006
From: johnrose at austin.ibm.com (John Rose)
Date: Tue, 21 Feb 2006 14:55:41 -0600
Subject: [PATCH 2/2] Fix dynamic PCI probe regression
Message-ID: <1140555341.24859.15.camel@sinatra.austin.ibm.com>

Some hotplug driver functions were migrated to the kernel for use by EEH
in the following set of changes:  http://tinyurl.com/qke9r

Previously, the PCI Hotplug module had been changed to use the new
OFDT-based PCI probe when appropriate:  http://tinyurl.com/jy4jl

When rpaphp_pci_config_slot() was moved from the rpaphp driver to the
new kernel function pcibios_add_pci_devices(), the OFDT-based probe
stuff was dropped.  This patch restores it.

Signed-off-by: John Rose <johnrose at austin.ibm.com>

diff -puN arch/powerpc/platforms/pseries/pci_dlpar.c~reorg_regress arch/powerpc/platforms/pseries/pci_dlpar.c
--- 2_6_linus/arch/powerpc/platforms/pseries/pci_dlpar.c~reorg_regress	2006-02-21 14:54:10.000000000 -0600
+++ 2_6_linus-johnrose/arch/powerpc/platforms/pseries/pci_dlpar.c	2006-02-21 14:54:10.000000000 -0600
@@ -106,6 +106,8 @@ pcibios_fixup_new_pci_devices(struct pci
 			}
 		}
 	}
+
+	eeh_add_device_tree_late(bus);
 }
 EXPORT_SYMBOL_GPL(pcibios_fixup_new_pci_devices);
 
@@ -114,7 +116,6 @@ pcibios_pci_config_bridge(struct pci_dev
 {
 	u8 sec_busno;
 	struct pci_bus *child_bus;
-	struct pci_dev *child_dev;
 
 	/* Get busno of downstream bus */
 	pci_read_config_byte(dev, PCI_SECONDARY_BUS, &sec_busno);
@@ -129,10 +130,6 @@ pcibios_pci_config_bridge(struct pci_dev
 
 	pci_scan_child_bus(child_bus);
 
-	list_for_each_entry(child_dev, &child_bus->devices, bus_list) {
-		eeh_add_device_late(child_dev);
-	}
-
 	/* Fixup new pci devices without touching bus struct */
 	pcibios_fixup_new_pci_devices(child_bus, 0);
 
@@ -160,18 +157,25 @@ pcibios_add_pci_devices(struct pci_bus *
 
 	eeh_add_device_tree_early(dn);
 
-	/* pci_scan_slot should find all children */
-	slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
-	num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
-	if (num) {
-		pcibios_fixup_new_pci_devices(bus, 1);
-		pci_bus_add_devices(bus);
-	}
+	if (_machine == PLATFORM_PSERIES_LPAR) {
+		/* use ofdt-based probe */
+		of_scan_bus(dn, bus);
+		if (!list_empty(&bus->devices)) {
+			pcibios_fixup_new_pci_devices(bus, 0);
+			pci_bus_add_devices(bus);
+		}
+	} else {
+		/* use legacy probe */
+		slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
+		num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
+		if (num) {
+			pcibios_fixup_new_pci_devices(bus, 1);
+			pci_bus_add_devices(bus);
+		}
 
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		eeh_add_device_late (dev);
-		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
-			pcibios_pci_config_bridge(dev);
+		list_for_each_entry(dev, &bus->devices, bus_list)
+			if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+				pcibios_pci_config_bridge(dev);
 	}
 }
 EXPORT_SYMBOL_GPL(pcibios_add_pci_devices);
diff -puN arch/powerpc/platforms/pseries/eeh.c~reorg_regress arch/powerpc/platforms/pseries/eeh.c
--- 2_6_linus/arch/powerpc/platforms/pseries/eeh.c~reorg_regress	2006-02-21 14:54:10.000000000 -0600
+++ 2_6_linus-johnrose/arch/powerpc/platforms/pseries/eeh.c	2006-02-21 14:54:10.000000000 -0600
@@ -917,6 +917,20 @@ void eeh_add_device_late(struct pci_dev 
 	pci_addr_cache_insert_device (dev);
 }
 
+void eeh_add_device_tree_late(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+ 		eeh_add_device_late(dev);
+ 		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+ 			struct pci_bus *subbus = dev->subordinate;
+ 			if (subbus)
+ 				eeh_add_device_tree_late(subbus);
+ 		}
+	}
+}
+
 /**
  * eeh_remove_device - undo EEH setup for the indicated pci device
  * @dev: pci device to be removed
diff -puN include/asm-powerpc/eeh.h~reorg_regress include/asm-powerpc/eeh.h
--- 2_6_linus/include/asm-powerpc/eeh.h~reorg_regress	2006-02-21 14:54:10.000000000 -0600
+++ 2_6_linus-johnrose/include/asm-powerpc/eeh.h	2006-02-21 14:54:10.000000000 -0600
@@ -27,6 +27,7 @@
 #include <linux/string.h>
 
 struct pci_dev;
+struct pci_bus;
 struct device_node;
 
 #ifdef CONFIG_EEH
@@ -51,7 +52,7 @@ int eeh_dn_check_failure(struct device_n
 void __init pci_addr_cache_build(void);
 
 void eeh_add_device_tree_early(struct device_node *);
-void eeh_add_device_late(struct pci_dev *);
+void eeh_add_device_tree_late(struct pci_bus *);
 
 /**
  * eeh_remove_bus_device - undo EEH for device & children.
@@ -92,10 +93,10 @@ static inline int eeh_dn_check_failure(s
 
 static inline void pci_addr_cache_build(void) { }
 
-static inline void eeh_add_device_late(struct pci_dev *dev) { }
-
 static inline void eeh_add_device_tree_early(struct device_node *dn) { }
 
+static inline void eeh_add_device_tree_late(struct pci_bus *bus) { }
+
 static inline void eeh_remove_bus_device(struct pci_dev *dev) { }
 #define EEH_POSSIBLE_ERROR(val, type) (0)
 #define EEH_IO_ERROR_VALUE(size) (-1UL)

_


From linas at austin.ibm.com  Wed Feb 22 08:14:02 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 21 Feb 2006 15:14:02 -0600
Subject: [PATCH 1/2] EEH cleanups
In-Reply-To: <1140555218.24859.11.camel@sinatra.austin.ibm.com>
References: <1140555218.24859.11.camel@sinatra.austin.ibm.com>
Message-ID: <20060221211402.GD26339@austin.ibm.com>

Hi,

On Tue, Feb 21, 2006 at 02:53:38PM -0600, John Rose was heard to remark:
> This patch removes unnecessary exports, marks functions as static when
> possible, and simplifies some list-related code.
> 
> Signed-off-by: John Rose <johnrose at austin.ibm.com>

Looks reasonable to me; I have one request, though. The patch
removes the following documentation from eeh.h; can you copy 
this over to eeh.c? (what's there now is shorter and has 
a typo.)

>  /**
> - * eeh_remove_device - undo EEH setup for the indicated pci device
> - * @dev: pci device to be removed
> - *
> - * This routine should be called when a device is removed from
> - * a running system (e.g. by hotplug or dlpar).  It unregisters
> - * the PCI device from the EEH subsystem.  I/O errors affecting
> - * this device will no longer be detected after this call; thus,
> - * i/o errors affecting this slot may leave this device unusable.
> - */

I won't be here tomorrow to ack anything revised then, 
so I'll just ack now:

Acked-by: Linas Vepstas <linas at austin.ibm.com>

--linas


From johnrose at austin.ibm.com  Wed Feb 22 08:21:45 2006
From: johnrose at austin.ibm.com (John Rose)
Date: Tue, 21 Feb 2006 15:21:45 -0600
Subject: [PATCH 1/2] EEH cleanups
In-Reply-To: <20060221211402.GD26339@austin.ibm.com>
References: <1140555218.24859.11.camel@sinatra.austin.ibm.com>
	<20060221211402.GD26339@austin.ibm.com>
Message-ID: <1140556904.24859.18.camel@sinatra.austin.ibm.com>

This patch removes unnecessary exports, marks functions as static when
possible, and simplifies some list-related code.
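
As an aside on the list simplification: the _safe list iterator used in
the diff below keeps a second cursor so that the current entry may go
away during the walk. A minimal user-space sketch of that idea follows;
it uses a plain singly linked list rather than the kernel's list_head,
and every toy_* name is hypothetical.

```c
#include <stdlib.h>

struct toy_dev {
	struct toy_dev *next;
	int id;
};

/* Prepends a new node; returns the new head (or the old one if the
 * allocation fails). */
static struct toy_dev *toy_add(struct toy_dev *head, int id)
{
	struct toy_dev *d = malloc(sizeof(*d));

	if (!d)
		return head;
	d->id = id;
	d->next = head;
	return d;
}

/* Frees every node and returns how many were removed. The next pointer
 * is saved in tmp before the current node is freed, which is the same
 * precaution the _safe iterator takes with its temporary cursor. */
static int toy_remove_all(struct toy_dev **head)
{
	struct toy_dev *child, *tmp;
	int removed = 0;

	for (child = *head; child; child = tmp) {
		tmp = child->next;
		free(child);
		removed++;
	}
	*head = NULL;
	return removed;
}
```

Reading child->next after free(child), as a plain iterator would, is the
failure mode the temporary cursor avoids.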

Signed-off-by: John Rose <johnrose at austin.ibm.com>
Acked-by: Linas Vepstas <linas at austin.ibm.com>

diff -puN arch/powerpc/platforms/pseries/eeh.c~eeh_cleanups arch/powerpc/platforms/pseries/eeh.c
--- 2_6_linus/arch/powerpc/platforms/pseries/eeh.c~eeh_cleanups	2006-02-21 15:20:08.000000000 -0600
+++ 2_6_linus-johnrose/arch/powerpc/platforms/pseries/eeh.c	2006-02-21 15:24:32.000000000 -0600
@@ -409,8 +409,6 @@ dn_unlock:
 	return rc;
 }
 
-EXPORT_SYMBOL_GPL(eeh_dn_check_failure);
-
 /**
  * eeh_check_failure - check if all 1's data is due to EEH slot freeze
  * @token i/o token, should be address in the form 0xA....
@@ -865,7 +863,7 @@ void __init eeh_init(void)
  * on the CEC architecture, type of the device, on earlier boot
  * command-line arguments & etc.
  */
-void eeh_add_device_early(struct device_node *dn)
+static void eeh_add_device_early(struct device_node *dn)
 {
 	struct pci_controller *phb;
 	struct eeh_early_enable_info info;
@@ -882,7 +880,6 @@ void eeh_add_device_early(struct device_
 	info.buid_lo = BUID_LO(phb->buid);
 	early_enable_eeh(dn, &info);
 }
-EXPORT_SYMBOL_GPL(eeh_add_device_early);
 
 void eeh_add_device_tree_early(struct device_node *dn)
 {
@@ -919,16 +916,18 @@ void eeh_add_device_late(struct pci_dev 
 
 	pci_addr_cache_insert_device (dev);
 }
-EXPORT_SYMBOL_GPL(eeh_add_device_late);
 
-/**
- * eeh_remove_device - undo EEH setup for the indicated pci device
- * @dev: pci device to be removed
- *
- * This routine should be when a device is removed from a running
- * system (e.g. by hotplug or dlpar).
- */
-void eeh_remove_device(struct pci_dev *dev)
+ /**
+  * eeh_remove_device - undo EEH setup for the indicated pci device
+  * @dev: pci device to be removed
+  *
+  * This routine should be called when a device is removed from
+  * a running system (e.g. by hotplug or dlpar).  It unregisters
+  * the PCI device from the EEH subsystem.  I/O errors affecting
+  * this device will no longer be detected after this call; thus,
+  * i/o errors affecting this slot may leave this device unusable.
+  */
+static void eeh_remove_device(struct pci_dev *dev)
 {
 	struct device_node *dn;
 	if (!dev || !eeh_subsystem_enabled)
@@ -944,21 +943,16 @@ void eeh_remove_device(struct pci_dev *d
 	PCI_DN(dn)->pcidev = NULL;
 	pci_dev_put (dev);
 }
-EXPORT_SYMBOL_GPL(eeh_remove_device);
 
 void eeh_remove_bus_device(struct pci_dev *dev)
 {
+	struct pci_bus *bus = dev->subordinate;
+	struct pci_dev *child, *tmp;
+
 	eeh_remove_device(dev);
-	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
-		struct pci_bus *bus = dev->subordinate;
-		struct list_head *ln;
-		if (!bus)
-			return; 
-		for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) {
-			struct pci_dev *pdev = pci_dev_b(ln);
-			if (pdev)
-				eeh_remove_bus_device(pdev);
-		}
+	if (bus && dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+		list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
+			eeh_remove_bus_device(child);
 	}
 }
 EXPORT_SYMBOL_GPL(eeh_remove_bus_device);
diff -puN include/asm-powerpc/eeh.h~eeh_cleanups include/asm-powerpc/eeh.h
--- 2_6_linus/include/asm-powerpc/eeh.h~eeh_cleanups	2006-02-21 15:20:08.000000000 -0600
+++ 2_6_linus-johnrose/include/asm-powerpc/eeh.h	2006-02-21 15:24:32.000000000 -0600
@@ -50,33 +50,11 @@ unsigned long eeh_check_failure(const vo
 int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev);
 void __init pci_addr_cache_build(void);
 
-/**
- * eeh_add_device_early
- * eeh_add_device_late
- *
- * Perform eeh initialization for devices added after boot.
- * Call eeh_add_device_early before doing any i/o to the
- * device (including config space i/o).  Call eeh_add_device_late
- * to finish the eeh setup for this device.
- */
-void eeh_add_device_early(struct device_node *);
 void eeh_add_device_tree_early(struct device_node *);
 void eeh_add_device_late(struct pci_dev *);
 
 /**
- * eeh_remove_device - undo EEH setup for the indicated pci device
- * @dev: pci device to be removed
- *
- * This routine should be called when a device is removed from
- * a running system (e.g. by hotplug or dlpar).  It unregisters
- * the PCI device from the EEH subsystem.  I/O errors affecting
- * this device will no longer be detected after this call; thus,
- * i/o errors affecting this slot may leave this device unusable.
- */
-void eeh_remove_device(struct pci_dev *);
-
-/**
- * eeh_remove_device_recursive - undo EEH for device & children.
+ * eeh_remove_bus_device - undo EEH for device & children.
  * @dev: pci device to be removed
  *
  * As above, this removes the device; it also removes child
@@ -114,12 +92,8 @@ static inline int eeh_dn_check_failure(s
 
 static inline void pci_addr_cache_build(void) { }
 
-static inline void eeh_add_device_early(struct device_node *dn) { }
-
 static inline void eeh_add_device_late(struct pci_dev *dev) { }
 
-static inline void eeh_remove_device(struct pci_dev *dev) { }
-
 static inline void eeh_add_device_tree_early(struct device_node *dn) { }
 
 static inline void eeh_remove_bus_device(struct pci_dev *dev) { }

_


From linas at austin.ibm.com  Wed Feb 22 08:29:07 2006
From: linas at austin.ibm.com (Linas Vepstas)
Date: Tue, 21 Feb 2006 15:29:07 -0600
Subject: [PATCH 2/2] Fix dynamic PCI probe regression
In-Reply-To: <1140555341.24859.15.camel@sinatra.austin.ibm.com>
References: <1140555341.24859.15.camel@sinatra.austin.ibm.com>
Message-ID: <20060221212907.GE26339@austin.ibm.com>

On Tue, Feb 21, 2006 at 02:55:41PM -0600, John Rose was heard to remark:
> 
> When rpaphp_pci_config_slot() was moved from the rpaphp driver to the
> new kernel function pcibios_add_pci_devices(), the OFDT-based probe
> stuff was dropped.  This patch restores it.

I did that. Sorry. I think I even know how/why; but I'll spare you the
convoluted excuse. The ofdt logic flow certainly looks cleaner now.
 
> Signed-off-by: John Rose <johnrose at austin.ibm.com>

I haven't tested this patch, but after reading it, it looks good to me.
So:

Acked-by: Linas Vepstas <linas at austin.ibm.com>

--linas


From johnrose at austin.ibm.com  Wed Feb 22 07:53:38 2006
From: johnrose at austin.ibm.com (John Rose)
Date: Tue, 21 Feb 2006 14:53:38 -0600
Subject: [PATCH 1/2] EEH cleanups
Message-ID: <1140555218.24859.11.camel@sinatra.austin.ibm.com>

This patch removes unnecessary exports, marks functions as static when
possible, and simplifies some list-related code.

Signed-off-by: John Rose <johnrose at austin.ibm.com>

diff -puN arch/powerpc/platforms/pseries/eeh.c~eeh_cleanups arch/powerpc/platforms/pseries/eeh.c
--- 2_6_linus/arch/powerpc/platforms/pseries/eeh.c~eeh_cleanups	2006-02-21 14:40:43.000000000 -0600
+++ 2_6_linus-johnrose/arch/powerpc/platforms/pseries/eeh.c	2006-02-21 14:55:34.000000000 -0600
@@ -409,8 +409,6 @@ dn_unlock:
 	return rc;
 }
 
-EXPORT_SYMBOL_GPL(eeh_dn_check_failure);
-
 /**
  * eeh_check_failure - check if all 1's data is due to EEH slot freeze
  * @token i/o token, should be address in the form 0xA....
@@ -865,7 +863,7 @@ void __init eeh_init(void)
  * on the CEC architecture, type of the device, on earlier boot
  * command-line arguments & etc.
  */
-void eeh_add_device_early(struct device_node *dn)
+static void eeh_add_device_early(struct device_node *dn)
 {
 	struct pci_controller *phb;
 	struct eeh_early_enable_info info;
@@ -882,7 +880,6 @@ void eeh_add_device_early(struct device_
 	info.buid_lo = BUID_LO(phb->buid);
 	early_enable_eeh(dn, &info);
 }
-EXPORT_SYMBOL_GPL(eeh_add_device_early);
 
 void eeh_add_device_tree_early(struct device_node *dn)
 {
@@ -919,7 +916,6 @@ void eeh_add_device_late(struct pci_dev 
 
 	pci_addr_cache_insert_device (dev);
 }
-EXPORT_SYMBOL_GPL(eeh_add_device_late);
 
 /**
  * eeh_remove_device - undo EEH setup for the indicated pci device
@@ -928,7 +924,7 @@ EXPORT_SYMBOL_GPL(eeh_add_device_late);
  * This routine should be when a device is removed from a running
  * system (e.g. by hotplug or dlpar).
  */
-void eeh_remove_device(struct pci_dev *dev)
+static void eeh_remove_device(struct pci_dev *dev)
 {
 	struct device_node *dn;
 	if (!dev || !eeh_subsystem_enabled)
@@ -944,21 +940,16 @@ void eeh_remove_device(struct pci_dev *d
 	PCI_DN(dn)->pcidev = NULL;
 	pci_dev_put (dev);
 }
-EXPORT_SYMBOL_GPL(eeh_remove_device);
 
 void eeh_remove_bus_device(struct pci_dev *dev)
 {
+	struct pci_bus *bus = dev->subordinate;
+	struct pci_dev *child, *tmp;
+
 	eeh_remove_device(dev);
-	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
-		struct pci_bus *bus = dev->subordinate;
-		struct list_head *ln;
-		if (!bus)
-			return; 
-		for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) {
-			struct pci_dev *pdev = pci_dev_b(ln);
-			if (pdev)
-				eeh_remove_bus_device(pdev);
-		}
+	if (bus && dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+		list_for_each_entry_safe(child, tmp, &bus->devices, bus_list)
+			eeh_remove_bus_device(child);
 	}
 }
 EXPORT_SYMBOL_GPL(eeh_remove_bus_device);
diff -puN include/asm-powerpc/eeh.h~eeh_cleanups include/asm-powerpc/eeh.h
--- 2_6_linus/include/asm-powerpc/eeh.h~eeh_cleanups	2006-02-21 14:40:43.000000000 -0600
+++ 2_6_linus-johnrose/include/asm-powerpc/eeh.h	2006-02-21 14:55:34.000000000 -0600
@@ -50,33 +50,11 @@ unsigned long eeh_check_failure(const vo
 int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev);
 void __init pci_addr_cache_build(void);
 
-/**
- * eeh_add_device_early
- * eeh_add_device_late
- *
- * Perform eeh initialization for devices added after boot.
- * Call eeh_add_device_early before doing any i/o to the
- * device (including config space i/o).  Call eeh_add_device_late
- * to finish the eeh setup for this device.
- */
-void eeh_add_device_early(struct device_node *);
 void eeh_add_device_tree_early(struct device_node *);
 void eeh_add_device_late(struct pci_dev *);
 
 /**
- * eeh_remove_device - undo EEH setup for the indicated pci device
- * @dev: pci device to be removed
- *
- * This routine should be called when a device is removed from
- * a running system (e.g. by hotplug or dlpar).  It unregisters
- * the PCI device from the EEH subsystem.  I/O errors affecting
- * this device will no longer be detected after this call; thus,
- * i/o errors affecting this slot may leave this device unusable.
- */
-void eeh_remove_device(struct pci_dev *);
-
-/**
- * eeh_remove_device_recursive - undo EEH for device & children.
+ * eeh_remove_bus_device - undo EEH for device & children.
  * @dev: pci device to be removed
  *
  * As above, this removes the device; it also removes child
@@ -114,12 +92,8 @@ static inline int eeh_dn_check_failure(s
 
 static inline void pci_addr_cache_build(void) { }
 
-static inline void eeh_add_device_early(struct device_node *dn) { }
-
 static inline void eeh_add_device_late(struct pci_dev *dev) { }
 
-static inline void eeh_remove_device(struct pci_dev *dev) { }
-
 static inline void eeh_add_device_tree_early(struct device_node *dn) { }
 
 static inline void eeh_remove_bus_device(struct pci_dev *dev) { }

_


From ahuja at austin.ibm.com  Wed Feb 22 10:07:15 2006
From: ahuja at austin.ibm.com (Manish Ahuja)
Date: Tue, 21 Feb 2006 17:07:15 -0600
Subject: [PATCH] PPC64 collect and export low-level cpu usage statistics
In-Reply-To: <20060216091027.GA826@localhost.localdomain>
References: <17393.16261.768862.724265@cargo.ozlabs.ibm.com>
	<20060214183259.28a6a501.sfr@canb.auug.org.au>
	<43F40312.2020800@austin.ibm.com>
	<20060216091027.GA826@localhost.localdomain>
Message-ID: <43FB9D23.8070207@austin.ibm.com>

Added entry and exit points to system_call path.

Got rid of lpaca variables.


David Gibson wrote:

>On Wed, Feb 15, 2006 at 10:44:02PM -0600, Manish Ahuja wrote:
>[snip]
>>>>Index: linux-2.6.15-rc6/arch/powerpc/kernel/process.c
>>>>===================================================================
>>>>--- linux-2.6.15-rc6.orig/arch/powerpc/kernel/process.c	2005-12-18 16:36:54.000000000 -0800
>>>>+++ linux-2.6.15-rc6/arch/powerpc/kernel/process.c	2006-01-17 21:20:25.000000000 -0800
>>>>@@ -243,6 +243,7 @@
>>>>	struct thread_struct *new_thread, *old_thread;
>>>>	unsigned long flags;
>>>>	struct task_struct *last;
>>>>+	struct paca_struct *lpaca;
>>>
>>>This could have been declared below (near pd)
>>
>>Yes... But it seems fine there..
>
>Actually, I've been trying to get rid of lpaca locals everywhere.
>Using get_paca() directly is barely more verbose, and usually clearer.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cpu-acct.txt
Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20060221/6d21af2b/attachment.txt 

From sfr at ozlabs.org  Wed Feb 22 15:04:26 2006
From: sfr at ozlabs.org (Stephen Rothwell)
Date: Wed, 22 Feb 2006 15:04:26 +1100
Subject: Yahoo addresses delayed
Message-ID: <20060222150426.755ceb91.sfr@ozlabs.org>

Hi all,

This is just an email to let you all know that if you are subscribed to
any of these lists using a Yahoo email address, your copies of posts will
be delayed as someone has reported ozlabs.org to Yahoo as a spam site!

As far as I know there has been no (or very little) spam through these
lists as they are set to member post only.  If you do see spam on these
lists, please report it to abuse at ozlabs.org (and not Yahoo or spamcop etc)
so that we can try to fix the problem.

-- 
Cheers,
Stephen Rothwell                    sfr at ozlabs.org

From paulus at samba.org  Wed Feb 22 22:35:30 2006
From: paulus at samba.org (Paul Mackerras)
Date: Wed, 22 Feb 2006 22:35:30 +1100
Subject: [PATCH] Accurate task and cpu time accounting
Message-ID: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>

Here is a patch that implements accurate task and cpu time accounting
for 64-bit powerpc kernels.  Instead of accounting a whole jiffy of
time to a task on a timer interrupt because that task happened to be
running at the time, we now account time in units of timebase ticks
according to the actual time spent in user mode and kernel mode.  To
do this we read either the PURR (processor utilization of resources
register) on POWER5 machines or the timebase on other machines on each
entry to the kernel from usermode, each exit to usermode, on
transitions between process context, hard irq context and soft irq
context in kernel mode, and on context switches.  On POWER5 systems
with shared-processor logical partitioning we also read both the PURR
and the timebase at each timer interrupt in order to determine how
much time has been taken by the hypervisor to run other partitions
("steal" time).

This is all based quite heavily on what s390 does, and it uses the
generic interfaces that were added by the s390 developers,
i.e. account_system_time(), account_user_time(), etc.
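
As a rough illustration of the per-transition accounting described above
(this is not code from the patch), here is a minimal user-space C sketch.
At each kernel entry or exit it charges the ticks elapsed since the last
snapshot to user or system time.  The struct and function names are
hypothetical, and the counter value is passed in as an argument rather
than read from the PURR or the timebase.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-cpu accounting state, loosely modelled on the paca fields. */
struct cpu_acct {
	uint64_t start;		/* counter value at the last transition */
	uint64_t user_time;	/* ticks accumulated in user mode */
	uint64_t system_time;	/* ticks accumulated in kernel mode */
};

/* Entering the kernel from user mode: the elapsed ticks were user time. */
static void kernel_entry(struct cpu_acct *a, uint64_t now)
{
	assert(now >= a->start);	/* counter is assumed monotonic */
	a->user_time += now - a->start;
	a->start = now;
}

/* Returning to user mode: the elapsed ticks were system time. */
static void kernel_exit(struct cpu_acct *a, uint64_t now)
{
	assert(now >= a->start);
	a->system_time += now - a->start;
	a->start = now;
}
```

Because every transition both charges the elapsed interval and resets the
snapshot, no interval is counted twice and none is dropped, which is what
makes the result independent of where the timer tick happens to land.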

This patch doesn't add any new interfaces between the kernel and
userspace, and doesn't change the units in which time is reported to
userspace by things such as /proc/stat, /proc/<pid>/stat, getrusage(),
times(), etc.  Internally the various task and cpu times are stored in
timebase units, but they are converted to USER_HZ units (1/100th of a
second) when reported to userspace.  Some precision is therefore lost
but there should not be any accumulating error, since the internal
accumulation is at full precision.
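
To illustrate the unit conversion, here is a small user-space sketch of
scaling tick counts by a 0.64 fixed-point factor.  It is an approximation
for discussion only: it assumes the GCC/Clang `unsigned __int128`
extension on a 64-bit target, and the names make_factor(), mul_high() and
ticks_to_units() are hypothetical stand-ins for what the patch does with
div128_by_32() and mulhdu().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a 0.64 fixed-point factor: floor(units_per_sec * 2^64 / ticks_per_sec).
 * The factor must be below 1.0 to fit in 64 bits.
 */
static uint64_t make_factor(uint64_t units_per_sec, uint64_t ticks_per_sec)
{
	assert(ticks_per_sec > units_per_sec);
	return (uint64_t)(((unsigned __int128)units_per_sec << 64) / ticks_per_sec);
}

/* High 64 bits of the 128-bit product of a and b. */
static uint64_t mul_high(uint64_t a, uint64_t b)
{
	return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* Convert a tick count to the target unit, rounding down. */
static uint64_t ticks_to_units(uint64_t ticks, uint64_t factor)
{
	return mul_high(ticks, factor);
}
```

With a factor built from 100 units per second and 1 << 20 ticks per
second, ticks_to_units(1 << 20, factor) gives 100.  Both steps round
down, so for frequencies that do not divide evenly a converted value may
read one unit low; the full-precision tick totals are unaffected.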

All of this is conditional on CONFIG_VIRT_CPU_ACCOUNTING.  If that is
not set, we do tick-based approximate accounting as before.

Paul.

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 80d114a..707d079 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -238,6 +238,21 @@ config PPC_STD_MMU_32
 	def_bool y
 	depends on PPC_STD_MMU && PPC32
 
+config VIRT_CPU_ACCOUNTING
+	bool "Deterministic task and CPU time accounting"
+	depends on PPC64
+	default y
+	help
+	  Select this option to enable more accurate task and CPU time
+	  accounting.  This is done by reading a CPU counter on each
+	  kernel entry and exit and on transitions within the kernel
+	  between system, softirq and hardirq state, so there is a
+	  small performance impact.  This also enables accounting of
+	  stolen time on logically-partitioned systems running on
+	  IBM POWER5-based machines.
+
+	  If in doubt, say Y here.
+
 config SMP
 	depends on PPC_STD_MMU
 	bool "Symmetric multi-processing support"
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 840aad4..18810ac 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -137,6 +137,9 @@ int main(void)
 	DEFINE(PACAEMERGSP, offsetof(struct paca_struct, emergency_sp));
 	DEFINE(PACALPPACAPTR, offsetof(struct paca_struct, lppaca_ptr));
 	DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id));
+	DEFINE(PACA_STARTPURR, offsetof(struct paca_struct, startpurr));
+	DEFINE(PACA_USER_TIME, offsetof(struct paca_struct, user_time));
+	DEFINE(PACA_SYSTEM_TIME, offsetof(struct paca_struct, system_time));
 
 	DEFINE(LPPACASRR0, offsetof(struct lppaca, saved_srr0));
 	DEFINE(LPPACASRR1, offsetof(struct lppaca, saved_srr1));
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 388f861..df918f7 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -63,6 +63,7 @@ system_call_common:
 	std	r12,_MSR(r1)
 	std	r0,GPR0(r1)
 	std	r10,GPR1(r1)
+	ACCOUNT_CPU_USER_ENTRY(r10, r11)
 	std	r2,GPR2(r1)
 	std	r3,GPR3(r1)
 	std	r4,GPR4(r1)
@@ -170,8 +171,9 @@ syscall_error_cont:
 	stdcx.	r0,0,r1			/* to clear the reservation */
 	andi.	r6,r8,MSR_PR
 	ld	r4,_LINK(r1)
-	beq-	1f			/* only restore r13 if */
-	ld	r13,GPR13(r1)		/* returning to usermode */
+	beq-	1f		
+	ACCOUNT_CPU_USER_EXIT(r11, r12)
+	ld	r13,GPR13(r1)	/* only restore r13 if returning to usermode */
 1:	ld	r2,GPR2(r1)
 	li	r12,MSR_RI
 	andc	r11,r10,r12
@@ -538,6 +540,7 @@ restore:
 	 * userspace
 	 */
 	beq	1f
+	ACCOUNT_CPU_USER_EXIT(r3, r4)
 	REST_GPR(13, r1)
 1:
 	ld	r3,_CTR(r1)
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 2b03a09..40c813b 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -283,6 +283,7 @@ exception_marker:
 	std	r10,0(r1);		/* make stack chain pointer	*/ \
 	std	r0,GPR0(r1);		/* save r0 in stackframe	*/ \
 	std	r10,GPR1(r1);		/* save r1 in stackframe	*/ \
+	ACCOUNT_CPU_USER_ENTRY(r9, r10);				   \
 	std	r2,GPR2(r1);		/* save r2 in stackframe	*/ \
 	SAVE_4GPRS(3, r1);		/* save r3 - r6 in stackframe	*/ \
 	SAVE_2GPRS(7, r1);		/* save r7, r8 in stackframe	*/ \
@@ -858,6 +859,14 @@ fast_exception_return:
 	ld	r11,_NIP(r1)
 	andi.	r3,r12,MSR_RI		/* check if RI is set */
 	beq-	unrecov_fer
+
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING
+	andi.	r3,r12,MSR_PR
+	beq	2f
+	ACCOUNT_CPU_USER_EXIT(r3, r4)
+2:
+#endif
+
 	ld	r3,_CCR(r1)
 	ld	r4,_LINK(r1)
 	ld	r5,_CTR(r1)
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index d1fffce..dea05b4 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -394,10 +394,24 @@ void irq_ctx_init(void)
 	}
 }
 
+static inline void do_softirq_onstack(void)
+{
+	struct thread_info *curtp, *irqtp;
+
+	curtp = current_thread_info();
+	irqtp = softirq_ctx[smp_processor_id()];
+	irqtp->task = curtp->task;
+	call_do_softirq(irqtp);
+	irqtp->task = NULL;
+}
+
+#else
+#define do_softirq_onstack()	__do_softirq()
+#endif /* CONFIG_IRQSTACKS */
+
 void do_softirq(void)
 {
 	unsigned long flags;
-	struct thread_info *curtp, *irqtp;
 
 	if (in_interrupt())
 		return;
@@ -405,19 +419,17 @@ void do_softirq(void)
 	local_irq_save(flags);
 
 	if (local_softirq_pending()) {
-		curtp = current_thread_info();
-		irqtp = softirq_ctx[smp_processor_id()];
-		irqtp->task = curtp->task;
-		call_do_softirq(irqtp);
-		irqtp->task = NULL;
+		account_system_vtime(current);
+		local_bh_disable();
+		do_softirq_onstack();
+		account_system_vtime(current);
+		__local_bh_enable();
 	}
 
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL(do_softirq);
 
-#endif /* CONFIG_IRQSTACKS */
-
 static int __init setup_noirqdistrib(char *str)
 {
 	distribute_irqs = 0;
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 5770399..aeede05 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -330,6 +330,11 @@ struct task_struct *__switch_to(struct t
 #endif
 
 	local_irq_save(flags);
+
+	account_system_vtime(current);
+	account_process_vtime(current);
+	calculate_steal_time();
+
 	last = _switch(old_thread, new_thread);
 
 	local_irq_restore(flags);
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 13595a6..805eaed 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -541,7 +541,7 @@ int __devinit start_secondary(void *unus
 		smp_ops->take_timebase();
 
 	if (system_state > SYSTEM_BOOTING)
-		per_cpu(last_jiffy, cpu) = get_tb();
+		snapshot_timebase();
 
 	spin_lock(&call_lock);
 	cpu_set(cpu, cpu_online_map);
@@ -573,6 +573,8 @@ void __init smp_cpus_done(unsigned int m
 
 	set_cpus_allowed(current, old_mask);
 
+	snapshot_timebases();
+
 	dump_numa_cpu_topology();
 }
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 2a7ddc5..8a57a38 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -51,6 +51,7 @@
 #include <linux/percpu.h>
 #include <linux/rtc.h>
 #include <linux/jiffies.h>
+#include <linux/posix-timers.h>
 
 #include <asm/io.h>
 #include <asm/processor.h>
@@ -135,6 +136,220 @@ unsigned long tb_last_stamp;
  */
 DEFINE_PER_CPU(unsigned long, last_jiffy);
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING
+/*
+ * Factors for converting from cputime_t (timebase ticks) to
+ * jiffies, milliseconds, seconds, and clock_t (1/USER_HZ seconds).
+ * These are all stored as 0.64 fixed-point binary fractions.
+ */
+u64 __cputime_jiffies_factor;
+u64 __cputime_msec_factor;
+u64 __cputime_sec_factor;
+u64 __cputime_clockt_factor;
+
+static void calc_cputime_factors(void)
+{
+	struct div_result res;
+
+	div128_by_32(HZ, 0, tb_ticks_per_sec, &res);
+	__cputime_jiffies_factor = res.result_low;
+	div128_by_32(1000, 0, tb_ticks_per_sec, &res);
+	__cputime_msec_factor = res.result_low;
+	div128_by_32(1, 0, tb_ticks_per_sec, &res);
+	__cputime_sec_factor = res.result_low;
+	div128_by_32(USER_HZ, 0, tb_ticks_per_sec, &res);
+	__cputime_clockt_factor = res.result_low;
+}
+
+/*
+ * Read the PURR on systems that have it, otherwise the timebase.
+ */
+static u64 read_purr(void)
+{
+	if (cpu_has_feature(CPU_FTR_PURR))
+		return mfspr(SPRN_PURR);
+	return mftb();
+}
+
+/*
+ * Account time for a transition between system, hard irq
+ * or soft irq state.
+ */
+void account_system_vtime(struct task_struct *tsk)
+{
+	u64 now, delta;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	now = read_purr();
+	delta = now - get_paca()->startpurr;
+	get_paca()->startpurr = now;
+	if (!in_interrupt()) {
+		delta += get_paca()->system_time;
+		get_paca()->system_time = 0;
+	}
+	account_system_time(tsk, 0, delta);
+	local_irq_restore(flags);
+}
+
+/*
+ * Transfer the user and system times accumulated in the paca
+ * by the exception entry and exit code to the generic process
+ * user and system time records.
+ * Must be called with interrupts disabled.
+ */
+void account_process_vtime(struct task_struct *tsk)
+{
+	cputime_t utime;
+
+	utime = get_paca()->user_time;
+	get_paca()->user_time = 0;
+	account_user_time(tsk, utime);
+}
+
+static void account_process_time(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	account_process_vtime(current);
+	run_local_timers();
+	if (rcu_pending(cpu))
+		rcu_check_callbacks(cpu, user_mode(regs));
+	scheduler_tick();
+ 	run_posix_cpu_timers(current);
+}
+
+#ifdef CONFIG_PPC_SPLPAR
+/*
+ * Stuff for accounting stolen time.
+ */
+struct cpu_purr_data {
+	int	initialized;			/* thread is running */
+	u64	tb0;			/* timebase at origin time */
+	u64	purr0;			/* PURR at origin time */
+	u64	tb;			/* last TB value read */
+	u64	purr;			/* last PURR value read */
+	u64	stolen;			/* stolen time so far */
+	spinlock_t lock;
+};
+
+static DEFINE_PER_CPU(struct cpu_purr_data, cpu_purr_data);
+
+static void snapshot_tb_and_purr(void *data)
+{
+	struct cpu_purr_data *p = &__get_cpu_var(cpu_purr_data);
+
+	p->tb0 = mftb();
+	p->purr0 = mfspr(SPRN_PURR);
+	p->tb = p->tb0;
+	p->purr = 0;
+	wmb();
+	p->initialized = 1;
+}
+
+/*
+ * Called during boot when all cpus have come up.
+ */
+void snapshot_timebases(void)
+{
+	int cpu;
+
+	if (!cpu_has_feature(CPU_FTR_PURR))
+		return;
+	for_each_cpu(cpu)
+		spin_lock_init(&per_cpu(cpu_purr_data, cpu).lock);
+	on_each_cpu(snapshot_tb_and_purr, NULL, 0, 1);
+}
+
+void calculate_steal_time(void)
+{
+	u64 tb, purr, t0;
+	s64 stolen;
+	struct cpu_purr_data *p0, *pme, *phim;
+	int cpu;
+
+	if (!cpu_has_feature(CPU_FTR_PURR))
+		return;
+	cpu = smp_processor_id();
+	pme = &per_cpu(cpu_purr_data, cpu);
+	if (!pme->initialized)
+		return;		/* this can happen in early boot */
+	p0 = &per_cpu(cpu_purr_data, cpu & ~1);
+	phim = &per_cpu(cpu_purr_data, cpu ^ 1);
+	spin_lock(&p0->lock);
+	tb = mftb();
+	purr = mfspr(SPRN_PURR) - pme->purr0;
+	if (!phim->initialized || !cpu_online(cpu ^ 1)) {
+		stolen = (tb - pme->tb) - (purr - pme->purr);
+	} else {
+		t0 = pme->tb0;
+		if (phim->tb0 < t0)
+			t0 = phim->tb0;
+		stolen = phim->tb - t0 - phim->purr - purr - p0->stolen;
+	}
+	if (stolen > 0) {
+		account_steal_time(current, stolen);
+		p0->stolen += stolen;
+	}
+	pme->tb = tb;
+	pme->purr = purr;
+	spin_unlock(&p0->lock);
+}
+
+/*
+ * Must be called before the cpu is added to the online map when
+ * a cpu is being brought up at runtime.
+ */
+static void snapshot_purr(void)
+{
+	int cpu;
+	u64 purr;
+	struct cpu_purr_data *p0, *pme, *phim;
+	unsigned long flags;
+
+	if (!cpu_has_feature(CPU_FTR_PURR))
+		return;
+	cpu = smp_processor_id();
+	pme = &per_cpu(cpu_purr_data, cpu);
+	p0 = &per_cpu(cpu_purr_data, cpu & ~1);
+	phim = &per_cpu(cpu_purr_data, cpu ^ 1);
+	spin_lock_irqsave(&p0->lock, flags);
+	pme->tb = pme->tb0 = mftb();
+	purr = mfspr(SPRN_PURR);
+	if (!phim->initialized) {
+		pme->purr = 0;
+		pme->purr0 = purr;
+	} else {
+		/* set pme->purr and pme->purr0 for no change in p0->stolen */
+		pme->purr = phim->tb - phim->tb0 - phim->purr - p0->stolen;
+		pme->purr0 = purr - pme->purr;
+	}
+	pme->initialized = 1;
+	spin_unlock_irqrestore(&p0->lock, flags);
+}
+
+#endif /* CONFIG_PPC_SPLPAR */
+
+#else /* ! CONFIG_VIRT_CPU_ACCOUNTING */
+#define calc_cputime_factors()
+#define account_process_time(regs)	update_process_times(user_mode(regs))
+#define calculate_steal_time()		do { } while (0)
+#endif
+
+#if !(defined(CONFIG_VIRT_CPU_ACCOUNTING) && defined(CONFIG_PPC_SPLPAR))
+#define snapshot_purr()			do { } while (0)
+#endif
+
+/*
+ * Called when a cpu comes up after the system has finished booting,
+ * i.e. as a result of a hotplug cpu action.
+ */
+void snapshot_timebase(void)
+{
+	__get_cpu_var(last_jiffy) = get_tb();
+	snapshot_purr();
+}
+
 void __delay(unsigned long loops)
 {
 	unsigned long start;
@@ -382,6 +597,7 @@ static void iSeries_tb_recal(void)
 						new_tb_ticks_per_jiffy, sign, tick_diff );
 				tb_ticks_per_jiffy = new_tb_ticks_per_jiffy;
 				tb_ticks_per_sec   = new_tb_ticks_per_sec;
+				calc_cputime_factors();
 				div128_by_32( XSEC_PER_SEC, 0, tb_ticks_per_sec, &divres );
 				do_gtod.tb_ticks_per_sec = tb_ticks_per_sec;
 				tb_to_xs = divres.result_low;
@@ -430,6 +646,7 @@ void timer_interrupt(struct pt_regs * re
 	irq_enter();
 
 	profile_tick(CPU_PROFILING, regs);
+	calculate_steal_time();
 
 #ifdef CONFIG_PPC_ISERIES
 	get_lppaca()->int_dword.fields.decr_int = 0;
@@ -451,7 +668,7 @@ void timer_interrupt(struct pt_regs * re
 		 * is the case.
 		 */
 		if (!cpu_is_offline(cpu))
-			update_process_times(user_mode(regs));
+			account_process_time(regs);
 
 		/*
 		 * No need to check whether cpu is offline here; boot_cpuid
@@ -706,6 +923,7 @@ void __init time_init(void)
 	tb_ticks_per_sec = ppc_tb_freq;
 	tb_ticks_per_usec = ppc_tb_freq / 1000000;
 	tb_to_us = mulhwu_scale_factor(ppc_tb_freq, 1000000);
+	calc_cputime_factors();
 
 	/*
 	 * Calculate the length of each tick in ns.  It will not be
diff --git a/include/asm-powerpc/cputable.h b/include/asm-powerpc/cputable.h
index 6421054..f74d0ed 100644
--- a/include/asm-powerpc/cputable.h
+++ b/include/asm-powerpc/cputable.h
@@ -117,6 +117,7 @@ extern void do_cpu_ftr_fixups(unsigned l
 #define CPU_FTR_MMCRA_SIHV		ASM_CONST(0x0000080000000000)
 #define CPU_FTR_CI_LARGE_PAGE		ASM_CONST(0x0000100000000000)
 #define CPU_FTR_PAUSE_ZERO		ASM_CONST(0x0000200000000000)
+#define CPU_FTR_PURR			ASM_CONST(0x0000400000000000)
 #else
 /* ensure on 32b processors the flags are available for compiling but
  * don't do anything */
@@ -132,6 +133,7 @@ extern void do_cpu_ftr_fixups(unsigned l
 #define CPU_FTR_LOCKLESS_TLBIE		ASM_CONST(0x0)
 #define CPU_FTR_MMCRA_SIHV		ASM_CONST(0x0)
 #define CPU_FTR_CI_LARGE_PAGE		ASM_CONST(0x0)
+#define CPU_FTR_PURR			ASM_CONST(0x0)
 #endif
 
 #ifndef __ASSEMBLY__
@@ -313,7 +315,7 @@ enum {
 	    CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 |
 	    CPU_FTR_MMCRA | CPU_FTR_SMT |
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE |
-	    CPU_FTR_MMCRA_SIHV,
+	    CPU_FTR_MMCRA_SIHV | CPU_FTR_PURR,
 	CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB |
 	    CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 |
 	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT |
@@ -326,7 +328,7 @@ enum {
 #ifdef __powerpc64__
 	    CPU_FTRS_POWER3 | CPU_FTRS_RS64 | CPU_FTRS_POWER4 |
 	    CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | CPU_FTRS_CELL |
-            CPU_FTR_CI_LARGE_PAGE |
+            CPU_FTR_CI_LARGE_PAGE | CPU_FTR_PURR |
 #else
 #if CLASSIC_PPC
 	    CPU_FTRS_PPC601 | CPU_FTRS_603 | CPU_FTRS_604 | CPU_FTRS_740_NOTAU |
diff --git a/include/asm-powerpc/cputime.h b/include/asm-powerpc/cputime.h
index 6d68ad7..a21185d 100644
--- a/include/asm-powerpc/cputime.h
+++ b/include/asm-powerpc/cputime.h
@@ -1 +1,203 @@
+/*
+ * Definitions for measuring cputime on powerpc machines.
+ *
+ * Copyright (C) 2006 Paul Mackerras, IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * If we have CONFIG_VIRT_CPU_ACCOUNTING, we measure cpu time in
+ * the same units as the timebase.  Otherwise we measure cpu time
+ * in jiffies using the generic definitions.
+ */
+
+#ifndef __POWERPC_CPUTIME_H
+#define __POWERPC_CPUTIME_H
+
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING
 #include <asm-generic/cputime.h>
+#else
+
+#include <linux/types.h>
+#include <linux/time.h>
+#include <asm/div64.h>
+#include <asm/time.h>
+#include <asm/param.h>
+
+typedef u64 cputime_t;
+typedef u64 cputime64_t;
+
+#define cputime_zero			((cputime_t)0)
+#define cputime_max			((~((cputime_t)0) >> 1) - 1)
+#define cputime_add(__a, __b)		((__a) +  (__b))
+#define cputime_sub(__a, __b)		((__a) -  (__b))
+#define cputime_div(__a, __n)		((__a) /  (__n))
+#define cputime_halve(__a)		((__a) >> 1)
+#define cputime_eq(__a, __b)		((__a) == (__b))
+#define cputime_gt(__a, __b)		((__a) >  (__b))
+#define cputime_ge(__a, __b)		((__a) >= (__b))
+#define cputime_lt(__a, __b)		((__a) <  (__b))
+#define cputime_le(__a, __b)		((__a) <= (__b))
+
+#define cputime64_zero			((cputime64_t)0)
+#define cputime64_add(__a, __b)		((__a) + (__b))
+#define cputime_to_cputime64(__ct)	(__ct)
+
+#ifdef __KERNEL__
+
+/*
+ * Convert cputime <-> jiffies
+ */
+extern u64 __cputime_jiffies_factor;
+
+static inline unsigned long cputime_to_jiffies(const cputime_t ct)
+{
+	return mulhdu(ct, __cputime_jiffies_factor);
+}
+
+static inline cputime_t jiffies_to_cputime(const unsigned long jif)
+{
+	cputime_t ct;
+	unsigned long sec;
+
+	/* have to be a little careful about overflow */
+	ct = jif % HZ;
+	sec = jif / HZ;
+	if (ct) {
+		ct *= tb_ticks_per_sec;
+		do_div(ct, HZ);
+	}
+	if (sec)
+		ct += (cputime_t) sec * tb_ticks_per_sec;
+	return ct;
+}
+
+static inline u64 cputime64_to_jiffies64(const cputime_t ct)
+{
+	return mulhdu(ct, __cputime_jiffies_factor);
+}
+
+/*
+ * Convert cputime <-> milliseconds
+ */
+extern u64 __cputime_msec_factor;
+
+static inline unsigned long cputime_to_msecs(const cputime_t ct)
+{
+	return mulhdu(ct, __cputime_msec_factor);
+}
+
+static inline cputime_t msecs_to_cputime(const unsigned long ms)
+{
+	cputime_t ct;
+	unsigned long sec;
+
+	/* have to be a little careful about overflow */
+	ct = ms % 1000;
+	sec = ms / 1000;
+	if (ct) {
+		ct *= tb_ticks_per_sec;
+		do_div(ct, 1000);
+	}
+	if (sec)
+		ct += (cputime_t) sec * tb_ticks_per_sec;
+	return ct;
+}
+
+/*
+ * Convert cputime <-> seconds
+ */
+extern u64 __cputime_sec_factor;
+
+static inline unsigned long cputime_to_secs(const cputime_t ct)
+{
+	return mulhdu(ct, __cputime_sec_factor);
+}
+
+static inline cputime_t secs_to_cputime(const unsigned long sec)
+{
+	return (cputime_t) sec * tb_ticks_per_sec;
+}
+
+/*
+ * Convert cputime <-> timespec
+ */
+static inline void cputime_to_timespec(const cputime_t ct, struct timespec *p)
+{
+	u64 x = ct;
+	unsigned int frac;
+
+	frac = do_div(x, tb_ticks_per_sec);
+	p->tv_sec = x;
+	x = (u64) frac * 1000000000;
+	do_div(x, tb_ticks_per_sec);
+	p->tv_nsec = x;
+}
+
+static inline cputime_t timespec_to_cputime(const struct timespec *p)
+{
+	cputime_t ct;
+
+	ct = (u64) p->tv_nsec * tb_ticks_per_sec;
+	do_div(ct, 1000000000);
+	return ct + (u64) p->tv_sec * tb_ticks_per_sec;
+}
+
+/*
+ * Convert cputime <-> timeval
+ */
+static inline void cputime_to_timeval(const cputime_t ct, struct timeval *p)
+{
+	u64 x = ct;
+	unsigned int frac;
+
+	frac = do_div(x, tb_ticks_per_sec);
+	p->tv_sec = x;
+	x = (u64) frac * 1000000;
+	do_div(x, tb_ticks_per_sec);
+	p->tv_usec = x;
+}
+
+static inline cputime_t timeval_to_cputime(const struct timeval *p)
+{
+	cputime_t ct;
+
+	ct = (u64) p->tv_usec * tb_ticks_per_sec;
+	do_div(ct, 1000000);
+	return ct + (u64) p->tv_sec * tb_ticks_per_sec;
+}
+
+/*
+ * Convert cputime <-> clock_t (units of 1/USER_HZ seconds)
+ */
+extern u64 __cputime_clockt_factor;
+
+static inline unsigned long cputime_to_clock_t(const cputime_t ct)
+{
+	return mulhdu(ct, __cputime_clockt_factor);
+}
+
+static inline cputime_t clock_t_to_cputime(const unsigned long clk)
+{
+	cputime_t ct;
+	unsigned long sec;
+
+	/* have to be a little careful about overflow */
+	ct = clk % USER_HZ;
+	sec = clk / USER_HZ;
+	if (ct) {
+		ct *= tb_ticks_per_sec;
+		do_div(ct, USER_HZ);
+	}
+	if (sec)
+		ct += (cputime_t) sec * tb_ticks_per_sec;
+	return ct;
+}
+
+#define cputime64_to_clock_t(ct)	cputime_to_clock_t((cputime_t)(ct))
+
+#endif /* __KERNEL__ */
+#endif /* CONFIG_VIRT_CPU_ACCOUNTING */
+#endif /* __POWERPC_CPUTIME_H */
diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
index 8eb7e85..51f87d9 100644
--- a/include/asm-powerpc/irq.h
+++ b/include/asm-powerpc/irq.h
@@ -479,6 +479,10 @@ extern int distribute_irqs;
 struct irqaction;
 struct pt_regs;
 
+#define __ARCH_HAS_DO_SOFTIRQ
+
+extern void __do_softirq(void);
+
 #ifdef CONFIG_IRQSTACKS
 /*
  * Per-cpu stacks for handling hard and soft interrupts.
@@ -491,8 +495,6 @@ extern void call_do_softirq(struct threa
 extern int call___do_IRQ(int irq, struct pt_regs *regs,
 		struct thread_info *tp);
 
-#define __ARCH_HAS_DO_SOFTIRQ
-
 #else
 #define irq_ctx_init()
 
diff --git a/include/asm-powerpc/paca.h b/include/asm-powerpc/paca.h
index c9add8f..4cd1a95 100644
--- a/include/asm-powerpc/paca.h
+++ b/include/asm-powerpc/paca.h
@@ -96,6 +96,11 @@ struct paca_struct {
 	u64 saved_r1;			/* r1 save for RTAS calls */
 	u64 saved_msr;			/* MSR saved here by enter_rtas */
 	u8 proc_enabled;		/* irq soft-enable flag */
+
+	/* Stuff for accurate time accounting */
+	u64 user_time;			/* accumulated usermode TB ticks */
+	u64 system_time;		/* accumulated system TB ticks */
+	u64 startpurr;			/* PURR/TB value snapshot */
 };
 
 extern struct paca_struct paca[];
diff --git a/include/asm-powerpc/ppc_asm.h b/include/asm-powerpc/ppc_asm.h
index ab8688d..dd1c0a9 100644
--- a/include/asm-powerpc/ppc_asm.h
+++ b/include/asm-powerpc/ppc_asm.h
@@ -15,6 +15,48 @@
 #define SZL			(BITS_PER_LONG/8)
 
 /*
+ * Stuff for accurate CPU time accounting.
+ * These macros handle transitions between user and system state
+ * in exception entry and exit and accumulate time to the
+ * user_time and system_time fields in the paca.
+ */
+
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING
+#define ACCOUNT_CPU_USER_ENTRY(ra, rb)
+#define ACCOUNT_CPU_USER_EXIT(ra, rb)
+#else
+#define ACCOUNT_CPU_USER_ENTRY(ra, rb)					\
+	beq	2f;			/* if from kernel mode */	\
+BEGIN_FTR_SECTION;							\
+	mfspr	ra,SPRN_PURR;		/* get processor util. reg */	\
+END_FTR_SECTION_IFSET(CPU_FTR_PURR);					\
+BEGIN_FTR_SECTION;							\
+	mftb	ra;			/* or get TB if no PURR */	\
+END_FTR_SECTION_IFCLR(CPU_FTR_PURR);					\
+	ld	rb,PACA_STARTPURR(r13);				\
+	std	ra,PACA_STARTPURR(r13);					\
+	subf	rb,rb,ra;		/* subtract start value */	\
+	ld	ra,PACA_USER_TIME(r13);					\
+	add	ra,ra,rb;		/* add on to user time */	\
+	std	ra,PACA_USER_TIME(r13);					\
+2:
+
+#define ACCOUNT_CPU_USER_EXIT(ra, rb)					\
+BEGIN_FTR_SECTION;							\
+	mfspr	ra,SPRN_PURR;		/* get processor util. reg */	\
+END_FTR_SECTION_IFSET(CPU_FTR_PURR);					\
+BEGIN_FTR_SECTION;							\
+	mftb	ra;			/* or get TB if no PURR */	\
+END_FTR_SECTION_IFCLR(CPU_FTR_PURR);					\
+	ld	rb,PACA_STARTPURR(r13);				\
+	std	ra,PACA_STARTPURR(r13);					\
+	subf	rb,rb,ra;		/* subtract start value */	\
+	ld	ra,PACA_SYSTEM_TIME(r13);				\
+	add	ra,ra,rb;		/* add on to system time */	\
+	std	ra,PACA_SYSTEM_TIME(r13);
+#endif
+
+/*
  * Macros for storing registers into and loading registers from
  * exception frames.
  */
diff --git a/include/asm-powerpc/system.h b/include/asm-powerpc/system.h
index d9bf536..41b7a5b 100644
--- a/include/asm-powerpc/system.h
+++ b/include/asm-powerpc/system.h
@@ -424,5 +424,9 @@ static inline void create_function_call(
 	create_branch(addr, func_addr, BRANCH_SET_LINK);
 }
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING
+extern void account_system_vtime(struct task_struct *);
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_SYSTEM_H */
diff --git a/include/asm-powerpc/time.h b/include/asm-powerpc/time.h
index baddc9a..912118d 100644
--- a/include/asm-powerpc/time.h
+++ b/include/asm-powerpc/time.h
@@ -41,6 +41,7 @@ extern time_t last_rtc_update;
 
 extern void generic_calibrate_decr(void);
 extern void wakeup_decrementer(void);
+extern void snapshot_timebase(void);
 
 /* Some sane defaults: 125 MHz timebase, 1GHz processor */
 extern unsigned long ppc_proc_freq;
@@ -221,5 +222,19 @@ struct cpu_usage {
 
 DECLARE_PER_CPU(struct cpu_usage, cpu_usage_array);
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING
+extern void account_process_vtime(struct task_struct *tsk);
+#else
+#define account_process_vtime(tsk)		do { } while (0)
+#endif
+
+#if defined(CONFIG_VIRT_CPU_ACCOUNTING) && defined(CONFIG_PPC_SPLPAR)
+extern void calculate_steal_time(void);
+extern void snapshot_timebases(void);
+#else
+#define calculate_steal_time()			do { } while (0)
+#define snapshot_timebases()			do { } while (0)
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* __PPC64_TIME_H */


From olh at suse.de  Thu Feb 23 00:35:51 2006
From: olh at suse.de (Olaf Hering)
Date: Wed, 22 Feb 2006 14:35:51 +0100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
Message-ID: <20060222133551.GA30355@suse.de>

 On Wed, Feb 22, Paul Mackerras wrote:

> All of this is conditional on CONFIG_VIRT_CPU_ACCOUNTING.  If that is
> not set, we do tick-based approximate accounting as before.

arch/powerpc/kernel/process.c: In function '__switch_to':
arch/powerpc/kernel/process.c:335: error: implicit declaration of function 'account_process_vtime'
arch/powerpc/kernel/process.c:336: error: implicit declaration of function 'calculate_steal_time'
make[1]: *** [arch/powerpc/kernel/process.o] Error 1

The change below fixes it, but a 32-bit build then fails to link:

kernel/built-in.o(.text+0xbd88): In function `irq_exit':
: undefined reference to `do_softirq'
kernel/built-in.o(.text+0xbde0): In function `local_bh_enable':
: undefined reference to `do_softirq'
kernel/built-in.o(.text+0xbe60): In function `ksoftirqd':
: undefined reference to `do_softirq'
net/built-in.o(.text+0xd6fc): In function `netif_rx_ni':
: undefined reference to `do_softirq'

I think the placement of __ARCH_HAS_DO_SOFTIRQ needs adjustment,
or the code must be moved out of CONFIG_PPC64 in arch/powerpc/kernel/irq.c.


Index: linux-2.6.15/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/powerpc/kernel/process.c
+++ linux-2.6.15/arch/powerpc/kernel/process.c
@@ -49,8 +49,8 @@
 #include <asm/machdep.h>
 #ifdef CONFIG_PPC64
 #include <asm/firmware.h>
-#include <asm/time.h>
 #endif
+#include <asm/time.h>
 
 extern unsigned long _get_SP(void);
 

From sharada at in.ibm.com  Thu Feb 23 03:13:08 2006
From: sharada at in.ibm.com (R Sharada)
Date: Wed, 22 Feb 2006 21:43:08 +0530
Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear
In-Reply-To: <20060221054448.GA1695@in.ibm.com>
References: <20060221054448.GA1695@in.ibm.com>
Message-ID: <20060222161308.GA6356@in.ibm.com>

Ok, I realized I did not have to add the extra variables and could have done it
more cleanly. Also, Michael suggested adding a comment explaining why we are
replacing the call to tlbie() with __tlbie(). So, here is a revised version.
With this fix, I am able to kexec and kdump boot successfully on a p630 in
non-lpar mode running 2.6.16-rc4.
Since kexec is currently broken on non-lpar without this fix, please consider
it for inclusion in 2.6.16.

Thanks and Regards,
Sharada


native_hpte_clear() has a spinlock recursion problem: native_tlbie_lock is
taken twice, once in native_hpte_clear() and once within tlbie().
Fix the problem by changing the call to tlbie() in native_hpte_clear() to
__tlbie(). It still supports only 4k pages for now.


Signed-off-by: R Sharada <sharada at in.ibm.com>
---


diff -puN arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear arch/powerpc/mm/hash_native_64.c
--- linux-2.6.16-rc4/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear	2006-02-22 21:22:42.000000000 +0530
+++ linux-2.6.16-rc4-sharada/arch/powerpc/mm/hash_native_64.c	2006-02-22 21:26:25.000000000 +0530
@@ -403,12 +403,16 @@ static void native_hpte_clear(void)
 		 */
 		hpte_v = hptep->v;
 
+		/* tlbie() takes the native_tlbie_lock. hence change the
+		 * tlbie() call here to __tlbie()
+		 */
 		if (hpte_v & HPTE_V_VALID) {
 			hptep->v = 0;
-			tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K, 0);
+			__tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K);
 		}
 	}
 
+	asm volatile("eieio; tlbsync; ptesync":::"memory");
 	spin_unlock(&native_tlbie_lock);
 	local_irq_restore(flags);
 }
_


From geoffrey.levand at am.sony.com  Thu Feb 23 09:50:26 2006
From: geoffrey.levand at am.sony.com (Geoff Levand)
Date: Wed, 22 Feb 2006 14:50:26 -0800
Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear
In-Reply-To: <20060222161308.GA6356@in.ibm.com>
References: <20060222161308.GA6356@in.ibm.com>
Message-ID: <43FCEAB2.2020702@am.sony.com>

R Sharada wrote:
> linux-2.6.16-rc4/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear
> 2006-02-22 21:22:42.000000000 +0530
> +++ linux-2.6.16-rc4-sharada/arch/powerpc/mm/hash_native_64.c
> 2006-02-22 21:26:25.000000000 +0530
> @@ -403,12 +403,16 @@ static void native_hpte_clear(void)
>  		 */
>  		hpte_v = hptep->v;
>  
> +		/* tlbie() takes the native_tlbie_lock. hence change the
> +		 * tlbie() call here to __tlbie()
> +		 */


Once the patch is applied, the tlbie() call disappears and you have a
comment that doesn't make sense in the new context.  Maybe you should
reconsider the wording.


>  		if (hpte_v & HPTE_V_VALID) {
>  			hptep->v = 0;
> -			tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K, 0);
> +			__tlbie(slot2va(hpte_v, slot), MMU_PAGE_4K);
>  		}
>  	}
>  
> +	asm volatile("eieio; tlbsync; ptesync":::"memory");
>  	spin_unlock(&native_tlbie_lock);
>  	local_irq_restore(flags);
>  }


From michael at ellerman.id.au  Thu Feb 23 10:49:18 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 23 Feb 2006 10:49:18 +1100
Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear
In-Reply-To: <43FCEAB2.2020702@am.sony.com>
References: <20060222161308.GA6356@in.ibm.com> <43FCEAB2.2020702@am.sony.com>
Message-ID: <200602231049.23107.michael@ellerman.id.au>

On Thu, 23 Feb 2006 09:50, Geoff Levand wrote:
> R Sharada wrote:
> > linux-2.6.16-rc4/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear
> > 2006-02-22 21:22:42.000000000 +0530
> > +++ linux-2.6.16-rc4-sharada/arch/powerpc/mm/hash_native_64.c
> > 2006-02-22 21:26:25.000000000 +0530
> > @@ -403,12 +403,16 @@ static void native_hpte_clear(void)
> >  		 */
> >  		hpte_v = hptep->v;
> >
> > +		/* tlbie() takes the native_tlbie_lock. hence change the
> > +		 * tlbie() call here to __tlbie()
> > +		 */
>
> Once the patch is applied, the tlbie() call disappears and you have
> comment that doesn't make sense in the new context.  Maybe you should
> reconsider the wording.

Yeah I agree with Geoff. The point is that we already hold the tlbie lock, so 
we can't call something that takes it again.
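
To make that concrete, here is a minimal user-space sketch of the pattern
(not kernel code). Every demo_* name is made up for illustration: the flag
stands in for native_tlbie_lock, demo_invalidate() plays the role of tlbie()
and demo_invalidate_raw() the role of __tlbie(). A real spinlock would just
spin forever on the second acquisition; the sketch reports it instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive lock such as native_tlbie_lock. */
static bool demo_lock_held;
static int demo_invalidations;

/* Returns false if the lock is already held (a real spinlock would hang). */
static bool demo_lock(void)
{
	if (demo_lock_held)
		return false;
	demo_lock_held = true;
	return true;
}

static void demo_unlock(void)
{
	demo_lock_held = false;
}

/* Raw variant: the caller must already hold the lock (like __tlbie()). */
static void demo_invalidate_raw(void)
{
	assert(demo_lock_held);
	demo_invalidations++;
}

/* Locking wrapper (like tlbie()): takes the lock itself. */
static bool demo_invalidate(void)
{
	if (!demo_lock())
		return false;	/* recursion detected */
	demo_invalidate_raw();
	demo_unlock();
	return true;
}

/*
 * Walks n entries with the lock held for the whole loop, the way
 * native_hpte_clear() does. Returns the number of failed invalidations,
 * or -1 if the outer lock could not be taken.
 */
static int demo_clear_all(int n, bool use_raw)
{
	int failures = 0;
	int i;

	if (!demo_lock())
		return -1;
	for (i = 0; i < n; i++) {
		if (use_raw)
			demo_invalidate_raw();
		else if (!demo_invalidate())
			failures++;
	}
	demo_unlock();
	return failures;
}
```

With use_raw == false every call inside the loop trips over the lock that the
loop already holds; with use_raw == true the loop completes. The comment in
the patch should describe that situation rather than the old call.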

cheers

-- 
Michael Ellerman
IBM OzLabs

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From pj at sgi.com  Thu Feb 23 11:50:09 2006
From: pj at sgi.com (Paul Jackson)
Date: Wed, 22 Feb 2006 16:50:09 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was:  msi support)
In-Reply-To: <20060203202742.1e514fcc.akpm@osdl.org>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com>
	<20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
	<20060203201441.194be500.pj@sgi.com>
	<20060203202531.27d685fa.akpm@osdl.org>
	<20060203202742.1e514fcc.akpm@osdl.org>
Message-ID: <20060222165009.6493e6a1.pj@sgi.com>

On Feb 3, Andrew wrote:
> Actually, gregkh-pci-altix-msi-support-git-ia64-fix.patch fixes
> git-ia64.patch when gregkh-pci-altix-msi-support.patch

Is it time to reinsert that patch?

My ia64 sn build fails again, complaining:


===========================
  CC      arch/ia64/sn/pci/tioce_provider.o
arch/ia64/sn/pci/tioce_provider.c:720:46: macro "ATE_MAKE" requires 3 arguments, but only 2 given
===========================


Your broken-out/series file (2.6.16-rc4-mm1) has the lines:

    # Need this when gregkh-pci-altix-msi-support.patch comes back
    #gregkh-pci-altix-msi-support-git-ia64-fix.patch

I guess that is this patch below, which fixes my sn build just fine.
Holler if you need it as a proper patch.


--- 2.6.16-rc4-mm1.orig/arch/ia64/sn/pci/tioce_provider.c       2006-02-22 16:21:52.054985166 -0800
+++ 2.6.16-rc4-mm1/arch/ia64/sn/pci/tioce_provider.c    2006-02-22 16:31:21.594755653 -0800
@@ -717,7 +717,7 @@ tioce_reserve_m32(struct tioce_kernel *c
        while (ate_index <= last_ate) {
                u64 ate;

-               ate = ATE_MAKE(0xdeadbeef, ps);
+               ate = ATE_MAKE(0xdeadbeef, ps, 0);
                ce_kern->ce_ate3240_shadow[ate_index] = ate;
                tioce_mmr_storei(ce_kern, &ce_mmr->ce_ure_ate3240[ate_index],
                                 ate);


-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj at sgi.com> 1.925.600.0401


From akpm at osdl.org  Thu Feb 23 12:01:42 2006
From: akpm at osdl.org (Andrew Morton)
Date: Wed, 22 Feb 2006 17:01:42 -0800
Subject: Altix SN2 2.6.16-rc1-mm5 build breakage (was:  msi support)
In-Reply-To: <20060222165009.6493e6a1.pj@sgi.com>
References: <20060119194647.12213.44658.14543@lnx-maule.americas.sgi.com>
	<20060119194702.12213.16524.93275@lnx-maule.americas.sgi.com>
	<20060203201441.194be500.pj@sgi.com>
	<20060203202531.27d685fa.akpm@osdl.org>
	<20060203202742.1e514fcc.akpm@osdl.org>
	<20060222165009.6493e6a1.pj@sgi.com>
Message-ID: <20060222170142.497eaac3.akpm@osdl.org>

Paul Jackson <pj at sgi.com> wrote:
>
> Your broken-out/series file (2.6.16-rc4-mm1) has the lines:
> 
>     # Need this when gregkh-pci-altix-msi-support.patch comes back
>     #gregkh-pci-altix-msi-support-git-ia64-fix.patch

Bah.   I resurrected it, thanks.


From kelly at au1.ibm.com  Thu Feb 23 14:32:59 2006
From: kelly at au1.ibm.com (Kelly Daly)
Date: Thu, 23 Feb 2006 14:32:59 +1100
Subject: Fwd: [PATCH] powerpc: disable OProfile for iSeries
Message-ID: <200602231432.59472.kelly@au.ibm.com>

disable OProfile in Kconfig for iSeries to prevent hangs.  OProfile was not originally intended to work with legacy iSeries.

Signed-off-by: Kelly Daly <kelly at au.ibm.com>
---

hi Paulus,
could you push this up to the 2.6.16 release please?
K


diff -urpN linux-2.6.15.4/arch/powerpc/oprofile/Kconfig linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig
--- linux-2.6.15.4/arch/powerpc/oprofile/Kconfig	2006-02-10 18:22:48.000000000 +1100
+++ linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig	2006-02-23 13:09:00.000000000 +1100
@@ -1,4 +1,5 @@
 config PROFILING
+	depends on !PPC_ISERIES
 	bool "Profiling support (EXPERIMENTAL)"
 	help
 	  Say Y here to enable the extended profiling support mechanisms used


From paulus at samba.org  Thu Feb 23 15:12:52 2006
From: paulus at samba.org (Paul Mackerras)
Date: Thu, 23 Feb 2006 15:12:52 +1100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <20060222133551.GA30355@suse.de>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
	<20060222133551.GA30355@suse.de>
Message-ID: <17405.13892.370606.476003@cargo.ozlabs.ibm.com>

Olaf Hering writes:

> This change fixes it. But it will not link 32bit:

I didn't notice that stuff was inside an #ifdef CONFIG_PPC64 block.
Easily fixed...

Paul.

diff -urN a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
--- a/arch/powerpc/kernel/irq.c	2006-02-22 09:44:33.000000000 +1100
+++ b/arch/powerpc/kernel/irq.c	2006-02-23 15:10:51.000000000 +1100
@@ -371,6 +371,7 @@
 	return NO_IRQ;
 
 }
+#endif /* CONFIG_PPC64 */
 
 #ifdef CONFIG_IRQSTACKS
 struct thread_info *softirq_ctx[NR_CPUS];
@@ -430,6 +431,7 @@
 }
 EXPORT_SYMBOL(do_softirq);
 
+#ifdef CONFIG_PPC64
 static int __init setup_noirqdistrib(char *str)
 {
 	distribute_irqs = 0;


From sharada at in.ibm.com  Thu Feb 23 15:59:39 2006
From: sharada at in.ibm.com (R Sharada)
Date: Thu, 23 Feb 2006 10:29:39 +0530
Subject: [PATCH] ppc64 - fix spinlock recursion in native_hpte_clear
In-Reply-To: <200602231049.23107.michael@ellerman.id.au>
References: <20060222161308.GA6356@in.ibm.com> <43FCEAB2.2020702@am.sony.com>
	<200602231049.23107.michael@ellerman.id.au>
Message-ID: <20060223045939.GA2151@in.ibm.com>

Would something to this effect be more appropriate?

/* we already hold the native_tlbie_lock before getting here. So, cannot
 * take it back again. So call raw __tlbie() in here
 */

Thanks and Regards,
Sharada

On Thu, Feb 23, 2006 at 10:49:18AM +1100, Michael Ellerman wrote:
> On Thu, 23 Feb 2006 09:50, Geoff Levand wrote:
> > R Sharada wrote:
> > > linux-2.6.16-rc4/arch/powerpc/mm/hash_native_64.c~fix_native_hpte_clear
> > > 2006-02-22 21:22:42.000000000 +0530
> > > +++ linux-2.6.16-rc4-sharada/arch/powerpc/mm/hash_native_64.c
> > > 2006-02-22 21:26:25.000000000 +0530
> > > @@ -403,12 +403,16 @@ static void native_hpte_clear(void)
> > >  		 */
> > >  		hpte_v = hptep->v;
> > >
> > > +		/* tlbie() takes the native_tlbie_lock. hence change the
> > > +		 * tlbie() call here to __tlbie()
> > > +		 */
> >
> > Once the patch is applied, the tlbie() call disappears and you have
> > comment that doesn't make sense in the new context.  Maybe you should
> > reconsider the wording.
> 
> Yeah I agree with Geoff. The point is that we already hold the tlbie lock, so 
> we can't call something that takes it again.
> 
> cheers
> 
> -- 
> Michael Ellerman
> IBM OzLabs
> 
> wwweb: http://michael.ellerman.id.au
> phone: +61 2 6212 1183 (tie line 70 21183)
> 
> We do not inherit the earth from our ancestors,
> we borrow it from our children. - S.M.A.R.T Person


From michael at ellerman.id.au  Thu Feb 23 16:55:22 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Thu, 23 Feb 2006 16:55:22 +1100
Subject: Fwd: [PATCH] powerpc: disable OProfile for iSeries
In-Reply-To: <200602231432.59472.kelly@au.ibm.com>
References: <200602231432.59472.kelly@au.ibm.com>
Message-ID: <200602231655.26244.michael@ellerman.id.au>

On Thu, 23 Feb 2006 14:32, Kelly Daly wrote:
> disable OProfile in Kconfig for iSeries to prevent hangs.  OProfile was not
> originally intended to work with legacy iSeries.
>
> diff -urpN linux-2.6.15.4/arch/powerpc/oprofile/Kconfig
> linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig ---
> linux-2.6.15.4/arch/powerpc/oprofile/Kconfig	2006-02-10 18:22:48.000000000
> +1100 +++ linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig	2006-02-23
> 13:09:00.000000000 +1100 @@ -1,4 +1,5 @@
>  config PROFILING
> +	depends on !PPC_ISERIES

We've been trying to avoid !ISERIES compile time checks because they're a 
barrier to the mythical combined kernel. I haven't looked at the oprofile 
code, but is there an easy way to turn this into a 
firmware_has_feature(ISERIES) check?
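
Roughly what I have in mind, as a self-contained sketch rather than real
oprofile code. All demo_* names and bit values are invented for illustration,
and I am assuming the profiling init path has somewhere it can bail out early:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical firmware feature bits, for illustration only. */
#define DEMO_FW_FEATURE_ISERIES	0x1UL
#define DEMO_FW_FEATURE_LPAR	0x2UL

/* Stand-in for the feature mask that is filled in at boot. */
static unsigned long demo_fw_features;

static bool demo_firmware_has_feature(unsigned long feature)
{
	return (demo_fw_features & feature) != 0;
}

/*
 * Runtime check instead of a Kconfig dependency: the code is always
 * built, but refuses to initialise on legacy iSeries.
 * Returns 0 on success, -1 if profiling is refused on this platform.
 */
static int demo_profiling_init(void)
{
	if (demo_firmware_has_feature(DEMO_FW_FEATURE_ISERIES))
		return -1;
	return 0;
}
```

The point is that one kernel image can then boot on both kinds of machine and
only the iSeries boot skips the profiling setup.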

cheers

-- 
Michael Ellerman
IBM OzLabs

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

From arnd at arndb.de  Thu Feb 23 20:39:45 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Thu, 23 Feb 2006 10:39:45 +0100
Subject: [FYI/PATCH 0/2] what is left on cell
Message-ID: <200602231039.45554.arnd@arndb.de>

While getting everything together for a new binary kernel for
bsc.es and other distributors, we found that two more patches were
missing. These are not meant for inclusion in the mainline kernel,
but are needed for now.


From arnd at arndb.de  Thu Feb 23 20:46:52 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Thu, 23 Feb 2006 10:46:52 +0100
Subject: [FYI/PATCH 2/2] fix previous interrupt controller rework patch
In-Reply-To: <200602231039.45554.arnd@arndb.de>
References: <200602231039.45554.arnd@arndb.de>
Message-ID: <200602231046.52479.arnd@arndb.de>

This fixes a bug in the patch at
http://patchwork.ozlabs.org/linuxppc64/patch?id=4188.

I still haven't received feedback on the implementation
itself of that patch, but for now let's assume that we do
it that way.

I'll submit a fixed patch for the interrupt controller
rework for inclusion in 2.6.17-rc then.

--- linux-2.6.16-rc1.orig/arch/powerpc/platforms/cell/spider-pic.c
+++ linux-2.6.16-rc1/arch/powerpc/platforms/cell/spider-pic.c
@@ -196,10 +196,11 @@ void spider_init_IRQ(void)
 
 		if (strstr(compatible, "CBEA,platform-spider-pic"))
 			spider_reg = *(long *)get_property(dn,"reg", NULL);
-		else {
+		else if (strstr(compatible, "sti,platform-spider-pic")) {
 			spider_init_IRQ_hardcoded();
 			return;
-		}
+		} else
+			continue;
 
 		if (!spider_reg)
 			printk("interrupt controller does not have reg property !\n");


From arnd at arndb.de  Thu Feb 23 20:41:14 2006
From: arnd at arndb.de (Arnd Bergmann)
Date: Thu, 23 Feb 2006 10:41:14 +0100
Subject: [FYI/PATCH 1/2] small hacks for running on BPA hardware, v4
In-Reply-To: <200602231039.45554.arnd@arndb.de>
References: <200602231039.45554.arnd@arndb.de>
Message-ID: <200602231041.14418.arnd@arndb.de>

The changes in here are workarounds for
deficiencies in the firmware that will be fixed
there in later releases.

Signed-off-by: Arnd Bergmann <arndb at de.ibm.com>

Index: linux-2.6.16-rc/arch/powerpc/platforms/cell/Makefile
===================================================================
--- linux-2.6.16-rc.orig/arch/powerpc/platforms/cell/Makefile
+++ linux-2.6.16-rc/arch/powerpc/platforms/cell/Makefile
@@ -1,5 +1,5 @@
 obj-y			+= interrupt.o iommu.o setup.o spider-pic.o
-obj-y			+= pervasive.o
+obj-y			+= pervasive.o pci.o
 
 obj-$(CONFIG_SMP)	+= smp.o
 obj-$(CONFIG_SPU_FS)	+= spufs/ spu-base.o
Index: linux-2.6.16-rc/arch/powerpc/platforms/cell/pci.c
===================================================================
--- /dev/null
+++ linux-2.6.16-rc/arch/powerpc/platforms/cell/pci.c
@@ -0,0 +1,82 @@
+/*
+ * Cell specific PCI code
+ *
+ * Copyright (C) 2005 IBM Corporation,
+ 			Arnd Bergmann <arndb at de.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/pci.h>
+#include <linux/init.h>
+
+#include <asm/prom.h>
+#include <asm/machdep.h>
+#include <asm/pci-bridge.h>
+
+#include "interrupt.h"
+
+void __init cell_final_fixup(void)
+{
+	struct pci_dev *dev = NULL;
+
+	//phbs_remap_io();
+
+	for_each_pci_dev(dev) {
+	// FIXME: fix IRQ numbers for devices on second south bridge
+	}
+}
+
+static void fixup_spider_ipci_irq(struct pci_dev* dev)
+{
+	int irq_node_offset;
+	pr_debug("fixup for %04x:%04x at %02x.%1x: ", dev->vendor, dev->device,
+			 PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));
+	switch (dev->devfn) {
+		case PCI_DEVFN(3,0):
+			/* ethernet */
+			dev->irq = 8;
+			break;
+		case PCI_DEVFN(5,0):
+			/* OHCI 0 */
+			dev->irq = 10;
+			break;
+		case PCI_DEVFN(6,0):
+			/* OHCI 1 */
+			dev->irq = 11;
+			break;
+		case PCI_DEVFN(5,1):
+			/* EHCI 0 */
+			dev->irq = 10;
+			break;
+		case PCI_DEVFN(6,1):
+			/* EHCI 1 */
+			dev->irq = 11;
+			break;
+	}
+
+	irq_node_offset = IIC_NODE_STRIDE * (pci_domain_nr(dev->bus)-1);
+	dev->irq += irq_node_offset;
+
+	pr_debug("irq %0x\n", dev->irq);
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TOSHIBA_2,
+		PCI_DEVICE_ID_TOSHIBA_SPIDER_NET, fixup_spider_ipci_irq);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TOSHIBA_2,
+		PCI_DEVICE_ID_TOSHIBA_SPIDER_OHCI, fixup_spider_ipci_irq);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TOSHIBA_2,
+		PCI_DEVICE_ID_TOSHIBA_SPIDER_EHCI, fixup_spider_ipci_irq);
Index: linux-2.6.16-rc/arch/powerpc/platforms/cell/setup.c
===================================================================
--- linux-2.6.16-rc.orig/arch/powerpc/platforms/cell/setup.c
+++ linux-2.6.16-rc/arch/powerpc/platforms/cell/setup.c
@@ -58,6 +58,7 @@
 #else
 #define DBG(fmt...)
 #endif
+extern void cell_final_fixup(void);
 
 void cell_show_cpuinfo(struct seq_file *m)
 {
@@ -297,6 +298,7 @@ struct machdep_calls __initdata cell_md 
 	.setup_arch		= cell_setup_arch,
 	.init_early		= cell_init_early,
 	.show_cpuinfo		= cell_show_cpuinfo,
+	.pcibios_fixup		= cell_final_fixup,
 	.restart		= rtas_restart,
 	.power_off		= rtas_power_off,
 	.halt			= rtas_halt,
Index: linux-2.6.16-rc/include/linux/pci_ids.h
===================================================================
--- linux-2.6.16-rc.orig/include/linux/pci_ids.h
+++ linux-2.6.16-rc/include/linux/pci_ids.h
@@ -1367,6 +1367,8 @@
 #define PCI_DEVICE_ID_TOSHIBA_TC35815CF	0x0030
 #define PCI_DEVICE_ID_TOSHIBA_TC86C001_MISC	0x0108
 #define PCI_DEVICE_ID_TOSHIBA_SPIDER_NET 0x01b3
+#define PCI_DEVICE_ID_TOSHIBA_SPIDER_OHCI 0x01b6
+#define PCI_DEVICE_ID_TOSHIBA_SPIDER_EHCI 0x01b5
 
 #define PCI_VENDOR_ID_RICOH		0x1180
 #define PCI_DEVICE_ID_RICOH_RL5C465	0x0465
Index: linux-2.6.16-rc/arch/powerpc/platforms/cell/spu_base.c
===================================================================
--- linux-2.6.16-rc.orig/arch/powerpc/platforms/cell/spu_base.c
+++ linux-2.6.16-rc/arch/powerpc/platforms/cell/spu_base.c
@@ -534,6 +534,10 @@ static void __iomem * __init map_spe_pro
 
 	prop = p;
 
+	/* FIXME: Firmware bug */
+	if (strcmp (name, "priv2") == 0 && prop->len < 0x20000)
+		return ioremap(prop->address, 0x20000);
+
 	return ioremap(prop->address, prop->len);
 }
 

From paulus at samba.org  Thu Feb 23 21:35:02 2006
From: paulus at samba.org (Paul Mackerras)
Date: Thu, 23 Feb 2006 21:35:02 +1100
Subject: [patch] powerpc: native atomic_add_unless
In-Reply-To: <20060121112536.GA27505@wotan.suse.de>
References: <20060121112536.GA27505@wotan.suse.de>
Message-ID: <17405.36822.685629.515591@cargo.ozlabs.ibm.com>

Nick Piggin writes:

> atomic_add_unless (atomic_inc_not_zero) is used in several hot paths in the
> vfs and I'm planning some uses in the memory manager, so it should be as
> small and fast as possible.
> 
> Joel had a good suggestion to save a register but all bugs are mine.
> 
> Comments?

The implementation looks OK.  I would be interested to know if this
actually makes any measurable difference though.

Paul.


From olof at lixom.net  Fri Feb 24 03:42:13 2006
From: olof at lixom.net (Olof Johansson)
Date: Thu, 23 Feb 2006 08:42:13 -0800
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
Message-ID: <20060223164213.GB4674@pb15.lixom.net>

Hi,

On Wed, Feb 22, 2006 at 10:35:30PM +1100, Paul Mackerras wrote:

>  #ifndef __ASSEMBLY__
> @@ -313,7 +315,7 @@ enum {
>  	    CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 |
>  	    CPU_FTR_MMCRA | CPU_FTR_SMT |
>  	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE |
> -	    CPU_FTR_MMCRA_SIHV,
> +	    CPU_FTR_MMCRA_SIHV | CPU_FTR_PURR,
>  	CPU_FTRS_CELL = CPU_FTR_SPLIT_ID_CACHE | CPU_FTR_USE_TB |
>  	    CPU_FTR_HPTE_TABLE | CPU_FTR_PPCAS_ARCH_V2 |
>  	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT |
> @@ -326,7 +328,7 @@ enum {
>  #ifdef __powerpc64__
>  	    CPU_FTRS_POWER3 | CPU_FTRS_RS64 | CPU_FTRS_POWER4 |
>  	    CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | CPU_FTRS_CELL |
> -            CPU_FTR_CI_LARGE_PAGE |
> +            CPU_FTR_CI_LARGE_PAGE | CPU_FTR_PURR |

Is this second change really needed (this is the setting of
CPU_FTRS_POSSIBLE)? It already includes CPU_FTRS_POWER5, which has the
bit set by the first change.

The only case I can see where it has to be included there is when the bit
is set based on device-tree contents, right?
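
A small sketch of what I mean, with made-up bit values and DEMO_ names (the
real masks live in cputable.h and have many more bits):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_FTR_MMCRA		0x1UL
#define DEMO_FTR_SMT		0x2UL
#define DEMO_FTR_PURR		0x4UL
#define DEMO_FTR_CI_LARGE_PAGE	0x8UL

/* Per-CPU-family masks; POWER5 already carries the PURR bit. */
#define DEMO_FTRS_POWER4	(DEMO_FTR_MMCRA)
#define DEMO_FTRS_POWER5	(DEMO_FTR_MMCRA | DEMO_FTR_SMT | DEMO_FTR_PURR)

/* POSSIBLE without the extra PURR term ... */
#define DEMO_FTRS_POSSIBLE \
	(DEMO_FTRS_POWER4 | DEMO_FTRS_POWER5 | DEMO_FTR_CI_LARGE_PAGE)

/* ... and with it, as in the second hunk. */
#define DEMO_FTRS_POSSIBLE_EXPLICIT \
	(DEMO_FTRS_POWER4 | DEMO_FTRS_POWER5 | DEMO_FTR_CI_LARGE_PAGE | \
	 DEMO_FTR_PURR)

/* True if some per-family mask already provides the bit. */
static bool demo_family_provides(unsigned long ftr)
{
	return ((DEMO_FTRS_POWER4 | DEMO_FTRS_POWER5) & ftr) != 0;
}
```

Both POSSIBLE masks come out identical because the OR of the family masks
already contains the PURR bit. Only a bit that no family mask carries, like
the large-page one in this sketch, has to be listed separately.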

CPU_FTR_CI_LARGE_PAGE seems to be a weird case, it's checked only in
one location (hash code that enables 64K pages) but never actually set
anywhere in current sources. It never seems to have been.


-Olof


From stevewin at us.ibm.com  Fri Feb 24 03:37:54 2006
From: stevewin at us.ibm.com (Stephen Winiecki)
Date: Thu, 23 Feb 2006 11:37:54 -0500
Subject: Maple boot hang when SMP not configured and KEXEC configured
Message-ID: <OF2D750524.1D103F4D-ON8725711E.00527C12-8525711E.005B3404@us.ibm.com>


Using maple_defconfig with latest 2.6.16 prepatch versions, when SMP is not
configured the kernel hangs in smp_release_cpus() in kernel/setup_64.c

...
returning from prom_init
Page orders: linear mapping = 24, others = 12
Found initrd at 0xc0000000018fa000:0xc000000001b2c3c5
DART: table not allocated, using direct DMA
Found legacy serial port 0 for /ht at 0/isa at 4/serial at 3f8
  port=3f8, taddr=f40003f8, irq=ffffffffffffffff, clk=1843200, speed=115200
Found legacy serial port 1 for /ht at 0/isa at 4/serial at 2f8
  port=2f8, taddr=f40002f8, irq=ffffffffffffffff, clk=1843200, speed=115200
 -> smp_release_cpus()


         #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
         void smp_release_cpus(void)
         {
         ...

Unconfiguring KEXEC does allow the boot to complete successfully

Steve Winiecki

From haren at us.ibm.com  Fri Feb 24 05:53:36 2006
From: haren at us.ibm.com (Haren Myneni)
Date: Thu, 23 Feb 2006 10:53:36 -0800
Subject: Maple boot hang when SMP not configured and KEXEC configured
In-Reply-To: <OF2D750524.1D103F4D-ON8725711E.00527C12-8525711E.005B3404@us.ibm.com>
References: <OF2D750524.1D103F4D-ON8725711E.00527C12-8525711E.005B3404@us.ibm.com>
Message-ID: <43FE04B0.6000004@us.ibm.com>

Stephen Winiecki wrote:

> Using maple_defconfig with latest 2.6.16 prepatch versions, when SMP 
> is not configured the kernel hangs in smp_release_cpus() in 
> kernel/setup_64.c
>
> ...
> returning from prom_init
> Page orders: linear mapping = 24, others = 12
> Found initrd at 0xc0000000018fa000:0xc000000001b2c3c5
> DART: table not allocated, using direct DMA
> Found legacy serial port 0 for /ht at 0/isa at 4/serial at 3f8
> port=3f8, taddr=f40003f8, irq=ffffffffffffffff, clk=1843200, speed=115200
> Found legacy serial port 1 for /ht at 0/isa at 4/serial at 2f8
> port=2f8, taddr=f40002f8, irq=ffffffffffffffff, clk=1843200, speed=115200
> -> smp_release_cpus()
>
>
>
>          #if defined(CONFIG_SMP) || defined(CONFIG_KEXEC)
>         void smp_release_cpus(void)
>         {
> ...
>
> Unconfiguring KEXEC does allow the boot to complete successfully
>
For UP kernels, even if KEXEC is enabled, this function should not be executed.
Please check whether your kernel has the following patch.

http://ozlabs.org/pipermail/linuxppc64-dev/2006-February/008064.html

>
> Steve Winiecki
>


From paulus at samba.org  Fri Feb 24 09:07:57 2006
From: paulus at samba.org (Paul Mackerras)
Date: Fri, 24 Feb 2006 09:07:57 +1100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <20060223164213.GB4674@pb15.lixom.net>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
	<20060223164213.GB4674@pb15.lixom.net>
Message-ID: <17406.12861.222238.773579@cargo.ozlabs.ibm.com>

Olof Johansson writes:

> Is this second change really needed (this is the setting of
> CPU_FTRS_POSSIBLE)? It already includes CPU_FTRS_POWER5, which has the
> bit set by the first change.

Good point.  I'll take that bit out.

Paul.


From sfr at canb.auug.org.au  Fri Feb 24 10:16:44 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Fri, 24 Feb 2006 10:16:44 +1100
Subject: [PATCH] change compat shmget size arg to signed
Message-ID: <20060224101644.548b0c24.sfr@canb.auug.org.au>

Hi Olaf,

> change second arg (the 'size') to signed to handle a size of -1.
> ltp test shmget02 fails. This patch fixes it.
> Oddly, we see the failure only on a POWER4 LPAR with 4.6G ram.
> 
> Signed-off-by: Olaf Hering <olh at suse.de>
> 
>  arch/powerpc/kernel/sys_ppc32.c |    2 +-
>  1 files changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6.16-rc4-olh/arch/powerpc/kernel/sys_ppc32.c
> ===================================================================
> --- linux-2.6.16-rc4-olh.orig/arch/powerpc/kernel/sys_ppc32.c
> +++ linux-2.6.16-rc4-olh/arch/powerpc/kernel/sys_ppc32.c
> @@ -429,7 +429,7 @@ long compat_sys_ipc(u32 call, u32 first,
>  		return sys_shmdt(compat_ptr(ptr));
>  	case SHMGET:
>  		/* sign extend key_t */
> -		return sys_shmget((int)first, second, third);
> +		return sys_shmget((int)first, (int)second, third);
>  	case SHMCTL:
>  		/* sign extend shmid */
>  		return compat_sys_shmctl((int)first, second, compat_ptr(ptr));

Does the ltp test fail on a standard kernel (where SHMMAX is 0x2000000), or
only on a SLES kernel (where SHMMAX is ULONG_MAX)?

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/


From olh at suse.de  Fri Feb 24 10:27:17 2006
From: olh at suse.de (Olaf Hering)
Date: Fri, 24 Feb 2006 00:27:17 +0100
Subject: [PATCH] change compat shmget size arg to signed
In-Reply-To: <20060224101644.548b0c24.sfr@canb.auug.org.au>
References: <20060224101644.548b0c24.sfr@canb.auug.org.au>
Message-ID: <20060223232717.GB29454@suse.de>

 On Fri, Feb 24, Stephen Rothwell wrote:

> Does the ltp test fail on a standard kernel(where SHMMAX is 0x2000000), or
> only on a SLES kernel (where SHMMAX is ULONG_MAX)?

It fails with SLES9 and SLES10. SLES9 has 0x2000000 as default.


From kelly.daly at gmail.com  Fri Feb 24 11:00:05 2006
From: kelly.daly at gmail.com (Kelly Daly)
Date: Fri, 24 Feb 2006 11:00:05 +1100
Subject: Fwd: [PATCH] powerpc: disable OProfile for iSeries
In-Reply-To: <200602231655.26244.michael@ellerman.id.au>
References: <200602231432.59472.kelly@au.ibm.com>
	<200602231655.26244.michael@ellerman.id.au>
Message-ID: <9ffa56aa0602231600i214d95c6lb7bfdbf9de67d494@mail.gmail.com>

Hey Michael,

I will definitely look into doing it the way that you have mentioned.  In
the interim, however, this is a good solution to stop the hanging problem.

Cheers,
Kelly

On 2/23/06, Michael Ellerman <michael at ellerman.id.au> wrote:
>
> On Thu, 23 Feb 2006 14:32, Kelly Daly wrote:
> > disable OProfile in Kconfig for iSeries to prevent hangs.  OProfile was not
> > originally intended to work with legacy iSeries.
> >
> > diff -urpN linux-2.6.15.4/arch/powerpc/oprofile/Kconfig
> > linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig ---
> > linux-2.6.15.4/arch/powerpc/oprofile/Kconfig  2006-02-10 18:22:48.000000000
> > +1100 +++ linux-2.6.15.4_patch/arch/powerpc/oprofile/Kconfig  2006-02-23
> > 13:09:00.000000000 +1100 @@ -1,4 +1,5 @@
> >  config PROFILING
> > +     depends on !PPC_ISERIES
>
> We've been trying to avoid !ISERIES compile time checks because they're a
> barrier to the mythical combined kernel. I haven't looked at the oprofile
> code, but is there an easy way to turn this into a
> firmware_has_feature(ISERIES) check?
>
> cheers
>
> --
> Michael Ellerman
> IBM OzLabs
>
> wwweb: http://michael.ellerman.id.au
> phone: +61 2 6212 1183 (tie line 70 21183)
>
> We do not inherit the earth from our ancestors,
> we borrow it from our children. - S.M.A.R.T Person
>

From sfr at canb.auug.org.au  Fri Feb 24 11:12:42 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Fri, 24 Feb 2006 11:12:42 +1100
Subject: [PATCH] change compat shmget size arg to signed
In-Reply-To: <20060223232717.GB29454@suse.de>
References: <20060224101644.548b0c24.sfr@canb.auug.org.au>
	<20060223232717.GB29454@suse.de>
Message-ID: <20060224111242.08f14bd9.sfr@canb.auug.org.au>

On Fri, 24 Feb 2006 00:27:17 +0100 Olaf Hering <olh at suse.de> wrote:
>
>  On Fri, Feb 24, Stephen Rothwell wrote:
> 
> > Does the ltp test fail on a standard kernel(where SHMMAX is 0x2000000), or
> > only on a SLES kernel (where SHMMAX is ULONG_MAX)?
> 
> It fails with SLES9 and SLES10. SLES9 has 0x2000000 as default.

So what was shm_ctlmax set to when the test was run?

I am trying to figure out why this test:

if (size < SHMMIN || size > shm_ctlmax)
                return -EINVAL;

doesn't return -EINVAL for size == 0xffffffff if shm_ctlmax is 0x2000000.
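
For reference, a minimal user-space sketch of that comparison. The
function name check_shm_size() is hypothetical and SHMMIN is assumed to
be 1; this only models the quoted test, not the compat entry path that
hands the size to it.

```c
#include <errno.h>
#include <stddef.h>

#define SHMMIN 1	/* assumed minimum segment size */

/* Hypothetical stand-in for the size test quoted above. */
static int check_shm_size(size_t size, size_t shm_ctlmax)
{
	if (size < SHMMIN || size > shm_ctlmax)
		return -EINVAL;
	return 0;
}
```

With shm_ctlmax at 0x2000000, a size of 0xffffffff is rejected by this
model, which is the behaviour expected above.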

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

From benh at kernel.crashing.org  Fri Feb 24 14:32:43 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Fri, 24 Feb 2006 14:32:43 +1100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <20060223164213.GB4674@pb15.lixom.net>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
	<20060223164213.GB4674@pb15.lixom.net>
Message-ID: <1140751963.8264.70.camel@localhost.localdomain>


> CPU_FTR_CI_LARGE_PAGE seems to be a weird case, it's checked only in
> one location (hash code that enables 64K pages) but never actually set
> anywhere in current sources. It never seems to have been.

Yup, because it's not yet clear when to set it... There is a new
device-tree property being architected but I didn't yet have a chance to
test a machine with a firmware that provides it. Setting it based on the
PVR would cause problems in the case of machines mixing multiple CPU
revisions.

In general, we have a problem with our cputable model vs. IBM cpu
feature model. The PAPR architecture considers that any feature not
explicitly exposed in the device-tree should not be used (like altivec
for example). We currently use the PVR for almost everything
however.... 

We need to be able to identify properly a PAPR machine early and clear
out a load of feature bits from what was provided by the table, and then
only set back in the bits that are advertised by the various device-tree
properties defined by IBM. In addition, we need to make sure we don't
break bare-metal in the process. To make things difficult, identifying a
PAPR machine is a bit dodgy since they are still specified as simply
having "chrp" in / device_type ...
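
A rough sketch of the masking step described above, in plain C. The
bit names and values (FTR_ALTIVEC, FTR_CI_LARGE_PAGE, FTR_BASE,
FTR_PAPR_GATED) and the function effective_features() are made up for
illustration and do not match the real cputable definitions.

```c
#include <stdint.h>

/* Hypothetical feature bits; values chosen for illustration only. */
#define FTR_ALTIVEC		(1u << 0)
#define FTR_CI_LARGE_PAGE	(1u << 1)
#define FTR_BASE		(1u << 2)

/* Features that, on a PAPR machine, count only if the device tree
 * advertises them. */
#define FTR_PAPR_GATED	(FTR_ALTIVEC | FTR_CI_LARGE_PAGE)

static uint32_t effective_features(uint32_t table, uint32_t dt_advertised,
				   int is_papr)
{
	if (!is_papr)
		return table;	/* bare metal: keep the PVR table as is */

	table &= ~FTR_PAPR_GATED;			/* clear gated bits */
	table |= dt_advertised & FTR_PAPR_GATED;	/* set back advertised ones */
	return table;
}
```

The is_papr argument stands in for the "identify a PAPR machine early"
step, which is the part that is hard to do reliably.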

Ben.


From npiggin at suse.de  Fri Feb 24 14:47:19 2006
From: npiggin at suse.de (Nick Piggin)
Date: Fri, 24 Feb 2006 04:47:19 +0100
Subject: [patch] powerpc: native atomic_add_unless
In-Reply-To: <17405.36822.685629.515591@cargo.ozlabs.ibm.com>
References: <20060121112536.GA27505@wotan.suse.de>
	<17405.36822.685629.515591@cargo.ozlabs.ibm.com>
Message-ID: <20060224034719.GB19281@wotan.suse.de>

On Thu, Feb 23, 2006 at 09:35:02PM +1100, Paul Mackerras wrote:
> Nick Piggin writes:
> 
> > atomic_add_unless (atomic_inc_not_zero) is used in several hot paths in the
> > vfs and I'm planning some uses in the memory manager, so it should be as
> > small and fast as possible.
> > 
> > Joel had a good suggestion to save a register but all bugs are mine.
> > 
> > Comments?
> 
> The implementation looks OK.  I would be interested to know if this
> actually makes any measurable difference though.
> 

I tried to microbenchmark it in userspace but couldn't get significant
results for a single thread.

When the cacheline is not hot or there is some contention, I hoped the
native version might result in fewer coherency protocol operations.

There are fewer branches and it should use less I-cache too.

All things that are difficult to test in microbenchmarks, unfortunately.
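
For comparison, this is roughly what a generic compare-and-exchange
based add-unless looks like in user space with C11 atomics. It is only
a sketch of the semantics (the name add_unless() is made up), not the
kernel's generic code and not the native powerpc version from the
patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Add a to *v unless *v == u; return true if the add happened. */
static bool add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	while (old != u) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}
```

add_unless(&v, 1, 0) then corresponds to atomic_inc_not_zero().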


From galak at kernel.crashing.org  Sat Feb 25 03:34:30 2006
From: galak at kernel.crashing.org (Kumar Gala)
Date: Fri, 24 Feb 2006 10:34:30 -0600 (CST)
Subject: Membership stats (Was: Re: merge these lists?)
In-Reply-To: <20060208110718.57e9f9f5.sfr@canb.auug.org.au>
Message-ID: <Pine.LNX.4.44.0602241033400.2323-100000@gate.crashing.org>

On Wed, 8 Feb 2006, Stephen Rothwell wrote:

> On Wed, 8 Feb 2006 11:01:50 +1100 Stephen Rothwell <sfr at canb.auug.org.au> wrote:
> >
> > Yes, "a sysadmin" could do that.  However, those that are
> > subscribed with different addresses on each list will end
> > up subscribed twice and those who have changed their preferences on
> > the abandoned list will have to fix them as well.
> 
> Just for interest:
> 
> 	members of linuxppc-dev		473
> 	members of linuxppc64-dev	264
> 	common				 98
> 
> But, as I said, "common" above does not count those who have different
> addresses subscribed to each list.

Where did we leave off with this?  I was about to request that
marc.theaimsgroup.com start archiving some of the ppc lists but figured 
doing it after we merged lists would be better.

- kumar


From johnrose at austin.ibm.com  Sat Feb 25 04:34:23 2006
From: johnrose at austin.ibm.com (John Rose)
Date: Fri, 24 Feb 2006 11:34:23 -0600
Subject: [PATCH] fix dynamic PCI probe regression
Message-ID: <1140802463.17752.3.camel@sinatra.austin.ibm.com>

<separated and rebased from 2/21 post>

Hi Paul-

Some hotplug driver functions were migrated to the kernel for use by EEH
in the following set of changes:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2bf6a8fa21570f37fd1789610da30f70a05ac5e3

Previously, the PCI Hotplug module had been changed to use the new
OFDT-based PCI probe when appropriate:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5fa80fcdca9d20d30c9ecec30d4dbff4ed93a5c6

When rpaphp_pci_config_slot() was moved from the rpaphp driver to the
new kernel function pcibios_add_pci_devices(), the OFDT-based probe
stuff was dropped.  This patch restores it.

Please apply if appropriate.

Thanks-
John

Signed-off-by: John Rose <johnrose at austin.ibm.com>

diff -puN arch/powerpc/platforms/pseries/eeh.c~reorg_regress arch/powerpc/platforms/pseries/eeh.c
--- 2_6_linus_2/arch/powerpc/platforms/pseries/eeh.c~reorg_regress	2006-02-24 11:04:10.000000000 -0600
+++ 2_6_linus_2-johnrose/arch/powerpc/platforms/pseries/eeh.c	2006-02-24 11:04:10.000000000 -0600
@@ -893,6 +893,20 @@ void eeh_add_device_tree_early(struct de
 }
 EXPORT_SYMBOL_GPL(eeh_add_device_tree_early);
 
+void eeh_add_device_tree_late(struct pci_bus *bus)
+{
+	struct pci_dev *dev;
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+ 		eeh_add_device_late(dev);
+ 		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+ 			struct pci_bus *subbus = dev->subordinate;
+ 			if (subbus)
+ 				eeh_add_device_tree_late(subbus);
+ 		}
+	}
+}
+
 /**
  * eeh_add_device_late - perform EEH initialization for the indicated pci device
  * @dev: pci device for which to set up EEH
diff -puN arch/powerpc/platforms/pseries/pci_dlpar.c~reorg_regress arch/powerpc/platforms/pseries/pci_dlpar.c
--- 2_6_linus_2/arch/powerpc/platforms/pseries/pci_dlpar.c~reorg_regress	2006-02-24 11:04:10.000000000 -0600
+++ 2_6_linus_2-johnrose/arch/powerpc/platforms/pseries/pci_dlpar.c	2006-02-24 11:04:10.000000000 -0600
@@ -106,6 +106,8 @@ pcibios_fixup_new_pci_devices(struct pci
 			}
 		}
 	}
+
+	eeh_add_device_tree_late(bus);
 }
 EXPORT_SYMBOL_GPL(pcibios_fixup_new_pci_devices);
 
@@ -114,7 +116,6 @@ pcibios_pci_config_bridge(struct pci_dev
 {
 	u8 sec_busno;
 	struct pci_bus *child_bus;
-	struct pci_dev *child_dev;
 
 	/* Get busno of downstream bus */
 	pci_read_config_byte(dev, PCI_SECONDARY_BUS, &sec_busno);
@@ -129,10 +130,6 @@ pcibios_pci_config_bridge(struct pci_dev
 
 	pci_scan_child_bus(child_bus);
 
-	list_for_each_entry(child_dev, &child_bus->devices, bus_list) {
-		eeh_add_device_late(child_dev);
-	}
-
 	/* Fixup new pci devices without touching bus struct */
 	pcibios_fixup_new_pci_devices(child_bus, 0);
 
@@ -160,18 +157,25 @@ pcibios_add_pci_devices(struct pci_bus *
 
 	eeh_add_device_tree_early(dn);
 
-	/* pci_scan_slot should find all children */
-	slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
-	num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
-	if (num) {
-		pcibios_fixup_new_pci_devices(bus, 1);
-		pci_bus_add_devices(bus);
-	}
+	if (_machine == PLATFORM_PSERIES_LPAR) {
+		/* use ofdt-based probe */
+		of_scan_bus(dn, bus);
+		if (!list_empty(&bus->devices)) {
+			pcibios_fixup_new_pci_devices(bus, 0);
+			pci_bus_add_devices(bus);
+		}
+	} else {
+		/* use legacy probe */
+		slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
+		num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
+		if (num) {
+			pcibios_fixup_new_pci_devices(bus, 1);
+			pci_bus_add_devices(bus);
+		}
 
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		eeh_add_device_late (dev);
-		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
-			pcibios_pci_config_bridge(dev);
+		list_for_each_entry(dev, &bus->devices, bus_list)
+			if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+				pcibios_pci_config_bridge(dev);
 	}
 }
 EXPORT_SYMBOL_GPL(pcibios_add_pci_devices);
diff -puN include/asm-powerpc/eeh.h~reorg_regress include/asm-powerpc/eeh.h
--- 2_6_linus_2/include/asm-powerpc/eeh.h~reorg_regress	2006-02-24 11:04:10.000000000 -0600
+++ 2_6_linus_2-johnrose/include/asm-powerpc/eeh.h	2006-02-24 11:06:50.000000000 -0600
@@ -27,6 +27,7 @@
 #include <linux/string.h>
 
 struct pci_dev;
+struct pci_bus;
 struct device_node;
 
 #ifdef CONFIG_EEH
@@ -61,7 +62,7 @@ void __init pci_addr_cache_build(void);
  */
 void eeh_add_device_early(struct device_node *);
 void eeh_add_device_tree_early(struct device_node *);
-void eeh_add_device_late(struct pci_dev *);
+void eeh_add_device_tree_late(struct pci_bus *);
 
 /**
  * eeh_remove_device - undo EEH setup for the indicated pci device
@@ -116,12 +117,12 @@ static inline void pci_addr_cache_build(
 
 static inline void eeh_add_device_early(struct device_node *dn) { }
 
-static inline void eeh_add_device_late(struct pci_dev *dev) { }
-
 static inline void eeh_remove_device(struct pci_dev *dev) { }
 
 static inline void eeh_add_device_tree_early(struct device_node *dn) { }
 
+static inline void eeh_add_device_tree_late(struct pci_bus *bus) { }
+
 static inline void eeh_remove_bus_device(struct pci_dev *dev) { }
 #define EEH_POSSIBLE_ERROR(val, type) (0)
 #define EEH_IO_ERROR_VALUE(size) (-1UL)

_


From stevewin at us.ibm.com  Sat Feb 25 08:40:46 2006
From: stevewin at us.ibm.com (Stephen Winiecki)
Date: Fri, 24 Feb 2006 16:40:46 -0500
Subject: Maple fails to boot current git
Message-ID: <OF430BE00B.A014F4E5-ON8725711F.0076F7F4-8525711F.0076EE85@us.ibm.com>


On Tue, 2006-01-31 at 08:08 -0700, Tom Rini wrote:
> On Tue, Jan 31, 2006 at 02:53:11PM +1100, Benjamin Herrenschmidt wrote:
> > Well, the RTC problem definitely looks like a bogus or lack of "ranges"
> > property or the fact that the parser doesn't recognize "ht" as a PCI
> > bus. You may want to try updating prom_parse.c to treat "ht" as a PCI
> > bus and see if that helps.
>
> With the following, I get parent bus is pci now, but still:
> OF: ** translation for device /ht at 0/isa at 4/rtc at 900 **
> OF: bus is isa (na=2, ns=1) on /ht at 0/isa at 4
> OF: translating address: 00000001 00000900
> OF: parent bus is pci (na=3, ns=2) on /ht at 0
> OF: walking ranges...
> OF: not found !
> Maple: Unable to translate RTC address
> Maple: No device node for RTC, assuming legacy address (0x70)

For the record, changing the ISA ranges property does correct the problem
translating the addresses for the devices hanging off that bus

Old:
    /isa at 4
      ...
>>      ranges              = 00000001 f4000000 00010000


New:
    /isa at 4
      ...
>>      ranges              = 00000001 00000000 f4000000 00000000 00000000 00010000


Output w/ ISA range property change only:
...
OF: ** translation for device /ht at 0/isa at 4/rtc at 900 **
OF: bus is isa (na=2, ns=1) on /ht at 0/isa at 4
OF: translating address: 00000001 00000900
OF: parent bus is default (na=3, ns=2) on /ht at 0
OF: walking ranges...
OF: ISA map, cp=0, s=10000, da=900
OF: parent translation for: f4000000 00000000 00000000
OF: with offset: 900
OF: one level translation: 00000000 00000000 00000900
OF: parent bus is default (na=2, ns=2) on /
OF: walking ranges...
OF: default map, cp=0, s=400000, da=900
OF: parent translation for: 00000000 f4000000
OF: with offset: 900
OF: one level translation: 00000000 f4000900
OF: reached root node
Maple: Found RTC at IO 0x900
...

Fixes similar issues w/ other devices on the bus as well.
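
To illustrate what one "walking ranges" step does, here is a much
simplified sketch. It assumes each address fits in a single 64-bit
value and ignores the na/ns cell counts and any bus-specific flag
cells; struct range, translate_one() and XLATE_FAIL are made-up names,
not the prom_parse.c interfaces.

```c
#include <stddef.h>
#include <stdint.h>

/* One flattened "ranges" entry: child base, parent base, size. */
struct range {
	uint64_t child_base;
	uint64_t parent_base;
	uint64_t size;
};

#define XLATE_FAIL UINT64_MAX

/* Translate addr one level up, or return XLATE_FAIL ("not found"). */
static uint64_t translate_one(const struct range *r, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= r[i].child_base &&
		    addr - r[i].child_base < r[i].size)
			return r[i].parent_base + (addr - r[i].child_base);
	}
	return XLATE_FAIL;
}
```

Feeding it the last step of the log above (cp=0, s=400000, parent
f4000000, da=900) gives f4000900, matching the "one level translation"
line.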

Note Ben - it looks like adding "ht" as a match in of_bus_pci_match()
doesn't help matters -

Output w/ ISA range property change and adding "ht" as match in
of_bus_pci_match():
OF: of_bus_pci_match with ht
OF: ** translation for device /ht at 0/isa at 4/rtc at 900 **
OF: of_bus_pci_match with ht
OF: bus is isa (na=2, ns=1) on /ht at 0/isa at 4
OF: translating address: 00000001 00000900
OF: of_bus_pci_match with ht
OF: parent bus is pci (na=3, ns=2) on /ht at 0
OF: walking ranges...
OF: ISA map, cp=0, s=10000, da=900
OF: parent translation for: f4000000 00000000 00000000
OF: with offset: 900
OF: one level translation: f4000000 00000000 00000900
OF: of_bus_pci_match with ht
OF: parent bus is default (na=2, ns=2) on /
OF: walking ranges...
OF: not found !
Maple: Unable to translate RTC address
Maple: No device node for RTC, assuming legacy address (0x70)

Updating the range property can be done via the EPOS(/PIBS for more recent
versions) shell using this function:

of_change_property(char *nodename, char *propname, char* prop, size_t len)

As an example:

PIBS $ int val=malloc(24)
PIBS $ int *p=val
PIBS $ *p=0x0000000100000000
PIBS $ p+=1
PIBS $ *p=0xf400000000000000
PIBS $ p+=8                   # Note - there appears to be an anomaly in my
                              # PIBS version where ptr arith is only done for
                              # the first addition - check your values using
                              # "print p"
PIBS $ *p=0x0000000000010000
PIBS $ of_change_property("/ht/isa", "ranges", val, 24)

Note also - this range definition does appear to be compatible with older
kernels (I booted a 2.6.10 based image w/ no obvious problems)

Steve Winiecki

From pradeep at us.ibm.com  Sat Feb 25 09:14:02 2006
From: pradeep at us.ibm.com (Pradeep Satyanarayana)
Date: Fri, 24 Feb 2006 14:14:02 -0800
Subject: Problems loading some select modules
Message-ID: <OF0E9B0E2A.2FB5C32A-ON8825711F.0077F598-8825711F.0079FB7C@us.ibm.com>


I was trying to load some Infiniband modules (using modprobe) on Power5
machine (p570), and I get the following error:

WARNING: Error inserting findex
(/lib/modules/2.6.16-rc2/kernel/drivers/infiniband/core/findex.ko): Invalid
module format

Also, in /var/log/messages I see the following error about the same module:

kernel: findex: doesn't contain .toc or .stubs.

objdump -h findex.ko | grep toc

returns nothing. However,  when I tried that on another module I see the
following:

objdump -h ib_core.ko | grep toc
 16 .toc1         000002b8  0000000000000000  0000000000000000  0000d900  2**0
 18 .toc          00000038  0000000000000000  0000000000000000  0000e548  2**3

As expected, ib_core (and several other modules) load properly. Just
findex.ko has this problem. I suspected that the wrong module was being
picked up and attempted an insmod of the module by specifying the path;
same problem.

I was using linux 2.6.16-rc2. This was a SLES9 SP2 machine. The gcc version
is: gcc version 3.3.3 (SuSE Linux).

Identical kernel and Infiniband sources on a RHEL4U3 machine (on a p570
again) have no problems and the modules load properly.

On the RedHat machine the gcc version is :
gcc version 3.4.5 20051201 (Red Hat 3.4.5-2)

Any help with this is much  appreciated.


Pradeep
pradeep at us.ibm.com

From ericvh at gmail.com  Sat Feb 25 09:57:40 2006
From: ericvh at gmail.com (Eric Van Hensbergen)
Date: Fri, 24 Feb 2006 16:57:40 -0600 (CST)
Subject: [PATCH 0/2] systemsim extensions (not for mainline inclusion)
Message-ID: <20060224225740.E0BBF5A8075@localhost.localdomain>

What follows is a couple of FYI systemsim patches.  They are not necessarily 
for inclusion in mainline kernel and will be maintained in 
/pub/scm/linux/kernel/git/ericvh/systemsim.git on kernel.org.

       -eric


From ericvh at gmail.com  Sat Feb 25 09:58:24 2006
From: ericvh at gmail.com (Eric Van Hensbergen)
Date: Fri, 24 Feb 2006 16:58:24 -0600 (CST)
Subject: [PATCH 1/2] systemsim: boot hacks for non-standard platforms
Message-ID: <20060224225824.D98DE5A8075@localhost.localdomain>

>From nobody Mon Sep 17 00:00:00 2001
From: Eric Van Hensbergen <ericvh at gmail.com>
Date: Fri Feb 24 16:46:07 2006 -0600
Subject: [PATCH] systemsim: add boot hacks for non-standard platforms

When booting on some "experimental platforms" under the IBM Full System
Simulator - a certain set of boot hacks are required which differentiate
the hardware from standard pSeries systems.  This patch adds a config flag
which allows you to use these hacks.

Signed-off-by: Eric Van Hensbergen <bergevan at us.ibm.com>

---

 arch/powerpc/Kconfig                   |    8 ++++++++
 arch/powerpc/platforms/pseries/setup.c |    8 ++++++++
 2 files changed, 16 insertions(+), 0 deletions(-)

6f27df783005ca87d1a27370837070b98798fbeb
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 371043b..592846c 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -384,6 +384,14 @@ config SYSTEMSIM_IDLE
 	  significantly reduces the load on the host system when
 	  simulating an idle system.
 
+config SYSTEMSIM_BOOT
+	bool "   Boot hacks for non-standard hardware under systemsim"
+	depends on PPC_SYSTEMSIM
+	help
+	  Selecting this option will enable boot hacks during setup
+	  to facilitate Linux boots on non-standard hardware under the
+          IBM Full System Simulator.
+
 config XICS
 	depends on PPC_PSERIES
 	bool
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 9edeca8..ca5d20a 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -171,11 +171,19 @@ static void __init pSeries_setup_mpic(vo
 	
 	/* Setup the openpic driver */
 	irq_count = NR_IRQS - NUM_ISA_INTERRUPTS - 4; /* leave room for IPIs */
+#ifndef CONFIG_SYSTEMSIM_BOOT
 	pSeries_mpic = mpic_alloc(openpic_addr, MPIC_PRIMARY,
 				  16, 16, irq_count, /* isu size, irq offset, irq count */ 
 				  NR_IRQS - 4, /* ipi offset */
 				  senses, irq_count, /* sense & sense size */
 				  " MPIC     ");
+#else /* CONFIG_SYSTEMSIM_BOOT */
+	pSeries_mpic = mpic_alloc(openpic_addr, MPIC_PRIMARY,
+				  0, 0, irq_count, /* isu size, irq offset, irq count */ 
+				  NR_IRQS - 4, /* ipi offset */
+				  senses, irq_count, /* sense & sense size */
+				  " MPIC     ");
+#endif /* CONFIG_SYSTEMSIM_BOOT */
 }
 
 static void pseries_lpar_enable_pmcs(void)
-- 


From ericvh at gmail.com  Sat Feb 25 09:58:55 2006
From: ericvh at gmail.com (Eric Van Hensbergen)
Date: Fri, 24 Feb 2006 16:58:55 -0600 (CST)
Subject: [PATCH 2/2] systemsim: add early debug options for HVC_FSS
Message-ID: <20060224225855.B83C45A8075@localhost.localdomain>

>From nobody Mon Sep 17 00:00:00 2001
From: Eric Van Hensbergen <ericvh at gmail.com>
Date: Fri Feb 24 16:47:36 2006 -0600
Subject: [PATCH] systemsim: add early debug option when using systemsim console

This patch adds udbg hooks for early-printk debug when using the IBM
Full System Simulator console support.

Signed-off-by: Eric Van Hensbergen <bergevan at us.ibm.com>

---

 arch/powerpc/kernel/udbg.c |    3 +++
 drivers/char/Kconfig       |    7 +++++++
 drivers/char/hvc_fss.c     |   15 +++++++++++++++
 include/asm-powerpc/udbg.h |    1 +
 4 files changed, 26 insertions(+), 0 deletions(-)

b4e4add5d57f130a422e68787626f96f311658a0
diff --git a/arch/powerpc/kernel/udbg.c b/arch/powerpc/kernel/udbg.c
index 3774e80..66b63ad 100644
--- a/arch/powerpc/kernel/udbg.c
+++ b/arch/powerpc/kernel/udbg.c
@@ -39,6 +39,9 @@ void __init udbg_early_init(void)
 #elif defined(CONFIG_PPC_EARLY_DEBUG_MAPLE)
 	/* Maple real mode debug */
 	udbg_init_maple_realmode();
+#elif defined(CONFIG_PPC_EARLY_DEBUG_FSS)
+	/* IBM Full System Simulator console debug */
+	udbg_init_fss();
 #elif defined(CONFIG_PPC_EARLY_DEBUG_ISERIES)
 	/* For iSeries - hit Ctrl-x Ctrl-x to see the output */
 	udbg_init_iseries();
diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index 74f9932..1973869 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -586,6 +586,13 @@ config HVC_FSS
 	  IBM Full System Simulator Console device driver which makes use of
 	  the HVC_DRIVER front end.
 
+config PPC_EARLY_DEBUG_FSS
+	bool "IBM Full System Simulator Console early debug support"
+	depends on PPC_SYSTEMSIM
+	select HVC_FSS
+	help
+	  Display early debug info over the Full System simulator console
+
 config HVC_RTAS
 	bool "IBM RTAS Console support"
 	depends on PPC_RTAS
diff --git a/drivers/char/hvc_fss.c b/drivers/char/hvc_fss.c
index e87c03a..84aa34f 100644
--- a/drivers/char/hvc_fss.c
+++ b/drivers/char/hvc_fss.c
@@ -38,6 +38,7 @@
 #include <asm/prom.h>
 #include <asm/irq.h>
 #include <asm/systemsim.h>
+#include <asm/udbg.h>
 
 #include "hvc_console.h"
 
@@ -74,6 +75,20 @@ static int hvc_fss_read_console(uint32_t
 	return got;
 }
 
+#ifdef CONFIG_PPC_EARLY_DEBUG_FSS
+void udbg_fss_real_putc(char c)
+{
+	callthru3(SIM_WRITE_CONSOLE_CODE, (unsigned long)&c, 1, 1);
+}
+
+void __init udbg_init_fss(void)
+{
+	udbg_putc = udbg_fss_real_putc;
+	udbg_getc = NULL;
+	udbg_getc_poll = NULL;
+}
+#endif /* CONFIG_PPC_EARLY_DEBUG_FSS */
+
 static struct hv_ops hvc_fss_get_put_ops = {
 	.get_chars = hvc_fss_read_console,
 	.put_chars = hvc_fss_write_console,
diff --git a/include/asm-powerpc/udbg.h b/include/asm-powerpc/udbg.h
index 5c4236c..46b100a 100644
--- a/include/asm-powerpc/udbg.h
+++ b/include/asm-powerpc/udbg.h
@@ -42,6 +42,7 @@ extern void __init udbg_init_pmac_realmo
 extern void __init udbg_init_maple_realmode(void);
 extern void __init udbg_init_iseries(void);
 extern void __init udbg_init_rtas(void);
+extern void __init udbg_init_fss(void);
 
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_UDBG_H */
-- 


From Luis.Lopez at foxconn.com  Sat Feb 25 09:44:54 2006
From: Luis.Lopez at foxconn.com (Luis Lopez-FLL005)
Date: Fri, 24 Feb 2006 15:44:54 -0700
Subject: p660 RIO failure
Message-ID: <0608F878EEE32846BFEFE4F1221A0E070419A5DC@cuuexm01.mx.efoxconn.com>

Hello 
 
Did you solve your problem with the following message ?
 
Service Processor Firmware Failure
    Error code: B1014602
    Detail:     6013
 
    SRC
    --------------------------------------------------------------
    word11: B1014602    word12: 0230005D    word13: 60132014
    word14: 00000000    word15: 00000700    word16: 0000A05A
    word17: 00000000    word18: 00004000    word19: F444E060
 
    B1014602
 
 
I am having this problem with my RS 6000 Server using AIX 4.3.3, I
appreciate any help.
 
Luis
 

From benh at kernel.crashing.org  Sat Feb 25 16:02:32 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sat, 25 Feb 2006 16:02:32 +1100
Subject: Maple fails to boot current git
In-Reply-To: <OF430BE00B.A014F4E5-ON8725711F.0076F7F4-8525711F.0076EE85@us.ibm.com>
References: <OF430BE00B.A014F4E5-ON8725711F.0076F7F4-8525711F.0076EE85@us.ibm.com>
Message-ID: <1140843752.24957.14.camel@localhost.localdomain>


> For the record, changing the ISA ranges property does correct the
> problem translating the addresses for the devices hanging off that bus

Yup, we should get the PIBS folks to fix that.

> Note Ben - it looks like adding "ht" as a match in of_bus_pci_match()
> doesn't help matters -

It should still be done for correctness imho

> Updating the range property can be done via the EPOS(/PIBS for more
> recent versions) shell using this function:
> 
> of_change_property(char *nodename, char *propname, char* prop, size_t
> len)

Ah good to know, I couldn't remember how to do it.

> As an example:
> 
> PIBS $ int val=malloc(24)
> PIBS $ int *p=val
> PIBS $ *p=0x0000000100000000
> PIBS $ p+=1
> PIBS $ *p=0xf400000000000000
> PIBS $ p+=8 # Note - there appears to be an anomaly in my PIBS version
> # where ptr arith is only done for the first addition - check your
> # values using "print p"
> PIBS $ *p=0x0000000000010000
> PIBS $ of_change_property("/ht/isa", "ranges", val, 24)
> 
> Note also - this range definition does appear to be compatible with
> older kernels (I booted a 2.6.10 based image w/ no obvious problems) 

Yup, the previous one was bogus.

Ben.


From olh at suse.de  Sat Feb 25 19:17:00 2006
From: olh at suse.de (Olaf Hering)
Date: Sat, 25 Feb 2006 09:17:00 +0100
Subject: p660 RIO failure
In-Reply-To: <0608F878EEE32846BFEFE4F1221A0E070419A5DC@cuuexm01.mx.efoxconn.com>
References: <0608F878EEE32846BFEFE4F1221A0E070419A5DC@cuuexm01.mx.efoxconn.com>
Message-ID: <20060225081700.GA13698@suse.de>

 On Fri, Feb 24, Luis Lopez-FLL005 wrote:

> I am having this problem with my RS 6000 Server using AIX 4.3.3, I
> appreciate any help.

I sort of solved it by taking everything apart and reassembling it. It
helped for a while.


From olh at suse.de  Sat Feb 25 23:34:51 2006
From: olh at suse.de (Olaf Hering)
Date: Sat, 25 Feb 2006 13:34:51 +0100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <20060222133551.GA30355@suse.de>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
	<20060222133551.GA30355@suse.de>
Message-ID: <20060225123451.GA20731@suse.de>

 On Wed, Feb 22, Olaf Hering wrote:

>  On Wed, Feb 22, Paul Mackerras wrote:
> 
> > All of this is conditional on CONFIG_VIRT_CPU_ACCOUNTING.  If that is
> > not set, we do tick-based approximate accounting as before.

cpufreq now has unresolved symbols.

WARNING: /var/tmp/kernel-ppc64-2.6.16_rc4_git8-build/lib/modules/2.6.16-rc4-git8-20060225_do_get_xsec-ppc64/kernel/drivers/cpufreq/cpufreq_stats.ko needs unknown symbol __cputime_clockt_factor


From benh at kernel.crashing.org  Sun Feb 26 08:09:00 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sun, 26 Feb 2006 08:09:00 +1100
Subject: [PATCH] powerpc: vdso 64bits gettimeofday bug
Message-ID: <1140901740.24957.29.camel@localhost.localdomain>

A bug in the assembly code of the vdso can cause gettimeofday() to hang
or to return incorrect results. The wrong register was used to test for
pending updates of the calibration variables and to create a dependency
for subsequent loads. This fixes it.

Signed-off-by: Benjamin Herrenschmidt <benh at kernel.crashing.org>
---

Might be worth applying to the stable series too and/or distro kernels
2.6.15 and later
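
For readers less familiar with the pattern, a C sketch of an
update-count reader loop in general. The struct layout and names
(struct time_cfg, read_tb_offset) are made up, the recheck after the
load is the usual form of this pattern rather than something shown in
the hunk below, and real code needs proper ordering (the assembly uses
a data dependency for that), which plain volatile does not give you.

```c
#include <stdint.h>

/* Hypothetical layout; the real vdso data page differs. */
struct time_cfg {
	volatile uint64_t update_count;	/* odd while an update is pending */
	volatile uint64_t tb_offset;
};

static uint64_t read_tb_offset(const struct time_cfg *cfg)
{
	uint64_t count, value;

	for (;;) {
		count = cfg->update_count;
		if (count & 1)
			continue;	/* test the value just loaded */
		value = cfg->tb_offset;
		if (cfg->update_count == count)
			return value;	/* no update raced with the read */
	}
}
```

The bug fixed here corresponds to testing some other variable than
count in the "pending update" check.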

--- linux-work.orig/arch/powerpc/kernel/vdso64/gettimeofday.S	2006-02-26 08:02:57.000000000 +1100
+++ linux-work/arch/powerpc/kernel/vdso64/gettimeofday.S	2006-02-26 08:04:23.000000000 +1100
@@ -225,9 +225,9 @@
   .cfi_startproc
 	/* check for update count & load values */
 1:	ld	r8,CFG_TB_UPDATE_COUNT(r3)
-	andi.	r0,r4,1			/* pending update ? loop */
+	andi.	r0,r8,1			/* pending update ? loop */
 	bne-	1b
-	xor	r0,r4,r4		/* create dependency */
+	xor	r0,r8,r8		/* create dependency */
 	add	r3,r3,r0
 
 	/* Get TB & offset it */


From benh at kernel.crashing.org  Sun Feb 26 08:29:17 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sun, 26 Feb 2006 08:29:17 +1100
Subject: Fwd: [PATCH] powerpc: disable OProfile for iSeries
In-Reply-To: <200602231432.59472.kelly@au.ibm.com>
References: <200602231432.59472.kelly@au.ibm.com>
Message-ID: <1140902958.24957.38.camel@localhost.localdomain>

On Thu, 2006-02-23 at 14:32 +1100, Kelly Daly wrote:
> disable OProfile in Kconfig for iSeries to prevent hangs.  OProfile was not originally intended to work with legacy iSeries.

What is hanging exactly ? There should be no problem using oprofile
timer based sampling at least on iseries...

Ben.


From benh at kernel.crashing.org  Sun Feb 26 08:36:14 2006
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Sun, 26 Feb 2006 08:36:14 +1100
Subject: [PATCH] powerpc: Fix runlatch performance issues
In-Reply-To: <200602240502.k1O52ExR009703@hera.kernel.org>
References: <200602240502.k1O52ExR009703@hera.kernel.org>
Message-ID: <1140903374.24957.43.camel@localhost.localdomain>

On Fri, 2006-02-24 at 05:02 +0000, Linux Kernel Mailing List wrote:
> commit cb2c9b2741346eb23b177187a51ff5abf08295bd
> tree 31433b46f96a00e22ca7e8402fd0bfe1fea3408d
> parent 47f78a49206b7f9b0d283ba46a2a5a6ee1796472
> author Anton Blanchard <anton at samba.org> Mon, 13 Feb 2006 14:48:35 +1100
> committer Paul Mackerras <paulus at samba.org> Fri, 24 Feb 2006 11:36:31 +1100
> 
> [PATCH] powerpc: Fix runlatch performance issues
> 
> The runlatch SPR can take a lot of time to write. My original runlatch
> code would set it on every exception entry even though most of the time
> this was not required. It would also continually set it in the idle
> loop, which is an issue on an SMT capable processor.
> 
> Now we cache the runlatch value in a threadinfo bit, and only check for
> it in decrementer and hardware interrupt exceptions as well as the idle
> loop. Boot on POWER3, POWER5 and iseries, and compile tested on pmac32.

I very much dislike the unconditional bl to C code in the exception
path. Can you at least wrap it in asm cpu feature conditionals on
CPU_FTR_CTRL so that it gets NOP'ed out on CPUs without a runlatch ? In
addition, we should probably not set that feature bit from the cputable
but from the platform code so that it's only set on machines where it's
useful, thus causing the code to be NOP'ed out on G5s and other bare
metal stuff that don't care about the runlatch no ?

Ben.


From paulus at samba.org  Mon Feb 27 14:13:24 2006
From: paulus at samba.org (Paul Mackerras)
Date: Mon, 27 Feb 2006 14:13:24 +1100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <20060225123451.GA20731@suse.de>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
	<20060222133551.GA30355@suse.de> <20060225123451.GA20731@suse.de>
Message-ID: <17410.28244.72197.129637@cargo.ozlabs.ibm.com>

Olaf Hering writes:

> WARNING:
> /var/tmp/kernel-ppc64-2.6.16_rc4_git8-build/lib/modules/2.6.16-rc4-git8-20060225_do_get_xsec-ppc64/kernel/drivers/cpufreq/cpufreq_stats.ko
> needs unknown symbol __cputime_clockt_factor 

We need to export that and a few others, or else move the conversion
functions in cputime.h to arch/powerpc/kernel/time.c so that they are
out of line.

Paul.


From sfr at canb.auug.org.au  Mon Feb 27 16:03:37 2006
From: sfr at canb.auug.org.au (Stephen Rothwell)
Date: Mon, 27 Feb 2006 16:03:37 +1100
Subject: [PATCH] Signal hadnling fix for 2.4
Message-ID: <20060227160337.65610906.sfr@canb.auug.org.au>

Hi Marcelo,

While investigating a bug report about a 64bit application that crashed in
malloc, Paul Mackerras noticed that sys_rt_sigreturn's return value was
"int".  It needs to be "long" or else the return value of a syscall that
is interrupted by a signal will be truncated to 32 bits and then sign
extended.  This causes, e.g., mmap's return value to be corrupted if it is
returning an address above 2^31 (which is what caused a SEGV in malloc).
This problem obviously only affects 64 bit processes.

Signed-off-by: Stephen Rothwell <sfr at canb.auug.org.au>

---

Please apply for 2.4.33, this patch is against 2.4.33-pre2.
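
A small user-space sketch of the effect, for illustration only. The
helper name through_int() is made up; it models a 64-bit value being
cut down to 32 bits and then sign extended again.

```c
#include <stdint.h>

/* Keep the low 32 bits of ret, then sign extend them back to 64 bits. */
static int64_t through_int(int64_t ret)
{
	uint32_t low = (uint32_t)ret;

	if (low & 0x80000000u)
		return (int64_t)low - 0x100000000LL;
	return (int64_t)low;
}
```

An address such as 0x80001000 comes back as 0xffffffff80001000, while
anything below 2^31 survives unchanged.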

-- 
Cheers,
Stephen Rothwell                    sfr at canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

diff -ruN linux/arch/ppc64/kernel/signal.c linux-sfr/arch/ppc64/kernel/signal.c
--- linux/arch/ppc64/kernel/signal.c	2006-02-24 17:37:08.000000000 +1100
+++ linux-sfr/arch/ppc64/kernel/signal.c	2006-02-27 11:05:07.000000000 +1100
@@ -332,7 +332,7 @@
 }
 
 
-asmlinkage int
+asmlinkage long
 sys_rt_sigreturn(unsigned long r3, unsigned long r4, unsigned long r5,
 		 unsigned long r6, unsigned long r7, unsigned long r8,
 		 struct pt_regs *regs)


From clumens at redhat.com  Tue Feb 28 02:41:37 2006
From: clumens at redhat.com (Chris Lumens)
Date: Mon, 27 Feb 2006 10:41:37 -0500
Subject: [PATCH] Conditionalize debugging printks
Message-ID: <20060227154137.GF17260@exeter.boston.redhat.com>

All the debugging output I'm seeing in the log files on my G5 means
relatively little to me, so this patch makes it possible to turn it off.
The code looks relatively new, which is why I left the output enabled by
default rather than turning everything off.

- Chris


Signed-off-by: Chris Lumens <clumens at redhat.com>


---

 arch/powerpc/platforms/powermac/pfunc_base.c |    6 ++++++
 arch/powerpc/platforms/powermac/pfunc_core.c |    7 +++++++
 2 files changed, 13 insertions(+), 0 deletions(-)

a192d232af68676eee3488d734bf334acce05453
diff --git a/arch/powerpc/platforms/powermac/pfunc_base.c b/arch/powerpc/platforms/powermac/pfunc_base.c
index 4ffd2a9..8ea5bc0 100644
--- a/arch/powerpc/platforms/powermac/pfunc_base.c
+++ b/arch/powerpc/platforms/powermac/pfunc_base.c
@@ -9,7 +9,13 @@
 #include <asm/pmac_feature.h>
 #include <asm/pmac_pfunc.h>
 
+#define DEBUG
+
+#ifdef DEBUG
 #define DBG(fmt...)	printk(fmt)
+#else
+#define DBG(fmt...)
+#endif
 
 static irqreturn_t macio_gpio_irq(int irq, void *data, struct pt_regs *regs)
 {
diff --git a/arch/powerpc/platforms/powermac/pfunc_core.c b/arch/powerpc/platforms/powermac/pfunc_core.c
index 356a739..215d267 100644
--- a/arch/powerpc/platforms/powermac/pfunc_core.c
+++ b/arch/powerpc/platforms/powermac/pfunc_core.c
@@ -17,10 +17,17 @@
 #include <asm/pmac_pfunc.h>
 
 /* Debug */
+#define DEBUG
+
 #define LOG_PARSE(fmt...)
 #define LOG_ERROR(fmt...)	printk(fmt)
 #define LOG_BLOB(t,b,c)
+
+#ifdef DEBUG
 #define DBG(fmt...)		printk(fmt)
+#else
+#define DBG(fmt...)
+#endif
 
 /* Command numbers */
 #define PMF_CMD_LIST			0
-- 
1.2.3


From sonny at burdell.org  Tue Feb 28 06:31:57 2006
From: sonny at burdell.org (Sonny Rao)
Date: Mon, 27 Feb 2006 14:31:57 -0500
Subject: [PATCH] powerpc: Fix runlatch performance issues
In-Reply-To: <1140903374.24957.43.camel@localhost.localdomain>
References: <200602240502.k1O52ExR009703@hera.kernel.org>
	<1140903374.24957.43.camel@localhost.localdomain>
Message-ID: <20060227193157.GA22165@kevlar.burdell.org>

On Sun, Feb 26, 2006 at 08:36:14AM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2006-02-24 at 05:02 +0000, Linux Kernel Mailing List wrote:
> > commit cb2c9b2741346eb23b177187a51ff5abf08295bd
> > tree 31433b46f96a00e22ca7e8402fd0bfe1fea3408d
> > parent 47f78a49206b7f9b0d283ba46a2a5a6ee1796472
> > author Anton Blanchard <anton at samba.org> Mon, 13 Feb 2006 14:48:35 +1100
> > committer Paul Mackerras <paulus at samba.org> Fri, 24 Feb 2006 11:36:31 +1100
> > 
> > [PATCH] powerpc: Fix runlatch performance issues
> > 
> > The runlatch SPR can take a lot of time to write. My original runlatch
> > code would set it on every exception entry even though most of the time
> > this was not required. It would also continually set it in the idle
> > loop, which is an issue on an SMT capable processor.
> > 
> > Now we cache the runlatch value in a threadinfo bit, and only check for
> > it in decrementer and hardware interrupt exceptions as well as the idle
> > loop. Boot on POWER3, POWER5 and iseries, and compile tested on pmac32.
> 
> I very much dislike the unconditional bl to C code in the exception
> path. Can you at least wrap it in asm cpu feature conditionals on
> CPU_FTR_CTRL so that it gets NOP'ed out on CPUs without a runlatch ? In
> addition, we should probably not set that feature bit from the cputable
> but from the platform code so that it's only set on machines where it's
> useful, thus causing the code to be NOP'ed out on G5s and other bare
> metal stuff that don't care about the runlatch no ?

AFAIK, runlatch is orthogonal to paravirtualization vs bare metal issues.

All it does is stop the PM_RUN_CYC counter from running while a CPU is
idle.  This is useful when you want to accurately determine CPI on a
given workload and there is any idle time (waiting for I/O, whatever).

If we ever release pmcount (pending perfmon2 API stabilization, I
think?), you'll find runlatch is useful even on a G5.

Sonny


From paulus at samba.org  Mon Feb 27 15:43:29 2006
From: paulus at samba.org (Paul Mackerras)
Date: Mon, 27 Feb 2006 15:43:29 +1100
Subject: [PATCH] Accurate task and cpu time accounting
In-Reply-To: <20060225123451.GA20731@suse.de>
References: <17404.19586.404909.178103@cargo.ozlabs.ibm.com>
	<20060222133551.GA30355@suse.de> <20060225123451.GA20731@suse.de>
Message-ID: <17410.33649.46500.544164@cargo.ozlabs.ibm.com>

Olaf Hering writes:

> cpufreq has now unresolved symbols.

This should fix it... (now in powerpc.git)

Paul.
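
As background on why the inline helpers need these variables, here is a
rough userspace sketch of a conversion through a 0.64 fixed-point
factor, as the comment in time.c describes them.  The names
make_factor() and scale_ticks() are hypothetical, the real helpers in
cputime.h are not shown in this thread, and unsigned __int128 is a
GCC/Clang extension assumed to be available.

```c
#include <stdint.h>

/*
 * Build a 0.64 fixed-point factor for unit / tb_ticks_per_sec, e.g.
 * unit = 1000 for milliseconds.  Requires unit < tb_ticks_per_sec so
 * that the fraction is below 1 and fits in 64 bits.
 */
static uint64_t make_factor(uint64_t unit, uint64_t tb_ticks_per_sec)
{
	return (uint64_t)(((unsigned __int128)unit << 64) / tb_ticks_per_sec);
}

/* Scale ticks by the factor: keep the high 64 bits of the 128-bit product. */
static uint64_t scale_ticks(uint64_t ticks, uint64_t factor)
{
	return (uint64_t)(((unsigned __int128)ticks * factor) >> 64);
}
```

Because the factor is a global computed once at boot, any inline
function that multiplies by it drags a reference to that global into
every caller, modules included.  Both steps round down, so results are
truncated rather than rounded to nearest.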

diff-tree 2cf82c0256b198ae28c465f2c4d7c12c836ea5ea (from f055affb89f587a03f3411c3fd49ef31295c3d48)
Author: Paul Mackerras <paulus at samba.org>
Date:   Mon Feb 27 15:41:47 2006 +1100

    powerpc: Export variables used in conversions to/from cputime_t
    
    The inline cputime_to_foo and foo_to_cputime conversion functions in
    include/asm-powerpc/cputime.h refer to 5 variables, which need to be
    exported if those functions are to be usable from modules.
    
    Signed-off-by: Paul Mackerras <paulus at samba.org>

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 0b34db2..4f20a5f 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -97,10 +97,11 @@ static unsigned long first_settimeofday 
 
 unsigned long tb_ticks_per_jiffy;
 unsigned long tb_ticks_per_usec = 100; /* sane default */
 EXPORT_SYMBOL(tb_ticks_per_usec);
 unsigned long tb_ticks_per_sec;
+EXPORT_SYMBOL(tb_ticks_per_sec);	/* for cputime_t conversions */
 u64 tb_to_xs;
 unsigned tb_to_us;
 
 #define TICKLEN_SCALE	(SHIFT_SCALE - 10)
 u64 last_tick_len;	/* units are ns / 2^TICKLEN_SCALE */
@@ -141,13 +142,17 @@ DEFINE_PER_CPU(unsigned long, last_jiffy
  * Factors for converting from cputime_t (timebase ticks) to
  * jiffies, milliseconds, seconds, and clock_t (1/USER_HZ seconds).
  * These are all stored as 0.64 fixed-point binary fractions.
  */
 u64 __cputime_jiffies_factor;
+EXPORT_SYMBOL(__cputime_jiffies_factor);
 u64 __cputime_msec_factor;
+EXPORT_SYMBOL(__cputime_msec_factor);
 u64 __cputime_sec_factor;
+EXPORT_SYMBOL(__cputime_sec_factor);
 u64 __cputime_clockt_factor;
+EXPORT_SYMBOL(__cputime_clockt_factor);
 
 static void calc_cputime_factors(void)
 {
 	struct div_result res;
 

From michael at ellerman.id.au  Tue Feb 28 14:54:26 2006
From: michael at ellerman.id.au (Michael Ellerman)
Date: Tue, 28 Feb 2006 14:54:26 +1100
Subject: [PATCH] powerpc: iseries: Fix double phys_to_abs bug in
	htab_bolt_mapping
Message-ID: <20060228035450.BB448679F8@ozlabs.org>

Before the merge I updated create_pte_mapping() to work for iSeries, by
calling iSeries_hpte_bolt_or_insert. (4c55130b2aa93370f1bf52d2304394e91cf8ee39)

Later we changed iSeries_hpte_insert to cope with the bolting case, and called
that instead from create_pte_mapping() (which was renamed to htab_bolt_mapping)
(3c726f8dee6f55e96475574e9f645327e461884c).

Unfortunately that change introduced a subtle bug, where we pass an absolute
address to iSeries_hpte_insert() where it expects a physical address. This
leads to us calling phys_to_abs() twice on the physical address, which is
seriously bogus.

This only causes a problem if the absolute address from the first translation
can be looked up again in the chunk_map, which depends on the size and layout
of memory. I've seen it fail on one box, but not others.

The minimal fix is to pass the physical address to iSeries_hpte_insert(). For
2.6.17 we should make phys_to_abs() BUG if we try to double-translate an
address.

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
---

 arch/powerpc/mm/hash_utils_64.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)
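
To show why the double translation only bites on some memory layouts,
here is a toy model of a chunk map.  Everything in it (the toy_ names,
the map contents, the 18-bit chunk shift, and the rule that chunks
outside the map pass through unchanged) is an assumption made for
illustration, not the real iSeries code.

```c
#define TOY_CHUNK_SHIFT	18UL
#define TOY_OFFSET_MASK	((1UL << TOY_CHUNK_SHIFT) - 1)

/* Toy chunk map: physical chunk index -> absolute chunk index. */
static const unsigned long toy_chunk_map[] = { 5, 2, 9 };
#define TOY_NUM_CHUNKS	(sizeof(toy_chunk_map) / sizeof(toy_chunk_map[0]))

/* Translate a physical address; chunks outside the map pass through. */
static unsigned long toy_phys_to_abs(unsigned long pa)
{
	unsigned long chunk = pa >> TOY_CHUNK_SHIFT;

	if (chunk < TOY_NUM_CHUNKS)
		chunk = toy_chunk_map[chunk];
	return (chunk << TOY_CHUNK_SHIFT) | (pa & TOY_OFFSET_MASK);
}

/* The bug pattern: translating an already-translated address. */
static unsigned long toy_double_translate(unsigned long pa)
{
	return toy_phys_to_abs(toy_phys_to_abs(pa));
}
```

With this map, physical chunk 0 becomes absolute chunk 5, which is
outside the map, so a second translation is harmless.  Physical chunk 1
becomes absolute chunk 2, which is inside the map, so the second
translation silently turns it into chunk 9.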

Index: iseries/arch/powerpc/mm/hash_utils_64.c
===================================================================
--- iseries.orig/arch/powerpc/mm/hash_utils_64.c
+++ iseries/arch/powerpc/mm/hash_utils_64.c
@@ -169,7 +169,7 @@ int htab_bolt_mapping(unsigned long vsta
 #ifdef CONFIG_PPC_ISERIES
 		if (_machine == PLATFORM_ISERIES_LPAR)
 			ret = iSeries_hpte_insert(hpteg, va,
-						  virt_to_abs(paddr),
+						  __pa(vaddr),
 						  tmp_mode,
 						  HPTE_V_BOLTED,
 						  psize);


From paulus at samba.org  Tue Feb 28 15:01:34 2006
From: paulus at samba.org (Paul Mackerras)
Date: Tue, 28 Feb 2006 15:01:34 +1100
Subject: [PATCH] Signal handling fix for 2.4
In-Reply-To: <20060227160337.65610906.sfr@canb.auug.org.au>
References: <20060227160337.65610906.sfr@canb.auug.org.au>
Message-ID: <17411.51998.642468.642351@cargo.ozlabs.ibm.com>

Stephen Rothwell writes:

> While investigating a bug report about a 64bit application that crashed in
> malloc, Paul Mackerras noticed that sys_rt_sigreturn's return value was
> "int".  It needs to be "long" or else the return value of a syscall that
> is interrupted by a signal will be truncated to 32 bits and then sign
> extended.  This causes e.g. mmap's return value to be corrupted if it is
> returning an address above 2^31 (which is what caused a SEGV in malloc).
> This problem obviously only affects 64 bit processes.
> 
> Signed-off-by: Stephen Rothwell <sfr at canb.auug.org.au>

Acked-by: Paul Mackerras <paulus at samba.org>


From paulus at samba.org  Tue Feb 28 16:06:02 2006
From: paulus at samba.org (Paul Mackerras)
Date: Tue, 28 Feb 2006 16:06:02 +1100
Subject: Problems loading some select modules
In-Reply-To: <OF0E9B0E2A.2FB5C32A-ON8825711F.0077F598-8825711F.0079FB7C@us.ibm.com>
References: <OF0E9B0E2A.2FB5C32A-ON8825711F.0077F598-8825711F.0079FB7C@us.ibm.com>
Message-ID: <17411.55866.172377.50234@cargo.ozlabs.ibm.com>

Pradeep Satyanarayana writes:

> I was trying to load some Infiniband modules (using modprobe) on Power5
> machine (p570), and I get the following error:
> 
> WARNING: Error inserting findex
> (/lib/modules/2.6.16-rc2/kernel/drivers/infiniband/core/findex.ko): Invalid
> module format
> 
> Also, in /var/log/messages I see the following error about the same module:
> 
> kernel: findex: doesn't contain .toc or .stubs.

Interesting.  I don't see findex.c in the kernel sources anywhere.  It
could be that a very simple module that only accesses variables on the
stack would not need a toc, and maybe in this case the toolchain
doesn't generate a toc.  Could you send me the source of your module
plus the generated findex.ko?

Paul.