more eeh

Thu Mar 18 09:31:02 EST 2004

On Wed, Mar 17, 2004 at 03:28:07PM -0600, Nathan Fontenot wrote:
> Well, let's see if I can fill everyone's mailbox with some more
> lively eeh discussions again today. :)
>
> I decided to follow Paul's suggestion and use a workqueue to handle
> the hotplug remove of the device.  This was before I saw Greg's reply
> about using kobject_hotplug with a user-space script.  The relevant
> piece is in eeh.c in the attached patch.
>
> Greg, you're right that this is a harsh policy.  In this case I
> think it's the right thing.  When an EEH event happens the
> slot is basically unusable: device stores will fail and all reads will
> return all F's.  The idea with this code is to avoid a panic() call and
> hopefully allow some time to do any cleanup before shutting down.  Also,
> this is why we want to limit this to network devices.  This kind of
> policy for a hard disk would just be begging for data corruption.
>
> So, bring on the comments.  At least I now know you guys aren't shy.

I still say drop this to userspace, and have it do the power down of the
slot.  Otherwise this will not work with any other PCI hotplug driver.
It also requires that the PPC64 pci hotplug driver be built into your
kernel, which will not be true for any vendor kernel.

Also, I would never allow the generic "disable_slot" symbol to become
global in the kernel, that's just bad form :)

You can also do the "is this an ethernet device or not" type of checking
in userspace, which is the proper place for it too.  Oh, what happens if
that ethernet device contained some NFS mounts or an iSCSI device?
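
As a rough illustration of what such a userspace policy could look like,
here is a minimal sketch in Python.  It assumes the PCI address and slot
name of the failed device are handed to the script somehow (how an EEH
event would reach userspace is not settled in this thread, so
handle_eeh_event() and its arguments are hypothetical).  It relies only on
the sysfs "class" attribute of a PCI device and the "power" attribute that
PCI hotplug drivers expose per slot.  It does not try to answer the
NFS/iSCSI question above.

```python
import os

PCI_BASE_CLASS_NETWORK = 0x02  # PCI base class code for network controllers


def is_network_class(class_text):
    """Return True if a sysfs 'class' value such as '0x020000' is a network controller."""
    class_code = int(class_text.strip(), 16)
    # The base class is the top byte of the 24-bit class code.
    return ((class_code >> 16) & 0xFF) == PCI_BASE_CLASS_NETWORK


def device_is_network(pci_addr, sysfs_root="/sys"):
    """Read <sysfs_root>/bus/pci/devices/<pci_addr>/class and classify it."""
    path = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr, "class")
    with open(path) as f:
        return is_network_class(f.read())


def power_off_slot(slot_name, sysfs_root="/sys"):
    """Ask the hotplug driver to power down a slot by writing 0 to its 'power' file."""
    path = os.path.join(sysfs_root, "bus", "pci", "slots", slot_name, "power")
    with open(path, "w") as f:
        f.write("0")


def handle_eeh_event(pci_addr, slot_name, sysfs_root="/sys"):
    """Hypothetical policy hook: power down the slot only for network devices.

    Returns True if the slot was powered off, False if the device was left alone.
    """
    if not device_is_network(pci_addr, sysfs_root):
        return False
    power_off_slot(slot_name, sysfs_root)
    return True
```

Because the policy only touches sysfs files, it would work with whichever
PCI hotplug driver owns the slot, and the driver could be a module rather
than built in.  The sysfs_root parameter exists only so the sketch can be
exercised against a scratch directory instead of a live system.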

thanks,

greg k-h

** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/