From anton at samba.org Sun May 1 19:26:42 2005
From: anton at samba.org (Anton Blanchard)
Date: Sun, 1 May 2005 19:26:42 +1000
Subject: [PATCH] ppc64: remove hidden -fno-omit-frame-pointer for schedule.c
Message-ID: <20050501092641.GL19662@krispykreme>

Hi,

While looking at code generated by gcc4.0 I noticed some functions still
had frame pointers, even after we stopped ppc64 from defining
CONFIG_FRAME_POINTER. It turns out kernel/Makefile hardwires
-fno-omit-frame-pointer on when compiling sched.c.

It was already disabled on ia64, disable it on ppc64 as well.

Signed-off-by: Anton Blanchard

Index: linux-2.6.12-rc2/kernel/Makefile
===================================================================
--- linux-2.6.12-rc2.orig/kernel/Makefile 2005-04-19 13:37:40.599016667 +1000
+++ linux-2.6.12-rc2/kernel/Makefile 2005-05-01 05:48:00.689299680 +1000
@@ -33,6 +33,7 @@
 obj-$(CONFIG_SECCOMP) += seccomp.o
 
 ifneq ($(CONFIG_IA64),y)
+ifneq ($(CONFIG_PPC64),y)
 # According to Alan Modra , the -fno-omit-frame-pointer is
 # needed for x86 only. Why this used to be enabled for all architectures is beyond
 # me. I suspect most platforms don't need this, but until we know that for sure
@@ -40,6 +41,7 @@
 # to get a correct value for the wait-channel (WCHAN in ps). --davidm
 CFLAGS_sched.o := $(PROFILING) -fno-omit-frame-pointer
 endif
+endif
 
 $(obj)/configs.o: $(obj)/config_data.h

From akpm at osdl.org Sun May 1 19:37:59 2005
From: akpm at osdl.org (Andrew Morton)
Date: Sun, 1 May 2005 02:37:59 -0700
Subject: [PATCH] ppc64: remove hidden -fno-omit-frame-pointer for schedule.c
In-Reply-To: <20050501092641.GL19662@krispykreme>
References: <20050501092641.GL19662@krispykreme>
Message-ID: <20050501023759.15d98aea.akpm@osdl.org>

Anton Blanchard wrote:
>
> --- linux-2.6.12-rc2.orig/kernel/Makefile 2005-04-19 13:37:40.599016667 +1000
> +++ linux-2.6.12-rc2/kernel/Makefile 2005-05-01 05:48:00.689299680 +1000
> @@ -33,6 +33,7 @@
> obj-$(CONFIG_SECCOMP) += seccomp.o
>
> ifneq ($(CONFIG_IA64),y)
> +ifneq ($(CONFIG_PPC64),y)

Could we please use a new CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER, define
that in the arch's Kconfig?

(is that a triple negative I see?)

From anton at samba.org Sun May 1 19:56:39 2005
From: anton at samba.org (Anton Blanchard)
Date: Sun, 1 May 2005 19:56:39 +1000
Subject: [PATCH] ppc64: remove hidden -fno-omit-frame-pointer for schedule.c
In-Reply-To: <20050501023759.15d98aea.akpm@osdl.org>
References: <20050501092641.GL19662@krispykreme> <20050501023759.15d98aea.akpm@osdl.org>
Message-ID: <20050501095639.GM19662@krispykreme>

> Could we please use a new CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER, define
> that in the arch's Kconfig?
>
> (is that a triple negative I see?)

I like it. Hopefully someone suitably annoyed by this triple negative
will go and work out which damn architectures actually need
-fno-omit-frame-pointer and reverse the test. For now ppc32, ppc64 and
ia64 don't need it.

Anton

--

While looking at code generated by gcc4.0 I noticed some functions still
had frame pointers, even after we stopped ppc64 from defining
CONFIG_FRAME_POINTER. It turns out kernel/Makefile hardwires
-fno-omit-frame-pointer on when compiling sched.c.

Create CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER and define it on architectures
that don't require frame pointers in sched.c code.

Signed-off-by: Anton Blanchard

Index: linux-2.6.12-rc2/kernel/Makefile
===================================================================
--- linux-2.6.12-rc2.orig/kernel/Makefile 2005-04-19 13:37:40.599016667 +1000
+++ linux-2.6.12-rc2/kernel/Makefile 2005-05-01 19:48:44.471448005 +1000
@@ -32,7 +32,7 @@
 obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
 obj-$(CONFIG_SECCOMP) += seccomp.o
 
-ifneq ($(CONFIG_IA64),y)
+ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
 # According to Alan Modra , the -fno-omit-frame-pointer is
 # needed for x86 only. Why this used to be enabled for all architectures is beyond
 # me. I suspect most platforms don't need this, but until we know that for sure
Index: linux-2.6.12-rc2/arch/ia64/Kconfig
===================================================================
--- linux-2.6.12-rc2.orig/arch/ia64/Kconfig 2005-04-19 13:37:33.173418325 +1000
+++ linux-2.6.12-rc2/arch/ia64/Kconfig 2005-05-01 19:49:35.060202590 +1000
@@ -46,6 +46,10 @@
 	bool
 	default y
 
+config SCHED_NO_NO_OMIT_FRAME_POINTER
+	bool
+	default y
+
 choice
 	prompt "System type"
 	default IA64_GENERIC
Index: linux-2.6.12-rc2/arch/ppc/Kconfig
===================================================================
--- linux-2.6.12-rc2.orig/arch/ppc/Kconfig 2005-04-19 13:37:33.450396856 +1000
+++ linux-2.6.12-rc2/arch/ppc/Kconfig 2005-05-01 19:49:24.699414050 +1000
@@ -43,6 +43,10 @@
 	bool
 	default y
 
+config SCHED_NO_NO_OMIT_FRAME_POINTER
+	bool
+	default y
+
 source "init/Kconfig"
 
 menu "Processor"
Index: linux-2.6.12-rc2/arch/ppc64/Kconfig
===================================================================
--- linux-2.6.12-rc2.orig/arch/ppc64/Kconfig 2005-05-01 05:39:38.017058150 +1000
+++ linux-2.6.12-rc2/arch/ppc64/Kconfig 2005-05-01 19:50:47.878561880 +1000
@@ -40,6 +40,10 @@
 	bool
 	default y
 
+config SCHED_NO_NO_OMIT_FRAME_POINTER
+	bool
+	default y
+
 # We optimistically allocate largepages from the VM, so make the limit
 # large enough (16MB). This badly named config option is actually
 # max order + 1

From anton at samba.org Sun May 1 20:30:13 2005
From: anton at samba.org (Anton Blanchard)
Date: Sun, 1 May 2005 20:30:13 +1000
Subject: [PATCH] ppc64: add missing Kconfig help text
Message-ID: <20050501103013.GN19662@krispykreme>

From: Jesper Juhl

There's no help text for CONFIG_DEBUG_STACKOVERFLOW - add one.

Signed-off-by: Jesper Juhl
Signed-off-by: Anton Blanchard

Index: linux-2.6.12-rc2/arch/ppc64/Kconfig.debug
===================================================================
--- linux-2.6.12-rc2.orig/arch/ppc64/Kconfig.debug 2005-02-04 04:10:36.000000000 +1100
+++ linux-2.6.12-rc2/arch/ppc64/Kconfig.debug 2005-05-01 20:27:18.760365099 +1000
@@ -5,6 +5,9 @@
 config DEBUG_STACKOVERFLOW
 	bool "Check for stack overflows"
 	depends on DEBUG_KERNEL
+	help
+	  This option will cause messages to be printed if free stack space
+	  drops below a certain limit.
 
 config KPROBES
 	bool "Kprobes"

From peter at chubb.wattle.id.au Mon May 2 10:17:51 2005
From: peter at chubb.wattle.id.au (Peter Chubb)
Date: Mon, 2 May 2005 10:17:51 +1000
Subject: [PATCH] ppc64: update to use the new 4L headers
In-Reply-To: <4270472E.9050708@yahoo.com.au>
References: <1114652039.7112.213.camel@gaston> <42704130.9050005@yahoo.com.au> <427044AA.5030402@nortel.com> <4270472E.9050708@yahoo.com.au>
Message-ID: <17013.29103.249971.866326@wombat.chubb.wattle.id.au>

>>>>> "Nick" == Nick Piggin writes:

Nick> Chris Friesen wrote:
>> I needed something like:
>>
>> pte_t *va_to_ptep_map(struct mm_struct *mm, unsigned int addr)
>>
>> There was code in follow_page() that did basically what I needed,
>> but it was all contained within that function so I had to
>> re-implement it.
>>

Nick> If you can break out exactly what you need, and make that inline
Nick> or otherwise available via the correct header, I'm sure it would
Nick> have a good chance of being merged.

We're currently working on this, so as to be able to provide
interfaces to alternative page tables. We want to be able to slot in
Liedtke's `Guarded Page Tables', or B-trees, or a hash table to see what
happens.

Except we've called the function:

	pte_t * lookup_page_table(unsigned long address, struct mm_struct *mm);

follow_page() is essentially the same after inline expansion happens;
but we're seeing a regression in clear_page_range() that we want to
fix before release.

If you want to take a look (warning: it's still fairly rough
work-in-progress) there's high level design being worked on at
http://www.gelato.unsw.edu.au/IA64wiki/PageTableInterface and patches
from our CVS repository. The only patch of interest is pti.patch.

cvs -d :pserver:anoncvs at gelato.unsw.edu.au:/gelato login
Logging in to :pserver:anoncvs at lemon:2401/gelato
CVS password:[enter anoncvs]
$ cvs -d:pserver:anoncvs at gelato.unsw.edu.au:/gelato co kernel/page_table_interface

or from
http://www.gelato.unsw.edu.au/cgi-bin/viewcvs.cgi/cvs/kernel/page_table_interface/

Peter C

From miltonm at bga.com Mon May 2 16:43:40 2005
From: miltonm at bga.com (Milton Miller)
Date: Mon, 2 May 2005 01:43:40 -0500
Subject: [PATCH 1/2] ppc64: fix read/write on large /dev/nvram
Message-ID: <7845758a806ed6769cea59a9df344d39@bga.com>

On Fri Apr 22 16:49:59 EST 2005, Arnd wrote a patch with the following
lines (among several others).

- len = ppc_md.nvram_read(tmp_buffer, count, ppos);
+ ret = ppc_md.nvram_read(tmp, count, ppos);

- len = ppc_md.nvram_write(tmp_buffer, count, ppos);
+ ret = ppc_md.nvram_read(tmp, count, ppos);

Even though I am just scanning, I am guessing this is not quite right.

milton

From david at gibson.dropbear.id.au Tue May 3 10:26:08 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Tue, 3 May 2005 10:26:08 +1000
Subject: [PPC64] pgtable.h and other header cleanups
Message-ID: <20050503002608.GA22453@localhost.localdomain>

Andrew, please apply.

This patch started as simply removing a few never-used macros from
asm-ppc64/pgtable.h, then kind of grew. It now makes a bunch of
cleanups to the ppc64 low-level header files (with corresponding
changes to .c files where necessary) such as:
	- Abolishing never-used macros
	- Eliminating multiple #defines with the same purpose
	- Removing pointless macros (cases where just expanding the
	  macro everywhere turns out clearer and more sensible)
	- Removing some cases where macros which could be defined in
	  terms of each other weren't
	- Moving imalloc() related definitions from pgtable.h to their
	  own header file (imalloc.h)
	- Re-arranging headers to group things more logically
	- Moving all VSID allocation related things to mmu.h, instead
	  of being split between mmu.h and mmu_context.h
	- Removing some reserved space for flags from the PMD - we're
	  not using it.

Signed-off-by: David Gibson

Index: working-2.6/include/asm-ppc64/pgtable.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/pgtable.h 2005-05-02 16:21:09.000000000 +1000
+++ working-2.6/include/asm-ppc64/pgtable.h 2005-05-02 17:58:29.000000000 +1000
@@ -17,16 +17,6 @@
 
 #include 
 
-/* PMD_SHIFT determines what a second-level page table entry can map */
-#define PMD_SHIFT (PAGE_SHIFT + PAGE_SHIFT - 3)
-#define PMD_SIZE (1UL << PMD_SHIFT)
-#define PMD_MASK (~(PMD_SIZE-1))
-
-/* PGDIR_SHIFT determines what a third-level page table entry can map */
-#define PGDIR_SHIFT (PAGE_SHIFT + (PAGE_SHIFT - 3) + (PAGE_SHIFT - 2))
-#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
-#define PGDIR_MASK (~(PGDIR_SIZE-1))
-
 /*
  * Entries per page directory level. The PTE level must use a 64b record
  * for each page table entry. The PMD and PGD level use a 32b record for
@@ -40,40 +30,30 @@
 #define PTRS_PER_PMD (1 << PMD_INDEX_SIZE)
 #define PTRS_PER_PGD (1 << PGD_INDEX_SIZE)
 
-#define USER_PTRS_PER_PGD (1024)
-#define FIRST_USER_ADDRESS 0
+/* PMD_SHIFT determines what a second-level page table entry can map */
+#define PMD_SHIFT (PAGE_SHIFT + PTE_INDEX_SIZE)
+#define PMD_SIZE (1UL << PMD_SHIFT)
+#define PMD_MASK (~(PMD_SIZE-1))
 
-#define EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
-			PGD_INDEX_SIZE + PAGE_SHIFT)
+/* PGDIR_SHIFT determines what a third-level page table entry can map */
+#define PGDIR_SHIFT (PMD_SHIFT + PMD_INDEX_SIZE)
+#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
+#define PGDIR_MASK (~(PGDIR_SIZE-1))
+
+#define FIRST_USER_ADDRESS 0
 
 /*
  * Size of EA range mapped by our pagetables.
  */
-#define PGTABLE_EA_BITS 41
-#define PGTABLE_EA_MASK ((1UL<> PMD_TO_PTEPAGE_SHIFT))
+#define pmd_page_kernel(pmd) (__bpn_to_ba(pmd_val(pmd)))
 #define pmd_page(pmd) virt_to_page(pmd_page_kernel(pmd))
 
 #define pud_set(pudp, pmdp) (pud_val(*(pudp)) = (__ba_to_bpn(pmdp)))
@@ -266,8 +241,6 @@
 /* to find an entry in the ioremap page-table-directory */
 #define pgd_offset_i(address) (ioremap_pgd + pgd_index(address))
 
-#define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT))
-
 /*
  * The following only work if pte_present() is true.
  * Undefined behaviour if not..
@@ -487,18 +460,13 @@
 
 extern unsigned long ioremap_bot, ioremap_base;
 
-#define USER_PGD_PTRS (PAGE_OFFSET >> PGDIR_SHIFT)
-#define KERNEL_PGD_PTRS (PTRS_PER_PGD-USER_PGD_PTRS)
-
-#define pte_ERROR(e) \
-	printk("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
 #define pmd_ERROR(e) \
 	printk("%s:%d: bad pmd %08x.\n", __FILE__, __LINE__, pmd_val(e))
 #define pgd_ERROR(e) \
 	printk("%s:%d: bad pgd %08x.\n", __FILE__, __LINE__, pgd_val(e))
 
-extern pgd_t swapper_pg_dir[1024];
-extern pgd_t ioremap_dir[1024];
+extern pgd_t swapper_pg_dir[];
+extern pgd_t ioremap_dir[];
 
 extern void paging_init(void);
 
@@ -540,43 +508,11 @@
  */
 #define kern_addr_valid(addr) (1)
 
-#define io_remap_page_range(vma, vaddr, paddr, size, prot) \
-	remap_pfn_range(vma, vaddr, (paddr) >> PAGE_SHIFT, size, prot)
-
 #define io_remap_pfn_range(vma, vaddr, pfn, size, prot) \
 	remap_pfn_range(vma, vaddr, pfn, size, prot)
 
-#define MK_IOSPACE_PFN(space, pfn) (pfn)
-#define GET_IOSPACE(pfn) 0
-#define GET_PFN(pfn) (pfn)
-
 void pgtable_cache_init(void);
 
-extern void hpte_init_native(void);
-extern void hpte_init_lpar(void);
-extern void hpte_init_iSeries(void);
-
-/* imalloc region types */
-#define IM_REGION_UNUSED 0x1
-#define IM_REGION_SUBSET 0x2
-#define IM_REGION_EXISTS 0x4
-#define IM_REGION_OVERLAP 0x8
-#define IM_REGION_SUPERSET 0x10
-
-extern struct vm_struct * im_get_free_area(unsigned long size);
-extern struct vm_struct * im_get_area(unsigned long v_addr, unsigned long size,
-			int region_type);
-unsigned long im_free(void *addr);
-
-extern long pSeries_lpar_hpte_insert(unsigned long hpte_group,
-				     unsigned long va, unsigned long prpn,
-				     int secondary, unsigned long hpteflags,
-				     int bolted, int large);
-
-extern long native_hpte_insert(unsigned long hpte_group, unsigned long va,
-			       unsigned long prpn, int secondary,
-			       unsigned long hpteflags, int bolted, int large);
-
 /*
  * find_linux_pte returns the address of a linux pte for a given
  * effective address and directory. If not found, it returns zero.
Index: working-2.6/include/asm-ppc64/page.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/page.h 2005-05-02 16:21:09.000000000 +1000
+++ working-2.6/include/asm-ppc64/page.h 2005-05-02 16:21:43.000000000 +1000
@@ -23,7 +23,6 @@
 #define PAGE_SHIFT 12
 #define PAGE_SIZE (ASM_CONST(1) << PAGE_SHIFT)
 #define PAGE_MASK (~(PAGE_SIZE-1))
-#define PAGE_OFFSET_MASK (PAGE_SIZE-1)
 
 #define SID_SHIFT 28
 #define SID_MASK 0xfffffffffUL
@@ -85,9 +84,6 @@
 /* align addr on a size boundary - adjust address up if needed */
 #define _ALIGN(addr,size) _ALIGN_UP(addr,size)
 
-/* to align the pointer to the (next) double word boundary */
-#define DOUBLEWORD_ALIGN(addr) _ALIGN(addr,sizeof(unsigned long))
-
 /* to align the pointer to the (next) page boundary */
 #define PAGE_ALIGN(addr) _ALIGN(addr, PAGE_SIZE)
 
@@ -100,7 +96,6 @@
 #define REGION_SIZE 4UL
 #define REGION_SHIFT 60UL
 #define REGION_MASK (((1UL<>REGION_SHIFT)
-#define VMALLOC_REGION_ID (VMALLOCBASE>>REGION_SHIFT)
-#define KERNEL_REGION_ID (KERNELBASE>>REGION_SHIFT)
+#define IO_REGION_ID (IOREGIONBASE >> REGION_SHIFT)
+#define VMALLOC_REGION_ID (VMALLOCBASE >> REGION_SHIFT)
+#define KERNEL_REGION_ID (KERNELBASE >> REGION_SHIFT)
 #define USER_REGION_ID (0UL)
-#define REGION_ID(X) (((unsigned long)(X))>>REGION_SHIFT)
+#define REGION_ID(ea) (((unsigned long)(ea)) >> REGION_SHIFT)
 
-#define __bpn_to_ba(x) ((((unsigned long)(x))<> PAGE_SHIFT)
 
 #define __va(x) ((void *)((unsigned long)(x) + KERNELBASE))
Index: working-2.6/arch/ppc64/mm/imalloc.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/imalloc.c 2005-04-26 15:37:55.000000000 +1000
+++ working-2.6/arch/ppc64/mm/imalloc.c 2005-05-02 16:59:40.000000000 +1000
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static DECLARE_MUTEX(imlist_sem);
 struct vm_struct * imlist = NULL;
@@ -23,11 +24,11 @@
 	unsigned long addr;
 	struct vm_struct **p, *tmp;
 
-	addr = IMALLOC_START;
+	addr = ioremap_bot;
 	for (p = &imlist; (tmp = *p) ; p = &tmp->next) {
 		if (size + addr < (unsigned long) tmp->addr)
 			break;
-		if ((unsigned long)tmp->addr >= IMALLOC_START)
+		if ((unsigned long)tmp->addr >= ioremap_bot)
 			addr = tmp->size + (unsigned long) tmp->addr;
 		if (addr > IMALLOC_END-size)
 			return 1;
Index: working-2.6/arch/ppc64/mm/hash_utils.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/hash_utils.c 2005-04-26 15:37:55.000000000 +1000
+++ working-2.6/arch/ppc64/mm/hash_utils.c 2005-05-02 16:56:17.000000000 +1000
@@ -298,24 +298,23 @@
 	int local = 0;
 	cpumask_t tmp;
 
+	if ((ea & ~REGION_MASK) > EADDR_MASK)
+		return 1;
+
 	switch (REGION_ID(ea)) {
 	case USER_REGION_ID:
 		user_region = 1;
 		mm = current->mm;
-		if ((ea > USER_END) || (! mm))
+		if (! mm)
 			return 1;
 
 		vsid = get_vsid(mm->context.id, ea);
 		break;
 	case IO_REGION_ID:
-		if (ea > IMALLOC_END)
-			return 1;
 		mm = &ioremap_mm;
 		vsid = get_kernel_vsid(ea);
 		break;
 	case VMALLOC_REGION_ID:
-		if (ea > VMALLOC_END)
-			return 1;
 		mm = &init_mm;
 		vsid = get_kernel_vsid(ea);
 		break;
@@ -362,7 +361,7 @@
 	unsigned long vsid, vpn, va, hash, secondary, slot;
 	unsigned long huge = pte_huge(pte);
 
-	if ((ea >= USER_START) && (ea <= USER_END))
+	if (ea < KERNELBASE)
 		vsid = get_vsid(context, ea);
 	else
 		vsid = get_kernel_vsid(ea);
Index: working-2.6/arch/ppc64/mm/hash_native.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/hash_native.c 2005-04-26 15:37:55.000000000 +1000
+++ working-2.6/arch/ppc64/mm/hash_native.c 2005-05-02 16:51:44.000000000 +1000
@@ -320,8 +320,7 @@
 
 	j = 0;
 	for (i = 0; i < number; i++) {
-		if ((batch->addr[i] >= USER_START) &&
-		    (batch->addr[i] <= USER_END))
+		if (batch->addr[i] < KERNELBASE)
 			vsid = get_vsid(context, batch->addr[i]);
 		else
 			vsid = get_kernel_vsid(batch->addr[i]);
Index: working-2.6/arch/ppc64/mm/init.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/init.c 2005-05-02 08:57:20.000000000 +1000
+++ working-2.6/arch/ppc64/mm/init.c 2005-05-02 16:38:18.000000000 +1000
@@ -64,6 +64,7 @@
 #include 
 #include 
 #include 
+#include 
 
 int mem_init_done;
 unsigned long ioremap_bot = IMALLOC_BASE;
Index: working-2.6/include/asm-ppc64/mmu.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/mmu.h 2005-04-26 15:38:02.000000000 +1000
+++ working-2.6/include/asm-ppc64/mmu.h 2005-05-02 17:45:59.000000000 +1000
@@ -15,19 +15,10 @@
 
 #include 
 #include 
-#include 
 
-#ifndef __ASSEMBLY__
-
-/* Time to allow for more things here */
-typedef unsigned long mm_context_id_t;
-typedef struct {
-	mm_context_id_t id;
-#ifdef CONFIG_HUGETLB_PAGE
-	pgd_t *huge_pgdir;
-	u16 htlb_segs; /* bitmask */
-#endif
-} mm_context_t;
+/*
+ * Segment table
+ */
 
 #define STE_ESID_V 0x80
 #define STE_ESID_KS 0x20
@@ -36,15 +27,48 @@
 
 #define STE_VSID_SHIFT 12
 
-struct stab_entry {
-	unsigned long esid_data;
-	unsigned long vsid_data;
-};
+/* Location of cpu0's segment table */
+#define STAB0_PAGE 0x9
+#define STAB0_PHYS_ADDR (STAB0_PAGE<> VSID_BITS) + (x & VSID_MODULUS);
+	return (x + ((x+1) >> VSID_BITS)) & VSID_MODULUS;
+#endif /* 1 */
+}
+
+/* This is only valid for addresses >= KERNELBASE */
+static inline unsigned long get_kernel_vsid(unsigned long ea)
+{
+	return vsid_scramble(ea >> SID_SHIFT);
+}
+
+/* This is only valid for user addresses (which are below 2^41) */
+static inline unsigned long get_vsid(unsigned long context, unsigned long ea)
+{
+	return vsid_scramble((context << USER_ESID_BITS)
+			     | (ea >> SID_SHIFT));
+}
+
+#endif /* __ASSEMBLY */
+
 #endif /* _PPC64_MMU_H_ */
Index: working-2.6/arch/ppc64/mm/stab.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/stab.c 2005-04-26 15:37:55.000000000 +1000
+++ working-2.6/arch/ppc64/mm/stab.c 2005-05-02 17:29:03.000000000 +1000
@@ -19,6 +19,11 @@
 #include 
 #include 
 
+struct stab_entry {
+	unsigned long esid_data;
+	unsigned long vsid_data;
+};
+
 /* Both the segment table and SLB code uses the following cache */
 #define NR_STAB_CACHE_ENTRIES 8
 DEFINE_PER_CPU(long, stab_cache_ptr);
Index: working-2.6/include/asm-ppc64/mmu_context.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/mmu_context.h 2005-04-26 15:38:02.000000000 +1000
+++ working-2.6/include/asm-ppc64/mmu_context.h 2005-05-02 17:41:49.000000000 +1000
@@ -84,86 +84,4 @@
 	local_irq_restore(flags);
 }
 
-/* VSID allocation
- * ===============
- *
- * We first generate a 36-bit "proto-VSID". For kernel addresses this
- * is equal to the ESID, for user addresses it is:
- *	(context << 15) | (esid & 0x7fff)
- *
- * The two forms are distinguishable because the top bit is 0 for user
- * addresses, whereas the top two bits are 1 for kernel addresses.
- * Proto-VSIDs with the top two bits equal to 0b10 are reserved for
- * now.
- *
- * The proto-VSIDs are then scrambled into real VSIDs with the
- * multiplicative hash:
- *
- *	VSID = (proto-VSID * VSID_MULTIPLIER) % VSID_MODULUS
- *	where	VSID_MULTIPLIER = 268435399 = 0xFFFFFC7
- *		VSID_MODULUS = 2^36-1 = 0xFFFFFFFFF
- *
- * This scramble is only well defined for proto-VSIDs below
- * 0xFFFFFFFFF, so both proto-VSID and actual VSID 0xFFFFFFFFF are
- * reserved. VSID_MULTIPLIER is prime, so in particular it is
- * co-prime to VSID_MODULUS, making this a 1:1 scrambling function.
- * Because the modulus is 2^n-1 we can compute it efficiently without
- * a divide or extra multiply (see below).
- *
- * This scheme has several advantages over older methods:
- *
- *	- We have VSIDs allocated for every kernel address
- * (i.e. everything above 0xC000000000000000), except the very top
- * segment, which simplifies several things.
- *
- *	- We allow for 15 significant bits of ESID and 20 bits of
- * context for user addresses. i.e. 8T (43 bits) of address space for
- * up to 1M contexts (although the page table structure and context
- * allocation will need changes to take advantage of this).
- *
- *	- The scramble function gives robust scattering in the hash
- * table (at least based on some initial results). The previous
- * method was more susceptible to pathological cases giving excessive
- * hash collisions.
- */
-
-/*
- * WARNING - If you change these you must make sure the asm
- * implementations in slb_allocate(), do_stab_bolted and mmu.h
- * (ASM_VSID_SCRAMBLE macro) are changed accordingly.
- *
- * You'll also need to change the precomputed VSID values in head.S
- * which are used by the iSeries firmware.
- */
-
-static inline unsigned long vsid_scramble(unsigned long protovsid)
-{
-#if 0
-	/* The code below is equivalent to this function for arguments
-	 * < 2^VSID_BITS, which is all this should ever be called
-	 * with. However gcc is not clever enough to compute the
-	 * modulus (2^n-1) without a second multiply. */
-	return ((protovsid * VSID_MULTIPLIER) % VSID_MODULUS);
-#else /* 1 */
-	unsigned long x;
-
-	x = protovsid * VSID_MULTIPLIER;
-	x = (x >> VSID_BITS) + (x & VSID_MODULUS);
-	return (x + ((x+1) >> VSID_BITS)) & VSID_MODULUS;
-#endif /* 1 */
-}
-
-/* This is only valid for addresses >= KERNELBASE */
-static inline unsigned long get_kernel_vsid(unsigned long ea)
-{
-	return vsid_scramble(ea >> SID_SHIFT);
-}
-
-/* This is only valid for user addresses (which are below 2^41) */
-static inline unsigned long get_vsid(unsigned long context, unsigned long ea)
-{
-	return vsid_scramble((context << USER_ESID_BITS)
-			     | (ea >> SID_SHIFT));
-}
-
 #endif /* __PPC64_MMU_CONTEXT_H */

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/people/dgibson

From omkhar at gentoo.org Tue May 3 10:44:41 2005
From: omkhar at gentoo.org (Omkhar Arasaratnam)
Date: Mon, 02 May 2005 20:44:41 -0400
Subject: [BUG] 2.4.30 - Bring up on JS20 Fails
Message-ID: <4276C979.3020300@gentoo.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

At first the kernel wouldn't compile as it was missing a header, which I
have resolved : http://dev.gentoo.org/~omkhar/ppc64-autofs_4.patch

After including this header I was able to compile, but on bring up i see
the following:

[boot]0012 Setup Arch
pSeries_pci: this system has large bus numbers and the kernel was not
built with the patch that fixes include/linux/pci.h struct pci_bus so
number, primary, secondary and subordinate are ints.
Kernel panic: pSeries_pci: this system has large bus numbers and the
kernel was not
built with the patch that fixes include/linux/pci.h struct pci_bus so
number, primary, secondary and subordinate are ints.
In idle task - not syncing

Ideas?

- --
Omkhar Arasaratnam - Gentoo PPC64 Developer
omkhar at gentoo.org - http://dev.gentoo.org/~omkhar
Gentoo Linux / PPC64 Linux: http://ppc64.gentoo.org
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (MingW32)

iD8DBQFCdsl59msUWjh2lHURAm9hAKCdfkUB5p+qx8hlQvzt7PnHgaLKqACeKe8y
6kUP8tOuOF+Zgi1OxkzOXKc=
=QkrG
-----END PGP SIGNATURE-----

From benh at kernel.crashing.org Tue May 3 11:16:53 2005
From: benh at kernel.crashing.org (Benjamin Herrenschmidt)
Date: Tue, 03 May 2005 11:16:53 +1000
Subject: [BUG] 2.4.30 - Bring up on JS20 Fails
In-Reply-To: <4276C979.3020300@gentoo.org>
References: <4276C979.3020300@gentoo.org>
Message-ID: <1115083013.6155.37.camel@gaston>

On Mon, 2005-05-02 at 20:44 -0400, Omkhar Arasaratnam wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> At first the kernel wouldn't compile as it was missing a header, which I
> have resolved : http://dev.gentoo.org/~omkhar/ppc64-autofs_4.patch
>
> After including this header I was able to compile, but on bring up i see
> the following:
>
> [boot]0012 Setup Arch
> pSeries_pci: this system has large bus numbers and the kernel was not
> built with the patch that fixes include/linux/pci.h struct pci_bus so
> number, primary, secondary and subordinate are ints.
> Kernel panic: pSeries_pci: this system has large bus numbers and the
> kernel was not
> built with the patch that fixes include/linux/pci.h struct pci_bus so
> number, primary, secondary and subordinate are ints.
> In idle task - not syncing

Why 2.4 ?

Ben.

From david at gibson.dropbear.id.au Tue May 3 11:23:43 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Tue, 3 May 2005 11:23:43 +1000
Subject: [PPC64] pgtable.h and other header cleanups
In-Reply-To: <20050503002608.GA22453@localhost.localdomain>
References: <20050503002608.GA22453@localhost.localdomain>
Message-ID: <20050503012343.GB22453@localhost.localdomain>

On Tue, May 03, 2005 at 10:26:08AM +1000, David Gibson wrote:
> Andrew, please apply.
>
> This patch started as simply removing a few never-used macros from
> asm-ppc64/pgtable.h, then kind of grew. It now makes a bunch of
> cleanups to the ppc64 low-level header files (with corresponding
> changes to .c files where necessary) such as:
> 	- Abolishing never-used macros
> 	- Eliminating multiple #defines with the same purpose
> 	- Removing pointless macros (cases where just expanding the
> 	  macro everywhere turns out clearer and more sensible)
> 	- Removing some cases where macros which could be defined in
> 	  terms of each other weren't
> 	- Moving imalloc() related definitions from pgtable.h to their
> 	  own header file (imalloc.h)
> 	- Re-arranging headers to group things more logically
> 	- Moving all VSID allocation related things to mmu.h, instead
> 	  of being split between mmu.h and mmu_context.h
> 	- Removing some reserved space for flags from the PMD - we're
> 	  not using it.

Aargh! Don't apply, patch is broken (missing imalloc.h). Grr... I
could have sworn I'd quilt added it. Fixed version coming shortly.

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/people/dgibson

From anton at samba.org Tue May 3 13:13:22 2005
From: anton at samba.org (Anton Blanchard)
Date: Tue, 3 May 2005 13:13:22 +1000
Subject: [BUG] 2.4.30 - Bring up on JS20 Fails
In-Reply-To: <4276C979.3020300@gentoo.org>
References: <4276C979.3020300@gentoo.org>
Message-ID: <20050503031322.GG12682@krispykreme>

Hi,

> After including this header I was able to compile, but on bring up i see
> the following:
>
> [boot]0012 Setup Arch
> pSeries_pci: this system has large bus numbers and the kernel was not
> built with the patch that fixes include/linux/pci.h struct pci_bus so
> number, primary, secondary and subordinate are ints.
> Kernel panic: pSeries_pci: this system has large bus numbers and the
> kernel was not
> built with the patch that fixes
> include/linux/pci.h struct pci_bus so
> number, primary, secondary and subordinate are ints.

Do that and it should work :) It's to do with PCI domains and is fixed
properly in 2.6.

Anton

From david at gibson.dropbear.id.au Tue May 3 13:33:32 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Tue, 3 May 2005 13:33:32 +1000
Subject: [PPC64] pgtable.h and other header cleanups
In-Reply-To: <20050503012343.GB22453@localhost.localdomain>
References: <20050503002608.GA22453@localhost.localdomain> <20050503012343.GB22453@localhost.localdomain>
Message-ID: <20050503033332.GC22453@localhost.localdomain>

On Tue, May 03, 2005 at 11:23:43AM +1000, David Gibson wrote:
> On Tue, May 03, 2005 at 10:26:08AM +1000, David Gibson wrote:
> > Andrew, please apply.
> >
> > This patch started as simply removing a few never-used macros from
> > asm-ppc64/pgtable.h, then kind of grew. It now makes a bunch of
> > cleanups to the ppc64 low-level header files (with corresponding
> > changes to .c files where necessary) such as:
> > 	- Abolishing never-used macros
> > 	- Eliminating multiple #defines with the same purpose
> > 	- Removing pointless macros (cases where just expanding the
> > 	  macro everywhere turns out clearer and more sensible)
> > 	- Removing some cases where macros which could be defined in
> > 	  terms of each other weren't
> > 	- Moving imalloc() related definitions from pgtable.h to their
> > 	  own header file (imalloc.h)
> > 	- Re-arranging headers to group things more logically
> > 	- Moving all VSID allocation related things to mmu.h, instead
> > 	  of being split between mmu.h and mmu_context.h
> > 	- Removing some reserved space for flags from the PMD - we're
> > 	  not using it.
>
> Aargh! Don't apply, patch is broken (missing imalloc.h). Grr... I
> could have sworn I'd quilt added it. Fixed version coming shortly.

Ok, this time for sure. Andrew, please apply:

This patch started as simply removing a few never-used macros from
asm-ppc64/pgtable.h, then kind of grew. It now makes a bunch of
cleanups to the ppc64 low-level header files (with corresponding
changes to .c files where necessary) such as:
	- Abolishing never-used macros
	- Eliminating multiple #defines with the same purpose
	- Removing pointless macros (cases where just expanding the
	  macro everywhere turns out clearer and more sensible)
	- Removing some cases where macros which could be defined in
	  terms of each other weren't
	- Moving imalloc() related definitions from pgtable.h to their
	  own header file (imalloc.h)
	- Re-arranging headers to group things more logically
	- Moving all VSID allocation related things to mmu.h, instead
	  of being split between mmu.h and mmu_context.h
	- Removing some reserved space for flags from the PMD - we're
	  not using it.
	- Fix some bugs which broke compile with STRICT_MM_TYPECHECKS.

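For readers unfamiliar with STRICT_MM_TYPECHECKS, here is a minimal
sketch of the kind of breakage the last item refers to. It assumes the
strict build wraps a PTE in a one-member struct, so a bitwise operator
can no longer be applied to the wrapped value directly; this matches the
shape of the one-line `*ptep = ...` change in the pgtable.h part of this
patch. The pte_t, pte_val() and __pte() definitions below are simplified
stand-ins, and DEMO_HPTEFLAGS and strip_hpte_flags() are made-up names
for illustration, not kernel identifiers.

```c
/* Hypothetical flag mask for illustration; NOT the kernel's _PAGE_HPTEFLAGS. */
#define DEMO_HPTEFLAGS 0xf000UL

/* Simplified stand-ins: under strict type checking a PTE is a struct,
 * so the compiler rejects accidental arithmetic on it. */
typedef struct { unsigned long pte; } pte_t;
#define pte_val(x) ((x).pte)
#define __pte(x)   ((pte_t) { (x) })

/* Made-up helper: clear the demo flag bits from a PTE value. */
static pte_t strip_hpte_flags(pte_t pte)
{
	/*
	 * Broken form, only compiles when pte_t is a plain integer:
	 *	return __pte(pte_val(pte)) & ~DEMO_HPTEFLAGS;
	 * With the struct wrapper, '&' is applied to a struct and the
	 * build fails. The mask has to be applied before wrapping:
	 */
	return __pte(pte_val(pte) & ~DEMO_HPTEFLAGS);
}
```

The point of the wrapper is exactly that mistakes like the commented-out
line show up at compile time rather than silently mixing PTEs with plain
integers.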
Signed-off-by: David Gibson

Index: working-2.6/include/asm-ppc64/pgtable.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/pgtable.h 2005-05-02 08:57:22.000000000 +1000
+++ working-2.6/include/asm-ppc64/pgtable.h 2005-05-03 12:56:34.000000000 +1000
@@ -17,16 +17,6 @@
 
 #include 
 
-/* PMD_SHIFT determines what a second-level page table entry can map */
-#define PMD_SHIFT (PAGE_SHIFT + PAGE_SHIFT - 3)
-#define PMD_SIZE (1UL << PMD_SHIFT)
-#define PMD_MASK (~(PMD_SIZE-1))
-
-/* PGDIR_SHIFT determines what a third-level page table entry can map */
-#define PGDIR_SHIFT (PAGE_SHIFT + (PAGE_SHIFT - 3) + (PAGE_SHIFT - 2))
-#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
-#define PGDIR_MASK (~(PGDIR_SIZE-1))
-
 /*
  * Entries per page directory level. The PTE level must use a 64b record
  * for each page table entry. The PMD and PGD level use a 32b record for
@@ -40,40 +30,30 @@
 #define PTRS_PER_PMD (1 << PMD_INDEX_SIZE)
 #define PTRS_PER_PGD (1 << PGD_INDEX_SIZE)
 
-#define USER_PTRS_PER_PGD (1024)
-#define FIRST_USER_ADDRESS 0
+/* PMD_SHIFT determines what a second-level page table entry can map */
+#define PMD_SHIFT (PAGE_SHIFT + PTE_INDEX_SIZE)
+#define PMD_SIZE (1UL << PMD_SHIFT)
+#define PMD_MASK (~(PMD_SIZE-1))
 
-#define EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
-			PGD_INDEX_SIZE + PAGE_SHIFT)
+/* PGDIR_SHIFT determines what a third-level page table entry can map */
+#define PGDIR_SHIFT (PMD_SHIFT + PMD_INDEX_SIZE)
+#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
+#define PGDIR_MASK (~(PGDIR_SIZE-1))
+
+#define FIRST_USER_ADDRESS 0
 
 /*
  * Size of EA range mapped by our pagetables.
*/ -#define PGTABLE_EA_BITS 41 -#define PGTABLE_EA_MASK ((1UL<> PMD_TO_PTEPAGE_SHIFT)) +#define pmd_page_kernel(pmd) (__bpn_to_ba(pmd_val(pmd))) #define pmd_page(pmd) virt_to_page(pmd_page_kernel(pmd)) #define pud_set(pudp, pmdp) (pud_val(*(pudp)) = (__ba_to_bpn(pmdp))) @@ -266,8 +242,6 @@ /* to find an entry in the ioremap page-table-directory */ #define pgd_offset_i(address) (ioremap_pgd + pgd_index(address)) -#define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT)) - /* * The following only work if pte_present() is true. * Undefined behaviour if not.. @@ -442,7 +416,7 @@ pte_clear(mm, addr, ptep); flush_tlb_pending(); } - *ptep = __pte(pte_val(pte)) & ~_PAGE_HPTEFLAGS; + *ptep = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS); } /* Set the dirty and/or accessed bits atomically in a linux PTE, this @@ -487,18 +461,13 @@ extern unsigned long ioremap_bot, ioremap_base; -#define USER_PGD_PTRS (PAGE_OFFSET >> PGDIR_SHIFT) -#define KERNEL_PGD_PTRS (PTRS_PER_PGD-USER_PGD_PTRS) - -#define pte_ERROR(e) \ - printk("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e)) #define pmd_ERROR(e) \ printk("%s:%d: bad pmd %08x.\n", __FILE__, __LINE__, pmd_val(e)) #define pgd_ERROR(e) \ printk("%s:%d: bad pgd %08x.\n", __FILE__, __LINE__, pgd_val(e)) -extern pgd_t swapper_pg_dir[1024]; -extern pgd_t ioremap_dir[1024]; +extern pgd_t swapper_pg_dir[]; +extern pgd_t ioremap_dir[]; extern void paging_init(void); @@ -540,43 +509,11 @@ */ #define kern_addr_valid(addr) (1) -#define io_remap_page_range(vma, vaddr, paddr, size, prot) \ - remap_pfn_range(vma, vaddr, (paddr) >> PAGE_SHIFT, size, prot) - #define io_remap_pfn_range(vma, vaddr, pfn, size, prot) \ remap_pfn_range(vma, vaddr, pfn, size, prot) -#define MK_IOSPACE_PFN(space, pfn) (pfn) -#define GET_IOSPACE(pfn) 0 -#define GET_PFN(pfn) (pfn) - void pgtable_cache_init(void); -extern void hpte_init_native(void); -extern void hpte_init_lpar(void); -extern void hpte_init_iSeries(void); - -/* imalloc region types */ -#define IM_REGION_UNUSED 0x1 
-#define IM_REGION_SUBSET 0x2 -#define IM_REGION_EXISTS 0x4 -#define IM_REGION_OVERLAP 0x8 -#define IM_REGION_SUPERSET 0x10 - -extern struct vm_struct * im_get_free_area(unsigned long size); -extern struct vm_struct * im_get_area(unsigned long v_addr, unsigned long size, - int region_type); -unsigned long im_free(void *addr); - -extern long pSeries_lpar_hpte_insert(unsigned long hpte_group, - unsigned long va, unsigned long prpn, - int secondary, unsigned long hpteflags, - int bolted, int large); - -extern long native_hpte_insert(unsigned long hpte_group, unsigned long va, - unsigned long prpn, int secondary, - unsigned long hpteflags, int bolted, int large); - /* * find_linux_pte returns the address of a linux pte for a given * effective address and directory. If not found, it returns zero. Index: working-2.6/include/asm-ppc64/page.h =================================================================== --- working-2.6.orig/include/asm-ppc64/page.h 2005-05-02 08:57:22.000000000 +1000 +++ working-2.6/include/asm-ppc64/page.h 2005-05-03 13:08:06.000000000 +1000 @@ -23,7 +23,6 @@ #define PAGE_SHIFT 12 #define PAGE_SIZE (ASM_CONST(1) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) -#define PAGE_OFFSET_MASK (PAGE_SIZE-1) #define SID_SHIFT 28 #define SID_MASK 0xfffffffffUL @@ -85,9 +84,6 @@ /* align addr on a size boundary - adjust address up if needed */ #define _ALIGN(addr,size) _ALIGN_UP(addr,size) -/* to align the pointer to the (next) double word boundary */ -#define DOUBLEWORD_ALIGN(addr) _ALIGN(addr,sizeof(unsigned long)) - /* to align the pointer to the (next) page boundary */ #define PAGE_ALIGN(addr) _ALIGN(addr, PAGE_SIZE) @@ -100,7 +96,6 @@ #define REGION_SIZE 4UL #define REGION_SHIFT 60UL #define REGION_MASK (((1UL<>REGION_SHIFT) -#define VMALLOC_REGION_ID (VMALLOCBASE>>REGION_SHIFT) -#define KERNEL_REGION_ID (KERNELBASE>>REGION_SHIFT) +#define IO_REGION_ID (IOREGIONBASE >> REGION_SHIFT) +#define VMALLOC_REGION_ID (VMALLOCBASE >> REGION_SHIFT) +#define 
KERNEL_REGION_ID (KERNELBASE >> REGION_SHIFT) #define USER_REGION_ID (0UL) -#define REGION_ID(X) (((unsigned long)(X))>>REGION_SHIFT) +#define REGION_ID(ea) (((unsigned long)(ea)) >> REGION_SHIFT) -#define __bpn_to_ba(x) ((((unsigned long)(x))<> PAGE_SHIFT) #define __va(x) ((void *)((unsigned long)(x) + KERNELBASE)) Index: working-2.6/arch/ppc64/mm/imalloc.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/imalloc.c 2005-04-26 15:37:55.000000000 +1000 +++ working-2.6/arch/ppc64/mm/imalloc.c 2005-05-03 12:56:34.000000000 +1000 @@ -14,6 +14,7 @@ #include #include #include +#include static DECLARE_MUTEX(imlist_sem); struct vm_struct * imlist = NULL; @@ -23,11 +24,11 @@ unsigned long addr; struct vm_struct **p, *tmp; - addr = IMALLOC_START; + addr = ioremap_bot; for (p = &imlist; (tmp = *p) ; p = &tmp->next) { if (size + addr < (unsigned long) tmp->addr) break; - if ((unsigned long)tmp->addr >= IMALLOC_START) + if ((unsigned long)tmp->addr >= ioremap_bot) addr = tmp->size + (unsigned long) tmp->addr; if (addr > IMALLOC_END-size) return 1; Index: working-2.6/arch/ppc64/mm/hash_utils.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_utils.c 2005-04-26 15:37:55.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_utils.c 2005-05-03 12:56:34.000000000 +1000 @@ -298,24 +298,23 @@ int local = 0; cpumask_t tmp; + if ((ea & ~REGION_MASK) > EADDR_MASK) + return 1; + switch (REGION_ID(ea)) { case USER_REGION_ID: user_region = 1; mm = current->mm; - if ((ea > USER_END) || (! mm)) + if (! 
mm) return 1; vsid = get_vsid(mm->context.id, ea); break; case IO_REGION_ID: - if (ea > IMALLOC_END) - return 1; mm = &ioremap_mm; vsid = get_kernel_vsid(ea); break; case VMALLOC_REGION_ID: - if (ea > VMALLOC_END) - return 1; mm = &init_mm; vsid = get_kernel_vsid(ea); break; @@ -362,7 +361,7 @@ unsigned long vsid, vpn, va, hash, secondary, slot; unsigned long huge = pte_huge(pte); - if ((ea >= USER_START) && (ea <= USER_END)) + if (ea < KERNELBASE) vsid = get_vsid(context, ea); else vsid = get_kernel_vsid(ea); Index: working-2.6/arch/ppc64/mm/hash_native.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/hash_native.c 2005-04-26 15:37:55.000000000 +1000 +++ working-2.6/arch/ppc64/mm/hash_native.c 2005-05-03 12:56:34.000000000 +1000 @@ -320,8 +320,7 @@ j = 0; for (i = 0; i < number; i++) { - if ((batch->addr[i] >= USER_START) && - (batch->addr[i] <= USER_END)) + if (batch->addr[i] < KERNELBASE) vsid = get_vsid(context, batch->addr[i]); else vsid = get_kernel_vsid(batch->addr[i]); Index: working-2.6/arch/ppc64/mm/init.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/init.c 2005-05-02 08:57:20.000000000 +1000 +++ working-2.6/arch/ppc64/mm/init.c 2005-05-03 12:56:34.000000000 +1000 @@ -64,6 +64,7 @@ #include #include #include +#include int mem_init_done; unsigned long ioremap_bot = IMALLOC_BASE; Index: working-2.6/include/asm-ppc64/mmu.h =================================================================== --- working-2.6.orig/include/asm-ppc64/mmu.h 2005-04-26 15:38:02.000000000 +1000 +++ working-2.6/include/asm-ppc64/mmu.h 2005-05-03 12:56:34.000000000 +1000 @@ -15,19 +15,10 @@ #include #include -#include -#ifndef __ASSEMBLY__ - -/* Time to allow for more things here */ -typedef unsigned long mm_context_id_t; -typedef struct { - mm_context_id_t id; -#ifdef CONFIG_HUGETLB_PAGE - pgd_t *huge_pgdir; - u16 htlb_segs; /* bitmask */ -#endif -} mm_context_t; +/* + * 
Segment table + */ #define STE_ESID_V 0x80 #define STE_ESID_KS 0x20 @@ -36,15 +27,48 @@ #define STE_VSID_SHIFT 12 -struct stab_entry { - unsigned long esid_data; - unsigned long vsid_data; -}; +/* Location of cpu0's segment table */ +#define STAB0_PAGE 0x9 +#define STAB0_PHYS_ADDR (STAB0_PAGE<> VSID_BITS) + (x & VSID_MODULUS); + return (x + ((x+1) >> VSID_BITS)) & VSID_MODULUS; +#endif /* 1 */ +} + +/* This is only valid for addresses >= KERNELBASE */ +static inline unsigned long get_kernel_vsid(unsigned long ea) +{ + return vsid_scramble(ea >> SID_SHIFT); +} + +/* This is only valid for user addresses (which are below 2^41) */ +static inline unsigned long get_vsid(unsigned long context, unsigned long ea) +{ + return vsid_scramble((context << USER_ESID_BITS) + | (ea >> SID_SHIFT)); +} + +#endif /* __ASSEMBLY */ + #endif /* _PPC64_MMU_H_ */ Index: working-2.6/arch/ppc64/mm/stab.c =================================================================== --- working-2.6.orig/arch/ppc64/mm/stab.c 2005-04-26 15:37:55.000000000 +1000 +++ working-2.6/arch/ppc64/mm/stab.c 2005-05-03 12:56:34.000000000 +1000 @@ -19,6 +19,11 @@ #include #include +struct stab_entry { + unsigned long esid_data; + unsigned long vsid_data; +}; + /* Both the segment table and SLB code uses the following cache */ #define NR_STAB_CACHE_ENTRIES 8 DEFINE_PER_CPU(long, stab_cache_ptr); Index: working-2.6/include/asm-ppc64/mmu_context.h =================================================================== --- working-2.6.orig/include/asm-ppc64/mmu_context.h 2005-04-26 15:38:02.000000000 +1000 +++ working-2.6/include/asm-ppc64/mmu_context.h 2005-05-03 12:56:34.000000000 +1000 @@ -84,86 +84,4 @@ local_irq_restore(flags); } -/* VSID allocation - * =============== - * - * We first generate a 36-bit "proto-VSID". 
For kernel addresses this - * is equal to the ESID, for user addresses it is: - * (context << 15) | (esid & 0x7fff) - * - * The two forms are distinguishable because the top bit is 0 for user - * addresses, whereas the top two bits are 1 for kernel addresses. - * Proto-VSIDs with the top two bits equal to 0b10 are reserved for - * now. - * - * The proto-VSIDs are then scrambled into real VSIDs with the - * multiplicative hash: - * - * VSID = (proto-VSID * VSID_MULTIPLIER) % VSID_MODULUS - * where VSID_MULTIPLIER = 268435399 = 0xFFFFFC7 - * VSID_MODULUS = 2^36-1 = 0xFFFFFFFFF - * - * This scramble is only well defined for proto-VSIDs below - * 0xFFFFFFFFF, so both proto-VSID and actual VSID 0xFFFFFFFFF are - * reserved. VSID_MULTIPLIER is prime, so in particular it is - * co-prime to VSID_MODULUS, making this a 1:1 scrambling function. - * Because the modulus is 2^n-1 we can compute it efficiently without - * a divide or extra multiply (see below). - * - * This scheme has several advantages over older methods: - * - * - We have VSIDs allocated for every kernel address - * (i.e. everything above 0xC000000000000000), except the very top - * segment, which simplifies several things. - * - * - We allow for 15 significant bits of ESID and 20 bits of - * context for user addresses. i.e. 8T (43 bits) of address space for - * up to 1M contexts (although the page table structure and context - * allocation will need changes to take advantage of this). - * - * - The scramble function gives robust scattering in the hash - * table (at least based on some initial results). The previous - * method was more susceptible to pathological cases giving excessive - * hash collisions. - */ - -/* - * WARNING - If you change these you must make sure the asm - * implementations in slb_allocate(), do_stab_bolted and mmu.h - * (ASM_VSID_SCRAMBLE macro) are changed accordingly. - * - * You'll also need to change the precomputed VSID values in head.S - * which are used by the iSeries firmware. 
- */ - -static inline unsigned long vsid_scramble(unsigned long protovsid) -{ -#if 0 - /* The code below is equivalent to this function for arguments - * < 2^VSID_BITS, which is all this should ever be called - * with. However gcc is not clever enough to compute the - * modulus (2^n-1) without a second multiply. */ - return ((protovsid * VSID_MULTIPLIER) % VSID_MODULUS); -#else /* 1 */ - unsigned long x; - - x = protovsid * VSID_MULTIPLIER; - x = (x >> VSID_BITS) + (x & VSID_MODULUS); - return (x + ((x+1) >> VSID_BITS)) & VSID_MODULUS; -#endif /* 1 */ -} - -/* This is only valid for addresses >= KERNELBASE */ -static inline unsigned long get_kernel_vsid(unsigned long ea) -{ - return vsid_scramble(ea >> SID_SHIFT); -} - -/* This is only valid for user addresses (which are below 2^41) */ -static inline unsigned long get_vsid(unsigned long context, unsigned long ea) -{ - return vsid_scramble((context << USER_ESID_BITS) - | (ea >> SID_SHIFT)); -} - #endif /* __PPC64_MMU_CONTEXT_H */ Index: working-2.6/include/asm-ppc64/imalloc.h =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ working-2.6/include/asm-ppc64/imalloc.h 2005-05-03 12:56:34.000000000 +1000 @@ -0,0 +1,24 @@ +#ifndef _PPC64_IMALLOC_H +#define _PPC64_IMALLOC_H + +/* + * Define the address range of the imalloc VM area. 
+ */ +#define PHBS_IO_BASE IOREGIONBASE +#define IMALLOC_BASE (IOREGIONBASE + 0x80000000ul) /* Reserve 2 gigs for PHBs */ +#define IMALLOC_END (IOREGIONBASE + EADDR_MASK) + + +/* imalloc region types */ +#define IM_REGION_UNUSED 0x1 +#define IM_REGION_SUBSET 0x2 +#define IM_REGION_EXISTS 0x4 +#define IM_REGION_OVERLAP 0x8 +#define IM_REGION_SUPERSET 0x10 + +extern struct vm_struct * im_get_free_area(unsigned long size); +extern struct vm_struct * im_get_area(unsigned long v_addr, unsigned long size, + int region_type); +unsigned long im_free(void *addr); + +#endif /* _PPC64_IMALLOC_H */ Index: working-2.6/arch/ppc64/kernel/pci.c =================================================================== --- working-2.6.orig/arch/ppc64/kernel/pci.c 2005-04-26 15:37:55.000000000 +1000 +++ working-2.6/arch/ppc64/kernel/pci.c 2005-05-03 12:56:34.000000000 +1000 @@ -438,7 +438,7 @@ int i; if (page_is_ram(offset >> PAGE_SHIFT)) - return prot; + return __pgprot(prot); prot |= _PAGE_NO_CACHE | _PAGE_GUARDED; -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/people/dgibson From paulus at samba.org Tue May 3 14:14:00 2005 From: paulus at samba.org (Paul Mackerras) Date: Tue, 3 May 2005 14:14:00 +1000 Subject: [PPC64] pgtable.h and other header cleanups In-Reply-To: <20050503033332.GC22453@localhost.localdomain> References: <20050503002608.GA22453@localhost.localdomain> <20050503012343.GB22453@localhost.localdomain> <20050503033332.GC22453@localhost.localdomain> Message-ID: <17014.64136.49697.910612@cargo.ozlabs.ibm.com> David Gibson writes: > This patch started as simply removing a few never-used macros from > asm-ppc64/pgtable.h, then kind of grew. 
It now makes a bunch of > cleanups to the ppc64 low-level header files (with corresponding > changes to .c files where necessary) such as: > - Abolishing never-used macros > - Eliminating multiple #defines with the same purpose > - Removing pointless macros (cases where just expanding the > macro everywhere turns out clearer and more sensible) > - Removing some cases where macros which could be defined in > terms of each other weren't > - Moving imalloc() related definitions from pgtable.h to their > own header file (imalloc.h) > - Re-arranging headers to group things more logically > - Moving all VSID allocation related things to mmu.h, instead > of being split between mmu.h and mmu_context.h > - Removing some reserved space for flags from the PMD - we're > not using it. > - Fix some bugs which broke compile with STRICT_MM_TYPECHECKS. > > Signed-off-by: David Gibson Acked-by: Paul Mackerras From sonny at burdell.org Tue May 3 15:02:12 2005 From: sonny at burdell.org (Sonny Rao) Date: Tue, 3 May 2005 01:02:12 -0400 Subject: 2.6.11 e1000 EEH MMIO failure Message-ID: <20050503050212.GA22395@kevlar.burdell.org> I'm guessing this means a bad e1000 card but I wanted to check with the experts. The box is a p690 w/ some expansion drawers attached, and is running a pretty-much stock 2.6.11 kernel, system is booted in SMP mode. Could it be related to e1000 errata "23" mentioned earlier on the mailing list? Here are the messages: Intel(R) PRO/1000 Network Driver - version 5.6.10.1-k2 Copyright (c) 1999-2004 Intel Corporation. 
e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection PCI: Enabling device: (000a:01:01.0), cmd 143 e1000: eth4: e1000_probe: Intel(R) PRO/1000 Network Connection PCI: Enabling device: (000a:01:01.1), cmd 143 e1000: eth5: e1000_probe: Intel(R) PRO/1000 Network Connection PCI: Enabling device: (000e:21:01.0), cmd 143 e1000: eth6: e1000_probe: Intel(R) PRO/1000 Network Connection PCI: Enabling device: (0011:21:01.0), cmd 143 e1000: eth7: e1000_probe: Intel(R) PRO/1000 Network Connection RTAS: event: 15, Type: Retry, Severity: 2 EEH: MMIO failure (2) on device: ethernet /pci at 3ffe7f0a000/pci at 2,2/ethernet at 1 Call Trace: [c00000103873a910] [c000000000631630] 0xc000000000631630 (unreliable) [c00000103873a990] [c000000000036a6c] .eeh_dn_check_failure+0x2e4/0x334 [c00000103873aa70] [c000000000036c20] .eeh_check_failure+0x164/0x1b0 [c00000103873ab10] [d0000000002a6b04] .e1000_check_for_link+0x5ac/0x664 [e1000] [c00000103873abd0] [d00000000029a5e0] .e1000_watchdog+0x48/0x79c [e1000] [c00000103873ac90] [c00000000005f558] .run_timer_softirq+0x15c/0x280 [c00000103873ad60] [c00000000005a3c4] .__do_softirq+0xdc/0x1c8 [c00000103873ae20] [c00000000005a538] .do_softirq+0x88/0x8c [c00000103873aeb0] [c000000000011520] .timer_interrupt+0x294/0x35c [c00000103873afb0] [c00000000000a2b8] decrementer_common+0xb8/0x100 --- Exception: 901 at ._spin_unlock_irqrestore+0x1c/0x28 LR = .rtas_call+0x1a4/0x2b4 [c00000103873b2a0] [c0000000001e8128] .snprintf+0x30/0x44 (unreliable) [c00000103873b2e0] [c00000000003421c] .rtas_call+0x110/0x2b4 [c00000103873b3a0] [c0000000000366ec] .read_slot_reset_state+0x94/0xac [c00000103873b420] [c000000000036890] .eeh_dn_check_failure+0x108/0x334 [c00000103873b500] [c000000000036c20] .eeh_check_failure+0x164/0x1b0 [c00000103873b5a0] [d00000000029f174] .e1000_up+0x404/0x40c [e1000] [c00000103873b650] [d00000000029f5cc] .e1000_open+0x54/0xc0 [e1000] [c00000103873b6e0] [c0000000002fec84] .dev_open+0x118/0x13c [c00000103873b780] [c0000000002fcef8] 
.dev_change_flags+0x19c/0x1d4 [c00000103873b820] [c000000000357878] .devinet_ioctl+0x66c/0x820 [c00000103873b930] [c000000000358794] .inet_ioctl+0x260/0x2e0 [c00000103873b9c0] [c0000000002f03a0] .sock_ioctl+0x28c/0x418 [c00000103873ba70] [c0000000000c7564] .do_ioctl+0x124/0x13c [c00000103873bb10] [c0000000000c777c] .vfs_ioctl+0x200/0x4e0 [c00000103873bbc0] [c0000000000c7ab8] .sys_ioctl+0x5c/0xa4 [c00000103873bc70] [c00000000001e8c0] .dev_ifsioc+0x8c/0x348 [c00000103873bd50] [c0000000000e7d24] .compat_sys_ioctl+0x46c/0x4c4 [c00000103873be30] [c00000000000d500] syscall_exit+0x0/0x18 e1000: eth7: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex RTAS: event: 16, Type: Retry, Severity: 2 EEH: MMIO failure (2) on device: ethernet /pci at 3ffe7f0a000/pci at 2,2/ethernet at 1 Call Trace: [c00000103873b3a0] [c000000000631630] 0xc000000000631630 (unreliable) [c00000103873b420] [c000000000036a6c] .eeh_dn_check_failure+0x2e4/0x334 [c00000103873b500] [c000000000036c20] .eeh_check_failure+0x164/0x1b0 [c00000103873b5a0] [d00000000029f174] .e1000_up+0x404/0x40c [e1000] [c00000103873b650] [d00000000029f5cc] .e1000_open+0x54/0xc0 [e1000] [c00000103873b6e0] [c0000000002fec84] .dev_open+0x118/0x13c [c00000103873b780] [c0000000002fcef8] .dev_change_flags+0x19c/0x1d4 [c00000103873b820] [c000000000357878] .devinet_ioctl+0x66c/0x820 [c00000103873b930] [c000000000358794] .inet_ioctl+0x260/0x2e0 [c00000103873b9c0] [c0000000002f03a0] .sock_ioctl+0x28c/0x418 [c00000103873ba70] [c0000000000c7564] .do_ioctl+0x124/0x13c [c00000103873bb10] [c0000000000c777c] .vfs_ioctl+0x200/0x4e0 [c00000103873bbc0] [c0000000000c7ab8] .sys_ioctl+0x5c/0xa4 [c00000103873bc70] [c00000000001e8c0] .dev_ifsioc+0x8c/0x348 [c00000103873bd50] [c0000000000e7d24] .compat_sys_ioctl+0x46c/0x4c4 [c00000103873be30] [c00000000000d500] syscall_exit+0x0/0x18 EEH: MMIO failure (2), notifiying device 0011:21:01.0 EEH: MMIO failure (2), notifiying device 0011:21:01.0 PCI: Enabling device: (0014:01:01.0), cmd 143 e1000: eth8: 
e1000_probe: Intel(R) PRO/1000 Network Connection PCI: Enabling device: (0014:01:01.1), cmd 143 e1000: eth9: e1000_probe: Intel(R) PRO/1000 Network Connection PCI: Enabling device: (0017:01:01.0), cmd 143 e1000: eth10: e1000_probe: Intel(R) PRO/1000 Network Connection Sonny From benh at kernel.crashing.org Tue May 3 15:34:58 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 03 May 2005 15:34:58 +1000 Subject: [PATCH] Fix gcc 4.0 vs CONFIG_ALTIVEC Message-ID: <1115098498.6030.82.camel@gaston> Hi ! gcc-4.0 generates altivec code implicitely when -mcpu insidates an altivec capable CPU which is not suitable for the kenrel. However, we used to set -mcpu=970 when CONFIG_ALTIVEC was set because a gcc-3.x bug prevented from using -maltivec along with -mcpu=power4, thus prevented building the RAID6 altivec code. This patch fixes all of this by testing for the gcc version. If 4.0 or later, just normally use -mcpu=power4 and let the RAID6 code add -maltivec to the few files it needs to be compiled with altivec support. For 3.x, we still use -mcpu=970 to work around the above problem, which is fine as 3.x will never implicitely generate altivec code. The Makefile hackery may not be the most lovely, I welcome anybody more skilled than me to improve it. 
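[Editor's sketch, not part of the original mail.] The flag selection described above, together with the Makefile hunk that follows, can be modelled as a small decision function. `ppc64_cpu_flag` is a hypothetical name. The `gcc_version` encoding (major*100 + minor, so 3.4 is 304 and 4.0 is 400) is an assumption based on the "-lt 0400" test in the patch. This only models the logic and is not how kbuild evaluates it.

```c
#include <stdbool.h>

/*
 * Returns the CPU flag the patched Makefile would add to CFLAGS.
 *   - !CONFIG_POWER4_ONLY               -> -mtune=power4
 *   - POWER4_ONLY + ALTIVEC + gcc < 4.0 -> -mcpu=970 (works around the
 *     gcc 3.x problem combining -maltivec with -mcpu=power4)
 *   - every other POWER4_ONLY case      -> -mcpu=power4
 */
static inline const char *ppc64_cpu_flag(int gcc_version, bool power4_only,
					 bool altivec)
{
	bool gcc_broken_vec = gcc_version < 400; /* mirrors GCC_BROKEN_VEC */

	if (!power4_only)
		return "-mtune=power4";
	if (altivec && gcc_broken_vec)
		return "-mcpu=970";
	return "-mcpu=power4";
}
```

With gcc 4.0 and both options enabled this yields -mcpu=power4, leaving it to the RAID6 code to add -maltivec for the few files that need it.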
Signed-off-by: Benjamin Herrenschmidt Index: linux-work/arch/ppc64/Makefile =================================================================== --- linux-work.orig/arch/ppc64/Makefile 2005-05-02 10:48:08.000000000 +1000 +++ linux-work/arch/ppc64/Makefile 2005-05-03 14:38:43.000000000 +1000 @@ -56,13 +56,20 @@ CFLAGS += -msoft-float -pipe -mminimal-toc -mtraceback=none \ -mcall-aixdesc +GCC_VERSION := $(call cc-version) +GCC_BROKEN_VEC := $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "y"; fi ;) + ifeq ($(CONFIG_POWER4_ONLY),y) ifeq ($(CONFIG_ALTIVEC),y) +ifeq ($(GCC_BROKEN_VEC),y) CFLAGS += $(call cc-option,-mcpu=970) else CFLAGS += $(call cc-option,-mcpu=power4) endif else + CFLAGS += $(call cc-option,-mcpu=power4) +endif +else CFLAGS += $(call cc-option,-mtune=power4) endif From benh at kernel.crashing.org Tue May 3 16:08:25 2005 From: benh at kernel.crashing.org (Benjamin Herrenschmidt) Date: Tue, 03 May 2005 16:08:25 +1000 Subject: [PATCH] Fix gcc 4.0 vs CONFIG_ALTIVEC In-Reply-To: <1115098498.6030.82.camel@gaston> References: <1115098498.6030.82.camel@gaston> Message-ID: <1115100505.6030.91.camel@gaston> On Tue, 2005-05-03 at 15:35 +1000, Benjamin Herrenschmidt wrote: > Hi ! > > gcc-4.0 generates altivec code implicitely when -mcpu insidates an > altivec capable CPU which is not suitable for the kenrel. Damn ! I should stay away from a keyboard today ! Oh well, at least the patch itself looks ok. Ben. From arnd at arndb.de Tue May 3 19:46:03 2005 From: arnd at arndb.de (Arnd Bergmann) Date: Tue, 3 May 2005 11:46:03 +0200 Subject: [PATCH 1/2] ppc64: fix read/write on large /dev/nvram In-Reply-To: <7845758a806ed6769cea59a9df344d39@bga.com> References: <7845758a806ed6769cea59a9df344d39@bga.com> Message-ID: <200505031146.04824.arnd@arndb.de> On Maandag 02 Mai 2005 08:43, Milton Miller wrote: > On Fri Apr 22 16:49:59 EST 2005, Arnd wrote a patch with the following > lines (among several others). 
> > - len = ppc_md.nvram_read(tmp_buffer, count, ppos); > + ret = ppc_md.nvram_read(tmp, count, ppos); > > - len = ppc_md.nvram_write(tmp_buffer, count, ppos); > + ret = ppc_md.nvram_read(tmp, count, ppos); > > > Even though I am just scanning, I am guessing this is not quite right. Good catch. I only tested the read path because I did not want to mess with the contents of the nvram. I'll do a new patch when I come back to Germany, unless someone else (Utz?) does one first. Arnd <>< From omkhar at gentoo.org Wed May 4 02:27:51 2005 From: omkhar at gentoo.org (Omkhar Arasaratnam) Date: Tue, 03 May 2005 12:27:51 -0400 Subject: [BUG] 2.4.30 - Bring up on JS20 Fails In-Reply-To: <20050503031322.GG12682@krispykreme> References: <4276C979.3020300@gentoo.org> <20050503031322.GG12682@krispykreme> Message-ID: <4277A687.8010806@gentoo.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Anton Blanchard wrote: > > Hi, > >> After including this header I was able to compile, but on bring >> up i see the following: >> >> [boot]0012 Setup Arch pSeries_pci: this system has large bus >> numbers and the kernel was not built with the patch that fixes >> include/linux/pci.h struct pci_bus so number, primary, secondary >> and subordinate are ints. Kernel panic: pSeries_pci: this system >> has large bus numbers and the kernel was not built with the patch >> that fixes > > >> include/linux/pci.h struct pci_bus so number, primary, secondary >> and subordinate are ints. > > > Do that and it should work :) Its to do with PCI domains and is > fixed properly in 2.6. 
> > Anton > Long story short - someone asked - I'll try and quantify thier requirements but thats about it for now - -- Omkhar Arasaratnam - Gentoo PPC64 Developer omkhar at gentoo.org - http://dev.gentoo.org/~omkhar Gentoo Linux / PPC64 Linux: http://ppc64.gentoo.org -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (MingW32) iD8DBQFCd6aH9msUWjh2lHURArUrAJ0afKK091NgJV/3J9TbmTFreT+gLgCghO9B sh6ijo7Mmkg28spwNR06MvY= =Wq8s -----END PGP SIGNATURE----- From sonny at burdell.org Wed May 4 04:17:47 2005 From: sonny at burdell.org (Sonny Rao) Date: Tue, 3 May 2005 14:17:47 -0400 Subject: 2.6.11 e1000 EEH MMIO failure In-Reply-To: <20050503050212.GA22395@kevlar.burdell.org> References: <20050503050212.GA22395@kevlar.burdell.org> Message-ID: <20050503181747.GB7870@kevlar.burdell.org> On Tue, May 03, 2005 at 01:02:12AM -0400, Sonny Rao wrote: > I'm guessing this means a bad e1000 card but I wanted to check with > the experts. The box is a p690 w/ some expansion drawers attached, > and is running a pretty-much stock 2.6.11 kernel, system is booted in > SMP mode. > > Could it be related to e1000 errata "23" mentioned earlier on the > mailing list? > This little bugger is causing a lot of spew into my logs, is there a way to tell EEH to just offline that PCI device ? Isn't that what it's supposed to do? Is there a PCI hotplug FAQ or README somewhere that I can read (and stop posting this crap to the list :) ) Thanks, Sonny From linas at austin.ibm.com Wed May 4 08:46:32 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 3 May 2005 17:46:32 -0500 Subject: 2.6.11 e1000 EEH MMIO failure In-Reply-To: <20050503050212.GA22395@kevlar.burdell.org> References: <20050503050212.GA22395@kevlar.burdell.org> Message-ID: <20050503224632.GF11745@austin.ibm.com> Recent e1000 code has some new kind of whiz-bang watchdog timer code that is causing the device to DMA off into hyperspace, thus triggering the EEH code. It's not clear to me if the 2.6.11 kernel has this code. 
Am cc'ing two people who should know.... --linas On Tue, May 03, 2005 at 01:02:12AM -0400, Sonny Rao was heard to remark: > I'm guessing this means a bad e1000 card but I wanted to check with > the experts. The box is a p690 w/ some expansion drawers attached, > and is running a pretty-much stock 2.6.11 kernel, system is booted in > SMP mode. > > Could it be related to e1000 errata "23" mentioned earlier on the > mailing list? > > Here are the messages: > > Intel(R) PRO/1000 Network Driver - version 5.6.10.1-k2 > Copyright (c) 1999-2004 Intel Corporation. > > > > e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection > PCI: Enabling device: (000a:01:01.0), cmd 143 > e1000: eth4: e1000_probe: Intel(R) PRO/1000 Network Connection > PCI: Enabling device: (000a:01:01.1), cmd 143 > e1000: eth5: e1000_probe: Intel(R) PRO/1000 Network Connection > PCI: Enabling device: (000e:21:01.0), cmd 143 > e1000: eth6: e1000_probe: Intel(R) PRO/1000 Network Connection > PCI: Enabling device: (0011:21:01.0), cmd 143 > e1000: eth7: e1000_probe: Intel(R) PRO/1000 Network Connection > RTAS: event: 15, Type: Retry, Severity: 2 > EEH: MMIO failure (2) on device: ethernet /pci at 3ffe7f0a000/pci at 2,2/ethernet at 1 > Call Trace: > [c00000103873a910] [c000000000631630] 0xc000000000631630 (unreliable) > [c00000103873a990] [c000000000036a6c] .eeh_dn_check_failure+0x2e4/0x334 > [c00000103873aa70] [c000000000036c20] .eeh_check_failure+0x164/0x1b0 > [c00000103873ab10] [d0000000002a6b04] .e1000_check_for_link+0x5ac/0x664 [e1000] > [c00000103873abd0] [d00000000029a5e0] .e1000_watchdog+0x48/0x79c [e1000] > [c00000103873ac90] [c00000000005f558] .run_timer_softirq+0x15c/0x280 > [c00000103873ad60] [c00000000005a3c4] .__do_softirq+0xdc/0x1c8 > [c00000103873ae20] [c00000000005a538] .do_softirq+0x88/0x8c > [c00000103873aeb0] [c000000000011520] .timer_interrupt+0x294/0x35c > [c00000103873afb0] [c00000000000a2b8] decrementer_common+0xb8/0x100 > --- Exception: 901 at 
._spin_unlock_irqrestore+0x1c/0x28 > LR = .rtas_call+0x1a4/0x2b4 > [c00000103873b2a0] [c0000000001e8128] .snprintf+0x30/0x44 (unreliable) > [c00000103873b2e0] [c00000000003421c] .rtas_call+0x110/0x2b4 > [c00000103873b3a0] [c0000000000366ec] .read_slot_reset_state+0x94/0xac > [c00000103873b420] [c000000000036890] .eeh_dn_check_failure+0x108/0x334 > [c00000103873b500] [c000000000036c20] .eeh_check_failure+0x164/0x1b0 > [c00000103873b5a0] [d00000000029f174] .e1000_up+0x404/0x40c [e1000] > [c00000103873b650] [d00000000029f5cc] .e1000_open+0x54/0xc0 [e1000] > [c00000103873b6e0] [c0000000002fec84] .dev_open+0x118/0x13c > [c00000103873b780] [c0000000002fcef8] .dev_change_flags+0x19c/0x1d4 > [c00000103873b820] [c000000000357878] .devinet_ioctl+0x66c/0x820 > [c00000103873b930] [c000000000358794] .inet_ioctl+0x260/0x2e0 > [c00000103873b9c0] [c0000000002f03a0] .sock_ioctl+0x28c/0x418 > [c00000103873ba70] [c0000000000c7564] .do_ioctl+0x124/0x13c > [c00000103873bb10] [c0000000000c777c] .vfs_ioctl+0x200/0x4e0 > [c00000103873bbc0] [c0000000000c7ab8] .sys_ioctl+0x5c/0xa4 > [c00000103873bc70] [c00000000001e8c0] .dev_ifsioc+0x8c/0x348 > [c00000103873bd50] [c0000000000e7d24] .compat_sys_ioctl+0x46c/0x4c4 > [c00000103873be30] [c00000000000d500] syscall_exit+0x0/0x18 > e1000: eth7: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex > RTAS: event: 16, Type: Retry, Severity: 2 > EEH: MMIO failure (2) on device: ethernet /pci at 3ffe7f0a000/pci at 2,2/ethernet at 1 > Call Trace: > [c00000103873b3a0] [c000000000631630] 0xc000000000631630 (unreliable) > [c00000103873b420] [c000000000036a6c] .eeh_dn_check_failure+0x2e4/0x334 > [c00000103873b500] [c000000000036c20] .eeh_check_failure+0x164/0x1b0 > [c00000103873b5a0] [d00000000029f174] .e1000_up+0x404/0x40c [e1000] > [c00000103873b650] [d00000000029f5cc] .e1000_open+0x54/0xc0 [e1000] > [c00000103873b6e0] [c0000000002fec84] .dev_open+0x118/0x13c > [c00000103873b780] [c0000000002fcef8] .dev_change_flags+0x19c/0x1d4 > [c00000103873b820] 
[c000000000357878] .devinet_ioctl+0x66c/0x820 > [c00000103873b930] [c000000000358794] .inet_ioctl+0x260/0x2e0 > [c00000103873b9c0] [c0000000002f03a0] .sock_ioctl+0x28c/0x418 > [c00000103873ba70] [c0000000000c7564] .do_ioctl+0x124/0x13c > [c00000103873bb10] [c0000000000c777c] .vfs_ioctl+0x200/0x4e0 > [c00000103873bbc0] [c0000000000c7ab8] .sys_ioctl+0x5c/0xa4 > [c00000103873bc70] [c00000000001e8c0] .dev_ifsioc+0x8c/0x348 > [c00000103873bd50] [c0000000000e7d24] .compat_sys_ioctl+0x46c/0x4c4 > [c00000103873be30] [c00000000000d500] syscall_exit+0x0/0x18 > EEH: MMIO failure (2), notifiying device 0011:21:01.0 > EEH: MMIO failure (2), notifiying device 0011:21:01.0 > PCI: Enabling device: (0014:01:01.0), cmd 143 > e1000: eth8: e1000_probe: Intel(R) PRO/1000 Network Connection > PCI: Enabling device: (0014:01:01.1), cmd 143 > e1000: eth9: e1000_probe: Intel(R) PRO/1000 Network Connection > PCI: Enabling device: (0017:01:01.0), cmd 143 > e1000: eth10: e1000_probe: Intel(R) PRO/1000 Network Connection > > Sonny From linas at austin.ibm.com Wed May 4 08:55:08 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Tue, 3 May 2005 17:55:08 -0500 Subject: 2.6.11 e1000 EEH MMIO failure In-Reply-To: <20050503181747.GB7870@kevlar.burdell.org> References: <20050503050212.GA22395@kevlar.burdell.org> <20050503181747.GB7870@kevlar.burdell.org> Message-ID: <20050503225508.GG11745@austin.ibm.com> On Tue, May 03, 2005 at 02:17:47PM -0400, Sonny Rao was heard to remark: > > This little bugger is causing a lot of spew into my logs, is there a > way to tell EEH to just offline that PCI device ? Isn't that what > it's supposed to do?
Is there a PCI hotplug FAQ or README somewhere > that I can read (and stop posting this crap to the list :) ) You can prevent it from panicking by setting "panic_on_oops" to 0 echo 0 > /proc/sys/kernel/panic_on_oops Unfortunately, there is no boot-prompt option for this; there should be a __setup(panic_on_oops) added to kernel/panic.c As to actually recovering from that error-- you might try applying one of the earlier posted EEH patches; it should work. These earlier patches aren't in the mainline kernel because they have deficiencies. I'm supposed to be re-writing the code to make an EEH patch that is generally acceptable as a real patch, but am currently snowed under with other activities. --linas From sonny at burdell.org Wed May 4 10:29:03 2005 From: sonny at burdell.org (Sonny Rao) Date: Tue, 3 May 2005 20:29:03 -0400 Subject: 2.6.11 e1000 EEH MMIO failure In-Reply-To: <20050503224632.GF11745@austin.ibm.com> References: <20050503050212.GA22395@kevlar.burdell.org> <20050503224632.GF11745@austin.ibm.com> Message-ID: <20050504002903.GA11855@kevlar.burdell.org> On Tue, May 03, 2005 at 05:46:32PM -0500, Linas Vepstas wrote: > > Recent e1000 code has some new kind of whiz-bang watchdog timer > code that is causing the device to DMA off into hyperspace, > thus triggering the EEH code. It's not clear to me if the > 2.6.11 kernel has this code. > > Am cc'ing two people who should know.... > > --linas > Well that machine has other e1000 cards in it that aren't doing this, so I'm thinking it really is bad hardware in my case. 
If you want this card for testing EEH code or something, I just found it and ripped it out earlier today, and you can have it :-) Sonny From sonny at burdell.org Wed May 4 10:33:05 2005 From: sonny at burdell.org (Sonny Rao) Date: Tue, 3 May 2005 20:33:05 -0400 Subject: 2.6.11 e1000 EEH MMIO failure In-Reply-To: <20050503225508.GG11745@austin.ibm.com> References: <20050503050212.GA22395@kevlar.burdell.org> <20050503181747.GB7870@kevlar.burdell.org> <20050503225508.GG11745@austin.ibm.com> Message-ID: <20050504003305.GB11855@kevlar.burdell.org> On Tue, May 03, 2005 at 05:55:08PM -0500, Linas Vepstas wrote: > On Tue, May 03, 2005 at 02:17:47PM -0400, Sonny Rao was heard to remark: > > > > This little bugger is causing a lot of spew into my logs, is there a > > way to tell EEH to just offline that PCI device ? Isn't that what > > it's supposed to do? Is there a PCI hotplug FAQ or README somewhere > > that I can read (and stop posting this crap to the list :) ) > > You can prevent it from panicing by setting "panic_on_oops" to 0 > echo 0 > /proc/sys/kernel/panic_on_oops > > Unfortunately, there is no boot-prompt option for this; > there should be a __setup(panic_on_oops) added to kernel/panic.c Hmm okay, so it isn't actually causing a panic in my case, which I think is good mind you :) I didn't actually try and use it though, it was just in that machine among other e1000s. > As to actually recovering from that error-- you might try applying > one of the earlier posted EEH patches; it should work. These earlier > patches aren't in the mainline kernel because they have deficiencies. > > I'm supposed to be re-writing the code to make an EEH patch that is > generally acceptable as a real patch, but am currently snowed under > with other activities. Ah okay cool, so in the future Linux will be able to smartly handle it, very nice. 
Unfortunately I can't really test your patch because several other people need to use the machine which is normally partitioned up (and that particular device is left out of any LPAR config). I just happened to boot the full-system partition to do some tests and noticed the problem. Again, if someone wants to do something with that card, let me know, otherwise I'm going to toss it out. Sonny From anton at samba.org Wed May 4 14:37:45 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 4 May 2005 14:37:45 +1000 Subject: [PATCH] remove io_page_mask Message-ID: <20050504043745.GJ13590@krispykreme> Hi Jake, I found an issue with the io_page_mask code when pci_probe_only is not set (we don't initialise io_page_mask and bad things happen). I was about to fix it up when I wondered if we can remove it now. Ben changed the serial code to check before it goes pounding on addresses. I'm not sure if there were other issues with badly behaving drivers but my js20 boots here with the following removal patch. Thoughts? 
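For context, the mechanism under discussion can be sketched in user space. This is only an illustration, not the kernel code: it mirrors the _IO_IS_VALID test and the page-range mask computation that the removal patch in this message deletes. PAGE_SHIFT = 12 and the helper names io_is_valid() and io_allow_range() are assumptions made for the example.

```c
#define PAGE_SHIFT   12            /* assumption: 4K pages */
#define MAX_ISA_PORT 0x10000UL

/* One bit per page of port space below MAX_ISA_PORT. */
static unsigned long io_page_mask;

/* Same shape as _IO_IS_VALID: ports at or above MAX_ISA_PORT always pass,
 * lower ports pass only if the bit for their page is set. */
static int io_is_valid(unsigned long port)
{
    return port >= MAX_ISA_PORT ||
           ((1UL << (port >> PAGE_SHIFT)) & io_page_mask) != 0;
}

/* Hypothetical helper: open up the pages covered by [start, end],
 * following the start/end/mask arithmetic in the removed pci.c hunk. */
static void io_allow_range(unsigned long start, unsigned long end)
{
    if (start >= MAX_ISA_PORT)
        return;
    if (end > MAX_ISA_PORT)
        end = MAX_ISA_PORT;

    start >>= PAGE_SHIFT;
    end >>= PAGE_SHIFT;

    /* bits start..end inclusive */
    io_page_mask |= ((1UL << (end + 1)) - 1) ^ ((1UL << start) - 1);
}
```

In this sketch a mask that is never set up stays zero, so every port below MAX_ISA_PORT is rejected until some range has been allowed.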
Anton Index: foobar2/include/asm-ppc64/io.h =================================================================== --- foobar2.orig/include/asm-ppc64/io.h 2005-05-04 13:51:41.245647479 +1000 +++ foobar2/include/asm-ppc64/io.h 2005-05-04 13:55:14.823405718 +1000 @@ -33,12 +33,6 @@ extern unsigned long isa_io_base; extern unsigned long pci_io_base; -extern unsigned long io_page_mask; - -#define MAX_ISA_PORT 0x10000 - -#define _IO_IS_VALID(port) ((port) >= MAX_ISA_PORT || (1 << (port>>PAGE_SHIFT)) \ - & io_page_mask) #ifdef CONFIG_PPC_ISERIES /* __raw_* accessors aren't supported on iSeries */ Index: foobar2/include/asm-ppc64/eeh.h =================================================================== --- foobar2.orig/include/asm-ppc64/eeh.h 2005-05-04 13:51:41.246647403 +1000 +++ foobar2/include/asm-ppc64/eeh.h 2005-05-04 13:55:14.825405566 +1000 @@ -310,8 +310,6 @@ static inline u8 eeh_inb(unsigned long port) { u8 val; - if (!_IO_IS_VALID(port)) - return ~0; val = in_8((u8 __iomem *)(port+pci_io_base)); if (EEH_POSSIBLE_ERROR(val, u8)) return eeh_check_failure((void __iomem *)(port), val); @@ -320,15 +318,12 @@ static inline void eeh_outb(u8 val, unsigned long port) { - if (_IO_IS_VALID(port)) - out_8((u8 __iomem *)(port+pci_io_base), val); + out_8((u8 __iomem *)(port+pci_io_base), val); } static inline u16 eeh_inw(unsigned long port) { u16 val; - if (!_IO_IS_VALID(port)) - return ~0; val = in_le16((u16 __iomem *)(port+pci_io_base)); if (EEH_POSSIBLE_ERROR(val, u16)) return eeh_check_failure((void __iomem *)(port), val); @@ -337,15 +332,12 @@ static inline void eeh_outw(u16 val, unsigned long port) { - if (_IO_IS_VALID(port)) - out_le16((u16 __iomem *)(port+pci_io_base), val); + out_le16((u16 __iomem *)(port+pci_io_base), val); } static inline u32 eeh_inl(unsigned long port) { u32 val; - if (!_IO_IS_VALID(port)) - return ~0; val = in_le32((u32 __iomem *)(port+pci_io_base)); if (EEH_POSSIBLE_ERROR(val, u32)) return eeh_check_failure((void __iomem *)(port), val); @@ -354,8 
+346,7 @@ static inline void eeh_outl(u32 val, unsigned long port) { - if (_IO_IS_VALID(port)) - out_le32((u32 __iomem *)(port+pci_io_base), val); + out_le32((u32 __iomem *)(port+pci_io_base), val); } /* in-string eeh macros */ Index: foobar2/arch/ppc64/kernel/iSeries_pci.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/iSeries_pci.c 2005-05-04 13:55:12.042223389 +1000 +++ foobar2/arch/ppc64/kernel/iSeries_pci.c 2005-05-04 13:55:52.221213083 +1000 @@ -47,8 +47,6 @@ #include "pci.h" -extern unsigned long io_page_mask; - /* * Forward declares of prototypes. */ @@ -291,7 +289,6 @@ PPCDBG(PPCDBG_BUSWALK, "iSeries_pcibios_init Entry.\n"); iomm_table_initialize(); find_and_init_phbs(); - io_page_mask = -1; PPCDBG(PPCDBG_BUSWALK, "iSeries_pcibios_init Exit.\n"); } Index: foobar2/arch/ppc64/kernel/pci.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/pci.c 2005-05-04 13:55:12.047223007 +1000 +++ foobar2/arch/ppc64/kernel/pci.c 2005-05-04 13:55:52.226212702 +1000 @@ -42,15 +42,6 @@ unsigned long pci_probe_only = 1; unsigned long pci_assign_all_buses = 0; -/* - * legal IO pages under MAX_ISA_PORT. This is to ensure we don't touch - * devices we don't have access to. 
- */ -unsigned long io_page_mask; - -EXPORT_SYMBOL(io_page_mask); - - unsigned int pcibios_assign_all_busses(void) { return pci_assign_all_buses; @@ -674,8 +665,6 @@ pci_process_ISA_OF_ranges(isa_dn, hose->io_base_phys, hose->io_base_virt); of_node_put(isa_dn); - /* Allow all IO */ - io_page_mask = -1; } } @@ -837,24 +826,9 @@ if (dev->resource[i].flags & IORESOURCE_IO) { unsigned long offset = (unsigned long)hose->io_base_virt - pci_io_base; - unsigned long start, end, mask; - - start = dev->resource[i].start += offset; - end = dev->resource[i].end += offset; - /* Need to allow IO access to pages that are in the - ISA range */ - if (start < MAX_ISA_PORT) { - if (end > MAX_ISA_PORT) - end = MAX_ISA_PORT; - - start >>= PAGE_SHIFT; - end >>= PAGE_SHIFT; - - /* get the range of pages for the map */ - mask = ((1 << (end+1))-1) ^ ((1 << start)-1); - io_page_mask |= mask; - } + dev->resource[i].start += offset; + dev->resource[i].end += offset; } else if (dev->resource[i].flags & IORESOURCE_MEM) { dev->resource[i].start += hose->pci_mem_offset; Index: foobar2/arch/ppc64/kernel/maple_pci.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/maple_pci.c 2005-05-04 13:55:12.044223236 +1000 +++ foobar2/arch/ppc64/kernel/maple_pci.c 2005-05-04 13:55:52.223212931 +1000 @@ -454,9 +454,6 @@ /* Tell pci.c to use the common resource allocation mecanism */ pci_probe_only = 0; - - /* Allow all IO */ - io_page_mask = -1; } int maple_pci_get_legacy_ide_irq(struct pci_dev *pdev, int channel) Index: foobar2/arch/ppc64/kernel/pmac_pci.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/pmac_pci.c 2005-05-04 13:55:12.050222778 +1000 +++ foobar2/arch/ppc64/kernel/pmac_pci.c 2005-05-04 13:55:52.228212549 +1000 @@ -755,9 +755,6 @@ /* Tell pci.c to not use the common resource allocation mecanism */ pci_probe_only = 1; - - /* Allow all IO */ - io_page_mask = -1; } /* Index: 
foobar2/arch/ppc64/kernel/iomap.c =================================================================== --- foobar2.orig/arch/ppc64/kernel/iomap.c 2005-05-04 13:16:13.000000000 +1000 +++ foobar2/arch/ppc64/kernel/iomap.c 2005-05-04 14:02:43.887671757 +1000 @@ -88,8 +88,6 @@ void __iomem *ioport_map(unsigned long port, unsigned int len) { - if (!_IO_IS_VALID(port)) - return NULL; return (void __iomem *) (port+pci_io_base); } From anton at samba.org Wed May 4 15:42:58 2005 From: anton at samba.org (Anton Blanchard) Date: Wed, 4 May 2005 15:42:58 +1000 Subject: [PATCH] remove io_page_mask In-Reply-To: <20050504043745.GJ13590@krispykreme> References: <20050504043745.GJ13590@krispykreme> Message-ID: <20050504054258.GK13590@krispykreme> > I found an issue with the io_page_mask code when pci_probe_only is not > set (we dont initialise io_page_mask and bad things happen). I was > about to fix it up when I wondered if we can remove it now. > > Ben changed the serial code to check before it goes pounding on > addresses. Im not sure if there were other issues with badly behaving > drivers but my js20 boots here with the following removal patch. First fallout: parport_pc_probe_port+0x318/0xbd0 parport_pc_init+0x418/0x520 sys_init_module+0x1bc/0x4b0 The parallel port code is going out and touching stuff it shouldnt. Anton From will_schmidt at vnet.ibm.com Wed May 4 23:12:27 2005 From: will_schmidt at vnet.ibm.com (will schmidt) Date: Wed, 04 May 2005 08:12:27 -0500 Subject: (resend) RFC/Patch xmon pte/pgd/ userspace address additions Message-ID: <4278CA3B.6060700@vnet.ibm.com> Hi Folks, This is a resend. I didnt see this on the patch page, so suspect it got lost. (Do updated patches need to be sent to the list under a different thread?) 
-Will -------- Original Message -------- Subject: Re: RFC/Patch more xmon additions Date: Thu, 07 Apr 2005 08:39:36 -0500 From: will schmidt To: will schmidt CC: linuxppc64-dev at ozlabs.org References: <421E3BE3.90301 at vnet.ibm.com> Hi All, here's a revised version of my initial patch. - I've removed the try_spinlock code; - As an alternative to duplicating lots of function to add mread calls in place of references, I've added setjmp(bus_error_jmp) {} around what seem more likely to be critical areas. - cleaned up spacing - changed most of the function names to be xmon_xxx instead of wm_xxx. these functions show up under a submenu 'w'. use "w?" at xmon> prompt to get the help blurb. -Will will schmidt wrote: > > Hi Folks, > Am looking for comments on this additional function i've added to xmon > on the side.. > > the bulk of my intent was to make it easier for me to poke at memory > within a particular user process. > > I realize that the spacing is a bit screwed up, and the function names > should eventually change. Because i couldnt decide on letters for the > new functions, i put them under a submenu 'w'. > > wP will dump info on all processes. > > wp 0xabc will make process with pid 0xabc the active pid. <- active > only with respect to xmon poking into memory. > > wd 0xabcd1234 - will call through the pdg/pmd functions and return the > kernel address corresponding to 0xabcd1234 within the processes memory > space location. > > wg will dump gprs of the process/thread. > > -Will > > > ------------------------------------------------------------------------ > > > _______________________________________________ > Linuxppc64-dev mailing list > Linuxppc64-dev at ozlabs.org > https://ozlabs.org/cgi-bin/mailman/listinfo/linuxppc64-dev -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: xmon_pxd_code_apr7.diff Url: http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20050504/e50cfe05/attachment.txt From linas at austin.ibm.com Thu May 5 02:28:55 2005 From: linas at austin.ibm.com (Linas Vepstas) Date: Wed, 4 May 2005 11:28:55 -0500 Subject: 2.6.11 e1000 EEH MMIO failure In-Reply-To: <20050504003305.GB11855@kevlar.burdell.org> References: <20050503050212.GA22395@kevlar.burdell.org> <20050503181747.GB7870@kevlar.burdell.org> <20050503225508.GG11745@austin.ibm.com> <20050504003305.GB11855@kevlar.burdell.org> Message-ID: <20050504162855.GH11745@austin.ibm.com> On Tue, May 03, 2005 at 08:33:05PM -0400, Sonny Rao was heard to remark: > > Ah okay cool, so in the future Linux will be able to smartly handle > it, very nice. Unfortunately I can't really test your patch because > several other people need to use the machine which is normally > partitioned up (and that particular device is left out of any LPAR > config) I just happend to boot the full-system partition to do some > tests and noticed the problem. There's supposed to be some code that allows slots to be dynamically added and removed from running partitions, but I've never tried it myself. > Again, if someone wants to do something with that card, let me know, > otherwise I'm going to toss it out. FWIW, field experience shows that nine out of ten failures are due to poorly seated PCI cards. Before you chuck it, you might want to remove it, make sure there are no iron filings in the slot, and try again. Let me know how that goes; I'd like to add this to my bag of "real world" experience with this thing. --linas 
From moilanen at austin.ibm.com Thu May 5 06:35:59 2005 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Wed, 4 May 2005 15:35:59 -0500 Subject: [PATCH] ppc64: enforce medium thread priority in hypervisor calls In-Reply-To: <20050429135446.GF19662@krispykreme> References: <20050429135446.GF19662@krispykreme> Message-ID: <20050504153559.2aa85753.moilanen@austin.ibm.com> > Calls into the hypervisor do not raise the thread priority. Ensure we > are running at medium priority upon entry to the hypervisor. Anton, what's the purpose of this patch. I thought only RS64 had HMT, and those boxes don't make hypervisor calls. Jake From moilanen at austin.ibm.com Thu May 5 06:48:35 2005 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Wed, 4 May 2005 15:48:35 -0500 Subject: [PATCH] ppc64: enforce medium thread priority in hypervisor calls In-Reply-To: <20050504153559.2aa85753.moilanen@austin.ibm.com> References: <20050429135446.GF19662@krispykreme> <20050504153559.2aa85753.moilanen@austin.ibm.com> Message-ID: <20050504154835.1e67686b.moilanen@austin.ibm.com> On Wed, 4 May 2005 15:35:59 -0500 Jake Moilanen wrote: > > Calls into the hypervisor do not raise the thread priority. Ensure we > > are running at medium priority upon entry to the hypervisor. > > Anton, what's the purpose of this patch. I thought only RS64 had HMT, > and those boxes don't make hypervisor calls. Disregard...Olof reminded me that SMT uses the same scheme. 
Jake From moilanen at austin.ibm.com Thu May 5 04:50:07 2005 From: moilanen at austin.ibm.com (Jake Moilanen) Date: Wed, 4 May 2005 13:50:07 -0500 Subject: [PATCH] remove io_page_mask In-Reply-To: <20050504043745.GJ13590@krispykreme> References: <20050504043745.GJ13590@krispykreme> Message-ID: <20050504135007.78f449d2.moilanen@austin.ibm.com> > I found an issue with the io_page_mask code when pci_probe_only is not > set (we dont initialise io_page_mask and bad things happen). I was > about to fix it up when I wondered if we can remove it now. > > Ben changed the serial code to check before it goes pounding on > addresses. Im not sure if there were other issues with badly behaving > drivers but my js20 boots here with the following removal patch. As long as the serial code is fixed up, then the JS20 shouldn't need the io_page_mask. Jake From apw at shadowen.org Thu May 5 06:30:57 2005 From: apw at shadowen.org (Andy Whitcroft) Date: Wed, 04 May 2005 21:30:57 +0100 Subject: [3/3] sparsemem memory model for ppc64 Message-ID: Provide the architecture specific implementation for SPARSEMEM for PPC64 systems. Signed-off-by: Andy Whitcroft Signed-off-by: Dave Hansen Signed-off-by: Mike Kravetz (in part) Signed-off-by: Martin Bligh --- arch/ppc64/Kconfig | 13 ++++++++++++- arch/ppc64/kernel/setup.c | 1 + arch/ppc64/mm/Makefile | 2 +- arch/ppc64/mm/init.c | 24 +++++++++++++++++++----- include/asm-ppc64/mmzone.h | 36 +++++++++++++++++++++++------------- include/asm-ppc64/page.h | 3 ++- include/asm-ppc64/sparsemem.h | 16 ++++++++++++++++ 7 files changed, 74 insertions(+), 21 deletions(-) diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/arch/ppc64/Kconfig current/arch/ppc64/Kconfig --- reference/arch/ppc64/Kconfig 2005-05-04 20:54:52.000000000 +0100 +++ current/arch/ppc64/Kconfig 2005-05-04 20:54:54.000000000 +0100 @@ -198,6 +198,13 @@ config HMT This option enables hardware multithreading on RS64 cpus. pSeries systems p620 and p660 have such a cpu type. 
+config ARCH_SELECT_MEMORY_MODEL + def_bool y + +config ARCH_FLATMEM_ENABLE + def_bool y + depends on !NUMA + config ARCH_DISCONTIGMEM_ENABLE def_bool y depends on SMP && PPC_PSERIES @@ -209,6 +216,10 @@ config ARCH_DISCONTIGMEM_DEFAULT config ARCH_FLATMEM_ENABLE def_bool y +config ARCH_SPARSEMEM_ENABLE + def_bool y + depends on ARCH_DISCONTIGMEM_ENABLE + source "mm/Kconfig" config HAVE_ARCH_EARLY_PFN_TO_NID @@ -229,7 +240,7 @@ config NODES_SPAN_OTHER_NODES config NUMA bool "NUMA support" - depends on DISCONTIGMEM + default y if DISCONTIGMEM || SPARSEMEM config SCHED_SMT bool "SMT (Hyperthreading) scheduler support" diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/arch/ppc64/kernel/setup.c current/arch/ppc64/kernel/setup.c --- reference/arch/ppc64/kernel/setup.c 2005-04-11 19:33:15.000000000 +0100 +++ current/arch/ppc64/kernel/setup.c 2005-05-04 20:54:53.000000000 +0100 @@ -1059,6 +1059,7 @@ void __init setup_arch(char **cmdline_p) /* set up the bootmem stuff with available memory */ do_init_bootmem(); + sparse_init(); /* initialize the syscall map in systemcfg */ setup_syscall_map(); diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/arch/ppc64/mm/init.c current/arch/ppc64/mm/init.c --- reference/arch/ppc64/mm/init.c 2005-05-04 20:54:20.000000000 +0100 +++ current/arch/ppc64/mm/init.c 2005-05-04 20:54:54.000000000 +0100 @@ -601,13 +601,21 @@ EXPORT_SYMBOL(page_is_ram); * Initialize the bootmem system and give it all the memory we * have available. 
*/ -#ifndef CONFIG_DISCONTIGMEM +#ifndef CONFIG_NEED_MULTIPLE_NODES void __init do_init_bootmem(void) { unsigned long i; unsigned long start, bootmap_pages; unsigned long total_pages = lmb_end_of_DRAM() >> PAGE_SHIFT; int boot_mapsize; + unsigned long start_pfn, end_pfn; + /* + * Note presence of first (logical/coalasced) LMB which will + * contain RMO region + */ + start_pfn = lmb.memory.region[0].physbase >> PAGE_SHIFT; + end_pfn = start_pfn + (lmb.memory.region[0].size >> PAGE_SHIFT); + memory_present(0, start_pfn, end_pfn); /* * Find an area to use for the bootmem bitmap. Calculate the size of @@ -623,12 +631,18 @@ void __init do_init_bootmem(void) max_pfn = max_low_pfn; - /* add all physical memory to the bootmem map. Also find the first */ + /* add all physical memory to the bootmem map. Also, find the first + * presence of all LMBs*/ for (i=0; i < lmb.memory.cnt; i++) { unsigned long physbase, size; physbase = lmb.memory.region[i].physbase; size = lmb.memory.region[i].size; + if (i) { /* already created mappings for first LMB */ + start_pfn = physbase >> PAGE_SHIFT; + end_pfn = start_pfn + (size >> PAGE_SHIFT); + } + memory_present(0, start_pfn, end_pfn); free_bootmem(physbase, size); } @@ -667,7 +681,7 @@ void __init paging_init(void) free_area_init_node(0, &contig_page_data, zones_size, __pa(PAGE_OFFSET) >> PAGE_SHIFT, zholes_size); } -#endif /* CONFIG_DISCONTIGMEM */ +#endif /* ! 
CONFIG_NEED_MULTIPLE_NODES */ static struct kcore_list kcore_vmem; @@ -698,7 +712,7 @@ module_init(setup_kcore); void __init mem_init(void) { -#ifdef CONFIG_DISCONTIGMEM +#ifdef CONFIG_NEED_MULTIPLE_NODES int nid; #endif pg_data_t *pgdat; @@ -709,7 +723,7 @@ void __init mem_init(void) num_physpages = max_low_pfn; /* RAM is assumed contiguous */ high_memory = (void *) __va(max_low_pfn * PAGE_SIZE); -#ifdef CONFIG_DISCONTIGMEM +#ifdef CONFIG_NEED_MULTIPLE_NODES for_each_online_node(nid) { if (NODE_DATA(nid)->node_spanned_pages != 0) { printk("freeing bootmem node %x\n", nid); diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/arch/ppc64/mm/Makefile current/arch/ppc64/mm/Makefile --- reference/arch/ppc64/mm/Makefile 2005-01-21 14:04:09.000000000 +0000 +++ current/arch/ppc64/mm/Makefile 2005-05-04 20:54:54.000000000 +0100 @@ -6,6 +6,6 @@ EXTRA_CFLAGS += -mno-minimal-toc obj-y := fault.o init.o imalloc.o hash_utils.o hash_low.o tlb.o \ slb_low.o slb.o stab.o mmap.o -obj-$(CONFIG_DISCONTIGMEM) += numa.o +obj-$(CONFIG_NEED_MULTIPLE_NODES) += numa.o obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o obj-$(CONFIG_PPC_MULTIPLATFORM) += hash_native.o diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/include/asm-ppc64/mmzone.h current/include/asm-ppc64/mmzone.h --- reference/include/asm-ppc64/mmzone.h 2005-05-04 20:54:50.000000000 +0100 +++ current/include/asm-ppc64/mmzone.h 2005-05-04 20:54:54.000000000 +0100 @@ -10,9 +10,20 @@ #include #include -#ifdef CONFIG_DISCONTIGMEM +/* generic non-linear memory support: + * + * 1) we will not split memory into more chunks than will fit into the + * flags field of the struct page + */ + + +#ifdef CONFIG_NEED_MULTIPLE_NODES extern struct pglist_data *node_data[]; +/* + * Return a pointer to the node data for node n. + */ +#define NODE_DATA(nid) (node_data[nid]) /* * Following are specific to this numa platform. 
@@ -47,30 +58,27 @@ static inline int pa_to_nid(unsigned lon return nid; } -#define pfn_to_nid(pfn) pa_to_nid((pfn) << PAGE_SHIFT) - -/* - * Return a pointer to the node data for node n. - */ -#define NODE_DATA(nid) (node_data[nid]) - #define node_localnr(pfn, nid) ((pfn) - NODE_DATA(nid)->node_start_pfn) /* * Following are macros that each numa implmentation must define. */ -/* - * Given a kernel address, find the home node of the underlying memory. - */ -#define kvaddr_to_nid(kaddr) pa_to_nid(__pa(kaddr)) - #define node_start_pfn(nid) (NODE_DATA(nid)->node_start_pfn) #define node_end_pfn(nid) (NODE_DATA(nid)->node_end_pfn) #define local_mapnr(kvaddr) \ ( (__pa(kvaddr) >> PAGE_SHIFT) - node_start_pfn(kvaddr_to_nid(kvaddr)) +#ifdef CONFIG_DISCONTIGMEM + +/* + * Given a kernel address, find the home node of the underlying memory. + */ +#define kvaddr_to_nid(kaddr) pa_to_nid(__pa(kaddr)) + +#define pfn_to_nid(pfn) pa_to_nid((unsigned long)(pfn) << PAGE_SHIFT) + /* Written this way to avoid evaluating arguments twice */ #define discontigmem_pfn_to_page(pfn) \ ({ \ @@ -91,6 +99,8 @@ static inline int pa_to_nid(unsigned lon #endif /* CONFIG_DISCONTIGMEM */ +#endif /* CONFIG_NEED_MULTIPLE_NODES */ + #ifdef CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID #define early_pfn_to_nid(pfn) pa_to_nid(((unsigned long)pfn) << PAGE_SHIFT) #endif diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/include/asm-ppc64/page.h current/include/asm-ppc64/page.h --- reference/include/asm-ppc64/page.h 2005-04-11 19:33:45.000000000 +0100 +++ current/include/asm-ppc64/page.h 2005-05-04 20:54:54.000000000 +0100 @@ -224,7 +224,8 @@ extern u64 ppc64_pft_size; /* Log 2 of #define page_to_pfn(page) discontigmem_page_to_pfn(page) #define pfn_to_page(pfn) discontigmem_pfn_to_page(pfn) #define pfn_valid(pfn) discontigmem_pfn_valid(pfn) -#else +#endif +#ifdef CONFIG_FLATMEM #define pfn_to_page(pfn) (mem_map + (pfn)) #define page_to_pfn(page) ((unsigned long)((page) - mem_map)) #define pfn_valid(pfn) ((pfn) < 
max_mapnr) diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/include/asm-ppc64/sparsemem.h current/include/asm-ppc64/sparsemem.h --- reference/include/asm-ppc64/sparsemem.h 1970-01-01 01:00:00.000000000 +0100 +++ current/include/asm-ppc64/sparsemem.h 2005-05-04 20:54:54.000000000 +0100 @@ -0,0 +1,16 @@ +#ifndef _ASM_PPC64_SPARSEMEM_H +#define _ASM_PPC64_SPARSEMEM_H 1 + +#ifdef CONFIG_SPARSEMEM +/* + * SECTION_SIZE_BITS 2^N: how big each section will be + * MAX_PHYSADDR_BITS 2^N: how much physical address space we have + * MAX_PHYSMEM_BITS 2^N: how much memory we can have in that space + */ +#define SECTION_SIZE_BITS 24 +#define MAX_PHYSADDR_BITS 38 +#define MAX_PHYSMEM_BITS 36 + +#endif /* CONFIG_SPARSEMEM */ + +#endif /* _ASM_PPC64_SPARSEMEM_H */ From apw at shadowen.org Thu May 5 06:28:57 2005 From: apw at shadowen.org (Andy Whitcroft) Date: Wed, 04 May 2005 21:28:57 +0100 Subject: [1/3] add early_pfn_to_nid for ppc64 Message-ID: Provide an implementation of early_pfn_to_nid for PPC64. This is used by memory models to determine the node from which to take allocations before the memory allocators are fully initialised. Signed-off-by: Andy Whitcroft Signed-off-by: Dave Hansen Signed-off-by: Martin Bligh --- arch/ppc64/Kconfig | 4 ++++ include/asm-ppc64/mmzone.h | 5 +++++ 2 files changed, 9 insertions(+) diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/arch/ppc64/Kconfig current/arch/ppc64/Kconfig --- reference/arch/ppc64/Kconfig 2005-05-04 20:54:41.000000000 +0100 +++ current/arch/ppc64/Kconfig 2005-05-04 20:54:48.000000000 +0100 @@ -211,6 +211,10 @@ config ARCH_FLATMEM_ENABLE source "mm/Kconfig" +config HAVE_ARCH_EARLY_PFN_TO_NID + bool + default y + # Some NUMA nodes have memory ranges that span # other nodes. 
Even though a pfn is valid and # between a node's start and end pfns, it may not diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/include/asm-ppc64/mmzone.h current/include/asm-ppc64/mmzone.h --- reference/include/asm-ppc64/mmzone.h 2005-05-04 20:54:41.000000000 +0100 +++ current/include/asm-ppc64/mmzone.h 2005-05-04 20:54:48.000000000 +0100 @@ -90,4 +90,9 @@ static inline int pa_to_nid(unsigned lon #define discontigmem_pfn_valid(pfn) ((pfn) < num_physpages) #endif /* CONFIG_DISCONTIGMEM */ + +#ifdef CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID +#define early_pfn_to_nid(pfn) pa_to_nid(((unsigned long)pfn) << PAGE_SHIFT) +#endif + #endif /* _ASM_MMZONE_H_ */ From apw at shadowen.org Thu May 5 06:27:57 2005 From: apw at shadowen.org (Andy Whitcroft) Date: Wed, 04 May 2005 21:27:57 +0100 Subject: [0/3] SPARSEMEM memory model patches for ppc64 Message-ID: After long testing outside -mm we believe that the SPARSEMEM patches are ready for wider testing, please consider for -mm. SPARSEMEM essentially is a replacement for DISCONTIGMEM providing support for non-contiguous memory but with the advantage of handling both inter- and intra-node memory holes. The goal of the implementation was to design a clean memory model covering the needs of both UMA and NUMA discontiguous memory layouts whilst providing a basis for hotplug. This should allow us to consolidate the implementation of the various "discontiguous" memory models whilst trying to fix their shortcomings. Ultimately it should allow us to remove DISCONTIGMEM. 
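To illustrate how a section-based model can represent holes, here is a rough user-space sketch. It is a toy, not the SPARSEMEM implementation: SECTION_SIZE_BITS and MAX_PHYSMEM_BITS are taken from the ppc64 sparsemem.h in patch [3/3], while PAGE_SHIFT = 12, the section_nid[] table and the toy_* function names are assumptions made for the example.

```c
#define PAGE_SHIFT         12   /* assumption: 4K pages */
#define SECTION_SIZE_BITS  24   /* value from the ppc64 sparsemem.h in [3/3] */
#define MAX_PHYSMEM_BITS   36   /* value from the ppc64 sparsemem.h in [3/3] */

#define PFN_SECTION_SHIFT  (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION  (1UL << PFN_SECTION_SHIFT)
#define NR_SECTIONS        (1UL << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS))

/* Toy bookkeeping: 0 means "section absent", otherwise node id + 1. */
static signed char section_nid[NR_SECTIONS];

static unsigned long pfn_to_section(unsigned long pfn)
{
    return pfn >> PFN_SECTION_SHIFT;
}

/* Record that the pfn range [start_pfn, end_pfn) on node nid is populated. */
static void toy_memory_present(int nid, unsigned long start_pfn,
                               unsigned long end_pfn)
{
    unsigned long pfn;

    /* Round down so that we step section by section. */
    start_pfn &= ~(PAGES_PER_SECTION - 1);
    for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
        unsigned long section = pfn_to_section(pfn);

        if (section < NR_SECTIONS)
            section_nid[section] = (signed char)(nid + 1);
    }
}

/* A pfn is usable only if its section was marked present, so a hole
 * inside a node and a hole between nodes look exactly the same. */
static int toy_pfn_valid(unsigned long pfn)
{
    unsigned long section = pfn_to_section(pfn);

    return section < NR_SECTIONS && section_nid[section] != 0;
}
```

With these constants each section covers 16MB and there are 4096 sections; any range that was never reported as present is simply left unmarked.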
Following this mail are 3 patches which provide SPARSEMEM for ppc64: [1/3] add early_pfn_to_nid for ppc64 [2/3] add memory present for ppc64 [3/3] sparsemem memory model for ppc64 These apply on top of the generic/i386 patches recently sent out to linux-mm: [1/6] generify early_pfn_to_nid [2/6] generify memory present [3/6] sparsemem memory model [4/6] sparsemem memory model for i386 [5/6] sparsemem swiss cheese numa layouts [6/6] sparsemem hotplug base These patches have been compiled, booted and tested on 2.6.12-rc2 (plus the -mm patches listed below). They have been compile and boot tested against 2.6.12-rc3-mm2. They do assume a number of patches already incorporated into -mm including the latest configuration updates from Dave Hansen . remove-non-discontig-use-of-pgdat-node_mem_map.patch resubmit-sparsemem-base-early_pfn_to_nid-works-before-sparse-is-initialized.patch resubmit-sparsemem-base-simple-numa-remap-space-allocator.patch resubmit-sparsemem-base-reorganize-page-flags-bit-operations.patch resubmit-sparsemem-base-teach-discontig-about-sparse-ranges.patch create-mm-kconfig-for-arch-independent-memory-options.patch make-each-arch-use-mm-kconfig.patch update-all-defconfigs-for-arch_discontigmem_enable.patch introduce-new-kconfig-option-for-numa-or-discontig.patch sparsemem-fix-minor-defaults-issue-in-mm-kconfig.patch mm-kconfig-kill-unused-arch_flatmem_disable.patch mm-kconfig-hide-memory-model-selection-menu.patch Comments/feedback appreciated. -apw From apw at shadowen.org Thu May 5 06:29:57 2005 From: apw at shadowen.org (Andy Whitcroft) Date: Wed, 04 May 2005 21:29:57 +0100 Subject: [2/3] add memory present for ppc64 Message-ID: Provide hooks for PPC64 to allow memory models to be informed of installed memory areas. This allows SPARSEMEM to instantiate mem_map for the populated areas. 
Signed-off-by: Andy Whitcroft
Signed-off-by: Dave Hansen
Signed-off-by: Martin Bligh
---
 Kconfig   | 4 ++--
 mm/numa.c | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/arch/ppc64/Kconfig current/arch/ppc64/Kconfig
--- reference/arch/ppc64/Kconfig 2005-05-04 20:54:50.000000000 +0100
+++ current/arch/ppc64/Kconfig 2005-05-04 20:54:50.000000000 +0100
@@ -212,8 +212,8 @@ config ARCH_FLATMEM_ENABLE
 source "mm/Kconfig"

 config HAVE_ARCH_EARLY_PFN_TO_NID
-	bool
-	default y
+	def_bool y
+	depends on NEED_MULTIPLE_NODES

 # Some NUMA nodes have memory ranges that span
 # other nodes. Even though a pfn is valid and
diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/arch/ppc64/mm/numa.c current/arch/ppc64/mm/numa.c
--- reference/arch/ppc64/mm/numa.c 2005-04-11 19:33:15.000000000 +0100
+++ current/arch/ppc64/mm/numa.c 2005-05-04 20:54:50.000000000 +0100
@@ -440,6 +440,8 @@ new_range:
 		for (i = start ; i < (start+size); i += MEMORY_INCREMENT)
 			numa_memory_lookup_table[i >> MEMORY_INCREMENT_SHIFT] =
 				numa_domain;
+		memory_present(numa_domain, start >> PAGE_SHIFT,
+				(start + size) >> PAGE_SHIFT);

 		if (--ranges)
 			goto new_range;
@@ -481,6 +483,7 @@ static void __init setup_nonnuma(void)

 	for (i = 0 ; i < top_of_ram; i += MEMORY_INCREMENT)
 		numa_memory_lookup_table[i >> MEMORY_INCREMENT_SHIFT] = 0;
+	memory_present(0, 0, init_node_data[0].node_end_pfn);
 }

 static void __init dump_numa_topology(void)

From olof at lixom.net Thu May 5 08:08:19 2005
From: olof at lixom.net (Olof Johansson)
Date: Wed, 4 May 2005 17:08:19 -0500
Subject: [PATCH] remove io_page_mask
In-Reply-To: <20050504135007.78f449d2.moilanen@austin.ibm.com>
References: <20050504043745.GJ13590@krispykreme>
	<20050504135007.78f449d2.moilanen@austin.ibm.com>
Message-ID: <20050504220815.GA29571@austin.ibm.com>

On Wed, May 04, 2005 at 01:50:07PM -0500, Jake Moilanen wrote:
> > I found an issue with the io_page_mask code when pci_probe_only is not
> > set (we don't initialise io_page_mask and bad things happen). I was
> > about to fix it up when I wondered if we can remove it now.
> >
> > Ben changed the serial code to check before it goes pounding on
> > addresses. I'm not sure if there were other issues with badly behaving
> > drivers but my js20 boots here with the following removal patch.
>
> As long as the serial code is fixed up, then the JS20 shouldn't need the
> io_page_mask.

I tried booting 2.6.12-rc3-mm2 + this patch on a JS20 here and it seemed
to go well. Serial options enabled were:

CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set

# Non-8250 serial port support
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y

-Olof

From sonny at burdell.org Thu May 5 10:39:54 2005
From: sonny at burdell.org (Sonny Rao)
Date: Wed, 4 May 2005 20:39:54 -0400
Subject: 2.6.11 e1000 EEH MMIO failure
In-Reply-To: <20050504162855.GH11745@austin.ibm.com>
References: <20050503050212.GA22395@kevlar.burdell.org>
	<20050503181747.GB7870@kevlar.burdell.org>
	<20050503225508.GG11745@austin.ibm.com>
	<20050504003305.GB11855@kevlar.burdell.org>
	<20050504162855.GH11745@austin.ibm.com>
Message-ID: <20050505003954.GB7367@kevlar.burdell.org>

On Wed, May 04, 2005 at 11:28:55AM -0500, Linas Vepstas wrote:
> On Tue, May 03, 2005 at 08:33:05PM -0400, Sonny Rao was heard to remark:
> >
> > Ah okay cool, so in the future Linux will be able to smartly handle
> > it, very nice. Unfortunately I can't really test your patch because
> > several other people need to use the machine which is normally
> > partitioned up (and that particular device is left out of any LPAR
> > config). I just happened to boot the full-system partition to do some
> > tests and noticed the problem.
>
> There's supposed to be some code that allows slots to be dynamically
> added and removed from running partitions, but I've never tried it
> myself.
>
> > Again, if someone wants to do something with that card, let me know,
> > otherwise I'm going to toss it out.
>
> FWIW, field experience shows that nine out of ten failures are due to
> poorly seated PCI cards. Before you chuck it, you might want to remove
> it, make sure there are no iron filings in the slot, and try again.
>
> Let me know how that goes; I'd like to add this to my bag of "real
> world" experience with this thing.

Well, I tried to hot-plug that thing back in and boot a partition with
it, and as far as I can tell the card seems to be dead. Not sure how
to interpret this. Oh well, thanks.

Sonny

From david at gibson.dropbear.id.au Thu May 5 11:42:56 2005
From: david at gibson.dropbear.id.au (David Gibson)
Date: Thu, 5 May 2005 11:42:56 +1000
Subject: Patch to kill ioremap_mm
Message-ID: <20050505014256.GE18270@localhost.localdomain>

Can anyone see any problems with this patch? If not, I'll send it on to
akpm.

Currently ppc64 has two mm_structs for the kernel, init_mm and also
ioremap_mm. The latter really isn't necessary: this patch abolishes
it, instead restricting vmallocs to the lower 1TB of the init_mm's
range and placing io mappings in the upper 1TB. This simplifies the
code in a number of places, and gets rid of an unnecessary set of
pagetables.

Index: working-2.6/include/asm-ppc64/pgtable.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/pgtable.h 2005-05-05 10:58:04.000000000 +1000
+++ working-2.6/include/asm-ppc64/pgtable.h 2005-05-05 11:12:59.000000000 +1000
@@ -53,7 +53,8 @@
  * Define the address range of the vmalloc VM area.
  */
 #define VMALLOC_START (0xD000000000000000ul)
-#define VMALLOC_END (VMALLOC_START + EADDR_MASK)
+#define VMALLOC_SIZE (0x10000000000UL)
+#define VMALLOC_END (VMALLOC_START + VMALLOC_SIZE)

 /*
  * Bits in a linux-style PTE. These match the bits in the
@@ -239,9 +240,6 @@
 /* This now only contains the vmalloc pages */
 #define pgd_offset_k(address) pgd_offset(&init_mm, address)

-/* to find an entry in the ioremap page-table-directory */
-#define pgd_offset_i(address) (ioremap_pgd + pgd_index(address))
-
 /*
  * The following only work if pte_present() is true.
  * Undefined behaviour if not..
@@ -459,15 +457,12 @@
 #define __HAVE_ARCH_PTE_SAME
 #define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)

-extern unsigned long ioremap_bot, ioremap_base;
-
 #define pmd_ERROR(e) \
 	printk("%s:%d: bad pmd %08x.\n", __FILE__, __LINE__, pmd_val(e))
 #define pgd_ERROR(e) \
 	printk("%s:%d: bad pgd %08x.\n", __FILE__, __LINE__, pgd_val(e))

 extern pgd_t swapper_pg_dir[];
-extern pgd_t ioremap_dir[];

 extern void paging_init(void);

Index: working-2.6/include/asm-ppc64/imalloc.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/imalloc.h 2005-05-05 10:58:04.000000000 +1000
+++ working-2.6/include/asm-ppc64/imalloc.h 2005-05-05 11:13:39.000000000 +1000
@@ -4,9 +4,9 @@
 /*
  * Define the address range of the imalloc VM area.
  */
-#define PHBS_IO_BASE IOREGIONBASE
-#define IMALLOC_BASE (IOREGIONBASE + 0x80000000ul) /* Reserve 2 gigs for PHBs */
-#define IMALLOC_END (IOREGIONBASE + EADDR_MASK)
+#define PHBS_IO_BASE VMALLOC_END
+#define IMALLOC_BASE (PHBS_IO_BASE + 0x80000000ul) /* Reserve 2 gigs for PHBs */
+#define IMALLOC_END (VMALLOC_START + EADDR_MASK)


 /* imalloc region types */
@@ -21,4 +21,6 @@
 			int region_type);
 unsigned long im_free(void *addr);

+extern unsigned long ioremap_bot;
+
 #endif /* _PPC64_IMALLOC_H */
Index: working-2.6/include/asm-ppc64/page.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/page.h 2005-05-05 10:58:04.000000000 +1000
+++ working-2.6/include/asm-ppc64/page.h 2005-05-05 11:14:02.000000000 +1000
@@ -202,9 +202,7 @@
 #define PAGE_OFFSET ASM_CONST(0xC000000000000000)
 #define KERNELBASE PAGE_OFFSET
 #define VMALLOCBASE ASM_CONST(0xD000000000000000)
-#define IOREGIONBASE ASM_CONST(0xE000000000000000)

-#define IO_REGION_ID (IOREGIONBASE >> REGION_SHIFT)
 #define VMALLOC_REGION_ID (VMALLOCBASE >> REGION_SHIFT)
 #define KERNEL_REGION_ID (KERNELBASE >> REGION_SHIFT)
 #define USER_REGION_ID (0UL)
Index: working-2.6/arch/ppc64/kernel/eeh.c
===================================================================
--- working-2.6.orig/arch/ppc64/kernel/eeh.c 2005-04-26 15:37:55.000000000 +1000
+++ working-2.6/arch/ppc64/kernel/eeh.c 2005-05-05 11:23:40.000000000 +1000
@@ -505,7 +505,7 @@
 	pte_t *ptep;
 	unsigned long pa;

-	ptep = find_linux_pte(ioremap_mm.pgd, token);
+	ptep = find_linux_pte(init_mm.pgd, token);
 	if (!ptep)
 		return token;
 	pa = pte_pfn(*ptep) << PAGE_SHIFT;
Index: working-2.6/arch/ppc64/kernel/process.c
===================================================================
--- working-2.6.orig/arch/ppc64/kernel/process.c 2005-04-26 15:37:55.000000000 +1000
+++ working-2.6/arch/ppc64/kernel/process.c 2005-05-05 11:16:20.000000000 +1000
@@ -58,14 +58,6 @@
 struct task_struct *last_task_used_altivec = NULL;
 #endif

-struct mm_struct ioremap_mm = {
-	.pgd = ioremap_dir,
-	.mm_users = ATOMIC_INIT(2),
-	.mm_count = ATOMIC_INIT(1),
-	.cpu_vm_mask = CPU_MASK_ALL,
-	.page_table_lock = SPIN_LOCK_UNLOCKED,
-};
-
 /*
  * Make sure the floating-point register state in the
  * the thread_struct is up to date for task tsk.
Index: working-2.6/include/asm-ppc64/processor.h
===================================================================
--- working-2.6.orig/include/asm-ppc64/processor.h 2005-04-26 15:38:02.000000000 +1000
+++ working-2.6/include/asm-ppc64/processor.h 2005-05-05 11:24:46.000000000 +1000
@@ -590,16 +590,6 @@
 }

 /*
- * Note: the vm_start and vm_end fields here should *not*
- * be in kernel space. (Could vm_end == vm_start perhaps?)
- */
-#define IOREMAP_MMAP { &ioremap_mm, 0, 0x1000, NULL, \
-	PAGE_SHARED, VM_READ | VM_WRITE | VM_EXEC, \
-	1, NULL, NULL }
-
-extern struct mm_struct ioremap_mm;
-
-/*
  * Return saved PC of a blocked thread. For now, this is the "user" PC
  */
 #define thread_saved_pc(tsk) \
Index: working-2.6/arch/ppc64/mm/hash_utils.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/hash_utils.c 2005-05-05 10:58:04.000000000 +1000
+++ working-2.6/arch/ppc64/mm/hash_utils.c 2005-05-05 11:17:03.000000000 +1000
@@ -310,10 +310,6 @@

 		vsid = get_vsid(mm->context.id, ea);
 		break;
-	case IO_REGION_ID:
-		mm = &ioremap_mm;
-		vsid = get_kernel_vsid(ea);
-		break;
 	case VMALLOC_REGION_ID:
 		mm = &init_mm;
 		vsid = get_kernel_vsid(ea);
Index: working-2.6/arch/ppc64/mm/init.c
===================================================================
--- working-2.6.orig/arch/ppc64/mm/init.c 2005-05-05 10:58:04.000000000 +1000
+++ working-2.6/arch/ppc64/mm/init.c 2005-05-05 11:22:54.000000000 +1000
@@ -144,7 +144,7 @@

 	pte = pte_offset_kernel(pmd, addr);
 	do {
-		pte_t ptent = ptep_get_and_clear(&ioremap_mm, addr, pte);
+		pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
 		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 }
@@ -181,13 +181,13 @@

 static void unmap_im_area(unsigned long addr, unsigned long end)
 {
-	struct mm_struct *mm = &ioremap_mm;
+	struct mm_struct *mm = &init_mm;
 	unsigned long next;
 	pgd_t *pgd;

 	spin_lock(&mm->page_table_lock);

-	pgd = pgd_offset_i(addr);
+	pgd = pgd_offset_k(addr);
 	flush_cache_vunmap(addr, end);
 	do {
 		next = pgd_addr_end(addr, end);
@@ -214,21 +214,21 @@
 	unsigned long vsid;

 	if (mem_init_done) {
-		spin_lock(&ioremap_mm.page_table_lock);
-		pgdp = pgd_offset_i(ea);
-		pudp = pud_alloc(&ioremap_mm, pgdp, ea);
+		spin_lock(&init_mm.page_table_lock);
+		pgdp = pgd_offset_k(ea);
+		pudp = pud_alloc(&init_mm, pgdp, ea);
 		if (!pudp)
 			return -ENOMEM;
-		pmdp = pmd_alloc(&ioremap_mm, pudp, ea);
+		pmdp = pmd_alloc(&init_mm, pudp, ea);
 		if (!pmdp)
 			return -ENOMEM;
-		ptep = pte_alloc_kernel(&ioremap_mm, pmdp, ea);
+		ptep = pte_alloc_kernel(&init_mm, pmdp, ea);
 		if (!ptep)
 			return -ENOMEM;
 		pa = abs_to_phys(pa);
-		set_pte_at(&ioremap_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT,
+		set_pte_at(&init_mm, ea, ptep, pfn_pte(pa >> PAGE_SHIFT,
 				__pgprot(flags)));
-		spin_unlock(&ioremap_mm.page_table_lock);
+		spin_unlock(&init_mm.page_table_lock);
 	} else {
 		unsigned long va, vpn, hash, hpteg;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you. NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/people/dgibson

From olof at lixom.net Thu May 5 12:31:32 2005
From: olof at lixom.net (Olof Johansson)
Date: Wed, 4 May 2005 21:31:32 -0500
Subject: [3/3] sparsemem memory model for ppc64
In-Reply-To:
References:
Message-ID: <20050505023132.GB20283@austin.ibm.com>

Hi,

Just two formatting nitpicks below.

-Olof

On Wed, May 04, 2005 at 09:30:57PM +0100, Andy Whitcroft wrote:
> diff -X /home/apw/brief/lib/vdiff.excl -rupN reference/arch/ppc64/mm/init.c current/arch/pp