[PATCH v2 2/2] powerpc/mm: Add memory_block_size as a kernel parameter

David Hildenbrand david at redhat.com
Mon Jun 19 20:35:01 AEST 2023


On 09.06.23 08:08, Aneesh Kumar K.V wrote:
> Certain devices can possess non-standard memory capacities, not constrained
> to multiples of 1GB. Provide a kernel parameter so that we can map the
> device memory completely on memory hotplug.

So, the unfortunate thing is that these devices would have worked out of 
the box before the memory block size was increased from 256 MiB to 1 GiB 
in these setups. Now, one has to fine-tune the memory block size. The 
only other arch that I know, which supports setting the memory block 
size, is x86 for special (large) UV systems -- and at least in the past 
128 MiB vs. 2 GiB memory blocks made a performance difference during 
boot (maybe no longer today, who knows).


Obviously, less tunable and getting stuff simply working out of the box 
is preferable.

Two questions:

1) Isn't there a way to improve auto-detection to fallback to 256 MiB in 
these setups, to avoid specifying these parameters?

2) Is the 256 MiB -> 1 GiB memory block size switch really worth it? On 
x86-64, experiments (with direct map fragmentation) showed that the 
effective performance boost is pretty insignificant, so I wonder how big 
the 1 GiB direct map performance improvement is.


I guess the only real issue with 256 MiB memory blocks and 1 GiB direct 
mapping is memory unplug of boot memory: when unplugging a 256 MiB 
block, one would have to remap the 1 GiB range using 2 MiB ranges.

... I was wondering what would happen if you simply leave the direct 
mapping in this corner case in place instead of doing this remapping. 
IOW, remove the memory but keep the direct map pointing at the removed 
memory. Nobody should be touching it, or are there any cases where that 
could hurt?


Or is there any other reason why we really want 1 GiB memory blocks 
instead of to defaulting to 256 MiB the way it used to be?

Thanks!

> 
> Restrict memory_block_size value to a power of 2 value similar to LMB size.
> The memory block size should also be more than the section size.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
> ---
>   .../admin-guide/kernel-parameters.txt         |  3 +++
>   arch/powerpc/kernel/setup_64.c                | 23 +++++++++++++++++++
>   arch/powerpc/mm/init_64.c                     | 17 ++++++++++----
>   3 files changed, 38 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 9e5bab29685f..833b8c5b4b4c 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3190,6 +3190,9 @@
>   			Note that even when enabled, there are a few cases where
>   			the feature is not effective.
>   
> +	memory_block_size=size [PPC]
> +			 Use this parameter to configure the memory block size value.
> +
>   	memtest=	[KNL,X86,ARM,M68K,PPC,RISCV] Enable memtest
>   			Format: <integer>
>   			default : 0 <disable>
> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index 246201d0d879..cbdb924462c7 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -892,6 +892,29 @@ unsigned long memory_block_size_bytes(void)
>   
>   	return MIN_MEMORY_BLOCK_SIZE;
>   }
> +
> +/*
> + * Restrict to a power of 2 value for memblock which is larger than
> + * section size
> + */
> +static int __init parse_mem_block_size(char *ptr)
> +{
> +	unsigned int order;
> +	unsigned long size = memparse(ptr, NULL);
> +
> +	order = fls64(size);
> +	if (!order)
> +		return 0;
> +
> +	order--;
> +	if (order < SECTION_SIZE_BITS)
> +		return 0;
> +
> +	memory_block_size = 1UL << order;
> +
> +	return 0;
> +}
> +early_param("memory_block_size", parse_mem_block_size);
>   #endif
>   
>   #if defined(CONFIG_PPC_INDIRECT_PIO) || defined(CONFIG_PPC_INDIRECT_MMIO)
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 97a9163f1280..5e6dde593ea3 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -549,13 +549,20 @@ static int __init probe_memory_block_size(unsigned long node, const char *uname,
>   	return 0;
>   }
>   
> -/*
> - * start with 1G memory block size. Early init will
> - * fix this with correct value.
> - */
> -unsigned long memory_block_size __ro_after_init = 1UL << 30;
> +unsigned long memory_block_size __ro_after_init;
>   static void __init early_init_memory_block_size(void)
>   {
> +	/*
> +	 * if it is set via early param just return.
> +	 */
> +	if (memory_block_size)
> +		return;
> +
> +	/*
> +	 * start with 1G memory block size. update_memory_block_size()
> +	 * will derive the right value based on device tree details.
> +	 */
> +	memory_block_size = 1UL << 30;
>   	/*
>   	 * We need to do memory_block_size probe early so that
>   	 * radix__early_init_mmu() can use this as limit for

-- 
Cheers,

David / dhildenb



More information about the Linuxppc-dev mailing list