[PATCH] powerpc/ftrace: Handle large kernel configs

Christophe Leroy christophe.leroy at csgroup.eu
Tue Nov 30 00:09:03 AEDT 2021


Hi Naveen,

On 16/10/2018 at 22:25, Naveen N. Rao wrote:
> Currently, we expect to be able to reach ftrace_caller() from all
> ftrace-enabled functions through a single relative branch. With large
> kernel configs, we see functions outside of 32MB of ftrace_caller()
> causing ftrace_init() to bail.
> 
> In such configurations, gcc/ld emits two types of trampolines for mcount():
> 1. A long_branch, which has a single branch to mcount() for functions that
>     are one hop away from mcount():
> 	c0000000019e8544 <00031b56.long_branch._mcount>:
> 	c0000000019e8544:	4a 69 3f ac 	b       c00000000007c4f0 <._mcount>
> 
> 2. A plt_branch, for functions that are farther away from mcount():
> 	c0000000051f33f8 <0008ba04.plt_branch._mcount>:
> 	c0000000051f33f8:	3d 82 ff a4 	addis   r12,r2,-92
> 	c0000000051f33fc:	e9 8c 04 20 	ld      r12,1056(r12)
> 	c0000000051f3400:	7d 89 03 a6 	mtctr   r12
> 	c0000000051f3404:	4e 80 04 20 	bctr
> 
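For reference, the long_branch instruction quoted above can be decoded by hand. The following is only a sketch, assuming the standard I-form branch encoding (primary opcode 18, a 24-bit LI field shifted left by 2, then the AA and LK bits); the helper names are hypothetical and are not the kernel's is_b_op() or find_bl_target(). The signed 26-bit displacement is what limits such a branch to roughly +/-32MB:

```c
#include <stdint.h>

/* Hypothetical helper: non-zero if insn is an I-form branch (primary opcode 18). */
static int is_iform_branch(uint32_t insn)
{
	return (insn >> 26) == 18;
}

/* Hypothetical helper: target of an I-form branch located at addr. */
static uint64_t iform_branch_target(uint64_t addr, uint32_t insn)
{
	/* LI field in place, i.e. already multiplied by 4; AA and LK masked off */
	int64_t offset = insn & 0x03fffffc;

	/* Sign-extend the 26-bit displacement */
	if (offset & 0x02000000)
		offset -= 0x04000000;

	/* AA bit set: the displacement is an absolute address */
	if (insn & 0x2)
		return (uint64_t)offset;

	return addr + (uint64_t)offset;
}
```

With addr = 0xc0000000019e8544 and insn = 0x4a693fac, the displacement is -0x196c054 and the target is 0xc00000000007c4f0, which matches the ._mcount address in the listing.
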
> We can reuse those trampolines for ftrace if we can have those
> trampolines go to ftrace_caller() instead. However, with ABIv2, we
> cannot depend on r2 being valid. As such, we use only the long_branch
> trampolines by patching those to instead branch to ftrace_caller or
> ftrace_regs_caller.
> 
> In addition, we add additional trampolines around .text and .init.text
> to catch locations that are covered by the plt branches. This allows
> ftrace to work with most large kernel configurations.
> 
> For now, we always patch the trampolines to go to ftrace_regs_caller,
> which is slightly inefficient. This can be optimized further at a later
> point.
> 
> Signed-off-by: Naveen N. Rao <naveen.n.rao at linux.vnet.ibm.com>
> ---
> Since RFC:
> - Change to patch long_branch to go to ftrace_caller, rather than
>    patching mcount()
> - Stop using plt_branch since it can't be relied on for ABIv2
> - Add trampolines around .text and .init.text to catch remaining
>    locations
> 
> - Naveen
> 
>   arch/powerpc/kernel/trace/ftrace.c    | 261 +++++++++++++++++++++++++-
>   arch/powerpc/kernel/trace/ftrace_64.S |  12 ++
>   arch/powerpc/kernel/vmlinux.lds.S     |  13 +-
>   3 files changed, 281 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
> index 4bfbb54dee51..4bf051d3e21e 100644
> --- a/arch/powerpc/kernel/trace/ftrace.c
> +++ b/arch/powerpc/kernel/trace/ftrace.c

...

> +/*
> + * If this is a compiler generated long_branch trampoline (essentially, a
> + * trampoline that has a branch to _mcount()), we re-write the branch to
> + * instead go to ftrace_[regs_]caller() and note down the location of this
> + * trampoline.
> + */
> +static int setup_mcount_compiler_tramp(unsigned long tramp)
> +{
> +	int i, op;
> +	unsigned long ptr;
> +	static unsigned long ftrace_plt_tramps[NUM_FTRACE_TRAMPS];
> +
> +	/* Is this a known long jump tramp? */
> +	for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
> +		if (!ftrace_tramps[i])
> +			break;
> +		else if (ftrace_tramps[i] == tramp)
> +			return 0;
> +
> +	/* Is this a known plt tramp? */
> +	for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
> +		if (!ftrace_plt_tramps[i])
> +			break;
> +		else if (ftrace_plt_tramps[i] == tramp)
> +			return -1;

I don't understand how this is supposed to work.
ftrace_plt_tramps[] is a static table, so it is zero-initialized at startup.
The loop above therefore breaks on its first iteration.

And ftrace_plt_tramps[] is never set anywhere.

So the check looks useless to me.

Am I missing something?
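
To illustrate the concern, here is a minimal userspace sketch of the same loop shape. NUM_TRAMPS, plt_tramps[] and lookup_plt_tramp() are hypothetical stand-ins for the names in the patch, and the iteration counter is added only for demonstration:

```c
#define NUM_TRAMPS 8	/* hypothetical size, stands in for NUM_FTRACE_TRAMPS */

/* Static storage: every slot starts as 0, and nothing ever writes to it. */
static unsigned long plt_tramps[NUM_TRAMPS];

/*
 * Same shape as the "known plt tramp" loop quoted above.
 * *iterations reports how many slots were actually examined.
 */
static int lookup_plt_tramp(unsigned long tramp, int *iterations)
{
	int i;

	*iterations = 0;
	for (i = 0; i < NUM_TRAMPS; i++) {
		(*iterations)++;
		if (!plt_tramps[i])
			break;		/* first slot is 0, so we always leave here */
		else if (plt_tramps[i] == tramp)
			return -1;	/* unreachable while the table stays empty */
	}

	return 0;
}
```

For any non-zero tramp value, lookup_plt_tramp() returns 0 after examining a single slot, so the -1 path can never be taken unless something stores into the table.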

Thanks
Christophe

> +
> +	/* New trampoline -- read where this goes */
> +	if (probe_kernel_read(&op, (void *)tramp, sizeof(int))) {
> +		pr_debug("Fetching opcode failed.\n");
> +		return -1;
> +	}
> +
> +	/* Is this a 24 bit branch? */
> +	if (!is_b_op(op)) {
> +		pr_debug("Trampoline is not a long branch tramp.\n");
> +		return -1;
> +	}
> +
> +	/* lets find where the pointer goes */
> +	ptr = find_bl_target(tramp, op);
> +
> +	if (ptr != ppc_global_function_entry((void *)_mcount)) {
> +		pr_debug("Trampoline target %p is not _mcount\n", (void *)ptr);
> +		return -1;
> +	}
> +
> +	/* Let's re-write the tramp to go to ftrace_[regs_]caller */
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +	ptr = ppc_global_function_entry((void *)ftrace_regs_caller);
> +#else
> +	ptr = ppc_global_function_entry((void *)ftrace_caller);
> +#endif
> +	if (!create_branch((void *)tramp, ptr, 0)) {
> +		pr_debug("%ps is not reachable from existing mcount tramp\n",
> +				(void *)ptr);
> +		return -1;
> +	}
> +
> +	if (patch_branch((unsigned int *)tramp, ptr, 0)) {
> +		pr_debug("REL24 out of range!\n");
> +		return -1;
> +	}
> +
> +	if (add_ftrace_tramp(tramp)) {
> +		pr_debug("No tramp locations left\n");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +


...


