Thu, 12 Jun 2008

stop_machine latency

Kathy Staples and I wrote a little program to measure scheduling latency on every CPU of a machine. It sets CPU affinity and high priority (SCHED_FIFO, prio 50) for each thread, which then spins calling gettimeofday() for a given duration. The maximum gap between successive gettimeofday() calls is reported for each CPU.
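For reference, here's a minimal sketch of the per-CPU measurement loop; the real program spawns one such measurement per CPU and gathers the results, and the function names and 10-second duration below are just illustrative (SCHED_FIFO needs root):

/* Minimal sketch of the per-CPU latency measurement described above. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/time.h>

/* Spin on gettimeofday() for 'seconds'; return the largest gap seen (usec). */
static long max_gap_on_cpu(int cpu, int seconds)
{
	cpu_set_t set;
	struct sched_param sp = { .sched_priority = 50 };
	struct timeval start, prev, now;
	long max_gap = 0;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	sched_setaffinity(0, sizeof(set), &set);	/* pin to this CPU */
	sched_setscheduler(0, SCHED_FIFO, &sp);		/* realtime, prio 50 */

	gettimeofday(&start, NULL);
	prev = start;
	for (;;) {
		long gap;

		gettimeofday(&now, NULL);
		gap = (now.tv_sec - prev.tv_sec) * 1000000L
			+ (now.tv_usec - prev.tv_usec);
		if (gap > max_gap)
			max_gap = gap;
		if (now.tv_sec - start.tv_sec >= seconds)
			break;
		prev = now;
	}
	return max_gap;
}

int main(void)
{
	/* One CPU shown here; the real program does this for every CPU. */
	printf("cpu 1 max gap: %ld usec\n", max_gap_on_cpu(1, 10));
	return 0;
}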

I tested this on an old 18-way Power4 box sitting around the lab: CPU 0 is used for the parent process, and the latency is measured on the other CPUs. This was run 100 times. Then a variant was run which did a module insertion on CPU 0 (module insertion calls stop_machine, which is what we were trying to measure).

The results are interesting and a little surprising. Normal max latency is around 35 usec, with stop_machine increasing it to around 100 usec. There's obviously something running periodically on CPU 2: for both runs I had to remove one horrific 150ms latency result (1000 times the average!), but there's still a noticeable spike there. I suspect CPU 1 is low because CPU 0 is mainly idle (same core).

But more concerning is that latency seems to go up with higher CPU numbers, whereas I expected it to be worst on the lower CPUs. We launch the stop_machine threads in CPU order, so I expected the lower CPUs to wait the longest.

We're running modprobe on CPU 0, which means the stop_machine control thread runs there, too. It loops through creating the 17 other threads: as CPU 0 is busy, each new thread gets scheduled on some other idle CPU. The first thing each thread does is try to move itself to its proper CPU.

I suspect what is happening is that we're creating the 17 threads fast enough that they all end up queued on the migration queue for CPU 0 at once: that queueing uses list_add() rather than list_add_tail(), so the migration thread in fact deploys them in reverse CPU order.
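As a toy illustration of why that matters, here's a userspace re-implementation of the two helpers (not the real migration-queue code; the struct name merely echoes the kernel's): list_add() pushes at the head, so entries come back out last-in-first-out, while list_add_tail() appends and preserves the order they were queued in.

/* Toy copy of the kernel's list_add()/list_add_tail() semantics,
 * just to show the ordering difference. */
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static inline void __list_add(struct list_head *new,
			      struct list_head *prev, struct list_head *next)
{
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

/* Insert just after the head: the list behaves like a stack (LIFO). */
static inline void list_add(struct list_head *new, struct list_head *head)
{
	__list_add(new, head, head->next);
}

/* Insert just before the head: the list behaves like a queue (FIFO). */
static inline void list_add_tail(struct list_head *new, struct list_head *head)
{
	__list_add(new, head->prev, head);
}

struct migration_req { struct list_head list; int dest_cpu; };

int main(void)
{
	struct list_head queue = { &queue, &queue };
	struct migration_req reqs[4];
	struct list_head *pos;

	/* Queue requests for CPUs 1..4 the way stop_machine ends up doing. */
	for (int i = 0; i < 4; i++) {
		reqs[i].dest_cpu = i + 1;
		list_add(&reqs[i].list, &queue);
	}

	/* Walk the queue as the migration thread would: prints "4 3 2 1".
	 * Switch list_add() to list_add_tail() above and it prints "1 2 3 4". */
	for (pos = queue.next; pos != &queue; pos = pos->next)
		printf("%d ", ((struct migration_req *)pos)->dest_cpu);
	printf("\n");
	return 0;
}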

My simplified version of stop_machine is more intelligent: it moves all the threads to their correct CPUs before waking them all up. This should fix the ordering problem as well as reduce overall latency.
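The same idea is easy to picture in userspace terms. Here's a hedged pthread sketch (invented names, not the actual kernel patch) of "pin everything first, then release them all at once", rather than waking each thread and letting it migrate itself:

/* Sketch of "bind each thread to its CPU before releasing any of them".
 * Build with: gcc -O2 -pthread sketch.c (needs >= NR_THREADS CPUs). */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define NR_THREADS 4

static pthread_barrier_t go;

static void *stop_cpu(void *arg)
{
	long cpu = (long)arg;

	/* Already pinned by the creator: just wait for the single "go". */
	pthread_barrier_wait(&go);
	printf("thread for cpu %ld released\n", cpu);
	return NULL;
}

int main(void)
{
	pthread_t tid[NR_THREADS];

	pthread_barrier_init(&go, NULL, NR_THREADS + 1);

	for (long cpu = 0; cpu < NR_THREADS; cpu++) {
		pthread_attr_t attr;
		cpu_set_t set;

		/* Pin the thread to its CPU *before* it runs anything... */
		CPU_ZERO(&set);
		CPU_SET(cpu, &set);
		pthread_attr_init(&attr);
		pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
		pthread_create(&tid[cpu], &attr, stop_cpu, (void *)cpu);
		pthread_attr_destroy(&attr);
	}

	/* ...then release them all at once, rather than waking each one
	 * and letting it migrate itself afterwards. */
	pthread_barrier_wait(&go);

	for (int i = 0; i < NR_THREADS; i++)
		pthread_join(tid[i], NULL);
	return 0;
}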


[/tech] permanent link