Free Software programmer
This blog existed before my current employment, and obviously reflects my own opinions and not theirs. This work is licensed under a Creative Commons Attribution 2.1 Australia License.
Fri, 29 Aug 2008
Linux Next Graphing

Some neat stats: just graphing the size of the bz2 patch for linux-next for the last 108 days (12 May through 28 August). Since Stephen doesn't produce patches on weekends, you can see the gaps (dashed lines are Mondays, Australian time).

The -rc1 dip is really clear (these patches are produced against the last labelled Linus kernel, hence the one-day drop), and you can see the -rc2, -rc3 and -rc4 dips diminishing like they're supposed to. The sharp-eyed will note that during the merge window, kernel hackers work weekends :)

[/tech] permanent link

Tue, 22 Jul 2008
WTF? Wikipedia deletion gone mad...

OK, so Dave Miller's pending deletion I can understand; if you didn't know how key he was, the article itself lacks references and lacks detail (compare it with Andrew Tridgell's page). (At least he noticed; when I was deleted last time I didn't know.)

But then I find out that the article on OLS was deleted back in February. Huh? This is the major Linux conference in the world. Some would argue that it's a bit faded at the edges these days, but none of the crop of contenders can genuinely claim that crown. I know conferences don't generally get pages as sexy as humans do, but still...

[/tech] permanent link

Sun, 20 Jul 2008
The Joy of linux-next

Sure, linux-next is a useful way of detecting patch conflicts with random developers early. But the second-order effect has been more useful to me: forcing me to get my shit together. Now I regularly publish my patchqueue in a form which applies and compiles, and has clear "production" vs "alpha" demarcation.

Obviously, this is good for people trying to follow various patches (and there are quite a few independent efforts at the moment, including typesafe patches, virtio, lguest, module, tun/tap, stop_machine, kmod-removal and down_trylock removal), but it also makes the arrival of the merge window far less stressful.

In theory, I could have been this organized before.
But just like the concept of doing homework long before the deadline, it was never going to happen. So thanks Stephen!

[/tech] permanent link

Mon, 14 Jul 2008
UNSW CS: Employment @ IBM OzLabs Talk: 1pm Tuesday September 2nd

The UNSW School of Computer Science and Engineering is running an "Employer of the Week" experiment: the week of September 1st is IBM's week. I'll be spruiking for OzLabs, so if you know anyone at UNSW who is worth talking to, drag them there (I don't know which room; I'm guessing the signs in CS will be pretty clear).

I'm going to try to talk about the stuff people in the office are hacking on, to give an idea of what it's like being in what AFAICT is Australia's largest bunch of Free and Open Source Software hackers.

[/tech] permanent link

Mon, 30 Jun 2008
stop_machine latency: the rewrite

Following on from my previous graphs of stop_machine latency, I have new results with my stop_machine simplification patch. Again, it's the 18-way Power4 box; the simplified stop_machine creates all the threads and moves them onto the correct CPUs before starting them. They then step through the state machine themselves, rather than being driven by a central controller.

Since these are different kernel versions, I looked at the baseline latency for both kernels. Now I need to go back and compare the exact same kernel version, to make sure something else isn't interfering...

[/tech] permanent link

Fri, 27 Jun 2008
Linux Foundation's Device Driver Statement

Someone noted that I didn't sign the LF "proprietary modules are bad" statement. This is entirely due to my slackness and not any lack of support. As kernel module maintainer I feel obliged to maintain the status quo with proprietary modules, but I have noticed many colleagues becoming more annoyed about them.

[/tech] permanent link

Thu, 12 Jun 2008
stop_machine latency

Kathy Staples and I wrote a little program to measure the latency on every CPU on a machine.
It sets CPU affinity and high priority (SCHED_FIFO, prio 50) for each thread, then spins doing gettimeofday() for a given duration. The maximum gap between successive gettimeofday() results is reported for each CPU.

I tested this on an old 18-way Power4 box sitting around the lab: CPU 0 is used for the parent process, and the latency is measured on the other CPUs. This was run 100 times. Then a variant which did an insmod system call on CPU 0 was used (this calls stop_machine, which is what we were trying to measure).

The results are interesting and a little surprising. Normal max latency is around 35 usec, with stop_machine increasing it to the 100 usec range. There's obviously something running periodically on CPU 2: for both runs I had to remove one horrific 150ms latency result (1000 times average!) but there's still a noticeable spike there. I suspect CPU 1 is low because CPU 0 is mainly idle (same core).

But more concerning is that latency seems to go up with higher CPU numbers, whereas I expected it to be worst on lower CPUs. We launch stop_machine threads in CPU order, so I expected the lower CPUs to wait the longest.

We're running modprobe on CPU 0, which means the stop_machine control thread runs there, too. It loops through creating 17 other threads: as CPU 0 is busy, each new thread gets scheduled on a different idle CPU. The first thing the thread does is try to move itself to its proper CPU. I suspect what is happening is that we're creating the 17 threads fast enough that they all end up queued on the migration queue for CPU 0 at once: this queueing uses "list_add" not "list_add_tail", so they are in fact deployed by the migration thread in reverse-CPU order.

My simplified version of stop_machine is more intelligent: it moves all the threads to their correct CPUs before waking them all up. This should solve this problem as well as reducing overall latency.

[/tech] permanent link
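
The reverse-CPU-order effect suspected in the stop_machine latency post can be illustrated with a toy model. This is only a sketch in Python, not kernel code: it assumes all 17 requests are queued before the migration thread services any of them, and that the queue is drained from the head. The function name `deploy_order` and its parameters are made up for the illustration; `appendleft` stands in for "list_add" (insert at the head) and `append` for "list_add_tail".

```python
from collections import deque


def deploy_order(cpus, add_to_head):
    """Return the order in which queued requests would be serviced.

    Toy model (an assumption, not the kernel's actual code path): every
    request is queued before any is serviced, and the queue is drained
    from the head.
    add_to_head=True  mimics list_add()      (insert at the head)
    add_to_head=False mimics list_add_tail() (insert at the tail)
    """
    queue = deque()
    for cpu in cpus:
        if add_to_head:
            queue.appendleft(cpu)
        else:
            queue.append(cpu)

    order = []
    while queue:
        order.append(queue.popleft())  # service from the head
    return order


if __name__ == "__main__":
    cpus = list(range(1, 18))  # threads destined for CPUs 1..17
    print("list_add:     ", deploy_order(cpus, add_to_head=True))
    print("list_add_tail:", deploy_order(cpus, add_to_head=False))
```

With head insertion the thread for CPU 17 is serviced first and the thread for CPU 1 last; with tail insertion the creation order is preserved. If the requests trickled in more slowly than the migration thread drained them, the two variants would behave the same, so the effect depends on the "queued all at once" assumption.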