Rusty's Bleeding Edge Page

I installed the Open Source SharpMusique on the weekend (the Ubuntu .deb file installed fine on my Debian unstable system). Two problems: View Album didn't work, so I had no way to buy a whole album, and the song I bought was saved as " -  - .m4a". It played fine under Totem, so I renamed it correctly and bought another. Same deal.

I reported the problem in a comment on Jon Johansen's blog entry announcing the 1.0 release, and there are now two fixes available in the further comments: a trivial one which fixes the Album problem, and a larger patch (munged from being entered into a comment box) which fixes the name problem.

Rebuilding the deb from the arch repository was non-trivial, but the 1.0 tarball doesn't have the debian/ subdir. So: grab the source as per the web page using baz, install libtool, run ./autogen.sh, then run dpkg-buildpackage -rfakeroot. You must be online for the ./autogen.sh step, as it downloads and unpacks stuff.

Tried to send a donation, but PayPal took away Amex as an option when I got down to setting my country to Australia, and then refused to accept my Visa. Sorry Jon, maybe later.

[/tech] permanent link

Wed, 12 Oct 2005

A Well-Tuned Crap-o-meter

One thing I gained from working on the kernel: a good sense of when code is crap. If my code is good, it is not because I am smart, but because I refuse to release stuff which stinks, so I'm forced to spend time cleaning it up. The harsh environment of the linux-kernel mailing list trains you to do this yourself.

I'm reluctant to harshly criticize the code of others, because I am aware how hurtful such words can be. On the other hand, without such harshness, you end up, say, using a helper thread in the implementation of a library. Not because you're stupid, but because you haven't developed a Pavlovian shudder at the thought of releasing such a thing.

[/tech] permanent link

Tue, 11 Oct 2005

Netfilter Workshop / Summit

Good weather in Sevilla. I imbibed a little too much on the Friday night celebrating the cluefulness of the Australian High Court, so I was less effective on the second hacking day than I would have liked.

Some points included:

A solution for Peer-to-peer NAT and BEHAVE: Jesse Peng provided the idea. Basically, a P2PNAT target which keeps a hash table to ensure we don't allocate the same source IP/port to two NAT connections. This allows us to do hairpin NAT (it probably needs to set up an expectation to catch these). Also needs to set a flag so TCP window tracking will allow simultaneous open, and not drop immediately on RST (the latter can happen if the other end firewalls).
Nfsim seems to be attracting more of a following in the core team. Jozsef committed window tracking tests! Harald wants netlink support, and also an actual nfsim release. I applied updates for 2.6.14 (thanks to Max Kellerman), and cleaned up the tests a little.
More thinking on the use of a hash trie, and some progress. There are several benefits for speed and scalability, although there are still fairly sizable tuning questions. Martin Josefsson is playing here.
Possible simplification and scalability improvements on the expectation code. It's more general than it needs to be at the moment.
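
The hash table in the P2PNAT idea above can be sketched roughly as follows. This is only an illustration of "never hand out the same source IP/port twice": it is not netfilter code, and the names (p2p_table, p2p_reserve), the table size and the hash function are all made up for the example. Releasing entries, expectations and the TCP window tracking flag are left out.

```c
#include <stdint.h>
#include <string.h>

#define P2P_SLOTS 1024	/* hypothetical table size; must be a power of two */

struct p2p_entry {
	uint32_t ip;	/* allocated source address */
	uint16_t port;	/* allocated source port */
	uint8_t used;	/* slot occupied? */
};

struct p2p_table {
	struct p2p_entry slot[P2P_SLOTS];
};

static void p2p_init(struct p2p_table *t)
{
	memset(t, 0, sizeof(*t));
}

/* Arbitrary mixing function, chosen only for the example. */
static unsigned int p2p_hash(uint32_t ip, uint16_t port)
{
	uint32_t h = ip * 2654435761u ^ port * 40503u;

	return (h ^ (h >> 16)) & (P2P_SLOTS - 1);
}

/*
 * Try to reserve ip/port for a new NAT mapping (open addressing with
 * linear probing).  Returns 1 if the pair was free and is now reserved,
 * 0 if some connection already holds it or the table is full.
 */
static int p2p_reserve(struct p2p_table *t, uint32_t ip, uint16_t port)
{
	unsigned int i, h = p2p_hash(ip, port);

	for (i = 0; i < P2P_SLOTS; i++) {
		struct p2p_entry *e = &t->slot[(h + i) & (P2P_SLOTS - 1)];

		if (!e->used) {
			e->ip = ip;
			e->port = port;
			e->used = 1;
			return 1;
		}
		if (e->ip == ip && e->port == port)
			return 0;	/* already allocated: pick another */
	}
	return 0;
}
```

A NAT target built this way would call p2p_reserve() for each candidate source IP/port and move on to the next candidate whenever it returns 0.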

[/tech] permanent link

Sat, 17 Sep 2005

"I've tried to explain this a few times to you but you're simply ignoring the facts"

There are three things I've learnt, the hard way, about software. The first, and the root of the other two, is that you must fear complexity above all else.

The second is summed up in Worse is better: that simplicity is a virtue in itself, and a greater virtue than anything else. This must be carefully applied to the whole system, however: if you create an API which is simple, and has a simple implementation, but forces complexity on the user, you're not winning.

The third I learned from Linus Torvalds, who once said his job was to say "no". People always want to add features, but a maintainer's job is to weigh the environmental damage. This, too, stems from a fear of complexity.

The title quote is from an irritated user who believed I was ignoring all evidence that a feature was necessary. This happens reasonably often among programmers who weigh features as a virtue: any potential use justifies a feature. As it turns out, he actually produced a really good reason for the feature in that same EMail, so I will be implementing it.

Wasteful as this process might seem (yes, it would have been easier to implement the feature the first time around), it represents a hard-won understanding that I am simply not smart enough to get things right the first time. So, fearing complexity, I implement as little as possible: simplicity is easier to fix.

[/tech] permanent link

Mon, 12 Sep 2005

Fernanda Weiden on Women in Free Software

Groklaw ran an interesting article on Women in Free Software. Interesting, because in the five years since I last mentioned this, I've heard much speculation on causes, but not much on solid solutions. Fernanda is in a position to know what can work. Not sure that I can do anything to help, though.
[/tech] permanent link

Fri, 09 Sep 2005

Git: An Experience in Pain

I had the opportunity to try to use git today. Having used bzr and mercurial and been favourably impressed, I guess my expectations were fairly high.

As it turns out, far too high. There's no clean way to get git (apt-get would have been nice, but a tarball to compile would have been fine too), so I was given Debian packages by Stephen. Set up an rsync repository to hold the .git files. You can't use "git-push-script" on that empty dir, so rsync the original files across. Then to use "git-pull-script" you have to add .git to the URL: my mistake, should have just rsync'ed that dir.

Now try to push. The man page for "git-push-script" says what you'd expect, so you do "git-push-script rsync://machine/mydir/.git". "Cannot push to rsync://machine/mydir/.git". Er, ok, why?

AFAICT, grovelling through the scripts, the man page lies: it doesn't actually support push over rsync. Hell, maybe it doesn't really support push at all; seems you need to rsync upstream manually. Of course, doing this means you can't have more than one user pushing to a repository at a time, which might be fine for Linux, but simply won't work with any kind of core team.

Now I start to understand why the Xen guys chose Mercurial. Linus seems proud of the fact that he didn't write an SCM, just a content-addressable filesystem, but there's a lot more required to turn that into a usable system than a handful of scripts.

So kudos to Matt Mackall for actually writing a system which is usable; it makes git look like a half-finished school project.

[/tech] permanent link

Mon, 29 Aug 2005

Now with old links

Turns out you can just use "/<date>" at the end of blosxom URLs to get all entries on that date, so I hacked in the old links on the sidebar, including the pre-blosxom plain-html entries. So now you can browse freely...
[/tech] permanent link

Sun, 14 Aug 2005

New Trivial Patch Maintainer: Adrian Bunk

Adrian Bunk pushed a pile of my old trivial patches through to Andrew, and so I offered to hand him the time-honored position of Trivial Patch Monkey. I haven't read my mail to trivial for months, and he accepted, so I bundled it up, despammed it a little, and forwarded it to him. I also set the trivial patch monkey mailing address to forward to him. The reason I let it lapse was that Andrew generally performs the same job, but perhaps someone will still find it useful.
[/tech] permanent link

Mon, 25 Jul 2005

OLS nfsim Talk Slides

I've put them up here in OpenOffice format.
[/tech] permanent link

Fri, 24 Jun 2005

Sudoku Revisited

Andrew Bennetts sent email pointing out that with Python 2.4, you can use the built-in set type, which speeds up my sudoku program by a factor of more than two: indeed, a simple change makes it only 57x slower than the C implementation, vs 120x.
[/tech] permanent link

Mon, 20 Jun 2005

Solving Sudoku in C and Python

As an intellectual exercise (I hate puzzles) I wrote a quick C program to solve sudoku puzzles. Then I wrote a Python version (almost, but not quite, "my first python program").

The Python version was shorter: 150 lines, vs 260 lines of C. Time to code is not comparable since I am not fluent in Python, but I can believe that the Python version would have been quicker to write if I were. But hacking both versions to loop 100 times reveals the Python is about 192 times slower (oops, fixed, now 120 times slower). So it's not free.

I talked to Chris about writing a C++ version (presumably using an STL Set), which would be another data point. I'd do a Z80 asm version, except I'm not sure where to get a Z80 assembler these days.

[/tech] permanent link

Tue, 07 Jun 2005

Came up with a nasty hack for Wesnoth: creating a .po file from the Italian which also includes English "hints". Here is a shell script which reads in a .po file, takes the obvious words out of the (English) key and appends them in brackets to the translated version. This provides enough of a hint for me to read the longer sentences in Wesnoth. It makes some sentences overflow though.

Example:

"Dopo una lunga camminata il Principe Haldric e i suoi compagni si trovano in "
"una spiaggia assolata. Anche se in casi normali questo creerebbe una "
"situazione di piacere ben presto si accorgono della presenza di Sauri al "
"lavoro. (After long trek Prince and his companions find themselves on sunny "
"beach. While normally this would pleasant occurrence, soon find Saurians "
"hard at work.)"

This hack was inspired by the Slashdot story from January, Learning a Foreign Language with The Sims. A better solution would be to break it up by sentences, rather than do the entire dialog at the end. When an English word has been shown some number of times, it would no longer be shown. Mousing over the sentence would give the full English sentence. But this was quick!

[/tech] permanent link

Sun, 05 Jun 2005

Exceptions revisited: the destructor relationship

So David White (of Wesnoth fame) made an excellent point on my previous problem with exceptions. The problem of exceptions unexpectedly exiting out the side doesn't exist if every object cleans up in destructors (ie. use RAII). In other words, all state changes must be encapsulated in a class.

This might seem obvious to people who deal with exceptions day-to-day (my C++ experience predates widespread use of exceptions). But encapsulating things in an object has one advantage: the programmer knows they cannot allocate memory in the destructor. The problem comes when not everything has a destructor, as will tend to happen when C code (and C programmers) use "exceptions" like my example.

[/tech] permanent link

Sun, 29 May 2005

Wesnoth

I patched Wesnoth to label how many enemies can reach each spot in "Show Enemy Moves". I jumped on freenode's #wesnoth channel, and the friendly people there directed me to #wesnoth-dev, and from there to the mailing list. Patch sent. The code's not super clean, and C++ is pretty icky, although my vague memories of the Standard Template Library and some cut & paste got me through.

In other news, I figured out why Wesnoth didn't change languages for me: when you install Debian you specify which locales to build, so running "dpkg-reconfigure locales" and adding Italian fixed it. I'm now playing Wesnoth in Italian (I figure it can't hurt my horrible Italian skills). Reading Italian Wikipedia is beyond me: I can't figure out what an Inkling is!

[/tech] permanent link

Mon, 23 May 2005

C++ Compilation Times: An Update

Ian Wienand pointed out that dropping the -O2 from compilation removes the uninitialized warnings from GCC, which are useful. Unfortunately a recent Debian upgrade broke my Wesnoth build, so I can't measure the time taken by "-O" instead of "-O2".

And Chris pointed out that gnome terminal is slow with anti-aliased fonts. My tests gave at most a 10 second reduction with no output, but machines and setups vary.

[/tech] permanent link

LCA Kernel Hacking Tutorial Links

OK, Robert put the core of the material up, but OpenOffice 1.2 crashes trying to open it (he uses 2.0). Here's the complete version, including the "political" section at the end.

Please note that the final driver is far from perfect: it should really use waitqueues, and Greg Kroah-Hartman sent me some fairly trivial code to make the SHA accessible as a sysfs attribute. But it's a start.

[/tech] permanent link

Wed, 18 May 2005

Compile times with C++

As everyone knows, I'm a huge Wesnoth fan. I found an interview with the lead developer from last year on PCTechTalk where he complained about compile times with g++.

So I downloaded the CVS and did some testing, and sent him the results. Most of these are probably fairly obvious to people, but the figures are interesting:

Raw build time

make clean; time make > /tmp/out
487.80user 26.41system 8:59.65elapsed

Build time without -O2

make clean; time make CXXFLAGS=-g > /tmp/out
193.98user 14.84system 4:00.92elapsed

Build time without -O2, one file touched

touch src/global.hpp; time make CXXFLAGS=-g > /tmp/out
191.53user 9.52system 3:43.92elapsed

Build time with ccache installed, one file touched

touch src/global.hpp; time make CXXFLAGS=-g > /tmp/out
10.45user 2.04system 0:37.46elapsed

Build time with ccache, one file touched, output to screen

touch src/global.hpp; /usr/bin/time make CXXFLAGS=-g
10.73user 2.06system 0:47.85elapsed

Summary: don't use -O when developing, use ccache, use distcc. If you're still not fast enough you can try suppressing all screen output (probably not worth the pain of not seeing what's happening), or using pre-compiled headers (in recent gcc versions).

[/tech] permanent link

Mon, 09 May 2005

Talloc and longjmp: the danger of exceptions

So I added a few lines to talloc, to allow me to set a handler for out-of-memory. The Xen Store Daemon had a fair bit of code like so (do_rm):

	permnode = perm_node(node);
	if (!permnode) { /* ENOMEM */
		send_error(conn, errno);
		return;
	}

	/* Figure out where it is. */
	permpath = node_path(conn->transaction, permnode);
	if (!permpath) { /* ENOMEM */
		send_error(conn, errno);
		return;
	}

	path = node_path(conn->transaction, node);
	if (!path) {
		send_error(conn, errno);
		return;
	}

	if (!delete(path)) {
		send_error(conn, errno);
		return;
	}

	if (unlink(permpath) != 0)
		corrupt(conn, "Removing permfile %s after %s removed",
			permpath, path);

Since a reasonable action on allocation failure is to drop the connection to the client, I set the talloc_fail_handler to do a longjmp back to the core code, which talloc_free'd the connection, cleaning everything up. No more talloc failures!

The patch was 84 insertions, 255 deletions. Great, now I have code like so:

	permpath = node_path(conn->transaction, perm_node(node));
	if (!delete(node_path(conn->transaction, node))) {
		send_error(conn, errno);
		return;
	}
	if (unlink(permpath) != 0)
		corrupt(conn, "Removing permfile %s after %s removed",
			permpath, path);

Neat, but the obvious next step is:

	if (!delete(node_path(conn->transaction, node))) {
		send_error(conn, errno);
		return;
	}
	if (unlink(node_path(conn->transaction, perm_node(node))) != 0)
		corrupt(conn, "Removing permfile %s after %s removed",
			permpath, path);

Now this code has a subtle bug, which was obvious when writing code which had to handle node_path() failing to alloc and returning NULL: we can end up with the perm file not being unlinked, as the allocation in the unlink line fails. I'm pretty sure I wouldn't have noticed this if I had written using exceptions the first time.

By letting the programmer ignore error handling, exception mechanisms create subtle bugs: the worst kind. It's the same rationale as the one against "return" statements within macros: it might save lines of code, but don't do it. I've left the change in there for the moment, but I'm a little frightened that I've traded off some typing for a new way to shoot myself in the foot...
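
The failure mode described above can be reproduced in miniature. The sketch below does not use talloc or the real Xen Store Daemon code: alloc_or_jump() stands in for an allocator whose failure handler does a longjmp, struct store stands in for the node and its perm file, and allocs_left is a knob invented for the example to force a failure at a chosen allocation.

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf oom_env;		/* where the core code recovers from OOM */
static int allocs_left;		/* example knob: allocations allowed before failing */

/* Hypothetical stand-in for the store: one node and its perm file. */
struct store {
	int node_present;
	int perm_present;
};

/* Allocation that never returns failure to its caller: it jumps instead. */
static void *alloc_or_jump(size_t size)
{
	if (allocs_left-- <= 0)
		longjmp(oom_env, 1);
	return malloc(size);
}

/* Shaped like the "obvious next step" version of the removal code. */
static void do_rm(struct store *s)
{
	char *path, *permpath;

	path = alloc_or_jump(16);	/* like node_path(node) */
	s->node_present = 0;		/* like delete(path) */
	free(path);

	permpath = alloc_or_jump(16);	/* like node_path(perm_node(node)) */
	s->perm_present = 0;		/* like unlink(permpath) */
	free(permpath);
}

/* Core code: returns 0 on success, -1 if the "connection" was dropped. */
static int handle_request(struct store *s, int budget)
{
	allocs_left = budget;
	if (setjmp(oom_env) != 0)
		return -1;		/* allocation failed somewhere below */
	do_rm(s);
	return 0;
}
```

With a budget of two allocations both the node and the perm file go away. With a budget of one, handle_request() returns -1 with the node already deleted and the perm file still present, and nothing in do_rm() hints that this can happen.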

[/tech] permanent link

Sun, 01 May 2005

linux.conf.au Kernel Tutorial Revisited

Several people have contacted me to say that the tutorial I gave at linux.conf.au wasn't as bad as I claimed. Particularly Tom Parker who said it was "the highlight of the conference for me". Thanks Tom, but I was still disappointed.

Many people have given feedback and suggestions, and if I do it again I'd do the following:

Reduce the problem: say, do programmed I/O rather than DMA,
Make people write the first "hello world" driver as part of the homework, so they know what programming level we expect for the tutorial,
Give them the answers and the slides encrypted, and give them the keys to decrypt them as we go.

I've been thinking about smaller groups: I know a 20-person tutorial would have been far better. I wonder if we could get some great programmers, and have attendees sign up for a "programming style" tutorial. Semi-randomly assign attendees to each programmer, and have them lead their group in solving a problem (which they've seen before, so they have a prepared solution). With programmers like Tridge, Rasmus, Martin Pool and some lesser-known ones like Anton Blanchard, it'd be a hoot! Maybe it's a dumb idea, but I can assure you that I learn a lot by just discussing coding with these people...

[/tech] permanent link

Sat, 23 Apr 2005

LCA Dinner

I was touched by Steven Hanley's decision to give the proceeds of the annual T-Shirt auction to SIDS and Kids.

To test my theory that the Free Software community is empowering more non-hacker types, Andrew Pollock and I rated all the screensavers in Xscreensaver either "geeky" (0) or "arty" (1). eg. something which solves a maze or draws a fractal is geeky, and something which draws pretty blobs on the screen is arty. There are many borderline cases of course. Then, with help from Jamie Zawinski, who gave me the dates of all the xscreensaver releases since 1.3, we graphed the "artiness index" over time, which shows a slight increase in artiness.

[/tech] permanent link

Thu, 21 Apr 2005

linux.conf.au Kernel Hacking Tutorial

I was disappointed with the tutorial. We tried to be interactive and have people write their own driver, but we lost about 50% of people through the tutorial. As it was, we had to abandon the interactive approach and run through the last half of the material in the last quarter of the presentation. Writing a PCI driver was clearly too much, although it seemed logical to write a driver. My patch series, lightly annotated, is here.

Some people were having problems getting the device to fire an interrupt: it occurred to me later that perhaps qemu doesn't always initialize the interrupt lines correctly, and writing a "0" to the control word of the device would clear it (something which also needs to be done by the driver so it can get a second interrupt).

The highlight was getting to present with Robert Love, whose book has apparently sold out on campus already.

[/tech] permanent link

talloc

Tridge gave a great keynote today. In particular he exposed everyone at LCA to talloc. I'm a big talloc fan, and I usually fear change.

[/tech] permanent link

Thu, 14 Apr 2005

linux.conf.au Linux Kernel Hacking Tutorial

Robert and I finished the tutorial material which people need to complete before the tutorial. There's a fair bit to do, so go to it...
[/tech] permanent link

Thu, 03 Mar 2005

So I have been speaking with the Xen people, and I have failed to convert them to my way of doing inter-partition I/O. I believe my method is simpler and cleaner, but theirs is more mature (I only have a prototype). At the core, they explicitly map another domain's memory into their own for bulk data transport. This leads to lifetime and accounting difficulties with malicious domains, but also complexity from the Operating System's (ie. driver's) point of view. In my model, the domain registers a scatter-gather list and gets back an ID, and another domain then asks the hypervisor to transfer data in or out of that scatter-gather list. The registration step is amortized quite well by using a recycled pool of skbs (for my network driver example), so it's still about one hypercall per transfer; my preliminary results reflected this performance parity. It can also be used for N-way triggering. However, the real benefit is the similarity to normal DMA, which makes for drivers which look like "normal" drivers.
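
A rough model of the register-then-transfer flow, purely as illustration: this is ordinary user-space C, not the prototype or any Xen interface, and sg_register(), sg_transfer_in() and the fixed-size registry are invented names and limits. Here the "hypervisor" is just a pair of functions, and there is no notion of domains or access checks.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SG   16	/* hypothetical: registered lists */
#define MAX_SEGS 4	/* hypothetical: segments per list */

struct sg_seg {
	void *addr;	/* start of this buffer segment */
	size_t len;	/* its length in bytes */
};

struct sg_list {
	struct sg_seg seg[MAX_SEGS];
	unsigned int nsegs;
	int in_use;
};

static struct sg_list registry[MAX_SG];

/* Receiving side: register a scatter-gather list, get an ID (or -1). */
static int sg_register(const struct sg_seg *segs, unsigned int nsegs)
{
	int id;

	if (nsegs == 0 || nsegs > MAX_SEGS)
		return -1;
	for (id = 0; id < MAX_SG; id++) {
		if (!registry[id].in_use) {
			memcpy(registry[id].seg, segs, nsegs * sizeof(*segs));
			registry[id].nsegs = nsegs;
			registry[id].in_use = 1;
			return id;
		}
	}
	return -1;
}

/*
 * Sending side: ask for len bytes to be copied into list 'id'.
 * Returns the number of bytes copied, or -1 for an unknown ID.
 */
static long sg_transfer_in(int id, const void *src, size_t len)
{
	const char *p = src;
	size_t done = 0;
	unsigned int i;

	if (id < 0 || id >= MAX_SG || !registry[id].in_use)
		return -1;
	for (i = 0; i < registry[id].nsegs && done < len; i++) {
		size_t n = registry[id].seg[i].len;

		if (n > len - done)
			n = len - done;
		memcpy(registry[id].seg[i].addr, p + done, n);
		done += n;
	}
	return (long)done;
}
```

The sender never sees the receiver's memory, only the ID, which is what makes the transfer look like handing a buffer to a DMA engine.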

My other difference was that my "event channels" were bound directly to a physical address, rather than explicitly to an identifier. The other end said "fire off anyone bound to this address". This simplifies binding while still enforcing security, which is implied by sharing memory. The implementation was also a little more flexible than the event channel model used in Xen: when creating an event channel the OS passed a pointer to a domain-private atomic int. On every trigger, that was decremented, and if zero, caused a virtual interrupt. In my example network driver, the driver bound an event channel to the address of every scatter-gather ID, with each decrement pointer pointing into its internal data structures, and all triggering the same interrupt. When that interrupt occurred, it would scan those structures for a value <= 0. Xen uses a simple "triggered" bit array in a fixed location, which I like because of its simplicity.
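
The countdown behaviour can be modelled in a few lines of C11. Again this is a sketch under assumptions, not the prototype: evt_bind(), evt_trigger(), the bindings array and the virq_raised counter are invented for the example, and a real implementation would live in the hypervisor and check that the caller may use the address.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BINDINGS 32	/* hypothetical limit */

struct binding {
	uintptr_t addr;		/* address the channel is bound to */
	atomic_int *count;	/* domain-private countdown */
};

static struct binding bindings[MAX_BINDINGS];
static unsigned int nbindings;
static int virq_raised;		/* stand-in for delivering a virtual interrupt */

/* Bind an event channel to an address.  Returns 0, or -1 if full. */
static int evt_bind(uintptr_t addr, atomic_int *count)
{
	if (nbindings == MAX_BINDINGS)
		return -1;
	bindings[nbindings].addr = addr;
	bindings[nbindings].count = count;
	nbindings++;
	return 0;
}

/*
 * "Fire off anyone bound to this address": decrement each matching
 * counter, raising an interrupt when one reaches zero.  Returns the
 * number of bindings that were triggered.
 */
static int evt_trigger(uintptr_t addr)
{
	unsigned int i;
	int fired = 0;

	for (i = 0; i < nbindings; i++) {
		if (bindings[i].addr != addr)
			continue;
		/* atomic_fetch_sub returns the old value: 1 means now zero */
		if (atomic_fetch_sub(bindings[i].count, 1) == 1)
			virq_raised++;
		fired++;
	}
	return fired;
}
```

A driver that wants an interrupt on every trigger initialises its counter to 1; one that wants to batch N events before being woken initialises it to N.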

So I'm now implementing "bind to address" event channels in Xen, to see how useful they are. I can then add one feature at a time to see what effect it has on the OS drivers, which is the interface I care about.

[/tech] permanent link
