Free Software programmer
rusty@rustcorp.com.au
This blog existed before my current employment, and obviously
reflects my own opinions and not theirs.

This work is licensed under a Creative Commons Attribution 2.1 Australia License.
Rusty's Bleeding Edge Page
Tue, 22 Apr 2008
Arrived for the virtualization mini-summit (alongside the Linux
Foundation Collaboration Summit) the week before last, and stayed
around because much of IBM's kvm work is done here. Much hacking,
but I should have blogged about my travel plans sooner.
I leave on Friday for San Jose (on the "Nerd bird" I'm told) for the
weekend before I fly back home, but if anyone wants to catch up, send
mail...
[/self] permanent link
Mon, 07 Apr 2008
I just appreciated an interesting side-effect of slapping "inline" on
static functions within .c files. You don't get a warning when they
become unused.
This breaks my normal method for code cleanup (in this case, the tun
driver). So unless you have evidence otherwise, please trust the
compiler to inline static functions appropriately and don't label them
inline. (And remember: inline is the register keyword
for the 21st century.)
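A minimal sketch of the effect, assuming GCC or Clang with -Wall (which enables -Wunused-function). The helper names are made up for illustration and are not from the tun driver:

```c
#include <assert.h>

/* Plain static: if its last caller is deleted, the compiler reports
 * "defined but not used", which is what makes cleanup easy. */
static int frame_len(int payload)
{
	assert(payload >= 0);
	return payload + 14;	/* 14: Ethernet header length */
}

/* static inline: nothing calls this, yet the build stays silent. */
static inline int frame_len_padded(int payload)
{
	return (payload + 14 + 3) & ~3;
}

int total_len(int payload)
{
	/* Delete this call and only frame_len() draws a warning. */
	return frame_len(payload);
}
```

Dropping the inline from frame_len_padded() should be enough to get the warning back.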
[/tech] permanent link
Sat, 05 Apr 2008
Tue, 01 Apr 2008
Here begins our descent into hell; if an interface manages to
achieve negative scores on the Hard To Misuse List, your users may
detect the dull red glow of malignancy rather than incompetence.
- -1. Read the mailing list thread and you'll get it wrong.
-
If the first hit on Google when searching for the symptoms or how to
use your interface leads to a
convincing but incorrect answer, that puts your interface here.
- -2. Read the implementation and you'll get it wrong.
-
This happens most often when the implementation being read is not
the one which ends up being used. Or maybe the implementation
comes with test cases which all exercise the unnatural corners of the
interface, which mislead instead of enlightening.
- -3. Read the documentation and you'll get it wrong.
-
Here's my favorite (now fixed) example, from the glibc snprintf
man page:
RETURN VALUE
snprintf and vsnprintf do not write more than size bytes
(including the trailing '\0'), and return -1 if the output was
truncated due to this limit.
I was scanning the man page for the return value on overlength
snprintfs; now I'd found it I stopped reading. But here was the
next sentence:
(Thus until glibc 2.0.6. Since glibc 2.1 these functions follow
the C99 standard and return the number of characters (excluding
the trailing '\0') which would have been written to the
final string if enough space had been available.)
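For what it's worth, here is a sketch of a truncation check written against the C99 behaviour quoted above. The helper format_pair() is hypothetical; the point is only that the test has to compare the return value against the buffer size rather than against -1:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: write "name=value" into buf.
 * Returns true only if the whole string fitted. */
bool format_pair(char *buf, size_t size, const char *name, int value)
{
	int n = snprintf(buf, size, "%s=%d", name, value);

	/* C99: n is the length the full output needs, excluding the
	 * trailing '\0'.  A negative n is an error (and is also what the
	 * old pre-2.1 glibc returned on truncation). */
	return n >= 0 && (size_t)n < size;
}
```

Checking for a negative value as well keeps the caller correct under either reading of the man page.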
- -4. Follow common convention and you'll get it wrong.
-
The usual example here is fputs() and similar which
take the context argument at the end instead of the start:
int fputs(const char *s, FILE *stream);
But that doesn't quite get down here: the compiler will warn if
you get the argument order backwards (or, if you prefer, forwards).
So again I reach to the Linux Kernel, this time for the list macros:
void list_add(struct list_head *new, struct list_head *head);
I now have this nailed into my brain, but for a long time I
expected the 'head' (ie. the list I'm adding to) to be the first
argument. Of course, this wouldn't be such a problem if list heads and list
entries were not exactly the same type.
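To illustrate that last point, a rough sketch of what distinct types might look like. This is a hypothetical variant, not the kernel's list.h: the head is a thin wrapper around a node, so passing the arguments in the wrong order draws a diagnostic for mismatched pointer types.

```c
/* Hypothetical: entries and heads are different types. */
struct list_node {
	struct list_node *next, *prev;
};

struct list_head {
	struct list_node n;	/* sentinel; the wrapper exists for its type */
};

void list_head_init(struct list_head *head)
{
	head->n.next = head->n.prev = &head->n;
}

/* Context (the list) first, then the entry to insert after the head. */
void list_add(struct list_head *head, struct list_node *entry)
{
	entry->next = head->n.next;
	entry->prev = &head->n;
	head->n.next->prev = entry;
	head->n.next = entry;
}
```

The cost is that code which walks the list has to step through head->n, which may be why the kernel did not go this way; I'm only guessing.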
- -5. Do it right and it will sometimes break at runtime.
-
Every C programmer knows that malloc returns NULL on error:
p = malloc(bufsize);
if (!p) {
        /* Phew! We can handle this... */
        backout_nicely();
        exit(1);
}
Except malloc may also return NULL on zero-length
allocations: something you'll find out the hard way when your nice
code which didn't special case 0-length allocations breaks horribly on
someone else's machine.
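One way around this is a tiny wrapper, sketched here under the assumption that you never care about the difference between a zero-length buffer and a one-byte one. The name alloc_bytes() is made up:

```c
#include <stdlib.h>

/* Hypothetical wrapper: NULL from here only ever means "out of memory". */
void *alloc_bytes(size_t size)
{
	/* malloc(0) may return NULL or a unique pointer, depending on the
	 * implementation.  Asking for at least one byte removes the
	 * ambiguity. */
	return malloc(size ? size : 1);
}
```

The result is freed with free() as usual.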
- -6. The name tells you how not to use it.
-
Sometimes we opt for changing behavior without changing a
(now-inappropriate) name, knowing that existing users won't be broken
by the new behaviour. But don't curse future users with a misleading
name: if your project takes off, there will be far more of them than
current users.
My example here is another Linux kernel one which bit me. I was
writing a block (disk) driver: it gets passed a struct
request which consists of a series of chunks. After servicing
them, it calls end_request(). Only it turns out that (for
historical reasons!) this only ends the first chunk. My block driver
"worked", but it was doing about N^2/2 times the work it needed to do
for an N-chunk request.
(I didn't find that, the maintainer reviewing my code did).
- -7. The obvious use is wrong.
-
I've been coding in C for about 20 years, and about five years ago
I spent an hour chasing a case where I'd done if (strcmp(arg, "foo"))
instead of if (!strcmp(arg, "foo")).
Now I religiously #define streq(a, b) (!strcmp((a),(b)))
because I know I'm not as smart as I think I am.
Less "I'm obviously an idiot" is the behavior of
strncpy() which truncates the destination string without
adding a NUL terminator. Or char x[5] = "hello"; which the
C standards committee thought would be an excellent trap for newcomers
(and particularly stupid since there is a workaround if you really want an
unterminated character array).
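Both traps can be papered over in a few lines. The streq() macro is the one from the text; copy_str() is a hypothetical helper showing one way to make strncpy() always terminate:

```c
#include <stdbool.h>
#include <string.h>

#define streq(a, b) (!strcmp((a), (b)))

/* Hypothetical helper: copy src into dst, always NUL-terminated.
 * Returns false if src had to be truncated. */
bool copy_str(char *dst, size_t size, const char *src)
{
	if (size == 0)
		return false;

	strncpy(dst, src, size - 1);	/* may stop without a terminator... */
	dst[size - 1] = '\0';		/* ...so always add one ourselves */
	return strlen(src) < size;
}
```

With a five-byte buffer, copying "hello" leaves "hell" and returns false, which is at least a visible failure.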
- -8. The compiler will warn if you get it right.
-
The bind() socket library call comes to mind here: it takes a
struct sockaddr but you always have to cast to use it,
as you will never have a struct sockaddr, but instead a
struct sockaddr_in or some other specific type. This one is almost
excusable, although I'd expect better from modern code.
- -9. The compiler/linker won't let you get it right.
-
This is hard to find in C, since the compiler will let you cast
your way through almost anything. Listed here for completeness.
- -10. It's impossible to get right.
-
Unlike the first category, this final category is neither a
paragon nor unattainable. Some interfaces are so fundamentally flawed
that they can't be used correctly. Perhaps it can fail in a way you
have to know about but it doesn't return an error. Perhaps it
returns an error but you can do nothing about it.
In the Linux kernel there used to be interfaces which assumed
single-threading, and are now unsafe. Say you expose two functions
called prepare() and action() and expect the
caller to do if (prepare()) action();. This is broken if
action() relied on all the checks in prepare() passing,
and now conditions can change between the two.
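The usual repair is to fold the check and the action into a single call which either succeeds or fails as a unit. Here is a minimal sketch using C11 atomics; the pool structure and pool_try_take() are invented for the example and are not a kernel interface:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical resource: a pool with a fixed number of free slots. */
struct pool {
	atomic_int free_slots;
};

/* Check and act in one step: take a slot only if one is still free. */
bool pool_try_take(struct pool *p)
{
	int n = atomic_load(&p->free_slots);

	while (n > 0) {
		/* If someone else changed the count, n is reloaded with
		 * the current value and we re-check before retrying. */
		if (atomic_compare_exchange_weak(&p->free_slots, &n, n - 1))
			return true;
	}
	return false;
}
```

There is no window between "is there a slot?" and "take it" for another thread to slip into, so the caller cannot get the sequence wrong.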
That's everything I know about interface design. Now, go and make
your own mistakes so you can have wise things to say about it!
[/tech] permanent link
Sun, 30 Mar 2008
It's useful to arm ourselves with a pithy phrase should we ever
have to face an "it'll be easier to use!" argument. But once we've
pointed to it, it's still not clear how to improve the difficulty of
interface misuse.
So I've created a "best" to "worst" list: my hope is that by
putting "hard to misuse" on one axis in our mental graphs, we can at
least make informed decisions about tradeoffs like "hard to misuse" vs
"optimal".
The Hard To Misuse Positive Score List
- 10. It's impossible to get wrong.
-
This ideal is represented by the dwim() (Do What I Mean) function,
where misuse means the implementation has a bug. In real life this
goal is only achievable by greatly restricting your definition of
misuse. Even the dwim() function can be abused by not
calling it at all.
- 9. The compiler/linker won't let you get it wrong.
-
As a C person, I like that the compiler reads all my code before
it even gives me a chance to run any of it. We're so used to this we
don't give it a second thought when the compiler barfs because we use
the wrong type or don't provide enough arguments to a function. But
we can go out of our way to use this: various projects such as gcc and
the Linux kernel have macros like BUILD_BUG_ON(cond) which
can be implanted strategically to evoke compile errors (it evaluates
sizeof(char[1-2*!!(cond)]) which won't compile if
cond is true).
I use this in the kernel's module_param(name, type,
perm) macro to check that the read/write permissions for the
module parameter are sane (a common mistake was to specify
644 instead of 0644).
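A userspace sketch of the same trick. BUILD_BUG_ON() follows the sizeof expression described above; CHECK_PERM() is a hypothetical check in the spirit of the module_param() one, and I am not claiming it matches what the kernel actually tests:

```c
/* Array size is 1 when cond is false, -1 (a compile error) when true. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical: sane permissions fit in the nine rwx bits. */
#define CHECK_PERM(perm) BUILD_BUG_ON((perm) < 0 || (perm) > 0777)

int checked_perm_example(void)
{
	CHECK_PERM(0644);	/* fine: octal, as intended */
	/* CHECK_PERM(644);	   would not compile: decimal 644 is 01204 */
	return 0644;
}
```

The check costs nothing at runtime since sizeof never evaluates its operand.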
- 8. The compiler will warn if you get it wrong.
-
This is weaker than breaking the compile, but in many cases easier
to achieve. The classic of this school is the Linux kernel min() and
max() macros, which use two GCC extensions: a statement expression
which allows the whole statement to be treated by the caller as a
single expression, and typeof which lets us declare a
temporary variable of same type as another:
/*
* min()/max() macros that also do
* strict type-checking.. See the
* "unnecessary" pointer comparison.
*/
#define min(x,y) ({ \
        typeof(x) _x = (x); \
        typeof(y) _y = (y); \
        (void) (&_x == &_y); \
        _x < _y ? _x : _y; })
Since a common error in C is to compare signed vs unsigned types and
expect a signed result, this macro insists that both types be
identical.
- 7. The obvious use is (probably) the correct one.
-
Always make it easier to do the Right Thing than the Wrong Thing.
So if you can't make the right thing easy, make the wrong thing hard!
This is the "explicit args required for kmalloc" example again, but it
usually means choosing defaults carefully and knowing the normal use
for the function.
My example here is the standard Unix exit() and
_exit(): the latter does not call any atexit()
handlers and is usually not the right choice, so it's harder to find.
- 6. The name tells you how to use it.
-
Everyone knows a good name is invaluable. In _exit() above,
the underscore punches far above its one-character weight as a
warning sign.
My example here is the strange reference counting mechanism used
by the Linux Kernel module code: getting a reference count can
fail, unlike almost all the rest of the kernel reference
counts. Hence, the "get a reference count" function is called
try_module_get(): those first four characters reflect the
importance of the return code. Note that these days, the GCC
"__attribute__((warn_unused_result))" can be used to promote this
usage to a warning. I still like the name, though, because overuse of
such things has led to some warning fatigue...
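A small sketch of the name and the attribute working together, assuming GCC or Clang. The widget structure is hypothetical and far simpler than the kernel's module reference counting:

```c
#include <stdbool.h>

/* Hypothetical refcounted object. */
struct widget {
	int refs;
	bool dying;
};

/* The "try_" says it can fail; the attribute makes the compiler warn
 * about any caller which ignores the result. */
__attribute__((warn_unused_result))
bool try_widget_get(struct widget *w)
{
	if (w->dying)
		return false;
	w->refs++;
	return true;
}
```

A bare try_widget_get(&w); statement then draws a warning, while if (!try_widget_get(&w)) return; does not.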
- 5. Do it right or it will always break at runtime.
-
As soon as the misusing code is executed, it'll die horribly. Not
all code paths are tested, but this will often catch cases where
someone is writing new code using your interface. It's hard for the
compiler to ensure that the user calls your "open" routine before your
other routines, but an "assert()" can at least get you to this level.
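Something like the following is enough to reach this level. The logger interface is made up, and the sketch assumes the caller zero-initializes the structure so that "not opened" is a well-defined state:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interface: must be opened before anything else. */
struct logger {
	bool opened;
	int lines;
};

void logger_open(struct logger *l)
{
	l->opened = true;
	l->lines = 0;
}

void logger_write(struct logger *l, const char *msg)
{
	/* Misuse dies here the first time the code path runs. */
	assert(l->opened);
	(void)msg;		/* a real logger would emit the message */
	l->lines++;
}
```

Remember that building with NDEBUG removes the assert, so this only protects builds which keep assertions enabled.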
- 4. Follow common convention and you'll get it right.
-
This is a corollary of "the simplest use is the correct one", and
a very useful handhold on the way up this scale. In particular, C
convention for argument order seems to have evolved down to three
ordered rules:
-
Context argument(s) go first. A context is something the user
will do a series of different things to; a handle.
-
Associated arguments are adjacent. An array and its length go
together, as does a timestamp and its granularity. If you could see yourself
making a structure out of some of the args, they should go together.
-
Details go as late as possible. Flags for the function go at the end.
Pointer and length pairs are passed in that order.
I've never gotten the argument order of the standard
write() wrong, even though the fd and count could be
interchanged:
ssize_t write(int fd, const void *buf, size_t count);
There are also minor (but important!) conventions, such as
memcpy's "destination before source", which you should use for any
memcpy-like routines.
Like all rules, this one exists to be violated; but know you're doing so.
- 3. Read the documentation and you'll get it right.
-
People only read instructions after they've already tied
themselves into a knot. Then they skim them for keywords and don't
read your warnings. I don't give an example of this; if this is the
best an interface can do, it's in trouble.
- 2. Read the implementation and you'll get it right.
-
We've all done this. Reading the implementation can work for the
simple questions (what unit is this argument in?), but leads to
trouble for the subtler issues. The concept of "the" implementation
is always problematic, and when the implementation is tightened or
fixed we discover we didn't actually get it right, we just got it
working.
In some cases, the implementation is a noop, which doesn't help.
- 1. Read the correct mailing list thread and you'll get it right.
-
The reason some strange interface quirk exists might be for
compatibility with some strange OS or compiler, weird corner case or
even older versions of this codebase. In other words, historical
reasons ("see, on the VAX we only had 6 characters for..."). You
sometimes only find this when you send a patch to fix it and the
original author yells at you.
Sometimes they add it to the FAQ. That does not increase the
interface's score very much: please try harder.
[/tech] permanent link
Tue, 18 Mar 2008
It's an elementary goal of API design to make something easy to
use: easy for yourself, easy for yourself next year, easy for others.
Let's take that as a given.
Many goals will conflict with "easy to use", but the subtlest is
the requirement that an API be hard to misuse. Ease of use
attracts users, but difficulty of misuse keeps them alive.
To make this concept crisp, I have two real life examples. The first is the safety catch on a gun. Hard to misuse beats easy to use.
The second example is the Linux kernel's kmalloc
dynamic memory allocation function. It takes two arguments: a size and a flag. The most commonly used flag arguments are GFP_KERNEL and GFP_ATOMIC: I'll ignore the others for this example.
This flag indicates what the allocator should do when no memory is
immediately available: should it wait (sleep) while memory is freed or
swapped out (GFP_KERNEL), or should it return NULL immediately
(GFP_ATOMIC). And this flag is entirely redundant:
kmalloc() itself can figure out whether it is able to sleep
or not. Implementing malloc() would be a no-brainer, and kernel
coders generally like ease of use. So why don't we? [Correction: Jon Corbet
points out that it's not entirely redundant in some configurations; we'd
need to do a few lines extra work.]
Because atomic allocations should be avoided: they're drawing from
a limited pool and more likely to fail or make other atomic
allocations fail. By placing the burden of specifying this onto the
author, we make atomic allocations easier to spot and thus harder to
abuse.
And if we want to make our APIs harder to misuse we need to
measure how an API scores, and that'll be the topic of the next post.
[/tech] permanent link
Wed, 12 Mar 2008
I'm always a little uncomfortable with "fuzzy" programming topics;
much better to judge between two specific pieces of code. The big
issues are important but it's hard to say something new on that topic
which will help people code better. Most useful stuff has been said
already.
Nonetheless, for my OLS keynote years ago I did have a point which I
felt was underappreciated, and managed to rope it down to actual
guidelines so the idea was of practical use. I'm going to revisit that
topic in my next few blog posts, because unfortunately my OLS keynote
was not recorded anywhere for me to simply point to, and there has been
some maturing of these ideas since then.
[/tech] permanent link