ChaoticMUX prehistory, part 2 - Alierak
April 6th, 2002
02:08 pm


ChaoticMUX prehistory, part 2
My hate affair with a programming language, and how I got involved in TinyMUX.

Backtracking a bit. My previous roommate Brian had gotten me a protein-folding research / programming job with his previous supervisor. I'd started at the beginning of the summer, and I had begun to learn more about C in the process, but mostly I was just scripting the simulation runs and fiddling with Markov probabilities. I'd gotten used to understanding other people's coding styles and bug tendencies as a three-term lab assistant for 6.001, and I could pretty much get the general idea of code and spot potential bugs in any language at that point if I felt like it.

One of my assignments was to track down a bug that made the simulation program crash. One of those things where the programmer didn't know the difference between a pointer and his dick. I remember having an awful revelation about C when I figured it out. During my orientation as a transfer student, I'd heard a department head say "Ah, C, that's a good street language", which I took to mean that it wasn't bad as languages went, and you might want to learn it for a job. I realized he'd been implying that it was not at all a good language, but that it was inexplicably popular. Not at all a beautiful, well-designed language like Scheme, it was more like a kind of slang, with a tendency to aid the speaker in producing culturally valued but intellectually bankrupt output.

Another thing I remember from that job was being asked to add some command-line options to the program, so that we could change parameters without recompiling it. I thought this was a neat idea, and I'd heard about argv and argc somewhere, so I went about writing a little generic option parser. That was fun, and interesting, but I was furious at the wasted effort when I later discovered getopt. Where was I supposed to have found out about that beforehand? Where was the R4RS for C?

Side note: R4RS, the Revised(4) Report on Scheme, is a freely available document specifying in detail and with examples not only how the Scheme language should work, but also what built-in functions should be included and how they should work. Every student got a copy at the beginning of 6.001, and it was on the web even then. There's now an R5RS, and I assume it's equally elucidating though I haven't had time to do anything with Scheme lately. As far as I'm aware there still isn't a corresponding specification available for the C language (technically, for the standard C library). The various ANSI, POSIX, and IEEE documents on the subject are apparently both non-free and (from the snippets I've been able to find out about secondhand) obfuscatory. If you run across anything that would prove me wrong, please send me the URL.

I remember I was sitting in the lounge one day in late July logged into stochastic, telnetted to four of the LCS bird machines to run my protein-folding simulations, when Jen wandered into the lounge. Her room was right around the corner, but she seldom came out to the public areas of the dorm since she spent most of her free time mushing and didn't really prefer to be around people in real life. She'd been mostly wary of my new hobby, generally just glad to have Alex and me out of her hair, but she had the vague idea that a Unix machine of her own would be useful. She'd built Tinyfugue in her Athena account, but she wasn't about to try running a game on a campus machine. She had a Windows PC on her desk, and wasn't about to try running Unix herself. I think you all know where this goes.

"Do you think I could run a mush on chaotic?"

I'd already mostly become the anal-retentive sysadmin and code auditor that you all know and love. So I was skeptical. She didn't know anything about porting software, and both of us were mainly used to installing Windows shareware programs. I put my work aside and we downloaded TinyMUSH 2.2. It wouldn't build. Now, of course, I can't recall the error, probably nothing serious. But to us at the time it was reason enough to move on and try something else. She'd heard that TinyMUSH might be more stable than TinyMUX, but didn't really know anything beyond that.

So we downloaded TinyMUX 1.2. It built, but wouldn't run, crashing right away with a bus error. I considered that to be somewhat more promising, but Jen was really upset. She had printed out pages and pages of documentation telling her exactly how to build these programs, how to set them up, and how to use them, and nowhere did it say anything about the software just plain failing to work at the get-go.

She (from my point of view) stomped back to her room to sulk. (Turns out she went to look for people online to ask for help. She got no responses from the tinymush list, but managed to find someone who was on the list, had some clue, and just hadn't felt like responding. This still didn't do any good, of course.) I took this as a sign that this was all my fault for having such an oddball server, and I should try to make it work somehow.

I didn't know what a corefile was or how to use a debugger, so I spent several hours adding debugging printfs to try to narrow down where the crash might be occurring, and eventually got it down to "somewhere near this pool alloc thing". Figuring maybe TinyMUSH wasn't so bad after all, I replaced TinyMUX's pool alloc code with the corresponding parts of TinyMUSH, working entirely with copy-paste (don't try this at home, kids). Amazingly, it pretty much worked.

Side note: this bug turned out to be a 64-bit alignment problem. The pool header struct from TinyMUX 1.2 put the beginning of an lbuf at a 32-bit aligned boundary, but not 64-bit. Some code during game initialization used an lbuf as a place to store a struct (having to do with getrusage, iirc), and structs had to be 64-bit aligned on that platform. Using the TinyMUSH code here added or removed something from the pool header struct causing it to end up as a multiple of 64 bits. I didn't really understand this until months later when I had to re-fix this bug in TinyMUX 1.4.

Side side note: David Passmore would later embarrass himself by asking me to report it as a NetBSD bug and then continuing the thread after the bug report was closed. He got a pretty good lesson in netiquette and C from Chuck Cranor, but I don't think I have a copy of those messages anywhere.
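The alignment problem from the first side note can be sketched in a few lines. This is not the actual TinyMUX 1.2 pool header, whose fields aren't given here. `pool_header`, `wide_record`, and the helper functions are hypothetical stand-ins, chosen so that the header is 12 bytes: a multiple of 32 bits, but not of 64.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool header: three 32-bit fields, 12 bytes total. */
struct pool_header {
    uint32_t magic;
    uint32_t size;
    uint32_t flags;
};

/* Stand-in for a struct with 64-bit members, which a strict-alignment
 * platform such as SPARC must only access at an 8-byte boundary. */
struct wide_record {
    int64_t seconds;
    int64_t microseconds;
};

/* Round n up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Where the buffer starts if it is placed directly after the header.
 * With an 8-byte-aligned allocation this lands on a 4-byte boundary only. */
size_t naive_payload_offset(void)
{
    return sizeof(struct pool_header);
}

/* Where the buffer starts if the header is padded out to 8 bytes,
 * which makes it safe to store a struct wide_record there. */
size_t aligned_payload_offset(void)
{
    return round_up(sizeof(struct pool_header), 8);
}
```

With this layout the naive offset is 12 and the padded one is 16. On a machine that tolerates misaligned access the first version merely runs slower; on one that doesn't, storing a `struct wide_record` at offset 12 is the kind of thing that ends in a bus error.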

So we were able to start up a netmux process that same night. Jen dug through her printouts to find the initial Wizard password and logged in, following the directions that said to @shutdown immediately (you had to manually specify -s on the first run to initialize the database; we later automated that in Startmux). But when we started it back up, the database had clearly been corrupted. Missing names and attributes, etc.

Good enough for a start, said Jen. She began keeping a text file full of the commands necessary to recreate everything and reset the attributes, and would upload it every time the mux had to be restarted (including deleting the database and starting up with -s). This went on for about two months, during which time classes resumed and I didn't really think any more about it. Alex and I found out what it means, as a sysadmin, to have users on your box when we rebooted chaotic one night to fiddle with something and Jen came storming out into the lounge. We were trying to provide a better service, with a new kernel or a bigger disk or something, and she was trying to use chaotic full-time to run a mux.

Sometime in September, Jen mentioned that she was going to open the mux and let people in. She had about a hundred objects in her text file, and she knew exactly what was in there because only a handful of people had been helping her build the game thus far. But to let arbitrary people participate, this method of database storage just wouldn't do. I had to find this bug and fix it that night, homework be damned.

I didn't have much to go on, not really wanting to dig in and understand the database layer. This was back in the days when TinyMUX used its own chunkfile format rather than gdbm for attribute text storage, so there wasn't an obvious abstraction barrier below which there were unlikely to be any bugs. I'd have to attack this problem indirectly.

Reading the Makefile, I found out that you could use gcc -Wall to help identify potential bugs in the code. I'd long been used to getting warnings from my Pascal compiler, and I was surprised to find that gcc was being so quiet by default. So I turned on -Wall. And all hell broke loose. How anyone could write code with that many problems, I had no idea. I had always done incremental compiles and eliminated warnings as soon as they started showing up.

So I spent the night with M-x compile in emacs, stepping through thousands of lines of warnings to look for ones that might be relevant. I finally found one. Data locations in the chunkfile were inconsistently specified as type off_t and type int. Normally it wouldn't matter, but NetBSD/sparc defined off_t as 64-bit, to support large files, and int as 32-bit. Trying to store an off_t in the space reserved for an int or trying to extract an int from the upper 32 bits of an off_t just didn't have the intended effect, and gcc did actually warn about it when asked nicely.
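Both failure modes in that paragraph can be shown without a SPARC handy. The sketch below is not chunkfile code; the function names are invented, and fixed-width types stand in for off_t and int so the behavior is the same on any machine. The byte-order helpers spell out the big-endian layout explicitly, since that is how SPARC stores a 64-bit value in memory.

```c
#include <stdint.h>

/* Case 1: a 64-bit offset squeezed into a 32-bit slot.
 * Only the low 32 bits survive; anything past 4 GB is silently lost. */
int32_t store_in_int_slot(int64_t offset)
{
    return (int32_t)(uint32_t)offset;
}

/* Lay out v in buf in big-endian order, most significant byte first. */
void put_be64(unsigned char buf[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Case 2: code that expects a 32-bit int reads the first four bytes
 * at the address of a 64-bit value. On a big-endian machine those
 * are the UPPER 32 bits, so a small offset reads back as zero. */
uint32_t get_be32(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

Store the offset 42 with `put_be64` and read it back with `get_be32` and you get 0, not 42: every data location in a small file would appear to point at the start of the file. That is consistent with a database that starts up fine and comes back with names and attributes missing.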

I also fixed some of the other things that had caused warnings, and then sent a nice toasty flame to the contact address listed in the documentation, lauren@ranger.orst.edu or something like that. It bounced. I think we later found a sneakers.org address for David, and a mention of the TinyMUX list, so I began sending patches and flames there instead, trying to teach the author a few things about finding bugs, to little apparent effect except that he bitched about me posting unofficial bugfix patches that weren't exactly like he would have made them. To keep Jen's mux running, I would have to continue to maintain it myself; we couldn't just assume that she'd be able to download David's latest version of TinyMUX and have it work.

And thus began ChaoticMUX.

Current Mood: accomplished


Date: April 6th, 2002 12:27 pm (UTC)

Keep it coming, this is fascinating. :)
Date: April 6th, 2002 12:36 pm (UTC)

A few observations...

I'm not inclined to disagree with you because you are entitled to your own point of view, but I'd like to clarify a few things.

First off, I did consider running Unix on that old PC, but I thought it would be too underpowered to be useful running Xwindows. If you remember the PC that was at the front desk for a while and what a slug it was when booted into Linux, I think my experience with that machine was what changed my mind - but for a while I had NetBSD install floppies and docs under my desk in my room, with every intent of using them at some point. Later, when we installed Red Hat on heads-or-tails, there were times when I used that machine more than my PC. :)

Secondly, I don't think we knew about the database corruption bug the very first time we restarted the game (since none of the objects had any attributes at that point). I think we were about 30 objects into building the central area of the MUX before I first noticed it, I think because one of the global commands stopped working when the attribute that defined it disappeared. It was only at that point (when we had some work worth preserving) that I started keeping the list of commands necessary to recreate everything we'd done.

Third, and I know I mentioned this, I was only mad that you took chaotic down without /warning/ me first. That's what the time delay on shutdown is for. :)

Other comments:

It becomes obvious to me that our relationship has always been plagued by communication problems, not just recently.

I still can't believe you made all that effort on code that you knew and cared nothing about. You love me. You really, really love me. :)

And I love your C language flame. It cracked me up.
Date: April 7th, 2002 07:58 pm (UTC)

Rak's Prehistory

Wait a sec. I thought this Chaotic-thing started with a chemistry set accident and a few cans of COKE. I am so gullible. On another note, it seems we have a Pre HIStory and a HERstory?
Date: September 12th, 2002 02:53 pm (UTC)
What can I say, I was young and you were cocky. ;)