Sun Dec 25 13:49:45 EST 2005

Attrition Staff

As you may have noticed, Attrition has had a bit of downtime lately. The problems began when the box started powering off with no warning. This led to no logs, nothing on the console screen and no indication of what the problem might be. Due to the sudden loss of power, file systems were not properly unmounted, which played havoc with fsck and booting. The only way to get the machine to boot was to have our NOC run fsck by hand several times on each drive (a 25 minute process). This would last between one and three hours on average, causing a significant amount of downtime.

Since the box supports several projects and is the base of email for many people, this had to be fixed sooner rather than later. Lyger researched and coordinated a new machine to be sent to Jericho, who would handle the base installation and configuration. Since Jericho has had several stable SuSE 8.1 machines, he figured another would work just fine. The first shot at getting the OS installed worked fine. The default SuSE 8.1 kernel worked for the most part (including eth0), but didn't have all the options needed. The latest 2.4.x kernel compiled fine, but wouldn't boot. The ever-present "Kernel panic: VFS: Unable to mount root fs" was not very helpful in figuring out exactly why it wouldn't play nice. In an attempt to maximize time, kernel reconfiguring was put on the back burner and the known good default kernel was put back in place, so he could keep working on the rest of the software such as rot13, nethack and figlet. In theory.

The known default kernel (after a power cycle/reboot) came back up fine, but didn't initialize the ethernet card. Rather than a five page rant to share the frustration, suffice it to say that three complete format/reinstalls of SuSE did not fix it, nor did three relatively competent Linux/network people. This quickly led to the idea to say fsck SuSE, try Slackware. Jericho downloaded Slack 10.2 overnight and burned it to CD the next morning. Then he burned it again, remembering that writing an .iso image to a CD as a plain file wasn't very helpful for installing it. What a gimp. On a more positive note, Lyger has now fully experienced the sound of a floppy disk being flung across the room and shattering, all over Skype!

Badmouth Slackware all you want, but Attrition (and the boxes before it, including the lovable ...) has always had good luck with it. The box before this was Slackware 8.1 and did us right for over five years. Slack 10.2 was no different. Easy and clean install process, easy to grok package descriptions (even better than some of the 'verbose' SuSE package descriptions), and everything worked fine. Since Slack 10.2 is less than 30 days old, almost every package it comes with is the latest version, which was convenient. A few dozen more packages and the box was mostly ready.

Off it went to the NOC to be put in place. Next came the server-specific configuration and testing. Postfix and our own DNS cache were installed by Strange since the rest of the staff are evolutionary throwbacks that grew up on Sendmail but finally acknowledged their problem and checked into rehab. Once Strange had downed enough sake to tolerate Lyger's sexual advances, he finished off the configuration enough so that Jericho could prematurely do the cutover. We will fondly remember the last words before cutover.. "Nah, just need to get Mailman configured and we're set. Shouldn't be an issue." What a complete and total moron. In the meantime, we were constantly backing up home directories and mail spools, as well as trying to keep the old box running by NOT deleting spam or doing other unnecessary disk writes (since that may have been related to the other problems). Lyger's moment to shine came when he was constantly getting "permission denied" errors while trying to rsync home folders. Jericho kindly asked "you're doing that as root, right?", to which he had to reply "omfg.. i forgot to su". Owned.

Mailman, once installed, is a nice package that has great versatility and features for managing mail lists. Mailman, pre-installation, is a horrible nightmare that needs to die a swift death. We're not sure if it is just classic programmer arrogance, horrible coding, or the lack of chicken blood used during the installation.. but it sucked ass. Trying to get the new version (2.1.6) installed wasn't happening. I kept getting the same error message that hundreds of other people had, suggesting that it wasn't thoroughly tested outside the developer atmosphere. The dozen or so suggestions for fixing it didn't work. Half a day and two frustrated admins later, they jumped to 2.1.7 RC1 because it promised to provide more descriptive error messages.. and it did. The same error under the new version finally gave Strange an idea of what might be wrong. A minor tweak to something Jericho had tried proved fruitful, and Mailman started spitting out mail. Of course, the entire web interface is broken at the moment but oh well, we'll figure it out soon enough (read: after we procure chicken or goat blood).

As expected, after the new box arrived and got configured, the old box (now dubbed 'forcedout') managed to keep a 24 hour uptime. We're thinking that the swift kicks administered by the NOC gave it the encouragement it needed to stay alive. Either way, the new box is here, serving up this page, and processing spam and abuse faster than ever. This wouldn't have happened without support from several people. In no specific order, Jericho would like to thank:

Over the past few weeks, several people have offered support in one form or another. If you are still looking to rid yourself of the guilt of mugging that old lady, you can donate a few bucks, hookers and/or cocaine to the cause. All reasonable cash donations will be rewarded with access to the attrition image gallery, closed to the public a while back. Contact Jericho for further information.

To return their kindness and help, Jericho will repay them as best he can. All cash donations will go to Forno for hosting and putting up with the nonstop care the old box needed (which now has a 32 hour uptime, !@#$). Some extra gold pieces will be sent to Sandrak to help better gear his crappy character so he can frickin kill my poor twink rogue more often. Jericho promises to get completely trashed and chat with Raven for a few hours to offer incriminating logs. Lyger will get a double helping of Chedds when he returns from Cambodia. Strange will be showered with quality sushi and plenty of sake to help him forget his contempt toward Jericho.

If you run Mailman 2.1.6 or 2.1.7* and are having issues with it giving you errors, change the file ownership of your aliases.db to 'mailman'. If that works, promise you will mail the Mailman developers with hostile insulting mail impugning their coding ability and penis size.
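For the script-inclined, here is a rough sketch of that ownership fix in Python. It is our guess at a tidy way to do it, not anything blessed by the Mailman folks. The path to aliases.db below is an assumption (it depends entirely on where your Mailman install keeps its data), and the helper names owner_name and ensure_owner are made up for this example. Changing ownership to another user needs root, so remember to su first.

```python
import os
import pwd
import shutil


def owner_name(path):
    """Return the login name of the user owning `path` (or the uid as text)."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry; fall back to the raw number.
        return str(uid)


def ensure_owner(path, user):
    """Make `user` the owner of `path`. Return True if a change was made."""
    if owner_name(path) == user:
        return False  # already owned by the right user, nothing to do
    # Raises PermissionError if you are not root, LookupError if the
    # user does not exist on this system.
    shutil.chown(path, user=user)
    return True


if __name__ == "__main__":
    # ASSUMPTION: hypothetical location; adjust to your own Mailman install.
    aliases_db = "/usr/local/mailman/data/aliases.db"
    if os.path.exists(aliases_db):
        changed = ensure_owner(aliases_db, "mailman")
        print("ownership changed" if changed else "ownership already correct")
    else:
        print("no aliases.db at", aliases_db)
```

Checking the current owner first means the script is safe to run repeatedly and only touches the file when it actually needs to.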

That's it! The story of our x-mas. Well, everything except our x-mas play put on for the Idaho Kids Association.

-- Attrition Staff

p.s. merry x-mas, happy holidays, or whatever other common phrase is more offensive to you
p.p.s. Old news comments regarding server woes and downtime here.

