Just call this the week from hell. Actually, it was the month from hell. Server crashes plus configuration problems on a new server this week brought just about everything to a halt.

The trash bin in the back of our data center is littered with the remnants of servers that didn’t work as promised and hard drives corrupted by viruses and worms because an antivirus program didn’t work as it should have either.

Part of the problem started with a planned move from our data center in Loudoun County to a new one at Virginia Tech’s Corporate Research Park in Blacksburg. The idea was to have the server farm closer at hand. This meant buying new equipment and, in true Murphy’s law fashion, some of the equipment didn’t perform the way I expect new servers to.

Since the first of April, I’ve had three of Sun’s new CoolThreads Sun Fire servers bite the dust, two Dells running Linux suffer kernel failures and one Windows 2003 server reset itself and destroy everything from the previous 10 days.

Capitol Hill Blue, which runs on multiple servers, went offline three times in the last three weeks, crippled by a bug in a new content management system. The same bug corrupted the backups without our knowledge.

We thought we were on the homestretch Friday, moving the last of our servers, the one containing blogs for Fred First, Colleen Redman and others. But Fred’s blog crashed before the move and wouldn’t come back up on the new server. We finally traced the bug late Friday and finished the move at 12:50 a.m. today.

So far (fingers crossed) everything is running fine. I’ve suffered more hardware and software problems in the last three weeks than in the last 11-and-a-half years of running and hosting web sites.

Lessons learned:

  • Sun servers ain’t what they used to be. I guess I shouldn’t be surprised. David St. Lawrence, an escapee from the corporate drudgery of Sun, said it ain’t the company it used to be either.
  • Abacus, a server co-location company located in San Diego and Germany, is a ripoff. I had hoped to locate a mirror site there, but their tech people failed to respond in a timely manner when we needed assistance, and it took them three days to fix a minor problem. That’s a shame. Abacus used to be a good company. Now they are just sham artists in it for the quick buck.
  • Backups don’t help when the file corruption that brings down a server has already been copied into the backup files.
  • Linux is good for running TiVo and small web sites, but it doesn’t have the power for large, full-scale operations that demand multi-threading, heavy processing and high traffic.
  • I need some sleep.
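
On the backup lesson above: one possible guard, sketched here and not something we had in place, is to record checksums of files that should not change while the system is known to be good, then check them before each backup run so a corrupted file gets flagged instead of quietly copied. The function names below are hypothetical, and this only catches changes to files that are supposed to stay the same. It would not necessarily have caught a bug that corrupts content the CMS legitimately rewrites.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum.

    Run this while the files are known to be good and store the result
    somewhere the server itself cannot overwrite.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changed(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or no longer match the manifest."""
    changed = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(name)
    return changed
```

If `find_changed` returns anything, the sensible move is to hold off on overwriting the last good backup until someone has looked at the listed files. Keeping several older backup generations around helps for the same reason.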