(no subject)
Sep. 18th, 2002 03:03 pm
We have this one client. He's always a complete jerk towards us.
This morning, I came in and found an email from him, complaining that he couldn't access our download server, and that we must have screwed up his access again.
No Glen, we didn't screw up your access. The server is apparently dead. Oops.
I go to the server room, and get a login prompt, but no password prompt after I've entered my username. Switching virtual consoles works, but the login just hangs.
For lack of any better idea of what to do, I reboot it. It pops up with "No operating system found". Erk.
After a few minutes of ineffectual troubleshooting, in which it appears that the internal hard drive can't be seen at all, I shut the thing down and bring up the old server, which will luckily provide most of the functionality of the new server until I can fix it.
Except when I'm unplugging the scsi cable from the disk array, the little screws that hold the scsi connector in place decide that they like the scsi cable better than the disk array, and stick to it. Fixing this requires opening up the disk array case.
Somewhere in this process, one of the little screws gets tweaked somehow so that it will not go back in correctly. Fine, I'll just leave it off and find a replacement someday when things are going right. But no, the scsi cable won't stay in place without both screws (there are actually two scsi cables in play here, one for the old server and one for the new server. They are not interchangeable).
I finally manage to get the screw in place, and get the server powered on and slightly reconfigured to take changes in the file structure into account.
By now, this process has taken an hour and a half, while it should have taken half an hour tops. Oh well. Glen can just deal.
I take the server back to my desk. It works perfectly. I figure it must be some sort of interaction between the SCSI and IDE, but hell if I know what.
After Glen is done with his download, I take the server back to the server room and set it up again. The disaster this time around is that somehow the scsi terminator has gone missing. I swear, we have gremlins who are out to get me. I steal the terminator from our old disk array (which is no longer in use) and use that instead. Grr.
I turn the server back on and, not surprisingly, I get the "Operating system not found" message again. This time, I double check the BIOS setup. It looks like the boot order was switched, and that the external scsi drives are considered removable media. Hrm.
I switch the boot order back, and everything works. Hurray.
A while later, I found the missing scsi terminator. It was sitting on the ground, partially hidden under the rack. Why? I don't know.
So everything is apparently back to normal now, and I've been reminded why it is a good idea to take backups.
The two most important questions here: Why did the system hang in the first place (there is no indication in the log files of any problem), and how did the boot order get switched (it had to have randomly switched itself, as the system has booted correctly in the past)?
And, of course, how do we get the damn gremlins out of the server room?
no subject
Date: 2002-09-18 06:00 pm (UTC)
I came back and the machine was still up.
"Well," I thought, "maybe the machine is Angry, and just requires an offering of food." So I left the potato chip bag there, and the machine ran perfectly for many months.
Then, one of the mathematics teaching staff decided he was hungry, grabbed the bag of potato chips, and ripped it open. I turned around to say that he really shouldn't do that --
-- and the server crashed.
True story.
Clear skies, stout hearts.
I can relate
Date: 2002-12-13 10:33 pm (UTC)
Re: I can relate
Date: 2002-12-13 11:16 pm (UTC)
I'm pretty sure cosmic rays is one of the official BOFH excuses.