[personal profile] ocelot

We have this one client. He's always a complete jerk towards us.

This morning, I came in and found an email from him, complaining that he couldn't access our download server, and that we must have screwed up his access again.

No Glen, we didn't screw up your access. The server is apparently dead. Oops.

I go to the server room, and get a login prompt, but no password prompt after I've entered my username. Switching virtual consoles works, but the login just hangs.

For lack of any better idea of what to do, I reboot it. It pops up with "No operating system found". Erk.

After a few minutes of ineffectual troubleshooting, in which it appears that the internal hard drive can't be seen at all, I shut the thing down and bring up the old server, which will luckily provide most of the functionality of the new server until I can fix it.

Except when I'm unplugging the SCSI cable from the disk array, the little screws that hold the SCSI connector in place decide that they like the SCSI cable better than the disk array, and stick to it. Fixing this requires opening up the disk array case.

Somewhere in this process, one of the little screws gets tweaked somehow so that it will not go back in correctly. Fine, I'll just leave it off and find a replacement someday when things are going right. But no, the SCSI cable won't stay in place without both screws. (There are actually two SCSI cables in play here, one for the old server and one for the new server. They are not interchangeable.)

I finally manage to get the screw in place, and get the server powered on and slightly reconfigured to take changes in the file structure into account.

By now, this process has taken an hour and a half, while it should have taken half an hour tops. Oh well. Glen can just deal.

I take the server back to my desk. It works perfectly. I figure it must be some sort of interaction between the SCSI and IDE, but hell if I know what.

After Glen is done with his download, I take the server back to the server room and set it up again. The disaster this time around is that somehow the SCSI terminator has gone missing. I swear, we have gremlins who are out to get me. I steal the terminator from our old disk array (which is no longer in use) and use that instead. Grr.

I turn the server back on and, not surprisingly, I get the "No operating system found" message again. This time, I double-check the BIOS setup. It looks like the boot order was switched, and that the external SCSI drives are considered removable media. Hrm.

I switch the boot order back, and everything works. Hurray.

A while later, I found the missing SCSI terminator. It was sitting on the ground, partially hidden under the rack. Why? I don't know.

So everything is apparently back to normal now, and I've been reminded why it is a good idea to take backups.

The two most important questions here: Why did the system hang in the first place (there is no indication in the log files of any problem), and how did the boot order get switched (it had to have randomly switched itself, as the system has booted correctly in the past)?

And, of course, how do we get the damn gremlins out of the server room?

Date: 2002-09-18 06:00 pm (UTC)
From: [identity profile] gdmusumeci.livejournal.com
When I was a freshman in college, I ran a few systems for the mathematics department at my high school. They had an AppleShare file server that was not, as they say, particularly reliable. After spending an afternoon working on it, and suffering through random crashes, I walked down the block to get something to eat, and left an unopened bag of potato chips on the machine.

I came back and the machine was still up.

"Well," I thought, "maybe the machine is Angry, and just requires an offering of food." So I left the potato chip bag there, and the machine ran perfectly for many months.

Then, one of the mathematics teaching staff decided he was hungry, grabbed the bag of potato chips, and ripped it open. I turned around to say that he really shouldn't do that --

-- and the server crashed.

True story.


Clear skies, stout hearts.

I can relate

Date: 2002-12-13 10:33 pm (UTC)
From: [identity profile] bongo3045.livejournal.com
I've got this new box in our test environment and am just loving the speed out of the damn thing. It's got Dual Sparc III's with 2GB of memory. So one morning I walk in and the system is just dead. No life whatsoever. I console into the thing and run diagnostics on it, and find one of the banks of memory has gone bad. DAMN it. Well, to my luck the reseller had some spares in stock and sent them out to replace the whole bank. The whole time I'm sitting here trying to figure out what could do this to my memory. The answer that really threw me for a loop was "Well, it could be cosmic rays". I was like WTF? Cosmic rays? What are you trying to feed me here... anyways... box works again... everyone is happy... and of course... SCREW GLEN... he's just another user :)

Re: I can relate

Date: 2002-12-13 11:16 pm (UTC)
From: [identity profile] therealocelot.livejournal.com
Glen is not just another user. He's the User from Hell, as he managed to prove once again this week. If he would just go away, our complaints would go down at least 50%. Luckily, the manager recognizes his complaints as idiotic and isn't terribly concerned about it.

I'm pretty sure cosmic rays is one of the official BOFH excuses.
