Saving our Server

About two weeks ago, our fair server (home to Chris’ IB History Topics, Helen’s new blog, my business websites, etc.) shut down for no apparent reason at 4am. Panic!

I spent the day combing through log files, searching for hints of an intruder. The shutdown had scrambled several log files, so I decided to rebuild the system from scratch. Right in the middle of doing so, and coincidentally at 4pm, the computer shut down again. Argh!

Still assuming a South Korean hacker was coming for my lovely family photos and insightful rhetoric, I downgraded the software to a slightly earlier version, hoping its bugs had already been ironed out. Installing the earlier Linux operating system solved the problem. For two days. Then another 4am shutdown.

I got the bright idea that the computer might be overheating. Most computer equipment these days has sensors inside that measure temperature and shut the machine down if it gets too high. Accessing the sensors is easy on Mac OS, but a bit more complicated on Linux. I installed lm-sensors (a data extraction tool) and sensorsd (a separate tool to query lm-sensors and provide a log). The CPU is showing 67º Celsius (about 153º for you Fahrenheit people). Is that high? Querying Google, I find an engineering doc from Intel that rates the Celeron processor in the server at a maximum temperature of 67º. Hmm.
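For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Celsius-to-Fahrenheit conversion and the "how close am I to the rated maximum?" question. It is not the tool used above. The function names are made up for illustration, the 67º limit is just the figure quoted for this particular Celeron, and the reader assumes a Linux kernel that exposes temperatures as millidegrees Celsius under /sys/class/hwmon, which may not match every machine.

```python
from pathlib import Path

# Maximum quoted in the post for this Celeron; an assumption for any other CPU.
RATED_MAX_C = 67.0


def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def headroom_c(reading_c: float, rated_max_c: float = RATED_MAX_C) -> float:
    """Degrees Celsius left before the rated maximum (zero or less means at or over)."""
    return rated_max_c - reading_c


def read_hwmon_temps(root: str = "/sys/class/hwmon") -> list[float]:
    """Read every temp*_input file under the hwmon sysfs tree.

    The kernel reports these values in millidegrees Celsius, so divide by 1000.
    Returns an empty list if the directory or the sensors are not there.
    """
    temps = []
    for path in sorted(Path(root).glob("hwmon*/temp*_input")):
        try:
            temps.append(int(path.read_text().strip()) / 1000.0)
        except (OSError, ValueError):
            continue  # sensor missing or unreadable; skip it
    return temps


if __name__ == "__main__":
    for t in read_hwmon_temps():
        print(f"{t:.1f} C = {c_to_f(t):.1f} F, headroom {headroom_c(t):.1f} C")
```

A reading of 67º leaves zero headroom against a 67º rating, which is exactly why that number looked worrying.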

I take the casing off the computer. I move the server down to the garage. Winter in Temecula means outside temperatures between 34º and 65º Fahrenheit, so the garage location should cool it down a bit. I also blow some compressed air around the insides of the computer and particularly around the fan connected to the CPU.

Temperatures go down for a while – 57º, 59º, 65º, 57º – but then spike again after a few days – 65º, 67º, 69º, 67º. What’s going on? Do I need to get a new computer?
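To put numbers on that drift, here is a small Python sketch that compares the two batches of readings quoted above against the 67º rating. The helper name is hypothetical and the readings are simply the eight values from this post, not a real log file.

```python
from statistics import mean

RATED_MAX_C = 67.0  # rated maximum quoted earlier in the post

# The two batches of readings from the post, in degrees Celsius.
after_garage_move = [57, 59, 65, 57]
a_few_days_later = [65, 67, 69, 67]


def at_or_over_limit(readings, limit=RATED_MAX_C):
    """Return the readings that sit at or above the rated maximum."""
    return [r for r in readings if r >= limit]


if __name__ == "__main__":
    for label, batch in [("after move", after_garage_move),
                         ("days later", a_few_days_later)]:
        hot = at_or_over_limit(batch)
        print(f"{label}: mean {mean(batch):.1f} C, {len(hot)} of {len(batch)} at or over limit")
```

The first batch averages 59.5º with nothing at the limit; the second averages 67º, with three of four readings at or over it.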

Last night, about 9pm, I return home from taking Daniel to his periodic ultrasound exam. He fell asleep in the back of the Toyota before we got back. I check the server. 72º C, right at the edge of shutdown. I go to the garage with a can of compressed air and a flashlight. I tilt the computer over 45º and shine the flashlight directly into the spinning fan that sits on the CPU. What do my eyes behold but a solid mass of dust and lint, plugging every single cooling vent on the heatsink. It was completely invisible when the fan wasn’t running, but under the flashlight with the fan spinning, it stands out like dog poop on a red carpet entrance. I spend five minutes cleaning with compressed air and some tape to remove the larger particles.

This morning, after running for 12 hours, all temperatures are at 38º, with a 4am spike of 56º. Our server is safe again …

3 Responses to “Saving our Server”

  1. CRS says:

    Wow, I just did a Google search on my last two blog entry titles (or versions thereof), and the blog comes up right at the top or right in the mix with some real stuff. I think I’m going to have to watch myself a bit.

  2. Ed Steussy says:

    That’s why I changed the data structure of the blogs. I was tired of other blogs beating us out on Google searches when it was an easy change to make us more find-able. Yes, we should be near the top of most Google searches in our subject areas.

  3. nic says:

    My CPUs run 37 to 40C, even under load. But then they are cleaned pretty regularly and I’ve moved away from the default heat-sink/fan unit.
