The Tent Datacenter
Microsoft’s “The Power of Software” blog recently ran a post about a “datacenter” they set up inside a tent. The idea was to prove that they could run a rack of servers without any air conditioning, using only outside air to cool it, and to show how resilient servers really are. This way, the only power consumed was the power needed to run the servers themselves, which gives a better PUE (power usage effectiveness). The rack of servers actually got wet and stayed up. Interestingly (at least to them), the servers maintained 100% uptime for the entire 30-day experiment. I’m going to set aside my security and environmental concerns just for the sake of this argument.
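For anyone who hasn’t run into PUE before, it is the ratio of total facility power to the power drawn by the IT equipment alone, so 1.0 is the theoretical best. Here is a rough sketch of the arithmetic in Python. The kilowatt figures are made up purely for illustration and are not taken from Microsoft’s post.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical numbers (assumptions, not measurements from the experiment):
# a conventional room spending as much on cooling and overhead as on servers,
# versus a tent where the servers are the only load.
conventional = pue(total_facility_kw=20.0, it_equipment_kw=10.0)
tent = pue(total_facility_kw=10.0, it_equipment_kw=10.0)

print(f"conventional room PUE: {conventional:.2f}")
print(f"tent PUE:              {tent:.2f}")
```

The closer the number gets to 1.0, the less power is going to anything other than the servers, which is the whole appeal of the tent.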
As a whole, the IT industry definitely does a lot of over-building of architecture. We buy two of everything. Through my job I’ve learned to trust nothing. Heck, sometimes I feel vulnerable driving down the road in my car, since I don’t have dual tires and dual engines to keep me going in case one fails. Don’t laugh, it’s funny because it’s true. In our industry we quickly learn that everything fails: hard drives, power supplies, RAM and, worst of all, people. Given this state of mistrust, we tend to buy much more than necessary, constantly chasing that elusive 100% uptime mark.
The idea behind this experiment was great. We really do over-purchase cooling, fire fighting, and power equipment. However, there’s one basic flaw here: we over-build for the same reason we buy seat belts, air bags, and insurance policies. 9,999 times out of 10,000 (hopefully it’s much higher!), everything is just fine when we get in our cars and drive away. But it’s that one time that will kill you, literally.
I find it interesting that people have been thinking of this as a new idea. People working with festivals and other events where temporary networks must be established have been running computers in tents for some time. I helped work with a large outdoor music festival called LifeLight for a couple of years. We did the same thing. We put computers in tents. Except for having less-than-ideal hardware, everything was fine. The machines got dew on them when they sat out overnight, but they booted up in the mornings. It works, but it’s far from ideal. I’m sad to say that we didn’t achieve the 100% uptime the Microsoft experiment did, but our equipment wasn’t quite the same.
Now I’m a progressive thinker, but seriously, let’s not forget where we came from. There’s a huge difference between 99% uptime and 99.99% uptime. Whether or not this is a reasonable goal is another conversation, but that’s what everyone seems to “need”. The big problem with this idea is this: our environment can cause disruptions in computers. As with insurance policies, we never need the “wasted” money and power spent on brick, cooling, fire, and power equipment until we need it. I do understand where they are coming from, but I’m not going to be the one telling my CEO that the servers our company depends upon are down because it’s slightly warmer outside than the servers can handle, and that we decided to save money by not buying the extra equipment to handle the load. I don’t want to be that guy.
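To put numbers on that gap between 99% and 99.99%, here is a minimal sketch that converts an uptime percentage into the downtime it allows per year. The function name and the 365-day year are my own assumptions for illustration, not anything from Microsoft’s post.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year, ignoring leap years


def allowed_downtime_minutes(uptime_percent: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Minutes of downtime permitted in a period at a given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_minutes * (100 - uptime_percent) / 100


for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows {minutes:.1f} minutes "
          f"({minutes / 60:.2f} hours) of downtime per year")
```

At 99% that works out to roughly three and a half days of downtime a year. At 99.99% it is under an hour, which leaves very little room for a hot afternoon.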
Tags: microsoft, security, uptime
Filed under: Tech Trends
Sep 22nd, 2008