Monday, September 24, 2012

Data Centers: Not Exactly About Nuclear Energy But All About Energy

No Commodore 64s at the data center.
The New York Times takes a look at large data centers, the warehouses of computers that power large web sites like Amazon and Facebook and Google (and plenty of others). We’ve noted these in the past because many of them have set up in places like Illinois, Virginia and North Carolina (that is, states well covered by nuclear energy) while professing a strong desire to use renewable energy, to the extent that some of them want to install their own wind farms or solar arrays.

We called this silly then, but now the Times' year-long investigation has revealed a rather more alarming angle, because the data centers are environmental sump holes:
At least a dozen major data centers have been cited for violations of air quality regulations in Virginia and Illinois alone, according to state records. Amazon was cited with more than 24 violations over a three-year period in Northern Virginia, including running some of its generators without a basic environmental permit.
And wasteful consumers of electricity, which is more relevant to this discussion:
Most data centers, by design, consume vast amounts of energy in an incongruously wasteful manner, interviews and documents show. Online companies typically run their facilities at maximum capacity around the clock, whatever the demand. As a result, data centers can waste 90 percent or more of the electricity they pull off the grid, The Times found.
Data centers are essentially factories that may be too new in kind to have effectively understood how to marshal resources to maximize profit. (I can’t believe they are designed to be wasteful, as implied above.) After all, nuclear energy facilities require considerable energy themselves, but successfully manage resources to make the production of electricity affordable. This is true of any industry that hopes to stay in business.

The sudden rise of data centers hasn’t been matched by good procedure or an understanding of how to maximize output – a lot of the computers at the data centers, for example, consume electricity and do little or no computation. That’s where the waste piles up.
A senior official at the data center already suspected that something was amiss. He had previously conducted his own informal survey, putting red stickers on servers he believed to be “comatose” — the term engineers use for servers that are plugged in and using energy even as their processors are doing little if any computational work.
“At the end of that process, what we found was our data center had a case of the measles,” said the official, Martin Stephens, during a Web seminar with Mr. Rowan. “There were so many red tags out there it was unbelievable.”
The story doesn’t really explain why those servers are not being used – probably to kick in if another server fails – but the lack of a process to identify them and determine how many have to be kept on as back-ups is just bad planning. The Times suggests that this is because careers ride on containing outages, hence the reluctance to switch off any available server. Maybe that’s part of it – maybe the bottomless bank accounts at some of these companies just make it easier not to really fret about it – or about the electricity bills. It all gets paid.
In addition to generators, most large data centers contain banks of huge, spinning flywheels or thousands of lead-acid batteries — many of them similar to automobile batteries — to power the computers in case of a grid failure as brief as a few hundredths of a second, an interruption that could crash the servers.
“It’s a waste,” said Dennis P. Symanski, a senior researcher at the Electric Power Research Institute, a nonprofit industry group. “It’s too many insurance policies.”
I wouldn’t call the story a horror show of energy malfeasance – there’s no evidence that these data centers have destabilized the grid, though that day could come. But the story also makes clear that these operations can run smoothly:
The National Energy Research Scientific Computing Center, which consists of clusters of servers and mainframe computers at the Lawrence Berkeley National Laboratory in California, ran at 96.4 percent utilization in July, said Jeff Broughton, the director of operations. The efficiency is achieved by queuing up large jobs and scheduling them so that the machines are running nearly full-out, 24 hours a day.
That’s about as good as a nuclear energy facility, although queuing jobs may make more sense for processing scientific data than for sharing funny kitten videos – one thing is not like the other. It also suggests that the lab uses all the resources it has rather than all the resources it can buy. You don’t have to worry about an efficient outcome if you can just throw more resources at an issue. Cut the number of servers in half and watch efficiency skyrocket.
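The arithmetic behind that last quip is simple enough to sketch. The numbers below are made up for illustration (they are not from the Times story), and the `utilization` helper is hypothetical; the point is only that utilization is useful work divided by available capacity, so the same workload on half the servers doubles the figure. That assumes, of course, that the remaining servers can actually absorb the load.

```python
def utilization(work_done, servers, capacity_per_server=1.0):
    """Fraction of total server capacity that is doing useful work."""
    return work_done / (servers * capacity_per_server)

# Hypothetical figures, chosen only to illustrate the ratio.
work = 100.0  # units of useful computation demanded

before = utilization(work, servers=1000)  # same work spread over 1000 servers
after = utilization(work, servers=500)    # same work on half as many servers

print(f"before: {before:.0%}, after: {after:.0%}")
```

Nothing about the servers improved here; only the denominator shrank, which is why a lab that uses all the resources it has looks so much better than a company that buys all it can.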

Is there a role for nuclear energy in this story? Yes, but mostly because it’s there, in those states with the data centers. If the grid has stayed stable, it’s because of the 24-7 nature of nuclear energy (and fossil fuels, too). These data centers say they would prefer intermittent renewable resources – which I think would put to use all those diesel backups referenced in the story in no time flat. Using wind power sounds good when a data center official is talking to Greenpeace, but these companies did move to states with inexpensive electricity supported by nuclear energy. I’m not saying one thing led to the other – there’s no evidence of it – but there it is.

But this is more a story about unintended – I hope – malfeasance in a relatively immature industry. Assuming the hot water doesn’t scald, there’s plenty of room for a course correction. Step one, start queuing those jobs.

2 comments:

Joseph said...

The article sounded like an obvious attempt by the legacy media to encourage regulations that would handicap their competitors.

Jack Harrison said...

For the media to compare data centers to nuclear facilities is simply outrageous. I can definitely see the source of this being some of the top dogs in data center management, such as IBM, RackWise, and AlphaPoint.