There’s actually a horror movie called ‘IT’ – although it’s more commonly written as ‘It’ and, according to IMDb, it’s/It’s about ‘a group of bullied kids who band together when a shape-shifting demon, taking the appearance of a clown, starts hunting children’. A nice family movie, then! And, of course, there are all those movies where technology starts misbehaving and destruction ensues.
Hopefully, when it comes to data centres, there’s not too much destruction about. But, with Halloween just around the corner, it might be worth checking that you’re prepared not so much for an invasion of demons, clowns, or children – just those nightmare scenarios that can cause data centre meltdown, but can be avoided with a little bit of planning…
Flooding is an interesting topic, bearing in mind that two of the world’s data centre hotspots – Amsterdam and London – have potential issues in this area. Data centres shouldn’t be built in flood risk zones, nor in or around busy airports. As for earthquake zones… but then there’d be no data centres in Los Angeles, San Francisco, or Japan. Whoops!
Of course, the reality is that, all too frequently, regardless of the natural or man-made risks, data centres need to be built in a specific location, so it’s vitally important to ensure that there is a backup/disaster recovery plan in place.
And that’s the first horror scenario – the backup plan that doesn’t work. Perhaps a loss of power requires the emergency generator to kick into action, but you haven’t tested it for 12 months and it doesn’t fire. Perhaps you fail over to the backup data centre, only to discover you have just one connection to the outside world – and the helpful men digging up the road have just cut through that cable. Or perhaps you do have two separate connections leaving the data centre, but both are supplied by the same provider, who just happens to suffer a major network meltdown…
Not having a resilient, well thought out backup and disaster recovery plan could spell the end of your business. And there’s no excuse not to have one.
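To see why two links from the same provider aren’t real redundancy, here’s a minimal back-of-the-envelope sketch. All the availability figures are made-up assumptions for illustration, not real provider data:

```python
# Rough availability arithmetic for network links.
# All figures are illustrative assumptions, not real provider statistics.

link_availability = 0.99            # assume each individual link is up 99% of the time
provider_core_availability = 0.999  # assume the shared provider's core network is up 99.9%

# Two truly independent links: an outage needs BOTH to fail at once.
independent = 1 - (1 - link_availability) ** 2

# Two links from the same provider: the shared core network is a single
# point of failure, so overall availability is capped by it.
same_provider = provider_core_availability * (1 - (1 - link_availability) ** 2)

print(f"Two independent providers: {independent:.4%}")
print(f"Same provider twice      : {same_provider:.4%}")
```

With these assumed numbers, the second link barely helps if both routes terminate in the same provider’s network – the shared core dominates the failure risk.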
Security is another major concern. There’s physical security – I chuckle every time the ‘unbreakable’ entry system is bypassed to let in a group of journalists visiting a data centre facility. And there’s IT security – not so much the outside threat, which is almost impossible to stop but not impossible to manage. Nope, I’m thinking of the disgruntled employee who leaves and – they can’t believe their luck – finds their passwords still work a week later. Revenge is a dish best served cold!
And those waiting for the more ‘exotic’ data centre nightmares, well, time to pay attention.
Heard the one about the construction work that created so much dust that it made its way into the inadequately curtained-off data centre? Servers don’t like dust.
And then there’s the one where the construction dust was being sucked into the computer room via the cooling plant. Easily solved – turn off the cooling plant. Ah, but the weather was hot, and when some of the servers stopped working, it was discovered that the temperature inside the room made the Sahara Desert seem bracingly cool.
And what about all those confident souls who know they shouldn’t take a drink with them when they go to do some maintenance work inside the data centre? They know the risk, but they wouldn’t be as stupid as some people and actually spill the drink, would they? Liquid cooling requires rather more sophistication than a spilt can of lemonade…
Oh, and what about all those data centres full of really old, legacy kit, with that one Stone Age server running the company’s critical application? It’s held together by elastic bands and the knowledge of the 60-something employee who’s due to retire any day now – if the machine doesn’t break first. The mental gymnastics required to justify this situation, as opposed to migrating or rewriting the application, are too frightening for words. Yes, the transfer process might be painful and expensive, but I’m guessing it won’t be quite as expensive or painful as a catastrophic failure.
And let’s take a step back in time and look at major data centre disasters throughout history.
It’s widely believed that, in the early 80s, an error in an early warning system alerted the Soviets to the fact that the USA had launched a handful of ballistic missiles at Russia. Fortunately, the duty officer suspected that an American attack would be rather more full-blooded than five or so missiles, investigated the situation and averted, well, make up your own mind on this one.
In 1990, a single switch failure caused a major global telecoms data centre to shut down. On being repaired, the switch sent the same shutdown message to the company’s other switching centres. Some 75 million phone calls were dropped, and the cost to business was, well, immeasurable.
More recently, the UK’s Child Support Agency managed to overpay – yes, overpay – by one billion pounds.
And then there’s the story of the telecoms provider’s Istanbul data centre which suffered a flash flood during a cataclysmic downpour. Computers and furniture started floating around. The good news? The company had an effective DR plan in place which minimised the impact.
Moving right up to date, we’ve had an airline brought to a virtual standstill when, word has it, an employee disconnected a power supply and then re-connected it. Chaos ensued.
Yes, good old human error is right up there with natural disasters when it comes to data centre horror stories.
The lesson to be learnt from this look at data centre horror stories? Plenty of potential disasters are avoidable. Think before you, your colleagues, your contractors and your customers act, and you’ll keep data centre failures to a minimum. As for the disasters you can’t avoid? Well, putting in place the right DR plan can mitigate the consequences of ‘Acts of God’ (as well as stupidity).
I couldn’t end this blog without casting a glance in the direction of the most likely data centre fright stories you’ll encounter, especially when dealing with third party data centre providers. First up, be very sure you know exactly what it is (and isn’t) you are paying for when it comes to data centre services. It’s no good assuming that your monthly bill includes everything you imagine, without checking the contract.
A true story for you: a Swiss businessman with a warehouse situated at a major road and rail hub sold the unit. The new owner assumed that the sale contract included access from the back of the warehouse to load goods onto the railway. Alas, this wasn’t covered, so (substantially) more money had to change hands to obtain such access! Imagine you pay a monthly fee assuming you have a certain amount of network availability/resiliency, but when you need to use such a feature, your provider explains the extra charges this will incur.
Finally, have you ever bothered to check the exact terms of the various Service Level Agreements you might have with data centre and service providers? Most obviously, the penalties that the providers might agree to pay in case of any service level breaches will often bear little relation to the extent of the disruption you might experience. One month’s free power as recompense for a 24-hour ecommerce website shutdown doesn’t seem very balanced! Okay, so SLAs are a minefield, but do make sure you have some idea of what it is you are risking when you sign up to any particular agreement.
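To put some numbers on that imbalance, here’s a quick sketch comparing a typical SLA credit with the real cost of an outage. All the figures are entirely hypothetical – substitute your own contract terms and revenue numbers:

```python
# Hypothetical figures only - not from any real contract.
monthly_power_bill = 5_000.0   # assumed monthly power charge (GBP)
hourly_revenue = 10_000.0      # assumed ecommerce revenue per hour (GBP)
outage_hours = 24              # the 24-hour shutdown from the example above

sla_credit = monthly_power_bill          # "one month's free power"
business_loss = hourly_revenue * outage_hours
shortfall = business_loss - sla_credit

print(f"SLA credit   : £{sla_credit:,.0f}")
print(f"Revenue lost : £{business_loss:,.0f}")
print(f"Shortfall    : £{shortfall:,.0f}")
```

Even with these modest assumed numbers, the credit covers only a tiny fraction of the loss – which is exactly why it pays to read the SLA before you need it.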
Make sure you keep your eyes wide open and don’t hide under the bedclothes when it comes to dealing with data centre problems.