Considering what happens when something goes wrong might seem a strange starting point for deciding which data centre/colo provider to choose – assuming, as it does, that there will be problems. However, it’s a realistic approach to take. No matter what the salesperson promises, no matter how impressive the data centre facility – Tier 4, built ten miles underground, protected by a private army, with seemingly limitless power availability supplied via multiple, independent networks and equally limitless IT network connections and capacity – glitches will occur. These could be man-made, could be a crucial piece of kit failing, or could be one of those unlucky ‘one in five hundred year’ freak weather events.
Asking your potential data centre or colo provider the simple question, “What happens when something goes wrong?”, will hopefully not elicit the unrealistic answer: “Nothing can or will”. Just as your chosen data centre provider will try to reassure you that it is extremely unlikely that anything will go wrong, they will also be realistic about the possibility of some kind of failure. This is where the Service Level Agreement (SLA) comes to prominence – with both parties debating and negotiating the (un)likely failure scenarios and the speedy resolution of such problems.
However, as more than one respected data centre professional has pointed out to me in the past, if a data centre/colo service issue has to be resolved by recourse to an SLA, then there’s almost certainly something wrong in the relationship between the provider and the customer.
In any case, when there is a problem, the SLA actually comes some way down the priority list: “You said that this wouldn’t happen, and if it did you’d give me four months’ free power” is not a particularly useful starting point when, say, your website is down and you are missing out on hundreds or thousands of pounds’ worth of sales per minute or hour. No, when something goes wrong, you want to have confidence in the process that will lead to the problem being fixed as quickly as possible.
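To put rough numbers on that point, here is a minimal sketch comparing the sales lost during an outage with the value of an SLA credit. Every figure in it (the sales rate, the outage length, the monthly power bill) is a made-up assumption for illustration, as are the function names; substitute your own numbers.

```python
def downtime_loss(sales_per_minute: float, outage_minutes: float) -> float:
    """Sales missed while the website is unavailable."""
    return sales_per_minute * outage_minutes


def sla_credit(monthly_power_bill: float, free_months: int) -> float:
    """Value of an SLA remedy expressed as months of free power."""
    return monthly_power_bill * free_months


# Hypothetical figures, chosen only to illustrate the comparison.
loss = downtime_loss(sales_per_minute=500.0, outage_minutes=180)
credit = sla_credit(monthly_power_bill=2000.0, free_months=4)
shortfall = loss - credit

print(f"Lost sales: £{loss:,.0f}")
print(f"SLA credit: £{credit:,.0f}")
print(f"Shortfall:  £{shortfall:,.0f}")
```

With these assumed figures the credit covers only a small fraction of the lost sales, which is why the speed of the fix matters more than the compensation clause.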
Next up, and most difficult to establish before signing a contract, is understanding the procedure for registering a service issue, and how that issue is resolved and/or escalated over time. Clearly, there are different types and levels of problems that will need to be resolved, so it’s important to understand how many different service levels there might be and how each of these levels is addressed.
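One way to make a provider’s answer concrete is to write it down as a table of service levels, each with a response target and an escalation chain. The sketch below is purely illustrative: the level names, time limits and job titles are assumptions, not any particular provider’s process.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Tuple


@dataclass(frozen=True)
class ServiceLevel:
    name: str
    description: str
    respond_within: timedelta          # target time to first response
    escalate_every: timedelta          # interval before moving up the chain
    escalation_chain: Tuple[str, ...]  # who owns the issue at each step


def current_owner(level: ServiceLevel, elapsed: timedelta) -> str:
    """Return who should own an unresolved issue after `elapsed` time."""
    steps = elapsed // level.escalate_every  # whole escalation intervals passed
    index = min(steps, len(level.escalation_chain) - 1)  # stop at top of chain
    return level.escalation_chain[index]


# Hypothetical service levels; ask your provider for their real ones.
P1 = ServiceLevel(
    name="P1",
    description="Service down, business impact",
    respond_within=timedelta(minutes=15),
    escalate_every=timedelta(minutes=30),
    escalation_chain=("service desk", "duty engineer", "operations manager"),
)
P3 = ServiceLevel(
    name="P3",
    description="Minor issue, no business impact",
    respond_within=timedelta(hours=4),
    escalate_every=timedelta(hours=24),
    escalation_chain=("service desk", "duty engineer"),
)

print(current_owner(P1, timedelta(minutes=45)))
print(current_owner(P3, timedelta(hours=6)))
```

If a provider cannot fill in a table like this for each type of problem, that in itself tells you something about how issues are likely to be handled.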
To add a further degree of complexity, there’s the small matter of the increasing trend towards self-service – where the customer is given the ability to manage their own facility and IT infrastructure within the data centre/colo building. Self-service diagnostics are all well and good, so long as they cover the problem(s) you are encountering. Inevitably, as with the basic options one is given when using a customer service phone line, there will be problems that do not seem to be covered by the menu options you are given online. So, what happens then?
On the plus side, your data centre/colo provider can monitor your facility/IT environment on your behalf, spotting and solving potential problems before they have any impact on your business. Is such a service provided as ‘standard’, or will you need to pay extra for the peace of mind your colo can provide?
Finally(!), we arrive at the crucial link of the whole customer service chain: what happens when you phone the NOC/service desk to flag up an issue, as is almost certain to happen at some stage during your contract? Well, hopefully, you won’t be met with the kind of customer service line mentioned above. You want to speak to a real person, and someone who has a good understanding of your facility/IT infrastructure within the overall data centre/colo environment. Even if they are not monitoring your kit specifically, they should have a pretty good idea of any ‘universal’ issue that is affecting the facility, and, via their monitoring tools, they should even have a pretty good idea of the specifics that relate to your rack(s). Ideally, although it is not essential, a NOC that is not located on the data centre site will have direct and quick access to someone who is on site and is, therefore, well placed to sort out any problem.
No doubt there are other customer service issues that need to be considered when evaluating your data centre/colo options, but the main consideration has to be the risk/reward equation that you believe gives you the optimum service level at a price that you are happy to pay.
Talking to a colo’s existing customers is a great way of finding out how the customer service ‘theory’ works in practice, and is a vital part of any decision-making process. After all, it’s no accident that, when it comes to customer service across virtually all industries, there are companies who are recognised as leaders, and many others who lag some way behind. There is no reason why the data centre/colo space should be any different, so make sure you do your research, or at least understand that if your colo seems to be offering an amazing deal, then it may well be that customer service is not one of their major priorities!