The testing of critical infrastructure needs to be considered from the inception of a data centre design project and shouldn’t be viewed as something that is performed ‘at the end’. But what exactly constitutes a ‘good’ approach to testing? Louise Frampton reports
Zac Potts, associate director at Sudlows, warns that testing is a crucial part of the data centre design process and should not be viewed as something that ‘happens at the end of a project’.
Speaking at Data Centre Summit, London, he pointed out that it is not only about proving resilience, reliability and availability but also about providing confidence that the systems installed will “work as intended, when they need to”.
Systems will be engineered to match the client’s needs, but to people unfamiliar with a project some systems may appear to be over-engineered, while others may appear to have been aggressively ‘value engineered’. Early consideration of how a system will be tested can help uncover shortfalls in design.
“You need to understand the limitations before you start testing a facility,” says Potts. Understanding the basis of the design to be tested will:
• Help avoid failed tests
• Potentially alert the design team to unintended consequences of design decisions
Initial testing is required for ‘proof of operation’, while ongoing testing ensures ‘continual confidence’.
However, Potts points out that there is a misconception that ‘testing happens after everything else’. A good testing regime should start at inception, he advises.
“Although the bulk of testing is carried out at the end of the project, a good approach starts early on. When specifications are not as thorough as they could be, you need to look at how the systems can be improved,” comments Potts.
“At Sudlows, there is a dedicated team that is focused on testing and commissioning. I know that the sooner that I can sit down with that team and go through the project the better it will be. It is much easier and less costly to change something on paper than trying to resolve an issue when the project has started. Having this opportunity is invaluable.”
Early consideration of how systems will be tested will enable the risks associated with testing to be minimised. Furthermore, if a piece of plant is identified as requiring replacement as part of ongoing maintenance, thought needs to be given to how the replacement will be tested. In the case of modular systems, if the facility is seeking to expand the UPS system, will every module need to be approved as the system is expanded? How will it be tested? This needs careful consideration as soon as possible, comments Potts.
So which systems need to be considered and will they require consideration in isolation or as part of a larger system? Data centres are complex, containing multiple systems each with multiple interactions and dependencies, and this needs to be understood from a testing perspective, Potts points out.
Factory visits and witnessing
Outlining the stages of a thorough testing and validation approach, Potts explains that this should start with ‘pre-testing’ as part of due diligence, by interacting with the supply chain, visiting factories and witnessing potential or specified systems.
Testing before the system is ordered means there is less inertia against changing manufacturer if the system performs poorly in the test. It also gives an opportunity to see the manufacturer’s support structure and what it does in terms of end-of-line testing.
Factory acceptance testing
The second stage, factory acceptance testing, takes place once an order has been placed. “This is essentially an ‘out of context test’ and is an opportunity to do things you may not be able to do on site.
“For example, this may include overloading the system or putting the system in environmental chambers to simulate conditions that would be difficult to re-create on site,” explains Potts.
“The key to factory acceptance testing is to understand what the manufacturer or supplier is offering and to ensure it aligns with what you want to see.
“Prior to the test, you will need to have a clear idea of what parameters will be tested, what will constitute a pass or fail, and what you will do if something doesn’t pass – how will this impact on the project?”
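As a rough illustration of agreeing pass/fail criteria before a factory test, the sketch below records each parameter, its acceptable range and the agreed response to a failure, then checks measured values against them. The parameter names, limits and actions are invented placeholders for illustration only; they are not figures from Potts or Sudlows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One agreed acceptance criterion (all values here are hypothetical)."""
    parameter: str
    unit: str
    minimum: float
    maximum: float
    action_on_fail: str  # what was agreed to happen if this does not pass

    def passes(self, measured: float) -> bool:
        return self.minimum <= measured <= self.maximum


def evaluate(criteria, measurements):
    """Return a result string per parameter: PASS, FAIL (with action) or NOT MEASURED."""
    results = {}
    for c in criteria:
        value = measurements.get(c.parameter)
        if value is None:
            results[c.parameter] = "NOT MEASURED"
        elif c.passes(value):
            results[c.parameter] = "PASS"
        else:
            results[c.parameter] = f"FAIL: {c.action_on_fail}"
    return results


# Placeholder criteria, written down before the test takes place.
CRITERIA = [
    Criterion("output_voltage", "V", 396.0, 404.0, "retest after adjustment"),
    Criterion("battery_autonomy", "min", 10.0, 60.0, "escalate to project team"),
]

if __name__ == "__main__":
    print(evaluate(CRITERIA, {"output_voltage": 400.2, "battery_autonomy": 8.5}))
```

Writing the criteria down as data, rather than deciding on the day, makes it harder for a marginal result to be argued into a pass.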
Manufacturers’ commissioning testing
This is the first opportunity to witness your actual systems in operation, on your site, and is an ‘in context test’. You will be limited in what you can do on site, but the primary purpose is for the manufacturer to ensure the system is set up correctly so that any issues are found and resolved. It is important to understand what tests are proposed, what proof will be provided, and to decide which, if any, tests should be witnessed.
Systems acceptance testing
“This is when we start the testing stages at the end of the project. This offers the best opportunity to witness your actual systems in operation, on your site, in context. Because of this, it is important to identify dependencies / interactions prior to testing. At this stage it is not about ‘commissioning’, so the expectation should be to pass,” says Potts. He explains that systems acceptance testing is concerned with:
• Development of test scripts in line with the design
• Testing normal operation including peak operation
• Demonstrating failures which should be tolerated within the design
Integrated systems testing
This stage comes once all the systems are on site, commissioned and individually tested. It is all about witnessing the entire system in operation, on site, with all its interactions and dependencies. It will include:
• Simulations of various failures which should be tolerated within the design
• Consequential failures and impacts
• Monitored, logged, and recorded testing
• Observation of normal operation, and of operation under failure, at peak design conditions
“This may include simulation of mains failure, for example, so that you can watch the systems that aren’t UPS backed lose power. You can see the systems that are UPS backed continue to run, as well as witness the generator starting and transfer switches in operation. It enables you to appreciate exactly what will happen in this scenario.
“It is not just about mains failures, however. It may include simulating a fire alarm and seeing how the systems operate. Do any systems shut down? Does any shut-down compromise the system? It is an invaluable opportunity to learn about what is going to be your data centre, from an operational perspective, and what will happen during these events,” comments Potts.
Some of the common challenges encountered relate to the fact that, despite early planning, the bulk of the testing occurs at the end of a project when there is restricted time and additional pressure to get the project completed. “With testing for data centres, the element of the unknown can catch people out,” warns Potts. “You can allow contingency time for the tests to ‘fail’, but how many tests will fail is unknown; you could be unlucky.
“However, you can allow a sensible order of testing so that by the time it has got so far down the line, a common failure is unlikely to happen. It is also important to appreciate all the interactions and dependencies. Most systems in a data centre do not operate on their own.
“If you are looking to prove the UPS’ operation, for example, it needs to be commissioned, set up and ready to accept load, but you also need: controllable step load banks, cooling for the UPS and controlled simulation of the design conditions, cooling for the batteries (and simulation of the environmental conditions), output switchgear, meters and BMS connectivity.
“One area that is often a stumbling block is BMS systems testing. To commission it properly you need all the systems that feed into the BMS to be commissioned so the level of dependency is high.”
He explains that the BMS can be proven alongside all the other testing – proving it on its own would require repeating many of the conditions that will have occurred naturally during previous testing, eg fault status monitoring.
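One way to picture a ‘sensible order of testing’ is as a dependency graph. The sketch below takes the UPS example Potts describes and uses Python’s standard `graphlib` module (Python 3.9 or later) to produce an order in which every prerequisite comes before the test that relies on it. The item names and the `plan_order` and `blocked_by` helpers are hypothetical, chosen for illustration; a real test plan would carry far more detail.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical item names. Each key lists what must be ready before it can run.
PREREQUISITES = {
    "ups_load_test": {
        "ups_commissioned",
        "step_load_banks",
        "ups_cooling",
        "battery_cooling",
        "output_switchgear",
        "meters",
        "bms_connectivity",
    },
    "integrated_systems_test": {"ups_load_test"},
}


def plan_order(prerequisites):
    """Return items ordered so that every prerequisite precedes its dependants."""
    return list(TopologicalSorter(prerequisites).static_order())


def blocked_by(item, prerequisites, ready):
    """List the prerequisites of `item` that are not yet marked as ready."""
    return sorted(set(prerequisites.get(item, ())) - set(ready))


if __name__ == "__main__":
    print(plan_order(PREREQUISITES))
    print(blocked_by("ups_load_test", PREREQUISITES, {"ups_commissioned", "meters"}))
```

`TopologicalSorter` raises a `CycleError` if two items each depend on the other, which is itself a useful early warning that the test plan needs rethinking.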
Risk of testing vs the risk of not testing
At the design stage, consideration needs to be given to how future testing will be achieved to minimise intrusive works – it will involve balancing the ‘risk of testing vs the risk of not testing’.
Full mains failure simulation is recommended on an annual basis, but Sudlows is also seeing an increase in the use of non-invasive and predictive testing approaches – such as thermal imaging, lab analysis of fluids (ie oils, coolants and chilled water), power analysis and review of BMS trend data.
“Ultimately, data centres are not ‘fit and forget’ – initial testing will provide confidence that the systems will operate as intended.
“However, ongoing testing and maintenance will be required to maintain this confidence for many years,” Potts concludes.