As I’m sure you’ve seen in the news, British Airways has been hit with a series of operational problems this summer. The latest setback was last week’s IT outage, which led staff to cancel more than 90 flights across three London airports.
Throngs of would-be holidaymakers gathered at Heathrow, Gatwick and London City, while staff struggled to check them in. Passengers couldn’t check themselves in online either. The check-in system was down, and the systems allowing aircraft to depart were also affected.
BA said the IT outage was not a global issue, but involved two separate systems: one that deals with online check-in and another that deals with flight departures. The airline moved to backup manual systems to keep some flights operating.
The way data is shared between those two systems looks to be part of the problem, given that staff had to fall back on manual systems to keep essential services up and running.
BA’s automation certainly creates a low-friction and personalised experience for the customer, but automation can of course go wrong from time to time, highlighting the need for considered risk management.
What lessons in business continuity are there to take from the BA systems outage, for small and medium sized businesses?
What’s the best way to prepare to respond to an IT outage?
- Mitigate the issues that lead to outages.
- Create a disaster recovery plan to minimise the potential consequences for your business and your customers.
Disaster recovery plans cover a range of areas, including business continuity, people, security, compliance, communication and reputation protection.
I expect that for the majority of SMEs, creating and managing a backup system (which includes servers, software, networking equipment and more) takes up too many resources. In these cases, outsourcing disaster recovery management to off-site IT managed service providers is an effective move.
What are the key elements of a disaster recovery plan for an SME, from the perspective of an IT managed service provider?
The key elements of a disaster recovery plan for an SME
1. Communication during the outage
Coordinating communications effectively across your organisation while the technical response is underway is a challenge. A variety of stakeholders will need to be kept informed and engaged with updates, including representatives from public relations, customer support and legal.
A disaster recovery plan, if written “properly”, will set out your processes for ensuring effective communication during an outage, enabling your team to stay as productive and as informed as possible.
Here are a few classic tips for simplifying your outage communication plan:
- Establish a ‘single source of truth’. Use one channel to carry out essential tasks like collaborating, and updating your team and those outside of it on the progress of the technical response. There’s no toggling between different tools, which allows your team to focus on resolving the issue at hand.
- Know who you need to notify, and how. When a systems outage strikes, the last thing you’ll want to be doing is rummaging for the contact details of the people you need to reach, and then figuring out how to reach them.
- Outline how you’ll deal with any media interest and public updates (hopefully it won’t come to that). Appoint a spokesperson to answer questions and consider how to notify the public about the outage. There are many ways to keep the public informed, for example via your website, social media or customer support team.
- Establish the roles of different individuals in an emergency.
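As a rough illustration of the "know who you need to notify, and how" and "establish roles" tips, here is a minimal Python sketch of a contact roster. Everything in it is an assumption made for the example: the `Contact` fields, the role names, the placeholder names and addresses, and the helper functions are hypothetical, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    name: str      # placeholder names only
    role: str      # the person's job during an outage
    channel: str   # how to reach them, e.g. "phone" or "email"
    address: str   # placeholder contact details


# Hypothetical roster, agreed and written down before any outage happens.
ROSTER = [
    Contact("A. Example", "incident lead", "phone", "+44 0000 000000"),
    Contact("B. Example", "spokesperson", "email", "press@example.com"),
    Contact("C. Example", "customer support", "email", "support@example.com"),
]

# Roles this (assumed) plan says must always be covered.
REQUIRED_ROLES = ["incident lead", "spokesperson", "customer support", "legal"]


def contacts_for(roles, roster=ROSTER):
    """Return the contacts who hold any of the given roles."""
    wanted = set(roles)
    return [c for c in roster if c.role in wanted]


def missing_roles(required=REQUIRED_ROLES, roster=ROSTER):
    """Return required roles that nobody on the roster covers yet."""
    covered = {c.role for c in roster}
    return sorted(set(required) - covered)
```

Running `missing_roles()` on a regular basis is one way to spot gaps (here, nobody is assigned to "legal") before an outage rather than during one.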
2. Compliance
If your primary system goes down, your backup system needs to maintain controls so that your data is backed up and versioned in a way that adheres to regulations. This also prepares your business for an IT audit or governance review.
Maintaining GDPR controls is absolutely key here. Privacy regulations like the GDPR require businesses to re-establish data protection controls, which in some cases include the backup and recovery of data. Because you are legally obliged to demonstrate that you’re processing data in line with the GDPR, the controls you put in place for compliance also strengthen your disaster recovery.
What are the essential steps for SMEs to take to demonstrate compliance?
- Maintain ‘data integrity’. Ensure you’re processing data on a lawful and fair basis, that you’re controlling access to data and that your IT enables access and deletion requests.
- Clarify your data usage and handling policy, and consent policy.
- Protect data with a proactive approach to cyber security.
3. Proactive cyber security
Proactive monitoring and management of your IT systems enables you to reduce the risk of a security breach, outage or malfunction. As a result, there should be less damage to productivity and customer relationships, less senior management time spent dealing with IT issues and more predictable IT costs.
Here’s a summary of the cyber security “essentials” for SMEs:
- Create an IT security policy, and train your team to follow it. Make sure you include email and mobile security.
- Get certified by government-backed schemes, like Cyber Essentials, which benefits compliance and builds trust with your customers.
- Monitor your systems, 24/7. Carry out health checks and diagnostics of your servers and devices.
- Monitor for and detect data breaches and intrusions. Under the GDPR, reportable personal data breaches have to be reported to the relevant supervisory authority within 72 hours of identification.
- Patch security issues and update security controls automatically.
- Flag and prioritise potential problems for response automatically, and gain support from an IT partner who can remediate issues, often in the background without you noticing.
- Use cyber security awareness training so your team can practise and improve their response to cyber threats.
- Keep your devices and software up to date. Regular updates from the manufacturer/developer fix security vulnerabilities that have been discovered.
- Protect yourself from viruses and other malware, using anti-malware measures, whitelisting and sandboxing.
- Secure your internet connection using a firewall. A firewall analyses incoming traffic to decide whether it should be allowed onto your network.
- Choose the most secure settings for your devices and software. Manufacturers often set the default configurations to be open and multi-functional, but those same settings can give cyber attackers opportunities to gain unauthorised access to your data.
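To make the 72-hour breach reporting window from the list above concrete, here is a small Python sketch that works out the reporting deadline from the moment a breach is identified. The function names and the example timestamps are assumptions for illustration only; your own incident process defines what counts as the moment of identification.

```python
from datetime import datetime, timedelta, timezone

# The 72-hour reporting window mentioned above.
REPORTING_WINDOW = timedelta(hours=72)


def report_deadline(identified_at: datetime) -> datetime:
    """Return the latest time by which the breach should be reported."""
    return identified_at + REPORTING_WINDOW


def hours_remaining(identified_at: datetime, now: datetime) -> float:
    """Return the hours left before the deadline, never less than zero."""
    remaining = report_deadline(identified_at) - now
    return max(remaining.total_seconds() / 3600, 0.0)


# Hypothetical example: a breach identified on a Monday morning (UTC).
identified = datetime(2024, 1, 8, 9, 30, tzinfo=timezone.utc)
deadline = report_deadline(identified)  # Thursday 11 January 2024, 09:30 UTC
```

Using timezone-aware timestamps keeps the arithmetic unambiguous if your team and your IT partner work in different time zones.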
4. Why include an IT managed service provider in your disaster recovery plan?
For the majority of SMEs, creating and managing a disaster recovery plan and a business continuity solution demands too many resources.
An IT managed service provider would normally undertake an assessment of your IT to design a disaster recovery plan and business continuity solution with your situation in mind. The nature of the service is such that your plan is managed, updated and regularly tested by the IT managed service provider.
By outsourcing the management of your disaster recovery plan to an IT managed service provider, your team should benefit from less disruption caused by IT outages or malfunctions, better system performance and more time to focus on day-to-day responsibilities that add value to the business. When intervention is needed, your response should be faster and more effective.
A proactive approach to management and monitoring also results in increased predictability of cost.
What should you ask an IT managed service provider to see if their approach to business continuity and disaster recovery will suit you?
Use this handy cheat-sheet to find out if you’d work together effectively. Get the answer to 5 key questions, including:
- How would you, the MSP, help us to solve problems in general?
- How would you, the MSP, guide us with making informed decisions about overall IT service management?
- How would you, the MSP, manage our IT specifically?