Glasgow praised for IT crash response
14 November 2013
A report has blamed a “rare corruption” of an Active Directory database for a major computer crash at NHS Greater Glasgow and Clyde, but said that it is not possible to establish “the exact root cause of the failure.”
The report, which sets out the findings of a review team set up by the health board and the Scottish Government, says the crash between 1 and 3 October led to 709 patients having their hospital appointments postponed.
However, it also says that no data was lost, that the initial implementation of Active Directory was in line with industry best practice, and that – despite its rarity – the health board had included the possibility of a problem with the system in its contingency plans.
It also says that recovery procedures worked well. Indeed, the review, led by the Scottish Government’s chief technology officer, Andy McClintock, praises the health board’s IT team and suppliers for their professional handling of the incident.
Health secretary Alex Neil said: “This review has shown that the technical team took the appropriate actions and did everything possible to restore services under enormous pressure.”
The report notes that the health board is a “large and complex organisation” with more than 40,000 staff working on 30,000 desktops supported by 1,700 servers.
Its Active Directory has accounts for 80% of these desktops, so the software error that occurred in part of the directory on 1 October effectively locked 10,000 users out of their systems until “the early hours” of 3 October.
Despite this, the review says that the design of the system, which was implemented in 2008 with support from Microsoft partner Charteris, was good, and that it had been assessed on a number of occasions.
When the problem arose on 1 October, it was quickly traced to Active Directory, and Microsoft Professional Support staff were called in.
However, the health board’s team, Charteris, and Microsoft found that overcoming it “required a rebuild of existing hardware and further expert intervention by Microsoft subject matter experts.”
The review makes a number of recommendations to be considered by health boards across Scotland. The first four of these all relate to using both Microsoft and third-party back-ups and support services, and testing them regularly.
The report also sets out the steps taken by the health board when the problem occurred. It says the senior incident team was convened, and that the service desk immediately brought in more resources to cope with the large volume of calls it was receiving.
Charteris and Microsoft engineers were called in. They set about trying to recover the ‘partition’ of Active Directory that had failed, while preparing to recover the entire system in case that attempt did not succeed. It eventually failed.
Most of the additional recommendations in the report relate to keeping incident logs and having further resources lined up. However, the report describes the actions taken as “sensible” and “appropriate” and concludes that overall the situation was handled well.
NHS Greater Glasgow and Clyde chief executive Robert Calderwood said his organisation had already acted on the recommendations.
A further review of back-up plans for IT systems across the Scottish NHS is still underway, and will report to ministers by the end of the year.