On July 19, 2024, as much of the world slept, dreaded “blue screens of death” flickered across computer screens in Australia. Whatever was happening, and no one was sure at first, grounded flights, knocked broadcasters off the air, and disrupted businesses.
It was the start of what is considered “the largest IT outage in history,” one that took down 8.5 million Microsoft Windows devices. The cause? A botched software update from cybersecurity technology vendor CrowdStrike.
“The outage caused by CrowdStrike, while not a cyberattack, was arguably the most significant cybersecurity-related event of 2024,” said Michael Tanji, Director of Cybersecurity for MxD.
Each year MxD identifies the top cyber incident, detailing it and the lessons it holds for manufacturers.
“The outage showed the fragility of the critical infrastructure we call ‘the internet,’” Tanji said. “And at a time when we’re trying to improve productivity and quality through automation and interconnectivity, this was a black eye that is likely to set back modernization efforts, and all that that implies for the economy and national security.”
The Disruption
What triggered the outage? An update to CrowdStrike’s Falcon sensor, a cybersecurity product that is embedded in the “kernel” of Windows operating systems. A crucial component, the kernel controls memory, processing, and interactions between a computer’s hardware and software.
“Falcon does something called ‘hooking the kernel,’ which is a nerdy way of saying Falcon is given high-level privileges in the operating system, relative to other types of programs,” Tanji said. “When CrowdStrike pushed out an update to Falcon installations — an update that had a logic flaw in it — it caused systems to crash.”
Flaws in programs happen all the time, and modern, large-scale software update practices should have kept this one from crashing systems, Tanji said. But they didn’t.
“CrowdStrike,” he added, “has since fixed the error and adopted protocols that should preclude this from happening again.”
Lessons to Learn
The outage, which exposed global cybersecurity and supply chain vulnerabilities, offers lessons for manufacturers, Tanji said, such as:
- Don’t use this incident as a reason to halt automation and the adoption of more IT and/or operational technology (OT). “These are mature technologies that, while not perfect, bring more value to manufacturing than any risk they introduce,” Tanji said.
- Talk to your service providers to understand how they deal with updates and security issues. Such conversations are essential to have with both IT and OT service providers.
- Know whom to call if things go wrong and what their incident response protocols are, and evaluate ways to mitigate the risk of outages.
- If you are particularly cautious, consider delaying service providers’ automatic update processes. Manufacturers would still update their systems, of course, but would do so shortly after the automatic update is released, once any early problems have had a chance to surface, minimizing risk.
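For readers who want to see what a delayed-update rule might look like in practice, here is a minimal Python sketch. It is an illustration only, not CrowdStrike’s or any other vendor’s actual mechanism: the `Update` record, the `may_install` function, the 24-hour hold period, and the idea of a small pilot group are all assumptions made for the example, and whether a given product lets you defer updates at all is something to confirm with the service provider.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Update:
    """A vendor update as seen by the plant. All fields are hypothetical."""
    name: str
    released_at: datetime


def may_install(update: Update, now: datetime, hold: timedelta,
                is_pilot_machine: bool = False) -> bool:
    """Return True once the update is old enough to install.

    Pilot machines (a small, non-critical test group) skip the hold so
    that any problems surface there before production systems update.
    """
    if is_pilot_machine:
        return True
    return now - update.released_at >= hold


# Illustrative values only: an update released at 04:00 with a 24-hour hold.
update = Update("example-security-update", datetime(2025, 1, 6, 4, 0))
hold = timedelta(hours=24)

# Two hours after release: production machines wait.
print(may_install(update, datetime(2025, 1, 6, 6, 0), hold))  # False
# A full day after release: production machines may update.
print(may_install(update, datetime(2025, 1, 7, 4, 0), hold))  # True
# The pilot group installs right away.
print(may_install(update, datetime(2025, 1, 6, 6, 0), hold, is_pilot_machine=True))  # True
```

The trade-off is deliberate: a longer hold gives more time for a faulty update to be caught elsewhere, but it also leaves systems without the newest protections for that long, so the hold period should be short and agreed on with the provider.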
The biggest lesson, Tanji said, is that while the CrowdStrike outage was not an attack, manufacturers should not let their cybersecurity guard down.
This year, for the third year in a row, the IBM X-Force Threat Intelligence Index ranked manufacturing as cybercriminals’ top target.
“Given the geopolitical situation, manufacturing, particularly the defense industrial base, is taking center stage,” Tanji said. “But you don’t have to make weapons or ammunition to be considered a target.”
“Small and medium-sized manufacturers are at particular risk,” he added, “because they’re the most poorly defended yet have trust relationships with prime contractors that can be exploited with minimal risk of being detected.”