In the summer of 2024, corporate anti-malware provider CrowdStrike pushed a broken update to millions of PCs and servers running some version of Microsoft’s Windows software, taking down systems that both companies and consumers relied on for air travel, payments, emergency services, and their morning coffee. It was a huge outage, and it caused days and weeks of pain as the world’s permanently beleaguered IT workers brought systems back online, in some cases touching each affected PC individually to remove the bad update.
The outage was ultimately CrowdStrike’s fault, and in the aftermath of the incident, the company promised a long list of process improvements to keep a bad update like that from going out again. But because the outage affected Windows systems, Microsoft often had shared and sometimes even top billing in mainstream news coverage—another in a string of security-related embarrassments that prompted CEO Satya Nadella and other executives to promise that the company would refocus its efforts on improving the security of its products.
The CrowdStrike crash was possible partly because of how anti-malware software works in Windows. Security vendors and their AV products generally have access to the Windows kernel, the cornerstone of the operating system that sits between your hardware and most user applications. Most user applications don’t have kernel access, specifically because a buggy app (or one hijacked by malware) with kernel access can bring the entire system down rather than just affecting the app. The CrowdStrike update was so damaging mostly because it was loaded so early in Windows’ boot process that many systems crashed before they could check for and download CrowdStrike’s fix.