CrowdStrike’s IT outage makes clear why cyber resilience is important


A misconfigured content update that CrowdStrike released late Thursday inadvertently caused global outages on Microsoft Windows systems, knocking many of the world’s most essential services offline.

CrowdStrike was attempting to update the content that its Falcon Sensor uses for real-time threat detection and endpoint protection. The sensor monitors system activity to identify suspicious behavior and prevent cyberattacks. The content update contains logic designed to refine the detection of malicious activity, based on the latest threat intelligence that CrowdStrike collects continuously and in real time.

“This was not a code update. This was actually a content update. And what that means is that there’s a single file that drives some additional logic about how we look for bad actors. And this logic was pushed out and only caused a problem in the Microsoft environment,” CrowdStrike CEO and founder George Kurtz told Jim Cramer during an interview on CNBC earlier today.

The glitch was first noticed in Australia, where Windows machines crashed and displayed the Blue Screen of Death (BSOD). The faulty update took Windows systems down worldwide, affecting dozens of airports, airlines, banking institutions and service companies that rely on Windows-based systems to run their operations. Hundreds of thousands of travelers are stranded at airports around the world. About 2,600 U.S. flights were canceled Friday afternoon and more than 4,200 flights were canceled worldwide, according to FlightAware data reported by the Wall Street Journal.

The consequences of the IT outage also spread across the Microsoft Azure cloud platform. Azure customers complained that they “experience unresponsiveness and boot errors on Windows machines using the CrowdStrike Falcon agent, which affects both on-premises and various cloud platforms.” Azure status shows that the outage is still impacting Azure virtual machines across four regions: the Americas, Europe, Asia Pacific, and the Middle East and Africa.

IT teams are in for a long weekend and a tough July as many cloud-based configurations require individualized updates for each customer using a cloud-based system. Give IT teams a break and, if possible, postpone large-scale projects until the misconfiguration can be resolved.

An outage should be a call to action for greater cyber resilience

The more cyber-resilient a company is, the greater its ability to anticipate, resist and recover from a wide range of adverse conditions, including attacks, intrusions and compromises. It often falls to CISOs to get cyber resilience in order as a core part of their role in senior management and, increasingly, on boards of directors.

“Ultimately, every business has patching cadence challenges. Today is CrowdStrike’s bad day, and it turned out to be a bad day for a lot of people. The fact that CrowdStrike required their end customers to do the work to correct the issues increased the time to respond and time to recover,” Merritt Baer, CISO at Rec and advisor to Expansion, Andesite and EncryptAI, told VentureBeat.

Trustwave CISO Kory Daniels recently said that “boards have started asking the question: is it important to have a formally appointed Chief Resilience Officer?” VentureBeat has learned that more and more boards are adding cyber resilience to their broader risk management project teams. High-profile ransomware attacks that wreak havoc on supply chains are among the most expensive events any company can endure, as the United Healthcare breach makes clear.

Outages caused by misconfigurations underscore the need for a unique form of cyber resilience that is so actively pursued that it becomes a core part of a company’s DNA. Misconfigured updates will continue to cause global disruptions. That comes with the territory of an always-on, real-time world defined by complex, integrated systems. “The scale is significant, but so is the source. For example, Snowflake was due to SaaS misconfigurations and SolarWinds was a Russian-backed supply chain attack. This is old-fashioned security pain,” Baer said.

This week’s global breakdown is what a nation-state attack would look like if a country’s cybersecurity were weak or non-existent. To get an idea of what’s at stake when it comes to national cyber resilience and cyber defense, see the recently released 2024 Annual Threat Assessment of the U.S. Intelligence Community.

When a misconfiguration strikes, a cyber-resilient organization must quickly identify and define the problem, define a solution (ideally one that can be automated at scale), and over-communicate with every customer and person involved. Getting internal cyber resilience in order must be supported by reporting that is accurate, easily accessible to everyone and as close to real time as possible. The goal should be to give everyone involved in updates the opportunity to own the outcome and to know that regression testing and testing on partner platforms are complete.

“Earlier today, CrowdStrike’s Falcon service experienced an unfortunate global outage that affected many customers using the software on Windows systems. The quick action by CrowdStrike’s incident response team to determine the root cause and quickly notify customers is commendable, and their CEO’s blog was honest and clear,” Paul Davis, Field CISO at JFrog, told VentureBeat.

Kurtz continues to post updates on social media platforms X and LinkedIn. In his most recent post on X, he promises to provide a root cause analysis of how the outage occurred.

“In the security world you always have to be prepared for the unexpected and have an incident plan for those surprising events. Perfect software does not exist. After all, software is built by people, and to make mistakes is human. It’s about how quickly you identify the problem and recover from it,” Davis told VentureBeat.

Restore your system

Earlier today, CrowdStrike posted instructions on its site for restoring systems affected by the fault and for finding systems or hosts affected by the misconfigured update.

You must start each affected machine in safe mode first. This step is necessary because the Falcon Sensor software, which needs to be updated, is embedded in a subfolder of the Windows operating system. Booting into safe mode is essential to access this subfolder and perform any necessary updates.

If the affected PC uses BitLocker or other full disk encryption (FDE) software, you will need the recovery key for each machine. CrowdStrike’s blog post details the recommended steps for restoring an affected machine.

Source: CrowdStrike, Falcon Content Update Statement for Windows Hosts, updated 6:11 PM ET, July 19, 2024.
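For teams that want to script the search for the faulty content file (for example, against a disk mounted offline, since Python is not normally available in safe mode), the logic can be sketched roughly as follows. This is an illustration only, not CrowdStrike's tooling. The file pattern and driver folder used as defaults are assumptions taken from widely circulated summaries of the remediation guidance and must be verified against CrowdStrike's current statement before use. The function names are hypothetical, and the script only lists matches unless told otherwise.

```python
import fnmatch
import os
from pathlib import Path
from typing import List

# ASSUMPTION: pattern of the faulty content ("channel") file as widely reported.
# Confirm it against CrowdStrike's own statement before relying on it.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"


def default_driver_dir() -> Path:
    """Return the assumed Falcon Sensor subfolder inside the Windows directory."""
    # ASSUMPTION: folder location as widely reported; verify before use.
    windir = os.environ.get("WINDIR", r"C:\Windows")
    return Path(windir) / "System32" / "drivers" / "CrowdStrike"


def find_channel_files(directory: Path,
                       pattern: str = CHANNEL_FILE_PATTERN) -> List[Path]:
    """List files in `directory` whose names match `pattern`."""
    if not directory.is_dir():
        return []
    return sorted(
        entry for entry in directory.iterdir()
        if entry.is_file() and fnmatch.fnmatch(entry.name, pattern)
    )


def remove_channel_files(directory: Path, dry_run: bool = True) -> List[Path]:
    """Return matching files; delete them only when dry_run is False."""
    matches = find_channel_files(directory)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    target = default_driver_dir()
    found = remove_channel_files(target, dry_run=True)
    if not found:
        print(f"No files matching {CHANNEL_FILE_PATTERN} in {target}")
    for path in found:
        print(f"Would delete: {path}")
```

Keeping the dry run as the default means a mistyped path or pattern produces a list to review rather than a deleted file, which matters when the same script is run across many machines.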

Cyber resilience is a measure of customer trust

“Security vendors must understand that they control customer outcomes. I imagine CrowdStrike won’t push updates in the same way in the future,” Baer told VentureBeat. The global outage continues to disrupt the lives of hundreds of thousands of people and force businesses to a standstill. From small design shops that rely on cloud-based systems to connect with their customers to large-scale enterprises where thousands of colleagues cannot log in, today’s experiences make it clear that cyber resilience is more than a security initiative. It should be a cornerstone of the customer experience.

Earning and maintaining customer trust depends on making a company as cyber-resilient as possible. The outage is a compelling event that every company should treat as a crucible for assessing how well prepared it is for a similar event.

Given the complex integrations and connections between global systems, disruptions will occur in the future. Every company must take responsibility for cyber resilience and choose to excel now, rather than later.