Windows systems throw BSOD due to faulty CrowdStrike update

A couple of hours ago, we faced two significant outages in Microsoft IT infrastructure. Microsoft 365 had an outage due to an Azure configuration change. And a faulty update from cybersecurity vendor CrowdStrike forced Windows hosts into blue screens (BSOD). Enterprises using CrowdStrike Falcon are currently doomed, although there is already a workaround. Worldwide, many airlines, airports, hospitals, etc. were out of order. I'll try to sort out the topic a bit.

First reader reports

I got the first reader reports about 10 hours ago (it was morning in Germany) via email, Facebook and X. Someone wrote to me on X: "Microsoft must be having real problems right now. See the posts on Twitter under Microsoft. BSODs. Probably something to do with Crowdstrike." Other posts say that the infrastructure at the airport in Berlin is currently "down". My inbox is also piling up with comments from readers (thanks for that).

A German article reports that airports, trains, radio stations etc. are affected. n-tv reports that petrol stations and banks are also affected. As of July 19, 2024, 11:00 a.m., I assume that a large part of the global Windows infrastructure (where CrowdStrike is used in companies) is down.

Microsoft reported an MS 365 outage

At that time, things were complex and unsorted, because Microsoft also reported problems with Microsoft 365 apps and services. I had originally assumed these were related to the above glitch. But the CrowdStrike update bug and the Microsoft 365 outage are different events that just happened at the same time.

For Microsoft 365, it was an Azure misconfiguration. I have therefore moved the details to the blog post Worldwide outage of Microsoft 365 (July 19, 2024). At that time it was already clear that it wasn't a cyber attack, because the issue started in Australia and then moved westward.

CrowdStrike update causes BSOD boot loop

In this short German post, I read that a global outage has hit the cybersecurity company CrowdStrike, affecting several government agencies and Australian businesses. On reddit.com there is this thread titled "BSOD error in latest crowdstrike update".

We have widespread reports of BSODs on windows hosts, occurring on multiple sensor versions. Investigating cause. TA will be published shortly. Pinned thread.

SCOPE: EU-1, US-1, US-2 and US-GOV-1

The latest update for the Windows endpoint monitoring software from the cybersecurity company CrowdStrike leads to blue screens on Windows systems. There is probably an entry "Windows crashes related to FalconSensor" in the CrowdStrike support portal (I can't access the details because I don't have a user account).

CrowdStrike Falcon is a widely used endpoint detection and response (EDR) protection software for end devices. During operation, software updates are regularly rolled out by means of so-called channel files. CrowdStrike uses the channel files to distribute dynamic updates and detection rules.

A recently rolled out faulty update, which affects the CrowdStrike Falcon sensors installed on end devices and servers, can lead to crashes under Windows.

Thanks to a reader, I can provide more details as I was able to view the engineering report. CrowdStrike has received feedback about crashes on Windows hosts in connection with the Falcon Sensor. Paraphrased, the report states:

  • Symptoms include (Windows) hosts experiencing an on-screen error related to the Falcon Sensor.
  • Mac or Linux based hosts are not affected by this problem.

The update bug has been fixed by reverting the update: a channel file "C-00000291*.sys" with a timestamp of 05:27 UTC or later has been rolled out. If hosts still crash and are unable to stay online long enough to receive the changed channel files, the following workarounds can be used.

Quick workarounds to get up and running

In the meantime, there are probably workarounds that can be used to make affected Windows systems work again. One workaround described on reddit – as a CrowdStrike Engineering solution – reads:

  1. Boot Windows into Safe Mode or the Windows Recovery Environment
  2. Navigate to the C:\Windows\System32\drivers\CrowdStrike directory
  3. Locate the file matching "C-00000291*.sys", and delete it.
  4. Boot the host normally.
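Steps 2 and 3 above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a Python interpreter is available in the environment you booted into (which is usually not the case in plain WinRE) and that the script runs with administrative rights. The function name `remove_channel_files` and the `dry_run` switch are my own inventions and not part of any CrowdStrike tooling.

```python
from pathlib import Path

# Directory and file pattern as given in the workaround above.
DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"


def remove_channel_files(driver_dir: Path = DRIVER_DIR, dry_run: bool = True) -> list[Path]:
    """Find channel files matching PATTERN and delete them unless dry_run is set.

    Hypothetical helper: returns the list of matching files so the caller
    can log what was (or would have been) removed.
    """
    matches = sorted(driver_dir.glob(PATTERN))
    for path in matches:
        if not dry_run:
            path.unlink()  # delete the matching channel file
    return matches


if __name__ == "__main__":
    # Default is a dry run: only print what would be deleted.
    for found in remove_channel_files():
        print(f"would delete: {found}")
```

The dry run is the default on purpose, so that nothing is deleted until `remove_channel_files(dry_run=False)` is called explicitly.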

However, this workaround does not scale if you need to bring thousands of Windows systems worldwide back to life. And there is a second hurdle: on encrypted systems, booting into WinRE ends at a prompt for the BitLocker recovery key.

German blog reader Lars sent me the following tip in a private message on Facebook, describing how he got his systems up and running again (thanks for that; it was also a solution proposed by the CrowdStrike engineering team):

In the CrowdStrike console, set the sensor version to 11 when deploying. Then restart all Windows servers and Windows clients – this can be done several times.

Lars wrote to me: "We are just getting back on our feet without having to touch every computer worldwide. So far only 1 server that had to be retrieved from the backup." Thanks to Lars for the tip – maybe it will help.

The cloud is also affected

I now also have the CrowdStrike information for the workaround for public clouds or similar environments:

  • Disconnect the operating system disk volume from the affected virtual server.
  • Create a snapshot or backup of the disk volume before proceeding to avoid unintentional changes.
  • Attach the volume to a new virtual server or mount it.
  • Navigate to the directory C:\Windows\System32\drivers\CrowdStrike
    Search for the file with the name "C-00000291*.sys" and delete it.
  • Disconnect the volume from the new virtual server.
  • Reconnect the fixed disk to the affected virtual server.

In principle, this corresponds to the approach outlined above for restoring bare-metal systems. Hopefully, only a few systems will need to be revived.
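Once the volume is attached to a rescue server, it can help to look at which channel files are present before deleting anything. The following Python sketch lists the matching files and flags those older than the reverted version. It rests on several assumptions: that the "timestamp" in the CrowdStrike note means the file's modification time, that 05:27 UTC refers to July 19, 2024, and that the volume is mounted under some drive letter or path of your choosing. The function name `channel_file_report` is hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path

# Assumption: "05:27 UTC" refers to July 19, 2024, the day of the incident.
REVERTED_SINCE = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)


def channel_file_report(
    volume_root: Path, reverted_since: datetime = REVERTED_SINCE
) -> list[tuple[str, datetime, bool]]:
    """List channel files on an attached volume.

    Returns (file name, modification time in UTC, predates_revert) tuples.
    Assumption: the file's modification time is the relevant timestamp.
    """
    driver_dir = volume_root / "Windows" / "System32" / "drivers" / "CrowdStrike"
    report = []
    for path in sorted(driver_dir.glob("C-00000291*.sys")):
        mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        report.append((path.name, mtime, mtime < reverted_since))
    return report
```

A call such as `channel_file_report(Path("F:/"))`, with `F:` standing in for whatever drive letter the attached volume received, returns one entry per matching file. Entries whose last element is `True` predate the revert.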

Microsoft Azure VMs affected

The virtual machines in Microsoft Azure are also affected. Microsoft has published a corresponding message on this Azure status page.

Awareness – Virtual Machines

We have been made aware of an issue impacting Virtual Machines running Windows Client and Windows Server, running the CrowdStrike Falcon agent, which may encounter a bug check (BSOD) and get stuck in a restarting state.

Updated: We approximate impact started as early as 04:09 UTC on the 18th of July, when this update started rolling out.

The incident is a wonderful example of how broken the entire IT ecosystem is in principle. A cloud security provider delivers a faulty update, sending all Windows clients into the digital nowhere. Nothing works at airports; banks and petrol stations close, and so on.

I assume that damage in the billions will be incurred today because the IT infrastructure is down. But to put it another way: we were lucky. Imagine if there had been a supply chain attack on CrowdStrike and, instead of a faulty update, malware had been injected that then loaded kill software or ransomware.

Similar articles:
Worldwide outage of Microsoft 365 (July 19, 2024)
Windows systems throw BSOD due to faulty CrowdStrike update
Why numerous IT systems around the world failed due to two errors on July 19, 2024
CrowdStrike analysis: Why an empty file led to BlueSceen
Review of the CrowdStrike incident, the biggest computer glitch of all time
