This isn’t a gloat post. In fact, I was completely oblivious to this massive outage until I tried to check my bank balance and it wouldn’t log in.
Apparently Visa Paywave, banks, some TV networks, EFTPOS, etc. have gone down. Flights have had to be cancelled as some airlines' systems have also gone down. Gas stations and public transport systems are inoperable, and numerous Windows systems and Microsoft services are affected too. (At least according to one of my local MSM outlets.)
Seems insane to me that one company's botched update could cause so much global disruption and take down so many systems :/ This is exactly why centralisation of services and large corporations gobbling up smaller companies and becoming behemoth services is so dangerous.
Is there a chance that this makes organisations move to Linux?
I guess they would want some cybersecurity software like Crowdstrike in either case? If so, this could probably have happened on any system, as it’s a bug in third party software that crashes the computer.
Not that I know much about this, but if this leads to a push towards Linux, it would only be among companies that already wanted to make the switch but were unwilling because they thought they needed CrowdStrike specifically. This might lead them to consider alternative cybersecurity software.
Windows usage isn’t the cause of dysfunction in corporate IT but a symptom of it. All you would get is badly managed Linux systems compromised by bloated insecure commercial security/management software.
You’d think maybe not being reliant on a 90 billion dollar company to un-fuck security would be a bigger deal than it is.
No, because Windows indoctrination starts with academia.
There will have to be heavy monetary losses before IT is forced to leave their golden goose that keeps them employed with “problems” to “fix” that soak up hours each.
But maybe they will notice the monetary losses, and competitors not using their trash will pull ahead – that will get their attention. Still, they'd need the cognition to understand the problem and select a solution, and the Linux jungle is hard for corporate minds to navigate without smart IT help.
Not really. This isn’t a Windows problem. This is a faulty software problem. People can write faulty software on Linux too.
Is there an easy way to silence every fuckdamn sanctimonious linux cultist from my lemmy experience?
Secondly, this update fucked Linux just as badly as Windows, but keep huffing your own farts. You seem to like it.
I’d unsubscribe from !linux for a start.
I’m pretty sure this update didn’t get pushed to linux endpoints, but sure, linux machines running the CrowdStrike driver are probably vulnerable to panicking on malformed config files. There are a lot of weirdos claiming this is a uniquely Windows issue.
Thanks for the tip, so glad Lemmy makes it easy to block communities.
Also: everyone seems to be claiming it didn't affect Linux, but as part of our corporate cleanup yesterday I had 8 Linux boxes I needed to drive to the office to throw a head on and reset their iDRAC. Sure, maybe they all just happened to fail at the same time, but in my 2 years at this site we've never had more than 1 down at a time, ever, and never for the same reason. I'm not the tech head of the site by any means and it certainly could be unrelated, but people with significantly greater experience than me in my org chalked this up to CrowdStrike.
username… checks out?
Oh, you really have no fucking clue. It's medical, and no treatment has worked for more than a few weeks. It's only a matter of time before I am banned. Now imagine living with that for 4+ decades and being the butt of every thread's joke.
A real shame that can’t be considered medical discrimination.
That sounds exhausting. I hope you find peace, one day.
We’re all going to be so smug.
Hopefully this will be the straw that breaks this dead camel's back.
Microsoft should get buried after this
It’s not a Microsoft problem
It’s a world-depending-on-a-few-large-companies problem
This is exactly why centralisation of services and large corporations gobbling up smaller companies and becoming behemoth services is so dangerous.
It's true, but the other side of the same coin is that with too much solo implementation you lose benefits of economy of scale.
But indeed the world seems like a village today.
you lose benefits of economy of scale.
I think you mean - the shareholders enjoy the profits of scale.
When a company scales up, prices are rarely reduced. Users do get increased community support through shared experiences, especially when official channels are congested during events like today's, but that's about the only benefit the consumer sees.
I love how everyone understands the issue wrong. It's not about being on Windows or Linux; it's about the ecosystem that is commonplace and that people are used to on Windows or Linux. On Windows it's accepted that every stupid anticheat can drop its filthy paws into ring 0, and normies don't mind. Linux has fostered a less clueless community, but ultimately it's a reminder to keep vigilant and strive for pure, well-documented open source with the correct permissions.
BSODs won’t come from userspace software
CrowdStrike does have Linux and Mac versions. Not sure who runs them.
I deployed it for my last employer on our Linux environment. My buddies who still work there said Linux was fine while they had to help the Windows admins fix their hosts.
That’s precisely why I didn’t blame windows in my post, but the windows-consumer mentality of “yeah install with privileges, shove genshin impact into ring 0 why not”
Linux can have the same issue. We have to keep the culture on our side here vigilant and pure near the kernel.
While that is true, it makes sense for antivirus/edr software to run in kernelspace. This is a fuck-up of a giant company that sells very expensive software. I wholeheartedly agree with your sentiment, but I mostly see this as a cautionary tale against putting excessive trust and power in the hands of one organization/company.
Imagine if this was actually malicious instead of the product of incompetence, and the update instead ran ransomware.
If it was malicious it wouldn't have had the reach a trusted platform has. That's what made the xz exploit so scary: the reach of a trusted component combined with malicious intent.
I like open source software, but that's one big benefit of proprietary software. Not all proprietary software is bad. We should recognize the vendors doing their best to avoid anti-consumer practices and genuinely trying to serve their customers' needs to the best of their abilities.
Microsoft should test all its products on its own computers, not on ours. Make an update, test it, and only then push it online.
Microsoft has nothing to do with this. This is entirely on Crowdstrike.
I would be too, except Firefox just started crashing on Wayland all morning D;
New Nvidia driver?
Yes, but I upgraded to 555 at least a week or two ago and it only started crashing a couple of days ago. I think there's an issue with explicit sync:
explicit sync is used, but no acquire point is set
If you Google this you’ll find various bug reports
It isn't even a Linux vs Windows thing but a competent-at-your-job vs don't-know-what-the-fuck-you-are-doing thing. Critical systems are immutable and isolated, or as close to that as reasonably possible. They don't do live updates of third-party software, and certainly not of software that runs privileged and can crash the operating system.
I couldn’t face working in corporate IT with this sort of bullshit going on.
It's also a "don't allow third party proprietary shit into your kernel" issue. If the driver were open source it would actually go through public code review, and the issue would be more likely to get caught. Even if it did slip through, people would publicly have a fix by now with all the eyes on the code. It also wouldn't get pushed to everyone simultaneously under the control of a single company; it would get tested and packaged by distributions before making it to end users.
It’s actually a “test things first and have a proper change control process” thing. Doesn’t matter if it’s open source, closed source scummy bullshit or even coded by God: you always test it first before hitting deploy.
And roll it out in a controlled fashion: 1% of machines, 10%, 25%…no issues? Do the rest.
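Roughly the shape of it, as a toy sketch in Rust (the cohort percentages and the health check are made up, and `Fleet`/`deploy_to` are hypothetical names, not anyone's real deployment API):

```rust
/// Toy sketch of a staged rollout: push to a small cohort, verify
/// health, and only widen the blast radius if nothing regresses.
struct Fleet {
    hosts: Vec<String>,
}

impl Fleet {
    /// Deploy the update to a fraction of hosts and report whether
    /// they all came back healthy. (Stubbed out for the sketch.)
    fn deploy_to(&self, fraction: f64) -> bool {
        let count = ((self.hosts.len() as f64) * fraction).ceil() as usize;
        println!("deploying to {count} hosts ({:.0}%)", fraction * 100.0);
        // ... push the update, then poll agents for crash reports ...
        true // pretend every host came back healthy
    }
}

fn staged_rollout(fleet: &Fleet) -> Result<(), String> {
    // Canary stages: 1%, 10%, 25%, then everyone.
    for stage in [0.01, 0.10, 0.25, 1.00] {
        if !fleet.deploy_to(stage) {
            // Halt instead of bricking the rest of the fleet.
            return Err(format!("stage {:.0}% failed, rollout halted", stage * 100.0));
        }
    }
    Ok(())
}

fn main() {
    let fleet = Fleet { hosts: (0..1000).map(|i| format!("host-{i}")).collect() };
    match staged_rollout(&fleet) {
        Ok(()) => println!("rollout complete"),
        Err(e) => eprintln!("{e}"),
    }
}
```

Even a dumb gate like this would presumably have stopped Friday's file at the 1% stage.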
It seems impossible to me that this didn't get caught by testing.
The implementation/rollout strategy just seems bonkers. I feel bad for all of the field support guys who have had their next few weeks ruined, the sysadmins who won't sleep for 3 days, and all of the innocent businesses that got roped into it.
A couple of local shops are fucked this morning. Kinda shocked they'd be running CrowdStrike, but then these aren't big businesses; they're probably using managed service providers who are now swamped, and who knows when they'll get back online.
One was a bakery. They couldn’t sell all the bread they made this morning.
One shop I was at had a manual process going with cash only purchases.
That blew up when I ordered 3 things and the 'cashier' didn't know how to add them together. They didn't have the calculator on Windows available 🤣
I told them the total and change to give me, but lent them the calculator on my phone so they could verify for themselves 🤣
It's not that clear-cut a problem. There seem to be two elements: the kernel driver had a memory safety bug, and a definitions file was deployed incorrectly, triggering the bug. The kernel driver definitely deserves a lot of scrutiny, and static analysis should have told them this bug existed. The live updates are a bit different, since this is a real-time response system. If malware starts actively exploiting a software vulnerability, they can't wait for distribution maintainers to package their mitigation; it has to be deployed ASAP. They certainly should roll out definitions progressively and monitor for anything anomalous, but it has to be quick or the malware could beat them to it.
This is more a code safety issue than a CI/CD strategy issue. The bug was in the driver all along, but it had never been triggered before, so it passed the tests and got rolled out to everyone. Critical code like this ought to be written in memory-safe languages like Rust.
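To make that concrete, here's a minimal sketch of the idea. The record layout below is invented (CrowdStrike's channel file format isn't public here), but it shows how bounds-checked parsing in Rust turns a malformed definitions file into an error instead of a wild read:

```rust
/// Hypothetical definitions record: 4-byte id, 2-byte length,
/// then `length` bytes of pattern. NOT the real channel file format.
#[derive(Debug)]
struct Definition {
    id: u32,
    pattern: Vec<u8>,
}

fn parse_definitions(data: &[u8]) -> Result<Vec<Definition>, String> {
    let mut defs = Vec::new();
    let mut offset = 0;

    while offset < data.len() {
        // `get` returns None instead of reading past the buffer.
        let header = data
            .get(offset..offset + 6)
            .ok_or("truncated record header")?; // a short/corrupt file fails here
        let id = u32::from_le_bytes(header[0..4].try_into().unwrap());
        let len = u16::from_le_bytes(header[4..6].try_into().unwrap()) as usize;

        let body = data
            .get(offset + 6..offset + 6 + len)
            .ok_or("pattern length exceeds file size")?; // no out-of-bounds read possible

        defs.push(Definition { id, pattern: body.to_vec() });
        offset += 6 + len;
    }
    Ok(defs)
}

fn main() {
    // A truncated file is rejected cleanly instead of crashing.
    match parse_definitions(&[0u8; 5]) {
        Ok(defs) => println!("parsed {} definitions", defs.len()),
        Err(e) => eprintln!("rejected malformed file: {e}"),
    }
}
```

The equivalent C pointer arithmetic would happily read past the end of a truncated or garbage buffer, and in a kernel driver that means a BSOD.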
So it’s Linux vs Windows
No it’s Crowdstrike… we’re just seeing an issue with their Windows software, not their Linux software.
That being said, Microsoft still did hire CrowdStrike and give them the keys to release an update like this.
The end result is still Windows having more issues than Linux.
Huh? Crowdstrike is an antivirus product, you’re only affected if you bought and installed it on your Windows devices. Crowdstrike also had issues with their Linux version a few weeks ago, but that one was thankfully less severe.
I couldn’t face working in corporate IT with this sort of bullshit going on.
I'm taking it you don't work in IT anymore, then?
There are state and government IT departments.
More generally: delegate anything critical to a 3rd party and you've just put your business at the mercy of the quality (or lack thereof) of their business processes, which you do not control. That's especially dangerous in the current era of "cheapest possible" hiring practices.
Having been in IT for almost 3 decades, a lesson I learned long ago, and which I've also been applying to my own things (such as having my own domain for my e-mail address rather than using something like Google), is that you should avoid as much as possible having your mission-critical or hard-to-replace stuff depend on a 3rd party, especially if the dependency is live (i.e. actively connected, rather than just buying and installing their software).
I’ve managed to avoid quite a lot of the recent enshittification exactly because I’ve been playing it safe in this domain for 2 decades.
This is just "what not to do in IT/dev/tech 101" right here. Ever since I've been in the industry (literally decades at this point) I was always told, even in school: never test in production, never roll anything out to production on a Friday, and if you're unsure, have someone senior do a code review. CrowdStrike failed to do all of the above. Even the most junior of junior devs should know better. So for this update to be allowed through… blame the juniors, the seniors, the PMs, the CTOs, everyone. If your shit is so critical that a couple of bad lines of poorly written code (which apparently is what it was) can cripple the majority of the world… yeah, CrowdStrike is done.
It's incredible that an issue of this magnitude wasn't discovered before they shipped it. It's not an issue that happens only in some niche cases; it's happening on all Windows computers!
This could only happen if they didn't test their product at all before releasing to production. Or worse: maybe they did test, got the error, went "eh, it's probably just something wrong with the test systems", and shipped anyway.
This is just stupid.
Can you imagine being the person that hit that button today? Jesus.
Our group got hit with this today. We don’t have a choice. If you want to run Windows, you have to install this software.
It’s why stuff like this is so crippling. Individual organizations within companies have to follow corporate mandates, even if they don’t agree.
CrowdStrike is a cybersecurity company, so some kind of live update is needed. The fault is in not catching the crash before pushing the update.
Even 911 is impacted
In the US, 911 is decentralized, so widespread outages will always affect it in some places. The SolarWinds hack was another example.
Assuming the entire phone system isn't down, there are typically workarounds for CAD (computer-aided dispatch) outages, though they're very shitty to deal with.
That’s potentially life threatening. I wonder if 112 in other countries is affected, it shouldn’t be but at this point I’m afraid it is.
In the Netherlands 112 is fine, most critical systems are. It’s mostly airports that are getting fucked by this it seems.
Banks and PSPs are fine here too.
Me too. Additionally, I use Guix, so if a system update ever broke my machine I could just roll back to a prior system generation (either via the command line or the GRUB menu).
That’s assuming grub doesn’t get broken in the update…
True, then I'd be screwed. But because my system config is declared in a single file (plus a file for channels), I could reinstall my system and be back in business relatively quickly. There's also Guix Home, but I haven't had a chance to try that.
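For anyone curious, the rollback flow on Guix looks roughly like this (the generation number is just an example):

```sh
# list the system generations Guix has kept around
guix system list-generations

# switch back to the previous generation
sudo guix system roll-back

# or jump straight to a specific known-good generation
sudo guix system switch-generation 42

# after a fresh install, rebuild the whole system from the declared config
sudo guix system reconfigure /etc/config.scm
```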
Immutable systems sound like something desperately needed, tbh. It's such an obvious solution that I'm surprised it was invented so late.
I work in hospitality and our systems are completely down. No POS, no card processing, no reservations, we’re completely f’ked.
Our only saving grace is the fact that we are in a remote location and we have power outages frequently. So operating without a POS is semi-normal for us.
I've worked with POS systems my whole career and I still can't help but think "Piece Of Shit" whenever I see it.
It’s also reported in Danish news now: https://www.dr.dk/nyheder/udland/store-it-problemer-flere-steder-i-verden
Dutch media are reporting the same thing: https://nos.nl/l/2529468 (liveblog) https://nos.nl/l/2529464 (Normal article)
I just saw it on the Swedish national broadcaster’s website:
https://www.svt.se/nyheter/snabbkollen/it-storningar-varlden-over-e1l936
What?! No, it must be Kaspersky!
/s