Says a lot about their internal organisation structure for something like this to happen. Intern is the only tolerable excuse here, but even still why would you put a newbie in a position where they could brick thousands of vehicles with a slip of the finger?
I’d expect a tech company like Rivian who happens to sell a vehicle to know better than this 🤦♂️
wrong build with the wrong security certificates was sent out
Isn’t it standard practice to validate signed code before installing it? Hope the next update allows the car’s computer system to check the firmware signature before doing what I assume is an automatic installation…
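For anyone wondering what "check before installing" looks like, here is a minimal sketch. To keep it standard-library only it uses an HMAC tag as a stand-in for a real firmware signature; an actual OTA system would more likely verify an asymmetric signature against a public key stored in the vehicle. The names `verify_firmware` and `install_update` are made up for illustration and have nothing to do with Rivian's actual code.

```python
import hashlib
import hmac


def verify_firmware(image: bytes, tag: bytes, key: bytes) -> bool:
    """Return True only if `tag` authenticates `image` under `key`."""
    expected = hmac.new(key, image, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, tag)


def install_update(image: bytes, tag: bytes, key: bytes) -> str:
    # Refuse before anything is written: a rejected image leaves the
    # currently running firmware untouched.
    if not verify_firmware(image, tag, key):
        return "rejected"
    # Hypothetical: writing the image to an inactive partition would go here.
    return "installed"
```

The point is the ordering: the check runs first, and a build signed with the wrong key never reaches the install step.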
may require physical repair in some cases
Ouch
I don’t follow your line about an intern. I don’t see it in the article, and even if it were the case, an unqualified person being able to do this is on the seniors/leads. Throwing the intern under the bus is what scummy companies do to shift blame - see SolarWinds, where (spoiler) this strategy doesn’t seem to be working out.
It’s more incompetent to allow an intern to fuck up production than it is to have normal developers make mistakes. It shows a complete lack of controls and care.
Yeah not the intern’s fault, the fault of the system that allowed the intern to be able to do it at all
Anyone who builds software that runs on actual hardware should know that you NEVER deploy builds that haven’t been fully exercised on actual hardware.
This tells me that their software QC process is non-existent at best and actually malignant at worst.
If their software is supposed to be their defining feature this is the equivalent of McDonald’s “accidentally” shipping frozen discs of literal shit instead of burger patties to franchises who then serve them to customers without question.
If their company dies because of this (it fucking should imo) they 100% deserve it for the countless unknown dangers they’ve exposed their customers to. It’s not this particular thing, bricking the infotainment system, it’s the demonstration that they have no or bad process.
Ok, calm down. Seems like a bit of an overreaction to link a bad software update for an infotainment system to “countless unknown dangers”
They screwed up, it happens to the best of us. There isn’t a company on the planet that hasn’t made a mistake and rolled out something that is broken.
What’s important here is that they said “yep, we fucked up, we are prioritizing fixing this problem for customers” instead of trying to hide it or blaming the customer for the problem.
If anything, Rivian should be applauded for how they handled it, and if this kind of thing continues to happen, then maybe we get the pitchforks out.
Dollars to donuts their infotainment system shares a CAN bus with nodes that affect control systems. If they can’t handle the easy stuff, what the hell else are they fucking up?
It’s not about the infotainment system, it’s about the culture that leads to this problem.
This company will not end because of this issue. Boeing is still kicking and you can actually count the number of people they’ve killed with shitty software/system integration process.
I’ve spent my career working in embedded systems and embedded test and verification. This issue is not the first or only issue to get by. Maybe they take this like the red hot poker it is and fix their problems, maybe not. I’m not gonna gamble on their products though.
So if I’m understanding this correctly, if anyone ever rolls out a software update that causes a failure like this, it is instantly a sign that the company has a culture that leads to problems? Hard and fast? No exceptions? It’s never just a huge mistake that slipped through the cracks?
As for it being connected to the CAN bus, so what? It isn’t some sort of magical system where if something fails all the rest of the connected systems do too. That’s like saying if the monitor on my computer fails and it’s connected to the rest of my computer via the PCIe lanes on my graphics card, then everything else is going to be affected. It doesn’t work like that.
I don’t even have an opinion on the company I just don’t think it’s the end of times because the wrong build rolled out. They fucked up, they owned up to it and based on the response they will learn from it.
The issue is not just that a bad update went out. Freak accidents can happen. Software is complicated and you can never be 100% sure. The problem is the specifics. A fat finger should never be able to push a bad update to a system in customers’ hands, let alone a system easily capable of killing people in a multitude of ways. I’m not quite as critical as the above commenter, but this is a serious issue that should raise major questions about their culture and procedures.
This isn’t just some website where a fat finger at worst means the site is down for a while (assuming you do the bare minimum and back up your db). This is a vehicle. That’s what they meant about the CAN bus - not that that’s really a concern when the infotainment system just gets bricked, but that they have such lax procedures around software that touches a safety-critical system.
Having systems in place to ensure only tested, known good builds are pushed is pretty damn basic safety practice. Swiss cheese model. If they can’t even handle the basics, what other bad practices do they have?
Again, not that I think this is necessarily as bad as the other person - perhaps this is the only mistake they’ve made in their safety procedures and otherwise they’re industry leaders - we don’t know that yet. But this is extremely concerning and until proven otherwise should be investigated and treated as a very serious safety violation. Safety first.
Thank you for this response. I can agree with this perspective.
My comments were, “hey, let’s be a little more level headed about this” and less “this company should die and heads should roll”.
Interconnected hardware and software systems do affect each other. It’s not magic, it’s physics.
And yes, your graphics card spewing garbage onto the PCI bus can affect the rest of your system.
It actually does work like that.
If I have offended you, that wasn’t my intent. You seem defensive about what I said but I wasn’t trying to upset you.
I said broken monitors don’t necessarily affect the rest of the system. Just like, you know, broken infotainment systems don’t necessarily affect the rest of the car. It can happen sometimes; it doesn’t seem to have happened this time. So yes, what you are implying is that magic happened when it clearly didn’t, and to sit here and say it will definitely affect other systems is misleading.
People make mistakes, it’s unavoidable, but the fact that they are willing to admit it was their fault shows an attitude of learning and growth and is a welcome change from the norm, where companies sweep it under the rug and it costs people their lives.
Will they grow to a point where they are too big to give a shit? Probably. At least for now they are being open and honest instead of blaming the user or a third party.
We don’t live in a vacuum, the world isn’t black and white. Come live in the grey and cut people some slack.
Interns do get, but should not get, the level of write access that makes a durable change impacting all customers. Deadlock a server or even wipe SQL tables: that is an outage. Break a customer’s configuration or send the wrong client’s paperwork: again, a small-scale problem you can deal with. Interns don’t change company policy.
I think it’s a more foundational architecture question: why do you push builds to all customers at once without gating it by SOMETHING that positively confirms the exact OTA update package has been validated? The absolute simplest thing I can think of is pushing to 1 random car and waiting for the post-install self tests to pass before pushing to everyone else. Maybe there’s actually no release automation?? But then you make it safe a different way. It’s just defensive coding practice. I don’t even have a CS degree, but I learned on the job that something always breaks, so you generally account for the expectation that everything will fail by building in a fail-safe, just so the failure is not spectacular. Nothing fancy, just enough mitigation to keep the fuck-up from eating into your weekend if it happens on a Friday.
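The "1 random car first" idea can be sketched in a few lines. This is only an illustration under assumptions: `push` and `self_test_passed` are hypothetical callbacks standing in for whatever the real fleet tooling provides, and a real rollout would likely use several stages rather than one canary.

```python
import random
from typing import Callable, List, Optional, Sequence


def staged_rollout(
    fleet: Sequence[str],
    push: Callable[[str], None],
    self_test_passed: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Push to one random vehicle, then to the rest only if its self-test passes.

    Returns the vehicles that actually received the update.
    """
    if not fleet:
        return []
    rng = rng or random.Random()
    canary = rng.choice(list(fleet))
    push(canary)
    updated = [canary]
    if not self_test_passed(canary):
        # Halt here: the damage is limited to a single vehicle.
        return updated
    for vehicle in fleet:
        if vehicle != canary:
            push(vehicle)
            updated.append(vehicle)
    return updated
```

With a bad build, the function returns a list of length one instead of the whole fleet, which is the entire point of the gate.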
may require physical repair in some cases
Ouch
https://www.rivianownersforum.com/threads/rear-ended-rivian-gets-42-000-repair-bill.5445/
Ouch indeed.
Integration tests! Do you speak it!?
Integration tests don’t really help if you just push the wrong build to production.
Although the pipeline should probably not accept builds that haven’t passed all tests.
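A rough sketch of such a gate, assuming the pipeline can record test results against the exact bytes of a build: the release step looks the artifact up by its SHA-256 digest and refuses anything without a passing run on record. `ReleaseGate` and its methods are hypothetical names, not any real CI tool's API.

```python
import hashlib


class ReleaseGate:
    """Allows a release only for artifacts with a recorded passing test run."""

    def __init__(self) -> None:
        self._validated = set()

    @staticmethod
    def digest(artifact: bytes) -> str:
        return hashlib.sha256(artifact).hexdigest()

    def record_test_run(self, artifact: bytes, all_passed: bool) -> None:
        # Only fully passing runs make a build eligible for release.
        if all_passed:
            self._validated.add(self.digest(artifact))

    def release(self, artifact: bytes) -> str:
        d = self.digest(artifact)
        if d not in self._validated:
            raise PermissionError("build has no passing test run on record")
        return d
```

Because the lookup is keyed on the digest of the artifact itself, picking the wrong file at release time fails the same way as a build with failing tests.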
Integration tests don’t really help if you just push the wrong build to production.
This is like designing a deadbolt that tells you that the key doesn’t work, but it allows you to open the door anyway. Why would anyone have a process in place where you can push to production with failing integration tests?
I actually work in automotive testing, and the honest truth is that there likely is no real automated pipeline.
Automotive software testing is much more complex than simple software unit or integration tests. You need to run on actual hardware, accompanied by all the other ECUs you are interfacing with. And the tools that allow you to do so are specialized tools, which often are not yet integrated into CI/CD processes (they’re pretty much all working on it though). I.e. getting test results for a build involves manual labor, which makes it prone to errors.
You don’t think this is a smell? Surely your company can improve on this.
I do, and we are actively working on it.
I didn’t even know Rivian had made an actual vehicle until I saw one last month. I thought they were still vaporware. Wasn’t there another truck company that was supposed to have made a truck like 6 years ago? Did they ever do that?
Rivian is out and about. They even finished their 10k Amazon van order along with the trucks and SUVs.
The truck brand that’s still vaporware may be Tesla (ha) but you probably mean the Fisker Alaska, which looks amazing if it ever exists. They actually went bankrupt a while ago but came back and seem to be solvent.
The Lordstown Endurance was an electric pickup truck. It was to be built at a former GM factory. They managed to produce 31 of the trucks, 19 of which had to be recalled. The company failed, but the factory is owned by Foxconn and is planned to be used to manufacture Fiskers in the future.
Fisker has been producing electric cars. They’ve made around 5,000 at this point. The Alaska is a planned future vehicle. Unlikely to happen. The auto industry is notoriously difficult for new manufacturers. Tesla is one of the few new car companies to see widespread success, and they are the exception.
Well it’s certainly better looking than the Rivian, which is one of the ugliest production vehicles I’ve ever seen.
lol yeah it’s definitely Tesla! The Alaska looks amazing. I actually hadn’t seen that one but it definitely looks awesome. I looked it up and I think it was the Nikola Badger. The name doesn’t seem familiar but the look of it seems like what I was thinking.
Honestly I feel there have been a lot of EV startups that never deliver. I do think the Rivian truck looks pretty nice as well.
This is the best summary I could come up with:
The more innovation-minded people in the auto industry have heralded the advent of the software-defined car.
It’s been spun as a big benefit for consumers, too—witness the excitement among Tesla owners when that company adds a new video game or childish noise to see why the rest of the industry joined the hype train.
The EV startup, which makes well-regarded pickup trucks and SUVs, as well as delivery vans for Amazon, pushed out a new over-the-air software update on Monday.
But all is not well with 2023.42; the update stalls before it completes installing, taking out both infotainment and main instrument display screens.
Update 1 (11/13, 10:45 PM PT): The issue impacts the infotainment system.
A vehicle reset or sleep cycle will not solve the issue.
The original article contains 304 words, the summary contains 126 words. Saved 59%. I’m a bot and I’m open source!