this post was submitted on 26 Jul 2024
988 points (99.5% liked)

[–] MudMan@fedia.io 171 points 4 months ago (6 children)

I have a 13 series chip, it had some reproducible crashing issues that so far have subsided by downclocking it. It is in the window they've shared for the oxidation issue. At this point there's no reliable way of knowing to what degree I'm affected, by what type of issue, whether I should wait for the upcoming patch or reach out to see if they'll replace it.

I am not happy about it.

Obviously next time I'd go AMD, just on principle, but this isn't the 90s anymore. I could do a drop-in replacement to another Intel chip, but switching platforms is a very expensive move these days. This isn't just a bad CPU issue, this could lead to having to swap out two multi-hundred-dollar components, on what should have been a solidly future-proof setup for at least five or six years.

I am VERY not happy about it.

[–] henfredemars@infosec.pub 77 points 4 months ago (1 children)

I’m angry on your behalf. If you have to downclock the part so that it works, then you’ve been scammed. It’s fraud to sell a part as a higher performing part when it can’t deliver that performance.

[–] MudMan@fedia.io 53 points 4 months ago* (last edited 4 months ago) (2 children)

So here's the thing about that, the real performance I lose is... not negligible, but somewhere between 0 and 10% in most scenarios, and I went pretty hard keeping the power limits low. Once I set it up this way, realizing just how much power and heat I'm saving for the last few drops of performance made me angrier than having to do this. The dumb performance race with all the built-in overclocking has led to these insanely power hungry parts that are super sensitive to small defects and require super aggressive cooling solutions.

I would have been fine with a part rated for 150W instead of 250 that worked fine with an air cooler. I could have chosen whether to push it. But instead here we are, with extremely expensive motherboards massaging those electrons into a firehose automatically and turning my computer into a space heater for the sake of bragging about shaving half a millisecond per frame on CounterStrike. It's absurd.
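To put rough numbers on that trade-off, here is a minimal sketch in Python. Everything in it is an assumption for illustration: 250 W and 150 W are the two wattages mentioned above, 90% retained performance is the pessimistic end of the 0 to 10% loss range, and the function names are hypothetical. None of these are measurements.

```python
def power_saved_fraction(stock_watts: float, limited_watts: float) -> float:
    """Fraction of package power saved by running at the lower limit."""
    return 1.0 - limited_watts / stock_watts


def efficiency_gain(stock_watts: float, limited_watts: float,
                    perf_retained: float) -> float:
    """Ratio of performance-per-watt at the lower limit to stock.

    perf_retained is the share of stock performance kept after
    limiting power (0.90 means a 10% loss).
    """
    stock_perf_per_watt = 1.0 / stock_watts
    limited_perf_per_watt = perf_retained / limited_watts
    return limited_perf_per_watt / stock_perf_per_watt


if __name__ == "__main__":
    # Assumed figures, not measurements: 250 W stock, 150 W limited,
    # 10% performance loss (worst case of the range quoted above).
    saved = power_saved_fraction(250.0, 150.0)
    gain = efficiency_gain(250.0, 150.0, 0.90)
    print(f"Power saved: {saved:.0%}")
    print(f"Performance per watt vs stock: {gain:.2f}x")
```

Under those assumptions, giving up a tenth of the performance cuts power by 40% and yields about 1.5 times the work per watt, which is the "last few drops" complaint in numeric form.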

None of which changes that I got sold a bum part, Intel is fairly obviously trying to weasel out of the obviously needed recall and warranty extension and I'm suddenly on the hook for close to a grand in superfluous hardware next time I want to upgrade because my futureproof parts are apparently made of rust and happy thoughts.

[–] tal@lemmy.today 8 points 4 months ago* (last edited 4 months ago) (1 children)

150W instead of 250

Yeah, when I saw that the CPU could pull 250W, I initially thought that it was a misprint in the spec sheet. That is kind of a nutty number. I have a space heater that can run on low at 400W, which is getting into that range, and you can get very low-power space heaters that consume less power than the TDP on that processor. That's an awful lot of heat to be putting into an incredibly small, fragile part.

That being said, I don't believe that Intel intentionally passed the initial QA for the 13th generation thinking that there were problems. They probably thought there was a healthy safety margin. You can certainly blame them for insufficient QA or for how they handled the problem as the issue was ongoing, though.

And you could also have said "this is absurd" at many times in the past when other performance barriers came up. I remember -- a long time ago now -- when the idea of processors that needed active cooling or they would destroy themselves seemed rather alarming and fragile. I mean, fans do fail. Processors capable of at least shutting down on overheat to avoid destroying themselves, or later throttling themselves, didn't come along until much later. But if we'd stopped with passive heatsink cooling, we'd be using far slower systems (though probably a lot quieter!)

[–] MudMan@fedia.io 3 points 4 months ago

You're not wrong, but "we've been winging it for decades" is not necessarily a good defense here.

That said, I do think they looked at their performance numbers and made a conscious choice to lean into feeding these more power and running them hotter. Whether the impact would be lower with more conservative power specs is debatable, but as you say there are other reasons why trying to fake generational leaps by making CPUs capable of fusing helium is not a great idea.

[–] Nighed@sffa.community 1 points 4 months ago (1 children)

Could you not have just bought a lower power chip then?

Or does that lose you cores?

[–] MudMan@fedia.io 3 points 4 months ago

Oh, I absolutely could have. It would lose a couple of cores, but the 13th gen is pretty linear, it would have performed more or less the same.

Thing is, I couldn't have known that then, could I? Chip reviews aren't aiming at normalizing for temps, everybody is reviewing for moar pahwah. So is there a way for me to know that gimping this chip to run silently basically gets me a slightly overclocked 13600K? Not really. Do I know, even at this point, that getting a 13600K wouldn't deliver the same performance but require my fans to be back to sounding noticeable? I don't know that.

Because the actual performance of these is not to a reliable spec other than "run flat out and see how much heat your thermal solution can soak" there is no good way to evaluate these for applications that aren't just that without buying them and checking. Maybe I could have saved a hundred bucks. Maybe not. Who knows?

This is less of a problem if you buy laptops, but for casual DIY I frankly find the current status quo absurd.

[–] brickfrog@lemmy.dbzer0.com 48 points 4 months ago* (last edited 4 months ago) (2 children)

I have a 13 series chip, it had some reproducible crashing issues that so far have subsided by downclocking it.

From the article:

the company confirmed a patch is coming in mid-August that should address the “root cause” of exposure to elevated voltage. But if your 13th or 14th Gen Intel Core processor is already crashing, that patch apparently won’t fix it.

Citing unnamed sources, Tom’s Hardware reports that any degradation of the processor is irreversible, and an Intel spokesperson did not deny that when we asked.

If your CPU is already crashing then that's it, game over. The upcoming patch cannot fix it. You've got to figure out if you can do a warranty replacement or continue to live with workarounds like you're doing now.

Their retail boxed CPUs usually have a 3(?) year warranty so for a 13th gen CPU you may be midway or at the tail end of that warranty period. If it's OEM, etc. it could be a 1 year warranty aka Intel isn't doing anything about it unless a class action suit forces them :/

The whole situation sucks and honestly seems a bit crazy that Intel hasn't already issued a recall or dealt with this earlier.

[–] 1rre@discuss.tchncs.de 23 points 4 months ago

If you're in the UK or, I expect, the EU, and the failure is due to oxidation, I imagine you can get it replaced even on an expired warranty: it's a defect that was known to either you or Intel before the warranty expired, and a manufacturing defect rather than breakage from use, so Intel are pretty much in a corner about having sold you faulty shit

[–] MudMan@fedia.io -1 points 4 months ago

The article is... not wrong, but oversimplifying. There seem to be multiple faults at play here: some would continue to degrade, others would prevent you from recovering some performance threshold, but may be prevented from further damage, others may be solved. Yes, degradation of the chip may be irreversible, if it's due to the oxidation problem or due to the incorrect voltages having caused damage, but presumably in some cases the chip would continue to run stably and not degrade further with the microcode fixes.

But yes, agreed, the situation sucks and Intel should be out there disclosing a range of affected chips by at least the confirmed physical defect and allowing a streamlined recall of affected devices, not saying "start an RMA process and we'll look into it".

[–] tal@lemmy.today 13 points 4 months ago* (last edited 4 months ago) (2 children)

They do say that you can contact Intel customer support if you have an affected CPU, and that they're replacing CPUs that have been actually damaged. I don't know -- and Intel may not know -- what information or proof you need, but my guess is that it's good odds that you can get a replacement CPU. So there probably is some level of recourse.

Now, obviously that's still a bad situation. You're out the time that you didn't have a stable system, out the effort you put into diagnosing it, maybe have losses from system downtime (like, I took an out-of-state trip expecting to be able to access my system remotely and had it hang due to the CPU damage at one point), maybe out data you lost from corruption, maybe out money you spent trying to fix the problem (like, on other parts).

But I'd guess that specifically for the CPU, if it's clearly damaged, you have good odds of being able to at least get a non-damaged replacement CPU at some point without needing to buy it. It may not perform as well as the generation had initially been benchmarked at. But it should be stable.

[–] MudMan@fedia.io 1 points 4 months ago

"Clearly damaged" is an interesting problem. The CPU would crash 100% of the time on the default settings for the motherboard, but if you remember, they issued a patch already.

I patched. And guess what, with the new Intel Defaults it doesn't crash anymore. But it suddenly runs very hot instead. Like, weird hot. On a liquid cooling system it's thermal throttling when before it wouldn't come even close. Won't crash, though.

So is it human error? Did I incorrectly mount my cooling? I'd say probably not, considering it ran cool enough pre-patch until it became unstable and it runs cool enough now with a manual downclock. But is that enough for Intel to issue a replacement if the system isn't unstable? More importantly, do I want to have that fight with them now or to wait and see if their upcoming patch, which allegedly will fix whatever incorrect voltage requests the CPU is making, fixes the overheating issue? Because I work on this thing, I can't just chuck it in a box, send it to Intel and wait. I need to be up and running immediately.

So yeah, it sucks either way, but it would suck a lot less if Intel was willing to flag a range of CPUs as being eligible for a recall.

As I see it right now, the order of operations is to wait for the upcoming patch, retest the default settings after the patch and if the behavior seems incorrect contact Intel for a replacement. I just wish they would make it clearer what that process is going to be and who is eligible for one.

[–] RootBeerGuy@discuss.tchncs.de 1 points 4 months ago (1 children)

Replacing it with what, exactly? Another 13th/14th gen chip?

[–] tal@lemmy.today 1 points 4 months ago* (last edited 4 months ago)

Yeah. They can't replace it with their upcoming 15th gen, because that uses a new, incompatible socket. They'd apparently been handing replacement CPUs out to large customers to replace failed processors, according to one of Steve Burke's past videos on the subject.

On a motherboard that has the microcode update which they're theoretically supposed to get out in a month or so, the processors should at least refrain from destroying themselves, though I expect that they'll probably run with some degree of degraded performance from the update.

Just guessing, not anything Burke said, but if there's enough demand for replacement CPUs, might also be possible that they'll do another 14th gen production run, maybe fixing the oxidation issue this time, so that the processors could work as intended.

[–] blackwateropeth@lemmy.world 7 points 4 months ago (1 children)

Went 13th -> 14th very early in both launch cycles because of chronic crashing. After swapping mobo, RAM and SSDs, I finally switched to AMD and my build from late 2022 is FINALLY stable. Wendell’s video was the catalyst to jump ship. I thought I was going crazy, but yea… it was Intel

[–] MudMan@fedia.io 4 points 4 months ago (1 children)

Whoa, that's even worse. It's not just the uncertainty of knowing whether Intel will replace your hardware or the cost of jumping ship next time. Intel straight up owes you money. That sucks.

[–] blackwateropeth@lemmy.world 4 points 4 months ago

Yea, my crashes were either watchdog BSODs, or nvlddmkm (nvidia). So diagnosing the issue was super difficult, the CPU is the last thing I think of unless there’s some evidence of it failing :).

I also got to experience a cable mod adapter burn a 4090.. Zotac replaced the card thank god. I’m a walking billboard of what went wrong in the last 2 years with components lol.

Anyhow I hope we all get a refund. My PC is my main hobby so having instability caused a ton of frustration and anguish.

[–] gravitas_deficiency@sh.itjust.works 2 points 4 months ago (1 children)

When did you buy it? Depending on the credit card you have, they will sometimes extend any manufacturer warranty by a year or two. Might be worth checking.

[–] MisterFrog@lemmy.world 1 points 4 months ago (1 children)

Are there like no consumer guarantees in the US? How is this not an open-and-shut case where the manufacturer needs to replace or refund the product?

[–] gravitas_deficiency@sh.itjust.works 2 points 4 months ago (1 children)

Nope, pretty much none in most cases. Though this is probably going to devolve into a giant class-action, because it is pretty egregious… so affected people will get something like $6.71 and the lawyers will walk away with a couple billion or whatever.

[–] MisterFrog@lemmy.world 1 points 4 months ago

Neat 👍 that sucks