Gadgetory


Full Buildzoid Conversation on EVGA VRMs & Dying Cards

2016-11-28
I'm gonna say it's the capacitors, a bad batch of them. The one consistent thing that's starting to show up is that you have scorch marks. On this latest one, you have the scorch mark coming right off the capacitor. There was a really early one where somebody had a blown-out capacitor, just completely fried, in pieces. That was the first report, and there I was like, oh no, that's just a manufacturing defect, that's not a thermal issue.

Right.

So on this most recent one as well, the capacitor, well, it's not blown to bits, but you can see that the solder on one side of it just got shot right off. So that's another capacitor failure. And basically all of the other ones are varyingly more severe. Capacitor failures could cause those exact types of damage. And then along with the fact that EVGA is claiming that there isn't a thermal issue, that would line up with why there isn't a thermal issue: you have a bad batch of capacitors. I read through the capacitor failure stuff, which I gave you a link to, and basically all of it is gonna lead to short-circuit, explosion sort of scenarios, where the capacitor is partially damaged, over time it gets worse, at some point it's gonna short out, and it's gonna blow up. And since this isn't really temperature bound, that explains why it blows up at idle, why it blows up without the capacitor actually being properly powered or anything, you know, the one where it blows up from the PCIe slot right at startup. Especially considering that the thermals, even without the thermal pads, for most people won't be that bad. Considering the temperatures you got from the testing, it's fine, that's normal for a VRM. A hundred degrees here, a hundred degrees there is fine. So basically I'm almost certain that EVGA just has a bad
batch of capacitors. On some of the cards, hopefully not all of them, that would be a disaster. But I think they probably just have a bad batch of capacitors and they're just going up in flames. And now it's really up to EVGA to state publicly that that's the case, reaffirm it or disprove it, because they obviously don't have a thermal issue with the MOSFETs.

Yeah, and we should talk about that too, because in the testing I was doing, the hottest I could get them was a really worst-case scenario: dumping CPU radiator heat, with Prime95 running, straight into the VRM fan from about three inches away, so ambient is 40-ish, 41 Celsius, as far as the VRM or the GPU is concerned. With overclocking, with overvolting, with the old VBIOS and with no thermal pads, we're hitting 126 Celsius on the backplate and then about a hundred on MOSFETs number seven and number two, which we've seen are the hot spots. And seven is actually running hotter than number two. Number two is the one at the bottom, seven is the middle one.

Yeah, and we've seen a bunch of them where it blows up on the bottom one.

Yeah. Well, no, because the power... I was just wondering about the PCIe slot failure, whether the wiring was lining up with that. Right, but now it is.

So I think the thing to point out for folks is, basically be prepared that some cards are gonna blow up. That's what's gonna happen. Hopefully it won't be too many of them, and it won't be because of thermals, as far as we can tell.

Yeah. I think thermals will cause the issue to show up faster. They'll cause the downward spiral of the capacitor to speed up a little bit, but it's not like it's gonna prevent it if you get the thermal pads or something. It's just gonna blow up anyway.

Right. So I think the one thing to point out, though, is that people see this 100
Celsius, 105 Celsius or whatever on the VRM and just immediately think, holy crap, that's really hot. But it's not the same as a GPU or a CPU, where 100 Celsius is Tj max or greater than Tj max and you have a thermal shutdown.

Yeah, no. Tj max for MOSFETs is 150. The industry standard is 150 Tj max. And the casing, I mean, really high-end stuff is rated at 125, like they're rated to work at 125. There's MOSFETs, actually, if I look at, say, a 6894, which is really popular on AMD cards, that has a temperature rating which assumes, okay, this is a fun one: ambient temperature 70 degrees, no airflow except for the convection of the MOSFET itself, no heatsink, nothing. 70 degrees ambient. I mean, the rating is crap, but it's a rating. They tell you, yeah, you can run it at that, just don't overload it.

So really it's a case of, yeah, you can run VRMs really, really hot. The problem is that it's a combination of the current and the temperature, because the heat output of a VRM grows as it gets warmer, so its efficiency decreases. And the efficiency is basically bound to the junction temperature, which is the internal temperature of the MOSFET, not the external temperature. I'm gonna be making a video about how VRMs fail sometime soon, so I might as well go over this.

In a MOSFET you have your silicon, well, it's not necessarily silicon, but you have a semiconductor, which is the actual switching component. Then you have all of the casing around it, which is the wiring hooked up to the silicon and the ceramic that covers it and protects it from damage. And you have a temperature rating, where the current specs for the MOSFET are set up so that if the case of the MOSFET is at a hundred degrees and you put this much power through it, the MOSFET is going
to put out X amount of heat, which will lead to the internals being at a hundred and fifty, which is perfectly safe, and that's a good current rating. If you exceed that 150 degree junction temperature, then the problem is that the resistance of the MOSFET is gonna go up, the power dissipation because of that is gonna go up as well, and that's gonna further increase internal temperatures. Basically you get a thermal runaway scenario, where the extra heating causes higher resistance, which causes extra heating, which eventually leads to it getting so hot that it blows itself to bits.

And that shows up as a decrease in efficiency. If you've ever seen power supply reviews from JonnyGuru, on some of the ones that blow up he always notes that you see a massive drop in efficiency. The efficiency starts falling, falling, falling, and then the power supply eventually explodes. The same happens to VRMs. You basically see that your efficiency falls off a cliff because the internals are overheating. I actually said that the best way to figure out if the VRM is about to fail is to monitor the efficiency: if it suddenly starts going downhill, turn it off. So the scenario for actually getting the VRM to blow up is that it's gonna just overheat itself really, really quickly at some point.

Right. I think the question then is, now that we've done the thermal testing and shown that the thermal pads and the VBIOS are obviously beneficial, and if you own one of these cards there's really no reason not to at least do the VBIOS, and the only reason to not do the thermal pads is fear that you're gonna damage it, or laziness, or maybe you're not in a country they ship to. So definitely do those. But with thermals not really being a source of concern, the question I think becomes: if I owned one of these cards, should I be
worried about damage to other components?

Yeah. At this point we can't actually say the cards will or won't fail, because it seems to be just a manufacturing defect, and it's gonna be random how many cards it shows up on. So it's really up to EVGA to figure out what batch of cards they have that got screwed up, and then basically say, this set of serial numbers is bad, and then do a recall or something. Hopefully they'll figure this out sometime soon and make a statement about it. It's not thermals at this point, so they're right about claiming that the thermals are fine, because they are.

As for damaging other components: if you have a bad power supply and you get a short circuit from a capacitor going bad, that power supply is probably gonna go with it. I actually had, you know, not a great power supply but a decent unit, where it didn't kill anything else except itself, but it was like, 12 volts short-circuited, power supply dies. So basically just make sure you have a good power supply if you have an FTW. But I'd hope most people have that anyway. The card is expensive enough that they should, in general.

Yeah. Motherboards, should we be concerned about those?

I mean, there's been a few. I think there was one case where he said he had scorch marks that reached all the way to the motherboard, but I think that's gonna be superficial, just bits of the card ending up on the motherboard, not actual damage. Apparently somebody had one where it cascaded, where the power supply went out and the motherboard died with it, and basically everything in the computer got toasted. But that's, again, the power supply failing. The power supply is supposed to stop that, that's the whole point. If you have a short circuit in the system, the power supply should detect that and shut down, which would protect all your components.
If it doesn't shut down fast enough, the issue is that the regulation of the power supply is going to go out of spec, and it's basically gonna do what the GPU does to the rest of the system, which sucks very badly.

So I think the conclusion here is that the thermals largely were not a red herring, because it was an oversight to not have thermal pads on there and to run the fan speed that they did. The temperatures seemed okay, but they could definitely be better. That wasn't the heart of the issue, though. The heart of the issue seems like it's something that we can't necessarily figure out, not easily, and it's probably, I would think at this point, a manufacturing defect of some kind.

Yeah, and it might not even be a manufacturing defect on EVGA's side, if my capacitor suspicion is correct.

Right.

So I guess if you have one of these cards, I don't really know what to suggest. I would just say don't run it when you're not in the same room as the computer, because that just exacerbates it. The nightmare scenario, let's think: you have some horrific 500 watt power supply, which will power an FTW just fine. You have an FTW which has the defect that's causing the VRM to blow up. The VRM blows up, you get a short circuit, the power supply goes out, and I have heard of power supplies where they go out and they melt their own casing because of how terrible they are. They just burn themselves out. Worst-case scenario, it's gonna burn your house down. So just don't run it when you're not near it, so that you can catch it. But I've heard very few cases of this happening, power supplies failing this badly, and even then it's really, really rare. And you know, it's not
like a cheap power supply. We are talking garbage, absolute garbage. This is below the worst power supplies, it's not even Diablotek levels of bad. Diablotek is considered a terrible power supply company. These are companies that make power supplies where the design is barely capable of half the power they're rated for, and that never saw a testing scenario. They cost like 20 bucks. They don't even necessarily have PCIe connectors, because they're that outdated. For most people I don't think this is really a concern. Basically, if your card goes, contact EVGA, get a replacement or a refund and buy something else. And if you haven't done the thermal pad or VBIOS mods, you might as well do them, because it is better, and who knows, maybe it prolongs the life of the device a little bit, unless you have a defective card. If you have a defective card, I'm pretty sure it'll just fail anyway.

Yeah, and you're not gonna find any warning signs. One of the ones we saw just failed on the Windows desktop, allegedly, according to the user.

I actually think that if it is the capacitors shorting out like that, then it's totally reasonable. The defects are all basically that there is a crack or some other kind of failure in the capacitor, which causes it to slowly get more and more short-circuited. Capacitors aren't perfect insulators, they leak a little bit of current, and if they're bad then they'll leak more and more current, until eventually the amount of current that's leaking through them just completely shorts out the capacitor, and then it blows up. So basically, as long as the card is running, that can happen eventually. I mean, we have the thermal images from Tom's Hardware and everything, and you can
actually see the capacitors, because they're colder than everything else on the back of the card. So their card is obviously fine. If you had a thermal camera pointed at the PCB as it's about to die, you would see the capacitor be significantly warmer than everything else, and then once it sets off thermal runaway you'd see it spike really quickly. We've heard people basically get flashes of white light, and I did tell you that to get a red glow you need 525 Celsius. To get a flash of white light you're gonna need like a couple thousand degrees. It's basically gonna arc weld itself at that point. You have 12 volts shorted straight to ground through a tiny little SMD component that's not meant to handle that. It has no chance of dissipating that much energy in such a short period of time, which is why it's so dramatic when it's a relatively minor failure in terms of how bad it is. The amount of drama you get for a very simple failure is rather high with capacitors failing, because they basically short out. That's why you have the scorch marks and such significant PCB damage, because they easily get to the melting point of copper once you get a short circuit with that much power available. You also have the delay before the power supply notices that it's pushing too much current, then the OCP kicks in, and basically that's plenty of power.

I will say it's worth noting, if anyone does have a card that fails, send us a photo. You can tweet it at us at GamersNexus. Or do you have a Twitter account, or what's the best way for people to send it to you?

I don't have a Twitter account, so YouTube comments, maybe on a video
that you posted. Yeah, actually I have a YouTube discussion section, so they can just post it there as well. And I do want the back and the front side, because just seeing the back, you see damage, but you don't know where it started. If one side gets really, really hot then it'll transfer through to the other side anyway. So send us photos of both sides.

It's really easy: you take out four screws to get the heatsink off, that exposes the front side, and then you take out the backplate screws, and that'll take the baseplate and the backplate off, and the PCB is bare. At this point you shouldn't be worried about damaging it anyway, so there's no reason to be afraid to take it apart, and then send it back to EVGA in a box. That will help us catalog things and see if there's any trend. I guess that about recaps the issue.

So thank you for joining me, Buildzoid from Actually Hardcore Overclocking. He has a YouTube channel, search for Actually Hardcore Overclocking, pretty cool stuff. And it sounds like you've got a couple of VRM videos in the future anyway.

Yeah, most of them are about VRMs.

So if you want to learn more about this stuff, go there.
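
The MOSFET heating loop described in the conversation (a hotter junction means higher resistance, which means more heat) can be sketched as a small fixed-point iteration. This is only a rough illustration, not a model of any real part: the function name `settle_junction_temp` and every numeric default (on-resistance, temperature coefficient, junction-to-case thermal resistance) are made-up assumptions, and only conduction loss is counted. The 150 C limit is the junction rating quoted in the conversation.

```python
"""Toy model of the MOSFET heating feedback loop discussed in the conversation.

All numeric defaults are illustrative assumptions, not datasheet values.
Only conduction loss (I^2 * R) is modeled; switching losses are ignored.
"""

TJ_MAX_C = 150.0  # junction temperature rating quoted in the conversation


def settle_junction_temp(case_temp_c, current_a,
                         r_on_25c_ohm=0.004,      # assumed on-resistance at 25 C
                         temp_coeff_per_c=0.005,  # assumed resistance rise per degree C
                         r_theta_jc=2.0,          # assumed junction-to-case resistance, C/W
                         max_steps=1000, tol=1e-6):
    """Iterate the heating loop until it settles or passes the junction rating.

    Returns (junction_temp_c, exceeded_rating).
    """
    tj = case_temp_c  # start with the junction at case temperature
    for _ in range(max_steps):
        # On-resistance rises roughly linearly with junction temperature.
        r_on = r_on_25c_ohm * (1.0 + temp_coeff_per_c * (tj - 25.0))
        # Conduction loss at this resistance.
        power_w = current_a ** 2 * r_on
        # The junction sits above the case by (thermal resistance * power).
        new_tj = case_temp_c + r_theta_jc * power_w
        if new_tj > TJ_MAX_C:
            return new_tj, True   # past the rating: the loop feeds itself from here
        if abs(new_tj - tj) < tol:
            return new_tj, False  # heating and cooling have balanced out
        tj = new_tj
    return tj, False


if __name__ == "__main__":
    for amps in (30.0, 50.0, 70.0):
        tj, exceeded = settle_junction_temp(100.0, amps)
        status = "past the 150 C rating" if exceeded else "settles"
        print(f"{amps:.0f} A at 100 C case: Tj ~ {tj:.1f} C ({status})")
```

With these assumed numbers, a 100 C case temperature is harmless at moderate current because the loop settles a few degrees above the case, which matches the point made in the conversation that 100 C on a VRM is not alarming on its own. The combination of current and temperature is what pushes the junction past its rating.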