Full Buildzoid Conversation on EVGA VRMs & Dying Cards
2016-11-28
I'm gonna say it's the capacitors. EVGA has a bad batch. The one consistent thing that's starting to show up is you have scorch marks, like on this latest one, where the scorch mark comes right off the capacitor. There was a really early one where somebody had a blown-out capacitor, just the capacitor completely fried, in pieces. That was the first report, and there I was like, oh no, that's just a manufacturing defect, that's not a thermal issue, right?
So this most recent one as well: the capacitor, well, it's not blown to bits, but you can see that the solder on one side of it just got shot right off. So that's another capacitor failure, and basically all of the other ones are varyingly more severe; capacitor failures could cause those exact types of damage. And then, along with the fact that EVGA is claiming that there isn't a thermal issue, right, that would line up with why there isn't a thermal issue: you have a bad batch of capacitors. So I'm thinking it's that, because I read through the capacitor failure stuff, which I gave you a link to. Basically all of it's gonna lead to pretty much short-circuit, explosion sort of scenarios, where the capacitor will be damaged partially, over time it'll get worse, at some point it's gonna short out, and it's gonna blow up. And since this isn't really temperature bound, that explains why it blows up at idle, why it blows up without the card actually being properly powered or anything, you know, the one where it blows up from the PCIe slot right at startup.
Especially considering that the thermals, even without the thermal pads, for most people won't be that bad, right? Considering the temperatures you showed me from the testing, it's fine. That's normal for a VRM. A hundred degrees here, a hundred degrees there is fine. So basically I'm almost certain that EVGA just has a bad batch of capacitors on some of the cards. Hopefully not all of them, that would be a disaster. But I think they probably just have a bad batch of capacitors and they're just going up in flames, and now it's really up to EVGA to just state publicly that that's the case, reaffirm it or disprove it, because they obviously don't have a thermal issue with the MOSFETs.
Yeah, and we should also talk about that too, because in the testing I was doing, the hottest I could get them was a really worst-case scenario: dumping CPU radiator heat, with Prime95 running, straight into the VRM fan, like three inches away.
So ambient is like 40-ish, 41, as far as the VRM is concerned, or the GPU, and we're hitting, with overclocking, with overvolting, with the old VBIOS, and with no thermal pads, 126 Celsius on the backplate and then about a hundred on MOSFETs number seven and number two, which we've seen are the hot spots. And seven is actually running hotter than number two. So number two is the one at the bottom, seven is like the middle one. Yeah.
And we've seen a bunch of them where it blows up on the bottom one. Yeah. Well, no, because the power... I was just wondering about the PCIe slot failure, if the wiring was lining up with that, right, but now it is. So I think the thing to point out for folks is, basically, be prepared that some cards are gonna blow up. That's gonna happen. Hopefully it won't be too many of them, and it won't be because of thermals, as far as we can tell.
Yeah. I mean, I think thermals will cause the issue to show up faster. They'll cause the downward spiral of the capacitor to speed up a little bit, but it's not like it's gonna prevent it if you get the thermal pads or something. It's just gonna blow up anyway. Right.
Yeah. So I think the one thing to point out, though, is that people see this 100 Celsius, 105 Celsius or whatever on the VRM and just immediately think, holy crap, that's really hot. But it's not the same as a GPU or a CPU, where 100 Celsius is Tj max or greater than Tj max and you have a thermal shutdown.
Yeah, no. Tj max for MOSFETs is 150. The industry standard is 150 Tj max, and the casing, I mean, really high-end stuff is rated at 125, like they're rated to work at 125. There's MOSFETs, actually, if I look at, say, a 6894, which is really popular on AMD cards, that has a temperature rating which assumes, okay, this is a fun one: ambient temperature 70 degrees, no airflow except for the convection of the MOSFET itself, right, no heatsink, nothing. 70 degrees ambient. I mean, the rating is crap, but it's a rating, right? They tell you, yeah, you can run it at that, just don't overload it.
So really it's a case of, yeah, you can run VRMs really, really hot. The problem is it's a combination of the current and the temperature, because the heat output of a VRM grows as it gets warmer, right, so its efficiency decreases. And the efficiency is basically bound to the thermal junction temperature, so that's the internals of the MOSFET, so the external temperature just sort of... I'm gonna be making a video about how VRMs fail sometime soon, so I might as well go over this.
In a MOSFET you have your silicon, well, it's not necessarily silicon, but you have a semiconductor, which is the actual switching component. Then you have all of the casing around it, which is the wiring hookup to the silicon and the ceramic that basically covers it and protects it from damage and all of that. And the current specs for the MOSFET are basically set up so that if the case of the MOSFET is at a hundred degrees and you put this much power through it, the MOSFET is going to put out X amount of heat, which will lead to the internals being at a hundred and fifty, which is perfectly safe, and that's a good current rating. If you exceed that 150 degree continuous junction temperature rating, then the problem is that the resistance of the MOSFET is gonna go up, the power dissipation because of that is gonna go up as well, and that's gonna further increase internal temperatures. Basically you get a thermal runaway scenario where the extra heating is causing higher resistance, which causes extra heating, which eventually leads to it getting so hot that it blows itself to bits.
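
The feedback loop described here (more heat, more resistance, more heat) can be sketched in a few lines of Python. This is only an illustration under assumed numbers: the on-resistance, its temperature coefficient, and the lumped junction-to-ambient thermal resistance are hypothetical values, not figures from any datasheet or from the cards discussed. Only the 150 degree junction limit comes from the conversation.

```python
def settle_junction_temp(ambient_c, current_a,
                         r_on_25c=0.002,   # ohms at 25 C (hypothetical)
                         tempco=0.004,     # fractional rise in resistance per C (hypothetical)
                         theta_ja=30.0,    # C per watt, junction to ambient (hypothetical)
                         tj_max=150.0,     # junction limit mentioned in the conversation
                         max_steps=1000):
    """Iterate the heat -> resistance -> heat loop.

    Returns (junction_temp_c, over_limit). over_limit is True as soon as
    the estimated junction temperature passes tj_max.
    """
    tj = ambient_c
    for _ in range(max_steps):
        # Resistance grows linearly with junction temperature in this toy model.
        r_on = r_on_25c * (1.0 + tempco * (tj - 25.0))
        # Conduction loss only: P = I^2 * R.
        power_w = current_a ** 2 * r_on
        next_tj = ambient_c + theta_ja * power_w
        if next_tj > tj_max:
            return next_tj, True
        if abs(next_tj - tj) < 0.01:
            return next_tj, False
        tj = next_tj
    return tj, False


if __name__ == "__main__":
    for amps in (20.0, 40.0):
        temp, over = settle_junction_temp(ambient_c=40.0, current_a=amps)
        print(f"{amps:.0f} A -> {temp:.1f} C, over limit: {over}")
```

With these made-up values, 20 A settles at roughly 68 C, while 40 A crosses the 150 C limit on the second pass through the loop.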
Actually, that's a thing, because it causes a decrease in efficiency. If you've ever seen power supply reviews from JonnyGuru, where he does some of the ones that blow up, he always notes that you see a massive drop in efficiency, and then the efficiency starts falling, falling, falling, and then the power supply eventually explodes. The same happens to VRMs. You basically see that your efficiency falls off a cliff because the internals are overheating. I actually said that the best way to figure out if the VRM is about to fail is to monitor the efficiency: if it suddenly starts going downhill, turn it off. So the scenario for actually getting the VRM to blow up is that it's gonna just overheat itself really, really quickly at some point.
Right. I think the question then is, now we've done the thermal testing, shown obviously thermal pads and the VBIOS are beneficial, and if you own one of these cards there's really no reason not to at least do the VBIOS, and the only reason to not do the thermal pads is fear that you're gonna damage it, or laziness, or maybe you're not in a country they ship to. So I'd definitely do those. But with thermals not really being a source of concern, the question, I think, becomes: if I owned one of these cards, should I be worried about damage, you know, to other components?
Right. Yeah, at this point we can't actually say the cards will or won't fail, because it seems to be just a manufacturing defect and it's gonna be random how many cards it shows up on. So it's really up to EVGA to figure out what batch of cards they have that got screwed up, and then basically say, yeah, this set of serial numbers is bad, and then do a recall or something. Hopefully they'll figure this out sometime soon and make a statement about it. But yeah, it's not thermals at this point, so they're right about claiming that the thermals are fine, because they are.
And then as for damaging other components: if you have a bad power supply and you get a short circuit from a capacitor going bad, that power supply is probably gonna go with it. I mean, I actually had, you know, not a great power supply, but a decent unit, where it didn't kill anything else except itself, but it was like: 12 volt short circuit, power supply dies. So basically just make sure you have a good power supply if you have an FTW. But I'd hope most people have that anyway.
It's expensive enough where they should, in general. Yeah.
The motherboard, should we be concerned about this?
I mean, there's been a few. I think there was one case where he said he had scorch marks that reached all the way to the motherboard, but I think that's gonna be superficial, just bits of the card ending up on the motherboard, not actual damage. Apparently somebody had one where it cascaded, where the power supply went out and the motherboard died, with basically everything in the computer getting toasted. But that's again the power supply failing. The power supply is supposed to stop that, that's the whole point. If you have a short circuit in the system, the power supply should detect that and shut down, which would protect all your components. If it doesn't shut down fast enough, the issue is that the regulation of the power supply is going to go out of spec, and it's gonna basically do what the GPU does to the rest of the system.
Which sucks very badly. But, you know, I think the standing conclusion here is that the thermals largely were, well, not a red herring, because it was an oversight to not have thermal pads on there and run the fan speed that they did. The temperatures seemed okay, but they could be better. You could definitely get better temperatures, but that wasn't the heart of the issue. The heart of the issue seems like it's something that we can't necessarily figure out, not easily, and it's probably, I would think at this point, a manufacturing defect of some kind.
Yeah, and it might not even be a manufacturing defect on EVGA's side, right, if my capacitor suspicion is correct.
Right, yeah. So I guess if you have one of these cards, I don't really know what to suggest. I would just say don't run it when you're not in the same room as the computer, because that just exacerbates it if it fails and you have a terrible power supply. Like, the nightmare scenario, let's think: you have some horrific 500 watt power supply, which will power an FTW just fine. You have an FTW which has the defect that's causing the VRM to blow up. The VRM blows up, you get a short circuit, the power supply goes out, and I have heard of power supplies where they go out and they melt their own casing because of how terrible they are. They just burn themselves out. Worst-case scenario, it's gonna burn your house down. So just don't run it when you're not near it, so that you can catch it. But I've heard very few cases of this happening, power supplies failing this badly, right, and even then it's really, really rare. And it's not like cheap power supplies; we are talking garbage, absolute garbage. This is below the worst power supplies.
It's not even, like, Diablotek levels of bad, right? Diablotek is considered a terrible power supply company. These are companies that make power supplies where the design is barely capable of half the power they're rated for, and it never saw testing. They cost like 20 bucks, right? They don't even necessarily have PCIe connectors because they're that outdated. For most people, I don't think this is really an issue.
Basically, if your card goes, contact EVGA, get a replacement or refund, and buy something else. And if you haven't done the thermal pad or VBIOS mods, you might as well do them, because it is better, and, who knows, maybe it prolongs the life of the device a little bit. But if you have a defective card, I'm pretty sure it'll just fail anyway.
Yeah, you're not gonna find any kind of warning signs. One of the ones we saw just failed on the Windows desktop, allegedly, according to the user.
I mean, I actually think that if it is the capacitors shorting out like that, then it's totally reasonable, because with those things, the defects are all basically that there is a crack or some other kind of failure in the capacitor which causes it to slowly get more and more short-circuited. Capacitors aren't perfect insulators, right, they leak a little bit of current, and basically if they're bad then they'll leak more and more current until eventually the amount of current that's leaking through them actually just completely shorts out the capacitor, and then it blows up. So basically as long as the card is running, that can happen eventually. I mean, we have thermal images.
The thermal images from Tom's Hardware and everything, you can actually see the capacitors, because they're colder than everything else on the back of the card, so their card is obviously fine. If you had a thermal camera pointed at the PCB as it's about to die, you would see the capacitor be significantly warmer than everything else, and then it would basically get into thermal runaway and you'd see it spike really quickly. I think once it shorts, it's gonna... because we've heard of people basically getting flashes of white light, and I did tell you that to get a red glow you need 525 Celsius, for incandescence. To get a flash of white light you're gonna need like a couple thousand degrees, which, let's face it, it's basically gonna arc weld itself at that point. You have 12 volts shorted straight to ground right through a tiny little SMD component that's not meant to handle that. It has no chance of dissipating that much energy in such a short period of time.
Which is why it's so dramatic when it's a relatively minor failure in terms of how bad it really is. The amount of drama you get for a very simple failure is rather high with capacitors failing, because they basically short out. And that's why you have the scorch marks and such significant PCB damage, because they basically get easily to the melting point of copper, right? Once you get a short circuit with that much power available, because you also have the delay before the power supply notices that it's pushing too much current, then the OCP kicks in, and basically that's plenty of power.
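
To put rough numbers on "plenty of power", here is a back-of-the-envelope sketch in Python. The fault current, the delay before over-current protection trips, and the mass of copper being heated are all hypothetical values chosen for illustration. The estimate also ignores heat escaping into the board and the latent heat of melting, and assumes the full 12 volts drops across the fault. The specific heat and melting point of copper are standard textbook figures.

```python
COPPER_SPECIFIC_HEAT_J_PER_G_K = 0.385  # textbook value near room temperature
COPPER_MELTING_POINT_C = 1085.0         # textbook value, rounded


def short_circuit_energy_j(rail_v, fault_current_a, ocp_delay_s):
    """Energy delivered into the fault before protection trips: E = V * I * t."""
    return rail_v * fault_current_a * ocp_delay_s


def adiabatic_temp_rise_c(energy_j, copper_mass_g):
    """Temperature rise if all of that energy stays in a small mass of copper."""
    return energy_j / (copper_mass_g * COPPER_SPECIFIC_HEAT_J_PER_G_K)


if __name__ == "__main__":
    # All three inputs below are assumptions, not measurements.
    energy = short_circuit_energy_j(rail_v=12.0, fault_current_a=60.0, ocp_delay_s=0.010)
    rise = adiabatic_temp_rise_c(energy, copper_mass_g=0.01)
    print(f"energy into the fault: {energy:.1f} J")
    print(f"estimated temperature rise: {rise:.0f} C")
    print(f"past copper's melting point: {rise > COPPER_MELTING_POINT_C}")
```

With these assumed inputs, 12 V at 60 A for 10 ms is 7.2 J, which on paper is enough to heat a hundredth of a gram of copper by well over a thousand degrees.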
Yes. I will say it's worth noting: if anyone does have a card that fails, send us a photo. You can tweet it at us at GamersNexus. Or do you have a Twitter account, or what's the best way for people to send you photos?
I don't have a Twitter account.
So, post in your comments, maybe, on a video that you posted?
Yeah, actually I have a YouTube discussion section, so they can just post it there as well. And I do want back and front side.
Yeah, because just seeing the back you see damage, but you don't know.
Yeah, because if one side gets really, really hot then it'll transfer through to the other side anyway.
So send us photos of both sides. It's really easy: you take out four screws, that gets the heatsink off, which exposes the front side, and then you take out the backplate screws and that'll take the baseplate and the backplate off, and the PCB is bare. At this point you shouldn't be worried about damaging it anyway, because there's no reason to be afraid to take it apart, and then send it back to EVGA in a box. But that will help us catalog things and see if there's any trend. I guess that about recaps the issue.
Thank you for joining me, Buildzoid from Actually Hardcore Overclocking. He has a YouTube channel, search for Actually Hardcore Overclocking, pretty cool stuff, and it sounds like you've got a couple of VRM videos in the future anyway.
Yeah, most of them are VRM stuff.
So if you want to learn more about this stuff, go there.