Inside a GPU Die: Exploding 2080 Ti GPUs by Overheating, ft. TiN
2019-06-18
Oh yes, very important, too. People are going to
be crying, "Oh my god, you destroyed
a 2080 Ti for a video." Yeah, this was
actually last year's engineering
sample, so this card already
went through all the validation and we
cannot sell this board anymore. It doesn't
have a heatsink, it doesn't have anything.
Even the power design is a
little bit different, so it doesn't have
commercial value. So we're allowed to
use it for this experiment and to show,
in a real practical scenario, what can
happen if you forget to turn on the
over-temperature protection. Okay, before that,
this video is brought to you by us and
the GN store. The best way to support our
independent reporting is through
store.gamersnexus.net. This is made
possible with your purchases of merch
like our GN Medium Modmat, in stock
and shipping now and designed with GPU
teardown diagrams and grids. Our 100%
custom-made two-tone shirt is also a
great way to help, and it's currently on
sale. The shirt uses 95% cotton and 5%
elastane for a sporty fit with vibrant
colors, and was designed entirely by the
GN team. Learn more at the link in the
description below or go to
store.gamersnexus.net. So I'm with TiN from
EVGA. We've done a few videos here, and
what you've presented me with is a GPU
with a crater in it. Yeah, it's a victim
of somebody forgetting to turn
the over-temperature protection back on
after a bench session. The GPU overheated,
but the power was never cut off, and the GPU
exploded. Yes. So we have some demos of
that. The video will have started with
probably one of them overheating. We
didn't get a giant crater in it, but we
got some cracks on the die and a big
puff of smoke. Essentially, this
demo is to show why
thermal protection is in place
by default on pretty much every VGA card,
and on motherboards as well, and why it is
important. We have
people who are trying different extreme
overclocking experiments, like running
liquid nitrogen or even water cooling on the
cards, but accidents always happen.
Sometimes there is no water, or there's a pump
failure, and this is what can happen if you
don't turn the
protection back on when you finish
your benchmarking. Right. Yeah, if
you tell it not to protect itself,
it'll listen to you. Yes, exactly. So yeah,
this one is cratered, and I guess another
time this could happen would be if
you forget to pour LN2 in the pot and
you walk away or something. Yeah, if you
get distracted, go talk to somebody, and
then half an hour later you come back to the
system running with a pot and no LN2,
the temperature will be 200°C and the
GPU or CPU will be dead as well.
[unclear] Oh yeah, look, goo
came out. So I guess the top is
like a diffusion barrier or something on
top of the silicon? No, the top is
the silicon, that's it, and the goo is
the underfill. That's like a glue under
the chip so that humidity and air don't
get under the chip. I see, okay.
And actually, you can see on the
exploded card that there are small
little balls, just like you have on a
BGA package but much smaller, that
connect the GPU die to the substrate. Yes.
Okay, very important. [unclear]
And actually, you can see
all those shiny things: that's
actually the silicon level. That's where
all those billions of transistors are. Be
careful. So those are the actual
transistors, and then you have the copper
layers. Aha. Pretty much like a PCB, but much,
much, much smaller. Okay. So
all the components, all the
transistors, are on the back side, not on
the top side. The top side doesn't do anything.
So is the GPU die also
BGA to the substrate? Yeah, but it's not
using solder, it's using copper
bumps. I see, okay. Micro bumps, they're
called. And then what are the layers like,
do you know, when you cracked it
open just now? So the top layer
is the biggest layer. On
a PCB all the layers are the same,
but on the GPU, on the silicon, they have
the biggest layer, which will handle
all the power.
It will connect power from the PCB,
like memory power, or all the traces
that don't need a lot of
high-speed signals but need to carry
a lot of power. So that will be the top layer,
and it will connect all these blocks
around the GPU. Then the next
layers get finer and finer
until they reach the bottom layer, which
has the actual transistors. And that
is what you hear about with a
10-nanometer GPU or CPU, or
7-nanometer: all those nanometers
are right on the bottom of
the chip. Okay. That's why this
package is called flip chip. Yeah. So you
have the actual structures flipped down
toward the PCB, and on the top you have
just silicon material, which is able
to transfer the heat to the heatsink
and provide the cooling. Nice. And
actually, when you see those
beautiful pictures of silicon
dies, with all those rainbow-colored
transistors, that's
essentially the bottom layer. I see. So
the bottom layer is pretty close to the
contact bumps? Yes, yes. Then they have
just an insulation layer, and then they
have contact bumps to connect to the
substrate.
He's grabbing another specimen.
Do you have the card that blew up? Yeah,
where is it? It's always like this:
you've got something
exciting and you want to show
someone, and then when you're trying to show
it, nothing happens.
And there's one more thing on that, too.
We did try to blow up another one, and
with that one you thought you were
hitting, I guess, OCP? Yes, because there are
always not just GPU protection
mechanisms but the VRM itself. The VRM will
try to protect itself from drawing too
much current or voltage, and it will also
shut itself off. But it's actually a much
higher limit on overclocking cards,
because if you have the limit too low,
you will not be able to run extreme
overclocking on the card; it will shut off
too early. I also made a simple diagram
of how the VRM works with the overheat signal.
Essentially, we have a power
controller, we have power stages, we have the
GPU, and this is how the over-temperature
protection works: when the temperature
sensor in the GPU detects that the temperature
is too high, it will toggle the overheat signal
output. That signal goes to the VRM
controller and connects to its enable signal.
When the enable signal is off, the whole VRM
shuts down. I see. So that's essentially a
very simple concept, how it works.
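The signal chain described here (GPU temperature sensor, overheat output, VRM enable) can be sketched in a few lines of Python. This is only an illustration of the concept, not EVGA's hardware or firmware: the class and function names, the `protection_connected` flag, and the 100°C trip point are all assumptions, since the video gives no threshold.

```python
from dataclasses import dataclass

# Assumed trip point for illustration; the video does not give a number.
OVERHEAT_LIMIT_C = 100.0


@dataclass
class VrmController:
    """Hypothetical stand-in for the VRM controller's enable input."""
    enabled: bool = True

    def set_enable(self, level: bool) -> None:
        self.enabled = level

    def output_on(self) -> bool:
        # With enable low, the power stages stop and the GPU loses power.
        return self.enabled


def overheat_signal(gpu_temp_c: float, limit_c: float = OVERHEAT_LIMIT_C) -> bool:
    """GPU temperature sensor: asserts the overheat output above the limit."""
    return gpu_temp_c > limit_c


def update_protection(vrm: VrmController, gpu_temp_c: float,
                      protection_connected: bool = True) -> bool:
    """Route the overheat signal to the VRM enable input, if it is connected."""
    if protection_connected and overheat_signal(gpu_temp_c):
        vrm.set_enable(False)
    return vrm.output_on()
```

With the protection connected, a reading above the limit drops the enable signal and the VRM stays off. With `protection_connected=False`, the VRM keeps delivering power no matter how hot the GPU gets, which is the situation that produced the cratered die.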
And then about the VRM loop: you
have the controller, then you have the
power stages. The 12-volt
input goes to the power stages, which
convert it to a lower voltage that is
delivered through a big beefy shape on the PCB
to the GPU die. Then there are two special
pins, the GPU power and GPU return sense
pins, which go back to the controller.
You can think of it like there is a
small Oompa Loompa who sits in the
controller and looks at the voltage, and
if the voltage is not what is
expected, he will adjust the PWM signal
higher
so the voltage increases.
And if the voltage is too low because, for
example, you're running a heavy load like a
3DMark benchmark, then the Oompa Loompa will
adjust the voltage as required to
compensate for that. Okay.
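The "Oompa Loompa" loop above can be sketched as a tiny feedback simulation. This is a toy model under stated assumptions, not how a real VRM controller is implemented: the output is modeled as duty cycle times the 12 V input minus a resistive drop under load, and `vrm_output`, `regulate`, the 0.5 milliohm resistance, and the loop gain are all made up for illustration.

```python
V_IN = 12.0  # 12-volt input to the power stages


def vrm_output(duty: float, load_current_a: float,
               r_out_ohm: float = 0.0005) -> float:
    """Toy model: ideal step-down output minus a resistive drop under load."""
    return duty * V_IN - load_current_a * r_out_ohm


def regulate(target_v: float, load_current_a: float,
             steps: int = 200, gain: float = 0.02) -> float:
    """Watch the sensed voltage and nudge the PWM duty cycle until it matches."""
    duty = target_v / V_IN  # starting guess that ignores the load
    for _ in range(steps):
        sensed_v = vrm_output(duty, load_current_a)   # sense pins back to controller
        error_v = target_v - sensed_v                 # positive when voltage is too low
        duty = min(max(duty + gain * error_v, 0.0), 1.0)  # raise or lower PWM duty
    return vrm_output(duty, load_current_a)
```

In this model, a 200 A load pulls a 1.0 V output down to about 0.9 V if the duty cycle is left alone, while `regulate(1.0, 200.0)` raises the duty cycle until the sensed voltage is back at the target.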
So that's why I drew this very
professional guy who is watching the
feedback, the sense voltage, and
constantly adjusting everything in the loop.
That's why it's very important to have
correct V-sense and feedback. That's
one of the first things we test during
power design on any product, like a
motherboard or graphics card. And just keep
your eyes on the screen.
Now he's [unclear]. Now he knows
it's going to work.
That's it.
And this is happening within the
controller? Yes. Essentially, you tune
the controller, which has different
adjustment knobs, like the frequency and
how fast this monitoring loop will be
working. You can also adjust it
artificially: you can tell this guy,
"Oh, actually the real voltage
is 50 millivolts higher than you see,"
and then a correction will be applied
and the whole voltage will change
accordingly as well. Okay. Switching
frequency is essentially one of the
knobs that controllers have as well.
The transient response and the speed at which
everything works will be affected by the
switching frequency. That's why on
older motherboards and VGA cards you often
need to increase the switching
frequency, so everything on the VRM side
can catch up with the demand from the
GPU. All right, so that's the
walkthrough of over-temperature
protection, or lack thereof. You wanted to
add something, though? Yeah, basically,
over-temperature protection is important
for safety reasons, and it's put
in there for good reason. Right. Yes.
So thermal protection is important, and
all the cards have it enabled
by default. If you want to use
maximum fan speed, like we're providing
in the LN2 BIOS position, the best way to
do that is to take that BIOS and flash it
into the normal position on the BIOS
switch, or into the OC position. That
will retain the ability to go to maximum
fan speed, but still with all the thermal
protection in place. Okay. So then,
if you're in the regular BIOS position,
the thermal protection is on? Okay, so
it's only disabled when you switch into
LN2 mode, which is the red light on the
back side, right, the indicator, and that's when
all the protection is disabled on the
thermal side. So the thermal protection is
independent, then, from the VBIOS? Yes, you can
use any VBIOS, and this
control is purely in hardware, at the PCB
level. Right. Right, and I think the
coolest, the biggest takeaway here is: if
you forget to shut down and you leave
your system running,
yeah, then that will be a very expensive
day if the XOC BIOS is on. And then you
RMA your card and you put the
cooler back on it, hoping that they won't
notice. If you leave a hole in it like
that, they might know. So that's it for
this one. Pretty cool stuff.
You don't really get to see 2080 Tis
get exploded every day, so thank you for
watching that. It was fun for us. And
check back for the other videos on
voltage and LLC. Thank you, TiN, for
joining me. We'll see you all next time.