Nvidia's use of tensor cores has
sparked quite the interest in the tech
community now originally this
architecture was debuted by Google who
had been actually using it in kind of a
nonchalant fashion for just over a year
in the AI field its design makes it
especially good at machine learning
analyzing patterns and parallel
processing but what exactly does a
tensor core look like and why is it any
more special than a CUDA core or a
stream processor welcome to Minute
Science one thing's for sure GPUs have
always been good at physics calculations
or parallelism the things you tend to
think of when you hear the phrase
graphical computation or at least what
you'll probably think of now but these
fancy words are nothing more really than
veils over simple mathematical
operations that albeit amount to
billions of calculations per second now
most computing tasks are simple
additions and products really nothing
more than simple elementary calculations
things that basic calculators can do
almost instantly but graphical computing
involves matrices those fancy blocks of
numbers that you likely learned in high
school they look like this right here
there are certain methods to solving
matrices most of which involve a series
of sums and products CPUs can handle
these workloads but only in small chunks
since only a small number of pipelines
exist in the workflow GPUs instead
utilize thousands of CUDA cores in the
case of Nvidia or stream processors in
the case of AMD to handle large
mathematical workloads including
matrices simultaneously this is what we
refer to as parallelism or parallel
processing by the way so we have
matrices that themselves require
addition and multiplication to solve and
then we have entire matrices being added
and multiplied together again
simultaneously so these things are
adding and multiplying matrices together
and there's also addition and
multiplication taking place within these
matrices so this can be a lot of strain
on any processor let alone one with a
few cores and that's why GPUs have their
place in the market especially for
things like AI and graphical computation
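
A minimal sketch of the point above, in plain Python: multiplying two matrices really is nothing more than a series of sums and products. The function name `matmul` and the sample values are illustrative assumptions, not something from the video.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    # Each output element is one dot product: a run of multiplies and adds.
    # No element depends on any other, which is why a GPU can compute
    # many of them at the same time.
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = matmul(a, b)
print(product)  # [[19, 22], [43, 50]]
```

This version runs every dot product one after another, the way a single CPU pipeline would. The parallel version simply hands each output element to a different core.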
but what makes a tensor core any
different from a traditional GPU core
you ask good question that's why this
video exists the way in which they
handle calculations see tensor cores
aren't really different in essence from the way a GPU
core is designed they're just specific in
terms of what they do they handle four
by four matrix workloads NVIDIA has an
excellent blog outlining the process
they specifically multiply and
accumulate using the formula D equals a
times B plus C where all four letters
represent four by four matrices
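
The multiply and accumulate formula above can be sketched in plain Python as follows. This is only a software illustration of D = A times B plus C on four by four matrices, not how the hardware is programmed; the name `multiply_accumulate` and the sample matrices are assumptions made for the example.

```python
N = 4  # tensor cores work on four by four matrices


def multiply_accumulate(a, b, c):
    """Return D = A * B + C for N x N matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(N)) + c[i][j] for j in range(N)]
        for i in range(N)
    ]


# A holds 0..15, B is the identity matrix, C is all ones,
# so D should simply be A with one added to every element.
a = [[i * N + j for j in range(N)] for i in range(N)]
b = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
c = [[1] * N for _ in range(N)]

d = multiply_accumulate(a, b, c)
for row in d:
    print(row)
```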
recognize here where FP16 and FP32
are used don't worry these just stem
from the acronym FLOPS which stands for
floating-point operations per second or
per clock and represents a number of
multiplications and additions a GPU can
handle in a given time span this doesn't
always translate by the way to raw
gaming performance execution on a
software level plays a huge role here
which is why something like Vega 64
despite having an insane amount of
floating-point precision because it was
hailed as such falls behind something
like a GTX 1080 in several titles so
optimization that's at fault not the
graphics card when you see FP16 and FP32
in this blog know that they're referring
to the number of bits per value
represented in the floating-point
operation you know it gets really
complicated here but FP16 units are
easier to process they carry less
information with them though it's a
trade off tensor cores can handle 64
floating-point mixed precision
operations per clock
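
One way to see where the figure of 64 comes from, assuming it counts the multiply-add steps in a single four by four matrix product: each of the 16 output elements needs 4 multiply-adds. The variable names below are illustrative only.

```python
N = 4

multiply_adds = 0
for i in range(N):          # row of the output matrix
    for j in range(N):      # column of the output matrix
        for k in range(N):  # one multiply-add per term of the dot product
            multiply_adds += 1

print(multiply_adds)  # 64
```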
mixed precision in reference to the mix
of FP16 and FP32 units in the formula we
mentioned earlier and I know this is getting
really confusing just bear with me the
FP16 multiply yields a full precision
result that is accumulated in FP32
operations with the other products in a
given dot product for a four by four by
four matrix multiply in a nutshell this
jumble of words means that tensor cores
are designed for specific operations at
the expense of precision in effect CUDA
cores are still technically more
efficient in their current state for
many operations tensor cores are simply
mixed into SMs for when they're needed
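
A rough sketch of the mixed precision idea described above, using only Python's standard `struct` module to round numbers to FP16 and FP32. It emulates the rounding in software and is not how a tensor core is actually driven; the helper names (`to_fp16`, `to_fp32`, `dot_mixed`, `dot_fp16`) and the sample values are assumptions picked to make the effect visible.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest 16-bit floating-point value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest 32-bit floating-point value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_mixed(xs, ys):
    """FP16 inputs, full precision products, running sum kept in FP32."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = to_fp32(acc + to_fp16(x) * to_fp16(y))
    return acc


def dot_fp16(xs, ys):
    """Same dot product, but every intermediate result is rounded to FP16."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = to_fp16(acc + to_fp16(to_fp16(x) * to_fp16(y)))
    return acc


row = [2048.0, 1.0, 1.0, 1.0]
col = [1.0, 1.0, 1.0, 1.0]

mixed_result = dot_mixed(row, col)
fp16_result = dot_fp16(row, col)
print(mixed_result)  # 2051.0
print(fp16_result)   # 2048.0
```

Around 2048, FP16 values are spaced two apart, so adding 1 to an FP16 running sum gets rounded away every time. Keeping the running sum in FP32 avoids that loss, which is the reason for mixing the two formats.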
it's a bit like placing several
college-level mathematicians in a room
to handle several random operations
along with a couple of students who
specialized in one or two specific
operations say long division the other
students can handle the workload but not
as efficiently and quickly as the two
specialists whose brains are
specifically wired for long division I
don't know it's kind of a weird analogy
but hopefully you get what a tensor core
is doing in essence and that's why
NVIDIA has been pushing
tensor cores hard in their machine
learning endeavors they're really good
at just a few things including machine
learning and AI applications and that's
why things like the Titan V have tensor
cores but should we expect to see tensor
cores in conventional consumer grade
graphics cards don't count on it like we
just mentioned these cores are
specialists they handle parallelizing
and compute storage operations like true
bosses but traditional GPU cores are
still much better at handling most
video game code out there they're often
faster and more efficient which is why
they aren't likely to be replaced
anytime soon if you guys liked this
video let us know by giving this one a
thumbs up we appreciate it thumbs down
for the opposite click that red
subscribe button and the sponsor button if
you want to get fancy with it down below
stay tuned for the next video this is
Science Studio thanks for learning with
us