What are Tensor Cores?

NVIDIA's use of Tensor Cores has sparked quite the interest in the tech community. Now, originally this architecture was debuted by Google, who had actually been using it in kind of a nonchalant fashion for just over a year in the AI field. Its design makes it especially good at machine learning, analyzing patterns, and parallel processing. But what exactly does a Tensor Core look like, and why is it any more special than a CUDA core or a stream processor? Welcome to Minute Science.

One thing's for sure: GPUs have always been good at physics calculations and parallelism, the things you tend to think of when you hear the phrase "graphical computation", or at least what you'll probably think of now. But these fancy words are really nothing more than veils over simple mathematical operations that, albeit, amount to billions of calculations per second. Now, most computing tasks are simple additions and products, really nothing more than elementary calculations, things that basic calculators can do almost instantly. But graphical computing involves matrices, those fancy blocks of numbers that you likely learned about in high school. They look like this right here. There are certain methods for solving matrices, most of which involve a series of sums and products.

CPUs can handle these workloads, but only in small chunks, since only a small number of pipelines exist in the workflow. GPUs instead utilize thousands of CUDA cores, in the case of NVIDIA, or stream processors, in the case of AMD, to handle large mathematical workloads, including matrices, simultaneously. This is what we refer to as parallelism, or parallel processing, by the way. So we have matrices that themselves require addition and multiplication to solve, and then we have entire matrices being added and multiplied together, again simultaneously. So these things are adding and multiplying matrices together, and there's also addition and multiplication taking place within these matrices. This can be a lot of strain on any processor, let alone one with a few cores, and that's why GPUs have their place in the market, especially for things like AI and graphical computation.

But what makes a Tensor Core any different from a traditional GPU core, you ask? Good question, that's why this video exists. The way in which they handle calculations. See, Tensor Cores aren't really different in essence from the way a GPU core is designed; they're just specific in terms of what they do. They handle four-by-four matrix workloads. NVIDIA has an excellent blog outlining the process. They specifically multiply and accumulate using the formula D = A × B + C, where all four letters represent four-by-four matrices.

Recognize here where FP16 and FP32 are used. Don't worry, these just stem from the acronym FLOPS, which stands for floating-point operations per second (or per clock) and represents the number of multiplications and additions a GPU can handle in a given time span. This doesn't always translate, by the way, to raw gaming performance. Execution on a software level plays a huge role here, which is why something like Vega 64, despite having an insane amount of floating-point performance, and it was hailed as such, falls behind something like a GTX 1080 in several titles. So optimization, that's its fault, not the graphics card's.

When you see FP16 and FP32 in this blog, know that they're referring to the number of bits used to represent each floating-point number. You know, it gets really complicated here, but FP16 units are easier to process. They carry less information with them, though; it's a trade-off. Tensor Cores can handle 64 floating-point mixed-precision operations per clock, "mixed precision" in reference to the mix of FP16 and FP32 units in the formula we mentioned earlier. And I know this is getting really confusing, just bear with me. The FP16 multiply yields a full-precision result that is accumulated in FP32 operations with the other products in a given dot product for a four-by-four-by-four matrix multiply.

In a nutshell, this jumble of words means that Tensor Cores are designed for specific operations at the expense of precision. In effect, CUDA cores are still technically more efficient in their current state for many operations; Tensor Cores are simply mixed into SMs for when they're needed. It's a bit like placing several college-level mathematicians in a room to handle several random operations, along with a couple of students who specialize in one or two specific operations, say long division. The other students can handle the workload, but not as efficiently and quickly as the two specialists, whose brains are specifically wired for long division. I don't know, it's kind of a weird analogy, but hopefully you get what a Tensor Core is doing in essence. And that's why NVIDIA has been pushing Tensor Cores hard in their machine learning endeavors. They're really good at just a few things, including machine learning and AI applications, and that's why things like the Titan V have Tensor Cores.

But should we expect to see Tensor Cores in conventional consumer-grade graphics cards? Don't count on it. Like we just mentioned, these cores are specialists. They handle parallelized compute and storage operations like true bosses, but traditional GPU cores are still much better at handling most video game code out there. They're often faster and more efficient, which is why they aren't likely to be replaced anytime soon.

If you guys liked this video, let us know by giving this one a thumbs up, we appreciate it. Thumbs down for the opposite. Click that red subscribe button, and the sponsor button if you want to get fancy with it, down below. Stay tuned for the next video. This is Science Studio. Thanks for learning with us.
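
For readers who want to see the multiply-accumulate step from the video in concrete terms, here is a minimal sketch of D = A × B + C on four-by-four matrices, with the inputs A and B rounded to FP16 and the running sums kept in FP32. It is a software illustration only, not how the hardware is implemented, and the function names (to_fp16, to_fp32, mixed_precision_mma) are made up for this example. It uses only Python's standard struct module to emulate the two number formats.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest 16-bit (half-precision) value."""
    # struct's 'e' format is IEEE 754 binary16; values beyond about 65504 raise OverflowError.
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest 32-bit (single-precision) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mixed_precision_mma(a, b, c):
    """Hypothetical helper: D = A x B + C for 4x4 matrices.

    A and B are rounded to FP16 first (the low-precision inputs);
    every product is added into an accumulator that is kept in FP32.
    Returns the result matrix and the number of multiply-adds performed.
    """
    n = 4
    a16 = [[to_fp16(v) for v in row] for row in a]
    b16 = [[to_fp16(v) for v in row] for row in b]
    d = [[0.0] * n for _ in range(n)]
    fma_count = 0
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])  # start from C, held in FP32
            for k in range(n):
                # The product of two FP16 values fits exactly in FP32,
                # so only the addition is rounded here.
                acc = to_fp32(acc + a16[i][k] * b16[k][j])
                fma_count += 1
            d[i][j] = acc
    return d, fma_count

# Example inputs (illustrative values only).
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b = [[float(4 * i + j) for j in range(4)] for i in range(4)]
c = [[0.5] * 4 for _ in range(4)]

d, count = mixed_precision_mma(identity, b, c)
for row in d:
    print(row)
print("multiply-adds:", count)
```

The loop performs 4 × 4 × 4 multiply-adds, which is where the figure of 64 mixed-precision operations for one four-by-four-by-four matrix multiply comes from. The trade-off mentioned in the video shows up in to_fp16: a value such as 0.1 does not survive the rounding to 16 bits unchanged, while the FP32 accumulator limits how much further error builds up as the products are summed.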

Gadgetory

2018-07-05