Arm Cortex A76 - What Does It Mean For Smartphone Performance?
2018-06-08
Hello there, my name's Gary Sims and this is Android Authority.
Now, every year we expect smartphones that are more power efficient and have greater performance compared to the smartphones of the year before, but to do that we need new CPU designs and new GPU designs.
To that end, Arm has just released details of its newest CPU design, the Cortex-A76.
So if you want to find out what the Cortex-A76 is and what it will mean for the smartphones of 2019, please let me explain.
OK, so the way it
works is this: Arm is a design company. They don't actually physically make any chips.
They design CPU cores and GPU cores, and then they pass those designs over to companies like Qualcomm, Samsung, Huawei or MediaTek, and they use them in the SoCs that we actually find in our smartphones.
Last year's CPU was the Cortex-A75 and this year's new design is the Cortex-A76. Now,
the Cortex-A76 is quite an interesting design. There are some significant changes compared to the Cortex-A75, so let's have a look at some of the main characteristics.
The first thing to notice is that the Cortex-A76 is not an evolution of the Cortex-A75; it's a brand new design with a brand new microarchitecture.
However, of course, it still uses DynamIQ, which means there are high-performance cores which will be coupled with highly energy-efficient cores.
And here is the key figure that Arm are telling us: there's a two times performance boost for laptops, and that's compared to the current performance, which really means compared to the Snapdragon 835 and the Cortex-A73.
The reason they're mentioning laptops, of course, is that now we are in the era of Windows Arm-based laptops.
Although Arm are particularly underlining laptop performance, whichever way you look at it, a doubling in performance from the Cortex-A73 to the Cortex-A76 is a major achievement.
Now, when you come to
compare the Cortex-A76 with the Cortex-A75, we see some significant changes.
Here Arm is talking about a Cortex-A76 clocked at 3 GHz on a 7 nanometer process node, compared to this year's Cortex-A75 at 2.8 GHz at 10 nanometers.
First of all, the Cortex-A76 has a 35% performance increase, it has a 40% efficiency increase, and when it comes to AI and machine learning, which of course is kind of the key word at the moment, they're saying there is a four times performance increase.
And here we have some interesting comparisons using Geekbench for the Cortex-A73, the Cortex-A75 and the Cortex-A76.
Of course, they are also clocked at different frequencies, 2.45, 2.8 and 3 GHz; however, we can see that the Cortex-A76 is again producing significant increases, and there's a 35% increase from the Cortex-A75 to the Cortex-A76.
It's pretty interesting to note this 2.5x increase in floating point, so Arm have really put a lot of effort into improving the floating-point calculations that we get in this year's processor.
Now, we also
mentioned increases in performance and in power efficiency.
On the left here you can see that if you are running a single A76 core at 750 millivolts, then you can actually do 40% more work, 40% more performance, in exactly the same power envelope.
That means it can do things faster and quicker, and your battery doesn't go down any faster than it would with the A75.
The other way you can look at it: if you want to have the same level of performance as you do with today's Cortex-A75 processors, then the A76 will actually use 50% less power.
So you can have the same kind of, you know, Geekbench scores as you get today, but your battery will drain much, much less.
So that is a significant improvement.
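To make the quoted percentages a little more concrete, here is a small Python sketch of the arithmetic. The input numbers are the ones quoted in the talk (Arm's projections for a 3 GHz, 7 nm Cortex-A76 against a 2.8 GHz, 10 nm Cortex-A75). The helper names `per_clock_gain` and `energy_ratio` are made up for this illustration, and the simple model (performance scales linearly with clock speed, energy equals power multiplied by time) is an assumption, not something Arm published.

```python
# Figures quoted in the talk (Arm's projections, not measurements of shipping chips).
A75_CLOCK_GHZ = 2.8
A76_CLOCK_GHZ = 3.0
PERF_GAIN = 1.35             # A76 at 3 GHz vs A75 at 2.8 GHz
ISO_POWER_PERF_GAIN = 1.40   # 40% more work in the same power envelope
ISO_PERF_POWER_RATIO = 0.50  # 50% less power for the same performance


def per_clock_gain(perf_gain, old_clock_ghz, new_clock_ghz):
    """Gain that remains after dividing out the clock-speed increase.

    Assumes performance scales linearly with frequency (a simplification).
    """
    return perf_gain / (new_clock_ghz / old_clock_ghz)


def energy_ratio(power_ratio, perf_ratio):
    """Energy for a fixed task relative to the baseline.

    Energy = power x time, and time for a fixed task is 1 / performance.
    """
    return power_ratio / perf_ratio


clock_for_clock = per_clock_gain(PERF_GAIN, A75_CLOCK_GHZ, A76_CLOCK_GHZ)
energy_same_power = energy_ratio(1.0, ISO_POWER_PERF_GAIN)
energy_same_perf = energy_ratio(ISO_PERF_POWER_RATIO, 1.0)

print(f"Clock-for-clock gain:          {clock_for_clock:.2f}x")
print(f"Energy per task at same power: {energy_same_power:.2f} of the A75's")
print(f"Energy per task at same speed: {energy_same_perf:.2f} of the A75's")
```

Under these assumptions, roughly 1.26x of the 35% gain is left once the higher clock speed is divided out. Finishing a task 40% faster in the same power envelope uses about 71% of the energy, and matching A75 speed at half the power uses about 50% of the energy.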
Let's move on to look a little bit at the internals of the Cortex-A76.
Now, the first thing to note about the Cortex-A76, as I mentioned earlier, is that this is a new microarchitectural design, and Arm are confident that each year they're going to be able to take this initial design with this new architecture and tweak it some more to get better performance and even better power efficiency.
But starting with this first generation of this new architecture, we're seeing significant changes compared to the Cortex-A75.
You might be wondering how long it takes to do a processor design. Well, Arm have given some information here: the initial thoughts, the initial scribblings on bits of paper about the Cortex-A76, happened four years ago.
That just shows you how complicated these CPUs are and how much effort Arm put into tweaking every single tiny little bit of it to eke out as much power as they can.
So I'm going
to show you a fairly complicated picture of the internals of the A76, and even this really is a simplified, very simplified, version, but it shows just how a CPU is put together.
We can see here there are three distinctive parts. There's the part at the beginning, which is called the front end, which is how the instructions are fetched from memory and made ready to go down the pipeline so they can be executed.
Then you have this kind of decode part, which works out what the instruction is meant to do: is it a floating-point instruction, is it an integer instruction, does this instruction need to access memory? All that gets sorted out in the decode part.
After that you have the execution part, and so that we can get instruction-level parallelism there are different parts to this execution, because while you're executing, for example, a floating-point operation, you can also be halfway through executing the next integer operation, which generally is much simpler.
Now, the
key takeaway here about the front end part is that the branch predictor and the instruction fetch are actually decoupled from each other.
What I mean by decoupled is that the branch predictor works independently of the instruction fetch.
What that means is that when the branch predictor is working out where the program is going to go next, predicting the jumps, predicting the branches, it can actually fetch those instructions from memory and work out what's going on, and by the time they get to the instruction fetch stage of the pipeline they're already in the cache.
So it's the branch predictor that has done the hard work of actually fetching them from memory and working out where they should be.
What Arm have done is make the branch predictor work at twice the bandwidth of the instruction fetch, which means that while there are all these things going on with memory latencies and working out what goes on next, the instruction fetch is always being fed by the branch predictor at double the bandwidth.
When the branch predictor has filled up its kind of internal queue, it says, "hey, I've got nothing more to do", and it just waits while the instruction fetch goes through the instructions one at a time.
So with this decoupling they have been able to remove a lot of the latency that you can find in these early parts of the pipeline.
Now, if that went a bit over your head, just know this: they've made it quicker to get the right instructions out of memory and into the CPU, and of course that's vital for performance.
And then when we look at the
execution core, look at this: there are eight independent issue queues, which are power optimized for the attached execution pipes.
Some things take longer in a processor than others, so an integer calculation like one plus one is maybe a lot simpler than multiplying something by pi, for example.
So when you have this instruction-level parallelism, it means that if you've got, let's say, a sequence of integer operations and before that you've got a mathematical operation, you can start the mathematical operation off, then you can go ahead and get working on those integer things, and things start to work in parallel.
That's called instruction-level parallelism.
So as a
summary, what these microarchitecture changes manage to get for us is 25% more integer instructions per cycle than the Cortex-A75. That is significant.
There's 35% higher floating-point performance, and here is the really important one, and why Arm have been talking about laptops: there is 90 percent higher memory bandwidth.
Now, memory bandwidth is very important both in smartphones and in laptops, and a 90 percent increase is very, very interesting.
So what does that mean for you and for me? Well, basically it means that in 2019 we're going to see flagship smartphones using the Cortex-A76, and we're probably going to hear some announcements about the processors that power those smartphones towards the end of 2018, so maybe just six months from now.
So for 2019 we're looking at greater performance, significantly greater performance, and better memory bandwidth.
We're looking at better power efficiency, and then we're going to see a push into the Windows Arm laptop area.
So I'm really looking forward to seeing what companies like Qualcomm, Samsung and Huawei can do with the Cortex-A76.
My name's Gary Sims and this is Android Authority. I really hope you enjoyed this overview of the Cortex-A76. If you did, please do give it a thumbs up.
Also, you know, we'd like to ask you to please subscribe to our channel, please share this video on social media, and I will be reading your comments below to see what you think about the Cortex-A76.
Well, one last thing to say: don't forget to go over to androidauthority.com, because we are your source for all things Android.