Gadgetory


All Cool Mind-blowing Gadgets You Love in One Place

GTX 1080 Founders Edition Review & Benchmark

2016-05-17
Pascal is here. GP104 has arrived on the GTX 1080 graphics card that we have here. We've benchmarked it for frame rates, thermals, power, noise, and more, and what we're looking at today is the GTX 1080 Founders Edition.

The main news here, just to bring everyone up to speed: GP104 is the Pascal architecture. This is the first Pascal architecture card shipping for consumers, for the gaming market, and the big change is that it's got a refined, or actually brand-new, process node. It's running a 16 nanometer FinFET process as opposed to Maxwell's 28 nanometer planar process, and that has some inherent power efficiency and performance-per-watt gains that will hopefully be reflected in the tests we're showing you today.

Before diving into all this, I want to let everyone know we've got a 9,000 word review of this card written on the website. Hit the link in the description below for the full article, because we can't go into all the same detail in a video that we can in that article. The article includes more depth on asynchronous compute, more depth on architecture, more depth on the memory subsystem, things like that. So you'll find that link below, but let's dive into this thing and look at some of the testing data and specs.

The GTX 1080 is priced at $700 currently for the Founders Edition, which is equivalent to the previous "reference" nomenclature; see our previous video for that. MSRP for board partners is $600, so we'll see a range of $600 and higher, and the cards ship on May 27th for the Founders Edition.

This is the specs table for all the recent Nvidia video cards, shown on the screen now. On the far left is the Tesla P100 accelerator card, which runs GP100, a cut version of Big Pascal. That was the first Pascal architecture device, but it was shipped with scientific tasks in mind; it was a compute card, not a gaming card. The GTX 1080 debuts the GP104 version of Pascal, and a lot of the architecture is similar in terms of top-level
ideas, and the process node is the same, but the SM architecture is varied in some ways. Compute preemption is similar in some ways; there's a couple of differences there that we'll talk about.

The GTX 1080 has 2560 CUDA cores split between 4 GPCs and 20 SMs. The reduced cores per SM mean more dedicated resources per SM partition, so to speak, as SMs are two quote-unquote partitions of cores, schedulers, warps, buffers, and things like that, and we'll talk about that momentarily.

Other than the process shrink, the most obvious change to GP104 for consumers is the introduction of 8 gigabytes of GDDR5X memory. That X is new and important. GDDR5X VRAM from Micron has an effective operating frequency of 10 gigahertz on this particular device, and it's got a 256-bit interface, again on this particular device. It offers a mid-step between the 40-ish percent slower GDDR5 and the future of HBM2. High bandwidth memory, of course, shipped originally on Fiji, on AMD's cards, and HBM2 will be coming to Vega later and hopefully to Nvidia sometime, maybe in the next year, but it's got low yields right now. GDDR5X operates at 10 gigabits per second per die on GP104 and has the potential to grow to 13 to 14 gigabits per second with Micron's future advances. GDDR5, for reference, operates at about 8 gigabits per second, versus the 13 to 14 maximum that Micron is targeting for GDDR5X.

The GTX 1080 runs at 1733 megahertz boosted, with a GPU Boost 3.0 update, and can overclock beyond 2 gigahertz, as you'll see later. It's got nine teraflops of FP32 compute at stock, whereas the GTX 980 Ti has 5.63 teraflops and the GTX 1070 has 6.5 teraflops. Most notably, the GTX 1080 has a TDP of 180 watts and requires only a single 8-pin power header, which is a pretty big change as well.

Now of course, I know you all want to see the benchmarks right away, so we'll start there, starting with thermals, noise, and then FPS. You can find all of our testing methodology on the website, in the article, if you're
curious how we conducted these tests.

For gaming, we tested OpenGL, DirectX 11, DirectX 12, and Vulkan. We also tested 1080p, 1440p, and 4K resolutions. Games include Doom, Ashes of the Singularity, The Talos Principle, Tomb Raider, The Division, GTA V, Shadow of Mordor, Metro: Last Light, and more. We will not show all the charts here, as that would be insane, so again, see the article for those.

As many of you know, we used a thermal chamber recently to validate our thermal testing methodology and found it to be highly accurate, thanks to our ambient logging, done actively with a thermocouple reader. Here's our equilibrium chart, as we call it for ease. The GTX 1080 operates at 57.5 Celsius under load and 9.6 Celsius idle. Comparatively, that's roughly 49 percent warmer than the Fury X, which is liquid cooled, at 36.39 Celsius, and is effectively identical to the GTX 980 Ti reference edition. The Founders Edition cooler is able to keep the more powerful GTX 1080 at about the same temperature as the GTX 980 Ti's reference cooler does for the GM200 chip. And then, of course, we've got the Hybrid card on here as well, from EVGA, and that is just insane, but it's a much different design and it's more expensive.

Here's a look at thermals over time. Thermal torture commences at the same 120 second mark for all devices because we use a custom program. After adding ambient back in, we would see that the thermals hit about 80 Celsius and stay there for these Nvidia devices, and especially the 1080. The fan will adjust itself accordingly to maintain this thermal level, and that is something we'll talk about with throttling.

This next test is a new one for us. The purpose here is to run an endurance test and generate a chart that looks for throttling under torture scenarios. We disabled the open bench fans to this end and just let the GPU try to cool itself, creating a worst-case scenario for it, hoping to discover if it would throttle the frequency against a thermal barrier somewhere. That's what we're looking to discover here.
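
Editor's note: the equilibrium figures above are deltas over ambient, while the roughly 80 Celsius figure is what you get after adding ambient back in. A minimal sketch of that conversion follows. The function names are hypothetical, and the 22.5 Celsius ambient reading is an assumed value picked for illustration, not a measurement from the review.

```python
# Sketch of the delta-over-ambient convention used in the thermal charts.
# Helper names are hypothetical; the ambient value is an assumption.

def delta_over_ambient(diode_c: float, ambient_c: float) -> float:
    """Temperature rise of the GPU diode above room temperature."""
    return diode_c - ambient_c


def absolute_from_delta(delta_c: float, ambient_c: float) -> float:
    """Add ambient back in to recover the absolute diode temperature."""
    return delta_c + ambient_c


assumed_ambient_c = 22.5   # assumption for illustration only
load_delta_c = 57.5        # GTX 1080 load figure quoted in the review

absolute_c = absolute_from_delta(load_delta_c, assumed_ambient_c)
print(f"Load delta {load_delta_c:.1f} C + ambient {assumed_ambient_c:.1f} C "
      f"= {absolute_c:.1f} C absolute")
```

Reporting deltas keeps results comparable between test days with different room temperatures, which is why the absolute figure only matters when looking for a throttle point.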
Unlike the earlier charts, this temperature is represented as an absolute value, not a delta. You'll notice that over the two-hour test period, running DiRT Rally at completely maxed settings, by all accounts the GPU seemed to sit around 80 Celsius, peaked occasionally, and then also dropped in frequency occasionally. Here is a cropped-in look at those dips, where you can see what's causing the drops. If you look closely, you'll see that it's basically the temperature hitting about 82 Celsius that causes the frequency fluctuations, which show a range of about 60 megahertz each time the GPU diode hits 82 Celsius absolute temperature. That can trigger a slight latency increase or a slight frame rate fluctuation at the exact moment of the frequency drop, but it is basically imperceptible and not something to really be concerned about, because overall we've seen this five times over a period of two hours.

Enough of that; let's talk power and noise, and then games. Total system power consumption, which is not per card but for the whole system, offers a different comparison. For the GTX 1080 FE cards, we see that they sit around 300 watts. This is a bit more power than the GTX 980, non-Ti, required, but a fair bit less than the GTX 980 Ti that's also shown here.

As for noise, all the methodology and the meters and setup we used are defined in the article; it's pretty important to check that out if you want to know more. Idle noise levels are more or less imperceptibly varied between all the cards tested. After the five-minute GPU load period, the R9 290X pushes the loudest auto dB output at 49.3 decibels and would be perceptible even from within an enclosure. The R9 Fury X may only be 39.08 decibels, but the one we've got still produces that high-pitched pump whine that we wrote about ages ago. The MSI Twin Frozr card keeps the lowest dB level at 30.37, aided by its dual fan push setup and massive alloy heat sinks, and the GTX 980 Ti VR Edition
pushes one of the loudest outputs at 40.7 decibels, with the 1080 running within margin of error at 48.8 decibels, effectively identical. No card realistically hits 100% fan speed; they tend to sit around 50 percent if they can help it. But the stats show the R9 290X at 70 decibels on its reference design, the GTX 1080 at 57.2 dB, the 980 Ti noticeably louder at 63.4 dB, and the 980 at 59.52 dB, with the Fury X and Twin Frozr cards again around the quietest. For reference, conversational speech is about 65-ish, maybe 70 decibels, depending on how loudly you talk.

All right, time to test some games. We're opening with Doom, which we just tested. This is an OpenGL game, not a Vulkan or DX game, and it runs on two different versions: there's 4.3, which it runs on for AMD (that's something done by id Software), and OpenGL 4.5 for Nvidia, so keep that in mind. Let's jump into this. We already posted the full benchmark for this game, by the way, if you want to learn more about it.

4K shows the GTX 1080 as being the only card getting within throwing distance of 60 FPS. One or two settings tweaks would push the GTX 1080 into full 60-plus FPS action. The GTX 1080 FE at stock clock outperforms a stock-clock GTX 980 Ti by 13.8%. Against the Fury X, that delta is widened to 21.4%, but in the real world that's still not that noticeable. The GTX 1080 is off to a good start, though, and the value proposition is a bit better against these similarly priced devices.

1440p shows a significant performance lead for the 1080, to the tune of 98.3 FPS, and with the best low frame times on our bench. The GTX 1080 is the most tightly timed card for frame delivery that we've tested as of this instant. Against the predecessor GTX 980, performance gains are a staggering 30% and over; again, that's the 980, non-Ti. Against the 980 Ti, however, the performance difference is again 13%, similar to the last one we looked at.

Here's the 1080p chart. In this test, it's clear
that we're bottlenecking somewhere else in the system, probably on our 5930K CPU. That CPU bind does mean that we're seeing the 980 Ti and 1080 push effectively identical performance output. Either way, either card would enable close to 144 hertz gaming if an appropriate CPU is paired with it.

We tested a few new-API games, but Ashes is the most interesting, as it's the most reliable, with the most data. We ripped the Satellite Shot 2 data from Ashes, which shoves large batches down the pipe and chokes components. We have two sets of data for Ashes: one featuring millisecond latency between frames and the improvement yielded from DX12, and the hard frame rate comparison chart between DX11 and DX12.

What you're looking at right now is the 1080p high comparison of DX12 and DX11 performance in Ashes. The GTX 1080 holds a clear lead over everything when using DX12. This is because of GP104's clear improvements in asynchronous compute, which I'll explain in a few minutes. Notice that the Fury X and 390X are both choking on some sort of DirectX 11 optimization issue, where they can't circumvent a CPU hang-up or other bottleneck in the system, hence their identical performance; they're getting bottlenecked. This is something which Nvidia's DX11 drivers are good at working around. Looking at DX12 performance, though, AMD shows some of the biggest raw gains from the new API, and that's great news for them as the industry trends toward DX12 and Vulkan. Now look at the GTX 980. Its DX11 performance ranked it high among the cards, but once we sort by DX12, it's clear that the 980 is the loser on the bench.

4K high shows similar performance changes, as does 4K with Crazy settings. The GTX 1080 has made obvious improvements in DX12 optimization and frame rate.

Here's a frame time chart at 1080p and high. Lower is better here; it's measured in milliseconds. Notice that AMD's bane is its DX11 frame latency, which creates the stuttering seen in DX11. The 1080 has an absurdly low 13.38 millisecond average frame time, which is good, but that's not to take away from the improvements made across the board by AMD and Nvidia both with their newest architectures.

Here's the percent change chart for latency. The Fury X sees a 120% latency reduction. The 390X sees a massive 76.9 percent latency reduction, both a reward of AMD's investment in async compute, while the older GM204 Maxwell architecture struggles to stay positive. The GTX 1080 and GP104, however, combine brute-force compute with async improvements to generate a 48.65 percent jump in DX12, which is massive news for Nvidia, who struggled in the last generation with DX12. So they're finally gaining some serious ground here and have become a real contender in the new API space. To see more DX12 and Vulkan performance, check the article. We're trying to keep this bit short because it is pretty complex data.

Let's look at DX11. At 4K, the GTX 1080 holds nearly a 30% lead against both the GTX 980 Ti and the R9 Fury X. The 1080 is the first single GPU that is able to sustain our GTA V 4K benchmark with all main settings on very high and ultra. It pushes 56 FPS, which is well within the acceptable range, without many in-game hiccups. At 1080p, the GTX 1080 pushes into 125 FPS and leaves behind the 980 Ti and Fury X again, both of which sit at about 16% lower frame rates. The Fury X drops its 0.1% low values below 60 FPS, but that's really not too bad here, and it generally pushes less consistent frame times than the 1080.

Now we're looking at Black Ops 3. We've had to expand our scale for these charts past 160 FPS because the GTX 1080 crushes frame rates at a 202.3 FPS average for 1080p, but if you're buying this, then you really aren't going to plan on 1080p, let's all be honest. Again, that 200 is an average; as for 1% lows, those exceed 130 FPS. The GTX 1080 is exceptional in its frame rate performance here and runs tight frame times, meaning
consistent latency between frame deliveries. AMD's R9 Fury X also performs very well in Black Ops 3 and outperforms the 980 Ti, just barely, assuming a reference 980 Ti anyway, and that's all we've really got for this test. The Fury X is still about 24% behind the 1080 even at 1440p, and the 1080 is nearly at the 144 hertz range and would easily sustain such high frame rates with a few settings tweaks if you needed it. As for 4K, the card pushes 68 FPS average and holds a firm lead over the Fury X in average frame rate, but a massive lead over the Fury X's 0.1 percent low dips, where this Fury X card seems to show its 4 gigabyte limitation in some of these 4K high-setting scenarios, because the VRAM is just being tapped so heavily. This is a consistent issue in some of the DX11 games with the Fury X, though DX12 is somewhere AMD excels, and you've already seen that in some of the test results.

For the sake of time, we're going to stop the game benchmark charts here. If you want to see more, link in the description below; you'll find Metro and a couple of other DX11, DX12, and Vulkan analysis items that you can look at. But now we're going to talk about overclocking.

Overclocking has changed thanks to GPU Boost 3.0, but not too much. First of all, we have an interview with Tom Petersen, technical marketing director at Nvidia, that's live with this video; you can check that out for more depth on how overclocking works. The main thing here: GPU Boost 3.0 enables a new feature called Scan OC. What this does is plot a frequency-voltage curve on your particular GPU, and it looks for specific frequencies where voltage may need to be raised or lowered or whatever to sustain a stable overclock. That is an automated tool. You basically click scan and it runs some sort of burn-in program, basically a reskinned FurMark for EVGA's Precision, and then that looks for what your maximum potential overclock is. The reason this exists is because when performing the
overclock, there's this chance that you'll end up with a pretty high and stable OC, a high frequency, but once you start throwing specific applications or games into the mix, maybe The Witcher 3 triggers a specific point of failure that's not triggered in other games. The only real solution is to step down the OC or to create multiple profiles for multiple games, and that just kind of sucks. So Scan OC bypasses that. That's the thing, though: I was not able to get it functioning properly for this review; it crashed ceaselessly. But we did get something together through the old means, and that was just by manually sitting there and OCing it myself.

I have several sheets here of passes and failures. I'm not going to read all of them for you, but the one that we settled on was a pretty light OC in terms of what I think this card's capable of. It was the max that I could get out of the Founders Edition cooler, though, and that's because of some of the thermal throttling we talked about earlier. So the max OC was landing between 2025 megahertz and 2050 megahertz; it was landing within that range. The maximum I hit, I think, was 2080 megahertz, and it's not too bad. The memory clock I was hitting was 5400 megahertz. I didn't try to push that higher; you probably could, but I pushed it 400 megahertz and kind of left it. For voltage, I was sitting at 1.031 volts. My OC was a 120 percent power target, so we're giving an extra 20% over the TDP to the card; a 220 megahertz OC to the core, which produced that 2025 to 2050 megahertz output; 400 to the memory; and then a 37% voltage increase, which really didn't push 37 percent extra volts to it. We've talked about that with Maxwell, and it's the same here: it doesn't really always give it all that it's got, or that you asked for.

Now, in terms of FPS, here's the impact we got. I'm doing this off of paper here because we just did this test and interrupted the video to do it. The GTA V 4K benchmark sees an improvement from 56 FPS at stock frequency to 65 FPS overclocked. That's a pretty
big gain; that's a full nine FPS. Mordor moves from 60.7 FPS stock to 65 OC'd at 4K, not a huge gain but not bad, and then at 1440p it moves from 106 to 113 FPS, which is reasonable. Doom moves from 98 FPS stock at 1440p to 109 FPS OC'd, a pretty big gain, actually a bit over 10% I believe, and then from 51.7 FPS stock to 59 FPS overclocked at 4K.

So what this tells us is that, first of all, the headroom for OCing is large for this card. It is only limited on the Founders Edition by the cooling potential, which we'll talk about very shortly; I have a pretty special feature for that. But that's the main limitation. It tells us that the add-in boards from the AIB partners will be significantly better at overclocking in terms of their cooling potential. Now, whether or not the silicon can handle it, that's a different story. We have to get more hands-on samples to figure out exactly where the silicon lottery plays out with this particular chip. But that's all I've got for you with overclocking.

All right, this is the hard part. I'm going to try and condense all the pertinent architecture information into just a few minutes, and we'll see how it goes. We're going to be talking about GPCs, SMs, the memory subsystem, and other items like that. If you want some of the visual guides, again, link below for those.

Let's start with a block diagram. This is GP104. Unlike GP100, which is much different in a couple of key ways, the GP104 GPU does not in fact have 15 billion transistors, and it also doesn't have six GPCs with ten SMs each; that's what GP100's got. GP104 is shrunken down to four GPCs with 20 total SMs. In some ways it can be thought of as a shrunken Maxwell; it's very similar in a couple of key areas. There are 20 TPCs for GP104, and the total TMU count hits 160, which is calculated simply by multiplying the TMUs per SM by the SM count: 8 TMUs per SM times 20 equals 160 TMUs. Like GP100, GP104 partitions its SMs into two blocks of cores, each
with 64 CUDA cores; that's 128 per SM. Partitioning the cores into smaller clusters helps allocate more dedicated resources to these cores, like warp schedulers that queue threads, dispatch units, register files, cache, and memory that they access frequently, things like that. Each SM has its own 256 kilobyte register file, a 96 kilobyte shared memory unit, a 48 kilobyte L1 cache, and 8 TMUs; that's again per SM.

Stepping up a level, we zoom out and look at GPCs. Each GPC has its own raster unit, so that's four total for GP104. New to GP104 is PolyMorph Engine 4.0; the PolyMorph Engine was first introduced on Fermi. Each TPC has a PolyMorph Engine that executes specific tasks, mostly related to Nvidia's Simultaneous Multi-Projection tool that was recently introduced. SMP is used to ensure multi-display, surround display, or non-flat display output is warped according to its position relative to the user. So if you've got monitors on an angle towards you, the PolyMorph Engine deals with all of that translation.

Now let's compress asynchronous compute in a similar fashion. Asynchronous compute paves the way for leveraging low-level APIs, and asynchronous command queuing allows GPU resources to be allocated between non-dependent tasks. There are three major changes to async compute in Pascal from previous architectures, and they affect these items: one, overlapping workloads; two, real-time workloads; and three, compute preemption.

For gaming, GPU resources are often split into graphics and compute segments, e.g. a selection of cores, cache, and other elements assigned to graphics while the remainder is assigned to compute. The resources are partitioned to, for instance, rendering and post-processing effects, and it may be the case that one of those partitioned clusters, say the cluster that's handling rendering, completes its workload prior to its partner, which may be handling something else, compute, whatever. That compute allocation may still be crunching a particularly complex problem when the render
allocation completes its job, leaving the units allocated to rendering idle. That's wasted resources. You don't want that; you don't want to leave those units idle. Async command queuing structures allow for resource adjustment on the fly, enabling concurrent in-flight jobs to all reach completion. This is called dynamic load balancing, and it allows workloads to scale as resources become available or busy. Maybe 50% of resources are allocated to compute and the other 50% to rendering, something like that. If one job completes, the other job can consume the idle resources to speed up completion and reduce latency between frames.

In this image, the command push buffer stores triangle and pixel data, and it is halfway through working on its latest draw call, which you see on the far right. You'll notice that it's stopped working on that draw call. The reason is that Pascal allows preemption requests to arrive, hit that push buffer, and basically demand priority over other tasks. At that point the push buffer pauses its execution, saves all the rasterized and shaded pixel data, frees itself for use elsewhere, and then can execute that command later when the preemption is done. Pascal can perform pixel-level, thread-level, and instruction-level compute preemption, and that allows some very low-reaching changes to the data structure, so that things like timewarp for VR can be dropped in as needed if something's really critical, so the user doesn't vomit from VR.

Now, GDDR5X introduces a lot of changes too, and I'd love to talk about it more, but we're going to keep that short and just talk about memory compression very briefly. Nvidia has moved to its fourth-generation delta color compression. DCC functions by looking at all the color data temporally, which means basically frame to frame, and then reduces colors losslessly into delta values. So instead of fetching all the color data as absolute values, DCC can group, as an example, the blues in the skybox of a game together, store a
neutral value that resides between them all, and then use that neutral sort of mean to reach outward with delta values and create the colors that are needed later, as they are required. This compression approach reduces bandwidth consumption by about 20 percent. DCC can do 8:1 compression maximally, but also offers 4:1 and the original Maxwell 2:1; more on that in the article.

And so we come to the conclusion. So this thing, the GTX 1080: first of all, this one, the Founders Edition, is seven hundred dollars. You should expect to see cards hitting $650, I would think. Technically they can go as low as $600 by MSRP standards, but we'll see $650 and $600 cards as these AIB partners start shipping their devices.

In terms of performance, Nvidia has definitely improved some of their DX12, Vulkan, and other API performance issues from the past. It's a good card; it's performed the best out of all the cards we've tested. The 1080 has about a 30% lead over the 980, this one, and the Fury X, not shown here. Now, one thing to note: in some games, well, actually almost all games, this 1080 is about 13% ahead of the 980 Ti. The 980 Ti was very powerful, and if you have one now, I wouldn't upgrade to this, because why? 13% is kind of like, who cares. So it's got a 13% lead there. If you have something like the EVGA Hybrid, which I think is somewhere around 1200 megahertz boosted, the gap actually shrinks considerably, so you end up with more of a 10% difference between the two, in some games even lower. So a lot of the gains here for Pascal, I think, can be chalked up to the frequency increase. It's more than 1700 megahertz when it's running boosted. Now, that is a massive increase over this, which I could probably get up to 1400 or 1500 stable; 1500 for sure I was able to hit before. If you did that, you'd be pretty darn close with these two devices, but the EVGA Hybrid is more expensive. The 1080 is a better architecture,
better at the ground level, so to speak. It's got eight gigabytes of memory, which is actually becoming relevant now, and overall the 1080 is definitely the card to buy out of the 980, 980 Ti, and 1080 going forward.

The Fury X is an interestingly placed card. It's not something I would buy over the 1080 right now, because the price is very similar; I think I've seen it at like $650 in some places versus $700 for this. So the only thing you really gain is the lower thermals and HBM, but we're seeing those 0.1% frame times kind of hurt the Fury X in games where it's exceeding the 4 gigabyte allocation on that card.

The GTX 1080 Founders Edition runs reasonably cool, but AIBs will obviously outperform that pretty easily. Nvidia is trying to grant its reference cards longer staying power by continuing to sell them through EOL, but I would still point you toward the board partners for fiercer competition that will drive prices down. They'll pre-overclock the cards, and the 1080 will be a good purchase at the top, and more so as the price drops down towards that $600 low end that they're listing. $700 is pretty steep. It's not unfair, it's just a lot of money, and if you can wait a couple of months for the market to stabilize and start pumping out these cards, then it's definitely a good wait: you'll save some money and get something that runs cooler as well. It's already an easy choice, though, over the 980 Ti and the R9 Fury X, both of which are priced similarly and lower-performing, sometimes 13% for the 980 Ti, sometimes up to 30% for the Fury X. So definitely an obvious win there for the 1080.

So thank you for watching. I know this was huge. Link in the description below for more information. There's a Patreon link in the post-roll video if you want to help us out directly with these efforts. I'll see you all next time.
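
Editor's note: several headline figures quoted in this review follow from simple arithmetic on the spec sheet, so they can be cross-checked in a few lines. The sketch below does that under two stated assumptions that are not from the video: each CUDA core is counted as one fused multiply-add (2 FLOPs) per clock, and the 10 gigabit figure is treated as a per-pin data rate across the 256-bit bus. Variable and function names are hypothetical.

```python
# Cross-check of figures quoted in the review. Inputs are the numbers
# spoken in the video; the formulas are standard back-of-envelope rules
# and are labeled as assumptions where the video does not state them.

SM_COUNT = 20            # SMs on GP104 (from the review)
CORES_PER_SM = 128       # CUDA cores per SM (from the review)
TMUS_PER_SM = 8          # texture units per SM (from the review)
BOOST_CLOCK_MHZ = 1733   # rated boost clock (from the review)
MEM_RATE_GBPS = 10       # GDDR5X data rate; assumed here to be per pin
BUS_WIDTH_BITS = 256     # memory interface width (from the review)

cuda_cores = SM_COUNT * CORES_PER_SM   # 2560
tmus = SM_COUNT * TMUS_PER_SM          # 160

# Assumption: one fused multiply-add (2 FLOPs) per core per clock.
fp32_tflops = 2 * cuda_cores * BOOST_CLOCK_MHZ * 1e6 / 1e12

# Assumption: bandwidth = per-pin rate x bus width, converted bits -> bytes.
bandwidth_gb_s = MEM_RATE_GBPS * BUS_WIDTH_BITS / 8


def percent_gain(stock_fps: float, overclocked_fps: float) -> float:
    """Relative frame rate gain of the overclock over stock, in percent."""
    return (overclocked_fps - stock_fps) / stock_fps * 100


# Stock vs. overclocked averages read out in the overclocking segment.
overclock_results = {
    "GTA V 4K": (56.0, 65.0),
    "Mordor 4K": (60.7, 65.0),
    "Mordor 1440p": (106.0, 113.0),
    "Doom 1440p": (98.0, 109.0),
    "Doom 4K": (51.7, 59.0),
}

print(f"CUDA cores: {cuda_cores}, TMUs: {tmus}")
print(f"FP32 at rated boost: {fp32_tflops:.2f} TFLOPS")
print(f"Memory bandwidth: {bandwidth_gb_s:.0f} GB/s")
for name, (stock, oc) in overclock_results.items():
    print(f"{name}: {stock} -> {oc} FPS ({percent_gain(stock, oc):.1f}% gain)")
```

Under those assumptions the compute figure comes out a little under the "nine teraflops" quoted in the video, which is consistent with rounding, and the Doom 1440p overclock works out to a bit over 10 percent, as stated.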