Gadgetory



AMD vs. Nvidia Low Level API Performance: What Exactly Is Going On?

2016-07-27
So what is asynchronous compute? Reading through the comments section of my RX 480 and GTX 1060 reviews, you'll quickly learn that it's a special DirectX 12 feature that cripples Nvidia hardware and shows just how awesome AMD's GCN architecture really is. Yes, sometimes the YouTube comments section can be very educational.

The truth is we still don't really know how AMD's Polaris and Nvidia's Pascal architectures stack up in DirectX 12 titles, or what the implications of async compute will be, and we won't know until we have at least a dozen good-quality titles to test. That said, we're beginning to get a glimpse of what the future might hold.

The first really well put together DirectX 12 title was Ashes of the Singularity. This real-time strategy game created a heap of discussion and plenty of speculation regarding how AMD and Nvidia will stack up in a world filled with quality DirectX 12 titles. The controversy began when it was discovered that Nvidia's Maxwell and Pascal GPUs weren't actually faster using DirectX 12; rather, they were slightly slower in relation to their DirectX 11 performance. On the other hand, AMD's GCN-based Radeon graphics cards enjoyed up to 30% more performance when using DirectX 12 over DirectX 11.

So why is this? Well, many would have you believe it has everything to do with asynchronous compute. It's no secret that AMD's GCN has asynchronous compute engines, or ACEs, built into the architecture for hardware-based async compute support, whereas Nvidia, on the other hand, doesn't feature dedicated hardware support for async compute. So then, case closed: Nvidia messed up, despite knowing full well that AMD had implemented support over four years ago. But how can that be? Billions spent on R&D, only to make a crucial mistake that'll cripple them going forward?

Well, hang on a minute. Let's just take a look at what's really going on here. Why, in a game such as Ashes of the Singularity, does Nvidia see no real benefit, and why does AMD come from such poor DirectX
11 performance to such strong DirectX 12 performance?

Nvidia would have you believe it's down to the fact that their more recent GPU architectures are already extremely efficient, and therefore they don't benefit from low-level APIs and the features they offer, such as async compute. That seems like a pretty convenient answer, but Nvidia might have a point. I feel we are seeing the full potential of their Maxwell and Pascal cards in DirectX 11 titles. They are, after all, extremely efficient when compared to their AMD counterparts. Testing with DirectX 11 titles sees the RX 480 consume the same amount of power as the GTX 1070, which isn't great given it's on average 30% slower.

By now I think we can all agree the problem for AMD has been efficiency, and these problems seem to stem not just from the architecture but also the software, aka the display driver. If we first look at the architecture and how it scales when adding more stream processors, you find some interesting and perhaps unexpected results. The Radeon R9 390, for example, features 2,560 SPUs, or rather, let's say cores. The R9 Nano boasts 60% more cores at 4,096, and they're operating at the same 1,000 MHz frequency. The Nano's cores also have 33% more bandwidth to play with, so you'd expect the Nano to be around 60% faster. However, in reality the Nano is on average just a little over 20% faster than the R9 390 in DirectX 11 titles. If we look at my Star Wars Battlefront results from the GTX 1060 video, for example, we see the Nano is just 15% faster than the 390. So in this title a 60% increase in cores netted 15% more performance. Keep in mind there are no system limitations capping the performance of the Nano either.

If we look at the GTX 1060 and compare it to the 1070, which features 50% more cores, we see the 1070 is 32% faster. So you won't get perfect scaling from Nvidia either, but the performance gain is much closer to the increase in cores. This is a real problem for AMD because, as they ramped up the core count in order to
compete with Nvidia's upper echelon, this inefficiency continues to amplify. For whatever reason, the GCN architecture isn't able to fully utilize all these cores, and as a result we see much smaller performance gains than what the specs would suggest. This also doesn't help with power efficiency, as those cores are still present and active even if they aren't being fully utilized.

Getting back to Ashes of the Singularity, here we have a game where the AMD and Nvidia architectures actually scale quite evenly using DirectX 11. This isn't the best GPU test, given it's a real-time strategy game and therefore predominantly CPU bound. Despite that, we see when testing with DirectX 11 that the RX 480 and [unclear] are only able to match the GTX 970. Meanwhile, the GTX 1060 can be seen beating the Nano.

Moving to DirectX 12 we find a rather different story. The GTX 1060 is still faster than the RX 480, but only just, while the 480 does beat both the 980 and 970. The Nano, however, is now considerably faster than the 1060 and even beats the 980 Ti. The odd thing here is that we go from one extreme to the other. Running on the DirectX 11 API, the AMD cards are much slower than they typically are in other titles. Then, when we tested with DirectX 12, they're much faster than you would expect; the Fury X, for example, beats the GTX 1070. Basically, AMD has made no effort to optimize the drivers for DirectX 11 performance in Ashes, while the game itself has been heavily optimized for AMD's GCN architecture. So the results from this one game make it difficult to draw any real conclusions, so I'm going to choose not to draw one.

Moving on, we find Doom, with its recent update supporting the Vulkan API. Prior to the update the Radeon GPUs struggled using OpenGL. Here we see the RX 480 was good for 89 FPS on average at 1080p, while the GTX 1060 pumped out 112 FPS. This meant that using OpenGL the RX 480 was 21% slower. Now, with Vulkan and async compute shaders enabled thanks to the use of TSSAA, we see the GTX 1060 maintains that same
112 FPS average. The RX 480, on the other hand, gains an incredible 36 percent performance boost, making it now 8 percent faster than the 1060. What's also interesting to note here is that the R9 390 was slower when compared to the RX 480 using OpenGL. This is interesting as the R9 390 features 11% more cores. Enabling Vulkan allows the more core-heavy 390 to just outperform the 480, as the low-level API helps overcome any efficiency problems. This effect is amplified to a much greater degree when looking at the core-rich Nano, which sees a massive 53 percent performance boost.

For now, async compute is only enabled when using Vulkan in Doom if anti-aliasing is disabled or TSSAA is used. So what happens if we disable async compute by using the TAA method? Well, not a lot, based on what we see here. In fact, the RX 480 delivered the same 121 FPS from a three-run average using TAA and TSSAA. Granted, by using different anti-aliasing methods this example isn't an apples-to-apples comparison, but it does strongly suggest that async compute isn't really responsible for AMD's stellar performance in Doom when using Vulkan.

Okay, so what about 3DMark's new DirectX 12 Time Spy synthetic benchmark, which allows us to enable and disable async compute in two GPU tests? Looking at the first graphics test we find some interesting results. The GTX 1060 is indeed faster with async compute enabled, albeit by just 4%. The RX 480, however, was 14% faster with async compute enabled, though I should point out that in this test it was still 11% slower than the GTX 1060. Still, there's no denying that AMD's hardware support for async compute does give them a performance advantage in this test.

The second Time Spy graphics test shows different performance trends. Here the GTX 1060 was no faster or slower with async compute disabled. The RX 480, on the other hand, was 10 percent faster with async compute enabled, though this was merely a 3 FPS gain. So it seems that in certain cases async compute can enable around 10
percent more performance on the AMD GPUs. That being the case, how are GPUs such as the Nano over 50% faster in Doom when using a low-level API? In that example we saw async compute was only improving performance by a few percent.

It's my opinion that the Radeon GPUs are so much faster when running on a low-level API such as DirectX 12 or Vulkan simply because that's how fast they should be. The way in which DirectX 11 works simply doesn't suit the way AMD designed their drivers. The issue here is a key feature of DirectX 11: command lists. This is a DirectX 11 feature that AMD doesn't support, and this is what hurts their DirectX 11 performance. Command lists essentially take single-threaded code and try to multi-thread it. Sounds familiar, hey? I'm of course referring to async compute. Actually, it's probably more like Hyper-Threading for your GPU. These command lists were touted as a massive step forward for DirectX 11 in terms of multi-threaded performance when it was first announced. So while AMD offers hardware-based async compute for APIs that support it, it didn't bother to take advantage of a similar feature for DirectX 11 at a driver level. By failing to take advantage of this multi-threading feature, AMD has run into a driver overhead problem that hampers CPU performance.

In a way AMD's been lucky, to a degree. Firstly, almost every reviewer tests with the most high-end hardware possible in an effort to eliminate, or at least reduce, system bottlenecks that could limit GPU performance and therefore skew the results. I myself do this by running a Core i7-6700K at 4.5 GHz, and for AMD this helps reduce the impact of the driver overhead. Also, for the most part modern games are GPU dependent, which also helps to limit the impact of the driver overhead. Likewise, when testing higher-end GPUs we benchmark at high resolutions such as 1440p and 4K. The GPU becomes the primary bottleneck here, so any extra load on the CPU goes largely unnoticed. The driver overhead does, however,
present a real problem for those running lower-end or older hardware. It's been seen in the past when testing budget GPUs that AMD is faster when using a high-end rig but falls behind when using a budget system.

In short, AMD has two things working against them when using APIs such as OpenGL and DirectX 11. Firstly, and most crucially I believe, is the driver overhead, which is a particularly big problem for both low-end and high-end AMD GPUs. Then you have the core efficiency issue, which async compute is believed to help solve, though this could also just be the benefit of using a low-level API, as that stops the CPU from holding the GPU up. So it's my belief that we're really now starting to see the true performance of AMD GPUs in games using low-level APIs.

As for Nvidia, is there more performance to be had? Should they have integrated async compute engines into their design? Honestly, I have no idea, but it stands to reason that doing so probably would net them up to 10% more performance in games such as Ashes of the Singularity. Keep in mind adding this technology could also increase the power consumption by that margin, so then you have to wonder how worthwhile such a change would be for an already very efficient architecture.

In the end this is ultimately good news for everyone. If the next generation of games do enable AMD to up their game, then we should start to see more affordable graphics cards as a result. Perhaps that's just wishful thinking, but I'd sure like to find out. What do you guys think? Let me know in the comments. I'm your host Matt, as always, and I'll see you guys next time.

YouTubers like me depend on your support to continue improving the quality and content of our videos. To support the channel directly, consider becoming a patron to also get access to a heap of cool rewards and exclusive giveaways. Also, don't forget you can check prices and buy the products I looked at in this video through the Amazon links in the video description below. Thank you kindly for supporting
me and the Hardware Unboxed channel. It means a lot to me and I really do appreciate it, and in return I'll continue to work as hard as I can to keep producing the content you enjoy.
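
The scaling and Doom percentages quoted in the video can be checked with a little arithmetic. The sketch below is only an illustration: the helper names `pct_gain` and `scaling_efficiency` are made up for this example, and "scaling efficiency" (performance gain divided by core-count gain) is an assumed way of summarising the comparison, not a metric used in the video. All input numbers are the ones quoted in the transcript.

```python
def pct_gain(new: float, old: float) -> float:
    """Return how much larger `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


def scaling_efficiency(core_gain_pct: float, perf_gain_pct: float) -> float:
    """Hypothetical metric: share of the extra cores that shows up
    as extra performance (1.0 would be perfect scaling)."""
    return perf_gain_pct / core_gain_pct


# Core counts as quoted: R9 Nano (4,096) vs R9 390 (2,560).
nano_core_gain = pct_gain(4096, 2560)

# Quoted DirectX 11 result in Star Wars Battlefront: Nano is 15% faster.
amd_efficiency = scaling_efficiency(nano_core_gain, 15.0)

# Quoted Nvidia comparison: GTX 1070 has 50% more cores and is 32% faster.
nvidia_efficiency = scaling_efficiency(50.0, 32.0)

# Doom at 1080p, as quoted: RX 480 at 89 FPS (OpenGL), GTX 1060 at 112 FPS.
rx480_opengl, gtx1060 = 89.0, 112.0
rx480_vulkan = rx480_opengl * 1.36                 # quoted 36% boost with Vulkan
opengl_deficit = -pct_gain(rx480_opengl, gtx1060)  # how far the RX 480 trails
vulkan_lead = pct_gain(rx480_vulkan, gtx1060)      # how far the RX 480 leads

print(f"Nano core gain over 390:   {nano_core_gain:.0f}%")
print(f"AMD scaling efficiency:    {amd_efficiency:.2f}")
print(f"Nvidia scaling efficiency: {nvidia_efficiency:.2f}")
print(f"RX 480 Vulkan FPS:         {rx480_vulkan:.0f}")
print(f"RX 480 OpenGL deficit:     {opengl_deficit:.0f}%")
print(f"RX 480 Vulkan lead:        {vulkan_lead:.0f}%")
```

Under this assumed metric, only about a quarter of the Nano's extra cores turn into extra DirectX 11 performance in that title, against roughly two thirds for the GTX 1070 over the GTX 1060, which is the gap the video is pointing at.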