NVIDIA’s Gamescom 2018 keynote just wrapped up, and as many have been expecting since it was announced last month, NVIDIA is getting ready to launch their next generation of GeForce hardware. Announced at the event and going on sale starting September 20th is NVIDIA’s GeForce RTX 20 series, which is succeeding the current Pascal-powered GeForce GTX 10 series. The new cards are based on NVIDIA’s new Turing GPU architecture and built on TSMC’s 12nm “FFN” process, and NVIDIA has lofty goals for them, looking to drive an entire paradigm shift in how games are rendered and how PC video cards are evaluated. CEO Jensen Huang has called Turing NVIDIA’s most important GPU architecture since 2006’s Tesla GPU architecture (G80 GPU), and from a features standpoint it’s clear that he’s not overstating matters.
As is traditionally the case, the first cards out of the NVIDIA stable are the high-end cards. But in a rather sizable break from tradition we’re not only going to get the x80 and x70 cards at launch, but also the x80 Ti card as well. Meaning the GeForce RTX 2080 Ti, RTX 2080, and RTX 2070 will all be hitting the streets within a month of each other. NVIDIA’s product stack is remaining unchanged here, so RTX 2080 Ti remains their flagship card, while RTX 2080 is their high-end card, and then RTX 2070 the slightly cheaper card to entice enthusiasts without breaking the bank.
All three cards will be launching over the next two months. First off will be the RTX 2080 Ti and RTX 2080, which will launch September 20th. The RTX 2080 Ti will start at $999 for partner cards, while the RTX 2080 will start at $699. Meanwhile the RTX 2070 will launch at some point in October, with partner cards starting at $499. On a historical basis, all of these prices are higher than the last generation by anywhere between $120 and $300. Meanwhile NVIDIA’s own reference-quality Founders Edition cards are once again back, and those will carry a $100 to $200 premium over the baseline pricing.
Unfortunately, NVIDIA is already taking pre-orders here, so consumers are essentially required to make a “blind buy” if they want to snag a card from the first batch. NVIDIA has offered surprisingly little information on performance, and we’d suggest waiting for trustworthy third-party reviews (i.e. us). However, I have to admit that I don’t imagine there’s going to be much stock available by the time reviews hit the streets.
So what does Turing bring to the table? The marquee feature across the board is hybrid rendering, which combines ray tracing with traditional rasterization to exploit the strengths of both technologies. This announcement is essentially a continuation of NVIDIA’s RTX announcement from earlier this year, so if you thought that announcement was a little sparse, well then here is the rest of the story.
The big change here is that NVIDIA is going to be including even more ray tracing hardware with Turing in order to offer faster and more efficient hardware ray tracing acceleration. New to the Turing architecture is what NVIDIA is calling an RT core. We aren’t fully informed on its underpinnings at this time, but these cores serve as dedicated ray tracing processors. These processor blocks accelerate both ray-triangle intersection checks and bounding volume hierarchy (BVH) manipulation, the latter being a very popular data structure for storing objects for ray tracing.
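To make those two operations concrete, here is a minimal software sketch of each: a Möller–Trumbore ray-triangle test, and a "slab" test of a ray against an axis-aligned bounding box, which is the kind of check a BVH traversal performs at every node. This is purely illustrative; NVIDIA hasn’t disclosed how the RT cores implement either step, and the function names here are my own.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin: Vec3, direction: Vec3,
                 v0: Vec3, v1: Vec3, v2: Vec3,
                 eps: float = 1e-9) -> Optional[float]:
    """Moller-Trumbore test: distance t along the ray to the hit, or None."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None  # ignore hits behind the origin


def ray_aabb(origin: Vec3, direction: Vec3,
             box_min: Vec3, box_max: Vec3) -> bool:
    """Slab test: does the ray hit the box bounding a BVH node?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if o < lo or o > hi:
                return False  # parallel to this slab and outside it
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True
```

The point of the BVH is that the cheap box test lets a renderer throw away whole groups of triangles at once, so the more expensive triangle test only runs on the handful of candidates a ray could plausibly hit.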
NVIDIA is stating that the fastest GeForce RTX part can cast 10 billion rays (10 Giga Rays) per second, which NVIDIA says is a 25x improvement in ray tracing performance over the unaccelerated Pascal.
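For a rough sense of scale, the quoted figures can be turned into an implied Pascal baseline and a per-pixel ray budget. This is back-of-the-envelope arithmetic only: the 10 billion rays per second and 25x numbers are NVIDIA’s, while the resolutions and the 60 fps target are my own assumptions, and a peak rate is unlikely to be sustained in a real game.

```python
PEAK_RAYS_PER_SEC = 10e9  # NVIDIA's quoted peak for the fastest RTX part
QUOTED_SPEEDUP = 25       # NVIDIA's quoted improvement over Pascal


def implied_baseline(peak: float, speedup: float) -> float:
    """Ray rate that the quoted speedup implies for the older part."""
    return peak / speedup


def rays_per_pixel(peak: float, width: int, height: int, fps: int) -> float:
    """Upper bound on rays per pixel per frame at a given resolution."""
    return peak / (width * height * fps)


if __name__ == "__main__":
    pascal = implied_baseline(PEAK_RAYS_PER_SEC, QUOTED_SPEEDUP)
    print(f"Implied Pascal rate: {pascal / 1e9:.1f} Giga Rays/s")
    # Assumed targets: 1080p and 4K, both at 60 frames per second.
    for name, w, h in (("1080p", 1920, 1080), ("4K", 3840, 2160)):
        budget = rays_per_pixel(PEAK_RAYS_PER_SEC, w, h, 60)
        print(f"{name} @ 60 fps: about {budget:.0f} rays per pixel per frame")
```

Even at the theoretical peak, that works out to only tens of rays per pixel per frame, which is a small budget once shadows, reflections, and multiple light bounces all want a share of it.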
The Turing architecture also carries over the tensor cores from Volta, and indeed these have even been enhanced over Volta. The tensor cores are an important aspect of multiple NVIDIA initiatives. Along with speeding up ray tracing itself, NVIDIA’s other tool in their Turing bag of tricks is to reduce the amount of rays required in a scene by using AI denoising to clean up an image, which is something the tensor cores excel at. Of course that’s not the only feature tensor cores are for – NVIDIA’s entire AI/neural networking empire is all but built on them – so while not a primary focus for the Gamescom crowd, this also confirms that NVIDIA’s most powerful neural networking hardware will be coming to a wider range of GPUs.
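The idea behind denoising is easy to show in miniature: render with few rays, accept a noisy result, then filter the noise out afterwards. The sketch below fakes a noisy render of a flat gray surface and cleans it up with a plain 3x3 box filter. To be clear, the box filter is only a stand-in of my own choosing; NVIDIA’s denoiser is a trained neural network running on the tensor cores, not a simple average, and all of the names and values here are hypothetical.

```python
import random
from statistics import pvariance


def noisy_image(width, height, true_value, noise, rng):
    """Fake a low-sample-count render: the true value plus per-pixel noise."""
    return [[true_value + rng.uniform(-noise, noise) for _ in range(width)]
            for _ in range(height)]


def box_filter(img):
    """Average each pixel with its neighbours (stand-in for a real denoiser)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total += img[ny][nx]
                        count += 1
            out[y][x] = total / count
    return out


def image_variance(img):
    """Variance across all pixels; for a flat surface this is pure noise."""
    return pvariance([p for row in img for p in row])


rng = random.Random(0)  # fixed seed so the run is repeatable
raw = noisy_image(32, 32, true_value=0.5, noise=0.2, rng=rng)
filtered = box_filter(raw)

if __name__ == "__main__":
    print(f"noise variance before: {image_variance(raw):.5f}")
    print(f"noise variance after:  {image_variance(filtered):.5f}")
```

On a flat surface a blur like this works well, but on a real image it would also smear edges and fine detail, and that is the problem a learned denoiser is meant to avoid.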
Looking at hybrid rendering in general, it’s interesting that despite these individual speed-ups, NVIDIA’s overall performance promises aren’t quite as extreme. All told, the company is promising a 6x performance boost versus Pascal, though it doesn’t specify which parts are being compared. Time will tell if even this is a realistic assessment, as even with the RT cores, ray tracing in general is still quite the resource hog.
As for gaming matters in particular, the benefits of hybrid rendering are potentially significant, but it’s going to depend heavily on how developers choose to use it. From a performance standpoint I’m not sure there’s much to say here, and that’s because ray tracing & hybrid rendering are ultimately features to improve rendering quality, not improve the performance of today’s algorithms. Granted, if you tried to do ray tracing on today’s GPUs it would be extremely slow – and Turing would be an incredible speedup as a result – but no one uses slow path tracing systems on current hardware for this reason. So hybrid rendering is instead about replacing the approximations and hacks of current rasterization technology with more accurate rendering methods. In other words, less “faking it” and more “making it.”
Those quality benefits, in turn, are typically clustered around lighting, shadows, and reflections. All three features are inherently based on the properties of light, which in simplistic terms moves as a ray, and which up to now various algorithms have handled by faking the work involved or “pre-baking” scenes in advance. And while current algorithms are quite good, they still aren’t close to accurate. So there is clear room for improvement.
NVIDIA for their part is particularly throwing around global illumination, which is one of the harder tasks. However there are other lighting methods that benefit as well, not to mention reflections and shadows of those lit objects. And truthfully this is where words are a poor tool; it’s difficult to describe how a ray traced shadow looks better than a fake shadow with PCSS, or real-time lighting over pre-baked lighting. Which is why NVIDIA, the video card company, is going to be pushing the visual aspects of all of this harder than ever.
Overall then, hybrid rendering is the lynchpin feature of the GeForce RTX 20 series. Going by their Gamescom and SIGGRAPH presentations, it’s clear that NVIDIA has invested heavily into the field, and that they have bet the success of the GeForce brand over the coming years on this technology. RT cores and tensor cores are semi-fixed function hardware; they can’t be used for rasterization, and the transistors allocated to them are transistors that could have been dedicated to more rasterization hardware otherwise. So NVIDIA has made an incredibly significant move here in terms of opportunity cost by going the hybrid rendering route rather than building a bigger Pascal.
As a result, NVIDIA is attempting a paradigm shift in consumer rendering, one that we’ve really only seen before with the introduction of pixel and vertex shaders (DX8 & DX9 era tech) all the way back in 2001 & 2002. Which is why Microsoft’s DirectX Raytracing (DXR) initiative is so important, as are NVIDIA’s other developer and consumer initiatives. NVIDIA needs to sell consumers and developers alike on this vision of mixing rasterization with ray tracing to provide better image quality. And more so than that, they need to ease developers into the idea of working with more specialized, fixed function units as Moore’s Law continues to slow down and fixed function hardware becomes a means to achieve greater efficiency.
NVIDIA hasn’t bet the farm on hybrid rendering, but they’ve never attempted to move the market in this fashion. So if it seems like NVIDIA is hyper-focused on hybrid rendering and ray tracing, that’s because they are. It’s their vision of the future, and now they need to get everyone else on board.
Alongside the dedicated RT and tensor cores, the Turing architecture Streaming Multiprocessor (SM) itself is also learning some new tricks. In particular here, it’s inheriting one of Volta’s more novel changes, which saw the Integer cores separated out into their own blocks, as opposed to being a facet of the Floating Point CUDA cores. The advantage here – at least as much as we saw in Volta – is that it speeds up address generation and Fused Multiply Add (FMA) performance, though as with a lot of aspects of Turing, there’s likely more to it (and what it can be used for) than we’re seeing today.
The Turing SM also includes what NVIDIA is calling a “unified cache architecture.” As I’m still awaiting official SM diagrams from NVIDIA, it’s not clear if this is the same kind of unification we saw with Volta – where the L1 cache was merged with shared memory – or if NVIDIA has gone one step further.