NVIDIA groups eight SPs and two SFUs (Special Function Units) into a cluster called a Streaming Multiprocessor (SM). These SMs are in turn grouped into a TPC, or Texture / Processor Cluster (see figure 4). In terms of nomenclature, NVIDIA’s use of the term SP is the more accurate, since each of its SPs is an independent, pipelined microprocessor capable of working on a single thread.
While the two vendors group their respective SPs differently, it is interesting to note that each SP cluster on the Radeon HD48xx has one complex ALU and four simple ALUs, and this complex ALU is basically what NVIDIA calls an SFU. ATI obviously has more SFUs, but whether this is significant or not is anyone’s guess. The two also differ in where texturing sits: ATI clubs its Texture Units and Texture Cache along with its SP clusters, while NVIDIA pipelines its SPs to its Texture Units and Texture Cache.
For anyone looking at the RV770, it’s obvious that ATI has taken a huge jump in the total SP count and gone with a more brute-force approach, something like what NVIDIA did in 2006. In fact, ATI’s jump from 64 SPs to 160 SPs is more impressive from a manufacturing standpoint than NVIDIA’s jump from 128 to 240 SPs. When you take into account that each of the 160 SPs on the RV770 has five ALUs, one of which handles complex computations, ATI’s HD48xx series seems even more impressive.
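The size of the two jumps can be checked with some quick arithmetic. The sketch below uses only the SP counts quoted in this article and the five-ALUs-per-SP figure for ATI; the function and variable names are purely illustrative.

```python
# SP counts as quoted in the article: (previous generation, current generation).
ATI_SPS = (64, 160)
NVIDIA_SPS = (128, 240)
ALUS_PER_ATI_SP = 5  # one complex ALU plus four simple ALUs per ATI SP


def growth(old: int, new: int) -> float:
    """Return how many times larger the new count is than the old one."""
    return new / old


ati_growth = growth(*ATI_SPS)
nvidia_growth = growth(*NVIDIA_SPS)
rv770_alus = ATI_SPS[1] * ALUS_PER_ATI_SP  # total ALUs across all RV770 SPs

print(f"ATI SP growth:    {ati_growth:.3f}x")
print(f"NVIDIA SP growth: {nvidia_growth:.3f}x")
print(f"RV770 total ALUs: {rv770_alus}")
```

On these figures ATI’s SP count grows by a larger factor than NVIDIA’s, which is the point being made above. The comparison says nothing about per-SP throughput, since the two designs define an SP differently.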
Another first for ATI and the industry in general is the move to GDDR5 memory on the HD4870, although the cheaper HD4850 cards will still utilise GDDR3 memory. ATI’s memory subsystem on the HD4870 offers nearly the same bandwidth as the GeForce GTX 260, but with a much simpler (and cheaper to manufacture) 256-bit bus. NVIDIA uses 448-bit and 512-bit buses, which complicate the PCB design and push costs up. GDDR5 memory on the HD4870 runs at 900 MHz, but because it has two parallel data paths the effective data rate is quadrupled, as opposed to GDDR3 or GDDR4, which can only double it. Therefore, for a clock of 900 MHz, the effective data rate becomes 3.6 Gbps per pin (3600 MHz effective). We’re told that GDDR5 can do as much as 1.2 GHz, which equates to a data rate of 4.8 Gbps per pin. GDDR5 offers ATI more bang for buck, as the cost of producing 3600 MHz, 256-bit GDDR5 is much lower than the cost NVIDIA has to bear for producing 2200 MHz, 512-bit memory.
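To see how clock speed and bus width combine, here is a minimal sketch of the usual peak-bandwidth arithmetic (effective data rate multiplied by bus width). It assumes every pin transfers data on every effective cycle and uses decimal gigabytes, so the results are theoretical peaks rather than measured figures. The clocks and bus widths are the ones quoted above; the function names are illustrative.

```python
def effective_rate_mhz(clock_mhz: float, transfers_per_clock: int) -> float:
    """Effective data rate: 4 transfers per clock for GDDR5, 2 for GDDR3/GDDR4."""
    return clock_mhz * transfers_per_clock


def peak_bandwidth_gb_s(effective_mhz: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal).

    effective_mhz * bus_width_bits gives megabits per second;
    dividing by 8 gives megabytes, and by 1000 gives gigabytes.
    """
    return effective_mhz * bus_width_bits / 8 / 1000


# HD4870: 900 MHz GDDR5 on a 256-bit bus.
hd4870_rate = effective_rate_mhz(900, 4)
hd4870_bw = peak_bandwidth_gb_s(hd4870_rate, 256)

# GDDR5 at the 1.2 GHz ceiling mentioned in the text, same 256-bit bus.
gddr5_max_bw = peak_bandwidth_gb_s(effective_rate_mhz(1200, 4), 256)

# NVIDIA's 512-bit memory, taking the quoted 2200 MHz as the effective rate.
nvidia_512_bw = peak_bandwidth_gb_s(2200, 512)

print(f"HD4870 effective rate: {hd4870_rate:.0f} MHz")
print(f"HD4870 peak bandwidth: {hd4870_bw:.1f} GB/s")
print(f"GDDR5 @ 1.2 GHz, 256-bit: {gddr5_max_bw:.1f} GB/s")
print(f"2200 MHz, 512-bit: {nvidia_512_bw:.1f} GB/s")
```

The narrow bus comes within striking distance of the wide one only because the per-pin data rate is so much higher, which is the trade-off ATI is making with GDDR5.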
With DX9, DX10 and future standards (DX10.1 and DX11), we’re seeing a move from texturing to shader-intensive operations across all game titles. Most games released over the past couple of years already utilise shaders for certain details, as a shader is much more flexible and makes far fewer demands on hardware than a texture does for a similar level of detail in a given scene. While a texture requires a reference image that is used for creating elements in a scene, a shader relies on program code. The best possible showcases for shaders are games like Crysis, S.T.A.L.K.E.R, Oblivion, Splinter Cell Double Agent and UT3. ATI has geared its RV770 to be a shader-heavy beast, and while the number of texture units goes up from 16 to 40, the ratio of SPs to texture units is 4:1 (160:40). NVIDIA has also woken up to this fact, and the GTX280 has hardly upped its texture unit count from the G80 / G92 days (from 64 to 80 units). For NVIDIA this ratio is 3:1 (240:80), as opposed to 2:1 (128:64) with the G80.
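The ratios quoted above are simply the SP and texture unit counts reduced to lowest terms. A small sketch, using only the counts given in this article (the helper name and dictionary layout are illustrative):

```python
from math import gcd


def sp_to_texture_ratio(sps: int, texture_units: int) -> tuple[int, int]:
    """Reduce an SP : texture-unit count to its lowest terms."""
    divisor = gcd(sps, texture_units)
    return sps // divisor, texture_units // divisor


# (SPs, texture units) as quoted in the article.
chips = {
    "RV770": (160, 40),
    "GTX280": (240, 80),
    "G80": (128, 64),
}

for name, (sps, tmus) in chips.items():
    shader, texture = sp_to_texture_ratio(sps, tmus)
    print(f"{name}: {shader}:{texture}")
```

A higher first number means more shader hardware per texture unit, so on these counts the RV770 is the most shader-heavy of the three.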
Whether the Radeon RV770 is the better performer in real-world games isn’t yet clear. What we cannot dispute, however, is that ATI has come back with all guns blazing. The HD4870 and HD4850 do have one very significant advantage over the competition: they are less monolithic in nature, and therefore ATI has more room to manoeuvre on price as well as to scale performance up or down. We know that NVIDIA can’t really up the performance of its GTX 280 without a die shrink. The fact is that ATI has finally built a tremendous card, just as NVIDIA did with the G80 and the new GTX2xx, and what’s more important is that it has learnt well from its mistakes with the R600 and RV670. We’re told that the successors to both RV770 and GTX2xx are also ready, waiting in the wings until these two fine competitors have slugged it out.
With upcoming titles like Far Cry 2, Fallout 3, Crysis Warhead, Dragon Age and Dawn Of War 2 on the way, it seems we’ll need all the crunching horsepower we can get. The best part of the RV770’s existence is that, at last, after a year and a half, we’re seeing competition in the high-end and mid-range graphics card segments. We’re seeing tremendous price drops, which is good for users as well as the industry because it speeds up adoption of such graphics solutions.
michael.browne@thinkdigit.com

| digit magazine, august, ir |
