RSX

From PS3 Developer wiki
Revision as of 18:45, 26 May 2011 by Euss

Hardware

RSX - Reality Synthesizer

The RSX is a graphics processing unit (GPU) based on the nVidia 7800GTX graphics processor. It is a G70/G71 hybrid with some modifications, and it has separate vertex and pixel shader pipelines.

The following is a small sample of RSX serial numbers by PS3 model number.

PS3 Model    RSX Serial     Die Tech    Die Size
CECHA        CXD2971GB      90nm        258mm²
CECHC        CXD2971DGB     90nm        258mm²
CECHG        CXD2971DGB     90nm        258mm²
CECHH        CXD2971AGB     90nm        258mm²
CECHK        CXD2982GB      90nm        258mm²
CECHL        CXD2991GB      65nm        ?
CECH-20xx    CXD2991EGB     65nm        ?
CECH-21xx    CXD5300AGB     40nm        ?
CECH-25xx    CXD5300A1GB    40nm        ?


The following are relevant facts about the RSX:

   Little Endian
   8 vertex shaders at 500MHz
   28 pixel shaders (4 redundant, 24 active) at 550MHz
   28 texture units (4 redundant, 24 active)
   8 Raster Operations Pipeline units (ROPs)
   256MB of GDDR3 graphics memory clocked at 650MHz
       Earlier PS3 Models: Samsung K4J52324QC-SC14 rated at 700MHz
       Later PS3 Models: Qimonda HYB18H512322AF-14
   GDDR3 Memory interface bus width: 128bit
   Rambus XDR Memory interface bus width: 56bit out of 64bit (serial)

The following table lists further features by showing the differences between the RSX and the nVidia 7800GTX.

Difference                                                 RSX               nVidia 7800GTX
GDDR3 Memory bus                                           128bit            256bit
ROPs                                                       8                 16
Post Transform and Lighting Cache                          63 max vertices   45 max vertices
Total Texture Cache Per Quad of Pixel Pipes (L1 and L2)    96kB              48kB
CPU interface                                              FlexIO            PCI-Express 16x
Technology                                                 40nm/65nm/90nm    110nm

Other RSX features/differences include:

   More shader instructions
       Extra texture lookup logic (helps RSX transport data from XDR)
       Fast vector normalize 

Note that the cache (Post Transform and Lighting Vertex Cache) is located between the vertex shader and the triangle setup.

In a sample flow through the RSX, data is first processed by the 8 vertex shaders. The output is then sent to the 24 active pixel shaders, which can make use of the 24 active texture units. Finally, the data passes to the 8 Raster Operations Pipeline units (ROPs) and out to the GDDR3. The pixel shaders are arranged in groups of four called Quads. There are 7 Quads, 1 of them redundant, which leaves 6 active Quads and therefore the 24 active pixel shaders listed above (6 × 4 = 24). Since each Quad has 96kB of L1 and L2 cache, the total RSX texture cache is 576kB (6 × 96kB). General RSX features include 2x and 4x hardware anti-aliasing and support for Shader Model 3.0.
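The Quad arithmetic above can be reproduced in a few lines. This is only a sketch of the calculation; the constant and function names are illustrative and do not come from any SDK.

```python
# Figures taken from the paragraph above; all names are illustrative.
TOTAL_QUADS = 7                  # physical Quads on the die
REDUNDANT_QUADS = 1              # spare Quad
SHADERS_PER_QUAD = 4             # pixel shaders per Quad
TEXTURE_CACHE_PER_QUAD_KB = 96   # L1 + L2 texture cache per Quad


def active_quads():
    """Quads that remain usable after setting aside the redundant one."""
    return TOTAL_QUADS - REDUNDANT_QUADS


def active_pixel_shaders():
    """Active pixel shaders across all active Quads."""
    return active_quads() * SHADERS_PER_QUAD


def total_texture_cache_kb():
    """Combined texture cache of all active Quads, in kB."""
    return active_quads() * TEXTURE_CACHE_PER_QUAD_KB


print(active_quads(), active_pixel_shaders(), total_texture_cache_kb())
# prints: 6 24 576
```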

Although the RSX has 256MB of GDDR3 RAM, not all of it is usable. The last 4MB is reserved for keeping track of the RSX internal state and issued commands. This 4MB of GPU Data contains RAMIN, RAMHT, RAMFC, DMA Objects, Graphic Objects, and the Graphic Context. The following is a breakdown of the address ranges within the 256MB of the RSX.

Address Range      Size     Comment
0000000-FBFFFFF    252MB    Framebuffer
FC00000-FFFFFFF    4MB      GPU Data
FF80000-FFFFFFF    512KB    RAMIN: Instance Memory
FF90000-FF93FFF    16KB     RAMHT: Hash Table
FFA0000-FFA0FFF    4KB      RAMFC: FIFO Context
FFC0000-FFCFFFF    64KB     DMA Objects
FFD0000-FFDFFFF    64KB     Graphic Objects
FFE0000-FFFFFFF    128KB    GRAPH: Graphic Context
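As a quick consistency check, the size of each region can be derived from its address range (end minus start, plus one). The sketch below assumes the ranges are inclusive byte addresses, as the table suggests; the helper names are illustrative.

```python
# Address ranges copied from the table above (inclusive, hexadecimal).
REGIONS = [
    ("Framebuffer",             0x0000000, 0xFBFFFFF),
    ("GPU Data",                0xFC00000, 0xFFFFFFF),
    ("RAMIN: Instance Memory",  0xFF80000, 0xFFFFFFF),
    ("RAMHT: Hash Table",       0xFF90000, 0xFF93FFF),
    ("RAMFC: FIFO Context",     0xFFA0000, 0xFFA0FFF),
    ("DMA Objects",             0xFFC0000, 0xFFCFFFF),
    ("Graphic Objects",         0xFFD0000, 0xFFDFFFF),
    ("GRAPH: Graphic Context",  0xFFE0000, 0xFFFFFFF),
]

KB = 1024
MB = 1024 * KB


def region_size(start, end):
    """Size in bytes of an inclusive address range."""
    return end - start + 1


def format_size(num_bytes):
    """Render a byte count as whole MB when possible, otherwise as KB."""
    if num_bytes % MB == 0:
        return f"{num_bytes // MB}MB"
    return f"{num_bytes // KB}KB"


for name, start, end in REGIONS:
    print(f"{start:07X}-{end:07X}  {format_size(region_size(start, end)):>6}  {name}")
```

The framebuffer and GPU Data regions together cover the full 256MB; the remaining entries all fall inside the 4MB GPU Data region.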

Speed, Bandwidth, and Latency

Because of the layout of the communication paths between the different chips, and the latency and bandwidth differences between the various components, access speed depends on which processor is accessing which memory, and in which direction. The following chart shows the speed of reads and writes to the GDDR3 and XDR memory from the viewpoint of the Cell and the RSX. Note that these are measured speeds (rather than calculated speeds), and that they were measured with the RSX clocked at 550MHz and the GDDR3 memory clocked at 700MHz. The shipped PS3 has the RSX clocked at 500MHz (front and back end, although the pixel shaders run separately inside at 550MHz) and the GDDR3 memory clocked lower at 650MHz, so the figures that involve the RSX or GDDR3 should be lower on a shipped console.

Processor     256MB XDR    256MB GDDR3
Cell Read     16.8GB/s     16MB/s
Cell Write    24.9GB/s     4GB/s
RSX Read      15.5GB/s     22.4GB/s
RSX Write     10.6GB/s     22.4GB/s
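For comparison, the theoretical peak of the GDDR3 interface can be worked out from the 128bit bus width and the memory clock, given that GDDR3 is a double data rate memory (two transfers per clock). The sketch below is a back-of-the-envelope calculation, not a measurement; the function name is illustrative and 1GB/s is taken as 10^9 bytes per second.

```python
def peak_bandwidth_gb_s(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock defaults to 2 because GDDR3 is a double data
    rate memory.
    """
    bytes_per_transfer = bus_width_bits / 8
    bytes_per_second = bytes_per_transfer * transfers_per_clock * clock_mhz * 1e6
    return bytes_per_second / 1e9


# 128bit bus at the 700MHz test clock and at the 650MHz shipped clock.
for clock_mhz in (700, 650):
    print(f"{clock_mhz}MHz: {peak_bandwidth_gb_s(128, clock_mhz):.1f}GB/s")
```

Under these assumptions the 700MHz case works out to 22.4GB/s, which coincides with the RSX figures for GDDR3 in the chart, and the 650MHz shipped clock gives 20.8GB/s.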

Because the Cell reads from the 256MB GDDR3 memory VERY slowly (16MB/s), it is more efficient for the Cell to work in XDR and then have the RSX pull data from XDR and write to GDDR3 for output to the HDMI display. This is why extra texture lookup instructions were included in the RSX: they allow it to load data from XDR memory (as opposed to the local GDDR3 memory).
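To put the gap in perspective, the measured rates can be turned into transfer times for a buffer of a given size. The 1MB buffer below is a hypothetical example, 1GB/s is taken as 1000MB/s, and latency and setup costs are ignored, so this is only a rough sketch.

```python
# Measured rates from the chart, in MB/s (assuming 1GB/s = 1000MB/s).
RATES_MB_S = {
    "Cell read from XDR": 16800,
    "Cell read from GDDR3": 16,
    "RSX read from XDR": 15500,
    "RSX write to GDDR3": 22400,
}


def transfer_ms(size_mb, rate_mb_s):
    """Time in milliseconds to move size_mb at rate_mb_s, ignoring latency."""
    return size_mb / rate_mb_s * 1000


BUFFER_MB = 1  # hypothetical buffer size, for illustration only

for path, rate in RATES_MB_S.items():
    print(f"{path}: {transfer_ms(BUFFER_MB, rate):.3f} ms")

# How many times slower a Cell read is from GDDR3 than from XDR.
slowdown = RATES_MB_S["Cell read from XDR"] / RATES_MB_S["Cell read from GDDR3"]
print(f"Cell read slowdown, GDDR3 vs XDR: {slowdown:.0f}x")
```

At these rates a Cell read of 1MB from GDDR3 takes 62.5 ms, several 60Hz frames, while the same read from XDR takes well under a tenth of a millisecond.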

RSX Libraries

The RSX is dedicated to 3D graphics, and developers can use different API libraries to access its features. The easiest way is to use the high-level PSGL, which is basically OpenGL ES with a programmable pipeline added in. At a lower level, developers can use LibGCM, an API that talks to the RSX more directly; PSGL is actually implemented on top of LibGCM. Advanced programmers can program the RSX by sending commands to it directly using C or assembly. This is done by setting up commands (via the FIFO Context) and DMA Objects and issuing them to the RSX via DMA calls.

Source: http://www.edepot.com/playstation3.html#PS3_RSX_GPU

Other References