Graphics: Difference between revisions

From Vita Developer wiki
Jump to navigation Jump to search
m (Reverted edits by SharonRodriguez (talk) to last revision by GregoryRasputin)
No edit summary
Line 23: Line 23:
* [http://www.imgtec.com/powervr/series5xt.asp PowerVR SGX Series5XT main page]
* [http://www.imgtec.com/powervr/series5xt.asp PowerVR SGX Series5XT main page]
* [http://community.imgtec.com/developers/powervr/graphics-sdk/ PowerVR SDK]
* [http://community.imgtec.com/developers/powervr/graphics-sdk/ PowerVR SDK]
== SGX543MP4+ Block Overview ==
The SGX543MP4+ multi-core GPU contains four SGX543+ cores and several master blocks whose role it is to orchestrate and distribute work amongst the cores efficiently.
[[File:GPU Block Overview.png]]
The master blocks of SGX543MP4+ are described below.
=== Master VDM ===
The Master VDM – Master Vertex Data Master – is the front end of the GPU. It is responsible for reading the single command stream constructed by the graphics driver (libgxm) from memory; and for distributing all vertex work amongst the cores.
The Master VDM splits vertex processing work amongst the cores in order to load balance their usage, while at the same time preserving primitive submission order to ensure correct rasterization. Vertex processing work is basically split according to a pre-defined split threshold (in number of primitives). These primitive chunks are the fundamental unit of vertex processing work at master level. Consequently, it is possible for individual Draw commands to be split over multiple cores. Once the cores accept their requested vertex processing work they operate independently, generating their own subset of the Parameter Buffer.
=== Master IPF ===
The Master IPF – Master ISP Parameter Fetch – is responsible for processing the Parameter Buffer produced by the multiple cores and initiating the rasterization process. As Master IPF reads the Parameter Buffer, it combines the individual subsets produced by the cores, such that each tile has a single tile command stream. A tile command stream contains references to primitive data (output by vertex programs) and state data required for rasterization and fragment processing.
Using information provided by the Master VDM, the Master IPF also ensures primitives binned in each tile are rasterized in submission order. In parallel to the Parameter Buffer read, the Master IPF will also distribute rasterization and fragment processing amongst the cores; and ensure they are efficiently load balanced. Since each tile is completely independent, distribution across the cores is naturally achieved in terms of tiles; this is the fundamental unit of rasterization work at master level. Each tile assigned to a core is rasterized and processed entirely on that core.
=== Master DPM ===
The Master DPM – Master Dynamic Parameter Management – (block) is responsible for managing Parameter Buffer memory for all GPU cores. No CPU or user intervention is required to manage the Parameter Buffer; it is handled entirely in hardware.
Parameter Buffer memory is managed as a linked list of pages. As vertex processing proceeds individual cores request memory pages from the Master DPM, which are allocated from a free list. As fragment processing proceeds and sections of the scene are rendered, the Master DPM receives lists of memory pages back from the cores. These page allocation and free operations are performed in parallel to help ensure the maximum amount of free Parameter Buffer memory is maintained.
=== PTLA ===
The PTLA - Present and Texture Load Accelerator - (block) is a fixed function transfer unit that operates asynchronously to the GPU cores. The PTLA block supports various forms of format conversion, memory layout conversion, downscaling, copying, and filling of 2D images. See "Transfers using PTLA" for additional information.
=== SLC ===
The SLC – System Level Cache – is the highest level cache within the GPU; sitting between all cores, the master-level blocks and memory. All GPU memory requests go through the SLC in order to avoid re-fetching of memory between and within the cores.
In addition, to avoid thrashing the SLC, not all memory requests are cached. This depends on the type of data and the hardware block issuing the request. For example, texture and vertex attribute data is likely to be read many times and so is cached; whereas the VDM Command Stream is read only once and so is un-cached. Similarly, triangle primitive data in the Parameter Buffer is written only once per macro tile (group of tiles) and so these writes are un-cached, however, this same data is read many times and so reads are cached.
The SLC is implemented in four separate cache banks, one per memory access channel, with a crossbar directing the requests based on address. This allows all four cores to access different SLC banks in parallel.
Features:
* 256K total cache size
64K cache bank - per core
* 16-way set associative
* Pseudo-LRU replacement policy

Revision as of 14:51, 19 August 2018


SGX543MP4+

PowerVR Series5XT SGXMP chips are multi-core variants of the SGX series with some updates. It will be included in the PlayStation Vita portable gaming device with the MP4+ Model of the PowerVR SGX543, the only intended difference is the cores, where MP4+ denotes 4 cores (quad-core).

Specifications

PowerVR SGX Series5XT Block Diagram

  • Model: SGX543
  • Date: Jan 2009
  • Cores: 4 (availble as 1-16 on other platforms)
  • Die Size (mm2): 8@65 nm
  • Config core: 4/2
  • Fillrate (@ 200 MHz): 35 MTriangles/s, 1000 MPixel/s
  • Bus width (bit): 64
  • API: DirectX 9.0, OpenGL 2.1
  • GFLOPS(@ 200 MHz,per core): 7.2


External References

SGX543MP4+ Block Overview

The SGX543MP4+ multi-core GPU contains four SGX543+ cores and several master blocks whose role it is to orchestrate and distribute work amongst the cores efficiently.

File:GPU Block Overview.png

The master blocks of SGX543MP4+ are described below.

Master VDM

The Master VDM – Master Vertex Data Master – is the front end of the GPU. It is responsible for reading the single command stream constructed by the graphics driver (libgxm) from memory; and for distributing all vertex work amongst the cores.

The Master VDM splits vertex processing work amongst the cores in order to load balance their usage, while at the same time preserving primitive submission order to ensure correct rasterization. Vertex processing work is basically split according to a pre-defined split threshold (in number of primitives). These primitive chunks are the fundamental unit of vertex processing work at master level. Consequently, it is possible for individual Draw commands to be split over multiple cores. Once the cores accept their requested vertex processing work they operate independently, generating their own subset of the Parameter Buffer.

Master IPF

The Master IPF – Master ISP Parameter Fetch – is responsible for processing the Parameter Buffer produced by the multiple cores and initiating the rasterization process. As Master IPF reads the Parameter Buffer, it combines the individual subsets produced by the cores, such that each tile has a single tile command stream. A tile command stream contains references to primitive data (output by vertex programs) and state data required for rasterization and fragment processing.

Using information provided by the Master VDM, the Master IPF also ensures primitives binned in each tile are rasterized in submission order. In parallel to the Parameter Buffer read, the Master IPF will also distribute rasterization and fragment processing amongst the cores; and ensure they are efficiently load balanced. Since each tile is completely independent, distribution across the cores is naturally achieved in terms of tiles; this is the fundamental unit of rasterization work at master level. Each tile assigned to a core is rasterized and processed entirely on that core.

Master DPM

The Master DPM – Master Dynamic Parameter Management – (block) is responsible for managing Parameter Buffer memory for all GPU cores. No CPU or user intervention is required to manage the Parameter Buffer; it is handled entirely in hardware.

Parameter Buffer memory is managed as a linked list of pages. As vertex processing proceeds individual cores request memory pages from the Master DPM, which are allocated from a free list. As fragment processing proceeds and sections of the scene are rendered, the Master DPM receives lists of memory pages back from the cores. These page allocation and free operations are performed in parallel to help ensure the maximum amount of free Parameter Buffer memory is maintained.

PTLA

The PTLA - Present and Texture Load Accelerator - (block) is a fixed function transfer unit that operates asynchronously to the GPU cores. The PTLA block supports various forms of format conversion, memory layout conversion, downscaling, copying, and filling of 2D images. See "Transfers using PTLA" for additional information.

SLC

The SLC – System Level Cache – is the highest level cache within the GPU; sitting between all cores, the master-level blocks and memory. All GPU memory requests go through the SLC in order to avoid re-fetching of memory between and within the cores.

In addition, to avoid thrashing the SLC, not all memory requests are cached. This depends on the type of data and the hardware block issuing the request. For example, texture and vertex attribute data is likely to be read many times and so is cached; whereas the VDM Command Stream is read only once and so is un-cached. Similarly, triangle primitive data in the Parameter Buffer is written only once per macro tile (group of tiles) and so these writes are un-cached, however, this same data is read many times and so reads are cached.

The SLC is implemented in four separate cache banks, one per memory access channel, with a crossbar directing the requests based on address. This allows all four cores to access different SLC banks in parallel.

Features:

  • 256K total cache size

64K cache bank - per core

  • 16-way set associative
  • Pseudo-LRU replacement policy