Introduction to the graphics pipeline of the PS3
Cedric Perthuis
Introduction
An overview of the hardware architecture with a
focus on the graphics pipeline, and an
introduction to the related software APIs
Intended as a high-level overview for academics
and game developers
No announcements and no sneak previews of
PS3 games in this presentation
Outline
Platform Overview
Graphics Pipeline
APIs and tools
Cell Computing example
Conclusion
Platform overview
Processing
3.2 GHz Cell: PPU and 7 SPUs
PPU: PowerPC based, 2 hardware threads
SPUs: dedicated vector processing units
RSX®: high-end GPU
Data flow
I/O: Blu-ray, HDD, USB, memory cards, Gigabit
Ethernet
Memory: main 256 MB, video 256 MB
SPUs, PPU and RSX® access main memory via a shared bus
RSX® pulls data from main memory to video memory
PS3 Architecture
[Block diagram]
Cell (3.2 GHz) and XDRAM (256 MB): 25.6 GB/s
Cell and RSX®: 20 GB/s and 15 GB/s links
RSX® and GDDR3 (256 MB): 22.4 GB/s
RSX®: HD/SD AV out
Cell and I/O Bridge: 2.5 GB/s in each direction
I/O Bridge: BD/DVD/CD-ROM drive (54 GB), BT controller, USB 2.0 x 6,
Gbit Ether/WiFi, removable storage (MemoryStick, SD, CF)
Focus on the Cell SPUs
The key strength of the PS3
Similar to the PS2 Vector Units, but an order of magnitude
more powerful
Main memory access via DMA: a software cache is needed
for generic processing
Programmable in C/C++ or assembly
Programs: standalone executables or jobs
Ideal for sound, physics, graphics data
preprocessing, or simply to offload the PPU
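The software cache mentioned above can be illustrated with a small simulation. This is a minimal sketch in Python, not SPU code: `dma_get`, `SoftwareCache`, the line size and the direct-mapped policy are all assumptions chosen for illustration and are not PS3 SDK names.

```python
# Illustrative simulation only: an SPU works out of its local store, so generic
# code fetches fixed-size lines from main memory via DMA and caches them.
# All names and sizes here are hypothetical, not PS3 SDK identifiers.

LINE_SIZE = 128          # bytes per cached line (assumed)
NUM_LINES = 4            # tiny direct-mapped cache (assumed)

main_memory = bytes(range(256)) * 16   # 4096 bytes standing in for main memory
dma_transfers = 0                      # counts simulated DMA transfers


def dma_get(address, size):
    """Simulated DMA: copy `size` bytes from main memory into local store."""
    global dma_transfers
    dma_transfers += 1
    return main_memory[address:address + size]


class SoftwareCache:
    """Direct-mapped software cache held in (simulated) local store."""

    def __init__(self):
        self.tags = [None] * NUM_LINES
        self.lines = [b""] * NUM_LINES

    def read_byte(self, address):
        line_addr = address - (address % LINE_SIZE)
        slot = (line_addr // LINE_SIZE) % NUM_LINES
        if self.tags[slot] != line_addr:       # miss: fetch the whole line
            self.lines[slot] = dma_get(line_addr, LINE_SIZE)
            self.tags[slot] = line_addr
        return self.lines[slot][address - line_addr]


cache = SoftwareCache()
values = [cache.read_byte(a) for a in (0, 1, 2, 300, 301, 0)]
print(values, "DMA transfers:", dma_transfers)
```

Six reads cost only two simulated transfers here, because neighbouring addresses share a cached line. That locality is what makes generic pointer-chasing code tolerable on a processor that reaches main memory only through DMA.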
The Cell Processor
[Block diagram]
PPE: L1 (32 KB I/D), L2 (512 KB)
SPE0 to SPE6: each with a 256 KB local store (LS) and a DMA engine
MIC: memory interface controller (XIO)
I/O: Flex-IO0 and Flex-IO1 controllers
The RSX® Graphics Processor
Based on a high-end NVIDIA chip
Fully programmable pipeline: shader model 3.0
Floating point render targets
Hardware anti-aliasing (2x, 4x)
256 MB of dedicated video memory
Pulls from main memory at 20 GB/s
HD Ready (720p/1080p)
720p = 921 600 pixels
1080p = 2 073 600 pixels
A high-end GPU adapted to work with the Cell
processor and HD displays
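The pixel counts above follow directly from the display dimensions. The sketch below reproduces them and, as an added illustration, estimates the size of a single colour buffer. The 4 bytes per pixel figure is an assumption and is not stated in the slides.

```python
# Pixel counts for the two HD modes listed above.
# BYTES_PER_PIXEL is an assumption (32-bit colour), used only for illustration.
RESOLUTIONS = {"720p": (1280, 720), "1080p": (1920, 1080)}
BYTES_PER_PIXEL = 4


def pixel_count(mode):
    width, height = RESOLUTIONS[mode]
    return width * height


def color_buffer_mib(mode):
    """Size of one colour buffer in MiB under the assumed pixel format."""
    return pixel_count(mode) * BYTES_PER_PIXEL / (1024 * 1024)


for mode in RESOLUTIONS:
    print(mode, pixel_count(mode), round(color_buffer_mib(mode), 2))
```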
The RSX® parallel pipeline
Command processing
FIFO of commands, flip and sync
Texture management
System or video memory
storage mode, compression
Vertex Processing
Attribute fetch, vertex program
Fragment Processing
Zcull, Fragment program, ROP
Particle system example on PS3 hardware
Objective: to update a particle system
The PPU prepares the rendering
schedule SPU jobs to compute batches of particles
push RSX® commands to pull the VBO from the main
memory
make the render call
The SPUs fill a VBO with positions, normals, etc.
receive a job
compute particle properties
DMA the result directly to the VBO
release the RSX® semaphore
Fundamental hardware difference from other
platforms: the SPUs are part of the pipeline
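The division of labour described above can be sketched with ordinary threads. This is an illustrative simulation, not PS3 code: the worker threads stand in for SPUs, a Python list stands in for the VBO in main memory, and a `threading.Semaphore` stands in for the RSX® semaphore. All names, batch sizes and the particle update rule are assumptions.

```python
# Illustrative simulation of the flow above using threads; not PS3 code.
# "SPU" workers, the job queue and the semaphore are stand-ins (assumptions).
import queue
import threading

NUM_SPUS = 4             # hypothetical worker count
BATCH_SIZE = 4           # particles per job (assumed)
NUM_PARTICLES = 16
DT = 0.5                 # time step (assumed)

# Particle state: height and vertical velocity per particle.
heights = [float(i) for i in range(NUM_PARTICLES)]
velocities = [1.0] * NUM_PARTICLES

vbo = [None] * NUM_PARTICLES          # stands in for the VBO in main memory
jobs = queue.Queue()
batch_done = threading.Semaphore(0)   # stands in for the RSX semaphore


def spu_worker():
    while True:
        start = jobs.get()                # receive a job
        if start is None:                 # shutdown marker
            break
        for i in range(start, start + BATCH_SIZE):
            vbo[i] = heights[i] + velocities[i] * DT   # compute, write to the VBO
        batch_done.release()              # signal that this batch is ready


def rsx_draw(num_batches):
    for _ in range(num_batches):          # wait until every batch is released
        batch_done.acquire()
    return list(vbo)                      # "render" by reading the finished VBO


workers = [threading.Thread(target=spu_worker) for _ in range(NUM_SPUS)]
for w in workers:
    w.start()

# PPU side: schedule the jobs, then issue the render call.
batch_starts = list(range(0, NUM_PARTICLES, BATCH_SIZE))
for start in batch_starts:
    jobs.put(start)
frame = rsx_draw(len(batch_starts))

for _ in workers:
    jobs.put(None)
for w in workers:
    w.join()
print(frame)
```

The point of the sketch is that the scheduling thread never touches particle data. It only queues jobs and issues the draw, while the consumer blocks until every batch has been signalled.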
API differences from the PC approach
Pass-through driver
no driver level optimization, no batching, no shader
modification
direct access to RSX® via memory mapped
“registers”
restricted to the system
deferred access to RSX® via a FIFO of commands
system and user
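Deferred access through a FIFO of commands can be pictured as a ring buffer with a producer pointer and a consumer pointer. The sketch below is a generic illustration of that idea. The class, the command names and the put/get layout are assumptions and do not describe the real RSX® command format.

```python
# Illustrative ring-buffer command FIFO; the command names and layout are
# hypothetical and do not describe the real RSX command format.

class CommandFifo:
    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.put = 0      # next slot the producer (CPU side) writes
        self.get = 0      # next slot the consumer (GPU side) reads

    def is_full(self):
        # One slot stays empty so that put == get always means "empty".
        return (self.put + 1) % self.capacity == self.get

    def push(self, command):
        if self.is_full():
            raise BufferError("FIFO full: producer must wait for the consumer")
        self.slots[self.put] = command
        self.put = (self.put + 1) % self.capacity

    def consume_all(self):
        """Consumer side: execute every command queued so far, in order."""
        executed = []
        while self.get != self.put:
            executed.append(self.slots[self.get])
            self.get = (self.get + 1) % self.capacity
        return executed


fifo = CommandFifo(capacity=4)
for cmd in ("set_state", "draw", "flip"):
    fifo.push(cmd)
first = fifo.consume_all()
fifo.push("draw")            # fills the last slot; the put pointer wraps to 0
second = fifo.consume_all()
print(first, second)
```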
PSGL: the high level graphics API
Needed a standard: practical and extensible
the choice was OpenGL ES 1.0
Why not a subset of OpenGL?
Mainly because conformance tests were needed
Benefits:
pipeline state management
Vertex arrays
Texture management
Bonus: Fixed pipeline
Only ~20 entry points for fixed pipeline
Fog, light, material, texenv
Drawbacks:
Fixed point functions
No shaders: they had to be added
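For context on the fixed point functions: the OpenGL ES 1.x fixed-point entry points take 16.16 values (the `GLfixed` type), so callers convert to and from floating point themselves. A minimal sketch of that arithmetic follows; the helper names are hypothetical and are not part of any API.

```python
# 16.16 fixed point, the format behind OpenGL ES 1.x GLfixed values.
# Helper names are hypothetical; values are kept as plain Python ints.
FRACTION_BITS = 16
ONE = 1 << FRACTION_BITS          # 1.0 in 16.16, i.e. 65536


def to_fixed(value):
    return int(round(value * ONE))


def to_float(fixed):
    return fixed / ONE


def fixed_mul(a, b):
    # The raw product carries 32 fractional bits; shift back down to 16.
    return (a * b) >> FRACTION_BITS


half = to_fixed(0.5)
three = to_fixed(3.0)
print(half, three, to_float(fixed_mul(half, three)))
```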
PSGL: modern GPU extensions
OpenGL ES 1.1
VBO
FBO
PBO
Cube Map, texgen
Primitives:
Quads, Quad strips
primitive restart
Instancing
Queries and Conditional Rendering
More data types
ex: half_float
Textures:
Floating point textures
DXT
3D
non power of 2
Anisotropic filtering, Min/Max LOD, LOD Bias
Depth textures
Gamma correction
Vertex Texture
PSGL: PS3 specific extensions
Synchronizations:
Wait on or check GPU progress
Make the GPU wait on another GPU event or on PPU
Provide sync APIs for PPU and for SPU
Memory usage hints
For texture, VBO, PBO, render-targets
PPU specific extensions:
Embedded system: PPU usage needs to be limited,
so some extensions were added to decrease the PPU
load of some existing features
Ex: Attribute set
Shading language
Cg: high level shader language
Supports Cg 1.5
PS3 specific compiler
Mostly compatible with other languages like HLSL
Tools: FX Composer for PS3
Cg: runtime
Direct access to shader engine registers or via Cg
parameters
shared and unshared parameters
CgFX runtime: techniques, render states, textures
Performance analysis
PSGL HUD: runtime performance analyzer
display global statistics and hardware counters
explore objects in video and main memory
explore individual draw calls
profile graphics API calls
PSGL HUD
Call View
Memory view
Executive summary
Beyond High Level APIs
A low level graphics API exists:
proprietary
small and simple
lets the user create and send command buffers
deep knowledge of the RSX® internals needed to
really take full advantage of it
A leap forward in graphics
Gamer expectations have changed:
Higher resolutions
Deeper colors
Larger and deeper environments
More environmental and lighting effects
Game console developer expectations have
changed too
Typical PS2 title graphics budget
Assets
60 000 polygons
5-year-old hardware: at that time PC games used around 30 000
polygons; it was only with the GeForce 3 that gamers started seeing
100 000 polygons in games.
compared to a 480p framebuffer: 1 polygon per 4 pixels
10 MB of 8-bit or 4-bit textures
Rendering
Multi pass for lightmaps
Multi pass for specular
Projected shadow
Typical Next Gen graphics budget
Assets
800 000 polygons: compare to a 720p framebuffer
150 MB of textures in video memory
Rendering
Z pass
2 shadow maps 1024x1024: blur
color and lighting pass: diffuse, normal, specular,
4xAA
Post effects: bloom, tone mapping, …
Maximized Framebuffer Read/Write bandwidth
20 million+ rasterized pixels
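The two asset budgets can be compared by dividing framebuffer pixels by polygon count. The sketch below does that under two assumptions not stated in the slides: that 480p means 640x480 and that 720p means 1280x720.

```python
# Pixels per polygon for the PS2 and next-gen budgets above.
# Framebuffer sizes are assumptions: 480p as 640x480, 720p as 1280x720.
budgets = {
    "PS2 title": {"polygons": 60_000, "framebuffer": (640, 480)},
    "Next gen title": {"polygons": 800_000, "framebuffer": (1280, 720)},
}


def pixels_per_polygon(budget):
    width, height = budget["framebuffer"]
    return width * height / budget["polygons"]


for name, budget in budgets.items():
    print(name, round(pixels_per_polygon(budget), 2))
```

Under these assumptions the PS2 budget works out to roughly 5 pixels per polygon, the same order as the slide's figure of 1 polygon per 4 pixels. The next-gen budget approaches 1 pixel per polygon.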
Example of intensive computing and visualization on PS3
Cure@PS3
Folding@home project: provides a PC client
PS3 client created in a few months by SCE
presented at the Games Convention 2006 in Leipzig
intensive computing application for PS3
maximize SPU processing
PPU schedules jobs
visualization on PS3
Challenge: rendering arbitrarily complex molecules
Geometries generated in the fragment program
PSGL MRTs
Cure@PS3: protein
Cure@PS3: protein + water
Cure@PS3: what if...
What if it became a PS3 screensaver?
Running on 1% of the PS3s sold during the first
month
Estimate: twice the current Folding@home
computing power of 210 TFLOPS
Up to 20 times faster than a PC
Conclusion
Thank you for attending
Questions?
Cedric_Perthuis @ playstation.sony.com