Introduction To The Graphics Pipeline of The PS3

Copyright
© Attribution Non-Commercial (BY-NC)
Introduction to the graphics pipeline of the PS3
Cedric Perthuis
Introduction
 An overview of the hardware architecture with a focus on the graphics pipeline, and an introduction to the related software APIs
 A high-level overview aimed at academics and game developers
 No announcements and no sneak previews of PS3 games in this presentation
Outline
 Platform Overview
 Graphics Pipeline
 APIs and tools
 Cell Computing example
 Conclusion
Platform overview
 Processing
 3.2 GHz Cell: PPU and 7 SPUs
 PPU: PowerPC based, 2 hardware threads
 SPUs: dedicated vector processing units
 RSX®: high end GPU
 Data flow
 I/O: Blu-ray, HDD, USB, memory cards, Gigabit Ethernet
 Memory: 256 MB main, 256 MB video
 SPUs, PPU and RSX® access main memory via a shared bus
 RSX® pulls data from main memory to video memory
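PS3 Architecture
[Block diagram: Cell (3.2 GHz) with 256 MB XDRAM (25.6 GB/s); RSX® with 256 MB GDDR3 (22.4 GB/s) and HD/SD AV out; Cell to RSX® links labeled 20 GB/s and 15 GB/s; I/O bridge (2.5 GB/s in each direction) connecting the BD/DVD/CD-ROM drive (54 GB), USB 2.0 x 6, Gbit Ethernet/WiFi, BT controller, and removable storage (Memory Stick, SD, CF)]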
Focus on the Cell SPUs
 The key strength of the PS3
 Similar to the PS2 Vector Units, but an order of magnitude more powerful
 Main memory access via DMA: needs a software cache to do generic processing
 Programmable in C/C++ or assembly
 Programs: standalone executables or jobs
 Ideal for sound, physics, graphics data preprocessing, or simply to offload the PPU
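The software cache mentioned above can be sketched as follows. This is a minimal direct-mapped cache, not the SPU toolchain's actual implementation: "main memory" is an ordinary vector, the "DMA" is a plain memcpy, and the names `SoftwareCache`, `kLineBytes` and `kNumLines` as well as the 128-byte line size are assumptions chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kLineBytes = 128;  // assumed cache line size
constexpr std::size_t kNumLines = 64;    // 8 KB carved out of the local store
constexpr std::size_t kInvalidTag = static_cast<std::size_t>(-1);

// Hypothetical direct-mapped software cache in front of "main memory".
class SoftwareCache {
public:
    explicit SoftwareCache(const std::vector<std::uint8_t>& mainMemory)
        : mainMemory_(mainMemory) {
        tags_.fill(kInvalidTag);
    }

    // Read one byte by main-memory address, fetching the whole line on a miss.
    std::uint8_t read(std::size_t address) {
        assert(address < mainMemory_.size());
        const std::size_t lineAddr = address / kLineBytes;
        const std::size_t slot = lineAddr % kNumLines;
        if (tags_[slot] != lineAddr) {
            fetchLine(lineAddr, slot);
            ++misses_;
        }
        return lines_[slot][address % kLineBytes];
    }

    std::size_t misses() const { return misses_; }

private:
    // Stand-in for a DMA transfer from main memory into the local store.
    void fetchLine(std::size_t lineAddr, std::size_t slot) {
        const std::size_t begin = lineAddr * kLineBytes;
        const std::size_t remaining = mainMemory_.size() - begin;
        const std::size_t count = remaining < kLineBytes ? remaining : kLineBytes;
        std::memcpy(lines_[slot].data(), mainMemory_.data() + begin, count);
        tags_[slot] = lineAddr;
    }

    const std::vector<std::uint8_t>& mainMemory_;
    std::array<std::array<std::uint8_t, kLineBytes>, kNumLines> lines_{};
    std::array<std::size_t, kNumLines> tags_{};
    std::size_t misses_ = 0;
};
```

Two reads that fall in the same line cost one transfer; a read in a different line costs another. Code that streams large arrays would normally skip such a cache and issue explicit bulk transfers instead.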
The Cell Processor
[Block diagram: PPE with L1 (32 KB I/D) and L2 (512 KB); seven SPEs (SPE0 to SPE6), each with a 256 KB local store (LS) and a DMA engine; memory interface controller (MIC) to XIO memory; two FlexIO I/O interfaces (IO0, IO1)]
The RSX® Graphics Processor
 Based on a high-end NVIDIA chip
 Fully programmable pipeline: Shader Model 3.0
 Floating-point render targets
 Hardware anti-aliasing (2x, 4x)
 256 MB of dedicated video memory
 Pulls from main memory at 20 GB/s
 HD ready (720p/1080p)
 720p = 921 600 pixels
 1080p = 2 073 600 pixels
 A high-end GPU adapted to work with the Cell processor and HD displays
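The pixel counts above follow from the usual frame dimensions, 1280 x 720 and 1920 x 1080. The sketch below checks that arithmetic and adds a rough color-buffer size. The buffer size rests on assumptions not stated in the slides: 4 bytes per pixel and one stored sample per anti-aliasing sample. The function names are illustrative.

```cpp
#include <cstdint>

// Number of pixels in a frame of the given dimensions.
constexpr std::uint32_t pixelCount(std::uint32_t width, std::uint32_t height) {
    return width * height;
}

// Rough size of one color buffer in bytes.
// Assumption: every anti-aliasing sample is stored at full size.
constexpr std::uint32_t colorBufferBytes(std::uint32_t width, std::uint32_t height,
                                         std::uint32_t bytesPerPixel,
                                         std::uint32_t samples) {
    return pixelCount(width, height) * bytesPerPixel * samples;
}

static_assert(pixelCount(1280, 720) == 921600, "720p pixel count");
static_assert(pixelCount(1920, 1080) == 2073600, "1080p pixel count");
```

Under those assumptions, `colorBufferBytes(1280, 720, 4, 4)` gives 14 745 600 bytes, about 14 MB of the 256 MB of video memory for a single 720p color buffer with 4x anti-aliasing.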
The RSX® parallel pipeline
 Command processing
 FIFO of commands, flip and sync
 Texture management
 System or video memory
 storage mode, compression
 Vertex Processing
 Attribute fetch, vertex program
 Fragment Processing
 Zcull, Fragment program, ROP
Particle system example on PS3 hardware
 Objective: to update a particle system
 The PPU prepares the rendering
 schedule SPU jobs to compute batches of particles
 push RSX® commands to pull the VBO from main memory
 make the render call
 The SPUs fill a VBO with positions, normals, etc.
 receive a job
 compute particle properties
 DMA the result directly to the VBO
 release the RSX® semaphore
 Fundamental hardware difference from other platforms: the SPUs are part of the pipeline
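The division of labor above can be sketched in portable C++. This is only a model of the data flow, not PS3 code: jobs run sequentially here where the SPUs would run them in parallel, the "VBO" is a plain vector, the "semaphore" is a counter, and every name (`Job`, `scheduleJobs`, `runJob`, `readyToDraw`, `kRiseSpeed`) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle { float x, y, z; };

// One batch of particles handed to a worker.
struct Job { std::size_t first; std::size_t count; };

constexpr float kRiseSpeed = 2.0f;  // assumed constant upward velocity

// "PPU" side: cut the particle range into fixed-size batches, one job each.
std::vector<Job> scheduleJobs(std::size_t particleCount, std::size_t batchSize) {
    assert(batchSize > 0);
    std::vector<Job> jobs;
    for (std::size_t first = 0; first < particleCount; first += batchSize) {
        const std::size_t remaining = particleCount - first;
        jobs.push_back({first, remaining < batchSize ? remaining : batchSize});
    }
    return jobs;
}

// "SPU" side: update one batch in place in the shared buffer (stand-in for
// the DMA into the VBO), then signal completion on the semaphore counter.
void runJob(const Job& job, std::vector<Particle>& vbo, float dt,
            std::size_t& semaphore) {
    for (std::size_t i = job.first; i < job.first + job.count; ++i) {
        vbo[i].y += kRiseSpeed * dt;
    }
    ++semaphore;
}

// "GPU" side: the draw may proceed only once every batch has been signaled.
bool readyToDraw(std::size_t semaphore, std::size_t jobCount) {
    return semaphore == jobCount;
}
```

With 10 particles and a batch size of 4, `scheduleJobs` yields three jobs (4, 4 and 2 particles), and `readyToDraw` turns true only after all three have run.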
API differences with the PC approach
 Pass-through driver
 no driver-level optimization, no batching, no shader modification
 Direct access to RSX® via memory-mapped “registers”
 restricted to the system
 Deferred access to RSX® via a FIFO of commands
 system and user
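A command FIFO of the kind described above is commonly built as a ring buffer with a producer index and a consumer index. The sketch below shows that general technique only. It is not the RSX® command format: the command words are opaque 32-bit values, and `CommandFifo` and `kCapacity` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCapacity = 8;  // assumed size; one slot always stays free

// Hypothetical command FIFO: the CPU advances put_, the GPU advances get_.
class CommandFifo {
public:
    // Producer side. Returns false when the ring is full.
    bool push(std::uint32_t command) {
        const std::size_t next = (put_ + 1) % kCapacity;
        if (next == get_) return false;  // full: the producer has to wait
        buffer_[put_] = command;
        put_ = next;
        return true;
    }

    // Consumer side. Returns false when there is nothing to execute.
    bool pop(std::uint32_t& command) {
        if (get_ == put_) return false;  // empty: the consumer is idle
        command = buffer_[get_];
        get_ = (get_ + 1) % kCapacity;
        return true;
    }

    // Number of commands written but not yet consumed.
    std::size_t pending() const { return (put_ + kCapacity - get_) % kCapacity; }

private:
    std::array<std::uint32_t, kCapacity> buffer_{};
    std::size_t put_ = 0;
    std::size_t get_ = 0;
};
```

Because commands are only queued, the caller gets no immediate result from the GPU, which is why the synchronization primitives matter once work is submitted this way.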
PSGL: the high level graphics API
 Needed a standard: practical and extensible
 the choice was OpenGL ES 1.0
 Why not a subset of OpenGL?
 Mainly needed conformance tests
 Benefits:
 pipeline state management
 Vertex arrays
 Texture management
 Bonus: Fixed pipeline
 Only ~20 entry points for fixed pipeline
 Fog, light, material, texenv
 Drawbacks:
 Fixed point functions
 No shaders: needed to be added
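For context on the fixed-point drawback: OpenGL ES 1.x defines a signed 16.16 fixed-point type (`GLfixed`) used by its x-suffixed entry points. The sketch below shows what working in that format involves. It uses a local `Fixed` alias and hypothetical helper names rather than any GL header, so it illustrates the number format only.

```cpp
#include <cstdint>

// Stand-in for GLfixed: signed 16.16 fixed point stored in 32 bits.
using Fixed = std::int32_t;

constexpr std::int32_t kFixedOne = 65536;  // 1.0 in 16.16

// Convert a float to 16.16 (truncates toward zero).
constexpr Fixed toFixed(float value) {
    return static_cast<Fixed>(value * 65536.0f);
}

// Convert 16.16 back to float.
constexpr float toFloat(Fixed value) {
    return static_cast<float>(value) / 65536.0f;
}

// Multiply two 16.16 values; the 64-bit intermediate avoids overflow.
constexpr Fixed mulFixed(Fixed a, Fixed b) {
    return static_cast<Fixed>((static_cast<std::int64_t>(a) * b) / kFixedOne);
}
```

For example, `toFixed(1.5f)` is 98304, and multiplying it by `toFixed(2.0f)` with `mulFixed` gives `toFixed(3.0f)`. Every arithmetic operation needs this kind of care, which is awkward on hardware with fast floating point.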
PSGL: modern GPU extensions
 OpenGL ES 1.1
 VBO
 FBO
 PBO
 Cube Map, texgen
 Primitives:
 Quads, Quad_strips
 primitive restart
 Instancing
 Queries and Conditional Rendering
 More data types
 ex: half_float
 Textures:
 Floating point textures
 DXT
 3D
 non power of 2
 Anisotropic filtering, Min/Max LOD, LOD Bias
 Depth textures
 Gamma correction
 Vertex Texture
PSGL: PS3 specific extensions
 Synchronizations:
 Wait on or check GPU progress
 Make the GPU wait on another GPU event or on PPU
 Provide sync APIs for PPU and for SPU
 Memory usage hints
 For texture, VBO, PBO, render-targets
 PPU-specific extensions:
 Embedded system: PPU usage needs to be limited, so some extensions were added to decrease the PPU load of some existing features
 Ex: Attribute set
Shading language
 Cg: high-level shader language
 Supports Cg 1.5
 PS3-specific compiler
 Mostly compatible with other languages such as HLSL
 Tools: FX Composer for PS3
 Cg runtime
 Direct access to shader engine registers, or access via Cg parameters
 shared and unshared parameters
 CgFX runtime: techniques, render states, textures
Performance analysis
 PSGL HUD: runtime performance analyzer
 display global statistics and hardware counters
 explore objects in video and main memory
 explore individual draw calls
 profile graphics API calls
PSGL HUD
[Screenshots: Call View, Memory view, Executive summary]
Beyond High Level APIs
 A low level graphics API exists:
 proprietary
 small and simple
 lets the user create and send command buffers
 deep knowledge of the RSX® internals is needed to take full advantage of it
A leap forward in graphics
 Gamer expectations have changed:
 Higher resolutions
 Deeper colors
 Larger and deeper environments
 More environmental and lighting effects
 Game console developer expectations have changed too
Typical PS2 title graphics budget
 Assets
 60 000 polygons
 Five-year-old hardware: at that time PC games were around 30 000 polygons; only with the GF3 did gamers start seeing 100 000 polygons in games
 Compare to a 480p framebuffer: 1 polygon per 4 pixels
 10 MB of 8-bit or 4-bit textures
 Rendering
 Multi pass for lightmaps
 Multi pass for specular
 Projected shadow
Typical Next Gen graphics budget
 Assets
 800 000 polygons: compare to a 720p framebuffer
 150 MB of textures in video memory
 Rendering
 Z pass
 2 shadow maps, 1024x1024: blur
 Color and lighting pass: diffuse, normal, specular, 4xAA
 Post effects: bloom, tone mapping, …
 Maximized framebuffer read/write bandwidth
 20 million+ rasterized pixels
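The two polygon budgets become easier to compare when expressed as framebuffer pixels per polygon. The sketch below does that arithmetic. It assumes 640 x 480 for the 480p framebuffer and 1280 x 720 for 720p, neither of which the slides state, and `pixelsPerPolygon` is an illustrative name.

```cpp
#include <cstdint>

// Framebuffer pixels available per polygon in the scene budget.
constexpr double pixelsPerPolygon(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t polygons) {
    return static_cast<double>(width) * height / polygons;
}

// PS2 budget: 307 200 pixels over 60 000 polygons, about 5 pixels each.
constexpr double kPs2Ratio = pixelsPerPolygon(640, 480, 60000);

// Next-gen budget: 921 600 pixels over 800 000 polygons, about 1.15 each.
constexpr double kNextGenRatio = pixelsPerPolygon(1280, 720, 800000);
```

Under these assumptions the PS2 figure lands near the slide's rough "1 polygon per 4 pixels", while the next-generation budget approaches one polygon per pixel.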
Example of intensive computing and visualization on PS3
 Cure@PS3
 The Folding@home project provides a PC client
 PS3 client created in a few months by SCE
 Presented at the Games Convention 2006 in Leipzig
 Intensive computing application for PS3
 maximize SPU processing
 PPU schedules jobs
 Visualization on PS3
 Challenge: rendering arbitrarily complex molecules
 Geometries generated in the fragment program
 PSGL MRTs
Cure@PS3: protein
Cure@PS3: protein + water
Cure@PS3: what if...
 What if it became a PS3 screensaver?
 Running on 1% of the PS3s sold during the first month
 Estimate: twice the current Folding@home computing power of 210 TFLOPS
 Up to 20 times faster than a PC
Conclusion
 Thank you for attending
 Questions?

 Cedric_Perthuis @ playstation.sony.com
