18-796


 Multimedia Communications:
Coding, Systems, and Networking


                      Prof. Tsuhan Chen
                     tsuhan@ece.cmu.edu




     MPEG-4




MPEG-4
• Originally
   – A standard for very low bit rate coding of limited
     complexity audio-visual material


• In July 94, the scope was extended to
   – Functionalities not supported by other standards
      • Content-based interactivity
      • Universal access
      • High compression
   – Coding of general material for a wide bit rate range
   – Flexibility and extensibility
                                                    18-796/Spring 1999/Chen




        Content-Based Interactivity
• A scene is composed of audio-visual objects
  – Not just pixels or moving blocks
• Objects can be of different nature
  –   Text or images
  –   Rectangular or arbitrary shape
  –   2D or 3D objects
  –   Natural or synthetic
• Different coding schemes applied to different
  objects
• Compositor puts objects back in a scene




[Figure from MPEG-4 N1909, “Overview of the MPEG-4 Version 1 Standard”]






                            Applications
• MPEG-4 at the center of three application domains
  – Computer: human-machine interface, GUI, virtual environment, vision, graphics
  – TV/Film: content creation, digital TV, HDTV, VCD, DVD
  – Telecommunication: wireless, Internet, ISDN, POTS, cable




Parts of MPEG-4
• Part 1: Systems
• Part 2: Video
• Part 3: Audio
• Part 4: Conformance testing
• Part 5: Reference software
• Part 6: Delivery Multimedia Integration Framework (DMIF)
• Others
  –   Synthetic and Natural Hybrid Coding (SNHC)
  –   Requirements and applications
  –   Implementation Study
  –   Intellectual property rights (IPR)




               MPEG-4 Activities
• Competitive phase
  – Proposals and evaluations


• Collaborative phase
  – Verification model and core experiments
      1. Define Verification Model (VM-n)
      2. Define core experiments for improving VM-n
      3. Perform core experiments. Compare with VM-n
      4. n++, go to Step 1




MPEG-4 Time Table
• July 93    Started work
• Nov 95     Subjective tests and tool evaluation
• Jan 96     Define verification model (VM) and core
  experiments (CE)
• Mar/July/Sept/Nov 96, Feb/Apr/Jul 97
             Update VM and define a new set of CEs
• Oct 97     Committee Draft (CD)
• July 98    Final CD (FCD)
• Nov 98     Draft international standard (DIS)
• Jan 99     International standard (IS)






                  MPEG-4 Video
• General functionalities
   – Coding efficiency
      • From 5 kbit/s to 5 Mbit/s
      • From small images to TV resolution
      • Progressive/interlaced
   – Error resilience and robustness
   – Spatial and temporal scalabilities
• Content-based functionalities
   – Shape coding and sprites
   – Content-based scalabilities
   – Error resilience and robustness

MPEG-4 Video (cont.)
• Tools
  –   Motion/texture coding derived from H.263
  –   Coding of video object plane (VOP): I, B, P
  –   Binary and gray-scale shape coding
  –   Scalabilities: temporal/spatial
  –   Static sprites
  –   Interlaced prediction
  –   12 bit video
  –   Computational graceful degradation (CGD)







         Video Object Plane (VOP)
[Figure: a scene separated into three video object planes, labeled VOP1, VOP2, and VOP3]




Structure of VOP Encoder

  Input → VOP Definition → { VOP 0 Coding | VOP 1 Coding | VOP 2 Coding } → MUX → Bitstream




         – Note: Segmentation is outside the scope of MPEG-4




             Structure of VOP Decoder

  Bitstream → DEMUX → { VOP 0 Decoding | VOP 1 Decoding | VOP 2 Decoding } → Composition → Output
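The Composition stage can be pictured as blending each decoded VOP onto the scene using its alpha plane. The following is a minimal sketch, assuming 8-bit samples, an 8-bit alpha plane (0 is transparent, 255 is opaque), and a plain weighted average. The function name compositeObject and the rounding rule are illustrative assumptions, not taken from the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: blends one decoded object onto the scene in place.
// alpha[i] is 0 for a transparent pixel and 255 for a fully opaque one.
void compositeObject(std::vector<std::uint8_t>& scene,
                     const std::vector<std::uint8_t>& object,
                     const std::vector<std::uint8_t>& alpha) {
    assert(scene.size() == object.size());
    assert(scene.size() == alpha.size());
    for (std::size_t i = 0; i < scene.size(); ++i) {
        const int a = alpha[i];
        // Weighted average with rounding; the weights a and 255 - a sum to 255.
        const int blended = (a * object[i] + (255 - a) * scene[i] + 127) / 255;
        scene[i] = static_cast<std::uint8_t>(blended);
    }
}
```

Calling compositeObject once per VOP, from the background object to the foreground object, would rebuild the scene in this sketch.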




Coding of VOP
• Motion compensation and DCT
   – Similar to H.263



• Polygon matching for motion estimation

[Figure: a VOP with its transparent pixels and the pixels used for polygon matching marked]
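Polygon matching can be read as ordinary block matching in which only the pixels inside the VOP contribute to the matching error. The following is a minimal sketch, assuming the error measure is a sum of absolute differences (SAD) over 8-bit samples and that a binary mask marks the object pixels. The name maskedSad and the choice of SAD are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: SAD between the current block and a candidate
// reference block, counting only pixels whose mask entry is nonzero
// (pixels inside the VOP). Transparent pixels are skipped.
long maskedSad(const std::vector<std::uint8_t>& current,
               const std::vector<std::uint8_t>& candidate,
               const std::vector<std::uint8_t>& mask) {
    assert(current.size() == candidate.size());
    assert(current.size() == mask.size());
    long sad = 0;
    for (std::size_t i = 0; i < current.size(); ++i) {
        if (mask[i] != 0) {
            sad += std::abs(static_cast<int>(current[i]) -
                            static_cast<int>(candidate[i]));
        }
    }
    return sad;
}
```

A motion search would evaluate maskedSad for each candidate displacement and keep the one with the smallest value.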






                    VOP Padding
• Applied to the reference VOP prior to motion
  estimation/compensation




[Figure: padded previous frame (left) and current frame (right)]
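Padding fills the transparent pixels of the reference VOP so that motion compensation can reference pixels outside the object. The following is a minimal sketch of one horizontal pass, assuming a repetitive padding rule: a transparent pixel copies the nearest object pixel in its row, and takes the average when it lies between two of them. The name padRow, the integer rounding, and the restriction to a single row are assumptions of this sketch. A full padding process would also handle the vertical direction and rows with no object pixels.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: pads one row of a reference VOP in place.
// opaque[i] != 0 marks pixels that belong to the object.
// Returns false and leaves the row untouched if it has no opaque pixel.
bool padRow(std::vector<int>& row, const std::vector<std::uint8_t>& opaque) {
    assert(row.size() == opaque.size());
    const int n = static_cast<int>(row.size());
    // left[i] / right[i]: index of the nearest opaque pixel at or before / after i.
    std::vector<int> left(n, -1), right(n, -1);
    int last = -1;
    for (int i = 0; i < n; ++i) {
        if (opaque[i]) last = i;
        left[i] = last;
    }
    last = -1;
    for (int i = n - 1; i >= 0; --i) {
        if (opaque[i]) last = i;
        right[i] = last;
    }
    if (n == 0 || left[n - 1] < 0) return false;

    const std::vector<int> source = row;  // read from unpadded values only
    for (int i = 0; i < n; ++i) {
        if (opaque[i]) continue;
        if (left[i] >= 0 && right[i] >= 0) {
            row[i] = (source[left[i]] + source[right[i]]) / 2;  // between two edges
        } else if (left[i] >= 0) {
            row[i] = source[left[i]];   // only an edge to the left
        } else {
            row[i] = source[right[i]];  // only an edge to the right
        }
    }
    return true;
}
```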




Shape-Adaptive DCT for Texture Coding




                    Shape Coding
• Binary shape
   – Context-based arithmetic encoding (CAE)
• Gray scale alpha plane
   – Motion compensated DCT
      • Similar to texture coding
[Figure: the gray-level alpha plane is split into its support, coded by the binary shape coder, and its texture, coded by the texture coder]

Binary Shape Coding
• Context-based arithmetic encoding (CAE)
  – A binary shape is treated as a binary image
  – Apply CAE to each binary alpha block (BAB)

• The “context”
  Intra (10 context pixels, all in the current BAB):

           c9 c8 c7
        c6 c5 c4 c3 c2
        c1 c0  ?

  Inter (9 context pixels):

        Current BAB        Motion Compensated BAB
         c3 c2 c1                   c8
         c0  ?                   c7 c6 c5
                                    c4
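The context pixels are combined into a single number that selects the probability used by the arithmetic coder for the current pixel "?". The following is a minimal sketch for the intra template, assuming the context is formed as the sum of c_k · 2^k over the ten neighbours laid out as on the slide. The name intraContext and the rule that neighbours outside the image count as 0 are assumptions of this sketch. The probability table and the arithmetic coder itself are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: 10-bit intra context for the pixel at (x, y) of a
// binary alpha image stored row-major. Bit k of the result is c_k.
int intraContext(const std::vector<std::uint8_t>& alpha,
                 int width, int height, int x, int y) {
    assert(width > 0 && height > 0);
    assert(alpha.size() == static_cast<std::size_t>(width) * height);
    // {dx, dy} of c0..c9 relative to the current pixel, following the slide:
    // row y: c1 c0 ?; row y-1: c6 c5 c4 c3 c2; row y-2: c9 c8 c7.
    static const int kOffsets[10][2] = {
        {-1, 0}, {-2, 0},
        {2, -1}, {1, -1}, {0, -1}, {-1, -1}, {-2, -1},
        {1, -2}, {0, -2}, {-1, -2}};
    int context = 0;
    for (int k = 0; k < 10; ++k) {
        const int nx = x + kOffsets[k][0];
        const int ny = y + kOffsets[k][1];
        int bit = 0;  // assumption: pixels outside the image count as 0
        if (nx >= 0 && nx < width && ny >= 0 && ny < height) {
            bit = alpha[static_cast<std::size_t>(ny) * width + nx] ? 1 : 0;
        }
        context |= bit << k;
    }
    return context;
}
```

With ten binary neighbours there are 1024 possible intra contexts. The inter template would be handled the same way with nine neighbours drawn from the current and the motion compensated BAB.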




                        Feathering
• Feathering and translucency coding
   –   No effects
   –   Linear feathering
   –   Constant alpha
   –   Linear feathering and constant alpha
   –   Feathering filter
   –   Feathering filter and constant alpha




VOP Encoder
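[Figure: VOP encoder block diagram. The prediction is subtracted from the input, and the residual passes through DCT and quantization (Q) to motion/texture coding and the video multiplex. Inverse quantization (Q^-1) and IDCT, with the prediction added back, feed the frame store. A switch selects among three predictors (pred. 1, pred. 2, pred. 3). Motion estimation and shape coding are separate blocks of the encoder.]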




                               VOP Decoder
[Figure: VOP decoder block diagram. The demultiplexer splits the coded bit stream into shape, motion, and texture streams. Shape decoding (controlled by video_object_layer_shape), motion decoding followed by motion compensation from the previous reconstructed VOP, and texture decoding all feed VOP reconstruction.]



Profiles of MPEG-4
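[Figure: profiles plotted as bitrate against functionalities. The VLBV core is the starting point; high bitrate tools (interlace) extend it along the bitrate axis, and content-based functionalities (shape, scalability) extend it along the functionalities axis.]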





             Visual Object Profiles
• Not final yet
• Simple
   – Intra Coding Mode (I)
   – Inter Prediction Mode (P)
   – AC/DC Prediction
   – Reversible VLC
   – Slice Resynchronization
   – Data Partitioning
   – Binary Shape
   – P-VOP based temporal scalability
• Core: the Simple tools, plus
   – H.263/MPEG-2 Quantization Table
   – Bidirectional Prediction Mode (B)
   – Advanced Prediction Mode
   – Static Sprites
   – Interlaced tools
   – Temporal Scalability (object-based and frame-based)
   – Spatial Scalability (frame-based)
• 12 bit
   – 4-12 bit pixel depth
• Scalable Wavelet
   – Binary shape
   – Texture coding
• Note that the binary shape tool is the same for all Object Profiles it
  is used in.




Core vs. Generic MPEG-4








Synthetic & Natural Hybrid Coding (SNHC)
• Efficient representation and composition of
  synthetically and naturally generated audiovisual
  data
• To be integrated into MPEG-4 Video and Audio
   – Not a separate part of MPEG-4
• Applications
   – Virtual environment, conferencing, education,
     entertainment, media production, and real-time, interactive
     and broadcast media experiences



SNHC Target technologies
• Video
  –   Face animation
  –   2D/3D mesh compression
  –   Wavelet-based still texture coding
  –   View dependent scalability
• Audio
  – Text-to-speech synthesis, structured audio,
    environmental auralization, 3D audio, etc.








                    Face Animation
• Face animation
  – 2D/3D polygon mesh for face rendering
  – Facial Definition Parameter (FDP) Set
       • Controls shape, texture, gender, age, etc.
  – Facial Animation Parameter (FAP) Set
       • 68 parameters to produce animation and to create expressions




                                                                    Demo

Examples of FAP and FDP
class FaceDefinitionParams {
    public:
        3Dmesh shape;
        3DPoint featurePoint[46];
        Image texture;
        int age;
        int gender;
        ...;
}

class FaceAnimationParams {
    public:
        int move_h_l_eyeball;
        int move_h_r_eyeball;
        int move_v_l_eyeball;
        int move_v_r_eyeball;
        int enlarge_l_pupil;
        int enlarge_r_pupil;
        ...;
}




                              Example FAPs

 #   FAP name              FAP description                    Units  Uni- or  Positive  Group  FDP subgroup  Quantizer
                                                                     bidir.   motion           number        step size
 1   viseme                Set of values determining the      na     na       na        1      na            1
                           mixture of two visemes for this
                           frame (e.g. pbm, fv, th)
 2   expression            A set of values determining the    na     na       na        1      na            1
                           mixture of two facial expressions
 3   open_jaw              Vertical jaw displacement (does    MNS    U        down      2      1             8
                           not affect mouth opening)
 4   lower_t_midlip        Vertical top middle inner lip      MNS    B        down      2      2             5
                           displacement
 5   raise_b_midlip        Vertical bottom middle inner       MNS    B        up        2      3             5
                           lip displacement
 6   stretch_l_cornerlip   Horizontal displacement of left    MW     B        left      2      4             5
                           inner lip corner




Facial Definition Parameters
[Figure: facial feature points, numbered by group (2.x through 11.x), shown on two views of the head with X, Y, Z axes and on detail views of the right eye, left eye, nose, teeth, mouth, and tongue. The legend distinguishes feature points affected by FAPs from other feature points.]




Facial Animation Parameter Units (FAPUs)
• Parameters normalized to FAPUs


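[Figure: neutral face annotated with the distances that define the FAPUs: IRISD0 (iris diameter), ES0 (eye separation), ENS0 (eye-nose separation), MNS0 (mouth-nose separation), and MW0 (mouth width)]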
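Normalizing to FAPUs lets the same FAP stream drive faces of different proportions. The following is a minimal sketch, assuming each FAPU is the corresponding neutral-face distance divided by 1024 and that a FAP value is an integer count of such units. The struct and function names are hypothetical, and angular units are left out.

```cpp
#include <cassert>

// Neutral-face distances measured on a particular face model, in model units.
struct NeutralFace {
    double irisd0;  // iris diameter
    double es0;     // eye separation
    double ens0;    // eye-nose separation
    double mns0;    // mouth-nose separation
    double mw0;     // mouth width
};

enum class Fapu { IRISD, ES, ENS, MNS, MW };

// Size of one FAPU on this face (assumed to be distance / 1024).
double fapuSize(const NeutralFace& face, Fapu unit) {
    double distance = 0.0;
    switch (unit) {
        case Fapu::IRISD: distance = face.irisd0; break;
        case Fapu::ES:    distance = face.es0;    break;
        case Fapu::ENS:   distance = face.ens0;   break;
        case Fapu::MNS:   distance = face.mns0;   break;
        case Fapu::MW:    distance = face.mw0;    break;
    }
    assert(distance > 0.0);  // neutral-face distances must be positive
    return distance / 1024.0;
}

// Converts a FAP value expressed in FAPUs into a displacement on this face.
double fapDisplacement(int fapValue, const NeutralFace& face, Fapu unit) {
    return fapValue * fapuSize(face, unit);
}
```

For example, an open_jaw value of 512 in MNS units would move the jaw by half of the mouth-nose separation of whichever face model is being animated.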




Visemes for Lip Synch
         viseme_select          phonemes             example
         0                      none                 na
         1                      p, b, m              put, bed, mill
         2                      f, v                 far, voice
         3                      T, D                 think, that
         4                      t, d                 tip, doll
         5                      k, g                 call, gas
         6                      tS, dZ, S            chair, join, she
         7                      s, z                 sir, zeal
         8                      n, l                 lot, not
         9                      r                    red
         10                     A:                   car
         11                     e                    bed
         12                     I                    tip
         13                     Q                    top
         14                     U                    book





              2D Mesh Compression




• Decoding process
  Coded Data → Variable Length Decoding, then two paths:
   – (dxn, dyn) → Mesh Geometry Decoding → (xn, yn, tm) → Decoded Mesh
   – (exn, eyn) → Mesh Motion Decoding, which works with the Mesh Data Memory
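The geometry path turns the decoded differences (dxn, dyn) back into node positions (xn, yn). The following is a minimal sketch, assuming each difference is taken relative to the previously decoded node and that the first node is sent as absolute coordinates. Both points, and the names Node and decodeNodes, are assumptions of this sketch rather than statements about the standard's exact syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    int x;
    int y;
};

// Hypothetical helper: rebuilds node positions from the first node and a
// list of per-node differences, by accumulating the differences in order.
std::vector<Node> decodeNodes(Node first,
                              const std::vector<int>& dx,
                              const std::vector<int>& dy) {
    assert(dx.size() == dy.size());
    std::vector<Node> nodes;
    nodes.reserve(dx.size() + 1);
    nodes.push_back(first);
    for (std::size_t n = 0; n < dx.size(); ++n) {
        const Node prev = nodes.back();  // copy before the vector grows
        nodes.push_back(Node{prev.x + dx[n], prev.y + dy[n]});
    }
    return nodes;
}
```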




3D Mesh Compression
• Progressive representation
   – Streaming of 3D objects
   – Both spatially and temporally




• Indexing and retrieval of 3D meshes
   – Multiresolution databases
   – Related to MPEG-7

                                                                    Demo




  Wavelets for Scalable Texture Coding
         Lowpass:   H1(z) → ↓2 → CODEC 1 → ↑2 → F1(z)
         Highpass:  H2(z) → ↓2 → CODEC 2 → ↑2 → F2(z)




• Decompose the signal in the frequency domain
• Critical downsampling maintains the number of samples in
  the subbands
• For 2D case, separable filters are often used. Decompose
  into four bands: LL, LH, HL, HH
• Decompose the LL band iteratively
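One level of the 2D decomposition can be sketched with the simplest possible filter pair. The following is a minimal sketch, assuming the Haar filters (average and difference of neighbouring samples) in place of whatever filters a real codec would use. The names Subbands and haarDecompose are hypothetical, and the convention used here is that LH means lowpass horizontally and highpass vertically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Subbands {
    int width;   // width of each subband
    int height;  // height of each subband
    std::vector<double> ll, lh, hl, hh;
};

// One level of a separable 2D Haar decomposition of a row-major image
// with even width and height. Each 2x2 block yields one sample per band.
Subbands haarDecompose(const std::vector<double>& image, int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    assert(image.size() == static_cast<std::size_t>(width) * height);
    Subbands out;
    out.width = width / 2;
    out.height = height / 2;
    const std::size_t count = static_cast<std::size_t>(out.width) * out.height;
    out.ll.resize(count);
    out.lh.resize(count);
    out.hl.resize(count);
    out.hh.resize(count);
    for (int y = 0; y < out.height; ++y) {
        for (int x = 0; x < out.width; ++x) {
            const double a = image[(2 * y) * width + 2 * x];          // top left
            const double b = image[(2 * y) * width + 2 * x + 1];      // top right
            const double c = image[(2 * y + 1) * width + 2 * x];      // bottom left
            const double d = image[(2 * y + 1) * width + 2 * x + 1];  // bottom right
            const std::size_t i = static_cast<std::size_t>(y) * out.width + x;
            out.ll[i] = (a + b + c + d) / 4.0;
            out.lh[i] = (a + b - c - d) / 4.0;  // low horizontally, high vertically
            out.hl[i] = (a - b + c - d) / 4.0;  // high horizontally, low vertically
            out.hh[i] = (a - b - c + d) / 4.0;
        }
    }
    return out;
}
```

The four subbands together hold exactly as many samples as the input, which is the critical downsampling property. Calling haarDecompose again on the ll band gives the next level of the decomposition.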




Wavelets for Scalable Texture Coding (cont.)

 Coded Data → Arithmetic Decoding, then per band:
   – Lowest band: Prediction → Inverse Quantization
   – Other bands: ZeroTree Decoding → Inverse Quantization
 Both paths → Inverse DWT → Output

 [Figure: “Zero Tree”]




         View Dependent Scalability




MPEG-4 Video Test Sequences
• Class A: Low spatial detail and low amount of
  movement
• Class B: Medium spatial detail and low amount of
  movement or vice versa
• Class C: High spatial detail and medium amount
  of movement or vice versa
• Class D: Stereoscopic
• Class E: Hybrid natural and synthetic
• Class F: 12-bit video sequences





                      Test Sequences
          Sequence Name      Class     Input Format     YUV     Alpha    Segment
                                                        files    files    Mask
                                                                         Available
         Mother & daughter    A      ITU-R 601 (60Hz)     1       0         0
               Akiyo          A      ITU-R 601 (60Hz)   2+1       1         2
           Hall Monitor       A      ITU-R 601 (60Hz)     1       0         3
          Container Ship      A      ITU-R 601 (60Hz)     1       0         6
                Sean          A      ITU-R 601 (60Hz)     1       0         3
             Foreman          B      ITU-R 601 (50Hz)     1       0         0
                News          B      ITU-R 601 (60Hz)   4+1       3         4
           Silent Voice       B      ITU-R 601 (50Hz)     1       0         0
            Coastguard        B      ITU-R 601 (60Hz)     1       0         4
                Bus           C      ITU-R 601 (60Hz)     1       0         0
           Table Tennis       C      ITU-R 601 (50Hz)     1       0         0
               Stefan         C      ITU-R 601 (60Hz)     1       0         2
         Mobile & Calendar    C      ITU-R 601 (60Hz)     1       0         0
            Basketball        C      ITU-R 601 (50Hz)     1       0         0
             Football         C      ITU-R 601 (60Hz)     1       0         0
           Cheerleaders       C      ITU-R 601 (60Hz)     1       0         0
               Tunnel         D      ITU-R 601 (50Hz)    2x1      0         0
             Fun Fair         D      ITU-R 601 (50Hz)    2x1      0         0
             Children         E      ITU-R 601 (60Hz)   3+1       2         3
               Bream          E      ITU-R 601 (60Hz)   3+1       2         3
             Weather          E      ITU-R 601 (60Hz)   2+1       1         2
            Destruction       E      ITU-R 601 (60Hz)   11+1     10         0
                 Ti1          F       176x144 (15Hz)      1       0         0
             Man1sw           F       272x136 (15Hz)      1       0         0
             Hum2sw           F       272x136 (15Hz)      1       0         0
             Veh2sw           F       272x136 (15Hz)      1       0         0
              labview         F       176x144 (60Hz)      1       0         0
              hallway         F       176x144 (60Hz)      1       0         0




MPEG-4 Version 2
• One year following Version 1
• Adds new profiles with new functionalities
• Video
   –   Scalable transmission of arbitrary-shaped objects
   –   Tools for additional efficiency improvements
   –   Tools for improved error robustness
   –   Coding of multiple views
   –   Body animation
   –   Coding of 3D meshes and scalabilities






                      References
   – MPEG Home Page: http://drogo.cselt.it/mpeg/
   – MPEG Video Home Page: http://wwwam.hhi.de/mpeg-video/
   – MPEG4-SNHC: http://www.es.com/mpeg4-snhc/
   – T. Sikora, “MPEG digital video coding standards,”
     IEEE Signal Processing Magazine, Sept. 1997, pp. 82-100
   – T. Sikora, “The MPEG-4 Video Standard Verification
     Model,” IEEE Trans. CSVT, Vol. 7, No. 1, Feb. 1997




                                                     18-796/Spring 1999/Chen




                                                                               21

More Related Content

PPT
Iain Richardson: An Introduction to Video Compression
PDF
PDF
Emerging H.264 Standard: Overview and TMS320DM642- Based ...
PPTX
Generic Video Adaptation Framework Towards Content – and Context Awareness in...
PDF
Emerging H.264 Standard:
PDF
ICME 2016 - High Efficiency Video Coding - Coding Tools and Specification: HE...
PDF
Applied technology
PPT
The MPEG Extensible Middleware API
Iain Richardson: An Introduction to Video Compression
Emerging H.264 Standard: Overview and TMS320DM642- Based ...
Generic Video Adaptation Framework Towards Content – and Context Awareness in...
Emerging H.264 Standard:
ICME 2016 - High Efficiency Video Coding - Coding Tools and Specification: HE...
Applied technology
The MPEG Extensible Middleware API





