Multimedia networking: outline
6.1 multimedia networking applications
6.2 streaming stored video
6.3 voice-over-IP
6.4 protocols for real-time conversational
applications
6.5 network support for multimedia
Multimedia: audio
analog audio signal sampled at constant rate
telephone: 8,000 samples/sec
CD music: 44,100 samples/sec
each sample quantized, i.e., rounded
e.g., 2^8 = 256 possible quantized values
each quantized value represented by bits, e.g., 8 bits for 256 values
[figure: audio signal amplitude vs. time, showing the analog signal, the quantized value of each analog value, the quantization error, and the sampling rate (N sample/sec)]
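The sampling and quantization numbers above imply a raw bit rate of (samples/sec) x (bits/sample). A minimal Python sketch of that arithmetic and of rounding an amplitude to one of 2^bits levels; the function names and the [-1.0, 1.0] amplitude range are assumptions made for illustration, not part of the slides.

```python
def pcm_bit_rate(samples_per_sec: int, bits_per_sample: int) -> int:
    """Raw (uncompressed) bit rate in bits/sec."""
    return samples_per_sec * bits_per_sample


def quantize(value: float, bits: int) -> int:
    """Round an amplitude in [-1.0, 1.0] to one of 2**bits integer levels.

    The amplitude range is an assumption for this sketch.
    """
    levels = 2 ** bits
    clipped = max(-1.0, min(1.0, value))
    # Scale to [0, levels - 1], then round to the nearest level.
    return round((clipped + 1.0) / 2.0 * (levels - 1))


if __name__ == "__main__":
    # Telephone-quality audio: 8,000 samples/sec at 8 bits/sample.
    print(pcm_bit_rate(8000, 8), "bits/sec")
    print(quantize(0.3, 8))
```

With 8,000 samples/sec and 8 bits per sample the result is 64,000 bits/sec, which is the 64 kbps talk-spurt rate used later for VoIP.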
Multimedia: video
CBR: (constant bit rate): video encoding rate fixed
VBR: (variable bit rate): video encoding rate changes as amount of spatial, temporal coding changes
examples:
MPEG 1 (CD-ROM) 1.5 Mbps
MPEG2 (DVD) 3-6 Mbps
MPEG4 (often used in Internet, < 1 Mbps)
spatial coding example: instead of sending N values of same color (all purple), send only two values: color value (purple) and number of repeated values (N)
temporal coding example: instead of sending complete frame at i+1, send only differences from frame i
[figure: frame i and frame i+1]
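The spatial and temporal coding examples can be sketched in a few lines of Python. This is only an illustration of the two ideas (run-length coding within a frame, sending differences between frames), assuming a frame is a flat list of pixel values; real codecs such as MPEG are far more elaborate, and all function names here are hypothetical.

```python
def run_length_encode(pixels):
    """Spatial coding: replace runs of equal values by (value, count) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return [(value, count) for value, count in runs]


def frame_diff(prev, curr):
    """Temporal coding: list only the positions that changed since frame i."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def apply_diff(prev, diff):
    """Rebuild frame i+1 from frame i and the list of changes."""
    frame = list(prev)
    for i, value in diff:
        frame[i] = value
    return frame


if __name__ == "__main__":
    print(run_length_encode(["purple"] * 5))
    frame_i = [1, 2, 3, 4]
    frame_next = [1, 9, 3, 4]
    diff = frame_diff(frame_i, frame_next)
    print(diff, apply_diff(frame_i, diff) == frame_next)
```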
Multimedia networking: 3 application types
streaming, stored audio, video
streaming: can begin playout before downloading entire
file
stored (at server): can transmit faster than audio/video
will be rendered (implies storing/buffering at client)
e.g., YouTube, Netflix, Hulu
conversational voice/video over IP
interactive nature of human-to-human conversation
limits delay tolerance
e.g., Skype
streaming live audio, video
e.g., live sporting event
Streaming stored video:
1. video recorded (e.g., 30 frames/sec)
2. video sent
3. video received, played out at client (30 frames/sec)
network delay (fixed in this example)
streaming: at this time, client playing out early part of video, while server still sending later part of video
Streaming stored video: challenges
continuous playout constraint: once client playout
begins, playback must match original timing
… but network delays are variable (jitter), so
will need client-side buffer to match playout
requirements
other challenges:
client interactivity: pause, fast-forward,
rewind, jump through video
video packets may be lost, retransmitted
Client-side buffering, playout
[figure: video server sends at variable fill rate x(t) into the client application buffer of size B; buffer fill level Q(t); playout rate, e.g., CBR r]
1. initial fill of buffer until playout begins at time tp
2. playout begins at tp
3. buffer fill level varies over time as fill rate x(t) varies and playout rate r is constant
playout buffering: average fill rate (x), playout rate (r):
x < r: buffer eventually empties (causing freezing of video
playout until buffer again fills)
x > r: buffer will not empty, provided initial playout delay is
large enough to absorb variability in x(t)
initial playout delay tradeoff: buffer starvation less likely
with larger delay, but larger delay until user begins
watching
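The interaction between fill rate x(t), playout rate r, and initial playout delay can be explored with a small discrete-time simulation. This is a sketch under simplifying assumptions that are not in the slides: time is divided into equal slots, the buffer is unbounded (size B is ignored), and a slot either plays r units or freezes. The function name is hypothetical.

```python
def simulate_playout(fill_rates, r, initial_delay_slots):
    """Return (final buffer level, number of slots in which playout froze).

    fill_rates: x(t), data arriving in each time slot
    r: constant playout rate (data consumed per slot once playout begins)
    initial_delay_slots: slots spent only filling the buffer before playout
    """
    q = 0.0          # buffer fill level Q(t)
    freezes = 0
    for t, x in enumerate(fill_rates):
        q += x
        if t < initial_delay_slots:
            continue          # still in the initial fill phase
        if q >= r:
            q -= r            # enough data: play one slot
        else:
            freezes += 1      # buffer starvation: playout freezes
    return q, freezes


if __name__ == "__main__":
    bursty = [0, 4, 0, 4, 0, 4]   # average fill rate equals r = 2
    print(simulate_playout(bursty, r=2, initial_delay_slots=0))
    print(simulate_playout(bursty, r=2, initial_delay_slots=2))
```

With the bursty arrivals in the example, starting playout immediately produces a freeze, while waiting two slots absorbs the variability, which is the tradeoff described above.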
Streaming multimedia: HTTP
multimedia file retrieved via HTTP GET
send at maximum possible rate under TCP
[figure: server side: video file, TCP send buffer; variable rate x(t) across the network; client side: TCP receive buffer, application playout buffer]
fill rate fluctuates due to TCP congestion control,
retransmissions (in-order delivery)
larger playout delay: smooths out variations in TCP delivery rate
HTTP/TCP passes more easily through firewalls
Streaming multimedia: DASH
DASH: Dynamic, Adaptive Streaming over HTTP
server:
divides video file into multiple chunks
each chunk stored, encoded at different rates
manifest file: provides URLs for different chunks
client:
periodically measures server-to-client bandwidth
consulting manifest, requests one chunk at a time
• chooses maximum coding rate sustainable given
current bandwidth
• can choose different coding rates at different points
in time (depending on available bandwidth at time)
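The client's rate choice can be sketched as picking the highest encoding rate that does not exceed the measured bandwidth. A minimal Python sketch, assuming the manifest has already been reduced to a list of available rates in kbps; the function name and the fallback to the lowest rate when nothing is sustainable are assumptions, not something DASH mandates.

```python
def choose_rate(available_kbps, measured_kbps):
    """Pick the maximum coding rate sustainable at the measured bandwidth.

    Falls back to the lowest available rate if even that exceeds the
    measurement (an assumption made for this sketch).
    """
    sustainable = [rate for rate in available_kbps if rate <= measured_kbps]
    return max(sustainable) if sustainable else min(available_kbps)


if __name__ == "__main__":
    rates = [300, 1000, 3000]          # hypothetical encodings of each chunk
    for bandwidth in (1500, 100, 5000):
        print(bandwidth, "->", choose_rate(rates, bandwidth))
```

Calling this once per chunk with a fresh bandwidth estimate gives the "different coding rates at different points in time" behavior.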
Voice-over-IP (VoIP)
VoIP end-to-end delay requirement: needed to maintain
“conversational” aspect
higher delays noticeable, impair interactivity
< 150 msec: good
> 400 msec: bad
includes application-level (packetization, playout),
network delays
session initialization: how does callee advertise IP
address, port number, encoding algorithms?
value-added services: call forwarding, screening,
recording
VoIP characteristics
speaker’s audio: alternating talk spurts, silent
periods.
64 kbps during talk spurt
pkts generated only during talk spurts
20 msec chunks at 8 Kbytes/sec: 160 bytes of data
application-layer header added to each chunk
chunk+header encapsulated into UDP or TCP
segment
application sends segment into socket every 20
msec during talkspurt
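The chunk size follows directly from the encoding rate and the chunk duration. A short Python sketch of the arithmetic; the function names are hypothetical and header overhead is left out.

```python
def chunk_bytes(rate_bytes_per_sec: int, chunk_msec: int) -> int:
    """Bytes of audio data gathered in one chunk."""
    return rate_bytes_per_sec * chunk_msec // 1000


def packets_per_sec(chunk_msec: int) -> int:
    """Number of chunks (segments) sent per second during a talk spurt."""
    return 1000 // chunk_msec


if __name__ == "__main__":
    # 64 kbps = 8 Kbytes/sec, gathered in 20 msec chunks.
    print(chunk_bytes(8000, 20), "bytes per chunk")
    print(packets_per_sec(20), "segments per second")
```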
VoIP: packet loss, delay
network loss: IP datagram lost due to network
congestion (router buffer overflow)
delay loss: IP datagram arrives too late for playout
at receiver
delays: processing, queueing in network; end-system
(sender, receiver) delays
typical maximum tolerable delay: 400 ms
loss tolerance: depending on voice encoding, loss
concealment, packet loss rates between 1% and
10% can be tolerated
VoIP: recovery from packet loss
Challenge: recover from packet loss given small
tolerable delay between original transmission and
playout
each ACK/NAK takes ~ one RTT
alternative: Forward Error Correction (FEC)
send enough bits to allow recovery without
retransmission (recall two-dimensional parity in Ch. 5)
simple FEC
for every group of n chunks, create redundant chunk by
exclusive OR-ing n original chunks
send n+1 chunks, increasing bandwidth by factor 1/n
can reconstruct original n chunks if at most one of the n+1 chunks is lost,
at the cost of added playout delay (receiver waits for the group)
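The simple FEC scheme can be sketched with byte-wise XOR. This is a minimal Python illustration, assuming all chunks in a group have equal length and that a lost chunk is marked with None; the function names are hypothetical.

```python
from functools import reduce


def xor_chunks(chunks):
    """Byte-wise XOR of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def encode_group(chunks):
    """Return the n original chunks followed by one redundant XOR chunk."""
    return list(chunks) + [xor_chunks(chunks)]


def recover_group(received):
    """Rebuild the n original chunks from n+1 received entries.

    received: list of n+1 entries, with None marking a lost chunk.
    Raises ValueError if more than one chunk is lost.
    """
    missing = [i for i, c in enumerate(received) if c is None]
    if len(missing) > 1:
        raise ValueError("simple XOR FEC repairs at most one loss per group")
    repaired = list(received)
    if missing:
        present = [c for c in received if c is not None]
        # XOR of the n surviving chunks equals the missing one.
        repaired[missing[0]] = xor_chunks(present)
    return repaired[:-1]


if __name__ == "__main__":
    group = [b"abcd", b"efgh", b"ijkl"]
    sent = encode_group(group)
    sent[1] = None                      # simulate one lost chunk
    print(recover_group(sent) == group)
```

The receiver can only repair a loss after the rest of the group has arrived, which is where the extra playout delay comes from.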
Voice-over-IP: Skype
proprietary application-layer protocol (inferred via reverse engineering)
encrypted msgs
P2P components:
clients: skype peers connect directly to each other for VoIP call
super nodes (SN): skype peers with special functions
overlay network: among SNs to locate SCs
login server
[figure: Skype clients (SC), supernodes (SN), supernode overlay network, Skype login server]
P2P voice-over-IP: skype
skype client operation:
1. joins skype network by contacting SN (IP address cached) using TCP
2. logs in (username, password) to centralized skype login server
3. obtains IP address for callee from SN, SN overlay, or client buddy list
4. initiates call directly to callee
Real-Time Protocol (RTP)
RTP specifies packet structure for packets carrying audio, video data
RFC 3550
RTP packet provides
payload type identification
packet sequence numbering
time stamping
RTP runs in end systems
RTP packets encapsulated in UDP segments
interoperability: if two VoIP applications run RTP, they may be able to work together
RTP runs on top of UDP
RTP libraries provide transport-layer interface
that extends UDP:
• port numbers, IP addresses
• payload type identification
• packet sequence numbering
• time-stamping
RTP and QoS
RTP does not provide any mechanism to ensure
timely data delivery or other QoS guarantees
RTP encapsulation only seen at end systems (not
by intermediate routers)
routers provide best-effort service, making no
special effort to ensure that RTP packets arrive
at destination in timely manner
RTP header
[header fields: payload type | sequence number | time stamp | Synchronization Source ID | Miscellaneous fields]
payload type (7 bits): indicates type of encoding currently being
used. If sender changes encoding during call, sender
informs receiver via payload type field
Payload type 0: PCM mu-law, 64 kbps
Payload type 3: GSM, 13 kbps
Payload type 7: LPC, 2.4 kbps
Payload type 26: Motion JPEG
Payload type 31: H.261
Payload type 33: MPEG2 video
sequence # (16 bits): increments by one for each RTP packet sent
used to detect packet loss, restore packet sequence
timestamp field (32 bits long): sampling instant of first
byte in this RTP data packet
for audio, timestamp clock increments by one for each
sampling period (e.g., each 125 usecs for 8 KHz sampling
clock)
if application generates chunks of 160 encoded samples,
timestamp increases by 160 for each RTP packet when
source is active. Timestamp clock continues to increase
at constant rate when source is inactive.
SSRC field (32 bits long): identifies source of RTP
stream. Each stream in RTP session has distinct SSRC
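The fields described above can be packed into the 12-byte fixed RTP header with Python's struct module. A minimal sketch: the bit layout of what the slides call miscellaneous fields (version 2, padding, extension, CSRC count, marker bit) is taken from RFC 3550 rather than from the slides, and the function names are hypothetical.

```python
import struct

# Network byte order: two single bytes, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC (12 bytes in total).
RTP_HEADER = struct.Struct("!BBHII")


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=0):
    byte0 = 2 << 6  # version 2, no padding, no extension, zero CSRC entries
    byte1 = (marker << 7) | (payload_type & 0x7F)   # 7-bit payload type
    return RTP_HEADER.pack(byte0, byte1, seq & 0xFFFF,
                           timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(data):
    byte0, byte1, seq, timestamp, ssrc = RTP_HEADER.unpack(data[:12])
    return {
        "version": byte0 >> 6,
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Payload type 0 (PCM mu-law); 160 samples per packet, so the
    # timestamp advances by 160 while the sequence number advances by 1.
    for n in range(3):
        header = pack_rtp_header(0, seq=n, timestamp=160 * n, ssrc=0x1234)
        print(parse_rtp_header(header))
```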
Real-Time Control Protocol (RTCP)
works in conjunction with RTP
each participant in RTP session periodically sends RTCP control packets to all other participants
each RTCP packet contains sender and/or receiver reports
report statistics useful to application: # packets sent, # packets lost, interarrival jitter
feedback used to control performance
sender may modify its transmissions based on feedback
RTCP: multiple multicast senders
[figure: sender and receivers, with RTP and RTCP packets flowing between them]
each RTP session: typically a single multicast address; all RTP
/RTCP packets belonging to session use multicast address
RTP, RTCP packets distinguished from each other via distinct port
numbers
to limit traffic, each participant reduces RTCP traffic as number of
conference participants increases
RTCP: packet types
receiver report packets: fraction of packets lost, last sequence number, average interarrival jitter
sender report packets: SSRC of RTP stream, current time, number of packets sent, number of bytes sent
source description packets: e-mail address of sender, sender's name, SSRC of associated RTP stream
provide mapping between the SSRC and the user/host name
RTCP: stream synchronization
RTCP can synchronize different media streams within a RTP session
e.g., videoconferencing app: each sender generates one RTP stream for video, one for audio.
timestamps in RTP packets tied to the video, audio sampling clocks
not tied to wall-clock time
each RTCP sender-report packet contains (for most recently generated packet in associated RTP stream):
timestamp of RTP packet
wall-clock time for when packet was created
receivers use association to synchronize playout of audio, video
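The association carried in a sender report lets a receiver place any RTP timestamp of that stream on the wall-clock axis. A minimal Python sketch, assuming the media clock rate is known to the receiver and ignoring 32-bit timestamp wraparound; the function name and the 90 kHz video clock in the example are assumptions, not taken from the slides.

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_wallclock, clock_rate_hz):
    """Map an RTP timestamp to wall-clock seconds.

    sr_rtp_ts, sr_wallclock: the (RTP timestamp, wall-clock time) pair
    from the most recent sender report for this stream.
    clock_rate_hz: sampling clock of the stream (e.g., 8000 for 8 KHz audio).
    """
    return sr_wallclock + (rtp_ts - sr_rtp_ts) / clock_rate_hz


if __name__ == "__main__":
    # Audio and video packets with unrelated RTP timestamps...
    audio_time = rtp_to_wallclock(1800, sr_rtp_ts=1000,
                                  sr_wallclock=50.0, clock_rate_hz=8000)
    video_time = rtp_to_wallclock(99000, sr_rtp_ts=90000,
                                  sr_wallclock=50.0, clock_rate_hz=90000)
    # ...map to the same wall-clock instant, so they are played together.
    print(audio_time, video_time)
```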
SIP: Session Initiation Protocol [RFC 3261]
long-term vision:
all telephone calls, video conference calls take
place over Internet
people identified by names or e-mail addresses,
rather than by phone numbers
can reach callee (if callee so desires), no matter
where callee roams, no matter what IP device
callee is currently using
SIP services
SIP provides mechanisms for call setup:
for caller to let callee know she wants to establish a call
so caller, callee can agree on media type, encoding
to end call
determine current IP address of callee:
maps mnemonic identifier to current IP address
call management:
add new media streams during call
change encoding during call
invite others
transfer, hold calls
Setting up a call (more)
codec negotiation:
suppose Bob doesn’t have PCM mu-law encoder
Bob will instead reply with 606 Not Acceptable Reply, listing his encoders. Alice can then send new INVITE message, advertising different encoder
rejecting a call
Bob can reject with replies “busy,” “gone,” “payment required,” “forbidden”
media can be sent over RTP or some other protocol
Example of SIP message
INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP 167.180.112.24
From: sip:[email protected]
To: sip:[email protected]
Call-ID: [email protected]
Content-Type: application/sdp
Content-Length: 885

c=IN IP4 167.180.112.24
m=audio 38060 RTP/AVP 0
Notes:
HTTP message syntax
sdp = session description protocol
Call-ID is unique for every call
Here we don’t know Bob’s IP address: intermediate SIP servers needed
Alice sends, receives SIP messages using SIP default port 5060
Alice specifies in Via: header that SIP client sends, receives SIP messages over UDP
Name translation, user location
caller wants to call callee, but only has callee’s name or e-mail address.
need to get IP address of callee’s current host:
user moves around
DHCP protocol
user has different IP devices (PC, smartphone, car device)
result can be based on:
time of day (work, home)
caller (don’t want boss to call you at home)
status of callee (calls sent to voicemail when callee is already talking to someone)
SIP registrar
one function of SIP server: registrar
when Bob starts SIP client, client sends SIP REGISTER
message to Bob’s registrar server
register message:
REGISTER sip:domain.com SIP/2.0
Via: SIP/2.0/UDP 193.64.210.89
From: sip:[email protected]
To: sip:[email protected]
Expires: 3600
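The registrar's job can be pictured as a table of bindings from SIP address to current IP address, each valid until its Expires interval runs out. A minimal Python sketch of that idea, not of any real SIP server; the class, its methods, and the address sip:user@example.com are hypothetical, while the IP address and the 3600-second lifetime come from the message above.

```python
class Registrar:
    """Toy binding table: SIP address -> (IP address, expiry time)."""

    def __init__(self):
        self._bindings = {}

    def register(self, sip_uri, ip, expires, now):
        """Record a binding that stays valid for `expires` seconds."""
        self._bindings[sip_uri] = (ip, now + expires)

    def lookup(self, sip_uri, now):
        """Return the current IP address, or None if unknown or expired."""
        entry = self._bindings.get(sip_uri)
        if entry is None or now >= entry[1]:
            return None
        return entry[0]


if __name__ == "__main__":
    registrar = Registrar()
    registrar.register("sip:user@example.com", "193.64.210.89",
                       expires=3600, now=0)
    print(registrar.lookup("sip:user@example.com", now=100))
    print(registrar.lookup("sip:user@example.com", now=3600))
```

A client that wants to stay reachable would have to send a new REGISTER before the binding expires.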
Scheduling and policing mechanisms
scheduling: choose next packet to send on link
FIFO (first in first out) scheduling: send in order of
arrival to queue
discard policy: if packet arrives to full queue: who to
discard?
• tail drop: drop arriving packet
• priority: drop/remove on priority basis
• random: drop/remove randomly
[figure: packet arrivals enter queue (waiting area), are served by link (server), and leave as packet departures]
Scheduling policies: priority
priority scheduling: send highest priority queued packet
multiple classes, with different priorities
class may depend on marking or other header info, e.g. IP source/dest, port numbers, etc.
real world example?
[figure: arrivals are classified into a high priority queue and a low priority queue (waiting areas) feeding the link (server); timeline shows arrivals of packets 1-5, packet in service in order 1 3 2 4 5, and departures in order 1 3 2 4 5]
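Priority scheduling can be sketched as a small simulation in Python. Assumptions made for this sketch: each packet takes one time unit to transmit, service is non-preemptive, and a lower priority number means higher priority. The arrival times and class assignments in the example are hypothetical, chosen so that the resulting order is 1 3 2 4 5; they are not read off the slide's figure.

```python
import heapq


def priority_schedule(packets):
    """Return packet ids in order of service.

    packets: list of (arrival_time, priority, packet_id) tuples.
    """
    pending = sorted(packets)      # process arrivals in time order
    ready = []                     # heap ordered by (priority, arrival)
    order = []
    clock = 0
    i = 0
    while i < len(pending) or ready:
        if not ready and pending[i][0] > clock:
            clock = pending[i][0]  # link idles until the next arrival
        while i < len(pending) and pending[i][0] <= clock:
            arrival, prio, pid = pending[i]
            heapq.heappush(ready, (prio, arrival, pid))
            i += 1
        prio, arrival, pid = heapq.heappop(ready)
        order.append(pid)          # send highest priority queued packet
        clock += 1                 # one time unit to transmit
    return order


if __name__ == "__main__":
    # (arrival_time, priority, id): packets 1, 3, 4 high (0); 2, 5 low (1)
    arrivals = [(0, 0, 1), (1, 1, 2), (1, 0, 3), (3, 0, 4), (4, 1, 5)]
    print(priority_schedule(arrivals))
```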
Scheduling policies: still more
Round Robin (RR) scheduling:
multiple classes
cyclically scan class queues, sending one complete
packet from each class (if available)
real world example?
[figure: timeline shows arrivals of packets 1-5, packet in service in order 1 3 2 4 5, and departures in the same order]
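Round robin scanning can be sketched in a few lines of Python. This sketch assumes all packets are already queued when scanning starts (arrival times are ignored), so its output need not match the order in the slide's figure; the function name and the example queues are hypothetical.

```python
from collections import deque


def round_robin(class_queues):
    """Cyclically scan class queues, sending one packet from each
    non-empty class per pass. Returns packet ids in order of service."""
    queues = [deque(q) for q in class_queues]
    order = []
    while any(queues):             # some class still has packets queued
        for q in queues:
            if q:                  # skip classes with nothing to send
                order.append(q.popleft())
    return order


if __name__ == "__main__":
    # Two classes: one holds packets 1, 2, 4 and the other holds 3, 5.
    print(round_robin([[1, 2, 4], [3, 5]]))
```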
Multimedia networking: Summary
6.1 multimedia networking applications
6.2 streaming stored video
6.3 voice-over-IP
6.4 protocols for real-time conversational
applications
6.5 network support for multimedia