COM210: Computer Networks (CN)
Chapter 3: TCP
Dr Ahmad Al-Zubi
King Saud University
TCP: Overview
Point-to-point:
one sender, one receiver
Reliable, in-order byte stream:
no message boundaries
But TCP chops it up into segments for
transmission internally
Pipelined (window) flow control:
Window size decided by receiver and network
Send & receive buffers
TCP: Overview
[Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads data through its socket door.]
TCP: Overview
Full duplex data:
bi-directional data flow in same connection
MSS: maximum segment size
Connection-oriented:
handshaking (exchange of control messages)
initializes sender and receiver state before data exchange
Flow & Congestion Control:
sender will not overwhelm receiver or the
network
TCP segment structure
TCP header layout (each row is 32 bits wide):
source port # | dest port #
sequence number
acknowledgement number
head len | not used | U A P R S F | rcvr window size
checksum | ptr urgent data
Options (variable length)
application data (variable length)

Field notes:
sequence number, acknowledgement number: counting by bytes of data (not segments!)
URG: urgent data (generally not used)
ACK: ACK # valid
PSH: push data now (generally not used)
RST, SYN, FIN: connection estab (setup, teardown commands)
rcvr window size: # bytes rcvr willing to accept
checksum: Internet checksum (as in UDP)
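As a rough illustration of the header layout, the sketch below packs and parses the fixed 20-byte part of a TCP header with Python's `struct` module. The function names and the dictionary keys are hypothetical, options are assumed absent, and the checksum is simply left at zero rather than computed.

```python
import struct

# Fixed 20-byte TCP header without options, in network byte order:
# src port, dst port, seq, ack, (header length + flags), window, checksum, urgent ptr
TCP_HEADER = struct.Struct("!HHIIHHHH")
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def pack_header(src_port, dst_port, seq, ack, flags, window, checksum=0, urgent=0):
    header_words = 5  # header length in 32-bit words (assumes no options)
    flag_value = sum(FLAG_BITS[name] for name in flags)
    offset_and_flags = (header_words << 12) | flag_value
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, checksum, urgent)

def parse_header(raw):
    src, dst, seq, ack, off_flags, window, checksum, urgent = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len_bytes": (off_flags >> 12) * 4,
        "flags": {name for name, bit in FLAG_BITS.items() if off_flags & bit},
        "window": window, "checksum": checksum, "urgent_ptr": urgent,
    }

raw = pack_header(1234, 80, seq=42, ack=79, flags=["ACK", "PSH"], window=4096)
fields = parse_header(raw)
```

Note that the sequence and acknowledgement numbers are 32-bit byte counters, while the window field is only 16 bits wide.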
TCP seq. #s and ACKs (I)
Sequence Numbers:
byte-stream number of the first byte in the segment's data
ACKs:
seq # of next byte expected from other side
cumulative ACK
Q: how does the receiver handle out-of-order segments?
A: the TCP spec doesn't say; it is up to the implementor
TCP Seq. #s and ACKs (II)
[Figure: simple telnet scenario between Host A and Host B.
1. User types 'C'. Host A sends Seq=42, ACK=79, data = 'C'.
2. Host B ACKs receipt of 'C' and echoes back 'C': Seq=79, ACK=43, data = 'C'.
3. Host A ACKs receipt of the echoed 'C': Seq=43, ACK=80.]
simple telnet scenario
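The numbers in the telnet scenario follow from one rule: the ACK number is the sequence number of the next byte expected. A minimal sketch, with hypothetical dictionary-based segments and starting values 42 and 79 chosen to match the scenario:

```python
def ack_for(seq, payload):
    """ACK number = sequence number of the next byte expected from the peer."""
    return seq + len(payload)

# Host A's byte stream is at 42, Host B's byte stream is at 79.
a_to_b = {"seq": 42, "ack": 79, "data": b"C"}

# Host B echoes the byte back and acknowledges it.
b_to_a = {"seq": a_to_b["ack"],
          "ack": ack_for(a_to_b["seq"], a_to_b["data"]),
          "data": b"C"}

# Host A acknowledges the echoed byte; this segment carries no data.
a_ack = {"seq": b_to_a["ack"],
         "ack": ack_for(b_to_a["seq"], b_to_a["data"]),
         "data": b""}
```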
Temporal Redundancy Model
[Figure: temporal redundancy model. Packets carry sequence numbers and a CRC or checksum. A timeout and status reports (ACKs, NAKs, SACKs, bitmaps) trigger retransmissions, which carry packets or FEC information.]
Status Report Design
Cumulative acks:
Robust to losses on the reverse channel
Can work with go-back-N retransmission
Cannot pinpoint blocks of data which are
lost
The first lost packet can be pinpointed
because the receiver would generate
duplicate acks
TCP: reliable data transfer (I)
Simplified sender: one way data transfer, no flow or congestion control
The sender waits for an event, handles it, then waits for the next event:
event: data received from application above -> create, send segment
event: timer timeout for segment with seq # y -> retransmit segment
event: ACK received, with ACK # y -> ACK processing
TCP: reliable data transfer (II)
Simplified TCP sender

sendbase = initial_sequence number
nextseqnum = initial_sequence number

loop (forever) {
  switch(event)

  event: data received from application above
    create TCP segment with sequence number nextseqnum
    start timer for segment nextseqnum
    pass segment to IP
    nextseqnum = nextseqnum + length(data)

  event: timer timeout for segment with sequence number y
    retransmit segment with sequence number y
    compute new timeout interval for segment y
    restart timer for sequence number y

  event: ACK received, with ACK field value of y
    if (y > sendbase) { /* cumulative ACK of all data up to y */
      cancel all timers for segments with sequence numbers < y
      sendbase = y
    }
    else { /* a duplicate ACK for already ACKed segment */
      increment number of duplicate ACKs received for y
      if (number of duplicate ACKs received for y == 3) {
        /* TCP fast retransmit */
        resend segment with sequence number y
        restart timer for segment y
      }
    }
} /* end of loop forever */
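The simplified sender can be sketched as a small event-driven Python class. This is an illustration under assumptions, not a real TCP implementation: the class and method names are hypothetical, timers are represented only by the `unacked` dictionary, and "passing a segment to IP" means appending it to a list.

```python
class SimplifiedTcpSender:
    def __init__(self, initial_seq=0):
        self.send_base = initial_seq   # oldest unacknowledged byte
        self.next_seq = initial_seq    # sequence number of the next new byte
        self.unacked = {}              # seq -> payload; stands in for per-segment timers
        self.dup_acks = 0
        self.wire = []                 # segments handed to "IP" (assumption)

    def on_app_data(self, data):
        self.unacked[self.next_seq] = data
        self.wire.append((self.next_seq, data))
        self.next_seq += len(data)

    def on_timeout(self, seq):
        if seq in self.unacked:        # retransmit; the timer restart is implicit here
            self.wire.append((seq, self.unacked[seq]))

    def on_ack(self, y):
        if y > self.send_base:         # cumulative ACK of all data up to y
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.send_base = y
            self.dup_acks = 0
        else:                          # duplicate ACK
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.wire.append((y, self.unacked[y]))  # fast retransmit

sender = SimplifiedTcpSender(initial_seq=92)
sender.on_app_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)   # Seq=100, 20 bytes
sender.on_ack(100)              # acknowledges the first segment
for _ in range(3):
    sender.on_ack(100)          # three duplicate ACKs trigger fast retransmit
```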
TCP ACK generation
Event: in-order segment arrival, no gaps, everything else already ACKed
TCP Receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK

Event: in-order segment arrival, no gaps, one delayed ACK pending
TCP Receiver action: immediately send single cumulative ACK

Event: out-of-order segment arrival, higher-than-expected seq. #, gap detected!
TCP Receiver action: send duplicate ACK, indicating seq. # of next expected byte

Event: arrival of segment that partially or completely fills gap
TCP Receiver action: immediate ACK if segment starts at lower end of gap
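The four receiver rules can be written as one decision function. This is a sketch only: the function name, its parameters, and the handling of an old segment below the expected sequence number are assumptions that the table does not cover.

```python
def receiver_action(seg_seq, expected_seq, gap_buffered, delayed_ack_pending):
    """Return the ACK action for an arriving segment.

    seg_seq: sequence number of the arriving segment
    expected_seq: next in-order byte the receiver is waiting for
    gap_buffered: True if out-of-order data is already buffered above expected_seq
    delayed_ack_pending: True if an earlier in-order segment is still unACKed
    """
    if seg_seq > expected_seq:
        return "send duplicate ACK for expected_seq"
    if seg_seq < expected_seq:
        # Not in the table: assume an already-received segment is re-ACKed.
        return "send duplicate ACK for expected_seq"
    if gap_buffered:
        return "send immediate ACK"            # segment starts at lower end of gap
    if delayed_ack_pending:
        return "send single cumulative ACK now"
    return "delay ACK up to 500 ms"
```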
TCP: retransmission scenarios
[Figure: two timelines between Host A and Host B.
Lost ACK scenario: A sends Seq=92, 8 bytes data. B replies ACK=100, which is lost. A's timer expires and A resends Seq=92, 8 bytes data. B replies ACK=100 again.
Premature timeout, cumulative ACKs: A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data. B replies ACK=100 and ACK=120. The Seq=92 timer expires before ACK=100 arrives, so A resends Seq=92, 8 bytes data. B replies with the cumulative ACK=120.]
TCP Flow Control
flow control: sender won't overrun receiver's buffers by transmitting too much, too fast
RcvBuffer = size of TCP Receive Buffer
RcvWindow = amount of spare room in Buffer
receiver: explicitly informs sender of free buffer space
RcvWindow field in TCP segment
sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow
[Figure: receiver buffering]
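The flow-control bookkeeping can be sketched in a few lines. The variable names `last_byte_rcvd`, `last_byte_read`, `last_byte_sent` and `last_byte_acked` are assumptions borrowed from common textbook notation; only RcvBuffer and RcvWindow appear in these slides.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room in the receive buffer, advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, advertised_window, nbytes):
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= advertised_window

window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok_small = sender_may_send(5000, 4000, window, nbytes=1000)
ok_large = sender_may_send(5000, 4000, window, nbytes=1200)
```

Here 2000 buffered bytes leave 2096 bytes of spare room, so with 1000 bytes already in flight the sender may add 1000 more but not 1200.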
Timeout and RTT Estimation
Timeout: for robust detection of
packet loss
Problem: How long should the timeout be?
Too long => underutilization
Too short => wasteful
retransmissions
Solution: adaptive timeout: based on
estimate of max RTT
How to estimate max RTT?
RTT = prop + queuing delay
Queuing delay highly variable
So, different samples of RTTs will give
different random values of queuing delay
Chebyshev's Theorem:
MaxRTT = Avg RTT + k*Deviation
Probability that a sample exceeds MaxRTT is at most 1/(k**2)
Result true for ANY distribution of samples
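A small numeric check of the Chebyshev bound, as a sketch: the RTT samples below are made up for illustration, and "Deviation" is assumed to mean the standard deviation of the samples.

```python
import statistics

def max_rtt_estimate(samples, k):
    """MaxRTT = average RTT + k * deviation (deviation taken as population std dev)."""
    avg = statistics.mean(samples)
    dev = statistics.pstdev(samples)
    return avg + k * dev

# Hypothetical RTT samples in milliseconds, with one queuing-delay spike.
samples = [100, 120, 90, 300, 110, 95, 105, 130]
k = 2
bound = max_rtt_estimate(samples, k)
fraction_exceeding = sum(s > bound for s in samples) / len(samples)
```

Whatever the distribution of the samples, the fraction exceeding the bound cannot be larger than 1/k**2, which is 0.25 for k = 2.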
Round Trip Time and Timeout (II)
Q: how to estimate RTT?
SampleRTT: measured time from segment
transmission until ACK receipt
ignore retransmissions, cumulatively
ACKed segments
SampleRTT will vary wildly => want
estimated RTT smoother
use several recent measurements, not
just current SampleRTT to calculate
AverageRTT
TCP Round Trip Time and
Timeout (III)
AverageRTT = (1-x)*AverageRTT + x*SampleRTT
Exponential weighted moving average (EWMA)
influence of a given sample decreases
exponentially fast; typical value x = 0.1
Setting the timeout
AverageRTT plus safety margin proportional to
variation
Timeout = AverageRTT + 4*Deviation
Deviation = (1-x)*Deviation + x*|SampleRTT- AverageRTT|
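The EWMA update and the timeout rule can be sketched as a small class. The class name and the initial values (first sample as the starting average, zero starting deviation) are assumptions; the update order follows the formulas above, with Deviation using the already updated AverageRTT.

```python
class RttEstimator:
    def __init__(self, first_sample, x=0.1):
        self.x = x
        self.average_rtt = first_sample  # assumption: seed with the first sample
        self.deviation = 0.0             # assumption: start with no deviation

    def update(self, sample_rtt):
        x = self.x
        self.average_rtt = (1 - x) * self.average_rtt + x * sample_rtt
        self.deviation = (1 - x) * self.deviation + x * abs(sample_rtt - self.average_rtt)
        return self.timeout()

    def timeout(self):
        return self.average_rtt + 4 * self.deviation

est = RttEstimator(first_sample=100.0)
timeout_ms = est.update(200.0)   # one unusually slow sample
```

With x = 0.1, a single 200 ms sample moves the average only from 100 ms to 110 ms, but the deviation term raises the timeout to about 146 ms.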
TCP Connection Management - 1
Recall: TCP sender, receiver establish connection
before exchanging data segments
initialize TCP variables:
seq. #s
buffers, flow control info (e.g. RcvWindow)
client: connection initiator
Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
Socket connectionSocket = welcomeSocket.accept();
TCP Connection Management - 2
Three way handshake:
Step 1: client end system sends TCP SYN control segment to server
specifies initial seq #
Step 2: server end system receives SYN, replies with SYNACK control segment
ACKs received SYN
allocates buffers
specifies server's initial seq. # (server -> client direction)
Step 3: client end system receives SYNACK, replies with ACK segment
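The sequence and ACK numbers exchanged during the handshake can be sketched as follows. The function name and dictionary layout are hypothetical, and the initial sequence numbers are arbitrary. The third segment, the client's ACK of the SYNACK, is the standard final step that completes the three-way handshake; a SYN consumes one sequence number, which is why each ACK is the peer's initial seq # plus one.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three control segments of a TCP connection setup."""
    syn = {"flags": {"SYN"}, "seq": client_isn}
    synack = {"flags": {"SYN", "ACK"}, "seq": server_isn, "ack": syn["seq"] + 1}
    ack = {"flags": {"ACK"}, "seq": syn["seq"] + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]

syn, synack, ack = three_way_handshake(client_isn=1000, server_isn=5000)
```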
TCP Connection Management - 3
Closing a connection:
client closes socket: clientSocket.close();
Step 1: client end system sends TCP
FIN control segment to server
Step 2: server receives FIN, replies with
ACK. Closes connection, sends FIN.
TCP Connection Management - 4
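[Figure: connection teardown timeline between client and server. The client closes and sends FIN; the server replies with ACK, closes, and sends its own FIN; the client replies with ACK and enters timed wait; the connection is then closed.]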
TCP Connection Management - 5
Step 3: client receives FIN, replies with ACK.
Enters timed wait - will respond with
ACK to received FINs
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle
simultaneous FINs.
TCP Connection Management - 6
TCP client lifecycle
TCP Connection Management - 7
TCP server lifecycle
Recap: Stability of a Multiplexed System
Average Input Rate > Average Output Rate
=> system is unstable!
How to ensure stability ?
1. Reserve enough capacity so that
demand is less than reserved capacity
2. Dynamically detect overload and adapt
either the demand or capacity to resolve
overload
Congestion Problem in Packet Switching
[Figure: statistical multiplexing. Hosts A and B on a 10 Mb/s Ethernet share a 1.5 Mb/s output link, where a queue of packets waits for the output link; a 45 Mb/s link is also shown.]
Cost: self-descriptive header per-packet,
buffering and delays for applications.
Need to either reserve resources or
dynamically detect/adapt to overload for stability
The Congestion Problem
Problem: demand outstrips available capacity
[Figure: demands λ1 ... λn offered to a resource with capacity μ]
If information about λi, λ and μ (individual demands, total demand and capacity) is
known in a central location where
control of λi or μ can be effected
with zero time delays,
the congestion problem is solved!
The Congestion Problem
(Continued)
Problems:
Incomplete information (e.g. loss indications)
Distributed solution required
Congestion and control/measurement locations different
Time-varying, heterogeneous time delay
The Congestion Problem
Static fixes may not solve congestion
a) Memory becomes cheap (infinite memory)?
[Figure labels: "No buffer", "Too late"]
b) Links become cheap (high speed links)?
[Figure: two chains of nodes labelled S.
All links 19.2 kb/s: File Transfer Time = 5 mins
Replace one link with 1 Mb/s: File Transfer Time = 7 hours]
The Congestion Problem
(Continued)
c) Processors become cheap
(fast routers & switches)
[Figure: nodes A, B, C and D]
Scenario: All links 1 Gb/s.
A & B send to C
=> high-speed congestion!!
(lose more packets faster!)
Principles of Congestion Control
Congestion:
informally: too many sources sending too
much data too fast for network to handle
different from flow control (receiver overload)!
manifestations:
lost packets (buffer overflow at routers)
long delays (queuing in router buffers)
a top-10 problem!
Causes/costs of congestion:
scenario 1
two senders, two
receivers
one router,
infinite buffers
no
retransmission
large delays
when congested
maximum
achievable
throughput
Causes/costs of congestion:
scenario 2
one router, finite buffers
sender retransmission of lost packet
Causes/costs of congestion:
scenario 2 (continued)
Costs of congestion:
More work (retrans) for given goodput
Unneeded retransmissions: link carries
multiple copies of pkt due to spurious
timeouts
Causes/costs of congestion:
scenario 3
Another cost of congestion:
when packet dropped, any upstream
transmission capacity used for that packet
was wasted!
Approaches towards
congestion control - 1
Two broad approaches towards
congestion control:
End-end congestion control:
no explicit feedback from network
congestion inferred from end-system
observed loss, delay
approach taken by TCP
Approaches towards
congestion control - 2
Network-assisted congestion
control:
routers provide feedback to end
systems
single bit indicating congestion
(SNA, DECbit, TCP/IP ECN, ATM)
explicit rate sender should send at
TCP congestion control - 1
end-end control (no network assistance)
transmission rate limited by congestion
window size, Congwin, over segments:
TCP congestion control - 2
w segments, each with MSS bytes, sent in one RTT:
throughput = (w * MSS) / RTT bytes/sec
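A quick worked instance of the throughput formula. The values for w, MSS and RTT are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def throughput_bytes_per_sec(w, mss_bytes, rtt_sec):
    """w segments of MSS bytes each are sent per round-trip time."""
    return w * mss_bytes / rtt_sec

# Example: window of 10 segments, MSS of 1460 bytes, RTT of 100 ms.
rate = throughput_bytes_per_sec(w=10, mss_bytes=1460, rtt_sec=0.1)
```

Doubling the window doubles the rate, while doubling the RTT halves it, which is why the congestion window is the sender's rate control knob.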
TCP congestion control - 3
Probing for usable bandwidth:
Window flow control: avoid receiver
overrun
Dynamic window congestion control:
avoid/control network overrun
Policy:
Increase Congwin until loss (congestion)
Loss => decrease Congwin, then begin
probing (increasing) again
Additive Increase/Multiplicative
Decrease (AIMD) Policy
For stability:
rate-of-decrease > rate-of-increase
Decrease performed enough times as long as congestion exists
AIMD policy satisfies this condition, provided packet loss is
congestion indicator
[Figure: window versus time]
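A minimal sketch of the AIMD rule, assuming a fixed capacity measured in segments and treating "window larger than capacity" as the loss signal. The function names and the increase and decrease constants (add 1, multiply by 0.5) are illustrative assumptions.

```python
def aimd_step(window, loss, increase=1.0, decrease=0.5):
    """Additive increase when there is no loss, multiplicative decrease on loss."""
    return window * decrease if loss else window + increase

def run_aimd(capacity, rounds, start=1.0):
    window, trace = start, []
    for _ in range(rounds):
        window = aimd_step(window, loss=window > capacity)
        trace.append(window)
    return trace

trace = run_aimd(capacity=8, rounds=12)
```

The window climbs by one per round, overshoots the capacity, is halved, and then climbs again.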
Fairness
Fairness goal: if N TCP sessions
share same bottleneck link, each
should get 1/N of link capacity
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]
Fairness Analysis
AIMD Converges to Fairness
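Why AIMD converges to fairness can be seen in a small simulation, sketched here under simplifying assumptions: two flows share one bottleneck, both see every loss at the same time, and both use the same increase (add 1) and decrease (halve) steps. The function name and the starting rates are hypothetical.

```python
def share_bottleneck(x1, x2, capacity, rounds):
    """Two synchronized AIMD flows sharing a link of the given capacity."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2     # multiplicative decrease halves the gap
        else:
            x1, x2 = x1 + 1, x2 + 1     # additive increase keeps the gap unchanged
    return x1, x2

# Start from a very unfair allocation: 1 versus 50 on a link of capacity 60.
x1, x2 = share_bottleneck(1.0, 50.0, capacity=60.0, rounds=200)
```

Additive increase leaves the difference between the two rates unchanged, while every multiplicative decrease halves it, so the rates drift toward equal shares.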
TCP congestion control - 4
TCP uses AIMD policy in steady state
Two phases
Transient phase: aka Slow start
Steady State: aka Congestion avoidance
Important variables:
Congwin
threshold: defines the boundary between the slow start phase and the congestion avoidance phase
TCP Slowstart - 1
Slowstart algorithm
initialize: Congwin = 1
for (each segment ACKed)
    Congwin++
until (loss event OR Congwin > threshold)
Exponential increase (per RTT) in window size (not so slow!)
Loss event: timeout (Tahoe TCP) and/or three duplicate ACKs (Reno TCP)
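The per-ACK increment turns into a doubling per RTT, which the following sketch makes explicit. It assumes no loss and one ACK per segment in flight; the function name is hypothetical.

```python
def slow_start(threshold):
    """Return Congwin at the end of each RTT until it exceeds the threshold."""
    congwin, history = 1, [1]
    while congwin <= threshold:
        acks_this_rtt = congwin        # one ACK per segment sent in this RTT
        for _ in range(acks_this_rtt):
            congwin += 1               # Congwin++ for each segment ACKed
            if congwin > threshold:
                break                  # slow start ends as soon as Congwin > threshold
        history.append(congwin)
    return history

history = slow_start(threshold=16)
```

Starting from one segment, the window reaches 16 after four RTTs, then stops growing exponentially once it passes the threshold.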
TCP Slowstart - 2
[Figure: slow start timeline between Host A and Host B. Host A sends one segment, then two segments, then four segments, with one RTT between successive bursts.]
TCP Dynamics
[Figure: packets and ACKs during the 1st, 2nd, 3rd and 4th RTT, crossing a router that joins a 100 Mbps link and a 10 Mbps link]
Rate of acks determines rate of packets: self-clocking property.
TCP Congestion Avoidance - 1
Congestion avoidance
/* slowstart is over */
/* Congwin > threshold */
Until (loss event) {
    every w segments ACKed:
        Congwin++
}
threshold = Congwin/2
Congwin = 1
perform slowstart (see note 1)
Note 1: TCP Reno skips slowstart (aka fast recovery) after three duplicate ACKs and performs close to AIMD
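Slow start and congestion avoidance can be combined into a per-RTT trace. The sketch below models Tahoe-style behaviour only (every loss resets Congwin to 1) and makes several simplifying assumptions: the window is updated once per RTT, slow start doubles the window but is capped at the threshold, and losses occur at RTT indices supplied by the caller. The function name is hypothetical.

```python
def tahoe_trace(threshold, loss_rtts, rtts):
    """Congwin at the start of each RTT for a Tahoe-like sender."""
    congwin, trace = 1, []
    for rtt in range(rtts):
        trace.append(congwin)
        if rtt in loss_rtts:
            threshold = max(congwin // 2, 1)          # threshold = Congwin/2
            congwin = 1                               # restart slow start
        elif congwin < threshold:
            congwin = min(congwin * 2, threshold)     # slow start: double per RTT
        else:
            congwin += 1                              # congestion avoidance: +1 per RTT
    return trace

trace = tahoe_trace(threshold=8, loss_rtts={7}, rtts=12)
```

The window doubles up to 8, grows linearly to 12, collapses to 1 after the loss, and then slow-starts only up to the new threshold of 6.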
TCP Congestion Avoidance - 2
TCP window dynamics (more)
[Figure: congestion window (cwnd) versus time (units of RTTs), with the receiver window, ssthresh, a timeout, an idle interval and a window of 1 marked]
TCP latency modeling - 1
Q: How long does it take to receive
an object from a Web server after
sending a request?
TCP connection establishment
data transfer delay
TCP latency modeling - 2
Notation, assumptions:
Assume one link between client and
server of rate R
Assume: fixed congestion window, W
segments
S: MSS (bits)
O: object size (bits)
no retransmissions (no loss, no
corruption)
TCP latency modeling - 3
Two cases to consider:
WS/R > RTT + S/R: ACK for first segment in window returns before window's worth of data sent
WS/R < RTT + S/R: wait for ACK after sending window's worth of data
TCP latency modeling - 4
Case 1: latency = 2RTT + O/R
TCP latency modeling - 5
K = O/WS
Case 2: latency = 2RTT + O/R
+ (K-1)[S/R + RTT - WS/R]
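Both static-window cases fit in one function. This is a sketch with two assumptions: K is rounded up to a whole number of windows when O/WS is not an integer, and the boundary case WS/R = RTT + S/R is treated as case 1 (the stall is zero there, so both formulas agree). The function name and the example numbers are hypothetical.

```python
import math

def static_window_latency(O, S, R, W, rtt):
    """Latency for an object of O bits, MSS of S bits, rate R bits/sec, window W."""
    if W * S / R >= rtt + S / R:
        # Case 1: ACKs return before the window is used up, so the server never stalls.
        return 2 * rtt + O / R
    # Case 2: the server stalls after each window except the last one.
    K = math.ceil(O / (W * S))   # number of windows that cover the object
    return 2 * rtt + O / R + (K - 1) * (S / R + rtt - W * S / R)

# Example: O = 8000 bits, S = 1000 bits, R = 1000 bits/sec, RTT = 3 sec.
stalling = static_window_latency(O=8000, S=1000, R=1000, W=2, rtt=3)
no_stall = static_window_latency(O=8000, S=1000, R=1000, W=5, rtt=3)
```

With W = 2 the object needs K = 4 windows and the server idles 2 seconds after each of the first three, giving 20 seconds; with W = 5 there is no stall and the latency is 14 seconds.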
TCP latency modeling:
slow start - 1
Now suppose window grows according to slow start.
Will show that the latency of one object of size O is:
$$\text{Latency} = 2\,RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$
where P is the number of times TCP stalls at server:
TCP latency modeling:
slow start - 2
$$P = \min\{Q,\; K - 1\}$$
- where Q is the number of times the server
would stall if the object were of infinite
size.
- and K is the number of windows that cover
the object.
TCP latency modeling:
slow start - 3
Example:
O/S = 15 segments
K = 4 windows
Q = 2
P = min{K-1, Q} = 2
Server stalls P = 2 times.
[Figure: timeline with time at client and time at server. Client initiates TCP connection and requests object; the server sends the first window (= S/R), second window (= 2S/R), third window (= 4S/R) and fourth window (= 8S/R), with RTT marked between windows; transmission completes and the object is delivered.]
TCP latency modeling:
slow start - 4
$\frac{S}{R} + RTT$ = time from when server starts to send segment until server receives acknowledgement
$2^{k-1}\frac{S}{R}$ = time to transmit the kth window
$\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]^{+}$ = stall time after the kth window, where $[x]^{+} = \max(x, 0)$
TCP latency modeling:
slow start - 5
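$$\text{latency} = \frac{O}{R} + 2\,RTT + \sum_{p=1}^{P} \text{stallTime}_p$$
$$= \frac{O}{R} + 2\,RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]$$
$$= \frac{O}{R} + 2\,RTT + P\left[RTT + \frac{S}{R}\right] - (2^P - 1)\frac{S}{R}$$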
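The slow-start latency formula can be evaluated numerically. In this sketch, the closed forms used for K and Q are assumptions taken from the standard textbook treatment (the slides define K and Q only in words): K is the smallest k with 2^k - 1 >= O/S, and Q is the largest k for which the stall after the kth window is non-negative. The function name and the values of S, R and RTT are hypothetical, chosen so that O/S = 15 and Q = 2 as in the example.

```python
import math

def slow_start_latency(O, S, R, rtt):
    """Return (latency, K, Q, P) for an object of O bits sent with slow start."""
    K = math.ceil(math.log2(O / S + 1))                # windows that cover the object
    Q = math.floor(math.log2(1 + rtt / (S / R))) + 1   # stalls for an infinite object
    P = min(Q, K - 1)                                  # stalls that actually occur
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return latency, K, Q, P

# O/S = 15 segments, S/R = 1 sec, RTT = 2 sec.
latency, K, Q, P = slow_start_latency(O=15000, S=1000, R=1000, rtt=2)
```

The result can be checked by hand: the stall after the first window is 2 seconds, after the second it is 1 second, and after the third there is none, so the total is 2 RTT + O/R + 3 = 22 seconds.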
Sample Results
R          O/R        Minimum Latency: O/R + 2 RTT   Latency with slow start
28 Kbps    28.6 sec   28.8 sec                       28.9 sec
100 Kbps   8 sec      8.2 sec                        8.4 sec
1 Mbps     800 msec   1 sec                          1.5 sec
10 Mbps    80 msec    0.28 sec                       0.98 sec
Summary: Chapter 3
Principles behind transport layer services:
multiplexing/demultiplexing
reliable data transfer
flow control
congestion control
Instantiation and implementation in the
Internet
UDP, TCP