
tcp-rev2

The document provides an overview of TCP (Transmission Control Protocol), detailing its characteristics such as reliability, full duplex data transfer, and connection-oriented communication. It explains TCP's segment structure, sequence numbers, acknowledgments, round-trip time estimation, and flow control mechanisms. Additionally, it covers TCP connection management, congestion control principles, and the algorithms used for managing data transmission and retransmission in a networked environment.

Uploaded by

bhavanikiruba

TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581

- point-to-point:
    - one sender, one receiver
- reliable, in-order byte stream:
    - no “message boundaries”
- pipelined:
    - TCP congestion and flow control set window size
- send & receive buffers
- full duplex data:
    - bi-directional data flow in same connection
    - MSS: maximum segment size
- connection-oriented:
    - handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
- flow controlled:
    - sender will not overwhelm receiver

[Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry the data to the TCP receive buffer, from which the receiving application reads data through its socket door.]

TCP segment structure

Header layout (each row is 32 bits wide):
- source port # | dest port #
- sequence number
- acknowledgement number
- head len | not used | U A P R S F | Receive window
- checksum | Urg data pnter
- Options (variable length)
- application data (variable length)

Field notes:
- sequence number, acknowledgement number: counting by bytes of data (not segments!)
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
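
The header layout above can be read directly with Python's struct module. This is a minimal sketch, not a full parser: the function name parse_tcp_header and the returned dictionary keys are illustrative choices, and it assumes the input holds at least the 20 fixed header bytes.

```python
import struct

# Flag bits in the order shown on the slide: U A P R S F
FLAG_BITS = (("URG", 0x20), ("ACK", 0x10), ("PSH", 0x08),
             ("RST", 0x04), ("SYN", 0x02), ("FIN", 0x01))

def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options and data."""
    # "!" selects network byte order; H = 16 bits, I = 32 bits, B = 8 bits
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     window, checksum, urg_ptr) = struct.unpack("!HHIIBBHHH", segment[:20])
    header_len = (offset_byte >> 4) * 4  # head len is counted in 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": {name: bool(flag_byte & bit) for name, bit in FLAG_BITS},
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }

# Hypothetical segment: ports 1234 -> 80, Seq=42, ACK=79, ACK+PSH set, data 'C'
raw = struct.pack("!HHIIBBHHH", 1234, 80, 42, 79, 5 << 4, 0x18, 4096, 0, 0) + b"C"
fields = parse_tcp_header(raw)
```

The checksum is only extracted here, not verified.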
TCP seq. #’s and ACKs

Seq. #’s:
- byte stream “number” of first byte in segment’s data
ACKs:
- seq # of next byte expected from other side
- cumulative ACK
Q: how receiver handles out-of-order segments
- A: TCP spec doesn’t say, - up to implementor

Simple telnet scenario (Host A, Host B, time running downward):
1. User types ‘C’. A -> B: Seq=42, ACK=79, data = ‘C’
2. Host B ACKs receipt of ‘C’, echoes back ‘C’. B -> A: Seq=79, ACK=43, data = ‘C’
3. Host A ACKs receipt of echoed ‘C’. A -> B: Seq=43, ACK=80
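
The numbers in the telnet scenario follow mechanically from the two rules above. The sketch below reproduces them; make_segment and reply_to are hypothetical helper names, and segments are modelled as plain dictionaries rather than real packets.

```python
def make_segment(seq, ack, data=b""):
    """A segment reduced to the three fields the scenario uses."""
    return {"seq": seq, "ack": ack, "data": data}

def reply_to(segment, data=b""):
    """Build the other side's response to a received segment.

    The reply's Seq is the byte the peer said it expects next (its ACK),
    and the reply's ACK is the next byte expected from the peer.
    """
    return make_segment(
        seq=segment["ack"],
        ack=segment["seq"] + len(segment["data"]),
        data=data,
    )

typed = make_segment(seq=42, ack=79, data=b"C")  # A -> B: user types 'C'
echo = reply_to(typed, data=b"C")                # B -> A: ACK plus echoed 'C'
final = reply_to(echo)                           # A -> B: ACK of the echo
```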

TCP Round Trip Time and Timeout

Q: how to set TCP timeout value?
- longer than RTT
    - but RTT varies
- too short: premature timeout
    - unnecessary retransmissions
- too long: slow reaction to segment loss

Q: how to estimate RTT?
- SampleRTT: measured time from segment transmission until ACK receipt
    - ignore retransmissions
- SampleRTT will vary, want estimated RTT “smoother”
    - average several recent measurements, not just current SampleRTT
TCP Round Trip Time and Timeout

EstimatedRTT = (1- α)*EstimatedRTT + α*SampleRTT

- Exponential weighted moving average
- influence of past sample decreases exponentially fast
- typical value: α = 0.125

Example RTT estimation:

[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. y-axis: RTT (milliseconds), 100 to 350; x-axis: time (seconds), 1 to 106; two curves: SampleRTT and EstimatedRTT.]


TCP Round Trip Time and Timeout

Setting the timeout

- EstimatedRTT plus “safety margin”
    - large variation in EstimatedRTT -> larger safety margin
- first estimate of how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|

    (typically, β = 0.25)

Then set timeout interval:

    TimeoutInterval = EstimatedRTT + 4*DevRTT
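
The three formulas can be combined into a small estimator. This is a sketch under stated assumptions: the class name RttEstimator is illustrative, the initial DevRTT (half of the first sample) is an assumption since the slides do not give starting values, and DevRTT is updated before EstimatedRTT, which is also an ordering the slides leave open.

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout, following the slide formulas."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2  # assumption: not specified on the slides

    def update(self, sample_rtt):
        """Fold one SampleRTT (not from a retransmitted segment) into the state."""
        # Assumption: deviation is measured against the previous EstimatedRTT
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    @property
    def timeout_interval(self):
        return self.estimated_rtt + 4 * self.dev_rtt

est = RttEstimator(first_sample=100.0)  # milliseconds
est.update(200.0)                       # one unusually slow sample
```

A single outlier moves EstimatedRTT only by α of the difference, while DevRTT widens the safety margin noticeably.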

TCP reliable data transfer

- TCP creates rdt service on top of IP’s unreliable service
- Pipelined segments
- Cumulative acks
- TCP uses single retransmission timer
- Retransmissions are triggered by:
    - timeout events
    - duplicate acks
- Initially consider simplified TCP sender:
    - ignore duplicate acks
    - ignore flow control, congestion control
TCP sender events:

data rcvd from app:
- Create segment with seq #
- seq # is byte-stream number of first data byte in segment
- start timer if not already running (think of timer as for oldest unacked segment)
- expiration interval: TimeOutInterval

timeout:
- retransmit segment that caused timeout
- restart timer

Ack rcvd:
- If acknowledges previously unacked segments
    - update what is known to be acked
    - start timer if there are outstanding segments

TCP Sender (simplified)

    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
        switch(event)

        event: data received from application above
            create TCP segment with sequence number NextSeqNum
            if (timer currently not running)
                start timer
            pass segment to IP
            NextSeqNum = NextSeqNum + length(data)

        event: timer timeout
            retransmit not-yet-acknowledged segment with
                smallest sequence number
            start timer

        event: ACK received, with ACK field value of y
            if (y > SendBase) {
                SendBase = y
                if (there are currently not-yet-acknowledged segments)
                    start timer
            }

    } /* end of loop forever */
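
The event loop above can be turned into a small runnable model. The sketch below makes several simplifying assumptions that are not in the slides: the timer is a boolean flag rather than a real clock, send_to_ip is a caller-supplied callback standing in for the IP layer, sequence-number wraparound is ignored, and the class and method names are illustrative.

```python
class SimplifiedTcpSender:
    """Event-driven model of the simplified sender (no dup ACKs, no flow control)."""

    def __init__(self, initial_seq_num, send_to_ip):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}            # seq # -> data of not-yet-acknowledged segments
        self.timer_running = False   # stand-in for a real retransmission timer
        self.send_to_ip = send_to_ip

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True        # start timer
        self.send_to_ip(seq, data)           # pass segment to IP
        self.next_seq_num += len(data)

    def on_timeout(self):
        if self.unacked:
            seq = min(self.unacked)          # smallest unacked sequence number
            self.send_to_ip(seq, self.unacked[seq])
            self.timer_running = True        # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            # Cumulative ACK: drop every segment that ends at or before y
            for seq in [s for s, d in self.unacked.items() if s + len(d) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)

sent = []
sender = SimplifiedTcpSender(92, lambda seq, data: sent.append((seq, len(data))))
sender.on_app_data(b"x" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"y" * 20)   # Seq=100, 20 bytes
sender.on_timeout()             # retransmits Seq=92 only
sender.on_ack(120)              # cumulative ACK covers both segments
```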


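TCP: retransmission scenarios

Lost ACK scenario (Host A, Host B):
- A sends Seq=92, 8 bytes data
- B replies ACK=100; the ACK is lost (X)
- Seq=92 timeout: A resends Seq=92, 8 bytes data
- B replies ACK=100 again; SendBase = 100

Premature timeout scenario (Host A, Host B):
- A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
- B replies ACK=100 and ACK=120
- Seq=92 timeout fires before the ACKs arrive: A resends Seq=92, 8 bytes data
- ACK=100 arrives (SendBase = 100), then ACK=120 (SendBase = 120)
- B answers the retransmitted segment with ACK=120; SendBase = 120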
TCP retransmission scenarios (more)

Cumulative ACK scenario (Host A, Host B):
- A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
- B replies ACK=100; this ACK is lost (X)
- B replies ACK=120, which reaches A before the timeout
- SendBase = 120
TCP ACK generation [RFC 1122, RFC 2581]

Event at Receiver: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP Receiver action: Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK.

Event at Receiver: Arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP Receiver action: Immediately send single cumulative ACK, ACKing both in-order segments.

Event at Receiver: Arrival of out-of-order segment with higher-than-expected seq. #. Gap detected.
TCP Receiver action: Immediately send duplicate ACK, indicating seq. # of next expected byte.

Event at Receiver: Arrival of segment that partially or completely fills gap.
TCP Receiver action: Immediately send ACK, provided that segment starts at lower end of gap.
Fast Retransmit

- Time-out period often relatively long:
    - long delay before resending lost packet
- Detect lost segments via duplicate ACKs.
    - Sender often sends many segments back-to-back
    - If segment is lost, there will likely be many duplicate ACKs.
- If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
    - fast retransmit: resend segment before timer expires
Fast retransmit algorithm:

    event: ACK received, with ACK field value of y
        if (y > SendBase) {
            SendBase = y
            if (there are currently not-yet-acknowledged segments)
                start timer
        }
        else {    /* a duplicate ACK for already ACKed segment */
            increment count of dup ACKs received for y
            if (count of dup ACKs received for y = 3) {
                resend segment with sequence number y    /* fast retransmit */
            }
        }
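
The duplicate-ACK counting can be sketched as follows. The class name FastRetransmitSender and its fields are illustrative; the sketch records which sequence numbers would be resent instead of sending anything, counts duplicates only when y equals SendBase, and resets the counter on each new ACK, which are assumptions the pseudocode leaves implicit.

```python
class FastRetransmitSender:
    """ACK handler with a duplicate-ACK counter (timer handling omitted)."""

    def __init__(self, send_base, segments):
        self.send_base = send_base
        self.segments = dict(segments)  # seq # -> data still unacknowledged
        self.dup_acks = 0
        self.retransmitted = []         # seq #s that fast retransmit resent

    def on_ack(self, y):
        if y > self.send_base:
            # New data acknowledged: advance SendBase, forget old duplicates
            self.send_base = y
            self.dup_acks = 0
            self.segments = {s: d for s, d in self.segments.items() if s >= y}
        elif y == self.send_base:
            # Duplicate ACK for an already ACKed segment
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.segments:
                self.retransmitted.append(y)  # resend segment with seq # y

fr = FastRetransmitSender(100, {100: b"a" * 20, 120: b"b" * 20})
fr.on_ack(100)
fr.on_ack(100)
after_two = list(fr.retransmitted)  # still empty after two duplicates
fr.on_ack(100)                      # third duplicate triggers the resend
```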

TCP Flow Control

- receive side of TCP connection has a receive buffer
- app process may be slow at reading from buffer

flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast

- speed-matching service: matching the send rate to the receiving app’s drain rate
TCP Flow control: how it works

(Suppose TCP receiver discards out-of-order segments)

- spare room in buffer
    = RcvWindow
    = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
    - guarantees receive buffer doesn’t overflow
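
A short worked example of both sides of the rule. The function names and the byte counts are made up for illustration; only the two formulas come from the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room in the receive buffer, in bytes."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, window, n_bytes):
    """Sender side: would n_bytes more keep unACKed data within RcvWindow?"""
    unacked = last_byte_sent - last_byte_acked
    return unacked + n_bytes <= window

# Hypothetical receiver: 4096-byte buffer, 3000 bytes received, 1000 read by the app
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)

# Hypothetical sender with 1000 bytes already in flight
ok_small = sender_may_send(5000, 4000, window, n_bytes=1000)
ok_large = sender_may_send(5000, 4000, window, n_bytes=1097)
```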

TCP Connection Management


Recall: TCP sender, receiver establish “connection” before exchanging data segments
- initialize TCP variables:
    - seq. #s
    - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
    Socket clientSocket = new Socket("hostname","port number");
- server: contacted by client
    Socket connectionSocket = welcomeSocket.accept();

Three way handshake:

Step 1: client host sends TCP SYN segment to server
- specifies initial seq #
- no data

Step 2: server host receives SYN, replies with SYNACK segment
- server allocates buffers
- specifies server initial seq. #

Step 3: client receives SYNACK, replies with ACK segment, which may contain data

[Figure: Three-Way Handshake]
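
The three steps can be traced with concrete numbers. The slides give only the flags, so the acknowledgement values below rely on standard TCP behaviour (a SYN occupies one sequence number); the function name and the dictionary layout are illustrative, and the initial sequence numbers in the usage line are arbitrary.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits wide

def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged, as simple dictionaries."""
    syn = {"flags": {"SYN"}, "seq": client_isn, "ack": None}
    synack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,  # the SYN used up one seq #
    }
    ack = {
        "flags": {"ACK"},
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (server_isn + 1) % SEQ_SPACE,  # the server's SYN used one too
    }
    return [syn, synack, ack]

syn, synack, ack = three_way_handshake(client_isn=1000, server_isn=5000)
```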

TCP Connection Management (cont.)

Closing a connection:

client closes socket: clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server

Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.

[Figure: client/server timeline. Client: close, sends FIN. Server: sends ACK, close, sends FIN. Client: sends ACK, timed wait, closed.]
TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
- Enters “timed wait” - will respond with ACK to received FINs

Step 4: server, receives ACK. Connection closed.

Note: with small modification, can handle simultaneous FINs.

[Figure: client/server timeline. Client: closing, sends FIN. Server: sends ACK, closing, sends FIN. Client: sends ACK, timed wait, closed. Server: closed.]

TCP Connection Management (cont)

[Figure: TCP server lifecycle]

[Figure: TCP client lifecycle]
Principles of Congestion Control

Congestion:
- informally: “too many sources sending too much data too fast for network to handle”
- different from flow control!
- manifestations:
    - lost packets (buffer overflow at routers)
    - long delays (queueing in router buffers)
- a top-10 problem!

Causes/costs of congestion: scenario 1

- two senders, two receivers
- one router, infinite buffers
- no retransmission
- large delays when congested
- maximum achievable throughput

[Figure: Host A and Host B send original data at rate λin into a router with unlimited shared output link buffers; λout is the rate delivered to the receivers.]
Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet
- unneeded retransmissions: link carries multiple copies of pkt

[Figure: Host A and Host B; λin: original data; λ'in: original data, plus retransmitted data; λout; finite shared output link buffers.]

Causes/costs of congestion

[Figure: Host A, Host B, λout]

Another “cost” of congestion:
- when packet dropped, any “upstream” transmission capacity used for that packet was wasted!
Approaches towards congestion control

Two broad approaches towards congestion control:

End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP

Network-assisted congestion control:
- routers provide feedback to end systems
    - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
    - explicit rate sender should send at

TCP Congestion Control

- end-end control (no network assistance)
- sender limits transmission:
    LastByteSent-LastByteAcked ≤ CongWin
- Roughly,
    rate = CongWin/RTT Bytes/sec
- CongWin is dynamic, function of perceived network congestion

How does sender perceive congestion?
- loss event = timeout or 3 duplicate acks
- TCP sender reduces rate (CongWin) after loss event

three mechanisms:
- AIMD
- slow start
- conservative after timeout events
TCP AIMD

multiplicative decrease: cut CongWin in half after loss event

additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing

[Figure: congestion window (marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection.]

TCP Slow Start

- When connection begins, CongWin = 1 MSS
    - Example: MSS = 500 bytes & RTT = 200 msec
    - initial rate = 20 kbps
- available bandwidth may be >> MSS/RTT
    - desirable to quickly ramp up to respectable rate
- When connection begins, increase rate exponentially fast until first loss event
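
The 20 kbps figure is just MSS/RTT expressed in bits per second. A one-function check, with the function name chosen for illustration:

```python
def initial_rate_bps(mss_bytes, rtt_ms):
    """Sending rate with CongWin = 1 MSS: one segment per round trip."""
    bits_per_segment = mss_bytes * 8
    return bits_per_segment * 1000 / rtt_ms  # rtt_ms / 1000 is the RTT in seconds

rate = initial_rate_bps(mss_bytes=500, rtt_ms=200)  # 4000 bits every 0.2 s
```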
TCP Slow Start (more)

- When connection begins, increase rate exponentially until first loss event:
    - double CongWin every RTT
    - done by incrementing CongWin for every ACK received
- Summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A sends one segment to Host B, then two segments one RTT later, then four segments.]
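
Why a per-ACK increment doubles the window each RTT can be seen in a short simulation. The sketch assumes one ACK per segment (no delayed ACKs), no losses, and a window that is always fully used; the function name is illustrative.

```python
def slow_start_rounds(mss, rounds):
    """Return CongWin (bytes) at the start of each RTT during slow start."""
    cong_win = mss
    history = [cong_win]
    for _ in range(rounds):
        acks_this_rtt = cong_win // mss  # one ACK per segment sent this RTT
        for _ in range(acks_this_rtt):
            cong_win += mss              # increment CongWin for every ACK received
        history.append(cong_win)
    return history

windows = slow_start_rounds(mss=500, rounds=3)
```

Each RTT delivers as many ACKs as there were segments in flight, so adding one MSS per ACK adds a full window per RTT.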

Refinement

- After 3 dup ACKs:
    - CongWin is cut in half
    - window then grows linearly
- But after timeout event:
    - CongWin instead set to 1 MSS;
    - window then grows exponentially
    - to a threshold, then grows linearly

Philosophy:
• 3 dup ACKs indicates network capable of delivering some segments
• timeout before 3 dup ACKs is “more alarming”
Refinement (more)

Q: When should the exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.

Implementation:
- Variable Threshold
- At loss event, Threshold is set to 1/2 of CongWin just before loss event

Summary: TCP Congestion Control

- When CongWin is below Threshold, sender in slow-start phase, window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.
- When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold.
- When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS.
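
The four summary rules fit in a small state holder. This is a sketch at per-RTT granularity, not a per-ACK implementation: the class and method names are illustrative, window sizes are in bytes with integer halving, CongWin equal to Threshold is treated as congestion avoidance, and capping slow-start growth at Threshold is an assumption the slides do not spell out.

```python
class CongestionControl:
    """Per-RTT model of slow start, congestion avoidance and loss reactions."""

    def __init__(self, mss, threshold):
        self.mss = mss
        self.cong_win = mss          # connection starts with CongWin = 1 MSS
        self.threshold = threshold

    def on_rtt_without_loss(self):
        if self.cong_win < self.threshold:
            # Slow start: exponential growth (assumption: capped at Threshold)
            self.cong_win = min(self.cong_win * 2, self.threshold)
        else:
            # Congestion avoidance: linear growth, 1 MSS per RTT
            self.cong_win += self.mss

    def on_triple_dup_ack(self):
        self.threshold = self.cong_win // 2
        self.cong_win = self.threshold

    def on_timeout(self):
        self.threshold = self.cong_win // 2
        self.cong_win = self.mss

cc = CongestionControl(mss=1000, threshold=8000)
trace = []
for _ in range(4):
    cc.on_rtt_without_loss()
    trace.append(cc.cong_win)    # three slow-start RTTs, then one linear step
cc.on_triple_dup_ack()
after_dup = (cc.threshold, cc.cong_win)
cc.on_rtt_without_loss()
cc.on_timeout()
after_timeout = (cc.threshold, cc.cong_win)
```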
Congestion Avoidance

- TCP’s strategy
    - control congestion once it happens
    - repeatedly increase load in an effort to find the point at which congestion occurs, and then back off
- Alternative strategy
    - predict when congestion is about to happen
    - reduce rate before packets start being discarded
    - call this congestion avoidance, instead of congestion control

Random Early Detection (RED)

- Notification is implicit
    - just drop the packet (TCP will time out)
    - could make explicit by marking the packet
- Early random drop
    - rather than wait for queue to become full, drop each arriving packet with some drop probability whenever the queue length exceeds some drop level
RED Details

- Compute average queue length

    AvgLen = (1 - Weight) * AvgLen + Weight * SampleLen

    - 0 < Weight < 1 (usually 0.002)
    - SampleLen is queue length each time a packet arrives

[Figure: router queue with MinThreshold and MaxThreshold marked, and AvgLen measured along it.]

RED Details (cont)

- Two queue length thresholds

    if AvgLen <= MinThreshold then
        enqueue the packet
    if MinThreshold < AvgLen < MaxThreshold then
        calculate probability P
        drop arriving packet with probability P
    if MaxThreshold <= AvgLen then
        drop arriving packet
RED Details (cont)

- Drop Probability Curve

[Figure: P(drop) versus AvgLen. y-axis marks: MaxP and 1.0; x-axis marks: MinThresh and MaxThresh.]
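
The RED decision can be sketched in a few functions. The slides only say "calculate probability P"; the linear ramp from 0 at MinThreshold to MaxP at MaxThreshold used below is an assumption about the shape of the curve, and it omits refinements that deployed RED variants add. All function names are illustrative.

```python
import random

def update_avg_len(avg_len, sample_len, weight=0.002):
    """Weighted running average of the queue length, updated per arrival."""
    return (1 - weight) * avg_len + weight * sample_len

def drop_probability(avg_len, min_threshold, max_threshold, max_p):
    """Drop probability for the current average queue length."""
    if avg_len <= min_threshold:
        return 0.0   # always enqueue
    if avg_len >= max_threshold:
        return 1.0   # always drop
    # Assumption: P rises linearly from 0 to max_p between the thresholds
    return max_p * (avg_len - min_threshold) / (max_threshold - min_threshold)

def should_drop(avg_len, min_threshold, max_threshold, max_p, rng=random.random):
    """Randomly decide whether to drop the arriving packet."""
    return rng() < drop_probability(avg_len, min_threshold, max_threshold, max_p)

p_mid = drop_probability(avg_len=15, min_threshold=10, max_threshold=20, max_p=0.1)
```

Passing rng as a parameter keeps the random choice replaceable, which makes the decision easy to test deterministically.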

TCP Vegas

- Idea: source watches for some sign that router’s queue is building up and congestion will happen too; e.g.,
    - RTT grows
TCP Vegas

- Let BaseRTT be the minimum of all measured RTTs (commonly the RTT of the first packet)
- If not overflowing the connection, then

    ExpectRate = CongestionWindow/BaseRTT

- Source calculates sending rate (ActualRate) once per RTT
- Source compares ActualRate with ExpectRate

    Diff = ExpectRate - ActualRate
    if Diff < α
        increase CongestionWindow linearly
    else if Diff > β
        decrease CongestionWindow linearly
    else
        leave CongestionWindow unchanged
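
The comparison rule translates directly into one function. The units are an assumption of this sketch: the window is counted in packets, rates in packets per second, and a "linear" change is taken to be one packet per adjustment (the step parameter). The function name and the sample values are illustrative.

```python
def vegas_adjust(cong_win, base_rtt, actual_rate, alpha, beta, step=1):
    """Return the new CongestionWindow after one Vegas comparison."""
    expect_rate = cong_win / base_rtt
    diff = expect_rate - actual_rate
    if diff < alpha:
        return cong_win + step   # little queueing: increase linearly
    if diff > beta:
        return cong_win - step   # queue building up: decrease linearly
    return cong_win              # between the thresholds: leave unchanged

# Hypothetical connection: window of 10 packets, BaseRTT 0.5 s, so ExpectRate = 20
grown = vegas_adjust(10, 0.5, actual_rate=19, alpha=2, beta=4)
held = vegas_adjust(10, 0.5, actual_rate=17, alpha=2, beta=4)
shrunk = vegas_adjust(10, 0.5, actual_rate=10, alpha=2, beta=4)
```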

TCP Fairness

- Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K
- Practically this does not happen in TCP as connections with lower RTT are able to grab the available link bandwidth more quickly.

[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]
Fairness (more)

Fairness and UDP
- Multimedia apps often do not use TCP
    - do not want rate throttled by congestion control
- Instead use UDP:
    - pump audio/video at constant rate, tolerate packet loss
- Research area: TCP friendly

Fairness and parallel TCP connections
- nothing prevents app from opening parallel cnctions between 2 hosts.
- Web browsers do this
- Example: link of rate R supporting 9 cnctions;
    - new app asks for 1 TCP, gets rate R/10
    - new app asks for 11 TCPs, gets R/2 !
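
The example's arithmetic, assuming the link is split equally per connection (the fairness goal stated earlier). The function name is illustrative; Fraction keeps the shares exact.

```python
from fractions import Fraction

def app_share(existing_connections, new_connections):
    """Fraction of link rate R an app gets when every connection gets an equal share."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)

one_tcp = app_share(existing_connections=9, new_connections=1)
eleven_tcps = app_share(existing_connections=9, new_connections=11)
```

With 11 of 20 connections the app gets 11/20 of R, which the slide rounds to R/2.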

TCP Options: Protection Against Wrap Around Sequence

- 32-bit SequenceNum

Bandwidth             Time Until Wrap Around
T1 (1.5 Mbps)         6.4 hours
Ethernet (10 Mbps)    57 minutes
T3 (45 Mbps)          13 minutes
FDDI (100 Mbps)       6 minutes
STS-3 (155 Mbps)      4 minutes
STS-12 (622 Mbps)     55 seconds
STS-24 (1.2 Gbps)     28 seconds
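
The table entries can be reproduced by dividing the 2^32-byte sequence space by the link rate. This sketch assumes the link is kept fully busy with TCP payload and ignores header overhead; the function name is illustrative.

```python
SEQ_SPACE_BYTES = 2 ** 32  # a 32-bit SequenceNum counts bytes

def wrap_seconds(bits_per_second):
    """Time to send 2**32 bytes at the given rate."""
    return SEQ_SPACE_BYTES * 8 / bits_per_second

t1_hours = wrap_seconds(1.5e6) / 3600
ethernet_minutes = wrap_seconds(10e6) / 60
sts12_seconds = wrap_seconds(622e6)
```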
TCP Options: Keeping the Pipe Full

- 16-bit AdvertisedWindow

Bandwidth             Delay x Bandwidth Product
T1 (1.5 Mbps)         18KB
Ethernet (10 Mbps)    122KB
T3 (45 Mbps)          549KB
FDDI (100 Mbps)       1.2MB
STS-3 (155 Mbps)      1.8MB
STS-12 (622 Mbps)     7.4MB
STS-24 (1.2 Gbps)     14.8MB

assuming 100ms RTT
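
The products in the table, and the reason a 16-bit window is too small for most of these links, can be checked as follows. The function names are illustrative, and KB is taken as 1024 bytes, which matches the table's rounding.

```python
MAX_16_BIT_WINDOW = 2 ** 16 - 1  # largest value a 16-bit AdvertisedWindow can carry

def bdp_bytes(bits_per_second, rtt_ms):
    """Delay x bandwidth product in bytes (integer arithmetic)."""
    return bits_per_second * rtt_ms // 8000  # /1000 for ms, /8 for bits -> bytes

def window_can_fill_pipe(bits_per_second, rtt_ms):
    """True if an unscaled 16-bit window covers the whole pipe."""
    return bdp_bytes(bits_per_second, rtt_ms) <= MAX_16_BIT_WINDOW

t1 = bdp_bytes(1_500_000, 100)
ethernet = bdp_bytes(10_000_000, 100)
```

Under these assumptions only the T1 link fits within an unscaled window; from 10 Mbps upward the pipe holds more than 65,535 bytes.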

TCP Extensions

- Implemented as header options
- Store timestamp in outgoing segments
- Extend sequence space with 32-bit timestamp (PAWS)
- Shift (scale) advertised window
