TCP Protocol: Key Concepts Explained
001. Can you tell me the difference between TCP and UDP?
First, let's summarize the basic differences:
TCP is a connection-oriented, reliable, byte stream-based transport layer protocol.
UDP is a connectionless transport layer protocol. (It really is that simple; it lacks the other features TCP has.)
Specifically, compared with UDP, TCP has three core features:
1. Connection-oriented . The so-called connection refers to the connection between the
client and the server. Before the two parties communicate with each other, TCP requires
a three-way handshake to establish a connection, while UDP does not have a
corresponding connection establishment process.
2. Reliability . TCP takes great pains to ensure a reliable connection. How does this reliability manifest itself? In two aspects: statefulness and controllability.
TCP accurately records which data has been sent, which data has been received, and which
data has not been received, and ensures that data packets arrive in order, without any errors.
This is stateful .
When TCP detects packet loss or poor network conditions, it adjusts its behavior based on
the specific situation, controlling its sending speed or retransmitting packets. This is
controllable .
Correspondingly, UDP is stateless and uncontrollable.
3. Byte stream oriented . UDP transmits data as datagrams because it simply inherits the characteristics of the IP layer, while TCP converts IP packets into byte streams in order to maintain state.
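The distinction is visible right at the socket API. A minimal Python sketch using the standard socket module (creating the sockets requires no network traffic):

```python
import socket

# TCP: connection-oriented byte stream (SOCK_STREAM)
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# UDP: connectionless datagrams (SOCK_DGRAM)
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# A TCP client must connect() first (triggering the three-way handshake)
# before it can send; a UDP socket can sendto() immediately.
print(tcp_sock.type == socket.SOCK_STREAM)  # True
print(udp_sock.type == socket.SOCK_DGRAM)   # True

tcp_sock.close()
udp_sock.close()
```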
When the sender sends a SYN, the receiver also sends a SYN to the sender, and the two become connected!
After sending the SYN, the status of both becomes SYN-SENT.
After each receives the other's SYN message, the status of both becomes SYN-RCVD.
Each then replies with the corresponding ACK + SYN message. After the other party receives it, the status of both parties changes to ESTABLISHED.
This is the state transition when opening at the same time.
003: Talk about the process of TCP's four waves
Process Disassembly
After sending the FIN, the client enters the FIN-WAIT-1 state. Note that at this point the client is also in a half-close state, that is, it cannot send messages to the server and can only receive them.
After receiving it, the server sends a confirmation to the client and changes to the CLOSE-WAIT state.
The client receives the confirmation from the server and enters the FIN-WAIT-2 state.
Then, the server sends a FIN message to the client and enters the LAST-ACK state.
After the client receives the FIN message from the server, it changes to the TIME-WAIT state and then sends an ACK to the server.
Note that at this point, the client needs to wait long enough: specifically, 2 MSL (Maximum Segment Lifetime). During this period, if the client does not receive a retransmitted FIN from the server, it means the ACK arrived successfully and the four waves are over. Otherwise, the client resends the ACK.
The significance of waiting for 2MSL
What happens if you don't wait?
If the client doesn't wait and leaves immediately, the server may still have many packets on their way to the client. If the client's port is then occupied by a new application, that application will receive these useless packets, causing data packet confusion. Therefore, the safest approach is to wait until all packets sent by the server have died out before starting a new application.
So, if one MSL isn't enough, why wait for 2 MSLs?
1 MSL ensures that the last ACK message from the actively closing party can eventually reach the other end.
1 MSL ensures that, if that ACK is not received, the FIN message retransmitted by the peer can still arrive.
This is the meaning of waiting for 2MSL.
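A practical consequence of TIME-WAIT: a server restarted right after closing may fail to bind its old port while the previous connection is still waiting out 2 MSL. A common workaround is the SO_REUSEADDR socket option; a minimal Python sketch (port 0 lets the OS pick a free port):

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Allow bind() to succeed even if the port is still held by a
# connection lingering in TIME-WAIT from a previous run.
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 0))
srv.listen()
print(srv.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0)  # True
srv.close()
```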
Why four waves instead of three?
Because the server often cannot respond with its own FIN immediately after receiving the client's FIN: it must wait until all of its remaining data has been sent before sending the FIN. Therefore, it first sends an ACK indicating that it has received the client's FIN, and sends its FIN later. This results in four waves.
What would be the problem with three waves?
Three waves would mean the server combines its ACK and FIN into a single wave. A long delay before that combined message may then cause the client to mistakenly believe that its own FIN never arrived, so the client would keep retransmitting the FIN.
What happens if they close at the same time?
If the client and server send FIN at the same time, how will the status change? As shown in
the figure:
004: Talk about the relationship between the semi-connection queue
and the SYN Flood attack
Before the three-way handshake, the server's state changes from CLOSED to LISTEN, and two queues are created internally: the semi-connection queue and the full-connection queue , namely the SYN queue and the ACCEPT queue .
Semi-connection queue
When the client sends a SYN to the server, the server replies with SYN + ACK, and its state changes from LISTEN to SYN_RCVD. At this point, the connection is pushed into the SYN queue , which is the semi-connection queue .
Full connection queue
When the client returns ACK and the server receives it, the three-way handshake is
complete. At this point, the connection waits to be taken away by a specific application.
Before being taken away, it is pushed into another queue maintained by TCP, namely the full
connection queue (Accept Queue) .
SYN Flood Attack Principle
A SYN Flood is a typical DoS/DDoS attack. The principle is simple: the client forges a large number of non-existent IP addresses within a short period and frantically sends SYN packets to the server. This has two dangerous consequences for the server:
1. Processing a large number of SYN packets and returning the corresponding SYN + ACK responses inevitably leaves a large number of connections in the SYN_RCVD state, filling up the semi-connection queue and making it impossible to serve normal requests.
2. Since the source IPs do not exist, the server never receives the clients' ACKs, so it keeps retransmitting until its resources are exhausted.
How to deal with SYN Flood attacks?
1. Increasing the number of SYN connections means increasing the capacity of the semi-
connected queue.
2. Reduce the number of SYN + ACK retries to avoid a large number of timeout
retransmissions.
3. Use SYN Cookie technology. With SYN Cookies, the server does not allocate connection resources immediately after receiving a SYN. Instead, it computes a cookie based on the SYN and returns it to the client along with the second handshake. When the client replies with its ACK, it includes this cookie value, and the server verifies that the cookie is legal before allocating connection resources.
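On Linux, the three mitigations above map to well-known kernel parameters. A sketch of a sysctl fragment (the values shown are illustrative, not recommendations):

```
# /etc/sysctl.conf -- Linux knobs matching the three mitigations
net.ipv4.tcp_max_syn_backlog = 2048   # 1. enlarge the semi-connection (SYN) queue
net.ipv4.tcp_synack_retries = 2       # 2. fewer SYN + ACK retransmissions
net.ipv4.tcp_syncookies = 1           # 3. enable SYN Cookies
```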
SYN and ACK have been mentioned above. Of the remaining flags, FIN (Finish) indicates that the sender is ready to disconnect.
In the first round, the server computes a cookie and sends it to the client. The client obtains the cookie value and caches it, completing the three-way handshake normally.
That is how the first three-way handshake works. But subsequent three-way handshakes are different!
Subsequent three-way handshakes
In a subsequent three-way handshake, the client sends the previously cached cookie, the SYN, and the HTTP request (yes, you read that right) to the server. The server verifies the legitimacy of the cookie: if it is illegal, it is discarded directly; if it is legal, the server returns SYN + ACK normally.
Here comes the key point: the server can now send HTTP responses to the client! This is the most significant change. The three-way handshake is not yet complete, yet the HTTP response can be returned after merely verifying the validity of the cookie.
Of course, the client's ACK must still be transmitted normally; otherwise it would not be called a three-way handshake.
The process is as follows:
Note: the client's final handshake ACK does not have to wait for the server's HTTP response to arrive before being sent. The two processes are independent of each other.
Advantages of TFO
The advantage of TFO lies not in the first three-way handshake, but in the subsequent ones. After obtaining and verifying the client's cookie, the server can return the HTTP response directly, saving 1 RTT (Round-Trip Time) by transmitting data in advance, which is a considerable advantage when accumulated.
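Enabling TFO is platform-specific. On Linux, a listening socket opts in via the TCP_FASTOPEN option; the constant may be absent on other systems, hence the guard in this Python sketch:

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen()
# TCP_FASTOPEN is Linux-specific; the value is the maximum number of
# pending TFO requests the kernel will queue for this listener.
if hasattr(socket, "TCP_FASTOPEN"):
    srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_FASTOPEN, 16)
# On Linux, a TFO client can carry data in the SYN itself by calling
# sendto() with the MSG_FASTOPEN flag instead of connect() + send().
srv.close()
```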
Here kind = 8, length = 10, and info consists of two parts, timestamp and timestamp echo , each occupying 4 bytes.
So what are these fields for? What problems do they solve?
Next, let's sort them out one by one. TCP timestamps mainly solve two major problems:
Calculating the Round-Trip Time (RTT)
Preventing serial number wraparound issues
Calculating round-trip time (RTT)
When there is no timestamp, the problem of calculating RTT is shown in the following figure:
If the first packet is used as the start time, the problem shown in the left figure occurs: the measured RTT is obviously too long, since the start time should have been the second packet.
If the second packet is used as the start time, the problem shown in the right figure occurs: the measured RTT is obviously too short, since the start time should have been the first packet.
In fact, whichever packet's send time is used as the start, the result is inaccurate.
At this time, introducing timestamps can solve this problem very well.
For example, if a sends a message s1 to b, and b replies to a with a message s2 containing an
ACK, then:
Step 1: When a sends s1 to b, the timestamp field stores ta1, the kernel time of host a at the moment of sending.
Step 2: When b replies to a with the s2 message, the timestamp field stores host b's time tb, and the timestamp echo field stores ta1, parsed from the s1 message.
Step 3: When a receives the s2 message from b, host a's kernel time is now ta2. From the timestamp echo option of s2, a obtains ta1, the time when s1 was originally sent. The RTT is then obtained directly as ta2 - ta1.
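The three steps reduce to simple arithmetic on one host's clock. A toy Python walkthrough with made-up millisecond values:

```python
# Toy values in milliseconds; ta1/ta2 are host a's clock, tb is host b's.
# The two clocks need not be synchronized: only a's clock enters the result.
ta1 = 1_000          # step 1: a's kernel time when s1 is sent (timestamp field)
tb = 52_000          # step 2: b's own time goes into s2's timestamp field...
ts_echo = ta1        # ...while ta1, copied from s1, goes into timestamp echo
ta2 = 1_080          # step 3: a's kernel time when s2 arrives

rtt = ta2 - ts_echo  # both operands come from host a's clock
print(rtt)           # 80
```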
Preventing sequence number wraparound issues
Now let's simulate this problem.
The range of sequence numbers is actually 0 to 2^32 - 1. For the sake of demonstration, we narrow this range and assume it runs from 0 to 4; when it reaches 4, it wraps back around to 0.
(Table: for each send, the number of the transmission, the bytes sent, the corresponding sequence numbers, and the state, illustrating how the sequence number wraps back to 0.)
Assume that a packet that was previously stuck in the network comes back during the sixth transmission. There are then two data packets carrying bytes 1 ~ 2 with the same sequence number. How do we tell which is which? This is the sequence number wraparound problem.
Timestamps solve this problem nicely: every time a packet is sent, the kernel time of the sending machine is recorded in the message. So even if two packets carry the same sequence number, their timestamps cannot be the same, and the two packets can be distinguished.
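This is the idea behind PAWS (Protect Against Wrapped Sequences). A toy Python sketch with a hypothetical distinguish helper and made-up timestamp values:

```python
def distinguish(seg_a, seg_b):
    """Two segments with the same sequence number are told apart by
    their timestamps: the one stamped later carries the newer data."""
    return seg_a if seg_a["ts"] > seg_b["ts"] else seg_b

# Same sequence number (1), because the space wrapped around,
# but the stale copy was stamped much earlier.
stale = {"seq": 1, "ts": 100}   # delayed copy from the first pass
fresh = {"seq": 1, "ts": 900}   # current data after wraparound

print(distinguish(stale, fresh) is fresh)  # True
```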
Here α is the smoothing factor ; the recommended value is 0.8 , and the range is 0.8 ~ 0.9 .
Note that the α here differs from the one in the classic method. Its recommended value is 1/8, that is, 0.125.
Step 2 : Calculate the intermediate variable RTTVAR (round-trip time variation).
The recommended value for β is 0.25. This value is the highlight of the algorithm: it records the difference between the latest RTT and the current SRTT, giving us a handle for sensing subsequent RTT changes.
Step 3 : Calculate the final RTO :
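Written out in the standard form (RFC 6298, with the recommended α = 1/8 and β = 1/4), the three steps are:

```latex
\begin{aligned}
\mathrm{SRTT}   &= (1 - \alpha)\,\mathrm{SRTT} + \alpha \cdot \mathrm{RTT} \\
\mathrm{RTTVAR} &= (1 - \beta)\,\mathrm{RTTVAR} + \beta \cdot \lvert \mathrm{SRTT} - \mathrm{RTT} \rvert \\
\mathrm{RTO}    &= \mathrm{SRTT} + 4 \cdot \mathrm{RTTVAR}
\end{aligned}
```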
The send window is the area framed in the diagram. SND stands for "send", WND for "window", UNA for "unacknowledged", and NXT for "next send position".
Receive Window
The window structure of the receiving end is as follows:
RCV stands for "receive": NXT indicates the next receive position, and WND indicates the receive window size.
Flow control process
Here we don't use too complicated examples, but use the simplest back and forth simulation
of the flow control process to make it easier for everyone to understand.
First, the two parties perform a three-way handshake and initialize their respective window
sizes to 200 bytes.
If the sender now sends 100 bytes to the receiver, then for the sender, SND.NXT naturally shifts right by 100 bytes, which means the available window shrinks by 100 bytes. This is easy to understand.
Now these 100 bytes have arrived at the receiving end and are placed in its buffer queue. However, being under heavy load, the receiving end cannot process that many bytes; it can only process 40, leaving the remaining 60 bytes in the buffer queue.
Note that the receiving end's processing capacity is insufficient at this moment, so it needs the sender to send less data. Its receive window should therefore shrink: specifically, by 60 bytes, from 200 down to 140 bytes, because 60 bytes are still sitting in the buffer queue, not yet taken away by the application.
The receiving end therefore puts the shrunken window of 140 bytes into the header of its ACK message, and the sending end adjusts its send window to 140 bytes accordingly.
At this point, for the sender, the sent-and-acknowledged portion grows by 40 bytes, that is, SND.UNA shifts right by 40 bytes, and the send window shrinks to 140 bytes.
This is the process of flow control . No matter how many rounds there are, the entire control
process and principle remain the same.
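The round trip above can be checked with a few lines of arithmetic. A Python sketch using the same numbers (200-byte window, 100 bytes sent, 40 processed):

```python
# Numbers from the walkthrough above.
rwnd = 200            # receive window agreed at handshake
buffered = 0          # bytes sitting in the receiver's buffer queue

sent = 100            # sender transmits 100 bytes
processed = 40        # the busy application drains only 40 of them
buffered += sent - processed

advertised = rwnd - buffered   # window carried back in the ACK header
print(advertised)              # 140
```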
Send window = min(rwnd, cwnd)
The smaller of the two values is taken. Congestion control governs how cwnd changes.
Slow Start
When you first start transmitting data, you don't know whether the network is stable or
congested. If you are too aggressive and send packets too quickly, you will suffer from crazy
packet loss, causing an avalanche of network disasters.
Therefore, congestion control first uses a conservative algorithm to slowly adapt to the network. This algorithm is called slow start . It operates as follows:
First, three-way handshake, both parties declare their receiving window size
Both parties initialize their own congestion window (cwnd) size
During the initial transmission period, the congestion window grows by 1 for each ACK the sender receives. In other words, cwnd doubles with each RTT: if the initial window is 10, then after the first round of 10 packets is transmitted and acknowledged, cwnd becomes 20, then 40 in the second round, 80 in the third, and so on.
Will it continue to double indefinitely? Of course not. Its threshold is called the slow-start
threshold . When cwnd reaches this threshold, it's like stepping on the brakes. Don't
increase so quickly, my friend, hold on!
How to control the size of cwnd after reaching the threshold?
This is what congestion avoidance does.
Congestion Avoidance
Originally, cwnd increased by 1 for each ACK received. Now that the threshold has been reached, cwnd can only increase by 1/cwnd per ACK. If you do the math, after one round of RTT, having received cwnd ACKs, the congestion window grows by just 1 in total.
In other words, cwnd used to double every RTT; now it increases by only 1 per RTT.
Of course, slow start and congestion avoidance work together and are integrated.
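A toy Python simulation of the per-RTT growth rule (hypothetical next_cwnd helper; real stacks switch to linear growth exactly at the threshold, while this sketch only checks the threshold at the start of each round):

```python
def next_cwnd(cwnd, ssthresh):
    """One RTT of growth: double below the slow-start threshold,
    linear (+1) once the threshold has been reached."""
    if cwnd < ssthresh:
        return cwnd * 2      # slow start: +1 per ACK => x2 per RTT
    return cwnd + 1          # congestion avoidance: +1/cwnd per ACK => +1 per RTT

cwnd, ssthresh = 10, 64
history = [cwnd]
for _ in range(5):
    cwnd = next_cwnd(cwnd, ssthresh)
    history.append(cwnd)
print(history)  # [10, 20, 40, 80, 81, 82]
```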
Fast retransmit and fast recovery
Fast Retransmit
During TCP transmission, if packet loss occurs, that is, when the receiving end finds that the
data segments do not arrive in order, the receiving end will resend the previous ACK.
For example, if the 5th packet is lost, even if the 6th and 7th packets arrive at the receiving
end, the receiving end will still return an ACK for the 4th packet. When the sending end
receives 3 duplicate ACKs, it realizes that the packet is lost and retransmits it immediately
without waiting for the RTO to expire.
This is fast retransmission , which solves the problem of whether retransmission is
needed .
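The trigger condition can be sketched in a few lines of Python (hypothetical should_fast_retransmit helper; each number stands for the packet being acknowledged):

```python
def should_fast_retransmit(acks):
    """Return True once three duplicate ACKs for the same packet are seen."""
    last, dups = None, 0
    for ack in acks:
        if ack == last:
            dups += 1
            if dups == 3:
                return True
        else:
            last, dups = ack, 0
    return False

# Packet 5 is lost: packets 6 and 7 still elicit ACKs for packet 4.
print(should_fast_retransmit([4, 4, 4, 4]))  # True
print(should_fast_retransmit([4, 5, 6, 7]))  # False
```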
Selective Repeat
You may then ask: since retransmission is needed, should only the 5th packet be retransmitted, or the 5th, 6th, and 7th?
Of course, the 6th and 7th packets have already arrived. The designers of TCP are not stupid: why retransmit what has already been delivered? They simply record which packets have arrived and which have not, and retransmit accordingly.
After receiving a message from the sender, the receiver responds with an ACK message. A SACK attribute can be carried in the options field of the packet header, informing the sender, via left edge / right edge pairs, which ranges of data have been received.
Therefore, even if the fifth packet is lost, the receiver, after getting the sixth and seventh packets, still notifies the sender that those two have arrived; since the fifth has not, the sender retransmits only that one. This mechanism is called Selective Acknowledgment (SACK) , and it solves the problem of how to retransmit .
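The selection logic reduces to filtering the in-flight packets against the SACK ranges. A Python sketch with a hypothetical to_retransmit helper:

```python
def to_retransmit(outstanding, sack_blocks):
    """Keep only the outstanding packets not covered by any
    (left_edge, right_edge) SACK range."""
    return [p for p in outstanding
            if not any(left <= p <= right for left, right in sack_blocks)]

# Packets 5-7 are in flight; the receiver SACKs the range 6-7.
print(to_retransmit([5, 6, 7], [(6, 7)]))  # [5]
```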
Fast recovery
After receiving three duplicate ACKs, the sender realizes a packet was lost, concludes that the network is somewhat congested, and enters the fast recovery phase.
At this stage, the sender changes as follows:
The congestion threshold is reduced to half of cwnd
The size of cwnd becomes the congestion threshold
cwnd increases linearly
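The three adjustments above, as straight-line Python (toy values; some implementations additionally inflate cwnd by three segments for the three duplicate ACKs already received):

```python
cwnd, ssthresh = 40, 64

# On the third duplicate ACK:
ssthresh = cwnd // 2   # congestion threshold drops to half of cwnd -> 20
cwnd = ssthresh        # cwnd is set to the new threshold -> 20
print(cwnd, ssthresh)  # 20 20
# From here cwnd grows linearly, i.e. congestion avoidance (+1 per RTT).
```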
The above are the classic algorithms of TCP congestion control: slow start , congestion
avoidance , fast retransmit and fast recovery .
011: Can you talk about Nagle algorithm and delayed confirmation?
Nagle's algorithm
Imagine a scenario where the sender continuously sends small packets to the receiver, each
containing only one byte. This means that sending 1,000 bytes requires 1,000 transmissions.
This frequent transmission is problematic, not only because of the transmission latency itself, but also because the sending and acknowledgment processes take time; frequent sends and receives introduce significant overhead.
And avoiding the frequent sending of small packets is what the Nagle algorithm does.
Specifically, the rules of Nagle's algorithm are as follows:
When sending data for the first time, there is no need to wait, even a small packet of 1
byte is sent immediately
After that, data is sent only when one of the following conditions is met:
The accumulated data reaches the maximum segment size (MSS)
ACKs for all previously sent packets have been received
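Nagle's algorithm can be switched off per socket with the standard TCP_NODELAY option, which latency-sensitive applications commonly do. A minimal Python sketch:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Disable Nagle: small writes go out immediately instead of
# being coalesced while waiting for outstanding ACKs.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True
s.close()
```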
Delayed confirmation
Imagine a scenario like this: I receive a packet from the sender, and then receive a second
packet in a very short time. Should I reply one by one, or wait a while, merge the ACKs of the
two packets, and reply together?
Delayed ACKs do the latter: they delay the message slightly, combine the ACKs, and finally
reply to the sender. TCP requires this delay to be less than 500ms, and most operating
systems implement it within 200ms.
However, it is important to note that there are some scenarios where confirmation cannot be
delayed and you must reply immediately after receiving it:
A message larger than one frame was received and the window size needs to be
adjusted.
TCP is in quickack mode (via tcp_in_quickack_mode settings)
Out-of-order packets were found
What happens when the two are used together?
The former delays sending and the latter delays acknowledging, which compounds into larger delays and performance problems.
However, the reality is that most applications do not enable TCP's keep-alive option by default. Why?
From the application perspective:
7200 s means probing once every two hours, which is too long.
If the interval were shorter, it would defeat the feature's original design intention of detecting long-dead connections.
Therefore, it is a rather awkward design.
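Applications that do want keep-alive usually enable it per socket and, where the platform allows, shorten the 7200-second default. A Python sketch (the TCP_KEEP* constants are Linux-specific, hence the guards):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # turn probing on

# Linux-specific knobs that shorten the default idle period;
# the names vary by platform, so each is guarded.
if hasattr(socket, "TCP_KEEPIDLE"):
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle secs before first probe
if hasattr(socket, "TCP_KEEPINTVL"):
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # secs between probes
if hasattr(socket, "TCP_KEEPCNT"):
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before drop

print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)  # True
s.close()
```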
Finally
This article was first published on my blog . If you find it helpful, please give it a star. Thank
you very much.
Next Issue: HTTP Protocol
References:
Detailed Explanation of Web Protocols and Practical Packet Capture, Tao Hui
Interesting Talk on Network Protocols, Liu Chao
Nuggets booklet "In-depth Understanding of TCP Protocol: From Principle to Practice"
The BBR congestion control algorithm paper