TCP handshake: what happens when the client's third ACK arrives after the data packet?

Posted by Kird on 2020-09-28

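The question

I came across an interesting question on Zhihu: during the TCP handshake, what happens if the client's third ACK arrives later than its first data packet? Some of the answers there show an incomplete understanding of TCP, so I used packetdrill to reproduce the situation and see how the system actually handles it.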

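Conclusion first

Conclusion: as long as the data packet that arrives first carries the correct ack number (the same one the bare third-handshake ACK would carry), the connection is established normally.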
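The rule in the conclusion can be written down as a tiny decision function. This is only a simplified model of what the captures in this post show, not kernel code; the names `Segment` and `handle_in_syn_recv` are made up for illustration, and the numbers are relative sequence numbers as used in packetdrill scripts.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int          # relative sequence number, as in the packetdrill scripts
    ack: int          # relative acknowledgment number
    payload_len: int  # bytes of data carried by the segment


def handle_in_syn_recv(seg: Segment, syn_ack_seq: int = 0) -> str:
    """Model the server's reaction to the first segment after its SYN-ACK.

    Assumption: the only ack value that completes the handshake is the
    SYN-ACK's sequence number plus one, because the SYN consumes one
    sequence number.
    """
    expected_ack = syn_ack_seq + 1
    if seg.ack != expected_ack:
        return "RST"
    if seg.payload_len > 0:
        # The handshake completes and the payload is accepted in one step.
        return "ESTABLISHED, data queued"
    return "ESTABLISHED"


print(handle_in_syn_recv(Segment(seq=1, ack=1, payload_len=0)))    # bare third ACK
print(handle_in_syn_recv(Segment(seq=1, ack=1, payload_len=200)))  # data first, correct ack
print(handle_in_syn_recv(Segment(seq=1, ack=0, payload_len=200)))  # data first, ack 0
```

Whether the segment carries data plays no part in the decision to establish the connection; only the ack number does.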

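Reproducing with packetdrill

The scenarios below are simulated with packetdrill.

In the captures that follow, 192.0.2.1 is the client and the other address is the server (192.0.2.1 is the address packetdrill sets up on the local system to play the client).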

Normal case

First, a simulation of the normal sequence:

--tolerance_usecs=1000000
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 7>

+0.5 < . 1:1(0) ack 1 win 257 // the third-handshake ACK
+0 < P. 1:201(200) win 4000 // first data packet the client sends after the handshake

+0.1 > . 1:1(0) ack 201
+.1 < F. 201:201(0) win 65535 <mss 100>
+0 > . 1:1(0) ack 202 <...>
+0 `sleep 100`

The exchange as seen by tcpdump:

[root@0xfe.com.cn ~]# tcpdump -i any -n host 192.0.2.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
14:20:03.411381 IP 192.0.2.1.40336 > 192.168.234.229.webcache: Flags [S], seq 0, win 32792, options [mss 1000,nop,wscale 7], length 0
14:20:03.411422 IP 192.168.234.229.webcache > 192.0.2.1.40336: Flags [S.], seq 930521400, ack 1, win 29200, options [mss 1460,nop,wscale 7], length 0
14:20:03.911788 IP 192.0.2.1.40336 > 192.168.234.229.webcache: Flags [.], ack 1, win 257, length 0
14:20:03.911865 IP 192.0.2.1.40336 > 192.168.234.229.webcache: Flags [P.], seq 1:201, ack 0, win 4000, length 200: HTTP
14:20:03.911891 IP 192.168.234.229.webcache > 192.0.2.1.40336: Flags [.], ack 201, win 237, length 0
14:20:04.012264 IP 192.0.2.1.40336 > 192.168.234.229.webcache: Flags [F.], seq 201, ack 0, win 65535, options [mss 100], length 0
14:20:04.052010 IP 192.168.234.229.webcache > 192.0.2.1.40336: Flags [.], ack 202, win 237, length 0
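When comparing several captures like this one, it can help to check them mechanically rather than by eye. The sketch below parses tcpdump lines in the format shown in this post and reports whether the server ever sent a reset. The regular expression and the helper names `parse_line` and `server_sent_rst` are my own assumptions for illustration; other tcpdump output formats are not handled.

```python
import re

# Matches only the line layout shown in this post:
#   "<time> IP <src> > <dst>: Flags [..], seq N[:M], ack N, ..."
LINE_RE = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]+)\]"
    r"(?:, seq (?P<seq>\d+(?::\d+)?))?"
    r"(?:, ack (?P<ack>\d+))?"
)


def parse_line(line):
    """Return src, dst, flags, seq and ack as strings, or None for non-packet lines."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


def server_sent_rst(lines, client_ip="192.0.2.1"):
    """True if any packet not sent by client_ip has the R flag set."""
    for line in lines:
        pkt = parse_line(line)
        if pkt is None:
            continue  # header lines such as "listening on any, ..."
        from_client = pkt["src"].startswith(client_ip + ".")
        if not from_client and "R" in pkt["flags"]:
            return True
    return False


capture = [
    "14:20:03.911788 IP 192.0.2.1.40336 > 192.168.234.229.webcache: Flags [.], ack 1, win 257, length 0",
    "14:20:03.911865 IP 192.0.2.1.40336 > 192.168.234.229.webcache: Flags [P.], seq 1:201, ack 0, win 4000, length 200: HTTP",
    "14:20:03.911891 IP 192.168.234.229.webcache > 192.0.2.1.40336: Flags [.], ack 201, win 237, length 0",
]
print(server_sent_rst(capture))
```

The fields stay as strings on purpose, since `seq` can be either a single number or a `start:end` range.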

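Abnormal case: the third ACK arrives late

Reproducing the reordering

  1. The packetdrill script, with the arrival order of the two packets swapped: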
--tolerance_usecs=1000000
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 7>

+0 < P. 1:201(200) win 4000 // the client's data arrives first
+0.5 < . 1:1(0) ack 1 win 257 // the third-handshake ACK arrives slightly later

+0.1 > . 1:1(0) ack 201
+.1 < F. 201:201(0) win 65535 <mss 100>
+0 > . 1:1(0) ack 202 <...>
+0 `sleep 100`

  2. The resulting capture:

14:21:48.191298 IP 192.0.2.1.38177 > 192.168.171.254.webcache: Flags [S], seq 0, win 32792, options [mss 1000,nop,wscale 7], length 0
14:21:48.191326 IP 192.168.171.254.webcache > 192.0.2.1.38177: Flags [S.], seq 980495874, ack 1, win 29200, options [mss 1460,nop,wscale 7], length 0
14:21:48.191418 IP 192.0.2.1.38177 > 192.168.171.254.webcache: Flags [P.], seq 1:201, ack 0, win 4000, length 200: HTTP
14:21:48.191427 IP 192.168.171.254.webcache > 192.0.2.1.38177: Flags [R], seq 980495874, win 0, length 0
14:21:48.691790 IP 192.0.2.1.38177 > 192.168.171.254.webcache: Flags [.], ack 1, win 257, length 0

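As the capture shows, in this case the server sends an RST and resets the socket. Attentive readers may have noticed, though, that the data packet constructed above carries ack number 0, whereas a real client would send 1. An ack of 0 does not acknowledge the server's SYN, so let's correct it and try again.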
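The RST in the capture carries seq 980495874, which is exactly the seq of the server's SYN-ACK. That fits the usual TCP rule that a reset sent in reply to a segment with an unacceptable ACK takes its sequence number from that segment's ack field. The sketch below reproduces the arithmetic; `reset_for_bad_ack` is a hypothetical helper, and it assumes that tcpdump's "ack 0" is relative to the server's initial sequence number.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def reset_for_bad_ack(server_isn, seg_ack):
    """Return the reset a server in SYN_RECV would send, or None if the ack is fine.

    server_isn: raw sequence number of the server's SYN-ACK
    seg_ack:    raw ack number of the incoming segment
    """
    expected = (server_isn + 1) % SEQ_MOD  # the SYN consumes one sequence number
    if seg_ack == expected:
        return None
    # The reset's seq is copied from the offending segment's ack field.
    return {"flags": "R", "seq": seg_ack, "win": 0}


server_isn = 980495874  # seq of the SYN-ACK in the capture above
bad_ack = server_isn    # assumption: tcpdump's relative "ack 0"
good_ack = (server_isn + 1) % SEQ_MOD  # tcpdump's relative "ack 1"

print(reset_for_bad_ack(server_isn, bad_ack))
print(reset_for_bad_ack(server_isn, good_ack))
```

With the capture's numbers the first call yields a reset with seq 980495874 and win 0, matching the RST line, and the second yields no reset at all.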

The modified script:

--tolerance_usecs=1000000
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 7>
+0 < P. 1:201(200) ack 1 win 4000 // the client's data arrives first; ack corrected to the right value, 1
+0.5 < . 1:1(0) ack 1 win 257 // the third-handshake ACK arrives slightly later
+0.1 > . 1:1(0) ack 201
+.1 < F. 201:201(0) win 65535 <mss 100>
+0 > . 1:1(0) ack 202 <...>
//0.000...0.200 accept(3, ..., ...) = 4

+0 `sleep 100`

This time the connection is established normally:

14:36:52.761193 IP 192.0.2.1.35618 > 192.168.150.129.webcache: Flags [S], seq 0, win 32792, options [mss 1000,nop,wscale 7], length 0
14:36:52.761233 IP 192.168.150.129.webcache > 192.0.2.1.35618: Flags [S.], seq 2972813010, ack 1, win 29200, options [mss 1460,nop,wscale 7], length 0
14:36:52.761309 IP 192.0.2.1.35618 > 192.168.150.129.webcache: Flags [P.], seq 1:201, ack 1, win 4000, length 200: HTTP #the connection can be established
14:36:52.761325 IP 192.168.150.129.webcache > 192.0.2.1.35618: Flags [.], ack 201, win 237, length 0 #the server returns an ACK
14:36:53.261580 IP 192.0.2.1.35618 > 192.168.150.129.webcache: Flags [.], ack 1, win 257, length 0 #the server receives the client's third-handshake ACK
14:36:53.261622 IP 192.168.150.129.webcache > 192.0.2.1.35618: Flags [.], ack 201, win 237, length 0 #its sequence number is stale, so the server re-sends its ACK for 201
14:36:53.362248 IP 192.0.2.1.35618 > 192.168.150.129.webcache: Flags [F.], seq 201, ack 0, win 65535, options [mss 100], length 0 #the client sends a FIN with seq 201
14:36:53.403023 IP 192.168.150.129.webcache > 192.0.2.1.35618: Flags [.], ack 202, win 237, length 0

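As long as the data packet's ack number is correct, it makes no difference whether the third-handshake ACK arrives first, arrives later, or is dropped altogether: the connection is still established, and the server does not retry either. In the normal case the first data packet after the handshake carries the same ack number as the third-handshake ACK, so nothing changes. When the late ACK does show up, the server simply re-sends its current ACK.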
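Why the late ACK only triggers a repeated ACK can be illustrated with TCP's standard segment acceptability test: once the connection is established, a segment whose sequence number falls outside the receive window is not processed and is answered with an ACK for the next expected byte. The sketch below is a simplified model, not kernel code. The function names are hypothetical, and the window size is an assumption derived from the capture (win 237 with wscale 7).

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def in_window(seq, start, size):
    """True if seq lies in [start, start + size), using modular arithmetic."""
    return (seq - start) % SEQ_MOD < size


def segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """The four-case acceptability test on segment length and receive window."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seg_seq == rcv_nxt
        return in_window(seg_seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        return False
    last = (seg_seq + seg_len - 1) % SEQ_MOD
    return in_window(seg_seq, rcv_nxt, rcv_wnd) or in_window(last, rcv_nxt, rcv_wnd)


def on_segment(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """Describe what an established receiver does with an incoming segment."""
    if segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
        return "process"
    return f"send ACK {rcv_nxt}"


rcv_nxt = 201       # the 200 data bytes (seq 1..200) were already received
rcv_wnd = 237 << 7  # assumption: advertised win 237 scaled by wscale 7

print(on_segment(1, 0, rcv_nxt, rcv_wnd))    # the late third-handshake ACK
print(on_segment(201, 1, rcv_nxt, rcv_wnd))  # the FIN, which occupies one sequence number
```

The late ACK has seq 1 and no payload, so it sits entirely before the expected 201 and produces the repeated ACK seen in the capture, while the FIN at seq 201 is in window and is processed.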


