[ovs-discuss] OVN does not work with vlans when CX5 does UDP tx checksum offload on OEL 7.7 (RHEL 7.7 based) / OEL 7.9 (RHEL 7.9) based
Brendan Doyle
brendan.doyle at oracle.com
Wed May 5 15:00:29 UTC 2021
Folks,
I posted a question to this alias a while back with the subject:
"TCP tunnel traffic stops working when move from RHEL 7.7 to 7.9"
I finally got to the bottom of this and discovered that the issue is
with UDP checksum offload when the underlay is in a vlan, which seems
to break OVN. Is this a known issue?
To cut to the chase I got things working with the following command on
each chassis:
*ethtool --offload genev_sys_6081 rx on tx off*
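For anyone who needs to apply this on several chassis, here is a minimal Python sketch that builds the ethtool invocations for inspecting and changing the offload settings. The helper names and the dry-run default are hypothetical and not part of OVS or OVN; `--show-offload` and `--offload` are the long forms of ethtool's `-k` and `-K`.

```python
import shlex
import subprocess

GENEVE_DEV = "genev_sys_6081"  # tunnel device name taken from the command above


def show_offload_cmd(dev=GENEVE_DEV):
    """Command that lists the current offload features of dev."""
    return ["ethtool", "--show-offload", dev]


def set_offload_cmd(dev=GENEVE_DEV, rx=True, tx=False):
    """Command equivalent to: ethtool --offload <dev> rx on tx off."""
    return ["ethtool", "--offload", dev,
            "rx", "on" if rx else "off",
            "tx", "on" if tx else "off"]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False (needs root)."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
```

Calling `run(set_offload_cmd())` only prints the command; pass `dry_run=False` to execute it. ethtool settings generally do not survive a reboot, so the change would need to be reapplied after the tunnel device is recreated.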
When I looked at the tcpdumps on the underlay NIC I noticed that on the
old working OS (OEL 7.7, RHEL 7.7 based) the outer UDP packet always had
"[udp sum ok]", meaning that the OS was doing the checksum, whereas on the
new broken OS (OEL 7.9, RHEL 7.9 based) the first few packets had
"[bad udp cksum ...]" and these packets got through, but the next few had
"[udp sum ok]" and these did not get through to the other chassis across
the tunnel. Oddly, when I removed the vlan, with no ethtool changes,
things worked. It only broke when there was a vlan in the mix. Then, after
much trial and error with ethtool settings on the NIC, the VIF, ovs-system
and finally *genev_sys_6081*, I got it to work.
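The comparison described above can be automated. Below is a minimal Python sketch, assuming verbose tcpdump text output in the same format as the traces quoted further down, that tallies the outer UDP checksum state per tunnel direction. The function name and regular expression are hypothetical. Keep in mind that on the sending host a "bad" checksum usually just means the capture was taken before the NIC filled it in, so the receiver-side tally is the more telling one.

```python
import re
from collections import Counter

# Outer Geneve lines look like:
#   253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xfc99 -> ...
#   253.255.0.18.28454 > 253.255.0.21.6081: [udp sum ok] Geneve, Flags ...
OUTER = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.\d+ > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.6081: "
    r"\[(?P<state>bad udp cksum|udp sum ok)"
)


def classify(lines):
    """Count outer UDP checksum states per (src, dst, state) triple.

    Only packets addressed to the Geneve port 6081 are counted, so the
    inner TCP lines are ignored.
    """
    counts = Counter()
    for line in lines:
        m = OUTER.search(line)
        if m:
            counts[(m["src"], m["dst"], m["state"])] += 1
    return counts
```

Passing the lines of a saved capture dump (or `sys.stdin`) to `classify` gives one counter entry per direction and checksum state.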
Seems like a bit of a performance limitation that OVN does not work with
NIC checksum offload?
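For context on the cost: with tx checksum offload turned off, checksums the hardware would otherwise fill in are computed on the host CPU. The sketch below shows the arithmetic behind the UDP checksum that tcpdump validates when it prints "[udp sum ok]" (the RFC 768 one's-complement sum over the IPv4 pseudo-header, UDP header and payload). It is for illustration only, the function names are hypothetical, and it does not reflect how the kernel or the NIC implements it.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_len: int) -> bytes:
    """IPv4 pseudo-header: src, dst, zero byte, protocol 17, UDP length."""
    return struct.pack("!4s4sBBH",
                       socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                       0, socket.IPPROTO_UDP, udp_len)


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum for a UDP segment whose checksum field is currently zero."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    csum = ~ones_complement_sum(data) & 0xFFFF
    return csum or 0xFFFF  # 0 on the wire means "no checksum" for IPv4


def udp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """True if the segment, including its checksum field, sums to 0xFFFF."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF
```

A receiver runs the equivalent of `udp_checksum_ok` on every packet; a sender without offload runs the equivalent of `udp_checksum` before transmitting.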
Brendan
On 29/04/2021 10:54, Brendan Doyle wrote:
> Hi Folks,
>
> In a very basic OVN config, where I have two VMs on different chassis:
>
> switch 7b89d593-05f3-41a7-a246-8dade975df48 (ls_vcn1)
>     port a6a358c5-5db4-49c7-b68a-3a7429161ab4
>         addresses: ["52:54:00:71:ad:a0 192.16.1.5"]
>     port b6c5ef1a-acd9-4053-9986-88e1a6a12b81
>         addresses: ["52:54:00:40:8f:dc 192.16.1.6"]
>
> When I upgrade the chassis from OEL 7.7 (RHEL 7.7 based) to OEL 7.9
> (RHEL 7.9 based), TCP traffic stops working; ping and UDP are fine.
> When I look at tcpdump of the traffic on both chassis, I see the
> initial handshake encapsulated traffic being sent and received on both
> nodes. The initial TCP handshake seems to get through on the sender and
> it sends the first data packet, but the receive side does not get the
> data packets and keeps sending the initial handshake ack
> (see traces below).
>
> I'm thinking it is something to do with TCP checksum or some other NIC
> offload? The NICs are CX5s.
> Just wondering, has anyone come across this?
>
> Thanks
>
> Brendan
>
>
> Sender
> ---------
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length
> 132: (tos 0x0, ttl 64, id 29694, offset 0, flags [DF], proto UDP (17),
> length 118)
> 253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xfc99 ->
> 0xa576!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data
> 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4
> (0x0800), length 74: (tos 0x0, ttl 64, id 61068, offset 0, flags [DF],
> proto TCP (6), length 60)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [S], cksum 0x0a2b
> (correct), seq 3225335796, win 27200, options [mss 1360,sackOK,TS val
> 1242625918 ecr 0,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length
> 132: (tos 0x0, ttl 64, id 5167, offset 0, flags [DF], proto UDP (17),
> length 118)
> 253.255.0.18.28454 > 253.255.0.21.6081: [udp sum ok] Geneve, Flags
> [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00020001]
> 52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4
> (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> proto TCP (6), length 60)
> 192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0xb82f
> (correct), seq 3217262113, ack 3225335797, win 26960, options [mss
> 1360,sackOK,TS val 3343009202 ecr 1242625918,nop,wscale 7], length 0
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length
> 124: (tos 0x0, ttl 64, id 29695, offset 0, flags [DF], proto UDP (17),
> length 110)
> 253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa57e ->
> 0x723d!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data
> 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4
> (0x0800), length 66: (tos 0x0, ttl 64, id 61069, offset 0, flags [DF],
> proto TCP (6), length 52)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [.], cksum 0x8252
> (incorrect -> 0x4f11), seq 1, ack 1, win 213, options [nop,nop,TS val
> 1242625920 ecr 3343009202], length 0
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length
> 145: (tos 0x0, ttl 64, id 29696, offset 0, flags [DF], proto UDP (17),
> length 131)
> 253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa569 ->
> 0xae4d!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data
> 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4
> (0x0800), length 87: (tos 0x0, ttl 64, id 61070, offset 0, flags [DF],
> proto TCP (6), length 73)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [P.], cksum 0x8267
> (incorrect -> 0x8b4b), seq 1:22, ack 1, win 213, options [nop,nop,TS
> val 1242625920 ecr 3343009202], length 21
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length
> 145: (tos 0x0, ttl 64, id 29775, offset 0, flags [DF], proto UDP (17),
> length 131)
> 253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa569 ->
> 0xad7f!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data
> 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4
> (0x0800), length 87: (tos 0x0, ttl 64, id 61071, offset 0, flags [DF],
> proto TCP (6), length 73)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [P.], cksum 0x8267
> (incorrect -> 0x8a7d), seq 1:22, ack 1, win 213, options [nop,nop,TS
> val 1242626126 ecr 3343009202], length 21
>
> This just repeats; I don't see anything else from the receiver.
>
> Receiver
> ------------
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length
> 132: (tos 0x0, ttl 64, id 29694, offset 0, flags [DF], proto UDP (17),
> length 118)
> 253.255.0.21.62384 > 253.255.0.18.6081: [udp sum ok] Geneve, Flags
> [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00010002]
> 52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4
> (0x0800), length 74: (tos 0x0, ttl 64, id 61068, offset 0, flags [DF],
> proto TCP (6), length 60)
> 192.16.1.6.38900 > 192.16.1.5.22: Flags [S], cksum 0x0a2b
> (correct), seq 3225335796, win 27200, options [mss 1360,sackOK,TS val
> 1242625918 ecr 0,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length
> 132: (tos 0x0, ttl 64, id 5167, offset 0, flags [DF], proto UDP (17),
> length 118)
> 253.255.0.18.28454 > 253.255.0.21.6081: [bad udp cksum 0xfc99 ->
> 0x2a01!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data
> 00020001]
> 52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4
> (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> proto TCP (6), length 60)
> 192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0xb82f
> (correct), seq 3217262113, ack 3225335797, win 26960, options [mss
> 1360,sackOK,TS val 3343009202 ecr 1242625918,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length
> 132: (tos 0x0, ttl 64, id 6137, offset 0, flags [DF], proto UDP (17),
> length 118)
> 253.255.0.18.28454 > 253.255.0.21.6081: [bad udp cksum 0x2a01 ->
> 0x5bc0!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data
> 00020001]
> 52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4
> (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> proto TCP (6), length 60)
> 192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0x825a
> (incorrect -> 0xb419), seq 3217262113, ack 3225335797, win 26960,
> options [mss 1360,sackOK,TS val 3343010248 ecr 1242625918,nop,wscale
> 7], length 0
>
>
> This repeats; I don't see anything else from the sender.
>