[ovs-discuss] OVN does not work with vlans when CX5 does UDP tx checksum offload on OEL 7.7 (RHEL 7.7 based) / OEL 7.9 (RHEL 7.9 based)

Brendan Doyle brendan.doyle at oracle.com
Wed May 5 15:00:29 UTC 2021


Folks,

I posted a question to this alias a while back with the subject:
  "TCP tunnel traffic stops working when move from RHEL 7.7 to 7.9"

I finally got to the bottom of this and discovered that the issue is
with UDP checksum offload when the underlay is in a vlan, which seems
to break OVN. Is this a known issue?

To cut to the chase I got things working with the following command on 
each chassis:

    ethtool --offload genev_sys_6081 rx on tx off
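
For anyone wanting to try the same workaround, here is a minimal sketch that wraps the command above. The function names and the TUNNEL_DEV and APPLY variables are my own placeholders, not anything OVS or OVN provides. By default it only prints the command; with APPLY=1 (as root) it runs it and then shows the resulting rx/tx checksumming state. I would assume the setting has to be reapplied after a reboot or whenever the tunnel device is re-created, but I have not verified that.

```shell
#!/bin/sh
# Sketch only: print (or, with APPLY=1, run) the workaround from this post.
# TUNNEL_DEV defaults to the Geneve device name seen in this thread.
TUNNEL_DEV="${TUNNEL_DEV:-genev_sys_6081}"

offload_cmd() {
    # Build the ethtool invocation quoted above for the given device.
    printf 'ethtool --offload %s rx on tx off\n' "$1"
}

apply_workaround() {
    cmd=$(offload_cmd "$1")
    if [ "${APPLY:-0}" = "1" ]; then
        # Needs root. Changes the running device, then prints the
        # top-level rx-checksumming / tx-checksumming lines to confirm.
        $cmd && ethtool --show-offload "$1" | grep -E '^(rx|tx)-checksumming'
    else
        echo "dry run: $cmd"
    fi
}

apply_workaround "$TUNNEL_DEV"
```

Run it once per chassis, for example as `APPLY=1 sh ./geneve-offload.sh` (the script name is arbitrary).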

When I looked at the tcpdumps on the underlay NIC, I noticed that on the
old, working OS (OEL 7.7, RHEL 7.7 based) the outer UDP packet always
had "[udp sum ok]", meaning that the OS was doing the checksum. On the
new, broken OS (OEL 7.9, RHEL 7.9 based) the first few packets had
"[bad udp cksum" and these got through, but the next few had
"[udp sum ok]" and these did not get through to the other chassis
across the tunnel. Oddly, when I removed the vlan, things worked with no
ethtool changes; it only broke when there was a vlan in the mix. After
much trial and error with ethtool settings on the NIC, the VIF,
ovs-system and finally genev_sys_6081, I got it to work.
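
To make the comparison above easier to repeat, here is a rough sketch that tallies the outer-UDP checksum verdicts in tcpdump's verbose text output. count_cksum and UNDERLAY_IF are hypothetical names of mine. Bear in mind that a capture on the sending host normally sees packets before the NIC fills in an offloaded checksum, so "bad udp cksum" on transmit is usually just a sign that offload is in use rather than evidence of corruption on the wire.

```shell
#!/bin/sh
# Sketch only: count "[udp sum ok]" and "[bad udp cksum" lines in
# tcpdump -vv text read from stdin. Inner TCP "cksum ... (incorrect"
# verdicts are deliberately not counted.
count_cksum() {
    awk '/\[udp sum ok\]/   { ok++ }
         /\[bad udp cksum/  { bad++ }
         END { printf "ok=%d bad=%d\n", ok, bad }'
}

# Example use on the underlay NIC (needs root; UNDERLAY_IF is a
# placeholder). The filter covers both untagged and vlan-tagged
# Geneve traffic on UDP port 6081:
#   tcpdump -l -nn -e -vv -c 100 -i "$UNDERLAY_IF" \
#       'udp port 6081 or (vlan and udp port 6081)' | count_cksum
```

Comparing the counts on both chassis, with and without the vlan, should show whether the pattern described above holds on other setups.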

It seems like a bit of a performance limitation if OVN does not work
with NIC checksum offload?

Brendan


On 29/04/2021 10:54, Brendan Doyle wrote:
> Hi Folks,
>
> In a very basic OVN config, where I have two VMs on different chassis:
>
> switch 7b89d593-05f3-41a7-a246-8dade975df48 (ls_vcn1)
>     port a6a358c5-5db4-49c7-b68a-3a7429161ab4
>         addresses: ["52:54:00:71:ad:a0 192.16.1.5"]
>     port b6c5ef1a-acd9-4053-9986-88e1a6a12b81
>         addresses: ["52:54:00:40:8f:dc 192.16.1.6"]
>
> When I upgrade the chassis from OEL 7.7 (RHEL 7.7 based) to OEL 7.9
> (RHEL 7.9 based), TCP traffic stops working; ping and UDP are fine.
> When I look at tcpdump of the traffic on both chassis, I see the
> initial handshake encapsulated traffic being sent and received on
> both nodes. The initial TCP handshake seems to get through on the
> sender and it sends the first data packet, but the receive side does
> not get the data packets and keeps resending the initial handshake
> SYN-ACK (see traces below).
>
> I'm thinking it is something to do with TCP checksum or some other
> NIC offload? The NICs are CX5s.
> Just wondering whether anyone has come across this?
>
> Thanks
>
> Brendan
>
>
> Sender
> ---------
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 
> 132: (tos 0x0, ttl 64, id 29694, offset 0, flags [DF], proto UDP (17), 
> length 118)
>     253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xfc99 -> 
> 0xa576!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
> 00010002]
>         52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 
> (0x0800), length 74: (tos 0x0, ttl 64, id 61068, offset 0, flags [DF], 
> proto TCP (6), length 60)
>     192.16.1.6.38900 > 192.16.1.5.22: Flags [S], cksum 0x0a2b 
> (correct), seq 3225335796, win 27200, options [mss 1360,sackOK,TS val 
> 1242625918 ecr 0,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length 
> 132: (tos 0x0, ttl 64, id 5167, offset 0, flags [DF], proto UDP (17), 
> length 118)
>     253.255.0.18.28454 > 253.255.0.21.6081: [udp sum ok] Geneve, Flags 
> [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual 
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00020001]
>         52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4 
> (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], 
> proto TCP (6), length 60)
>     192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0xb82f 
> (correct), seq 3217262113, ack 3225335797, win 26960, options [mss 
> 1360,sackOK,TS val 3343009202 ecr 1242625918,nop,wscale 7], length 0
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 
> 124: (tos 0x0, ttl 64, id 29695, offset 0, flags [DF], proto UDP (17), 
> length 110)
>     253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa57e -> 
> 0x723d!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
> 00010002]
>         52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 
> (0x0800), length 66: (tos 0x0, ttl 64, id 61069, offset 0, flags [DF], 
> proto TCP (6), length 52)
>     192.16.1.6.38900 > 192.16.1.5.22: Flags [.], cksum 0x8252 
> (incorrect -> 0x4f11), seq 1, ack 1, win 213, options [nop,nop,TS val 
> 1242625920 ecr 3343009202], length 0
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 
> 145: (tos 0x0, ttl 64, id 29696, offset 0, flags [DF], proto UDP (17), 
> length 131)
>     253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa569 -> 
> 0xae4d!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
> 00010002]
>         52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 
> (0x0800), length 87: (tos 0x0, ttl 64, id 61070, offset 0, flags [DF], 
> proto TCP (6), length 73)
>     192.16.1.6.38900 > 192.16.1.5.22: Flags [P.], cksum 0x8267 
> (incorrect -> 0x8b4b), seq 1:22, ack 1, win 213, options [nop,nop,TS 
> val 1242625920 ecr 3343009202], length 21
>
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 
> 145: (tos 0x0, ttl 64, id 29775, offset 0, flags [DF], proto UDP (17), 
> length 131)
>     253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa569 -> 
> 0xad7f!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
> 00010002]
>         52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 
> (0x0800), length 87: (tos 0x0, ttl 64, id 61071, offset 0, flags [DF], 
> proto TCP (6), length 73)
>     192.16.1.6.38900 > 192.16.1.5.22: Flags [P.], cksum 0x8267 
> (incorrect -> 0x8a7d), seq 1:22, ack 1, win 213, options [nop,nop,TS 
> val 1242626126 ecr 3343009202], length 21
>
> This just repeats; I don't see anything else from the receiver.
>
> Receiver
> ------------
> 98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 
> 132: (tos 0x0, ttl 64, id 29694, offset 0, flags [DF], proto UDP (17), 
> length 118)
>     253.255.0.21.62384 > 253.255.0.18.6081: [udp sum ok] Geneve, Flags 
> [C], vni 0x1, proto TEB (0x6558), options [class Open Virtual 
> Networking (OVN) (0x102) type 0x80(C) len 8 data 00010002]
>         52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 
> (0x0800), length 74: (tos 0x0, ttl 64, id 61068, offset 0, flags [DF], 
> proto TCP (6), length 60)
>     192.16.1.6.38900 > 192.16.1.5.22: Flags [S], cksum 0x0a2b 
> (correct), seq 3225335796, win 27200, options [mss 1360,sackOK,TS val 
> 1242625918 ecr 0,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length 
> 132: (tos 0x0, ttl 64, id 5167, offset 0, flags [DF], proto UDP (17), 
> length 118)
>     253.255.0.18.28454 > 253.255.0.21.6081: [bad udp cksum 0xfc99 -> 
> 0x2a01!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
> 00020001]
>         52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4 
> (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], 
> proto TCP (6), length 60)
>     192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0xb82f 
> (correct), seq 3217262113, ack 3225335797, win 26960, options [mss 
> 1360,sackOK,TS val 3343009202 ecr 1242625918,nop,wscale 7], length 0
>
> 98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length 
> 132: (tos 0x0, ttl 64, id 6137, offset 0, flags [DF], proto UDP (17), 
> length 118)
>     253.255.0.18.28454 > 253.255.0.21.6081: [bad udp cksum 0x2a01 -> 
> 0x5bc0!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
> [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
> 00020001]
>         52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4 
> (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], 
> proto TCP (6), length 60)
>     192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0x825a 
> (incorrect -> 0xb419), seq 3217262113, ack 3225335797, win 26960, 
> options [mss 1360,sackOK,TS val 3343010248 ecr 1242625918,nop,wscale 
> 7], length 0
>
>
> This repeats; I don't see anything else from the sender.