[ovs-discuss] Bad checksums observed with nsh encapsulation

Thu Jun 14 16:15:10 UTC 2018

Hello

I have done a follow-up test very similar to the previous one, but this
time using two compute nodes, with the client and server on one of them
and the vnf on the other. This means that packets coming from either
client or server that are nsh encapsulated are then forwarded to the
vnf compute node, egressing through a vxlan tunnel port
(vxlan+eth+nsh+payload).

In this scenario I don't observe the checksum problem. So it is the
combination of nsh encapsulation + tap port egress that sometimes
produces an incorrect checksum.

BR
Jaime.

-----Original Message-----
From: Jaime Caamaño Ruiz  <jcaamano at suse.de>
Reply-To: jcaamano at suse.com
To: ovs-discuss at openvswitch.org, jcaamano at suse.de
Subject: [ovs-discuss] Bad checksums observed with nsh encapsulation
Date: Wed, 13 Jun 2018 12:51:59 +0200

Hello

I am facing a problem where eth+nsh encapsulated packets egress OVS
with an incorrect checksum.

The scenario is

client ---- vnf ---- server

All guests are on the same host, so this is vm2vm traffic; tap ports
are directly added to the ovs bridge. TCP traffic from/to server port
80 is encapsulated with eth+nsh and traverses the vnf. I exercise the
traffic by using nc both on client and server.

I include captures at the client [1] and at the vnf [2] where I attempt
three tcp connections on port 80. The general observation is that
packets generated on client/server are seen there with wrong checksums
due to offloading but then arrive at the vnf with correct checksums,
although not all of them do. For the first connection attempt you can
see that SYN (frame 74) and ACK (78) are ok, but then FIN (79) is not
ok. A retransmitted FIN (80) is still not ok and then a further FIN
(93) retransmission is ok. Much the same happens for the second
attempt. The third attempt shows a bad SYN (104) coming from the
server.
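
To make clear what "ok" and "not ok" mean for these frames, here is a
minimal sketch of the standard TCP checksum validation (RFC 1071
one's-complement sum over the IPv4 pseudo-header plus the TCP segment).
It is only an illustration and not taken from OVS; the function names
are hypothetical, and the addresses are supplied by the caller. A
sender-side capture taken with tx offloading enabled is expected to
fail this check, since the checksum has not been completed yet at that
point.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071: one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def _pseudo_header(src_ip: str, dst_ip: str, tcp_len: int) -> bytes:
    # IPv4 pseudo-header: src, dst, zero byte, protocol 6 (TCP), TCP length
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 6, tcp_len))


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum for a TCP segment whose checksum field is zeroed."""
    return internet_checksum(
        _pseudo_header(src_ip, dst_ip, len(segment)) + segment)


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """True if the checksum already carried in the segment is valid."""
    # Summing over a segment that contains a correct checksum yields 0.
    return tcp_checksum(src_ip, dst_ip, segment) == 0
```

The segment passed in is the TCP header plus payload as found inside
the eth+nsh encapsulation, with the inner IPv4 source and destination
addresses.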

Two additional observations:

- This does not happen if I try the same on a port different than 80 so
that the traffic goes directly from the client to the server with no
eth+nsh encapsulation.

- This does not happen if I disable tx offloading both in the server
and the client.
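
For reference, this is roughly how tx offloading can be toggled inside
a guest. A minimal sketch, assuming ethtool is installed in the guest
and that the interface is named eth0 (an assumption, the guest
interface names are not given here); the helper names are hypothetical.

```python
import subprocess


def tx_offload_command(iface: str, enable: bool) -> list:
    """Build the ethtool command that turns tx checksum offload on or off."""
    return ["ethtool", "-K", iface, "tx", "on" if enable else "off"]


def set_tx_offload(iface: str, enable: bool) -> None:
    """Run the command; needs root privileges inside the guest."""
    subprocess.run(tx_offload_command(iface, enable), check=True)


# Example (interface name is an assumption):
# set_tx_offload("eth0", False)
```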

I also include the flows [3] and the ofproto trace [4] for the FIN
(79), generated by the client, which is eth+nsh encapsulated and
forwarded to the vnf. The decision on whether a packet should be
eth+nsh encapsulated or not happens on table 101 by setting reg2, which
is then checked on table 221. The packet is nsh encapsulated on table
222 and then ethernet encapsulated on table 83. If not encapsulated,
the packet would go from 221 back to 220 and be output there without
any further actions.

I am using OVS 2.9.2 with the OVS tree kernel module. Kernel is 4.4.

Am I understanding the problem correctly, in that OVS is responsible
for these checksums when offloading is enabled?
Any pointers on how I can debug this further?
Why would just some of the eth+nsh packets exhibit this problem and not
all?
Why would these bad packets be ok after retransmissions?

[1] https://filebin.net/8mnypc2qm4vninof/client.pcap?t=b097kh0m
[2] https://filebin.net/8mnypc2qm4vninof/vnf_eth0.pcap?t=b097kh0m
[3] https://hastebin.com/nuhexufaze.sql
[4] https://hastebin.com/yevufanula.http

Thanks for your help,
Jaime.

_______________________________________________
discuss mailing list
discuss at openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss