[ovs-dev] [PATCH] netdev-natvie-tnl: Fix geneve tunnel flag

Yi-Hung Wei yihung.wei at gmail.com
Tue Dec 15 23:41:20 UTC 2020


On userspace datapath, geneve option flag FLOW_TNL_F_UDPIF should only
be set when the geneve option is available.  However, currently, we always
set FLOW_TNL_F_UDPIF in the metadata flag, and that may result in megaflow
revalidation issue when the geneve option length is zero due to the datapath
flow key mis-match.  That is the original flow key always has FLOW_TNL_F_UDPIF
set, while the regenerated flow key during revalidation process  only has
FLOW_TNL_F_UDPIF set if geneve option length is zero.  For reference, check
tun_metadata_to_geneve_nlattr() to see how the flow key of geneve
tunnel metadatafor are generated from netlink attributes.

The following are the reproducible steps reported by Antonin.
After step 6, OvS reports a warning about failing to update the datapath
flow.

2020-12-15T01:56:21.324Z|00007|dpif(revalidator4)|WARN|netdev at ovs-netdev:
failed to put[modify] (No such file or directory)
ufid:d252fe62-5b2a-44e6-846a-190328190e09 skb_priority(0/0),
tunnel(tun_id=0x0,src=192.168.77.1,dst=192.168.77.2,ttl=64/0,tp_src=57391/0,
tp_dst=6081/0,flags(-df-csum+key)),skb_mark(0/0),ct_state(0x21/0x21),
ct_zone(0xfff0/0),ct_mark(0/0),ct_label(0/0),ct_tuple4(src=10.10.0.1/0.0.0.0,
dst=10.10.0.2/0.0.0.0,proto=1/0,tp_src=8/0,tp_dst=0/0),recirc_id(0xa),
dp_hash(0/0),in_port(5),packet_type(ns=0,id=0),
eth(src=1e:9c:f0:fb:18:9c/00:00:00:00:00:00,dst=ee:9f:60:8a:c0:a5/00:00:00:00:00:00),
eth_type(0x0800),ipv4(src=10.10.0.1/0.0.0.0,dst=10.10.0.2,proto=1/0,tos=0/0,
ttl=64/0,frag=no),icmp(type=8/0,code=0/0), actions:drop

Setup:
2 Nodes on the same subnet (192.168.77.101/24 - 192.168.77.102/24)

Step 1 – Creating bridges (run on each Node):

iface="enp0s8"

hwaddr=$(ip link show $iface | grep link/ether | awk '{print $2}')
inet=$(ip addr show $iface | grep "inet " | awk '{ print $2 }')

ovs-vsctl add-br br-phy \
          -- set Bridge br-phy datapath_type=netdev \
          -- br-set-external-id br-phy bridge-id br-phy \
          -- set bridge br-phy fail-mode=standalone \
          other_config:hwaddr="$hwaddr"
ovs-vsctl --timeout 10 add-port br-phy "$iface"
ip addr add "$inet" dev br-phy
ip link set br-phy up
ip addr flush dev enp0s8 2>/dev/null
ip link set enp0s8 up
iptables -F

ovs-vsctl add-br br-int \
          -- set Bridge br-int datapath_type=netdev \
          -- set bridge br-phy fail-mode=standalone

Step 2 – Creating netns to for overlay endpoints:

On first Node (192.168.77.101):
ip netns add ns0
ip link add veth0 type veth peer name veth1
ip link set veth0 netns ns0
ovs-vsctl add-port br-int veth1
ip link set dev veth1 up
ip netns exec ns0 ip addr add 10.10.0.1/24 dev veth0
ip netns exec ns0 ip link set dev veth0 up

On second Node (192.168.77.102):
ip netns add ns0
ip link add veth0 type veth peer name veth1
ip link set veth0 netns ns0
ovs-vsctl add-port br-int veth1
ip link set dev veth1 up
ip netns exec ns0 ip addr add 10.10.0.2/24 dev veth0
ip netns exec ns0 ip link set dev veth0 up

Step 3 – Create tunnel (run on each Node):

ovs-vsctl add-port br-int tun0 -- set interface tun0 type=geneve \
ofport_request=100 options:remote_ip=flow options:key=flow

Step 4 – Create the following flows:

On first Node (192.168.77.101):
root at ovs-test-node-1:/home/vagrant# ovs-ofctl dump-flows br-int --no-stats
priority=100,ip actions=resubmit(,10)
priority=0 actions=NORMAL
priority=50 actions=resubmit(,40)
table=10, priority=100,ip actions=ct(table=20,zone=65520)
table=20, priority=200,ct_state=-new+trk,ip actions=resubmit(,30)
table=20, priority=100,ip,nw_dst=10.10.0.2 actions=resubmit(,30)
table=20, priority=0,ip actions=drop
table=30, priority=100,ip actions=ct(commit,table=40,zone=65520)
table=40, priority=100,in_port=veth1 actions=load:0xc0a84d66->NXM_NX_TUN_IPV4_DST[],output:tun0
table=40, priority=100,in_port=tun0 actions=output:veth1
table=40, priority=0 actions=drop

On second Node (192.168.77.102):
root at ovs-test-node-2:/home/vagrant# ovs-ofctl dump-flows br-int --no-stats
priority=100,ip actions=resubmit(,10)
priority=0 actions=NORMAL
priority=50 actions=resubmit(,40)
table=10, priority=100,ip actions=ct(table=20,zone=65520)
table=20, priority=200,ct_state=-new+trk,ip actions=resubmit(,30)
table=20, priority=100,ip,nw_dst=10.10.0.2 actions=resubmit(,30)
table=20, priority=0,ip actions=drop
table=30, priority=100,ip actions=ct(commit,table=40,zone=65520)
table=40, priority=100,in_port=veth1 actions=load:0xc0a84d65->NXM_NX_TUN_IPV4_DST[],output:tun0
table=40, priority=100,in_port=tun0 actions=output:veth1
table=40, priority=0 actions=drop

Step 5 – Check that ping works:
>From the first Node: ip netns exec ns0 ping 10.10.0.2

Step 6 – The actual issue:
>From the first Node: ip netns exec ns0 ping -c 3 10.10.0.2
>From the second Node: ovs-ofctl del-flows br-int 'table=20,ip,nw_dst=10.10.0.2'
>From the first Node: ip netns exec ns0 ping -c 3 10.10.0.2

Execute these instructions in order, as soon as the previous one
completes.  If you follow these steps, you should see the ping in
the last step succeed.  This is not expected because of the deleted
flow. It also does not happen with VXLAN.

Reported-by: Antonin Bas <abas at vmware.com>
Reported-at: https://github.com/vmware-tanzu/antrea/issues/897
Fixes: 6b241d645291 ("netdev-vport: Factor-out tunnel Push-pop code into separate module.")
Signed-off-by: Yi-Hung Wei <yihung.wei at gmail.com>
---
 lib/netdev-native-tnl.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index b89dfdd52a86..09cff3a1fc7d 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -1015,9 +1015,11 @@ netdev_geneve_pop_header(struct dp_packet *packet)
     tnl->tun_id = htonll(ntohl(get_16aligned_be32(&gnh->vni)) >> 8);
     tnl->flags |= FLOW_TNL_F_KEY;
 
-    memcpy(tnl->metadata.opts.gnv, gnh->options, opts_len);
-    tnl->metadata.present.len = opts_len;
-    tnl->flags |= FLOW_TNL_F_UDPIF;
+    if (opts_len > 0) {
+        memcpy(tnl->metadata.opts.gnv, gnh->options, opts_len);
+        tnl->metadata.present.len = opts_len;
+        tnl->flags |= FLOW_TNL_F_UDPIF;
+    }
 
     packet->packet_type = htonl(PT_ETH);
     dp_packet_reset_packet(packet, hlen);
-- 
2.7.4



More information about the dev mailing list