[ovs-dev] [PATCH] ofproto-dpif-xlate: Terminate native tunnels only on ports with IP addresses.

Mike Pattrick mkp at redhat.com
Mon Nov 15 17:28:15 UTC 2021


Hello Ilya,

For some reason, the included test fails for me when I try to run it.
The diff is:

./tunnel-push-pop.at:751: tail -2 stdout
--- -    2021-11-15 12:09:05.838890065 -0500
+++ /root/ovs/tests/testsuite.dir/at-groups/782/stdout    2021-11-15
12:09:05.836652743 -0500
@@ -1,3 +1,3 @@
-Megaflow: recirc_id=0,eth,in_port=2,dl_type=0x0000
-Datapath actions: 3
+Megaflow: recirc_id=0,eth,in_port=1,dl_type=0x0000
+Datapath actions: 5

I see that it passed github CI, so I'm unsure why it's failing for me.

-M

On Mon, Nov 1, 2021 at 4:15 PM Ilya Maximets <i.maximets at ovn.org> wrote:
>
> Commit dc0bd12f5b04 removed restriction that tunnel endpoint must be a
> bridge port.  So, currently OVS has to check if the native tunnel needs
> to be terminated regardless of the output port.  Unfortunately, there
> is a side effect: tnl_port_map_lookup() always adds at least 'dl_dst'
> match to the megaflow that ends up in the corresponding datapath flow.
> And since tunneling works on L3 level and not restricted by any
> particular bridge, this extra match criteria is added to every
> datapath flow on every bridge even if that bridge cannot be part of
> a tunnel processing.
>
> For example, if OVS has at least one tunnel configured and we're
> adding a completely separate bridge with 2 ports and simple rules
> to forward packets between two ports, there still will be a match on
> a destination mac address:
>
>  1. <create a tunnel configuration in OVS>
>  2. ovs-vsctl add-br br-non-tunnel -- set bridge datapath_type=netdev
>  3. ovs-vsctl add-port br-non-tunnel port0
>            -- add-port br-non-tunnel port1
>  4. ovs-ofctl del-flows br-non-tunnel
>  5. ovs-ofctl add-flow br-non-tunnel in_port=port0,actions=port1
>  6. ovs-ofctl add-flow br-non-tunnel in_port=port1,actions=port0
>
>  # ovs-appctl ofproto/trace br-non-tunnel in_port=port0
>
>  Flow: in_port=1,vlan_tci=0x0000,
>        dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x0000
>
>  bridge("br-non-tunnel")
>  -----------------------
>   0. in_port=1, priority 32768
>      output:2
>
>  Final flow: unchanged
>  Megaflow: recirc_id=0,eth,in_port=1,dl_dst=00:00:00:00:00:00,dl_type=0x0000
>  Datapath actions: 5                 ^^^^^^^^^^^^^^^^^^^^^^^^
>
> This increases the number of upcalls and installed datapath flows,
> since separate flow needs to be installed per destination MAC, reducing
> the switching performance.  This also blocks datapath performance
> optimizations that are based on the datapath flow simplicity.
>
> In general, in order to be a tunnel endpoint, port has to have an IP
> address.  Hence native tunnel termination should be attempted only
> for such ports.  This allows to avoid extra matches in most cases.
>
> Fixes: dc0bd12f5b04 ("userspace: Enable non-bridge port as tunnel endpoint.")
> Reported-by: Sriharsha Basavapatna <sriharsha.basavapatna at broadcom.com>
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-October/388904.html
> Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
> ---
>
> I'm sure not an expert in this part of a code, but the change seems to pass
> all unit and system tests.  It also passes OVN tests.  So, I'm assuming that
> it's correct.
>
>  ofproto/ofproto-dpif-xlate.c | 31 +++++++++++++++----
>  tests/packet-type-aware.at   |  4 +--
>  tests/tunnel-push-pop.at     | 58 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 86 insertions(+), 7 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 9d336bc6a..701902b0c 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -4081,14 +4081,35 @@ is_neighbor_reply_correct(const struct xlate_ctx *ctx, const struct flow *flow)
>  }
>
>  static bool
> -terminate_native_tunnel(struct xlate_ctx *ctx, struct flow *flow,
> -                        struct flow_wildcards *wc, odp_port_t *tnl_port)
> +xport_has_ip(const struct xport *xport)
> +{
> +    struct in6_addr *ip_addr, *mask;
> +    int n_in6 = 0;
> +
> +    if (netdev_get_addr_list(xport->netdev, &ip_addr, &mask, &n_in6)) {
> +        n_in6 = 0;
> +    } else {
> +        free(ip_addr);
> +        free(mask);
> +    }
> +    return n_in6 ? true : false;
> +}
> +
> +static bool
> +terminate_native_tunnel(struct xlate_ctx *ctx, const struct xport *xport,
> +                        struct flow *flow, struct flow_wildcards *wc,
> +                        odp_port_t *tnl_port)
>  {
>      *tnl_port = ODPP_NONE;
>
>      /* XXX: Write better Filter for tunnel port. We can use in_port
> -     * in tunnel-port flow to avoid these checks completely. */
> -    if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) {
> +     * in tunnel-port flow to avoid these checks completely.
> +     *
> +     * Port without an IP address cannot be a tunnel termination point.
> +     * Not performing a lookup in this case to avoid unwildcarding extra
> +     * flow fields (dl_dst). */
> +    if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)
> +        && xport_has_ip(xport)) {
>          *tnl_port = tnl_port_map_lookup(flow, wc);
>
>          /* If no tunnel port was found and it's about an ARP or ICMPv6 packet,
> @@ -4242,7 +4263,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
>              native_tunnel_output(ctx, xport, flow, odp_port, truncate);
>              flow->tunnel = flow_tnl; /* Restore tunnel metadata */
>
> -        } else if (terminate_native_tunnel(ctx, flow, wc,
> +        } else if (terminate_native_tunnel(ctx, xport, flow, wc,
>                                             &odp_tnl_port)) {
>              /* Intercept packet to be received on native tunnel port. */
>              nl_msg_put_odp_port(ctx->odp_actions, OVS_ACTION_ATTR_TUNNEL_POP,
> diff --git a/tests/packet-type-aware.at b/tests/packet-type-aware.at
> index 73aa14cea..054dcc9cc 100644
> --- a/tests/packet-type-aware.at
> +++ b/tests/packet-type-aware.at
> @@ -892,7 +892,7 @@ ovs-appctl time/warp 1000
>  AT_CHECK([
>      ovs-appctl dpctl/dump-flows --names dummy at ovs-dummy | strip_used | grep -v ipv6 | sort
>  ], [0], [flow-dump from the main thread:
> -recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(tos=0/0x3,frag=no), packets:1, bytes:98, used:0.0s, actions:n1,pop_eth,clone(tnl_push(tnl_port(gre_sys),header(size=38,type=3,eth(dst=de:af:be:ef:ba:be,src=aa:55:00:00:00:02,dl_type=0x0800),ipv4(src=10.0.0.1,dst=10.0.0.2,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x0,proto=0x800))),out_port(br2)),n2)
> +recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(tos=0/0x3,frag=no), packets:1, bytes:98, used:0.0s, actions:n1,pop_eth,clone(tnl_push(tnl_port(gre_sys),header(size=38,type=3,eth(dst=de:af:be:ef:ba:be,src=aa:55:00:00:00:02,dl_type=0x0800),ipv4(src=10.0.0.1,dst=10.0.0.2,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x0,proto=0x800))),out_port(br2)),n2)
>  ])
>
>  AT_CHECK([
> @@ -1021,7 +1021,7 @@ AT_CHECK([
>  ], [0], [flow-dump from the main thread:
>  recirc_id(0),in_port(p0),packet_type(ns=0,id=0),eth(src=aa:bb:cc:00:00:02,dst=aa:bb:cc:00:00:01),eth_type(0x0800),ipv4(dst=20.0.0.1,proto=47,frag=no), packets:3, bytes:378, used:0.0s, actions:tnl_pop(gre_sys)
>  tunnel(src=20.0.0.2,dst=20.0.0.1,flags(-df-csum)),recirc_id(0),in_port(gre_sys),packet_type(ns=1,id=0x8847),eth_type(0x8847),mpls(label=999/0x0,tc=0/0,ttl=64/0x0,bos=1/1), packets:3, bytes:264, used:0.0s, actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),pop_mpls(eth_type=0x800),recirc(0x1)
> -tunnel(src=20.0.0.2,dst=20.0.0.1,flags(-df-csum)),recirc_id(0x1),in_port(gre_sys),packet_type(ns=0,id=0),eth(dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no), packets:3, bytes:294, used:0.0s, actions:set(ipv4(ttl=63)),int-br
> +tunnel(src=20.0.0.2,dst=20.0.0.1,flags(-df-csum)),recirc_id(0x1),in_port(gre_sys),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(ttl=64,frag=no), packets:3, bytes:294, used:0.0s, actions:set(ipv4(ttl=63)),int-br
>  ])
>
>  ovs-appctl time/warp 1000
> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
> index 636465397..ed72ff986 100644
> --- a/tests/tunnel-push-pop.at
> +++ b/tests/tunnel-push-pop.at
> @@ -703,3 +703,61 @@ NXST_FLOW reply:
>
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([tunnel_push_pop - flow translation on unrelated bridges])
> +
> +OVS_VSWITCHD_START(
> +    [add-port br0 p0 dnl
> +     -- set Interface p0 type=dummy ofport_request=1 dnl
> +                         other-config:hwaddr=aa:55:aa:55:00:00])
> +AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg])
> +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy])
> +AT_CHECK([ovs-vsctl add-port int-br t2 dnl
> +          -- set Interface t2 type=geneve options:remote_ip=1.1.2.92 dnl
> +                              options:key=123 ofport_request=2])
> +
> +dnl First setup dummy interface IP address, then add the route
> +dnl so that tnl-port table can get valid IP address for the device.
> +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK
> +])
> +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK
> +])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +
> +dnl This ARP reply from p0 has two effects:
> +dnl 1. The ARP cache will learn that 1.1.2.92 is at f8:bc:12:44:34:b6.
> +dnl 2. The br0 mac learning will learn that f8:bc:12:44:34:b6 is on p0.
> +AT_CHECK([
> +  ovs-appctl netdev-dummy/receive p0 dnl
> +      'recirc_id(0),in_port(2),dnl
> +       eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),dnl
> +       arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)'
> +])
> +
> +dnl Creating a separate bridge that is completely unrelated to a tunnel
> +dnl configuration.  Ports in this bridge cannot be tunnel endpoints.
> +AT_CHECK([ovs-vsctl add-br br-non-tunnel dnl
> +          -- set bridge br-non-tunnel datapath_type=dummy fail-mode=secure dnl
> +          -- add-port br-non-tunnel port0 -- set Interface port0 type=dummy dnl
> +          -- add-port br-non-tunnel port1 -- set Interface port1 type=dummy])
> +AT_CHECK([ovs-ofctl del-flows br-non-tunnel])
> +AT_CHECK([ovs-ofctl add-flow br-non-tunnel in_port=port0,action=port1])
> +AT_CHECK([ovs-ofctl add-flow br-non-tunnel in_port=port1,action=port0])
> +
> +dnl Checking that tunnel configuration doesn't impact flow translation
> +dnl on this bridge (Megaflow should contain a bare minimum of fields
> +dnl according to installed OF rules).
> +AT_CHECK([ovs-appctl ofproto/trace br-non-tunnel in_port=port0], [0], [stdout])
> +AT_CHECK([tail -2 stdout], [0], [dnl
> +Megaflow: recirc_id=0,eth,in_port=2,dl_type=0x0000
> +Datapath actions: 3
> +])
> +
> +AT_CHECK([ovs-appctl ofproto/trace br-non-tunnel in_port=port1], [0], [stdout])
> +AT_CHECK([tail -2 stdout], [0], [dnl
> +Megaflow: recirc_id=0,eth,in_port=1,dl_type=0x0000
> +Datapath actions: 5
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> --
> 2.31.1
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>



More information about the dev mailing list