[ovs-dev] [PATCH] ofproto-dpif-xlate: Terminate native tunnels only on ports with IP addresses.
Mike Pattrick
mkp at redhat.com
Mon Nov 15 17:28:15 UTC 2021
Hello Ilya,
For some reason, the included test fails for me when I try to run it.
The diff is:
./tunnel-push-pop.at:751: tail -2 stdout
--- - 2021-11-15 12:09:05.838890065 -0500
+++ /root/ovs/tests/testsuite.dir/at-groups/782/stdout 2021-11-15
12:09:05.836652743 -0500
@@ -1,3 +1,3 @@
-Megaflow: recirc_id=0,eth,in_port=2,dl_type=0x0000
-Datapath actions: 3
+Megaflow: recirc_id=0,eth,in_port=1,dl_type=0x0000
+Datapath actions: 5
I see that it passed github CI, so I'm unsure why it's failing for me.
-M
On Mon, Nov 1, 2021 at 4:15 PM Ilya Maximets <i.maximets at ovn.org> wrote:
>
> Commit dc0bd12f5b04 removed restriction that tunnel endpoint must be a
> bridge port. So, currently OVS has to check if the native tunnel needs
> to be terminated regardless of the output port. Unfortunately, there
> is a side effect: tnl_port_map_lookup() always adds at least 'dl_dst'
> match to the megaflow that ends up in the corresponding datapath flow.
> And since tunneling works on L3 level and not restricted by any
> particular bridge, this extra match criteria is added to every
> datapath flow on every bridge even if that bridge cannot be part of
> a tunnel processing.
>
> For example, if OVS has at least one tunnel configured and we're
> adding a completely separate bridge with 2 ports and simple rules
> to forward packets between two ports, there still will be a match on
> a destination mac address:
>
> 1. <create a tunnel configuration in OVS>
> 2. ovs-vsctl add-br br-non-tunnel -- set bridge datapath_type=netdev
> 3. ovs-vsctl add-port br-non-tunnel port0
> -- add-port br-non-tunnel port1
> 4. ovs-ofctl del-flows br-non-tunnel
> 5. ovs-ofctl add-flow br-non-tunnel in_port=port0,actions=port1
> 6. ovs-ofctl add-flow br-non-tunnel in_port=port1,actions=port0
>
> # ovs-appctl ofproto/trace br-non-tunnel in_port=port0
>
> Flow: in_port=1,vlan_tci=0x0000,
> dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x0000
>
> bridge("br-non-tunnel")
> -----------------------
> 0. in_port=1, priority 32768
> output:2
>
> Final flow: unchanged
> Megaflow: recirc_id=0,eth,in_port=1,dl_dst=00:00:00:00:00:00,dl_type=0x0000
> Datapath actions: 5 ^^^^^^^^^^^^^^^^^^^^^^^^
>
> This increases the number of upcalls and installed datapath flows,
> since separate flow needs to be installed per destination MAC, reducing
> the switching performance. This also blocks datapath performance
> optimizations that are based on the datapath flow simplicity.
>
> In general, in order to be a tunnel endpoint, port has to have an IP
> address. Hence native tunnel termination should be attempted only
> for such ports. This allows to avoid extra matches in most cases.
>
> Fixes: dc0bd12f5b04 ("userspace: Enable non-bridge port as tunnel endpoint.")
> Reported-by: Sriharsha Basavapatna <sriharsha.basavapatna at broadcom.com>
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-October/388904.html
> Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
> ---
>
> I'm sure not an expert in this part of a code, but the change seems to pass
> all unit and system tests. It also passes OVN tests. So, I'm assuming that
> it's correct.
>
> ofproto/ofproto-dpif-xlate.c | 31 +++++++++++++++----
> tests/packet-type-aware.at | 4 +--
> tests/tunnel-push-pop.at | 58 ++++++++++++++++++++++++++++++++++++
> 3 files changed, 86 insertions(+), 7 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 9d336bc6a..701902b0c 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -4081,14 +4081,35 @@ is_neighbor_reply_correct(const struct xlate_ctx *ctx, const struct flow *flow)
> }
>
> static bool
> -terminate_native_tunnel(struct xlate_ctx *ctx, struct flow *flow,
> - struct flow_wildcards *wc, odp_port_t *tnl_port)
> +xport_has_ip(const struct xport *xport)
> +{
> + struct in6_addr *ip_addr, *mask;
> + int n_in6 = 0;
> +
> + if (netdev_get_addr_list(xport->netdev, &ip_addr, &mask, &n_in6)) {
> + n_in6 = 0;
> + } else {
> + free(ip_addr);
> + free(mask);
> + }
> + return n_in6 ? true : false;
> +}
> +
> +static bool
> +terminate_native_tunnel(struct xlate_ctx *ctx, const struct xport *xport,
> + struct flow *flow, struct flow_wildcards *wc,
> + odp_port_t *tnl_port)
> {
> *tnl_port = ODPP_NONE;
>
> /* XXX: Write better Filter for tunnel port. We can use in_port
> - * in tunnel-port flow to avoid these checks completely. */
> - if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) {
> + * in tunnel-port flow to avoid these checks completely.
> + *
> + * Port without an IP address cannot be a tunnel termination point.
> + * Not performing a lookup in this case to avoid unwildcarding extra
> + * flow fields (dl_dst). */
> + if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)
> + && xport_has_ip(xport)) {
> *tnl_port = tnl_port_map_lookup(flow, wc);
>
> /* If no tunnel port was found and it's about an ARP or ICMPv6 packet,
> @@ -4242,7 +4263,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
> native_tunnel_output(ctx, xport, flow, odp_port, truncate);
> flow->tunnel = flow_tnl; /* Restore tunnel metadata */
>
> - } else if (terminate_native_tunnel(ctx, flow, wc,
> + } else if (terminate_native_tunnel(ctx, xport, flow, wc,
> &odp_tnl_port)) {
> /* Intercept packet to be received on native tunnel port. */
> nl_msg_put_odp_port(ctx->odp_actions, OVS_ACTION_ATTR_TUNNEL_POP,
> diff --git a/tests/packet-type-aware.at b/tests/packet-type-aware.at
> index 73aa14cea..054dcc9cc 100644
> --- a/tests/packet-type-aware.at
> +++ b/tests/packet-type-aware.at
> @@ -892,7 +892,7 @@ ovs-appctl time/warp 1000
> AT_CHECK([
> ovs-appctl dpctl/dump-flows --names dummy at ovs-dummy | strip_used | grep -v ipv6 | sort
> ], [0], [flow-dump from the main thread:
> -recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth(dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(tos=0/0x3,frag=no), packets:1, bytes:98, used:0.0s, actions:n1,pop_eth,clone(tnl_push(tnl_port(gre_sys),header(size=38,type=3,eth(dst=de:af:be:ef:ba:be,src=aa:55:00:00:00:02,dl_type=0x0800),ipv4(src=10.0.0.1,dst=10.0.0.2,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x0,proto=0x800))),out_port(br2)),n2)
> +recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(tos=0/0x3,frag=no), packets:1, bytes:98, used:0.0s, actions:n1,pop_eth,clone(tnl_push(tnl_port(gre_sys),header(size=38,type=3,eth(dst=de:af:be:ef:ba:be,src=aa:55:00:00:00:02,dl_type=0x0800),ipv4(src=10.0.0.1,dst=10.0.0.2,proto=47,tos=0,ttl=64,frag=0x4000),gre((flags=0x0,proto=0x800))),out_port(br2)),n2)
> ])
>
> AT_CHECK([
> @@ -1021,7 +1021,7 @@ AT_CHECK([
> ], [0], [flow-dump from the main thread:
> recirc_id(0),in_port(p0),packet_type(ns=0,id=0),eth(src=aa:bb:cc:00:00:02,dst=aa:bb:cc:00:00:01),eth_type(0x0800),ipv4(dst=20.0.0.1,proto=47,frag=no), packets:3, bytes:378, used:0.0s, actions:tnl_pop(gre_sys)
> tunnel(src=20.0.0.2,dst=20.0.0.1,flags(-df-csum)),recirc_id(0),in_port(gre_sys),packet_type(ns=1,id=0x8847),eth_type(0x8847),mpls(label=999/0x0,tc=0/0,ttl=64/0x0,bos=1/1), packets:3, bytes:264, used:0.0s, actions:push_eth(src=00:00:00:00:00:00,dst=00:00:00:00:00:00),pop_mpls(eth_type=0x800),recirc(0x1)
> -tunnel(src=20.0.0.2,dst=20.0.0.1,flags(-df-csum)),recirc_id(0x1),in_port(gre_sys),packet_type(ns=0,id=0),eth(dst=00:00:00:00:00:00),eth_type(0x0800),ipv4(ttl=64,frag=no), packets:3, bytes:294, used:0.0s, actions:set(ipv4(ttl=63)),int-br
> +tunnel(src=20.0.0.2,dst=20.0.0.1,flags(-df-csum)),recirc_id(0x1),in_port(gre_sys),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(ttl=64,frag=no), packets:3, bytes:294, used:0.0s, actions:set(ipv4(ttl=63)),int-br
> ])
>
> ovs-appctl time/warp 1000
> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
> index 636465397..ed72ff986 100644
> --- a/tests/tunnel-push-pop.at
> +++ b/tests/tunnel-push-pop.at
> @@ -703,3 +703,61 @@ NXST_FLOW reply:
>
> OVS_VSWITCHD_STOP
> AT_CLEANUP
> +
> +AT_SETUP([tunnel_push_pop - flow translation on unrelated bridges])
> +
> +OVS_VSWITCHD_START(
> + [add-port br0 p0 dnl
> + -- set Interface p0 type=dummy ofport_request=1 dnl
> + other-config:hwaddr=aa:55:aa:55:00:00])
> +AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg])
> +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy])
> +AT_CHECK([ovs-vsctl add-port int-br t2 dnl
> + -- set Interface t2 type=geneve options:remote_ip=1.1.2.92 dnl
> + options:key=123 ofport_request=2])
> +
> +dnl First setup dummy interface IP address, then add the route
> +dnl so that tnl-port table can get valid IP address for the device.
> +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.2.88/24], [0], [OK
> +])
> +AT_CHECK([ovs-appctl ovs/route/add 1.1.2.92/24 br0], [0], [OK
> +])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +
> +dnl This ARP reply from p0 has two effects:
> +dnl 1. The ARP cache will learn that 1.1.2.92 is at f8:bc:12:44:34:b6.
> +dnl 2. The br0 mac learning will learn that f8:bc:12:44:34:b6 is on p0.
> +AT_CHECK([
> + ovs-appctl netdev-dummy/receive p0 dnl
> + 'recirc_id(0),in_port(2),dnl
> + eth(src=f8:bc:12:44:34:b6,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),dnl
> + arp(sip=1.1.2.92,tip=1.1.2.88,op=2,sha=f8:bc:12:44:34:b6,tha=00:00:00:00:00:00)'
> +])
> +
> +dnl Creating a separate bridge that is completely unrelated to a tunnel
> +dnl configuration. Ports in this bridge cannot be tunnel endpoints.
> +AT_CHECK([ovs-vsctl add-br br-non-tunnel dnl
> + -- set bridge br-non-tunnel datapath_type=dummy fail-mode=secure dnl
> + -- add-port br-non-tunnel port0 -- set Interface port0 type=dummy dnl
> + -- add-port br-non-tunnel port1 -- set Interface port1 type=dummy])
> +AT_CHECK([ovs-ofctl del-flows br-non-tunnel])
> +AT_CHECK([ovs-ofctl add-flow br-non-tunnel in_port=port0,action=port1])
> +AT_CHECK([ovs-ofctl add-flow br-non-tunnel in_port=port1,action=port0])
> +
> +dnl Checking that tunnel configuration doesn't impact flow translation
> +dnl on this bridge (Megaflow should contain a bare minimum of fields
> +dnl according to installed OF rules).
> +AT_CHECK([ovs-appctl ofproto/trace br-non-tunnel in_port=port0], [0], [stdout])
> +AT_CHECK([tail -2 stdout], [0], [dnl
> +Megaflow: recirc_id=0,eth,in_port=2,dl_type=0x0000
> +Datapath actions: 3
> +])
> +
> +AT_CHECK([ovs-appctl ofproto/trace br-non-tunnel in_port=port1], [0], [stdout])
> +AT_CHECK([tail -2 stdout], [0], [dnl
> +Megaflow: recirc_id=0,eth,in_port=1,dl_type=0x0000
> +Datapath actions: 5
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> --
> 2.31.1
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
More information about the dev
mailing list