[ovs-dev] [PATCH ovn v3] ovn-northd: Skip unsnat flows for load balancer vips in router ingress pipeline

Numan Siddique numans at ovn.org
Fri Mar 27 09:05:37 UTC 2020


On Thu, Mar 26, 2020 at 8:29 PM Mark Michelson <mmichels at redhat.com> wrote:
>
> Acked-by: Mark Michelson <mmichels at redhat.com>

Thanks, I applied the patch to master.

Numan

>
> On 3/26/20 10:49 AM, numans at ovn.org wrote:
> > From: Numan Siddique <numans at ovn.org>
> >
> > Suppose there is below NAT entry with external_ip = 172.168.0.100
> >
> > nat <UUID>
> >      external ip: "172.168.0.100"
> >      logical ip: "10.0.0.0/24"
> >      type: "snat"
> >
> > And a load balancer with the VIP - 172.168.0.100
> >
> > _uuid               : <UUID>
> > external_ids        : {}
> > name                : lb1
> > protocol            : tcp
> > vips                : {"172.168.0.100:8080"="10.0.0.4:8080"}
> >
> > And if these are associated to a gateway logical router
> >
> > Then we will see the below lflows in the router pipeline
> >
> > ...
> > table=5 (lr_in_unsnat       ), priority=90   , match=(ip && ip4.dst == 172.168.0.100), action=(ct_snat;)
> > ...
> > table=6 (lr_in_dnat         ), priority=120  , match=(ct.new && ip && ip4.dst == 172.168.0.100 && tcp && tcp.dst == 8080), action=(ct_lb(10.0.0.4:8080);)
> > table=6 (lr_in_dnat         ), priority=120  , match=(ct.est && ip && ip4.dst == 172.168.0.100 && tcp && tcp.dst == 8080), action=(ct_dnat;)
> >
> > When a new connection packet destinated for the lb vip 172.168.0.100 and tcp.dst = 8080
> > is received, the ct.new flow in the lr_in_dnat is hit and the packet's ip4.dst is
> > dnatted to 10.0.0.4 in the dnat conntrack zone.
> >
> > But for the subsequent packet destined to the vip, the ct.est lflow in the lr_in_dnat
> > stage doesn't get hit. In this case, the packet first hits the lr_in_unsnat pri 90 flow
> > as mentioned above with the action ct_snat. Even though ct_snat should have no effect,
> > looks like it is resetting the ct flags.
> >
> > In the case of tcp, the ct.new flow is hit instead of ct.est. In the the case of sctp, neither of the above
> > lflows in lr_in_dnat stage hit.
> >
> > This needs to be investigated further. But we can avoid this scenario in OVN
> > by adding the below lflow.
> >
> > table=5 (lr_in_unsnat       ), priority=120  , match=(ip4 && ip4.dst == 172.168.0.100 && tcp.dst == 8080), action=(next;)
> >
> > This patch adds the above lflow if the lb vip also has an entry in the NAT table.
> >
> > This patch is also required to support sctp load balancers in OVN.
> >
> > Reported-by: Tim Rozet <trozet at redhat.com>
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1815217
> > Signed-off-by: Numan Siddique <numans at ovn.org>
> > ---
> > v2 -> v3
> > -----
> >    * v2 was nothing but v1 as I forgot to commit the modified files.
> >    * Also updated the ovn-northd.8.xml.
> >
> > v1 -> v2
> > ------
> >    * Addressed the review comments from Mark - Added the protocol in the
> >      match.
> >    * I'm not sure if this is an ovs issue. So updated the commit message
> >      accordingly.
> >
> >
> >   northd/ovn-northd.8.xml | 27 +++++++++++++++++++
> >   northd/ovn-northd.c     | 59 +++++++++++++++++++++++++++++++----------
> >   tests/ovn-northd.at     | 38 ++++++++++++++++++++++++++
> >   tests/system-ovn.at     | 51 ++++++++++++++++++++++++++++++++++-
> >   4 files changed, 160 insertions(+), 15 deletions(-)
> >
> > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > index 1e0993e07..b5e4d6d84 100644
> > --- a/northd/ovn-northd.8.xml
> > +++ b/northd/ovn-northd.8.xml
> > @@ -2075,6 +2075,33 @@ icmp6 {
> >         unSNATted here.
> >       </p>
> >
> > +    <p>Ingress Table 5: UNSNAT on Gateway and Distributed Routers</p>
> > +    <ul>
> > +      <li>
> > +        <p>
> > +          If the Router (Gateway or Distributed) is configured with
> > +          load balancers, then below lflows are added:
> > +        </p>
> > +
> > +        <p>
> > +          For each IPv4 address <var>A</var> defined as load balancer
> > +          VIP with the protocol <var>P</var> (and the protocol port
> > +          <var>T</var> if defined) is also present as an
> > +          <code>external_ip</code> in the NAT table,
> > +          a priority-120 logical flow is added with the match
> > +          <code>ip4 && ip4.dst == <var>A</var> &&
> > +          <var>P</var></code> with the action <code>next;</code> to
> > +          advance the packet to the next table. If the load balancer
> > +          has protocol port <code>B</code> defined, then the match also has
> > +          <code><var>P</var>.dst == <var>B</var></code>.
> > +        </p>
> > +
> > +        <p>
> > +          The above flows are also added for IPv6 load balancers.
> > +        </p>
> > +      </li>
> > +    </ul>
> > +
> >       <p>Ingress Table 5: UNSNAT on Gateway Routers</p>
> >
> >       <ul>
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index b76df0554..7ef049310 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -7574,7 +7574,7 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
> >                      struct ds *match, struct ds *actions, int priority,
> >                      const char *lb_force_snat_ip, struct lb_vip *lb_vip,
> >                      bool is_udp, struct nbrec_load_balancer *lb,
> > -                   struct shash *meter_groups)
> > +                   struct shash *meter_groups, struct sset *nat_entries)
> >   {
> >       build_empty_lb_event_flow(od, lflows, lb_vip, lb, S_ROUTER_IN_DNAT,
> >                                 meter_groups);
> > @@ -7607,6 +7607,40 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
> >       free(new_match);
> >       free(est_match);
> >
> > +    const char *ip_match = NULL;
> > +    if (lb_vip->addr_family == AF_INET) {
> > +        ip_match = "ip4";
> > +    } else {
> > +        ip_match = "ip6";
> > +    }
> > +
> > +    if (sset_contains(nat_entries, lb_vip->vip)) {
> > +        /* The load balancer vip is also present in the NAT entries.
> > +         * So add a high priority lflow to advance the the packet
> > +         * destined to the vip (and the vip port if defined)
> > +         * in the S_ROUTER_IN_UNSNAT stage.
> > +         * There seems to be an issue with ovs-vswitchd. When the new
> > +         * connection packet destined for the lb vip is received,
> > +         * it is dnat'ed in the S_ROUTER_IN_DNAT stage in the dnat
> > +         * conntrack zone. For the next packet, if it goes through
> > +         * unsnat stage, the conntrack flags are not set properly, and
> > +         * it doesn't hit the established state flows in
> > +         * S_ROUTER_IN_DNAT stage. */
> > +        struct ds unsnat_match = DS_EMPTY_INITIALIZER;
> > +        ds_put_format(&unsnat_match, "%s && %s.dst == %s && %s",
> > +                      ip_match, ip_match, lb_vip->vip,
> > +                      is_udp ? "udp" : "tcp");
> > +        if (lb_vip->vip_port) {
> > +            ds_put_format(&unsnat_match, " && %s.dst == %d",
> > +                          is_udp ? "udp" : "tcp", lb_vip->vip_port);
> > +        }
> > +
> > +        ovn_lflow_add_with_hint(lflows, od, S_ROUTER_IN_UNSNAT, 120,
> > +                                ds_cstr(&unsnat_match), "next;", &lb->header_);
> > +
> > +        ds_destroy(&unsnat_match);
> > +    }
> > +
> >       if (!od->l3dgw_port || !od->l3redirect_port || !lb_vip->n_backends) {
> >           return;
> >       }
> > @@ -7616,19 +7650,11 @@ add_router_lb_flow(struct hmap *lflows, struct ovn_datapath *od,
> >        * router has a gateway router port associated.
> >        */
> >       struct ds undnat_match = DS_EMPTY_INITIALIZER;
> > -    if (lb_vip->addr_family == AF_INET) {
> > -        ds_put_cstr(&undnat_match, "ip4 && (");
> > -    } else {
> > -        ds_put_cstr(&undnat_match, "ip6 && (");
> > -    }
> > +    ds_put_format(&undnat_match, "%s && (", ip_match);
> >
> >       for (size_t i = 0; i < lb_vip->n_backends; i++) {
> >           struct lb_vip_backend *backend = &lb_vip->backends[i];
> > -        if (backend->addr_family == AF_INET) {
> > -            ds_put_format(&undnat_match, "(ip4.src == %s", backend->ip);
> > -        } else {
> > -            ds_put_format(&undnat_match, "(ip6.src == %s", backend->ip);
> > -        }
> > +        ds_put_format(&undnat_match, "(%s.src == %s", ip_match, backend->ip);
> >
> >           if (backend->port) {
> >               ds_put_format(&undnat_match, " && %s.src == %d) || ",
> > @@ -8879,6 +8905,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >                                               &nat->header_);
> >                       sset_add(&nat_entries, nat->external_ip);
> >                   }
> > +            } else {
> > +                /* Add the NAT external_ip to the nat_entries even for
> > +                 * gateway routers. This is required for adding load balancer
> > +                 * flows.*/
> > +                sset_add(&nat_entries, nat->external_ip);
> >               }
> >
> >               /* Egress UNDNAT table: It is for already established connections'
> > @@ -9059,8 +9090,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >               }
> >           }
> >
> > -        sset_destroy(&nat_entries);
> > -
> >           /* Handle force SNAT options set in the gateway router. */
> >           if (dnat_force_snat_ip && !od->l3dgw_port) {
> >               /* If a packet with destination IP address as that of the
> > @@ -9121,6 +9150,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >           /* Load balancing and packet defrag are only valid on
> >            * Gateway routers or router with gateway port. */
> >           if (!smap_get(&od->nbr->options, "chassis") && !od->l3dgw_port) {
> > +            sset_destroy(&nat_entries);
> >               continue;
> >           }
> >
> > @@ -9195,10 +9225,11 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >                   }
> >                   add_router_lb_flow(lflows, od, &match, &actions, prio,
> >                                      lb_force_snat_ip, lb_vip, is_udp,
> > -                                   nb_lb, meter_groups);
> > +                                   nb_lb, meter_groups, &nat_entries);
> >               }
> >           }
> >           sset_destroy(&all_ips);
> > +        sset_destroy(&nat_entries);
> >       }
> >
> >       /* Logical router ingress table ND_RA_OPTIONS & ND_RA_RESPONSE: IPv6 Router
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index a2989e78e..d127152f5 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -1288,3 +1288,41 @@ ovn-nbctl --wait=sb lb-del lb2
> >   OVS_WAIT_UNTIL([test 0 = `ovn-sbctl list service_monitor |  wc -l`])
> >
> >   AT_CLEANUP
> > +
> > +AT_SETUP([ovn -- Load balancer VIP in NAT entries])
> > +AT_SKIP_IF([test $HAVE_PYTHON = no])
> > +ovn_start
> > +
> > +ovn-nbctl lr-add lr0
> > +ovn-nbctl lrp-add lr0 lr0-public 00:00:01:01:02:04 192.168.2.1/24
> > +ovn-nbctl lrp-add lr0 lr0-join 00:00:01:01:02:04 10.10.0.1/24
> > +
> > +ovn-nbctl set logical_router lr0 options:chassis=ch1
> > +
> > +ovn-nbctl lb-add lb1 "192.168.2.1:8080" "10.0.0.4:8080"
> > +ovn-nbctl lb-add lb2 "192.168.2.4:8080" "10.0.0.5:8080" udp
> > +ovn-nbctl lb-add lb3 "192.168.2.5:8080" "10.0.0.6:8080"
> > +ovn-nbctl lb-add lb4 "192.168.2.6:8080" "10.0.0.7:8080"
> > +
> > +ovn-nbctl lr-lb-add lr0 lb1
> > +ovn-nbctl lr-lb-add lr0 lb2
> > +ovn-nbctl lr-lb-add lr0 lb3
> > +ovn-nbctl lr-lb-add lr0 lb4
> > +
> > +ovn-nbctl lr-nat-add lr0 snat 192.168.2.1 10.0.0.0/24
> > +ovn-nbctl lr-nat-add lr0 dnat_and_snat 192.168.2.4 10.0.0.4
> > +ovn-nbctl lr-nat-add lr0 dnat 192.168.2.5 10.0.0.5
> > +
> > +OVS_WAIT_UNTIL([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
> > +grep "ip4 && ip4.dst == 192.168.2.1 && tcp && tcp.dst == 8080" -c) ])
> > +
> > +AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
> > +grep "ip4 && ip4.dst == 192.168.2.4 && udp && udp.dst == 8080" -c) ])
> > +
> > +AT_CHECK([test 1 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
> > +grep "ip4 && ip4.dst == 192.168.2.5 && tcp && tcp.dst == 8080" -c) ])
> > +
> > +AT_CHECK([test 0 = $(ovn-sbctl dump-flows lr0 | grep lr_in_unsnat | \
> > +grep "ip4 && ip4.dst == 192.168.2.6 && tcp && tcp.dst == 8080" -c) ])
> > +
> > +AT_CLEANUP
> > diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> > index 3b3379840..f1ae69b20 100644
> > --- a/tests/system-ovn.at
> > +++ b/tests/system-ovn.at
> > @@ -1635,7 +1635,6 @@ ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.16
> >   ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.2.2 \
> >       external_ip=30.0.0.2 -- add logical_router R2 nat @nat
> >
> > -
> >   # Wait for ovn-controller to catch up.
> >   ovn-nbctl --wait=hv sync
> >   OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
> > @@ -1671,6 +1670,56 @@ tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(sr
> >   tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> >   ])
> >
> > +check_est_flows () {
> > +    n=$(ovs-ofctl dump-flows br-int table=14 | grep \
> > +"priority=120,ct_state=+est+trk,tcp,metadata=0x2,nw_dst=30.0.0.2,tp_dst=8000" \
> > +| grep nat | sed -n 's/.*n_packets=\([[0-9]]\{1,\}\).*/\1/p')
> > +
> > +    echo "n_packets=$n"
> > +    test "$n" != 0
> > +}
> > +
> > +OVS_WAIT_UNTIL([check_est_flows], [check established flows])
> > +
> > +
> > +ovn-nbctl set logical_router R2 options:lb_force_snat_ip="20.0.0.2"
> > +
> > +# Destroy the load balancer and create again. ovn-controller will
> > +# clear the OF flows and re add again and clears the n_packets
> > +# for these flows.
> > +ovn-nbctl destroy load_balancer $uuid
> > +uuid=`ovn-nbctl  create load_balancer vips:30.0.0.1="192.168.1.2,192.168.2.2"`
> > +ovn-nbctl set logical_router R2 load_balancer=$uuid
> > +
> > +# Config OVN load-balancer with another VIP (this time with ports).
> > +ovn-nbctl set load_balancer $uuid vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.168.2.2:80"'
> > +
> > +ovn-nbctl list load_balancer
> > +ovn-sbctl dump-flows R2
> > +OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-flows br-int table=41 | \
> > +grep 'nat(src=20.0.0.2)'])
> > +
> > +dnl Test load-balancing that includes L4 ports in NAT.
> > +for i in `seq 1 20`; do
> > +    echo Request $i
> > +    NS_CHECK_EXEC([alice1], [wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log])
> > +done
> > +
> > +dnl Each server should have at least one connection.
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.2) |
> > +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> > +tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> > +tcp,orig=(src=172.16.1.2,dst=30.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=172.16.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(20.0.0.2) |
> > +sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> > +tcp,orig=(src=172.16.1.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.1.2,dst=20.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> > +tcp,orig=(src=172.16.1.2,dst=192.168.2.2,sport=<cleared>,dport=<cleared>),reply=(src=192.168.2.2,dst=20.0.0.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,protoinfo=(state=<cleared>)
> > +])
> > +
> > +OVS_WAIT_UNTIL([check_est_flows], [check established flows])
> > +
> >   OVS_APP_EXIT_AND_WAIT([ovn-controller])
> >
> >   as ovn-sb
> >
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


More information about the dev mailing list