[ovs-dev] [PATCH ovn v4] ovn-northd: Limit ARP/ND broadcast domain whenever possible.

Dumitru Ceara dceara at redhat.com
Mon Nov 4 15:08:30 UTC 2019


On Mon, Nov 4, 2019 at 3:22 PM Numan Siddique <numans at ovn.org> wrote:
>
> On Mon, Nov 4, 2019 at 6:36 PM Dumitru Ceara <dceara at redhat.com> wrote:
> >
> > ARP request and ND NS packets for router owned IPs were being
> > flooded in the complete L2 domain (using the MC_FLOOD multicast group).
> > However, this creates a scaling issue in scenarios where aggregation
> > logical switches are connected to many logical routers (~350). The
> > logical pipelines of all routers would have to be executed before the
> > packet is finally replied to by a single router, the owner of the IP
> > address.
> >
> > This commit limits the broadcast domain by bypassing the L2 Lookup stage
> > for ARP requests that will be replied to by a single router. The packets
> > are still flooded in the L2 domain but not on any of the other patch
> > ports towards other routers connected to the switch. This restricted
> > flooding is done by using a new multicast group (MC_ARP_ND_FLOOD).
> >
> > The router owned IPs for which this fix applies are:
> > - IP addresses configured on the router ports.
> > - VIPs.
> > - NAT IPs.
> >
> > This commit also fixes function get_router_load_balancer_ips() which
> > was incorrectly returning a single address_family even though the
> > IP set could contain a mix of IPv4 and IPv6 addresses.
> >
> > Reported-at: https://bugzilla.redhat.com/1756945
> > Reported-by: Anil Venkata <vkommadi at redhat.com>
> > Signed-off-by: Dumitru Ceara <dceara at redhat.com>
>
> Hi Dumitru,
> The patch LGTM. I have just a couple of minor comments.
>
> Thanks
> Numan

Thanks Numan for the review.
I sent a v5 with fixes for the issues you pointed out:
https://patchwork.ozlabs.org/patch/1188982/
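
For anyone following along, the new ls_in_l2_lkup (table 17) flows added by
this patch end up looking roughly like this for the topology in the new test
(a sketch using the rtr1-sw interface address 10.0.0.1, not literal
ovn-sbctl output):

  table=17(ls_in_l2_lkup), priority=75,
    match=(arp.tpa == 10.0.0.1 && arp.op == 1 && flags[1] == 0),
    action=(clone { outport = "sw-rtr1"; output; };
            outport = "_MC_arp_nd"; output;)

Only the owning router port gets the cloned request; the original packet is
still flooded, but via the new "_MC_arp_nd" group, which contains all
non-router ports and so excludes the other router patch ports.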

Thanks,
Dumitru

>
> >
> > ---
> > v4: Rebase.
> > v3: Properly deal with VXLAN traffic. Address review comments from
> >     Numan (add autotests). Fix function get_router_load_balancer_ips.
> >     Rebase -> deal with IPv6 NAT too.
> > v2: Move ARP broadcast domain limiting to table S_SWITCH_IN_L2_LKUP to
> > address localnet ports too.
> > ---
> >  lib/mcast-group-index.h |   1 +
> >  northd/ovn-northd.8.xml |  16 +++
> >  northd/ovn-northd.c     | 340 ++++++++++++++++++++++++++++++++++++------------
> >  tests/ovn.at            | 244 +++++++++++++++++++++++++++++++++-
> >  4 files changed, 517 insertions(+), 84 deletions(-)
> >
> > diff --git a/lib/mcast-group-index.h b/lib/mcast-group-index.h
> > index ba995ba..06bd8b3 100644
> > --- a/lib/mcast-group-index.h
> > +++ b/lib/mcast-group-index.h
> > @@ -27,6 +27,7 @@ enum ovn_mcast_tunnel_keys {
> >
> >      OVN_MCAST_FLOOD_TUNNEL_KEY = OVN_MIN_MULTICAST,
> >      OVN_MCAST_UNKNOWN_TUNNEL_KEY,
> > +    OVN_MCAST_ARP_ND_TUNNEL_KEY,
> >      OVN_MCAST_MROUTER_FLOOD_TUNNEL_KEY,
> >      OVN_MCAST_MROUTER_STATIC_TUNNEL_KEY,
> >      OVN_MCAST_STATIC_TUNNEL_KEY,
> > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > index 0a33dcd..6fbb3ab 100644
> > --- a/northd/ovn-northd.8.xml
> > +++ b/northd/ovn-northd.8.xml
> > @@ -1005,6 +1005,22 @@ output;
> >        </li>
> >
> >        <li>
> > +        Priority-80 flows for each port connected to a logical router
> > +        matching self-originated GARP/ARP request/ND packets. These packets
> > +        are flooded to the <code>MC_FLOOD</code> multicast group, which
> > +        contains all logical ports.
> > +      </li>
> > +
> > +      <li>
> > +        Priority-75 flows for each IP address/VIP/NAT address owned by a
> > +        router port connected to the switch. These flows match ARP requests
> > +        and ND packets for the specific IP addresses.  Matched packets are
> > +        forwarded in the L3 domain only to the router that owns the IP
> > +        address and flooded in the L2 domain on all ports except patch
> > +        ports connected to logical routers.
> > +      </li>
> > +
> > +      <li>
> >          A priority-70 flow that outputs all packets with an Ethernet broadcast
> >          or multicast <code>eth.dst</code> to the <code>MC_FLOOD</code>
> >          multicast group.
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index c23c270..b13b432 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -210,6 +210,8 @@ enum ovn_stage {
> >  #define REGBIT_LOOKUP_NEIGHBOR_RESULT "reg9[4]"
> >  #define REGBIT_SKIP_LOOKUP_NEIGHBOR "reg9[5]"
> >
> > +#define REGBIT_NOT_VXLAN "flags[1] == 0"
> > +
> >  /* Returns an "enum ovn_stage" built from the arguments. */
> >  static enum ovn_stage
> >  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline pipeline,
> > @@ -1202,6 +1204,34 @@ ovn_port_allocate_key(struct ovn_datapath *od)
> >                            1, (1u << 15) - 1, &od->port_key_hint);
> >  }
> >
> > +/* Returns true if the logical switch port 'enabled' column is empty or
> > + * set to true.  Otherwise, returns false. */
> > +static bool
> > +lsp_is_enabled(const struct nbrec_logical_switch_port *lsp)
> > +{
> > +    return !lsp->n_enabled || *lsp->enabled;
> > +}
> > +
> > +/* Returns true only if the logical switch port 'up' column is set to true.
> > + * Otherwise, if the column is not set or set to false, returns false. */
> > +static bool
> > +lsp_is_up(const struct nbrec_logical_switch_port *lsp)
> > +{
> > +    return lsp->n_up && *lsp->up;
> > +}
> > +
> > +static bool
> > +lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
> > +{
> > +    return !strcmp(nbsp->type, "external");
> > +}
> > +
> > +static bool
> > +lrport_is_enabled(const struct nbrec_logical_router_port *lrport)
> > +{
> > +    return !lrport->enabled || *lrport->enabled;
> > +}
> > +
> >  static char *
> >  chassis_redirect_name(const char *port_name)
> >  {
> > @@ -2184,7 +2214,7 @@ ip_address_and_port_from_lb_key(const char *key, char **ip_address,
> >
> >  static void
> >  get_router_load_balancer_ips(const struct ovn_datapath *od,
> > -                             struct sset *all_ips, int *addr_family)
> > +                             struct sset *all_ips_v4, struct sset *all_ips_v6)
> >  {
> >      if (!od->nbr) {
> >          return;
> > @@ -2199,13 +2229,21 @@ get_router_load_balancer_ips(const struct ovn_datapath *od,
> >              /* node->key contains IP:port or just IP. */
> >              char *ip_address = NULL;
> >              uint16_t port;
> > +            int addr_family;
> >
> >              ip_address_and_port_from_lb_key(node->key, &ip_address, &port,
> > -                                            addr_family);
> > +                                            &addr_family);
> >              if (!ip_address) {
> >                  continue;
> >              }
> >
> > +            struct sset *all_ips;
> > +            if (addr_family == AF_INET) {
> > +                all_ips = all_ips_v4;
> > +            } else {
> > +                all_ips = all_ips_v6;
> > +            }
> > +
> >              if (!sset_contains(all_ips, ip_address)) {
> >                  sset_add(all_ips, ip_address);
> >              }
> > @@ -2299,17 +2337,22 @@ get_nat_addresses(const struct ovn_port *op, size_t *n)
> >          }
> >      }
> >
> > -    /* A set to hold all load-balancer vips. */
> > -    struct sset all_ips = SSET_INITIALIZER(&all_ips);
> > -    int addr_family;
> > -    get_router_load_balancer_ips(op->od, &all_ips, &addr_family);
> > +    /* Two sets to hold all load-balancer vips. */
> > +    struct sset all_ips_v4 = SSET_INITIALIZER(&all_ips_v4);
> > +    struct sset all_ips_v6 = SSET_INITIALIZER(&all_ips_v6);
> > +    get_router_load_balancer_ips(op->od, &all_ips_v4, &all_ips_v6);
> >
> >      const char *ip_address;
> > -    SSET_FOR_EACH (ip_address, &all_ips) {
> > +    SSET_FOR_EACH (ip_address, &all_ips_v4) {
> >          ds_put_format(&c_addresses, " %s", ip_address);
> >          central_ip_address = true;
> >      }
> > -    sset_destroy(&all_ips);
> > +    SSET_FOR_EACH (ip_address, &all_ips_v6) {
> > +        ds_put_format(&c_addresses, " %s", ip_address);
> > +        central_ip_address = true;
> > +    }
> > +    sset_destroy(&all_ips_v4);
> > +    sset_destroy(&all_ips_v6);
> >
> >      if (central_ip_address) {
> >          /* Gratuitous ARP for centralized NAT rules on distributed gateway
> > @@ -3036,6 +3079,10 @@ static const struct multicast_group mc_static =
> >  static const struct multicast_group mc_unknown =
> >      { MC_UNKNOWN, OVN_MCAST_UNKNOWN_TUNNEL_KEY };
> >
> > +#define MC_ARP_ND "_MC_arp_nd"
> > +static const struct multicast_group mc_arp_nd =
> > +    { MC_ARP_ND, OVN_MCAST_ARP_ND_TUNNEL_KEY };
> > +
> >  static bool
> >  multicast_group_equal(const struct multicast_group *a,
> >                        const struct multicast_group *b)
> > @@ -3737,28 +3784,6 @@ build_port_security_ip(enum ovn_pipeline pipeline, struct ovn_port *op,
> >
> >  }
> >
> > -/* Returns true if the logical switch port 'enabled' column is empty or
> > - * set to true.  Otherwise, returns false. */
> > -static bool
> > -lsp_is_enabled(const struct nbrec_logical_switch_port *lsp)
> > -{
> > -    return !lsp->n_enabled || *lsp->enabled;
> > -}
> > -
> > -/* Returns true only if the logical switch port 'up' column is set to true.
> > - * Otherwise, if the column is not set or set to false, returns false. */
> > -static bool
> > -lsp_is_up(const struct nbrec_logical_switch_port *lsp)
> > -{
> > -    return lsp->n_up && *lsp->up;
> > -}
> > -
> > -static bool
> > -lsp_is_external(const struct nbrec_logical_switch_port *nbsp)
> > -{
> > -    return !strcmp(nbsp->type, "external");
> > -}
> > -
> >  static bool
> >  build_dhcpv4_action(struct ovn_port *op, ovs_be32 offer_ip,
> >                      struct ds *options_action, struct ds *response_action,
> > @@ -5161,6 +5186,143 @@ build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
> >      }
> >  }
> >
> > +/*
> > + * Ingress table 17: Flows that forward ARP/ND requests only to the routers
> > + * that own the addresses. Packets are still flooded in the switching domain
> > + * as regular broadcast.
> > + */
> > +static void
> > +build_lswitch_rport_arp_flow(const char *target_address, int addr_family,
> > +                             struct ovn_port *patch_op,
> > +                             struct ovn_datapath *od,
> > +                             uint32_t priority,
> > +                             struct hmap *lflows)
> > +{
> > +    struct ds match   = DS_EMPTY_INITIALIZER;
> > +    struct ds actions = DS_EMPTY_INITIALIZER;
> > +
> > +    if (addr_family == AF_INET) {
> > +        ds_put_format(&match, "arp.tpa == %s && arp.op == 1", target_address);
> > +    } else {
> > +        ds_put_format(&match, "nd.target == %s && nd_ns", target_address);
> > +    }
> > +
> > +    /* Packets received from VXLAN tunnels have already been through the
> > +     * router pipeline so we should skip them. Normally this is done by the
> > +     * multicast_group implementation (VXLAN packets skip table 32 which
> > +     * delivers to patch ports) but we're bypassing multicast_groups.
> > +     */
> > +    ds_put_format(&match, " && "REGBIT_NOT_VXLAN);
> > +
> > +    /* Send a clone of the packet to the router pipeline and flood the
> > +     * original in the broadcast domain (skipping router ports). */
> > +    ds_put_format(&actions,
> > +                  "clone { outport = %s; output; }; "
> > +                  "outport = \""MC_ARP_ND"\"; output;",
> > +                  patch_op->json_key);
> > +    ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, priority,
> > +                  ds_cstr(&match), ds_cstr(&actions));
> > +
> > +    ds_destroy(&match);
> > +    ds_destroy(&actions);
> > +}
> > +
> > +/*
> > + * Ingress table 17: Flows that forward ARP/ND requests only to the routers
> > + * that own the addresses.
> > + * Priorities:
> > + * - 80: self originated GARPs that need to follow regular processing.
> > + * - 75: ARP requests to router owned IPs (interface IP/LB/NAT).
> > + */
> > +static void
> > +build_lswitch_rport_arp_responders(struct ovn_port *op,
> > +                                   struct ovn_datapath *sw_od,
> > +                                   struct ovn_port *sw_op,
> > +                                   struct hmap *lflows)
> > +{
> > +    if (!op || !op->nbrp) {
> > +        return;
> > +    }
> > +
> > +    if (!lrport_is_enabled(op->nbrp)) {
> > +        return;
> > +    }
> > +
> > +    struct ds match = DS_EMPTY_INITIALIZER;
> > +
> > +    /* Self originated (G)ARP requests/ND need to be flooded as usual.
> > +     * Priority: 40.
>
> You mean priority 80  here right ?
>
> I think you need to update the comments to change from 40 to 80
> and 30 to 75.
>
> > +     */
> > +    ds_put_format(&match, "inport == %s && (arp.op == 1 || nd_ns)",
> > +                  sw_op->json_key);
> > +    ovn_lflow_add(lflows, sw_od, S_SWITCH_IN_L2_LKUP, 80,
> > +                  ds_cstr(&match),
> > +                  "outport = \""MC_FLOOD"\"; output;");
> > +
> > +    ds_destroy(&match);
> > +
> > +    /* Forward ARP requests for IPs configured on the router only to this
> > +     * router port.
> > +     * Priority: 30.
> Same here.
>
> > +     */
> > +    for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> > +        build_lswitch_rport_arp_flow(op->lrp_networks.ipv4_addrs[i].addr_s,
> > +                                     AF_INET, sw_op, sw_od, 75, lflows);
> > +    }
> > +    for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
> > +        build_lswitch_rport_arp_flow(op->lrp_networks.ipv6_addrs[i].addr_s,
> > +                                     AF_INET6, sw_op, sw_od, 75, lflows);
> > +    }
> > +
> > +    /* Forward ARP requests to load-balancer VIPs configured on the router
> > +     * only to this router port.
> > +     * Priority: 30.
> > +     */
> > +    struct sset all_ips_v4 = SSET_INITIALIZER(&all_ips_v4);
> > +    struct sset all_ips_v6 = SSET_INITIALIZER(&all_ips_v6);
> > +    const char *ip_address;
> > +
> > +    get_router_load_balancer_ips(op->od, &all_ips_v4, &all_ips_v6);
> > +
> > +    SSET_FOR_EACH (ip_address, &all_ips_v4) {
> > +        build_lswitch_rport_arp_flow(ip_address, AF_INET, sw_op, sw_od,
> > +                                     75, lflows);
> > +    }
> > +    SSET_FOR_EACH (ip_address, &all_ips_v6) {
> > +        build_lswitch_rport_arp_flow(ip_address, AF_INET6, sw_op, sw_od,
> > +                                     75, lflows);
> > +    }
> > +    sset_destroy(&all_ips_v4);
> > +    sset_destroy(&all_ips_v6);
> > +
> > +    /* Forward ARP requests to NAT addresses configured on the router
> > +     * only to this router port.
> > +     * Priority: 30.
> > +     */
> > +    for (int i = 0; i < op->od->nbr->n_nat; i++) {
> > +        const struct nbrec_nat *nat = op->od->nbr->nat[i];
> > +
> > +        if (!strcmp(nat->type, "snat")) {
> > +            continue;
> > +        }
> > +
> > +        ovs_be32 ip;
> > +        ovs_be32 mask;
> > +        struct in6_addr ipv6;
> > +        struct in6_addr mask_v6;
> > +
> > +        if (ip_parse_masked(nat->external_ip, &ip, &mask)) {
> > +            if (!ipv6_parse_masked(nat->external_ip, &ipv6, &mask_v6)) {
> > +                build_lswitch_rport_arp_flow(nat->external_ip, AF_INET6, sw_op,
> > +                                             sw_od, 75, lflows);
> > +            }
> > +        } else {
> > +            build_lswitch_rport_arp_flow(nat->external_ip, AF_INET, sw_op,
> > +                                         sw_od, 75, lflows);
> > +        }
> > +    }
> > +}
> > +
> >  static void
> >  build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >                      struct hmap *port_groups, struct hmap *lflows,
> > @@ -5748,6 +5910,15 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >              continue;
> >          }
> >
> > +        /* For ports connected to logical routers add flows to bypass the
> > +         * broadcast flooding of ARP/ND requests in table 17. We direct the
> > +         * requests only to the router port that owns the IP address.
> > +         */
> > +        if (!strcmp(op->nbsp->type, "router")) {
> > +            build_lswitch_rport_arp_responders(op->peer, op->od, op,
> > +                                               lflows);
> > +        }
> > +
> >          for (size_t i = 0; i < op->nbsp->n_addresses; i++) {
> >              /* Addresses are owned by the logical port.
> >               * Ethernet address followed by zero or more IPv4
> > @@ -5879,12 +6050,6 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
> >      ds_destroy(&actions);
> >  }
> >
> > -static bool
> > -lrport_is_enabled(const struct nbrec_logical_router_port *lrport)
> > -{
> > -    return !lrport->enabled || *lrport->enabled;
> > -}
> > -
> >  /* Returns a string of the IP address of the router port 'op' that
> >   * overlaps with 'ip_s".  If one is not found, returns NULL.
> >   *
> > @@ -6904,61 +7069,66 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
> >          }
> >
> >          /* A set to hold all load-balancer vips that need ARP responses. */
> > -        struct sset all_ips = SSET_INITIALIZER(&all_ips);
> > -        int addr_family;
> > -        get_router_load_balancer_ips(op->od, &all_ips, &addr_family);
> > +        struct sset all_ips_v4 = SSET_INITIALIZER(&all_ips_v4);
> > +        struct sset all_ips_v6 = SSET_INITIALIZER(&all_ips_v6);
> > +        get_router_load_balancer_ips(op->od, &all_ips_v4, &all_ips_v6);
> >
> >          const char *ip_address;
> > -        SSET_FOR_EACH(ip_address, &all_ips) {
> > +        SSET_FOR_EACH (ip_address, &all_ips_v4) {
> >              ds_clear(&match);
> > -            if (addr_family == AF_INET) {
> > -                ds_put_format(&match,
> > -                              "inport == %s && arp.tpa == %s && arp.op == 1",
> > -                              op->json_key, ip_address);
> > -            } else {
> > -                ds_put_format(&match,
> > -                              "inport == %s && nd_ns && nd.target == %s",
> > -                              op->json_key, ip_address);
> > -            }
> > +            ds_put_format(&match,
> > +                          "inport == %s && arp.tpa == %s && arp.op == 1",
> > +                          op->json_key, ip_address);
> >
> >              ds_clear(&actions);
> > -            if (addr_family == AF_INET) {
> > -                ds_put_format(&actions,
> > -                "eth.dst = eth.src; "
> > -                "eth.src = %s; "
> > -                "arp.op = 2; /* ARP reply */ "
> > -                "arp.tha = arp.sha; "
> > -                "arp.sha = %s; "
> > -                "arp.tpa = arp.spa; "
> > -                "arp.spa = %s; "
> > -                "outport = %s; "
> > -                "flags.loopback = 1; "
> > -                "output;",
> > -                op->lrp_networks.ea_s,
> > -                op->lrp_networks.ea_s,
> > -                ip_address,
> > -                op->json_key);
> > -            } else {
> > -                ds_put_format(&actions,
> > -                "nd_na { "
> > -                "eth.src = %s; "
> > -                "ip6.src = %s; "
> > -                "nd.target = %s; "
> > -                "nd.tll = %s; "
> > -                "outport = inport; "
> > -                "flags.loopback = 1; "
> > -                "output; "
> > -                "};",
> > -                op->lrp_networks.ea_s,
> > -                ip_address,
> > -                ip_address,
> > -                op->lrp_networks.ea_s);
> > -            }
> > +            ds_put_format(&actions,
> > +                          "eth.dst = eth.src; "
> > +                          "eth.src = %s; "
> > +                          "arp.op = 2; /* ARP reply */ "
> > +                          "arp.tha = arp.sha; "
> > +                          "arp.sha = %s; "
> > +                          "arp.tpa = arp.spa; "
> > +                          "arp.spa = %s; "
> > +                          "outport = %s; "
> > +                          "flags.loopback = 1; "
> > +                          "output;",
> > +                          op->lrp_networks.ea_s,
> > +                          op->lrp_networks.ea_s,
> > +                          ip_address,
> > +                          op->json_key);
> > +
> >              ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90,
> >                            ds_cstr(&match), ds_cstr(&actions));
> >          }
> >
> > -        sset_destroy(&all_ips);
> > +        SSET_FOR_EACH (ip_address, &all_ips_v6) {
> > +            ds_clear(&match);
> > +            ds_put_format(&match,
> > +                          "inport == %s && nd_ns && nd.target == %s",
> > +                          op->json_key, ip_address);
> > +
> > +            ds_clear(&actions);
> > +            ds_put_format(&actions,
> > +                          "nd_na { "
> > +                          "eth.src = %s; "
> > +                          "ip6.src = %s; "
> > +                          "nd.target = %s; "
> > +                          "nd.tll = %s; "
> > +                          "outport = inport; "
> > +                          "flags.loopback = 1; "
> > +                          "output; "
> > +                          "};",
> > +                          op->lrp_networks.ea_s,
> > +                          ip_address,
> > +                          ip_address,
> > +                          op->lrp_networks.ea_s);
> > +
> > +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90,
> > +                          ds_cstr(&match), ds_cstr(&actions));
> > +        }
> > +
> > +        sset_destroy(&all_ips_v4);
> > +        sset_destroy(&all_ips_v6);
> >
> >          /* A gateway router can have 2 SNAT IP addresses to force DNATed and
> >           * LBed traffic respectively to be SNATed.  In addition, there can be
> > @@ -9392,6 +9562,12 @@ build_mcast_groups(struct northd_context *ctx,
> >          } else if (op->nbsp && lsp_is_enabled(op->nbsp)) {
> >              ovn_multicast_add(mcast_groups, &mc_flood, op);
> >
> > +            /* Add all non-router ports to the ARP ND L2 broadcast flood
> > +             * domain entry. */
> > +            if (strcmp(op->nbsp->type, "router")) {
> > +                ovn_multicast_add(mcast_groups, &mc_arp_nd, op);
> > +            }
> > +
> >              /* If this port is connected to a multicast router then add it
> >               * to the MC_MROUTER_FLOOD group.
> >               */
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 410f4b5..c56bdce 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -9595,7 +9595,7 @@ ovn-nbctl --wait=hv --timeout=3 sync
> >  # Check that there is a logical flow in logical switch foo's pipeline
> >  # to set the outport to rp-foo (which is expected).
> >  OVS_WAIT_UNTIL([test 1 = `ovn-sbctl dump-flows foo | grep ls_in_l2_lkup | \
> > -grep rp-foo | grep -v is_chassis_resident | wc -l`])
> > +grep rp-foo | grep -v is_chassis_resident | grep priority=50 -c`])
> >
> >  # Set the option 'reside-on-redirect-chassis' for foo
> >  ovn-nbctl set logical_router_port foo options:reside-on-redirect-chassis=true
> > @@ -9603,7 +9603,7 @@ ovn-nbctl set logical_router_port foo options:reside-on-redirect-chassis=true
> >  # to set the outport to rp-foo with the condition is_chassis_redirect.
> >  ovn-sbctl dump-flows foo
> >  OVS_WAIT_UNTIL([test 1 = `ovn-sbctl dump-flows foo | grep ls_in_l2_lkup | \
> > -grep rp-foo | grep is_chassis_resident | wc -l`])
> > +grep rp-foo | grep is_chassis_resident | grep priority=50 -c`])
> >
> >  echo "---------NB dump-----"
> >  ovn-nbctl show
> > @@ -16676,3 +16676,243 @@ as hv4 ovs-appctl fdb/show br-phys
> >  OVN_CLEANUP([hv1],[hv2],[hv3],[hv4])
> >
> >  AT_CLEANUP
> > +
> > +AT_SETUP([ovn -- ARP/ND request broadcast limiting])
> > +AT_SKIP_IF([test $HAVE_PYTHON = no])
> > +ovn_start
> > +
> > +ip_to_hex() {
> > +    printf "%02x%02x%02x%02x" "$@"
> > +}
> > +
> > +send_arp_request() {
> > +    local hv=$1 inport=$2 eth_src=$3 spa=$4 tpa=$5
> > +    local eth_dst=ffffffffffff
> > +    local eth_type=0806
> > +    local eth=${eth_dst}${eth_src}${eth_type}
> > +
> > +    local arp=0001080006040001${eth_src}${spa}${eth_dst}${tpa}
> > +
> > +    local request=${eth}${arp}
> > +    as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request
> > +}
> > +
> > +send_nd_ns() {
> > +    local hv=$1 inport=$2 eth_src=$3 spa=$4 tpa=$5 cksum=$6
> > +
> > +    local eth_dst=ffffffffffff
> > +    local eth_type=86dd
> > +    local eth=${eth_dst}${eth_src}${eth_type}
> > +
> > +    local ip_vhlen=60000000
> > +    local ip_plen=0020
> > +    local ip_next=3a
> > +    local ip_ttl=ff
> > +    local ip=${ip_vhlen}${ip_plen}${ip_next}${ip_ttl}${spa}${tpa}
> > +
> > +    # Neighbor Solicitation
> > +    local icmp6_type=87
> > +    local icmp6_code=00
> > +    local icmp6_rsvd=00000000
> > +    # ICMPv6 source lla option
> > +    local icmp6_opt=01
> > +    local icmp6_optlen=01
> > +    local icmp6=${icmp6_type}${icmp6_code}${cksum}${icmp6_rsvd}${tpa}${icmp6_opt}${icmp6_optlen}${eth_src}
> > +
> > +    local request=${eth}${ip}${icmp6}
> > +
> > +    as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $request
> > +}
> > +
> > +src_mac=000000000001
> > +
> > +net_add n1
> > +sim_add hv1
> > +as hv1
> > +ovs-vsctl add-br br-phys
> > +ovn_attach n1 br-phys 192.168.0.1
> > +
> > +ovs-vsctl -- add-port br-int hv1-vif1 -- \
> > +    set interface hv1-vif1 external-ids:iface-id=sw-agg-ext \
> > +    options:tx_pcap=hv1/vif1-tx.pcap \
> > +    options:rxq_pcap=hv1/vif1-rx.pcap \
> > +    ofport-request=1
> > +
> > +# One Aggregation Switch connected to two Logical networks (routers).
> > +ovn-nbctl ls-add sw-agg
> > +ovn-nbctl lsp-add sw-agg sw-agg-ext \
> > +    -- lsp-set-addresses sw-agg-ext 00:00:00:00:00:01
> > +
> > +ovn-nbctl lsp-add sw-agg sw-rtr1                   \
> > +    -- lsp-set-type sw-rtr1 router                 \
> > +    -- lsp-set-addresses sw-rtr1 00:00:00:00:01:00 \
> > +    -- lsp-set-options sw-rtr1 router-port=rtr1-sw
> > +ovn-nbctl lsp-add sw-agg sw-rtr2                   \
> > +    -- lsp-set-type sw-rtr2 router                 \
> > +    -- lsp-set-addresses sw-rtr2 00:00:00:00:02:00 \
> > +    -- lsp-set-options sw-rtr2 router-port=rtr2-sw
> > +
> > +# Configure L3 interface IPv4 & IPv6 on both routers
> > +ovn-nbctl lr-add rtr1
> > +ovn-nbctl lrp-add rtr1 rtr1-sw 00:00:00:00:01:00 10.0.0.1/24 10::1/64
> > +
> > +ovn-nbctl lr-add rtr2
> > +ovn-nbctl lrp-add rtr2 rtr2-sw 00:00:00:00:02:00 10.0.0.2/24 10::2/64
> > +
> > +OVN_POPULATE_ARP
> > +ovn-nbctl --wait=hv sync
> > +
> > +sw_dp_uuid=$(ovn-sbctl --bare --columns _uuid list datapath_binding sw-agg)
> > +sw_dp_key=$(ovn-sbctl --bare --columns tunnel_key list datapath_binding sw-agg)
> > +
> > +r1_tnl_key=$(ovn-sbctl --bare --columns tunnel_key list port_binding sw-rtr1)
> > +r2_tnl_key=$(ovn-sbctl --bare --columns tunnel_key list port_binding sw-rtr2)
> > +
> > +mc_key=$(ovn-sbctl --bare --columns tunnel_key find multicast_group datapath=${sw_dp_uuid} name="_MC_flood")
> > +mc_key=$(printf "%04x" $mc_key)
> > +
> > +match_sw_metadata="metadata=0x${sw_dp_key}"
> > +
> > +# Inject ARP request for first router owned IP address.
> > +send_arp_request 1 1 ${src_mac} $(ip_to_hex 10 0 0 254) $(ip_to_hex 10 0 0 1)
> > +
> > +# Verify that the ARP request is sent only to rtr1.
> > +match_arp_req1="${match_sw_metadata}.*arp_tpa=10.0.0.1,arp_op=1"
> > +match_arp_req2="${match_sw_metadata}.*arp_tpa=10.0.0.2,arp_op=1"
> > +match_send_rtr1="clone(load:0x${r1_tnl_key}->NXM_NX_REG15"
> > +match_send_rtr2="clone(load:0x${r2_tnl_key}->NXM_NX_REG15"
> > +
> > +as hv1
> > +pkts_to_rtr1=$(ovs-ofctl dump-flows br-int | grep -E "${match_arp_req1}" | \
> > +    grep "${match_send_rtr1}" | grep n_packets=1 -c)
> > +pkts_to_rtr2=$(ovs-ofctl dump-flows br-int | grep -E "${match_arp_req2}" | \
> > +    grep "${match_send_rtr2}" | grep n_packets=1 -c)
> > +pkts_flooded=$(ovs-ofctl dump-flows br-int | grep -E "${match_sw_metadata}" | \
> > +    grep ${mc_key} | grep -v n_packets=0 -c)
> > +
> > +AT_CHECK([test "1" = "${pkts_to_rtr1}"])
> > +AT_CHECK([test "0" = "${pkts_to_rtr2}"])
> > +AT_CHECK([test "0" = "${pkts_flooded}"])
> > +
> > +# Inject ND_NS for first router owned IP address.
> > +src_ipv6=00100000000000000000000000000254
> > +dst_ipv6=00100000000000000000000000000001
> > +send_nd_ns 1 1 ${src_mac} ${src_ipv6} ${dst_ipv6} 751d
> > +
> > +# Verify that the ND_NS is sent only to rtr1.
> > +match_nd_ns1="${match_sw_metadata}.*icmp_type=135.*nd_target=10::1"
> > +match_nd_ns2="${match_sw_metadata}.*icmp_type=135.*nd_target=10::2"
> > +as hv1
> > +pkts_to_rtr1=$(ovs-ofctl dump-flows br-int | grep -E "${match_nd_ns1}" | \
> > +    grep "${match_send_rtr1}" | grep n_packets=1 -c)
> > +pkts_to_rtr2=$(ovs-ofctl dump-flows br-int | grep -E "${match_nd_ns1}" | \
> > +    grep "${match_send_rtr2}" | grep n_packets=1 -c)
> > +pkts_flooded=$(ovs-ofctl dump-flows br-int | grep -E "${match_sw_metadata}" | \
> > +    grep ${mc_key} | grep -v n_packets=0 -c)
> > +
> > +AT_CHECK([test "1" = "${pkts_to_rtr1}"])
> > +AT_CHECK([test "0" = "${pkts_to_rtr2}"])
> > +AT_CHECK([test "0" = "${pkts_flooded}"])
> > +
> > +# Configure load balancing on both routers.
> > +ovn-nbctl lb-add lb1-v4 10.0.0.11 42.42.42.1
> > +ovn-nbctl lb-add lb1-v6 10::11 42::1
> > +ovn-nbctl lr-lb-add rtr1 lb1-v4
> > +ovn-nbctl lr-lb-add rtr1 lb1-v6
> > +
> > +ovn-nbctl lb-add lb2-v4 10.0.0.22 42.42.42.2
> > +ovn-nbctl lb-add lb2-v6 10::22 42::2
> > +ovn-nbctl lr-lb-add rtr2 lb2-v4
> > +ovn-nbctl lr-lb-add rtr2 lb2-v6
> > +ovn-nbctl --wait=hv sync
> > +
> > +# Inject ARP request for first router owned VIP address.
> > +send_arp_request 1 1 ${src_mac} $(ip_to_hex 10 0 0 254) $(ip_to_hex 10 0 0 11)
> > +
> > +# Verify that the ARP request is sent only to rtr1.
> > +match_arp_req1="${match_sw_metadata}.*arp_tpa=10.0.0.11,arp_op=1"
> > +match_arp_req2="${match_sw_metadata}.*arp_tpa=10.0.0.22,arp_op=1"
> > +match_send_rtr1="clone(load:0x${r1_tnl_key}->NXM_NX_REG15"
> > +match_send_rtr2="clone(load:0x${r2_tnl_key}->NXM_NX_REG15"
> > +
> > +as hv1
> > +pkts_to_rtr1=$(ovs-ofctl dump-flows br-int | grep -E "${match_arp_req1}" | \
> > +    grep "${match_send_rtr1}" | grep n_packets=1 -c)
>
> When we run tests with -j10, for instance, this could fail in the below AT_CHECK
> if ovn-northd/ovn-controller is a little slow to catch up.
>
> Can you please use OVS_WAIT_UNTIL for the first check instead of AT_CHECK
> whenever you call send_arp_request/send_nd_ns, to be sure that we
> don't fail the tests because of timing issues?
>
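
Agreed, that makes sense. Something along these lines for the first counter
check after each packet injection should avoid the race (just a sketch, not
necessarily the exact code that ended up in v5):

  OVS_WAIT_UNTIL([test 1 = `ovs-ofctl dump-flows br-int | \
      grep -E "${match_arp_req1}" | grep "${match_send_rtr1}" | \
      grep -c n_packets=1`])

The remaining "0 packets" assertions can stay as AT_CHECKs once the first
wait has confirmed that the flow was installed and hit.
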
>
> > +pkts_to_rtr2=$(ovs-ofctl dump-flows br-int | grep -E "${match_arp_req2}" | \
> > +    grep "${match_send_rtr2}" | grep n_packets=1 -c)
> > +pkts_flooded=$(ovs-ofctl dump-flows br-int | grep -E "${match_sw_metadata}" | \
> > +    grep ${mc_key} | grep -v n_packets=0 -c)
> > +
> > +AT_CHECK([test "1" = "${pkts_to_rtr1}"])
> > +AT_CHECK([test "0" = "${pkts_to_rtr2}"])
> > +AT_CHECK([test "0" = "${pkts_flooded}"])
> > +
> > +# Inject ND_NS for first router owned VIP address.
> > +src_ipv6=00100000000000000000000000000254
> > +dst_ipv6=00100000000000000000000000000011
> > +send_nd_ns 1 1 ${src_mac} ${src_ipv6} ${dst_ipv6} 751d
> > +
> > +# Verify that the ND_NS is sent only to rtr1.
> > +match_nd_ns1="${match_sw_metadata}.*icmp_type=135.*nd_target=10::11"
> > +match_nd_ns2="${match_sw_metadata}.*icmp_type=135.*nd_target=10::22"
> > +as hv1
> > +pkts_to_rtr1=$(ovs-ofctl dump-flows br-int | grep -E "${match_nd_ns1}" | \
> > +    grep "${match_send_rtr1}" | grep n_packets=1 -c)
> > +pkts_to_rtr2=$(ovs-ofctl dump-flows br-int | grep -E "${match_nd_ns1}" | \
> > +    grep "${match_send_rtr2}" | grep n_packets=1 -c)
> > +pkts_flooded=$(ovs-ofctl dump-flows br-int | grep -E "${match_sw_metadata}" | \
> > +    grep ${mc_key} | grep -v n_packets=0 -c)
> > +
> > +AT_CHECK([test "1" = "${pkts_to_rtr1}"])
> > +AT_CHECK([test "0" = "${pkts_to_rtr2}"])
> > +AT_CHECK([test "0" = "${pkts_flooded}"])
> > +
> > +# Configure NAT on both routers
> > +ovn-nbctl lr-nat-add rtr1 dnat_and_snat 10.0.0.111 42.42.42.1
> > +ovn-nbctl lr-nat-add rtr1 dnat_and_snat 10::111 42::1
> > +ovn-nbctl lr-nat-add rtr2 dnat_and_snat 10.0.0.222 42.42.42.2
> > +ovn-nbctl lr-nat-add rtr2 dnat_and_snat 10::222 42::2
> > +
> > +# Inject ARP request for first router owned NAT address.
> > +send_arp_request 1 1 ${src_mac} $(ip_to_hex 10 0 0 254) $(ip_to_hex 10 0 0 111)
> > +
> > +# Verify that the ARP request is sent only to rtr1.
> > +match_arp_req1="${match_sw_metadata}.*arp_tpa=10.0.0.111,arp_op=1"
> > +match_arp_req2="${match_sw_metadata}.*arp_tpa=10.0.0.222,arp_op=1"
> > +match_send_rtr1="clone(load:0x${r1_tnl_key}->NXM_NX_REG15"
> > +match_send_rtr2="clone(load:0x${r2_tnl_key}->NXM_NX_REG15"
> > +
> > +as hv1
> > +pkts_to_rtr1=$(ovs-ofctl dump-flows br-int | grep -E "${match_arp_req1}" | \
> > +    grep "${match_send_rtr1}" | grep n_packets=1 -c)
> > +pkts_to_rtr2=$(ovs-ofctl dump-flows br-int | grep -E "${match_arp_req2}" | \
> > +    grep "${match_send_rtr2}" | grep n_packets=1 -c)
> > +pkts_flooded=$(ovs-ofctl dump-flows br-int | grep -E "${match_sw_metadata}" | \
> > +    grep ${mc_key} | grep -v n_packets=0 -c)
> > +
> > +AT_CHECK([test "1" = "${pkts_to_rtr1}"])
> > +AT_CHECK([test "0" = "${pkts_to_rtr2}"])
> > +AT_CHECK([test "0" = "${pkts_flooded}"])
> > +
> > +# Inject ND_NS for first router owned NAT address.
> > +src_ipv6=00100000000000000000000000000254
> > +dst_ipv6=00100000000000000000000000000111
> > +send_nd_ns 1 1 ${src_mac} ${src_ipv6} ${dst_ipv6} 751d
> > +
> > +# Verify that the ND_NS is sent only to rtr1.
> > +match_nd_ns1="${match_sw_metadata}.*icmp_type=135.*nd_target=10::111"
> > +match_nd_ns2="${match_sw_metadata}.*icmp_type=135.*nd_target=10::222"
> > +as hv1
> > +pkts_to_rtr1=$(ovs-ofctl dump-flows br-int | grep -E "${match_nd_ns1}" | \
> > +    grep "${match_send_rtr1}" | grep n_packets=1 -c)
> > +pkts_to_rtr2=$(ovs-ofctl dump-flows br-int | grep -E "${match_nd_ns1}" | \
> > +    grep "${match_send_rtr2}" | grep n_packets=1 -c)
> > +pkts_flooded=$(ovs-ofctl dump-flows br-int | grep -E "${match_sw_metadata}" | \
> > +    grep ${mc_key} | grep -v n_packets=0 -c)
> > +
> > +AT_CHECK([test "1" = "${pkts_to_rtr1}"])
> > +AT_CHECK([test "0" = "${pkts_to_rtr2}"])
> > +AT_CHECK([test "0" = "${pkts_flooded}"])
> > +
> > +OVN_CLEANUP([hv1])
> > +AT_CLEANUP
> > --
> > 1.8.3.1
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev


