[ovs-dev] [PATCH v2 1/3] Add multipath static router in OVN northd and north-db

Gao Zhenyu sysugaozhenyu at gmail.com
Wed Oct 11 01:55:36 UTC 2017


Hi Miguel,

   Thanks for your suggestion. It's very useful.
   From my point of view, whether we have a single router leg or
multiple router legs on the edge router, we still need a way to dispatch
traffic randomly, right?
   Even if we implement multiple legs on a router, we can't easily spread
traffic across those legs at random (a static route only separates specific
traffic). The multipath action is therefore a good candidate for this.
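
   To make the dispatching idea concrete, here is a minimal sketch in C of
what "hash the destination address, then pick a leg" could look like. It is
only an illustration: mix32(), select_link() and port_for_packet() are
hypothetical names, and the hash is a generic integer mixer, not the hash
that OVS actually uses for the multipath action.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the datapath hash.  This is a generic 32-bit
 * integer mixer chosen for illustration only; it is NOT the OVS hash. */
static uint32_t
mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* Step 1 (the role the patch gives to table 5): hash the destination
 * address and reduce it modulo the number of links, which yields an index
 * in the range [0, n_links).  Returns 0 when there are no links. */
static uint32_t
select_link(uint32_t ip_dst, uint32_t n_links)
{
    return n_links ? mix32(ip_dst) % n_links : 0;
}

/* Step 2 (the role the patch gives to table 6): translate the index into
 * an output port.  Returns NULL when no ports are configured. */
static const char *
port_for_packet(uint32_t ip_dst, const char *const *ports, uint32_t n_ports)
{
    assert(ports != NULL || n_ports == 0);
    if (!n_ports) {
        return NULL;
    }
    return ports[select_link(ip_dst, n_ports)];
}
```

   Because the index depends only on the destination address, all packets
to one destination leave through the same leg, while different destinations
spread over the legs. Calling port_for_packet() with an array of three port
names always returns one of those three names.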

   Currently, the gateway chassis are all linked to a single OVN edge logical
router port, which means the traffic output by those gateway chassis carries
the same source MAC.
   I don't know if we have a good way to implement L3HA A/A in the current
architecture. (Maybe adding a gateway_chassis options field and populating
"rewrite-mac" and "rewrite-ip" to rewrite the MAC address is one way, but I
don't think it is a good way, and it may confuse people.)
   So if you already have an idea of how to do it, it would be great to bring
it up so that we can discuss it and move the whole process along faster. :)


Thanks
Zhenyu Gao

2017-10-11 9:50 GMT+08:00 Gao Zhenyu <sysugaozhenyu at gmail.com>:

> I discussed this multipath work with Miguel in another mail thread, and
> I want to bring the discussion to the ovs mailing list in the hope of
> collecting more suggestions from all of you. :)
>
> Here is Miguel's suggestion on it.
>
> =================================
> Hi Gao,
>
>    Sorry, I haven't had more time to look at it yet (although it's a
> topic of interest to me).
>
>    I'm worried about the replication of concerns inside networking-ovn
> related routing, and I don't see the advantage of l3gateway mode, beyond
> legacy usage.
>
>    I understand the limitation you expressed about the
> "chassisredirect"/"gatewaychassis" mode only being able to expose a
> single external router leg.
>
>    If that's a limitation that doesn't work for you, my opinion is that we
> should work on fixing that limitation and keep all our development
> efforts in a single place, with distributed E/W routing.
>
>    That way we could construct L3HA A/A by having every
> gateway_chassis have the same priority, and possibly some extra options.
>
>    But again, please, this is a discussion we should have on the development
> mailing list, because maybe my point of view is too narrow.
>
>     Can you bring it up on the mailing list, or do you want me to do it?
>
>    Best regards,
> =================================
>
> 2017-10-08 17:42 GMT+08:00 Gao Zhenyu <sysugaozhenyu at gmail.com>:
>
>> Comments and suggestions are welcome :)
>>
>> Thanks
>> Zhenyu Gao
>>
>> 2017-09-26 17:52 GMT+08:00 Zhenyu Gao <sysugaozhenyu at gmail.com>:
>>
>>> 1. Update the output_port field in ovn-nb.ovsschema: change the maximum
>>> number of entries from 1 to unlimited.
>>> 2. Add the multipath feature to ovn-northd. northd generates multipath
>>> flows that dispatch traffic based on the packet's IP dst address if a
>>> route's output_port contains two or more ports.
>>> 3. Add a new table (lr_in_multipath) to ovn-northd's router ingress stages
>>> to dispatch traffic to ports.
>>> 4. Add a multipath flow in Table 5 (lr_in_ip_routing) that stores the hash
>>> result in reg0. reg9[2] is used to mark packets that need dispatching.
>>> 5. Add a description of the multipath feature to
>>> ovn/northd/ovn-northd.8.xml and ovn/ovn-nb.xml.
>>> 6. Update ovn-nbctl.c to handle configuring multiple output_port values.
>>>
>>> Signed-off-by: Zhenyu Gao <sysugaozhenyu at gmail.com>
>>> ---
>>>  ovn/northd/ovn-northd.8.xml |  67 +++++++++++-
>>>  ovn/northd/ovn-northd.c     | 257 +++++++++++++++++++++++++++++++++++++-------
>>>  ovn/ovn-nb.ovsschema        |   7 +-
>>>  ovn/ovn-nb.xml              |   4 +
>>>  ovn/utilities/ovn-nbctl.c   |  28 +++--
>>>  5 files changed, 311 insertions(+), 52 deletions(-)
>>>
>>> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>>> index 0d85ec0..b1ce9a9 100644
>>> --- a/ovn/northd/ovn-northd.8.xml
>>> +++ b/ovn/northd/ovn-northd.8.xml
>>> @@ -1598,6 +1598,9 @@ icmp4 {
>>>        port (ingress table <code>ARP Request</code> will generate an ARP
>>>        request, if needed, with <code>reg0</code> as the target protocol
>>>        address and <code>reg1</code> as the source protocol address).
>>> +      An IP route can be configured with multiple paths to the next hop.
>>> +      If a packet has multiple paths to its destination, OVN stores the
>>> +      port index in reg0 to indicate the packet's output port in table 6.
>>>      </p>
>>>
>>>      <p>
>>> @@ -1617,6 +1620,28 @@ icmp4 {
>>>
>>>        <li>
>>>          <p>
>>> +          IPv4/IPv6 multipath routing table. For each route to IPv4/IPv6
>>> +          network <var>N</var> with netmask <var>M</var>, on multipath port
>>> +          <var>P</var> with IP address <var>A</var> and Ethernet
>>> +          address <var>E</var>, a logical flow with match
>>> +          <code>ip4.dst == <var>N</var>/<var>M</var></code>, whose priority
>>> +          is the number of 1-bits in <var>M</var> plus 10,
>>> +          has the following actions:
>>> +        </p>
>>> +
>>> +        <pre>
>>> +ip.ttl--;
>>> +multipath (nw_dst, 0, modulo_n, <var>n_links</var>, 0, reg0);
>>> +reg9[2] = 1;
>>> +next;
>>> +        </pre>
>>> +        <p>
>>> +          <var>n_links</var> is the number of multipath ports.
>>> +        </p>
>>> +      </li>
>>> +
>>> +      <li>
>>> +        <p>
>>>            IPv4 routing table.  For each route to IPv4 network
>>> <var>N</var> with
>>>            netmask <var>M</var>, on router port <var>P</var> with IP
>>> address
>>>            <var>A</var> and Ethernet
>>> @@ -1686,7 +1711,43 @@ next;
>>>        </li>
>>>      </ul>
>>>
>>> -    <h3>Ingress Table 6: ARP/ND Resolution</h3>
>>> +    <h3>Ingress Table 6: Multipath</h3>
>>> +    <p>
>>> +      Any packet that reaches this table with reg9[2]=1 is an IP packet
>>> +      that the following flows route to the corresponding port. This table
>>> +      implements dispatching by consuming reg0.
>>> +    </p>
>>> +
>>> +    <ul>
>>> +      <li>
>>> +        <p>
>>> +          For a packet with netmask <var>M</var>, IP address <var>A</var>,
>>> +          and <code>reg9[2] = 1</code>, a logical flow with priority above 1
>>> +          has the following actions:
>>> +        </p>
>>> +
>>> +        <pre>
>>> +reg0 = <var>G</var>;
>>> +reg1 = <var>A</var>;
>>> +eth.src = <var>E</var>;
>>> +outport = <var>P</var>;
>>> +flags.loopback = 1;
>>> +next;
>>> +        </pre>
>>> +
>>> +        <p>
>>> +          <var>G</var> is the gateway IP address. <var>A</var>, <var>E</var>
>>> +          and <var>P</var> are the values that were described in multipath
>>> +          routing in table 5.
>>> +        </p>
>>> +
>>> +        <p>
>>> +          A priority-0 logical flow with match <code>1</code> has actions
>>> +          <code>next;</code>.
>>> +        </p>
>>> +      </li>
>>> +    </ul>
>>> +
>>> +    <h3>Ingress Table 7: ARP/ND Resolution</h3>
>>>
>>>      <p>
>>>        Any packet that reaches this table is an IP packet whose next-hop
>>> @@ -1779,7 +1840,7 @@ next;
>>>        </li>
>>>      </ul>
>>>
>>> -    <h3>Ingress Table 7: Gateway Redirect</h3>
>>> +    <h3>Ingress Table 8: Gateway Redirect</h3>
>>>
>>>      <p>
>>>        For distributed logical routers where one of the logical router
>>> @@ -1836,7 +1897,7 @@ next;
>>>        </li>
>>>      </ul>
>>>
>>> -    <h3>Ingress Table 8: ARP Request</h3>
>>> +    <h3>Ingress Table 9: ARP Request</h3>
>>>
>>>      <p>
>>>        In the common case where the Ethernet destination has been
>>> resolved, this
>>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>>> index 49e4ac3..f8bfee2 100644
>>> --- a/ovn/northd/ovn-northd.c
>>> +++ b/ovn/northd/ovn-northd.c
>>> @@ -135,9 +135,10 @@ enum ovn_stage {
>>>      PIPELINE_STAGE(ROUTER, IN,  UNSNAT,      3, "lr_in_unsnat")       \
>>>      PIPELINE_STAGE(ROUTER, IN,  DNAT,        4, "lr_in_dnat")         \
>>>      PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  5, "lr_in_ip_routing")   \
>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 6, "lr_in_arp_resolve")  \
>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 7, "lr_in_gw_redirect")  \
>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 8, "lr_in_arp_request")  \
>>> +    PIPELINE_STAGE(ROUTER, IN,  MULTIPATH,   6, "lr_in_multipath")    \
>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 7, "lr_in_arp_resolve")  \
>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 8, "lr_in_gw_redirect")  \
>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 9, "lr_in_arp_request")  \
>>>                                                                        \
>>>      /* Logical router egress stages. */                               \
>>>      PIPELINE_STAGE(ROUTER, OUT, UNDNAT,    0, "lr_out_undnat")        \
>>> @@ -173,6 +174,11 @@ enum ovn_stage {
>>>   * one of the logical router's own IP addresses. */
>>>  #define REGBIT_EGRESS_LOOPBACK  "reg9[1]"
>>>
>>> +/* Indicates that the multipath action has processed this packet and
>>> + * stored the hash result in another regX. The hash result should be
>>> + * consumed to determine the right output port. */
>>> +#define REGBIT_MULTIPATH "reg9[2]"
>>> +
>>>  /* Returns an "enum ovn_stage" built from the arguments. */
>>>  static enum ovn_stage
>>>  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline
>>> pipeline,
>>> @@ -4142,82 +4148,178 @@ add_route(struct hmap *lflows, const struct
>>> ovn_port *op,
>>>  }
>>>
>>>  static void
>>> -build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>> -                        struct hmap *ports,
>>> -                        const struct nbrec_logical_router_static_route
>>> *route)
>>> +add_multipath_route(struct hmap *lflows, uint32_t port_num,
>>> +                    struct ovn_port **out_ports,
>>> +                    const char **lrp_addr_s,
>>> +                    struct ovn_datapath *od,
>>> +                    const char *network_s, int plen,
>>> +                    const char *gateway, const char *policy)
>>> +{
>>> +    bool is_ipv4 = strchr(network_s, '.') ? true : false;
>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>> +    const char *dir;
>>> +    uint16_t priority;
>>> +
>>> +    if (policy && !strcmp(policy, "src-ip")) {
>>> +        dir = "src";
>>> +        priority = plen * 2;
>>> +    } else {
>>> +        dir = "dst";
>>> +        priority = (plen * 2) + 1;
>>> +    }
>>> +
>>> +    ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", dir,
>>> +                  network_s, plen);
>>> +
>>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>>> +
>>> +    ds_put_format(&actions, "ip.ttl--; ");
>>> +    ds_put_format(&actions,
>>> +                  "multipath (nw_dst, 0, modulo_n, %u, 0, reg0); "
>>> +                  "%s = 1; "
>>> +                  "next;",
>>> +                  port_num, REGBIT_MULTIPATH);
>>> +
>>> +    /* The priority here is calculated to implement longest-prefix-match
>>> +     * routing. */
>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, priority,
>>> +                  ds_cstr(&match), ds_cstr(&actions));
>>> +
>>> +    for (int i = 0; i < port_num; i++) {
>>> +        struct ds mp_match = DS_EMPTY_INITIALIZER;
>>> +        struct ds mp_actions = DS_EMPTY_INITIALIZER;
>>> +
>>> +        ds_put_format(&mp_match, "%s == 1 && reg0 == %d && ",
>>> +                      REGBIT_MULTIPATH, i);
>>> +        ds_put_format(&mp_match, "ip%s.%s == %s/%d",
>>> +                      is_ipv4 ? "4" : "6", dir,
>>> +                      network_s, plen);
>>> +
>>> +        ds_put_format(&mp_actions, "%sreg0 = ", is_ipv4 ? "" : "xx");
>>> +        if (gateway) {
>>> +            ds_put_cstr(&mp_actions, gateway);
>>> +        } else {
>>> +            ds_put_format(&mp_actions, "ip%s.dst", is_ipv4 ? "4" : "6");
>>> +        }
>>> +
>>> +        ds_put_format(&mp_actions, "; "
>>> +                      "%sreg1 = %s; "
>>> +                      "eth.src = %s; "
>>> +                      "outport = %s; "
>>> +                      "flags.loopback = 1; "
>>> +                      "next;",
>>> +                      is_ipv4 ? "" : "xx",
>>> +                      lrp_addr_s[i],
>>> +                      out_ports[i]->lrp_networks.ea_s,
>>> +                      out_ports[i]->json_key);
>>> +
>>> +        /* Add a flow in table 6 to determine the right output port
>>> +         * for this traffic. */
>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, priority,
>>> +                      ds_cstr(&mp_match), ds_cstr(&mp_actions));
>>> +        ds_destroy(&mp_match);
>>> +        ds_destroy(&mp_actions);
>>> +    }
>>> +    ds_destroy(&match);
>>> +    ds_destroy(&actions);
>>> +}
>>> +
>>> +static bool
>>> +verify_nexthop_prefix(const struct nbrec_logical_router_static_route
>>> *route,
>>> +                      bool *is_ipv4, char **prefix_s, unsigned int
>>> *plen)
>>>  {
>>>      ovs_be32 nexthop;
>>> -    const char *lrp_addr_s = NULL;
>>> -    unsigned int plen;
>>> -    bool is_ipv4;
>>>
>>>      /* Verify that the next hop is an IP address with an all-ones mask.
>>> */
>>> -    char *error = ip_parse_cidr(route->nexthop, &nexthop, &plen);
>>> +    char *error = ip_parse_cidr(route->nexthop, &nexthop, plen);
>>>      if (!error) {
>>> -        if (plen != 32) {
>>> +        if (*plen != 32) {
>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 1);
>>>              VLOG_WARN_RL(&rl, "bad next hop mask %s", route->nexthop);
>>> -            return;
>>> +            return false;
>>>          }
>>> -        is_ipv4 = true;
>>> +        *is_ipv4 = true;
>>>      } else {
>>>          free(error);
>>>
>>>          struct in6_addr ip6;
>>> -        error = ipv6_parse_cidr(route->nexthop, &ip6, &plen);
>>> +        error = ipv6_parse_cidr(route->nexthop, &ip6, plen);
>>>          if (!error) {
>>> -            if (plen != 128) {
>>> +            if (*plen != 128) {
>>>                  static struct vlog_rate_limit rl =
>>> VLOG_RATE_LIMIT_INIT(5, 1);
>>>                  VLOG_WARN_RL(&rl, "bad next hop mask %s",
>>> route->nexthop);
>>> -                return;
>>> +                return false;
>>>              }
>>> -            is_ipv4 = false;
>>> +            *is_ipv4 = false;
>>>          } else {
>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 1);
>>>              VLOG_WARN_RL(&rl, "bad next hop ip address %s",
>>> route->nexthop);
>>>              free(error);
>>> -            return;
>>> +            return false;
>>>          }
>>>      }
>>>
>>> -    char *prefix_s;
>>> -    if (is_ipv4) {
>>> +    if (*is_ipv4) {
>>>          ovs_be32 prefix;
>>>          /* Verify that ip prefix is a valid IPv4 address. */
>>> -        error = ip_parse_cidr(route->ip_prefix, &prefix, &plen);
>>> +        error = ip_parse_cidr(route->ip_prefix, &prefix, plen);
>>>          if (error) {
>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 1);
>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes %s",
>>>                           route->ip_prefix);
>>>              free(error);
>>> -            return;
>>> +            return false;
>>>          }
>>> -        prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix &
>>> be32_prefix_mask(plen)));
>>> +        *prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix
>>> +                                              &
>>> be32_prefix_mask(*plen)));
>>>      } else {
>>>          /* Verify that ip prefix is a valid IPv6 address. */
>>>          struct in6_addr prefix;
>>> -        error = ipv6_parse_cidr(route->ip_prefix, &prefix, &plen);
>>> +        error = ipv6_parse_cidr(route->ip_prefix, &prefix, plen);
>>>          if (error) {
>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 1);
>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes %s",
>>>                           route->ip_prefix);
>>>              free(error);
>>> -            return;
>>> +            return false;
>>>          }
>>> -        struct in6_addr mask = ipv6_create_mask(plen);
>>> +        struct in6_addr mask = ipv6_create_mask(*plen);
>>>          struct in6_addr network = ipv6_addr_bitand(&prefix, &mask);
>>> -        prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>> -        inet_ntop(AF_INET6, &network, prefix_s, INET6_ADDRSTRLEN);
>>> +        *prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>> +        inet_ntop(AF_INET6, &network, *prefix_s, INET6_ADDRSTRLEN);
>>> +    }
>>> +
>>> +    return true;
>>> +}
>>> +
>>> +static void
>>> +build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>> +                        struct hmap *ports,
>>> +                        const struct nbrec_logical_router_static_route
>>> *route)
>>> +{
>>> +    const char *lrp_addr_s = NULL;
>>> +    unsigned int plen;
>>> +    bool is_ipv4;
>>> +    char *prefix_s = NULL;
>>> +
>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) {
>>> +        return;
>>> +    }
>>> +
>>> +    /* Only one output_port is handled here. If the route contains multiple
>>> +     * output_ports, build_multipath_flow should handle it instead. */
>>> +    if (route->n_output_port > 1) {
>>> +        return;
>>>      }
>>>
>>>      /* Find the outgoing port. */
>>>      struct ovn_port *out_port = NULL;
>>> -    if (route->output_port) {
>>> -        out_port = ovn_port_find(ports, route->output_port);
>>> +    if (route->n_output_port) {
>>> +        out_port = ovn_port_find(ports, route->output_port[0]);
>>>          if (!out_port) {
>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 1);
>>>              VLOG_WARN_RL(&rl, "Bad out port %s for static route %s",
>>> -                         route->output_port, route->ip_prefix);
>>> +                         route->output_port[0], route->ip_prefix);
>>>              goto free_prefix_s;
>>>          }
>>>          lrp_addr_s = find_lrp_member_ip(out_port, route->nexthop);
>>> @@ -4270,7 +4372,77 @@ build_static_route_flow(struct hmap *lflows,
>>> struct ovn_datapath *od,
>>>                policy);
>>>
>>>  free_prefix_s:
>>> -    free(prefix_s);
>>> +    if (prefix_s) {
>>> +        free(prefix_s);
>>> +    }
>>> +}
>>> +
>>> +static void
>>> +build_multipath_flow(struct hmap *lflows, struct ovn_datapath *od,
>>> +                     struct hmap *ports,
>>> +                     const struct nbrec_logical_router_static_route
>>> *route)
>>> +{
>>> +    unsigned int plen;
>>> +    bool is_ipv4;
>>> +    char *prefix_s = NULL;
>>> +
>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) {
>>> +        return;
>>> +    }
>>> +
>>> +    /* Find the outgoing port. */
>>> +    struct ovn_port **out_ports = xmalloc(route->n_output_port *
>>> +                                             sizeof(struct ovn_port *));
>>> +    const char **lrp_addr_s = xmalloc(route->n_output_port *
>>> +                                         sizeof(const char *));
>>> +    uint32_t idx = 0;
>>> +    for (int i = 0; i < route->n_output_port; i++) {
>>> +        out_ports[idx] = ovn_port_find(ports, route->output_port[i]);
>>> +        if (!out_ports[idx]) {
>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 1);
>>> +            VLOG_WARN_RL(&rl, "Bad out port %s for static route %s",
>>> +                         route->output_port[i], route->ip_prefix);
>>> +            continue;
>>> +        }
>>> +
>>> +        lrp_addr_s[idx] = find_lrp_member_ip(out_ports[idx],
>>> route->nexthop);
>>> +        if (!lrp_addr_s[idx]) {
>>> +            if (is_ipv4) {
>>> +                if (out_ports[idx]->lrp_networks.n_ipv4_addrs) {
>>> +                    lrp_addr_s[idx] = out_ports[idx]->
>>> +                                        lrp_networks.ipv4_addrs[0].addr_s;
>>> +                }
>>> +            } else {
>>> +                if (out_ports[idx]->lrp_networks.n_ipv6_addrs) {
>>> +                    lrp_addr_s[idx] = out_ports[idx]->
>>> +                                        lrp_networks.ipv6_addrs[0].addr_s;
>>> +                }
>>> +            }
>>> +        }
>>> +        if (!lrp_addr_s[idx]) {
>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>> 1);
>>> +            VLOG_WARN_RL(&rl,
>>> +                         "%s has no path for static route %s; next hop
>>> %s",
>>> +                         route->output_port[i], route->ip_prefix,
>>> +                         route->nexthop);
>>> +            continue;
>>> +        }
>>> +
>>> +        idx++;
>>> +    }
>>> +
>>> +    char *policy = route->policy ? route->policy : "dst-ip";
>>> +    if (idx > 0) {
>>> +        add_multipath_route(lflows, idx,
>>> +                            out_ports, lrp_addr_s, od,
>>> +                            prefix_s, plen, route->nexthop, policy);
>>> +    }
>>> +
>>> +    free(out_ports);
>>> +    free(lrp_addr_s);
>>> +    if (prefix_s) {
>>> +        free(prefix_s);
>>> +    }
>>>  }
>>>
>>>  static void
>>> @@ -5344,7 +5516,7 @@ build_lrouter_flows(struct hmap *datapaths, struct
>>> hmap *ports,
>>>          }
>>>      }
>>>
>>> -    /* Convert the static routes to flows. */
>>> +    /* Convert the static routes and multipath route to flows. */
>>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>>          if (!od->nbr) {
>>>              continue;
>>> @@ -5354,13 +5526,24 @@ build_lrouter_flows(struct hmap *datapaths,
>>> struct hmap *ports,
>>>              const struct nbrec_logical_router_static_route *route;
>>>
>>>              route = od->nbr->static_routes[i];
>>> -            build_static_route_flow(lflows, od, ports, route);
>>> +            if (route->n_output_port > 1) {
>>> +                /* Logical router ingress table 5-6: Multipath Routing.
>>> +                 *
>>> +                 * If the router has been configured so that traffic has
>>> +                 * multiple paths to a destination, the specific output
>>> +                 * port is figured out by hashing the packet's IP dst
>>> +                 * address. */
>>> +                build_multipath_flow(lflows, od, ports, route);
>>> +            } else {
>>> +                build_static_route_flow(lflows, od, ports, route);
>>> +            }
>>>          }
>>> +        /* Packets are allowed by default in table 6. */
>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, 0, "1",
>>> "next;");
>>>      }
>>>
>>>      /* XXX destination unreachable */
>>>
>>> -    /* Local router ingress table 6: ARP Resolution.
>>> +    /* Local router ingress table 7: ARP Resolution.
>>>       *
>>>       * Any packet that reaches this table is an IP packet whose
>>> next-hop IP
>>>       * address is in reg0. (ip4.dst is the final destination.) This
>>> table
>>> @@ -5555,7 +5738,7 @@ build_lrouter_flows(struct hmap *datapaths, struct
>>> hmap *ports,
>>>                        "get_nd(outport, xxreg0); next;");
>>>      }
>>>
>>> -    /* Logical router ingress table 7: Gateway redirect.
>>> +    /* Logical router ingress table 8: Gateway redirect.
>>>       *
>>>       * For traffic with outport equal to the l3dgw_port
>>>       * on a distributed router, this table redirects a subset
>>> @@ -5595,7 +5778,7 @@ build_lrouter_flows(struct hmap *datapaths, struct
>>> hmap *ports,
>>>          ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1",
>>> "next;");
>>>      }
>>>
>>> -    /* Local router ingress table 8: ARP request.
>>> +    /* Local router ingress table 9: ARP request.
>>>       *
>>>       * In the common case where the Ethernet destination has been
>>> resolved,
>>>       * this table outputs the packet (priority 0).  Otherwise, it
>>> composes
>>> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>>> index a077bfb..7a43473 100644
>>> --- a/ovn/ovn-nb.ovsschema
>>> +++ b/ovn/ovn-nb.ovsschema
>>> @@ -1,7 +1,7 @@
>>>  {
>>>      "name": "OVN_Northbound",
>>> -    "version": "5.8.0",
>>> -    "cksum": "2812300190 16766",
>>> +    "version": "5.9.0",
>>> +    "cksum": "1515729450 16817",
>>>      "tables": {
>>>          "NB_Global": {
>>>              "columns": {
>>> @@ -235,7 +235,8 @@
>>>
>>> "dst-ip"]]},
>>>                                      "min": 0, "max": 1}},
>>>                  "nexthop": {"type": "string"},
>>> -                "output_port": {"type": {"key": "string", "min": 0,
>>> "max": 1}}},
>>> +                "output_port": {"type": {"key": "string", "min": 0,
>>> +                                         "max": "unlimited"}}},
>>>              "isRoot": false},
>>>          "NAT": {
>>>              "columns": {
>>> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>>> index 9869d7e..eaba0c8 100644
>>> --- a/ovn/ovn-nb.xml
>>> +++ b/ovn/ovn-nb.xml
>>> @@ -1485,6 +1485,10 @@
>>>          multiple IP addresses on the router port and none of them are
>>> in the
>>>          same subnet of <ref column="nexthop"/>, OVN chooses the first IP
>>>          address as the one via which the <ref column="nexthop"/> is
>>> reachable.
>>> +        When it contains two or more ports, the packet has multiple
>>> +        candidate output ports. OVN uses the packet header to determine
>>> +        which port the packet is delivered to.
>>> +        Currently, OVN uses the destination IP field to choose the port.
>>>        </p>
>>>      </column>
>>>    </table>
>>> diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
>>> index 8e5c1a4..417194f 100644
>>> --- a/ovn/utilities/ovn-nbctl.c
>>> +++ b/ovn/utilities/ovn-nbctl.c
>>> @@ -397,7 +397,7 @@ Logical router port commands:\n\
>>>                              ('enabled' or 'disabled')\n\
>>>  \n\
>>>  Route commands:\n\
>>> -  [--policy=POLICY] lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
>>> +  [--policy=POLICY] lr-route-add ROUTER PREFIX NEXTHOP [PORT]...\n\
>>>                              add a route to ROUTER\n\
>>>    lr-route-del ROUTER [PREFIX]\n\
>>>                              remove routes from ROUTER\n\
>>> @@ -2184,13 +2184,15 @@ normalize_prefix_str(const char *orig_prefix)
>>>          return normalize_ipv6_prefix(ipv6, plen);
>>>      }
>>>  }
>>> -
>>> +
>>>  static void
>>>  nbctl_lr_route_add(struct ctl_context *ctx)
>>>  {
>>>      const struct nbrec_logical_router *lr;
>>>      lr = lr_by_name_or_uuid(ctx, ctx->argv[1], true);
>>>      char *prefix, *next_hop;
>>> +    int n_output_port = 0;
>>> +    const char **output_port;
>>>
>>>      const char *policy = shash_find_data(&ctx->options, "--policy");
>>>      if (policy && strcmp(policy, "src-ip") && strcmp(policy, "dst-ip"))
>>> {
>>> @@ -2224,6 +2226,11 @@ nbctl_lr_route_add(struct ctl_context *ctx)
>>>          }
>>>      }
>>>
>>> +    if (ctx->argc > 4) {
>>> +        n_output_port = ctx->argc - 4;
>>> +        output_port = (const char **)&ctx->argv[4];
>>> +    }
>>> +
>>>      bool may_exist = shash_find(&ctx->options, "--may-exist") != NULL;
>>>      for (int i = 0; i < lr->n_static_routes; i++) {
>>>          const struct nbrec_logical_router_static_route *route
>>> @@ -2253,9 +2260,10 @@ nbctl_lr_route_add(struct ctl_context *ctx)
>>>          nbrec_logical_router_static_route_verify_nexthop(route);
>>>          nbrec_logical_router_static_route_set_ip_prefix(route, prefix);
>>>          nbrec_logical_router_static_route_set_nexthop(route, next_hop);
>>> -        if (ctx->argc == 5) {
>>> +        if (n_output_port > 0) {
>>>              nbrec_logical_router_static_route_set_output_port(route,
>>> -
>>> ctx->argv[4]);
>>> +
>>> output_port,
>>> +
>>> n_output_port);
>>>          }
>>>          if (policy) {
>>>               nbrec_logical_router_static_route_set_policy(route,
>>> policy);
>>> @@ -2270,8 +2278,10 @@ nbctl_lr_route_add(struct ctl_context *ctx)
>>>      route = nbrec_logical_router_static_route_insert(ctx->txn);
>>>      nbrec_logical_router_static_route_set_ip_prefix(route, prefix);
>>>      nbrec_logical_router_static_route_set_nexthop(route, next_hop);
>>> -    if (ctx->argc == 5) {
>>> -        nbrec_logical_router_static_route_set_output_port(route,
>>> ctx->argv[4]);
>>> +    if (n_output_port > 0) {
>>> +        nbrec_logical_router_static_route_set_output_port(route,
>>> +                                                          output_port,
>>> +
>>> n_output_port);
>>>      }
>>>      if (policy) {
>>>          nbrec_logical_router_static_route_set_policy(route, policy);
>>> @@ -3066,8 +3076,8 @@ print_route(const struct
>>> nbrec_logical_router_static_route *route, struct ds *s)
>>>          ds_put_format(s, " %s", "dst-ip");
>>>      }
>>>
>>> -    if (route->output_port) {
>>> -        ds_put_format(s, " %s", route->output_port);
>>> +    for (int i = 0; i < route->n_output_port; i++) {
>>> +        ds_put_format(s, " %s", route->output_port[i]);
>>>      }
>>>      ds_put_char(s, '\n');
>>>  }
>>> @@ -3682,7 +3692,7 @@ static const struct ctl_command_syntax
>>> nbctl_commands[] = {
>>>        NULL, "", RO },
>>>
>>>      /* logical router route commands. */
>>> -    { "lr-route-add", 3, 4, "ROUTER PREFIX NEXTHOP [PORT]", NULL,
>>> +    { "lr-route-add", 3, INT_MAX, "ROUTER PREFIX NEXTHOP [PORT]...",
>>> NULL,
>>>        nbctl_lr_route_add, NULL, "--may-exist,--policy=", RW },
>>>      { "lr-route-del", 1, 2, "ROUTER [PREFIX]", NULL, nbctl_lr_route_del,
>>>        NULL, "--if-exists", RW },
>>> --
>>> 1.8.3.1
>>>
>>>
>>
>

