[ovs-dev] [PATCH v2 1/3] Add multipath static router in OVN northd and north-db

Gao Zhenyu sysugaozhenyu at gmail.com
Thu Oct 26 01:38:37 UTC 2017


ping.....


Thanks
Zhenyu Gao

2017-10-11 9:55 GMT+08:00 Gao Zhenyu <sysugaozhenyu at gmail.com>:

> Hi Miguel,
>
>    Thanks for your suggestion. It's very useful.
>    From my point of view, no matter whether we have a single router leg or
> multiple router legs on the edge router, we still need a way to dispatch
> traffic randomly, right?
>    So even if we implement multiple legs on a router, we can't easily
> separate traffic randomly across those legs (a static route only separates
> specific traffic). The multipath action is then a good candidate for doing
> that.
>
>    Currently, the gateway chassis are linked to a single OVN edge logical
> router port, which means the traffic those gateway chassis output carries
> the same src mac.
>    I don't know if we have a good way to implement L3HA A/A in the current
> architecture. (Maybe adding a gateway_chassis options field and populating
> "rewrite-mac" and "rewrite-ip" to rewrite the mac address is a way, but I
> don't think it is a good way and it may confuse people.)
>    So if you already have an idea for how to do it, it would be great to
> bring it up so we can discuss it and move the whole process faster. :)
>
>
> Thanks
> Zhenyu Gao
>
> 2017-10-11 9:50 GMT+08:00 Gao Zhenyu <sysugaozhenyu at gmail.com>:
>
>> I discussed this multipath stuff with Miguel in another mailing thread and
>> I want to bring this discussion to the ovs mailing list and hope to collect
>> more suggestions from all of you. :)
>>
>> Here is Miguel's suggestion on it.
>>
>> =================================
>> Hi Gao,
>>
>>    Sorry, I didn't have more time to look at it currently (although it's
>> a topic of my interest.)
>>
>>    I'm worried about the replication of concerns inside networking-ovn
>> related routing, and I don't see the advantage of l3gateway mode, beyond
>> legacy usage.
>>
>>    I understand the limitation you expressed about the
>> "chassisredirect"/"gatewaychassis" mode only being able to expose a
>> single external router leg.
>>
>>    If that's a limitation that doesn't work for you, my opinion is that
>> we should work on fixing that limitation, and keeping all our development
>> efforts in a single place, with distributed E/W routing.
>>
>>    That way we could construct L3HA A/A by having every
>> gateway_chassis have the same priority, and possibly some extra options.
>>
>>    But again, please, this is a discussion we may have on the development
>> mailing list, because maybe my point of view is too narrow.
>>
>>     Can you bring it up on the mailing list, or do you want me to do it?
>>
>>    Best regards,
>> =================================
>>
>> 2017-10-08 17:42 GMT+08:00 Gao Zhenyu <sysugaozhenyu at gmail.com>:
>>
>>> Comments and suggestions are welcome :)
>>>
>>> Thanks
>>> Zhenyu Gao
>>>
>>> 2017-09-26 17:52 GMT+08:00 Zhenyu Gao <sysugaozhenyu at gmail.com>:
>>>
>>>> 1. Update the output_port field in ovn-nb.ovsschema. Change the maximum
>>>> number of entries from 1 to unlimited.
>>>> 2. Add the multipath feature to ovn-northd. northd generates multipath
>>>> flows that dispatch traffic using the packet's IP dst address if a
>>>> route's output_port contains two or more ports.
>>>> 3. Add a new table (lr_in_multipath) to ovn-northd's router ingress
>>>> stages to dispatch traffic to ports.
>>>> 4. Add a multipath flow in Table 5 (lr_in_ip_routing) that stores the hash
>>>> result in reg0. reg9[2] indicates packets that need dispatching.
>>>> 5. Add a description of the multipath feature to
>>>> ovn/northd/ovn-northd.8.xml and ovn/ovn-nb.xml.
>>>> 6. Update ovn-nbctl.c to handle configuring multiple output_port values.
>>>>
>>>> Signed-off-by: Zhenyu Gao <sysugaozhenyu at gmail.com>
>>>> ---
>>>>  ovn/northd/ovn-northd.8.xml |  67 +++++++++++-
>>>>  ovn/northd/ovn-northd.c     | 257 ++++++++++++++++++++++++++++++
>>>> +++++++-------
>>>>  ovn/ovn-nb.ovsschema        |   7 +-
>>>>  ovn/ovn-nb.xml              |   4 +
>>>>  ovn/utilities/ovn-nbctl.c   |  28 +++--
>>>>  5 files changed, 311 insertions(+), 52 deletions(-)
>>>>
>>>> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>>>> index 0d85ec0..b1ce9a9 100644
>>>> --- a/ovn/northd/ovn-northd.8.xml
>>>> +++ b/ovn/northd/ovn-northd.8.xml
>>>> @@ -1598,6 +1598,9 @@ icmp4 {
>>>>        port (ingress table <code>ARP Request</code> will generate an ARP
>>>>        request, if needed, with <code>reg0</code> as the target protocol
>>>>        address and <code>reg1</code> as the source protocol address).
>>>> +      An IP route can be configured with multiple paths to the next hop.
>>>> +      If a packet has multiple paths to its destination, OVN stores the
>>>> +      port index in reg0 to indicate the packet's output port in table 6.
>>>>      </p>
>>>>
>>>>      <p>
>>>> @@ -1617,6 +1620,28 @@ icmp4 {
>>>>
>>>>        <li>
>>>>          <p>
>>>> +          IPv4/IPv6 multipath routing table. For each route to IPv4/IPv6
>>>> +          network <var>N</var> with netmask <var>M</var>, on multipath port
>>>> +          <var>P</var> with IP address <var>A</var> and Ethernet
>>>> +          address <var>E</var>, a logical flow with match
>>>> +          <code>ip4.dst == <var>N</var>/<var>M</var></code>, whose priority
>>>> +          is the number of 1-bits in <var>M</var> plus 10,
>>>> +          has the following actions:
>>>> +        </p>
>>>> +
>>>> +        <pre>
>>>> +ip.ttl--;
>>>> +multipath (nw_dst, 0, modulo_n, <var>n_links</var>, 0, reg0);
>>>> +reg9[2] = 1;
>>>> +next;
>>>> +        </pre>
>>>> +        <p>
>>>> +          <var>n_links</var> is the number of multipath ports.
>>>> +        </p>
>>>> +      </li>
>>>> +
>>>> +      <li>
>>>> +        <p>
>>>>            IPv4 routing table.  For each route to IPv4 network
>>>> <var>N</var> with
>>>>            netmask <var>M</var>, on router port <var>P</var> with IP
>>>> address
>>>>            <var>A</var> and Ethernet
>>>> @@ -1686,7 +1711,43 @@ next;
>>>>        </li>
>>>>      </ul>
>>>>
>>>> -    <h3>Ingress Table 6: ARP/ND Resolution</h3>
>>>> +    <h3>Ingress Table 6: Multipath</h3>
>>>> +    <p>
>>>> +      Any packet that reaches this table with reg9[2]=1 is an IP packet
>>>> +      that the following flows route to the corresponding port. This table
>>>> +      implements dispatching by consuming reg0.
>>>> +    </p>
>>>> +
>>>> +    <ul>
>>>> +      <li>
>>>> +        <p>
>>>> +          For a packet matching netmask <var>M</var>, IP address <var>A</var>,
>>>> +          and <code>reg9[2] = 1</code>, a logical flow with priority above 1
>>>> +          has the following actions:
>>>> +        </p>
>>>> +
>>>> +        <pre>
>>>> +reg0 = <var>G</var>;
>>>> +reg1 = <var>A</var>;
>>>> +eth.src = <var>E</var>;
>>>> +outport = <var>P</var>;
>>>> +flags.loopback = 1;
>>>> +next;
>>>> +        </pre>
>>>> +
>>>> +        <p>
>>>> +          <var>G</var> is the gateway IP address. <var>A</var>, <var>E</var>,
>>>> +          and <var>P</var> are the values described for multipath
>>>> +          routing in table 5.
>>>> +        </p>
>>>> +
>>>> +        <p>
>>>> +          A priority-0 logical flow with match <code>1</code> has actions <code>next;</code>.
>>>> +        </p>
>>>> +      </li>
>>>> +    </ul>
>>>> +
>>>> +    <h3>Ingress Table 7: ARP/ND Resolution</h3>
>>>>
>>>>      <p>
>>>>        Any packet that reaches this table is an IP packet whose next-hop
>>>> @@ -1779,7 +1840,7 @@ next;
>>>>        </li>
>>>>      </ul>
>>>>
>>>> -    <h3>Ingress Table 7: Gateway Redirect</h3>
>>>> +    <h3>Ingress Table 8: Gateway Redirect</h3>
>>>>
>>>>      <p>
>>>>        For distributed logical routers where one of the logical router
>>>> @@ -1836,7 +1897,7 @@ next;
>>>>        </li>
>>>>      </ul>
>>>>
>>>> -    <h3>Ingress Table 8: ARP Request</h3>
>>>> +    <h3>Ingress Table 9: ARP Request</h3>
>>>>
>>>>      <p>
>>>>        In the common case where the Ethernet destination has been
>>>> resolved, this
>>>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>>>> index 49e4ac3..f8bfee2 100644
>>>> --- a/ovn/northd/ovn-northd.c
>>>> +++ b/ovn/northd/ovn-northd.c
>>>> @@ -135,9 +135,10 @@ enum ovn_stage {
>>>>      PIPELINE_STAGE(ROUTER, IN,  UNSNAT,      3, "lr_in_unsnat")       \
>>>>      PIPELINE_STAGE(ROUTER, IN,  DNAT,        4, "lr_in_dnat")         \
>>>>      PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  5, "lr_in_ip_routing")   \
>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 6, "lr_in_arp_resolve")  \
>>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 7, "lr_in_gw_redirect")  \
>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 8, "lr_in_arp_request")  \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  MULTIPATH,   6, "lr_in_multipath")    \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 7, "lr_in_arp_resolve")  \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 8, "lr_in_gw_redirect")  \
>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 9, "lr_in_arp_request")  \
>>>>                                                                        \
>>>>      /* Logical router egress stages. */                               \
>>>>      PIPELINE_STAGE(ROUTER, OUT, UNDNAT,    0, "lr_out_undnat")        \
>>>> @@ -173,6 +174,11 @@ enum ovn_stage {
>>>>   * one of the logical router's own IP addresses. */
>>>>  #define REGBIT_EGRESS_LOOPBACK  "reg9[1]"
>>>>
>>>> +/* Indicates that the multipath action has processed this packet and stored
>>>> + * the hash result in another regX.  The hash result should be consumed to
>>>> + * determine the right output port. */
>>>> +#define REGBIT_MULTIPATH "reg9[2]"
>>>> +
>>>>  /* Returns an "enum ovn_stage" built from the arguments. */
>>>>  static enum ovn_stage
>>>>  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline
>>>> pipeline,
>>>> @@ -4142,82 +4148,178 @@ add_route(struct hmap *lflows, const struct
>>>> ovn_port *op,
>>>>  }
>>>>
>>>>  static void
>>>> -build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>> -                        struct hmap *ports,
>>>> -                        const struct nbrec_logical_router_static_route
>>>> *route)
>>>> +add_multipath_route(struct hmap *lflows, uint32_t port_num,
>>>> +                    struct ovn_port **out_ports,
>>>> +                    const char **lrp_addr_s,
>>>> +                    struct ovn_datapath *od,
>>>> +                    const char *network_s, int plen,
>>>> +                    const char *gateway, const char *policy)
>>>> +{
>>>> +    bool is_ipv4 = strchr(network_s, '.') ? true : false;
>>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>>> +    const char *dir;
>>>> +    uint16_t priority;
>>>> +
>>>> +    if (policy && !strcmp(policy, "src-ip")) {
>>>> +        dir = "src";
>>>> +        priority = plen * 2;
>>>> +    } else {
>>>> +        dir = "dst";
>>>> +        priority = (plen * 2) + 1;
>>>> +    }
>>>> +
>>>> +    ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", dir,
>>>> +                  network_s, plen);
>>>> +
>>>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>>>> +
>>>> +    ds_put_format(&actions, "ip.ttl--; ");
>>>> +    ds_put_format(&actions,
>>>> +                  "multipath (nw_dst, 0, modulo_n, %u, 0, reg0); "
>>>> +                  "%s = 1; "
>>>> +                  "next;",
>>>> +                  port_num, REGBIT_MULTIPATH);
>>>> +
>>>> +    /* The priority here is calculated to implement longest-prefix-match
>>>> +     * routing. */
>>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, priority,
>>>> +                  ds_cstr(&match), ds_cstr(&actions));
>>>> +
>>>> +    for (int i = 0; i < port_num; i++) {
>>>> +        struct ds mp_match = DS_EMPTY_INITIALIZER;
>>>> +        struct ds mp_actions = DS_EMPTY_INITIALIZER;
>>>> +
>>>> +        ds_put_format(&mp_match, "%s == 1 && reg0 == %d && ",
>>>> +                      REGBIT_MULTIPATH, i);
>>>> +        ds_put_format(&mp_match, "ip%s.%s == %s/%d",
>>>> +                      is_ipv4 ? "4" : "6", dir,
>>>> +                      network_s, plen);
>>>> +
>>>> +        ds_put_format(&mp_actions, "%sreg0 = ", is_ipv4 ? "" : "xx");
>>>> +        if (gateway) {
>>>> +            ds_put_cstr(&mp_actions, gateway);
>>>> +        } else {
>>>> +            ds_put_format(&mp_actions, "ip%s.dst", is_ipv4 ? "4" :
>>>> "6");
>>>> +        }
>>>> +
>>>> +        ds_put_format(&mp_actions, "; "
>>>> +                      "%sreg1 = %s; "
>>>> +                      "eth.src = %s; "
>>>> +                      "outport = %s; "
>>>> +                      "flags.loopback = 1; "
>>>> +                      "next;",
>>>> +                      is_ipv4 ? "" : "xx",
>>>> +                      lrp_addr_s[i],
>>>> +                      out_ports[i]->lrp_networks.ea_s,
>>>> +                      out_ports[i]->json_key);
>>>> +
>>>> +        /* Add flow in table 6 to determine the right output port
>>>> +         * for this traffic. */
>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, priority,
>>>> +                      ds_cstr(&mp_match), ds_cstr(&mp_actions));
>>>> +        ds_destroy(&mp_match);
>>>> +        ds_destroy(&mp_actions);
>>>> +    }
>>>> +    ds_destroy(&match);
>>>> +    ds_destroy(&actions);
>>>> +}
>>>> +
>>>> +static bool
>>>> +verify_nexthop_prefix(const struct nbrec_logical_router_static_route
>>>> *route,
>>>> +                      bool *is_ipv4, char **prefix_s, unsigned int
>>>> *plen)
>>>>  {
>>>>      ovs_be32 nexthop;
>>>> -    const char *lrp_addr_s = NULL;
>>>> -    unsigned int plen;
>>>> -    bool is_ipv4;
>>>>
>>>>      /* Verify that the next hop is an IP address with an all-ones
>>>> mask. */
>>>> -    char *error = ip_parse_cidr(route->nexthop, &nexthop, &plen);
>>>> +    char *error = ip_parse_cidr(route->nexthop, &nexthop, plen);
>>>>      if (!error) {
>>>> -        if (plen != 32) {
>>>> +        if (*plen != 32) {
>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 1);
>>>>              VLOG_WARN_RL(&rl, "bad next hop mask %s", route->nexthop);
>>>> -            return;
>>>> +            return false;
>>>>          }
>>>> -        is_ipv4 = true;
>>>> +        *is_ipv4 = true;
>>>>      } else {
>>>>          free(error);
>>>>
>>>>          struct in6_addr ip6;
>>>> -        error = ipv6_parse_cidr(route->nexthop, &ip6, &plen);
>>>> +        error = ipv6_parse_cidr(route->nexthop, &ip6, plen);
>>>>          if (!error) {
>>>> -            if (plen != 128) {
>>>> +            if (*plen != 128) {
>>>>                  static struct vlog_rate_limit rl =
>>>> VLOG_RATE_LIMIT_INIT(5, 1);
>>>>                  VLOG_WARN_RL(&rl, "bad next hop mask %s",
>>>> route->nexthop);
>>>> -                return;
>>>> +                return false;
>>>>              }
>>>> -            is_ipv4 = false;
>>>> +            *is_ipv4 = false;
>>>>          } else {
>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 1);
>>>>              VLOG_WARN_RL(&rl, "bad next hop ip address %s",
>>>> route->nexthop);
>>>>              free(error);
>>>> -            return;
>>>> +            return false;
>>>>          }
>>>>      }
>>>>
>>>> -    char *prefix_s;
>>>> -    if (is_ipv4) {
>>>> +    if (*is_ipv4) {
>>>>          ovs_be32 prefix;
>>>>          /* Verify that ip prefix is a valid IPv4 address. */
>>>> -        error = ip_parse_cidr(route->ip_prefix, &prefix, &plen);
>>>> +        error = ip_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>          if (error) {
>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 1);
>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes %s",
>>>>                           route->ip_prefix);
>>>>              free(error);
>>>> -            return;
>>>> +            return false;
>>>>          }
>>>> -        prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix &
>>>> be32_prefix_mask(plen)));
>>>> +        *prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix
>>>> +                                              &
>>>> be32_prefix_mask(*plen)));
>>>>      } else {
>>>>          /* Verify that ip prefix is a valid IPv6 address. */
>>>>          struct in6_addr prefix;
>>>> -        error = ipv6_parse_cidr(route->ip_prefix, &prefix, &plen);
>>>> +        error = ipv6_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>          if (error) {
>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 1);
>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes %s",
>>>>                           route->ip_prefix);
>>>>              free(error);
>>>> -            return;
>>>> +            return false;
>>>>          }
>>>> -        struct in6_addr mask = ipv6_create_mask(plen);
>>>> +        struct in6_addr mask = ipv6_create_mask(*plen);
>>>>          struct in6_addr network = ipv6_addr_bitand(&prefix, &mask);
>>>> -        prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>> -        inet_ntop(AF_INET6, &network, prefix_s, INET6_ADDRSTRLEN);
>>>> +        *prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>> +        inet_ntop(AF_INET6, &network, *prefix_s, INET6_ADDRSTRLEN);
>>>> +    }
>>>> +
>>>> +    return true;
>>>> +}
>>>> +
>>>> +static void
>>>> +build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>> +                        struct hmap *ports,
>>>> +                        const struct nbrec_logical_router_static_route
>>>> *route)
>>>> +{
>>>> +    const char *lrp_addr_s = NULL;
>>>> +    unsigned int plen;
>>>> +    bool is_ipv4;
>>>> +    char *prefix_s = NULL;
>>>> +
>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* This function handles at most one output_port.  A route with multiple
>>>> +     * output_ports is handled by build_multipath_flow instead. */
>>>> +    if (route->n_output_port > 1) {
>>>> +        goto free_prefix_s;
>>>>      }
>>>>
>>>>      /* Find the outgoing port. */
>>>>      struct ovn_port *out_port = NULL;
>>>> -    if (route->output_port) {
>>>> -        out_port = ovn_port_find(ports, route->output_port);
>>>> +    if (route->n_output_port) {
>>>> +        out_port = ovn_port_find(ports, route->output_port[0]);
>>>>          if (!out_port) {
>>>>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 1);
>>>>              VLOG_WARN_RL(&rl, "Bad out port %s for static route %s",
>>>> -                         route->output_port, route->ip_prefix);
>>>> +                         route->output_port[0], route->ip_prefix);
>>>>              goto free_prefix_s;
>>>>          }
>>>>          lrp_addr_s = find_lrp_member_ip(out_port, route->nexthop);
>>>> @@ -4270,7 +4372,77 @@ build_static_route_flow(struct hmap *lflows,
>>>> struct ovn_datapath *od,
>>>>                policy);
>>>>
>>>>  free_prefix_s:
>>>> -    free(prefix_s);
>>>> +    if (prefix_s) {
>>>> +        free(prefix_s);
>>>> +    }
>>>> +}
>>>> +
>>>> +static void
>>>> +build_multipath_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>> +                     struct hmap *ports,
>>>> +                     const struct nbrec_logical_router_static_route
>>>> *route)
>>>> +{
>>>> +    unsigned int plen;
>>>> +    bool is_ipv4;
>>>> +    char *prefix_s = NULL;
>>>> +
>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s, &plen)) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* Find the outgoing port. */
>>>> +    struct ovn_port **out_ports = xmalloc(route->n_output_port *
>>>> +                                             sizeof(struct ovn_port
>>>> *));
>>>> +    const char **lrp_addr_s = xmalloc(route->n_output_port *
>>>> +                                         sizeof(const char *));
>>>> +    uint32_t idx = 0;
>>>> +    for (int i = 0; i < route->n_output_port; i++) {
>>>> +        out_ports[idx] = ovn_port_find(ports, route->output_port[i]);
>>>> +        if (!out_ports[idx]) {
>>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 1);
>>>> +            VLOG_WARN_RL(&rl, "Bad out port %s for static route %s",
>>>> +                         route->output_port[i], route->ip_prefix);
>>>> +            continue;
>>>> +        }
>>>> +
>>>> +        lrp_addr_s[idx] = find_lrp_member_ip(out_ports[idx],
>>>> route->nexthop);
>>>> +        if (!lrp_addr_s[idx]) {
>>>> +            if (is_ipv4) {
>>>> +                if (out_ports[idx]->lrp_networks.n_ipv4_addrs) {
>>>> +                    lrp_addr_s[idx] = out_ports[idx]->
>>>> +                                        lrp_networks.ipv4_addrs[0].add
>>>> r_s;
>>>> +                }
>>>> +            } else {
>>>> +                if (out_ports[idx]->lrp_networks.n_ipv6_addrs) {
>>>> +                    lrp_addr_s[idx] = out_ports[idx]->
>>>> +                                        lrp_networks.ipv6_addrs[0].add
>>>> r_s;
>>>> +                }
>>>> +            }
>>>> +        }
>>>> +        if (!lrp_addr_s[idx]) {
>>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
>>>> 1);
>>>> +            VLOG_WARN_RL(&rl,
>>>> +                         "%s has no path for static route %s; next hop
>>>> %s",
>>>> +                         route->output_port[i], route->ip_prefix,
>>>> +                         route->nexthop);
>>>> +            continue;
>>>> +        }
>>>> +
>>>> +        idx++;
>>>> +    }
>>>> +
>>>> +    char *policy = route->policy ? route->policy : "dst-ip";
>>>> +    if (idx > 0) {
>>>> +        add_multipath_route(lflows, idx,
>>>> +                            out_ports, lrp_addr_s, od,
>>>> +                            prefix_s, plen, route->nexthop, policy);
>>>> +    }
>>>> +
>>>> +    free(out_ports);
>>>> +    free(lrp_addr_s);
>>>> +    if (prefix_s) {
>>>> +        free(prefix_s);
>>>> +    }
>>>>  }
>>>>
>>>>  static void
>>>> @@ -5344,7 +5516,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>> struct hmap *ports,
>>>>          }
>>>>      }
>>>>
>>>> -    /* Convert the static routes to flows. */
>>>> +    /* Convert the static routes and multipath routes to flows. */
>>>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>>>          if (!od->nbr) {
>>>>              continue;
>>>> @@ -5354,13 +5526,24 @@ build_lrouter_flows(struct hmap *datapaths,
>>>> struct hmap *ports,
>>>>              const struct nbrec_logical_router_static_route *route;
>>>>
>>>>              route = od->nbr->static_routes[i];
>>>> -            build_static_route_flow(lflows, od, ports, route);
>>>> +            if (route->n_output_port > 1) {
>>>> +                /* Logical router ingress table 5-6: Multipath Routing.
>>>> +                 *
>>>> +                 * If a route gives traffic multiple paths to its destination,
>>>> +                 * the specific output port is figured out by hashing the
>>>> +                 * packet's IP dst address. */
>>>> +                build_multipath_flow(lflows, od, ports, route);
>>>> +            } else {
>>>> +                build_static_route_flow(lflows, od, ports, route);
>>>> +            }
>>>>          }
>>>> +        /* Packets are allowed by default in table 6. */
>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, 0, "1",
>>>> "next;");
>>>>      }
>>>>
>>>>      /* XXX destination unreachable */
>>>>
>>>> -    /* Local router ingress table 6: ARP Resolution.
>>>> +    /* Local router ingress table 7: ARP Resolution.
>>>>       *
>>>>       * Any packet that reaches this table is an IP packet whose
>>>> next-hop IP
>>>>       * address is in reg0. (ip4.dst is the final destination.) This
>>>> table
>>>> @@ -5555,7 +5738,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>> struct hmap *ports,
>>>>                        "get_nd(outport, xxreg0); next;");
>>>>      }
>>>>
>>>> -    /* Logical router ingress table 7: Gateway redirect.
>>>> +    /* Logical router ingress table 8: Gateway redirect.
>>>>       *
>>>>       * For traffic with outport equal to the l3dgw_port
>>>>       * on a distributed router, this table redirects a subset
>>>> @@ -5595,7 +5778,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>> struct hmap *ports,
>>>>          ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1",
>>>> "next;");
>>>>      }
>>>>
>>>> -    /* Local router ingress table 8: ARP request.
>>>> +    /* Local router ingress table 9: ARP request.
>>>>       *
>>>>       * In the common case where the Ethernet destination has been
>>>> resolved,
>>>>       * this table outputs the packet (priority 0).  Otherwise, it
>>>> composes
>>>> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>>>> index a077bfb..7a43473 100644
>>>> --- a/ovn/ovn-nb.ovsschema
>>>> +++ b/ovn/ovn-nb.ovsschema
>>>> @@ -1,7 +1,7 @@
>>>>  {
>>>>      "name": "OVN_Northbound",
>>>> -    "version": "5.8.0",
>>>> -    "cksum": "2812300190 16766",
>>>> +    "version": "5.9.0",
>>>> +    "cksum": "1515729450 16817",
>>>>      "tables": {
>>>>          "NB_Global": {
>>>>              "columns": {
>>>> @@ -235,7 +235,8 @@
>>>>
>>>> "dst-ip"]]},
>>>>                                      "min": 0, "max": 1}},
>>>>                  "nexthop": {"type": "string"},
>>>> -                "output_port": {"type": {"key": "string", "min": 0,
>>>> "max": 1}}},
>>>> +                "output_port": {"type": {"key": "string", "min": 0,
>>>> +                                         "max": "unlimited"}}},
>>>>              "isRoot": false},
>>>>          "NAT": {
>>>>              "columns": {
>>>> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>>>> index 9869d7e..eaba0c8 100644
>>>> --- a/ovn/ovn-nb.xml
>>>> +++ b/ovn/ovn-nb.xml
>>>> @@ -1485,6 +1485,10 @@
>>>>          multiple IP addresses on the router port and none of them are
>>>> in the
>>>>          same subnet of <ref column="nexthop"/>, OVN chooses the first
>>>> IP
>>>>          address as the one via which the <ref column="nexthop"/> is
>>>> reachable.
>>>> +        When it contains two or more ports, the packet has multiple
>>>> +        candidate output ports. OVN uses the packet header to determine
>>>> +        which port the packet is delivered to.
>>>> +        Currently, OVN uses the destination IP field to choose the port.
>>>>        </p>
>>>>      </column>
>>>>    </table>
>>>> diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
>>>> index 8e5c1a4..417194f 100644
>>>> --- a/ovn/utilities/ovn-nbctl.c
>>>> +++ b/ovn/utilities/ovn-nbctl.c
>>>> @@ -397,7 +397,7 @@ Logical router port commands:\n\
>>>>                              ('enabled' or 'disabled')\n\
>>>>  \n\
>>>>  Route commands:\n\
>>>> -  [--policy=POLICY] lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
>>>> +  [--policy=POLICY] lr-route-add ROUTER PREFIX NEXTHOP [PORT]...\n\
>>>>                              add a route to ROUTER\n\
>>>>    lr-route-del ROUTER [PREFIX]\n\
>>>>                              remove routes from ROUTER\n\
>>>> @@ -2184,13 +2184,15 @@ normalize_prefix_str(const char *orig_prefix)
>>>>          return normalize_ipv6_prefix(ipv6, plen);
>>>>      }
>>>>  }
>>>> -
>>>> +
>>>>  static void
>>>>  nbctl_lr_route_add(struct ctl_context *ctx)
>>>>  {
>>>>      const struct nbrec_logical_router *lr;
>>>>      lr = lr_by_name_or_uuid(ctx, ctx->argv[1], true);
>>>>      char *prefix, *next_hop;
>>>> +    int n_output_port = 0;
>>>> +    const char **output_port;
>>>>
>>>>      const char *policy = shash_find_data(&ctx->options, "--policy");
>>>>      if (policy && strcmp(policy, "src-ip") && strcmp(policy,
>>>> "dst-ip")) {
>>>> @@ -2224,6 +2226,11 @@ nbctl_lr_route_add(struct ctl_context *ctx)
>>>>          }
>>>>      }
>>>>
>>>> +    if (ctx->argc > 4) {
>>>> +        n_output_port = ctx->argc - 4;
>>>> +        output_port = (const char **)&ctx->argv[4];
>>>> +    }
>>>> +
>>>>      bool may_exist = shash_find(&ctx->options, "--may-exist") != NULL;
>>>>      for (int i = 0; i < lr->n_static_routes; i++) {
>>>>          const struct nbrec_logical_router_static_route *route
>>>> @@ -2253,9 +2260,10 @@ nbctl_lr_route_add(struct ctl_context *ctx)
>>>>          nbrec_logical_router_static_route_verify_nexthop(route);
>>>>          nbrec_logical_router_static_route_set_ip_prefix(route,
>>>> prefix);
>>>>          nbrec_logical_router_static_route_set_nexthop(route,
>>>> next_hop);
>>>> -        if (ctx->argc == 5) {
>>>> +        if (n_output_port > 0) {
>>>>              nbrec_logical_router_static_route_set_output_port(route,
>>>> -
>>>> ctx->argv[4]);
>>>> +
>>>> output_port,
>>>> +
>>>> n_output_port);
>>>>          }
>>>>          if (policy) {
>>>>               nbrec_logical_router_static_route_set_policy(route,
>>>> policy);
>>>> @@ -2270,8 +2278,10 @@ nbctl_lr_route_add(struct ctl_context *ctx)
>>>>      route = nbrec_logical_router_static_route_insert(ctx->txn);
>>>>      nbrec_logical_router_static_route_set_ip_prefix(route, prefix);
>>>>      nbrec_logical_router_static_route_set_nexthop(route, next_hop);
>>>> -    if (ctx->argc == 5) {
>>>> -        nbrec_logical_router_static_route_set_output_port(route,
>>>> ctx->argv[4]);
>>>> +    if (n_output_port > 0) {
>>>> +        nbrec_logical_router_static_route_set_output_port(route,
>>>> +                                                          output_port,
>>>> +
>>>> n_output_port);
>>>>      }
>>>>      if (policy) {
>>>>          nbrec_logical_router_static_route_set_policy(route, policy);
>>>> @@ -3066,8 +3076,8 @@ print_route(const struct
>>>> nbrec_logical_router_static_route *route, struct ds *s)
>>>>          ds_put_format(s, " %s", "dst-ip");
>>>>      }
>>>>
>>>> -    if (route->output_port) {
>>>> -        ds_put_format(s, " %s", route->output_port);
>>>> +    for (int i = 0; i < route->n_output_port; i++) {
>>>> +        ds_put_format(s, " %s", route->output_port[i]);
>>>>      }
>>>>      ds_put_char(s, '\n');
>>>>  }
>>>> @@ -3682,7 +3692,7 @@ static const struct ctl_command_syntax
>>>> nbctl_commands[] = {
>>>>        NULL, "", RO },
>>>>
>>>>      /* logical router route commands. */
>>>> -    { "lr-route-add", 3, 4, "ROUTER PREFIX NEXTHOP [PORT]", NULL,
>>>> +    { "lr-route-add", 3, INT_MAX, "ROUTER PREFIX NEXTHOP [PORT]...",
>>>> NULL,
>>>>        nbctl_lr_route_add, NULL, "--may-exist,--policy=", RW },
>>>>      { "lr-route-del", 1, 2, "ROUTER [PREFIX]", NULL,
>>>> nbctl_lr_route_del,
>>>>        NULL, "--if-exists", RW },
>>>> --
>>>> 1.8.3.1
>>>>
>>>>
>>>
>>
>
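
For anyone following the thread, here is a minimal sketch in C of the
two-step dispatch the patch describes: table 5 hashes the packet's IP dst
address into a link index (modulo n_links) and stores it in reg0, and table 6
maps that index to an output port. The hash function, the helper names
(toy_hash, mp_select_link, mp_output_port) and the port array are assumptions
made up for illustration. They are not OVS or OVN code, and the real
multipath action uses the datapath's own hash.

```c
#include <assert.h>
#include <stdint.h>

/* Toy integer hash standing in for the datapath's hash (an assumption for
 * illustration only, not the hash OVS uses). */
static uint32_t
toy_hash(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    return x;
}

/* Step 1 (table 5 in the patch): reduce the hash of the IP dst address
 * modulo the number of links.  The result plays the role of reg0. */
static uint32_t
mp_select_link(uint32_t ip_dst, uint32_t n_links)
{
    assert(n_links > 0);
    return toy_hash(ip_dst) % n_links;
}

/* Step 2 (table 6 in the patch): map the link index to an output port.
 * 'ports' is a hypothetical array of logical router port names. */
static const char *
mp_output_port(const char *const *ports, uint32_t n_links, uint32_t ip_dst)
{
    return ports[mp_select_link(ip_dst, n_links)];
}
```

Because the index depends only on the destination address, every packet to
the same destination leaves through the same port, while different
destinations are spread across the configured ports.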


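The flow priority computed in add_multipath_route can be sketched on its own
to show how it yields longest-prefix-match ordering: the prefix length is
doubled, and a dst-ip route gets one extra point so it wins over a src-ip
route of the same length. The function name route_priority is hypothetical;
only the arithmetic is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirror of the priority arithmetic in the patch: plen * 2 for "src-ip",
 * plen * 2 + 1 otherwise (the default policy is "dst-ip"). */
static uint16_t
route_priority(unsigned int plen, const char *policy)
{
    assert(plen <= 128);        /* Longest possible IPv6 prefix. */
    if (policy && !strcmp(policy, "src-ip")) {
        return plen * 2;
    }
    return plen * 2 + 1;
}
```

With this scheme a /24 dst-ip route gets priority 49 and a /16 dst-ip route
gets priority 33, so the more specific prefix is matched first.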