[ovs-dev] 答复: [PATCH v1 1/3] Add multipath static router in OVN northd and north-db

Miguel Angel Ajo Pelayo majopela at redhat.com
Thu Sep 21 11:07:35 UTC 2017


May be I missed something, but when I tried setting logical routers into
specific chassis, still the E/W traffic was handled in a distributed way
(from original chassis to destination chassis without going through the
router chassis), such chassis was only used for N/S, but may be I got
something wrong.


On Wed, Sep 20, 2017 at 4:48 PM, Gao Zhenyu <sysugaozhenyu at gmail.com> wrote:

> "
> But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis
> B),
> wouldn't all that E/W routing will happen virtually and the end result is
> just a tunneled packet between chassis A and chassis B ? "
> [ Now the hash function base on dst IP, if foo1 only talks to alice1, and
> it is the tunnel packet between chassisA and chassis B ]
>
> The benifit is if you have two ovn-routers and those router are ONLY
> deployed in chassis C and chassis D, the traffics can be sperated in two
> paths automatically. Otherwise you need to config static rule one by one to
> seperate traffics.
> To make a long story short, you also can do same thing by config numerous
> static rules to seperate traffic but the multipath can do it
> automatically.
>
> 2017-09-20 22:08 GMT+08:00 Miguel Angel Ajo Pelayo <majopela at redhat.com>:
>
>> I forgot to say thank you very much for the explanation and diagrams.
>>
>> On Wed, Sep 20, 2017 at 4:07 PM, Miguel Angel Ajo Pelayo <
>> majopela at redhat.com> wrote:
>>
>>> But, if an ovn port in foo (chassis A) wants to talk to alice1 (chassis
>>> B),
>>> wouldn't all that E/W routing will happen virtually and the end result
>>> is just a tunneled packet between chassis A and chassis B ?
>>>
>>> What's the benefit of multipath there if the possible failing link is
>>> always the connection between chassis A and chassis B ?
>>>
>>> I suspect there's something I'm missing on the picture.
>>>
>>> On Wed, Sep 20, 2017 at 3:49 PM, Gao Zhenyu <sysugaozhenyu at gmail.com>
>>> wrote:
>>>
>>>> You can take a look at this patch that implement a testcase :
>>>> https://patchwork.ozlabs.org/patch/815475/
>>>>
>>>> In the testcase, we have R1, R2, R3.
>>>>
>>>>  R1 and R2 that are connected to each other via LS "join"  in
>>>> 20.0.0.0/24 network.
>>>>  R1 and R3 that are connected to each other  via LS "join2" in
>>>> 20.0.0.0/24 network.
>>>>  R1 has switchess foo (192.168.1.0/24) connected to it. R2 and R3 has
>>>> alice (172.16.1.0/24) connected to it.
>>>>  R2 and R3 are gateway routers.
>>>>
>>>> A packet send  to alice1/aclie2 from foo have mulitpath to destination:
>>>>    1. foo-->R1-->join-->R2-->alice.
>>>>    2. foo-->R1-->join2-->R3-->alice.
>>>>
>>>> In this testcase, it simulates two packet, one's destination is
>>>> 172.16.1.2, another is 172.16.1.4.  The mulitpath that was configured in R1
>>>> can seperate those traffics to R2/R3. Finally,  172.16.1.2 packet travels
>>>> path2, 172.16.1.4  packet travels path1
>>>>
>>>>       +------+
>>>>       |  foo |
>>>>       +------+
>>>>           |
>>>>           |
>>>>        +------+
>>>>        |  R1 |---------+
>>>>        +------+       |
>>>>            |        |
>>>>            |        |
>>>>         +------+   +-------+
>>>>         | join |   | join2 |
>>>>         +------+   +-------+
>>>>             |      |
>>>>             |      |
>>>>         +------+   +-------+
>>>>         |  R2 |   |  R3  |
>>>>         +------+   +-------+
>>>>            |       |
>>>>            |       |
>>>>         +-----------------+
>>>>         |      alice  |
>>>>         +-----------------+
>>>>            |         |
>>>>           alice1     alice2
>>>>
>>>> Please let me know if you have any question on it. :)
>>>>
>>>> Thanks
>>>> Zhenyu Gao
>>>>
>>>> 2017-09-20 20:58 GMT+08:00 Miguel Angel Ajo Pelayo <majopela at redhat.com
>>>> >:
>>>>
>>>>> Can you share an example of how this would benefit E/W routing. I'm
>>>>> just not seeing the specific use case myself out of ignorance.
>>>>>
>>>>> It'd be great if you could explain how would it work between several
>>>>> ports in the networks and routers (may be a diagram?) otherwise I can't be
>>>>> really helpful reviewing :)
>>>>>
>>>>> Cheers, and thanks for the patience.
>>>>>
>>>>> On Wed, Sep 20, 2017 at 12:25 PM, Gao Zhenyu <sysugaozhenyu at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks for the suggestions!
>>>>>>
>>>>>> Not all Logical port has a real ofp_port connect with it. And
>>>>>> bundle_load/bundle actions need real ovs port.
>>>>>> Especially in ovn router port, all router port are virtual port which
>>>>>> just a number/reg in our ovs-flows.
>>>>>>
>>>>>> This implement of multipath can seperate ovn east-west traffic, it
>>>>>> helps dispatch traffic to gateways and routers easily.
>>>>>>
>>>>>> For south-north traffic, we can have bundle/bundle_load action to
>>>>>> consider the remote tunnel up/down status. I would like to make it step by
>>>>>> step and implement it in my next series patches.
>>>>>>
>>>>>> Thanks
>>>>>> Zhenyu Gao
>>>>>>
>>>>>> 2017-09-20 17:53 GMT+08:00 Miguel Angel Ajo Pelayo <
>>>>>> majopela at redhat.com>:
>>>>>>
>>>>>>> I'm not very familiar with multipath implementations,
>>>>>>>
>>>>>>> but would it be possible to use bundle( ouput action with hrw
>>>>>>> algorithm instead of multipath calculation to a register?.
>>>>>>>
>>>>>>> I say this, because if you look at lib/multipath.c lib/bundle.c you
>>>>>>> will find that bundle.c is going to consider the up/down status
>>>>>>> (slave_enabled check) of the links.
>>>>>>>
>>>>>>> That way the controller doesn't need to modify any flow based on
>>>>>>> link status.
>>>>>>>
>>>>>>> On Wed, Sep 20, 2017 at 5:45 AM, Gao Zhenyu <sysugaozhenyu at gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Thansk for the questions.
>>>>>>>>
>>>>>>>> the multipath_port can be set via ovn-nbctl.
>>>>>>>> Like : ovn-nbctl   -- --id=@lrt create Logical_Router_Static_Route
>>>>>>>> ip_prefix=0.0.0.0/0 nexthop=10.88.77.1 multipath_port=[mp1,mp2] --
>>>>>>>> add Logical_Router edge1 static_routes @lrt
>>>>>>>> This patch haven't implement a ovn-nbctl command to configure
>>>>>>>> multipath routing. Because I am still considering reusing nexthop or
>>>>>>>> output_port(make them become array entries), and want to collect
>>>>>>>> suggestions on it.
>>>>>>>>
>>>>>>>> About the status of next -hop, I would like to introduce
>>>>>>>> bundle_load and bfd to make it later.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Zhenyu Gao
>>>>>>>>
>>>>>>>> 2017-09-20 11:13 GMT+08:00 <wang.qianyu at zte.com.cn>:
>>>>>>>>
>>>>>>>>> How to configure multipath_port in static_route? I think the the
>>>>>>>>> multipath
>>>>>>>>> can be figured out from exist data of static_route, may not need
>>>>>>>>> to add
>>>>>>>>> this multipath_port column.
>>>>>>>>>
>>>>>>>>> And I think we should add a status column to indicate the nexthop
>>>>>>>>> state.
>>>>>>>>> When some of nexthop in multipath is down, ovn should change the
>>>>>>>>> correspond flows.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Zhenyu Gao <sysugaozhenyu at gmail.com>
>>>>>>>>> 发件人: ovs-dev-bounces at openvswitch.org
>>>>>>>>> 2017/09/19 19:37
>>>>>>>>>
>>>>>>>>>         收件人:        blp at ovn.org, majopela at redhat.com,
>>>>>>>>> anilvenkata at redhat.com, russell at ovn.org, dev at openvswitch.org,
>>>>>>>>>         抄送:
>>>>>>>>>         主题:  [ovs-dev] [PATCH v1 1/3] Add multipath static router
>>>>>>>>> in
>>>>>>>>> OVN northd      and north-db
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1. ovn-nb.ovsschema was updated to add new field multipath_port.
>>>>>>>>> 2. Add multipath feature in ovn-northd part. northd generates
>>>>>>>>> multipath
>>>>>>>>> flows to dispatch traffic by using packet's IP dst address if user
>>>>>>>>> set
>>>>>>>>> Logical_Router_Static_Route's multipath_port with ports.
>>>>>>>>> 3. Add new table(lr_in_multipath) in ovn-northd's router ingress
>>>>>>>>> stages
>>>>>>>>> to dispatch traffic to ports.
>>>>>>>>> 4. Add multipath flow in Table 5(lr_in_ip_routing) and store hash
>>>>>>>>> result
>>>>>>>>> into reg0. reg9[2] was used to indicate packet which need
>>>>>>>>> dispatching.
>>>>>>>>> 5. Add multipath feature description in ovn/northd/ovn-northd.8.xml
>>>>>>>>> and ovn/ovn-nb.xml
>>>>>>>>>
>>>>>>>>> Signed-off-by: Zhenyu Gao <sysugaozhenyu at gmail.com>
>>>>>>>>> ---
>>>>>>>>>  ovn/northd/ovn-northd.8.xml |  67 +++++++++++-
>>>>>>>>>  ovn/northd/ovn-northd.c     | 245
>>>>>>>>> ++++++++++++++++++++++++++++++++++++++------
>>>>>>>>>  ovn/ovn-nb.ovsschema        |   6 +-
>>>>>>>>>  ovn/ovn-nb.xml              |   9 ++
>>>>>>>>>  4 files changed, 289 insertions(+), 38 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/ovn/northd/ovn-northd.8.xml
>>>>>>>>> b/ovn/northd/ovn-northd.8.xml
>>>>>>>>> index 0d85ec0..b1ce9a9 100644
>>>>>>>>> --- a/ovn/northd/ovn-northd.8.xml
>>>>>>>>> +++ b/ovn/northd/ovn-northd.8.xml
>>>>>>>>> @@ -1598,6 +1598,9 @@ icmp4 {
>>>>>>>>>        port (ingress table <code>ARP Request</code> will generate
>>>>>>>>> an ARP
>>>>>>>>>        request, if needed, with <code>reg0</code> as the target
>>>>>>>>> protocol
>>>>>>>>>        address and <code>reg1</code> as the source protocol
>>>>>>>>> address).
>>>>>>>>> +      A IP route can be configured that it has multipath to
>>>>>>>>> next-hop.
>>>>>>>>> +      If a packet has multipath to destination, OVN assign the
>>>>>>>>> port
>>>>>>>>> +      index into reg[0] to indicate the packet's output port in
>>>>>>>>> table 6.
>>>>>>>>>      </p>
>>>>>>>>>
>>>>>>>>>      <p>
>>>>>>>>> @@ -1617,6 +1620,28 @@ icmp4 {
>>>>>>>>>
>>>>>>>>>        <li>
>>>>>>>>>          <p>
>>>>>>>>> +          IPv4/IPV6 multipath routing table. For each route to
>>>>>>>>> IPv4/IPv6
>>>>>>>>> +          network <var>N</var> with netmask <var>M</var>, on
>>>>>>>>> multipath
>>>>>>>>> port
>>>>>>>>> +          <var>P</var> with IP address <var>A</var> and Ethernet
>>>>>>>>> +          address <var>E</var>, a logical flow with match
>>>>>>>>> +          <code>ip4.dst ==<var>N</var>/<var>M</var></code>,whose
>>>>>>>>> priority
>>>>>>>>> +          is the number of 1-bits plus 10 in <var>M</var>,
>>>>>>>>> +          has the following actions:
>>>>>>>>> +        </p>
>>>>>>>>> +
>>>>>>>>> +        <pre>
>>>>>>>>> +ip.ttl--;
>>>>>>>>> +multipath (nw_dst, 0, modulo_n, <var>n_links</var>, 0, reg0);
>>>>>>>>> +reg9[2] = 1
>>>>>>>>> +next;
>>>>>>>>> +        </pre>
>>>>>>>>> +        <p>
>>>>>>>>> +          <var>n_links</var> is the number of multipath port.
>>>>>>>>> +        </p>
>>>>>>>>> +      </li>
>>>>>>>>> +
>>>>>>>>> +      <li>
>>>>>>>>> +        <p>
>>>>>>>>>            IPv4 routing table.  For each route to IPv4 network
>>>>>>>>> <var>N</var> with
>>>>>>>>>            netmask <var>M</var>, on router port <var>P</var> with
>>>>>>>>> IP
>>>>>>>>> address
>>>>>>>>>            <var>A</var> and Ethernet
>>>>>>>>> @@ -1686,7 +1711,43 @@ next;
>>>>>>>>>        </li>
>>>>>>>>>      </ul>
>>>>>>>>>
>>>>>>>>> -    <h3>Ingress Table 6: ARP/ND Resolution</h3>
>>>>>>>>> +    <h3>Ingress Table 6: Multipath</h3>
>>>>>>>>> +    <p>
>>>>>>>>> +      Any packet taht reaches this table is an IP packet and
>>>>>>>>> reg9[2]=1
>>>>>>>>> +      using the following flows to route to corresponding port.
>>>>>>>>> This
>>>>>>>>> table
>>>>>>>>> +      implement dispatching by consuming reg0.
>>>>>>>>> +    </p>
>>>>>>>>> +
>>>>>>>>> +    <ul>
>>>>>>>>> +      <li>
>>>>>>>>> +        <p>
>>>>>>>>> +          A packet with netmask <var>M</var>, IP address
>>>>>>>>> <var>A</var> and
>>>>>>>>> +          <code>reg9[2] = 1</code>, whose priority above 1 has
>>>>>>>>> following
>>>>>>>>> +          actions:
>>>>>>>>> +        </p>
>>>>>>>>> +
>>>>>>>>> +        <pre>
>>>>>>>>> +reg0 = <var>G</var>;
>>>>>>>>> +reg1 = <var>A</var>;
>>>>>>>>> +eth.src = <var>E</var>;
>>>>>>>>> +outport = <var>P</var>;
>>>>>>>>> +flags.loopback = 1;
>>>>>>>>> +next;
>>>>>>>>> +        </pre>
>>>>>>>>> +
>>>>>>>>> +        <p>
>>>>>>>>> +          <var>G</var> is the gateway IP address. <var>A</var>,
>>>>>>>>> <var>E</var>
>>>>>>>>> +          and <var>P</var> are the values that were described in
>>>>>>>>> multipath
>>>>>>>>> +          routeing in table 5
>>>>>>>>> +        </p>
>>>>>>>>> +
>>>>>>>>> +        <p>
>>>>>>>>> +          A priority-0 logical flow with match has actions
>>>>>>>>> <code>next;</code>.
>>>>>>>>> +        </p>
>>>>>>>>> +      </li>
>>>>>>>>> +    </ul>
>>>>>>>>> +
>>>>>>>>> +    <h3>Ingress Table 7: ARP/ND Resolution</h3>
>>>>>>>>>
>>>>>>>>>      <p>
>>>>>>>>>        Any packet that reaches this table is an IP packet whose
>>>>>>>>> next-hop
>>>>>>>>> @@ -1779,7 +1840,7 @@ next;
>>>>>>>>>        </li>
>>>>>>>>>      </ul>
>>>>>>>>>
>>>>>>>>> -    <h3>Ingress Table 7: Gateway Redirect</h3>
>>>>>>>>> +    <h3>Ingress Table 8: Gateway Redirect</h3>
>>>>>>>>>
>>>>>>>>>      <p>
>>>>>>>>>        For distributed logical routers where one of the logical
>>>>>>>>> router
>>>>>>>>> @@ -1836,7 +1897,7 @@ next;
>>>>>>>>>        </li>
>>>>>>>>>      </ul>
>>>>>>>>>
>>>>>>>>> -    <h3>Ingress Table 8: ARP Request</h3>
>>>>>>>>> +    <h3>Ingress Table 9: ARP Request</h3>
>>>>>>>>>
>>>>>>>>>      <p>
>>>>>>>>>        In the common case where the Ethernet destination has been
>>>>>>>>> resolved, this
>>>>>>>>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>>>>>>>>> index 49e4ac3..44d1fd4 100644
>>>>>>>>> --- a/ovn/northd/ovn-northd.c
>>>>>>>>> +++ b/ovn/northd/ovn-northd.c
>>>>>>>>> @@ -135,9 +135,10 @@ enum ovn_stage {
>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  UNSNAT,      3, "lr_in_unsnat")
>>>>>>>>>      \
>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  DNAT,        4, "lr_in_dnat")
>>>>>>>>>      \
>>>>>>>>>      PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  5,
>>>>>>>>> "lr_in_ip_routing")   \
>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 6,
>>>>>>>>> "lr_in_arp_resolve")  \
>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 7,
>>>>>>>>> "lr_in_gw_redirect")  \
>>>>>>>>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 8,
>>>>>>>>> "lr_in_arp_request")  \
>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  MULTIPATH,   6,
>>>>>>>>> "lr_in_multipath")    \
>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 7,
>>>>>>>>> "lr_in_arp_resolve")  \
>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 8,
>>>>>>>>> "lr_in_gw_redirect")  \
>>>>>>>>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 9,
>>>>>>>>> "lr_in_arp_request")  \
>>>>>>>>>
>>>>>>>>>      \
>>>>>>>>>      /* Logical router egress stages. */
>>>>>>>>>      \
>>>>>>>>>      PIPELINE_STAGE(ROUTER, OUT, UNDNAT,    0, "lr_out_undnat")
>>>>>>>>>     \
>>>>>>>>> @@ -173,6 +174,11 @@ enum ovn_stage {
>>>>>>>>>   * one of the logical router's own IP addresses. */
>>>>>>>>>  #define REGBIT_EGRESS_LOOPBACK  "reg9[1]"
>>>>>>>>>
>>>>>>>>> +/* Indicate multipath action has process this packet and store
>>>>>>>>> hash
>>>>>>>>> result
>>>>>>>>> + * into other regX. Should consume the hash result to determin
>>>>>>>>> the right
>>>>>>>>> + * output port. */
>>>>>>>>> +#define REGBIT_MULTIPATH "reg9[2]"
>>>>>>>>> +
>>>>>>>>>  /* Returns an "enum ovn_stage" built from the arguments. */
>>>>>>>>>  static enum ovn_stage
>>>>>>>>>  ovn_stage_build(enum ovn_datapath_type dp_type, enum ovn_pipeline
>>>>>>>>> pipeline,
>>>>>>>>> @@ -4142,72 +4148,165 @@ add_route(struct hmap *lflows, const
>>>>>>>>> struct
>>>>>>>>> ovn_port *op,
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  static void
>>>>>>>>> -build_static_route_flow(struct hmap *lflows, struct ovn_datapath
>>>>>>>>> *od,
>>>>>>>>> -                        struct hmap *ports,
>>>>>>>>> -                        const struct
>>>>>>>>> nbrec_logical_router_static_route
>>>>>>>>> *route)
>>>>>>>>> +add_multipath_route(struct hmap *lflows, uint32_t port_num,
>>>>>>>>> +                    struct ovn_port **out_ports,
>>>>>>>>> +                    const char **lrp_addr_s,
>>>>>>>>> +                    struct ovn_datapath *od,
>>>>>>>>> +                    const char *network_s, int plen,
>>>>>>>>> +                    const char *gateway, const char *policy)
>>>>>>>>> +{
>>>>>>>>> +    bool is_ipv4 = strchr(network_s, '.') ? true : false;
>>>>>>>>> +    struct ds match = DS_EMPTY_INITIALIZER;
>>>>>>>>> +    const char *dir;
>>>>>>>>> +    uint16_t priority;
>>>>>>>>> +
>>>>>>>>> +    if (policy && !strcmp(policy, "src-ip")) {
>>>>>>>>> +        dir = "src";
>>>>>>>>> +        priority = plen * 2;
>>>>>>>>> +    } else {
>>>>>>>>> +        dir = "dst";
>>>>>>>>> +        priority = (plen * 2) + 1;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    /* Set higer priority than regular route. */
>>>>>>>>> +    priority += 10;
>>>>>>>>> +
>>>>>>>>> +    ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" :
>>>>>>>>> "6", dir,
>>>>>>>>> +                  network_s, plen);
>>>>>>>>> +
>>>>>>>>> +    struct ds actions = DS_EMPTY_INITIALIZER;
>>>>>>>>> +
>>>>>>>>> +    ds_put_format(&actions, "ip.ttl--; ");
>>>>>>>>> +    ds_put_format(&actions,
>>>>>>>>> +                  "multipath (nw_dst, 0, modulo_n, %u, 0, reg0); "
>>>>>>>>> +                  "%s = 1; "
>>>>>>>>> +                  "next;",
>>>>>>>>> +                  port_num, REGBIT_MULTIPATH);
>>>>>>>>> +
>>>>>>>>> +    /* The priority here is calculated to implement
>>>>>>>>> longest-prefix-match
>>>>>>>>> +     * routing. */
>>>>>>>>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, priority,
>>>>>>>>> +                  ds_cstr(&match), ds_cstr(&actions));
>>>>>>>>> +
>>>>>>>>> +    for (int i = 0; i < port_num; i++) {
>>>>>>>>> +        struct ds mp_match = DS_EMPTY_INITIALIZER;
>>>>>>>>> +        struct ds mp_actions = DS_EMPTY_INITIALIZER;
>>>>>>>>> +
>>>>>>>>> +        ds_put_format(&mp_match, "%s == 1 && reg0 == %d && ",
>>>>>>>>> +                      REGBIT_MULTIPATH, i);
>>>>>>>>> +        ds_put_format(&mp_match, "ip%s.%s == %s/%d",
>>>>>>>>> +                      is_ipv4 ? "4" : "6", dir,
>>>>>>>>> +                      network_s, plen);
>>>>>>>>> +
>>>>>>>>> +        ds_put_format(&mp_actions, "%sreg0 = ", is_ipv4 ? "" :
>>>>>>>>> "xx");
>>>>>>>>> +        if (gateway) {
>>>>>>>>> +            ds_put_cstr(&mp_actions, gateway);
>>>>>>>>> +        } else {
>>>>>>>>> +            ds_put_format(&mp_actions, "ip%s.dst", is_ipv4 ? "4"
>>>>>>>>> : "6");
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>> +        ds_put_format(&mp_actions, "; "
>>>>>>>>> +                      "%sreg1 = %s; "
>>>>>>>>> +                      "eth.src = %s; "
>>>>>>>>> +                      "outport = %s; "
>>>>>>>>> +                      "flags.loopback = 1; "
>>>>>>>>> +                      "next;",
>>>>>>>>> +                      is_ipv4 ? "" : "xx",
>>>>>>>>> +                      lrp_addr_s[i],
>>>>>>>>> +                      out_ports[i]->lrp_networks.ea_s,
>>>>>>>>> +                      out_ports[i]->json_key);
>>>>>>>>> +
>>>>>>>>> +        /* Add flow in table 6 to determin the right output port
>>>>>>>>> +         * for this traffic. */
>>>>>>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, priority,
>>>>>>>>> +                      ds_cstr(&mp_match), ds_cstr(&mp_actions));
>>>>>>>>> +        ds_destroy(&mp_match);
>>>>>>>>> +        ds_destroy(&mp_actions);
>>>>>>>>> +    }
>>>>>>>>> +    ds_destroy(&match);
>>>>>>>>> +    ds_destroy(&actions);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static bool
>>>>>>>>> +verify_nexthop_prefix(const struct nbrec_logical_router_static_ro
>>>>>>>>> ute
>>>>>>>>> *route,
>>>>>>>>> +                      bool *is_ipv4, char **prefix_s, unsigned
>>>>>>>>> int *plen)
>>>>>>>>>  {
>>>>>>>>>      ovs_be32 nexthop;
>>>>>>>>> -    const char *lrp_addr_s = NULL;
>>>>>>>>> -    unsigned int plen;
>>>>>>>>> -    bool is_ipv4;
>>>>>>>>>
>>>>>>>>>      /* Verify that the next hop is an IP address with an all-ones
>>>>>>>>> mask.
>>>>>>>>> */
>>>>>>>>> -    char *error = ip_parse_cidr(route->nexthop, &nexthop, &plen);
>>>>>>>>> +    char *error = ip_parse_cidr(route->nexthop, &nexthop, plen);
>>>>>>>>>      if (!error) {
>>>>>>>>> -        if (plen != 32) {
>>>>>>>>> +        if (*plen != 32) {
>>>>>>>>>              static struct vlog_rate_limit rl =
>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>> 1);
>>>>>>>>>              VLOG_WARN_RL(&rl, "bad next hop mask %s",
>>>>>>>>> route->nexthop);
>>>>>>>>> -            return;
>>>>>>>>> +            return false;
>>>>>>>>>          }
>>>>>>>>> -        is_ipv4 = true;
>>>>>>>>> +        *is_ipv4 = true;
>>>>>>>>>      } else {
>>>>>>>>>          free(error);
>>>>>>>>>
>>>>>>>>>          struct in6_addr ip6;
>>>>>>>>> -        error = ipv6_parse_cidr(route->nexthop, &ip6, &plen);
>>>>>>>>> +        error = ipv6_parse_cidr(route->nexthop, &ip6, plen);
>>>>>>>>>          if (!error) {
>>>>>>>>> -            if (plen != 128) {
>>>>>>>>> +            if (*plen != 128) {
>>>>>>>>>                  static struct vlog_rate_limit rl =
>>>>>>>>> VLOG_RATE_LIMIT_INIT(5, 1);
>>>>>>>>>                  VLOG_WARN_RL(&rl, "bad next hop mask %s",
>>>>>>>>> route->nexthop);
>>>>>>>>> -                return;
>>>>>>>>> +                return false;
>>>>>>>>>              }
>>>>>>>>> -            is_ipv4 = false;
>>>>>>>>> +            *is_ipv4 = false;
>>>>>>>>>          } else {
>>>>>>>>>              static struct vlog_rate_limit rl =
>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>> 1);
>>>>>>>>>              VLOG_WARN_RL(&rl, "bad next hop ip address %s",
>>>>>>>>> route->nexthop);
>>>>>>>>>              free(error);
>>>>>>>>> -            return;
>>>>>>>>> +            return false;
>>>>>>>>>          }
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    char *prefix_s;
>>>>>>>>> -    if (is_ipv4) {
>>>>>>>>> +    if (*is_ipv4) {
>>>>>>>>>          ovs_be32 prefix;
>>>>>>>>>          /* Verify that ip prefix is a valid IPv4 address. */
>>>>>>>>> -        error = ip_parse_cidr(route->ip_prefix, &prefix, &plen);
>>>>>>>>> +        error = ip_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>>>>>>          if (error) {
>>>>>>>>>              static struct vlog_rate_limit rl =
>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>> 1);
>>>>>>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes
>>>>>>>>> %s",
>>>>>>>>>                           route->ip_prefix);
>>>>>>>>>              free(error);
>>>>>>>>> -            return;
>>>>>>>>> +            return false;
>>>>>>>>>          }
>>>>>>>>> -        prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix &
>>>>>>>>> be32_prefix_mask(plen)));
>>>>>>>>> +        *prefix_s = xasprintf(IP_FMT, IP_ARGS(prefix
>>>>>>>>> +                                              &
>>>>>>>>> be32_prefix_mask(*plen)));
>>>>>>>>>      } else {
>>>>>>>>>          /* Verify that ip prefix is a valid IPv6 address. */
>>>>>>>>>          struct in6_addr prefix;
>>>>>>>>> -        error = ipv6_parse_cidr(route->ip_prefix, &prefix,
>>>>>>>>> &plen);
>>>>>>>>> +        error = ipv6_parse_cidr(route->ip_prefix, &prefix, plen);
>>>>>>>>>          if (error) {
>>>>>>>>>              static struct vlog_rate_limit rl =
>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>> 1);
>>>>>>>>>              VLOG_WARN_RL(&rl, "bad 'ip_prefix' in static routes
>>>>>>>>> %s",
>>>>>>>>>                           route->ip_prefix);
>>>>>>>>>              free(error);
>>>>>>>>> -            return;
>>>>>>>>> +            return false;
>>>>>>>>>          }
>>>>>>>>> -        struct in6_addr mask = ipv6_create_mask(plen);
>>>>>>>>> +        struct in6_addr mask = ipv6_create_mask(*plen);
>>>>>>>>>          struct in6_addr network = ipv6_addr_bitand(&prefix,
>>>>>>>>> &mask);
>>>>>>>>> -        prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>>>>>>> -        inet_ntop(AF_INET6, &network, prefix_s, INET6_ADDRSTRLEN);
>>>>>>>>> +        *prefix_s = xmalloc(INET6_ADDRSTRLEN);
>>>>>>>>> +        inet_ntop(AF_INET6, &network, *prefix_s,
>>>>>>>>> INET6_ADDRSTRLEN);
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    return true;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void
>>>>>>>>> +build_static_route_flow(struct hmap *lflows, struct ovn_datapath
>>>>>>>>> *od,
>>>>>>>>> +                        struct hmap *ports,
>>>>>>>>> +                        const struct
>>>>>>>>> nbrec_logical_router_static_route
>>>>>>>>> *route)
>>>>>>>>> +{
>>>>>>>>> +    const char *lrp_addr_s = NULL;
>>>>>>>>> +    unsigned int plen;
>>>>>>>>> +    bool is_ipv4;
>>>>>>>>> +    char *prefix_s = NULL;
>>>>>>>>> +
>>>>>>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s,
>>>>>>>>> &plen)) {
>>>>>>>>> +        return;
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>>      /* Find the outgoing port. */
>>>>>>>>> @@ -4270,7 +4369,75 @@ build_static_route_flow(struct hmap
>>>>>>>>> *lflows, struct
>>>>>>>>> ovn_datapath *od,
>>>>>>>>>                policy);
>>>>>>>>>
>>>>>>>>>  free_prefix_s:
>>>>>>>>> -    free(prefix_s);
>>>>>>>>> +    if (prefix_s) {
>>>>>>>>> +        free(prefix_s);
>>>>>>>>> +    }
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void
>>>>>>>>> +build_multipath_flow(struct hmap *lflows, struct ovn_datapath *od,
>>>>>>>>> +                     struct hmap *ports,
>>>>>>>>> +                     const struct nbrec_logical_router_static_ro
>>>>>>>>> ute
>>>>>>>>> *route)
>>>>>>>>> +{
>>>>>>>>> +    unsigned int plen;
>>>>>>>>> +    bool is_ipv4;
>>>>>>>>> +    char *prefix_s = NULL;
>>>>>>>>> +
>>>>>>>>> +    if (!verify_nexthop_prefix(route, &is_ipv4, &prefix_s,
>>>>>>>>> &plen)) {
>>>>>>>>> +        return;
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +    /* Find the outgoing port. */
>>>>>>>>> +    struct ovn_port **out_ports = xmalloc(route->n_multipath_port
>>>>>>>>> *
>>>>>>>>> +                                             sizeof(struct
>>>>>>>>> ovn_port *));
>>>>>>>>> +    const char **lrp_addr_s = xmalloc(route->n_multipath_port *
>>>>>>>>> +                                         sizeof(const char *));
>>>>>>>>> +    for (int i = 0; i < route->n_multipath_port; i++) {
>>>>>>>>> +        // TODO May need to consider some ports are not found?
>>>>>>>>> +        out_ports[i] = ovn_port_find(ports,
>>>>>>>>> route->multipath_port[i]);
>>>>>>>>> +        if (!out_ports[i]) {
>>>>>>>>> +            static struct vlog_rate_limit rl =
>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>> 1);
>>>>>>>>> +            VLOG_WARN_RL(&rl, "Bad out port %s for static route
>>>>>>>>> %s",
>>>>>>>>> +                         route->multipath_port[i],
>>>>>>>>> route->ip_prefix);
>>>>>>>>> +            goto free_ports_lrp_addr;
>>>>>>>>> +        }
>>>>>>>>> +
>>>>>>>>> +        lrp_addr_s[i] = find_lrp_member_ip(out_ports[i],
>>>>>>>>> route->nexthop);
>>>>>>>>> +        if (!lrp_addr_s[i]) {
>>>>>>>>> +            if (is_ipv4) {
>>>>>>>>> +                if (out_ports[i]->lrp_networks.n_ipv4_addrs) {
>>>>>>>>> +                    lrp_addr_s[i] = out_ports[i]->
>>>>>>>>> + lrp_networks.ipv4_addrs[0].addr_s;
>>>>>>>>> +                }
>>>>>>>>> +            } else {
>>>>>>>>> +                if (out_ports[i]->lrp_networks.n_ipv6_addrs) {
>>>>>>>>> +                    lrp_addr_s[i] = out_ports[i]->
>>>>>>>>> + lrp_networks.ipv6_addrs[0].addr_s;
>>>>>>>>> +                }
>>>>>>>>> +            }
>>>>>>>>> +        }
>>>>>>>>> +        if (!lrp_addr_s[i]) {
>>>>>>>>> +            static struct vlog_rate_limit rl =
>>>>>>>>> VLOG_RATE_LIMIT_INIT(5,
>>>>>>>>> 1);
>>>>>>>>> +            VLOG_WARN_RL(&rl,
>>>>>>>>> +                         "%s has no path for static route %s;
>>>>>>>>> next hop
>>>>>>>>> %s",
>>>>>>>>> +                         route->multipath_port[i],
>>>>>>>>> route->ip_prefix,
>>>>>>>>> +                         route->nexthop);
>>>>>>>>> +            goto free_ports_lrp_addr;
>>>>>>>>> +        }
>>>>>>>>> +    }
>>>>>>>>> +
>>>>>>>>> +
>>>>>>>>> +    char *policy = route->policy ? route->policy : "dst-ip";
>>>>>>>>> +    add_multipath_route(lflows, route->n_multipath_port,
>>>>>>>>> +                        out_ports, lrp_addr_s, od,
>>>>>>>>> +                        prefix_s, plen, route->nexthop, policy);
>>>>>>>>> +
>>>>>>>>> +free_ports_lrp_addr:
>>>>>>>>> +    free(out_ports);
>>>>>>>>> +    free(lrp_addr_s);
>>>>>>>>> +    if (prefix_s) {
>>>>>>>>> +        free(prefix_s);
>>>>>>>>> +    }
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  static void
>>>>>>>>> @@ -5344,7 +5511,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>>>>>>> struct
>>>>>>>>> hmap *ports,
>>>>>>>>>          }
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    /* Convert the static routes to flows. */
>>>>>>>>> +    /* Convert the static routes and multipath route to flows. */
>>>>>>>>>      HMAP_FOR_EACH (od, key_node, datapaths) {
>>>>>>>>>          if (!od->nbr) {
>>>>>>>>>              continue;
>>>>>>>>> @@ -5355,12 +5522,24 @@ build_lrouter_flows(struct hmap
>>>>>>>>> *datapaths, struct
>>>>>>>>> hmap *ports,
>>>>>>>>>
>>>>>>>>>              route = od->nbr->static_routes[i];
>>>>>>>>>              build_static_route_flow(lflows, od, ports, route);
>>>>>>>>> +            /* Logical router ingress table 5-6: Multipath
>>>>>>>>> Routing.
>>>>>>>>> +             *
>>>>>>>>> +             * If router has configured a traffic has multiple
>>>>>>>>> paths
>>>>>>>>> +             * to destination. The right output port should be
>>>>>>>>> firgured
>>>>>>>>> +             * out by computing IP packet's header */
>>>>>>>>> +            if (route->n_multipath_port > 1) {
>>>>>>>>> +                /* Generate multipath routes in table 5,6 for
>>>>>>>>> +                 * dedicated traffic */
>>>>>>>>> +                build_multipath_flow(lflows, od, ports, route);
>>>>>>>>> +            }
>>>>>>>>>          }
>>>>>>>>> +        /* Packets are allowed by default in table 6. */
>>>>>>>>> +        ovn_lflow_add(lflows, od, S_ROUTER_IN_MULTIPATH, 0, "1",
>>>>>>>>> "next;");
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>>      /* XXX destination unreachable */
>>>>>>>>>
>>>>>>>>> -    /* Local router ingress table 6: ARP Resolution.
>>>>>>>>> +    /* Local router ingress table 7: ARP Resolution.
>>>>>>>>>       *
>>>>>>>>>       * Any packet that reaches this table is an IP packet whose
>>>>>>>>> next-hop
>>>>>>>>> IP
>>>>>>>>>       * address is in reg0. (ip4.dst is the final destination.)
>>>>>>>>> This table
>>>>>>>>> @@ -5555,7 +5734,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>>>>>>> struct
>>>>>>>>> hmap *ports,
>>>>>>>>>                        "get_nd(outport, xxreg0); next;");
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    /* Logical router ingress table 7: Gateway redirect.
>>>>>>>>> +    /* Logical router ingress table 8: Gateway redirect.
>>>>>>>>>       *
>>>>>>>>>       * For traffic with outport equal to the l3dgw_port
>>>>>>>>>       * on a distributed router, this table redirects a subset
>>>>>>>>> @@ -5595,7 +5774,7 @@ build_lrouter_flows(struct hmap *datapaths,
>>>>>>>>> struct
>>>>>>>>> hmap *ports,
>>>>>>>>>          ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1",
>>>>>>>>> "next;");
>>>>>>>>>      }
>>>>>>>>>
>>>>>>>>> -    /* Local router ingress table 8: ARP request.
>>>>>>>>> +    /* Local router ingress table 9: ARP request.
>>>>>>>>>       *
>>>>>>>>>       * In the common case where the Ethernet destination has been
>>>>>>>>> resolved,
>>>>>>>>>       * this table outputs the packet (priority 0).  Otherwise, it
>>>>>>>>> composes
>>>>>>>>> diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>>>>>>>>> index a077bfb..b8bdd42 100644
>>>>>>>>> --- a/ovn/ovn-nb.ovsschema
>>>>>>>>> +++ b/ovn/ovn-nb.ovsschema
>>>>>>>>> @@ -1,7 +1,7 @@
>>>>>>>>>  {
>>>>>>>>>      "name": "OVN_Northbound",
>>>>>>>>>      "version": "5.8.0",
>>>>>>>>> -    "cksum": "2812300190 <(281)%20230-0190> 16766",
>>>>>>>>> +    "cksum": "1967092589 16903",
>>>>>>>>>      "tables": {
>>>>>>>>>          "NB_Global": {
>>>>>>>>>              "columns": {
>>>>>>>>> @@ -235,7 +235,9 @@
>>>>>>>>>
>>>>>>>>> "dst-ip"]]},
>>>>>>>>>                                      "min": 0, "max": 1}},
>>>>>>>>>                  "nexthop": {"type": "string"},
>>>>>>>>> -                "output_port": {"type": {"key": "string", "min":
>>>>>>>>> 0,
>>>>>>>>> "max": 1}}},
>>>>>>>>> +                "output_port": {"type": {"key": "string", "min":
>>>>>>>>> 0,
>>>>>>>>> "max": 1}},
>>>>>>>>> +                "multipath_port": {"type": {"key": "string",
>>>>>>>>> "min": 0,
>>>>>>>>> +                                            "max": "unlimited"}}},
>>>>>>>>>              "isRoot": false},
>>>>>>>>>          "NAT": {
>>>>>>>>>              "columns": {
>>>>>>>>> diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>>>>>>>>> index 9869d7e..15feb97 100644
>>>>>>>>> --- a/ovn/ovn-nb.xml
>>>>>>>>> +++ b/ovn/ovn-nb.xml
>>>>>>>>> @@ -1487,6 +1487,15 @@
>>>>>>>>>          address as the one via which the <ref column="nexthop"/>
>>>>>>>>> is
>>>>>>>>> reachable.
>>>>>>>>>        </p>
>>>>>>>>>      </column>
>>>>>>>>> +    <column name="multipath_port">
>>>>>>>>> +      <p>
>>>>>>>>> +        The name of the <ref table="Logical_Router_Port"/> via
>>>>>>>>> which the
>>>>>>>>> packet
>>>>>>>>> +        needs to be sent out. When it contains more than two
>>>>>>>>> ports, it
>>>>>>>>> means
>>>>>>>>> +        packet has multiple candidate output ports. OVN uses the
>>>>>>>>> packet
>>>>>>>>> header
>>>>>>>>> +        to determin which port the packet would be delivered to.
>>>>>>>>> +        Currently, OVN consumes destination IP address to figure
>>>>>>>>> out
>>>>>>>>> port.
>>>>>>>>> +      </p>
>>>>>>>>> +    </column>
>>>>>>>>>    </table>
>>>>>>>>>
>>>>>>>>>    <table name="NAT" title="NAT rules">
>>>>>>>>> --
>>>>>>>>> 1.8.3.1
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> dev mailing list
>>>>>>>>> dev at openvswitch.org
>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> dev mailing list
>>>>>>>>> dev at openvswitch.org
>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


More information about the dev mailing list