[ovs-dev] [PATCH ovn] ovn-northd: Multiple distributed gateway port support.

Han Zhou hzhou at ovn.org
Thu Aug 5 15:53:20 UTC 2021


On Tue, Aug 3, 2021 at 10:59 AM Numan Siddique <numans at ovn.org> wrote:
>
> On Tue, Aug 3, 2021 at 1:55 PM Han Zhou <hzhou at ovn.org> wrote:
> >
> > On Tue, Aug 3, 2021 at 10:14 AM Numan Siddique <numans at ovn.org> wrote:
> > >
> > > On Thu, Jul 29, 2021 at 4:03 AM Han Zhou <hzhou at ovn.org> wrote:
> > > >
> > > > From: Ankur Sharma <ankurmnnit2004 at gmail.com>
> > > >
> > > > By default, OVN support only one DGP (distributed gateway port) per
> > > > logical router. While a single DGP port suffices for most of the
North
> > > > South connectivity, there are requirements where a logical router
could
> > > > be connected to multiple external networks and based on routing
decision
> > > > packet could go to different ones.
> > > >
> > > > This patch adds flexibility of having multiple DGPs per logical
router.
> > > >
> > > > Changes can classified as following:
> > > > a. Data structure changes to allow multiple DGPs per ovn_datapath.
> > > >
> > > > b. Consumption of new data structure in logical flows for
> > > >    individual features.
> > > >
> > > > c. Features that require changes are:
> > > >    i. Regular NS traffic flow.
> > > >   ii. Network Address Translation.
> > > >  iii. Load Balancer
> > > >   iv. Gateway_mtu.
> > > >    v. reside-on-redirect-chassis
> > > >   vi. Misc code sections that assumed a single DGP.
> > > >
> > > > d. Except for reside-on-redirect-chassis all the other features
> > > >    could be extended to multiple DGPs. Reside on redirect
> > > >    chassis with its current specification could not be extended
> > > >    and hence should be used only with the logical router that
> > > >    has a single DGP.
> > > >
> > > > This patch doesn't support NAT & load-balancer features for multiple
> > > > DGPs yet, but added validations that disables NAT/load-balancer
> > > > features when there are more than one DGP configured per router.
> > > >
> > > > Signed-off-by: Ankur Sharma <ankurmnnit2004 at gmail.com>
> > > > Co-authored-by: Dhathri Purohith <dhathri.purohith at nutanix.com>
> > > > Signed-off-by: Dhathri Purohith <dhathri.purohith at nutanix.com>
> > > > Co-authored-by: Abhiram Sangana <sangana.abhiram at nutanix.com>
> > > > Signed-off-by: Abhiram Sangana <sangana.abhiram at nutanix.com>
> > > > Co-authored-by: Han Zhou <hzhou at ovn.org>
> > > > Signed-off-by: Han Zhou <hzhou at ovn.org>
> > >
> > > Hi Han, Ankur et all.
> > >
> > > Thanks for adding this feature.
> > >
> > > Please see below for few comments.
> > >
> > > Also this patch needs a rebase.
> > >
> > > Thanks
> > > Numan
> > >
> >
> > Thanks Numan for the review!
> >
> > > > ---
> > > >  NEWS                    |   3 +
> > > >  northd/lrouter.dl       | 103 +++-----
> > > >  northd/ovn-northd.8.xml |   6 +-
> > > >  northd/ovn-northd.c     | 535
++++++++++++++++++++++------------------
> > > >  northd/ovn_northd.dl    | 178 +++++++------
> > > >  ovn-architecture.7.xml  |  19 +-
> > > >  ovn-nb.xml              |  27 +-
> > > >  ovs                     |   2 +-
> > > >  tests/ovn-northd.at     |  92 +++++++
> > > >  tests/ovn.at            | 307 +++++++++++++++++++++++
> > > >  10 files changed, 870 insertions(+), 402 deletions(-)
> > > >
> > > > diff --git a/NEWS b/NEWS
> > > > index eefdae9bc..2a1c56b2d 100644
> > > > --- a/NEWS
> > > > +++ b/NEWS
> > > > @@ -33,6 +33,9 @@ OVN v21.06.0 - 18 Jun 2021
> > > >      "ovn-trim-limit-lflow-cache" and
> > "ovn-trim-wmark-perc-lflow-cache", to
> > > >      allow enforcing a lflow cache size limit and high watermark
> > percentage
> > > >      for which automatic memory trimming is performed.
> > > > +  - Support multiple distributed gateway ports on a single logical
> > router.
> > > > +    (NAT and load-balancer are not supported yet when there are
> > multiple
> > > > +    distributed gateway ports).
> > > >
> > > >  OVN v21.03.0 - 12 Mar 2021
> > > >  -------------------------
> > > > diff --git a/northd/lrouter.dl b/northd/lrouter.dl
> > > > index 4a24f3f61..d37350ab8 100644
> > > > --- a/northd/lrouter.dl
> > > > +++ b/northd/lrouter.dl
> > > > @@ -138,14 +138,14 @@ Warning[message] :-
> > > >      var message = "Bad configuration: distributed gateway port
> > configured on "
> > > >      "port ${lrp.name} on L3 gateway router".
> > > >
> > > > -/* DistributedGatewayPortCandidate.
> > > > +/* Distributed gateway ports.
> > > >   *
> > > > - * Each row pairs a logical router with its distributed gateway
port,
> > > > - * but without checking that there is at most one DGP per LR.
> > > > + * Each row means 'lrp' is a distributed gateway port on 'lr_uuid'.
> > > >   *
> > > > - * (Use DistributedGatewayPort instead, since it guarantees
> > uniqueness.) */
> > > > -relation DistributedGatewayPortCandidate(lr_uuid: uuid, lrp_uuid:
uuid)
> > > > -DistributedGatewayPortCandidate(lr_uuid, lrp_uuid) :-
> > > > + * A logical router can have multiple distributed gateway ports. */
> > > > +relation DistributedGatewayPort(lrp:
Intern<nb::Logical_Router_Port>,
> > > > +                                lr_uuid: uuid)
> > > > +DistributedGatewayPort(lrp, lr_uuid) :-
> > > >      lr in nb::Logical_Router(._uuid = lr_uuid),
> > > >      LogicalRouterPort(lrp_uuid, lr._uuid),
> > > >      lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid),
> > > > @@ -153,30 +153,10 @@ DistributedGatewayPortCandidate(lr_uuid,
> > lrp_uuid) :-
> > > >      var has_hcg = lrp.ha_chassis_group.is_some(),
> > > >      var has_gc = not lrp.gateway_chassis.is_empty(),
> > > >      has_hcg or has_gc.
> > > > -Warning[message] :-
> > > > -    DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
> > > > -    var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
> > > > -    lrps.size() > 1,
> > > > -    lr in nb::Logical_Router(._uuid = lr_uuid),
> > > > -    var message = "Bad configuration: multiple distributed gateway
> > ports on "
> > > > -    "logical router ${lr.name}; ignoring all of them".
> > > > -
> > > > -/* Distributed gateway ports.
> > > > - *
> > > > - * Each row means 'lrp' is the distributed gateway port on
'lr_uuid'.
> > > > - *
> > > > - * There is at most one distributed gateway port per logical
router. */
> > > > -relation DistributedGatewayPort(lrp:
Intern<nb::Logical_Router_Port>,
> > lr_uuid: uuid)
> > > > -DistributedGatewayPort(lrp, lr_uuid) :-
> > > > -    DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
> > > > -    var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
> > > > -    lrps.size() == 1,
> > > > -    Some{var lrp_uuid} = lrps.nth(0),
> > > > -    lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid).
> > > >
> > > >  /* HAChassis is an abstraction over nb::Gateway_Chassis and
> > nb::HA_Chassis, which
> > > >   * are different ways to represent the same configuration.  Each
row is
> > > > - * effectively one HA_Chassis record.  (Usually, we could
associated
> > each
> > > > + * effectively one HA_Chassis record.  (Usually, we could associate
> > each
> > > >   * row with a particular 'lr_uuid', but it's permissible for more
than
> > one
> > > >   * logical router to use a HA chassis group, so we omit it so that
> > multiple
> > > >   * references get merged.)
> > > > @@ -236,18 +216,20 @@
> > HAChassisGroup(ha_chassis_group_uuid(hac_group_uuid),
> > > >                          .name = name,
> > > >                          .external_ids = external_ids).
> > > >
> > > > -/* Each row maps from a logical router to the name of its
> > HAChassisGroup.
> > > > - * This level of indirection is needed because multiple logical
routers
> > > > - * are allowed to reference a given HAChassisGroup. */
> > > > -relation LogicalRouterHAChassisGroup(lr_uuid: uuid,
> > > > -                                     hacg_uuid: uuid)
> > > > -LogicalRouterHAChassisGroup(lr_uuid,
ha_chassis_group_uuid(lrp._uuid))
> > :-
> > > > -    DistributedGatewayPort(lrp, lr_uuid),
> > > > +/* Each row maps from a distributed gateway logical router port to
the
> > name of
> > > > + * its HAChassisGroup.
> > > > + * This level of indirection is needed because multiple distributed
> > gateway
> > > > + * logical router ports are allowed to reference a given
> > HAChassisGroup. */
> > > > +relation DistributedGatewayPortHAChassisGroup(
> > > > +    lrp: Intern<nb::Logical_Router_Port>,
> > > > +    hacg_uuid: uuid)
> > > > +DistributedGatewayPortHAChassisGroup(lrp,
> > ha_chassis_group_uuid(lrp._uuid)) :-
> > > > +    DistributedGatewayPort(.lrp = lrp),
> > > >      lrp.ha_chassis_group == None,
> > > >      lrp.gateway_chassis.size() > 0.
> > > > -LogicalRouterHAChassisGroup(lr_uuid,
> > > > -                            ha_chassis_group_uuid(hac_group_uuid))
:-
> > > > -    DistributedGatewayPort(lrp, lr_uuid),
> > > > +DistributedGatewayPortHAChassisGroup(lrp,
> > > > +
> > ha_chassis_group_uuid(hac_group_uuid)) :-
> > > > +    DistributedGatewayPort(.lrp = lrp),
> > > >      Some{var hac_group_uuid} = lrp.ha_chassis_group,
> > > >      nb::HA_Chassis_Group(._uuid = hac_group_uuid).
> > > >
> > > > @@ -259,14 +241,19 @@ RouterPortIsRedirect(lrp, false) :-
> > > >      &nb::Logical_Router_Port(._uuid = lrp),
> > > >      not DistributedGatewayPort(&nb::Logical_Router_Port{._uuid =
lrp},
> > _).
> > > >
> > > > -relation LogicalRouterRedirectPort(lr: uuid, has_redirect_port:
> > Option<Intern<nb::Logical_Router_Port>>)
> > > > -
> > > > -LogicalRouterRedirectPort(lr, Some{lrp}) :-
> > > > -    DistributedGatewayPort(lrp, lr).
> > > > -
> > > > -LogicalRouterRedirectPort(lr, None) :-
> > > > -    nb::Logical_Router(._uuid = lr),
> > > > -    not DistributedGatewayPort(_, lr).
> > > > +/*
> > > > + * LogicalRouterDGWPorts maps from each logical router UUID
> > > > + * to the logical router's set of distributed gateway (or redirect)
> > ports. */
> > > > +relation LogicalRouterDGWPorts(
> > > > +    lr_uuid: uuid,
> > > > +    l3dgw_ports: Vec<Intern<nb::Logical_Router_Port>>)
> > > > +LogicalRouterDGWPorts(lr_uuid, l3dgw_ports) :-
> > > > +    DistributedGatewayPort(lrp, lr_uuid),
> > > > +    var l3dgw_ports = lrp.group_by(lr_uuid).to_vec().
> > > > +LogicalRouterDGWPorts(lr_uuid, vec_empty()) :-
> > > > +    lr in nb::Logical_Router(),
> > > > +    var lr_uuid = lr._uuid,
> > > > +    not DistributedGatewayPort(_, lr_uuid).
> > > >
> > > >  typedef ExceptionalExtIps = AllowedExtIps{ips:
Intern<nb::Address_Set>}
> > > >                            | ExemptedExtIps{ips:
> > Intern<nb::Address_Set>}
> > > > @@ -450,9 +437,7 @@ LogicalRouterCopp0(lr, meters) :-
> > > >
> > > >  /* Router relation collects all attributes of a logical router.
> > > >   *
> > > > - * `l3dgw_port` - optional redirect port (see
`DistributedGatewayPort`)
> > > > - * `redirect_port_name` - derived redirect port name (or empty
string
> > if
> > > > - *      router does not have a redirect port)
> > > > + * `l3dgw_ports` - optional redirect ports (see
> > `DistributedGatewayPort`)
> > > >   * `is_gateway` - true iff the router is a gateway router.
Together
> > with
> > > >   *      `l3dgw_port`, this flag affects the generation of various
flows
> > > >   *      related to NAT and load balancing.
> > > > @@ -474,8 +459,7 @@ typedef Router = Router {
> > > >      external_ids:       Map<string,string>,
> > > >
> > > >      /* Additional computed fields. */
> > > > -    l3dgw_port:         Option<Intern<nb::Logical_Router_Port>>,
> > > > -    redirect_port_name: string,
> > > > +    l3dgw_ports:        Vec<Intern<nb::Logical_Router_Port>>,
> > > >      is_gateway:         bool,
> > > >      nats:               Vec<NAT>,
> > > >      snat_ips:           Map<v46_ip, Set<NAT>>,
> > > > @@ -498,23 +482,18 @@ Router[Router{
> > > >          .options       =    lr.options,
> > > >          .external_ids  =    lr.external_ids,
> > > >
> > > > -        .l3dgw_port = l3dgw_port,
> > > > -        .redirect_port_name =
> > > > -            match (l3dgw_port) {
> > > > -                Some{rport} ->
> > json_string_escape(chassis_redirect_name(rport.name)),
> > > > -                _ -> ""
> > > > -            },
> > > > -        .is_gateway = lr.options.contains_key("chassis"),
> > > > -        .nats       = nats,
> > > > -        .snat_ips   = snat_ips,
> > > > -        .lbs        = lbs,
> > > > -        .mcast_cfg  = mcast_cfg,
> > > > +        .l3dgw_ports = l3dgw_ports,
> > > > +        .is_gateway  = lr.options.contains_key("chassis"),
> > > > +        .nats        = nats,
> > > > +        .snat_ips    = snat_ips,
> > > > +        .lbs         = lbs,
> > > > +        .mcast_cfg   = mcast_cfg,
> > > >          .learn_from_arp_request = learn_from_arp_request,
> > > >          .force_lb_snat = force_lb_snat,
> > > >          .copp       = copp}.intern()] :-
> > > >      lr in nb::Logical_Router(),
> > > >      lr.is_enabled(),
> > > > -    LogicalRouterRedirectPort(lr._uuid, l3dgw_port),
> > > > +    LogicalRouterDGWPorts(lr._uuid, l3dgw_ports),
> > > >      LogicalRouterNATs(lr._uuid, nats),
> > > >      LogicalRouterLBs(lr._uuid, lbs),
> > > >      LogicalRouterSnatIPs(lr._uuid, snat_ips),
> > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > > index 99a19f853..8a368ef61 100644
> > > > --- a/northd/ovn-northd.8.xml
> > > > +++ b/northd/ovn-northd.8.xml
> > > > @@ -3764,10 +3764,10 @@ icmp6 {
> > > >      <h3>Ingress Table 17: Gateway Redirect</h3>
> > > >
> > > >      <p>
> > > > -      For distributed logical routers where one of the logical
router
> > > > +      For distributed logical routers where one or more of the
logical
> > router
> > > >        ports specifies a gateway chassis, this table redirects
> > > > -      certain packets to the distributed gateway port instance on
the
> > > > -      gateway chassis.  This table has the following flows:
> > > > +      certain packets to the distributed gateway port instances on
the
> > > > +      gateway chassises.  This table has the following flows:
> > > >      </p>
> > > >
> > > >      <ul>
> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > index ebe12cace..87c4478fa 100644
> > > > --- a/northd/ovn-northd.c
> > > > +++ b/northd/ovn-northd.c
> > > > @@ -655,13 +655,12 @@ struct ovn_datapath {
> > > >      bool is_gw_router;
> > > >
> > > >      /* OVN northd only needs to know about the logical router
gateway
> > port for
> > > > -     * NAT on a distributed router.  This "distributed gateway
port" is
> > > > -     * populated only when there is a gateway chassis specified for
> > one of
> > > > -     * the ports on the logical router.  Otherwise this will be
NULL.
> > */
> > > > -    struct ovn_port *l3dgw_port;
> > > > -    /* The "derived" OVN port representing the instance of
l3dgw_port
> > on
> > > > -     * the gateway chassis. */
> > > > -    struct ovn_port *l3redirect_port;
> > > > +     * NAT on a distributed router.  The "distributed gateway
ports"
> > are
> > > > +     * populated only when there is a gateway chassis or ha chassis
> > group
> > > > +     * specified for some of the ports on the logical router.
> > Otherwise this
> > > > +     * will be NULL. */
> > > > +    struct ovn_port **l3dgw_ports;
> > > > +    size_t n_l3dgw_ports;
> > > >
> > > >      /* NAT entries configured on the router. */
> > > >      struct ovn_nat *nat_entries;
> > > > @@ -802,6 +801,16 @@ init_nat_entries(struct ovn_datapath *od)
> > > >          return;
> > > >      }
> > > >
> > > > +    if (od->n_l3dgw_ports > 1) {
> > > > +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
1);
> > > > +        VLOG_WARN_RL(&rl, "NAT is configured on logical router %s,
> > which has %"
> > > > +                     PRIuSIZE" distributed gateway ports. NAT is
not
> > supported"
> > > > +                     " yet when there is more than one distributed
> > gateway "
> > > > +                     "port on the router.",
> > > > +                     od->nbr->name, od->n_l3dgw_ports);
> > > > +        return;
> > > > +    }
> > > > +
> > > >      od->nat_entries = xmalloc(od->nbr->n_nat * sizeof
> > *od->nat_entries);
> > > >
> > > >      for (size_t i = 0; i < od->nbr->n_nat; i++) {
> > > > @@ -941,6 +950,7 @@ ovn_datapath_destroy(struct hmap *datapaths,
struct
> > ovn_datapath *od)
> > > >          destroy_lb_ips(od);
> > > >          free(od->nat_entries);
> > > >          free(od->localnet_ports);
> > > > +        free(od->l3dgw_ports);
> > > >          ovn_ls_port_group_destroy(&od->nb_pgs);
> > > >          destroy_mcast_info_for_datapath(od);
> > > >
> > > > @@ -1489,9 +1499,18 @@ struct ovn_port {
> > > >      /* Logical port multicast data. */
> > > >      struct mcast_port_info mcast_info;
> > > >
> > > > -    /* This is ordinarily false.  It is true if and only if this
> > ovn_port is
> > > > -     * derived from a chassis-redirect port. */
> > > > -    bool derived;
> > > > +    /* At most one of l3dgw_port and cr_port can be not NULL. */
> > > > +
> > > > +    /* This is set to a distributed gateway port if and only if
this
> > ovn_port
> > > > +     * is "derived" from it. Otherwise this is set to NULL. The
derived
> > > > +     * ovn_port represents the instance of distributed gateway
port on
> > the
> > > > +     * gateway chassis.*/
> > > > +    struct ovn_port *l3dgw_port;
> > > > +
> > > > +    /* This is set to the "derived" chassis-redirect port of this
port
> > if and
> > > > +     * only if this port is a distributed gateway port. Otherwise
this
> > is set
> > > > +     * to NULL. */
> > > > +    struct ovn_port *cr_port;
> > >
> > > In my opinion,  the above 2 variables - 'l3dgw_port' and 'cr_port' are
> > > confusing.
> > >
> > > For the code  - if(op->l3dgw_port),  It is not immediately obvious to
> > > me that "op" is a derived
> > > "chassisredirect" port.
> > >
> > > I'd suggest to have the bool - derived to be preserved.
> > >
> >
> > OK, maybe a more specific name: is_derived_crp? is_ makes it clear from
the
> > name it is a bool, and _crp in case there may be other forms of derived
> > port in the future.
> >
> > > How about this ?
> > >
> > > bool derived;
> > > union {
> > >      struct ovn_port *cr_port;
> > >      struct ovn_port *l3gw_port;
> > > };
> > >
> > > If 'derived' is true then 'l3gw_port' would be not NULL.  Otherwise it
> > > would be NULL.
> >
> > If it is a union, it won't be NULL even if "derived" is false when it
is a
> > DGP.
> > But I think using union is good.
> >

Hi Numan, sorry that I tried this approach and changed my mind. It seems
using a bool plus a union is more tedious. When checking if a port is a DGP
we will have to write "if (!op->derived && op->cr_port)", because
!op->derived doesn't mean it has to be a DGP and op->cr_port alone is not
sufficient either because of the union. It seems more clear to do just: "if
(op->cr_port)" for "if op is a DGP" and "if (op->l3dgw_port)" for "if op is
a chassis-redirect port". But as you mentioned this may look unnatural at
the first glance, so instead of adding a bool flag I added helper functions
is_l3dgw_port(op) and is_cr_port(op). Please let me know if you have
different thoughts.

All the comments are addressed here in V2:
https://patchwork.ozlabs.org/project/ovn/list/?series=256862
(Thanks Abhiram for helping with the test case update!)

Thanks,
Han

> > >
> > > Instance of 'struct ovn_port' for distributed router port would have
> > > "cr_port" set.
> > >


More information about the dev mailing list