[ovs-dev] [PATCH ovn] ovn-northd: Multiple distributed gateway port support.
Han Zhou
hzhou at ovn.org
Thu Aug 5 15:53:20 UTC 2021
On Tue, Aug 3, 2021 at 10:59 AM Numan Siddique <numans at ovn.org> wrote:
>
> On Tue, Aug 3, 2021 at 1:55 PM Han Zhou <hzhou at ovn.org> wrote:
> >
> > On Tue, Aug 3, 2021 at 10:14 AM Numan Siddique <numans at ovn.org> wrote:
> > >
> > > On Thu, Jul 29, 2021 at 4:03 AM Han Zhou <hzhou at ovn.org> wrote:
> > > >
> > > > From: Ankur Sharma <ankurmnnit2004 at gmail.com>
> > > >
> > > > By default, OVN support only one DGP (distributed gateway port) per
> > > > logical router. While a single DGP port suffices for most of the
North
> > > > South connectivity, there are requirements where a logical router
could
> > > > be connected to multiple external networks and based on routing
decision
> > > > packet could go to different ones.
> > > >
> > > > This patch adds flexibility of having multiple DGPs per logical
router.
> > > >
> > > > Changes can classified as following:
> > > > a. Data structure changes to allow multiple DGPs per ovn_datapath.
> > > >
> > > > b. Consumption of new data structure in logical flows for
> > > > individual features.
> > > >
> > > > c. Features that require changes are:
> > > > i. Regular NS traffic flow.
> > > > ii. Network Address Translation.
> > > > iii. Load Balancer
> > > > iv. Gateway_mtu.
> > > > v. reside-on-redirect-chassis
> > > > vi. Misc code sections that assumed a single DGP.
> > > >
> > > > d. Except for reside-on-redirect-chassis all the other features
> > > > could be extended to multiple DGPs. Reside on redirect
> > > > chassis with its current specification could not be extended
> > > > and hence should be used only with the logical router that
> > > > has a single DGP.
> > > >
> > > > This patch doesn't support NAT & load-balancer features for multiple
> > > > DGPs yet, but added validations that disables NAT/load-balancer
> > > > features when there are more than one DGP configured per router.
> > > >
> > > > Signed-off-by: Ankur Sharma <ankurmnnit2004 at gmail.com>
> > > > Co-authored-by: Dhathri Purohith <dhathri.purohith at nutanix.com>
> > > > Signed-off-by: Dhathri Purohith <dhathri.purohith at nutanix.com>
> > > > Co-authored-by: Abhiram Sangana <sangana.abhiram at nutanix.com>
> > > > Signed-off-by: Abhiram Sangana <sangana.abhiram at nutanix.com>
> > > > Co-authored-by: Han Zhou <hzhou at ovn.org>
> > > > Signed-off-by: Han Zhou <hzhou at ovn.org>
> > >
> > > Hi Han, Ankur et all.
> > >
> > > Thanks for adding this feature.
> > >
> > > Please see below for few comments.
> > >
> > > Also this patch needs a rebase.
> > >
> > > Thanks
> > > Numan
> > >
> >
> > Thanks Numan for the review!
> >
> > > > ---
> > > > NEWS | 3 +
> > > > northd/lrouter.dl | 103 +++-----
> > > > northd/ovn-northd.8.xml | 6 +-
> > > > northd/ovn-northd.c | 535
++++++++++++++++++++++------------------
> > > > northd/ovn_northd.dl | 178 +++++++------
> > > > ovn-architecture.7.xml | 19 +-
> > > > ovn-nb.xml | 27 +-
> > > > ovs | 2 +-
> > > > tests/ovn-northd.at | 92 +++++++
> > > > tests/ovn.at | 307 +++++++++++++++++++++++
> > > > 10 files changed, 870 insertions(+), 402 deletions(-)
> > > >
> > > > diff --git a/NEWS b/NEWS
> > > > index eefdae9bc..2a1c56b2d 100644
> > > > --- a/NEWS
> > > > +++ b/NEWS
> > > > @@ -33,6 +33,9 @@ OVN v21.06.0 - 18 Jun 2021
> > > > "ovn-trim-limit-lflow-cache" and
> > "ovn-trim-wmark-perc-lflow-cache", to
> > > > allow enforcing a lflow cache size limit and high watermark
> > percentage
> > > > for which automatic memory trimming is performed.
> > > > + - Support multiple distributed gateway ports on a single logical
> > router.
> > > > + (NAT and load-balancer are not supported yet when there are
> > multiple
> > > > + distributed gateway ports).
> > > >
> > > > OVN v21.03.0 - 12 Mar 2021
> > > > -------------------------
> > > > diff --git a/northd/lrouter.dl b/northd/lrouter.dl
> > > > index 4a24f3f61..d37350ab8 100644
> > > > --- a/northd/lrouter.dl
> > > > +++ b/northd/lrouter.dl
> > > > @@ -138,14 +138,14 @@ Warning[message] :-
> > > > var message = "Bad configuration: distributed gateway port
> > configured on "
> > > > "port ${lrp.name} on L3 gateway router".
> > > >
> > > > -/* DistributedGatewayPortCandidate.
> > > > +/* Distributed gateway ports.
> > > > *
> > > > - * Each row pairs a logical router with its distributed gateway
port,
> > > > - * but without checking that there is at most one DGP per LR.
> > > > + * Each row means 'lrp' is a distributed gateway port on 'lr_uuid'.
> > > > *
> > > > - * (Use DistributedGatewayPort instead, since it guarantees
> > uniqueness.) */
> > > > -relation DistributedGatewayPortCandidate(lr_uuid: uuid, lrp_uuid:
uuid)
> > > > -DistributedGatewayPortCandidate(lr_uuid, lrp_uuid) :-
> > > > + * A logical router can have multiple distributed gateway ports. */
> > > > +relation DistributedGatewayPort(lrp:
Intern<nb::Logical_Router_Port>,
> > > > + lr_uuid: uuid)
> > > > +DistributedGatewayPort(lrp, lr_uuid) :-
> > > > lr in nb::Logical_Router(._uuid = lr_uuid),
> > > > LogicalRouterPort(lrp_uuid, lr._uuid),
> > > > lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid),
> > > > @@ -153,30 +153,10 @@ DistributedGatewayPortCandidate(lr_uuid,
> > lrp_uuid) :-
> > > > var has_hcg = lrp.ha_chassis_group.is_some(),
> > > > var has_gc = not lrp.gateway_chassis.is_empty(),
> > > > has_hcg or has_gc.
> > > > -Warning[message] :-
> > > > - DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
> > > > - var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
> > > > - lrps.size() > 1,
> > > > - lr in nb::Logical_Router(._uuid = lr_uuid),
> > > > - var message = "Bad configuration: multiple distributed gateway
> > ports on "
> > > > - "logical router ${lr.name}; ignoring all of them".
> > > > -
> > > > -/* Distributed gateway ports.
> > > > - *
> > > > - * Each row means 'lrp' is the distributed gateway port on
'lr_uuid'.
> > > > - *
> > > > - * There is at most one distributed gateway port per logical
router. */
> > > > -relation DistributedGatewayPort(lrp:
Intern<nb::Logical_Router_Port>,
> > lr_uuid: uuid)
> > > > -DistributedGatewayPort(lrp, lr_uuid) :-
> > > > - DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
> > > > - var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
> > > > - lrps.size() == 1,
> > > > - Some{var lrp_uuid} = lrps.nth(0),
> > > > - lrp in &nb::Logical_Router_Port(._uuid = lrp_uuid).
> > > >
> > > > /* HAChassis is an abstraction over nb::Gateway_Chassis and
> > nb::HA_Chassis, which
> > > > * are different ways to represent the same configuration. Each
row is
> > > > - * effectively one HA_Chassis record. (Usually, we could
associated
> > each
> > > > + * effectively one HA_Chassis record. (Usually, we could associate
> > each
> > > > * row with a particular 'lr_uuid', but it's permissible for more
than
> > one
> > > > * logical router to use a HA chassis group, so we omit it so that
> > multiple
> > > > * references get merged.)
> > > > @@ -236,18 +216,20 @@
> > HAChassisGroup(ha_chassis_group_uuid(hac_group_uuid),
> > > > .name = name,
> > > > .external_ids = external_ids).
> > > >
> > > > -/* Each row maps from a logical router to the name of its
> > HAChassisGroup.
> > > > - * This level of indirection is needed because multiple logical
routers
> > > > - * are allowed to reference a given HAChassisGroup. */
> > > > -relation LogicalRouterHAChassisGroup(lr_uuid: uuid,
> > > > - hacg_uuid: uuid)
> > > > -LogicalRouterHAChassisGroup(lr_uuid,
ha_chassis_group_uuid(lrp._uuid))
> > :-
> > > > - DistributedGatewayPort(lrp, lr_uuid),
> > > > +/* Each row maps from a distributed gateway logical router port to
the
> > name of
> > > > + * its HAChassisGroup.
> > > > + * This level of indirection is needed because multiple distributed
> > gateway
> > > > + * logical router ports are allowed to reference a given
> > HAChassisGroup. */
> > > > +relation DistributedGatewayPortHAChassisGroup(
> > > > + lrp: Intern<nb::Logical_Router_Port>,
> > > > + hacg_uuid: uuid)
> > > > +DistributedGatewayPortHAChassisGroup(lrp,
> > ha_chassis_group_uuid(lrp._uuid)) :-
> > > > + DistributedGatewayPort(.lrp = lrp),
> > > > lrp.ha_chassis_group == None,
> > > > lrp.gateway_chassis.size() > 0.
> > > > -LogicalRouterHAChassisGroup(lr_uuid,
> > > > - ha_chassis_group_uuid(hac_group_uuid))
:-
> > > > - DistributedGatewayPort(lrp, lr_uuid),
> > > > +DistributedGatewayPortHAChassisGroup(lrp,
> > > > +
> > ha_chassis_group_uuid(hac_group_uuid)) :-
> > > > + DistributedGatewayPort(.lrp = lrp),
> > > > Some{var hac_group_uuid} = lrp.ha_chassis_group,
> > > > nb::HA_Chassis_Group(._uuid = hac_group_uuid).
> > > >
> > > > @@ -259,14 +241,19 @@ RouterPortIsRedirect(lrp, false) :-
> > > > &nb::Logical_Router_Port(._uuid = lrp),
> > > > not DistributedGatewayPort(&nb::Logical_Router_Port{._uuid =
lrp},
> > _).
> > > >
> > > > -relation LogicalRouterRedirectPort(lr: uuid, has_redirect_port:
> > Option<Intern<nb::Logical_Router_Port>>)
> > > > -
> > > > -LogicalRouterRedirectPort(lr, Some{lrp}) :-
> > > > - DistributedGatewayPort(lrp, lr).
> > > > -
> > > > -LogicalRouterRedirectPort(lr, None) :-
> > > > - nb::Logical_Router(._uuid = lr),
> > > > - not DistributedGatewayPort(_, lr).
> > > > +/*
> > > > + * LogicalRouterDGWPorts maps from each logical router UUID
> > > > + * to the logical router's set of distributed gateway (or redirect)
> > ports. */
> > > > +relation LogicalRouterDGWPorts(
> > > > + lr_uuid: uuid,
> > > > + l3dgw_ports: Vec<Intern<nb::Logical_Router_Port>>)
> > > > +LogicalRouterDGWPorts(lr_uuid, l3dgw_ports) :-
> > > > + DistributedGatewayPort(lrp, lr_uuid),
> > > > + var l3dgw_ports = lrp.group_by(lr_uuid).to_vec().
> > > > +LogicalRouterDGWPorts(lr_uuid, vec_empty()) :-
> > > > + lr in nb::Logical_Router(),
> > > > + var lr_uuid = lr._uuid,
> > > > + not DistributedGatewayPort(_, lr_uuid).
> > > >
> > > > typedef ExceptionalExtIps = AllowedExtIps{ips:
Intern<nb::Address_Set>}
> > > > | ExemptedExtIps{ips:
> > Intern<nb::Address_Set>}
> > > > @@ -450,9 +437,7 @@ LogicalRouterCopp0(lr, meters) :-
> > > >
> > > > /* Router relation collects all attributes of a logical router.
> > > > *
> > > > - * `l3dgw_port` - optional redirect port (see
`DistributedGatewayPort`)
> > > > - * `redirect_port_name` - derived redirect port name (or empty
string
> > if
> > > > - * router does not have a redirect port)
> > > > + * `l3dgw_ports` - optional redirect ports (see
> > `DistributedGatewayPort`)
> > > > * `is_gateway` - true iff the router is a gateway router.
Together
> > with
> > > > * `l3dgw_port`, this flag affects the generation of various
flows
> > > > * related to NAT and load balancing.
> > > > @@ -474,8 +459,7 @@ typedef Router = Router {
> > > > external_ids: Map<string,string>,
> > > >
> > > > /* Additional computed fields. */
> > > > - l3dgw_port: Option<Intern<nb::Logical_Router_Port>>,
> > > > - redirect_port_name: string,
> > > > + l3dgw_ports: Vec<Intern<nb::Logical_Router_Port>>,
> > > > is_gateway: bool,
> > > > nats: Vec<NAT>,
> > > > snat_ips: Map<v46_ip, Set<NAT>>,
> > > > @@ -498,23 +482,18 @@ Router[Router{
> > > > .options = lr.options,
> > > > .external_ids = lr.external_ids,
> > > >
> > > > - .l3dgw_port = l3dgw_port,
> > > > - .redirect_port_name =
> > > > - match (l3dgw_port) {
> > > > - Some{rport} ->
> > json_string_escape(chassis_redirect_name(rport.name)),
> > > > - _ -> ""
> > > > - },
> > > > - .is_gateway = lr.options.contains_key("chassis"),
> > > > - .nats = nats,
> > > > - .snat_ips = snat_ips,
> > > > - .lbs = lbs,
> > > > - .mcast_cfg = mcast_cfg,
> > > > + .l3dgw_ports = l3dgw_ports,
> > > > + .is_gateway = lr.options.contains_key("chassis"),
> > > > + .nats = nats,
> > > > + .snat_ips = snat_ips,
> > > > + .lbs = lbs,
> > > > + .mcast_cfg = mcast_cfg,
> > > > .learn_from_arp_request = learn_from_arp_request,
> > > > .force_lb_snat = force_lb_snat,
> > > > .copp = copp}.intern()] :-
> > > > lr in nb::Logical_Router(),
> > > > lr.is_enabled(),
> > > > - LogicalRouterRedirectPort(lr._uuid, l3dgw_port),
> > > > + LogicalRouterDGWPorts(lr._uuid, l3dgw_ports),
> > > > LogicalRouterNATs(lr._uuid, nats),
> > > > LogicalRouterLBs(lr._uuid, lbs),
> > > > LogicalRouterSnatIPs(lr._uuid, snat_ips),
> > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > > index 99a19f853..8a368ef61 100644
> > > > --- a/northd/ovn-northd.8.xml
> > > > +++ b/northd/ovn-northd.8.xml
> > > > @@ -3764,10 +3764,10 @@ icmp6 {
> > > > <h3>Ingress Table 17: Gateway Redirect</h3>
> > > >
> > > > <p>
> > > > - For distributed logical routers where one of the logical
router
> > > > + For distributed logical routers where one or more of the
logical
> > router
> > > > ports specifies a gateway chassis, this table redirects
> > > > - certain packets to the distributed gateway port instance on
the
> > > > - gateway chassis. This table has the following flows:
> > > > + certain packets to the distributed gateway port instances on
the
> > > > + gateway chassises. This table has the following flows:
> > > > </p>
> > > >
> > > > <ul>
> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > index ebe12cace..87c4478fa 100644
> > > > --- a/northd/ovn-northd.c
> > > > +++ b/northd/ovn-northd.c
> > > > @@ -655,13 +655,12 @@ struct ovn_datapath {
> > > > bool is_gw_router;
> > > >
> > > > /* OVN northd only needs to know about the logical router
gateway
> > port for
> > > > - * NAT on a distributed router. This "distributed gateway
port" is
> > > > - * populated only when there is a gateway chassis specified for
> > one of
> > > > - * the ports on the logical router. Otherwise this will be
NULL.
> > */
> > > > - struct ovn_port *l3dgw_port;
> > > > - /* The "derived" OVN port representing the instance of
l3dgw_port
> > on
> > > > - * the gateway chassis. */
> > > > - struct ovn_port *l3redirect_port;
> > > > + * NAT on a distributed router. The "distributed gateway
ports"
> > are
> > > > + * populated only when there is a gateway chassis or ha chassis
> > group
> > > > + * specified for some of the ports on the logical router.
> > Otherwise this
> > > > + * will be NULL. */
> > > > + struct ovn_port **l3dgw_ports;
> > > > + size_t n_l3dgw_ports;
> > > >
> > > > /* NAT entries configured on the router. */
> > > > struct ovn_nat *nat_entries;
> > > > @@ -802,6 +801,16 @@ init_nat_entries(struct ovn_datapath *od)
> > > > return;
> > > > }
> > > >
> > > > + if (od->n_l3dgw_ports > 1) {
> > > > + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
1);
> > > > + VLOG_WARN_RL(&rl, "NAT is configured on logical router %s,
> > which has %"
> > > > + PRIuSIZE" distributed gateway ports. NAT is
not
> > supported"
> > > > + " yet when there is more than one distributed
> > gateway "
> > > > + "port on the router.",
> > > > + od->nbr->name, od->n_l3dgw_ports);
> > > > + return;
> > > > + }
> > > > +
> > > > od->nat_entries = xmalloc(od->nbr->n_nat * sizeof
> > *od->nat_entries);
> > > >
> > > > for (size_t i = 0; i < od->nbr->n_nat; i++) {
> > > > @@ -941,6 +950,7 @@ ovn_datapath_destroy(struct hmap *datapaths,
struct
> > ovn_datapath *od)
> > > > destroy_lb_ips(od);
> > > > free(od->nat_entries);
> > > > free(od->localnet_ports);
> > > > + free(od->l3dgw_ports);
> > > > ovn_ls_port_group_destroy(&od->nb_pgs);
> > > > destroy_mcast_info_for_datapath(od);
> > > >
> > > > @@ -1489,9 +1499,18 @@ struct ovn_port {
> > > > /* Logical port multicast data. */
> > > > struct mcast_port_info mcast_info;
> > > >
> > > > - /* This is ordinarily false. It is true if and only if this
> > ovn_port is
> > > > - * derived from a chassis-redirect port. */
> > > > - bool derived;
> > > > + /* At most one of l3dgw_port and cr_port can be not NULL. */
> > > > +
> > > > + /* This is set to a distributed gateway port if and only if
this
> > ovn_port
> > > > + * is "derived" from it. Otherwise this is set to NULL. The
derived
> > > > + * ovn_port represents the instance of distributed gateway
port on
> > the
> > > > + * gateway chassis.*/
> > > > + struct ovn_port *l3dgw_port;
> > > > +
> > > > + /* This is set to the "derived" chassis-redirect port of this
port
> > if and
> > > > + * only if this port is a distributed gateway port. Otherwise
this
> > is set
> > > > + * to NULL. */
> > > > + struct ovn_port *cr_port;
> > >
> > > In my opinion, the above 2 variables - 'l3dgw_port' and 'cr_port' are
> > > confusing.
> > >
> > > For the code - if(op->l3dgw_port), It is not immediately obvious to
> > > me that "op" is a derived
> > > "chassisredirect" port.
> > >
> > > I'd suggest to have the bool - derived to be preserved.
> > >
> >
> > OK, maybe a more specific name: is_derived_crp? is_ makes it clear from
the
> > name it is a bool, and _crp in case there may be other forms of derived
> > port in the future.
> >
> > > How about this ?
> > >
> > > bool derived;
> > > union {
> > > struct ovn_port *cr_port;
> > > struct ovn_port *l3gw_port;
> > > };
> > >
> > > If 'derived' is true then 'l3gw_port' would be not NULL. Otherwise it
> > > would be NULL.
> >
> > If it is a union, it won't be NULL even if "derived" is false when it
is a
> > DGP.
> > But I think using union is good.
> >
Hi Numan, sorry that I tried this approach and changed my mind. It seems
using a bool plus a union is more tedious. When checking if a port is a DGP
we will have to write "if (!op->derived && op->cr_port)", because
!op->derived doesn't mean it has to be a DGP and op->cr_port alone is not
sufficient either because of the union. It seems more clear to do just: "if
(op->cr_port)" for "if op is a DGP" and "if (op->l3dgw_port)" for "if op is
a chassis-redirect port". But as you mentioned this may look unnatural at
the first glance, so instead of adding a bool flag I added helper functions
is_l3dgw_port(op) and is_cr_port(op). Please let me know if you have
different thoughts.
All the comments are addressed here in V2:
https://patchwork.ozlabs.org/project/ovn/list/?series=256862
(Thanks Abhiram for helping with the test case update!)
Thanks,
Han
> > >
> > > Instance of 'struct ovn_port' for distributed router port would have
> > > "cr_port" set.
> > >
More information about the dev
mailing list