[ovs-dev] [PATCH ovn v2 00/13] OVN Interconnection

Numan Siddique numans at ovn.org
Sat Nov 16 12:02:54 UTC 2019


On Fri, Nov 15, 2019 at 1:18 PM Han Zhou <hzhou at ovn.org> wrote:

>
>
> On Thu, Nov 14, 2019 at 10:25 PM Numan Siddique <numans at ovn.org> wrote:
> >
> > On Thu, Nov 14, 2019 at 11:44 PM Han Zhou <hzhou at ovn.org> wrote:
> > >
> > > On Wed, Nov 13, 2019 at 10:27 AM Numan Siddique <numans at ovn.org>
> wrote:
> > > >
> > > > On Thu, Oct 31, 2019 at 2:43 AM Han Zhou <hzhou at ovn.org> wrote:
> > > > >
> > > > > The series supports interconnecting multiple OVN deployments (e.g.
> > >  located at
> > > > > multiple data centers) through logical routers connected with
> tansit
> > > logical
> > > > > switches with overlay tunnels, managed through OVN control plane.
> See
> > > the
> > > > > ovn-architecture.rst document updates for more details, and find
> the
> > > > > instructions in Documentation/tutorials/ovn-interconnection.rst.
> > > > >
> > > > > Han Zhou (13):
> > > > >   ovn-architecture: Add documentation for OVN interconnection
> feature.
> > > > >   ovn-inb: Interconnection northbound DB schema and CLI.
> > > > >   ovn-isb: Interconnection southbound DB schema and CLI.
> > > > >   ovn-ic: Interconnection controller with AZ registeration.
> > > > >   ovn-northd.c: Refactor allocate_tnlid.
> > > > >   ovn-ic: Transit switch controller.
> > > > >   ovn-sb: Add columns is_interconn and is_remote to Chassis.
> > > > >   ovn-ic: Interconnection gateway controller.
> > > > >   ovn-ic: Interconnection port controller.
> > > > >   ovn.at: e2e test for OVN interconnection.
> > > > >   ovn-ctl: Refactor to reduce redundant code.
> > > > >   ovn-ctl: Support commands for interconnection.
> > > > >   tutorial: Add tutorial for OVN Interconnection.
> > > > >
> > > >
> > > > Hi Han,
> > > >
> > > > Thanks for working on this feature. It's an interesting use case.
> > > >
> > > > I had a quick look at all the patches.
> > > >
> > >
> > > Numan, thanks a lot for the thorough review! Please see my response
> inlined.
> > >
> > > > I have few comments
> > > >
> > > > 1. I would suggest to rename the DBs as  ovn-ic-nb.ovsschema (and the
> > > > same for ovn-isb).
> > > >     The DB name is - OVN_IC_Northbound. So it would make sense to
> have
> > > > - ovn-ic-nb.ovsschema
> > > >      I would also suggest to rename the utilities to ovn-ic-nbctl and
> > > > ovn-ic-sbctl.
> > > >     With  ovn-inbctl/ovn-isbctl, it is really confusing.
> > > >
> > > Sure, I felt not quite convenient with two dashes in a command name. I
> > > agree that ovn-ic-nbctl and ovn-ic-sbctl are more clear. I can change
> it.
> > >
> > > > 2. ovn-ic service writes to interconnect south db, ovn north db and
> > > > ovn south db. Writing to ic south db and
> > > >     ovn north db is fine. But it seems a little odd for ovn-ic to
> > > > write to south db. From what I understand it writes
> > > >     to south db for 3 purposes
> > > >       a. Updating the tunnel_key column of datapath_binding
> > > > representing the transit switch
> > > >       b. Updating the key column of port_binding representing the
> > > > logical router port connecting to the transit switch.
> > > >       c. Creating chassis rows for remote gateway chassis.
> > > >
> > > >    I think it's better if ovn-ic can delegate all these to
> ovn-northd.
> > > > For (a) and (b), ovn-ic can set the generated tunnel key
> > > >    in the other_config/options column of Logical switch/Logical
> switch
> > > > port. ovn-northd can internally set this value to
> > > >    the south db.
> > > >
> > > >    For (c), I think its better we add another table - Remote_Chassis
> > > > (or some other appropriate name) . ovn-ic will create rows
> > > >    in this table for each remote chassis and ovn-northd will create
> > > > chassis row in south db.
> > > >    We already have Gateway_Chassis table in North db. So I think it's
> > > > fine if we add Remote_Chassis table. The only odd thing
> > > >    would be is to store the encap details in North db.
> > > >
> > > >    With this, ovn-ic, doesn't need to worry about syncing between
> > > > interconnect south db and ovn south db. In my opinion ovn-ic
> > > >    should act more like CMS to each availability zone and hence
> should
> > > > not do any write transactions to the south db.
> > > >
> > > >     Any concerns with this proposed approach ?
> > > >
> > > We could avoid ovn-ic writing directly to SB with some extra logic in
> > > northd, but I don't see any problem for ovn-ic to update SB directly.
> First
> > > of all, we have hypervisors creating and updating SB directly for
> Chassis
> > > and Encap records. The design here is that ovn-ic updates Chassis and
> Encap
> > > on behalf of remote gateway nodes, which I think is straightforward and
> > > reasonable. Similarly, port-binding's chassis column is updated the
> same
> > > way as how hypervisors are updating it. Secondly, for tunnel keys
> updating,
> > > it may seem graceful to update from northd, since northd is the only
> client
> > > that updates tunnel keys today, but since ovn-ic is responsible for
> > > calculating these keys, and it already has connection to SB, I think
> it is
> > > just more natural and efficient to update it directly, to avoid the
> extra
> > > logic and unnecessary latency from northd with another round of
> > > computation. The scope of the ovn-ic is only for the interconnection
> > > objects, so I don't see any conflict or ownership ambiguity here. What
> do
> > > you think?
> >
> > Regarding the ovn-ic writing to SB DB, I would like to get other's
> > opinion as well.
> > I agree to your points, but at the same time concerned with few things
> like
> >    - What about RBAC for ovn-ic ?
> >    - Right now we have RBAC for ovn-controller in writing to the SB DB.
> >    - Do we want something similar for ovn-ic if ovn-ic writes to SB DB.
> >    - Does CMS need to do something similar for ovn-ic, like how it is
> documented
> >      here -
> https://mail.openvswitch.org/pipermail/ovs-dev/2019-November/364481.html
> >      (related discussion here -
> >
> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-November/049501.html
> )
> >      if no RBAC for ovn-ic ?
> >
> For RBAC, I think ovn-ic and northd are at the same position, because they
> both are AZ level daemons, just focusing on different tasks. In theory they
> can be put in same process, but I separated them for clarity. So from RBAC
> perspective they should be just the same.
> Today there is no role for northd, which I think is a flaw, as discussed
> in the thread you pointed. It is not a big problem though, because a
> workaround is simply creating a separate connection and use iptable rules
> to restrict access from central nodes only. Same practice should be used
> for ovn-ic.
>
> > @Mark Michelson  @Dumitru Ceara  You have any thoughts/concerns on
> > ovn-ic writing to SB DB ?
> >
> > Regarding the tunnel key there are 2 things here
> >   (1) tunnel key for transit datapath
> >   (2) tunnel key for logical port connected to the transit switch
> >
> > For (1), you can completely avoid "Sync ISB TS tunnel key to AZ SB
> > datapath" by storing the tunnel id
> > in logical switch in NB DB like -
> > nbrec_logical_switch_update_other_config_setkey(ls,
> > "interconn-ts-tunnelkey, tunnelkey)
> >
> > ovn-northd when creating the datapath_binding row can set this value
> > directly. With this approach I think the function
> > ts_run() can be simplified a bit as it doesn't need to call -
> > SBREC_DATAPATH_BINDING_FOR_EACH()
> >
>
> I agree this approach should work. The code might be a little simpler but
> the latency will be added because of the extra computer iteration from
> northd to SB DB, although it might not be a big concern. So I don't have
> strong preference for either approach.
>
> > Right now CMS is expected to create lsp-lrp ports on the transit
> > switch - logical router on each AZ.
> > Instead of this, can't CMS/user create these links in IC-NB table ?
> > ''
> > something like
> >  ovn-ic-nbctl ts-add ts0
> >  ovn-ic-nbctl tsp-add ts0 az0 az0-lr0 MAC1 IP1
> >  ovn-ic-nbctl tsp-add ts0 az1 az1-lr0 MAC2 IP2
> >
> > (where az0-lr0 is  logical router present in az0's NB DB and az1-lr0
> > is logical router
> >  present in az1's NB DB).
> >
> > ovn-ic will internally create lsp-lrp ports in its AZ's NB DB. It can
> > also create 'Port_Binding'
> > row in IC-SB DB and doesn't need to worry about syncing between the SB
> > DB and IC-SB DB.
> >
> > This would probably solve  (2) and ovn-ic can set the tunnel key when
> > creating the lsp-lrp and ovn-northd
> > will make use of this tunnel key when creating the port_binding rows in
> SB DB.
> >
> > With this user/CMS doesn't have to worry about creating the lsp-lrps
> > in each AZ and I think
> > this seems more logical to me.
> >
> > Running "ovn-ic-nbctl show" will give a clear output to the user.
> >
>
> This is a different operation model than what I had in mind. My initial
> idea was that all configuration should be done at each AZ itself, and the
> ovn-ic from each AZ will sync between each other automatically, so there is
> no need for configuring the IC DBs by CMS/User. The reason why I end up
> with a IC-NB DB is the use case of multi-tenancy, when an operator needs to
> isolate different tenants on the backbone. They can create different TS for
> this purpose. Each TS can represent a tenant or segmentation-zone. This
> information should be specified by user and it is global, so it is stored
> in IC-NB DB. Other than this, there is no need for global level
> configurations. All the information that can be populated by ovn-ic is
> stored in IC-SB.
>
> With your proposal, LRP's from each AZ is directly configured by user in
> global IC-NB. It is a valid approach, too.
>
> However, I would prefer configuring them at AZ level for below
> considerations:
> - Try to avoid introducing new configurations. Other than the TS switch,
> there is nothing new. All the operations of LRP creation stays the same way
> as it is today, which I believe simplifies operation.
> - The ownership is more clear. AZ information of each LRP is populated
> automatically by ovn-ic from each AZ, and each AZ will only update its own
> LRPs to global DB. Each AZ has full control of its own intention of joining
> the global interconnection, deciding which router ports and gateways should
> get involved.
> - It might seem attractive to have a global view by configuring all these
> LRPs at global level, but in fact we will still need to configure static
> routes for each other at each AZ level. So the overall configuration will
> be half-global and half-local, which might in fact seem less natural.
>

I thought about the static routes.  With my proposal, ovn-ic can configure
the static routes too because it knows the transit switch configuration
would have all the details.


- The command ovn-ic-nbctl show may seem not showing enough information,
> but in fact ovn-ic-sbctl show tells all details with the hierarchy of AZ ->
> GW -> Ports (for each port it also shows its TS and subnet/IP). I think it
> provides a global view of both logical and physical.
>
> If the change is for avoid updating tunnel keys to SB directly, we can do
> similar approach as you suggested for TS.
>
> > If we take this approach, ovn-ic doesn't need to write to the SB DB
> > except for creating 'Chassis' table in
> > SB DB. We could avoid this also if we add Remote_Chassis in NB DB. But
> > we can discuss further on this.
> >
> > Let me know what you think.
> >
> For writing to SB, I am ok to avoid writing tunnel keys (for TS and for
> port-binding) directly to SB but through northd, although this would
> introduce some latency which I think is not a big issue.
>

Ok.

> But for chassis/encap/port-binding's chassis, I would still prefer to
> update to SB directly, consistent with what we are doing from
> ovn-controller. The only difference from ovn-controller is that RBAC is
> necessary from ovn-controller but not from ovn-ic, since ovn-ic is AZ level
> component that only runs on central nodes.
> If in the future we want to avoid SB update from ovn-controller, too, then
> we can make the same change in a consistent way then for both
> ovn-controller and ovn-ic.
> >
> > >
> > > > 3. In patch 7,its better to rename the ovs configuration option -
> > > > "external_ids:is-interconn"  to "external_ids:ovn-is-interconn".
> > > >    You also need to document it in ovn-controller-8.xml.
> > > >
> > > Agree!
> > >
> > > >    Or maybe we can remove this option - external_ids:is-interconn. We
> > > > probably don't need this if we do like below
> > > >
> > > >    2 (c) can also be done this way
> > > >          - User creates transit switch.
> > > >          - ovn-ic creates transit switch in north db.
> > > >          - then the user adds the logical router port - logical
> switch
> > > > port to the transit switch in availability zone - az1 (for example)
> > > >          - then the user creates gw chassis - for example
> > > >               ovn-nbctl lrp-set-gateway-chassis lrp-lr1-ts1 <gateway
> > > > name> [priority]
> > > >          - ovn-ic on az1 ,learns this and creates a Gateway_Chassis
> > > > row in the interconnect south db.
> > > >          - ovn-ic on other az's then create Remote_Chassis rows in
> the
> > > North db.
> > > >         - ovn-northd on other az's then create chassis row in south
> db
> > > > with "is_remote" set to true.
> > > >
> > > >     I am not sure if this complicates further and hence its better
> > > > that ovn-ic writes to the south dbs. But we can discuss further on
> > > > this.
> > > >
> > >
> > > I think this is a good idea and I am incline to it, because it avoids
> one
> > > configuration on the gateway node, which is good.
> > > The only concern is that it makes the remote gateway sync more dynamic
> - it
> > > changes when LRP's gateway-chassis/ha-chassis settings change, which
> may be
> > > less efficient. The code of ovn-ic will be a little more complex. I
> think
> > > it should be ok, but I'd like to hear from you, too.
> >
> > I would prefer to go with this approach. Another advantage is that,
> > ovn-controller don't
> > have to establish tunnels with the remote chassis until it really has
> > to. I think with the present
> > approach it establishes tunnels even if a transit switch doesn't have
> > any router ports ?
> > Correct me if I am wrong here ?
> >
>
> Yes you are right. I think it is mainly about static v.s. dynamic. The
> benefit of static is that it is more predicable and efficient (maybe),
> while the dynamic approach avoids the extra configuration of
> "is-interconnection". I don't worry about the tunnel setup yet, since it is
> merely a piece of data in the ovsdb table on the chassis, and there won't
> be too many interconnection gateways. But I think I am inclined to change
> as you suggested, to avoid the extra configuration.
>

Ok. So to summarize from our discussion, v3 of this series would
  -  avoid tunnel key updates to the SB DB in ovn-ic
   - but it will create remote chassis in SB DB right ?


Thanks
Numan


> >
> > Thanks
> > Numan
> >
> >
> > >
> > > >
> > > > 4. Can you please add a section in ovn-architecture about the
> "Journey
> > > > of  a packet in inter-az scenario" ? This would really
> > > >     help in understanding the feature.
> > > >
> > > Sure, I can add it.
> > >
> > > > Thanks
> > > > Numan
> > > >
> > > >
> > > > >  .gitignore                                      |    6 +
> > > > >  Documentation/automake.mk                       |    1 +
> > > > >  Documentation/tutorials/index.rst               |    1 +
> > > > >  Documentation/tutorials/ovn-interconnection.rst |  188 ++++
> > > > >  Makefile.am                                     |    1 +
> > > > >  NEWS                                            |    5 +
> > > > >  TODO.rst                                        |   10 +
> > > > >  automake.mk                                     |   75 ++
> > > > >  controller/binding.c                            |    6 +-
> > > > >  controller/chassis.c                            |   14 +
> > > > >  debian/ovn-common.install                       |    2 +
> > > > >  debian/ovn-common.manpages                      |    4 +
> > > > >  ic/.gitignore                                   |    2 +
> > > > >  ic/automake.mk                                  |   10 +
> > > > >  ic/ovn-ic.8.xml                                 |  111 +++
> > > > >  ic/ovn-ic.c                                     | 1050
> > > +++++++++++++++++++++++
> > > > >  lib/.gitignore                                  |    6 +
> > > > >  lib/automake.mk                                 |   32 +-
> > > > >  lib/ovn-inb-idl.ann                             |    9 +
> > > > >  lib/ovn-isb-idl.ann                             |    9 +
> > > > >  lib/ovn-util.c                                  |   92 ++
> > > > >  lib/ovn-util.h                                  |   15 +
> > > > >  northd/ovn-northd.c                             |  108 +--
> > > > >  ovn-architecture.7.xml                          |  107 ++-
> > > > >  ovn-inb.ovsschema                               |   75 ++
> > > > >  ovn-inb.xml                                     |  371 ++++++++
> > > > >  ovn-isb.ovsschema                               |  129 +++
> > > > >  ovn-isb.xml                                     |  582
> +++++++++++++
> > > > >  ovn-nb.ovsschema                                |    5 +-
> > > > >  ovn-nb.xml                                      |   28 +-
> > > > >  ovn-sb.ovsschema                                |    8 +-
> > > > >  ovn-sb.xml                                      |   24 +
> > > > >  tests/automake.mk                               |    8 +-
> > > > >  tests/ovn-ic.at                                 |  192 +++++
> > > > >  tests/ovn-inbctl.at                             |   65 ++
> > > > >  tests/ovn-isbctl.at                             |  112 +++
> > > > >  tests/ovn-macros.at                             |  161 +++-
> > > > >  tests/ovn.at                                    |  149 ++++
> > > > >  tests/testsuite.at                              |    3 +
> > > > >  tutorial/ovs-sandbox                            |   78 +-
> > > > >  utilities/.gitignore                            |    4 +
> > > > >  utilities/automake.mk                           |   16 +
> > > > >  utilities/ovn-ctl                               |  423 ++++++++-
> > > > >  utilities/ovn-ctl.8.xml                         |   91 ++
> > > > >  utilities/ovn-inbctl.8.xml                      |  174 ++++
> > > > >  utilities/ovn-inbctl.c                          |  948
> > > ++++++++++++++++++++
> > > > >  utilities/ovn-isbctl.8.xml                      |  148 ++++
> > > > >  utilities/ovn-isbctl.c                          | 1015
> > > ++++++++++++++++++++++
> > > > >  48 files changed, 6528 insertions(+), 145 deletions(-)
> > > > >  create mode 100644 Documentation/tutorials/ovn-interconnection.rst
> > > > >  create mode 100644 ic/.gitignore
> > > > >  create mode 100644 ic/automake.mk
> > > > >  create mode 100644 ic/ovn-ic.8.xml
> > > > >  create mode 100644 ic/ovn-ic.c
> > > > >  create mode 100644 lib/ovn-inb-idl.ann
> > > > >  create mode 100644 lib/ovn-isb-idl.ann
> > > > >  create mode 100644 ovn-inb.ovsschema
> > > > >  create mode 100644 ovn-inb.xml
> > > > >  create mode 100644 ovn-isb.ovsschema
> > > > >  create mode 100644 ovn-isb.xml
> > > > >  create mode 100644 tests/ovn-ic.at
> > > > >  create mode 100644 tests/ovn-inbctl.at
> > > > >  create mode 100644 tests/ovn-isbctl.at
> > > > >  create mode 100644 utilities/ovn-inbctl.8.xml
> > > > >  create mode 100644 utilities/ovn-inbctl.c
> > > > >  create mode 100644 utilities/ovn-isbctl.8.xml
> > > > >  create mode 100644 utilities/ovn-isbctl.c
> > > > >
> > > > > --
> > > > > 2.1.0
> > > > >
> > > > > _______________________________________________
> > > > > dev mailing list
> > > > > dev at openvswitch.org
> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > _______________________________________________
> > > dev mailing list
> > > dev at openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >
>


More information about the dev mailing list