[ovs-dev] [PATCH ovn v2 00/13] OVN Interconnection

Han Zhou hzhou at ovn.org
Mon Nov 18 02:52:05 UTC 2019


On Sat, Nov 16, 2019 at 4:03 AM Numan Siddique <numans at ovn.org> wrote:
>
>
>
> On Fri, Nov 15, 2019 at 1:18 PM Han Zhou <hzhou at ovn.org> wrote:
>>
>>
>>
>> On Thu, Nov 14, 2019 at 10:25 PM Numan Siddique <numans at ovn.org> wrote:
>> >
>> > On Thu, Nov 14, 2019 at 11:44 PM Han Zhou <hzhou at ovn.org> wrote:
>> > >
>> > > On Wed, Nov 13, 2019 at 10:27 AM Numan Siddique <numans at ovn.org>
wrote:
>> > > >
>> > > > On Thu, Oct 31, 2019 at 2:43 AM Han Zhou <hzhou at ovn.org> wrote:
>> > > > >
>> > > > > The series supports interconnecting multiple OVN deployments
(e.g.
>> > >  located at
>> > > > > multiple data centers) through logical routers connected with
tansit
>> > > logical
>> > > > > switches with overlay tunnels, managed through OVN control
plane.  See
>> > > the
>> > > > > ovn-architecture.rst document updates for more details, and find
the
>> > > > > instructions in Documentation/tutorials/ovn-interconnection.rst.
>> > > > >
>> > > > > Han Zhou (13):
>> > > > >   ovn-architecture: Add documentation for OVN interconnection
feature.
>> > > > >   ovn-inb: Interconnection northbound DB schema and CLI.
>> > > > >   ovn-isb: Interconnection southbound DB schema and CLI.
>> > > > >   ovn-ic: Interconnection controller with AZ registeration.
>> > > > >   ovn-northd.c: Refactor allocate_tnlid.
>> > > > >   ovn-ic: Transit switch controller.
>> > > > >   ovn-sb: Add columns is_interconn and is_remote to Chassis.
>> > > > >   ovn-ic: Interconnection gateway controller.
>> > > > >   ovn-ic: Interconnection port controller.
>> > > > >   ovn.at: e2e test for OVN interconnection.
>> > > > >   ovn-ctl: Refactor to reduce redundant code.
>> > > > >   ovn-ctl: Support commands for interconnection.
>> > > > >   tutorial: Add tutorial for OVN Interconnection.
>> > > > >
>> > > >
>> > > > Hi Han,
>> > > >
>> > > > Thanks for working on this feature. It's an interesting use case.
>> > > >
>> > > > I had a quick look at all the patches.
>> > > >
>> > >
>> > > Numan, thanks a lot for the thorough review! Please see my response
inlined.
>> > >
>> > > > I have few comments
>> > > >
>> > > > 1. I would suggest to rename the DBs as  ovn-ic-nb.ovsschema (and
the
>> > > > same for ovn-isb).
>> > > >     The DB name is - OVN_IC_Northbound. So it would make sense to
have
>> > > > - ovn-ic-nb.ovsschema
>> > > >      I would also suggest to rename the utilities to ovn-ic-nbctl
and
>> > > > ovn-ic-sbctl.
>> > > >     With  ovn-inbctl/ovn-isbctl, it is really confusing.
>> > > >
>> > > Sure, I felt not quite convenient with two dashes in a command name.
I
>> > > agree that ovn-ic-nbctl and ovn-ic-sbctl are more clear. I can
change it.
>> > >
>> > > > 2. ovn-ic service writes to interconnect south db, ovn north db and
>> > > > ovn south db. Writing to ic south db and
>> > > >     ovn north db is fine. But it seems a little odd for ovn-ic to
>> > > > write to south db. From what I understand it writes
>> > > >     to south db for 3 purposes
>> > > >       a. Updating the tunnel_key column of datapath_binding
>> > > > representing the transit switch
>> > > >       b. Updating the key column of port_binding representing the
>> > > > logical router port connecting to the transit switch.
>> > > >       c. Creating chassis rows for remote gateway chassis.
>> > > >
>> > > >    I think it's better if ovn-ic can delegate all these to
ovn-northd.
>> > > > For (a) and (b), ovn-ic can set the generated tunnel key
>> > > >    in the other_config/options column of Logical switch/Logical
switch
>> > > > port. ovn-northd can internally set this value to
>> > > >    the south db.
>> > > >
>> > > >    For (c), I think its better we add another table -
Remote_Chassis
>> > > > (or some other appropriate name) . ovn-ic will create rows
>> > > >    in this table for each remote chassis and ovn-northd will create
>> > > > chassis row in south db.
>> > > >    We already have Gateway_Chassis table in North db. So I think
it's
>> > > > fine if we add Remote_Chassis table. The only odd thing
>> > > >    would be is to store the encap details in North db.
>> > > >
>> > > >    With this, ovn-ic, doesn't need to worry about syncing between
>> > > > interconnect south db and ovn south db. In my opinion ovn-ic
>> > > >    should act more like CMS to each availability zone and hence
should
>> > > > not do any write transactions to the south db.
>> > > >
>> > > >     Any concerns with this proposed approach ?
>> > > >
>> > > We could avoid ovn-ic writing directly to SB with some extra logic in
>> > > northd, but I don't see any problem for ovn-ic to update SB
directly. First
>> > > of all, we have hypervisors creating and updating SB directly for
Chassis
>> > > and Encap records. The design here is that ovn-ic updates Chassis
and Encap
>> > > on behalf of remote gateway nodes, which I think is straightforward
and
>> > > reasonable. Similarly, port-binding's chassis column is updated the
same
>> > > way as how hypervisors are updating it. Secondly, for tunnel keys
updating,
>> > > it may seem graceful to update from northd, since northd is the only
client
>> > > that updates tunnel keys today, but since ovn-ic is responsible for
>> > > calculating these keys, and it already has connection to SB, I think
it is
>> > > just more natural and efficient to update it directly, to avoid the
extra
>> > > logic and unnecessary latency from northd with another round of
>> > > computation. The scope of the ovn-ic is only for the interconnection
>> > > objects, so I don't see any conflict or ownership ambiguity here.
What do
>> > > you think?
>> >
>> > Regarding the ovn-ic writing to SB DB, I would like to get other's
>> > opinion as well.
>> > I agree to your points, but at the same time concerned with few things
like
>> >    - What about RBAC for ovn-ic ?
>> >    - Right now we have RBAC for ovn-controller in writing to the SB DB.
>> >    - Do we want something similar for ovn-ic if ovn-ic writes to SB DB.
>> >    - Does CMS need to do something similar for ovn-ic, like how it is
documented
>> >      here -
https://mail.openvswitch.org/pipermail/ovs-dev/2019-November/364481.html
>> >      (related discussion here -
>> >
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-November/049501.html
)
>> >      if no RBAC for ovn-ic ?
>> >
>> For RBAC, I think ovn-ic and northd are at the same position, because
they both are AZ level daemons, just focusing on different tasks. In theory
they can be put in same process, but I separated them for clarity. So from
RBAC perspective they should be just the same.
>> Today there is no role for northd, which I think is a flaw, as discussed
in the thread you pointed. It is not a big problem though, because a
workaround is simply creating a separate connection and use iptable rules
to restrict access from central nodes only. Same practice should be used
for ovn-ic.
>>
>> > @Mark Michelson  @Dumitru Ceara  You have any thoughts/concerns on
>> > ovn-ic writing to SB DB ?
>> >
>> > Regarding the tunnel key there are 2 things here
>> >   (1) tunnel key for transit datapath
>> >   (2) tunnel key for logical port connected to the transit switch
>> >
>> > For (1), you can completely avoid "Sync ISB TS tunnel key to AZ SB
>> > datapath" by storing the tunnel id
>> > in logical switch in NB DB like -
>> > nbrec_logical_switch_update_other_config_setkey(ls,
>> > "interconn-ts-tunnelkey, tunnelkey)
>> >
>> > ovn-northd when creating the datapath_binding row can set this value
>> > directly. With this approach I think the function
>> > ts_run() can be simplified a bit as it doesn't need to call -
>> > SBREC_DATAPATH_BINDING_FOR_EACH()
>> >
>>
>> I agree this approach should work. The code might be a little simpler
but the latency will be added because of the extra computer iteration from
northd to SB DB, although it might not be a big concern. So I don't have
strong preference for either approach.
>>
>> > Right now CMS is expected to create lsp-lrp ports on the transit
>> > switch - logical router on each AZ.
>> > Instead of this, can't CMS/user create these links in IC-NB table ?
>> > ''
>> > something like
>> >  ovn-ic-nbctl ts-add ts0
>> >  ovn-ic-nbctl tsp-add ts0 az0 az0-lr0 MAC1 IP1
>> >  ovn-ic-nbctl tsp-add ts0 az1 az1-lr0 MAC2 IP2
>> >
>> > (where az0-lr0 is  logical router present in az0's NB DB and az1-lr0
>> > is logical router
>> >  present in az1's NB DB).
>> >
>> > ovn-ic will internally create lsp-lrp ports in its AZ's NB DB. It can
>> > also create 'Port_Binding'
>> > row in IC-SB DB and doesn't need to worry about syncing between the SB
>> > DB and IC-SB DB.
>> >
>> > This would probably solve  (2) and ovn-ic can set the tunnel key when
>> > creating the lsp-lrp and ovn-northd
>> > will make use of this tunnel key when creating the port_binding rows
in SB DB.
>> >
>> > With this user/CMS doesn't have to worry about creating the lsp-lrps
>> > in each AZ and I think
>> > this seems more logical to me.
>> >
>> > Running "ovn-ic-nbctl show" will give a clear output to the user.
>> >
>>
>> This is a different operation model than what I had in mind. My initial
idea was that all configuration should be done at each AZ itself, and the
ovn-ic from each AZ will sync between each other automatically, so there is
no need for configuring the IC DBs by CMS/User. The reason why I end up
with a IC-NB DB is the use case of multi-tenancy, when an operator needs to
isolate different tenants on the backbone. They can create different TS for
this purpose. Each TS can represent a tenant or segmentation-zone. This
information should be specified by user and it is global, so it is stored
in IC-NB DB. Other than this, there is no need for global level
configurations. All the information that can be populated by ovn-ic is
stored in IC-SB.
>>
>> With your proposal, LRP's from each AZ is directly configured by user in
global IC-NB. It is a valid approach, too.
>>
>> However, I would prefer configuring them at AZ level for below
considerations:
>> - Try to avoid introducing new configurations. Other than the TS switch,
there is nothing new. All the operations of LRP creation stays the same way
as it is today, which I believe simplifies operation.
>> - The ownership is more clear. AZ information of each LRP is populated
automatically by ovn-ic from each AZ, and each AZ will only update its own
LRPs to global DB. Each AZ has full control of its own intention of joining
the global interconnection, deciding which router ports and gateways should
get involved.
>> - It might seem attractive to have a global view by configuring all
these LRPs at global level, but in fact we will still need to configure
static routes for each other at each AZ level. So the overall configuration
will be half-global and half-local, which might in fact seem less natural.
>
>
> I thought about the static routes.  With my proposal, ovn-ic can
configure the static routes too because it knows the transit switch
configuration would have all the details.
>
I wanted to make the configuration at IC-NB as simple as possible. For the
static routes, currently they can be configured at each AZ, but I am also
planning a further improvement to avoid the configuration at all, by
learning the routes of edge routers from each other automatically through
IC-SB, but I will add it after the basic feature is consolidated.

>
>> - The command ovn-ic-nbctl show may seem not showing enough information,
but in fact ovn-ic-sbctl show tells all details with the hierarchy of AZ ->
GW -> Ports (for each port it also shows its TS and subnet/IP). I think it
provides a global view of both logical and physical.
>>
>> If the change is for avoid updating tunnel keys to SB directly, we can
do similar approach as you suggested for TS.
>>
>> > If we take this approach, ovn-ic doesn't need to write to the SB DB
>> > except for creating 'Chassis' table in
>> > SB DB. We could avoid this also if we add Remote_Chassis in NB DB. But
>> > we can discuss further on this.
>> >
>> > Let me know what you think.
>> >
>> For writing to SB, I am ok to avoid writing tunnel keys (for TS and for
port-binding) directly to SB but through northd, although this would
introduce some latency which I think is not a big issue.
>
>
> Ok.
>>
>> But for chassis/encap/port-binding's chassis, I would still prefer to
update to SB directly, consistent with what we are doing from
ovn-controller. The only difference from ovn-controller is that RBAC is
necessary from ovn-controller but not from ovn-ic, since ovn-ic is AZ level
component that only runs on central nodes.
>> If in the future we want to avoid SB update from ovn-controller, too,
then we can make the same change in a consistent way then for both
ovn-controller and ovn-ic.
>> >
>> > >
>> > > > 3. In patch 7,its better to rename the ovs configuration option -
>> > > > "external_ids:is-interconn"  to "external_ids:ovn-is-interconn".
>> > > >    You also need to document it in ovn-controller-8.xml.
>> > > >
>> > > Agree!
>> > >
>> > > >    Or maybe we can remove this option - external_ids:is-interconn.
We
>> > > > probably don't need this if we do like below
>> > > >
>> > > >    2 (c) can also be done this way
>> > > >          - User creates transit switch.
>> > > >          - ovn-ic creates transit switch in north db.
>> > > >          - then the user adds the logical router port - logical
switch
>> > > > port to the transit switch in availability zone - az1 (for example)
>> > > >          - then the user creates gw chassis - for example
>> > > >               ovn-nbctl lrp-set-gateway-chassis lrp-lr1-ts1
<gateway
>> > > > name> [priority]
>> > > >          - ovn-ic on az1 ,learns this and creates a Gateway_Chassis
>> > > > row in the interconnect south db.
>> > > >          - ovn-ic on other az's then create Remote_Chassis rows in
the
>> > > North db.
>> > > >         - ovn-northd on other az's then create chassis row in
south db
>> > > > with "is_remote" set to true.
>> > > >
>> > > >     I am not sure if this complicates further and hence its better
>> > > > that ovn-ic writes to the south dbs. But we can discuss further on
>> > > > this.
>> > > >
>> > >
>> > > I think this is a good idea and I am incline to it, because it
avoids one
>> > > configuration on the gateway node, which is good.
>> > > The only concern is that it makes the remote gateway sync more
dynamic - it
>> > > changes when LRP's gateway-chassis/ha-chassis settings change, which
may be
>> > > less efficient. The code of ovn-ic will be a little more complex. I
think
>> > > it should be ok, but I'd like to hear from you, too.
>> >
>> > I would prefer to go with this approach. Another advantage is that,
>> > ovn-controller don't
>> > have to establish tunnels with the remote chassis until it really has
>> > to. I think with the present
>> > approach it establishes tunnels even if a transit switch doesn't have
>> > any router ports ?
>> > Correct me if I am wrong here ?
>> >
>>
>> Yes you are right. I think it is mainly about static v.s. dynamic. The
benefit of static is that it is more predicable and efficient (maybe),
while the dynamic approach avoids the extra configuration of
"is-interconnection". I don't worry about the tunnel setup yet, since it is
merely a piece of data in the ovsdb table on the chassis, and there won't
be too many interconnection gateways. But I think I am inclined to change
as you suggested, to avoid the extra configuration.
>
>
> Ok. So to summarize from our discussion, v3 of this series would
>   -  avoid tunnel key updates to the SB DB in ovn-ic
>    - but it will create remote chassis in SB DB right ?
>

Yes. In addition, I will also remove the "is-interconn" configuration for
Chassis as you suggested, and determine the Chassis is for interconnection
if there is a LRP at TS assigned to it, by either gateway_chassis or by
ha_chassis.
Does this all sound good for v3?

>
> Thanks
> Numan
>
>>
>> >
>> > Thanks
>> > Numan
>> >
>> >
>> > >
>> > > >
>> > > > 4. Can you please add a section in ovn-architecture about the
"Journey
>> > > > of  a packet in inter-az scenario" ? This would really
>> > > >     help in understanding the feature.
>> > > >
>> > > Sure, I can add it.
>> > >
>> > > > Thanks
>> > > > Numan
>> > > >
>> > > >
>> > > > >  .gitignore                                      |    6 +
>> > > > >  Documentation/automake.mk                       |    1 +
>> > > > >  Documentation/tutorials/index.rst               |    1 +
>> > > > >  Documentation/tutorials/ovn-interconnection.rst |  188 ++++
>> > > > >  Makefile.am                                     |    1 +
>> > > > >  NEWS                                            |    5 +
>> > > > >  TODO.rst                                        |   10 +
>> > > > >  automake.mk                                     |   75 ++
>> > > > >  controller/binding.c                            |    6 +-
>> > > > >  controller/chassis.c                            |   14 +
>> > > > >  debian/ovn-common.install                       |    2 +
>> > > > >  debian/ovn-common.manpages                      |    4 +
>> > > > >  ic/.gitignore                                   |    2 +
>> > > > >  ic/automake.mk                                  |   10 +
>> > > > >  ic/ovn-ic.8.xml                                 |  111 +++
>> > > > >  ic/ovn-ic.c                                     | 1050
>> > > +++++++++++++++++++++++
>> > > > >  lib/.gitignore                                  |    6 +
>> > > > >  lib/automake.mk                                 |   32 +-
>> > > > >  lib/ovn-inb-idl.ann                             |    9 +
>> > > > >  lib/ovn-isb-idl.ann                             |    9 +
>> > > > >  lib/ovn-util.c                                  |   92 ++
>> > > > >  lib/ovn-util.h                                  |   15 +
>> > > > >  northd/ovn-northd.c                             |  108 +--
>> > > > >  ovn-architecture.7.xml                          |  107 ++-
>> > > > >  ovn-inb.ovsschema                               |   75 ++
>> > > > >  ovn-inb.xml                                     |  371 ++++++++
>> > > > >  ovn-isb.ovsschema                               |  129 +++
>> > > > >  ovn-isb.xml                                     |  582
+++++++++++++
>> > > > >  ovn-nb.ovsschema                                |    5 +-
>> > > > >  ovn-nb.xml                                      |   28 +-
>> > > > >  ovn-sb.ovsschema                                |    8 +-
>> > > > >  ovn-sb.xml                                      |   24 +
>> > > > >  tests/automake.mk                               |    8 +-
>> > > > >  tests/ovn-ic.at                                 |  192 +++++
>> > > > >  tests/ovn-inbctl.at                             |   65 ++
>> > > > >  tests/ovn-isbctl.at                             |  112 +++
>> > > > >  tests/ovn-macros.at                             |  161 +++-
>> > > > >  tests/ovn.at                                    |  149 ++++
>> > > > >  tests/testsuite.at                              |    3 +
>> > > > >  tutorial/ovs-sandbox                            |   78 +-
>> > > > >  utilities/.gitignore                            |    4 +
>> > > > >  utilities/automake.mk                           |   16 +
>> > > > >  utilities/ovn-ctl                               |  423 ++++++++-
>> > > > >  utilities/ovn-ctl.8.xml                         |   91 ++
>> > > > >  utilities/ovn-inbctl.8.xml                      |  174 ++++
>> > > > >  utilities/ovn-inbctl.c                          |  948
>> > > ++++++++++++++++++++
>> > > > >  utilities/ovn-isbctl.8.xml                      |  148 ++++
>> > > > >  utilities/ovn-isbctl.c                          | 1015
>> > > ++++++++++++++++++++++
>> > > > >  48 files changed, 6528 insertions(+), 145 deletions(-)
>> > > > >  create mode 100644
Documentation/tutorials/ovn-interconnection.rst
>> > > > >  create mode 100644 ic/.gitignore
>> > > > >  create mode 100644 ic/automake.mk
>> > > > >  create mode 100644 ic/ovn-ic.8.xml
>> > > > >  create mode 100644 ic/ovn-ic.c
>> > > > >  create mode 100644 lib/ovn-inb-idl.ann
>> > > > >  create mode 100644 lib/ovn-isb-idl.ann
>> > > > >  create mode 100644 ovn-inb.ovsschema
>> > > > >  create mode 100644 ovn-inb.xml
>> > > > >  create mode 100644 ovn-isb.ovsschema
>> > > > >  create mode 100644 ovn-isb.xml
>> > > > >  create mode 100644 tests/ovn-ic.at
>> > > > >  create mode 100644 tests/ovn-inbctl.at
>> > > > >  create mode 100644 tests/ovn-isbctl.at
>> > > > >  create mode 100644 utilities/ovn-inbctl.8.xml
>> > > > >  create mode 100644 utilities/ovn-inbctl.c
>> > > > >  create mode 100644 utilities/ovn-isbctl.8.xml
>> > > > >  create mode 100644 utilities/ovn-isbctl.c
>> > > > >
>> > > > > --
>> > > > > 2.1.0
>> > > > >
>> > > > > _______________________________________________
>> > > > > dev mailing list
>> > > > > dev at openvswitch.org
>> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> > > _______________________________________________
>> > > dev mailing list
>> > > dev at openvswitch.org
>> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> > >


More information about the dev mailing list