[ovs-dev] [PATCH v6 ovn] Allow to run multiple controllers on the same machine

Ihar Hrachyshka ihrachys at redhat.com
Fri Sep 18 14:22:18 UTC 2020


On Fri, Sep 18, 2020 at 3:19 AM Han Zhou <hzhou at ovn.org> wrote:
>
>
>
> On Thu, Sep 17, 2020 at 7:15 PM Ihar Hrachyshka <ihrachys at redhat.com> wrote:
> >
> > Hi Han,
> >
> > thanks a lot for the review! I addressed most comments below, listing
> > here the questions not immediately addressed in the next version I am
> > about to push:
> >
> > 1. why is file_system_id array static? there are drawbacks in
> > switching it to local: we would need to do a lot more xstrdup / free
> > work in multiple places. But it's doable.
>
> Thanks for your explanation. Static makes sense.
>
> >
> > 2. the confusion between system-id (OVN) and system-id.conf (OVS) file
> > names. Should we rename the former into e.g.
> > $ovsconfdir/ovn-chassis-name-override or something along these lines?
> >
>
> ovn-chassis-name-override sounds good to me. Since it is for OVN only, I think it is better to put in OVN's config dir.

Ack. You mean $ovn_sysconfdir/ovn? (/etc/ovn/)

>
> However, I just realized another problem related to the use of the file. Since it is a new input to the I-P engine, we will have to add a node to the engine with the only data being the ID read from the file. We will also track the old ID, and in its run() function we will check if the ID is changed, and then trigger engine compute for all the nodes that depend on it. Otherwise, changing the ID in the file won't get enforced by the I-P engine. Sorry that I didn't thought about this in my previous review.

I appreciate the request, just tested and indeed it doesn't (always)
work as one could expect assuming the file can be changed in runtime.
I am happy to fix this. For that though, I may need some more time
than a couple of hours (need to understand the I-P machinery first,
which I haven't touched yet). Is it possible that we e.g. mark the
limitation in documentation and in release notes and follow up with a
backportable bug fix for that? Let me know.

>
> Thanks,
> Han
>
> > On Thu, Sep 17, 2020 at 3:28 AM Han Zhou <hzhou at ovn.org> wrote:
> > >
> > > Hi Ihar,
> > >
> > > Please see my comments below.
> > >
> > > Thanks,
> > > Han
> > >
> > > On Wed, Sep 16, 2020 at 7:16 PM Ihar Hrachyshka <ihrachys at redhat.com> wrote:
> > > >
> > > > User stories:
> > > > 1) NFV: an admin wants to run two separate instances of OVN controller
> > > >    using the same database but configuring ports on different bridges.
> > > >    Some of these bridges may use DPDK while others may not.
> > > >
> > > > 2) Parallel OVN instances: an admin wants to run two separate
> > > >    instances of OVN controller using different databases. The
> > > >    instances are completely independent and serve different consumers.
> > > >    For example, the same machine runs both OpenStack and OpenShift
> > > >    stacks, each running its own separate OVN stack.
> > > >
> > > > To serve these use cases, several features should be added to
> > > > ovn-controller:
> > > >
> > > > - use different database configuration for multiple controllers;
> > > > - customize chassis name used by controller.
> > > >
> > > > =====
> > > >
> > > > For each of the following database configuration options, their
> > > > extended chassis specific counterparts are introduced:
> > > >
> > > > external_ids:hostname
> > > > external_ids:ovn-bridge
> > > > external_ids:ovn-bridge-datapath-type
> > > > external_ids:ovn-bridge-mappings
> > > > external_ids:ovn-chassis-mac-mappings
> > > > external_ids:ovn-cms-options
> > > > external_ids:ovn-encap-csum
> > > > external_ids:ovn-encap-ip
> > > > external_ids:ovn-encap-type
> > > > external_ids:ovn-is-interconn
> > > > external_ids:ovn-monitor-all
> > > > external_ids:ovn-openflow-probe-interval
> > > > external_ids:ovn-remote
> > > > external_ids:ovn-remote-probe-interval
> > > >
> > > > For example,
> > > >
> > > > external_ids:ovn-bridge -> external_ids:ovn-bridge-<chassis-name>=
> > > > external_ids:ovn-encap-ip -> external_ids:ovn-encap-ip-<chassis-name>=
> > > > external_ids:ovn-remote -> external_ids:ovn-remote-<chassis-name>=
> > > >
> > > > Priority wise, <chassis-name> specific options take precedence.
> > > >
> > > > =====
> > > >
> > > > For system-id,
> > > >
> > > > You can now pass intended chassis name via CLI argument:
> > > >
> > > >   $ ovn-controller ... -n <chassis_name>
> > > >
> > > > Alternatively, you can configure a chassis name by putting it into the
> > > > ${ovs_sysconfdir}/system-id file before running the controller.
> > > >
> > > > The latter option may be more useful in container environment where
> > > > the same image may be reused for multiple controller instances, where
> > > > ovs_sysconfigdir/system-id is a volume mounted into this generic
> > > > image.
> > > >
> > > > Priority wise, this is the order in which different means to configure
> > > > the chassis name are used:
> > > >
> > > > - ovn-controller ... -n <chassis_name> CLI argument.
> > > > - ${ovs_sysconfdir}/system-id file;
> > >
> > > Could you explain why introducing the option of reading from the system-id file? The file is global, so setting in external-ids would serve the same purpose (when there is only one ovn-controller instance), right? In addition, there is system-id.conf file used by OVS, which is easily confused with this file which has the same name but without the .conf suffix. If you meant to use the same system-id.conf file, then I'd say this behavior is not backward compatible, because originally ovn-controller would use the one in external-ids if that is different from the id in the file, and now it preferers the file over the external-ids.
> >
> > If there are multiple controllers on the same host, and if each
> > ovn-controller instance is running in a container, then system-id file
> > may have different contents, depending on mounts used for the pod (not
> > otherwise something possible with ovsdb shared between instances
> > running on the same host). This is the scenario we envision when
> > running two parallel CMS on the same host (e.g. OpenStack and
> > OpenShift), each CMS running its own OVN controller in a separate pod.
> >
> > I was not aware of system-id.conf. Perhaps it's just a naming problem?
> > Not sure how to mitigate the confusion though since system-id.conf
> > itself is not very distinct (system-id-random.conf would probably be a
> > better name if we were to decide now). Maybe the new file could be
> > called "system-id-override" to underscore its priority?
> >
> > >
> > > > - external_ids:system-id= ovsdb option;
> > > >
> > > > =====
> > > >
> > > > Concurrent chassis running on the same host may inadvertantly remove
> > > > patch ports that belong to their peer chassis. To avoid that, a new
> > > > external-ids:ovn-is-concurrent config option is introduced. It will
> > > > avoid removing "unknown" patch ports and tunnel endpoints.
> > > >
> > > > =====
> > > >
> > > > Note: this patch assumes that each chassis has its own unique IP.
> > > > Future work may consider adding support to specify custom port numbers
> > > > for tunneling that would allow to reuse the same IP address for
> > > > multiple chassis running on the same host. This work is out of scope
> > > > for this patch.
> > > >
> > > > Signed-off-by: Ihar Hrachyshka <ihrachys at redhat.com>
> > > >
> > > > ---
> > > >
> > > > v1: initial implementation.
> > > > v2: fixed test case to check ports are claimed by proper chassis.
> > > > v2: added NEWS entry.
> > > > v2: fixed some compiler warnings.
> > > > v2: moved file_system_id declaration inside a function that uses it.
> > > > v2: removed unneeded binding.h #include.
> > > > v2: docs: better explanation of alternatives to select chassis name.
> > > > v3: reverted priority order for chassis configuration: first CLI, then
> > > >     system-id file, then ovsdb.
> > > > v4: introduce helpers to extract external-ids (per-chassis or global).
> > > > v4: introduce per-chassis config options for all keys.
> > > > v4: introduce -M (--concurrent) CLI argument to avoid patch ports
> > > >     removed by concurrent chassis.
> > > > v5: rebased.
> > > > v6: switched from -M (--concurrent) to external-ids:ovn-is-concurrent.
> > > > v6: with ovn-is-concurrent=true, also avoid removing unknown tunnel
> > > >     endpoints.
> > > > ---
> > > >  NEWS                            |   5 ++
> > > >  controller/binding.h            |   1 +
> > > >  controller/chassis.c            |  77 +++++++++++++----------
> > > >  controller/chassis.h            |   3 +-
> > > >  controller/encaps.c             |  22 ++++---
> > > >  controller/encaps.h             |   1 +
> > > >  controller/ovn-controller.8.xml |  21 +++++++
> > > >  controller/ovn-controller.c     | 106 +++++++++++++++++++++++++-------
> > > >  controller/ovn-controller.h     |   4 ++
> > > >  controller/patch.c              |  21 ++++---
> > > >  controller/physical.c           |   2 +-
> > > >  lib/ovn-util.c                  |  50 +++++++++++++++
> > > >  lib/ovn-util.h                  |  18 ++++++
> > > >  ovn-sb.xml                      |  10 +--
> > > >  tests/ovn-controller.at         |   9 ++-
> > > >  tests/ovn-macros.at             |  49 ++++++++++++---
> > > >  tests/ovn.at                    | 103 ++++++++++++++++++++++++++++++-
> > > >  17 files changed, 415 insertions(+), 87 deletions(-)
> > > >
> > > > diff --git a/NEWS b/NEWS
> > > > index dece5831c..aec25be30 100644
> > > > --- a/NEWS
> > > > +++ b/NEWS
> > > > @@ -17,6 +17,11 @@ OVN v20.09.0 - xx xxx xxxx
> > > >       this mechanism should update their code to use this new table.
> > > >     - Added support for external ip based NAT. Now, besides the logical ip,
> > > >       external ips will also decide if a packet will be NATed or not.
> > > > +   - Added support for multiple ovn-controller instances on the same host
> > > > +     (virtual chassis). Now all external-ids:* configuration options can be
> > > > +     customized for each controller instance running on the same host. The only
> > > > +     option that is not available per chassis is external-ids:system-id, which
> > > > +     stands for the chassis name and can be passed via config file or CLI (-n).
> > > >
> > > >  OVN v20.06.0
> > > >  --------------------------
> > > > diff --git a/controller/binding.h b/controller/binding.h
> > > > index c9740560f..12557b3b9 100644
> > > > --- a/controller/binding.h
> > > > +++ b/controller/binding.h
> > > > @@ -32,6 +32,7 @@ struct ovsrec_port_table;
> > > >  struct ovsrec_qos_table;
> > > >  struct ovsrec_bridge_table;
> > > >  struct ovsrec_open_vswitch_table;
> > > > +struct ovsrec_open_vswitch;
> > >
> > > This can be removed.
> > >
> > > >  struct sbrec_chassis;
> > > >  struct sbrec_port_binding_table;
> > > >  struct sset;
> > > > diff --git a/controller/chassis.c b/controller/chassis.c
> > > > index a365188e8..989ec5e1a 100644
> > > > --- a/controller/chassis.c
> > > > +++ b/controller/chassis.c
> > > > @@ -125,9 +125,10 @@ chassis_register_ovs_idl(struct ovsdb_idl *ovs_idl)
> > > >  }
> > > >
> > > >  static const char *
> > > > -get_hostname(const struct smap *ext_ids)
> > > > +get_hostname(const struct smap *ext_ids, const char *chassis_id)
> > > >  {
> > > > -    const char *hostname = smap_get_def(ext_ids, "hostname", "");
> > > > +    const char *hostname = get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "hostname", "");
> > > >
> > > >      if (strlen(hostname) == 0) {
> > > >          static char hostname_[HOST_NAME_MAX + 1];
> > > > @@ -143,39 +144,45 @@ get_hostname(const struct smap *ext_ids)
> > > >  }
> > > >
> > > >  static const char *
> > > > -get_bridge_mappings(const struct smap *ext_ids)
> > > > +get_bridge_mappings(const struct smap *ext_ids, const char *chassis_id)
> > > >  {
> > > > -    return smap_get_def(ext_ids, "ovn-bridge-mappings", "");
> > > > +    return get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "ovn-bridge-mappings", "");
> > > >  }
> > > >
> > > >  const char *
> > > > -get_chassis_mac_mappings(const struct smap *ext_ids)
> > > > +get_chassis_mac_mappings(const struct smap *ext_ids, const char *chassis_id)
> > > >  {
> > > > -    return smap_get_def(ext_ids, "ovn-chassis-mac-mappings", "");
> > > > +    return get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "ovn-chassis-mac-mappings", "");
> > > >  }
> > > >
> > > >  static const char *
> > > > -get_cms_options(const struct smap *ext_ids)
> > > > +get_cms_options(const struct smap *ext_ids, const char *chassis_id)
> > > >  {
> > > > -    return smap_get_def(ext_ids, "ovn-cms-options", "");
> > > > +    return get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "ovn-cms-options", "");
> > > >  }
> > > >
> > > >  static const char *
> > > > -get_monitor_all(const struct smap *ext_ids)
> > > > +get_monitor_all(const struct smap *ext_ids, const char *chassis_id)
> > > >  {
> > > > -    return smap_get_def(ext_ids, "ovn-monitor-all", "false");
> > > > +    return get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "ovn-monitor-all", "false");
> > > >  }
> > > >
> > > >  static const char *
> > > > -get_enable_lflow_cache(const struct smap *ext_ids)
> > > > +get_enable_lflow_cache(const struct smap *ext_ids, const char *chassis_id)
> > > >  {
> > > > -    return smap_get_def(ext_ids, "ovn-enable-lflow-cache", "true");
> > > > +    return get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "ovn-enable-lflow-cache", "true");
> > > >  }
> > > >
> > > >  static const char *
> > > > -get_encap_csum(const struct smap *ext_ids)
> > > > +get_encap_csum(const struct smap *ext_ids, const char *chassis_id)
> > > >  {
> > > > -    return smap_get_def(ext_ids, "ovn-encap-csum", "true");
> > > > +    return get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "ovn-encap-csum", "true");
> > > >  }
> > > >
> > > >  static const char *
> > > > @@ -189,9 +196,10 @@ get_datapath_type(const struct ovsrec_bridge *br_int)
> > > >  }
> > > >
> > > >  static bool
> > > > -get_is_interconn(const struct smap *ext_ids)
> > > > +get_is_interconn(const struct smap *ext_ids, const char *chassis_id)
> > > >  {
> > > > -    return smap_get_bool(ext_ids, "ovn-is-interconn", false);
> > > > +    return get_chassis_external_id_value_bool(
> > > > +        ext_ids, chassis_id, "ovn-is-interconn", false);
> > > >  }
> > > >
> > > >  static void
> > > > @@ -278,22 +286,27 @@ chassis_parse_ovs_config(const struct ovsrec_open_vswitch_table *ovs_table,
> > > >          return false;
> > > >      }
> > > >
> > > > -    const char *encap_type = smap_get(&cfg->external_ids, "ovn-encap-type");
> > > > -    const char *encap_ips = smap_get(&cfg->external_ids, "ovn-encap-ip");
> > > > +    const char *chassis_id = get_ovs_chassis_id(cfg);
> > > > +    const struct smap *ext_ids = &cfg->external_ids;
> > > > +
> > > > +    const char *encap_type = get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "ovn-encap-type", NULL);
> > > > +    const char *encap_ips = get_chassis_external_id_value(
> > > > +        ext_ids, chassis_id, "ovn-encap-ip", NULL);
> > > >      if (!encap_type || !encap_ips) {
> > > >          static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> > > >          VLOG_INFO_RL(&rl, "Need to specify an encap type and ip");
> > > >          return false;
> > > >      }
> > > >
> > > > -    ovs_cfg->hostname = get_hostname(&cfg->external_ids);
> > > > -    ovs_cfg->bridge_mappings = get_bridge_mappings(&cfg->external_ids);
> > > > +    ovs_cfg->hostname = get_hostname(ext_ids, chassis_id);
> > > > +    ovs_cfg->bridge_mappings = get_bridge_mappings(ext_ids, chassis_id);
> > > >      ovs_cfg->datapath_type = get_datapath_type(br_int);
> > > > -    ovs_cfg->encap_csum = get_encap_csum(&cfg->external_ids);
> > > > -    ovs_cfg->cms_options = get_cms_options(&cfg->external_ids);
> > > > -    ovs_cfg->monitor_all = get_monitor_all(&cfg->external_ids);
> > > > -    ovs_cfg->chassis_macs = get_chassis_mac_mappings(&cfg->external_ids);
> > > > -    ovs_cfg->enable_lflow_cache = get_enable_lflow_cache(&cfg->external_ids);
> > > > +    ovs_cfg->encap_csum = get_encap_csum(ext_ids, chassis_id);
> > > > +    ovs_cfg->cms_options = get_cms_options(ext_ids, chassis_id);
> > > > +    ovs_cfg->monitor_all = get_monitor_all(ext_ids, chassis_id);
> > > > +    ovs_cfg->chassis_macs = get_chassis_mac_mappings(ext_ids, chassis_id);
> > > > +    ovs_cfg->enable_lflow_cache = get_enable_lflow_cache(ext_ids, chassis_id);
> > > >
> > > >      if (!chassis_parse_ovs_encap_type(encap_type, &ovs_cfg->encap_type_set)) {
> > > >          return false;
> > > > @@ -311,7 +324,7 @@ chassis_parse_ovs_config(const struct ovsrec_open_vswitch_table *ovs_table,
> > > >          sset_destroy(&ovs_cfg->encap_ip_set);
> > > >      }
> > > >
> > > > -    ovs_cfg->is_interconn = get_is_interconn(&cfg->external_ids);
> > > > +    ovs_cfg->is_interconn = get_is_interconn(ext_ids, chassis_id);
> > > >
> > > >      return true;
> > > >  }
> > > > @@ -348,7 +361,7 @@ chassis_other_config_changed(const char *bridge_mappings,
> > > >                               const struct sbrec_chassis *chassis_rec)
> > > >  {
> > > >      const char *chassis_bridge_mappings =
> > > > -        get_bridge_mappings(&chassis_rec->other_config);
> > > > +        get_bridge_mappings(&chassis_rec->other_config, NULL);
> > > >
> > > >      if (strcmp(bridge_mappings, chassis_bridge_mappings)) {
> > > >          return true;
> > > > @@ -362,28 +375,28 @@ chassis_other_config_changed(const char *bridge_mappings,
> > > >      }
> > > >
> > > >      const char *chassis_cms_options =
> > > > -        get_cms_options(&chassis_rec->other_config);
> > > > +        get_cms_options(&chassis_rec->other_config, NULL);
> > > >
> > > >      if (strcmp(cms_options, chassis_cms_options)) {
> > > >          return true;
> > > >      }
> > > >
> > > >      const char *chassis_monitor_all =
> > > > -        get_monitor_all(&chassis_rec->other_config);
> > > > +        get_monitor_all(&chassis_rec->other_config, NULL);
> > > >
> > > >      if (strcmp(monitor_all, chassis_monitor_all)) {
> > > >          return true;
> > > >      }
> > > >
> > > >      const char *chassis_enable_lflow_cache =
> > > > -        get_enable_lflow_cache(&chassis_rec->other_config);
> > > > +        get_enable_lflow_cache(&chassis_rec->other_config, NULL);
> > > >
> > > >      if (strcmp(enable_lflow_cache, chassis_enable_lflow_cache)) {
> > > >          return true;
> > > >      }
> > > >
> > > >      const char *chassis_mac_mappings =
> > > > -        get_chassis_mac_mappings(&chassis_rec->other_config);
> > > > +        get_chassis_mac_mappings(&chassis_rec->other_config, NULL);
> > > >      if (strcmp(chassis_macs, chassis_mac_mappings)) {
> > > >          return true;
> > > >      }
> > > > @@ -791,7 +804,7 @@ chassis_get_mac(const struct sbrec_chassis *chassis_rec,
> > > >                  struct eth_addr *chassis_mac)
> > > >  {
> > > >      const char *tokens
> > > > -        = get_chassis_mac_mappings(&chassis_rec->other_config);
> > > > +        = get_chassis_mac_mappings(&chassis_rec->other_config, NULL);
> > > >      if (!tokens[0]) {
> > > >         return false;
> > > >      }
> > > > diff --git a/controller/chassis.h b/controller/chassis.h
> > > > index 220f726b9..c7345f0fa 100644
> > > > --- a/controller/chassis.h
> > > > +++ b/controller/chassis.h
> > > > @@ -49,7 +49,8 @@ bool chassis_get_mac(const struct sbrec_chassis *chassis,
> > > >                       const char *bridge_mapping,
> > > >                       struct eth_addr *chassis_mac);
> > > >  const char *chassis_get_id(void);
> > > > -const char * get_chassis_mac_mappings(const struct smap *ext_ids);
> > > > +const char * get_chassis_mac_mappings(const struct smap *ext_ids,
> > > > +                                      const char *chassis_id);
> > > >
> > > >
> > > >  #endif /* controller/chassis.h */
> > > > diff --git a/controller/encaps.c b/controller/encaps.c
> > > > index 7eac4bb06..243c8bfa5 100644
> > > > --- a/controller/encaps.c
> > > > +++ b/controller/encaps.c
> > > > @@ -293,6 +293,7 @@ chassis_tzones_overlap(const struct sset *transport_zones,
> > > >
> > > >  void
> > > >  encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
> > > > +           const struct ovsrec_open_vswitch_table *ovs_table,
> > > >             const struct ovsrec_bridge_table *bridge_table,
> > > >             const struct ovsrec_bridge *br_int,
> > > >             const struct sbrec_chassis_table *chassis_table,
> > > > @@ -373,13 +374,20 @@ encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
> > > >          }
> > > >      }
> > > >
> > > > -    /* Delete any existing OVN tunnels that were not still around. */
> > > > -    struct shash_node *node, *next_node;
> > > > -    SHASH_FOR_EACH_SAFE (node, next_node, &tc.chassis) {
> > > > -        struct chassis_node *chassis = node->data;
> > > > -        ovsrec_bridge_update_ports_delvalue(chassis->bridge, chassis->port);
> > > > -        shash_delete(&tc.chassis, node);
> > > > -        free(chassis);
> > > > +    const struct ovsrec_open_vswitch *cfg;
> > > > +    cfg = ovsrec_open_vswitch_table_first(ovs_table);
> > > > +
> > > > +    /* Delete any existing OVN tunnels that were not still around if it's
> > > > +     * safe. */
> > > > +    if (!is_concurrent_chassis(cfg)) {
> > > > +        struct shash_node *node, *next_node;
> > > > +        SHASH_FOR_EACH_SAFE (node, next_node, &tc.chassis) {
> > > > +            struct chassis_node *chassis = node->data;
> > > > +            ovsrec_bridge_update_ports_delvalue(chassis->bridge,
> > > > +                                                chassis->port);
> > > > +            shash_delete(&tc.chassis, node);
> > > > +            free(chassis);
> > > > +        }
> > >
> > > Why not delete the tunnels? The ovn-controller only operates on the tunnel ports on the bridge owned by this instance, right? So it would never delete tunnels for other instances?
> >
> > Actually, no; right now ovn-controller cleans up all tunnels from all
> > bridges, regardless whether they belong to it or not. But you are
> > right that we should probably act on the single integration bridge
> > only which will automatically fix the issue of fighting instances. I
> > will make adjustments to that effect.
> >
> > >
> > > On the other hand, if tunnels are not deleted when they should, there can be bfd failures if the remote is a GW chassis, which may lead to problems.
> > >
> > > >      }
> > > >      shash_destroy(&tc.chassis);
> > > >      sset_destroy(&tc.port_names);
> > > > diff --git a/controller/encaps.h b/controller/encaps.h
> > > > index f488393c4..da2478086 100644
> > > > --- a/controller/encaps.h
> > > > +++ b/controller/encaps.h
> > > > @@ -30,6 +30,7 @@ struct sset;
> > > >
> > > >  void encaps_register_ovs_idl(struct ovsdb_idl *);
> > > >  void encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
> > > > +                const struct ovsrec_open_vswitch_table *ovs_table,
> > > >                  const struct ovsrec_bridge_table *,
> > > >                  const struct ovsrec_bridge *br_int,
> > > >                  const struct sbrec_chassis_table *,
> > > > diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
> > > > index 16bc47b20..64bfa5da9 100644
> > > > --- a/controller/ovn-controller.8.xml
> > > > +++ b/controller/ovn-controller.8.xml
> > > > @@ -233,8 +233,29 @@
> > > >          The boolean flag indicates if the chassis is used as an
> > > >          interconnection gateway.
> > > >        </dd>
> > > > +
> > > > +      <dt><code>external_ids:ovn-is-concurrent</code></dt>
> > > > +      <dd>
> > > > +        The boolean flag indicates if multiple controller instances are running
> > > > +        on the same host in parallel. If set, some cleanup operations for
> > > > +        unknown database resources (tunnel endpoints, patch ports) are skipped
> > > > +        to avoid controllers fighting each other.
> > > > +      </dd>
> > > >      </dl>
> > > >
> > > > +    <p>
> > > > +      Note that every <code>external_ids:*</code> key listed above has its
> > > > +      <code>external_ids:*-chassis_name</code> counterpart keys that allow to
> > > > +      configure values specific to chassis running on the same OVSDB. For
> > > > +      example, if two chassis named <code>blue</code> and <code>red</code> are
> > > > +      available on the same host, then an admin may configure different
> > > > +      <code>ovn-cms-options</code> for each of them by setting
> > > > +      <code>external_ids:ovn-cms-options-blue</code> and
> > > > +      <code>external_ids:ovn-cms-options-red</code> keys in the database. The
> > > > +      only key that is not available for per-chassis configuration is
> > > > +      <code>external_ids:system-id</code>.
> > > > +    </p>
> > > > +
> > >
> > > I think the ovn-is-concurrent should not be available for per-chassis configuration either. It is either true for all controllers (if there are multiple), or false if there is only one.
> >
> > Agreed, and we are going to get rid of the knob completely, as discussed below.
> >
> > >
> > > >      <p>
> > > >        <code>ovn-controller</code> reads the following values from the
> > > >        <code>Open_vSwitch</code> database of the local OVS instance:
> > > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > > > index 8d8c678e5..a79a8493d 100644
> > > > --- a/controller/ovn-controller.c
> > > > +++ b/controller/ovn-controller.c
> > > > @@ -18,10 +18,14 @@
> > > >  #include "ovn-controller.h"
> > > >
> > > >  #include <errno.h>
> > > > +#include <fcntl.h>
> > > >  #include <getopt.h>
> > > >  #include <signal.h>
> > > >  #include <stdlib.h>
> > > >  #include <string.h>
> > > > +#include <sys/types.h>
> > > > +#include <sys/stat.h>
> > > > +#include <unistd.h>
> > > >
> > > >  #include "bfd.h"
> > > >  #include "binding.h"
> > > > @@ -85,6 +89,8 @@ static unixctl_cb_func debug_delay_nb_cfg_report;
> > > >
> > > >  #define CONTROLLER_LOOP_STOPWATCH_NAME "ovn-controller-flow-generation"
> > > >
> > > > +static const char *controller_chassis = NULL;
> > > > +
> > > >  static char *parse_options(int argc, char *argv[]);
> > > >  OVS_NO_RETURN static void usage(void);
> > > >
> > > > @@ -260,7 +266,9 @@ out:
> > > >  static const char *
> > > >  br_int_name(const struct ovsrec_open_vswitch *cfg)
> > > >  {
> > > > -    return smap_get_def(&cfg->external_ids, "ovn-bridge", DEFAULT_BRIDGE_NAME);
> > > > +    return get_chassis_external_id_value(
> > > > +        &cfg->external_ids, get_ovs_chassis_id(cfg),
> > > > +        "ovn-bridge", DEFAULT_BRIDGE_NAME);
> > > >  }
> > > >
> > > >  static const struct ovsrec_bridge *
> > > > @@ -361,8 +369,9 @@ process_br_int(struct ovsdb_idl_txn *ovs_idl_txn,
> > > >          const struct ovsrec_open_vswitch *cfg;
> > > >          cfg = ovsrec_open_vswitch_table_first(ovs_table);
> > > >          ovs_assert(cfg);
> > > > -        const char *datapath_type = smap_get(&cfg->external_ids,
> > > > -                                             "ovn-bridge-datapath-type");
> > > > +        const char *datapath_type = get_chassis_external_id_value(
> > > > +            &cfg->external_ids, get_ovs_chassis_id(cfg),
> > > > +            "ovn-bridge-datapath-type", NULL);
> > > >          /* Check for the datapath_type and set it only if it is defined in
> > > >           * cfg. */
> > > >          if (datapath_type && strcmp(br_int->datapath_type, datapath_type)) {
> > > > @@ -372,17 +381,47 @@ process_br_int(struct ovsdb_idl_txn *ovs_idl_txn,
> > > >      return br_int;
> > > >  }
> > > >
> > > > -static const char *
> > > > -get_ovs_chassis_id(const struct ovsrec_open_vswitch_table *ovs_table)
> > > > +static const char *get_file_system_id(void)
> > > >  {
> > > > -    const struct ovsrec_open_vswitch *cfg
> > > > -        = ovsrec_open_vswitch_table_first(ovs_table);
> > > > +    const char *ret = NULL;
> > > > +    char *filename = xasprintf("%s/system-id", ovs_sysconfdir());
> > > > +    errno = 0;
> > > > +    int fd = open(filename, O_RDONLY);
> > > > +    if (fd != -1) {
> > > > +        static char file_system_id[64];
> > >
> > > Why declare as static?
> >
> > Static is to allow it to survive after the function exits. An
> > alternative would be to xstrdup() it and make the caller to free() it.
> > It would work, though since the caller passes the pointer up the
> > stack, we would also have to xstrdup() other values that the caller
> > returns, like controller_chassis, and make all the callers of the
> > caller to free() as needed. This is doable but involves some more leg
> > work in different places of the code base. What's the issue with
> > having a static array here? AFAIU the controller is not multi thread
> > so there should be no issues with parallel access to the array.
> >
> Thanks for the explain. It makes sense to use static.
>
> > >
> > > > +        int nread = read(fd, file_system_id, sizeof file_system_id);
> > > > +        if (nread) {
> > > > +            file_system_id[nread] = '\0';
> > > > +            if (file_system_id[nread - 1] == '\n') {
> > > > +                file_system_id[nread - 1] = '\0';
> > > > +            }
> > > > +            ret = file_system_id;
> > > > +        }
> > > > +        close(fd);
> > > > +    }
> > > > +
> > > > +    free(filename);
> > > > +    return ret;
> > > > +}
> > > > +
> > > > +const char *
> > > > +get_ovs_chassis_id(const struct ovsrec_open_vswitch *cfg)
> > > > +{
> > > > +    if (controller_chassis) {
> > > > +        return controller_chassis;
> > > > +    }
> > > > +
> > > > +    const char *id_from_file = get_file_system_id();
> > > > +    if (id_from_file) {
> > > > +        return id_from_file;
> > > > +    }
> > > > +
> > > >      const char *chassis_id = cfg ? smap_get(&cfg->external_ids, "system-id")
> > > >                                   : NULL;
> > > > -
> > > >      if (!chassis_id) {
> > > >          static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > > > -        VLOG_WARN_RL(&rl, "'system-id' in Open_vSwitch database is missing.");
> > > > +        VLOG_WARN_RL(&rl, "Failed to detect system-id, "
> > > > +                          "configuration not found.");
> > > >      }
> > > >
> > > >      return chassis_id;
> > > > @@ -477,10 +516,12 @@ static int
> > > >  get_ofctrl_probe_interval(struct ovsdb_idl *ovs_idl)
> > > >  {
> > > >      const struct ovsrec_open_vswitch *cfg = ovsrec_open_vswitch_first(ovs_idl);
> > > > -    return !cfg ? OFCTRL_DEFAULT_PROBE_INTERVAL_SEC :
> > > > -                  smap_get_int(&cfg->external_ids,
> > > > -                               "ovn-openflow-probe-interval",
> > > > -                               OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
> > > > +    if (!cfg) {
> > > > +        return OFCTRL_DEFAULT_PROBE_INTERVAL_SEC;
> > > > +    }
> > > > +    return get_chassis_external_id_value_int(
> > > > +        &cfg->external_ids, get_ovs_chassis_id(cfg),
> > > > +        "ovn-openflow-probe-interval", OFCTRL_DEFAULT_PROBE_INTERVAL_SEC);
> > > >  }
> > > >
> > > >  /* Retrieves the pointer to the OVN Southbound database from 'ovs_idl' and
> > > > @@ -496,18 +537,21 @@ update_sb_db(struct ovsdb_idl *ovs_idl, struct ovsdb_idl *ovnsb_idl,
> > > >      }
> > > >
> > > >      /* Set remote based on user configuration. */
> > > > -    const char *remote = smap_get(&cfg->external_ids, "ovn-remote");
> > > > +    const char *chassis_id = get_ovs_chassis_id(cfg);
> > > > +    const char *remote = get_chassis_external_id_value(
> > > > +        &cfg->external_ids, chassis_id, "ovn-remote", NULL);
> > > >      ovsdb_idl_set_remote(ovnsb_idl, remote, true);
> > > >
> > > >      /* Set probe interval, based on user configuration and the remote. */
> > > >      int default_interval = (remote && !stream_or_pstream_needs_probes(remote)
> > > >                              ? 0 : DEFAULT_PROBE_INTERVAL_MSEC);
> > > > -    int interval = smap_get_int(&cfg->external_ids,
> > > > -                                "ovn-remote-probe-interval", default_interval);
> > > > +    int interval = get_chassis_external_id_value_int(
> > > > +        &cfg->external_ids, chassis_id, "ovn-remote-probe-interval",
> > > > +        default_interval);
> > > >      ovsdb_idl_set_probe_interval(ovnsb_idl, interval);
> > > >
> > > > -    bool monitor_all = smap_get_bool(&cfg->external_ids, "ovn-monitor-all",
> > > > -                                     false);
> > > > +    bool monitor_all = get_chassis_external_id_value_bool(
> > > > +        &cfg->external_ids, chassis_id, "ovn-monitor-all", false);
> > > >      if (monitor_all) {
> > > >          /* Always call update_sb_monitors when monitor_all is true.
> > > >           * Otherwise, don't call it here, because there would be unnecessary
> > > > @@ -1166,7 +1210,9 @@ init_binding_ctx(struct engine_node *node,
> > > >      struct ovsrec_bridge_table *bridge_table =
> > > >          (struct ovsrec_bridge_table *)EN_OVSDB_GET(
> > > >              engine_get_input("OVS_bridge", node));
> > > > -    const char *chassis_id = get_ovs_chassis_id(ovs_table);
> > > > +    const struct ovsrec_open_vswitch *cfg =
> > > > +        ovsrec_open_vswitch_table_first(ovs_table);
> > > > +    const char *chassis_id = get_ovs_chassis_id(cfg);
> > > >      const struct ovsrec_bridge *br_int = get_br_int(bridge_table, ovs_table);
> > > >
> > > >      ovs_assert(br_int && chassis_id);
> > > > @@ -2136,6 +2182,17 @@ struct ovn_controller_exit_args {
> > > >      bool *restart;
> > > >  };
> > > >
> > > > +bool
> > > > +is_concurrent_chassis(const struct ovsrec_open_vswitch *cfg)
> > > > +{
> > > > +    if (cfg) {
> > > > +        return get_chassis_external_id_value_bool(
> > > > +            &cfg->external_ids, get_ovs_chassis_id(cfg),
> > > > +            "ovn-is-concurrent", false);
> > > > +    }
> > > > +    return false;
> > >
> > > It may be risky to return false by default. When the ovn-controller just started, before it is connected to local ovsdb, it would regard itself as the only one instance. If the whole purpose of this option is to tell the ovn-controller to not perform some risky behavior then would it be better to return true?
> > >
> > > However, I am not sure about this. I am actually concerned about not performing some necessary cleanups when is_concurrent_chassis() is true. Shall we figure out how to avoid this configuration and always perform the operations properly? For my understanding the tunnel operations should be fine because each instance should own a dedicated bridge. For the patch ports, I think we can include chassis-id information in the patch port's external-ids to make the ownership clear. What do you think?
> >
> > You are right, I haven't realized this is how the resources can be
> > tracked. This simplifies the solution.
> >
> > >
> > > > +}
> > > > +
> > > >  int
> > > >  main(int argc, char *argv[])
> > > >  {
> > > > @@ -2498,7 +2555,9 @@ main(int argc, char *argv[])
> > > >                  sbrec_chassis_private_table_get(ovnsb_idl_loop.idl);
> > > >              const struct ovsrec_bridge *br_int =
> > > >                  process_br_int(ovs_idl_txn, bridge_table, ovs_table);
> > > > -            const char *chassis_id = get_ovs_chassis_id(ovs_table);
> > > > +            const struct ovsrec_open_vswitch *cfg =
> > > > +                ovsrec_open_vswitch_table_first(ovs_table);
> > > > +            const char *chassis_id = get_ovs_chassis_id(cfg);
> > > >              const struct sbrec_chassis *chassis = NULL;
> > > >              const struct sbrec_chassis_private *chassis_private = NULL;
> > > >              if (chassis_id) {
> > > > @@ -2518,7 +2577,7 @@ main(int argc, char *argv[])
> > > >
> > > >                  if (chassis) {
> > > >                      encaps_run(ovs_idl_txn,
> > > > -                               bridge_table, br_int,
> > > > +                               ovs_table, bridge_table, br_int,
> > > >                                 sbrec_chassis_table_get(ovnsb_idl_loop.idl),
> > > >                                 chassis,
> > > >                                 sbrec_sb_global_first(ovnsb_idl_loop.idl),
> > > > @@ -2804,6 +2863,7 @@ parse_options(int argc, char *argv[])
> > > >          STREAM_SSL_LONG_OPTIONS,
> > > >          {"peer-ca-cert", required_argument, NULL, OPT_PEER_CA_CERT},
> > > >          {"bootstrap-ca-cert", required_argument, NULL, OPT_BOOTSTRAP_CA_CERT},
> > > > +        {"chassis", required_argument, NULL, 'n'},
> > > >          {NULL, 0, NULL, 0}
> > > >      };
> > > >      char *short_options = ovs_cmdl_long_options_to_short_options(long_options);
> > > > @@ -2836,6 +2896,10 @@ parse_options(int argc, char *argv[])
> > > >              stream_ssl_set_ca_cert_file(optarg, true);
> > > >              break;
> > > >
> > > > +        case 'n':
> > > > +            controller_chassis = xstrdup(optarg);
> > > > +            break;
> > > > +
> > > >          case '?':
> > > >              exit(EXIT_FAILURE);
> > > >
> > > > diff --git a/controller/ovn-controller.h b/controller/ovn-controller.h
> > > > index 5d9466880..9994dd777 100644
> > > > --- a/controller/ovn-controller.h
> > > > +++ b/controller/ovn-controller.h
> > > > @@ -21,6 +21,7 @@
> > > >  #include "lib/ovn-sb-idl.h"
> > > >
> > > >  struct ovsrec_bridge_table;
> > > > +struct ovsrec_open_vswitch;
> > > >
> > > >  /* Linux supports a maximum of 64K zones, which seems like a fine default. */
> > > >  #define MAX_CT_ZONES 65535
> > > > @@ -87,4 +88,7 @@ enum chassis_tunnel_type {
> > > >
> > > >  uint32_t get_tunnel_type(const char *name);
> > > >
> > > > +const char *get_ovs_chassis_id(const struct ovsrec_open_vswitch *cfg);
> > > > +bool is_concurrent_chassis(const struct ovsrec_open_vswitch *cfg);
> > > > +
> > > >  #endif /* controller/ovn-controller.h */
> > > > diff --git a/controller/patch.c b/controller/patch.c
> > > > index a2a7bcd79..30acd9036 100644
> > > > --- a/controller/patch.c
> > > > +++ b/controller/patch.c
> > > > @@ -157,7 +157,9 @@ add_ovs_bridge_mappings(const struct ovsrec_open_vswitch_table *ovs_table,
> > > >          const char *mappings_cfg;
> > > >          char *cur, *next, *start;
> > > >
> > > > -        mappings_cfg = smap_get(&cfg->external_ids, "ovn-bridge-mappings");
> > > > +        mappings_cfg = get_chassis_external_id_value(
> > > > +            &cfg->external_ids, get_ovs_chassis_id(cfg),
> > > > +            "ovn-bridge-mappings", NULL);
> > > >          if (!mappings_cfg || !mappings_cfg[0]) {
> > > >              return;
> > > >          }
> > > > @@ -317,13 +319,18 @@ patch_run(struct ovsdb_idl_txn *ovs_idl_txn,
> > > >                          port_binding_table, br_int, &existing_ports, chassis,
> > > >                          local_datapaths);
> > > >
> > > > +    const struct ovsrec_open_vswitch *cfg;
> > > > +    cfg = ovsrec_open_vswitch_table_first(ovs_table);
> > > > +
> > > >      /* Now 'existing_ports' only still contains patch ports that exist in the
> > > > -     * database but shouldn't.  Delete them from the database. */
> > > > -    struct shash_node *port_node, *port_next_node;
> > > > -    SHASH_FOR_EACH_SAFE (port_node, port_next_node, &existing_ports) {
> > > > -        port = port_node->data;
> > > > -        shash_delete(&existing_ports, port_node);
> > > > -        remove_port(bridge_table, port);
> > > > +     * database but shouldn't.  Delete them from the database if it's safe. */
> > > > +    if (!is_concurrent_chassis(cfg)) {
> > > > +        struct shash_node *port_node, *port_next_node;
> > > > +        SHASH_FOR_EACH_SAFE (port_node, port_next_node, &existing_ports) {
> > > > +            port = port_node->data;
> > > > +            shash_delete(&existing_ports, port_node);
> > > > +            remove_port(bridge_table, port);
> > > > +        }
> > >
> > > As mentioned above I'd suggest to use external-ids in the patch port to distinguish the chassis, and clean up the patch ports properly.
> >
> > That's a great idea. With that and making tunnel cleanup specific to
> > bridge, we can get rid of the new knob completely.
> >
> > >
> > > >      }
> > > >      shash_destroy(&existing_ports);
> > > >  }
> > > > diff --git a/controller/physical.c b/controller/physical.c
> > > > index 535c77730..a312e6c27 100644
> > > > --- a/controller/physical.c
> > > > +++ b/controller/physical.c
> > > > @@ -424,7 +424,7 @@ populate_remote_chassis_macs(const struct sbrec_chassis *my_chassis,
> > > >          }
> > > >
> > > >          const char *tokens
> > > > -            = get_chassis_mac_mappings(&chassis->other_config);
> > > > +            = get_chassis_mac_mappings(&chassis->other_config, NULL);
> > > >
> > > >          if (!strlen(tokens)) {
> > > >              continue;
> > > > diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> > > > index cdb5e18fb..3193b73db 100644
> > > > --- a/lib/ovn-util.c
> > > > +++ b/lib/ovn-util.c
> > > > @@ -641,3 +641,53 @@ str_tolower(const char *orig)
> > > >
> > > >      return copy;
> > > >  }
> > > > +
> > > > +const char *
> > > > +get_chassis_external_id_value(const struct smap *external_ids,
> > > > +                              const char *chassis_id, const char *option_key,
> > > > +                              const char *def)
> > > > +{
> > > > +    const char *option_value = NULL;
> > > > +    if (chassis_id != NULL) {
> > > > +        char *chassis_option_key = xasprintf("%s-%s", option_key, chassis_id);
> > > > +        option_value = smap_get(external_ids, chassis_option_key);
> > > > +        free(chassis_option_key);
> > > > +    }
> > > > +    if (!option_value) {
> > > > +        option_value = smap_get_def(external_ids, option_key, def);
> > > > +    }
> > > > +    return option_value;
> > > > +}
> > > > +
> > > > +int
> > > > +get_chassis_external_id_value_int(const struct smap *external_ids,
> > > > +                                  const char *chassis_id,
> > > > +                                  const char *option_key,
> > > > +                                  int def)
> > > > +{
> > > > +    const char *value = get_chassis_external_id_value(
> > > > +        external_ids, chassis_id, option_key, NULL);
> > > > +
> > > > +    int i_value;
> > > > +    if (!value || !str_to_int(value, 10, &i_value)) {
> > > > +        return def;
> > > > +    }
> > > > +
> > > > +    return i_value;
> > > > +}
> > > > +
> > > > +bool
> > > > +get_chassis_external_id_value_bool(const struct smap *external_ids,
> > > > +                                   const char *chassis_id,
> > > > +                                   const char *option_key,
> > > > +                                   bool def)
> > > > +{
> > > > +    const char *value = get_chassis_external_id_value(
> > > > +        external_ids, chassis_id, option_key, "");
> > > > +
> > > > +    if (def) {
> > > > +        return strcasecmp("false", value) != 0;
> > >
> > > nit: It is better to be just strcasecmp("false", value) to follow the same style as the else branch.
> > >
> >
> > This is copied from smap_get_bool. I believe on non C99 compliant
> > compilers, returning e.g. -1 as a bool is not the same as returning
> > true as a bool.
> >
> Ack.
>
> >
> > > > +    } else {
> > > > +        return !strcasecmp("true", value);
> > > > +    }
> > > > +}
> > > > diff --git a/lib/ovn-util.h b/lib/ovn-util.h
> > > > index 0f7b501f1..7bca71e86 100644
> > > > --- a/lib/ovn-util.h
> > > > +++ b/lib/ovn-util.h
> > > > @@ -18,6 +18,7 @@
> > > >
> > > >  #include "lib/packets.h"
> > > >  #include "include/ovn/version.h"
> > > > +#include "smap.h"
> > > >
> > > >  #define ovn_set_program_name(name) \
> > > >      ovs_set_program_name(name, OVN_PACKAGE_VERSION)
> > > > @@ -148,6 +149,23 @@ char *normalize_ipv4_prefix(ovs_be32 ipv4, unsigned int plen);
> > > >  char *normalize_ipv6_prefix(struct in6_addr ipv6, unsigned int plen);
> > > >  char *normalize_v46_prefix(const struct v46_ip *prefix, unsigned int plen);
> > > >
> > > > +const char *
> > > > +get_chassis_external_id_value(const struct smap *external_ids,
> > > > +                              const char *chassis_id, const char *option_key,
> > > > +                              const char *def);
> > > > +
> > > > +int
> > > > +get_chassis_external_id_value_int(const struct smap *external_ids,
> > > > +                                  const char *chassis_id,
> > > > +                                  const char *option_key,
> > > > +                                  int def);
> > > > +
> > > > +bool
> > > > +get_chassis_external_id_value_bool(const struct smap *external_ids,
> > > > +                                   const char *chassis_id,
> > > > +                                   const char *option_key,
> > > > +                                   bool def);
> > > > +
> > > >  /* Returns a lowercase copy of orig.
> > > >   * Caller must free the returned string.
> > > >   */
> > > > diff --git a/ovn-sb.xml b/ovn-sb.xml
> > > > index 59888a155..a0518150c 100644
> > > > --- a/ovn-sb.xml
> > > > +++ b/ovn-sb.xml
> > > > @@ -240,10 +240,12 @@
> > > >
> > > >      <column name="name">
> > > >        OVN does not prescribe a particular format for chassis names.
> > > > -      ovn-controller populates this column using <ref key="system-id"
> > > > -      table="Open_vSwitch" column="external_ids" db="Open_vSwitch"/>
> > > > -      in the Open_vSwitch database's <ref table="Open_vSwitch"
> > > > -      db="Open_vSwitch"/> table.  ovn-controller-vtep populates this
> > > > +      ovn-controller populates this column using the <code>system-id</code>
> > > > +      configuration file, or <code>-n</code> CLI argument, or
> > > > +      <ref key="system-id" table="Open_vSwitch" column="external_ids"
> > > > +      db="Open_vSwitch"/> in the Open_vSwitch database's
> > > > +      <ref table="Open_vSwitch" db="Open_vSwitch"/> table.
> > > > +      ovn-controller-vtep populates this
> > > >        column with <ref table="Physical_Switch" column="name"
> > > >        db="hardware_vtep"/> in the hardware_vtep database's
> > > >        <ref table="Physical_Switch" db="hardware_vtep"/> table.
> > > > diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
> > > > index d8061345f..efb48c057 100644
> > > > --- a/tests/ovn-controller.at
> > > > +++ b/tests/ovn-controller.at
> > > > @@ -50,8 +50,7 @@ patch
> > > >  # is mirrored into the Chassis record in the OVN_Southbound db.
> > > >  check_bridge_mappings () {
> > > >      local_mappings=$1
> > > > -    sysid=$(ovs-vsctl get Open_vSwitch . external_ids:system-id)
> > > > -    OVS_WAIT_UNTIL([test x"${local_mappings}" = x$(ovn-sbctl get Chassis ${sysid} other_config:ovn-bridge-mappings | sed -e 's/\"//g')])
> > > > +    OVS_WAIT_UNTIL([test x"${local_mappings}" = x$(ovn-sbctl get Chassis ${sandbox} other_config:ovn-bridge-mappings | sed -e 's/\"//g')])
> > > >  }
> > > >
> > > >  # Initially there should be no patch ports.
> > > > @@ -133,13 +132,13 @@ ovs-vsctl \
> > > >      -- add-br br-eth2
> > > >  ovn_attach n1 br-phys 192.168.0.1
> > > >
> > > > -sysid=$(ovs-vsctl get Open_vSwitch . external_ids:system-id)
> > > > +sysid=${sandbox}
> > > >
> > > >  # Make sure that the datapath_type set in the Bridge table
> > > >  # is mirrored into the Chassis record in the OVN_Southbound db.
> > > >  check_datapath_type () {
> > > >      datapath_type=$1
> > > > -    chassis_datapath_type=$(ovn-sbctl get Chassis ${sysid} other_config:datapath-type | sed -e 's/"//g') #"
> > > > +    chassis_datapath_type=$(ovn-sbctl get Chassis ${sandbox} other_config:datapath-type | sed -e 's/"//g') #"
> > > >      test "${datapath_type}" = "${chassis_datapath_type}"
> > > >  }
> > > >
> > > > @@ -187,7 +186,7 @@ OVS_WAIT_UNTIL([
> > > >      test "${expected_iface_types}" = "${chassis_iface_types}"
> > > >  ])
> > > >
> > > > -# Change the value of external_ids:system-id and make sure it's mirrored
> > > > +# Set the value of external_ids:system-id and make sure it's mirrored
> > > >  # in the Chassis record in the OVN_Southbound database.
> > > >  sysid=${sysid}-foo
> > > >  ovs-vsctl set Open_vSwitch . external-ids:system-id="${sysid}"
> > > > diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
> > > > index c639c0ceb..1005350ab 100644
> > > > --- a/tests/ovn-macros.at
> > > > +++ b/tests/ovn-macros.at
> > > > @@ -215,7 +215,7 @@ net_attach () {
> > > >
> > > >  # ovn_az_attach AZ NETWORK BRIDGE IP [MASKLEN]
> > > >  ovn_az_attach() {
> > > > -    local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24}
> > > > +    local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24} intbr=${6-br-int} chassis=$7
> > > >      net_attach $net $bridge || return 1
> > > >
> > > >      mac=`ovs-vsctl get Interface $bridge mac_in_use | sed s/\"//g`
> > > > @@ -229,15 +229,48 @@ ovn_az_attach() {
> > > >      else
> > > >          ovn_remote=unix:$ovs_base/$az/ovn-sb/ovn-sb.sock
> > > >      fi
> > > > +
> > > > +    if [[ -n "${chassis}" ]]; then
> > > > +        bridge_key=ovn-bridge-${chassis}
> > > > +        remote_key=ovn-remote-${chassis}
> > > > +        encap_type_key=ovn-encap-type-${chassis}
> > > > +        encap_ip_key=ovn-encap-ip-${chassis}
> > > > +        chassis_args="-n $chassis"
> > > > +        chassis_vsctl_args=
> > > > +    else
> > > > +        bridge_key=ovn-bridge
> > > > +        remote_key=ovn-remote
> > > > +        encap_type_key=ovn-encap-type
> > > > +        encap_ip_key=ovn-encap-ip
> > > > +        chassis=$sandbox
> > > > +        chassis_args=
> > > > +        chassis_vsctl_args="-- set Open_vSwitch . external-ids:system-id=$chassis"
> > > > +    fi
> > > > +
> > > >      ovs-vsctl \
> > > > -        -- set Open_vSwitch . external-ids:system-id=$sandbox \
> > > > -        -- set Open_vSwitch . external-ids:ovn-remote=$ovn_remote \
> > > > -        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve,vxlan \
> > > > -        -- set Open_vSwitch . external-ids:ovn-encap-ip=$ip \
> > > > -        -- --may-exist add-br br-int \
> > > > -        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true \
> > > > +        $chassis_vsctl_args \
> > > > +        -- set Open_vSwitch . external-ids:$bridge_key=$intbr \
> > > > +        -- set Open_vSwitch . external-ids:$remote_key=$ovn_remote \
> > > > +        -- set Open_vSwitch . external-ids:$encap_type_key=geneve,vxlan \
> > > > +        -- set Open_vSwitch . external-ids:$encap_ip_key=$ip \
> > > > +        -- --may-exist add-br ${intbr} \
> > > > +        -- set bridge ${intbr} fail-mode=secure other-config:disable-in-band=true \
> > > >          || return 1
> > > > -    start_daemon ovn-controller || return 1
> > > > +
> > > > +    if [[ "${intbr}" = br-int ]]; then
> > > > +        pidfile="${OVS_RUNDIR}/ovn-controller.pid"
> > > > +        logfile="${OVS_LOGDIR}/ovn-controller.log"
> > > > +    else
> > > > +        pidfile="${OVS_RUNDIR}/ovn-controller-${intbr}.pid"
> > > > +        logfile="${OVS_LOGDIR}/ovn-controller-${chassis}.log"
> > > > +    fi
> > > > +
> > > > +    ovn-controller \
> > > > +        ${chassis_args} \
> > > > +        -vconsole:off --detach --no-chdir \
> > > > +        --pidfile=${pidfile} \
> > > > +        --log-file=${logfile} || return 1
> > > > +    on_exit "test -e \"$pidfile\" && kill \`cat \"$pidfile\"\`"
> > > >  }
> > > >
> > > >  # ovn_attach NETWORK BRIDGE IP [MASKLEN]
> > > > diff --git a/tests/ovn.at b/tests/ovn.at
> > > > index 41fe577ff..e731ff520 100644
> > > > --- a/tests/ovn.at
> > > > +++ b/tests/ovn.at
> > > > @@ -1727,7 +1727,108 @@ AT_CLEANUP
> > > >
> > > >  AT_BANNER([OVN end-to-end tests])
> > > >
> > > > -# 3 hypervisors, one logical switch, 3 logical ports per hypervisor
> > > > +AT_SETUP([ovn -- 3 virtual hosts, same node])
> > > > +AT_KEYWORDS([ovn])
> > > > +ovn_start
> > > > +ovn-nbctl ls-add lsw0
> > > > +net_add n1
> > > > +sim_add hv
> > > > +
> > > > +as hv
> > > > +for i in 1 2 3; do
> > > > +    chassis=host-$i
> > > > +    ovs-vsctl add-br br-phys-$i
> > > > +    ovn_attach n1 br-phys-$i 192.168.0.$i 24 br-int-$i $chassis
> > > > +
> > > > +    for j in 1 2 3; do
> > > > +        lpname=lp$i$j
> > > > +        ovs-vsctl add-port br-int-$i vif$i$j -- set Interface vif$i$j external-ids:iface-id=$lpname options:tx_pcap=hv$i/vif$i$j-tx.pcap options:rxq_pcap=hv$i/vif$i$j-rx.pcap ofport-request=$i$j
> > > > +        ovn-nbctl lsp-add lsw0 $lpname
> > > > +        OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up $lpname` = xup])
> > > > +
> > > > +        pb_chassis_id=$(ovn-sbctl --bare --columns chassis list port_binding $lpname)
> > > > +        pb_chassis_name=$(ovn-sbctl get chassis $pb_chassis_id name)
> > > > +        AT_FAIL_IF([test x$pb_chassis_name != x$chassis])
> > > > +    done
> > > > +done
> > > > +
> > > > +for i in 1 2 3; do
> > > > +    > expout
> > > > +    for vif in 1 2 3; do
> > > > +        echo vif$i$vif >> expout
> > > > +    done
> > > > +    AT_CHECK([ovs-vsctl list-ports br-int-$i | grep vif], [0], [expout])
> > > > +done
> > > > +
> > > > +AT_CLEANUP
> > > > +
> > > > +AT_SETUP([ovn -- system-id in file])
> > > > +AT_KEYWORDS([ovn])
> > > > +
> > > > +ovn_start
> > > > +net_add n1
> > > > +sim_add hv
> > > > +
> > > > +as hv
> > > > +
> > > > +echo otherid > ${OVS_SYSCONFDIR}/system-id
> > > > +ovs-vsctl add-br br-phys
> > > > +ovn_attach n1 br-phys 192.168.0.1
> > > > +
> > > > +# system-id file overrides chassis name selected via cli
> > > > +echo otherid > expout
> > > > +AT_CHECK([ovn-sbctl --bare --columns name list chassis], [0], [expout])
> > > > +
> > > > +AT_CLEANUP
> > > > +
> > > > +AT_SETUP([ovn -- ovn-is-concurrent avoids stale resource cleanup])
> > > > +AT_KEYWORDS([ovn])
> > > > +
> > > > +ovn_start
> > > > +sim_add hv
> > > > +
> > > > +for i in 1 2; do
> > > > +    net_add n-$i
> > > > +done
> > > > +
> > > > +as hv
> > > > +ovs-vsctl set open . external_ids:ovn-is-concurrent=true
> > > > +
> > > > +for i in 1 2; do
> > > > +    AT_CHECK([ovn-nbctl ls-add ls-$i])
> > > > +    AT_CHECK([ovn-nbctl lsp-add ls-$i ln_port-$i])
> > > > +    AT_CHECK([ovn-nbctl lsp-set-addresses ln_port-$i unknown])
> > > > +    AT_CHECK([ovn-nbctl lsp-set-type ln_port-$i localnet])
> > > > +    AT_CHECK([ovn-nbctl --wait=hv lsp-set-options ln_port-$i network_name=phys-$i])
> > > > +done
> > > > +
> > > > +for i in 1 2; do
> > > > +    as hv
> > > > +    ovs-vsctl add-br br-phys-$i
> > > > +    ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv-$i=phys-$i:br-phys-$i
> > > > +    ovn_attach n-$i br-phys-$i 192.168.0.$i 24 br-int-$i hv-$i
> > > > +
> > > > +    ovs-vsctl add-port br-int-$i vif-$i -- set Interface vif-$i external-ids:iface-id=lp-$i
> > > > +    ovn-nbctl lsp-add ls-$i lp-$i
> > > > +    OVS_WAIT_UNTIL([test x`ovn-nbctl lsp-get-up lp-$i` = xup])
> > > > +done
> > > > +
> > > > +# check that both patch ports are present
> > > > +AT_CHECK([ovs-vsctl --bare --columns=name find interface type="patch" | awk NF | sort], [0],
> > > > +[[patch-br-int-1-to-ln_port-1
> > > > +patch-br-int-2-to-ln_port-2
> > > > +patch-ln_port-1-to-br-int-1
> > > > +patch-ln_port-2-to-br-int-2
> > > > +]])
> > > > +
> > > > +# check that both tunnel endpoints are present
> > > > +AT_CHECK([ovs-vsctl --bare --columns=name find interface type="geneve" | awk NF | sort], [0],
> > > > +[[ovn-hv-1-0
> > > > +ovn-hv-2-0
> > > > +]])
> > > > +
> > > > +AT_CLEANUP
> > > > +
> > > >  AT_SETUP([ovn -- 3 HVs, 1 LS, 3 lports/HV])
> > > >  AT_KEYWORDS([ovnarp])
> > > >  ovn_start
> > > > --
> > > > 2.26.2
> > > >
> > > > _______________________________________________
> > > > dev mailing list
> > > > dev at openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > >
> >



More information about the dev mailing list