[ovs-dev] [PATCH ovn v2] Provide the option to pin ovn-controller and ovn-northd to a specific version.

Numan Siddique numans at ovn.org
Wed Nov 18 13:25:52 UTC 2020


On Wed, Nov 18, 2020 at 6:52 PM Dumitru Ceara <dceara at redhat.com> wrote:
>
> On 11/18/20 2:08 PM, Numan Siddique wrote:
> > On Wed, Nov 18, 2020 at 5:23 PM Dumitru Ceara <dceara at redhat.com> wrote:
> >>
> >> On 11/17/20 12:57 PM, numans at ovn.org wrote:
> >>> From: Numan Siddique <numans at ovn.org>
> >>>
> >>> OVN recommends updating/upgrading ovn-controllers first and then
> >>> ovn-northd and OVN DB ovsdb-servers.  This is to ensure that any
> >>> new functionality specified by the database or logical flows created
> >>> by ovn-northd is understood by ovn-controller.
> >>>
> >>> However certain deployments may upgrade ovn-northd services first and
> >>> then ovn-controllers.  In a large scal deployment, this can result in
> >>> downtime during upgrades as old ovn-controllers may not understand
> >>> new logical flows or new actions added by ovn-northd.
> >>>
> >>> Even with upgrading ovn-controllers first, can result in ovn-controllers
> >>> rejecting some of the logical flows if an existing OVN action is
> >>> changed.  One such example is ct_commit action which recently was updated
> >>> to take new arguments.
> >>>
> >>> To avoid such downtimes during upgrades, this patch adds the
> >>> functionality of pinning ovn-controller and ovn-northd to a specific
> >>> version. An internal OVN version is generated and this version is stored
> >>> by ovn-northd in the Southbound SB_Global table's
> >>> options:northd_internal_version.
> >>>
> >>> When ovn-controller notices that the internal version has changed, it
> >>> stops handling the database changes - both Southbound and OVS. All the
> >>> existing OF flows are preserved.  When ovn-controller is upgraded to the
> >>> same version as ovn-northd services, it will process the database
> >>> changes.
> >>>
> >>> This feature is made optional and disabled by default. Any CMS can
> >>> enable it by configuring the OVS local database with the option -
> >>> ovn-pin-to-northd=false.
> >>>
> >>> Signed-off-by: Numan Siddique <numans at ovn.org>
> >>> ---
> >>
> >> Hi Numan,
> >>
> >> Next to Flavio's comments I have some too.
> >>
> >
> > Thanks Flavio and Dumitru for the reviews.
> >
> >
> >>>
> >>> v1 -> v2
> >>> ----
> >>>   * Added test cases and missing documentation.
> >>>   * Renamed the ovsdb option from ignore_northd_verison
> >>>     to ovn-pin-to-northd.
> >>>   * Added NEWS item.
> >>>
> >>>  NEWS                            |   3 +
> >>>  controller/ovn-controller.8.xml |  11 +++
> >>>  controller/ovn-controller.c     |  58 ++++++++++++-
> >>>  lib/ovn-util.c                  |  17 ++++
> >>>  lib/ovn-util.h                  |   4 +
> >>>  northd/ovn-northd.c             |  18 +++-
> >>>  tests/ovn.at                    | 147 ++++++++++++++++++++++++++++++++
> >>>  7 files changed, 251 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/NEWS b/NEWS
> >>> index 6010230679..bb95724b36 100644
> >>> --- a/NEWS
> >>> +++ b/NEWS
> >>> @@ -8,6 +8,9 @@ Post-v20.09.0
> >>>       server.
> >>>     - Support other_config:vlan-passthru=true to allow VLAN tagged incoming
> >>>       traffic.
> >>> +   - Support version pinning between ovn-northd and ovn-controller as an
> >>> +     option. If the option is enabled and the versions don't match,
> >>> +     ovn-controller will not process any DB changes.
> >>>
> >>>  OVN v20.09.0 - 28 Sep 2020
> >>>  --------------------------
> >>> diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
> >>> index 16bc47b205..357edd1f5c 100644
> >>> --- a/controller/ovn-controller.8.xml
> >>> +++ b/controller/ovn-controller.8.xml
> >>> @@ -233,6 +233,17 @@
> >>>          The boolean flag indicates if the chassis is used as an
> >>>          interconnection gateway.
> >>>        </dd>
> >>> +
> >>> +      <dt><code>external_ids:ovn-pin-to-northd</code></dt>
> >>
> >> This still seems misleading to me.  We don't actually pin to a northd
> >> version, we pin to an OVN internal version.  What if we call it
> >> "ovn-pin-to-version"?
> >
> > It's true that we pin to the OVN internal version, but my idea is that
> > ovn-northd writes
> > its internal version to SB_Global. And ovn-controller reads the northd
> > version and compares
> > with its own version.
> >
> > Few other points
> >   - I think ovn-pin-to-northd (or ovn-pin-to-northd-version) makes
> > sense from a CMS point of view.
> >   - CMS doesn't need to know about the internal version mechanism used.
> >   - In production deployments when northd services are updated first,
> > ovn-northd will update its new
> >     internal version to SB_Global.
> >   - And the idea of this patch is to pin the controller to the same
> > northd version.
>
> So I guess this translates to "make ovn-controller reject SB contents if
> they were created by the wrong northd version".
>
> >
> > So I'd prefer to call it ovn-pin-to-northd.
> >
>
> With that in mind, and considering your explanation, what about
> "ovn-match-northd-version"?

Sounds better. Thanks.

>
> >
> >>
> >>> +      <dd>
> >>> +        The boolean flag indicates if <code>ovn-controller</code> needs to
> >>> +        be pinned to the exact <code>ovn-northd</code> version. If this
> >>> +        flag is set to true and the <code>ovn-northd's</code> version (reported
> >>> +        in the Southbound database) doesn't match with the
> >>> +        <code>ovn-controller's</code> internal version, then it will stop
> >>> +        processing the Southbound and local Open vSwitch database changes.
> >>> +        The default value is considered false if this option is not defined.
> >>> +      </dd>
> >>>      </dl>
> >>>
> >>>      <p>
> >>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> >>> index a06cae3ccb..a7344ea9c4 100644
> >>> --- a/controller/ovn-controller.c
> >>> +++ b/controller/ovn-controller.c
> >>> @@ -2136,6 +2136,37 @@ struct ovn_controller_exit_args {
> >>>      bool *restart;
> >>>  };
> >>>
> >>> +static bool
> >>> +pin_to_northd(struct ovsdb_idl *ovs_idl)
> >>> +{
> >>> +    const struct ovsrec_open_vswitch *cfg = ovsrec_open_vswitch_first(ovs_idl);
> >>> +    return !cfg ? false : smap_get_bool(&cfg->external_ids,
> >>> +                                        "ovn-pin-to-northd", false);
> >>> +}
> >>> +
> >>> +/* Returns false if the northd internal version and ovn-controller internal
> >>> + * version doesn't match.
> >>> + */
> >>> +static bool
> >>> +check_northd_version(const struct sbrec_sb_global *sb, const char *my_version)
> >>> +{
> >>> +    if (!sb) {
> >>> +        return false;
> >>> +    }
> >>> +
> >>> +    const char *northd_version =
> >>> +        smap_get_def(&sb->options, "northd_internal_version", "");
> >>> +
> >>> +    bool mismatch = strcmp(northd_version, my_version);
> >>> +    if (mismatch) {
> >>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
> >>> +        VLOG_WARN_RL(&rl, "controller version - %s mismatch with northd "
> >>> +                     "version - %s", my_version, northd_version);
> >>> +    }
> >>> +
> >>> +    return mismatch;
> >>> +}
> >>> +
> >>>  int
> >>>  main(int argc, char *argv[])
> >>>  {
> >>> @@ -2428,10 +2459,14 @@ main(int argc, char *argv[])
> >>>          .enable_lflow_cache = true
> >>>      };
> >>>
> >>> +    char *ovn_internal_version = ovn_get_internal_version();
> >>> +    VLOG_INFO("OVN internal version is : [%s]", ovn_internal_version);
> >>> +
> >>>      /* Main loop. */
> >>>      exiting = false;
> >>>      restart = false;
> >>>      bool sb_monitor_all = false;
> >>> +    bool version_mismatch_with_northd = false;
> >>>      while (!exiting) {
> >>>          /* If we're paused just run the unixctl server and skip most of the
> >>>           * processing loop.
> >>> @@ -2442,8 +2477,6 @@ main(int argc, char *argv[])
> >>>              goto loop_done;
> >>>          }
> >>>
> >>> -        engine_init_run();
> >>> -
> >>
> >> I'm not sure why it's better to move engine_init_run() lower.  I think
> >> it can stay here.
> >
> > Ack. I'm fine either way.
> >
> >>
> >>>          struct ovsdb_idl_txn *ovs_idl_txn = ovsdb_idl_loop_run(&ovs_idl_loop);
> >>>          unsigned int new_ovs_cond_seqno
> >>>              = ovsdb_idl_get_condition_seqno(ovs_idl_loop.idl);
> >>> @@ -2473,6 +2506,22 @@ main(int argc, char *argv[])
> >>>              ovnsb_cond_seqno = new_ovnsb_cond_seqno;
> >>>          }
> >>>
> >>> +        bool version_mismatched = false;
> >>> +        if (pin_to_northd(ovs_idl_loop.idl)) {
> >>> +            version_mismatched = check_northd_version(
> >>> +                sbrec_sb_global_first(ovnsb_idl_loop.idl),
> >>> +                ovn_internal_version);
> >>> +        } else {
> >>> +            version_mismatched = false;
> >>> +        }
> >>> +
> >>> +        if (version_mismatch_with_northd != version_mismatched) {
> >>> +            version_mismatch_with_northd = version_mismatched;
> >>> +            engine_set_force_recompute(true);
> >>> +        }
> >>> +
> >>> +        engine_init_run();
> >>> +
> >>
> >> I think we can make the version checking more contained, what do you
> >> think about the following incremental?
> >>
> >> https://github.com/dceara/ovn/commit/29c382137eb974aff3ea080b8098b2919189e6e3
> >
> > It looks fine to me. I'd replace check_version with
> > check_northd_version based on the reasons I gave above.
> >
>
> Ok.
>
> >
> >>
> >>>          struct engine_context eng_ctx = {
> >>>              .ovs_idl_txn = ovs_idl_txn,
> >>>              .ovnsb_idl_txn = ovnsb_idl_txn,
> >>> @@ -2481,7 +2530,8 @@ main(int argc, char *argv[])
> >>>
> >>>          engine_set_context(&eng_ctx);
> >>>
> >>> -        if (ovsdb_idl_has_ever_connected(ovnsb_idl_loop.idl)) {
> >>> +        if (ovsdb_idl_has_ever_connected(ovnsb_idl_loop.idl) &&
> >>> +            !version_mismatch_with_northd) {
> >>>              /* Contains the transport zones that this Chassis belongs to */
> >>>              struct sset transport_zones = SSET_INITIALIZER(&transport_zones);
> >>>              sset_from_delimited_string(&transport_zones,
> >>> @@ -2770,6 +2820,7 @@ loop_done:
> >>>          }
> >>>      }
> >>>
> >>> +    free(ovn_internal_version);
> >>>      unixctl_server_destroy(unixctl);
> >>>      lflow_destroy();
> >>>      ofctrl_destroy();
> >>> @@ -2822,6 +2873,7 @@ parse_options(int argc, char *argv[])
> >>>
> >>>          case 'V':
> >>>              ovs_print_version(OFP15_VERSION, OFP15_VERSION);
> >>> +            printf("SB DB Schema %s\n", sbrec_get_db_version());
> >>>              exit(EXIT_SUCCESS);
> >>>
> >>>          VLOG_OPTION_HANDLERS
> >>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
> >>> index 18aac8da34..9dc9ade3b4 100644
> >>> --- a/lib/ovn-util.c
> >>> +++ b/lib/ovn-util.c
> >>> @@ -20,6 +20,7 @@
> >>>  #include <unistd.h>
> >>>
> >>>  #include "daemon.h"
> >>> +#include "include/ovn/actions.h"
> >>>  #include "openvswitch/ofp-parse.h"
> >>>  #include "openvswitch/vlog.h"
> >>>  #include "ovn-dirs.h"
> >>> @@ -713,3 +714,19 @@ ip_address_and_port_from_lb_key(const char *key, char **ip_address,
> >>>      *addr_family = ss.ss_family;
> >>>      return true;
> >>>  }
> >>> +
> >>> +#define N_OVN_ACTIONS N_OVNACTS
> >>> +BUILD_ASSERT_DECL(N_OVN_ACTIONS == 47);
> >>> +
> >>> +/* Increment this for any logical flow changes or if existing OVN action is
> >>> + * modified. */
> >>
> >> Except for this comment here (that can be easily overlooked), we don't
> >> mention anywhere else that the minor version should be incremented when
> >> changing existing OVN actions.
> >>
> >> I'm not sure if we could automatically generate the minor version but in
> >> any case we should make this more visible (both for contributors and
> >> reviewers).
> >
> > Maybe we should never update an existing action (from now on ) ? If
> > there is a need we can
> > add a new action  ?
> >
> > Or can we come up with a way to version the ovnacts ? Right now we
> > have a 'pad' field in the struct ovnact.
> > I think we can rename it to version. Whenever an action is updated we
> > need to increment this field.
> > And the OVN_INTERNAL_MINOR_VER will be the sum of all the versions of ovnacts ?
>
> There would still be no easy way to enforce this, except at review time
> but it's probably better than with a hardcoded OVN_INTERNAL_MINOR_VER value.

That's true.


>
> >
> > If this sounds reasonable I can look into this as a follow up patch.
> >
>
> I think this would be nice to have.

I will explore if it is straightforward enough to be included in v3.

Thanks for the reviews.

Numan

>
> >>
> >> I'd suggest at least updating the comment in actions.h where we define
> >> OVNACTS.
> >
> > Ack.
> >
> >>
> >> Maybe we should also mention this in one of the internals/*.rst docs.
> >> Is Documentation/internals/submitting-patches.rst the best place to
> >> mention this?  I see we have a section for "Feature Deprecation
> >> Guidelines", maybe we can add one for "OVN compatibility" or "OVN Upgrade"?
> >
> > I'll see if we can add more documentation about it here.
> >
>
> Thanks!
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


More information about the dev mailing list