[ovs-dev] [PATCH ovn] [RFC] Add chassis liveness monitoring mechanism
Dumitru Ceara
dceara at redhat.com
Thu Feb 20 09:18:51 UTC 2020
On 2/19/20 4:37 PM, lmartins at redhat.com wrote:
> From: Lucas Alvares Gomes <lucasagomes at gmail.com>
>
> NOTE: SENDING THIS PATCH AS A RFC TO SEE WHAT OTHERS MIGHT THINK ABOUT
> THE IDEA PRIOR TO WRITTING TESTS TO IT.
>
> CMSes integrating with OVN often uses the nb_cfg mechanism as a way to
> check the health status of the ovn-controller process but, the current
> implementation isn't ideal for that purpose because it floods the control
> plane with update notifications every time the nb_cfg value is
> incremented.
>
> This patch is merging two ideas:
>
> 1) Han Zhou proposed a patch [0] creating a table called Chassis_Private
> where each hypervisor *only* writes and monitors its own record to
> avoid this flooding problem.
>
> 2) Having ovn-controller to periodically log that it's alive instead of
> relying on the nb_cfg mechanism.
>
> By using this mechanism, a CMS can more easily just read this new
> Chassis_Private table and figure out the status of each Chassis
> (ovn-controller) in the cluster.
>
Hi Lucas,
Thanks a lot for working on this!
> Here's some reasons why I believe this approach is better than having
> to bump the nb_cfg value:
>
> 1) Simple integration. Before, the CMS had to increment the nb_cfg
> value in the NB_Global table and wait for it to be propagated to the
> Southbound database (ovn-northd and then ovn-controllers) Chassis
> table in order to know the status of the Chassis.
>
> Now, it can just read the Chassis_Private table and compare the
> alive_at column against the current time.
I guess the only comment I have about this approach is that it doesn't
feel completely OK to store a timestamp representation as string in
"alive_at". I'm afraid this might be too inflexible and force CMSes to
compare their local time with the value in this column.
I know we discussed offline that using a counter that is periodically
incremented by ovn-controller instead of a timestamp would complicate
the implementation in Openstack but maybe other people on this mailing
list have ideas on how to deal with this in a more generic way.
>
> 2) Less costy. Just one read from the db is needed, no writing. Code
> using the nb_cfg mechanism had to implement a few safe-guard code to
> make less error prone. See [1] and [2] for example.
>
> 3) Backwards compatibility. The patch [0] was moving the nb_cfg value
> to new table so, systems relying on it would need to update their
> code upon updating OVN.
Nice!
One more minor comment on the code, inline.
Thanks,
Dumitru
>
> To enable this new mechanism set an option called
> chassis_liveness_interval=N to the "options" column in the NB_Global
> table, where N is the time (in seconds) in which the ovn-controller
> should updates the alive_at column. If not set or set to 0, the
> mechanism is disabled. By default, it's disabled.
>
> [0] https://patchwork.ozlabs.org/patch/899608/
> [1] https://review.opendev.org/#/c/703612/
> [2] https://review.opendev.org/#/c/704530/
>
> Co-authored-by: Han Zhou <hzhou8 at ebay.com>
> Co-authored-by: Numan Siddique <numans at ovn.org>
> Co-authored-by: Dumitru Ceara <dceara at redhat.com>
> Signed-off-by: Lucas Alvares Gomes <lucasagomes at gmail.com>
> ---
> controller/chassis.c | 56 +++++++++++++++++++++++++++++++++++--
> controller/chassis.h | 11 ++++++--
> controller/ovn-controller.c | 44 +++++++++++++++++++++++++++--
> lib/chassis-index.c | 26 +++++++++++++++++
> lib/chassis-index.h | 5 ++++
> northd/ovn-northd.c | 4 +++
> ovn-sb.ovsschema | 13 +++++++--
> ovn-sb.xml | 35 +++++++++++++++++++++++
> 8 files changed, 185 insertions(+), 9 deletions(-)
>
> diff --git a/controller/chassis.c b/controller/chassis.c
> index 522893ead..7892cb966 100644
> --- a/controller/chassis.c
> +++ b/controller/chassis.c
> @@ -24,6 +24,7 @@
> #include "openvswitch/dynamic-string.h"
> #include "openvswitch/vlog.h"
> #include "openvswitch/ofp-parse.h"
> +#include "openvswitch/poll-loop.h"
> #include "lib/chassis-index.h"
> #include "lib/ovn-sb-idl.h"
> #include "ovn-controller.h"
> @@ -47,11 +48,16 @@ struct chassis_info {
>
> /* True if Chassis SB record is initialized, false otherwise. */
> uint32_t id_inited : 1;
> +
> + /* Next livennes update time to update the 'alive_at' column of
> + * Chassis_Private table. */
> + long long int next_liveness_update;
> };
>
> static struct chassis_info chassis_state = {
> .id = DS_EMPTY_INITIALIZER,
> .id_inited = false,
> + .next_liveness_update = LLONG_MAX,
> };
>
> static void
> @@ -581,18 +587,38 @@ chassis_update(const struct sbrec_chassis *chassis_rec,
> free(encaps);
> }
>
> +static void
> +chassis_update_liveness_cfg(const struct sbrec_chassis_private *chassis_rec,
> + struct chassis_info *ch_info,
> + int chassis_liveness_interval)
> +{
> + if (ch_info->next_liveness_update == LLONG_MAX ||
> + time_msec() > ch_info->next_liveness_update) {
> + sbrec_chassis_private_set_alive_at(
> + chassis_rec,
> + xastrftime_msec("%Y-%m-%d %H:%M:%S", time_wall_msec(), true));
> + ch_info->next_liveness_update =
> + time_msec() + chassis_liveness_interval * 1000;
> + }
> +}
> +
> /* Returns this chassis's Chassis record, if it is available. */
> const struct sbrec_chassis *
> chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
> struct ovsdb_idl_index *sbrec_chassis_by_name,
> + struct ovsdb_idl_index *sbrec_chassis_private_by_name,
> const struct ovsrec_open_vswitch_table *ovs_table,
> const struct sbrec_chassis_table *chassis_table,
> const char *chassis_id,
> const struct ovsrec_bridge *br_int,
> - const struct sset *transport_zones)
> + const struct sset *transport_zones,
> + const struct sbrec_chassis_private **chassis_private,
> + int chassis_liveness_interval)
> {
> struct ovs_chassis_cfg ovs_cfg;
>
> + *chassis_private = NULL;
> +
> /* Get the chassis config from the ovs table. */
> ovs_chassis_cfg_init(&ovs_cfg);
> if (!chassis_parse_ovs_config(ovs_table, br_int, &ovs_cfg)) {
> @@ -616,6 +642,24 @@ chassis_run(struct ovsdb_idl_txn *ovnsb_idl_txn,
> chassis_id);
> }
>
> + /* Create the Chassis_Private entry if it's not there yet */
> + const struct sbrec_chassis_private *chassis_private_rec =
> + chassis_private_lookup_by_name(sbrec_chassis_private_by_name,
> + chassis_id);
> + if (!chassis_private_rec && ovnsb_idl_txn) {
> + chassis_private_rec = sbrec_chassis_private_insert(ovnsb_idl_txn);
> + sbrec_chassis_private_set_name(chassis_private_rec, chassis_id);
> + }
> +
> + /* Update the alive_at column from the Chassis_Private table if
> + * the chassis liveness mechanism is enabled */
> + if (chassis_private_rec && ovnsb_idl_txn && chassis_liveness_interval) {
> + chassis_update_liveness_cfg(chassis_private_rec, &chassis_state,
> + chassis_liveness_interval);
> + }
> +
> + *chassis_private = chassis_private_rec;
> +
> ovs_chassis_cfg_destroy(&ovs_cfg);
> return chassis_rec;
> }
> @@ -669,7 +713,8 @@ chassis_get_mac(const struct sbrec_chassis *chassis_rec,
> * required. */
> bool
> chassis_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
> - const struct sbrec_chassis *chassis_rec)
> + const struct sbrec_chassis *chassis_rec,
> + const struct sbrec_chassis_private *chassis_private_rec)
> {
> if (!chassis_rec) {
> return true;
> @@ -679,6 +724,7 @@ chassis_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
> "ovn-controller: unregistering chassis '%s'",
> chassis_rec->name);
> sbrec_chassis_delete(chassis_rec);
> + sbrec_chassis_private_delete(chassis_private_rec);
> }
> return false;
> }
> @@ -695,3 +741,9 @@ chassis_get_id(void)
>
> return NULL;
> }
> +
> +void chassis_wait(int chassis_liveness_interval) {
> + if (chassis_liveness_interval) {
> + poll_timer_wait_until(chassis_state.next_liveness_update);
> + }
> +}
> diff --git a/controller/chassis.h b/controller/chassis.h
> index 178d2957e..38a549553 100644
> --- a/controller/chassis.h
> +++ b/controller/chassis.h
> @@ -17,6 +17,7 @@
> #define OVN_CHASSIS_H 1
>
> #include <stdbool.h>
> +#include "lib/ovn-sb-idl.h"
>
> struct ovsdb_idl;
> struct ovsdb_idl_index;
> @@ -33,17 +34,21 @@ void chassis_register_ovs_idl(struct ovsdb_idl *);
> const struct sbrec_chassis *chassis_run(
> struct ovsdb_idl_txn *ovnsb_idl_txn,
> struct ovsdb_idl_index *sbrec_chassis_by_name,
> + struct ovsdb_idl_index *sbrec_chassis_private_by_name,
> const struct ovsrec_open_vswitch_table *,
> const struct sbrec_chassis_table *,
> const char *chassis_id, const struct ovsrec_bridge *br_int,
> - const struct sset *transport_zones);
> + const struct sset *transport_zones,
> + const struct sbrec_chassis_private **chassis_private,
> + int chassis_liveness_interval);
> bool chassis_cleanup(struct ovsdb_idl_txn *ovnsb_idl_txn,
> - const struct sbrec_chassis *);
> + const struct sbrec_chassis *,
> + const struct sbrec_chassis_private *);
> bool chassis_get_mac(const struct sbrec_chassis *chassis,
> const char *bridge_mapping,
> struct eth_addr *chassis_mac);
> const char *chassis_get_id(void);
> const char * get_chassis_mac_mappings(const struct smap *ext_ids);
> -
> +void chassis_wait(int chassis_liveness_interval);
>
> #endif /* controller/chassis.h */
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 4d245ca28..fd9d95da2 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -155,6 +155,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> struct ovsdb_idl_condition ce = OVSDB_IDL_CONDITION_INIT(&ce);
> struct ovsdb_idl_condition ip_mcast = OVSDB_IDL_CONDITION_INIT(&ip_mcast);
> struct ovsdb_idl_condition igmp = OVSDB_IDL_CONDITION_INIT(&igmp);
> + struct ovsdb_idl_condition chprv = OVSDB_IDL_CONDITION_INIT(&chprv);
>
> if (monitor_all) {
> ovsdb_idl_condition_add_clause_true(&pb);
> @@ -165,6 +166,7 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> ovsdb_idl_condition_add_clause_true(&ce);
> ovsdb_idl_condition_add_clause_true(&ip_mcast);
> ovsdb_idl_condition_add_clause_true(&igmp);
> + ovsdb_idl_condition_add_clause_true(&chprv);
> goto out;
> }
>
> @@ -196,6 +198,10 @@ update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
> &chassis->header_.uuid);
> sbrec_igmp_group_add_clause_chassis(&igmp, OVSDB_F_EQ,
> &chassis->header_.uuid);
> +
> + /* Monitors Chassis_Private record for current chassis only */
> + sbrec_chassis_private_add_clause_name(&chprv, OVSDB_F_EQ,
> + chassis->name);
> }
> if (local_ifaces) {
> const char *name;
> @@ -229,6 +235,7 @@ out:
> sbrec_controller_event_set_condition(ovnsb_idl, &ce);
> sbrec_ip_multicast_set_condition(ovnsb_idl, &ip_mcast);
> sbrec_igmp_group_set_condition(ovnsb_idl, &igmp);
> + sbrec_chassis_private_set_condition(ovnsb_idl, &chprv);
> ovsdb_idl_condition_destroy(&pb);
> ovsdb_idl_condition_destroy(&lf);
> ovsdb_idl_condition_destroy(&mb);
> @@ -237,6 +244,7 @@ out:
> ovsdb_idl_condition_destroy(&ce);
> ovsdb_idl_condition_destroy(&ip_mcast);
> ovsdb_idl_condition_destroy(&igmp);
> + ovsdb_idl_condition_destroy(&chprv);
> }
>
> static const char *
> @@ -687,6 +695,15 @@ get_transport_zones(const struct ovsrec_open_vswitch_table *ovs_table)
> return smap_get_def(&cfg->external_ids, "ovn-transport-zones", "");
> }
>
> +static uint16_t
> +get_chassis_liveness_interval(const struct sbrec_sb_global_table
> + *sb_global_table)
> +{
> + const struct sbrec_sb_global *sb
> + = sbrec_sb_global_table_first(sb_global_table);
> + return atoi(smap_get_def(&sb->options, "chassis_liveness_interval", "0"));
We have smap_get_int in smap.h which does exactly this with a bit more
error checking.
> +}
> +
> static void
> ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
> {
> @@ -1760,6 +1777,8 @@ main(int argc, char *argv[])
>
> struct ovsdb_idl_index *sbrec_chassis_by_name
> = chassis_index_create(ovnsb_idl_loop.idl);
> + struct ovsdb_idl_index *sbrec_chassis_private_by_name
> + = chassis_private_index_create(ovnsb_idl_loop.idl);
> struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath
> = mcast_group_index_create(ovnsb_idl_loop.idl);
> struct ovsdb_idl_index *sbrec_port_binding_by_name
> @@ -1821,6 +1840,10 @@ main(int argc, char *argv[])
> ovsdb_idl_omit(ovnsb_idl_loop.idl, &sbrec_connection_col_status);
> ovsdb_idl_omit(ovnsb_idl_loop.idl, &sbrec_connection_col_target);
>
> + /* Do not monitor the Chassis_Private external_ids column */
> + ovsdb_idl_omit(ovnsb_idl_loop.idl,
> + &sbrec_chassis_private_col_external_ids);
> +
> update_sb_monitors(ovnsb_idl_loop.idl, NULL, NULL, NULL, false);
>
> stopwatch_create(CONTROLLER_LOOP_STOPWATCH_NAME, SW_MS);
> @@ -1998,10 +2021,16 @@ main(int argc, char *argv[])
> process_br_int(ovs_idl_txn, bridge_table, ovs_table);
> const char *chassis_id = get_ovs_chassis_id(ovs_table);
> const struct sbrec_chassis *chassis = NULL;
> + const struct sbrec_chassis_private *chassis_private = NULL;
> + uint16_t chassis_liveness_interval = get_chassis_liveness_interval(
> + sbrec_sb_global_table_get(ovnsb_idl_loop.idl));
> if (chassis_id) {
> chassis = chassis_run(ovnsb_idl_txn, sbrec_chassis_by_name,
> + sbrec_chassis_private_by_name,
> ovs_table, chassis_table, chassis_id,
> - br_int, &transport_zones);
> + br_int, &transport_zones,
> + &chassis_private,
> + chassis_liveness_interval);
> }
>
> if (br_int) {
> @@ -2164,6 +2193,10 @@ main(int argc, char *argv[])
> ofctrl_wait();
> pinctrl_wait(ovnsb_idl_txn);
> }
> +
> + if (chassis) {
> + chassis_wait(chassis_liveness_interval);
> + }
> }
>
> unixctl_server_run(unixctl);
> @@ -2232,10 +2265,17 @@ main(int argc, char *argv[])
> ? chassis_lookup_by_name(sbrec_chassis_by_name, chassis_id)
> : NULL);
>
> + const struct sbrec_chassis_private *chassis_private
> + = (chassis_id
> + ? chassis_private_lookup_by_name(
> + sbrec_chassis_private_by_name, chassis_id)
> + : NULL);
> +
> /* Run all of the cleanup functions, even if one of them returns
> * false. We're done if all of them return true. */
> done = binding_cleanup(ovnsb_idl_txn, port_binding_table, chassis);
> - done = chassis_cleanup(ovnsb_idl_txn, chassis) && done;
> + done = chassis_cleanup(ovnsb_idl_txn, chassis,
> + chassis_private) && done;
> done = encaps_cleanup(ovs_idl_txn, br_int) && done;
> done = igmp_group_cleanup(ovnsb_idl_txn, sbrec_igmp_group) && done;
> if (done) {
> diff --git a/lib/chassis-index.c b/lib/chassis-index.c
> index 39066f4cc..d9bbba8e4 100644
> --- a/lib/chassis-index.c
> +++ b/lib/chassis-index.c
> @@ -40,6 +40,32 @@ chassis_lookup_by_name(struct ovsdb_idl_index *sbrec_chassis_by_name,
> return retval;
> }
>
> +struct ovsdb_idl_index *
> +chassis_private_index_create(struct ovsdb_idl *idl)
> +{
> + return ovsdb_idl_index_create1(
> + idl, &sbrec_chassis_private_col_name);
> +}
> +
> +/* Finds and returns the chassis with the given 'name', or NULL if no such
> + * chassis exists. */
> +const struct sbrec_chassis_private *
> +chassis_private_lookup_by_name(
> + struct ovsdb_idl_index *sbrec_chassis_private_by_name,
> + const char *name)
> +{
> + struct sbrec_chassis_private *target = \
> + sbrec_chassis_private_index_init_row(sbrec_chassis_private_by_name);
> + sbrec_chassis_private_index_set_name(target, name);
> +
> + struct sbrec_chassis_private *retval = sbrec_chassis_private_index_find(
> + sbrec_chassis_private_by_name, target);
> +
> + sbrec_chassis_private_index_destroy_row(target);
> +
> + return retval;
> +}
> +
> struct ovsdb_idl_index *
> ha_chassis_group_index_create(struct ovsdb_idl *idl)
> {
> diff --git a/lib/chassis-index.h b/lib/chassis-index.h
> index 302e5f0fd..77d2e1f99 100644
> --- a/lib/chassis-index.h
> +++ b/lib/chassis-index.h
> @@ -23,6 +23,11 @@ struct ovsdb_idl_index *chassis_index_create(struct ovsdb_idl *);
> const struct sbrec_chassis *chassis_lookup_by_name(
> struct ovsdb_idl_index *sbrec_chassis_by_name, const char *name);
>
> +struct ovsdb_idl_index *chassis_private_index_create(struct ovsdb_idl *);
> +
> +const struct sbrec_chassis_private *chassis_private_lookup_by_name(
> + struct ovsdb_idl_index *sbrec_chassis_private_by_name, const char *name);
> +
> struct ovsdb_idl_index *ha_chassis_group_index_create(struct ovsdb_idl *idl);
> const struct sbrec_ha_chassis_group *ha_chassis_group_lookup_by_name(
> struct ovsdb_idl_index *sbrec_ha_chassis_grp_by_name, const char *name);
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 2580b4ec9..1b16f5ef6 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -11555,6 +11555,10 @@ main(int argc, char *argv[])
> ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_name);
> ovsdb_idl_add_column(ovnsb_idl_loop.idl, &sbrec_chassis_col_external_ids);
>
> + ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_chassis_private);
> + add_column_noalert(ovnsb_idl_loop.idl,
> + &sbrec_chassis_private_col_external_ids);
> +
> ovsdb_idl_add_table(ovnsb_idl_loop.idl, &sbrec_table_ha_chassis);
> add_column_noalert(ovnsb_idl_loop.idl,
> &sbrec_ha_chassis_col_chassis);
> diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema
> index d89f8dbbb..88da05849 100644
> --- a/ovn-sb.ovsschema
> +++ b/ovn-sb.ovsschema
> @@ -1,7 +1,7 @@
> {
> "name": "OVN_Southbound",
> - "version": "2.7.0",
> - "cksum": "4286723485 21693",
> + "version": "2.8.0",
> + "cksum": "293994447 22064",
> "tables": {
> "SB_Global": {
> "columns": {
> @@ -43,6 +43,15 @@
> "max": "unlimited"}}},
> "isRoot": true,
> "indexes": [["name"]]},
> + "Chassis_Private": {
> + "columns": {
> + "name": {"type": "string"},
> + "alive_at": {"type": "string"},
> + "external_ids": {
> + "type": {"key": "string", "value": "string",
> + "min": 0, "max": "unlimited"}}},
> + "isRoot": true,
> + "indexes": [["name"]]},
> "Encap": {
> "columns": {
> "type": {"type": {"key": {
> diff --git a/ovn-sb.xml b/ovn-sb.xml
> index 3ae9d4f92..8c9822fdb 100644
> --- a/ovn-sb.xml
> +++ b/ovn-sb.xml
> @@ -366,6 +366,41 @@
> </group>
> </table>
>
> + <table name="Chassis_Private" title="Chassis Private">
> + <p>
> + Each row in this table maintains per chassis private data that are
> + accessed only by the owning chassis (write only) and ovn-northd, not by
> + any other chassis. These data are stored in this separate table instead
> + of the <ref table="Chassis"/> table for performance considerations:
> + the rows in this table can be conditionally monitored by chassises so
> + that each chassis only get update notifications for its own row, to avoid
> + unnecessary chassis private data update flooding in a large scale
> + deployment. (Future: this separation can be avoided if ovsdb conditional
> + monitoring is supported on a set of columns)
> + </p>
> +
> + <column name="name">
> + The name of the chassis that owns these chassis-private data.
> + </column>
> +
> + <column name="alive_at">
> + A timestamp indicating the last time ovn-controller
> + signalized it's alive. For monitoring purposes, setting the
> + chassis_liveness_interval configuration to the OVN_Northhbound's
> + <ref table="NB_Global" column="options"/> with an integer value (in
> + seconds) causes the <code>ovn-controller</code> to update this
> + column with a timestamp every N seconds. If the value is not set
> + or 0, then <code>ovn-controller</code> doesn't update this column.
> + </column>
> +
> + <group title="Common Columns">
> + The overall purpose of these columns is described under <code>Common
> + Columns</code> at the beginning of this document.
> +
> + <column name="external_ids"/>
> + </group>
> + </table>
> +
> <table name="Encap" title="Encapsulation Types">
> <p>
> The <ref column="encaps" table="Chassis"/> column in the <ref
>
More information about the dev
mailing list