[ovs-dev] [PATCH] ovsdb-idl: Force reconnect when missing updates encountered.

Mark Michelson mmichels at redhat.com
Thu Apr 2 20:44:10 UTC 2020


On 3/26/20 5:22 PM, Dumitru Ceara wrote:
> When processing update2 messages, if the IDL detects that previous
> updates were missed, clear db->last_id and trigger a reconnect such that
> the IDL refreshes its complete view of the database.
> 
> Such scenarios can happen for example when bugs in
> ovsdb-server/ovsdb-idl are encountered. One such situation is reported
> in https://bugzilla.redhat.com/show_bug.cgi?id=1808580.
> 
> This commit doesn't fix the root cause of the bug above but adds the
> recovery mechanism to avoid staying in an inconsistent state for ever.
> 
> CC: Andy Zhou <azhou at ovn.org>
> Fixes: db2b5757328c ("lib: add monitor2 support in ovsdb-idl.")
> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> ---
>   lib/ovsdb-idl.c | 37 +++++++++++++++++++++++++++++++++----
>   1 file changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> index 190143f..b47c0c8 100644
> --- a/lib/ovsdb-idl.c
> +++ b/lib/ovsdb-idl.c
> @@ -324,7 +324,8 @@ static bool ovsdb_idl_process_update(struct ovsdb_idl_table *,
>   static bool ovsdb_idl_process_update2(struct ovsdb_idl_table *,
>                                         const struct uuid *,
>                                         const char *operation,
> -                                      const struct json *row);
> +                                      const struct json *row,
> +                                      bool *inconsistent);
>   static void ovsdb_idl_insert_row(struct ovsdb_idl_row *, const struct json *);
>   static void ovsdb_idl_delete_row(struct ovsdb_idl_row *);
>   static bool ovsdb_idl_modify_row(struct ovsdb_idl_row *, const struct json *);
> @@ -2320,10 +2321,26 @@ ovsdb_idl_db_parse_update__(struct ovsdb_idl_db *db,
>                       row = shash_find_data(json_object(row_update), operation);
>   
>                       if (row)  {
> +                        bool inconsistent = false;
> +
>                           if (ovsdb_idl_process_update2(table, &uuid, operation,
> -                                                      row)) {
> +                                                      row, &inconsistent)) {
>                               db->change_seqno++;
>                           }
> +
> +                        /* If the db is in an inconsistent state, clear the
> +                         * db->last_id and retry to retrieve the complete DB.
> +                         */
> +                        if (inconsistent) {
> +                            memset(&db->last_id, 0, sizeof db->last_id);
> +                            ovsdb_idl_retry(db->idl);
> +                            return ovsdb_syntax_error(row_update, NULL,
> +                                                      "<row_update2> "
> +                                                      "for table \"%s\" "
> +                                                      "cannot be processed: "
> +                                                      "missed updates",
> +                                                      table->class_->name);

A couple of notes on this error:

1) The error message here is a bit redundant since there's already a 
warning printed in each of the cases that "inconsistent" gets set true. 
It may make more sense to state what's being done as a result of the 
inconsistent state instead of the fact that there's a problem.

2) I don't think this is a syntax error. It's probably better to use 
ovsdb_error() or ovsdb_internal_error(). ovsdb_internal_error() prints a 
backtrace, and doesn't actually appear to be used anywhere, so it may be 
better to just use ovsdb_error().

> +                        }
>                           break;
>                       }
>                   }
> @@ -2447,16 +2464,26 @@ ovsdb_idl_process_update(struct ovsdb_idl_table *table,
>   }
>   
>   /* Returns true if a column with mode OVSDB_IDL_MODE_RW changed, false
> - * otherwise. */
> + * otherwise.
> + *
> + * NOTE: When processing the "modify" updates, the IDL can determine that
> + * previous updates were missed (e.g., due to bugs) and that rows that don't
> + * exist locally should be updated. This indicates that the
> + * IDL is in an inconsistent state and, unlike in ovsdb_idl_process_update(),
> + * the missing rows cannot be inserted. If this is the case, 'inconsistent'
> + * is set to true to indicate the catastrophic failure.
> + */
>   static bool
>   ovsdb_idl_process_update2(struct ovsdb_idl_table *table,
>                             const struct uuid *uuid,
>                             const char *operation,
> -                          const struct json *json_row)
> +                          const struct json *json_row,
> +                          bool *inconsistent)
>   {
>       struct ovsdb_idl_row *row;
>   
>       row = ovsdb_idl_get_row(table, uuid);
> +    *inconsistent = false;
>       if (!strcmp(operation, "delete")) {
>           /* Delete row. */
>           if (row && !ovsdb_idl_row_is_orphan(row)) {
> @@ -2488,11 +2515,13 @@ ovsdb_idl_process_update2(struct ovsdb_idl_table *table,
>                   VLOG_WARN_RL(&semantic_rl, "cannot modify missing but "
>                                "referenced row "UUID_FMT" in table %s",
>                                UUID_ARGS(uuid), table->class_->name);
> +                *inconsistent = true;
>                   return false;
>               }
>           } else {
>               VLOG_WARN_RL(&semantic_rl, "cannot modify missing row "UUID_FMT" "
>                            "in table %s", UUID_ARGS(uuid), table->class_->name);
> +            *inconsistent = true;
>               return false;
>           }
>       } else {
> 



More information about the dev mailing list