[ovs-dev] [PATCH v5 2/3] ovsdb-idl: Avoid inconsistent IDL state with OVSDB_MONITOR_V3.

Han Zhou hzhou at ovn.org
Wed May 27 01:41:57 UTC 2020


Thanks Dumitru. Please see my comments below.

On Thu, May 7, 2020 at 4:21 AM Dumitru Ceara <dceara at redhat.com> wrote:
>
> Assuming an ovsdb client connected to a database using OVSDB_MONITOR_V3
> (i.e., "monitor_cond_since" method) with the initial monitor condition
> MC1.
>
> Assuming the following two transactions are executed on the
> ovsdb-server:
> TXN1: "insert record R1 in table T1"
> TXN2: "insert record R2 in table T2"
>
> If the client's monitor condition MC1 for table T2 matches R2 then the
> client will receive the following update3 message:
> method="update3", "insert record R2 in table T2", last-txn-id=TXN2
>
> At this point, if the presence of the new record R2 in the IDL triggers
> the client to update its monitor condition to MC2 and add a clause for
> table T1 which matches R1, a monitor_cond_change message is sent to the
> server:
> method="monitor_cond_change", "clauses from MC2"
>
> In normal operation the ovsdb-server will reply with a new update3
> message of the form:
> method="update3", "insert record R1 in table T1", last-txn-id=TXN2
>
> However, if the connection drops in the meantime, this last update might
> get lost.
>
> It might happen that during the reconnect a new transaction happens
> that modifies the original record R1:
> TXN3: "modify record R1 in table T1"
>
> When the client reconnects, it will try to perform a fast resync by
> sending:
> method="monitor_cond_since", "clauses from MC2", last-txn-id=TXN2
>
> Because TXN2 is still in the ovsdb-server transaction history, the
> server replies with the changes from the most recent transactions only,
> i.e., TXN3:
> result="true", last-txbb-id=TXN3, "modify record R1 in table T1"
>
> This causes the IDL on the client in to end up in an inconsistent
> state because it has never seen the update that created R1.
>
> Such a scenario is described in:
> https://bugzilla.redhat.com/show_bug.cgi?id=1808580#c22
>
> To avoid this issue, the IDL will now maintain (up to) 3 different
> types of conditions for each DB table:
> - new_cond: condition that has been set by the IDL client but has
>   not yet been sent to the server through monitor_cond_change.
> - req_cond: condition that has been sent to the server but the reply
>   acknowledging the change hasn't been received yet.
> - ack_cond: condition that has been acknowledged by the server.
>
> Whenever the IDL FSM is restarted (e.g., voluntary or involuntary
> disconnect):
> - if there is a known last_id txn-id the code ensures that new_cond
>   will contain the most recent condition set by the IDL client
>   (either req_cond if there was a request in flight, or new_cond
>   if the IDL client set a condition while the IDL was disconnected)
> - if there is no known last_id txn-id the code ensures that ack_cond will
>   contain the most recent conditions set by the IDL client regardless
>   whether they were acked by the server or not.
>
> When monitor_cond_since/monitor_cond requests are sent they will
> always include ack_cond and if new_cond is not NULL a follow up
> monitor_cond_change will be generated afterwards.
>
> On the other hand ovsdb_idl_db_set_condition() will always modify
new_cond.
>
> This ensures that updates of type "insert" that happened before the last
> transaction known by the IDL but didn't match old monitor conditions are
> sent upon reconnect if the monitor condition has changed to include them
> in the meantime.
>
> CC: Han Zhou <hzhou at ovn.org>
> Fixes: 403a6a0cb003 ("ovsdb-idl: Fast resync from server when connection
reset.")
> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> ---
>  lib/ovsdb-idl-provider.h |    8 ++
>  lib/ovsdb-idl.c          |  148
+++++++++++++++++++++++++++++++++++++++-------
>  tests/ovsdb-idl.at       |   56 +++++++++++++++++
>  3 files changed, 187 insertions(+), 25 deletions(-)
>
> diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
> index 30d1d08..00497d9 100644
> --- a/lib/ovsdb-idl-provider.h
> +++ b/lib/ovsdb-idl-provider.h
> @@ -122,8 +122,12 @@ struct ovsdb_idl_table {
>      unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
>      struct ovs_list indexes;    /* Contains "struct ovsdb_idl_index"s */
>      struct ovs_list track_list; /* Tracked rows
(ovsdb_idl_row.track_node). */
> -    struct ovsdb_idl_condition condition;
> -    bool cond_changed;
> +    struct ovsdb_idl_condition *ack_cond; /* Last condition acked by the
> +                                           * server. */
> +    struct ovsdb_idl_condition *req_cond; /* Last condition requested to
the
> +                                           * server. */
> +    struct ovsdb_idl_condition *new_cond; /* Latest condition set by the
IDL
> +                                           * client. */
>  };
>
>  struct ovsdb_idl_class {
> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> index 1535ad7..557f61c 100644
> --- a/lib/ovsdb-idl.c
> +++ b/lib/ovsdb-idl.c
> @@ -240,6 +240,10 @@ static void ovsdb_idl_send_monitor_request(struct
ovsdb_idl *,
>                                             struct ovsdb_idl_db *,
>                                             enum
ovsdb_idl_monitor_method);
>  static void ovsdb_idl_db_clear(struct ovsdb_idl_db *db);
> +static void ovsdb_idl_db_ack_condition(struct ovsdb_idl_db *db);
> +static void ovsdb_idl_db_sync_condition(struct ovsdb_idl_db *db);
> +static void ovsdb_idl_condition_move(struct ovsdb_idl_condition **dst,
> +                                     struct ovsdb_idl_condition **src);
>
>  struct ovsdb_idl {
>      struct ovsdb_idl_db server;
> @@ -422,9 +426,11 @@ ovsdb_idl_db_init(struct ovsdb_idl_db *db, const
struct ovsdb_idl_class *class,
>              = table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
>              = table->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0;
>          table->db = db;
> -        ovsdb_idl_condition_init(&table->condition);
> -        ovsdb_idl_condition_add_clause_true(&table->condition);
> -        table->cond_changed = false;
> +        table->ack_cond = NULL;
> +        table->req_cond = NULL;
> +        table->new_cond = xmalloc(sizeof *table->new_cond);
> +        ovsdb_idl_condition_init(table->new_cond);
> +        ovsdb_idl_condition_add_clause_true(table->new_cond);
>      }
>      db->monitor_id = json_array_create_2(json_string_create("monid"),
>
json_string_create(class->database));
> @@ -556,12 +562,15 @@ ovsdb_idl_set_shuffle_remotes(struct ovsdb_idl
*idl, bool shuffle)
>  static void
>  ovsdb_idl_db_destroy(struct ovsdb_idl_db *db)
>  {
> +    struct ovsdb_idl_condition *null_cond = NULL;
>      ovs_assert(!db->txn);
>      ovsdb_idl_db_txn_abort_all(db);
>      ovsdb_idl_db_clear(db);
>      for (size_t i = 0; i < db->class_->n_tables; i++) {
>          struct ovsdb_idl_table *table = &db->tables[i];
> -        ovsdb_idl_condition_destroy(&table->condition);
> +        ovsdb_idl_condition_move(&table->ack_cond, &null_cond);
> +        ovsdb_idl_condition_move(&table->req_cond, &null_cond);
> +        ovsdb_idl_condition_move(&table->new_cond, &null_cond);
>          ovsdb_idl_destroy_indexes(table);
>          shash_destroy(&table->columns);
>          hmap_destroy(&table->rows);
> @@ -690,6 +699,12 @@ ovsdb_idl_send_request(struct ovsdb_idl *idl, struct
jsonrpc_msg *request)
>  static void
>  ovsdb_idl_restart_fsm(struct ovsdb_idl *idl)
>  {
> +    /* Resync data DB table conditions to avoid missing updates due to
> +     * conditions that were in flight or changed locally while the
connection
> +     * was down.
> +     */
> +    ovsdb_idl_db_sync_condition(&idl->data);
> +
>      ovsdb_idl_send_schema_request(idl, &idl->server);
>      ovsdb_idl_transition(idl, IDL_S_SERVER_SCHEMA_REQUESTED);
>      idl->data.monitoring = OVSDB_IDL_NOT_MONITORING;
> @@ -797,7 +812,9 @@ ovsdb_idl_process_response(struct ovsdb_idl *idl,
struct jsonrpc_msg *msg)
>           * do, it's a "monitor_cond_change", which means that the
conditional
>           * monitor clauses were updated.
>           *
> -         * If further condition changes were pending, send them now. */
> +         * Mark the last requested conditions and acked and if further

typo: s/and acked/as acked

> +         * condition changes were pending, send them now. */
> +        ovsdb_idl_db_ack_condition(&idl->data);
>          ovsdb_idl_send_cond_change(idl);
>          idl->data.cond_seqno++;
>          break;
> @@ -1493,30 +1510,60 @@ ovsdb_idl_condition_equals(const struct
ovsdb_idl_condition *a,
>  }
>
>  static void
> -ovsdb_idl_condition_clone(struct ovsdb_idl_condition *dst,
> +ovsdb_idl_condition_clone(struct ovsdb_idl_condition **dst,
>                            const struct ovsdb_idl_condition *src)
>  {
> -    ovsdb_idl_condition_init(dst);
> +    if (*dst) {
> +        ovsdb_idl_condition_destroy(*dst);
> +    } else {
> +        *dst = xmalloc(sizeof **dst);
> +    }
> +    ovsdb_idl_condition_init(*dst);
>
> -    dst->is_true = src->is_true;
> +    (*dst)->is_true = src->is_true;
>
>      const struct ovsdb_idl_clause *clause;
>      HMAP_FOR_EACH (clause, hmap_node, &src->clauses) {
> -        ovsdb_idl_condition_add_clause__(dst, clause,
clause->hmap_node.hash);
> +        ovsdb_idl_condition_add_clause__(*dst, clause,
clause->hmap_node.hash);
>      }
>  }
>
> +static void
> +ovsdb_idl_condition_move(struct ovsdb_idl_condition **dst,
> +                         struct ovsdb_idl_condition **src)
> +{
> +    if (*dst) {
> +        ovsdb_idl_condition_destroy(*dst);
> +        free(*dst);
> +    }
> +    *dst = *src;
> +    *src = NULL;
> +}
> +
>  static unsigned int
>  ovsdb_idl_db_set_condition(struct ovsdb_idl_db *db,
>                             const struct ovsdb_idl_table_class *tc,
>                             const struct ovsdb_idl_condition *condition)
>  {
> +    struct ovsdb_idl_condition *table_cond;
>      struct ovsdb_idl_table *table = ovsdb_idl_db_table_from_class(db,
tc);
>      unsigned int seqno = db->cond_seqno;
> -    if (!ovsdb_idl_condition_equals(condition, &table->condition)) {
> -        ovsdb_idl_condition_destroy(&table->condition);
> -        ovsdb_idl_condition_clone(&table->condition, condition);
> -        db->cond_changed = table->cond_changed = true;
> +
> +    /* Compare the new condition to the last known condition which can be
> +     * either "new" (not sent yet), "requested" or "acked", in this
order.
> +     */
> +    if (table->new_cond) {
> +        table_cond = table->new_cond;
> +    } else if (table->req_cond) {
> +        table_cond = table->req_cond;
> +    } else {
> +        table_cond = table->ack_cond;
> +    }
> +    ovs_assert(table_cond);
> +
> +    if (!ovsdb_idl_condition_equals(condition, table_cond)) {
> +        ovsdb_idl_condition_clone(&table->new_cond, condition);
> +        db->cond_changed = true;
>          poll_immediate_wake();
>          return seqno + 1;
>      }
> @@ -1561,9 +1608,8 @@ ovsdb_idl_condition_to_json(const struct
ovsdb_idl_condition *cnd)
>  }
>
>  static struct json *
> -ovsdb_idl_create_cond_change_req(struct ovsdb_idl_table *table)
> +ovsdb_idl_create_cond_change_req(const struct ovsdb_idl_condition *cond)
>  {
> -    const struct ovsdb_idl_condition *cond = &table->condition;
>      struct json *monitor_cond_change_request = json_object_create();
>      struct json *cond_json = ovsdb_idl_condition_to_json(cond);
>
> @@ -1583,8 +1629,9 @@ ovsdb_idl_db_compose_cond_change(struct
ovsdb_idl_db *db)
>      for (size_t i = 0; i < db->class_->n_tables; i++) {
>          struct ovsdb_idl_table *table = &db->tables[i];
>
> -        if (table->cond_changed) {
> -            struct json *req = ovsdb_idl_create_cond_change_req(table);
> +        if (table->new_cond) {
> +            struct json *req =
> +                ovsdb_idl_create_cond_change_req(table->new_cond);
>              if (req) {
>                  if (!monitor_cond_change_requests) {
>                      monitor_cond_change_requests = json_object_create();
> @@ -1593,7 +1640,11 @@ ovsdb_idl_db_compose_cond_change(struct
ovsdb_idl_db *db)
>                               table->class_->name,
>                               json_array_create_1(req));
>              }
> -            table->cond_changed = false;
> +            /* Mark the new condition as requested by moving it to
req_cond.
> +             * If there's already requested condition that's a bug.
> +             */
> +            ovs_assert(table->req_cond == NULL);
> +            ovsdb_idl_condition_move(&table->req_cond, &table->new_cond);
>          }
>      }
>
> @@ -1608,6 +1659,60 @@ ovsdb_idl_db_compose_cond_change(struct
ovsdb_idl_db *db)
>      return jsonrpc_create_request("monitor_cond_change", params, NULL);
>  }
>
> +/* Marks all requested table conditions in 'db' as acked by the server.
> + * It should be called when the server replies to monitor_cond_change
> + * requests.
> + */
> +static void
> +ovsdb_idl_db_ack_condition(struct ovsdb_idl_db *db)
> +{
> +    for (size_t i = 0; i < db->class_->n_tables; i++) {
> +        struct ovsdb_idl_table *table = &db->tables[i];
> +
> +        if (table->req_cond) {
> +            ovsdb_idl_condition_move(&table->ack_cond, &table->req_cond);
> +        }
> +    }
> +}
> +
> +/* Should be called when the IDL fsm is restarted and resyncs table
conditions
> + * based on the state the DB is in:
> + * - if a non-zero last_id is available for the DB then upon reconnect,
> + *   if the IDL will use monitor_cond_since, it should first request
acked

This sentence is a little bit confusing. "if the IDL will use
monitor_cond_since," - I think if last_id is non-zero it is definitely
using monitor_cond_since, right? The code doesn't have this check either.

> + *   conditions to avoid missing updates about records that were added
before
> + *   the transaction with txn-id == last_id. If there were requested
> + *   condition changes in flight (i.e., req_cond not NULL) and the IDL
> + *   client didn't set new conditions (i.e., new_cond is NULL) then move
> + *   req_cond to new_cond to trigger a follow up cond_change request.
> + * - if there's no last_id available for the DB then it's safe to use the
> + *   latest conditions set by the IDL client even if they weren't acked
yet.
> + */
> +static void
> +ovsdb_idl_db_sync_condition(struct ovsdb_idl_db *db)
> +{
> +    bool ack_all = uuid_is_zero(&db->last_id);
> +
> +    db->cond_changed = false;
> +    for (size_t i = 0; i < db->class_->n_tables; i++) {
> +        struct ovsdb_idl_table *table = &db->tables[i];
> +
> +        if (ack_all) {
> +            if (table->new_cond) {
> +                ovsdb_idl_condition_move(&table->req_cond,
&table->new_cond);
> +            }
> +
> +            if (table->req_cond) {
> +                ovsdb_idl_condition_move(&table->ack_cond,
&table->req_cond);
> +            }
> +        } else {
> +            if (table->req_cond && !table->new_cond) {
> +                ovsdb_idl_condition_move(&table->new_cond,
&table->req_cond);
> +                db->cond_changed = true;

What if new_cond is not NULL? Or it is impossible? If it is impossible,
shall we assert it instead of adding in the if check?

> +            }
> +        }
> +    }
> +}
> +
>  static void
>  ovsdb_idl_send_cond_change(struct ovsdb_idl *idl)
>  {
> @@ -2062,13 +2167,12 @@ ovsdb_idl_send_monitor_request(struct ovsdb_idl
*idl, struct ovsdb_idl_db *db,
>              monitor_request = json_object_create();
>              json_object_put(monitor_request, "columns", columns);
>
> -            const struct ovsdb_idl_condition *cond = &table->condition;
> +            const struct ovsdb_idl_condition *cond = table->ack_cond;
>              if ((monitor_method == OVSDB_IDL_MM_MONITOR_COND ||
>                   monitor_method == OVSDB_IDL_MM_MONITOR_COND_SINCE) &&
> -                !ovsdb_idl_condition_is_true(cond)) {
> +                cond && !ovsdb_idl_condition_is_true(cond)) {
>                  json_object_put(monitor_request, "where",
>                                  ovsdb_idl_condition_to_json(cond));
> -                table->cond_changed = false;
>              }
>              json_object_put(monitor_requests, tc->name,
>                              json_array_create_1(monitor_request));
> @@ -2076,8 +2180,6 @@ ovsdb_idl_send_monitor_request(struct ovsdb_idl
*idl, struct ovsdb_idl_db *db,
>      }
>      free_schema(schema);
>
> -    db->cond_changed = false;
> -
>      struct json *params = json_array_create_3(
>                                json_string_create(db->class_->database),
>                                json_clone(db->monitor_id),
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index b5cbee7..4efed88 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -1828,3 +1828,59 @@ m4_define([OVSDB_CHECK_IDL_LEADER_ONLY_PY],
>
>  OVSDB_CHECK_IDL_LEADER_ONLY_PY([Check Python IDL connects to leader], 3,
['remote'])
>  OVSDB_CHECK_IDL_LEADER_ONLY_PY([Check Python IDL reconnects to leader],
3, ['remote' '+remotestop' 'remote'])
> +
> +# same as OVSDB_CHECK_IDL but uses C IDL implementation with tcp
> +# with multiple remotes.
> +m4_define([OVSDB_CHECK_CLUSTER_IDL_C],
> +  [AT_SETUP([$1 - C - tcp])
> +   AT_KEYWORDS([ovsdb server idl positive tcp socket $5])
> +   m4_define([LPBK],[127.0.0.1])
> +   AT_CHECK([ovsdb_cluster_start_idltest $2 "ptcp:0:"LPBK])
> +   PARSE_LISTENING_PORT([s1.log], [TCP_PORT_1])
> +   PARSE_LISTENING_PORT([s2.log], [TCP_PORT_2])
> +   PARSE_LISTENING_PORT([s3.log], [TCP_PORT_3])
> +   remotes=tcp:LPBK:$TCP_PORT_1,tcp:LPBK:$TCP_PORT_2,tcp:LPBK:$TCP_PORT_3
> +
> +   m4_if([$3], [], [],
> +     [AT_CHECK([ovsdb-client transact $remotes $3], [0], [ignore],
[ignore])])
> +   AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc
-t10 idl tcp:LPBK:$TCP_PORT_1 $4],
> +            [0], [stdout], [ignore])
> +   AT_CHECK([sort stdout | uuidfilt]m4_if([$7],,, [[| $7]]),
> +            [0], [$5])
> +   AT_CLEANUP])
> +
> +# Checks that monitor_cond_since works fine when disconnects happen
> +# with cond_change requests in flight (i.e., IDL is properly updated).
> +OVSDB_CHECK_CLUSTER_IDL_C([simple idl, monitor_cond_since, cluster
disconnect],
> +  3,
> +  [['["idltest",
> +       {"op": "insert",
> +       "table": "simple",
> +       "row": {"i": 1,
> +               "r": 1.0,
> +               "b": true}},
> +       {"op": "insert",
> +       "table": "simple",
> +       "row": {"i": 2,
> +               "r": 1.0,
> +               "b": true}}]']],
> +  [['condition simple []' \
> +    'condition simple [["i","==",2]]' \
> +    'condition simple [["i","==",1]]' \
> +    '+reconnect' \
> +    '["idltest",
> +      {"op": "update",
> +       "table": "simple",
> +       "where": [["i", "==", 1]],
> +       "row": {"r": 2.0 }}]']],
> +  [[000: change conditions
> +001: empty
> +002: change conditions
> +003: i=2 r=1 b=true s= u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1>
> +004: change conditions
> +005: reconnect
> +006: i=2 r=1 b=true s= u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1>
> +007: {"error":null,"result":[{"count":1}]}
> +008: i=1 r=2 b=true s= u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<2>
> +009: done
> +]])
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list