[ovs-dev] [PATCH v5 2/3] ovsdb-idl: Avoid inconsistent IDL state with OVSDB_MONITOR_V3.
Han Zhou
hzhou at ovn.org
Wed May 27 01:41:57 UTC 2020
Thanks Dumitru. Please see my comments below.
On Thu, May 7, 2020 at 4:21 AM Dumitru Ceara <dceara at redhat.com> wrote:
>
> Assuming an ovsdb client connected to a database using OVSDB_MONITOR_V3
> (i.e., "monitor_cond_since" method) with the initial monitor condition
> MC1.
>
> Assuming the following two transactions are executed on the
> ovsdb-server:
> TXN1: "insert record R1 in table T1"
> TXN2: "insert record R2 in table T2"
>
> If the client's monitor condition MC1 for table T2 matches R2 then the
> client will receive the following update3 message:
> method="update3", "insert record R2 in table T2", last-txn-id=TXN2
>
> At this point, if the presence of the new record R2 in the IDL triggers
> the client to update its monitor condition to MC2 and add a clause for
> table T1 which matches R1, a monitor_cond_change message is sent to the
> server:
> method="monitor_cond_change", "clauses from MC2"
>
> In normal operation the ovsdb-server will reply with a new update3
> message of the form:
> method="update3", "insert record R1 in table T1", last-txn-id=TXN2
>
> However, if the connection drops in the meantime, this last update might
> get lost.
>
> It might happen that during the reconnect a new transaction happens
> that modifies the original record R1:
> TXN3: "modify record R1 in table T1"
>
> When the client reconnects, it will try to perform a fast resync by
> sending:
> method="monitor_cond_since", "clauses from MC2", last-txn-id=TXN2
>
> Because TXN2 is still in the ovsdb-server transaction history, the
> server replies with the changes from the most recent transactions only,
> i.e., TXN3:
> result="true", last-txn-id=TXN3, "modify record R1 in table T1"
>
> This causes the IDL on the client to end up in an inconsistent
> state because it has never seen the update that created R1.
>
> Such a scenario is described in:
> https://bugzilla.redhat.com/show_bug.cgi?id=1808580#c22
>
> To avoid this issue, the IDL will now maintain (up to) 3 different
> types of conditions for each DB table:
> - new_cond: condition that has been set by the IDL client but has
> not yet been sent to the server through monitor_cond_change.
> - req_cond: condition that has been sent to the server but the reply
> acknowledging the change hasn't been received yet.
> - ack_cond: condition that has been acknowledged by the server.
>
> Whenever the IDL FSM is restarted (e.g., voluntary or involuntary
> disconnect):
> - if there is a known last_id txn-id the code ensures that new_cond
> will contain the most recent condition set by the IDL client
> (either req_cond if there was a request in flight, or new_cond
> if the IDL client set a condition while the IDL was disconnected)
> - if there is no known last_id txn-id the code ensures that ack_cond will
> contain the most recent conditions set by the IDL client regardless of
> whether they were acked by the server or not.
>
> When monitor_cond_since/monitor_cond requests are sent they will
> always include ack_cond and if new_cond is not NULL a follow up
> monitor_cond_change will be generated afterwards.
>
> On the other hand ovsdb_idl_db_set_condition() will always modify new_cond.
>
> This ensures that updates of type "insert" that happened before the last
> transaction known by the IDL but didn't match old monitor conditions are
> sent upon reconnect if the monitor condition has changed to include them
> in the meantime.
>
> CC: Han Zhou <hzhou at ovn.org>
> Fixes: 403a6a0cb003 ("ovsdb-idl: Fast resync from server when connection reset.")
> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> ---
> lib/ovsdb-idl-provider.h | 8 ++
> lib/ovsdb-idl.c | 148 +++++++++++++++++++++++++++++++++++++++-------
> tests/ovsdb-idl.at | 56 +++++++++++++++++
> 3 files changed, 187 insertions(+), 25 deletions(-)
>
> diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
> index 30d1d08..00497d9 100644
> --- a/lib/ovsdb-idl-provider.h
> +++ b/lib/ovsdb-idl-provider.h
> @@ -122,8 +122,12 @@ struct ovsdb_idl_table {
> unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
> struct ovs_list indexes; /* Contains "struct ovsdb_idl_index"s */
> struct ovs_list track_list; /* Tracked rows (ovsdb_idl_row.track_node). */
> - struct ovsdb_idl_condition condition;
> - bool cond_changed;
> + struct ovsdb_idl_condition *ack_cond; /* Last condition acked by the
> + * server. */
> + struct ovsdb_idl_condition *req_cond; /* Last condition requested to the
> + * server. */
> + struct ovsdb_idl_condition *new_cond; /* Latest condition set by the IDL
> + * client. */
> };
>
> struct ovsdb_idl_class {
> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> index 1535ad7..557f61c 100644
> --- a/lib/ovsdb-idl.c
> +++ b/lib/ovsdb-idl.c
> @@ -240,6 +240,10 @@ static void ovsdb_idl_send_monitor_request(struct ovsdb_idl *,
> struct ovsdb_idl_db *,
> enum ovsdb_idl_monitor_method);
> static void ovsdb_idl_db_clear(struct ovsdb_idl_db *db);
> +static void ovsdb_idl_db_ack_condition(struct ovsdb_idl_db *db);
> +static void ovsdb_idl_db_sync_condition(struct ovsdb_idl_db *db);
> +static void ovsdb_idl_condition_move(struct ovsdb_idl_condition **dst,
> + struct ovsdb_idl_condition **src);
>
> struct ovsdb_idl {
> struct ovsdb_idl_db server;
> @@ -422,9 +426,11 @@ ovsdb_idl_db_init(struct ovsdb_idl_db *db, const struct ovsdb_idl_class *class,
> = table->change_seqno[OVSDB_IDL_CHANGE_MODIFY]
> = table->change_seqno[OVSDB_IDL_CHANGE_DELETE] = 0;
> table->db = db;
> - ovsdb_idl_condition_init(&table->condition);
> - ovsdb_idl_condition_add_clause_true(&table->condition);
> - table->cond_changed = false;
> + table->ack_cond = NULL;
> + table->req_cond = NULL;
> + table->new_cond = xmalloc(sizeof *table->new_cond);
> + ovsdb_idl_condition_init(table->new_cond);
> + ovsdb_idl_condition_add_clause_true(table->new_cond);
> }
> db->monitor_id = json_array_create_2(json_string_create("monid"),
> json_string_create(class->database));
> @@ -556,12 +562,15 @@ ovsdb_idl_set_shuffle_remotes(struct ovsdb_idl *idl, bool shuffle)
> static void
> ovsdb_idl_db_destroy(struct ovsdb_idl_db *db)
> {
> + struct ovsdb_idl_condition *null_cond = NULL;
> ovs_assert(!db->txn);
> ovsdb_idl_db_txn_abort_all(db);
> ovsdb_idl_db_clear(db);
> for (size_t i = 0; i < db->class_->n_tables; i++) {
> struct ovsdb_idl_table *table = &db->tables[i];
> - ovsdb_idl_condition_destroy(&table->condition);
> + ovsdb_idl_condition_move(&table->ack_cond, &null_cond);
> + ovsdb_idl_condition_move(&table->req_cond, &null_cond);
> + ovsdb_idl_condition_move(&table->new_cond, &null_cond);
> ovsdb_idl_destroy_indexes(table);
> shash_destroy(&table->columns);
> hmap_destroy(&table->rows);
> @@ -690,6 +699,12 @@ ovsdb_idl_send_request(struct ovsdb_idl *idl, struct jsonrpc_msg *request)
> static void
> ovsdb_idl_restart_fsm(struct ovsdb_idl *idl)
> {
> + /* Resync data DB table conditions to avoid missing updates due to
> + * conditions that were in flight or changed locally while the connection
> + * was down.
> + */
> + ovsdb_idl_db_sync_condition(&idl->data);
> +
> ovsdb_idl_send_schema_request(idl, &idl->server);
> ovsdb_idl_transition(idl, IDL_S_SERVER_SCHEMA_REQUESTED);
> idl->data.monitoring = OVSDB_IDL_NOT_MONITORING;
> @@ -797,7 +812,9 @@ ovsdb_idl_process_response(struct ovsdb_idl *idl, struct jsonrpc_msg *msg)
> * do, it's a "monitor_cond_change", which means that the conditional
> * monitor clauses were updated.
> *
> - * If further condition changes were pending, send them now. */
> + * Mark the last requested conditions and acked and if further
typo: s/and acked/as acked
> + * condition changes were pending, send them now. */
> + ovsdb_idl_db_ack_condition(&idl->data);
> ovsdb_idl_send_cond_change(idl);
> idl->data.cond_seqno++;
> break;
> @@ -1493,30 +1510,60 @@ ovsdb_idl_condition_equals(const struct ovsdb_idl_condition *a,
> }
>
> static void
> -ovsdb_idl_condition_clone(struct ovsdb_idl_condition *dst,
> +ovsdb_idl_condition_clone(struct ovsdb_idl_condition **dst,
> const struct ovsdb_idl_condition *src)
> {
> - ovsdb_idl_condition_init(dst);
> + if (*dst) {
> + ovsdb_idl_condition_destroy(*dst);
> + } else {
> + *dst = xmalloc(sizeof **dst);
> + }
> + ovsdb_idl_condition_init(*dst);
>
> - dst->is_true = src->is_true;
> + (*dst)->is_true = src->is_true;
>
> const struct ovsdb_idl_clause *clause;
> HMAP_FOR_EACH (clause, hmap_node, &src->clauses) {
> - ovsdb_idl_condition_add_clause__(dst, clause, clause->hmap_node.hash);
> + ovsdb_idl_condition_add_clause__(*dst, clause, clause->hmap_node.hash);
> }
> }
>
> +static void
> +ovsdb_idl_condition_move(struct ovsdb_idl_condition **dst,
> + struct ovsdb_idl_condition **src)
> +{
> + if (*dst) {
> + ovsdb_idl_condition_destroy(*dst);
> + free(*dst);
> + }
> + *dst = *src;
> + *src = NULL;
> +}
> +
> static unsigned int
> ovsdb_idl_db_set_condition(struct ovsdb_idl_db *db,
> const struct ovsdb_idl_table_class *tc,
> const struct ovsdb_idl_condition *condition)
> {
> + struct ovsdb_idl_condition *table_cond;
> struct ovsdb_idl_table *table = ovsdb_idl_db_table_from_class(db, tc);
> unsigned int seqno = db->cond_seqno;
> - if (!ovsdb_idl_condition_equals(condition, &table->condition)) {
> - ovsdb_idl_condition_destroy(&table->condition);
> - ovsdb_idl_condition_clone(&table->condition, condition);
> - db->cond_changed = table->cond_changed = true;
> +
> + /* Compare the new condition to the last known condition which can be
> + * either "new" (not sent yet), "requested" or "acked", in this order.
> + */
> + if (table->new_cond) {
> + table_cond = table->new_cond;
> + } else if (table->req_cond) {
> + table_cond = table->req_cond;
> + } else {
> + table_cond = table->ack_cond;
> + }
> + ovs_assert(table_cond);
> +
> + if (!ovsdb_idl_condition_equals(condition, table_cond)) {
> + ovsdb_idl_condition_clone(&table->new_cond, condition);
> + db->cond_changed = true;
> poll_immediate_wake();
> return seqno + 1;
> }
> @@ -1561,9 +1608,8 @@ ovsdb_idl_condition_to_json(const struct ovsdb_idl_condition *cnd)
> }
>
> static struct json *
> -ovsdb_idl_create_cond_change_req(struct ovsdb_idl_table *table)
> +ovsdb_idl_create_cond_change_req(const struct ovsdb_idl_condition *cond)
> {
> - const struct ovsdb_idl_condition *cond = &table->condition;
> struct json *monitor_cond_change_request = json_object_create();
> struct json *cond_json = ovsdb_idl_condition_to_json(cond);
>
> @@ -1583,8 +1629,9 @@ ovsdb_idl_db_compose_cond_change(struct ovsdb_idl_db *db)
> for (size_t i = 0; i < db->class_->n_tables; i++) {
> struct ovsdb_idl_table *table = &db->tables[i];
>
> - if (table->cond_changed) {
> - struct json *req = ovsdb_idl_create_cond_change_req(table);
> + if (table->new_cond) {
> + struct json *req =
> + ovsdb_idl_create_cond_change_req(table->new_cond);
> if (req) {
> if (!monitor_cond_change_requests) {
> monitor_cond_change_requests = json_object_create();
> @@ -1593,7 +1640,11 @@ ovsdb_idl_db_compose_cond_change(struct ovsdb_idl_db *db)
> table->class_->name,
> json_array_create_1(req));
> }
> - table->cond_changed = false;
> + /* Mark the new condition as requested by moving it to req_cond.
> + * If there's already a requested condition, that's a bug.
> + */
> + ovs_assert(table->req_cond == NULL);
> + ovsdb_idl_condition_move(&table->req_cond, &table->new_cond);
> }
> }
>
> @@ -1608,6 +1659,60 @@ ovsdb_idl_db_compose_cond_change(struct ovsdb_idl_db *db)
> return jsonrpc_create_request("monitor_cond_change", params, NULL);
> }
>
> +/* Marks all requested table conditions in 'db' as acked by the server.
> + * It should be called when the server replies to monitor_cond_change
> + * requests.
> + */
> +static void
> +ovsdb_idl_db_ack_condition(struct ovsdb_idl_db *db)
> +{
> + for (size_t i = 0; i < db->class_->n_tables; i++) {
> + struct ovsdb_idl_table *table = &db->tables[i];
> +
> + if (table->req_cond) {
> + ovsdb_idl_condition_move(&table->ack_cond, &table->req_cond);
> + }
> + }
> +}
> +
> +/* Should be called when the IDL fsm is restarted and resyncs table conditions
> + * based on the state the DB is in:
> + * - if a non-zero last_id is available for the DB then upon reconnect,
> + * if the IDL will use monitor_cond_since, it should first request acked
This sentence is a little bit confusing. "if the IDL will use
monitor_cond_since" - I think if last_id is non-zero it is definitely
using monitor_cond_since, right? The code doesn't have this check either.
> + * conditions to avoid missing updates about records that were added before
> + * the transaction with txn-id == last_id. If there were requested
> + * condition changes in flight (i.e., req_cond not NULL) and the IDL
> + * client didn't set new conditions (i.e., new_cond is NULL) then move
> + * req_cond to new_cond to trigger a follow up cond_change request.
> + * - if there's no last_id available for the DB then it's safe to use the
> + * latest conditions set by the IDL client even if they weren't acked yet.
> + */
> +static void
> +ovsdb_idl_db_sync_condition(struct ovsdb_idl_db *db)
> +{
> + bool ack_all = uuid_is_zero(&db->last_id);
> +
> + db->cond_changed = false;
> + for (size_t i = 0; i < db->class_->n_tables; i++) {
> + struct ovsdb_idl_table *table = &db->tables[i];
> +
> + if (ack_all) {
> + if (table->new_cond) {
> + ovsdb_idl_condition_move(&table->req_cond, &table->new_cond);
> + }
> +
> + if (table->req_cond) {
> + ovsdb_idl_condition_move(&table->ack_cond, &table->req_cond);
> + }
> + } else {
> + if (table->req_cond && !table->new_cond) {
> + ovsdb_idl_condition_move(&table->new_cond, &table->req_cond);
> + db->cond_changed = true;
What if new_cond is not NULL? Or is it impossible? If it is impossible,
shall we assert it instead of adding the if check?
> + }
> + }
> + }
> +}
> +
> static void
> ovsdb_idl_send_cond_change(struct ovsdb_idl *idl)
> {
> @@ -2062,13 +2167,12 @@ ovsdb_idl_send_monitor_request(struct ovsdb_idl *idl, struct ovsdb_idl_db *db,
> monitor_request = json_object_create();
> json_object_put(monitor_request, "columns", columns);
>
> - const struct ovsdb_idl_condition *cond = &table->condition;
> + const struct ovsdb_idl_condition *cond = table->ack_cond;
> if ((monitor_method == OVSDB_IDL_MM_MONITOR_COND ||
> monitor_method == OVSDB_IDL_MM_MONITOR_COND_SINCE) &&
> - !ovsdb_idl_condition_is_true(cond)) {
> + cond && !ovsdb_idl_condition_is_true(cond)) {
> json_object_put(monitor_request, "where",
> ovsdb_idl_condition_to_json(cond));
> - table->cond_changed = false;
> }
> json_object_put(monitor_requests, tc->name,
> json_array_create_1(monitor_request));
> @@ -2076,8 +2180,6 @@ ovsdb_idl_send_monitor_request(struct ovsdb_idl *idl, struct ovsdb_idl_db *db,
> }
> free_schema(schema);
>
> - db->cond_changed = false;
> -
> struct json *params = json_array_create_3(
> json_string_create(db->class_->database),
> json_clone(db->monitor_id),
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index b5cbee7..4efed88 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -1828,3 +1828,59 @@ m4_define([OVSDB_CHECK_IDL_LEADER_ONLY_PY],
>
> OVSDB_CHECK_IDL_LEADER_ONLY_PY([Check Python IDL connects to leader], 3, ['remote'])
> OVSDB_CHECK_IDL_LEADER_ONLY_PY([Check Python IDL reconnects to leader], 3, ['remote' '+remotestop' 'remote'])
> +
> +# same as OVSDB_CHECK_IDL but uses C IDL implementation with tcp
> +# with multiple remotes.
> +m4_define([OVSDB_CHECK_CLUSTER_IDL_C],
> + [AT_SETUP([$1 - C - tcp])
> + AT_KEYWORDS([ovsdb server idl positive tcp socket $5])
> + m4_define([LPBK],[127.0.0.1])
> + AT_CHECK([ovsdb_cluster_start_idltest $2 "ptcp:0:"LPBK])
> + PARSE_LISTENING_PORT([s1.log], [TCP_PORT_1])
> + PARSE_LISTENING_PORT([s2.log], [TCP_PORT_2])
> + PARSE_LISTENING_PORT([s3.log], [TCP_PORT_3])
> + remotes=tcp:LPBK:$TCP_PORT_1,tcp:LPBK:$TCP_PORT_2,tcp:LPBK:$TCP_PORT_3
> +
> + m4_if([$3], [], [],
> + [AT_CHECK([ovsdb-client transact $remotes $3], [0], [ignore], [ignore])])
> + AT_CHECK([test-ovsdb '-vPATTERN:console:test-ovsdb|%c|%m' -vjsonrpc -t10 idl tcp:LPBK:$TCP_PORT_1 $4],
> + [0], [stdout], [ignore])
> + AT_CHECK([sort stdout | uuidfilt]m4_if([$7],,, [[| $7]]),
> + [0], [$5])
> + AT_CLEANUP])
> +
> +# Checks that monitor_cond_since works fine when disconnects happen
> +# with cond_change requests in flight (i.e., IDL is properly updated).
> +OVSDB_CHECK_CLUSTER_IDL_C([simple idl, monitor_cond_since, cluster disconnect],
> + 3,
> + [['["idltest",
> + {"op": "insert",
> + "table": "simple",
> + "row": {"i": 1,
> + "r": 1.0,
> + "b": true}},
> + {"op": "insert",
> + "table": "simple",
> + "row": {"i": 2,
> + "r": 1.0,
> + "b": true}}]']],
> + [['condition simple []' \
> + 'condition simple [["i","==",2]]' \
> + 'condition simple [["i","==",1]]' \
> + '+reconnect' \
> + '["idltest",
> + {"op": "update",
> + "table": "simple",
> + "where": [["i", "==", 1]],
> + "row": {"r": 2.0 }}]']],
> + [[000: change conditions
> +001: empty
> +002: change conditions
> +003: i=2 r=1 b=true s= u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1>
> +004: change conditions
> +005: reconnect
> +006: i=2 r=1 b=true s= u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1>
> +007: {"error":null,"result":[{"count":1}]}
> +008: i=1 r=2 b=true s= u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<2>
> +009: done
> +]])
>
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev