[ovs-dev] [PATCH v4 2/3] ovsdb-idl: Avoid inconsistent IDL state with OVSDB_MONITOR_V3.

Han Zhou hzhou at ovn.org
Fri May 1 15:20:02 UTC 2020


On Fri, May 1, 2020 at 5:52 AM Dumitru Ceara <dceara at redhat.com> wrote:
>
> On 4/24/20 10:48 PM, Ilya Maximets wrote:
> > On 4/24/20 10:23 PM, Han Zhou wrote:
> >>
> >>
> >> On Thu, Apr 23, 2020 at 12:04 PM Ilya Maximets <i.maximets at ovn.org> wrote:
> >>>
> >>> On 4/23/20 8:22 PM, Ilya Maximets wrote:
> >>>> On 4/22/20 8:38 PM, Dumitru Ceara wrote:
> >>>>> Assuming an ovsdb client connected to a database using OVSDB_MONITOR_V3
> >>>>> (i.e., "monitor_cond_since" method) with the initial monitor condition
> >>>>> MC1.
> >>>>>
> >>>>> Assuming the following two transactions are executed on the
> >>>>> ovsdb-server:
> >>>>> TXN1: "insert record R1 in table T1"
> >>>>> TXN2: "insert record R2 in table T2"
> >>>>>
> >>>>> If the client's monitor condition MC1 for table T2 matches R2 then the
> >>>>> client will receive the following update3 message:
> >>>>> method="update3", "insert record R2 in table T2", last-txn-id=TXN2
> >>>>>
> >>>>> At this point, if the presence of the new record R2 in the IDL triggers
> >>>>> the client to update its monitor condition to MC2 and add a clause for
> >>>>> table T1 which matches R1, a monitor_cond_change message is sent to the
> >>>>> server:
> >>>>> method="monitor_cond_change", "clauses from MC2"
> >>>>>
> >>>>> In normal operation the ovsdb-server will reply with a new update3
> >>>>> message of the form:
> >>>>> method="update3", "insert record R1 in table T1", last-txn-id=TXN2
> >>>>>
> >>>>> However, if the connection drops in the meantime, this last update might
> >>>>> get lost.
> >>>>>
> >>>>> It might happen that during the reconnect a new transaction happens
> >>>>> that modifies the original record R1:
> >>>>> TXN3: "modify record R1 in table T1"
> >>>>>
> >>>>> When the client reconnects, it will try to perform a fast resync by
> >>>>> sending:
> >>>>> method="monitor_cond_since", "clauses from MC2", last-txn-id=TXN2
> >>>>>
> >>>>> Because TXN2 is still in the ovsdb-server transaction history, the
> >>>>> server replies with the changes from the most recent transactions only,
> >>>>> i.e., TXN3:
> >>>>> result="true", last-txn-id=TXN3, "modify record R1 in table T1"
> >>>>>
> >>>>> This causes the IDL on the client to end up in an inconsistent
> >>>>> state because it has never seen the update that created R1.
> >>>>>
> >>>>> Such a scenario is described in:
> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1808580#c22
> >>>>>
> >>>>> To avoid hitting this issue, whenever a reconnect happens (either
> >>>>> due to an IDL retry or due to network issues), we clear db->last_id
> >>>>> in ovsdb_idl_restart_fsm() in any of the following cases:
> >>>>> - a monitor condition has been changed locally and the corresponding
> >>>>>   request was not sent yet (i.e., idl->data.cond_changed == true).
> >>>>> - a monitor_cond_change request is in flight.
> >>>>>
> >>>>> This ensures that updates of type "insert" that happened before the last
> >>>>> transaction known by the IDL but didn't match old monitor conditions are
> >>>>> sent upon reconnect if the monitor condition has changed to include them
> >>>>> in the meantime.
> >>>>>
> >>>>> CC: Han Zhou <hzhou at ovn.org>
> >>>>> Fixes: 403a6a0cb003 ("ovsdb-idl: Fast resync from server when connection reset.")
> >>>>> Signed-off-by: Dumitru Ceara <dceara at redhat.com>
> >>>>> ---
> >>>>>  lib/ovsdb-idl.c |   19 +++++++++++++++++++
> >>>>>  1 file changed, 19 insertions(+)
> >>>>>
> >>>>> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> >>>>> index 1535ad7..67c4745 100644
> >>>>> --- a/lib/ovsdb-idl.c
> >>>>> +++ b/lib/ovsdb-idl.c
> >>>>> @@ -690,6 +690,25 @@ ovsdb_idl_send_request(struct ovsdb_idl *idl, struct jsonrpc_msg *request)
> >>>>>  static void
> >>>>>  ovsdb_idl_restart_fsm(struct ovsdb_idl *idl)
> >>>>>  {
> >>>>> +    /* If there's an outstanding request of type monitor_cond_change and
> >>>>> +     * we're in monitor_cond_since mode then we can't trust that all relevant
> >>>>> +     * updates from transaction idl->data.last_id have been received as we
> >>>>> +     * might have relaxed the monitor condition with our last request and
> >>>>> +     * might be missing previously not monitored records.
> >>>>> +     *
> >>>>> +     * Same reasoning applies for the case when a monitor condition has been
> >>>>> +     * changed locally but the monitor_cond_change request was not sent yet.
> >>>>> +     *
> >>>>> +     * In both cases, clear last_id to make sure that the next time
> >>>>> +     * monitor_cond_since is sent (i.e., after reconnect) we get the complete
> >>>>> +     * view of the database.
> >>>>> +     */
> >>>>> +    if (idl->data.cond_changed ||
> >>>>> +            (idl->request_id &&
> >>>>> +                idl->data.monitoring == OVSDB_IDL_MONITORING_COND_SINCE)) {
> >>>>> +        uuid_zero(&idl->data.last_id);
> >>>>
> >>>> As Han pointed out during the OVN IRC meeting, when there is a change to
> >>>> SB, e.g. creating a new DP, it causes all HVs to change the condition, and
> >>>> at the same time one of the SB servers might be disconnected, triggering
> >>>> this bug on a big number of nodes at once.  If all of these nodes request
> >>>> the full database at the same time, this might be an issue.
> >>>>
> >>>> One possibility that came to mind is that we could store 'last_id' for
> >>>> the previous successful cond_change and use it instead of 0 on
> >>>> re-connection if there was an in-flight cond_change.  This way we could
> >>>> reduce the pressure on the SB server.
> >>>
> >>> Thinking more about this, this idea should not work because with new
> >>> conditions we might need to receive changes that happened even before
> >>> the last successful cond_change, because we might relax conditions
> >>> incrementally.
> >>>
> >>> So, the current solution seems the most correct for now.  I'd like to
> >>> hear some other thoughts about this issue with massive re-connections,
> >>> though, if any.
> >>>
> >>
> >> Ideally, this can be solved by the mechanism below:
> >> 1. If there is an uncompleted cond_change (either not sent, or in-flight)
> >> when the server disconnected, the IDL first reconnects with the old
> >> condition, with last_id being the current data position, so that the
> >> server knows the old condition first.
> >> 2. Then it retries the cond_change request with the new condition.  The
> >> server will send all needed updates for the delta between the old and new
> >> conditions.
> >
> > Yeah, we've been discussing the same approach today with Dumitru.
> >
> >> This may need more complex logic in the IDL implementation, though
> >> hopefully not too complex.
> >
> > An implementation suggestion might be to store "new_conditions" along with
> > "conditions":
> >
> > - set_conditions() should set "new_conditions" if they are not equal to
> >   "conditions" or the current "new_conditions".
> > - monitor_cond_since should always use the old "conditions".
> > - monitor_cond_change should always send "new_conditions".
> >   On reply from the server, "conditions" should be freed, "new_conditions"
> >   moved to "conditions", and "new_conditions" cleared.
>
> Actually, I don't think this last part is OK because there's nothing
> stopping a client from changing a condition that was sent but for which
> we got no reply yet:
> - set_conditions(C1)
> - monitor_cond_change(C1)
> - set_conditions(C2) // This happens before the server replies for C1
> - get reply for C1 -> move "new_conditions" to "conditions" but we never
> sent a request for C2
>
> So I think we actually need to store 3 sets of conditions:
> - "acked conditions": condition changes the server replied to
> - "sent conditions": new conditions requested to the server
> - "new conditions": new conditions, not requested yet.
>
> What do you think?
>

Makes sense. I agree with you.

> Thanks,
> Dumitru
>
> >
> > In this case, on re-connection, both "conditions" and "new_conditions" will
> > be present, and two sequential requests (monitor_cond_since,
> > monitor_cond_change) will be triggered naturally by the existing code.
> >
> >>
> >> Thanks,
> >> Han
> >>
> >>>>
> >>>>> +    }
> >>>>> +
> >>>>>      ovsdb_idl_send_schema_request(idl, &idl->server);
> >>>>>      ovsdb_idl_transition(idl, IDL_S_SERVER_SCHEMA_REQUESTED);
> >>>>>      idl->data.monitoring = OVSDB_IDL_NOT_MONITORING;
> >>>>>
> >>>>
> >>>
> >
>
