[ovs-dev] [PATCH v3] ovsdb-server: Don't drop all connections on read/write status change.

Numan Siddique nusiddiq at redhat.com
Mon Oct 14 18:15:29 UTC 2019


On Mon, Oct 14, 2019, 11:42 PM Han Zhou <zhouhan at gmail.com> wrote:

>
>
> On Mon, Oct 14, 2019 at 8:20 AM <nusiddiq at redhat.com> wrote:
>
>> From: Numan Siddique <nusiddiq at redhat.com>
>>
>> Commit [1] forcibly drops all connections when the db read/write status
>> changes. Prior to commit [1], when the read/write status changed, the
>> existing jsonrpc sessions with 'db_change_aware' set to true were not
>> updated with the changed 'read_only' value. If the db status was changed
>> to 'standby', the existing clients could still write to the db.
>>
>> In the case of pacemaker OVN HA, the OVN OCF script's 'start' action
>> starts the ovsdb-servers in read-only state and the 'promote' action
>> later sets them to read-write. We have observed that if some
>> ovn-controllers connect to the SB ovsdb-server (in read-only state) just
>> before the 'promote' action, the connection is not always reset, and
>> these ovn-controllers remain connected to the SB ovsdb-server in
>> read-only state indefinitely. Even though commit [1] calls
>> 'ovsdb_jsonrpc_server_reconnect()' with the 'forced' flag set to true
>> when the db read/write status changes, somehow the FSM misses resetting
>> the connections of these ovn-controllers.
>>
>> I think this needs to be addressed in the FSM. This patch doesn't address
>> this FSM issue. Instead it changes the behavior of
>> 'ovsdb_jsonrpc_server_set_read_only()' by setting the 'read_only' flag of
>> all the jsonrpc sessions instead of forcefully resetting the connections.
>>
>> I think there is no need to reset the connection. In large-scale
>> production deployments with OVN, resetting wastes CPU cycles because
>> ovn-controllers have to connect twice: once during the 'start' action
>> and again during 'promote'.
>>
>> [1] - 2a9679e3b2c6 ("ovsdb-server: drop all connections on read/write
>> status change")
>>
>> Acked-by: Dumitru Ceara <dceara at redhat.com>
>> Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
>> ---
>>
>>
> Thanks Numan. Is this the root cause of the ovn-controller transaction
> failure you mentioned at the last OVN meeting?
>

Hi Han,
Yes, this is the root cause. Earlier I suspected ovn-controller, though.
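
For anyone following the thread, the behavior change described in the
commit message can be sketched roughly as below. This is only a simplified,
self-contained model and not the actual patch: 'struct session' and the
three functions are hypothetical stand-ins for the jsonrpc session code in
ovsdb-server, and the real sessions carry far more state than two flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one jsonrpc session. */
struct session {
    bool connected;   /* Is the client still connected? */
    bool read_only;   /* May this session write to the db? */
};

/* Sketch of the behavior after commit [1]: every session is dropped when
 * the read/write status changes, so each client has to reconnect. */
static void
set_read_only_by_reconnect(struct session *sessions, size_t n, bool read_only)
{
    assert(sessions != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        sessions[i].connected = false;
        sessions[i].read_only = read_only;
    }
}

/* Sketch of the behavior proposed here: the 'read_only' flag of each
 * existing session is updated in place and the connection is kept. */
static void
set_read_only_in_place(struct session *sessions, size_t n, bool read_only)
{
    assert(sessions != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        sessions[i].read_only = read_only;
    }
}

/* Helper for the sketch: number of sessions that are still connected. */
static size_t
count_connected(const struct session *sessions, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (sessions[i].connected) {
            count++;
        }
    }
    return count;
}
```

In this model, promoting a server with the in-place variant leaves every
session connected and writable, while the reconnect variant leaves none
connected, which is the extra reconnect cost mentioned in the commit
message.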

Thanks
Numan

