[ovs-dev] [PATCH v7] ovsdb: provide raft and command interfaces with priority

Anton Ivanov anton.ivanov at cambridgegreys.com
Tue Aug 17 13:28:07 UTC 2021


On 17/08/2021 13:52, Ilya Maximets wrote:
> On 8/17/21 1:27 PM, Anton Ivanov wrote:
>> Hi Ilya, hi list,
>>
>> I ran some detailed experiments and there is an issue with all forms of "skipping" and/or reordering processing.
>>
>> If the session list is skipped or reordered (I tried "fast-forwarding" the list to a new head position after hitting a time constraint), ovsdb fails to issue the response to some transactions when running the cluster test suite.
>>
>> At present I am unable to get to the root cause.
>>
>> The issue does not exist if processing bails out of the session loop and is re-run IN FULL (as in the earliest versions of the patch).
> That is weird.  I'm not sure how the re-ordering is different from
> the 're-run in full' here.  The only thing that is different is the
> actual order in which sessions are processed, because we're still
> re-running all of them in full for as long as time allows.

It may be the way I am reordering - directly manipulating the list head and pointers instead of popping and pushing.

If it works via the brute-force method (pop/push), I will leave that as the first iteration and try to figure out at a later date what extra macros are needed in list.h.
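
To make the pop/push variant concrete, here is a minimal sketch of rotating a circular doubly linked list by repeatedly popping the front element and pushing it to the back. It is not the code from the patch: the list type and every name in it (struct node, list_rotate, struct session and so on) are hypothetical stand-ins written for illustration, only similar in spirit to what list.h provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly linked list.  The head is a
 * sentinel node; an empty list has the head pointing at itself. */
struct node {
    struct node *prev, *next;
};

/* Hypothetical session with an embedded list node as its first member. */
struct session {
    struct node node;
    int id;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static int list_is_empty(const struct node *head)
{
    return head->next == head;
}

/* Appends 'elem' just before the sentinel, i.e. at the tail. */
static void list_push_back(struct node *head, struct node *elem)
{
    elem->prev = head->prev;
    elem->next = head;
    head->prev->next = elem;
    head->prev = elem;
}

/* Unlinks and returns the first element.  The list must not be empty. */
static struct node *list_pop_front(struct node *head)
{
    struct node *elem = head->next;

    assert(elem != head);
    head->next = elem->next;
    elem->next->prev = head;
    elem->prev = elem->next = NULL;
    return elem;
}

/* Moves the first 'n' elements to the tail, one at a time, so that
 * processing can resume from a new head position on the next run.
 * Every step goes through pop/push, so no pointer is ever patched
 * by hand outside those two helpers. */
static void list_rotate(struct node *head, size_t n)
{
    if (list_is_empty(head)) {
        return;
    }
    while (n--) {
        list_push_back(head, list_pop_front(head));
    }
}
```

With four sessions numbered 0 to 3, list_rotate(&head, 2) leaves the order as 2, 3, 0, 1. It costs one pop and one push per skipped session, which is slower than re-pointing the head directly, but both the forward and backward links stay consistent after every step.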

Brgds,

>
>> I am going to re-issue the patch without any skipping whatsoever (either at remotes or at sessions level), because that works and improves raft (and overall ovn) stability.
>>
>> While there may be some starvation of the sessions towards the end of the session list, it should be a second-order effect, because re-processing sessions which have just been processed generates only a minimal amount of changes.
>>
>> Skipping (if any) will be a later optimization after I get to the bottom of this and figure out why monitor updates are not followed by the transaction response.
> This doesn't sound good to me.  It's pretty easy to spam the
> ovsdb-server with monitor requests or condition changes.  These
> require a walk across the whole database.  And if the database
> is big enough, other sessions will never be served due to one
> faulty/malicious connection.  It's also possible that we
> have a few thousand connections and processing of all of them
> legitimately takes a lot of time.  This will be a problem
> if the rate of database changes is relatively high and constant.
>
> Best regards, Ilya Maximets.
>
-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


