[ovs-dev] [PATCH v5] ovsdb raft: Sync commit index to followers without delay.

Ben Pfaff blp at ovn.org
Mon Mar 25 21:00:02 UTC 2019


On Sat, Mar 23, 2019 at 09:44:26AM -0700, Han Zhou wrote:
> From: Han Zhou <hzhou8 at ebay.com>
> 
> When an update is requested from a follower, the leader sends AppendRequest
> to all followers and waits until AppendReply is received from a majority, and
> then it updates the commit index - the new entry is regarded as committed
> in the raft log. However, followers (including the one that initiated the
> request) are not notified of this commit until the next heartbeat (ping
> timeout) if there are no other pending requests. This results in long latency
> for updates made through followers, especially when a batch of updates
> is requested through the same follower.

The tests pass now, but each one of them ends up with several ovn-sbctl
and ovsdb-server processes all trying to use 100% of a CPU.  If I run
the tests with 10-way parallelism, the load average goes above 100.
Surely something has to be wrong in the implementation here.

Each of the ovn-sbctl processes is trying to push through only a single
transaction; after that, it exits.  If we think of ovsdb-server as
giving each of its clients one chance to execute an RPC in round-robin
order, which is approximately correct, then one of those transactions
should succeed per round.  I don't understand why, if this model is
correct, the ovn-sbctls would burn so much CPU.  If the model is wrong,
then we need to understand why it is wrong and how we can fix it.  Maybe
the ovn-sbctl processes are retrying blindly without waiting for an
update from the server; if so, that's a bug and it should be fixed.
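
To make the round-robin model above concrete, here is a toy simulation
in Python.  It is only a sketch under stated assumptions: the function
name, the integer version counter, and the rule that a transaction built
against a stale version fails and is retried once in the next round are
all hypothetical, and none of it is taken from the ovsdb-server or
ovn-sbctl code.

```python
from collections import deque


def simulate_round_robin(num_clients):
    """Toy model: each client has exactly one transaction to commit.

    Assumption (hypothetical, not from ovsdb-server): every pending
    client builds its transaction against the database version it saw
    at the start of the round, and a transaction commits only if that
    version is still current when the server gets to it.

    Returns (rounds, attempts).
    """
    pending = deque(range(num_clients))
    db_version = 0
    rounds = 0
    attempts = 0

    while pending:
        rounds += 1
        # Version every pending client observed before this round.
        seen_version = db_version
        still_pending = deque()

        # The server gives each pending client one chance per round.
        for client in pending:
            attempts += 1
            if seen_version == db_version:
                # First client in the round: its view is current, so it commits.
                db_version += 1
            else:
                # Stale view: the client waits for the update, then retries
                # in the next round.
                still_pending.append(client)

        pending = still_pending

    return rounds, attempts


if __name__ == "__main__":
    for n in (1, 4, 10):
        rounds, attempts = simulate_round_robin(n)
        print(f"{n} clients: {rounds} rounds, {attempts} attempts")
```

In this model N clients finish in N rounds with N(N+1)/2 attempts in
total, and each client makes at most one attempt per round while it
waits for the server's update in between.  That is a bounded amount of
work per client, which is why the model by itself does not account for
every process spinning at 100% CPU.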

