[ovs-dev] scale test testing requests (was: raft ovsdb clustering with scale test)

Han Zhou zhouhan at gmail.com
Fri May 18 21:30:35 UTC 2018


On Fri, May 18, 2018 at 1:35 PM, Ben Pfaff <blp at ovn.org> wrote:
>
> I've spent some time stressing the database yesterday and today.  So
> far, I can't reproduce these particular problems.  I do see various ways
> to improve OVS and OVN and their tests.
>
> Here are some suggestions I have for further testing:
>
> 1. You mentioned that programs were segfaulting.  They should not be
>    doing that (obviously) but I wasn't able to get them to do so in my
>    own testing.  It would be very helpful to have backtraces.  Would you
>    mind trying to get them?
>
> 2. You mentioned that "perf" shows that lots of time is being spent
>    writing snapshots.  It would be helpful to know whether this is a
>    contributing factor in the failures.  (If it is, then I will work on
>    making snapshots faster.)  One way to figure that out would be to
>    disable snapshots entirely for testing.  That isn't acceptable in
>    production because it will use up all the disk space eventually, but
>    for testing one could apply the following patch:
>
> diff --git a/ovsdb/storage.c b/ovsdb/storage.c
> index 446cae0861ec..9fa9954b6d35 100644
> --- a/ovsdb/storage.c
> +++ b/ovsdb/storage.c
> @@ -490,38 +490,8 @@ schedule_next_snapshot(struct ovsdb_storage *storage, bool quick)
>  }
>
>  bool
> -ovsdb_storage_should_snapshot(const struct ovsdb_storage *storage)
> +ovsdb_storage_should_snapshot(const struct ovsdb_storage *storage OVS_UNUSED)
>  {
> -    if (storage->raft || storage->log) {
> -        /* If we haven't reached the minimum snapshot time, don't snapshot. */
> -        long long int now = time_msec();
> -        if (now < storage->next_snapshot_min) {
> -            return false;
> -        }
> -
> -        /* If we can't snapshot right now, don't. */
> -        if (storage->raft && !raft_may_snapshot(storage->raft)) {
> -            return false;
> -        }
> -
> -        uint64_t log_len = (storage->raft
> -                            ? raft_get_log_length(storage->raft)
> -                            : storage->n_read + storage->n_written);
> -        if (now < storage->next_snapshot_max) {
> -            /* Maximum snapshot time not yet reached.  Take a snapshot if there
> -             * have been at least 100 log entries and the log file size has
> -             * grown a lot. */
> -            bool grew_lots = (storage->raft
> -                              ? raft_grew_lots(storage->raft)
> -                              : ovsdb_log_grew_lots(storage->log));
> -            return log_len >= 100 && grew_lots;
> -        } else {
> -            /* We have reached the maximum snapshot time.  Take a snapshot if
> -             * there have been any log entries at all. */
> -            return log_len > 0;
> -        }
> -    }
> -
>      return false;
>  }
>
>
> 3. This isn't really a testing note but I do see that the way that OVSDB
>    is proxying writes from a Raft follower to the leader is needlessly
>    inefficient and I should rework it for better write performance.
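
On the backtraces in suggestion 1: next time one of the daemons segfaults,
something along these lines should capture them (a minimal sketch; the
core-dump setup, binary path, and core file location below are assumptions
that vary by system, and debug symbols need to be installed for the
backtrace to be useful):

    # Allow core dumps before (re)starting the daemons.
    ulimit -c unlimited

    # After a crash, load the binary and the core file into gdb and dump
    # backtraces from all threads.  The paths below are examples only.
    gdb -batch -ex 'thread apply all bt full' \
        /usr/sbin/ovsdb-server /var/crash/core.ovsdb-server > ovsdb-server-bt.txt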

Thanks, Ben, for putting effort into this. We'll run more tests the way you
suggested. Here is one more detail of our test scenario, which might (or
might not) help with reproducing the problem:
In our test we create lports in batches of 100, each batch created with a
single ovn-nbctl command (i.e. one transaction), and then bind each of the
new lports on a random HV. After all 100 lports are bound, we run ovn-nbctl
wait-until logical_switch_port <port> up=true for each lport. After all the
lports are up in the NB database, we continue with the next batch. Not sure
whether this has anything to do with the problems, but that is how we
tested :)
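
To make the batching concrete, here is a simplified sketch of one batch
(the switch/port names and the HV-side binding command are illustrative
placeholders rather than our exact harness):

    # 1) Create 100 lports with a single ovn-nbctl invocation, i.e. one
    #    NB transaction.
    cmd="lsp-add ls1 lp1"
    for i in $(seq 2 100); do
        cmd="$cmd -- lsp-add ls1 lp$i"
    done
    ovn-nbctl $cmd

    # 2) Bind each new lport on a randomly chosen HV.  On that hypervisor,
    #    per port (shown here for a port "lpN"):
    #        ovs-vsctl add-port br-int lpN -- \
    #            set Interface lpN type=internal external_ids:iface-id=lpN

    # 3) Wait until every lport in the batch is reported up in the NB DB
    #    before starting the next batch.
    for i in $(seq 1 100); do
        ovn-nbctl wait-until logical_switch_port lp$i up=true
    done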

