[ovs-dev] [OVN Patch v8 3/3] northd: Restore parallel build with dp_groups

Han Zhou hzhou at ovn.org
Fri Oct 1 00:32:21 UTC 2021


On Thu, Sep 30, 2021 at 2:03 PM Anton Ivanov <
anton.ivanov at cambridgegreys.com> wrote:

> On 30/09/2021 20:48, Han Zhou wrote:
>
>
>
> On Thu, Sep 30, 2021 at 7:34 AM Anton Ivanov <
> anton.ivanov at cambridgegreys.com> wrote:
>
>> Summary of findings.
>>
>> 1. The numbers on the perf test do not align with ovn-heater, which is
>> much closer to a realistic load. On some tests where heater shows a 5-10%
>> end-to-end improvement with parallelization, we get worse results with
>> the perf test. You spotted this one correctly.
>>
>> Example of the northd averages, pulled out of the test report via grep
>> and sed:
>>
>>    127.489353
>>    131.509458
>>    116.088205
>>    94.721911
>>    119.629756
>>    114.896258
>>    124.811069
>>    129.679160
>>    106.699905
>>    134.490338
>>    112.106713
>>    135.957658
>>    132.471111
>>    94.106849
>>    117.431450
>>    115.861592
>>    106.830657
>>    132.396905
>>    107.092542
>>    128.945760
>>    94.298464
>>    120.455510
>>    136.910426
>>    134.311765
>>    115.881292
>>    116.918458
>>
>> These values are all over the place - this is not a reproducible test.
>>
>> 2. In its present state you need to re-run it 30+ times and take an
>> average. The standard deviation of the values for the northd loop is >
>> 10%. Compared to that, the reproducibility of ovn-heater is significantly
>> better: I usually get less than 0.5% difference between runs if there
>> were no iteration failures. I would suggest using that instead for
>> performance comparisons until we have figured out what affects the
>> perf test.
>>
>> 3. It is using the short-term running average value in reports, which is
>> probably wrong because it is heavily skewed by the last several values.
>>
>> I will look into all of these.
>>
> Thanks for the summary! However, I think there is a bigger problem
> (probably related to my environment) than the stability of the test (make
> check-perf TESTSUITEFLAGS="--rebuild") itself. As I mentioned in an
> earlier email, I observed even worse results with a large-scale topology
> closer to a real-world deployment of ovn-k8s, just testing with the
> command:
>     ovn-nbctl --print-wait-time --wait=sb sync
>
> This command simply triggers a change in the NB_Global table and waits
> for northd to complete the recompute and update the SB. It doesn't have
> to be the "sync" command; any change to the NB DB produces a similar
> result (e.g. ovn-nbctl --print-wait-time --wait=sb ls-add ls1).
>
> Without parallel:
> ovn-northd completion: 7807ms
>
> With parallel:
> ovn-northd completion: 41267ms
>
> Is this with current master or prior to these patches?
>
> 1. There was an issue prior to these patches where the hash was not sized
> correctly on the first iteration with an existing database, when loading
> a large database for the first time. These numbers sound about right for
> when that bug was around.
>
The patches are included. The commit id is 9242f27f63 as mentioned in my
first email.

> 2. At present, with dp-groups there should be NO DIFFERENCE in a single
> compute cycle on an existing database between a run with parallel and one
> without. This is because the first cycle does not use parallel compute;
> it is disabled in order to achieve the correct hash sizing for future
> cycles by letting the hash auto-scale.
>
Yes, I understand this, and I did enable dp-groups for the above "ovn-nbctl
sync" test, so the number I showed above for "with parallel" was for the
2nd run and onwards. For the first round the result is exactly the same as
without parallel.
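
For illustration, here is a minimal, self-contained sketch of the
first-cycle sizing behavior described above: the first cycle builds
single-threaded so the table can grow organically, and the size it reaches
pre-sizes the table for later parallel cycles. All names below are
illustrative stand-ins, not the actual ovn-northd symbols:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    struct lflow_table {
        size_t capacity;            /* bucket-array size */
        size_t n_rows;              /* rows inserted */
    };

    static size_t max_seen_size;    /* learned from the previous cycle */
    static bool first_cycle = true;

    /* Stub standing in for the real single-threaded builder. */
    static void
    single_threaded_build(struct lflow_table *t)
    {
        t->n_rows = 100000;             /* pretend we built 100k lflows */
        t->capacity = 2 * t->n_rows;    /* table auto-resized as rows came */
    }

    /* Stub standing in for the real thread-pool builder. */
    static void
    parallel_build(struct lflow_table *t)
    {
        t->n_rows = 100000;             /* same work, spread over threads */
    }

    static void
    build_lflows(struct lflow_table *t)
    {
        if (first_cycle) {
            /* Single-threaded: the table resizes itself as rows arrive,
             * which is cheap without concurrent writers, and tells us
             * how big this database's flow table really is. */
            single_threaded_build(t);
            max_seen_size = t->n_rows;
            first_cycle = false;
        } else {
            /* Resizing under concurrent insertion is expensive, so later
             * cycles pre-size the table from the last cycle's row count
             * before fanning out to the thread pool. */
            t->capacity = 2 * max_seen_size;
            parallel_build(t);
        }
    }

    int
    main(void)
    {
        struct lflow_table t = { 0, 0 };
        build_lflows(&t);       /* 1st run: single-threaded, learns size */
        build_lflows(&t);       /* 2nd run onwards: parallel, pre-sized */
        printf("capacity=%zu rows=%zu\n", t.capacity, t.n_rows);
        return 0;
    }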

I just tried disabling dp-groups for the large-scale "ovn-nbctl sync" test
(after some effort squeezing out memory space on my desktop), and the
result shows that parallel build performs slightly better (although it is
3x slower than with dp-groups & without parallel, which is expected). The
results are summarized below:

Without parallel, with dp-group:
ovn-northd completion: 7807ms

With parallel, with dp-group:
ovn-northd completion: 41267ms

Without parallel, without dp-group:
ovn-northd completion: 27996ms

With parallel, without dp-group:
ovn-northd completion: 26584ms

Now the interesting part:
I implemented a POC of a hash-based mutex array that replaces the rwlock in
the function do_ovn_lflow_add_pd(), and the performance is greatly improved
for the dp-group test:

With parallel, with dp-group (hash-based mutex):
ovn-northd completion: 5081ms

This is 8x faster than the current parallel implementation and 30% faster
than without parallel. This result looks much more reasonable to me. My
theory is that when using parallel with dp-groups, rwlock contention causes
the low CPU utilization of the threads and the overall slowness on my
machine. I will refine the POC into a formal patch and send it for review,
hopefully by tomorrow.
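
For reference, a minimal sketch of the hash-based mutex array idea,
assuming POSIX threads; the names (N_LOCKS, lflow_hash_lock, etc.) are
illustrative stand-ins, not the symbols from the actual POC patch:

    #include <pthread.h>
    #include <stddef.h>
    #include <stdint.h>

    #define N_LOCKS 1024    /* power of two: masking replaces modulo */

    static pthread_mutex_t lflow_locks[N_LOCKS];

    static void
    lflow_locks_init(void)
    {
        for (size_t i = 0; i < N_LOCKS; i++) {
            pthread_mutex_init(&lflow_locks[i], NULL);
        }
    }

    /* Each lflow takes only the mutex selected by its hash, so two
     * threads contend only when their lflows map to the same lock out
     * of 1024, instead of every writer serializing on one global
     * rwlock. */
    static void
    lflow_hash_lock(uint32_t hash)
    {
        pthread_mutex_lock(&lflow_locks[hash & (N_LOCKS - 1)]);
    }

    static void
    lflow_hash_unlock(uint32_t hash)
    {
        pthread_mutex_unlock(&lflow_locks[hash & (N_LOCKS - 1)]);
    }

    int
    main(void)
    {
        lflow_locks_init();

        uint32_t hash = 0xdeadbeef;     /* stand-in for an lflow's hash */
        lflow_hash_lock(hash);
        /* ... insert or update the dp-group row for this lflow ... */
        lflow_hash_unlock(hash);
        return 0;
    }

With a power-of-two lock count the lock selection is a single mask, and
writers proceed mostly in parallel rather than all blocking on one lock.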

Thanks,
Han

