[ovs-dev] [OVN Patch v6 4/4] northd: Restore parallel build with dp_groups

Ilya Maximets i.maximets at ovn.org
Fri Sep 10 12:29:04 UTC 2021


On 9/10/21 2:23 PM, Anton Ivanov wrote:
> 
> On 10/09/2021 13:18, Anton Ivanov wrote:
>>
>> On 10/09/2021 12:43, Ilya Maximets wrote:
>>> On 9/9/21 11:02 PM, Mark Michelson wrote:
>>>> Hi Anton,
>>>>
>>>> On a high level, this change results in some parts of the parallelized hashmap being unused. Should we keep the hashrow_locks structure and its APIs, for instance?
>>>>
>>>> See below for more comments.
>>>>
>>>> On 9/2/21 9:27 AM, anton.ivanov at cambridgegreys.com wrote:
>>>>> From: Anton Ivanov <anton.ivanov at cambridgegreys.com>
>>>>>
>>>>> Restore parallel build with dp groups using rwlock instead
>>>>> of per row locking as an underlying mechanism.
>>>>>
>>>>> This provides improvement ~ 10% end-to-end on ovn-heater
>>>>> under virutalization despite awakening some qemu gremlin
>>>>> which makes qemu climb to silly CPU usage. The gain on
>>>>> bare metal is likely to be higher.
>>>>>
>>>>> Signed-off-by: Anton Ivanov <anton.ivanov at cambridgegreys.com>
>>>>> ---
>>>>>    northd/ovn-northd.c | 215 ++++++++++++++++++++++++++++++++------------
>>>>>    1 file changed, 159 insertions(+), 56 deletions(-)
>>>>>
>>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>>>> index 1f903882b..4537c55dd 100644
>>>>> --- a/northd/ovn-northd.c
>>>>> +++ b/northd/ovn-northd.c
>>>>> @@ -59,6 +59,7 @@
>>>>>    #include "unixctl.h"
>>>>>    #include "util.h"
>>>>>    #include "uuid.h"
>>>>> +#include "ovs-thread.h"
>>>>>    #include "openvswitch/vlog.h"
>>>>>      VLOG_DEFINE_THIS_MODULE(ovn_northd);
>>>>> @@ -4372,7 +4373,26 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct ovn_datapath *od,
>>>>>    static bool use_logical_dp_groups = false;
>>>>>    static bool use_parallel_build = true;
>>>>>    -static struct hashrow_locks lflow_locks;
>>>>> +static struct ovs_rwlock flowtable_lock;
>>>> With the change from the custom hashrow_locks to using an ovs_rwlock, I think it's important to document what data this lock is intending to protect.
>>>>
>>>>  From what I can tell, this lock specifically is intended to protect access to the hmaps in the global lflow_state structure, and it's also intended to protect all ovn_datapaths' od_group hmaps. This is not something that is immediately obvious just from a global rwlock declaration.
>>>>
>>>>> +
>>>>> +static void ovn_make_multi_lflow(struct ovn_lflow *old_lflow,
>>>>> +                              struct ovn_datapath *od,
>>>>> +                              struct lflow_state *lflow_map,
>>>>> +                              uint32_t hash)
>>>>> +{
>>>>> +    hmapx_add(&old_lflow->od_group, od);
>>>>> +    hmap_remove(&lflow_map->single_od, &old_lflow->hmap_node);
>>>>> +    if (use_parallel_build) {
>>>>> +        hmap_insert_fast(&lflow_map->multiple_od, &old_lflow->hmap_node, hash);
>>>>> +    } else {
>>>>> +        hmap_insert(&lflow_map->multiple_od, &old_lflow->hmap_node, hash);
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +#ifdef __clang__
>>>>> +#pragma clang diagnostic push
>>>>> +#pragma clang diagnostic ignored "-Wthread-safety"
>>>>> +#endif
>>>> The section above needs a comment to explain why -Wthread-safety is ignored.
>>> Please, use OVS_NO_THREAD_SAFETY_ANALYSIS marker instead of manually disabling
>>> the diagnostics.
>>>
>>>
>>> On a side note, I ran some tests with this patch set applied and it drops
>>> performance of non-parallel case by 20% on my laptop with databases from
>>> the ovn-heater's 120 node density test.
>>>
>>> I see, you mentioned 10% improvement in the commit message. Assuming it's
>>> with parallelization enabled, right?
>>
>> Yes. And no difference on heater tests up to 36 fake nodes with 7200 ports.
>>
>>>
>>> In any case, some performance testing without parallelization required before
>>> accepting the change, as it's a default configuration.
>>
>> They were done. I am somewhat surprised by your measurement, can you please provide some more details.

I have databases from a previous 120 node ovn-heater run.
I'm loading them to the OVN sandbox like this:

make sandbox SANDBOXFLAGS="--nbdb-source=/tmp/ovnnb_db.db --sbdb-source=/tmp/ovnsb_db.db"

And performing a couple of small operations with northbound database just to
trigger the northd processing loop.

Configuration has dp-groups enabled and parallelization disabled.

>>
>> You are saying "database from 120 nodes" - are we talking a stopwatch measurement here?
> 
> Which is the case. The patch moved it where it should not be. Will be fixed in the next revision.

Not only stopwatches.  Poll intervals in general grew too.

> 
> A.
> 
>>
>>>
>>> Best regards, Ilya Maximets.
>>>



More information about the dev mailing list