[ovs-dev] [OVN Patch v6 4/4] northd: Restore parallel build with dp_groups

Anton Ivanov anton.ivanov at cambridgegreys.com
Fri Sep 10 12:50:09 UTC 2021


On 10/09/2021 13:49, Anton Ivanov wrote:
>
> On 10/09/2021 13:34, Anton Ivanov wrote:
>>
>> On 10/09/2021 13:29, Ilya Maximets wrote:
>>> On 9/10/21 2:23 PM, Anton Ivanov wrote:
>>>> On 10/09/2021 13:18, Anton Ivanov wrote:
>>>>> On 10/09/2021 12:43, Ilya Maximets wrote:
>>>>>> On 9/9/21 11:02 PM, Mark Michelson wrote:
>>>>>>> Hi Anton,
>>>>>>>
>>>>>>> On a high level, this change results in some parts of the parallelized hashmap being unused. Should we keep the hashrow_locks structure and its APIs, for instance?
>>>>>>>
>>>>>>> See below for more comments.
>>>>>>>
>>>>>>> On 9/2/21 9:27 AM, anton.ivanov at cambridgegreys.com wrote:
>>>>>>>> From: Anton Ivanov <anton.ivanov at cambridgegreys.com>
>>>>>>>>
>>>>>>>> Restore parallel build with dp groups using rwlock instead
>>>>>>>> of per row locking as an underlying mechanism.
>>>>>>>>
>>>>>>>> This provides an ~10% end-to-end improvement on ovn-heater
>>>>>>>> under virtualization, despite awakening some qemu gremlin
>>>>>>>> which makes qemu climb to silly CPU usage. The gain on
>>>>>>>> bare metal is likely to be higher.
>>>>>>>>
>>>>>>>> Signed-off-by: Anton Ivanov <anton.ivanov at cambridgegreys.com>
>>>>>>>> ---
>>>>>>>>     northd/ovn-northd.c | 215 ++++++++++++++++++++++++++++++++------------
>>>>>>>>     1 file changed, 159 insertions(+), 56 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>>>>>>> index 1f903882b..4537c55dd 100644
>>>>>>>> --- a/northd/ovn-northd.c
>>>>>>>> +++ b/northd/ovn-northd.c
>>>>>>>> @@ -59,6 +59,7 @@
>>>>>>>>     #include "unixctl.h"
>>>>>>>>     #include "util.h"
>>>>>>>>     #include "uuid.h"
>>>>>>>> +#include "ovs-thread.h"
>>>>>>>>     #include "openvswitch/vlog.h"
>>>>>>>>
>>>>>>>>  VLOG_DEFINE_THIS_MODULE(ovn_northd);
>>>>>>>> @@ -4372,7 +4373,26 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct ovn_datapath *od,
>>>>>>>>     static bool use_logical_dp_groups = false;
>>>>>>>>     static bool use_parallel_build = true;
>>>>>>>>
>>>>>>>> -static struct hashrow_locks lflow_locks;
>>>>>>>> +static struct ovs_rwlock flowtable_lock;
>>>>>>> With the change from the custom hashrow_locks to using an ovs_rwlock, I think it's important to document what data this lock is intending to protect.
>>>>>>>
>>>>>>>   From what I can tell, this lock specifically is intended to protect access to the hmaps in the global lflow_state structure, and it's also intended to protect all ovn_datapaths' od_group hmaps. This is not something that is immediately obvious just from a global rwlock declaration.
>>>>>>>
>>>>>>>> +
>>>>>>>> +static void ovn_make_multi_lflow(struct ovn_lflow *old_lflow,
>>>>>>>> +                              struct ovn_datapath *od,
>>>>>>>> +                              struct lflow_state *lflow_map,
>>>>>>>> +                              uint32_t hash)
>>>>>>>> +{
>>>>>>>> +    hmapx_add(&old_lflow->od_group, od);
>>>>>>>> +    hmap_remove(&lflow_map->single_od, &old_lflow->hmap_node);
>>>>>>>> +    if (use_parallel_build) {
>>>>>>>> +        hmap_insert_fast(&lflow_map->multiple_od, &old_lflow->hmap_node, hash);
>>>>>>>> +    } else {
>>>>>>>> +        hmap_insert(&lflow_map->multiple_od, &old_lflow->hmap_node, hash);
>>>>>>>> +    }
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +#ifdef __clang__
>>>>>>>> +#pragma clang diagnostic push
>>>>>>>> +#pragma clang diagnostic ignored "-Wthread-safety"
>>>>>>>> +#endif
>>>>>>> The section above needs a comment to explain why -Wthread-safety is ignored.
>>>>>> Please, use OVS_NO_THREAD_SAFETY_ANALYSIS marker instead of manually disabling
>>>>>> the diagnostics.
>>>>>>
>>>>>>
>>>>>> On a side note, I ran some tests with this patch set applied and it drops
>>>>>> performance of non-parallel case by 20% on my laptop with databases from
>>>>>> the ovn-heater's 120 node density test.
>>>>>>
>>>>>> I see, you mentioned 10% improvement in the commit message. Assuming it's
>>>>>> with parallelization enabled, right?
>>>>> Yes. And no difference on heater tests up to 36 fake nodes with 7200 ports.
>>>>>
>>>>>> In any case, some performance testing without parallelization required before
>>>>>> accepting the change, as it's a default configuration.
>>>>> They were done. I am somewhat surprised by your measurement; can you please provide some more details?
>>> I have databases from a previous 120 node ovn-heater run.
>>> I'm loading them to the OVN sandbox like this:
>>>
>>> make sandbox SANDBOXFLAGS="--nbdb-source=/tmp/ovnnb_db.db --sbdb-source=/tmp/ovnsb_db.db"
>>>
>>> And performing a couple of small operations with northbound database just to
>>> trigger the northd processing loop.
>>>
>>> Configuration has dp-groups enabled and parallelization disabled.
>>>
>>>>> You are saying "database from 120 nodes" - are we talking about a stopwatch measurement here?
>>>> Which is the case. The patch moved it where it should not be. Will be fixed in the next revision.
>>> Not only stopwatches.  Poll intervals in general grew too.
>>
>> OK. Looking at it. There may be a slight increase due to extra lookup in ovn_lflow_destroy, but it should not be anywhere near 10%.
>
> The other one is that while there are more single-od flows, in most cases ovn_lookup_flow will return a multiple-od flow. That is how the arguments to ovn_lookup_flow pan out. Ditto for flow insertion. That lookup is probably in the wrong order. It should look up

Apologies, pressed send by mistake.

Wanted to say: it should look at multiple-od flows first, then at single-od ones (though it would be good to measure this properly).

A.


>
>>
>>>
>>>> A.
>>>>
>>>>>> Best regards, Ilya Maximets.
>>>>>>
>>>
-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/


