[ovs-dev] [OVN Patch v6 4/4] northd: Restore parallel build with dp_groups

Anton Ivanov anton.ivanov at cambridgegreys.com
Wed Sep 15 12:47:55 UTC 2021


On 15/09/2021 13:45, Anton Ivanov wrote:
> I have dropped the split processing of lflows for now.

Sorry, pressed send to early by mistake.

V8 has the same code as the existing ovn for single thread and code derived from the my split processing work for multi-thread.

It is tested to be identical to master in single thread and significantly faster (albeit at CPU cost) than single thread if you have at least

8+ threads to work on it.

A.

>
> On 10/09/2021 17:13, Ilya Maximets wrote:
>> On 9/10/21 4:19 PM, Anton Ivanov wrote:
>>> On 10/09/2021 13:29, Ilya Maximets wrote:
>>>> On 9/10/21 2:23 PM, Anton Ivanov wrote:
>>>>> On 10/09/2021 13:18, Anton Ivanov wrote:
>>>>>> On 10/09/2021 12:43, Ilya Maximets wrote:
>>>>>>> On 9/9/21 11:02 PM, Mark Michelson wrote:
>>>>>>>> Hi Anton,
>>>>>>>>
>>>>>>>> On a high level, this change results in some parts of the parallelized hashmap being unused. Should we keep the hashrow_locks structure and its APIs, for instance?
>>>>>>>>
>>>>>>>> See below for more comments.
>>>>>>>>
>>>>>>>> On 9/2/21 9:27 AM, anton.ivanov at cambridgegreys.com wrote:
>>>>>>>>> From: Anton Ivanov <anton.ivanov at cambridgegreys.com>
>>>>>>>>>
>>>>>>>>> Restore parallel build with dp groups using rwlock instead
>>>>>>>>> of per row locking as an underlying mechanism.
>>>>>>>>>
>>>>>>>>> This provides improvement ~ 10% end-to-end on ovn-heater
>>>>>>>>> under virutalization despite awakening some qemu gremlin
>>>>>>>>> which makes qemu climb to silly CPU usage. The gain on
>>>>>>>>> bare metal is likely to be higher.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Anton Ivanov <anton.ivanov at cambridgegreys.com>
>>>>>>>>> ---
>>>>>>>>>      northd/ovn-northd.c | 215 ++++++++++++++++++++++++++++++++------------
>>>>>>>>>      1 file changed, 159 insertions(+), 56 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>>>>>>>> index 1f903882b..4537c55dd 100644
>>>>>>>>> --- a/northd/ovn-northd.c
>>>>>>>>> +++ b/northd/ovn-northd.c
>>>>>>>>> @@ -59,6 +59,7 @@
>>>>>>>>>      #include "unixctl.h"
>>>>>>>>>      #include "util.h"
>>>>>>>>>      #include "uuid.h"
>>>>>>>>> +#include "ovs-thread.h"
>>>>>>>>>      #include "openvswitch/vlog.h"
>>>>>>>>>        VLOG_DEFINE_THIS_MODULE(ovn_northd);
>>>>>>>>> @@ -4372,7 +4373,26 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct ovn_datapath *od,
>>>>>>>>>      static bool use_logical_dp_groups = false;
>>>>>>>>>      static bool use_parallel_build = true;
>>>>>>>>>      -static struct hashrow_locks lflow_locks;
>>>>>>>>> +static struct ovs_rwlock flowtable_lock;
>>>>>>>> With the change from the custom hashrow_locks to using an ovs_rwlock, I think it's important to document what data this lock is intending to protect.
>>>>>>>>
>>>>>>>>    From what I can tell, this lock specifically is intended to protect access to the hmaps in the global lflow_state structure, and it's also intended to protect all ovn_datapaths' od_group hmaps. This is not something that is immediately obvious just from a global rwlock declaration.
>>>>>>>>
>>>>>>>>> +
>>>>>>>>> +static void ovn_make_multi_lflow(struct ovn_lflow *old_lflow,
>>>>>>>>> +                              struct ovn_datapath *od,
>>>>>>>>> +                              struct lflow_state *lflow_map,
>>>>>>>>> +                              uint32_t hash)
>>>>>>>>> +{
>>>>>>>>> +    hmapx_add(&old_lflow->od_group, od);
>>>>>>>>> +    hmap_remove(&lflow_map->single_od, &old_lflow->hmap_node);
>>>>>>>>> +    if (use_parallel_build) {
>>>>>>>>> + hmap_insert_fast(&lflow_map->multiple_od, &old_lflow->hmap_node, hash);
>>>>>>>>> +    } else {
>>>>>>>>> + hmap_insert(&lflow_map->multiple_od, &old_lflow->hmap_node, hash);
>>>>>>>>> +    }
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +#ifdef __clang__
>>>>>>>>> +#pragma clang diagnostic push
>>>>>>>>> +#pragma clang diagnostic ignored "-Wthread-safety"
>>>>>>>>> +#endif
>>>>>>>> The section above needs a comment to explain why -Wthread-safety is ignored.
>>>>>>> Please, use OVS_NO_THREAD_SAFETY_ANALYSIS marker instead of manually disabling
>>>>>>> the diagnostics.
>>>>>>>
>>>>>>>
>>>>>>> On a side note, I ran some tests with this patch set applied and it drops
>>>>>>> performance of non-parallel case by 20% on my laptop with databases from
>>>>>>> the ovn-heater's 120 node density test.
>>>>>>>
>>>>>>> I see, you mentioned 10% improvement in the commit message. Assuming it's
>>>>>>> with parallelization enabled, right?
>>>>>> Yes. And no difference on heater tests up to 36 fake nodes with 7200 ports.
>>>>>>
>>>>>>> In any case, some performance testing without parallelization required before
>>>>>>> accepting the change, as it's a default configuration.
>>>>>> They were done. I am somewhat surprised by your measurement, can you please provide some more details.
>>>> I have databases from a previous 120 node ovn-heater run.
>>>> I'm loading them to the OVN sandbox like this:
>>>>
>>>> make sandbox SANDBOXFLAGS="--nbdb-source=/tmp/ovnnb_db.db --sbdb-source=/tmp/ovnsb_db.db"
>>>>
>>>> And performing a couple of small operations with northbound database just to
>>>> trigger the northd processing loop.
>>>>
>>>> Configuration has dp-groups enabled and parallelization disabled.
>>> Can you run v7 versus that. It should fix 3 performance regressions I found in v6. Each of them is relatively small, but they may add up for a large database.
>> v7 seems only slightly better.  By a couple of percents.
>> Nowhere close to the current master.
>>
>> And stopwatches in v7 are still messed up.
>>
>>> I will run an ovn-heater with this, but to be honest - I do not expect any difference. I did not see any difference before at the scales I was testing.
>>>
>>> If it is still slower than the original code, it will be interesting to find the culprit. They should be nearly identical.
>>>
>>> A.
>>>
>>>>>> You are saying "database from 120 nodes" - are we talking a stopwatch measurement here?
>>>>> Which is the case. The patch moved it where it should not be. Will be fixed in the next revision.
>>>> Not only stopwatches.  Poll intervals in general grew too.
>>>>
>>>>> A.
>>>>>
>>>>>>> Best regards, Ilya Maximets.
>>>>>>>
>>
-- 
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/



More information about the dev mailing list