[ovs-dev] [PATCH] ofproto: Fix resource usage explosion due to removal of large number of flows.

Ilya Maximets i.maximets at ovn.org
Tue Nov 30 15:01:54 UTC 2021


On 11/23/21 12:13, Vladislav Odintsov wrote:
> Thanks for the patch!
> 
> Tested-by: Vladislav Odintsov <odivlad at gmail.com>


Thanks!  Applied and backported down to 2.13.

Best regards, Ilya Maximets.

> 
> Regards,
> Vladislav Odintsov
> 
>> On 22 Nov 2021, at 18:23, Ilya Maximets <i.maximets at ovn.org> wrote:
>>
>> While removing flows, the removal itself is deferred, so classifier
>> changes are performed from the RCU thread.  Every deferred removal
>> therefore triggers a classifier change and a reallocation of a pvector.
>> Freeing of the old version of the pvector is postponed.  Since all of
>> this happens in the RCU thread, all these copies of the same pvector
>> are freed only after the next grace period.
>>
>> Below is example output of 'valgrind --tool=massif' from an OVN
>> deployment, where copies of that pvector took 5 GB of memory while
>> processing a bundled flow removal:
>>
>> -------------------------------------------------------------------
>>   n        time(i)         total(B)   useful-heap(B) extra-heap(B)
>> -------------------------------------------------------------------
>>  89 176,257,987,954    5,329,763,160    5,318,171,607    11,591,553
>> 99.78% (5,318,171,607B) (heap allocation functions) malloc/new/new[]
>> ->98.45% (5,247,008,392B) xmalloc__ (util.c:137)
>> |->98.17% (5,232,137,408B) pvector_impl_dup (pvector.c:48)
>> ||->98.16% (5,231,472,896B) pvector_remove (pvector.c:159)
>> |||->98.16% (5,231,472,800B) destroy_subtable (classifier.c:1558)
>> ||||->98.16% (5,231,472,800B) classifier_remove (classifier.c:792)
>> |||| ->98.16% (5,231,472,800B) classifier_remove_assert (classifier.c:832)
>> ||||  ->98.16% (5,231,472,800B) remove_rule_rcu__ (ofproto.c:2978)
>> ||||   ->98.16% (5,231,472,800B) remove_rule_rcu (ofproto.c:2990)
>> ||||    ->98.16% (5,231,472,800B) ovsrcu_call_postponed (ovs-rcu.c:346)
>> ||||     ->98.16% (5,231,472,800B) ovsrcu_postpone_thread (ovs-rcu.c:362)
>> ||||      ->98.16% (5,231,472,800B) ovsthread_wrapper
>> ||||       ->98.16% (5,231,472,800B) start_thread
>> ||||        ->98.16% (5,231,472,800B) clone
>>
>> Fix that by collecting all the flows to be removed and postponing the
>> removal of all of them together.  This way all removals trigger only
>> a single pvector reallocation, greatly reducing CPU and memory usage.
>>
>> Reported-by: Vladislav Odintsov <odivlad at gmail.com>
>> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-November/389538.html
>> Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
>> ---
>> ofproto/ofproto-provider.h |  4 ++++
>> ofproto/ofproto.c          | 31 +++++++++++++++++++++++++++++--
>> 2 files changed, 33 insertions(+), 2 deletions(-)
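
For readers without the diff at hand, the effect described in the commit
message can be illustrated with a small standalone model.  This is only a
sketch under stated assumptions, not the OVS code: 'struct cow_vec',
'struct deferred', defer_free() and the two pending_bytes_*() functions are
hypothetical names invented here, the vector is a plain array of ints, and
the "grace period" is just an explicit call.  It counts how many bytes sit
on the deferred-free list when N elements are removed one copy at a time
versus in a single copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEFERRED 4096

/* Hypothetical copy-on-write vector; NOT the OVS pvector API. */
struct cow_vec {
    size_t n;
    int *elems;
};

/* Stand-in for an RCU "free after the next grace period" list. */
struct deferred {
    void *ptrs[MAX_DEFERRED];
    size_t n_ptrs;
    size_t bytes;               /* Memory held until the grace period. */
};

static void *xalloc(size_t bytes)
{
    void *p = malloc(bytes ? bytes : 1);
    if (!p) {
        abort();
    }
    return p;
}

static void defer_free(struct deferred *d, void *p, size_t bytes)
{
    assert(d->n_ptrs < MAX_DEFERRED);
    d->ptrs[d->n_ptrs++] = p;
    d->bytes += bytes;
}

/* Models the end of a grace period: every old copy is released at once. */
static void grace_period(struct deferred *d)
{
    for (size_t i = 0; i < d->n_ptrs; i++) {
        free(d->ptrs[i]);
    }
    d->n_ptrs = 0;
    d->bytes = 0;
}

/* Publishes a copy of 'v' without its first 'k' elements and defers
 * freeing of the old array instead of freeing it immediately. */
static void cow_remove_front(struct cow_vec *v, size_t k, struct deferred *d)
{
    size_t new_n = v->n - k;
    int *copy = xalloc(new_n * sizeof *copy);

    memcpy(copy, v->elems + k, new_n * sizeof *copy);
    defer_free(d, v->elems, v->n * sizeof *v->elems);
    v->elems = copy;
    v->n = new_n;
}

static struct cow_vec make_vec(size_t n)
{
    struct cow_vec v = { n, xalloc(n * sizeof(int)) };

    for (size_t i = 0; i < n; i++) {
        v.elems[i] = (int) i;
    }
    return v;
}

/* One copy per removed element, all inside one grace period.
 * Requires n <= MAX_DEFERRED. */
size_t pending_bytes_one_by_one(size_t n)
{
    struct cow_vec v = make_vec(n);
    struct deferred d = { .n_ptrs = 0 };

    for (size_t i = 0; i < n; i++) {
        cow_remove_front(&v, 1, &d);
    }
    size_t pending = d.bytes;
    grace_period(&d);
    free(v.elems);
    return pending;
}

/* All removals collected first, then applied with a single copy. */
size_t pending_bytes_batched(size_t n)
{
    struct cow_vec v = make_vec(n);
    struct deferred d = { .n_ptrs = 0 };

    cow_remove_front(&v, n, &d);
    size_t pending = d.bytes;
    grace_period(&d);
    free(v.elems);
    return pending;
}
```

In this model, removing N elements one by one leaves old arrays of sizes
N, N-1, ..., 1 pending, i.e. N*(N+1)/2 elements in total, while the batched
variant leaves a single array of N elements.  The quadratic growth is the
same shape of problem as the pvector copies in the massif output above; the
real patch differs in every detail.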

