[ovs-dev] [PATCH] ofproto: Fix resource usage explosion due to removal of large number of flows.
Ilya Maximets
i.maximets at ovn.org
Tue Nov 30 15:01:54 UTC 2021
On 11/23/21 12:13, Vladislav Odintsov wrote:
> Thanks for the patch!
>
> Tested-by: Vladislav Odintsov <odivlad at gmail.com>
Thanks! Applied and backported down to 2.13.
Best regards, Ilya Maximets.
>
> Regards,
> Vladislav Odintsov
>
>> On 22 Nov 2021, at 18:23, Ilya Maximets <i.maximets at ovn.org> wrote:
>>
>> While removing flows, the removal itself is deferred, so the classifier
>> changes are performed from the RCU thread. This way every deferred
>> removal triggers a classifier change and a reallocation of a pvector.
>> Freeing of the old version of the pvector is postponed. Since all this
>> is happening from an RCU thread, all these copies of the same pvector
>> will be freed only after the next grace period.
>>
>> Below is the example output of the 'valgrind --tool=massif' from an OVN
>> deployment, where copies of that pvector took 5 GB of memory while
>> processing a bundled flow removal:
>>
>> -------------------------------------------------------------------
>> n time(i) total(B) useful-heap(B) extra-heap(B)
>> -------------------------------------------------------------------
>> 89 176,257,987,954 5,329,763,160 5,318,171,607 11,591,553
>> 99.78% (5,318,171,607B) (heap allocation functions) malloc/new/new[]
>> ->98.45% (5,247,008,392B) xmalloc__ (util.c:137)
>> |->98.17% (5,232,137,408B) pvector_impl_dup (pvector.c:48)
>> ||->98.16% (5,231,472,896B) pvector_remove (pvector.c:159)
>> |||->98.16% (5,231,472,800B) destroy_subtable (classifier.c:1558)
>> ||||->98.16% (5,231,472,800B) classifier_remove (classifier.c:792)
>> |||| ->98.16% (5,231,472,800B) classifier_remove_assert (classifier.c:832)
>> |||| ->98.16% (5,231,472,800B) remove_rule_rcu__ (ofproto.c:2978)
>> |||| ->98.16% (5,231,472,800B) remove_rule_rcu (ofproto.c:2990)
>> |||| ->98.16% (5,231,472,800B) ovsrcu_call_postponed (ovs-rcu.c:346)
>> |||| ->98.16% (5,231,472,800B) ovsrcu_postpone_thread (ovs-rcu.c:362)
>> |||| ->98.16% (5,231,472,800B) ovsthread_wrapper
>> |||| ->98.16% (5,231,472,800B) start_thread
>> |||| ->98.16% (5,231,472,800B) clone
>>
>> Collect all the flows to be removed and postpone the removal for all
>> of them together to avoid the problem. This way all removals trigger
>> only a single pvector reallocation, greatly reducing CPU and memory
>> usage.
>>
>> Reported-by: Vladislav Odintsov <odivlad at gmail.com>
>> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2021-November/389538.html
>> Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
>> ---
>> ofproto/ofproto-provider.h | 4 ++++
>> ofproto/ofproto.c | 31 +++++++++++++++++++++++++++++--
>> 2 files changed, 33 insertions(+), 2 deletions(-)