[ovs-dev] [PATCH v1] OVN: Fix the ovn-controller 100% usage issue with put_mac_bindings

Numan Siddique nusiddiq at redhat.com
Wed May 15 18:43:00 UTC 2019


Gentle ping on this patch. This fix is a bit critical to fix the
ovn-controller CPU usage.

Thanks
Numan

On Sat, May 11, 2019 at 2:34 PM Numan Siddique <nusiddiq at redhat.com> wrote:

>
>
> On Sat, May 11, 2019 at 2:09 PM Numan Siddique <nusiddiq at redhat.com>
> wrote:
>
>>
>>
>> On Sat, May 11, 2019 at 5:38 AM Ankur Sharma <ankur.sharma at nutanix.com>
>> wrote:
>>
>>> Hi Han,
>>>
>>> Thanks for review.
>>>
>>> Yup, looks like it, although I did not try the scenario myself, but yes
>>> entry are not getting removed once added,
>>> hence, new mac bindings will not be added after some time (as existing
>>> count reaches 1000).
>>>
>>> Regards,
>>> Ankur
>>>
>>> From: Han Zhou <zhouhan at gmail.com>
>>> Sent: Friday, May 10, 2019 4:47 PM
>>> To: Ankur Sharma <ankur.sharma at nutanix.com>
>>> Cc: ovs-dev at openvswitch.org
>>> Subject: Re: [ovs-dev] [PATCH v1] OVN: Fix the ovn-controller 100% usage
>>> issue with put_mac_bindings
>>>
>>>
>>>
>>> On Fri, May 10, 2019 at 2:46 PM Ankur Sharma <ankur.sharma at nutanix.com
>>> <mailto:ankur.sharma at nutanix.com>> wrote:
>>> >
>>> > ISSUE:
>>> > a. As soon as entries get added to put_mac_bindings in pinctrl.c,
>>> >    ovn-controller CPU consumption reached 100%.
>>> >
>>> > b. This happens because in wait_put_mac_bindings,
>>> >    if !hmap_is_empty(&put_mac_bindings) succeeds and
>>> >    calls poll_immediat_wake().
>>> >
>>> >    ovn-controller.log:
>>> >    "2019-05-10T19:43:28.035Z|00035|poll_loop|INFO|wakeup due to
>>> >     0-ms timeout at ovn/controller/pinctrl.c:2520 (99% CPU usage)"
>>> >
>>> > ROOT_CAUSE:
>>> > a. Earlier it used to work fine, because in run_put_mac_bindings
>>> >    all the bindings in put_mac_bindings would be flushed.
>>> >
>>> > b. Looks like, as a part of adding a new thread in pinctrl, this
>>> >    line got replaced with calling buffer_put_mac_bindings.
>>> >
>>> >     "
>>> >     .
>>> >     .
>>> >     .
>>> >        /* Move the mac bindings from 'put_mac_bindings' hmap to
>>> >      * 'buffered_mac_bindings' and notify the pinctrl_handler.
>>> >      * pinctrl_handler will reinject the buffered packets. */
>>> >     if (!hmap_is_empty(&put_mac_bindings)) {
>>> >         buffer_put_mac_bindings();
>>> >         notify_pinctrl_handler();
>>> >     }
>>> >     "
>>> >
>>> > c. Because of which put_mac_bindings is never emptied and
>>> wait_put_mac_bindings
>>> >    ends up doing poll_immediate_wake.
>>> >
>>> > FIX:
>>> > a. Added call to flush_put_mac_bindings back in
>>> >    run_put_mac_bindings.
>>> >
>>> > b. Additioanlly, logic mentioned in ROOT_CAUSE.b has been changed
>>> >    in 1c24b2f490ba002bbfeb23006965188a7c5b9747, hence updated
>>> >    the documentation in pinctrl.c accordingly.
>>> >
>>> > Signed-off-by: Ankur Sharma <ankur.sharma at nutanix.com<mailto:
>>> ankur.sharma at nutanix.com>>
>>> > ---
>>> >  ovn/controller/pinctrl.c | 9 ++++++---
>>> >  1 file changed, 6 insertions(+), 3 deletions(-)
>>> >
>>> > diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
>>> > index 8ae1f9b..b7bb4c9 100644
>>> > --- a/ovn/controller/pinctrl.c
>>> > +++ b/ovn/controller/pinctrl.c
>>> > @@ -91,11 +91,13 @@ VLOG_DEFINE_THIS_MODULE(pinctrl);
>>> >   *
>>> >   *   - arp/nd_ns      - These actions generate an ARP/IPv6 Neighbor
>>> solicit
>>> >   *                      requests. The original packets are buffered
>>> and
>>> > - *                      injected back when put_arp/put_nd actions are
>>> called.
>>> > + *                      injected back when put_arp/put_nd resolves
>>> > + *                      corresponding ARP/IPv6 Neighbor solicit
>>> requests.
>>> >   *                      When pinctrl_run(), writes the mac bindings
>>> from the
>>> >   *                      'put_mac_bindings' hmap to the MAC_Binding
>>> table in
>>> > - *                      SB DB, it moves these mac bindings to another
>>> hmap -
>>> > - *                      'buffered_mac_bindings'.
>>> > + *                      SB DB, run_buffered_binding will add the
>>> buffered
>>> > + *                      packets to buffered_mac_bindings and notify
>>> > + *                      pinctrl_handler.
>>> >   *
>>> >   *                      The pinctrl_handler thread calls the function
>>> -
>>> >   *                      send_mac_binding_buffered_pkts(), which uses
>>> > @@ -2468,6 +2470,7 @@ run_put_mac_bindings(struct ovsdb_idl_txn
>>> *ovnsb_idl_txn,
>>> >                              sbrec_mac_binding_by_lport_ip,
>>> >                              pmb);
>>> >      }
>>> > +    "flush_put_mac_bindings();
>>
>>
>> Thanks for the patch and finding the issue. This is a big mistake from me
>> when I added the pinctrl_thread.
>>
>> Instead of calling  "flush_put_mac_bindings()" in the function
>> "run_put_mac_bindings()"
>> I would suggest to use - HMAP_FOR_EACH_POP in run_put_mac_bindings() and
>> then free
>> pmb after the call to run_put_mac_binding().
>>
>> If you see flush_put_mac_bindings() also does the same thing. We can
>> avoid running the for loop twice.
>>
>
> You can ignore my suggestion. Calling flush_put_mac_bindings() seems much
> cleaner.
>
> Acked-by: Numan Siddique <nusiddiq at redhat.com>
>
> Thanks
> Numan
>
>
>> Thanks
>> Numan
>>
>> >  }
>>> >
>>> >  static void
>>> > --
>>> > 1.8.3.1
>>> >
>>> > _______________________________________________
>>> > dev mailing list
>>> > dev at openvswitch.org<mailto:dev at openvswitch.org>
>>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev [
>>> mail.openvswitch.org]<
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwMFaQ&c=s883GpUCOChKOHiocYtGcg&r=mZwX9gFQgeJHzTg-68aCJgsODyUEVsHGFOfL90J6MJY&m=P-XiYg1pCA7c5_H6MKaFGMBLWLPE7GAfuUnfELvcIGY&s=6QBH6hjY9c7Vc-Bh4_AzkwWJBBGuOH3Y2v7sXIIU-kc&e=
>>> >
>>>
>>> Thanks for the fix. Does it mean the the mac-binding will not work after
>>> some time since it is never flushed, and the check
>>> "hmap_count(&put_mac_bindings) >= 1000" in pinctrl_handle_put_mac_binding()
>>> will prevent any new mac-binding being handled?
>>>
>>> The fix looks good to me.
>>>
>>> Acked-by: Han Zhou <hzhou8 at ebay.com<mailto:hzhou8 at ebay.com>>
>>> _______________________________________________
>>> dev mailing list
>>> dev at openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>


More information about the dev mailing list