[ovs-dev] Scaling of Logical_Flows and MAC_Binding tables

Renat Nurgaliyev impleman at gmail.com
Tue Dec 1 12:18:41 UTC 2020


On Tue, Dec 01, 2020 at 11:41:29AM +0100, Dumitru Ceara wrote:
> On 12/1/20 10:41 AM, Renat Nurgaliyev wrote:
> > On Mon, Nov 30, 2020 at 12:28:56PM -0800, Han Zhou wrote:
> >> On Mon, Nov 30, 2020 at 12:13 PM Renat Nurgaliyev <impleman at gmail.com>
> >> wrote:
> >>>
> >>> On 30.11.20 07:07, Numan Siddique wrote:
> >>>> On Mon, Nov 30, 2020 at 7:37 AM Han Zhou <hzhou at ovn.org> wrote:
> >>>>> On Sat, Nov 28, 2020 at 12:31 PM Tony Liu <tonyliu0592 at hotmail.com> wrote:
> >>>>>> Hi Renat,
> >>>
> >>> Hi folks,
> >>>>>>
> >>>>>> What's this "logical datapath patches that Ilya Maximets submitted"?
> >>>>>> Could you share some links?
> >>>>>>
> >>>>>> There were a couple of discussions of a similar issue.
> >>>>>> [1] raised the issue and resulted in a new option,
> >>>>>> always_learn_from_arp_request, being added [2].
> >>>>>> [3] resulted in a patch to the OVN ML2 driver [4] to set the option
> >>>>>> added by [1].
> >>>>>>
> >>>>>> It seems that it helps to optimize the Logical_Flow table.
> >>>>>> I am not sure if it helps with MAC_Binding as well.
> >>>>>>
> >>>>>> Is it the same issue we are trying to address here, by either
> >>>>>> Numan's local cache or the solution proposed by Dumitru?
> >>>>>>
> >>>>>> [1] https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049994.html
> >>>>>> [2] https://github.com/ovn-org/ovn/commit/61ccc6b5fc7c49b512e26347cfa12b86f0ec2fd9#diff-05b24a3133733fb7b0f979698083b8128e8f1f18c3c2bd09002ae788d34a32f5
> >>>>>> [3] http://osdir.com/openstack-discuss/msg16002.html
> >>>>>> [4] https://review.opendev.org/c/openstack/neutron/+/752678
> >>>>>>
> >>>>>>
> >>>>>> Thanks!
> >>>>>> Tony
> >>>>> Thanks Tony for pointing to the old discussion [0]. I thought
> >>>>> setting the option always_learn_from_arp_request to "false" on the
> >>>>> logical routers should have solved this scale problem with the
> >>>>> MAC_Binding table in this scenario.
> >>>>>
> >>>>> However, it seems the commit a2b88dc513 ("pinctrl: Directly update
> >>>>> MAC_Bindings created by self originated GARPs.") has overridden the
> >>>>> option. (I haven't tested, but maybe @Dumitru Ceara
> >>>>> <dceara at redhat.com> can confirm.)
> >>>>>
> >>>>> Similarly, the Logical_Flow explosion should have been solved by
> >>>>> setting the option dynamic_neigh_routers to "true".
> >>>>>
> >>>>> I think these two options are exactly for the scenario Renat is
> >>>>> reporting. @Renat, could you try setting these options as suggested
> >>>>> above, using the OVN version before the commit a2b88dc513, to see if
> >>>>> it solves your problem?
> >>>>>
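> >>>>> For reference, assuming a router named "lr0", the two options can be
> >>>>> set with something like:
> >>>>>
> >>>>>   ovn-nbctl set Logical_Router lr0 options:dynamic_neigh_routers=true
> >>>>>   ovn-nbctl set Logical_Router lr0 \
> >>>>>       options:always_learn_from_arp_request=false
> >>>>>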
> >>>> When you test it out with the suggested commit, please delete the
> >>>> mac_binding entries manually, as neither ovn-northd nor
> >>>> ovn-controller deletes any entries from the mac_binding table.
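> >>>> (Something like "ovn-sbctl --all destroy mac_binding" should wipe the
> >>>> table, though it's worth trying on a lab setup first.)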
> >>>
> >>> We tested with dynamic_neigh_routers set to true, and we saw a very
> >>> positive change: the size of the Logical_Flow table decreased from
> >>> 600k entries to 100k. This is a huge difference, thanks for pointing
> >>> this out!
> >>>
> >>> It did not affect the MAC_Binding table with commit a2b88dc513
> >>> ("pinctrl: Directly update MAC_Bindings created by self originated
> >>> GARPs."), but that was expected. Just for testing purposes we
> >>> commented out some code as follows:
> >>>
> >>> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> >>> index 291202c24..76047939c 100644
> >>> --- a/controller/pinctrl.c
> >>> +++ b/controller/pinctrl.c
> >>> @@ -4115,10 +4115,10 @@ send_garp_rarp_update(struct ovsdb_idl_txn *ovnsb_idl_txn,
> >>>                                     laddrs->ipv4_addrs[i].addr,
> >>>                                     binding_rec->datapath->tunnel_key,
> >>>                                     binding_rec->tunnel_key);
> >>> -                    send_garp_locally(ovnsb_idl_txn,
> >>> -                                      sbrec_mac_binding_by_lport_ip,
> >>> -                                      local_datapaths, binding_rec, laddrs->ea,
> >>> -                                      laddrs->ipv4_addrs[i].addr);
> >>> +                    //send_garp_locally(ovnsb_idl_txn,
> >>> +                    //                  sbrec_mac_binding_by_lport_ip,
> >>> +                    //                  local_datapaths, binding_rec, laddrs->ea,
> >>> +                    //                  laddrs->ipv4_addrs[i].addr);
> >>>
> >>>                   }
> >>>                   free(name);
> >>>
> >>> Together with dynamic_neigh_routers we achieved quite a stable setup,
> >>> with a 62 MiB SB database, which is a huge step forward from 1.9 GiB.
> >>> The MAC_Binding table stays at around 2000 entries, compared to almost
> >>> a million before.
> >>>
> >>> Would it make sense to make the behaviour introduced in a2b88dc513
> >>> toggleable via a command-line option until there is a better solution?
> >>>
> >>> Thanks,
> >>> Renat.
> >>>
> >>
> >> Thanks Renat for the testing. The results look good. Just to confirm:
> >> in the final test with the code change above, did you also set
> >> "always_learn_from_arp_request" to "false"?
> > 
> > Hi Han,
> > 
> > yes, sorry for not making it clear initially:
> > always_learn_from_arp_request is set to false.
> > 
> 
> Hi Renat, Han,
> 
> I sent a patch to honor always_learn_from_arp_request in pinctrl:
> 
> http://patchwork.ozlabs.org/project/ovn/patch/1606818591-23265-1-git-send-email-dceara@redhat.com/
> 
> Sorry for the regression; this should fix it until we decide whether to
> move MAC_Bindings to the LS or use a local cache.

Hi Dumitru,

thanks for submitting the patch! We have already tested it in our lab and
it works well: the MAC_Binding table is almost empty and everything works.
We also don't see any significant regression in east-west or north-south
traffic.

Thanks,
Renat.

> 
> Thanks,
> Dumitru
> 
> >> I think the logic introduced in a2b88dc513 can check the option
> >> "always_learn_from_arp_request" instead of overriding it.
> >>
> >> Also, regarding Winson's question:
> >>> We moved to the ovn 20.09 branch recently and the mac binding issues
> >>> happen again in our ovn-k8s scale test cluster.
> >>> Is there a quick workaround to make the option
> >>> "always_learn_from_arp_request" work again?
> >>>
> >> Thanks Winson for confirming. As mentioned above, I think the logic of
> >> the patch "pinctrl: Directly update MAC_Bindings created by self
> >> originated GARPs." can be updated to add the check for this option, to
> >> restore the behavior. Before the fix, I think a quick workaround for
> >> you in 20.09 could be reverting the following patches (I haven't
> >> tested, though):
> >> 1. "ovn-northd: Limit self originated ARP/ND broadcast domain."
> >> 2. "pinctrl: Fix segfault seen when creating mac_binding for local GARPs."
> >> 3. "pinctrl: Directly update MAC_Bindings created by self originated GARPs."
> >>
> >> Thanks,
> >> Han
> >>
> >>>>> Regarding the proposals in this thread:
> >>>>> - Move MAC_Binding to LS (by Dumitru)
> >>>>>      This sounds good to me, though I am not sure about all the
> >>>>> implications yet; I wonder why it was associated with the LRP in the
> >>>>> first place.
> >>>>>
> >>>>> - Remove MAC_Binding from SB (by Numan)
> >>>>>      I am a little concerned about this. The MAC_Binding in SB is
> >>>>> required for a distributed LR to work with dynamic ARP resolving.
> >>>>> Consider a general use case: A - LS1 - LR1 - LS2 - B. A is on HV1
> >>>>> and B is on HV2. Now A sends a packet to B's IP. Assume B's IP is
> >>>>> unknown by OVN. The packet is routed by LR1, and on the LRP facing
> >>>>> LS2 an ARP is sent out over the LS2 logical network. The above steps
> >>>>> happen on HV1. Now the ARP request reaches HV2 and is received by B,
> >>>>> so B sends an ARP response. With the current implementation, HV2
> >>>>> would learn the MAC-IP binding from the ARP response and update the
> >>>>> SB DB, and HV1 would get the SB update and install the MAC binding
> >>>>> flow as a result of ARP resolving. The next time A sends a packet to
> >>>>> B, HV1 will directly resolve the ARP from the MAC binding flows
> >>>>> locally and send the IP packet to HV2. The SB DB MAC_Binding table
> >>>>> works as a distributed ARP/neighbor cache. It is a mechanism to sync
> >>>>> the ARP cache from the place where it is learned to the place where
> >>>>> it is initiated, and all HVs benefit from this without the need to
> >>>>> send ARPs themselves for the same LRP. In other words, the LRP is
> >>>>> distributed, so the ARP resolving happens in a distributed fashion.
> >>>>> Without this, each HV would initiate ARP requests on behalf of the
> >>>>> same LRP, which would largely increase the ARP traffic unnecessarily
> >>>>> - even more than in a traditional network (where one physical router
> >>>>> only needs to do one ARP resolution for each neighbor and maintain
> >>>>> one copy of the ARP cache). And I am not sure if there are other
> >>>>> side effects when an endpoint sees unexpectedly frequent ARP
> >>>>> requests from the same LRP - would there be any rate limit that even
> >>>>> discards repeated ARP requests from the same source? Numan, maybe
> >>>>> you have already considered these. Would you share your thoughts?
> >>>> Thanks for the comments and for highlighting this use case, which I
> >>>> missed completely.
> >>>>
> >>>> I was thinking more along the lines of the N-S use case with a
> >>>> distributed gateway router port, and I completely missed the E-W
> >>>> scenario with an unknown address. If we don't consider the unknown
> >>>> address scenario, I think moving away from the MAC_Binding south db
> >>>> table would be beneficial in the long run, for a few reasons:
> >>>>     1. Better scale.
> >>>>     2. Addressing the stale mac_binding entries (which presently the
> >>>> CMS has to handle).
> >>>>
> >>>> For the N-S traffic scenario, the ovn-controller claiming the gw
> >>>> router port will take care of generating the ARP.
> >>>> For the Floating IP DVR scenario, each compute node will have to
> >>>> generate the ARP request to learn a remote MAC.
> >>>> I think this should be fine as it is just a one-time thing.
> >>>>
> >>>> Regarding the unknown address scenario, right now ovn-controller
> >>>> floods the packet to all the unknown logical ports of a switch if OVN
> >>>> doesn't know the MAC. All these unknown logical ports belong to a
> >>>> multicast group.
> >>>>
> >>>> I think we should solve this case. In the case of OpenStack, when
> >>>> port security is disabled for a neutron port, the logical port will
> >>>> have an unknown address configured. There are a few related
> >>>> bugzilla/launchpad bugs [1].
> >>>>
> >>>> I think we should fix this behavior in OVN, and OVN should do the
> >>>> mac learning on the switch for the unknown ports. If we do that, I
> >>>> think the scenario you mentioned will be addressed.
> >>>>
> >>>> Maybe we can extend Dumitru's suggestion and have just one approach
> >>>> which does the mac learning on the switch (keeping the SB MAC_Binding
> >>>> table):
> >>>>      -  for unknown logical ports
> >>>>      -  for unknown macs for the N-S routing.
> >>>>
> >>>> Any thoughts ?
> >>>>
> >>>> FYI - I have a PoC/RFC patch in progress which adds the mac binding
> >>>> cache support -
> >>>> https://github.com/numansiddique/ovn/commit/22082d04ca789155ea2edd3c1706bde509ae44da
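> >>>>
> >>>> For discussion, a minimal sketch of what such an in-memory cache
> >>>> could look like (names and layout here are illustrative, not
> >>>> necessarily what the PoC does):
> >>>>
> >>>>     /* OVS tree headers. */
> >>>>     #include "hash.h"
> >>>>     #include "packets.h"
> >>>>     #include "openvswitch/hmap.h"
> >>>>
> >>>>     /* Local neighbor cache entry, keyed on (datapath key, IP). */
> >>>>     struct local_mac_binding {
> >>>>         struct hmap_node node;    /* In 'local_mac_bindings'. */
> >>>>         uint32_t dp_key;          /* Datapath tunnel key. */
> >>>>         ovs_be32 ip;              /* Bound IPv4 address. */
> >>>>         struct eth_addr mac;      /* Learned MAC. */
> >>>>         long long int last_seen;  /* For aging out stale entries. */
> >>>>     };
> >>>>
> >>>>     static struct hmap local_mac_bindings =
> >>>>         HMAP_INITIALIZER(&local_mac_bindings);
> >>>>
> >>>>     static struct local_mac_binding *
> >>>>     local_mac_binding_find(uint32_t dp_key, ovs_be32 ip)
> >>>>     {
> >>>>         /* ovs_be32 is a uint32_t underneath; fine for hashing. */
> >>>>         uint32_t hash = hash_2words(dp_key, (uint32_t) ip);
> >>>>         struct local_mac_binding *b;
> >>>>
> >>>>         HMAP_FOR_EACH_WITH_HASH (b, node, hash, &local_mac_bindings) {
> >>>>             if (b->dp_key == dp_key && b->ip == ip) {
> >>>>                 return b;
> >>>>             }
> >>>>         }
> >>>>         return NULL;
> >>>>     }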
> >>>>
> >>>> [1] - https://review.opendev.org/c/openstack/neutron/+/763567/
> >>>>       https://bugzilla.redhat.com/show_bug.cgi?id=1888441
> >>>>       https://bugs.launchpad.net/neutron/+bug/1904412
> >>>>       https://bugzilla.redhat.com/show_bug.cgi?id=1672625
> >>>>
> >>>> Thanks
> >>>> Numan
> >>>>
> >>>>> Thanks,
> >>>>> Han
> >>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: dev <ovs-dev-bounces at openvswitch.org> On Behalf Of Numan Siddique
> >>>>>>> Sent: Thursday, November 26, 2020 11:36 AM
> >>>>>>> To: Daniel Alvarez Sanchez <dalvarez at redhat.com>
> >>>>>>> Cc: ovs-dev <ovs-dev at openvswitch.org>
> >>>>>>> Subject: Re: [ovs-dev] Scaling of Logical_Flows and MAC_Binding tables
> >>>>>>>
> >>>>>>> On Thu, Nov 26, 2020 at 4:32 PM Numan Siddique <numans at ovn.org> wrote:
> >>>>>>>> On Thu, Nov 26, 2020 at 4:11 PM Daniel Alvarez Sanchez
> >>>>>>>> <dalvarez at redhat.com> wrote:
> >>>>>>>>> On Wed, Nov 25, 2020 at 7:59 PM Dumitru Ceara <dceara at redhat.com> wrote:
> >>>>>>>>>> On 11/25/20 7:06 PM, Numan Siddique wrote:
> >>>>>>>>>>> On Wed, Nov 25, 2020 at 10:24 PM Renat Nurgaliyev <impleman at gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 25.11.20 16:14, Dumitru Ceara wrote:
> >>>>>>>>>>>>> On 11/25/20 3:30 PM, Renat Nurgaliyev wrote:
> >>>>>>>>>>>>>> Hello folks,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Renat,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> we run a lab where we try to evaluate the scalability
> >>>>>>>>>>>>>> potential of OVN with OpenStack as the CMS.
> >>>>>>>>>>>>>> The current lab setup is the following:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 500 networks
> >>>>>>>>>>>>>> 500 routers
> >>>>>>>>>>>>>> 1500 VM ports (3 per network/router)
> >>>>>>>>>>>>>> 1500 Floating IPs (one per VM port)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> There is an external network, which is bridged to br-provider
> >>>>>>>>>>>>>> on the gateway nodes. There are 2000 ports connected to this
> >>>>>>>>>>>>>> external network (1500 Floating IPs + 500 SNAT router ports).
> >>>>>>>>>>>>>> So the setup is not very big, we'd say, but after applying
> >>>>>>>>>>>>>> this configuration via the ML2/OVN plugin, northd kicks in
> >>>>>>>>>>>>>> and does its job, and after it's done, the Logical_Flow table
> >>>>>>>>>>>>>> gets 645877 entries, which is way too much. But OK, we move
> >>>>>>>>>>>>>> on and start one controller on the gateway chassis, and here
> >>>>>>>>>>>>>> things get really messy. The MAC_Binding table grows from 0
> >>>>>>>>>>>>>> to 999088 entries in one moment, and after it's done, the
> >>>>>>>>>>>>>> sizes of the biggest SB tables look like this:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 999088 MAC_Binding
> >>>>>>>>>>>>>> 645877 Logical_Flow
> >>>>>>>>>>>>>> 4726 Port_Binding
> >>>>>>>>>>>>>> 1117 Multicast_Group
> >>>>>>>>>>>>>> 1068 Datapath_Binding
> >>>>>>>>>>>>>> 1046 Port_Group
> >>>>>>>>>>>>>> 551 IP_Multicast
> >>>>>>>>>>>>>> 519 DNS
> >>>>>>>>>>>>>> 517 HA_Chassis_Group
> >>>>>>>>>>>>>> 517 HA_Chassis
> >>>>>>>>>>>>>> ...
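> >>>>>>>>>>>>>> (for reference, per-table row counts like these can be
> >>>>>>>>>>>>>> obtained with e.g. "ovn-sbctl list mac_binding | grep -c _uuid")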
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The MAC_Binding table gets huge; basically it now has an
> >>>>>>>>>>>>>> entry for every port that is connected to the external
> >>>>>>>>>>>>>> network * the number of datapaths, which makes roughly one
> >>>>>>>>>>>>>> million entries. This table by itself increases the size of
> >>>>>>>>>>>>>> the SB by 200 megabytes. The Logical_Flow table also gets
> >>>>>>>>>>>>>> very heavy; we have already played a bit with the logical
> >>>>>>>>>>>>>> datapath patches that Ilya Maximets submitted, and it looks
> >>>>>>>>>>>>>> much better, but the size of the MAC_Binding table still
> >>>>>>>>>>>>>> feels inadequate.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> We would like to start working at least on MAC_Binding table
> >>>>>>>>>>>>>> optimisation, but it is a bit difficult to start from
> >>>>>>>>>>>>>> scratch. Can someone help us with ideas on how this could be
> >>>>>>>>>>>>>> optimised?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Maybe it would also make sense to group entries in the
> >>>>>>>>>>>>>> MAC_Binding table in the same way as is proposed for logical
> >>>>>>>>>>>>>> flows in Ilya's patch?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> Maybe it would work, but I'm not really sure how right now.
> >>>>>>>>>>>>> However, what if we change the way MAC_Bindings are created?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Right now a MAC_Binding is created for each logical router
> >>>>>>>>>>>>> port, but in your case there are a lot of logical router
> >>>>>>>>>>>>> ports connected to the single provider logical switch and
> >>>>>>>>>>>>> they all learn the same ARPs.
> >>>>>>>>>>>>> What if we instead store MAC_Bindings per logical switch,
> >>>>>>>>>>>>> basically sharing all these MAC_Bindings between all router
> >>>>>>>>>>>>> ports connected to the same LS?
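> >>>>>>>>>>>>> For reference, a row in that table (made-up values) looks
> >>>>>>>>>>>>> roughly like:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>   $ ovn-sbctl --columns=logical_port,ip,mac list MAC_Binding
> >>>>>>>>>>>>>   logical_port : "lrp-a1b2"
> >>>>>>>>>>>>>   ip           : "203.0.113.10"
> >>>>>>>>>>>>>   mac          : "fa:16:3e:01:02:03"
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> so rows learned via different router ports on the same LS
> >>>>>>>>>>>>> duplicate the same (ip, mac) pair.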
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Do you see any problem with this approach?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Dumitru
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>> I believe that this approach is the way to go; at least
> >>>>>>>>>>>> nothing comes to my mind that could go wrong here. We will try
> >>>>>>>>>>>> to make a patch for that. However, if someone who is familiar
> >>>>>>>>>>>> with the code knows how to do it fast, that would also be very
> >>>>>>>>>>>> nice.
> >>>>>>>>>>> This approach should work.
> >>>>>>>>>>>
> >>>>>>>>>>> I've another idea (I won't call it a solution yet). What if we
> >>>>>>>>>>> drop the usage of MAC_Binding altogether?
> >>>>>>>>>> This would be great!
> >>>>>>>>>>
> >>>>>>>>>>> - When ovn-controller learns a mac binding, it will not create
> >>>>>>>>>>> a row in the SB MAC_Binding table.
> >>>>>>>>>>> - Instead it will maintain the learnt mac binding in its memory.
> >>>>>>>>>>> - ovn-controller will still program table 66 with the flow to
> >>>>>>>>>>> set the eth.dst (for the get_arp() action).
> >>>>>>>>>>>
> >>>>>>>>>>> This has a couple of advantages:
> >>>>>>>>>>>    - Right now we never flush old/stale mac_binding entries.
> >>>>>>>>>>>    - If the mac of an external IP has changed but OVN has an
> >>>>>>>>>>> entry for that IP with the old mac in the mac_binding table, we
> >>>>>>>>>>> will use the old mac, causing the packet to be sent to the
> >>>>>>>>>>> wrong destination and possibly lost. We would get rid of this
> >>>>>>>>>>> problem.
> >>>>>>>>>>>    - We will also save SB DB space.
> >>>>>>>>>>>
> >>>>>>>>>>> There are a few disadvantages:
> >>>>>>>>>>>    - Other ovn-controllers will not add the flows in table 66.
> >>>>>>>>>>> I guess this should be fine as each ovn-controller can generate
> >>>>>>>>>>> the ARP request and learn the mac.
> >>>>>>>>>>>    - When ovn-controller restarts we lose the learnt macs and
> >>>>>>>>>>> would need to learn them again.
> >>>>>>>>>>>
> >>>>>>>>>>> Any thoughts on this?
> >>>>>>>>> It'd be great to have some sort of local ARP cache, but I'm
> >>>>>>>>> concerned about the performance implications.
> >>>>>>>>>
> >>>>>>>>> - How are you going to determine when an entry is stale?
> >>>>>>>>> If you slow-path the packets to reset the timeout every time a
> >>>>>>>>> packet with the source mac is received, it doesn't look good.
> >>>>>>>>> Maybe you have something else in mind.
> >>>>>>>> Right now we never expire any mac_binding entry. If I understand
> >>>>>>>> you correctly, your concern is: for the scenario where a floating
> >>>>>>>> IP is updated with a different mac, how does the local cache get
> >>>>>>>> updated?
> >>>>>>>>
> >>>>>>>> Right now networking-ovn (in the case of OpenStack) updates the
> >>>>>>>> mac_binding entry in the south db for such cases, right?
> >>>>>>>>
> >>>>>>> FYI - I have started working on this approach as a PoC, i.e. using
> >>>>>>> a local mac_binding cache instead of the SB mac_binding table.
> >>>>>>>
> >>>>>>> I will update this thread about the progress.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Numan
> >>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Numan
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> There's another scenario that we need to take care of, and it
> >>>>>>>>>> doesn't seem too obvious to address without MAC_Bindings.
> >>>>>>>>>>
> >>>>>>>>>> GARPs were being injected in the L2 broadcast domain of an LS
> >>>>>>>>>> for NAT addresses in case FIPs are reused by the CMS, introduced
> >>>>>>>>>> by:
> >>>>>>>>>>
> >>>>>>>>>> https://github.com/ovn-org/ovn/commit/069a32cbf443c937feff44078e8828d7a2702da8
> >>>>>>>>>
> >>>>>>>>> Dumitru and I have been discussing the possibility of reverting
> >>>>>>>>> this patch and relying on CMSs to maintain the MAC_Binding
> >>>>>>>>> entries associated with the FIPs [0].
> >>>>>>>>> I'm against reverting this patch in OVN [1] for multiple reasons,
> >>>>>>>>> the most important one being the fact that if we rely on
> >>>>>>>>> workarounds on the CMS side, we'll be creating a control plane
> >>>>>>>>> dependency for something that is pure dataplane only (i.e. if the
> >>>>>>>>> Neutron server is down - outage, upgrades, etc. - traffic is
> >>>>>>>>> going to be disrupted). On the other hand one could argue that
> >>>>>>>>> the same dependency now exists on ovn-controller being up &
> >>>>>>>>> running, but I believe that this is better than a) relying on
> >>>>>>>>> workarounds on the CMS side or b) relying on CMS availability.
> >>>>>>>>>
> >>>>>>>>> In the short term I think that moving the MAC_Binding entries to
> >>>>>>>>> the LS instead of the LRP, as was suggested upthread, would be a
> >>>>>>>>> good idea; in the long haul, the *local* ARP cache seems to be
> >>>>>>>>> the right solution. Brainstorming with Dumitru, he suggested
> >>>>>>>>> inspecting the flows regularly to see if the packet count on
> >>>>>>>>> flows that check whether src_mac == X has not increased in a
> >>>>>>>>> while, and then removing the ARP responder flows locally.
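> >>>>>>>>> (Assuming the learned entries can be identified by their match,
> >>>>>>>>> a rough version of that check could periodically run something
> >>>>>>>>> like
> >>>>>>>>>
> >>>>>>>>>   ovs-ofctl dump-flows br-int "dl_src=fa:16:3e:01:02:03"
> >>>>>>>>>
> >>>>>>>>> and age out entries whose n_packets counter stopped increasing;
> >>>>>>>>> the MAC value here is made up.)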
> >>>>>>>>>
> >>>>>>>>> [0] https://github.com/openstack/networking-ovn/commit/5181f1106ff839d08152623c25c9a5f6797aa2d7
> >>>>>>>>> [1] https://github.com/ovn-org/ovn/commit/069a32cbf443c937feff44078e8828d7a2702da8
> >>>>>>>>>>
> >>>>>>>>>> Recently, due to the dataplane scaling issue (the 4K resubmit
> >>>>>>>>>> limit being hit), we don't flood these packets on non-router
> >>>>>>>>>> ports and instead create the MAC_Bindings directly from
> >>>>>>>>>> ovn-controller:
> >>>>>>>>>>
> >>>>>>>>>> https://github.com/ovn-org/ovn/commit/a2b88dc5136507e727e4bcdc4bf6fde559f519a9
> >>>>>>>>>> Without the MAC_Binding table we'd need to find a way to update
> >>>>>>>>>> or flush stale bindings when an IP is used for a VIF or FIP.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Dumitru
> >>>>>>>>>>