[ovs-discuss] [OVN] logical flow explosion in lr_in_ip_input table for dnat_and_snat IPs
Dumitru Ceara
dceara at redhat.com
Tue Jul 14 07:46:19 UTC 2020
On 7/14/20 2:09 AM, Girish Moodalbail wrote:
> Hello Han/Dumitru,
>
> We tried the patch on our cluster, and we do not see the explosion of
> OpenFlow rules in the Integration Bridge on each of the OVN Chassis nodes.
>
> Also, on the logical switch with localnet port attached, as expected, we
> still have a rule to Flood the Gratuitous ARPs sent for all SNAT and
> DNAT external IP addresses.
>
> This looks great. Thank you both.
>
> Regards,
> ~Girish
>
Hi Girish,
Thanks for the confirmation!
Regards,
Dumitru
> On Wed, Jul 8, 2020 at 5:23 AM Dumitru Ceara <dceara at redhat.com> wrote:
>
> On 6/25/20 9:47 PM, Dumitru Ceara wrote:
> > On 6/25/20 9:34 PM, Girish Moodalbail wrote:
> >> Hello Dumitru, Han,
> >>
> >> So, we applied this patchset and gave it a spin on our large scale
> >> cluster and saw a significant reduction in the number of logical flows
> >> in the lr_in_ip_input table. Before this patch there were around 1.6M
> >> flows in the lr_in_ip_input table; after the patch we see about 26K
> >> flows. So that is a significant reduction in the number of logical flows.
> >>
> >> In lr_in_ip_input, I see
> >>
> >> * priority 92 flows matching ARP requests for dnat_and_snat IPs on
> >> distributed gateway port with is_chassis_resident() and
> >> corresponding ARP reply
> >> * priority 91 flows matching ARP requests for dnat_and_snat IPs on
> >> distributed gateway port with !is_chassis_resident() and
> >> corresponding drop
> >> * priority 90 flow matching ARP request for dnat_and_snat IPs and
> >> corresponding ARP replies
> >>
> >> So far so good.
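
The three priorities listed above can be read as an ordered lookup: the highest-priority flow whose match holds decides the action. The following is only a rough Python model of that ordering, not OVN code; the packet fields `arp_tpa`, `on_gw_port` and `chassis_resident`, the `NAT_IP` value and the action strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Packet = Dict[str, object]

# Hypothetical dnat_and_snat IP used only for this illustration.
NAT_IP = "169.254.0.5"


@dataclass
class Flow:
    priority: int
    match: Callable[[Packet], bool]
    action: str


def select_action(flows: List[Flow], pkt: Packet) -> str:
    """Return the action of the highest-priority flow that matches pkt."""
    for flow in sorted(flows, key=lambda f: f.priority, reverse=True):
        if flow.match(pkt):
            return flow.action
    return "no-match"


flows = [
    # Priority 92: ARP request on the distributed gateway port, NAT entry
    # resident on this chassis -> reply.
    Flow(92, lambda p: p["arp_tpa"] == NAT_IP and p["on_gw_port"]
         and p["chassis_resident"], "arp-reply"),
    # Priority 91: same request, but NAT entry not resident here -> drop.
    Flow(91, lambda p: p["arp_tpa"] == NAT_IP and p["on_gw_port"]
         and not p["chassis_resident"], "drop"),
    # Priority 90: any other ARP request for the NAT IP -> reply.
    Flow(90, lambda p: p["arp_tpa"] == NAT_IP, "arp-reply"),
]

if __name__ == "__main__":
    pkt = {"arp_tpa": NAT_IP, "on_gw_port": True, "chassis_resident": False}
    print(select_action(flows, pkt))
```

In this model a request arriving on the gateway port of a chassis where the NAT entry is not resident never reaches the priority 90 flow, because the priority 91 drop matches first.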
> >
> > Hi Girish,
> >
> > Great, thanks for testing out the series and confirming that it's
> > working ok.
> >
> >>
> >> However, not directly related to this patch per se but directly related
> >> to the behaviour of ARP and dnat_and_snat IPs, on the OVN chassis we are
> >> seeing a significant number of OpenFlow flows in table 27 (around 2.3M
> >> OpenFlow flows). This table gets populated from logical flows in
> >> table=19 (ls_in_l2_lkup) of the logical switch.
> >>
> >> The two logical flows in ls_in_l2_lkup that are contributing to the huge
> >> number of OpenFlow flows are described below (for the entire logical flow
> >> entries, please see:
> >> https://gist.github.com/girishmg/57b3005030d421c59b30e6c36cfc9c18)
> >>
> >> Priority=75 flow
> >> =============
> >> This flow looks like below (where 169.254.0.0/29 is the dnat_and_snat
> >> subnet and 192.168.0.1 is the logical_switch's gateway IP)
> >>
> >> table=19(ls_in_l2_lkup ), priority=75 , match=(flags[1] == 0 &&
> >> arp.op == 1 && arp.tpa == { 169.254.3.107, 169.254.1.85, 192.168.0.1,
> >> 169.254.10.155, 169.254.1.6}), action=(outport = "stor-sdn-test1";
> >> output;)
> >>
> >> What this flow says is that any ARP request packet from the switch
> >> heading towards the default gateway or any of those 1-to-1 NAT IPs is
> >> sent out through the port towards the ovn_cluster_router’s ingress
> >> pipeline. The question, though, is why any Pod on the logical switch
> >> would send an ARP for an IP that is not in its subnet. A packet from a
> >> Pod towards a non-subnet IP should ARP only for the default gateway IP.
> >>
> >
> > This is a bug. I'll start working on a fix and send a patch for it soon.
> >
> >> Priority=80 Flow
> >> =============
> >> This flow looks like below
> >>
> >> table=19(ls_in_l2_lkup ), priority=80 , match=(eth.src == {
> >> 0a:58:c0:a8:00:01, 6a:93:f4:55:aa:a7, ae:92:2d:33:24:ea,
> >> ba:0a:d3:7d:bc:e8, b2:2f:40:4d:d9:2b} && (arp.op == 1 || nd_ns)),
> >> action=(outport = "_MC_flood"; output;)
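
To get a feel for how large these `{...}` sets are, the match expressions quoted above can be inspected with a few lines of Python. This is only a sketch: `set_sizes` is a hypothetical helper, and it assumes set members are comma-separated inside a single pair of braces, as in the two excerpts quoted above. It says nothing about how ovn-controller actually translates a set into OpenFlow flows.

```python
import re
from typing import List


def set_sizes(match_expr: str) -> List[int]:
    """Return the member count of each {...} set in a match expression."""
    sizes = []
    for body in re.findall(r"\{([^{}]*)\}", match_expr):
        members = [m.strip() for m in body.split(",") if m.strip()]
        sizes.append(len(members))
    return sizes


# Match expressions copied from the two abbreviated flows quoted above.
PRIO75_MATCH = (
    "flags[1] == 0 && arp.op == 1 && arp.tpa == { 169.254.3.107, "
    "169.254.1.85, 192.168.0.1, 169.254.10.155, 169.254.1.6}"
)
PRIO80_MATCH = (
    "eth.src == { 0a:58:c0:a8:00:01, 6a:93:f4:55:aa:a7, ae:92:2d:33:24:ea, "
    "ba:0a:d3:7d:bc:e8, b2:2f:40:4d:d9:2b} && (arp.op == 1 || nd_ns)"
)

if __name__ == "__main__":
    print("priority 75 set sizes:", set_sizes(PRIO75_MATCH))
    print("priority 80 set sizes:", set_sizes(PRIO80_MATCH))
```

Run against the complete flow entries from the gist, the same helper would report how many addresses each logical switch carries in these two matches.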
> >>
> >> The question again for this flow is why there would be self-originated
> >> ARP requests for the dnat_and_snat IPs from inside the node's logical
> >> switch. I can see how this is a possibility on the switch that has a
> >> `localnet port` on it and to which the distributed router connects
> >> through a gateway port.
> >>
> >
> > This is also a bug, similar to the one above: we should only deal with
> > external_mac's that might be used on this port. I'll fix it soon too.
> >
> > Thanks,
> > Dumitru
> >
>
> Hi Girish,
>
> I just sent a patch that should fix these two new issues you reported
> above. Do you mind giving it a try when you get the chance?
>
> https://patchwork.ozlabs.org/project/openvswitch/patch/1594210824-11382-1-git-send-email-dceara@redhat.com/
>
> Thanks,
> Dumitru
>
>