[ovs-discuss] [OVN] logical flow explosion in lr_in_ip_input table for dnat_and_snat IPs

Dumitru Ceara dceara at redhat.com
Tue Jul 14 07:46:19 UTC 2020


On 7/14/20 2:09 AM, Girish Moodalbail wrote:
> Hello Han/Dumitru,
> 
> We tried the patch on our cluster, and we do not see the explosion of
> OpenFlow rules in the Integration Bridge on each of the OVN Chassis nodes.
> 
> Also, on the logical switch with localnet port attached, as expected, we
> still have a rule to Flood the Gratuitous ARPs sent for all SNAT and
> DNAT external IP addresses.
> 
> This looks great. Thank you both.
> 
> Regards,
> ~Girish
> 

Hi Girish,

Thanks for the confirmation!

Regards,
Dumitru

> On Wed, Jul 8, 2020 at 5:23 AM Dumitru Ceara <dceara at redhat.com> wrote:
> 
>     On 6/25/20 9:47 PM, Dumitru Ceara wrote:
>     > On 6/25/20 9:34 PM, Girish Moodalbail wrote:
>     >> Hello Dumitru, Han,
>     >>
>     >> So, we applied this patchset and gave it a spin on our large scale
>     >> cluster and saw a significant reduction in the number of logical flows
>     >> in lr_in_ip_input table. Before this patch there were around 1.6M flows
>     >> in lr_in_ip_input table. However, after the patch we see about 26K
>     >> flows. So that is significant reduction in number of logical flows.
>     >>
>     >> In lr_in_ip_input, I see
>     >>
>     >>   * priority 92 flows matching ARP requests for dnat_and_snat IPs on
>     >>     distributed gateway port with is_chassis_resident() and
>     >>     corresponding ARP reply
>     >>   * priority 91 flows matching ARP requests for dnat_and_snat IPs on
>     >>     distributed gateway port with !is_chassis_resident() and
>     >>     corresponding drop
>     >>   * priority 90 flow matching ARP request for dnat_and_snat IPs and
>     >>     corresponding ARP replies
>     >>
>     >> So far so good.
>     >
>     > Hi Girish,
>     >
>     > Great, thanks for testing out the series and confirming that it's
>     > working ok.
>     >
>     >>
>     >> However, not directly related to this patch per se but directly related
>     >> to the behaviour of ARP and dnat_and_snat IP, on the OVN chassis we are
>     >> seeing a significant number of OpenFlow flows in table 27 (around 2.3M
>     >> OpenFlow flows). This table gets populated from logical flows in
>     >> table=19 (ls_in_l2_lkup) of logical switch.
>     >>
>     >> The two logical flows in ls_in_l2_lkup that are contributing to huge
>     >> number of OpenFlow flows are: (for the entire logical flow entry,
>     >> please see: https://gist.github.com/girishmg/57b3005030d421c59b30e6c36cfc9c18)
>     >>
>     >> Priority=75 flow 
>     >> =============
>     >> This flow looks like below (where 169.254.0.0/29
>     >> is dnat_and_snat subnet and 192.168.0.1 is the logical_switch's gateway IP)
>     >>
>     >> table=19(ls_in_l2_lkup      ), priority=75   , match=(flags[1] == 0 &&
>     >> arp.op == 1 && arp.tpa == { 169.254.3.107, 169.254.1.85, 192.168.0.1,
>     >> 169.254.10.155, 169.254.1.6}), action=(outport = "stor-sdn-test1"; output;)
>     >>
>     >> What this flow says is that any ARP request packet from the switch
>     >> heading towards the default gateway or any of those 1-to-1 NAT IPs is
>     >> sent out through the port towards the ovn_cluster_router’s ingress
>     >> pipeline. The question, though, is why any Pod on the logical switch
>     >> would send an ARP for an IP that is not in its subnet. A packet from a
>     >> Pod towards a non-subnet IP should ARP only for the default gateway IP.
>     >>
>     >>
>     >
>     > This is a bug. I'll start working on a fix and send a patch for it soon.
>     >
>     >> Priority=80 Flow
>     >> =============
>     >> This flow looks like below
>     >>
>     >> table=19(ls_in_l2_lkup      ), priority=80   , match=(eth.src == {
>     >> 0a:58:c0:a8:00:01, 6a:93:f4:55:aa:a7, ae:92:2d:33:24:ea,
>     >> ba:0a:d3:7d:bc:e8, b2:2f:40:4d:d9:2b} && (arp.op == 1 || nd_ns)),
>     >> action=(outport = "_MC_flood"; output;)
>     >>
>     >> The question again for this flow is why there would be self-originated
>     >> ARP requests for the dnat_and_snat IPs from inside of the node's logical
>     >> switch. I can see how this is a possibility on the switch that has
>     >> `localnet port` on it and to which the distributed router connects
>     >> through a gateway port.
>     >>
>     >
>     > This is also a bug, similar to the one above: we should only deal with
>     > external_macs that might be used on this port. I'll fix it soon too.
>     >
>     > Thanks,
>     > Dumitru
>     >
> 
>     Hi Girish,
> 
>     I just sent a patch that should fix these two new issues you reported
>     above. Do you mind giving it a try when you get the chance?
> 
>     https://patchwork.ozlabs.org/project/openvswitch/patch/1594210824-11382-1-git-send-email-dceara@redhat.com/
> 
>     Thanks,
>     Dumitru
> 
> 
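
For anyone wanting to check their own cluster for the pattern discussed
above, here is a rough Python sketch that parses logical flow lines shaped
like the two ls_in_l2_lkup flows quoted in this thread and counts how many
members each `{...}` set in the match carries. The regular expression, the
helper names (parse_lflow, set_sizes, summarize) and the shortened sample
lines are assumptions made for illustration only; they are not part of OVN.
The member count is just a proxy for how wide a flow is, not the exact
number of OpenFlow flows that ovn-controller would install.

```python
import re
from collections import Counter

# Assumed line shape, modeled only on the two flows quoted in this thread.
LFLOW_RE = re.compile(
    r"table=(?P<table>\d+)\((?P<stage>\w+)\s*\),\s*"
    r"priority=(?P<priority>\d+)\s*,\s*"
    r"match=\((?P<match>.*)\),\s*"
    r"action=\((?P<action>.*)\)\s*$"
)

# The two flows from the thread, joined back onto single lines.
SAMPLE_FLOWS = [
    'table=19(ls_in_l2_lkup      ), priority=75   , match=(flags[1] == 0 && '
    'arp.op == 1 && arp.tpa == { 169.254.3.107, 169.254.1.85, 192.168.0.1, '
    '169.254.10.155, 169.254.1.6}), action=(outport = "stor-sdn-test1"; output;)',
    'table=19(ls_in_l2_lkup      ), priority=80   , match=(eth.src == { '
    '0a:58:c0:a8:00:01, 6a:93:f4:55:aa:a7, ae:92:2d:33:24:ea, '
    'ba:0a:d3:7d:bc:e8, b2:2f:40:4d:d9:2b} && (arp.op == 1 || nd_ns)), '
    'action=(outport = "_MC_flood"; output;)',
]


def parse_lflow(line):
    """Split one logical flow line into its fields, or return None."""
    m = LFLOW_RE.match(line.strip())
    if m is None:
        return None
    flow = m.groupdict()
    flow["table"] = int(flow["table"])
    flow["priority"] = int(flow["priority"])
    return flow


def set_sizes(match):
    """Return the number of members in each {...} set of a match string."""
    return [
        len([item for item in body.split(",") if item.strip()])
        for body in re.findall(r"\{([^{}]*)\}", match)
    ]


def summarize(lines):
    """Sum set members per (stage, priority) over all parseable lines."""
    totals = Counter()
    for line in lines:
        flow = parse_lflow(line)
        if flow is None:
            continue  # skip headers or lines in a different shape
        totals[(flow["stage"], flow["priority"])] += sum(set_sizes(flow["match"]))
    return totals


if __name__ == "__main__":
    for (stage, priority), members in sorted(summarize(SAMPLE_FLOWS).items()):
        print(stage, priority, members)
```

Feeding it a saved dump of the logical flows for one logical switch (one
flow per line) should make it easy to spot stages where the set sizes grow
with the number of dnat_and_snat entries.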


