[ovs-discuss] [OVN] flow explosion in lr_in_arp_resolve table

Girish Moodalbail gmoodalbail at gmail.com
Tue May 26 20:07:03 UTC 2020


On Tue, May 26, 2020 at 12:42 PM Han Zhou <zhouhan at gmail.com> wrote:

> Hi Girish,
>
> Thanks for the summary. I agree with you that GARP request vs. reply is
> irrelevant to the problem here.
> Please see my comment inline below.
>
> On Tue, May 26, 2020 at 12:09 PM Girish Moodalbail <gmoodalbail at gmail.com>
> wrote:
> >
> > Hello Dumitru,
> >
> > There are several things that are being discussed on this thread. Let me
> see if I can tease them out for clarity.
> >
> > 1. All the router IPs are known to OVN (the join switch case)
> > 2. Some IPs are known and some are not known (the external logical
> switch that connects to physical network case).
> >
> > Let us look at each of the cases above:
> >
> > 1. Join Switch Case
> >
> > +----------------+        +----------------+
> > |   l3gateway    |        |   l3gateway    |
> > |    router2     |        |    router3     |
> > +-------------+--+        +-+--------------+
> >             IP2,M2         IP3,M3
> >               |             |
> >            +--+-------------+---+
> >            |    join switch     |
> >            +---------+----------+
> >                      |
> >                   IP1,M1
> >              +-------+--------+
> >              |  distributed   |
> >              |     router     |
> >              +----------------+
> >
> >
> > Say GR router2 wants to send a packet to the DR, and we don't have
> static MAC-to-IP mappings in the lr_in_arp_resolve table on GR router2
> (with Han's patch of dynamic_neigh_routes=true for all the Gateway
> Routers). With this in mind, when an ARP request is sent out by router2's
> hypervisor, the packet should be sent directly to the distributed router
> alone. Your commit 32f5ebb0622 (ovn-northd: Limit ARP/ND broadcast domain
> whenever possible) should have allowed only unicast. However, in the
> ls_in_l2_lkup table we have
> >
> >   table=19(ls_in_l2_lkup      ), priority=80   , match=(eth.src == { M2
> } && (arp.op == 1 || nd_ns)), action=(outport = "_MC_flood"; output;)
> >   table=19(ls_in_l2_lkup      ), priority=75   , match=(flags[1] == 0 &&
> arp.op == 1 && arp.tpa == { IP1}), action=(outport = "jtor-router2";
> output;)
> >
> > As you can see, the `priority=80` rule will always be hit and the packet
> flooded to all the GRs. The `priority=75` rule is never hit. So, we will see
> ARP packets on the GENEVE tunnel. So, we need to change the `priority=80`
> rule to match only GARP request packets. That way, for the known OVN IPs
> case we don't do broadcast.
>
> Since the solution to case 2) below (i.e. learn_from_arp_request=false)
> solves the problem of case 1), too, I think we don't need this change just
> for case 1). As @Dumitru Ceara <dceara at redhat.com> mentioned, there is
> some cost because it adds extra flows. It would be a significant number of
> flows if there are a lot of snat_and_dnat IPs. What do you think?
>

Han, yes it will work. However, my only concern is that we would send all
these ARP requests via the tunnel to each of the 1000 hypervisors, and these
hypervisors will just drop them on the floor when they see
learn_from_arp_request=false.

Han, Dumitru,

Why can't we swap the priorities of the above two flows so that an ARP
request for a next-hop IP known to OVN will always be sent via `unicast`?
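
To make the shadowing concrete, here is a minimal toy model of
priority-ordered flow lookup. It is only a sketch, not how ovn-northd or
ovs-vswitchd are implemented: the `Flow` class, the `lookup` and
`build_table` helpers, and the M2/IP1 values are all made up for
illustration, and the `flags[1] == 0` part of the real match is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Packet = Dict[str, object]

# Hypothetical stand-ins for M2 and IP1 from the diagram above.
M2 = "0a:00:00:00:00:02"
IP1 = "100.64.0.1"


@dataclass
class Flow:
    priority: int
    match: Callable[[Packet], bool]
    action: str


def lookup(flows: List[Flow], pkt: Packet) -> Optional[str]:
    """Return the action of the highest-priority flow that matches pkt."""
    for flow in sorted(flows, key=lambda f: f.priority, reverse=True):
        if flow.match(pkt):
            return flow.action
    return None


def arp_request_from_router(pkt: Packet) -> bool:
    # Models: eth.src == {M2} && arp.op == 1 (nd_ns omitted in this toy).
    return pkt["eth_src"] == M2 and pkt["arp_op"] == 1


def arp_request_for_known_ip(pkt: Packet) -> bool:
    # Models: arp.op == 1 && arp.tpa == {IP1} (flags[1] check omitted).
    return pkt["arp_op"] == 1 and pkt["arp_tpa"] == IP1


def build_table(flood_priority: int, unicast_priority: int) -> List[Flow]:
    return [
        Flow(flood_priority, arp_request_from_router, "flood"),
        Flow(unicast_priority, arp_request_for_known_ip, "unicast"),
    ]


# ARP request from router2 asking for the DR's IP: both flows match it.
arp_request = {"eth_src": M2, "arp_op": 1, "arp_tpa": IP1}

current = lookup(build_table(80, 75), arp_request)  # priorities as today
swapped = lookup(build_table(75, 80), arp_request)  # priorities swapped
print(current, swapped)
```

In this toy, the same ARP request matches both flows, so whichever flow has
the higher priority decides whether it is flooded or sent to a single port.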

Regards,
~Girish


> >
> > 2. External Logical Switch Case
> >
> >                        10.10.10.0/24
> >    -------------------------+--------------------------
> >                             |
> >                          localnet
> >                       +-----+-----+
> >                       | external  |
> >          +------------+    LS1    +-------------+
> >          |            +-----+-----+             |
> >          |                  |                   |
> >      10.10.10.2         10.10.10.3          10.10.10.4
> >         SNAT               SNAT                SNAT
> >    +-----+-----+      +-----+-----+       +-----------+
> >    | l3gateway |      | l3gateway |       | l3gateway |
> >    |   node1   |      |   node2   |       |   node3   |
> >    +-----------+      +-----------+       +-----------+
> >
> > In this case, we have some of the IPs in OVN and some in the physical
> network. If we fix (1) above, all the ARP requests for OVN's router IPs
> will be unicast. However, all the ARP requests to external IPs, say
> 10.10.10.1 on the "physical router", will be broadcast. Now, we will see
> these ARP broadcasts on all the L3 gateway routers. With
> 'learn_from_arp_request=false' [a], the MAC_Binding table will not
> explode for both ARP and GARP requests.
> >
> > So, I don't think the GARP request vs. reply distinction is the issue
> here. Furthermore, learning from GARP replies is blocked on certain routers.
> For example:
> https://www.juniper.net/documentation/en_US/junose15.1/topics/concept/ip-gratuitous-arps-transmission-overview.html
>  says "By default, updating the ARP cache on GARP replies is disabled on
> the router." So, our NAT address mappings will not be learnt.
> >
> > Regards,
> > ~Girish
> >
> >
> > [a] - From Han's mail, the meaning of learn_from_arp_request=false -->
> if the TPA is on the router, add a new entry (it means the remote wants to
> communicate with this node, so it makes sense to learn the remote as
> well). Otherwise, ignore it and no new entry is added.
> >
> >
> >
>
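
The rule in footnote [a] can be sketched as a small decision function. This
is only an illustration of the semantics described in this thread, not OVN
code: `handle_arp_request` and the MAC strings are hypothetical, the
addresses are taken from the external switch diagram, and treating
learn_from_arp_request=true as "always learn" is my assumption.

```python
from typing import Dict, Set


def handle_arp_request(mac_bindings: Dict[str, str], router_ips: Set[str],
                       spa: str, sha: str, tpa: str,
                       learn_from_arp_request: bool = True) -> bool:
    """Maybe learn the sender (spa -> sha) from an ARP request.

    Returns True if a MAC binding entry was added or refreshed.
    """
    # With the option off, learn only when the request targets one of
    # this router's own IPs, i.e. the remote wants to talk to this node.
    if learn_from_arp_request or tpa in router_ips:
        mac_bindings[spa] = sha
        return True
    return False


# Gateway router on node1 owns 10.10.10.2 (from the diagram).
node1_ips = {"10.10.10.2"}
bindings: Dict[str, str] = {}

# node2's router broadcasts an ARP request for the physical router.
# node1 sees it, but the TPA is not node1's IP, so nothing is learned.
learned_broadcast = handle_arp_request(
    bindings, node1_ips, spa="10.10.10.3", sha="mac-node2",
    tpa="10.10.10.1", learn_from_arp_request=False)

# The physical router asks for node1's own IP: the sender is learned.
learned_direct = handle_arp_request(
    bindings, node1_ips, spa="10.10.10.1", sha="mac-physical",
    tpa="10.10.10.2", learn_from_arp_request=False)

print(learned_broadcast, learned_direct, bindings)
```

Note that in this sketch the broadcast still has to reach node1 before it is
ignored, which is the tunnel cost raised earlier in this mail.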