[ovs-discuss] [OVN] flow explosion in lr_in_arp_resolve table

Han Zhou zhouhan at gmail.com
Tue May 26 21:51:52 UTC 2020


On Tue, May 26, 2020 at 1:07 PM Girish Moodalbail <gmoodalbail at gmail.com>
wrote:
>
>
>
> On Tue, May 26, 2020 at 12:42 PM Han Zhou <zhouhan at gmail.com> wrote:
>>
>> Hi Girish,
>>
>> Thanks for the summary. I agree with you that GARP request vs. reply is
irrelevant to the problem here.
>> Please see my comment inline below.
>>
>> On Tue, May 26, 2020 at 12:09 PM Girish Moodalbail <gmoodalbail at gmail.com>
wrote:
>> >
>> > Hello Dumitru,
>> >
>> > There are several things that are being discussed on this thread. Let
me see if I can tease them out for clarity.
>> >
>> > 1. All the router IPs are known to OVN (the join switch case)
>> > 2. Some IPs are known and some are not known (the external logical
switch that connects to physical network case).
>> >
>> > Let us look at each of the case above:
>> >
>> > 1. Join Switch Case
>> >
>> > +----------------+        +----------------+
>> > |   l3gateway    |        |   l3gateway    |
>> > |    router2     |        |    router3     |
>> > +-------------+--+        +-+--------------+
>> >             IP2,M2         IP3,M3
>> >               |             |
>> >            +--+-------------+---+
>> >            |    join switch     |
>> >            +---------+----------+
>> >                      |
>> >                   IP1,M1
>> >              +-------+--------+
>> >              |  distributed   |
>> >              |     router     |
>> >              +----------------+
>> >
>> >
>> > Say, GR router2 wants to send the packet out to DR and that we don't
have static mappings of MAC to IP in lr_in_arp_resolve table on GR router2
(with Han's patch of dynamic_neigh_routes=true for all the Gateway
Routers). With this in mind, when an ARP request is sent out by router2's
hypervisor the packet should be directly sent to the distributed router
alone. Your commit 32f5ebb0622 (ovn-northd: Limit ARP/ND broadcast domain
whenever possible) should have allowed only unicast. However, in
ls_in_l2_lkup table we have
>> >
>> >   table=19(ls_in_l2_lkup      ), priority=80   , match=(eth.src == { M2 } && (arp.op == 1 || nd_ns)), action=(outport = "_MC_flood"; output;)
>> >   table=19(ls_in_l2_lkup      ), priority=75   , match=(flags[1] == 0 && arp.op == 1 && arp.tpa == { IP1}), action=(outport = "jtor-router2"; output;)
>> >
>> > As you can see, the `priority=80` rule will always be hit, so the ARP
request is flooded to all the GRs. The `priority=75` rule is never hit. So,
we will see ARP packets on the GENEVE tunnel. We therefore need to change
the `priority=80` rule to match only GARP request packets. That way, for
the known OVN IPs case we don't do broadcast.
>>
>> Since the solution to case 2) below (i.e. learn_from_arp_request=false)
solves the problem of case 1), too, I think we don't need this change just
for case 1). As @Dumitru Ceara mentioned, there is some cost because it
adds extra flows. It would be a significant number of flows if there are a
lot of snat_and_dnat IPs. What do you think?
>
>
> Han, yes it will work. However, my only concern is that we would send all
these ARP requests via tunnel to each of 1000 hypervisors and these
hypervisors will just drop them on the floor when they see
learn_from_arp_request=false.

I think maybe it is not a problem since it happens only once on the Join
switch. Once the MAC is learned, it won't broadcast again. It may be more
of a problem on the external LS if periodic GARP is required there.
However, I'd suggest running some tests to see whether it is really a
problem before trying to solve it.

>
> Han, Dumitru,
>
> Why can't we swap the priorities of the above two flows so that the ARP
request for a next-hop IP known to OVN will always be sent via `unicast`?

If swapped, even GARP won't be broadcast. Maybe that's not the desired
behavior.
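
To illustrate, here is a minimal Python sketch of highest-priority-wins
lookup over simplified versions of the two flows quoted above. It is not
ovn-northd code: the port names, the per-IP unicast flows for IP2 and IP3,
and the omission of the flags[1] and nd_ns conditions are assumptions made
only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ArpRequest:
    eth_src: str  # source MAC
    spa: str      # sender protocol address
    tpa: str      # target protocol address


@dataclass(frozen=True)
class Flow:
    priority: int
    match: Callable[[ArpRequest], bool]
    action: str


# Hypothetical join-switch ports, keyed by the router IP behind each port.
ROUTER_PORTS = {"IP1": "port-dr", "IP2": "port-gr2", "IP3": "port-gr3"}
ROUTER_MACS = {"M1", "M2", "M3"}


def build_flows(flood_prio: int, unicast_prio: int) -> List[Flow]:
    """Build a simplified ls_in_l2_lkup table with the given priorities."""
    flows = [Flow(flood_prio, lambda p: p.eth_src in ROUTER_MACS, "flood")]
    for ip, port in ROUTER_PORTS.items():
        # Bind ip per flow so each lambda matches its own address.
        flows.append(
            Flow(unicast_prio, lambda p, ip=ip: p.tpa == ip, "unicast:" + port)
        )
    return flows


def lookup(flows: List[Flow], pkt: ArpRequest) -> Optional[str]:
    """Return the action of the highest-priority matching flow."""
    for flow in sorted(flows, key=lambda f: f.priority, reverse=True):
        if flow.match(pkt):
            return flow.action
    return None


current = build_flows(flood_prio=80, unicast_prio=75)
swapped = build_flows(flood_prio=75, unicast_prio=80)

arp_for_dr = ArpRequest(eth_src="M2", spa="IP2", tpa="IP1")  # router2 -> DR
garp_gr2 = ArpRequest(eth_src="M2", spa="IP2", tpa="IP2")    # GARP: spa == tpa

for name, table in (("current", current), ("swapped", swapped)):
    print(name, lookup(table, arp_for_dr), lookup(table, garp_gr2))
```

Under these assumptions, the current priorities flood both packets, while
the swapped priorities unicast the ARP request to the DR but also steer the
GARP only to the port that owns IP2, so no other router ever sees it.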

>
> Regards,
> ~Girish
>
>>
>> >
>> > 2. External Logical Switch Case
>> >
>> >                        10.10.10.0/24
>> >    -------------------------+--------------------------
>> >                             |
>> >                          localnet
>> >                       +-----+-----+
>> >                       | external  |
>> >          +------------+    LS1    +-------------+
>> >          |            +-----+-----+             |
>> >          |                  |                   |
>> >      10.10.10.2         10.10.10.3          10.10.10.4
>> >         SNAT               SNAT                SNAT
>> >    +-----+-----+      +-----+-----+       +-----------+
>> >    | l3gateway |      | l3gateway |       | l3gateway |
>> >    |   node1   |      |   node2   |       |   node3   |
>> >    +-----------+      +-----------+       +-----------+
>> >
>> > In this case, we have some of the IPs in OVN and some in the physical
network. If we fix (1) above, all the ARP requests for the OVN's router IPs
will be unicast. However, all the ARP requests to external IPs, say
10.10.10.1 on the "physical router", will be broadcast. Now, we will see
these ARP broadcasts on all the L3 gateway routers. With
'learn_from_arp_request=false' [a], the MAC_Binding table will not
explode for either ARP or GARP requests.
>> >
>> > So, I don't think GARP requests vs. replies are the issue here.
Furthermore, learning from GARP replies is blocked on certain routers.
For example:
https://www.juniper.net/documentation/en_US/junose15.1/topics/concept/ip-gratuitous-arps-transmission-overview.html
says "By default, updating the ARP cache on GARP replies is disabled on
the router.". So, our NAT address mappings will not be learnt.
>> >
>> > Regards,
>> > ~Girish
>> >
>> >
>> > [a] - From Han's mail, the meaning of learn_from_arp_request=false:
if the TPA is on the router, add a new entry (it means the remote wants to
communicate with this node, so it makes sense to learn the remote as
well). Otherwise, ignore it and add no new entry.
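
The rule in [a] can be sketched in a few lines of Python. This is only a
model of the decision, not the ovn-controller implementation: the function
names, the addresses, and the behavior when the option is true (learn from
every ARP request) are assumptions for illustration.

```python
from typing import Dict, Set


def process_arp_request(
    mac_binding: Dict[str, str],
    router_ips: Set[str],
    spa: str,
    sha: str,
    tpa: str,
    learn_from_arp_request: bool,
) -> None:
    """Record spa -> sha if this router should learn from the request."""
    # With the option false, learn only when the request targets an IP
    # that this router owns; otherwise ignore the request.
    if learn_from_arp_request or tpa in router_ips:
        mac_binding[spa] = sha


def simulate(learn_from_arp_request: bool) -> int:
    """Count MAC bindings on one gateway router after a burst of ARPs."""
    router_ips = {"10.10.10.2"}
    mac_binding: Dict[str, str] = {}
    # 100 other gateway routers broadcast ARP requests for the physical
    # router 10.10.10.1; this router sees every one of them.
    for i in range(100):
        process_arp_request(
            mac_binding,
            router_ips,
            spa=f"10.10.10.{i + 3}",
            sha=f"00:00:00:00:00:{i + 3:02x}",
            tpa="10.10.10.1",
            learn_from_arp_request=learn_from_arp_request,
        )
    # One peer actually wants to talk to this router.
    process_arp_request(
        mac_binding,
        router_ips,
        spa="10.10.10.200",
        sha="00:00:00:00:00:c8",
        tpa="10.10.10.2",
        learn_from_arp_request=learn_from_arp_request,
    )
    return len(mac_binding)


print(simulate(True), simulate(False))
```

In this model the router ends up with 101 bindings when it learns from
every request and with a single binding when the option is false, which is
the reduction in MAC_Binding entries the option is meant to give.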
>> >
>> >
>> >
>