[ovs-discuss] [OVN] logical flow explosion in lr_in_ip_input table for dnat_and_snat IPs

Girish Moodalbail gmoodalbail at gmail.com
Thu Jun 4 05:23:26 UTC 2020


On Wed, Jun 3, 2020 at 9:39 PM Han Zhou <zhouhan at gmail.com> wrote:

>
>
> On Wed, Jun 3, 2020 at 7:16 PM Girish Moodalbail <gmoodalbail at gmail.com>
> wrote:
>
>> Hello all,
>>
>> While working on an extension (see the diagram below) to the existing
>> OVN logical topology for the ovn-kubernetes project, I am seeing an
>> explosion of the "Reply to ARP requests" logical flows in the
>> `lr_in_ip_input` table for the distributed router (ovn_cluster_router)
>> configured with a distributed gateway port (rtol-LS).
>>
>>                         internet
>>                ---------+-------------->
>>                         |
>>                         |
>>       +----------localnet-port---------+
>>       |LS                              |
>>       +-----------------ltor-LS--------+
>>                            |
>>                            |
>>  +---------------------rtol-LS------------+
>>  |           ovn_cluster_router           |
>>  |          (Distributed Router)          |
>>  +-rtos-ls0------rtos-ls1--------rtos-ls2-+
>>       |              |              |
>>       |              |              |
>> +-----+-+       +----+--+     +-----+-+
>> |  LS0  |       |  LS1  |     |  LS2  |
>> +-+-----+       +-+-----+     +-+-----+
>>   |               |             |
>>   p0              p1            p2
>>  IA0             IA1           IA2
>>  EA0             EA1           EA2
>> (Node0)          (Node1)       (Node2)
>>
>> In the topology above, each of the three logical switch ports has an
>> internal address of IAx and an external address of EAx (dnat_and_snat IP).
>> They are all bound to their respective nodes (Nodex). A packet from `p0`
>> heading towards the internet will be SNAT'ed to EA0 on the local hypervisor
>> and then sent out through the LS's localnet-port on that hypervisor.
>> Basically, they are configured for distributed NATing.
>>
>> I am seeing interesting "Reply to ARP requests" flows with arp.tpa set to
>> "EAx". The flows look like this:
>>
>> For EA0
>> priority=90, match=(inport == "rtos-ls0" && arp.tpa == EA0 && arp.op ==
>> 1), action=(/* ARP reply */)
>> priority=90, match=(inport == "rtos-ls1" && arp.tpa == EA0 && arp.op ==
>> 1), action=(/* ARP reply */)
>> priority=90, match=(inport == "rtos-ls2" && arp.tpa == EA0 && arp.op ==
>> 1), action=(/* ARP reply */)
>>
>> For EA1
>> priority=90, match=(inport == "rtos-ls0" && arp.tpa == EA1 && arp.op ==
>> 1), action=(/* ARP reply */)
>> priority=90, match=(inport == "rtos-ls1" && arp.tpa == EA1 && arp.op ==
>> 1), action=(/* ARP reply */)
>> priority=90, match=(inport == "rtos-ls2" && arp.tpa == EA1 && arp.op ==
>> 1), action=(/* ARP reply */)
>>
>> Similarly, for EA2.
>>
>> So, we have N * N "Reply to ARP requests" flows for N nodes, each with 1
>> dnat_and_snat IP.
>> This is causing scale issues.
>>
>> Looking at the flows for `EA0`, I am confused as to why they are needed:
>>
>>    1. When would one see an ARP request for EA0 from any of the
>>    LS{0,1,2} logical switch ports?
>>    2. If they are needed at all, can't we just remove the `inport` match
>>    altogether, since the flow is configured for every logical router
>>    port except the distributed gateway port rtol-LS? For that port, we
>>    could add a higher-priority rule with the action set to `next`.
>>    3. Say we don't need east-west NAT connectivity. Is there a way to
>>    make these ARPs be learnt dynamically, as we are doing for the join and
>>    external logical switches (the other thread [1])?
>>
>> Regards,
>> ~Girish
>>
>> [1]
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2020-May/049994.html
>>
>
> In general, these flows should be per router instead of per router port,
> since the NAT addresses are not attached to any router port. For
> distributed gateway ports, there will need to be per-port flows that match
> is_chassis_resident(gateway-chassis). I think this can be handled by:
> - priority X + 20 flows for each distributed gateway port with
> is_chassis_resident(), reply ARP
> - priority X + 10 flows for each distributed gateway port without
> is_chassis_resident(), drop
> - priority X flows for each router (no need to match inport), reply ARP
>

> This way, there are N * (2D + 1) flows per router. N = number of NAT IPs,
> D = number of distributed gateway ports. This would optimize the above
> scenario where there is only 1 distributed gateway port but many regular
> router ports. Thoughts?
>

Han, I think this will work.
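
For a rough sense of the saving, the sketch below compares the per-port flow count described in my original mail with the per-router count from the proposal. It is only an illustration: the function names are made up here, the match strings are not what ovn-northd would actually emit, and the argument to is_chassis_resident() is left elided because the thread does not pin it down.

```python
def flows_per_port(num_nat_ips, num_router_ports):
    """Behaviour reported above: one ARP-reply flow per (NAT IP, router port)."""
    return num_nat_ips * num_router_ports


def flows_per_router(num_nat_ips, num_dgw_ports):
    """Proposed scheme: N * (2D + 1) flows per router."""
    return num_nat_ips * (2 * num_dgw_ports + 1)


def proposed_flows(nat_ips, dgw_ports, x=90):
    """Enumerate the proposed flows as (priority, match, action) tuples.

    x stands for the base priority "X" in the proposal; 90 is just the
    priority seen in the existing flows, used here as an assumption.
    """
    flows = []
    for ip in nat_ips:
        arp = f"arp.tpa == {ip} && arp.op == 1"
        for port in dgw_ports:
            # X + 20: gateway port, resident chassis -> reply to the ARP.
            flows.append((x + 20,
                          f'inport == "{port}" && {arp} && is_chassis_resident(...)',
                          "reply"))
            # X + 10: gateway port, any other chassis -> drop.
            flows.append((x + 10, f'inport == "{port}" && {arp}', "drop"))
        # X: router-wide flow with no inport match -> reply to the ARP.
        flows.append((x, arp, "reply"))
    return flows


if __name__ == "__main__":
    for n in (3, 100, 1000):
        # N nodes, one dnat_and_snat IP and one rtos-lsX port per node,
        # and a single distributed gateway port (rtol-LS).
        print(n, flows_per_port(n, n), flows_per_router(n, 1))
```

With one distributed gateway port the proposed count is 3N, so the three-node example above stays at 9 flows either way, while 100 nodes would go from 10,000 flows to 300.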

Again, thanks for the quick reply.

Regards,
~Girish