[ovs-dev] [PATCH ovn v3 2/2] northd: Remove "reachable" functions and users of them.

Dumitru Ceara dceara at redhat.com
Fri Mar 26 11:32:38 UTC 2021

On 3/24/21 12:44 AM, Mark Michelson wrote:
> On 3/23/21 6:55 PM, Dumitru Ceara wrote:
>> On 3/23/21 10:02 PM, Mark Michelson wrote:
>>> On 3/23/21 12:13 PM, Dumitru Ceara wrote:
>>>> On 3/23/21 5:05 PM, Numan Siddique wrote:
>>>>> On Fri, Mar 19, 2021 at 2:20 AM Mark Michelson <mmichels at redhat.com>
>>>>> wrote:
>>>>>> Self-originated ARPs are intended to be sent to the "owning"
>>>>>> router for
>>>>>> a given IP address, as well as flooded to non-router ports on a
>>>>>> logical
>>>>>> switch.
>>>>>> When trying to determine the owning router for an IP address, we
>>>>>> would
>>>>>> iterate through load balancer and NAT addresses, and check if
>>>>>> these IP
>>>>>> addresses were "reachable" on this particular router. Reachable here
>>>>>> means that the NAT external IP or load balancer VIP falls within
>>>>>> any of
>>>>>> the networks served by this router. If an IP address were determined
>>>>>> not
>>>>>> to be reachable, then we would not try to send ARPs for that
>>>>>> particular
>>>>>> address to the router.
>>>>>> However, it is possible (and sometimes desired) to configure NAT
>>>>>> floating
>>>>>> IPs on a router that are not in any subnet handled by that router.
>>>>>> In this case, OVN refuses to send ARP requests to the router on
>>>>>> which the
>>>>>> floating IP has been configured. The result is that
>>>>>> internally-generated
>>>>>> traffic that targets the floating IP cannot reach its destination,
>>>>>> since the router on which the floating IP is configured never
>>>>>> receives ARPs
>>>>>> for the floating IP.
>>>>>> This patch fixes the issue by removing the reachability checks
>>>>>> altogether. If a router has a NAT external IP or load balancer VIP
>>>>>> that
>>>>>> is outside the range of any of its configured subnets, we still
>>>>>> should
>>>>>> send ARPs to that router for those requested addresses.
>>>>>> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1929901
>>>>>> Signed-off-by: Mark Michelson <mmichels at redhat.com>
>>>>> Thanks for addressing this.
>>>>> Acked-by: Numan Siddique <numans at ovn.org>
>>>>> @Dumitru - Since you had added the code to limit ARPs.  Can you please
>>>>> also take a look at this patch ?
>>>> Hi Mark, Numan,
>>>> I've been thinking about this for a while.  I think this needs at
>>>> least:
>>>> Fixes: 1e07781310d8 ("ovn-northd: Fix logical flows to limit ARP/NS
>>>> broadcast domain.")
>>> OK, I can either just add that to the commit when I merge, or if I need
>>> to put a new version up for review, I can add it to the new version
>>> then.
>> Sounds good.

I investigated some more and the problem is actually because of the
combination of:

1e07781310d8 ("ovn-northd: Fix logical flows to limit ARP/NS broadcast domain.")


8c6a5bc21847 ("ovn-northd: Limit self originated ARP/ND broadcast domain.")

Without 8c6a5bc21847, the "reachable" functions just optimize ARP
request flooding but when the flows using them are not hit we always
fall back to flooding in the MC_FLOOD group on LS-PUB.
With 8c6a5bc21847, we instead flood all logical router self originated
ARP requests in the MC_FLOOD_L2 group.

>>>> With the latest changes in ovn-kubernetes I think that the above was
>>>> not
>>>> needed anyway.  Mark, do you have more details about this by any
>>>> chance?
>>> I'm not sure which changes in ovn-kubernetes you're referring to here.
>> I was thinking of this:
>> https://github.com/ovn-org/ovn-kubernetes/pull/2055
>> Specifically:
>> https://github.com/ovn-org/ovn-kubernetes/commit/ce2d312c7266c8d55173c9660002c67715a5444b
>> However, after a closer look, I'm not sure if dnat_and_snat usage is
>> completely removed in ovn-kubernetes.  This makes me wonder if reverting
>> 1e07781310d8 ("ovn-northd: Fix logical flows to limit ARP/NS broadcast
>> domain.") will not cause a regression in the ovn-kubernetes use case.
>>>> Initial bug report was here:
>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2020-June/050287.html
>>> To quote from that e-mail:
>>> "Question though is why any Pod on the logical switch would send an ARP
>>> for an IP that is not in its subnet. A packet from a Pod towards a non
>>> subnet IP should ARP only for the default gateway IP."
>>> The above holds true for the scenario I'm working here. The key is that
>>> this is a self-originated ARP from OVN, not an ARP from a pod/VM.
>>> The use case I'm solving here is OpenStack floating IPs. It's not
>>> common, but it is expected that a user is able to assign a floating IP
>>> to a router that does not service that floating IP's subnet. They also
>>> expect all other VMs in the cluster to be able to reach the target VM
>>> via its floating IP.
>>> So let's say you have a setup like:
>>>                        EXT-ROUTER
>>>                            ^
>>>                            |
>>>                            v
>>> VM1 <-> LS1 <-> LR1 <-> LS-PUB <-> LR2 <-> LS2 <-> VM2
>>> In this case, LR1 and LR2 both have distributed gateway router ports on
>>> them (the ports going to LS-PUB). They also have default static routes
>>> that point to the EXT-ROUTER. LR2's gateway router port IP address is
>>> An OpenStack user configures a floating IP for VM2 of
>>> This translates to a dnat_and_snat entry on LR2 for that
>>> address, which is NOT in the subnet configured on that router. Now VM1
>>> wants to ping In this case, when the ping reaches LR1, LR1
>>> needs to ARP to find where is. The ARP needs to target LR2
>>> since that's where the floating IP is configured. If the ARP only goes
>>> to EXT-ROUTER, EXT-ROUTER won't be able to respond to the ARP properly,
>>> and connectivity cannot be established.
>> I might be wrong but I'm still not sure how valid it is to define a
>> floating IP ( on a router that doesn't have network that
>> includes
> Unfortunately, this battle has been fought a couple of times already,
> and in the case of OpenStack, this is something they expect to work and
> currently does work with ml2-ovs.
>> Would it work if neutron-ovn configured a network on the
>> LR2-to-LS-PUB router port?
> Yes, that would make the router appear to be "reachable" with the
> current OVN code.
>> Otherwise, I think we need to clarify if ovn-kubernetes would still have
>> a scale issue or not.
>> A less pretty alternative is to allow such floating IPs (without a
>> network that includes that IP) only if a new config knob is set.
> I think there are a couple of valid options here:
> 1) A configuration option. The option could be configured on the NAT
> itself to allow for that address to be considered "reachable" on the
> router even if it lies outside the router subnet.
> 2) Automatically allow this behavior for "floating IP" NATs
> (dnat_and_snat, external_mac set, and port set).
> (2) Has the advantage of "just working" for the OpenStack FIP case while
> also not causing a flow explosion for ovn-kubernetes. However, (1) is
> more flexible, allowing for the option to be set on other DNATs as well.

Maybe an option (3) that would make all CMSs happy could be:

For all dnat_and_snat logical IPs that are not part of router subnets
and a single logical flow in table in_l2_lkup on LS-PUB flooding ARP
requests that target them in the MC_FLOOD group.  This shouldn't create
a scale issue because it's a single flow for all such IPs. If this
flow has higher priority than the flow restricting router self
originated ARPs to MC_FLOOD_L2 then it should also work fine.

The downside is that we *might* hit the OVS resubmit limit for ARPs
targeting such FIPs on scaled setups but, I guess, that's a different

> I'm just bracing for when someone wants the same behavior for load
> balancers...

Also, I think the system test you're adding needs a small change as it
currently passes both with and without your patch.  We need to remove:

ovn-nbctl lr-route-add lr1 lr1-ls-pub

What do you think?


More information about the dev mailing list