[ovs-dev] [PATCH ovn v3 2/2] northd: Remove "reachable" functions and users of them.

Dumitru Ceara dceara at redhat.com
Fri Mar 26 11:35:09 UTC 2021


On 3/26/21 12:32 PM, Dumitru Ceara wrote:
> On 3/24/21 12:44 AM, Mark Michelson wrote:
>> On 3/23/21 6:55 PM, Dumitru Ceara wrote:
>>> On 3/23/21 10:02 PM, Mark Michelson wrote:
>>>> On 3/23/21 12:13 PM, Dumitru Ceara wrote:
>>>>> On 3/23/21 5:05 PM, Numan Siddique wrote:
>>>>>> On Fri, Mar 19, 2021 at 2:20 AM Mark Michelson <mmichels at redhat.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Self-originated ARPs are intended to be sent to the "owning"
>>>>>>> router for
>>>>>>> a given IP address, as well as flooded to non-router ports on a
>>>>>>> logical
>>>>>>> switch.
>>>>>>>
>>>>>>> When trying to determine the owning router for an IP address, we
>>>>>>> would
>>>>>>> iterate through load balancer and NAT addresses, and check if
>>>>>>> these IP
>>>>>>> addresses were "reachable" on this particular router. Reachable here
>>>>>>> means that the NAT external IP or load balancer VIP falls within
>>>>>>> any of
>>>>>>> the networks served by this router. If an IP address were determined
>>>>>>> not
>>>>>>> to be reachable, then we would not try to send ARPs for that
>>>>>>> particular
>>>>>>> address to the router.
>>>>>>>
>>>>>>> However, it is possible (and sometimes desired) to configure NAT
>>>>>>> floating
>>>>>>> IPs on a router that are not in any subnet handled by that router.
>>>>>>> In this case, OVN refuses to send ARP requests to the router on
>>>>>>> which the
>>>>>>> floating IP has been configured. The result is that
>>>>>>> internally-generated
>>>>>>> traffic that targets the floating IP cannot reach its destination,
>>>>>>> since the router on which the floating IP is configured never
>>>>>>> receives ARPs
>>>>>>> for the floating IP.
>>>>>>>
>>>>>>> This patch fixes the issue by removing the reachability checks
>>>>>>> altogether. If a router has a NAT external IP or load balancer VIP
>>>>>>> that
>>>>>>> is outside the range of any of its configured subnets, we still
>>>>>>> should
>>>>>>> send ARPs to that router for those requested addresses.
>>>>>>>
>>>>>>> Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1929901
>>>>>>>
>>>>>>> Signed-off-by: Mark Michelson <mmichels at redhat.com>
>>>>>>
>>>>>> Thanks for addressing this.
>>>>>>
>>>>>> Acked-by: Numan Siddique <numans at ovn.org>
>>>>>>
>>>>>> @Dumitru - Since you had added the code to limit ARPs.  Can you please
>>>>>> also take a look at this patch ?
>>>>>>
>>>>>
>>>>> Hi Mark, Numan,
>>>>>
>>>>> I've been thinking about this for a while.  I think this needs at
>>>>> least:
>>>>>
>>>>> Fixes: 1e07781310d8 ("ovn-northd: Fix logical flows to limit ARP/NS
>>>>> broadcast domain.")
>>>>
>>>> OK, I can either just add that to the commit when I merge, or if I need
>>>> to put a new version up for review, I can add it to the new version
>>>> then.
>>>
>>> Sounds good.
> 
> I investigated some more and the problem is actually because of the
> combination of:
> 
> 1e07781310d8 ("ovn-northd: Fix logical flows to limit ARP/NS broadcast domain.")
> 
> and
> 
> 8c6a5bc21847 ("ovn-northd: Limit self originated ARP/ND broadcast domain.")
> 
> Without 8c6a5bc21847, the "reachable" functions just optimize ARP
> request flooding but when the flows using them are not hit we always
> fall back to flooding in the MC_FLOOD group on LS-PUB.
> With 8c6a5bc21847, we instead flood all logical router self originated
> ARP requests in the MC_FLOOD_L2 group.
> 
>>>
>>>>
>>>>>
>>>>> With the latest changes in ovn-kubernetes I think that the above was
>>>>> not
>>>>> needed anyway.  Mark, do you have more details about this by any
>>>>> chance?
>>>>
>>>> I'm not sure which changes in ovn-kubernetes you're referring to here.
>>>
>>> I was thinking of this:
>>>
>>> https://github.com/ovn-org/ovn-kubernetes/pull/2055
>>>
>>> Specifically:
>>>
>>> https://github.com/ovn-org/ovn-kubernetes/commit/ce2d312c7266c8d55173c9660002c67715a5444b
>>>
>>>
>>> However, after a closer look, I'm not sure if dnat_and_snat usage is
>>> completely removed in ovn-kubernetes.  This makes me wonder if reverting
>>> 1e07781310d8 ("ovn-northd: Fix logical flows to limit ARP/NS broadcast
>>> domain.") will not cause a regression in the ovn-kubernetes use case.
>>>
>>>>
>>>>>
>>>>> Initial bug report was here:
>>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2020-June/050287.html
>>>>>
>>>>
>>>> To quote from that e-mail:
>>>>
>>>> "Question though is why any Pod on the logical switch would send an ARP
>>>> for an IP that is not in its subnet. A packet from a Pod towards a non
>>>> subnet IP should ARP only for the default gateway IP."
>>>>
>>>> The above holds true for the scenario I'm working here. The key is that
>>>> this is a self-originated ARP from OVN, not an ARP from a pod/VM.
>>>>
>>>> The use case I'm solving here is OpenStack floating IPs. It's not
>>>> common, but it is expected that a user is able to assign a floating IP
>>>> to a router that does not service that floating IP's subnet. They also
>>>> expect all other VMs in the cluster to be able to reach the target VM
>>>> via its floating IP.
>>>>
>>>> So let's say you have a setup like:
>>>>
>>>>
>>>>                        EXT-ROUTER
>>>>                            ^
>>>>                            |
>>>>                            v
>>>> VM1 <-> LS1 <-> LR1 <-> LS-PUB <-> LR2 <-> LS2 <-> VM2
>>>>
>>>> In this case, LR1 and LR2 both have distributed gateway router ports on
>>>> them (the ports going to LS-PUB). They also have default static routes
>>>> that point to the EXT-ROUTER. LR2's gateway router port IP address is
>>>> 172.18.1.1/24. An OpenStack user configures a floating IP for VM2 of
>>>> 172.18.2.100. This translates to a dnat_and_snat entry on LR2 for that
>>>> address, which is NOT in the subnet configured on that router. Now VM1
>>>> wants to ping 172.18.2.100. In this case, when the ping reaches LR1, LR1
>>>> needs to ARP to find where 172.18.2.100 is. The ARP needs to target LR2
>>>> since that's where the floating IP is configured. If the ARP only goes
>>>> to EXT-ROUTER, EXT-ROUTER won't be able to respond to the ARP properly,
>>>> and connectivity cannot be established.
>>>
>>> I might be wrong but I'm still not sure how valid it is to define a
>>> floating IP (172.18.2.100) on a router that doesn't have network that
>>> includes 172.18.2.100.
>>
>> Unfortunately, this battle has been fought a couple of times already,
>> and in the case of OpenStack, this is something they expect to work and
>> currently does work with ml2-ovs.
>>
>>>
>>> Would it work if neutron-ovn configured a 172.18.2.100/32 network on the
>>> LR2-to-LS-PUB router port?
>>
>> Yes, that would make the router appear to be "reachable" with the
>> current OVN code.
>>
>>>
>>> Otherwise, I think we need to clarify if ovn-kubernetes would still have
>>> a scale issue or not.
>>>
>>> A less pretty alternative is to allow such floating IPs (without a
>>> network that includes that IP) only if a new config knob is set.
>>
>> I think there are a couple of valid options here:
>>
>> 1) A configuration option. The option could be configured on the NAT
>> itself to allow for that address to be considered "reachable" on the
>> router even if it lies outside the router subnet.
>>
>> 2) Automatically allow this behavior for "floating IP" NATs
>> (dnat_and_snat, external_mac set, and port set).
>>
>>
>> (2) Has the advantage of "just working" for the OpenStack FIP case while
>> also not causing a flow explosion for ovn-kubernetes. However, (1) is
>> more flexible, allowing for the option to be set on other DNATs as well.
> 
> Maybe an option (3) that would make all CMSs happy could be:
> 
> For all dnat_and_snat logical IPs that are not part of router subnets
> and a single logical flow in table in_l2_lkup on LS-PUB flooding ARP

s/and a single/add a single/

> requests that target them in the MC_FLOOD group.  This shouldn't create
> a scale issue because it's a single flow for all such IPs. If this
> flow has higher priority than the flow restricting router self
> originated ARPs to MC_FLOOD_L2 then it should also work fine.
> 
> The downside is that we *might* hit the OVS resubmit limit for ARPs
> targeting such FIPs on scaled setups but, I guess, that's a different
> story.
> 
>>
>> I'm just bracing for when someone wants the same behavior for load
>> balancers...
>>
> 
> Also, I think the system test you're adding needs a small change as it
> currently passes both with and without your patch.  We need to remove:
> 
> ovn-nbctl lr-route-add lr1 172.18.2.12 172.18.1.173 lr1-ls-pub
> 
> What do you think?
> 
> Regards,
> Dumitru
> 



More information about the dev mailing list