[ovs-dev] [PATCH 1/2] ovn-northd: Ability to loop-back in a router.

Fri Jul 8 08:30:34 UTC 2016

> On Jul 7, 2016, at 10:36 PM, Mickey Spiegel <emspiege at us.ibm.com> wrote:
> 
> -----Guru Shetty <guru at ovn.org> wrote: -----
> 
>> To: Mickey Spiegel/San Jose/IBM at IBMUS
>> From: Guru Shetty <guru at ovn.org>
>> Date: 07/07/2016 09:34PM
>> Cc: ovs dev <dev at openvswitch.org>
>> Subject: Re: [ovs-dev] [PATCH 1/2] ovn-northd: Ability to loop-back
>> in a router.
>> 
>> 
>> 
>> On 7 July 2016 at 21:28, Guru Shetty <guru at ovn.org> wrote:
>> 
>> 
>>> On 7 July 2016 at 20:30, Mickey Spiegel <emspiege at us.ibm.com> wrote:
>>> To: dev at openvswitch.org
>>> From: Gurucharan Shetty
>>> Sent by: "dev"
>>> Date: 07/05/2016 11:15AM
>>> Subject: [ovs-dev] [PATCH 1/2] ovn-northd: Ability to loop-back in a router.
>>> 
>>> Currently, when a client looks at a load balancer VIP,
>>> it notices that it is in a different subnet than itself
>>> and sends the packet to its connected router port's
>>> MAC address. The load balancer intercepts it.
>>> 
>>> If the load balancer VIP translates to an endpoint IP in a
>>> different subnet (than the one the client has), then the
>>> load balancing works fine because the router will send
>>> the packet to the correct destination.
>>> 
>>> But if one of the endpoints that VIP translated into
>>> was in the same subnet as the client, the OVN router
>>> fails to send the packet back via the same interface.
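To make the failing case concrete, here is a small Python model of the scenario described above. It is only a sketch: the addresses, the port names, and the route/forward helpers are invented for illustration, and none of this is OVN code. It assumes the packet is dropped because the router's inport == outport check fires once the VIP has been translated to an endpoint behind the client's own router port.

```python
import ipaddress

# Hypothetical router ports and their connected subnets (illustrative only).
ROUTER_PORTS = {
    "rp-foo": ipaddress.ip_network("192.168.1.0/24"),
    "rp-bar": ipaddress.ip_network("192.168.2.0/24"),
}


def route(dst_ip):
    """Return the router port whose connected subnet contains dst_ip."""
    addr = ipaddress.ip_address(dst_ip)
    for port, subnet in ROUTER_PORTS.items():
        if addr in subnet:
            return port
    return None


def forward(inport, dst_ip, allow_loopback=False):
    """Model the router: pick an outport, then apply the inport == outport check."""
    outport = route(dst_ip)
    if outport is None:
        return "drop: no route"
    if outport == inport and not allow_loopback:
        return "drop: inport == outport"
    return "output on " + outport


# The client sits behind rp-foo. The load balancer has already translated
# the VIP (DNAT) to one of two endpoints before the router makes its decision.
endpoint_other_subnet = forward("rp-foo", "192.168.2.5")
endpoint_same_subnet = forward("rp-foo", "192.168.1.20")
endpoint_same_subnet_looped = forward("rp-foo", "192.168.1.20", allow_loopback=True)

print(endpoint_other_subnet)
print(endpoint_same_subnet)
print(endpoint_same_subnet_looped)
```

Only the same-subnet endpoint hits the check, which is the case the patch is trying to address.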
>> 
>> So the load balancer is translating the destination IP,
>> but leaving the MAC address unchanged?
>> Based on the MAC address, the packet is forwarded to
>> the router patch port?
>> Yes. This does look like common behavior. At least, the default
>> Kubernetes load balancers (or any iptables based load-balancers) seem
>> to do that.
> 
> This does not seem clean. I still wonder whether it would make
> more sense to start over on a separate logical switch for the load
> balancer, leading to a different patch port into the logical router.
Right now I feel that it complicates the topology without much benefit.

> 
>> --snip... 
>> 
>> 
>> I am concerned about two aspects of this proposal:
>> 1. It applies to all traffic to directly connected subnets, not just
>>   for load balancer traffic. That is a significant change in behavior.
>> Agreed. (Having said that, some physical routers seem to do the same
>> thing, i.e. they have the capability to send the traffic back. I am not
>> sure whether all physical routers are capable of doing it.)
> 
> A quick search told me that one of the major router vendors allowed
> that 8 or 9 years ago. Not sure if they allow it now.
> 
> Their firewalls do not allow it by default, but have a configuration knob.
> 
>> 2. It is removing the inport early on in the router ingress pipeline,
>>   which scares me and seems like it will make debugging difficult.
>>   You could narrow it down quite a bit by matching on inport, but
>>   that still leaves the behavior that concerns me for some traffic.
>>   Looking at my design for NAT in a distributed router, removing
>>   the inport would break it. I suspect there might be other
>>   future features that might act on inport, such as RPF.
>> 
>> 
>> This is only true when the destination IP address is in the same
>> subnet as the router port. For other cases, inport is available. Do
>> you also need to send back traffic? I guess what I am getting at is,
>> why do you think this will hurt other features which won't loop-back?
> 
> This is not about loopback. It is about the mechanism that you chose
> to achieve your goal, zeroing out the inport very early in the router
> ingress pipeline. Other lookups later in the router ingress pipeline
> may need to have the inport available for match conditions. For the
> NAT design that I am working on, I want to match on the router
> gateway address (SNAT) and inport == gateway port, together. For
> RPF, it could be any router port.
> 
>> Looks like my patch does it for every router port in that router.
>> That is clearly wrong and was not my intention. If I limit it to only
>> the port which has that subnet, would that satisfy your concern?
> 
> No. That is what I mentioned above by "narrow it down quite a bit
> by matching on inport". You would still be zeroing out the inport
> in some cases, which may affect later pipeline stages that want to
> match on inport.
> 
> Once you put this change in, in what cases are you still precluding
> inport == outport?
> Only when the dest IP matches a default, static or dynamic route
> rather than a connected subnet.
> Does the inport == outport check still have any significant value
> once you do that?
> I would argue not much. The simplest solution in that case would
> be to turn off the check for router datapaths, though I would still
> think it should be protected by a configuration knob of some sort.
> If you turn off the check for router datapaths, the change would be
> in physical.c for table 34, and would not affect the logical flows
> constructed by northd.

That is one way to look at it, and it makes sense. Let me think this over and talk to people for more ideas.
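
For reference, a rough Python model of that alternative: leave the inport intact through the whole router ingress pipeline and make the inport == outport check itself conditional per datapath. The Datapath class and the allow_loopback knob are hypothetical names chosen for this sketch; nothing here reflects the actual code in physical.c.

```python
from dataclasses import dataclass


@dataclass
class Datapath:
    name: str
    is_router: bool
    allow_loopback: bool = False  # hypothetical configuration knob


def loopback_check(dp, inport, outport):
    """Return True if the packet may be output, False if it must be dropped."""
    if inport != outport:
        return True
    # Same port in and out: only permitted on router datapaths with the knob set.
    return dp.is_router and dp.allow_loopback


switch = Datapath("ls0", is_router=False)
router_default = Datapath("lr0", is_router=True)
router_knob = Datapath("lr1", is_router=True, allow_loopback=True)

results = [
    loopback_check(switch, "p1", "p1"),
    loopback_check(router_default, "rp-foo", "rp-foo"),
    loopback_check(router_knob, "rp-foo", "rp-foo"),
    loopback_check(router_default, "rp-foo", "rp-bar"),
]
print(results)
```

Since the inport is never cleared in this model, later stages such as the NAT and RPF matches mentioned above could still match on it.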

> 
> Mickey
> 
>> (For cases like that, a workaround would be to store inport in a
>> register for later use? )
>> 
>> 
>>>    /* NAT in Gateway routers. */
>>> --
>>> 1.9.1
>>> 
>>> _______________________________________________
>>> dev mailing list
>>> dev at openvswitch.org
>>> http://openvswitch.org/mailman/listinfo/dev
>> 