[ovs-dev] [PATCH v7 3/7] ovn: Introduce "chassisredirect" port binding

Mon Jan 9 06:30:44 UTC 2017

On Fri, Jan 6, 2017 at 8:31 PM, Mickey Spiegel <mickeys.dev at gmail.com>
wrote:

>
> On Fri, Jan 6, 2017 at 4:21 PM, Mickey Spiegel <mickeys.dev at gmail.com>
> wrote:
>
>>
>> On Fri, Jan 6, 2017 at 4:11 PM, Ben Pfaff <blp at ovn.org> wrote:
>>
>>> On Fri, Jan 06, 2017 at 03:47:03PM -0800, Mickey Spiegel wrote:
>>> > On Fri, Jan 6, 2017 at 3:20 PM, Ben Pfaff <blp at ovn.org> wrote:
>>> >
>>> > > On Fri, Jan 06, 2017 at 12:00:30PM -0800, Mickey Spiegel wrote:
>>> > > > Currently OVN handles all logical router ports in a distributed
>>> > > > manner, creating instances on each chassis.  The logical router
>>> > > > ingress and egress pipelines are traversed locally on the source
>>> > > > chassis.
>>> > > >
>>> > > > In order to support advanced features such as one-to-many NAT (aka
>>> > > > IP masquerading), where multiple private IP addresses spread across
>>> > > > multiple chassis are mapped to one public IP address, it will be
>>> > > > necessary to handle some of the logical router processing on a
>>> > > > specific chassis in a centralized manner.
>>> > > >
>>> > > > The goal of this patch is to develop abstractions that allow for a
>>> > > > subset of router gateway traffic to be handled in a centralized
>>> > > > manner (e.g. one-to-many NAT traffic), while allowing for other
>>> > > > subsets of router gateway traffic to be handled in a distributed
>>> > > > manner (e.g. floating IP traffic).
>>> > > >
>>> > > > This patch introduces a new type of SB port_binding called
>>> > > > "chassisredirect".  A "chassisredirect" port represents a
>>> > > > particular instance, bound to a specific chassis, of an otherwise
>>> > > > distributed port.  The ovn-controller on that chassis populates the
>>> > > > "chassis" column for this record as an indication for other
>>> > > > ovn-controllers of its physical location.  Other ovn-controllers do
>>> > > > not treat this port as a local port.
>>> > > >
>>> > > > A "chassisredirect" port should never be used as an "inport".
>>> > > > When an ingress pipeline sets the "outport", it may set the value
>>> > > > to a logical port of type "chassisredirect".  This will cause the
>>> > > > packet to be directed to a specific chassis to carry out the egress
>>> > > > logical router pipeline, in the same way that a logical switch
>>> > > > forwards egress traffic to a VIF port residing on a specific
>>> > > > chassis.  At the beginning of the egress pipeline, the "outport"
>>> > > > will be reset to the value of the distributed port.
>>> > > >
>>> > > > For outbound traffic to be handled in a centralized manner, the
>>> > > > "outport" should be set to the "chassisredirect" port representing
>>> > > > centralized gateway functionality in the otherwise distributed
>>> > > > router.  For outbound traffic to be handled in a distributed
>>> > > > manner, locally on the source chassis, the "outport" should be set
>>> > > > to the existing "patch" port representing distributed gateway
>>> > > > functionality.
>>> > > >
>>> > > > Inbound traffic will be directed to the appropriate chassis by
>>> > > > restricting source MAC address usage and ARP responses to that
>>> > > > chassis, or by running dynamic routing protocols.
>>> > > >
>>> > > > Note that "chassisredirect" ports have no associated IP or MAC
>>> > > > addresses.  Any pipeline stages that depend on port specific IP or
>>> > > > MAC addresses should be carried out in the context of the
>>> > > > distributed port.
>>> > > >
>>> > > > Although the abstraction represented by the "chassisredirect" port
>>> > > > binding is generalized, in this patch the "chassisredirect" port
>>> > > > binding is only created for NB logical router ports that specify
>>> > > > the new "redirect-chassis" option.  There is no explicit notion of
>>> > > > a "chassisredirect" port in the NB database.  The expectation is
>>> > > > when capabilities are implemented that take advantage of
>>> > > > "chassisredirect" ports (e.g. NAT), the addition of flows
>>> > > > specifying a "chassisredirect" port as the outport will also be
>>> > > > triggered by the presence of the "redirect-chassis" option.  Such
>>> > > > flows are added for NB logical router ports that specify the
>>> > > > "redirect-chassis" option.
>>> > > >
>>> > > > Signed-off-by: Mickey Spiegel <mickeys.dev at gmail.com>
>>> > >
>>> > > chassisredirect ports seem incredibly similar to vif ports.  Is the
>>> > > only difference that the output port is changed at the beginning of
>>> > > the egress pipeline?  That's something that could be implemented in
>>> > > the logical egress pipeline with 'outport = "...";'.  We do say that
>>> > > the outport isn't supposed to be modified in an egress pipeline, but
>>> > > nothing enforces that and if it's actually useful then we could just
>>> > > change the documentation.
>>> > >
>>> >
>>> > I don't get the similarity to vif ports.
>>> >
>>> > I need to create two different ports for each logical router port
>>> > specifying a "redirect-chassis". One represents the centralized
>>> > instance, for traffic that needs to be centralized. The other
>>> > represents the distributed instance, i.e. just take the local patch
>>> > port and go to/from the local logical router instance. I wanted the
>>> > egress pipeline processing to be the same regardless of whether
>>> > the packet arrived at the egress pipeline on the port representing
>>> > the centralized instance, or whether the packet arrived at the
>>> > egress pipeline on the port representing the distributed instance.
>>> >
>>> > There is no pipeline processing of the chassisredirect port,
>>> > except as the outport in the ingress pipeline. Everything else
>>> > happens in tables 32 and 33.
>>>
>>> OK, then I'm having trouble following the description.  For me, here's
>>> the key paragraphs that led me to my conclusions:
>>>
>>>     This patch introduces a new type of SB port_binding called
>>>     "chassisredirect".  A "chassisredirect" port represents a particular
>>>     instance, bound to a specific chassis, of an otherwise distributed
>>>     port.  The ovn-controller on that chassis populates the "chassis"
>>>     column for this record as an indication for other ovn-controllers of
>>>     its physical location.  Other ovn-controllers do not treat this port
>>>     as a local port.
>>>
>>>     A "chassisredirect" port should never be used as an "inport".  When
>>>     an ingress pipeline sets the "outport", it may set the value to a
>>>     logical port of type "chassisredirect".  This will cause the packet
>>>     to be directed to a specific chassis to carry out the egress logical
>>>     router pipeline, in the same way that a logical switch forwards
>>>     egress traffic to a VIF port residing on a specific chassis.  At the
>>>     beginning of the egress pipeline, the "outport" will be reset to the
>>>     value of the distributed port.
>>>
>>> The first paragraph appears to say that a chassisredirect port is a port
>>> on a particular chassis and that its chassis column says what chassis
>>> it's on.  OK, that's the same as a vif port, right?
>>>
>>
>> Yes, the same as vif, l2gateway, or l3gateway in the sense that this
>> port is bound to a chassis. No differences there.
>>
>>>
>>> The second paragraph appears to me to say, first, that packets would
>>> never originate from a chassisredirect port.  OK, fine, no problem.
>>> Second, it directly makes an analogy to vif ports, and then says that
>>> the outport changes.  No problem.
>>>
>>
>> Two main differences from vif:
>> 1. The outport changes. I want the ct_zone assignments in table 33
>>    and the loopback check in table 34 to be according to the new
>>    outport.
>>
>> 2. There is no pipeline processing of this port. This port has no
>>    addresses or other configuration. The purpose of the port is to
>>    tell table 32 to go to a particular chassis, and then tell table 33
>>    what the real outport should be.
>>
>> I got to this notion because a port is the way to tell table 32 to
>> go to a particular chassis. The first thought was two regular patch
>> ports, but the idea of two patch ports with the same addresses
>> is confusing and dangerous. By changing back to the real patch
>> port right away in the egress pipeline, it avoids those problems.
>>
>> Mickey
>>
>
> Let me go back to first principles. I need three sorts of chassis
> specific behaviors for distributed NAT:
> 1. Install some flows only on the chassis where a certain logical
>    port resides. That is is_chassis_resident which you already
>    reviewed and acked. The nat flows patch at the end of the
>    patch set uses this mechanism.
> 2. Install a different set of flows associated with the distributed
>    gateway port only on the redirect-chassis. There are several
>    such flows in this patch.
> 3. Direct some traffic with outport being the distributed gateway
>    port to the instance of the distributed gateway port on the
>    redirect-chassis. When this traffic hits table 32, it gets
>    sent through the normal tunnel to the redirect-chassis.
>
> I needed some handle that triggers 3. I decided to make that
> handle be a port, which I called a "chassisredirect" port. That
> also allows me to use is_chassis_resident(chassisredirect_port)
> to solve 2.
>
> It is possible to make that handle be something other than a
> port, as long as table 32 is modified to act on that. In that case,
> I will need another match "condition" (as I called it) based on
> that handle, similar to is_chassis_resident but based on
> whatever handle we decide on instead of port.
>

I realized earlier tonight that there is a straightforward
alternative, though it does have one potentially confusing
aspect.

For some reason, I had been assuming that a port_binding is
either exclusive to a chassis (in the previous implementation
with OVS patch ports, it had an ofport), or the port_binding
exists everywhere and does not have a chassis association
(is_remote in the previous implementation with OVS patch
ports).

If this is relaxed and we allow logical patch ports to be
associated with a chassis, then all I need is a new
MLF_FORCE_CHASSIS_REDIRECT flag rather than
a second port_binding with a new "chassisredirect" type.

The potentially confusing aspect is that even though the
mechanism for associating a logical patch port with a
chassis is identical to that for other port_binding types such
as "l3gateway", the association of a chassis with a logical
patch port has a different meaning than the association of a
chassis with a VIF, a type "l3gateway" port_binding, or a
type "l2gateway" port_binding.  For the latter, the association
is exclusive, i.e. the port only exists on that chassis.  For
logical patch ports, whether there is an association with a
chassis or not, the logical patch port exists everywhere
(subject to the constraints of conditional monitoring).

The chassis association would only be used for a new
table 32 flow similar to other flows sending packets to
remote hypervisors for other port_binding types, but with
a different match condition:
    match_set_metadata(&match, htonll(dp_key));
    match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
    match_set_reg_masked(&match, MFF_LOG_FLAGS - MFF_REG0,
                         1, MLF_FORCE_CHASSIS_REDIRECT);

Depending on whether the
MLF_FORCE_CHASSIS_REDIRECT flag is set, the
packet would either be sent to the remote hypervisor,
or it would fall through to the table 32 priority 0 fallback
flow and be processed locally.
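
To make the two outcomes concrete, here is a rough standalone model of
that table 32 decision.  It is only a sketch, not ovn-controller code:
the struct, the enum, the function name, and the bit position chosen
for MLF_FORCE_CHASSIS_REDIRECT are all made up for illustration, and it
models a chassis other than the one the logical patch port is
associated with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position; the real value would be defined by the patch. */
#define MLF_FORCE_CHASSIS_REDIRECT (UINT32_C(1) << 2)

enum table32_verdict {
    T32_SEND_TO_REMOTE,    /* Redirect flow matched: tunnel to the chassis. */
    T32_PROCESS_LOCALLY    /* Priority-0 fallback: continue on this chassis. */
};

/* Toy stand-in for the fields the flow matches on. */
struct pkt_md {
    uint64_t dp_key;     /* Logical datapath, i.e. the metadata field. */
    uint32_t outport;    /* Logical output port key. */
    uint32_t flags;      /* Logical flags register. */
};

/* Models one table 32 lookup for a single logical patch port identified
 * by (dp_key, port_key) that has a chassis association. */
enum table32_verdict
table32_lookup(const struct pkt_md *md, uint64_t dp_key, uint32_t port_key)
{
    assert(md != NULL);
    bool redirect = md->dp_key == dp_key
                    && md->outport == port_key
                    && (md->flags & MLF_FORCE_CHASSIS_REDIRECT) != 0;
    return redirect ? T32_SEND_TO_REMOTE : T32_PROCESS_LOCALLY;
}
```

With the flag set and the outport equal to the patch port, the model
returns T32_SEND_TO_REMOTE.  With the flag clear, or with any other
outport, it returns T32_PROCESS_LOCALLY.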

The chassis association could also be used for
evaluation of is_chassis_resident("l3dgw_port") functions
in flow matches.
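
As a rough illustration of what that evaluation could look like, here
is a toy model.  It is a sketch under assumptions, not the actual
is_chassis_resident implementation: struct pb_model and
is_chassis_resident_model are hypothetical names, and a real
implementation would look up the Port_Binding record rather than scan
an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a port binding record. */
struct pb_model {
    const char *logical_port;
    const char *chassis;   /* NULL when there is no chassis association. */
};

/* Returns true only if 'port' has a chassis association and that chassis
 * is 'local_chassis'.  Unknown ports are treated as not resident. */
bool
is_chassis_resident_model(const struct pb_model *pbs, size_t n,
                          const char *port, const char *local_chassis)
{
    assert(port != NULL && local_chassis != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(pbs[i].logical_port, port)) {
            return pbs[i].chassis && !strcmp(pbs[i].chassis, local_chassis);
        }
    }
    return false;
}
```

In this model a logical patch port without a chassis association is
never resident anywhere, even though the port itself exists on every
chassis.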

If you agree that this approach is more promising than
type "chassisredirect" ports, I can code this up tomorrow.

Mickey

> Mickey
>
>
>>
>>> I guess that I must be missing important points, but that's why I
>>> interpreted the text as I did.  Can you help me figure out why I'm not
>>> following?
>>>
>>> Thanks,
>>>
>>> Ben.
>>>
>>
>>
>