[ovs-dev] OVN based distributed virtual routing for VLAN backed networks

Miguel Angel Ajo Pelayo majopela at redhat.com
Tue Oct 23 07:43:47 UTC 2018


Hi,

     Nice document, you are capturing the nuances of handling VLAN very
well. I agree that we need a chassis-specific MAC for the distributed
routers when handling E/W traffic, otherwise you end up with the
situation described in [1], where the router port MAC flips.

    That is also an issue if you then need to reach a centralized gateway
chassis for N/S traffic on that MAC address.


    Also, as Mark Michelson <mmichels at redhat.com> already said, there are
currently ways to model that. Venkata Kommaddi <anilvenkata at redhat.com>
sent patches to make that approach possible [2], and it was working,
although it needed more work to avoid MAC flipping. That is why Numan
Siddique <nusiddiq at redhat.com> introduced a new set of patches [3] to
allow VLAN E/W in a centralized way for the time being, with much simpler
modifications.


   That said, your proposal makes OVN handle VLANs more easily, although it
lacks something that we make use of in OpenStack: the ability to use
several provider networks for VLAN instead of only one. In
OpenStack/Neutron you can reference several provider networks, each of
them with a range of VLAN IDs, which serve as carriers of your "virtual
switches/networks" (equivalent to the current OVN bridge mappings plus
VLAN ranges). See [4]:

network_vlan_ranges
<https://docs.openstack.org/neutron/pike/configuration/ml2-conf.html#ml2_type_vlan.network_vlan_ranges>
Type: list
Default:

List of <physical_network>:<vlan_min>:<vlan_max> or <physical_network>
specifying physical_network names usable for VLAN provider and tenant
networks, as well as ranges of VLAN tags on each available for allocation
to tenant networks.
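
For comparison, the rough OVN-side equivalent today is the per-chassis
bridge-mappings setting, which can already carry several provider networks
(bridge and physnet names below are only examples; the VLAN-range
allocation itself still lives in Neutron):

    # Map two provider networks to two provider bridges on a chassis:
    ovs-vsctl set Open_vSwitch . \
        external-ids:ovn-bridge-mappings="physnet1:br-provider1,physnet2:br-provider2"

    # A VLAN-backed logical switch is then tied to one of them through its
    # localnet port:
    ovn-nbctl lsp-set-options ln-port network_name=physnet1
    ovn-nbctl set Logical_Switch_Port ln-port tag_request=100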

If we move in a different direction, I believe support for that would be
necessary from the OpenStack side.

    Best regards,
Miguel Ángel.


[1] https://ajo.es/ovn-distributed-ew-on-vlan/#the-end-oh-no
[2] https://patchwork.ozlabs.org/bundle/mangelajo/l3-vlan/
[3] https://patchwork.ozlabs.org/bundle/mangelajo/vlan-ctl/
[4] https://docs.openstack.org/neutron/pike/configuration/ml2-conf.html

On Sat, Oct 20, 2018 at 2:36 AM Ankur Sharma <ankur.sharma at nutanix.com>
wrote:

> Hi Han,
>
> Appreciate your feedback.
> Please find the reply inline.
>
> Thanks
>
> Regards,
> Ankur
>
> From: Han Zhou <zhouhan at gmail.com>
> Sent: Friday, October 19, 2018 5:17 PM
> To: Ankur Sharma <ankur.sharma at nutanix.com>
> Cc: Mark Michelson <mmichels at redhat.com>; ovs-dev at openvswitch.org
> Subject: Re: [ovs-dev] OVN based distributed virtual routing for VLAN
> backed networks
>
>
>
> On Fri, Oct 19, 2018 at 3:09 PM Ankur Sharma <ankur.sharma at nutanix.com> wrote:
> >
> > Hi Han,
> >
> > Thanks a lot for review.
> > Please find my replies inline.
> >
> > Please  feel free to put forth more points for discussion.
> >
> > Thanks
> >
> > Regards,
> > Ankur
> >
> >
> >
> > From: Han Zhou <zhouhan at gmail.com>
> > Sent: Thursday, October 18, 2018 11:55 PM
> > To: Ankur Sharma <ankur.sharma at nutanix.com>
> > Cc: Mark Michelson <mmichels at redhat.com>; ovs-dev at openvswitch.org
> > Subject: Re: [ovs-dev] OVN based distributed virtual routing for VLAN
> backed networks
> >
> >
> >
> > Hi Ankur, Mark,
> >
> >
> >
> > Please find my comments inline below.
> >
> >
> >
> > (I will spend more time to understand the change for the NAT case. )
> >
> >
> >
> > Thanks,
> >
> > Han
> >
> >
> >
> >
> > On Thu, Oct 18, 2018 at 4:40 PM Ankur Sharma <ankur.sharma at nutanix.com> wrote:
> > >
> > > Hi,
> > >
> > > As per our discussion in the IRC meeting today, I have added all the
> diagrams in the following Google doc:
> > >
> https://docs.google.com/document/d/1uoQH478wM1OZ16HrxzbOUvk5LvFnfNEWbkPT6Zmm9OU/edit?usp=sharing
> > >
> > > Please take a look.
> > >
> > > Appreciate the feedback so far, looking forward to more discussions.
> > >
> > > Thanks
> > >
> > > Regards,
> > > Ankur
> > >
> > >
> > > -----Original Message-----
> > > From: Ankur Sharma
> > > Sent: Wednesday, October 17, 2018 3:37 PM
> > > To: 'Mark Michelson' <mmichels at redhat.com>; ovs-dev at openvswitch.org
> > > Subject: RE: [ovs-dev] OVN based distributed virtual routing for VLAN
> backed networks
> > >
> > > Hi Mark,
> > >
> > > Thanks a lot for the feedback.
> > > Regarding the figures, I attached the PNGs (they show in my sent items),
> but it looks like they got filtered.
> > > My bad on that. Is there a location where the OVS community uploads
> images for reference?
> > > Please bear with us, hopefully, we will be able to avoid some of these
> glitches in our next conversations.
> > >
> > > Appreciate your comments on the proposal, please find my replies
> inline.
> > >
> > > Thanks
> > >
> > > Regards,
> > > Ankur
> > >
> > > -----Original Message-----
> > > From: Mark Michelson <mmichels at redhat.com>
> > > Sent: Wednesday, October 17, 2018 2:50 PM
> > > To: Ankur Sharma <ankur.sharma at nutanix.com>; ovs-dev at openvswitch.org
> > > Subject: Re: [ovs-dev] OVN based distributed virtual routing for VLAN
> backed networks
> > >
> > > Hi Ankur,
> > >
> > > Thanks for the detailed document! I always appreciate it when things
> are planned out in great detail so we know exactly what to expect.
> > >
> > > A general comment: there are places below where things like "figure 1"
> > > and "figure OVN bridge deployment" are referenced, but we can't see
> them. Is there a link to another document you can share that has these
> figures present?
> > >
> > > Other comments of mine are inline below.
> > >
> > > On 10/16/2018 06:43 PM, Ankur Sharma wrote:
> > > > Hi,
> > > >
> > > > We have done some effort in evaluating usage of OVN for Distributed
> >
> > > > Virtual Routing (DVR) for vlan backed networks.
> >
> >
> >
> > So the proposal should work only when all the HVs are physically L2
> connected (no L3 hops in between), correct? OVN doesn't have this
> assumption, but I think it should be ok if it is documented well so that
> users will understand this limitation when using this feature.
> >
> >
> >
> > > >
> > > > We would like to take it forward with the community.
> > > >
> > > > We understand that some of the work could be overlapping with
> existing
> > > > patches in review.
> > > >
> > > > We would appreciate the feedback and would be happy to update our
> > > > patches to avoid known overlaps.
> > > >
> > > > This email explains the proposal. We will be following it up with
> patches.
> > > > Each "CODE CHANGES" section summarizes the change that corresponding
> > > > patch would have.
> > > >
> > > >
> > > > DISTRIBUTED VIRTUAL ROUTING FOR VLAN BACKED NETWORKS
> > > > ======================================================
> > > >
> > > >
> > > > 1. OVN Bridge Deployment
> > > > ------------------------------------
> > > >
> > > > Our design follows this ovn-bridge deployment model (please
> refer
> > > > to figure OVN Bridge deployment).
> > > >      i. br-int ==> OVN managed bridge.
> > > >         br-pif ==> Learning Bridge, where physical NICs will be
> connected.
> > > >
> > > >     ii. Any packet that should be on physical network, will travel
> from BR-INT
> > > >         to BR-PIF, via patch ports (localnet ports).
> > > >
> > > > 2. Layer 2
> > > > -------------
> > > >
> > > >     DESIGN:
> > > >     ~~~~~~~
> > > >     a. Leverage the localnet logical port type as the patch port between
> br-int and
> > > >         br-pif.
> > > >     b. Each VLAN backed logical switch will have a localnet port
> connected
> > > >         to it.
> > > >     c. Tagging and untagging of vlan headers happens at localnet
> port boundary.
> > > >
> > > >     PIPELINE EXECUTION:
> > > >     ~~~~~~~~~~~~~~~~~~~
> > > >     a. Unlike geneve encap based solution, where we execute ingress
> pipeline on
> > > >         source chassis and egress pipeline on destination chassis,
> for vlan
> > > >         backed logical switches, packet will go through ingress
> pipeline
> > > >         on destination chassis as well.
> > > >
> > > >     PACKET FLOW (Figure 1. shows topology and Figure 2. shows the
> packet flow):
> > > >
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     a. VM sends unicast traffic (destined to VM2_MAC) to br-int.
> > > >     b. For br-int, destination mac is not local, hence it will
> forward it to
> > > >         localnet port (by design), which is attached to br-pif. This
> is
> > > >         the stage at which vlan tag is added. Br-pif forwards the
> packet
> > > >         to physical interface.
> > > >     c. br-pif on destination chassis sends the received traffic to
> patch-ports
> > > >         on br-int (as unicast or unknown unicast).
> > > >     d. br-int does vlan tag check, strips the vlan header and sends
> > > >         the packet to ingress pipeline of the corresponding datapath.
> > > >
> > > >
> > > >     KEY DIFFERENCES AS COMPARED TO OVERLAY:
> > > >     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     a. No encapsulation.
> > > >     b. Both ingress and egress pipelines of logical switch are
> executed on
> > > >         both source and destination hypervisor (unlike overlay where
> ingress
> > > >         pipeline is executed on source hypervisor and egress on
> destination).
> > > >
> > > >     CODE CHANGES:
> > > >     ~~~~~~~~~~~~~
> > > >     a. ovn-nb.ovsschema:
> > > >          1. Add a new column to table Logical_Switch.
> > > >          2. Column name would be "type".
> > > >          3. Values would be either "vlan" or "overlay", with
> "overlay"
> > > >              being default.
> > > >
> > > >     b. ovn-nbctl:
> > > >          1. Add a new CLI command which sets the "type" of a logical switch.
> > > >              ovn-nbctl ls-set-network-type SWITCH TYPE
> > > >
> > > >     c. ovn-northd:
> > > >          1. Add a new enum to ovn_datapath struct, which will
> indicate
> > > >              if logical_switch datapath type is overlay or vlan.
> > > >          2. Populate a new key value pair in southbound database for
> Datapath
> > > >              Bindings of Logical_Switch.
> > > >          3. Key value pair: <logical-switch-type, "vlan" or
> "overlay">, default
> > > >              will be overlay.
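> > > >
> > > >     For illustration, a minimal sketch of how the proposed knob might be
> > > >     exercised and verified (the ls-set-network-type command and the
> > > >     logical-switch-type key are part of this proposal, not existing OVN
> > > >     behavior; the switch name is an example):
> > > >
> > > >          # Mark a logical switch as VLAN backed (proposed command):
> > > >          ovn-nbctl ls-set-network-type ls-vlan100 vlan
> > > >
> > > >          # Expected result in the southbound Datapath_Binding for that
> > > >          # switch (proposed external-ids key):
> > > >          ovn-sbctl find Datapath_Binding external_ids:name=ls-vlan100
> > > >          # external_ids : {logical-switch-type=vlan, name=ls-vlan100}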
> > >
> > > I believe everything described in this section is doable in OVN
> already without any code changes.
> > >
> > > Essentially, you can do the following:
> > > 1) On a logical switch, create a logical switch port of type "localnet"
> > > and set its addresses to "unknown".
> > > 2) On the localnet port, set options:network_name to a network name.
> > > 3) On the localnet port, set tag_request to the VLAN identifier you
> want to use.
> > > 4) On the hypervisor where ovn-controller runs, create the br-pif
> bridge.
> > > 5) On the hypervisor where ovn-controller runs, in the Open_vSwitch
> table's record, set external-ids:ovn-bridge-mappings =
> <network_name>:br-pif. "network_name" in this case is the network_name you
> set on the localnet port in step 2.
> > >
> > > With this setup, ovn-controller will automatically create the patch
> ports between br-int and br-pif, and will use the VLAN tag from the
> localnet port for two purposes:
> > > 1) On traffic sent out of br-int over the patch port, the tag will be
> added to the packet.
> > > 2) On traffic received from the patch port into br-int, the VLAN tag
> must match the configured VLAN tag on the localnet port. If it matches, the
> tag is stripped.
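> > >
> > > For concreteness, here is a minimal sketch of that configuration (switch,
> > > port, network and VLAN names/values are just examples):
> > >
> > >      # Steps 1-3: localnet port with "unknown" addresses, a network_name
> > >      # and a VLAN tag request:
> > >      ovn-nbctl lsp-add ls1 ln-ls1
> > >      ovn-nbctl lsp-set-type ln-ls1 localnet
> > >      ovn-nbctl lsp-set-addresses ln-ls1 unknown
> > >      ovn-nbctl lsp-set-options ln-ls1 network_name=physnet1
> > >      ovn-nbctl set Logical_Switch_Port ln-ls1 tag_request=100
> > >
> > >      # Steps 4-5: on each hypervisor, the provider bridge and the mapping
> > >      # from network_name to that bridge:
> > >      ovs-vsctl --may-exist add-br br-pif
> > >      ovs-vsctl set Open_vSwitch . \
> > >          external-ids:ovn-bridge-mappings=physnet1:br-pif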
> > >
> > > The only aspect of the above I'm not 100% sure about is the logical
> switch ingress and egress pipelines being run on both source and
> destination hypervisor. But I *think* that's how it works in this case.
> > > [ANKUR]:
> > > Sorry, I should have mentioned it.
> > > Yes, the current OVN implementation for localnet ports worked fine for
> us (we followed exactly the same steps you mentioned, minus step 2).
> > >
> > > Our proposal is to add a new column in Logical_Switch table to
> indicate if a logical switch is of type "vlan" or "overlay".
> > > This logical_switch type will be of help in our Layer 3 patches and
> based on network type, we can make some specific Forwarding decisions.
> > >
> > > Please let us know your opinion on this.
> >
> > >
> >
> >
> >
> > Explicit configuration seems to be a good idea. Would you mind giving more
> details on how the "type" will be used in the implementation?
> > [ANKUR]
> > We intend “type” to differentiate if a logical switch is of type vlan or
> overlay.
> > It will be useful for the following:
> > Debuggability:
> >      i. If the type is vlan, then we would expect a localnet port on the
> logical-switch, etc.
> >
> > Functionality:
> >     i. From a logical router perspective, if a connected logical-switch
> is of type “vlan”, then we will have a flow which replaces the
> router-port MAC with the chassis MAC when the packet goes on the wire
> on this logical-switch.
> >
> > >
> >
> > > >
> > > >
> > > > 3. Layer 3 East West
> > > > --------------------
> > > >
> > > >     DESIGN:
> > > >     ~~~~~~~
> > > >     a. Since the router port is distributed and there is no
> encapsulation,
> > > >         packets with the router port MAC as source MAC cannot go
> on the wire.
> > > >     b. We propose replacing router port mac with a chassis specific
> mac,
> > > >         whenever packet goes on wire.
> > > >     c. Number of chassis_mac per chassis could be dependent on
> number of
> > > >         physical nics and corresponding bond policy  on br-pif.
> > > >
> > > >        As of now, we propose only one chassis_mac per chassis
> > > >        (shared by all resident logical routers). However, we are
> analyzing
> > > >        if br-pif's bond policy would require more macs per chassis.
> > > >
> > > >     PIPELINE EXECUTION:
> > > >     ~~~~~~~~~~~~~~~~~~~
> > > >     a. For a DVR E-W flow, both ingress and egress pipelines for
> logical_router
> > > >         will execute on source chassis only.
> > > >
> > > >     PACKET FLOW (Figure 3. shows topology and Figure 4. shows the
> packet flow):
> > > >
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     a. VM1 sends packet (destined to IP2), to br-int.
> > > >     b. On Source hypervisor, packet goes through following pipelines:
> > > >        1. Ingress: logical-switch 1
> > > >        2. Egress:  logical-switch 1
> > > >        3. Ingress: logical-router
> > > >        4. Egress:  logical-router
> > > >        5. Ingress: logical-switch2
> > > >        6. Egress:  logical-switch2
> > > >
> > > >        On wire, packet goes out with destination logical switch's
> vlan.
> > > >        As mentioned in design, source mac (RP2_MAC) would be
> replaced with
> > > >        CHASSIS_MAC and destination mac would be that of VM2.
> > > >
> > > >     c. Packet reaches destination chassis and enters logical-switch2
> > > >         pipeline in br-int.
> > > >     d. Packet goes through logical-switch2 pipeline (both ingress
> and egress)
> > > >         and gets forwarded to VM2.
> > > >
> > > >     CODE CHANGES:
> > > >     ~~~~~~~~~~~~~
> > > >     a. ovn-sb.ovsschema:
> > > >          1. Add a new column to the table Chassis.
> > > >          2. Column name would be "chassis_macs", type being string
> and no
> > > >              limit on range of values.
> > > >          3. This column will hold a list of chassis-unique MACs.
> > > >          4. This table will be populated from ovn-controller.
> > > >
> > > >     b. ovn-sbctl:
> > > >          1. CLI to add/delete chassis_macs to/from the south bound
> database.
> > > >
> > > >     c. ovn-controller:
> > > >          1. Read chassis MACs from the OVS Open_vSwitch table and
> populate
> > > >              south bound database.
> > > >          2. In table=65, add a new flow at priority 150, which will
> do following:
> > > >             a. Match: source_mac == router_port_mac, metadata ==
> > > >                 destination_logical_switch, logical_outport =
> localnet_port
> > > >             b. Action: Replace source mac with chassis_mac, add vlan
> tag.
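> > > >
> > > >     For illustration, a minimal sketch of how the chassis MAC might be
> > > >     configured (the external-ids key and the Chassis column below are
> > > >     part of this proposal, not existing OVN options; names and the MAC
> > > >     are examples):
> > > >
> > > >          # On the hypervisor, a per-chassis MAC that ovn-controller
> > > >          # would read and publish into the Chassis table (hypothetical
> > > >          # key name):
> > > >          ovs-vsctl set Open_vSwitch . \
> > > >              external-ids:ovn-chassis-macs="aa:bb:cc:dd:ee:01"
> > > >
> > > >          # Or written directly into the proposed Chassis column via the
> > > >          # generic ovn-sbctl db commands:
> > > >          ovn-sbctl set Chassis hv1 chassis_macs='"aa:bb:cc:dd:ee:01"'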
> > > >
> > >
> > > It sounds like this shares some similarities with this proposed patch:
> > >
> > > https://patchwork.ozlabs.org/patch/952122/
> > >
> > > In the linked patch, the idea is to use a consistent source MAC in
> order to play well with physical switches. However, the approach used in
> the linked patch is quite different from your proposal here.
> > >
> > > I like your proposal because I like the explicit configuration. The
> one question I have is, how do you determine which chassis MAC to use if
> multiple are specified? One idea might be to use something similar to the
> ovn-bridge-mappings. In other words, you map a network_name to a specific
> chassis MAC.
> > >
> > > [ANKUR]:
> > > We went through this series.
> > > Yes, it might look to be on similar lines, but it differs in the following
> ways:
> > > a. That fix is to use the gateway router port MAC for any reply packets
> from the gateway chassis, which makes it specific
> > >     to the gateway router port.
> > > b. Our proposal is meant for E-W,
> > > i.e. we want to make sure that, in the absence of any encapsulation,
> the distributed router port MAC does not go on the wire as source MAC.
> >
> > >
> >
> >
> >
> > I guess the reason you don't want the router port MAC to be on the wire is
> because it is distributed and so the physical switch would see the MAC
> coming from different ports thus would get confused - but it seems not a
> real problem if no one is using it as destination MAC on the wire, right?
> Or is there any other reason?
> > [ANKUR]:
> > You are correct, the main reason for replacing the router port MAC is to avoid
> mac moves in the physical switch.
> >
> > There are 2 reasons we want to avoid it:
> >
> > a. Mac Move limit on physical switches:
> >    Physical switches have a feature (MAC move limit), where they
> restrict the number of times a MAC can move between ports within a
> certain time interval.
> >      <link>
> https://www.juniper.net/documentation/en_US/junos/topics/task/configuration/port-security-mac-move-limiting-cli.html
> >
> >     In case of a violation, it could either log the event, drop the
> packet or, at worst, shut down the port.
> >
> >
> >
> >     Also, high frequency mac moves will add load on the physical switch,
> especially on the control plane.
> >
> > b.  In the absence of encapsulation, any “redirect” to the gateway chassis
> has to be done by sending the packet to the corresponding router port MAC.
> >       Hence, we have to make sure that, from the physical switch's
> perspective, the router port MAC is learnt “only” on the port which is
> connected to the corresponding gateway chassis.
>
> Thanks for explaining. These are very good points that I hadn't thought about.
> >
> > Also, replacing the router port MAC with the chassis MAC will be configuration
> driven, i.e. if there is no chassis MAC configuration, then we will use the
> router port MAC.
> >
> > And I have a concern for replacing the router port MAC with a chassis
> specific MAC, since the VM will send packets firstly using router port MAC
> as destination MAC for the router IP, but will see ingress packets with
> different src MAC (the chassis specific MAC). Wouldn't it confuse the ARP
> cache (neighbor table) on the VM?
> >
> > [ANKUR]:
> > You have valid concern Han.
> >
> > a. It would not affect the VM’s ARP cache. This is because the ARP cache is
> populated only on the basis of ARP headers.
> >     In our implementation, we replace only the Ethernet header’s source
> MAC with the chassis MAC, and the ARP (/GARP/RARP) header is untouched.
> >     That way the router port’s <MAC, IP> binding will remain intact (not
> just in the VM, but also in physical network switches and routers).
> >
>
> The VM may still function well, but I think it would at least change the
> behavior a little bit: the ARP cache entry will be learned first through the ARP
> response from the logical router port, i.e. <router port IP, router port MAC>.
> However, since the VM would not see any response traffic hitting this entry
> with this IP & MAC pair, the entry will move to "stale" state and then
> start ARP procedure again, which will be resolved with the same router port
> MAC again, thus put the entry back to "reachable" state. This would not
> impact the traffic, but is different from the original implementation. It
> would periodically generate ARP requests unnecessarily. Maybe it is not a
> problem in reality.
>
> [ANKUR]:
> Yes, I do not disagree with that. I believe it would also depend on the TCP/IP
> stack running in the guest (VM) operating system.
> For example, I have seen cases where the stack generates periodic ARP queries
> for its gateway, irrespective of traffic flow.
>
> We thought of fixing it by replacing the source MAC again with the router
> port MAC on the destination chassis,
> but the implementation was not trivial, hence we did not proceed further
> with it.
>
> As of now, I believe we should consider doing it, only if we see a strong
> use case/limitation.
>
> >
> > > And just to complete the story, only the non-distributed router port's
> (cr-lrp-*) MAC will be sent on the wire, and only on the corresponding gateway
> chassis.
> > >
> > > c. For the gateway router port, we intend to send periodic (tunable, at an
> interval of approx. 3 minutes) GARPs (or RARPs?)
> > >     to make sure that the physical switch will not age out the gateway router
> port MAC.
> > >    This will be helpful for request packets as well, since they will be
> directed to the gateway router port.
> > >
> > > Regarding multiple CHASSIS Macs:
> > > a. Yes, you are right, in case of multiple uplink bridges, we should
> map a chassis_mac to a bridge.
> > >     We will make sure this change is in there when we send out the patch
> for review.
> > >
> > > >
> > > > 4. LAYER 3 North South (NO NAT)
> > > > -------------------------------
> > > >
> > > >     DESIGN:
> > > >     ~~~~~~~
> > > >     a. For talking to external network endpoint, we will need a
> gateway
> > > >        on OVN DVR.
> > > >     b. We propose to use the gateway_chassis construct to achieve
> the same.
> > > >     c. LRP will be attached to gateway chassis(es) and only on the
> active
> > > >         chassis will we respond to ARP requests for the LRP IP from the
> underlay
> > > >         network.
> > > >     d. If NATing (keeping state) is not involved then traffic need
> not go
> > > >         via the gateway chassis always, i.e traffic from OVN chassis
> to
> > > >         external network need not go via the gateway chassis.
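> > > >
> > > >     For reference, a minimal sketch of attaching the LRP to gateway
> > > >     chassis with today's CLI (port and chassis names are examples):
> > > >
> > > >          # Schedule the router port on two candidate gateway chassis;
> > > >          # the higher priority (hv-gw1) is active while it is up:
> > > >          ovn-nbctl lrp-set-gateway-chassis lrp-external hv-gw1 20
> > > >          ovn-nbctl lrp-set-gateway-chassis lrp-external hv-gw2 10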
> > > >
> > > >     PIPELINE EXECUTION:
> > > >     ~~~~~~~~~~~~~~~~~~~
> > > >     a. From endpoint on OVN chassis to endpoint on underlay.
> > > >        i. Like DVR E-W, logical_router ingress and egress pipelines
> are
> > > >           executed on source chassis.
> > > >
> > > >     b. From endpoint on underlay TO endpoint on OVN chassis.
> > > >        i. logical_router ingress and egress pipelines are executed on
> > > >           gateway chassis.
> > > >
> > > >     PACKET FLOW LS ENDPOINT to UNDERLAY ENDPOINT (Figure 5. shows
> topology):
> > > >
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     a. Packet flow in this case is exactly same as Layer 3 E-W.
> > > >
> > > >
> > > >     PACKET FLOW UNDERLAY ENDPOINT to LS ENDPOINT (Figure 5. shows
> topology and
> > > >
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     Figure 6. shows the packet flow):
> > > >     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     a. The gateway for endpoints behind the DVR will be resident only
> > > >         on the gateway chassis.
> > > >     b. Unicast packets will come to gateway-chassis, with
> destination MAC
> > > >         being RP2_MAC.
> > > >     c. From now on, it is like L3 E-W flow.
> > > >
> > > >     CODE CHANGES:
> > > >     ~~~~~~~~~~~~~
> > > >     a. ovn-northd:
> > > >          1. Changes to respond to vlan backed router port ARP from
> uplink,
> >
> > > >             only if it is on a gateway chassis.
> >
> >
> >
> > What does "uplink" mean here? If it means from the external network to
> the gateway router port, then isn't this already the current implementation?
> >
> > [ANKUR]
> > yes, uplink means external network here.
> > You are right, resolving ARP for a gateway router port on gateway
> chassis is already present.
> >
> > This is my mistake; I did not phrase it correctly.
> > What I wanted to convey:
> > Changes to respond to ARP requests for “all” the router ports only on the
> gateway chassis.
> >
> > i.e. as of now, we put the chassis_is_resident check only for a router port
> which has a gateway chassis attached.
> > However, if an ARP request comes from external network, which is for
> other vlan backed router ports, then current OVN logical flow will allow a
> response.
> >
> > This change will put this check on other router ports as well (ones
> which are patched to vlan backed logical switches).
> > This is to ensure that only one router port is eligible to communicate
> with external network (as OVN supports only one gateway router port per
> logical router as of now).
> >
> > > >          2. Changes to make sure that in the absence of NAT
> configuration,
> > > >             OVN_CHASSIS to external network traffic does not go via
> the gateway
> > > >             chassis.
> > > >
> > > >     b. ovn-controller:
> > > >          1. Send out garps, advertising the vlan backed router port's
> > > >             (which has gateway chassis attached to it) from the
> > > >             active gateway chassis.
> >
> > > >
> >
> > Maybe this is the current implementation, too?
> > [ANKUR]:
> > As of now, OVN generates GARPs for the following:
> > a. Local VIFs, if they are on a logical switch which has localnet ports.
> >
> > b. NAT IPs.
> >
> >
> >
> > i.e. a router port MAC will be advertised only if it is configured for
> SNAT as well.
> >
> > However, with vlan backed networks, SNAT is not mandatory for N-S
> traffic, hence we should be advertising router port mac regardless of SNAT
> configuration.
> >
> > >
> > > It may be because it's getting late, but I'm having trouble following
> this :)
> > >
> > > Maybe the figures would help to visualize it better?
> > >
> > > [ANKUR]:
> > > Yes, my bad on that. Please suggest how to share the PNGs with the
> community (I can add ASCII drawings but they might not render well).
> > >
> > >
> > > >
> > > > 5. LAYER 3 North South (NAT)
> > > > ----------------------------
> > > >
> > > >     SNAT, DNAT, SNAT_AND_DNAT (without external mac):
> > > >     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     a. Our proposal aligns with the following patch series which is out
> for review:
> > > >         http://patchwork.ozlabs.org/patch/952119/
> > > >
> > > >     b. However, our implementation deviates from that proposal in the
> following areas:
> > > >        i. Usage of lr_in_ip_routing:
> > > >           Our implementation sets the redirect flag after routing
> decision is taken.
> > > >           This is to ensure that a user entered static route will
> not affect the
> > > >           redirect decision (unless it is meant to).
> > > >
> > > >       ii. Using Tenant VLAN ID for "redirection":
> > > >           Our implementation uses external network router port's
> > > >           (router port that has gateway chassis attached to it) vlan
> id
> > > >           for redirection. This is because chassisredirect port is
> NOT on
> > > >           tenant network and logically packet is being forwarded to
> > > >           chassisredirect port.
> > > >
> > > >
> > > >     SNAT_AND_DNAT (with external mac):
> > > >     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > >     a. Current OVN implementation of not going via gateway chassis
> aligns with
> > > >         our design and it worked fine.
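> > > >
> > > >     For reference, a minimal sketch of the dnat_and_snat-with-external-MAC
> > > >     configuration that worked unchanged for us (router, addresses, port
> > > >     name and MAC are examples):
> > > >
> > > >          # Distributed NAT: with a logical port and external MAC given,
> > > >          # traffic egresses directly from the hosting chassis rather
> > > >          # than the gateway chassis.
> > > >          ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.16.1.10 10.0.0.5 \
> > > >              vm1-port 00:00:00:00:10:05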
> > > >
> > > >
> > > > This is just an initial proposal. We have identified more areas that
> > > > should be worked on; we will submit patches (and put forth
> > > > topics/design for discussion) as we make progress.
> > > >
> > > >
> > > > Thanks
> > > >
> > > > Regards,
> > > > Ankur
> > > >
> > > >
> > > >
> > >
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


-- 
Miguel Ángel Ajo
OSP / Networking DFG, OVN Squad Engineering

