[ovs-dev] VLAN tenant network patches

Ankur Sharma ankur.sharma at nutanix.com
Thu Nov 15 10:27:34 UTC 2018


Hi Miguel,
Thanks a lot for the clarification, sounds good to me.

Hi Guru,
Yes, the implementation will be generic (not specific to a particular use case) and should be seen as an enhancement.

Regards,
Ankur

________________________________
From: Miguel Angel Ajo Pelayo <majopela at redhat.com>
Sent: Thursday, November 15, 2018 1:06:34 AM
To: Gurucharan Shetty; Numan Siddique
Cc: Ankur Sharma; ovs dev
Subject: Re: [ovs-dev] VLAN tenant network patches

Thanks for looking at this and keeping it moving forward.

I'm also fine with both ways of implementing the feature, and of course having distributed
E/W for VLAN is great. Provided that the two implementations don't duplicate interfaces or
interfere with each other, and that Numan's patches are reasonably small, it'd be great
to have Numan's implementation merged first, and then Ankur's when it's ready. Let me explain why:

1) For OpenStack with OVN, VLAN support is blocked at the moment because of the
     MTU mismatch issue, so it'd be great to have that available as soon as possible.

2)  Numan's implementation would also allow the gateway chassis handling the
     VLAN E/W traffic to be used by SR-IOV[1] instances on an OVN managed network
     (something common in the openstack world).

     But this is a secondary bonus; 1) is the important part from my POV.

[1] SR-IOV is a network card technology that partitions a physical network card into several virtual ones, and then
     those partitions are memory mapped into VMs, along with an IRQ. The network card has a
     simple switch inside for the partitioned network cards, and can assign a VLAN to each of them.


Thank you very much Gurucharan Shetty <guru at ovn.org>, Ankur Sharma <ankur.sharma at nutanix.com> and Numan Siddique <nusiddiq at redhat.com>.

On Wed, Nov 14, 2018 at 6:04 PM Guru Shetty <guru at ovn.org> wrote:
Okay. I want to make sure that we don't end up with 3 gateway
implementations. We currently have 2. So when you are adding this new
feature, please do NOT make it an independent feature which only works for
your use case. It will be a maintenance headache otherwise. Instead, it has
to be looked at as enhancements for existing OVN features. I hope that is
clear.

On Wed, 14 Nov 2018 at 05:18, Ankur Sharma <ankur.sharma at nutanix.com> wrote:

> Hi Guru,
>
> Sure, providing more explanation.
>
> Q. What are we trying to solve?
> Ans. Getting distributed routing to work for vlan backed networks through
> OVN.
>
> Q. Where does OVN fall short for the above task?
>
> Ans. OVN has gaps in how it forwards packets "correctly/efficiently"
> in the absence of encapsulation (VXLAN, STT or
> GENEVE).
>
> Following are the known gaps:
>
> L3 E-W
>
> ======
> a. Because a router port is distributed, in the absence of encapsulation
> we should not use the router port mac as the source mac. Our proposal is to
> replace the router port mac with a chassis-specific unique mac when an
> unencapsulated packet originating from a router port goes on the wire.
> This was explained in the following email:
> https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353179.html
>
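A minimal sketch of the proposed source mac rewrite, for illustration only. The Frame type, the function name and the example mac addresses are hypothetical and are not taken from the OVN code base; the actual change would be expressed in OVN's flow pipeline, not in Python.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    vlan: int


def rewrite_for_wire(frame, router_port_macs, chassis_mac, encapsulated):
    """Return the frame as it would leave the chassis.

    If the frame is not encapsulated and its source mac belongs to a
    distributed router port, substitute the chassis-specific mac.
    Otherwise the frame is returned unchanged.
    """
    if not encapsulated and frame.src_mac in router_port_macs:
        return replace(frame, src_mac=chassis_mac)
    return frame


# Hypothetical addresses, chosen only for the example.
ROUTER_PORT_MACS = {"00:00:00:00:ff:01"}
CHASSIS_A_MAC = "aa:aa:aa:00:00:01"

routed = Frame(src_mac="00:00:00:00:ff:01", dst_mac="00:00:00:00:00:0b", vlan=20)
on_wire = rewrite_for_wire(routed, ROUTER_PORT_MACS, CHASSIS_A_MAC, encapsulated=False)
tunneled = rewrite_for_wire(routed, ROUTER_PORT_MACS, CHASSIS_A_MAC, encapsulated=True)
```

With these inputs, on_wire carries the chassis mac as its source while tunneled keeps the router port mac, since the rewrite only applies to unencapsulated traffic.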
>
> b. Sending ARP reply on wire.
>
> As of now, OVN consumes ARP replies from a VM that are destined to a router
> port (because the router port is present locally on the VM's chassis as well).
> Because the ARP reply is NOT seen on the wire, a physical switch will
> never learn the VM's mac (unless the VM is involved in L2 communication as well).
>
> As a result, DVR routed traffic will always be flooded by the TOR (top of
> rack switch), because the dest mac is that of the VM, which the TOR never learnt.
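The flooding behaviour described above can be reproduced with a toy model of a mac-learning switch. This is a simplified sketch, not the behaviour of any particular TOR; the class, port numbers and mac labels are made up for illustration.

```python
class LearningSwitch:
    """Toy L2 switch: learns source macs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # mac -> port on which it was last seen as source

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the sorted list of output ports."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress one.
            return sorted(self.ports - {in_port})
        return [out_port]


tor = LearningSwitch(ports=[1, 2, 3])

# Hyp-A (port 1) sends routed traffic to VM-B, whose mac the TOR never saw
# because its ARP reply was consumed locally on its own chassis.
before = tor.receive(in_port=1, src_mac="chassis-a", dst_mac="vm-b")

# Once a frame sourced from VM-B (e.g. its ARP reply) reaches the wire on port 2...
tor.receive(in_port=2, src_mac="vm-b", dst_mac="router-port")

# ...the same routed traffic is forwarded to a single port.
after = tor.receive(in_port=1, src_mac="chassis-a", dst_mac="vm-b")
```

Here before is the flood list [2, 3] and after is [2], which is the difference that sending the ARP reply on the wire is meant to achieve.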
>
>
> L3 N-S
>
> ======
>
> a. For vlan backed networks, NATing is NOT a must to talk to "outside"
> physical network (for overlay it is). Hence, OVN requires some changes in
> this area as well.
>
> b. DO NOT respond to ARP request for any ROUTER PORT from uplink, unless
> it is on gateway chassis.
>
> c. When gateway chassis failover happens, then advertise router port mac
> as well.
>
>
>
> L3 N-S NAT
>
> =========
>
> a. Current OVN implementation uses geneve encap (geneve options) to
> provide metadata to the gateway chassis (where SNAT happens).
>
> b. In the absence of encapsulation, OVN should be enhanced to still
> support NAT on gateway chassis.
>
>
> ===========================================================
>
>
> Our initial proposal has details as well:
>
> https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353066.html
>
>
> Like I mentioned, the problem statement we are trying to solve is "Distributed
> Virtual Routing For VLAN Backed Networks".
> As part of the above, we have identified some gaps, which we intend to fix.
>
>
> As we progress further, we will have to add some features as well.
> But, as of now we are focused on getting basic functionality to work
> correctly first.
>
>
>
> Please feel free to put forth any more queries/concerns you have; I will be
> happy to explain.
>
> Thanks again for the review.
>
> Regards,
> Ankur
>
>
> ------------------------------
> *From:* Guru Shetty <guru at ovn.org>
> *Sent:* Monday, November 12, 2018 9:58:07 AM
> *To:* Ankur Sharma
> *Cc:* ovs dev; Numan Siddique; Ben Pfaff
> *Subject:* Re: VLAN tenant network patches
>
>
>
> On Sun, 11 Nov 2018 at 21:02, Ankur Sharma <ankur.sharma at nutanix.com>
> wrote:
>
> Hi Guru,
>
> Thanks for spending time in understanding the proposal and drafting your
> understanding as well.
> Thanks Numan for pitching in.
>
> Some comments (trying to keep them as brief as possible).
>
>
> a. On a high level, we are trying to do following:
>     "Distributed router functionality for vlan backed networks"
>
>
> I guess, there is a big disconnect then. OVN currently does "distributed
> router for VLAN backed networks". Do you disagree? If so, please explain.
>
>
>
> b. This would include changes/analysis for E-W traffic and N-S traffic.
>
>
> c. Some of the changes are specific to the characteristics of a distributed
> router and some are specific to OVN way of doing things.
>
>
> d. The points we have discussed thus far are a subset of the changes,
>
>     i.e. a vlan backed DVR (or logical router) would be more than just
> replacing the router port mac with a chassis mac.
>
>
> e. Numan's changes DO NOT conflict/overlap with what we have proposed so
> far and hence should be discussed/reviewed independently.
>
>     His changes are solving a very specific problem.
>
>     His changes are to "mimic" a centralized router in a distributed
> router.
>
>     i.e. to execute the router pipeline on a centralized chassis, while the
> router is still distributed.
>     I have provided my feedback here:
>     https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/353701.html
>
>
>
> Providing some more comments inline.
>
> Thanks again Guru, Numan, Mark and Han for spending time on the proposal
> and providing feedback.
> I am preparing a v2, which will have changes till E-W.
>
>
>
> Regards,
> Ankur
>
>
>
>
>
>
>
>
> ------------------------------
> *From:* Guru Shetty <guru at ovn.org>
> *Sent:* Friday, November 9, 2018 11:45 AM
> *To:* ovs dev
> *Cc:* Ankur Sharma; Numan Siddique; Ben Pfaff
> *Subject:* VLAN tenant network patches
>
> I have tried to summarize the problem statement that Numan and Ankur are
> trying to solve here based on my understanding so far. Please correct me
> and I will revise it along.
>
> Current feature set in OVN.
> ==========================
>
> A logical switch should only have one localnet logical port. If a logical
> switch has a logical port of type "localnet",then all traffic for that
> logical switch avoids overlays. So in essence, this is only useful when all
> the hypervisors are in the same broadcast domain.  Currently there are no
> known problems as long as logical switches are not connected to any logical
> routers.
>
>
> When 2 logical switches (each with a localnet port) are connected to a
> logical router, we still push all east-west traffic to the underlay. The
> source hypervisor executes the pipeline of all 3 logical datapaths and then
> pushes the traffic to the underlay via the localnet port (with its
> corresponding vlan tag) of the third logical datapath (the destination
> logical switch).
>
> The above topology creates a problem for the underlying hardware switch.
> Because now it can see the same mac address of the distributed router
> coming from 2 different hypervisors as a source mac address of the packet
> on wire. According to Ankur, there are physical switches which can detect
> a source mac address coming from different ports and limit it. But this looks
> like it is configurable in physical switches.
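To make the "same source mac from different ports" symptom concrete, here is a small sketch that counts how often a source mac moves between switch ports. It is only an illustration of what a physical switch might track; the function name, port numbers and mac labels are hypothetical, and real switches implement (and rate-limit) this in vendor-specific ways.

```python
from collections import defaultdict


def count_mac_moves(observations):
    """Count, per source mac, how many times it changed ingress port.

    observations is an iterable of (port, src_mac) pairs in arrival order.
    Macs that never moved are omitted from the result.
    """
    last_port = {}
    moves = defaultdict(int)
    for port, src_mac in observations:
        previous = last_port.get(src_mac)
        if previous is not None and previous != port:
            moves[src_mac] += 1
        last_port[src_mac] = port
    return dict(moves)


# Two hypervisors (ports 1 and 2) both send routed traffic with the
# distributed router port mac as source: the mac appears to flap.
shared_router_mac = count_mac_moves(
    [(1, "router-mac"), (2, "router-mac"), (1, "router-mac")]
)

# With a chassis-specific source mac per hypervisor, no mac ever moves.
per_chassis_mac = count_mac_moves(
    [(1, "chassis-a-mac"), (2, "chassis-b-mac"), (1, "chassis-a-mac")]
)
```

In this example shared_router_mac records two moves for "router-mac", while per_chassis_mac is empty.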
>
>
> For N/S traffic, currently traffic is punted to the gateway chassis via an
> overlay tunnel. There is a use case where you want to avoid overlay
> tunnels. This is because for a "localnet" topology you can keep the MTU
> of the inner VM the same as the underlay MTU. But when you have overlays just
> for one class of traffic, this becomes a problem.
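A rough worked example of the MTU concern. The header sizes below are assumptions (an IPv4 outer header and Geneve carrying 8 bytes of options); they are parameters so they can be adjusted for the actual deployment, and the function name is made up for this sketch.

```python
def inner_mtu(underlay_mtu, outer_ip=20, udp=8, geneve_base=8,
              geneve_options=8, inner_ethernet=14):
    """Largest inner IP packet that fits in one underlay IP packet.

    All header sizes are in bytes and are assumptions of this sketch:
    IPv4 outer header, UDP, Geneve base header, Geneve options, and the
    encapsulated inner Ethernet header.
    """
    overhead = outer_ip + udp + geneve_base + geneve_options + inner_ethernet
    return underlay_mtu - overhead


# VLAN/localnet path: no tunnel headers, so the VM can use the underlay MTU.
vlan_path_mtu = 1500

# Tunneled N/S path under the assumptions above: 1500 - 58 = 1442.
tunnel_path_mtu = inner_mtu(1500)
```

A VM configured with the underlay MTU for its E/W traffic would therefore send packets that are too large for the tunneled path to the gateway chassis, which is the mismatch being discussed.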
>
> So both Ankur's and Numan's patches try to tackle the above 2 problems.
>
> To re-summarize
> Problem 1: External switch getting confused about the machine on which OVN
> router mac address resides. But this is only source mac address coming from
> different hypervisors (not destination mac).
>
> [ANKUR]:
> We are trying to do more than just replacing a router port mac with a
> chassis mac. We are trying to get a distributed routing functionality
> working via OVN for vlan backed networks.
> Not using the router port mac, is one of the first problems that has to be
> solved.
> For a production deployment, we might need some more changes/analysis.
>
>
> Problem 2: When packet has destination IP address outside OVN router known
> subnets, it is being currently sent via overlay tunnel. This would need MTU
> configuration for inner VMs.
>
> Numan's patch:
> =============
>
> Numan's patches try to solve the above 2 problems by doing the following.
> 1. When VM-A (on Hyp-A) in switch-A tries to talk to VM-B in switch-B
> (Hyp-B) (switch-A and switch-B are connected with router), Hyp-A will
> execute switch-A pipeline and push the traffic out of localnet port with
> router's mac address as destination.
> 2. Router chassis will receive the packet, execute switch-A pipeline
> again, router pipeline and then switch-B pipeline and push packet out of
> switch-B's localnet port.  Now Hyp-B receives the traffic, executes
> switch-B pipeline again and packet gets delivered.
>
> The result is that all east west traffic is centralized and has an extra
> hop.
>
> [ANKUR]:
> Yes, Numan's approach is to mimic a centralized router, while the vlan
> backed logical switch is still connected to a distributed logical router
> (i.e connecting ports are of type "patch").
>
>
>
> Ankur's proposal:
> ==============
>
> Though the complete patches do not exist, Ankur wants to solve problem
> 1 by having a chassis specific MAC. So when packet leaves a hypervisor for
> east-west routing, it uses a unique mac. The disadvantage with this
> proposal is that the VM (i.e logical port) will see the mac of its first
> hop router change continuously which may have some yet to be clearly
> defined side-effects (leads to more ARP requests from the VM).
>
> [ANKUR]:
> Just want to clarify that a tcp/ip stack would NEVER populate its ARP
> cache based on IP packets. It would rely on ARP (/GARP) to resolve the gateway
> mac. ARP queries for the router port (gateway) ip will always be answered by
> OVN with the router port mac only.
> i.e. using the chassis mac as source mac WOULD NOT impact any functionality of
> a VM's networking stack. However, it could still be desirable NOT to
> show the chassis mac to a VM.  We intend to solve it as well, but our first
> implementation does not look clean/scalable. We will submit it for review
> anyways, but not in the first series.
>
> Problem 2 is solved similar to what Numan has in patches, although there
> are small changes in implementation. It is not clear whether one code is
> more complicated than other. But it looks like Ankur’s patches will avoid
> the extra hop for east-west traffic.
>
> Numan is perfectly fine with Ankur’s patches (after they are sent, reviewed
> and tested) if they satisfy his problem statements. But he does prefer his
> patches reviewed and merged if there is a delay in Ankur's patches (and
> possibly reverted later, if there is an alternative).
>
> [ANKUR] My patches and Numan's are not related to each other and should
> not be seen as "either or".
> Numan's patch is trying to solve a very specific case.
> It should be reviewed independently and should not be blocked because of
> my changes.
> Management plane / data center architecture would drive which approach
> to take.
> As a platform, OVN should support both.
>
>
>
>
>
>
_______________________________________________
dev mailing list
dev at openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


--
Miguel Ángel Ajo
OSP / Networking DFG, OVN Squad Engineering

