[ovs-dev] VLAN tenant network patches

Ankur Sharma ankur.sharma at nutanix.com
Mon Nov 12 05:02:10 UTC 2018


Hi Guru,

Thanks for taking the time to understand the proposal and for writing up your summary.
Thanks Numan for pitching in.

Some comments (trying to keep them as brief as possible).

a. At a high level, we are trying to do the following:
    "Distributed router functionality for vlan backed networks"


b. This would include changes/analysis for E-W traffic and N-S traffic.


c. Some of the changes are specific to the characteristics of a distributed router and some are specific to the OVN way of doing things.


d. The points we have discussed thus far are a subset of the changes,

    i.e. a vlan backed DVR (or logical router) would be more than just replacing the router port mac with a chassis mac.


e. Numan's changes DO NOT conflict/overlap with what we have proposed so far and hence should be discussed/reviewed independently.

    His changes are solving a very specific problem.

    His changes are to "mimic" a centralized router in a distributed router.

    i.e. to execute the router pipeline on a centralized chassis, while the router is still distributed.
    I have provided my feedback here:
    https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/353701.html



Providing some more comments inline.

Thanks again Guru, Numan, Mark and Han for spending time on the proposal and providing feedback.
I am preparing a v2, which will have the changes up to E-W.



Regards,
Ankur








________________________________
From: Guru Shetty <guru at ovn.org>
Sent: Friday, November 9, 2018 11:45 AM
To: ovs dev
Cc: Ankur Sharma; Numan Siddique; Ben Pfaff
Subject: VLAN tenant network patches

I have tried to summarize the problem statement that Numan and Ankur are trying to solve here based on my understanding so far. Please correct me and I will revise it along.

Current feature set in OVN.
==========================

A logical switch should only have one localnet logical port. If a logical switch has a logical port of type "localnet", then all traffic for that logical switch avoids overlays. So in essence, this is only useful when all the hypervisors are in the same broadcast domain.  Currently there are no known problems as long as logical switches are not connected to any logical routers.


When 2 logical switches (each with a localnet port) are connected to a logical router, we still push all east-west traffic to the underlay. The source hypervisor executes the pipeline of all 3 logical datapaths and then pushes the traffic to the underlay via the localnet port (with its corresponding vlan tag) of the third logical datapath, i.e. the destination logical switch.

The above topology creates a problem for the underlying hardware switch, because it can now see the same mac address of the distributed router coming from 2 different hypervisors as the source mac address of packets on the wire. According to Ankur, there are physical switches which can detect a source mac address coming from different ports and limit it. But this looks like it is configurable in physical switches.
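
As a rough illustration of the effect described above, here is a toy model of a learning L2 switch that sees the same source mac on two ports. This is only a sketch: the class, the mac address and the port numbers are made-up assumptions, and it does not model any particular vendor's mac move detection or rate limiting.

```python
# Toy model of a learning L2 switch. All names and values are
# illustrative assumptions, not taken from OVN or any real switch.
class LearningSwitch:
    def __init__(self):
        self.mac_table = {}  # source mac -> physical port it was last seen on
        self.moves = {}      # source mac -> number of times it changed port

    def learn(self, src_mac, in_port):
        previous = self.mac_table.get(src_mac)
        if previous is not None and previous != in_port:
            # The same source mac showed up on a different port: a "mac move".
            self.moves[src_mac] = self.moves.get(src_mac, 0) + 1
        self.mac_table[src_mac] = in_port


ROUTER_MAC = "00:00:00:00:ff:01"  # hypothetical router port mac

switch = LearningSwitch()
# Hyp-A (port 1) and Hyp-B (port 2) alternately send routed traffic,
# both using the router port mac as the source mac.
for in_port in [1, 2, 1, 2]:
    switch.learn(ROUTER_MAC, in_port)

print(switch.moves[ROUTER_MAC], switch.mac_table[ROUTER_MAC])
```

Every alternation between the two hypervisors registers as a move, which is the behaviour some physical switches detect and limit.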


For N/S traffic, currently traffic is punted to the gateway chassis via an overlay tunnel. There is a use case where you want to avoid overlay tunnels. This is because for a "localnet" topology you can keep the MTU of the inner VM the same as the underlay MTU. But when you have overlays just for one class of traffic, this becomes a problem.
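
To make the MTU point concrete, here is a small arithmetic sketch. The header sizes assume a Geneve tunnel over IPv4, and the 8 bytes of Geneve options is an assumption that depends on the deployment; the function name is made up for illustration.

```python
# Assumed header sizes in bytes for a Geneve tunnel over IPv4.
OUTER_IPV4 = 20
OUTER_UDP = 8
GENEVE_BASE = 8
GENEVE_OPTIONS = 8    # assumption: option bytes vary by deployment
INNER_ETHERNET = 14   # the inner frame's Ethernet header is tunnel payload


def inner_mtu(underlay_mtu, tunneled):
    """Largest inner IP packet that fits without fragmentation."""
    if not tunneled:
        # localnet: the VM's packet goes on the wire with only a vlan tag,
        # so the VM can use the underlay MTU unchanged.
        return underlay_mtu
    overhead = (OUTER_IPV4 + OUTER_UDP + GENEVE_BASE
                + GENEVE_OPTIONS + INNER_ETHERNET)
    return underlay_mtu - overhead


print(inner_mtu(1500, tunneled=False))
print(inner_mtu(1500, tunneled=True))
```

With these assumed sizes, a VM whose E-W traffic uses localnet could run with the full underlay MTU, while the same VM's N/S traffic through a tunnel would need a smaller MTU, which is the mismatch described above.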

So both Ankur's and Numan's patches try to tackle the above 2 problems.

To re-summarize
Problem 1: The external switch gets confused about the machine on which the OVN router mac address resides. But this concerns only the source mac address coming from different hypervisors (not the destination mac).

[ANKUR]:
We are trying to do more than just replacing a router port mac with a chassis mac. We are trying to get a distributed routing functionality working via OVN for vlan backed networks.
Not using the router port mac is one of the first problems that has to be solved.
For a production deployment, we might need some more changes/analysis.


Problem 2: When packet has destination IP address outside OVN router known subnets, it is being currently sent via overlay tunnel. This would need MTU configuration for inner VMs.

Numan's patch:
==============

Numan's patches try to solve the above 2 problems by doing the following.
1. When VM-A (on Hyp-A) in switch-A tries to talk to VM-B in switch-B (Hyp-B) (switch-A and switch-B are connected with a router), Hyp-A will execute the switch-A pipeline and push the traffic out of the localnet port with the router's mac address as destination.
2. The router chassis will receive the packet, execute the switch-A pipeline again, then the router pipeline and the switch-B pipeline, and push the packet out of switch-B's localnet port.  Now Hyp-B receives the traffic, executes the switch-B pipeline again and the packet gets delivered.

The result is that all east-west traffic is centralized and has an extra hop.
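
The two steps above can be sketched as a list of (chassis, pipelines executed) pairs and compared with the current distributed behaviour. This is only an illustrative model: the function names and the "router-chassis" label are made up, and the assumption that Hyp-B re-executes the switch-B pipeline in the distributed case is by analogy with step 2.

```python
def centralized_path(src_hyp, dst_hyp, router_chassis):
    """Steps 1 and 2 above: routing happens on a single router chassis."""
    return [
        (src_hyp, ["switch-A"]),
        (router_chassis, ["switch-A", "router", "switch-B"]),
        (dst_hyp, ["switch-B"]),
    ]


def distributed_path(src_hyp, dst_hyp):
    """Current behaviour: the source hypervisor runs all 3 datapaths."""
    return [
        (src_hyp, ["switch-A", "router", "switch-B"]),
        (dst_hyp, ["switch-B"]),  # assumption: same as the last part of step 2
    ]


def underlay_traversals(path):
    """Number of times the packet crosses the physical network."""
    return len(path) - 1


central = centralized_path("Hyp-A", "Hyp-B", "router-chassis")
distributed = distributed_path("Hyp-A", "Hyp-B")
print(underlay_traversals(central), underlay_traversals(distributed))
```

In this model the centralized path crosses the underlay twice and the distributed path once, which is the extra hop mentioned above.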

[ANKUR]:
Yes, Numan's approach is to mimic a centralized router, while the vlan backed logical switch is still connected to a distributed logical router (i.e connecting ports are of type "patch").



Ankur's proposal:
==============

Though the complete patches do not exist, Ankur wants to solve problem 1 by having a chassis specific MAC. So when a packet leaves a hypervisor for east-west routing, it uses a unique mac. The disadvantage with this proposal is that the VM (i.e. logical port) will see the mac of its first hop router change continuously, which may have some yet to be clearly defined side-effects (e.g. more ARP requests from the VM).

[ANKUR]:
Just want to clarify that a tcp/ip stack would NEVER populate its ARP cache based on IP packets. It would rely on ARP (/GARP) to resolve the gateway mac, and ARP queries for the router port (gateway) ip will always be answered by OVN with the router port mac only.
i.e. using the chassis mac as source mac WOULD NOT impact any functionality of a VM's networking stack. However, it could still be desirable NOT to show the chassis mac to a VM.  We intend to solve that as well, but our first implementation does not look clean/scalable. We will submit it for review anyway, but not in the first series.
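
A minimal sketch of the argument above, assuming a simplified VM ARP cache that learns only from ARP replies. All addresses, names and functions here are hypothetical and are not taken from the patches.

```python
ROUTER_PORT_IP = "10.0.0.1"            # hypothetical gateway ip
ROUTER_PORT_MAC = "00:00:00:00:ff:01"  # hypothetical router port mac
CHASSIS_MAC = {                        # hypothetical per-chassis macs
    "hyp-a": "aa:aa:aa:aa:aa:01",
    "hyp-b": "aa:aa:aa:aa:aa:02",
}


def rewrite_source_on_egress(frame, chassis):
    """Swap the router port mac for the chassis mac on routed frames."""
    if frame["src_mac"] == ROUTER_PORT_MAC:
        return {**frame, "src_mac": CHASSIS_MAC[chassis]}
    return frame


def arp_reply_mac(target_ip):
    """ARP for the gateway ip is always answered with the router port mac."""
    return ROUTER_PORT_MAC if target_ip == ROUTER_PORT_IP else None


class VmArpCache:
    """Simplified VM stack: the cache is updated by ARP replies only."""

    def __init__(self):
        self.entries = {}

    def on_arp_reply(self, ip, mac):
        self.entries[ip] = mac

    def on_ip_packet(self, frame):
        pass  # IP packets never touch the ARP cache


vm = VmArpCache()
vm.on_arp_reply(ROUTER_PORT_IP, arp_reply_mac(ROUTER_PORT_IP))

# Routed frames arrive from two different chassis with different source macs.
routed = {"src_mac": ROUTER_PORT_MAC, "src_ip": "10.0.1.5"}
for chassis in ("hyp-a", "hyp-b"):
    vm.on_ip_packet(rewrite_source_on_egress(routed, chassis))

print(vm.entries[ROUTER_PORT_IP])
```

In this model the VM's gateway entry stays at the router port mac no matter which chassis mac appears as the source of routed IP packets.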

Problem 2 is solved similarly to what Numan has in his patches, although there are small changes in implementation. It is not clear whether one implementation is more complicated than the other. But it looks like Ankur's patches will avoid the extra hop for east-west traffic.

Numan is perfectly fine with Ankur's patches (after they are sent, reviewed and tested) if they satisfy his problem statements. But he does prefer that his patches be reviewed and merged if there is a delay in Ankur's patches (and possibly reverted later, if there is an alternative).

[ANKUR] Numan's patches and mine are not related to each other and should not be seen as "either or".
Numan's patch is trying to solve a very specific case.
It should be reviewed independently and should not be blocked because of my changes.
Management plane / data center architecture would drive which approach to take.
As a platform, OVN should support both.






