[ovs-discuss] VXLAN support for OVN

Ihar Hrachyshka ihrachys at redhat.com
Tue Mar 10 01:22:24 UTC 2020


Good day,

At Red Hat, we hear once in a while from customers, both internal
and external, that they would like to see VXLAN support in OVN
before they consider switching to the technology. This email is a
notice that I plan to work on this feature in the coming weeks and
months, and I hope to post patches for your consideration. Below is
an attempt to explain why we may want it, how we could achieve it,
and the potential limitations. It is also an attempt to collect
early feedback on the whole idea.

The reasons behind these customer requests vary; some have more
merit, others are more about perception. One technical reason is
that an SDN / cloud deployment team doesn't always have direct
influence over the protocols allowed in the underlying network, and
it can be hard, due to politics or other reasons, to change policy
to allow Geneve traffic when VXLAN is already available. Coming from
an OpenStack background: interested customers are usually already
running the ML2-OVS implementation of Neutron, which relies on
VXLAN.

Another reason is that some potential users believe VXLAN would
bring specific benefits in their environment compared to Geneve
tunnelling. (The expected gains are largely in performance, not
functionality, because of objective limitations of the VXLAN
protocol definition.) Geneve vs. VXLAN performance is an old debate
with no clear answer, and past experiments have suggested that the
potential performance gains from VXLAN may not be as prominent, or
even present, as one might believe*. Nevertheless, the belief that
VXLAN is beneficial at least in some environments on some hardware
never dies out, and so, regardless of the merit of that belief, OVN
adoption suffers from its lack of VXLAN support.

* https://blog.russellbryant.net/2017/05/30/ovn-geneve-vs-vxlan-does-it-matter/

Our plan, then, is to satisfy such requests by introducing support
for this tunnelling type into OVN, allowing interested parties to
try it in their specific environments and see whether it makes the
expected difference.

Obviously, there is a cost to adding another protocol to the support
matrix (especially considering the limitations it would introduce,
as discussed below). We will have to weigh the complexity of the
final implementation once it is available for review.

=====

For implementation, the base problem to solve is that VXLAN does not
carry as many bits for encoding the datapath as Geneve does. (OVN's
Geneve encapsulation uses the 24-bit VNI field plus 32 more bits of
option metadata to carry the logical source and destination ports.)
The VXLAN VNI is just 24 bits long, and there are no additional
fields available for OVN to pass port information; see the rough bit
budget below. (This would be different if one considered protocol
extensions such as VXLAN-GPE, but relying on those makes both
reasons to consider VXLAN listed above somewhat moot.)
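
For reference, the budget looks roughly as follows. (The Geneve
option layout here is my recollection of what ovn-architecture(7)
documents; treat the exact field split as approximate.)

  Geneve:  24-bit VNI                  -> logical datapath
           32-bit option (0x0102/0x80) -> 1 bit reserved,
                                          15 bits logical ingress port,
                                          16 bits logical egress port
  VXLAN:   24-bit VNI                  -> must carry everything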

To satisfy OVN while working over VXLAN, the limited 24-bit VNI
space would have to be split between three components: the network
(datapath) ID and the logical source and destination port IDs. Any
split necessarily limits the maximum number of networks or of ports
per network, depending on where the cut is made.

Splitting the same 24-bit space equally between all three components
would result in limits that would probably not satisfy most real-life
deployments: we are talking about a maximum of 256 networks with a
maximum of 256 ports per network, as sketched below.
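
Purely for illustration, a hypothetical equal three-way split of the
VNI would look like this:

   23         16 15          8 7           0
  [ network ID ][ src port ID ][ dst port ID ]
   8 bits each, i.e. 2^8 = 256 values per component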

An alternative would be to stop encoding one of the components that
is currently passed through metadata. There seems to be no way to
avoid passing the destination port ID: once the packet is on the
other side of the tunnel, OVN would otherwise be unable to determine
which port to deliver the incoming packet to. (But let me know if
you have ideas!)

On the other hand, we could pass just the network ID and the logical
destination port ID, leaving the source port behind. This should
work as long as we don't match against source ports in egress ACLs.
While this puts a functional limitation on the OVN primitives
available to a CMS, it shouldn't be a problem for a large number of
setups. (Specifically, OpenStack security groups don't support
matching against source ports; I am not sure about other popular CMS
platforms.)

If this works, we are left with 24 bits to split between two
components, not three. Splitting them equally (12 bits each) gives a
maximum of 4096 networks with 4096 ports per network. As a data
point, internal Red Hat measurements suggest that these numbers
would satisfy most customers of Red Hat OpenStack Platform. A sketch
of such an encoding follows.
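
To make this concrete, here is a minimal C sketch of how a 12/12
split could be packed into and out of the VNI. All of the names here
(vxlan_vni_encode() and friends) are hypothetical, not existing OVN
code:

  #include <assert.h>
  #include <stdint.h>

  #define VNI_BITS  24
  #define DP_BITS   12  /* network (datapath) ID */
  #define PORT_BITS 12  /* logical destination port ID */
  #define DP_MAX    ((1u << DP_BITS) - 1)    /* 4095 */
  #define PORT_MAX  ((1u << PORT_BITS) - 1)  /* 4095 */

  /* Pack the network ID and destination port ID into a 24-bit VNI:
   * network ID in the high 12 bits, port ID in the low 12 bits. */
  static inline uint32_t
  vxlan_vni_encode(uint32_t dp_key, uint32_t dst_port_key)
  {
      assert(dp_key <= DP_MAX && dst_port_key <= PORT_MAX);
      return (dp_key << PORT_BITS) | dst_port_key;
  }

  /* Unpack the VNI on the receiving side of the tunnel. */
  static inline void
  vxlan_vni_decode(uint32_t vni, uint32_t *dp_key,
                   uint32_t *dst_port_key)
  {
      *dp_key = (vni >> PORT_BITS) & DP_MAX;
      *dst_port_key = vni & PORT_MAX;
  }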

If such a split does not satisfy some requirements, we may consider
alternative splits, as well as allowing the numbers to be customized
for a particular environment as needed (while obviously trying to
pick the sanest values for the default behavior); a configurable
variant is sketched below.
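
Building on the sketch above, making the split configurable would be
a small change; again, the knob and its default are hypothetical:

  #include <stdint.h>

  #define VNI_BITS 24

  /* Hypothetical knob: the number of VNI bits reserved for the
   * network ID, e.g. read from a northbound option at startup; the
   * remaining bits go to the logical destination port. */
  static unsigned int dp_bits = 12;

  static inline uint32_t
  vxlan_vni_encode_cfg(uint32_t dp_key, uint32_t dst_port_key)
  {
      unsigned int port_bits = VNI_BITS - dp_bits;
      uint32_t port_mask = (1u << port_bits) - 1;

      return (dp_key << port_bits) | (dst_port_key & port_mask);
  }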

=====

Let me know if there are holes in the reasoning above, both at the
high level and around the implementation options. Perhaps you even
have better ideas about how to implement it.

Thanks,
Ihar


