[ovs-discuss] OVN: availability zones concept

Lucas Alvares Gomes lucasagomes at gmail.com
Fri Mar 22 11:22:50 UTC 2019


Hi,

Getting back to this topic because we do have people (read: clients)
interested in this work for their Telco/Edge computing use cases, as a
way to prevent each edge site from trying to create tunnels with every
node on every site.

Another request we have seen that this feature would help with is
security-related: people want to create "trust zones" and prevent
computes in a more secure zone from communicating with a less secure
zone.

As Dan Sneddon pointed out in this thread, it's possible to use
firewalls to work around these problems, but that approach is not
ideal (and may have negative consequences): each Chassis in OVN will
still try again and again to form the tunnel mesh with every other
Chassis, and those attempts will fail or time out over and over.

Regarding how to implement this, I see we have a few ideas floating
around:

1. Explicitly setting the transport zone(s) in the Open_vSwitch table
for the Chassis.

2. Dynamically creating tunnels between Chassis that are logically
connected with each other.

3. Creating the tunnels on demand when the first packet is going
to/from a Chassis (although Ben has stated that he is not pushing for
this).
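
To make option 1 a bit more concrete, here is a rough sketch of the
semantics I have in mind: two Chassis form a tunnel only if they share
at least one transport zone. This is only an illustration, not OVN
code; the function name, the chassis names and the zone names are all
made up, and treating a Chassis with no zone as sharing nothing with
anyone is just an assumption of this sketch.

```python
from itertools import combinations


def tunnel_pairs(chassis_zones):
    """Return the chassis pairs that share at least one transport zone.

    chassis_zones maps a (hypothetical) chassis name to the set of
    transport zones explicitly configured for it.
    """
    pairs = set()
    # Walk every unordered pair once, in name order, so each pair is
    # reported as (smaller_name, larger_name).
    for a, b in combinations(sorted(chassis_zones), 2):
        if chassis_zones[a] & chassis_zones[b]:
            pairs.add((a, b))
    return pairs


# Hypothetical deployment: two edge sites plus one gateway in both zones.
chassis_zones = {
    "edge1-hv1": {"edge1"},
    "edge1-hv2": {"edge1"},
    "edge2-hv1": {"edge2"},
    "gw1": {"edge1", "edge2"},
}

if __name__ == "__main__":
    for a, b in sorted(tunnel_pairs(chassis_zones)):
        print(f"tunnel: {a} <-> {b}")
```

With these made-up inputs, the hypervisors in edge1 and edge2 never
try to reach each other, while gw1 overlaps both zones and so keeps a
tunnel to every hypervisor. The result is fully determined by the
configured zones, which is what makes this option easy to reason about.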

So, while every approach has its pros and cons, personally (and also
after talking to some work colleagues), I think that 1. is probably
the best and most straightforward way to go: because it's explicit,
it is also the easiest to troubleshoot and monitor what's going on
with the network. Approaches 2. and 3., while very interesting from an
engineering point of view, might not be so appealing to folks
responsible for monitoring the network, because they leave the compute
nodes in some "unknown" state (is there a tunnel between Chassis A and
B? no/yes/not yet?).

If there's no objection from people in this thread (especially Han and
Ben), I would be more than happy to help implement this feature.

Cheers,
Lucas

On Thu, Mar 7, 2019 at 6:54 PM Ben Pfaff <blp at ovn.org> wrote:
>
> On Wed, Mar 06, 2019 at 10:32:29PM -0800, Han Zhou wrote:
> > On Wed, Mar 6, 2019 at 9:06 AM Ben Pfaff <blp at ovn.org> wrote:
> > >
> > > On Tue, Mar 05, 2019 at 09:39:37PM -0800, Han Zhou wrote:
> > > > On Tue, Mar 5, 2019 at 7:24 PM Ben Pfaff <blp at ovn.org> wrote:
> > > > > What's the effective difference between an OVN deployment with 3 zones,
> > > > > and a collection of 3 OVN deployments?  Is it simply that the 3-zone
> > > > > deployment shares databases?  Is that a significant advantage?
> > > >
> > > > Hi Ben, based on the discussions there are two cases:
> > > >
> > > > For completely separated zones (no overlapping) vs. separate OVN
> > > > deployments, the difference is that separate OVN deployments require
> > > > some sort of federation at a higher layer, so that a single CMS can
> > > > operate multiple OVN deployments. Of course separate zones in the same
> > > > OVN still require changes in the CMS to operate, but the change may be
> > > > smaller in some cases.
> > > >
> > > > For overlapping zones vs. separate OVN deployments, the difference is
> > > > more obvious. Separate OVN deployments don't allow overlapping.
> > > > Overlapping zones allow sharing gateways between different groups of
> > > > hypervisors.
> > >
> > > OK.  The difference is obvious in the case where there is overlap.
> > >
> > > > If the purpose is only reducing tunnel mesh size, I think it may be
> > > > better to avoid the zone concept but instead create tunnels (and bfd
> > > > sessions) on-demand, as discussed here:
> > > > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-March/048281.html
> > >
> > > Except in cases where we have BFD sessions, it is possible to entirely
> > > avoid having explicitly defined tunnels, since the tunnels can be
> > > defined in the flow table.  The ovs-fields(7) manpage describes these
> > > under "flow-based tunnels" in the TUNNEL FIELDS section.  Naively, doing
> > > it this way would require, on each hypervisor, a few OpenFlow flows per
> > > remote chassis, as opposed to one port per remote chassis.  That
> > > probably scales better.  If necessary, it could be made to scale better
> > > than that by using send-to-controller actions to add flows for tunnels
> > > as packets arrive for them or as packets need to go through them.
> >
> > Thanks Ben for the pointer. I have to admit I was not aware of these
> > different ways of using tunnels. The documentation is very clear, and
> > now I understand what OVN currently uses is "Intermediate models",
> > i.e. partially flow-based - remote-ips are port based while keys are
> > flow based.
>
> Thanks for the documentation fixes!
>
> > While a purely flow-based tunnel is attractive in terms of flexibility,
> > it does not seem to fit the OVN use case very well, because we do need
> > BFD sessions.
>
> I think that OVN only uses BFD for a few of its ports--only for gateways
> with HA, right?  Those could continue to have ports.
>
> > For the "send-to-controller", i.e. reactively setting up flows when
> > packets arrive, I hope it is not really needed for solving the tunnel
> > scaling problem, since it introduces data plane latency which could be
> > a bigger problem. (But I am not sure if reactive mode in general is a
> > good idea - it might be a reasonable trade-off for solving the scale
> > problem of each HV pre-installing flows for all related datapaths in a
> > full-mesh-like scenario. Anyway, not directly related to the current
> > topic).
>
> It would introduce data plane latency for the first packet to go to or
> from a particular hypervisor.  After that there would be no further
> additional latency.  It would probably not be noticeable.
>
> Let me be clear that I am not pushing this solution.  It will complicate
> things, and I do not like unnecessary complication.  I am just pointing
> out that it is possible.
>
> > So I would propose to keep the current partially flow-based tunnel
> > usage in OVN and optimize the tunnel setup only between peers that are
> > logically connected, if this satisfies the scaling goal of OVN users.
> > Even with this optimization, we may need to make it a configurable
> > option, since in small scale use cases users may in practice prefer
> > the original behavior to avoid the latency of tunnel setup.
>
> OK.