[ovs-discuss] OVN: availability zones concept

Han Zhou zhouhan at gmail.com
Tue Mar 5 02:03:52 UTC 2019


On Mon, Mar 4, 2019 at 4:48 PM Dan Sneddon <dsneddon at redhat.com> wrote:
>
>
>
> On Mon, Mar 4, 2019 at 5:34 AM Lucas Alvares Gomes <lucasagomes at gmail.com> wrote:
>>
>> Hi,
>>
>> On Sat, Mar 2, 2019 at 1:52 AM Han Zhou <zhouhan at gmail.com> wrote:
>> >
>> > On Thu, Feb 28, 2019 at 9:58 AM Daniel Alvarez Sanchez
>> > <dalvarez at redhat.com> wrote:
>> > >
>> > > Hi folks,
>> > >
>> > > Just wanted to throw an idea here about introducing an availability
>> > > zones (AZ) concept in OVN and get implementation ideas. From a CMS
>> > > perspective, it makes sense to be able to implement some sort of
>> > > logical division of resources into failure domains to maximize their
>> > > availability.
>> > >
>> > > In this sense, establishing a full mesh of Geneve tunnels is not
>> > > needed (and possibly undesired when strict firewalls are used between
>> > > AZs) as L2 connectivity will be constrained to the AZ boundaries.
>> > >
>> > > A possibility would be to let the deployer of the CMS set a key on the
>> > > OpenvSwitch table of the local OVS instance like
>> > > 'external_ids:ovn_az=<int>' and if it's set, ovn-controller will
>> > > register itself as a Chassis with the same external ID and establish
>> > > tunnels to those Chassis within the same AZ, otherwise it'll keep the
>> > > current behavior.
>> > >
>> > > It'll be the responsibility of the CMS to schedule gateway ports in the
>> > > right AZ as well to provide L3 AZ awareness.
>> > >
>> > > Does that make sense? Thoughts?
>> > >
>> > > Thanks a lot!!
>> > > Daniel
>> > > _______________________________________________
>> > > discuss mailing list
>> > > discuss at openvswitch.org
>> > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>> >
>> > This sounds like a good idea to me. Just a concern about the name "AZ".
>> > The feature seems to be quite useful to optimize at scale when you
>> > know there are different groups of chassis (and gateways) that would
>> > never need to communicate with each other. However, it doesn't sound
>> > like the availability zone concept, since it is managed by a single
>> > control plane, which means they are not independent availability
>> > zones. I'd call it TZ (transport zone), or maybe just cell. However, I
>> > like the idea and it does not seem hard to implement.
>> >
>>
>> I agree with Han here, the idea is sound but the name seems a bit off.
>> I especially liked the "transport zone" (TZ) name suggested by Han here.
>> So +1 to that name :D
>>
>> Quick question: should we have a default TZ for the chassis/gateways
>> that don't have that key set? For example, say we have 9 chassis
>> where three of them have the TZ key set to 1, three others have it set
>> to 2, and the remaining three have no TZ key set. That should result
>> in 3 different zones, right? I want to clarify that because I don't
>> think we should create a mesh with all chassis for those that don't
>> have the TZ set; instead, if it's omitted, ovn-controller could
>> consider them to be part of a "default" TZ of some sort.
>>
>> What do you think? Is that aligned with your idea?
>>
>> Cheers,
>> Lucas
>
>
> Hello, I'm chiming in here because I think Daniel was prompted to start this thread based on a related feature request that I made against the OpenStack OVN component. Transport Zones is a better name for this feature in my opinion, and that is incidentally the name for this feature in VMware NSX-T. As far as I can tell, they are not claiming it to be a trademark, and it's a fairly generic term.
>
> Having a default transport zone makes a lot of sense to me if we consider the introduction of transport zones into an existing environment. If existing chassis are not assigned to any transport zone, I think to most users it would be expected behavior that chassis assigned to a new transport zone would be separate.
>
> However, treating the default TZ as a separate TZ doesn't address a central hub-and-spoke network where each chassis forms tunnels to other chassis in the same TZ, and also forms tunnels with nodes at the central site, but doesn't form tunnels with chassis in other specific TZs. In this alternative, a chassis without a TZ forms tunnels with all other chassis. A chassis with a specific TZ forms tunnels within that TZ and with chassis with no TZ specified. That allows network functions such as gateways to be centralized for all TZs.
>
> Perhaps an ideal solution would allow a chassis to be a member of more than one TZ? That would allow flexibility, but may be more difficult to implement.
>
> --
> Dan Sneddon

Good points on the default TZ and the hub-and-spoke use case! To support
that, I agree with the flexible approach of allowing a chassis to be a
member of more than one TZ. So if no TZ is set, the chassis belongs to
the default TZ. If a chassis wants to join both the default TZ and some
other TZs, it should be explicitly set to "default" plus the other TZs.
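
To make the membership rule above concrete, here is a minimal Python
sketch of how the tunnel peers could be derived. It is only an
illustration, not ovn-controller code: the function names, the chassis
names, and the "default" constant are all hypothetical. Two chassis get
a tunnel if they share at least one TZ, and a chassis with no TZ set is
treated as a member of the default TZ only.

```python
DEFAULT_TZ = "default"  # hypothetical name for the implicit zone


def effective_zones(zones):
    """A chassis with no TZ configured belongs to the default TZ only."""
    return set(zones) if zones else {DEFAULT_TZ}


def tunnel_peers(chassis_zones):
    """Map each chassis to the chassis it shares at least one TZ with."""
    eff = {name: effective_zones(z) for name, z in chassis_zones.items()}
    return {
        name: {other for other, oz in eff.items()
               if other != name and zones & oz}
        for name, zones in eff.items()
    }


# Hub-and-spoke example: gw1 explicitly joins "default" plus both TZs.
chassis_zones = {
    "gw1": {"default", "tz1", "tz2"},
    "hv1": {"tz1"},
    "hv2": {"tz1"},
    "hv3": {"tz2"},
    "hv4": set(),  # no TZ set, so it lands in the default TZ
}
peers = tunnel_peers(chassis_zones)
```

With this input, gw1 peers with every other chassis, hv1 and hv2 peer
with each other and with gw1, and hv3 and hv4 each peer only with gw1.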

However, if our purpose here is just to avoid an unnecessary tunnel mesh
and improve tunnel/BFD scalability, another approach would be to
automatically calculate the remotes that are needed for each chassis,
i.e. on-demand tunnel creation. It should be doable following the logic
of the local datapath calculation currently implemented in
ovn-controller. Only the chassis sharing a common set of datapaths need
to be in the same tunnel mesh.
Pros: This is more optimal (the fewest tunnels are created) and less
error-prone (no ops work) than the TZ solution.
Cons: This may introduce some control plane cost, although I think it
should be acceptable if implemented properly. It may introduce some
latency because tunnels are created dynamically when a port from a
different set of datapaths is bound on a chassis. It also increases the
complexity of the control plane a little bit.
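
As a rough illustration of the on-demand idea, the Python sketch below
computes the remotes for each chassis from its set of local datapaths.
It is a simplification under stated assumptions: the local datapath
sets are taken as given input (the real calculation in ovn-controller
is more involved), and the function and chassis names are hypothetical.

```python
from collections import defaultdict


def required_remotes(local_datapaths):
    """Return, per chassis, the chassis sharing at least one datapath."""
    # Index chassis by datapath so each datapath's members are grouped.
    by_dp = defaultdict(set)
    for chassis, dps in local_datapaths.items():
        for dp in dps:
            by_dp[dp].add(chassis)

    remotes = {chassis: set() for chassis in local_datapaths}
    for members in by_dp.values():
        for chassis in members:
            remotes[chassis] |= members - {chassis}
    return remotes


def count_tunnels(remotes):
    """Each tunnel appears once at both of its endpoints."""
    return sum(len(r) for r in remotes.values()) // 2


local_datapaths = {
    "hv1": {"ls1"},
    "hv2": {"ls1", "ls2"},
    "hv3": {"ls2"},
    "hv4": {"ls3"},
}
remotes = required_remotes(local_datapaths)
n = len(local_datapaths)
full_mesh = n * (n - 1) // 2
on_demand = count_tunnels(remotes)
```

In this example a full mesh needs 6 tunnels while the on-demand
calculation needs 2, and hv4 needs none until a port from a shared
datapath is bound on it.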

Thoughts?

