[ovs-dev] OVN L3-HA request for feedback

Miguel Angel Ajo Pelayo majopela at redhat.com
Fri Apr 7 07:14:33 UTC 2017


Updating what I wrote yesterday (I hope I won't make people's
eyes hurt today) after a talk on IRC (thank you Mickey Spiegel
and Gurucharan Shetty):

I propose having:

   1) The chassis option on NB/Logical_Router accepts multiple chassis,
to cover HA in the centralized gateway case for DNAT/SNAT.

           ovn-nbctl create Logical_Router name=edge1 \
                     options:chassis=gw1:10,gw2:20,gw3:30

        Or multiple chassis without priorities:

           ovn-nbctl create Logical_Router name=edge1 \
                     options:chassis=gw1,gw2,gw3

        In this case we let OVN decide (and rewrite the option) how
        to balance priorities between gateways to spread the load.

   2) The redirect-chassis option on NB/Logical_Router_Port accepts
multiple chassis, to cover HA for centralized SNAT on distributed routers.

        ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
                  -- set Logical_Router_Port alice \
                  options:redirect-chassis=gw1:10,gw2:20,gw3:30

        (or again, without priorities)

        ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
                  -- set Logical_Router_Port alice \
                  options:redirect-chassis=gw1,gw2,gw3
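
To make the two value formats above concrete, here is a minimal Python
sketch of how such an option string could be parsed, and how priorities
could be filled in when none are given. Nothing here is existing OVN code:
the function names are hypothetical, and the balancing rule (rotating the
chassis list by a stable hash of the router name) is only one assumed way
to "spread the load". I also assume a larger number means a more
preferred chassis, which the proposal does not state.

```python
import zlib
from typing import List, Optional, Tuple


def parse_chassis_option(value: str) -> List[Tuple[str, Optional[int]]]:
    """Parse "gw1:10,gw2:20" or "gw1,gw2" into (chassis, priority) pairs.

    The priority is None when the entry carries no ":<priority>" suffix.
    """
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, prio = entry.partition(":")
        pairs.append((name, int(prio) if sep else None))
    if not pairs:
        raise ValueError("no chassis given")
    return pairs


def fill_priorities(router: str,
                    pairs: List[Tuple[str, Optional[int]]]) -> List[Tuple[str, int]]:
    """Return pairs with concrete priorities (hypothetical balancing rule)."""
    if all(prio is not None for _, prio in pairs):
        return [(name, prio) for name, prio in pairs]
    if any(prio is not None for _, prio in pairs):
        raise ValueError("mixed entries are not handled in this sketch")
    names = [name for name, _ in pairs]
    # Assumption: rotate by a stable hash of the router name, so different
    # routers end up preferring different gateways.
    offset = zlib.crc32(router.encode("utf-8")) % len(names)
    rotated = names[offset:] + names[:offset]
    # Assumption: higher number means more preferred; steps of 10 as in
    # the examples above.
    return [(name, 10 * (len(rotated) - i)) for i, name in enumerate(rotated)]


def format_chassis_option(pairs: List[Tuple[str, int]]) -> str:
    """Serialize pairs back into the "gw1:10,gw2:20" form."""
    return ",".join(f"{name}:{prio}" for name, prio in pairs)
```

The rewritten option would then be the output of format_chassis_option()
applied to the result of fill_priorities().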


These logical model changes allow for Active/Active L3 when we have
that implemented, for example by assigning the same priorities.

Alternatively, for the Active/Active case we could add an explicit mode
option: ha-chassis-mode=active_standby/active_active for case (1),
and ha-redirect-mode=active_standby/active_active for case (2).
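
As a rough illustration of the two ways of expressing Active/Active
described above, the following Python sketch computes the set of active
gateways either from equal priorities or from an explicit mode. The
function names are hypothetical, and two things are my assumptions rather
than part of the proposal: that a larger number means a more preferred
chassis, and that active_active mode makes every listed chassis active.

```python
from typing import Dict, List


def active_by_equal_priority(priorities: Dict[str, int]) -> List[str]:
    """All chassis sharing the top priority are active (sorted by name)."""
    if not priorities:
        return []
    top = max(priorities.values())
    return sorted(name for name, prio in priorities.items() if prio == top)


def active_by_mode(priorities: Dict[str, int], mode: str) -> List[str]:
    """Select active chassis from an explicit mode option."""
    if mode == "active_active":
        # Assumption: every listed chassis forwards traffic in this mode.
        return sorted(priorities)
    if mode == "active_standby":
        # A single chassis is active; ties are broken by name.
        return active_by_equal_priority(priorities)[:1]
    raise ValueError(f"unknown mode: {mode}")
```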

For the dataplane implementation I propose following what [1] defines
for the Active/Standby per-router implementation, with BFD monitoring of
tunnel endpoints, where the location of the master router is
calculated independently at every chassis, making the solution
independent of the controller connection via the SB database.
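
A minimal Python sketch of the "independently calculated at every
chassis" idea: each chassis combines the configured priorities with its
own local BFD view of the gateways and picks the best reachable one,
without asking the SB database. The function name and the bfd_up map are
hypothetical, and "larger number wins" is my assumption; this is not the
algorithm from [1], only an illustration of the principle.

```python
from typing import Dict, Optional


def elect_master(priorities: Dict[str, int],
                 bfd_up: Dict[str, bool]) -> Optional[str]:
    """Return the preferred gateway that this chassis currently sees as up.

    priorities: chassis name -> configured priority
    bfd_up:     chassis name -> local BFD session state (hypothetical input)
    """
    # Only gateways whose tunnel endpoint is reachable from here qualify.
    candidates = {name: prio for name, prio in priorities.items()
                  if bfd_up.get(name, False)}
    if not candidates:
        return None
    best = max(candidates.values())
    # Break ties by name so every chassis reaches the same answer
    # when they share the same BFD view.
    return min(name for name, prio in candidates.items() if prio == best)
```

For example, with gw1:10,gw2:20,gw3:30 and all sessions up, every chassis
picks gw3; a chassis that sees gw3 down picks gw2 on its own.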

To start with, there are a few gaps that we still need to define properly:

1) I'd like to see the master gateway reported somehow through the
SB db [up to the NB db?], so that the administrator can inspect
the system and see what its current state is.

2) Hypervisors will direct traffic to the calculated master router
via the bundle action with the active_backup algorithm, but I believe
we can't have anything in OpenFlow to drop packets in the standby
routers based on the inter-gateway link matrix status.

3) Other related changes in the SouthBound DB.

Best regards,
Miguel Ángel Ajo

[1]
https://github.com/openvswitch/ovs/blob/master/Documentation/topics/high-availability.rst


On Thu, Apr 6, 2017 at 12:13 PM, Miguel Angel Ajo Pelayo <
majopela at redhat.com> wrote:

> Hello everybody,
>
>      First I'd like to say hello, because I'll be able to spend more time
> working with this community, and I'm sure it will be an enjoyable journey
> for what I've seen (silently watching) during the last few years.
>
>      I'm planning to start work (together with Anil) on the L3 High
> availability area of OVN. We've been reading [1], and it seems quite
> reasonable.
>
>      We're wondering to fast forward and skip the naive active/backup
> implementation in favor of the Active/Standby (per router) based on
> bfd + bundle(active_backup) output actions, since the proposal of having
> ovn-northd monitoring the gateways seems a bit unnatural, and the
> difference in effort (naive vs active/standby) is probably not very big
> (warning: I tend to be optimistic).
>
>      I spent a couple of days looking at how L3 works now, and, very
> naively, I would propose either having the redirect-chassis option of
> Logical_Router_Ports accept multiple chassis with priorities.
>
>     For example:
>
>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
>              -- set Logical_Router_Port alice \
>              options:redirect-chassis=gw1:10,gw2:20,gw3:30
>
> Or multiple chassis without priorities:
>
>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
>              -- set Logical_Router_Port alice \
>              options:redirect-chassis=gw1,gw2,gw3
>
>         (and in this case we let ovn decide -and may be rewrite the
> option- how to balance priorities between gateways to spread the load)
>
>
>         We may want to have another field in the Logical_Router_Port, to
> let us know which one(s) is(are) the active gateway(s)
>
>
>        This logical model would also allow for Active/Active L3 when we
> have that implemented, for example by assigning the same priorities.
>
>
>        Alternatively we could have two options:
>           * ha-redirect-chassis=<chassis>:<priority>[ .. :<chassis2>:<priority2>]
>           * ha-redirect-mode=active_standby/active_active
>
>
> Best regards,
> Miguel Ángel Ajo
>
> [1] https://github.com/openvswitch/ovs/blob/master/Documentation/topics/high-availability.rst
>
>

