[ovs-dev] OVN L3-HA request for feedback

Russell Bryant russell at ovn.org
Wed May 24 17:39:24 UTC 2017


On Wed, May 24, 2017 at 7:19 AM, Miguel Angel Ajo Pelayo
<majopela at redhat.com> wrote:
> I wanted to share a small status update:
>
> Anil and I have been working on this [1] and we expect to post
> some preliminary  patches before the end of the week.
>
> I can confirm that the BFD + bundle(active_backup) strategy
> works well from the hypervisors' point of view. With 1-second BFD
> intervals we get a ~2.8s failover time.
>
> So far we have only focused on case "2", the distributed routers
> where we specify a set of hosts to act as gateway chassis.
>
> """
>      ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
>                   -- set Logical_Router_Port alice \
>                   options:redirect-chassis=gw1:10,gw2:20,gw3:30
> """
>

Thanks for the update!  Sounds like great progress.
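
If it helps anyone trying the vagrant setup, one way to sanity-check the
failover machinery from a hypervisor is to look at the BFD sessions on the
tunnel interfaces.  This is just a generic OVS inspection sketch (the
interface names will depend on your chassis names), nothing specific to the
patches:

    # state of every BFD session known to ovs-vswitchd
    $ sudo ovs-appctl bfd/show

    # or the per-interface view from the database
    $ sudo ovs-vsctl --columns=name,bfd_status list Interface

A tunnel whose bfd_status shows forwarding=false should no longer be picked
by the active_backup bundle on the hypervisor side.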

> We wonder if there's any value at all in exploring support for "1",
> the old way of pinning a logical router to a single chassis.

You mean only specifying a single chassis here?  Does it add a lot of
complexity to only support a single gateway?  If not, it definitely
seems worth keeping.  Supporting simpler setups is a good thing.
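
For reference, I'd expect the single-gateway case to simply be the
degenerate form of the same option, i.e. something like this (reusing your
example, untested):

    ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
                 -- set Logical_Router_Port alice \
                 options:redirect-chassis=gw1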

> If anybody wants to give it a try, you can use [2] to quickly deploy
> 2 gw hosts + 2 "hv" hosts + 1 service host (accessible through an
> 'external' network via gw1 and gw2); see the ASCII diagram in [3] and
> the details in [4].
>
>
> Then you can ping the external service from a port in hv1 with:
>
>     $ vagrant ssh hv1 -c "sudo ip netns exec vm1 ping 10.0.0.111"
>
> or ping vm3 from the service host via its floating IP with:
>
>     $ vagrant ssh svc1 -c "ping 10.0.0.16"
>
> You can trigger a failover at any time with:
>
>     $ vagrant ssh gw1 -c "sudo ifdown eth1"
>
> and a failback with:
>
>      $ vagrant ssh gw1 -c "sudo ifup eth1"
>
>
>
> We are currently working on:
>
> 1) Addressing the monitoring of the inter-gateway BFD sessions, to make
>    sure that non-master routers will drop any packet (external/internal)
>    or any ARP request.
> 2) Same as 1, but for sending gratuitous ARPs when a router moves to a
>    new chassis.
> 3) Documentation changes.
> 4) Tests
>
> And we have some questions:
>
>     About preemption (see the failover/failback example above), we have
> several options:
>    a) we stick with preemptive failbacks (if a gateway chassis comes
> back online, the routers that were scheduled there come back to it);
>    b) non-preemptive (when a chassis goes down, all logical router ports
> with a redirect-chassis are recalculated); or
>    c) we make it configurable.
>
> My intuition says that with very low failover times "a" could be a
> reasonable choice for most cases, since your load stays balanced when
> your gateway chassis comes back.  But I'm not an operator; how could we
> gather feedback in this area?

Good question and good point about wanting to ensure load remains
balanced when a chassis comes back.

With (a), I'd be worried about the case where a chassis is in more of
a half-dead (zombie?) state.  We don't want failover bouncing back and
forth because we keep thinking a chassis is going up and down.  Any
thoughts on how to mitigate this?
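
One partial mitigation, assuming the liveness checks stay on the OVS tunnel
interfaces, would be to run the BFD sessions with slower timers so that
short blips don't flip the selection back and forth (at the cost of a
slower failover).  A rough sketch using the standard per-interface BFD
options, with a made-up tunnel interface name:

    # illustrative only; "ovn-gw1-0" stands in for the tunnel port to gw1
    $ sudo ovs-vsctl set Interface ovn-gw1-0 bfd:min_rx=2000 bfd:min_tx=2000

That only smooths out short-lived flaps, though; a chassis that stays up
for minutes at a time before dying again would still bounce, so some
explicit hold-down logic, or option (c) above, may be needed on top.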

>
> Best regards,
> Miguel Ángel Ajo
>
> [1] https://github.com/mangelajo/ovs/commits/l3ha
> [2] https://github.com/mangelajo/vagrants/tree/master/ovn-l3-ha
> [3] https://github.com/mangelajo/vagrants/blob/master/ovn-l3-ha/Vagrantfile#L16
> [4] https://github.com/mangelajo/vagrants/blob/master/ovn-l3-ha/gw1.sh#L67
>
> On Fri, Apr 7, 2017 at 9:14 AM, Miguel Angel Ajo Pelayo
> <majopela at redhat.com> wrote:
>
>> Updating what I wrote yesterday (I hope I won't make people's
>> eyes hurt today) after a talk on IRC (thank you Mickey Spiegel
>> and Gurucharan Shetty):
>>
>> I propose having:
>>
>>    1) the chassis option on NB/Logical_Router accepts multiple chassis,
>> to cover HA for the centralized gateway case (DNAT/SNAT).
>>
>>            ovn-nbctl create Logical_Router name=edge1 \
>>                      options:chassis=gw1:10,gw2:20,gw3:30
>>
>>         Or multiple chassis without priorities:
>>
>>            ovn-nbctl create Logical_Router name=edge1 \
>>                      options:chassis=gw1,gw2,gw3
>>
>>         and in this case we let OVN decide -and rewrite the option-
>>         how to balance priorities between gateways to spread the load.
>>
>>    2) the redirect-chassis option on NB/Logical_Router_Port accepts
>> multiple chassis, to cover HA for centralized SNAT on distributed
>> routers.
>>
>>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
>>                   -- set Logical_Router_Port alice \
>>                   options:redirect-chassis=gw1:10,gw2:20,gw3:30
>>
>>         (or again, without priorities)
>>
>>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
>>                   -- set Logical_Router_Port alice \
>>                   options:redirect-chassis=gw1,gw2,gw3
>>
>>
>> These logical model changes allow for Active/Active L3 when we have
>> that implemented, for example by assigning the same priorities.
>>
>> Alternatively, in such a case we could add another option
>> for case (1), ha-chassis-mode=active_standby/active_active,
>> and ha-redirect-mode=active_standby/active_active for case (2).
>>
>> For the dataplane implementation I propose following what [1] defines
>> for a per-router Active/Standby implementation, with BFD monitoring of
>> the tunnel endpoints, where the location of the master router is
>> independently calculated at every chassis, making the solution
>> independent of the controller connection via the SB database.
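
(For anyone not familiar with the bundle action referenced here, the
hypervisor-side selection roughly looks like this at the OpenFlow level.
This is purely illustrative (a hand-written flow with made-up ofport
numbers and match), not what ovn-controller would actually install:

    # with tunnels to gw1/gw2 on ofports 1 and 2, active_backup forwards
    # on the first listed slave whose BFD session reports it live
    $ sudo ovs-ofctl add-flow br-int \
        "ip,nw_dst=172.16.1.0/24,actions=bundle(eth_src,0,active_backup,ofport,slaves:1,2)"

The "eth_src,0" hash inputs don't matter for active_backup; that algorithm
simply picks the first live slave in the list.)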
>>
>> To start with, there are a few gaps that we still need to properly define:
>>
>> 1) I'd like to see the master gateway reported somehow through the
>> SB db [up to the NB db?], in a way that the administrator can inspect
>> the system and see its current state.
>>
>> 2) While hypervisors will direct traffic to the calculated master
>> router via the bundle action with the active_backup algorithm, I
>> believe we can't have anything in OpenFlow to drop packets in the
>> standby routers based on the inter-gateway link matrix status.
>>
>> 3) Other related changes in the SouthBound DB.
>>
>> Best regards,
>> Miguel Ángel Ajo
>>
>> [1] https://github.com/openvswitch/ovs/blob/master/Documentation/topics/high-availability.rst
>>
>>
>> On Thu, Apr 6, 2017 at 12:13 PM, Miguel Angel Ajo Pelayo
>> <majopela at redhat.com> wrote:
>>
>>> Hello everybody,
>>>
>>>      First I'd like to say hello, because I'll be able to spend more
>>> time working with this community, and I'm sure it will be an enjoyable
>>> journey, from what I've seen (silently watching) during the last few
>>> years.
>>>
>>>      I'm planning to start work (together with Anil) on the L3 High
>>> availability area of OVN. We've been reading [1], and it seems quite
>>> reasonable.
>>>
>>>      We're wondering whether to fast-forward and skip the naive
>>> active/backup implementation in favor of the Active/Standby (per
>>> router) approach based on BFD + bundle(active_backup) output actions,
>>> since the proposal of having ovn-northd monitor the gateways seems a
>>> bit unnatural, and the difference in effort (naive vs active/standby)
>>> is probably not very big (warning: I tend to be optimistic).
>>>
>>>      I spent a couple of days looking at how L3 works now, and, very
>>> naively, I would propose having the redirect-chassis option of
>>> Logical_Router_Port accept multiple chassis with priorities.
>>>
>>>     For example:
>>>
>>>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
>>>              -- set Logical_Router_Port alice \
>>>              options:redirect-chassis=gw1:10,gw2:20,gw3:30
>>>
>>> Or multiple chassis without priorities:
>>>
>>>         ovn-nbctl lrp-add R1 alice 00:00:02:01:02:03 172.16.1.1/24 \
>>>              -- set Logical_Router_Port alice \
>>>              options:redirect-chassis=gw1,gw2,gw3
>>>
>>>         (and in this case we let OVN decide -and maybe rewrite the
>>> option- how to balance priorities between gateways to spread the load)
>>>
>>>
>>>         We may want to have another field in Logical_Router_Port to
>>> let us know which one(s) is/are the active gateway(s).
>>>
>>>
>>>        This logical model would also allow for Active/Active L3 when
>>> we have that implemented, for example by assigning the same priorities.
>>>
>>>
>>>        Alternatively we could have two options:
>>>           * ha-redirect-chassis=<chassis>:<priority>[ .. :<chassis2>:<priority2>]
>>>           * ha-redirect-mode=active_standby/active_active
>>>
>>>
>>> Best regards,
>>> Miguel Ángel Ajo
>>>
>>> [1] https://github.com/openvswitch/ovs/blob/master/Documentation/topics/high-availability.rst
>>>
>>>
>>



-- 
Russell Bryant

