[ovs-dev] [PATCH v5 0/3] Use VLANs for VLAN packets redirected to a gateway chassis

Miguel Angel Ajo Pelayo majopela at redhat.com
Wed Jul 18 13:22:22 UTC 2018

I have been testing the patches and they work as expected
(L3HA failovers, N/S, E/W, etc.), but I have found a couple of
issues. I'm not sure the second one, "2", is a real issue, but I will
describe it too; if it turns out not to be one, we can move that
discussion to discuss at openvswitch.org.

1) The expiry of the chassisredirect port MACs in the switch CAM
   table: In N/S routing, when traffic needs to be handled by the
   master chassis for a router, the dst.mac is the MAC of the
   chassisredirect port.

   The switch knows about that MAC because it is announced via gARP
   at the L2 level. The problem is that, for incoming N to S traffic,
   the router pipelines rewrite src.mac to the router internal leg MAC
   before sending the packet to the destination chassis.

   Because of that, the switch never relearns the chassisredirect
   MAC/VLAN combination from outgoing traffic on the right port (the
   master gw chassis port), and the entry eventually expires after
   300 seconds.

   From that moment, any traffic directed to the "chassisredirect"
   port MAC is flooded until another gARP happens. Everything
   seems to work fine at a very small scale, but this would really
   kill the network in real-life conditions.

   You can see it live here:


  (sorry for the audio which is missing in a couple of non-important
   moments, not sure why)

    The problematic MAC in that video is "fa:16:3e:48:66:e7", the one
    of this chassisredirect port:

logical_port        : "cr-lrp-4823af55-cd17-4de8-8120-6d13c44dc86b"
mac                 : ["fa:16:3e:48:66:e7"]
nat_addresses       : []
options             : {distributed-port=
parent_port         : []
tag                 : []
tunnel_key          : 3
type                : chassisredirect

Here I can think of two solutions:

a) Make sure that the traffic is not fully processed by the lrouter
flows on the gateway chassis, and let the packet egress the host with the
src.mac = "chassisredirect" mac.   That would make switches again
relearn the MAC/VLAN to port association every time a packet flows N to S.

b) Make sure gARPs don't stop happening (or happen more often than every
300 seconds). I believe this doesn't work: it would not be a valid
solution, since CAM table entries can be expired early on switches if the
table overflows.
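
To make the expiry in "1" concrete, here is a toy model of a learning
switch with a 300 second aging time. It is only a sketch, not a model of
any real switch: the VLAN id, the port name, the internal leg MAC and the
60 second packet interval are made-up values for illustration. It compares
today's behaviour (egress src.mac is the router internal leg MAC) with
solution "a" (egress src.mac is the chassisredirect MAC).

```python
from dataclasses import dataclass, field

AGING_SECONDS = 300               # aging time mentioned in this thread
CR_MAC = "fa:16:3e:48:66:e7"      # chassisredirect port MAC from this thread
INTERNAL_LEG_MAC = "fa:16:3e:00:00:01"  # hypothetical router internal leg MAC
VLAN = 100                        # hypothetical tenant VLAN id
GW_PORT = "gw-chassis-port"       # hypothetical switch port name


@dataclass
class LearningSwitch:
    aging: int = AGING_SECONDS
    # (vlan, mac) -> (port, time the entry was last learned)
    cam: dict = field(default_factory=dict)

    def learn(self, vlan, src_mac, port, now):
        """Record (or refresh) where a source MAC was last seen."""
        self.cam[(vlan, src_mac)] = (port, now)

    def lookup(self, vlan, dst_mac, now):
        """Return the egress port, or "flood" if unknown or aged out."""
        entry = self.cam.get((vlan, dst_mac))
        if entry is None:
            return "flood"
        port, last_seen = entry
        if now - last_seen >= self.aging:
            del self.cam[(vlan, dst_mac)]
            return "flood"
        return port


def simulate(egress_src_is_cr_mac):
    """One gARP at t=0, then one N to S packet every 60 s for 10 minutes."""
    sw = LearningSwitch()
    sw.learn(VLAN, CR_MAC, GW_PORT, now=0)  # initial gARP
    results = []
    for now in range(60, 601, 60):
        src = CR_MAC if egress_src_is_cr_mac else INTERNAL_LEG_MAC
        sw.learn(VLAN, src, GW_PORT, now)   # packet leaving the gw chassis
        results.append((now, sw.lookup(VLAN, CR_MAC, now)))
    return results


if __name__ == "__main__":
    print("current behaviour:", simulate(egress_src_is_cr_mac=False))
    print("solution a:       ", simulate(egress_src_is_cr_mac=True))
```

In this model, lookups for the chassisredirect MAC turn into "flood" once
300 seconds have passed since the gARP, while with solution "a" every
N to S packet refreshes the entry and it never ages out.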

2) MAC flipping on E/W traffic, which is easier to see in this blog post:

   If you want the TL;DR version for more context go to the top:

   Where the VLAN/MAC combination lives is not really important, since we
   never direct traffic to that MAC; all the lrouter flow processing
   happens in OpenFlow before leaving the host.

   My worry here is: for a switch, is it enough to disable port
   flapping protection, as we already have to do for L3HA (where a MAC
   can move between ports based on master/backup status)? Or, given the
   higher rate of port flapping, can it be problematic? For example, I
   could imagine the switch logging every port flap, but I don't know
   whether that would be the case.

   One solution for this could be:

a) Make sure that packets leaving the host carry the host MAC address of
the physical interface on the provider bridge where the Logical Switch
has a localport attached. That would be fine, since that MAC address is
never matched on destination, but we would also need to restore it with
another lflow when the packet arrives at the final chassis.

     (As far as I've been told, this is what neutron/dvr does for VLAN
     tenant networks)
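
As a rough illustration of the flapping in "2", the sketch below counts
how many times a switch would see a source MAC move between ports. It is
a toy model under my own assumptions: the port names, the shared router
MAC and the per-host MACs are hypothetical, and a real switch may detect
or rate-limit moves differently.

```python
def count_mac_moves(frames):
    """Count port moves per source MAC.

    frames is an iterable of (ingress_port, src_mac) tuples in arrival
    order. A move is counted each time a MAC shows up on a different
    port than the one it was last learned on.
    """
    location = {}  # mac -> port where it was last seen
    moves = {}     # mac -> number of moves observed
    for port, mac in frames:
        previous = location.get(mac)
        if previous is not None and previous != port:
            moves[mac] = moves.get(mac, 0) + 1
        location[mac] = port
    return moves


ROUTER_MAC = "fa:16:3e:00:00:aa"  # hypothetical shared router port MAC

# E/W traffic routed locally on two chassis, both using the router MAC
# as src.mac: the switch sees the same MAC alternate between two ports.
shared_mac_frames = [
    ("port1", ROUTER_MAC), ("port2", ROUTER_MAC),
    ("port1", ROUTER_MAC), ("port2", ROUTER_MAC),
    ("port1", ROUTER_MAC), ("port2", ROUTER_MAC),
]

# Solution "a": each chassis sends with its own (hypothetical) host MAC.
per_host_frames = [
    ("port1", "52:54:00:00:00:01"), ("port2", "52:54:00:00:00:02"),
    ("port1", "52:54:00:00:00:01"), ("port2", "52:54:00:00:00:02"),
    ("port1", "52:54:00:00:00:01"), ("port2", "52:54:00:00:00:02"),
]

if __name__ == "__main__":
    print("shared router MAC:", count_mac_moves(shared_mac_frames))
    print("per-host MAC:     ", count_mac_moves(per_host_frames))
```

With the shared MAC, every frame after the first one is a move; with a
per-host source MAC the switch sees no moves at all.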

I plan to start working (with some help from Anil) on a follow up
patch to make sure "1" does not happen, and then "2" if we confirm that's

Miguel Ángel.

On Tue, Jul 10, 2018 at 8:25 AM, Miguel Angel Ajo Pelayo <
majopela at redhat.com> wrote:

> Anil, good work! Thank you.
> I'm reviewing the patches and the behaviour of the series to make sure
> everything is all right.
> E/W distributed L3 routing over L2 is an interesting problem; I'm
> documenting what I see so I can
> share it on this thread.
> Best,
> Miguel Ángel
> On Mon, Jun 25, 2018 at 9:33 AM Anil Venkata <anilvenkata at redhat.com>
> wrote:
>> On Sat, Jun 16, 2018 at 12:05 AM, Ben Pfaff <blp at ovn.org> wrote:
>> > On Thu, Jun 07, 2018 at 02:59:46PM +0530, vkommadi at redhat.com wrote:
>> > > From: Venkata Anil <vkommadi at redhat.com>
>> > >
>> > > This patch avoids tunneling and instead uses source tenant vlan network
>> > > across hypervisors for traffic from vlan network on local hypervisor
>> > > towards gateway hypervisor hosting redirect chassis port.
>> > >
>> > > On the local hypervisor, when the packet enters logical router ingress
>> > > pipeline from tenant vlan network, router will set REGBIT_NAT_REDIRECT
>> > > and redirect the packet to gateway hypervisor, which is hosting the
>> > > chassis redirect port, using tenant vlan network.
>> > > Packet travelling across hypervisors will have source vlan tag and
>> > > distributed gateway port MAC as destination MAC (other packet data
>> > > unchanged).
>> > >
>> > > Gateway hypervisor will check the vlan tag and destination MAC and
>> > > resubmit it to router logical ingress pipeline for routing and finding
>> > > the logical output port (i.e. it treats this packet as coming from the
>> > > local patch port connected to tenant vlan network for routing).
>> > >
>> > > No changes done for return path as return path to source hypervisor
>> > > always uses tenant vlan networks.
>> >
>> > Thanks a lot for revising the patch series.
>> >
>> > We've had a lot of churn in ovn-controller over the last week, and it
>> > has caused some patch rejects for this patch series.  Would you mind
>> > rebasing and reposting it?
>> >
>> Thanks Ben. Sorry for the delay, I was on vacation. I will rebase it now.
>> Thanks
>> Anil
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
