[ovs-dev] [PATCH] bridge: Filter some gratuitous ARPs on bond slaves.

Ian Campbell Ian.Campbell at citrix.com
Thu Jun 3 17:50:56 UTC 2010


Thanks for this Jesse, I must admit I'd forgotten all about it...

On Thu, 2010-06-03 at 03:43 +0100, Jesse Gross wrote:
> 
> +    /* We don't want to learn from GARP packets that are reflected back over
> +     * bond slaves so we lock the learning table. */
> +    lock_type = !is_gratuitous_arp(flow) ? GARP_LOCK_NONE :
> +                    (in_port->n_ifaces == 1) ? GARP_LOCK_SET : GARP_LOCK_CHECK; 

I don't know if it makes sense in the vswitch world but in the old style
bridge+bond mode we only lock the entry if the packet we are looking at
is a gratARP which came from a non-physical port and we only pay
attention to the locks for frames coming from a physical port where the
address was already learnt on an internal port (and it's a gratARP).

In particular this means that we don't extend the lock timeout for any
old traffic but only for gratARPs seen on an internal port. If you lock
from all traffic then won't any traffic sent in the 5 seconds before
migrating away from a host cause the normal exception for that case to
not kick in?

Maybe I've missed some subtlety in the vswitch code (I only read the
patch, I didn't put it into context by applying it) and you already do
as I described or something similar.

This whole area is so fraught with edge cases and complexity that any
difference in behaviour makes me nervous.

I don't know if the vswitch has such a concept as a physical port (at
least in the way I'm thinking of) so this may not make sense or even be
possible.

I've stuck the relevant bit of the bridge+bond code below as well as my
commit message from the time, on the off chance it helps make my
explanation any clearer.

Ian.

CA-39056, SCTX-398: filter gratuitous ARPs sent from this host returning
via another bond link

When a VM is migrated on to a host using SLB bonding and sends out a
gratuitous broadcast ARP there is a good chance that this ARP will be
re-received via another bond slave. This would cause the bridge code
to relearn the source MAC address on the external port replacing the
currently learnt internal port and leading to traffic outage for that
VM.

Fix this by locking out learning for broadcast ARPs received from a
given MAC address on the external port for a configurable length of
time (default 5s) after a broadcast ARP from the same source was
received on an internal port.

If a VM is migrated very quickly back-to-back (i.e. faster than the
configured lockout time, including time for the migration itself) then
it is possible that the broadcast ARP will be incorrectly discarded
leading to a short outage of communication to the migrated VM from the
point of view of VMs resident on the source host. In the normal case
this will be corrected when the source domain's VIF is cleaned up
which happens reasonably quickly on an unloaded host. This was deemed
preferable to a complete outage for everyone.

We already actively discard any non-broadcast ARP traffic from an
external port where we determine that the real source was an internal
port.

Signed-off-by: Ian Campbell <ian.campbell at citrix.com>

       fdb = fdb_find(head, addr);
        if (likely(fdb)) {
                /*
                 * If this is an address arriving on the physical port
                 * which we have previously seen on a non-physical
                 * port then ignore it.
                 *
                 * _Unless_ it is a broadcast ARP reply in which case
                 * the guest in question has migrated. However we lock
                 * out updates due to broadcast ARP replies received
                 * on the physical port for a configurable amount of
                 * time after any broadcast ARP from the same source
                 * address received on a non-physical link -- this is
                 * order to avoid incorrect learning when a broadcast
                 * ARP transmitted by a VM on this host comes back in
                 * another bond link and causes the bridge to learn
                 * the MAC on the exernal port.
                 */
                if (!is_physical_port(br, fdb->dst) && is_physical_port(br, source)) {

                        if (!is_gratuitous_arp(skb))
                                return 0;

                        if (time_before(jiffies, fdb->garp_lock_until))
                                return 0;
                }

                /* attempt to update an entry for a local interface */
                if (unlikely(fdb->is_local)) {
                        return 0;
                } else {
                        if (is_gratuitous_arp(skb) && !is_physical_port(br, source))
                                fdb->garp_lock_until = jiffies + fdb_garp_lock_time*HZ;

                        /* fastpath: update of existing entry */
                        fdb->dst = source;
                        fdb->ageing_timer = jiffies;
                }
        } else {
                spin_lock(&br->hash_lock);
                if (!fdb_find(head, addr)) {
                        unsigned long garp_lock = NO_GARP_LOCK;
                        if (is_gratuitous_arp(skb) && !is_physical_port(br, source))
                                        garp_lock = jiffies + fdb_garp_lock_time*HZ;
                        fdb_create(head, source, addr, 0, garp_lock);
                }
                /* else  we lose race and someone else inserts
                 * it first, don't bother updating
                 */
                spin_unlock(&br->hash_lock);
        }






More information about the dev mailing list