[ovs-dev] [PATCH] ofproto-dpif: Fix nondeterministic flow revalidation behavior.

Ben Pfaff blp at nicira.com
Tue Dec 27 21:29:43 UTC 2011


I reposted this patch as the first in a longer series:
        http://openvswitch.org/pipermail/dev/2011-December/013948.html
Please review it as part of that series.

On Thu, Dec 22, 2011 at 04:49:05PM -0800, Ben Pfaff wrote:
> SLB bonds are very strange beasts.  It's taken OVS a while to figure out
> how they should really work.  Way back in the mists of time, when we were
> in the midst of this process, we noticed that the following could happen:
> 
> 	1. Local VM sends a packet to the OVS bridge.
> 	2. OVS bridge learns VM's MAC, forwards packets to SLB bond.
> 	3. Remote switch hasn't learned packet's destination and
> 	   forwards packet back to other interfaces in the SLB bond.
> 
> Normally nothing bad happens in this scenario because OVS has already
> learned the local port for the VM's MAC in step 2 and the rules for SLB
> bonding (this is rule #2 in vswitchd/INTERNALS).  But at the time we were
> implementing this, OVS didn't yet use active flows to keep MAC learning
> entries alive; only new flows prevented a MAC entry from aging out.  So
> in steady state (e.g. just "ping" traffic) OVS would regularly forget MAC
> addresses.  If the remote switch also happened to forward a packet back to
> one of the SLB bond interfaces, then OVS would learn the VM's address on
> the bond, with the result that any traffic coming in from the remote switch
> would be black-holed until the VM sent a new packet.  This was not good.
> 
> The fix we applied at the time was commit 2416b8ecea (bridge: Eject NORMAL
> flows without a learning entry from datapath.) followed by a small
> refinement in commit e96a4d8035 (bridge: Feed flow stats into learning
> table.).  This fix causes flows that don't have a learning entry to be
> ejected from the datapath if revalidation occurs.  This forced the next
> packet in the flow to go to userspace, which in turn caused learning to
> happen, fixing the problem.
> 
> However, this isn't a good solution for several reasons:
> 
> 	* It forces more packets to userspace, which is expensive.
> 
> 	* It doesn't just affect the cases where it helps, those where an
> 	  SLB bond is actually involved.  (This could be fixed, but it is
> 	  not worth it.)
> 
> 	* It means that flow installability becomes nondeterministic.  When
> 	  the first packet shows up for a flow, we install it.  But later
> 	  if we revalidate it, we have to uninstall it.  That doesn't make
> 	  sense; a flow should be either installable or not installable,
> 	  not some weird mix.
> 
> Fortunately, the situation has improved since this fix was originally
> designed.  First, active flows now keep MAC learning entries alive, since
> commit e96a4d8035367 (bridge: Feed flow stats into learning table.)
> Second, gratuitous ARP locking, added in commit 7febb9100b (bridge: Filte
> some gratuitous ARPs on bond slaves.) means that gratuitous ARPs reflected
> on bond slaves don't cause confusion (this is rule #4 in
> vswitchd/INTERNALS).
> 
> These improvements mean that it is no longer necessary to have this
> strange special case at all.  Therefore, this commit removes it.
> 
> I found this while investigating reports from code that I added to
> occasionally check that flow actions were correct.
> 
> Signed-off-by: Ben Pfaff <blp at nicira.com>
> ---
>  ofproto/ofproto-dpif.c |    8 --------
>  1 files changed, 0 insertions(+), 8 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 32d90da..a84dcf9 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -5312,14 +5312,6 @@ xlate_normal(struct action_xlate_ctx *ctx)
>          if (mac->port.p != in_bundle) {
>              output_normal(ctx, mac->port.p, vlan);
>          }
> -    } else if (!ctx->packet && !eth_addr_is_multicast(ctx->flow.dl_dst)) {
> -        /* If we are revalidating but don't have a learning entry then eject
> -         * the flow.  Installing a flow that floods packets opens up a window
> -         * of time where we could learn from a packet reflected on a bond and
> -         * blackhole packets before the learning table is updated to reflect
> -         * the correct port. */
> -        ctx->may_set_up_flow = false;
> -        return;
>      } else {
>          struct ofbundle *bundle;
>  
> -- 
> 1.7.2.5
> 



More information about the dev mailing list