[ovs-dev] [PATCH] vswitchd: Treat gratuitous ARP requests like gratuitous ARP replies.

Ben Pfaff blp at nicira.com
Thu May 27 17:10:59 UTC 2010


On Thu, May 27, 2010 at 09:24:45AM +0100, Ian Campbell wrote:
> On Wed, 2010-05-26 at 21:12 +0100, Ben Pfaff wrote:
> > vswitchd has long used a gratuitous ARP reply as an indication that a VM
> > has migrated, because Citrix-patch Linux DomUs send such packets out when
> > they complete migration.
> 
> They aren't really Citrix patches but more a xen.org thing. I often call
> them "traditional-Xen" or "classic-Xen" to distinguish them from the
> pvops based stuff which is now available in the kernel.org trees.
> 
> >   Relatively recently, however, we realized that
> > upstream Linux does not do this.  Ian Campbell submitted a patch to the
> > Linux network maintainer, Dave Miller, to fix this, but Dave did not accept
> > it in its original form.  Instead, he required the form of the packet to
> > change to a gratuitous ARP request.
> 
> The upstream kernel already sends a gratuitous ARP request (or would if
> it weren't for a separate issue, patched separately) and Dave only
> rejected my suggestion to turn it into a gratuitous ARP reply.
> Gratuitous ARP request and replies are roughly equivalent at the L2
> layer wrt switch MAC tables etc but I think Dave was concerned that the
> gratuitous replies would also have additional impact on e.g. ARP tables
> of other devices on the network (I think this is what he meant by
> spamming) so I think he was correct to reject the patch.
> 
> The patch for the separate issue mentioned above was also initially
> rejected but after discussion we have agreed a different approach. I
> have already submitted a new patch with that.

Thanks for all the clarifications.  I believe that I now have the story
correct.  I'm appending what I pushed at the bottom of this email

> > +/* A VM broadcasts a gratuitous ARP to indicate that it has resumed after
> > + * migration.  Older Citrix-patched Linux DomU used gratuitous ARP replies to
> > + * indicate this; newer upstream kernels use gratuitous ARP requests. */
> >  static bool
> > -is_bcast_arp_reply(const flow_t *flow)
> > +is_gratuitous_arp(const flow_t *flow)
> >  {
> >      return (flow->dl_type == htons(ETH_TYPE_ARP)
> > -            && flow->nw_proto == ARP_OP_REPLY
> > -            && eth_addr_is_broadcast(flow->dl_dst));
> > +            && eth_addr_is_broadcast(flow->dl_dst)
> > +            && (flow->nw_proto == ARP_OP_REPLY
> > +                || (flow->nw_proto == ARP_OP_REQUEST
> > +                    && flow->nw_src == flow->nw_dst)));
> >  }
> 
> If I understand correctly flow->nw_{src,dst} are
> flow->dl_type/flow->nw_proto specific fields filed in by flow_extract()
> and for ARP types they contain the sender and target protocol addresses.
> 
> Assuming that is right this change looks correct to me.

Yes, that's correct.  Thanks again.

--8<--------------------------cut here-------------------------->8--

>From 5d0ae1387c968859b390dba9768ce44ac788405b Mon Sep 17 00:00:00 2001
From: Ben Pfaff <blp at nicira.com>
Date: Thu, 27 May 2010 10:06:36 -0700
Subject: [PATCH] vswitchd: Treat gratuitous ARP requests like gratuitous ARP replies.

vswitchd has long used a gratuitous ARP reply as an indication that a VM
has migrated, because traditional xen.org Linux DomUs send such packets out
when they complete migration.  Relatively recently, however, we realized
that upstream Linux does not do this.  Ian Campbell tracked this down to
two separate issues:

        1. A bug prevented gratuitous ARPs from being sent.

        2. When this was fixed, the gratuitous ARPs that were sent were
           requests, not replies, although kernel documentation sent that
           replies were to be sent.

Ian submitted patches to fix both bugs.  #1 is in process of revision for
acceptance.  #2 was rejected: according to Dave Miller, the documentation
is wrong, not the implementation, because ARP replies would unnecessarily
fill up the ARP tables of devices on the network.

OVS has not until now treated gratuitous ARP requests specially, only
replies.  Now that Linux will be using ARP requests to indicate migration,
OVS should also treat them as such.!  This commit does so.

See http://marc.info/?l=linux-netdev&m=127367215620212&w=2 for Ian's
original patch and http://marc.info/?l=linux-netdev&m=127468303701361&w=2
for Dave Miller's response.

CC: Ian Campbell <Ian.Campbell at citrix.com>
NIC-74.
---
 vswitchd/bridge.c |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 8314c53..61813bb 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -2269,12 +2269,17 @@ update_learning_table(struct bridge *br, const flow_t *flow, int vlan,
     }
 }
 
+/* A VM broadcasts a gratuitous ARP to indicate that it has resumed after
+ * migration.  Older Citrix-patched Linux DomU used gratuitous ARP replies to
+ * indicate this; newer upstream kernels use gratuitous ARP requests. */
 static bool
-is_bcast_arp_reply(const flow_t *flow)
+is_gratuitous_arp(const flow_t *flow)
 {
     return (flow->dl_type == htons(ETH_TYPE_ARP)
-            && flow->nw_proto == ARP_OP_REPLY
-            && eth_addr_is_broadcast(flow->dl_dst));
+            && eth_addr_is_broadcast(flow->dl_dst)
+            && (flow->nw_proto == ARP_OP_REPLY
+                || (flow->nw_proto == ARP_OP_REQUEST
+                    && flow->nw_src == flow->nw_dst)));
 }
 
 /* Determines whether packets in 'flow' within 'br' should be forwarded or
@@ -2366,11 +2371,11 @@ is_admissible(struct bridge *br, const flow_t *flow, bool have_packet,
 
         /* Drop all packets for which we have learned a different input
          * port, because we probably sent the packet on one slave and got
-         * it back on the other.  Broadcast ARP replies are an exception
+         * it back on the other.  Gratuitous ARP packets are an exception
          * to this rule: the host has moved to another switch. */
         src_idx = mac_learning_lookup(br->ml, flow->dl_src, vlan);
         if (src_idx != -1 && src_idx != in_port->port_idx &&
-            !is_bcast_arp_reply(flow)) {
+            !is_gratuitous_arp(flow)) {
                 return false;
         }
     }
-- 
1.7.1





More information about the dev mailing list