[ovs-dev] Possible Regression in 'netdev: Add new "struct netdev_rx" for capturing packets from a netdev.'

Murphy McCauley murphy.mccauley at gmail.com
Wed Jun 12 14:11:50 UTC 2013


On Jun 12, 2013, at 6:28 AM, Ed Maste wrote:

> On 12 June 2013 07:04, Murphy McCauley <murphy.mccauley at gmail.com> wrote:
>> (Sorry this isn't an actual reply and is missing context -- I wasn't on the list when it was originally posted.)
>> 
>> Simon and I have been in touch about this, and I thought I'd share my findings for what they're worth.
>> 
>> The problem is from the commit Simon mentioned (796223f5bc3a4896e6398733c798390158479400).  Specifically, it's in netdev-linux.c in netdev_linux_send().
>> 
>> The new version always sends using the "sender" socket made by af_packet_sock() unless the interface is a tap, in which case it sends it using the tap fd.  This differs from the old version which sent using whatever was in the fd field of the netdev if it was available.  For tap interfaces, this was the tap fd, so the result was the same as it is now.  But for other interfaces, this held the socket opened for receiving if the interface was listening (which was maybe never "right" in some sense and isn't convenient anymore since this socket descriptor is no longer stored in the non-rx netdev).
>> 
>> The comments indicate that the exception is made for tap interfaces since writing to a tap interface with an AF_PACKET socket results in receiving the packet you just wrote.  However, I don't think this behavior is limited to taps.  Since the old version of the code sent and received with the same socket descriptor, I think the loop was fixed by the check in dev_queue_xmit_nit() in net/core/dev.c.  Since they're two different socket descriptors now, this no longer works and you get the loop.
> 
> Ahh, it turns out Ben explained this to me when I ran into a related
> issue with the FreeBSD userspace implementation.  Ben's message in the
> thread is at http://openvswitch.org/pipermail/dev/2012-July/018806.html
> .
> 
>> I fixed it (I think) by adding a BPF packet filter on the rx socket so that it only receives incoming packets.  There's probably a better fix, but you're welcome to the patch if you want it.
> 
> I think it's worth taking a look.


I think this is more or less against master.

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index d73115b..cc47a6b 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -744,6 +744,14 @@ netdev_linux_rx_open(struct netdev *netdev_, struct netdev_rx **rxp)
     } else {
         struct sockaddr_ll sll;
         int ifindex;
+        /* Result of tcpdump -dd inbound */
+        static struct sock_filter filt[] = {
+            { 0x28, 0, 0, 0xfffff004 },
+            { 0x15, 0, 1, 0x00000004 },
+            { 0x6, 0, 0, 0x00000000 },
+            { 0x6, 0, 0, 0x0000ffff }
+        };
+        static struct sock_fprog fprog = {ARRAY_SIZE(filt), filt};
 
         /* Create file descriptor. */
         fd = socket(PF_PACKET, SOCK_RAW, 0);
@@ -776,6 +784,16 @@ netdev_linux_rx_open(struct netdev *netdev_, struct netdev_rx **rxp)
                      netdev_get_name(netdev_), strerror(error));
             goto error;
         }
+
+        /* Filter for only incoming packets. */
+        error = setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, &fprog,
+                           sizeof fprog);
+        if (error) {
+            error = errno;
+            VLOG_ERR("%s: failed attach filter (%s)",
+                     netdev_get_name(netdev_), strerror(error));
+            goto error;
+        }
     }
 
     rx = xmalloc(sizeof *rx);




More information about the dev mailing list