[ovs-git] Open vSwitch: netlink-socket: Work around upstream kernel Netlink bug. (master)

dev at openvswitch.org
Wed Jul 2 21:04:22 UTC 2014


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "Open vSwitch".

The branch, master has been updated
       via  8f20fd98db14404ea7c396ca91c5b29f3b20769f (commit)
       via  38206499970aee0b1231f0c4c851d974d79d96a8 (commit)
      from  6ba531aa7f744a4a3eac5c138fac65b66500e17c (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 8f20fd98db14404ea7c396ca91c5b29f3b20769f
Diffs: http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commitdiff;h=8f20fd98db14404ea7c396ca91c5b29f3b20769f
Author: Ben Pfaff <blp at nicira.com>
		
netlink-socket: Work around upstream kernel Netlink bug.
		
The upstream kernel net/netlink/af_netlink.c netlink_recvmsg() contains the
following code to refill the Netlink socket buffer with more dump skbs
while a dump is in progress:

	if (nlk->cb && atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf / 2) {
		ret = netlink_dump(sk);
		if (ret) {
			sk->sk_err = ret;
			sk->sk_error_report(sk);
		}
	}

The netlink_dump() function that this calls returns a negative number on
error, the convention used throughout the kernel, and thus sk->sk_err
receives a negative value on error.

However, sk->sk_err is supposed to contain either 0 or a positive errno
value, as one can see from a quick "grep" through net for 'sk_err =', e.g.:

    ipv4/tcp.c:2067:		sk->sk_err = ECONNRESET;
    ipv4/tcp.c:2069:		sk->sk_err = ECONNRESET;
    ipv4/tcp_input.c:4106:		sk->sk_err = ECONNREFUSED;
    ipv4/tcp_input.c:4109:		sk->sk_err = EPIPE;
    ipv4/tcp_input.c:4114:		sk->sk_err = ECONNRESET;
    netlink/af_netlink.c:741:			sk->sk_err = ENOBUFS;
    netlink/af_netlink.c:1796:			sk->sk_err = ENOBUFS;
    packet/af_packet.c:2476:		sk->sk_err = ENETDOWN;
    unix/af_unix.c:341:			other->sk_err = ECONNRESET;
    unix/af_unix.c:407:				skpair->sk_err = ECONNRESET;

The result is that the next attempt to receive from the socket will return
the error to userspace with the wrong sign: a positive return value that
looks like a byte count, instead of a failure with errno set.
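
To make the sign problem concrete, here is a minimal userspace model of it.
This is only a sketch, not kernel code: the function names are hypothetical,
and it assumes that a pending socket error is consumed by clearing the slot
and returning its value negated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Buggy store, as in the snippet quoted above: the negative return value of
 * the dump function is written to the pending-error slot unchanged. */
static void store_dump_error_unchanged(int *sk_err, int ret)
{
    *sk_err = ret;
}

/* Hypothetical store that follows the positive-errno convention instead. */
static void store_dump_error_negated(int *sk_err, int ret)
{
    *sk_err = -ret;
}

/* Simplified model of consuming a pending socket error: clear the slot and
 * return its value negated.  With a positive errno stored, the caller sees a
 * negative error; with a negative value stored, the caller sees a positive
 * number that is indistinguishable from a byte count. */
static int model_take_error(int *sk_err)
{
    int err;

    assert(sk_err != NULL);
    err = *sk_err;
    *sk_err = 0;
    return -err;
}
```

In this model, storing -EINVAL unchanged makes the next read of the slot yield
+EINVAL, which a receive path would hand to userspace as a successful length.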

(The root of the error in this case is that multiple threads are attempting
to read a single flow dump from a shared fd.  That should work, but the
kernel has an internal race that can result in one or more of those threads
hitting the EINVAL case at the start of netlink_dump().  The EINVAL is
harmless in this case and userspace should be able to ignore it, but
reporting the EINVAL (errno 22) as if it were a 22-byte message received in
userspace throws a real wrench in the works.)
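
One way userspace can tell a real message from an error reported as a byte
count is sketched below.  The diff is not included in this email, so this is
not necessarily what the commit does.  The sketch assumes that the kernel
leaves the receive buffer untouched when it hits this bug, and it relies on
nlmsg_len being the first 32-bit field of struct nlmsghdr.  The helper names
classify_recv() and recv_netlink() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Turns a recvmsg() result into 0 or a positive errno value.
 * 'len_after' is the leading 32-bit length field of the buffer as read back
 * after the call; UINT32_MAX means the sentinel was never overwritten. */
static int classify_recv(ssize_t retval, int saved_errno, uint32_t len_after)
{
    if (retval < 0) {
        return saved_errno;         /* Ordinary failure. */
    } else if (retval == 0) {
        return ECONNRESET;          /* Not expected; arbitrary choice here. */
    } else if (len_after != UINT32_MAX) {
        return 0;                   /* Buffer was written: a real message. */
    } else {
        return (int) retval;        /* Error code disguised as a length. */
    }
}

/* Receives one message into 'buf'.  Returns 0 and stores the byte count in
 * '*nbytes' on success, otherwise returns a positive errno value. */
static int recv_netlink(int fd, void *buf, size_t size, ssize_t *nbytes)
{
    const uint32_t sentinel = UINT32_MAX;   /* Impossible message length. */
    struct iovec iov = { .iov_base = buf, .iov_len = size };
    struct msghdr msg;
    uint32_t len_after;
    ssize_t retval;
    int error;

    assert(size >= sizeof sentinel);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    do {
        memcpy(buf, &sentinel, sizeof sentinel);
        retval = recvmsg(fd, &msg, 0);
        memcpy(&len_after, buf, sizeof len_after);
        error = classify_recv(retval, errno, len_after);
    } while (error == EINTR);

    *nbytes = error ? 0 : retval;
    return error;
}
```

With this approach a caller would see EINVAL as an ordinary error from
recv_netlink() and could decide to ignore it and retry, rather than trying to
parse 22 bytes that were never written.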

This bug makes me think that there are probably not many programs doing
multithreaded Netlink dumps.  Maybe it is good that we are considering
other approaches.

VMware-BZ: #1255704
Reported-by: Mihir Gangar <gangarm at vmware.com>
Signed-off-by: Ben Pfaff <blp at nicira.com>
Acked-by: Alex Wang <alexw at nicira.com>


commit 38206499970aee0b1231f0c4c851d974d79d96a8
Diffs: http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=commitdiff;h=38206499970aee0b1231f0c4c851d974d79d96a8
Author: Ben Pfaff <blp at nicira.com>
		
INSTALL: Mention conflict with NET_IPGRE setting before Linux 3.11.
		
I found when reconfiguring my kernel that if I turned on NET_IPGRE, GRE
tunnels no longer worked.

Signed-off-by: Ben Pfaff <blp at nicira.com>
Acked-by: Jesse Gross <jesse at nicira.com>


-----------------------------------------------------------------------

Summary of changes:
 AUTHORS              |    1 +
 INSTALL              |    6 ++++--
 lib/netlink-socket.c |   21 ++++++++++++++++-----
 3 files changed, 21 insertions(+), 7 deletions(-)


hooks/post-receive
-- 
Open vSwitch


