[ovs-dev] [PATCH] netlink-socket: Increase Netlink socket receive buffer size.
Ben Pfaff
blp at nicira.com
Fri Mar 16 04:15:45 UTC 2012
On Thu, Mar 15, 2012 at 08:45:07PM -0700, Justin Pettit wrote:
> On Mar 15, 2012, at 8:30 PM, Ben Pfaff wrote:
>
> > +#ifndef SO_RCVBUFFORCE
> > +#define SO_RCVBUFFORCE 33
> > +#endif
>
> It seems like on some platforms, this is defined to be other values.
> For example, this appears to be the definition on SPARC:
>
> #define SO_RCVBUFFORCE 0x100b
Oops. Thanks for the scrutiny.
I looked into more closely into the problem I was seeing with this
definition and figured out that, in fact, it was a missing definition in
the pseudo-headers that we feed to "sparse". I've appended a revised
commit that avoids the #ifndef.
> > @@ -122,6 +126,13 @@ nl_sock_create(int protocol, struct nl_sock **sockp)
> > sock->protocol = protocol;
> > sock->dump = NULL;
> >
> > + rcvbuf = 1024 * 1024;
> > + if (setsockopt(sock->fd, SOL_SOCKET, SO_RCVBUFFORCE,
> > + &rcvbuf, sizeof rcvbuf)) {
> > + VLOG_WARN_RL(&rl, "setting %d-byte socket receive buffer failed (%s)",
> > + rcvbuf, strerror(errno));
> > + }
>
> It appears that this increase will happen for anything created with
> nl_sock_create(), of which there are a few things now. Would it make
> sense to make this an optional argument to nl_sock_create()? (Not
> necessarily right at this moment, of course.)
Yes, it would make sense to do this.
--8<--------------------------cut here-------------------------->8--
From: Ben Pfaff <blp at nicira.com>
Date: Thu, 15 Mar 2012 21:12:31 -0700
Subject: [PATCH] netlink-socket: Increase Netlink socket receive buffer size.
Open vSwitch userspace can set up flows at a high rate, but it is somewhat
"bursty" in opportunities to set up flows, by which I mean that OVS sets up
a batch of flows, then goes off and does some other work for a while, then
sets up another batch of flows, and so on. The result is that, if a large
number of packets that need flow setups come in all at once, then some of
them can overflow the relatively small kernel-to-user buffers.
This commit increases the kernel-to-user buffers from the default of
approximately 120 kB each to 1 MB each. In one somewhat synthetic test
case that I ran based on an "hping3" that generated a load of about 20,000
new flows per second (including both requests and replies), this reduced
the packets dropped at the kernel-to-user interface from about 30% to none.
I expect that it will similarly improve packet loss in workloads where
flow arrival is not easily predictable.
(This has little effect on workloads generated by "ovs-benchmark rate"
because that benchmark is effectively "self-clocking", that is, a new flow
is triggered only by a reply to a request made earlier, which means that
the number of buffered packets at any given has a known, constant upper
limit.)
Bug #10210.
Signed-off-by: Ben Pfaff <blp at nicira.com>
---
include/sparse/sys/socket.h | 5 +++--
lib/netlink-socket.c | 10 +++++++++-
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/include/sparse/sys/socket.h b/include/sparse/sys/socket.h
index 89e3c2d..1ed195b 100644
--- a/include/sparse/sys/socket.h
+++ b/include/sparse/sys/socket.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2011 Nicira Networks.
+ * Copyright (c) 2011, 2012 Nicira Networks.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -74,7 +74,8 @@ enum {
SO_SNDBUF,
SO_SNDLOWAT,
SO_SNDTIMEO,
- SO_TYPE
+ SO_TYPE,
+ SO_RCVBUFFORCE
};
enum {
diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c
index bc46235..df6f1d8 100644
--- a/lib/netlink-socket.c
+++ b/lib/netlink-socket.c
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2008, 2009, 2010, 2011 Nicira Networks.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012 Nicira Networks.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -89,6 +89,7 @@ nl_sock_create(int protocol, struct nl_sock **sockp)
struct nl_sock *sock;
struct sockaddr_nl local, remote;
socklen_t local_size;
+ int rcvbuf;
int retval = 0;
if (!max_iovs) {
@@ -122,6 +123,13 @@ nl_sock_create(int protocol, struct nl_sock **sockp)
sock->protocol = protocol;
sock->dump = NULL;
+ rcvbuf = 1024 * 1024;
+ if (setsockopt(sock->fd, SOL_SOCKET, SO_RCVBUFFORCE,
+ &rcvbuf, sizeof rcvbuf)) {
+ VLOG_WARN_RL(&rl, "setting %d-byte socket receive buffer failed (%s)",
+ rcvbuf, strerror(errno));
+ }
+
retval = get_socket_rcvbuf(sock->fd);
if (retval < 0) {
retval = -retval;
--
1.7.2.5
More information about the dev
mailing list