[ovs-dev] Threaded userspace datapath
Ed Maste
emaste at freebsd.org
Thu Aug 9 19:25:58 UTC 2012
On 9 August 2012 11:56, Ben Pfaff <blp at nicira.com> wrote:
> I'm curious about the performance improvement. You mentioned a 10x
> performance improvement. How much did CPU usage increase as part of
> that? We already have users who complain when CPU usage spikes to 100%;
> I'm not sure that users will be happy if CPU usage can spike to 1000%
> (!).
Giuseppe did that basic benchmarking on Linux with the threaded
userspace patch and may have observed CPU usage statistics during that
work. I don't have current numbers on the FreeBSD port, but will get
some soon. Performance improvements are especially important on FreeBSD,
of course, because the userspace datapath is the only one we have.
> Oh I see, it looks like there is only one thread that does all packet
> processing? I had a notion that there would be many threads, for
> example on a per-datapath or per-port basis. It's very impressive
> getting 10x improvement with only a single additional thread.
Eventually I think a thread per datapath may be the way to go, and I
think the model in this patch can be reasonably extended to
per-datapath.
> I looked at the repository and then at the diff, both very briefly. The
> repository looks more like development notes than something mergeable;
> the diff was easier to read.
Yes, the repo started from a snapshot patch against 1.1.0pre2 and has
evolved over time. To get to something mergeable I'd expect to roll
it all up, or refactor into two or three committable pieces.
> One early reaction to the code is that there are too many #ifdefs
> (mainly for locking and unlocking mutexes). I think that we should be
> able to remove many of them with a few well-chosen macros.
Indeed; the original 1.1.0pre2 work included this:
/* We could use these macros instead of using #ifdef and #endif every time we
* need to call the pthread_mutex_lock/unlock. */
#ifdef THREADED
#define LOCK(mutex) pthread_mutex_lock(mutex)
#define UNLOCK(mutex) pthread_mutex_unlock(mutex)
#else
#define LOCK(mutex)
#define UNLOCK(mutex)
#endif
but then never used those macros. I can switch it over to this scheme
if this is an agreeable approach (or define macros for each individual
lock, e.g. TABLE_LOCK and PORT_LOCK).
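For illustration, here is a rough sketch of what the per-lock variant might look like. This is not code from the patch: table_mutex, flow_count, table_add_flow() and the TABLE_UNLOCK name are hypothetical, and only THREADED and TABLE_LOCK come from the discussion above. The point is that the #ifdef appears once, where the macros are defined, and call sites stay unconditional.

```c
#include <pthread.h>
#include <stddef.h>

/* THREADED is the build-time switch from the snippet above. The mutex
 * and the TABLE_UNLOCK name are assumptions made for this sketch. */
#ifdef THREADED
static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;
#define TABLE_LOCK()   pthread_mutex_lock(&table_mutex)
#define TABLE_UNLOCK() pthread_mutex_unlock(&table_mutex)
#else
/* Expand to a no-op statement so call sites still parse cleanly. */
#define TABLE_LOCK()   ((void) 0)
#define TABLE_UNLOCK() ((void) 0)
#endif

/* Hypothetical shared state standing in for the flow table. */
static size_t flow_count;

/* Call site with no #ifdef: the macros vanish in non-threaded builds. */
size_t table_add_flow(void)
{
    size_t n;

    TABLE_LOCK();
    n = ++flow_count;
    TABLE_UNLOCK();
    return n;
}
```

The same function body compiles with or without -DTHREADED, which is the property the macro approach is after.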
> OVS recently got a library, lib/worker.c, for implementing code in a
> "worker process". So far, we're using that in a single-worker-process
> model, mostly for asynchronous logging, but we could generalize it so
> that there could be a second worker process used for datapath
> implementation. If this would achieve acceptable performance, then it
> would avoid the difficulties of threads. I guess the question there is
> how much communication a process model would require versus a thread
> model.
Right now packets that had a flow lookup miss are the only data
flowing from the datapath thread to the main thread, so that should be
easily done in either a thread or process model. In the other
direction the datapath thread needs access to the flow table and I
think this would be a bit more awkward to implement in a process
model. (On the other hand, a userland datapath process may be a
decent analogue to the Linux kernel module.)
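To make the datapath-to-main direction concrete, here is a minimal sketch of a hand-off queue for missed packets in the thread model. It is only an illustration under my own assumptions: struct miss_queue, MISS_QUEUE_LEN and the miss_queue_*() functions are hypothetical names, not what the patch uses, and packets are represented as opaque pointers.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MISS_QUEUE_LEN 64          /* hypothetical capacity */

/* Fixed-size ring of packet pointers guarded by one mutex. */
struct miss_queue {
    pthread_mutex_t mutex;
    const void *pkts[MISS_QUEUE_LEN];
    size_t head;                   /* index of the next entry to pop */
    size_t count;                  /* number of queued entries */
};

void miss_queue_init(struct miss_queue *q)
{
    pthread_mutex_init(&q->mutex, NULL);
    q->head = 0;
    q->count = 0;
}

/* Datapath thread side: queue a packet that missed the flow table.
 * Returns false when the ring is full so the caller can drop it. */
bool miss_queue_push(struct miss_queue *q, const void *pkt)
{
    bool ok = false;

    pthread_mutex_lock(&q->mutex);
    if (q->count < MISS_QUEUE_LEN) {
        q->pkts[(q->head + q->count) % MISS_QUEUE_LEN] = pkt;
        q->count++;
        ok = true;
    }
    pthread_mutex_unlock(&q->mutex);
    return ok;
}

/* Main thread side: take the oldest missed packet, or NULL if none. */
const void *miss_queue_pop(struct miss_queue *q)
{
    const void *pkt = NULL;

    pthread_mutex_lock(&q->mutex);
    if (q->count > 0) {
        pkt = q->pkts[q->head];
        q->head = (q->head + 1) % MISS_QUEUE_LEN;
        q->count--;
    }
    pthread_mutex_unlock(&q->mutex);
    return pkt;
}
```

In a process model the same one-way hand-off would presumably go over a pipe or socket instead of shared memory. The flow table access in the other direction is the part with no equally simple counterpart.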
-Ed