[ovs-dev] Threaded userspace datapath

Ben Pfaff blp at nicira.com
Thu Aug 9 20:56:47 UTC 2012


On Thu, Aug 09, 2012 at 03:25:58PM -0400, Ed Maste wrote:
> On 9 August 2012 11:56, Ben Pfaff <blp at nicira.com> wrote:
> 
> > I'm curious about the performance improvement.  You mentioned a 10x
> > performance improvement.  How much did CPU usage increase as part of
> > that?  We already have users who complain when CPU usage spikes to 100%;
> > I'm not sure that users will be happy if CPU usage can spike to 1000%
> > (!).
> 
> Giuseppe did that basic benchmarking on Linux with the threaded
> userspace patch and may have observed CPU usage statistics during that
> work.  I don't have current numbers on the FreeBSD port, but will get
> some soon.  Performance improvements are very important on FreeBSD of
> course because we have only the userspace datapath.

Yes, I understand that, and it makes sense.

I'm very interested in seeing some up-to-date performance numbers
because introducing threads definitely raises the bar when it comes to
the need for careful programming, and that's only worthwhile if there
is a correspondingly high payoff.

> > Oh I see, it looks like there is only one thread that does all packet
> > processing?  I had a notion that there would be many threads, for
> > example on a per-datapath or per-port basis.  It's very impressive
> > getting 10x improvement with only a single additional thread.
> 
> Eventually I think a thread per datapath may be the way to go, and I
> think the model in this patch can be reasonably extended to
> per-datapath.

OK.

> > I looked at the repository and then at the diff, both very briefly.  The
> > repository looks more like development notes than something mergeable;
> > the diff was easier to read.
> 
> Yes, the repo started from a snapshot patch against 1.1.0pre2 and has
> evolved over time.  To get to something mergeable I'd expect to roll
> it all up, or refactor into two or three committable pieces.

Yes, we're on the same page.

> > One early reaction to the code is that there are too many #ifdefs
> > (mainly for locking and unlocking mutexes).  I think that we should be
> > able to remove many of them with a few well-chosen macros.
> 
> Indeed; the original 1.1.0pre2 work included this:
> 
> /* We could use these macros instead of using #ifdef and #endif every time we
>  * need to call the pthread_mutex_lock/unlock.
> #ifdef THREADED
> #define LOCK(mutex) pthread_mutex_lock(mutex)
> #define UNLOCK(mutex) pthread_mutex_unlock(mutex)
> #else
> #define LOCK(mutex)
> #define UNLOCK(mutex)
> #endif
> 
> but then never used those macros.  I can switch it over to this scheme
> if this is an agreeable approach (or define macros for each individual
> lock, e.g. TABLE_LOCK and PORT_LOCK).

Those original macros seem OK to me.
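
As a rough illustration of what a call site could look like with macros
of that kind, here is a minimal sketch.  It is not code from the patch:
flow_table_add(), table_mutex, and n_flows are hypothetical names, and
the non-threaded variants expand to a cast to void (rather than to
nothing) only so that the mutex argument still counts as used and the
compiler does not warn about it.

```c
#include <pthread.h>

/* Locking macros in the style quoted above.  THREADED is assumed to be
 * defined by the build system when the datapath thread is enabled. */
#ifdef THREADED
#define LOCK(mutex)   pthread_mutex_lock(mutex)
#define UNLOCK(mutex) pthread_mutex_unlock(mutex)
#else
/* Single-threaded build: no locking, but still reference the argument. */
#define LOCK(mutex)   ((void) (mutex))
#define UNLOCK(mutex) ((void) (mutex))
#endif

/* Hypothetical shared state, standing in for the real flow table. */
static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int n_flows;

/* Hypothetical call site: no #ifdef is needed around the lock calls.
 * Returns the number of flows after the addition. */
static unsigned int
flow_table_add(void)
{
    unsigned int n;

    LOCK(&table_mutex);
    n = ++n_flows;
    UNLOCK(&table_mutex);
    return n;
}
```

Per-lock macros such as TABLE_LOCK could be layered on top of this, for
example as LOCK(&table_mutex), if that reads better at the call sites.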

> > OVS recently got a library, lib/worker.c, for implementing code in a
> > "worker process".  So far, we're using that in a single-worker-process
> > model, mostly for asynchronous logging, but we could generalize it so
> > that there could be a second worker process used for datapath
> > implementation.  If this would achieve acceptable performance, then it
> > would avoid the difficulties of threads.  I guess the question there is
> > how much communication a process model would require versus a thread
> > model.
> 
> Right now packets that had a flow lookup miss are the only data
> flowing from the datapath thread to the main thread, so that should be
> easily done in either a thread or process model.  In the other
> direction the datapath thread needs access to the flow table and I
> think this would be a bit more awkward to implement in a process
> model.  (On the other hand, a userland datapath process may be a
> decent analogue to the Linux kernel module.)

I'm provisionally OK with using a thread.  But there may be a lot of
work to do to make sure that it is safe.  If that somehow becomes a
burden, then it might prompt a look at a process-based solution.
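
To make the communication question concrete, the one-way flow of missed
packets from the datapath thread to the main thread could be sketched as
a small mutex-protected queue, as below.  This is only an assumption
about one possible shape, not what the patch does: struct miss_queue,
MISS_QUEUE_LEN, and the function names are all hypothetical, and packets
are represented as opaque pointers.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MISS_QUEUE_LEN 64           /* Hypothetical capacity. */

/* Fixed-size ring of packets that missed in the flow table. */
struct miss_queue {
    pthread_mutex_t mutex;
    const void *packets[MISS_QUEUE_LEN];
    size_t head;                    /* Index of the next packet to pop. */
    size_t n;                       /* Number of packets queued. */
};

static void
miss_queue_init(struct miss_queue *q)
{
    pthread_mutex_init(&q->mutex, NULL);
    q->head = 0;
    q->n = 0;
}

/* Called by the datapath thread on a flow lookup miss.  Returns false
 * if the queue is full, in which case the caller would drop the packet. */
static bool
miss_queue_push(struct miss_queue *q, const void *packet)
{
    bool ok;

    pthread_mutex_lock(&q->mutex);
    ok = q->n < MISS_QUEUE_LEN;
    if (ok) {
        q->packets[(q->head + q->n) % MISS_QUEUE_LEN] = packet;
        q->n++;
    }
    pthread_mutex_unlock(&q->mutex);
    return ok;
}

/* Called by the main thread.  Returns the oldest queued packet, or NULL
 * if the queue is empty. */
static const void *
miss_queue_pop(struct miss_queue *q)
{
    const void *packet = NULL;

    pthread_mutex_lock(&q->mutex);
    if (q->n > 0) {
        packet = q->packets[q->head];
        q->head = (q->head + 1) % MISS_QUEUE_LEN;
        q->n--;
    }
    pthread_mutex_unlock(&q->mutex);
    return packet;
}
```

A process model would need to replace this queue with some form of IPC,
and would additionally have to give the datapath process access to the
flow table, which is the harder direction.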


