[ovs-dev] Threaded userspace datapath

Giuseppe Lettieri g.lettieri at iet.unipi.it
Fri Aug 10 16:36:46 UTC 2012


On 09 Aug 2012, at 22:56, Ben Pfaff <blp at nicira.com> wrote:

> I'm very interested in seeing some up-to-date performance numbers
> because introducing threads definitely raises the bar when it comes to
> the need for careful programming, and that's only worthwhile if there
> is a correspondingly high payoff.

The setup I used was the following:

ip tuntap add tap0 mode tap
ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
ovs-vsctl add-port br0 tap0
ifconfig br0 10.0.0.1/24
arp -s 10.0.0.2 00:01:02:03:04:05 

./tapreceive tap0  # reads and discards packets from the fd endpoint of tap0
./netsend 10.0.0.2 5678 60 0 5

netsend is a small program from FreeBSD. When called with those parameters it repeatedly sends 60-byte UDP packets to 10.0.0.2, port 5678, for 5 seconds, then prints the packets-per-second rate and some other measurements.

The CPU is a 3.2 GHz i7 with 6 cores and 12 threads, running Linux 3.3.8. 

With the original ovs-vswitchd I obtain ~26 Kpps; with the threaded ovs-vswitchd I obtain ~270 Kpps, i.e. roughly a tenfold improvement.

A similar speed-up, but with different (higher) numbers, was obtained by Catalli and Rizzo (in CC) on FreeBSD with real NICs.

The datapath thread goes to 100% CPU in my tests, but this is expected, since everything is done in software and no real I/O is performed. The main thread goes to ~65%, which is less expected (at least, I did not expect it), and we should investigate why.


Anyway, we should consider the following (Luigi may also say something more precise on this, or even correct me if I am wrong): most of the performance benefit comes from two things that are unrelated to threading:

1) use the information returned by the poll syscall to read from ready ports only;
2) process several packets per port (if available) before reentering the poll syscall.   

The thread is there mainly to offer a convenient place to do these things without rewriting the main application loop.
Restructuring the main loop is another possibility, which should be weighed against the datapath thread(s) and the separate process(es) solutions.
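
As a rough illustration of points 1) and 2), here is a minimal sketch of one loop iteration. It is not the code from the patch: service_ports, set_nonblocking, count_bytes, PKT_BUF_SIZE and the batch parameter are all hypothetical names chosen for this example, and it assumes each port is a non-blocking file descriptor that returns one packet per read(), as a tap fd does.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define PKT_BUF_SIZE 2048   /* hypothetical maximum frame size */

/* Hypothetical per-packet callback; stands in for the datapath processing. */
typedef void (*packet_handler)(int port_fd, const unsigned char *pkt, size_t len);

/* Trivial example handler: only accumulates the number of bytes seen. */
size_t total_bytes;
void count_bytes(int port_fd, const unsigned char *pkt, size_t len)
{
    (void)port_fd;
    (void)pkt;
    total_bytes += len;
}

/* Put a port's fd in non-blocking mode so that draining it never stalls. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* One loop iteration: poll all ports, then read up to 'batch' packets from
 * each port that poll() reported as readable.  Returns the number of packets
 * handled, or -1 if poll() itself fails. */
int service_ports(struct pollfd *ports, nfds_t n_ports, int batch,
                  int timeout_ms, packet_handler handle)
{
    unsigned char buf[PKT_BUF_SIZE];
    int handled = 0;

    assert(batch > 0);
    if (poll(ports, n_ports, timeout_ms) < 0)
        return -1;

    for (nfds_t i = 0; i < n_ports; i++) {
        /* Point 1: skip ports that poll() did not report as readable. */
        if (!(ports[i].revents & POLLIN))
            continue;
        /* Point 2: take several packets before going back to poll(). */
        for (int k = 0; k < batch; k++) {
            ssize_t len = read(ports[i].fd, buf, sizeof buf);
            if (len <= 0)
                break;      /* drained (EAGAIN), end of file, or error */
            handle(ports[i].fd, buf, (size_t)len);
            handled++;
        }
    }
    return handled;
}
```

The batch limit keeps one busy port from starving the others. A caller would fill the pollfd array with the port fds and events = POLLIN, then call service_ports() repeatedly, either from a dedicated thread or from a restructured main loop.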

Giuseppe


Dr. Ing. Giuseppe Lettieri
Dipartimento di Ingegneria della Informazione
Universita' di Pisa
Largo Lucio Lazzarino 2, 56122 Pisa - Italy
Ph. : (+39) 050-2217.649 (direct) .599 (switch)
Fax : (+39) 050-2217.600
e-mail: g.lettieri at iet.unipi.it



