[ovs-dev] [PATCH v4] ofproto-dpif-xlate: Implement RCU locking in ofproto-dpif-xlate.

Ryan Wilson wryan at vmware.com
Tue May 20 18:04:02 UTC 2014


Per Alex's request, I ran a 10K internal port creation test (using batches of 500 ports at a time via ovs-vsctl) on my 8 GB memory machine. Again, RCU was slightly faster:

master: real 3m28.301s
with RCU: real 3m21.489s

Also, the reason I don't simply batch the creation of all ports together via the -- separator in ovs-vsctl is that, if I do, the message to the xlate module will contain all 1000 ports, meaning there will only be 1 copy of the configuration in memory. When the creation of ports is not batched, this creates 1000 different messages to the xlate module, and thus 1000 copies of the configuration in memory, which should stress memory usage. I ran a test in my previous email that details this behavior more specifically, but I know I've sent a lot of emails, so here's the gist of it.

Ryan

From: Ryan Wilson <wryan at vmware.com>
Date: Monday, May 19, 2014 9:59 PM
To: Ben Pfaff <blp at nicira.com>
Cc: Ryan Wilson <wryan at nicira.com>, "dev at openvswitch.org" <dev at openvswitch.org>
Subject: Re: [ovs-dev] [PATCH v4] ofproto-dpif-xlate: Implement RCU locking in ofproto-dpif-xlate.

And sorry for the email spam, but here's a test I did earlier with the RCU patch when I first finished it. I sent the email to Alex, but unfortunately not to the rest of the list. This may give some insight into the case where this patch is the most helpful.

----

So I performance-tested this, and most cases showed about the same performance. However, I saw a significant performance increase while doing the following test:

- Establish 100 netperf TCP_CRR connections to sink (as you do in your start script)
- Sleep for 20 seconds (to let the connections get established and OVS get started)
- Run the following commands in an indefinite loop:

    ovs-vsctl add-br br0 -- add-br br1
    ovs-vsctl set bridge br1 datapath-type=dummy \
        other-config:hwaddr=aa:55:aa:56:00:00 -- \
        add-port br1 p11 -- set Interface p11 type=patch \
        options:peer=p00 -- \
        add-port br0 p00 -- set Interface p00 type=patch \
        options:peer=p11 --
    ovs-vsctl set Interface p00 bfd:enable=true -- \
        set Interface p11 bfd:enable=true
    sleep 1
    ovs-vsctl del-br br0 -- del-br br1

Notice how I don't chain the commands together. This is because if I do so, the new config is batched into 1 message to the ofproto-dpif-xlate layer, meaning only 1 acquisition of the global xlate_rwlock worth of delay in master. So if I batch commands together, there is no significant performance hit.

However, when I don't chain commands together (i.e. I use 3 separate ovs-vsctl commands), in master this means 3 separate messages between ofproto and ofproto-dpif-xlate, meaning 3 acquisitions of the global xlate_rwlock. This can add up to a bunch of delay! Hence this is where we see the real improvement for RCU.

Some numbers for about 10000 interim results of the netperf processes (in trans/s):

RCU:
Mean: 84.591932
Median: 83.405000

Master:
Mean: 78.528627
Median: 70.550000

It's not huge, but if we add more ovs-vsctl commands, I'd imagine we'd see more improvement. Not sure if this is a valid use case, but these are my findings so far.

Ryan Wilson
Member of Technical Staff
wryan at vmware.com
3401 Hillview Avenue, Palo Alto, CA
650.427.1511 Office
916.588.7783 Mobile

On May 19, 2014, at 9:56 PM, Ryan Wilson <wryan at vmware.com> wrote:

Sorry Gurucharan, totally forgot to answer your question!

Interspersing these tests with random reloads of the kernel module doesn't appear to affect the times in any significant way.

Ryan

On May 19, 2014, at 9:53 PM, Ryan Wilson <wryan at vmware.com> wrote:

So I did an experiment where I added 500 and 1000 ports and then deleted 500 and 1000 ports, with and without this patch, on machines with 8 GB and 62 GB of memory. Weirdly enough, adding / deleting ports with the RCU patch turned out to actually be faster than without. My only explanation here is that taking the global xlate lock is expensive, and / or 500 ports wasn't enough to induce memory pressure.

Here are the numbers for the 500 port case on an 8 GB memory machine:
With RCU patch:
Adding ports: real 1m15.850s
Deleting ports: real 1m21.830s

Without RCU patch:
Adding ports: real 1m28.357s
Deleting ports: real 1m33.277s

Ryan

On May 19, 2014, at 8:56 AM, Ben Pfaff <blp at nicira.com> wrote:

On Fri, May 16, 2014 at 06:59:02AM -0700, Ryan Wilson wrote:
Before, a global read-write lock protected the ofproto-dpif / ofproto-dpif-xlate
interface. Handler and revalidator threads had to wait while configuration was
being changed. This patch implements RCU locking which allows handlers and
revalidators to operate while configuration is being updated.

Signed-off-by: Ryan Wilson <wryan at nicira.com>
Acked-by: Alex Wang <alexw at nicira.com>

One side effect of this change that I am a bit concerned about is
performance of configuration changes.  In particular, it looks like
removing a port requires copying the entire configuration and that
removing N ports requires copying the entire configuration N times.  Can
you try a few experiments with configurations that have many ports,
maybe 500 or 1000, and see how long it takes to remove several of them?
_______________________________________________
dev mailing list
dev at openvswitch.org
http://openvswitch.org/mailman/listinfo/dev




