[ovs-dev] [PATCHv2 0/5] Tunnel Scalability (was Global netdev change_seq)

Joe Stringer joestringer at nicira.com
Thu Nov 14 23:28:24 UTC 2013


When dealing with a large number of ports, a major performance bottleneck is
the set of places in the main loop that iterate over every port each time the
thread wakes up. These loops typically involve repeated lock acquisition and
release, and polling of attributes that have not changed.

This patchset shifts change_seq out from per-netdevice to a single global
'struct seq' tracking all interface changes. This is updated on changes to
netdev attributes, bfd, cfm, lacp and stp status. We can use this new tracker
to skip O(N) loops over devices in the following areas:

- ofproto: Skip updating ofports if no netdevs have changed
--> ofproto_run()

- ofproto-dpif: Only run bundles when lacp/bonds are enabled and devices change
--> run()
--> wait()

- bridge: Skip unchanging or unused instant_stats
--> instant_stats_run()

For the most part, these improve the average case by dynamically disabling
unused functionality. If you're intense enough to run thousands of ports, with
bonds, lacp, cfm, bfd and stp on a single bridge, in an environment where links
flap more than 10 times a second, then this patchset will do very little. The
fewer of these conditions that hold, the better your average CPU usage will be.

All tunnels in the following tests are configured with bfd enabled. When
running 5,000 internal ports and 50 tunnels, this reduces average CPU usage of
the main thread from ~40% to ~5%. In stable conditions, each additional 100
tunnels causes CPU usage to increase by ~1%, up to around 90% for 10K tunnels.
Under constantly changing network conditions*, main thread avg CPU usage
appears to drop to around 65% for 10K tunnels, which is traded off for higher
usage in the monitor thread.

This patchset ties netdev changes and cfm/bfd/lacp/stp changes to the same
global seq; I have not investigated using a separate seq for these two areas,
but can do so if that is preferred. Further improvements can be achieved by
disabling coverage_run(), and by not writing port statistics to the database. I
intend to investigate how these can be implemented, as they provide an
improvement of ~10% each in the 10K tunnel case.

* 100 tunnels go down on a remote host, 100-200ms apart, then come back up at
  the same interval. Measured as avg CPU usage, sampled every 1sec for 20sec.
  bfd min_rx,min_tx set to 500.

Joe Stringer (5):
  netdev: Globally track port status changes
  ofproto-dpif: Don't poll ports when nothing changes
  ofproto-dpif: Only run bundles when lacp or bonds are enabled
  bridge: Only store instant_stats on device changes
  bridge: Only update instant_stats for active protocols

 lib/bfd.c                  |    4 +++
 lib/bond.c                 |    4 +--
 lib/cfm.c                  |    7 +++++
 lib/lacp.c                 |    3 ++
 lib/netdev-bsd.c           |   22 +++------------
 lib/netdev-dummy.c         |   25 ++---------------
 lib/netdev-linux.c         |   23 ++--------------
 lib/netdev-provider.h      |   23 ++++++++--------
 lib/netdev-vport.c         |   20 ++++----------
 lib/netdev.c               |   38 +++++++++++++++++++-------
 lib/netdev.h               |    5 +++-
 ofproto/ofproto-dpif.c     |   41 +++++++++++++++++++++-------
 ofproto/ofproto-provider.h |    3 +-
 ofproto/ofproto.c          |   65 +++++++++++++++++++++++++-------------------
 ofproto/ofproto.h          |   12 ++++++++
 ofproto/tunnel.c           |    4 +--
 vswitchd/bridge.c          |   56 ++++++++++++++++++++++++++------------
 17 files changed, 195 insertions(+), 160 deletions(-)

-- 
1.7.9.5
