[ovs-dev] [OVN] Potential scalability bug in ovn-northd on creating and binding large number of lports

Ben Pfaff blp at ovn.org
Sat Jun 25 03:56:04 UTC 2016


On Fri, Jun 24, 2016 at 08:52:07PM -0700, Ben Pfaff wrote:
> On Thu, Jun 23, 2016 at 01:56:59PM -0400, Hui Kang wrote:
> > 
> > Hi,
> > In our scalability test for OVN, we observed non-scalable behaviour in
> > the ovn-northd process: the time to bind a logical port increases as the
> > number of ports grows, regardless of whether the logical ports belong to
> > the same logical switch. The most suspicious function is build_ports(),
> > called by ovnnb_db_run() [1], as described below.
> > 
> > Test description:
> >     step 1: Create 6 logical switches. For each logical switch, create 200
> >             logical ports.
> >     step 2: Bind 200 lports from each logical switch on an OVN chassis.
> > 
> > Test results for step 2:
> > 
> >     # of ports  |  # of ovn_ports            |  CPU cycles spent in      |
> >                 | allocated in build_ports() | build_ports() (millions)  |
> >             200 |                        200 |                     25    |
> >             400 |                        400 |                     50    |
> >             600 |                        600 |                     75    |
> >             800 |                        800 |                     93    |
> >            1000 |                       1000 |                    108    |
> >            1200 |                       1200 |                    125    |
> 
> I'm surprised that this is expensive for so few ports.  I believe that
> build_ports() runs in O(n) time where n is the larger of the number of
> ports in the northbound and southbound databases.  Does anyone see
> anything that would cause quadratic or worse behavior there?

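Actually, I take that back.  The per-port cost for all the cases above
shows only slightly nonlinear scaling: 200/25 is 8 ports/Mcycle,
1200/125 is 9.6 ports/Mcycle, so the cost per port actually drops a
little (from about 125k to about 104k cycles) as the port count grows.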
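
For anyone who wants to check the per-port arithmetic, here is a small
Python sketch over the numbers in the table.  The helper names are made
up for illustration; only the (ports, Mcycles) pairs come from the
reported results.

```python
# (ports bound, Mcycles reported for build_ports()) taken from the table above.
MEASUREMENTS = [(200, 25), (400, 50), (600, 75),
                (800, 93), (1000, 108), (1200, 125)]


def cycles_per_port(ports, mcycles):
    """Average CPU cycles spent per port for one table row."""
    return mcycles * 1_000_000 / ports


def ports_per_mcycle(ports, mcycles):
    """Ports handled per million cycles for one table row."""
    return ports / mcycles


if __name__ == "__main__":
    for ports, mcycles in MEASUREMENTS:
        print(f"{ports:5d} ports: "
              f"{cycles_per_port(ports, mcycles):9.0f} cycles/port, "
              f"{ports_per_mcycle(ports, mcycles):4.1f} ports/Mcycle")
```

If the cost were quadratic, cycles/port would grow roughly in proportion
to the port count; here it stays within the same order of magnitude
across a 6x increase in ports.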

So the issue is not that it does not scale.  The issue is that it is
slow.


