[ovs-dev] [OVN] Potential scalability bug in ovn-northd on creating and binding large number of lports

Ryan Moats rmoats at us.ibm.com
Sat Jun 25 12:44:14 UTC 2016


"dev" <dev-bounces at openvswitch.org> wrote on 06/24/2016 10:56:04 PM:

> From: Ben Pfaff <blp at ovn.org>
> To: Hui Kang/Watson/IBM at IBMUS
> Cc: dev at openvswitch.org
> Date: 06/24/2016 10:56 PM
> Subject: Re: [ovs-dev] [OVN] Potential scalability bug in ovn-northd
> on creating and binding large number of lports
> Sent by: "dev" <dev-bounces at openvswitch.org>
>
> On Fri, Jun 24, 2016 at 08:52:07PM -0700, Ben Pfaff wrote:
> > On Thu, Jun 23, 2016 at 01:56:59PM -0400, Hui Kang wrote:
> > >
> > > Hi,
> > > In our scalability test for OVN, we observed an in-scalable behaviour
of
> > > the
> > > ovn-northd process: the time binding a logical port increases as# of
large
> > > port increasing, regardless of whether logical ports belong to the
same
> > > logical
> > > switch. The most suspicious function in causing this issue is
> build_ports()
> > > called by ovnnb_db_run() [1], as described below.
> > >
> > > Test description:
> > >     step 1: Create 6 logical switches. For each logical switch,
create 200
> > >             logical ports.
> > >     step 2: Bind 200 lports from each logical switch on an OVN
chassis.
> > >
> > > Test results for step 2:
> > >
> > >     # of ports  |  # of ovn_ports            |  Cpu cycle spent in
|
> > >                 | allocated in build_port()  | built_port(), in
million  |
> > >             200 |                        200 |                    25
|
> > >             400 |                        400 |                    50
|
> > >             600 |                        600 |                    75
|
> > >             800 |                        800 |                    93
|
> > >            1000 |                       1000 |                   108
|
> > >            1200 |                       1200 |                   125
|
> >
> > I'm surprised that this is expensive for so few ports.  I believe that
> > build_ports() runs in O(n) time where n is the larger of the number of
> > ports in the northbound and southbound databases.  Does anyone see
> > anything that would cause quadratic or more regressive behavior there?
>
> Actually, I take that back.  The cycles/port for all the cases above
> demonstrate only slightly nonlinear scaling: 200/25 is 8 Mcycles/port,
> 1200/125 is 9.6 Mcycles/port.
>
> So the issue is not that it does not scale.  The issue is that it is
> slow.

Er? When I do the ratios, I come up with 125 Kcycles/port at 200 ports
going
down to slightly more than 104 Kcycles/port at 1200 ports, which is
slightly
sub-linear (and I do think that's a good thing).

However, I'm left wondering if it would be possible to make things even
better through judicial use of persistence and incremental processing.

Right now the ports logic looks to me like:
- Build a list of all ports known via port bindings in the sb db.
- For each port known via the nb db:
  - Look for the port in the sb list.
  - If found, move the port from the sb list to the both list
  - If not found, create a new entry in the nb_only list.
(After the above finishes, we have three lists: sb_only, nb_only, and both)
- For each entry in the both list, do modifications to align the port
  binding with nb information.
- For each entry in the nb_only list, create port_binding information in
  the sb db.
  [If I were updating the port lists, I'd move the port from the nb_only
  list to both list]
- For each entry in the sb_only list, remove from the port_binding table.
  [If I were updating the sb_only list, I'd remove it from the sb_only
  list]

I *think* if I were to consider persisting the sb_only, nb_only, and both
lists and follow the extra logic I've added in square brackets above, I'd
only have entries in the both list at the end of the calculation set, so I
should only need to persist the both table.

Further, I *think* if I were to then apply change tracking to the first
part of the process above, the logic changes to:

- For each tracked entry in the port bindings table
  - if it is a deleted entry, remove from the both list (if there is still
    a nb entry, we'll recreate it further on)
  - if it is a new entry, add it to the sb_only list
  - if it is a modified entry, find it in the both list and update the
    sb information contained in the entry
- For each port known via the nb db:
  - if the entry is found in the both list, update the nb data contained
    in the entry
  - if the entry is not in the both list, but is in the sb_only list,
    move the entry from the sb_list to the both list
  - if the entry is not in either the both or the sb_only list, create
    a new entry in the nb_only list
- For each entry in the both list, do modifications to align the port
  binding with nb information.
- For each entry in the nb_only list, create port_binding information in
  the sb db and move the entry from the nb_only to the both list
- For each entry in the sb_only list, remove from the port_binding table.

Now, I'm pretty sure this will cut down the number of cycles, but before
I go off and code it [and potentially break something ala yesterday's
excitement], I'm looking for some verification of both my conclusion of
persisting just the both list and the modified logic incorporating the
persisted both list and port binding change tracking adjustments). Do
these make sense or have I missed something?

Ryan




More information about the dev mailing list