[ovs-dev] [OVN] Potential scalability bug in ovn-northd on creating and binding large number of lports

Hui Kang kangh at us.ibm.com
Thu Jun 23 17:56:59 UTC 2016


Hi,
In our scalability tests for OVN, we observed that the ovn-northd process
does not scale well: the time to bind a logical port increases as the number
of logical ports grows, regardless of whether the logical ports belong to the
same logical switch. The most suspicious function is build_ports(), called by
ovnnb_db_run() [1], as described below.

Test description:
    step 1: Create 6 logical switches. For each logical switch, create 200
            logical ports.
    step 2: Bind 200 lports from each logical switch on an OVN chassis.

Test results for step 2:

    # of ports  |  # of ovn_ports            |  CPU cycles spent in      |
                | allocated in build_ports() | build_ports(), in millions|
            200 |                        200 |                     25    |
            400 |                        400 |                     50    |
            600 |                        600 |                     75    |
            800 |                        800 |                     93    |
           1000 |                       1000 |                    108    |
           1200 |                       1200 |                    125    |
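
As a rough reading of the table above, the cost per allocated ovn_port can be
derived by dividing each cycle count by its port count. The snippet below is
only arithmetic on the reported rows; it assumes each row is the cost of
build_ports() at that port count, and the helper name cycles_per_port is made
up for illustration.

```python
# Rows copied from the table above: (number of ports, million CPU cycles).
ROWS = [(200, 25), (400, 50), (600, 75), (800, 93), (1000, 108), (1200, 125)]


def cycles_per_port(rows):
    """Return (ports, cycles per port) for each (ports, million cycles) row."""
    return [(ports, mcycles * 1_000_000 / ports) for ports, mcycles in rows]


if __name__ == "__main__":
    for ports, per_port in cycles_per_port(ROWS):
        print(f"{ports:5d} ports: {per_port:9.0f} cycles per port")
```

The per-port figure stays between roughly 104,000 and 125,000 cycles across
all rows, so the time spent in build_ports() grows about linearly with the
number of ports.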

We see that each time a logical port is bound on a hypervisor,
join_logical_ports() in build_ports() allocates a struct ovn_port for every
port that already exists in the southbound database [2], which accounts for
the accumulated CPU cycles.
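
To make the accumulation concrete, here is a toy model in Python. It is not
the ovn-northd code: PortRecord is a hypothetical stand-in for struct
ovn_port, and the model assumes that every newly bound port triggers one full
pass over all ports known so far. The incremental variant is likewise only a
hypothetical point of comparison, not an existing code path.

```python
from dataclasses import dataclass


@dataclass
class PortRecord:
    # Hypothetical stand-in for struct ovn_port; only the name is modelled.
    name: str


def rebuild_all(port_names):
    """One pass that allocates a fresh record for every known port."""
    return {name: PortRecord(name) for name in port_names}


def allocations_with_full_rebuild(n_ports):
    """Count records allocated when each new port triggers a full pass."""
    known = []
    allocated = 0
    for i in range(n_ports):
        known.append(f"lport{i}")
        allocated += len(rebuild_all(known))
    return allocated


def allocations_with_incremental_update(n_ports):
    """Count records allocated when only the changed port gets a record."""
    records = {}
    for i in range(n_ports):
        name = f"lport{i}"
        records[name] = PortRecord(name)
    return len(records)


if __name__ == "__main__":
    for n in (200, 600, 1200):
        print(n, allocations_with_full_rebuild(n),
              allocations_with_incremental_update(n))
```

Under these assumptions, binding n ports one at a time costs n(n+1)/2
allocations in total with a full rebuild per binding, compared with n when
only the changed port is processed.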

My question is whether there is any particular reason to allocate that many
struct ovn_port instances. It seems to me there is room in this code to
optimize for performance. Thanks.

- Hui


[1] https://github.com/openvswitch/ovs/blob/master/ovn/northd/ovn-northd.c#L2529
[2] https://github.com/openvswitch/ovs/blob/master/ovn/northd/ovn-northd.c#L571


