[ovs-dev] [PATCH] ovsdb: Don't let transaction history grow larger than the database.

Han Zhou hzhou at ovn.org
Mon Nov 8 18:24:55 UTC 2021


On Fri, Nov 5, 2021 at 2:55 PM Ilya Maximets <i.maximets at ovn.org> wrote:
>
> On 10/23/21 08:40, Han Zhou wrote:
> >
> >
> > On Thu, Oct 14, 2021 at 7:58 AM Ilya Maximets <i.maximets at ovn.org
<mailto:i.maximets at ovn.org>> wrote:
> >>
> >> If user frequently changes a lot of rows in a database, transaction
> >> history could grow way larger than the database itself.  This wastes
> >> a lot of memory and also makes monitor_cond_since slower than
> >> usual monotor_cond if the transaction id is old enough, because
> >> re-construction of the changes from a history is slower than just
> >> creation of initial database snapshot.  This is also the case if
> >> user deleted a lot of data, so transaction history still holds all of
> >> it while the database itself doesn't.
> >>
> >> In case of current lb-per-service model in ovn-kubernetes, each
> >> load-balancer is added to every logical switch/router.  Such a
> >> transaction touches more than a half of a OVN_Northbound database.
> >> And each of these transactions is added to the transaction history.
> >> Since transaction history depth is 100, in worst case scenario,
> >> it will hold 100 copies of a database increasing memory consumption
> >> dramatically.  In tests with 3000 LBs and 120 LSs, memory goes up
> >> to 3 GB, while holding at 30 MB if transaction history disabled in
> >> the code.
> >>
> >> Fixing that by keeping count of the number of ovsdb_atom's in the
> >> database and not allowing the total number of atoms in transaction
> >> history to grow larger than this value.  Counting atoms is fairly
> >> cheap because we don't need to iterate over them, so it doesn't have
> >> significant performance impact.  It would be ideal to measure the
> >> size of individual atoms, but that will hit the performance.
> >> Counting cells instead of atoms is not sufficient, because OVN
> >> users are adding hundreds or thousands of atoms to a single cell,
> >> so they are largely different in size.
> >>
> >> Signed-off-by: Ilya Maximets <i.maximets at ovn.org <mailto:
i.maximets at ovn.org>>
> >> ---
> >>  ovsdb/ovsdb.c       |  5 ++++
> >>  ovsdb/ovsdb.h       |  3 +++
> >>  ovsdb/transaction.c | 56 +++++++++++++++++++++++++++++++++++++++++++--
> >>  3 files changed, 62 insertions(+), 2 deletions(-)
>
> <snip>
>
> >
> > Thanks Ilya. This looks good to me. Although the atoms number is not
equal to the size of the data, it should be sufficient for the purpose of
controlling history size.
> > It would be good to have a test case to ensure this behavior, and also
ensure the atoms count is as expected, just to avoid messing up the
counters by mistake in the future. (if that happens it is not easily
noticed because there is no functional difference but possible performance
impact of fast-resync)
> >
> > Acked-by: Han Zhou <hzhou at ovn.org <mailto:hzhou at ovn.org>>
>
> Thanks for review!  I prepared a test below for the number of atoms.
> If it looks good to you, I can fold it in before applying the patch.
>
> Han, Dumitru, What do you think?
>
> diff --git a/tests/ovsdb-server.at b/tests/ovsdb-server.at
> index ac243d6a7..2b742f78b 100644
> --- a/tests/ovsdb-server.at
> +++ b/tests/ovsdb-server.at
> @@ -1228,6 +1228,69 @@ AT_CHECK([test $logged_updates -lt
$logged_nonblock_updates])
>  AT_CHECK_UNQUOTED([ovs-vsctl get open_vswitch . system_version], [0],
>    [xyzzy$counter
>  ])
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +AT_CLEANUP
> +
> +AT_SETUP([ovsdb-server transaction history size])
> +on_exit 'kill `cat *.pid`'
> +
> +dnl Start an ovsdb-server with the clustered vswitchd schema.
> +AT_CHECK([ovsdb-tool create-cluster db dnl
> +            $abs_top_srcdir/vswitchd/vswitch.ovsschema unix:s1.raft],
> +         [0], [ignore], [ignore])
> +AT_CHECK([ovsdb-server --detach --no-chdir --pidfile dnl
> +            --log-file --remote=punix:db.sock db],
> +         [0], [ignore], [ignore])
> +AT_CHECK([ovs-vsctl --no-wait init])
> +
> +dnl Create a bridge with N ports per transaction.  Increase N every 4
> +dnl iterations.  And then remove the bridges.  By increasing the size of
> +dnl transactions, ensuring that they take up a significant percentage of
> +dnl the total database size, so the transaction history will not be able
> +dnl to hold all of them.
> +dnl
> +dnl The test verifies that the number of atoms in the transaction history
> +dnl is always less than the number of atoms in the database.
> +get_n_atoms () {
> +    n=$(ovs-appctl -t ovsdb-server memory/show dnl
> +            | tr ' ' '\n' | grep atoms | grep "^$1:" | cut -d ':' -f 2)
> +    if test X"$n" == "X"; then
> +        n=0
> +    fi
> +    echo $n
> +}
> +
> +check_atoms () {
> +    n_db_atoms=$(get_n_atoms atoms)
> +    n_txn_history_atoms=$(get_n_atoms txn-history-atoms)
> +    echo "n_db_atoms:          $n_db_atoms"
> +    echo "n_txn_history_atoms: $n_txn_history_atoms"
> +    AT_CHECK([test $n_txn_history_atoms -le $n_db_atoms])
> +}
> +
> +add_ports () {
> +    for j in $(seq 1 $2); do
> +        printf " -- add-port br$1 p$1-%d" $j
> +    done
> +}
> +
> +initial_db_atoms=$(get_n_atoms atoms)
> +
> +for i in $(seq 1 100); do
> +    cmd=$(add_ports $i $(($i / 4 + 1)))
> +    AT_CHECK([ovs-vsctl --no-wait add-br br$i $cmd])
> +    check_atoms
> +done
> +
> +for i in $(seq 1 100); do
> +    AT_CHECK([ovs-vsctl --no-wait del-br br$i])
> +    check_atoms
> +done
> +

Thanks Ilya for the test case. I noticed that it seems the case when
txn_history is bigger than the data is not covered by the above loops. When
i increases from 1 to 100, the history is smaller or equal to the whole
data. Then when the first deletion happens, because there are max 100
entries in the history, so the oldest history is removed, which is the same
size as the new transaction, because it is deleting the same br that was
added in the oldest transaction, and the txn history size is still smaller
or equal to the whole data items, and so on ... And it seems that simply
changing the loop count to 99 or less would have the scenario covered.

> +dnl After removing all the bridges, the number of atoms in the database
> +dnl should return to its initial value.
> +AT_CHECK([test $(get_n_atoms atoms) -eq $initial_db_atoms])
> +
>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>  AT_CLEANUP
>
> ---


More information about the dev mailing list