[ovs-dev] [PATCH 4/4] ovsdb: Compact databases online automatically and on-demand.

Ben Pfaff blp at nicira.com
Thu Mar 18 18:20:42 UTC 2010


On Wed, Mar 17, 2010 at 08:08:00PM -0400, Jesse Gross wrote:
> On Tue, Mar 16, 2010 at 6:24 PM, Ben Pfaff <blp at nicira.com> wrote:
> 
> > If the database grows fairly large, and we've written a fair number of
> > transactions to it, and it's been a while since the database was compacted,
> > then (after the next commit) compact the database.
> >
> > Also, compact the database online if the "ovsdb-server/compact" command is
> > issued via unixctl.  I suspect that this feature will rarely if ever be
> > used in practice, but it's easier to test than compacting automatically.
> >
> > Bug #2391.
> 
> I think that this (and the rest of the set) looks good generally but it made
> me wonder about the impact of compaction while running.
> 
> Since this will stop all other database activity, how long do you think that
> it might run for?  I had a large database that I manually compacted and I
> think I remember it taking several minutes, which could cause connections to
> timeout.  Obviously running compaction more frequently will reduce the
> amount of time that it takes.  In theory it might also be nice to wait on
> compaction if we have pending work.
> 
> I suspect that most of this is not important in real life, just wondering if
> you had any thoughts/tests.

The big reason that compacting a large database takes a long time is
that there is a lot of data to read and hence a lot of data to write.
But compacting an OVSDB database just writes the database's current
contents to a new file and then renames the new file over the old one.
It should be very fast because there is no data to read (the database is
already completely in memory) and not much data to write (the database
isn't very big).

What I am a little concerned about is the performance of the fsync()
operations that we do during compacting.  From memory, there are three
of them: the old version of the database, the new version of the
database, and then the directory itself after the rename.  fsync()
performance is terrible on ext3, which XenServer uses for Dom0.  If this
is a problem in practice then maybe we will have to do the fsyncs
asynchronously, e.g. by forking a new process or via aio_fsync(), and
service other requests in the meantime.  But I decided to start out
simply and worry about fsync() if it proves to be a real problem.
