[ovs-dev] locks for clustered OVSDB

Miguel Angel Ajo Pelayo majopela at redhat.com
Tue Oct 10 08:12:29 UTC 2017


Thanks for the clarification, Russell.

On Mon, Oct 9, 2017 at 7:45 PM, Ben Pfaff <blp at ovn.org> wrote:

> On Mon, Oct 09, 2017 at 01:13:56PM -0400, Russell Bryant wrote:
> > On Mon, Sep 25, 2017 at 2:29 PM, Ben Pfaff <blp at ovn.org> wrote:
> > > On Mon, Sep 25, 2017 at 11:09:49AM -0700, Han Zhou wrote:
> > >> On Mon, Sep 25, 2017 at 2:36 AM, Miguel Angel Ajo Pelayo <
> > >> majopela at redhat.com> wrote:
> > >> >
> > >> > I believe Lucas Alvares could give you valuable feedback on this as
> > >> > he was planning to use this as a mechanism for synchronization on
> > >> > the networking-ovn side (if I didn't get it wrong).
> > >> >
> > >> > I believe he's back by October.
> > >> >
> > >> > Best regards.
> > >> > Miguel Ángel.
> > >> >
> > >> > On Fri, Sep 22, 2017 at 6:58 PM, Ben Pfaff <blp at ovn.org> wrote:
> > >> >
> > >> > > We've had a couple of brief discussions during the OVN meeting
> > >> > > about locks in OVSDB.  As I understand it, a few services use
> > >> > > OVSDB locks to avoid duplicating work.  The question is whether
> > >> > > and how to extend OVSDB locks to a distributed context.
> > >> > >
> > >> > > First, I think it's worth reviewing how OVSDB locks work,
> > >> > > filling in some of the implications that aren't covered by
> > >> > > RFC 7047.  OVSDB locks are server-level (not database-level)
> > >> > > objects that can be owned by at most one client at a time.
> > >> > > Clients can obtain them either through a "lock" operation, in
> > >> > > which case they get queued to obtain the lock when it's no
> > >> > > longer owned by anyone else, or through a "steal" operation
> > >> > > that always succeeds immediately, kicking out whoever (if
> > >> > > anyone) previously owned the lock.  A client loses a lock
> > >> > > whenever it releases it with an "unlock" operation or whenever
> > >> > > its connection to the server drops.  The server notifies a
> > >> > > client whenever it acquires a lock or whenever it is stolen by
> > >> > > another client.
> > >> > >
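
To make the mechanics above concrete, here is a rough Python sketch of
that handshake at the JSON-RPC level.  It assumes an ovsdb-server
reachable on 127.0.0.1:6640 and a made-up lock name, ignores replies and
error handling, and is only meant to show the "lock"/"steal"/"unlock"
methods from RFC 7047, not to be production code:

    import json
    import socket

    # Assumes ovsdb-server was started with --remote=ptcp:6640; the
    # lock name "my_lock" is just an example.
    sock = socket.create_connection(("127.0.0.1", 6640))

    def send(msg):
        sock.sendall(json.dumps(msg).encode())

    # "lock" queues us for ownership.  The reply's {"locked": true/false}
    # says whether we own it right away; if not, the server sends a
    # "locked" notification later, once the current owner goes away.
    send({"method": "lock", "params": ["my_lock"], "id": 1})

    # "steal" always succeeds immediately; the previous owner (if any)
    # receives a "stolen" notification instead.
    send({"method": "steal", "params": ["my_lock"], "id": 2})

    # "unlock" gives the lock up explicitly.  Dropping the connection
    # has the same effect.
    send({"method": "unlock", "params": ["my_lock"], "id": 3})
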
> > >> > > This scheme works perfectly for one particular scenario: where
> > >> > > the resource protected by the lock is an OVSDB database (or
> > >> > > part of one) on the same server as the lock.  This is because
> > >> > > OVSDB transactions include an "assert" operation that names a
> > >> > > lock and aborts the transaction if the client does not hold
> > >> > > the lock.  Since the server is both the lock manager and the
> > >> > > implementer of the transaction, it can always make the correct
> > >> > > decision.  This scenario could be extended to distributed
> > >> > > locks with the same guarantee.
> > >> > >
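
Reusing the send() helper from the sketch above, the fence Ben describes
looks roughly like this on the wire: the "assert" operation goes into
the same "transact" call as the writes it protects, so the server
rejects the whole transaction if the sender no longer holds the lock.
The database and table names here are only placeholders:

    # A "transact" request fenced by the lock: if this client does not
    # hold "my_lock", the server aborts the whole transaction, including
    # the update that follows the "assert".
    send({
        "method": "transact",
        "params": [
            "Open_vSwitch",                       # example database name
            {"op": "assert", "lock": "my_lock"},
            {"op": "update",
             "table": "Open_vSwitch",             # example table name
             "where": [],
             "row": {"external_ids": ["map", [["owner", "node-1"]]]}},
        ],
        "id": 4,
    })
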
> > >> > > Another scenario that could work acceptably with distributed
> > >> > > OVSDB locks is one where the lock guards against duplicated
> > >> > > work.  For example, suppose a couple of ovn-northd instances
> > >> > > both try to grab a lock, with only the winner actually
> > >> > > running, to avoid having both of them spend a lot of CPU time
> > >> > > recomputing the southbound flow table.  A distributed version
> > >> > > of OVSDB locks would probably work fine in practice for this,
> > >> > > although occasionally, due to network propagation delays,
> > >> > > "steal" operations, or different ideas between client and
> > >> > > server of when a session has dropped, both ovn-northd
> > >> > > instances might think they have the lock.  (If, however, they
> > >> > > combined this with "assert" when they actually committed
> > >> > > their changes to the southbound database, then they would
> > >> > > never actually interfere with each other in database commits.)
> > >> > >
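
For what it's worth, the client side of that "only the lock winner does
the work" pattern might look something like the sketch below with the
Python OVSDB IDL (python-ovs).  The schema path, remote, and lock name
are just examples and I'm going from memory on the API; if I remember
right, the IDL also attaches an "assert" on the configured lock to the
transactions it commits, which gives the database-side fence Ben
mentions:

    import ovs.db.idl
    import ovs.poller

    def recompute_southbound_flows(idl):
        # Placeholder for the expensive recomputation ovn-northd does.
        pass

    helper = ovs.db.idl.SchemaHelper("/usr/share/openvswitch/ovn-sb.ovsschema")
    helper.register_all()
    idl = ovs.db.idl.Idl("tcp:127.0.0.1:6642", helper)
    idl.set_lock("ovn_northd")       # queue for the lock; never steal it

    while True:
        idl.run()                    # process updates and lock notifications
        if idl.has_lock:
            recompute_southbound_flows(idl)
        elif idl.is_lock_contended:
            pass                     # standby: another instance is active
        poller = ovs.poller.Poller()
        idl.wait(poller)
        poller.block()
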
> > >> > > A scenario that would not work acceptably with distributed
> > >> > > OVSDB locks, without a change to the model, is where the lock
> > >> > > ensures correctness, that is, if two clients both think they
> > >> > > have the lock then bad things happen.  I believe that this
> > >> > > requires clients to understand a concept of leases, which
> > >> > > OVSDB doesn't currently have.  The "steal" operation is also
> > >> > > problematic in this model since it would require canceling a
> > >> > > lease.  (This scenario also does not work acceptably with
> > >> > > single-server OVSDB locks.)
> > >> > >
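
To illustrate why plain lock ownership is not enough in that case, below
is a toy version of the fencing-token idea from the article linked a few
lines down: the lock service hands out a monotonically increasing token
with each grant, and the protected resource rejects any request carrying
an older token than the newest one it has seen.  OVSDB's "assert" plays
this role for writes to the database itself, but nothing comparable
exists today for resources outside the database.  This is only a sketch
for discussion, not an existing OVSDB API:

    class FencedResource(object):
        """Toy stand-in for a resource protected by fencing tokens."""

        def __init__(self):
            self.highest_token_seen = 0

        def write(self, token, data):
            # Reject requests from stale lock holders: a client whose
            # lease expired (or whose lock was stolen) still carries an
            # old token.
            if token < self.highest_token_seen:
                raise PermissionError("stale fencing token %d" % token)
            self.highest_token_seen = token
            # ... apply 'data' here ...

    # The lock service increments the token on every grant.  If the
    # holder of token 33 stalls and wakes up after the holder of token
    # 34 has already written, its late request is refused.
    resource = FencedResource()
    resource.write(34, "update from the current holder")
    resource.write(33, "late update from the stale holder")   # rejected
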
> > >> > > I'd appreciate anyone's thoughts on the topic.
> > >> > >
> > >> > > This webpage is good reading:
> > >> > >
> > >> > > https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Ben.
> > >>
> > >> Hi Ben,
> > >>
> > >> If I understand correctly, you are saying that the clustering
> > >> wouldn't introduce any new restriction to the locking mechanism
> > >> compared with the current single-node implementation.  Both the
> > >> new and old approaches support avoiding redundant work, but not
> > >> locking for correctness (unless "assert" or some other "fence" is
> > >> used).  Is this correct?
> > >
> > > It's accurate that clustering would not technically introduce new
> > > restrictions.  It will increase race windows, especially over Unix
> > > sockets, so anyone who is currently (incorrectly) relying on OVSDB
> > > locking for correctness will probably start seeing failures that they
> > > did not see before.  I'd be pleased to hear that no one is doing this.
> >
> > You discussed the ovn-northd use case in your original post (thanks!).
> >
> > The existing Neutron integration use case should be fine.  In that
> > case, it's not committing any transactions.  The lock is only used to
> > ensure that only one server is processing logical switch port "up"
> > state.  If more than one thinks it has a lock, the worst that can
> > happen is we send the same port event through OpenStack more than
> > once.  That's mostly harmless, aside from a log message.
> >
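
For completeness, that Neutron integration pattern is roughly the sketch
below (the names are made up and I'm going from memory): the event
handler simply drops notifications unless this worker currently believes
it holds the lock, and the consumer side has to tolerate the occasional
duplicate, which matches the "mostly harmless" behavior described above:

    class PortUpWatcher(object):
        """Illustrative sketch: forward port "up" events only while this
        worker thinks it holds the OVSDB lock.  'idl' is an
        ovs.db.idl.Idl on which set_lock() has already been called."""

        def __init__(self, idl, notify_neutron):
            self.idl = idl
            self.notify_neutron = notify_neutron

        def port_status_changed(self, port_id, is_up):
            if not self.idl.has_lock:
                return                  # another worker is responsible
            # If two workers briefly both think they hold the lock, the
            # same event may be reported twice; Neutron just logs and
            # ignores the duplicate.
            self.notify_neutron(port_id, is_up)
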
> > Miguel mentioned that it might be used for an additional use case that
> > Lucas is working on, but OVSDB locks are not used there.
>
> OK, thanks.
>
> My current patch series does not implement distributed locks, but now
> I can start designing the feature.
>

