[ovs-discuss] etcd for OVN status update (was: Re: more about etcd (can it support big transactions and many monitors?))

Andy Zhou azhou at ovn.org
Wed Jun 29 22:38:26 UTC 2016


On Wed, Jun 22, 2016 at 10:43 AM, Ben Pfaff <blp at ovn.org> wrote:

> On Wed, Jun 22, 2016 at 01:56:17AM -0700, Andy Zhou wrote:
> > 3. How should the OVN databases be arranged within etcd?  There are
> > >    multiple possibilities:
> > >
> > >    - Define OVSDB bindings to etcd and implement those bindings in the
> > >      OVSDB client libraries (C and Python).
> > >
> > >    - Define OVSDB bindings to etcd and implement those bindings in the
> > >      OVSDB server (so that ovsdb-server uses etcd as a storage layer).
> > >
> > >    - Define a native etcd schema for OVN SB (and probably NB) database
> > >      and make ovn-controller and ovn-northd use it natively.
> >
> >
> > >
> > It would be nice to be able to reuse the current schema definition.  Option
> > #3 makes this not a hard requirement, but having a schema makes it much
> > easier to maintain changes across releases -- for example, upgrades due to
> > schema version changes.
> >
> > Both options #1 and #2 above require us to figure out how DB, TABLE and
> > COLUMNS logically map to a key-value store.  Just for discussion purposes,
> > let's say the keys are in the format db/table/<row-uuid>/column.
> >
> >
> > OVSDB supports complex value types such as sets and maps.  Those can also
> > be supported with the following format:
> > db/table/<row-uuid>/column/set-key (with a fixed value, say, "set") or
> > db/table/<row-uuid>/column/map-key
> >
> > To optimize certain key range queries (i.e. the benefits that can be
> > realized by conditional monitoring), we can declare certain columns to be
> > a prefix of the <row-uuid>.  One possible way is to enhance the current
> > schema definition by adding a "priority" field for each column.  "Normal"
> > columns have the lowest priority by default.  When C1 has a higher
> > priority than C2, and both have non-default priority, the etcd key layout
> > can be: db/table/c1<value>/c2<value>/<row-uuid>/columns.
> >
> > With this key layout, rows that match a particular c1 value (or c1 && c2)
> > can be "watched".  This is not as general as conditional monitoring, but
> > may be sufficient for OVN SB's current use cases.
> >
> > Enforcing constraints expressed in the schema can be tricky for #1.  Some
> > of the possible solutions are:
> >
> > The value constraints expressed by the schema are not going to be enforced
> > by etcd.  One possible solution here is to have all clients that issue
> > transactions enforce the constraints before issuing them.
> >
> > Referential integrity can also be enforced by the client.  Logically, we
> > can have a dedicated client that enforces referential integrity (it can be
> > combined into one of the clients in practice).  Ideally, we would like both
> > the original transaction and the referential integrity changes to appear
> > as one transaction to the client (at least the clients of the IDL layer).
> > This may need additional logic that OVN needs to build and that is not
> > currently provided by etcd -- I don't know if this is a deal breaker.
> >
> > To me, #2 seems to make the overall system more complex and less efficient
> > than #1.
>
> Thanks for all the thoughts!  I agree with all of these ideas, at least
> at first glance.  They are very close to what I was thinking too.  It's
> good that we're on the same page.
>
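
To make the key layout quoted above concrete, here is a minimal Python
sketch of how one row might be flattened into key/value pairs. It is only an
illustration under assumptions: the function names, the example table and
column names, and the choice to use the bare column value as each prefix
segment are hypothetical, and nothing here talks to a real etcd server.

```python
def row_to_keys(db, table, row_uuid, row):
    """Flatten one row into db/table/<row-uuid>/column[/element] keys."""
    base = f"{db}/{table}/{row_uuid}"
    out = {}
    for column, value in row.items():
        if isinstance(value, (set, frozenset)):
            # Set column: one key per element, with the fixed value "set".
            for elem in value:
                out[f"{base}/{column}/{elem}"] = "set"
        elif isinstance(value, dict):
            # Map column: one key per map key, holding the map value.
            for k, v in value.items():
                out[f"{base}/{column}/{k}"] = str(v)
        else:
            # Scalar column: a single key.
            out[f"{base}/{column}"] = str(value)
    return out


def prioritized_base(db, table, row_uuid, row, priority_columns):
    """Build db/table/<c1 value>/<c2 value>/<row-uuid> (assumed encoding).

    priority_columns is ordered from highest to lowest priority.
    """
    values = [str(row[c]) for c in priority_columns]
    return "/".join([db, table, *values, row_uuid])
```

With the second layout, a client interested in one c1 value would watch the
key prefix db/table/<c1 value>/. Note that this sketch does not represent
empty sets or maps at all; a real binding would have to decide how to encode
them.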

This is one possible way to implement referential integrity with etcd:

* DB-wide versioning.

Assign a key, db/version, that stores a DB-wide transaction ID, starting at
0. Every client-issued transaction on the DB must also include this key and
increase its value by 1, so a client transaction always takes the version
number from even to odd.

No further client transaction can be issued until the value of db/version
becomes even again.
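
As a rough illustration of the even/odd rule, here is a small Python sketch
that uses an in-memory dict as a stand-in for the etcd keyspace. The class
and method names are hypothetical and this is not the etcd API; in a real
deployment the check and the writes would have to be one atomic etcd
transaction guarded by a comparison on db/version.

```python
class VersionedStore:
    """In-memory stand-in for an etcd keyspace (illustration only)."""

    VERSION_KEY = "db/version"

    def __init__(self):
        # The DB-wide transaction ID starts at 0 (even: open for clients).
        self.kv = {self.VERSION_KEY: 0}

    def client_commit(self, expected_version, writes):
        """Apply writes only if db/version is even and still as expected.

        On success the version moves from even to odd, which blocks
        further client transactions until it becomes even again.
        """
        current = self.kv[self.VERSION_KEY]
        if current % 2 != 0 or current != expected_version:
            return False  # integrity pass pending, or a concurrent commit won
        self.kv.update(writes)
        self.kv[self.VERSION_KEY] = current + 1
        return True
```

A client would read db/version, build its writes, and call
client_commit(version, writes); a False result means it has to wait for an
even version and retry.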

* A dedicated client enforces referential integrity

There is a dedicated etcd client whose job is to enforce referential
integrity. It runs whenever the version number is odd and commits the next
transaction, which "fixes" the etcd contents. The version number is
increased even if there is nothing to fix.
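
A sketch of what one pass of such a client might look like, again over a
plain dict instead of a real etcd connection. The REF_COLUMNS table, the
table and column names, and the choice to repair by deleting dangling set
elements are all assumptions made for the example; it also assumes the
db/table/<row-uuid>/column/<element> layout for set columns.

```python
VERSION_KEY = "db/version"

# Hypothetical schema knowledge: (table, column) -> referenced table.
REF_COLUMNS = {("Child", "parents"): "Parent"}


def dangling_refs(kv):
    """Return set-element keys whose referenced row no longer exists."""
    rows = set()
    for key in kv:
        parts = key.split("/")
        if len(parts) >= 4:  # db/table/<row-uuid>/column[/element]
            rows.add((parts[1], parts[2]))
    stale = []
    for key in kv:
        parts = key.split("/")
        if len(parts) == 5:
            target = REF_COLUMNS.get((parts[1], parts[3]))
            if target is not None and (target, parts[4]) not in rows:
                stale.append(key)
    return stale


def integrity_pass(kv):
    """Run one fix-up pass if db/version is odd; return True if it ran."""
    version = kv[VERSION_KEY]
    if version % 2 == 0:
        return False  # no client transaction is waiting to be checked
    for key in dangling_refs(kv):
        del kv[key]
    # Bump the version even when nothing needed fixing, so that client
    # transactions are allowed again.
    kv[VERSION_KEY] = version + 1
    return True
```

Against a real etcd, the deletes and the version bump would need to go into
a single transaction conditioned on db/version still holding the odd value
that was read.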

In an HA setup, the referential integrity checking clients should run on the
same machines that run etcd. Only the client that runs on the same machine
as the etcd leader actively enforces referential integrity. The other
clients run in standby mode, and each becomes active only when its local
etcd server becomes the leader.
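
The active/standby behavior could be sketched as a small state holder like
the one below. How a client learns the current leader and how it is notified
of db/version changes are left out on purpose; the class, its methods, and
the member IDs are hypothetical names for this example only.

```python
class StandbyEnforcer:
    """Runs integrity passes only while the local etcd member is leader."""

    def __init__(self, local_member_id, run_pass):
        self.local_member_id = local_member_id
        self.run_pass = run_pass  # callback that performs one fix-up pass
        self.active = False       # every client starts in standby mode

    def on_leader_change(self, leader_id):
        # Become active only when the co-located etcd server is the leader.
        self.active = (leader_id == self.local_member_id)
        return self.active

    def on_version_change(self, version):
        # An odd version means a client transaction awaits the integrity pass.
        if self.active and version % 2 == 1:
            self.run_pass()
            return True
        return False
```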

Will this work?