[ovs-dev] [PATCH 1/4] docs: OVSDB replication design document

Marcelo E. Magallon marcelo.magallon at hpe.com
Mon Apr 18 16:02:34 UTC 2016


Hi,

 sorry about the delay in responding. I was actually catching up with
 emails on the mailing list to try to gauge if we are indeed trying to
 accomplish the same thing or not.

On Mon, Apr 11, 2016 at 03:44:09PM -0700, Ben Pfaff wrote:
> On Fri, Apr 01, 2016 at 10:52:26AM -0700, Ben Pfaff wrote:
> > I don't think it makes sense to stack replication and Raft-based HA.
> > 
> > Thinking about OpenSwitch, I guess that your use case is something
> > like this: an OpenSwitch instance maintains, on-box, an
> > authoritative database instance, and then the replication feature
> > allows that database's content to be backed up somewhere else.  I
> > see how that differs from the use case for Raft, where there is no
> > preference for a particular database server to be authoritative.
> > What I don't see yet is how or why that's useful.  What is the use
> > case?
> 
> In case it wasn't clear, I didn't mean my message above to sound like
> a "no, we won't take this".  Instead, I'm trying to understand the use
> case better.  Perhaps there is room for both replication and HA in
> OVSDB, but before I say "yes" to that, I want to understand both
> cases.

 Yes, that's totally fair.

 We do not have a need for only 1+1 redundancy. We have a requirement to
 remain operational with fewer than a quorum of instances running, which
 Raft cannot do unless you introduce modifications to the algorithm
 (e.g. etcd or consul do this, I can't remember which one exactly).
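
 To put some arithmetic behind the quorum point (a rough sketch, not
 tied to any particular implementation): Raft needs a majority of the
 cluster to elect a leader and commit writes, so it only tolerates the
 loss of a minority of members, while simple replication keeps working
 as long as one copy survives.

     # Python sketch: smallest majority (quorum) of an N-member Raft
     # cluster and how many member failures that leaves room for.
     def quorum(n):
         return n // 2 + 1

     for n in (2, 3, 5):
         print(n, "members -> quorum", quorum(n),
               "-> tolerates", n - quorum(n), "failed")
     # 2 members -> quorum 2 -> tolerates 0 failed
     # 3 members -> quorum 2 -> tolerates 1 failed
     # 5 members -> quorum 3 -> tolerates 2 failed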

 Also, Raft assumes that everybody's vote is equal. If you're treating
 multiple instances of OVS as one large virtual switch, you are not
 running a separate instance of OSPF on each member, each feeding its
 own version of the routing table into the database. You have one OSPF
 instance on a "stack commander" feeding the entire routing table into
 the database. That is the "correct" state, no matter how many Raft
 members have voted on it. We grow to more than 2 members by setting up
 multiple one-way replications, all originating from the commander. In
 future patches we will also implement two-way replication so that a
 member can write to its local database to reflect state that the
 commander cannot know about (such as port state); until that happens,
 daemons on a "member" can connect directly to the commander's OVSDB
 instance and update the commander's state.
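
 As a rough illustration of that last point (a sketch only; the schema
 path, table and column names below are placeholders, and the address
 is made up), a daemon on a member could use the Python OVSDB IDL to
 talk straight to the commander's ovsdb-server over TCP:

     import ovs.db.idl
     import ovs.poller

     # Point the IDL at the commander's ovsdb-server rather than a
     # local one.  "switch.ovsschema" is a placeholder schema file.
     helper = ovs.db.idl.SchemaHelper("switch.ovsschema")
     helper.register_all()
     idl = ovs.db.idl.Idl("tcp:192.0.2.1:6640", helper)

     # Wait until the initial database contents have been received.
     while idl.change_seqno == 0:
         idl.run()
         poller = ovs.poller.Poller()
         idl.wait(poller)
         poller.block()

     # Report locally observed port state by writing directly into the
     # commander's database (hypothetical "Interface" table with a
     # "link_state" column).
     txn = ovs.db.idl.Transaction(idl)
     for row in idl.tables["Interface"].rows.values():
         if row.name == "eth0":
             row.link_state = "up"
     txn.commit_block()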

 This work is done in the context of OpenSwitch (http://openswitch.net/,
 probably http://openswitch.net/documents/user/architecture is more
 relevant to this discussion).  With the proposed patch we can have two
 OVSDB instances, each running on a ToR (top-of-rack) switch. One of the
 switches is active and the other is a standby. The standby instance
 constantly replicates the active one. If the active instance fails, the
 standby can take over and the control plane can be rebuilt from the
 state stored in the database.
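
 Purely to make the handover concrete (this is not part of the patch
 series, just an illustration of how the pair might be orchestrated; the
 address and the promotion step are placeholders), a monitor on the
 standby box could look roughly like this:

     import socket
     import time

     ACTIVE = ("192.0.2.1", 6640)   # active ovsdb-server (made-up address)

     def active_is_reachable():
         # Crude liveness probe: try to open a TCP connection to the
         # active ovsdb-server's passive port.
         try:
             with socket.create_connection(ACTIVE, timeout=2):
                 return True
         except OSError:
             return False

     def promote_local_ovsdb():
         # Placeholder: in a real deployment this would tell the local
         # ovsdb-server to stop replicating and start accepting writes.
         print("active unreachable, taking over")

     while active_is_reachable():
         time.sleep(1)
     promote_local_ovsdb()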

 I don't think the two approaches are in conflict with each other;
 actually they complement each other. What I'm trying to figure out is
 where they overlap (from a code point of view).

 Marcelo
