[ovs-dev] [PATCH 13/14] Add support for multiple OpenFlow controllers on a single bridge.

Ben Pfaff blp at nicira.com
Tue Apr 20 17:51:51 UTC 2010


On Tue, Apr 20, 2010 at 12:46:49AM -0700, Justin Pettit wrote:
> On Apr 8, 2010, at 5:07 PM, Ben Pfaff wrote:
> 
> > diff --git a/ofproto/fail-open.c b/ofproto/fail-open.c
> > index eca1417..c3ff897 100644
> > --- a/ofproto/fail-open.c
> > +++ b/ofproto/fail-open.c
> > @@ -68,8 +68,8 @@
> > 
> > +/* Returns the number of seconds for which all controllers have been
> > + * disconnected.  */
> > +static int
> > +failure_duration(const struct fail_open *fo)
> > +{
> > +    int min_failure_duration;
> > +    size_t i;
> > +
> > +    min_failure_duration = 0;
> > +    for (i = 0; i < fo->n_controllers; i++) {
> > +        int failure_duration = rconn_failure_duration(fo->controllers[i]);
> > +        min_failure_duration = MAX(min_failure_duration, failure_duration);
> 
> Should this be MIN?

You're right.  Fixed.

> > diff --git a/ofproto/fail-open.h b/ofproto/fail-open.h
> > index 900d587..c4b8716 100644
> > --- a/ofproto/fail-open.h
> > +++ b/ofproto/fail-open.h
> > @@ -1,5 +1,5 @@
> > /*
> > - * Copyright (c) 2008, 2009 Nicira Networks.
> > + * Copyright (c) 2008, 2009, 2010 Nicira Networks.
> >  *
> >  * Licensed under the Apache License, Version 2.0 (the "License");
> >  * you may not use this file except in compliance with the License.
> > @@ -32,10 +32,8 @@ struct switch_status;
> >  * creates flows with this priority). */
> > #define FAIL_OPEN_PRIORITY 70000
> 
> No special value here like 0xf0f0f0f0?  :-)

Hey, that's much better!  Changed.

> > diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> > index 10d8161..00ebfe2 100644
> > --- a/ofproto/ofproto.c
> > +++ b/ofproto/ofproto.c
> > 
> > +/* ofproto supports two kinds of OpenFlow connections:
> > + *
> > + *   - "Controller connections": Connections to ordinary OpenFlow controllers.
> > + *     ofproto maintains persistent connections to these controllers and by
> > + *     default sends them asynchronous messages such as packet-ins.
> > + *
> > + *   - "Management connections", e.g. from ovs-ofctl.  When these connections
> > + *     drop, it is the other side's responsibility to reconnect them if
> > + *     necessary.  ofproto does not send them asynchronous messages by default.
> > + */
> > +enum ofconn_type {
> > +    OFCONN_CONTROLLER,          /* An OpenFlow controller. */
> > +    OFCONN_MANAGEMENT           /* A (transient) management connection. */
> > +};
> 
> I don't have a better proposal at the moment, but this is going to get
> confusing with the management/config protocol.

You're right.  I've changed OFCONN_MANAGEMENT to OFCONN_TRANSIENT.

> > +static void
> > +add_controller(struct ofproto *ofproto, const struct ofproto_controller *c)
> > 
> > +    ofconn = ofconn_create(ofproto, rconn_create(5, 8), OFCONN_CONTROLLER);
> 
> It might be good to write a comment that the probe_interval and
> max_backoff will be set by update_controller() later.  Otherwise,
> these magic numbers are a bit confusing.  Actually, it may just be a
> good idea to mention at the function-level description that
> update_controller() needs to be called after add_controller().

Good idea.  This was definitely obscure.

I added a comment on add_controller():

/* Creates a new controller in 'ofproto'.  Some of the settings are initially
 * drawn from 'c', but update_controller() needs to be called later to finish
 * the new ofconn's configuration. */

and one on update_controller():

/* Reconfigures 'ofconn' to match 'c'.  This function cannot update an ofconn's
 * target or turn discovery on or off (these are done by creating new ofconns
 * and deleting old ones), but it can update the rest of an ofconn's
 * settings. */

> > @@ -847,7 +942,19 @@ ofproto_run1(struct ofproto *p)
> > 
> > +            LIST_FOR_EACH (ofconn, struct ofconn, node, &p->all_conns) {
> > +                if (ofconn->type == OFCONN_CONTROLLER) {
> > +                    rconn_add_monitor(ofconn->rconn, vconn);
> > +                    goto success;
> > +                }
> > +            }
> > +            VLOG_INFO_RL(&rl, "no controller connection to monitor");
> > +            vconn_close(vconn);
> > +
> > +        success:;
> 
> As I mentioned in a previous patch, this just looks weird to me.  :-)

I admit that it *is* weird-looking.  Probably better to use a
function.  So I broke it out like this:

    /* One of ofproto's "snoop" pvconns has accepted a new connection on 'vconn'.
     * Connects this vconn to a controller. */
    static void
    add_snooper(struct ofproto *ofproto, struct vconn *vconn)
    {
        struct ofconn *ofconn;

        /* Arbitrarily pick the first controller in the list for monitoring.  We
         * could do something smarter or more flexible later, if it ever proves
         * useful. */
        LIST_FOR_EACH (ofconn, struct ofconn, node, &ofproto->all_conns) {
            if (ofconn->type == OFCONN_CONTROLLER) {
                rconn_add_monitor(ofconn->rconn, vconn);
                return;
            }

        }
        VLOG_INFO_RL(&rl, "no controller connection to monitor");
        vconn_close(vconn);
    }

and then added a single call to add_snooper().

> > @@ -920,14 +1027,9 @@ ofproto_wait(struct ofproto *p)
> >     if (p->in_band) {
> >         in_band_wait(p->in_band);
> >     }
> > -    if (p->discovery) {
> > -        discovery_wait(p->discovery);
> > -    }
> >     if (p->fail_open) {
> >         fail_open_wait(p->fail_open);
> >     }
> > -    pinsched_wait(p->miss_sched);
> > -    pinsched_wait(p->action_sched);
> >     if (p->sflow) {
> >         ofproto_sflow_wait(p->sflow);
> >     }
> 
> I'm sure I missed some subtlety, but why don't we need to call
> pinsched_wait() and discovery_wait() anymore?

They got moved into "struct ofconn" so need to be in ofconn_wait().
Oops, I forgot to move them there.  Fixed.  (Thanks.)

> > static void
> > +send_packet_in(struct ofproto *ofproto, struct ofpbuf *packet)
> > 
> > +    LIST_FOR_EACH (ofconn, struct ofconn, node, &ofproto->all_conns) {
> > +        if (ofconn->miss_send_len
> > +            || (msg->type == _ODPL_ACTION_NR
> > +                && ofconn->type == OFCONN_CONTROLLER)) {
> 
> Looking through the OpenFlow 1.0 spec, I believe that we are supposed
> to send a packet_in for misses even if the miss_send_len is
> zero...it's just supposed to be of zero length.

What a stupid specification.  OK, fixed.

> > diff --git a/utilities/ovs-openflowd.c b/utilities/ovs-openflowd.c
> > 
> > @@ -266,16 +267,18 @@ parse_options(int argc, char *argv[], struct ofsettings *s)
> > 
> > +    controller_opts.probe_interval = 0;
> 
> As mentioned in an earlier patch, I think this is supposed to have a
> default of 5.  It looks to me like this will cause probes to be turned
> off by default based on the code in ofproto.c.

OK, fixed here too.

> >         case OPT_FAIL_MODE:
> >             if (!strcmp(optarg, "open")) {
> > -                s->controller.fail = OFPROTO_FAIL_STANDALONE;
> > +                controller_opts.fail = OFPROTO_FAIL_STANDALONE;
> >             } else if (!strcmp(optarg, "closed")) {
> > -                s->controller.fail = OFPROTO_FAIL_SECURE;
> > +                controller_opts.fail = OFPROTO_FAIL_SECURE;
> 
> Should we transition to using "standalone" and "secure" on the command
> line?  We could continue to use "open" and "closed" as undocumented
> options for a while.  "open" and "closed" have always been confusing.

OK, I've now done this in a separate commit.

> > diff --git a/utilities/ovs-vsctl.8.in b/utilities/ovs-vsctl.8.in
> > 
> > +For each of these commands, a \fIbridge\fR of \fBdefault\fR applies
> > +the configuration as the default for any bridge that has not been
> > +explicitly configured.  Otherwise, \fIbridge\fR must name a bridge,
> > +and the settings apply only to that bridge.  (Omitting \fIbridge\fR
> > +entirely usually has the same effect as specifying \fBdefault\fR.)
> 
> I think it would be clearer if we always required a "bridge" argument.
> I feel like a lot of people get confused about where the controller
> setting has been configured as it is.

I agree, but don't we need to preserve any backward compatibility here?
I'll happily do it your way if we don't need any.

> > diff --git a/utilities/ovs-vsctl.c b/utilities/ovs-vsctl.c
> > 
> > +static const char *
> > +get_fail_mode(struct ovsrec_controller **controllers, size_t n_controllers)
> > +{
> > +    const char *fail_mode;
> > +    size_t i;
> > +
> > +    fail_mode = NULL;
> > +    for (i = 0; i < n_controllers; i++) {
> > +        const char *s = controllers[i]->fail_mode;
> > +        if (s) {
> > +            if (!strcmp(s, "secure")) {
> > +                return s;
> 
> I'm not sure it's documented that "secure" on any bridge trumps any
> other setting.  

This is in the database documentation:

      When  more  than one controller is configured, fail_mode is con-
      sidered only when none of the configured controllers can be con-
      tacted.   At that point, the bridge enters secure mode if any of
      the controllers' fail_mode is  set  to  secure.   Otherwise,  it
      enters standalone mode if at least one fail_mode is set to stan-
      dalone.  If none of the fail_mode values are set, the default is
      implementation-defined.

> Also, does it make sense to have the fail-mode setting attached to the
> controller as opposed to the bridge, anyway?

No, probably not.  How much database compatibility do we need at this
point?  If we can change the database schema, then I'd rather have this
in the Bridge table too.

> > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> > 
> > @@ -1587,8 +1541,74 @@ bridge_reconfigure_controller(const struct ovsrec_open_vswitch 
> > +                    if (!c->local_netmask || !inet_aton(c->local_netmask, &mask)) {
> 
> This line doesn't fit in 80 columns.

OK, I wrapped it.

> > +            oc->band = (!c->connection_mode
> > +                       || !strcmp(c->connection_mode, "out-of-band")
> > +                       ? OFPROTO_IN_BAND
> > +                       : OFPROTO_OUT_OF_BAND);
> 
> I think the string comparison should be "in-band".

Yes, thank you, I fixed this now.

> >     <group title="OpenFlow Configuration">
> >       <column name="controller">
> > -        OpenFlow controller.  If unset, defaults to that specified by
> > -        <ref column="controller" table="Open_vSwitch"/> in the
> > -        <ref table="Open_vSwitch"/> table.  If the default is also unset, then
> > -        no OpenFlow controller will be used.
> > +        OpenFlow controller set.  If unset, defaults to the set of
> > +        controllers specified by <ref column="controller"
> > +        table="Open_vSwitch"/> in the <ref table="Open_vSwitch"/>
> > +        table.  If the default is also unset, then no OpenFlow
> > +        controllers will be used.
> >       </column>
> 
> Do you think it's worth explicitly mentioning that there's no merging
> of controllers between Bridge and Open_vSwitch tables (i.e., that
> entries in a Bridge trump any in the Open_vSwitch table)?

I thought it was clear.  Do you want to provide better wording?

> >               connecting to the controller forever.</dd>
> >           </dl>
> >         </p>
> > -        <p>If this value is unset, the default is
> > -        implementation-specific.</p>
> > +        <p>If this value is unset, the default is implementation-specific.</p>
> > +	<p>When more than one controller is configured,
> > +	  <ref column="fail_mode"/> is considered only when none of the
> > +	  configured controllers can be contacted.  At that point, the bridge
> > +	  enters secure mode if any of the controllers'
> > +	  <ref column="fail_mode"/> is set to <code>secure</code>.  Otherwise,
> > +	  it enters standalone mode if at least one <ref column="fail_mode"/>
> > +	  is set to <code>standalone</code>.  If none of the
> > +	  <ref column="fail_mode"/> values are set, the default is
> > +	  implementation-defined.</p>
> 
> Should we point out where these sorts of implementation-defined things
> can be located?

"implementation-defined" isn't quite the right term.  What I want to say
is that the default is subject to change from one version of OVS to the
next, and that if you really want one particular behavior you should set
that explicitly.  Any idea of a concise way to say that?

> > +      <column name="local_gateway">
> > +        The IP address of the gateway to configure on the local port, as a
> > +        string, e.g. <code>192.168.0.1</code>.  Leave this column unset if
> > +        this network has no gateway.
> 
> Did you want to mention that it's ignored if target is "discover"?
> This was done for the "local_ip" and "local_netmask" columns.

I don't see what you are looking at.  As far as I can tell there is only
one mention of behavior when discovery is configured, at the level of
the group that contains local_ip, local_netmask, and local_gateway.  I
don't see anything in the column descriptions for local_ip or
local_netmask.




More information about the dev mailing list