[ovs-dev] [netlink 02/16] datapath: Make adding and attaching a vport a single step.

Ben Pfaff blp at nicira.com
Wed Nov 3 22:41:25 UTC 2010


On Wed, Sep 29, 2010 at 05:35:15PM -0700, Jesse Gross wrote:
> On Fri, Sep 24, 2010 at 3:10 PM, Ben Pfaff <blp at nicira.com> wrote:
> > On Thu, Sep 23, 2010 at 07:10:58PM -0700, Jesse Gross wrote:
> >> On Fri, Sep 10, 2010 at 3:55 PM, Ben Pfaff <blp at nicira.com> wrote:
> >> > -#define ODP_PORT_INTERNAL (1 << 0) /* This port is simulated. */
> >> > +#define VPORT_TYPE_SIZE     16
> >> > +#define VPORT_CONFIG_SIZE     32
> >>
> >> I'm not sure that 32 bytes is sufficient.  What happens if we had an
> >> IPv6 tunnel?
> >
> > We probably wouldn't want to do it this way permanently.  But this is
> > short-lived: later in the series the vport configuration gets
> > transformed into a variable-length TLV using Netlink.  And it's enough
> > for all the current tunnels.
> >
> > If you think this is important enough to push by itself, then I'll
> > rework it somehow, maybe put it back to using a pointer (but then we'll
> > have to restore the compat code too).
> 
> I do think it is worth pushing pieces incrementally.  However, it
> seems unlikely that we will need the extra space before moving to
> Netlink, so this is fine for now.

OK, thanks.

> >> >  struct odp_port {
> >> >     char devname[16];           /* IFNAMSIZ */
> >> > +    char type[VPORT_TYPE_SIZE];
> >> >     uint16_t port;
> >> > -    uint16_t flags;
> >> > +    uint16_t reserved1;
> >> >     uint32_t reserved2;
> >> > +    uint32_t config[VPORT_CONFIG_SIZE / 4]; /* type-specific configuration */
> >>
> >> Why is this an array of uint32_t with the size divided by 4 instead of
> >> an array of uint8_t?
> >
> > It's to ensure proper alignment of 32-bit members inside 'config', so
> > that one can just cast port->config to a configuration struct instead of
> > having to memcpy().
> >
> > It's not strictly necessary since reserved2 is also 32-bits.
> >
> > I'll do this any way you like, if you'll tell me what you prefer.
> 
> Theoretically, config could have 64-bit members that would like to be
> properly aligned.  This is fine because config is 64-bit aligned
> currently but it makes the array of uint32_t seem weird.  However,
> making it an aligned array probably makes sense since the compiler is
> free to pad however it feels like.  64-bit alignment seems better to
> me but nothing currently needs it and, as you say, this is going away.

aligned_u64 would be a better choice, yes, but datapath-protocol.h is
used on non-Linux systems too and I get complaints if I put anything
Linux-specific in it.  (That's one of the problems I haven't solved for
the end of this series yet, for the new <linux/vswitch.h>.)  uint64_t
would just make it look like there wasn't a problem, so I guess I'll
leave it as uint32_t.

> >> I find the configuration to be somewhat inconsistent with respect to
> >> what is done in the dpif and what is done in the netdev (i.e. initial
> >> configuration is done here, reconfiguration is done through the
> >> netdev).
> >>
> >> What if we did this instead:
> >> * All configuration is done through the netdev, like before.
> >> * The netdev is created first and rather than passing in the name to
> >> the dpif, a pointer to the netdev is passed instead.
> >> * In general we would have a tighter coupling between netdevs and
> >> dpifs.  I think this is almost certain to happen anyways as we branch
> >> out into different, more specialized platforms since, for example, a
> >> hardware switching chip isn't going to have interfaces for ports that
> >> are separate from the datapath.
> >> * In the case of vports, the netdev could just parse and store the
> >> config but hold onto it until the port is added, allowing it to be
> >> done in one step.
> >
> > So instead of having a function to parse a config, we'd have a function
> > to retrieved a parsed config from a netdev?  Sure, I'll buy that.  If
> > that's what you mean, I'm happy to rework it along those lines.
> > Otherwise let me know more specifically what you'd like to see.
> 
> Yes, that's what I meant.  At some level it is not too different but
> it seems more self contained to me.

OK, done.

> >> > diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> >> > index 575b5b2..80712e3 100644
> >> > +    if (type[0] == '\0' || !strcmp(type, "system")) {
> >> > +        strncpy(port->type, "netdev", sizeof port->type);
> >> > +    } else {
> >>
> >> I wonder if we can require ports to have a type, that is, drop the
> >> implicit empty equals "system" mapping.  Actual people will use
> >> ovs-vsctl and that can keep the same interface.  For programmatic
> >> access, I don't see much benefit in allowing the implicit behavior.
> >
> > I think that we need to have a default in the system at some level (not
> > just in ovs-vsctl) because the "type" column in the Interface table is
> > optional.  When we do it here at the lowest level, we only have to do it
> > in one place.  We could make the "type" column mandatory, I guess,
> > although I had not considered it.
> 
> I was thinking about making the type column mandatory.  It would
> require fixing controllers but it seems better to  explicit specify
> what is wanted, rather than implicitly divining it.  Other than the
> backwards compatibility issue, I don't see any benefit in making it
> optional.

At the time you wrote that I think that it would have been possible to
make it mandatory, if we had acted quickly.  But I think that that time
has passed us by now.

> >> > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> >> > index 598b001..4ab66ef 100644
> >> > +                type = (iface_is_internal(br, if_name)
> >> > +                        ? "internal"
> >> > +                        : iface->cfg->type);
> >> > +
> >>
> >> Is it possible to require that internal interfaces actually have a
> >> type of "internal"?
> >
> > This seems reasonable.  It is a little tricky, because there is no
> > "iface" structure at all for bond fake interfaces (a "feature" I hate,
> > by the way).
> 
> Yeah, that is an annoying case.  Maybe we could at least start
> requiring actual internal interfaces to have an "internal" type and if
> we can clean this up in the future it will make it easier to
> transition.

I've started on that by adding a commit that makes ovs-vsctl set the
"type" column of new bridges to "internal"; previously it left them
unset.




More information about the dev mailing list