[ovs-dev] netdev_reopen() review

Jesse Gross jesse at nicira.com
Tue Apr 13 19:11:27 UTC 2010


On Mon, Apr 12, 2010 at 2:30 PM, Ben Pfaff <blp at nicira.com> wrote:

> On Sun, Apr 11, 2010 at 02:09:18PM -0400, Jesse Gross wrote:
> > On Thu, Apr 8, 2010 at 7:09 PM, Ben Pfaff <blp at nicira.com> wrote:
> >
> > > I pushed a small change to the netdev library to the "wdp" branch this
> > > morning.  Since this is not a "core" branch (yet) and already has a
> > > bunch of badly broken stuff in it, I did not see a need to have it
> > > reviewed in advance, but it seems like a good idea to get it reviewed
> > > now.  Here's the commit.
> >
> > The code looks correct to me, though what do you think about calling it
> > something like netdev_clone() instead?  netdev_reopen() isn't the most
> > descriptive name to me.
>
> What do you think of netdev_ref() or netdev_incref()?  To me, "clone"
> implies data copying.
>

Either of those is fine.


> > Finally, I haven't really looked at the wdp branch so I don't know the
> > context but do you think the performance benefits are worth it?  It took
> me
> > a while to understand all the levels to which struct sk_buff in Linux can
> be
> > shared or independent.  This isn't nearly as bad but this is now the
> second
> > refcount that we have on netdevs.  How does this compare to struct
> wdp_port
> > that contains the netdev?
>
> netdev_(reopen/clone/ref) is used as one step as part of copying
> "wdp_port" objects around.  If we can't copy wdp_ports cheaply then we
> have to copy them expensively or avoid copying them at all.  We already
> rejected not copying them (see the ovs-hw discussion).  It probably
> doesn't matter whether it is expensive to copy wdp_ports, because they
> are not used anywhere performance critical.
>
> What I do like about using a refcount:
>

I'm not sure that I agree with all of your arguments:


>
>        * The netdev implementations themselves don't have to care.
>          It's transparent to them.  If we add a "copy" operation
>          etc. then presumably they would have to implement it.
>

I think a copy could be implemented entirely in the netdev core.  It just
needs to call the device open function and bump the netdev_dev refcount.


>        * It can't fail.  I guess that a copy operation could return an
>          error if something goes wrong.
>

That is true.


>
>        * There are very few contexts where even the netdev core code
>          has to care (i.e. the patch is very short).
>

Given a choice of where to put complexity, I'd prefer to put it in the
netdev core since it only has to be implemented once.  While the patch is
short, I think it adds complexity to users of the netdev library because it
implicitly adds a new type of netdev (a shared shallow copy) that the caller
needs to be aware of.  It's also not just the caller that needs to know
since after you pass the pointer around a few times it's hard to know where
it came from.


>
> I'm leaning toward using the refcounts, as I guess you can see.  If you
> think my arguments are bogus, please let me know.
>

To me the strongest argument in favor of refcounting here is when it is used
to handle datapath miss events.  Potentially opening a netdev could be
somewhat expensive and it might slow down flow setup times.  It also causes
churn (allocating memory and file descriptors) during the steady state
operation of the switch.  Though copying the wdp_port involves allocating
memory anyways, independent of the netdev changes.

Of course all of this probably doesn't matter that much in practice because
reading from a netdev in userspace is pretty rare, it's more just the
principle.  If we do implement refcounting I would like to see a
netdev_is_cloned (obviously whatever we decide to call it, clone just seems
to stick in my mind) and then assert on that in the places where we actually
read or even just check in netdev_recv().  The tap netdev always shared FDs
so it should probably also hook into that mechanism.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openvswitch.org/pipermail/ovs-dev/attachments/20100413/0bd08a36/attachment-0003.html>


More information about the dev mailing list