[ovs-dev] [netlink v5 56/61] datapath: Convert ODP_FLOW_* commands to use AF_NETLINK socket layer.

Jesse Gross jesse at nicira.com
Fri Jan 28 21:42:43 UTC 2011


On Fri, Jan 28, 2011 at 1:25 PM, Ben Pfaff <blp at nicira.com> wrote:
> On Fri, Jan 28, 2011 at 01:13:39PM -0800, Jesse Gross wrote:
>> On Thu, Jan 27, 2011 at 9:59 PM, Jesse Gross <jesse at nicira.com> wrote:
>> > On Thu, Jan 27, 2011 at 8:55 PM, Ben Pfaff <blp at nicira.com> wrote:
>> >> On Thu, Jan 27, 2011 at 07:57:20PM -0800, Jesse Gross wrote:
>> >>> On Wed, Jan 26, 2011 at 4:23 PM, Ben Pfaff <blp at nicira.com> wrote:
>> >>> > diff --git a/datapath/datapath.c b/datapath/datapath.c
>> >>> > index 36cc803..b2354d7 100644
>> >>> > --- a/datapath/datapath.c
>> >>> > +++ b/datapath/datapath.c
>> >>> > +static int odp_flow_cmd_del(struct sk_buff *skb, struct genl_info *info)
>> >>> [...]
>> >>> > - ? ? ? err = copy_flow_to_user(uodp_flow, dp, flow, flowcmd.total_len,
>> >>> > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ((u64)bucket << 32) | obj);
>> >>> > + ? ? ? flow_deferred_free(flow);
>> >>> >
>> >>> > -exit_kfree_skb:
>> >>> > - ? ? ? kfree_skb(skb);
>> >>> > -exit:
>> >>> > - ? ? ? return err;
>> >>> > + ? ? ? reply = odp_flow_cmd_build_info(flow, dp, info->snd_pid,
>> >>> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? info->snd_seq, ODP_FLOW_CMD_DEL);
>> >>>
>> >>> We're not holding rcu_read_lock here, so flow_deferred_free() could
>> >>> free flow immediately but then we use it in the next line.
>> >>
>> >> At first I was just going to fix the ordering, but then I saw that it
>> >> was better to allocate room for the reply before we remove it from the
>> >> table, and then to fill it afterward. ?So I ended up with the following
>> >> (as a context diff this time because it seems more readable in this
>> >> case). ?Compile tested only:
>> >
>> > This makes sense but should we do the same in other places? ?None of
>> > the fill_info functions can fail once memory has been allocated (the
>> > vport functions can return errors but that should just result in the
>> > attribute not being included).
>>
>> Looking at this again, what is the benefit to splitting the allocation
>> here?  Removing the flow from the table doesn't change anything that
>> is in the reply, so we could just as easily easily build the entire
>> response before deleting the flow as we do in other places.  Delaying
>> capturing the stats by a little bit potentially makes them more
>> accurate but it's not a guarantee and we don't do it for other types
>> of deletions that report final stats.
>
> It's more important for flows to get the stats right, in my opinion.

It doesn't get it right though, there's just an extra microsecond
before it takes a snapshot.  We could actually get it right but that
would require rearranging userspace a little bit to not have
performance problems like we had previously.  It would probably be
easier to do now than with ioctls though.

Most of this stuff I'm not really all that hung up about, whether it
is preallocating memory or improving the accuracy of the stats.
They're just incremental improvements that make the correct case more
common but don't fundamentally change the potential outcomes.  What I
don't see the reason for is inconsistency: the operations in all three
families are very similar so if we do something in one we should do it
in all.




More information about the dev mailing list