[ovs-dev] [PATCH] datapath: Use percpu allocator for flow-stats.

Jesse Gross jesse at nicira.com
Thu Dec 5 00:23:54 UTC 2013


On Wed, Dec 4, 2013 at 3:42 PM, Pravin Shelar <pshelar at nicira.com> wrote:
> On Wed, Dec 4, 2013 at 1:08 PM, Jesse Gross <jesse at nicira.com> wrote:
>> On Fri, Nov 22, 2013 at 1:20 PM, Pravin B Shelar <pshelar at nicira.com> wrote:
>>> diff --git a/datapath/flow.c b/datapath/flow.c
>>> index 57eb6b5..4353a4f 100644
>>> --- a/datapath/flow.c
>>> +++ b/datapath/flow.c
>>>  void ovs_flow_stats_update(struct sw_flow *flow, struct sk_buff *skb)
>> [...]
>>> +       if (!flow->stats.is_percpu) {
>>> +               stat = flow->stats.stat;
>>> +
>>> +               if ((flow->key.eth.type == htons(ETH_P_IP) ||
>>> +                   flow->key.eth.type == htons(ETH_P_IPV6)) &&
>>> +                   flow->key.ip.proto == IPPROTO_TCP &&
>>> +                   likely(skb->len >= skb_transport_offset(skb) + sizeof(struct tcphdr)))
>>> +                       tcp_flags = TCP_FLAGS_BE16(tcp_hdr(skb));
>>> +       } else
>>> +               stat = this_cpu_ptr(flow->stats.cpu_stats);
>>
>> I'm not sure that it's right to not update TCP flags when we are doing
>> per-CPU stats. I assume that this is because it might correspond to
>> multiple TCP flows but the behavior could be surprising (and is
>> different from what currently happens) and exposes what should be an
>> internal optimization.
>>
> Even if we aggregate TCP flags from multiple flows, the resulting
> tcp_flags are going to be surprising.
> But I think we should keep the current behavior, so I will revert
> this change.

When you say that it is surprising, do you just mean that it will be a
combination of the flags from multiple flows? Or is there something
worse that will happen?
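
In any case, my mental model of "keep the current behavior" is that
the flags get extracted regardless of which stats backend is in use
and then folded into the stats under the lock. A rough, untested
sketch (it assumes tcp_flags moves back into struct flow_stats so
that each per-CPU copy can accumulate it):

	void ovs_flow_stats_update(struct sw_flow *flow, struct sk_buff *skb)
	{
		struct flow_stats *stat;
		__be16 tcp_flags = 0;

		/* Extract flags for both the shared and per-CPU cases. */
		if ((flow->key.eth.type == htons(ETH_P_IP) ||
		    flow->key.eth.type == htons(ETH_P_IPV6)) &&
		    flow->key.ip.proto == IPPROTO_TCP &&
		    likely(skb->len >= skb_transport_offset(skb) +
			   sizeof(struct tcphdr)))
			tcp_flags = TCP_FLAGS_BE16(tcp_hdr(skb));

		if (!flow->stats.is_percpu)
			stat = flow->stats.stat;
		else
			stat = this_cpu_ptr(flow->stats.cpu_stats);

		spin_lock(&stat->lock);
		stat->used = jiffies;
		stat->packet_count++;
		stat->byte_count += skb->len;
		stat->tcp_flags |= tcp_flags;	/* union of seen flags */
		spin_unlock(&stat->lock);
	}

The read side would then OR the per-CPU tcp_flags values together in
the same way the packet and byte counts are summed.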

>>> diff --git a/datapath/flow.h b/datapath/flow.h
>>> index 6b68cf1..2405638 100644
>>> --- a/datapath/flow.h
>>> +++ b/datapath/flow.h
>>> -struct sw_flow_stats {
>>> +struct flow_stats {
>>>         u64 packet_count;               /* Number of packets matched. */
>>>         u64 byte_count;                 /* Number of bytes matched. */
>>>         unsigned long used;             /* Last used time (in jiffies). */
>>>         spinlock_t lock;                /* Lock for atomic stats update. */
>>> +};
>>> +
>>> +struct sw_flow_stats {
>>>         __be16 tcp_flags;               /* Union of seen TCP flags. */
>>> -} ____cacheline_aligned_in_smp;
>>> +       bool is_percpu;
>>> +       union {
>>> +               struct flow_stats *stat;
>>> +               struct flow_stats __percpu *cpu_stats;
>>> +       };
>>> +};
>>
>> I wonder if it makes sense to directly put struct flow_stats in the
>> union for the non-percpu case and avoid an extra indirection there.
>>
> This saves some memory; that's why I did it this way.

It does in the percpu case, although we're essentially already betting
that there are relatively few flows there. I don't think it saves any
memory in the non-percpu case though.
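
To be concrete, what I had in mind was something along these lines
(hypothetical, just to illustrate the layout):

	struct sw_flow_stats {
		__be16 tcp_flags;	/* Union of seen TCP flags. */
		bool is_percpu;
		union {
			/* Embedded directly in the non-percpu case. */
			struct flow_stats stat;
			/* Separately allocated in the percpu case. */
			struct flow_stats __percpu *cpu_stats;
		};
	};

The union then always takes sizeof(struct flow_stats) instead of a
pointer, which is why the indirection only buys anything when
cpu_stats is in use; in the non-percpu case, embedding avoids both
the extra allocation and the pointer chase.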


