[ovs-dev] [PATCH] datapath: Use percpu allocator for flow-stats.

Pravin Shelar pshelar at nicira.com
Thu Dec 5 00:36:09 UTC 2013


On Wed, Dec 4, 2013 at 4:23 PM, Jesse Gross <jesse at nicira.com> wrote:
> On Wed, Dec 4, 2013 at 3:42 PM, Pravin Shelar <pshelar at nicira.com> wrote:
>> On Wed, Dec 4, 2013 at 1:08 PM, Jesse Gross <jesse at nicira.com> wrote:
>>> On Fri, Nov 22, 2013 at 1:20 PM, Pravin B Shelar <pshelar at nicira.com> wrote:
>>>> diff --git a/datapath/flow.c b/datapath/flow.c
>>>> index 57eb6b5..4353a4f 100644
>>>> --- a/datapath/flow.c
>>>> +++ b/datapath/flow.c
>>>>  void ovs_flow_stats_update(struct sw_flow *flow, struct sk_buff *skb)
>>> [...]
>>>> +       if (!flow->stats.is_percpu) {
>>>> +               stat = flow->stats.stat;
>>>> +
>>>> +               if ((flow->key.eth.type == htons(ETH_P_IP) ||
>>>> +                   flow->key.eth.type == htons(ETH_P_IPV6)) &&
>>>> +                   flow->key.ip.proto == IPPROTO_TCP &&
>>>> +                   likely(skb->len >= skb_transport_offset(skb) + sizeof(struct tcphdr)))
>>>> +                       tcp_flags = TCP_FLAGS_BE16(tcp_hdr(skb));
>>>> +       } else
>>>> +               stat = this_cpu_ptr(flow->stats.cpu_stats);
>>>
>>> I'm not sure that it's right to not update TCP flags when we are doing
>>> per-CPU stats. I assume that this is because it might correspond to
>>> multiple TCP flows but the behavior could be surprising (and is
>>> different from what currently happens) and exposes what should be an
>>> internal optimization.
>>>
>> Even if we aggregate tcp flags from multiple flows, the resulting
>> tcp_flags are going to be surprising.
>> But I think we should keep the current behavior, so I will revert this change.
>
> When you say that it is surprising, do you just mean that it will be
> a combination of multiple flows? Or is there something worse that
> will happen?
>
It is only that the reported tcp_flags would be combined from multiple
TCP flows; nothing worse happens.
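
Reverting means moving tcp_flags back into struct flow_stats so that
the percpu path records it as well. Roughly, as a sketch only (the
stats-bump body below is assumed, not copied from the patch):

	__be16 tcp_flags = 0;
	struct flow_stats *stat;

	if (flow->stats.is_percpu)
		stat = this_cpu_ptr(flow->stats.cpu_stats);
	else
		stat = flow->stats.stat;

	if ((flow->key.eth.type == htons(ETH_P_IP) ||
	     flow->key.eth.type == htons(ETH_P_IPV6)) &&
	    flow->key.ip.proto == IPPROTO_TCP &&
	    likely(skb->len >= skb_transport_offset(skb) +
			       sizeof(struct tcphdr)))
		tcp_flags = TCP_FLAGS_BE16(tcp_hdr(skb));

	spin_lock(&stat->lock);
	stat->used = jiffies;
	stat->packet_count++;
	stat->byte_count += skb->len;
	stat->tcp_flags |= tcp_flags;	/* union of seen flags */
	spin_unlock(&stat->lock);

On the dump side, the percpu case would then OR the flags together,
which is where the surprise comes from:

	/* tcp_flags becomes the union of whatever every CPU saw, i.e.
	 * potentially flags from many distinct TCP connections.
	 */
	for_each_possible_cpu(cpu) {
		const struct flow_stats *s;

		s = per_cpu_ptr(flow->stats.cpu_stats, cpu);
		tcp_flags |= s->tcp_flags;
	}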

>>>> diff --git a/datapath/flow.h b/datapath/flow.h
>>>> index 6b68cf1..2405638 100644
>>>> --- a/datapath/flow.h
>>>> +++ b/datapath/flow.h
>>>> -struct sw_flow_stats {
>>>> +struct flow_stats {
>>>>         u64 packet_count;               /* Number of packets matched. */
>>>>         u64 byte_count;                 /* Number of bytes matched. */
>>>>         unsigned long used;             /* Last used time (in jiffies). */
>>>>         spinlock_t lock;                /* Lock for atomic stats update. */
>>>> +};
>>>> +
>>>> +struct sw_flow_stats {
>>>>         __be16 tcp_flags;               /* Union of seen TCP flags. */
>>>> -} ____cacheline_aligned_in_smp;
>>>> +       bool is_percpu;
>>>> +       union {
>>>> +               struct flow_stats *stat;
>>>> +               struct flow_stats __percpu *cpu_stats;
>>>> +       };
>>>> +};
>>>
>>> I wonder if it makes sense to directly put struct flow_stats in the
>>> union for the non-percpu case and avoid an extra indirection there.
>>>
>> This saves some memory; that's why I did it this way.
>
> It does in the percpu case, although we're essentially already betting
> that there are relatively few flows there. I don't think it saves
> memory in the non-percpu case though.

I also wanted to make better use of the cache line (as it is accessed
in the hot path) in the per-CPU case.
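
For comparison with the hunk above, the embedded variant I understand
you are suggesting would look something like this (sketch only):

	struct sw_flow_stats {
		__be16 tcp_flags;	/* Union of seen TCP flags. */
		bool is_percpu;
		union {
			struct flow_stats stat;			/* embedded inline */
			struct flow_stats __percpu *cpu_stats;	/* alloc_percpu() */
		};
	};

That drops an indirection (and an allocation) in the non-percpu case,
but it grows the union from one pointer to sizeof(struct flow_stats),
so every flow with percpu stats would carry that padding too. With the
pointer, struct sw_flow_stats stays small, which keeps the flags and
the stats pointer together on the flow's hot cache line.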


