[ovs-dev] [PATCH] datapath: Use percpu allocator for flow-stats.

Pravin Shelar pshelar at nicira.com
Wed Dec 4 23:42:29 UTC 2013


On Wed, Dec 4, 2013 at 1:08 PM, Jesse Gross <jesse at nicira.com> wrote:
> On Fri, Nov 22, 2013 at 1:20 PM, Pravin B Shelar <pshelar at nicira.com> wrote:
>> diff --git a/datapath/datapath.c b/datapath/datapath.c
>> index d0a8595..41f2f8b 100644
>> --- a/datapath/datapath.c
>> +++ b/datapath/datapath.c
>> @@ -750,6 +744,7 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, struct genl_info *info)
>>         struct nlattr **a = info->attrs;
>>         struct ovs_header *ovs_header = info->userhdr;
>>         struct sw_flow_key key, masked_key;
>> +       bool wc_5tupple = false;
>
> There's actually only one 'p' in 'tuple'.
>
ok.

>> @@ -765,7 +760,8 @@ static int ovs_flow_cmd_new_or_set(struct sk_buff *skb, struct genl_info *info)
>>
>>         ovs_match_init(&match, &key, &mask);
>>         error = ovs_nla_get_match(&match,
>> -                                 a[OVS_FLOW_ATTR_KEY], a[OVS_FLOW_ATTR_MASK]);
>> +                                 a[OVS_FLOW_ATTR_KEY], a[OVS_FLOW_ATTR_MASK],
>> +                                 &wc_5tupple);
>
> It might be a little more readable if you made wc_5tuple the second
> argument so the inputs and outputs are grouped together.
>
ok.

>> @@ -892,7 +889,7 @@ static int ovs_flow_cmd_get(struct sk_buff *skb, struct genl_info *info)
>>         }
>>
>>         ovs_match_init(&match, &key, NULL);
>> -       err = ovs_nla_get_match(&match, a[OVS_FLOW_ATTR_KEY], NULL);
>> +       err = ovs_nla_get_match(&match, a[OVS_FLOW_ATTR_KEY], NULL, &wc_5tupple);
>
> Should we define a wrapper for this case or allow wc_5tuple to be NULL
> if we don't care?
>
ok, I allowed a NULL wc_5tuple pointer.

>> diff --git a/datapath/flow.c b/datapath/flow.c
>> index 57eb6b5..4353a4f 100644
>> --- a/datapath/flow.c
>> +++ b/datapath/flow.c
>>  void ovs_flow_stats_update(struct sw_flow *flow, struct sk_buff *skb)
> [...]
>> +       if (!flow->stats.is_percpu) {
>> +               stat = flow->stats.stat;
>> +
>> +               if ((flow->key.eth.type == htons(ETH_P_IP) ||
>> +                   flow->key.eth.type == htons(ETH_P_IPV6)) &&
>> +                   flow->key.ip.proto == IPPROTO_TCP &&
>> +                   likely(skb->len >= skb_transport_offset(skb) + sizeof(struct tcphdr)))
>> +                       tcp_flags = TCP_FLAGS_BE16(tcp_hdr(skb));
>> +       } else
>> +               stat = this_cpu_ptr(flow->stats.cpu_stats);
>
> I'm not sure that it's right to not update TCP flags when we are doing
> per-CPU stats. I assume that this is because it might correspond to
> multiple TCP flows but the behavior could be surprising (and is
> different from what currently happens) and exposes what should be an
> internal optimization.
>
Even if we aggregate TCP flags from multiple flows, the resulting
tcp_flags are going to be surprising.
But I think we should keep the current behavior, so I will revert this change.

>> -void ovs_flow_stats_get(struct sw_flow *flow, struct sw_flow_stats *res)
>> +void ovs_flow_stats_get(struct sw_flow *flow, unsigned long *used,
>> +                       struct ovs_flow_stats *stats, __be16 *tcp_flags)
>>  {
>>         int cpu, cur_cpu;
>>
>> -       memset(res, 0, sizeof(*res));
>> +       if (!flow->stats.is_percpu) {
>> +               *used = flow->stats.stat->used;
>> +               stats->n_packets = flow->stats.stat->packet_count;
>> +               stats->n_bytes = flow->stats.stat->byte_count;
>> +               *tcp_flags = flow->stats.tcp_flags;
>> +               return;
>> +       }
>
> I guess it seems a little nicer in these two functions if we don't
> just return immediately here but have the branch cover both conditions
> - otherwise it's a little hard to read.
>
ok.

>> diff --git a/datapath/flow.h b/datapath/flow.h
>> index 6b68cf1..2405638 100644
>> --- a/datapath/flow.h
>> +++ b/datapath/flow.h
>> -struct sw_flow_stats {
>> +struct flow_stats {
>>         u64 packet_count;               /* Number of packets matched. */
>>         u64 byte_count;                 /* Number of bytes matched. */
>>         unsigned long used;             /* Last used time (in jiffies). */
>>         spinlock_t lock;                /* Lock for atomic stats update. */
>> +};
>> +
>> +struct sw_flow_stats {
>>         __be16 tcp_flags;               /* Union of seen TCP flags. */
>> -} ____cacheline_aligned_in_smp;
>> +       bool is_percpu;
>> +       union {
>> +               struct flow_stats *stat;
>> +               struct flow_stats __percpu *cpu_stats;
>> +       };
>> +};
>
> I wonder if it makes sense to directly put struct flow_stats in the
> union for the non-percpu case and avoid an extra indirection there.
>
This saves some memory; that's why I did it this way.

>> diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
>> index 75c72b3..631aa59 100644
>> --- a/datapath/flow_netlink.c
>> +++ b/datapath/flow_netlink.c
>> @@ -567,6 +568,12 @@ static int ovs_key_from_nlattrs(struct sw_flow_match *match,  u64 attrs,
>>                 SW_FLOW_KEY_PUT(match, ipv4.addr.dst,
>>                                 ipv4_key->ipv4_dst, is_mask);
>>                 attrs &= ~(1ULL << OVS_KEY_ATTR_IPV4);
>> +
>> +               if (is_mask) {
>> +                       if (!ipv4_key->ipv4_proto ||
>> +                           !ipv4_key->ipv4_src || !ipv4_key->ipv4_dst)
>> +                               *wc_5tupple = true;
>
> Should we require that the entire mask is zero? I think that if any
> bit is wildcarded then we might see multiple CPUs being hit.
>
right.

> I think we also need to check the EtherType - if it is wildcarded then
> we might see many different TCP/IP flows.
>
ok.

>> @@ -640,6 +654,9 @@ static int ovs_key_from_nlattrs(struct sw_flow_match *match,  u64 attrs,
>>                                         tcp_key->tcp_dst, is_mask);
>>                 }
>>                 attrs &= ~(1ULL << OVS_KEY_ATTR_TCP);
>> +
>> +               if (is_mask && (!tcp_key->tcp_src || !tcp_key->tcp_dst))
>> +                       *wc_5tupple = true;
>>         }
>
> Don't we want to do this for UDP as well?
>
right, I missed it. I will add the same check for UDP.

>>  int ovs_nla_get_match(struct sw_flow_match *match,
>>                       const struct nlattr *key,
>> -                     const struct nlattr *mask)
>> +                     const struct nlattr *mask,
>> +                     bool *wc_5tupple)
>>  {
>>         const struct nlattr *a[OVS_KEY_ATTR_MAX + 1];
>>         const struct nlattr *encap;
>
> It seems like it would be good to initialize wc_5tuple here rather
> than forcing callers to do it.
ok.
