[ovs-dev] [PATCH v7 11/11] netdev-offload-tc: Add offload support for sFlow

Eelco Chaudron echaudro at redhat.com
Fri Dec 4 11:34:20 UTC 2020



On 4 Dec 2020, at 6:34, Chris Mi wrote:

> On 12/2/2020 11:08 PM, Eelco Chaudron wrote:
>>
>>
>> On 30 Nov 2020, at 5:48, Chris Mi wrote:
>>
>>> Create a unique group ID to map the sFlow info when offloading sFlow
>>> action to TC. When showing the offloaded datapath flows, translate 
>>> the
>>> group ID from TC sample action to sFlow info using the mapping.
>>>
>>> Signed-off-by: Chris Mi <cmi at nvidia.com>
>>> Reviewed-by: Eli Britstein <elibr at nvidia.com>
>>>
>>
>> <SNIP>
>>
>> This is not a full review, I was just trying to understand how you 
>> implemented the conditional action set as I’m working on something 
>> similar for dec_ttl.
>>
>> However, I noticed that you ignore this condition entirely, which I 
>> think should be solved one way or the other in this patchset. See my 
>> comments below.
>>
>>> @@ -2061,17 +2147,49 @@ netdev_tc_flow_put(struct netdev *netdev, 
>>> struct match *match,
>>>              struct dpif_sflow_attr sflow_attr;
>>>
>>>              memset(&sflow_attr, 0, sizeof sflow_attr);
>>> -            gid_alloc_ctx(&sflow_attr);
>>> +            if (flower.tunnel) {
>>> +                sflow_attr.tunnel = 
>>> CONST_CAST(struct flow_tnl *, tnl);
>>> +            }
>>> +            err = parse_sample_action(nla, &sflow_attr, 
>>> action);
>>
>> The main problem here (in the above call) you only look for 
>> OVS_ACTION_ATTR_USERSPACE, if it exists you are ok and handle ONLY 
>> that action.
>> If there are more actions like “userspace(),1,2”, they output to 
>> ports 1 and 2 are ignored. Guess for your implementation you should 
>> only allow TC offload if the action set consists of a single 
>> userspace() action.
> If sample rate is not 1, the sample action looks like this:
> actions:sample(sample=50.0%,actions(userspace(pid=2619320998,sFlow(vid=0,pcp=0,output=17),actions))),enp4s0f0_1

actions:sample(sample=50.0%,actions(userspace(pid=2619320998,sFlow(vid=0,pcp=0,output=17),actions)),output=12),enp4s0f0_1

You assume the only action in actions is userspace, there could be 
others like for example output=12 above.
Currently, this might not happen but could be in the future. So you have 
to be sure all actions in the actions() set are offloadable. See below 
for more details.

> If sample rate is 1, the sample action looks like this:
> actions:userspace(pid=2619320998,sFlow(vid=0,pcp=0,output=17),actions),enp4s0f0_1
>
> In your case, userspace(),1,2, if there is a sFlow cookie inside 
> userspace action, we can offload. Otherwise, we don't offload.

Thanks, now it makes sense to me :) Maybe the comment should be more 
explicit, saying something that we can only offload user spaces actions 
for sflow.

> In the code, we deal with above 2 situations like this:
> ...
>         } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_SAMPLE) 
> {
> ...
>         } else if (nl_attr_type(nla) == 
> OVS_ACTION_ATTR_USERSPACE) {
> ...
>>
>>> +            if (err) {
>>> +                goto out;
>>> +            }
>>> +            group_id = gid_alloc_ctx(&sflow_attr);
>>> +            action->sample.action_group_id = group_id;
>>> +            flower.action_count++;
>>
>> I think these last three commands should be done as part of 
>> parse_sample_action() as all other "if" conditions do either all in 
>> the branch or in the called function.

I’m suggesting the following (including same ordering and name changes 
to reflect what they point to):

diff --git a/lib/dpif.h b/lib/dpif.h
index ed4257210..150162034 100644
--- a/lib/dpif.h
+++ b/lib/dpif.h
@@ -878,6 +878,7 @@ void dpif_register_upcall_cb(struct dpif *, 
upcall_callback *, void *aux);
  struct dpif_sflow_attr {
      const struct nlattr *sflow; /* sFlow action */
      size_t sflow_len;           /* Length of 'sflow' in bytes. */
+    // Why do we need to store sflow_len, it's part of the nlattr data?

      void *userdata;             /* struct user_action_cookie */
      size_t userdata_len;        /* struct user_action_cookie length */
diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 75744aa3c..84991ff6a 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1722,20 +1722,25 @@ parse_action_userspace(const struct nlattr 
*actions,
  }

  static int
-parse_sample_action(const struct nlattr *actions,
-                    struct dpif_sflow_attr *sflow_attr,
-                    struct tc_action *tc_action)
+parse_sample_action(struct tc_flower *flower, struct tc_action 
*tc_action,
+                    const struct nlattr *sample_action,
+                    const struct flow_tnl *tnl)
  {
+    struct dpif_sflow_attr sflow_attr;
      const struct nlattr *nla;
      unsigned int left;
      int ret = EINVAL;

-    sflow_attr->sflow = actions;
-    sflow_attr->sflow_len = actions->nla_len;
+    memset(&sflow_attr, 0, sizeof sflow_attr);
+    sflow_attr.sflow = sample_action;
+    sflow_attr.sflow_len = sample_action->nla_len;
+    if (flower->tunnel) {
+        sflow_attr.tunnel = CONST_CAST(struct flow_tnl *, tnl);
+    }

-    NL_NESTED_FOR_EACH_UNSAFE (nla, left, actions) {
+    NL_NESTED_FOR_EACH_UNSAFE (nla, left, sample_action) {
          if (nl_attr_type(nla) == OVS_SAMPLE_ATTR_ACTIONS) {
-            ret = parse_action_userspace(nla, sflow_attr);
+            ret = parse_action_userspace(nla, &sflow_attr);
          } else if (nl_attr_type(nla) == OVS_SAMPLE_ATTR_PROBABILITY) {
              tc_action->type = TC_ACT_SAMPLE;
              tc_action->sample.action_rate = UINT32_MAX / 
nl_attr_get_u32(nla);
@@ -1744,11 +1749,13 @@ parse_sample_action(const struct nlattr 
*actions,
          }
      }

-    if (tc_action->sample.action_rate) {
-        return ret;
-    }
+    if (!tc_action->sample.action_rate || !ret)
+        return EINVAL;

-    return EINVAL;
+    tc_action->sample.action_group_id = gid_alloc_ctx(&sflow_attr);
+    //In theory alloc can fail, so maybe we should check for it.
+    flower->action_count++;
+    return 0;
  }

  static int
@@ -2144,19 +2151,10 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
match *match,
              action->chain = 0;  /* 0 is reserved and not used by 
recirc. */
              flower.action_count++;
          } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_SAMPLE) {
-            struct dpif_sflow_attr sflow_attr;
-
-            memset(&sflow_attr, 0, sizeof sflow_attr);
-            if (flower.tunnel) {
-                sflow_attr.tunnel = CONST_CAST(struct flow_tnl *, tnl);
-            }
-            err = parse_sample_action(nla, &sflow_attr, action);
+            err = parse_sample_action(&flower, action, nla, tnl);
              if (err) {
                  goto out;
              }
-            group_id = gid_alloc_ctx(&sflow_attr);
-            action->sample.action_group_id = group_id;
-            flower.action_count++;
          } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
              struct dpif_sflow_attr sflow_attr;

In addition the main objection is that you ignore the other actions, 
which I think can be solved with the following diff (your called your 
function odd, this is why you probably did not notice it):

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 84991ff6a..7277b699f 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1704,21 +1704,31 @@ parse_userspace_userdata(const struct nlattr 
*actions,
  }

  static int
-parse_action_userspace(const struct nlattr *actions,
-                       struct dpif_sflow_attr *sflow_attr)
+parse_sample_actions(const struct nlattr *actions,
+                     struct dpif_sflow_attr *sflow_attr)
  {
      const struct nlattr *nla;
      unsigned int left;
+    int err = EINVAL;

      NL_NESTED_FOR_EACH_UNSAFE (nla, left, actions) {
          if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
-            return parse_userspace_userdata(nla, sflow_attr);
+            err = parse_userspace_userdata(nla, sflow_attr);
+        } else {
+            /* We can't offload other nested actions */
+            VLOG_DBG_RL(&error_rl,
+                        "%s: other than OVS_ACTION_ATTR_USERSPACE 
attribute",
+                        __func__);
+            return EINVAL;
          }
      }

-    VLOG_ERR_RL(&error_rl, "%s: no OVS_ACTION_ATTR_USERSPACE 
attribute",
-                __func__);
-    return EINVAL;
+    if (err) {
+        //THIS IS NOT AN ERROR CASE, FROM A DATAPATH IT'S VALID TO DO 
ANY ACTION
+        VLOG_DBG_RL(&error_rl, "%s: no OVS_ACTION_ATTR_USERSPACE 
attribute",
+                    __func__);
+    }
+    return err;
  }

  static int
@@ -1740,7 +1750,7 @@ parse_sample_action(struct tc_flower *flower, 
struct tc_action *tc_action,

      NL_NESTED_FOR_EACH_UNSAFE (nla, left, sample_action) {
          if (nl_attr_type(nla) == OVS_SAMPLE_ATTR_ACTIONS) {
-            ret = parse_action_userspace(nla, &sflow_attr);
+            ret = parse_sample_actions(nla, &sflow_attr);
          } else if (nl_attr_type(nla) == OVS_SAMPLE_ATTR_PROBABILITY) {
              tc_action->type = TC_ACT_SAMPLE;
              tc_action->sample.action_rate = UINT32_MAX / 
nl_attr_get_u32(nla);


> Please see my above reply. There are two situations. I have comment in 
> the code:
>
> 2160         } else if (nl_attr_type(nla) == 
> OVS_ACTION_ATTR_USERSPACE) {
> 2161             struct dpif_sflow_attr flow_attr;
> 2162
> 2163             /* If there is a sFlow cookie inside of a 
> userspace attribute,
> 2164              * but no sample attribute, that means 
> sampling rate is 1.
> 2165              */

Ack, maybe also wrap this in a single function… Something like (not 
tested):

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 7277b699f..bf3cd3ada 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1680,14 +1680,15 @@ flower_match_to_tun_opt(struct tc_flower 
*flower, const struct flow_tnl *tnl,
      flower->mask.tunnel.metadata.present.len = 
tnl->metadata.present.len;
  }

+
  static int
-parse_userspace_userdata(const struct nlattr *actions,
-                         struct dpif_sflow_attr *sflow_attr)
+__parse_userspace_attributes(const struct nlattr *userspace_action,
+                             struct dpif_sflow_attr *sflow_attr)
  {
      const struct nlattr *nla;
      unsigned int left;

-    NL_NESTED_FOR_EACH_UNSAFE (nla, left, actions) {
+    NL_NESTED_FOR_EACH_UNSAFE (nla, left, userspace_action) {
          if (nl_attr_type(nla) == OVS_USERSPACE_ATTR_USERDATA) {
              struct user_action_cookie *cookie;

@@ -1695,14 +1696,50 @@ parse_userspace_userdata(const struct nlattr 
*actions,
              if (cookie->type == USER_ACTION_COOKIE_SFLOW) {
                  sflow_attr->userdata = CONST_CAST(void *, 
nl_attr_get(nla));
                  sflow_attr->userdata_len = nl_attr_get_size(nla);
+                // ^ Why do we store this user data? It's part of the 
data we
+                //   store in sflow_attr.sflow.
                  return 0;
              }
+            // What about the other attributes, don't we need them? Or 
do
+            // we process them in the slow path by sflow_attr.sflow?
+            // As I mentioned earlier, I did not review the full code.
          }
      }
-
      return EINVAL;
  }

+static int
+parse_userspace_attributes(struct tc_flower *flower, struct tc_action 
*action,
+                           const struct nlattr *userspace_action,
+                           const struct flow_tnl *tnl)
+{
+    struct dpif_sflow_attr sflow_attr;
+    int err;
+
+    /* We only support offloading the userspace action for sFlow.
+     * This is when the sFlow cookie exists inside of a the userspace
+     * attribute. This is true when the sample rate for sFlow is 1.
+     */
+    memset(&sflow_attr, 0, sizeof sflow_attr);
+    if (flower->tunnel) {
+        sflow_attr.tunnel = CONST_CAST(struct flow_tnl *, tnl);
+    }
+
+    err = __parse_userspace_attributes(userspace_action, &sflow_attr);
+    if (err)
+        return err;
+
+    sflow_attr.sflow = userspace_action;
+    //Did you notice that you put the userspace_action here?
+    //In the other case you put the sample_action here...
+    sflow_attr.sflow_len = userspace_action->nla_len;
+    action->type = TC_ACT_SAMPLE;
+    action->sample.action_group_id = gid_alloc_ctx(&sflow_attr);
+    action->sample.action_rate = 1;
+    flower->action_count++;
+    return 0;
+}
+
  static int
  parse_sample_actions(const struct nlattr *actions,
                       struct dpif_sflow_attr *sflow_attr)
@@ -1713,7 +1750,7 @@ parse_sample_actions(const struct nlattr *actions,

      NL_NESTED_FOR_EACH_UNSAFE (nla, left, actions) {
          if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
-            err = parse_userspace_userdata(nla, sflow_attr);
+            err = __parse_userspace_attributes(nla, sflow_attr);
          } else {
              /* We can't offload other nested actions */
              VLOG_DBG_RL(&error_rl,
@@ -2166,27 +2203,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct 
match *match,
                  goto out;
              }
          } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
-            struct dpif_sflow_attr sflow_attr;
-
-            /* We only support offloading the userspace action for 
sFlow.
-             * This is when the sFlow cookie exists inside of a the 
userspace
-             * attribute. This is true when the sample rate for sFlow 
is 1.
-             */
-            memset(&sflow_attr, 0, sizeof sflow_attr);
-            if (flower.tunnel) {
-                sflow_attr.tunnel = CONST_CAST(struct flow_tnl *, tnl);
-            }
-            err = parse_userspace_userdata(nla, &sflow_attr);
+
+            err = parse_userspace_userdata(&flower, action, nla, tnl);
              if (err) {
                  goto out;
              }
-            sflow_attr.sflow = nla;
-            sflow_attr.sflow_len = nla->nla_len;
-            group_id = gid_alloc_ctx(&sflow_attr);
-            action->type = TC_ACT_SAMPLE;
-            action->sample.action_group_id = group_id;
-            action->sample.action_rate = 1;
-            flower.action_count++;
          } else {
              VLOG_DBG_RL(&rl, "unsupported put action type: %d",
                          nl_attr_type(nla));

>>
>>
>>> +        } else if (nl_attr_type(nla) == 
>>> OVS_ACTION_ATTR_USERSPACE) {
>>
>> The OVS_ACTION_ATTR_USERSPACE should not be handled here as a 
>> separate action, at least it should not end up as a TC_ACT_SAMPLE. 
>> What if someone else is having an OVS_ACTION_ATTR_USERPSACE in its 
>> chain, now it will be translated into a SAMPLE action.
>>
>> The parse_sample_action() function called above should loop trough 
>> the OVS_SAMPLE_ATTR_ACTIONS and find the OVS_ACTION_ATTR_USERSPACE, 
>> which it looks like it does. So this “if (nl_attr_type(nla) == 
>> OVS_ACTION_ATTR_USERSPACE)” part should not be needed at all.
> Please see above reply.

Ack see above
>>
>>> +            struct dpif_sflow_attr sflow_attr;
>>> +
>>> +            /* If there is a sFlow cookie inside of a 
>>> userspace attribute,
>>> +             * but no sample attribute, that means 
>>> sampling rate is 1.
>>> +             */
>>> +            memset(&sflow_attr, 0, sizeof sflow_attr);
>>> +            if (flower.tunnel) {
>>> +                sflow_attr.tunnel = 
>>> CONST_CAST(struct flow_tnl *, tnl);
>>> +            }
>>> +            err = parse_userspace_userdata(nla, 
>>> &sflow_attr);
>>> +            if (err) {
>>> +                goto out;
>>> +            }
>>> +            sflow_attr.sflow = nla;
>>> +            sflow_attr.sflow_len = nla->nla_len;
>>> +            group_id = gid_alloc_ctx(&sflow_attr);
>>> +            action->type = TC_ACT_SAMPLE;
>>> +            action->sample.action_group_id = group_id;
>>> +            action->sample.action_rate = 1;
>>> +            flower.action_count++;
>>>          } else {
>>>              VLOG_DBG_RL(&rl, "unsupported put action 
>>> type: %d",
>>>                          nl_attr_type(nla));
>>> -            return EOPNOTSUPP;
>>> +            err = EOPNOTSUPP;
>>> +            goto out;
>>>          }
>>>      }
>>>
>>
>> <SNIP>
>>
>> One other general command, how would we guarantee that the group ID 
>> value used by tc-sample is unique and not being used by any other 
>> application than OVS?
> Please see the comment:
>
>  239 /* Allocate a unique group id for the given set of flow 
> metadata.
>  240  * The ID space is 2^^32, so there should never be a situation 
> in which all
>  241  * the IDs are used up.  We loop until we find a free one. */
>  242 static struct gid_node *
>  243 gid_alloc__(const struct dpif_sflow_attr *sflow, uint32_t hash)
>
> This id is similar to recirc id.
>> Do we assume tc-sample is elusively assigned to OVS?
> Yes, I think so.
>> This might not be the case in customer environments, so we might need 
>> to be able to disable offloading this action.
> If we can offload, why not disable it. If we can't offload, tc will 
> return EOPNOTSUPP and fail back to ovs datapath.
> So I don't think we need to be able to disable it.

Ack all three above, but I was referring to someone that uses sflow kmod 
on his management interface already, and now all of a sudden OVS will 
use it too. Which could cause a problem as the u32 space is started 
between all user space application. So I do think we need an option to 
disable it in OVS.


>>
>> One final question, do you know if this will be supported in your 
>> NICs as hardware offload?
> Sure. We submitted the first version of the OVS patch set in 
> September. We support offload before that.
> I believe the driver change will be in linux net-next branch soon.

Cool, looking forward to seeing the change…


If you do send out a new revision of this patchset, I’ll try to make 
some time and do a full review and some testing.



More information about the dev mailing list