[ovs-dev] [PATCH v7 11/11] netdev-offload-tc: Add offload support for sFlow
Chris Mi
cmi at nvidia.com
Mon Dec 14 09:33:39 UTC 2020
On 12/4/2020 7:34 PM, Eelco Chaudron wrote:
>
>
> On 4 Dec 2020, at 6:34, Chris Mi wrote:
>
>> On 12/2/2020 11:08 PM, Eelco Chaudron wrote:
>>>
>>>
>>> On 30 Nov 2020, at 5:48, Chris Mi wrote:
>>>
>>>> Create a unique group ID to map the sFlow info when offloading sFlow
>>>> action to TC. When showing the offloaded datapath flows, translate the
>>>> group ID from TC sample action to sFlow info using the mapping.
>>>>
>>>> Signed-off-by: Chris Mi <cmi at nvidia.com>
>>>> Reviewed-by: Eli Britstein <elibr at nvidia.com>
>>>>
>>>
>>> <SNIP>
>>>
>>> This is not a full review, I was just trying to understand how you
>>> implemented the conditional action set as I’m working on something
>>> similar for dec_ttl.
>>>
>>> However, I noticed that you ignore this condition entirely, which I
>>> think should be solved one way or the other in this patchset. See my
>>> comments below.
>>>
>>>> @@ -2061,17 +2147,49 @@ netdev_tc_flow_put(struct netdev *netdev,
>>>> struct match *match,
>>>> struct dpif_sflow_attr sflow_attr;
>>>>
>>>> memset(&sflow_attr, 0, sizeof sflow_attr);
>>>> - gid_alloc_ctx(&sflow_attr);
>>>> + if (flower.tunnel) {
>>>> + sflow_attr.tunnel = CONST_CAST(struct flow_tnl *,
>>>> tnl);
>>>> + }
>>>> + err = parse_sample_action(nla, &sflow_attr, action);
>>>
>>> The main problem here (in the above call) you only look for
>>> OVS_ACTION_ATTR_USERSPACE, if it exists you are ok and handle ONLY
>>> that action.
>>> If there are more actions like “userspace(),1,2”, they output to
>>> ports 1 and 2 are ignored. Guess for your implementation you should
>>> only allow TC offload if the action set consists of a single
>>> userspace() action.
>> If sample rate is not 1, the sample action looks like this:
>> actions:sample(sample=50.0%,actions(userspace(pid=2619320998,sFlow(vid=0,pcp=0,output=17),actions))),enp4s0f0_1
>>
>
> actions:sample(sample=50.0%,actions(userspace(pid=2619320998,sFlow(vid=0,pcp=0,output=17),actions)),output=12),enp4s0f0_1
>
>
> You assume the only action in actions is userspace, there could be
> others like for example output=12 above.
> Currently, this might not happen but could be in the future. So you
> have to be sure all actions in the actions() set are offloadable. See
> below for more details.
>
>> If sample rate is 1, the sample action looks like this:
>> actions:userspace(pid=2619320998,sFlow(vid=0,pcp=0,output=17),actions),enp4s0f0_1
>>
>>
>> In your case, userspace(),1,2, if there is a sFlow cookie inside
>> userspace action, we can offload. Otherwise, we don't offload.
>
> Thanks, now it makes sense to me :) Maybe the comment should be more
> explicit, saying something that we can only offload user spaces
> actions for sflow.
Done.
>
>> In the code, we deal with above 2 situations like this:
>> ...
>> } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_SAMPLE) {
>> ...
>> } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
>> ...
>>>
>>>> + if (err) {
>>>> + goto out;
>>>> + }
>>>> + group_id = gid_alloc_ctx(&sflow_attr);
>>>> + action->sample.action_group_id = group_id;
>>>> + flower.action_count++;
>>>
>>> I think these last three commands should be done as part of
>>> parse_sample_action() as all other "if" conditions do either all in
>>> the branch or in the called function.
>
Thanks for your patch. I revised the code according to your suggestion.
> I’m suggesting the following (including same ordering and name changes
> to reflect what they point to):
>
> diff --git a/lib/dpif.h b/lib/dpif.h
> index ed4257210..150162034 100644
> --- a/lib/dpif.h
> +++ b/lib/dpif.h
> @@ -878,6 +878,7 @@ void dpif_register_upcall_cb(struct dpif *,
> upcall_callback *, void *aux);
> struct dpif_sflow_attr {
> const struct nlattr *sflow; /* sFlow action */
> size_t sflow_len; /* Length of 'sflow' in bytes. */
> + // Why do we need to store sflow_len, it's part of the nlattr data?
We need to copy and compare the dpif_sflow_attr. So sflow_len is used
for that.
>
> void *userdata; /* struct user_action_cookie */
> size_t userdata_len; /* struct user_action_cookie length */
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 75744aa3c..84991ff6a 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1722,20 +1722,25 @@ parse_action_userspace(const struct nlattr
> *actions,
> }
>
> static int
> -parse_sample_action(const struct nlattr *actions,
> - struct dpif_sflow_attr *sflow_attr,
> - struct tc_action *tc_action)
> +parse_sample_action(struct tc_flower *flower, struct tc_action
> *tc_action,
> + const struct nlattr *sample_action,
> + const struct flow_tnl *tnl)
> {
> + struct dpif_sflow_attr sflow_attr;
> const struct nlattr *nla;
> unsigned int left;
> int ret = EINVAL;
>
> - sflow_attr->sflow = actions;
> - sflow_attr->sflow_len = actions->nla_len;
> + memset(&sflow_attr, 0, sizeof sflow_attr);
> + sflow_attr.sflow = sample_action;
> + sflow_attr.sflow_len = sample_action->nla_len;
> + if (flower->tunnel) {
> + sflow_attr.tunnel = CONST_CAST(struct flow_tnl *, tnl);
> + }
>
> - NL_NESTED_FOR_EACH_UNSAFE (nla, left, actions) {
> + NL_NESTED_FOR_EACH_UNSAFE (nla, left, sample_action) {
> if (nl_attr_type(nla) == OVS_SAMPLE_ATTR_ACTIONS) {
> - ret = parse_action_userspace(nla, sflow_attr);
> + ret = parse_action_userspace(nla, &sflow_attr);
> } else if (nl_attr_type(nla) == OVS_SAMPLE_ATTR_PROBABILITY) {
> tc_action->type = TC_ACT_SAMPLE;
> tc_action->sample.action_rate = UINT32_MAX /
> nl_attr_get_u32(nla);
> @@ -1744,11 +1749,13 @@ parse_sample_action(const struct nlattr *actions,
> }
> }
>
> - if (tc_action->sample.action_rate) {
> - return ret;
> - }
> + if (!tc_action->sample.action_rate || !ret)
> + return EINVAL;
>
> - return EINVAL;
> + tc_action->sample.action_group_id = gid_alloc_ctx(&sflow_attr);
> + //In theory alloc can fail, so maybe we should check for it.
> + flower->action_count++;
> + return 0;
> }
>
> static int
> @@ -2144,19 +2151,10 @@ netdev_tc_flow_put(struct netdev *netdev,
> struct match *match,
> action->chain = 0; /* 0 is reserved and not used by
> recirc. */
> flower.action_count++;
> } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_SAMPLE) {
> - struct dpif_sflow_attr sflow_attr;
> -
> - memset(&sflow_attr, 0, sizeof sflow_attr);
> - if (flower.tunnel) {
> - sflow_attr.tunnel = CONST_CAST(struct flow_tnl *, tnl);
> - }
> - err = parse_sample_action(nla, &sflow_attr, action);
> + err = parse_sample_action(&flower, action, nla, tnl);
> if (err) {
> goto out;
> }
> - group_id = gid_alloc_ctx(&sflow_attr);
> - action->sample.action_group_id = group_id;
> - flower.action_count++;
> } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
> struct dpif_sflow_attr sflow_attr;
>
> In addition the main objection is that you ignore the other actions,
> which I think can be solved with the following diff (your called your
> function odd, this is why you probably did not notice it):
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 84991ff6a..7277b699f 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1704,21 +1704,31 @@ parse_userspace_userdata(const struct nlattr
> *actions,
> }
>
> static int
> -parse_action_userspace(const struct nlattr *actions,
> - struct dpif_sflow_attr *sflow_attr)
> +parse_sample_actions(const struct nlattr *actions,
> + struct dpif_sflow_attr *sflow_attr)
> {
> const struct nlattr *nla;
> unsigned int left;
> + int err = EINVAL;
>
> NL_NESTED_FOR_EACH_UNSAFE (nla, left, actions) {
> if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
> - return parse_userspace_userdata(nla, sflow_attr);
> + err = parse_userspace_userdata(nla, sflow_attr);
> + } else {
> + /* We can't offload other nested actions */
> + VLOG_DBG_RL(&error_rl,
> + "%s: other than OVS_ACTION_ATTR_USERSPACE
> attribute",
> + __func__);
> + return EINVAL;
> }
> }
I think we should not log in this function. We may be overwhelmed for
various usespace actions.
>
> - VLOG_ERR_RL(&error_rl, "%s: no OVS_ACTION_ATTR_USERSPACE attribute",
> - __func__);
> - return EINVAL;
> + if (err) {
> + //THIS IS NOT AN ERROR CASE, FROM A DATAPATH IT'S VALID TO DO
> ANY ACTION
But sample action expects a userspace attribute.
> + VLOG_DBG_RL(&error_rl, "%s: no OVS_ACTION_ATTR_USERSPACE attribute",
> + __func__);
> + }
> + return err;
> }
>
> static int
> @@ -1740,7 +1750,7 @@ parse_sample_action(struct tc_flower *flower,
> struct tc_action *tc_action,
>
> NL_NESTED_FOR_EACH_UNSAFE (nla, left, sample_action) {
> if (nl_attr_type(nla) == OVS_SAMPLE_ATTR_ACTIONS) {
> - ret = parse_action_userspace(nla, &sflow_attr);
> + ret = parse_sample_actions(nla, &sflow_attr);
> } else if (nl_attr_type(nla) == OVS_SAMPLE_ATTR_PROBABILITY) {
> tc_action->type = TC_ACT_SAMPLE;
> tc_action->sample.action_rate = UINT32_MAX /
> nl_attr_get_u32(nla);
>
>
>> Please see my above reply. There are two situations. I have comment
>> in the code:
>>
>> 2160 } else if (nl_attr_type(nla) ==
>> OVS_ACTION_ATTR_USERSPACE) {
>> 2161 struct dpif_sflow_attr flow_attr;
>> 2162
>> 2163 /* If there is a sFlow cookie inside of a userspace
>> attribute,
>> 2164 * but no sample attribute, that means sampling rate
>> is 1.
>> 2165 */
>
> Ack, maybe also wrap this in a single function… Something like (not
> tested):
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 7277b699f..bf3cd3ada 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1680,14 +1680,15 @@ flower_match_to_tun_opt(struct tc_flower
> *flower, const struct flow_tnl *tnl,
> flower->mask.tunnel.metadata.present.len =
> tnl->metadata.present.len;
> }
>
> +
> static int
> -parse_userspace_userdata(const struct nlattr *actions,
> - struct dpif_sflow_attr *sflow_attr)
> +__parse_userspace_attributes(const struct nlattr *userspace_action,
> + struct dpif_sflow_attr *sflow_attr)
> {
> const struct nlattr *nla;
> unsigned int left;
>
> - NL_NESTED_FOR_EACH_UNSAFE (nla, left, actions) {
> + NL_NESTED_FOR_EACH_UNSAFE (nla, left, userspace_action) {
> if (nl_attr_type(nla) == OVS_USERSPACE_ATTR_USERDATA) {
> struct user_action_cookie *cookie;
>
> @@ -1695,14 +1696,50 @@ parse_userspace_userdata(const struct nlattr
> *actions,
> if (cookie->type == USER_ACTION_COOKIE_SFLOW) {
> sflow_attr->userdata = CONST_CAST(void *,
> nl_attr_get(nla));
> sflow_attr->userdata_len = nl_attr_get_size(nla);
> + // ^ Why do we store this user data? It's part of the
> data we
> + // store in sflow_attr.sflow.
> return 0;
> }
> + // What about the other attributes, don't we need them?
> Or do
> + // we process them in the slow path by sflow_attr.sflow?
> + // As I mentioned earlier, I did not review the full code.
> }
> }
> -
> return EINVAL;
> }
>
> +static int
> +parse_userspace_attributes(struct tc_flower *flower, struct tc_action
> *action,
> + const struct nlattr *userspace_action,
> + const struct flow_tnl *tnl)
> +{
> + struct dpif_sflow_attr sflow_attr;
> + int err;
> +
> + /* We only support offloading the userspace action for sFlow.
> + * This is when the sFlow cookie exists inside of a the userspace
> + * attribute. This is true when the sample rate for sFlow is 1.
> + */
> + memset(&sflow_attr, 0, sizeof sflow_attr);
> + if (flower->tunnel) {
> + sflow_attr.tunnel = CONST_CAST(struct flow_tnl *, tnl);
> + }
> +
> + err = __parse_userspace_attributes(userspace_action, &sflow_attr);
> + if (err)
> + return err;
> +
> + sflow_attr.sflow = userspace_action;
> + //Did you notice that you put the userspace_action here?
> + //In the other case you put the sample_action here...
> + sflow_attr.sflow_len = userspace_action->nla_len;
> + action->type = TC_ACT_SAMPLE;
> + action->sample.action_group_id = gid_alloc_ctx(&sflow_attr);
> + action->sample.action_rate = 1;
> + flower->action_count++;
> + return 0;
> +}
> +
> static int
> parse_sample_actions(const struct nlattr *actions,
> struct dpif_sflow_attr *sflow_attr)
> @@ -1713,7 +1750,7 @@ parse_sample_actions(const struct nlattr *actions,
>
> NL_NESTED_FOR_EACH_UNSAFE (nla, left, actions) {
> if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
> - err = parse_userspace_userdata(nla, sflow_attr);
> + err = __parse_userspace_attributes(nla, sflow_attr);
> } else {
> /* We can't offload other nested actions */
> VLOG_DBG_RL(&error_rl,
> @@ -2166,27 +2203,11 @@ netdev_tc_flow_put(struct netdev *netdev,
> struct match *match,
> goto out;
> }
> } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
> - struct dpif_sflow_attr sflow_attr;
> -
> - /* We only support offloading the userspace action for
> sFlow.
> - * This is when the sFlow cookie exists inside of a the
> userspace
> - * attribute. This is true when the sample rate for sFlow
> is 1.
> - */
> - memset(&sflow_attr, 0, sizeof sflow_attr);
> - if (flower.tunnel) {
> - sflow_attr.tunnel = CONST_CAST(struct flow_tnl *, tnl);
> - }
> - err = parse_userspace_userdata(nla, &sflow_attr);
> +
> + err = parse_userspace_userdata(&flower, action, nla, tnl);
> if (err) {
> goto out;
> }
> - sflow_attr.sflow = nla;
> - sflow_attr.sflow_len = nla->nla_len;
> - group_id = gid_alloc_ctx(&sflow_attr);
> - action->type = TC_ACT_SAMPLE;
> - action->sample.action_group_id = group_id;
> - action->sample.action_rate = 1;
> - flower.action_count++;
> } else {
> VLOG_DBG_RL(&rl, "unsupported put action type: %d",
> nl_attr_type(nla));
>
>>>
>>>
>>>> + } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_USERSPACE) {
>>>
>>> The OVS_ACTION_ATTR_USERSPACE should not be handled here as a
>>> separate action, at least it should not end up as a TC_ACT_SAMPLE.
>>> What if someone else is having an OVS_ACTION_ATTR_USERPSACE in its
>>> chain, now it will be translated into a SAMPLE action.
>>>
>>> The parse_sample_action() function called above should loop trough
>>> the OVS_SAMPLE_ATTR_ACTIONS and find the OVS_ACTION_ATTR_USERSPACE,
>>> which it looks like it does. So this “if (nl_attr_type(nla) ==
>>> OVS_ACTION_ATTR_USERSPACE)” part should not be needed at all.
>> Please see above reply.
>
> Ack see above
>>>
>>>> + struct dpif_sflow_attr sflow_attr;
>>>> +
>>>> + /* If there is a sFlow cookie inside of a userspace
>>>> attribute,
>>>> + * but no sample attribute, that means sampling rate
>>>> is 1.
>>>> + */
>>>> + memset(&sflow_attr, 0, sizeof sflow_attr);
>>>> + if (flower.tunnel) {
>>>> + sflow_attr.tunnel = CONST_CAST(struct flow_tnl *,
>>>> tnl);
>>>> + }
>>>> + err = parse_userspace_userdata(nla, &sflow_attr);
>>>> + if (err) {
>>>> + goto out;
>>>> + }
>>>> + sflow_attr.sflow = nla;
>>>> + sflow_attr.sflow_len = nla->nla_len;
>>>> + group_id = gid_alloc_ctx(&sflow_attr);
>>>> + action->type = TC_ACT_SAMPLE;
>>>> + action->sample.action_group_id = group_id;
>>>> + action->sample.action_rate = 1;
>>>> + flower.action_count++;
>>>> } else {
>>>> VLOG_DBG_RL(&rl, "unsupported put action type: %d",
>>>> nl_attr_type(nla));
>>>> - return EOPNOTSUPP;
>>>> + err = EOPNOTSUPP;
>>>> + goto out;
>>>> }
>>>> }
>>>>
>>>
>>> <SNIP>
>>>
>>> One other general command, how would we guarantee that the group ID
>>> value used by tc-sample is unique and not being used by any other
>>> application than OVS?
>> Please see the comment:
>>
>> 239 /* Allocate a unique group id for the given set of flow metadata.
>> 240 * The ID space is 2^^32, so there should never be a situation
>> in which all
>> 241 * the IDs are used up. We loop until we find a free one. */
>> 242 static struct gid_node *
>> 243 gid_alloc__(const struct dpif_sflow_attr *sflow, uint32_t hash)
>>
>> This id is similar to recirc id.
>>> Do we assume tc-sample is elusively assigned to OVS?
>> Yes, I think so.
>>> This might not be the case in customer environments, so we might
>>> need to be able to disable offloading this action.
>> If we can offload, why not disable it. If we can't offload, tc will
>> return EOPNOTSUPP and fail back to ovs datapath.
>> So I don't think we need to be able to disable it.
>
> Ack all three above, but I was referring to someone that uses sflow
> kmod on his management interface already, and now all of a sudden OVS
> will use it too. Which could cause a problem as the u32 space is
> started between all user space application. So I do think we need an
> option to disable it in OVS.
Sorry, what do you mean by sflow kmod?
>
>
>>>
>>> One final question, do you know if this will be supported in your
>>> NICs as hardware offload?
>> Sure. We submitted the first version of the OVS patch set in
>> September. We support offload before that.
>> I believe the driver change will be in linux net-next branch soon.
>
> Cool, looking forward to seeing the change…
>
>
> If you do send out a new revision of this patchset, I’ll try to make
> some time and do a full review and some testing.
>
V8 will be sent for the full review.
Thanks,
Chris
More information about the dev
mailing list