[ovs-dev] [PATCH v4 4/4] datapath: Allow masks for set actions.

Jesse Gross jesse at nicira.com
Tue Aug 12 00:39:08 UTC 2014


On Mon, Aug 11, 2014 at 9:15 AM, Jarno Rajahalme <jrajahalme at nicira.com> wrote:
> Masked set action allows more megaflow wildcarding.  Masked set action
> is now supported for all writeable key types, except for the tunnel
> key.
>
> The set tunnel action is an exception as any input tunnel info is
> cleared before action processing starts, so there is no tunnel info to
> mask.
>
> The kernel module converts all (non-tunnel) set actions to masked set
> actions.  This makes action processing more uniform, and results in
> less branching and duplicating the action processing code.
>
> Signed-off-by: Jarno Rajahalme <jrajahalme at nicira.com>

This triggers some sparse warnings that would be nice to avoid:

  CHECK   /home/jesse/openvswitch/datapath/linux/actions.c
/home/jesse/openvswitch/datapath/linux/actions.c:498:44: warning:
incorrect type in return expression (different base types)
/home/jesse/openvswitch/datapath/linux/actions.c:498:44:    expected bool
/home/jesse/openvswitch/datapath/linux/actions.c:498:44:    got
restricted __be32

> diff --git a/datapath/actions.c b/datapath/actions.c
> index 25c5d77..d57d779 100644
> --- a/datapath/actions.c
> +++ b/datapath/actions.c
> @@ -432,35 +462,47 @@ static int set_ipv4(struct sk_buff *skb, const struct ovs_key_ipv4 *ipv4_key)
>
>         nh = ip_hdr(skb);
>
> -       if (ipv4_key->ipv4_src != nh->saddr) {
> -               set_ip_addr(skb, nh, &nh->saddr, ipv4_key->ipv4_src);
> -               flow_key_set_ipv4_src(skb, ipv4_key->ipv4_src);
> -       }
> +       /* Setting an IP addresses is typically only a side effect of
> +        * matching on them in the current userspace implementation, so it
> +        * makes sense to check if the value actually changed. */
> +       if (mask->ipv4_src) {
> +               new_addr = MASKED(nh->saddr, key->ipv4_src, mask->ipv4_src);
>
> -       if (ipv4_key->ipv4_dst != nh->daddr) {

Here we are checking first that the mask is non-zero and then later
that the values are different. In other places we do only the second
step. What is the rationale?

> +/* Mask is at the midpoint of the data. */
> +#define get_mask(a, type) \
> +       ((const type *)((const char *)nla_data(a) + nla_len(a)))

This doesn't seem right to me. From a netlink perspective, the
attribute length covers the whole thing, both the value and mask. So
won't this always find the end of the attribute, not the middle?

> diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
> index e525c9d..937f4d6 100644
> --- a/datapath/flow_netlink.c
> +++ b/datapath/flow_netlink.c
> @@ -1543,42 +1563,62 @@ static int validate_set(const struct nlattr *a,
>                 break;
>
>         case OVS_KEY_ATTR_TUNNEL:
> -               *set_tun = true;
> +               if (masked)
> +                       return -EINVAL; /* Masked tunnel set not supported. */
> +               *skip_copy = true;
>                 err = validate_and_copy_set_tun(a, sfa);
>                 if (err)
>                         return err;
> -               break;
> +               return 0;

Returning 0 here makes me a little nervous - if we ever add more
validation at the end of the function, it might be silently skipped.

>         case OVS_KEY_ATTR_IPV4:
[...]
> -               if (ipv4_key->ipv4_frag != flow_key->ip.frag)
> -                       return -EINVAL;
> +                       /* Non-writeable fields. */
> +                       if (mask->ipv4_proto || mask->ipv4_frag)
> +                               return -EINVAL;
> +               } else {
> +                       if (!flow_key->ip.proto)
> +                               return -EINVAL;

I believe that this check on ip.proto is being used to verify that the
IP header is actually present, so this would mean that we can write
off the end of the packet in the masked case.

>         case OVS_KEY_ATTR_IPV6:
[...]
> -               if (ipv6_key->ipv6_frag != flow_key->ip.frag)
> -                       return -EINVAL;
> +                       /* Non-writeable fields. */
> +                       if (mask->ipv6_proto || mask->ipv6_frag)
> +                               return -EINVAL;
> +
> +                       /* Invalid bits in the flow label mask? */
> +                       if (ntohl(mask->ipv6_label) & 0xFFF00000)
> +                               return -EINVAL;
> +               } else {
> +                       if (!flow_key->ip.proto)
> +                               return -EINVAL;

Same issue with header validation here.

> @@ -1605,12 +1649,43 @@ static int validate_set(const struct nlattr *a,
> +       /* Convert non-masked set actions to masked set actions.
> +        * Tunnel set action returns before getting here. */
> +       if (!masked) {
> +               int start, len = key_len * 2;
> +               struct nlattr *at;
> +
> +               *skip_copy = true;
> +
> +               start = add_nested_action_start(sfa,
> +                                               OVS_ACTION_ATTR_SET_MASKED);
> +               if (start < 0)
> +                       return start;
> +
> +               at = reserve_sfa_size(sfa, nla_attr_size(len));
> +               if (IS_ERR(at))
> +                       return PTR_ERR(at);
> +
> +               at->nla_type = key_type;
> +               at->nla_len = nla_attr_size(len);

I think this can be simplified by using add_action() or __add_action().

What about the reverse direction? I think this will return masked
actions to old userspace, which presumable will choke on them.

> diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
> index 9ea1f37..a8b318a 100644
> --- a/datapath/linux/compat/include/linux/openvswitch.h
> +++ b/datapath/linux/compat/include/linux/openvswitch.h
> @@ -590,6 +590,12 @@ struct ovs_action_hash {
>   * indicate the new packet contents. This could potentially still be
>   * %ETH_P_MPLS if the resulting MPLS label stack is not empty.  If there
>   * is no MPLS label stack, as determined by ethertype, no action is taken.
> + * @OVS_ACTION_ATTR_SET_MASKED: Replaces the contents of an existing header.  A
> + * nested %OVS_KEY_ATTR_* attribute specifies a header to modify, its value,
> + * and a mask.  For every bit set in the mask, the corresponding bit value
> + * is copied from the value to the packet header field, rest of the bits are
> + * left unchanged.  The non-masked value bits must be passed in as zeroes.
> + * Masking is not supported for the %OVS_KEY_ATTR_TUNNEL attribute.

It is probably a good idea to add the attribute and the comment in the
same patch. Also, I don't know how Pravin feels but in general it is a
little easier if the kernel code is in a single patch. Otherwise, this
will have to be transformed before upstreaming.



More information about the dev mailing list