[ovs-dev] [PATCH] datapath: add ipv6 'set' action

Wed Nov 7 20:45:23 UTC 2012

On Tue, Oct 30, 2012 at 1:11 AM, Jesse Gross <jesse at nicira.com> wrote:
> On Sun, Oct 28, 2012 at 2:21 PM, Ansis Atteka <aatteka at nicira.com> wrote:
>> This patch adds ipv6 set action functionality. It allows to change
>> traffic class, flow label, hop-limit, ipv6 source and destination
>> address fields.
>>
>> Signed-off-by: Ansis Atteka <aatteka at nicira.com>
>
> Can you add an item to NEWS?
>
> I got a few sparse errors when I applied this:
>
>   CHECK   /home/jgross/openvswitch/datapath/linux/actions.c
> /home/jgross/openvswitch/datapath/linux/actions.c:270:12: warning:
> incorrect type in assignment (different base types)
> /home/jgross/openvswitch/datapath/linux/actions.c:270:12:    expected
> restricted __be32 [usertype] fl
> /home/jgross/openvswitch/datapath/linux/actions.c:270:12:    got int
> /home/jgross/openvswitch/datapath/linux/actions.c:273:41: warning:
> incorrect type in argument 2 (different base types)
> /home/jgross/openvswitch/datapath/linux/actions.c:273:41:    expected
> unsigned int [unsigned] [usertype] fl
> /home/jgross/openvswitch/datapath/linux/actions.c:273:41:    got
> restricted __be32 const [usertype] ipv6_label
>
> These look like real problems to me on certain architectures because
> the constants that you are using for bit manipulations will have
> different results depending on endianness.
>
> I think we also need to do some validation of these actions in
> datapath.c similar to what we do currently for IPv4 set.  I think
> right now these actions just get rejected immediately because they
> aren't expected by the validation code.
>
> We could implement the equivalent actions in the userspace datapath as
> well.  There aren't any complex dependencies here so it would be ideal
> to maintain parity.

I am thinking, where in the user-space datapath I should put that code
that parses routing headers. Two options come into my mind:

1. In the parse_ipv6() function, when packets are parsed for the the
first time. This would mean that I would need store this routing
header flag somewhere in struct *flow. And then pass *flow all the way
to my packet_set_ipv6() function. Would it be appropriate to store it
there?
2. Visit routing headers one more time from my packet_set_ipv6()
function. The ugly side of putting this code here, would mean that
ipv6 headers will need to be parsed twice and there will be another
function very similar to parse_ipv6(). Except this one will have to
use ugly pointer arithmetic, because ofpbuf *packet buffer was already
consumed.

I think solution #1 would be better? What do you think Jesse?
>
>> diff --git a/datapath/actions.c b/datapath/actions.c
>> index ec9b595..7e60b0f 100644
>> --- a/datapath/actions.c
>> +++ b/datapath/actions.c
>> +static void set_ipv6_addr(struct sk_buff *skb, struct ipv6hdr *nh,
>> +                         __be32 addr[4], const __be32 new_addr[4])
>> +{
>> +       int transport_len = skb->len - skb_transport_offset(skb);
>> +
>> +       if (nh->nexthdr == IPPROTO_TCP) {
>> +               if (likely(transport_len >= sizeof(struct tcphdr)))
>> +                       inet_proto_csum_replace16(&tcp_hdr(skb)->check, skb,
>> +                                                 addr, new_addr, 1);
>> +       } else if (nh->nexthdr == IPPROTO_UDP) {
>> +               if (likely(transport_len >= sizeof(struct udphdr))) {
>> +                       struct udphdr *uh = udp_hdr(skb);
>> +
>> +                       if (uh->check ||
>> +                           get_ip_summed(skb) == OVS_CSUM_PARTIAL) {
>> +                               inet_proto_csum_replace16(&uh->check, skb,
>> +                                                         addr, new_addr, 1);
>> +                               if (!uh->check)
>> +                                       uh->check = CSUM_MANGLED_0;
>> +                       }
>> +               }
>> +       }
>
> IPv6 has an extra little twist when it comes to checksums compared to
> IPv4: routing headers.  In an IPv6 packet, the destination IP address
> used for the purposes of checksum calculation in the pseudo header is
> the final destination address.  This means that if you have a routing
> header then it is that address that matters and not one in the IP
> header.  This is likely to be a pretty rare case but we should handle
> it.
>
> Also, due to the presence of extension headers in general, nh->nexthdr
> might not point to the L4 protocol.  We should use the extracted L4
> protocol in the flow instead.
>
>> +static void set_ipv6_tc(struct ipv6hdr *nh, __u8 tc)
>
> In general, I would prefer the versions of the host byte order types
> without underscores (i.e. u8, u32, etc.) for components entirely
> within the kernel.
>
>> +{
>> +       nh->priority = (nh->priority & 0xf0) | (tc >> 4);
>
> nh->priority is only 4 bits, so you shouldn't have to do the mask.
>
>> +       nh->flow_lbl[0] = (nh->flow_lbl[0] & 0x0f) | ((tc & 0x0f) << 4);
>> +}
>> +
>> +static void set_ipv6_fl(struct ipv6hdr *nh, __u32 fl)
>> +{
>> +       nh->flow_lbl[0] = (nh->flow_lbl[0] & 0xf0) | ((fl & 0x0f0000) >> 16);
>> +       nh->flow_lbl[1] = (fl & 0x00ff00) >> 8;
>> +       nh->flow_lbl[2] = fl & 0x0000ff;
>
> I would make all of these hex constants the full 4 bytes.  Whenever I
> see only 3 being used with a 4 byte value I think that I've
> miscounted.
>
>> +static int set_ipv6(struct sk_buff *skb, const struct ovs_key_ipv6 *ipv6_key)
>> +{
>> +       struct ipv6hdr *nh;
>> +       int err;
>> +       __u8 tc;
>> +       __be32 fl;
>
> Based on your usage below, I think that fl is supposed to be in host byte order.
>
>> +       __be32 *saddr;
>> +       __be32 *daddr;
>> +
>> +       err = make_writable(skb, skb_network_offset(skb) +
>> +                                sizeof(struct ipv6hdr));
>> +       if (unlikely(err))
>> +               return err;
>> +
>> +       nh = ipv6_hdr(skb);
>> +       saddr = (__be32 *)&nh->saddr;
>> +       daddr = (__be32 *)&nh->daddr;
>> +
>> +       if (memcmp(ipv6_key->ipv6_src, saddr, sizeof(ipv6_key->ipv6_src)))
>> +               set_ipv6_addr(skb, nh, saddr, ipv6_key->ipv6_src);
>> +
>> +       if (memcmp(ipv6_key->ipv6_dst, daddr, sizeof(ipv6_key->ipv6_dst)))
>> +               set_ipv6_addr(skb, nh, daddr, ipv6_key->ipv6_dst);
>> +
>> +       tc = (nh->priority << 4) | (nh->flow_lbl[0] >> 4);
>> +       if (ipv6_key->ipv6_tclass != tc)
>> +               set_ipv6_tc(nh, ipv6_key->ipv6_tclass);
>> +
>> +       fl = (nh->flow_lbl[0] & 0x0f) << 16 | nh->flow_lbl[1] << 8 |
>> +             nh->flow_lbl[2];
>> +       if (ipv6_key->ipv6_label != fl)
>> +               set_ipv6_fl(nh, ipv6_key->ipv6_label);
>
> There's a mismatch here between the byte order fl and ipv6_key->ipv6_label.
>
>> +       if (ipv6_key->ipv6_hlimit != nh->hop_limit)
>> +               nh->hop_limit = ipv6_key->ipv6_hlimit;
>
> I think you can drop the checks for traffic class, flow label, and hop
> limit and just do the assignments directly.  The cost to do the check
> is basically the same as the actual operation so the check doesn't
> really save us anything even in the common case.
>
>> diff --git a/datapath/compat.h b/datapath/compat.h
>> index 3113b96..fb660d2 100644
>> --- a/datapath/compat.h
>> +++ b/datapath/compat.h
>
> We usually put functions that are backported directly from Linux in
> the corresponding header file in the compat directory.  Normally that
> would be net/checksum.h in this case but this function depends on the
> OVS checksum constants, so it should go in datapath/checksum.h.  It
> should also use the OVS constants like the example of
> inet_proto_csum_replace4() in the same file.
>
>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(3,6,0)
>> +static inline void inet_proto_csum_replace16(__sum16 *sum,
>
> I think this function is coming in 3.7, not 3.6.