[ovs-dev] [PATCH v2 3/3] datapath: add layer 3 flow/port support

Jesse Gross jesse at nicira.com
Fri May 9 21:48:47 UTC 2014

On Fri, Apr 25, 2014 at 2:55 AM, Lori Jakab <lojakab at cisco.com> wrote:
> On 4/25/14, 5:19 AM, Jesse Gross wrote:
>>>>>>>>> +noethernet:
>>>>>>>>>             if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE,
>>>>>>>>> output->eth.type))
>>>>>>>>>                     goto nla_put_failure;
>>>>>>>> Does it still make sense to send the EtherType? It's not present in
>>>>>>>> the packet and I believe that the information is contained in the
>>>>>>>> attributes that we do send (i.e. IPv4 or v6 attributes).
>>>>>>> We had a discussion about this in August last year:
>>>>>>>>>>>>> One such decision is how to handle the flow key.  I set all
>>>>>>>>>>>>> fields
>>>>>>>>>>>>> in
>>>>>>>>>>>>> key->eth to 0, except the type, because we still need to know
>>>>>>>>>>>>> what
>>>>>>>>>>>>> kind
>>>>>>>>>>>>> of L3 packet do we have.  Since a lot of code is accessing
>>>>>>>>>>>>> key->eth.type, this is easier than having this information in a
>>>>>>>>>>>>> different place, although it would be more elegant to set this
>>>>>>>>>>>>> field
>>>>>>>>>>>>> to
>>>>>>>>>>>>> 0 as well.
>>>>>>>>>>>> Can we use skb->protocol for the cases where we currently access
>>>>>>>>>>>> the
>>>>>>>>>>>> EtherType? Are there cases that you are particularly concerned
>>>>>>>>>>>> about?
>>>>>>>>>>> The only case I'm concerned about is in the action validation
>>>>>>>>>>> code,
>>>>>>>>>>> particularly the validate_set() and validate_tp_port() functions,
>>>>>>>>>>> since
>>>>>>>>>>> they
>>>>>>>>>>> don't have access to the skb, and need to know the layer 3
>>>>>>>>>>> protocol
>>>>>>>>>>> of
>>>>>>>>>>> the
>>>>>>>>>>> flow.  In validate_set() we could relax the check for eth.type
>>>>>>>>>>> for
>>>>>>>>>>> OVS_KEY_ATTR_IPV4 and OVS_KEY_ATTR_IPV6, but I'm not sure what to
>>>>>>>>>>> do
>>>>>>>>>>> about
>>>>>>>>>>> validate_tp_port().
>>>>>>>>>>> In general, I think it would be a good idea for the flow key to
>>>>>>>>>>> have
>>>>>>>>>>> a
>>>>>>>>>>> field
>>>>>>>>>>> specifying the layer 3 protocol, when an ethernet header is not
>>>>>>>>>>> present.
>>>>>>>>>> That makes sense to me.
>>>>>>>>> You mean that we keep eth.type as the L3 protocol field, or define
>>>>>>>>> a
>>>>>>>>> new
>>>>>>>>> one, separate from the eth union?
>>>>>>>> I think it's fine to keep using eth.type as the protocol field. I
>>>>>>>> think we can probably some holes to move the is_layer3 flag into the
>>>>>>>> non-tunnel portion of the flow though.
>>>>>>> Should we revisit this discussion?
>>>>>> I was just referring to the Netlink encoding here. Can we populate the
>>>>>> flow key in the kernel when we translate the flow?
>>>>> Not sure I understand the question.
>>>>> Going through the code, I see that omitting OVS_KEY_ATTR_ETHERTYPE
>>>>> currently
>>>>> means an 802.2 packet, if the mask is set to 0xffff.  Are you
>>>>> suggesting
>>>>> to
>>>>> omit OVS_KEY_ATTR_ETHERTYPE for layer 3 packets both in the flow key
>>>>> and
>>>>> the
>>>>> mask?
>>>> All I was trying to say is that is that the Netlink and struct
>>>> sw_flow_key representations don't necessarily have to be exactly the
>>>> same as long as we can translate back and forth. I'm not sure that the
>>>> previous discussion applies - this is more about Netlink and that
>>>> seemed to apply to struct sw_flow_key.
>>> Ok, understand now.  But validation would become even more complicated if
>>> we
>>> didn't send the OVS_KEY_ATTR_ETHERTYPE over Netlink for layer 3 packets.
>> I think we would basically have to scan for L3 attributes and use that
>> to fill in the EtherType, right? Is there anything else?
>> It would be nice to fully break the link to Ethernet and it seems like
>> a half measure if we keep EtherType in. There have been various
>> requests for non-Ethernet protocols in the past so it seems like it
>> could get messy if we assume some elements of Ethernet.
> I had a discussion about this with Thomas Morin, who proposed the
> IP-over-GRE patches and MPLS-over-GRE patches based on this work, and we
> think it would still be useful to keep the EtherType, even in the Netlink
> attributes, to be able to support generic packet types in the future.
> Ben was involved in EXT-112 at the ONF, which has the objective to "Support
> non-Ethernet packets throughout the pipeline".  The current version of the
> proposal, recently moved from incubation to prototyping, defines a new
> "packet type" match field, which is used to determine the type of the first
> header.  There are 5 namespaces that can be used to define the packet type,
> one of which is EtherType.  With this match type, the canonical
> representation for an IPv4 packet would be (namespace=1,ns_type=0x800),
> namespace 1 being the EtherType.

Actually, that's pretty much what I'm trying to avoid :) I don't
really like that we have to introduce a new namespace for EtherType
and then yet another new namespace for the type of the type.

> Because of the potential future implementation of EXT-112 in OVS I think
> keeping the EtherType around would be beneficial.

I've been thinking about this (and also talked to Ben) and I'm still
not convinced that keeping the EtherType explicitly in the netlink
protocol really adds much of value or would really impede the
implementation of EXT-112. I'm really specifically talking about the
netlink protocol - I think that's fine if the internal implementation
of userspace or kernel wants to keep it around for efficiency or
simplicity but I don't think that it should leak out beyond the

I think for the two main reasons for including an explicit type (in
either OpenFlow or netlink) are to be able to enforce prerequisites
and being able to determine whether a field should be wildcarded or
doesn't apply. It's not actually needed to describe the protocol
because in both cases TLVs are used that provide actual typing.

One significant difference between OpenFlow and the OVS netlink
protocol is that netlink is designed to always be reactive whereas
OpenFlow is not necessarily. This means that when communicating with
the kernel, the packet is always fully described as it came in through
the key and the wildcarded fields are separate. This completely
eliminates the issue with wildcard ambiguity, which I think is the
more significant one. It does slightly reduce the ability to detect
errors by loosening prerequisite checking but I think that is also
less significant in the reactive model.

Given that, I still think that we can consider a packet to be "rooted"
in the type of the TLV for the lowest layer that appears. If there is
more than one, we can just reject that as invalid.

>>>> I'm starting to think that the path we went down with the megaflow
>>>> Netlink encoding is not particularly sustainable because it means that
>>>> while the keys can do something fairly reasonable, the masks must
>>>> always list all the protocols that it is not, such as Ethernet and
>>>> EtherType in this case.
>>>> I wonder if it actually makes more sense to switch over to something
>>>> similar to what Jarno is planning for the actions - a key and mask
>>>> paired together and then we can also fix some left over oddities from
>>>> the exact match days.
>>> I wonder if we could refine this in later patches?
>> My hope is that we can at least get some other opinions on this first.
> Sure, looking forward to that!

OK, I'm retracting this idea - based on the above analysis, I think
the current way we are doing things is fine and maybe actually
preferable for this use case.

More information about the dev mailing list