[ovs-dev] [PATCH v2 3/3] datapath: add layer 3 flow/port support
Lori Jakab
lojakab at cisco.com
Wed Apr 16 19:18:11 UTC 2014
On 4/16/14, 2:57 AM, Jesse Gross wrote:
> On Fri, Apr 11, 2014 at 4:30 AM, Lori Jakab <lojakab at cisco.com> wrote:
>> On 4/11/14, 1:47 AM, Jesse Gross wrote:
>>> On Thu, Mar 20, 2014 at 4:37 AM, Lori Jakab <lojakab at cisco.com> wrote:
>>>> On 1/6/14, 7:55 PM, Jesse Gross wrote:
>>>>> On Tue, Dec 24, 2013 at 9:02 AM, Lorand Jakab<lojakab at cisco.com> wrote:
>>> More importantly though, what happens if we have a pop followed by a
>>> set action? I'm not sure that this will catch that.
>>
>> Right. Fixed with the following:
> Is there any case for pop, then push, then set? It's obviously not
> very optimal but it doesn't seem inherently illegal.
I agree. I don't see why anyone would use it, but that's no good reason
to reject it.
> On one hand, it
> would be nice to just modify the flow key as we go but on the other
> hand it would probably be expensive to preserve everything. It would
> be nice to have that behavior though.
How about the following instead?
--- a/datapath/flow_netlink.c
+++ b/datapath/flow_netlink.c
@@ -1283,7 +1283,8 @@ static int validate_and_copy_set_tun(const struct
nlattr *attr,
static int validate_set(const struct nlattr *a,
const struct sw_flow_key *flow_key,
struct sw_flow_actions **sfa,
- bool *set_tun)
+ bool *set_tun,
+ bool noeth)
{
const struct nlattr *ovs_key = nla_data(a);
int key_type = nla_type(ovs_key);
@@ -1304,7 +1305,11 @@ static int validate_set(const struct nlattr *a,
case OVS_KEY_ATTR_PRIORITY:
case OVS_KEY_ATTR_SKB_MARK:
+ break;
+
case OVS_KEY_ATTR_ETHERNET:
+ if (noeth)
+ return -EINVAL;
break;
case OVS_KEY_ATTR_TUNNEL:
@@ -1416,6 +1421,7 @@ int ovs_nla_copy_actions(const struct nlattr *attr,
{
const struct nlattr *a;
int rem, err;
+ bool noeth = key->phy.noeth;
if (depth >= SAMPLE_ACTION_DEPTH)
return -EOVERFLOW;
@@ -1459,15 +1465,21 @@ int ovs_nla_copy_actions(const struct nlattr *attr,
case OVS_ACTION_ATTR_POP_ETH:
+ if (noeth)
+ return -EINVAL;
+ noeth = true;
break;
case OVS_ACTION_ATTR_PUSH_ETH:
+ noeth = false;
break;
case OVS_ACTION_ATTR_POP_VLAN:
break;
case OVS_ACTION_ATTR_PUSH_VLAN:
+ if (noeth)
+ return -EINVAL;
vlan = nla_data(a);
if (vlan->vlan_tpid != htons(ETH_P_8021Q))
return -EINVAL;
@@ -1476,7 +1488,7 @@ int ovs_nla_copy_actions(const struct nlattr *attr,
break;
case OVS_ACTION_ATTR_SET:
- err = validate_set(a, key, sfa, &skip_copy);
+ err = validate_set(a, key, sfa, &skip_copy, noeth);
if (err)
return err;
break;
>
>>>>>> +noethernet:
>>>>>> if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE,
>>>>>> output->eth.type))
>>>>>> goto nla_put_failure;
>>>>> Does it still make sense to send the EtherType? It's not present in
>>>>> the packet and I believe that the information is contained in the
>>>>> attributes that we do send (i.e. IPv4 or v6 attributes).
>>>>
>>>> We had a discussion about this in August last year:
>>>>
>>>>>>>>>> One such decision is how to handle the flow key. I set all fields
>>>>>>>>>> in
>>>>>>>>>> key->eth to 0, except the type, because we still need to know what
>>>>>>>>>> kind
>>>>>>>>>> of L3 packet do we have. Since a lot of code is accessing
>>>>>>>>>> key->eth.type, this is easier than having this information in a
>>>>>>>>>> different place, although it would be more elegant to set this
>>>>>>>>>> field
>>>>>>>>>> to
>>>>>>>>>> 0 as well.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Can we use skb->protocol for the cases where we currently access the
>>>>>>>>> EtherType? Are there cases that you are particularly concerned
>>>>>>>>> about?
>>>>>>>>
>>>>>>>>
>>>>>>>> The only case I'm concerned about is in the action validation code,
>>>>>>>> particularly the validate_set() and validate_tp_port() functions,
>>>>>>>> since
>>>>>>>> they
>>>>>>>> don't have access to the skb, and need to know the layer 3 protocol
>>>>>>>> of
>>>>>>>> the
>>>>>>>> flow. In validate_set() we could relax the check for eth.type for
>>>>>>>> OVS_KEY_ATTR_IPV4 and OVS_KEY_ATTR_IPV6, but I'm not sure what to do
>>>>>>>> about
>>>>>>>> validate_tp_port().
>>>>>>>>
>>>>>>>> In general, I think it would be a good idea for the flow key to have
>>>>>>>> a
>>>>>>>> field
>>>>>>>> specifying the layer 3 protocol, when an ethernet header is not
>>>>>>>> present.
>>>>>>>
>>>>>>> That makes sense to me.
>>>>>>
>>>>>> You mean that we keep eth.type as the L3 protocol field, or define a
>>>>>> new
>>>>>> one, separate from the eth union?
>>>>> I think it's fine to keep using eth.type as the protocol field. I
>>>>> think we can probably some holes to move the is_layer3 flag into the
>>>>> non-tunnel portion of the flow though.
>>>>
>>>> Should we revisit this discussion?
>>> I was just referring to the Netlink encoding here. Can we populate the
>>> flow key in the kernel when we translate the flow?
>>
>> Not sure I understand the question.
>>
>> Going through the code, I see that omitting OVS_KEY_ATTR_ETHERTYPE currently
>> means an 802.2 packet, if the mask is set to 0xffff. Are you suggesting to
>> omit OVS_KEY_ATTR_ETHERTYPE for layer 3 packets both in the flow key and the
>> mask?
> All I was trying to say is that is that the Netlink and struct
> sw_flow_key representations don't necessarily have to be exactly the
> same as long as we can translate back and forth. I'm not sure that the
> previous discussion applies - this is more about Netlink and that
> seemed to apply to struct sw_flow_key.
Ok, understand now. But validation would become even more complicated
if we didn't send the OVS_KEY_ATTR_ETHERTYPE over Netlink for layer 3
packets.
>
> I'm starting to think that the path we went down with the megaflow
> Netlink encoding is not particularly sustainable because it means that
> while the keys can do something fairly reasonable, the masks must
> always list all the protocols that it is not, such as Ethernet and
> EtherType in this case.
>
> I wonder if it actually makes more sense to switch over to something
> similar to what Jarno is planning for the actions - a key and mask
> paired together and then we can also fix some left over oddities from
> the exact match days.
I wonder if we could refine this in later patches?
>
> Jarno, Pravin, any thoughts?
More information about the dev
mailing list