[ovs-dev] [PATCH v4 3/3] datapath: add layer 3 flow/port support

Thu Aug 21 17:24:04 UTC 2014

On 8/6/14 4:37 AM, Jesse Gross wrote:
>> On 8/2/14, 4:11 AM, Jesse Gross wrote:
>>> On Fri, Aug 1, 2014 at 3:19 PM, Lori Jakab <lojakab at cisco.com> wrote:
>>>> On 8/1/14, 3:25 AM, Jesse Gross wrote:
>>>>> On Thu, Jul 31, 2014 at 5:45 AM, Lori Jakab <lojakab at cisco.com> wrote:
>>>>>> On 7/31/14, 10:09 AM, Jesse Gross wrote:
>>>>> Going further, what happens if packet comes in with multiple Ethernet
>>>>> headers on it (such as PBB)? This will pull off all of the headers but
>>>>> that doesn't seem right. If we were to only pull off one, what are the
>>>>> semantics with respect to VLANs?
>>>>
>>>> I have a feeling we may need to rename this action, since the my
>>>> intention
>>>> actually was to remove everything up to the layer 3 header.  That
>>>> includes
>>>> MPLS header(s) and any number of Ethernet headers.  The reason is, I need
>>>> an
>>>> action that guarantees that the resulting packet is layer 3, and can be
>>>> sent
>>>> through a layer 3 port.  I think you're right that the expectation from a
>>>> pop_eth() action would be what you describe above: if more then one
>>>> Ethernet
>>>> header is present, remove the outer header only, keep MPLS headers.
>>>>
>>>> How about renaming this action to extract_l3_packet() or similar?
>>> I'm not excited about this. Thomas Morin was already talking about
>>> using this code to do MPLS over GRE, which would need exactly the
>>> semantics of popping an Ethernet header. I would like to find a way to
>>> do this now, otherwise it is likely that we will have to solve the
>>> same problem with another new action very soon.
>>
>> Actually, that's what I'm proposing.  We do pop_eth() as you describe above,
>> which can be used for MPLS over GRE among other things, and do the
>> extract_l3_packet() action as well, for the cases when we need to send a
>> packet out on an L3 port, regardless of how many L2 or L2.5 headers we have.
> I'm not sure that I understand - your proposal is to have two separate actions?

Yes.

>
> Besides the fact that it would be nice to unify things, I'm not sure
> that it is actually correct to pull off VLAN, MPLS, etc. tags that we
> don't understand. Presumably, these are being used to create some kind
> of isolation and dropping all the tags that we don't understand would
> just blindly merge things together. Also, from an implementation
> perspective, we don't really parse beyond what we understand (this is
> not quite true for MPLS but it is for VLANs and any unknown protocols)
> so we wouldn't necessarily actually get to an L3 header. On the other
> hand, if we only deal with tags that we do understand then can't we
> have more atomic instructions that pull them off one-by-one?

Atomic instructions make a lot of sense, and we definitely should 
support them.  However, if a packet is sent to an L3 port, it should not 
have any Ethernet, VLAN or MPLS header. Note that for now these actions 
are not user visible, and cannot be added with OpenFlow, they are added 
automatically by the user space code, when a packet is sent from an 
L2->L3 port or L3->L2 port.

I agreed with Ben that this behavior will be changed in the future, 
based on discussions in the ONF's EXT-112 proposal, where the consensus 
was to have explicit push_eth and pop_eth OpenFlow actions.  However, 
the current behavior of the LISP code is to add/remove L2 headers 
automatically, so we want to keep that in the first phase, until I 
implement OpenFlow support.  Once automatic header popping is not used, 
we can remove the "extract_l3_packet()" action.

 From the implementation perspective, this action would rely on the 
skb->network_header being set correctly.

>
>>>>>>>> +static void push_eth(struct sk_buff *skb, const struct
>>>>>>>> ovs_action_push_eth *ethh)
>>>>>>>> +{
>>>>>>>> +       skb_push(skb, ETH_HLEN);
>>>>>>>> +       skb_reset_mac_header(skb);
>>>>>>>> +
>>>>>>>> +       ether_addr_copy(eth_hdr(skb)->h_source,
>>>>>>>> ethh->addresses.eth_src);
>>>>>>>> +       ether_addr_copy(eth_hdr(skb)->h_dest,
>>>>>>>> ethh->addresses.eth_dst);
>>>>>>>> +
>>>>>>>> +       eth_hdr(skb)->h_proto = ethh->eth_type;
>>>>>>>> +       skb->protocol = ethh->eth_type;
>>>>>>>> +
>>>>>>>> +       ovs_skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
>>>>>>>> +
>>>>>>>> +       OVS_CB(skb)->is_layer3 = false;
>>>>>>> What happens if there is something in skb->vlan_tci? I think the
>>>>>>> effect will be that it still on the outer Ethernet header, which
>>>>>>> doesn't seem correct.
>>>>>>
>>>>>> I missed that.  Addressed with:
>>>>>>
>>>>>> --- a/datapath/actions.c
>>>>>> +++ b/datapath/actions.c
>>>>>> @@ -258,6 +258,7 @@ static void push_eth(struct sk_buff *skb, const
>>>>>> struct
>>>>>> ovs_action_push_eth *ethh
>>>>>>            skb_push(skb, ETH_HLEN);
>>>>>>            skb_reset_mac_header(skb);
>>>>>>            skb_reset_mac_len(skb);
>>>>>> +       vlan_set_tci(skb, 0);
>>>>> I don't think it's right to throw away the vlan tag. My assumption is
>>>>> that it should be placed into the packet before we push on the
>>>>> Ethernet header (this actually should never happen since we disallow
>>>>> pushing multiple Ethernet headers but it still seems like the right
>>>>> thing to do).
>>>>
>>>> That's not what I would expect from a simple push_eth() action.  If
>>>> someone
>>>> wanted to push a VLAN tag, that should be an explicit action.  So I would
>>>> just throw away whatever there is in skb->vlan_tci.
>>> The contents of skb->vlan_tci were already pushed explicitly (look at
>>> the implementation of push_vlan()). From a user's perspective, there
>>> is no difference - it is just an internal optimization to pull the tag
>>> out of the packet, so I can't see any justification for throwing away
>>> that tag.
>>
>> OK.  I will remove that line.
> But just to be clear, I think we still need to explicitly push the
> vlan tag on the packet. Otherwise, it will migrate from the inner
> Ethernet header to the outer.

Still not sure what I need to do here.  If skb->vlan_tci was set by a 
push_vlan action, then the vlan tag should already have been pushed on 
the packet, right?

>
>>>> BTW, for a packet with multiple Ethernet headers, is there already
>>>> support
>>>> for matching on anything after the outer header?  Does that need
>>>> recirculation?
>>> There isn't currently any support on matching on multiple headers but
>>> it seems like a natural possibility if we introduce the capability to
>>> push/pop Ethernet headers. Once those are there, it seems like you
>>> could do it with recirculation.
>>>
>>>>> Somewhat related, what happens if someone specifies just an EtherType
>>>>> but no Ethernet addresses? Maybe we should just reject this?
>>>>
>>>> In a flow key?  I don't have a strong opinion on this, but I think I
>>>> wouldn't reject it.
>>> Is there any situation where it could be used (what meaning would it
>>> have)?
>>
>> Not that I can think of right now.
>>
>>
>>> Normally we reject things that can never make sense.
>>
>> If that's the policy, I'll change the implementation.
> OK, I think that probably makes sense (we can loosen it later if we
> come up with a use case and add support for it).

I just want to note that the push_eth action has a single struct with 
all the data.  So the netlink message will always have the Ethertype AND 
the Ethernet addresses.  The only thing I can think of validating here 
is if the Ethernet addresses are not all zero, but I think we should 
allow even that.