[ovs-dev] [PATCH v4 3/3] datapath: add layer 3 flow/port support

Jesse Gross jesse at nicira.com
Wed Aug 6 01:37:05 UTC 2014


On Tue, Aug 5, 2014 at 1:45 PM, Lori Jakab <lojakab at cisco.com> wrote:
> Hi Jesse,
>
> I'll be on vacation starting tomorrow, it looks like this won't be merged
> for at least another two weeks.

Have a nice vacation. Sorry that we couldn't get this in before you leave.

> On 8/2/14, 4:11 AM, Jesse Gross wrote:
>> On Fri, Aug 1, 2014 at 3:19 PM, Lori Jakab <lojakab at cisco.com> wrote:
>>> On 8/1/14, 3:25 AM, Jesse Gross wrote:
>>>> On Thu, Jul 31, 2014 at 5:45 AM, Lori Jakab <lojakab at cisco.com> wrote:
>>>>> On 7/31/14, 10:09 AM, Jesse Gross wrote:
>>>> Going further, what happens if packet comes in with multiple Ethernet
>>>> headers on it (such as PBB)? This will pull off all of the headers but
>>>> that doesn't seem right. If we were to only pull off one, what are the
>>>> semantics with respect to VLANs?
>>>
>>>
>>> I have a feeling we may need to rename this action, since the my
>>> intention
>>> actually was to remove everything up to the layer 3 header.  That
>>> includes
>>> MPLS header(s) and any number of Ethernet headers.  The reason is, I need
>>> an
>>> action that guarantees that the resulting packet is layer 3, and can be
>>> sent
>>> through a layer 3 port.  I think you're right that the expectation from a
>>> pop_eth() action would be what you describe above: if more then one
>>> Ethernet
>>> header is present, remove the outer header only, keep MPLS headers.
>>>
>>> How about renaming this action to extract_l3_packet() or similar?
>>
>> I'm not excited about this. Thomas Morin was already talking about
>> using this code to do MPLS over GRE, which would need exactly the
>> semantics of popping an Ethernet header. I would like to find a way to
>> do this now, otherwise it is likely that we will have to solve the
>> same problem with another new action very soon.
>
>
> Actually, that's what I'm proposing.  We do pop_eth() as you describe above,
> which can be used for MPLS over GRE among other things, and do the
> extract_l3_packet() action as well, for the cases when we need to send a
> packet out on an L3 port, regardless of how many L2 or L2.5 headers we have.

I'm not sure that I understand - your proposal is to have two separate actions?

Besides the fact that it would be nice to unify things, I'm not sure
that it is actually correct to pull off VLAN, MPLS, etc. tags that we
don't understand. Presumably, these are being used to create some kind
of isolation and dropping all the tags that we don't understand would
just blindly merge things together. Also, from an implementation
perspective, we don't really parse beyond what we understand (this is
not quite true for MPLS but it is for VLANs and any unknown protocols)
so we wouldn't necessarily actually get to an L3 header. On the other
hand, if we only deal with tags that we do understand then can't we
have more atomic instructions that pull them off one-by-one?

>>>>>>> +static void push_eth(struct sk_buff *skb, const struct
>>>>>>> ovs_action_push_eth *ethh)
>>>>>>> +{
>>>>>>> +       skb_push(skb, ETH_HLEN);
>>>>>>> +       skb_reset_mac_header(skb);
>>>>>>> +
>>>>>>> +       ether_addr_copy(eth_hdr(skb)->h_source,
>>>>>>> ethh->addresses.eth_src);
>>>>>>> +       ether_addr_copy(eth_hdr(skb)->h_dest,
>>>>>>> ethh->addresses.eth_dst);
>>>>>>> +
>>>>>>> +       eth_hdr(skb)->h_proto = ethh->eth_type;
>>>>>>> +       skb->protocol = ethh->eth_type;
>>>>>>> +
>>>>>>> +       ovs_skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
>>>>>>> +
>>>>>>> +       OVS_CB(skb)->is_layer3 = false;
>>>>>>
>>>>>> What happens if there is something in skb->vlan_tci? I think the
>>>>>> effect will be that it still on the outer Ethernet header, which
>>>>>> doesn't seem correct.
>>>>>
>>>>>
>>>>> I missed that.  Addressed with:
>>>>>
>>>>> --- a/datapath/actions.c
>>>>> +++ b/datapath/actions.c
>>>>> @@ -258,6 +258,7 @@ static void push_eth(struct sk_buff *skb, const
>>>>> struct
>>>>> ovs_action_push_eth *ethh
>>>>>           skb_push(skb, ETH_HLEN);
>>>>>           skb_reset_mac_header(skb);
>>>>>           skb_reset_mac_len(skb);
>>>>> +       vlan_set_tci(skb, 0);
>>>>
>>>> I don't think it's right to throw away the vlan tag. My assumption is
>>>> that it should be placed into the packet before we push on the
>>>> Ethernet header (this actually should never happen since we disallow
>>>> pushing multiple Ethernet headers but it still seems like the right
>>>> thing to do).
>>>
>>>
>>> That's not what I would expect from a simple push_eth() action.  If
>>> someone
>>> wanted to push a VLAN tag, that should be an explicit action.  So I would
>>> just throw away whatever there is in skb->vlan_tci.
>>
>> The contents of skb->vlan_tci were already pushed explicitly (look at
>> the implementation of push_vlan()). From a user's perspective, there
>> is no difference - it is just an internal optimization to pull the tag
>> out of the packet, so I can't see any justification for throwing away
>> that tag.
>
>
> OK.  I will remove that line.

But just to be clear, I think we still need to explicitly push the
vlan tag on the packet. Otherwise, it will migrate from the inner
Ethernet header to the outer.

>>> BTW, for a packet with multiple Ethernet headers, is there already
>>> support
>>> for matching on anything after the outer header?  Does that need
>>> recirculation?
>>
>> There isn't currently any support on matching on multiple headers but
>> it seems like a natural possibility if we introduce the capability to
>> push/pop Ethernet headers. Once those are there, it seems like you
>> could do it with recirculation.
>>
>>>> Somewhat related, what happens if someone specifies just an EtherType
>>>> but no Ethernet addresses? Maybe we should just reject this?
>>>
>>>
>>> In a flow key?  I don't have a strong opinion on this, but I think I
>>> wouldn't reject it.
>>
>> Is there any situation where it could be used (what meaning would it
>> have)?
>
>
> Not that I can think of right now.
>
>
>> Normally we reject things that can never make sense.
>
>
> If that's the policy, I'll change the implementation.

OK, I think that probably makes sense (we can loosen it later if we
come up with a use case and add support for it).



More information about the dev mailing list