[ovs-dev] [PATCH v2 3/3] datapath: add layer 3 flow/port support

Jesse Gross jesse at nicira.com
Wed Apr 30 23:19:47 UTC 2014


On Fri, Apr 25, 2014 at 1:52 PM,  <thomas.morin at orange.com> wrote:
> Hi Jesse,
>
> 2014-04-25, Jesse Gross:
>>> Practically speaking, making your layer 3 port code work for
>>> MPLS-over-GRE is not entirely trivial:
>>> - on emission, compose_output_action__ pops the Ethernet header before
>>> push_mpls is called (which is done during commit_odp_actions) ; this
>>> does not work since push_mpls actually relies on the presence of the
>>> Ethernet header to set the Ethernet ethertype to an MPLS ethertype.  One
>>> way to fix it would be to adapt MPLS code to support the case where the
>>> frame is layer3. Another is to have compose_output_action__ call
>>> odp_put_pop_eth_action after commit_odp_actions. I implemented the
>>> latter and it works. This latter option has the benefit preserving
>>> compatibility with other actions that can be done before outputting and
>>> which rely on the Ethernet header being there (such as changing a
>>> Ethernet dst) -- such operations do not make sense if the header is
>>> popped afterward, but I think we should try to not break if the user is
>>> to define such a flow.
>>
>> I think it's actually OK to break (or really ignore) actions that
>> don't make sense. There are some similar examples already - if you try
>> to set the TCP port on an L2 packet then the action will just get
>> ignored.
>
> The alternative is making all MPLS code work for both Ethernet and non
> Ethernet cases. More on this below.
>
>>> - on reception, although I haven't fully tried to reach a resolution,
>>> there is at least an issue in flow_extract: with the current patch flow
>>> extract does not parse MPLS for a layer3 frame. I guess that could be
>>> solved although I haven't clarified exactly yet how flow_extract would
>>> guess what payload is in the ofppuf->packet (as I understand, this is
>>> initially determined in the kernel datapath and stored in skb->protocol,
>>> but it has to be propagated to userspace).
>>
>> I think this would probably have to come as metadata from the
>> originator in some form, similar to the input port which also can't be
>> extracted from the packet.
>
> Ok.
>
>  > In theory, this could either be from the
>> kernel or OpenFlow directly.
>
> (I'm not sure how this can come from Openflow, but possibly that's a
> side question)

An OpenFlow controller can send a packet-out message to inject a
packet. This should behave similar to a received packet but will skip
the kernel entirely.

>>> Based on the above, I wonder if the code couldn't be made overall
>>> simpler by *always* pushing an Ethernet header on reception of a
>>> non-Ethernet tunnelled packet (not only when, after flow identification,
>>> we conclude that the in_port is layer 3 while the out_port is not).
>>> Similarly, on emission, the popping of the Ethernet header would not
>>> have to be conditioned to the in/out ports being layer 3 or not.  The
>>> coexistence of the layer 3 port code with operations relying on the fact
>>> that the packet being manipulated is Ethernet, would become a non-issue
>>> ; the code related to these operations could remain completely untouched.
>>>
>>> In practice, that could be done by doing pop_eth based on flow actions
>>> (as currently done by your patch), but by doing push_eth unconditionally
>>> on reception of a packet from a layer3 tunnel (*not* as a flow action as
>>> currently done in your patch).
>>
>> This is pretty much what is already done in the checked-in LISP code.
>> I think at some point, we need to bite the bullet and figure out how
>> to make OVS independent of Ethernet so this seems to be the time to do
>> it.
>
> I can understand your point of view, although it seems to me that
> normalizing all packets to Ethernet in input would be beneficial in that
> it would avoid having to make a bunch of codepaths conditional to the
> type of payload (MPLS comes to mind, there might be others). It would be
> similarly beneficial in terms of testsuite coverage, and in terms of
> avoiding the introduction of regressions in the kernel datapath or in
> kernel/user space exchanges.

In the past, support for Infiniband has come up a few times. I think
if we started adding fake Ethernet headers that would drive us even
further away from the possibility of being truly multi-protocol.

(I'm still thinking about the earlier messages because I think there
are some significant differences between the kernel netlink protocol
and OpenFlow).



More information about the dev mailing list