[ovs-dev] MPLS and VLAN QinQ patch

ravi kerur rkerur at gmail.com
Sat Jun 30 15:56:45 UTC 2012


Can you take a look at latest diffs I sent out on Tuesday? It has
simplified ttl handling for both userspace and kernel datapath. Most
of the logic is in ofproto-dpif.c and datapath just updates ttl in the
packet for respective headers. This is the minimal operation any
datapath will do irrespective of where the complexity is.

Its still unclear to me why ttl operations would need packet recirculation?

On Fri, Jun 29, 2012 at 5:12 PM, Jesse Gross <jesse at nicira.com> wrote:
> Ben and I talked a little about the best way to move forward with
> this.  At this point, none of the possible implementations for
> handling TTL are particularly appealing.  Recirculation in the kernel
> is likely the right solution for at least some of the problems but it
> would likely be complicated and not necessarily complete.
>
> In the interests of moving things along, we decided that the best path
> forward would be to process any packets that require one of the
> OpenFlow TTL actions in userspace.  This will obviously be slow but it
> provides a starting pointing which can be improved upon without
> locking down an API now.  There are some examples of this already
> where userspace just installs a flow to send packets up.
>
> This would mean that the kernel only needs at a minimum push and pop
> actions.  As userspace would always know the contents of the packet it
> can pass it down MPLS in their entirety with the kernel having to
> compute anything.
>
> On Tue, Jun 26, 2012 at 5:02 PM, ravi kerur <rkerur at gmail.com> wrote:
>> Patch sent last night had missed some fixes in the kernel + whitespace
>> errors. Anyways attaching latest patch.
>>
>> On Mon, Jun 25, 2012 at 5:34 PM, ravi kerur <rkerur at gmail.com> wrote:
>>> Attached latest patch which mimics upstream kernel offload in ovs for
>>> both mpls and qinq.
>>>
>>> Upstream linux patch has changes for mpls and qinq and includes e1000e
>>> driver changes as well. They can be divided into different sub-patches
>>> before sending upstream but can be reviewed as a whole.
>>>
>>> On Mon, Jun 25, 2012 at 3:54 PM, Jesse Gross <jesse at nicira.com> wrote:
>>>> On Thu, Jun 21, 2012 at 11:08 AM, ravi kerur <rkerur at gmail.com> wrote:
>>>>> On Tue, Jun 19, 2012 at 11:55 AM, Jesse Gross <jesse at nicira.com> wrote:
>>>>>> On Thu, Jun 14, 2012 at 5:29 PM, ravi kerur <rkerur at gmail.com> wrote:
>>>>>>> On Thu, Jun 14, 2012 at 8:22 AM, Jesse Gross <jesse at nicira.com> wrote:
>>>>>>>> On Jun 14, 2012, at 10:13 PM, ravi kerur <rkerur at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Jun 14, 2012 at 3:58 AM, Jesse Gross <jesse at nicira.com> wrote:
>>>>>>>>>> On Thu, Jun 14, 2012 at 1:24 PM, ravi kerur <rkerur at gmail.com> wrote:
>>>>>>>>>>> On Wed, Jun 13, 2012 at 7:58 PM, Jesse Gross <jesse at nicira.com> wrote:
>>>>>>>>>>>> On Thu, Jun 14, 2012 at 4:51 AM, ravi kerur <rkerur at gmail.com> wrote:
>>>>>>>>>>>>> There are additional things that needs to be addressed as well
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. offload code review, it's currently generic enough and getting near
>>>>>>>>>>>>> line rate performance numbers.
>>>>>>>>>>>>
>>>>>>>>>>>> You can't just take what is done for vlans and copy that.  There is
>>>>>>>>>>>> far too much code that you're adding to OVS.  Did you read what I
>>>>>>>>>>>> wrote earlier about where to start?:
>>>>>>>>>>>
>>>>>>>>>>> <rk> How is adding far too much code to ovs related to offload? It is
>>>>>>>>>>> handled similar to what vlan is doing for older kernel + in addition
>>>>>>>>>>> it takes care of handling copy + restore skb->protocol since
>>>>>>>>>>> skb_gso_segment relies on it and handle cases for non-gso packets to
>>>>>>>>>>> calculate checksum. I don't understand your comments, have you looked
>>>>>>>>>>> at latest patch?
>>>>>>>>>>
>>>>>>>>>> The vlan code that's there is backporting and emulation for various
>>>>>>>>>> quirks of vlans on different kernels.  Most of these don't apply to
>>>>>>>>>> MPLS because no version of Linux supports MPLS.  You can't start from
>>>>>>>>>> the backported version though, you need to begin with the correct way
>>>>>>>>>> to do things assuming that you have freedom to modify the upstream
>>>>>>>>>> kernel because OVS is upstream now and that's the future.  Once you
>>>>>>>>>> have things working there, you can backport to older versions but if
>>>>>>>>>> you do it in the opposite order you just end up with a mess.  Once
>>>>>>>>>> again, did you read what I wrote below?  I know that your code isn't
>>>>>>>>>> correct just by looking at the diff stat because you didn't modify the
>>>>>>>>>> file that I told you is the place to start.
>>>>>>>>>
>>>>>>>>> <rk> comments below
>>>>>>>>>>
>>>>>>>>>>>> Generally speaking the emulation code is handled by skb_gso_segment()
>>>>>>>>>>>> in dev.c in the kernel code outside of OVS.  This should mostly work
>>>>>>>>>>>> except that it needs to be able to detect that MPLS requires
>>>>>>>>>>>> emulation.  This will be the easiest part to get working and is the
>>>>>>>>>>>> best place to start.  However, in order for this code to work on any
>>>>>>>>>>>> kernel before your changes get integrated (i.e. Linux 3.6 at the
>>>>>>>>>>>> earliest) you'll have to emulate it in OVS as well, like we do for
>>>>>>>>>>>> vlans in vport-netdev.c.
>>>>>>>>>
>>>>>>>>> <rk> From the above paragraph, I deciphered it as emulate in ovs since
>>>>>>>>> it needs to work with any kernel version(vport-netdev.c) and then in
>>>>>>>>> dev.c and that's what I have done. Modified vport-netdev.c to support
>>>>>>>>> mpls and qinq and for any kernel version. However, it looks like you
>>>>>>>>> wanted me to work on dev.c first and later emulate in ovs via
>>>>>>>>> vport-netdev.c?
>>>>>>>>
>>>>>>>> Yes.
>>>>>>>
>>>>>>> <rk> ok thanks, will work on it. Just thinking about this, why do you
>>>>>>> want to modify upstream code dev.c first and not ovs? Atleast it makes
>>>>>>> sense to work on ovs first since it covers older version + current
>>>>>>> version and via macro LINUX_VERSION_CODE can be controlled on which
>>>>>>> version it's enabled and later handle upstream code(so long as it's
>>>>>>> made sure code gets in upstream on targeted version). Moreover,
>>>>>>> changes in dev.c and vport-netdev.c are completely different and
>>>>>>> cannot be treated as a backport, instead they should be treated as two
>>>>>>> separate changes.
>>>>>>
>>>>>> You must treat it as a backport.  If you look at it as a completely
>>>>>> separate path then you'll almost certainly end up with a different
>>>>>> model for how to do things and it will make it more complicated than
>>>>>> necessary because other parts of the code will need to support both.
>>>>>> Furthermore, the upstream version is going to be simpler so that's
>>>>>> where you want to be choosing the right model.
>>>>>>
>>>>>> Remember, we have to keep both versions in sync over time and the more
>>>>>> deviation there is the harder this gets.
>>>>>
>>>>> <rk> I am not sure I following your argument (first paragraph), but I
>>>>> definitely agree with keeping both ovs emulation and upstream in sync.
>>>>> Let me ask couple of questions to clarify things.
>>>>>
>>>>> Please note code path is the same until netdev_send is called where
>>>>> decision is made to use ovs emulation or kernel based on kernel
>>>>> version.
>>>>>
>>>>> ovs emulation supporting mpls and qinq
>>>>>
>>>>> 1. first clear hardware netif_* flags
>>>>> 2. get gso segments which is a list
>>>>> 3. skb_gso_segment relies on l3/l4 protocol information, save and restore
>>>>> 4. iterate through the loop of segments and transmit each one of them
>>>>>
>>>>> since this is a proven method I have stuck with this in ovs emulation.
>>>>> Are you suggesting to use different logic here, if yes, I don't
>>>>> understand why but would like to know what the argument is before
>>>>> making any changes?
>>>>
>>>> I want it to be based off the upstream version.  Since you haven't
>>>> done that yet, it's clearly not based on that.  That's really all
>>>> there is to it.
>>>>
>>>>> upstream kernel
>>>>> 1. ovs clears hardware netif_* flags before calling dev_queue_xmit
>>>>
>>>> OVS should not clear any flags.  It should initialize them as
>>>> appropriate for the protocol type.
>>>>
>>>>> 2. in skb_gso_segment (upstream kernel code), handle mpls or qinq
>>>>> eth_types such that correct l3 information is extracted
>>>>> 3.  change drivers to handle mpls or qinq eth_types to handle checksum offloads.
>>>>
>>>> Updating drivers is not necessary at this time.
>>>>
>>>>> As I see that changes are different for both(agree upstream kernel
>>>>> code changes are simpler), I was merely suggesting not to treat them
>>>>> as backport and code review can happen in parallel.
>>>>
>>>> I'm not going to review this until the upstream version exists because
>>>> I'm going to review the OVS version by comparing it to upstream.  This
>>>> is how every other feature has been added to OVS, which now supports
>>>> over 20 different kernel versions.  I don't see any reason why this is
>>>> different.



More information about the dev mailing list