[ovs-dev] [PATCH V7 00/13] Netdev vxlan-decap offload

Van Haaren, Harry harry.van.haaren at intel.com
Thu Jul 8 08:30:41 UTC 2021


> -----Original Message-----
> From: Flavio Leitner <fbl at sysclose.org>
> Sent: Wednesday, July 7, 2021 10:59 PM
> To: Ilya Maximets <i.maximets at ovn.org>
> Cc: Van Haaren, Harry <harry.van.haaren at intel.com>; Eli Britstein
> <elibr at nvidia.com>; ovs dev <dev at openvswitch.org>; Ivan Malov
> <Ivan.Malov at oktetlabs.ru>; Majd Dibbiny <majd at nvidia.com>
> Subject: Re: [ovs-dev] [PATCH V7 00/13] Netdev vxlan-decap offload
> 
> 
> Hi,
> 
> On Wed, Jul 07, 2021 at 04:53:14PM +0200, Ilya Maximets wrote:
> > On 7/6/21 3:34 PM, Van Haaren, Harry wrote:
> > >> -----Original Message-----
> > >> From: Ilya Maximets <i.maximets at ovn.org>
> > >> Sent: Thursday, July 1, 2021 11:32 AM
> > >> To: Van Haaren, Harry <harry.van.haaren at intel.com>; Ilya Maximets
> > >> <i.maximets at ovn.org>
> > >> Cc: Eli Britstein <elibr at nvidia.com>; ovs dev <dev at openvswitch.org>; Ivan
> Malov
> > >> <Ivan.Malov at oktetlabs.ru>; Majd Dibbiny <majd at nvidia.com>; Stokes, Ian
> > >> <ian.stokes at intel.com>; Ferriter, Cian <cian.ferriter at intel.com>; Ben Pfaff
> > >> <blp at ovn.org>; Balazs Nemeth <bnemeth at redhat.com>; Sriharsha
> Basavapatna
> > >> <sriharsha.basavapatna at broadcom.com>
> > >> Subject: Re: [ovs-dev] [PATCH V7 00/13] Netdev vxlan-decap offload
> > >>
> > >> On 6/29/21 1:53 PM, Van Haaren, Harry wrote:
> > >>>> -----Original Message-----
> > >>>> From: Ilya Maximets <i.maximets at ovn.org>
> > >>>> Sent: Monday, June 28, 2021 3:33 PM
> > >>>> To: Van Haaren, Harry <harry.van.haaren at intel.com>; Ilya Maximets
> > >>>> <i.maximets at ovn.org>; Sriharsha Basavapatna
> > >>>> <sriharsha.basavapatna at broadcom.com>
> > >>>> Cc: Eli Britstein <elibr at nvidia.com>; ovs dev <dev at openvswitch.org>;
> Ivan
> > >> Malov
> > >>>> <Ivan.Malov at oktetlabs.ru>; Majd Dibbiny <majd at nvidia.com>; Stokes,
> Ian
> > >>>> <ian.stokes at intel.com>; Ferriter, Cian <cian.ferriter at intel.com>; Ben
> Pfaff
> > >>>> <blp at ovn.org>; Balazs Nemeth <bnemeth at redhat.com>
> > >>>> Subject: Re: [ovs-dev] [PATCH V7 00/13] Netdev vxlan-decap offload
> > >>>>
> > >>>> On 6/25/21 7:28 PM, Van Haaren, Harry wrote:
> > >>>>>> -----Original Message-----
> > >>>>>> From: dev <ovs-dev-bounces at openvswitch.org> On Behalf Of Ilya
> Maximets
> > >>>>>> Sent: Friday, June 25, 2021 4:26 PM
> > >>>>>> To: Sriharsha Basavapatna <sriharsha.basavapatna at broadcom.com>;
> Ilya
> > >>>> Maximets
> > >>>>>> <i.maximets at ovn.org>
> > >>>>>> Cc: Eli Britstein <elibr at nvidia.com>; ovs dev <dev at openvswitch.org>;
> Ivan
> > >>>> Malov
> > >>>>>> <Ivan.Malov at oktetlabs.ru>; Majd Dibbiny <majd at nvidia.com>
> > >>>>>> Subject: Re: [ovs-dev] [PATCH V7 00/13] Netdev vxlan-decap offload
> > >>>>>
> > >>>>> <snip commit message detail>
> > >>>>>
> > >>>>>>>> That looks good to me.  So, I guess, Harsha, we're waiting for
> > >>>>>>>> your review/tests here.
> > >>>>>>>
> > >>>>>>> Thanks Ilya and Eli, looks good to me; I've also tested it and it works
> fine.
> > >>>>>>> -Harsha
> > >>>>>>
> > >>>>>> Thanks, everyone.  Applied to master.
> > >>>>>
> > >>>>> Hi Ilya and OVS Community,
> > >>>>>
> > >>>>> There are open questions around this patchset, why has it been
> merged?
> > >>>>>
> > >>>>> Earlier today, new concerns were raised by Cian around the negative
> > >> performance
> > >>>> impact of these code changes:
> > >>>>> - https://mail.openvswitch.org/pipermail/ovs-dev/2021-
> June/384445.html
> > >>>>>
> > >>>>> Both you (Ilya) and Eli responded, and I was following the conversation.
> Various
> > >>>> code changes were suggested,
> > >>>>> and some may seem like they might work, Eli mentioned some solutions
> might
> > >> not
> > >>>> work due to the hardware:
> > >>>>> I was processing both your comments and input, and planning a
> technical reply
> > >>>> later today.
> > >>>>> - suggestions: https://mail.openvswitch.org/pipermail/ovs-dev/2021-
> > >>>> June/384446.html
> > >>>>> - concerns around hw: https://mail.openvswitch.org/pipermail/ovs-
> dev/2021-
> > >>>> June/384464.html
> > >>>>
> > >>>> Concerns not really about the hardware, but the API itself
> > >>>> that should be clarified a little bit to avoid confusion and
> > >>>> avoid incorrect changes like the one I suggested.
> > >>>> But this is a small enhancement that could be done on top.
> > >>>>
> > >>>>>
> > >>>>> Keep in mind that there are open performance issues to be worked out,
> that
> > >> have
> > >>>> not been resolved at this point in the conversation.
> > >>>>
> > >>>> Performance issue that can be worked out, will be worked out
> > >>>> in a separate patch , v1 for which we already have on a mailing
> > >>>> list for some time, so it didn't make sense to to re-validate
> > >>>> the whole series again due to this one pretty obvious change.
> > >>>>
> > >>>>> There is no agreement on solutions, nor an agreement to ignore the
> > >> performance
> > >>>> degradation, or to try resolve this degradation later.
> > >>>>
> > >>>> Particular part of the packet restoration call seems hard
> > >>>> to avoid in a long term (I don't see a good solution for that),
> > >>>> but the short term solution might be implemented on top.
> > >>>> The part with multiple reads of recirc_id and checking if
> > >>>> offloading is enabled has a fix already (that needs a v2, but
> > >>>> anyway).
> > >>>>
> > >>>>>
> > >>>>> That these patches have been merged is inappropriate:
> > >>>>> 1) Not enough time given for responses (11 am concerns raised, 5pm
> merged
> > >>>> without resolution? (Irish timezone))
> > >>>>
> > >>>> I responded with suggestions and arguments against solutions
> > >>>> suggested in the report, Eli responded with rejection of one
> > >>>> one of my suggestions.  And it seems clear (for me) that
> > >>>> there is no good solution for this part at the moment.
> > >>>> Part of the performance could be won back, but the rest
> > >>>> seems to be inevitable.  As a short-term solution we can
> > >>>> guard the netdev_hw_miss_packet_recover() with experimental
> > >>>> API ifdef, but it will strike back anyway in the future.
> > >>>>
> > >>>>> 2) Open question not addressed/resolved, resulting in a 6% known
> negative
> > >>>> performance impact being merged.
> > >>>>
> > >>>> I don't think it wasn't addressed.
> > >>>
> > >>> Was code merged that resulted in a known regression of 6%?  Yes. Facts
> are facts.
> > >>> I don't care for arguing over exactly what "addressed" means in this
> context.
> > >>>
> > >>>
> > >>>>> 3) Suggestions provided were not reviewed technically in detail (no
> technical
> > >>>> collaboration or code-changes/patches reviewed)
> > >>>>
> > >>>> Patches was heavily reviewed/tested by at least 4 different
> > >>>> parties including 2 test rounds from Intel engineers that,
> > >>>> I believe, included testing of partial offloading.  And that
> > >>>> bothers me the most.  If I can not trust performance test
> > >>>> reports, I'm not sure performance can be a gating factor here.
> > >>>>
> > >>>>>
> > >>>>> I feel that the OVS process of allowing time for community review and
> > >>>> collaboration was not adhered to in this instance.
> > >>>>> As a result, code was merged that is known to cause performance
> degradation.
> > >>>>>
> > >>>>> Therefore, this email is a request to revert these patches as they are not
> > >> currently
> > >>>> fit for inclusion in my opinion.
> > >>>>>
> > >>>>> As next steps, I can propose the following:
> > >>>>> 1) Revert the patches from master branch
> > >>>>> 2) Continue technical discussion on how to avoid negative performance
> impact
> > >>>>> 3) Review solutions, allowing time for responses and replies
> > >>>>> 4) Merge a future revision of this patchset at a later date
> > >>>>
> > >>>> I don't think that there is a necessity to revert the patch-set.
> > >>>> All performance issues can be addressed by small fixes on top:
> > >>>> 1. Patch from Balazs to read certain information per-batch.
> > >>>> 2. Optionally, guard packet miss recovery with ifdef of an
> > >>>>    experimental API. -- I'm not sure about this one as it will
> > >>>>    strike back in the future.
> > >>>
> > >>> Instead of replying point-by-point, let me summarize my stance:
> > >>> 1) I hold the OVS community and its maintainers to the highest standard
> > >>> 2) I feel the OVS review process of reaching a consensus and then merging
> is a
> > >> cornerstone of the community
> > >>> 3) I feel that when patches are merged without consensus, ignoring
> community
> > >> input, and causing known
> > >>> performance degradations, that is not in line with the OVS review process.
> > >>>
> > >>> To move forward in a pragmatic way, instead of ad infinitum discussing
> details, I
> > >> propose the following:
> > >>> 1) We accept that this patchset was merged without community
> consensus, and
> > >> this must never happen again in future.
> > >>
> > >> I would respectfully disagree with this statement again.
> > >
> > > There is a disconnect somewhere. Let my try to simplify the conversation to
> per topic
> > > statements, and try to understand where your disagreement originates.
> > >
> > >
> > >> The patch set was reviewed and tested by many people from
> > >> 4 different companies.  Tests that was performed earlier
> > >> by other Intel engineers, as far as I understand, was much
> > >> closer to real-wold workloads and showed no measurable
> > >> performance degradation to my best knowledge.  Therefore,
> > >> the patch set was merged.
> > >
> > > Yes, tested-by tags were given to a patchset. This is not the root issue
> > > being discussed here. Presence of a tested-by tag does not mean it is
> > > OK to ignore open questions, and merge code without consensus.
> > > Do you agree?
> >
> > I agree that merge without consensus on large questions should
> > not happen.  However, in this case, there was consensus between
> > multiple people and very small incremental changes needs in order
> > to fix performance issues reported.  So, to my judgement merge
> > was justified.
> 
> Definitely there was a discussion between multiple people after
> the non-trivial issue had been raised.  However, I don't see any
> replies from Ferriter, which raised the issue (or someone from
> the same company), discussing the proposed solutions. It seems
> to me that there was not enough time given to Ferriter.
>
> Anyways, I've been in the community for many years now and so far
> it is clear to me that the community strives to build consensus
> with all participants in a discussion.

Agree, and as an OVS community member I value consensus a lot,
hence raising the issue initially.

I'll note that the techie feedback on this list is top-notch,  the OVS 
conferences over the years have been great to chat and discuss all
things OVS, and in general I look forward to continuing to optimize
and improve OVS with this community.

> I assume that everyone in
> this thread wants that too, so can we shift focus on getting a
> good technical solution for the performance impact merged?

Yes, agree that best next steps from here are to focus on getting
the issues addressed, and move on.

> Thank you,
> fbl

Regards, -Harry


More information about the dev mailing list