[ovs-dev] [v4 03/12] dpif-netdev: Add study function to select the best mfex function

Van Haaren, Harry harry.van.haaren at intel.com
Wed Jun 30 09:32:01 UTC 2021

> -----Original Message-----
> From: Eelco Chaudron <echaudro at redhat.com>
> Sent: Wednesday, June 30, 2021 10:18 AM
> To: Van Haaren, Harry <harry.van.haaren at intel.com>
> Cc: Amber, Kumar <kumar.amber at intel.com>; dev at openvswitch.org;
> i.maximets at ovn.org
> Subject: Re: [ovs-dev] [v4 03/12] dpif-netdev: Add study function to select the best
> mfex function
> On 29 Jun 2021, at 18:32, Van Haaren, Harry wrote:
> >> -----Original Message-----
> >> From: dev <ovs-dev-bounces at openvswitch.org> On Behalf Of Eelco Chaudron
> >> Sent: Tuesday, June 29, 2021 1:38 PM
> >> To: Amber, Kumar <kumar.amber at intel.com>
> >> Cc: dev at openvswitch.org; i.maximets at ovn.org
> >> Subject: Re: [ovs-dev] [v4 03/12] dpif-netdev: Add study function to select the
> best
> >> mfex function

<snip previous discussion>

> > Perfect is the enemy of good... I'd prefer focus on getting existing code changes
> merged,
> > and add additional (optional) parameters in future if deemed useful in real world
> testing?
> See Flavio’s reply, as those were the concerns same concerns I thought of.

Yes - thanks for combining threads - I'm writing a detailed reply there as we speak here :)
I'll send that reply shortly.

<snip code/patch changes>

> >>> +        if (max_hits >= MFEX_MIN_HIT_COUNT_FOR_USE) {
> >>> +            /* Set the implementation to index with max_hits. */
> >>> +            pmd->miniflow_extract_opt =
> >>> +                        miniflow_funcs[best_func_index].extract_func;
> >>> +            VLOG_INFO("MFEX study chose impl %s: (hits %d/%d pkts)\n",
> >>> +                      miniflow_funcs[best_func_index].name, max_hits,
> >>> +                      stats->pkt_count);
> >>
> >> We have no idea which PMD the mode is selected for guess we might need to add
> >> this?
> >>
> >> Maybe we should report the numbers/hits for the other methods, as they might
> be
> >> equal, and some might be faster in execution time?
> >
> > As above, the implementations are sorted in performance order. Performance
> > here can be known by micro-benchmarks, and developers of such SIMD optimized
> > code can be expected to know which impl is fastest.
> Don’t think we can, as it’s not documented in the code, and some one can just add
> his own, and has no clue about the existing ones.

Yes, in theory somebody could add his own, and get this wrong. There are many many
things that could go wrong when making code changes. We cannot document everything.

> > In our current code, the avx512_vbmi_* impls are always before the avx512_*
> > impls, as the VBMI instruction set allows a faster runtime.
> Guess we need some documentation in the developer's section on how to add
> processor optimized functions, and how to benchmark them (and maybe some
> benchmark data for the current implementations).
> Also, someone can write a sloppy avx512_vbmi* function that might be slower than
> an avx512_*, right?

What are we trying to achieve here? What is the root problem that is being addressed?

Yes, somebody "could" write sloppy (complex, advanced, ISA specific, SIMD) avx512 code,
and have it be slower. Who is realistically going to do that?

I'm fine with documenting a few things if they make sense to document, but
trying to "hand hold" at every level just doesn't work. Adding sections on how
to benchmark code, and how function pointers work and how to add them?
These things are documented in various places across the internet.

If there's really an interest to learn AVX512 SIMD optimization, reach out to the
OVS community, put me on CC, and I'll be willing to help. Adding documentation
ad nauseam is not the solution, as each optimization is likely to have subtle differences.

> > <snip code changes till end of patch>
<snip snip away irrelevant old discussions>

More information about the dev mailing list