[ovs-dev] [v4 03/12] dpif-netdev: Add study function to select the best mfex function

Eelco Chaudron echaudro at redhat.com
Wed Jun 30 09:52:21 UTC 2021



On 30 Jun 2021, at 11:32, Van Haaren, Harry wrote:

>> -----Original Message-----
>> From: Eelco Chaudron <echaudro at redhat.com>
>> Sent: Wednesday, June 30, 2021 10:18 AM
>> To: Van Haaren, Harry <harry.van.haaren at intel.com>
>> Cc: Amber, Kumar <kumar.amber at intel.com>; dev at openvswitch.org;
>> i.maximets at ovn.org
>> Subject: Re: [ovs-dev] [v4 03/12] dpif-netdev: Add study function to select the best
>> mfex function
>>
>>
>>
>> On 29 Jun 2021, at 18:32, Van Haaren, Harry wrote:
>>
>>>> -----Original Message-----
>>>> From: dev <ovs-dev-bounces at openvswitch.org> On Behalf Of Eelco Chaudron
>>>> Sent: Tuesday, June 29, 2021 1:38 PM
>>>> To: Amber, Kumar <kumar.amber at intel.com>
>>>> Cc: dev at openvswitch.org; i.maximets at ovn.org
>>>> Subject: Re: [ovs-dev] [v4 03/12] dpif-netdev: Add study function to select the
>> best
>>>> mfex function
>
> <snip previous discussion>
>
>>> Perfect is the enemy of good... I'd prefer focus on getting existing code changes
>> merged,
>>> and add additional (optional) parameters in future if deemed useful in real world
>> testing?
>>
>> See Flavio’s reply, as those were the concerns same concerns I thought of.
>
> Yes - thanks for combining threads - I'm writing a detailed reply there as we speak here :)
> I'll send that reply shortly.
>
> <snip code/patch changes>
>
>>>>> +        if (max_hits >= MFEX_MIN_HIT_COUNT_FOR_USE) {
>>>>> +            /* Set the implementation to index with max_hits. */
>>>>> +            pmd->miniflow_extract_opt =
>>>>> +                        miniflow_funcs[best_func_index].extract_func;
>>>>> +            VLOG_INFO("MFEX study chose impl %s: (hits %d/%d pkts)\n",
>>>>> +                      miniflow_funcs[best_func_index].name, max_hits,
>>>>> +                      stats->pkt_count);
>>>>
>>>> We have no idea which PMD the mode is selected for guess we might need to add
>>>> this?
>>>>
>>>> Maybe we should report the numbers/hits for the other methods, as they might
>> be
>>>> equal, and some might be faster in execution time?
>>>
>>> As above, the implementations are sorted in performance order. Performance
>>> here can be known by micro-benchmarks, and developers of such SIMD optimized
>>> code can be expected to know which impl is fastest.
>>
>> Don’t think we can, as it’s not documented in the code, and some one can just add
>> his own, and has no clue about the existing ones.
>
> Yes, in theory somebody could add his own, and get this wrong. There are many many
> things that could go wrong when making code changes. We cannot document everything.

I meant that the code currently does not document that the implementation table, mfex_impls[], is in order of preference. So I think this should be added.

>>> In our current code, the avx512_vbmi_* impls are always before the avx512_*
>>> impls, as the VBMI instruction set allows a faster runtime.
>>
>> Guess we need some documentation in the developer's section on how to add
>> processor optimized functions, and how to benchmark them (and maybe some
>> benchmark data for the current implementations).
>> Also, someone can write a sloppy avx512_vbmi* function that might be slower than
>> an avx512_*, right?
>
> What are we trying to achieve here? What is the root problem that is being addressed?
>
> Yes, somebody "could" write sloppy (complex, advanced, ISA specific, SIMD) avx512 code,
> and have it be slower. Who is realistically going to do that?
>
> I'm fine with documenting a few things if they make sense to document, but
> trying to "hand hold" at every level just doesn't work. Adding sections on how
> to benchmark code, and how function pointers work and how to add them?
> These things are documented in various places across the internet.
>
> If there's really an interest to learn AVX512 SIMD optimization, reach out to the
> OVS community, put me on CC, and I'll be willing to help. Adding documentation
> ad nauseam is not the solution, as each optimization is likely to have subtle differences.
I think the problem is that except you, and some other small group at Intel might know AVX512, but for most of the OVS community this is moving back to handwritten assembler. So at least some guidelines on what you should do when adding a custom function would help. Like order them in priority, maybe some simple example on how to benchmark the runtime of the mfex function. Don't think this has to be part of this patch, but a follow-up would be nice.
>
>
>>> <snip code changes till end of patch>
> <snip snip away irrelevant old discussions>



More information about the dev mailing list