[ovs-dev] [PATCH RFC 5/5] dpif-netdev: Prefetch the cacheline having the cycle stats.

Ilya Maximets i.maximets at samsung.com
Thu Dec 7 14:04:10 UTC 2017


On 05.12.2017 18:11, Bodireddy, Bhanuprakash wrote:
>>
>>> Prefetch the cacheline having the cycle stats so that we can speed up
>>> the cycles_count_start() and cycles_count_intermediate().
>>
>> Do you have any performance results?
> 
> I don’t have numbers for this patch alone. I was testing the overall throughput along with other patches (that were *not* part of this RFC series) to verify performance improvements. I will include them in the commit log once I have measured the individual patches.
> 
> BTW, I usually look at the percentage of instructions retired and the cycles spent in the front end and back end for these functions to see whether prefetching improves or degrades performance.
> 
> - Bhanuprakash.
> 
>>
>>>
>>> Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
>>> ---
>>>  lib/dpif-netdev.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index b74b5d7..ab13d83 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -576,7 +576,7 @@ struct dp_netdev_pmd_thread {
>>>          struct ovs_mutex flow_mutex;
>>>          /* 8 pad bytes. */
>>>      );
>>> -    PADDED_MEMBERS(CACHE_LINE_SIZE,
>>> +    PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cachelineC,
>>>          struct cmap flow_table OVS_GUARDED; /* Flow table. */
>>>
>>>          /* One classifier per in_port polled by the pmd */
>>> @@ -4082,6 +4082,7 @@ reload:
>>>          lc = UINT_MAX;
>>>      }
>>>
>>> +    OVS_PREFETCH_CACHE(&pmd->cachelineC, OPCH_HTW);

How should a prefetch just before the infinite loop improve performance?
I didn't test that, but IMHO, this should have zero impact.
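
To illustrate the concern, here is a minimal standalone sketch. It is not OVS
code: 'struct cycle_stats' and 'poll_loop' are made-up names, the 64-byte
cache line is an assumption, and GCC/Clang's __builtin_prefetch stands in for
whatever OVS_PREFETCH_CACHE expands to. The loop writes the same cache line on
every iteration, so after the first access the line should already be hot; a
single hint issued right before the loop can, at best, help that first access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* Assumption: 64-byte cache lines. */

/* Hypothetical stand-in for the cache line that holds the cycle stats. */
struct cycle_stats {
    _Alignas(CACHE_LINE_SIZE) uint64_t last_cycles;
    uint64_t total_cycles;
};

/* Hypothetical poll loop.  It touches 'stats' on every iteration, roughly
 * the way a per-iteration cycle-accounting call would. */
static uint64_t
poll_loop(struct cycle_stats *stats, size_t iterations, int prefetch_first)
{
    assert(stats != NULL);

    if (prefetch_first) {
        /* One-shot hint before the loop: rw = 1 (prepare for write),
         * locality = 3 (high temporal locality). */
        __builtin_prefetch(stats, 1, 3);
    }

    for (size_t i = 0; i < iterations; i++) {
        uint64_t now = (uint64_t) i + 1;    /* Fake cycle counter. */

        /* Every iteration reads and writes the same cache line. */
        stats->total_cycles += now - stats->last_cycles;
        stats->last_cycles = now;
    }
    return stats->total_cycles;
}
```

Both variants compute the same result, since the prefetch is only a hint.
Whether the one-shot hint makes any measurable difference would have to be
shown by timing the two variants on real hardware.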

>>>      cycles_count_start(pmd);
>>>      for (;;) {
>>>          for (i = 0; i < poll_cnt; i++) {
>>> --
>>> 2.4.11

