[ovs-dev] [PATCH v12 0/5] dpcls func ptrs & optimizations
Van Haaren, Harry
harry.van.haaren at intel.com
Wed Jul 17 16:33:38 UTC 2019
> -----Original Message-----
> From: Ilya Maximets [mailto:i.maximets at samsung.com]
> Sent: Wednesday, July 17, 2019 5:26 PM
> To: Van Haaren, Harry <harry.van.haaren at intel.com>; dev at openvswitch.org
> Cc: echaudro at redhat.com; malvika.gupta at arm.com; Stokes, Ian
> <ian.stokes at intel.com>
> Subject: Re: [PATCH v12 0/5] dpcls func ptrs & optimizations
> >> I performed a few tests with v11 of this patch-set on my usual setup to
> >> check
> >> performance of the new generic (non-optimized) implementation. The result
> >> that
> >> new generic implementation is ~5% faster for me than current master (it
> >> 12%
> >> for optimized lookup functions) which is good.
> >> The code looks OK for me in general. I still don't really like the fact
> >> dpcls depends on the internal structure of a flowmap, but we, probably,
> >> could
> >> deal with that while we have a build time assertion. Hope, we'll have
> >> better
> >> implementation with the same level of performance in the future.
> > Thanks for the feedback on performance Ilya - good to see that we're going
> > in the right direction performance wise anyway.
> > I have one item I'd still like to improve in this patchset, and its
> > where the blocks scratch array is being stored. I'll rework and find a
> > solution than the current, and post that as the lucky patchset v13 :)
> As an idea, I could suggest DEFINE_STATIC_PER_THREAD_DATA inside the dpif-
Thanks for the suggestion - that would make the blocks scratch pointer available
on a per-thread basis, and be initialized to NULL correct?
As a result, we would must a (predictable) branch to check if the pointer is
NULL, to allocate the required space on first usage of the subtable_lookup?
Or is there a better way to initialize it for all threads that call dpcls_lookup()?
My concern here is what if one thread creates a subtable (and allocs blocks_scratch for itself),
but then the PMD is polled by another thread - which uses its own blocks_scratch which is NULL.
Hence, I think the runtime check + alloc is probably the best/safest way,
but I'd appreciate your input on the above threading logic Ilya :)
More information about the dev