[ovs-dev] [PATCH v10 0/5] dpcls func ptrs & optimizations

Thu Jul 11 14:40:59 UTC 2019

> -----Original Message-----
> From: Ilya Maximets [mailto:i.maximets at samsung.com]
> Sent: Thursday, July 11, 2019 3:14 PM
> To: Van Haaren, Harry <harry.van.haaren at intel.com>; dev at openvswitch.org
> Cc: malvika.gupta at arm.com; Stokes, Ian <ian.stokes at intel.com>; Michal Orsák
> <michal.orsak at gmail.com>
> Subject: Re: [PATCH v10 0/5] dpcls func ptrs & optimizations
> 
> On 09.07.2019 15:34, Harry van Haaren wrote:
> > Hey All,
> >
> >
> > Here is v10 of the DPCLS Function Pointer patchset, as has been
> > presented at OVS Conf in Nov '18, and discussed on the ML since then.
> > I'm aware of the soft-freeze for 2.12, but I feel this patchset has had
> > enough reviews/versions/testing to be merged in 2.12.
> >
> > Thanks Ilya and Ian for review comments on v9, they should all be addressed
> > in this v10.
> >
> > Thanks Malvika Gupta for testing (Tested-by tag added to patches) and also
> > for reporting ARM performance gains, see here for details:
> > https://mail.openvswitch.org/pipermail/ovs-dev/2019-June/360088.html
> >
> >
> > Regards, -Harry
> 
> Hi, Harry.
> Thanks for working on this.

My pleasure - it’s a nice part of OVS. And there's lots more to do :)

> I performed some tests with this version in my usual PVP with bonded PHY
> setup and here are some observations:
> 
> * Bug that redirected packets to wrong rules is gone. At least I can't
>   catch it in my testing anymore. Assuming it's fixed now.
> 
> * dpcls performance boost for 512B packets is around 12% compared with
>   current master.

Ah great! Glad to hear it's giving you a performance gain.

> A few remarks about the test scenario:
> Most packets go through the NORMAL action with vlan push/pop.
> Packets that go from the VM to the balanced-tcp bonded PHY go through
> recirculation. Datapath flows for them look like this:
> 
> Before recirculation:
> recirc_id=0,eth,ip,vlan_tci=0x0000/0x1fff,dl_src=aa:16:3e:24:30:dd,dl_dst=aa:bb:cc:dd:ee:11,nw_frag=no
> 
> After recirculation:
> recirc_id=0x1,dp_hash=0xf5/0xff,eth,ip,dl_vlan=42,dl_vlan_pcp=0,nw_frag=no
> 
> I have 256 flows in datapath for different 'dp_hash'es.
> 
> So, even if the number of IPv4 flows is as high as 256K, I have only ~270
> datapath flows in dpcls. (This gives a huge advantage to dpcls over EMC
> and SMC).

Right - I'm a big fan of the consistent performance characteristic of DPCLS,
which is due to its wildcarding capabilities and lack of caching concepts.

> All the flows fit into the 5+1 case, i.e. the optimized function
> dpcls_subtable_lookup_mf_u0w5_u1w1 is used.
> 
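For anyone following along, my reading of the "5+1" case is that the subtable
mask has 5 bits set in the first miniflow unit and 1 bit set in the second,
which is what the u0w5_u1w1 suffix encodes. Below is a minimal sketch in C of
how a specialized lookup could be picked from those bit counts. All names
here (lookup_fn, pick_lookup, the stub functions) are hypothetical and
simplified for illustration; this is not the code from the patchset.

```c
#include <stdint.h>

/* Hypothetical, simplified lookup signature. The real dpcls lookup
 * functions take the subtable, keys and rules as arguments. */
typedef const char *(*lookup_fn)(void);

/* Stub implementations that only report which variant was chosen. */
static const char *lookup_generic(void) { return "generic"; }
static const char *lookup_u0w5_u1w1(void) { return "u0w5_u1w1"; }

/* Count set bits in a 64-bit word (portable, no compiler builtins). */
static unsigned count_1bits(uint64_t x)
{
    unsigned n = 0;

    while (x) {
        x &= x - 1;   /* Clear the lowest set bit. */
        n++;
    }
    return n;
}

/* Pick a lookup function from the number of bits set in each of the
 * two mask units. Only the 5+1 case is specialized in this sketch;
 * every other combination falls back to the generic function. */
static lookup_fn pick_lookup(uint64_t unit0_mask, uint64_t unit1_mask)
{
    unsigned u0_bits = count_1bits(unit0_mask);
    unsigned u1_bits = count_1bits(unit1_mask);

    if (u0_bits == 5 && u1_bits == 1) {
        return lookup_u0w5_u1w1;
    }
    return lookup_generic;
}
```

The selection would run once when a subtable is created, so the per-packet
path only pays for an indirect call through the stored function pointer.
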
> Most interesting observation:
> 
> * The new version of dpcls lookup outperforms SMC in this setup even with a
>   relatively small number of flows. With 8K flows dpcls is faster than SMC
>   by 1.5%, and by 5.7% with 256K flows.
>   Of course, SMC is 10% faster than dpcls with 8 flows, but that's not very
>   interesting because no-one can beat EMC in this area.
>
> I'd like to read the code more carefully tomorrow and probably give some
> more feedback.
> 
> Best regards, Ilya Maximets.

Thanks for your comments - please do prioritize feedback ASAP, because as
you know the 2.12 soft-freeze is already in effect.

I'll work on Ian's comments on v10, but hold off sending v11 until there
is some feedback from you too :)

Thanks again, -Harry