[ovs-dev] [PATCH 0/5] dpif-netdev: Cuckoo-Distributor implementation
Fischetti, Antonio
antonio.fischetti at intel.com
Wed Aug 9 08:38:21 UTC 2017
Any comment on this patchset?
Adding Jan in CC.
In one of the recent bi-weekly meetings there was some interest in testing
this patchset in conjunction with the patch that avoids using the EMC for
recirculated packets. That patch is part of the patchset at
https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335938.html
Thanks,
-Antonio
> -----Original Message-----
> From: ovs-dev-bounces at openvswitch.org [mailto:ovs-dev-bounces at openvswitch.org]
> On Behalf Of Wang, Yipeng1
> Sent: Tuesday, July 11, 2017 8:59 PM
> To: Darrell Ball <dball at vmware.com>; dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 0/5] dpif-netdev: Cuckoo-Distributor
> implementation
>
> Thank you Darrell for the comments.
>
> For those who are interested, this patch mainly improves the subtable
> lookup process when the subtable count is large. We have heard of use cases
> where the current sequential search of subtables is not efficient enough.
> With 30 subtables, this patch can achieve more than a 2x speedup. Basically,
> a hash table is used to direct packets to the correct subtable.
>
> We also plan a replacement policy mechanism for version 2; our initial
> results show another 7% improvement on top of the current CD for certain
> use cases.
>
> Please feel free to comment and share any thoughts on this patch.
>
> Thanks
> Yipeng
>
> > -----Original Message-----
> > From: Darrell Ball [mailto:dball at vmware.com]
> > Sent: Friday, July 7, 2017 6:37 PM
> > To: Wang, Yipeng1 <yipeng1.wang at intel.com>; dev at openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH 0/5] dpif-netdev: Cuckoo-Distributor
> > implementation
> >
> > I just noticed this patch set has not had much discussion since the RFC
> > version.
> > It would be nice if the discussion can be revived.
> >
> > Thanks Darrell
> >
> >
> > On 6/13/17, 4:09 PM, "ovs-dev-bounces at openvswitch.org on behalf of
> > yipeng1.wang at intel.com" <ovs-dev-bounces at openvswitch.org on behalf of
> > yipeng1.wang at intel.com> wrote:
> >
> > From: Yipeng Wang <yipeng1.wang at intel.com>
> >
> > The Datapath Classifier uses tuple space search for flow classification.
> > The rules are arranged into a set of tuples/subtables (each with a
> > distinct mask). Each subtable is implemented as a hash table and lookup
> > is done with flow keys formed by selecting the bits from the packet
> > header based on each subtable's mask. Tuple space search will
> > sequentially search each subtable until a match is found. With a large
> > number of subtables, a sequential search of the subtables could consume
> > a lot of CPU cycles. In a testbench with a uniform traffic pattern
> > equally distributed across 20 subtables, we measured that up to 65% of
> > total execution time is attributed to the megaflow cache lookup.
> >
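As a rough illustration of the sequential search described above, here is a minimal Python sketch. It is not taken from lib/dpif-netdev.c; the names `Subtable` and `tss_lookup`, and the use of plain integers as flow keys and masks, are assumptions made for the example.

```python
# Hypothetical sketch of tuple space search; not the dpif-netdev code.
from typing import Dict, List, Optional, Tuple


class Subtable:
    """One subtable: a mask plus a hash table keyed by masked flow keys."""

    def __init__(self, mask: int) -> None:
        self.mask = mask
        self.rules: Dict[int, str] = {}

    def insert(self, key: int, rule: str) -> None:
        self.rules[key & self.mask] = rule

    def lookup(self, key: int) -> Optional[str]:
        # Select only the bits covered by this subtable's mask.
        return self.rules.get(key & self.mask)


def tss_lookup(subtables: List[Subtable], key: int) -> Tuple[Optional[str], int]:
    """Probe subtables in order; return (rule, number of subtables probed)."""
    for probes, subtable in enumerate(subtables, start=1):
        rule = subtable.lookup(key)
        if rule is not None:
            return rule, probes
    return None, len(subtables)


if __name__ == "__main__":
    tables = [Subtable(0xFF00), Subtable(0x00FF), Subtable(0xFFFF)]
    tables[2].insert(0x1234, "rule-c")
    # The matching rule sits in the last subtable, so all three are probed.
    print(tss_lookup(tables, 0x1234))
```

The probe count returned next to the rule is the cost that grows with the number of subtables.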
> > This patch presents the idea of a two-layer hierarchical lookup, where a
> > low-overhead first level of indirection, which we call the cuckoo
> > distributor (CD), is accessed first. If a flow key has been inserted in
> > the flow table, the first level will indicate with high probability
> > which subtable to look into. A lookup is then performed on the second
> > level (the target subtable) to retrieve the result. If the key doesn't
> > have a match, we fall back to the sequential search of subtables. The
> > patch is partially inspired by earlier concepts proposed in
> > "simTable"[1] and "Cuckoo Filter"[2], and by DPDK's Cuckoo Hash
> > implementation.
> >
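The two-level lookup described above can be sketched roughly as follows. This is a simplified illustration and not the patch's implementation: a plain dict stands in for the cuckoo hash table, the 16-bit signature is an assumed size, and learning the subtable index after a successful fallback search is an assumption about when entries get inserted. All class and function names are hypothetical.

```python
# Hypothetical sketch of a CD-style two-level lookup; not the dpif-netdev code.
from typing import Dict, List, Optional, Tuple


class Subtable:
    def __init__(self, mask: int) -> None:
        self.mask = mask
        self.rules: Dict[int, str] = {}

    def insert(self, key: int, rule: str) -> None:
        self.rules[key & self.mask] = rule

    def lookup(self, key: int) -> Optional[str]:
        return self.rules.get(key & self.mask)


class Distributor:
    """First level: maps a short key signature to a subtable index."""

    def __init__(self) -> None:
        # Assumption: a dict replaces the cuckoo hash buckets.
        self.entries: Dict[int, int] = {}

    @staticmethod
    def signature(key: int) -> int:
        # Assumption: 16-bit signature; distinct keys may collide, so a
        # hint is only probably right.
        return hash(key) & 0xFFFF

    def hint(self, key: int) -> Optional[int]:
        return self.entries.get(self.signature(key))

    def learn(self, key: int, index: int) -> None:
        self.entries[self.signature(key)] = index


def cd_lookup(cd: Distributor, subtables: List[Subtable],
              key: int) -> Tuple[Optional[str], int]:
    """Return (rule, number of subtables probed)."""
    probes = 0
    hinted = cd.hint(key)
    if hinted is not None:
        probes += 1
        rule = subtables[hinted].lookup(key)
        if rule is not None:
            return rule, probes
    # No hint, or the hint was wrong: fall back to the sequential search,
    # skipping the subtable that was already probed.
    for index, subtable in enumerate(subtables):
        if index == hinted:
            continue
        probes += 1
        rule = subtable.lookup(key)
        if rule is not None:
            cd.learn(key, index)
            return rule, probes
    return None, probes


if __name__ == "__main__":
    tables = [Subtable(0xFF00), Subtable(0x00FF), Subtable(0xFFFF)]
    tables[2].insert(0x1234, "rule-c")
    cd = Distributor()
    print(cd_lookup(cd, tables, 0x1234))  # first packet: sequential search
    print(cd_lookup(cd, tables, 0x1234))  # later packets: one probe via the hint
```

In this sketch a wrong hint costs one extra probe before the fallback search, which is why the first level needs to be cheap and accurate with high probability.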
> > This patch can improve on the existing Subtable Ranking when traffic
> > data has high entropy. Subtable Ranking helps minimize the number of
> > traversed subtables when most of the traffic hits the same subtable.
> > However, in the case of high-entropy traffic, such as traffic coming
> > from a physical port, multiple subtables could be hit with a similar
> > frequency. In this case the average number of subtable lookups per hit
> > would be much greater than 1. In addition, CD can adaptively turn itself
> > off when it finds that the traffic mostly hits one subtable. Thus, CD
> > will not add overhead when Subtable Ranking works well.
> >
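To put a number on "much greater than 1", the average number of subtables probed per hit under a purely sequential search can be computed from the share of hits each subtable receives. The sketch below is plain arithmetic; the function name and the skewed distribution are illustrative assumptions, not measurements from the patch.

```python
# Illustrative arithmetic only; the hit distributions are made-up examples.
from fractions import Fraction
from typing import Sequence


def expected_probes(hit_fractions: Sequence[Fraction]) -> Fraction:
    """Average subtables probed per hit.

    hit_fractions[i] is the share of hits landing in the i-th subtable
    in search order; a hit there costs i + 1 probes.
    """
    return sum(((i + 1) * f for i, f in enumerate(hit_fractions)), Fraction(0))


if __name__ == "__main__":
    # Uniform traffic over 20 subtables: ranking cannot help.
    uniform = [Fraction(1, 20)] * 20
    print(expected_probes(uniform))  # 21/2, i.e. 10.5 probes per hit

    # 90% of hits in the first-ranked subtable, the rest spread evenly.
    skewed = [Fraction(9, 10)] + [Fraction(1, 190)] * 19
    print(expected_probes(skewed))  # 2
```

For a uniform distribution over N subtables this reduces to (N + 1) / 2, which is the case where a first-level hint saves the most work.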
> > Scheme:
> >
> >        -------
> >       |  CD   |
> >        -------
> >           \
> >            \
> >   -----  -----     -----
> >  |sub  ||sub  |...|sub  |
> >  |table||table|   |table|
> >   -----  -----     -----
> >
> > Evaluation:
> >
> > We create a set of rules with various src IPs. We feed traffic
> > containing various numbers of flows with various src and dst IPs. All
> > the flows hit 10/20/30 rules, creating 10/20/30 subtables.
> >
> > The table below shows the preliminary continuous testing results (full
> > line speed test) we collected with a uni-directional phy-to-phy setup.
> > The machine we tested on is a Xeon E5 server running with 2.2GHz cores.
> > OvS runs with 1 PMD. We use Spirent as the hardware traffic generator.
> >
> > AVX2 data:
> > 20k flows:
> > no.subtable:       10        20        30
> > cd-ovs        4267332   3478251   3126763
> > orig-ovs      3260883   2174551   1689981
> > speedup         1.31x     1.60x     1.85x
> >
> > 100k flows:
> > no.subtable:       10        20        30
> > cd-ovs        4015783   3276100   2970645
> > orig-ovs      2692882   1711955   1302321
> > speedup         1.49x     1.91x     2.28x
> >
> > 1M flows:
> > no.subtable:       10        20        30
> > cd-ovs        3895961   3170530   2968555
> > orig-ovs      2683455   1646227   1240501
> > speedup         1.45x     1.92x     2.39x
> >
> > Scalar data:
> > 1M flows:
> > no.subtable:       10        20        30
> > cd-ovs        3658328   3028111   2863329
> > orig-ovs      2683455   1646227   1240501
> > speedup         1.36x     1.84x     2.31x
> >
> > [1] H. Lee and B. Lee, Approaches for improving tuple space search-based
> > table lookup, ICTC '15
> > [2] B. Fan, D. G. Andersen, M. Kaminsky, and M. D. Mitzenmacher,
> > Cuckoo Filter: Practically Better Than Bloom, CoNEXT '14
> >
> > This patch set is created based on commit
> > a13784ba95efeb5a1f77253df40d433a1ce60087
> >
> > The previous RFCs on the mailing list are at:
> > https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331834.html
> > https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/330570.html
> >
> > Signed-off-by: Yipeng Wang <yipeng1.wang at intel.com>
> > Signed-off-by: Charlie Tai <charlie.tai at intel.com>
> > Co-authored-by: Charlie Tai <charlie.tai at intel.com>
> > Signed-off-by: Sameh Gobriel <sameh.gobriel at intel.com>
> > Co-authored-by: Sameh Gobriel <sameh.gobriel at intel.com>
> > Signed-off-by: Ren Wang <ren.wang at intel.com>
> > Co-authored-by: Ren Wang <ren.wang at intel.com>
> > Signed-off-by: Antonio Fischetti <antonio.fischetti at intel.com>
> > Co-authored-by: Antonio Fischetti <antonio.fischetti at intel.com>
> >
> >
> > Yipeng Wang (5):
> > dpif-netdev: Basic CD feature with scalar lookup.
> > dpif-netdev: Add AVX2 implementation for CD lookup.
> > dpif-netdev: Add CD statistics
> > dpif-netdev: Add adaptive CD mechanism
> > unit-test: Add a delay for CD initialization.
> >
> > lib/dpif-netdev.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > tests/ofproto-dpif.at | 3 +
> > 2 files changed, 558 insertions(+), 11 deletions(-)
> >
> > --
> > 1.9.1
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>