[ovs-dev] [PATCH 0/5] dpif-netdev: Cuckoo-Distributor implementation

Darrell Ball dball at vmware.com
Sat Jul 8 01:37:18 UTC 2017


I just noticed this patch set has not had much discussion since the RFC version.
It would be nice if the discussion could be revived.

Thanks Darrell


On 6/13/17, 4:09 PM, "ovs-dev-bounces at openvswitch.org on behalf of yipeng1.wang at intel.com" <ovs-dev-bounces at openvswitch.org on behalf of yipeng1.wang at intel.com> wrote:

    From: Yipeng Wang <yipeng1.wang at intel.com>
    
    The Datapath Classifier uses tuple space search for flow classification.
    The rules are arranged into a set of tuples/subtables (each with a
    distinct mask).  Each subtable is implemented as a hash table and lookup
    is done with flow keys formed by selecting the bits from the packet header
    based on each subtable's mask. Tuple space search will sequentially search
    each subtable until a match is found. With a large number of subtables, a
    sequential search of the subtables could consume a lot of CPU cycles. In
    a testbench with a uniform traffic pattern equally distributed across 20
    subtables, we measured that up to 65% of total execution time is attributed
    to the megaflow cache lookup.
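
As an illustration of the sequential search described above, here is a minimal sketch in C. It is not code from the patch or from lib/dpif-netdev.c: the tss_* names, the 32-bit key, the fixed-size open-addressing table, and the hash function are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TSS_SLOTS 64            /* Slots per subtable; must be a power of 2. */

/* Hypothetical, simplified subtable: one mask plus a small open-addressing
 * hash table of masked keys.  Real subtables match on full flow keys. */
struct tss_subtable {
    uint32_t mask;
    uint32_t keys[TSS_SLOTS];
    int rule_ids[TSS_SLOTS];
    uint8_t used[TSS_SLOTS];
};

/* Illustrative 32-bit mixer, not the hash that OVS uses. */
static uint32_t
tss_hash(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    return x;
}

/* Adds a rule for 'key' under the subtable's mask.  Returns 0 on success,
 * -1 if the subtable is full. */
static int
tss_insert(struct tss_subtable *st, uint32_t key, int rule_id)
{
    uint32_t masked = key & st->mask;
    uint32_t h = tss_hash(masked);

    assert(rule_id >= 0);       /* Negative values are reserved for "miss". */
    for (size_t i = 0; i < TSS_SLOTS; i++) {
        size_t slot = (h + i) & (TSS_SLOTS - 1);
        if (!st->used[slot] || st->keys[slot] == masked) {
            st->used[slot] = 1;
            st->keys[slot] = masked;
            st->rule_ids[slot] = rule_id;
            return 0;
        }
    }
    return -1;
}

/* Looks up 'key' in one subtable.  Returns the rule id, or -1 on a miss. */
static int
tss_subtable_lookup(const struct tss_subtable *st, uint32_t key)
{
    uint32_t masked = key & st->mask;
    uint32_t h = tss_hash(masked);

    for (size_t i = 0; i < TSS_SLOTS; i++) {
        size_t slot = (h + i) & (TSS_SLOTS - 1);
        if (!st->used[slot]) {
            return -1;
        }
        if (st->keys[slot] == masked) {
            return st->rule_ids[slot];
        }
    }
    return -1;
}

/* Sequential tuple space search: probes each subtable in order until one
 * matches.  '*n_visited' reports how many subtables were probed. */
static int
tss_lookup(const struct tss_subtable *tables, size_t n_tables,
           uint32_t key, size_t *n_visited)
{
    *n_visited = 0;
    for (size_t i = 0; i < n_tables; i++) {
        (*n_visited)++;
        int rule = tss_subtable_lookup(&tables[i], key);
        if (rule >= 0) {
            return rule;
        }
    }
    return -1;
}
```

In this sketch the cost of a lookup grows with the position of the matching subtable: a key that only matches the last of N subtables, or matches none, probes all N of them.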
    
    This patch set presents a two-layer hierarchical lookup in which a
    low-overhead first level of indirection, which we call the cuckoo
    distributor (CD), is accessed first. If a flow key has been inserted in
    the flow table, the first level indicates with high probability which
    subtable to look into. A lookup is then performed on the second level
    (the target subtable) to retrieve the result. If the key does not have a
    match, we fall back to the sequential search of subtables. The patch is
    partially inspired by earlier concepts proposed in "simTable"[1] and
    "Cuckoo Filter"[2], and by DPDK's Cuckoo Hash implementation.
    
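The two-level lookup described above could be sketched roughly as follows. This is an illustration only, not the code in this patch set: the cd_* and toy_subtable names, the bucket geometry, the 16-bit signature, and the choice to record a hint after a successful fallback search are all assumptions. The sketch uses two candidate buckets per hash, in the style of a cuckoo filter, but it drops a hint when both buckets are full instead of displacing an existing entry as a real cuckoo table would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CD_BUCKETS 256          /* Must be a power of 2. */
#define CD_ENTRIES 4            /* Entries per bucket. */

struct cd {
    uint16_t sig[CD_BUCKETS][CD_ENTRIES];      /* Short hash signature. */
    uint8_t subtable[CD_BUCKETS][CD_ENTRIES];  /* 1-based index; 0 = empty. */
};

/* Toy subtable holding a single rule, standing in for a real hash table. */
struct toy_subtable {
    uint32_t mask;
    uint32_t value;
    int rule;
};

/* Splits 'hash' into a signature and two candidate buckets.  The second
 * bucket is derived from the first and the signature, as in cuckoo filters. */
static void
cd_locate(uint32_t hash, uint16_t *sig, size_t b[2])
{
    *sig = (uint16_t) (hash >> 16);
    b[0] = hash & (CD_BUCKETS - 1);
    b[1] = (b[0] ^ ((uint32_t) *sig * 0x5bd1e995U)) & (CD_BUCKETS - 1);
}

/* Returns the 1-based subtable index recorded for 'hash', or 0 if none. */
static size_t
cd_lookup(const struct cd *cd, uint32_t hash)
{
    uint16_t sig;
    size_t b[2];

    cd_locate(hash, &sig, b);
    for (size_t i = 0; i < 2; i++) {
        for (size_t j = 0; j < CD_ENTRIES; j++) {
            if (cd->subtable[b[i]][j] && cd->sig[b[i]][j] == sig) {
                return cd->subtable[b[i]][j];
            }
        }
    }
    return 0;
}

/* Records 'subtable' (1-based) for 'hash'.  If both buckets are full the
 * hint is dropped; a real cuckoo table would displace an entry instead. */
static void
cd_insert(struct cd *cd, uint32_t hash, size_t subtable)
{
    uint16_t sig;
    size_t b[2];

    assert(subtable >= 1 && subtable <= UINT8_MAX);
    cd_locate(hash, &sig, b);
    for (size_t i = 0; i < 2; i++) {
        for (size_t j = 0; j < CD_ENTRIES; j++) {
            if (!cd->subtable[b[i]][j] || cd->sig[b[i]][j] == sig) {
                cd->sig[b[i]][j] = sig;
                cd->subtable[b[i]][j] = (uint8_t) subtable;
                return;
            }
        }
    }
}

/* Two-level lookup: try the subtable hinted by CD, then fall back to a
 * sequential search.  'hash' is a precomputed hash of the packet's flow key.
 * Returns the rule id, or -1 on a miss. */
static int
cd_classify(struct cd *cd, const struct toy_subtable *tables, size_t n,
            uint32_t key, uint32_t hash, size_t *n_visited)
{
    size_t hint = cd_lookup(cd, hash);

    *n_visited = 0;
    if (hint >= 1 && hint <= n) {
        const struct toy_subtable *st = &tables[hint - 1];
        (*n_visited)++;
        if ((key & st->mask) == st->value) {
            return st->rule;
        }
    }
    for (size_t i = 0; i < n; i++) {
        if (i + 1 == hint) {
            continue;           /* Already tried via the hint. */
        }
        (*n_visited)++;
        if ((key & tables[i].mask) == tables[i].value) {
            cd_insert(cd, hash, i + 1);     /* Learn the hint on a hit. */
            return tables[i].rule;
        }
    }
    return -1;
}
```

Because only a short signature is stored, two different flows can share a CD entry. A wrong hint costs one extra subtable probe before the sequential fallback, which is why the first level only indicates the subtable "with high probability".
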
    This patch can improve on the existing Subtable Ranking when traffic
    data has high entropy. Subtable Ranking helps minimize the number of
    traversed subtables when most of the traffic hits the same subtable.
    However, in the case of high-entropy traffic, such as traffic coming from
    a physical port, multiple subtables could be hit with similar frequency.
    In this case the average number of subtable lookups per hit would be much
    greater than 1. In addition, CD can adaptively turn itself off when it
    finds that the traffic mostly hits one subtable. Thus, CD will not add
    overhead when Subtable Ranking works well.
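
One way such an adaptive switch could work is sketched below. The cover letter does not describe the actual mechanism, so everything here is an assumption: the cd_adaptive_* names, the fixed sampling window, and the 90% threshold are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CD_WINDOW 1024u         /* Hypothetical window size, in hits. */
#define CD_OFF_PERCENT 90u      /* Hypothetical turn-off threshold. */

struct cd_adaptive {
    uint32_t hits;              /* Hits seen in the current window. */
    uint32_t top_hits;          /* Hits found in the top-ranked subtable. */
    bool enabled;               /* Whether CD is consulted at all. */
};

static void
cd_adaptive_init(struct cd_adaptive *a)
{
    a->hits = 0;
    a->top_hits = 0;
    a->enabled = true;
}

/* Records one classifier hit.  'rank' is the 0-based position of the
 * matching subtable in the ranked order.  Returns whether CD should be
 * used for the following packets. */
static bool
cd_adaptive_record(struct cd_adaptive *a, unsigned int rank)
{
    assert(a->hits < CD_WINDOW);
    a->hits++;
    if (rank == 0) {
        a->top_hits++;
    }
    if (a->hits == CD_WINDOW) {
        /* Ranking alone is good enough when one subtable dominates. */
        a->enabled = a->top_hits * 100u < CD_WINDOW * CD_OFF_PERCENT;
        a->hits = 0;
        a->top_hits = 0;
    }
    return a->enabled;
}
```

For comparison, if hits are spread uniformly across 20 subtables searched in a fixed order, a sequential search probes (20 + 1) / 2 = 10.5 subtables per hit on average, which is the case where keeping CD enabled pays off.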
    
    Scheme:
    
     -------
    |  CD   |
     -------
           \
            \
     -----  -----     -----
    |sub  ||sub  |...|sub  |
    |table||table|   |table|
     -----  -----     -----
    
    Evaluation:
    
    We create a set of rules with various source IPs. We feed traffic
    containing various numbers of flows with various source and destination
    IPs. All the flows hit 10/20/30 rules, creating 10/20/30 subtables.
    
    The table below shows the preliminary continuous testing results (full line
    speed test) we collected with a uni-directional phy-to-phy setup. The
    machine we tested on is a Xeon E5 server running with 2.2GHz cores. OvS
    runs with 1 PMD. We use Spirent as the hardware traffic generator.
    
    AVX2 data:
    20k flows:
    no.subtable: 10          20          30
    cd-ovs       4267332     3478251     3126763
    orig-ovs     3260883     2174551     1689981
    speedup      1.31x       1.60x       1.85x
    
    100k flows:
    no.subtable: 10          20          30
    cd-ovs       4015783     3276100     2970645
    orig-ovs     2692882     1711955     1302321
    speedup      1.49x       1.91x       2.28x
    
    1M flows:
    no.subtable: 10          20          30
    cd-ovs       3895961     3170530     2968555
    orig-ovs     2683455     1646227     1240501
    speedup      1.45x       1.93x       2.39x
    
    Scalar data:
    1M flows:
    no.subtable: 10          20          30
    cd-ovs       3658328     3028111     2863329
    orig-ovs     2683455     1646227     1240501
    speedup      1.36x       1.84x       2.31x
    
    [1] H. Lee and B. Lee, Approaches for improving tuple space search-based
    table lookup, ICTC '15
    [2] B. Fan, D. G. Andersen, M. Kaminsky, and M. D. Mitzenmacher,
    Cuckoo Filter: Practically Better Than Bloom, CoNEXT '14
    
    This patch set is created based on commit
    a13784ba95efeb5a1f77253df40d433a1ce60087
    
    The previous RFC versions on the mailing list are at:
    https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331834.html
    https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/330570.html
    
    Signed-off-by: Yipeng Wang <yipeng1.wang at intel.com>
    Signed-off-by: Charlie Tai <charlie.tai at intel.com>
    Co-authored-by: Charlie Tai <charlie.tai at intel.com>
    Signed-off-by: Sameh Gobriel <sameh.gobriel at intel.com>
    Co-authored-by: Sameh Gobriel <sameh.gobriel at intel.com>
    Signed-off-by: Ren Wang <ren.wang at intel.com>
    Co-authored-by: Ren Wang <ren.wang at intel.com>
    Signed-off-by: Antonio Fischetti <antonio.fischetti at intel.com>
    Co-authored-by: Antonio Fischetti <antonio.fischetti at intel.com>
    
    
    Yipeng Wang (5):
      dpif-netdev: Basic CD feature with scalar lookup.
      dpif-netdev: Add AVX2 implementation for CD lookup.
      dpif-netdev: Add CD statistics
      dpif-netdev: Add adaptive CD mechanism
      unit-test: Add a delay for CD initialization.
    
     lib/dpif-netdev.c     | 566 +++++++++++++++++++++++++++++++++++++++++++++++++-
     tests/ofproto-dpif.at |   3 +
     2 files changed, 558 insertions(+), 11 deletions(-)
    
    -- 
    1.9.1
    


