[ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo Distributor to Accelerate Megaflow Search

Fischetti, Antonio antonio.fischetti at intel.com
Fri Jun 23 07:56:30 UTC 2017


Hi All,
thanks for your feedback. We published a patchset v1 at

http://patchwork.ozlabs.org/patch/775505/

please feel free to review.

Thanks,
Antonio

> -----Original Message-----
> From: Wang, Yipeng1
> Sent: Wednesday, May 3, 2017 12:04 AM
> To: Darrell Ball <dball at vmware.com>; dev at openvswitch.org; jarno at ovn.org;
> jan.scheurich at ericsson.com
> Cc: Tai, Charlie <charlie.tai at intel.com>; Wang, Ren <ren.wang at intel.com>;
> Gobriel, Sameh <sameh.gobriel at intel.com>; Fischetti, Antonio
> <antonio.fischetti at intel.com>
> Subject: RE: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo Distributor to
> Accelerate Megaflow Search
> 
> Thank you Darrell for the comment, we collect some data with the scalar
> version,  please see my reply inlined.  Our newest results show good
> speedup for both scalar and AVX version.
> 
> We are still waiting for more feedback before implementing version 2.
> Please feel free to comment on the patch.
> 
> Thank you.
> 
> > -----Original Message-----
> > From: Darrell Ball [mailto:dball at vmware.com]
> > Sent: Wednesday, April 26, 2017 10:04 PM
> > To: Wang, Yipeng1 <yipeng1.wang at intel.com>; dev at openvswitch.org
> > Cc: Tai, Charlie <charlie.tai at intel.com>; Wang, Ren
> <ren.wang at intel.com>;
> > Gobriel, Sameh <sameh.gobriel at intel.com>
> > Subject: Re: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo Distributor
> to
> > Accelerate Megaflow Search
> >
> >
> >
> > On 4/14/17, 6:10 PM, "Wang, Yipeng1" <yipeng1.wang at intel.com> wrote:
> >
> >     Thank you Darrell for the comments. Please take a look at my reply
> inlined.
> >
> >
> >
> >     > -----Original Message-----
> >
> >     > From: Darrell Ball [mailto:dball at vmware.com]
> >
> >     > Sent: Thursday, April 13, 2017 10:36 PM
> >
> >     > To: Wang, Yipeng1 <yipeng1.wang at intel.com>; dev at openvswitch.org
> >
> >     > Subject: Re: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo
> Distributor
> > to
> >
> >     > Accelerate Megaflow Search
> >
> >     >
> >
> >     >
> >
> >     >
> >
> >     > On 4/6/17, 2:48 PM, "ovs-dev-bounces at openvswitch.org on behalf of
> >
> >     > yipeng1.wang at intel.com" <ovs-dev-bounces at openvswitch.org on
> > behalf of
> >
> >     > yipeng1.wang at intel.com> wrote:
> >
> >     >
> >
> >     >     From: Yipeng Wang <yipeng1.wang at intel.com>
> >
> >     >
> >
> >     >     The Datapath Classifier uses tuple space search for flow
> classification.
> >
> >     >     The rules are arranged into a set of tuples/subtables (each
> with a
> >
> >     >     distinct mask).  Each subtable is implemented as a hash table
> and
> > lookup
> >
> >     >     is done with flow keys formed by selecting the bits from the
> packet
> > header
> >
> >     >     based on each subtable's mask. Tuple space search will
> sequentially
> > search
> >
> >     >     each subtable until a match is found. With a large number of
> subtables,
> > a
> >
> >     >     sequential search of the subtables could consume a lot of CPU
> cycles.
> > In
> >
> >     >     a testbench with a uniform traffic pattern equally distributed
> across 20
> >
> >     >     subtables, we measured that up to 65% of total execution time
> is
> > attributed
> >
> >     >     to the megaflow cache lookup.
> >
> >     >
> >
> >     >     This patch presents the idea of the two-layer hierarchical
> lookup,
> > where a
> >
> >     >     low overhead first level of indirection is accessed first, we
> call this
> >
> >     >     level cuckoo distributor (CD). If a flow key has been inserted
> in the flow
> >
> >     >     table the first level will indicate with high probability that
> which
> >
> >     >     subtable to look into. A lookup is performed on the second
> level (the
> >
> >     >     target subtable) to retrieve the result. If the key doesn’t
> have a match,
> >
> >     >     then we revert back to the sequential search of subtables.
> >
> >     >
> >
> >     >     This patch can improve the already existing Subtable Ranking
> when
> > traffic
> >
> >     >     data has high entropy. Subtable Ranking helps minimize the
> number of
> >
> >     >     traversed subtables when most of the traffic hit the same
> subtable.
> >
> >     >     However, in the case of high entropy traffic such as traffic
> coming from
> >
> >     >     a physical port, multiple subtables could be hit with a
> similar frequency.
> >
> >     >     In this case the average subtable lookups per hit would be
> much
> > greater
> >
> >     >     than 1. In addition, CD can adaptively turn off when it finds
> the traffic
> >
> >     >     mostly hit one subtable. Thus, CD will not be an overhead when
> > Subtable
> >
> >     >     Ranking works well.
> >
> >     >
> >
> >     >     Scheme:
> >
> >     >
> >
> >     >          -------
> >
> >     >         |  CD   |
> >
> >     >          -------
> >
> >     >            \
> >
> >     >             \
> >
> >     >      -----  -----     -----
> >
> >     >     |sub  ||sub  |...|sub  |
> >
> >     >     |table||table|   |table|
> >
> >     >      -----  -----     -----
> >
> >     >
> >
> >     >     Evaluation:
> >
> >     >
> >
> >     >     We create set of rules with various src IP. We feed traffic
> containing 1
> >
> >     >     million flows with various src IP and dst IP. All the flows
> hit 10/20/30
> >
> >     >     rules creating 10/20/30 subtables.
> >
> >     >
> >
> >     >     The table below shows the preliminary continuous testing
> results (full
> > line
> >
> >     >     speed test) we collected with a uni-directional port-to-port
> setup. The
> >
> >     >     machine we tested on is a Xeon E5 server running with 2.2GHz
> cores.
> > OvS
> >
> >     >     runs with 1 PMD. We use Spirent as the hardware traffic
> generator.
> >
> >     >
> >
> >     >     no.subtable: 10          20          30
> >
> >     >     cd-ovs       3895961     3170530     2968555
> >
> >     >     orig-ovs     2683455     1646227     1240501
> >
> >     >     speedup      1.45x       1.92x       2.39x
> >
> >     >
> >
> >     >
> >
> >     > I have a few initial comments.
> >
> >     > 1) Can you present the numbers with and without __AVX2__
> “enabled”.'
> >
> >     [Wang, Yipeng] We mainly test with AVX2 to find the upper-bound
> > performance speedup of the design.  Throughput-wise, we have not
> > optimized for the scalar version thus we did not present the results. If
> people
> > are interested in this patch, we will update the implementation to
> consider
> > the performance for both AVX and scalar in Version 2 and report the
> results.
> > We may design different structure (mainly different entry count per
> bucket)
> > for scalar and AVX to optimize the performance.
> >
> >
> > [Darrell] This seem interesting.
> >                 It would be nice to hear from others.
> >
> >
> [Wang, Yipeng] We found a bug associated with the scalar part of the code.
> We fixed the bug and collected new results for scalar version. Please see
> my reply for the next comment.
> 
> >
> >     > 2) Can you present the numbers with say 20000 and say 100000 flows
> for
> > some
> >
> >     > comparison.
> >
> >     [Wang, Yipeng] As long as flows cannot all fit in EMC, CD should
> benefit.
> > Generally, CD benefit more when there are more flows fall out of EMC. We
> > collect the new results and report them as following:
> >
> >
> >
> >     20000 flows:
> >
> >       no.subtable: 10          20          30
> >
> >         cd-ovs       4267332     3478251     3126763
> >
> >         orig-ovs     3260883     2174551     1689981
> >
> >         speedup      1.31x       1.60x            1.85x
> >
> >
> >
> >     100000 flows:
> >
> >           no.subtable: 10          20          30
> >
> >         cd-ovs       4015783    3276100   2970645
> >
> >         orig-ovs     2692882    1711955    1302321
> >
> >         speedup      1.49x       1.91x       2.28x
> >
> >
> >
> >     > 3) Is the below logic conservative for when CD would provide
> benefit in
> > the case
> >
> >     > of the 1 million flow test for example ?
> >
> >     > if (avg_table_cnt >= 1) {
> >
> >     >     cls->cd_on = 1;
> >
> >     > } else if (avg_table_cnt != -1) {
> >
> >     >      cls->cd_on = 0;
> >
> >     > }
> >
> >     [Wang, Yipeng] We found that as long as the average iterated
> subtable
> > count is larger than 2 (>=1 in the code), CD will benefit. Otherwise CD
> does
> > not benefit much no matter how many flows there are. In such case,
> either
> > there are not many subtables or subtable ranking works well.
> >
> >
> >
> >     > 4)  Why #define CD_ENTRIES 16: did you explore other values ?
> >
> >     [Wang, Yipeng] We set 16 mainly because of two reasons. First, 16
> entries
> > fit in one hardware cache line and AVX2 can process 16 entries together.
> > Second, the more entries per bucket, the less potential key collision.
> > Performance-wise, 16 is not necessarily a good number for scalar
> > implementation. If people are interested in this patch, we will do more
> > exploration on this number considering both scalar and vector
> > implementations.
> >
> >
> > [Darrell]
> > The non-AVX2 results would be relevant and interesting for comparison to
> > baseline and other possible approaches
> > Also, it would be good to have a better entries replacement policy and
> > exercising that code path.
> >
> >
> [Wang, Yipeng]
> The new results for scalar version with  various numbers of entries per
> bucket are shown below (we corrected a bug before collecting):
> 
> 	entry/bucket	                10subtable	20subtable	30subtable
> scalar	   16			3658328	3028111	2863329
> 	   8			3754389	3102534	2941114
> 	   4			3786733	3120870	2920553
> 	   2			3640258	2916161	2637687
> Avx2	   16			3852039	3162984	2965839
> orig_ovs			2683455	1646227	1240501
> 
> The results show that the scalar version of cuckoo distributor still
> achieves significant throughput improvement over the original OvS. The AVX
> version generally have another 1%-5% speedup over the scalar version.
> 
> Meanwhile, for scalar version, using 4 or 8 entries per bucket seems
> better than 16. It is reasonable since for scalar version the lookup
> function loops the entries and 16 entries means longer iteration time.
> However, the difference is small.
> 
> We are still waiting for more feedback for this patch. Besides the bug
> fix, here are two potential new things we plan to implement. First,
> current cuckoo distributor supports up to 255 subtables, if there is
> feedback about real use cases that requires more than 255 subtables, we
> will improve the implementation to support more subtables. Second, we will
> implement a replacement policy for CD as we mentioned.  Both of the new
> things may incur a little bit extra overhead of the lookup path, so we
> would like to wait for more feedback before we design the algorithm for
> version 2.
> 
> >     >
> >
> >     >
> >
> >     >
> >
> >     >
> >
> >     >     Signed-off-by: Yipeng Wang <yipeng1.wang at intel.com>
> >
> >     >     Signed-off-by: Charlie Tai <charlie.tai at intel.com>
> >
> >     >     Co-authored-by: Charlie Tai <charlie.tai at intel.com>
> >
> >     >     Signed-off-by: Sameh Gobriel <sameh.gobriel at intel.com>
> >
> >     >     Co-authored-by: Sameh Gobriel <sameh.gobriel at intel.com>
> >
> >     >     Signed-off-by: Ren Wang <ren.wang at intel.com>
> >
> >     >     Co-authored-by: Ren Wang <ren.wang at intel.com>
> >
> >     >     Signed-off-by: Antonio Fischetti <antonio.fischetti at
> intel.com>
> >
> >     >     Co-authored-by: Antonio Fischetti <antonio.fischetti at
> intel.com>
> >
> >     >     ---
> >
> >     >      lib/dpif-netdev.c     | 654
> >
> >     > ++++++++++++++++++++++++++++++++++++++++++++++++--
> >
> >     >      tests/ofproto-dpif.at |   3 +-
> >
> >     >      2 files changed, 633 insertions(+), 24 deletions(-)
> >
> >     >
> >
> >     >     diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> >
> >     >     index a14a2eb..d9a883b 100644
> >
> >     >     --- a/lib/dpif-netdev.c
> >
> >     >     +++ b/lib/dpif-netdev.c
> >
> >     >     @@ -79,11 +79,23 @@
> >
> >     >
> >
> >     >      VLOG_DEFINE_THIS_MODULE(dpif_netdev);
> >
> >     >
> >
> >     >     +/* Length of Subtable table for cuckoo distributor to index
> subtables.
> >
> >     >     + * The size of the table is at most 256 entires because the
> CD's entry
> > only
> >
> >     >     + * provides 1 byte for indexing.
> >
> >     >     + */
> >
> >     >     +#define SUBTABLE_TABLE_LENGTH 256
> >
> >     >     +
> >
> >     >      #define FLOW_DUMP_MAX_BATCH 50
> >
> >     >      /* Use per thread recirc_depth to prevent recirculation loop.
> */
> >
> >     >      #define MAX_RECIRC_DEPTH 5
> >
> >     >      DEFINE_STATIC_PER_THREAD_DATA(uint32_t, recirc_depth, 0)
> >
> >     >
> >
> >     >     +
> >
> >     >     +#define CD_DEBUG 0
> >
> >     >     +#define debug_print(...) \
> >
> >     >     +            do { if (CD_DEBUG) fprintf(stderr, __VA_ARGS__);
> } while (0)
> >
> >     >     +
> >
> >     >     +
> >
> >     >      /* Configuration parameters. */
> >
> >     >      enum { MAX_FLOWS = 65536 };     /* Maximum number of flows in
> flow
> > table.
> >
> >     > */
> >
> >     >      enum { MAX_METERS = 65536 };    /* Maximum number of meters.
> */
> >
> >     >     @@ -163,6 +175,44 @@ struct emc_cache {
> >
> >     >          int sweep_idx;                /* For
> emc_cache_slow_sweep(). */
> >
> >     >      };
> >
> >     >
> >
> >     >     +
> >
> >     >     +/* Cuckoo distributor (CD) is a 2-hash function hash table.
> >
> >     >     + * For now, the design does not allow desplacing items when
> bucket is
> > full,
> >
> >     >     + * which is different from the behavior of a cuckoo hash
> table.
> >
> >     >     + * The advantage is that we do not need to store two
> sigantures so
> > that
> >
> >     >     + * the struct will be more compact. We use 16 entries per
> bucket for
> > the
> >
> >     >     + * usage of AVX.
> >
> >     >     + *
> >
> >     >     + * Each classifier has its own cuckoo distributor. It is NOT
> thread-safe
> >
> >     >     + */
> >
> >     >     +#define CD_NUM_BUCKETS (1<<16)
> >
> >     >     +#define CD_BUCKET_MASK (CD_NUM_BUCKETS-1)
> >
> >     >     +#define CD_ENTRIES 16
> >
> >     >     +
> >
> >     >     +/* These two seeds are used for hashing two bucket locations
> */
> >
> >     >     +#define CD_PRIM_BUCKET_SEED 10
> >
> >     >     +#define CD_SEC_BUCKET_SEED 20
> >
> >     >     +
> >
> >     >     +/* This bit is used to choose which bucket to replace CD's
> entry in
> > cd_insert*/
> >
> >     >     +#define CD_CHOOSE_SEC_BUCKT_BIT (1 << CD_ENTRIES)
> >
> >     >     +
> >
> >     >     +typedef uint16_t simple_sig_store_t;
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +/* The bucket struct for cuckoo distributor*/
> >
> >     >     +struct cuckoo_distributor_bucket {
> >
> >     >     +    simple_sig_store_t sig[CD_ENTRIES]; /*2-byte long
> signature*/
> >
> >     >     +    uint8_t table_index[CD_ENTRIES];    /*index to subtable
> table*/
> >
> >     >     +    uint8_t flag[CD_ENTRIES];           /*FIXME: not yet
> used*/
> >
> >     >     +} __attribute__ ((packed));
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +struct cuckoo_distributor {
> >
> >     >     +    struct cuckoo_distributor_bucket buckets[CD_NUM_BUCKETS];
> > /*buckets
> >
> >     > array*/
> >
> >     >     +    uint32_t sig_store_bitmask;  /*mask to derive signature
> from hash
> > value*/
> >
> >     >     +} __attribute__ ((aligned (64)));
> >
> >     >     +
> >
> >     >     +
> >
> >     >      /* Iterate in the exact match cache through every entry that
> might
> > contain a
> >
> >     >       * miniflow with hash 'HASH'. */
> >
> >     >      #define EMC_FOR_EACH_POS_WITH_HASH(EMC, CURRENT_ENTRY,
> > HASH)
> >
> >     > \
> >
> >     >     @@ -175,12 +225,19 @@ struct emc_cache {
> >
> >     >
> >
> >     >      /* Time in ms between successive optimizations of the dpcls
> subtable
> > vector
> >
> >     > */
> >
> >     >      #define DPCLS_OPTIMIZATION_INTERVAL 1000
> >
> >     >     +/* Time in ms between the decisions of turning on or off CD
> */
> >
> >     >     +#define DPCLS_CD_OPTIMIZATION_INTERVAL 5000
> >
> >     >
> >
> >     >      struct dpcls {
> >
> >     >          struct cmap_node node;      /* Within
> > dp_netdev_pmd_thread.classifiers */
> >
> >     >          odp_port_t in_port;
> >
> >     >          struct cmap subtables_map;
> >
> >     >          struct pvector subtables;
> >
> >     >     +    struct cuckoo_distributor *cdtable;
> >
> >     >     +    uint8_t cd_on;                    /* turn on of off CD
> during runtime */
> >
> >     >     +    unsigned long cd_insert_cnt;      /* data collected for
> test purpose */
> >
> >     >     +    unsigned long cd_insert_fail_cnt; /* data collected for
> test purpose
> > */
> >
> >     >     +    struct dpcls_subtable*
> subtable_table[SUBTABLE_TABLE_LENGTH];
> >
> >     >      };
> >
> >     >
> >
> >     >      /* A rule to be inserted to the classifier. */
> >
> >     >     @@ -197,7 +254,9 @@ static void
> dpcls_sort_subtable_vector(struct
> > dpcls *);
> >
> >     >      static void dpcls_insert(struct dpcls *, struct dpcls_rule *,
> >
> >     >                               const struct netdev_flow_key *mask);
> >
> >     >      static void dpcls_remove(struct dpcls *, struct dpcls_rule
> *);
> >
> >     >     -static bool dpcls_lookup(struct dpcls *cls,
> >
> >     >     +
> >
> >     >     +struct dp_netdev_pmd_thread;
> >
> >     >     +static bool dpcls_lookup(struct dp_netdev_pmd_thread *pmd,
> struct
> > dpcls
> >
> >     > *cls,
> >
> >     >                               const struct netdev_flow_key keys[],
> >
> >     >                               struct dpcls_rule **rules, size_t
> cnt,
> >
> >     >                               int *num_lookups_p);
> >
> >     >     @@ -322,6 +381,8 @@ enum dp_stat_type {
> >
> >     >          DP_STAT_LOST,               /* Packets not passed up to
> the client. */
> >
> >     >          DP_STAT_LOOKUP_HIT,         /* Number of subtable lookups
> for flow
> > table
> >
> >     >                                         hits */
> >
> >     >     +    DP_CD_STAT_HIT,             /* Packets that hit in cuckoo
> distributor */
> >
> >     >     +    DP_CD_STAT_MISS,            /* Packets that miss in
> cuckoo distributor
> > */
> >
> >     >          DP_N_STATS
> >
> >     >      };
> >
> >     >
> >
> >     >     @@ -539,6 +600,7 @@ struct dp_netdev_pmd_thread {
> >
> >     >          struct cmap classifiers;
> >
> >     >          /* Periodically sort subtable vectors according to hit
> frequencies */
> >
> >     >          long long int next_optimization;
> >
> >     >     +    long long int next_cd_optimization;
> >
> >     >
> >
> >     >          /* Statistics. */
> >
> >     >          struct dp_netdev_pmd_stats stats;
> >
> >     >     @@ -698,6 +760,28 @@ emc_cache_uninit(struct emc_cache
> > *flow_cache)
> >
> >     >          }
> >
> >     >      }
> >
> >     >
> >
> >     >     +/* Initialize the cuckoo distributor structure */
> >
> >     >     +static void
> >
> >     >     +cd_init(struct cuckoo_distributor *cd)
> >
> >     >     +{
> >
> >     >     +    int i, j;
> >
> >     >     +    for (i = 0; i < CD_NUM_BUCKETS; i++) {
> >
> >     >     +        for(j = 0; j < CD_ENTRIES; j++){
> >
> >     >     +            cd->buckets[i].sig[j] = 0;
> >
> >     >     +            cd->buckets[i].table_index[j] = 0;
> >
> >     >     +            cd->buckets[i].flag[j] = 0 ;
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +    cd->sig_store_bitmask = (1 << (8 *
> sizeof(simple_sig_store_t))) - 1;
> >
> >     >     +}
> >
> >     >     +
> >
> >     >     +/* Delete the cuckoo distributor*/
> >
> >     >     +static void
> >
> >     >     +cd_delete(struct cuckoo_distributor *cd)
> >
> >     >     +{
> >
> >     >     +    free(cd);
> >
> >     >     +}
> >
> >     >     +
> >
> >     >      /* Check and clear dead flow references slowly (one entry at
> each
> >
> >     >       * invocation).  */
> >
> >     >      static void
> >
> >     >     @@ -760,7 +844,8 @@ pmd_info_show_stats(struct ds *reply,
> >
> >     >                  stats[i] = 0;
> >
> >     >              }
> >
> >     >
> >
> >     >     -        if (i != DP_STAT_LOST) {
> >
> >     >     +        if (i != DP_STAT_LOST && i != DP_STAT_LOOKUP_HIT
> >
> >     >     +                && i != DP_CD_STAT_HIT && i !=
> DP_CD_STAT_MISS) {
> >
> >     >                  /* Lost packets are already included in
> DP_STAT_MISS */
> >
> >     >                  total_packets += stats[i];
> >
> >     >              }
> >
> >     >     @@ -797,6 +882,11 @@ pmd_info_show_stats(struct ds *reply,
> >
> >     >                        : 0,
> >
> >     >                        stats[DP_STAT_MISS], stats[DP_STAT_LOST]);
> >
> >     >
> >
> >     >     +    ds_put_format(reply,
> >
> >     >     +                  "\tCD hits:%llu\n\tCD miss:%llu\n",
> >
> >     >     +                  stats[DP_CD_STAT_HIT],
> stats[DP_CD_STAT_MISS]);
> >
> >     >     +
> >
> >     >     +
> >
> >     >          if (total_cycles == 0) {
> >
> >     >              return;
> >
> >     >          }
> >
> >     >     @@ -2026,6 +2116,274 @@ emc_insert(struct emc_cache *cache,
> > const
> >
> >     > struct netdev_flow_key *key,
> >
> >     >          emc_change_entry(to_be_replaced, flow, key);
> >
> >     >      }
> >
> >     >
> >
> >     >     +
> >
> >     >     +static inline int
> >
> >     >     +cd_insert(struct cuckoo_distributor *cd,
> >
> >     >     +            const struct netdev_flow_key *key, int index)
> >
> >     >     +{
> >
> >     >     +    /* FIXME: make return value meaningful. */
> >
> >     >     +    int i;
> >
> >     >     +    simple_sig_store_t tmp_sig = (key->hash & cd-
> >sig_store_bitmask);
> >
> >     >     +
> >
> >     >     +    /* First entry of subtable_table (index == 0) means an
> empty entry.
> >
> >     >     +     * Here we should have a valide entry for cd insertion so
> it is not 0.
> >
> >     >     +     */
> >
> >     >     +    ovs_assert(index != 0);
> >
> >     >     +    /* Using 2 hash functions (or different seeds) get 2
> totally
> >
> >     >     +     * random bucket places
> >
> >     >     +     */
> >
> >     >     +    /* FIXME: use higher/lower bits rather than 2 hashes and
> test. */
> >
> >     >     +    uint32_t prim_bucket = hash_int(key->hash,
> > CD_PRIM_BUCKET_SEED)
> >
> >     >     +                            & CD_BUCKET_MASK;
> >
> >     >     +    uint32_t sec_bucket = hash_int(key->hash,
> CD_SEC_BUCKET_SEED)
> >
> >     >     +                            & CD_BUCKET_MASK;
> >
> >     >     +
> >
> >     >     +    /* Check if the signature already in the two buckets */
> >
> >     >     +    for (i = 0; i < CD_ENTRIES; i++) {
> >
> >     >     +        if (cd->buckets[prim_bucket].sig[i] == tmp_sig) {
> >
> >     >     +            cd->buckets[prim_bucket].table_index[i] = index;
> >
> >     >     +            return 0;
> >
> >     >     +        }
> >
> >     >     +        if (cd->buckets[sec_bucket].sig[i] == tmp_sig) {
> >
> >     >     +            cd->buckets[sec_bucket].table_index[i] = index;
> >
> >     >     +            return 0;
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +    /* If not then insert into one slot (prefer empty slot)
> */
> >
> >     >     +    for (i = 0; i < CD_ENTRIES; i++) {
> >
> >     >     +        if(cd->buckets[prim_bucket].table_index[i] == 0){
> >
> >     >     +            cd->buckets[prim_bucket].sig[i] = tmp_sig;
> >
> >     >     +            cd->buckets[prim_bucket].table_index[i] = index;
> >
> >     >     +            return 0;
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +    /* Primary location full */
> >
> >     >     +    if (i == CD_ENTRIES) {
> >
> >     >     +        for(i = 0; i < CD_ENTRIES; i++){
> >
> >     >     +        if(cd->buckets[sec_bucket].table_index[i] == 0){
> >
> >     >     +                cd->buckets[sec_bucket].sig[i] = tmp_sig;
> >
> >     >     +                cd->buckets[sec_bucket].table_index[i] =
> index;
> >
> >     >     +                return 0;
> >
> >     >     +            }
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +    /* Then we should evict someone. */
> >
> >     >     +
> >
> >     >     +    /* FIXME: replace pseudo random to aging based
> replacement
> > policy
> >
> >     >     +     * otherwise a slowly sweep process (like EMC) to kill
> infrequently
> >
> >     >     +     * accessed items could also help
> >
> >     >     +     */
> >
> >     >     +
> >
> >     >     +    uint32_t random = random_uint32();
> >
> >     >     +    uint32_t evict_idx = random & (CD_ENTRIES-1);
> >
> >     >     +    uint32_t bucket_choose = prim_bucket;
> >
> >     >     +
> >
> >     >     +    if (random & CD_CHOOSE_SEC_BUCKT_BIT) {
> >
> >     >     +        bucket_choose = sec_bucket;
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +    cd->buckets[bucket_choose].sig[evict_idx] = tmp_sig;
> >
> >     >     +    cd->buckets[bucket_choose].table_index[evict_idx] =
> index;
> >
> >     >     +    return 0;
> >
> >     >     +}
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +static inline void
> >
> >     >     +cd_compare_signatures(uint32_t *prim_hash_matches, uint32_t
> >
> >     > *sec_hash_matches,
> >
> >     >     +            const struct cuckoo_distributor_bucket *prim_bkt,
> >
> >     >     +            const struct cuckoo_distributor_bucket *sec_bkt,
> >
> >     >     +            simple_sig_store_t sig)
> >
> >     >     +{
> >
> >     >     +#ifdef __AVX2__
> >
> >     >     +
> >
> >     >     +      *prim_hash_matches =
> >
> >     > _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16(
> >
> >     >     +                _mm256_load_si256((__m256i const *)prim_bkt-
> >sig),
> >
> >     >     +                _mm256_set1_epi16(sig)));
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +      *sec_hash_matches =
> >
> >     > _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16(
> >
> >     >     +                _mm256_load_si256((__m256i const *)sec_bkt-
> >sig),
> >
> >     >     +                _mm256_set1_epi16(sig)));
> >
> >     >     +
> >
> >     >     +#else
> >
> >     >     +        unsigned int i;
> >
> >     >     +        for (i = 0; i < CD_ENTRIES; i++) {
> >
> >     >     +            *prim_hash_matches |= ((sig == prim_bkt->sig[i])
> << i);
> >
> >     >     +            *sec_hash_matches |= ((sig == sec_bkt->sig[i]) <<
> i);
> >
> >     >     +        }
> >
> >     >     +#endif
> >
> >     >     +}
> >
> >     >     +
> >
> >     >     +/* 2-stage pipelined cd lookup*/
> >
> >     >     +static inline int
> >
> >     >     +cd_lookup_bulk_pipe( struct dpcls *cls,  const struct
> > netdev_flow_key keys[],
> >
> >     >     +                     int32_t num_keys, uint32_t *hit_mask,
> int data[])
> >
> >     >     +{
> >
> >     >     +    int i;
> >
> >     >     +    uint32_t prim_hitmask = 0;
> >
> >     >     +    uint32_t sec_hitmask = 0;
> >
> >     >     +    uint64_t hits = 0;
> >
> >     >     +    struct cuckoo_distributor* cd = cls->cdtable;
> >
> >     >     +    simple_sig_store_t temp_sig0 = (keys[0].hash) & cd-
> > >sig_store_bitmask;
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +    struct cuckoo_distributor_bucket* prim_bkt0 =
> >
> >     >     +                    &cd->buckets[hash_int(keys[0].hash,
> > CD_PRIM_BUCKET_SEED)
> >
> >     >     +                    & CD_BUCKET_MASK];
> >
> >     >     +    struct cuckoo_distributor_bucket* sec_bkt0 =
> >
> >     >     +                    &cd->buckets[hash_int(keys[0].hash,
> > CD_SEC_BUCKET_SEED)
> >
> >     >     +                    & CD_BUCKET_MASK];
> >
> >     >     +    rte_prefetch0(prim_bkt0);
> >
> >     >     +    rte_prefetch0(sec_bkt0);
> >
> >     >     +
> >
> >     >     +    for (i = 1; i < num_keys; i++) {
> >
> >     >     +        simple_sig_store_t temp_sig1 = (keys[i].hash) & cd-
> > >sig_store_bitmask;
> >
> >     >     +
> >
> >     >     +        struct cuckoo_distributor_bucket* prim_bkt1 =
> >
> >     >     +                    &cd->buckets[hash_int(keys[i].hash,
> > CD_PRIM_BUCKET_SEED)
> >
> >     >     +                    & CD_BUCKET_MASK];
> >
> >     >     +        struct cuckoo_distributor_bucket* sec_bkt1 =
> >
> >     >     +                    &cd->buckets[hash_int(keys[i].hash,
> > CD_SEC_BUCKET_SEED)
> >
> >     >     +                    & CD_BUCKET_MASK];
> >
> >     >     +
> >
> >     >     +        rte_prefetch0(prim_bkt1);
> >
> >     >     +        rte_prefetch0(sec_bkt1);
> >
> >     >     +
> >
> >     >     +#ifdef __AVX2__
> >
> >     >     +
> >
> >     >     +        prim_hitmask =
> >
> >     > _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16(
> >
> >     >     +                _mm256_load_si256((__m256i const *)prim_bkt0-
> >sig),
> >
> >     >     +                _mm256_set1_epi16(temp_sig0)));
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +        sec_hitmask =
> >
> >     > _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16(
> >
> >     >     +                _mm256_load_si256((__m256i const *)sec_bkt0-
> >sig),
> >
> >     >     +                _mm256_set1_epi16(temp_sig0)));
> >
> >     >     +
> >
> >     >     +        if (prim_hitmask) {
> >
> >     >     +            data[i-1] =
> >
> >     >     +                     prim_bkt0-
> >table_index[raw_ctz(prim_hitmask) / 2];
> >
> >     >     +            if (data[i-1] != 0 && cls->subtable_table[data[i-
> 1]] != 0) {
> >
> >     >     +                hits |= 1 << (i - 1);
> >
> >     >     +                prim_bkt0 = prim_bkt1;
> >
> >     >     +                sec_bkt0 = sec_bkt1;
> >
> >     >     +                temp_sig0 = temp_sig1;
> >
> >     >     +                continue;
> >
> >     >     +            }
> >
> >     >     +
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +        if (sec_hitmask) {
> >
> >     >     +            data[i-1] = sec_bkt0-
> >table_index[raw_ctz(sec_hitmask) / 2];
> >
> >     >     +            if (data[i-1] != 0 && cls->subtable_table[data[i-
> 1]] != 0) {
> >
> >     >     +               hits |= 1 << (i - 1);
> >
> >     >     +            }
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +#else
> >
> >     >     +        unsigned int j;
> >
> [Wang, Yipeng]  It should have reset the variables. We will fix the bug in
> version2.
> 	Prim_hitmask = 0;
> 	Sec_hitmask = 0;
> >     >     +        for (j = 0; j < CD_ENTRIES; j++) {
> >
> >     >     +            prim_hitmask |= ((temp_sig0 == prim_bkt0->sig[j])
> << j);
> >
> >     >     +            sec_hitmask |= ((temp_sig0 == sec_bkt0->sig[j])
> << j);
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +        if (prim_hitmask) {
> >
> >     >     +            data[i-1] = prim_bkt0-
> >table_index[raw_ctz(prim_hitmask)];
> >
> >     >     +            if (data[i-1] != 0 && cls->subtable_table[data[i-
> 1]] != 0) {
> >
> >     >     +                hits |= 1 << (i - 1);
> >
> >     >     +                prim_bkt0 = prim_bkt1;
> >
> >     >     +                sec_bkt0 = sec_bkt1;
> >
> >     >     +                temp_sig0 = temp_sig1;
> >
> >     >     +
> >
> >     >     +                continue;
> >
> >     >     +            }
> >
> >     >     +
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +        if (sec_hitmask) {
> >
> >     >     +            data[i-1] = sec_bkt0-
> >table_index[raw_ctz(sec_hitmask)];
> >
> >     >     +            if (data[i-1] != 0 && cls->subtable_table[data[i-
> 1]] != 0) {
> >
> >     >     +                hits |= 1 << (i - 1);
> >
> >     >     +            }
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +#endif
> >
> >     >     +
> >
> >     >     +        prim_bkt0 = prim_bkt1;
> >
> >     >     +        sec_bkt0 = sec_bkt1;
> >
> >     >     +        temp_sig0 = temp_sig1;
> >
> >     >     +
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +#ifdef __AVX2__
> >
> >     >     +
> >
> >     >     +      prim_hitmask =
> >
> >     > _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16(
> >
> >     >     +                _mm256_load_si256((__m256i const *)prim_bkt0-
> >sig),
> >
> >     >     +                _mm256_set1_epi16(temp_sig0)));
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +      sec_hitmask=
> >
> >     > _mm256_movemask_epi8((__m256i)_mm256_cmpeq_epi16(
> >
> >     >     +                _mm256_load_si256((__m256i const *)sec_bkt0-
> >sig),
> >
> >     >     +                _mm256_set1_epi16(temp_sig0)));
> >
> >     >     +
> >
> >     >     +     if (prim_hitmask) {
> >
> >     >     +        data[i-1] = prim_bkt0-
> >table_index[raw_ctz(prim_hitmask) / 2];
> >
> >     >     +        if (data[i-1] != 0 && cls->subtable_table[data[i-1]]
> != 0) {
> >
> >     >     +            hits |= 1 << (i - 1);
> >
> >     >     +            if (hit_mask != NULL)
> >
> >     >     +                *hit_mask = hits;
> >
> >     >     +            return count_1bits(*hit_mask);
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +     }
> >
> >     >     +
> >
> >     >     +    if (sec_hitmask) {
> >
> >     >     +        data[i-1] = sec_bkt0-
> >table_index[raw_ctz(sec_hitmask) / 2];
> >
> >     >     +        if (data[i-1] != 0 && cls->subtable_table[data[i-1]]
> != 0) {
> >
> >     >     +           hits |= 1 << (i - 1);
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +#else
> >
> >     >     +    unsigned int j;
> >
> [Wang, Yipeng] It should have reset the variables. We will fix the bug in
> version2.
> 	Prim_hitmask = 0;
> 	Sec_hitmask = 0;
> 
> >     >     +    for (j = 0; j < CD_ENTRIES; j++) {
> >
> >     >     +        prim_hitmask |= ((temp_sig0 == prim_bkt0->sig[j]) <<
> j);
> >
> >     >     +        sec_hitmask |= ((temp_sig0 == sec_bkt0->sig[j]) <<
> j);
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +    if (prim_hitmask) {
> >
> >     >     +        data[i-1] = prim_bkt0-
> >table_index[raw_ctz(prim_hitmask)];
> >
> >     >     +        if (data[i-1] != 0 && cls->subtable_table[data[i-1]]
> != 0) {
> >
> >     >     +            hits |= 1 << (i-1);
> >
> >     >     +            if (hit_mask != NULL)
> >
> >     >     +                *hit_mask = hits;
> >
> >     >     +            return count_1bits(*hit_mask);
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +    if (sec_hitmask) {
> >
> >     >     +        data[i-1] = sec_bkt0-
> >table_index[raw_ctz(sec_hitmask)];
> >
> >     >     +        if (data[i-1] != 0 && cls->subtable_table[data[i-1]]
> != 0) {
> >
> >     >     +           hits |= 1 << (i - 1);
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +#endif
> >
> >     >     +
> >
> >     >     +    if (hit_mask != NULL)
> >
> >     >     +        *hit_mask = hits;
> >
> >     >     +    return count_1bits(*hit_mask);
> >
> >     >     +}
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +
> >
> >     >      static inline void
> >
> >     >      emc_probabilistic_insert(struct dp_netdev_pmd_thread *pmd,
> >
> >     >                               const struct netdev_flow_key *key,
> >
> >     >     @@ -2065,6 +2423,24 @@ emc_lookup(struct emc_cache *cache,
> > const
> >
> >     > struct netdev_flow_key *key)
> >
> >     >          return NULL;
> >
> >     >      }
> >
> >     >
> >
> >     >     +
> >
> >     >     +static inline struct dpcls_subtable *
> >
> >     >     +dpcls_find_subtable(struct dpcls *cls, const struct
> netdev_flow_key
> > *mask);
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +/* Insert the subtable pointer to the subtable table. */
> >
> >     >     +static inline int
> >
> >     >     +insert_subtable_table(struct dpcls *, struct dpcls_subtable *
> );
> >
> >     >     +
> >
> >     >     +/* Remove a subtable from the subtable table */
> >
> >     >     +static inline int
> >
> >     >     +remove_subtable_table(struct dpcls *, struct dpcls_subtable *
> );
> >
> >     >     +
> >
> >     >     +/* Find the index of a certain subtable. */
> >
> >     >     +static inline int
> >
> >     >     +find_index_in_subtable_table(struct dpcls *, struct
> dpcls_subtable * );
> >
> >     >     +
> >
> >     >     +
> >
> >     >      static struct dp_netdev_flow *
> >
> >     >      dp_netdev_pmd_lookup_flow(struct dp_netdev_pmd_thread *pmd,
> >
> >     >                                const struct netdev_flow_key *key,
> >
> >     >     @@ -2077,7 +2453,7 @@ dp_netdev_pmd_lookup_flow(struct
> >
> >     > dp_netdev_pmd_thread *pmd,
> >
> >     >
> >
> >     >          cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
> >
> >     >          if (OVS_LIKELY(cls)) {
> >
> >     >     -        dpcls_lookup(cls, key, &rule, 1, lookup_num_p);
> >
> >     >     +        dpcls_lookup(pmd, cls, key, &rule, 1, lookup_num_p);
> >
> >     >              netdev_flow = dp_netdev_flow_cast(rule);
> >
> >     >          }
> >
> >     >          return netdev_flow;
> >
> >     >     @@ -2311,7 +2687,8 @@ out:
> >
> >     >      static struct dp_netdev_flow *
> >
> >     >      dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
> >
> >     >                         struct match *match, const ovs_u128 *ufid,
> >
> >     >     -                   const struct nlattr *actions, size_t
> actions_len)
> >
> >     >     +                   const struct nlattr *actions, size_t
> actions_len,
> >
> >     >     +                   const struct netdev_flow_key *key)
> >
> >     >          OVS_REQUIRES(pmd->flow_mutex)
> >
> >     >      {
> >
> >     >          struct dp_netdev_flow *flow;
> >
> >     >     @@ -2358,6 +2735,17 @@ dp_netdev_flow_add(struct
> >
> >     > dp_netdev_pmd_thread *pmd,
> >
> >     >          cmap_insert(&pmd->flow_table, CONST_CAST(struct cmap_node
> *,
> > &flow-
> >
> >     > >node),
> >
> >     >                      dp_netdev_flow_hash(&flow->ufid));
> >
> >     >
> >
> >     >     +    /* Insert to CD here. */
> >
> >     >     +    if (cls->cd_on) {
> >
> >     >     +        if (key) {
> >
> >     >     +            struct dpcls_subtable *subtable =
> dpcls_find_subtable(cls,
> > &mask);
> >
> >     >     +            int index = find_index_in_subtable_table(cls,
> subtable);
> >
> >     >     +            if (index != 0) {
> >
> >     >     +                cd_insert(cls->cdtable, key, index);
> >
> >     >     +            }
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >          if (OVS_UNLIKELY(VLOG_IS_DBG_ENABLED())) {
> >
> >     >              struct ds ds = DS_EMPTY_INITIALIZER;
> >
> >     >              struct ofpbuf key_buf, mask_buf;
> >
> >     >     @@ -2414,7 +2802,7 @@ flow_put_on_pmd(struct
> > dp_netdev_pmd_thread
> >
> >     > *pmd,
> >
> >     >              if (put->flags & DPIF_FP_CREATE) {
> >
> >     >                  if (cmap_count(&pmd->flow_table) < MAX_FLOWS) {
> >
> >     >                      dp_netdev_flow_add(pmd, match, ufid, put-
> >actions,
> >
> >     >     -                                   put->actions_len);
> >
> >     >     +                                   put->actions_len, NULL);
> >
> >     >                      error = 0;
> >
> >     >                  } else {
> >
> >     >                      error = EFBIG;
> >
> >     >     @@ -4160,6 +4548,7 @@ dp_netdev_configure_pmd(struct
> >
> >     > dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
> >
> >     >          cmap_init(&pmd->flow_table);
> >
> >     >          cmap_init(&pmd->classifiers);
> >
> >     >          pmd->next_optimization = time_msec() +
> > DPCLS_OPTIMIZATION_INTERVAL;
> >
> >     >     +    pmd->next_cd_optimization = time_msec() +
> >
> >     > DPCLS_CD_OPTIMIZATION_INTERVAL;
> >
> >     >          hmap_init(&pmd->poll_list);
> >
> >     >          hmap_init(&pmd->tx_ports);
> >
> >     >          hmap_init(&pmd->tnl_port_cache);
> >
> >     >     @@ -4629,7 +5018,7 @@ handle_packet_upcall(struct
> >
> >     > dp_netdev_pmd_thread *pmd, struct dp_packet *packet,
> >
> >     >              if (OVS_LIKELY(!netdev_flow)) {
> >
> >     >                  netdev_flow = dp_netdev_flow_add(pmd, &match,
> &ufid,
> >
> >     >                                                   add_actions-
> >data,
> >
> >     >     -                                             add_actions-
> >size);
> >
> >     >     +                                             add_actions-
> >size, key);
> >
> >     >              }
> >
> >     >              ovs_mutex_unlock(&pmd->flow_mutex);
> >
> >     >              emc_probabilistic_insert(pmd, key, netdev_flow);
> >
> >     >     @@ -4667,7 +5056,7 @@ fast_path_processing(struct
> >
> >     > dp_netdev_pmd_thread *pmd,
> >
> >     >          /* Get the classifier for the in_port */
> >
> >     >          cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
> >
> >     >          if (OVS_LIKELY(cls)) {
> >
> >     >     -        any_miss = !dpcls_lookup(cls, keys, rules, cnt,
> &lookup_cnt);
> >
> >     >     +        any_miss = !dpcls_lookup(pmd, cls, keys, rules, cnt,
> &lookup_cnt);
> >
> >     >          } else {
> >
> >     >              any_miss = true;
> >
> >     >              memset(rules, 0, sizeof(rules));
> >
> >     >     @@ -5426,23 +5815,109 @@ struct dpcls_subtable {
> >
> >     >          struct cmap rules;           /* Contains "struct
> dpcls_rule"s. */
> >
> >     >          uint32_t hit_cnt;            /* Number of match hits in
> subtable in current
> >
> >     >                                          optimization interval. */
> >
> >     >     +
> >
> >     >     +    uint32_t access_cnt;     /* With CD implemented, hit_cnt
> should be
> >
> >     > subtable
> >
> >     >     +                              * hits that miss in CD, so the
> ranking mechanism
> >
> >     >     +                              * which is based on hit_cnt
> still works properly.
> >
> >     >     +                              * We have the access_cnt as
> total access count to
> >
> >     >     +                              * each subtable to consider if
> we should turn on
> >
> >     >     +                              * or turn off CD.
> >
> >     >     +                              */
> >
> >     >     +
> >
> >     >          struct netdev_flow_key mask; /* Wildcards for fields
> (const). */
> >
> >     >          /* 'mask' must be the last field, additional space is
> allocated here. */
> >
> >     >      };
> >
> >     >
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +static int
> >
> >     >     +insert_subtable_table(struct dpcls *cls, struct
> dpcls_subtable*
> > subtable)
> >
> >     >     +{
> >
> >     >     +    int i;
> >
> >     >     +    ovs_assert(subtable != NULL );
> >
> >     >     +    for (i = 1; i < SUBTABLE_TABLE_LENGTH; i++) {
> >
> >     >     +        if (cls->subtable_table[i] == subtable) {
> >
> >     >     +            /* When we insert, we should know that the
> subtable is not
> > inserted
> >
> >     >     +             * before.
> >
> >     >     +             */
> >
> >     >     +            VLOG_ERR("already have the subtable in
> subtable_table");
> >
> >     >     +            return -1;
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +    for (i = 1; i < SUBTABLE_TABLE_LENGTH; i++) {
> >
> >     >     +        if (cls->subtable_table[i] == 0) {
> >
> >     >     +            cls->subtable_table[i] = subtable;
> >
> >     >     +            return i;
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +    /* When the subtable count is larger than
> subtable_table_length
> > (255
> >
> >     > now)*/
> >
> >     >     +    VLOG_INFO("create subtable in subtable_table failed,
> overflow");
> >
> >     >     +    return 0;
> >
> >     >     +}
> >
> >     >     +
> >
> >     >     +static int
> >
> >     >     +remove_subtable_table(struct dpcls *cls, struct
> dpcls_subtable*
> > subtable)
> >
> >     >     +{
> >
> >     >     +
> >
> >     >     +    int i;
> >
> >     >     +    ovs_assert(subtable != NULL );
> >
> >     >     +    for (i = 1; i < SUBTABLE_TABLE_LENGTH; i++) {
> >
> >     >     +        if (cls->subtable_table[i] == subtable) {
> >
> >     >     +            /*reset to subtable index in subtable_table to
> NULL*/
> >
> >     >     +            cls->subtable_table[i] = (struct
> dpcls_subtable*)0;
> >
> >     >     +            return i;
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +    /* Happens when remove while more subtable than the
> >
> >     > subtable_table_length*/
> >
> >     >     +    VLOG_INFO("cannot find the table ptr in subtable_table to
> > remove");
> >
> >     >     +    return 0;
> >
> >     >     +}
> >
> >     >     +
> >
> >     >     +static int
> >
> >     >     +find_index_in_subtable_table(struct dpcls *cls,
> >
> >     >     +                             struct dpcls_subtable* subtable)
> >
> >     >     +{
> >
> >     >     +    int i;
> >
> >     >     +    ovs_assert(subtable != NULL );
> >
> >     >     +    for (i = 1; i < SUBTABLE_TABLE_LENGTH; i++) {
> >
> >     >     +        if (cls->subtable_table[i] == subtable) {
> >
> >     >     +            return i;
> >
> >     >     +        }
> >
> >     >     +    }
> >
> >     >     +    return 0;
> >
> >     >     +}
> >
> >     >     +
> >
> >     >     +
> >
> >     >      /* Initializes 'cls' as a classifier that initially contains
> no classification
> >
> >     >       * rules. */
> >
> >     >      static void
> >
> >     >      dpcls_init(struct dpcls *cls)
> >
> >     >      {
> >
> >     >     +    int i;
> >
> >     >          cmap_init(&cls->subtables_map);
> >
> >     >          pvector_init(&cls->subtables);
> >
> >     >     +    int ret = posix_memalign((void**)&cls->cdtable, 64,
> >
> >     >     +                            sizeof(struct
> cuckoo_distributor));
> >
> >     >     +    if(ret != 0) {
> >
> >     >     +        VLOG_ERR("Create cuckoo distributor failed");
> >
> >     >     +    }
> >
> >     >     +    cd_init(cls->cdtable);
> >
> >     >     +    cls->cd_on = 1;
> >
> >     >     +    cls->cd_insert_cnt = 0;
> >
> >     >     +    cls->cd_insert_fail_cnt = 0;
> >
> >     >     +    for(i = 0; i < SUBTABLE_TABLE_LENGTH; i++){
> >
> >     >     +       cls->subtable_table[i] = 0;
> >
> >     >     +    }
> >
> >     >     +    random_set_seed(100);
> >
> >     >      }
> >
> >     >
> >
> >     >      static void
> >
> >     >      dpcls_destroy_subtable(struct dpcls *cls, struct
> dpcls_subtable
> > *subtable)
> >
> >     >      {
> >
> >     >          VLOG_DBG("Destroying subtable %p for in_port %d",
> subtable, cls-
> >
> >     > >in_port);
> >
> >     >     +    remove_subtable_table(cls, subtable);
> >
> >     >          pvector_remove(&cls->subtables, subtable);
> >
> >     >          cmap_remove(&cls->subtables_map, &subtable->cmap_node,
> >
> >     >                      subtable->mask.hash);
> >
> >     >     @@ -5465,6 +5940,7 @@ dpcls_destroy(struct dpcls *cls)
> >
> >     >              }
> >
> >     >              cmap_destroy(&cls->subtables_map);
> >
> >     >              pvector_destroy(&cls->subtables);
> >
> >     >     +        cd_delete(cls->cdtable);
> >
> >     >          }
> >
> >     >      }
> >
> >     >
> >
> >     >     @@ -5478,6 +5954,7 @@ dpcls_create_subtable(struct dpcls *cls,
> const
> >
> >     > struct netdev_flow_key *mask)
> >
> >     >                             - sizeof subtable->mask.mf + mask-
> >len);
> >
> >     >          cmap_init(&subtable->rules);
> >
> >     >          subtable->hit_cnt = 0;
> >
> >     >     +    subtable->access_cnt = 0;
> >
> >     >          netdev_flow_key_clone(&subtable->mask, mask);
> >
> >     >          cmap_insert(&cls->subtables_map, &subtable->cmap_node,
> mask-
> > >hash);
> >
> >     >          /* Add the new subtable at the end of the pvector (with
> no hits yet)
> > */
> >
> >     >     @@ -5485,6 +5962,9 @@ dpcls_create_subtable(struct dpcls *cls,
> const
> >
> >     > struct netdev_flow_key *mask)
> >
> >     >          VLOG_DBG("Creating %"PRIuSIZE". subtable %p for in_port
> %d",
> >
> >     >                   cmap_count(&cls->subtables_map), subtable, cls-
> >in_port);
> >
> >     >          pvector_publish(&cls->subtables);
> >
> >     >     +    int ret = insert_subtable_table(cls, subtable);
> >
> >     >     +    /* The subtable should not be in subtable-table yet */
> >
> >     >     +    ovs_assert(ret >= 0);
> >
> >     >
> >
> >     >          return subtable;
> >
> >     >      }
> >
> >     >     @@ -5510,7 +5990,6 @@ dpcls_sort_subtable_vector(struct dpcls
> *cls)
> >
> >     >      {
> >
> >     >          struct pvector *pvec = &cls->subtables;
> >
> >     >          struct dpcls_subtable *subtable;
> >
> >     >     -
> >
> >     >          PVECTOR_FOR_EACH (subtable, pvec) {
> >
> >     >              pvector_change_priority(pvec, subtable, subtable-
> >hit_cnt);
> >
> >     >              subtable->hit_cnt = 0;
> >
> >     >     @@ -5537,6 +6016,39 @@ dp_netdev_pmd_try_optimize(struct
> >
> >     > dp_netdev_pmd_thread *pmd)
> >
> >     >                  pmd->next_optimization = now +
> > DPCLS_OPTIMIZATION_INTERVAL;
> >
> >     >              }
> >
> >     >          }
> >
> >     >     +
> >
> >     >     +    if (now > pmd->next_cd_optimization) {
> >
> >     >     +
> >
> >     >     +        CMAP_FOR_EACH (cls, node, &pmd->classifiers) {
> >
> >     >     +            struct pvector *pvec = &cls->subtables;
> >
> >     >     +            struct dpcls_subtable *subtable;
> >
> >     >     +            float avg_table_cnt = 0;
> >
> >     >     +            int cnt = 0;
> >
> >     >     +            uint32_t total = 0;
> >
> >     >     +            uint32_t sum = 0;
> >
> >     >     +            PVECTOR_FOR_EACH (subtable,pvec) {
> >
> >     >     +                sum += subtable->access_cnt * cnt;
> >
> >     >     +                total += subtable->access_cnt;
> >
> >     >     +                subtable->access_cnt = 0;
> >
> >     >     +                cnt++;
> >
> >     >     +            }
> >
> >     >     +            /* If total access is too small, we keep previous
> decision */
> >
> >     >     +            if (total > cnt * 5) {
> >
> >     >     +                avg_table_cnt = (float)sum / total;
> >
> >     >     +            }
> >
> >     >     +            else {
> >
> >     >     +                avg_table_cnt = -1;
> >
> >     >     +            }
> >
> >     >     +
> >
> >     >     +            if (avg_table_cnt >= 1) {
> >
> >     >     +                cls->cd_on = 1;
> >
> >     >     +            }
> >
> >     >     +            else if (avg_table_cnt != -1) {
> >
> >     >     +                cls->cd_on = 0;
> >
> >     >     +            }
> >
> >     >     +        }
> >
> >     >     +        pmd->next_cd_optimization = now +
> >
> >     > DPCLS_CD_OPTIMIZATION_INTERVAL;
> >
> >     >     +    }
> >
> >     >      }
> >
> >     >
> >
> >     >      /* Insert 'rule' into 'cls'. */
> >
> >     >     @@ -5587,6 +6099,10 @@ dpcls_rule_matches_key(const struct
> > dpcls_rule
> >
> >     > *rule,
> >
> >     >          return true;
> >
> >     >      }
> >
> >     >
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +
> >
> >     >      /* For each miniflow in 'keys' performs a classifier lookup
> writing the
> > result
> >
> >     >       * into the corresponding slot in 'rules'.  If a particular
> entry in 'keys' is
> >
> >     >       * NULL it is skipped.
> >
> >     >     @@ -5597,40 +6113,113 @@ dpcls_rule_matches_key(const struct
> > dpcls_rule
> >
> >     > *rule,
> >
> >     >       * priorities, instead returning any rule which matches the
> flow.
> >
> >     >       *
> >
> >     >       * Returns true if all miniflows found a corresponding rule.
> */
> >
> >     >     -static bool
> >
> >     >     -dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key
> keys[],
> >
> >     >     -             struct dpcls_rule **rules, const size_t cnt,
> >
> >     >     -             int *num_lookups_p)
> >
> >     >     + static bool
> >
> >     >     + dpcls_lookup(struct dp_netdev_pmd_thread *pmd, struct dpcls
> *cls,
> >
> >     >     +              const struct netdev_flow_key keys[],
> >
> >     >     +              struct dpcls_rule **rules, const size_t cnt,
> >
> >     >     +              int *num_lookups_p)
> >
> >     >      {
> >
> >     >          /* The received 'cnt' miniflows are the search-keys that
> will be
> > processed
> >
> >     >           * to find a matching entry into the available subtables.
> >
> >     >           * The number of bits in map_type is equal to
> NETDEV_MAX_BURST.
> > */
> >
> >     >     +
> >
> >     >     +    if (cnt == 0) {
> >
> >     >     +        return false;
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >          typedef uint32_t map_type;
> >
> >     >     +
> >
> >     >      #define MAP_BITS (sizeof(map_type) * CHAR_BIT)
> >
> >     >          BUILD_ASSERT_DECL(MAP_BITS >= NETDEV_MAX_BURST);
> >
> >     >
> >
> >     >          struct dpcls_subtable *subtable;
> >
> >     >     -
> >
> >     >          map_type keys_map = TYPE_MAXIMUM(map_type); /* Set all
> bits.
> > */
> >
> >     >     -    map_type found_map;
> >
> >     >     +    map_type found_map = keys_map;
> >
> >     >          uint32_t hashes[MAP_BITS];
> >
> >     >          const struct cmap_node *nodes[MAP_BITS];
> >
> >     >     +    int cd_match = 0;
> >
> >     >
> >
> >     >          if (cnt != MAP_BITS) {
> >
> >     >              keys_map >>= MAP_BITS - cnt; /* Clear extra bits. */
> >
> >     >          }
> >
> >     >     +
> >
> >     >          memset(rules, 0, cnt * sizeof *rules);
> >
> >     >
> >
> >     >          int lookups_match = 0, subtable_pos = 1;
> >
> >     >
> >
> >     >     -    /* The Datapath classifier - aka dpcls - is composed of
> subtables.
> >
> >     >     -     * Subtables are dynamically created as needed when new
> rules are
> >
> >     > inserted.
> >
> >     >     -     * Each subtable collects rules with matches on a
> specific subset of
> > packet
> >
> >     >     -     * fields as defined by the subtable's mask.  We proceed
> to process
> > every
> >
> >     >     -     * search-key against each subtable, but when a match is
> found for a
> >
> >     >     -     * search-key, the search for that key can stop because
> the rules are
> >
> >     >     -     * non-overlapping. */
> >
> >     >     +    if (cls->cd_on) {
> >
> >     >     +
> >
> >     >     +        int i;
> >
> >     >     +        int data[MAP_BITS];
> >
> >     >     +        int valid_cnt = count_1bits(keys_map);
> >
> >     >     +        int nfound = cd_lookup_bulk_pipe(cls, keys,
> valid_cnt,
> > &found_map,
> >
> >     >     +                                            data);
> >
> >     >     +
> >
> >     >     +        debug_print("CD found %d   maps %x\n", nfound,
> found_map);
> >
> >     >     +
> >
> >     >     +        ULLONG_FOR_EACH_1(i, found_map) {
> >
> >     >     +            hashes[i] =
> netdev_flow_key_hash_in_mask(&keys[i],
> >
> >     >     +                                    &(cls-
> >subtable_table[data[i]])->mask);
> >
> >     >     +            nodes[i] = cmap_find(&((cls-
> >subtable_table[data[i]])->rules),
> >
> >     >     +                                 hashes[i]);
> >
> >     >     +            if (nodes[i] != NULL) {
> >
> >     >     +                struct dpcls_rule *rule;
> >
> >     >     +                CMAP_NODE_FOR_EACH (rule, cmap_node,
> nodes[i]) {
> >
> >     >     +                    if
> (OVS_LIKELY(dpcls_rule_matches_key(rule, &keys[i]))) {
> >
> >     >     +                        rules[i] = rule;
> >
> >     >     +                        cls->subtable_table[data[i]]-
> >access_cnt++;
> >
> >     >     +                        lookups_match += 1;
> >
> >     >     +                        cd_match += 1;
> >
> >     >     +                        goto scnext;
> >
> >     >     +                    }
> >
> >     >     +                }
> >
> >     >     +                ULLONG_SET0(found_map, i);  /* Did not match.
> */
> >
> >     >     +                /*
> >
> >     >     +                 * Here means key same in subtable but not
> same rule
> >
> >     >     +                 * since CD find the correct subtable,
> >
> >     >     +                 * we dont need to insert to CD.
> >
> >     >     +                 */
> >
> >     >     +
> >
> >     >     +                scnext:
> >
> >     >     +                ;
> >
> >     >     +            }
> >
> >     >     +            else if (nodes[i] == NULL) {
> >
> >     >     +                 ULLONG_SET0(found_map, i);
> >
> >     >     +                 /* Here means in CD but not in the target
> subtable.
> >
> >     >     +                  * meaning that it matches to a same (but
> wrong) key in CD.
> >
> >     >     +                  * we should insert it into CD later when we
> know
> >
> >     >     +                  * which subtable it hits.
> >
> >     >     +                  */
> >
> >     >     +            }
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +        keys_map &= ~found_map;
> >
> >     >     +
> >
> >     >     +        dp_netdev_count_packet(pmd, DP_CD_STAT_HIT,
> cd_match);
> >
> >     >     +        dp_netdev_count_packet(pmd, DP_CD_STAT_MISS, cnt-
> > cd_match);
> >
> >     >     +
> >
> >     >     +        if (!keys_map) {
> >
> >     >     +            if (num_lookups_p) {
> >
> >     >     +                *num_lookups_p = lookups_match;
> >
> >     >     +            }
> >
> >     >     +
> >
> >     >     +            debug_print( "every key found in CD\n");
> >
> >     >     +            return true;              /* All found. */
> >
> >     >     +        }
> >
> >     >     +
> >
> >     >     +        debug_print( "Need search subtable (CD miss)\n");
> >
> >     >     +    }
> >
> >     >     +
> >
> >     >     +
> >
> >     >     +    /*The Datapath classifier - aka dpcls - is composed of
> subtables.
> >
> >     >     +    * Subtables are dynamically created as needed when new
> rules are
> >
> >     > inserted.
> >
> >     >     +    * Each subtable collects rules with matches on a specific
> subset of
> > packet
> >
> >     >     +    * fields as defined by the subtable's mask.  We proceed
> to process
> > every
> >
> >     >     +    * search-key against each subtable, but when a match is
> found for a
> >
> >     >     +    * search-key, the search for that key can stop because
> the rules are
> >
> >     >     +    * non-overlapping. */
> >
> >     >          PVECTOR_FOR_EACH (subtable, &cls->subtables) {
> >
> >     >     +
> >
> >     >              int i;
> >
> >     >
> >
> >     >              /* Compute hashes for the remaining keys.  Each
> search-key is
> >
> >     >     @@ -5649,13 +6238,13 @@ dpcls_lookup(struct dpcls *cls, const
> struct
> >
> >     > netdev_flow_key keys[],
> >
> >     >               * search-key. */
> >
> >     >              ULLONG_FOR_EACH_1(i, found_map) {
> >
> >     >                  struct dpcls_rule *rule;
> >
> >     >     -
> >
> >     >                  CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
> >
> >     >                      if (OVS_LIKELY(dpcls_rule_matches_key(rule,
> &keys[i]))) {
> >
> >     >                          rules[i] = rule;
> >
> >     >                          /* Even at 20 Mpps the 32-bit hit_cnt
> cannot wrap
> >
> >     >                           * within one second optimization
> interval. */
> >
> >     >                          subtable->hit_cnt++;
> >
> >     >     +                    subtable->access_cnt++;
> >
> >     >                          lookups_match += subtable_pos;
> >
> >     >                          goto next;
> >
> >     >                      }
> >
> >     >     @@ -5663,10 +6252,25 @@ dpcls_lookup(struct dpcls *cls, const
> struct
> >
> >     > netdev_flow_key keys[],
> >
> >     >                  /* None of the found rules was a match.  Reset
> the i-th bit to
> >
> >     >                   * keep searching this key in the next subtable.
> */
> >
> >     >                  ULLONG_SET0(found_map, i);  /* Did not match. */
> >
> >     >     +            continue;
> >
> >     >              next:
> >
> >     >     -            ;                     /* Keep Sparse happy. */
> >
> >     >     +            debug_print("find in subtable\n");
> >
> >     >     +            /* If we find things here, it means it misses in
> CD.
> >
> >     >     +             * we should insert into CD.
> >
> >     >     +             */
> >
> >     >     +
> >
> >     >     +           if (cls->cd_on) {
> >
> >     >     +                int index = find_index_in_subtable_table(cls,
> subtable);
> >
> >     >     +                /* If 0, means not in subtable_table, then no
> need to insert
> >
> >     >     +                 * into CD.
> >
> >     >     +                 */
> >
> >     >     +                if (index != 0) {
> >
> >     >     +                    cd_insert(cls->cdtable, &keys[i], index);
> >
> >     >     +                }
> >
> >     >     +            }
> >
> >     >              }
> >
> >     >              keys_map &= ~found_map;             /* Clear the
> found rules. */
> >
> >     >     +
> >
> >     >              if (!keys_map) {
> >
> >     >                  if (num_lookups_p) {
> >
> >     >                      *num_lookups_p = lookups_match;
> >
> >     >     @@ -5678,5 +6282,9 @@ dpcls_lookup(struct dpcls *cls, const
> struct
> >
> >     > netdev_flow_key keys[],
> >
> >     >          if (num_lookups_p) {
> >
> >     >              *num_lookups_p = lookups_match;
> >
> >     >          }
> >
> >     >     +    debug_print("Miss in both CD and subtable\n");
> >
> >     >     +    /* Things that miss in both tables should also be
> inserted into CD.
> >
> >     >     +     * the upcall function should be able to handle it.
> >
> >     >     +     */
> >
> >     >          return false;                     /* Some misses. */
> >
> >     >      }
> >
> >     >     diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> >
> >     >     index 0c2ea38..937dd3d 100644
> >
> >     >     --- a/tests/ofproto-dpif.at
> >
> >     >     +++ b/tests/ofproto-dpif.at
> >
> >     >     @@ -9383,7 +9383,8 @@ AT_CHECK([ovs-ofctl add-flows br0
> flows.txt])
> >
> >     >
> >
> >     >      dnl Start a new connection from port 1.
> >
> >     >      AT_CHECK([ovs-appctl netdev-dummy/receive p1
> >
> >     >
> >
> 'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x08
> 0
> > 0),
> >
> >     >
> >
> ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0,ttl=64,frag=no),udp(src=1,ds
> t=
> > 2)'
> >
> >     > ])
> >
> >     >     -
> >
> >     >     +# cuckoo distributor requires time for initilization, add
> sleep
> >
> >     >     +sleep 2
> >
> >     >      AT_CHECK([cat ovs-vswitchd.log | strip_ufid |
> filter_flow_install], [0],
> > [dnl
> >
> >     >
> recirc_id(0),in_port(1),eth_type(0x0800),ipv4(proto=17,frag=no),
> >
> >     > actions:ct(commit)
> >
> >     >      ])
> >
> >     >     --
> >
> >     >     1.9.1
> >
> >     >
> >
> >     >     _______________________________________________
> >
> >     >     dev mailing list
> >
> >     >     dev at openvswitch.org
> >
> >     >     https://urldefense.proofpoint.com/v2/url?u=https-
> > 3A__mail.openvswitch.org_mailman_listinfo_ovs-
> > 2Ddev&d=DwIGaQ&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-
> > uZnsw&m=bcx31MrzBpVUDtYrDTrW0XODRYpGQbumIRMXga6ieJM&s=8t2l1l
> > onhEcWf0Af-fbMcqqLCeX6qnityAHBVmtxpmY&e=
> >
> >     >
> >
> >
> >
> >



More information about the dev mailing list