[ovs-dev] [RFC Patch] dpif-netdev: dpcls per in_port with sorted subtables

Jarno Rajahalme jarno at ovn.org
Mon Jul 4 12:05:55 UTC 2016


I like this RFC patch and suggest you send a non-RFC patch after taking care of the following:

- There are numerous coding style violations. I've done my best to note them below, but you should also read 'CodingStyle.md'.

- The subject line should have 'Patch' in all upper-case and have the version number follow it (e.g., "[RFC PATCH v2]...").

- You should analyze the optimization code path to make sure there are no blocking locks being taken, as we have been carefully removing any such blocking to eliminate the associated latency spikes. It would be best to make a statement about this in the commit message as well.

I'd like someone else to review the non-RFC patch to have another set of eyes on this code.

  Jarno

> On Jul 1, 2016, at 4:44 AM, Jan Scheurich <jan.scheurich at web.de> wrote:
> 
> This is a redesigned version of the earlier RFC patch “dpif-netdev: Sorted
> subtable vectors per in_port in dpcls”. 
> 
> The user-space datapath (dpif-netdev) consists of a first level "exact match
> cache" (EMC) matching on 5-tuples and the normal megaflow classifier. With
> many parallel packet flows (e.g. TCP connections) the EMC becomes inefficient
> and the OVS forwarding performance is determined by the megaflow classifier.
> 
> The megaflow classifier (dpcls) consists of a variable number of hash tables
> (aka subtables), each containing megaflow entries with the same mask of
> packet header and metadata fields to match upon. A dpcls lookup matches a
> given packet against all subtables in sequence until it hits a match. As
> megaflow cache entries are by construction non-overlapping, the first match
> is the only match.
> 
> Today the order of the subtables in the dpcls is essentially random, so on
> average a dpcls lookup has to visit N/2 subtables for a hit, where N is the
> total number of subtables. Even though every single hash-table lookup is
> fast, the performance of the current dpcls degrades when there are many
> subtables.
> 
> How the patch addresses this issue:
> 
> In reality there is often a strong correlation between the ingress port and a
> small subset of subtables that have hits. The entire megaflow cache typically
> decomposes nicely into partitions that are hit only by packets entering from
> a range of similar ports (e.g. traffic from Phy  -> VM vs. traffic from VM ->
> Phy). 
> 
> Therefore, maintaining a separate dpcls instance per ingress port with its
> subtable vector sorted by frequency of hits reduces the average number of
> subtable lookups in the dpcls to a minimum, even if the total number of
> subtables gets large. This is possible because megaflows always have an
> exact match on in_port, so every megaflow belongs to a unique dpcls
> instance.
> 
> To monitor the effectiveness of the patch we have enhanced the ovs-appctl
> dpif-netdev/pmd-stats-show command with an extra line "avg. subtable lookups
> per hit" to report the average number of subtable lookups needed for a
> megaflow match. Ideally, this should be close to 1 and in almost all cases
> much smaller than N/2.
> 
> I have benchmarked a cloud L3 overlay pipeline with a VXLAN overlay mesh.
> With pure L3 tenant traffic between VMs on different nodes the resulting
> netdev dpcls contains N=4 subtables.
> 
> Disabling the EMC, I have measured a baseline performance (in+out) of ~1.32
> Mpps (64 bytes, 1000 L4 flows). The average number of subtable lookups per
> dpcls match is 2.5.
> 
> With the patch the average number of subtable lookups per dpcls match is
> reduced to 1 and the forwarding performance grows by ~30% to 1.72 Mpps.
> 
> As the actual number of subtables will often be higher in practice, we can
> assume that this is at the lower end of the speed-up one can expect from
> this optimization. Just running a parallel ping between the VXLAN tunnel
> endpoints increases the number of subtables and hence the average number of
> subtable lookups from 2.5 to 3.5, with a corresponding decrease of
> throughput to 1.14 Mpps. With the patch the parallel ping has no impact on
> the average number of subtable lookups or on performance. The performance
> gain is then ~50%.
> 
> The main change compared to the previous patch is that instead of having a
> subtable vector per in_port in a single dpcls instance, we now have a
> separate dpcls instance per ingress port, each with its own subtable
> vector. This is better aligned with the base code design and also reduces
> the number of subtable lookups in the miss case.
> 
> The PMD tests have been adjusted to the additional line in pmd-stats-show.
> 
> Signed-off-by: Jan Scheurich <jan.scheurich at ericsson.com>
> 
> 
> ---
> 
> lib/dpif-netdev.c | 184 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------
> tests/pmd.at      |   6 +++--
> 2 files changed, 165 insertions(+), 25 deletions(-)
> 
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index ff4227c..43d6e61 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -72,6 +72,7 @@
> 
> VLOG_DEFINE_THIS_MODULE(dpif_netdev);
> 
> +

Added empty line for no reason.

> #define FLOW_DUMP_MAX_BATCH 50
> /* Use per thread recirc_depth to prevent recirculation loop. */
> #define MAX_RECIRC_DEPTH 5
> @@ -149,7 +150,12 @@ struct emc_cache {
> 
> /* Simple non-wildcarding single-priority classifier. */
> 
> +#define NUM_SUBTABLE_SLOTS 32
> +#define DPCLS_OPTIMIATION_INTERVAL 1000

Spelling mistake: "OPTIMIATION" should be "OPTIMIZATION".

> +
> struct dpcls {
> +    struct cmap_node node;
> +    odp_port_t in_port;
>     struct cmap subtables_map;
>     struct pvector subtables;
> };
> @@ -164,12 +170,14 @@ struct dpcls_rule {
> 
> static void dpcls_init(struct dpcls *);
> static void dpcls_destroy(struct dpcls *);
> +static void dpcls_optimize(struct dpcls *);
> static void dpcls_insert(struct dpcls *, struct dpcls_rule *,
>                          const struct netdev_flow_key *mask);
> static void dpcls_remove(struct dpcls *, struct dpcls_rule *);
> -static bool dpcls_lookup(const struct dpcls *cls,
> +static bool dpcls_lookup(struct dpcls *cls,
>                          const struct netdev_flow_key keys[],
> -                         struct dpcls_rule **rules, size_t cnt);
> +                         struct dpcls_rule **rules, size_t cnt,
> +                         int *num_lookups_p);
> 
> /* Datapath based on the network device interface from netdev.h.
>  *
> @@ -239,6 +247,7 @@ enum dp_stat_type {
>     DP_STAT_MASKED_HIT,         /* Packets that matched in the flow table. */
>     DP_STAT_MISS,               /* Packets that did not match. */
>     DP_STAT_LOST,               /* Packets not passed up to the client. */
> +    DP_STAT_LOOKUP_HIT,        /* Number of subtable lookups for flow table hits */
>     DP_N_STATS
> };
> 
> @@ -419,15 +428,19 @@ struct dp_netdev_pmd_thread {
>      * will only be accessed by its own pmd thread. */
>     struct emc_cache flow_cache;
> 
> -    /* Classifier and Flow-Table.
> +    /* Flow-Table and classifiers
>      *
>      * Writers of 'flow_table' must take the 'flow_mutex'.  Corresponding
> -     * changes to 'cls' must be made while still holding the 'flow_mutex'.
> +     * changes to 'classifiers' must be made while still holding the 'flow_mutex'.
>      */
>     struct ovs_mutex flow_mutex;
> -    struct dpcls cls;
>     struct cmap flow_table OVS_GUARDED; /* Flow table. */
> 
> +    /* One classifier per in_port polled by the pmd */
> +    struct cmap classifiers;
> +    /* Periodically sort subtable vectors according to hit frequencies */
> +    long long int next_optimization;
> +
>     /* Statistics. */
>     struct dp_netdev_pmd_stats stats;
> 
> @@ -538,6 +551,8 @@ static void dp_netdev_pmd_unref(struct dp_netdev_pmd_thread *pmd);
> static void dp_netdev_pmd_flow_flush(struct dp_netdev_pmd_thread *pmd);
> static void pmd_load_cached_ports(struct dp_netdev_pmd_thread *pmd)
>     OVS_REQUIRES(pmd->port_mutex);
> +static inline void
> +dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd);
> 
> static inline bool emc_entry_alive(struct emc_entry *ce);
> static void emc_clear_entry(struct emc_entry *ce);
> @@ -657,8 +672,10 @@ pmd_info_show_stats(struct ds *reply,
> 
>     ds_put_format(reply,
>                   "\temc hits:%llu\n\tmegaflow hits:%llu\n"
> +                  "\tavg. subtable lookups per hit:%.2f\n"
>                   "\tmiss:%llu\n\tlost:%llu\n",
>                   stats[DP_STAT_EXACT_HIT], stats[DP_STAT_MASKED_HIT],
> +                  stats[DP_STAT_MASKED_HIT] > 0 ? (1.0*stats[DP_STAT_LOOKUP_HIT])/stats[DP_STAT_MASKED_HIT] : 0,
>                   stats[DP_STAT_MISS], stats[DP_STAT_LOST]);
> 
>     if (total_cycles == 0) {
> @@ -1501,20 +1518,59 @@ dp_netdev_flow_hash(const ovs_u128 *ufid)
>     return ufid->u32[0];
> }
> 
> +static inline struct dpcls *
> +dp_netdev_pmd_lookup_dpcls(struct dp_netdev_pmd_thread *pmd,
> +                           odp_port_t in_port)
> +{
> +    struct dpcls *cls;
> +    uint32_t hash = hash_port_no(in_port);
> +    CMAP_FOR_EACH_WITH_HASH (cls, node, hash, &pmd->classifiers) {
> +        if (cls->in_port == in_port) {
> +            /* Port classifier exists already */
> +            return cls;
> +        }
> +    }
> +    return NULL;
> +}
> +
> +static inline struct dpcls *
> +dp_netdev_pmd_find_dpcls(struct dp_netdev_pmd_thread *pmd,
> +                         odp_port_t in_port)
> +{
> +    struct dpcls *cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
> +    uint32_t hash = hash_port_no(in_port);
> +
> +    if (!cls) {
> +        /* Create new classifier for in_port */
> +        cls = xmalloc(sizeof(*cls));
> +        dpcls_init(cls);
> +        cls->in_port = in_port;
> +        cmap_insert(&pmd->classifiers, &cls->node, hash);
> +        VLOG_DBG("Creating dpcls %p for in_port %d", cls, in_port);
> +    }
> +    return cls;
> +}
> +
> static void
> dp_netdev_pmd_remove_flow(struct dp_netdev_pmd_thread *pmd,
>                           struct dp_netdev_flow *flow)
>     OVS_REQUIRES(pmd->flow_mutex)
> {
>     struct cmap_node *node = CONST_CAST(struct cmap_node *, &flow->node);
> +    struct dpcls *cls;
> +    odp_port_t in_port = flow->flow.in_port.odp_port;
> 
> -    dpcls_remove(&pmd->cls, &flow->cr);
> +    cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
> +    if (cls) {


It would be better to assert that cls was found, as cmap_remove() requires the node to be removed to be in the cmap anyway.
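
For example (untested sketch):

    cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
    ovs_assert(cls);
    dpcls_remove(cls, &flow->cr);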

> +        dpcls_remove(cls, &flow->cr);
> +    }
>     cmap_remove(&pmd->flow_table, node, dp_netdev_flow_hash(&flow->ufid));
>     flow->dead = true;
> 
>     dp_netdev_flow_unref(flow);
> }
> 
> +

Extra newline for no reason.

> static void
> dp_netdev_pmd_flow_flush(struct dp_netdev_pmd_thread *pmd)
> {
> @@ -1855,15 +1911,20 @@ emc_lookup(struct emc_cache *cache, const struct netdev_flow_key *key)
> }
> 
> static struct dp_netdev_flow *
> -dp_netdev_pmd_lookup_flow(const struct dp_netdev_pmd_thread *pmd,
> -                          const struct netdev_flow_key *key)
> +dp_netdev_pmd_lookup_flow(struct dp_netdev_pmd_thread *pmd,
> +                          const struct netdev_flow_key *key,
> +                          int *lookup_num_p)
> {
> -    struct dp_netdev_flow *netdev_flow;
> +    struct dpcls *cls;
>     struct dpcls_rule *rule;
> +    odp_port_t in_port = MINIFLOW_GET_U32(&key->mf, in_port);
> +    struct dp_netdev_flow *netdev_flow = NULL;
> 
> -    dpcls_lookup(&pmd->cls, key, &rule, 1);
> -    netdev_flow = dp_netdev_flow_cast(rule);
> -
> +    cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
> +    if (cls) {

Could use OVS_LIKELY() here.
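
I.e.:

    if (OVS_LIKELY(cls)) {
        dpcls_lookup(cls, key, &rule, 1, lookup_num_p);
        netdev_flow = dp_netdev_flow_cast(rule);
    }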

> +        dpcls_lookup(cls, key, &rule, 1, lookup_num_p);
> +        netdev_flow = dp_netdev_flow_cast(rule);
> +    }
>     return netdev_flow;
> }
> 
> @@ -2096,6 +2157,8 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
> {
>     struct dp_netdev_flow *flow;
>     struct netdev_flow_key mask;
> +    struct dpcls *cls;
> +    odp_port_t in_port = match->flow.in_port.odp_port;
> 
>     netdev_flow_mask_init(&mask, match);
>     /* Make sure wc does not have metadata. */
> @@ -2114,7 +2177,11 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
>     ovsrcu_set(&flow->actions, dp_netdev_actions_create(actions, actions_len));
> 
>     netdev_flow_key_init_masked(&flow->cr.flow, &match->flow, &mask);
> -    dpcls_insert(&pmd->cls, &flow->cr, &mask);
> +
> +    /* Select dpcls for in_port. Relies on in_port to be exact match */
> +    ovs_assert(match->wc.masks.in_port.odp_port == ~0);

I'm not sure if this is in CodingStyle.md, but I'd rather use UINT32_MAX than ~0.
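
I.e., something like (untested):

    ovs_assert(match->wc.masks.in_port.odp_port == UINT32_MAX);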

> +    cls = dp_netdev_pmd_find_dpcls(pmd, in_port);
> +    dpcls_insert(cls, &flow->cr, &mask);
> 
>     cmap_insert(&pmd->flow_table, CONST_CAST(struct cmap_node *, &flow->node),
>                 dp_netdev_flow_hash(&flow->ufid));
> @@ -2195,7 +2262,7 @@ dpif_netdev_flow_put(struct dpif *dpif, const struct dpif_flow_put *put)
>     }
> 
>     ovs_mutex_lock(&pmd->flow_mutex);
> -    netdev_flow = dp_netdev_pmd_lookup_flow(pmd, &key);
> +    netdev_flow = dp_netdev_pmd_lookup_flow(pmd, &key, 0);

Use NULL instead of '0' for a null pointer.
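
I.e.:

    netdev_flow = dp_netdev_pmd_lookup_flow(pmd, &key, NULL);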

>     if (!netdev_flow) {
>         if (put->flags & DPIF_FP_CREATE) {
>             if (cmap_count(&pmd->flow_table) < MAX_FLOWS) {
> @@ -2872,6 +2939,7 @@ reload:
> 
>             emc_cache_slow_sweep(&pmd->flow_cache);
>             coverage_try_clear();
> +            dp_netdev_pmd_try_optimize(pmd);

Make sure the code path can't block.

>             ovsrcu_quiesce();
> 
>             atomic_read_relaxed(&pmd->change_seq, &seq);
> @@ -3031,8 +3099,9 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
>     ovs_mutex_init(&pmd->cond_mutex);
>     ovs_mutex_init(&pmd->flow_mutex);
>     ovs_mutex_init(&pmd->port_mutex);
> -    dpcls_init(&pmd->cls);
>     cmap_init(&pmd->flow_table);
> +    cmap_init(&pmd->classifiers);
> +    pmd->next_optimization = time_msec() + DPCLS_OPTIMIATION_INTERVAL;
>     ovs_list_init(&pmd->poll_list);
>     hmap_init(&pmd->tx_ports);
>     hmap_init(&pmd->port_cache);
> @@ -3048,10 +3117,16 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
> static void
> dp_netdev_destroy_pmd(struct dp_netdev_pmd_thread *pmd)
> {
> +    struct dpcls *cls;
> +
>     dp_netdev_pmd_flow_flush(pmd);
> -    dpcls_destroy(&pmd->cls);
>     hmap_destroy(&pmd->port_cache);
>     hmap_destroy(&pmd->tx_ports);
> +    /* All flows (including their dpcls_rules) have been deleted already */
> +    CMAP_FOR_EACH(cls, node, &pmd->classifiers) {
> +        dpcls_destroy(cls);
> +    }
> +    cmap_destroy(&pmd->classifiers);
>     cmap_destroy(&pmd->flow_table);
>     ovs_mutex_destroy(&pmd->flow_mutex);
>     latch_destroy(&pmd->exit_latch);
> @@ -3221,6 +3296,7 @@ dp_netdev_del_port_from_pmd__(struct dp_netdev_port *port,
>         free(tx);
>         found = true;
>     }
> +

Don't add newline here, especially if you do not add one after the corresponding ovs_mutex_lock().

>     ovs_mutex_unlock(&pmd->port_mutex);
> 
>     return found;
> @@ -3796,7 +3872,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, struct dp_packet *packet,
>          * to be locking everyone out of making flow installs.  If we
>          * move to a per-core classifier, it would be reasonable. */
>         ovs_mutex_lock(&pmd->flow_mutex);
> -        netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key);
> +        netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key, 0);

NULL.

>         if (OVS_LIKELY(!netdev_flow)) {
>             netdev_flow = dp_netdev_flow_add(pmd, &match, &ufid,
>                                              add_actions->data,
> @@ -3822,18 +3898,32 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>     enum { PKT_ARRAY_SIZE = NETDEV_MAX_BURST };
> #endif
>     struct dp_packet **packets = packets_->packets;
> +    struct dpcls *cls;
>     struct dpcls_rule *rules[PKT_ARRAY_SIZE];
>     struct dp_netdev *dp = pmd->dp;
>     struct emc_cache *flow_cache = &pmd->flow_cache;
>     int miss_cnt = 0, lost_cnt = 0;
> +    int lookup_cnt = 0, add_lookup_cnt;
>     bool any_miss;
>     size_t i;
> +    odp_port_t in_port;
> 
>     for (i = 0; i < cnt; i++) {
>         /* Key length is needed in all the cases, hash computed on demand. */
>         keys[i].len = netdev_flow_key_size(miniflow_n_values(&keys[i].mf));
>     }
> -    any_miss = !dpcls_lookup(&pmd->cls, keys, rules, cnt);
> +    /* Get ingress port from metadata of first packet. */
> +    /* Relies on all packets in batch having same ingress port! */

Maybe make this explicit by making 'in_port' a new argument for this function?
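
Something like this sketch (I have not checked the full parameter list, so
the existing parameters are elided here):

    static inline void
    fast_path_processing(struct dp_netdev_pmd_thread *pmd,
                         /* ... existing parameters ... */
                         odp_port_t in_port);

Then the assert and the packets[0] metadata access below would no longer be
needed.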

> +    ovs_assert (cnt > 0);
> +    in_port = packets[0]->md.in_port.odp_port;
> +    /* Get the classifier for the in_port */
> +    cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port);
> +    if (cls) {

Could use OVS_LIKELY() here.

> +        any_miss = !dpcls_lookup(cls, keys, rules, cnt, &lookup_cnt);
> +    } else {
> +        any_miss = true;
> +        memset(rules, 0, sizeof(rules));
> +    }
>     if (OVS_UNLIKELY(any_miss) && !fat_rwlock_tryrdlock(&dp->upcall_rwlock)) {
>         uint64_t actions_stub[512 / 8], slow_stub[512 / 8];
>         struct ofpbuf actions, put_actions;
> @@ -3851,8 +3941,9 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>             /* It's possible that an earlier slow path execution installed
>              * a rule covering this flow.  In this case, it's a lot cheaper
>              * to catch it here than execute a miss. */
> -            netdev_flow = dp_netdev_pmd_lookup_flow(pmd, &keys[i]);
> +            netdev_flow = dp_netdev_pmd_lookup_flow(pmd, &keys[i], &add_lookup_cnt);
>             if (netdev_flow) {
> +                lookup_cnt += add_lookup_cnt;
>                 rules[i] = &netdev_flow->cr;
>                 continue;
>             }
> @@ -3891,6 +3982,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>     }
> 
>     dp_netdev_count_packet(pmd, DP_STAT_MASKED_HIT, cnt - miss_cnt);
> +    dp_netdev_count_packet(pmd, DP_STAT_LOOKUP_HIT, lookup_cnt);
>     dp_netdev_count_packet(pmd, DP_STAT_MISS, miss_cnt);
>     dp_netdev_count_packet(pmd, DP_STAT_LOST, lost_cnt);
> }
> @@ -3947,7 +4039,7 @@ static void
> dp_netdev_recirculate(struct dp_netdev_pmd_thread *pmd,
>                       struct dp_packet_batch *packets)
> {
> -     dp_netdev_input__(pmd, packets, true, 0);
> +    dp_netdev_input__(pmd, packets, true, 0);

Might fix the whitespace in the preceding dp_netdev_input() as well; even better, put unrelated style fixes in a separate patch.

> }
> 
> struct dp_netdev_execute_aux {
> @@ -4372,6 +4464,7 @@ struct dpcls_subtable {
> 
>     /* These fields are accessed by readers. */
>     struct cmap rules;           /* Contains "struct dpcls_rule"s. */
> +    uint32_t hit_cnt;            /* Number of match hits in subtable. */

Add "for the current optimization interval"

>     struct netdev_flow_key mask; /* Wildcards for fields (const). */
>     /* 'mask' must be the last field, additional space is allocated here. */
> };
> @@ -4388,6 +4481,7 @@ dpcls_init(struct dpcls *cls)
> static void
> dpcls_destroy_subtable(struct dpcls *cls, struct dpcls_subtable *subtable)
> {
> +    VLOG_DBG("Destroying subtable %p for in_port %d", subtable, cls->in_port);

Suggest removing debugging from the final patch.

>     pvector_remove(&cls->subtables, subtable);
>     cmap_remove(&cls->subtables_map, &subtable->cmap_node,
>                 subtable->mask.hash);
> @@ -4422,10 +4516,13 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
>     subtable = xmalloc(sizeof *subtable
>                        - sizeof subtable->mask.mf + mask->len);
>     cmap_init(&subtable->rules);
> +    subtable->hit_cnt = 0;
>     netdev_flow_key_clone(&subtable->mask, mask);
>     cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
> +    /* Insert new subtable with priority zero (no hits yet) */
>     pvector_insert(&cls->subtables, subtable, 0);
>     pvector_publish(&cls->subtables);
> +    VLOG_DBG("Creating %lu. subtable %p for in_port %d", cmap_count(&cls->subtables_map), subtable, cls->in_port);
> 
>     return subtable;
> }
> @@ -4444,6 +4541,37 @@ dpcls_find_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
>     return dpcls_create_subtable(cls, mask);
> }
> 
> +
> +/* Periodically sort the dpcls subtable vectors according to hit counts */
> +static inline void
> +dpcls_optimize(struct dpcls *cls)
> +{
> +    struct dpcls_subtable *subtable;
> +
> +    struct pvector *pvec = &cls->subtables;
> +    PVECTOR_FOR_EACH (subtable, pvec) {
> +        pvector_change_priority(pvec, subtable, subtable->hit_cnt);
> +        subtable->hit_cnt = 0;
> +    }
> +    pvector_publish(pvec);
> +}
> +
> +static inline void
> +dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd)
> +{
> +    struct dpcls *cls;
> +    long long int now = time_msec();
> +
> +    if (now > pmd->next_optimization) {
> +        /* Optimize each classifier */
> +        CMAP_FOR_EACH(cls, node, &pmd->classifiers) {
> +            dpcls_optimize(cls);
> +        }
> +        /* Start new measuring interval */
> +        pmd->next_optimization = now + DPCLS_OPTIMIATION_INTERVAL;
> +    }
> +}
> +
> /* Insert 'rule' into 'cls'. */
> static void
> dpcls_insert(struct dpcls *cls, struct dpcls_rule *rule,
> @@ -4451,6 +4579,7 @@ dpcls_insert(struct dpcls *cls, struct dpcls_rule *rule,
> {
>     struct dpcls_subtable *subtable = dpcls_find_subtable(cls, mask);
> 
> +    /* Refer to subtable's mask, also for later removal */

Add the trailing '.'

>     rule->mask = &subtable->mask;
>     cmap_insert(&subtable->rules, &rule->cmap_node, rule->flow.hash);
> }
> @@ -4463,10 +4592,11 @@ dpcls_remove(struct dpcls *cls, struct dpcls_rule *rule)
> 
>     ovs_assert(rule->mask);
> 
> +    /* Get subtable from reference in rule->mask. */
>     INIT_CONTAINER(subtable, rule->mask, mask);
> -
>     if (cmap_remove(&subtable->rules, &rule->cmap_node, rule->flow.hash)
>         == 0) {
> +        /* Delete empty subtable. */
>         dpcls_destroy_subtable(cls, subtable);
>         pvector_publish(&cls->subtables);
>     }
> @@ -4501,8 +4631,9 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule,
>  *
>  * Returns true if all flows found a corresponding rule. */
> static bool
> -dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
> -             struct dpcls_rule **rules, const size_t cnt)
> +dpcls_lookup(struct dpcls *cls, const struct netdev_flow_key keys[],
> +             struct dpcls_rule **rules, const size_t cnt,
> +             int *num_lookups_p)
> {
>     /* The batch size 16 was experimentally found faster than 8 or 32. */
>     typedef uint16_t map_type;
> @@ -4522,6 +4653,8 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
>     }
>     memset(rules, 0, cnt * sizeof *rules);
> 
> +    int lookups_match = 0, subtable_pos = 1;
> +
>     PVECTOR_FOR_EACH (subtable, &cls->subtables) {
>         const struct netdev_flow_key *mkeys = keys;
>         struct dpcls_rule **mrules = rules;
> @@ -4554,6 +4687,8 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
>                 CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) {
>                     if (OVS_LIKELY(dpcls_rule_matches_key(rule, &mkeys[i]))) {
>                         mrules[i] = rule;
> +                        subtable->hit_cnt++;
> +                        lookups_match += subtable_pos;
>                         goto next;
>                     }
>                 }
> @@ -4565,8 +4700,11 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
>             remains |= maps[m];
>         }
>         if (!remains) {
> +            if (num_lookups_p) *num_lookups_p = lookups_match;

if statements must have the code block in curly braces and not on the same line.
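
I.e.:

    if (num_lookups_p) {
        *num_lookups_p = lookups_match;
    }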

>             return true;              /* All found. */
>         }
> +        subtable_pos++;
>     }
> +    if (num_lookups_p) *num_lookups_p = lookups_match;

Ditto.

>     return false;                     /* Some misses. */
> }
> diff --git a/tests/pmd.at b/tests/pmd.at
> index 3216762..c033047 100644
> --- a/tests/pmd.at
> +++ b/tests/pmd.at
> @@ -162,10 +162,11 @@ dummy at ovs-dummy: hit:0 missed:0
> 		p0 7/1: (dummy-pmd: configured_rx_queues=4, configured_tx_queues=<cleared>, requested_rx_queues=4, requested_tx_queues=<cleared>)
> ])
> 
> -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 4], [0], [dnl
> +AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 5], [0], [dnl
> pmd thread numa_id <cleared> core_id <cleared>:
> 	emc hits:0
> 	megaflow hits:0
> +	avg. subtable lookups per hit:0.00
> 	miss:0
> 	lost:0
> ])
> @@ -188,10 +189,11 @@ AT_CHECK([cat ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl
> recirc_id(0),in_port(1),eth(src=50:54:00:00:00:77,dst=50:54:00:00:01:78),eth_type(0x0800),ipv4(frag=no), actions: <del>
> ])
> 
> -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 4], [0], [dnl
> +AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 5], [0], [dnl
> pmd thread numa_id <cleared> core_id <cleared>:
> 	emc hits:19
> 	megaflow hits:0
> +	avg. subtable lookups per hit:0.00
> 	miss:1
> 	lost:0
> ])
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev



