[ovs-dev] [PATCH 2/4] ofproto-dpif: Run fast internally.

Ben Pfaff blp at nicira.com
Mon Apr 1 22:13:46 UTC 2013


On Sun, Mar 31, 2013 at 06:22:57PM -0700, Ethan Jackson wrote:
> ofproto-dpif is responsible for quite a few bookkeeping tasks in
> addition to handling flow misses.  Many of these tasks (flow
> expiration, flow revalidation, etc.) can take many hundreds of
> milliseconds, during which no misses can be handled.  The ideal
> long-term solution to this problem is to isolate flow miss
> handling into its own thread.  However, for now this patch
> provides a 5% increase in TCP_CRR performance, and smooths out
> results during revalidations.
> 
> Signed-off-by: Ethan Jackson <ethan at nicira.com>

I think the arithmetic on port_rl is overflow-prone.  If time_msec()
returns the value 12345, then 12345 - LLONG_MIN is roughly 12345 +
LLONG_MAX, which overflows to a negative number (and signed overflow
is undefined behavior in any case), which is less than 200.  Therefore
I'm not certain that the port_run_fast() case ever triggers.  It would
be better to use "time_msec() >= port_rl" as the condition and then
reset port_rl to time_msec() + 200.

In run_fast_rl(), backer_rl is not a time value, so it doesn't need to
be a long long int.  An "unsigned int" is probably a better choice
since modulo is going to be cheaper on a 32-bit value than on a 64-bit
one (you could in fact just do something like "if (++backer_rl >= 10)
{ backer_rl = 0; .... }" and avoid the expensive modulo entirely).

Thanks,

Ben.

> +static void
> +run_fast_rl(void)
> +{
> +    static long long int port_rl = LLONG_MIN;
> +    static long long int backer_rl = 0;
> +    long long int now = time_msec();
> +    struct shash_node *node;
> +
> +    if (now - port_rl > 200) {
> +        struct ofproto_dpif *ofproto;
> +        struct ofport_dpif *ofport;
> +
> +        port_rl = now;
> +        HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, &all_ofproto_dpifs) {
> +
> +            HMAP_FOR_EACH (ofport, up.hmap_node, &ofproto->up.ports) {
> +                port_run_fast(ofport);
> +            }
> +        }
> +    }
> +
> +    /* XXX: We have to be careful not to do too much work in this function.  If
> +     * we call dpif_backer_run_fast() too often, or with too large a batch,
> +     * performance improves significantly, but at a cost.  It's possible for the
> +     * number of flows in the datapath to increase without bound, and for poll
> +     * loops to take 10s of seconds.  The correct solution to this problem,
> +     * long term, is to separate flow miss handling into its own thread so it
> +     * isn't affected by revalidations and expirations.  Until then, this is
> +     * the best we can do. */
> +    if (backer_rl++ % 10) {
> +        return;
> +    }
> +
> +    SHASH_FOR_EACH (node, &all_dpif_backers) {
> +        dpif_backer_run_fast(node->data, 1);
> +    }
> +}
> +
>  static void
>  type_wait(const char *type)
>  {
> @@ -4254,6 +4301,7 @@ update_stats(struct dpif_backer *backer)
>              delete_unexpected_flow(ofproto, key, key_len);
>              break;
>          }
> +        run_fast_rl();
>      }
>      dpif_flow_dump_done(&dump);
>  }
> @@ -5052,6 +5100,7 @@ push_all_stats(void)
>  
>          HMAP_FOR_EACH (facet, hmap_node, &ofproto->facets) {
>              facet_push_stats(facet);
> +            run_fast_rl();
>          }
>      }
>  }
> @@ -5222,6 +5271,7 @@ subfacet_destroy_batch(struct ofproto_dpif *ofproto,
>          subfacet_reset_dp_stats(subfacets[i], &stats[i]);
>          subfacets[i]->path = SF_NOT_INSTALLED;
>          subfacet_destroy(subfacets[i]);
> +        run_fast_rl();
>      }
>  }
>  
> -- 
> 1.7.9.5
> 
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
