[ovs-dev] [PATCH v2] Upcall/Slowpath rate limiter for OVS

Ben Pfaff blp at ovn.org
Mon Jan 22 20:31:27 UTC 2018


I guess I'm probably the wrong one to ultimately review this patch,
since it's really a DPDK patch.

It's difficult to say for sure that this is really an improvement, since
it also causes traffic to be dropped (that is, the traffic that needs to
be slow-pathed).

I think that this could be handled in a datapath independent way using
meters.

On Mon, Jan 15, 2018 at 08:33:31AM +0000, Manohar Krishnappa Chidambaraswamy wrote:
> Rebased to master.
> 
> Could you please review this patch?
> 
> v1 of this patch has been sitting idle for quite sometime. I believe this
> feature is important. Without this feature, an attacker sending packets 
> belonging to different (unestablished) flows can result in PMDs running only
> upcalls, reducing the throughput drastically. Huge traffic loss due to this
> is more like a DOS attack.
> 
> Please let me know your thoughts/comments. This is my first patch series in
> OVS. Sorry if naming of this patch as v2 is not appropriate.
> 
> Thanx
> Manu
> 
> v1 of this patch: https://patchwork.ozlabs.org/patch/836737/
> 
> Signed-off-by: Manohar K C <manohar.krishnappa.chidambaraswamy at ericsson.com>
> cc: Jan Scheurich <jan.scheurich at ericsson.com>
> cc: Ben Pfaff <blp at ovn.org>
> 
> ---
>  Documentation/howto/dpdk.rst      | 21 ++++++++++
>  debian/libopenvswitch-dev.install |  1 -
>  debian/libopenvswitch.install     |  1 -
>  debian/rules                      |  4 +-
>  lib/dpif-netdev.c                 | 81 +++++++++++++++++++++++++++++++++++++--
>  vswitchd/vswitch.xml              | 47 +++++++++++++++++++++++
>  6 files changed, 148 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
> index 587aaed..1db1562 100644
> --- a/Documentation/howto/dpdk.rst
> +++ b/Documentation/howto/dpdk.rst
> @@ -716,3 +716,24 @@ devices to bridge ``br0``. Once complete, follow the below steps:
>     Check traffic on multiple queues::
>  
>         $ cat /proc/interrupts | grep virtio
> +
> +Upcall rate limiting
> +--------------------
> +ovs-vsctl can be used to enable and configure upcall rate limit parameters.
> +There are 2 configurable values ``upcall-rate`` and ``upcall-burst`` which
> +take effect when global enable knob ``upcall-rl`` is set to true.
> +
> +Upcall rate should be set using ``upcall-rate`` in packets-per-sec. For
> +example::
> +
> +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-rate=2000
> +
> +Upcall burst should be set using ``upcall-burst`` in packets-per-sec. For
> +example::
> +
> +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-burst=2000
> +
> +Upcall ratelimit feature should be globally enabled using ``upcall-rl``. For
> +example::
> +
> +    $ ovs-vsctl set Open_vSwitch . other_config:upcall-rl=true
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index c7d157a..e2fd92c 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -101,6 +101,16 @@ static struct shash dp_netdevs OVS_GUARDED_BY(dp_netdev_mutex)
>  
>  static struct vlog_rate_limit upcall_rl = VLOG_RATE_LIMIT_INIT(600, 600);
>  
> +/* Upcall rate-limit parameters */
> +static bool upcall_ratelimit;
> +static unsigned int upcall_rate;
> +static unsigned int upcall_burst;
> +
> +/* TODO: Tune these default values based on upcall perf test */
> +#define UPCALL_RATELIMIT_DEFAULT false /* Disabled by default */
> +#define UPCALL_RATE_DEFAULT      1000  /* pps */
> +#define UPCALL_BURST_DEFAULT     1000  /* pps */
> +
>  #define DP_NETDEV_CS_SUPPORTED_MASK (CS_NEW | CS_ESTABLISHED | CS_RELATED \
>                                       | CS_INVALID | CS_REPLY_DIR | CS_TRACKED \
>                                       | CS_SRC_NAT | CS_DST_NAT)
> @@ -340,6 +350,7 @@ enum dp_stat_type {
>                                     hits */
>      DP_STAT_SENT_PKTS,          /* Packets that has been sent. */
>      DP_STAT_SENT_BATCHES,       /* Number of batches sent. */
> +    DP_STAT_RATELIMIT_DROP,     /* Packets dropped due to upcall policer */
>      DP_N_STATS
>  };
>  
> @@ -645,6 +656,9 @@ struct dp_netdev_pmd_thread {
>      unsigned long long stats_zero[DP_N_STATS];
>      uint64_t cycles_zero[PMD_N_CYCLES];
>  
> +    /* Policer to rate limit slow-path */
> +    struct token_bucket upcall_tb;
> +
>      /* Set to true if the pmd thread needs to be reloaded. */
>      bool need_reload;
>  };
> @@ -894,10 +908,11 @@ pmd_info_show_stats(struct ds *reply,
>      ds_put_format(reply,
>                    "\temc hits:%llu\n\tmegaflow hits:%llu\n"
>                    "\tavg. subtable lookups per hit:%.2f\n"
> -                  "\tmiss:%llu\n\tlost:%llu\n"
> +                  "\tmiss:%llu\n\tlost:%llu\n\tratelimit drops:%llu\n"
>                    "\tavg. packets per output batch: %.2f\n",
>                    stats[DP_STAT_EXACT_HIT], stats[DP_STAT_MASKED_HIT],
>                    lookups_per_hit, stats[DP_STAT_MISS], stats[DP_STAT_LOST],
> +                  stats[DP_STAT_RATELIMIT_DROP],
>                    packets_per_batch);
>  
>      if (total_cycles == 0) {
> @@ -3031,6 +3046,8 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
>          smap_get_ullong(other_config, "emc-insert-inv-prob",
>                          DEFAULT_EM_FLOW_INSERT_INV_PROB);
>      uint32_t insert_min, cur_min;
> +    unsigned int rate, burst;
> +    bool ratelimit;
>  
>      if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
>          free(dp->pmd_cmask);
> @@ -3056,6 +3073,36 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
>          }
>      }
>  
> +    /* Handle upcall policer params */
> +    ratelimit = smap_get_bool(other_config, "upcall-rl",
> +                              UPCALL_RATELIMIT_DEFAULT);
> +    rate = smap_get_int(other_config, "upcall-rate",
> +                        UPCALL_RATE_DEFAULT);
> +    burst = smap_get_int(other_config, "upcall-burst",
> +                         UPCALL_BURST_DEFAULT);
> +
> +    if ((rate != upcall_rate) || (burst != upcall_burst)) {
> +        VLOG_INFO("Upcall ratelimit params changed : Old - rate=%d burst=%d "
> +                  ": New - rate=%d burst=%d\n", upcall_rate, upcall_burst,
> +                  rate, burst);
> +
> +        upcall_rate = rate;
> +        upcall_burst = burst;
> +
> +        /*
> +         * TODO: See if there is a way to reconfig only the policer
> +         * in each PMD.
> +         */
> +        dp_netdev_request_reconfigure(dp);
> +    }
> +
> +    if (ratelimit != upcall_ratelimit) {
> +        upcall_ratelimit = ratelimit;
> +
> +        VLOG_INFO("Upcall ratelimit changed to %s\n",
> +                  (upcall_ratelimit ? "Enabled" : "Disabled"));
> +    }
> +
>      return 0;
>  }
>  
> @@ -3958,6 +4005,12 @@ dpif_netdev_run(struct dpif *dpif)
>      ovs_mutex_lock(&dp->port_mutex);
>      non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID);
>      if (non_pmd) {
> +        /* Reconfig the upcall policer if params have changed */
> +        if ((upcall_rate != non_pmd->upcall_tb.rate) ||
> +            (upcall_burst != non_pmd->upcall_tb.burst)) {
> +            token_bucket_init(&non_pmd->upcall_tb, upcall_rate, upcall_burst);
> +        }
> +
>          ovs_mutex_lock(&dp->non_pmd_mutex);
>          cycles_count_start(non_pmd);
>          HMAP_FOR_EACH (port, node, &dp->ports) {
> @@ -4154,6 +4207,9 @@ reload:
>          lc = UINT_MAX;
>      }
>  
> +    /* Initialize upcall policer token bucket with configured params */
> +    token_bucket_init(&pmd->upcall_tb, upcall_rate, upcall_burst);
> +
>      cycles_count_start(pmd);
>      for (;;) {
>          for (i = 0; i < poll_cnt; i++) {
> @@ -4638,6 +4694,10 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
>          emc_cache_init(&pmd->flow_cache);
>          pmd_alloc_static_tx_qid(pmd);
>      }
> +
> +    /* Initialize upcall policer token bucket with configured params */
> +    token_bucket_init(&pmd->upcall_tb, upcall_rate, upcall_burst);
> +
>      cmap_insert(&dp->poll_threads, CONST_CAST(struct cmap_node *, &pmd->node),
>                  hash_int(core_id, 0));
>  }
> @@ -5076,7 +5136,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
>                       struct dp_packet *packet,
>                       const struct netdev_flow_key *key,
>                       struct ofpbuf *actions, struct ofpbuf *put_actions,
> -                     int *lost_cnt)
> +                     int *lost_cnt, int *rl_drop_cnt)
>  {
>      struct ofpbuf *add_actions;
>      struct dp_packet_batch b;
> @@ -5084,6 +5144,18 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
>      ovs_u128 ufid;
>      int error;
>  
> +    /*
> +     * Grab a token from the upcall policer to enter slowpath. If token
> +     * is not available, drop and account the packet. This is to
> +     * rate-limit packets getting into slowpath.
> +     */
> +    if (upcall_ratelimit && !token_bucket_withdraw(&pmd->upcall_tb, 1)) {
> +        dp_packet_delete(packet);
> +        (*rl_drop_cnt)++;
> +
> +        return;
> +    }
> +
>      match.tun_md.valid = false;
>      miniflow_expand(&key->mf, &match.flow);
>  
> @@ -5158,7 +5230,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>      struct dpcls *cls;
>      struct dpcls_rule *rules[PKT_ARRAY_SIZE];
>      struct dp_netdev *dp = pmd->dp;
> -    int miss_cnt = 0, lost_cnt = 0;
> +    int miss_cnt = 0, lost_cnt = 0, rl_drop_cnt = 0;
>      int lookup_cnt = 0, add_lookup_cnt;
>      bool any_miss;
>      size_t i;
> @@ -5202,7 +5274,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>  
>              miss_cnt++;
>              handle_packet_upcall(pmd, packet, &keys[i], &actions,
> -                                 &put_actions, &lost_cnt);
> +                                 &put_actions, &lost_cnt, &rl_drop_cnt);
>          }
>  
>          ofpbuf_uninit(&actions);
> @@ -5235,6 +5307,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>      dp_netdev_count_packet(pmd, DP_STAT_LOOKUP_HIT, lookup_cnt);
>      dp_netdev_count_packet(pmd, DP_STAT_MISS, miss_cnt);
>      dp_netdev_count_packet(pmd, DP_STAT_LOST, lost_cnt);
> +    dp_netdev_count_packet(pmd, DP_STAT_RATELIMIT_DROP, rl_drop_cnt);
>  }
>  
>  /* Packets enter the datapath from a port (or from recirculation) here.
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 58c0ebd..bc77c35 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -412,6 +412,53 @@
>          </p>
>        </column>
>  
> +      <column name="other_config" key="upcall-rl"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          Set this value to <code>true</code> to enable upcall rate-limiting.
> +          The upcall parameters like rate and burst will be ignored, if this is
> +          not set.
> +        </p>
> +        <p>
> +          The default value is <code>false</code> and upcall rate-limiting will
> +          be disabled.
> +        </p>
> +      </column>
> +
> +      <column name="other_config" key="upcall-rate"
> +        type='{"type": "integer", "minInteger": 0, "maxInteger": 4294967295}'>
> +        <p>
> +          Specifies the rate of upcalls in packets-per-second that is to be
> +          allowed. For example, if the value is 10000, then those many upcalls
> +          (for packets) are allowed per second in each of the packet polling
> +          thread (PMD or non-PMD).
> +        </p>
> +        <p>
> +          A value of <code>0</code> means, no upcalls would be allowed i.e,
> +          upcall will be disabled. This is mainly for debugging.
> +        </p>
> +        <p>
> +          The default value is 1000.
> +        </p>
> +      </column>
> +
> +      <column name="other_config" key="upcall-burst"
> +        type='{"type": "integer", "minInteger": 0, "maxInteger": 4294967295}'>
> +        <p>
> +          Specifies the maximum burst of upcalls in packets-per-second that is
> +          to be allowed. For example, if the value is 15000, then a maximum
> +          burst of 15000 upcalls (for packets) are allowed per second in each
> +          of the packet polling thread (PMD or non-PMD).
> +        </p>
> +        <p>
> +          A value of <code>0</code> means, no upcalls would be allowed i.e,
> +          upcall will be disabled. This is mainly for debugging.
> +        </p>
> +        <p>
> +          The default value is 1000.
> +        </p>
> +      </column>
> +
>        <column name="other_config" key="vlan-limit"
>                type='{"type": "integer", "minInteger": 0}'>
>          <p>
> -- 
> 1.9.1
> 
> 
> 


More information about the dev mailing list