[ovs-dev] [PATCH v7 3/3] revalidator: Rebalance offloaded flows based on the pps rate

Sriharsha Basavapatna sriharsha.basavapatna at broadcom.com
Sun Oct 14 08:04:55 UTC 2018


Hi Kevin,

On Fri, Oct 12, 2018 at 10:08 PM Kevin Traynor <ktraynor at redhat.com> wrote:
>
> On 09/27/2018 07:48 AM, Sriharsha Basavapatna via dev wrote:
> > This is the third patch in the patch-set to support dynamic rebalancing
> > of offloaded flows.
> >
> > The dynamic rebalancing functionality is implemented in this patch. The
> > ukeys that are not scheduled for deletion are obtained and passed as input
> > to the rebalancing routine. The rebalancing is done in the context of
> > revalidation leader thread, after all other revalidator threads are
> > done with gathering rebalancing data for flows.
> >
> > For each netdev that is in OOR state, a list of flows - both offloaded
> > and non-offloaded (pending) - is obtained using the ukeys. For each netdev
> > that is in OOR state, the flows are grouped and sorted into offloaded and
> > pending flows.  The offloaded flows are sorted in descending order of
> > pps-rate, while pending flows are sorted in ascending order of pps-rate.
> >
> > The rebalancing is done in two phases. In the first phase, we try to
> > offload all pending flows and if that succeeds, the OOR state on the device
> > is cleared. If some (or none) of the pending flows could not be offloaded,
> > then we start replacing an offloaded flow that has a lower pps-rate than
> > a pending flow, until there are no more pending flows with a higher rate
> > than an offloaded flow. The flows that are replaced from the device are
> > added into kernel datapath.
> >
> > A new OVS configuration parameter "offload-rebalance", is added to ovsdb.
> > The default value of this is "false". To enable this feature, set the
> > value of this parameter to "true", which provides packets-per-second
> > rate based policy to dynamically offload and un-offload flows.
> >
> > Note: This option can be enabled only when 'hw-offload' policy is enabled.
> > It also requires 'tc-policy' to be set to 'skip_sw'; otherwise, flow
> > offload errors (specifically ENOSPC error this feature depends on) reported
> > by an offloaded device are supressed by TC-Flower kernel module.
> >
> > Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna at broadcom.com>
> > Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru at broadcom.com>
> > Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru at broadcom.com>
> > Reviewed-by: Sathya Perla <sathya.perla at broadcom.com>
> > ---
>
> A few of minor comments below, nothing worth holding up a merge over -
> could be addressed in follow up if required.

Thanks for your review comments. Please see my response inline.

>
> <snip>
>
> >
> > +static void
> > +udpif_run_flow_rebalance(struct udpif *udpif)
> > +{
> > +    long long int now = 0;
> > +
> > +    /* Don't rebalance if OFFL_REBAL_INTVL_MSEC have not elapsed */
> > +    now = time_msec();
> > +    if (now < udpif->offload_rebalance_time + OFFL_REBAL_INTVL_MSEC) {
> > +        return;
> > +    }
> > +
> > +    if (!netdev_any_oor()) {
>
> won't you continually check this when there are no oor netdevs as
> offload_rebalance_time is not updated - how about setting
> 'offload_rebalance_time = now' here too, or you want it to check
> continually?

I agree it could be done like you suggested. But for the time being
(until this feature gets tested more), I'd like to have it the way it
is for a couple of reasons - it responds more quickly to OOR condition
in its current form and offload_rebalance_time really reflects the
time when rebalance() was run last time (when some netdevs were OOR).
>
> > +        return;
> > +    }
> > +
> > +    VLOG_DBG("Offload rebalance: Found OOR netdevs");
>
> Seems like this could easily be a persistent state, do you want to log
> it each time this code runs? perhaps just when the state changes?

Again, for debugging I'd prefer to keep this message for now.

>
> > +    udpif->offload_rebalance_time = now;
> > +    udpif_flow_rebalance(udpif);
> > +}
> > +
> >  static void *
> >  udpif_revalidator(void *arg)
> >  {
> > @@ -933,6 +963,9 @@ udpif_revalidator(void *arg)
> >
> >              dpif_flow_dump_destroy(udpif->dump);
> >              seq_change(udpif->dump_seq);
> > +            if (netdev_is_offload_rebalance_policy_enabled()) {
> > +                udpif_run_flow_rebalance(udpif);
> > +            }
> >
> >              duration = MAX(time_msec() - start_time, 1);
> >              udpif->dump_duration = duration;
> > @@ -977,7 +1010,7 @@ udpif_revalidator(void *arg)
> >
> >      return NULL;
> >  }
> > -
> > +
> >  static enum upcall_type
> >  classify_upcall(enum dpif_upcall_type type, const struct nlattr *userdata,
> >                  struct user_action_cookie *cookie)
> > @@ -1579,7 +1612,7 @@ handle_upcalls(struct udpif *udpif, struct upcall *upcalls,
> >      for (i = 0; i < n_ops; i++) {
> >          opsp[n_opsp++] = &ops[i].dop;
> >      }
> > -    dpif_operate(udpif->dpif, opsp, n_opsp);
> > +    dpif_operate(udpif->dpif, opsp, n_opsp, DPIF_OFFLOAD_AUTO);
> >      for (i = 0; i < n_ops; i++) {
> >          struct udpif_key *ukey = ops[i].ukey;
> >
> > @@ -1671,13 +1704,14 @@ ukey_create__(const struct nlattr *key, size_t key_len,
> >      ukey->state = UKEY_CREATED;
> >      ukey->state_thread = ovsthread_id_self();
> >      ukey->state_where = OVS_SOURCE_LOCATOR;
> > -    ukey->created = time_msec();
> > +    ukey->created = ukey->flow_time = time_msec();
> >      memset(&ukey->stats, 0, sizeof ukey->stats);
> >      ukey->stats.used = used;
> >      ukey->xcache = NULL;
> >
> >      ukey->offloaded = false;
> >      ukey->flow_time = 0;
>
> this is leftover from 2/3 and can be removed, flow_time is set a few
> lines above.

Thanks for catching this, will remove it in a subsequent patch after
it gets merged.

Thanks,
-Harsha

>
> > +    ukey->in_netdev = NULL;
> >      ukey->flow_packets = ukey->flow_backlog_packets = 0;
> >
> >      ukey->key_recirc_id = key_recirc_id;
> > @@ -2329,7 +2363,7 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, size_t n_ops)
> >      for (i = 0; i < n_ops; i++) {
> >          opsp[i] = &ops[i].dop;
> >      }
> > -    dpif_operate(udpif->dpif, opsp, n_ops);
> > +    dpif_operate(udpif->dpif, opsp, n_ops, DPIF_OFFLOAD_AUTO);
> >
> >      for (i = 0; i < n_ops; i++) {
> >          struct ukey_op *op = &ops[i];
> > @@ -2455,6 +2489,57 @@ reval_op_init(struct ukey_op *op, enum reval_result result,
> >      }
> >  }
> >
>
> <snip>


More information about the dev mailing list