[ovs-dev] [PATCH v7 0/3] Support dynamic rebalancing of offloaded flows

Ben Pfaff blp at ovn.org
Thu Oct 11 23:01:40 UTC 2018


Thanks for the revision.

This seems basically OK at a glance but I'd like a second set of eyes.
Simon, are you willing to review this?  It seems roughly in your area
too.

On Thu, Sep 27, 2018 at 12:18:24PM +0530, Sriharsha Basavapatna via dev wrote:
> With the current OVS offload design, when an offload-device fails to add a
> flow rule and returns an error, OVS adds the rule to the kernel datapath.
> The flow gets processed by the kernel datapath for the entire life of that
> flow. This is fine when an error is returned by the device due to lack of
> support for certain keys or actions.
> 
> But when an error is returned due to temporary conditions such as lack of
> resources to add a flow rule, the flow continues to be processed by kernel
> even when resources become available later. That is, those flows never get
> offloaded again. This problem becomes more pronounced when a flow that has
> been initially offloaded may have a smaller packet rate than a later flow
> that could not be offloaded due to lack of resources. This leads to
> inefficient use of HW resources and wastage of host CPU cycles.
> 
> This patch-set addresses this issue by providing a way to detect temporary
> offload resource constraints (Out-Of-Resource or OOR condition) and to
> selectively and dynamically offload flows with a higher packets-per-second
> (pps) rate. This dynamic rebalancing is done periodically on netdevs that
> are in OOR state until resources become available to offload all pending
> flows.
> 
> The patch-set involves the following changes at a high level:
> 
> 1. Detection of Out-Of-Resources (OOR) condition on an offload-capable 
>    netdev.
> 2. Gathering flow offload selection criteria for all flows on an OOR netdev;
>    i.e, packets-per-second (pps) rate of flows for offloaded and
>    non-offloaded (pending) flows.
> 3. Dynamically replacing offloaded flows with a lower pps-rate, with
>    non-offloaded flows with a higher pps-rate, on an OOR netdev. A new
>    OpenvSwitch configuration option - "offload-rebalance" to enable
>    this policy.
> 
> Cost/benefits data points:
> 
> 1. Rough cost of the new rebalancing, in terms of CPU time:
> 
>    Ran a test that replaced 256 low pps-rate flows(pings) with 256 high
>    pps-rate flows(iperf), in a system with 4 cpus (Intel Xeon E5 @ 2.40GHz;
>    2 cores with hw threads enabled, rest disabled). The data showed that cpu
>    utilization increased by about ~20%. This increase occurs during the
>    specific second in which rebalancing is done. And subsequently (from the
>    next second), cpu utilization decreases significantly due to offloading
>    of higher pps-rate flows. So effectively there's a bump in cpu utilization
>    at the time of rebalancing, that is more than compensated by reduced cpu
>    utilization once the right flows get offloaded.
> 
> 2. Rough benefits to the user in terms of offload performance:
> 
>    The benefits to the user is reduced cpu utilization in the host, since
>    higher pps-rate flows get offloaded, replacing lower pps-rate flows.
>    Replacing a single offloaded flood ping flow with an iperf flow (multiple
>    connections), shows that the cpu %used that was originally 100% on a
>    single cpu (rebalancing disabled) goes down to 35% (rebalancing enabled).
>    That is, cpu utilization decreased 65% after rebalancing.
> 
> 3. Circumstances under which the benefits would show up:
> 
>    The rebalancing benefits would show up once offload resources are
>    exhausted and new flows with higher pps-rate are initiated, that would
>    otherwise be handled by kernel datapath costing host cpu cycles.
> 
>    This can be observed using 'ovs appctl dpctl/ dump-flows' command. Prior
>    to rebalancing, any high pps-rate flows that couldn't be offloaded due to
>    resource crunch would show up in the output of 'dump-flows type=ovs' and
>    after rebalancing such flows would appear in the output of
>    'dump-flows type=offloaded'.
> 
> ******
> 
> v6-->v7:
>  - Fixed comments from Simon Horman. Key changes:
>    - Removed out_netdev and related code which is not needed.
>    - Fixed a potential race between un-offloading and adding a rule to kernel.
>    - Moved config option parameter check to respective patches.
>    - Removed a couple of assertions.
>    - Added a few comments.
>    - Added a workaround to avoid a checkpatch warning for a pre-decrement op.
> 
> v5-->v6:
>   - Fixed a build error reported by 0-day robot in Patch-2
> 
> v4-->v5:
>  - Fixed v4 comments from Ben Pfaff. Key changes:
>    - Updated patch-0 comments to reflect cost/benefit of rebalancing
>    - Release references to netdevs after rebalancing
>    - Reduce ref holding window to rebalancing routine
>    - Avoid collecting all flows before rebalancing (save time/space)
>    - Skip rebalancing when no OOR netdev (new function netdev_any_oor())
>    - Avoid direct access to netdev members (add new netdev interfaces)
>    - Handle tunnel-netdev as a special case of in-netdev (no separate var)
>    - Remove floating point type for pps-rate (not needed, per-second based)
>    - Simplify ukey_to_flow_netdevs() (remove tmp dpif_flow)
>    - Update skip_offload code to more meaningful offload_type
>    - dpif_operate() should validate offload_types
>    - dpif_netlink_operate() should process all ops and set error if needed
>    - Save time of last rebalancing in udpif (remove static vars)
>    - Declare variables at point of first use where possible
>    - Merged patch-4 into patch-3
>    - Simplified offload-rebalance config parameter to bool type
>    - Removed feature enable/disable log messages
> 
> v3-->v4:
>   - Updated parse_flow_put() with the following changes:
>     - Fixed outdev memory leak with multiple output actions
>     - Moved variables closer to their first use
>     - Removed outdev check while setting oor, since indev is sufficient
> 
> v2-->v3:
>   - Removed some VLOG_DBG() in patches 2 and 3
>   - Reworded a few VLOG_DBG() in patch 3
>   - Fixed a comment line in patch 3
> 
> v1-->v2:
>   - Fixed build errors reported by 0-day robot
>   - Updated patch prefixes with relevant subsystem names
> 
> ******
> 
> Sriharsha Basavapatna (3):
>   dpif-netlink: Detect Out-Of-Resource condition on a netdev
>   revalidator: Gather packets-per-second rate of flows
>   revalidator: Rebalance offloaded flows based on the pps rate
> 
>  NEWS                          |   3 +
>  lib/dpif-netdev.c             |   3 +-
>  lib/dpif-netlink.c            |  47 +++-
>  lib/dpif-provider.h           |   9 +-
>  lib/dpif.c                    |  30 ++-
>  lib/dpif.h                    |  12 +-
>  lib/flow.c                    |  25 ++
>  lib/flow.h                    |   1 +
>  lib/netdev-provider.h         |  14 +
>  lib/netdev.c                  |  89 ++++++
>  lib/netdev.h                  |   4 +
>  ofproto/ofproto-dpif-upcall.c | 494 +++++++++++++++++++++++++++++++++-
>  vswitchd/vswitch.xml          |  21 ++
>  13 files changed, 729 insertions(+), 23 deletions(-)
> 
> -- 
> 2.18.0.rc1.1.g6f333ff
> 
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


More information about the dev mailing list