[ovs-dev] [PATCH v4 0/6] OVS-DPDK rxq to pmd assignment improvements.

Darrell Ball dball at vmware.com
Wed Aug 16 20:02:52 UTC 2017



-----Original Message-----
From: Kevin Traynor <ktraynor at redhat.com>
Organization: Red Hat
Date: Wednesday, August 16, 2017 at 12:19 PM
To: Darrell Ball <dball at vmware.com>, "dev at openvswitch.org" <dev at openvswitch.org>, "ian.stokes at intel.com" <ian.stokes at intel.com>, "jan.scheurich at ericsson.com" <jan.scheurich at ericsson.com>, "bhanuprakash.bodireddy at intel.com" <bhanuprakash.bodireddy at intel.com>, "mark.b.kavanagh at intel.com" <mark.b.kavanagh at intel.com>, "gvrose8192 at gmail.com" <gvrose8192 at gmail.com>
Subject: Re: [ovs-dev] [PATCH v4 0/6] OVS-DPDK rxq to pmd assignment improvements.

    On 08/16/2017 07:28 AM, Darrell Ball wrote:
    > Hi Kevin
    > 
    > I did some basic testing and parsed all the code.
    > I have some high level comments, below.
    > I’ll add the other comments tomorrow.
    > 
    
    Thanks Darrell.
    
    > 1/ The redistribution is based on the last interval – I wonder if a number of intervals may be better.
    >      Possibly a single interval could result in a skewed decision ?
    > 
    
    Sure, it could happen if there was some spike. Ian had made a similar
    comment. The only danger is that if the interval is too long then the
    older information is less relevant and also a rebalance would be occasional.
    
    In the v4 I decoupled the interval used here from anything else, so it
    can easily be increased if there is a better suggestion for length.

[Darrell]:  You could keep it simple for now and use a slightly longer interval, maybe a minute;
                 a future patch could then use a weighted average or something else. This is the only
                 major concern I have with this series. Point 2 can come as a follow-up series.
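
To make the two options above concrete, here is a minimal sketch of keeping several measurement intervals per rxq and either summing them or taking a recency-weighted average. It is not taken from the patch series: the class name RxqCycleHistory, the choice of six stored intervals, and the decay factor are all assumptions made for illustration.

```python
from collections import deque


class RxqCycleHistory:
    """Hypothetical helper: remembers the last n_intervals cycle counts of one rxq."""

    def __init__(self, n_intervals=6):
        # n_intervals=6 is an arbitrary illustrative value, not from the patches.
        self.samples = deque(maxlen=n_intervals)

    def record(self, cycles):
        # Called once per measurement interval; the oldest sample drops out
        # automatically once the deque is full.
        self.samples.append(cycles)

    def total(self):
        # "Longer interval" option: sum over all stored intervals.
        return sum(self.samples)

    def weighted_average(self, decay=0.5):
        # "Weighted average" option: the newest sample has weight 1 and each
        # older sample has its weight multiplied by `decay` (an assumed value).
        weight, num, den = 1.0, 0.0, 0.0
        for cycles in reversed(self.samples):  # newest first
            num += weight * cycles
            den += weight
            weight *= decay
        return num / den if den else 0.0
```

With either method a one-interval spike is damped by the quieter intervals around it, at the cost of reacting more slowly to a real change in traffic.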
             

    
    Ian also suggested making it user configurable, but I'd only want to add
    that if it's clear that different interval lengths are needed for
    different use cases. Otherwise it's another low level config the user
    has to grapple with and that has to be maintained.

[Darrell] If we ultimately have a show/rebalance maintenance command in a future series, a
configurable interval would be moot I think.

    
    > 2/ I see a rebalance command added that can be used other than at reconfiguration time.
    
    Yep, I'd really like to hear opinions about whether the command to
    rebalance is useful for people in production. I prefer the case where
    rebalance is done as part of a reconfig that is happening anyway because it
    doesn't need user intervention. But you don't want someone having to add
    a VM in order to do a rebalance :-)
    
    >      Rather than for testing (per the patch use case), maybe this could be used as maintenance command when combined
    >      with https://patchwork.ozlabs.org/patch/768757/ ?
    
    Sugesh's patch re-implemented the data collection from this RFC, but
    used the data for displaying stats.

[Darrell] got it.

    It's not compatible with this or
    Ciara's merged patch for collecting pmd stats.

[Darrell] agreed

    I agree that increased
    stats reporting would be useful to complement a rebalance command in
    production.

[Darrell] Anyhow, the show/rebalance can come as a follow-up series that
                 you could do alone or together with Sugesh.

    
    Kevin.
    
    >    If this is the case, patch 6 could have more description of how best to do that.
    >    I will look at Sugesh’s patch tomorrow as well so we can expedite.
    > 
    > Thanks Darrell
    > 
    >      
    > 
    > 
    > 
    > -----Original Message-----
    > From: <ovs-dev-bounces at openvswitch.org> on behalf of Kevin Traynor <ktraynor at redhat.com>
    > Date: Wednesday, August 9, 2017 at 8:45 AM
    > To: "dev at openvswitch.org" <dev at openvswitch.org>, "ian.stokes at intel.com" <ian.stokes at intel.com>, "jan.scheurich at ericsson.com" <jan.scheurich at ericsson.com>, "bhanuprakash.bodireddy at intel.com" <bhanuprakash.bodireddy at intel.com>, "mark.b.kavanagh at intel.com" <mark.b.kavanagh at intel.com>, "gvrose8192 at gmail.com" <gvrose8192 at gmail.com>
    > Subject: [ovs-dev] [PATCH v4 0/6] OVS-DPDK rxq to pmd assignment improvements.
    > 
    >     For the DPDK datapath, by default rxqs are assigned to available pmds
    >     in round robin order with no weight or priority.
    >     
    >     It can happen that some very busy queues are handled by one pmd which
    >     does not have enough cycles to prevent packets being dropped on them.
    >     While at the same time another pmd which handles queues with no traffic
    >     on them is essentially idle.
    >     
    >     Rxq to pmd assignment happens as a result of a number of events and
    >     when it does, the same unweighted round robin approach is applied
    >     each time.
    >     
    >     This patchset proposes to improve the round robin nature of rxq to pmd
    >     assignment by counting the processing cycles used by the rxqs during
    >     their operation and incorporating that data into assignment.
    >     
    >     Before assigning in a round robin manner, the rxqs will be sorted in
    >     order of the processing cycles they have been consuming. Assuming
    >     multiple pmds, this ensures that the rxqs measured to be using the
    >     most processing cycles will be assigned to different cores.
    >     
    >     In some cases the measured cycles for an rxq may not be available as
    >     the rxq is new or may not be useful for assignment as traffic patterns
    >     may change.  In those cases the code will essentially fall back to being
    >     round robin similar to what currently exists. However, in the case
    >     where data is available and is a reliable indication of future rxq cycles
    >     consumption, rxq to pmd distribution will be much improved.
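
The sort-then-round-robin idea described in the cover letter can be sketched roughly as follows. This is only an illustration of the description above, not the code in lib/dpif-netdev.c: the function name assign_rxqs, the dictionary inputs, and the queue and pmd names are hypothetical.

```python
def assign_rxqs(rxq_cycles, pmds):
    """Assign rxqs to pmds round robin, busiest rxq first.

    rxq_cycles: dict mapping a (hypothetical) rxq name to its measured cycles.
    pmds: list of (hypothetical) pmd names.
    Returns a dict mapping each pmd to the list of rxqs assigned to it.
    """
    if not pmds:
        raise ValueError("at least one pmd is required")
    assignment = {pmd: [] for pmd in pmds}
    # sorted() is stable, so rxqs with equal cycles (for example all zero
    # because nothing has been measured yet) keep their original order and
    # the result is the same as plain unweighted round robin.
    ordered = sorted(rxq_cycles, key=lambda q: rxq_cycles[q], reverse=True)
    for i, rxq in enumerate(ordered):
        assignment[pmds[i % len(pmds)]].append(rxq)
    return assignment


# Two busy queues (q0, q2) that plain round robin would place on the same pmd.
example = assign_rxqs({"q0": 90, "q1": 5, "q2": 80, "q3": 0}, ["pmd0", "pmd1"])
```

In this example unweighted round robin would give pmd0 both q0 and q2, while sorting first puts the two busiest queues on different pmds.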
    >     
    >     V3 -> V4
    >     ========
    >     4/6
    >     Rebased to accommodate new cross numa assignment.
    >     
    >     V2 -> V3
    >     ========
    >     Dropped v2 1/7 as not reusing dpcls optimisation interval anymore
    >     
    >     2/6
    >     Moved unused functions to 3/6 to avoid compiler warning
    >     
    >     3/6
    >     Made pmd rxq interval independent from dpcls opt interval
    >     
    >     4/6
    >     Moved docs about rebalance command to when it is available in 6/6
    >     Added logging info for pmd to rxq assignment
    >     
    >     5/6
    >     Added an example to docs
    >     
    >     6/6
    >     Noted in commit msg that Jan requested this for testing purposes
    >     
    >     V1 -> V2
    >     ========
    >     Dropped Ciara's patch to change how pmd cycles are counted as it merged.
    >     
    >     6/7: Rebased unit tests.
    >     
    >     
    >     Kevin Traynor (6):
    >       dpif-netdev: Change polled_queue to use dp_netdev_rxq.
    >       dpif-netdev: Add rxq processing cycle counters.
    >       dpif-netdev: Count the rxq processing cycles for an rxq.
    >       dpif-netdev: Change rxq_scheduling to use rxq processing cycles.
    >       dpif-netdev: Change pmd selection order.
    >       dpif-netdev: Add ovs-appctl dpif-netdev/pmd-rxq-rebalance.
    >     
    >      Documentation/howto/dpdk.rst |  26 +++++
    >      lib/dpif-netdev.c            | 252 +++++++++++++++++++++++++++++++++++--------
    >      tests/pmd.at                 |   2 +-
    >      vswitchd/ovs-vswitchd.8.in   |   2 +
    >      4 files changed, 237 insertions(+), 45 deletions(-)
    >     
    >     -- 
    >     1.8.3.1
    >     
    >     _______________________________________________
    >     dev mailing list
    >     dev at openvswitch.org
    >     https://mail.openvswitch.org/mailman/listinfo/ovs-dev
    >     
    > 
    
    


