[ovs-dev] [PATCH v2 1/2] dpif-netdev: Add percentage of pmd/core used by each rxq.

Kevin Traynor ktraynor at redhat.com
Mon Jan 15 11:42:51 UTC 2018


On 01/15/2018 11:37 AM, Stokes, Ian wrote:
>> On 01/15/2018 08:26 AM, Jan Scheurich wrote:
>>> Hi Kevin,
>>>
>>> This patch doesn't apply on top of my "Refactor PMD stats and cycle
>>> counting" v9 (as we discussed when outlining the strategy for merging
>>> contributions 1&2 and 3a-c), nor does my series apply easily on top of
>>> yours. Both cases seem to require non-trivial rebasing.
>>>
>>
>> Hi Jan - not sure if you looked through the code changes needed, but IIUC
>> it's just a one-line change.
>>
>> @@ -6051,5 +6051,5 @@ dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd,
>>                                       0);
>>          }
>> -        curr_tsc = cycles_counter();
>> +        curr_tsc = cycles_counter_update(&pmd->perf_stats);
>>
>>
>>
>>> Would it be possible for you to rebase your patches on my v9 to make use
>>> of the refactored cycle counting? I believe that makes more sense than
>>> refactoring your cycle counting afterwards.
>>>
>>
>> I don't agree that it makes sense to send new patches that don't apply on
>> master anymore but do apply after a chain of patches which might be
>> merged. Apart from that one line of code, the code is independent of each
>> other, so a rebase for either patch should be trivial IMHO. I'm happy to
>> help with rebasing any of the patches, if it helps things.
> 
> I'd agree with the above; let's keep Kevin's patch separate for the moment. I can review it today and rebase it to DPDK_MERGE when required. I'd like to see these patches go to DPDK_MERGE in the following order:
> 
> (i) PMD Refactor v10.
> (ii) Tx_Time_Batching v11.
> (iii) Percentage PMD core (Possibly v3).
> 

Sure, I can send a v3/rebase when the others have merged. I just don't
want it to be blocked indefinitely if some issues arise with the other
patches.

thanks,
Kevin.

> Ian
>>
>> thanks,
>> Kevin.
>>
>>> BR, Jan
>>>
>>>> -----Original Message-----
>>>> From: Kevin Traynor [mailto:ktraynor at redhat.com]
>>>> Sent: Thursday, 11 January, 2018 15:32
>>>> To: dev at openvswitch.org; Jan Scheurich <jan.scheurich at ericsson.com>;
>>>> ian.stokes at intel.com; i.maximets at samsung.com;
>>>> billy.o.mahony at intel.com; aconole at redhat.com; fbl at redhat.com
>>>> Cc: Kevin Traynor <ktraynor at redhat.com>
>>>> Subject: [PATCH v2 1/2] dpif-netdev: Add percentage of pmd/core used by each rxq.
>>>>
>>>> It is based on the length of history that is stored about an rxq
>>>> (currently 1 min).
>>>>
>>>> $ ovs-appctl dpif-netdev/pmd-rxq-show
>>>> pmd thread numa_id 0 core_id 4:
>>>>         isolated : false
>>>>         port:         dpdkphy1  queue-id:  0    pmd usage: 70 %
>>>>         port:       dpdkvhost0  queue-id:  0    pmd usage:  0 %
>>>> pmd thread numa_id 0 core_id 6:
>>>>         isolated : false
>>>>         port:         dpdkphy0  queue-id:  0    pmd usage: 64 %
>>>>         port:       dpdkvhost1  queue-id:  0    pmd usage:  0 %
>>>>
>>>> These values are what would be used as part of rxq to pmd assignment
>>>> due to a reconfiguration event, e.g. adding pmds or adding rxqs, or with
>>>> the command:
>>>>
>>>> ovs-appctl dpif-netdev/pmd-rxq-rebalance
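
The reported percentage is simple arithmetic over the stored per-interval
cycle counts. Here is a minimal sketch of that calculation, not the patch
code itself: the helper name rxq_usage_pct is invented here, and
PMD_RXQ_INTERVAL_MAX is assumed to be 6, matching lib/dpif-netdev.c:

    #include <stdint.h>

    #define PMD_RXQ_INTERVAL_MAX 6   /* assumed: number of stored intervals */

    /* Hypothetical helper: percentage of a pmd core consumed by one rxq.
     * rxq_cycles[] holds the rxq's processing cycles for each stored
     * interval; pmd_intrvl_cycles is the pmd's cycle count for the last
     * full interval (pmd->intrvl_cycles in the patch below). */
    static unsigned int
    rxq_usage_pct(const uint64_t rxq_cycles[PMD_RXQ_INTERVAL_MAX],
                  uint64_t pmd_intrvl_cycles)
    {
        uint64_t proc_cycles = 0;
        /* Estimate the pmd cycles available over the whole history by
         * scaling the last full interval, as pmd_info_show_rxq() does. */
        uint64_t total_cycles = pmd_intrvl_cycles * PMD_RXQ_INTERVAL_MAX;

        for (int i = 0; i < PMD_RXQ_INTERVAL_MAX; i++) {
            proc_cycles += rxq_cycles[i];
        }
        /* No full interval measured yet -> "NOT AVAIL" in the output. */
        return total_cycles ? (unsigned int) (proc_cycles * 100 / total_cycles)
                            : 0;
    }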
>>>>
>>>> Signed-off-by: Jan Scheurich <jan.scheurich at ericsson.com>
>>>> Co-authored-by: Jan Scheurich <jan.scheurich at ericsson.com>
>>>> Signed-off-by: Kevin Traynor <ktraynor at redhat.com>
>>>> ---
>>>>
>>>> V2:
>>>> - Re-worked to calculate and store cycles needed for stats directly in pmd.
>>>>
>>>>  Documentation/howto/dpdk.rst | 11 +++++++++
>>>>  NEWS                         |  1 +
>>>>  lib/dpif-netdev.c            | 53 +++++++++++++++++++++++++++++++++-----------
>>>>  tests/pmd.at                 | 51 +++++++++++++++++++++++++++++++-----------
>>>>  4 files changed, 90 insertions(+), 26 deletions(-)
>>>>
>>>> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
>>>> index 587aaed..40f9d96 100644
>>>> --- a/Documentation/howto/dpdk.rst
>>>> +++ b/Documentation/howto/dpdk.rst
>>>> @@ -140,4 +140,15 @@ Core 7: Q4 (70%) | Q5 (10%)
>>>>  core 8: Q3 (60%) | Q0 (30%)
>>>>
>>>> +To see the current measured usage history of pmd core cycles for each rxq::
>>>> +
>>>> +    $ ovs-appctl dpif-netdev/pmd-rxq-show
>>>> +
>>>> +.. note::
>>>> +
>>>> +  A history of one minute is recorded and shown for each rxq to allow
>>>> +  for traffic pattern spikes. Changes in an rxq's pmd core cycles usage,
>>>> +  due to traffic pattern or reconfig changes, will take one minute
>>>> +  before they are fully reflected in the stats.
>>>> +
>>>>  Rxq to pmds assignment takes place whenever there are configuration changes
>>>>  or can be triggered by using::
>>>> diff --git a/NEWS b/NEWS
>>>> index aff0671..088dca3 100644
>>>> --- a/NEWS
>>>> +++ b/NEWS
>>>> @@ -41,4 +41,5 @@ Post-v2.8.0
>>>>          - ovs-ofctl dump-ports command now prints new set of custom statistics
>>>>            if available (for OpenFlow 1.4+).
>>>> +     * Add rxq utilization of pmd to appctl 'dpif-netdev/pmd-rxq-show'.
>>>>     - vswitchd:
>>>>       * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.
>>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>>> index 57f6570..d960a3a 100644
>>>> --- a/lib/dpif-netdev.c
>>>> +++ b/lib/dpif-netdev.c
>>>> @@ -602,4 +602,9 @@ struct dp_netdev_pmd_thread {
>>>>      struct dp_netdev_pmd_cycles cycles;
>>>>
>>>> +    /* Last interval timestamp. */
>>>> +    uint64_t intrvl_tsc_prev;
>>>> +    /* Last interval cycles. */
>>>> +    atomic_ullong intrvl_cycles;
>>>> +
>>>>      /* Current context of the PMD thread. */
>>>>      struct dp_netdev_pmd_thread_ctx ctx;
>>>> @@ -996,7 +1001,7 @@ pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
>>>>  {
>>>>      if (pmd->core_id != NON_PMD_CORE_ID) {
>>>> -        const char *prev_name = NULL;
>>>>          struct rxq_poll *list;
>>>> -        size_t i, n;
>>>> +        size_t n_rxq;
>>>> +        uint64_t total_cycles = 0;
>>>>
>>>>          ds_put_format(reply,
>>>> @@ -1006,20 +1011,32 @@ pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
>>>>
>>>>          ovs_mutex_lock(&pmd->port_mutex);
>>>> -        sorted_poll_list(pmd, &list, &n);
>>>> -        for (i = 0; i < n; i++) {
>>>> -            const char *name = netdev_rxq_get_name(list[i].rxq->rx);
>>>> +        sorted_poll_list(pmd, &list, &n_rxq);
>>>>
>>>> -            if (!prev_name || strcmp(name, prev_name)) {
>>>> -                if (prev_name) {
>>>> -                    ds_put_cstr(reply, "\n");
>>>> -                }
>>>> -                ds_put_format(reply, "\tport: %s\tqueue-id:", name);
>>>> +        /* Get the total pmd cycles for an interval. */
>>>> +        atomic_read_relaxed(&pmd->intrvl_cycles, &total_cycles);
>>>> +        /* Estimate the cycles to cover all intervals. */
>>>> +        total_cycles *= PMD_RXQ_INTERVAL_MAX;
>>>> +
>>>> +        for (int i = 0; i < n_rxq; i++) {
>>>> +            struct dp_netdev_rxq *rxq = list[i].rxq;
>>>> +            const char *name = netdev_rxq_get_name(rxq->rx);
>>>> +            uint64_t proc_cycles = 0;
>>>> +
>>>> +            for (int j = 0; j < PMD_RXQ_INTERVAL_MAX; j++) {
>>>> +                proc_cycles += dp_netdev_rxq_get_intrvl_cycles(rxq, j);
>>>>              }
>>>> -            ds_put_format(reply, " %d",
>>>> +            ds_put_format(reply, "\tport: %16s\tqueue-id: %2d", name,
>>>>                            netdev_rxq_get_queue_id(list[i].rxq->rx));
>>>> -            prev_name = name;
>>>> +            ds_put_format(reply, "\tpmd usage: ");
>>>> +            if (total_cycles) {
>>>> +                ds_put_format(reply, "%2"PRIu64"",
>>>> +                              proc_cycles * 100 / total_cycles);
>>>> +                ds_put_cstr(reply, " %");
>>>> +            } else {
>>>> +                ds_put_format(reply, "%s", "NOT AVAIL");
>>>> +            }
>>>> +            ds_put_cstr(reply, "\n");
>>>>          }
>>>>          ovs_mutex_unlock(&pmd->port_mutex);
>>>> -        ds_put_cstr(reply, "\n");
>>>>          free(list);
>>>>      }
>>>> @@ -4157,4 +4174,6 @@ reload:
>>>>      }
>>>>
>>>> +    pmd->intrvl_tsc_prev = 0;
>>>> +    atomic_store_relaxed(&pmd->intrvl_cycles, 0);
>>>>      cycles_count_start(pmd);
>>>>      for (;;) {
>>>> @@ -6138,4 +6157,5 @@ dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd,
>>>>
>>>>      if (pmd->ctx.now > pmd->rxq_next_cycle_store) {
>>>> +        uint64_t curr_tsc;
>>>>          /* Get the cycles that were used to process each queue and store. */
>>>>          for (unsigned i = 0; i < poll_cnt; i++) {
>>>> @@ -6146,4 +6166,11 @@ dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd,
>>>>                                       0);
>>>>          }
>>>> +        curr_tsc = cycles_counter();
>>>> +        if (pmd->intrvl_tsc_prev) {
>>>> +            /* There is a prev timestamp, store a new intrvl cycle count. */
>>>> +            atomic_store_relaxed(&pmd->intrvl_cycles,
>>>> +                                 curr_tsc - pmd->intrvl_tsc_prev);
>>>> +        }
>>>> +        pmd->intrvl_tsc_prev = curr_tsc;
>>>>          /* Start new measuring interval */
>>>>          pmd->rxq_next_cycle_store = pmd->ctx.now + PMD_RXQ_INTERVAL_LEN;
>>>> diff --git a/tests/pmd.at b/tests/pmd.at
>>>> index fcb007c..09d09cb 100644
>>>> --- a/tests/pmd.at
>>>> +++ b/tests/pmd.at
>>>> @@ -7,5 +7,13 @@ m4_divert_push([PREPARE_TESTS])
>>>>  # port_name rxq_id numa_id core_id
>>>>  parse_pmd_rxq_show () {
>>>> -    awk '/pmd/ {numa=$4; core=substr($6, 1, length($6) - 1)} /\t/{for (i=4; i<=NF; i++) print  $2, $i, numa, core}' | sort
>>>> +    awk '/pmd thread/ {numa=$4; core=substr($6, 1, length($6) - 1)} /\tport:/ {print  $2, $4, numa, core}' | sort
>>>>  }
>>>> +
>>>> +# Given the output of `ovs-appctl dpif-netdev/pmd-rxq-show`,
>>>> +# and with queues for each core on one line, prints the rxqs
>>>> +# of the core on one line
>>>> +# 'port:' port_name 'queue_id:' rxq_id rxq_id rxq_id rxq_id
>>>> +parse_pmd_rxq_show_group () {
>>>> +   awk '/port:/ {print  $1, $2, $3, $4, $12, $20, $28}'
>>>> +}
>>>>
>>>> @@ -54,5 +62,5 @@ m4_define([CHECK_PMD_THREADS_CREATED], [
>>>>
>>>>  m4_define([SED_NUMA_CORE_PATTERN], ["s/\(numa_id \)[[0-9]]*\( core_id \)[[0-9]]*:/\1<cleared>\2<cleared>:/"])
>>>> -m4_define([SED_NUMA_CORE_QUEUE_PATTERN], ["s/\(numa_id \)[[0-9]]*\( core_id \)[[0-9]]*:/\1<cleared>\2<cleared>:/;s/\(queue-id: \)1 2 5 6/\1<cleared>/;s/\(queue-id: \)0 3 4 7/\1<cleared>/"])
>>>> +m4_define([SED_NUMA_CORE_QUEUE_PATTERN], ["s/1 2 5 6/<group>/;s/0 3 4 7/<group>/"])
>>>>  m4_define([DUMMY_NUMA], [--dummy-numa="0,0,0,0"])
>>>>
>>>> @@ -66,5 +74,5 @@ AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_PATTERN], [0], [dnl
>>>>  pmd thread numa_id <cleared> core_id <cleared>:
>>>>  	isolated : false
>>>> -	port: p0	queue-id: 0
>>>> +	port:               p0	queue-id:  0	pmd usage: NOT AVAIL
>>>>  ])
>>>>
>>>> @@ -97,5 +105,12 @@ AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_PATTERN], [0], [dnl
>>>>  pmd thread numa_id <cleared> core_id <cleared>:
>>>>  	isolated : false
>>>> -	port: p0	queue-id: 0 1 2 3 4 5 6 7
>>>> +	port:               p0	queue-id:  0	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  1	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  2	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  3	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  4	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  5	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  6	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  7	pmd usage: NOT AVAIL
>>>>  ])
>>>>
>>>> @@ -121,5 +136,12 @@ AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_PATTERN], [0], [dnl
>>>>  pmd thread numa_id <cleared> core_id <cleared>:
>>>>  	isolated : false
>>>> -	port: p0	queue-id: 0 1 2 3 4 5 6 7
>>>> +	port:               p0	queue-id:  0	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  1	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  2	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  3	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  4	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  5	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  6	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  7	pmd usage: NOT AVAIL
>>>>  ])
>>>>
>>>> @@ -128,11 +150,7 @@ AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x3])
>>>>  CHECK_PMD_THREADS_CREATED([2], [], [+$TMP])
>>>>
>>>> -AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_QUEUE_PATTERN], [0], [dnl
>>>> -pmd thread numa_id <cleared> core_id <cleared>:
>>>> -	isolated : false
>>>> -	port: p0	queue-id: <cleared>
>>>> -pmd thread numa_id <cleared> core_id <cleared>:
>>>> -	isolated : false
>>>> -	port: p0	queue-id: <cleared>
>>>> +AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed ':a;/AVAIL$/{N;s/\n//;ba}' | parse_pmd_rxq_show_group | sed SED_NUMA_CORE_QUEUE_PATTERN], [0], [dnl
>>>> +port: p0 queue-id: <group>
>>>> +port: p0 queue-id: <group>
>>>>  ])
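
As a reading aid for this check: the sed program ':a;/AVAIL$/{N;s/\n//;ba}'
is a join loop that merges any line ending in "AVAIL" with the line that
follows it, so all of one pmd's rxq lines collapse onto a single line before
parse_pmd_rxq_show_group extracts the queue ids. A standalone illustration
with made-up input (GNU sed):

    $ printf 'q0 pmd usage: NOT AVAIL\nnext line\nother\n' | \
          sed ':a;/AVAIL$/{N;s/\n//;ba}'
    q0 pmd usage: NOT AVAILnext line
    other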
>>>>
>>>> @@ -144,5 +162,12 @@ AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_PATTERN], [0], [dnl
>>>>  pmd thread numa_id <cleared> core_id <cleared>:
>>>>  	isolated : false
>>>> -	port: p0	queue-id: 0 1 2 3 4 5 6 7
>>>> +	port:               p0	queue-id:  0	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  1	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  2	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  3	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  4	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  5	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  6	pmd usage: NOT AVAIL
>>>> +	port:               p0	queue-id:  7	pmd usage: NOT AVAIL
>>>>  ])
>>>>
>>>> --
>>>> 1.8.3.1
>>>
> 
