[ovs-dev] [PATCH 3/4] dpif-netdev: pmd-rxq-affinity with optional PMD isolation

anurag2k at gmail.com anurag2k at gmail.com
Tue Jun 29 11:27:55 UTC 2021


From: Anurag Agarwal <anurag.agarwal at ericsson.com>

In some scenarios it is beneficial for DPDK datapath performance to pin
rx queues to specific PMDs, for example to allow cross-NUMA polling
when both physical ports are on one NUMA node but the PMD configuration
is symmetric.

Today such rxq pinning unconditionally makes these PMDs isolated, i.e.
they are no longer available for polling unpinned rx queues. This limits
the ability of the load-based rxq distribution logic to use spare
capacity on these isolated PMDs for unpinned rx queues, and typically
leads to a sub-optimal load balance over the available PMDs.

The overall OVS-DPDK performance can be improved by not isolating PMDs
with pinned rxqs and letting OVS decide autonomously on the optimally
balanced distribution of rxqs.

This patch introduces a new option in the pmd-rxq-affinity configuration
parameter to skip the isolation of the target PMD threads:

ovs-vsctl set interface <Name> \
    other_config:pmd-rxq-affinity=rxq1:cpu1,rxq2:cpu2,...[,no-isol]

Without the no-isol option, pinning isolates the target PMDs as before.
With the no-isol option, the target PMDs remain non-isolated.

Note: A single rx queue of any one port that is pinned without the
no-isol option is enough to isolate a PMD.

Signed-off-by: Anurag Agarwal <anurag.agarwal at ericsson.com>
Signed-off-by: Jan Scheurich <jan.scheurich at ericsson.com>
Signed-off-by: Rudra Surya Bhaskara Rao <rudrasurya.r at acldigital.com>
---
 Documentation/topics/dpdk/pmd.rst | 18 ++++++++++++++----
 lib/dpif-netdev.c                 | 19 ++++++++++++++++---
 tests/pmd.at                      | 25 +++++++++++++++++++++++++
 vswitchd/vswitch.xml              | 11 ++++++++++-
 4 files changed, 65 insertions(+), 8 deletions(-)

diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst
index e481e79..d63750e 100644
--- a/Documentation/topics/dpdk/pmd.rst
+++ b/Documentation/topics/dpdk/pmd.rst
@@ -101,8 +101,18 @@ like so:
 - Queue #2 not pinned
 - Queue #3 pinned to core 8
 
-PMD threads on cores where Rx queues are *pinned* will become *isolated*. This
-means that this thread will only poll the *pinned* Rx queues.
+By default, PMD threads on cores where Rx queues are *pinned* will become
+*isolated*. This means that these threads will only poll the *pinned* Rx
+queues.
+If this isolation of PMD threads is not wanted, it can be skipped by adding
+the ``no-isol`` option to the ``<rxq-affinity-list>``, e.g.
+
+    $ ovs-vsctl set interface dpdk-p0 options:n_rxq=4 \
+        other_config:pmd-rxq-affinity="0:3,1:7,3:8,no-isol"
+
+.. note::
+
+   A single Rx queue pinned to a CPU core without the ``no-isol`` option
+   suffices to isolate the PMD thread.
 
 .. warning::
 
@@ -111,8 +121,8 @@ means that this thread will only poll the *pinned* Rx queues.
    ``<core-id>`` is not in ``pmd-cpu-mask``), the RX queue will not be polled
    by any PMD thread.
 
-If ``pmd-rxq-affinity`` is not set for Rx queues, they will be assigned to PMDs
-(cores) automatically.
+If ``pmd-rxq-affinity`` is not set for Rx queues, they will be assigned to
+non-isolated PMDs (cores) automatically.
 
 The algorithm used to automatically assign Rxqs to PMDs can be set by::
 
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 1c458b2..7d9078f 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -453,6 +453,8 @@ struct dp_netdev_rxq {
     unsigned intrvl_idx;               /* Write index for 'cycles_intrvl'. */
     struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
     bool is_vhost;                     /* Is rxq of a vhost port. */
+    bool isolate;                      /* Isolate the core to which this queue
+                                          is pinned. */
 
     /* Counters of cycles spent successfully polling and processing pkts. */
     atomic_ullong cycles[RXQ_N_CYCLES];
@@ -4447,7 +4449,8 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
 
 /* Parses affinity list and returns result in 'core_ids'. */
 static int
-parse_affinity_list(const char *affinity_list, unsigned *core_ids, int n_rxq)
+parse_affinity_list(const char *affinity_list, unsigned *core_ids, int n_rxq,
+                    bool *isolate)
 {
     unsigned i;
     char *list, *copy, *key, *value;
@@ -4466,6 +4469,11 @@ parse_affinity_list(const char *affinity_list, unsigned *core_ids, int n_rxq)
     while (ofputil_parse_key_value(&list, &key, &value)) {
         int rxq_id, core_id;
 
+        if (strcmp(key, "no-isol") == 0) {
+            *isolate = false;
+            continue;
+        }
+
         if (!str_to_int(key, 0, &rxq_id) || rxq_id < 0
             || !str_to_int(value, 0, &core_id) || core_id < 0) {
             error = EINVAL;
@@ -4488,15 +4496,19 @@ dpif_netdev_port_set_rxq_affinity(struct dp_netdev_port *port,
 {
     unsigned *core_ids, i;
     int error = 0;
+    bool isolate = true;
 
     core_ids = xmalloc(port->n_rxq * sizeof *core_ids);
-    if (parse_affinity_list(affinity_list, core_ids, port->n_rxq)) {
+    if (parse_affinity_list(affinity_list, core_ids, port->n_rxq, &isolate)) {
         error = EINVAL;
         goto exit;
     }
 
     for (i = 0; i < port->n_rxq; i++) {
         port->rxqs[i].core_id = core_ids[i];
+        if (core_ids[i] != OVS_CORE_UNSPEC) {
+            port->rxqs[i].isolate = isolate;
+        }
     }
 
 exit:
@@ -5140,7 +5152,7 @@ prepare_rxq_scheduling(struct dp_netdev *dp)
                               q->core_id, qid, netdev_get_name(port->netdev));
                 } else {
                     q->pmd = pmd;
-                    pmd->isolated = true;
+                    pmd->isolated = pmd->isolated || q->isolate;
                     pmd->ll_n_rxq++;
                     pmd->ll_cycles += cycles;
                     VLOG_INFO("Core %d on numa node %d assigned port \'%s\' "
@@ -6569,6 +6581,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
     pmd->numa_id = numa_id;
     pmd->need_reload = false;
     pmd->n_output_batches = 0;
+    pmd->isolated = false;
 
     ovs_refcount_init(&pmd->ref_cnt);
     atomic_init(&pmd->exit, false);
diff --git a/tests/pmd.at b/tests/pmd.at
index 93a0bad..57b5fb8 100644
--- a/tests/pmd.at
+++ b/tests/pmd.at
@@ -579,6 +579,31 @@ p1 2 0 2
 p1 3 0 2
 ])
 
+AT_CHECK([ovs-vsctl set Interface p1 other_config:pmd-rxq-affinity='0:1,no-isol'])
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=2])
+
+dnl Queue 0 is pinned to core 1 without isolation.  The remaining queues are
+dnl distributed over both cores.  When removing core 2, all queues are
+dnl assigned to core 1.
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 1
+p1 1 0 1
+p1 2 0 1
+p1 3 0 1
+])
+
+dnl Remove the no-isol flag by requesting core 1 for queue 0, so core 1
+dnl becomes isolated and every other queue goes to core 2.
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6])
+AT_CHECK([ovs-vsctl set Interface p1 other_config:pmd-rxq-affinity='0:1'])
+
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 1
+p1 1 0 2
+p1 2 0 2
+p1 3 0 2
+])
+
 OVS_VSWITCHD_STOP(["/dpif_netdev|WARN|There is no PMD thread on core/d"])
 AT_CLEANUP
 
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 4597a21..97bbb11 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -3230,7 +3230,8 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
         <p>
           <ul>
             <li>
-              <rxq-affinity-list> ::= NULL | <non-empty-list>
+              <rxq-affinity-list> ::= NULL | <non-empty-list> |
+                      <non-empty-list> , no-isol
             </li>
             <li>
               <non-empty-list> ::= <affinity-pair> |
@@ -3241,6 +3242,14 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
             </li>
           </ul>
         </p>
+        <p>
+          By default, CPU cores with pinned RX queues are isolated and not
+          used for polling non-pinned RX queues. This can be overridden with
+          the no-isol option to let CPU cores poll both pinned and non-pinned
+          RX queues.
+          Note: A single pinned RX queue without the no-isol option suffices
+          to isolate a CPU core.
+        </p>
       </column>
 
       <column name="options" key="xdp-mode"
-- 
2.7.4


