[ovs-discuss] OVS-DPDK: Packets stuck in dpdk_tx_queue
Mattias Johansson G
mattias.g.johansson at ericsson.com
Wed May 4 13:15:55 UTC 2016
Hi!
I'm running a throughput test of OVS 2.4.1 and I'm seeing abnormally high latency
variation when sending traffic via a physical port attached to NUMA node 1 while the
OVS control plane runs on NUMA node 0.
The problem is that when a PMD thread running on NUMA node 0 reads a packet from
the vhost queue and transmits it to the physical port's Tx queue on NUMA node 1,
the packet gets stuck there until the next packet arrives.
In the code, a PMD thread running on NUMA node 0 reads a packet from the vhost queue
and calls dpdk_queue_pkts() to queue it in struct dpdk_tx_queue for the physical port
located on NUMA node 1.
In this test dev->tx_q[i].flush_tx is false, which means dpdk_queue_flush__() is not
called (unless the Tx queue fills up, i.e. reaches MAX_TX_QUEUE_LEN packets, but that
is not of interest here).
For the queues on NUMA node 1 no other trigger for a flush exists, so the packet
stays in struct dpdk_tx_queue until the next packet arrives.
What I think happens in my test is that the last packets I send don't fill up the Tx
queue, so it is never flushed. The flush happens only when I send an additional packet
after the DRAIN_TSC interval has passed, which forces a call to dpdk_queue_flush__().
This behavior is not seen when sending packets to a physical port attached to NUMA
node 0, where the OVS control plane is running, because dev->tx_q[i].flush_tx is true
in that situation.
The behavior seems to be controlled by how the Tx queue is configured in relation to
the CPU id the PMD thread is executing on. Below is a code snippet from a function in
netdev-dpdk.c (it looks similar on the latest master and on the 2.4.1 branch).
Why isn't dev->tx_q[i].flush_tx always enabled (set to true)? If I change the code to
always enable it, the problem seems to disappear.
I've attached some logs describing the CPU layout, which CPUs my PMD threads execute
on, and how the queues for my two physical ports get configured.
I would very much appreciate help in understanding how this is intended to work.
Thanks in advance!
BR,
Mattias
Unmodified code snippet from:
https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c (362ca39)
static void
netdev_dpdk_alloc_txq(struct netdev_dpdk *dev, unsigned int n_txqs)
{
    unsigned i;

    dev->tx_q = dpdk_rte_mzalloc(n_txqs * sizeof *dev->tx_q);
    for (i = 0; i < n_txqs; i++) {
        int numa_id = ovs_numa_get_numa_id(i);

        if (!dev->txq_needs_locking) {
            /* Each index is considered as a cpu core id, since there should
             * be one tx queue for each cpu core. If the corresponding core
             * is not on the same numa node as 'dev', flags the
             * 'flush_tx'. */
            dev->tx_q[i].flush_tx = dev->socket_id == numa_id;
        } else {
            /* Queues are shared among CPUs. Always flush */
            dev->tx_q[i].flush_tx = true;
        }

        /* Initialize map for vhost devices. */
        dev->tx_q[i].map = OVS_VHOST_QUEUE_MAP_UNKNOWN;
        rte_spinlock_init(&dev->tx_q[i].tx_lock);
    }
}
Traffic tool output:
burst size: 2
frequency: 100000 burst/s
payload size: 30 B
frame size: 64 B
length of test: 2 s
tx: 399730
rx: 399730
bits received: 0 Gb
elapsed: 4 s
tx lost: 0
rx lost: 0
out of order: 0
corrupt: 0
noise: 0
min: 14 us
max: 1994061 us
mean latency: 95.254 us
standard deviation: 8344.242 us
Latencies (us):
0 <= x < 50 158747
50 <= x < 100 200266
100 <= x < 150 40552
150 <= x < 200 32
200 <= x < 250 30
250 <= x < 300 20
300 <= x < 350 20
350 <= x < 400 22
400 <= x < 450 16
450 <= x < 500 10
500 <= x < 550 8
50000 <= x 7
OVS threads:
Name CPU
pmd128 2
pmd129 39
pmd127 22
pmd126 19
ovs-vswitchd 0
cpu_layout.py:
============================================================
Core and Socket Information (as reported by '/proc/cpuinfo')
============================================================
cores = [0, 1, 2, 3, 4, 8, 9, 10, 11, 12] sockets = [0, 1]
Socket 0 Socket 1
-------- --------
Core 0 [0, 20] [1, 21]
Core 1 [2, 22] [3, 23]
Core 2 [4, 24] [5, 25]
Core 3 [6, 26] [7, 27]
Core 4 [8, 28] [9, 29]
Core 8 [10, 30] [11, 31]
Core 9 [12, 32] [13, 33]
Core 10 [14, 34] [15, 35]
Core 11 [16, 36] [17, 37]
Core 12 [18, 38] [19, 39]
How Tx flush is configured for dpdk0 on numa node 0:
2016-04-19T08:34:36.389Z|00024|dpdk|INFO|Enable flush for tx queue 0
2016-04-19T08:34:36.389Z|00025|dpdk|INFO|Disable flush for tx queue 1
2016-04-19T08:34:36.389Z|00026|dpdk|INFO|Enable flush for tx queue 2
2016-04-19T08:34:36.389Z|00027|dpdk|INFO|Disable flush for tx queue 3
2016-04-19T08:34:36.389Z|00028|dpdk|INFO|Enable flush for tx queue 4
2016-04-19T08:34:36.389Z|00029|dpdk|INFO|Disable flush for tx queue 5
2016-04-19T08:34:36.389Z|00030|dpdk|INFO|Enable flush for tx queue 6
2016-04-19T08:34:36.389Z|00031|dpdk|INFO|Disable flush for tx queue 7
2016-04-19T08:34:36.389Z|00032|dpdk|INFO|Enable flush for tx queue 8
2016-04-19T08:34:36.389Z|00033|dpdk|INFO|Disable flush for tx queue 9
2016-04-19T08:34:36.389Z|00034|dpdk|INFO|Enable flush for tx queue 10
2016-04-19T08:34:36.389Z|00035|dpdk|INFO|Disable flush for tx queue 11
2016-04-19T08:34:36.389Z|00036|dpdk|INFO|Enable flush for tx queue 12
2016-04-19T08:34:36.389Z|00037|dpdk|INFO|Disable flush for tx queue 13
2016-04-19T08:34:36.389Z|00038|dpdk|INFO|Enable flush for tx queue 14
2016-04-19T08:34:36.389Z|00039|dpdk|INFO|Disable flush for tx queue 15
2016-04-19T08:34:36.389Z|00040|dpdk|INFO|Enable flush for tx queue 16
2016-04-19T08:34:36.389Z|00041|dpdk|INFO|Disable flush for tx queue 17
2016-04-19T08:34:36.389Z|00042|dpdk|INFO|Enable flush for tx queue 18
2016-04-19T08:34:36.389Z|00043|dpdk|INFO|Disable flush for tx queue 19
2016-04-19T08:34:36.389Z|00044|dpdk|INFO|Enable flush for tx queue 20
2016-04-19T08:34:36.389Z|00045|dpdk|INFO|Disable flush for tx queue 21
2016-04-19T08:34:36.389Z|00046|dpdk|INFO|Enable flush for tx queue 22
2016-04-19T08:34:36.389Z|00047|dpdk|INFO|Disable flush for tx queue 23
2016-04-19T08:34:36.389Z|00048|dpdk|INFO|Enable flush for tx queue 24
2016-04-19T08:34:36.389Z|00049|dpdk|INFO|Disable flush for tx queue 25
2016-04-19T08:34:36.389Z|00050|dpdk|INFO|Enable flush for tx queue 26
2016-04-19T08:34:36.389Z|00051|dpdk|INFO|Disable flush for tx queue 27
2016-04-19T08:34:36.389Z|00052|dpdk|INFO|Enable flush for tx queue 28
2016-04-19T08:34:36.389Z|00053|dpdk|INFO|Disable flush for tx queue 29
2016-04-19T08:34:36.389Z|00054|dpdk|INFO|Enable flush for tx queue 30
2016-04-19T08:34:36.389Z|00055|dpdk|INFO|Disable flush for tx queue 31
2016-04-19T08:34:36.389Z|00056|dpdk|INFO|Enable flush for tx queue 32
2016-04-19T08:34:36.389Z|00057|dpdk|INFO|Disable flush for tx queue 33
2016-04-19T08:34:36.389Z|00058|dpdk|INFO|Enable flush for tx queue 34
2016-04-19T08:34:36.389Z|00059|dpdk|INFO|Disable flush for tx queue 35
2016-04-19T08:34:36.389Z|00060|dpdk|INFO|Enable flush for tx queue 36
2016-04-19T08:34:36.389Z|00061|dpdk|INFO|Disable flush for tx queue 37
2016-04-19T08:34:36.389Z|00062|dpdk|INFO|Enable flush for tx queue 38
2016-04-19T08:34:36.389Z|00063|dpdk|INFO|Disable flush for tx queue 39
2016-04-19T08:34:36.389Z|00064|dpdk|INFO|Disable flush for tx queue 40
How Tx flush is configured for dpdk1 on numa node 1:
2016-04-19T08:34:37.495Z|00069|dpdk|INFO|Disable flush for tx queue 0
2016-04-19T08:34:37.495Z|00070|dpdk|INFO|Enable flush for tx queue 1
2016-04-19T08:34:37.495Z|00071|dpdk|INFO|Disable flush for tx queue 2
2016-04-19T08:34:37.495Z|00072|dpdk|INFO|Enable flush for tx queue 3
2016-04-19T08:34:37.495Z|00073|dpdk|INFO|Disable flush for tx queue 4
2016-04-19T08:34:37.495Z|00074|dpdk|INFO|Enable flush for tx queue 5
2016-04-19T08:34:37.495Z|00075|dpdk|INFO|Disable flush for tx queue 6
2016-04-19T08:34:37.495Z|00076|dpdk|INFO|Enable flush for tx queue 7
2016-04-19T08:34:37.495Z|00077|dpdk|INFO|Disable flush for tx queue 8
2016-04-19T08:34:37.495Z|00078|dpdk|INFO|Enable flush for tx queue 9
2016-04-19T08:34:37.495Z|00079|dpdk|INFO|Disable flush for tx queue 10
2016-04-19T08:34:37.495Z|00080|dpdk|INFO|Enable flush for tx queue 11
2016-04-19T08:34:37.495Z|00081|dpdk|INFO|Disable flush for tx queue 12
2016-04-19T08:34:37.495Z|00082|dpdk|INFO|Enable flush for tx queue 13
2016-04-19T08:34:37.495Z|00083|dpdk|INFO|Disable flush for tx queue 14
2016-04-19T08:34:37.495Z|00084|dpdk|INFO|Enable flush for tx queue 15
2016-04-19T08:34:37.495Z|00085|dpdk|INFO|Disable flush for tx queue 16
2016-04-19T08:34:37.495Z|00086|dpdk|INFO|Enable flush for tx queue 17
2016-04-19T08:34:37.495Z|00087|dpdk|INFO|Disable flush for tx queue 18
2016-04-19T08:34:37.495Z|00088|dpdk|INFO|Enable flush for tx queue 19
2016-04-19T08:34:37.495Z|00089|dpdk|INFO|Disable flush for tx queue 20
2016-04-19T08:34:37.495Z|00090|dpdk|INFO|Enable flush for tx queue 21
2016-04-19T08:34:37.495Z|00091|dpdk|INFO|Disable flush for tx queue 22
2016-04-19T08:34:37.495Z|00092|dpdk|INFO|Enable flush for tx queue 23
2016-04-19T08:34:37.495Z|00093|dpdk|INFO|Disable flush for tx queue 24
2016-04-19T08:34:37.495Z|00094|dpdk|INFO|Enable flush for tx queue 25
2016-04-19T08:34:37.495Z|00095|dpdk|INFO|Disable flush for tx queue 26
2016-04-19T08:34:37.495Z|00096|dpdk|INFO|Enable flush for tx queue 27
2016-04-19T08:34:37.495Z|00097|dpdk|INFO|Disable flush for tx queue 28
2016-04-19T08:34:37.495Z|00098|dpdk|INFO|Enable flush for tx queue 29
2016-04-19T08:34:37.495Z|00099|dpdk|INFO|Disable flush for tx queue 30
2016-04-19T08:34:37.495Z|00100|dpdk|INFO|Enable flush for tx queue 31
2016-04-19T08:34:37.495Z|00101|dpdk|INFO|Disable flush for tx queue 32
2016-04-19T08:34:37.495Z|00102|dpdk|INFO|Enable flush for tx queue 33
2016-04-19T08:34:37.495Z|00103|dpdk|INFO|Disable flush for tx queue 34
2016-04-19T08:34:37.495Z|00104|dpdk|INFO|Enable flush for tx queue 35
2016-04-19T08:34:37.495Z|00105|dpdk|INFO|Disable flush for tx queue 36
2016-04-19T08:34:37.495Z|00106|dpdk|INFO|Enable flush for tx queue 37
2016-04-19T08:34:37.495Z|00107|dpdk|INFO|Disable flush for tx queue 38
2016-04-19T08:34:37.495Z|00108|dpdk|INFO|Enable flush for tx queue 39
2016-04-19T08:34:37.495Z|00109|dpdk|INFO|Disable flush for tx queue 40