[ovs-dev] [ovs-dev, ovs-dev, v4] netdev-dpdk: Configurable Link State Change (LSC) detection mode
Ilya Maximets
i.maximets at samsung.com
Fri Feb 2 16:17:28 UTC 2018
On 02.02.2018 17:05, Róbert Mulik wrote:
> It is possible to change LSC detection mode to polling or interrupt mode
> for DPDK interfaces. The default is polling mode. To set interrupt mode,
> option dpdk-lsc-interrupt has to be set to true.
>
> In polling mode more processor time is needed, since the OVS repeatedly reads
> the link state with a short period. It can lead to packet loss for certain
> systems.
>
> In interrupt mode the hardware itself triggers an interrupt when link state
> change happens, so less processing time needs for the OVS. It is not possible
> to enable the interrupt mode on all hardware.
>
> For detailed description and usage see the dpdk install documentation.
>
> Signed-off-by: Robert Mulik <robert.mulik at ericsson.com>
> Reviewed-by: Ilya Maximets <i.maximets at samsung.com>
I'm sorry, but this is not true. I even stated that in my reply to v3:
"Not a full review.".
Also, we didn't decide what to do with some of my concerns.
As OVS doesn't usually use 'Reviewed-by' tag, I'll point you to the kernel docs:
https://www.kernel.org/doc/html/v4.12/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes
> Reviewed-by: Ian Stokes <ian.stokes at intel.com>
> Reviewed-by: Eelco Chaudron <echaudro at redhat.com>
> ---
It's a good common practice to put the changes between patch versions here.
It's really hard to track all the changes.
> Documentation/intro/install/dpdk.rst | 48 ++++++++++++++++++++++++++++++++++++
> lib/netdev-dpdk.c | 42 ++++++++++++++++++++++++++++---
> lib/netdev-dpdk.h | 2 ++
> vswitchd/bridge.c | 8 ++++++
> vswitchd/vswitch.xml | 45 +++++++++++++++++++++++++++++++++
> 5 files changed, 142 insertions(+), 3 deletions(-)
>
> --
> 1.9.1
>
> diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
> index ed358d5..14a6684 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -628,6 +628,54 @@ The average number of packets per output batch can be checked in PMD stats::
>
> $ ovs-appctl dpif-netdev/pmd-stats-show
>
It's still not a full review.
But I wanted to say that below documentation is just text and definitely
not a reStructuredText. Please re-format it using rST markup features.
Look at the docs around for examples.
> +Link State Change (LSC) detection configuration
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +There are two methods to get the information when Link State Change (LSC)
> +happens on a network interface: by polling or interrupt.
> +
> +With polling method, a process is running in the background and repeatedly
> +reads the link state with a short period. It continuously needs processor time
> +and between 2 reading periods it can`t see the link state change, therefore
> +the reaction time depends on the polling period. With higher rate, more
> +processor time is needed. Another problem with the poll mode is that on some
> +hardware a polling cycle takes too much time, which (in the end) leads to
> +packet loss for certain systems.
> +
> +If interrupts are used to get LSC information, the hardware itself triggers
> +an interrupt when link state change happens, the thread wakes up from sleep,
> +updates the information, and goes back to sleep mode. When no link state
> +change happens (most of the time), the thread remains in sleep mode and
> +doesn`t use processor time at all. The disadvantage of this method is that
> +when interrupt happens, the processor has to handle it immediately, so it
> +puts the currently running process to background, handles the interrupt, and
> +takes the background process back. Another disadvantage is that some hardware
> +can`t be configured to generate LSC interrupts.
> +
> +The default configuration is polling mode. To set interrupt mode, option
> +dpdk-lsc-interrupt has to be set to true.
> +
> +Global settings
> +
> +Command to set interrupt mode for all interfaces:
> +ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=true
> +
> +Command to set polling mode for all interfaces:
> +ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=false
> +or:
> +ovs-vsctl remove Open_vSwitch . other_config dpdk-lsc-interrupt
> +
> +Interface specific settings (override global settings)
> +
> +Command to set interrupt mode for a specific interface:
> +ovs-vsctl set interface <interface_name> options:dpdk-lsc-interrupt=true
> +
> +Command to set polling mode for a specific interface:
> +ovs-vsctl set interface <interface_name> options:dpdk-lsc-interrupt=false
> +
> +Command to reset to globally defined mode for a specific interface:
> +ovs-vsctl remove interface <interface_name> options dpdk-lsc-interrupt
> +
> Limitations
> ------------
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 94fb163..73d0d4b 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -148,7 +148,7 @@ typedef uint16_t dpdk_port_t;
> #define VHOST_ENQ_RETRY_NUM 8
> #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>
> -static const struct rte_eth_conf port_conf = {
> +static struct rte_eth_conf port_conf = {
> .rxmode = {
> .mq_mode = ETH_MQ_RX_RSS,
> .split_hdr_size = 0,
> @@ -167,6 +167,10 @@ static const struct rte_eth_conf port_conf = {
> .txmode = {
> .mq_mode = ETH_MQ_TX_NONE,
> },
> + .intr_conf = {
> + /* LSC interrupt mode disabled, polling mode used. */
> + .lsc = 0,
> + },
> };
>
> /*
> @@ -433,6 +437,12 @@ struct netdev_dpdk {
> /* DPDK-ETH hardware offload features,
> * from the enum set 'dpdk_hw_ol_features' */
> uint32_t hw_ol_features;
> +
> + /* Properties for link state change detection mode.
> + * If lsc_interrupt_mode is set to false, poll mode is used,
> + * otherwise interrupt mode is used. */
> + bool requested_lsc_interrupt_mode;
> + bool lsc_interrupt_mode;
> );
>
> PADDED_MEMBERS(CACHE_LINE_SIZE,
> @@ -459,6 +469,18 @@ int netdev_dpdk_get_vid(const struct netdev_dpdk *dev);
> struct ingress_policer *
> netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev);
>
> +void
> +netdev_dpdk_set_default_lsc_detect_mode(bool enabled)
> +{
> + port_conf.intr_conf.lsc = enabled;
> +}
> +
> +bool
> +netdev_dpdk_get_def_lsc_int_mode_enabled(void)
> +{
> + return port_conf.intr_conf.lsc;
> +}
> +
> static bool
> is_dpdk_class(const struct netdev_class *class)
> {
> @@ -686,12 +708,14 @@ dpdk_watchdog(void *dummy OVS_UNUSED)
> }
>
> static int
> -dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> +dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> {
> int diag = 0;
> int i;
> struct rte_eth_conf conf = port_conf;
>
> + conf.intr_conf.lsc = dev->lsc_interrupt_mode;
> +
> /* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly
> * enabled. */
> if (dev->mtu > ETHER_MTU) {
> @@ -801,7 +825,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
> n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
> n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
>
> - diag = dpdk_eth_dev_queue_setup(dev, n_rxq, n_txq);
> + diag = dpdk_eth_dev_port_config(dev, n_rxq, n_txq);
> if (diag) {
> VLOG_ERR("Interface %s(rxq:%d txq:%d) configure error: %s",
> dev->up.name, n_rxq, n_txq, rte_strerror(-diag));
> @@ -897,6 +921,8 @@ common_construct(struct netdev *netdev, dpdk_port_t port_no,
> dev->flags = 0;
> dev->requested_mtu = ETHER_MTU;
> dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
> + dev->requested_lsc_interrupt_mode =
> + netdev_dpdk_get_def_lsc_int_mode_enabled();
> ovsrcu_index_init(&dev->vid, -1);
> dev->vhost_reconfigured = false;
> dev->attached = false;
> @@ -1520,6 +1546,13 @@ netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args,
> goto out;
> }
>
> + bool lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt",
> + netdev_dpdk_get_def_lsc_int_mode_enabled());
> + if (dev->requested_lsc_interrupt_mode != lsc_interrupt_mode) {
> + dev->requested_lsc_interrupt_mode = lsc_interrupt_mode;
> + netdev_request_reconfigure(netdev);
> + }
> +
> rx_fc_en = smap_get_bool(args, "rx-flow-ctrl", false);
> tx_fc_en = smap_get_bool(args, "tx-flow-ctrl", false);
> autoneg = smap_get_bool(args, "flow-ctrl-autoneg", false);
> @@ -3546,6 +3579,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
> if (netdev->n_txq == dev->requested_n_txq
> && netdev->n_rxq == dev->requested_n_rxq
> && dev->mtu == dev->requested_mtu
> + && dev->lsc_interrupt_mode == dev->requested_lsc_interrupt_mode
> && dev->rxq_size == dev->requested_rxq_size
> && dev->txq_size == dev->requested_txq_size
> && dev->socket_id == dev->requested_socket_id) {
> @@ -3561,6 +3595,8 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
> goto out;
> }
>
> + dev->lsc_interrupt_mode = dev->requested_lsc_interrupt_mode;
> +
> netdev->n_txq = dev->requested_n_txq;
> netdev->n_rxq = dev->requested_n_rxq;
>
> diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
> index b7d02a7..8eb51bc 100644
> --- a/lib/netdev-dpdk.h
> +++ b/lib/netdev-dpdk.h
> @@ -27,6 +27,8 @@ struct dp_packet;
>
> void netdev_dpdk_register(void);
> void free_dpdk_buf(struct dp_packet *);
> +void netdev_dpdk_set_def_lsc_int_mode_enabled(bool enabled);
> +bool netdev_dpdk_get_def_lsc_int_mode_enabled(void);
>
> #else
>
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index f44f950..87608a5 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -68,6 +68,9 @@
> #include "lib/vswitch-idl.h"
> #include "xenserver.h"
> #include "vlan-bitmap.h"
> +#ifdef DPDK_NETDEV
> +#include "./lib/netdev-provider.h"
> +#endif
>
> VLOG_DEFINE_THIS_MODULE(bridge);
>
> @@ -601,6 +604,11 @@ bridge_reconfigure(const struct ovsrec_open_vswitch *ovs_cfg)
> ofproto_set_vlan_limit(smap_get_int(&ovs_cfg->other_config, "vlan-limit",
> LEGACY_MAX_VLAN_HEADERS));
>
> +#ifdef DPDK_NETDEV
> + netdev_dpdk_set_def_lsc_int_mode_enabled(
> + smap_get_bool(&ovs_cfg->other_config, "dpdk-lsc-interrupt", false));
> +#endif
> +
I'm still thinking that this is not necessary.
Eelco, why do you think we need to change the default global value in run-time?
IMHO, this only complicates the code. (I didn't check, but it looks like current
implementation will not work anyway.)
> ofproto_set_threads(
> smap_get_int(&ovs_cfg->other_config, "n-handler-threads", 0),
> smap_get_int(&ovs_cfg->other_config, "n-revalidator-threads", 0));
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 0c6a43d..91a9003 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -320,6 +320,29 @@
> </p>
> </column>
>
> + <column name="other_config" key="dpdk-lsc-interrupt"
> + type='{"type": "boolean"}'>
> + <p>
> + Set this value to <code>true</code> to configure interrupt mode for
> + Link State Change (LSC) detection instead of poll mode for DPDK
> + interfaces.
> + </p>
> + <p>
> + The default value is <code>false</code>.
> + </p>
> + <p>
> + If this value is <code>false</code> at startup, poll mode is used for
> + all netdev dpdk interfaces.
> + </p>
> + <p>
> + This value can be overridden for a specific interface in the options
> + section of that interface.
> + </p>
> + <p>
> + This parameter has an effect only on netdev dpdk interfaces.
> + </p>
> + </column>
> +
> <column name="other_config" key="dpdk-extra"
> type='{"type": "string"}'>
> <p>
> @@ -3631,6 +3654,28 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
> </column>
> </group>
>
> + <group title="Link State Change detection mode">
> + <column name="options" key="dpdk-lsc-interrupt"
> + type='{"type": "boolean"}'>
> + <p>
> + Set this value to <code>true</code> to configure interrupt mode for
> + Link State Change (LSC) detection instead of poll mode for the DPDK
> + interface.
> + </p>
> + <p>
> + If this value is not set, the value is taken from the global
> + settings.
> + </p>
> + <p>
> + If this value is set, the global LSC interrupt settings are
> + overridden.
> + </p>
> + <p>
> + This parameter has an effect only on netdev dpdk interfaces.
> + </p>
> + </column>
> + </group>
> +
> <group title="Common Columns">
> The overall purpose of these columns is described under <code>Common
> Columns</code> at the beginning of this document.
>
More information about the dev
mailing list