[ovs-dev] [ovs-dev, v3] netdev-dpdk: Configurable Link State Change (LSC) detection mode

Ilya Maximets i.maximets at samsung.com
Thu Jan 25 12:34:28 UTC 2018


Not a full review.
Comments inline.

Best regards, Ilya Maximets.

On 24.01.2018 17:35, Róbert Mulik wrote:
> It is possible to change LSC detection mode to polling or interrupt mode
> for DPDK interfaces. The default is polling mode. To set interrupt mode,
> option dpdk-lsc-interrupt has to be set to true.
> 
> In polling mode more processor time is needed, since the OVS repeatedly reads
> the link state with a short period. It can lead to packet loss for certain
> systems.
> 
> In interrupt mode the hardware itself triggers an interrupt when link state
> change happens, so less processing time needs for the OVS. It is not possible
> to enable the interrupt mode on all hardware.
> 
> For detailed description and usage see the dpdk install documentation.
> 
> Signed-off-by: Robert Mulik <robert.mulik at ericsson.com>
> ---
>  Documentation/intro/install/dpdk.rst | 48 +++++++++++++++++++++++++++++++++++
>  lib/netdev-dpdk.c                    | 49 +++++++++++++++++++++++++++++++++---
>  lib/netdev-dpdk.h                    |  2 ++
>  vswitchd/bridge.c                    |  8 ++++++
>  vswitchd/vswitch.xml                 | 45 +++++++++++++++++++++++++++++++++
>  5 files changed, 149 insertions(+), 3 deletions(-)
> 
> --
> 1.9.1
> 
> diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
> index 040e62e..806a7af 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -626,6 +626,54 @@ The average number of packets per output batch can be checked in PMD stats::
> 
>      $ ovs-appctl dpif-netdev/pmd-stats-show
> 
> +Link State Change (LSC) detection configuration
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +There are two methods to get the information when Link State Change (LSC)
> +happens on a network interface: by polling or interrupt.
> +
> +With polling method, a process is running in the background and repeatedly
> +reads the link state with a short period. It continuously needs processor time
> +and between 2 reading periods it can`t see the link state change, therefore
> +the reaction time depends on the polling period. With higher rate, more
> +processor time is needed. Another problem with the poll mode is that on some
> +hardware a polling cycle takes too much time, which (in the end) leads to
> +packet loss for certain systems.
> +
> +If interrupts are used to get LSC information, the hardware itself triggers
> +an interrupt when link state change happens, the thread wakes up from sleep,
> +updates the information, and goes back to sleep mode. When no link state
> +change happens (most of the time), the thread remains in sleep mode and
> +doesn`t use processor time at all. The disadvantage of this method is that
> +when interrupt happens, the processor has to handle it immediately, so it
> +puts the currently running process to background, handles the interrupt, and
> +takes the background process back. Another disadvantage is that some hardware
> +can`t be configured to generate LSC interrupts.
> +
> +The default configuration is polling mode. To set interrupt mode, option
> +dpdk-lsc-interrupt has to be set to true.
> +
> +Global settings
> +
> +Command to set interrupt mode for all interfaces:
> +ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=true
> +
> +Command to set polling mode for all interfaces:
> +ovs-vsctl set Open_vSwitch . other_config:dpdk-lsc-interrupt=false
> +or:
> +ovs-vsctl remove Open_vSwitch . other_config dpdk-lsc-interrupt
> +
> +Interface specific settings (override global settings)
> +
> +Command to set interrupt mode for a specific interface:
> +ovs-vsctl set interface <interface_name> options:dpdk-lsc-interrupt=true
> +
> +Command to set polling mode for a specific interface:
> +ovs-vsctl set interface <interface_name> options:dpdk-lsc-interrupt=false
> +
> +Command to reset to globally defined mode for a specific interface:
> +ovs-vsctl remove interface <interface_name> options dpdk-lsc-interrupt
> +
>  Limitations
>  ------------
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index ac2e38e..6643ea9 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -148,7 +148,7 @@ typedef uint16_t dpdk_port_t;
>  #define VHOST_ENQ_RETRY_NUM 8
>  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> 
> -static const struct rte_eth_conf port_conf = {
> +static struct rte_eth_conf port_conf = {
>      .rxmode = {
>          .mq_mode = ETH_MQ_RX_RSS,
>          .split_hdr_size = 0,
> @@ -167,6 +167,10 @@ static const struct rte_eth_conf port_conf = {
>      .txmode = {
>          .mq_mode = ETH_MQ_TX_NONE,
>      },
> +    .intr_conf = {
> +        /* LSC interrupt mode disabled, polling mode used. */
> +        .lsc = 0,
> +    },
>  };
> 
>  /*
> @@ -433,6 +437,12 @@ struct netdev_dpdk {
>          /* DPDK-ETH hardware offload features,
>           * from the enum set 'dpdk_hw_ol_features' */
>          uint32_t hw_ol_features;
> +
> +        /* Properties for link state change detection mode.
> +         * If lsc_interrupt_mode is set to false, poll mode is used,
> +         * otherwise interrupt mode is used. */
> +        bool requested_lsc_interrupt_mode;
> +        bool lsc_interrupt_mode;
>      );
> 
>      PADDED_MEMBERS(CACHE_LINE_SIZE,
> @@ -457,6 +467,18 @@ int netdev_dpdk_get_vid(const struct netdev_dpdk *dev);
>  struct ingress_policer *
>  netdev_dpdk_get_ingress_policer(const struct netdev_dpdk *dev);
> 
> +void
> +netdev_dpdk_set_default_lsc_detect_mode(bool interrupt_mode)
> +{
> +    port_conf.intr_conf.lsc = interrupt_mode;
> +}
> +
> +bool
> +netdev_dpdk_get_default_lsc_detect_mode(void)> +{
> +    return port_conf.intr_conf.lsc;
> +}

Names of these functions are incorrect.
IMHO, netdev_dpdk_get_default_lsc_detect_mode() should return "interrupt" or "polling",
but netdev_dpdk_get_default_lsc_interrupt_mode() or
netdev_dpdk_get_default_lsc_interrupt_mode_enabled() should return "true" or "false".

All the variants are too long. Can we make them shorter without degrading readability?

> +
>  static bool
>  is_dpdk_class(const struct netdev_class *class)
>  {
> @@ -684,12 +706,14 @@ dpdk_watchdog(void *dummy OVS_UNUSED)
>  }
> 
>  static int
> -dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> +dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
>  {
>      int diag = 0;
>      int i;
>      struct rte_eth_conf conf = port_conf;
> 
> +    conf.intr_conf.lsc = dev->lsc_interrupt_mode;
> +
>      /* For some NICs (e.g. Niantic), scatter_rx mode needs to be explicitly
>       * enabled. */
>      if (dev->mtu > ETHER_MTU) {
> @@ -799,7 +823,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>      n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
>      n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
> 
> -    diag = dpdk_eth_dev_queue_setup(dev, n_rxq, n_txq);
> +    diag = dpdk_eth_dev_port_config(dev, n_rxq, n_txq);
>      if (diag) {
>          VLOG_ERR("Interface %s(rxq:%d txq:%d) configure error: %s",
>                   dev->up.name, n_rxq, n_txq, rte_strerror(-diag));
> @@ -895,6 +919,8 @@ common_construct(struct netdev *netdev, dpdk_port_t port_no,
>      dev->flags = 0;
>      dev->requested_mtu = ETHER_MTU;
>      dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
> +    dev->requested_lsc_interrupt_mode =
> +        netdev_dpdk_get_default_lsc_detect_mode();
>      ovsrcu_index_init(&dev->vid, -1);
>      dev->vhost_reconfigured = false;
>      dev->attached = false;
> @@ -1469,6 +1495,20 @@ netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args,
>                              "The old 'dpdk<port_id>' names are not supported",
>                        netdev_get_name(netdev));
>          err = EINVAL;
> +        goto out;
> +    }
> +
> +    struct smap_node *node = smap_get_node(args, "dpdk-lsc-interrupt");
> +    bool lsc_interrupt_mode;
> +
> +    if (node) {
> +        lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", false);
> +    } else {
> +        lsc_interrupt_mode = netdev_dpdk_get_default_lsc_detect_mode();
> +    }

Above code could be replaced by just:

    lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt",
                                       dpdk_get_default_lsc_detect_mode());

Second thought here: Why we're checking this value before checking the 'err'?

> +    if (dev->requested_lsc_interrupt_mode != lsc_interrupt_mode) {
> +        dev->requested_lsc_interrupt_mode = lsc_interrupt_mode;
> +        netdev_request_reconfigure(netdev);
>      }
> 
>      if (err) {
> @@ -3482,6 +3522,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>      if (netdev->n_txq == dev->requested_n_txq
>          && netdev->n_rxq == dev->requested_n_rxq
>          && dev->mtu == dev->requested_mtu
> +        && dev->lsc_interrupt_mode == dev->requested_lsc_interrupt_mode
>          && dev->rxq_size == dev->requested_rxq_size
>          && dev->txq_size == dev->requested_txq_size
>          && dev->socket_id == dev->requested_socket_id) {
> @@ -3497,6 +3538,8 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>          goto out;
>      }
> 
> +    dev->lsc_interrupt_mode = dev->requested_lsc_interrupt_mode;
> +
>      netdev->n_txq = dev->requested_n_txq;
>      netdev->n_rxq = dev->requested_n_rxq;
> 
> diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
> index b7d02a7..5440743 100644
> --- a/lib/netdev-dpdk.h
> +++ b/lib/netdev-dpdk.h
> @@ -27,6 +27,8 @@ struct dp_packet;
> 
>  void netdev_dpdk_register(void);
>  void free_dpdk_buf(struct dp_packet *);
> +void netdev_dpdk_set_default_lsc_detect_mode(bool interrupt_mode);
> +bool netdev_dpdk_get_default_lsc_detect_mode(void);
> 
>  #else
> 
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index f44f950..86c88e7 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -68,6 +68,9 @@
>  #include "lib/vswitch-idl.h"
>  #include "xenserver.h"
>  #include "vlan-bitmap.h"
> +#ifdef DPDK_NETDEV
> +#include "./lib/netdev-provider.h"
> +#endif
> 
>  VLOG_DEFINE_THIS_MODULE(bridge);
> 
> @@ -601,6 +604,11 @@ bridge_reconfigure(const struct ovsrec_open_vswitch *ovs_cfg)
>      ofproto_set_vlan_limit(smap_get_int(&ovs_cfg->other_config, "vlan-limit",
>                                         LEGACY_MAX_VLAN_HEADERS));
> 
> +#ifdef DPDK_NETDEV
> +    netdev_dpdk_set_default_lsc_detect_mode(
> +        smap_get_bool(&ovs_cfg->other_config, "dpdk-lsc-interrupt", false));
> +#endif

Is there a reason why we should be able to configure default value in runtime?
I think that we could make this boot-time option and move all the related code
to lib/dpdk.{c,h} just like for vhost iommu support.
IMHO, user should know if most of his HW NICs supports LSC interrupt mode or not
before starting the OVS.

> +
>      ofproto_set_threads(
>          smap_get_int(&ovs_cfg->other_config, "n-handler-threads", 0),
>          smap_get_int(&ovs_cfg->other_config, "n-revalidator-threads", 0));
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 7e89325..95071eb 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -320,6 +320,29 @@
>          </p>
>        </column>
> 
> +      <column name="other_config" key="dpdk-lsc-interrupt"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          Set this value to <code>true</code> to configure interrupt mode for
> +          Link State Change (LSC) detection instead of poll mode for DPDK
> +          interfaces.
> +        </p>
> +        <p>
> +          The default value is <code>false</code>.
> +        </p>
> +        <p>
> +          If this value is <code>false</code> at startup, poll mode is used for
> +          all netdev dpdk interfaces.
> +        </p>
> +        <p>
> +          This value can be overridden for a specific interface in the options
> +          section of that interface.
> +        </p>
> +        <p>
> +          This parameter has an effect only on netdev dpdk interfaces.
> +        </p>
> +      </column>
> +
>        <column name="other_config" key="dpdk-extra"
>                type='{"type": "string"}'>
>          <p>
> @@ -3620,6 +3643,28 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>        </column>
>      </group>
> 
> +    <group title="Link State Change detection mode">
> +      <column name="options" key="dpdk-lsc-interrupt"
> +              type='{"type": "boolean"}'>
> +        <p>
> +          Set this value to <code>true</code> to configure interrupt mode for
> +          Link State Change (LSC) detection instead of poll mode for the DPDK
> +          interface.
> +        </p>
> +        <p>
> +          If this value is not set, the value is taken from the global
> +          settings.
> +        </p>
> +        <p>
> +          If this value is set, the global LSC interrupt settings are
> +          overridden.
> +        </p>
> +        <p>
> +          This parameter has an effect only on netdev dpdk interfaces.
> +        </p>
> +      </column>
> +    </group>
> +
>      <group title="Common Columns">
>        The overall purpose of these columns is described under <code>Common
>        Columns</code> at the beginning of this document.
> 


More information about the dev mailing list