[ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support

Kavanagh, Mark B mark.b.kavanagh at intel.com
Thu Dec 7 12:27:32 UTC 2017


>From: Ilya Maximets [mailto:i.maximets at samsung.com]
>Sent: Thursday, December 7, 2017 12:19 PM
>To: Kavanagh, Mark B <mark.b.kavanagh at intel.com>; Stokes, Ian
><ian.stokes at intel.com>; dev at openvswitch.org
>Cc: maxime.coquelin at redhat.com; jan.scheurich at ericsson.com; Mooney, Sean K
><sean.k.mooney at intel.com>; Fischetti, Antonio <antonio.fischetti at intel.com>;
>Bie, Tiwei <tiwei.bie at intel.com>; Mcnamara, John <john.mcnamara at intel.com>;
>Guoshuai Li <ligs at dtdream.com>; ktraynor at redhat.com; Loftus, Ciara
><ciara.loftus at intel.com>; Yuanhan Liu <yliu at fridaylinux.org>
>Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>
>On 07.12.2017 14:32, Kavanagh, Mark B wrote:
>> Yuanhan's old email address was used in the previous mail - updated
>correctly here.
>> -Mark
>>
>>> -----Original Message-----
>>> From: ovs-dev-bounces at openvswitch.org [mailto:ovs-dev-
>bounces at openvswitch.org]
>>> On Behalf Of Kavanagh, Mark B
>>> Sent: Thursday, December 7, 2017 11:24 AM
>>> To: Stokes, Ian <ian.stokes at intel.com>; dev at openvswitch.org
>>> Cc: Liu, Yuanhan <yuanhan.liu at intel.com>; Bie, Tiwei <tiwei.bie at intel.com>;
>>> Mcnamara, John <john.mcnamara at intel.com>; maxime.coquelin at redhat.com;
>>> i.maximets at samsung.com
>>> Subject: Re: [ovs-dev] [PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>>
>>> Hi folks,
>>>
>>> Thanks for all of your respective reviews and testing of this patchset.
>>>
>>> It seems, however, that an issue in DPDK v17.11 has come to light that
>affects
>>> guests which use testpmd.
>>> Specifically, traffic does not reach a guest when traffic is live prior to
>>> kicking off the testpmd app within said guest, and at least two forwarding
>>> cores are used. [1]
>>>
>>> This is explained in additional detail is [2], an extract of which is
>>> reproduced below:
>>>
>>> 	"the vector Rx could be broken if backend has consumed all the avail
>>> descs before the
>>> 	 device is started. Because in current implementation, the vector Rx will
>>> return immediately
>>> 	 without refilling the avail ring if the used ring is empty. So we have
>>> to refill
>>> 	 the avail ring after flushing the elements in the used ring."
>>>
>>> This issue was initially uncovered by Antonio Fischetti, as part of the
>17.11
>>> patchset validation, and has since been localized to DPDK, rather than OvS.
>>> As a result, it seems now that we should not move to 17.11, but seek an
>out-
>>> of-tree 17.11.1 stable/bugfix release. I'm interested in opinions on this -
>>> should we:
>>> 	a) move to 17.11 now, note the issue above in the 'errata' section of the
>>> documentation, and move to 17.11.1 when it becomes available in February of
>>> next year
>>> 	b) request the early release of the 17.11.1 stable branch, which
>>> incorporates a fix for this issue (the possibility, and availability, of
>which
>>> are both TBD).
>>>
>>> Thanks in advance,
>>> Mark
>
>Hmm. Isn't it a guest driver issue? Do we need to care so much about running
>OVS inside the VM? If I assumed right, if we're running OVS not inside the VM,
>there will be no issues on the OVS side. The issue happens if guest
>application
>uses DPDK 17.11, but it will happen regardless of the DPDK version on the host
>side.
>
>So, the only bad scenario is running OVS inside the VM with virtio-net PMD
>driver.
>My question is: Do we need to care about this scenario so much?
>If the answer is "not really" then we can use variant a).
>But if it's very important thing we need to support than b) will be
>preferable.
>
>Am I missing something?

Hey Ilya,

I've just had the exact same conversation with Sean Mooney locally.

I think the point that both yourself and Sean has made is completely valid, which puts option a) back on the table.

Antonio has agreed to test that this works; once he confirms I can push v5 of the 17.11 patchet, and simply document the guest DPDK issue.

Thanks again,
Mark


>
>Best regards, Ilya Maximets.
>
>
>>>
>>> [1] http://dpdk.org/ml/archives/dev/2017-December/082801.html
>>> [2] http://dpdk.org/ml/archives/dev/2017-December/082874.html
>>>
>>>
>>>> -----Original Message-----
>>>> From: Stokes, Ian
>>>> Sent: Thursday, December 7, 2017 8:43 AM
>>>> To: Kavanagh, Mark B <mark.b.kavanagh at intel.com>; dev at openvswitch.org
>>>> Cc: ktraynor at redhat.com; maxime.coquelin at redhat.com;
>i.maximets at samsung.com;
>>>> jan.scheurich at ericsson.com; Mooney, Sean K <sean.k.mooney at intel.com>
>>>> Subject: RE: [ovs-dev][PATCH V4 2/2] netdev-dpdk: vHost IOMMU support
>>>>
>>>>> DPDK v17.11 introduces support for the vHost IOMMU feature.
>>>>> This is a security feature, which restricts the vhost memory that a
>virtio
>>>>> device may access.
>>>>
>>>> Hi all,
>>>>
>>>> There were a few requests for changes in the v4 of this patch and it's an
>>>> important aspect of the DPDK 17.11 support for OVS.
>>>>
>>>> I'd like to get this series in to the pull request this week. If people
>have
>>>> flagged issues for the latest revision I'd appreciate if they could review
>>> the
>>>> latest revision and flag any new issues that need to be raised.
>>>>
>>>> Ian
>>>>
>>>>>
>>>>> This feature also enables the vhost REPLY_ACK protocol, the
>implementation
>>>>> of which is known to work in newer versions of QEMU (i.e. v2.10.0), but
>is
>>>>> buggy in older versions (v2.7.0 - v2.9.0, inclusive). As such, the
>feature
>>>>> is disabled by default in (and should remain so), for the aforementioned
>>>>> older QEMU verions. Starting with QEMU v2.9.1, vhost-iommu-support can
>>>>> safely be enabled, even without having an IOMMU device, with no
>>>>> performance penalty.
>>>>>
>>>>> This patch adds a new global config option, vhost-iommu-support, that
>>>>> controls enablement of the vhost IOMMU feature:
>>>>>
>>>>>     ovs-vsctl set Open_vSwitch . other_config:vhost-iommu-support=true
>>>>>
>>>>> This value defaults to false; to enable IOMMU support, this field should
>>>>> be set to true when setting other global parameters on init (such as
>>>>> "dpdk-socket-mem", for example). Changing the value at runtime is not
>>>>> supported, and requires restarting the vswitch daemon.
>>>>>
>>>>> Signed-off-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
>>>>>
>>>>> ---
>>>>>
>>>>> v4:  - rebase to HEAD of master
>>>>>      - clarify that IOMMU support applies exclusively to vhost user
>>>>>        client ports.
>>>>>      - reword caveats regarding modifying vhost-iommu-support at runtime
>>>>>      - use smap_get_bool() to parse vhost-iommu-support, instead of
>>>>>        process_vhost_flags().
>>>>>      - add stub implementation of dpdk_vhost_iommu_enabled().
>>>>>      - move stdbool.h #include outside of DPDK compiler guards.
>>>>>      - remove mention of vhost IOMMU mode from
>>>>>        netdev_dpdk_vhost_client_configure() log.
>>>>>
>>>>> v3:  - erroneous; disregard.
>>>>>
>>>>> v2:  - rebase to HEAD of master
>>>>>      - refactor vHost IOMMU enablement mechanism (use a global
>>>>>        config option, instead of the previous per-port approach).
>>>>> ---
>>>>>  Documentation/topics/dpdk/vhost-user.rst | 27
>+++++++++++++++++++++++++++
>>>>>  NEWS                                     |  1 +
>>>>>  lib/dpdk-stub.c                          |  6 ++++++
>>>>>  lib/dpdk.c                               | 12 ++++++++++++
>>>>>  lib/dpdk.h                               |  3 +++
>>>>>  lib/netdev-dpdk.c                        | 14 ++++++++++----
>>>>>  vswitchd/vswitch.xml                     | 15 +++++++++++++++
>>>>>  7 files changed, 74 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>>>>> b/Documentation/topics/dpdk/vhost-user.rst
>>>>> index 5347995..ffef653 100644
>>>>> --- a/Documentation/topics/dpdk/vhost-user.rst
>>>>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>>>>> @@ -273,6 +273,33 @@ One benefit of using this mode is the ability for
>>>>> vHost ports to 'reconnect' in  event of the switch crashing or being
>>>>> brought down. Once it is brought back up,  the vHost ports will reconnect
>>>>> automatically and normal service will resume.
>>>>>
>>>>> +vhost-user-client IOMMU Support
>>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> +
>>>>> +vhost IOMMU is a feature which restricts the vhost memory that a virtio
>>>>> +device can access, and as such is useful in deployments in which
>security
>>>>> is a concern.
>>>>> +
>>>>> +IOMMU support may be enabled via a global config value, ```vhost-iommu-
>>>>> support```.
>>>>> +Setting this to true enables vhost IOMMU support for all vhost ports
>>>>> +when/where
>>>>> +available::
>>>>> +
>>>>> +    $ ovs-vsctl set Open_vSwitch.other_config:vhost-iommu-support=true
>>>>> +
>>>>> +The default value is false.
>>>>> +
>>>>> +.. important::
>>>>> +
>>>>> +    Changing this value requires restarting the daemon.
>>>>> +
>>>>> +.. important::
>>>>> +
>>>>> +    Enabling the IOMMU feature also enables the vhost user reply-ack
>>>>> protocol;
>>>>> +    this is known to work on QEMU v2.10.0, but is buggy on older
>versions
>>>>> +    (2.7.0 - 2.9.0, inclusive). Consequently, the IOMMU feaure is
>>>>> disabled by
>>>>> +    default (and should remain so if using the aforementioned versions
>of
>>>>> QEMU).
>>>>> +    Starting with QEMU v2.9.1, vhost-iommu-support can safely be
>enabled,
>>>>> even
>>>>> +    without having an IOMMU device, with no performance penalty.
>>>>> +
>>>>>  .. _dpdk-testpmd:
>>>>>
>>>>>  DPDK in the Guest
>>>>> diff --git a/NEWS b/NEWS
>>>>> index d4a1c9a..99c47d8 100644
>>>>> --- a/NEWS
>>>>> +++ b/NEWS
>>>>> @@ -15,6 +15,7 @@ Post-v2.8.0
>>>>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>>>>     - DPDK:
>>>>>       * Add support for DPDK v17.11
>>>>> +     * Add support for vHost IOMMU
>>>>>
>>>>>  v2.8.0 - 31 Aug 2017
>>>>>  --------------------
>>>>> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c index daef729..3602180
>>>>> 100644
>>>>> --- a/lib/dpdk-stub.c
>>>>> +++ b/lib/dpdk-stub.c
>>>>> @@ -48,3 +48,9 @@ dpdk_get_vhost_sock_dir(void)  {
>>>>>      return NULL;
>>>>>  }
>>>>> +
>>>>> +bool
>>>>> +dpdk_vhost_iommu_enabled(void)
>>>>> +{
>>>>> +    return false;
>>>>> +}
>>>>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>>>>> index 8da6c32..6710d10 100644
>>>>> --- a/lib/dpdk.c
>>>>> +++ b/lib/dpdk.c
>>>>> @@ -41,6 +41,7 @@ VLOG_DEFINE_THIS_MODULE(dpdk);
>>>>>  static FILE *log_stream = NULL;       /* Stream for DPDK log redirection
>>>>> */
>>>>>
>>>>>  static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
>>>>> */
>>>>> +static bool vhost_iommu_enabled = false; /* Status of vHost IOMMU
>>>>> +support */
>>>>>
>>>>>  static int
>>>>>  process_vhost_flags(char *flag, const char *default_val, int size, @@ -
>>>>> 345,6 +346,11 @@ dpdk_init__(const struct smap *ovs_other_config)
>>>>>          vhost_sock_dir = sock_dir_subcomponent;
>>>>>      }
>>>>>
>>>>> +    vhost_iommu_enabled = smap_get_bool(ovs_other_config,
>>>>> +                                        "vhost-iommu-support", false);
>>>>> +    VLOG_INFO("IOMMU support for vhost-user-client %s.",
>>>>> +               vhost_iommu_enabled ? "enabled" : "disabled");
>>>>> +
>>>>>      argv = grow_argv(&argv, 0, 1);
>>>>>      argc = 1;
>>>>>      argv[0] = xstrdup(ovs_get_program_name()); @@ -482,6 +488,12 @@
>>>>> dpdk_get_vhost_sock_dir(void)
>>>>>      return vhost_sock_dir;
>>>>>  }
>>>>>
>>>>> +bool
>>>>> +dpdk_vhost_iommu_enabled(void)
>>>>> +{
>>>>> +    return vhost_iommu_enabled;
>>>>> +}
>>>>> +
>>>>>  void
>>>>>  dpdk_set_lcore_id(unsigned cpu)
>>>>>  {
>>>>> diff --git a/lib/dpdk.h b/lib/dpdk.h
>>>>> index 673a1f1..dc58d96 100644
>>>>> --- a/lib/dpdk.h
>>>>> +++ b/lib/dpdk.h
>>>>> @@ -17,6 +17,8 @@
>>>>>  #ifndef DPDK_H
>>>>>  #define DPDK_H
>>>>>
>>>>> +#include <stdbool.h>
>>>>> +
>>>>>  #ifdef DPDK_NETDEV
>>>>>
>>>>>  #include <rte_config.h>
>>>>> @@ -35,5 +37,6 @@ struct smap;
>>>>>  void dpdk_init(const struct smap *ovs_other_config);  void
>>>>> dpdk_set_lcore_id(unsigned cpu);  const char
>>>>> *dpdk_get_vhost_sock_dir(void);
>>>>> +bool dpdk_vhost_iommu_enabled(void);
>>>>>
>>>>>  #endif /* dpdk.h */
>>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index f552444..9715c39
>>>>> 100644
>>>>> --- a/lib/netdev-dpdk.c
>>>>> +++ b/lib/netdev-dpdk.c
>>>>> @@ -3253,6 +3253,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev
>>>>> *netdev)  {
>>>>>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>>      int err;
>>>>> +    uint64_t vhost_flags = 0;
>>>>>
>>>>>      ovs_mutex_lock(&dev->mutex);
>>>>>
>>>>> @@ -3263,16 +3264,21 @@ netdev_dpdk_vhost_client_reconfigure(struct
>netdev
>>>>> *netdev)
>>>>>       */
>>>>>      if (!(dev->vhost_driver_flags & RTE_VHOST_USER_CLIENT)
>>>>>              && strlen(dev->vhost_id)) {
>>>>> -        /* Register client-mode device */
>>>>> -        err = rte_vhost_driver_register(dev->vhost_id,
>>>>> -                                        RTE_VHOST_USER_CLIENT);
>>>>> +        /* Register client-mode device. */
>>>>> +        vhost_flags |= RTE_VHOST_USER_CLIENT;
>>>>> +
>>>>> +        /* Enable IOMMU support, if explicitly requested. */
>>>>> +        if (dpdk_vhost_iommu_enabled()) {
>>>>> +            vhost_flags |= RTE_VHOST_USER_IOMMU_SUPPORT;
>>>>> +        }
>>>>> +        err = rte_vhost_driver_register(dev->vhost_id, vhost_flags);
>>>>>          if (err) {
>>>>>              VLOG_ERR("vhost-user device setup failure for device %s\n",
>>>>>                       dev->vhost_id);
>>>>>              goto unlock;
>>>>>          } else {
>>>>>              /* Configuration successful */
>>>>> -            dev->vhost_driver_flags |= RTE_VHOST_USER_CLIENT;
>>>>> +            dev->vhost_driver_flags |= vhost_flags;
>>>>>              VLOG_INFO("vHost User device '%s' created in 'client' mode,
>"
>>>>>                        "using client socket '%s'",
>>>>>                        dev->up.name, dev->vhost_id); diff --git
>>>>> a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index c145e1a..6c6df50
>>>>> 100644
>>>>> --- a/vswitchd/vswitch.xml
>>>>> +++ b/vswitchd/vswitch.xml
>>>>> @@ -344,6 +344,21 @@
>>>>>          </p>
>>>>>        </column>
>>>>>
>>>>> +      <column name="other_config" key="vhost-iommu-support"
>>>>> +              type='{"type": "boolean"}'>
>>>>> +        <p>
>>>>> +          vHost IOMMU is a  security feature, which restricts the vhost
>>>>> memory
>>>>> +          that a virtio device may access. vHost IOMMU support is
>>>>> disabled by
>>>>> +          default, due to a bug in QEMU implementations of the vhost
>>>>> REPLY_ACK
>>>>> +          protocol, (on which vHost IOMMU relies) prior to v2.9.1.
>>>>> Setting this
>>>>> +          value to <code>true</code> enables vHost IOMMU support for
>>>>> vHost User
>>>>> +          Client ports in OvS-DPDK, starting from DPDK v17.11.
>>>>> +        </p>
>>>>> +        <p>
>>>>> +          Changing this value requires restarting the daemon.
>>>>> +        </p>
>>>>> +      </column>
>>>>> +
>>>>>        <column name="other_config" key="n-handler-threads"
>>>>>                type='{"type": "integer", "minInteger": 1}'>
>>>>>          <p>
>>>>> --
>>>>> 2.7.4
>>>
>>> _______________________________________________
>>> dev mailing list
>>> dev at openvswitch.org
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>>
>>


More information about the dev mailing list