[ovs-dev] [PATCH RFC v2] netdev-dpdk: Allow specification of index for PCI devices

Kevin Traynor ktraynor at redhat.com
Fri Oct 20 09:01:43 UTC 2017


On 10/19/2017 07:27 PM, Loftus, Ciara wrote:
>>
>> On 10/17/2017 11:48 AM, Ciara Loftus wrote:
>>> Some NICs have only one PCI address associated with multiple ports. This
>>> patch extends the dpdk-devargs option's format to cater for such
>>> devices. Whereas before only one of N ports associated with one PCI
>>> address could be added, now N can.
>>>
>>> This patch allows for the following use of the dpdk-devargs option:
>>>
>>> ovs-vsctl set Interface myport options:dpdk-devargs=0000:06:00.0,X
>>>
>>> Where X is an unsigned integer representing one of multiple ports
>>> associated with the PCI address provided.
>>>
>>> This patch has not been tested.
>>>
>>> Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
>>> ---
>>> v2:
>>> * Simplify function to find port ID of indexed device
>>> * Hotplug compatibility
>>> * Detach compatibility
>>> * Documentation
>>> * Use strtol instead of atoi
>>>
>>>  Documentation/howto/dpdk.rst         |  9 +++++
>>>  Documentation/intro/install/dpdk.rst |  9 +++++
>>>  NEWS                                 |  2 ++
>>>  lib/netdev-dpdk.c                    | 67 ++++++++++++++++++++++++++++++---
>> ---
>>>  vswitchd/vswitch.xml                 | 11 ++++--
>>>  5 files changed, 85 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/Documentation/howto/dpdk.rst
>> b/Documentation/howto/dpdk.rst
>>> index d123819..9447b71 100644
>>> --- a/Documentation/howto/dpdk.rst
>>> +++ b/Documentation/howto/dpdk.rst
>>> @@ -48,6 +48,15 @@ number of dpdk devices found in the log file::
>>>      $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
>>>          options:dpdk-devargs=0000:01:00.1
>>>
>>> +If your PCI device has multiple ports under the same PCI ID, you can use
>> the
>>> +following notation to indicate the specific device you wish to add::
>>> +
>>> +    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
>>> +        options:dpdk-devargs=0000:01:00.0,0
>>> +
>>> +The above would attempt to use the first device (0) associated with that
>> PCI
>>> +ID. Use ,1 ,2 etc. to access the next.
>>> +
>>>  After the DPDK ports get added to switch, a polling thread continuously
>> polls
>>>  DPDK devices and consumes 100% of the core, as can be checked from
>> ``top`` and
>>>  ``ps`` commands::
>>> diff --git a/Documentation/intro/install/dpdk.rst
>> b/Documentation/intro/install/dpdk.rst
>>> index bb69ae5..d0e87f5 100644
>>> --- a/Documentation/intro/install/dpdk.rst
>>> +++ b/Documentation/intro/install/dpdk.rst
>>> @@ -271,6 +271,15 @@ ports. For example, to create a userspace bridge
>> named ``br0`` and add two
>>>  DPDK devices will not be available for use until a valid dpdk-devargs is
>>>  specified.
>>>
>>> +If your PCI device has multiple ports under the same PCI ID, you can use
>> the
>>> +following notation to indicate the specific device you wish to add::
>>> +
>>> +    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
>>> +        options:dpdk-devargs=0000:01:00.0,0
>>> +
>>> +The above would attempt to use the first device (0) associated with that
>> PCI
>>> +ID. Use ,1 ,2 etc. to access the next.
>>> +
>>>  Refer to ovs-vsctl(8) and :doc:`/howto/dpdk` for more details.
>>>
>>>  Performance Tuning
>>> diff --git a/NEWS b/NEWS
>>> index 75f5fa5..ca885a6 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -5,6 +5,8 @@ Post-v2.8.0
>>>         chassis "hostname" in addition to a chassis "name".
>>>     - Linux kernel 4.13
>>>       * Add support for compiling OVS with the latest Linux 4.13 kernel
>>> +   - DPDK:
>>> +     * Support for adding devices with multiple DPDK ports under one PCI
>> ID.
>>>
>>>  v2.8.0 - 31 Aug 2017
>>>     - ovs-ofctl:
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index c60f46f..35e15da 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -1214,16 +1214,40 @@
>> netdev_dpdk_lookup_by_port_id(dpdk_port_t port_id)
>>>  }
>>>
>>>  static dpdk_port_t
>>> +dpdk_get_port_by_name_with_index(char *name, int idx, int base_id)
>>> +{
>>> +    struct rte_eth_dev_info dev_info;
>>> +    char pci_addr[PCI_PRI_STR_SIZE];
>>> +    dpdk_port_t offset_port_id = base_id + idx;
>>> +
>>> +    if (rte_eth_dev_is_valid_port(offset_port_id)) {
>>> +        rte_eth_dev_info_get(offset_port_id, &dev_info);
>>> +        rte_pci_device_name(&dev_info.pci_dev->addr, pci_addr,
>>> +                            sizeof(pci_addr));
>>> +        if (!strcmp(pci_addr, name)) {
>>> +            return offset_port_id;
>>> +        }
>>> +    }
>>> +
>>> +    VLOG_INFO("Invalid PCI offset %i for device %s", idx, name);
>>> +
>>> +    return DPDK_ETH_PORT_ID_INVALID;
>>> +}
>>> +
>>> +static dpdk_port_t
>>>  netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
>>>                              const char *devargs, char **errp)
>>>  {
>>> -    /* Get the name up to the first comma. */
>>> -    char *name = xmemdup0(devargs, strcspn(devargs, ","));
>>> +    char *devargs_copy = xmemdup0((devargs), strlen(devargs));
>>> +    char *name, *idx_str;
>>> +    unsigned idx;
>>>      dpdk_port_t new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>>
>>> +    name = strtok_r(devargs_copy, ",", &devargs_copy);
>>> +    idx_str = strtok_r(devargs_copy, ",", &devargs_copy);
>>> +
>>>      if (!rte_eth_dev_count()
>>> -            || rte_eth_dev_get_port_by_name(name, &new_port_id)
>>> -            || !rte_eth_dev_is_valid_port(new_port_id)) {
>>> +        || rte_eth_dev_get_port_by_name(name, &new_port_id)) {
>>>          /* Device not found in DPDK, attempt to attach it */
>>>          if (!rte_eth_dev_attach(devargs, &new_port_id)) {
>>
>> I think you should strip out the ",X" for attach. Although looking at
>> the DPDK code it would probably ignore the ",X" once the pci address is
>> ok, so maybe it was intentional.
> 
> I left this in intentionally. Vdev devargs often include commas and the information following them is important. Eg.
> net_pcap0,tx_pcap=/tmp/file_tx.pcap
> 

True for vdev, but I was thinking of the pci case. If you are
distinguishing between pci and vdev because of the "<pci>,<int>" vs
"<vdev>,<string>" issue below then maybe you could also strip out the
",<int>" for pci.

> But actually this will cause a failure later when we assume those arguments are an integer index and try to process them. That needs to be updated in v3.
> 
>>
>>>              /* Attach successful */
>>> @@ -1232,12 +1256,21 @@ netdev_dpdk_process_devargs(struct
>> netdev_dpdk *dev,
>>>          } else {
>>>              /* Attach unsuccessful */
>>>              new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>> -            VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
>>> -                          devargs);
>>> +            goto out;
>>>          }
>>>      }
>>>
>>> -    free(name);
>>> +    if (idx_str) {
>>> +        idx = strtol(idx_str, NULL, 10);
>>> +        new_port_id = dpdk_get_port_by_name_with_index(name, idx,
>> new_port_id);
>>> +    }
>>> +
>>> +out:
>>> +    if (new_port_id == DPDK_ETH_PORT_ID_INVALID) {
>>> +        VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
>>> +                      devargs);
>>> +    }
>>> +
>>
>> you need to free devargs_copy here?
> 
> Yes. Will put this in v3.
> 
>>
>>>      return new_port_id;
>>>  }
>>>
>>> @@ -2549,14 +2582,28 @@ netdev_dpdk_detach(struct unixctl_conn
>> *conn, int argc OVS_UNUSED,
>>>  {
>>>      int ret;
>>>      char *response;
>>> -    uint8_t port_id;
>>>      char devname[RTE_ETH_NAME_MAX_LEN];
>>>      struct netdev_dpdk *dev;
>>> +    char *devname_copy = xmemdup0((argv[1]), strlen(argv[1]));
>>> +    char *name, *idx_str;
>>> +    int idx;
>>> +    dpdk_port_t port_id;
>>>
>>>      ovs_mutex_lock(&dpdk_mutex);
>>>
>>> -    if (!rte_eth_dev_count() || rte_eth_dev_get_port_by_name(argv[1],
>>> -                                                             &port_id)) {
>>> +    name = strtok_r(devname_copy, ",", &devname_copy);
>>> +    idx_str = strtok_r(devname_copy, ",", &devname_copy);
>>> +
>>> +    if (rte_eth_dev_count()) {
>>> +        if (rte_eth_dev_get_port_by_name(argv[1], &port_id)) {
>>> +            port_id = DPDK_ETH_PORT_ID_INVALID;
>>> +        } else if (idx_str) {
>>> +            idx = strtol(idx_str, NULL, 10);
>>> +            port_id = dpdk_get_port_by_name_with_index(name, idx,
>> port_id);
>>> +        }
>>> +    }
>>> +
>>> +    if (port_id == DPDK_ETH_PORT_ID_INVALID) {
>>>          response = xasprintf("Device '%s' not found in DPDK", argv[1]);
>>>          goto error;
>>>      }
>>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>>> index 04c771f..7f503fb 100644
>>> --- a/vswitchd/vswitch.xml
>>> +++ b/vswitchd/vswitch.xml
>>> @@ -2608,12 +2608,17 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
>> type=patch options:peer=p1 \
>>>          <p>
>>>            Specifies the PCI address associated with the port for physical
>>>            devices, or the virtual driver to be used for the port when a virtual
>>> -          PMD is intended to be used. For the latter, the argument string
>>> -          typically takes the form of
>>> +          PMD is intended to be used.
>>> +          For physical devices, you may specify a single PCI ID like
>>> +          <code>0000:06:00.0</code> or an ID followed by a comma and an
>> index
>>> +          like <code>0000:06:00.0,1</code> for cases where multiple ports are
>>> +          associated with a single PCI ID.
>>> +          For virtual devices, the argument string typically takes the form of
>>>            <code>eth_<var>driver_name</var><var>x</var></code>, where
>>>            <var>driver_name</var> is a valid virtual DPDK PMD driver name and
>>>            <var>x</var> is a unique identifier of your choice for the given
>>> -          port.  Only supported by the dpdk port type.
>>> +          port.
>>> +          Only supported by the dpdk port type.
>>
>> Spurious
> 
> Will remove in v3.
> 
> Thanks,
> Ciara
> 
>>
>>>          </p>
>>>        </column>
>>>
>>>
> 



More information about the dev mailing list