[ovs-dev] [PATCH RFC v2] netdev-dpdk: Allow specification of index for PCI devices
Kevin Traynor
ktraynor at redhat.com
Fri Oct 20 09:01:43 UTC 2017
On 10/19/2017 07:27 PM, Loftus, Ciara wrote:
>>
>> On 10/17/2017 11:48 AM, Ciara Loftus wrote:
>>> Some NICs have only one PCI address associated with multiple ports. This
>>> patch extends the dpdk-devargs option's format to cater for such
>>> devices. Whereas before only one of N ports associated with one PCI
>>> address could be added, now N can.
>>>
>>> This patch allows for the following use of the dpdk-devargs option:
>>>
>>> ovs-vsctl set Interface myport options:dpdk-devargs=0000:06:00.0,X
>>>
>>> Where X is an unsigned integer representing one of multiple ports
>>> associated with the PCI address provided.
>>>
>>> This patch has not been tested.
>>>
>>> Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
>>> ---
>>> v2:
>>> * Simplify function to find port ID of indexed device
>>> * Hotplug compatibility
>>> * Detach compatibility
>>> * Documentation
>>> * Use strtol instead of atoi
>>>
>>> Documentation/howto/dpdk.rst | 9 +++++
>>> Documentation/intro/install/dpdk.rst | 9 +++++
>>> NEWS | 2 ++
>>> lib/netdev-dpdk.c | 67 ++++++++++++++++++++++++++++++---
>> ---
>>> vswitchd/vswitch.xml | 11 ++++--
>>> 5 files changed, 85 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/Documentation/howto/dpdk.rst
>> b/Documentation/howto/dpdk.rst
>>> index d123819..9447b71 100644
>>> --- a/Documentation/howto/dpdk.rst
>>> +++ b/Documentation/howto/dpdk.rst
>>> @@ -48,6 +48,15 @@ number of dpdk devices found in the log file::
>>> $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
>>> options:dpdk-devargs=0000:01:00.1
>>>
>>> +If your PCI device has multiple ports under the same PCI ID, you can use
>> the
>>> +following notation to indicate the specific device you wish to add::
>>> +
>>> + $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
>>> + options:dpdk-devargs=0000:01:00.0,0
>>> +
>>> +The above would attempt to use the first device (0) associated with that
>> PCI
>>> +ID. Use ,1 ,2 etc. to access the next.
>>> +
>>> After the DPDK ports get added to switch, a polling thread continuously
>> polls
>>> DPDK devices and consumes 100% of the core, as can be checked from
>> ``top`` and
>>> ``ps`` commands::
>>> diff --git a/Documentation/intro/install/dpdk.rst
>> b/Documentation/intro/install/dpdk.rst
>>> index bb69ae5..d0e87f5 100644
>>> --- a/Documentation/intro/install/dpdk.rst
>>> +++ b/Documentation/intro/install/dpdk.rst
>>> @@ -271,6 +271,15 @@ ports. For example, to create a userspace bridge
>> named ``br0`` and add two
>>> DPDK devices will not be available for use until a valid dpdk-devargs is
>>> specified.
>>>
>>> +If your PCI device has multiple ports under the same PCI ID, you can use
>> the
>>> +following notation to indicate the specific device you wish to add::
>>> +
>>> + $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
>>> + options:dpdk-devargs=0000:01:00.0,0
>>> +
>>> +The above would attempt to use the first device (0) associated with that
>> PCI
>>> +ID. Use ,1 ,2 etc. to access the next.
>>> +
>>> Refer to ovs-vsctl(8) and :doc:`/howto/dpdk` for more details.
>>>
>>> Performance Tuning
>>> diff --git a/NEWS b/NEWS
>>> index 75f5fa5..ca885a6 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -5,6 +5,8 @@ Post-v2.8.0
>>> chassis "hostname" in addition to a chassis "name".
>>> - Linux kernel 4.13
>>> * Add support for compiling OVS with the latest Linux 4.13 kernel
>>> + - DPDK:
>>> + * Support for adding devices with multiple DPDK ports under one PCI
>> ID.
>>>
>>> v2.8.0 - 31 Aug 2017
>>> - ovs-ofctl:
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index c60f46f..35e15da 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -1214,16 +1214,40 @@
>> netdev_dpdk_lookup_by_port_id(dpdk_port_t port_id)
>>> }
>>>
>>> static dpdk_port_t
>>> +dpdk_get_port_by_name_with_index(char *name, int idx, int base_id)
>>> +{
>>> + struct rte_eth_dev_info dev_info;
>>> + char pci_addr[PCI_PRI_STR_SIZE];
>>> + dpdk_port_t offset_port_id = base_id + idx;
>>> +
>>> + if (rte_eth_dev_is_valid_port(offset_port_id)) {
>>> + rte_eth_dev_info_get(offset_port_id, &dev_info);
>>> + rte_pci_device_name(&dev_info.pci_dev->addr, pci_addr,
>>> + sizeof(pci_addr));
>>> + if (!strcmp(pci_addr, name)) {
>>> + return offset_port_id;
>>> + }
>>> + }
>>> +
>>> + VLOG_INFO("Invalid PCI offset %i for device %s", idx, name);
>>> +
>>> + return DPDK_ETH_PORT_ID_INVALID;
>>> +}
>>> +
>>> +static dpdk_port_t
>>> netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
>>> const char *devargs, char **errp)
>>> {
>>> - /* Get the name up to the first comma. */
>>> - char *name = xmemdup0(devargs, strcspn(devargs, ","));
>>> + char *devargs_copy = xmemdup0((devargs), strlen(devargs));
>>> + char *name, *idx_str;
>>> + unsigned idx;
>>> dpdk_port_t new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>>
>>> + name = strtok_r(devargs_copy, ",", &devargs_copy);
>>> + idx_str = strtok_r(devargs_copy, ",", &devargs_copy);
>>> +
>>> if (!rte_eth_dev_count()
>>> - || rte_eth_dev_get_port_by_name(name, &new_port_id)
>>> - || !rte_eth_dev_is_valid_port(new_port_id)) {
>>> + || rte_eth_dev_get_port_by_name(name, &new_port_id)) {
>>> /* Device not found in DPDK, attempt to attach it */
>>> if (!rte_eth_dev_attach(devargs, &new_port_id)) {
>>
>> I think you should strip out the ",X" for attach. Although looking at
>> the DPDK code it would probably ignore the ",X" once the pci address is
>> ok, so maybe it was intentional.
>
> I left this in intentionally. Vdev devargs often include commas and the information following them is important. Eg.
> net_pcap0,tx_pcap=/tmp/file_tx.pcap
>
True for vdev, but I was thinking of the pci case. If you are
distinguishing between pci and vdev because of the "<pci>,<int>" vs
"<vdev>,<string>" issue below then maybe you could also strip out the
",<int>" for pci.
> But actually this will cause a failure later when we assume those arguments are an integer index and try to process them. That needs to be updated in v3.
>
>>
>>> /* Attach successful */
>>> @@ -1232,12 +1256,21 @@ netdev_dpdk_process_devargs(struct
>> netdev_dpdk *dev,
>>> } else {
>>> /* Attach unsuccessful */
>>> new_port_id = DPDK_ETH_PORT_ID_INVALID;
>>> - VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
>>> - devargs);
>>> + goto out;
>>> }
>>> }
>>>
>>> - free(name);
>>> + if (idx_str) {
>>> + idx = strtol(idx_str, NULL, 10);
>>> + new_port_id = dpdk_get_port_by_name_with_index(name, idx,
>> new_port_id);
>>> + }
>>> +
>>> +out:
>>> + if (new_port_id == DPDK_ETH_PORT_ID_INVALID) {
>>> + VLOG_WARN_BUF(errp, "Error attaching device '%s' to DPDK",
>>> + devargs);
>>> + }
>>> +
>>
>> you need to free devargs_copy here?
>
> Yes. Will put this in v3.
>
>>
>>> return new_port_id;
>>> }
>>>
>>> @@ -2549,14 +2582,28 @@ netdev_dpdk_detach(struct unixctl_conn
>> *conn, int argc OVS_UNUSED,
>>> {
>>> int ret;
>>> char *response;
>>> - uint8_t port_id;
>>> char devname[RTE_ETH_NAME_MAX_LEN];
>>> struct netdev_dpdk *dev;
>>> + char *devname_copy = xmemdup0((argv[1]), strlen(argv[1]));
>>> + char *name, *idx_str;
>>> + int idx;
>>> + dpdk_port_t port_id;
>>>
>>> ovs_mutex_lock(&dpdk_mutex);
>>>
>>> - if (!rte_eth_dev_count() || rte_eth_dev_get_port_by_name(argv[1],
>>> - &port_id)) {
>>> + name = strtok_r(devname_copy, ",", &devname_copy);
>>> + idx_str = strtok_r(devname_copy, ",", &devname_copy);
>>> +
>>> + if (rte_eth_dev_count()) {
>>> + if (rte_eth_dev_get_port_by_name(argv[1], &port_id)) {
>>> + port_id = DPDK_ETH_PORT_ID_INVALID;
>>> + } else if (idx_str) {
>>> + idx = strtol(idx_str, NULL, 10);
>>> + port_id = dpdk_get_port_by_name_with_index(name, idx,
>> port_id);
>>> + }
>>> + }
>>> +
>>> + if (port_id == DPDK_ETH_PORT_ID_INVALID) {
>>> response = xasprintf("Device '%s' not found in DPDK", argv[1]);
>>> goto error;
>>> }
>>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>>> index 04c771f..7f503fb 100644
>>> --- a/vswitchd/vswitch.xml
>>> +++ b/vswitchd/vswitch.xml
>>> @@ -2608,12 +2608,17 @@ ovs-vsctl add-port br0 p0 -- set Interface p0
>> type=patch options:peer=p1 \
>>> <p>
>>> Specifies the PCI address associated with the port for physical
>>> devices, or the virtual driver to be used for the port when a virtual
>>> - PMD is intended to be used. For the latter, the argument string
>>> - typically takes the form of
>>> + PMD is intended to be used.
>>> + For physical devices, you may specify a single PCI ID like
>>> + <code>0000:06:00.0</code> or an ID followed by a comma and an
>> index
>>> + like <code>0000:06:00.0,1</code> for cases where multiple ports are
>>> + associated with a single PCI ID.
>>> + For virtual devices, the argument string typically takes the form of
>>> <code>eth_<var>driver_name</var><var>x</var></code>, where
>>> <var>driver_name</var> is a valid virtual DPDK PMD driver name and
>>> <var>x</var> is a unique identifier of your choice for the given
>>> - port. Only supported by the dpdk port type.
>>> + port.
>>> + Only supported by the dpdk port type.
>>
>> Spurious
>
> Will remove in v3.
>
> Thanks,
> Ciara
>
>>
>>> </p>
>>> </column>
>>>
>>>
>
More information about the dev
mailing list