[ovs-dev] [PATCH v3 2/2] netdev-dpdk: Add option to configure VF MAC address.

Ilya Maximets i.maximets at ovn.org
Tue Oct 27 14:03:58 UTC 2020


On 10/20/20 1:15 PM, Kevin Traynor wrote:
> On 16/09/2020 16:17, Gaetan Rivet wrote:
>> In some cloud topologies, using DPDK VF representors in guest requires
>> configuring a VF before it is assigned to the guest.
>>
>> A first basic option for such configuration is setting the VF MAC
>> address. Add a key 'dpdk-vf-mac' to the 'options' column of the Interface
>> table.
>>
>> This option can be used as such:
>>
>>    $ ovs-vsctl add-port br0 dpdk-rep0 -- set Interface dpdk-rep0 type=dpdk \
>>       options:dpdk-vf-mac=00:11:22:33:44:55
>>
> 
> If I got it right, it doesn't seem like it solves any issues by limiting
> to representor, but is just an attempt to limit some of the edge cases
> around bifurcated devices to the devices where the requested
> functionality is really needed now.
> 
> This limitation has some costs...
> 
> We will now have, mac, mac_in_use, dpdk-vf-mac, configured_hwaddr,
> requested_hwaddr for the user to understand. I think it can be confusing.

To be clear, 'configured_hwaddr/requested_hwaddr' type of report should not
exist at least in the output of 'get_config()' function.  Same applies
to all existing '{configured,requested}_something' reports.
'get_config()' intended to report things that could be copied and passed directly
to database.  OTOH, 'get_status()' should report what is an actual device
status, i.e. 'dpdk-vf-mac' should be in 'get_config()' and 'configured_hwaddr'
should be in 'get_status()', but it should be named differently, e.g. just
'dpdk-vf-mac' and the callback it placed in already says that it's actually
'configured'.  In short:
  - 'dpdk-vf-mac' from the 'get_config()' should report dev->requested_hwaddr.
  - 'dpdk-vf-mac' from the 'get_status()' should report dev->hwaddr.
Of course, all this should be reported only if device is a representor.


> 
> 'ovs-vsctl show' will show the dpdk-vf-mac, which is the main one we
> document, but it's not clear if it was successful or not. (yes there is
> a log entry, but it is a separate place).
> 
> As it is specific for representors, if we ever want to allow setting of
> mac on any non representor DPDK ports, we have an awkward user
> interface. We would need to make another 'dpdk-mac' type option, or we
> decide then to allow use mac in interface table but that it doesn't work
> for representors, or there are two ways to set for representors etc.
> None of this seems great.
> 
> My feeling is that it is making things complicated for the user with
> this many knobs, where we could just have mac and mac_in_use, live with
> the edge cases (as Gaetan pointed out, we already do for MTU) and we
> know as well if we ever need to extend further, the user interface will
> stay simple. Just my 2C, maybe there's a good argument why we can't do
> it like this.

Setting mac for physical device has lots of split-brain issues and unclear
ways how and when undo changes as we do not know what was original mac or
what to do if there was no mac at all at the start.  Moreover, some devices
will not allow setting mac to it's original value (e.g. all zeroes).
One more point is that vf and representor itself are different devices and
these devices could have different mac addresses.  For now we agreed that
DPDK api works for representors in a way that it always changes parameters
of a virtual function, not representors', but that is not a very clear thing
for all users (e.g. for DPDK developers), so 'vf' in 'dpdk-vf-mac' specifically
points to that difference.

After a long discussion we decided to make this option as specific as possible,
so users will not try to use it to configure anything else beside mac of a
virtual function.  All restrictions and downsides of configuration of vf mac
was/should be documented.


> 
> Few comments on the code as it is now below.
> 
> 
>> Signed-off-by: Gaetan Rivet <grive at u256.net>
>> Suggested-by: Ilya Maximets <i.maximets at ovn.org>
>> Acked-by: Eli Britstein <elibr at nvidia.com>
>> ---
>>  Documentation/topics/dpdk/phy.rst | 22 ++++++++++++
>>  NEWS                              |  2 ++
>>  lib/netdev-dpdk.c                 | 60 +++++++++++++++++++++++++++++++
>>  vswitchd/vswitch.xml              | 13 +++++++
>>  4 files changed, 97 insertions(+)
>>
>> diff --git a/Documentation/topics/dpdk/phy.rst b/Documentation/topics/dpdk/phy.rst
>> index 38e52c8de..73dcb7c28 100644
>> --- a/Documentation/topics/dpdk/phy.rst
>> +++ b/Documentation/topics/dpdk/phy.rst
>> @@ -379,6 +379,28 @@ an eth device whose mac address is ``00:11:22:33:44:55``::
>>      $ ovs-vsctl add-port br0 dpdk-mac -- set Interface dpdk-mac type=dpdk \
>>         options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55"
>>  
>> +Representor specific configuration
>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +In some topologies, a VF must be configured before being assigned to a
>> +guest (VM) machine.  This configuration is done through VF-specific fields
>> +in the ``options`` column of the Interface table.
>> +
>> +.. important::
>> +
>> +   Some DPDK port use `bifurcated drivers <bifurcated-drivers>`__,
>> +   which means that a kernel netdevice remains when Open vSwitch is stopped.
>> +
>> +   In such case, any configuration applied to a VF would remain set on the
>> +   kernel netdevice, and be inherited from it when Open vSwitch is restarted,
>> +   even if the options described in this section are unset from Open vSwitch.
>> +
>> +.. _bifurcated-drivers: http://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#bifurcated-driver
>> +
>> +- Configure the VF MAC address::
>> +
>> +    $ ovs-vsctl set Interface dpdk-rep0 options:dpdk-vf-mac=00:11:22:33:44:55
>> +
> 
> As mentioned above, if for some reason setting the mac fails there will
> be warning in the logs, but 'ovs-vsctl show' will still show the
> dpdk-vf-mac that was not applied and it's quite prominent. Also, there
> are mac, mac_in_use, dpdk-vf-mac, configured_hwaddr, requested_hwaddr.
> 
> Suggest to add a small summary here for each one as it relates
> representors and make clear to users,
> 1. how to find the mac the user requested
> 2. how to find the current mac
> 
>>  Jumbo Frames
>>  ------------
>>  
>> diff --git a/NEWS b/NEWS
>> index 2f67d5047..5ece29474 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -24,6 +24,8 @@ v2.14.0 - 17 Aug 2020
> 
> 2.15....hopefully :-)
> 
>>         to list and change log levels in DPDK components.
>>       * Vhost-user Dequeue zero-copy support is deprecated and will be removed
>>         in the next release.
>> +     * New 'options:dpdk-vf-mac' field for DPDK interface of VF ports,
>> +       that allows configuring the MAC address of a VF representor.
>>     - Linux datapath:
>>       * Support for kernel versions up to 5.5.x.
>>     - AF_XDP:
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index 4ef7f78a6..060348369 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -523,6 +523,9 @@ struct netdev_dpdk {
>>           * otherwise interrupt mode is used. */
>>          bool requested_lsc_interrupt_mode;
>>          bool lsc_interrupt_mode;
>> +
>> +        /* VF configuration. */
>> +        struct eth_addr requested_hwaddr;
>>      );
>>  
>>      PADDED_MEMBERS(CACHE_LINE_SIZE,
>> @@ -1685,6 +1688,16 @@ out:
>>      return ret;
>>  }
>>  
>> +static bool
>> +dpdk_port_is_representor(struct netdev_dpdk *dev)
>> +    OVS_REQUIRES(dev->mutex)
>> +{
>> +    struct rte_eth_dev_info dev_info;
>> +
>> +    rte_eth_dev_info_get(dev->port_id, &dev_info);
>> +    return (*dev_info.dev_flags) & RTE_ETH_DEV_REPRESENTOR;
>> +}
>> +
>>  static int
>>  netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args)
>>  {
>> @@ -1719,6 +1732,13 @@ netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args)
>>          }
>>          smap_add(args, "lsc_interrupt_mode",
>>                   dev->lsc_interrupt_mode ? "true" : "false");
>> +
>> +        if (dpdk_port_is_representor(dev)) {
>> +            smap_add_format(args, "requested_hwaddr", ETH_ADDR_FMT,
>> +                            ETH_ADDR_ARGS(dev->requested_hwaddr));
>> +            smap_add_format(args, "configured_hwaddr", ETH_ADDR_FMT,
>> +                            ETH_ADDR_ARGS(dev->hwaddr));
> 
> '*mac*' is used elsewhere (mac, mac_in_use, dpdk-vf-mac), think it's
> better to keep consistent

Since this is 'config' it should report things in a way they could be passed
to the datbase.  So, it should be 'dpdk-vf-mac' and it should report the
value that was requested, i.e. dev->requested_hwaddr.

> 
>> +        }
>>      }
>>      ovs_mutex_unlock(&dev->mutex);
>>  
>> @@ -1898,6 +1918,7 @@ netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args,
>>          {RTE_FC_RX_PAUSE, RTE_FC_FULL    }
>>      };
>>      const char *new_devargs;
>> +    const char *vf_mac;
>>      int err = 0;
>>  
>>      ovs_mutex_lock(&dpdk_mutex);
>> @@ -1968,6 +1989,36 @@ netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args,
>>          goto out;
>>      }
>>  
>> +    vf_mac = smap_get(args, "dpdk-vf-mac");
>> +    if (vf_mac) {
>> +        struct eth_addr mac;
>> +
>> +        err = EINVAL;
>> +
>> +        if (!dpdk_port_is_representor(dev)) {
>> +            VLOG_ERR_BUF(errp, "'%s' is trying to set the VF MAC '%s' "
>> +                         "but 'options:dpdk-vf-mac' is only supported for "
>> +                         "VF representors.",
>> +                         netdev_get_name(netdev), vf_mac);
> 
> Warnings here seems more in line with the other configs that cannot be
> applied.

I agree, if it's not a representor we could just issue a warning and continue.

> 
>> +            goto out;
>> +        }
>> +        if (!eth_addr_from_string(vf_mac, &mac)) {
>> +            VLOG_ERR_BUF(errp, "interface '%s': cannot parse VF MAC '%s'",
>> +                         netdev_get_name(netdev), vf_mac);
> 
> as above
> 
>> +            goto out;
>> +        }
>> +        if (eth_addr_is_multicast(mac)) {
>> +            VLOG_ERR_BUF(errp,
>> +                         "interface '%s': cannot set VF MAC to multicast "
>> +                         "address", netdev_get_name(netdev));
> 
> as above
> 
>> +            goto out;
>> +        }
>> +        if (!eth_addr_equals(dev->requested_hwaddr, mac)) {
>> +            dev->requested_hwaddr = mac;
>> +            netdev_request_reconfigure(netdev);
>> +        }
>> +    }
>> +
>>      lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", false);
>>      if (dev->requested_lsc_interrupt_mode != lsc_interrupt_mode) {
>>          dev->requested_lsc_interrupt_mode = lsc_interrupt_mode;
>> @@ -4938,6 +4989,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>>          && dev->lsc_interrupt_mode == dev->requested_lsc_interrupt_mode
>>          && dev->rxq_size == dev->requested_rxq_size
>>          && dev->txq_size == dev->requested_txq_size
>> +        && eth_addr_equals(dev->hwaddr, dev->requested_hwaddr)
>>          && dev->socket_id == dev->requested_socket_id
>>          && dev->started && !dev->reset_needed) {
>>          /* Reconfiguration is unnecessary */
>> @@ -4969,6 +5021,14 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>>      dev->txq_size = dev->requested_txq_size;
>>  
>>      rte_free(dev->tx_q);
>> +
>> +    if (!eth_addr_equals(dev->hwaddr, dev->requested_hwaddr)) {
> 
> Looks like if dpdk-vf-mac is not set, after initialization there will be
> a configured hwaddr, and requested_hwaddr will be all zeros.
> 
> If this function is called because of some other config change, then
> hwaddr !=  requested_hwaddr, and it will attempt to set the mac to the
> requested_hwaddr of all zeros. This may also fail causing the
> reconfigure to fail for other configs items.
> 
> (I admit, I hacked a phy port dev_info to say it was a representor, when
> playing with this but the issue seems valid)
> 
> 
>> +        err = netdev_dpdk_set_etheraddr__(dev, dev->requested_hwaddr);
>> +        if (err) {
>> +            goto out;
>> +        }
>> +    }
>> +
>>      err = dpdk_eth_dev_init(dev);
>>      if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
>>          netdev->ol_flags |= NETDEV_TX_OFFLOAD_TCP_TSO;
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index 81c84927f..c0bf03c95 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -3280,6 +3280,19 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>>            descriptors will be used by default.
>>          </p>
>>        </column>
>> +
>> +      <column name="options" key="dpdk-vf-mac">
>> +        <p>
>> +          Ethernet address to set for this VF interface.  If unset then the
>> +          default MAC address is used:
>> +        </p>
>> +        <ul>
>> +          <li>For most drivers, the default MAC address assigned by
>> +          their hardware.</li><li>For bifurcated drivers, the MAC currently
>> +          used by the kernel netdevice.</li>
> 
> nit: the formatting is a little different compared to the other
> <li></li> usage, newline/indent etc.
> 
> thanks,
> Kevin.
> 
>> +        </ul>
>> +        <p>This option may only be used with dpdk VF representors.</p>
>> +      </column>
>>      </group>
>>  
>>      <group title="EMC (Exact Match Cache) Configuration">
>>
> 
> 
> 
> 
> 



More information about the dev mailing list