[ovs-dev] [PATCH v3] netdev-dpdk: add support for the RTE_ETH_EVENT_INTR_RESET event

Ilya Maximets i.maximets at ovn.org
Thu Sep 26 10:30:53 UTC 2019


On 24.09.2019 16:15, Eelco Chaudron wrote:
> 
> 
> On 12 Sep 2019, at 12:24, Ilya Maximets wrote:
> 
>> On 12.09.2019 13:19, Ilya Maximets wrote:
>>> On 12.09.2019 13:07, Eelco Chaudron wrote:
>>>>
>>>>
>>>> On 12 Sep 2019, at 10:39, Ilya Maximets wrote:
>>>>
>>>>> On 11.09.2019 16:20, Eelco Chaudron wrote:
>>>>>> Currently, OVS does not register and therefore not handle the
>>>>>> interface reset event from the DPDK framework. This would cause a
>>>>>> problem in cases where a VF is used as an interface, and its
>>>>>> configuration changes.
>>>>>>
>>>>>> As an example in the following scenario the MAC change is not
>>>>>> detected/acted upon until OVS is restarted without the patch applied:
>>>>>>
>>>>>>   $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
>>>>>>   $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
>>>>>>             set Interface dpdk0 type=dpdk -- \
>>>>>>             set Interface dpdk0 options:dpdk-devargs=0000:05:0a.0
>>>>>>
>>>>>>   $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33
>>>>>>
>>>>>> Signed-off-by: Eelco Chaudron <echaudro at redhat.com>
>>>>>> ---
>>>>>
>>>>> <snip>
>>>>>
>>>>>> @@ -4180,13 +4223,19 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>>>>>>          && dev->rxq_size == dev->requested_rxq_size
>>>>>>          && dev->txq_size == dev->requested_txq_size
>>>>>>          && dev->socket_id == dev->requested_socket_id
>>>>>> -        && dev->started) {
>>>>>> +        && dev->started && !dev->reset_needed) {
>>>>>>          /* Reconfiguration is unnecessary */
>>>>>>
>>>>>>          goto out;
>>>>>>      }
>>>>>>
>>>>>> -    rte_eth_dev_stop(dev->port_id);
>>>>>> +    if (dev->reset_needed) {
>>>>>> +        rte_eth_dev_reset(dev->port_id);
>>>>>
>>>>> Thinking more about the change and looking at flow control configuration,
>>>>> it seems that on reset we'll lost configurations done in set_config().
>>>>> Device reset should return it to initial state, i.e. all the default settings,
>>>>> but set_config() will not be called after that.
>>>>> I know, that VFs commonly doesn't support flow control, but if we'll add like
>>>>> real configuration of MAC address, it will be lost too.
>>>>>
>>>>> What do you think?
>>>>
>>>> Doesn’t the full bridge run sequence take care of this?
>>>>
>>>> So in callback we do netdev_request_reconfigure() which triggers the following sequence…
>>>>
>>>> bridge_run__()
>>>>  ...
>>>>    dpif_netdev_run()
>>>>      ...
>>>>        reconfigure_datapath()
>>>>          ...
>>>>            netdev_dpdk_reconfigure()
>>>>
>>>> But than also it will call in the next run
>>>>
>>>> bridge_run()
>>>>  ...
>>>>    bridge_delete_or_reconfigure_ports()
>>>>      ...
>>>>         netdev_set_config()
>>>>           netdev_dpdk_set_config()
>>>>
>>>>
>>>> Or do I miss something?
>>>
>>> dpif_netdev_run() is called from ofproto_type_run() which is called
>>> unconditionally from the bridge_run().
>>> But bridge_delete_or_reconfigure_ports() only called from bridge_reconfigure(),
>>> which is protected by the following condition:
>>>
>>>     if (ovsdb_idl_get_seqno(idl) != idl_seqno ||
>>>         if_notifier_changed(ifnotifier)) {
>>>
>>> i.e. if there will be no DB updates or there will be no netlink notifications,
>>> set_config() will not be called. I understand that in your scenario there
>>> might be netlink notification for interface changes since PF is controlled
>>> by the kernel, but it might be not the case if PF attached as a DPDK port
>>> to OVS or some other application.
>>
>> Hmm. Basically, dpdk_eth_event_callback() is an analogue of the
>> if_notifier, but for DPDK application.
>>
>> Maybe we can add it as another trigger for above 'if' condition?
> 
> It sounds like a good idea, however, I see no clean way of doing this as the> if_notifier_changed()/if_notifier_create() is not data path abstracted.
> Any ideas for a nice clean solution here?
> Guess we could add a dpdk_notifier_create() in if-notifier.c
> 
> Thoughts?

I'm not sure how the clean solution could look like, but I'd like to
rise another issue closely related to this change.

The issue is an infinite re-addition of the failed ports.
This happens if the device in userspace datapath has a linux network
interface and it's not able to be configured. For example, if the first
reconfiguration fails because of misconfiguration.
In current code victims are afxdp ports and the mellanox NIC ports
opened by the DPDK due to their bifurcated drivers (It's unlikely for
usual netdev-linux ports to fail).

The root cause: Every change in the state of the network interface
of a linux kernel device generates if-notifier event and if-notifier
event triggers the OVS code to re-apply the configuration of ports,
i.e. add broken ports back. The most obvious part is that dpif-netdev
changes the device flags before trying to configure it:

    1. add_port()
    2. set_flags() --> if-notifier event
    3. reconfigure() --> port removal from the datapath due to
                         misconfiguration or any other issue in
                         the underlying device.
    4. setting flags back --> another if-notifier event.
    5. There was new if-notifier event?
       yes --> re-apply all settings. --> goto step 1.

We could move applying the netdev flags after the first successful
reconfiguration and this might fix part of the problem, but many
changes that we're applying on the reconfiguration phase could
generate if-notifier events too and we can't easily avoid that.

I think, if we'll add a dpdk_notifier, we'll have same issue for
the rest of DPDK port types.

Any thoughts?

BTW, I'm not against the dpdk_notifier and I think that it's a
right direction to move, but it's the right place and time to
highlight the issue we have.

Best regards, Ilya Maximets.


More information about the dev mailing list