[ovs-dev] [PATCH v3] netdev-dpdk: add support for the RTE_ETH_EVENT_INTR_RESET event

Eelco Chaudron echaudro at redhat.com
Thu Oct 31 10:16:45 UTC 2019



On 26 Sep 2019, at 12:30, Ilya Maximets wrote:

> On 24.09.2019 16:15, Eelco Chaudron wrote:
>>
>>
>> On 12 Sep 2019, at 12:24, Ilya Maximets wrote:
>>
>>> On 12.09.2019 13:19, Ilya Maximets wrote:
>>>> On 12.09.2019 13:07, Eelco Chaudron wrote:
>>>>>
>>>>>
>>>>> On 12 Sep 2019, at 10:39, Ilya Maximets wrote:
>>>>>
>>>>>> On 11.09.2019 16:20, Eelco Chaudron wrote:
>>>>>>> Currently, OVS does not register and therefore not handle the
>>>>>>> interface reset event from the DPDK framework. This would cause 
>>>>>>> a
>>>>>>> problem in cases where a VF is used as an interface, and its
>>>>>>> configuration changes.
>>>>>>>
>>>>>>> As an example in the following scenario the MAC change is not
>>>>>>> detected/acted upon until OVS is restarted without the patch 
>>>>>>> applied:
>>>>>>>
>>>>>>>   $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
>>>>>>>   $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
>>>>>>>             set Interface dpdk0 type=dpdk -- \
>>>>>>>             set Interface dpdk0 
>>>>>>> options:dpdk-devargs=0000:05:0a.0
>>>>>>>
>>>>>>>   $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33
>>>>>>>
>>>>>>> Signed-off-by: Eelco Chaudron <echaudro at redhat.com>
>>>>>>> ---
>>>>>>
>>>>>> <snip>
>>>>>>
>>>>>>> @@ -4180,13 +4223,19 @@ netdev_dpdk_reconfigure(struct netdev 
>>>>>>> *netdev)
>>>>>>>          && dev->rxq_size == dev->requested_rxq_size
>>>>>>>          && dev->txq_size == dev->requested_txq_size
>>>>>>>          && dev->socket_id == dev->requested_socket_id
>>>>>>> -        && dev->started) {
>>>>>>> +        && dev->started && !dev->reset_needed) {
>>>>>>>          /* Reconfiguration is unnecessary */
>>>>>>>
>>>>>>>          goto out;
>>>>>>>      }
>>>>>>>
>>>>>>> -    rte_eth_dev_stop(dev->port_id);
>>>>>>> +    if (dev->reset_needed) {
>>>>>>> +        rte_eth_dev_reset(dev->port_id);
>>>>>>
>>>>>> Thinking more about the change and looking at flow control 
>>>>>> configuration,
>>>>>> it seems that on reset we'll lost configurations done in 
>>>>>> set_config().
>>>>>> Device reset should return it to initial state, i.e. all the 
>>>>>> default settings,
>>>>>> but set_config() will not be called after that.
>>>>>> I know, that VFs commonly doesn't support flow control, but if 
>>>>>> we'll add like
>>>>>> real configuration of MAC address, it will be lost too.
>>>>>>
>>>>>> What do you think?
>>>>>
>>>>> Doesn’t the full bridge run sequence take care of this?
>>>>>
>>>>> So in callback we do netdev_request_reconfigure() which triggers 
>>>>> the following sequence…
>>>>>
>>>>> bridge_run__()
>>>>>  ...
>>>>>    dpif_netdev_run()
>>>>>      ...
>>>>>        reconfigure_datapath()
>>>>>          ...
>>>>>            netdev_dpdk_reconfigure()
>>>>>
>>>>> But than also it will call in the next run
>>>>>
>>>>> bridge_run()
>>>>>  ...
>>>>>    bridge_delete_or_reconfigure_ports()
>>>>>      ...
>>>>>         netdev_set_config()
>>>>>           netdev_dpdk_set_config()
>>>>>
>>>>>
>>>>> Or do I miss something?
>>>>
>>>> dpif_netdev_run() is called from ofproto_type_run() which is called
>>>> unconditionally from the bridge_run().
>>>> But bridge_delete_or_reconfigure_ports() only called from 
>>>> bridge_reconfigure(),
>>>> which is protected by the following condition:
>>>>
>>>>     if (ovsdb_idl_get_seqno(idl) != idl_seqno ||
>>>>         if_notifier_changed(ifnotifier)) {
>>>>
>>>> i.e. if there will be no DB updates or there will be no netlink 
>>>> notifications,
>>>> set_config() will not be called. I understand that in your scenario 
>>>> there
>>>> might be netlink notification for interface changes since PF is 
>>>> controlled
>>>> by the kernel, but it might be not the case if PF attached as a 
>>>> DPDK port
>>>> to OVS or some other application.
>>>
>>> Hmm. Basically, dpdk_eth_event_callback() is an analogue of the
>>> if_notifier, but for DPDK application.
>>>
>>> Maybe we can add it as another trigger for above 'if' condition?
>>
>> It sounds like a good idea, however, I see no clean way of doing this 
>> as the> if_notifier_changed()/if_notifier_create() is not data path 
>> abstracted.
>> Any ideas for a nice clean solution here?
>> Guess we could add a dpdk_notifier_create() in if-notifier.c
>>
>> Thoughts?
>
> I'm not sure how the clean solution could look like, but I'd like to
> rise another issue closely related to this change.

I’ll send out a patch later today that will include a DPDK notifier as 
part of the if-notifier.c to take care of the set_config() part.

> The issue is an infinite re-addition of the failed ports.
> This happens if the device in userspace datapath has a linux network
> interface and it's not able to be configured. For example, if the 
> first
> reconfiguration fails because of misconfiguration.
> In current code victims are afxdp ports and the mellanox NIC ports
> opened by the DPDK due to their bifurcated drivers (It's unlikely for
> usual netdev-linux ports to fail).
>
> The root cause: Every change in the state of the network interface
> of a linux kernel device generates if-notifier event and if-notifier
> event triggers the OVS code to re-apply the configuration of ports,
> i.e. add broken ports back. The most obvious part is that dpif-netdev
> changes the device flags before trying to configure it:
>
>    1. add_port()
>    2. set_flags() --> if-notifier event
>    3. reconfigure() --> port removal from the datapath due to
>                         misconfiguration or any other issue in
>                         the underlying device.
>    4. setting flags back --> another if-notifier event.
>    5. There was new if-notifier event?
>       yes --> re-apply all settings. --> goto step 1.
>
> We could move applying the netdev flags after the first successful
> reconfiguration and this might fix part of the problem, but many
> changes that we're applying on the reconfiguration phase could
> generate if-notifier events too and we can't easily avoid that.
>
> I think, if we'll add a dpdk_notifier, we'll have same issue for
> the rest of DPDK port types.
>
> Any thoughts?


I’ve been thinking about this, but see no fix-all solution right now, 
it needs a bit more thinking.

As this is not something introduced by this patch, I’ll work on it 
separately. I’ve not been able to trigger it in the DPDK case, 
however, I can easily in the AF_XDP case.

//Eelco



More information about the dev mailing list