[ovs-dev] [PATCH v3] netdev-dpdk: add support for the RTE_ETH_EVENT_INTR_RESET event
Eelco Chaudron
echaudro at redhat.com
Thu Oct 31 10:16:45 UTC 2019
On 26 Sep 2019, at 12:30, Ilya Maximets wrote:
> On 24.09.2019 16:15, Eelco Chaudron wrote:
>>
>>
>> On 12 Sep 2019, at 12:24, Ilya Maximets wrote:
>>
>>> On 12.09.2019 13:19, Ilya Maximets wrote:
>>>> On 12.09.2019 13:07, Eelco Chaudron wrote:
>>>>>
>>>>>
>>>>> On 12 Sep 2019, at 10:39, Ilya Maximets wrote:
>>>>>
>>>>>> On 11.09.2019 16:20, Eelco Chaudron wrote:
>>>>>>> Currently, OVS does not register and therefore not handle the
>>>>>>> interface reset event from the DPDK framework. This would cause
>>>>>>> a
>>>>>>> problem in cases where a VF is used as an interface, and its
>>>>>>> configuration changes.
>>>>>>>
>>>>>>> As an example in the following scenario the MAC change is not
>>>>>>> detected/acted upon until OVS is restarted without the patch
>>>>>>> applied:
>>>>>>>
>>>>>>> $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
>>>>>>> $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
>>>>>>> set Interface dpdk0 type=dpdk -- \
>>>>>>> set Interface dpdk0
>>>>>>> options:dpdk-devargs=0000:05:0a.0
>>>>>>>
>>>>>>> $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33
>>>>>>>
>>>>>>> Signed-off-by: Eelco Chaudron <echaudro at redhat.com>
>>>>>>> ---
>>>>>>
>>>>>> <snip>
>>>>>>
>>>>>>> @@ -4180,13 +4223,19 @@ netdev_dpdk_reconfigure(struct netdev
>>>>>>> *netdev)
>>>>>>> && dev->rxq_size == dev->requested_rxq_size
>>>>>>> && dev->txq_size == dev->requested_txq_size
>>>>>>> && dev->socket_id == dev->requested_socket_id
>>>>>>> - && dev->started) {
>>>>>>> + && dev->started && !dev->reset_needed) {
>>>>>>> /* Reconfiguration is unnecessary */
>>>>>>>
>>>>>>> goto out;
>>>>>>> }
>>>>>>>
>>>>>>> - rte_eth_dev_stop(dev->port_id);
>>>>>>> + if (dev->reset_needed) {
>>>>>>> + rte_eth_dev_reset(dev->port_id);
>>>>>>
>>>>>> Thinking more about the change and looking at flow control
>>>>>> configuration,
>>>>>> it seems that on reset we'll lost configurations done in
>>>>>> set_config().
>>>>>> Device reset should return it to initial state, i.e. all the
>>>>>> default settings,
>>>>>> but set_config() will not be called after that.
>>>>>> I know, that VFs commonly doesn't support flow control, but if
>>>>>> we'll add like
>>>>>> real configuration of MAC address, it will be lost too.
>>>>>>
>>>>>> What do you think?
>>>>>
>>>>> Doesn’t the full bridge run sequence take care of this?
>>>>>
>>>>> So in callback we do netdev_request_reconfigure() which triggers
>>>>> the following sequence…
>>>>>
>>>>> bridge_run__()
>>>>> ...
>>>>> dpif_netdev_run()
>>>>> ...
>>>>> reconfigure_datapath()
>>>>> ...
>>>>> netdev_dpdk_reconfigure()
>>>>>
>>>>> But than also it will call in the next run
>>>>>
>>>>> bridge_run()
>>>>> ...
>>>>> bridge_delete_or_reconfigure_ports()
>>>>> ...
>>>>> netdev_set_config()
>>>>> netdev_dpdk_set_config()
>>>>>
>>>>>
>>>>> Or do I miss something?
>>>>
>>>> dpif_netdev_run() is called from ofproto_type_run() which is called
>>>> unconditionally from the bridge_run().
>>>> But bridge_delete_or_reconfigure_ports() only called from
>>>> bridge_reconfigure(),
>>>> which is protected by the following condition:
>>>>
>>>> if (ovsdb_idl_get_seqno(idl) != idl_seqno ||
>>>> if_notifier_changed(ifnotifier)) {
>>>>
>>>> i.e. if there will be no DB updates or there will be no netlink
>>>> notifications,
>>>> set_config() will not be called. I understand that in your scenario
>>>> there
>>>> might be netlink notification for interface changes since PF is
>>>> controlled
>>>> by the kernel, but it might be not the case if PF attached as a
>>>> DPDK port
>>>> to OVS or some other application.
>>>
>>> Hmm. Basically, dpdk_eth_event_callback() is an analogue of the
>>> if_notifier, but for DPDK application.
>>>
>>> Maybe we can add it as another trigger for above 'if' condition?
>>
>> It sounds like a good idea, however, I see no clean way of doing this
>> as the> if_notifier_changed()/if_notifier_create() is not data path
>> abstracted.
>> Any ideas for a nice clean solution here?
>> Guess we could add a dpdk_notifier_create() in if-notifier.c
>>
>> Thoughts?
>
> I'm not sure how the clean solution could look like, but I'd like to
> rise another issue closely related to this change.
I’ll send out a patch later today that will include a DPDK notifier as
part of the if-notifier.c to take care of the set_config() part.
> The issue is an infinite re-addition of the failed ports.
> This happens if the device in userspace datapath has a linux network
> interface and it's not able to be configured. For example, if the
> first
> reconfiguration fails because of misconfiguration.
> In current code victims are afxdp ports and the mellanox NIC ports
> opened by the DPDK due to their bifurcated drivers (It's unlikely for
> usual netdev-linux ports to fail).
>
> The root cause: Every change in the state of the network interface
> of a linux kernel device generates if-notifier event and if-notifier
> event triggers the OVS code to re-apply the configuration of ports,
> i.e. add broken ports back. The most obvious part is that dpif-netdev
> changes the device flags before trying to configure it:
>
> 1. add_port()
> 2. set_flags() --> if-notifier event
> 3. reconfigure() --> port removal from the datapath due to
> misconfiguration or any other issue in
> the underlying device.
> 4. setting flags back --> another if-notifier event.
> 5. There was new if-notifier event?
> yes --> re-apply all settings. --> goto step 1.
>
> We could move applying the netdev flags after the first successful
> reconfiguration and this might fix part of the problem, but many
> changes that we're applying on the reconfiguration phase could
> generate if-notifier events too and we can't easily avoid that.
>
> I think, if we'll add a dpdk_notifier, we'll have same issue for
> the rest of DPDK port types.
>
> Any thoughts?
I’ve been thinking about this, but see no fix-all solution right now,
it needs a bit more thinking.
As this is not something introduced by this patch, I’ll work on it
separately. I’ve not been able to trigger it in the DPDK case,
however, I can easily in the AF_XDP case.
//Eelco
More information about the dev
mailing list