[ovs-dev] [PATCH v3] netdev-dpdk: add support for the RTE_ETH_EVENT_INTR_RESET event

Ilya Maximets i.maximets at ovn.org
Thu Oct 31 10:49:44 UTC 2019


On 31.10.2019 11:16, Eelco Chaudron wrote:
> 
> 
> On 26 Sep 2019, at 12:30, Ilya Maximets wrote:
> 
>> On 24.09.2019 16:15, Eelco Chaudron wrote:
>>>
>>>
>>> On 12 Sep 2019, at 12:24, Ilya Maximets wrote:
>>>
>>>> On 12.09.2019 13:19, Ilya Maximets wrote:
>>>>> On 12.09.2019 13:07, Eelco Chaudron wrote:
>>>>>>
>>>>>>
>>>>>> On 12 Sep 2019, at 10:39, Ilya Maximets wrote:
>>>>>>
>>>>>>> On 11.09.2019 16:20, Eelco Chaudron wrote:
>>>>>>>> Currently, OVS does not register and therefore not handle the
>>>>>>>> interface reset event from the DPDK framework. This would cause a
>>>>>>>> problem in cases where a VF is used as an interface, and its
>>>>>>>> configuration changes.
>>>>>>>>
>>>>>>>> As an example in the following scenario the MAC change is not
>>>>>>>> detected/acted upon until OVS is restarted without the patch applied:
>>>>>>>>
>>>>>>>>   $ echo 1 > /sys/bus/pci/devices/0000:05:00.1/sriov_numvfs
>>>>>>>>   $ ovs-vsctl add-port ovs_pvp_br0 dpdk0 -- \
>>>>>>>>             set Interface dpdk0 type=dpdk -- \
>>>>>>>>             set Interface dpdk0 options:dpdk-devargs=0000:05:0a.0
>>>>>>>>
>>>>>>>>   $ ip link set p5p2 vf 0 mac 52:54:00:92:d3:33
>>>>>>>>
>>>>>>>> Signed-off-by: Eelco Chaudron <echaudro at redhat.com>
>>>>>>>> ---
>>>>>>>
>>>>>>> <snip>
>>>>>>>
>>>>>>>> @@ -4180,13 +4223,19 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>>>>>>>>          && dev->rxq_size == dev->requested_rxq_size
>>>>>>>>          && dev->txq_size == dev->requested_txq_size
>>>>>>>>          && dev->socket_id == dev->requested_socket_id
>>>>>>>> -        && dev->started) {
>>>>>>>> +        && dev->started && !dev->reset_needed) {
>>>>>>>>          /* Reconfiguration is unnecessary */
>>>>>>>>
>>>>>>>>          goto out;
>>>>>>>>      }
>>>>>>>>
>>>>>>>> -    rte_eth_dev_stop(dev->port_id);
>>>>>>>> +    if (dev->reset_needed) {
>>>>>>>> +        rte_eth_dev_reset(dev->port_id);
>>>>>>>
>>>>>>> Thinking more about the change and looking at flow control configuration,
>>>>>>> it seems that on reset we'll lost configurations done in set_config().
>>>>>>> Device reset should return it to initial state, i.e. all the default settings,
>>>>>>> but set_config() will not be called after that.
>>>>>>> I know, that VFs commonly doesn't support flow control, but if we'll add like
>>>>>>> real configuration of MAC address, it will be lost too.
>>>>>>>
>>>>>>> What do you think?
>>>>>>
>>>>>> Doesn’t the full bridge run sequence take care of this?
>>>>>>
>>>>>> So in callback we do netdev_request_reconfigure() which triggers the following sequence…
>>>>>>
>>>>>> bridge_run__()
>>>>>>  ...
>>>>>>    dpif_netdev_run()
>>>>>>      ...
>>>>>>        reconfigure_datapath()
>>>>>>          ...
>>>>>>            netdev_dpdk_reconfigure()
>>>>>>
>>>>>> But than also it will call in the next run
>>>>>>
>>>>>> bridge_run()
>>>>>>  ...
>>>>>>    bridge_delete_or_reconfigure_ports()
>>>>>>      ...
>>>>>>         netdev_set_config()
>>>>>>           netdev_dpdk_set_config()
>>>>>>
>>>>>>
>>>>>> Or do I miss something?
>>>>>
>>>>> dpif_netdev_run() is called from ofproto_type_run() which is called
>>>>> unconditionally from the bridge_run().
>>>>> But bridge_delete_or_reconfigure_ports() only called from bridge_reconfigure(),
>>>>> which is protected by the following condition:
>>>>>
>>>>>     if (ovsdb_idl_get_seqno(idl) != idl_seqno ||
>>>>>         if_notifier_changed(ifnotifier)) {
>>>>>
>>>>> i.e. if there will be no DB updates or there will be no netlink notifications,
>>>>> set_config() will not be called. I understand that in your scenario there
>>>>> might be netlink notification for interface changes since PF is controlled
>>>>> by the kernel, but it might be not the case if PF attached as a DPDK port
>>>>> to OVS or some other application.
>>>>
>>>> Hmm. Basically, dpdk_eth_event_callback() is an analogue of the
>>>> if_notifier, but for DPDK application.
>>>>
>>>> Maybe we can add it as another trigger for above 'if' condition?
>>>
>>> It sounds like a good idea, however, I see no clean way of doing this as the> if_notifier_changed()/if_notifier_create() is not data path abstracted.
>>> Any ideas for a nice clean solution here?
>>> Guess we could add a dpdk_notifier_create() in if-notifier.c
>>>
>>> Thoughts?
>>
>> I'm not sure how the clean solution could look like, but I'd like to
>> rise another issue closely related to this change.
> 
> I’ll send out a patch later today that will include a DPDK notifier as part of the if-notifier.c to take care of the set_config() part.
> 
>> The issue is an infinite re-addition of the failed ports.
>> This happens if the device in userspace datapath has a linux network
>> interface and it's not able to be configured. For example, if the first
>> reconfiguration fails because of misconfiguration.
>> In current code victims are afxdp ports and the mellanox NIC ports
>> opened by the DPDK due to their bifurcated drivers (It's unlikely for
>> usual netdev-linux ports to fail).
>>
>> The root cause: Every change in the state of the network interface
>> of a linux kernel device generates if-notifier event and if-notifier
>> event triggers the OVS code to re-apply the configuration of ports,
>> i.e. add broken ports back. The most obvious part is that dpif-netdev
>> changes the device flags before trying to configure it:
>>
>>    1. add_port()
>>    2. set_flags() --> if-notifier event
>>    3. reconfigure() --> port removal from the datapath due to
>>                         misconfiguration or any other issue in
>>                         the underlying device.
>>    4. setting flags back --> another if-notifier event.
>>    5. There was new if-notifier event?
>>       yes --> re-apply all settings. --> goto step 1.
>>
>> We could move applying the netdev flags after the first successful
>> reconfiguration and this might fix part of the problem, but many
>> changes that we're applying on the reconfiguration phase could
>> generate if-notifier events too and we can't easily avoid that.
>>
>> I think, if we'll add a dpdk_notifier, we'll have same issue for
>> the rest of DPDK port types.
>>
>> Any thoughts?
> 
> 
> I’ve been thinking about this, but see no fix-all solution right now, it needs a bit more thinking.
> 
> As this is not something introduced by this patch, I’ll work on it separately.
> I’ve not been able to trigger it in the DPDK case, however, I can easily in the AF_XDP case.

netdev-dpdk is more robust.  It validates most of the config options
and re-tries the configuration.  Configuration failures usually happens
because of the incorrect devargs or broken device/missing libs.

I'm working on making netdev-afxdp more error-proof, e.g. best-effort
xdp-mode choosing. Start from "drv+zc" and try just "drv" and subsequently
"skb" on failures. And we also need validation of n_rxq option to not
try to configure more than available.

Moving of flags' updating after the first reconfiguration solves
most of the obvious cases  with afxdp misconfigurations. I have
a patch for this. Will finalize it and send sometime soon.

Best regards, Ilya Maximets.


More information about the dev mailing list