[ovs-discuss] offloads are leaked when removing a port

Ilya Maximets i.maximets at ovn.org
Thu Feb 27 15:57:44 UTC 2020


On 2/27/20 4:20 PM, Eli Britstein wrote:
> Hello,
> 
> 
> I have 2 ports connected in a bridge, and traffic is going, offloaded (full).
> 
> Doing that, I remove one port "ovs-vsctl del-port pf".
> 
> Adding it back, traffic resumes, but OVS does not create the correct flows, as the rules were not deleted properly with del-port.
> 
> The issue is that:
> 
> In lib/dpif.c, in dpif_port_del, there is a call to dpif->dpif_class->port_del() and then netdev_ports_remove().
> 
> port_del function in this case is dpif_netdev_port_del (in lib/dpif-netdev.c) that calls do_del_port(). That does invoke removing the flows, but those are just requests to be handled by dp_netdev_flow_offload_main().
> 
> When dp_netdev_flow_offload_main() tries to handle the requests, in mark_to_flow_disassociate, the call to netdev_ports_get() (line 2295) fails, so netdev_flow_del() is not called.
> 
> 
> I tried to add a delay as below, and it resolves the issue. However, it looks bad to me.
> 
> Suggestions?

Yes, this is a known issue with the initial feature design
that has already been flagged a few times. (I keep forgetting
to write down the list of known issues in the documentation.)

The problem is that, according to the DPDK API, we *must* remove
all the flows from the device, but OVS doesn't do that.

A few things that should be implemented:
1. A flush() method for removing all the installed flows for a
   particular netdev.
2. dpif-netdev should start using the flush() method.
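
To make item 1 a bit more concrete, here is a rough sketch of what
a flush could look like. This is a toy model and not OVS code: the
table layout and every toy_* name are made up for illustration, and
a real implementation would have to go through the netdev offload
provider and the DPDK flow API instead of a flat array.

```c
/* Toy model only: none of these names exist in OVS. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_FLOWS 8

struct toy_flow {
    bool installed;   /* Rule is present in the (simulated) hardware. */
    int port_no;      /* Port the rule was offloaded to. */
};

struct toy_offload_table {
    struct toy_flow flows[TOY_MAX_FLOWS];
};

/* Installs a rule for 'port_no'.  Returns false if the table is full. */
static bool
toy_flow_put(struct toy_offload_table *t, int port_no)
{
    assert(t != NULL);
    for (size_t i = 0; i < TOY_MAX_FLOWS; i++) {
        if (!t->flows[i].installed) {
            t->flows[i].installed = true;
            t->flows[i].port_no = port_no;
            return true;
        }
    }
    return false;
}

/* Counts the rules currently installed on 'port_no'. */
static size_t
toy_flow_count(const struct toy_offload_table *t, int port_no)
{
    size_t n = 0;

    assert(t != NULL);
    for (size_t i = 0; i < TOY_MAX_FLOWS; i++) {
        if (t->flows[i].installed && t->flows[i].port_no == port_no) {
            n++;
        }
    }
    return n;
}

/* Hypothetical flush(): removes every rule installed on 'port_no' in
 * one pass and returns how many rules were removed.  Rules that belong
 * to other ports are left untouched. */
static size_t
toy_flow_flush(struct toy_offload_table *t, int port_no)
{
    size_t removed = 0;

    assert(t != NULL);
    for (size_t i = 0; i < TOY_MAX_FLOWS; i++) {
        if (t->flows[i].installed && t->flows[i].port_no == port_no) {
            t->flows[i].installed = false;
            removed++;
        }
    }
    return removed;
}
```

The point of the sketch is only that the flush is keyed by the port
and does not depend on per-flow delete requests reaching the
offload thread while the port can still be looked up.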

There is an issue, however, with removing ports in dpif-netdev.
The reconfiguration function is not allowed to release
dp->port_mutex.  This means that we will likely have to call the
flush() callback directly from the reconfiguration function (main
thread) before the actual port removal, but after removing the
port from the PMD threads.  After that we'll likely need to
traverse the offloading queue and remove all the requests that
target the device that is going to be removed.  There is still
the issue that the offloading thread might have already started
processing one of the offloading operations and be waiting on
dp->port_mutex.  Not sure if this will produce any issues, but we
need to check that all the resources will be properly cleaned up
in this case too.
From the dpif-netdev perspective, we'll also need to clean up all
the allocated marks and dereference all the remaining flows, if any.
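
The queue traversal part could look roughly like the sketch below.
Again, this is a toy model under my own assumptions and not OVS
code: the queue is a plain array, all toy_* names are hypothetical,
and there is no locking, so the case of a request that the
offloading thread has already dequeued is not covered at all.

```c
/* Toy model only: none of these names exist in OVS. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_LEN 8

enum toy_op { TOY_OP_ADD, TOY_OP_DEL };

struct toy_request {
    enum toy_op op;      /* Requested offload operation. */
    int port_no;         /* Port the request targets. */
    unsigned int mark;   /* Flow mark associated with the request. */
};

struct toy_queue {
    struct toy_request reqs[TOY_QUEUE_LEN];
    size_t n;            /* Number of pending requests. */
};

/* Appends a request.  Returns false if the queue is full. */
static bool
toy_queue_push(struct toy_queue *q, enum toy_op op, int port_no,
               unsigned int mark)
{
    assert(q != NULL);
    if (q->n == TOY_QUEUE_LEN) {
        return false;
    }
    q->reqs[q->n].op = op;
    q->reqs[q->n].port_no = port_no;
    q->reqs[q->n].mark = mark;
    q->n++;
    return true;
}

/* Drops every pending request that targets 'port_no', keeping the
 * remaining requests in their original order.  Returns the number of
 * dropped requests.  A real implementation would also have to release
 * the mark and the flow reference held by each dropped request. */
static size_t
toy_queue_purge_port(struct toy_queue *q, int port_no)
{
    size_t kept = 0;
    size_t dropped;

    assert(q != NULL);
    for (size_t i = 0; i < q->n; i++) {
        if (q->reqs[i].port_no != port_no) {
            q->reqs[kept++] = q->reqs[i];
        }
    }
    dropped = q->n - kept;
    q->n = kept;
    return dropped;
}
```

In terms of ordering, the idea would be: detach the port from the
PMD threads, flush the device, purge the queue for that port, and
only then remove the port itself.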

I haven't thought deeply enough about this issue, though.

Best regards, Ilya Maximets.

