[ovs-dev] [PATCH ovs-dev v1] netdev-offload: Flush offloaded rules when ports removed

Tonghao Zhang xiangxia.m.yue at gmail.com
Thu Jun 18 11:41:52 UTC 2020


On Thu, Jun 18, 2020 at 7:18 PM Simon Horman <simon.horman at netronome.com> wrote:
>
> On Tue, Jun 16, 2020 at 04:25:21PM +0800, Tonghao Zhang wrote:
> > On Mon, Jun 15, 2020 at 7:13 PM Ilya Maximets <i.maximets at ovn.org> wrote:
> > >
> > > On 6/14/20 1:40 PM, Tonghao Zhang wrote:
> > > > On Sun, Jun 14, 2020 at 1:31 PM Roi Dayan <roid at mellanox.com> wrote:
> > > >>
> > > >>
> > > >>
> > > >> On 2020-06-11 1:36 PM, xiangxia.m.yue at gmail.com wrote:
> > > >>> From: Tonghao Zhang <xiangxia.m.yue at gmail.com>
> > > >>>
> > > >>> When ports were removed from bridge, we should flush the
> > > >>> offload rules on the ports. The main reason is two factors:
> > > >>>
> > > >>> * The ports removed from bridge, will be managed by OvS, so
> > > >>>   flush the rules which installed by OvS.
> > > >>> * If using the TC flower offload, for example, tc rules still
> > > >>>   are on the ports. And the information still are maintained by
> > > >>>   OvS, such as the mapping for tc and ufid.
> > > >>>   Then if adding the port to bridge and installing the rules
> > > >>>   to it again, *del_filter_and_ufid_mapping will be invoked,
> > > >>>   and delete the tc rule using tc handle which may not exist
> > > >>>   (offload init api flushed them.) on kernel or is used by other
> > > >>>   previous rules (if so, that rules will be deleted that is not
> > > >>>   we expected.).
> > > >>>
> > > >>> Cc: Simon Horman <simon.horman at netronome.com>
> > > >>> Cc: Paul Blakey <paulb at mellanox.com>
> > > >>> Cc: Roi Dayan <roid at mellanox.com>
> > > >>> Cc: Ben Pfaff <blp at ovn.org>
> > > >>> Cc: William Tu <u9012063 at gmail.com>
> > > >>> Cc: Ilya Maximets <i.maximets at ovn.org>
> > > >>> Tested-at: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftravis-ci.com%2Fgithub%2Fovn-open-virtual-networks%2Fovs%2Fbuilds%2F170832624&data=02%7C01%7Croid%40mellanox.com%7Cf0bc38aa88074693105108d80df36979%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637274686378147436&sdata=Dlq4jnSclj0vXQoldO7Oq2Wwx%2BGEHjK%2FftPUDHnJcKM%3D&reserved=0
> > > >>> Co-authored-by: Wengang Hou <houwengang at didiglobal.com>
> > > >>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue at gmail.com>
> > > >>> ---
> > > >>>  lib/netdev-offload.c | 1 +
> > > >>>  1 file changed, 1 insertion(+)
> > > >>>
> > > >>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
> > > >>> index ab97a292ebac..964566caab1e 100644
> > > >>> --- a/lib/netdev-offload.c
> > > >>> +++ b/lib/netdev-offload.c
> > > >>> @@ -593,6 +593,7 @@ netdev_ports_remove(odp_port_t port_no, const struct dpif_class *dpif_class)
> > > >>>      data = netdev_ports_lookup(port_no, dpif_class);
> > > >>>      if (data) {
> > > >>>          dpif_port_destroy(&data->dpif_port);
> > > >>> +        netdev_flow_flush(data->netdev); /* flush offloaded rules. */
> > > >>>          netdev_close(data->netdev); /* unref and possibly close */
> > > >>>          hmap_remove(&port_to_netdev, &data->portno_node);
> > > >>>          hmap_remove(&ifindex_to_port, &data->ifindex_node);
> > > >>>
> > > >>
> > > > Hi Roi
> > > > Thanks for your review. The case is reproduced as below:
> > > > # ovs-appctl dpctl/show
> > > > system at ovs-system:
> > > >   lookups: hit:0 missed:0 lost:0
> > > >   flows: 0
> > > >   masks: hit:0 total:0 hit/pkt:0.00
> > > >   port 0: ovs-system (internal)
> > > >   port 1: br-int (internal)
> > > >   port 2: enp130s0f0_0
> > > >   port 3: enp130s0f0_1
> > > >   port 4: vxlan_sys_4789 (vxlan: packet_type=ptap)
> > > >
> > > > # cat /tmp/tmp-ovs-debug.sh
> > > > set -x
> > > > ovs-appctl dpctl/add-flow
> > > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)'
> > > > 3
> > > > ovs-appctl dpctl/dump-flows
> > > > ovs-vsctl del-port br-int enp130s0f0_0
> > > > tc filter show dev enp130s0f0_0 ingress
> > > > ovs-appctl dpctl/dump-flows
> > > >
> > > > # sh /tmp/tmp-ovs-debug.sh
> > > > + ovs-appctl dpctl/add-flow
> > > > 'recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200)'
> > > > 3
> > > > + ovs-appctl dpctl/dump-flows
> > > > recirc_id(0),in_port(2),eth(dst=00:11:22:33:44:66),eth_type(0x0800),ipv4(dst=1.1.1.200),
> > > > packets:0, bytes:0, used:0.020s, actions:3
> > > > + ovs-vsctl del-port br-int enp130s0f0_0
> > > > + tc filter show dev enp130s0f0_0 ingress
> > > > filter protocol ip pref 2 flower chain 0
> > > > filter protocol ip pref 2 flower chain 0 handle 0x1
> > > >   dst_mac 00:11:22:33:44:66
> > > >   eth_type ipv4
> > > >   dst_ip 1.1.1.200
> > > >   in_hw in_hw_count 1
> > > > action order 1: mirred (Egress Redirect to device enp130s0f0_1) stolen
> > > > index 1 ref 1 bind 1
> > > > cookie 40ca20750c4add5a9cc1c880ccb3d0e8
> > > > no_percpu
> > > > used_hw_stats delayed
> > > >
> > > > + ovs-appctl dpctl/dump-flows
> > > >
> > > > The tc rules on port enp130s0f0_0 are not deleted, we can use the tc
> > > > command to show them.
> > > >
> > > >> it looks ok but I wonder if it's redundant?
> > > >> I tested and when I removed a port with tc rules on it,
> > > >> all tc rules got deleted using the flow api flow del,
> > > >> we get into netdev_tc_flow_del() for each flow.
> > > > I didn't find the netdev_tc_flow_del was invoked while the port was deleting.
> > > > The debug step is:
> > > > 1. run the /tmp/tmp-ovs-debug.sh
> > > > 2. and while gdb the ovs-vswitchd:
> > > > gdb /root/local/openvswitch-master-dpdk/sbin/ovs-vswitchd -ex 'b
> > > > netdev_tc_flow_del' -ex c --pid [ovs-vswitchd pid]
> > > >
> > > > the breakpoint was not triggered.
> > > >> So you shouldn't have a tc rule left or ufid mapping.
> > > > As I debug, the  netdev_tc_flow_del was not invoked, and the ufids of
> > > > flow still are in the mapping.
> > > > When we add the port back and the install the rules again, the new
> > > > rules(e.g. tc handle 10) may be deleted by the later new rules(which
> > > > ufid mapping not
> > > > deleted and the handle may be 10), via  del_filter_and_ufid_mapping.
> > > >> How did you test that you saw tc rules were still on the port?
> > > > Yes, I test it using the tc command.
> > >
> > > Hi.
> > >
> > > {add/del/mod}-flow dpctl commands are for *debugging purposes only*.  And this
> > > is clearly stated in documentation:
> > >
> > > "DATAPATH FLOW TABLE DEBUGGING COMMANDS
> > >  ...
> > >  Do not use commands to add or remove or modify datapath flows if ovs-vswitchd
> > >  is running because it interferes with ovs-vswitchd's own datapath flow
> > >  management.  Use ovs-ofctl(8), instead, to work with OpenFlow flow entries."
> > >
> > > As you're injecting flows directly to datapath you can't expect higher layers
> > > to handle them correctly.  The issue here is that you're using OVS in a way
> > > it's not intended to be used.  Please, configure OVS rules via OpenFlow or
> > > be sure that you're removing flows by hands from the datapath as you're adding
> > > them.
> > Hi
> > Thanks for your review. I test it with openflow, It works fine because
> > the *revalidator
> > will clean up the flows(which may be offloaded).
> > > In case above scenario leaves ufids in OVS that cannot be removed, I'd suggest
> > > reverting of the previous patch that allowed offloading via add-flow since this
> > > is effectively a memory leak and even debugging commands should not produce
> > > a memory leak.
> > yes, we can revert the previous patch. But I think the command is
> > useful for us to
> > debug offload(rte flow offload and tc offload), and test the offload function.
> > For this issue, can we use this little patch to fix it ? or other better way ?
>
> I am confused. Which little patch?
Hi Simon
This patch [1]:
[1] - http://patchwork.ozlabs.org/project/openvswitch/patch/20200611103635.53367-1-xiangxia.m.yue@gmail.com/

If we revert the patch which commit id is
e61984e781e6c7d621568428788cb87c11be8f1f
and we can't debug or install the flows with "ovs-app dpctl/add-flow",
for tc offload.
I think that tools are useful for us.
> > > ---
> > >
> > > BTW, IMHO, it's a very confusing design that we're installing rules directly to
> > > tc.  The issue here is that we have to maintain list of ports we offloaded to
> > > with ufids completely in userspace and we can not re-create these mappings after
> > > OVS restart like we do in case of kernel datapath.  The main issue is that we
> > > do not know which ports are ours.  Datapath is actually split between kernel
> > > and userspace with all the metadata in ovs-vswitchd and flows in tc in kernel.
> > > This leads to big number of issues.  If we could install tc rules from the
> > > inside of the usual kernel datapath that might solve a lot of configuration issues.
> > > Might create some new, but anyway, I think, common architecture would be much
> > > more clear.
> > > For example, in userspace datapath we're installing flows to usual userspace
> > > flow table AND to offload provider. This allows us to manage flows in more
> > > natural way.  And even if we will lost some flows inside offload provider
> > > implementation we will still clean up all the hanging data while removing flows
> > > from the usual userspace datapath flow tables.  So, offloading is actually
> > > joined part of userspace datapath and not a side thing like tc for kernel datapath.
> > > So, what is this part about: kernel datapath has persistent ports and flows
> > > (persistent in terms of ovs-vswitchd restarts), but tc offloading has persistent
> > > flows and non-persistent ports (i.e. list of ports we're affloading to) and
> > > non-persistent flows metadata (ufids of flows we offlaoded).  That introduces
> > > a lot of complications.
> > >
> > >
> > > Best regards, Ilya Maximets.
> >
> >
> >
> > --
> > Best regards, Tonghao



-- 
Best regards, Tonghao


More information about the dev mailing list