[ovs-dev] [PATCH v2] Fix redundant datapath set ethernet action with NSH Decap

Jan Scheurich jan.scheurich at ericsson.com
Mon Jun 7 16:12:51 UTC 2021


> -----Original Message-----
> From: Martin Varghese <martinvarghesenokia at gmail.com>
> Sent: Monday, 7 June, 2021 16:47
> To: Ilya Maximets <i.maximets at ovn.org>
> Cc: dev at openvswitch.org; echaudro at redhat.com; Jan Scheurich
> <jan.scheurich at ericsson.com>; Martin Varghese
> <martin.varghese at nokia.com>
> Subject: Re: [ovs-dev] [PATCH v2] Fix redundant datapath set ethernet action
> with NSH Decap
> 
> On Wed, May 19, 2021 at 12:26:40PM +0200, Ilya Maximets wrote:
> > On 5/19/21 5:26 AM, Martin Varghese wrote:
> > > On Tue, May 18, 2021 at 10:03:39PM +0200, Ilya Maximets wrote:
> > >> On 5/17/21 3:45 PM, Martin Varghese wrote:
> > >>> From: Martin Varghese <martin.varghese at nokia.com>
> > >>>
> > >>> When a decap action is applied on NSH header encapsulatiing a
> > >>> ethernet packet a redundant set mac address action is programmed
> > >>> to the datapath.
> > >>>
> > >>> Fixes: f839892a206a ("OF support and translation of generic encap
> > >>> and decap")
> > >>> Signed-off-by: Martin Varghese <martin.varghese at nokia.com>
> > >>> Acked-by: Jan Scheurich <jan.scheurich at ericsson.com>
> > >>> Acked-by: Eelco Chaudron <echaudro at redhat.com>
> > >>> ---
> > >>> Changes in v2:
> > >>>   - Fixed code styling
> > >>>   - Added Ack from jan.scheurich at ericsson.com
> > >>>   - Added Ack from echaudro at redhat.com
> > >>>
> > >>
> > >> Hi, Martin.
> > >> For some reason this patch triggers frequent failures of the
> > >> following unit test:
> > >>
> > >> 2314. packet-type-aware.at:619: testing ptap - L3 over patch port
> > >> ...
> 
> The test is failing as, during revalidation, NORMAL action is dropping packets.
> With these changes, the mac address in flow structures get cleared with decap
> action. Hence the NORMAL action drops the packet assuming a loop (SRC and
> DST mac address are zero). I assume NORMAL action handling in
> xlate_push_stats_entry is not adapted for l3 packet. The timing at which
> revalidator gets triggered explains the sporadicity of the issue. The issue is
> never seen as the MAC addresses in flow structure were not cleared with decap
> before.
> 
> So can we use NORMAL action with a L3 packet ?  Does OVS handle all the L3
> use cases with Normal action ? If not, shouldn't we not use NORMAL action in
> this test case
> 
> Comments?
> 

Good catch! Normal flow L2 bridging is of course nonsense for the use case of forwarding an L3 packet. I am surprised that the packet was forwarded at all in the first place. That in itself can be considered a bug. Correctly, a Normal flow should drop non-Ethernet packets, I would say.

To fix the test case I suggest to replace the Normal action in br1 with "output:gre1" in line 700.

> 
> > >> stdout:
> > >> warped
> > >> ./packet-type-aware.at:726:
> > >>     ovs-appctl dpctl/dump-flows --names dummy at ovs-dummy |
> > >> strip_used | grep -v ipv6 | sort
> > >>
> > >> --- -   2021-05-18 21:57:56.810513366 +0200
> > >> +++ /home/i.maximets/work/git/ovs/tests/testsuite.dir/at-
> groups/2314/stdout     2021-05-18 21:57:56.806609814 +0200
> > >> @@ -1,3 +1,3 @@
> > >>  flow-dump from the main thread:
> > >> -recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth(src=3a:6d:d2:0
> > >> 9:9c:ab,dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(tos=0/0x3,frag
> > >> =no), packets:1, bytes:98, used:0.0s,
> > >> actions:pop_eth,clone(tnl_push(tnl_port(gre_sys),header(size=38,typ
> > >> e=3,eth(dst=de:af:be:ef:ba:be,src=aa:55:00:00:00:02,dl_type=0x0800)
> > >> ,ipv4(src=10.0.0.1,dst=10.0.0.2,proto=47,tos=0,ttl=64,frag=0x4000),
> > >> gre((flags=0x0,proto=0x800))),out_port(br2)),n2)
> > >> +recirc_id(0),in_port(n0),packet_type(ns=0,id=0),eth(src=3a:6d:d2:0
> > >> +9:9c:ab,dst=1e:2c:e9:2a:66:9e),eth_type(0x0800),ipv4(tos=0/0x3,fra
> > >> +g=no), packets:1, bytes:98, used:0.0s, actions:drop
> > >>
> > >>
> > >> It fails very frequently in GitHub Actions, but it's harder to make
> > >> it fail on my local machine.  Following change to the test allows
> > >> to reproduce the failure almost always on my local machine:
> > >>
> > >> diff --git a/tests/packet-type-aware.at
> > >> b/tests/packet-type-aware.at index 540cf98f3..01dbc8030 100644
> > >> --- a/tests/packet-type-aware.at
> > >> +++ b/tests/packet-type-aware.at
> > >> @@ -721,7 +721,7 @@ AT_CHECK([
> > >>      ovs-appctl netdev-dummy/receive n0
> > >>
> 1e2ce92a669e3a6dd2099cab0800450000548a83400040011aadc0a80a0ac0a80
> a1
> > >>
> e0800b7170a4d0002fd509a5800000000de1c0200000000001011121314151617
> 18
> > >>
> 191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f3031323334353637
> > >>  ], [0], [ignore])
> > >>
> > >> -ovs-appctl time/warp 1000
> > >> +ovs-appctl time/warp 1000 100
> > >>
> > >>  AT_CHECK([
> > >>      ovs-appctl dpctl/dump-flows --names dummy at ovs-dummy |
> > >> strip_used | grep -v ipv6 | sort
> > >> --
> > >>
> > >> Without your patch I can not make it fail locally even with above
> > >> wrapping change applied.
> > >>
> > >> Could you, please, take a look?
> > >>
> > >
> > > Hi Ilya
> > >
> > > Travis CI was good. i will rebase & try again
> > > https://protect2.fireeye.com/v1/url?k=8ec813e4-d1532ae4-8ec8537f-86f
> > > c6812c361-3273f254661f9c75&q=1&e=c83c3bb8-d3c9-439a-8031-
> 72634d0fecc
> > > 2&u=https%3A%2F%2Ftravis-
> ci.org%2Fgithub%2Fmartinpattara%2Fovs%2Fbui
> > > lds%2F770919003
> >
> > Travis has only one job with tests enabled and it tests on arm.
> > GitHub Actions (which is our main CI now) wasn't good:
> >
> > https://protect2.fireeye.com/v1/url?k=c1443ad3-9edf03d3-c1447a48-86fc6
> > 812c361-35a33eb8a7c0b489&q=1&e=c83c3bb8-d3c9-439a-8031-
> 72634d0fecc2&u=
> >
> https%3A%2F%2Fgithub.com%2Fmartinpattara%2Fovs%2Fruns%2F2567454405
> >
> > Best regards, Ilya Maximets.


More information about the dev mailing list