[ovs-dev] [PATCH ovn 1/1] tests: Work around ovn-controller incremental processing bugs.

Ben Pfaff blp at ovn.org
Thu Nov 26 16:30:07 UTC 2020


On Thu, Nov 26, 2020 at 06:35:44PM +0530, Numan Siddique wrote:
> On Thu, Nov 26, 2020 at 11:30 AM Numan Siddique <numans at ovn.org> wrote:
> >
> > On Thu, Nov 26, 2020 at 10:54 AM Ben Pfaff <blp at ovn.org> wrote:
> > >
> > > On Wed, Nov 25, 2020 at 01:13:22PM +0530, Numan Siddique wrote:
> > > > On Wed, Nov 25, 2020 at 4:21 AM Ben Pfaff <blp at ovn.org> wrote:
> > > > >
> > > > > The tests "superseding ACLs with conjunction" and "ARP replies for SNAT
> > > > > external ips" trigger bugs in the ovn-controller incremental processing
> > > > > logic.  This works around those bugs.
> > > > >
> > > >
> > > > > Signed-off-by: Ben Pfaff <blp at ovn.org>
> > > >
> > > > Can you please try test case - "ARP replies for SNAT external ips"
> > > > with the latest OVN master ?
> > > >
> > > > The commit https://github.com/ovn-org/ovn/commit/53f60c7ab742cba0b3dd84b73658e0bbd44ec145
> > > > should solve this issue.
> > > >
> > > > I will take a look into the other test case - "superseding ACLs with
> > > > conjunction".
> > >
> > > It does solve the issues that this was meant to fix.
> > >
> > > The following tests still segfault in ovn-controller:
> > >
> > > 269: ovn -- controller I-P handling with monitoring disabled -- ovn-northd-ddlog FAILED (ovs-macros.at:253)
> > > 301: ovn -- ovn-controller incremental processing    FAILED (ovn-performance.at:542)
> > >
> > > with backtraces that look like the following.  If this is because of a
> > > bug I introduced into ovsdb-idl, I think it has to be a subtle one...
> > >
> > > #0  0x0000000000413e00 in handle_deleted_lport (pb=0x110c550,
> > >     b_ctx_in=0x7ffea1c813d0, b_ctx_out=0x7ffea1c81380)
> > >     at ../controller/binding.c:1982
> > > #1  0x000000000041628e in binding_handle_port_binding_changes (
> > >     b_ctx_in=b_ctx_in@entry=0x7ffea1c813d0,
> > >     b_ctx_out=b_ctx_out@entry=0x7ffea1c81380) at ../controller/binding.c:2153
> > > #2  0x0000000000434650 in runtime_data_sb_port_binding_handler (
> > >     node=0x7ffea1c82730, data=0x10ad150) at ../controller/ovn-controller.c:1471
> > > #3  0x00007f0016dff4ab in engine_compute (recompute_allowed=<optimized out>,
> > >     node=<optimized out>) at ../lib/inc-proc-eng.c:306
> > > #4  engine_run_node (recompute_allowed=true, node=0x7ffea1c82730)
> > >     at ../lib/inc-proc-eng.c:352
> > > #5  engine_run (recompute_allowed=recompute_allowed@entry=true)
> > >     at ../lib/inc-proc-eng.c:377
> > > #6  0x0000000000411a4d in main (argc=<optimized out>, argv=<optimized out>)
> > >     at ../controller/ovn-controller.c:2747
> >
> > With your IDL CS patch series, I'm seeing 100% failure for
> > "ovn-controller incremental processing" test case.
> > I think ovn-controller should not segfault. Thanks for the backtrace.
> > I will look into it.
> >
> 
> Hi Ben,
> 
> The crash is seen because in binding.c, we access port_binding->datapath column.
> 
> Since the 'datapath' column of the Port_Binding table has a strong
> reference to the Datapath_Binding table, this column
> should never be NULL, right?
> 
> Since the crash is seen with the tracked data, maybe your IDL CS
> patchset needs some handling in the tracked code in IDL ?

OK, I will look into it.  Thanks.
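
For reference, the failure mode described above (a tracked, deleted
Port_Binding row whose 'datapath' reference has already been cleared) can be
sketched as below.  This is only an illustration under stated assumptions:
the struct and function names are hypothetical stand-ins, not the real
IDL-generated types or the code in binding.c, and a NULL check in the
consumer is one possible mitigation, not necessarily the right fix if the
IDL is supposed to guarantee a non-NULL column.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the IDL-generated row types.  The real
 * structs carry many more columns. */
struct datapath_binding_row {
    long tunnel_key;
};

struct port_binding_row {
    const char *logical_port;
    /* Assumption: may be NULL on a tracked row that is being deleted. */
    const struct datapath_binding_row *datapath;
};

/* Reads the datapath tunnel key of a deleted port, if it is still
 * available.  Returns false, leaving '*tunnel_key_out' untouched, when the
 * row or its datapath reference is NULL, instead of dereferencing it. */
static bool
handle_deleted_port_sketch(const struct port_binding_row *pb,
                           long *tunnel_key_out)
{
    if (!pb || !pb->datapath) {
        return false;
    }
    *tunnel_key_out = pb->datapath->tunnel_key;
    return true;
}
```

With a guard like this, a caller can skip (or log) the row rather than
crash, which at least makes the underlying tracking problem easier to see.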

