[ovs-discuss] vswitchd 2.9.3/2.10.1 crashes handling port add/mod

Ben Pfaff blp at ovn.org
Wed Dec 12 23:24:53 UTC 2018


OK, I'll look closer.

On Thu, Dec 13, 2018 at 12:13:16PM +1300, Josh Bailey wrote:
> Just tried it - frustratingly, this seems to cause the scenario where
> vswitchd disappears with exit status 0 and no core dump.
> 
> 2018-12-12T23:07:01.485Z|00002|daemon_unix(monitor)|INFO|pid 23736 died, exit status 0, exiting
> 
> 
> I ran with vlog DBG and it didn't seem to print anything more than that.
> 
> What a puzzle!
> 
> 
> 
> On Thu, Dec 13, 2018 at 9:29 AM Ben Pfaff <blp at ovn.org> wrote:
> 
> > Thanks for testing.
> >
> > I discovered that this exact patch causes another problem.  I posted a
> > slight revision without that issue.  Would you mind re-testing?  Thanks
> > a lot.
> >
> > The new version is here:
> > https://patchwork.ozlabs.org/patch/1012261/
> >
> > On Tue, Dec 11, 2018 at 12:35:15PM +1300, Josh Bailey wrote:
> > > Yes sir. That fixes it - vswitchd no longer crashes.
> > >
> > > Thanks,
> > >
> > >
> > > On Tue, Dec 11, 2018 at 12:31 PM Ben Pfaff <blp at ovn.org> wrote:
> > >
> > > > Here's a more specific patch that, if my hypothesis is correct, would
> > > > solve the issue.
> > > >
> > > > diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
> > > > index 7c0f16b321f1..ebee5817710e 100644
> > > > --- a/ofproto/connmgr.c
> > > > +++ b/ofproto/connmgr.c
> > > > @@ -1493,7 +1493,7 @@ ofconn_receives_async_msg(const struct ofconn
> > > > *ofconn,
> > > >      ovs_assert(reason < 32);
> > > >      ovs_assert((unsigned int) type < OAM_N_TYPES);
> > > >
> > > > -    if (!rconn_is_connected(ofconn->rconn)) {
> > > > +    if (!rconn_is_connected(ofconn->rconn) || !ofconn->protocol) {
> > > >          return false;
> > > >      }
> > > >
> > > > On Mon, Dec 10, 2018 at 12:37:14PM -0800, Ben Pfaff wrote:
> > > > > > Probably, this is a different issue.  My guess is that the connection in
> > > > > > question doesn't have an OpenFlow protocol at the moment.  We've dealt
> > > > > > with problems like this before, see e.g. commit 903f6c4f8a9b ("connmgr:
> > > > > > Fix vswitchd abort when a port is added and the controller is down").
> > > > > > Either that fix wasn't sufficient or it wasn't backported or it's some
> > > > > > other slightly different issue.
> > > > >
> > > > > > I've had a restructuring that should improve things in this area out for
> > > > > > review since the end of October.  So far it hasn't attracted a review:
> > > > > > https://patchwork.ozlabs.org/patch/990599/
> > > > >
> > > > > On Mon, Dec 10, 2018 at 06:16:19PM -0200, Flavio Leitner wrote:
> > > > > >
> > > > > > Looks like you're using an unsupported OpenFlow protocol:
> > > > > >
> > > >
> > https://github.com/openvswitch/ovs/blob/5f361a2a320717c46289fc30d65a186f2f5d3ba0/lib/ofp-protocol.c#L123
> > > > > >
> > > > > > I see that you are configuring a controller in OVS and you are
> > > > > > running Ryu, maybe it's using the wrong protocol version?
> > > > > >
> > > > > > fbl
> > > > > >
> > > > > > On Tue, Dec 11, 2018 at 08:07:51AM +1300, Josh Bailey wrote:
> > > > > > > Certainly:
> > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #1  0x00007f870edf9801 in __GI_abort () at abort.c:79
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #2  0x00005634f368e0a8 in ofputil_protocol_to_ofp_version (protocol=<optimized out>) at lib/ofp-protocol.c:123
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #3  0x00005634f36890ae in ofputil_encode_port_status (ps=ps@entry=0x7ffef1dc7880, protocol=<optimized out>) at lib/ofp-port.c:938
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #4  0x00005634f35f7ab2 in connmgr_send_port_status (mgr=0x5634f518a9a0, source=source@entry=0x0, pp=pp@entry=0x5634f5247310, reason=reason@entry=2 '\002') at ofproto/connmgr.c:1654
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #5  0x00005634f35bcfe3 in ofproto_port_set_state (port=port@entry=0x5634f52472f0, state=<optimized out>) at ofproto/ofproto.c:2485
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #6  0x00005634f35d07e3 in port_run (ofport=0x5634f52472e0) at ofproto/ofproto-dpif.c:3629
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #7  run (ofproto_=0x5634f51dd2c0) at ofproto/ofproto-dpif.c:1666
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #8  0x00005634f35be5ee in ofproto_run (p=0x5634f51dd2c0) at ofproto/ofproto.c:1741
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #9  0x00005634f35abe9c in bridge_run__ () at vswitchd/bridge.c:2944
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #10 0x00005634f35b19e0 in bridge_run () at vswitchd/bridge.c:3002
> > > > > > > >
> > > > > > > > 2018-12-04 21:23:59 josh #11 0x00005634f3211595 in main (argc=<optimized out>, argv=<optimized out>) at vswitchd/ovs-vswitchd.c:125
> > > > > > >
> > > > > > > > On Tue, Dec 11, 2018 at 7:50 AM Flavio Leitner <fbl at sysclose.org> wrote:
> > > > > > >
> > > > > > > > > On Wed, Dec 05, 2018 at 12:11:28PM +1300, Josh Bailey via discuss wrote:
> > > > > > > > > Hello OVS colleagues,
> > > > > > > > >
> > > > > > > > > > vswitchd appears to crash while handling a port add/mod. Please see the
> > > > > > > > > > following steps to reproduce.
> > > > > > > > >
> > > > > > > > > Run two Ryu OF controllers:
> > > > > > > > >
> > > > > > > > > > $ ryu-manager --ofp-tcp-listen-port 6653 --ofp-listen-host 127.0.0.1 --verbose --app-lists ryu.app.simple_switch_stp
> > > > > > > > > >
> > > > > > > > > > $ ryu-manager --ofp-tcp-listen-port 6654 --ofp-listen-host 127.0.0.1 --verbose --app-lists ryu.app.simple_switch_stp
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Now set up a bridge with no interfaces:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > root@faucet:~/faucet# /usr/local/share/openvswitch/scripts/ovs-ctl start
> > > > > > > > > > * Starting ovsdb-server
> > > > > > > > > > * system ID not configured, please use --system-id
> > > > > > > > > > * Configuring Open vSwitch system IDs
> > > > > > > > > > * Starting ovs-vswitchd
> > > > > > > > > > * Enabling remote OVSDB managers
> > > > > > > > > > root@faucet:~/faucet# ovs-vsctl --version
> > > > > > > > > > ovs-vsctl (Open vSwitch) 2.9.3
> > > > > > > > > > DB Schema 7.15.1
> > > > > > > > > > root@faucet:~/faucet# ovs-vsctl add-br br0
> > > > > > > > > > root@faucet:~/faucet# ovs-vsctl set-controller br0 tcp:127.0.0.1:6653 tcp:127.0.0.1:6654
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Now add a physical interface known to be up:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > root@faucet:~/faucet# ovs-vsctl add-port br0 enp2s0f0
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Observe crash in log:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > 2018-12-04T23:03:06.663Z|00036|bridge|INFO|bridge br0: added interface enp2s0f0 on port 1
> > > > > > > > > > 2018-12-04T23:03:06.663Z|00037|bridge|INFO|bridge br0: using datapath ID 000090e2ba7e7558
> > > > > > > > > > 2018-12-04T23:03:06.663Z|00038|rconn|INFO|br0<->tcp:127.0.0.1:6653: disconnecting
> > > > > > > > > > 2018-12-04T23:03:06.663Z|00039|rconn|INFO|br0<->tcp:127.0.0.1:6654: disconnecting
> > > > > > > > > > 2018-12-04T23:03:06.664Z|00040|fail_open|WARN|Could not connect to controller (or switch failed controller's post-connection admission control policy) for 19 seconds, failing open
> > > > > > > > > > 2018-12-04T23:03:06.710Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 5620 died, killed (Aborted), core dumped, restarting
> > > > > > > >
> > > > > > > > > Please open the core dump using gdb and provide at least the backtrace.
> > > > > > > > > Thanks,
> > > > > > > > --
> > > > > > > > fbl
> > > > > > > >
> > > > > > > >
> > > > > >
> > > > > > --
> > > > > > Flavio
> > > > > >
> > > > > > _______________________________________________
> > > > > > discuss mailing list
> > > > > > discuss at openvswitch.org
> > > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> > > >
> >

