[ovs-dev] [PATCH] connmgr: Fix vswitchd abort when a port is added and the controller is down

Numan Siddique nusiddiq at redhat.com
Wed Oct 17 16:52:51 UTC 2018


On Wed, Oct 17, 2018 at 10:01 PM Ben Pfaff <blp at ovn.org> wrote:

> On Wed, Oct 17, 2018 at 05:33:05PM +0530, nusiddiq at redhat.com wrote:
> > From: Numan Siddique <nusiddiq at redhat.com>
> >
> > We see the below trace when a port is added to a bridge and the
> configured
> > controller is down
> >
> > 0x00007fb002f8b207 in raise () from /lib64/libc.so.6
> > 0x00007fb002f8c8f8 in abort () from /lib64/libc.so.6
> > 0x00007fb004953026 in ofputil_protocol_to_ofp_version () from /lib64/
> libopenvswitch-2.10.so.0
> > 0x00007fb00494e38e in ofputil_encode_port_status () from /lib64/
> libopenvswitch-2.10.so.0
> > 0x00007fb004ef1c5b in connmgr_send_port_status () from
> /lib64/libofproto-2.10.so.0
> > 0x00007fb004efa9f4 in ofport_install () from /lib64/libofproto-2.10.so.0
> > 0x00007fb004efbfb2 in update_port () from /lib64/libofproto-2.10.so.0
> > 0x00007fb004efc7f9 in ofproto_port_add () from
> /lib64/libofproto-2.10.so.0
> > 0x0000556d540a3f95 in bridge_add_ports__ ()
> > 0x0000556d540a5a47 in bridge_reconfigure ()
> > 0x0000556d540a9199 in bridge_run ()
> > 0x0000556d540a02a5 in main ()
> >
> > When connmgr detects that the connection to the controller is down, it
> > resets the ofconn's protocol to 'OFPUTIL_P_NONE' and that's why we
> > see the above abort. This patch fixes the issue by also checking the
> > connection status before sending the port status in the
> >  connmgr_send_port_status().
> >
> > The issue can be reproduced by running the test added in this patch
> > without the fix.
> >
> > Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
>
> This is good work.  Thank you.
>
> In general there's some bad factoring here, but a proper fix would be a
> lot more work.
>
> I think that we could generalize this by adjusting
> ofconn_receives_async_msg() to return false if the connection is down.
> At least one caller checks already, but it seems like all the callers
> should ignore connections that are down.  What do you think of that?
>
>
Agree. I will work on it.  This issue is seen after the commit - [1]
- rconn: Introduce new invariant to fix assertion failure in corner case.
So probably the caller should check the connection status by
calling rconn_is_connected().


[1] -
https://github.com/openvswitch/ovs/commit/476d2551abd2871696a64203f78d658ac2d7f32c#diff-fe98aa3f76d0f1b61d8498890e5c945e


Thanks
Numan



> Thanks,
>
> Ben.
>


More information about the dev mailing list