[ovs-discuss] Failed to add ovs bridge

fukaige fukaige at huawei.com
Fri May 12 03:56:51 UTC 2017


Nothing seems unusual in my setup; I just hit this by accident. I am now
checking whether there is any way that deleting the netdev could fail.
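
A toy model of the suspected failure mode may make it easier to discuss.
This is only a sketch, not OVS code: toy_netdev, toy_registry, toy_open,
toy_close and toy_demo are made-up names. It assumes the registry is looked
up by name only and that an existing entry is reused without comparing its
type, which is what the gdb trace quoted below appears to show. It also
assumes the stale entry was created with type "system", which is a guess.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a netdev; NOT the real struct netdev from OVS. */
struct toy_netdev {
    char name[16];
    char type[16];
    int ref_cnt;
    struct toy_netdev *next;
};

/* Toy stand-in for netdev_shash: a linked list keyed by name. */
struct toy_netdev *toy_registry = NULL;

struct toy_netdev *toy_find(const char *name)
{
    struct toy_netdev *dev;

    for (dev = toy_registry; dev; dev = dev->next) {
        if (strcmp(dev->name, name) == 0) {
            return dev;
        }
    }
    return NULL;
}

/* Reuses an existing entry by name; 'type' only matters on creation. */
struct toy_netdev *toy_open(const char *name, const char *type)
{
    struct toy_netdev *dev = toy_find(name);

    if (!dev) {
        dev = calloc(1, sizeof *dev);
        if (!dev) {
            return NULL;
        }
        snprintf(dev->name, sizeof dev->name, "%s", name);
        snprintf(dev->type, sizeof dev->type, "%s", type);
        dev->next = toy_registry;
        toy_registry = dev;
    }
    dev->ref_cnt++;
    return dev;
}

/* Drops one reference; the entry leaves the registry only at zero. */
void toy_close(struct toy_netdev *dev)
{
    struct toy_netdev **p;

    assert(dev->ref_cnt > 0);
    if (--dev->ref_cnt > 0) {
        return;
    }
    for (p = &toy_registry; *p; p = &(*p)->next) {
        if (*p == dev) {
            *p = dev->next;
            break;
        }
    }
    free(dev);
}

/* Returns 1 if a re-open after "delete" hands back the stale type. */
int toy_demo(void)
{
    struct toy_netdev *a = toy_open("br-eth", "system");
    struct toy_netdev *leak = toy_open("br-eth", "system"); /* assumed leaked reference */
    struct toy_netdev *b;
    int stale;

    if (!a || !leak) {
        return -1;
    }
    toy_close(a);                        /* "delete": count 2 -> 1, entry survives */
    b = toy_open("br-eth", "internal");  /* "add" again under the same name */
    printf("requested internal, got type=%s ref_cnt=%d\n", b->type, b->ref_cnt);
    stale = strcmp(b->type, "internal") != 0;
    toy_close(b);                        /* clean up so the toy registry ends empty */
    toy_close(leak);
    return stale;
}
```

Calling toy_demo() from a main() reports the old type for the re-created
device. If something similar happens in ovs-vswitchd, a single unreleased
reference would be enough to explain why only the name br-eth keeps failing
while other bridge names work.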

> -----Original Message-----
> From: Ben Pfaff [mailto:blp at ovn.org]
> Sent: Friday, May 12, 2017 10:50 AM
> To: fukaige
> Cc: ovs-discuss at openvswitch.org; joe at ovn.org
> Subject: Re: Failed to add ovs bridge
> 
> Are you aware of anything unusual in your setup?  We have not had any
> similar reports.
> 
> On Fri, May 12, 2017 at 01:34:46AM +0000, fukaige wrote:
> > I am not using STP/RSTP. I looked at the bug fix you mentioned; it seems
> > irrelevant to my problem.
> > Maybe there is some race condition in deleting the netdev from
> > netdev_shash, but I cannot figure it out right now.
> >
> > The probability of occurrence is very low. I have hit this only three
> > times in two months.
> >
> > > -----Original Message-----
> > > From: Ben Pfaff [mailto:blp at ovn.org]
> > > Sent: Thursday, May 11, 2017 9:32 PM
> > > To: fukaige
> > > Cc: ovs-discuss at openvswitch.org; joe at ovn.org
> > > Subject: Re: Failed to add ovs bridge
> > >
> > > > Are you using STP or RSTP?  There's a bug fix related to them on
> > > > branch-2.5.
> > >
> > > On Thu, May 11, 2017 at 11:11:02AM +0000, fukaige wrote:
> > > > Hi all,
> > > >
> > > > > Occasionally, I get an error when creating a bridge using
> > > > > “ovs-vsctl add-br br-eth”:
> > > >
> > > >
> > > > > ovs-vsctl: Error detected while setting up 'br-eth'.  See
> > > > > ovs-vswitchd log for details.
> > > >
> > > >
> > > > > The ovs-vswitchd log is below:
> > > >
> > > >
> > > > > 2017-05-11T03:45:25.293Z|00026|ofproto_dpif|INFO|system@ovs-system: Datapath supports recirculation
> > > > > 2017-05-11T03:45:25.293Z|00027|ofproto_dpif|INFO|system@ovs-system: MPLS label stack length probed as 1
> > > > > 2017-05-11T03:45:25.293Z|00028|ofproto_dpif|INFO|system@ovs-system: Datapath supports unique flow ids
> > > > > 2017-05-11T03:45:25.293Z|00029|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_state
> > > > > 2017-05-11T03:45:25.293Z|00030|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_zone
> > > > > 2017-05-11T03:45:25.293Z|00031|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_mark
> > > > > 2017-05-11T03:45:25.293Z|00032|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_label
> > > > > 2017-05-11T03:45:25.364Z|00001|ofproto_dpif_upcall(handler226)|INFO|received packet on unassociated datapath port 0
> > > > > 2017-05-11T03:45:25.368Z|00033|netdev_linux|WARN|ethtool command ETHTOOL_GFLAGS on network device br-eth failed: No such device
> > > > > 2017-05-11T03:45:25.368Z|00034|dpif|WARN|system@ovs-system: failed to add br-eth as port: No such device
> > > > > 2017-05-11T03:45:25.368Z|00035|bridge|INFO|bridge br-eth: using datapath ID 00002a51cf9f2841
> > > > > 2017-05-11T03:45:25.368Z|00036|connmgr|INFO|br-eth: added service controller "punix:/var/run/openvswitch/br-eth.mgmt"
> > > >
> > > > > I then deleted br-eth and tried to add it again, but got the same
> > > > > error as above. However, a bridge with a name other than br-eth
> > > > > can be created successfully.
> > > >
> > > > Some clues:
> > > >
> > > > > 1. As far as I know, port br-eth is of type internal, so there
> > > > > should be no way to reach netdev_linux_ethtool_set_flag(). But the
> > > > > ETHTOOL_GFLAGS warning in the log shows that it was reached, which
> > > > > means request.type held the wrong value: OVS_VPORT_TYPE_NETDEV
> > > > > instead of OVS_VPORT_TYPE_INTERNAL.
> > > >
> > > > > static int
> > > > > dpif_netlink_port_add__(struct dpif_netlink *dpif, struct netdev *netdev,
> > > > >                         odp_port_t *port_nop)
> > > > >     OVS_REQ_WRLOCK(dpif->upcall_lock)
> > > > > {
> > > > >     ……
> > > > >
> > > > >     if (request.type == OVS_VPORT_TYPE_NETDEV) {
> > > > > #ifdef _WIN32
> > > > >         /* XXX : Map appropiate Windows handle */
> > > > > #else
> > > > >         netdev_linux_ethtool_set_flag(netdev, ETH_FLAG_LRO, "LRO", false);
> > > > > #endif
> > > > >     }
> > > > >
> > > > >     ……
> > > > > }
> > > >
> > > >
> > > > > 2. Debugging ovs-vswitchd with gdb, I found that a netdev with the
> > > > > same name had not been deleted (lib/netdev.c:netdev_open):
> > > > > netdev_open (name=0xffff6000d6b0 "br-int", type=0x52ca80 "internal",
> > > > >     netdevp=0xfffffc20fab8, netdevp@entry=0xfffffc20fb28)
> > > > >     at lib/netdev.c:354
> > > > 354  {
> > > > (gdb) n
> > > > 358      netdev_initialize();
> > > > (gdb)
> > > > 360      ovs_mutex_lock(&netdev_class_mutex);
> > > > (gdb)
> > > > 361      ovs_mutex_lock(&netdev_mutex);
> > > > (gdb)
> > > > 360      ovs_mutex_lock(&netdev_class_mutex);
> > > > (gdb)
> > > > 361      ovs_mutex_lock(&netdev_mutex);
> > > > (gdb)
> > > > 362      netdev = shash_find_data(&netdev_shash, name);
> > > > (gdb)
> > > > 363      if (!netdev) {
> > > > (gdb) print netdev->name
> > > > $1 = 0x47852e0 "br-int"
> > > > (gdb) print netdev->refcnt
> > > > There is no member named refcnt.
> > > > (gdb) n
> > > > 405          netdev->ref_cnt++;
> > > > (gdb) print netdev->ref_cnt
> > > > $2 = 2
> > > > (gdb) n
> > > > 406          *netdevp = netdev;
> > > > (gdb) print netdev->ref_cnt
> > > > $3 = 3
> > > >
> > > > > Something must have gone wrong when the bridge was deleted, but I
> > > > > cannot find a way to reproduce it or work out why the netdev was
> > > > > not deleted correctly. Can anyone offer suggestions for
> > > > > reproducing or solving this?
> > > >
> > > > Note:
> > > > ovs version: 2.5.2
> > > > kernel version: 4.1
> > > >

