[ovs-dev] Fail to netdev_open internal iface with error "File exists"
Guru Shetty
guru at ovn.org
Sat Jul 15 19:00:45 UTC 2017
IMO, we pretty much have a broken branch-2.7 for a while now. I can't seem
to get geneve tunnels to work across 2 machines in a flaky way. I am not
sure whether this is the cause of the problem, but looks like others are
seeing similar issues.
On 14 July 2017 at 06:34, Huanle Han <hanxueluo at gmail.com> wrote:
> Hi, Ben
> I see some netdev which is not added to ovs constructs and destructs
> frequently because of route_table_handle_msg(). and there may be some side
> effects everytime reopen the netdev, such as netdev_change_seq_changed.
>
> Is it neccessy that open a netdev not in ovs?
> How about not open netdev or return failure for it?
>
> Thanks
> Huanle
>
>
> On Fri, Jul 14, 2017 at 2:33 AM, Ben Pfaff <blp at ovn.org> wrote:
>
> > On Wed, Jul 05, 2017 at 05:03:06PM +0200, Eelco Chaudron wrote:
> > > On 23/06/17 10:37, Eelco Chaudron wrote:
> > > >On 06/23/2017 02:12 AM, Ben Pfaff wrote:
> > > >>On Thu, Jun 22, 2017 at 04:04:44PM -0300, Flavio Leitner wrote:
> > > >>>On Thu, Jun 22, 2017 at 01:04:59AM +0800, Huanle Han wrote:
> > > >>>>Hi,all
> > > >>>>
> > > >>>>I get this problem with latest(dbd8112) branch-2.7 code on my
> Ubuntu.
> > > >>>>root at ubuntu:/var/log/# ovs-vsctl show
> > > >>>>adf2ea99-0c53-4180-914f-7dadaa71302b
> > > >>>> Bridge test
> > > >>>> Port test
> > > >>>> Interface test
> > > >>>> type: internal
> > > >>>> Bridge "manage"
> > > >>>> Port "manage"
> > > >>>> Interface "manage"
> > > >>>> type: internal
> > > >>>> error: "could not open network device manage (File
> > > >>>>exists)"
> > > >>>> Port "veth0"
> > > >>>> Interface "veth0"
> > > >>>> Port "eth0"
> > > >>>> Interface "eth0"
> > > >>>> ovs_version: "2.7.0"
> > > >>>>
> > > >>>>How to reproduce:
> > > >>>>1. add bridge "manage", up and add ip on it
> > > >>>>2. restart ovs-vswitchd
> > > >>>>3. "ovs-vsctl show" displays error message.
> > > >>>>
> > > >>>>The reason:
> > > >>>>In following "netdev_open" call on ovs-vswitchd start, input "type"
> > > >>>>is NULL
> > > >>>>and "manage" is opened as a "system" netdev_class iface
> incorrectly.
> > > >>>>
> > > >>>>#0 netdev_open (name=0x7fffffffe2bc "manage", type=0x0,
> > > >>>>netdevp=0x7fffffffc3b0) at ../lib/netdev.c:396
> > > >>>>#1 0x000000000052c492 in get_src_addr (ip6_dst=0x7fffffffe2ac,
> > > >>>>output_bridge=0x7fffffffe2bc "manage", psrc=0x8f3490) at
> > > >>>>../lib/ovs-router.c:141
> > > >>>>#2 0x000000000052c85d in ovs_router_insert__ (priority=104 'h',
> > > >>>>ip6_dst=0x7fffffffe29c, plen=104 'h', output_bridge=0x7fffffffe2bc
> > > >>>>"manage", gw=0x7fffffffe2ac) at ../lib/ovs-router.c:202
> > > >>>>#3 0x000000000052c980 in ovs_router_insert (ip_dst=0x7fffffffe29c,
> > > >>>>plen=104 'h', output_bridge=0x7fffffffe2bc "manage",
> > > >>>>gw=0x7fffffffe2ac) at
> > > >>>>../lib/ovs-router.c:228
> > > >>>>#4 0x000000000058f63a in route_table_handle_msg
> > > >>>>(change=0x7fffffffe290) at
> > > >>>>../lib/route-table.c:295
> > > >>>>#5 0x000000000058f1da in route_table_reset () at
> > > >>>>../lib/route-table.c:174
> > > >>>>#6 0x000000000058f034 in route_table_init () at
> > > >>>>../lib/route-table.c:110
> > > >>>>#7 0x0000000000495838 in dp_initialize () at ../lib/dpif.c:126
> > > >>>>#8 0x0000000000495b40 in dp_enumerate_types (types=0x7fffffffe3a0)
> > at
> > > >>>>../lib/dpif.c:244
> > > >>>>#9 0x000000000042eb1c in enumerate_types (types=0x7fffffffe3a0) at
> > > >>>>../ofproto/ofproto-dpif.c:267
> > > >>>>#10 0x000000000041b81c in ofproto_enumerate_types
> > > >>>>(types=0x7fffffffe3a0) at
> > > >>>>../ofproto/ofproto.c:432
> > > >>>>#11 0x000000000040df1e in bridge_run__ () at
> > ../vswitchd/bridge.c:3020
> > > >>>>#12 0x000000000040e196 in bridge_run () at
> ../vswitchd/bridge.c:3082
> > > >>>>#13 0x00000000004138ef in main (argc=1, argv=0x7fffffffe578) at
> > > >>>>../vswitchd/ovs-vswitchd.c:119
> > > >>>>
> > > >>>>After then, ovs fails to netdev_open "manage" with type ==
> > "internal".
> > > >>>>"File exists" error is reported.
> > > >>>>I think commit d3b8f50(netdev: Fix netdev_open() to adhere to class
> > > >>>>type if
> > > >>>>given) introduces this problem. It need be improved.
> > > >>>Your analysis is correct. One solution is to wait vswitchd to
> > > >>>configure first and only then enable ovs-route. The problem is that
> > > >>>more modules might use netdev_open() and it doesn't sound like a
> good
> > > >>>idea to have a control per module.
> > > >>>
> > > >>>Another option is to map the device's info to a class instead of
> using
> > > >>>"system", so we could use internal classes for vports, for instance.
> > > >>>However, that doesn't guarantee it will match with what is
> configured
> > > >>>in the DB.
> > > >>This sounds like the kind of problem that I expected the following
> > > >>commit might cause.
> > > >>
> > > >> commit d3b8f5052292b3ba9084ffed097e90b87f2950f5
> > > >> Author: Eelco Chaudron <echaudro at redhat.com>
> > > >> Date: Thu Jun 1 14:38:09 2017 +0200
> > > >>
> > > >> netdev: Fix netdev_open() to adhere to class type if given
> > > >>
> > > >>If we can't fix it somehow, we might need to revert.
> > > >
> > > >Reverting the above patch does not solve the real issue here. If we
> > revert
> > > >we
> > > >donot get the error, but the wrong class is used, hence the wrong
> > > >callbacks
> > > >get called.
> > > >
> > > >The main issue is with netdev_open() being called with type = NULL
> > before
> > > >the interface is actually configured in the system. We could track
> these
> > > >"auto" generated interfaces, and once netdev_open() gets called with a
> > > >valid type reconfigure (re-create) it. netdev_remove() could work here
> > > >but not sure if its safe due to reference counting.
> > > >
> > > Some background info; I see a lot of netdev_open() with a NULL type,
> but
> > > they just grep some data and closed it again. I found that the tunnel
> > code
> > > is keeping a reference to the netdev (for the IP assigned) and hence it
> > is
> > > not going away before the "real" interface opens it.
> > >
> > > The change below is based on tracking the "classless" opened devices,
> and
> > > once they get opened with a "real" class, re-create them. I did some
> > basic
> > > testing and it seem to work fine, also went over related code and could
> > not
> > > find any corner case.
> > >
> > > So please take a peek, and if it all makes sense I can send out an
> > official
> > > patch.
> >
> > It's not exactly pleasing, but it's the best solution I've seen so far.
> >
> > I wonder whether there's a way to better figure out the class type in
> > advance.
> >
> > Let's try the solution you're proposing. You want to send it out?
> >
> > Thanks,
> >
> > Ben.
> >
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
More information about the dev
mailing list