[ovs-dev] Fail to netdev_open internal iface with error "File exists"

Eelco Chaudron echaudro at redhat.com
Fri Jun 23 08:37:13 UTC 2017


On 06/23/2017 02:12 AM, Ben Pfaff wrote:
> On Thu, Jun 22, 2017 at 04:04:44PM -0300, Flavio Leitner wrote:
>> On Thu, Jun 22, 2017 at 01:04:59AM +0800, Huanle Han wrote:
>>> Hi,all
>>>
>>> I get this problem with latest(dbd8112) branch-2.7 code on my Ubuntu.
>>> root at ubuntu:/var/log/# ovs-vsctl show
>>> adf2ea99-0c53-4180-914f-7dadaa71302b
>>>      Bridge test
>>>          Port test
>>>              Interface test
>>>                  type: internal
>>>      Bridge "manage"
>>>          Port "manage"
>>>              Interface "manage"
>>>                  type: internal
>>>                  error: "could not open network device manage (File exists)"
>>>          Port "veth0"
>>>              Interface "veth0"
>>>          Port "eth0"
>>>              Interface "eth0"
>>>      ovs_version: "2.7.0"
>>>
>>> How to reproduce:
>>> 1. add bridge "manage", up and add ip on it
>>> 2. restart ovs-vswitchd
>>> 3. "ovs-vsctl show" displays error message.
>>>
>>> The reason:
>>> In following "netdev_open" call on ovs-vswitchd start, input "type" is NULL
>>> and "manage" is opened as a "system" netdev_class iface incorrectly.
>>>
>>> #0  netdev_open (name=0x7fffffffe2bc "manage", type=0x0,
>>> netdevp=0x7fffffffc3b0) at ../lib/netdev.c:396
>>> #1  0x000000000052c492 in get_src_addr (ip6_dst=0x7fffffffe2ac,
>>> output_bridge=0x7fffffffe2bc "manage", psrc=0x8f3490) at
>>> ../lib/ovs-router.c:141
>>> #2  0x000000000052c85d in ovs_router_insert__ (priority=104 'h',
>>> ip6_dst=0x7fffffffe29c, plen=104 'h', output_bridge=0x7fffffffe2bc
>>> "manage", gw=0x7fffffffe2ac) at ../lib/ovs-router.c:202
>>> #3  0x000000000052c980 in ovs_router_insert (ip_dst=0x7fffffffe29c,
>>> plen=104 'h', output_bridge=0x7fffffffe2bc "manage", gw=0x7fffffffe2ac) at
>>> ../lib/ovs-router.c:228
>>> #4  0x000000000058f63a in route_table_handle_msg (change=0x7fffffffe290) at
>>> ../lib/route-table.c:295
>>> #5  0x000000000058f1da in route_table_reset () at ../lib/route-table.c:174
>>> #6  0x000000000058f034 in route_table_init () at ../lib/route-table.c:110
>>> #7  0x0000000000495838 in dp_initialize () at ../lib/dpif.c:126
>>> #8  0x0000000000495b40 in dp_enumerate_types (types=0x7fffffffe3a0) at
>>> ../lib/dpif.c:244
>>> #9  0x000000000042eb1c in enumerate_types (types=0x7fffffffe3a0) at
>>> ../ofproto/ofproto-dpif.c:267
>>> #10 0x000000000041b81c in ofproto_enumerate_types (types=0x7fffffffe3a0) at
>>> ../ofproto/ofproto.c:432
>>> #11 0x000000000040df1e in bridge_run__ () at ../vswitchd/bridge.c:3020
>>> #12 0x000000000040e196 in bridge_run () at ../vswitchd/bridge.c:3082
>>> #13 0x00000000004138ef in main (argc=1, argv=0x7fffffffe578) at
>>> ../vswitchd/ovs-vswitchd.c:119
>>>
>>> After then, ovs fails to netdev_open "manage" with type == "internal".
>>> "File exists" error is reported.
>>> I think commit d3b8f50(netdev: Fix netdev_open() to adhere to class type if
>>> given) introduces this problem. It need be improved.
>> Your analysis is correct.  One solution is to wait vswitchd to
>> configure first and only then enable ovs-route.  The problem is that
>> more modules might use netdev_open() and it doesn't sound like a good
>> idea to have a control per module.
>>
>> Another option is to map the device's info to a class instead of using
>> "system", so we could use internal classes for vports, for instance.
>> However, that doesn't guarantee it will match with what is configured
>> in the DB.
> This sounds like the kind of problem that I expected the following
> commit might cause.
>
>      commit d3b8f5052292b3ba9084ffed097e90b87f2950f5
>      Author: Eelco Chaudron <echaudro at redhat.com>
>      Date:   Thu Jun 1 14:38:09 2017 +0200
>
>          netdev: Fix netdev_open() to adhere to class type if given
>
> If we can't fix it somehow, we might need to revert.

Reverting the above patch does not solve the real issue here. If we 
revert we
donot get the error, but the wrong class is used, hence the wrong callbacks
get called.

The main issue is with netdev_open() being called with type = NULL before
the interface is actually configured in the system. We could track these
"auto" generated interfaces, and once netdev_open() gets called with a
valid type reconfigure (re-create) it. netdev_remove() could work here
but not sure if its safe due to reference counting.





More information about the dev mailing list