[ovs-dev] [PATCH] netlink-socket: Exit NL transaction loop when EINVAL is returned

Alin Serdean aserdean at cloudbasesolutions.com
Tue Apr 14 21:19:10 UTC 2015


In nl_sock_transact_multiple__ we do the following:

    if (!DeviceIoControl(sock->handle, OVS_IOCTL_TRANSACT,
                         txn->request->data,
                         txn->request->size,
                         reply_buf, sizeof reply_buf,
                         &reply_len, NULL)) {
        /* XXX: Map to a more appropriate error. */
        error = EINVAL;
        break;
    }

We map every DeviceIoControl failure to EINVAL, which is why the problem did not show up on Linux.

We should definitely log the error using ovs_lasterror_to_string before setting it to EINVAL.

As an idea, maybe in some situations (e.g. STATUS_INVALID_PARAMETER) we should just increase the number of transactions, to allow the rest of the transactions to be processed.

Alin.
-----Original Message-----
From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Ben Pfaff
Sent: Tuesday, April 14, 2015 11:42 PM
To: Sorin Vinturis
Cc: dev at openvswitch.org
Subject: Re: [ovs-dev] [PATCH] netlink-socket: Exit NL transaction loop when EINVAL is returned

On Tue, Apr 14, 2015 at 08:25:59PM +0000, Sorin Vinturis wrote:
> The nl_sock_transact_multiple function enters an infinite loop when
> the invalid-argument error, EINVAL, is returned by
> nl_sock_transact_multiple__. EINVAL is the error returned by the latter
> function when a driver request fails.
> 
> Signed-off-by: Sorin Vinturis <svinturis at cloudbasesolutions.com>
> Reported-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/57

I see that this fixes a bug, even on Linux.  Thank you.  It's actually a pretty serious bug (given the infinite loop), but I guess that it must not occur in any normal circumstances, otherwise we would have heard about it over the years.
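To illustrate the kind of loop being discussed, here is a minimal sketch. It is a simplified model and not the actual OVS code: failing_batch(), run_outer_loop(), break_on_error and max_iters are all hypothetical names, and the iteration cap exists only so the demonstration terminates. The point is that if the inner call reports an error without completing any transaction, an outer loop that only advances by the number of completed transactions never makes progress unless it breaks on the error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the inner batch call: it always fails with
 * EINVAL before completing any transaction. */
static int
failing_batch(size_t n, size_t *done)
{
    (void) n;
    *done = 0;
    return EINVAL;
}

/* Simplified model of the outer loop.  Returns the number of iterations
 * executed.  'max_iters' is only a safety cap for the demonstration. */
static size_t
run_outer_loop(size_t n, int break_on_error, size_t max_iters)
{
    size_t iters = 0;

    assert(max_iters > 0);
    while (n > 0 && iters < max_iters) {
        size_t done;
        int error = failing_batch(n, &done);

        iters++;
        n -= done;              /* No progress when done == 0. */
        if (error && break_on_error) {
            break;              /* Abort the remaining transactions. */
        }
    }
    return iters;
}
```

Without the break, the model only stops because of the cap; with it, the loop exits after the first failed batch.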

However, I want to make sure of something before I commit it.
nl_sock_transact_multiple__() should only return an error in the case of a "transport" error, that is, of some problem communicating with the datapath (e.g. the kernel module has been removed or something similarly fatal).  It should not return an error in cases where some message asks the datapath to do something erroneous (e.g. to add a flow that the datapath doesn't understand, to delete a vport that doesn't exist, ...).  This is because only in the former case should all of the transactions be aborted; in the latter case, any remaining transactions should still be processed.
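The distinction above could be sketched roughly as follows. This is an illustrative model under stated assumptions, not the real netlink-socket code: struct txn, fake_datapath_send() and transact_batch() are hypothetical, a negative request models an unreachable datapath, and a request of 0 models a message the datapath rejects.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified transaction: 'request' stands in for the
 * message, 'error' receives the per-transaction result. */
struct txn {
    int request;
    int error;
};

/* Hypothetical datapath stub.  Returns nonzero only for a transport
 * failure; a rejected request is reported through '*reply_error'. */
static int
fake_datapath_send(int request, int *reply_error)
{
    if (request < 0) {
        return ENODEV;                          /* Datapath unreachable. */
    }
    *reply_error = request == 0 ? ENOENT : 0;   /* 0 models a bad request. */
    return 0;
}

/* Processes up to 'n' transactions.  Stores the number completed in
 * '*done' and returns 0, or a positive errno value on a transport error. */
static int
transact_batch(struct txn *txns, size_t n, size_t *done)
{
    assert(txns != NULL && done != NULL);
    *done = 0;
    for (size_t i = 0; i < n; i++) {
        int transport_error = fake_datapath_send(txns[i].request,
                                                 &txns[i].error);
        if (transport_error) {
            return transport_error;     /* Abort: the rest cannot be sent. */
        }
        (*done)++;                      /* Completed, even if rejected. */
    }
    return 0;
}
```

In this model a rejected request only sets its own 'error' field and the batch carries on, while a transport failure stops the batch and is returned to the caller.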