[ovs-dev] [PATCH] netlink-socket: Exit NL transaction loop when EINVAL is returned

Ben Pfaff blp at nicira.com
Tue Apr 14 21:39:06 UTC 2015


On Tue, Apr 14, 2015 at 09:30:28PM +0000, Nithin Raju wrote:
> > On Apr 14, 2015, at 1:41 PM, Ben Pfaff <blp at nicira.com> wrote:
> > 
> > On Tue, Apr 14, 2015 at 08:25:59PM +0000, Sorin Vinturis wrote:
> >> The nl_sock_transact_multiple function enters an infinite loop
> >> when nl_sock_transact_multiple__ returns the error EINVAL.
> >> EINVAL is the error returned by the latter function when a driver
> >> request fails.
> >> 
> >> Signed-off-by: Sorin Vinturis <svinturis at cloudbasesolutions.com>
> >> Reported-by: Alin Gabriel Serdean <aserdean at cloudbasesolutions.com>
> >> Reported-at: https://github.com/openvswitch/ovs-issues/issues/57
> > 
> > I see that this fixes a bug, even on Linux.  Thank you.  It's actually
> > a pretty serious bug (given the infinite loop), but I guess that it
> > must not occur in any normal circumstances, otherwise we would have
> > heard about it over the years.
> > 
> > However, I want to make sure of something before I commit it.
> > nl_sock_transact_multiple__() should only return an error in the case
> > of a "transport" error, that is, of some problem communicating with
> > the datapath (e.g. the kernel module has been removed or something
> > similarly fatal).  It should not return an error in cases where some
> > message asks the datapath to do something erroneous (e.g. to add a
> > flow that the datapath doesn't understand, to delete a vport that
> > doesn't exist, ...).  This is because only in the former case should
> > all of the transactions be aborted; in the latter case, any remaining
> > transactions should still be processed.
> 
> What are the genetlink semantics for returning an error from the OVS
> module in Linux? I looked at the OVS code, and it returns -EINVAL if
> it runs into a condition of invalid input (e.g. a flow with key & UFID
> missing). If the OVS module returns -EINVAL, does Linux's netlink
> layer massage the message to insert the error into a "struct
> nlmsgerr", and return 0?

The netlink code turns that sort of return value into an
nlmsghdr+nlmsgerr.
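
As a rough illustration of that nlmsghdr+nlmsgerr pair, here is a minimal
sketch that builds and then parses such a reply in userspace.  The struct
layouts and NLMSG_* macros come from <linux/netlink.h>; the helper names
build_error_reply() and parse_error_reply() are made up for this example
and are not functions from the kernel or from OVS.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: builds in 'buf' the kind of reply that is generated
 * when a request handler returns 'handler_ret' (0 or a negative errno) for
 * request 'req'.  Returns the reply length, or 0 if 'buf' is too small. */
static size_t
build_error_reply(void *buf, size_t size, const struct nlmsghdr *req,
                  int handler_ret)
{
    size_t len = NLMSG_LENGTH(sizeof(struct nlmsgerr));
    struct nlmsghdr nlh;
    struct nlmsgerr err;

    assert(buf != NULL && req != NULL);
    if (size < len) {
        return 0;
    }

    memset(&nlh, 0, sizeof nlh);
    nlh.nlmsg_len = (uint32_t) len;
    nlh.nlmsg_type = NLMSG_ERROR;
    nlh.nlmsg_seq = req->nlmsg_seq;     /* Lets the sender match the reply. */
    nlh.nlmsg_pid = req->nlmsg_pid;

    memset(&err, 0, sizeof err);
    err.error = handler_ret;            /* Negative errno, or 0 for an ACK. */
    err.msg = *req;                     /* Header of the offending request. */

    memset(buf, 0, len);
    memcpy(buf, &nlh, sizeof nlh);
    memcpy((char *) buf + NLMSG_HDRLEN, &err, sizeof err);
    return len;
}

/* Hypothetical helper: returns 1 and stores a positive errno (0 for an ACK)
 * in '*errorp' if 'buf' holds an NLMSG_ERROR message, otherwise returns 0. */
static int
parse_error_reply(const void *buf, size_t len, int *errorp)
{
    struct nlmsghdr nlh;
    struct nlmsgerr err;

    assert(buf != NULL && errorp != NULL);
    if (len < NLMSG_LENGTH(sizeof err)) {
        return 0;
    }
    memcpy(&nlh, buf, sizeof nlh);
    if (nlh.nlmsg_type != NLMSG_ERROR
        || nlh.nlmsg_len < NLMSG_LENGTH(sizeof err)) {
        return 0;
    }
    memcpy(&err, (const char *) buf + NLMSG_HDRLEN, sizeof err);
    *errorp = -err.error;
    return 1;
}
```

The point of the sketch is that the -EINVAL travels inside the message
payload; the receive call that delivered the buffer itself succeeded.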

> We have a little confusion about that. We currently return the error
> (and also the struct nlmsgerr) in the equivalent of recvmsg().

Linux would only return an error from recvmsg() if something really
bad happened.  You can think of the difference between these two types
of errors as being like a "real" network connection.  Suppose you're
running a Netlink-like RPC protocol over a TCP socket instead of a
Netlink socket.  You only get an error from recvmsg() if something
really bad happens like the TCP connection dropping.  If the remote
process wants to send you an error reply to your request, you get it
as a message encapsulated inside the TCP connection data, not as an
error returned by recvmsg().
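
To make the two kinds of error concrete, here is a simplified sketch of a
batch loop that treats them differently.  Everything in it (struct txn,
recv_reply_fn, transact_batch(), toy_recv_reply()) is hypothetical and
only stands in for the real transaction code; it is not how
nl_sock_transact_multiple__() is implemented.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-request record (not the OVS 'struct nl_transaction'). */
struct txn {
    int request;    /* Stand-in for the request payload. */
    int error;      /* Per-request result: 0 or a positive errno. */
    int done;       /* Nonzero once a reply has been received. */
};

/* Hypothetical receive callback modeled loosely on recvmsg(): returns 0 and
 * stores the error carried inside the reply in '*reply_error' if a reply
 * arrived, or returns a positive errno if the transport itself failed. */
typedef int recv_reply_fn(int request, int *reply_error);

/* Returns 0 if every request received a reply, even an error reply.
 * Returns a positive errno on a transport failure, in which case the
 * failing request and all remaining ones are marked with that error. */
static int
transact_batch(struct txn *txns, size_t n, recv_reply_fn *recv_reply)
{
    for (size_t i = 0; i < n; i++) {
        int reply_error = 0;
        int transport_error = recv_reply(txns[i].request, &reply_error);

        if (transport_error) {
            /* Transport error: abort the rest of the batch. */
            for (size_t j = i; j < n; j++) {
                txns[j].error = transport_error;
                txns[j].done = 0;
            }
            return transport_error;
        }

        /* Error reply: record it and keep processing the batch. */
        txns[i].error = reply_error;
        txns[i].done = 1;
    }
    return 0;
}

/* Toy datapath for illustration: a negative request draws an EINVAL error
 * reply, and request 0 simulates the datapath disappearing (ENODEV). */
static int
toy_recv_reply(int request, int *reply_error)
{
    if (request == 0) {
        return ENODEV;
    }
    *reply_error = request < 0 ? EINVAL : 0;
    return 0;
}
```

With toy_recv_reply(), a batch of requests {1, -1, 2} completes with a
return value of 0 and only the middle request marked EINVAL, whereas
{1, 0, 2} stops at the second request and returns ENODEV.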


