[ovs-dev] [PATCH] rhel: update udev rules to allow vfio access

Flavio Leitner fbl at sysclose.org
Tue May 14 14:39:19 UTC 2019


On Fri, May 10, 2019 at 02:31:26PM -0400, Aaron Conole wrote:
> Aaron Conole <aconole at redhat.com> writes:
> 
> > Flavio Leitner <fbl at sysclose.org> writes:
> >
> >> On Thu, Apr 18, 2019 at 01:46:22PM -0600, Alex Williamson wrote:
> >>> On Thu, 18 Apr 2019 15:50:43 -0300
> >>> Flavio Leitner <fbl at sysclose.org> wrote:
> >>> 
> >>> > On Thu, Apr 18, 2019 at 12:06:57PM -0600, Alex Williamson wrote:
> >>> > > On Thu, 18 Apr 2019 13:56:23 -0300
> >>> > > Flavio Leitner <fbl at sysclose.org> wrote:
> >>> > >   
> >>> > > > On Thu, Apr 18, 2019 at 10:43:11AM -0600, Alex Williamson wrote:  
> >>> > > > > On Thu, 18 Apr 2019 13:23:54 -0300
> >>> > > > > Flavio Leitner <fbl at sysclose.org> wrote:
> >>> > > > Another thing is that when the module is ready and the event is sent
> >>> > > > out, what keeps OVS from trying to open the device and getting EACCES
> >>> > > > before udev is triggered to fix the device permission?
> >>> > > 
> >>> > > If there were a race, could ovs ever run before udev on system
> >>> > > startup?  Probably not.  
> >>> > 
> >>> > It does wait, but only for udev to settle, which means that if the
> >>> > module has not triggered an event by that time, OVS will not wait
> >>> > and we still have a race.
> >>> 
> >>> But udev isn't waiting on the module to trigger an event, the module
> >>> contains a MODULE_ALIAS, so I believe it's just the static processing
> >>> of the modules.alias that triggers the event.
> >>
> >> What I am saying is that driverctl will load the module and bind
> >> the device; later on systemd will start the OVS service, which
> >> waits for udev to settle, but none of that guarantees that the
> >> permissions are updated when OVS is initializing, see below.
> >>
> >>> > >  Ideally perhaps a cleaner solution might be an
> >>> > > explicit dependency on the vfio module specific to ovs startup rather
> >>> > > than changing a system policy, but it really depends on the context and
> >>> > > use cases.  Thanks,  
> >>> > 
> >>> > It does have one. driverctl will bind the devices to vfio-pci, but
> >>> > the problem is which signal we should rely on to know whether
> >>> > the vfio module is still initializing, has failed, or has finished.
> >>> 
> >>> What signal/mechanism is being used currently?  If driverctl is asked
> >>> to set a driver override it does:
> >>> 
> >>>  1) if module is not loaded, modprobe
> >>>  2) unbinds device from existing driver, if any
> >>>  3) sets driver_override
> >>>  4) triggers drivers_probe
> >>>  5) tests if device is bound to a driver, any driver
> >>> 
> >>> There are certainly some deficiencies here, unbinding the device before
> >>> setting the driver_override leaves the device open to getting bound by
> >>> the wrong driver, and the verification in the last step could be more
> >>> specific in testing for binding to the correct driver, but step #1 is
> >>> the modprobe of the driver, which should be a synchronous operation.
> >>> We shouldn't be able to complete a 'driverctl set-override $DEV
> >>> vfio-pci' without vfio being initialized, afaict.  Thanks,
> >>
> >> Right, sounds like systemd is starting openvswitch service before
> >> the driverctl is done with the devices.
> >
> > I'm not sure.  The ordering could be a problem.
> >
> > Perhaps we could try adding:
> >
> >   After=basic.target
> >
> > for the ovs-vswitchd.service if we have a machine that exhibits this
> > behavior, but I don't know if it will resolve the race.  There is some
> > kind of strange ordering looking at:
> >
> > https://www.freedesktop.org/software/systemd/man/systemd.special.html
> > and
> > https://www.freedesktop.org/software/systemd/man/bootup.html#
> >
> > I can't find how network.target dependency really works w.r.t. ordering
> > and the driverctl+basic.target services.
> 
> Ping?  Any thoughts?  Do you have an alternative approach you'd rather
> see?  I can try asking the customer if they can test out the
> After=basic.target change I propose, but I'm not positive it will
> resolve anything.  And if it doesn't, I want to be able to say "well,
> here's a follow up."
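
On step 5 of the driverctl sequence quoted above (testing that the
device is bound to a driver, any driver), a stricter check could compare
the bound driver's name against the expected one. The following is only
a rough sketch, not what driverctl does: the function names and the
sysfs_root parameter are hypothetical, and it relies on the "driver"
entry of a PCI device in sysfs being a symlink to the driver directory.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to pci_addr, or None if unbound.

    pci_addr is a full PCI address such as "0000:03:00.0".
    sysfs_root is a parameter only so the function can be tested
    against a fake directory tree (hypothetical helper).
    """
    link = os.path.join(sysfs_root, pci_addr, "driver")
    try:
        # The link target ends with the driver name, e.g. ".../drivers/vfio-pci".
        target = os.readlink(link)
    except OSError:
        # No "driver" link: the device is missing or not bound to anything.
        return None
    return os.path.basename(target)


def is_bound_to(pci_addr, expected="vfio-pci", sysfs_root="/sys/bus/pci/devices"):
    """True only if the device is bound to the expected driver."""
    return bound_driver(pci_addr, sysfs_root) == expected
```

A caller would use something like is_bound_to("0000:03:00.0") after the
override has been set; the address here is just an example value.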

IIRC we have a dependency on systemd-udev-settle.service, which
would mean systemd would wait for the device probing to be done,
but apparently it doesn't mean that udev rules have completed
execution.
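
To make that gap concrete: if nothing orders OVS after the udev rule
that fixes the permissions, one workaround would be to poll the device
node until it becomes accessible. This is a minimal sketch under stated
assumptions, not something OVS or DPDK does today. The function name,
the timeout values and the example path are all hypothetical.

```python
import os
import time


def wait_for_access(path, timeout=10.0, interval=0.1):
    """Poll until path is readable and writable by the current process.

    Returns True as soon as access is granted, or False once the
    timeout (in seconds) has expired. Hypothetical helper.
    """
    deadline = time.monotonic() + timeout
    while True:
        # os.access() is False both when the node does not exist yet
        # and when its permissions have not been updated yet.
        if os.access(path, os.R_OK | os.W_OK):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, wait_for_access("/dev/vfio/42", timeout=5.0) would give the
udev rule up to five seconds to run, where 42 stands in for whatever
IOMMU group the device belongs to. Polling only hides the race; a proper
ordering dependency would still be the cleaner fix.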

Maybe running systemd-analyze after reproducing the issue can
shed some light?

Or change ovs-vswitchd.service to take a snapshot of all running
processes when the service is starting (ExecStartPre)? That way
we will know if modprobe is still running and whatnot.
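
A listing of the running processes at service start, as suggested
above, could be produced by a small helper like the one below. It is
only a sketch: the function names are hypothetical, and it assumes a
Linux /proc where each numeric directory has a "comm" file holding the
command name.

```python
import os


def snapshot_processes(proc_root="/proc"):
    """Return sorted (pid, command name) pairs for all visible processes.

    proc_root is a parameter only so the function can be tested
    against a fake directory tree (hypothetical helper).
    """
    result = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        comm_path = os.path.join(proc_root, entry, "comm")
        try:
            with open(comm_path, errors="replace") as f:
                result.append((int(entry), f.read().strip()))
        except OSError:
            continue  # the process exited between listdir() and open()
    return sorted(result)


def main():
    """Print one "pid command" line per process."""
    for pid, comm in snapshot_processes():
        print(pid, comm)
```

A script that calls main() could then be run from an ExecStartPre= line
in the unit, so that the output is captured along with the rest of the
service's log, and a lingering modprobe would show up by name.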

fbl

