[ovs-dev] [RFC 1/3] dpdk: Unify passing of DPDK EAL arguments.

Flavio Leitner fbl at sysclose.org
Thu Feb 7 16:56:23 UTC 2019


On Wed, Feb 06, 2019 at 11:33:04AM +0300, Ilya Maximets wrote:
> On 05.02.2019 23:19, Flavio Leitner wrote:
> > On Tue, Feb 05, 2019 at 06:43:08PM +0300, Ilya Maximets wrote:
> >> Hi Flavio.
> >> Thanks for taking a look.
> >>
> >> On 05.02.2019 15:38, Flavio Leitner wrote:
> >>>
> >>> Hi Ilya,
> >>>
> >>> Thanks for the patch and I think we knew about that pain when we
> >>> exposed the very first parameter. I still remember Aaron struggling
> >>> to find the best path forward. Time flies and here we are.
> >>>
> >>> The problem is that this change is not friendly from the community
> >>> perspective to its users. That is like an exposed API which we should
> >>> avoid break as much as possible.
> >>>
> >>> For instance, there are users (OpenStack) with support life cycle of
> >>> 5+ years that cannot keep up with this kind of change but they expect
> >>> to be able to update to newer OVS.
> >>
> >> Sure, there are users that wants stable API that will never change.
> >> But this is really impossible in practice. I'm working with OpenStack
> >> too and will have to fixup deployment tools with these changes. BTW, from
> >> the whole OpenStack only few deployment projects (like TripleO) will need
> >> to make changes and these changes are not very big.
> > 
> > That's only part of the work. There will be work on QA, documentation
> > and even migration path from one to another. And we can't change the
> > past for existing deployments.
> 
> Sure. But incompatible API changes are almost unavoidable for a young
> projects that wants to be better.

My point here is that the associated impact and cost of this change is
far from trivial.

> > I don't think we should remove the docs if the parameters are there as
> > a first step. I mean, assume an existing deployment, there is a parameter
> > which might be in use but there is no documentation available. That
> > doesn't sound like a good user experience to me.
> 
> Maybe we could save a man pages while removing the guides. There is no much
> information in Documentation/intro/install/dpdk.rst anyway.

Agreed.

> > On another hand, you could introduce the new interface and update the
> > docs to recommend using the new one because the old one will be removed
> > in the future. Warning messages come next, and then finally its removal.
> 
> I'd prefer to have warning messages to be there right from the start to
> push users to migrate to the new interfaces as early as possible.

If it's a WARN level message then I can tell you it will break
environments over here. If that is a INFO level message, then it might
be okay.  Though I still think there could be a period of time where
both could co-exist before we announce/warn about the deprecation of
the current interface.

> How about this:
> 
> First stage (apply now, release in 2.12):
>   - Introduce new interface 'dpdk-options'.
>   - Rewrite installation guide with new interface fully removing the old one.
>   - Add new interface to man pages (vswitch.xml) and mark all the old options
>     as deprecated by adding something like: "Deprecated. 'dpdk-options' should
>     be used instead. Will be ignored in the future."
>   - Add a runtime deprecation warning if old interface is in use.

^^^ this is bad as an initial step as explained above.

>   - Ignore values of old knobs if 'dpdk-options' provided.

Sounds like a good transition barrier.


> Second stage (release in 2.13 or 2.14, could wait longer if required):
>   - Remove old interfaces wile keeping the warnings. (i.e. values always ignored)
>   - Remove old knobs from the man pages.
> 
> Third stage (optional):
>   - Remove warnings.
> 
> So the main difference from the current patches is delaying removal of the man
> pages to the second stage.
[...] 
> >> We're keeping few sane defaults like providing default lcore and setting the
> >> socket-limit if needed. And we'll try to do that in the future. The thing
> >> this patch tries to eliminate is the dependency tracking between different
> >> EAL arguments, i.e. duplicating the work with rte_eal_init() and having too
> >> many configuration knobs with similar meanings what does not work at the
> >> same time.
> >>
> >> Right now there are no critical arguments that user must provide.
> >> So, IMHO, having special configs for them is really unnecessary.
> > 
> > Do you think the defaults work for the majority of DPDK users?
> 
> My personal experience is opposite. Most of my deployments and deployments that
> I had to work with had massive 'dpdk-extra' arguments. These are mostly memory
> tuning options, pci whitelists and various virtual devices. And these 'dpdk-extra'
> strings only grows over time with '--socket-limit', '--legacy-mem',
> '--single-file-segments', '--in-memory', '--file-prefix' and so on.

That is indeed a problem, but moving to 'dpdk-options' will not solve it.

> Defaults are OK if your application is the only application you want to see on
> a system. Otherwise, you need to fight with DPDK for resources/devices and your
> (almost) only weapon is EAL argument string.

The above looks like to be the real motivation behind the change, which is
fine, but somehow I missed that while reading the patchset.

> > If we expose only the necessary things to consume DPDK it means we
> > have tight control (less combinations to worry about), otherwise if
> > the user can pass anything, a can of worms is opened and we have no
> > control of all things the user can do.
> 
> We have only illusion of control. User can pass anything with 'dpdk-extra'.
> Also, the only necessary thing with dpdk 18.11 is an lcore list.

My statement was generic which remains true, but you're right that we
already exposed dpdk-extra and maybe that was already a mistake.

> IMHO, non-proficent users are happy with defaults. If user wants to tune,
> 'dpdk-extra' will be used anyway.

My point is that this is going to an unhealthy direction. We will be
telling that users are in full control of all possible options and
combinations. The user may be or not proficient, doesn't matter because
in the end, users do stupid things and OVS is the one at the front to
blame.

The only reason that '--socket-mem' meaning change was brought up was
because we exposed that to users. That would not have happened if we had
only --dpdk-options and then I wonder how we will manage OVS upgrades
in the future when we upgrade DPDK.

fbl



More information about the dev mailing list