[ovs-dev] [PATCH] dpif-netdev: Fix crash when PACKET_OUT is metered

Tony van der Peet Tony.vanderPeet at alliedtelesis.co.nz
Sat Jun 19 02:11:48 UTC 2021


Hi all

I am in favour of a better fix, bandaids tend to come back and bite you in the end anyway. So, will be happy to help with this effort, though I am probably going to have to defer to the rest of you when it comes to actually knowing what to do in this area.

Cheers
Tony
________________________________________
From: Aaron Conole <aconole at redhat.com>
Sent: Saturday, 19 June 2021 2:50 a.m.
To: Ilya Maximets
Cc: Ben Pfaff; Tony van der Peet; dev at openvswitch.org; Tony van der Peet
Subject: Re: [ovs-dev] [PATCH] dpif-netdev: Fix crash when PACKET_OUT is metered

Ilya Maximets <i.maximets at ovn.org> writes:

> On 6/17/21 11:59 PM, Ben Pfaff wrote:
>> All these flags for stealing, allowing stealing, blah blah, are just
>> ways to do some kind of dumb reference counting without actually have a
>> reference count.  When it gets super complex like this, maybe
>> introducing a reference count is the way to go.  It would be a bigger
>> change, but perhaps more maintainable over time.
>>
>
> Yes, I absolutely agree.  We just removed one of these hacky flags from
> the struct dp_packet_batch and we need to get rid of the rest of them.

+1 from me - it's a bigger scoped change, but over-all, I think it's a
better one, and makes the most sense.

> One thing that bothers me is that some parts of the code treats
> 'should_steal=false' as "do not modify".  For example, I don't really
> understand why these functions are carrying cutlen around and clones
> the packet before truncating if 'should_steal=false'.
>
> Some action executors also have optimizations that allows to not clone
> the packet before the fatal action if this action is the last one.
>
> So, in general, yes, we need to get rid of these flags and accurately
> re-work all the packet paths.  And yes, this would be not a small
> change.
>
> For now, I think, we need to find a less ugly as possible solution for
> existing crash (maybe the one that I suggested), so we will be able to
> backport it and continue working on correct reference counting.
>
> What do you think?  Tony?  Aaron?

That makes sense to me, and I hope we will actually work on the
ref-count stuff.  I had started taking a look a few weeks back, but the
idea of 'steal' is pretty ingrained throughout the code, so getting a
ref-count semantic correct is a big effort.  As an example, the
odp-execute area expects that each sub-action will have its own copy to
modify (and this is documented with the 'should_steal' semantics).
Taking that out will be a big rework in that area.  I think it makes
the most sense, and we probably could implement something like COW to
cover those cases where we really need to modify a packet without
propagating those changes back up.

> Best regards, Ilya Maximets.



More information about the dev mailing list