[ovs-dev] [PATCHv2 1/2] xlate: auto ofproto trace when recursion too deep

William Tu u9012063 at gmail.com
Wed Feb 28 23:01:24 UTC 2018


On Wed, Feb 28, 2018 at 2:14 PM, Ben Pfaff <blp at ovn.org> wrote:
> On Mon, Feb 26, 2018 at 01:43:39PM -0800, William Tu wrote:
>> On Wed, Feb 14, 2018 at 3:03 PM, Ben Pfaff <blp at ovn.org> wrote:
>> > On Wed, Feb 14, 2018 at 10:56:08AM -0800, William Tu wrote:
>> >> Usually ofproto/trace is used to debug the flow translation error.
>> >> When translation error such as recursion too deep or too many resubmit,
>> >> the issue might happen momentary; flows causing the recursion expire
>> >> when users try to debug it.  This patch enables the ofproto trace
>> >> automatically when recursion is too deep or too many resubmit, by
>> >> invoking the translation again, and log the ofproto trace as warnings.
>> >> Since the log will be huge, rate limit to one per minute.
>> >>
>> >> VMWare-BZ: #2054659
>> >> Signed-off-by: William Tu <u9012063 at gmail.com>
>> >> Tested-by: Greg Rose <gvrose8192 at gmail.com>
>> >> Reviewed-by: Greg Rose <gvrose8192 at gmail.com>
>> >
>> > Thanks for working on this!
>> >
>> > Some of the data passed to ofproto_trace() is also passed to the
>> > xlate_actions() call, indirectly.  Did you check whether that data is
>> > possibly modified by xlate_actions()?  If it is, then we might have to
>> > reconsider this approach, because flow data, etc. is very large and I
>> > don't think that we can afford to always make a copy of it in advance on
>> > the chance that the original might be needed for tracing later.
>>
>> We pass the "const struct flow *" and "const struct dp_packet *" to
>> the ofproto_trace(),
>> so I think these two data is unmodified and xlate_in_init() actually make a copy
>> of the "struct flow", so xlate_actions() modifies a separate copy of the flow.
>
> OK.
>
>> > I think that VLOG_WARN is a very high log level for this data.  I would
>> > tend to use DBG.
>> >
>> I worried that most of the time DBG is not on, and asking people
>> to turn it on and reproduce the issue is hard.  Can we keep VLOG_WARN and
>> rate limit to very low frequent, "VLOG_RATE_LIMIT_INIT(1, 5);",?
>
> OK, let's try that.

Thanks Ben.
Should I send another version of the patch?

William


More information about the dev mailing list