[ovs-dev] [PATCH ovn] Revert "lflow.c: Use warn level log for a lflow that has neither dp nor dpg."

Ilya Maximets i.maximets at ovn.org
Thu Apr 29 16:50:20 UTC 2021


On 4/29/21 5:50 PM, Han Zhou wrote:
> 
> 
> On Thu, Apr 29, 2021 at 3:12 AM Dumitru Ceara <dceara at redhat.com <mailto:dceara at redhat.com>> wrote:
>>
>> On 4/29/21 11:29 AM, Ilya Maximets wrote:
>> > On 4/29/21 10:07 AM, Dumitru Ceara wrote:
>> >> On 4/29/21 3:55 AM, Han Zhou wrote:
>> >>> This reverts commit 0ba2e1a2e2c413f057ac198e4c2bcf00ac40b19f.
>> >>
>> >> Oops, I should've thought about this when reviewing the initial patch,
>> >> good that the CI caught it.
>> >>
>> >>>
>> >>> System tests with "dp-groups=yes" are all failing due to the warning
>> >>> log. Need to investigate why the warning appears. But before that is
>> >>> clear, revert this log level change to make CI pass.
>> >>
>> >> I would remove this part of the commit log or change it to something like:
>> >>
>> >> "With conditional monitoring, if a logical flow refers to a datapath
>> >> group that doesn't contain any local datapaths, ovsdb-server will send
>> >> updates with lflow.logical_dp_group set to NULL.
>> >
>> > This is not correct.  The lflow does have a logical_dp_group in the
>> > message received from ovsdb-server.  It's the IDL that hides it from the
>> > user because it didn't receive the group itself.  And it's allowed to do
>> > that because of the 'min:0, max:1' definition of the column in the schema.
>>
>> You're right, thanks for the correction.
>>
>> >
>> >> This is a valid case
>> >> and we shouldn't log a warning message for it."
>> >
>> > +1
>> > It's a valid case and should not be logged as a warning.
>> >
>> >>
>> >> It's actually expected to have lflows with both logical_dp_group and
>> >> logical_datapath NULL now since we unconditionally monitor lflows with
>> >> datapath group set.
>> >
>> > Another point here is that we can't actually guarantee the order of updates
>> > from the server.  Even though currently ovsdb-server sends updates for
>> > monitored tables in a single 'update' message, it's not mandatory and could
>> > change in the future.  In this case it will be possible to receive an
>> > lflow before receiving its dp_group even with monitor-all.
>>
>> +1
>>
>> So even with the additional patch Han is preparing (to unconditionally
>> monitor all DPGs) we should still revert this change.
>>
> 
> I think it should be guaranteed by the OVSDB transaction model that all changes in the same transaction are sent and received atomically, as long as the rows satisfy the monitor condition. Otherwise, OVSDB transactions are not really useful.

Transactions and monitor updates are separate things; they just happen
to occur sometimes on the same connection.  The transaction itself is
atomic, i.e. it's executed as a whole or not at all.  But between sending
the transaction request and receiving the transaction response, the client
may receive an arbitrary number of database updates that may or may not
be related to the transaction.  As long as these updates are consistent
(i.e. don't violate database integrity), there is no reason to send them
within a single 'update' message.
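
To make the separation concrete, here is a minimal sketch of the three
message shapes involved, assuming the JSON-RPC framing from RFC 7047 (the
OVSDB management protocol); the operation and table-update contents are
abbreviated and the ids are made up:

```python
# A "transact" reply is matched to its request by "id"; monitor "update"
# notifications carry a null id, so any number of them may legitimately
# arrive between the request and its reply.
transact_request = {
    "method": "transact",
    "params": ["OVN_Southbound", {"op": "mutate"}],  # operations abbreviated
    "id": 42,
}
monitor_update = {
    "method": "update",
    "params": ["monitor-id", {"Logical_Flow": {}}],  # table updates abbreviated
    "id": None,  # a notification, not a reply to any request
}
transact_reply = {"result": [{}], "id": 42}

print(transact_reply["id"] == transact_request["id"])  # True
print(monitor_update["id"] is None)                    # True
```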

ovsdb-server guarantees that the transaction response will be sent
after all the monitor updates; only after receiving that response can
the client be sure that it has an up-to-date version of the database
with all the changes it made.

For example, client A sends a transaction with changes X and Y.  Client
B at the same time sends a transaction with change X only.  After that,
both clients receive an update with change X, because the transaction
from B happened to hit ovsdb-server first.  Now assume that X is an
unconditional mutation of a row, i.e. it has no prerequisites.
When the transaction from client A hits ovsdb-server, it will not be
rejected, because X has no prerequisites.  Both X and Y will be
executed.  At this point both clients will receive a monitor update
with only the changes made by operation Y, because X didn't actually
change anything: it mutated the row to exactly the same value it
already had.  After that, client A receives the transaction response
with success.

Summarizing:
1. A sends transaction {X, Y}.
2. A receives update for X.
3. A receives update for Y.
4. A receives transaction response.

And this can happen right now with current implementation of ovsdb-server.
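
The scenario above can be modeled in a few lines.  This is a toy model,
not the ovsdb-server or IDL code: a server that applies transactions
atomically but emits one monitor update per row that actually changed,
so a mutation that leaves a row as-is produces no update at all:

```python
class ToyServer:
    """Toy model: atomic transactions, per-row monitor updates."""
    def __init__(self):
        self.rows = {}      # row name -> value
        self.updates = []   # monitor updates, in the order sent

    def transact(self, changes):
        # The transaction is applied as a whole (atomicity), but ...
        for name, value in changes.items():
            if self.rows.get(name) != value:
                self.rows[name] = value
                # ... only rows that actually changed are reported.
                self.updates.append((name, value))
        # The response is sent after all the updates above.
        return "success"

server = ToyServer()
server.transact({"X": 1})                   # client B's {X} wins the race
reply = server.transact({"X": 1, "Y": 2})   # client A's {X, Y}

# Client A sees: update for X (from B's transaction), update for Y, then
# the response to its own transaction -- X generated no second update.
print(server.updates)  # [('X', 1), ('Y', 2)]
print(reply)           # success
```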

> So, with monitor-all I think this problem should never happen.
> 
> For the other side-effects of monitor-all, as Ilya pointed out in the other reply:
> ===
> I'm not sure if this will actually save wakeups.
> 
> In some cases it will, but it will increase the number of wakeups in
> other cases.  If all datapath groups are monitored unconditionally, all
> controllers will wake up the moment a new datapath is added to or removed
> from any datapath group, or if there is any other change.
> 
> What do you think?
> 
> As for the processing: the lflow will not be considered, therefore there
> is not much to process.  And when the dp group is updated, only the
> affected lflows will be considered.  There should not be a big overhead
> for processing, right?
> ===
> I think overall the wakeups should still be reduced with monitor-all, because DPG changes can happen only in two cases (correct me if I am wrong):
> 1. When a DPG contains all DPs of the same type (LS or LR). In this case, whenever a DP is added or deleted, the DPG is updated. However, this happens anyway, regardless of whether monitor-all is used.
> 2. When a DPG maps to a port-group, i.e. when ACLs are applied to port-groups. In this case, port-group changes *may* trigger a DPG change, if the port changes in the PG cause the set of DPs to change. However, since we are monitoring all port groups already, there will always be more port-group changes than this kind of DPG change (because a PG change doesn't have to trigger a DPG change).
> 
> I agree that there is not much overhead of processing the lflow itself when the DPG is not monitored yet, but there are at least two other concerns:
> 1. The latency of flow installation can increase significantly, because another round trip between ovn-controller and the SB DB is needed.
> 2. The overhead of the wakeup itself is not negligible today, because of a lot of preprocessing in some of the change handlers in the I-P engine nodes, and also some overhead outside the I-P engine. But this is not as important as 1).
> 
>> >
>> >>
>> >> With conditional monitoring it might be that there are lflows that refer
>> >> to a DPG but none of the DPG datapaths are local to ovn-controller so
>> >> they won't be monitored.  In that case lflow.logical_dp_group will be
>> >> NULL.  It's not a bug however so the old VLOG_DBG was fine.
>> >>
>> >> For the revert patch:
>> >> Acked-by: Dumitru Ceara <dceara at redhat.com <mailto:dceara at redhat.com>>
> 
> Thanks Dumitru and Ilya. Now I pushed this revert to master with the commit message updated:
> 
> ---- 8>< --------------------------------------------------------------------- ><8 ------------
>     This reverts commit 0ba2e1a2e2c413f057ac198e4c2bcf00ac40b19f.
>    
>     System tests with "dp-groups=yes" are all failing due to the warning
>     log. The reason for the warning log (an lflow has neither dp nor dpg)
>     is that the dpg referenced by the lflow is not yet monitored by
>     ovn-controller when the lflow is first observed by ovn-controller. So
>     the dbg level log is valid with the current implementation. There may
>     be some inefficiency in such scenarios because ovn-controller needs
>     to be woken up again when the dpg is monitored later, and the
>     processing of the lflow is delayed a little. This may be addressed
>     with future patches, if necessary.
> ---- 8>< --------------------------------------------------------------------- ><8 ------------
> 
> We can continue discussing whether we should change the monitor condition.
> 
> Thanks,
> Han
> 
>> >>
>> >> Thanks,
>> >> Dumitru
>> >>
>> >
>>


