[ovs-dev] [PATCH v3 0/5] Fast OVSDB resync after restart or failover.

Han Zhou zhouhan at gmail.com
Thu Feb 28 04:13:54 UTC 2019


On Wed, Feb 27, 2019 at 8:04 PM Han Zhou <zhouhan at gmail.com> wrote:
>
> On Wed, Feb 27, 2019 at 6:01 PM Ben Pfaff <blp at ovn.org> wrote:
> >
> > I spent a bunch of time looking through this series.  The actual support
> > for fast resync is really simple and clean.
> >
> > The first patch is the one that I have the most trouble with.  I think
> > it's probably because I don't understand the existing code all that
> > well.  I wrote the original monitor implementation, then other people
> > added the caching layer, and I think I reviewed that caching layer
> > without ever properly understanding it.  Do you understand it?  Are you
> > confident that your revision is high quality, and are you confident that
> > you can maintain it?
> >
> > Thanks,
> >
> > Ben.
>
> Hi Ben,
>
> Thanks a lot for spending time reviewing this. As for the first patch,
> "ovsdb-monitor: Refactor ovsdb monitor implementation.", I am pretty
> confident in it. I see room for further improvement, but at least I am
> sure the revised version is functionally equivalent to the original.
> The monitor implementation was hard for me to understand, too. After
> spending a lot of time on it I now understand the structure and most
> of the details, and I believe it is slightly easier to understand
> after the refactoring, so I think I can maintain it. Consider my
> previous fix for the crash caused by condition changes as evidence :o)
> I can also add some documentation on it, so that I will not forget it
> after some time.
>
> And thanks for finding the Address Sanitizer test failure. I think I
> found the root cause: it is related to the order in which the rows
> tracked in the txn history and the table schema they reference are
> freed. I will figure out how to fix it properly.
>
> Thanks,
> Han
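
(For anyone following along, below is a minimal C sketch of the
ordering hazard described just above. The structures and names are
hypothetical simplifications, not the actual ovsdb-server code; the
point is only that if tracked rows borrow a pointer to the schema,
the schema must be freed last.)

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real structures. */
struct schema {
    const char *table_name;      /* Rows below borrow this schema.  */
};

struct tracked_row {
    const struct schema *schema; /* Borrowed reference, not owned.  */
    int data;
};

static void
tracked_row_destroy(struct tracked_row *row)
{
    /* Needs the schema (e.g. to look up column types while freeing
     * per-column data), so the schema must still be alive here.    */
    printf("freeing row from table %s\n", row->schema->table_name);
    free(row);
}

int
main(void)
{
    struct schema *schema = malloc(sizeof *schema);
    schema->table_name = "Logical_Switch";

    struct tracked_row *row = malloc(sizeof *row);
    row->schema = schema;
    row->data = 42;

    /* Wrong order: calling free(schema) before
     * tracked_row_destroy(row) leaves the row with a dangling
     * schema pointer, which ASan reports as heap-use-after-free.   */

    /* Correct order: destroy the tracked rows (the txn history)
     * first, then the schema they reference.                       */
    tracked_row_destroy(row);
    free(schema);

    return 0;
}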

I realized that you might be talking specifically about the json cache
part. It is pretty clear to me now, so I can briefly explain it. It
simply avoids the repeated work of translating the same set of changes
into a JSON-RPC message. When many clients monitor the same set of
tables and columns without filter conditions, the cached JSON-RPC
message is reused to send the notifications, which can save a lot of
work. When clients have different conditions, the cache is not used.
The cache is keyed on the change set and the monitor version, because
different monitor versions construct different JSON-RPC messages.
Whenever a change set is modified, the cache is invalidated (flushed).
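
To make that concrete, here is a rough C sketch of the caching idea.
The types and function names are hypothetical stand-ins, not the
actual implementation in ovsdb/monitor.c: the cache maps a
(change set, monitor version) pair to the composed JSON-RPC message,
and modifying a change set flushes the cache.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for the real types; everything here is hypothetical
 * and much simpler than the actual code in ovsdb/monitor.c. */
struct change_set { int id; };     /* A batch of row updates.        */
struct json { char *msg; };        /* A composed JSON-RPC message.   */

struct json_cache_node {
    const struct change_set *cs;   /* Key part 1: which change set.  */
    int version;                   /* Key part 2: monitor version;
                                    * v1 "update" and v2 "update2"
                                    * notifications are different.   */
    struct json *json;             /* The cached translation.        */
    struct json_cache_node *next;
};

static struct json_cache_node *cache;

static struct json *
json_cache_lookup(const struct change_set *cs, int version)
{
    for (struct json_cache_node *n = cache; n; n = n->next) {
        if (n->cs == cs && n->version == version) {
            return n->json;
        }
    }
    return NULL;
}

static void
json_cache_insert(const struct change_set *cs, int version,
                  struct json *json)
{
    struct json_cache_node *n = malloc(sizeof *n);
    n->cs = cs;
    n->version = version;
    n->json = json;
    n->next = cache;
    cache = n;
}

/* Whenever any change set is modified, the cached translations no
 * longer match its contents, so all of them are dropped. */
static void
json_cache_flush(void)
{
    while (cache) {
        struct json_cache_node *n = cache;
        cache = n->next;
        free(n->json->msg);
        free(n->json);
        free(n);
    }
}

/* Translate a change set into a JSON-RPC notification, consulting
 * the cache first.  The point is to compose the message once per
 * (change set, version), no matter how many clients share the same
 * monitored tables and columns. */
static struct json *
compose_update(const struct change_set *cs, int version)
{
    struct json *j = json_cache_lookup(cs, version);
    if (j) {
        printf("cache hit: %s\n", j->msg);
        return j;
    }
    char buf[64];
    snprintf(buf, sizeof buf, "update%d for change set %d",
             version, cs->id);
    j = malloc(sizeof *j);
    j->msg = strdup(buf);
    json_cache_insert(cs, version, j);
    printf("composed : %s\n", j->msg);
    return j;
}

int
main(void)
{
    struct change_set cs = { .id = 1 };

    compose_update(&cs, 2);    /* First v2 client: composes, caches. */
    compose_update(&cs, 2);    /* Second v2 client: cache hit.       */
    compose_update(&cs, 1);    /* v1 client: different key, compose. */

    cs.id = 2;                 /* The change set was modified...     */
    json_cache_flush();        /* ...so the cache is flushed.        */
    compose_update(&cs, 2);    /* Composed fresh for the new state.  */

    json_cache_flush();
    return 0;
}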

