[ovs-dev] [PATCH ovn 1/2] chassis: Fix the way encaps are updated for a chassis record.

Dumitru Ceara dceara at redhat.com
Thu Sep 3 13:11:42 UTC 2020


On 9/3/20 1:07 PM, Numan Siddique wrote:
> 
> 
> On Thu, Sep 3, 2020 at 3:55 PM Dumitru Ceara <dceara at redhat.com
> <mailto:dceara at redhat.com>> wrote:
> 
>     On 9/3/20 11:57 AM, Numan Siddique wrote:
>     >
>     >
>     > On Thu, Aug 27, 2020 at 7:32 PM Dumitru Ceara <dceara at redhat.com
>     <mailto:dceara at redhat.com>
>     > <mailto:dceara at redhat.com <mailto:dceara at redhat.com>>> wrote:
>     >
>     >     ovn-controller always stores the last configured
>     system-id/chassis-id in
>     >     memory regardless if the connection to the SB is up or down.
>     This is OK
>     >     as long as the change can be committed successfully when the SB DB
>     >     connection comes back up.
>     >
>     >     This commit changes the way we search for a "stale" chassis record
>     >     in the
>     >     SB to cover the above mentioned case. Also, in such cases
>     there's no
>     >     need
>     >     to recreate the Encaps, it's sufficient to update the chassis_name
>     >     field.
>     >
>     >     Fixes: 5344f24ecb1a ("ovn-controller: Refactor chassis.c to
>     abstract
>     >     the string parsing")
>     >     Signed-off-by: Dumitru Ceara <dceara at redhat.com
>     <mailto:dceara at redhat.com>
>     >     <mailto:dceara at redhat.com <mailto:dceara at redhat.com>>>
>     >
>     >
>     >
>     > Hi Dumitru,
>     >
>     > Thanks for the patch. Below are few observations when I tested
>     with this
>     > patch and without
>     >
> 
>     Hi Numan,
> 
>     Thanks for trying it out.
> 
>     >    1.  In a sandbox environment, when I change the
>     > external_ids:system-id, the chassis row for the 
>     >      ovn-controller never gets updated.  I observed with this
>     patch and
>     > without this patch. I see the below error
>     >     message
>     >
>     >    ******
>     >   2020-09-03T09:19:53.654Z|00032|ovsdb_idl|WARN|transaction error:
>     > {"details":"RBAC rules for client \"chassis-1\" role
>     \"ovn-controller\"
>     > prohibit row insertion into table
>     > \"Chassis_Private\".","error":"permission error"}
>     > 2020-09-03T09:19:53.654Z|00033|ovsdb_idl|WARN|transaction error:
>     > {"details":"RBAC rules for client \"chassis-1\" role
>     \"ovn-controller\"
>     > prohibit row insertion into table
>     > \"Chassis_Private\".","error":"permission error"}
>     > *****
>     >      When I use a unix socket for SB DB connection, I don't see this
>     > issue. May be its to do with the RBAC permissions. Its not related to
>     > your patch. But thought of listing it here.
>     >
> 
>     Right, this is not addressed by this series. We should probably follow
>     up with a patch for this issue too.
> 
>     >   2. When I change the external_ids:system-id name, the
>     chassis_private
>     > row is leaked. Again this is not related to your patch. But
>     something we
>     > should fix ?
>     >
> 
>     Is this with both patches applied?
> 
>     https://patchwork.ozlabs.org/project/openvswitch/list/?series=198050
> 
>     >   3. with this patch, I ran the below commands 
>     >       - ovs-vsctl set open . external_ids:system-id=ch-1  -- > the
>     > chassis row is recreated with ch-1
>     >       -  ovs-vsctl set open . external_ids:system-id=ch-2  -- > the
>     > chassis row is recreated with ch-2
>     >       - ovs-vsctl set open . external_ids:system-id=ch-1 -- > the
>     > chassis row is still ch-2. 
>     >      - ovs-vsctl set open . external_ids:system-id=ch-3  -- > the
>     > chassis row is recreated with ch-3
>     >      - ovs-vsctl set open . external_ids:system-id=ch-2 -- > the
>     chassis
>     > row is still ch-3.
>     >
>     >     Something is not correct when the old system-id is restored.
> 
>     Same question here: are both patches of the series applied?
> 
>     Because I did:
> 
>     $ SANDBOXFLAGS=--no-ovn-rbac make sandbox
> 
>     $ ovn-sbctl list chassis_private | grep '^name'
>     name                : chassis-1
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : chassis-1
>     $ ovs-vsctl set open . external_ids:system-id=ch-1
>     $ ovn-sbctl list chassis_private | grep '^name'
>     name                : ch-1
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-1
>     $ ovs-vsctl set open . external_ids:system-id=ch-2
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-2
>     $ ovn-sbctl list chassis_private | grep '^name'
>     name                : ch-2
>     $ ovs-vsctl set open . external_ids:system-id=ch-1
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-1
>     $ ovn-sbctl list chassis_private | grep '^name'
>     name                : ch-1
>     $ ovs-vsctl set open . external_ids:system-id=ch-3
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-3
>     $ ovn-sbctl list chassis_private | grep '^name'
>     name                : ch-3
>     $ ovs-vsctl set open . external_ids:system-id=ch-2
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-2
>     $ ovn-sbctl list chassis_private | grep '^name'
>     name                : ch-2
> 
>     So with both patches applied it seems to me that the system-id update is
>     working fine.
> 
>     The first patch of the series doesn't apply cleanly on branch-20.03 but
>     if I cherry pick it on branch-20.06:
> 
>     $ SANDBOXFLAGS=--no-ovn-rbac make sandbox
> 
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : chassis-1
>     $ ovs-vsctl set open . external_ids:system-id=ch-1
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-1
>     $ ovs-vsctl set open . external_ids:system-id=ch-2
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-2
>     $ ovs-vsctl set open . external_ids:system-id=ch-1
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-1
>     $ ovs-vsctl set open . external_ids:system-id=ch-3
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-3
>     $ ovs-vsctl set open . external_ids:system-id=ch-2
>     $ ovn-sbctl list chassis | grep '^name'
>     name                : ch-2
> 
> 
> Oops. My bad. Thanks.
> 
> I tested after applying both the patches.
> 
> When I run this command - ovs-vsctl set open . external_ids:system-id=ch-2
> I see below warning message in ovn-controller log
> 
> *****
> 2020-09-03T11:03:34.411Z|00014|ovsdb_idl|WARN|transaction error:
> {"details":"Transaction causes multiple rows in \"Chassis_Private\"
> table to have identical values (ch-2) for index on column \"name\". 
> First row, with UUID 1cf765f0-774d-460f-b394-988e7a2bf8ee, was inserted
> by this transaction.  Second row, with UUID
> 04b66cab-0705-4cbb-b11c-2414d076b09d, existed in the database before
> this transaction and was not modified by the
> transaction.","error":"constraint violation"}
> 2020-09-03T11:03:34.412Z|00015|ovsdb_idl|WARN|transaction error:
> {"details":"Transaction causes multiple rows in \"Chassis_Private\"
> table to have identical values (ch-2) for index on column \"name\". 
> First row, with UUID c15a0c17-a56a-4ec2-be3a-ceff644aa908, was inserted
> by this transaction.  Second row, with UUID
> 04b66cab-0705-4cbb-b11c-2414d076b09d, existed in the database before
> this transaction and was not modified by the
> transaction.","error":"constraint violation"}
> *******
> 
> Functionally it works as expected. i.e the previous chassis_private
> record is updated with the new name. I think should be fixed though.
> 

Hi Numan,

Nice catch! It's happening because when ovn-controller sends the
transaction to update chassis.name and chassis_private.name,
ovsdb-server replies with:

update3: "modify": chassis.row.name = <new-value>, "delete":
chassis_private.row

Followed by:
update3: "insert": chassis_private.row(name = <new-value>).

ovn-controller runs the processing loop after the first update3 and
tries to insert the chassis_private because it can't find it anymore in
the IDL. This insert obviously fails because ovsdb-server knows about
the old chassis_private record.

Once the last update3 is received by ovn-controller things settle down.

I've been trying to understand why ovsdb-server replies with
insert/delete when chassis_private.name is updated but I couldn't figure
it out.

Do you maybe have suggestions/ideas?

> Also can you please update the commit message in patch 1 which states
> the issue clearly. It was not clear to me the issue the patch is trying
> to address.
> 
> 

Sure, I'll update the commit message in v2 once we figure out the
chassis_private.update delete/insert issue.

Thanks,
Dumitru



More information about the dev mailing list