[ovs-dev] ovn-northd-ddlog bug with HA_Chassis_Groups

Ilya Maximets i.maximets at ovn.org
Fri Apr 9 16:14:24 UTC 2021


On 4/9/21 6:01 PM, Mark Michelson wrote:
> Hi guys,
> 
> While developing a new feature, I found a defect in ovn-northd-ddlog when HA_Chassis_Groups are used. I have attached a script that replicates the issue.
> 
> If you start a sandbox environment in ovn master (`make sandbox SANDBOXFLAGS="--ddlog"`) and then run the script, you'll find that the script hangs and you must press Ctrl+C to terminate it. At this point, my system shows ovn-northd-ddlog taking up 60-80% CPU. If you run `ovn-sbctl list port_binding`, you'll see that chassis-resident port bindings are not present. If you run `ovn-sbctl list ha_chassis_group`, nothing is returned.
> 
> If you remove the "--wait=sb" from the final ovn-nbctl command, then the script will not hang, but the same symptoms occur.
> 
> The attached script is the minimum I could manage.

FYI, the mailing list strips out attachments, so probably only
the direct recipients received your script.

> I attempted to remove the second router and second HA_Chassis_Group, but doing that made the issue disappear.
> 
> Ideally, rather than reporting the issue to you guys, I would be diagnosing the issue myself and then presenting a patch to fix it. However, I'm at a bit of a loss for how to debug this. I could try to inspect the source for the issue, but finding what the running process is doing and stepping through the running code would be much more sensible.
> 
> I'm looking for two pieces of information here:
> 1) How would you go about debugging this particular issue?
> 2) What's going on here? :) If you know what is going on, then how did you make that determination?
> 
> Thanks,
> Mark Michelson
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 


