[ovs-discuss] OpenvSwitch 1.0.1 on XenServer 5.6 with Bonding

Ben Pfaff blp at nicira.com
Wed Aug 25 23:42:46 UTC 2010


Thanks so much.  I think I see the real problem now.  Could you
re-enable the call to bond_wait(br), and then make a different change?
Here it is:

diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 476073a..4c9b019 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -141,7 +141,7 @@ struct port {
     int updelay, downdelay;     /* Delay before iface goes up/down, in ms. */
     bool bond_compat_is_stale;  /* Need to call port_update_bond_compat()? */
     bool bond_fake_iface;       /* Fake a bond interface for legacy compat? */
-    long bond_next_fake_iface_update; /* Next update to fake bond stats. */
+    long long int bond_next_fake_iface_update; /* Time of next update. */
     int bond_rebalance_interval; /* Interval between rebalances, in ms. */
     long long int bond_next_rebalance; /* Next rebalancing time. */
 
In other words, change the type of bond_next_fake_iface_update from
"long" to "long long int".  I think that this is the correct fix.
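
To illustrate why the narrower type could produce the 0-ms timeouts seen
in the poll_loop log, here is a minimal sketch.  It is not code from
bridge.c.  It assumes that "long" is 32 bits wide on the affected build
and that the timestamps are millisecond values too large to fit in 32
bits.  The helper names ms_until() and store_in_32_bits() are
hypothetical, and the 32-bit field is modeled with uint32_t so that the
truncation is well defined in portable C.

```c
#include <stdint.h>

/* Hypothetical helper: milliseconds left until 'deadline', clamped at 0.
 * A result of 0 means "wake up immediately". */
static long long int
ms_until(long long int deadline, long long int now)
{
    return deadline > now ? deadline - now : 0;
}

/* Hypothetical helper: models storing 'deadline' in a 32-bit field and
 * reading it back.  Only the low 32 bits survive.  (A signed 32-bit
 * "long" would reinterpret those bits, but the result would still be
 * far smaller than a large millisecond timestamp.) */
static long long int
store_in_32_bits(long long int deadline)
{
    uint32_t narrow = (uint32_t) deadline;  /* Drops the high bits. */
    return (long long int) narrow;
}

/* Illustrative values only: a millisecond timestamp that needs more
 * than 32 bits, and a deadline one second later. */
static const long long int example_now = 1282779766000LL;
static const long long int example_deadline = 1282779766000LL + 1000;
```

With these values, ms_until(example_deadline, example_now) is 1000,
while ms_until(store_in_32_bits(example_deadline), example_now) is 0:
the truncated deadline always looks like it is in the past, so a loop
that waits on it would never sleep.  Under the same assumptions, a
"long long int" field keeps the full value.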

Thanks,

Ben.

On Wed, Aug 25, 2010 at 08:31:22PM -0300, Luiz Henrique Ozaki wrote:
> Perfect !!
> 
> Commenting out the bond_wait(br) call solved the high CPU usage.
> 
> If you need more debugging and testing, be my guest.
> 
> Regards,
> 
> On Wed, Aug 25, 2010 at 7:43 PM, Ben Pfaff <blp at nicira.com> wrote:
> 
> > There's nothing unusual there.  Hmm.
> >
> > If you're willing to try some experiments, maybe we can learn more.
> >
> > First, try commenting out the call to "bond_wait(br)" in bridge_wait()
> > in vswitchd/bridge.c.  Does that have any effect?
> >
> > If that has no effect, then try commenting out the call to
> > poll_timer_wait_until(iface_stats_timer) in the same function and see if
> > it makes a difference.
> >
> > Thanks,
> >
> > Ben.
> >
> > On Wed, Aug 25, 2010 at 06:37:09PM -0300, Luiz Henrique Ozaki wrote:
> > > # ovs-appctl bond/show bond0
> > > updelay: 200 ms
> > > downdelay: 0 ms
> > > next rebalance: 8481 ms
> > > slave eth3: enabled
> > >         active slave
> > >         hash 218: 5 kB load
> > >                 00:23:7d:e8:2a:00
> > > slave eth2: enabled
> > > # ovs-appctl bond/show bond1
> > > updelay: 200 ms
> > > downdelay: 0 ms
> > > next rebalance: 9737 ms
> > > slave eth4: enabled
> > >         active slave
> > > slave eth5: enabled
> > > # ovs-appctl bond/show bond2
> > > updelay: 200 ms
> > > downdelay: 0 ms
> > > next rebalance: 8585 ms
> > > slave eth7: enabled
> > >         active slave
> > > slave eth6: enabled
> > >
> > >
> > > > I saw in the openvswitch init script that there's a check for
> > > > XenServer 5.5.  I changed it to 5.6 to load the /proc/net just to
> > > > check whether this was the issue, but CPU usage is still high, with
> > > > or without /proc/net/bonding.
> > > >
> > > > I changed back to the original init script again.
> > >
> > >
> > > On Wed, Aug 25, 2010 at 6:04 PM, Ben Pfaff <blp at nicira.com> wrote:
> > >
> > > > On Wed, Aug 25, 2010 at 05:59:35PM -0300, Luiz Henrique Ozaki wrote:
> > > > > > May 16 21:56:39|14629|poll_loop|DBG|0-ms timeout: 0x805bac1(bridge_wait) 0x8063da9(main) 0xb7470e9c
> > > >
> > > > Thanks.  It's definitely part of the bridge code then.  What does
> > > > "ovs-appctl bond/show <bondname>", with <bondname> replaced by the name
> > > > of the bonded port, print out?
> > > >
> > >
> > >
> > >
> > > --
> > > []'s
> > > Luiz Henrique Ozaki
> >
> 
> 
> 
> -- 
> []'s
> Luiz Henrique Ozaki



