[ovs-dev] [PATCH V2] dpif-netdev: Do RCU synchronization at fixed interval in PMD main loop.
Ben Pfaff
blp at ovn.org
Fri Oct 4 17:13:13 UTC 2019
For my own part, I do not usually review dpif-netdev patches.
On Fri, Oct 04, 2019 at 05:35:58AM +0000, Nitin Katiyar wrote:
> Hi,
> Can you please review the following patch and provide the feedback?
>
> Regards,
> Nitin
>
> > -----Original Message-----
> > From: Nitin Katiyar
> > Sent: Wednesday, August 28, 2019 2:00 PM
> > To: ovs-dev at openvswitch.org
> > Cc: Anju Thomas <anju.thomas at ericsson.com>; Ilya Maximets
> > <i.maximets at samsung.com>
> > Subject: RE: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed interval
> > in PMD main loop.
> >
> > Hi,
> > Please provide your feedback.
> >
> > Regards,
> > Nitin
> >
> > > -----Original Message-----
> > > From: Nitin Katiyar
> > > Sent: Thursday, August 22, 2019 10:24 PM
> > > To: ovs-dev at openvswitch.org
> > > Cc: Nitin Katiyar <nitin.katiyar at ericsson.com>; Anju Thomas
> > > <anju.thomas at ericsson.com>
> > > Subject: [PATCH V2] dpif-netdev: Do RCU synchronization at fixed
> > > interval in PMD main loop.
> > >
> > > Each PMD updates the global sequence number for RCU synchronization
> > > purposes with other OVS threads. This is done at every 1025th
> > > iteration of the PMD main loop.
> > >
> > > If the PMD thread is responsible for polling a large number of queues
> > > that are carrying traffic, it spends a lot of time processing packets,
> > > which significantly delays the housekeeping activities.
> > >
> > > If the OVS main thread is waiting to synchronize with the PMD threads
> > > and those threads delay their housekeeping activities for more than
> > > 3 seconds, LACP processing is impacted and it leads to LACP flaps.
> > > Other control protocols run by the OVS main thread are similarly
> > > impacted.
> > >
> > > For example, a PMD thread polling 200 ports/queues with an average of
> > > 1600 processing cycles per packet and a batch size of 32 may take
> > > 10240000 (200 * 1600 * 32) cycles per iteration. On a system with a
> > > 2.0 GHz CPU this is more than 5 ms per iteration, so completing 1024
> > > iterations would take more than 5 seconds.
> > >
> > > This gets worse when some PMD threads are less loaded: they reduce
> > > the chance of a heavily loaded PMD getting the mutex in
> > > ovsrcu_try_quiesce(), and its next attempt to quiesce would only come
> > > after another 1024 iterations.
> > >
> > > With this patch, PMD RCU synchronization is performed at a fixed time
> > > interval instead of after a fixed number of iterations. This ensures
> > > that RCU synchronization is not delayed long even when the packet
> > > processing load is high.
> > >
> > > Co-authored-by: Anju Thomas <anju.thomas at ericsson.com>
> > > Signed-off-by: Anju Thomas <anju.thomas at ericsson.com>
> > > Signed-off-by: Nitin Katiyar <nitin.katiyar at ericsson.com>
> > > ---
> > > lib/dpif-netdev.c | 22 ++++++++++++++++++++++
> > > 1 file changed, 22 insertions(+)
> > >
> > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > > index d0a1c58..63b6cb9 100644
> > > --- a/lib/dpif-netdev.c
> > > +++ b/lib/dpif-netdev.c
> > > @@ -228,6 +228,9 @@ struct dfc_cache {
> > >   * and used during rxq to pmd assignment. */
> > >  #define PMD_RXQ_INTERVAL_MAX 6
> > >  
> > > +/* Time in microseconds to try RCU quiescing. */
> > > +#define PMD_RCU_QUIESCE_INTERVAL 10000LL
> > > +
> > >  struct dpcls {
> > >      struct cmap_node node;  /* Within dp_netdev_pmd_thread.classifiers */
> > >      odp_port_t in_port;
> > > @@ -751,6 +754,9 @@ struct dp_netdev_pmd_thread {
> > >  
> > >      /* Set to true if the pmd thread needs to be reloaded. */
> > >      bool need_reload;
> > > +
> > > +    /* Next time when PMD should try RCU quiescing. */
> > > +    long long next_rcu_quiesce;
> > >  };
> > >  
> > >  /* Interface to netdev-based datapath. */
> > > @@ -5486,6 +5492,8 @@ reload:
> > >      pmd->intrvl_tsc_prev = 0;
> > >      atomic_store_relaxed(&pmd->intrvl_cycles, 0);
> > >      cycles_counter_update(s);
> > > +
> > > +    pmd->next_rcu_quiesce = pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL;
> > >      /* Protect pmd stats from external clearing while polling. */
> > >      ovs_mutex_lock(&pmd->perf_stats.stats_mutex);
> > >      for (;;) {
> > > @@ -5493,6 +5501,17 @@ reload:
> > >  
> > >          pmd_perf_start_iteration(s);
> > >  
> > > +        /* Do RCU synchronization at fixed interval instead of doing it
> > > +         * at fixed number of iterations. This ensures that synchronization
> > > +         * would not be delayed long even at high load of packet
> > > +         * processing. */
> > > +        if (pmd->ctx.now > pmd->next_rcu_quiesce) {
> > > +            if (!ovsrcu_try_quiesce()) {
> > > +                pmd->next_rcu_quiesce =
> > > +                    pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL;
> > > +            }
> > > +        }
> > > +
> > >          for (i = 0; i < poll_cnt; i++) {
> > >  
> > >              if (!poll_list[i].rxq_enabled) {
> > > @@ -5527,6 +5546,8 @@ reload:
> > >          dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt);
> > >          if (!ovsrcu_try_quiesce()) {
> > >              emc_cache_slow_sweep(&((pmd->flow_cache).emc_cache));
> > > +            pmd->next_rcu_quiesce =
> > > +                pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL;
> > >          }
> > >  
> > >          for (i = 0; i < poll_cnt; i++) {
> > > @@ -5986,6 +6007,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
> > >      pmd->ctx.last_rxq = NULL;
> > >      pmd_thread_ctx_time_update(pmd);
> > >      pmd->next_optimization = pmd->ctx.now + DPCLS_OPTIMIZATION_INTERVAL;
> > > +    pmd->next_rcu_quiesce = pmd->ctx.now + PMD_RCU_QUIESCE_INTERVAL;
> > >      pmd->rxq_next_cycle_store = pmd->ctx.now + PMD_RXQ_INTERVAL_LEN;
> > >      hmap_init(&pmd->poll_list);
> > >      hmap_init(&pmd->tx_ports);
> > > --
> > > 1.9.1
>