[ovs-dev] [dev master 2/3] bridge: Make ovs-vswitchd run again if status_txn commit fails.

Ben Pfaff blp at nicira.com
Mon Jun 9 21:36:22 UTC 2014


On Sun, Jun 01, 2014 at 10:04:29PM -0700, Alex Wang wrote:
> This commit adds logic that checks the return value of status_txn
> transaction and runs the update again if the transaction fails.
> 
> Signed-off-by: Alex Wang <alexw at nicira.com>

I am not sure that you understand what TXN_INCOMPLETE means.  It means
that a transaction has been queued up to send to the database server,
but the database server has not yet replied.  Some of the comments in
this commit imply that we should retry the transaction in the case of
TXN_INCOMPLETE, but no, that's not the case; we should just wait for
the database server to respond.

It seems to me that status_txn_try_again is only used after the
transaction finishes, since if status == TXN_INCOMPLETE then we're
still waiting for it to complete and status_txn is still nonnull.  So
I would update this comment to remove the mention of "incomplete":
> + * If the previous database transaction was incomplete or failed (is not
> + * 'TXN_SUCCESS' or 'TXN_UNCHANGED'), 'status_txn_try_again' is set to true,
> + * which will cause the main thread wake up soon and retry the status update.
>   */
>  static struct ovsdb_idl_txn *status_txn;
> +static bool status_txn_try_again;

Also, here we really only care to set status_txn_try_again if the
transaction is somehow complete, right?  Then I would move the new
code inside the previous "if" block that only runs if status !=
TXN_INCOMPLETE.  Also, TXN_UNCOMMITTED should be impossible here,
since we called ovsdb_idl_txn_commit() a few lines earlier.
> @@ -2453,6 +2458,14 @@ bridge_run(void)
>              ovsdb_idl_txn_destroy(status_txn);
>              status_txn = NULL;
>          }
> +
> +        /* Sets the 'status_txn_try_again' if the transaction fails or
> +         * is still incomplete. */
> +        if (status == TXN_SUCCESS || status == TXN_UNCOMMITTED) {
> +            status_txn_try_again = false;
> +        } else {
> +            status_txn_try_again = true;
> +        }
>      }
>  
>      run_system_stats();

There seems to be some similar confusion here.  If the transaction is
incomplete, then we shouldn't do any kind of timer-based waiting, we
should instead wait for the transaction to complete:
> @@ -2486,11 +2499,11 @@ bridge_wait(void)
>          poll_timer_wait_until(stats_timer);
>      }
>  
> -    /* If the status database transaction is 'TXN_INCOMPLETE' in this run,
> -     * register a timeout in 'STATUS_CHECK_AGAIN_MSEC'.  Else, wait on the
> -     * global connectivity sequence number.  Note, this also helps batch
> -     * multiple status changes into one transaction. */
> -    if (status_txn) {
> +    /* If the status update to database needs to be run again (transaction
> +     * fails or incomplete), registers a timeout in 'STATUS_CHECK_AGAIN_MSEC'.
> +     * Else, waits on the global connectivity sequence number.  Note, this also
> +     * helps batch multiple status changes into one transaction. */
> +    if (status_txn_try_again) {
>          poll_timer_wait_until(time_msec() + STATUS_CHECK_AGAIN_MSEC);
>      } else {
>          seq_wait(connectivity_seq_get(), connectivity_seqno);
> -- 
> 1.7.9.5
> 



More information about the dev mailing list