[ovs-dev] [PATCH] ovn-controller: Add a new thread in pinctrl module to handle packet-ins.

Han Zhou zhouhan at gmail.com
Mon Mar 11 22:11:23 UTC 2019


On Mon, Mar 11, 2019 at 9:29 AM <nusiddiq at redhat.com> wrote:
>
> From: Numan Siddique <nusiddiq at redhat.com>
>
> Prior to this patch, ovn-controller was single-threaded: every time the
> poll_block() at the end of the main while() loop woke up, it processed
> the whole SB DB and translated the logical flows to OF flows.
>
> There are a few issues with this -
>
>   * For every packet-in received, ovn-controller does this translation
>     resulting in unnecessary CPU cycles.
>
>   * If the translation takes a lot of time, then packet-in handling
>     gets delayed. Delayed responses to DHCP requests can cause
>     clients to resend those requests.
>
> This patch addresses these issues by creating a new pthread in pinctrl module
> to handle packet-ins. This thread doesn't access the Southbound DB IDL object.
>
> Since some of the OVN actions - like dns_lookup, arp, put_arp and put_nd -
> require access to the Southbound DB contents, and gARP and periodic IPv6 RA
> generation also require DB access, pinctrl_run() (called by the main
> ovn-controller thread) accesses the Southbound DB IDL and builds local
> data structures. The pinctrl_handler thread accesses these data structures
> when handling such requests. An ovs_mutex is used between pinctrl_run() and
> the pinctrl_handler thread to protect these data structures.
>
> Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
> ---
>  ovn/controller/pinctrl.c | 681 ++++++++++++++++++++++++++++++---------
>  1 file changed, 532 insertions(+), 149 deletions(-)
>
> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> index 100a20ff2..857cb5d51 100644
> --- a/ovn/controller/pinctrl.c
> +++ b/ovn/controller/pinctrl.c
> @@ -27,6 +27,7 @@
>  #include "lport.h"
>  #include "nx-match.h"
>  #include "ovn-controller.h"
> +#include "latch.h"
>  #include "lib/packets.h"
>  #include "lib/sset.h"
>  #include "openvswitch/ofp-actions.h"
> @@ -48,19 +49,13 @@
>  #include "openvswitch/poll-loop.h"
>  #include "openvswitch/rconn.h"
>  #include "socket-util.h"
> +#include "seq.h"
>  #include "timeval.h"
>  #include "vswitch-idl.h"
>  #include "lflow.h"
>
>  VLOG_DEFINE_THIS_MODULE(pinctrl);
>
> -/* OpenFlow connection to the switch. */
> -static struct rconn *swconn;
> -
> -/* Last seen sequence number for 'swconn'.  When this differs from
> - * rconn_get_connection_seqno(rconn), 'swconn' has reconnected. */
> -static unsigned int conn_seq_no;
> -
>  static void init_buffered_packets_map(void);
>  static void destroy_buffered_packets_map(void);
>
> @@ -76,11 +71,14 @@ static void run_put_mac_bindings(
>      struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip);
>  static void wait_put_mac_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn);
>  static void flush_put_mac_bindings(void);
> +static void buffer_put_mac_bindings(void);
> +static void destroy_buffered_mac_bindings(void);
> +static void send_mac_binding_buffered_pkts(struct rconn *swconn);
>
>  static void init_send_garps(void);
>  static void destroy_send_garps(void);
>  static void send_garp_wait(void);
> -static void send_garp_run(
> +static void send_garp_prepare(
>      struct ovsdb_idl_index *sbrec_chassis_by_name,
>      struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>      struct ovsdb_idl_index *sbrec_port_binding_by_name,
> @@ -88,45 +86,144 @@ static void send_garp_run(
>      const struct sbrec_chassis *,
>      const struct hmap *local_datapaths,
>      const struct sset *active_tunnels);
> -static void pinctrl_handle_nd_na(const struct flow *ip_flow,
> +static void send_garp_run(struct rconn *swconn);
> +static void pinctrl_handle_nd_na(struct rconn *swconn,
> +                                 const struct flow *ip_flow,
>                                   const struct match *md,
>                                   struct ofpbuf *userdata,
>                                   bool is_router);
>  static void reload_metadata(struct ofpbuf *ofpacts,
>                              const struct match *md);
>  static void pinctrl_handle_put_nd_ra_opts(
> +    struct rconn *swconn,
>      const struct flow *ip_flow, struct dp_packet *pkt_in,
>      struct ofputil_packet_in *pin, struct ofpbuf *userdata,
>      struct ofpbuf *continuation);
> -static void pinctrl_handle_nd_ns(const struct flow *ip_flow,
> +static void pinctrl_handle_nd_ns(struct rconn *swconn,
> +                                 const struct flow *ip_flow,
>                                   struct dp_packet *pkt_in,
>                                   const struct match *md,
>                                   struct ofpbuf *userdata);
>  static void init_ipv6_ras(void);
>  static void destroy_ipv6_ras(void);
>  static void ipv6_ra_wait(void);
> -static void send_ipv6_ras(
> +static void prepare_ipv6_ras(
>      struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
>      struct ovsdb_idl_index *sbrec_port_binding_by_name,
>      const struct hmap *local_datapaths);
> -;
> +static void send_ipv6_ras(struct rconn *swconn);
>
>  COVERAGE_DEFINE(pinctrl_drop_put_mac_binding);
>  COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map);
>
> +/* pinctrl module creates a thread - pinctrl_handler to handle
> + * the packet-ins from ovs-vswitchd. Some of the OVN actions
> + * are translated to OF 'controller' actions. See include/ovn/actions.h
> + * for more details.
> + *
> + * The pinctrl_handler thread doesn't access the Southbound IDL object. But
> + * some of the OVN actions that get translated to the 'controller'
> + * OF action require data from the Southbound DB.  Below are the details
> + * on how these actions are implemented.
> + *
> + * The pinctrl_run() function is called by the ovn-controller main thread.
> + * A mutex - 'pinctrl_mutex' - is used between the pinctrl_handler() thread
> + * and pinctrl_run().
> + *
> + *   - dns_lookup -     In order to do a DNS lookup, this action needs
> + *                      to access the 'DNS' table. pinctrl_run() builds a
> + *                      local DNS cache - 'dns_cache'. See sync_dns_cache()
> + *                      for more details.
> + *                      The function 'pinctrl_handle_dns_lookup()' (which is
> + *                      called within the pinctrl_handler thread) looks into
> + *                      the local DNS cache to resolve the DNS requests.
> + *
> + *   - put_arp/put_nd - These actions store the IPv4/IPv6 and MAC addresses
> + *                      in the 'MAC_Binding' table.
> + *                      The function 'pinctrl_handle_put_mac_binding()' (which
> + *                      is called within the pinctrl_handler thread) stores
> + *                      the IPv4/IPv6 and MAC addresses in the
> + *                      hmap - put_mac_bindings.
> + *
> + *                      pinctrl_run() reads these mac bindings from the hmap
> + *                      'put_mac_bindings' and writes them to the 'MAC_Binding'
> + *                      table in the Southbound DB.
> + *
> + *   - arp/nd_ns      - These actions generate ARP/IPv6 Neighbor
> + *                      Solicitation requests. The original packets are
> + *                      buffered and reinjected when the put_arp/put_nd
> + *                      actions are called.
> + *                      When pinctrl_run() writes the mac bindings from the
> + *                      'put_mac_bindings' hmap, it moves these mac bindings
> + *                      to another hmap - 'buffered_mac_bindings'.
> + *
> + *                      The pinctrl_handler thread calls the function -
> + *                      send_mac_binding_buffered_pkts(), which uses
> + *                      the hmap - 'buffered_mac_bindings' and reinjects the
> + *                      buffered packets.
> + *
> + * The pinctrl module also periodically sends IPv6 Router Advertisements
> + * and gARPs (for the router gateway IPs and configured NAT addresses).
> + *
> + * IPv6 RA handling - pinctrl_run() prepares the IPv6 RA information
> + *                    (see prepare_ipv6_ras()) in the shash 'ipv6_ras' by
> + *                    looking into the Southbound DB table - Port_Binding.
> + *
> + *                    The pinctrl_handler thread sends the periodic IPv6
> + *                    RAs using the shash - 'ipv6_ras'.
> + *
> + * gARP handling    - pinctrl_run() prepares the gARP information
> + *                    (see send_garp_prepare()) in the shash 'send_garp_data'
> + *                    by looking into the Southbound DB table Port_Binding.
> + *
> + *                    pinctrl_handler() thread sends these gARPs using the
> + *                    shash 'send_garp_data'.
> + *
> + * Notification between pinctrl_handler() and pinctrl_run()
> + * -------------------------------------------------------
> + * 'struct seq' is used for notification between pinctrl_handler() thread
> + *  and pinctrl_run().
> + *  'pinctrl_handler_seq' is used by pinctrl_run() to
> + *  wake up pinctrl_handler thread from poll_block() if any changes happened
> + *  in 'send_garp_data', 'ipv6_ras' and 'buffered_mac_bindings' structures.
> + *
> + *  'pinctrl_main_seq' is used by the pinctrl_handler() thread to wake up
> + *  the main thread from poll_block() when mac bindings need to be updated
> + *  in the Southbound DB.
> + * */
> +

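To make the notification scheme in the comment block concrete, the two wakeup paths might be sketched as below. This is a non-compilable sketch against OVS's lib/seq.h, ovs-thread and poll-loop APIs, not a copy of the patch; the loop structure and the helper names notify_pinctrl_handler()/notify_pinctrl_main() are illustrative:

```c
/* Sketch only: relies on OVS's lib/seq.h, ovs-thread.h and poll-loop.h. */

static struct seq *pinctrl_handler_seq;   /* pinctrl_run() -> handler wakeup. */
static struct seq *pinctrl_main_seq;      /* handler -> main thread wakeup. */

/* Simplified pinctrl_handler() thread main loop. */
static void *
pinctrl_handler(void *arg OVS_UNUSED)
{
    uint64_t new_seq = seq_read(pinctrl_handler_seq);

    for (;;) {
        ovs_mutex_lock(&pinctrl_mutex);
        /* Consume the data pinctrl_run() prepared from the SB DB:
         * 'send_garp_data', 'ipv6_ras', 'buffered_mac_bindings'. */
        ovs_mutex_unlock(&pinctrl_mutex);

        /* Sleep until the main thread bumps the seq. */
        seq_wait(pinctrl_handler_seq, new_seq);
        poll_block();
        new_seq = seq_read(pinctrl_handler_seq);
    }
    return NULL;
}

/* Called from pinctrl_run() (main thread) after it has refreshed the
 * shared structures from the Southbound DB. */
static void
notify_pinctrl_handler(void)
{
    seq_change(pinctrl_handler_seq);
}

/* Called from the handler thread when mac bindings are ready to be
 * committed to the Southbound DB by the main thread. */
static void
notify_pinctrl_main(void)
{
    seq_change(pinctrl_main_seq);
}
```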
Thanks, Numan, for addressing this problem. It is very good that the
implementation doesn't require a separate IDL for SB access. Here are
some general comments.

For operations that don't rely on the SB DB, such as OVS probe echo, ACL
logging and DHCP, in theory they can be handled completely in
parallel. However, in this patch the pinctrl_mutex locks the whole
pinctrl_handler. I think the locking can be relaxed to exclude the
operations that don't need SB DB access.
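For illustration, the relaxed locking could look something like this in the packet-in dispatch path. This is only a sketch: the opcode names approximate the ones in ovn/lib/actions.h, and the two handler helpers are invented stand-ins for the per-action functions in pinctrl.c:

```c
/* Sketch: take pinctrl_mutex only for actions that touch the caches
 * shared with pinctrl_run(); handler names here are hypothetical. */
static void
process_packet_in(struct rconn *swconn, const struct ofp_header *msg)
{
    /* ... decode the packet-in and its userdata into 'ah' ... */
    switch (ntohl(ah->opcode)) {
    case ACTION_OPCODE_PUT_DHCP_OPTS:
    case ACTION_OPCODE_LOG:
        /* No Southbound-derived state needed: handle without the lock,
         * fully in parallel with the main thread. */
        handle_without_sb_state(swconn, msg);
        break;

    case ACTION_OPCODE_DNS_LOOKUP:
    case ACTION_OPCODE_PUT_ARP:
    case ACTION_OPCODE_PUT_ND:
        /* Reads/writes the caches shared with pinctrl_run(). */
        ovs_mutex_lock(&pinctrl_mutex);
        handle_with_sb_state(swconn, msg);
        ovs_mutex_unlock(&pinctrl_mutex);
        break;
    }
}
```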

For tasks that need READONLY access to the SB DB, the main thread reads
from the DB and notifies the pinctrl thread to process the changed data.
This synchronization is definitely required, and the implementation looks
good to me.

For tasks that require WRITE access to the SB DB, such as put_arp, this
patch still triggers a recompute in the main thread, and the two
threads can still block each other. This is because when the main
thread is woken up via pinctrl_main_seq, it recomputes flows again,
even if there is no change in the SB DB. In fact this can be solved by
the first patches of the incremental processing series
(https://github.com/hzhou8/ovs/commit/2bc0e2423fdcff21db4d607f3164b0e0c62cd660).
That patch does no real incremental processing yet, but since it is
abstracted with a dependency model, it can avoid flow computation when
it is not needed. Do you think it would be good to combine it with this
multi-threading patch, to avoid the problem of put_arp triggering flow
computation which in turn blocks the put_arp operation?

Sorry for bringing the I-P series back if it is annoying. I think the
major reason it was rejected earlier was the concern about the
maintenance effort for the dependency graph and change-handler
implementations. However, since the first patches of I-P only build the
coarse-grained dependency graph without any change-handlers, we can get
the benefit without worrying about maintenance, although I might be
biased. (p.s. I am still eager to work on DDlog for ovn-controller, but
I guess it will just take quite some time to be ready.)

Thanks,
Han
