[ovs-dev] [PATCH] ovn-controller: Add a new thread in pinctrl module to handle packet-ins.

Numan Siddique nusiddiq at redhat.com
Tue Mar 12 16:22:41 UTC 2019


On Tue, Mar 12, 2019 at 6:16 PM Mark Michelson <mmichels at redhat.com> wrote:

> On 3/11/19 6:11 PM, Han Zhou wrote:
> > On Mon, Mar 11, 2019 at 9:29 AM <nusiddiq at redhat.com> wrote:
> >>
> >> From: Numan Siddique <nusiddiq at redhat.com>
> >>
> >> Prior to this patch, ovn-controller was single threaded and everytime
> the
> >> poll_block() at the end of the main while() loop wakes up, it  processes
> >> the whole SB DB and translates the logical flows to OF flows.
> >>
> >> There are few issues with this -
> >>
> >>    * For every packet-in received, ovn-controller does this translation
> >>      resulting in unnecessary CPU cycles.
> >>
> >>    * If the translation takes a lot of time, then the packet-in handling
> >>      would get delayed. The delay in responses to DHCP requests could
> >>      result in resending of these requests.
> >>
> >> This patch addresses these issues by creating a new pthread in pinctrl
> module
> >> to handle packet-ins. This thread doesn't access the Southbound DB IDL
> object.
> >>
> >> Since some of the OVN actions - like dns_lookup, arp, put_arp, put_nd
> >> require access to the Southbound DB contents and gARPs, periodic IPv6 RA
> >> generation also requires the DB access, pinctrl_run() called by the main
> >> ovn-controller thread accesses the Southbound DB IDL and builds the
> local
> >> datastructures. pinctrl_handler thread accesses these data structures
> >> in handling such requests. An ovs_mutex is used between the
> pinctr_run() and
> >> the pinctrl_handler thread to protect these data structures.
> >>
> >> Signed-off-by: Numan Siddique <nusiddiq at redhat.com>
> >> ---
> >>   ovn/controller/pinctrl.c | 681 ++++++++++++++++++++++++++++++---------
> >>   1 file changed, 532 insertions(+), 149 deletions(-)
> >>
> >> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> >> index 100a20ff2..857cb5d51 100644
> >> --- a/ovn/controller/pinctrl.c
> >> +++ b/ovn/controller/pinctrl.c
> >> @@ -27,6 +27,7 @@
> >>   #include "lport.h"
> >>   #include "nx-match.h"
> >>   #include "ovn-controller.h"
> >> +#include "latch.h"
> >>   #include "lib/packets.h"
> >>   #include "lib/sset.h"
> >>   #include "openvswitch/ofp-actions.h"
> >> @@ -48,19 +49,13 @@
> >>   #include "openvswitch/poll-loop.h"
> >>   #include "openvswitch/rconn.h"
> >>   #include "socket-util.h"
> >> +#include "seq.h"
> >>   #include "timeval.h"
> >>   #include "vswitch-idl.h"
> >>   #include "lflow.h"
> >>
> >>   VLOG_DEFINE_THIS_MODULE(pinctrl);
> >>
> >> -/* OpenFlow connection to the switch. */
> >> -static struct rconn *swconn;
> >> -
> >> -/* Last seen sequence number for 'swconn'.  When this differs from
> >> - * rconn_get_connection_seqno(rconn), 'swconn' has reconnected. */
> >> -static unsigned int conn_seq_no;
> >> -
> >>   static void init_buffered_packets_map(void);
> >>   static void destroy_buffered_packets_map(void);
> >>
> >> @@ -76,11 +71,14 @@ static void run_put_mac_bindings(
> >>       struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip);
> >>   static void wait_put_mac_bindings(struct ovsdb_idl_txn
> *ovnsb_idl_txn);
> >>   static void flush_put_mac_bindings(void);
> >> +static void buffer_put_mac_bindings(void);
> >> +static void destroy_buffered_mac_bindings(void);
> >> +static void send_mac_binding_buffered_pkts(struct rconn *swconn);
> >>
> >>   static void init_send_garps(void);
> >>   static void destroy_send_garps(void);
> >>   static void send_garp_wait(void);
> >> -static void send_garp_run(
> >> +static void send_garp_prepare(
> >>       struct ovsdb_idl_index *sbrec_chassis_by_name,
> >>       struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
> >>       struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >> @@ -88,45 +86,144 @@ static void send_garp_run(
> >>       const struct sbrec_chassis *,
> >>       const struct hmap *local_datapaths,
> >>       const struct sset *active_tunnels);
> >> -static void pinctrl_handle_nd_na(const struct flow *ip_flow,
> >> +static void send_garp_run(struct rconn *swconn);
> >> +static void pinctrl_handle_nd_na(struct rconn *swconn,
> >> +                                 const struct flow *ip_flow,
> >>                                    const struct match *md,
> >>                                    struct ofpbuf *userdata,
> >>                                    bool is_router);
> >>   static void reload_metadata(struct ofpbuf *ofpacts,
> >>                               const struct match *md);
> >>   static void pinctrl_handle_put_nd_ra_opts(
> >> +    struct rconn *swconn,
> >>       const struct flow *ip_flow, struct dp_packet *pkt_in,
> >>       struct ofputil_packet_in *pin, struct ofpbuf *userdata,
> >>       struct ofpbuf *continuation);
> >> -static void pinctrl_handle_nd_ns(const struct flow *ip_flow,
> >> +static void pinctrl_handle_nd_ns(struct rconn *swconn,
> >> +                                 const struct flow *ip_flow,
> >>                                    struct dp_packet *pkt_in,
> >>                                    const struct match *md,
> >>                                    struct ofpbuf *userdata);
> >>   static void init_ipv6_ras(void);
> >>   static void destroy_ipv6_ras(void);
> >>   static void ipv6_ra_wait(void);
> >> -static void send_ipv6_ras(
> >> +static void prepare_ipv6_ras(
> >>       struct ovsdb_idl_index *sbrec_port_binding_by_datapath,
> >>       struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >>       const struct hmap *local_datapaths);
> >> -;
> >> +static void send_ipv6_ras(struct rconn *swconn);
> >>
> >>   COVERAGE_DEFINE(pinctrl_drop_put_mac_binding);
> >>   COVERAGE_DEFINE(pinctrl_drop_buffered_packets_map);
> >>
> >> +/* pinctrl module creates a thread - pinctrl_handler to handle
> >> + * the packet-ins from ovs-vswitchd. Some of the OVN actions
> >> + * are translated to OF 'controller' actions. See include/ovn/actions.h
> >> + * for more details.
> >> + *
> >> + * pinctrl_handler thread doesn't access the Southbound IDL object. But
> >> + * some of the OVN actions which gets translated to 'controller'
> >> + * OF action, require data from Southbound DB.  Below are the details
> >> + * on how these actions are implemented.
> >> + *
> >> + * pinctrl_run() function is called by ovn-controller main thread.
> >> + * A Mutex - 'pinctrl_mutex' is used between the pinctrl_handler()
> thread
> >> + * and pinctrl_run().
> >> + *
> >> + *   - dns_lookup -     In order to do a DNS lookup, this action needs
> >> + *                      to access the 'DNS' table. pinctrl_run()
> builds a
> >> + *                      local DNS cache - 'dns_cache'. See
> sync_dns_cache()
> >> + *                      for more details.
> >> + *                      The function 'pinctrl_handle_dns_lookup()'
> (which is
> >> + *                      called with in the pinctrl_handler thread)
> looks into
> >> + *                      the local DNS cache to resolve the DNS
> requests.
> >> + *
> >> + *   - put_arp/put_nd - These actions stores the IPv4/IPv6 and MAC
> addresses
> >> + *                      in the 'MAC_Binding' table.
> >> + *                      The function
> 'pinctrl_handle_put_mac_binding()' (which
> >> + *                      is called with in the pinctrl_handler thread),
> stores
> >> + *                      the IPv4/IPv6 and MAC addresses in the
> >> + *                      hmap - put_mac_bindings.
> >> + *
> >> + *                      pinctrl_run(), reads these mac bindings from
> the hmap
> >> + *                      'put_mac_bindings' and writes to the
> 'MAC_Binding'
> >> + *                      table in the Southbound DB.
> >> + *
> >> + *   - arp/nd_ns      - These actions generate an ARP/IPv6 Neighbor
> solicit
> >> + *                      requests. The original packets are buffered and
> >> + *                      injected back when put_arp/put_nd actions are
> called.
> >> + *                      When pinctrl_run(), writes the mac bindings
> from the
> >> + *                      'put_mac_bindings' hmap, it moves these mac
> bindings
> >> + *                      to another hmap - 'buffered_mac_bindings'.
> >> + *
> >> + *                      The pinctrl_handler thread calls the function -
> >> + *                      send_mac_binding_buffered_pkts(), which uses
> >> + *                      the hmap - 'buffered_mac_bindings' and
> reinjects the
> >> + *                      buffered packets.
> >> + *
> >> + * pinctrl module also periodically sends IPv6 Router Solicitation
> requests
> >> + * and gARPs (for the router gateway IPs and configured NAT addresses).
> >> + *
> >> + * IPv6 RA handling - pinctrl_run() prepares the IPv6 RA information
> >> + *                    (see prepare_ipv6_ras()) in the shash 'ipv6_ras'
> by
> >> + *                    looking into the Southbound DB table -
> Port_Binding.
> >> + *
> >> + *                    pinctrl_handler thread sends the periodic IPv6
> RAs using
> >> + *                    the shash - 'ipv6_ras'
> >> + *
> >> + * gARP handling    - pinctrl_run() prepares the gARP information
> >> + *                    (see send_garp_prepare()) in the shash
> 'send_garp_data'
> >> + *                    by looking into the Southbound DB table
> Port_Binding.
> >> + *
> >> + *                    pinctrl_handler() thread sends these gARPs using
> the
> >> + *                    shash 'send_garp_data'.
> >> + *
> >> + * Notification between pinctrl_handler() and pinctrl_run()
> >> + * -------------------------------------------------------
> >> + * 'struct seq' is used for notification between pinctrl_handler()
> thread
> >> + *  and pinctrl_run().
> >> + *  'pinctrl_handler_seq' is used by pinctrl_run() to
> >> + *  wake up pinctrl_handler thread from poll_block() if any changes
> happened
> >> + *  in 'send_garp_data', 'ipv6_ras' and 'buffered_mac_bindings'
> structures.
> >> + *
> >> + *  'pinctrl_main_seq' is used by pinctrl_handler() thread to wake up
> >> + *  the main thread from poll_block() when mac bindings needs to be
> updated
> >> + *  in the Southboubd DB.
> >> + * */
> >> +
> >
> > Thanks Numan for addressing this problem. It is very good that the
> > implementation doesn't require a separate IDL for SB access. Here are
> > some generic comments.
> >
> > For operations that doesn't rely on SB DB, such as OVS probe echo, ACL
> > logging and DHCP, in theory, they can be handled completely in
> > parallel. However, in this patch the pinctrl_mutex locks the whole
> > pinctrl_handler. I think the lock can be relaxed to exclude the
> > operations that doesn't need SB DB access.
>

I did think of it earlier, but then went with this approach to take a safe
approach.
I will address this in v2  as per your suggestion.


> >
> > For tasks that need READONLY access to SB DB, the main thread reads
> > from DB, and notifies pinctrl thread to process the changed data. This
> > synchronization is definitely required and the implementation looks
> > good to me.
> >
> > For tasks that requires WRITE access to SB DB, such as put_arp, it
> > will still trigger recompute in main thread with this patch, and the
> > two thread would still block each other.

> up main thread with pinctrl_main_seq, the main thread will recompute
> > flows again, even if there is no change in SB DB. In fact this can be
> > solved by the first patches of the incremental processing
> > (
> https://github.com/hzhou8/ovs/commit/2bc0e2423fdcff21db4d607f3164b0e0c62cd660
> ).
> > In that patch there is no real incremental processing yet, but since
> > it is abstracted with dependency model, it can avoid flow computing
> > when it is not needed. Do you think it is good to be combined with
> > this multi-threading patch to avoid the problem of put_arp triggering
> > flow computing which blocks put_arp operation?
>
> I think the idea of not recomputing flows if SB DB has not changed is a
> fantastic idea and should be implemented whether pinctrl is made
> multithreaded or not. If the goal is *only* not to recompute flows if SB
> DB has not changed, then it potentially could be done with a simpler set
> of patches, since there is no goal of implementing an I-P engine.
>
>
Thanks Han and Mark for the comments.
I agree with Mark.  I think we should address that and it would definitely
help in reducing ovn-controller's  CPU usage. But I think that can be
addressed as a separate patch since this patch is not changing that
behavior. I will look into your patch you pointed out and see if I can test
it out.

Does this sound good ?

Thanks
Numan



> I was going to suggest that it may be possible to avoid such a patch by
> having the pinctrl_handle_mac_binding() function detect if local data
> has changed before notifying the main thread. However, I don't think
> this will work if MAC bindings move between hypervisors.
>
> Sorry for bringing the
> > I-P series back if it is annoying. I think the major reason that was
> > rejected earlier was the concern of maintenance effort for dependency
> > graph and change-handler implementations. However, since the first
> > patches of I-P only build the coarse grained dependency without any
> > change-handler, we can just utilize the benefit without worrying about
> > maintenance, although I might be biased. (p.s. I am still eager to
> > work on DDlog for ovn-controller, but I guess it will just take quite
> > some time to be ready).
> >
> > Thanks,
> > Han
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>
>


More information about the dev mailing list