[ovs-dev] [PATCH v3 4/4] dp-packet: Use memcpy on dp_packet elements.

Fischetti, Antonio antonio.fischetti at intel.com
Thu Aug 17 10:44:03 UTC 2017


> -----Original Message-----
> From: Darrell Ball [mailto:dball at vmware.com]
> Sent: Monday, August 14, 2017 7:34 AM
> To: Fischetti, Antonio <antonio.fischetti at intel.com>; dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v3 4/4] dp-packet: Use memcpy on dp_packet
> elements.
> 
> Could you show some throughput results for particular tests?

[Antonio]
The only way I could test this change is with a setup that I was
initially using on a modified firewall for learning purposes.
I had added a NORMAL action to the 1st rule below, so this isn't
a realistic scenario. I couldn't find a better test that exercises
the dp_packet_clone_with_headroom function; any suggestion is
welcome.
Below are some continuous-traffic test results, i.e. a bidirectional
test with packets sent at line rate. The Rx packet rate is measured,
without taking packet loss into account.

	Test setup
	----------

table=0, priority=100,ct_state=-trk,ip actions=ct(table=1),NORMAL  <----
table=0, priority=10,arp actions=NORMAL
table=0, priority=1 actions=drop
table=1, ct_state=+new+trk,ip,in_port=dpdk0 actions=ct(commit),output:dpdk1
table=1, ct_state=+new+trk,ip,in_port=dpdk1 actions=drop
table=1, ct_state=+est+trk,ip,in_port=dpdk0 actions=output:dpdk1
table=1, ct_state=+est+trk,ip,in_port=dpdk1 actions=output:dpdk0

2 PMDs, bidir test, 64B UDP packets.

	Throughput results
	------------------

"Original OvS-DPDK" refers to commit ID:
    6b1babacc3ca0488e07596bf822fe356c9bab646

              +----------------------+-----------------------+
              |  Original OvS-DPDK + |   Original OvS-DPDK + |
              |  patches 1,2,3       |   patches 1,2,3 +     |
              |                      |      this patch       |
     ---------+----------------------+-----------------------+
      Traffic |          Rx          |          Rx           |
      Streams |        [Mpps]        |        [Mpps]         |
     ---------+----------------------+-----------------------+
         100  |      3.03, 3.02      |      3.04, 3.04       |
         500  |      2.87, 2.86      |      2.90, 2.87       |
       1,000  |      2.72, 2.70      |      2.80, 2.76       |
       2,000  |      2.57, 2.59      |      2.59, 2.58       |
       5,000  |      2.54, 2.47      |      2.53, 2.48       |
      10,000  |      2.43, 2.39      |      2.43, 2.39       |
     ---------+----------------------+-----------------------+
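
As a quick read of the table above: at 1,000 streams the mean Rx rate
goes from (2.72 + 2.70)/2 = 2.71 Mpps to (2.80 + 2.76)/2 = 2.78 Mpps,
i.e. roughly a 2.6% gain, while at 5,000 and 10,000 streams the two
builds are essentially identical.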


	Perf comparison with 500 UDP connections
	----------------------------------------

With perf top I see the following difference on
dp_packet_clone_with_headroom: it drops from 0.95% to 0.77%;
some details below.

Orig + patches 1,2,3:	
---------------------
  11.09%  ovs-vswitchd        [.] dp_netdev_input__
   9.42%  libc-2.21.so        [.] malloc
   9.33%  libpthread-2.21.so  [.] pthread_mutex_unlock
   7.49%  libc-2.21.so        [.] free
   6.27%  ovs-vswitchd        [.] miniflow_extract
   5.94%  libc-2.21.so        [.] __memcpy_avx_unaligned
   4.43%  ovs-vswitchd        [.] ovs_mutex_lock_at
   4.37%  libc-2.21.so        [.] __memcmp_sse4_1
   4.30%  libc-2.21.so        [.] _int_malloc
   4.20%  libpthread-2.21.so  [.] pthread_mutex_lock
   3.34%  ovs-vswitchd        [.] dpdk_do_tx_copy
   3.34%  ovs-vswitchd        [.] conn_key_lookup
   3.20%  ovs-vswitchd        [.] ixgbe_xmit_fixed_burst_vec
   2.58%  ovs-vswitchd        [.] conntrack_execute
   2.49%  ovs-vswitchd        [.] conn_key_hash
   2.25%  ovs-vswitchd        [.] process_one
   2.19%  ovs-vswitchd        [.] other_conn_update
   1.77%  ovs-vswitchd        [.] ixgbe_recv_pkts_vec
   0.95%  ovs-vswitchd        [.] dp_packet_clone_with_headroom     <<<<<<<<<<<<
   0.83%  ovs-vswitchd        [.] write_ct_md
   0.79%  ovs-vswitchd        [.] dp_execute_cb
   0.74%  ovs-vswitchd        [.] conn_update_state.isra.15.part.16
   0.63%  ovs-vswitchd        [.] conn_key_cmp
   0.42%  ovs-vswitchd        [.] dp_packet_put

Orig + patches 1,2,3 + this patch:
----------------------------------
  11.72%  ovs-vswitchd        [.] dp_netdev_input__
   9.30%  libpthread-2.21.so  [.] pthread_mutex_unlock
   9.09%  libc-2.21.so        [.] malloc
   7.52%  libc-2.21.so        [.] free
   6.40%  libc-2.21.so        [.] __memcpy_avx_unaligned
   6.04%  ovs-vswitchd        [.] miniflow_extract
   4.55%  libc-2.21.so        [.] _int_malloc
   4.45%  libc-2.21.so        [.] __memcmp_sse4_1
   4.31%  ovs-vswitchd        [.] ovs_mutex_lock_at
   3.88%  libpthread-2.21.so  [.] pthread_mutex_lock
   3.62%  ovs-vswitchd        [.] dpdk_do_tx_copy
   3.62%  ovs-vswitchd        [.] conn_key_lookup
   3.00%  ovs-vswitchd        [.] ixgbe_xmit_fixed_burst_vec
   2.60%  ovs-vswitchd        [.] conntrack_execute
   2.42%  ovs-vswitchd        [.] conn_key_hash
   2.39%  ovs-vswitchd        [.] other_conn_update
   2.30%  ovs-vswitchd        [.] process_one
   1.65%  ovs-vswitchd        [.] ixgbe_recv_pkts_vec
   0.78%  ovs-vswitchd        [.] dp_execute_cb
   0.77%  ovs-vswitchd        [.] dp_packet_clone_with_headroom     <<<<<<<<<<<<
   0.71%  ovs-vswitchd        [.] write_ct_md
   0.71%  ovs-vswitchd        [.] conn_update_state.isra.15.part.16
   0.63%  ovs-vswitchd        [.] conn_key_cmp


	VTune analysis
	--------------
The CPU Time spent in dp_packet_clone_with_headroom over a 60 sec
analysis goes from 4.530s to 3.920s. Please refer to the results
reported in the previous email.
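
For anyone who wants to see the idea in isolation, here is a minimal
standalone C sketch of what the patch does: the fields to be cloned sit
contiguously at the end of the struct, so a single memcpy starting at
the first of them (located with offsetof) replaces the per-field
assignments. The struct and field names below are illustrative only,
not the real struct dp_packet layout.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Illustrative packet struct: everything from 'l2_pad_size' to the
     * end of the struct is cloned with one memcpy. */
    struct pkt {
        void *base;          /* Not copied by the clone. */
        uint32_t size;       /* Not copied by the clone. */
        uint8_t l2_pad_size; /* First cloned field. */
        uint16_t l2_5_ofs;
        uint16_t l3_ofs;
        uint16_t l4_ofs;
        uint32_t cutlen;     /* Last cloned field. */
    };

    static void
    pkt_clone_metadata(struct pkt *dst, const struct pkt *src)
    {
        /* Copy every field from 'l2_pad_size' onwards in one shot. */
        memcpy(&dst->l2_pad_size, &src->l2_pad_size,
               sizeof(struct pkt) - offsetof(struct pkt, l2_pad_size));
    }

    int
    main(void)
    {
        struct pkt a = { .base = NULL, .size = 64, .l2_pad_size = 2,
                         .l2_5_ofs = UINT16_MAX, .l3_ofs = 14,
                         .l4_ofs = 34, .cutlen = 0 };
        struct pkt b = { 0 };

        pkt_clone_metadata(&b, &a);
        printf("l3_ofs=%u l4_ofs=%u\n",
               (unsigned) b.l3_ofs, (unsigned) b.l4_ofs);
        return 0;
    }

Note that the memcpy silently depends on those fields staying contiguous
and in that order, which is what the comment added to dp-packet.h in
this patch is there to document (padding bytes inside the copied region
are copied too, which is harmless).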



> 
> -----Original Message-----
> From: <ovs-dev-bounces at openvswitch.org> on behalf of
> "antonio.fischetti at intel.com" <antonio.fischetti at intel.com>
> Date: Friday, August 11, 2017 at 8:52 AM
> To: "dev at openvswitch.org" <dev at openvswitch.org>
> Subject: [ovs-dev] [PATCH v3 4/4] dp-packet: Use memcpy on dp_packet
> 	elements.
> 
>     memcpy replaces the several single copies inside
>     dp_packet_clone_with_headroom().
> 
>     Signed-off-by: Antonio Fischetti <antonio.fischetti at intel.com>
>     ---
>     I tested this change by comparing the CPU Time over a 60 sec analysis
>     with VTune.
> 
>     In original ovs:
>     dp_packet_clone_with_headroom    4.530s
> 
>     + this changes:
>     dp_packet_clone_with_headroom    3.920s
> 
>     Further details were reported in this reply for v1
>     https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/334536.html
>     ---
>      lib/dp-packet.c | 18 +++++++++---------
>      lib/dp-packet.h |  6 +++++-
>      2 files changed, 14 insertions(+), 10 deletions(-)
> 
>     diff --git a/lib/dp-packet.c b/lib/dp-packet.c
>     index 67aa406..f4dbcb7 100644
>     --- a/lib/dp-packet.c
>     +++ b/lib/dp-packet.c
>     @@ -157,8 +157,9 @@ dp_packet_clone(const struct dp_packet *buffer)
>          return dp_packet_clone_with_headroom(buffer, 0);
>      }
> 
>     -/* Creates and returns a new dp_packet whose data are copied from 'buffer'.   The
>     - * returned dp_packet will additionally have 'headroom' bytes of headroom. */
>     +/* Creates and returns a new dp_packet whose data are copied from 'buffer'.
>     + * The returned dp_packet will additionally have 'headroom' bytes of
>     + * headroom. */
>      struct dp_packet *
>      dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t headroom)
>      {
>     @@ -167,13 +168,12 @@ dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t headroom)
>          new_buffer = dp_packet_clone_data_with_headroom(dp_packet_data(buffer),
>                                                          dp_packet_size(buffer),
>                                                          headroom);
>     -    new_buffer->l2_pad_size = buffer->l2_pad_size;
>     -    new_buffer->l2_5_ofs = buffer->l2_5_ofs;
>     -    new_buffer->l3_ofs = buffer->l3_ofs;
>     -    new_buffer->l4_ofs = buffer->l4_ofs;
>     -    new_buffer->md = buffer->md;
>     -    new_buffer->cutlen = buffer->cutlen;
>     -    new_buffer->packet_type = buffer->packet_type;
>     +    /* Copy the following fields into the returned buffer: l2_pad_size,
>     +     * l2_5_ofs, l3_ofs, l4_ofs, cutlen, packet_type and md. */
>     +    memcpy(&new_buffer->l2_pad_size, &buffer->l2_pad_size,
>     +            sizeof(struct dp_packet) -
>     +            offsetof(struct dp_packet, l2_pad_size));
>     +
>      #ifdef DPDK_NETDEV
>          new_buffer->mbuf.ol_flags = buffer->mbuf.ol_flags;
>      #else
>     diff --git a/lib/dp-packet.h b/lib/dp-packet.h
>     index 9dbb611..9cab7c7 100644
>     --- a/lib/dp-packet.h
>     +++ b/lib/dp-packet.h
>     @@ -40,7 +40,8 @@ enum OVS_PACKED_ENUM dp_packet_source {
>          DPBUF_STACK,               /* Un-movable stack space or static buffer. */
>          DPBUF_STUB,                /* Starts on stack, may expand into heap. */
>          DPBUF_DPDK,                /* buffer data is from DPDK allocated memory.
>     -                                * ref to dp_packet_init_dpdk() in dp-packet.c. */
>     +                                * ref to dp_packet_init_dpdk() in dp-packet.c.
>     +                                */
>      };
> 
>      #define DP_PACKET_CONTEXT_SIZE 64
>     @@ -61,6 +62,9 @@ struct dp_packet {
>          bool rss_hash_valid;        /* Is the 'rss_hash' valid? */
>      #endif
>          enum dp_packet_source source;  /* Source of memory allocated as 'base'. */
>     +
>     +    /* All the following elements of this struct are copied by a single call
>     +     * to memcpy in dp_packet_clone_with_headroom. */
>          uint8_t l2_pad_size;           /* Detected l2 padding size.
>                                          * Padding is non-pullable. */
>          uint16_t l2_5_ofs;             /* MPLS label stack offset, or UINT16_MAX */
>     --
>     2.4.11
> 
>     _______________________________________________
>     dev mailing list
>     dev at openvswitch.org
>     https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 


