[ovs-dev] [PATCH v3] dpif-netdev: Expand the meter capacity.

Tonghao Zhang xiangxia.m.yue at gmail.com
Thu Jun 24 14:59:14 UTC 2021


On Wed, Jun 23, 2021 at 7:07 PM Ilya Maximets <i.maximets at ovn.org> wrote:
>
> On 5/12/21 11:17 AM, xiangxia.m.yue at gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue at gmail.com>
> >
> > Currently, ovs-vswitchd stores meter data in an array of
> > dp_meter structs, so at most 65536 meters (defined by
> > MAX_METERS) can be used. But in some cases, for example on an
> > edge gateway, we need at least 200,000+ meters for IP address
> > bandwidth limitation: every IP address uses two meters, one
> > each for its rx and tx paths[1]. Furthermore, ovs-vswitchd
> > should support meter offload (the rte_mtr_xxx API introduced
> > by DPDK), and hardware such as Mellanox ConnectX-6 supports
> > more than 65536 meters.
> >
> > This patch uses a cmap to manage the meters instead of an array.
> >
> > * Insertion performance: adding 1000+ meters with "ovs-ofctl
> >   add-meter" takes about 4000 ms with the cmap, the same as the
> >   previous implementation.
> > * Lookup performance in the datapath: we add 1000+ meters whose
> >   rate limit is 10Gbps (the NICs are 10Gbps, so the netdev
> >   datapath will not drop packets) and a flow that only forwards
> >   packets from p0 to p1, with a meter action[2]. On another
> >   machine, pktgen-dpdk generates 64B packets to p0.
> >
> >   The forwarding performance is consistently 1324 Kpps on my
> >   server, whose CPU is an Intel E5-2650 at 2.00GHz.
> >
> > [1].
> > $ in_port=p0,ip,ip_dst=1.1.1.x action=meter:n,output:p1
> > $ in_port=p1,ip,ip_src=1.1.1.x action=meter:m,output:p0
> >
> > [2].
> > $ in_port=p0 action=meter:100,output:p1
> >
> > Signed-off-by: Tonghao Zhang <xiangxia.m.yue at gmail.com>
> > ---
> > v3:
> > * update the commit message
> > * remove the dp_netdev_meter struct
> > * remove the create_dp_netdev function
> > * don't use the hash_basis
> > * use the meter_id as the hash instead of a hash_xxx function; see dp_meter_hash for details
> > * fix coding style
> > * v2: http://patchwork.ozlabs.org/project/openvswitch/patch/1584254601-7321-1-git-send-email-xiangxia.m.yue@gmail.com/
> > ---
> >  lib/dpif-netdev.c | 158 ++++++++++++++++++++++++++++------------------
> >  1 file changed, 97 insertions(+), 61 deletions(-)
>
> Hi.  Thanks for v3!
>
> This version looks mostly OK to me with only one question:
> In the current code, meter locks are adaptive mutexes, but this
> patch makes them plain mutexes.  Is there a particular reason for that?
No, there is no particular reason; we should keep the adaptive mutexes,
as shown below. I tested the forwarding performance, using pktgen-dpdk
to generate flows:
range 0 dst ip start 4.4.4.200
range 0 dst ip min 4.4.4.200
range 0 dst ip max 4.4.4.200
range 0 proto udp
range 0 src port start 1
range 0 src port min 1
range 0 src port max 1000
range 0 src port inc 1
range 0 size start 64
range 0 size min 64
range 0 size max 64
enable 0 range
start 0

the usual mutex: 988192 pps
the adaptive mutex: 1007872 pps
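
For context, the lock in question is the per-meter mutex that every PMD
thread takes while executing the meter action, so with multiple threads
hitting the same meter the critical section below is contended. A
simplified sketch of that pattern (not the literal dpif-netdev code;
run_meter_sketch and the struct fields here are illustrative only):

#include "ovs-thread.h"

struct dp_meter_sketch {
    struct ovs_mutex lock;   /* Protects the band state below. */
    /* ... bands, 'used' timestamp, packet/byte counters ... */
};

static void
run_meter_sketch(struct dp_meter_sketch *meter)
{
    ovs_mutex_lock(&meter->lock);
    /* Refill the bands based on the time elapsed since 'used' and
     * decide which packets exceed a band's rate.  This short hold
     * time is why an adaptive mutex (spin briefly, then sleep) can
     * beat a plain mutex here. */
    ovs_mutex_unlock(&meter->lock);
}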

Ilya, we can change them before applying this patch. Thanks!

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index f800c8c65e13..cc173541f1c8 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -6377,7 +6377,7 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id meter_id,
     meter->max_delta_t = 0;
     meter->used = time_usec();
     meter->id = mid;
-    ovs_mutex_init(&meter->lock);
+    ovs_mutex_init_adaptive(&meter->lock);

     /* set up bands */
     for (i = 0; i < config->n_bands; ++i) {

> Have you tested performance in the case where several threads use
> the same meter?  If not, I'd prefer to keep it adaptive, as that's
> the current behavior, and adaptive mutexes sometimes provide
> better performance since they act like spinlocks for a short
> period of time (in some cases they're worse than simple mutexes,
> but extensive performance testing is needed for each particular
> case to confirm).
> I can change the type of the meter mutexes before applying the patch.
> Let me know what you think.
>
> Best regards, Ilya Maximets.



-- 
Best regards, Tonghao

