[ovs-discuss] set dpdk packet refcnt when flow output to group.

Daniele Di Proietto diproiettod at vmware.com
Tue Oct 20 18:33:21 UTC 2015


Hi,

Currently every DPDK mbuf in OVS has the `refcnt` set to one. Output to
multiple ports is handled by making a copy of the packet's payload (see
`may_steal` in dp_netdev_execute_actions(), and in netdev_send()).
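
To illustrate the copy-per-output idea, here is a minimal sketch in plain C. It is a toy model, not OVS code: `struct toy_pkt`, `toy_send()`, `toy_output_all()` and the counters are hypothetical names, and the assumption that only the last output takes ownership is a simplification of how `may_steal` is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer; not the OVS dp_packet. */
struct toy_pkt {
    size_t len;
    unsigned char data[64];
};

static int toy_live_pkts;   /* Buffers currently allocated. */
static int toy_sent_pkts;   /* Packets handed to a "port". */

static struct toy_pkt *
toy_pkt_new(const void *data, size_t len)
{
    struct toy_pkt *p = malloc(sizeof *p);

    assert(p != NULL && len <= sizeof p->data);
    memcpy(p->data, data, len);
    p->len = len;
    toy_live_pkts++;
    return p;
}

/* Full copy of the payload, so every holder has a private buffer. */
static struct toy_pkt *
toy_pkt_clone(const struct toy_pkt *p)
{
    return toy_pkt_new(p->data, p->len);
}

/* "Transmit" on a port.  If may_steal is true the callee owns 'p' and
 * frees it; otherwise it sends a private copy and leaves 'p' alone. */
static void
toy_send(struct toy_pkt *p, bool may_steal)
{
    struct toy_pkt *tx = may_steal ? p : toy_pkt_clone(p);

    toy_sent_pkts++;
    free(tx);               /* The tx path always frees what it sent. */
    toy_live_pkts--;
}

/* Output one packet to n_ports ports (assumed >= 1): copy for every
 * output except the last, which may steal the original. */
static void
toy_output_all(struct toy_pkt *p, int n_ports)
{
    assert(n_ports > 0);
    for (int i = 0; i < n_ports; i++) {
        toy_send(p, i == n_ports - 1);
    }
}
```

Because each transmit frees only the buffer it was handed, no buffer is ever shared between two holders, so the reference count never needs to be anything other than one. The price is one payload copy per extra output.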

You're right, having a `refcnt` != 1 might be necessary to use
rte_ipv4_fragment_packet() or to support certain offloading capabilities
(currently not implemented in OVS).
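
For contrast, this is roughly what sharing one buffer through a reference count looks like. Again a toy sketch in plain C with hypothetical names (`struct toy_mbuf`, `toy_mbuf_ref()`, `toy_mbuf_unref()`); it does not use the DPDK API and is not thread-safe, so a version shared across PMD threads would need atomic updates.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer with a reference count; a toy, not struct rte_mbuf. */
struct toy_mbuf {
    int refcnt;
};

static int toy_mbuf_frees;  /* Times the memory was really released. */

static struct toy_mbuf *
toy_mbuf_alloc(void)
{
    struct toy_mbuf *m = malloc(sizeof *m);

    assert(m != NULL);
    m->refcnt = 1;          /* The allocator holds the first reference. */
    return m;
}

/* Add 'delta' references, e.g. before fanning out to several ports. */
static void
toy_mbuf_ref(struct toy_mbuf *m, int delta)
{
    m->refcnt += delta;
    assert(m->refcnt > 0);
}

/* Drop one reference; release the memory only when the last holder is
 * done.  Returns 1 if the buffer was actually freed, 0 otherwise. */
static int
toy_mbuf_unref(struct toy_mbuf *m)
{
    assert(m->refcnt > 0);
    if (--m->refcnt == 0) {
        free(m);
        toy_mbuf_frees++;
        return 1;
    }
    return 0;
}
```

Note that for N outputs the count has to end up at exactly N: starting from one, that means adding N - 1. Adding too few frees the buffer while someone still uses it; adding too many leaks it.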

Does this answer your question?

Daniele


On 15/10/2015 12:41, "Ben Pfaff" <blp at nicira.com> wrote:

>I don't understand what you're asking for.
>
>Daniele or Pravin, I think that you know the DPDK datapath well.  Do you
>understand what David wants or why?
>
>On Thu, Oct 15, 2015 at 01:15:11PM -0500, David Evans wrote:
>> Thanks Ben,
>> 
>> If that's the case, then it would be better to be adding custom action
>> that applies prior to this group action, to update the refcnt.
>> 
>> I expect it just has to happen some time before the first PMD has
>> finished processing the packet so that the packet does not get deleted
>> by the tx routine before other PMDs have seen the packet.
>> 
>> Cheers
>> Dave.
>> 
>> 
>> 
>> > On Oct 12, 2015, at 12:23 PM, Ben Pfaff <blp at nicira.com> wrote:
>> > 
>> > Your change isn't going to have much effect because most packets don't
>> > go through the translation process.  If you try to force all packets
>> > through translation, it will kill performance.
>> > 
>> > I think that you should read this paper that describes the various
>> > caching layers in Open vSwitch:
>> >        
>> > http://openvswitch.org/support/papers/nsdi2015.pdf
>> > 
>> > On Mon, Oct 12, 2015 at 11:56:03AM -0500, David Evans wrote:
>> >> Hi Ben,
>> >> 
>> >> When i use the OFPGT11_ALL group action, the packets for a flow will be
>> >> sent out all buckets in a group. (in my case all the buckets are ports
>> >> to transmit out)
>> >> 
>> >> I added a group_bucket_count to the context
>> >> and 
>> >> in xlate_all_group fn the following.
>> >> 
>> >>    group_dpif_get_buckets(group, &buckets);
>> >> +    if(ctx->group_bucket_count == 0){
>> >> +    	LIST_FOR_EACH (bucket, list_node, buckets) {
>> >> +    		ctx->group_bucket_count++;
>> >> +        }
>> >> +    }
>> >> +    if(ctx->xin->packet)
>> >> +    	if(ctx->xin->packet->source == DPBUF_DPDK)
>> >> +    		rte_pktmbuf_refcnt_update(&ctx->xin->packet->mbuf,ctx->group_bucket_count);
>> >> 	LIST_FOR_EACH (bucket, list_node, buckets) {
>> >> 
>> >> this stops the transmit PMDs attempting to free the packet until all
>> >> the buckets (ports) have transmitted it.
>> >> My switch also does reassembly on rx - this refcnt is necessary for
>> >> handling multi-segment dpdk buffers too.
>> >> I also changed the segment free to rte_pktmbuf_free in netdev-dpdk.c
>> >> for this purpose.
>> >> I'm expecting it will also be important for tso or the possibility of
>> >> using rte_ipv4_fragment_packet on an outgoing port.
>> >> 
>> >> i have between 6 and 12 PMDs depending on the number of dpdk ports
>> >> running at any time, and if i use OFPGT11_ALL with many output
>> >> buckets (ports) buffers will disappear from under some PMDs and cause
>> >> segfaults etc.
>> >> 
>> >> Cheers,
>> >> 
>> >> Dave.
>> >> 
>> >>> On Oct 12, 2015, at 11:38 AM, Ben Pfaff <blp at nicira.com> wrote:
>> >>> 
>> >>> On Wed, Oct 07, 2015 at 05:36:18PM -0500, David Evans wrote:
>> >>>> While using netdev-dpdk - When i add a rule for which the action is to
>> >>>> send to a group (type=all) containing (x) output buckets (ports) how
>> >>>> can i increment the dp_packet->pkt_mbuf's refcnt to (x) so that the
>> >>>> packet is not deleted before it has transmitted all ports (buckets) in
>> >>>> the group.
>> >>>> 
>> >>>> Perhaps in ofproto-dpif-xlate.c function xlate_all_group find the
>> >>>> packet and apply the ctx->xin->packet->mbuf->refcnt ?  Will that work
>> >>>> for all packets for a ctx?
>> >>> 
>> >>> I don't understand what relationship you expect here.  A group has no
>> >>> direct relationship to a packet.  Translation produces a flat list of
>> >>> simple actions that don't refer back to the group.
>> >> 
>> 



