[ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the upcall

Darrell Ball dball at vmware.com
Wed Sep 13 03:04:05 UTC 2017



On 9/11/17, 5:58 AM, "Chandran, Sugesh" <sugesh.chandran at intel.com> wrote:

    
    
    Regards
    _Sugesh
    
    
    > -----Original Message-----
    > From: Yuanhan Liu [mailto:yliu at fridaylinux.org]
    > Sent: Monday, September 11, 2017 11:12 AM
    > To: Chandran, Sugesh <sugesh.chandran at intel.com>
    > Cc: dev at openvswitch.org
    > Subject: Re: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the
    > upcall
    > 
    > On Mon, Sep 11, 2017 at 09:49:52AM +0000, Chandran, Sugesh wrote:
    > > > > > For the DPDK flow offload, which basically just binds a MARK
    > > > > > action to a flow, the MARK is required to be used together with
    > > > > > a QUEUE action for most NICs I'm aware of. The QUEUE action
    > > > > > then needs a queue index, which is not given in the flow content.
    > > > > [Sugesh] Looks to me like this is again another hardware-specific requirement.
    > > > > This could have an impact on RSS hash distribution and queue
    > > > > redistribution across PMDs at runtime.
    > > >
    > > > If you have read my earlier version, you should have seen similar
    > > > concerns from me.
    > > [Sugesh] I feel this has to be addressed properly to make this feature work
    > > in all cases.
    > 
    > Agreed.
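
[Note: for illustration, below is a minimal sketch of the MARK + QUEUE
combination being discussed, using the rte_flow API.  The match pattern,
mark id and the helper name example_mark_flow are placeholders assumed here,
not taken from the patchset.]

    #include <stdint.h>
    #include <rte_ethdev.h>
    #include <rte_flow.h>

    /* Sketch only: bind a MARK action to a flow.  Most NICs also require a
     * QUEUE action for the mark to take effect, which is why a queue index
     * has to come from somewhere -- in the patch under discussion, the rxq
     * recorded at upcall time. */
    static struct rte_flow *
    example_mark_flow(uint16_t port_id, uint32_t mark_id, uint16_t rxq,
                      struct rte_flow_error *error)
    {
        struct rte_flow_attr attr = { .ingress = 1 };

        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH },   /* placeholder match */
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        struct rte_flow_action_mark mark = { .id = mark_id };
        struct rte_flow_action_queue queue = { .index = rxq };

        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_MARK,  .conf = &mark },
            { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, error);
    }
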
    > 
    > > > > > handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
    > > > > >                       struct dp_packet *packet,
    > > > > >                       const struct netdev_flow_key *key,
    > > > > >                       struct ofpbuf *actions, struct ofpbuf *put_actions,
    > > > > > -                     int *lost_cnt, long long now)
    > > > > > +                     int *lost_cnt, long long now, int rxq)
    > > > > [Sugesh] IMHO it's not really good practice to change the default
    > > > > packet processing path for a specific hardware offload. Rxq doesn't
    > > > > have any meaning for handling packets in the normal path.
    > > >
    > > > Same here: these are concerns I have already expressed before.
    > > > Unfortunately, we didn't come up with anything better.
    > > >
    > > > > Why can't we install the flow on all the configured queues for a specific inport?
    > > > Flow handling is per port, not per queue.
    > > > > This will ensure the packets will have the mark even after the RSS
    > > > > hash distribution.
    > > >
    > > > Like how? The QUEUE action only accepts one queue index. Setting it
    > > > many times will only let the last one take effect. The other
    > > > possibility I'm aware of is RTE_FLOW_ACTION_TYPE_RSS, which,
    > > > unfortunately, is only supported by Mellanox in DPDK. Yet again, I
    > > > was told it does not function well.
    > > [Sugesh] Hmm. I get what you meant. Flow director has an option called
    > > passthrough. This allows the RSS hash to be applied to the packet after the filter.
    > > If I remember correctly this has been supported on the XL710 all along.
    > > This would allow programming the flow without any queue index.
    > 
    > Good to know. However, if you git grep it, you will see that only i40e and tap
    > have this support.
    [Sugesh] Yes, you are right. 
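
[Note: for reference, the RTE_FLOW_ACTION_TYPE_RSS alternative mentioned above
would replace the single QUEUE action with a list of queues to hash across, so
the marked flow is not pinned to one queue.  A sketch follows; note that
struct rte_flow_action_rss changed layout in DPDK 18.05 and the fields below
follow the newer layout, and that the helper name, queue list and hash types
are assumptions for illustration only.]

    #include <stdint.h>
    #include <rte_ethdev.h>
    #include <rte_flow.h>

    /* Sketch only: MARK combined with an RSS action instead of QUEUE, so the
     * marked packets are still spread across the given queues.  As noted in
     * the thread, PMD support for this action is limited. */
    static struct rte_flow *
    example_mark_rss_flow(uint16_t port_id, uint32_t mark_id,
                          const uint16_t *queues, uint32_t n_queues,
                          struct rte_flow_error *error)
    {
        struct rte_flow_attr attr = { .ingress = 1 };

        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH },   /* placeholder match */
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        struct rte_flow_action_mark mark = { .id = mark_id };

        struct rte_flow_action_rss rss = {
            .func = RTE_ETH_HASH_FUNCTION_DEFAULT,
            .level = 0,
            .types = ETH_RSS_IP,    /* hash on the IP header; newer DPDK
                                     * releases spell this RTE_ETH_RSS_IP */
            .key_len = 0,           /* keep the device's default RSS key */
            .key = NULL,
            .queue_num = n_queues,
            .queue = queues,
        };

        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
            { .type = RTE_FLOW_ACTION_TYPE_RSS,  .conf = &rss },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, error);
    }
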
    > 
    > > If there is a
    > > functionality issue configuring the MARK action properly, it has to
    > > be fixed in DPDK rather than worked around in OVS.
    > 
    > Again, I can't agree more on this. But the fact is that it's not an easy task
    > to fix DPDK. For Mellanox, I don't know exactly how many parts need to be
    > fixed (the DPDK PMD, libibverbs, the kernel, etc.). For others, it might
    > just be a hardware limitation.
    > 
    > It's even harder (if not impossible) to get most (if not all) DPDK PMDs to
    > have rte_flow support.
    > 
    > Even if we could do that, it may take years to finish. At least, I see no
    > related tasks planned for DPDK v17.11.
    [Sugesh] OK. IMO making the changes in DPDK is cleaner and avoids a lot of extra
    work in OVS.

[Darrell] The queue action workaround (for Intel and Mellanox NICs) has been discussed
extensively in the first version of the patchset and in the last two DPDK public meetings.
Nobody likes it.
Some other mitigating options have already been discussed.

I am not sure it is feasible to wait for the underlying support to arrive, assuming it ever does.
However, some requests for enhancements could be made in parallel.


    > 
    > > > Also, even if it could work, I think it would still be problematic. I'm
    > > > thinking about what might happen in the following case.
    > > >
    > > > Assume there is a 5-tuple flow. According to the initial RSS settings
    > > > by OVS, all packets matching that flow would end up being received
    > > > from one queue only. If we re-do the RSS settings on it and the RSS
    > > > settings are the same, the behaviour might be the same. If not,
    > > > those packets which are supposed to be
    > > [Sugesh] When a user changes the number of queues, the RSS setting might
    > > change.
    > > Also, in the current design, when a queue gets pinned to a different PMD
    > > at run time, the mark details may be lost since they are kept on the PMD struct.
    > > > distributed to one queue only might be distributed to many queues.
    > > >
    > > > Is it a valid concern?
    > > [Sugesh] I feel there will be performance and scalability issues if we
    > > want to program a flow for a specific queue ID. Hence I prefer to have flow
    > > programming without queue-specific information.
    > 
    > Trust me, I also really want to get rid of the queue action, if only we had good
    > options.
    
    > 
    > 	--yliu
    _______________________________________________
    dev mailing list
    dev at openvswitch.org
    https://mail.openvswitch.org/mailman/listinfo/ovs-dev
    


