[ovs-dev] [patch net-next v2 8/9] switchdev: introduce Netlink API

Tom Herbert therbert at google.com
Mon Sep 22 15:10:08 UTC 2014


On Mon, Sep 22, 2014 at 1:13 AM, Thomas Graf <tgraf at suug.ch> wrote:
> On 09/20/14 at 03:50pm, Alexei Starovoitov wrote:
>> I think HW should not be limited by SW abstractions whether
>> these abstractions are called flows, n-tuples, bridge or else.
>> Really looking forward to seeing "device reporting the headers as
>> header fields (len, offset) and the associated parse graph"
>> as the first step.
>>
>> Another topic that this discussion didn't cover yet is how this
>> all connects to tunnels and what is 'tunnel offloading'.
>> imo flow offloading by itself serves only academic interest.
>
> We haven't touched encryption yet either ;-)
>
> Certainly true for the host case. The Linux on TOR case is less
> dependent on this and L2/L3 offload w/o encap already has value.
>
Thomas, can you (or someone else) quantify what the host case is? I
suppose there may be merit in using a switch on the NIC for kernel bypass
scenarios, but I'm still having a hard time understanding how this
could be integrated into the host stack with benefits that outweigh the
complexity. The history of stateful offloads in NICs is not great, and
encapsulation (stuffing a few bytes of header into a packet) is in
itself not nearly an expensive enough operation to warrant offloading
to the NIC. Personally, if NIC vendors are going to focus on stateful
offload, I would rather see it be for encryption, which I believe
currently does warrant offload at 40G and higher speeds.

Tom

> I'm with you though, all of this has little value on the host in
> the DC if stateful encap offload is not incorporated. I expect the
> HW to provide filters on the outer header plus metadata in the
> encap. Actually, this was a follow-up question I had for John as
> this is not easily describable with offset/len filters. How would
> we represent such capabilities?
>
> The TX side of this was one of the reasons why I initially thought
> it would be beneficial to implement a cache-like offload, as we could
> serve an initial encap in SW, do the FIB lookup and offload it
> transparently to avoid replicating the FIB in user space.
>
> What seems most feasible to me right now is to separate the offload
> of the encap action from the IP -> dev mapping decision. The eSwitch
> would send the first encap for an unknown dest IP to the CPU due
> to a miss in the IP mapping table, the CPU would do the FIB lookup,
> update the table and send it back.
>
> What do you have in mind?
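
For reference, the miss path proposed in the quoted text (the eSwitch sends
the first encap for an unknown dest IP to the CPU, the CPU does the FIB
lookup, updates the table and sends the packet back) could be modelled
roughly as below. This is only a toy sketch under stated assumptions, not a
kernel or driver interface: the class EncapOffloadTable, its transmit()
method, the exact-match hw_table and the "hw"/"cpu"/"drop" labels are all
hypothetical names made up for illustration.

```python
import ipaddress


class EncapOffloadTable:
    """Hypothetical model of an eSwitch dest-IP -> egress device table."""

    def __init__(self, fib):
        # fib: iterable of (prefix, dev) routes kept on the CPU side only.
        self.fib = [(ipaddress.ip_network(p), dev) for p, dev in fib]
        # Exact-match dest IP -> dev; stands in for the hardware mapping table.
        self.hw_table = {}
        # Number of packets punted to the CPU because of a table miss.
        self.misses = 0

    def _fib_lookup(self, dst):
        # Software longest-prefix match over the CPU-side FIB.
        addr = ipaddress.ip_address(dst)
        matches = [(net, dev) for net, dev in self.fib if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0].prefixlen)[1]

    def transmit(self, dst):
        # Fast path: the mapping is already offloaded.
        dev = self.hw_table.get(dst)
        if dev is not None:
            return dev, "hw"
        # Miss: punt to the CPU, do the FIB lookup, then update the table.
        self.misses += 1
        dev = self._fib_lookup(dst)
        if dev is None:
            return None, "drop"
        self.hw_table[dst] = dev
        return dev, "cpu"


table = EncapOffloadTable([("10.0.0.0/8", "eth0"), ("10.1.0.0/16", "eth1")])
first = table.transmit("10.1.2.3")   # miss, resolved by the CPU
second = table.transmit("10.1.2.3")  # served from the offloaded table
```

In this model only the first packet to a given dest IP takes the CPU path, so
the FIB itself never has to be replicated outside the software stack; whether
real hardware can express the table this way is exactly the open question in
the thread.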
