[ovs-dev] The NET_BUFFER_LIST issue

Alessandro Pilotti apilotti at cloudbasesolutions.com
Sat Aug 2 16:51:35 UTC 2014


Thanks Sam, 

I opened an issue on GitHub for this one:
https://github.com/openvswitch/ovs-issues/issues/2

Alessandro



On 02 Aug 2014, at 19:15, Eitan Eliahu <eliahue at vmware.com> wrote:

> Hi Samuel,
> You are correct, and we are well aware of this issue; in fact, it has been on our to-do list for a long time. The plan is to create an NBL from each NB and to run the newly created NBL(s) against the flow table.
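>
> A minimal sketch of what that split could look like (assuming an NDIS 6.x context; the helper name is illustrative, the NBL pool is assumed to come from an earlier NdisAllocateNetBufferListPool() call with fAllocateNetBuffer == FALSE, and error/completion handling is mostly omitted):
>
>     #include <ndis.h>
>
>     /* Sketch only: build a chain of single-NB NBLs that reuse the
>      * original NET_BUFFERs instead of copying frame data. The original
>      * NBL must be kept alive, and its NB chain restored, before it is
>      * completed back to NDIS. */
>     static PNET_BUFFER_LIST
>     SplitNblPerNetBuffer(NDIS_HANDLE nblPool, PNET_BUFFER_LIST origNbl)
>     {
>         PNET_BUFFER_LIST head = NULL;
>         PNET_BUFFER_LIST *tail = &head;
>         PNET_BUFFER nb, next;
>
>         for (nb = NET_BUFFER_LIST_FIRST_NB(origNbl); nb != NULL; nb = next) {
>             PNET_BUFFER_LIST newNbl;
>
>             next = NET_BUFFER_NEXT_NB(nb);
>             newNbl = NdisAllocateNetBufferList(nblPool, 0, 0);
>             if (newNbl == NULL) {
>                 return NULL;    /* caller frees the partial chain */
>             }
>             NET_BUFFER_NEXT_NB(nb) = NULL;          /* detach from old chain */
>             NET_BUFFER_LIST_FIRST_NB(newNbl) = nb;  /* exactly one NB per NBL */
>             *tail = newNbl;
>             tail = &NET_BUFFER_LIST_NEXT_NBL(newNbl);
>         }
>         return head;
>     }
>
> Each NBL in the returned chain then carries exactly one Ethernet frame and can be run against the flow table on its own.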
> If you guys have some free cycles please go ahead and implement it. 
> Eitan
> 
> -----Original Message-----
> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Samuel Ghinet
> Sent: Saturday, August 02, 2014 3:08 AM
> To: dev at openvswitch.org
> Subject: [ovs-dev] The NET_BUFFER_LIST issue
> 
> Hello guys,
> 
> While working on the integration between our kernel netlink component and your driver, my attention was drawn to your implementation of buffer management.
> I expect there has already been some discussion about it, but I'd like to highlight a few things myself, hoping that together we can come up with the solution that will work best going forward.
> 
> Namely, in the OvsExtractFlow function in the code pushed upstream, the first NET_BUFFER of the NET_BUFFER_LIST is used as the basis for flow key extraction.
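>
> Schematically, the pattern is something like this (an illustration of the behavior, not the literal upstream code; the helper name and signature are made up):
>
>     /* Only the first NET_BUFFER contributes to the flow key; every
>      * other NET_BUFFER in the list implicitly inherits whatever flow
>      * the first one matches. */
>     PNET_BUFFER nb = NET_BUFFER_LIST_FIRST_NB(nbl);
>
>     status = ExtractFlowKeyFromNb(nb, &flowKey);  /* hypothetical helper */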
> 
> I hope you have read this MSDN documentation:
> http://msdn.microsoft.com/en-us/library/windows/hardware/ff570756(v=vs.85).aspx
> 
> The relevant issues here, for our project, are:
> "Each NET_BUFFER structure that is linked to a NET_BUFFER_LIST structure describes a single Ethernet frame."
> 
> "All NET_BUFFER structures that are associated with a NET_BUFFER_LIST structure must have the same Ethernet frame type and IP protocol version (IPv4 or IPv6)."
> 
> "All NET_BUFFER structures that are associated with a NET_BUFFER_LIST structure must have the same source and destination MAC addresses."
> 
> "If a driver is sending TCP or UDP frames, all of the NET_BUFFER structures that are associated with a NET_BUFFER_LIST structure must be associated with same TCP or UDP connection."
> 
> -------
> 
> In other words, we have three cases to consider:
> a) We've got one NET_BUFFER_LIST, that contains one NET_BUFFER - best case for us, nothing special to do.
> 
> b) We've got a TCP or UDP packet split across multiple NET_BUFFERs in one NET_BUFFER_LIST.
> This is not a very good case for us, because the only packet fields guaranteed to be the same for all NET_BUFFERs are:
> eth, network layer info (transport protocol only), transport layer (source and destination port only), ipv4 info (source and destination IP), arp info, and ipv6 info.
> 
> Whereas the fields that we have NO guarantee are the same across NET_BUFFERs are:
> network layer: TOS, TTL, fragment type
> transport layer: tcp flags
> ipv6 info: flow label.
> 
> c) We have something other than a TCP / UDP packet (e.g. icmp4, icmp6, etc.):
> NONE of these fields is guaranteed to be the same for each NET_BUFFER in the NET_BUFFER_LIST:
> network layer: TOS, TTL, fragment type (I don't know where mpls fits in)
> transport layer: icmp type, icmp code (icmp4 and icmp6)
> ipv4: source and destination address (they are guaranteed to be the same only for UDP and TCP)
> arp: perhaps the same problem as above, for the IP addresses
> ipv6: source address, destination address, flow label
>
> It can also happen (i.e. we have no guarantee otherwise) that we get a NET_BUFFER_LIST with two NET_BUFFERs: the first is some generic icmp6 packet, while the second contains an ICMP6 Neighbor Discovery. With the current implementation of OvsExtractFlow, we would fail to match a flow for the Neighbor Discovery packet - instead, the flow for the first icmp6 packet would be used for both packets. A per-NET_BUFFER extraction loop would avoid this, as sketched below.
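>
> Sketch of such a loop (the extractor, lookup, and key type are hypothetical names, not the current driver API):
>
>     PNET_BUFFER nb;
>
>     for (nb = NET_BUFFER_LIST_FIRST_NB(nbl); nb != NULL;
>          nb = NET_BUFFER_NEXT_NB(nb)) {
>         FLOW_KEY key;                           /* hypothetical key type */
>
>         ExtractFlowKeyFromNb(nb, &key);         /* hypothetical helper */
>         ProcessNbAgainstFlowTable(nb, &key);    /* hypothetical helper */
>     }
>
> With per-NET_BUFFER extraction, the Neighbor Discovery NET_BUFFER gets its own key instead of inheriting the first packet's flow.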
> 
> ----------
> 
> As a one-line conclusion: given the current implementation of the buffer management mechanism, the flow mechanics can at any time create or match the wrong flow for a given NET_BUFFER.
> 
> Our implementation - creating a NET_BUFFER_LIST for each existing NET_BUFFER and wrapping it in an OVS_NET_BUFFER - may be less efficient, but it covers all of these cases, because it treats each NET_BUFFER in a NET_BUFFER_LIST as a completely separate packet.
> One improvement to our OVS_NET_BUFFER implementation would be a cache of pre-allocated buffers, so as not to waste time allocating and de-allocating memory each time; another would be to take a NET_BUFFER_LIST holding a single NET_BUFFER as it is, without duplicating it - see the sketch below.
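>
> The fast path could look like this (sketch; helper names are made up):
>
>     /* A list that already carries a single NET_BUFFER needs no copy:
>      * wrap it directly. Only multi-NB lists pay the split cost, and
>      * the per-NB NBLs could be drawn from a pre-allocated cache. */
>     if (NET_BUFFER_NEXT_NB(NET_BUFFER_LIST_FIRST_NB(nbl)) == NULL) {
>         ovsNb = WrapNblAsOvsNetBuffer(nbl);      /* hypothetical, no copy */
>     } else {
>         ovsNb = SplitAndWrapNbl(nbl, nblCache);  /* hypothetical */
>     }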
> 
> I believe this can be considered an architectural issue, and it is best addressed early, rather than continuing to build functionality on top of it and having to change it later - especially given that network packets are used in many places in the project.
> 
> I am eager to hear any suggestions, as well as, perhaps, how you planned to address this issue.
> Samuel Ghinet
> 
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev



