[ovs-dev] Cloning packets for "action: set field"
sghinet at cloudbasesolutions.com
Tue Aug 5 12:00:03 UTC 2014
I believe there are more complexities associated to this packet cloning.
Consider for instance the following flow:
if (conditions on packet) =>
set ip_dest: X1; out_to_port: vm2
set ip_dest: X2; out_to_port: vm3
Given that multiple destinations are set, and only then the packet is outputted, then, given the issue with cloning, all packets (that will be outputted to vm1, vm2, vm3) will have the same, last destination ip.
I believe we cannot actually copy only the headers: if the packet contains a single MDL, there is nothing else we can do but to duplicate the packet.
For the case where the NB has multiple MDLs chained, we must copy the first MDL(s), i.e. the MDL or MDLs that contain the headers that must be copied.
There is also a problem with NBL completion (say we do the MDL copy and replacement):
for the case above, consider:
out_to_port: vm1; (NB1)
set ip_dest: X1; out_to_port: vm2 (clone to NB2, create new MDL, store ptr to original MDL)
set ip_dest: X2; out_to_port: vm3(clone to NB3, create new MDL, store ptr to original MDL)
At completion, we may need to do this:
NB3: Put back old MDL into original NB, free clone NB3
NB2: Put back old MDL (but already done before), and free NB2
NB1: it's the original, so nothing special.
In order to implement this "but already done before" case, we need to use ref counting for NBL context data (which I have seen you do have some use of ref counting)
However, consider the complexity of restoring an original NB when multiple MDLs have been replaced, by multiple clones. The big problem here is that the whole MDL link would be broken.
I strongly believe that building this component with the mind set "optimize first, fix bugs later" is not at all a good way to do it. Plus, it would make the code more and more intrincate, as we continue to both find new problems and we get to have large pieces of code (dealing with NBLs) spread all over.
I personally believe a better approach would be "start with something simpler, which is fully functional even though suboptimal, and then continue by building on top of it, improving incrementally for each of various cases".
From: Saurabh Shah [ssaurabh at vmware.com]
Sent: Monday, August 04, 2014 10:25 PM
To: Samuel Ghinet
Cc: dev at openvswitch.org
Subject: RE: Cloning packets for "action: set field"
This is a bug. As you rightfully pointed, we shouldn't modify the original packet but instead copy out the relevant bits before modifying them. Copying the entire data buffer is simpler, but sub-optimal. We should only copy out the headers that are being modified. We already have the infrastructure to do this. I will create an issue to track it.
> -----Original Message-----
> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Samuel
> Sent: Saturday, August 02, 2014 5:03 AM
> To: dev at openvswitch.org
> Subject: [ovs-dev] Cloning packets for "action: set field"
> Hello guys,
> I wanted to ask you: do you have buffer management functionality to
> duplicate a packet?
> I have seen that the function OvsOutputBeforeSetAction CLONES instead of
> duplicating the packet.
> Did you know that, when cloning a packet, both the old and the cloned
> packet reference the same data (buffer)?
> So that setting bytes, say, in the ipv4 header of the cloned NET_BUFFER
> actually modifies the original NET_BUFFER as well?
> Also, we are not allowed to set data in the original packet. We must create a
> clone / duplicate for this.
> For tunneling (i.e. adding headroom) cloning is ok, because the clone writes
> bytes to the "unused" area of the buffer, or allocates a new MDL for the
> headroom (which is removed at Complete).
> The procedure for setting data within the buffer using cloning is a bit more
> You must allocate a new MDL, copy the 'modified' data into its buffer, and
> chain it to the cloned NET_BUFFER (replacing the old MDL). And at Complete,
> you must free your MDL and put the old MDL back.
> A simpler method would be to duplicate the buffer, instead of cloning it.
> Here the architecture of a cloned NET_BUFFER_LIST is presented:
> Samuel Ghinet
> dev mailing list
> dev at openvswitch.org
More information about the dev