[ovs-dev] [PATCH v7] datapath: Add Stateless TCP Tunneling protocol.

Jesse Gross jesse at nicira.com
Thu Apr 23 02:48:15 UTC 2015


On Wed, Apr 22, 2015 at 3:38 PM, Pravin Shelar <pshelar at nicira.com> wrote:
> On Wed, Apr 22, 2015 at 3:24 PM, Jesse Gross <jesse at nicira.com> wrote:
>> On Wed, Apr 22, 2015 at 3:19 PM, Pravin Shelar <pshelar at nicira.com> wrote:
>>> On Wed, Apr 22, 2015 at 11:31 AM, Jesse Gross <jesse at nicira.com> wrote:
>>>> On Wed, Apr 22, 2015 at 11:09 AM, Pravin Shelar <pshelar at nicira.com> wrote:
>>>>> On Wed, Apr 22, 2015 at 10:52 AM, Jesse Gross <jesse at nicira.com> wrote:
>>>>>> On Wed, Apr 22, 2015 at 10:03 AM, Pravin Shelar <pshelar at nicira.com> wrote:
>>>>>>> On Mon, Apr 20, 2015 at 9:54 PM, Jesse Gross <jesse at nicira.com> wrote:
>>>>>>>> On Mon, Apr 20, 2015 at 7:33 PM, Pravin Shelar <pshelar at nicira.com> wrote:
>>>>>>>>> On Mon, Apr 20, 2015 at 5:56 PM, Jesse Gross <jesse at nicira.com> wrote:
>>>>>>>>>> On Mon, Apr 20, 2015 at 3:54 PM, Pravin Shelar <pshelar at nicira.com> wrote:
>>>>>>>>>>> On Mon, Apr 20, 2015 at 3:36 PM, Jesse Gross <jesse at nicira.com> wrote:
>>>>>>>>>>>> On Mon, Apr 20, 2015 at 1:29 PM, Pravin B Shelar <pshelar at nicira.com> wrote:
>>>>>>>>>>>>> diff --git a/datapath/linux/compat/stt.c b/datapath/linux/compat/stt.c
>>>>>>>>>>>>> new file mode 100644
>>>>>>>>>>>>> index 0000000..209bf1a
>>>>>>>>>>>>> --- /dev/null
>>>>>>>>>>>>> +++ b/datapath/linux/compat/stt.c
>>>>>>>>>>>>> +static void update_headers(struct sk_buff *skb, bool head,
>>>>>>>>>>>>> +                              unsigned int l4_offset, unsigned int hdr_len,
>>>>>>>>>>>>> +                              bool ipv4, u32 tcp_seq)
>>>>>>>>>>>> [...]
>>>>>>>>>>>>> +       skb->truesize = SKB_TRUESIZE(skb_end_offset(skb)) + skb->data_len;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>
>>>>>>>>>>>> I wonder if there are any possible edge cases with resetting truesize
>>>>>>>>>>>> where the packet is still in someone's transmit queue (such as if we
>>>>>>>>>>>> are looping back packet). Do we need to orphan it first?
>>>>>>>>>>>>
>>>>>>>>>>> ok, I will orphan it in update_headers.
>>>>>>>>>>
>>>>>>>>>> Just to clarify - I was mostly just thinking aloud on orphaning it.
>>>>>>>>>> I'm not totally sure if that is the right thing to do or if this is
>>>>>>>>>> the right place to do it. I'm not sure what the conceptual
>>>>>>>>>> justification would be for it and it could potentially result in the
>>>>>>>>>> sender's buffers not being properly limited. Perhaps not resetting the
>>>>>>>>>> truesize is the right thing too...
>>>>>>>>>>
>>>>>>>>> I have seen warning msg if we do no keep truesize update along with
>>>>>>>>> changes to skb.
>>>>>>>>
>>>>>>>> Hmm, interesting, what is the warning? I don't think that I have seen
>>>>>>>> that before.
>>>>>>>
>>>>>>> Actually skb_try_coalesce() is updating it correctly. so there no need
>>>>>>> to change truesize anymore. I will update patch accordingly.
>>>>>>
>>>>>> That's much nicer. I also checked and other receive side code (like
>>>>>> TCP input) doesn't worry about the case where a local sender may still
>>>>>> be accounting for the packet since any type of loopback device does
>>>>>> call skb_orphan() in some form.
>>>>>>
>>>>>> I hate to bring this up but what about on transmit? In cases where we
>>>>>> merge or split skbs (skb_try_coalesce() and normalize_frag_list()
>>>>>> respectively) we do track the truesize for correctly for the result
>>>>>> but the individual pieces might not have the right destructors or
>>>>>> might not have their truesize updated for the destructor they do have.
>>>>>
>>>>> How about about we update merged skb stats (len, data_len, truesize)
>>>>> according to *delta_truesize we get from skb_try_coalesce() and then
>>>>> free the skb?
>>>>
>>>> I think that would work for the skb_try_coalesce() case (although I
>>>> would only worry about truesize, not the lengths). For
>>>> normalize_frag_list() I think we would either have to add a destructor
>>>> or not update truesize. I'm not sure that the condition where we have
>>>> frag_lists and a destructor actually ever happens in practice though.
>>>
>>> I am not sure why do we need a destructor for normalize_frag_list
>>> changes. Even with the changes in the function truesize is consistent
>>> for given skb memory usage.
>>
>> If you have a packet with a frag_list, all of the memory for the
>> individual elements will be accounted for in the truesize in the top
>> level skb. This skb could also be accounted to some socket and have a
>> destructor. When we break apart the list, we remove the truesize from
>> the head because the memory is now part of individual packets.
>> However, the destructor is presumably only on the head and so only its
>> memory will be removed from the socket accounting when it is freed but
>> not each of the other skbs that came from it.
>
> ok. In that case we can not have our own destructor since there is one
> already (I am not sure if we can use skb->cb to restore original). not
> changing true size can complicate skb coalesce, since it does update
> truesize. Easy option would be orphan skb if we are going to coalesce
> its fragments.

I'm not sure that we need our own destructor. What do you think about
just replicating the original destructor onto each of the newly
generated skbs?

Completely unrelated but I noticed that there is one discrepancy
between this implementation and the STT protocol spec. The spec
recommends that the TCP window be set to zero but this sets it to
0xffff. I only vaguely remember setting the window size to something
non-zero but we should check that and update one or the other.



More information about the dev mailing list