[ovs-dev] [PATCH 05/12] dpif-netdev: Clear flow batches inside packet_batch_execute.

Fischetti, Antonio antonio.fischetti at intel.com
Tue Oct 11 16:14:03 UTC 2016


Thanks Daniele for all your explanations.
Instead of the proposed changes, is it worth to add a comment to the code just to explain why that loop is needed? This way we will avoid other people attempting to make the same change in the future.
Could be something like:

+    /* All the flow batches need to be reset before any call to 
+    packet_batch_per_flow_execute() as it could potentially 
+    trigger recirculation.
+    When a packet matching flow ‘j’ happens to be recirculated, 
+    the nested call to dp_netdev_input__() could potentially 
+    classify the packet as matching another flow - say 'k'.
+    It could happen that in the previous call to dp_netdev_input__()
+    that same flow 'k' had already its own batches[k] still waiting
+    to be served.  So if its ‘batch’ member is not reset, the
+    recirculated packet would be wrongly appended to batches[k]
+    of the 1st call to dp_netdev_input__(). */
     for (i = 0; i < n_batches; i++) {
         batches[i].flow->batch = NULL;
     }


From: Daniele Di Proietto [mailto:diproiettod at ovn.org] 
Sent: Monday, October 10, 2016 1:31 AM
To: Fischetti, Antonio <antonio.fischetti at intel.com>
Cc: Jarno Rajahalme <jarno at ovn.org>; dev at openvswitch.org
Subject: Re: [ovs-dev] [PATCH 05/12] dpif-netdev: Clear flow batches inside packet_batch_execute.



2016-10-09 0:54 GMT-07:00 Fischetti, Antonio <antonio.fischetti at intel.com>:
Thanks Daniele,
in this patch we moved the reset of batches[i].flow->batch
from dp_netdev_input__() into packet_batch_per_flow_execute().
So before calling dp_netdev_execute_actions() the flow->batch is already NULL.

batches[0]->flow->batch is already NULL, but batches[1]->flow->batch is not NULL with this patch.  The lookup after recirculation could find batches[1]->flow

I think this should prevent recirculation, or am I missing some detail?

Suppose the datapath has the following flow table (I'm not sure something like this is ever generated by the current translation code, but it's a valid datapath flow table):
in_port(1),recird_id(0),...,tcp(src=80) actions:set(tcp(src=443)),recirc(0)   #flow 0
in_port(1),recird_id(0),...,tcp(src=443) actions:output:2                              #flow 1
Suppose the following packets enter the datapath on port 1, in the same input batch, in this order:

in_port(1),...,tcp(src=80)   #packet 0
in_port(1),...,tcp(src=443) #packet 1
Here's how the datapath behaves:
The batch is received and classified. We create two batches_per_flow
packet_batch_per_flow[0] = { .flow = flow1, .array = [packet 1] } #batch 0
packet_batch_per_flow[1] = { .flow = flow2, .array = [packet 2] } #batch 1
Flow 0 points to batch_per_flow[0] and flow 1 points to batch_per_flow[1].
Now the datapath starts to execute the batch 0. The recirculation will execute immediately and it will find flow 1.  If we didn't clear flow 1->batch before executing batch 0 (with this patch we don't), the datapath will append #packet 0 (after recirculation) to batch 1.  As I said, I believe this to be wrong because we should have created a new batch.
Every time I look at that code again I need some time to re understand this, and I actually got this wrong when I introduced the batch pointer into struct dp_netdev_flow, so I agree it's quite complicated.  Please, feel free to ask more if my reasoning seems wrong or if I'm making wrong assumptions.
Thanks,
Daniele


Antonio

> -----Original Message-----
> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Daniele Di
> Proietto
> Sent: Friday, October 7, 2016 11:46 PM
> To: Jarno Rajahalme <jarno at ovn.org>
> Cc: dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 05/12] dpif-netdev: Clear flow batches
> inside packet_batch_execute.
>
> This patch basically reverts 603f2ce04d00("dpif-netdev: Clear flow batches
> before execute.")
>
> As explained in that commit message the problem is that
> packet_batch_per_flow_execute() can trigger recirculation.  This means
> that
> we will call recursively dp_netdev_input__().  Here's a stack frame:
>
> dp_netdev_input__()
> {
>     struct packet_batch_per_flow batches[PKT_ARRAY_SIZE];
>         /*...*/
>         packet_batch_per_flow_execute()
>         {
>             dp_execute_cb()
>             {
>                 case OVS_ACTION_ATTR_RECIRC:
>                     dp_netdev_input__() {
>                         struct packet_batch_per_flow
> batches[PKT_ARRAY_SIZE];
>                         /*...*/
>                     }
>             }
>
> }
>
> In the inner dp_netdev_input__() a lookup could theoretically find the
> same
> flows that it finds in the outer.
> If we don't clear _all_ the batch pointers before calling the inner
> dp_netdev_input__() we could find a flow that still has a pointer to a
> batch in the outer dp_netdev_input__(). Then we will append packets to the
> outer batch, which is clearly wrong.
>
> 2016-10-07 14:09 GMT-07:00 Jarno Rajahalme <jarno at ovn.org>:
>
> > Daniele had a comment on this, I believe?
> >
> > > On Oct 7, 2016, at 9:17 AM, Bhanuprakash Bodireddy <
> > bhanuprakash.bodireddy at intel.com> wrote:
> > >
> > > There is a slight negative performance impact, by zeroing out the flow
> > > batch pointers in dp_netdev_input__ ahead of packet_batch_execute().
> The
> > > issue has been observed with multiple batches test scenario.
> > >
> > > This patch fixes the problem by removing the extra for loop and clear
> > > the flow batches inside the packet_batch_per_flow_execute(). Also the
> > > vtune analysis showed that the overall no. of instructions retired for
> > > dp_netdev_input__ reduced by 1,273,800,000 with this patch.
> > >
> > > Fixes: 603f2ce04d00 ("dpif-netdev: Clear flow batches before
> execute.")
> > > Signed-off-by: Bhanuprakash Bodireddy
> <bhanuprakash.bodireddy at intel.com>
> > > Signed-off-by: Antonio Fischetti <antonio.fischetti at intel.com>
> > > ---
> > > lib/dpif-netdev.c | 5 +----
> > > 1 file changed, 1 insertion(+), 4 deletions(-)
> > >
> > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > > index c002dd3..d0bb191 100644
> > > --- a/lib/dpif-netdev.c
> > > +++ b/lib/dpif-netdev.c
> > > @@ -3878,6 +3878,7 @@ packet_batch_per_flow_execute(struct
> > packet_batch_per_flow *batch,
> > > {
> > >     struct dp_netdev_actions *actions;
> > >     struct dp_netdev_flow *flow = batch->flow;
> > > +    flow->batch = NULL;
> > >
> > >     dp_netdev_flow_used(flow, batch->array.count, batch->byte_count,
> > >                         batch->tcp_flags, now);
> > > @@ -4173,10 +4174,6 @@ dp_netdev_input__(struct dp_netdev_pmd_thread
> > *pmd,
> > >     }
> > >
> > >     for (i = 0; i < n_batches; i++) {
> > > -        batches[i].flow->batch = NULL;
> > > -    }
> > > -
> > > -    for (i = 0; i < n_batches; i++) {
> > >         packet_batch_per_flow_execute(&batches[i], pmd, now);
> > >     }
> > > }
> > > --
> > > 2.4.11
> > >
> > > _______________________________________________
> > > dev mailing list
> > > dev at openvswitch.org
> > > http://openvswitch.org/mailman/listinfo/dev
> >
> > _______________________________________________
> > dev mailing list
> > dev at openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
> >
> _______________________________________________
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev



More information about the dev mailing list