[ovs-dev] [PATCH v2] dpif: allow adding ukeys for same flow by different pmds

Alex Wang alexw at nicira.com
Tue Aug 11 02:43:34 UTC 2015


Just post a proposed fix here,

http://openvswitch.org/pipermail/dev/2015-August/058760.html

Any comments are welcomed,


On Mon, Aug 10, 2015 at 11:12 AM, Alex Wang <alexw at nicira.com> wrote:

> Hey IIya,
>
> Thx for the reply, Daniele helped me reproduced the issue offline and we
> confirmed that this is not DPDK issue.
>
> However, this is also not a ufid issue either...
>
> Let me explain,
>
> First to clear up, with "packets from the same flow (or stream)", I mean
> packets with same src/dst MAC, IP, and TCP/UDP port...  Packets with such
> same info must be hashed to the same rx queue of the receiving port and be
> handled by the same pmd threads,...  Otherwise, we may have serious issue
> with reordering.
>
> Then, turns out this issue can be reproduced whenever we reconfigure the
> pmd threads (i.e. change number of pmd threads or change cpu pin) or the
> port receiving queues...  And when such reconfiguration happens, the
> following two things happen:
>
>    - all pmd threads will be recreated, so after recreation, there is no
> flow
>      flow exists in any pmd thread's classifier and exact match cache.
>
>    - the rx queues of each port will be re-mapped (highly likely to a
> different
>      pmd thread).
>
> So, after the queue re-mapping, the pmd thread will fire a new upcall
> handling request to the ofproto-dpif-upcall module.  However, the module
> has
> no knowledge of the reconfiguration in the dpif-netdev layer, and still
> think
> the same packet has already been installed to in dpif-netdev module
> (even though the old pmd thread has been reset).  And thusly, the ofproto-
> dpif-upcall module will fail the flow installation and increment the
> handler_duplicate_upcall coverage counter.
>
> Expectedly, this will only be a transient state, since the revalidator
> should
> cleans up the old ukey (with the ufid) since it can no longer be dumped
> from the dpif-netdev module.  But there is a bug that makes the
> revalidators
> skip doing so.
>
> I'd like to explain more if things are still unclear.  Meanwhile, we will
> discuss further on the fix and post the patch asap...
>
> Thanks,
> Alex Wang,
>
>
> On Mon, Aug 10, 2015 at 7:57 AM, Ilya Maximets <i.maximets at samsung.com>
> wrote:
>
>> Hello, Alex.
>> I think, that this problem doesn't depend on dpdk. It is generic problem
>> with ukeys.
>> I found it while testing ODP-OVS from (Linaro Networking Group) without
>> dpdk.
>> IMHO, there must be opportunity to install ukeys for same flow by
>> different pmd threads
>> anyway, because they are reading from different rx queues.
>>
>> Sorry for delay, was busy fixing coverage to show it to you.)
>> ( http://openvswitch.org/pipermail/dev/2015-August/058743.html )
>>
>> #  ./bin/ovs-appctl coverage/show
>> Event coverage, avg rate over last: 5 seconds, last minute, last hour,
>> hash=09437d7d:
>> bridge_reconfigure         0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> ofproto_flush              0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> ofproto_recv_openflow      0.0/sec     0.000/sec        0.0017/sec
>>  total: 6
>> ofproto_update_port        0.0/sec     0.000/sec        0.0011/sec
>>  total: 4
>> rev_reconfigure            0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> rev_flow_table             0.0/sec     0.000/sec        0.0011/sec
>>  total: 4
>> handler_duplicate_upcall 243016.0/sec 268180.317/sec     5728.5375/sec
>>  total: 20622735
>> xlate_actions            243016.2/sec 259283.650/sec     5580.2606/sec
>>  total: 20088938
>> cmap_expand                0.0/sec     0.000/sec        0.0019/sec
>>  total: 7
>> cmap_shrink                0.0/sec     0.000/sec        0.0008/sec
>>  total: 3
>> dpif_port_add              0.0/sec     0.000/sec        0.0008/sec
>>  total: 3
>> dpif_flow_flush            0.0/sec     0.000/sec        0.0006/sec
>>  total: 2
>> dpif_flow_get              0.0/sec     0.000/sec        0.0014/sec
>>  total: 5
>> dpif_flow_put              0.0/sec     0.000/sec        0.0014/sec
>>  total: 5
>> dpif_flow_del              0.0/sec     0.000/sec        0.0014/sec
>>  total: 5
>> dpif_execute               0.0/sec     0.000/sec        0.0006/sec
>>  total: 2
>> flow_extract               0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> hmap_pathological          0.0/sec     0.000/sec        0.0050/sec
>>  total: 18
>> hmap_expand               13.4/sec    19.467/sec        0.8011/sec
>>  total: 2884
>> netdev_received          71024.6/sec 77401.583/sec     1678.5072/sec
>>  total: 6042626
>> netdev_sent              306446.4/sec 328582.933/sec     7084.3894/sec
>>  total: 25503802
>> netdev_get_stats           0.6/sec     0.600/sec        0.0200/sec
>>  total: 72
>> txn_unchanged              0.2/sec     0.200/sec        0.0067/sec
>>  total: 24
>> txn_incomplete             0.0/sec     0.000/sec        0.0006/sec
>>  total: 2
>> txn_success                0.0/sec     0.000/sec        0.0006/sec
>>  total: 2
>> poll_create_node          63.8/sec    89.917/sec        3.0958/sec
>>  total: 11145
>> poll_zero_timeout          0.0/sec     0.000/sec        0.0017/sec
>>  total: 6
>> rconn_queued               0.0/sec     0.000/sec        0.0008/sec
>>  total: 3
>> rconn_sent                 0.0/sec     0.000/sec        0.0008/sec
>>  total: 3
>> pstream_open               0.0/sec     0.000/sec        0.0008/sec
>>  total: 3
>> stream_open                0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> unixctl_received           0.0/sec     0.383/sec        0.0131/sec
>>  total: 47
>> unixctl_replied            0.0/sec     0.383/sec        0.0131/sec
>>  total: 47
>> util_xalloc              729151.8/sec 778037.800/sec    16750.5969/sec
>>  total: 60302149
>> vconn_received             0.0/sec     0.000/sec        0.0025/sec
>>  total: 9
>> vconn_sent                 0.0/sec     0.000/sec        0.0017/sec
>>  total: 6
>> netdev_set_policing        0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> netdev_get_ifindex         0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> netdev_get_hwaddr          0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> netdev_set_hwaddr          0.0/sec     0.000/sec        0.0003/sec
>>  total: 1
>> netdev_get_ethtool         0.0/sec     0.000/sec        0.0008/sec
>>  total: 3
>> netlink_received           0.2/sec     0.200/sec        0.0128/sec
>>  total: 46
>> netlink_recv_jumbo         0.2/sec     0.200/sec        0.0067/sec
>>  total: 24
>> netlink_sent               0.2/sec     0.200/sec        0.0097/sec
>>  total: 35
>> nln_changed                0.0/sec     0.000/sec        0.0014/sec
>>  total: 5
>> 54 events never hit
>>
>> But also, I think, that "packets from the same flow (or stream) should
>> only be
>> handled by one pmd thread" is a very bad behavior. I don't know exactly
>> how dpdk works,
>> but it shouldn't work so.
>>
>> Best regards, Ilya Maximets.
>>
>> On 08.08.2015 01:28, Alex Wang wrote:
>> > Thx for reporting this, I worry this may not be an ovs issue, but to do
>> with
>> > the dpdk driver...  Here is my analysis, please correct me if wrong,
>> >
>> > When pmd threads are spawned, each of them will take ownership of port
>> > queues in a round robin fashion...  In other words, each port queue will
>> > be read by only one pmd thread.
>> >
>> > Then based on my understanding the dpdk module will hash packets to one
>> > particular queue on the receiving port.  Thusly packets from the same
>> flow
>> > (or stream) should only be handled by one pmd thread.
>> >
>> > Next, the issue you reported will only happen when the packets from same
>> > flow (or stream) are handled by multiple pmd threads.  [In that case,
>> only one
>> > pmd thread will have the flow installed in its classifier (and exact
>> match
>> > cache)...  and all other pmd threads fail to install the flow due to
>> duplicate
>> > ufid]
>> >
>> > So, based on the analysis, this really should not happen, and if
>> happens,
>> > indicate a packet queueing issue in dpdk.
>> >
>> > Ilya, could you reproduce this issue and provide the `ovs-appctl
>> > coverage/show` output?  Want to confirm that the
>> 'handler_duplicate_upcall'
>> > counter is increasing when this issue happens.
>> >
>> > Daniele, could you also provide thoughts on this?  Since I'm not sure
>> if the
>> > ovs dpdk code changes (since i last touched it long time ago) affect
>> the issue
>> > here.
>> >
>> >
>> > Thanks,
>> > Alex Wang,
>> >
>> >
>> >
>> >
>> > On Fri, Aug 7, 2015 at 2:35 AM, Ilya Maximets <i.maximets at samsung.com
>> <mailto:i.maximets at samsung.com>> wrote:
>> >
>> >     In multiqueue mode several pmd threads may process one
>> >     port, but different queues. Flow doesn't depend on queue.
>> >
>> >     So, while miss upcall processing, all threads (except first
>> >     for that port) will receive error = ENOSPC due to
>> >     ukey_install failure. Therefore they will not add the flow
>> >     to flow_table and will not insert it to exact match cache.
>> >
>> >     As a result all threads (except first for that port) will
>> >     always execute a miss.
>> >
>> >     Fix that by mixing pmd_id with the bits from the ufid
>> >     for ukey->hash calculation.
>> >
>> >     Signed-off-by: Ilya Maximets <i.maximets at samsung.com <mailto:
>> i.maximets at samsung.com>>
>> >     ---
>> >      ofproto/ofproto-dpif-upcall.c | 22 ++++++++++++----------
>> >      1 file changed, 12 insertions(+), 10 deletions(-)
>> >
>> >     diff --git a/ofproto/ofproto-dpif-upcall.c
>> b/ofproto/ofproto-dpif-upcall.c
>> >     index 0f2e186..57a7f34 100644
>> >     --- a/ofproto/ofproto-dpif-upcall.c
>> >     +++ b/ofproto/ofproto-dpif-upcall.c
>> >     @@ -291,7 +291,8 @@ static bool ukey_install_start(struct udpif *,
>> struct udpif_key *ukey);
>> >      static bool ukey_install_finish(struct udpif_key *ukey, int error);
>> >      static bool ukey_install(struct udpif *udpif, struct udpif_key
>> *ukey);
>> >      static struct udpif_key *ukey_lookup(struct udpif *udpif,
>> >     -                                     const ovs_u128 *ufid);
>> >     +                                     const ovs_u128 *ufid,
>> >     +                                     const unsigned pmd_id);
>> >      static int ukey_acquire(struct udpif *, const struct dpif_flow *,
>> >                              struct udpif_key **result, int *error);
>> >      static void ukey_delete__(struct udpif_key *);
>> >     @@ -1143,7 +1144,8 @@ process_upcall(struct udpif *udpif, struct
>> upcall *upcall,
>> >                  }
>> >                  if (actions_len == 0) {
>> >                      /* Lookup actions in userspace cache. */
>> >     -                struct udpif_key *ukey = ukey_lookup(udpif,
>> upcall->ufid);
>> >     +                struct udpif_key *ukey = ukey_lookup(udpif,
>> upcall->ufid,
>> >     +
>>  upcall->pmd_id);
>> >                      if (ukey) {
>> >                          actions = ukey->actions->data;
>> >                          actions_len = ukey->actions->size;
>> >     @@ -1320,19 +1322,19 @@ handle_upcalls(struct udpif *udpif, struct
>> upcall *upcalls,
>> >      }
>> >
>> >      static uint32_t
>> >     -get_ufid_hash(const ovs_u128 *ufid)
>> >     +get_ukey_hash(const ovs_u128 *ufid, const unsigned pmd_id)
>> >      {
>> >     -    return ufid->u32[0];
>> >     +    return ufid->u32[0] + pmd_id;
>> >      }
>> >
>> >      static struct udpif_key *
>> >     -ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid)
>> >     +ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid, const
>> unsigned pmd_id)
>> >      {
>> >          struct udpif_key *ukey;
>> >     -    int idx = get_ufid_hash(ufid) % N_UMAPS;
>> >     +    int idx = get_ukey_hash(ufid, pmd_id) % N_UMAPS;
>> >          struct cmap *cmap = &udpif->ukeys[idx].cmap;
>> >
>> >     -    CMAP_FOR_EACH_WITH_HASH (ukey, cmap_node, get_ufid_hash(ufid),
>> cmap) {
>> >     +    CMAP_FOR_EACH_WITH_HASH (ukey, cmap_node, get_ukey_hash(ufid,
>> pmd_id), cmap) {
>> >              if (ovs_u128_equals(&ukey->ufid, ufid)) {
>> >                  return ukey;
>> >              }
>> >     @@ -1362,7 +1364,7 @@ ukey_create__(const struct nlattr *key,
>> size_t key_len,
>> >          ukey->ufid_present = ufid_present;
>> >          ukey->ufid = *ufid;
>> >          ukey->pmd_id = pmd_id;
>> >     -    ukey->hash = get_ufid_hash(&ukey->ufid);
>> >     +    ukey->hash = get_ukey_hash(&ukey->ufid, pmd_id);
>> >          ukey->actions = ofpbuf_clone(actions);
>> >
>> >          ovs_mutex_init(&ukey->mutex);
>> >     @@ -1490,7 +1492,7 @@ ukey_install_start(struct udpif *udpif,
>> struct udpif_key *new_ukey)
>> >          idx = new_ukey->hash % N_UMAPS;
>> >          umap = &udpif->ukeys[idx];
>> >          ovs_mutex_lock(&umap->mutex);
>> >     -    old_ukey = ukey_lookup(udpif, &new_ukey->ufid);
>> >     +    old_ukey = ukey_lookup(udpif, &new_ukey->ufid,
>> new_ukey->pmd_id);
>> >          if (old_ukey) {
>> >              /* Uncommon case: A ukey is already installed with the
>> same UFID. */
>> >              if (old_ukey->key_len == new_ukey->key_len
>> >     @@ -1572,7 +1574,7 @@ ukey_acquire(struct udpif *udpif, const
>> struct dpif_flow *flow,
>> >          struct udpif_key *ukey;
>> >          int retval;
>> >
>> >     -    ukey = ukey_lookup(udpif, &flow->ufid);
>> >     +    ukey = ukey_lookup(udpif, &flow->ufid, flow->pmd_id);
>> >          if (ukey) {
>> >              retval = ovs_mutex_trylock(&ukey->mutex);
>> >          } else {
>> >     --
>> >     2.1.4
>> >
>> >     _______________________________________________
>> >     dev mailing list
>> >     dev at openvswitch.org <mailto:dev at openvswitch.org>
>> >     http://openvswitch.org/mailman/listinfo/dev
>> >
>> >
>>
>
>



More information about the dev mailing list