[ovs-dev] [PATCH v2] dpif: allow adding ukeys for same flow by different pmds

Alex Wang alexw at nicira.com
Mon Aug 10 18:12:51 UTC 2015


Hey Ilya,

Thx for the reply.  Daniele helped me reproduce the issue offline, and we
confirmed that this is not a DPDK issue.

However, this is not a ufid issue either...

Let me explain,

First, to clear things up: by "packets from the same flow (or stream)" I mean
packets with the same src/dst MAC, IP, and TCP/UDP port...  Packets with the
same header info must be hashed to the same rx queue of the receiving port and
be handled by the same pmd thread...  Otherwise, we may have serious issues
with reordering.

Then, it turns out this issue can be reproduced whenever we reconfigure the
pmd threads (i.e., change the number of pmd threads or the cpu pinning) or
the ports' receiving queues...  When such a reconfiguration happens, the
following two things happen:

   - all pmd threads will be recreated, so after recreation, no flow
     exists in any pmd thread's classifier or exact match cache.

   - the rx queues of each port will be re-mapped (highly likely to a
     different pmd thread).

So, after the queue re-mapping, the pmd thread will fire a new upcall
handling request to the ofproto-dpif-upcall module.  However, that module
has no knowledge of the reconfiguration in the dpif-netdev layer, and still
thinks the flow for the same packet is already installed in the dpif-netdev
module (even though the old pmd thread has been reset).  Thus, the
ofproto-dpif-upcall module fails the flow installation and increments the
handler_duplicate_upcall coverage counter.

Expectedly, this should only be a transient state, since the revalidators
should clean up the old ukey (with the stale ufid) once it can no longer be
dumped from the dpif-netdev module.  But there is a bug that makes the
revalidators skip doing so.

I'd like to explain more if things are still unclear.  Meanwhile, we will
discuss the fix further and post a patch asap...

Thanks,
Alex Wang,


On Mon, Aug 10, 2015 at 7:57 AM, Ilya Maximets <i.maximets at samsung.com>
wrote:

> Hello, Alex.
> I think this problem doesn't depend on dpdk. It is a generic problem
> with ukeys.
> I found it while testing ODP-OVS (from the Linaro Networking Group)
> without dpdk.
> IMHO, there must be a way to install ukeys for the same flow by
> different pmd threads anyway, because they are reading from different
> rx queues.
>
> Sorry for the delay, I was busy fixing coverage to show it to you. :)
> ( http://openvswitch.org/pipermail/dev/2015-August/058743.html )
>
> #  ./bin/ovs-appctl coverage/show
> Event coverage, avg rate over last: 5 seconds, last minute, last hour,
> hash=09437d7d:
> bridge_reconfigure         0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> ofproto_flush              0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> ofproto_recv_openflow      0.0/sec     0.000/sec        0.0017/sec
>  total: 6
> ofproto_update_port        0.0/sec     0.000/sec        0.0011/sec
>  total: 4
> rev_reconfigure            0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> rev_flow_table             0.0/sec     0.000/sec        0.0011/sec
>  total: 4
> handler_duplicate_upcall 243016.0/sec 268180.317/sec     5728.5375/sec
>  total: 20622735
> xlate_actions            243016.2/sec 259283.650/sec     5580.2606/sec
>  total: 20088938
> cmap_expand                0.0/sec     0.000/sec        0.0019/sec
>  total: 7
> cmap_shrink                0.0/sec     0.000/sec        0.0008/sec
>  total: 3
> dpif_port_add              0.0/sec     0.000/sec        0.0008/sec
>  total: 3
> dpif_flow_flush            0.0/sec     0.000/sec        0.0006/sec
>  total: 2
> dpif_flow_get              0.0/sec     0.000/sec        0.0014/sec
>  total: 5
> dpif_flow_put              0.0/sec     0.000/sec        0.0014/sec
>  total: 5
> dpif_flow_del              0.0/sec     0.000/sec        0.0014/sec
>  total: 5
> dpif_execute               0.0/sec     0.000/sec        0.0006/sec
>  total: 2
> flow_extract               0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> hmap_pathological          0.0/sec     0.000/sec        0.0050/sec
>  total: 18
> hmap_expand               13.4/sec    19.467/sec        0.8011/sec
>  total: 2884
> netdev_received          71024.6/sec 77401.583/sec     1678.5072/sec
>  total: 6042626
> netdev_sent              306446.4/sec 328582.933/sec     7084.3894/sec
>  total: 25503802
> netdev_get_stats           0.6/sec     0.600/sec        0.0200/sec
>  total: 72
> txn_unchanged              0.2/sec     0.200/sec        0.0067/sec
>  total: 24
> txn_incomplete             0.0/sec     0.000/sec        0.0006/sec
>  total: 2
> txn_success                0.0/sec     0.000/sec        0.0006/sec
>  total: 2
> poll_create_node          63.8/sec    89.917/sec        3.0958/sec
>  total: 11145
> poll_zero_timeout          0.0/sec     0.000/sec        0.0017/sec
>  total: 6
> rconn_queued               0.0/sec     0.000/sec        0.0008/sec
>  total: 3
> rconn_sent                 0.0/sec     0.000/sec        0.0008/sec
>  total: 3
> pstream_open               0.0/sec     0.000/sec        0.0008/sec
>  total: 3
> stream_open                0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> unixctl_received           0.0/sec     0.383/sec        0.0131/sec
>  total: 47
> unixctl_replied            0.0/sec     0.383/sec        0.0131/sec
>  total: 47
> util_xalloc              729151.8/sec 778037.800/sec    16750.5969/sec
>  total: 60302149
> vconn_received             0.0/sec     0.000/sec        0.0025/sec
>  total: 9
> vconn_sent                 0.0/sec     0.000/sec        0.0017/sec
>  total: 6
> netdev_set_policing        0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> netdev_get_ifindex         0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> netdev_get_hwaddr          0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> netdev_set_hwaddr          0.0/sec     0.000/sec        0.0003/sec
>  total: 1
> netdev_get_ethtool         0.0/sec     0.000/sec        0.0008/sec
>  total: 3
> netlink_received           0.2/sec     0.200/sec        0.0128/sec
>  total: 46
> netlink_recv_jumbo         0.2/sec     0.200/sec        0.0067/sec
>  total: 24
> netlink_sent               0.2/sec     0.200/sec        0.0097/sec
>  total: 35
> nln_changed                0.0/sec     0.000/sec        0.0014/sec
>  total: 5
> 54 events never hit
>
> But also, I think that "packets from the same flow (or stream) should
> only be handled by one pmd thread" is very bad behavior. I don't know
> exactly how dpdk works, but it shouldn't work that way.
>
> Best regards, Ilya Maximets.
>
> On 08.08.2015 01:28, Alex Wang wrote:
> > Thx for reporting this. I worry this may not be an ovs issue, but rather
> > something to do with the dpdk driver...  Here is my analysis, please
> > correct me if wrong,
> >
> > When pmd threads are spawned, each of them will take ownership of port
> > queues in a round robin fashion...  In other words, each port queue will
> > be read by only one pmd thread.
> >
> > Then, based on my understanding, the dpdk module will hash packets to one
> > particular queue on the receiving port.  Thus, packets from the same flow
> > (or stream) should only be handled by one pmd thread.
> >
> > Next, the issue you reported will only happen when packets from the same
> > flow (or stream) are handled by multiple pmd threads.  [In that case,
> > only one pmd thread will have the flow installed in its classifier (and
> > exact match cache)...  and all other pmd threads fail to install the
> > flow due to a duplicate ufid.]
> >
> > So, based on this analysis, this really should not happen, and if it
> > does, it indicates a packet queueing issue in dpdk.
> >
> > Ilya, could you reproduce this issue and provide the `ovs-appctl
> > coverage/show` output?  I want to confirm that the
> > 'handler_duplicate_upcall' counter is increasing when this issue happens.
> >
> > Daniele, could you also provide thoughts on this?  I'm not sure if the
> > ovs dpdk code changes (since I last touched it a long time ago) affect
> > the issue here.
> >
> >
> > Thanks,
> > Alex Wang,
> >
> >
> >
> >
> > On Fri, Aug 7, 2015 at 2:35 AM, Ilya Maximets <i.maximets at samsung.com
> <mailto:i.maximets at samsung.com>> wrote:
> >
> >     In multiqueue mode several pmd threads may process one
> >     port, but different queues. A flow doesn't depend on the queue.
> >
> >     So, during miss upcall processing, all threads (except the first
> >     for that port) will receive error = ENOSPC due to
> >     ukey_install failure. Therefore they will not add the flow
> >     to the flow_table and will not insert it into the exact match cache.
> >
> >     As a result, all threads (except the first for that port) will
> >     always execute a miss.
> >
> >     Fix that by mixing the pmd_id with bits from the ufid
> >     in the ukey->hash calculation.
> >
> >     Signed-off-by: Ilya Maximets <i.maximets at samsung.com <mailto:
> i.maximets at samsung.com>>
> >     ---
> >      ofproto/ofproto-dpif-upcall.c | 22 ++++++++++++----------
> >      1 file changed, 12 insertions(+), 10 deletions(-)
> >
> >     diff --git a/ofproto/ofproto-dpif-upcall.c
> b/ofproto/ofproto-dpif-upcall.c
> >     index 0f2e186..57a7f34 100644
> >     --- a/ofproto/ofproto-dpif-upcall.c
> >     +++ b/ofproto/ofproto-dpif-upcall.c
> >     @@ -291,7 +291,8 @@ static bool ukey_install_start(struct udpif *,
> struct udpif_key *ukey);
> >      static bool ukey_install_finish(struct udpif_key *ukey, int error);
> >      static bool ukey_install(struct udpif *udpif, struct udpif_key
> *ukey);
> >      static struct udpif_key *ukey_lookup(struct udpif *udpif,
> >     -                                     const ovs_u128 *ufid);
> >     +                                     const ovs_u128 *ufid,
> >     +                                     const unsigned pmd_id);
> >      static int ukey_acquire(struct udpif *, const struct dpif_flow *,
> >                              struct udpif_key **result, int *error);
> >      static void ukey_delete__(struct udpif_key *);
> >     @@ -1143,7 +1144,8 @@ process_upcall(struct udpif *udpif, struct
> upcall *upcall,
> >                  }
> >                  if (actions_len == 0) {
> >                      /* Lookup actions in userspace cache. */
> >     -                struct udpif_key *ukey = ukey_lookup(udpif,
> upcall->ufid);
> >     +                struct udpif_key *ukey = ukey_lookup(udpif,
> upcall->ufid,
> >     +
>  upcall->pmd_id);
> >                      if (ukey) {
> >                          actions = ukey->actions->data;
> >                          actions_len = ukey->actions->size;
> >     @@ -1320,19 +1322,19 @@ handle_upcalls(struct udpif *udpif, struct
> upcall *upcalls,
> >      }
> >
> >      static uint32_t
> >     -get_ufid_hash(const ovs_u128 *ufid)
> >     +get_ukey_hash(const ovs_u128 *ufid, const unsigned pmd_id)
> >      {
> >     -    return ufid->u32[0];
> >     +    return ufid->u32[0] + pmd_id;
> >      }
> >
> >      static struct udpif_key *
> >     -ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid)
> >     +ukey_lookup(struct udpif *udpif, const ovs_u128 *ufid, const
> unsigned pmd_id)
> >      {
> >          struct udpif_key *ukey;
> >     -    int idx = get_ufid_hash(ufid) % N_UMAPS;
> >     +    int idx = get_ukey_hash(ufid, pmd_id) % N_UMAPS;
> >          struct cmap *cmap = &udpif->ukeys[idx].cmap;
> >
> >     -    CMAP_FOR_EACH_WITH_HASH (ukey, cmap_node, get_ufid_hash(ufid),
> cmap) {
> >     +    CMAP_FOR_EACH_WITH_HASH (ukey, cmap_node, get_ukey_hash(ufid,
> pmd_id), cmap) {
> >              if (ovs_u128_equals(&ukey->ufid, ufid)) {
> >                  return ukey;
> >              }
> >     @@ -1362,7 +1364,7 @@ ukey_create__(const struct nlattr *key, size_t
> key_len,
> >          ukey->ufid_present = ufid_present;
> >          ukey->ufid = *ufid;
> >          ukey->pmd_id = pmd_id;
> >     -    ukey->hash = get_ufid_hash(&ukey->ufid);
> >     +    ukey->hash = get_ukey_hash(&ukey->ufid, pmd_id);
> >          ukey->actions = ofpbuf_clone(actions);
> >
> >          ovs_mutex_init(&ukey->mutex);
> >     @@ -1490,7 +1492,7 @@ ukey_install_start(struct udpif *udpif, struct
> udpif_key *new_ukey)
> >          idx = new_ukey->hash % N_UMAPS;
> >          umap = &udpif->ukeys[idx];
> >          ovs_mutex_lock(&umap->mutex);
> >     -    old_ukey = ukey_lookup(udpif, &new_ukey->ufid);
> >     +    old_ukey = ukey_lookup(udpif, &new_ukey->ufid,
> new_ukey->pmd_id);
> >          if (old_ukey) {
> >              /* Uncommon case: A ukey is already installed with the same
> UFID. */
> >              if (old_ukey->key_len == new_ukey->key_len
> >     @@ -1572,7 +1574,7 @@ ukey_acquire(struct udpif *udpif, const struct
> dpif_flow *flow,
> >          struct udpif_key *ukey;
> >          int retval;
> >
> >     -    ukey = ukey_lookup(udpif, &flow->ufid);
> >     +    ukey = ukey_lookup(udpif, &flow->ufid, flow->pmd_id);
> >          if (ukey) {
> >              retval = ovs_mutex_trylock(&ukey->mutex);
> >          } else {
> >     --
> >     2.1.4
> >
> >     _______________________________________________
> >     dev mailing list
> >     dev at openvswitch.org <mailto:dev at openvswitch.org>
> >     http://openvswitch.org/mailman/listinfo/dev
> >
> >
>


