[ovs-dev] [PATCH v3 00/19] Add OVS DPDK keep-alive functionality.

Bodireddy, Bhanuprakash bhanuprakash.bodireddy at intel.com
Fri Aug 4 15:24:18 UTC 2017


HI Ilya,

Thanks for looking in to this and providing your feedback. 

When this feature was first posted as RFC (https://mail.openvswitch.org/pipermail/ovs-dev/2016-July/318243.html), the implementation in OvS was done based on DPDK Keepalive library and keeping collectd in sync.  As you can see from RFC it was pretty compact code and integrated well with ceilometer and provided end to end functionality. Much of the RFC code was  to handle SHM. 

However the reviewers pointed below flaws.

- Very DPDK specific.
- Shared memory for inter process communication(Between OvS and collectd threads).
- Tracks PMD cores and not threads.
- Limited support to detect false negatives & false positives.
- Limited support to query KA status.

As per suggestions, below changes were introduced.

- Basic infrastructure to register & track threads instead of cores. (Now only PMDs are tracked only but can be extended to track non-PMD threads).
- Keep most of the APIs generic so that they can extended in the future. All generic APIs are in Keepalive.[hc]
- Remove Shared memory and introduce OvSDB.
- Add support to detect false negatives.
- appctl options to query status.

I agree that we have few issues but they can be reworked.
 -  invoke dpdk_is_enabled() from generic code (vswitchd/bridge.c) isn't nice, I had to do  to pass few unit test cases last time.
 -  Half a dozen stub APIs. I couldn't avoid it as they are needed to get the kernel datapath build.

The patch series can be categorized  in to sub patchesets (KA infrastructure/  OvSDB changes/  Query KA stats / Check False positives).  This patch series in the current form is using rte_keepalive library to handle PMD thread. But importantly has  introduced basic infrastructure to deal with other threads in the future.  

Regards,
Bhanuprakash. 

>-----Original Message-----
>From: Ilya Maximets [mailto:i.maximets at samsung.com]
>Sent: Friday, August 4, 2017 2:40 PM
>To: ovs-dev at openvswitch.org; Bodireddy, Bhanuprakash
><bhanuprakash.bodireddy at intel.com>
>Cc: Darrell Ball <dball at vmware.com>; Ben Pfaff <blp at ovn.org>; Aaron
>Conole <aconole at redhat.com>
>Subject: Re: [ovs-dev] [PATCH v3 00/19] Add OVS DPDK keep-alive
>functionality.
>
>Hi Bhanuprakash,
>
>Thanks for working on this.
>I have a general concern about implementation of this functionality:
>
>*What is the profit from using rte_keepalive library ?*
>
>Pros:
>
>    * No need to implement 'rte_keepalive_dispatch_pings()' (40 LOC)
>      and struct rte_keepalive (30 LOC, can be significantly decreased
>      by removing not needed elements) ---> ~70 LOC.
>
>Cons:
>
>    * DPDK dependency:
>
>        * Implementation of PMD threads management (KA) inside netdev code
>          (netdev-dpdk) looks very strange.
>        * Many DPDK references in generic code (like dpdk_is_enabled).
>        * Feature isn't available for the common threads (main?) wihtout DPDK.
>        * Many stubs and placeholders for cases without dpdk.
>        * No ability for unit testing.
>
>So, does it worth to use rte_keepalive? To make functionality fully generic we
>only need to implement 'rte_keepalive_dispatch_pings()'
>and few helpers. As soon as this function does nothing dpdk-specific it's a
>really simple task which will allow to greatly clean up the code. The feature is
>too big to use external library for 70 LOCs of really simple code. (Clean up
>should save much more).
>
>Am I missed something?
>Any thoughts?
>
>Best regards, Ilya Maximets.
>
>> Keepalive feature is aimed at achieving Fastpath Service Assurance in
>> OVS-DPDK deployments. It adds support for monitoring the packet
>> processing cores(PMD thread cores) by dispatching heartbeats at
>> regular intervals. Incase of heartbeat misses additional health checks
>> are enabled on the PMD thread to detect the failure and the same shall
>> be reported to higher level fault management systems/frameworks.
>>
>> The implementation uses OVSDB for reporting the health of the PMD
>threads.
>> Any external monitoring application can read the status from OVSDB at
>> regular intervals (or) subscribe to the updates in OVSDB so that they
>> get notified when the changes happen on OVSDB.
>>
>> keepalive info struct is created and initialized for storing the
>> status of the PMD threads. This is initialized by main
>> thread(vswitchd) as part of init process and will be periodically updated by
>'keepalive'
>> thread. keepalive feature can be enabled through below OVSDB settings.
>>
>>     enable-keepalive=true
>>       - Keepalive feature is disabled by default.
>>
>>     keepalive-interval="5000"
>>       - Timer interval in milliseconds for monitoring the packet
>>         processing cores.
>>
>> When KA is enabled, 'ovs-keepalive' thread shall be spawned that wakes
>> up at regular intervals to update the timestamp and status of pmd
>> cores in keepalive info struct. This information shall be read by
>> vswitchd thread and write the status in to 'keepalive' column of
>Open_vSwitch table in OVSDB.
>>
>> An external monitoring framework like collectd with ovs events support
>> can read (or) subscribe to the datapath status changes in ovsdb. When
>> the state is updated, the collectd shall be notified and will
>> eventually relay the status to ceilometer service running in the
>> controller. Below is the high level overview of deployment model.
>>
>>     Compute Node            Controller            Compute Node
>>
>>     Collectd  <----------> Ceilometer <-------->   Collectd
>>
>>     OvS DPDK                                       OvS DPDK
>>
>>     +-----+
>>     | VM  |
>>     +--+--+
>>     \---+---/
>>     |
>>     +--+---+       +------------+----------+     +------+-------+
>>     | OVS  |-----> |   ovsevents plugin    | --> |   collectd   |
>>     +--+---+       +------------+----------+     +------+-------+
>>
>>     +------+-----+     +---------------+------------+     |
>>     | Ceilometer | <-- | collectd ceilometer plugin |  <---
>>     +------+-----+     +---------------+------------+
>>
>> github: The patches can be found here:
>>   https://github.com/bbodired/ovs (Last master commit e7cd8c363)
>>
>> Performance impact:
>>   No noticeable performance or latency impact is observed with
>>   KA feature enabled.
>>
>> -------------------------------------------------
>> v2 -> v3
>>   * Rebase.
>>   * Verified with dpdk-stable-17.05.1 release.
>>   * Fixed build issues with MSVC and cross checked with appveyor.
>>
>> v1 -> v2
>>   * Rebase
>>   * Drop 01/20 Patch "Consolidate process related APIs" of V1 as it
>>     is already applied as separate patch.
>>
>> RFCv3 -> v1
>>   * Made changes to fix failures in some unit test cases.
>>   * some more code cleanup w.r.t process related APIs.
>>
>> RFCv2 -> RFCv3
>>   * Remove POSIX shared memory block implementation (suggested by
>Aaron).
>>   * Rework the logic to register and track threads instead of cores. This way
>>     in the future any thread can be registered to KA framework. For now only
>PMD
>>     threads are tracked (suggested by Aaron).
>>   * Refactor few APIs and further clean up the code.
>>
>> RFCv1 -> RFCv2
>>   * Merged the xml and schema commits to later commit where the actual
>>     implementation is done(suggested by Ben).
>>   * Fix ovs-appctl keepalive/* hang issue when KA disabled.
>>   * Fixed memory leaks with appctl commands for keepalive/pmd-health-
>show,
>>     pmd-xstats-show.
>>   * Refactor code and fixed APIs dealing with PMD health monitoring.
>>
>>
>> Bhanuprakash Bodireddy (19):
>>   dpdk: Add helper functions for DPDK datapath keepalive.
>>   process: Avoid warnings compiling process.c
>>   process: Add helper function to retrieve process status.
>>   Keepalive: Add initial keepalive support.
>>   keepalive: Add more helper functions to KA framework.
>>   dpif-netdev: Add helper function to store datapath tids.
>>   dpif-netdev: Register packet processing cores to KA framework.
>>   dpif-netdev: Enable heartbeats for DPDK datapath.
>>   keepalive: Retrieve PMD status periodically.
>>   bridge: Update keepalive status in OVSDB
>>   keepalive: Add support to query keepalive statistics.
>>   keepalive: Add support to query keepalive status.
>>   dpif-netdev: Add additional datapath health checks.
>>   keepalive: Check the link status as part of PMD health checks.
>>   keepalive: Check the packet statistics as part of PMD health checks.
>>   keepalive: Check the PMD cycle stats as part of PMD health checks.
>>   netdev-dpdk: Enable PMD health checks on heartbeat failure.
>>   keepalive: Display extended Keepalive status.
>>   Documentation: Update DPDK doc with Keepalive feature.
>>
>>  Documentation/howto/dpdk.rst |  90 +++++
>>  lib/automake.mk              |   2 +
>>  lib/dpdk-stub.c              |  36 ++
>>  lib/dpdk.c                   |  68 +++-
>>  lib/dpdk.h                   |  13 +
>>  lib/dpif-netdev.c            | 203 +++++++++-
>>  lib/dpif-netdev.h            |   7 +
>>  lib/keepalive.c              | 922
>+++++++++++++++++++++++++++++++++++++++++++
>>  lib/keepalive.h              | 149 +++++++
>>  lib/netdev-dpdk.c            | 119 +++++-
>>  lib/netdev-dpdk.h            |   5 +
>>  lib/process.c                |  84 +++-
>>  lib/process.h                |  13 +
>>  lib/util.c                   |  10 +
>>  lib/util.h                   |   1 +
>>  vswitchd/bridge.c            |  34 ++
>>  vswitchd/vswitch.ovsschema   |   8 +-
>>  vswitchd/vswitch.xml         |  49 +++
>>  18 files changed, 1779 insertions(+), 34 deletions(-)  create mode
>> 100644 lib/keepalive.c  create mode 100644 lib/keepalive.h
>>
>> --
>> 2.4.11



More information about the dev mailing list