[ovs-dev] [PATCH v3 00/19] Add OVS DPDK keep-alive functionality.
Bodireddy, Bhanuprakash
bhanuprakash.bodireddy at intel.com
Tue Aug 8 10:52:50 UTC 2017
HI Ilya,
>I understand that using rte_keepalive library was worth in the early RFC
>because size of RFC was comparable with the size of rte_keepalive library.
>But now, as so many generic things was implemented in lib/keepalive.{c,h}
>and the size of the patch-set is pretty large, IMHO, it's better to implement
>'struct rte_keepalive' and 'rte_keepalive_dispatch_pings()' inside
>lib/keepalive.{c,h} and remove dpdk library as a dependency for this
>functionality.
I agree with your suggestion and will factor in this input for next series.
>
>'rte_keepalive' doesn't have any dpdk-specific things inside. It doesn't work
>with NICs or DPDK-allocated memory. This library is just a simple wrapper.
>So, do we need the dependency from dpdk only to use this wrapper? Without
>it we'll have generic keepalive functionality for the whole OVS without
>additional subs and dpdk references in generic code.
Completely agree and this will help us avoid dummy functions.
>
>I'm asking you to try to implement 'struct rte_keepalive' and
>'rte_keepalive_dispatch_pings()' inside lib/keepalive.{c,h} and move all the
>keepalive related code out of [netdev-]dpdk.{c,h} to keepalive.{c,h} and,
>possibly, to dpif-netdev.{c,h}.
>I'm expecting significant improvements in code size, simplicity and readability.
>Also, this will allow to use keepalive without DPDK.
I have tried my best not to clutter netdev-dpdk and dpif-netdev. I hope by removing
the dependency on DPDK Keepalive library it might be even better.
I will work on this and wait for inputs from other reviewers before posting next version.
- Bhanuprakash.
>
>Best regards, Ilya Maximets.
>
>On 04.08.2017 18:24, Bodireddy, Bhanuprakash wrote:
>> HI Ilya,
>>
>> Thanks for looking in to this and providing your feedback.
>>
>> When this feature was first posted as RFC
>(https://mail.openvswitch.org/pipermail/ovs-dev/2016-July/318243.html),
>the implementation in OvS was done based on DPDK Keepalive library and
>keeping collectd in sync. As you can see from RFC it was pretty compact code
>and integrated well with ceilometer and provided end to end functionality.
>Much of the RFC code was to handle SHM.
>>
>> However the reviewers pointed below flaws.
>>
>> - Very DPDK specific.
>> - Shared memory for inter process communication(Between OvS and
>collectd threads).
>> - Tracks PMD cores and not threads.
>> - Limited support to detect false negatives & false positives.
>> - Limited support to query KA status.
>>
>> As per suggestions, below changes were introduced.
>>
>> - Basic infrastructure to register & track threads instead of cores. (Now only
>PMDs are tracked only but can be extended to track non-PMD threads).
>> - Keep most of the APIs generic so that they can extended in the
>> future. All generic APIs are in Keepalive.[hc]
>> - Remove Shared memory and introduce OvSDB.
>> - Add support to detect false negatives.
>> - appctl options to query status.
>>
>> I agree that we have few issues but they can be reworked.
>> - invoke dpdk_is_enabled() from generic code (vswitchd/bridge.c) isn't
>nice, I had to do to pass few unit test cases last time.
>> - Half a dozen stub APIs. I couldn't avoid it as they are needed to get the
>kernel datapath build.
>>
>> The patch series can be categorized in to sub patchesets (KA infrastructure/
>OvSDB changes/ Query KA stats / Check False positives). This patch series in
>the current form is using rte_keepalive library to handle PMD thread. But
>importantly has introduced basic infrastructure to deal with other threads in
>the future.
>>
>> Regards,
>> Bhanuprakash.
>>
>>> -----Original Message-----
>>> From: Ilya Maximets [mailto:i.maximets at samsung.com]
>>> Sent: Friday, August 4, 2017 2:40 PM
>>> To: ovs-dev at openvswitch.org; Bodireddy, Bhanuprakash
>>> <bhanuprakash.bodireddy at intel.com>
>>> Cc: Darrell Ball <dball at vmware.com>; Ben Pfaff <blp at ovn.org>; Aaron
>>> Conole <aconole at redhat.com>
>>> Subject: Re: [ovs-dev] [PATCH v3 00/19] Add OVS DPDK keep-alive
>>> functionality.
>>>
>>> Hi Bhanuprakash,
>>>
>>> Thanks for working on this.
>>> I have a general concern about implementation of this functionality:
>>>
>>> *What is the profit from using rte_keepalive library ?*
>>>
>>> Pros:
>>>
>>> * No need to implement 'rte_keepalive_dispatch_pings()' (40 LOC)
>>> and struct rte_keepalive (30 LOC, can be significantly decreased
>>> by removing not needed elements) ---> ~70 LOC.
>>>
>>> Cons:
>>>
>>> * DPDK dependency:
>>>
>>> * Implementation of PMD threads management (KA) inside netdev
>code
>>> (netdev-dpdk) looks very strange.
>>> * Many DPDK references in generic code (like dpdk_is_enabled).
>>> * Feature isn't available for the common threads (main?) wihtout
>DPDK.
>>> * Many stubs and placeholders for cases without dpdk.
>>> * No ability for unit testing.
>>>
>>> So, does it worth to use rte_keepalive? To make functionality fully
>>> generic we only need to implement 'rte_keepalive_dispatch_pings()'
>>> and few helpers. As soon as this function does nothing dpdk-specific
>>> it's a really simple task which will allow to greatly clean up the
>>> code. The feature is too big to use external library for 70 LOCs of
>>> really simple code. (Clean up should save much more).
>>>
>>> Am I missed something?
>>> Any thoughts?
>>>
>>> Best regards, Ilya Maximets.
>>>
>>>> Keepalive feature is aimed at achieving Fastpath Service Assurance
>>>> in OVS-DPDK deployments. It adds support for monitoring the packet
>>>> processing cores(PMD thread cores) by dispatching heartbeats at
>>>> regular intervals. Incase of heartbeat misses additional health
>>>> checks are enabled on the PMD thread to detect the failure and the
>>>> same shall be reported to higher level fault management
>systems/frameworks.
>>>>
>>>> The implementation uses OVSDB for reporting the health of the PMD
>>> threads.
>>>> Any external monitoring application can read the status from OVSDB
>>>> at regular intervals (or) subscribe to the updates in OVSDB so that
>>>> they get notified when the changes happen on OVSDB.
>>>>
>>>> keepalive info struct is created and initialized for storing the
>>>> status of the PMD threads. This is initialized by main
>>>> thread(vswitchd) as part of init process and will be periodically
>>>> updated by
>>> 'keepalive'
>>>> thread. keepalive feature can be enabled through below OVSDB settings.
>>>>
>>>> enable-keepalive=true
>>>> - Keepalive feature is disabled by default.
>>>>
>>>> keepalive-interval="5000"
>>>> - Timer interval in milliseconds for monitoring the packet
>>>> processing cores.
>>>>
>>>> When KA is enabled, 'ovs-keepalive' thread shall be spawned that
>>>> wakes up at regular intervals to update the timestamp and status of
>>>> pmd cores in keepalive info struct. This information shall be read
>>>> by vswitchd thread and write the status in to 'keepalive' column of
>>> Open_vSwitch table in OVSDB.
>>>>
>>>> An external monitoring framework like collectd with ovs events
>>>> support can read (or) subscribe to the datapath status changes in
>>>> ovsdb. When the state is updated, the collectd shall be notified and
>>>> will eventually relay the status to ceilometer service running in
>>>> the controller. Below is the high level overview of deployment model.
>>>>
>>>> Compute Node Controller Compute Node
>>>>
>>>> Collectd <----------> Ceilometer <--------> Collectd
>>>>
>>>> OvS DPDK OvS DPDK
>>>>
>>>> +-----+
>>>> | VM |
>>>> +--+--+
>>>> \---+---/
>>>> |
>>>> +--+---+ +------------+----------+ +------+-------+
>>>> | OVS |-----> | ovsevents plugin | --> | collectd |
>>>> +--+---+ +------------+----------+ +------+-------+
>>>>
>>>> +------+-----+ +---------------+------------+ |
>>>> | Ceilometer | <-- | collectd ceilometer plugin | <---
>>>> +------+-----+ +---------------+------------+
>>>>
>>>> github: The patches can be found here:
>>>> https://github.com/bbodired/ovs (Last master commit e7cd8c363)
>>>>
>>>> Performance impact:
>>>> No noticeable performance or latency impact is observed with
>>>> KA feature enabled.
>>>>
>>>> -------------------------------------------------
>>>> v2 -> v3
>>>> * Rebase.
>>>> * Verified with dpdk-stable-17.05.1 release.
>>>> * Fixed build issues with MSVC and cross checked with appveyor.
>>>>
>>>> v1 -> v2
>>>> * Rebase
>>>> * Drop 01/20 Patch "Consolidate process related APIs" of V1 as it
>>>> is already applied as separate patch.
>>>>
>>>> RFCv3 -> v1
>>>> * Made changes to fix failures in some unit test cases.
>>>> * some more code cleanup w.r.t process related APIs.
>>>>
>>>> RFCv2 -> RFCv3
>>>> * Remove POSIX shared memory block implementation (suggested by
>>> Aaron).
>>>> * Rework the logic to register and track threads instead of cores. This
>way
>>>> in the future any thread can be registered to KA framework. For
>>>> now only
>>> PMD
>>>> threads are tracked (suggested by Aaron).
>>>> * Refactor few APIs and further clean up the code.
>>>>
>>>> RFCv1 -> RFCv2
>>>> * Merged the xml and schema commits to later commit where the actual
>>>> implementation is done(suggested by Ben).
>>>> * Fix ovs-appctl keepalive/* hang issue when KA disabled.
>>>> * Fixed memory leaks with appctl commands for
>>>> keepalive/pmd-health-
>>> show,
>>>> pmd-xstats-show.
>>>> * Refactor code and fixed APIs dealing with PMD health monitoring.
>>>>
>>>>
>>>> Bhanuprakash Bodireddy (19):
>>>> dpdk: Add helper functions for DPDK datapath keepalive.
>>>> process: Avoid warnings compiling process.c
>>>> process: Add helper function to retrieve process status.
>>>> Keepalive: Add initial keepalive support.
>>>> keepalive: Add more helper functions to KA framework.
>>>> dpif-netdev: Add helper function to store datapath tids.
>>>> dpif-netdev: Register packet processing cores to KA framework.
>>>> dpif-netdev: Enable heartbeats for DPDK datapath.
>>>> keepalive: Retrieve PMD status periodically.
>>>> bridge: Update keepalive status in OVSDB
>>>> keepalive: Add support to query keepalive statistics.
>>>> keepalive: Add support to query keepalive status.
>>>> dpif-netdev: Add additional datapath health checks.
>>>> keepalive: Check the link status as part of PMD health checks.
>>>> keepalive: Check the packet statistics as part of PMD health checks.
>>>> keepalive: Check the PMD cycle stats as part of PMD health checks.
>>>> netdev-dpdk: Enable PMD health checks on heartbeat failure.
>>>> keepalive: Display extended Keepalive status.
>>>> Documentation: Update DPDK doc with Keepalive feature.
>>>>
>>>> Documentation/howto/dpdk.rst | 90 +++++
>>>> lib/automake.mk | 2 +
>>>> lib/dpdk-stub.c | 36 ++
>>>> lib/dpdk.c | 68 +++-
>>>> lib/dpdk.h | 13 +
>>>> lib/dpif-netdev.c | 203 +++++++++-
>>>> lib/dpif-netdev.h | 7 +
>>>> lib/keepalive.c | 922
>>> +++++++++++++++++++++++++++++++++++++++++++
>>>> lib/keepalive.h | 149 +++++++
>>>> lib/netdev-dpdk.c | 119 +++++-
>>>> lib/netdev-dpdk.h | 5 +
>>>> lib/process.c | 84 +++-
>>>> lib/process.h | 13 +
>>>> lib/util.c | 10 +
>>>> lib/util.h | 1 +
>>>> vswitchd/bridge.c | 34 ++
>>>> vswitchd/vswitch.ovsschema | 8 +-
>>>> vswitchd/vswitch.xml | 49 +++
>>>> 18 files changed, 1779 insertions(+), 34 deletions(-) create mode
>>>> 100644 lib/keepalive.c create mode 100644 lib/keepalive.h
>>>>
>>>> --
>>>> 2.4.11
>>
More information about the dev
mailing list