[ovs-dev] [PATCH v6 7/8] Documentation: Update DPDK doc with Keepalive feature.
Aaron Conole
aconole at redhat.com
Wed Jan 24 20:36:29 UTC 2018
Hi Bhanu,
This time around I just reviewed the documentation. I'll spend some
time on the other patches as I can, but I think at least the
documentation needs more work.
Comments inline.
Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com> writes:
> Keepalive feature is aimed at achieving Fastpath Service Assurance
> in OVS-DPDK deployments. It adds support for monitoring the packet
> processing threads by dispatching heartbeats at regular intervals.
This belongs in the document.
> The implementation uses OvSDB for reporting the health of the PMD threads.
s/implementation/keepalive framework/
> Any external monitoring application can query the OvSDB for status
> at regular intervals (or) subscribe to OvSDB updates.
^ for
>
> keepalive feature can be enabled through below OVSDB settings.
^ The
>
> enable-keepalive=true
> - Keepalive feature is disabled by default and should be enabled
> at startup before ovs-vswitchd daemon is started.
>
> keepalive-interval="5000"
> - Timer interval in milliseconds for monitoring the packet
> processing cores.
>
> When KA is enabled, 'ovs-keepalive' thread shall be spawned that wakes
> up at regular intervals to update the timestamp and status of pmd
> threads in process map. This information shall be read by vswitchd thread
s/shall/will/
> and written in to 'keepalive' column of Open_vSwitch table in OVSDB.
>
> An external monitoring framework like collectd with ovs events support
s/like collectd/(such as collectd)/
> can read (or) subscribe to the datapath status changes in ovsdb. When the state
s/(or)/or/
Everything after 'When the state is updated' is specific to
collectd/ceilometer. Should it go into a separate document?
> is updated, the collectd shall be notified and will eventually relay the status
> to ceilometer service running in the controller. Below is the high level
> overview of deployment model.
>
> Compute Node Controller Compute Node
>
> Collectd <----------> Ceilometer <--------> Collectd
>
> OvS DPDK OvS DPDK
>
> +-----+
> | VM |
> +--+--+
> \---+---/
> |
> +--+---+ +------------+----------+ +------+-------+
> | OVS |-----> | ovsevents plugin | --> | collectd |
> +--+---+ +------------+----------+ +------+-------+
>
> +------+-----+ +---------------+------------+ |
> | Ceilometer | <-- | collectd ceilometer plugin | <---
> +------+-----+ +---------------+------------+
>
> Performance impact:
> No noticeable performance or latency impact is observed with
> KA feature enabled.
>
> Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy at intel.com>
> ---
A good chunk of this commit message should be in the documentation.
Either as a duplicate, or just moved there. While commit messages are
meant to describe the 'why' of a change, this commit merely adds
documentation for the keepalive feature. The actual description of the
feature should live on in the documentation, so that when a user is
trying to figure out what it is, they can read it.
I hope this feature isn't exclusively for ceilometer / collectd.
> Documentation/howto/dpdk.rst | 112 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 112 insertions(+)
Is the dpdk documentation the correct place for this? Will this
framework ever be applied to more than just PMDs? Isn't that a goal of
monitoring the health of various tasks?
> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
> index d123819..e7a2b27 100644
> --- a/Documentation/howto/dpdk.rst
> +++ b/Documentation/howto/dpdk.rst
> @@ -439,6 +439,118 @@ For certain traffic profiles with many parallel flows, it's recommended to set
>
> For more information on the EMC refer to :doc:`/intro/install/dpdk` .
>
> +.. _dpdk_keepalive:
> +
> +Keepalive
> +---------
> +
This needs a bit more description. I read this section, and I don't
know what the Keepalive(KA) feature is - is it merely a reporting
framework for the last time threads 'checked in'? Does it take any
actions? What is a user meant to do with this data? Are there cases
where this framework gives false positives / negatives? Downsides to
enabling it? Alternatives?
Some of that is explained in the commit message - but it really belongs
here.
> +OvS Keepalive(KA) feature is disabled by default. To enable KA feature::
> +
> + $ ovs-vsctl --no-wait set Open_vSwitch . other_config:enable-keepalive=true
> +
> +The KA feature can't be enabled at run time and should be done at startup
> +before ovs-vswitchd daemon is started.
> +
> +The default timer interval for monitoring packet processing threads is 1000ms.
> +To set a different timer value, run::
> +
> + $ ovs-vsctl --no-wait set Open_vSwitch . \
> + other_config:keepalive-interval="5000"
> +
Can this be changed at run-time? Such information should be included.
> +The events comprise of thread states and the last seen timestamps. The events
> +are written in to process map periodically by keepalive thread.
> +
> +The events in the process map are retrieved by main(vswitchd) thread and
> +updated in to keepalive column of Open_vSwitch table in OVSDB. Any external
> +monitoring application can read the status from OVSDB at intervals or subscribe
> +to the updates so that they get notified when the changes happen on OvSDB.
> +
> +To monitor the datapath status using ovsdb-client, run::
Will this monitor revalidator threads? RCU related threads? Is it only
ever datapath status? Maybe should read 'To monitor the keepalive status'.
> + $ ovsdb-client monitor Open_vSwitch
> + $ ovsdb-client monitor Open_vSwitch Open_vSwitch keepalive
> +
> +The datapath thread states are explained below::
s/datapath/keepalive/
> +
> + KA_STATE_UNUSED - Not registered to KA framework.
Maybe "KA_STATE_NONE"? Or better yet - why should we have an enum for
this? After all... the thread isn't tracked?
> + KA_STATE_ALIVE - Thread alive.
--
> + KA_STATE_MISSING - Thread missed first heartbeat.
> + KA_STATE_DEAD - Thread missed two heartbeats.
> + KA_STATE_GONE - Thread missed two or more heartbeats and burried.
--
Is it useful to distinguish between 'missed one', 'missed two', and
'missed two or more?' Further, what does 'burried' mean? I hope we
don't have to dig a thread out from the under a vast field of
permafrost.
Maybe it's better to have a state for MISSED, and show the missed
counter.
> + KA_STATE_SLEEP - Thread is sleeping.
> +
> +To query the datapath status, run::
> +
> + $ ovs-appctl keepalive/pmd-health-show
> +
The following section seems to be more about installation and
configuration of collectd. Perhaps that should be split into a
different document?
> +`collectd <https://collectd.org/>`__ has built-in support for DPDK and provides
> +a `ovs_events` and `ovs_stats` plugin that can be enabled to relay the datapath
> +status and the PMD status to OpenStack service `Ceilometer
> +<https://docs.openstack.org/developer/ceilometer/>`__.
> +
> +To install and configure `collectd`, run::
> +
> + # Clone collectd from Git repository
> + $ git clone https://github.com/collectd/collectd.git
> +
> + # configure and install collectd
> + $ cd collectd
> + $ ./build.sh
> + $ ./configure --enable-syslog --enable-logfile --with-libdpdk=/usr
> + $ make
> + $ make install
> +
> +`collectd` is installed in ``/opt/collectd`` by default. Edit the configuration
> +file in ``/opt/collectd/etc/collectd.conf`` to enable logfile, dpdkevents
> +and csv plugin::
> +
> + LoadPlugin logfile
> + <Plugin logfile>
> + LogLevel debug
> + File "/var/log/collectd/collectd.log"
> + Timestamp true
> + PrintSeverity false
> + </Plugin>
> +
> + <Plugin syslog>
> + LogLevel info
> + </Plugin>
> +
> +Enable `ovs_events` plugin and update the plugindetails as below::
> +
> + LoadPlugin ovs_events
> +
> + <Plugin ovs_events>
> + Port "6640"
> + Address "127.0.0.1"
> + Socket "/usr/local/var/run/openvswitch/db.sock"
> + SendNotification true
> + DispatchValues false
> + </Plugin>
> +
> +Enable `ovs_stats` plugin and update the plugindetails as below::
> +
> + LoadPlugin ovs_stats
> +
> + <Plugin ovs_stats>
> + Port "6640"
> + Address "127.0.0.1"
> + Socket "/usr/local/var/run/openvswitch/db.sock"
> + Bridges "br0"
> + </Plugin>
> +
> +Enable ``csv`` plugin as below::
> +
> + LoadPlugin csv
> +
> + <Plugin csv>
> + DataDir "/var/log/collectd/csv"
> + StoreRates false
> + </Plugin>
> +
> +With csv plugin enabled, *meter* or *gauge* file is created and timestamp
> +and thread status gets updated which are sent to ceilometer service.
> +
.. _dpdk-ovs-in-guest:
OVS with DPDK Inside VMs
--
2.4.11
More information about the dev
mailing list