[ovs-dev] [patch v5 00/11] Userspace datapath: Add fragmentation support.

Aaron Conole aconole at redhat.com
Mon Mar 5 16:16:15 UTC 2018


Darrell Ball <dlu998 at gmail.com> writes:

> Hi Aaron
>
> I totally missed this response - sorry!
>
> On Thu, Feb 22, 2018 at 10:24 AM, Aaron Conole <aconole at redhat.com> wrote:
>
>  Darrell Ball <dlu998 at gmail.com> writes:
>
>  > On Fri, Feb 9, 2018 at 1:35 PM, Aaron Conole <aconole at redhat.com> wrote:
>  >
>  >  Hi Darrell,
>  >
>  >  Darrell Ball <dlu998 at gmail.com> writes:
>  >
>  >  > Fragmentation support for userspace datapath conntrack is added; both
>  >  > v4 and v6 are supported. See the patches for additional details.
>  >
>  >  Very pumped about this!
>  >
>  >  I went to start reviewing this, but found out that 04/ didn't apply
>  >  correctly.  No problem, I'll proceed - just figured I'd let you know
>  >  that:
>  >
>  > The merge conflict in NEWS was due to 2 subsequent applies on Feb 6, while
>  > the series was posted on Feb 4.
>
>  Yep.  I've worked around it.
>
>  >    1. I'm looking at it :)
>  >
>  > Thanks Aaron
>
>  I have a question on patches 5-7:
>
>    Do you think it's worthwhile having these as configuration options in
>    the database?
>
>  I can see where it could be considered that they aren't appropriate
>  (after all, the linux fragmentation support is enabled/disabled not via
>  the netlink interface, but by the system administrator outside of the
>  ovs configuration).  On the other hand, from a user perspective it's
>  probably more friendly to have that configuration knob be persistent.
>
>  I think there is value in having the configuration for ipfrag be stored
>  in a database so that each 'restart' doesn't require an extra set of
>  commands be run.  I envision a situation where users have their system
>  configured, restart, and get confused because they have to manually
>  re-enable the configuration.  There are a few ways this could be done
>  (for instance, a post-start script that runs after ovs-ctl), but I think
>  the database might be a useful way of accomplishing that goal - it
>  exists 
>
>  and we use it for dpdk configurations, ex.
>
> For global and interface configuration, yes.
> For flow specific information, the database has very limited usage; notably, there is flow eviction
> policy
>  an ip prefix settings.
>
> I did not opt for using the database here for a few reasons:
>
> 1/ Scripts for starts and restarts will be the same and hence
> the possibility to forget for restart is pretty low;  database
> commands themselves are also run from scripts, for example for interfaces in general and various
> other stuff.

I'm not sure I follow this.  Users, at least in my experience, almost
never deal directly with ovs.  They go through some kind of extra
interface, like neutron/director/ovn/whatever and that deals with all
the headaches of setting up the datapaths, etc.  Usually, they want to
configure everything in the database and then be done with it.  That
lets them query / set using the same interface, and can work remotely
rather than needing permissions and local access.

I just imagine a situation where a customer has a system set up via
OSP/OCP/Kube and vswitchd restarts (for whatever reason... lets say a
crash or something).  When it comes back, the ipfrag flags are cleared.
The customers network may start misbehaving, with the cause unknown
(unless those mechanism start caring about vswitchd process restarts).

Maybe my concern is overblown?  Is there some other mechanism to catch
this case I missed?

> 2/ Using the database, introduces more complexity and dependencies for limited
> gain for something that is usually set once at startup and never changes after that.
>
> I was/am simply leaning towards not using the database here for reasons mentioned above,
> although I did not think it was a clear question one way or the other and I expected the question
> to be mentioned in review. 

I agree that the expectation for this config is 'set and be done'.

>  Something to think on.
>
>  I'm planning on running performance tests this coming week, and will
>  give my patch 3/11 comments then.
>
> Thanks, right now I have not optimized this fully although I identified some
> enhancements.  Fragmentation usage in itself is highly suboptimal and preferred usage is
> either none or a very limited percentage of total pps.

Agreed.  Just want to have a test for when it is turned on vs off.  My
setup got taken up for something else for the moment so I haven't been
able to measure, yet.  Sorry for the delay.

>  Thanks for working on this, Darrell!
>
>  > Darrell
>  >
>  >
>  >    2. it'll need at least one more spin (but no need to rush that since
>  >       I'll move myself past the error)
>  >
>  >  Thanks,
>  >  -Aaron
>  >
>  >  > v4->v5: Added a sub-feature to optionally dump fragmentation lists.
>  >  >         This is useful for DOS forensics and debugging.
>  >  >
>  >  >         The testing coverage was also extended including checking
>  >  >         more counters and frag list occupancies.
>  >  >
>  >  >         Fixed a few bugs:
>  >  >         1/ Handle dpdk mempool source restrictions for a batch of
>  >  >            packets from multiple sources; this also brings in a purge
>  >  >            frag list function to handle pathological cases of stuck frags.
>  >  >         2/ ipf_destroy was missing packet frees for frag lists.
>  >  >         3/ A setting of CS_INVALID was missing for expired packets -
>  >  >            I mentioned this earlier for version 4.
>  >  >
>  >  >         Some enhancements and coding standards changes for Patch 3.
>  >  >
>  >  > v3->v4: Add V6 support to the patches.
>  >  >         Fix possible race cleanup bug when the user disables
>  >  >            fragmentation and there are list occupancies, not cleaned up
>  >  >            yet.
>  >  >         Add missed orig tuple fields for copy from reassembled packet
>  >  >             to fragments.
>  >  >         Fix an fragment list increment check - shoiuld have been "> 0"
>  >  >             rather then "!= 0".
>  >  >         Fix max frags calculation in case of theoretical corner case.
>  >  >         Add proper lock annotations.
>  >  >         Made some other improvements while adding V6 support.
>  >  >
>  >  > v2->v3: Patch 2 was updated:
>  >  >         Remove "XXX" todo items by implementing the ones needed,
>  >  >         including realloc frag_list contexts to save memory.
>  >  >         Fix related bug with max_frag_list_size when min_frag_size is
>  >  >         reconfigured.
>  >  >
>  >  >         Tighten ip_tot_len sanity check for reassembled packets which
>  >  >         was more loose than intended.
>  >  >
>  >  >         Add another sanity check for fragment ip_tot_len; even though
>  >  >         it be redundant, add for completeness.
>  >  >
>  >  > v1->v2: Few fixes, improvements and cleanups.
>  >  >
>  >  > Darrell Ball (11):
>  >  >   dp-packet: Add const qualifiers for checksum apis.
>  >  >   flow: Enhance parse_ipv6_ext_hdrs.
>  >  >   Userspace datapath: Add fragmentation handling.
>  >  >   conntrack: Support fragmentation.
>  >  >   ipf: Add command to enable fragmentation handling.
>  >  >   ipf: Add set minimum fragment size command.
>  >  >   ipf: Add set maximum fragments supported command.
>  >  >   ipf: Add command to get fragmentation handling status.
>  >  >   ipf: Enhance ipf_get_status.
>  >  >   tests: Add missed local stack checks.
>  >  >   tests: Enable fragmentation for userspace datapath.
>  >  >
>  >  >  NEWS                             |   10 +
>  >  >  include/sparse/netinet/ip6.h     |    1 +
>  >  >  lib/automake.mk                  |    2 +
>  >  >  lib/conntrack.c                  |   10 +-
>  >  >  lib/ct-dpif.c                    |   69 ++
>  >  >  lib/ct-dpif.h                    |   13 +
>  >  >  lib/dp-packet.h                  |    4 +-
>  >  >  lib/dpctl.c                      |  216 ++++++
>  >  >  lib/dpctl.man                    |   32 +
>  >  >  lib/dpif-netdev.c                |   83 +++
>  >  >  lib/dpif-netlink.c               |    7 +
>  >  >  lib/dpif-provider.h              |   18 +
>  >  >  lib/flow.c                       |   23 +-
>  >  >  lib/flow.h                       |    3 +-
>  >  >  lib/ipf.c                        | 1390
>  ++++++++++++++++++++++++++++++++++++++
>  >  >  lib/ipf.h                        |   86 +++
>  >  >  tests/system-kmod-macros.at      |   30 +-
>  >  >  tests/system-traffic.at          |   45 +-
>  >  >  tests/system-userspace-macros.at |  125 +++-
>  >  >  19 files changed, 2129 insertions(+), 38 deletions(-)
>  >  >  create mode 100644 lib/ipf.c
>  >  >  create mode 100644 lib/ipf.h


More information about the dev mailing list