[ovs-dev] [RFC 00/11] python: introduce flow parsing library

Eelco Chaudron echaudro at redhat.com
Mon Nov 1 14:05:15 UTC 2021



On 27 Oct 2021, at 12:11, Adrian Moreno wrote:

> Thanks for your comments, Eelco.
>
> On 10/21/21 11:52, Eelco Chaudron wrote:
>>
>>
>> On 1 Sep 2021, at 12:00, Adrian Moreno wrote:
>>
>>> While troubleshooting or developing new features in OVS, a considerable
>>> amount of time is spent analyzing flows (whether that's Openflow flows
>>> or datapath flows). Currently, OVS has tools to dump flows with
>>> different levels of verbosity as well as to filter flows prior to
>>> dumping them, e.g: 'ovs-ofctl dump-flows', 'ovs-appctl
>>> dpctl/dump-flows', etc.
>>>
>>> The output of these commands is considered stable so it should be
>>> possible to write more layered tools that enable advanced flow analysis.
>>> However, the way flows are formatted into strings is not trivial to
>>> parse.
>>>
>>> This series proposes the introduction of a flow parsing library capable
>>> of parsing both Openflow and DPIF flows.
>>>
>>> The library is based on generic key-value and list parsers and a number
>>> of common decoders. Based on that, an Openflow flow parser and a DPIF
>>> flow parser are introduced by defining a way to decode each possible
>>> field and action they might contain.
>>>
>>> The library has the following features:
>>> - Parsed key-value pairs keep some metadata that refer to the original
>>>    strings they were extracted from. That way the flows can be printed
>>>    and formatted in flexible ways.
>>> - It includes a basic flow filtering mechanism. A filter can be defined
>>>    combining logical (||, &&, !), relational (<, >, =) or mask (~=)
>>>    operations.
>>> - It supports IPAddress and Ethernet masking (based on netaddr); see the sketch after this list.
>>> - The decoder to use for each key (match or action) is set explicitly to
>>>    avoid expensive runtime type-guessing.
>>> - The decoders to use for Openflow fields are automatically generated
>>>    from meta-flow.h
>>> - Additional dependencies:
>>>    - netaddr: For IP and Ethernet Address management
>>>    - pyparsing: For filtering syntax
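>>>
>>> As a rough illustration of the masking semantics behind '~=' (the helper
>>> names below are made up for illustration, they are not part of the
>>> library), the checks essentially reduce to plain netaddr operations:
>>>
>>> import netaddr
>>>
>>> def ip_mask_match(addr, spec):
>>>     """True if 'addr' falls within 'spec', e.g. '10.128.0.2/24'."""
>>>     return netaddr.IPAddress(addr) in netaddr.IPNetwork(spec)
>>>
>>> def eth_mask_match(addr, value, mask):
>>>     """True if 'addr' equals 'value' under 'mask' (all MAC strings)."""
>>>     m = int(netaddr.EUI(mask))
>>>     return int(netaddr.EUI(addr)) & m == int(netaddr.EUI(value)) & m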
>>>
>>> As a proof of concept of how this library might be used, the series'
>>> last patch includes a user program called "ofparse" that prints
>>> openflow and datapath flows in different formats, including colorized
>>> output (using the "rich" [3] library), json format and a "logic"
>>> representation of the flow tables.
>>>
>>> The idea behind the logical view of the flow tables comes from
>>> Flavio Leitner <fbl at sysclose.org>.
>>>
>>> My original thought was to only add the utility as a proof of usage
>>> in the RFC, not as part of the final patch. However, if having such a
>>> tool in the ovs tree is considered useful, I can clean up the code,
>>> split it into smaller commits and post it in a separate patch set.
>>> More info on the tool is available via 'ofparse --help' and on the
>>> online documentation [4].
>>>
>>> ofparse usage
>>> -------------
>>>
>>> $ make
>>> $ cd python; python -m venv venv; . ./venv/bin/activate; pip install .
>>> $ ovs-ofctl dump-flows br-int | ofparse openflow logic
>>> $ ovs-appctl dpctl/dump-flows -m | ofparse datapath logic
>>>
>>> Library usage
>>> -------------
>>>>>> from ovs.flows.ofp import OFPFlow
>>>>>> flow = OFPFlow.from_string("cookie=0x2b32ab4d, table=41, n_packets=11, n_bytes=462, priority=33,ip,reg15=0x2/0x2,nw_src=10.128.0.2/24 actions=move:NXM_OF_TCP_DST[]->NXM_NX_XXREG0[32..47],ct(table=16,zone=NXM_NX_REG13[0..15],nat)")
>>>>>> flow.info
>>> {'cookie': 724740941, 'table': 41, 'n_packets': 11, 'n_bytes': 462}
>>>>>> flow.match
>>> {'priority': 33, 'ip': True, 'reg15': Mask32('0x2/0x2'), 'nw_src': IPMask('10.128.0.2/24')}
>>>>>> flow.actions
>>> [{'move': {'src': {'field': 'NXM_OF_TCP_DST'}, 'dst': {'field': 'NXM_NX_XXREG0', 'start': 32, 'end': 47}}}, {'ct': {'table': 16, 'zone': {'field': 'NXM_NX_REG13', 'start': 0, 'end': 15}, 'nat': True}}]
>>>>>> from ovs.flows.filter import OFFilter
>>>>>> filt = OFFilter("nw_src ~= 10.128.0.10 and (table = 42 or n_packets > 0)")
>>>>>> filt.evaluate(flow)
>>> True
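>>>
>>> As an additional illustrative sketch (not part of the series), building
>>> only on the API shown above, a full 'ovs-ofctl dump-flows' dump read from
>>> stdin could be filtered roughly like this (header handling is simplified):
>>>
>>> import sys
>>> from ovs.flows.ofp import OFPFlow
>>> from ovs.flows.filter import OFFilter
>>>
>>> filt = OFFilter("nw_src ~= 10.128.0.10 and (table = 42 or n_packets > 0)")
>>> for line in sys.stdin:
>>>     line = line.strip()
>>>     if "actions=" not in line:
>>>         continue  # skip the dump-flows reply header and blank lines
>>>     flow = OFPFlow.from_string(line)
>>>     if filt.evaluate(flow):
>>>         print(line)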
>>>
>>> Apart from the overall idea and approach, I would like to get feedback
>>> from the community on a number of aspects I'm not very clear about:
>>> - Apart from the openflow fields information, is there any other part of
>>>    the decoding logic that can be automatically generated from ovs
>>>    headers to improve maintainability?
>>> - Is there interest in adding specific utilities (such as ofparse) to the ovs tree?
>>> - While writing the library, I've written some unit tests. What do you
>>>    think about adding python unit-tests?
>>
>> Hi Adrian,
>>
>> Thanks for sending this RFC! I do think this would be a nice addition to the OVS repository, including the ofparse utility (maybe it needs to be called ovs-flowparse ;)
>
> Agree!
>
>>
>> I did not do a real review, just a conceptual walk through the code (also knowing more updated code is in your private GitHub, https://github.com/amorenoz/ovs-dbg).
>>
>> Let me answer your questions:
>>
>>> - Apart from the openflow fields information, is there any other part of the decoding logic that can be automatically generated from ovs headers to improve maintainability?
>>
>> I do like the OpenFlow field extractor auto-generation, as it automatically keeps the Python parser in sync. I feel we must do the same for the datapath flows and actions, or else we will fall out of sync whenever a new datapath action is added. Even most patches adding datapath actions fail to get this right the first time (e.g. a missing decode, or missing parsing support when the action is added via dpctl/add-flow).
>>
>> Take a look at the self-tests for matching and actions in tests/odp.at. They use the test-odp.c utility to parse the fields, which in turn uses the odp-util.c functions to parse back and forth. Maybe this C file, with some extra code, could be compiled into a Python C wrapper?
>>
>
> Thanks for the pointer.
> Looking at test-odp.c/odp-util.c, what I see is code to translate strings to and from a buffer of consecutive netlink attributes.
> But the logic of how to interpret the payload of each attribute is "hidden" inside format_odp_action() and parse_odp_action__().
>
> That is essentially the logic encoded in python/ovs/flows/odp.py in this RFC (patch [RFC 08/11] python: add ovs datapath flow parsing). This allows, for instance, creating netaddr.IPAddress objects where appropriate, which in turn enables rich filtering and formatting.
>
> So, adding an intermediate format (a netlink buffer) might help reduce some of the more error-prone string parsing code (the KVParser in the present RFC), but it would not remove the need for a payload interpreter (i.e. a "big switch-case"). So if the format of a payload is modified, we would still need to modify the python library.
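>
> To make the "payload interpreter" point concrete, the kind of per-key
> dispatch I mean looks roughly like this (a simplified sketch with made-up
> keys and decoders, not the actual code in odp.py):
>
> import netaddr
>
> # Hypothetical per-key decoder table: each datapath key or action payload
> # still needs explicit knowledge of how to turn its string value into an
> # object, regardless of whether it arrives as a string or a netlink buffer.
> DECODERS = {
>     "in_port": int,
>     "recirc_id": lambda v: int(v, 0),   # accepts decimal or 0x... values
>     "src": netaddr.IPAddress,           # e.g. inside ipv4(src=...,dst=...)
>     "dst": netaddr.IPAddress,
>     "eth_src": netaddr.EUI,
> }
>
> def decode_kv(key, value):
>     """Decode one key/value pair; fall back to the raw string."""
>     return DECODERS.get(key, str)(value)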
>
> BTW, the same applies to OpenFlow actions IIUC.

I guess the problem I'm trying to point out is not the actual parsing of the format, but being able to stay in sync with OVS to support newer actions/matches. For OpenFlow, this is covered by the autogeneration of the parser code.

> Having said that, what we can do is add a wrapper around test-odp.c so each time we test an odp action or match we also test it using the python library. That way we'll significantly reduce the risk of going out of sync (assuming self-tests are added when actions or matches are modified/added).

The problem is that people do not always update the test cases, and such omissions sometimes slip through the review process. Maybe we can add some form of change detection for the specific functions in odp-util.c? Not sure how the maintainers feel about this, but we could store a hash of those functions and, if they change, emit a warning/error. This way, people get prompted to update the python-related functions (and maybe also the test cases).
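
Something along these lines, perhaps (a rough sketch; the tracked function list comes from the functions mentioned above, and the stored hashes are placeholders):

import hashlib
import re
import sys

# Functions in lib/odp-util.c whose behavior the python library mirrors.
# The hashes below are placeholders; they would be regenerated whenever the
# python parser is updated to match.
TRACKED = {
    "format_odp_action": "0000000000000000",
    "parse_odp_action__": "0000000000000000",
}

def function_hash(source, name):
    """Hash everything from the function's definition (name at column 0, OVS
    coding style) up to the first closing brace at column 0.  Crude, but
    enough to notice that the body changed."""
    match = re.search(r"^%s\(.*?\n\}\n" % re.escape(name),
                      source, re.S | re.M)
    if not match:
        return None
    return hashlib.sha256(match.group(0).encode()).hexdigest()[:16]

def main():
    source = open("lib/odp-util.c").read()
    for name, expected in TRACKED.items():
        if function_hash(source, name) != expected:
            print("warning: %s changed, update python/ovs/flows/odp.py"
                  % name, file=sys.stderr)

if __name__ == "__main__":
    main()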

>>> - Is there interest in adding specific utilities (such as ofparse) to the ovs tree?
>>
>> Yes, please!
>>
>
> OK. In that case, and unless you suggest otherwise, I will continue working on the parsing library in this RFC and send a new series with the utility.

Sounds like a plan!

>>> - While writing the library, I've written some unit tests. What do you think about adding python unit-tests?
>>
>> I do not see why this could not be added. We already run flake8 during the make process; maybe we could also run the simple unit tests as part of the build.
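>>
>> For example, a minimal unittest along these lines (a sketch based on the
>> cover letter's example, not taken from the series) could be run from the
>> build next to flake8:
>>
>> import unittest
>> from ovs.flows.ofp import OFPFlow
>>
>> class TestOFPFlow(unittest.TestCase):
>>     def test_from_string(self):
>>         flow = OFPFlow.from_string(
>>             "cookie=0x2b32ab4d, table=41, n_packets=11, n_bytes=462,"
>>             " priority=33,ip,nw_src=10.128.0.2/24 actions=drop")
>>         self.assertEqual(flow.info["table"], 41)
>>         self.assertEqual(flow.info["n_packets"], 11)
>>         self.assertTrue(flow.match["ip"])
>>
>> if __name__ == "__main__":
>>     unittest.main()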
>>
>
> OK.
>
>>
>>
>> Cheers,
>>
>> Eelco
>>
>> <SNIP>
>>
>
> Thanks
> -- 
> Adrián Moreno


