[ovs-dev] [PATCH] dpif-netdev: Forwarding optimization for direct output flows.
Ilya Maximets
i.maximets at samsung.com
Mon May 27 13:44:46 UTC 2019
On 24.05.2019 19:20, Ben Pfaff wrote:
> On Fri, May 24, 2019 at 03:18:50PM +0300, Ilya Maximets wrote:
>> There are some cases where users want to have simple forwarding or drop
>> rules for all packets received from particular port, i.e :
>>
>> "in_port=1,actions=2"
>> "in_port=1,actions=IN_PORT"
>> "in_port=1,actions=drop"
>>
>> There are also cases where complex OF flows could be simplified down
>> to simple forwarding/drop datapath flows.
>>
>> In theory, we don't need to parse packets at all to follow these flows.
>> "Direct output forwarding" optimization is intended to speed up above
>> cases.
>>
>> Design:
>>
>> Due to various implementation restrictions userspace datapath has
>> following flow fields always in exact match (i.e. it's required to
>> match at least these fields of a packet even if the OF rule doesn't
>> need that):
>>
>> - recirc_id
>> - in_port
>> - packet_type
>> - dl_type
>> - vlan_tci
>> - nw_frag (for ip packets)
>>
>> Not all of these fields are related to packet itself. We already
>> know the current 'recirc_id' and the 'in_port' before starting the
>> packet processing. It also seems safe to assume that we're working
>> with Ethernet packets. dpif-netdev sets exact match on 'vlan_tci'
>> to avoid issues with flow format conversion and we don't really need
>> to match with it until ofproto layer didn't ask us to.
>> So, for the simple forwarding OF rule we need to match only with
>> 'dl_type' and 'nw_frag'.
>>
>> 'in_port', 'dl_type' and 'nw_frag' could be combined in a single
>> 64bit integer that could be used as a hash in hash map.
>>
>> New per-PMD flow table 'direct_output_table' introduced to store
>> direct output flows only. 'dp_netdev_flow_add' adds flow to the
>> usual 'flow_table' and to 'direct_output_table' if the flow meets
>> following constraints:
>>
>> - 'recirc_id' in flow match is 0.
>> - Flow wildcards originally had wildcarded 'vlan_tci'.
>> - Flow has no actions (drop) or exactly one action equal to
>> OVS_ACTION_ATTR_OUTPUT.
>> - Flow wildcards contains only minimal set of non-wildcarded fields
>> (listed above).
>>
>> If the number of flows for current 'in_port' in regular 'flow_table'
>> equals number of flows for current 'in_port' in 'direct_output_table',
>> we may use direct output optimization, because all the flows we have
>> are direct output flows. This means that we only need to parse
>> 'dl_type' and 'nw_frag' to perform packet matching.
>> Now we making the unique flow mark from the 'in_port', 'dl_type' and
>> 'nw_frag' and looking for it in 'direct_output_table'.
>> On successful lookup we don't need to make full 'miniflow_extract()'.
>>
>> Unsuccessful lookup technically means that we have no sufficient flow
>> in datapath and upcall will be required. We may optimize this path
>> in the future by bypassing the EMC, SMC and dpcls lookups in this case.
>>
>> Performance improvement of this solution on a 'direct output' flows
>> should be comparable with partial HW offloading, because it parses same
>> packet fields and uses similar flow lookup scheme.
>> However, unlike partial HW offloading, it works for all port types
>> including virtual ones.
>>
>> Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
>> ---
>>
>> This patch was made as a point for "virtio-forwarder" discussion:
>> https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/358686.html
>> However, it might be very useful by itself for usual cases too.
>>
>> Testing is very welcome. I didn't run the performance tests on real
>> systems, so I don't know the real performance impact yet.
>
> The changes to parse_tcp_flags() look good to me. It may be worth
> updating the comments on that function to say that the output arguments
> are set only if the function returns nonzero.
>
> In general, I really support the idea that OVS is general-purpose but
> that we should optimize it for important use cases. That's basically
> what we did at Nicira/VMware for network virtualization: we wanted to
> make sure that OVS was as general as possible so that many people would
> use it and adopt it, while at the same time making sure that it
> performed well for NVP/NSX. I mean, we didn't want it to run slowly in
> other cases, obviously, but they weren't what we were internally
> benchmarking.
>
> In my view, this patch is entirely in line with that philosophy.
Thanks!
About comments, It's a good point. I'm going to add following description:
* 'dl_type_p' will be set only if 'packet' is an Ethernet packet.
* 'nw_frag_p' will be set only if 'packet' is an IP packet.
It's because I need these fields even if it's not an IP or TCP packet.
I also noticed that I didn't initialize the parameters, so I might
be using them uninitialized for non-IP packets. I'll fix that too.
Few more self-review observations:
1. No need to check if 'dp_netdev_direct_output_enabled' for each packet.
This could be done once for a batch.
2. I missed checking that packet_type equals to PT_ETH before adding to
'direct_output_table'.
So, I'll send v2 with the following incremental:
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b455bdc7b..993997c3b 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3435,8 +3435,10 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread *pmd,
dp_netdev_flow_hash(&flow->ufid));
ccmap_inc(&pmd->n_flows, odp_to_u32(in_port));
- if (vlan_tci_wc_faked && match->flow.recirc_id == 0 &&
- dp_netdev_flow_is_direct_output(&match->wc, actions, actions_len)) {
+ if (vlan_tci_wc_faked
+ && match->flow.recirc_id == 0
+ && match->flow.packet_type == htonl(PT_ETH)
+ && dp_netdev_flow_is_direct_output(&match->wc, actions, actions_len)) {
dp_netdev_direct_output_insert(pmd, flow);
}
@@ -6599,6 +6601,7 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
bool smc_enable_db;
size_t map_cnt = 0;
bool batch_enable = true;
+ bool direct_output_enabled = dp_netdev_direct_output_enabled(pmd, port_no);
atomic_read_relaxed(&pmd->dp->smc_enable_db, &smc_enable_db);
pmd_perf_update_counter(&pmd->perf_stats,
@@ -6631,9 +6634,9 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
}
}
- if (!flow && dp_netdev_direct_output_enabled(pmd, port_no)) {
- ovs_be16 dl_type;
- uint8_t nw_frag;
+ if (!flow && direct_output_enabled) {
+ ovs_be16 dl_type = 0;
+ uint8_t nw_frag = 0;
tcp_flags = parse_tcp_flags(packet, &dl_type, &nw_frag);
flow = dp_netdev_direct_output_lookup(pmd, port_no,
diff --git a/lib/flow.c b/lib/flow.c
index 9c4222a93..88c54a37b 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -1089,6 +1089,8 @@ parse_dl_type(const struct eth_header *data_, size_t size)
/* Parses and return the TCP flags in 'packet', converted to host byte order.
* If 'packet' is not an Ethernet packet embedding TCP, returns 0.
+ * 'dl_type_p' will be set only if 'packet' is an Ethernet packet.
+ * 'nw_frag_p' will be set only if 'packet' is an IP packet.
*
* The caller must ensure that 'packet' is at least ETH_HEADER_LEN bytes
* long.'*/
---
Best regards, Ilya Maximets.
More information about the dev
mailing list