[ovs-dev] [PATCH ovn v2] ovn-controller: Avoid infinite replying for TCP/ICMP connection reset messages

Mohammad Heib mheib at redhat.com
Tue Oct 12 11:58:14 UTC 2021


Hi Mark,

On 10/11/21 8:03 PM, Mark Gray wrote:
> On 11/10/2021 10:01, mheib at redhat.com wrote:
>> From: Mohammad Heib <mheib at redhat.com>
>>
>> When the ovn controller receives an ip packet that targets a lport that has ACL
>> rule to reject ip packets, the controller will reply with TCP_RST or icmp4/6 unreachable packet
>> to notify the sender that the destination is not available.
>>
>> In turn, the receiver host will receive the notification packet and handle it as a normal IP packet
>> and if the receiver host is part of the same logical-switch/port-group or has IP reject ACL rule
>> it will send TCP_RST or icmp4/6 unreachable packet replying to the TCP_RST or icmp4/6 unreachable
>> packet we received and here we will enter to an infinity loop of replying about replying which
>> will consume high CPU.
>>
>> To avoid such scenarios this patch proposes to drop/ignore TCP_RST or icmp4/6 unreachable packets
>> that received on lport that has  IP reject ACL rules.
>>
>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1934011
>> Fixes: 64f8c9e9f ("actions: Add a new OVN action - reject {}.")
>> Signed-off-by: Mohammad Heib <mheib at redhat.com>
>> ---
>>   controller/pinctrl.c | 29 +++++++++++++++
>>   tests/ovn-northd.at  | 85 ++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 114 insertions(+)
>>
>> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
>> index cc3edaaf4..eff599d2b 100644
>> --- a/controller/pinctrl.c
>> +++ b/controller/pinctrl.c
>> @@ -1934,11 +1934,40 @@ pinctrl_handle_sctp_abort(struct rconn *swconn, const struct flow *ip_flow,
>>       dp_packet_uninit(&packet);
>>   }
>>   
>> +static bool
>> +pinctrl_handle_reject_ignore_pkt(const struct flow *ip_flow,
>> +                                 struct dp_packet *pkt_in)
>> +{
>> +    if (ip_flow->nw_proto == IPPROTO_TCP) {
>> +        struct tcp_header *th = dp_packet_l4(pkt_in);
>> +        if (!th || (TCP_FLAGS(th->tcp_ctl) & TCP_RST)) {
>> +            return true;
>> +        }
>> +    } else {
>> +        if (is_icmpv4(ip_flow, NULL)) {
>> +            struct icmp_header *ih = dp_packet_l4(pkt_in);
>> +            if (!ih || (ih->icmp_type == ICMP4_DST_UNREACH)) {
>> +                return true;
>> +            }
>> +        } else if (is_icmpv6(ip_flow, NULL)) {
>> +            struct icmp6_data_header *ih = dp_packet_l4(pkt_in);
>> +            if (!ih || (ih->icmp6_base.icmp6_type == ICMP6_DST_UNREACH)) {
>> +                return true;
>> +            }
>> +        }
>> +    }
>> +    return false;
>> +}
>> +
>>   static void
>>   pinctrl_handle_reject(struct rconn *swconn, const struct flow *ip_flow,
>>                         struct dp_packet *pkt_in,
>>                         const struct match *md, struct ofpbuf *userdata)
>>   {
>> +    if (pinctrl_handle_reject_ignore_pkt(ip_flow, pkt_in)) {
>> +        return;
>> +    }
>> +
>>       if (ip_flow->nw_proto == IPPROTO_TCP) {
>>           pinctrl_handle_tcp_reset(swconn, ip_flow, pkt_in, md, userdata, true);
>>       } else if (ip_flow->nw_proto == IPPROTO_SCTP) {
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 8b9049899..d06065a86 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -2062,6 +2062,91 @@ sw1flows3:  table=4 (ls_out_acl         ), priority=2003 , match=((reg0[[9]] ==
>>   AT_CLEANUP
>>   ])
>>   
>> +OVN_FOR_EACH_NORTHD([
>> +AT_SETUP([ACL Reject ping pong])
>> +AT_KEYWORDS([ACL Reject ping pong])
> :D .. nice name
>
> Thanks for adding this. Some comments below.
Thank you :)
>
>> +ovn_start
>> +
>> +send_icmp_packet() {
>> +    local inport=$1 hv=$2 eth_src=$3 eth_dst=$4 ipv4_src=$5 ipv4_dst=$6 ip_chksum=$7 data=$8
>> +    shift 8
>> +
>> +    local ip_ttl=ff
>> +    local ip_len=001c
>> +    local packet=${eth_dst}${eth_src}08004500${ip_len}00004000${ip_ttl}01${ip_chksum}${ipv4_src}${ipv4_dst}${data}
>> +    as hv$hv ovs-appctl netdev-dummy/receive hv${hv}-vif$inport $packet
>> +}
>> +
>> +net_add n1
>> +
>> +sim_add hv1
>> +as hv1
>> +ovs-vsctl add-br br-phys
>> +ovn_attach n1 br-phys 192.168.0.1
>> +ovs-vsctl -- add-port br-int hv1-vif1 -- \
>> +    set interface hv1-vif1 external-ids:iface-id=sw0-p1 \
>> +    options:tx_pcap=hv1/vif1-tx.pcap \
>> +    options:rxq_pcap=hv1/vif1-rx.pcap \
>> +    ofport-request=1
>> +
>> +sim_add hv2
>> +as hv2
>> +ovs-vsctl add-br br-phys
>> +ovn_attach n1 br-phys 192.168.0.2
>> +ovs-vsctl -- add-port br-int hv2-vif1 -- \
>> +    set interface hv2-vif1 external-ids:iface-id=sw0-p2 \
>> +    options:tx_pcap=hv2/vif1-tx.pcap \
>> +    options:rxq_pcap=hv2/vif1-rx.pcap \
>> +    ofport-request=1
>> +
>> +ovn-nbctl ls-add sw0
>> +
>> +check ovn-nbctl lsp-add sw0 sw0-p1
>> +check ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3"
>> +check ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3"
>> +
>> +check ovn-nbctl lsp-add sw0 sw0-p2
>> +check ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4"
>> +check ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4"
>> +
>> +check ovn-nbctl acl-add sw0 to-lport 1002 ip reject
>> +
>> +OVN_POPULATE_ARP
>> +
>> +wait_for_ports_up
>> +ovn-nbctl --wait=hv sync
>> +
>> +ovn-sbctl dump-flows sw0 > sw0-flows
>> +AT_CAPTURE_FILE([sw0-flows])
>> +
>> +AT_CHECK([grep -E 'ls_(in|out)_acl' sw0-flows |grep reject| sed 's/table=../table=??/' | sort], [0], [dnl
>> +  table=??(ls_out_acl         ), priority=2002 , match=(ip), action=(reg0 = 0; reject { /* eth.dst <-> eth.src; ip.dst <-> ip.src; is implicit. */ outport <-> inport; next(pipeline=ingress,table=22); };)
>> +])
>> +
>> +
>> +eth_src=505400000003
>> +eth_dst=505400000004
>> +ip_src=$(ip_to_hex 10 0 0 3)
>> +ip_dst=$(ip_to_hex 10 0 0 4)
>> +send_icmp_packet 1 1 $eth_src $eth_dst $ip_src $ip_dst c4c9 0000000000000000000000
>> +sleep 10
> Its a shame we have to sleep but there probably isn't a way around this.
> Could it be reduced to 5 or 2 seconds safely in order to minimize the
> additional time added to a run of the test suite?

we can decrease it to 2 seconds and still receive so many packets.

>
>> +
>> +# Get total number of packets that received on ovs
>> +
>> +# sender side
>> +flow=$(as hv1 ovs-ofctl dump-flows br-int table=44 | grep priority=2002|grep ip,metadata=0x1)
> Could you also add ip6?
yes.
>
>> +n_pkts="$(echo $flow|awk -F',' '{ print $4 }'|awk -F'=' '{ print $2 }')"
>> +check test $n_pkts -lt 30
>> +
> Rather than count the number of packets matched by the flow, could we
> check that we only receive 1 (?) tcp rst or unreachable packet using
> something like `OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [expected])`?
i'm not sure if that is possible i tried it before but seems that the  
tcp rst or unreachable packets

are not received on the hv1/vif1-tx/rx.pcap and i guess that because we 
replying from the ovn controller.

so i decided to check the ovs flows counters.


>
>> +# reciver side
> typo: receiver

fixed.

thanks

>> +flow=$(as hv2 ovs-ofctl dump-flows br-int table=44 | grep priority=2002|grep ip,metadata=0x1)
>> +n_pkts="$(echo $flow|awk -F',' '{ print $4 }'|awk -F'=' '{ print $2 }')"
>> +check test $n_pkts -lt 30
>> +
>> +OVN_CLEANUP([hv1], [hv2])
>> +AT_CLEANUP
>> +])
>> +
>>   OVN_FOR_EACH_NORTHD([
>>   AT_SETUP([ACL fair Meters])
>>   AT_KEYWORDS([acl log meter fair])
>>


More information about the dev mailing list