[ovs-dev] [PATCH v1] conntrack: Add state and sequence validation

Paolo Valerio pvalerio at redhat.com
Tue Jun 22 16:45:31 UTC 2021


Hi Aaron,

Aaron Conole <aconole at redhat.com> writes:

> During testing, there was an edge condition that was found during
> packet pickup where userspace can improperly advance the TCP state
> machine during connection exstablishment and bypass the 3whs.  This
> can pollute the TCP sequence windows.
>
> Add a fix to ensure that we move the state machine when we see the
> appropriate flags, and include a test to show the error condition.
>
> Signed-off-by: Aaron Conole <aconole at redhat.com>
> ---
> NOTE: I haven't done as much testing for 'learn existing connections' as
>       I would want.  I expect that I may have missed a case there, and
>       hope to include a test for it when I work on the tcp_loose mode
>       support in the userspace conntrack
>
>  lib/conntrack-tcp.c     |  7 +++--
>  tests/system-traffic.at | 62 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 67 insertions(+), 2 deletions(-)
>

I can see the value of the patch. Thanks for working on this.

I suppose this is due to the handling of both 3whs and the pick up of
established connections, and the way it is done.

> diff --git a/lib/conntrack-tcp.c b/lib/conntrack-tcp.c
> index 8a7c98cc45..1f6cc4368d 100644
> --- a/lib/conntrack-tcp.c
> +++ b/lib/conntrack-tcp.c
> @@ -224,6 +224,7 @@ tcp_conn_update(struct conntrack *ct, struct conn *conn_,
>  
>          end = seq + p_len;
>          if (tcp_flags & TCP_SYN) {
> +            src->state = CT_DPIF_TCPS_SYN_SENT; /* SYN_SENT by src */
>              end++;
>              if (dst->wscale & CT_WSCALE_FLAG) {
>                  src->wscale = tcp_get_wscale(tcp);
> @@ -245,7 +246,6 @@ tcp_conn_update(struct conntrack *ct, struct conn *conn_,
>          }
>  
>          src->seqlo = seq;
> -        src->state = CT_DPIF_TCPS_SYN_SENT;
>          /*

I think, this implies, for the reverse direction, that it gets its state
advancement when the SYN is included.
It is perfectly ok for the 3whs, but for established connections the
state doesn't move as we expect.

As a side note, I further tested it against "ofproto-dpif - conntrack -
tcp pick up" in tests/ofproto-dpif.at and it passes. After digging a
little though, it turns out that the state transitions for each packet
are respectively:

tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=SYN_SENT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=SYN_SENT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=SYN_SENT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=SYN_SENT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=SYN_SENT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=SYN_SENT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=CLOSING)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=TIME_WAIT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=TIME_WAIT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=TIME_WAIT)
tcp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),protoinfo=(state=TIME_WAIT)

It seems we do need a test that checks this feature from more
perspective, so thanks for working on this as well.

>           * May need to slide the window (seqhi may have been set by
>           * the crappy stack check or if we picked up the connection
> @@ -329,7 +329,10 @@ tcp_conn_update(struct conntrack *ct, struct conn *conn_,
>          }
>          if (tcp_flags & TCP_ACK) {
>              if (dst->state == CT_DPIF_TCPS_SYN_SENT) {
> -                dst->state = CT_DPIF_TCPS_ESTABLISHED;
> +                if (src->state == CT_DPIF_TCPS_SYN_SENT) {
> +                    /* Move to EST only once SRC things this is okay */
> +                    dst->state = CT_DPIF_TCPS_ESTABLISHED;
> +                }
>              } else if (dst->state == CT_DPIF_TCPS_CLOSING) {
>                  dst->state = CT_DPIF_TCPS_FIN_WAIT_2;
>              }
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index c73bbc420f..4e849085bb 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -6025,6 +6025,68 @@ AT_CHECK([ovs-ofctl dump-flows br0 | grep table=2, | OFPROTO_CLEAR_DURATION_IDLE
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
>  
> +AT_SETUP([conntrack - Out of order TCP state transition])
> +dnl This can happen due to buggy TCP implementations that reuse ephemeral
> +dnl ports - this test will check that some invalid parameters don't advance
> +dnl the state machine
> +CHECK_CONNTRACK()
> +OVS_TRAFFIC_VSWITCHD_START()
> +OVS_CHECK_CT_CLEAR()
> +
> +ADD_NAMESPACES(at_ns0, at_ns1)
> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "f0:00:00:01:01:01")
> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24", "f0:00:00:01:01:02")
> +
> +dnl setup ct flows
> +AT_DATA([flows.txt], [dnl
> +table=0,priority=10  arp action=normal
> +table=0,priority=10  ip,tcp,ct_state=-trk action=ct(table=1)
> +table=0,priority=1   action=drop
> +dnl dst ns2
> +table=1,priority=20  ip,ct_state=+new+trk,nw_dst=10.1.1.2 action=ct(commit),output:ovs-p1
> +table=1,priority=20  ip,ct_state=+est+trk,nw_dst=10.1.1.2 action=output:ovs-p1
> +dnl dst ns1
> +table=1,priority=10  ip,ct_state=+trk+est,nw_dst=10.1.1.1 action=output:ovs-p0
> +table=1,priority=10  ip,ct_state=+trk+new,nw_dst=10.1.1.1 action=output:ovs-p0
> +table=1,priority=1   ip,ct_state=+trk+inv action=drop
> +])
> +
> +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
> +
> +dnl kill tcp packets - this will suppress RST/RST-ACK messages
> +NS_CHECK_EXEC([at_ns0], [iptables -I INPUT 1 -p tcp --sport 6667 -j DROP])
> +NS_CHECK_EXEC([at_ns1], [iptables -I INPUT 1 -p tcp --dport 6667 -j DROP])
> +
> +dnl Send TCP SYN
> +NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 f0 00 00 01 01 02 f0 00 00 01 01 01 08 00 45 00 00 3c c9 c8 40 00 40 06 5a ef 0a 01 01 01 0a 01 01 02 b2 2a 1a 0b d3 78 6f 81 00 00 00 00 a0 02 fa f0 7a 0b 00 00 02 04 05 b4 04 02 08 0a 6d 35 40 9a 00 00 00 00 01 03 03 07 > /dev/null])
> +
> +dnl Send TCP PSH|URG
> +NS_CHECK_EXEC([at_ns1], [$PYTHON3 $srcdir/sendpkt.py p1 f0 00 00 01 01 01 f0 00 00 01 01 02 08 00 45 00 00 34 c9 c9 40 00 40 06 5a f6 0a 01 01 02 0a 01 01 01 b2 2a 1a 0b d3 78 6f 82 0a 0b d5 33 80 28 01 f6 72 bb 00 00 01 01 08 0a 6d 35 40 9a 11 90 3e 20 > /dev/null])
> +
> +dnl Check that we haven't advanced the ct state machine
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "dst=10.1.1.1" | sort | uniq], [0], [dnl
> +tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=45610,dport=6667),reply=(src=10.1.1.2,dst=10.1.1.1,sport=6667,dport=45610),protoinfo=(state=SYN_SENT)
> +])
> +
> +dnl Send TCP ACK without syn, and with garbage ack values
> +NS_CHECK_EXEC([at_ns1], [$PYTHON3 $srcdir/sendpkt.py p1 f0 00 00 01 01 01 f0 00 00 01 01 02 08 00 45 00 00 34 73 7a 40 00 40 06 b1 45 0a 01 01 02 0a 01 01 01 1a 0b b2 2a 0a 0b e5 33 d3 78 6f 87 80 10 01 fe 4b fa 00 00 01 01 08 0a 11 90 49 86 6d 35 4c 00 > /dev/null])
> +
> +dnl Check that we haven't advanced the ct state machine
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "dst=10.1.1.1" | sort | uniq], [0], [dnl
> +tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=45610,dport=6667),reply=(src=10.1.1.2,dst=10.1.1.1,sport=6667,dport=45610),protoinfo=(state=SYN_SENT)
> +])
> +
> +dnl Now send a valid SYN|ACK
> +NS_CHECK_EXEC([at_ns1], [$PYTHON3 $srcdir/sendpkt.py p1 f0 00 00 01 01 01 f0 00 00 01 01 02 08 00 45 00 00 3c 00 00 40 00 40 06 24 b8 0a 01 01 02 0a 01 01 01 1a 0b b2 2a 0a 0b d5 32 d3 78 6f 82 a0 12 fe 88 47 74 00 00 02 04 05 b4 04 02 08 0a 11 90 3e 20 6d 35 40 9a 01 03 03 07 > /dev/null])
> +
> +dnl Now check that we are in ESTABLISHED
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "dst=10.1.1.1" | sort | uniq], [0], [dnl
> +tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=45610,dport=6667),reply=(src=10.1.1.2,dst=10.1.1.1,sport=6667,dport=45610),protoinfo=(state=ESTABLISHED)
> +])
> +
> +OVS_TRAFFIC_VSWITCHD_STOP
> +AT_CLEANUP
> +
>  AT_BANNER([802.1ad])
>  
>  AT_SETUP([802.1ad - vlan_limit])
> -- 
> 2.31.1



More information about the dev mailing list