[ovs-dev] [PATCH net-next] net: openvswitch: Add support to lookup invalid packet in ct action.

Numan Siddique nusiddiq at redhat.com
Tue Oct 6 13:22:58 UTC 2020


On Tue, Oct 6, 2020 at 5:49 PM Numan Siddique <nusiddiq at redhat.com> wrote:
>
> On Tue, Oct 6, 2020 at 4:46 PM Florian Westphal <fw at strlen.de> wrote:
> >
> > nusiddiq at redhat.com <nusiddiq at redhat.com> wrote:
> > > From: Numan Siddique <nusiddiq at redhat.com>
> > >
> > > For a tcp packet which is part of an existing committed connection,
> > > nf_conntrack_in() will return err and set skb->_nfct to NULL if it is
> > > out of tcp window. ct action for this packet will set the ct_state
> > > to +inv which is as expected.
> >
> > This is because from conntrack p.o.v., such TCP packet is NOT part of
> > the existing connection.
> >
> > For example, because it is considered part of a previous incarnation
> > of the same connection.
> >
> > > But a controller cannot add an OVS flow as
> > >
> > > table=21,priority=100,ct_state=+inv, actions=drop
> > >
> > > to drop such packets. That is because when ct action is executed on other
> > > packets which are not part of existing committed connections, ct_state
> > > can be set to invalid. Few such cases are:
> > >    - ICMP reply packets.
> >
> > Can you elaborate? Echo reply should not be invalid. Conntrack should
> > mark it as established (unless such echo reply came out of the blue).
>
> Hi Florian,
>
> Thanks for providing the comments.
>
> Sorry for not being very clear.
>
> Let me brief about the present problem we see in OVN (which is a
> controller using ovs)
>
> When a VM/container sends a packet (in the ingress direction), we don't send all
> the packets to conntrack. If a packet is destined to an OVN load
> balancer virtual ip,
> only then we send the packet to conntrack in the ingress direction and
> then we do dnat
> to the backend.
>
> Eg. in the ingress direction
>
> table=1, match = (ip && ip4.dst == VIP) action = ct(table=2)
> tablle=2, ct_state=+new+trk && ip4.dst == VIP, action = ct(commit,
> nat=BACKEND_IP)
> ...
> ..
>
> However for the egress direction (when the packet is to be delivered
> to the VM/container),
> we send all the packets to conntrack and if the ct.est is set, we do
> undnat before delivering
> the packet to the VM/container.
> ...
> table=40, match = ip, action = ct(table=41)
> table=41, match = ct_state=+est+trk, action = ct(nat)
> ...
>
> What I mean here is that, since we send all the packets in the egress
> pipeline to conntrack,
> we can't add a flow like - match = ct_state=+inv, action = drop.
>
> i.e When a VM/container sends an ICMP request packet, it will not be
> sent to conntrack, but
> the reply ICMP will be sent to conntrack and it will be marked as invalid.
>
> So is the case with TCP, the TCP SYN from the VM is not sent to
> conntrack, but the SYN/ACK
> from the server would be sent to conntrack and it will be marked as invalid.
>
> >
> > >    - TCP SYN/ACK packets during connection establishment.
> >
> > SYN/ACK should also be established state.
> > INVALID should only be matched for packets that were never seen
> > by conntrack, or that are deemed out of date / corrupted.
> >
> > > To distinguish between an invalid packet part of committed connection
> > > and others, this patch introduces as a new ct attribute
> > > OVS_CT_ATTR_LOOKUP_INV. If this is set in the ct action (without commit),
> > > it tries to find the ct entry and if present, sets the ct_state to
> > > +inv,+trk and also sets the mark and labels associated with the
> > > connection.
> > >
> > > With this,  a controller can add flows like
> > >
> > > ....
> > > ....
> > > table=20,ip, action=ct(table=21, lookup_invalid)
> > > table=21,priority=100,ct_state=+inv+trk,ct_label=0x2/0x2 actions=drop
> > > table=21,ip, actions=resubmit(,22)
> > > ....
> > > ....
> >
> > What exactly is the feature/problem that needs to be solved?
> > I suspect this would help me to provide better feedback than the
> > semi-random comments below .... :-)
> >
> > My only problem with how conntrack does things ATM is that the ruleset
> > cannot distinguish:
> >
> > 1. packet was not even seen by conntrack
> > 2. packet matches existing connection, but is "bad", for example:
> >   - contradicting tcp flags
> >   - out of window
> >   - invalid checksum
>
> We want the below to be solved (using OVS flows) :
>   - If the packet is marked as invalid due to (2) which you mentioned above,
>     we would like to read the ct_mark and ct_label fields as the packet is
>     part of existing connection, so that we can add an OVS flow like
>
> ct_state=+inv+trk,ct_label=0x2 actions=drop
>
> Right now it is not possible.

I forgot to mention the side effect of it. Since the tcp out of window packet
is set as +inv, this packet is delivered to the VM/container without undnat
and because of this VM/container resets the connection.

Thanks
Numan



>
> This patch does another lookup if skb->_nfct is NULL after
> nf_conntrack_in() to check
> if (2) is the case. If the lookup is successful, it updates the ct flow
> key with the ct_mark and ct_label. This is made optional using a
> netlink attribute.
>
> I'm not sure if it's possible for nf_conntrack_in() to provide this
> information for
> its callers so that the caller can come to know that the state is
> invalid because of (2).
>
> I tested by setting 'be_liberal' sysctl flag and since skb->_nfct was
> set for (2), OVS
> datapath module set the ct_state to +est.
>
> Thanks
> Numan
>
>
> >
> > There are a few sysctls to modify default behaviour, e.g. relax window
> > checks, or ignore/skip checksum validation.
> >
> > The other problem i see (solveable for sure by yet-another-sysctl but i
> > see that as last-resort) is usual compatibility problem:
> >
> > ct state invalid drop
> > ct mark gt 0 accept
> >
> > If standard netfilter conntrack were to set skb->_nfct e.g. even if
> > state is invalid, we could still make the above work via some internal
> > flag.
> >
> > But if you reverse it, you get different behaviour:
> >
> > ct mark gt 0 accept
> > ct state invalid drop
> >
> > First rule might now accept out-of-window packet even when "be_liberal"
> > sysctl is off.
> >



More information about the dev mailing list