[ovs-dev] [PATCH] datapath: Avoid null deref when GSO is for verifying header integrity only.
jesse at nicira.com
Tue Jan 22 00:11:51 UTC 2013
On Mon, Jan 21, 2013 at 4:09 PM, Ben Pfaff <blp at nicira.com> wrote:
> On Mon, Jan 21, 2013 at 04:04:22PM -0800, Jesse Gross wrote:
>> On Mon, Jan 21, 2013 at 2:25 PM, Jesse Gross <jesse at nicira.com> wrote:
>> > On Mon, Jan 21, 2013 at 10:11 AM, Ben Pfaff <blp at nicira.com> wrote:
>> >> skb_gso_segment() has the following comment:
>> >> * It may return NULL if the skb requires no segmentation. This is
>> >> * only possible when GSO is used for verifying header integrity.
>> >> Somehow queue_gso_packets() has never hit this case before, but some
>> >> failures have suddenly been reported. This commit should fix the problem.
>> >> Bug #14772.
>> >> Reported-by: Deepesh Govindan <dgovindan at vmware.com>.
>> >> Signed-off-by: Ben Pfaff <blp at nicira.com>
>> > Acked-by: Jesse Gross <jesse at nicira.com>
>> > However, I think we also potentially have a similar problem in
>> > tunnel.c:handle_offloads().
>> >> ---
>> >> I don't know how to trigger this condition, so I haven't tested this
>> >> patch beyond verifying that it compiles against the Linux 3.2 kernel
>> >> on which the problem was reported. The patch was generated against
>> >> branch-1.7 because that's where the problem occurred; it looks like
>> >> it should apply cleanly against master also.
>> > We shouldn't normally be hitting this case because we're actually
>> > trying to do GSO, not header validation. However, I guess the
>> > guest/backend must be generating a packet with an MSS, which tricks us
>> > into thinking that it's GSO, but no GSO is actually requested. In the
>> > case of the bridge, header validation does take place so the situation
>> > is handled already. It seems not ideal that the network backend
>> > doesn't sanitize these packets but it's probably good that we handle
>> > it in any case.
>> I kept on looking into this and realized that my explanation is
>> completely wrong. Both KVM and Xen do correctly sanitize the GSO
>> metadata. The problem is actually on the receive side - LRO was
>> somehow enabled and that creates packets of the form that are causing
>> problems here. We disable LRO when interfaces are added to OVS but
>> it's possible that someone turned it on by hand later on. The real
>> source of the problem is that we're not checking and dropping LRO
>> packets on receive. We're supposed to be doing this but when code was
>> being restructured for upstreaming the check was accidentally moved
>> from the receive to the transmit function where it obviously doesn't
>> do anything. This is clearly a bug and moving it back should solve
>> the problem (the logs clearly confirm LRO as the problem).
>> As far as the current patch, it should also mostly avoid the problem
>> but can potentially cause problems later on in the system. With this
>> patch, we will send large packets up to userspace and they will come
>> back down to be forwarded but without any indication that they were
>> LRO packets originally. This will probably just cause them to get
>> dropped later on but could cause issues with some subsystems (like
>> GRO, which tries to infer MSS based on packet size). None of the
>> other places in the kernel besides dev.c that use skb_gso_segment()
>> have this check, so I think we probably should revert it.
> Presumably you have a real fix coming up, I hope you're willing to pair
> a revert and a fix patch as you commit the fix?
Yeah, I'm working on it.
More information about the dev