[ovs-dev] OVS Netlink zerocopy vs Xen netback zerocopy

Zoltan Kiss zoltan.kiss at citrix.com
Wed Feb 19 15:50:43 UTC 2014


I'm currently working on a patchset which reintroduces grant mapping
into netback. We used it before the Linux Xen bits were upstreamed, but
we had to change to grant copy, as the original solution was
fundamentally not upstreamable. The advantage would be huge, though, as
we could replace Xen copying guest pages with mapping guest pages into
Dom0.
In parallel, I'm working on a grant mapping optimization which makes it
possible to avoid m2p_override for grant mapped pages. m2p_override
causes lock contention, and we don't need it if the pages don't go to
userspace. That could be a safe assumption, as those pages would stay in
kernel space while switched by OVS, and if they end up on the local
port, delivered to the Dom0 IP stack, deliver_skb will call
skb_orphan_frags, which swaps out those foreign (= grant mapped from the
guest) pages for local copies and notifies netback through a callback
that it can give the pages back to the guest.

After that rather long introduction, here comes the main question: OVS
recently introduced Netlink zerocopy, which by my understanding means
that Netlink messages from the kernel are not copied but mapped to
userspace. Such a message can contain a whole packet if it hasn't
matched any flow in the kernel, or if the flow action said so. As far as
I saw, skb_zerocopy will clone the frags from the real packet skb to the
Netlink skb. Note that the linear buffer is local memory in the netback
case as well: we copy the beginning of the packet (max 128 bytes) there,
and only the pages on frags are foreign ones.
I don't know the internals of Netlink well enough to tell how a packet
is forwarded up in this case, but it concerns me: if the pages in the
skb_shinfo(skb)->frags array are still the foreign ones and userspace
wants to touch that data, we are in trouble.
If this is the scenario, I think the best option would be to call
skb_orphan_frags before skb_zerocopy in queue_userspace_packet, so the
frags become local. Fortunately this is a corner case, as it shouldn't
happen very often that the kernel sends up packets bigger than
128 bytes.
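
To make the ordering concrete, here is a small userspace model of the
idea, not kernel code. Everything prefixed with model_ or MODEL_ is a
made-up stand-in: model_orphan_frags mimics what skb_orphan_frags is
expected to do (copy foreign pages to local memory and notify the
owner), and model_zerocopy mimics skb_zerocopy sharing frag references.
It is only a sketch under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MAX_FRAGS 4
#define MODEL_PAGE_SIZE 64      /* tiny "page", for illustration only */

/* Hypothetical stand-in for one entry of skb_shinfo(skb)->frags. */
struct model_frag {
    unsigned char *page;
    bool foreign;               /* true: grant mapped from the guest */
};

/* Hypothetical stand-in for an skb that carries only frags. */
struct model_skb {
    struct model_frag frags[MODEL_MAX_FRAGS];
    size_t nr_frags;
    void (*release_cb)(unsigned char *page);  /* models the netback callback */
};

/* Counts how many foreign pages were handed back to the "guest". */
int model_released;

void model_count_release(unsigned char *page)
{
    (void)page;
    model_released++;
}

/* Models skb_orphan_frags(): swap every foreign page for a local copy. */
int model_orphan_frags(struct model_skb *skb)
{
    for (size_t i = 0; i < skb->nr_frags; i++) {
        struct model_frag *f = &skb->frags[i];
        unsigned char *copy;

        if (!f->foreign)
            continue;
        copy = malloc(MODEL_PAGE_SIZE);
        if (!copy)
            return -1;
        memcpy(copy, f->page, MODEL_PAGE_SIZE);
        if (skb->release_cb)
            skb->release_cb(f->page);   /* page can go back to the guest */
        f->page = copy;
        f->foreign = false;
    }
    return 0;
}

/* Models skb_zerocopy(): share the frag references, copy no data. */
void model_zerocopy(struct model_skb *to, const struct model_skb *from)
{
    assert(to->nr_frags + from->nr_frags <= MODEL_MAX_FRAGS);
    for (size_t i = 0; i < from->nr_frags; i++)
        to->frags[to->nr_frags++] = from->frags[i];
}

/* Proposed ordering: orphan first, so user_skb never sees foreign pages. */
int model_queue_userspace_packet(struct model_skb *pkt,
                                 struct model_skb *user_skb)
{
    if (model_orphan_frags(pkt) != 0)
        return -1;
    model_zerocopy(user_skb, pkt);
    return 0;
}

/* Returns true if any frag of the skb still points at a foreign page. */
bool model_has_foreign_frags(const struct model_skb *skb)
{
    for (size_t i = 0; i < skb->nr_frags; i++)
        if (skb->frags[i].foreign)
            return true;
    return false;
}
```

In this model, calling model_zerocopy without the orphan step would
leave the foreign page reachable from user_skb, which is exactly the
case I'm worried about.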

What do you think about the solution in the last paragraph? Or do we 
need it at all?
