[ovs-dev] [PATCH] FAQ: Explain how "tap" devices work and why you should not use them.
Daniele Di Proietto
diproiettod at vmware.com
Tue May 5 16:45:50 UTC 2015
> On 5 May 2015, at 16:10, Ben Pfaff <blp at nicira.com> wrote:
>
> On Tue, May 05, 2015 at 11:52:37AM +0100, Daniele Di Proietto wrote:
>>
>>> On 5 May 2015, at 02:25, Ben Pfaff <blp at nicira.com> wrote:
>>>
>>> CC: 张伟 <zhangwqh at 126.com>
>>> Signed-off-by: Ben Pfaff <blp at nicira.com>
>>> ---
>>> AUTHORS | 1 +
>>> FAQ.md | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 81 insertions(+)
>>>
>>> diff --git a/AUTHORS b/AUTHORS
>>> index cff99e6..9db112d 100644
>>> --- a/AUTHORS
>>> +++ b/AUTHORS
>>> @@ -360,6 +360,7 @@ likunyun kunyunli at hotmail.com
>>> rahim entezari rahim.entezari at gmail.com
>>> 冯全树(Crab) fqs888 at 126.com
>>> 胡靖飞 hujingfei914 at msn.com
>>> +张伟 zhangwqh at 126.com
>>>
>>> Thanks to all Open vSwitch contributors. If you are not listed above
>>> but believe that you should be, please write to dev at openvswitch.org.
>>> diff --git a/FAQ.md b/FAQ.md
>>> index 21d4e7a..3d4ce6f 100644
>>> --- a/FAQ.md
>>> +++ b/FAQ.md
>>> @@ -823,6 +823,86 @@ A: Open vSwitch wasn't able to create the port. Check the
>>> ovs-vsctl will immediately report when there is an issue creating a
>>> port.
>>>
>>> +### Q: I created a tap device tap0, configured an IP address on it, and
>>> + added it to a bridge, like this:
>>> +
>>> + tunctl -t tap0
>>> + ifconfig tap0 192.168.0.123
>>> + ovs-vsctl add-br br0
>>> + ovs-vsctl add-port br0 tap0
>>> +
>>> + I expected that I could then use this IP address to contact other
>>> + hosts on the network, but it doesn't work. Why not?
>>> +
>>> +A: The short answer is that this is a misuse of a "tap" device. Use
>>> + an "internal" device implemented by Open vSwitch, which works
>>> + differently and is designed for this use. To solve this problem
>>> + with an internal device, instead run:
>>> +
>>> + ovs-vsctl add-br br0
>>> + ovs-vsctl add-port br0 int0 -- set Interface int0 type=internal
>>> + ifconfig int0 192.168.0.123
>>> +
>>> + Even more simply, you can take advantage of the internal port that
>>> + every bridge has under the name of the bridge:
>>> +
>>> + ovs-vsctl add-br br0
>>> + ifconfig br0 192.168.0.123
>>> +
>>> + In more detail, a "tap" device is an interface between the Linux
>>> + (or *BSD) network stack and a user program that opens it as a
>>> + socket. When the "tap" device transmits a packet, it appears in
>>> + the socket opened by the userspace program. Conversely, when the
>>> + userspace program writes to the "tap" socket, the kernel TCP/IP
>>> + stack processes the packet as if it had been received by the "tap"
>>> + device.
>>> +
>>> + Consider the configuration above. Given this configuration, if you
>>> + "ping" an IP address in the 192.168.0.x subnet, the Linux kernel
>>> + routing stack will transmit an ARP on the tap0 device. Open
>>> + vSwitch userspace treats "tap" devices just like any other network
>>> + device; that is, it doesn't open them as "tap" sockets. That means
>>> + that the ARP packet will simply get dropped.
>>> +
>>> + You might wonder why the Open vSwitch kernel module doesn't
>>> + intercept the ARP packet and bridge it. After all, Open vSwitch
>>> + intercepts packets on other devices. The answer is that Open
>>> + vSwitch only intercepts *received* packets, but this is a packet
>>> + being transmitted. The same thing happens for all other types of
>>> + network devices, except for Open vSwitch "internal" ports. If you,
>>> + for example, add a physical Ethernet port to an OVS bridge,
>>> + configure an IP address on that physical Ethernet port, and then issue
>>> + a "ping" to an address in that subnet, the same thing happens: an
>>> + ARP gets transmitted on the physical Ethernet port and Open vSwitch
>>> + never sees it. (You should not do that, as documented at the
>>> + beginning of this section.)
>>> +
>>> + It can make sense to add a "tap" device to an Open vSwitch bridge,
>>> + if some userspace program (other than Open vSwitch) has opened the
>>> + tap socket. This is the case, for example, if the "tap" device was
>>> + created by KVM (or QEMU) to simulate a virtual NIC. In such a
>>> + case, when OVS bridges a packet to the "tap" device, the kernel
>>> + forwards that packet to KVM in userspace, which passes it along to
>>> + the VM, and in the other direction, when the VM sends a packet, KVM
>>> + writes it to the "tap" socket, which causes OVS to receive it and
>>> + bridge it to the other OVS ports. Please note that in such a case
>>> + no IP address is configured on the "tap" device (there is normally
>>> + an IP address configured in the virtual NIC inside the VM, but this
>>> + is not visible to the host Linux kernel or to Open vSwitch).
>>
>> I would also add that, in the above case, the interface type in OVS
>> should be "system" and not "tap" (please, correct me if I'm wrong).
>> I believe this confusion led to Debian bug #764843 and #764847.
>>
>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=764843
>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=764847
>>
>> What do you think?
>
> That's a good point. Thanks, how about this additional paragraph to
> clear that up?
>
> Open vSwitch has a network device type called "tap". This is
> intended only for implementing "internal" ports in the OVS
> userspace switch and should not be used otherwise. In particular,
> users should not configure KVM "tap" devices as type "tap" (use
> type "system", the default, instead).

Perfect, thanks for taking care of this.

Acked-by: Daniele Di Proietto <diproiettod at vmware.com>
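
For anyone who has already attached a KVM "tap" device with type "tap",
the paragraph above could be applied roughly as follows. This is only a
sketch: "vnet0" is a hypothetical device name, the helper function names
and the OVS_VSCTL override are made up for illustration, and it assumes
ovs-vsctl is on the PATH and the port is already in a bridge.

```shell
#!/bin/sh
# Sketch only: helper names and OVS_VSCTL are illustrative assumptions.
# OVS_VSCTL can be overridden (e.g. "echo ovs-vsctl") for a dry run.
OVS_VSCTL=${OVS_VSCTL:-ovs-vsctl}

# Print the interface type currently recorded in the OVS database.
show_type() {
    $OVS_VSCTL get Interface "$1" type
}

# Switch an interface to type "system", which the FAQ text above
# describes as the default and the right type for KVM "tap" devices.
use_system_type() {
    $OVS_VSCTL set Interface "$1" type=system
}
```

Running "show_type vnet0" followed by "use_system_type vnet0" would first
report the current type and then change it. Setting
OVS_VSCTL="echo ovs-vsctl" beforehand prints the commands instead of
executing them.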