[ovs-dev] [PATCH 4/9] dpif-netdev: Add poll-mode-device thread.

Pravin Shelar pshelar at nicira.com
Wed Mar 19 22:16:37 UTC 2014


On Wed, Mar 19, 2014 at 11:08 AM, Ben Pfaff <blp at nicira.com> wrote:
> On Tue, Mar 18, 2014 at 01:53:27PM -0700, Pravin wrote:
>> This patch adds PMD type netdev for netdevice with poll-mode
>> drivers.  Since there is no way to get signal on a packet recv
>> from these devices we need to poll them in busy loop.  So minimize
>> system call overhead this patch uses dpif-thread exclusively
>> for PMD devices and rest of devices which needs system calls to
>> do IO are moved to dpif-netdev-run().
>> PMD device like DPDK work in userspace so there is no system call
>> overhead for them.
>>
>> Signed-off-by: Pravin B Shelar <pshelar at nicira.com>
>
> Clang reports:
>
>     ../lib/dpif-netdev.c:1640:13: error: calling function 'dp_netdev_port_input'
>           requires shared lock on 'dp->port_rwlock'
>           [-Werror,-Wthread-safety-analysis]
>                 dp_netdev_port_input(dp, packet[i], md);
>                 ^
>
> sparse also gave me this really odd error message:
>     ../lib/dpif-netdev.c:1645:15: warning: Initializer entry defined twice
>     ../lib/dpif-netdev.c:1645:15:   also defined here
> which I was able to fix by removing the "inline" from
> dp_netdev_process_rx_port().  (I think that must be a bug in sparse,
> but it's still nice to avoid it.)
>
I have not check with clang I will do it for v2.

> When pmd_thread_main() exits, I don't see anything unref'ing the ports
> it ref'd in pmd_load_queues.
>
pmd_load_queues() does unref for device added before.

> I imagine that this commit will make userspace datapath forwarding
> slower on non-PMD devices, because it moves their forwarding back into
> the vswitchd main thread.  Did you consider doing all forwarding in
> the pmd threads?  You could do that by polling non-pmd devices in the
> pmd loop and then, at the bottom of the loop, if the thread has only
> non-pmd devices, wait for one of them to become ready.  (But that's
> pretty nonessential; we can always improve performance later.)
>
As discussed off-line, I will keep it simple.

> Polling the atomic variable only every 67,108,864 times through the
> loop seems pretty conservative.  (Does that even poll once a second?)
>
I am not sure how to calculate one second in this loop efficiently.
But I will try to use poll() rather than atomic variable here.

> The new 'md' member in struct dp_netdev_port seems a little odd to me.
> It is initialized when the port is created and never modified; at
> least, I think not but non-const pointers to it are constructed in
> dpif_netdev_execute() and dp_netdev_process_rx_port().  (Can these be
> made const?)  But I really wonder whether it's worth caching this at
> all; I guess that the caching is for performance but since we only
> initialize the in_port member it would probably be faster to just pass
> NULL as 'md' to flow_extract() and then initialize flow->in_port after
> the call returns, since this would bypass a lost of wasteful copying
> 0-bytes around in flow_extract().
>
OK, I will use port->md here.

> I think that the new 'md' member is the only real change in struct
> dp_netdev_port.  A lot of members are reordered in that struct.  Is
> that for organization, or speculation that it produces better cache
> behavior, or...?
>
I just changed those member according to access pattern. I will keep
struct as it is. we can improve it later.

> I'd prefer to see the pmd property be defined as part of the
> netdev-provider interface (and documented).  'pmd' is new jargon
> introduced by DPDK, right?  Maybe we should expand it to something
> that a new developer can understand more readily, like "poll-only" or
> "not-waitable" or...?
>
At this point I do not see pmd-thread use for anything other than DPDK
so we can use pmd name, I will document it.

> Any particular reason to define NR_THREADS in a header file instead of
> dpif-netdev.c?
>
ok, I will move it to dpif-netdev.c

> The reason for the change to port_unref() isn't obvious to me.
>
That is not required. It is incorrect rebase.

> In pmd_thread_main(), I usually see "for (;;)" instead of "while (1)"
> in OVS code, and I would tend to do a "do { ... } while" instead of
> the goto.
>
ok

> There's a spurious blank line in dp_netdev_process_rx_port() here:
> +        }
> +
> +    } else if (error != EAGAIN && error != EOPNOTSUPP) {
ok.



More information about the dev mailing list