<div dir="ltr">Thanks very much for the review!~ ;D<br><div class="gmail_extra"><br>Please see my reply inline,<br><br><br><div class="gmail_quote">On Thu, Feb 20, 2014 at 3:25 PM, Pravin Shelar <span dir="ltr"><<a href="mailto:pshelar@nicira.com" target="_blank">pshelar@nicira.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><span style="color:rgb(80,0,80)">> upcall.cmd = OVS_PACKET_CMD_MISS;</span><br>
</div><div>
<div>
> upcall.key = &key;<br>
> upcall.userdata = NULL;<br>
> - upcall.portid = p->upcall_portid;<br>
> + upcall.portid = ovs_vport_find_pid(p, &key);<br>
> ovs_dp_upcall(dp, skb, &upcall);<br>
<br>
</div></div>Can you use different name than pid? pid is generally used for process id.<br>
<div><br></div></blockquote><div><br></div><div><br></div><div><br></div><div>I'll use 'portid' for consistency. In userspace, we call it pid (e.g. nl_sock_pid(),</div><div>VPORT_ATTR_UPCALL_PID). So I'll keep using it.</div>
<div><br></div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>> @@ -1459,7 +1459,7 @@ static const struct nla_policy vport_policy[OVS_VPORT_ATTR_MAX + 1] = {<br>
> [OVS_VPORT_ATTR_STATS] = { .len = sizeof(struct ovs_vport_stats) },<br>
> [OVS_VPORT_ATTR_PORT_NO] = { .type = NLA_U32 },<br>
> [OVS_VPORT_ATTR_TYPE] = { .type = NLA_U32 },<br>
> - [OVS_VPORT_ATTR_UPCALL_PID] = { .type = NLA_U32 },<br>
> + [OVS_VPORT_ATTR_UPCALL_PID] = { .type = NLA_UNSPEC },<br>
> [OVS_VPORT_ATTR_OPTIONS] = { .type = NLA_NESTED },<br>
> };<br>
><br>
<br>
</div>Have you tried this patch with old userspace to check compatibility?</blockquote><div><br></div><div><br></div><div>Will make sure the kernel module is compatible with old userspace.</div><div><br></div><div>Also, you just reminded me that this patch can be cut into series! ;D So, I'm</div>
<div>planning to do that and resend it as series of patches.</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>
> @@ -1579,8 +1581,7 @@ static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info)<br>
> int err;<br>
><br>
> err = -EINVAL;<br>
> - if (!a[OVS_VPORT_ATTR_NAME] || !a[OVS_VPORT_ATTR_TYPE] ||<br>
> - !a[OVS_VPORT_ATTR_UPCALL_PID])<br>
> + if (!a[OVS_VPORT_ATTR_NAME] || !a[OVS_VPORT_ATTR_TYPE])<br>
> goto exit;<br>
><br>
<br>
</div>Port ids array should be mandatory. as it is today.<br></blockquote><div><br></div><div><br></div><div><br></div><div>Yeah, I'll change accordingly to honor the API and guarantee compatibility,</div>
<div><br></div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div><div>> +static void __vport_pids_destroy(struct vport_pids *pids)<br>
> +{<br>
> + if (pids->pids)<br>
> + kfree(pids->pids);<br>
> +<br>
> + kfree(pids);<br>
> +}<br>
> +<br>
<br>
</div></div>This is not called from anywhere else. Can you use rcu-callback<br>
function to free?<br>
<div><div><br></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div><div>
> +static void vport_pids_destroy_rcu_cb(struct rcu_head *rcu)<br>
> +{<br>
> + struct vport_pids *pids = container_of(rcu, struct vport_pids, rcu);<br>
> +<br>
> + __vport_pids_destroy(pids);<br>
> +}<br></div></div></blockquote><div><br></div><div><br></div><div>I see, I'll combine the two.</div><div><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div><div>> + */<br>
> +int ovs_vport_set_upcall_pids(struct vport *vport, struct nlattr *pids)<br>
> +{<br>
> + struct vport_pids *old;<br>
> +<br>
> + if (pids && nla_len(pids) % sizeof(u32))<br>
> + return -EINVAL;<br>
> +<br>
> + old = ovsl_dereference(vport->upcall_pids);<br>
> +</div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div>
> + if (pids) {<br>
> + struct vport_pids *vport_pids;<br>
> +<br>
> + vport_pids = kmalloc(sizeof *vport_pids, GFP_KERNEL);<br>
> + vport_pids->pids = kmalloc(nla_len(pids), GFP_KERNEL);<br>
> + vport_pids->n_pids = nla_len(pids)<br>
> + / (sizeof *vport_pids->pids);<br>
> + memcpy(vport_pids->pids, nla_data(pids), nla_len(pids));<br>
> +<br>
<br>
</div>We can just use single memory space to allocate both struct vport_pids<br>
and array. Same is done for flow-stats.<br>
<div><br></div></blockquote><div><br></div><div><br></div><div><br></div><div>Thx for the good suggestion, I'll adjust my code accordingly.</div><div><br></div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>
> + rcu_assign_pointer(vport->upcall_pids, vport_pids);<br>
> + } else if (old) {<br>
> + rcu_assign_pointer(vport->upcall_pids, NULL);<br>
> + }<br>
> +<br>
> + if (old)<br>
> + call_rcu(&old->rcu, vport_pids_destroy_rcu_cb);<br>
> +<br>
<br>
</div>This is bit confusing. We can free pids array in from vport destroy callback.<br>
So there is no need to handle case where pids is NULL.<br>
<div><br></div></blockquote><div><br></div><div><br></div><div><br></div><div>Sure, changes are made so that vport_pids will never be NULL, except for the first</div><div>time it is configured.</div><div><br></div>
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div>> +int ovs_vport_get_upcall_pids(const struct vport *vport, struct sk_buff *skb)<br>
> +{<br>
> + struct vport_pids *pids;<br>
> + int err = 0;<br>
> +<br>
> + rcu_read_lock();<br>
> + pids = ovsl_dereference(vport->upcall_pids);<br>
> +<br>
</div>This function is always called with rcu or ovs mutex. so there is no<br>
need to take extra lock.<br>
<div><br></div></blockquote><div><br></div><div><br></div><div>I should check for that next time. Fixed.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>
> + if (!pids)<br>
> + goto exit;<br>
> +<br>
<br>
</div>pids should not be NULL in this case.</blockquote><div><br></div><div> </div><div>Yes, I'll remove it.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div><div>
> + * Returns the pid of the target socket. Must be called with rcu_read_lock.<br>
> + */<br>
> +u32 ovs_vport_find_pid(const struct vport *p, const struct sw_flow_key *key)<br>
> +{<br>
> + struct vport_pids *pids;<br>
> + u32 hash;<br>
> +<br>
> + pids = ovsl_dereference(p->upcall_pids);<br>
> +<br>
</div></div>This function is only called under rcu read context, therefore<br>
rcu_dereference is sufficient.<br>
<div><br></div></blockquote><div><br></div><div><br></div><div>Thanks, fixed.</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>
> + if (!pids)<br>
> + return 0;<br>
> +<br>
</div>pids should not be NULL in this case.<br>
<div><br></div></blockquote><div><br></div><div><br></div><div>Yes, I'll remove it.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>
> + hash = key->ipv4.addr.src ^ key->ipv4.addr.dst<br>
> + ^ key->ip.proto ^ key->ipv4.tp.src<br>
> + ^ key->ipv4.tp.dst;<br>
> +<br>
<br>
</div>skb_get_rxhash() should be cheaper in general than calculating hash.<br>
<div><br></div></blockquote><div><br></div><div><br></div><div>Thanks for the suggestion, I'll use it.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>
> + return pids->pids[jhash((void *) &hash, 4, 0) % pids->n_pids];<br>
> +}<br>
> +<br>
<br>
</div> We should use hash seed here since we are using values from packet<br>
coming from network.<br>
<div><div><br></div></div></blockquote><div><br></div><div><br></div><div>Since i'll use skb_get_rxhash(), I will just used the skb_get_rxhash() return value.</div><div>So, no seed is needed.</div><div><br></div><div>
<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>> - int idx = dpif->epoll_events[dpif->event_offset].data.u32;<br>
> - struct dpif_channel *ch = &dpif->channels[idx];<br>
> + while (epoll->event_offset < epoll->n_events) {<br>
> + int idx = epoll->epoll_events[epoll->event_offset].data.u32;<br>
> + struct dpif_channel *ch = &dpif->channels[idx][handler_id];<br>
><br>
<br>
</div>I have not looked at whole userspace part, but just noticed that this<br>
channels access pattern is not good for cpu cacheline.<br>
Is there reason for doing it this way rather than channel[handler_id][idx] ?<br></blockquote><div><br></div><div><br></div><div>I just haven't thought about the cacheline benefits. Will make the change. Thanks.</div>
<div> </div></div><br></div></div>