[ovs-dev] [PATCH v3] datapath-windows: Avoid BSOD when switch context is NULL

Sorin Vinturis svinturis at cloudbasesolutions.com
Tue Nov 18 07:48:31 UTC 2014


Hi Nithin,

I will follow your recommendation to not combine different patches into one. Please find my answers to your comments inline.

Sorin

-----Original Message-----
From: Nithin Raju [mailto:nithin at vmware.com] 
Sent: Tuesday, 18 November, 2014 09:10
To: Sorin Vinturis
Cc: dev at openvswitch.org
Subject: Re: [ovs-dev] [PATCH v3] datapath-windows: Avoid BSOD when switch context is NULL

hi Sorin,
Like I mentioned in another review, pls. don't combine unrelated changes into one patch. Here I see a fix to buffer management code, and the other one is related to device handle cleanup.

Can you pls. split them into separate reviews and send it as a series?

For now, I've reviewed the code nevertheless.

On Nov 14, 2014, at 5:24 AM, Sorin Vinturis <svinturis at cloudbasesolutions.com> wrote:

> I came around a BSOD that happened when trying to access pidHashLock 
> from the gOvsSwitchContext, which was NULL. The stop happened in 
> OvsAcquirePidHashLock function.
> 
> To reproduce this BSOD, make sure the extension is enabled and 
> running, disable it and, after that, execute 'ovs-dpctl.exe show'. The 
> BSOD is triggered afterwards.
> 
> Signed-off-by: Sorin Vinturis <svinturis at cloudbasesolutions.com>
> Reported-by: Sorin Vinturis <svinturis at cloudbasesolutions.com>
> Reported-at: 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openvs
> witch_ovs-2Dissues_issues_53&d=AAIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-
> YihVMNtXt-uEs&r=pNHQcdr7B40b4h6Yb7FIedI1dnBsxdDuTLBYD3JqV80&m=myh4dF1m
> oeIJYk-RA8t1wqv28VpYg_aScFAkBe8Esp0&s=eZX2_Er6DHMb3gOjIlbaDYhlPJIv2Bnw
> x_vos3NuGmo&e=
> ---
> datapath-windows/ovsext/BufferMgmt.c |  8 ++++++--
> datapath-windows/ovsext/Datapath.c   |  5 ++++-
> datapath-windows/ovsext/Tunnel.c     |  3 ++-
> datapath-windows/ovsext/User.c       | 11 +++++++----
> 4 files changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/datapath-windows/ovsext/BufferMgmt.c 
> b/datapath-windows/ovsext/BufferMgmt.c
> index e0377c1..1a68575 100644
> --- a/datapath-windows/ovsext/BufferMgmt.c
> +++ b/datapath-windows/ovsext/BufferMgmt.c
> @@ -474,15 +474,19 @@ OvsAllocateVariableSizeNBL(PVOID ovsContext, {
>     PNET_BUFFER_LIST nbl = NULL;
>     POVS_SWITCH_CONTEXT context = (POVS_SWITCH_CONTEXT)ovsContext;
> -    POVS_NBL_POOL ovsPool = &context->ovsPool;
> +    POVS_NBL_POOL ovsPool = NULL;
>     POVS_BUFFER_CONTEXT ctx;
>     UINT32 realSize;
>     PMDL mdl;
>     NDIS_STATUS status;
>     PNDIS_SWITCH_FORWARDING_DETAIL_NET_BUFFER_LIST_INFO info;
> -    if (size == 0) {
> +
> +    if ((NULL == ovsContext) || (0 == size)) {

When would this happen? Have you found a codepath or a crash dump where you saw this issue? The three references I see to OvsAllocateVariableSizeNBL() are all safe to access 'ovsContext'.
[Sorin]: I didn't saw any crashes here for this codepath. It's just that I am used to checking a pointer before usage. I'll remove the check.

>         return NULL;
>     }
> +
> +    ovsPool = &context->ovsPool;
> +
>     realSize = MEM_ALIGN_SIZE(size + headRoom);
> 
>     mdl = OvsAllocateMDLAndData(ovsPool->ndisHandle, realSize); diff 
> --git a/datapath-windows/ovsext/Datapath.c 
> b/datapath-windows/ovsext/Datapath.c
> index 1b504a2..d8962bd 100644
> --- a/datapath-windows/ovsext/Datapath.c
> +++ b/datapath-windows/ovsext/Datapath.c
> @@ -994,9 +994,10 @@ OvsDpFillInfo(POVS_SWITCH_CONTEXT 
> ovsSwitchContext, {
>     BOOLEAN writeOk;
>     OVS_MESSAGE msgOutTmp;
> -    OVS_DATAPATH *datapath = &ovsSwitchContext->datapath;
> +    OVS_DATAPATH *datapath = NULL;
>     PNL_MSG_HDR nlMsg;
> 
> +    ASSERT(ovsSwitchContext);
>     ASSERT(NlBufAt(nlBuf, 0, 0) != 0 && NlBufRemLen(nlBuf) >= sizeof 
> *msgIn);
> 
>     msgOutTmp.nlMsg.nlmsgType = OVS_WIN_NL_DATAPATH_FAMILY_ID; @@ 
> -1009,6 +1010,8 @@ OvsDpFillInfo(POVS_SWITCH_CONTEXT ovsSwitchContext,
>     msgOutTmp.genlMsg.reserved = 0;
> 
>     msgOutTmp.ovsHdr.dp_ifindex = ovsSwitchContext->dpNo;
> +    
> +    datapath = &ovsSwitchContext->datapath;

Why is this change necessary? I don't see it as adding any additional checks. We'll for sure not call into OvsDpFillInfo() if 'ovsSwitchContext' is NULL.
[Sorin]: Ok, I'll remove the changes.

> 
>     writeOk = NlMsgPutHead(nlBuf, (PCHAR)&msgOutTmp, sizeof msgOutTmp);
>     if (writeOk) {
> diff --git a/datapath-windows/ovsext/Tunnel.c 
> b/datapath-windows/ovsext/Tunnel.c
> index b55a223..fed58f1 100644
> --- a/datapath-windows/ovsext/Tunnel.c
> +++ b/datapath-windows/ovsext/Tunnel.c
> @@ -224,9 +224,10 @@ OvsInjectPacketThroughActions(PNET_BUFFER_LIST pNbl,
>     OvsCompletionList completionList;
>     KIRQL irql;
>     ULONG SendFlags = NDIS_SEND_FLAGS_SWITCH_DESTINATION_GROUP;
> -    OVS_DATAPATH *datapath = &gOvsSwitchContext->datapath;
> +    OVS_DATAPATH *datapath = NULL;
> 
>     ASSERT(gOvsSwitchContext);
> +    datapath = &gOvsSwitchContext->datapath;

This is fine.


>     /* Fill the tunnel key */
>     status = OvsSlowPathDecapVxlan(pNbl, &tunnelKey); diff --git 
> a/datapath-windows/ovsext/User.c b/datapath-windows/ovsext/User.c 
> index fc27f7d..ca60cd0 100644
> --- a/datapath-windows/ovsext/User.c
> +++ b/datapath-windows/ovsext/User.c
> @@ -141,10 +141,13 @@ OvsCleanupPacketQueue(POVS_OPEN_INSTANCE instance)
>         OvsFreeMemory(queue);
>     }
> 
> -    /* Remove the instance from pidHashArray */
> -    OvsAcquirePidHashLock();
> -    OvsDelPidInstance(gOvsSwitchContext, instance->pid);
> -    OvsReleasePidHashLock();
> +    /* Verify if gOvsSwitchContext exists. */
> +    if (gOvsSwitchContext) {
> +        /* Remove the instance from pidHashArray */
> +        OvsAcquirePidHashLock();
> +        OvsDelPidInstance(gOvsSwitchContext, instance->pid);
> +        OvsReleasePidHashLock();

'gOvsSwitchContext' should be accessed while holding, OvsAcquireCtrlLock()/OvsReleaseCtrlLock().
[Sorin]: Ok, I'll modify the code to obtain the control lock first.




More information about the dev mailing list