[ovs-dev] [PATCH] ofproto: Fix crash on PACKET_OUT due to recursive locking after upcall.

Ilya Maximets i.maximets at ovn.org
Thu Nov 21 18:31:41 UTC 2019


On 01.11.2019 22:24, Ilya Maximets wrote:
> Handling of OpenFlow PACKET_OUT implies pushing the packet through
> the datapath and packet processing inside the datapath could trigger
> an upcall.  In case of system datapath, 'dpif_execute()' sends packet
> to the kernel module and returns.  If any, upcall  will be triggered
> inside the kernel and handled by a separate thread in userspace.
> But in case of userspace datapath full processing of the packet happens
> inside the 'dpif_execute()' in the same thread that handled PACKET_OUT.
> This causes an issue if upcall will lead to modification of flow rules.
> For example, it could happen while processing of 'learn' actions.
> Since whole handling of PACKET_OUT is protected by 'ofproto_mutex',
> OVS will assert on attempt to take it recursively while processing
> 'learn' actions:
> 
>    0 __GI_raise (sig=sig at entry=6)
>    1 __GI_abort ()
>    2 ovs_abort_valist ()
>    3 ovs_abort ()
>    4 ovs_mutex_lock_at (where=where at entry=0xad4199 "ofproto/ofproto.c:5391")
>                 <Trying to acquire ofproto_mutex again>
>    5 ofproto_flow_mod_learn ()       at ofproto/ofproto.c:5391
>                 <Trying to modify flows according to 'learn' action>
>    6 xlate_learn_action ()           at ofproto/ofproto-dpif-xlate.c:5378
>                 <'learn' action found>
>    7 do_xlate_actions ()             at ofproto/ofproto-dpif-xlate.c:6893
>    8 xlate_recursively ()            at ofproto/ofproto-dpif-xlate.c:4233
>    9 xlate_table_action ()           at ofproto/ofproto-dpif-xlate.c:4361
>   10 in xlate_ofpact_resubmit ()     at ofproto/ofproto-dpif-xlate.c:4672
>   11 do_xlate_actions ()             at ofproto/ofproto-dpif-xlate.c:6773
>   12 xlate_actions ()                at ofproto/ofproto-dpif-xlate.c:7570
>                  <Translating actions>
>   13 upcall_xlate ()                 at ofproto/ofproto-dpif-upcall.c:1197
>   14 process_upcall ()               at ofproto/ofproto-dpif-upcall.c:1413
>   15 upcall_cb ()                    at ofproto/ofproto-dpif-upcall.c:1315
>   16 dp_netdev_upcall (DPIF_UC_MISS) at lib/dpif-netdev.c:6236
>                  <Flow cache miss. Making upcall>
>   17 handle_packet_upcall ()         at lib/dpif-netdev.c:6591
>   18 fast_path_processing ()         at lib/dpif-netdev.c:6709
>   19 dp_netdev_input__ ()            at lib/dpif-netdev.c:6797
>   20 dp_netdev_recirculate ()        at lib/dpif-netdev.c:6842
>   21 dp_execute_cb ()                at lib/dpif-netdev.c:7158
>   22 odp_execute_actions ()          at lib/odp-execute.c:794
>   23 dp_netdev_execute_actions ()    at lib/dpif-netdev.c:7332
>   24 dpif_netdev_execute ()          at lib/dpif-netdev.c:3725
>   25 dpif_netdev_operate ()          at lib/dpif-netdev.c:3756
>                  <Packet pushed to userspace datapath for processing>
>   26 dpif_operate ()                 at lib/dpif.c:1367
>   27 dpif_execute ()                 at lib/dpif.c:1321
>   28 packet_execute ()               at ofproto/ofproto-dpif.c:4760
>   29 ofproto_packet_out_finish ()    at ofproto/ofproto.c:3594
>                  <Taking ofproto_mutex>
>   30 handle_packet_out ()            at ofproto/ofproto.c:3635
>   31 handle_single_part_openflow (OFPTYPE_PACKET_OUT) at ofproto/ofproto.c:8400
>   32 handle_openflow ()                               at ofproto/ofproto.c:8587
>   33 ofconn_run ()
>   34 connmgr_run ()
>   35 ofproto_run ()
>   36 bridge_run__ ()
>   37 bridge_run ()
>   38 main ()
> 
> Fix that by splitting the 'ofproto_packet_out_finish()' in two parts.
> First one that translates side-effects and requires holding 'ofproto_mutex'
> and the second that only pushes packet to datapath.  The second part moved
> out of 'ofproto_mutex' because 'ofproto_mutex' is not required and actually
> should not be taken in order to avoid recursive locking.
> 
> Reported-by: Anil Kumar Koli <anilkumar.k at altencalsoftlabs.com>
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-April/048494.html
> Signed-off-by: Ilya Maximets <i.maximets at ovn.org>
> ---
> 
> Previous discussion about implementation:
> https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/362082.html
> 
> I'm not able to reproduce the original issue, so I can not test if
> this patch fixes it.
> 
>  ofproto/ofproto-dpif.c     | 40 ++++++++++++++++++++++++++++----------
>  ofproto/ofproto-provider.h | 12 +++++++++---
>  ofproto/ofproto.c          | 29 +++++++++++++++++++++++----
>  3 files changed, 64 insertions(+), 17 deletions(-)


Hi Ben,

Could you, please, take a look at this patch?

Best regards, Ilya Maximets.



More information about the dev mailing list