• D
    net/smc: fix sk_refcnt underflow on linkdown and fallback · e5d5aadc
    Dust Li 提交于
    We got the following WARNING when running ab/nginx
    test with RDMA link flapping (up-down-up).
    The reason is when smc_sock fallback and at linkdown
    happens simultaneously, we may got the following situation:
    
    __smc_lgr_terminate()
     --> smc_conn_kill()
        --> smc_close_active_abort()
               smc_sock->sk_state = SMC_CLOSED
               sock_put(smc_sock)
    
    smc_sock was set to SMC_CLOSED and sock_put() been called
    when terminate the link group. But later application call
    close() on the socket, then we got:
    
    __smc_release():
        if (smc_sock->fallback)
            smc_sock->sk_state = SMC_CLOSED
            sock_put(smc_sock)
    
    Again we set the smc_sock to CLOSED through it's already
    in CLOSED state, and double put the refcnt, so the following
    warning happens:
    
    refcount_t: underflow; use-after-free.
    WARNING: CPU: 5 PID: 860 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
    Modules linked in:
    CPU: 5 PID: 860 Comm: nginx Not tainted 5.10.46+ #403
    Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014
    RIP: 0010:refcount_warn_saturate+0x8d/0xf0
    Code: 05 5c 1e b5 01 01 e8 52 25 bc ff 0f 0b c3 80 3d 4f 1e b5 01 00 75 ad 48
    
    RSP: 0018:ffffc90000527e50 EFLAGS: 00010286
    RAX: 0000000000000026 RBX: ffff8881300df2c0 RCX: 0000000000000027
    RDX: 0000000000000000 RSI: ffff88813bd58040 RDI: ffff88813bd58048
    RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000001
    R10: ffff8881300df2c0 R11: ffffc90000527c78 R12: ffff8881300df340
    R13: ffff8881300df930 R14: ffff88810b3dad80 R15: ffff8881300df4f8
    FS:  00007f739de8fb80(0000) GS:ffff88813bd40000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000000a01b008 CR3: 0000000111b64003 CR4: 00000000003706e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     smc_release+0x353/0x3f0
     __sock_release+0x3d/0xb0
     sock_close+0x11/0x20
     __fput+0x93/0x230
     task_work_run+0x65/0xa0
     exit_to_user_mode_prepare+0xf9/0x100
     syscall_exit_to_user_mode+0x27/0x190
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    
    This patch adds check in __smc_release() to make
    sure we won't do an extra sock_put() and set the
    socket to CLOSED when its already in CLOSED state.
    
    Fixes: 51f1de79 (net/smc: replace sock_put worker by socket refcounting)
    Signed-off-by: NDust Li <dust.li@linux.alibaba.com>
    Reviewed-by: NTony Lu <tonylu@linux.alibaba.com>
    Signed-off-by: NDust Li <dust.li@linux.alibaba.com>
    Acked-by: NKarsten Graul <kgraul@linux.ibm.com>
    Signed-off-by: NDavid S. Miller <davem@davemloft.net>
    e5d5aadc
af_smc.c 72.9 KB