1. 21 3月, 2023 1 次提交
  2. 22 2月, 2021 1 次提交
  3. 16 11月, 2020 1 次提交
    • T
      tipc: fix memory leak in service subscripting · c77a9ea9
      Tuong Lien 提交于
      stable inclusion
      from linux-4.19.149
      commit 6b3ea3aa6c675b65b6b068f5726c93abc8a4b460
      
      --------------------------------
      
      [ Upstream commit 0771d7df ]
      
      Upon receipt of a service subscription request from user via a topology
      connection, one 'sub' object will be allocated in kernel, so it will be
      able to send an event of the service if any to the user correspondingly
      then. Also, in case of any failure, the connection will be shutdown and
      all the pertaining 'sub' objects will be freed.
      
      However, there is a race condition as follows resulting in memory leak:
      
             receive-work       connection        send-work
                    |                |                |
              sub-1 |<------//-------|                |
              sub-2 |<------//-------|                |
                    |                |<---------------| evt for sub-x
              sub-3 |<------//-------|                |
                    :                :                :
                    :                :                :
                    |       /--------|                |
                    |       |        * peer closed    |
                    |       |        |                |
                    |       |        |<-------X-------| evt for sub-y
                    |       |        |<===============|
              sub-n |<------/        X    shutdown    |
          -> orphan |                                 |
      
      That is, the 'receive-work' may get the last subscription request while
      the 'send-work' is shutting down the connection due to peer close.
      
      We had a 'lock' on the connection, so the two actions cannot be carried
      out simultaneously. If the last subscription is allocated e.g. 'sub-n',
      before the 'send-work' closes the connection, there will be no issue at
      all, the 'sub' objects will be freed. In contrast the last subscription
      will become orphan since the connection was closed, and we released all
      references.
      
      This commit fixes the issue by simply adding one test if the connection
      remains in 'connected' state right after we obtain the connection lock,
      then a subscription object can be created as usual, otherwise we ignore
      it.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Acked-by: NJon Maloy <jmaloy@redhat.com>
      Reported-by: NThang Ngo <thang.h.ngo@dektech.com.au>
      Signed-off-by: NTuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: NAichun Li <liaichun@huawei.com>
      Reviewed-by: Nwangxiaopeng <wangxiaopeng7@huawei.com>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      c77a9ea9
  4. 22 9月, 2020 1 次提交
  5. 27 12月, 2019 3 次提交
    • J
      tipc: fix modprobe tipc failed after switch order of device registration · e373fab8
      Junwei Hu 提交于
      commit 526f5b85 upstream.
      
      Error message printed:
      modprobe: ERROR: could not insert 'tipc': Address family not
      supported by protocol.
      when modprobe tipc after the following patch: switch order of
      device registration, commit 7e27e8d6
      ("tipc: switch order of device registration to fix a crash")
      
      Because sock_create_kern(net, AF_TIPC, ...) called by
      tipc_topsrv_create_listener() in the initialization process
      of tipc_init_net(), so tipc_socket_init() must be execute before that.
      Meanwhile, tipc_net_id need to be initialized when sock_create()
      called, and tipc_socket_init() is no need to be called for each namespace.
      
      I add a variable tipc_topsrv_net_ops, and split the
      register_pernet_subsys() of tipc into two parts, and split
      tipc_socket_init() with initialization of pernet params.
      
      By the way, I fixed resources rollback error when tipc_bcast_init()
      failed in tipc_init_net().
      
      Fixes: 7e27e8d6 ("tipc: switch order of device registration to fix a crash")
      Signed-off-by: NJunwei Hu <hujunwei4@huawei.com>
      Reported-by: NWang Wang <wangwang2@huawei.com>
      Reported-by: syzbot+1e8114b61079bfe9cbc5@syzkaller.appspotmail.com
      Reviewed-by: NKang Zhou <zhoukang7@huawei.com>
      Reviewed-by: NSuanming Mou <mousuanming@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      e373fab8
    • E
      tipc: fix cancellation of topology subscriptions · b8d954fb
      Erik Hugne 提交于
      [ Upstream commit 33872d79 ]
      
      When cancelling a subscription, we have to clear the cancel bit in the
      request before iterating over any established subscriptions with memcmp.
      Otherwise no subscription will ever be found, and it will not be
      possible to explicitly unsubscribe individual subscriptions.
      
      Fixes: 8985ecc7 ("tipc: simplify endianness handling in topology subscriber")
      Signed-off-by: NErik Hugne <erik.hugne@gmail.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      b8d954fb
    • Y
      tipc: fix uninit-value in in tipc_conn_rcv_sub · f2ac24ff
      Ying Xue 提交于
      commit a88289f4 upstream.
      
      syzbot reported:
      
      BUG: KMSAN: uninit-value in tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373
      CPU: 0 PID: 66 Comm: kworker/u4:4 Not tainted 4.17.0-rc3+ #88
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: tipc_rcv tipc_conn_recv_work
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
       tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373
       tipc_conn_rcv_from_sock net/tipc/topsrv.c:409 [inline]
       tipc_conn_recv_work+0x3cd/0x560 net/tipc/topsrv.c:424
       process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145
       worker_thread+0x113c/0x24f0 kernel/workqueue.c:2279
       kthread+0x539/0x720 kernel/kthread.c:239
       ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:412
      
      Local variable description: ----s.i@tipc_conn_recv_work
      Variable was created at:
       tipc_conn_recv_work+0x65/0x560 net/tipc/topsrv.c:419
       process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145
      
      In tipc_conn_rcv_from_sock(), it always supposes the length of message
      received from sock_recvmsg() is not smaller than the size of struct
      tipc_subscr. However, this assumption is false. Especially when the
      length of received message is shorter than struct tipc_subscr size,
      we will end up touching uninitialized fields in tipc_conn_rcv_sub().
      
      Reported-by: syzbot+8951a3065ee7fd6d6e23@syzkaller.appspotmail.com
      Reported-by: syzbot+75e6e042c5bbf691fc82@syzkaller.appspotmail.com
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      f2ac24ff
  6. 08 12月, 2018 1 次提交
  7. 04 9月, 2018 1 次提交
  8. 20 2月, 2018 2 次提交
    • P
      tipc: don't call sock_release() in atomic context · 26736a08
      Paolo Abeni 提交于
      syzbot reported a scheduling while atomic issue at netns
      destruction time:
      
      BUG: sleeping function called from invalid context at net/core/sock.c:2769
      in_atomic(): 1, irqs_disabled(): 0, pid: 85, name: kworker/u4:3
      5 locks held by kworker/u4:3/85:
        #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000c9792deb>]
      process_one_work+0xaaf/0x1af0 kernel/workqueue.c:2084
        #1:  (net_cleanup_work){+.+.}, at: [<00000000adc12e2a>]
      process_one_work+0xb01/0x1af0 kernel/workqueue.c:2088
        #2:  (net_sem){++++}, at: [<000000009ccb5669>] cleanup_net+0x23f/0xd20
      net/core/net_namespace.c:494
        #3:  (net_mutex){+.+.}, at: [<00000000a92767d9>] cleanup_net+0xa7d/0xd20
      net/core/net_namespace.c:496
        #4:  (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>]
      spin_lock_bh include/linux/spinlock.h:315 [inline]
        #4:  (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>]
      tipc_topsrv_stop+0x231/0x610 net/tipc/topsrv.c:685
      CPU: 0 PID: 85 Comm: kworker/u4:3 Not tainted 4.16.0-rc1+ #230
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Workqueue: netns cleanup_net
      Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:53
        ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6128
        __might_sleep+0x95/0x190 kernel/sched/core.c:6081
        lock_sock_nested+0x37/0x110 net/core/sock.c:2769
        lock_sock include/net/sock.h:1463 [inline]
        tipc_release+0x103/0xff0 net/tipc/socket.c:572
        sock_release+0x8d/0x1e0 net/socket.c:594
        tipc_topsrv_stop+0x3c0/0x610 net/tipc/topsrv.c:696
        tipc_exit_net+0x15/0x40 net/tipc/core.c:96
        ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:148
        cleanup_net+0x6ba/0xd20 net/core/net_namespace.c:529
        process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
        worker_thread+0x223/0x1990 kernel/workqueue.c:2247
        kthread+0x33c/0x400 kernel/kthread.c:238
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429
      
      This is caused by tipc_topsrv_stop() releasing the listener socket
      with the idr lock held. This changeset addresses the issue moving
      the release operation outside such lock.
      
      Reported-and-tested-by: syzbot+749d9d87c294c00ca856@syzkaller.appspotmail.com
      Fixes: 0ef897be ("tipc: separate topology server listener socket from subcsriber sockets")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Acked-by:  ///jon
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      26736a08
    • J
      tipc: fix bug on error path in tipc_topsrv_kern_subscr() · 96c252bf
      Jon Maloy 提交于
      In commit cc1ea9ffadf7 ("tipc: eliminate struct tipc_subscriber") we
      re-introduced an old bug on the error path in the function
      tipc_topsrv_kern_subscr(). We now re-introduce the correction too.
      
      Reported-by: syzbot+f62e0f2a0ef578703946@syzkaller.appspotmail.com
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      96c252bf
  9. 17 2月, 2018 10 次提交
  10. 17 1月, 2018 1 次提交
    • J
      tipc: fix race condition at topology server receive · e88f2be8
      Jon Maloy 提交于
      We have identified a race condition during reception of socket
      events and messages in the topology server.
      
      - The function tipc_close_conn() is releasing the corresponding
        struct tipc_subscriber instance without considering that there
        may still be items in the receive work queue. When those are
        scheduled, in the function tipc_receive_from_work(), they are
        using the subscriber pointer stored in struct tipc_conn, without
        first checking if this is valid or not. This will sometimes
        lead to crashes, as the next call of tipc_conn_recvmsg() will
        access the now deleted item.
        We fix this by making the usage of this pointer conditional on
        whether the connection is active or not. I.e., we check the condition
        test_bit(CF_CONNECTED) before making the call tipc_conn_recvmsg().
      
      - Since the two functions may be running on different cores, the
        condition test described above is not enough. tipc_close_conn()
        may come in between and delete the subscriber item after the condition
        test is done, but before tipc_conn_recv_msg() is finished. This
        happens less frequently than the problem described above, but leads
        to the same symptoms.
      
        We fix this by using the existing sk_callback_lock for mutual
        exclusion in the two functions. In addition, we have to move
        a call to tipc_conn_terminate() outside the mentioned lock to
        avoid deadlock.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e88f2be8
  11. 10 1月, 2018 2 次提交
    • J
      tipc: improve groupcast scope handling · 232d07b7
      Jon Maloy 提交于
      When a member joins a group, it also indicates a binding scope. This
      makes it possible to create both node local groups, invisible to other
      nodes, as well as cluster global groups, visible everywhere.
      
      In order to avoid that different members end up having permanently
      differing views of group size and memberhip, we must inhibit locally
      and globally bound members from joining the same group.
      
      We do this by using the binding scope as an additional separator between
      groups. I.e., a member must ignore all membership events from sockets
      using a different scope than itself, and all lookups for message
      destinations must require an exact match between the message's lookup
      scope and the potential target's binding scope.
      
      Apart from making it possible to create local groups using the same
      identity on different nodes, a side effect of this is that it now also
      becomes possible to create a cluster global group with the same identity
      across the same nodes, without interfering with the local groups.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      232d07b7
    • J
      tipc: add option to suppress PUBLISH events for pre-existing publications · 8348500f
      Jon Maloy 提交于
      Currently, when a user is subscribing for binding table publications,
      he will receive a PUBLISH event for all already existing matching items
      in the binding table.
      
      However, a group socket making a subscriptions doesn't need this initial
      status update from the binding table, because it has already scanned it
      during the join operation. Worse, the multiplicatory effect of issuing
      mutual events for dozens or hundreds group members within a short time
      frame put a heavy load on the topology server, with the end result that
      scale out operations on a big group tend to take much longer than needed.
      
      We now add a new filter option, TIPC_SUB_NO_STATUS, for topology server
      subscriptions, so that this initial avalanche of events is suppressed.
      This change, along with the previous commit, significantly improves the
      range and speed of group scale out operations.
      
      We keep the new option internal for the tipc driver, at least for now.
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8348500f
  12. 06 12月, 2017 2 次提交
    • J
      tipc: fix memory leak in tipc_accept_from_sock() · a7d5f107
      Jon Maloy 提交于
      When the function tipc_accept_from_sock() fails to create an instance of
      struct tipc_subscriber it omits to free the already created instance of
      struct tipc_conn instance before it returns.
      
      We fix that with this commit.
      Reported-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7d5f107
    • C
      tipc: fix a null pointer deref on error path · 672ecbe1
      Cong Wang 提交于
      In tipc_topsrv_kern_subscr() when s->tipc_conn_new() fails
      we call tipc_close_conn() to clean up, but in this case
      calling conn_put() is just enough.
      
      This fixes the folllowing crash:
      
       kasan: GPF could be caused by NULL-ptr deref or user memory access
       general protection fault: 0000 [#1] SMP KASAN
       Dumping ftrace buffer:
          (ftrace buffer empty)
       Modules linked in:
       CPU: 0 PID: 3085 Comm: syzkaller064164 Not tainted 4.15.0-rc1+ #137
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
       task: 00000000c24413a5 task.stack: 000000005e8160b5
       RIP: 0010:__lock_acquire+0xd55/0x47f0 kernel/locking/lockdep.c:3378
       RSP: 0018:ffff8801cb5474a8 EFLAGS: 00010002
       RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
       RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff85ecb400
       RBP: ffff8801cb547830 R08: 0000000000000001 R09: 0000000000000000
       R10: 0000000000000000 R11: ffffffff87489d60 R12: ffff8801cd2980c0
       R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000020
       FS:  00000000014ee880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007ffee2426e40 CR3: 00000001cb85a000 CR4: 00000000001406f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:4004
        __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
        _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:175
        spin_lock_bh include/linux/spinlock.h:320 [inline]
        tipc_subscrb_subscrp_delete+0x8f/0x470 net/tipc/subscr.c:201
        tipc_subscrb_delete net/tipc/subscr.c:238 [inline]
        tipc_subscrb_release_cb+0x17/0x30 net/tipc/subscr.c:316
        tipc_close_conn+0x171/0x270 net/tipc/server.c:204
        tipc_topsrv_kern_subscr+0x724/0x810 net/tipc/server.c:514
        tipc_group_create+0x702/0x9c0 net/tipc/group.c:184
        tipc_sk_join net/tipc/socket.c:2747 [inline]
        tipc_setsockopt+0x249/0xc10 net/tipc/socket.c:2861
        SYSC_setsockopt net/socket.c:1851 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1830
        entry_SYSCALL_64_fastpath+0x1f/0x96
      
      Fixes: 14c04493 ("tipc: add ability to order and receive topology events in driver")
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      672ecbe1
  13. 03 12月, 2017 1 次提交
  14. 20 10月, 2017 1 次提交
  15. 13 10月, 2017 1 次提交
  16. 25 1月, 2017 4 次提交
  17. 27 6月, 2016 1 次提交
  18. 20 5月, 2016 1 次提交
  19. 15 4月, 2016 1 次提交
    • P
      tipc: fix a race condition leading to subscriber refcnt bug · 333f7962
      Parthasarathy Bhuvaragan 提交于
      Until now, the requests sent to topology server are queued
      to a workqueue by the generic server framework.
      These messages are processed by worker threads and trigger the
      registered callbacks.
      To reduce latency on uniprocessor systems, explicit rescheduling
      is performed using cond_resched() after MAX_RECV_MSG_COUNT(25)
      messages.
      
      This implementation on SMP systems leads to an subscriber refcnt
      error as described below:
      When a worker thread yields by calling cond_resched() in a SMP
      system, a new worker is created on another CPU to process the
      pending workitem. Sometimes the sleeping thread wakes up before
      the new thread finishes execution.
      This breaks the assumption on ordering and being single threaded.
      The fault is more frequent when MAX_RECV_MSG_COUNT is lowered.
      
      If the first thread was processing subscription create and the
      second thread processing close(), the close request will free
      the subscriber and the create request oops as follows:
      
      [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380         [tipc]
      [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97
      [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc]
      [...]
      [31.228377] Call Trace:
      [31.228377]  [<ffffffff812fbb6b>] dump_stack+0x4d/0x72
      [31.228377]  [<ffffffff8105a311>] __warn+0xd1/0xf0
      [31.228377]  [<ffffffff8105a3fd>] warn_slowpath_null+0x1d/0x20
      [31.228377]  [<ffffffffa0098067>] tipc_subscrb_rcv_cb+0x317/0x380 [tipc]
      [31.228377]  [<ffffffffa00a4984>] tipc_receive_from_sock+0xd4/0x130 [tipc]
      [31.228377]  [<ffffffffa00a439b>] tipc_recv_work+0x2b/0x50 [tipc]
      [31.228377]  [<ffffffff81071925>] process_one_work+0x145/0x3d0
      [31.246554] ---[ end trace c3882c9baa05a4fd ]---
      [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266
      [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428
      [31.249323] IP: [<ffffffff81099d0c>] spin_dump+0x5c/0xe0
      [31.249323] PGD 0
      [31.249323] Oops: 0000 [#1] SMP
      
      In this commit, we
      - rename tipc_conn_shutdown() to tipc_conn_release().
      - move connection release callback execution from tipc_close_conn()
        to a new function tipc_sock_release(), which is executed before
        we free the connection.
      Thus we release the subscriber during connection release procedure
      rather than connection shutdown procedure.
      Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      333f7962
  20. 06 2月, 2016 1 次提交
  21. 15 5月, 2015 1 次提交
  22. 05 5月, 2015 1 次提交
  23. 23 4月, 2015 1 次提交
    • Y
      tipc: fix topology server broken issue · def81f69
      Ying Xue 提交于
      When a new topology server is launched in a new namespace, its
      listening socket is inserted into the "init ns" namespace's socket
      hash table rather than the one owned by the new namespace. Although
      the socket's namespace is forcedly changed to the new namespace later,
      the socket is still stored in the socket hash table of "init ns"
      namespace. When a client created in the new namespace connects
      its own topology server, the connection is failed as its server's
      socket could not be found from its own namespace's socket table.
      
      If __sock_create() instead of original sock_create_kern() is used
      to create the server's socket through specifying an expected namesapce,
      the socket will be inserted into the specified namespace's socket
      table, thereby avoiding to the topology server broken issue.
      
      Fixes: 76100a8a ("tipc: fix netns refcnt leak")
      Reported-by: NErik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      def81f69