1. 06 Dec, 2018 1 commit
    • J
      tipc: fix lockdep warning during node delete · 4e3fbd74
      Jon Maloy committed
      [ Upstream commit ec835f891232d7763dea9da0358f31e24ca6dfb7 ]
      
      We see the following lockdep warning:
      
      [ 2284.078521] ======================================================
      [ 2284.078604] WARNING: possible circular locking dependency detected
      [ 2284.078604] 4.19.0+ #42 Tainted: G            E
      [ 2284.078604] ------------------------------------------------------
      [ 2284.078604] rmmod/254 is trying to acquire lock:
      [ 2284.078604] 00000000acd94e28 ((&n->timer)#2){+.-.}, at: del_timer_sync+0x5/0xa0
      [ 2284.078604]
      [ 2284.078604] but task is already holding lock:
      [ 2284.078604] 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x190 [tipc]
      [ 2284.078604]
      [ 2284.078604] which lock already depends on the new lock.
      [ 2284.078604]
      [ 2284.078604]
      [ 2284.078604] the existing dependency chain (in reverse order) is:
      [ 2284.078604]
      [ 2284.078604] -> #1 (&(&tn->node_list_lock)->rlock){+.-.}:
      [ 2284.078604]        tipc_node_timeout+0x20a/0x330 [tipc]
      [ 2284.078604]        call_timer_fn+0xa1/0x280
      [ 2284.078604]        run_timer_softirq+0x1f2/0x4d0
      [ 2284.078604]        __do_softirq+0xfc/0x413
      [ 2284.078604]        irq_exit+0xb5/0xc0
      [ 2284.078604]        smp_apic_timer_interrupt+0xac/0x210
      [ 2284.078604]        apic_timer_interrupt+0xf/0x20
      [ 2284.078604]        default_idle+0x1c/0x140
      [ 2284.078604]        do_idle+0x1bc/0x280
      [ 2284.078604]        cpu_startup_entry+0x19/0x20
      [ 2284.078604]        start_secondary+0x187/0x1c0
      [ 2284.078604]        secondary_startup_64+0xa4/0xb0
      [ 2284.078604]
      [ 2284.078604] -> #0 ((&n->timer)#2){+.-.}:
      [ 2284.078604]        del_timer_sync+0x34/0xa0
      [ 2284.078604]        tipc_node_delete+0x1a/0x40 [tipc]
      [ 2284.078604]        tipc_node_stop+0xcb/0x190 [tipc]
      [ 2284.078604]        tipc_net_stop+0x154/0x170 [tipc]
      [ 2284.078604]        tipc_exit_net+0x16/0x30 [tipc]
      [ 2284.078604]        ops_exit_list.isra.8+0x36/0x70
      [ 2284.078604]        unregister_pernet_operations+0x87/0xd0
      [ 2284.078604]        unregister_pernet_subsys+0x1d/0x30
      [ 2284.078604]        tipc_exit+0x11/0x6f2 [tipc]
      [ 2284.078604]        __x64_sys_delete_module+0x1df/0x240
      [ 2284.078604]        do_syscall_64+0x66/0x460
      [ 2284.078604]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 2284.078604]
      [ 2284.078604] other info that might help us debug this:
      [ 2284.078604]
      [ 2284.078604]  Possible unsafe locking scenario:
      [ 2284.078604]
      [ 2284.078604]        CPU0                    CPU1
      [ 2284.078604]        ----                    ----
      [ 2284.078604]   lock(&(&tn->node_list_lock)->rlock);
      [ 2284.078604]                                lock((&n->timer)#2);
      [ 2284.078604]                                lock(&(&tn->node_list_lock)->rlock);
      [ 2284.078604]   lock((&n->timer)#2);
      [ 2284.078604]
      [ 2284.078604]  *** DEADLOCK ***
      [ 2284.078604]
      [ 2284.078604] 3 locks held by rmmod/254:
      [ 2284.078604]  #0: 000000003368be9b (pernet_ops_rwsem){+.+.}, at: unregister_pernet_subsys+0x15/0x30
      [ 2284.078604]  #1: 0000000046ed9c86 (rtnl_mutex){+.+.}, at: tipc_net_stop+0x144/0x170 [tipc]
      [ 2284.078604]  #2: 00000000f997afc0 (&(&tn->node_list_lock)->rlock){+.-.}, at: tipc_node_stop+0xac/0x19
      [...]
      
      The reason is that the node timer handler sometimes needs to delete a
      node which has been disconnected for too long. To do this, it grabs
      the lock 'node_list_lock', which may at the same time be held by the
      generic node cleanup function, tipc_node_stop(), during module removal.
      Since the latter is calling del_timer_sync() inside the same lock, we
      have a potential deadlock.
      
      We fix this by letting the timer cleanup function use spin_trylock()
      instead of just spin_lock(). When it fails to grab the lock it simply
      returns, so that the timer handler can terminate its execution.
      This is safe to do, since tipc_node_stop() is about to delete both
      the timer and the node instance anyway.
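
The trylock back-off described above can be illustrated in userspace. This is only a minimal sketch, assuming a C11 `atomic_flag` as a stand-in for the kernel spinlock; `lock_try()`, `timer_cleanup()`, `node_stop()` and `node_deleted` are hypothetical names for illustration, not the actual TIPC code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for tn->node_list_lock. */
static atomic_flag node_list_lock = ATOMIC_FLAG_INIT;
static bool node_deleted;

/* Try to take the lock once; true on success, false if already held. */
static bool lock_try(atomic_flag *l)
{
    return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

static void lock_release(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Simulated timer-side cleanup. Returns true if the node was removed
 * here, false if the lock was busy and cleanup is left to the holder.
 */
static bool timer_cleanup(void)
{
    if (!lock_try(&node_list_lock))
        return false;           /* back off instead of waiting */
    node_deleted = true;
    lock_release(&node_list_lock);
    return true;
}

/* Simulated module-removal path: holds the lock while tearing down. */
static void node_stop(void)
{
    while (!lock_try(&node_list_lock))
        ;                       /* spin until acquired */
    /* A timer firing now must back off rather than wait for us. */
    assert(!timer_cleanup());
    node_deleted = true;
    lock_release(&node_list_lock);
}
```

Because the timer side never waits for the lock, the removal path can wait for the timer to finish without the two blocking each other.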
      
      Fixes: 6a939f36 ("tipc: Auto removal of peer down node instance")
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 23 Nov, 2018 3 commits
    • J
      tipc: fix link re-establish failure · 961842dc
      Jon Maloy committed
      [ Upstream commit 7ab412d33b4c7ff3e0148d3db25dd861edd1283d ]
      
      When a link failure is detected locally, the link is reset, the flag
      link->in_session is set to false, and a RESET_MSG with the 'stopping'
      bit set is sent to the peer.
      
      The purpose of this bit is to inform the peer that this endpoint is
      going down, and that the peer should handle the reception of this
      particular RESET message as a local failure. This forces the peer to
      accept another RESET or ACTIVATE message from this endpoint before it
      can re-establish the link. This in turn is necessary to ensure that
      link session numbers are properly exchanged before the link comes up
      again.
      
      If the peer endpoint detects a failure locally at the same time, it
      will do the same, which is also correct behavior.
      
      However, when receiving such messages, the endpoints will not
      distinguish between 'stopping' RESETs and ordinary ones when it comes
      to updating session numbers. Both endpoints will copy the received
      session number and set their 'in_session' flags to true at the
      reception, while they are still expecting another RESET from the
      peer before they can go ahead and re-establish. This is contradictory,
      since, after applying the validation check referred to below, the
      'in_session' flag will cause rejection of all such messages, and the
      link will never come up again.
      
      We now fix this by not only handling received RESET/STOPPING messages
      as a local failure, but also by omitting to set a new session number
      and the 'in_session' flag in such cases.
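
The rule described above can be sketched as a small state update. This is a heavily simplified illustration, not the actual link state machine: `struct link_ep`, its fields and `rcv_reset()` are hypothetical names, and only the handling of the session number and the 'in_session' flag is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified link endpoint state. */
struct link_ep {
    uint16_t peer_session;  /* last session number accepted from peer */
    bool in_session;        /* true once peer_session is valid */
    bool reset_pending;     /* true while another RESET/ACTIVATE is awaited */
};

/* Handle a received RESET message carrying session number 'sess'. */
static void rcv_reset(struct link_ep *l, uint16_t sess, bool stopping)
{
    assert(l != NULL);
    if (stopping) {
        /*
         * Treat like a local failure: do not adopt the session
         * number and do not mark the session as valid.
         */
        l->in_session = false;
        l->reset_pending = true;
        return;
    }
    /* Ordinary RESET: adopt the peer's session number. */
    l->peer_session = sess;
    l->in_session = true;
    l->reset_pending = false;
}
```

With this split, a later ordinary RESET from the peer is not rejected by the session check, because 'in_session' was never set by the 'stopping' one.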
      
      Fixes: 7ea817f4 ("tipc: check session number before accepting link protocol messages")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      tipc: fix lockdep warning when reinitilaizing sockets · ce209966
      Jon Maloy committed
      [ Upstream commit adba75be0d23cca92a028749d92c60c8909bbdb3 ]
      
      We get the following warning:
      
      [   47.926140] 32-bit node address hash set to 2010a0a
      [   47.927202]
      [   47.927433] ================================
      [   47.928050] WARNING: inconsistent lock state
      [   47.928661] 4.19.0+ #37 Tainted: G            E
      [   47.929346] --------------------------------
      [   47.929954] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [   47.930116] swapper/3/0 [HC0[0]:SC1[3]:HE1:SE0] takes:
      [   47.930116] 00000000af8bc31e (&(&ht->lock)->rlock){+.?.}, at: rhashtable_walk_enter+0x36/0xb0
      [   47.930116] {SOFTIRQ-ON-W} state was registered at:
      [   47.930116]   _raw_spin_lock+0x29/0x60
      [   47.930116]   rht_deferred_worker+0x556/0x810
      [   47.930116]   process_one_work+0x1f5/0x540
      [   47.930116]   worker_thread+0x64/0x3e0
      [   47.930116]   kthread+0x112/0x150
      [   47.930116]   ret_from_fork+0x3a/0x50
      [   47.930116] irq event stamp: 14044
      [   47.930116] hardirqs last  enabled at (14044): [<ffffffff9a07fbba>] __local_bh_enable_ip+0x7a/0xf0
      [   47.938117] hardirqs last disabled at (14043): [<ffffffff9a07fb81>] __local_bh_enable_ip+0x41/0xf0
      [   47.938117] softirqs last  enabled at (14028): [<ffffffff9a0803ee>] irq_enter+0x5e/0x60
      [   47.938117] softirqs last disabled at (14029): [<ffffffff9a0804a5>] irq_exit+0xb5/0xc0
      [   47.938117]
      [   47.938117] other info that might help us debug this:
      [   47.938117]  Possible unsafe locking scenario:
      [   47.938117]
      [   47.938117]        CPU0
      [   47.938117]        ----
      [   47.938117]   lock(&(&ht->lock)->rlock);
      [   47.938117]   <Interrupt>
      [   47.938117]     lock(&(&ht->lock)->rlock);
      [   47.938117]
      [   47.938117]  *** DEADLOCK ***
      [   47.938117]
      [   47.938117] 2 locks held by swapper/3/0:
      [   47.938117]  #0: 0000000062c64f90 ((&d->timer)){+.-.}, at: call_timer_fn+0x5/0x280
      [   47.938117]  #1: 00000000ee39619c (&(&d->lock)->rlock){+.-.}, at: tipc_disc_timeout+0xc8/0x540 [tipc]
      [   47.938117]
      [   47.938117] stack backtrace:
      [   47.938117] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E     4.19.0+ #37
      [   47.938117] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [   47.938117] Call Trace:
      [   47.938117]  <IRQ>
      [   47.938117]  dump_stack+0x5e/0x8b
      [   47.938117]  print_usage_bug+0x1ed/0x1ff
      [   47.938117]  mark_lock+0x5b5/0x630
      [   47.938117]  __lock_acquire+0x4c0/0x18f0
      [   47.938117]  ? lock_acquire+0xa6/0x180
      [   47.938117]  lock_acquire+0xa6/0x180
      [   47.938117]  ? rhashtable_walk_enter+0x36/0xb0
      [   47.938117]  _raw_spin_lock+0x29/0x60
      [   47.938117]  ? rhashtable_walk_enter+0x36/0xb0
      [   47.938117]  rhashtable_walk_enter+0x36/0xb0
      [   47.938117]  tipc_sk_reinit+0xb0/0x410 [tipc]
      [   47.938117]  ? mark_held_locks+0x6f/0x90
      [   47.938117]  ? __local_bh_enable_ip+0x7a/0xf0
      [   47.938117]  ? lockdep_hardirqs_on+0x20/0x1a0
      [   47.938117]  tipc_net_finalize+0xbf/0x180 [tipc]
      [   47.938117]  tipc_disc_timeout+0x509/0x540 [tipc]
      [   47.938117]  ? call_timer_fn+0x5/0x280
      [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
      [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
      [   47.938117]  call_timer_fn+0xa1/0x280
      [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
      [   47.938117]  run_timer_softirq+0x1f2/0x4d0
      [   47.938117]  __do_softirq+0xfc/0x413
      [   47.938117]  irq_exit+0xb5/0xc0
      [   47.938117]  smp_apic_timer_interrupt+0xac/0x210
      [   47.938117]  apic_timer_interrupt+0xf/0x20
      [   47.938117]  </IRQ>
      [   47.938117] RIP: 0010:default_idle+0x1c/0x140
      [   47.938117] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 54 55 53 65 8b 2d d8 2b 74 65 0f 1f 44 00 00 e8 c6 2c 8b ff fb f4 <65> 8b 2d c5 2b 74 65 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 b4 2b
      [   47.938117] RSP: 0018:ffffaf6ac0207ec8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
      [   47.938117] RAX: ffff8f5b3735e200 RBX: 0000000000000003 RCX: 0000000000000001
      [   47.938117] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8f5b3735e200
      [   47.938117] RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000000
      [   47.938117] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      [   47.938117] R13: 0000000000000000 R14: ffff8f5b3735e200 R15: ffff8f5b3735e200
      [   47.938117]  ? default_idle+0x1a/0x140
      [   47.938117]  do_idle+0x1bc/0x280
      [   47.938117]  cpu_startup_entry+0x19/0x20
      [   47.938117]  start_secondary+0x187/0x1c0
      [   47.938117]  secondary_startup_64+0xa4/0xb0
      
      The reason seems to be that tipc_net_finalize()->tipc_sk_reinit() is
      calling the function rhashtable_walk_enter() within a timer interrupt.
      We fix this by executing tipc_net_finalize() in work queue context.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J
      tipc: don't assume linear buffer when reading ancillary data · aaf13772
      Jon Maloy committed
      [ Upstream commit 1c1274a56999fbdf9cf84e332b28448bb2d55221 ]
      
      The code for reading ancillary data from a received buffer assumes
      that the buffer is linear. To make this assumption true we have to
      linearize the buffer before message data is read.
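
What "linearizing" means can be shown with a userspace analogy. The sketch below is an assumption-laden illustration: `struct frag` and `linearize()` are hypothetical and stand in for a fragmented socket buffer and a kernel helper such as skb_linearize(); the commit message does not say which call the fix uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fragmented buffer: payload split over several chunks. */
struct frag {
    const unsigned char *data;
    size_t len;
};

/*
 * Copy all fragments into one contiguous allocation so that later code
 * may index the data directly. Returns NULL on allocation failure; the
 * caller frees the result.
 */
static unsigned char *linearize(const struct frag *frags, size_t nfrags,
                                size_t *total)
{
    size_t i, off = 0, sum = 0;
    unsigned char *flat;

    assert(frags != NULL && total != NULL);
    for (i = 0; i < nfrags; i++)
        sum += frags[i].len;

    flat = malloc(sum ? sum : 1);
    if (!flat)
        return NULL;
    for (i = 0; i < nfrags; i++) {
        memcpy(flat + off, frags[i].data, frags[i].len);
        off += frags[i].len;
    }
    *total = sum;
    return flat;
}
```

Reading a header field at a fixed offset is only safe on the flattened copy; on the fragmented form the offset may fall in a different chunk.
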
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 04 Nov, 2018 1 commit
  4. 19 Oct, 2018 1 commit
  5. 16 Oct, 2018 2 commits
    • T
      tipc: fix unsafe rcu locking when accessing publication list · d3092b2e
      Tung Nguyen committed
      The binding table's 'cluster_scope' list is rcu protected to handle
      races between threads changing the list and those traversing the list at
      the same moment. We have now found that the function named_distribute()
      uses the regular list_for_each() macro to traverse the said list.
      Likewise, the function tipc_named_withdraw() is removing items from the
      same list using the regular list_del() call. When these two functions
      execute in parallel we see occasional crashes.
      
      This commit fixes this by adding the missing _rcu() suffixes.
      Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: initialize broadcast link stale counter correctly · 4af00f4c
      Jon Maloy committed
      In the commit referred to below we added link tolerance as an additional
      criterion for declaring broadcast transmission "stale" and resetting the
      unicast links to the affected node.
      
      Unfortunately, this 'improvement' introduced two bugs. Each one alone
      causes only limited problems, but combined they lead to seemingly
      stochastic unicast link resets, depending on the amount of broadcast
      traffic transmitted.
      
      The first issue, a missing initialization of the 'tolerance' field of
      the receiver broadcast link, was recently fixed by commit 047491ea
      ("tipc: set link tolerance correctly in broadcast link").
      
      The second issue, where we omit to reset the 'stale_cnt' field of
      the same link after a 'stale' period is over, leads to this counter
      accumulating over time, and in the absence of the 'tolerance' criterion
      leads to the symptoms described above. This commit adds the missing
      initialization.
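
The effect of the missing reset can be sketched as follows. This is a simplified model under stated assumptions: `struct bc_rcv_link`, `retrans_request()` and the `STALE_LIMIT` value are hypothetical, and the 'tolerance' time criterion is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STALE_LIMIT 3   /* hypothetical threshold, chosen for the example */

/* Hypothetical, simplified receiver-side broadcast link state. */
struct bc_rcv_link {
    uint16_t stale_seqno;   /* packet currently being re-requested */
    unsigned int stale_cnt; /* repeated requests for that packet */
};

/*
 * Account for one retransmission request for packet 'seqno'.
 * Returns true once the same packet has been requested more than
 * STALE_LIMIT times in a row.
 */
static bool retrans_request(struct bc_rcv_link *l, uint16_t seqno)
{
    assert(l != NULL);
    if (seqno != l->stale_seqno) {
        /* A different packet: the previous stale period is over. */
        l->stale_seqno = seqno;
        l->stale_cnt = 0;   /* the kind of reset that was missing */
    }
    return ++l->stale_cnt > STALE_LIMIT;
}
```

Without the `stale_cnt = 0` line the counter would keep growing across unrelated packets and eventually cross the threshold on its own.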
      
      Fixes: a4dc70d4 ("tipc: extend link reset criteria for stale packet retransmission")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 12 Oct, 2018 1 commit
    • Y
      tipc: eliminate possible recursive locking detected by LOCKDEP · a1f8dd34
      Ying Xue committed
      When booting the kernel with the LOCKDEP option enabled, the following warning was found:
      
      WARNING: possible recursive locking detected
      4.19.0-rc7+ #14 Not tainted
      --------------------------------------------
      swapper/0/1 is trying to acquire lock:
      00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
      include/linux/spinlock.h:334 [inline]
      00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
      tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
      
      but task is already holding lock:
      00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
      include/linux/spinlock.h:334 [inline]
      00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
      tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&list->lock)->rlock#4);
        lock(&(&list->lock)->rlock#4);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by swapper/0/1:
       #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
      register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
       #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
      spin_lock_bh include/linux/spinlock.h:334 [inline]
       #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
      tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
      
      stack backtrace:
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1af/0x295 lib/dump_stack.c:113
       print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
       check_deadlock kernel/locking/lockdep.c:1803 [inline]
       validate_chain kernel/locking/lockdep.c:2399 [inline]
       __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
       lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
       __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
       _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
       spin_lock_bh include/linux/spinlock.h:334 [inline]
       tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
       tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
       tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
       tipc_init_net+0x472/0x610 net/tipc/core.c:82
       ops_init+0xf7/0x520 net/core/net_namespace.c:129
       __register_pernet_operations net/core/net_namespace.c:940 [inline]
       register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
       register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
       tipc_init+0x83/0x104 net/tipc/core.c:140
       do_one_initcall+0x109/0x70a init/main.c:885
       do_initcall_level init/main.c:953 [inline]
       do_initcalls init/main.c:961 [inline]
       do_basic_setup init/main.c:979 [inline]
       kernel_init_freeable+0x4bd/0x57f init/main.c:1144
       kernel_init+0x13/0x180 init/main.c:1063
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
      
      LOCKDEP complains because we hold l->wakeupq.lock and l->inputq->lock
      nested inside each other in the tipc_link_reset() function. In fact, it
      is unnecessary to move skb buffers from the l->wakeupq queue to the
      l->inputq queue while holding both locks at the same time. Instead, we
      can first move the skb buffers in the l->wakeupq queue to a temporary
      list, and then move the buffers from the temporary list to the
      l->inputq queue, which is equally safe.
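
The two-step move through a temporary list can be sketched in userspace. This is a minimal illustration, assuming C11 `atomic_flag` spinlocks and a hand-rolled singly linked queue; `struct queue`, `struct buf` and `splice_via_tmp()` are hypothetical and do not reproduce the kernel's sk_buff queue API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical buffer and queue types. */
struct buf {
    struct buf *next;
    int id;
};

struct queue {
    atomic_flag lock;
    struct buf *head, *tail;
};

static void q_lock(struct queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;   /* spin */
}

static void q_unlock(struct queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Move everything from 'src' to the tail of 'dst', one lock at a time. */
static void splice_via_tmp(struct queue *src, struct queue *dst)
{
    struct buf *tmp_head, *tmp_tail;

    assert(src != NULL && dst != NULL && src != dst);

    /* Step 1: detach the whole source list under the source lock only. */
    q_lock(src);
    tmp_head = src->head;
    tmp_tail = src->tail;
    src->head = src->tail = NULL;
    q_unlock(src);

    if (!tmp_head)
        return;

    /* Step 2: append the detached list under the destination lock only. */
    q_lock(dst);
    if (dst->tail)
        dst->tail->next = tmp_head;
    else
        dst->head = tmp_head;
    dst->tail = tmp_tail;
    q_unlock(dst);
}
```

Since the two locks are never held together, no ordering between them needs to be defined, which is what removes the nesting that a lock validator would flag.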
      
      Fixes: 3f32d0be ("tipc: lock wakeup & inputq at tipc_link_reset()")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 11 Oct, 2018 2 commits
  8. 02 Oct, 2018 1 commit
    • L
      tipc: ignore STATE_MSG on wrong link session · d949cfed
      LUU Duc Canh committed
      The initial session number when a link is created is based on a random
      value, taken from struct tipc_net->random. It is then incremented for
      each link reset to avoid mixing protocol messages from different link
      sessions.
      
      However, when a bearer is reset all its links are deleted, and will
      later be re-created using the same random value as the first time.
      This means that if the link never went down between creation and
      deletion we will still sometimes have two subsequent sessions with
      the same session number. In virtual environments with potentially
      long transmission times this has turned out to be a real problem.
      
      We now fix this by randomizing the session number each time a link
      is created.
      
      With a session number size of 16 bits this gives a risk of session
      collision of 1/64k. To reduce this further, we also introduce a sanity
      check on the very first STATE message arriving at a link. If this has
      an acknowledge value differing from 0, which is logically impossible,
      we ignore the message. The final risk for session collision is hence
      reduced to 1/4G, which should be sufficient.
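
The two figures quoted above follow from simple arithmetic. The worked form below assumes, which the message does not state, that the session number and the acknowledge value of a stale STATE message are both 16 bits wide, uniformly distributed and independent of each other:

```latex
\[
P(\text{session collision}) = \frac{1}{2^{16}} = \frac{1}{65536} = \frac{1}{64\mathrm{k}},
\qquad
P(\text{collision and } \mathit{ack} = 0)
  = \frac{1}{2^{16}} \cdot \frac{1}{2^{16}}
  = \frac{1}{2^{32}} = \frac{1}{4\mathrm{G}}.
\]
```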
      Signed-off-by: LUU Duc Canh <canh.d.luu@dektech.com.au>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 30 Sep, 2018 1 commit
    • L
      tipc: fix failover problem · c140eb16
      LUU Duc Canh committed
      We see the following scenario:
      1) Link endpoint B on node 1 discovers that its peer endpoint is gone.
         Since there is a second working link, failover procedure is started.
      2) Link endpoint A on node 1 sends a FAILOVER message to peer endpoint
         A on node 2. The node item 1->2 goes to state FAILINGOVER.
      3) Link endpoint A/2 receives the failover, and is supposed to take
         down its parallel link endpoint B/2, while producing a FAILOVER
         message to send back to A/1.
      4) However, B/2 has already been deleted, so no FAILOVER message can
         be created.
      5) Node 1->2 remains in state FAILINGOVER forever, refusing to receive
         any messages that can bring B/1 up again. We are left with a non-
         redundant link between node 1 and 2.
      
      We fix this by letting endpoint A/2 build a dummy FAILOVER message
      to send back to A/1, so that the situation can be resolved.
      Signed-off-by: LUU Duc Canh <canh.d.luu@dektech.com.au>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 26 Sep, 2018 3 commits
  11. 13 Sep, 2018 1 commit
  12. 07 Sep, 2018 1 commit
    • C
      tipc: call start and done ops directly in __tipc_nl_compat_dumpit() · 8f5c5fcf
      Cong Wang committed
      __tipc_nl_compat_dumpit() uses a netlink_callback on stack,
      so the only way to align it with other ->dumpit() call path
      is calling tipc_dump_start() and tipc_dump_done() directly
      inside it. Otherwise ->dumpit() would always get NULL from
      cb->args[].
      
      But tipc_dump_start() uses sock_net(cb->skb->sk) to retrieve the
      net pointer, and the cb->skb here doesn't have skb->sk set; the net
      pointer is saved in msg->net instead. So introduce a helper function
      __tipc_dump_start() to pass in msg->net.
      
      Ying pointed out that cb->args[0...3] are already used by other
      callbacks on this call path, so we can't use cb->args[0] any
      more; use cb->args[4] instead.
      
      Fixes: 9a07efa9 ("tipc: switch to rhashtable iterator")
      Reported-and-tested-by: syzbot+e93a2c41f91b8e2c7d9b@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 06 Sep, 2018 1 commit
  14. 04 Sep, 2018 2 commits
  15. 30 Aug, 2018 2 commits
    • C
      tipc: switch to rhashtable iterator · 9a07efa9
      Cong Wang committed
      syzbot reported a use-after-free in tipc_group_fill_sock_diag(),
      where tipc_group_fill_sock_diag() still reads tsk->group while
      tipc_group_delete() is deleting it in tipc_release().
      
      tipc_nl_sk_walk() aims to lock this sock when walking each sock
      in the hash table, to close race conditions with sock changes like
      this one, by acquiring the tsk->sk.sk_lock.slock spinlock.
      Unfortunately, this doesn't work at all. All non-BH call paths
      should take lock_sock() instead to make it work.
      
      tipc_nl_sk_walk() brutally iterates with raw rht_for_each_entry_rcu(),
      where the RCU read lock is required; this is the reason why lock_sock()
      can't be taken on this path. This can be resolved by switching to the
      rhashtable iterator APIs, where taking a sleepable lock is possible.
      Also, the iterator APIs are friendly to restartable calls like
      diag dump: the last position is remembered behind the scenes, and
      all we need to do here is save the iterator into cb->args[].
      
      I tested this with parallel tipc diag dump and thousands of tipc
      socket creation and release, no crash or memory leak.
      
      Reported-by: syzbot+b9c8f3ab2994b7cd1625@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • C
      tipc: fix a missing rhashtable_walk_exit() · bd583fe3
      Cong Wang committed
      rhashtable_walk_exit() must be paired with rhashtable_walk_enter().
      
      Fixes: 40f9f439 ("tipc: Fix tipc_sk_reinit race conditions")
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 28 Aug, 2018 1 commit
  17. 08 Aug, 2018 1 commit
    • Y
      tipc: fix an interrupt unsafe locking scenario · 37436d9c
      Ying Xue committed
      Commit 9faa89d4 ("tipc: make function tipc_net_finalize() thread
      safe") tries to make it thread safe to set node address, so it uses
      node_list_lock lock to serialize the whole process of setting node
      address in tipc_net_finalize(). But it causes the following interrupt
      unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        rht_deferred_worker()
        rhashtable_rehash_table()
        lock(&(&ht->lock)->rlock)
                                     tipc_nl_compat_doit()
                                     tipc_net_finalize()
                                     local_irq_disable();
                                     lock(&(&tn->node_list_lock)->rlock);
                                     tipc_sk_reinit()
                                     rhashtable_walk_enter()
                                     lock(&(&ht->lock)->rlock);
        <Interrupt>
        tipc_disc_rcv()
        tipc_node_check_dest()
        tipc_node_create()
        lock(&(&tn->node_list_lock)->rlock);
      
       *** DEADLOCK ***
      
      When rhashtable_rehash_table() holds ht->lock on CPU0, it doesn't
      disable BH. So if an interrupt happens after the lock, it can create
      an inverse lock ordering between ht->lock and tn->node_list_lock. As
      a consequence, deadlock might happen.
      
      The inverse lock ordering above arises because node_list_lock was
      not originally designed to serialize the setting of the node
      address.
      
      As cmpxchg() guarantees that the CAS (compare-and-swap) operation is
      atomic, we use it in place of node_list_lock to ensure that the node
      address is set atomically. As a result, the potential deadlock is
      avoided as well.
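
The set-once idea can be sketched with C11 atomics in userspace. This is only an analogy, assuming `atomic_compare_exchange_strong()` as a stand-in for the kernel's cmpxchg(); `own_addr` and `set_own_addr_once()` are hypothetical names, and 0 is assumed to mean "address not yet set".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the node's own address; 0 means "unset". */
static _Atomic uint32_t own_addr;

/*
 * Set the address exactly once. Returns true for the caller that won
 * the race and should go on with initialisation, false for all others.
 */
static bool set_own_addr_once(uint32_t addr)
{
    uint32_t expected = 0;

    assert(addr != 0);
    /* Succeeds only if own_addr still holds 'expected' (0). */
    return atomic_compare_exchange_strong(&own_addr, &expected, addr);
}
```

No lock is taken at any point, so this step cannot take part in a lock-ordering cycle.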
      
      Fixes: 9faa89d4 ("tipc: make function tipc_net_finalize() thread safe")
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <maloy@donjonn.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 02 Aug, 2018 1 commit
  19. 31 Jul, 2018 1 commit
  20. 28 Jul, 2018 2 commits
  21. 27 Jul, 2018 1 commit
  22. 22 Jul, 2018 1 commit
    • Y
      tipc: make some functions static · e064cce1
      YueHaibing committed
      Fixes the following sparse warnings:
      
      net/tipc/link.c:376:5: warning: symbol 'link_bc_rcv_gap' was not declared. Should it be static?
      net/tipc/link.c:823:6: warning: symbol 'link_prepare_wakeup' was not declared. Should it be static?
      net/tipc/link.c:959:6: warning: symbol 'tipc_link_advance_backlog' was not declared. Should it be static?
      net/tipc/link.c:1009:5: warning: symbol 'tipc_link_retrans' was not declared. Should it be static?
      net/tipc/monitor.c:687:5: warning: symbol '__tipc_nl_add_monitor_peer' was not declared. Should it be static?
      net/tipc/group.c:230:20: warning: symbol 'tipc_group_find_member' was not declared. Should it be static?
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 21 Jul, 2018 1 commit
    • J
      tipc: make link capability update thread safe · 40999f11
      Jon Maloy committed
      The commit referred to below introduced an update of the link
      capabilities field that is not safe. Given the recently added
      feature to remove idle node and link items after 5 minutes, there
      is a small risk that the update will happen at the very moment the
      targeted link is being removed. To avoid this we have to perform
      the update inside the node item's write lock protection.
      
      Fixes: 9012de50 ("tipc: add sequence number check for link STATE messages")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 19 Jul, 2018 2 commits
  25. 12 Jul, 2018 2 commits
    • J
      tipc: check session number before accepting link protocol messages · 7ea817f4
      Jon Maloy committed
      In some virtual environments we observe significantly more packet
      reordering and delays than we have traditionally been used to.
      
      This makes stricter checks necessary on the session number of incoming
      link protocol messages, which until now has only been validated for
      RESET messages.
      
      Since the other two message types, ACTIVATE and STATE messages, also
      carry this number, it is easy to extend the validation check to those
      messages.
      
      We also introduce a flag indicating if a link has a valid peer session
      number or not. This eliminates the mixing of 32- and 16-bit arithmetic
      we are currently using to achieve this.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • J
      tipc: add sequence number check for link STATE messages · 9012de50
      Jon Maloy committed
      Some switch infrastructures produce huge amounts of packet duplicates.
      This becomes a problem if those messages are STATE/NACK protocol
      messages, causing unnecessary retransmissions of already accepted
      packets.
      
      We now introduce a unique sequence number per STATE protocol message
      so that duplicates can be identified and ignored. This will also be
      useful when tracing such cases, and to avert replay attacks when TIPC
      is encrypted.
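
Duplicate detection by sequence number can be sketched as below. This is an illustrative model only: the 16-bit width, `struct link_rx`, `seq_before()` and `accept_state()` are assumptions made for the example and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver state for STATE protocol messages. */
struct link_rx {
    uint16_t rcv_nxt_state; /* next expected STATE sequence number */
};

/* True if 'a' comes strictly before 'b' in 16-bit wrap-around order. */
static bool seq_before(uint16_t a, uint16_t b)
{
    return (uint16_t)(a - b) >= 0x8000u;
}

/*
 * Returns true if the message should be processed, false if it is a
 * duplicate or an older message that has been overtaken.
 */
static bool accept_state(struct link_rx *l, uint16_t seqno)
{
    assert(l != NULL);
    if (seq_before(seqno, l->rcv_nxt_state))
        return false;
    l->rcv_nxt_state = (uint16_t)(seqno + 1);
    return true;
}
```

The wrap-around comparison keeps the check valid when the counter rolls over from 0xffff to 0, and gaps are accepted because protocol messages may be lost.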
      
      For compatibility reasons we have to introduce a new capability flag
      TIPC_LINK_PROTO_SEQNO to handle this new feature.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 07 Jul, 2018 4 commits