1. 28 10月, 2014 5 次提交
    • D
      Merge branch 'cxgb4-net' · 8ae3c911
      David S. Miller 提交于
      Anish Bhatt says:
      
      ====================
      cxgb4 : DCBx fixes for apps/host lldp agents
      
      This patchset  contains some minor fixes for cxgb4 DCBx code. Chiefly, cxgb4
      was not cleaning up any apps added to kernel app table when link was lost.
      Disabling DCBx in firmware would automatically set DCBx state to host-managed
      and enabled, we now wait for an explicit enable call from an lldp agent instead
      
      First patch was originally sent to net-next, but considering it applies to
      correcting behaviour of code already in net, I think it qualifies as a bug fix.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8ae3c911
    • A
      cxgb4 : Handle dcb enable correctly · 3bb06261
      Anish Bhatt 提交于
      Disabling DCBx in firmware automatically enables DCBx for control via host
      lldp agents. Wait for an explicit setstate call from an lldp agents to enable
       DCBx instead.
      
      Fixes: 76bcb31e ("cxgb4 : Add DCBx support codebase and dcbnl_ops")
      Signed-off-by: NAnish Bhatt <anish@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3bb06261
    • A
      cxgb4 : Improve handling of DCB negotiation or loss thereof · 2376c879
      Anish Bhatt 提交于
      Clear out any DCB apps we might have added to kernel table when we lose DCB
      sync (or IEEE equivalent event). These were previously left behind and not
      cleaned up correctly. IEEE allows individual components to work independently,
       so improve check for IEEE completion by specifying individual components.
      
      Fixes: 10b00466 ("cxgb4: IEEE fixes for DCBx state machine")
      Signed-off-by: NAnish Bhatt <anish@chelsio.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2376c879
    • D
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 5d26b1f5
      David S. Miller 提交于
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for your net tree,
      they are:
      
      1) Allow to recycle a TCP port in conntrack when the change role from
         server to client, from Marcelo Leitner.
      
      2) Fix possible off by one access in ip_set_nfnl_get_byindex(), patch
         from Dan Carpenter.
      
      3) alloc_percpu returns NULL on error, no need for IS_ERR() in nf_tables
         chain statistic updates. From Sabrina Dubroca.
      
      4) Don't compile ip options in bridge netfilter, this mangles the packet
         and bridge should not alter layer >= 3 headers when forwarding packets.
         Patch from Herbert Xu and tested by Florian Westphal.
      
      5) Account the final NLMSG_DONE message when calculating the size of the
         nflog netlink batches. Patch from Florian Westphal.
      
      6) Fix a possible netlink attribute length overflow with large packets.
         Again from Florian Westphal.
      
      7) Release the skbuff if nfnetlink_log fails to put the final
         NLMSG_DONE message. This fixes a leak on error. This shouldn't ever
         happen though, otherwise this means we miscalculate the netlink batch
         size, so spot a warning if this ever happens so we can track down the
         problem. This patch from Houcheng Lin.
      
      8) Look at the right list when recycling targets in the nft_compat,
         patch from Arturo Borrero.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5d26b1f5
    • A
      netfilter: nft_compat: fix wrong target lookup in nft_target_select_ops() · 7965ee93
      Arturo Borrero 提交于
      The code looks for an already loaded target, and the correct list to search
      is nft_target_list, not nft_match_list.
      Signed-off-by: NArturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      7965ee93
  2. 27 10月, 2014 4 次提交
  3. 26 10月, 2014 8 次提交
    • G
      drivers: net: xgene: Rewrite buggy loop in xgene_enet_ecc_init() · b71e821d
      Geert Uytterhoeven 提交于
      drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c: In function ‘xgene_enet_ecc_init’:
      drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c:126: warning: ‘data’ may be used uninitialized in this function
      
      Depending on the arbitrary value on the stack, the loop may terminate
      too early, and cause a bogus -ENODEV failure.
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b71e821d
    • D
      i40e: _MASK vs _SHIFT typo in i40e_handle_mdd_event() · 013f6579
      Dan Carpenter 提交于
      We accidentally mask by the _SHIFT variable.  It means that "event" is
      always zero.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Tested-by: NJim Young <jamesx.m.young@intel.com>
      Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      013f6579
    • E
      macvlan: fix a race on port dismantle and possible skb leaks · fe0ca732
      Eric Dumazet 提交于
      We need to cancel the work queue after rcu grace period,
      otherwise it can be rescheduled by incoming packets.
      
      We need to purge queue if some skbs are still in it.
      
      We can use __skb_queue_head_init() variant in
      macvlan_process_broadcast()
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Fixes: 412ca155 ("macvlan: Move broadcasts into a work queue")
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe0ca732
    • E
      tcp: md5: do not use alloc_percpu() · 349ce993
      Eric Dumazet 提交于
      percpu tcp_md5sig_pool contains memory blobs that ultimately
      go through sg_set_buf().
      
      -> sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
      
      This requires that whole area is in a physically contiguous portion
      of memory. And that @buf is not backed by vmalloc().
      
      Given that alloc_percpu() can use vmalloc() areas, this does not
      fit the requirements.
      
      Replace alloc_percpu() by a static DEFINE_PER_CPU() as tcp_md5sig_pool
      is small anyway, there is no gain to dynamically allocate it.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Fixes: 765cf997 ("tcp: md5: remove one indirection level in tcp_md5sig_pool")
      Reported-by: NCrestez Dan Leonard <cdleonard@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      349ce993
    • D
      Merge branch 'xen-netback' · 4cc40af0
      David S. Miller 提交于
      David Vrabel says:
      
      ====================
      xen-netback: guest Rx queue drain and stall fixes
      
      This series fixes two critical xen-netback bugs.
      
      1. Netback may consume all of host memory by queuing an unlimited
         number of skb on the internal guest Rx queue.  This behaviour is
         guest triggerable.
      
      2. Carrier flapping under high traffic rates which reduces
         performance.
      
      The first patch is a prerequite.  Removing support for frontends with
      feature-rx-notify makes it easier to reason about the correctness of
      netback since it no longer has to support this outdated and broken
      mode.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4cc40af0
    • D
      xen-netback: reintroduce guest Rx stall detection · ecf08d2d
      David Vrabel 提交于
      If a frontend not receiving packets it is useful to detect this and
      turn off the carrier so packets are dropped early instead of being
      queued and drained when they expire.
      
      A to-guest queue is stalled if it doesn't have enough free slots for a
      an extended period of time (default 60 s).
      
      If at least one queue is stalled, the carrier is turned off (in the
      expectation that the other queues will soon stall as well).  The
      carrier is only turned on once all queues are ready.
      
      When the frontend connects, all the queues start in the stalled state
      and only become ready once the frontend queues enough Rx requests.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NWei Liu <wei.liu2@citrix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ecf08d2d
    • D
      xen-netback: fix unlimited guest Rx internal queue and carrier flapping · f48da8b1
      David Vrabel 提交于
      Netback needs to discard old to-guest skb's (guest Rx queue drain) and
      it needs detect guest Rx stalls (to disable the carrier so packets are
      discarded earlier), but the current implementation is very broken.
      
      1. The check in hard_start_xmit of the slot availability did not
         consider the number of packets that were already in the guest Rx
         queue.  This could allow the queue to grow without bound.
      
         The guest stops consuming packets and the ring was allowed to fill
         leaving S slot free.  Netback queues a packet requiring more than S
         slots (ensuring that the ring stays with S slots free).  Netback
         queue indefinately packets provided that then require S or fewer
         slots.
      
      2. The Rx stall detection is not triggered in this case since the
         (host) Tx queue is not stopped.
      
      3. If the Tx queue is stopped and a guest Rx interrupt occurs, netback
         will consider this an Rx purge event which may result in it taking
         the carrier down unnecessarily.  It also considers a queue with
         only 1 slot free as unstalled (even though the next packet might
         not fit in this).
      
      The internal guest Rx queue is limited by a byte length (to 512 Kib,
      enough for half the ring).  The (host) Tx queue is stopped and started
      based on this limit.  This sets an upper bound on the amount of memory
      used by packets on the internal queue.
      
      This allows the estimatation of the number of slots for an skb to be
      removed (it wasn't a very good estimate anyway).  Instead, the guest
      Rx thread just waits for enough free slots for a maximum sized packet.
      
      skbs queued on the internal queue have an 'expires' time (set to the
      current time plus the drain timeout).  The guest Rx thread will detect
      when the skb at the head of the queue has expired and discard expired
      skbs.  This sets a clear upper bound on the length of time an skb can
      be queued for.  For a guest being destroyed the maximum time needed to
      wait for all the packets it sent to be dropped is still the drain
      timeout (10 s) since it will not be sending new packets.
      
      Rx stall detection is reintroduced in a later commit.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NWei Liu <wei.liu2@citrix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f48da8b1
    • D
      xen-netback: make feature-rx-notify mandatory · bc96f648
      David Vrabel 提交于
      Frontends that do not provide feature-rx-notify may stall because
      netback depends on the notification from frontend to wake the guest Rx
      thread (even if can_queue is false).
      
      This could be fixed but feature-rx-notify was introduced in 2006 and I
      am not aware of any frontends that do not implement this.
      Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>
      Acked-by: NWei Liu <wei.liu2@citrix.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bc96f648
  4. 25 10月, 2014 1 次提交
  5. 24 10月, 2014 4 次提交
  6. 23 10月, 2014 10 次提交
  7. 22 10月, 2014 8 次提交
    • S
      netfilter: nf_tables: check for NULL in nf_tables_newchain pcpu stats allocation · c123bb71
      Sabrina Dubroca 提交于
      alloc_percpu returns NULL on failure, not a negative error code.
      
      Fixes: ff3cd7b3 ("netfilter: nf_tables: refactor chain statistic routines")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      c123bb71
    • D
      netfilter: ipset: off by one in ip_set_nfnl_get_byindex() · 0f9f5e1b
      Dan Carpenter 提交于
      The ->ip_set_list[] array is initialized in ip_set_net_init() and it
      has ->ip_set_max elements so this check should be >= instead of >
      otherwise we are off by one.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      0f9f5e1b
    • M
      netfilter: nf_conntrack: allow server to become a client in TW handling · e37ad9fd
      Marcelo Leitner 提交于
      When a port that was used to listen for inbound connections gets closed
      and reused for outgoing connections (like rsh ends up doing for stderr
      flow), current we may reject the SYN/ACK packet for the new connection
      because tcp_conntracks states forbirds a port to become a client while
      there is still a TIME_WAIT entry in there for it.
      
      As TCP may expire the TIME_WAIT socket in 60s and conntrack's timeout
      for it is 120s, there is a ~60s window that the application can end up
      opening a port that conntrack will end up blocking.
      
      This patch fixes this by simply allowing such state transition: if we
      see a SYN, in TIME_WAIT state, on REPLY direction, move it to sSS. Note
      that the rest of the code already handles this situation, more
      specificly in tcp_packet(), first switch clause.
      Signed-off-by: NMarcelo Ricardo Leitner <mleitner@redhat.com>
      Acked-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      e37ad9fd
    • S
      net: sched: initialize bstats syncp · 7c1c97d5
      Sabrina Dubroca 提交于
      Use netdev_alloc_pcpu_stats to allocate percpu stats and initialize syncp.
      
      Fixes: 22e0f8b9 "net: sched: make bstats per cpu and estimator RCU safe"
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Acked-by: NCong Wang <cwang@twopensource.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7c1c97d5
    • A
      bpf: fix bug in eBPF verifier · 32bf08a6
      Alexei Starovoitov 提交于
      while comparing for verifier state equivalency the comparison
      was missing a check for uninitialized register.
      Make sure it does so and add a testcase.
      
      Fixes: f1bca824 ("bpf: add search pruning optimization to verifier")
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32bf08a6
    • T
      netlink: Re-add locking to netlink_lookup() and seq walker · 78fd1d0a
      Thomas Graf 提交于
      The synchronize_rcu() in netlink_release() introduces unacceptable
      latency. Reintroduce minimal lookup so we can drop the
      synchronize_rcu() until socket destruction has been RCUfied.
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Reported-by: NSteinar H. Gunderson <sgunderson@bigfoot.com>
      Reported-and-tested-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: NThomas Graf <tgraf@suug.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      78fd1d0a
    • Y
      tipc: fix lockdep warning when intra-node messages are delivered · 1a194c2d
      Ying Xue 提交于
      When running tipcTC&tipcTS test suite, below lockdep unsafe locking
      scenario is reported:
      
      [ 1109.997854]
      [ 1109.997988] =================================
      [ 1109.998290] [ INFO: inconsistent lock state ]
      [ 1109.998575] 3.17.0-rc1+ #113 Not tainted
      [ 1109.998762] ---------------------------------
      [ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
      [ 1109.998762]  (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762] {SOFTIRQ-ON-W} state was registered at:
      [ 1109.998762]   [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80
      [ 1109.998762]   [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
      [ 1109.998762]   [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
      [ 1109.998762]   [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762]   [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc]
      [ 1109.998762]   [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc]
      [ 1109.998762]   [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc]
      [ 1109.998762]   [<ffffffff817676ee>] SYSC_connect+0xae/0xc0
      [ 1109.998762]   [<ffffffff81767b7e>] SyS_connect+0xe/0x10
      [ 1109.998762]   [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200
      [ 1109.998762]   [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f
      [ 1109.998762] irq event stamp: 241060
      [ 1109.998762] hardirqs last  enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0
      [ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0
      [ 1109.998762] softirqs last  enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50
      [ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0
      [ 1109.998762]
      [ 1109.998762] other info that might help us debug this:
      [ 1109.998762]  Possible unsafe locking scenario:
      [ 1109.998762]
      [ 1109.998762]        CPU0
      [ 1109.998762]        ----
      [ 1109.998762]   lock(slock-AF_TIPC);
      [ 1109.998762]   <Interrupt>
      [ 1109.998762]     lock(slock-AF_TIPC);
      [ 1109.998762]
      [ 1109.998762]  *** DEADLOCK ***
      [ 1109.998762]
      [ 1109.998762] 2 locks held by swapper/7/0:
      [ 1109.998762]  #0:  (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70
      [ 1109.998762]  #1:  (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc]
      [ 1109.998762]
      [ 1109.998762] stack backtrace:
      [ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113
      [ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [ 1109.998762]  ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007
      [ 1109.998762]  ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001
      [ 1109.998762]  ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000
      [ 1109.998762] Call Trace:
      [ 1109.998762]  <IRQ>  [<ffffffff81a209eb>] dump_stack+0x4e/0x68
      [ 1109.998762]  [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202
      [ 1109.998762]  [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50
      [ 1109.998762]  [<ffffffff810a406c>] mark_lock+0x28c/0x2f0
      [ 1109.998762]  [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0
      [ 1109.998762]  [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80
      [ 1109.998762]  [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10
      [ 1109.998762]  [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0
      [ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
      [ 1109.998762]  [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0
      [ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
      [ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
      [ 1109.998762]  [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
      [ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762]  [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0
      [ 1109.998762]  [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
      [ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
      [ 1109.998762]  [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762]  [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc]
      [ 1109.998762]  [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc]
      [ 1109.998762]  [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc]
      [ 1109.998762]  [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70
      [ 1109.998762]  [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70
      [ 1109.998762]  [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0
      [ 1109.998762]  [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70
      [ 1109.998762]  [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0
      [ 1109.998762]  [<ffffffff81785518>] napi_gro_receive+0xd8/0x240
      [ 1109.998762]  [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530
      [ 1109.998762]  [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0
      [ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
      [ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
      [ 1109.998762]  [<ffffffff817842b1>] net_rx_action+0x141/0x310
      [ 1109.998762]  [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150
      [ 1109.998762]  [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0
      [ 1109.998762]  [<ffffffff8105a626>] irq_exit+0x96/0xc0
      [ 1109.998762]  [<ffffffff81a30d07>] do_IRQ+0x67/0x110
      [ 1109.998762]  [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f
      [ 1109.998762]  <EOI>  [<ffffffff8100d2b7>] ? default_idle+0x37/0x250
      [ 1109.998762]  [<ffffffff8100d2b5>] ? default_idle+0x35/0x250
      [ 1109.998762]  [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20
      [ 1109.998762]  [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0
      [ 1109.998762]  [<ffffffff81034c78>] start_secondary+0x188/0x1f0
      
      When intra-node messages are delivered from one process to another
      process, tipc_link_xmit() doesn't disable BH before it directly calls
      tipc_sk_rcv() on process context to forward messages to destination
      socket. Meanwhile, if messages delivered by remote node arrive at the
      node and their destinations are also the same socket, tipc_sk_rcv()
      running on process context might be preempted by tipc_sk_rcv() running
      BH context. As a result, the latter cannot obtain the socket lock as
      the lock was obtained by the former, however, the former has no chance
      to be run as the latter is owning the CPU now, so headlock happens. To
      avoid it, BH should be always disabled in tipc_sk_rcv().
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Reviewed-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1a194c2d
    • Y
      tipc: fix a potential deadlock · 7b8613e0
      Ying Xue 提交于
      Locking dependency detected below possible unsafe locking scenario:
      
                 CPU0                          CPU1
      T0:  tipc_named_rcv()                tipc_rcv()
      T1:  [grab nametble write lock]*     [grab node lock]*
      T2:  tipc_update_nametbl()           tipc_node_link_up()
      T3:  tipc_nodesub_subscribe()        tipc_nametbl_publish()
      T4:  [grab node lock]*               [grab nametble write lock]*
      
      The opposite order of holding nametbl write lock and node lock on
      above two different paths may result in a deadlock. If we move the
      the updating of the name table after link state named out of node
      lock, the reverse order of holding locks will be eliminated, and
      as a result, the deadlock risk.
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7b8613e0