1. 17 11月, 2014 1 次提交
  2. 06 11月, 2014 1 次提交
    • D
      net: Add and use skb_copy_datagram_msg() helper. · 51f3d02b
      David S. Miller 提交于
      This encapsulates all of the skb_copy_datagram_iovec() callers
      with call argument signature "skb, offset, msghdr->msg_iov, length".
      
      When we move to iov_iters in the networking, the iov_iter object will
      sit in the msghdr.
      
      Having a helper like this means there will be less places to touch
      during that transformation.
      
      Based upon descriptions and patch from Al Viro.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      51f3d02b
  3. 31 10月, 2014 1 次提交
  4. 22 10月, 2014 2 次提交
    • Y
      tipc: fix lockdep warning when intra-node messages are delivered · 1a194c2d
      Ying Xue 提交于
      When running tipcTC&tipcTS test suite, below lockdep unsafe locking
      scenario is reported:
      
      [ 1109.997854]
      [ 1109.997988] =================================
      [ 1109.998290] [ INFO: inconsistent lock state ]
      [ 1109.998575] 3.17.0-rc1+ #113 Not tainted
      [ 1109.998762] ---------------------------------
      [ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
      [ 1109.998762]  (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762] {SOFTIRQ-ON-W} state was registered at:
      [ 1109.998762]   [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80
      [ 1109.998762]   [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
      [ 1109.998762]   [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
      [ 1109.998762]   [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762]   [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc]
      [ 1109.998762]   [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc]
      [ 1109.998762]   [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc]
      [ 1109.998762]   [<ffffffff817676ee>] SYSC_connect+0xae/0xc0
      [ 1109.998762]   [<ffffffff81767b7e>] SyS_connect+0xe/0x10
      [ 1109.998762]   [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200
      [ 1109.998762]   [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f
      [ 1109.998762] irq event stamp: 241060
      [ 1109.998762] hardirqs last  enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0
      [ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0
      [ 1109.998762] softirqs last  enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50
      [ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0
      [ 1109.998762]
      [ 1109.998762] other info that might help us debug this:
      [ 1109.998762]  Possible unsafe locking scenario:
      [ 1109.998762]
      [ 1109.998762]        CPU0
      [ 1109.998762]        ----
      [ 1109.998762]   lock(slock-AF_TIPC);
      [ 1109.998762]   <Interrupt>
      [ 1109.998762]     lock(slock-AF_TIPC);
      [ 1109.998762]
      [ 1109.998762]  *** DEADLOCK ***
      [ 1109.998762]
      [ 1109.998762] 2 locks held by swapper/7/0:
      [ 1109.998762]  #0:  (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70
      [ 1109.998762]  #1:  (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc]
      [ 1109.998762]
      [ 1109.998762] stack backtrace:
      [ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113
      [ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [ 1109.998762]  ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007
      [ 1109.998762]  ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001
      [ 1109.998762]  ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000
      [ 1109.998762] Call Trace:
      [ 1109.998762]  <IRQ>  [<ffffffff81a209eb>] dump_stack+0x4e/0x68
      [ 1109.998762]  [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202
      [ 1109.998762]  [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50
      [ 1109.998762]  [<ffffffff810a406c>] mark_lock+0x28c/0x2f0
      [ 1109.998762]  [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0
      [ 1109.998762]  [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80
      [ 1109.998762]  [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10
      [ 1109.998762]  [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0
      [ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
      [ 1109.998762]  [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0
      [ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
      [ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
      [ 1109.998762]  [<ffffffff810a6555>] lock_acquire+0x95/0x1e0
      [ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762]  [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0
      [ 1109.998762]  [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
      [ 1109.998762]  [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762]  [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
      [ 1109.998762]  [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
      [ 1109.998762]  [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc]
      [ 1109.998762]  [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc]
      [ 1109.998762]  [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc]
      [ 1109.998762]  [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70
      [ 1109.998762]  [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70
      [ 1109.998762]  [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0
      [ 1109.998762]  [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70
      [ 1109.998762]  [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0
      [ 1109.998762]  [<ffffffff81785518>] napi_gro_receive+0xd8/0x240
      [ 1109.998762]  [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530
      [ 1109.998762]  [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0
      [ 1109.998762]  [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30
      [ 1109.998762]  [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
      [ 1109.998762]  [<ffffffff817842b1>] net_rx_action+0x141/0x310
      [ 1109.998762]  [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150
      [ 1109.998762]  [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0
      [ 1109.998762]  [<ffffffff8105a626>] irq_exit+0x96/0xc0
      [ 1109.998762]  [<ffffffff81a30d07>] do_IRQ+0x67/0x110
      [ 1109.998762]  [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f
      [ 1109.998762]  <EOI>  [<ffffffff8100d2b7>] ? default_idle+0x37/0x250
      [ 1109.998762]  [<ffffffff8100d2b5>] ? default_idle+0x35/0x250
      [ 1109.998762]  [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20
      [ 1109.998762]  [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0
      [ 1109.998762]  [<ffffffff81034c78>] start_secondary+0x188/0x1f0
      
      When intra-node messages are delivered from one process to another
      process, tipc_link_xmit() doesn't disable BH before it directly calls
      tipc_sk_rcv() on process context to forward messages to destination
      socket. Meanwhile, if messages delivered by remote node arrive at the
      node and their destinations are also the same socket, tipc_sk_rcv()
      running on process context might be preempted by tipc_sk_rcv() running
      BH context. As a result, the latter cannot obtain the socket lock as
      the lock was obtained by the former, however, the former has no chance
      to be run as the latter is owning the CPU now, so headlock happens. To
      avoid it, BH should be always disabled in tipc_sk_rcv().
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Reviewed-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1a194c2d
    • Y
      tipc: fix a potential deadlock · 7b8613e0
      Ying Xue 提交于
      Locking dependency detected below possible unsafe locking scenario:
      
                 CPU0                          CPU1
      T0:  tipc_named_rcv()                tipc_rcv()
      T1:  [grab nametble write lock]*     [grab node lock]*
      T2:  tipc_update_nametbl()           tipc_node_link_up()
      T3:  tipc_nodesub_subscribe()        tipc_nametbl_publish()
      T4:  [grab node lock]*               [grab nametble write lock]*
      
      The opposite order of holding nametbl write lock and node lock on
      above two different paths may result in a deadlock. If we move the
      the updating of the name table after link state named out of node
      lock, the reverse order of holding locks will be eliminated, and
      as a result, the deadlock risk.
      Signed-off-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7b8613e0
  5. 18 10月, 2014 1 次提交
    • J
      tipc: fix bug in bundled buffer reception · 643566d4
      Jon Paul Maloy 提交于
      In commit ec8a2e56 ("tipc: same receive
      code path for connection protocol and data messages") we omitted the
      the possiblilty that an arriving message extracted from a bundle buffer
      may be a multicast message. Such messages need to be to be delivered to
      the socket via a separate function, tipc_sk_mcast_rcv(). As a result,
      small multicast messages arriving as members of a bundle buffer will be
      silently dropped.
      
      This commit corrects the error by considering this case in the function
      tipc_link_bundle_rcv().
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      643566d4
  6. 08 10月, 2014 1 次提交
    • J
      tipc: fix bug in multicast congestion handling · 908344cd
      Jon Maloy 提交于
      One aim of commit 50100a5e ("tipc:
      use pseudo message to wake up sockets after link congestion") was
      to handle link congestion abatement in a uniform way for both unicast
      and multicast transmit. However, the latter doesn't work correctly,
      and has been broken since the referenced commit was applied.
      
      If a user now sends a burst of multicast messages that is big
      enough to cause broadcast link congestion, it will be put to sleep,
      and not be waked up when the congestion abates as it should be.
      
      This has two reasons. First, the flag that is used, TIPC_WAKEUP_USERS,
      is set correctly, but in the wrong field. Instead of setting it in the
      'action_flags' field of the arrival node struct, it is by mistake set
      in the dummy node struct that is owned by the broadcast link, where it
      will never tested for. Second, we cannot use the same flag for waking
      up unicast and multicast users, since the function tipc_node_unlock()
      needs to pick the wakeup pseudo messages to deliver from different
      queues. It must hence be able to distinguish between the two cases.
      
      This commit solves this problem by adding a new flag
      TIPC_WAKEUP_BCAST_USERS, and a new function tipc_bclink_wakeup_user().
      The latter is to be called by tipc_node_unlock() when the named flag,
      now set in the correct field, is encountered.
      
      v2: using explicit 'unsigned int' declaration instead of 'uint', as
      per comment from David Miller.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      908344cd
  7. 11 9月, 2014 1 次提交
  8. 02 9月, 2014 2 次提交
    • E
      tipc: add name distributor resiliency queue · a5325ae5
      Erik Hugne 提交于
      TIPC name table updates are distributed asynchronously in a cluster,
      entailing a risk of certain race conditions. E.g., if two nodes
      simultaneously issue conflicting (overlapping) publications, this may
      not be detected until both publications have reached a third node, in
      which case one of the publications will be silently dropped on that
      node. Hence, we end up with an inconsistent name table.
      
      In most cases this conflict is just a temporary race, e.g., one
      node is issuing a publication under the assumption that a previous,
      conflicting, publication has already been withdrawn by the other node.
      However, because of the (rtt related) distributed update delay, this
      may not yet hold true on all nodes. The symptom of this failure is a
      syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error".
      
      In this commit we add a resiliency queue at the receiving end of
      the name table distributor. When insertion of an arriving publication
      fails, we retain it in this queue for a short amount of time, assuming
      that another update will arrive very soon and clear the conflict. If so
      happens, we insert the publication, otherwise we drop it.
      
      The (configurable) retention value defaults to 2000 ms. Knowing from
      experience that the situation described above is extremely rare, there
      is no risk that the queue will accumulate any large number of items.
      Signed-off-by: NErik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a5325ae5
    • E
      tipc: refactor name table updates out of named packet receive routine · f4ad8a4b
      Erik Hugne 提交于
      We need to perform the same actions when processing deferred name
      table updates, so this functionality is moved to a separate
      function.
      Signed-off-by: NErik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f4ad8a4b
  9. 30 8月, 2014 1 次提交
  10. 24 8月, 2014 15 次提交
  11. 20 8月, 2014 1 次提交
  12. 17 8月, 2014 1 次提交
  13. 30 7月, 2014 1 次提交
  14. 29 7月, 2014 1 次提交
    • J
      tipc: make tipc_buf_append() more robust · 13e9b997
      Jon Paul Maloy 提交于
      As per comment from David Miller, we try to make the buffer reassembly
      function more resilient to user errors than it is today.
      
      - We check that the "*buf" parameter always is set, since this is
        mandatory input.
      
      - We ensure that *buf->next always is set to NULL before linking in
        the buffer, instead of relying of the caller to have done this.
      
      - We ensure that the "tail" pointer in the head buffer's control
        block is initialized to NULL when the first fragment arrives.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      13e9b997
  15. 21 7月, 2014 1 次提交
  16. 17 7月, 2014 7 次提交
  17. 16 7月, 2014 1 次提交
  18. 12 7月, 2014 1 次提交
    • J
      tipc: clear 'next'-pointer of message fragments before reassembly · 99941754
      Jon Paul Maloy 提交于
      If the 'next' pointer of the last fragment buffer in a message is not
      zeroed before reassembly, we risk ending up with a corrupt message,
      since the reassembly function itself isn't doing this.
      
      Currently, when a buffer is retrieved from the deferred queue of the
      broadcast link, the next pointer is not cleared, with the result as
      described above.
      
      This commit corrects this, and thereby fixes a bug that may occur when
      long broadcast messages are transmitted across dual interfaces. The bug
      has been present since 40ba3cdf ("tipc:
      message reassembly using fragment chain")
      
      This commit should be applied to both net and net-next.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      99941754