1. 18 Apr, 2013 7 commits
  2. 17 Apr, 2013 1 commit
    • sctp: Add buffer utilization fields to /proc/net/sctp/assocs · f406c8b9
      Committed by Dilip Daya
      
      This patch adds the following fields to /proc/net/sctp/assocs output:
      
      	- sk->sk_wmem_alloc as "wmema"	(transmit queue bytes committed)
      	- sk->sk_wmem_queued as "wmemq"	(persistent queue size)
      	- sk->sk_sndbuf as "sndbuf"	(size of send buffer in bytes)
      	- sk->sk_rcvbuf as "rcvbuf"	(size of receive buffer in bytes)
      
      When small DATA chunks containing 136 bytes of data are sent, TX_QUEUE
      (assoc->sndbuf_used) reaches a maximum of 40.9% of the sk_sndbuf value when
      peer.rwnd = 0. This was diagnosed from the sk_wmem_alloc value reaching the
      maximum value of sk_sndbuf.
      
      TX_QUEUE (assoc->sndbuf_used), sk_wmem_alloc and sk_wmem_queued values are
      incremented in sctp_set_owner_w() for outgoing data chunks. Having access to
      the above values in /proc/net/sctp/assocs will provide a better understanding
      of SCTP buffer management.
      
      With patch applied, example output when peer.rwnd = 0
      
      where:
          ASSOC ffff880132298000 is sender
                ffff880125343000 is receiver
      
       ASSOC           SOCK            STY SST ST  HBKT ASSOC-ID TX_QUEUE RX_QUEUE \
      ffff880132298000 ffff880124a0a0c0 2   1   3  29325    1      214656        0 \
      ffff880125343000 ffff8801237d7700 2   1   3  36210    2           0   524520 \
      
      UID   INODE LPORT  RPORT LADDRS <-> RADDRS       HBINT   INS  OUTS \
        0   25108 3455   3456  *10.4.8.3 <-> *10.5.8.3  7500     2     2 \
        0   27819 3456   3455  *10.5.8.3 <-> *10.4.8.3  7500     2     2 \
      
      MAXRT T1X T2X RTXC   wmema   wmemq  sndbuf  rcvbuf
          4   0   0   72  525633  440320  524288  524288
          4   0   0    0       1       0  524288  524288
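
As a rough check of the figures above, the utilization percentages can be computed directly from the sender row of the example output. This is only a sketch: the `assoc_bufs` struct and the `pct_of_sndbuf()` and `print_utilization()` helpers are hypothetical names used for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Buffer counters for one association, as printed in /proc/net/sctp/assocs.
 * Hypothetical container; the field names mirror the column names. */
struct assoc_bufs {
    unsigned long tx_queue; /* TX_QUEUE column (assoc->sndbuf_used) */
    unsigned long wmema;    /* sk->sk_wmem_alloc */
    unsigned long wmemq;    /* sk->sk_wmem_queued */
    unsigned long sndbuf;   /* sk->sk_sndbuf */
};

/* Express `used` as a percentage of the send buffer size. */
double pct_of_sndbuf(unsigned long used, unsigned long sndbuf)
{
    assert(sndbuf > 0);
    return 100.0 * (double)used / (double)sndbuf;
}

/* Print the three send-side counters relative to sndbuf. */
void print_utilization(const struct assoc_bufs *b)
{
    printf("TX_QUEUE: %.1f%% of sndbuf\n", pct_of_sndbuf(b->tx_queue, b->sndbuf));
    printf("wmema:    %.1f%% of sndbuf\n", pct_of_sndbuf(b->wmema, b->sndbuf));
    printf("wmemq:    %.1f%% of sndbuf\n", pct_of_sndbuf(b->wmemq, b->sndbuf));
}
```

With the sender row from the example (TX_QUEUE 214656, sndbuf 524288), `pct_of_sndbuf()` gives roughly 40.9%, which matches the figure quoted in the commit message, while wmema (525633) is already slightly above sndbuf.
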
      Signed-off-by: Dilip Daya <dilip.daya@hp.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 16 Apr, 2013 2 commits
  4. 08 Apr, 2013 1 commit
  5. 01 Apr, 2013 1 commit
    • net: add option to enable error queue packets waking select · 7d4c04fc
      Committed by Keller, Jacob E
      Currently, when a socket receives something on the error queue it only wakes up
      the socket on select if it is in the "read" list, that is the socket has
      something to read. It is useful also to wake the socket if it is in the error
      list, which would enable software to wait on error queue packets without waking
      up for regular data on the socket. The main use case is for receiving
      timestamped transmit packets which return the timestamp to the socket via the
      error queue. This enables an application to select on the socket for the error
      queue only instead of for the regular traffic.
      
      -v2-
      * Added the SO_SELECT_ERR_QUEUE socket option to every architecture-specific file
      * Modified every socket poll function that checks error queue
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Cc: Jeffrey Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Matthew Vick <matthew.vick@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 13 Mar, 2013 2 commits
  7. 05 Mar, 2013 1 commit
  8. 02 Mar, 2013 1 commit
  9. 01 Mar, 2013 4 commits
  10. 28 Feb, 2013 3 commits
    • hlist: drop the node parameter from iterators · b67bfe0d
      Committed by Sasha Levin
      I'm not sure why, but the hlist for-each-entry iterators were conceived
      differently from the list ones. The list iterator looks like this:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
       were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
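
To illustrate the new calling convention, here is a simplified user-space sketch of a three-argument `hlist_for_each_entry()`. It is not the kernel's exact definition: the `hlist_entry_safe()` macro below evaluates its pointer argument twice, it relies on the GNU C `__typeof__` extension, and `struct item` and `sum_items()` are hypothetical names used only for the example.

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Map a node pointer back to its containing entry, or NULL at list end. */
#define hlist_entry_safe(ptr, type, member) \
    ((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : NULL)

/* New-style iterator: the entry pointer 'pos' is the only cursor needed. */
#define hlist_for_each_entry(pos, head, member)                              \
    for (pos = hlist_entry_safe((head)->first, __typeof__(*(pos)), member);  \
         pos;                                                                \
         pos = hlist_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

/* Insert node n at the front of list h. */
void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    assert(n && h);
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Hypothetical entry type embedding an hlist_node. */
struct item {
    int value;
    struct hlist_node node;
};

/* Walk the list without declaring a separate struct hlist_node cursor. */
int sum_items(struct hlist_head *head)
{
    struct item *it;
    int sum = 0;

    hlist_for_each_entry(it, head, node)
        sum += it->value;
    return sum;
}
```

The loop in `sum_items()` now has the same shape as a `list_for_each_entry()` loop, which is the point of the change.
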
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foundation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sctp: convert to idr_alloc() · 94960e8c
      Committed by Tejun Heo
      Convert to the much saner new idr interface.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Sridhar Samudrala <sri@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • net/sctp: Validate parameter size for SCTP_GET_ASSOC_STATS · 726bc6b0
      Committed by Guenter Roeck
      Building sctp may fail with:
      
      In function ‘copy_from_user’,
          inlined from ‘sctp_getsockopt_assoc_stats’ at
          net/sctp/socket.c:5656:20:
      arch/x86/include/asm/uaccess_32.h:211:26: error: call to
          ‘copy_from_user_overflow’ declared with attribute error: copy_from_user()
          buffer size is not provably correct
      
      if built with W=1 due to a missing parameter size validation
      before the call to copy_from_user.
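
The commit message does not show the fix itself. One common shape for this kind of validation is sketched below in user-space C, with `memcpy()` standing in for `copy_from_user()`. The struct `assoc_stats_req`, the typedef `assoc_id_t`, and the function `copy_stats_request()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int assoc_id_t; /* stand-in for sctp_assoc_t */

/* Hypothetical request layout: an association id followed by counters. */
struct assoc_stats_req {
    assoc_id_t assoc_id;
    unsigned long counters[4];
};

/* Copy a caller-supplied buffer of 'len' bytes into *dst.
 * Rejects buffers too short to hold the association id and never copies
 * more than sizeof(*dst), so the copy size is provably bounded.
 * Returns the number of bytes copied, or -EINVAL. */
int copy_stats_request(struct assoc_stats_req *dst, const void *src, size_t len)
{
    assert(dst && src);

    if (len < sizeof(assoc_id_t))
        return -EINVAL;
    if (len > sizeof(*dst))
        len = sizeof(*dst);

    memset(dst, 0, sizeof(*dst));
    memcpy(dst, src, len); /* stands in for copy_from_user() */
    return (int)len;
}
```

Clamping `len` before the copy is what lets a compiler prove that the destination buffer cannot be overrun.
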
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 19 Feb, 2013 2 commits
  12. 14 Feb, 2013 4 commits
  13. 13 Feb, 2013 1 commit
  14. 09 Feb, 2013 3 commits
  15. 08 Feb, 2013 2 commits
  16. 05 Feb, 2013 2 commits
    • net: remove redundant check for timer pending state before del_timer · 25cc4ae9
      Committed by Ying Xue
      Since del_timer() already calls timer_pending() to check whether the
      timer to be deleted is pending, it is unnecessary to check the timer's
      pending state again before calling del_timer().
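
The equivalence can be illustrated with a toy user-space model. Everything here is hypothetical: `toy_timer`, `toy_timer_pending()` and `toy_del_timer()` only mimic the behaviour described above (the delete routine checks the pending state itself) and are not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy timer: only tracks whether it is currently armed. */
struct toy_timer {
    bool pending;
};

bool toy_timer_pending(const struct toy_timer *t)
{
    assert(t);
    return t->pending;
}

/* Mimics the described del_timer() contract: it checks the pending state
 * internally and returns 1 if an armed timer was deactivated, else 0. */
int toy_del_timer(struct toy_timer *t)
{
    if (!toy_timer_pending(t))
        return 0;
    t->pending = false;
    return 1;
}

/* Old caller pattern: a redundant pending check before the delete. */
int stop_timer_old(struct toy_timer *t)
{
    if (toy_timer_pending(t))
        return toy_del_timer(t);
    return 0;
}

/* New caller pattern: call the delete routine unconditionally. */
int stop_timer_new(struct toy_timer *t)
{
    return toy_del_timer(t);
}
```

Both callers return the same value and leave the timer in the same state, whether or not it was pending, so the outer check adds nothing.
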
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: sctp_close: fix release of bindings for deferred call_rcu's · 8c98653f
      Committed by Daniel Borkmann
      It seems that, due to RCU usage within SCTP's address binding list,
      a "behavioral change" was introduced which no longer conforms to
      the RFC. In particular, consider the following
      (fictional) scenario to demonstrate this:
      
        do:
          Two SOCK_SEQPACKET-style sockets are opened (S1, S2)
          S1 is bound to 127.0.0.1, port 1024 [server]
          S2 is bound to 127.0.0.1, port 1025 [client]
          listen(2) is invoked on S1
          From S2 we call one sendmsg(2) with msg.msg_name and
             msg.msg_namelen parameters set to the server's
             address
          S1, S2 are closed
          goto do
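
One pass of this loop can be sketched in user-space C as follows. This assumes a Linux system with SCTP support available. The helper names `loopback_addr()` and `run_round()` are hypothetical, and the ports are the ones from the scenario.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Build a 127.0.0.1 address for the given (unprivileged) port. */
struct sockaddr_in loopback_addr(unsigned short port)
{
    struct sockaddr_in sa;

    assert(port >= 1024);
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return sa;
}

/* One pass of the loop. Returns 0 if every call succeeded, -1 otherwise
 * (errno then holds the error of the first failing call). */
int run_round(void)
{
    struct sockaddr_in srv = loopback_addr(1024); /* S1, server */
    struct sockaddr_in cli = loopback_addr(1025); /* S2, client */
    char payload[] = "ping";
    struct iovec iov = { .iov_base = payload, .iov_len = sizeof(payload) };
    struct msghdr msg;
    int rc = -1;

    int s1 = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);
    int s2 = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);
    if (s1 < 0 || s2 < 0)
        goto out;
    if (bind(s1, (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto out;
    if (bind(s2, (struct sockaddr *)&cli, sizeof(cli)) < 0)
        goto out;
    if (listen(s1, 5) < 0)
        goto out;

    /* A single sendmsg(2) from S2 addressed to the server. */
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &srv;
    msg.msg_namelen = sizeof(srv);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    if (sendmsg(s2, &msg, 0) < 0)
        goto out;
    rc = 0;
out:
    /* S1 and S2 are closed; S1 still has an unread packet queued. */
    if (s1 >= 0)
        close(s1);
    if (s2 >= 0)
        close(s2);
    return rc;
}
```

Calling `run_round()` twice in a row corresponds to the `goto do` in the scenario; the second call is where the bind of S1 would be expected to fail on an affected kernel.
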
      
      The first pass of this loop succeeds, while the second round
      fails during binding of S1 (address still in use). What is happening?
      In the first round, the initial handshake is being done, and, at the
      time close(2) is called on S1, a non-graceful shutdown is performed via
      ABORT since an unprocessed packet is present in S1's receive queue,
      thus indicating an error condition. This can be considered correct
      behavior.
      
      During close also all bound addresses are freed, thus nothing *must*
      be active anymore. In reference to RFC2960:
      
        After checking the Verification Tag, the receiving endpoint shall
        remove the association from its record, and shall report the
        termination to its upper layer. (9.1 Abort of an Association)
      
      Also, no half-open states are supported, thus after an ungraceful
      shutdown, we leave nothing behind. However, this does not seem to be
      happening. In a real-world scenario, this is exactly where
      it breaks the lksctp-tools functional test suite, *for instance*:
      
        ./test_sockopt
        test_sockopt.c  1 PASS : getsockopt(SCTP_STATUS) on a socket with no assoc
        test_sockopt.c  2 PASS : getsockopt(SCTP_STATUS)
        test_sockopt.c  3 PASS : getsockopt(SCTP_STATUS) with invalid associd
        test_sockopt.c  4 PASS : getsockopt(SCTP_STATUS) with NULL associd
        test_sockopt.c  5 BROK : bind: Address already in use
      
      The underlying problem is that sctp_endpoint_destroy() hasn't been
      triggered yet while the next bind attempt is being done. It will be
      triggered eventually (but too late) by sctp_transport_destroy_rcu()
      after one RCU grace period:
      
        sctp_transport_destroy()
          sctp_transport_destroy_rcu() ----.
            sctp_association_put() [*]  <--+--> sctp_packet_free()
              sctp_association_destroy()          [...]
                sctp_endpoint_put()                 skb->destructor
                  sctp_endpoint_destroy()             sctp_wfree()
                    sctp_bind_addr_free()               sctp_association_put() [*]
      
      Thus, we move out the condition with sctp_association_put() as well as
      the sctp_packet_free() invocation and the issue can be solved. It is also
      better to free the SCTP chunks first before putting the ref of the association.

      With this patch, the example above (which simulates a similar scenario
      as in the implementation of this test case) and therefore also the test
      suite run through successfully. Tested by myself.
      
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 28 Jan, 2013 2 commits
    • SCTP: Free the per-net sysctl table on net exit. v2 · 5f19d121
      Committed by Vlad Yasevich
      Per-net sysctl table needs to be explicitly freed at
      net exit.  Otherwise we see the following with kmemleak:
      
      unreferenced object 0xffff880402d08000 (size 2048):
        comm "chrome_sandbox", pid 18437, jiffies 4310887172 (age 9097.630s)
        hex dump (first 32 bytes):
          b2 68 89 81 ff ff ff ff 20 04 04 f8 01 88 ff ff  .h...... .......
          04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff815b4aad>] kmemleak_alloc+0x21/0x3e
          [<ffffffff81110352>] slab_post_alloc_hook+0x28/0x2a
          [<ffffffff81113fad>] __kmalloc_track_caller+0xf1/0x104
          [<ffffffff810f10c2>] kmemdup+0x1b/0x30
          [<ffffffff81571e9f>] sctp_sysctl_net_register+0x1f/0x72
          [<ffffffff8155d305>] sctp_net_init+0x100/0x39f
          [<ffffffff814ad53c>] ops_init+0xc6/0xf5
          [<ffffffff814ad5b7>] setup_net+0x4c/0xd0
          [<ffffffff814ada5e>] copy_net_ns+0x6d/0xd6
          [<ffffffff810938b1>] create_new_namespaces+0xd7/0x147
          [<ffffffff810939f4>] copy_namespaces+0x63/0x99
          [<ffffffff81076733>] copy_process+0xa65/0x1233
          [<ffffffff81077030>] do_fork+0x10b/0x271
          [<ffffffff8100a0e9>] sys_clone+0x23/0x25
          [<ffffffff815dda73>] stub_clone+0x13/0x20
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      I fixed the spelling of sysctl_header so the code actually
      compiles. -- EWB.
      Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
      Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: set association state to established in dupcook_a handler · 9839ff0d
      Committed by Xufeng Zhang
      While SCTP is handling a duplicate COOKIE-ECHO and the action is
      'Association restart', sctp_sf_do_dupcook_a() processes
      the unexpected COOKIE-ECHO for peer restart, but it does not set
      the association state to SCTP_STATE_ESTABLISHED, so the association
      could get stuck in the SCTP_STATE_SHUTDOWN_PENDING state forever.
      This violates the SCTP specification:
        RFC 4960 5.2.4. Handle a COOKIE ECHO when a TCB Exists
        Action
        A) In this case, the peer may have restarted. .....
           After this, the endpoint shall enter the ESTABLISHED state.
      
      To resolve this problem, add a SCTP_CMD_NEW_STATE cmd to the
      command list before the SCTP_CMD_REPLY cmd. This sets the restarted
      association to the SCTP_STATE_ESTABLISHED state properly and also avoids
      the I-bit being set in the DATA chunk header when COOKIE_ACK is bundled
      with DATA chunks.
      Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 18 Jan, 2013 1 commit