1. 11 10月, 2007 4 次提交
  2. 20 7月, 2007 1 次提交
    • P
      mm: Remove slab destructors from kmem_cache_create(). · 20c2df83
      Paul Mundt 提交于
      Slab destructors were no longer supported after Christoph's
      c59def9f change. They've been
      BUGs for both slab and slub, and slob never supported them
      either.
      
      This rips out support for the dtor pointer from kmem_cache_create()
      completely and fixes up every single callsite in the kernel (there were
      about 224, not including the slab allocator definitions themselves,
      or the documentation references).
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      20c2df83
  3. 29 3月, 2007 1 次提交
  4. 11 2月, 2007 1 次提交
  5. 09 2月, 2007 1 次提交
    • E
      [NET]: change layout of ehash table · dbca9b27
      Eric Dumazet 提交于
      ehash table layout is currently this one :
      
      First half of this table is used by sockets not in TIME_WAIT state
      Second half of it is used by sockets in TIME_WAIT state.
      
      This is non optimal because of for a given hash or socket, the two chain heads 
      are located in separate cache lines.
      Moreover the locks of the second half are never used.
      
      If instead of this halving, we use two list heads in inet_ehash_bucket instead 
      of only one, we probably can avoid one cache miss, and reduce ram usage, 
      particularly if sizeof(rwlock_t) is big (various CONFIG_DEBUG_SPINLOCK, 
      CONFIG_DEBUG_LOCK_ALLOC settings). So we still halves the table but we keep 
      together related chains to speedup lookups and socket state change.
      
      In this patch I did not try to align struct inet_ehash_bucket, but a future 
      patch could try to make this structure have a convenient size (a power of two 
      or a multiple of L1_CACHE_SIZE).
      I guess rwlock will just vanish as soon as RCU is plugged into ehash :) , so 
      maybe we dont need to scratch our heads to align the bucket...
      
      Note : In case struct inet_ehash_bucket is not a power of two, we could 
      probably change alloc_large_system_hash() (in case it use __get_free_pages()) 
      to free the unused space. It currently allocates a big zone, but the last 
      quarter of it could be freed. Again, this should be a temporary 'problem'.
      
      Patch tested on ipv4 tcp only, but should be OK for IPV6 and DCCP.
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dbca9b27
  6. 12 12月, 2006 1 次提交
  7. 03 12月, 2006 8 次提交
    • A
      [DCCP]: Make {set,get}sockopt(DCCP_SOCKOPT_PACKET_SIZE) return 0 · 841bac1d
      Arnaldo Carvalho de Melo 提交于
      To reflect the fact that this now is of no effect, not making apps
      stop working, just be warned in the system log.
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      841bac1d
    • G
      [DCCP]: Tidy up unused structures · 5aed3243
      Gerrit Renker 提交于
      This removes and cleans up unused variables and structures which have become
      unnecessary following the introduction of the EWMA patch to automatically track
      the CCID 3 receiver/sender packet sizes `s'.
      
      It deprecates the PACKET_SIZE socket option by returning an error code and
      printing a deprecation warning if an application tries to read or write this
      socket option.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      5aed3243
    • G
      [DCCP]: Simplified conditions due to use of enum:8 states · 59348b19
      Gerrit Renker 提交于
      This reaps the benefit of the earlier patch, which changed the type of
      CCID 3 states to use enums, in that many conditions are now simplified
      and the number of possible (unexpected) values is greatly reduced.
      
      In a few instances, this also allowed to simplify pre-conditions; where
      care has been taken to retain logical equivalence.
      
      [DCCP]: Introduce a consistent BUG/WARN message scheme
      
      This refines the existing set of DCCP messages so that
       * BUG(), BUG_ON(), WARN_ON() have meaningful DCCP-specific counterparts
       * DCCP_CRIT (for severe warnings) is not rate-limited
       * DCCP_WARN() is introduced as rate-limited wrapper
      
      Using these allows a faster and cleaner transition to their original
      counterparts once the code has matured into a full DCCP implementation.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      59348b19
    • I
      [DCCP]: Set TX Queue Length Bounds via Sysctl · b1308dc0
      Ian McDonald 提交于
      Previously the transmit queue was unbounded.
      
      This patch:
      	* puts a limit on transmit queue length
      	  and sends back EAGAIN if the buffer is full
      	* sets the TX queue length to a sensible default
      	* implements tx buffer sysctls for DCCP
      Signed-off-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      b1308dc0
    • G
      [DCCP]: Miscellaneous code tidy-ups · 09dbc389
      Gerrit Renker 提交于
      This patch does not change code; it performs some trivial clean/tidy-ups:
      
        * removal of a `debug_prefix' string in favour of the
          already existing dccp_role(sk)
      
        * add documentation of structures and constants
      
        * separated out the cases for invalid packets (step 1
          of the packet validation)
      
        * removing duplicate statements
      
        * combining declaration & initialisation
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      09dbc389
    • G
      [DCCP]: Add sysctls to control retransmission behaviour · 2e2e9e92
      Gerrit Renker 提交于
      This adds 3 sysctls which govern the retransmission behaviour of DCCP control
      packets (3way handshake, feature negotiation).
      
      It removes 4 FIXMEs from the code.
      
      The close resemblance of sysctl variables to their TCP analogues is emphasised
      not only by their name, but also by giving them the same initial values.
      This is useful since there is not much practical experience with DCCP yet.
      
      Furthermore, with regard to the previous patch, it is now possible to limit
      the number of keepalive-Responses by setting net.dccp.default.request_retries
      (also a bit like in TCP).
      
      Lastly, added documentation of all existing DCCP sysctls.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      2e2e9e92
    • G
      [DCCP]: Support for partial checksums (RFC 4340, sec. 9.2) · 6f4e5fff
      Gerrit Renker 提交于
      This patch does the following:
        a) introduces variable-length checksums as specified in [RFC 4340, sec. 9.2]
        b) provides necessary socket options and documentation as to how to use them
        c) basic support and infrastructure for the Minimum Checksum Coverage feature
           [RFC 4340, sec. 9.2.1]: acceptability tests, user notification and user
           interface
      
      In addition, it
      
       (1) fixes two bugs in the DCCPv4 checksum computation:
       	* pseudo-header used checksum_len instead of skb->len
      	* incorrect checksum coverage calculation based on dccph_x
       (2) removes dccp_v4_verify_checksum() since it reduplicates code of the
           checksum computation; code calling this function is updated accordingly.
       (3) now uses skb_checksum(), which is safer than checksum_partial() if the
           sk_buff has is a non-linear buffer (has pages attached to it).
       (4) fixes an outstanding TODO item:
              * If P.CsCov is too large for the packet size, drop packet and return.
      
      The code has been tested with applications, the latest version of tcpdump now
      comes with support for partial DCCP checksums.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
      6f4e5fff
    • E
      [NET]: Size listen hash tables using backlog hint · 72a3effa
      Eric Dumazet 提交于
      We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for
      each LISTEN socket, regardless of various parameters (listen backlog for
      example)
      
      On x86_64, this means order-1 allocations (might fail), even for 'small'
      sockets, expecting few connections. On the contrary, a huge server wanting a
      backlog of 50000 is slowed down a bit because of this fixed limit.
      
      This patch makes the sizing of listen hash table a dynamic parameter,
      depending of :
      - net.core.somaxconn tunable (default is 128)
      - net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
      - backlog value given by user application  (2nd parameter of listen())
      
      For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
      kmalloc().
      
      We still limit memory allocation with the two existing tunables (somaxconn &
      tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM
      usage.
      Signed-off-by: NEric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72a3effa
  8. 25 9月, 2006 1 次提交
  9. 23 9月, 2006 1 次提交
  10. 11 7月, 2006 1 次提交
  11. 01 7月, 2006 1 次提交
  12. 18 6月, 2006 1 次提交
  13. 06 5月, 2006 1 次提交
    • H
      [DCCP]: Fix sock_orphan dead lock · 134af346
      Herbert Xu 提交于
      Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead
      locks.  For example, the inet_diag code holds sk_callback_lock without
      disabling BH.  If an inbound packet arrives during that admittedly tiny
      window, it will cause a dead lock on bh_lock_sock.  Another possible
      path would be through sock_wfree if the network device driver frees the
      tx skb in process context with BH enabled.
      
      We can fix this by moving sock_orphan out of bh_lock_sock.
      
      The tricky bit is to work out when we need to destroy the socket
      ourselves and when it has already been destroyed by someone else.
      
      By moving sock_orphan before the release_sock we can solve this
      problem.  This is because as long as we own the socket lock its
      state cannot change.
      
      So we simply record the socket state before the release_sock
      and then check the state again after we regain the socket lock.
      If the socket state has transitioned to DCCP_CLOSED in the time being,
      we know that the socket has been destroyed.  Otherwise the socket is
      still ours to keep.
      
      This problem was discoverd by Ingo Molnar using his lock validator.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      134af346
  14. 26 3月, 2006 1 次提交
    • D
      [PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications · f348d70a
      Davide Libenzi 提交于
      Implement the half-closed devices notifiation, by adding a new POLLRDHUP
      (and its alias EPOLLRDHUP) bit to the existing poll/select sets.  Since the
      existing POLLHUP handling, that does not report correctly half-closed
      devices, was feared to be changed, this implementation leaves the current
      POLLHUP reporting unchanged and simply add a new bit that is set in the few
      places where it makes sense.  The same thing was discussed and conceptually
      agreed quite some time ago:
      
      http://lkml.org/lkml/2003/7/12/116
      
      Since this new event bit is added to the existing Linux poll infrastruture,
      even the existing poll/select system calls will be able to use it.  As far
      as the existing POLLHUP handling, the patch leaves it as is.  The
      pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
      archs and sets the bit in the six relevant files.  The other attached diff
      is the simple change required to sys/epoll.h to add the EPOLLRDHUP
      definition.
      
      There is "a stupid program" to test POLLRDHUP delivery here:
      
       http://www.xmailserver.org/pollrdhup-test.c
      
      It tests poll(2), but since the delivery is same epoll(2) will work equally.
      Signed-off-by: NDavide Libenzi <davidel@xmailserver.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      f348d70a
  15. 21 3月, 2006 16 次提交