1. 17 May, 2012 1 commit
  2. 20 December, 2011 1 commit
  3. 01 August, 2011 1 commit
  4. 07 December, 2010 2 commits
  5. 29 October, 2010 1 commit
    • dccp: Refine the wait-for-ccid mechanism · b1fcf55e
      Gerrit Renker committed
      This extends the existing wait-for-ccid routine so that it may be used with
      different types of CCID, addressing the following problems:
      
       1) The queue-drain mechanism only works with rate-based CCIDs. If CCID-2 for
          example has a full TX queue and becomes network-limited just as the
          application wants to close, then waiting for CCID-2 to become unblocked
          could lead to an indefinite delay (i.e., application "hangs").
       2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
          in its sending policy while the queue is being drained. This can lead to
          further delays during which the application will not be able to terminate.
       3) The minimum wait time for CCID-3/4 can be expected to be the queue length
          times the current inter-packet delay. For example if tx_qlen=100 and a delay
          of 15 ms is used for each packet, then the application would have to wait
          for a minimum of 1.5 seconds before being allowed to exit.
       4) There is no way for the user/application to control this behaviour. It would
          be good to use the timeout argument of dccp_close() as an upper bound. Then
          the maximum time that an application is willing to wait for its CCIDs can
          be set via the SO_LINGER option.
      
      These problems are addressed by giving the CCID a grace period of up to the
      `timeout' value.
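
A minimal userspace sketch of the two ideas above: the lower bound from item 3 (queue length times inter-packet delay) and the SO_LINGER option from item 4 as the upper bound an application sets. The helper names `min_drain_ms` and `set_linger_seconds` are hypothetical and not part of the patch; `setsockopt()` with `SO_LINGER` is the standard socket API.

```c
#include <sys/socket.h>

/* Lower bound on the drain time from item 3, in milliseconds:
 * queue length times the current inter-packet delay. */
static unsigned long min_drain_ms(unsigned long tx_qlen, unsigned long delay_ms)
{
	return tx_qlen * delay_ms;
}

/* Hypothetical helper: bound the time close() may block on this socket. */
static int set_linger_seconds(int fd, int seconds)
{
	struct linger lg = { .l_onoff = 1, .l_linger = seconds };

	return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

With tx_qlen=100 and a 15 ms delay, `min_drain_ms()` gives the 1500 ms quoted in item 3, so a linger time below that would likely cut the drain short.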
      
      The wait-for-ccid function is, as before, used when the application
       (a) has read all the data in its receive buffer and
       (b) if SO_LINGER was set with a non-zero linger time, or
       (c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
           state (client application closes after receiving CloseReq).
      
      In addition, there is a catch-all case of __skb_queue_purge() after waiting for
      the CCID. This is necessary since the write queue may still have data when
       (a) the host has been passively-closed,
       (b) abnormal termination (unread data, zero linger time),
       (c) wait-for-ccid could not finish within the given time limit.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 12 October, 2010 1 commit
  7. 07 October, 2010 1 commit
  8. 26 June, 2010 1 commit
    • snmp: add align parameter to snmp_mib_init() · 1823e4c8
      Eric Dumazet committed
      In preparation for 64bit snmp counters for some mibs,
      add an 'align' parameter to snmp_mib_init(), instead
      of assuming mibs only contain 'unsigned long' fields.
      
      Callers can use __alignof__(type) to provide correct
      alignment.
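
As a rough illustration of what a caller would compute, the sketch below uses `__alignof__` (a GCC/Clang extension) on two hypothetical MIB layouts. The struct names and the `round_up_to` helper are assumptions for illustration only; they do not come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MIB layouts: native counters versus 64-bit counters. */
struct mib_ulong { unsigned long counters[4]; };
struct mib_u64   { uint64_t counters[4]; };

/* Round size up to a multiple of align (align must be a power of two). */
static size_t round_up_to(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/* The alignment a caller would pass for each layout. */
static size_t mib_ulong_align(void) { return __alignof__(struct mib_ulong); }
static size_t mib_u64_align(void)   { return __alignof__(struct mib_u64); }
```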
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Herbert Xu <herbert@gondor.apana.org.au>
      CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      CC: Vlad Yasevich <vladislav.yasevich@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 31 May, 2010 1 commit
  10. 21 April, 2010 1 commit
  11. 30 March, 2010 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Tejun Heo committed

      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
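
The first rule above (gfp.h if only gfp is used, slab.h if slab is used) can be sketched as a small C function. This is a hypothetical heuristic written for illustration, not a port of the slabh-sweep.py script; the list of slab identifiers is an assumption and far from complete.

```c
#include <stddef.h>
#include <string.h>

/* Decide which header a source text needs; NULL means neither. */
static const char *needed_header(const char *src)
{
	static const char *const slab_users[] = { "kmalloc(", "kzalloc(", "kfree(" };
	size_t i;

	for (i = 0; i < sizeof(slab_users) / sizeof(slab_users[0]); i++)
		if (strstr(src, slab_users[i]))
			return "linux/slab.h";	/* slab.h includes gfp.h itself */
	if (strstr(src, "GFP_"))
		return "linux/gfp.h";
	return NULL;
}
```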
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
  12. 16 March, 2010 1 commit
    • net-2.6 [Bug-Fix][dccp]: fix oops caused after failed initialisation · d14a0ebd
      Gerrit Renker committed
      dccp: fix panic caused by failed initialisation
      
      This fixes a kernel panic reported thanks to Andre Noll:
      
      if DCCP is compiled into the kernel and any of the initialisation
      steps in net/dccp/proto.c:dccp_init() fail, a subsequent attempt to create
      a SOCK_DCCP socket will panic, since inet{,6}_create() are not prevented
      from creating DCCP sockets.
      
      This patch fixes the problem by propagating a failure in dccp_init() to
      dccp_v{4,6}_init_net(), and from there to dccp_v{4,6}_init(), so that the
      DCCP protocol is not made available if its initialisation fails.
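
The propagation described above can be modelled in userspace as follows. This is only a sketch under stated assumptions: `core_init` and `family_init` are hypothetical stand-ins for dccp_init() and the per-family init functions, and the real kernel code is structured differently.

```c
#include <errno.h>
#include <stdbool.h>

static int core_init_status = -EAGAIN;	/* core not initialised yet */
static bool proto_registered;

/* Stand-in for the core initialisation; `fail` simulates a failing step. */
static int core_init(bool fail)
{
	core_init_status = fail ? -ENOMEM : 0;
	return core_init_status;
}

/* Stand-in for a per-family init: refuses to register if the core failed. */
static int family_init(void)
{
	if (core_init_status != 0)
		return core_init_status;
	proto_registered = true;
	return 0;
}
```

Because the failure is passed on, the protocol is never registered and no socket of that type can be created afterwards.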
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 17 February, 2010 1 commit
    • percpu: add __percpu sparse annotations to net · 7d720c3e
      Tejun Heo committed
      Add __percpu sparse annotations to net.
      
      These annotations are to make sparse consider percpu variables to be
      in a different address space and warn if accessed without going
      through percpu accessors.  This patch doesn't affect normal builds.
      
      The macro and type tricks around snmp stats make things a bit
      interesting.  DEFINE/DECLARE_SNMP_STAT() macros mark the target field
      as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly.  All
      snmp_mib_*() users which used to cast the argument to (void **) are
      updated to cast it to (void __percpu **).
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 13 February, 2010 1 commit
  15. 19 October, 2009 1 commit
    • inet: rename some inet_sock fields · c720c7e8
      Eric Dumazet committed
      In order to have better cache layouts of struct sock (separate zones
      for rx/tx paths), we need this preliminary patch.
      
      The goal is to transfer fields used at lookup time into the first
      read-mostly cache line (inside struct sock_common) and move sk_refcnt
      to a separate cache line (only written by rx path)
      
      This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
      sport and id fields. This allows a future patch to define these
      fields as macros, like sk_refcnt, without name clashes.
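
Why the prefix matters can be seen in a small sketch. The struct layout and the macro target below are assumptions made for illustration; only the idea of turning a prefixed field name into a macro comes from the message above.

```c
#include <stdint.h>

struct sock_common_sketch { uint32_t skc_daddr; };
struct sock_sketch        { struct sock_common_sketch __sk_common; };
struct inet_sock_sketch   { struct sock_sketch sk; };

/* With the inet_ prefix, the field name can become a macro ... */
#define inet_daddr sk.__sk_common.skc_daddr

/* ... without rewriting unrelated identifiers that are named plain `daddr`. */
static uint32_t copy_daddr(struct inet_sock_sketch *inet, uint32_t daddr)
{
	inet->inet_daddr = daddr;
	return inet->inet_daddr;
}
```

Had the macro been called `daddr`, the parameter of `copy_daddr()` would have been expanded as well and the function would no longer compile.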
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 13 October, 2009 1 commit
  17. 01 October, 2009 1 commit
  18. 22 September, 2009 1 commit
  19. 06 August, 2009 2 commits
    • net: mark read-only arrays as const · 36cbd3dc
      Jan Engelhardt committed
      String literals are constant, and usually, we can also tag the array
      of pointers const too, moving it to the .rodata section.
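
A small example of the pattern, with a hypothetical table of state names: the first `const` covers the characters, the second covers the pointers stored in the array.

```c
#include <stddef.h>

/* Strings and pointers are both read-only, so the array can live in .rodata. */
static const char *const state_names[] = { "CLOSED", "LISTEN", "OPEN" };

static const char *state_name(size_t idx)
{
	size_t n = sizeof(state_names) / sizeof(state_names[0]);

	return idx < n ? state_names[idx] : "UNKNOWN";
}
```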
      Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: missing destroy of percpu counter variable while unload module · 476181cb
      Wei Yongjun committed
      The percpu counter dccp_orphan_count is initialised in dccp_init() by
      percpu_counter_init() when the dccp module is loaded, but it is never
      destroyed when the module is unloaded, which triggers a kernel
      WARNING. Reproduce with the following commands:
      
        $ modprobe dccp
        $ rmmod dccp
        $ modprobe dccp
      
      WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
      Hardware name: VMware Virtual Platform
      list_add corruption. next->prev should be prev (c080c0c4), but was (null). (next=ca7188cc).
      Modules linked in: dccp(+) nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc
      Pid: 1956, comm: modprobe Not tainted 2.6.31-rc5 #55
      Call Trace:
       [<c042f8fa>] warn_slowpath_common+0x6a/0x81
       [<c053a6cb>] ? __list_add+0x27/0x5c
       [<c042f94f>] warn_slowpath_fmt+0x29/0x2c
       [<c053a6cb>] __list_add+0x27/0x5c
       [<c053c9b3>] __percpu_counter_init+0x4d/0x5d
       [<ca9c90c7>] dccp_init+0x19/0x2ed [dccp]
       [<c0401141>] do_one_initcall+0x4f/0x111
       [<ca9c90ae>] ? dccp_init+0x0/0x2ed [dccp]
       [<c06971b5>] ? notifier_call_chain+0x26/0x48
       [<c0444943>] ? __blocking_notifier_call_chain+0x45/0x51
       [<c04516f7>] sys_init_module+0xac/0x1bd
       [<c04028e4>] sysenter_do_call+0x12/0x22
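
The trace shows the counter being linked into a list during initialisation. A simplified userspace model of why init and destroy have to be paired is sketched below; the types and function names are hypothetical and much simpler than the kernel's percpu counter code.

```c
#include <stddef.h>

struct counter_sketch {
	struct counter_sketch *next;
};

static struct counter_sketch *all_counters;	/* global list of live counters */

/* Link the counter into the global list. */
static void counter_init(struct counter_sketch *c)
{
	c->next = all_counters;
	all_counters = c;
}

/* Unlink the counter again; skipping this leaves a stale list entry. */
static void counter_destroy(struct counter_sketch *c)
{
	struct counter_sketch **pp;

	for (pp = &all_counters; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == c) {
			*pp = c->next;
			break;
		}
	}
}

static size_t counters_listed(void)
{
	const struct counter_sketch *c;
	size_t n = 0;

	for (c = all_counters; c != NULL; c = c->next)
		n++;
	return n;
}
```

In this model, calling `counter_init()` twice on the same object without a `counter_destroy()` in between makes the entry point at itself, which is the kind of list corruption the warning reports.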
      Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 30 July, 2009 1 commit
  21. 10 July, 2009 1 commit
    • net: adding memory barrier to the poll and receive callbacks · a57de0b4
      Jiri Olsa committed
      Adding memory barrier after the poll_wait function, paired with
      receive callbacks. Adding functions sock_poll_wait and sk_has_sleeper
      to wrap the memory barrier.
      
      Without the memory barrier, following race can happen.
      The race fires, when following code paths meet, and the tp->rcv_nxt
      and __add_wait_queue updates stay in CPU caches.
      
      CPU1                         CPU2
      
      sys_select                   receive packet
        ...                        ...
        __add_wait_queue           update tp->rcv_nxt
        ...                        ...
        tp->rcv_nxt check          sock_def_readable
        ...                        {
        schedule                      ...
                                      if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                              wake_up_interruptible(sk->sk_sleep)
                                      ...
                                   }
      
      If there was no cache the code would work ok, since the wait_queue and
      rcv_nxt are opposite to each other.
      
      Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
      passed the tp->rcv_nxt check and sleeps, or will get the new value for
      tp->rcv_nxt and will return with new data mask.
      In both cases the process (CPU1) is being added to the wait queue, so the
      waitqueue_active (CPU2) call cannot miss and will wake up CPU1.
      
      The bad case is when the __add_wait_queue changes done by CPU1 stay in its
      cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1 will then
      end up calling schedule and sleep forever if there is no more data on the
      socket.
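
The pairing can be sketched with C11 atomics in userspace. This is an analogy rather than the kernel code: the two flags are hypothetical stand-ins for the wait queue and for tp->rcv_nxt, and `atomic_thread_fence()` plays the role of the full memory barrier.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool waiter_queued;	/* stands in for the wait queue entry */
static atomic_bool data_ready;		/* stands in for tp->rcv_nxt advancing */

/* Poll side: queue, full barrier, then check. true means "go to sleep". */
static bool poller_should_sleep(void)
{
	atomic_store_explicit(&waiter_queued, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return !atomic_load_explicit(&data_ready, memory_order_relaxed);
}

/* Receive side: publish, full barrier, then check. true means "wake up". */
static bool receiver_should_wake(void)
{
	atomic_store_explicit(&data_ready, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&waiter_queued, memory_order_relaxed);
}
```

With a full fence on both sides, at least one of the two checks observes the other side's store, so the case where the poller sleeps and the receiver skips the wake-up cannot occur.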
      
      Calls to poll_wait in following modules were omitted:
      	net/bluetooth/af_bluetooth.c
      	net/irda/af_irda.c
      	net/irda/irnet/irnet_ppp.c
      	net/mac80211/rc80211_pid_debugfs.c
      	net/phonet/socket.c
      	net/rds/af_rds.c
      	net/rfkill/core.c
      	net/sunrpc/cache.c
      	net/sunrpc/rpc_pipe.c
      	net/tipc/socket.c
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 22 January, 2009 1 commit
    • dccp: Implement both feature-local and feature-remote Sequence Window feature · 792b4878
      Gerrit Renker committed
      This adds full support for local/remote Sequence Window feature, from which the
        * sequence-number-validity (W) and
        * acknowledgment-number-validity (W') windows
      derive as specified in RFC 4340, 7.5.3.
      
      Specifically, the following is contained in this patch:
        * integrated new socket fields into dccp_sk;
        * updated the update_gsr/gss routines with regard to these fields;
        * updated handler code: the Sequence Window feature is located at the TX side,
          so the local feature is meant if the handler-rx flag is false;
        * the initialisation of `rcv_wnd' in reqsk is removed, since
          - rcv_wnd is not used by the code anywhere;
          - sequence number checks are not done in the LISTEN state (cf. 7.5.3);
          - dccp_check_req checks the Ack number validity more rigorously;
        * the `struct dccp_minisock' became empty and is now removed.
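
How the two validity windows derive from W and W' can be sketched as below. The formulas follow my reading of RFC 4340, section 7.5.1 (SWL = GSR + 1 - floor(W/4), SWH = GSR + ceil(3W/4), AWL = GSS - W' + 1, AWH = GSS); the clamping against the initial sequence numbers is left out, and the function names are hypothetical rather than taken from the patch.

```c
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)	/* sequence numbers are 48 bits */

struct seq_window { uint64_t low, high; };

/* Sequence-number validity window around GSR for window size W. */
static struct seq_window seqno_window(uint64_t gsr, uint64_t w)
{
	struct seq_window win = {
		.low  = (gsr + 1 - w / 4) & SEQ48_MASK,
		.high = (gsr + (3 * w + 3) / 4) & SEQ48_MASK,	/* ceil(3W/4) */
	};
	return win;
}

/* Acknowledgment-number validity window below GSS for window size W'. */
static struct seq_window ackno_window(uint64_t gss, uint64_t w_prime)
{
	struct seq_window win = {
		.low  = (gss - w_prime + 1) & SEQ48_MASK,
		.high = gss & SEQ48_MASK,
	};
	return win;
}
```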
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 05 January, 2009 1 commit
    • dccp: Lockless integration of CCID congestion-control plugins · ddebc973
      Gerrit Renker committed
      Based on Arnaldo's earlier patch, this patch integrates the standardised
      CCID congestion control plugins (CCID-2 and CCID-3) of DCCP with dccp.ko:
      
       * enables a faster connection path by eliminating the need to always go 
         through the CCID registration lock;
      
       * updates the implementation to use only a single array whose size equals
         the number of configured CCIDs instead of the maximum (256);
      
       * since the CCIDs are now fixed array elements, synchronization is no
         longer needed, simplifying use and implementation.
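
The idea of a fixed, compile-time table can be sketched as follows. The struct and function names are hypothetical and the entries are reduced to an id and a label; the point is only that a table that never changes at run time can be read without a lock.

```c
#include <stddef.h>
#include <stdint.h>

struct ccid_ops_sketch {
	uint8_t id;
	const char *name;
};

/* Sized by what is configured, not by the 256 possible CCID numbers. */
static const struct ccid_ops_sketch ccids[] = {
	{ 2, "TCP-like" },
	{ 3, "TFRC" },
};

/* Read-only lookup: no registration step, hence no lock to take. */
static const struct ccid_ops_sketch *ccid_by_number(uint8_t id)
{
	size_t i;

	for (i = 0; i < sizeof(ccids) / sizeof(ccids[0]); i++)
		if (ccids[i].id == id)
			return &ccids[i];
	return NULL;
}
```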
      
      CCID-2 is suggested as minimum for a basic DCCP implementation (RFC 4340, 10);
      CCID-3 is a standards-track CCID supported by RFC 4342 and RFC 5348.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 30 December, 2008 1 commit
  25. 08 December, 2008 2 commits
    • dccp ccid-2: Phase out the use of boolean Ack Vector sysctl · 6fdd34d4
      Gerrit Renker committed
      This removes the use of the sysctl and the minisock variable for the Send Ack
      Vector feature, as it now is handled fully dynamically via feature negotiation
      (i.e. when CCID-2 is enabled, Ack Vectors are automatically enabled as per
       RFC 4341, 4.).
      
      Using a sysctl in parallel to this implementation would open the door to
      crashes, since much of the code relies on tests of the boolean minisock /
      sysctl variable. Thus, this patch replaces all tests of type
      
      	if (dccp_msk(sk)->dccpms_send_ack_vector)
      		/* ... */
      with
      	if (dp->dccps_hc_rx_ackvec != NULL)
      		/* ... */
      
      The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
      negotiation concluded that Ack Vectors are to be used on the half-connection.
      Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
      so that the test is a valid one.
      
      The activation handler for Ack Vectors is called as soon as the feature
      negotiation has concluded at the
       * server when the Ack marking the transition RESPOND => OPEN arrives;
       * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.
      
      Adding the sequence number of the Response packet to the Ack Vector has been
      removed, since
       (a) connection establishment implies that the Response has been received;
       (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
           this entry will always be ignored;
       (c) it can not be used for anything useful - to detect loss for instance, only
           packets received after the loss can serve as pseudo-dupacks.
      
      There was a FIXME to change the error code when dccp_ackvec_add() fails.
      I removed this after finding out that:
       * the check whether ackno < ISN is already made earlier,
       * this Response is likely the 1st packet with an Ackno that the client gets,
       * so when dccp_ackvec_add() fails, the reason is likely not a packet error.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: Integration of dynamic feature activation - part 1 (socket setup) · 6eb55d17
      Gerrit Renker committed
      This first patch out of three replaces the hardcoded default settings with
      initialisation code for the dynamic feature negotiation.
      
      The patch also ensures that the client feature-negotiation queue is flushed
      only when entering the OPEN state.
      
      Since confirmed Change options are removed as soon as they are confirmed
      (in the DCCP-Response), this ensures that Confirm options are retransmitted.
      
      Note on retransmitting Confirm options:
      ---------------------------------------
      Implementation experience showed that it is necessary to retransmit Confirm
      options. Thanks to Leandro Melo de Sales who reported a bug in an earlier
      revision of the patch set, resulting from not retransmitting these options.
      
      As long as the client is in PARTOPEN, it needs to retransmit the Confirm
      options for the Change options received on the DCCP-Response from the server.
      
      Otherwise, if the packet containing the Confirm options gets dropped in the
      network, the connection aborts due to undefined feature negotiation state.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 26 November, 2008 1 commit
  27. 24 November, 2008 2 commits
  28. 20 November, 2008 1 commit
  29. 17 November, 2008 4 commits
  30. 12 November, 2008 3 commits
    • dccp: Resolve dependencies of features on choice of CCID · 9eca0a47
      Gerrit Renker committed
      This provides a missing link in the code chain, as several features implicitly
      depend and/or rely on the choice of CCID. Most notably, this is the Send Ack Vector
      feature, but also Ack Ratio and Send Loss Event Rate (also taken care of).
      
      For Send Ack Vector, the situation is as follows:
       * since CCID2 mandates the use of Ack Vectors, there is no point in allowing 
         endpoints which use CCID2 to disable Ack Vector features on such a connection;
      
       * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer
         with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4);
      
       * for all other CCIDs, the use of (Send) Ack Vector is optional and thus
         negotiable. However, this implies that the code negotiating the use of Ack
         Vectors also supports it (i.e. is able to supply and to either parse or
         ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack
         Vector support), the use of Ack Vectors is here disabled, with a comment
         in the source code.
      
      An analogous consideration arises for the Send Loss Event Rate feature,
      since the CCID-3 implementation does not support the loss interval options
      of RFC 4342. To make such use explicit, corresponding feature-negotiation
      options are inserted which signal the use of the loss event rate option,
      as it is used by the CCID3 code.
      
      Lastly, the values of the Ack Ratio feature are matched to the choice of CCID.
      
      The patch implements this as a function which is called after the user has
      made all other registrations for changing default values of features.
      
      The table is variable-length, the reserved (and hence for feature-negotiation
      invalid, confirmed by considering section 19.4 of RFC 4340) feature number `0'
      is used to mark the end of the table.
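
A variable-length table ended by the reserved feature number can be sketched like this. The struct, the table contents and the helper are hypothetical; feature number 6 is Send Ack Vector in RFC 4340, and the single entry is only illustrative.

```c
#include <stddef.h>
#include <stdint.h>

struct feat_dep {
	uint8_t feat_num;	/* 0 is reserved and marks the end of the table */
	uint8_t value;
};

/* Illustrative dependency list for one CCID. */
static const struct feat_dep ccid2_deps[] = {
	{ 6, 1 },	/* Send Ack Vector enabled */
	{ 0, 0 },	/* terminator */
};

/* Walk the table until the reserved feature number is reached. */
static size_t feat_dep_count(const struct feat_dep *table)
{
	size_t n = 0;

	while (table[n].feat_num != 0)
		n++;
	return n;
}
```

Using a sentinel avoids carrying a separate length for each per-CCID table, at the cost of one extra entry.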
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: Query supported CCIDs · d90ebcbf
      Gerrit Renker committed
      This provides a data structure to record which CCIDs are locally supported
      and three accessor functions:
       - a test function for internal use which is used to validate CCID requests
         made by the user;
       - a copy function so that the list can be used for feature-negotiation;   
       - documented getsockopt() support so that the user can query capabilities.
      
      The data structure is a table which is filled in at compile-time with the
      list of available CCIDs (which in turn depends on the Kconfig choices).
      
      Using the copy function for cloning the list of supported CCIDs is useful for
      feature negotiation, since the negotiation is now with the full list of available
      CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
      will not fail if the peer requests to use CCID3 instead of CCID2. 
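
The effect of negotiating with the full list can be illustrated with a simplified matching rule: pick the first entry of one preference list that also occurs in the other. The function below is a hypothetical sketch of that rule, not the kernel's feature-negotiation code.

```c
#include <stddef.h>
#include <stdint.h>

/* First entry of `pref` that also occurs in `other`, or -1 if there is none. */
static int first_shared(const uint8_t *pref, size_t npref,
			const uint8_t *other, size_t nother)
{
	size_t i, j;

	for (i = 0; i < npref; i++)
		for (j = 0; j < nother; j++)
			if (pref[i] == other[j])
				return pref[i];
	return -1;
}
```

A peer that asks for {3} finds no match against the default {2}, but does match against the full list {2, 3}.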
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: Registration routines for changing feature values · e8ef967a
      Gerrit Renker committed
      Two registration routines, for SP and NN features, are provided by this patch,
      replacing a previous routine which was used for both feature types.
      
      These are internal-only routines and therefore start with `__feat_register'.
      
      It further exports the known limits of Sequence Window and Ack Ratio as symbolic
      constants.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  31. 05 November, 2008 1 commit