1. 01 10月, 2009 1 次提交
  2. 22 9月, 2009 1 次提交
  3. 06 8月, 2009 2 次提交
    • J
      net: mark read-only arrays as const · 36cbd3dc
      Jan Engelhardt 提交于
      String literals are constant, and usually, we can also tag the array
      of pointers const too, moving it to the .rodata section.
      Signed-off-by: NJan Engelhardt <jengelh@medozas.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      36cbd3dc
    • W
      dccp: missing destroy of percpu counter variable while unload module · 476181cb
      Wei Yongjun 提交于
      percpu counter dccp_orphan_count is init in dccp_init() by
      percpu_counter_init() while dccp module is loaded, but the
      destroy of it is missing while dccp module is unloaded. We
      can get the kernel WARNING about this. Reproduct by the
      following commands:
      
        $ modprobe dccp
        $ rmmod dccp
        $ modprobe dccp
      
      WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
      Hardware name: VMware Virtual Platform
      list_add corruption. next->prev should be prev (c080c0c4), but was (null). (next
      =ca7188cc).
      Modules linked in: dccp(+) nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc
      Pid: 1956, comm: modprobe Not tainted 2.6.31-rc5 #55
      Call Trace:
       [<c042f8fa>] warn_slowpath_common+0x6a/0x81
       [<c053a6cb>] ? __list_add+0x27/0x5c
       [<c042f94f>] warn_slowpath_fmt+0x29/0x2c
       [<c053a6cb>] __list_add+0x27/0x5c
       [<c053c9b3>] __percpu_counter_init+0x4d/0x5d
       [<ca9c90c7>] dccp_init+0x19/0x2ed [dccp]
       [<c0401141>] do_one_initcall+0x4f/0x111
       [<ca9c90ae>] ? dccp_init+0x0/0x2ed [dccp]
       [<c06971b5>] ? notifier_call_chain+0x26/0x48
       [<c0444943>] ? __blocking_notifier_call_chain+0x45/0x51
       [<c04516f7>] sys_init_module+0xac/0x1bd
       [<c04028e4>] sysenter_do_call+0x12/0x22
      Signed-off-by: NWei Yongjun <yjwei@cn.fujitsu.com>
      Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      476181cb
  4. 30 7月, 2009 1 次提交
  5. 10 7月, 2009 1 次提交
    • J
      net: adding memory barrier to the poll and receive callbacks · a57de0b4
      Jiri Olsa 提交于
      Adding memory barrier after the poll_wait function, paired with
      receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
      to wrap the memory barrier.
      
      Without the memory barrier, following race can happen.
      The race fires, when following code paths meet, and the tp->rcv_nxt
      and __add_wait_queue updates stay in CPU caches.
      
      CPU1                         CPU2
      
      sys_select                   receive packet
        ...                        ...
        __add_wait_queue           update tp->rcv_nxt
        ...                        ...
        tp->rcv_nxt check          sock_def_readable
        ...                        {
        schedule                      ...
                                      if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                              wake_up_interruptible(sk->sk_sleep)
                                      ...
                                   }
      
      If there was no cache the code would work ok, since the wait_queue and
      rcv_nxt are opposit to each other.
      
      Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
      passed the tp->rcv_nxt check and sleeps, or will get the new value for
      tp->rcv_nxt and will return with new data mask.
      In both cases the process (CPU1) is being added to the wait queue, so the
      waitqueue_active (CPU2) call cannot miss and will wake up CPU1.
      
      The bad case is when the __add_wait_queue changes done by CPU1 stay in its
      cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1 will then
      endup calling schedule and sleep forever if there are no more data on the
      socket.
      
      Calls to poll_wait in following modules were ommited:
      	net/bluetooth/af_bluetooth.c
      	net/irda/af_irda.c
      	net/irda/irnet/irnet_ppp.c
      	net/mac80211/rc80211_pid_debugfs.c
      	net/phonet/socket.c
      	net/rds/af_rds.c
      	net/rfkill/core.c
      	net/sunrpc/cache.c
      	net/sunrpc/rpc_pipe.c
      	net/tipc/socket.c
      Signed-off-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a57de0b4
  6. 22 1月, 2009 1 次提交
    • G
      dccp: Implement both feature-local and feature-remote Sequence Window feature · 792b4878
      Gerrit Renker 提交于
      This adds full support for local/remote Sequence Window feature, from which the
        * sequence-number-validity (W) and
        * acknowledgment-number-validity (W') windows
      derive as specified in RFC 4340, 7.5.3.
      
      Specifically, the following is contained in this patch:
        * integrated new socket fields into dccp_sk;
        * updated the update_gsr/gss routines with regard to these fields;
        * updated handler code: the Sequence Window feature is located at the TX side,
          so the local feature is meant if the handler-rx flag is false;
        * the initialisation of `rcv_wnd' in reqsk is removed, since
          - rcv_wnd is not used by the code anywhere;
          - sequence number checks are not done in the LISTEN state (cf. 7.5.3);
          - dccp_check_req checks the Ack number validity more rigorously;
        * the `struct dccp_minisock' became empty and is now removed.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      792b4878
  7. 05 1月, 2009 1 次提交
    • G
      dccp: Lockless integration of CCID congestion-control plugins · ddebc973
      Gerrit Renker 提交于
      Based on Arnaldo's earlier patch, this patch integrates the standardised
      CCID congestion control plugins (CCID-2 and CCID-3) of DCCP with dccp.ko:
      
       * enables a faster connection path by eliminating the need to always go 
         through the CCID registration lock;
      
       * updates the implementation to use only a single array whose size equals
         the number of configured CCIDs instead of the maximum (256);
      
       * since the CCIDs are now fixed array elements, synchronization is no
         longer needed, simplifying use and implementation.
      
      CCID-2 is suggested as minimum for a basic DCCP implementation (RFC 4340, 10);
      CCID-3 is a standards-track CCID supported by RFC 4342 and RFC 5348.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ddebc973
  8. 30 12月, 2008 1 次提交
  9. 08 12月, 2008 2 次提交
    • G
      dccp ccid-2: Phase out the use of boolean Ack Vector sysctl · 6fdd34d4
      Gerrit Renker 提交于
      This removes the use of the sysctl and the minisock variable for the Send Ack
      Vector feature, as it now is handled fully dynamically via feature negotiation
      (i.e. when CCID-2 is enabled, Ack Vectors are automatically enabled as per
       RFC 4341, 4.).
      
      Using a sysctl in parallel to this implementation would open the door to
      crashes, since much of the code relies on tests of the boolean minisock /
      sysctl variable. Thus, this patch replaces all tests of type
      
      	if (dccp_msk(sk)->dccpms_send_ack_vector)
      		/* ... */
      with
      	if (dp->dccps_hc_rx_ackvec != NULL)
      		/* ... */
      
      The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
      negotiation concluded that Ack Vectors are to be used on the half-connection.
      Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
      so that the test is a valid one.
      
      The activation handler for Ack Vectors is called as soon as the feature
      negotiation has concluded at the
       * server when the Ack marking the transition RESPOND => OPEN arrives;
       * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.
      
      Adding the sequence number of the Response packet to the Ack Vector has been
      removed, since
       (a) connection establishment implies that the Response has been received;
       (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
           this entry will always be ignored;
       (c) it can not be used for anything useful - to detect loss for instance, only
           packets received after the loss can serve as pseudo-dupacks.
      
      There was a FIXME to change the error code when dccp_ackvec_add() fails.
      I removed this after finding out that:
       * the check whether ackno < ISN is already made earlier,
       * this Response is likely the 1st packet with an Ackno that the client gets,
       * so when dccp_ackvec_add() fails, the reason is likely not a packet error.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6fdd34d4
    • G
      dccp: Integration of dynamic feature activation - part 1 (socket setup) · 6eb55d17
      Gerrit Renker 提交于
      This first patch out of three replaces the hardcoded default settings with
      initialisation code for the dynamic feature negotiation.
      
      The patch also ensures that the client feature-negotiation queue is flushed
      only when entering the OPEN state.
      
      Since confirmed Change options are removed as soon as they are confirmed
      (in the DCCP-Response), this ensures that Confirm options are retransmitted.
      
      Note on retransmitting Confirm options:
      ---------------------------------------
      Implementation experience showed that it is necessary to retransmit Confirm
      options. Thanks to Leandro Melo de Sales who reported a bug in an earlier
      revision of the patch set, resulting from not retransmitting these options.
      
      As long as the client is in PARTOPEN, it needs to retransmit the Confirm
      options for the Change options received on the DCCP-Response from the server.
      
      Otherwise, if the packet containing the Confirm options gets dropped in the
      network, the connection aborts due to undefined feature negotiation state.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6eb55d17
  10. 26 11月, 2008 1 次提交
  11. 24 11月, 2008 2 次提交
  12. 20 11月, 2008 1 次提交
  13. 17 11月, 2008 4 次提交
  14. 12 11月, 2008 3 次提交
    • G
      dccp: Resolve dependencies of features on choice of CCID · 9eca0a47
      Gerrit Renker 提交于
      This provides a missing link in the code chain, as several features implicitly
      depend and/or rely on the choice of CCID. Most notably, this is the Send Ack Vector
      feature, but also Ack Ratio and Send Loss Event Rate (also taken care of).
      
      For Send Ack Vector, the situation is as follows:
       * since CCID2 mandates the use of Ack Vectors, there is no point in allowing 
         endpoints which use CCID2 to disable Ack Vector features such a connection;
      
       * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer
         with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4);
      
       * for all other CCIDs, the use of (Send) Ack Vector is optional and thus
         negotiable. However, this implies that the code negotiating the use of Ack
         Vectors also supports it (i.e. is able to supply and to either parse or
         ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack
         Vector support), the use of Ack Vectors is here disabled, with a comment
         in the source code.
      
      An analogous consideration arises for the Send Loss Event Rate feature,
      since the CCID-3 implementation does not support the loss interval options
      of RFC 4342. To make such use explicit, corresponding feature-negotiation
      options are inserted which signal the use of the loss event rate option,
      as it is used by the CCID3 code.
      
      Lastly, the values of the Ack Ratio feature are matched to the choice of CCID.
      
      The patch implements this as a function which is called after the user has
      made all other registrations for changing default values of features.
      
      The table is variable-length, the reserved (and hence for feature-negotiation
      invalid, confirmed by considering section 19.4 of RFC 4340) feature number `0'
      is used to mark the end of the table.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9eca0a47
    • G
      dccp: Query supported CCIDs · d90ebcbf
      Gerrit Renker 提交于
      This provides a data structure to record which CCIDs are locally supported
      and three accessor functions:
       - a test function for internal use which is used to validate CCID requests
         made by the user;
       - a copy function so that the list can be used for feature-negotiation;   
       - documented getsockopt() support so that the user can query capabilities.
      
      The data structure is a table which is filled in at compile-time with the
      list of available CCIDs (which in turn depends on the Kconfig choices).
      
      Using the copy function for cloning the list of supported CCIDs is useful for
      feature negotiation, since the negotiation is now with the full list of available
      CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
      will not fail if the peer requests to use CCID3 instead of CCID2. 
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d90ebcbf
    • G
      dccp: Registration routines for changing feature values · e8ef967a
      Gerrit Renker 提交于
      Two registration routines, for SP and NN features, are provided by this patch,
      replacing a previous routine which was used for both feature types.
      
      These are internal-only routines and therefore start with `__feat_register'.
      
      It further exports the known limits of Sequence Window and Ack Ratio as symbolic
      constants.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e8ef967a
  15. 05 11月, 2008 2 次提交
  16. 09 9月, 2008 1 次提交
  17. 04 9月, 2008 15 次提交
    • G
      dccp tfrc: Let dccp_tfrc_lib do the sampling work · 34a081be
      Gerrit Renker 提交于
      This migrates more TFRC-related code into the dccp_tfrc_lib:
       * sampling of the packet size `s' (which is only needed until the first
         loss interval is computed (ccid3_first_li));
       * updating the byte-counter `bytes_recvd' in between sending feedbacks.
      The result is a better separation of CCID-3 specific and TFRC specific
      code, which aids future integration with ECN and e.g. CCID-4.
      
      Further changes:
      ----------------
       * replaced magic number of 536 with equivalent constant TCP_MIN_RCVMSS;
         (this constant is also used when no estimate for `s' is available).
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      34a081be
    • T
      dccp qpolicy: Parameter checking of cmsg qpolicy parameters · 7d1af6a8
      Tomasz Grobelny 提交于
      Ensure that cmsg->cmsg_type value is valid for qpolicy 
      that is currently in use.
      Signed-off-by: NTomasz Grobelny <tomasz@grobelny.oswiecenia.net>
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      7d1af6a8
    • T
      dccp: Policy-based packet dequeueing infrastructure · d6da3511
      Tomasz Grobelny 提交于
      This patch adds a generic infrastructure for policy-based dequeueing of 
      TX packets and provides two policies:
       * a simple FIFO policy (which is the default) and
       * a priority based policy (set via socket options).
      Both policies honour the tx_qlen sysctl for the maximum size of the write
      queue (can be overridden via socket options). 
      
      The priority policy uses skb->priority internally to assign an u32 priority
      identifier, using the same ranking as SO_PRIORITY. The skb->priority field
      is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
      data using cmsg(3), the patch also provides the requisite parsing routines.
      Signed-off-by: NTomasz Grobelny <tomasz@grobelny.oswiecenia.net>
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      d6da3511
    • G
      dccp: Refine the wait-for-ccid mechanism · 146993cf
      Gerrit Renker 提交于
      This extends the existing wait-for-ccid routine so that it may be used with
      different types of CCID. It further addresses the problems listed below.
      
      The code looks if the write queue is non-empty and grants the TX CCID up to
      `timeout' jiffies to drain the queue. It will instead purge that queue if
       * the delay suggested by the CCID exceeds the time budget;
       * a socket error occurred while waiting for the CCID;
       * there is a signal pending (eg. annoyed user pressed Control-C);
       * the CCID does not support delays (we don't know how long it will take).
      
      
                       D e t a i l s  [can be removed]
                       -------------------------------
      DCCP's sending mechanism functions a bit like non-blocking I/O: dccp_sendmsg()
      will enqueue up to net.dccp.default.tx_qlen packets (default=5), without waiting
      for them to be released to the network.
      
      Rate-based CCIDs, such as CCID3/4, can impose sending delays of up to maximally
      64 seconds (t_mbi in RFC 3448). Hence the write queue may still contain packets
      when the application closes. Since the write queue is congestion-controlled by
      the CCID, draining the queue is also under control of the CCID.
      
      There are several problems that needed to be addressed:
       1) The queue-drain mechanism only works with rate-based CCIDs. If CCID2 for
          example has a full TX queue and becomes network-limited just as the
          application wants to close, then waiting for CCID2 to become unblocked could
          lead to an indefinite  delay (i.e., application "hangs").
       2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
          in its sending policy while the queue is being drained. This can lead to
          further delays during which the application will not be able to terminate.
       3) The minimum wait time for CCID3/4 can be expected to be the queue length
          times the current inter-packet delay. For example if tx_qlen=100 and a delay
          of 15 ms is used for each packet, then the application would have to wait
          for a minimum of 1.5 seconds before being allowed to exit.
       4) There is no way for the user/application to control this behaviour. It would
          be good to use the timeout argument of dccp_close() as an upper bound. Then
          the maximum time that an application is willing to wait for its CCIDs to can
          be set via the SO_LINGER option.
      
      These problems are addressed by giving the CCID a grace period of up to the
      `timeout' value.
      
      The wait-for-ccid function is, as before, used when the application 
       (a) has read all the data in its receive buffer and
       (b) if SO_LINGER was set with a non-zero linger time, or
       (c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
           state (client application closes after receiving CloseReq).
      
      In addition, there is a catch-all case by calling __skb_queue_purge() after 
      waiting for the CCID. This is necessary since the write queue may still have
      data when
       (a) the host has been passively-closed,
       (b) abnormal termination (unread data, zero linger time),
       (c) wait-for-ccid could not finish within the given time limit.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      146993cf
    • G
      dccp ccid-2: Use feature-negotiation to report Ack Ratio changes · 2faae558
      Gerrit Renker 提交于
      This uses the new feature-negotiation framework to signal Ack Ratio changes,
      as required by RFC 4341, sec. 6.1.2.
      
      This raises some problems for CCID-2 since it can at the moment not cope
      gracefully with Ack Ratio of e.g. 2. A FIXME has thus been added which
      reverts to the existing policy of bypassing the Ack Ratio sysctl.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      2faae558
    • G
      dccp: Implement both feature-local and feature-remote Sequence Window feature · 51c7d4fa
      Gerrit Renker 提交于
      This adds full support for local/remote Sequence Window feature, from which the 
        * sequence-number-validity (W) and 
        * acknowledgment-number-validity (W') windows 
      derive as specified in RFC 4340, 7.5.3. 
      
      Specifically, the following changes are introduced:
        * integrated new socket fields into dccp_sk;
        * updated the update_gsr/gss routines with regard to these fields;
        * updated handler code: the Sequence Window feature is located at the TX side,
          so the local feature is meant if the handler-rx flag is false;
        * the initialisation of `rcv_wnd' in reqsk is removed, since
          - rcv_wnd is not used by the code anywhere;
          - sequence number checks are not done in the LISTEN state (cf. 7.5.3);
          - dccp_check_req checks the Ack number validity more rigorously;
        * the `struct dccp_minisock' became empty and is now removed.
      
      Until the handshake completes with activating negotiated values, the local/remote
      Sequence-Window values are undefined and thus can not reliably be estimated.
      This issue is addressed in a separate patch.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      51c7d4fa
    • G
      dccp ccid-2: Phase out the use of boolean Ack Vector sysctl · b235dc4a
      Gerrit Renker 提交于
      This removes the use of the sysctl and the minisock variable for the Send Ack
      Vector feature, which is now handled fully dynamically via feature negotiation;
      i.e. when CCID2 is enabled, Ack Vectors are automatically enabled (as per
      RFC 4341, 4.).
      
      Using a sysctl in parallel to this implementation would open the door to
      crashes, since much of the code relies on tests of the boolean minisock /
      sysctl variable. Thus, this patch replaces all tests of type
      
      	if (dccp_msk(sk)->dccpms_send_ack_vector)
      		/* ... */
      with
      	if (dp->dccps_hc_rx_ackvec != NULL)
      		/* ... */
      
      The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
      negotiation concluded that Ack Vectors are to be used on the half-connection.
      Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
      so that the test is a valid one.
      
      The activation handler for Ack Vectors is called as soon as the feature
      negotiation has concluded at the
       * server when the Ack marking the transition RESPOND => OPEN arrives;
       * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.
      
      Adding the sequence number of the Response packet to the Ack Vector has been 
      removed, since
       (a) connection establishment implies that the Response has been received;
       (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
           this entry will always be ignored;
       (c) it can not be used for anything useful - to detect loss for instance, only
           packets received after the loss can serve as pseudo-dupacks.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      b235dc4a
    • G
      dccp: Integration of dynamic feature activation - part 1 (socket setup) · 3a53a9ad
      Gerrit Renker 提交于
      This first patch out of three replaces the hardcoded default settings with
      initialisation code for the dynamic feature negotiation.
      
      Note on retransmitting Confirm options:
      ---------------------------------------
      This patch also defers flushing the client feature-negotiation queue,
      due to the following considerations.
      
      As long as the client is in PARTOPEN, it needs to retransmit the Confirm
      options for the Change options received on the DCCP-Response from the server.
      
      Otherwise, if the packet containing the Confirm options gets dropped in the 
      network, the connection aborts due to undefined feature negotiation state.
      
      Thanks to Leandro Melo de Sales who reported a bug in an earlier revision
      of the patch set, resulting from not retransmitting the Confirm options.
      
      The patch now ensures that the client feature-negotiation queue is flushed only
      when entering the OPEN state. Since confirmed Change options are removed as
      soon as they are confirmed (in the DCCP-Response), this ensures that Confirm
      options are retransmitted.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      3a53a9ad
    • G
      dccp: API to query the current TX/RX CCID · c8041e26
      Gerrit Renker 提交于
      This provides function to query the current TX/RX CCID dynamically, without
      reliance on the minisock value, using dynamic information available in the
      currently loaded CCID module.
      
      This query function is then used to 
       (a) provide the getsockopt part for getting/setting CCIDs via sockopts;
       (b) replace the current test for "which CCID is in use" in probe.c.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      c8041e26
    • G
      dccp: Set per-connection CCIDs via socket options · fade756f
      Gerrit Renker 提交于
      With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which
      overrides the defaults set by the global sysctl variables for TX/RX CCIDs.
      
      To make full use of this facility, the remaining patches of this patch set are
      needed, which track dependencies and activate negotiated feature values.
      
      Note on the maximum number of CCIDs that can be registered:
      -----------------------------------------------------------
      The maximum number of CCIDs that can be registered on the socket is constrained
      by the space in a Confirm/Change feature negotiation option. 
      
      The space in these in turn depends on the size of header options as defined
      in RFC 4340, 5.8. Since this is a recurring constant, it has been moved from
      ackvec.h into linux/dccp.h, clarifying its purpose.
      
      Relative to this size, the maximum number of CCID identifiers that can be 
      present in a Confirm option (which always consumes 1 byte more than a Change
      option, cf. 6.1) is 2 bytes less than the maximum TLV size: one for the
      CCID-feature-type and one for the selected value.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      fade756f
    • G
      dccp: Tidy up setsockopt calls · 73bbe095
      Gerrit Renker 提交于
      This splits the setsockopt calls into two groups, depending on whether an
      integer argument (val) is required and whether routines being called do
      their own locking.
      
      Some options (such as setting the CCID) use u8 rather than int, so that for
      these the test with regard to integer-sizeof can not be used.
      
      The second switch-case statement now only has those statements which need
      locking and which make use of `val'.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: NEugene Teo <eugeneteo@kernel.sg>
      73bbe095
    • G
      dccp: Feature negotiation for minimum-checksum-coverage · 20f41eee
      Gerrit Renker 提交于
      This provides feature negotiation for server minimum checksum coverage
      which so far has been missing.
      
      Since sender/receiver coverage values range only from 0...15, their
      type has also been reduced in size from u16 to u4.
      
      Feature-negotiation options are now generated for both sender and receiver
      coverage, i.e. when the peer has `forgotten' to enable partial coverage
      then feature negotiation will automatically enable (negotiate) the partial
      coverage value for this connection.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      20f41eee
    • G
      dccp: Deprecate old setsockopt framework · 668144f7
      Gerrit Renker 提交于
      The previous setsockopt interface, which passed socket options via struct 
      dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to
      ugly code since the old approach did not distinguish between NN and SP values.
      
      This patch removes the old setsockopt interface and replaces it with two new
      functions to register NN/SP values for feature negotiation. These are 
      essentially wrappers around the internal __feat_register functions, with 
      checking added to avoid
       * wrong usage (type);
       * changing values while the connection is in progress.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      668144f7
    • G
      dccp: Resolve dependencies of features on choice of CCID · 093e1f46
      Gerrit Renker 提交于
      This provides a missing link in the code chain, as several features implicitly
      depend and/or rely on the choice of CCID. Most notably, this is the Send Ack Vector
      feature, but also Ack Ratio and Send Loss Event Rate (also taken care of).
      
      For Send Ack Vector, the situation is as follows:
       * since CCID2 mandates the use of Ack Vectors, there is no point in allowing 
         endpoints which use CCID2 to disable Ack Vector features such a connection;
      
       * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer
         with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4);
      
       * for all other CCIDs, the use of (Send) Ack Vector is optional and thus
         negotiable. However, this implies that the code negotiating the use of Ack
         Vectors also supports it (i.e. is able to supply and to either parse or
         ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack
         Vector support), the use of Ack Vectors is here disabled, with a comment
         in the source code.
      
      An analogous consideration arises for the Send Loss Event Rate feature,
      since the CCID-3 implementation does not support the loss interval options
      of RFC 4342. To make such use explicit, corresponding feature-negotiation
      options are inserted which signal the use of the loss event rate option,
      as it is used by the CCID3 code.
      
      Lastly, the values of the Ack Ratio feature are matched to the choice of CCID.
      
      The patch implements this as a function which is called after the user has
      made all other registrations for changing default values of features.
      
      The table is variable-length, the reserved (and hence for feature-negotiation
      invalid, confirmed by considering section 19.4 of RFC 4340) feature number `0'
      is used to mark the end of the table.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      093e1f46
    • G
      dccp: Query supported CCIDs · 71bb4959
      Gerrit Renker 提交于
      This provides a data structure to record which CCIDs are locally supported
      and three accessor functions:
       - a test function for internal use which is used to validate CCID requests
         made by the user;
       - a copy function so that the list can be used for feature-negotiation;   
       - documented getsockopt() support so that the user can query capabilities.
      
      The data structure is a table which is filled in at compile-time with the
      list of available CCIDs (which in turn depends on the Kconfig choices).
      
      Using the copy function for cloning the list of supported CCIDs is useful for
      feature negotiation, since the negotiation is now with the full list of available
      CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
      will not fail if the peer requests to use CCID3 instead of CCID2. 
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      71bb4959