1. 08 12月, 2008 3 次提交
    • G
      dccp ccid-2: Phase out the use of boolean Ack Vector sysctl · 6fdd34d4
      Gerrit Renker 提交于
      This removes the use of the sysctl and the minisock variable for the Send Ack
      Vector feature, as it now is handled fully dynamically via feature negotiation
      (i.e. when CCID-2 is enabled, Ack Vectors are automatically enabled as per
       RFC 4341, 4.).
      
      Using a sysctl in parallel to this implementation would open the door to
      crashes, since much of the code relies on tests of the boolean minisock /
      sysctl variable. Thus, this patch replaces all tests of type
      
      	if (dccp_msk(sk)->dccpms_send_ack_vector)
      		/* ... */
      with
      	if (dp->dccps_hc_rx_ackvec != NULL)
      		/* ... */
      
      The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
      negotiation concluded that Ack Vectors are to be used on the half-connection.
      Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
      so that the test is a valid one.
      
      The activation handler for Ack Vectors is called as soon as the feature
      negotiation has concluded at the
       * server when the Ack marking the transition RESPOND => OPEN arrives;
       * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.
      
      Adding the sequence number of the Response packet to the Ack Vector has been
      removed, since
       (a) connection establishment implies that the Response has been received;
       (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
           this entry will always be ignored;
       (c) it can not be used for anything useful - to detect loss for instance, only
           packets received after the loss can serve as pseudo-dupacks.
      
      There was a FIXME to change the error code when dccp_ackvec_add() fails.
      I removed this after finding out that:
       * the check whether ackno < ISN is already made earlier,
       * this Response is likely the 1st packet with an Ackno that the client gets,
       * so when dccp_ackvec_add() fails, the reason is likely not a packet error.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6fdd34d4
    • G
      dccp: Remove manual influence on NDP Count feature · 4098dce5
      Gerrit Renker 提交于
      Updating the NDP count feature is handled automatically now:
       * for CCID-2 it is disabled, since the code does not use NDP counts;
       * for CCID-3 it is enabled, as NDP counts are used to determine loss lengths.
      
      Allowing the user to change NDP values leads to unpredictable and failing
      behaviour, since it is then possible to disable NDP counts even when they
      are needed (e.g. in CCID-3).
      
      This means that only those user settings are sensible that agree with the
      values for Send NDP Count implied by the choice of CCID. But those settings
      are already activated by the feature negotiation (CCID dependency tracking),
      hence this form of support is redundant.
      
      At startup the initialisation of the NDP count feature uses the default
      value of 0, which is done implicitly by the zeroing-out of the socket when
      it is allocated. If the choice of CCID or feature negotiation enables NDP
      count, this will then be updated via the NDP activation handler.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4098dce5
    • G
      dccp: Remove obsolete parts of the old CCID interface · 0049bab5
      Gerrit Renker 提交于
      The TX/RX CCIDs of the minisock are now redundant: similar to the Ack Vector
      case, their value equals initially that of the sysctl, but at the end of
      feature negotiation may be something different.
      
      The old interface removed by this patch thus has been replaced by the newer
      interface to dynamically query the currently loaded CCIDs.
      
      Also removed are the constructors for the TX CCID and the RX CCID, since the
      switch "rx <-> non-rx" is done by the handler in minisocks.c (and the handler
      is the only place in the code where CCIDs are loaded).
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0049bab5
  2. 24 11月, 2008 1 次提交
  3. 17 11月, 2008 3 次提交
    • G
      dccp: Deprecate Ack Ratio sysctl · dd9c0e36
      Gerrit Renker 提交于
      This patch deprecates the Ack Ratio sysctl, since
       * Ack Ratio is entirely ignored by CCID-3 and CCID-4,
       * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1);
       * even if it would work in CCID-2, there is no point for a user to change it:
         - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2),
         - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts
           (since waiting for Acks which will never arrive in this window),
         - cwnd is not a user-configurable value.
      
      The only reasonable place for Ack Ratio is to print it for debugging. It is
      planned to do this later on, as part of e.g. dccp_probe.
      
      With this patch Ack Ratio is now under full control of feature negotiation:
       * Ack Ratio is resolved as a dependency of the selected CCID;
       * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to
         the default of 2, following RFC 4340, 11.3 - "New connections start with Ack
         Ratio 2 for both endpoints";
       * what happens then is part of another patch set, since it concerns the
         dynamic update of Ack Ratio while the connection is in full flight.
      
      Thanks to Tomasz Grobelny for discussion leading up to this patch.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dd9c0e36
    • G
      dccp: Feature negotiation for minimum-checksum-coverage · 29450559
      Gerrit Renker 提交于
      This provides feature negotiation for server minimum checksum coverage
      which so far has been missing.
      
      Since sender/receiver coverage values range only from 0...15, their
      type has also been reduced in size from u16 to u4.
      
      Feature-negotiation options are now generated for both sender and receiver
      coverage, i.e. when the peer has `forgotten' to enable partial coverage
      then feature negotiation will automatically enable (negotiate) the partial
      coverage value for this connection.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      29450559
    • G
      dccp: Deprecate old setsockopt framework · 49aebc66
      Gerrit Renker 提交于
      The previous setsockopt interface, which passed socket options via struct
      dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to
      ugly code since the old approach did not distinguish between NN and SP values.
      
      This patch removes the old setsockopt interface and replaces it with two new
      functions to register NN/SP values for feature negotiation. 
      These are essentially wrappers around the internal __feat_register functions,
      with checking added to avoid
      
       * wrong usage (type);
       * changing values while the connection is in progress.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      49aebc66
  4. 12 11月, 2008 1 次提交
    • G
      dccp: Query supported CCIDs · d90ebcbf
      Gerrit Renker 提交于
      This provides a data structure to record which CCIDs are locally supported
      and three accessor functions:
       - a test function for internal use which is used to validate CCID requests
         made by the user;
       - a copy function so that the list can be used for feature-negotiation;   
       - documented getsockopt() support so that the user can query capabilities.
      
      The data structure is a table which is filled in at compile-time with the
      list of available CCIDs (which in turn depends on the Kconfig choices).
      
      Using the copy function for cloning the list of supported CCIDs is useful for
      feature negotiation, since the negotiation is now with the full list of available
      CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
      will not fail if the peer requests to use CCID3 instead of CCID2. 
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d90ebcbf
  5. 05 11月, 2008 2 次提交
  6. 09 9月, 2008 1 次提交
  7. 04 9月, 2008 17 次提交
    • T
      dccp: Policy-based packet dequeueing infrastructure · d6da3511
      Tomasz Grobelny 提交于
      This patch adds a generic infrastructure for policy-based dequeueing of 
      TX packets and provides two policies:
       * a simple FIFO policy (which is the default) and
       * a priority based policy (set via socket options).
      Both policies honour the tx_qlen sysctl for the maximum size of the write
      queue (can be overridden via socket options). 
      
      The priority policy uses skb->priority internally to assign an u32 priority
      identifier, using the same ranking as SO_PRIORITY. The skb->priority field
      is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
      data using cmsg(3), the patch also provides the requisite parsing routines.
      Signed-off-by: NTomasz Grobelny <tomasz@grobelny.oswiecenia.net>
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      d6da3511
    • G
      dccp: Extend CCID packet dequeueing interface · e7937772
      Gerrit Renker 提交于
      This extends the packet dequeuing interface of dccp_write_xmit() to allow
       1. CCIDs to take care of timing when the next packet may be sent;
       2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).
      
      The main purpose is to take CCID2 out of its polling mode (when it is network-
      limited, it tries every millisecond to send, without interruption).
      The interface can also be used to support other CCIDs.
      
      The mode of operation for (2) is as follows:
       * new packet is enqueued via dccp_sendmsg() => dccp_write_xmit(),
       * ccid_hc_tx_send_packet() detects that it may not send (e.g. window full), 
       * it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
       * dccp_write_xmit() returns without further action;
       * after some time the wait-condition for CCID becomes true,
       * that CCID schedules the tasklet,
       * tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
       * since the wait-condition is now true, ccid_hc_tx_packet() returns "send now",
       * packet is sent, and possibly more (since dccp_write_xmit() loops).
      
      Code reuse: the taskled function calls dccp_write_xmit(), the timer function
                  reduces to a wrapper around the same code.
      
      If the tasklet finds that the socket is locked, it re-schedules the tasklet
      function (not the tasklet) after one jiffy.
      
      Changed DCCP_BUG to dccp_pr_debug when transmit_skb returns an error (e.g. when a
      local qdisc is used, NET_XMIT_DROP=1 can be returned for many packets).
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      e7937772
    • G
      dccp ccid-2: Schedule Sync as out-of-band mechanism · c2f42077
      Gerrit Renker 提交于
      The problem with Ack Vectors is that 
      
        i) their length is variable and can in principle grow quite large,
       ii) it is hard to predict exactly how large they will be.
      
      Due to the second point it seems not a good idea to reduce the MPS; in
      particular when on average there is enough room for the Ack Vector and an
      increase in length is momentarily due to some burst loss, after which the
      Ack Vector returns to its normal/average length.
      
      The solution taken by this patch is to subtract a minimum-expected Ack Vector
      length from the MPS (previous patch), and to defer any larger Ack Vectors onto
      a separate Sync - but only if indeed there is no space left on the skb.
      
      This patch provides the infrastructure to schedule Sync-packets for transporting
      (urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since
      it does not need to wait for new application data.
      
      It can thus serve other parts of the DCCP code as well.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      c2f42077
    • G
      dccp: Replace magic CCID-specific numbers by symbolic constants · f10ecaee
      Gerrit Renker 提交于
      The constants DCCPO_{MIN,MAX}_CCID_SPECIFIC are nowhere used in the code, but
      instead for the CCID-specific options numbers are used.
      
      This patch unifies the use of CCID-specific option numbers, by adding symbolic
      names reflecting the definitions in RFC 4340, 10.3.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      f10ecaee
    • G
      dccp: Initialisation and type-checking of feature sysctls · 0a482267
      Gerrit Renker 提交于
      This patch takes care of initialising and type-checking sysctls related to
      feature negotiation. Type checking is important since some of the sysctls
      now directly act on the feature-negotiation process.
      
      The sysctls are initialised with the known default values for each feature.
      For the type-checking the value constraints from RFC 4340 are used:
      
       * Sequence Window uses the specified Wmin=32, the maximum is ulong (4 bytes),
         tested and confirmed that it works up to 4294967295 - for Gbps speed;
       * Ack Ratio is between 0 .. 0xffff (2-byte unsigned integer);
       * CCIDs are between 0 .. 255;
       * request_retries, retries1, retries2 also between 0..255 for good measure;
       * tx_qlen is checked to be non-negative;
       * sync_ratelimit remains as before.
      
      Further changes:
      ----------------
      Performed s@sysctl_dccp_feat@sysctl_dccp@g since the sysctls are now in feat.c.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      0a482267
    • G
      dccp: Implement both feature-local and feature-remote Sequence Window feature · 51c7d4fa
      Gerrit Renker 提交于
      This adds full support for local/remote Sequence Window feature, from which the 
        * sequence-number-validity (W) and 
        * acknowledgment-number-validity (W') windows 
      derive as specified in RFC 4340, 7.5.3. 
      
      Specifically, the following changes are introduced:
        * integrated new socket fields into dccp_sk;
        * updated the update_gsr/gss routines with regard to these fields;
        * updated handler code: the Sequence Window feature is located at the TX side,
          so the local feature is meant if the handler-rx flag is false;
        * the initialisation of `rcv_wnd' in reqsk is removed, since
          - rcv_wnd is not used by the code anywhere;
          - sequence number checks are not done in the LISTEN state (cf. 7.5.3);
          - dccp_check_req checks the Ack number validity more rigorously;
        * the `struct dccp_minisock' became empty and is now removed.
      
      Until the handshake completes with activating negotiated values, the local/remote
      Sequence-Window values are undefined and thus can not reliably be estimated.
      This issue is addressed in a separate patch.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      51c7d4fa
    • G
      dccp: Initialisation framework for feature negotiation · 5d3dac26
      Gerrit Renker 提交于
      This initialises feature negotiation from two tables, which are initialised
      from sysctls. 
      
      As a novel feature, specifics of the implementation (e.g. currently short
      seqnos and ECN are not supported) are advertised for robustness.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      5d3dac26
    • G
      dccp ccid-2: Phase out the use of boolean Ack Vector sysctl · b235dc4a
      Gerrit Renker 提交于
      This removes the use of the sysctl and the minisock variable for the Send Ack
      Vector feature, which is now handled fully dynamically via feature negotiation;
      i.e. when CCID2 is enabled, Ack Vectors are automatically enabled (as per
      RFC 4341, 4.).
      
      Using a sysctl in parallel to this implementation would open the door to
      crashes, since much of the code relies on tests of the boolean minisock /
      sysctl variable. Thus, this patch replaces all tests of type
      
      	if (dccp_msk(sk)->dccpms_send_ack_vector)
      		/* ... */
      with
      	if (dp->dccps_hc_rx_ackvec != NULL)
      		/* ... */
      
      The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
      negotiation concluded that Ack Vectors are to be used on the half-connection.
      Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
      so that the test is a valid one.
      
      The activation handler for Ack Vectors is called as soon as the feature
      negotiation has concluded at the
       * server when the Ack marking the transition RESPOND => OPEN arrives;
       * client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.
      
      Adding the sequence number of the Response packet to the Ack Vector has been 
      removed, since
       (a) connection establishment implies that the Response has been received;
       (b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
           this entry will always be ignored;
       (c) it can not be used for anything useful - to detect loss for instance, only
           packets received after the loss can serve as pseudo-dupacks.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      b235dc4a
    • G
      dccp: Remove manual influence on NDP Count feature · 68e074bf
      Gerrit Renker 提交于
      Updating the NDP count feature is handled automatically now:
       * for CCID-2 it is disabled, since the code does not use NDP counts;
       * for CCID-3 it is enabled, as NDP counts are used to determine loss lengths.
      
      Allowing the user to change NDP values leads to unpredictable and failing
      behaviour, since it is then possible to disable NDP counts even when they
      are needed (e.g. in CCID-3).
      
      This means that only those user settings are sensible that agree with the
      values for Send NDP Count implied by the choice of CCID. But those settings
      are already activated by the feature negotiation (CCID dependency tracking),
      hence this form of support is redundant.
      
      At startup the initialisation of the NDP count feature is with the default
      value of 0, which is done implicitly by the zeroing-out of the socket when
      it is allocated. If the choice of CCID or feature negotiation enables NDP
      count, this will then be updated via the NDP activation handler.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      68e074bf
    • G
      dccp: Remove obsolete parts of the old CCID interface · 78673e24
      Gerrit Renker 提交于
      The TX/RX CCIDs of the minisock are now redundant: similar to the Ack Vector
      case, their value equals initially that of the sysctl, but at the end of
      feature negotiation may be something different.
      
      The old interface removed by this patch thus has been replaced by the newer
      interface to dynamically query the currently loaded CCIDs earlier in this
      patch set.
      
      Also removed the constructors for the TX CCID and the RX CCID, since the
      switch rx/non-rx is done by the handler in minisocks.c (and the handler is
      the only place in the code where CCIDs are loaded).
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      78673e24
    • G
      dccp: Set per-connection CCIDs via socket options · fade756f
      Gerrit Renker 提交于
      With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which
      overrides the defaults set by the global sysctl variables for TX/RX CCIDs.
      
      To make full use of this facility, the remaining patches of this patch set are
      needed, which track dependencies and activate negotiated feature values.
      
      Note on the maximum number of CCIDs that can be registered:
      -----------------------------------------------------------
      The maximum number of CCIDs that can be registered on the socket is constrained
      by the space in a Confirm/Change feature negotiation option. 
      
      The space in these in turn depends on the size of header options as defined
      in RFC 4340, 5.8. Since this is a recurring constant, it has been moved from
      ackvec.h into linux/dccp.h, clarifying its purpose.
      
      Relative to this size, the maximum number of CCID identifiers that can be 
      present in a Confirm option (which always consumes 1 byte more than a Change
      option, cf. 6.1) is 2 bytes less than the maximum TLV size: one for the
      CCID-feature-type and one for the selected value.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      fade756f
    • G
      dccp: Deprecate Ack Ratio sysctl · 17c30b40
      Gerrit Renker 提交于
      This patch deprecates the Ack Ratio sysctl, since
       * Ack Ratio is entirely ignored by CCID-3 and CCID-4,
       * Ack Ratio currently doesn't work in CCID-2 (i.e. is always set to 1);
       * even if it would work in CCID-2, there is no point for a user to change it:
         - Ack Ratio is constrained by cwnd (RFC 4341, 6.1.2),
         - if Ack Ratio > cwnd, the system resorts to spurious RTO timeouts 
           (since waiting for Acks which will never arrive in this window),
         - cwnd is not a user-configurable value.	
      
      The only reasonable place for Ack Ratio is to print it for debugging. It is
      planned to do this later on, as part of e.g. dccp_probe.
      
      With this patch Ack Ratio is now under full control of feature negotiation:
       * Ack Ratio is resolved as a dependency of the selected CCID;
       * if the chosen CCID supports it (i.e. CCID == CCID-2), Ack Ratio is set to
         the default of 2, following RFC 4340, 11.3 - "New connections start with Ack
         Ratio 2 for both endpoints";
       * what happens then is part of another patch set, since it concerns the 
         dynamic update of Ack Ratio while the connection is in full flight.
      
      Thanks to Tomasz Grobelny for discussion leading up to this patch.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      17c30b40
    • G
      dccp: Feature negotiation for minimum-checksum-coverage · 20f41eee
      Gerrit Renker 提交于
      This provides feature negotiation for server minimum checksum coverage
      which so far has been missing.
      
      Since sender/receiver coverage values range only from 0...15, their
      type has also been reduced in size from u16 to u4.
      
      Feature-negotiation options are now generated for both sender and receiver
      coverage, i.e. when the peer has `forgotten' to enable partial coverage
      then feature negotiation will automatically enable (negotiate) the partial
      coverage value for this connection.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      20f41eee
    • G
      dccp: Deprecate old setsockopt framework · 668144f7
      Gerrit Renker 提交于
      The previous setsockopt interface, which passed socket options via struct 
      dccp_so_feat, is complicated/difficult to use. Continuing to support it leads to
      ugly code since the old approach did not distinguish between NN and SP values.
      
      This patch removes the old setsockopt interface and replaces it with two new
      functions to register NN/SP values for feature negotiation. These are 
      essentially wrappers around the internal __feat_register functions, with 
      checking added to avoid
       * wrong usage (type);
       * changing values while the connection is in progress.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      668144f7
    • G
      dccp: Query supported CCIDs · 71bb4959
      Gerrit Renker 提交于
      This provides a data structure to record which CCIDs are locally supported
      and three accessor functions:
       - a test function for internal use which is used to validate CCID requests
         made by the user;
       - a copy function so that the list can be used for feature-negotiation;   
       - documented getsockopt() support so that the user can query capabilities.
      
      The data structure is a table which is filled in at compile-time with the
      list of available CCIDs (which in turn depends on the Kconfig choices).
      
      Using the copy function for cloning the list of supported CCIDs is useful for
      feature negotiation, since the negotiation is now with the full list of available
      CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation 
      will not fail if the peer requests to use CCID3 instead of CCID2. 
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      71bb4959
    • G
      dccp: Per-socket initialisation of feature negotiation · 828755ce
      Gerrit Renker 提交于
      This provides feature-negotiation initialisation for both DCCP sockets and
      DCCP request_sockets, to support feature negotiation during connection setup.
      
      It also resolves a FIXME regarding the congestion control initialisation.
      
      Thanks to Wei Yongjun for help with the IPv6 side of this patch.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      828755ce
    • G
      dccp: Implement lookup table for feature-negotiation information · b4eec206
      Gerrit Renker 提交于
      A lookup table for feature-negotiation information, extracted from RFC 4340/42,
      is provided by this patch. All currently known features can be found in this 
      table, along with their feature location, their default value, and type.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      b4eec206
  8. 13 7月, 2008 1 次提交
  9. 03 2月, 2008 1 次提交
  10. 29 1月, 2008 6 次提交
    • G
      [DCCP]: Handle timestamps on Request/Response exchange separately · b4d4f7c7
      Gerrit Renker 提交于
      In DCCP, timestamps can occur on packets anytime, CCID3 uses a timestamp(/echo) on the Request/Response
      exchange. This patch addresses the following situation:
      	* timestamps are recorded on the listening socket;
      	* Responses are sent from dccp_request_sockets;
      	* suppose two connections reach the listening socket with very small time in between:
      	* the first timestamp value gets overwritten by the second connection request.
      
      This is not really good, so this patch separates timestamps into
       * those which are received by the server during the initial handshake (on dccp_request_sock);
       * those which are received by the client or the client after connection establishment.
      
      As before, a timestamp of 0 is regarded as indicating that no (meaningful) timestamp has been
      received (in addition, a warning message is printed if hosts send 0-valued timestamps).
      
      The timestamp-echoing now works as follows:
       * when a timestamp is present on the initial Request, it is placed into dreq, due to the
         call to dccp_parse_options in dccp_v{4,6}_conn_request;
       * when a timestamp is present on the Ack leading from RESPOND => OPEN, it is copied over
         from the request_sock into the child cocket in dccp_create_openreq_child;
       * timestamps received on an (established) dccp_sock are treated as before.
      
      Since Elapsed Time is measured in hundredths of milliseconds (13.2), the new dccp_timestamp()
      function is used, as it is expected that the time between receiving the timestamp and
      sending the timestamp echo will be very small against the wrap-around time. As a byproduct,
      this allows smaller timestamping-time fields.
      
      Furthermore, inserting the Timestamp Echo option has been taken out of the block starting with
      '!dccp_packet_without_ack()', since Timestamp Echo can be carried on any packet (5.8 and 13.3).
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b4d4f7c7
    • G
      [DCCP]: Allow to parse options on Request Sockets · 8b819412
      Gerrit Renker 提交于
      The option parsing code currently only parses on full sk's. This causes a problem for
      options sent during the initial handshake (in particular timestamps and feature-negotiation
      options). Therefore, this patch extends the option parsing code with an additional argument
      for request_socks: if it is non-NULL, options are parsed on the request socket, otherwise
      the normal path (parsing on the sk) is used.
      
      Subsequent patches, which implement feature negotiation during connection setup, make use
      of this facility.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8b819412
    • G
      [DCCP]: Support for server holding timewait state · b8599d20
      Gerrit Renker 提交于
      This adds a socket option and signalling support for the case where the server
      holds timewait state on closing the connection, as described in RFC 4340, 8.3.
      
      Since holding timewait state at the server is the non-usual case, it is enabled
      via a socket option. Documentation for this socket option has been added.
      
      The setsockopt statement has been made resilient against different possible cases
      of expressing boolean `true' values using a suggestion by Ian McDonald.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b8599d20
    • G
      [DCCP]: Integrate state transitions for passive-close · 0c869620
      Gerrit Renker 提交于
      This adds the necessary state transitions for the two forms of passive-close
      
       * PASSIVE_CLOSE    - which is entered when a host   receives a Close;
       * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq.
      
      Here is a detailed account of what the patch does in each state.
      
      1) Receiving CloseReq
      
        The pseudo-code in 8.5 says:
      
           Step 13: Process CloseReq
                If P.type == CloseReq and S.state < CLOSEREQ,
                    Generate Close
                    S.state := CLOSING
                    Set CLOSING timer.
      
        This means we need to address what to do in CLOSED, LISTEN, REQUEST, RESPOND, PARTOPEN, and OPEN.
      
         * CLOSED:         silently ignore - it may be a late or duplicate CloseReq;
         * LISTEN/RESPOND: will not appear, since Step 7 is performed first (we know we are the client);
         * REQUEST:        perform Step 13 directly (no need to enqueue packet);
         * OPEN/PARTOPEN:  enter PASSIVE_CLOSEREQ so that the application has a chance to process unread data.
      
        When already in PASSIVE_CLOSEREQ, no second CloseReq is enqueued. In any other state, the CloseReq is ignored.
        I think that this offers some robustness against rare and pathological cases: e.g. a simultaneous close where
        the client sends a Close and the server a CloseReq. The client will then be retransmitting its Close until it
        gets the Reset, so ignoring the CloseReq while in state CLOSING is sane.
      
      2) Receiving Close
      
        The code below from 8.5 is unconditional.
      
           Step 14: Process Close
                If P.type == Close,
                    Generate Reset(Closed)
                    Tear down connection
                    Drop packet and return
      
        Thus we need to consider all states:
         * CLOSED:           silently ignore, since this can happen when a retransmitted or late Close arrives;
         * LISTEN:           dccp_rcv_state_process() will generate a Reset ("No Connection");
         * REQUEST:          perform Step 14 directly (no need to enqueue packet);
         * RESPOND:          dccp_check_req() will generate a Reset ("Packet Error") -- left it at that;
         * OPEN/PARTOPEN:    enter PASSIVE_CLOSE so that application has a chance to process unread data;
         * CLOSEREQ:         server performed active-close -- perform Step 14;
         * CLOSING:          simultaneous-close: use a tie-breaker to avoid message ping-pong (see comment);
         * PASSIVE_CLOSEREQ: ignore - the peer has a bug (sending first a CloseReq and now a Close);
         * TIMEWAIT:         packet is ignored.
      
         Note that the condition of receiving a packet in state CLOSED here is different from the condition "there
         is no socket for such a connection": the socket still exists, but its state indicates it is unusable.
      
         Last, dccp_finish_passive_close sets either DCCP_CLOSED or DCCP_CLOSING = TCP_CLOSING, so that
         sk_stream_wait_close() will wait for the final Reset (which will trigger CLOSING => CLOSED).
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0c869620
    • G
      [DCCP]: Dedicated auxiliary states to support passive-close · f11135a3
      Gerrit Renker 提交于
      This adds two auxiliary states to deal with passive closes:
        * PASSIVE_CLOSE    (reached from OPEN via reception of Close)    and
        * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq)
      as internal intermediate states.
      
      These states are used to allow a receiver to process unread data before
      acknowledging the received connection-termination-request (the Close/CloseReq).
      
      Without such support, it will happen that passively-closed sockets enter CLOSED
      state while there is still unprocessed data in the queue; leading to unexpected
      and erratic API behaviour.
      
      PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code will
      seamlessly work with inet_accept() (which tests for this state).
      
      The state names are thanks to Arnaldo, who suggested this naming scheme
      following an earlier revision of this patch.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f11135a3
    • G
      [DCCP]: Make PARTOPEN an autonomous state · 9b91ad27
      Gerrit Renker 提交于
      This decouples PARTOPEN from TCP-specific stream-states.
      
      It thus addresses the FIXME.
      
      The code has been checked with regard to dependency on PARTOPEN and FIN_WAIT1
      states (to which PARTOPEN previously was mapped): there is no difference, as
      PARTOPEN is always referred to directly (i.e. not via the mapping to TCP
      state).
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b91ad27
  11. 24 10月, 2007 2 次提交
    • G
      [DCCP]: Convert Reset code into socket error number · d8ef2c29
      Gerrit Renker 提交于
      This adds support for converting the 11 currently defined Reset codes into system
      error numbers, which are stored in sk_err for further interpretation.
      
      This makes the externally visible API behaviour similar to TCP, since a client
      connecting to a non-existing port will experience ECONNREFUSED.
      
      * Code 0, Unspecified, is interpreted as non-error (0);
      * Code 1, Closed (normal termination), also maps into 0;
      * Code 2, Aborted, maps into "Connection reset by peer" (ECONNRESET);
      * Code 3, No Connection and
        Code 7, Connection Refused, map into "Connection refused" (ECONNREFUSED);
      * Code 4, Packet Error, maps into "No message of desired type" (ENOMSG);
      * Code 5, Option Error, maps into "Illegal byte sequence" (EILSEQ);
      * Code 6, Mandatory Error, maps into "Operation not supported on transport endpoint" (EOPNOTSUPP);
      * Code 8, Bad Service Code, maps into "Invalid request code" (EBADRQC);
      * Code 9, Too Busy, maps into "Too many users" (EUSERS);
      * Code 10, Bad Init Cookie, maps into "Invalid request descriptor" (EBADR);
      * Code 11, Aggression Penalty, maps into "Quota exceeded" (EDQUOT)
        which makes sense in terms of using more than the `fair share' of bandwidth.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d8ef2c29
    • G
      [DCCP]: Retrieve packet sequence number for error reporting · fde20105
      Gerrit Renker 提交于
      This fixes a problem when analysing erroneous packets in dccp_v{4,6}_err:
      * dccp_hdr_seq currently takes an skb
      * however, the transport headers in the skb are shifted, due to the
        preceding IPv4/v6 header.
      Fixed for v4 and v6 by changing dccp_hdr_seq to take a struct dccp_hdr as
      argument. Verified that the correct sequence number is now reported in the
      error handler.
      Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: NIan McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fde20105
  12. 11 10月, 2007 2 次提交