1. 17 Nov, 2008 1 commit
  2. 12 Nov, 2008 1 commit
    • dccp: Resolve dependencies of features on choice of CCID · 9eca0a47
      Gerrit Renker committed
      This provides a missing link in the code chain, as several features implicitly
      depend and/or rely on the choice of CCID. Most notably, this is the Send Ack Vector
      feature, but also Ack Ratio and Send Loss Event Rate (also taken care of).
      
      For Send Ack Vector, the situation is as follows:
       * since CCID2 mandates the use of Ack Vectors, there is no point in allowing
         endpoints which use CCID2 to disable Ack Vector features on such a connection;
      
       * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer
         with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4);
      
       * for all other CCIDs, the use of (Send) Ack Vector is optional and thus
         negotiable. However, this implies that the code negotiating the use of Ack
         Vectors also supports it (i.e. is able to supply and to either parse or
         ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack
         Vector support), the use of Ack Vectors is here disabled, with a comment
         in the source code.
      
      An analogous consideration arises for the Send Loss Event Rate feature,
      since the CCID-3 implementation does not support the loss interval options
      of RFC 4342. To make such use explicit, corresponding feature-negotiation
      options are inserted which signal the use of the loss event rate option,
      as it is used by the CCID3 code.
      
      Lastly, the values of the Ack Ratio feature are matched to the choice of CCID.
      
      The patch implements this as a function which is called after the user has
      made all other registrations for changing default values of features.
      
      The table is variable-length; the reserved feature number `0' (which is hence
      invalid for feature negotiation, as confirmed by section 19.4 of RFC 4340)
      is used to mark the end of the table.
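
The zero-terminated table can be sketched as follows. This is a minimal illustration and not the kernel code: the struct layout, the example entries and the helper name are hypothetical. Only the convention that feature number 0 is reserved and ends the table is taken from the text above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one feature whose value depends on the CCID. */
struct ccid_dependency {
	uint8_t feat_num;   /* feature number; 0 is reserved and marks the end */
	uint8_t is_local;   /* 1 = applies to the local endpoint, 0 = remote   */
	uint8_t mandatory;  /* 1 = the peer must agree to the value            */
	uint8_t value;      /* value to register for this feature              */
};

/* Example table for illustration only; the entries are made up. */
static const struct ccid_dependency example_deps[] = {
	{ .feat_num = 6, .is_local = 1, .mandatory = 1, .value = 1 },
	{ .feat_num = 6, .is_local = 0, .mandatory = 1, .value = 1 },
	{ .feat_num = 0 }   /* sentinel: end of table */
};

/* Count entries by walking the table until the reserved feature number 0. */
static size_t count_dependencies(const struct ccid_dependency *table)
{
	size_t n = 0;

	while (table[n].feat_num != 0)
		n++;
	return n;
}
```

Because the sentinel is part of the data, tables of different lengths can be passed around without a separate length argument.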
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 20 Oct, 2008 1 commit
  4. 09 Sep, 2008 1 commit
  5. 04 Sep, 2008 11 commits
    • dccp: Policy-based packet dequeueing infrastructure · d6da3511
      Tomasz Grobelny committed
      This patch adds a generic infrastructure for policy-based dequeueing of 
      TX packets and provides two policies:
       * a simple FIFO policy (which is the default) and
       * a priority based policy (set via socket options).
      Both policies honour the tx_qlen sysctl for the maximum size of the write
      queue (can be overridden via socket options). 
      
      The priority policy uses skb->priority internally to assign a u32 priority
      identifier, using the same ranking as SO_PRIORITY. The skb->priority field
      is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
      data using cmsg(3); the patch also provides the requisite parsing routines.
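
The difference between the two policies can be sketched with a toy queue. This is an illustration under stated assumptions, not the patch itself: the struct and function names are hypothetical, a larger priority value is assumed to mean "more urgent", and ties are assumed to go to the packet queued first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queued TX packet. */
struct tx_packet {
	uint32_t priority;  /* larger value = more urgent (assumed ranking) */
	int      id;        /* only used to tell packets apart              */
};

/* FIFO policy: always dequeue the packet at the head of the queue. */
static size_t pick_fifo(const struct tx_packet *q, size_t len)
{
	(void)q;
	(void)len;
	return 0;
}

/* Priority policy: dequeue the most urgent packet; the earliest one wins ties.
 * The queue must hold at least one packet. */
static size_t pick_priority(const struct tx_packet *q, size_t len)
{
	size_t best = 0;

	for (size_t i = 1; i < len; i++)
		if (q[i].priority > q[best].priority)
			best = i;
	return best;
}
```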
      Signed-off-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net>
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Combine the functionality of enqueueing and cloning · b25b0c60
      Gerrit Renker committed
      Realising the following call pattern,
       * first dccp_entail() is called to enqueue a new skb and
       * then skb_clone() is called to transmit a clone of that skb,
      
      this patch integrates both interrelated steps into dccp_entail().
      
      Note: the return value of skb_clone() is not checked. It may be an idea to add a
            warning if cloning fails. In both instances, however, a timer is set for
            retransmission, so that cloning is re-tried via dccp_retransmit_skb().
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Refine the wait-for-ccid mechanism · 146993cf
      Gerrit Renker committed
      This extends the existing wait-for-ccid routine so that it may be used with
      different types of CCID. It further addresses the problems listed below.
      
      The code checks whether the write queue is non-empty and grants the TX CCID up to
      `timeout' jiffies to drain the queue. It will instead purge that queue if
       * the delay suggested by the CCID exceeds the time budget;
       * a socket error occurred while waiting for the CCID;
       * there is a signal pending (e.g. annoyed user pressed Control-C);
       * the CCID does not support delays (we don't know how long it will take).
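
The four purge conditions above can be written as a single predicate. The sketch below is a model for illustration only: the struct, its fields and the function name are hypothetical and do not correspond to the kernel's data structures.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the conditions named in the list above. */
struct drain_state {
	bool ccid_supports_delay;  /* can the CCID tell us how long to wait?   */
	long ccid_delay;           /* delay suggested by the CCID, in jiffies  */
	long time_budget;          /* what is left of `timeout', in jiffies    */
	bool socket_error;         /* an error occurred while waiting          */
	bool signal_pending;       /* e.g. the user pressed Control-C          */
};

/* Return true if the write queue should be purged instead of drained. */
static bool should_purge_queue(const struct drain_state *s)
{
	if (!s->ccid_supports_delay)
		return true;
	if (s->socket_error || s->signal_pending)
		return true;
	return s->ccid_delay > s->time_budget;
}
```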
      
      
                       D e t a i l s  [can be removed]
                       -------------------------------
      DCCP's sending mechanism functions a bit like non-blocking I/O: dccp_sendmsg()
      will enqueue up to net.dccp.default.tx_qlen packets (default=5), without waiting
      for them to be released to the network.
      
      Rate-based CCIDs, such as CCID3/4, can impose sending delays of up to
      64 seconds (t_mbi in RFC 3448). Hence the write queue may still contain packets
      when the application closes. Since the write queue is congestion-controlled by
      the CCID, draining the queue is also under control of the CCID.
      
      There are several problems that needed to be addressed:
       1) The queue-drain mechanism only works with rate-based CCIDs. If CCID2 for
          example has a full TX queue and becomes network-limited just as the
          application wants to close, then waiting for CCID2 to become unblocked could
          lead to an indefinite delay (i.e., application "hangs").
       2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
          in its sending policy while the queue is being drained. This can lead to
          further delays during which the application will not be able to terminate.
       3) The minimum wait time for CCID3/4 can be expected to be the queue length
          times the current inter-packet delay. For example if tx_qlen=100 and a delay
          of 15 ms is used for each packet, then the application would have to wait
          for a minimum of 1.5 seconds before being allowed to exit.
       4) There is no way for the user/application to control this behaviour. It would
          be good to use the timeout argument of dccp_close() as an upper bound. Then
          the maximum time that an application is willing to wait for its CCIDs to
          drain the queue can be set via the SO_LINGER option.
      
      These problems are addressed by giving the CCID a grace period of up to the
      `timeout' value.
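
The arithmetic behind problem 3 and the role of the grace period can be checked with a small sketch. The function names are hypothetical, and the model assumes a constant inter-packet delay, which a real CCID does not guarantee (see problem 2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum time to drain a full queue at a fixed inter-packet delay. */
static uint64_t min_drain_time_ms(uint32_t tx_qlen, uint32_t inter_packet_delay_ms)
{
	return (uint64_t)tx_qlen * inter_packet_delay_ms;
}

/* Can the queue be drained within the grace period given by `timeout'? */
static bool can_drain_within(uint32_t tx_qlen, uint32_t inter_packet_delay_ms,
			     uint64_t timeout_ms)
{
	return min_drain_time_ms(tx_qlen, inter_packet_delay_ms) <= timeout_ms;
}
```

With tx_qlen=100 and a delay of 15 ms, min_drain_time_ms() yields 1500 ms, matching the 1.5 seconds quoted above; a linger time of one second would therefore not be enough to drain the queue.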
      
      The wait-for-ccid function is, as before, used when the application 
       (a) has read all the data in its receive buffer and
       (b) if SO_LINGER was set with a non-zero linger time, or
       (c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
           state (client application closes after receiving CloseReq).
      
      In addition, there is a catch-all case by calling __skb_queue_purge() after 
      waiting for the CCID. This is necessary since the write queue may still have
      data when
       (a) the host has been passively-closed,
       (b) the connection terminated abnormally (unread data, zero linger time),
       (c) wait-for-ccid could not finish within the given time limit.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Extend CCID packet dequeueing interface · e7937772
      Gerrit Renker committed
      This extends the packet dequeuing interface of dccp_write_xmit() to allow
       1. CCIDs to take care of timing when the next packet may be sent;
       2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).
      
      The main purpose is to take CCID2 out of its polling mode (when it is network-
      limited, it tries every millisecond to send, without interruption).
      The interface can also be used to support other CCIDs.
      
      The mode of operation for (2) is as follows:
       * new packet is enqueued via dccp_sendmsg() => dccp_write_xmit(),
       * ccid_hc_tx_send_packet() detects that it may not send (e.g. window full), 
       * it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
       * dccp_write_xmit() returns without further action;
       * after some time the wait-condition for CCID becomes true,
       * that CCID schedules the tasklet,
       * tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
       * since the wait-condition is now true, ccid_hc_tx_send_packet() returns "send now",
       * packet is sent, and possibly more (since dccp_write_xmit() loops).
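
The sequence above can be modelled in a few lines. This is a simplified sketch and not the kernel interface: only the name CCID_PACKET_WILL_DEQUEUE_LATER appears in the text, while the other enumerator, the mock CCID and both functions are hypothetical. The tasklet is represented simply by calling the transmit function a second time.

```c
#include <stddef.h>

/* Only CCID_PACKET_WILL_DEQUEUE_LATER is named in the text above;
 * CCID_PACKET_SEND_NOW and everything below is a hypothetical model. */
enum ccid_decision {
	CCID_PACKET_SEND_NOW,
	CCID_PACKET_WILL_DEQUEUE_LATER
};

/* Toy window-based CCID. */
struct mock_ccid {
	unsigned int window;     /* packets allowed in flight             */
	unsigned int in_flight;  /* packets sent but not yet acknowledged */
};

/* The CCID decides whether the next packet may leave now. */
static enum ccid_decision mock_send_packet(const struct mock_ccid *c)
{
	return c->in_flight < c->window ? CCID_PACKET_SEND_NOW
					: CCID_PACKET_WILL_DEQUEUE_LATER;
}

/* Send queued packets while the CCID permits; return how many remain.
 * When packets remain, the caller does nothing further and relies on the
 * CCID to trigger another call once its wait-condition becomes true. */
static size_t mock_write_xmit(struct mock_ccid *c, size_t queued)
{
	while (queued > 0 && mock_send_packet(c) == CCID_PACKET_SEND_NOW) {
		c->in_flight++;
		queued--;
	}
	return queued;
}
```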
      
      Code reuse: the tasklet function calls dccp_write_xmit(), the timer function
                  reduces to a wrapper around the same code.
      
      If the tasklet finds that the socket is locked, it re-schedules the tasklet
      function (not the tasklet) after one jiffy.
      
      Changed DCCP_BUG to dccp_pr_debug when transmit_skb returns an error (e.g. when a
      local qdisc is used, NET_XMIT_DROP=1 can be returned for many packets).
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp ccid-2: Schedule Sync as out-of-band mechanism · c2f42077
      Gerrit Renker committed
      The problem with Ack Vectors is that 
      
        i) their length is variable and can in principle grow quite large,
       ii) it is hard to predict exactly how large they will be.
      
      Due to the second point it does not seem a good idea to reduce the MPS, in
      particular when on average there is enough room for the Ack Vector and an
      increase in length is only momentary, due to some burst loss, after which the
      Ack Vector returns to its normal/average length.
      
      The solution taken by this patch is to subtract a minimum-expected Ack Vector
      length from the MPS (previous patch), and to defer any larger Ack Vectors onto
      a separate Sync - but only if indeed there is no space left on the skb.
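
The placement rule described in the preceding paragraph can be sketched as a one-line decision. The enum and function names are hypothetical; the sketch only captures "use the packet if the Ack Vector fits, otherwise defer it to a Sync".

```c
#include <stddef.h>

/* Hypothetical outcome of the placement decision. */
enum ackvec_action {
	ACKVEC_ON_PACKET,  /* Ack Vector fits into the space left on the skb */
	ACKVEC_ON_SYNC     /* no room: carry it on a separately sent Sync    */
};

/* Decide where an Ack Vector of `ackvec_len' bytes should travel. */
static enum ackvec_action place_ack_vector(size_t ackvec_len, size_t space_left)
{
	return ackvec_len <= space_left ? ACKVEC_ON_PACKET : ACKVEC_ON_SYNC;
}
```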
      
      This patch provides the infrastructure to schedule Sync-packets for transporting
      (urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since
      it does not need to wait for new application data.
      
      It can thus serve other parts of the DCCP code as well.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Merge now-reduced connect_init() function · a9c1656a
      Gerrit Renker committed
      After moving the assignment of GAR/ISS from dccp_connect_init() to
      dccp_transmit_skb(), the former function becomes very small, so that
      a merger with dccp_connect() suggests itself.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Unused argument in CCID tx function · c506d91d
      Gerrit Renker committed
      This removes the argument `more' from ccid_hc_tx_packet_sent, since it was
      nowhere used in the entire code.
      
      (Anecdotally, this argument was not even used in the original KAME code where
       the function originally came from; compare the variable moreToSend in the
       freebsd61-dccp-kame-28.08.2006.patch now maintained by Emmanuel Lochin.)
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Special case of the MPS for client-PARTOPEN with DataAcks · 88ddac51
      Gerrit Renker committed
      To increase robustness, it is necessary to resend Confirm feature-negotiation
      options, even though the RFC does not mandate it. But feature negotiation
      options can take (much) more room than the options on common DataAck packets.
      
      Instead of always reducing the MPS for a case which only applies to the three
      messages sent during the initial handshake, this patch devises a special case:
      
         if the payload length of the DataAck in PARTOPEN is too large, an Ack is sent
         to carry the options, and the feature-negotiation list is then flushed.
      
         This means that the server gets two Acks for one Response. If both Acks get
         lost, it is probably better to restart the connection anyway and devising yet
         another special-case does not seem worth the extra complexity.
      
      The patch (over-)estimates the expected overhead to be 32*4 bytes -- commonly
      seen values were 20-90 bytes for initial feature-negotiation options. 
      
      It uses sizeof(u32) to mean "aligned units of 4 bytes". For consistency,
      another use of sizeof is modified.
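
A rough sketch of the special case follows. The constant and function names are hypothetical, and the exact comparison used by the patch is not given in the text; the sketch assumes that a separate Ack is needed whenever payload plus estimated option overhead would exceed the MPS.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Estimated feature-negotiation overhead: 32 aligned units of 4 bytes. */
#define FEATNEG_OVERHEAD_EST (32 * sizeof(uint32_t))

/* Assumed rule: if payload and estimated options do not fit into the MPS,
 * send the options on a separate Ack and flush the negotiation list. */
static bool needs_separate_ack(size_t payload_len, size_t mps)
{
	return payload_len + FEATNEG_OVERHEAD_EST > mps;
}
```

The estimate evaluates to 128 bytes, which sits above the 20-90 bytes reported as commonly seen.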
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • dccp: Leave headroom for options when calculating the MPS · 55ebe3ab
      Gerrit Renker committed
      The Maximum Packet Size (MPS) is of interest for applications which want
      to transfer data, so it is only relevant to the data transfer phase of a
      connection (unless one wants to send data on the DCCP-Request, but that is
      not considered here).
      
      The strategy chosen to deal with this requirement is to leave room for only 
      such options that may appear on data packets.
      
      A special consideration applies to Ack Vectors: this is purely guesswork,
      since these can have any length between 3 and 1020 bytes. The strategy
      chosen here is to subtract a configurable minimum; the value of 16 bytes
      (2 bytes for type/length plus 14 Ack Vector cells) has been found by
      experimentation. If people experience this as too much or too little,
      this could later be turned into a Kconfig option.
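
The headroom calculation can be sketched as follows. The function, its parameters and the way the header sizes are passed in are hypothetical; only the 16-byte minimum for Ack Vectors is taken from the text above.

```c
#include <stddef.h>

/* From the text: 2 bytes type/length plus 14 Ack Vector cells. */
#define ACKVEC_MIN_LEN 16

/* Subtract headers and expected per-packet options from the path MTU.
 * Returns 0 if nothing is left for payload. */
static size_t calc_mps(size_t pmtu, size_t net_hdr_len, size_t dccp_hdr_len,
		       size_t other_opt_len, int uses_ack_vectors)
{
	size_t overhead = net_hdr_len + dccp_hdr_len + other_opt_len;

	if (uses_ack_vectors)
		overhead += ACKVEC_MIN_LEN;

	return pmtu > overhead ? pmtu - overhead : 0;
}
```

For example, with an assumed path MTU of 1500 bytes, a 20-byte network header and a 16-byte DCCP header, reserving room for the minimum Ack Vector lowers the result from 1464 to 1448 bytes.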
      
      There are currently no CCID-specific header options which may appear on data
      packets, hence it is not necessary to define a corresponding CCID field.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Mechanism to resolve CCID dependencies · d4c8741c
      Gerrit Renker committed
      This adds a hook to resolve features whose value depends on the choice of
      CCID. It is done at the server since it can only be done after the CCID
      values have been negotiated; i.e. the client will add its CCID preference
      list on the Change options sent in the Request, which will be reconciled
      with the local preference list of the server.
      
      The concept is documented at
      http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/implementation_notes.html#ccid_dependencies
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Resolve dependencies of features on choice of CCID · 093e1f46
      Gerrit Renker committed
      This provides a missing link in the code chain, as several features implicitly
      depend and/or rely on the choice of CCID. Most notably, this is the Send Ack Vector
      feature, but also Ack Ratio and Send Loss Event Rate (also taken care of).
      
      For Send Ack Vector, the situation is as follows:
       * since CCID2 mandates the use of Ack Vectors, there is no point in allowing
         endpoints which use CCID2 to disable Ack Vector features on such a connection;
      
       * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer
         with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4);
      
       * for all other CCIDs, the use of (Send) Ack Vector is optional and thus
         negotiable. However, this implies that the code negotiating the use of Ack
         Vectors also supports it (i.e. is able to supply and to either parse or
         ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack
         Vector support), the use of Ack Vectors is here disabled, with a comment
         in the source code.
      
      An analogous consideration arises for the Send Loss Event Rate feature,
      since the CCID-3 implementation does not support the loss interval options
      of RFC 4342. To make such use explicit, corresponding feature-negotiation
      options are inserted which signal the use of the loss event rate option,
      as it is used by the CCID3 code.
      
      Lastly, the values of the Ack Ratio feature are matched to the choice of CCID.
      
      The patch implements this as a function which is called after the user has
      made all other registrations for changing default values of features.
      
      The table is variable-length; the reserved feature number `0' (which is hence
      invalid for feature negotiation, as confirmed by section 19.4 of RFC 4340)
      is used to mark the end of the table.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
  6. 26 Jul, 2008 2 commits
    • dccp: Bug-Fix - AWL was never updated · 73f18fdb
      Gerrit Renker committed
      The AWL lower Ack validity window advances in proportion to GSS, the greatest
      sequence number sent. Updating AWL other than at connection setup (in the
      DCCP-Request sent by dccp_v{4,6}_connect()) was missing in the DCCP code.
      
      This bug led to syslog messages such as
      
       "kernel: dccp_check_seqno: DCCP: Step 6 failed for DATAACK packet, [...] 
        P.ackno exists or LAWL(82947089) <= P.ackno(82948208)
                                         <= S.AWH(82948728), sending SYNC..."
      
      The difference between AWL/AWH here is 1639 packets, while the expected value
      (the Sequence Window) would have been 100 (the default).  A closer look showed
      that LAWL = AWL = 82947089 equalled the ISS on the Response.
      
      The patch now updates AWL with each increase of GSS.
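
How AWL should follow GSS can be sketched with the rule from RFC 4340, section 7.5.1 (AWL := max(GSS - W' + 1, ISS), AWH := GSS). The code below is an illustrative model in 48-bit circular arithmetic, not the kernel implementation; all names are hypothetical and the window width is assumed to be at least 1.

```c
#include <stdint.h>

#define SEQ48_MASK 0xFFFFFFFFFFFFULL  /* DCCP sequence numbers are 48 bits wide */

static uint64_t seq48_add(uint64_t a, uint64_t b)
{
	return (a + b) & SEQ48_MASK;
}

static uint64_t seq48_sub(uint64_t a, uint64_t b)
{
	return (a - b) & SEQ48_MASK;
}

struct ack_window {
	uint64_t awl;  /* lowest acceptable acknowledgement number  */
	uint64_t awh;  /* highest acceptable acknowledgement number */
};

/* AWL := max(GSS - W + 1, ISS), AWH := GSS, evaluated modulo 2^48.
 * `w' is the peer's sequence window and must be >= 1. */
static struct ack_window update_ack_window(uint64_t iss, uint64_t gss, uint64_t w)
{
	struct ack_window win;
	uint64_t sent = seq48_sub(gss, iss);  /* how far GSS is ahead of ISS */

	win.awh = gss;
	/* GSS - W + 1 >= ISS holds exactly when GSS - ISS >= W - 1. */
	win.awl = (sent >= w - 1) ? seq48_add(seq48_sub(gss, w), 1) : iss;
	return win;
}
```

Taking the numbers from the log message above as an example (stale AWL 82947089 as ISS, AWH 82948728 as GSS, window 100), the updated AWL would be 82948629, so that AWH - AWL + 1 equals the window of 100 instead of spanning 1639 packets.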
      
      
      Further changes:
      ----------------
      The patch also enforces more stringent checks on the ISS sequence number:
      
       * AWL is initialised to ISS at connection setup and remains at this value;
       * AWH is then always set to GSS (via dccp_update_gss());
       * so on the first Request: AWL =      AWH = ISS,
         and on the n-th Request: AWL = ISS, AWH = ISS + n.
      
      As a consequence, only Response packets that refer to Requests sent by this
      host will pass, all others are discarded. This is the intention and in effect 
      implements the initial adjustments for AWL as specified in RFC 4340, 7.5.1.
      
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • dccp: Allow to distinguish original and retransmitted packets · 59435444
      Gerrit Renker committed
      This patch allows the sender to distinguish original and retransmitted packets,
      which is in particular needed for the retransmission of DCCP-Requests:
       * the first Request uses ISS (generated in net/dccp/ip*.c), and sets GSS = ISS;
       * all retransmitted Requests use GSS' = GSS + 1, so that the n-th retransmitted
         Request has sequence number ISS + n (mod 2^48).
      
      To add generic support, the patch reorganises existing code so that:
       * icsk_retransmits == 0     for the original packet and
       * icsk_retransmits = n > 0  for the n-th retransmitted packet
      at the time dccp_transmit_skb() is called, via dccp_retransmit_skb().
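
The numbering of retransmitted Requests can be sketched as below. The function name is hypothetical; the parameter `retransmits` plays the role that icsk_retransmits has in the description above, and the wrap-around assumes DCCP's 48-bit sequence number space.

```c
#include <stdint.h>

#define DCCP_SEQ48_MASK 0xFFFFFFFFFFFFULL  /* 48-bit sequence number space */

/* Sequence number of a Request: ISS for the original (retransmits == 0),
 * ISS + n for the n-th retransmission, wrapping at 2^48. */
static uint64_t request_seqno(uint64_t iss, unsigned int retransmits)
{
	return (iss + retransmits) & DCCP_SEQ48_MASK;
}
```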
       
      Thanks to Wei Yongjun for pointing this problem out.
      
      Further changes:
      ----------------
       * removed the `skb' argument from dccp_retransmit_skb(), since sk_send_head
         is used for all retransmissions (the exception is client-Acks in PARTOPEN
         state, but these do not use sk_send_head);
       * since sk_send_head always contains the original skb (via dccp_entail()),
         skb_cloned() never evaluated to true and thus pskb_copy() was never used.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
  7. 11 Jun, 2008 1 commit
    • dccp: Fix sparse warnings · 1e2f0e5e
      Gerrit Renker committed
      This patch fixes the following sparse warnings:
       * nested min(max()) expression:
         net/dccp/ccids/ccid3.c:91:21: warning: symbol '__x' shadows an earlier one
         net/dccp/ccids/ccid3.c:91:21: warning: symbol '__y' shadows an earlier one
         
       * Declaration of function prototypes in .c instead of .h file, resulting in
         "should it be static?" warnings. 
      
       * Declared "struct dccpw" static (local to dccp_probe).
       
       * Disabled dccp_delayed_ack() - not fully removed due to RFC 4340, 11.3
         ("Receivers SHOULD implement delayed acknowledgement timers ...").
      
       * Used a different local variable name to avoid
         net/dccp/ackvec.c:293:13: warning: symbol 'state' shadows an earlier one
         net/dccp/ackvec.c:238:33: originally declared here
      
       * Removed unused functions `dccp_ackvector_print' and `dccp_ackvec_print'.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
  8. 13 Apr, 2008 1 commit
    • [DCCP]: Fix skb->cb conflicts with IP · 028b0275
      Patrick McHardy committed
      dev_queue_xmit() and the other IP output functions expect to get a skb
      with clear or properly initialized skb->cb. Unlike TCP and UDP, the
      dccp_skb_cb doesn't contain a struct inet_skb_parm at the beginning,
      so the DCCP-specific data is interpreted by the IP output functions.
      This can cause false negatives for the conditional POST_ROUTING hook
      invocation, making the packet bypass the hook.
      
      Add an inet_skb_parm/inet6_skb_parm union to the beginning of
      dccp_skb_cb to avoid clashes. Also add a BUILD_BUG_ON to make
      sure it fits in the cb.
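
The layout idea can be sketched with stand-in types. Everything below is a mock for illustration: the struct names and members are invented, the 48-byte control-block size is an assumption, and C11 `_Static_assert` is used in place of the kernel's BUILD_BUG_ON.

```c
#include <stddef.h>
#include <stdint.h>

#define SKB_CB_SIZE 48  /* assumed size of the skb->cb scratch area, in bytes */

/* Stand-ins for struct inet_skb_parm / inet6_skb_parm; contents are made up. */
struct mock_inet_parm  { uint8_t  opt[16]; uint16_t flags; };
struct mock_inet6_parm { uint32_t iif;     uint16_t flags; };

/* Mock protocol control block: the IP-layer view comes first, so the IP
 * output path never interprets protocol-private fields as its own data. */
struct mock_dccp_skb_cb {
	union {
		struct mock_inet_parm  h4;
		struct mock_inet6_parm h6;
	} header;
	uint8_t  pkt_type;
	uint64_t seq;
	uint64_t ack_seq;
};

/* Compile-time checks in the spirit of BUILD_BUG_ON. */
_Static_assert(sizeof(struct mock_dccp_skb_cb) <= SKB_CB_SIZE,
	       "control block does not fit into skb->cb");
_Static_assert(offsetof(struct mock_dccp_skb_cb, header) == 0,
	       "IP parameters must sit at the start of the control block");
```

If a later change grew the struct beyond the assumed limit, the build would fail instead of silently overrunning the scratch area.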
      
      [ Combined with patch from Gerrit Renker to remove two now unnecessary
        memsets of IPCB(skb)->opt ]
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 04 Apr, 2008 1 commit
  10. 29 Jan, 2008 6 commits
  11. 11 Oct, 2007 5 commits
  12. 26 Apr, 2007 1 commit
  13. 10 Mar, 2007 1 commit
  14. 01 Mar, 2007 1 commit
  15. 11 Feb, 2007 1 commit
  16. 26 Jan, 2007 1 commit
  17. 12 Dec, 2006 3 commits
  18. 03 Dec, 2006 1 commit
    • [DCCP]: Use `unsigned' for packet lengths · 6b57c93d
      Gerrit Renker committed
      This patch implements a suggestion by Ian McDonald and
      
       1) Avoids tests against negative packet lengths by using unsigned int
          for packet payload lengths in the CCID send_packet()/packet_sent() routines
      
       2) As a consequence, it removes a now-unnecessary test with regard to `len > 0'
          in ccid3_hc_tx_packet_sent: that condition is always true, since
            * negative packet lengths are avoided
            * ccid3_hc_tx_send_packet flags an error whenever the payload length is 0.
              In that case ccid3_hc_tx_packet_sent is never called, as all errors
              returned by ccid_hc_tx_send_packet are caught in dccp_write_xmit
      
       3) Removes the third argument of ccid_hc_tx_send_packet (the `len' parameter),
          since it is currently always set to skb->len. The code is updated with regard
          to this parameter change.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>