1. 04 Sep, 2008 9 commits
    • G
      dccp ccid-2: Simplify dec_pipe and rearming of RTO timer · e9803c01
      Gerrit Renker committed
      This removes the dec_pipe function and improves the way the RTO timer is rearmed
      when a new acknowledgment comes in.
      
      Details and justification for removal:
      --------------------------------------
       1) The BUG_ON in dec_pipe is never triggered: pipe is only decremented for TX 
          history entries between tail and head, for which it had previously been 
          incremented in tx_packet_sent; and it is not decremented twice for the same
          entry, since it is
          - either decremented when a corresponding Ack Vector cell in state 0 or 1 
            was received (and then ccid2s_acked==1),
          - or it is decremented when ccid2s_acked==0, as part of the loss detection
            in tx_packet_recv (and hence it cannot have been decremented earlier).
      
       2) Restarting the RTO timer happens for every single entry in each Ack Vector
          parsed by tx_packet_recv (according to RFC 4340, 11.4 this can happen up to
          16192 times per Ack Vector). 
      
       3) The RTO timer should not be restarted when all outstanding data has been
          acknowledged. This check is currently done in a similar way to (2), in
          dec_pipe, when pipe has reached 0.
      
      The patch consolidates the code which rearms the RTO timer, combining the
      segments from new_ack and dec_pipe. As a result, the code becomes clearer
      (compare with tcp_rearm_rto()).
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
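The consolidated rearming logic described in this commit can be sketched as a standalone C example. This is only an illustration under stated assumptions, not the kernel code: `struct tx_sock`, `rto_timer_start()`, `rto_timer_stop()` and `ack_processed()` are hypothetical names, and the timer is modelled as plain fields rather than a kernel timer. The point it shows is that the timer is touched once per processed Ack (not once per Ack Vector cell) and is stopped when `pipe` reaches 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified sender state (not the kernel's structure). */
struct tx_sock {
	unsigned int  pipe;            /* packets currently in flight */
	unsigned long rto;             /* retransmission timeout, in ticks */
	bool          timer_armed;
	unsigned long timer_expires;
	unsigned int  timer_restarts;  /* counts rearm operations, for illustration */
};

static void rto_timer_stop(struct tx_sock *tx)
{
	tx->timer_armed = false;
}

static void rto_timer_start(struct tx_sock *tx, unsigned long now)
{
	tx->timer_armed   = true;
	tx->timer_expires = now + tx->rto;
	tx->timer_restarts++;
}

/*
 * Called once after a whole Ack (all of its Ack Vector cells) has been
 * processed; newly_acked is the number of packets that this Ack covered.
 */
static void ack_processed(struct tx_sock *tx, unsigned int newly_acked,
			  unsigned long now)
{
	assert(newly_acked <= tx->pipe);
	tx->pipe -= newly_acked;

	if (tx->pipe == 0)
		rto_timer_stop(tx);        /* nothing outstanding: no timer needed */
	else if (newly_acked > 0)
		rto_timer_start(tx, now);  /* a single rearm per Ack */
}
```

Keeping the decision in one place means the cost of timer handling no longer grows with the number of cells in an Ack Vector.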
    • G
      dccp ccid-2: Remove redundant sanity tests · c6f0f2e7
      Gerrit Renker committed
      This removes the ccid2_hc_tx_check_sanity function: it is redundant.
      
      Details:
      ========
      The tx_check_sanity function performs three tests:
       1) it checks that the circular TX list is sorted
          - in ascending order of sequence number (ccid2s_seq) 
          - and time (ccid2s_sent),
          - in the direction from `tail' (hctx_seqt) to `head' (hctx_seqh);
       2) it ensures that the entire list has the length seqbufc * CCID2_SEQBUF_LEN;
       3) it ensures that pipe equals the number of packets that were not
          marked `acked' (ccid2s_acked) between `tail' and `head'.
      
      The following argues that each of these tests is redundant; this can be
      verified by going through the code.
      
      (1) is not necessary, since both time and GSS increase from one packet to the
      next, so that subsequent insertions in tx_packet_sent (which advance the `head'
      pointer) will be in ascending order of time and sequence number.
      
      In (2), the length of the list is always equal to seqbufc times CCID2_SEQBUF_LEN
      (set to 1024) unless allocation caused an earlier failure, because:
       * at initialisation (tx_init), there is one chunk of size 1024 and seqbufc=1;
       * subsequent calls to tx_alloc_seq take place whenever head->next == tail in 
         tx_packet_sent; then a new chunk of size 1024 is inserted between head and
         tail, and seqbufc is incremented by one.
      
      To show that (3) is redundant requires looking at two cases. 
      
      The `pipe' variable of the TX socket is incremented only in tx_packet_sent, and 
      decremented in tx_packet_recv.  When head == tail (TX history empty) then pipe
      should be 0, which is the case directly after initialisation and after a
      retransmission timeout has occurred (ccid2_hc_tx_rto_expire).
      
      The first case involves parsing Ack Vectors for packets recorded in the live
      portion of the buffer, between tail and head. For each packet marked by the
      receiver as received (state 0) or ECN-marked (state 1), pipe is decremented by
      one, so for all such packets the BUG_ON in tx_check_sanity will not trigger.
      
      The second case is the loss detection in the second half of tx_packet_recv,
      below the comment "Check for NUMDUPACK".
      
      The first while-loop here ensures that the sequence number of `seqp' is either
      above or equal to `high_ack', or otherwise equal to the highest sequence number
      sent so far (of the entry head->prev, as head points to the next unsent entry).
      The next while-loop ("while (1)") counts the number of acked packets starting
      from that position of seqp, going backwards in the direction from head->prev to
      tail. If NUMDUPACK=3 such packets were counted within this loop, `seqp' points
      to the last acknowledged packet of these, and the "if (done == NUMDUPACK)" block
      is entered next. 
      The while-loop contained within that block in turn traverses the list backwards,
      from head to tail; the position of `seqp' is saved in the variable `last_acked'. 
      For each packet not marked as `acked', a congestion event is triggered within 
      the loop, and pipe is decremented. The loop terminates when `seqp' has reached
      `tail', whereupon tail is set to the position previously stored in `last_acked'.
      Thus, between `last_acked' and the previous position of `tail', 
       - pipe has been decremented earlier if the packet was marked as state 0 or 1;
       - pipe was decremented if the packet was not marked as acked.
      That is, pipe has been decremented by the number of packets between `last_acked'
      and the previous position of `tail'. As a consequence, pipe now again reflects
      the number of packets which have not (yet) been acked between the new position
      of tail (at `last_acked') and head->prev, or 0 if head==tail. The result is that
      the BUG_ON condition in check_sanity will also not be triggered, hence the test
      (3) is also redundant.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
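The invariant behind test (3) can be illustrated with a small standalone model. It is a sketch under assumptions, not the kernel implementation: `struct tx_hist`, `packet_sent()`, `packet_acked()` and `count_unacked()` are hypothetical names, the buffer is a fixed array instead of a growable circular list, and `SEQBUF_LEN` is a small stand-in for `CCID2_SEQBUF_LEN`. It shows that when `pipe` is incremented only on send and decremented only on the first acknowledgment of an entry, it always equals the number of un-acked entries between tail and head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SEQBUF_LEN 8   /* small stand-in for CCID2_SEQBUF_LEN (1024) */

struct seq_entry {
	unsigned long seq;
	bool          acked;
};

/* Hypothetical, simplified TX history: a fixed ring instead of a list. */
struct tx_hist {
	struct seq_entry buf[SEQBUF_LEN];
	size_t           head;  /* next unsent slot */
	size_t           tail;  /* oldest entry still tracked */
	unsigned int     pipe;  /* packets in flight */
};

/* Record a newly sent packet: the only place where pipe grows. */
static void packet_sent(struct tx_hist *h, unsigned long seq)
{
	assert((h->head + 1) % SEQBUF_LEN != h->tail); /* sketch: no growth */
	h->buf[h->head].seq   = seq;
	h->buf[h->head].acked = false;
	h->head = (h->head + 1) % SEQBUF_LEN;
	h->pipe++;
}

/* Mark seq as acked; pipe drops only the first time an entry is acked. */
static void packet_acked(struct tx_hist *h, unsigned long seq)
{
	for (size_t i = h->tail; i != h->head; i = (i + 1) % SEQBUF_LEN) {
		if (h->buf[i].seq == seq && !h->buf[i].acked) {
			h->buf[i].acked = true;
			h->pipe--;
			return;
		}
	}
}

/* The quantity that test (3) compared against pipe. */
static unsigned int count_unacked(const struct tx_hist *h)
{
	unsigned int n = 0;

	for (size_t i = h->tail; i != h->head; i = (i + 1) % SEQBUF_LEN)
		if (!h->buf[i].acked)
			n++;
	return n;
}
```

In this model a duplicate acknowledgment for the same sequence number leaves `pipe` unchanged, which mirrors the argument that an entry is never decremented twice.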
    • G
      dccp ccid-2: Stop polling · 83337dae
      Gerrit Renker committed
      This updates CCID2 to use the CCID dequeuing mechanism, converting it from
      the previous constant polling to an event-driven mechanism.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • G
      dccp ccid-2: Separate option parsing from CCID processing · c8bf462b
      Gerrit Renker committed
      This patch replaces an almost identical replication of code: large parts
      of dccp_parse_options() re-appeared as ccid2_ackvector() in ccid2.c.
      
      Apart from the duplication, this caused two more problems:
       1. CCIDs should not need to be concerned with parsing header options;
       2. one cannot assume that Ack Vectors appear as a contiguous area within an
          skb: it is legal to insert other options and/or padding in between. The
          current code would throw an error and stop reading in such a case.
      
      The patch provides a new data structure and associated list housekeeping.
      
      Only small changes were necessary to integrate with CCID-2: initialising the
      data structure, adapting the list traversal routine, and adding a call to the
      provided cleanup routine.
      
      The latter also led to fixing the following BUG: CCID-2 so far ignored
      Ack Vectors on all packets other than Ack/DataAck, which is incorrect,
      since Ack Vectors can be present on any packet that has an Ack field.
      
      Details:
      --------
       * received Ack Vectors are parsed by dccp_parse_options() alone, which passes
         the result on to the CCID-specific routine ccid_hc_tx_parse_options();
       * CCIDs interested in using/decoding Ack Vector information will add code
         to fetch parsed Ack Vectors via this interface;
       * a data structure, `struct dccp_ackvec_parsed' is provided as interface;
       * this structure arranges Ack Vectors of the same skb into a FIFO order;
       * a doubly-linked list is used to keep the required FIFO code small.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
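The FIFO arrangement of parsed Ack Vectors described in the details above can be sketched with a small circular doubly-linked list. This is a minimal userspace illustration, not the kernel's `struct dccp_ackvec_parsed` or its list helpers: `struct parsed_ackvec`, `parsed_list_init()`, `parsed_list_add()` and `parsed_list_cleanup()` are hypothetical names, and `malloc()` stands in for kernel allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one Ack Vector option found in a packet. */
struct parsed_ackvec {
	const unsigned char  *vec;   /* start of the Ack Vector cells */
	unsigned char         len;   /* number of cells */
	struct parsed_ackvec *prev, *next;
};

/* The list head is a sentinel node that points to itself when empty. */
static void parsed_list_init(struct parsed_ackvec *head)
{
	head->vec  = NULL;
	head->len  = 0;
	head->prev = head->next = head;
}

/* Append at the tail, so traversal from head->next yields FIFO order. */
static int parsed_list_add(struct parsed_ackvec *head,
			   const unsigned char *vec, unsigned char len)
{
	struct parsed_ackvec *n;

	assert(head != NULL);
	n = malloc(sizeof(*n));
	if (n == NULL)
		return -1;

	n->vec  = vec;
	n->len  = len;
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev       = n;
	return 0;
}

/* Free all records and leave the list empty again. */
static void parsed_list_cleanup(struct parsed_ackvec *head)
{
	struct parsed_ackvec *cur = head->next;

	while (cur != head) {
		struct parsed_ackvec *next = cur->next;

		free(cur);
		cur = next;
	}
	head->prev = head->next = head;
}
```

With a doubly-linked list, appending at the tail is a constant-time pointer update, which is one way to keep the FIFO code small; Ack Vectors separated by other options or padding simply become separate records.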
    • G
      dccp ccid-2: Ack Vector interface clean-up · ff49e270
      Gerrit Renker committed
      This patch brings the Ack Vector interface up to date. Its main purpose is
      to lay the basis for the subsequent patches of this set, which will use the
      new data structure fields and routines.
      
      There are no real algorithmic changes, rather an adaptation:
      
       (1) Replaced the static Ack Vector size (2) with a #define so that it can
           be adapted (with low loss / Ack Ratio, a value of 1 works, so 2 seems
           to be sufficient for the moment) and added a solution so that computing
           the ECN nonce will continue to work - even with larger Ack Vectors.
      
       (2) Replaced the #defines for Ack Vector states with a complete enum.
      
       (3) Replaced #defines to compute Ack Vector length and state with general
           purpose routines (inlines), and updated code to use these.
      
       (4) Added a `tail' field (conversion to circular buffer in subsequent patch).
      
       (5) Updated the (outdated) documentation for Ack Vector struct.
      
       (6) All sequence number containers now trimmed to 48 bits.
      
       (7) Removal of unused bits:
           * removed dccpav_ack_nonce from struct dccp_ackvec, since this is already
             redundantly stored in the `dccpavr_ack_nonce' (of Ack Vector record);
           * removed Elapsed Time for Ack Vectors (it was nowhere used);
           * replaced semantics of dccpavr_sent_len with dccpavr_ack_runlen, since
             the code needs to be able to remember the old run length; 
           * reduced the de-/allocation routines (redundant / duplicate tests).
      
      
      Justification for removing Elapsed Time information [can be removed]:
      ---------------------------------------------------------------------
       1. The Elapsed Time information for Ack Vectors was nowhere used in the code.
       2. DCCP does not implement rate-based pacing of acknowledgments. The only
          recommendation for always including Elapsed Time is in section 11.3 of
          RFC 4340: "Receivers that rate-pace acknowledgements SHOULD [...]
          include Elapsed Time options". But such is not the case here.
       3. It does not really improve estimation accuracy. The Elapsed Time field only
          records the time between the arrival of the last acknowledgeable packet and
          the time the Ack Vector is sent out. Since Linux does not (yet) implement
          delayed Acks, the time difference will typically be small, since often the
          arrival of a data packet triggers sending feedback at the HC-receiver.
      
      
      Justification for changes in de-/allocation routines [can be removed]:
      ----------------------------------------------------------------------
        * INIT_LIST_HEAD in dccp_ackvec_record_new was redundant, since the list
          pointers were later overwritten when the node was added via list_add();
        * dccp_ackvec_record_new() was called in a single place only;
        * calls to list_del_init() before calling dccp_ackvec_record_delete() were
          redundant, since subsequently the entire element was k-freed;
        * since all calls to dccp_ackvec_record_delete() were preceded by a call to
          list_del_init(), the WARN_ON test would never evaluate to true;
        * since all calls to dccp_ackvec_record_delete() were made from within
          list_for_each_entry_safe(), the test for avr == NULL was redundant;
        * list_empty() in ackvec_free was redundant, since the same condition is
          embedded in the loop condition of the subsequent list_for_each_entry_safe().
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
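Items (2), (3) and (6) of this commit can be illustrated with a short standalone sketch. The names below (`enum ackvec_state`, `ackvec_cell_state()`, `ackvec_cell_runlen()`, `ackvec_cell_make()`, `seqno_trim48()`) are hypothetical and are not the identifiers used in the kernel; the cell layout follows RFC 4340, section 11.4, where the two most significant bits of each cell hold the state and the remaining six bits hold the run length.

```c
#include <assert.h>
#include <stdint.h>

/* Ack Vector cell states as defined in RFC 4340, section 11.4. */
enum ackvec_state {
	ACKVEC_RECEIVED     = 0,
	ACKVEC_ECN_MARKED   = 1,
	ACKVEC_RESERVED     = 2,
	ACKVEC_NOT_RECEIVED = 3,
};

#define ACKVEC_STATE_SHIFT 6
#define ACKVEC_RUNLEN_MASK 0x3F
#define SEQNO_MASK48       ((UINT64_C(1) << 48) - 1)

/* Extract the state from the top two bits of a cell. */
static inline enum ackvec_state ackvec_cell_state(uint8_t cell)
{
	return (enum ackvec_state)(cell >> ACKVEC_STATE_SHIFT);
}

/* Extract the run length (0 means the state applies to one packet). */
static inline uint8_t ackvec_cell_runlen(uint8_t cell)
{
	return cell & ACKVEC_RUNLEN_MASK;
}

/* Build a cell from a state and a run length of at most 63. */
static inline uint8_t ackvec_cell_make(enum ackvec_state state, uint8_t runlen)
{
	assert(runlen <= ACKVEC_RUNLEN_MASK);
	return (uint8_t)(((unsigned int)state << ACKVEC_STATE_SHIFT) | runlen);
}

/* Trim a sequence number container to the 48 bits used by DCCP. */
static inline uint64_t seqno_trim48(uint64_t seq)
{
	return seq & SEQNO_MASK48;
}
```

Inline functions of this kind give the compiler type information that bare `#define` expressions lack, which is one plausible reason for preferring them.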
    • G
      dccp: Unused argument in CCID tx function · c506d91d
      Gerrit Renker committed
      This removes the argument `more' from ccid_hc_tx_packet_sent, since it was
      nowhere used in the entire code.
      
      (Anecdotally, this argument was not even used in the original KAME code where
       the function originally came from; compare the variable moreToSend in the
       freebsd61-dccp-kame-28.08.2006.patch now maintained by Emmanuel Lochin.)
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
    • G
      dccp ccid-2: Remove ccid2hc{tx,rx}_ prefixes · 1fb87509
      Gerrit Renker committed
      This patch fixes two problems caused by the ubiquitous long "hctx->ccid2hctx_"
      and "hcrx->ccid2hcrx_" prefixes:
       * code becomes hard to read;
       * multiple-line statements are almost inevitable even for simple expressions.
      The prefixes are not really necessary (compare with "struct tcp_sock").
      
      There had been previous discussion of this on dccp@vger, but so far this was
      not followed up (most people agreed that the prefixes are too long). 
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Leandro Melo de Sales <leandroal@gmail.com>
    • G
      dccp: Registration routines for changing feature values · 86349c8d
      Gerrit Renker committed
      Two registration routines, for SP and NN features, are provided by this patch,
      replacing a previous routine which was used for both feature types.
      
      These are internal-only routines and therefore start with `__feat_register'.
      
      It further exports the known limits of Sequence Window and Ack Ratio as symbolic
      constants.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
    • G
      dccp: Toggle debug output without module unloading · 43264991
      Gerrit Renker committed
      This sets the sysfs permissions so that root can toggle the `debug'
      parameter available for nearly every DCCP module. This is useful 
      since there are various module inter-dependencies. The debug flag
      can now be toggled at runtime using
      
        echo 1 > /sys/module/dccp/parameters/dccp_debug
        echo 1 > /sys/module/dccp_ccid2/parameters/ccid2_debug
        echo 1 > /sys/module/dccp_ccid3/parameters/ccid3_debug
        echo 1 > /sys/module/dccp_tfrc_lib/parameters/tfrc_debug
      
      The last is not very useful yet, since no code at the moment calls
      the tfrc_debug() macro.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
  2. 03 May, 2008 1 commit
  3. 29 Jan, 2008 16 commits
  4. 24 Oct, 2007 1 commit
    • G
      [CCID2/3]: Initialisation assignments of 0 are redundant · 24c667db
      Gerrit Renker committed
      Assigning initial values of `0' is redundant when loading a new CCID structure,
      since in net/dccp/ccid.c the entire CCID structure is zeroed out prior to
      initialisation in ccid_new():
      
          	struct ccid {
          		struct ccid_operations *ccid_ops;
          		char		       ccid_priv[0];
          	};
      
          	// ...
          	if (rx) {
          		memset(ccid + 1, 0, ccid_ops->ccid_hc_rx_obj_size);
          		if (ccid->ccid_ops->ccid_hc_rx_init != NULL &&
          		    ccid->ccid_ops->ccid_hc_rx_init(ccid, sk) != 0)
          			goto out_free_ccid;
          	} else {
          		memset(ccid + 1, 0, ccid_ops->ccid_hc_tx_obj_size);
          		/* analogous to the rx case */
          	}
      
      This patch therefore removes the redundant assignments. Thanks to Arnaldo for
      the inspiration.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 11 Oct, 2007 6 commits
  6. 22 Aug, 2007 1 commit
    • G
      [DCCP]: Allocation in atomic context · 39dad26c
      Gerrit Renker committed
      This fixes the following bug reported in syslog:
      
      [ 4039.051658] BUG: sleeping function called from invalid context at /usr/src/davem-2.6/mm/slab.c:3032
      [ 4039.051668] in_atomic():1, irqs_disabled():0
      [ 4039.051670] INFO: lockdep is turned off.
      [ 4039.051674]  [<c0104c0f>] show_trace_log_lvl+0x1a/0x30
      [ 4039.051687]  [<c0104d4d>] show_trace+0x12/0x14
      [ 4039.051691]  [<c0104d65>] dump_stack+0x16/0x18
      [ 4039.051695]  [<c011371e>] __might_sleep+0xaf/0xbe
      [ 4039.051700]  [<c0157b66>] __kmalloc+0xb1/0xd0
      [ 4039.051706]  [<f090416f>] ccid2_hc_tx_alloc_seq+0x35/0xc3 [dccp_ccid2]
      [ 4039.051717]  [<f09048d6>] ccid2_hc_tx_packet_sent+0x27f/0x2d9 [dccp_ccid2]
      [ 4039.051723]  [<f085486b>] dccp_write_xmit+0x1eb/0x338 [dccp]
      [ 4039.051741]  [<f085603d>] dccp_sendmsg+0x113/0x18f [dccp]
      [ 4039.051750]  [<c03907fc>] inet_sendmsg+0x2e/0x4c
      [ 4039.051758]  [<c033a47d>] sock_aio_write+0xd5/0x107
      [ 4039.051766]  [<c015abc1>] do_sync_write+0xcd/0x11c
      [ 4039.051772]  [<c015b296>] vfs_write+0x118/0x11f
      [ 4039.051840]  [<c015b932>] sys_write+0x3d/0x64
      [ 4039.051845]  [<c0103e7c>] syscall_call+0x7/0xb
      [ 4039.051848]  =======================
      
      The problem was that GFP_KERNEL was used; fixed by using gfp_any().
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 11 Feb, 2007 1 commit
  8. 12 Dec, 2006 1 commit
  9. 03 Dec, 2006 4 commits
    • G
      [DCCP]: Use `unsigned' for packet lengths · 6b57c93d
      Gerrit Renker committed
      This patch implements a suggestion by Ian McDonald and
      
       1) Avoids tests against negative packet lengths by using unsigned int
          for packet payload lengths in the CCID send_packet()/packet_sent() routines
      
       2) As a consequence, it removes a now unnecessary test with regard to `len > 0'
          in ccid3_hc_tx_packet_sent: that condition is always true, since
            * negative packet lengths are avoided
            * ccid3_hc_tx_send_packet flags an error whenever the payload length is 0.
              As a consequence, ccid3_hc_tx_packet_sent is never called as all errors
              returned by ccid_hc_tx_send_packet are caught in dccp_write_xmit
      
       3) Removes the third argument of ccid_hc_tx_send_packet (the `len' parameter),
          since it is currently always set to skb->len. The code is updated with regard
          to this parameter change.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
    • G
      [DCCP]: Simplified conditions due to use of enum:8 states · 59348b19
      Gerrit Renker committed
      This reaps the benefit of the earlier patch, which changed the type of
      CCID 3 states to use enums, in that many conditions are now simplified
      and the number of possible (unexpected) values is greatly reduced.
      
      In a few instances, this also allowed pre-conditions to be simplified; care
      has been taken to retain logical equivalence.
      
      [DCCP]: Introduce a consistent BUG/WARN message scheme
      
      This refines the existing set of DCCP messages so that
       * BUG(), BUG_ON(), WARN_ON() have meaningful DCCP-specific counterparts
       * DCCP_CRIT (for severe warnings) is not rate-limited
       * DCCP_WARN() is introduced as rate-limited wrapper
      
      Using these allows a faster and cleaner transition to their original
      counterparts once the code has matured into a full DCCP implementation.
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
    • G
      [DCCP]: enable debug messages also for static builds · 84116716
      Gerrit Renker committed
      This patch
        * makes debugging (when configured) work both for static / module build
        * provides generic debugging macros for use in other DCCP / CCID modules
        * adds missing information about debug parameters to Kconfig
        * performs some code tidy-up
      Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
    • A
      [DCCP] CCID2: Code optimizations · 32aac18d
      Andrea Bittau committed
      These are code optimizations which are relevant when dealing with large
      windows.  They are not coded the way I would like to, but they do the job for
      the short-term.  This patch should be more neat.
      
      Committer note: Changed the seqno comparisons to use {after,before}48 to handle
                      wrapping.
      Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
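The committer note about wrap-safe comparison of 48-bit sequence numbers can be sketched as follows. This is a standalone illustration in the spirit of the `{after,before}48` helpers mentioned above, not a copy of the kernel code: `seq48_before()`, `seq48_after()` and `seq48_add()` are hypothetical names. The sketch assumes that converting an out-of-range unsigned value to `int64_t` wraps in two's complement, which is implementation-defined in C but holds on common compilers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/*
 * True if seq1 precedes seq2 in the circular 48-bit sequence space.
 * Both values are shifted into the upper 48 bits of a 64-bit word, so the
 * subtraction wraps at 2^48 and the sign of the result gives the order.
 */
static inline bool seq48_before(uint64_t seq1, uint64_t seq2)
{
	return (int64_t)((seq2 << 16) - (seq1 << 16)) > 0;
}

/* True if seq1 follows seq2 in the circular 48-bit sequence space. */
static inline bool seq48_after(uint64_t seq1, uint64_t seq2)
{
	return seq48_before(seq2, seq1);
}

/* Advance a 48-bit sequence number, wrapping around at 2^48. */
static inline uint64_t seq48_add(uint64_t seq, uint64_t delta)
{
	assert(seq <= SEQ48_MASK);
	return (seq + delta) & SEQ48_MASK;
}
```

A plain `<` comparison would treat sequence number 0 as smaller than 2^48 - 1, even though 0 is the next number sent after the counter wraps; the shifted signed difference orders the two correctly.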