1. 17 2月, 2014 1 次提交
    • M
      net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer · ef2820a7
      Matija Glavinic Pecotic 提交于
      Implementation of (a)rwnd calculation might lead to severe performance issues
      and associations completely stalling. These problems are described and solution
      is proposed which improves lksctp's robustness in congestion state.
      
      1) Sudden drop of a_rwnd and incomplete window recovery afterwards
      
      Data accounted in sctp_assoc_rwnd_decrease takes only payload size (sctp data),
      but size of sk_buff, which is blamed against receiver buffer, is not accounted
      in rwnd. Theoretically, this should not be the problem as actual size of buffer
      is double the amount requested on the socket (SO_RECVBUF). Problem here is
      that this will have bad scaling for data which is less then sizeof sk_buff.
      E.g. in 4G (LTE) networks, link interfacing radio side will have a large portion
      of traffic of this size (less then 100B).
      
      An example of sudden drop and incomplete window recovery is given below. Node B
      exhibits problematic behavior. Node A initiates association and B is configured
      to advertise rwnd of 10000. A sends messages of size 43B (size of typical sctp
      message in 4G (LTE) network). On B data is left in buffer by not reading socket
      in userspace.
      
      Lets examine when we will hit pressure state and declare rwnd to be 0 for
      scenario with above stated parameters (rwnd == 10000, chunk size == 43, each
      chunk is sent in separate sctp packet)
      
      Logic is implemented in sctp_assoc_rwnd_decrease:
      
      socket_buffer (see below) is maximum size which can be held in socket buffer
      (sk_rcvbuf). current_alloced is amount of data currently allocated (rx_count)
      
      A simple expression is given for which it will be examined after how many
      packets for above stated parameters we enter pressure state:
      
      We start by condition which has to be met in order to enter pressure state:
      
      	socket_buffer < currently_alloced;
      
      currently_alloced is represented as size of sctp packets received so far and not
      yet delivered to userspace. x is the number of chunks/packets (since there is no
      bundling, and each chunk is delivered in separate packet, we can observe each
      chunk also as sctp packet, and what is important here, having its own sk_buff):
      
      	socket_buffer < x*each_sctp_packet;
      
      each_sctp_packet is sctp chunk size + sizeof(struct sk_buff). socket_buffer is
      twice the amount of initially requested size of socket buffer, which is in case
      of sctp, twice the a_rwnd requested:
      
      	2*rwnd < x*(payload+sizeof(struc sk_buff));
      
      sizeof(struct sk_buff) is 190 (3.13.0-rc4+). Above is stated that rwnd is 10000
      and each payload size is 43
      
      	20000 < x(43+190);
      
      	x > 20000/233;
      
      	x ~> 84;
      
      After ~84 messages, pressure state is entered and 0 rwnd is advertised while
      received 84*43B ~= 3612B sctp data. This is why external observer notices sudden
      drop from 6474 to 0, as it will be now shown in example:
      
      IP A.34340 > B.12345: sctp (1) [INIT] [init tag: 1875509148] [rwnd: 81920] [OS: 10] [MIS: 65535] [init TSN: 1096057017]
      IP B.12345 > A.34340: sctp (1) [INIT ACK] [init tag: 3198966556] [rwnd: 10000] [OS: 10] [MIS: 10] [init TSN: 902132839]
      IP A.34340 > B.12345: sctp (1) [COOKIE ECHO]
      IP B.12345 > A.34340: sctp (1) [COOKIE ACK]
      IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057017] [SID: 0] [SSEQ 0] [PPID 0x18]
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057017] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0]
      IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057018] [SID: 0] [SSEQ 1] [PPID 0x18]
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057018] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0]
      IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057019] [SID: 0] [SSEQ 2] [PPID 0x18]
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057019] [a_rwnd 9914] [#gap acks 0] [#dup tsns 0]
      <...>
      IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057098] [SID: 0] [SSEQ 81] [PPID 0x18]
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057098] [a_rwnd 6517] [#gap acks 0] [#dup tsns 0]
      IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057099] [SID: 0] [SSEQ 82] [PPID 0x18]
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057099] [a_rwnd 6474] [#gap acks 0] [#dup tsns 0]
      IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057100] [SID: 0] [SSEQ 83] [PPID 0x18]
      
      --> Sudden drop
      
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 0] [#gap acks 0] [#dup tsns 0]
      
      At this point, rwnd_press stores current rwnd value so it can be later restored
      in sctp_assoc_rwnd_increase. This however doesn't happen as condition to start
      slowly increasing rwnd until rwnd_press is returned to rwnd is never met. This
      condition is not met since rwnd, after it hit 0, must first reach rwnd_press by
      adding amount which is read from userspace. Let us observe values in above
      example. Initial a_rwnd is 10000, pressure was hit when rwnd was ~6500 and the
      amount of actual sctp data currently waiting to be delivered to userspace
      is ~3500. When userspace starts to read, sctp_assoc_rwnd_increase will be blamed
      only for sctp data, which is ~3500. Condition is never met, and when userspace
      reads all data, rwnd stays on 3569.
      
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 1505] [#gap acks 0] [#dup tsns 0]
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 3010] [#gap acks 0] [#dup tsns 0]
      IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057101] [SID: 0] [SSEQ 84] [PPID 0x18]
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057101] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0]
      
      --> At this point userspace read everything, rwnd recovered only to 3569
      
      IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057102] [SID: 0] [SSEQ 85] [PPID 0x18]
      IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057102] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0]
      
      Reproduction is straight forward, it is enough for sender to send packets of
      size less then sizeof(struct sk_buff) and receiver keeping them in its buffers.
      
      2) Minute size window for associations sharing the same socket buffer
      
      In case multiple associations share the same socket, and same socket buffer
      (sctp.rcvbuf_policy == 0), different scenarios exist in which congestion on one
      of the associations can permanently drop rwnd of other association(s).
      
      Situation will be typically observed as one association suddenly having rwnd
      dropped to size of last packet received and never recovering beyond that point.
      Different scenarios will lead to it, but all have in common that one of the
      associations (let it be association from 1)) nearly depleted socket buffer, and
      the other association blames socket buffer just for the amount enough to start
      the pressure. This association will enter pressure state, set rwnd_press and
      announce 0 rwnd.
      When data is read by userspace, similar situation as in 1) will occur, rwnd will
      increase just for the size read by userspace but rwnd_press will be high enough
      so that association doesn't have enough credit to reach rwnd_press and restore
      to previous state. This case is special case of 1), being worse as there is, in
      the worst case, only one packet in buffer for which size rwnd will be increased.
      Consequence is association which has very low maximum rwnd ('minute size', in
      our case down to 43B - size of packet which caused pressure) and as such
      unusable.
      
      Scenario happened in the field and labs frequently after congestion state (link
      breaks, different probabilities of packet drop, packet reordering) and with
      scenario 1) preceding. Here is given a deterministic scenario for reproduction:
      
      >From node A establish two associations on the same socket, with rcvbuf_policy
      being set to share one common buffer (sctp.rcvbuf_policy == 0). On association 1
      repeat scenario from 1), that is, bring it down to 0 and restore up. Observe
      scenario 1). Use small payload size (here we use 43). Once rwnd is 'recovered',
      bring it down close to 0, as in just one more packet would close it. This has as
      a consequence that association number 2 is able to receive (at least) one more
      packet which will bring it in pressure state. E.g. if association 2 had rwnd of
      10000, packet received was 43, and we enter at this point into pressure,
      rwnd_press will have 9957. Once payload is delivered to userspace, rwnd will
      increase for 43, but conditions to restore rwnd to original state, just as in
      1), will never be satisfied.
      
      --> Association 1, between A.y and B.12345
      
      IP A.55915 > B.12345: sctp (1) [INIT] [init tag: 836880897] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 4032536569]
      IP B.12345 > A.55915: sctp (1) [INIT ACK] [init tag: 2873310749] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3799315613]
      IP A.55915 > B.12345: sctp (1) [COOKIE ECHO]
      IP B.12345 > A.55915: sctp (1) [COOKIE ACK]
      
      --> Association 2, between A.z and B.12346
      
      IP A.55915 > B.12346: sctp (1) [INIT] [init tag: 534798321] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 2099285173]
      IP B.12346 > A.55915: sctp (1) [INIT ACK] [init tag: 516668823] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3676403240]
      IP A.55915 > B.12346: sctp (1) [COOKIE ECHO]
      IP B.12346 > A.55915: sctp (1) [COOKIE ACK]
      
      --> Deplete socket buffer by sending messages of size 43B over association 1
      
      IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315613] [SID: 0] [SSEQ 0] [PPID 0x18]
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315613] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0]
      
      <...>
      
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315696] [a_rwnd 6388] [#gap acks 0] [#dup tsns 0]
      IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315697] [SID: 0] [SSEQ 84] [PPID 0x18]
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315697] [a_rwnd 6345] [#gap acks 0] [#dup tsns 0]
      
      --> Sudden drop on 1
      
      IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315698] [SID: 0] [SSEQ 85] [PPID 0x18]
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315698] [a_rwnd 0] [#gap acks 0] [#dup tsns 0]
      
      --> Here userspace read, rwnd 'recovered' to 3698, now deplete again using
          association 1 so there is place in buffer for only one more packet
      
      IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315799] [SID: 0] [SSEQ 186] [PPID 0x18]
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315799] [a_rwnd 86] [#gap acks 0] [#dup tsns 0]
      IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315800] [SID: 0] [SSEQ 187] [PPID 0x18]
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 43] [#gap acks 0] [#dup tsns 0]
      
      --> Socket buffer is almost depleted, but there is space for one more packet,
          send them over association 2, size 43B
      
      IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403240] [SID: 0] [SSEQ 0] [PPID 0x18]
      IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403240] [a_rwnd 0] [#gap acks 0] [#dup tsns 0]
      
      --> Immediate drop
      
      IP A.60995 > B.12346: sctp (1) [SACK] [cum ack 387491510] [a_rwnd 0] [#gap acks 0] [#dup tsns 0]
      
      --> Read everything from the socket, both association recover up to maximum rwnd
          they are capable of reaching, note that association 1 recovered up to 3698,
          and association 2 recovered only to 43
      
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 1548] [#gap acks 0] [#dup tsns 0]
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 3053] [#gap acks 0] [#dup tsns 0]
      IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315801] [SID: 0] [SSEQ 188] [PPID 0x18]
      IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315801] [a_rwnd 3698] [#gap acks 0] [#dup tsns 0]
      IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403241] [SID: 0] [SSEQ 1] [PPID 0x18]
      IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403241] [a_rwnd 43] [#gap acks 0] [#dup tsns 0]
      
      A careful reader might wonder why it is necessary to reproduce 1) prior
      reproduction of 2). It is simply easier to observe when to send packet over
      association 2 which will push association into the pressure state.
      
      Proposed solution:
      
      Both problems share the same root cause, and that is improper scaling of socket
      buffer with rwnd. Solution in which sizeof(sk_buff) is taken into concern while
      calculating rwnd is not possible due to fact that there is no linear
      relationship between amount of data blamed in increase/decrease with IP packet
      in which payload arrived. Even in case such solution would be followed,
      complexity of the code would increase. Due to nature of current rwnd handling,
      slow increase (in sctp_assoc_rwnd_increase) of rwnd after pressure state is
      entered is rationale, but it gives false representation to the sender of current
      buffer space. Furthermore, it implements additional congestion control mechanism
      which is defined on implementation, and not on standard basis.
      
      Proposed solution simplifies whole algorithm having on mind definition from rfc:
      
      o  Receiver Window (rwnd): This gives the sender an indication of the space
         available in the receiver's inbound buffer.
      
      Core of the proposed solution is given with these lines:
      
      sctp_assoc_rwnd_update:
      	if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0)
      		asoc->rwnd = (asoc->base.sk->sk_rcvbuf - rx_count) >> 1;
      	else
      		asoc->rwnd = 0;
      
      We advertise to sender (half of) actual space we have. Half is in the braces
      depending whether you would like to observe size of socket buffer as SO_RECVBUF
      or twice the amount, i.e. size is the one visible from userspace, that is,
      from kernelspace.
      In this way sender is given with good approximation of our buffer space,
      regardless of the buffer policy - we always advertise what we have. Proposed
      solution fixes described problems and removes necessity for rwnd restoration
      algorithm. Finally, as proposed solution is simplification, some lines of code,
      along with some bytes in struct sctp_association are saved.
      
      Version 2 of the patch addressed comments from Vlad. Name of the function is set
      to be more descriptive, and two parts of code are changed, in one removing the
      superfluous call to sctp_assoc_rwnd_update since call would not result in update
      of rwnd, and the other being reordering of the code in a way that call to
      sctp_assoc_rwnd_update updates rwnd. Version 3 corrected change introduced in v2
      in a way that existing function is not reordered/copied in line, but it is
      correctly called. Thanks Vlad for suggesting.
      Signed-off-by: NMatija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
      Reviewed-by: NAlexander Sverdlin <alexander.sverdlin@nsn.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ef2820a7
  2. 11 12月, 2013 1 次提交
    • N
      sctp: properly latch and use autoclose value from sock to association · 9f70f46b
      Neil Horman 提交于
      Currently, sctp associations latch a sockets autoclose value to an association
      at association init time, subject to capping constraints from the max_autoclose
      sysctl value.  This leads to an odd situation where an application may set a
      socket level autoclose timeout, but sliently sctp will limit the autoclose
      timeout to something less than that.
      
      Fix this by modifying the autoclose setsockopt function to check the limit, cap
      it and warn the user via syslog that the timeout is capped.  This will allow
      getsockopt to return valid autoclose timeout values that reflect what subsequent
      associations actually use.
      
      While were at it, also elimintate the assoc->autoclose variable, it duplicates
      whats in the timeout array, which leads to multiple sources for the same
      information, that may differ (as the former isn't subject to any capping).  This
      gives us the timeout information in a canonical place and saves some space in
      the association structure as well.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      CC: Wang Weidong <wangweidong1@huawei.com>
      CC: David Miller <davem@davemloft.net>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: netdev@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9f70f46b
  3. 07 12月, 2013 5 次提交
  4. 15 11月, 2013 1 次提交
  5. 28 10月, 2013 1 次提交
  6. 13 8月, 2013 1 次提交
  7. 10 8月, 2013 1 次提交
  8. 25 7月, 2013 1 次提交
  9. 02 7月, 2013 1 次提交
    • D
      net: sctp: rework debugging framework to use pr_debug and friends · bb33381d
      Daniel Borkmann 提交于
      We should get rid of all own SCTP debug printk macros and use the ones
      that the kernel offers anyway instead. This makes the code more readable
      and conform to the kernel code, and offers all the features of dynamic
      debbuging that pr_debug() et al has, such as only turning on/off portions
      of debug messages at runtime through debugfs. The runtime cost of having
      CONFIG_DYNAMIC_DEBUG enabled, but none of the debug statements printing,
      is negligible [1]. If kernel debugging is completly turned off, then these
      statements will also compile into "empty" functions.
      
      While we're at it, we also need to change the Kconfig option as it /now/
      only refers to the ifdef'ed code portions in outqueue.c that enable further
      debugging/tracing of SCTP transaction fields. Also, since SCTP_ASSERT code
      was enabled with this Kconfig option and has now been removed, we
      transform those code parts into WARNs resp. where appropriate BUG_ONs so
      that those bugs can be more easily detected as probably not many people
      have SCTP debugging permanently turned on.
      
      To turn on all SCTP debugging, the following steps are needed:
      
       # mount -t debugfs none /sys/kernel/debug
       # echo -n 'module sctp +p' > /sys/kernel/debug/dynamic_debug/control
      
      This can be done more fine-grained on a per file, per line basis and others
      as described in [2].
      
       [1] https://www.kernel.org/doc/ols/2009/ols2009-pages-39-46.pdf
       [2] Documentation/dynamic-debug-howto.txt
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bb33381d
  10. 26 6月, 2013 1 次提交
    • D
      net: sctp: migrate cookie life from timeval to ktime · 52db882f
      Daniel Borkmann 提交于
      Currently, SCTP code defines its own timeval functions (since timeval
      is rarely used inside the kernel by others), namely tv_lt() and
      TIMEVAL_ADD() macros, that operate on SCTP cookie expiration.
      
      We might as well remove all those, and operate directly on ktime
      structures for a couple of reasons: ktime is available on all archs;
      complexity of ktime calculations depending on the arch is less than
      (reduces to a simple arithmetic operations on archs with
      BITS_PER_LONG == 64 or CONFIG_KTIME_SCALAR) or equal to timeval
      functions (other archs); code becomes more readable; macros can be
      thrown out.
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52db882f
  11. 18 6月, 2013 1 次提交
  12. 15 6月, 2013 1 次提交
  13. 30 4月, 2013 1 次提交
  14. 16 4月, 2013 2 次提交
  15. 13 3月, 2013 1 次提交
  16. 28 2月, 2013 1 次提交
  17. 05 2月, 2013 1 次提交
  18. 08 12月, 2012 1 次提交
  19. 04 12月, 2012 1 次提交
    • M
      sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call · 196d6759
      Michele Baldessari 提交于
      The current SCTP stack is lacking a mechanism to have per association
      statistics. This is an implementation modeled after OpenSolaris'
      SCTP_GET_ASSOC_STATS.
      
      Userspace part will follow on lksctp if/when there is a general ACK on
      this.
      V4:
      - Move ipackets++ before q->immediate.func() for consistency reasons
      - Move sctp_max_rto() at the end of sctp_transport_update_rto() to avoid
        returning bogus RTO values
      - return asoc->rto_min when max_obs_rto value has not changed
      
      V3:
      - Increase ictrlchunks in sctp_assoc_bh_rcv() as well
      - Move ipackets++ to sctp_inq_push()
      - return 0 when no rto updates took place since the last call
      
      V2:
      - Implement partial retrieval of stat struct to cope for future expansion
      - Kill the rtxpackets counter as it cannot be precise anyway
      - Rename outseqtsns to outofseqtsns to make it clearer that these are out
        of sequence unexpected TSNs
      - Move asoc->ipackets++ under a lock to avoid potential miscounts
      - Fold asoc->opackets++ into the already existing asoc check
      - Kill unneeded (q->asoc) test when increasing rtxchunks
      - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0)
      - Don't count SHUTDOWNs as SACKs
      - Move SCTP_GET_ASSOC_STATS to the private space API
      - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for
        future struct growth
      - Move association statistics in their own struct
      - Update idupchunks when we send a SACK with dup TSNs
      - return min_rto in max_rto when RTO has not changed. Also return the
        transport when max_rto last changed.
      
      Signed-off: Michele Baldessari <michele@acksyn.org>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      196d6759
  20. 15 8月, 2012 6 次提交
  21. 23 7月, 2012 1 次提交
    • N
      sctp: Implement quick failover draft from tsvwg · 5aa93bcf
      Neil Horman 提交于
      I've seen several attempts recently made to do quick failover of sctp transports
      by reducing various retransmit timers and counters.  While its possible to
      implement a faster failover on multihomed sctp associations, its not
      particularly robust, in that it can lead to unneeded retransmits, as well as
      false connection failures due to intermittent latency on a network.
      
      Instead, lets implement the new ietf quick failover draft found here:
      http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05
      
      This will let the sctp stack identify transports that have had a small number of
      errors, and avoid using them quickly until their reliability can be
      re-established.  I've tested this out on two virt guests connected via multiple
      isolated virt networks and believe its in compliance with the above draft and
      works well.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: Sridhar Samudrala <sri@us.ibm.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: linux-sctp@vger.kernel.org
      CC: joe@perches.com
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5aa93bcf
  22. 16 7月, 2012 1 次提交
  23. 01 7月, 2012 1 次提交
    • N
      sctp: be more restrictive in transport selection on bundled sacks · 4244854d
      Neil Horman 提交于
      It was noticed recently that when we send data on a transport, its possible that
      we might bundle a sack that arrived on a different transport.  While this isn't
      a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
      2960:
      
       An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
         etc.) to the same destination transport address from which it
         received the DATA or control chunk to which it is replying.  This
         rule should also be followed if the endpoint is bundling DATA chunks
         together with the reply chunk.
      
      This patch seeks to correct that.  It restricts the bundling of sack operations
      to only those transports which have moved the ctsn of the association forward
      since the last sack.  By doing this we guarantee that we only bundle outbound
      saks on a transport that has received a chunk since the last sack.  This brings
      us into stricter compliance with the RFC.
      
      Vlad had initially suggested that we strictly allow only sack bundling on the
      transport that last moved the ctsn forward.  While this makes sense, I was
      concerned that doing so prevented us from bundling in the case where we had
      received chunks that moved the ctsn on multiple transports.  In those cases, the
      RFC allows us to select any of the transports having received chunks to bundle
      the sack on.  so I've modified the approach to allow for that, by adding a state
      variable to each transport that tracks weather it has moved the ctsn since the
      last sack.  This I think keeps our behavior (and performance), close enough to
      our current profile that I think we can do this without a sysctl knob to
      enable/disable it.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      CC: Vlad Yaseivch <vyasevich@gmail.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: linux-sctp@vger.kernel.org
      Reported-by: NMichele Baldessari <michele@redhat.com>
      Reported-by: Nsorin serban <sserban@redhat.com>
      Acked-by: NVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4244854d
  24. 16 4月, 2012 1 次提交
  25. 20 12月, 2011 1 次提交
  26. 25 8月, 2011 1 次提交
  27. 02 6月, 2011 1 次提交
  28. 01 6月, 2011 1 次提交
  29. 26 5月, 2011 1 次提交
  30. 13 4月, 2011 1 次提交