1. 05 May 2019, 1 commit
      sctp: avoid running the sctp state machine recursively · b563e9bb
      Authored by Xin Long
      [ Upstream commit fbd019737d71e405f86549fd738f81e2ff3dd073 ]
      
      Ying triggered a call trace when doing an asconf testing:
      
        BUG: scheduling while atomic: swapper/12/0/0x10000100
        Call Trace:
         <IRQ>  [<ffffffffa4375904>] dump_stack+0x19/0x1b
         [<ffffffffa436fcaf>] __schedule_bug+0x64/0x72
         [<ffffffffa437b93a>] __schedule+0x9ba/0xa00
         [<ffffffffa3cd5326>] __cond_resched+0x26/0x30
         [<ffffffffa437bc4a>] _cond_resched+0x3a/0x50
         [<ffffffffa3e22be8>] kmem_cache_alloc_node+0x38/0x200
         [<ffffffffa423512d>] __alloc_skb+0x5d/0x2d0
         [<ffffffffc0995320>] sctp_packet_transmit+0x610/0xa20 [sctp]
         [<ffffffffc098510e>] sctp_outq_flush+0x2ce/0xc00 [sctp]
         [<ffffffffc098646c>] sctp_outq_uncork+0x1c/0x20 [sctp]
         [<ffffffffc0977338>] sctp_cmd_interpreter.isra.22+0xc8/0x1460 [sctp]
         [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
         [<ffffffffc099443d>] sctp_primitive_ASCONF+0x3d/0x50 [sctp]
         [<ffffffffc0977384>] sctp_cmd_interpreter.isra.22+0x114/0x1460 [sctp]
         [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
         [<ffffffffc097b3a4>] sctp_assoc_bh_rcv+0xf4/0x1b0 [sctp]
         [<ffffffffc09840f1>] sctp_inq_push+0x51/0x70 [sctp]
         [<ffffffffc099732b>] sctp_rcv+0xa8b/0xbd0 [sctp]
      
      As the trace shows, the first sctp_do_sm(), running in atomic context (NET_RX
      softirq), invoked sctp_primitive_ASCONF(), which later allocates with the
      GFP_KERNEL flag, a flag that is supposed to be used in non-atomic context
      only. Besides, sctp_do_sm() was called recursively, which is not expected.
      
      Vlad tried to fix this recursive call in commit c0786693 ("sctp: Fix
      oops when sending queued ASCONF chunks") by introducing a new command,
      SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work, as this command is still
      processed inside the first sctp_do_sm() call, and it invokes
      sctp_primitive_ASCONF() again.
      
      To avoid calling sctp_do_sm() recursively, the next queued ASCONF is now
      sent not via sctp_primitive_ASCONF() but directly via sctp_sf_do_prm_asconf()
      from within the first sctp_do_sm() call.
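      
      A minimal userspace sketch of the idea, with hypothetical, simplified names
      (depth, do_sm() and do_prm_asconf() are illustrative only, not the kernel
      functions): instead of re-entering the state machine to send the next
      ASCONF, the handler is invoked from the pass that is already running.
      
      /* Illustrative sketch only: simplified, hypothetical names. */
      #include <stdio.h>
      
      static int depth;                  /* tracks state-machine nesting        */
      
      static void do_prm_asconf(void)    /* stands in for the handler that      */
      {                                  /* queues the next ASCONF chunk        */
              printf("queue next ASCONF at depth %d\n", depth);
      }
      
      static void do_sm(int send_next)   /* hypothetical state-machine entry    */
      {
              depth++;
              if (send_next) {
                      /* Old approach: the SEND_NEXT_ASCONF command re-invoked
                       * the primitive, which called do_sm() again -> recursion:
                       * do_sm(0);
                       */
      
                      /* New approach: call the handler directly from the current
                       * interpreter pass, so there is no second do_sm() call.
                       */
                      do_prm_asconf();
              }
              depth--;
      }
      
      int main(void)
      {
              do_sm(1);                  /* one single, non-recursive pass      */
              return 0;
      }
      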
      Reported-by: Ying Xu <yinxu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 25 Jul 2018, 1 commit
  3. 15 Mar 2018, 1 commit
  4. 21 Dec 2017, 1 commit
  5. 16 Dec 2017, 2 commits
  6. 12 Dec 2017, 5 commits
  7. 29 Oct 2017, 2 commits
  8. 25 Oct 2017, 1 commit
  9. 04 Oct 2017, 1 commit
      sctp: introduce stream scheduler foundations · 5bbbbe32
      Authored by Marcelo Ricardo Leitner
      This patch introduces the hooks necessary to do stream scheduling, as
      per RFC Draft ndata.  It also introduces the first scheduler, which is
      what we do today but now factored out: first come first served (FCFS).
      
      With stream scheduling we now have to track which chunk was enqueued on
      which stream and be able to select a chunk other than the one at the front
      of the main outqueue. So we introduce a list on the sctp_stream_out_ext
      structure for this purpose.
      
      We reuse the sctp_chunk->transmitted_list space for the list above, as a
      chunk cannot belong to both lists at the same time. By using a union there,
      we can have distinct names for these two uses.
      
      sctp_sched_ops holds the operations each scheduler is expected to
      implement. The dequeueing is a bit particular to this implementation, but
      it matches how we dequeue packets today: we first dequeue a chunk and then
      check whether it fits the packet; if it doesn't, we requeue it at the head.
      That is why there is no peek operation but a dequeue_done one instead,
      which is called once the chunk can safely be considered transmitted.
      
      The check removed from sctp_outq_flush is now performed by
      sctp_stream_outq_migrate, which is only called during assoc setup.
      (sctp_sendmsg() also checks for it)
      
      The only operation foreseen but not yet added here is a way to signal that
      a new packet is starting or that a packet is done, which a per-packet
      round-robin scheduler will need; it is intentionally left to the patch
      that actually implements it.
      
      Support for I-DATA chunks with user message interleaving, also described
      in this RFC draft, is straightforward, as it just requires the schedulers
      to probe for the feature and ignore datamsg boundaries when dequeueing.
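      
      Below is a minimal, self-contained sketch of the ops-based FCFS idea in
      plain C. The type and function names (sched_ops, outq, fcfs_*) are
      hypothetical stand-ins, not the in-kernel sctp_sched_ops definitions; the
      point is the dequeue / requeue-at-head / dequeue_done flow described above.
      
      #include <stddef.h>
      #include <stdio.h>
      
      struct chunk {
              int id;
              struct chunk *next;
      };
      
      struct outq {
              struct chunk *head, *tail;
      };
      
      struct sched_ops {                 /* hypothetical per-scheduler hooks    */
              void (*enqueue)(struct outq *q, struct chunk *c);
              struct chunk *(*dequeue)(struct outq *q);
              void (*requeue_head)(struct outq *q, struct chunk *c);
              void (*dequeue_done)(struct outq *q, struct chunk *c);
      };
      
      static void fcfs_enqueue(struct outq *q, struct chunk *c)
      {
              c->next = NULL;
              if (q->tail)
                      q->tail->next = c;
              else
                      q->head = c;
              q->tail = c;
      }
      
      static struct chunk *fcfs_dequeue(struct outq *q)
      {
              struct chunk *c = q->head;
      
              if (c) {
                      q->head = c->next;
                      if (!q->head)
                              q->tail = NULL;
              }
              return c;
      }
      
      static void fcfs_requeue_head(struct outq *q, struct chunk *c)
      {
              c->next = q->head;
              q->head = c;
              if (!q->tail)
                      q->tail = c;
      }
      
      static void fcfs_dequeue_done(struct outq *q, struct chunk *c)
      {
              (void)q;                   /* nothing to finalize for FCFS:       */
              (void)c;                   /* the chunk now counts as transmitted */
      }
      
      static const struct sched_ops fcfs_ops = {
              .enqueue      = fcfs_enqueue,
              .dequeue      = fcfs_dequeue,
              .requeue_head = fcfs_requeue_head,
              .dequeue_done = fcfs_dequeue_done,
      };
      
      int main(void)
      {
              struct outq q = { NULL, NULL };
              struct chunk a = { 1, NULL }, b = { 2, NULL };
              struct chunk *c;
      
              fcfs_ops.enqueue(&q, &a);
              fcfs_ops.enqueue(&q, &b);
      
              c = fcfs_ops.dequeue(&q);          /* dequeue first, then test fit */
              fcfs_ops.requeue_head(&q, c);      /* pretend it did not fit       */
      
              c = fcfs_ops.dequeue(&q);          /* next flush: same chunk again */
              fcfs_ops.dequeue_done(&q, c);      /* it fit this time: finalize   */
              printf("sent chunk %d\n", c->id);
              return 0;
      }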
      
      See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
      Tested-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 12 Aug 2017, 5 commits
  11. 07 Aug 2017, 4 commits
  12. 04 Aug 2017, 1 commit
  13. 02 Jul 2017, 2 commits
  14. 21 Jun 2017, 2 commits
  15. 20 Feb 2017, 1 commit
  16. 08 Feb 2017, 1 commit
  17. 19 Jan 2017, 1 commit
      sctp: add stream reconf timer · 7b9438de
      Authored by Xin Long
      This patch adds a per-transport timer, based on the sctp timer framework,
      for stream reconf chunk retransmission. It starts after sending a reconf
      request chunk and stops after receiving the response chunk.
      
      If the timer expires, besides retransmitting the reconf request chunk, it
      also does the same things the data RTO timer does: it increases the
      appropriate error counts and performs threshold management, possibly
      destroying the asoc if the sctp retransmission thresholds are exceeded,
      just as section 5.1.1 describes.
      
      This patch also adds asoc->strreset_chunk, which is used to save the
      reconf request chunk so that it can be retransmitted, and to check that
      the response really belongs to this request by comparing the information
      inside it with the response chunk.
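      
      A hedged sketch of the expiry behaviour described above, using hypothetical
      names (assoc, reconf_timer_expired and saved_reconf_id stand in for the
      real asoc, timer handler and strreset_chunk): count the error, apply the
      retransmission threshold, and otherwise retransmit with a backed-off RTO.
      
      #include <stdbool.h>
      #include <stdio.h>
      
      struct assoc {
              int error_count;           /* errors seen on this asoc            */
              int max_retrans;           /* retransmission threshold            */
              int rto_ms;                /* current retransmission timeout      */
              bool dead;
              int saved_reconf_id;       /* stands in for asoc->strreset_chunk  */
      };
      
      static void reconf_timer_expired(struct assoc *a)
      {
              if (++a->error_count > a->max_retrans) {
                      a->dead = true;    /* threshold exceeded: destroy asoc    */
                      printf("association aborted\n");
                      return;
              }
              a->rto_ms *= 2;            /* back off, as the data RTO timer does */
              printf("retransmit reconf chunk %d, restart timer at %d ms\n",
                     a->saved_reconf_id, a->rto_ms);
      }
      
      int main(void)
      {
              struct assoc a = { 0, 5, 200, false, 42 };
      
              while (!a.dead)
                      reconf_timer_expired(&a);  /* simulate repeated expiries  */
              return 0;
      }
      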
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 19 Sep 2016, 2 commits
  19. 11 Jun 2016, 1 commit
  20. 02 May 2016, 1 commit
  21. 14 Apr 2016, 1 commit
  22. 11 Apr 2016, 1 commit
      sctp: avoid refreshing heartbeat timer too often · ba6f5e33
      Authored by Marcelo Ricardo Leitner
      Currently, on high-rate SCTP streams the heartbeat timer refresh can
      consume quite a lot of resources, as timer updates are costly and the
      timeout contains a random factor, which a) is also costly to compute and
      b) defeats the mod_timer() optimization of not modifying a timer that is
      being set to the same value. It may even cause the timer to be slightly
      advanced, for no good reason.
      
      As suggested by David Laight, this patch removes this timer update from
      the hot path by leaving the timer running and re-evaluating, upon its
      expiration, whether the heartbeat is still needed, similarly to what is
      done for TCP. If it is not needed anymore, the timer is re-scheduled to
      the new timeout, taking the time already elapsed into account.
      
      For this, we now record the last tx timestamp per transport, updated in
      the same spots where the hb timer used to be restarted on tx. We also
      split sctp_transport_reset_timers into sctp_transport_reset_t3_rtx and
      sctp_transport_reset_hb_timer, so we can re-arm T3 without re-arming the
      heartbeat timer.
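      
      The pattern, as a minimal userspace sketch with hypothetical names
      (transport, on_transmit and on_hb_timer_expiry are illustrative, not the
      kernel symbols): the transmit path only stores a timestamp, and the expiry
      handler decides whether a heartbeat is due or the timer should simply be
      pushed out by the time already elapsed.
      
      #include <stdio.h>
      #include <time.h>
      
      struct transport {
              time_t last_tx;            /* updated on every transmit (cheap)   */
              int hb_interval;           /* seconds between heartbeats          */
      };
      
      static void on_transmit(struct transport *t)
      {
              t->last_tx = time(NULL);   /* no timer re-arm in the hot path     */
      }
      
      /* Returns seconds until the next expiry, or 0 if a heartbeat was sent */
      static int on_hb_timer_expiry(struct transport *t)
      {
              int elapsed = (int)(time(NULL) - t->last_tx);
      
              if (elapsed < t->hb_interval)
                      return t->hb_interval - elapsed;   /* traffic seen: re-arm */
      
              printf("send heartbeat\n");                /* idle long enough     */
              return 0;
      }
      
      int main(void)
      {
              struct transport t = { 0, 30 };
              int next;
      
              on_transmit(&t);           /* data path touches only the stamp    */
              next = on_hb_timer_expiry(&t);
              printf("re-arm heartbeat timer for %d s\n", next);
              return 0;
      }
      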
      
      On loopback with an MTU of 65535 and data chunks of 1636 bytes, so that
      we have a considerable number of chunks without stressing system calls,
      running netperf -t SCTP_STREAM -l 30, perf looked like this before:
      
      Samples: 103K of event 'cpu-clock', Event count (approx.): 25833000000
        Overhead  Command  Shared Object      Symbol
      +    6,15%  netperf  [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
      -    5,43%  netperf  [kernel.vmlinux]   [k] _raw_write_unlock_irqrestore
         - _raw_write_unlock_irqrestore
            - 96,54% _raw_spin_unlock_irqrestore
               - 36,14% mod_timer
                  + 97,24% sctp_transport_reset_timers
                  + 2,76% sctp_do_sm
               + 33,65% __wake_up_sync_key
               + 28,77% sctp_ulpq_tail_event
               + 1,40% del_timer
            - 1,84% mod_timer
               + 99,03% sctp_transport_reset_timers
               + 0,97% sctp_do_sm
            + 1,50% sctp_ulpq_tail_event
      
      And after this patch, now with netperf -l 60:
      
      Samples: 230K of event 'cpu-clock', Event count (approx.): 57707250000
        Overhead  Command  Shared Object      Symbol
      +    5,65%  netperf  [kernel.vmlinux]   [k] memcpy_erms
      +    5,59%  netperf  [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
      -    5,05%  netperf  [kernel.vmlinux]   [k] _raw_spin_unlock_irqrestore
         - _raw_spin_unlock_irqrestore
            + 49,89% __wake_up_sync_key
            + 45,68% sctp_ulpq_tail_event
            - 2,85% mod_timer
               + 76,51% sctp_transport_reset_t3_rtx
               + 23,49% sctp_do_sm
            + 1,55% del_timer
      +    2,50%  netperf  [sctp]             [k] sctp_datamsg_from_user
      +    2,26%  netperf  [sctp]             [k] sctp_sendmsg
      
      Throughput-wise, it went from 6800mbps without the patch to 7050mbps
      with it, a ~3.7% improvement.
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 21 Mar 2016, 1 commit
  24. 14 Mar 2016, 1 commit
      sctp: allow sctp_transmit_packet and others to use gfp · cea8768f
      Authored by Marcelo Ricardo Leitner
      Currently sctp_sendmsg() triggers some calls that allocate memory with
      GFP_ATOMIC even when that is not necessary. In the case of
      sctp_packet_transmit, it allocates a linear skb that is used to construct
      the packet, and this may cause sends to fail due to ENOMEM more often
      than anticipated, especially with big MTUs.
      
      This patch thus allows it to inherit the gfp flags from upper calls, so
      that it can use GFP_KERNEL if it was triggered by an sctp_sendmsg() call
      or similar. All others, like retransmits or flushes started from BH, are
      still allocated using GFP_ATOMIC.
      
      In netperf tests this didn't result in any performance drawbacks when
      memory is not too fragmented, and it made ENOMEM trigger far less often.
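      
      The shape of the change, as a hedged userspace sketch (gfp_like_t and
      packet_transmit are hypothetical stand-ins, not the kernel gfp_t or
      sctp_packet_transmit signatures): the allocation site takes the flag from
      its caller instead of hard-coding the atomic one, so a send started in
      process context may sleep while a BH-context flush may not.
      
      #include <stdio.h>
      
      typedef enum { ALLOC_ATOMIC, ALLOC_CAN_SLEEP } gfp_like_t;
      
      /* Hypothetical stand-in for a transmit path that allocates an skb */
      static int packet_transmit(const char *who, gfp_like_t gfp)
      {
              printf("%s: allocate skb with %s semantics\n", who,
                     gfp == ALLOC_CAN_SLEEP ? "GFP_KERNEL" : "GFP_ATOMIC");
              return 0;
      }
      
      int main(void)
      {
              /* sendmsg-like caller: process context, sleeping allowed */
              packet_transmit("sendmsg path", ALLOC_CAN_SLEEP);
      
              /* retransmit/flush from BH: must stay atomic */
              packet_transmit("BH retransmit", ALLOC_ATOMIC);
              return 0;
      }
      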
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>