1. 30 1月, 2019 15 次提交
  2. 06 12月, 2018 1 次提交
  3. 20 11月, 2018 4 次提交
  4. 13 11月, 2018 2 次提交
  5. 30 10月, 2018 1 次提交
  6. 19 10月, 2018 2 次提交
    • X
      sctp: use sk_wmem_queued to check for writable space · cd305c74
      Xin Long 提交于
      sk->sk_wmem_queued is used to count the size of chunks in out queue
      while sk->sk_wmem_alloc is for counting the size of chunks has been
      sent. sctp is increasing both of them before enqueuing the chunks,
      and using sk->sk_wmem_alloc to check for writable space.
      
      However, sk_wmem_alloc is also increased by 1 for the skb allocked
      for sending in sctp_packet_transmit() but it will not wake up the
      waiters when sk_wmem_alloc is decreased in this skb's destructor.
      
      If msg size is equal to sk_sndbuf and sendmsg is waiting for sndbuf,
      the check 'msg_len <= sctp_wspace(asoc)' in sctp_wait_for_sndbuf()
      will keep waiting if there's a skb allocked in sctp_packet_transmit,
      and later even if this skb got freed, the waiting thread will never
      get waked up.
      
      This issue has been there since very beginning, so we change to use
      sk->sk_wmem_queued to check for writable space as sk_wmem_queued is
      not increased for the skb allocked for sending, also as TCP does.
      
      SOCK_SNDBUF_LOCK check is also removed here as it's for tx buf auto
      tuning which I will add in another patch.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cd305c74
    • X
      sctp: count both sk and asoc sndbuf with skb truesize and sctp_chunk size · 605c0ac1
      Xin Long 提交于
      Now it's confusing that asoc sndbuf_used is doing memory accounting with
      SCTP_DATA_SNDSIZE(chunk) + sizeof(sk_buff) + sizeof(sctp_chunk) while sk
      sk_wmem_alloc is doing that with skb->truesize + sizeof(sctp_chunk).
      
      It also causes sctp_prsctp_prune to count with a wrong freed memory when
      sndbuf_policy is not set.
      
      To make this right and also keep consistent between asoc sndbuf_used, sk
      sk_wmem_alloc and sk_wmem_queued, use skb->truesize + sizeof(sctp_chunk)
      for them.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      605c0ac1
  7. 18 10月, 2018 2 次提交
    • X
      sctp: not free the new asoc when sctp_wait_for_connect returns err · c863850c
      Xin Long 提交于
      When sctp_wait_for_connect is called to wait for connect ready
      for sp->strm_interleave in sctp_sendmsg_to_asoc, a panic could
      be triggered if cpu is scheduled out and the new asoc is freed
      elsewhere, as it will return err and later the asoc gets freed
      again in sctp_sendmsg.
      
      [  285.840764] list_del corruption, ffff9f0f7b284078->next is LIST_POISON1 (dead000000000100)
      [  285.843590] WARNING: CPU: 1 PID: 8861 at lib/list_debug.c:47 __list_del_entry_valid+0x50/0xa0
      [  285.846193] Kernel panic - not syncing: panic_on_warn set ...
      [  285.846193]
      [  285.848206] CPU: 1 PID: 8861 Comm: sctp_ndata Kdump: loaded Not tainted 4.19.0-rc7.label #584
      [  285.850559] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  285.852164] Call Trace:
      ...
      [  285.872210]  ? __list_del_entry_valid+0x50/0xa0
      [  285.872894]  sctp_association_free+0x42/0x2d0 [sctp]
      [  285.873612]  sctp_sendmsg+0x5a4/0x6b0 [sctp]
      [  285.874236]  sock_sendmsg+0x30/0x40
      [  285.874741]  ___sys_sendmsg+0x27a/0x290
      [  285.875304]  ? __switch_to_asm+0x34/0x70
      [  285.875872]  ? __switch_to_asm+0x40/0x70
      [  285.876438]  ? ptep_set_access_flags+0x2a/0x30
      [  285.877083]  ? do_wp_page+0x151/0x540
      [  285.877614]  __sys_sendmsg+0x58/0xa0
      [  285.878138]  do_syscall_64+0x55/0x180
      [  285.878669]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      This is a similar issue with the one fixed in Commit ca3af4dd
      ("sctp: do not free asoc when it is already dead in sctp_sendmsg").
      But this one can't be fixed by returning -ESRCH for the dead asoc
      in sctp_wait_for_connect, as it will break sctp_connect's return
      value to users.
      
      This patch is to simply set err to -ESRCH before it returns to
      sctp_sendmsg when any err is returned by sctp_wait_for_connect
      for sp->strm_interleave, so that no asoc would be freed due to
      this.
      
      When users see this error, they will know the packet hasn't been
      sent. And it also makes sense to not free asoc because waiting
      connect fails, like the second call for sctp_wait_for_connect in
      sctp_sendmsg_to_asoc.
      
      Fixes: 668c9beb ("sctp: implement assign_number for sctp_stream_interleave")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c863850c
    • M
      sctp: fix race on sctp_id2asoc · b336deca
      Marcelo Ricardo Leitner 提交于
      syzbot reported an use-after-free involving sctp_id2asoc.  Dmitry Vyukov
      helped to root cause it and it is because of reading the asoc after it
      was freed:
      
              CPU 1                       CPU 2
      (working on socket 1)            (working on socket 2)
      	                         sctp_association_destroy
      sctp_id2asoc
         spin lock
           grab the asoc from idr
         spin unlock
                                         spin lock
      				     remove asoc from idr
      				   spin unlock
      				   free(asoc)
         if asoc->base.sk != sk ... [*]
      
      This can only be hit if trying to fetch asocs from different sockets. As
      we have a single IDR for all asocs, in all SCTP sockets, their id is
      unique on the system. An application can try to send stuff on an id
      that matches on another socket, and the if in [*] will protect from such
      usage. But it didn't consider that as that asoc may belong to another
      socket, it may be freed in parallel (read: under another socket lock).
      
      We fix it by moving the checks in [*] into the protected region. This
      fixes it because the asoc cannot be freed while the lock is held.
      
      Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com
      Acked-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b336deca
  8. 17 10月, 2018 1 次提交
  9. 04 9月, 2018 2 次提交
  10. 28 8月, 2018 1 次提交
  11. 12 8月, 2018 1 次提交
  12. 04 7月, 2018 2 次提交
    • X
      sctp: add support for setting flowlabel when adding a transport · 4be4139f
      Xin Long 提交于
      Struct sockaddr_in6 has the member sin6_flowinfo that includes the
      ipv6 flowlabel, it should also support for setting flowlabel when
      adding a transport whose ipaddr is from userspace.
      
      Note that addrinfo in sctp_sendmsg is using struct in6_addr for
      the secondary addrs, which doesn't contain sin6_flowinfo, and
      it needs to copy sin6_flowinfo from the primary addr.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4be4139f
    • X
      sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams · 0b0dce7a
      Xin Long 提交于
      spp_ipv6_flowlabel and spp_dscp are added in sctp_paddrparams in
      this patch so that users could set sctp_sock/asoc/transport dscp
      and flowlabel with spp_flags SPP_IPV6_FLOWLABEL or SPP_DSCP by
      SCTP_PEER_ADDR_PARAMS , as described section 8.1.12 in RFC6458.
      
      As said in last patch, it uses '| 0x100000' or '|0x1' to mark
      flowlabel or dscp is set,  so that their values could be set
      to 0.
      
      Note that to guarantee that an old app built with old kernel
      headers could work on the newer kernel, the param's check in
      sctp_g/setsockopt_peer_addr_params() is also improved, which
      follows the way that sctp_g/setsockopt_delayed_ack() or some
      other sockopts' process that accept two types of params does.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b0dce7a
  13. 29 6月, 2018 2 次提交
    • X
      sctp: add support for SCTP_REUSE_PORT sockopt · b0e9a2fe
      Xin Long 提交于
      This feature is actually already supported by sk->sk_reuse which can be
      set by socket level opt SO_REUSEADDR. But it's not working exactly as
      RFC6458 demands in section 8.1.27, like:
      
        - This option only supports one-to-one style SCTP sockets
        - This socket option must not be used after calling bind()
          or sctp_bindx().
      
      Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs.
      Otherwise, the programs with SCTP_REUSE_PORT from other systems will not
      work in linux.
      
      To separate it from the socket level version, this patch adds 'reuse' in
      sctp_sock and it works pretty much as sk->sk_reuse, but with some extra
      setup limitations that are needed when it is being enabled.
      
      "It should be noted that the behavior of the socket-level socket option
      to reuse ports and/or addresses for SCTP sockets is unspecified", so it
      leaves SO_REUSEADDR as is for the compatibility.
      
      Note that the name SCTP_REUSE_PORT is somewhat confusing, as its
      functionality is nearly identical to SO_REUSEADDR, but with some
      extra restrictions. Here it uses 'reuse' in sctp_sock instead of
      'reuseport'. As for sk->sk_reuseport support for SCTP, it will be
      added in another patch.
      
      Thanks to Neil to make this clear.
      
      v1->v2:
        - add sctp_sk->reuse to separate it from the socket level version.
      v2->v3:
        - improve changelog according to Marcelo's suggestion.
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b0e9a2fe
    • L
      Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL · a11e1d43
      Linus Torvalds 提交于
      The poll() changes were not well thought out, and completely
      unexplained.  They also caused a huge performance regression, because
      "->poll()" was no longer a trivial file operation that just called down
      to the underlying file operations, but instead did at least two indirect
      calls.
      
      Indirect calls are sadly slow now with the Spectre mitigation, but the
      performance problem could at least be largely mitigated by changing the
      "->get_poll_head()" operation to just have a per-file-descriptor pointer
      to the poll head instead.  That gets rid of one of the new indirections.
      
      But that doesn't fix the new complexity that is completely unwarranted
      for the regular case.  The (undocumented) reason for the poll() changes
      was some alleged AIO poll race fixing, but we don't make the common case
      slower and more complex for some uncommon special case, so this all
      really needs way more explanations and most likely a fundamental
      redesign.
      
      [ This revert is a revert of about 30 different commits, not reverted
        individually because that would just be unnecessarily messy  - Linus ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a11e1d43
  14. 22 6月, 2018 1 次提交
    • N
      rhashtable: split rhashtable.h · 0eb71a9d
      NeilBrown 提交于
      Due to the use of rhashtables in net namespaces,
      rhashtable.h is included in lots of the kernel,
      so a small changes can required a large recompilation.
      This makes development painful.
      
      This patch splits out rhashtable-types.h which just includes
      the major type declarations, and does not include (non-trivial)
      inline code.  rhashtable.h is no longer included by anything
      in the include/ directory.
      Common include files only include rhashtable-types.h so a large
      recompilation is only triggered when that changes.
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0eb71a9d
  15. 26 5月, 2018 1 次提交
  16. 23 5月, 2018 1 次提交
  17. 28 4月, 2018 1 次提交