1. 20 11月, 2018 1 次提交
    • X
      sctp: define subscribe in sctp_sock as __u16 · 2cc0eeb6
      Xin Long 提交于
      The member subscribe in sctp_sock is used to indicate to which of
      the events it is subscribed, more like a group of flags. So it's
      better to be defined as __u16 (2 bytpes), instead of struct
      sctp_event_subscribe (13 bytes).
      
      Note that sctp_event_subscribe is an UAPI struct, used on sockopt
      calls, and thus it will not be removed. This patch only changes
      the internal storage of the flags.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2cc0eeb6
  2. 13 11月, 2018 2 次提交
  3. 30 10月, 2018 1 次提交
  4. 19 10月, 2018 2 次提交
    • X
      sctp: use sk_wmem_queued to check for writable space · cd305c74
      Xin Long 提交于
      sk->sk_wmem_queued is used to count the size of chunks in out queue
      while sk->sk_wmem_alloc is for counting the size of chunks has been
      sent. sctp is increasing both of them before enqueuing the chunks,
      and using sk->sk_wmem_alloc to check for writable space.
      
      However, sk_wmem_alloc is also increased by 1 for the skb allocked
      for sending in sctp_packet_transmit() but it will not wake up the
      waiters when sk_wmem_alloc is decreased in this skb's destructor.
      
      If msg size is equal to sk_sndbuf and sendmsg is waiting for sndbuf,
      the check 'msg_len <= sctp_wspace(asoc)' in sctp_wait_for_sndbuf()
      will keep waiting if there's a skb allocked in sctp_packet_transmit,
      and later even if this skb got freed, the waiting thread will never
      get waked up.
      
      This issue has been there since very beginning, so we change to use
      sk->sk_wmem_queued to check for writable space as sk_wmem_queued is
      not increased for the skb allocked for sending, also as TCP does.
      
      SOCK_SNDBUF_LOCK check is also removed here as it's for tx buf auto
      tuning which I will add in another patch.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cd305c74
    • X
      sctp: count both sk and asoc sndbuf with skb truesize and sctp_chunk size · 605c0ac1
      Xin Long 提交于
      Now it's confusing that asoc sndbuf_used is doing memory accounting with
      SCTP_DATA_SNDSIZE(chunk) + sizeof(sk_buff) + sizeof(sctp_chunk) while sk
      sk_wmem_alloc is doing that with skb->truesize + sizeof(sctp_chunk).
      
      It also causes sctp_prsctp_prune to count with a wrong freed memory when
      sndbuf_policy is not set.
      
      To make this right and also keep consistent between asoc sndbuf_used, sk
      sk_wmem_alloc and sk_wmem_queued, use skb->truesize + sizeof(sctp_chunk)
      for them.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      605c0ac1
  5. 18 10月, 2018 2 次提交
    • X
      sctp: not free the new asoc when sctp_wait_for_connect returns err · c863850c
      Xin Long 提交于
      When sctp_wait_for_connect is called to wait for connect ready
      for sp->strm_interleave in sctp_sendmsg_to_asoc, a panic could
      be triggered if cpu is scheduled out and the new asoc is freed
      elsewhere, as it will return err and later the asoc gets freed
      again in sctp_sendmsg.
      
      [  285.840764] list_del corruption, ffff9f0f7b284078->next is LIST_POISON1 (dead000000000100)
      [  285.843590] WARNING: CPU: 1 PID: 8861 at lib/list_debug.c:47 __list_del_entry_valid+0x50/0xa0
      [  285.846193] Kernel panic - not syncing: panic_on_warn set ...
      [  285.846193]
      [  285.848206] CPU: 1 PID: 8861 Comm: sctp_ndata Kdump: loaded Not tainted 4.19.0-rc7.label #584
      [  285.850559] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  285.852164] Call Trace:
      ...
      [  285.872210]  ? __list_del_entry_valid+0x50/0xa0
      [  285.872894]  sctp_association_free+0x42/0x2d0 [sctp]
      [  285.873612]  sctp_sendmsg+0x5a4/0x6b0 [sctp]
      [  285.874236]  sock_sendmsg+0x30/0x40
      [  285.874741]  ___sys_sendmsg+0x27a/0x290
      [  285.875304]  ? __switch_to_asm+0x34/0x70
      [  285.875872]  ? __switch_to_asm+0x40/0x70
      [  285.876438]  ? ptep_set_access_flags+0x2a/0x30
      [  285.877083]  ? do_wp_page+0x151/0x540
      [  285.877614]  __sys_sendmsg+0x58/0xa0
      [  285.878138]  do_syscall_64+0x55/0x180
      [  285.878669]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      This is a similar issue with the one fixed in Commit ca3af4dd
      ("sctp: do not free asoc when it is already dead in sctp_sendmsg").
      But this one can't be fixed by returning -ESRCH for the dead asoc
      in sctp_wait_for_connect, as it will break sctp_connect's return
      value to users.
      
      This patch is to simply set err to -ESRCH before it returns to
      sctp_sendmsg when any err is returned by sctp_wait_for_connect
      for sp->strm_interleave, so that no asoc would be freed due to
      this.
      
      When users see this error, they will know the packet hasn't been
      sent. And it also makes sense to not free asoc because waiting
      connect fails, like the second call for sctp_wait_for_connect in
      sctp_sendmsg_to_asoc.
      
      Fixes: 668c9beb ("sctp: implement assign_number for sctp_stream_interleave")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c863850c
    • M
      sctp: fix race on sctp_id2asoc · b336deca
      Marcelo Ricardo Leitner 提交于
      syzbot reported an use-after-free involving sctp_id2asoc.  Dmitry Vyukov
      helped to root cause it and it is because of reading the asoc after it
      was freed:
      
              CPU 1                       CPU 2
      (working on socket 1)            (working on socket 2)
      	                         sctp_association_destroy
      sctp_id2asoc
         spin lock
           grab the asoc from idr
         spin unlock
                                         spin lock
      				     remove asoc from idr
      				   spin unlock
      				   free(asoc)
         if asoc->base.sk != sk ... [*]
      
      This can only be hit if trying to fetch asocs from different sockets. As
      we have a single IDR for all asocs, in all SCTP sockets, their id is
      unique on the system. An application can try to send stuff on an id
      that matches on another socket, and the if in [*] will protect from such
      usage. But it didn't consider that as that asoc may belong to another
      socket, it may be freed in parallel (read: under another socket lock).
      
      We fix it by moving the checks in [*] into the protected region. This
      fixes it because the asoc cannot be freed while the lock is held.
      
      Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com
      Acked-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b336deca
  6. 17 10月, 2018 1 次提交
  7. 04 9月, 2018 2 次提交
  8. 28 8月, 2018 1 次提交
  9. 12 8月, 2018 1 次提交
  10. 04 7月, 2018 2 次提交
    • X
      sctp: add support for setting flowlabel when adding a transport · 4be4139f
      Xin Long 提交于
      Struct sockaddr_in6 has the member sin6_flowinfo that includes the
      ipv6 flowlabel, it should also support for setting flowlabel when
      adding a transport whose ipaddr is from userspace.
      
      Note that addrinfo in sctp_sendmsg is using struct in6_addr for
      the secondary addrs, which doesn't contain sin6_flowinfo, and
      it needs to copy sin6_flowinfo from the primary addr.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4be4139f
    • X
      sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams · 0b0dce7a
      Xin Long 提交于
      spp_ipv6_flowlabel and spp_dscp are added in sctp_paddrparams in
      this patch so that users could set sctp_sock/asoc/transport dscp
      and flowlabel with spp_flags SPP_IPV6_FLOWLABEL or SPP_DSCP by
      SCTP_PEER_ADDR_PARAMS , as described section 8.1.12 in RFC6458.
      
      As said in last patch, it uses '| 0x100000' or '|0x1' to mark
      flowlabel or dscp is set,  so that their values could be set
      to 0.
      
      Note that to guarantee that an old app built with old kernel
      headers could work on the newer kernel, the param's check in
      sctp_g/setsockopt_peer_addr_params() is also improved, which
      follows the way that sctp_g/setsockopt_delayed_ack() or some
      other sockopts' process that accept two types of params does.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b0dce7a
  11. 29 6月, 2018 2 次提交
    • X
      sctp: add support for SCTP_REUSE_PORT sockopt · b0e9a2fe
      Xin Long 提交于
      This feature is actually already supported by sk->sk_reuse which can be
      set by socket level opt SO_REUSEADDR. But it's not working exactly as
      RFC6458 demands in section 8.1.27, like:
      
        - This option only supports one-to-one style SCTP sockets
        - This socket option must not be used after calling bind()
          or sctp_bindx().
      
      Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs.
      Otherwise, the programs with SCTP_REUSE_PORT from other systems will not
      work in linux.
      
      To separate it from the socket level version, this patch adds 'reuse' in
      sctp_sock and it works pretty much as sk->sk_reuse, but with some extra
      setup limitations that are needed when it is being enabled.
      
      "It should be noted that the behavior of the socket-level socket option
      to reuse ports and/or addresses for SCTP sockets is unspecified", so it
      leaves SO_REUSEADDR as is for the compatibility.
      
      Note that the name SCTP_REUSE_PORT is somewhat confusing, as its
      functionality is nearly identical to SO_REUSEADDR, but with some
      extra restrictions. Here it uses 'reuse' in sctp_sock instead of
      'reuseport'. As for sk->sk_reuseport support for SCTP, it will be
      added in another patch.
      
      Thanks to Neil to make this clear.
      
      v1->v2:
        - add sctp_sk->reuse to separate it from the socket level version.
      v2->v3:
        - improve changelog according to Marcelo's suggestion.
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b0e9a2fe
    • L
      Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL · a11e1d43
      Linus Torvalds 提交于
      The poll() changes were not well thought out, and completely
      unexplained.  They also caused a huge performance regression, because
      "->poll()" was no longer a trivial file operation that just called down
      to the underlying file operations, but instead did at least two indirect
      calls.
      
      Indirect calls are sadly slow now with the Spectre mitigation, but the
      performance problem could at least be largely mitigated by changing the
      "->get_poll_head()" operation to just have a per-file-descriptor pointer
      to the poll head instead.  That gets rid of one of the new indirections.
      
      But that doesn't fix the new complexity that is completely unwarranted
      for the regular case.  The (undocumented) reason for the poll() changes
      was some alleged AIO poll race fixing, but we don't make the common case
      slower and more complex for some uncommon special case, so this all
      really needs way more explanations and most likely a fundamental
      redesign.
      
      [ This revert is a revert of about 30 different commits, not reverted
        individually because that would just be unnecessarily messy  - Linus ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a11e1d43
  12. 22 6月, 2018 1 次提交
    • N
      rhashtable: split rhashtable.h · 0eb71a9d
      NeilBrown 提交于
      Due to the use of rhashtables in net namespaces,
      rhashtable.h is included in lots of the kernel,
      so a small changes can required a large recompilation.
      This makes development painful.
      
      This patch splits out rhashtable-types.h which just includes
      the major type declarations, and does not include (non-trivial)
      inline code.  rhashtable.h is no longer included by anything
      in the include/ directory.
      Common include files only include rhashtable-types.h so a large
      recompilation is only triggered when that changes.
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0eb71a9d
  13. 26 5月, 2018 1 次提交
  14. 23 5月, 2018 1 次提交
  15. 28 4月, 2018 8 次提交
  16. 09 4月, 2018 1 次提交
    • E
      sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6 · 81e98370
      Eric Dumazet 提交于
      Check must happen before call to ipv6_addr_v4mapped()
      
      syzbot report was :
      
      BUG: KMSAN: uninit-value in sctp_sockaddr_af net/sctp/socket.c:359 [inline]
      BUG: KMSAN: uninit-value in sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384
      CPU: 0 PID: 3576 Comm: syzkaller968804 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       sctp_sockaddr_af net/sctp/socket.c:359 [inline]
       sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384
       sctp_bind+0x149/0x190 net/sctp/socket.c:332
       inet6_bind+0x1fd/0x1820 net/ipv6/af_inet6.c:293
       SYSC_bind+0x3f2/0x4b0 net/socket.c:1474
       SyS_bind+0x54/0x80 net/socket.c:1460
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x43fd49
      RSP: 002b:00007ffe99df3d28 EFLAGS: 00000213 ORIG_RAX: 0000000000000031
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fd49
      RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
      R10: 00000000004002c8 R11: 0000000000000213 R12: 0000000000401670
      R13: 0000000000401700 R14: 0000000000000000 R15: 0000000000000000
      
      Local variable description: ----address@SYSC_bind
      Variable was created at:
       SYSC_bind+0x6f/0x4b0 net/socket.c:1461
       SyS_bind+0x54/0x80 net/socket.c:1460
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      81e98370
  17. 16 3月, 2018 1 次提交
    • N
      sctp: Fix double free in sctp_sendmsg_to_asoc · 0aee4c25
      Neil Horman 提交于
      syzbot/kasan detected a double free in sctp_sendmsg_to_asoc:
      BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
      net/sctp/associola.c:332
      Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202
      
      CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
      Hardware name: Google Google Compute Engine/Google Compute Engine
      01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x24d lib/dump_stack.c:53
       print_address_description+0x73/0x250 mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report+0x23c/0x360 mm/kasan/report.c:412
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
       sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
       sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
       inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:639
       SYSC_sendto+0x361/0x5c0 net/socket.c:1748
       SyS_sendto+0x40/0x50 net/socket.c:1716
       do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      This was introduced by commit:
      f84af331 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg
      
      As the newly refactored function moved the wait_for_sndbuf call to a
      point after the association was connected, allowing for peeloff events
      to occur, which in turn caused wait_for_sndbuf to return -EPIPE which
      was not caught by the logic that determines if an association should be
      freed or not.
      
      Fix it the easy way by returning the ordering of
      sctp_primitive_ASSOCIATE and sctp_wait_for_sndbuf to the old order, to
      ensure that EPIPE will not happen.
      
      Tested by myself using the syzbot reproducers with positive results
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      CC: davem@davemloft.net
      CC: Xin Long <lucien.xin@gmail.com>
      Reported-by: syzbot+a4e4112c3aff00c8cfd8@syzkaller.appspotmail.com
      Reviewed-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0aee4c25
  18. 15 3月, 2018 4 次提交
  19. 13 3月, 2018 1 次提交
  20. 07 3月, 2018 3 次提交
  21. 05 3月, 2018 2 次提交