1. 05 12月, 2019 1 次提交
  2. 27 9月, 2019 1 次提交
  3. 09 7月, 2019 1 次提交
    • W
      ipv6: elide flowlabel check if no exclusive leases exist · 59c820b2
      Willem de Bruijn 提交于
      Processes can request ipv6 flowlabels with cmsg IPV6_FLOWINFO.
      If not set, by default an autogenerated flowlabel is selected.
      
      Explicit flowlabels require a control operation per label plus a
      datapath check on every connection (every datagram if unconnected).
      This is particularly expensive on unconnected sockets multiplexing
      many flows, such as QUIC.
      
      In the common case, where no lease is exclusive, the check can be
      safely elided, as both lease request and check trivially succeed.
      Indeed, autoflowlabel does the same even with exclusive leases.
      
      Elide the check if no process has requested an exclusive lease.
      
      fl6_sock_lookup previously returns either a reference to a lease or
      NULL to denote failure. Modify to return a real error and update
      all callers. On return NULL, they can use the label and will elide
      the atomic_dec in fl6_sock_release.
      
      This is an optimization. Robust applications still have to revert to
      requesting leases if the fast path fails due to an exclusive lease.
      
      Changes RFC->v1:
        - use static_key_false_deferred to rate limit jump label operations
          - call static_key_deferred_flush to stop timers on exit
        - move decrement out of RCU context
        - defer optimization also if opt data is associated with a lease
        - updated all fp6_sock_lookup callers, not just udp
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59c820b2
  4. 24 5月, 2019 1 次提交
  5. 20 4月, 2019 1 次提交
  6. 25 1月, 2019 1 次提交
    • X
      sctp: set flow sport from saddr only when it's 0 · ecf938fe
      Xin Long 提交于
      Now sctp_transport_pmtu() passes transport->saddr into .get_dst() to set
      flow sport from 'saddr'. However, transport->saddr is set only when
      transport->dst exists in sctp_transport_route().
      
      If sctp_transport_pmtu() is called without transport->saddr set, like
      when transport->dst doesn't exists, the flow sport will be set to 0
      from transport->saddr, which will cause a wrong route to be got.
      
      Commit 6e91b578 ("sctp: re-use sctp_transport_pmtu in
      sctp_transport_route") made the issue be triggered more easily
      since sctp_transport_pmtu() would be called in sctp_transport_route()
      after that.
      
      In gerneral, fl4->fl4_sport should always be set to
      htons(asoc->base.bind_addr.port), unless transport->asoc doesn't exist
      in sctp_v4/6_get_dst(), which is the case:
      
        sctp_ootb_pkt_new() ->
          sctp_transport_route()
      
      For that, we can simply handle it by setting flow sport from saddr only
      when it's 0 in sctp_v4/6_get_dst().
      
      Fixes: 6e91b578 ("sctp: re-use sctp_transport_pmtu in sctp_transport_route")
      Reported-by: NYing Xu <yinxu@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ecf938fe
  7. 17 1月, 2019 1 次提交
  8. 11 12月, 2018 1 次提交
    • X
      sctp: initialize sin6_flowinfo for ipv6 addrs in sctp_inet6addr_event · 4a2eb0c3
      Xin Long 提交于
      syzbot reported a kernel-infoleak, which is caused by an uninitialized
      field(sin6_flowinfo) of addr->a.v6 in sctp_inet6addr_event().
      The call trace is as below:
      
        BUG: KMSAN: kernel-infoleak in _copy_to_user+0x19a/0x230 lib/usercopy.c:33
        CPU: 1 PID: 8164 Comm: syz-executor2 Not tainted 4.20.0-rc3+ #95
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
        Google 01/01/2011
        Call Trace:
          __dump_stack lib/dump_stack.c:77 [inline]
          dump_stack+0x32d/0x480 lib/dump_stack.c:113
          kmsan_report+0x12c/0x290 mm/kmsan/kmsan.c:683
          kmsan_internal_check_memory+0x32a/0xa50 mm/kmsan/kmsan.c:743
          kmsan_copy_to_user+0x78/0xd0 mm/kmsan/kmsan_hooks.c:634
          _copy_to_user+0x19a/0x230 lib/usercopy.c:33
          copy_to_user include/linux/uaccess.h:183 [inline]
          sctp_getsockopt_local_addrs net/sctp/socket.c:5998 [inline]
          sctp_getsockopt+0x15248/0x186f0 net/sctp/socket.c:7477
          sock_common_getsockopt+0x13f/0x180 net/core/sock.c:2937
          __sys_getsockopt+0x489/0x550 net/socket.c:1939
          __do_sys_getsockopt net/socket.c:1950 [inline]
          __se_sys_getsockopt+0xe1/0x100 net/socket.c:1947
          __x64_sys_getsockopt+0x62/0x80 net/socket.c:1947
          do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
          entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      sin6_flowinfo is not really used by SCTP, so it will be fixed by simply
      setting it to 0.
      
      The issue exists since very beginning.
      Thanks Alexander for the reproducer provided.
      
      Reported-by: syzbot+ad5d327e6936a2e284be@syzkaller.appspotmail.com
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4a2eb0c3
  9. 09 11月, 2018 1 次提交
    • S
      net: Convert protocol error handlers from void to int · 32bbd879
      Stefano Brivio 提交于
      We'll need this to handle ICMP errors for tunnels without a sending socket
      (i.e. FoU and GUE). There, we might have to look up different types of IP
      tunnels, registered as network protocols, before we get a match, so we
      want this for the error handlers of IPPROTO_IPIP and IPPROTO_IPV6 in both
      inet_protos and inet6_protos. These error codes will be used in the next
      patch.
      
      For consistency, return sensible error codes in protocol error handlers
      whenever handlers can't handle errors because, even if valid, they don't
      match a protocol or any of its states.
      
      This has no effect on existing error handling paths.
      Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32bbd879
  10. 04 7月, 2018 2 次提交
  11. 29 6月, 2018 1 次提交
    • L
      Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL · a11e1d43
      Linus Torvalds 提交于
      The poll() changes were not well thought out, and completely
      unexplained.  They also caused a huge performance regression, because
      "->poll()" was no longer a trivial file operation that just called down
      to the underlying file operations, but instead did at least two indirect
      calls.
      
      Indirect calls are sadly slow now with the Spectre mitigation, but the
      performance problem could at least be largely mitigated by changing the
      "->get_poll_head()" operation to just have a per-file-descriptor pointer
      to the poll head instead.  That gets rid of one of the new indirections.
      
      But that doesn't fix the new complexity that is completely unwarranted
      for the regular case.  The (undocumented) reason for the poll() changes
      was some alleged AIO poll race fixing, but we don't make the common case
      slower and more complex for some uncommon special case, so this all
      really needs way more explanations and most likely a fundamental
      redesign.
      
      [ This revert is a revert of about 30 different commits, not reverted
        individually because that would just be unnecessarily messy  - Linus ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a11e1d43
  12. 26 5月, 2018 1 次提交
  13. 23 5月, 2018 1 次提交
  14. 28 4月, 2018 1 次提交
  15. 13 4月, 2018 1 次提交
    • X
      sctp: do not check port in sctp_inet6_cmp_addr · 1071ec9d
      Xin Long 提交于
      pf->cmp_addr() is called before binding a v6 address to the sock. It
      should not check ports, like in sctp_inet_cmp_addr.
      
      But sctp_inet6_cmp_addr checks the addr by invoking af(6)->cmp_addr,
      sctp_v6_cmp_addr where it also compares the ports.
      
      This would cause that setsockopt(SCTP_SOCKOPT_BINDX_ADD) could bind
      multiple duplicated IPv6 addresses after Commit 40b4f0fd ("sctp:
      lack the check for ports in sctp_v6_cmp_addr").
      
      This patch is to remove af->cmp_addr called in sctp_inet6_cmp_addr,
      but do the proper check for both v6 addrs and v4mapped addrs.
      
      v1->v2:
        - define __sctp_v6_cmp_addr to do the common address comparison
          used for both pf and af v6 cmp_addr.
      
      Fixes: 40b4f0fd ("sctp: lack the check for ports in sctp_v6_cmp_addr")
      Reported-by: NJianwen Ji <jiji@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1071ec9d
  16. 08 4月, 2018 1 次提交
    • E
      sctp: do not leak kernel memory to user space · 6780db24
      Eric Dumazet 提交于
      syzbot produced a nice report [1]
      
      Issue here is that a recvmmsg() managed to leak 8 bytes of kernel memory
      to user space, because sin_zero (padding field) was not properly cleared.
      
      [1]
      BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline]
      BUG: KMSAN: uninit-value in move_addr_to_user+0x32e/0x530 net/socket.c:227
      CPU: 1 PID: 3586 Comm: syzkaller481044 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       kmsan_internal_check_memory+0x164/0x1d0 mm/kmsan/kmsan.c:1176
       kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
       copy_to_user include/linux/uaccess.h:184 [inline]
       move_addr_to_user+0x32e/0x530 net/socket.c:227
       ___sys_recvmsg+0x4e2/0x810 net/socket.c:2211
       __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
       SYSC_recvmmsg+0x29b/0x3e0 net/socket.c:2394
       SyS_recvmmsg+0x76/0xa0 net/socket.c:2378
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x4401c9
      RSP: 002b:00007ffc56f73098 EFLAGS: 00000217 ORIG_RAX: 000000000000012b
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9
      RDX: 0000000000000001 RSI: 0000000020003ac0 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000020003bc0 R09: 0000000000000010
      R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000401af0
      R13: 0000000000401b80 R14: 0000000000000000 R15: 0000000000000000
      
      Local variable description: ----addr@___sys_recvmsg
      Variable was created at:
       ___sys_recvmsg+0xd5/0x810 net/socket.c:2172
       __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
      
      Bytes 8-15 of 16 are uninitialized
      
      ==================================================================
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 1 PID: 3586 Comm: syzkaller481044 Tainted: G    B            4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       panic+0x39d/0x940 kernel/panic.c:183
       kmsan_report+0x238/0x240 mm/kmsan/kmsan.c:1083
       kmsan_internal_check_memory+0x164/0x1d0 mm/kmsan/kmsan.c:1176
       kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
       copy_to_user include/linux/uaccess.h:184 [inline]
       move_addr_to_user+0x32e/0x530 net/socket.c:227
       ___sys_recvmsg+0x4e2/0x810 net/socket.c:2211
       __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
       SYSC_recvmmsg+0x29b/0x3e0 net/socket.c:2394
       SyS_recvmmsg+0x76/0xa0 net/socket.c:2378
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc:	Vlad Yasevich <vyasevich@gmail.com>
      Cc:	Neil Horman <nhorman@tuxdriver.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6780db24
  17. 27 2月, 2018 1 次提交
  18. 13 2月, 2018 1 次提交
    • D
      net: make getname() functions return length rather than use int* parameter · 9b2c45d4
      Denys Vlasenko 提交于
      Changes since v1:
      Added changes in these files:
          drivers/infiniband/hw/usnic/usnic_transport.c
          drivers/staging/lustre/lnet/lnet/lib-socket.c
          drivers/target/iscsi/iscsi_target_login.c
          drivers/vhost/net.c
          fs/dlm/lowcomms.c
          fs/ocfs2/cluster/tcp.c
          security/tomoyo/network.c
      
      Before:
      All these functions either return a negative error indicator,
      or store length of sockaddr into "int *socklen" parameter
      and return zero on success.
      
      "int *socklen" parameter is awkward. For example, if caller does not
      care, it still needs to provide on-stack storage for the value
      it does not need.
      
      None of the many FOO_getname() functions of various protocols
      ever used old value of *socklen. They always just overwrite it.
      
      This change drops this parameter, and makes all these functions, on success,
      return length of sockaddr. It's always >= 0 and can be differentiated
      from an error.
      
      Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.
      
      rpc_sockname() lost "int buflen" parameter, since its only use was
      to be passed to kernel_getsockname() as &buflen and subsequently
      not used in any way.
      
      Userspace API is not changed.
      
          text    data     bss      dec     hex filename
      30108430 2633624  873672 33615726 200ef6e vmlinux.before.o
      30108109 2633612  873672 33615393 200ee21 vmlinux.o
      Signed-off-by: NDenys Vlasenko <dvlasenk@redhat.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: linux-kernel@vger.kernel.org
      CC: netdev@vger.kernel.org
      CC: linux-bluetooth@vger.kernel.org
      CC: linux-decnet-user@lists.sourceforge.net
      CC: linux-wireless@vger.kernel.org
      CC: linux-rdma@vger.kernel.org
      CC: linux-sctp@vger.kernel.org
      CC: linux-nfs@vger.kernel.org
      CC: linux-x25@vger.kernel.org
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9b2c45d4
  19. 06 2月, 2018 1 次提交
  20. 16 1月, 2018 1 次提交
  21. 16 11月, 2017 1 次提交
  22. 29 10月, 2017 1 次提交
  23. 24 10月, 2017 1 次提交
    • L
      sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND · b71d21c2
      Laszlo Toth 提交于
      Commit 9b974202 ("sctp: support ipv6 nonlocal bind")
      introduced support for the above options as v4 sctp did,
      so patched sctp_v6_available().
      
      In the v4 implementation it's enough, because
      sctp_inet_bind_verify() just returns with sctp_v4_available().
      However sctp_inet6_bind_verify() has an extra check before that
      for link-local scope_id, which won't respect the above options.
      
      Added the checks before calling ipv6_chk_addr(), but
      not before the validation of scope_id.
      
      before (w/ both options):
       ./v6test fe80::10 sctp
       bind failed, errno: 99 (Cannot assign requested address)
       ./v6test fe80::10 tcp
       bind success, errno: 0 (Success)
      
      after (w/ both options):
       ./v6test fe80::10 sctp
       bind success, errno: 0 (Success)
      Signed-off-by: NLaszlo Toth <laszlth@gmail.com>
      Reviewed-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b71d21c2
  24. 19 8月, 2017 1 次提交
    • A
      sctp: fully initialize the IPv6 address in sctp_v6_to_addr() · 15339e44
      Alexander Potapenko 提交于
      KMSAN reported use of uninitialized sctp_addr->v4.sin_addr.s_addr and
      sctp_addr->v6.sin6_scope_id in sctp_v6_cmp_addr() (see below).
      Make sure all fields of an IPv6 address are initialized, which
      guarantees that the IPv4 fields are also initialized.
      
      ==================================================================
       BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
       net/sctp/ipv6.c:517
       CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
       01/01/2011
       Call Trace:
        dump_stack+0x172/0x1c0 lib/dump_stack.c:42
        is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
        kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
        native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
        arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
        arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
        __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
        sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
        sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
        sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
        sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
        sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
        inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
        sock_sendmsg_nosec net/socket.c:633 [inline]
        sock_sendmsg net/socket.c:643 [inline]
        SYSC_sendto+0x608/0x710 net/socket.c:1696
        SyS_sendto+0x8a/0xb0 net/socket.c:1664
        entry_SYSCALL_64_fastpath+0x13/0x94
       RIP: 0033:0x44b479
       RSP: 002b:00007f6213f21c08 EFLAGS: 00000286 ORIG_RAX: 000000000000002c
       RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 000000000044b479
       RDX: 0000000000000041 RSI: 0000000020edd000 RDI: 0000000000000006
       RBP: 00000000007080a8 R08: 0000000020b85fe4 R09: 000000000000001c
       R10: 0000000000040005 R11: 0000000000000286 R12: 00000000ffffffff
       R13: 0000000000003760 R14: 00000000006e5820 R15: 0000000000ff8000
       origin description: ----dst_saddr@sctp_v6_get_dst
       local variable created at:
        sk_fullsock include/net/sock.h:2321 [inline]
        inet6_sk include/linux/ipv6.h:309 [inline]
        sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
        sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
      ==================================================================
       BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
       net/sctp/ipv6.c:517
       CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
       01/01/2011
       Call Trace:
        dump_stack+0x172/0x1c0 lib/dump_stack.c:42
        is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
        kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
        native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
        arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
        arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
        __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
        sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
        sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
        sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
        sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
        sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
        inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
        sock_sendmsg_nosec net/socket.c:633 [inline]
        sock_sendmsg net/socket.c:643 [inline]
        SYSC_sendto+0x608/0x710 net/socket.c:1696
        SyS_sendto+0x8a/0xb0 net/socket.c:1664
        entry_SYSCALL_64_fastpath+0x13/0x94
       RIP: 0033:0x44b479
       RSP: 002b:00007f6213f21c08 EFLAGS: 00000286 ORIG_RAX: 000000000000002c
       RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 000000000044b479
       RDX: 0000000000000041 RSI: 0000000020edd000 RDI: 0000000000000006
       RBP: 00000000007080a8 R08: 0000000020b85fe4 R09: 000000000000001c
       R10: 0000000000040005 R11: 0000000000000286 R12: 00000000ffffffff
       R13: 0000000000003760 R14: 00000000006e5820 R15: 0000000000ff8000
       origin description: ----dst_saddr@sctp_v6_get_dst
       local variable created at:
        sk_fullsock include/net/sock.h:2321 [inline]
        inet6_sk include/linux/ipv6.h:309 [inline]
        sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
        sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
      ==================================================================
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Reviewed-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      15339e44
  25. 07 8月, 2017 1 次提交
  26. 17 7月, 2017 1 次提交
  27. 06 7月, 2017 1 次提交
  28. 18 5月, 2017 1 次提交
  29. 12 5月, 2017 1 次提交
    • X
      sctp: fix src address selection if using secondary addresses for ipv6 · dbc2b5e9
      Xin Long 提交于
      Commit 0ca50d12 ("sctp: fix src address selection if using secondary
      addresses") has fixed a src address selection issue when using secondary
      addresses for ipv4.
      
      Now sctp ipv6 also has the similar issue. When using a secondary address,
      sctp_v6_get_dst tries to choose the saddr which has the most same bits
      with the daddr by sctp_v6_addr_match_len. It may make some cases not work
      as expected.
      
      hostA:
        [1] fd21:356b:459a:cf10::11 (eth1)
        [2] fd21:356b:459a:cf20::11 (eth2)
      
      hostB:
        [a] fd21:356b:459a:cf30::2  (eth1)
        [b] fd21:356b:459a:cf40::2  (eth2)
      
      route from hostA to hostB:
        fd21:356b:459a:cf30::/64 dev eth1  metric 1024  mtu 1500
      
      The expected path should be:
        fd21:356b:459a:cf10::11 <-> fd21:356b:459a:cf30::2
      But addr[2] matches addr[a] more bits than addr[1] does, according to
      sctp_v6_addr_match_len. It causes the path to be:
        fd21:356b:459a:cf20::11 <-> fd21:356b:459a:cf30::2
      
      This patch is to fix it with the same way as Marcelo's fix for sctp ipv4.
      As no ip_dev_find for ipv6, this patch is to use ipv6_chk_addr to check
      if the saddr is in a dev instead.
      
      Note that for backwards compatibility, it will still do the addr_match_len
      check here when no optimal is found.
      Reported-by: NPatrick Talbert <ptalbert@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dbc2b5e9
  30. 10 3月, 2017 1 次提交
    • D
      net: Work around lockdep limitation in sockets that use sockets · cdfbabfb
      David Howells 提交于
      Lockdep issues a circular dependency warning when AFS issues an operation
      through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.
      
      The theory lockdep comes up with is as follows:
      
       (1) If the pagefault handler decides it needs to read pages from AFS, it
           calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
           creating a call requires the socket lock:
      
      	mmap_sem must be taken before sk_lock-AF_RXRPC
      
       (2) afs_open_socket() opens an AF_RXRPC socket and binds it.  rxrpc_bind()
           binds the underlying UDP socket whilst holding its socket lock.
           inet_bind() takes its own socket lock:
      
      	sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
      
       (3) Reading from a TCP socket into a userspace buffer might cause a fault
           and thus cause the kernel to take the mmap_sem, but the TCP socket is
           locked whilst doing this:
      
      	sk_lock-AF_INET must be taken before mmap_sem
      
      However, lockdep's theory is wrong in this instance because it deals only
      with lock classes and not individual locks.  The AF_INET lock in (2) isn't
      really equivalent to the AF_INET lock in (3) as the former deals with a
      socket entirely internal to the kernel that never sees userspace.  This is
      a limitation in the design of lockdep.
      
      Fix the general case by:
      
       (1) Double up all the locking keys used in sockets so that one set are
           used if the socket is created by userspace and the other set is used
           if the socket is created by the kernel.
      
       (2) Store the kern parameter passed to sk_alloc() in a variable in the
           sock struct (sk_kern_sock).  This informs sock_lock_init(),
           sock_init_data() and sk_clone_lock() as to the lock keys to be used.
      
           Note that the child created by sk_clone_lock() inherits the parent's
           kern setting.
      
       (3) Add a 'kern' parameter to ->accept() that is analogous to the one
           passed in to ->create() that distinguishes whether kernel_accept() or
           sys_accept4() was the caller and can be passed to sk_alloc().
      
           Note that a lot of accept functions merely dequeue an already
           allocated socket.  I haven't touched these as the new socket already
           exists before we get the parameter.
      
           Note also that there are a couple of places where I've made the accepted
           socket unconditionally kernel-based:
      
      	irda_accept()
      	rds_rcp_accept_one()
      	tcp_accept_from_sock()
      
           because they follow a sock_create_kern() and accept off of that.
      
      Whilst creating this, I noticed that lustre and ocfs don't create sockets
      through sock_create_kern() and thus they aren't marked as for-kernel,
      though they appear to be internal.  I wonder if these should do that so
      that they use the new set of lock keys.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cdfbabfb
  31. 27 1月, 2017 1 次提交
  32. 29 12月, 2016 1 次提交
  33. 25 12月, 2016 1 次提交
  34. 01 11月, 2016 1 次提交
    • X
      sctp: hold transport instead of assoc when lookup assoc in rx path · dae399d7
      Xin Long 提交于
      Prior to this patch, in rx path, before calling lock_sock, it needed to
      hold assoc when got it by __sctp_lookup_association, in case other place
      would free/put assoc.
      
      But in __sctp_lookup_association, it lookup and hold transport, then got
      assoc by transport->assoc, then hold assoc and put transport. It means
      it didn't hold transport, yet it was returned and later on directly
      assigned to chunk->transport.
      
      Without the protection of sock lock, the transport may be freed/put by
      other places, which would cause a use-after-free issue.
      
      This patch is to fix this issue by holding transport instead of assoc.
      As holding transport can make sure to access assoc is also safe, and
      actually it looks up assoc by searching transport rhashtable, to hold
      transport here makes more sense.
      
      Note that the function will be renamed later on on another patch.
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      dae399d7
  35. 26 7月, 2016 2 次提交
  36. 14 7月, 2016 1 次提交
    • M
      sctp: allow GSO frags to access the chunk too · 1f45f78f
      Marcelo Ricardo Leitner 提交于
      SCTP will try to access original IP headers on sctp_recvmsg in order to
      copy the addresses used. There are also other places that do similar access
      to IP or even SCTP headers. But after 90017acc ("sctp: Add GSO
      support") they aren't always there because they are only present in the
      header skb.
      
      SCTP handles the queueing of incoming data by cloning the incoming skb
      and limiting to only the relevant payload. This clone has its cb updated
      to something different and it's then queued on socket rx queue. Thus we
      need to fix this in two moments.
      
      For rx path, not related to socket queue yet, this patch uses a
      partially copied sctp_input_cb to such GSO frags. This restores the
      ability to access the headers for this part of the code.
      
      Regarding the socket rx queue, it removes iif member from sctp_event and
      also add a chunk pointer on it.
      
      With these changes we're always able to reach the headers again.
      
      The biggest change here is that now the sctp_chunk struct and the
      original skb are only freed after the application consumed the buffer.
      Note however that the original payload was already like this due to the
      skb cloning.
      
      For iif, SCTP's IPv4 code doesn't use it, so no change is necessary.
      IPv6 now can fetch it directly from original's IPv6 CB as the original
      skb is still accessible.
      
      In the future we probably can simplify sctp_v*_skb_iif() stuff, as
      sctp_v4_skb_iif() was called but it's return value not used, and now
      it's not even called, but such cleanup is out of scope for this change.
      
      Fixes: 90017acc ("sctp: Add GSO support")
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1f45f78f
  37. 28 4月, 2016 1 次提交
  38. 02 3月, 2016 1 次提交
    • X
      sctp: lack the check for ports in sctp_v6_cmp_addr · 40b4f0fd
      Xin Long 提交于
      As the member .cmp_addr of sctp_af_inet6, sctp_v6_cmp_addr should also check
      the port of addresses, just like sctp_v4_cmp_addr, cause it's invoked by
      sctp_cmp_addr_exact().
      
      Now sctp_v6_cmp_addr just check the port when two addresses have different
      family, and lack the port check for two ipv6 addresses. that will make
      sctp_hash_cmp() cannot work well.
      
      so fix it by adding ports comparison in sctp_v6_cmp_addr().
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      40b4f0fd