1. 27 6月, 2020 1 次提交
  2. 24 6月, 2020 2 次提交
  3. 23 6月, 2020 1 次提交
  4. 19 6月, 2020 2 次提交
  5. 16 6月, 2020 3 次提交
  6. 11 6月, 2020 2 次提交
  7. 09 6月, 2020 1 次提交
  8. 31 5月, 2020 4 次提交
    • P
      mptcp: fix NULL ptr dereference in MP_JOIN error path · 39884604
      Paolo Abeni 提交于
      When token lookup on MP_JOIN 3rd ack fails, the server
      socket closes with a reset the incoming child. Such socket
      has the 'is_mptcp' flag set, but no msk socket associated
      - due to the failed lookup.
      
      While crafting the reset packet mptcp_established_options_mp()
      will try to dereference the child's master socket, causing
      a NULL ptr dereference.
      
      This change addresses the issue with explicit fallback to
      TCP in such error path.
      
      Fixes: 729cd643 ("mptcp: cope better with MP_JOIN failure")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      39884604
    • P
      mptcp: remove msk from the token container at destruction time. · c5c79763
      Paolo Abeni 提交于
      Currently we remote the msk from the token container only
      via mptcp_close(). The MPTCP master socket can be destroyed
      also via other paths (e.g. if not yet accepted, when shutting
      down the listener socket). When we hit the latter scenario,
      dangling msk references are left into the token container,
      leading to memory corruption and/or UaF.
      
      This change addresses the issue by moving the token removal
      into the msk destructor.
      
      Fixes: 79c0949e ("mptcp: Add key generation and token tree")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c5c79763
    • P
      mptcp: fix race between MP_JOIN and close · 10f6d46c
      Paolo Abeni 提交于
      If a MP_JOIN subflow completes the 3whs while another
      CPU is closing the master msk, we can hit the
      following race:
      
      CPU1                                    CPU2
      
      close()
       mptcp_close
                                              subflow_syn_recv_sock
                                               mptcp_token_get_sock
                                               mptcp_finish_join
                                                inet_sk_state_load
        mptcp_token_destroy
        inet_sk_state_store(TCP_CLOSE)
        __mptcp_flush_join_list()
                                                mptcp_sock_graft
                                                list_add_tail
        sk_common_release
         sock_orphan()
       <socket free>
      
      The MP_JOIN socket will be leaked. Additionally we can hit
      UaF for the msk 'struct socket' referenced via the 'conn'
      field.
      
      This change try to address the issue introducing some
      synchronization between the MP_JOIN 3whs and mptcp_close
      via the join_list spinlock. If we detect the msk is closing
      the MP_JOIN socket is closed, too.
      
      Fixes: f296234c ("mptcp: Add handling of incoming MP_JOIN requests")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10f6d46c
    • P
      mptcp: fix unblocking connect() · 41be81a8
      Paolo Abeni 提交于
      Currently unblocking connect() on MPTCP sockets fails frequently.
      If mptcp_stream_connect() is invoked to complete a previously
      attempted unblocking connection, it will still try to create
      the first subflow via __mptcp_socket_create(). If the 3whs is
      completed and the 'can_ack' flag is already set, the latter
      will fail with -EINVAL.
      
      This change addresses the issue checking for pending connect and
      delegating the completion to the first subflow. Additionally
      do msk addresses and sk_state changes only when needed.
      
      Fixes: 2303f994 ("mptcp: Associate MPTCP context with TCP socket")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      41be81a8
  9. 27 5月, 2020 2 次提交
    • F
      mptcp: attempt coalescing when moving skbs to mptcp rx queue · 4e637c70
      Florian Westphal 提交于
      We can try to coalesce skbs we take from the subflows rx queue with the
      tail of the mptcp rx queue.
      
      If successful, the skb head can be discarded early.
      
      We can also free the skb extensions, we do not access them after this.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4e637c70
    • P
      mptcp: avoid NULL-ptr derefence on fallback · 0a82e230
      Paolo Abeni 提交于
      In the MPTCP receive path we must cope with TCP fallback
      on blocking recvmsg(). Currently in such code path we detect
      the fallback condition, but we don't fetch the struct socket
      required for fallback.
      
      The above allowed syzkaller to trigger a NULL pointer
      dereference:
      
      general protection fault, probably for non-canonical address 0xdffffc0000000004: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
      CPU: 1 PID: 7226 Comm: syz-executor523 Not tainted 5.7.0-rc6-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:sock_recvmsg_nosec net/socket.c:886 [inline]
      RIP: 0010:sock_recvmsg+0x92/0x110 net/socket.c:904
      Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 44 89 6c 24 04 e8 53 18 1d fb 4d 8d 6f 20 4c 89 e8 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df <80> 3c 08 00 74 08 4c 89 ef e8 20 12 5b fb bd a0 00 00 00 49 03 6d
      RSP: 0018:ffffc90001077b98 EFLAGS: 00010202
      RAX: 0000000000000004 RBX: ffffc90001077dc0 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: 0000000000000000 R08: ffffffff86565e59 R09: ffffed10115afeaa
      R10: ffffed10115afeaa R11: 0000000000000000 R12: 1ffff9200020efbc
      R13: 0000000000000020 R14: ffffc90001077de0 R15: 0000000000000000
      FS:  00007fc6a3abe700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004d0050 CR3: 00000000969f0000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       mptcp_recvmsg+0x18d5/0x19b0 net/mptcp/protocol.c:891
       inet_recvmsg+0xf6/0x1d0 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec net/socket.c:886 [inline]
       sock_recvmsg net/socket.c:904 [inline]
       __sys_recvfrom+0x2f3/0x470 net/socket.c:2057
       __do_sys_recvfrom net/socket.c:2075 [inline]
       __se_sys_recvfrom net/socket.c:2071 [inline]
       __x64_sys_recvfrom+0xda/0xf0 net/socket.c:2071
       do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      Address the issue initializing the struct socket reference
      before entering the fallback code.
      
      Reported-and-tested-by: syzbot+c6bfc3db991edc918432@syzkaller.appspotmail.com
      Suggested-by: NOndrej Mosnacek <omosnace@redhat.com>
      Fixes: 8ab183de ("mptcp: cope with later TCP fallback")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a82e230
  10. 23 5月, 2020 1 次提交
  11. 20 5月, 2020 1 次提交
  12. 19 5月, 2020 1 次提交
  13. 18 5月, 2020 7 次提交
  14. 17 5月, 2020 1 次提交
    • C
      mptcp: Use 32-bit DATA_ACK when possible · a0c1d0ea
      Christoph Paasch 提交于
      RFC8684 allows to send 32-bit DATA_ACKs as long as the peer is not
      sending 64-bit data-sequence numbers. The 64-bit DSN is only there for
      extreme scenarios when a very high throughput subflow is combined with a
      long-RTT subflow such that the high-throughput subflow wraps around the
      32-bit sequence number space within an RTT of the high-RTT subflow.
      
      It is thus a rare scenario and we should try to use the 32-bit DATA_ACK
      instead as long as possible. It allows to reduce the TCP-option overhead
      by 4 bytes, thus makes space for an additional SACK-block. It also makes
      tcpdumps much easier to read when the DSN and DATA_ACK are both either
      32 or 64-bit.
      Signed-off-by: NChristoph Paasch <cpaasch@apple.com>
      Reviewed-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a0c1d0ea
  15. 16 5月, 2020 2 次提交
    • P
      mptcp: cope better with MP_JOIN failure · 729cd643
      Paolo Abeni 提交于
      Currently, on MP_JOIN failure we reset the child
      socket, but leave the request socket untouched.
      
      tcp_check_req will deal with it according to the
      'tcp_abort_on_overflow' sysctl value - by default the
      req socket will stay alive.
      
      The above leads to inconsistent behavior on MP JOIN
      failure, and bad listener overflow accounting.
      
      This patch addresses the issue leveraging the infrastructure
      just introduced to ask the TCP stack to drop the req on
      failure.
      
      The child socket is not freed anymore by subflow_syn_recv_sock(),
      instead it's moved to a dead state and will be disposed by the
      next sock_put done by the TCP stack, so that listener overflow
      accounting is not affected by MP JOIN failure.
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NChristoph Paasch <cpaasch@apple.com>
      Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      729cd643
    • P
      mptcp: add new sock flag to deal with join subflows · 90bf4513
      Paolo Abeni 提交于
      MP_JOIN subflows must not land into the accept queue.
      Currently tcp_check_req() calls an mptcp specific helper
      to detect such scenario.
      
      Such helper leverages the subflow context to check for
      MP_JOIN subflows. We need to deal also with MP JOIN
      failures, even when the subflow context is not available
      due allocation failure.
      
      A possible solution would be changing the syn_recv_sock()
      signature to allow returning a more descriptive action/
      error code and deal with that in tcp_check_req().
      
      Since the above need is MPTCP specific, this patch instead
      uses a TCP request socket hole to add a MPTCP specific flag.
      Such flag is used by the MPTCP syn_recv_sock() to tell
      tcp_check_req() how to deal with the request socket.
      
      This change is a no-op for !MPTCP build, and makes the
      MPTCP code simpler. It allows also the next patch to deal
      correctly with MP JOIN failure.
      
      v1 -> v2:
       - be more conservative on drop_req initialization (Mat)
      
      RFC -> v1:
       - move the drop_req bit inside tcp_request_sock (Eric)
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: NMat Martineau <mathew.j.martineau@linux.intel.com>
      Reviewed-by: NChristoph Paasch <cpaasch@apple.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      90bf4513
  16. 13 5月, 2020 1 次提交
  17. 08 5月, 2020 2 次提交
    • E
      mptcp: use SHA256_BLOCK_SIZE, not SHA_MESSAGE_BYTES · ac0ad93d
      Eric Biggers 提交于
      In preparation for naming the SHA-1 stuff in <linux/cryptohash.h>
      properly and moving it to a more appropriate header, fix the HMAC-SHA256
      code in mptcp_crypto_hmac_sha() to use SHA256_BLOCK_SIZE instead of
      "SHA_MESSAGE_BYTES" which is actually the SHA-1 block size.
      (Fortunately these are both 64 bytes, so this wasn't a "real" bug...)
      
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: mptcp@lists.01.org
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Reviewed-by: NMatthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      ac0ad93d
    • P
      mptcp: set correct vfs info for subflows · 7d14b0d2
      Paolo Abeni 提交于
      When a subflow is created via mptcp_subflow_create_socket(),
      a new 'struct socket' is allocated, with a new i_ino value.
      
      When inspecting TCP sockets via the procfs and or the diag
      interface, the above ones are not related to the process owning
      the MPTCP master socket, even if they are a logical part of it
      ('ss -p' shows an empty process field)
      
      Additionally, subflows created by the path manager get
      the uid/gid from the running workqueue.
      
      Subflows are part of the owning MPTCP master socket, let's
      adjust the vfs info to reflect this.
      
      After this patch, 'ss' correctly displays subflows as belonging
      to the msk socket creator.
      
      Fixes: 2303f994 ("mptcp: Associate MPTCP context with TCP socket")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d14b0d2
  18. 01 5月, 2020 6 次提交
    • P
      mptcp: fix uninitialized value access · ac2b47fb
      Paolo Abeni 提交于
      tcp_v{4,6}_syn_recv_sock() set 'own_req' only when returning
      a not NULL 'child', let's check 'own_req' only if child is
      available to avoid an - unharmful - UBSAN splat.
      
      v1 -> v2:
       - reference the correct hash
      
      Fixes: 4c8941de ("mptcp: avoid flipping mp_capable field in syn_recv_sock()")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac2b47fb
    • P
      mptcp: initialize the data_fin field for mpc packets · a77895db
      Paolo Abeni 提交于
      When parsing MPC+data packets we set the dss field, so
      we must also initialize the data_fin, or we can find stray
      value there.
      
      Fixes: 9a19371b ("mptcp: fix data_fin handing in RX path")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a77895db
    • P
      mptcp: fix 'use_ack' option access. · 5a91e32b
      Paolo Abeni 提交于
      The mentioned RX option field is initialized only for DSS
      packet, we must access it only if 'dss' is set too, or
      the subflow will end-up in a bad status, leading to
      RFC violations.
      
      Fixes: d22f4988 ("mptcp: process MP_CAPABLE data option")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5a91e32b
    • P
      mptcp: avoid a WARN on bad input. · d6085fe1
      Paolo Abeni 提交于
      Syzcaller has found a way to trigger the WARN_ON_ONCE condition
      in check_fully_established().
      
      The root cause is a legit fallback to TCP scenario, so replace
      the WARN with a plain message on a more strict condition.
      
      Fixes: f296234c ("mptcp: Add handling of incoming MP_JOIN requests")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d6085fe1
    • P
      mptcp: move option parsing into mptcp_incoming_options() · cfde141e
      Paolo Abeni 提交于
      The mptcp_options_received structure carries several per
      packet flags (mp_capable, mp_join, etc.). Such fields must
      be cleared on each packet, even on dropped ones or packet
      not carrying any MPTCP options, but the current mptcp
      code clears them only on TCP option reset.
      
      On several races/corner cases we end-up with stray bits in
      incoming options, leading to WARN_ON splats. e.g.:
      
      [  171.164906] Bad mapping: ssn=32714 map_seq=1 map_data_len=32713
      [  171.165006] WARNING: CPU: 1 PID: 5026 at net/mptcp/subflow.c:533 warn_bad_map (linux-mptcp/net/mptcp/subflow.c:533 linux-mptcp/net/mptcp/subflow.c:531)
      [  171.167632] Modules linked in: ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel geneve ip6_udp_tunnel udp_tunnel macsec macvtap tap ipvlan macvlan 8021q garp mrp xfrm_interface veth netdevsim nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun binfmt_misc intel_rapl_msr intel_rapl_common rfkill kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev virtio_balloon pcspkr i2c_piix4 sunrpc ip_tables xfs libcrc32c crc32c_intel serio_raw virtio_console ata_generic virtio_blk virtio_net net_failover failover ata_piix libata
      [  171.199464] CPU: 1 PID: 5026 Comm: repro Not tainted 5.7.0-rc1.mptcp_f227fdf5d388+ #95
      [  171.200886] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
      [  171.202546] RIP: 0010:warn_bad_map (linux-mptcp/net/mptcp/subflow.c:533 linux-mptcp/net/mptcp/subflow.c:531)
      [  171.206537] Code: c1 ea 03 0f b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 1d 8b 55 3c 44 89 e6 48 c7 c7 20 51 13 95 e8 37 8b 22 fe <0f> 0b 48 83 c4 08 5b 5d 41 5c c3 89 4c 24 04 e8 db d6 94 fe 8b 4c
      [  171.220473] RSP: 0018:ffffc90000150560 EFLAGS: 00010282
      [  171.221639] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [  171.223108] RDX: 0000000000000000 RSI: 0000000000000008 RDI: fffff5200002a09e
      [  171.224388] RBP: ffff8880aa6e3c00 R08: 0000000000000001 R09: fffffbfff2ec9955
      [  171.225706] R10: ffffffff9764caa7 R11: fffffbfff2ec9954 R12: 0000000000007fca
      [  171.227211] R13: ffff8881066f4a7f R14: ffff8880aa6e3c00 R15: 0000000000000020
      [  171.228460] FS:  00007f8623719740(0000) GS:ffff88810be00000(0000) knlGS:0000000000000000
      [  171.230065] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  171.231303] CR2: 00007ffdab190a50 CR3: 00000001038ea006 CR4: 0000000000160ee0
      [  171.232586] Call Trace:
      [  171.233109]  <IRQ>
      [  171.233531] get_mapping_status (linux-mptcp/net/mptcp/subflow.c:691)
      [  171.234371] mptcp_subflow_data_available (linux-mptcp/net/mptcp/subflow.c:736 linux-mptcp/net/mptcp/subflow.c:832)
      [  171.238181] subflow_state_change (linux-mptcp/net/mptcp/subflow.c:1085 (discriminator 1))
      [  171.239066] tcp_fin (linux-mptcp/net/ipv4/tcp_input.c:4217)
      [  171.240123] tcp_data_queue (linux-mptcp/./include/linux/compiler.h:199 linux-mptcp/net/ipv4/tcp_input.c:4822)
      [  171.245083] tcp_rcv_established (linux-mptcp/./include/linux/skbuff.h:1785 linux-mptcp/./include/net/tcp.h:1774 linux-mptcp/./include/net/tcp.h:1847 linux-mptcp/net/ipv4/tcp_input.c:5238 linux-mptcp/net/ipv4/tcp_input.c:5730)
      [  171.254089] tcp_v4_rcv (linux-mptcp/./include/linux/spinlock.h:393 linux-mptcp/net/ipv4/tcp_ipv4.c:2009)
      [  171.258969] ip_protocol_deliver_rcu (linux-mptcp/net/ipv4/ip_input.c:204 (discriminator 1))
      [  171.260214] ip_local_deliver_finish (linux-mptcp/./include/linux/rcupdate.h:651 linux-mptcp/net/ipv4/ip_input.c:232)
      [  171.261389] ip_local_deliver (linux-mptcp/./include/linux/netfilter.h:307 linux-mptcp/./include/linux/netfilter.h:301 linux-mptcp/net/ipv4/ip_input.c:252)
      [  171.265884] ip_rcv (linux-mptcp/./include/linux/netfilter.h:307 linux-mptcp/./include/linux/netfilter.h:301 linux-mptcp/net/ipv4/ip_input.c:539)
      [  171.273666] process_backlog (linux-mptcp/./include/linux/rcupdate.h:651 linux-mptcp/net/core/dev.c:6135)
      [  171.275328] net_rx_action (linux-mptcp/net/core/dev.c:6572 linux-mptcp/net/core/dev.c:6640)
      [  171.280472] __do_softirq (linux-mptcp/./arch/x86/include/asm/jump_label.h:25 linux-mptcp/./include/linux/jump_label.h:200 linux-mptcp/./include/trace/events/irq.h:142 linux-mptcp/kernel/softirq.c:293)
      [  171.281379] do_softirq_own_stack (linux-mptcp/arch/x86/entry/entry_64.S:1083)
      [  171.282358]  </IRQ>
      
      We could address the issue clearing explicitly the relevant fields
      in several places - tcp_parse_option, tcp_fast_parse_options,
      possibly others.
      
      Instead we move the MPTCP option parsing into the already existing
      mptcp ingress hook, so that we need to clear the fields in a single
      place.
      
      This allows us dropping an MPTCP hook from the TCP code and
      removing the quite large mptcp_options_received from the tcp_sock
      struct. On the flip side, the MPTCP sockets will traverse the
      option space twice (in tcp_parse_option() and in
      mptcp_incoming_options(). That looks acceptable: we already
      do that for syn and 3rd ack packets, plain TCP socket will
      benefit from it, and even MPTCP sockets will experience better
      code locality, reducing the jumps between TCP and MPTCP code.
      
      v1 -> v2:
       - rebased on current '-net' tree
      
      Fixes: 648ef4b8 ("mptcp: Implement MPTCP receive path")
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cfde141e
    • P
      mptcp: consolidate synack processing. · 263e1201
      Paolo Abeni 提交于
      Currently the MPTCP code uses 2 hooks to process syn-ack
      packets, mptcp_rcv_synsent() and the sk_rx_dst_set()
      callback.
      
      We can drop the first, moving the relevant code into the
      latter, reducing the hooking into the TCP code. This is
      also needed by the next patch.
      
      v1 -> v2:
       - use local tcp sock ptr instead of casting the sk variable
         several times - DaveM
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      263e1201