  1. 04 Aug, 2020 1 commit
  2. 29 Jul, 2020 9 commits
  3. 28 Jul, 2020 1 commit
  4. 25 Jul, 2020 2 commits
  5. 24 Jul, 2020 4 commits
  6. 20 Jul, 2020 1 commit
  7. 08 Jul, 2020 1 commit
  8. 05 Jul, 2020 3 commits
  9. 02 Jul, 2020 1 commit
    • mptcp: add receive buffer auto-tuning · a6b118fe
      Committed by Florian Westphal
      When mptcp is used, userspace doesn't read from the tcp (subflow)
      socket but from the parent (mptcp) socket receive queue.
      
      skbs are moved from the subflow socket to the mptcp rx queue from one
      of three places: the 'data_ready' callback (if the mptcp socket can be
      locked), a work queue, or the socket receive function.
      
      This means tcp_rcv_space_adjust() is never called and thus no receive
      buffer size auto-tuning is done.
      
      An earlier (not merged) patch added tcp_rcv_space_adjust() calls to the
      function that moves skbs from subflow to mptcp socket.
      While this enabled autotuning, it also meant tuning was done even if
      userspace was reading the mptcp socket very slowly.
      
      This adds mptcp_rcv_space_adjust() and calls it after userspace has
      read data from the mptcp socket rx queue.
      
      It's very similar to tcp_rcv_space_adjust(), with two differences:
      
      1. The rtt estimate is the largest one observed on a subflow.
      2. The rcvbuf size and window clamp of all subflows are adjusted
         to the mptcp-level rcvbuf.
      
      Without the second change, we get spurious drops at the tcp (subflow)
      socket level if the skbs are not moved to the mptcp socket fast enough.
      
      Before:
      time mptcp_connect.sh -t -f $((4*1024*1024)) -d 300 -l 0.01% -r 0 -e "" -m mmap
      [..]
      ns4 MPTCP -> ns3 (10.0.3.2:10108      ) MPTCP   (duration 40823ms) [ OK ]
      ns4 MPTCP -> ns3 (10.0.3.2:10109      ) TCP     (duration 23119ms) [ OK ]
      ns4 TCP   -> ns3 (10.0.3.2:10110      ) MPTCP   (duration  5421ms) [ OK ]
      ns4 MPTCP -> ns3 (dead:beef:3::2:10111) MPTCP   (duration 41446ms) [ OK ]
      ns4 MPTCP -> ns3 (dead:beef:3::2:10112) TCP     (duration 23427ms) [ OK ]
      ns4 TCP   -> ns3 (dead:beef:3::2:10113) MPTCP   (duration  5426ms) [ OK ]
      Time: 1396 seconds
      
      After:
      ns4 MPTCP -> ns3 (10.0.3.2:10108      ) MPTCP   (duration  5417ms) [ OK ]
      ns4 MPTCP -> ns3 (10.0.3.2:10109      ) TCP     (duration  5427ms) [ OK ]
      ns4 TCP   -> ns3 (10.0.3.2:10110      ) MPTCP   (duration  5422ms) [ OK ]
      ns4 MPTCP -> ns3 (dead:beef:3::2:10111) MPTCP   (duration  5415ms) [ OK ]
      ns4 MPTCP -> ns3 (dead:beef:3::2:10112) TCP     (duration  5422ms) [ OK ]
      ns4 TCP   -> ns3 (dead:beef:3::2:10113) MPTCP   (duration  5423ms) [ OK ]
      Time: 296 seconds
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
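
The tuning logic described in a6b118fe can be illustrated with a small userspace model. This is a minimal sketch, not the kernel code: the struct layouts, `RCVBUF_MAX`, the "twice the consumed bytes" sizing rule, and the half-of-rcvbuf window clamp are simplifying assumptions made for illustration. It shows only the two differences the commit message calls out: measurements are paced by the largest subflow RTT, and a grown buffer size is propagated to every subflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SUBFLOWS 8
#define RCVBUF_MAX   (6u * 1024 * 1024)  /* hypothetical upper limit */

struct subflow {
    uint32_t rtt_us;        /* receive-side RTT estimate of this subflow */
    uint32_t rcvbuf;
    uint32_t window_clamp;
};

struct msk {                /* simplified stand-in for the mptcp socket */
    struct subflow sf[MAX_SUBFLOWS];
    size_t   nr_sf;
    uint32_t rcvbuf;
    uint32_t space;         /* bytes consumed in the previous interval */
    uint32_t copied;        /* bytes consumed in the current interval */
};

/* Difference 1: use the largest RTT observed on any subflow. */
static uint32_t msk_max_rtt_us(const struct msk *m)
{
    uint32_t rtt = 0;

    assert(m->nr_sf <= MAX_SUBFLOWS);
    for (size_t i = 0; i < m->nr_sf; i++)
        if (m->sf[i].rtt_us > rtt)
            rtt = m->sf[i].rtt_us;
    return rtt;
}

/*
 * Call after userspace has read 'copied' bytes from the mptcp socket.
 * 'elapsed_us' is the time since the current measurement interval began;
 * the caller restarts its timer whenever an interval ends, i.e. whenever
 * m->copied is reset to 0. Returns true if the buffers were grown.
 */
static bool msk_rcv_space_adjust(struct msk *m, uint32_t copied,
                                 uint32_t elapsed_us)
{
    bool grew = false;

    m->copied += copied;
    if (elapsed_us < msk_max_rtt_us(m))
        return false;       /* interval too short, keep accumulating */

    if (m->copied > m->space) {
        uint64_t want = 2ull * m->copied;   /* assumed sizing rule */

        if (want > RCVBUF_MAX)
            want = RCVBUF_MAX;
        if (want > m->rcvbuf) {
            m->rcvbuf = (uint32_t)want;
            /* Difference 2: every subflow follows the mptcp-level value. */
            for (size_t i = 0; i < m->nr_sf; i++) {
                m->sf[i].rcvbuf = m->rcvbuf;
                m->sf[i].window_clamp = m->rcvbuf / 2;  /* assumption */
            }
            grew = true;
        }
        m->space = m->copied;
    }
    m->copied = 0;          /* start a new measurement interval */
    return grew;
}
```

Because the function is driven by bytes that userspace actually consumed, a slow reader never accumulates enough in one interval to trigger growth, which is the behaviour the commit message contrasts with the earlier, unmerged approach.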
  10. 30 Jun, 2020 5 commits
  11. 27 Jun, 2020 2 commits
  12. 11 Jun, 2020 1 commit
  13. 31 May, 2020 3 commits
    • mptcp: remove msk from the token container at destruction time. · c5c79763
      Committed by Paolo Abeni
      Currently we remove the msk from the token container only
      via mptcp_close(). The MPTCP master socket can also be destroyed
      via other paths (e.g. if not yet accepted, when shutting
      down the listener socket). When we hit the latter scenario,
      dangling msk references are left in the token container,
      leading to memory corruption and/or use-after-free (UaF).
      
      This change addresses the issue by moving the token removal
      into the msk destructor.
      
      Fixes: 79c0949e ("mptcp: Add key generation and token tree")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
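
The idea behind c5c79763 (unregister in the destructor so that every teardown path cleans up) can be shown with a toy token table. This is a minimal userspace sketch under stated assumptions: the fixed-size table, `token_insert()`, `token_lookup()`, `token_remove()`, `msk_create()`, `msk_destroy()`, `msk_close()` and `listener_shutdown()` are hypothetical names for illustration and do not mirror the kernel's token container.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOKEN_SLOTS 16

struct msk {
    uint32_t token;
};

/* Toy token container: one slot per (token % TOKEN_SLOTS), no chaining. */
static struct msk *token_table[TOKEN_SLOTS];

static int token_insert(struct msk *m)
{
    size_t slot = m->token % TOKEN_SLOTS;

    if (token_table[slot])
        return -1;                  /* slot already taken */
    token_table[slot] = m;
    return 0;
}

static struct msk *token_lookup(uint32_t token)
{
    struct msk *m = token_table[token % TOKEN_SLOTS];

    return (m && m->token == token) ? m : NULL;
}

static void token_remove(struct msk *m)
{
    size_t slot = m->token % TOKEN_SLOTS;

    if (token_table[slot] == m)
        token_table[slot] = NULL;
}

static struct msk *msk_create(uint32_t token)
{
    struct msk *m = malloc(sizeof(*m));

    if (!m)
        return NULL;
    m->token = token;
    if (token_insert(m) < 0) {
        free(m);
        return NULL;
    }
    return m;
}

/* Single destructor: the only place where the token is dropped. */
static void msk_destroy(struct msk *m)
{
    assert(m != NULL);
    token_remove(m);
    free(m);
}

/* Both teardown paths funnel into the destructor. */
static void msk_close(struct msk *m)
{
    msk_destroy(m);
}

static void listener_shutdown(struct msk *unaccepted)
{
    msk_destroy(unaccepted);
}
```

With removal tied to destruction, a lookup can never return a pointer to freed memory, regardless of which path tore the socket down.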
    • mptcp: fix race between MP_JOIN and close · 10f6d46c
      Committed by Paolo Abeni
      If an MP_JOIN subflow completes the three-way handshake (3whs)
      while another CPU is closing the master msk, we can hit the
      following race:
      
      CPU1                                    CPU2
      
      close()
       mptcp_close
                                              subflow_syn_recv_sock
                                               mptcp_token_get_sock
                                               mptcp_finish_join
                                                inet_sk_state_load
        mptcp_token_destroy
        inet_sk_state_store(TCP_CLOSE)
        __mptcp_flush_join_list()
                                                mptcp_sock_graft
                                                list_add_tail
        sk_common_release
         sock_orphan()
       <socket free>
      
      The MP_JOIN socket will be leaked. Additionally, we can hit a
      use-after-free (UaF) on the msk 'struct socket' referenced via
      the 'conn' field.
      
      This change tries to address the issue by introducing some
      synchronization between the MP_JOIN 3whs and mptcp_close()
      via the join_list spinlock. If we detect that the msk is closing,
      the MP_JOIN socket is closed, too.
      
      Fixes: f296234c ("mptcp: Add handling of incoming MP_JOIN requests")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
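
The synchronization described in 10f6d46c follows a common pattern: check the parent's state and publish the new child under the same lock that the closing path takes. The following is a minimal userspace sketch, assuming a pthread mutex in place of the join_list spinlock and a counter in place of the join list; the names `msk_finish_join()` and `msk_close()` and the struct layout are illustrative, not the kernel's.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum msk_state { MSK_ESTABLISHED, MSK_CLOSED };

struct msk {
    pthread_mutex_t join_lock;  /* stands in for the join_list spinlock */
    enum msk_state  state;
    int             nr_joined;  /* stands in for the join list itself */
};

/*
 * Called when an MP_JOIN handshake completes. Returns false if the msk
 * is already closing; the caller must then close the MP_JOIN subflow
 * instead of attaching it.
 */
static bool msk_finish_join(struct msk *m)
{
    bool ok;

    assert(m != NULL);
    pthread_mutex_lock(&m->join_lock);
    ok = (m->state == MSK_ESTABLISHED);
    if (ok)
        m->nr_joined++;         /* publish only while still open */
    pthread_mutex_unlock(&m->join_lock);
    return ok;
}

/*
 * Close path: flip the state and flush pending joins under the same
 * lock, so no join can slip in between the two steps. Returns the
 * number of pending joins that were flushed.
 */
static int msk_close(struct msk *m)
{
    int flushed;

    assert(m != NULL);
    pthread_mutex_lock(&m->join_lock);
    m->state = MSK_CLOSED;
    flushed = m->nr_joined;
    m->nr_joined = 0;
    pthread_mutex_unlock(&m->join_lock);
    return flushed;
}
```

Because the state check and the list insertion happen atomically with respect to the close path, a join either lands before the flush (and is cleaned up by it) or observes the closed state and is rejected.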
    • mptcp: fix unblocking connect() · 41be81a8
      Committed by Paolo Abeni
      Currently, non-blocking connect() on MPTCP sockets fails frequently.
      If mptcp_stream_connect() is invoked to complete a previously
      attempted non-blocking connection, it will still try to create
      the first subflow via __mptcp_socket_create(). If the 3whs is
      completed and the 'can_ack' flag is already set, the latter
      will fail with -EINVAL.
      
      This change addresses the issue by checking for a pending connect
      and delegating the completion to the first subflow. Additionally,
      msk address and sk_state changes are done only when needed.
      
      Fixes: 2303f994 ("mptcp: Associate MPTCP context with TCP socket")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
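
From userspace, the case fixed by 41be81a8 is an ordinary non-blocking connect on an MPTCP socket. The following is a minimal sketch, assuming a Linux system: `mptcp_socket()` and `nonblocking_connect()` are hypothetical helper names, and the fallback definition of `IPPROTO_MPTCP` (262) only applies when the libc headers do not provide it. Note that the commit talks about completing the connection with a second connect() call; this sketch instead uses the common poll() plus SO_ERROR idiom, which is not taken from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262       /* value used by Linux */
#endif

/* Create an MPTCP stream socket; returns -1 if the kernel lacks MPTCP. */
static int mptcp_socket(int family)
{
    return socket(family, SOCK_STREAM, IPPROTO_MPTCP);
}

/*
 * Start a non-blocking connect and wait up to timeout_ms for completion.
 * Returns 0 on success, -1 with errno set on failure or timeout.
 */
static int nonblocking_connect(int fd, const struct sockaddr *addr,
                               socklen_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    socklen_t elen;
    int flags, err = 0, n;

    assert(addr != NULL);

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    if (connect(fd, addr, len) == 0)
        return 0;               /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;              /* hard failure */

    /* The socket becomes writable once the handshake finishes or fails. */
    n = poll(&pfd, 1, timeout_ms);
    if (n == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    if (n < 0)
        return -1;

    /* SO_ERROR tells success (0) from a failed handshake. */
    elen = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &elen) < 0)
        return -1;
    if (err) {
        errno = err;
        return -1;
    }
    return 0;
}
```

A caller would typically obtain the descriptor from `mptcp_socket(AF_INET)` and fall back to a plain TCP socket if that call fails, since MPTCP support depends on the running kernel.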
  14. 27 May, 2020 2 commits
    • mptcp: attempt coalescing when moving skbs to mptcp rx queue · 4e637c70
      Committed by Florian Westphal
      We can try to coalesce skbs we take from the subflow's rx queue with
      the tail of the mptcp rx queue.
      
      If successful, the skb head can be discarded early.
      
      We can also free the skb extensions, since we do not access them
      after this.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
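
The coalescing step in 4e637c70 can be pictured as "append to the tail buffer when the new data is contiguous and fits, otherwise queue a new buffer". The following is a minimal userspace sketch with fixed-size segments; `struct seg`, `struct rxq`, `try_coalesce()` and `rxq_enqueue()` are hypothetical names, and real skbs carry far more state than is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEG_CAP   64            /* payload capacity of one segment */
#define QUEUE_CAP 8             /* segments the queue can hold */

struct seg {
    uint64_t      seq;          /* sequence number of the first byte */
    size_t        len;
    unsigned char data[SEG_CAP];
};

struct rxq {
    struct seg segs[QUEUE_CAP];
    size_t     count;
};

/* Merge 'from' into 'tail' if it is contiguous and there is room. */
static bool try_coalesce(struct seg *tail, const struct seg *from)
{
    assert(tail->len <= SEG_CAP && from->len <= SEG_CAP);

    if (tail->seq + tail->len != from->seq)
        return false;           /* not adjacent in sequence space */
    if (from->len > SEG_CAP - tail->len)
        return false;           /* would overflow the tail segment */

    memcpy(tail->data + tail->len, from->data, from->len);
    tail->len += from->len;
    return true;
}

/*
 * Returns 1 if 'from' was coalesced into the tail (the caller can drop
 * its own buffer early), 0 if it was queued as a new segment, and -1 if
 * the queue is full.
 */
static int rxq_enqueue(struct rxq *q, const struct seg *from)
{
    if (q->count > 0 && try_coalesce(&q->segs[q->count - 1], from))
        return 1;
    if (q->count == QUEUE_CAP)
        return -1;
    q->segs[q->count++] = *from;
    return 0;
}
```

A return value of 1 corresponds to the case in the commit message where the source buffer's head can be discarded early, because its payload now lives in the tail of the receive queue.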
    • mptcp: avoid NULL-ptr derefence on fallback · 0a82e230
      Committed by Paolo Abeni
      In the MPTCP receive path we must cope with TCP fallback
      on blocking recvmsg(). Currently, in that code path, we detect
      the fallback condition but don't fetch the struct socket
      required for fallback.
      
      This allowed syzkaller to trigger a NULL pointer
      dereference:
      
      general protection fault, probably for non-canonical address 0xdffffc0000000004: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
      CPU: 1 PID: 7226 Comm: syz-executor523 Not tainted 5.7.0-rc6-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:sock_recvmsg_nosec net/socket.c:886 [inline]
      RIP: 0010:sock_recvmsg+0x92/0x110 net/socket.c:904
      Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 44 89 6c 24 04 e8 53 18 1d fb 4d 8d 6f 20 4c 89 e8 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df <80> 3c 08 00 74 08 4c 89 ef e8 20 12 5b fb bd a0 00 00 00 49 03 6d
      RSP: 0018:ffffc90001077b98 EFLAGS: 00010202
      RAX: 0000000000000004 RBX: ffffc90001077dc0 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: 0000000000000000 R08: ffffffff86565e59 R09: ffffed10115afeaa
      R10: ffffed10115afeaa R11: 0000000000000000 R12: 1ffff9200020efbc
      R13: 0000000000000020 R14: ffffc90001077de0 R15: 0000000000000000
      FS:  00007fc6a3abe700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004d0050 CR3: 00000000969f0000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       mptcp_recvmsg+0x18d5/0x19b0 net/mptcp/protocol.c:891
       inet_recvmsg+0xf6/0x1d0 net/ipv4/af_inet.c:838
       sock_recvmsg_nosec net/socket.c:886 [inline]
       sock_recvmsg net/socket.c:904 [inline]
       __sys_recvfrom+0x2f3/0x470 net/socket.c:2057
       __do_sys_recvfrom net/socket.c:2075 [inline]
       __se_sys_recvfrom net/socket.c:2071 [inline]
       __x64_sys_recvfrom+0xda/0xf0 net/socket.c:2071
       do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:295
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      
      Address the issue by initializing the struct socket reference
      before entering the fallback code.
      
      Reported-and-tested-by: syzbot+c6bfc3db991edc918432@syzkaller.appspotmail.com
      Suggested-by: Ondrej Mosnacek <omosnace@redhat.com>
      Fixes: 8ab183de ("mptcp: cope with later TCP fallback")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 19 May, 2020 1 commit
  16. 18 May, 2020 3 commits