1. 23 Jun, 2022 1 commit
  2. 20 Jun, 2022 1 commit
    • net/tls: fix tls_sk_proto_close executed repeatedly · 69135c57
      Authored by Ziyang Xuan
      After kTLS has been set up on a socket, a call to tls_update() sets
      ctx->sk_proto to sock->sk_prot, so ctx->sk_proto->close becomes
      tls_sk_proto_close(). When the socket is closed, tls_sk_proto_close()
      is called because sock->sk_prot->close is tls_sk_proto_close(). That
      function later calls ctx->sk_proto->close(), which is again
      tls_sk_proto_close(), so it is executed repeatedly. This triggers the
      following bug.
      
      =================================================================
      KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
      RIP: 0010:tls_sk_proto_close+0xd8/0xaf0 net/tls/tls_main.c:306
      Call Trace:
       <TASK>
       tls_sk_proto_close+0x356/0xaf0 net/tls/tls_main.c:329
       inet_release+0x12e/0x280 net/ipv4/af_inet.c:428
       __sock_release+0xcd/0x280 net/socket.c:650
       sock_close+0x18/0x20 net/socket.c:1365
      
      Updating to a proto that is the same as sock->sk_prot is incorrect. Add
      an equality check between proto and sock->sk_prot at the head of
      tls_update() to fix it.
      
      Fixes: 95fa1454 ("bpf: sockmap/tls, close can race with map free")
      Reported-by: syzbot+29c3c12f3214b85ad081@syzkaller.appspotmail.com
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
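The recursion and the guard described in this commit can be modeled in user space. The following is a minimal sketch, not the kernel code: `struct proto`, `struct tls_context`, and `struct sock` are reduced to the fields needed here, and `tls_update_model()`, `tls_close_model()`, and the `base_close_calls` counter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct sock;

/* Simplified stand-in for the kernel's struct proto: only the close hook. */
struct proto {
    void (*close)(struct sock *sk);
};

/* Simplified stand-in for struct tls_context. */
struct tls_context {
    const struct proto *sk_proto;   /* the proto that TLS wraps */
};

struct sock {
    const struct proto *sk_prot;    /* currently installed proto */
    struct tls_context *ctx;        /* NULL when no TLS context is attached */
    int base_close_calls;           /* instrumentation for this sketch only */
};

static void base_close(struct sock *sk)
{
    sk->base_close_calls++;
}

/* Models tls_sk_proto_close(): TLS teardown, then the wrapped close. */
static void tls_close_model(struct sock *sk)
{
    sk->ctx->sk_proto->close(sk);
}

static const struct proto base_proto = { .close = base_close };
static const struct proto tls_proto  = { .close = tls_close_model };

/* Models tls_update() with the equality check at its head. */
static void tls_update_model(struct sock *sk, const struct proto *p)
{
    if (sk->sk_prot == p)
        return;                     /* would make the TLS proto wrap itself */
    if (sk->ctx)
        sk->ctx->sk_proto = p;
    else
        sk->sk_prot = p;
}
```

Without the early return, passing `&tls_proto` to `tls_update_model()` on a socket that already uses it would leave `ctx->sk_proto->close` pointing at `tls_close_model()`, which would then call itself.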
  3. 10 Jun, 2022 1 commit
  4. 19 May, 2022 1 commit
    • tls: Add opt-in zerocopy mode of sendfile() · c1318b39
      Authored by Boris Pismenny
      TLS device offload copies sendfile data to a bounce buffer before
      transmitting. This keeps the MAC on TLS records valid when the file
      contents change and part of a TLS record has to be retransmitted at
      the TCP level.
      
      In many common use cases (like serving static files over HTTPS) the file
      contents are not changed on the fly. In many use cases breaking the
      connection is totally acceptable if the file is changed during
      transmission, because it would be received corrupted in any case.
      
      This commit optimizes performance for such use cases by providing a
      new optional mode of TLS sendfile(), in which the extra copy is
      skipped. Removing this copy improves performance significantly, as
      TLS and TCP sendfile then perform the same operations, and the only
      overhead is TLS header/trailer insertion.
      
      The new mode can only be enabled with the new socket option named
      TLS_TX_ZEROCOPY_SENDFILE, on a per-socket basis. This preserves
      backwards compatibility with existing applications that rely on the
      copying behavior.
      
      The new mode is safe, meaning that unsolicited modifications of the file
      being sent can't break integrity of the kernel. The worst thing that can
      happen is sending a corrupted TLS record, which is in any case not
      forbidden when using regular TCP sockets.
      
      Sockets other than TLS device offload are not affected by the new socket
      option. The actual status of zerocopy sendfile can be queried with
      sock_diag.
      
      Performance numbers in a single-core test with 24 HTTPS streams on
      nginx, under 100% CPU load:
      
      * non-zerocopy: 33.6 Gbit/s
      * zerocopy: 79.92 Gbit/s
      
      CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
      Signed-off-by: Boris Pismenny <borisp@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Link: https://lore.kernel.org/r/20220518092731.1243494-1-maximmi@nvidia.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
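An application would opt in to the mode described above with setsockopt(). The sketch below is a hedged illustration: the helper name `enable_tls_zerocopy_sendfile()` is hypothetical, and the fallback values for `SOL_TLS` (282) and `TLS_TX_ZEROCOPY_SENDFILE` (3) are assumptions for systems whose headers do not define them. The option name may differ in released kernel headers, so check `linux/tls.h` on the target system.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Fallbacks for headers that lack these constants. Both numeric values
 * are assumptions of this sketch; prefer the definitions from the
 * installed kernel headers when they are available. */
#ifndef SOL_TLS
#define SOL_TLS 282
#endif
#ifndef TLS_TX_ZEROCOPY_SENDFILE
#define TLS_TX_ZEROCOPY_SENDFILE 3
#endif

/* Request zerocopy sendfile on a socket that already has the TLS ULP
 * and a TX key installed. Returns 0 on success, -1 with errno set. */
static int enable_tls_zerocopy_sendfile(int fd)
{
    int one = 1;

    return setsockopt(fd, SOL_TLS, TLS_TX_ZEROCOPY_SENDFILE,
                      &one, sizeof(one));
}
```

Because the commit states that sockets without TLS device offload are unaffected, a successful call does not by itself prove that the copy is skipped; the effective status is reported through sock_diag.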
  5. 22 Mar, 2022 1 commit
  6. 26 Nov, 2021 1 commit
    • tls: fix replacing proto_ops · f3911f73
      Authored by Jakub Kicinski
      We replace proto_ops whenever TLS is configured for RX. But our
      replacement also overrides sendpage_locked, which will crash
      unless TX is also configured. Similarly, we plug both of those
      in for TLS_HW (NIC crypto offload) even though TLS_HW has a completely
      different implementation for TX.
      
      Last but not least we always plug in something based on inet_stream_ops
      even though a few of the callbacks differ for IPv6 (getname, release,
      bind).
      
      Use a callback building method similar to what we do for struct proto.
      
      Fixes: c46234eb ("tls: RX path for ktls")
      Fixes: d4ffb02d ("net/tls: enable sk_msg redirect to tls socket egress")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
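The "callback building method" can be pictured as a table of operations indexed by TX and RX configuration and built once per address family from that family's base ops. The following is a simplified model under stated assumptions: `struct ops_model` and `build_ops_model()` are hypothetical, callbacks are represented by name strings rather than function pointers, and only the software configuration installs overrides here.

```c
#include <assert.h>
#include <string.h>

/* Configuration indices; the names mirror the kernel's, but this is a model. */
enum { TLS_BASE, TLS_SW, TLS_HW, TLS_NUM_CONFIG };

/* Stand-in for struct proto_ops: each callback is represented by a name. */
struct ops_model {
    const char *getname;          /* differs between IPv4 and IPv6 */
    const char *splice_read;      /* RX-side callback */
    const char *sendpage_locked;  /* TX-side callback */
};

/* Fill ops[tx][rx] from the family's base ops, then override only the
 * callbacks that the configured direction actually implements. */
static void build_ops_model(struct ops_model ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
                            const struct ops_model *base)
{
    for (int tx = 0; tx < TLS_NUM_CONFIG; tx++) {
        for (int rx = 0; rx < TLS_NUM_CONFIG; rx++) {
            ops[tx][rx] = *base;
            if (rx == TLS_SW)
                ops[tx][rx].splice_read = "tls_sw_splice_read";
            if (tx == TLS_SW)
                ops[tx][rx].sendpage_locked = "tls_sw_sendpage_locked";
        }
    }
}
```

In this model, configuring only RX leaves `sendpage_locked` at the base implementation, and callbacks that differ for IPv6, such as `getname`, are inherited from whichever base ops the table was built from.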
  7. 27 Oct, 2021 1 commit
  8. 25 Oct, 2021 1 commit
  9. 16 Sep, 2021 1 commit
  10. 02 Jun, 2021 1 commit
    • net/tls: Fix use-after-free after the TLS device goes down and up · c55dcdd4
      Authored by Maxim Mikityanskiy
      When a netdev with active TLS offload goes down, tls_device_down is
      called to stop the offload and tear down the TLS context. However, the
      socket stays alive, and it still points to the TLS context, which is now
      deallocated. If the netdev comes back up while the connection is still
      active, and the data flow resumes after a number of TCP retransmissions,
      the result is a use-after-free of the TLS context.
      
      This commit addresses this bug by keeping the context alive until its
      normal destruction, and implements the necessary fallbacks, so that the
      connection can resume in software (non-offloaded) kTLS mode.
      
      On the TX side tls_sw_fallback is used to encrypt all packets. The RX
      side already has all the necessary fallbacks, because receiving
      non-decrypted packets is supported. The thing needed on the RX side is
      to block resync requests, which are normally produced after receiving
      non-decrypted packets.
      
      The necessary synchronization is implemented for a graceful teardown:
      first the fallbacks are deployed, then the driver resources are released
      (it used to be possible to have a tls_dev_resync after tls_dev_del).
      
      A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
      mode. It's used to skip the RX resync logic completely, as it becomes
      useless, and some objects may be released (for example, resync_async,
      which is allocated and freed by the driver).
      
      Fixes: e8f69799 ("net/tls: Add generic NIC offload infrastructure")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
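The role of the degraded flag can be sketched as a bit that the device-down path sets and the RX path tests before issuing a resync request. This is an illustrative model only: `struct tls_ctx_model`, `device_down_model()`, `rx_undecrypted_record_model()`, and the bit position are hypothetical, and the kernel uses its own bit operations and context layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit position chosen for this sketch; not taken from the kernel. */
#define RX_DEV_DEGRADED_BIT 0

struct tls_ctx_model {
    unsigned long flags;      /* holds the degraded bit */
    int resync_requests;      /* counts resync requests sent to the driver */
};

/* Device went down: the context stays alive, but RX falls back to software. */
static void device_down_model(struct tls_ctx_model *ctx)
{
    ctx->flags |= 1UL << RX_DEV_DEGRADED_BIT;
}

/* Called when a record arrives that the device did not decrypt.
 * Returns true if a resync request was issued. */
static bool rx_undecrypted_record_model(struct tls_ctx_model *ctx)
{
    if (ctx->flags & (1UL << RX_DEV_DEGRADED_BIT))
        return false;         /* driver state may be gone; skip resync */
    ctx->resync_requests++;
    return true;
}
```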
  11. 28 Nov, 2020 1 commit
  12. 14 Oct, 2020 1 commit
  13. 02 Sep, 2020 1 commit
  14. 29 Jul, 2020 1 commit
  15. 25 Jul, 2020 1 commit
  16. 16 Apr, 2020 1 commit
    • net: tls: Avoid assigning 'const' pointer to non-const pointer · 9a893949
      Authored by Will Deacon
      tls_build_proto() uses WRITE_ONCE() to assign a 'const' pointer to a
      'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
      that this will give rise to a compiler warning, just like a plain old
      assignment would do:
      
        | net/tls/tls_main.c: In function ‘tls_build_proto’:
        | ./include/linux/compiler.h:229:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
        | net/tls/tls_main.c:640:4: note: in expansion of macro ‘smp_store_release’
        |   640 |    smp_store_release(&saved_tcpv6_prot, prot);
        |       |    ^~~~~~~~~~~~~~~~~
      
      Drop the const qualifier from the local 'prot' variable, as it isn't
      needed.
      
      Cc: Boris Pismenny <borisp@mellanox.com>
      Cc: Aviad Yehezkel <aviadye@mellanox.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Will Deacon <will@kernel.org>
  17. 09 Apr, 2020 1 commit
    • net/tls: fix const assignment warning · f691a25c
      Authored by Arnd Bergmann
      Building with some experimental patches, I came across a warning
      in the tls code:
      
      include/linux/compiler.h:215:30: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
        215 |  *(volatile typeof(x) *)&(x) = (val);  \
            |                              ^
      net/tls/tls_main.c:650:4: note: in expansion of macro 'smp_store_release'
        650 |    smp_store_release(&saved_tcpv4_prot, prot);
      
      This appears to be a legitimate warning about assigning a const pointer
      into the non-const 'saved_tcpv4_prot' global. Annotate both the ipv4 and
      ipv6 pointers 'const' to make the code internally consistent.
      
      Fixes: 5bb4c45d ("net/tls: Read sk_prot once when building tls proto ops")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
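The warning in both of these commits comes from storing a pointer-to-const in a pointer-to-non-const variable. The sketch below shows the shape of the fix that annotates the saved pointer as const; `struct proto_model`, `saved_prot_model`, and `save_prot_model()` are hypothetical stand-ins, and a plain assignment replaces `smp_store_release()`.

```c
#include <assert.h>

struct proto_model {
    int id;
};

/* The saved global is itself a pointer to const, so storing a
 * 'const struct proto_model *' into it discards no qualifier. */
static const struct proto_model *saved_prot_model;

static void save_prot_model(const struct proto_model *prot)
{
    /* If the global were declared 'struct proto_model *', this
     * assignment would trigger -Wdiscarded-qualifiers. */
    saved_prot_model = prot;
}
```

The alternative taken in 9a893949 goes the other way and drops `const` from the local variable, so that both sides of the assignment are non-const.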
  18. 22 Mar, 2020 3 commits
  19. 22 Feb, 2020 1 commit
    • net, sk_msg: Annotate lockless access to sk_prot on clone · b8e202d1
      Authored by Jakub Sitnicki
      sk_msg and ULP frameworks override protocol callbacks pointer in
      sk->sk_prot, while tcp accesses it locklessly when cloning the listening
      socket, that is with neither sk_lock nor sk_callback_lock held.
      
      Once we enable use of listening sockets with sockmap (and hence sk_msg),
      there will be shared access to sk->sk_prot if socket is getting cloned
      while being inserted/deleted to/from the sockmap from another CPU:
      
      Read side:
      
      tcp_v4_rcv
        sk = __inet_lookup_skb(...)
        tcp_check_req(sk)
          inet_csk(sk)->icsk_af_ops->syn_recv_sock
            tcp_v4_syn_recv_sock
              tcp_create_openreq_child
                inet_csk_clone_lock
                  sk_clone_lock
                    READ_ONCE(sk->sk_prot)
      
      Write side:
      
      sock_map_ops->map_update_elem
        sock_map_update_elem
          sock_map_update_common
            sock_map_link_no_progs
              tcp_bpf_init
                tcp_bpf_update_sk_prot
                  sk_psock_update_proto
                    WRITE_ONCE(sk->sk_prot, ops)
      
      sock_map_ops->map_delete_elem
        sock_map_delete_elem
          __sock_map_delete
           sock_map_unref
             sk_psock_put
               sk_psock_drop
                 sk_psock_restore_proto
                   tcp_update_ulp
                     WRITE_ONCE(sk->sk_prot, proto)
      
      Mark the shared access with READ_ONCE/WRITE_ONCE annotations.

      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20200218171023.844439-2-jakub@cloudflare.com
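The annotation pattern can be illustrated in user space with simplified macros. This is a sketch under assumptions: the `READ_ONCE`/`WRITE_ONCE` definitions below are reduced volatile-cast versions that rely on the GCC/Clang `__typeof__` extension, not the kernel's implementations, and `struct sock_model`, `struct proto_model`, and `sock_clone_model()` are hypothetical.

```c
#include <assert.h>

/* Simplified user-space versions: force exactly one access through a
 * volatile-qualified lvalue so the compiler cannot tear or repeat it. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct proto_model {
    int id;
};

struct sock_model {
    const struct proto_model *sk_prot;  /* may be swapped by another CPU */
};

/* Read side: load sk_prot once and use only the local copy afterwards,
 * so the clone sees one consistent value even if a writer races. */
static void sock_clone_model(struct sock_model *sk, struct sock_model *newsk)
{
    const struct proto_model *prot = READ_ONCE(sk->sk_prot);

    newsk->sk_prot = prot;
}

/* Write side: publish the replacement proto with a single store. */
static void update_proto_model(struct sock_model *sk,
                               const struct proto_model *ops)
{
    WRITE_ONCE(sk->sk_prot, ops);
}
```

These macros only constrain how the compiler emits the access; they do not provide ordering or mutual exclusion.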
  20. 16 Jan, 2020 1 commit
  21. 07 Dec, 2019 1 commit
  22. 29 Nov, 2019 1 commit
  23. 20 Nov, 2019 1 commit
  24. 07 Nov, 2019 1 commit
    • net/tls: add a TX lock · 79ffe608
      Authored by Jakub Kicinski
      TLS TX needs to release and re-acquire the socket lock if send buffer
      fills up.
      
      TLS SW TX path currently depends on only allowing one thread to enter
      the function by the abuse of sk_write_pending. If another writer is
      already waiting for memory no new ones are allowed in.
      
      This has two problems:
       - writers don't wake other threads up when they leave the kernel;
         meaning that this scheme works for a single extra thread (second
         application thread or delayed work) because memory becoming
         available will send a wake up request, but as Mallesham and
         Pooja report, with a larger number of threads it leads to threads
         being put to sleep indefinitely;
       - the delayed work does not get _scheduled_ but it may _run_ when
         other writers are present leading to crashes as writers don't
         expect state to change under their feet (same records get pushed
         and freed multiple times); it's hard to reliably bail from the
         work, however, because the mere presence of a writer does not
         guarantee that the writer will push pending records before exiting.
      
      Ensuring wakeups always happen will make the code basically open
      code a mutex. Just use a mutex.
      
      The TLS HW TX path does not have any locking (not even the
      sk_write_pending hack), yet it uses a per-socket sg_tx_data
      array to push records.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption of records for performance")
      Reported-by: Mallesham Jatharakonda <mallesh537@gmail.com>
      Reported-by: Pooja Trivedi <poojatrivedi@gmail.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
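The locking scheme argued for above can be sketched with POSIX mutexes: a dedicated TX mutex is held across the whole send, including the window in which the socket lock is dropped to wait for memory. This is a user-space model with assumptions: `struct tls_sock_model`, `wait_for_memory_model()`, and `tls_sendmsg_model()` are hypothetical names, and a pthread mutex stands in for the socket lock.

```c
#include <assert.h>
#include <pthread.h>

struct tls_sock_model {
    pthread_mutex_t sock_lock;  /* stands in for the socket lock */
    pthread_mutex_t tx_lock;    /* serializes all TX writers */
    int records_pushed;
};

/* The send buffer is full: drop the socket lock while waiting, then
 * take it again. tx_lock stays held, so no other writer can enter. */
static void wait_for_memory_model(struct tls_sock_model *s)
{
    pthread_mutex_unlock(&s->sock_lock);
    /* a real implementation would sleep here until memory is available */
    pthread_mutex_lock(&s->sock_lock);
}

static void tls_sendmsg_model(struct tls_sock_model *s, int must_wait)
{
    pthread_mutex_lock(&s->tx_lock);    /* taken first, released last */
    pthread_mutex_lock(&s->sock_lock);

    if (must_wait)
        wait_for_memory_model(s);
    s->records_pushed++;                /* push the pending record */

    pthread_mutex_unlock(&s->sock_lock);
    pthread_mutex_unlock(&s->tx_lock);
}
```

A mutex also takes care of the wakeup problem listed above, since unlocking wakes a blocked waiter without any hand-written signaling between writers.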
  25. 06 Oct, 2019 2 commits
  26. 05 Oct, 2019 6 commits
  27. 05 Sep, 2019 2 commits
  28. 01 Sep, 2019 2 commits
  29. 16 Aug, 2019 1 commit
  30. 10 Aug, 2019 1 commit