1. 04 Mar, 2019 4 commits
    • tls: Fix tls_device receive · d069b780
      Boris Pismenny committed
      Currently, the receive function fails to handle records already
      decrypted by the device due to the commit mentioned below.
      
      This commit advances the TLS record sequence number and prepares the context
      to handle the next record.
      
      Fixes: fedf201e ("net: tls: Refactor control message handling on recv")
      Signed-off-by: Boris Pismenny <borisp@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: Fix mixing between async capable and async · 7754bd63
      Eran Ben Elisha committed
      Today, tls_sw_recvmsg is capable of using asynchronous mode to handle
      application data TLS records. Moreover, it assumes that if the cipher
      can be handled asynchronously, then all packets will be processed
      asynchronously.
      
      However, this assumption is not always true. Specifically, for AES-GCM
      in TLS1.2, it causes data corruption, and breaks user applications.
      
      This patch fixes this problem by separating the async capability from
      the decryption operation result.
      
      Fixes: c0ab4732 ("net/tls: Do not use async crypto for non-data records")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: Boris Pismenny <borisp@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
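The separation between "async capable" and "actually went async" can be sketched in userspace C. This is a minimal illustration, not the kernel code: `toy_decrypt`, `rx_stats` and `process_records` are hypothetical names, and the only convention borrowed from the crypto API is that a queued request is reported with `-EINPROGRESS`.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-call bookkeeping; not a kernel structure. */
struct rx_stats {
    int done_inline;   /* records decrypted synchronously */
    int pending;       /* records actually queued for async completion */
};

/*
 * Toy stand-in for a decrypt call. Assumption for the demo: even-numbered
 * records complete inline even when async is allowed, odd-numbered ones
 * are queued and reported with -EINPROGRESS.
 */
static int toy_decrypt(int record, bool allow_async)
{
    if (allow_async && (record % 2) != 0)
        return -EINPROGRESS;
    return 0;
}

/*
 * Decide per record, from the return value, whether it went async.
 * "async_capable" only says what is permitted; it is never used to
 * infer what actually happened.
 */
int process_records(const int *records, int n, bool async_capable,
                    struct rx_stats *st)
{
    st->done_inline = 0;
    st->pending = 0;
    for (int i = 0; i < n; i++) {
        int err = toy_decrypt(records[i], async_capable);

        if (err == -EINPROGRESS)
            st->pending++;        /* must wait before touching the data */
        else if (err == 0)
            st->done_inline++;    /* plaintext is usable right away */
        else
            return err;           /* real failure */
    }
    return 0;
}
```

The point of the design is that a record which completed inline is never treated as pending just because the cipher could have run asynchronously.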
    • tls: Fix write space handling · 7463d3a2
      Boris Pismenny committed
      TLS device cannot use the sw context. This patch returns the original
      tls device write space handler and moves the sw/device specific portions
      to the relevant files.
      
      Also, we remove the write_space call for the tls_sw flow, because it
      handles partial records in its delayed tx work handler.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption of records for performance")
      Signed-off-by: Boris Pismenny <borisp@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: Fix tls_device handling of partial records · 94850257
      Boris Pismenny committed
      Clean up the handling of partial records while fixing a bug where the
      tls_push_pending_closed_record function is using the software tls
      context instead of the hardware context.
      
      The bug resulted in the following crash:
      [   88.791229] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [   88.793271] #PF error: [normal kernel read fault]
      [   88.794449] PGD 800000022a426067 P4D 800000022a426067 PUD 22a156067 PMD 0
      [   88.795958] Oops: 0000 [#1] SMP PTI
      [   88.796884] CPU: 2 PID: 4973 Comm: openssl Not tainted 5.0.0-rc4+ #3
      [   88.798314] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      [   88.800067] RIP: 0010:tls_tx_records+0xef/0x1d0 [tls]
      [   88.801256] Code: 00 02 48 89 43 08 e8 a0 0b 96 d9 48 89 df e8 48 dd
      4d d9 4c 89 f8 4d 8b bf 98 00 00 00 48 05 98 00 00 00 48 89 04 24 49 39
      c7 <49> 8b 1f 4d 89 fd 0f 84 af 00 00 00 41 8b 47 10 85 c0 0f 85 8d 00
      [   88.805179] RSP: 0018:ffffbd888186fca8 EFLAGS: 00010213
      [   88.806458] RAX: ffff9af1ed657c98 RBX: ffff9af1e88a1980 RCX: 0000000000000000
      [   88.808050] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9af1e88a1980
      [   88.809724] RBP: ffff9af1e88a1980 R08: 0000000000000017 R09: ffff9af1ebeeb700
      [   88.811294] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      [   88.812917] R13: ffff9af1e88a1980 R14: ffff9af1ec13f800 R15: 0000000000000000
      [   88.814506] FS:  00007fcad2240740(0000) GS:ffff9af1f7880000(0000) knlGS:0000000000000000
      [   88.816337] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   88.817717] CR2: 0000000000000000 CR3: 0000000228b3e000 CR4: 00000000001406e0
      [   88.819328] Call Trace:
      [   88.820123]  tls_push_data+0x628/0x6a0 [tls]
      [   88.821283]  ? remove_wait_queue+0x20/0x60
      [   88.822383]  ? n_tty_read+0x683/0x910
      [   88.823363]  tls_device_sendmsg+0x53/0xa0 [tls]
      [   88.824505]  sock_sendmsg+0x36/0x50
      [   88.825492]  sock_write_iter+0x87/0x100
      [   88.826521]  __vfs_write+0x127/0x1b0
      [   88.827499]  vfs_write+0xad/0x1b0
      [   88.828454]  ksys_write+0x52/0xc0
      [   88.829378]  do_syscall_64+0x5b/0x180
      [   88.830369]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   88.831603] RIP: 0033:0x7fcad1451680
      
      [ 1248.470626] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [ 1248.472564] #PF error: [normal kernel read fault]
      [ 1248.473790] PGD 0 P4D 0
      [ 1248.474642] Oops: 0000 [#1] SMP PTI
      [ 1248.475651] CPU: 3 PID: 7197 Comm: openssl Tainted: G           OE 5.0.0-rc4+ #3
      [ 1248.477426] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      [ 1248.479310] RIP: 0010:tls_tx_records+0x110/0x1f0 [tls]
      [ 1248.480644] Code: 00 02 48 89 43 08 e8 4f cb 63 d7 48 89 df e8 f7 9c
      1b d7 4c 89 f8 4d 8b bf 98 00 00 00 48 05 98 00 00 00 48 89 04 24 49 39
      c7 <49> 8b 1f 4d 89 fd 0f 84 af 00 00 00 41 8b 47 10 85 c0 0f 85 8d 00
      [ 1248.484825] RSP: 0018:ffffaa0a41543c08 EFLAGS: 00010213
      [ 1248.486154] RAX: ffff955a2755dc98 RBX: ffff955a36031980 RCX: 0000000000000006
      [ 1248.487855] RDX: 0000000000000000 RSI: 000000000000002b RDI: 0000000000000286
      [ 1248.489524] RBP: ffff955a36031980 R08: 0000000000000000 R09: 00000000000002b1
      [ 1248.491394] R10: 0000000000000003 R11: 00000000ad55ad55 R12: 0000000000000000
      [ 1248.493162] R13: 0000000000000000 R14: ffff955a2abe6c00 R15: 0000000000000000
      [ 1248.494923] FS:  0000000000000000(0000) GS:ffff955a378c0000(0000) knlGS:0000000000000000
      [ 1248.496847] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1248.498357] CR2: 0000000000000000 CR3: 000000020c40e000 CR4: 00000000001406e0
      [ 1248.500136] Call Trace:
      [ 1248.500998]  ? tcp_check_oom+0xd0/0xd0
      [ 1248.502106]  tls_sk_proto_close+0x127/0x1e0 [tls]
      [ 1248.503411]  inet_release+0x3c/0x60
      [ 1248.504530]  __sock_release+0x3d/0xb0
      [ 1248.505611]  sock_close+0x11/0x20
      [ 1248.506612]  __fput+0xb4/0x220
      [ 1248.507559]  task_work_run+0x88/0xa0
      [ 1248.508617]  do_exit+0x2cb/0xbc0
      [ 1248.509597]  ? core_sys_select+0x17a/0x280
      [ 1248.510740]  do_group_exit+0x39/0xb0
      [ 1248.511789]  get_signal+0x1d0/0x630
      [ 1248.512823]  do_signal+0x36/0x620
      [ 1248.513822]  exit_to_usermode_loop+0x5c/0xc6
      [ 1248.515003]  do_syscall_64+0x157/0x180
      [ 1248.516094]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [ 1248.517456] RIP: 0033:0x7fb398bd3f53
      [ 1248.518537] Code: Bad RIP value.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption of records for performance")
      Signed-off-by: Boris Pismenny <borisp@mellanox.com>
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 25 Feb, 2019 1 commit
    • tls: Return type of non-data records retrieved using MSG_PEEK in recvmsg · 2b794c40
      Vakul Garg committed
      The patch enables returning 'type' in msghdr for records that are
      retrieved with MSG_PEEK in recvmsg. It also prevents records peeked
      from the socket from being merged with any record of a different
      type when records are subsequently dequeued from strparser.
      
      For each record, we now retain its type in the sk_buff's control buffer
      cb[]. Inside the control buffer, the record's full length and offset are
      already stored by strparser in 'struct strp_msg'. We store the record type
      after 'struct strp_msg' inside 'struct tls_msg'. For tls1.2, the type is
      stored just after record dequeue. For tls1.3, the type is stored after
      the record has been decrypted.
      
      Inside process_rx_list(), before processing a non-data record, we check
      that we are able to return the record type to the user application.
      If not, the decrypted records in the tls context's rx_list are left
      there without consuming any data.
      
      Fixes: 692d7b5d ("tls: Fix recvmsg() to be able to peek across multiple records")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
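The layout described in this commit, a record type stored after the parser's metadata inside the per-buffer control area, can be sketched as follows. The struct and function names ending in `_sketch` are simplified stand-ins rather than the kernel definitions, and the 48-byte control buffer size is an assumption of this sketch.

```c
#include <stdbool.h>
#include <string.h>

#define CB_SIZE 48  /* assumed control buffer size for this sketch */

/* Simplified stand-ins; field names loosely follow the commit text. */
struct strp_msg_sketch {
    int full_len;
    int offset;
};

struct tls_msg_sketch {
    struct strp_msg_sketch rxm;  /* filled in by the stream parser */
    unsigned char control;       /* record type, stored after rxm */
};

struct skb_sketch {
    unsigned char cb[CB_SIZE];
};

/* Compile-time check: the whole tls_msg must fit in the control buffer. */
typedef char tls_msg_fits_in_cb[sizeof(struct tls_msg_sketch) <= CB_SIZE ? 1 : -1];

/* Store the record type without disturbing the parser's fields. */
void set_record_type(struct skb_sketch *skb, unsigned char type)
{
    struct tls_msg_sketch m;

    memcpy(&m, skb->cb, sizeof(m));
    m.control = type;
    memcpy(skb->cb, &m, sizeof(m));
}

unsigned char get_record_type(const struct skb_sketch *skb)
{
    struct tls_msg_sketch m;

    memcpy(&m, skb->cb, sizeof(m));
    return m.control;
}

/* Two queued records may be returned together only if their types match. */
bool can_coalesce(const struct skb_sketch *prev, const struct skb_sketch *next)
{
    return get_record_type(prev) == get_record_type(next);
}
```

Keeping the type next to the record means a peeked record can still report its type later, when it is read again from the list of already-decrypted records.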
  3. 20 Feb, 2019 1 commit
  4. 13 Feb, 2019 1 commit
    • net/tls: Do not use async crypto for non-data records · c0ab4732
      Vakul Garg committed
      Addition of tls1.3 support broke tls1.2 handshake when async crypto
      accelerator is used. This is because the record type for non-data
      records is not propagated to user application. Also when async
      decryption happens, the decryption does not stop when two different
      types of records get dequeued and submitted for decryption. To address
      it, we decrypt tls1.2 non-data records in synchronous way. We check
      whether the record we just processed has same type as the previous one
      before checking for async condition and jumping to dequeue next record.
      
      Fixes: 130b392c ("net: tls: Add tls 1.3 support")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 10 Feb, 2019 1 commit
    • net/tls: Disable async decrytion for tls1.3 · 8497ded2
      Vakul Garg committed
      Function tls_sw_recvmsg() dequeues multiple records from stream parser
      and decrypts them. In case the decryption is done by async accelerator,
      the records may get submitted for decryption while the previous ones may
      not have been decrypted yet. For tls1.3, the record type is known only
      after decryption. Therefore, for tls1.3, tls_sw_recvmsg() may submit
      records for decryption even if it gets 'handshake' records after 'data'
      records. These intermediate 'handshake' records may do a key update.
      By the time new keys are given to ktls by userspace, it is possible that
      ktls has already submitted some records (which are encrypted with new
      keys) for decryption using old keys. This would lead to decrypt failure.
      Therefore, async decryption of records should be disabled for tls1.3.
      
      Fixes: 130b392c ("net: tls: Add tls 1.3 support")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
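Why the record type is only known after decryption in TLS 1.3 can be shown with a small sketch. Per RFC 8446 (section 5.4), the decrypted inner plaintext is the content, followed by one content-type byte, followed by optional zero padding. The function name `tls13_inner_type` is hypothetical; this is an illustration, not the kernel routine.

```c
#include <stddef.h>

/*
 * TLS 1.3 inner plaintext layout:
 *   content || content_type (1 byte) || zero padding
 * Returns the content type, or -1 if the buffer holds no non-zero byte.
 * On success *content_len is the number of content bytes before the type.
 */
int tls13_inner_type(const unsigned char *plain, size_t len, size_t *content_len)
{
    while (len > 0 && plain[len - 1] == 0)
        len--;                     /* strip trailing zero padding */
    if (len == 0)
        return -1;                 /* all padding: malformed record */
    *content_len = len - 1;
    return plain[len - 1];
}
```

Because the type byte sits inside the ciphertext, a receiver cannot tell a handshake record (such as one carrying a key update) from application data until the record has been decrypted.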
  6. 02 Feb, 2019 5 commits
  7. 29 Jan, 2019 2 commits
  8. 23 Jan, 2019 2 commits
    • net/tls: free ctx in sock destruct · 76f7164d
      Atul Gupta committed
      Free the tls context in sock destruct. close may not be the last
      call to free the sock, but force-releasing the ctx in close
      results in a GPF when the ctx is referenced again in tcp_done.
      
      [  515.330477] general protection fault: 0000 [#1] SMP PTI
      [  515.330539] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.20.0-rc7+ #10
      [  515.330657] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0b 11/07/2013
      [  515.330844] RIP: 0010:tls_hw_unhash+0xbf/0xd0
      [
      [  515.332220] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  515.332340] CR2: 00007fab32c55000 CR3: 000000009261e000 CR4: 00000000000006e0
      [  515.332519] Call Trace:
      [  515.332632]  <IRQ>
      [  515.332793]  tcp_set_state+0x5a/0x190
      [  515.332907]  ? tcp_update_metrics+0xe3/0x350
      [  515.333023]  tcp_done+0x31/0xd0
      [  515.333130]  tcp_rcv_state_process+0xc27/0x111a
      [  515.333242]  ? __lock_is_held+0x4f/0x90
      [  515.333350]  ? tcp_v4_do_rcv+0xaf/0x1e0
      [  515.333456]  tcp_v4_do_rcv+0xaf/0x1e0
      Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tls: build_protos moved to common routine · 63a6b3fe
      Atul Gupta committed
      build_protos is also required for tls_hw_prot, hence it is moved to
      'tls_build_proto' and called as required from tls_init
      and tls_hw_prot. This is required since build_protos
      for v4 was moved from tls_register to tls_init in
      commit <28cb6f1e>
      Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 18 Jan, 2019 3 commits
  10. 22 Dec, 2018 1 commit
    • tls: Do not call sk_memcopy_from_iter with zero length · 65a10e28
      Vakul Garg committed
      In some conditions e.g. when tls_clone_plaintext_msg() returns -ENOSPC,
      the number of bytes to be copied using subsequent function
      sk_msg_memcopy_from_iter() becomes zero. This causes function
      sk_msg_memcopy_from_iter() to fail which in turn causes tls_sw_sendmsg()
      to return failure. To prevent it, do not call sk_msg_memcopy_from_iter()
      when number of bytes to copy (indicated by 'try_to_copy') is zero.
      
      Fixes: d829e9c4 ("tls: convert to generic sk_msg interface")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
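The guard described in this commit can be sketched in a few lines. `mock_memcopy` is a hypothetical stand-in for the copy helper. It is assumed here to report an error when asked to copy zero bytes, as the commit text describes, and the specific error code is chosen arbitrarily.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Mock copy helper. Assumption for this sketch: a zero-length request is
 * reported as an error (-EINVAL is an arbitrary choice).
 */
static int mock_memcopy(char *dst, const char *src, size_t bytes)
{
    if (bytes == 0)
        return -EINVAL;
    memcpy(dst, src, bytes);
    return 0;
}

/* Returns the number of bytes copied, or a negative error code. */
int send_chunk(char *dst, const char *src, size_t try_to_copy)
{
    if (try_to_copy) {            /* skip the helper when nothing is left */
        int ret = mock_memcopy(dst, src, try_to_copy);

        if (ret)
            return ret;
    }
    return (int)try_to_copy;
}
```

With the guard in place, a zero-length step is a no-op for the caller instead of a spurious send failure.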
  11. 21 Dec, 2018 2 commits
    • bpf: tls_sw, init TLS ULP removes BPF proto hooks · 28cb6f1e
      John Fastabend committed
      The existing code did not expect users would initialize the TLS ULP
      without subsequently calling the TLS TX enabling socket option.
      If the application tries to send data after the TLS ULP enable op
      but before the TLS TX enable op the BPF sk_msg verdict program is
      skipped. This patch resolves this by converting the ipv4 sock ops
      to be calculated at init time the same way ipv6 ops are done. This
      pulls in any changes to the sock ops structure that have been made
      after the socket was created including the changes from adding the
      socket to a sock{map|hash}.
      
      This was discovered by running OpenSSL master branch which calls
      the TLS ULP setsockopt early in TLS handshake but only enables
      the TLS TX path once the handshake has completed. As a result the
      datapath missed the initial handshake messages.
      
      Fixes: 02c558b2 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: sk_msg, sock{map|hash} redirect through ULP · 0608c69c
      John Fastabend committed
      A sockmap program that redirects through a kTLS ULP enabled socket
      will not work correctly because the ULP layer is skipped. This
      fixes the behavior to call through the ULP layer on redirect to
      ensure any operations required on the data stream at the ULP layer
      continue to be applied.
      
      To do this we add an internal flag MSG_SENDPAGE_NOPOLICY to avoid
      calling the BPF layer on a redirected message. This is
      required to avoid calling the BPF layer multiple times (possibly
      recursively) which is not the current/expected behavior without
      ULPs. In the future we may add a redirect flag if users _do_
      want the policy applied again but this would need to work for both
      ULP and non-ULP sockets and be opt-in to avoid breaking existing
      programs.
      
      Also to avoid polluting the flag space with an internal flag we
      reuse the flag space overlapping MSG_SENDPAGE_NOPOLICY with
      MSG_WAITFORONE. Here WAITFORONE is specific to recv path and
      SENDPAGE_NOPOLICY is only used for sendpage hooks. The last thing
      to verify is user space API is masked correctly to ensure the flag
      can not be set by user. (Note this needs to be true regardless
      because we have internal flags already in-use that user space
      should not be able to set). But for completeness we have two UAPI
      paths into sendpage, sendfile and splice.
      
      In the sendfile case the function do_sendfile() zeroes flags,
      
      ./fs/read_write.c:
       static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
      		   	    size_t count, loff_t max)
       {
         ...
         fl = 0;
      #if 0
         /*
          * We need to debate whether we can enable this or not. The
          * man page documents EAGAIN return for the output at least,
          * and the application is arguably buggy if it doesn't expect
          * EAGAIN on a non-blocking file descriptor.
          */
          if (in.file->f_flags & O_NONBLOCK)
      	fl = SPLICE_F_NONBLOCK;
      #endif
          file_start_write(out.file);
          retval = do_splice_direct(in.file, &pos, out.file, &out_pos, count, fl);
       }
      
      In the splice case the pipe_to_sendpage "actor" is used which
      masks flags with SPLICE_F_MORE.
      
      ./fs/splice.c:
       static int pipe_to_sendpage(struct pipe_inode_info *pipe,
      			    struct pipe_buffer *buf, struct splice_desc *sd)
       {
         ...
         more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
         ...
       }
      
      This confirms what we expect: the internal flags are in fact internal
      to the socket side.
      
      Fixes: d3b18ad3 ("tls: add bpf support to sk_msg handling")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
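The flag-masking argument in this commit can be checked with a small userspace sketch. The `SK_`-prefixed constants are local to this example. Their values are believed to match the kernel headers of that period but should be treated as assumptions and verified against include/linux/socket.h and include/linux/splice.h. `splice_to_msg_flags` and `sendfile_flags` are hypothetical helpers that mimic the two excerpts quoted above.

```c
/* Local copies of the flag values (assumed; verify against kernel headers). */
#define SK_MSG_MORE               0x8000u
#define SK_MSG_WAITFORONE         0x10000u  /* recv path only */
#define SK_MSG_SENDPAGE_NOPOLICY  0x10000u  /* internal, sendpage path only */
#define SK_SPLICE_F_MORE          0x04u

/* Mimics pipe_to_sendpage(): only SPLICE_F_MORE survives, as MSG_MORE. */
unsigned int splice_to_msg_flags(unsigned int splice_flags)
{
    return (splice_flags & SK_SPLICE_F_MORE) ? SK_MSG_MORE : 0u;
}

/* Mimics do_sendfile(): the flags passed down start from zero. */
unsigned int sendfile_flags(void)
{
    return 0u;
}
```

Whatever bits user space supplies, neither path can produce the internal flag, so sharing the bit with a receive-only flag does not expose it.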
  12. 20 Dec, 2018 1 commit
    • net/tls: allocate tls context using GFP_ATOMIC · c6ec179a
      Ganesh Goudar committed
      create_ctx can be called from atomic context, hence use
      GFP_ATOMIC instead of GFP_KERNEL.
      
      [  395.962599] BUG: sleeping function called from invalid context at mm/slab.h:421
      [  395.979896] in_atomic(): 1, irqs_disabled(): 0, pid: 16254, name: openssl
      [  395.996564] 2 locks held by openssl/16254:
      [  396.010492]  #0: 00000000347acb52 (sk_lock-AF_INET){+.+.}, at: do_tcp_setsockopt.isra.44+0x13b/0x9a0
      [  396.029838]  #1: 000000006c9552b5 (device_spinlock){+...}, at: tls_init+0x1d/0x280
      [  396.047675] CPU: 5 PID: 16254 Comm: openssl Tainted: G           O      4.20.0-rc6+ #25
      [  396.066019] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0c 09/25/2017
      [  396.083537] Call Trace:
      [  396.096265]  dump_stack+0x5e/0x8b
      [  396.109876]  ___might_sleep+0x216/0x250
      [  396.123940]  kmem_cache_alloc_trace+0x1b0/0x240
      [  396.138800]  create_ctx+0x1f/0x60
      [  396.152504]  tls_init+0xbd/0x280
      [  396.166135]  tcp_set_ulp+0x191/0x2d0
      [  396.180035]  ? tcp_set_ulp+0x2c/0x2d0
      [  396.193960]  do_tcp_setsockopt.isra.44+0x148/0x9a0
      [  396.209013]  __sys_setsockopt+0x7c/0xe0
      [  396.223054]  __x64_sys_setsockopt+0x20/0x30
      [  396.237378]  do_syscall_64+0x4a/0x180
      [  396.251200]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: df9d4a17 ("net/tls: sleeping function from invalid context")
      Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 15 Dec, 2018 2 commits
    • net/tls: sleeping function from invalid context · df9d4a17
      Atul Gupta committed
      HW unhash within the mutex for registered tls devices causes a sleep
      when called from tcp_set_state for TCP_CLOSE. Release the lock and
      re-acquire it after the function call, with ref count incr/dec around it.
      Define a kref and a release function pointer for tls_device to ensure
      the device is not released outside the lock.
      
      BUG: sleeping function called from invalid context at kernel/locking/mutex.c:748
      in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/7
      INFO: lockdep is turned off.
      CPU: 7 PID: 0 Comm: swapper/7 Tainted: G        W  O
      Call Trace:
       <IRQ>
       dump_stack+0x5e/0x8b
       ___might_sleep+0x222/0x260
       __mutex_lock+0x5c/0xa50
       ? vprintk_emit+0x1f3/0x440
       ? kmem_cache_free+0x22d/0x2a0
       ? tls_hw_unhash+0x2f/0x80
       ? printk+0x52/0x6e
       ? tls_hw_unhash+0x2f/0x80
       tls_hw_unhash+0x2f/0x80
       tcp_set_state+0x5f/0x180
       tcp_done+0x2e/0xe0
       tcp_rcv_state_process+0x92c/0xdd3
       ? lock_acquire+0xf5/0x1f0
       ? tcp_v4_rcv+0xa7c/0xbe0
       ? tcp_v4_do_rcv+0x70/0x1e0
      Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
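The pattern in this commit (pin the device with a reference, drop the lock, call the function that may sleep, then re-acquire the lock) can be sketched with a single-threaded mock. All names here are hypothetical, and the mock lock only records ordering. It is not a real synchronization primitive.

```c
#include <stdbool.h>

/* Single-threaded mock of a lock, enough to check call ordering. */
static bool lock_held;
static void mock_lock(void)   { lock_held = true; }
static void mock_unlock(void) { lock_held = false; }

struct dev_sketch {
    int refcount;          /* stands in for a kref */
    bool released;         /* set when the last reference is dropped */
    bool unhash_saw_lock;  /* did the callback run with the lock held? */
};

/* Callback that may sleep, so it must not run under the lock. */
static void dev_unhash(struct dev_sketch *d)
{
    d->unhash_saw_lock = lock_held;
}

static void dev_put(struct dev_sketch *d)
{
    if (--d->refcount == 0)
        d->released = true;
}

void unhash_all(struct dev_sketch *devs, int n)
{
    mock_lock();
    for (int i = 0; i < n; i++) {
        devs[i].refcount++;     /* pin the device before dropping the lock */
        mock_unlock();
        dev_unhash(&devs[i]);   /* safe to sleep here */
        dev_put(&devs[i]);      /* drop the temporary reference */
        mock_lock();            /* re-acquire before continuing the walk */
    }
    mock_unlock();
}
```

The extra reference is what keeps the device alive during the window in which the lock is not held.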
    • net/tls: Init routines in create_ctx · 6c0563e4
      Atul Gupta committed
      create_ctx is called from tls_init and tls_hw_prot,
      hence initialize function pointers in the common routine.
      Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 29 Nov, 2018 1 commit
  15. 24 Oct, 2018 2 commits
    • iov_iter: Separate type from direction and use accessor functions · aa563d7b
      David Howells committed
      In the iov_iter struct, separate the iterator type from the iterator
      direction and use accessor functions to access them in most places.
      
      Convert a bunch of places to use switch-statements to access them rather
      than chains of bitwise-AND statements.  This makes it easier to add further
      iterator types.  Also, this can be more efficient: to implement a switch
      of small contiguous integers, the compiler can use ~50% fewer compare
      instructions than it has to use bitwise-and instructions.
      
      Further, cease passing the iterator type into the iterator setup function.
      The iterator function can set that itself.  Only the direction is required.
      Signed-off-by: David Howells <dhowells@redhat.com>
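A rough sketch of the idea, with the iterator type kept separate from the direction, read through accessors, and dispatched with a switch. The encoding and all names below are hypothetical and chosen only for illustration. They are not the kernel's definitions.

```c
/* Hypothetical encoding: direction in bit 0, type as a small integer above it. */
enum iter_dir  { DIR_READ = 0, DIR_WRITE = 1 };
enum iter_kind { KIND_IOVEC = 0, KIND_KVEC = 1, KIND_BVEC = 2, KIND_PIPE = 3 };

struct iter_sketch {
    unsigned int packed;
};

/* Setup functions fix the type themselves; callers pass only the direction. */
void iter_kvec_init(struct iter_sketch *it, enum iter_dir dir)
{
    it->packed = ((unsigned int)KIND_KVEC << 1) | (unsigned int)dir;
}

void iter_pipe_init(struct iter_sketch *it, enum iter_dir dir)
{
    it->packed = ((unsigned int)KIND_PIPE << 1) | (unsigned int)dir;
}

/* Accessors hide how type and direction are stored. */
enum iter_kind iter_kind_of(const struct iter_sketch *it)
{
    return (enum iter_kind)(it->packed >> 1);
}

enum iter_dir iter_dir_of(const struct iter_sketch *it)
{
    return (enum iter_dir)(it->packed & 1u);
}

/* Dispatch on small contiguous integers instead of a chain of bit tests. */
const char *iter_name(const struct iter_sketch *it)
{
    switch (iter_kind_of(it)) {
    case KIND_IOVEC: return "iovec";
    case KIND_KVEC:  return "kvec";
    case KIND_BVEC:  return "bvec";
    case KIND_PIPE:  return "pipe";
    }
    return "unknown";
}
```

Because callers go through the accessors, the storage layout can change later without touching the call sites.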
    • iov_iter: Use accessor function · 00e23707
      David Howells committed
      Use accessor functions to access an iterator's type and direction.  This
      allows for the possibility of using some other method of determining the
      type of iterator than if-chains with bitwise-AND conditions.
      Signed-off-by: David Howells <dhowells@redhat.com>
  16. 21 Oct, 2018 1 commit
  17. 17 Oct, 2018 1 commit
  18. 16 Oct, 2018 3 commits
    • tls: add bpf support to sk_msg handling · d3b18ad3
      John Fastabend committed
      This work adds BPF sk_msg verdict program support to kTLS
      allowing BPF and kTLS to be combined together. Previously kTLS
      and sk_msg verdict programs were mutually exclusive in the
      ULP layer which created challenges for the orchestrator when
      trying to apply TCP based policy, for example. To resolve this,
      leveraging the work from previous patches that consolidates
      the use of sk_msg, we can finally enable BPF sk_msg verdict
      programs so they continue to run after the kTLS socket is
      created. No change in behavior when kTLS is not used in
      combination with BPF, the kselftest suite for kTLS also runs
      successfully.
      
      Joint work with Daniel.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • tls: replace poll implementation with read hook · 924ad65e
      John Fastabend committed
      Instead of re-implementing the poll routine and using the poll callback
      to trigger a read from kTLS, we reuse the stream_memory_read callback,
      which is simpler and achieves the same. This helps to align sockmap
      and kTLS so we can more easily embed BPF in kTLS.
      
      Joint work with Daniel.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • tls: convert to generic sk_msg interface · d829e9c4
      Daniel Borkmann committed
      Convert kTLS over to make use of the sk_msg interface for plaintext and
      encrypted scattergather data, so it reuses all the sk_msg helpers
      and data structures, which in a second step makes it possible to glue
      this to BPF.
      
      This also allows removing quite a bit of open coded helpers which
      are covered by the sk_msg API. Recent changes in kTLS 80ece6a0
      ("tls: Remove redundant vars from tls record structure") and
      4e6d4720 ("tls: Add support for inplace records encryption")
      changed the data path handling a bit; while we've kept the latter
      optimization intact, we had to undo the former change to better
      fit the sk_msg model, hence the sg_aead_in and sg_aead_out have
      been brought back and are linked into the sk_msg sgs. Now the kTLS
      record contains a msg_plaintext and msg_encrypted sk_msg each.
      
      In the original code, the zerocopy_from_iter() has been used out
      of TX but also RX path. For the strparser skb-based RX path,
      we've left the zerocopy_from_iter() in decrypt_internal() mostly
      untouched, meaning it has been moved into tls_setup_from_iter()
      with charging logic removed (as not used from RX). Given RX path
      is not based on sk_msg objects, we haven't pursued setting up a
      dummy sk_msg to call into sk_msg_zerocopy_from_iter(), but it
      could be an option to pursue in a later step.
      
      Joint work with John.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  19. 03 Oct, 2018 1 commit
    • tls: Add support for inplace records encryption · 4e6d4720
      Vakul Garg committed
      Presently, for non-zero copy case, separate pages are allocated for
      storing plaintext and encrypted text of records. These pages are stored
      in sg_plaintext_data and sg_encrypted_data scatterlists inside record
      structure. Further, sg_plaintext_data & sg_encrypted_data are passed
      to crypto APIs for record encryption. Allocating separate pages for
      plaintext and encrypted text is inefficient in terms of both memory
      use and performance.
      
      This patch adds support for inplace encryption of records. For non-zero
      copy case, we reuse the pages from sg_encrypted_data scatterlist to
      copy the application's plaintext data. For the movement of pages from
      sg_encrypted_data to sg_plaintext_data scatterlists, we introduce a new
      function move_to_plaintext_sg(). This function adds pages into
      sg_plaintext_data from sg_encrypted_data scatterlists.
      
      tls_do_encryption() is modified to pass the same scatterlist as both
      source and destination into aead_request_set_crypt() if inplace crypto
      has been enabled. A new variable 'inplace_crypto' has been introduced in
      record structure to signify whether the same scatterlist can be used.
      By default, the inplace_crypto is enabled in get_rec(). If zero-copy is
      used (i.e. plaintext data is not copied), inplace_crypto is set to '0'.
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Reviewed-by: Dave Watson <davejwatson@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
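The source/destination choice described in this commit can be sketched with a toy transform. `toy_crypt` is a plain XOR and is NOT real encryption. It only stands in for an operation that accepts the same buffer as input and output. `rec_sketch`, `rec_load` and `rec_encrypt` are hypothetical names, and the fixed 16-byte buffers are an assumption of the example.

```c
#include <stddef.h>
#include <string.h>

#define REC_MAX 16

/* Toy XOR transform. It reads src[i] before writing dst[i], so src == dst works. */
static void toy_crypt(unsigned char *dst, const unsigned char *src,
                      size_t len, unsigned char key)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i] ^ key;
}

struct rec_sketch {
    unsigned char plaintext[REC_MAX];
    unsigned char encrypted[REC_MAX];
    int inplace_crypto;  /* 1: plaintext was copied into 'encrypted' */
};

/* Copied case fills the output buffer directly; the other case keeps a separate source. */
int rec_load(struct rec_sketch *rec, const void *data, size_t len, int inplace)
{
    if (len > REC_MAX)
        return -1;
    rec->inplace_crypto = inplace;
    memcpy(inplace ? rec->encrypted : rec->plaintext, data, len);
    return 0;
}

/* Pass the same buffer as source and destination when in-place is enabled. */
void rec_encrypt(struct rec_sketch *rec, size_t len, unsigned char key)
{
    const unsigned char *src =
        rec->inplace_crypto ? rec->encrypted : rec->plaintext;

    toy_crypt(rec->encrypted, src, len, key);
}
```

Both paths produce the same output bytes. The in-place path simply avoids keeping a second set of pages for the plaintext.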
  20. 30 Sep, 2018 1 commit
    • tls: Remove redundant vars from tls record structure · 80ece6a0
      Vakul Garg committed
      Structure 'tls_rec' contains sg_aead_in and sg_aead_out, which point
      to an aad_space and then chain the scatterlists sg_plaintext_data and
      sg_encrypted_data respectively. Rather than using chained scatterlists
      for plaintext and encrypted data in aead_req, it is more efficient to store
      aad_space in sg_encrypted_data and sg_plaintext_data themselves in the
      first index and get rid of sg_aead_in, sg_aead_out and further chaining.
      
      This requires increasing the size of the sg_encrypted_data & sg_plaintext_data
      arrays by 1 to accommodate an entry for aad_space. The code which uses
      sg_encrypted_data and sg_plaintext_data has been modified to skip the first
      index as it points to aad_space.
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 29 Sep, 2018 1 commit
  22. 26 Sep, 2018 2 commits
    • tls: Fixed a memory leak during socket close · c774973e
      Vakul Garg committed
      During socket close, if there is an open record in the tx context, it needs
      to be freed in addition to freeing the plaintext and encrypted scatter
      lists. This patch frees the open record if present in the tx context.
      
      Also tls_free_both_sg() has been renamed to tls_free_open_rec() to
      indicate that the open record in the tx context is being freed inside the
      function.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: Fix socket mem accounting error under async encryption · b85135b5
      Vakul Garg committed
      The current async encryption implementation sometimes showed a socket
      memory accounting error during socket close. This results in a kernel
      warning calltrace. The root cause of the problem is that the socket variable
      sk_forward_alloc gets corrupted because sk_mem_charge()
      and sk_mem_uncharge() are invoked from multiple concurrent contexts
      on a multicore processor. The APIs sk_mem_charge() and sk_mem_uncharge()
      are called from functions alloc_plaintext_sg(), free_sg() etc. It is
      required that memory accounting APIs are called under the socket lock.
      
      The plaintext sg data sent for encryption is freed using free_sg() in
      tls_encryption_done(). It is wrong to call free_sg() from this function.
      This is because this function may run in irq context. We cannot acquire
      socket lock in this function.
      
      We remove calling of function free_sg() for plaintext data from
      tls_encryption_done() and defer freeing up of plaintext data to the time
      when the record is picked up from tx_list and transmitted/freed. When
      tls_tx_records() gets called, socket is already locked and thus there is
      no concurrent access problem.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
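The deferral described in this commit can be sketched as follows: the completion callback only marks a record as ready, and the plaintext is released later by the transmit routine, which the commit says runs with the socket already locked. The names are hypothetical, the "free" is modelled as a flag, and stopping at the first record that is not ready is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

struct txrec_sketch {
    bool ready;            /* set by the completion callback */
    bool plaintext_freed;  /* set when the plaintext is released */
};

/* May run in interrupt context: no accounting, no freeing, only a flag. */
void encryption_done(struct txrec_sketch *rec)
{
    rec->ready = true;
}

/*
 * Assumed to be called with the socket lock held. Sends records in order,
 * stops at the first one that is not ready, and returns the number sent.
 */
int tx_records(struct txrec_sketch *list, size_t n)
{
    int sent = 0;

    for (size_t i = 0; i < n; i++) {
        if (!list[i].ready)
            break;
        list[i].plaintext_freed = true;  /* accounting happens under the lock */
        sent++;
    }
    return sent;
}
```

Moving the release into the locked path means every accounting update for the socket is serialized by the same lock.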
  23. 25 Sep, 2018 1 commit