1. 18 1月, 2019 1 次提交
  2. 22 12月, 2018 1 次提交
    • V
      tls: Do not call sk_memcopy_from_iter with zero length · 65a10e28
      Vakul Garg 提交于
      In some conditions e.g. when tls_clone_plaintext_msg() returns -ENOSPC,
      the number of bytes to be copied using subsequent function
      sk_msg_memcopy_from_iter() becomes zero. This causes function
      sk_msg_memcopy_from_iter() to fail which in turn causes tls_sw_sendmsg()
      to return failure. To prevent it, do not call sk_msg_memcopy_from_iter()
      when number of bytes to copy (indicated by 'try_to_copy') is zero.
      
      Fixes: d829e9c4 ("tls: convert to generic sk_msg interface")
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      65a10e28
  3. 21 12月, 2018 1 次提交
    • J
      bpf: sk_msg, sock{map|hash} redirect through ULP · 0608c69c
      John Fastabend 提交于
      A sockmap program that redirects through a kTLS ULP enabled socket
      will not work correctly because the ULP layer is skipped. This
      fixes the behavior to call through the ULP layer on redirect to
      ensure any operations required on the data stream at the ULP layer
      continue to be applied.
      
      To do this we add an internal flag MSG_SENDPAGE_NOPOLICY to avoid
      calling the BPF layer on a redirected message. This is
      required to avoid calling the BPF layer multiple times (possibly
      recursively) which is not the current/expected behavior without
      ULPs. In the future we may add a redirect flag if users _do_
      want the policy applied again but this would need to work for both
      ULP and non-ULP sockets and be opt-in to avoid breaking existing
      programs.
      
      Also to avoid polluting the flag space with an internal flag we
      reuse the flag space overlapping MSG_SENDPAGE_NOPOLICY with
      MSG_WAITFORONE. Here WAITFORONE is specific to recv path and
      SENDPAGE_NOPOLICY is only used for sendpage hooks. The last thing
      to verify is user space API is masked correctly to ensure the flag
      can not be set by user. (Note this needs to be true regardless
      because we have internal flags already in-use that user space
      should not be able to set). But for completeness we have two UAPI
      paths into sendpage, sendfile and splice.
      
      In the sendfile case the function do_sendfile() zero's flags,
      
      ./fs/read_write.c:
       static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
      		   	    size_t count, loff_t max)
       {
         ...
         fl = 0;
      #if 0
         /*
          * We need to debate whether we can enable this or not. The
          * man page documents EAGAIN return for the output at least,
          * and the application is arguably buggy if it doesn't expect
          * EAGAIN on a non-blocking file descriptor.
          */
          if (in.file->f_flags & O_NONBLOCK)
      	fl = SPLICE_F_NONBLOCK;
      #endif
          file_start_write(out.file);
          retval = do_splice_direct(in.file, &pos, out.file, &out_pos, count, fl);
       }
      
      In the splice case the pipe_to_sendpage "actor" is used which
      masks flags with SPLICE_F_MORE.
      
      ./fs/splice.c:
       static int pipe_to_sendpage(struct pipe_inode_info *pipe,
      			    struct pipe_buffer *buf, struct splice_desc *sd)
       {
         ...
         more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
         ...
       }
      
      Confirming what we expect that internal flags  are in fact internal
      to socket side.
      
      Fixes: d3b18ad3 ("tls: add bpf support to sk_msg handling")
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      0608c69c
  4. 29 11月, 2018 1 次提交
  5. 24 10月, 2018 1 次提交
  6. 17 10月, 2018 1 次提交
  7. 16 10月, 2018 3 次提交
    • J
      tls: add bpf support to sk_msg handling · d3b18ad3
      John Fastabend 提交于
      This work adds BPF sk_msg verdict program support to kTLS
      allowing BPF and kTLS to be combined together. Previously kTLS
      and sk_msg verdict programs were mutually exclusive in the
      ULP layer which created challenges for the orchestrator when
      trying to apply TCP based policy, for example. To resolve this,
      leveraging the work from previous patches that consolidates
      the use of sk_msg, we can finally enable BPF sk_msg verdict
      programs so they continue to run after the kTLS socket is
      created. No change in behavior when kTLS is not used in
      combination with BPF, the kselftest suite for kTLS also runs
      successfully.
      
      Joint work with Daniel.
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      d3b18ad3
    • J
      tls: replace poll implementation with read hook · 924ad65e
      John Fastabend 提交于
      Instead of re-implementing poll routine use the poll callback to
      trigger read from kTLS, we reuse the stream_memory_read callback
      which is simpler and achieves the same. This helps to align sockmap
      and kTLS so we can more easily embed BPF in kTLS.
      
      Joint work with Daniel.
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      924ad65e
    • D
      tls: convert to generic sk_msg interface · d829e9c4
      Daniel Borkmann 提交于
      Convert kTLS over to make use of sk_msg interface for plaintext and
      encrypted scattergather data, so it reuses all the sk_msg helpers
      and data structure which later on in a second step enables to glue
      this to BPF.
      
      This also allows to remove quite a bit of open coded helpers which
      are covered by the sk_msg API. Recent changes in kTLs 80ece6a0
      ("tls: Remove redundant vars from tls record structure") and
      4e6d4720 ("tls: Add support for inplace records encryption")
      changed the data path handling a bit; while we've kept the latter
      optimization intact, we had to undo the former change to better
      fit the sk_msg model, hence the sg_aead_in and sg_aead_out have
      been brought back and are linked into the sk_msg sgs. Now the kTLS
      record contains a msg_plaintext and msg_encrypted sk_msg each.
      
      In the original code, the zerocopy_from_iter() has been used out
      of TX but also RX path. For the strparser skb-based RX path,
      we've left the zerocopy_from_iter() in decrypt_internal() mostly
      untouched, meaning it has been moved into tls_setup_from_iter()
      with charging logic removed (as not used from RX). Given RX path
      is not based on sk_msg objects, we haven't pursued setting up a
      dummy sk_msg to call into sk_msg_zerocopy_from_iter(), but it
      could be an option to prusue in a later step.
      
      Joint work with John.
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      d829e9c4
  8. 03 10月, 2018 1 次提交
    • V
      tls: Add support for inplace records encryption · 4e6d4720
      Vakul Garg 提交于
      Presently, for non-zero copy case, separate pages are allocated for
      storing plaintext and encrypted text of records. These pages are stored
      in sg_plaintext_data and sg_encrypted_data scatterlists inside record
      structure. Further, sg_plaintext_data & sg_encrypted_data are passed
      to cryptoapis for record encryption. Allocating separate pages for
      plaintext and encrypted text is inefficient from both required memory
      and performance point of view.
      
      This patch adds support of inplace encryption of records. For non-zero
      copy case, we reuse the pages from sg_encrypted_data scatterlist to
      copy the application's plaintext data. For the movement of pages from
      sg_encrypted_data to sg_plaintext_data scatterlists, we introduce a new
      function move_to_plaintext_sg(). This function add pages into
      sg_plaintext_data from sg_encrypted_data scatterlists.
      
      tls_do_encryption() is modified to pass the same scatterlist as both
      source and destination into aead_request_set_crypt() if inplace crypto
      has been enabled. A new ariable 'inplace_crypto' has been introduced in
      record structure to signify whether the same scatterlist can be used.
      By default, the inplace_crypto is enabled in get_rec(). If zero-copy is
      used (i.e. plaintext data is not copied), inplace_crypto is set to '0'.
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Reviewed-by: NDave Watson <davejwatson@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4e6d4720
  9. 30 9月, 2018 1 次提交
    • V
      tls: Remove redundant vars from tls record structure · 80ece6a0
      Vakul Garg 提交于
      Structure 'tls_rec' contains sg_aead_in and sg_aead_out which point
      to a aad_space and then chain scatterlists sg_plaintext_data,
      sg_encrypted_data respectively. Rather than using chained scatterlists
      for plaintext and encrypted data in aead_req, it is efficient to store
      aad_space in sg_encrypted_data and sg_plaintext_data itself in the
      first index and get rid of sg_aead_in, sg_aead_in and further chaining.
      
      This requires increasing size of sg_encrypted_data & sg_plaintext_data
      arrarys by 1 to accommodate entry for aad_space. The code which uses
      sg_encrypted_data and sg_plaintext_data has been modified to skip first
      index as it points to aad_space.
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      80ece6a0
  10. 29 9月, 2018 1 次提交
  11. 26 9月, 2018 2 次提交
    • V
      tls: Fixed a memory leak during socket close · c774973e
      Vakul Garg 提交于
      During socket close, if there is a open record with tx context, it needs
      to be be freed apart from freeing up plaintext and encrypted scatter
      lists. This patch frees up the open record if present in tx context.
      
      Also tls_free_both_sg() has been renamed to tls_free_open_rec() to
      indicate that the free record in tx context is being freed inside the
      function.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c774973e
    • V
      tls: Fix socket mem accounting error under async encryption · b85135b5
      Vakul Garg 提交于
      Current async encryption implementation sometimes showed up socket
      memory accounting error during socket close. This results in kernel
      warning calltrace. The root cause of the problem is that socket var
      sk_forward_alloc gets corrupted due to access in sk_mem_charge()
      and sk_mem_uncharge() being invoked from multiple concurrent contexts
      in multicore processor. The apis sk_mem_charge() and sk_mem_uncharge()
      are called from functions alloc_plaintext_sg(), free_sg() etc. It is
      required that memory accounting apis are called under a socket lock.
      
      The plaintext sg data sent for encryption is freed using free_sg() in
      tls_encryption_done(). It is wrong to call free_sg() from this function.
      This is because this function may run in irq context. We cannot acquire
      socket lock in this function.
      
      We remove calling of function free_sg() for plaintext data from
      tls_encryption_done() and defer freeing up of plaintext data to the time
      when the record is picked up from tx_list and transmitted/freed. When
      tls_tx_records() gets called, socket is already locked and thus there is
      no concurrent access problem.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b85135b5
  12. 25 9月, 2018 2 次提交
    • V
      tls: Fixed uninitialised vars warning · 4128c0cf
      Vakul Garg 提交于
      In tls_sw_sendmsg() and tls_sw_sendpage(), it is possible that the
      uninitialised variable 'ret' gets passed to sk_stream_error(). So
      initialise local variable 'ret' to '0. The warnings were detected by
      'smatch' tool.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4128c0cf
    • V
      net/tls: Fixed race condition in async encryption · 9932a29a
      Vakul Garg 提交于
      On processors with multi-engine crypto accelerators, it is possible that
      multiple records get encrypted in parallel and their encryption
      completion is notified to different cpus in multicore processor. This
      leads to the situation where tls_encrypt_done() starts executing in
      parallel on different cores. In current implementation, encrypted
      records are queued to tx_ready_list in tls_encrypt_done(). This requires
      addition to linked list 'tx_ready_list' to be protected. As
      tls_decrypt_done() could be executing in irq content, it is not possible
      to protect linked list addition operation using a lock.
      
      To fix the problem, we remove linked list addition operation from the
      irq context. We do tx_ready_list addition/removal operation from
      application context only and get rid of possible multiple access to
      the linked list. Before starting encryption on the record, we add it to
      the tail of tx_ready_list. To prevent tls_tx_records() from transmitting
      it, we mark the record with a new flag 'tx_ready' in 'struct tls_rec'.
      When record encryption gets completed, tls_encrypt_done() has to only
      update the 'tx_ready' flag to true & linked list add operation is not
      required.
      
      The changed logic brings some other side benefits. Since the records
      are always submitted in tls sequence number order for encryption, the
      tx_ready_list always remains sorted and addition of new records to it
      does not have to traverse the linked list.
      
      Lastly, we renamed tx_ready_list in 'struct tls_sw_context_tx' to
      'tx_list'. This is because now, the some of the records at the tail are
      not ready to transmit.
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption")
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9932a29a
  13. 22 9月, 2018 1 次提交
    • V
      net/tls: Add support for async encryption of records for performance · a42055e8
      Vakul Garg 提交于
      In current implementation, tls records are encrypted & transmitted
      serially. Till the time the previously submitted user data is encrypted,
      the implementation waits and on finish starts transmitting the record.
      This approach of encrypt-one record at a time is inefficient when
      asynchronous crypto accelerators are used. For each record, there are
      overheads of interrupts, driver softIRQ scheduling etc. Also the crypto
      accelerator sits idle most of time while an encrypted record's pages are
      handed over to tcp stack for transmission.
      
      This patch enables encryption of multiple records in parallel when an
      async capable crypto accelerator is present in system. This is achieved
      by allowing the user space application to send more data using sendmsg()
      even while previously issued data is being processed by crypto
      accelerator. This requires returning the control back to user space
      application after submitting encryption request to accelerator. This
      also means that zero-copy mode of encryption cannot be used with async
      accelerator as we must be done with user space application buffer before
      returning from sendmsg().
      
      There can be multiple records in flight to/from the accelerator. Each of
      the record is represented by 'struct tls_rec'. This is used to store the
      memory pages for the record.
      
      After the records are encrypted, they are added in a linked list called
      tx_ready_list which contains encrypted tls records sorted as per tls
      sequence number. The records from tx_ready_list are transmitted using a
      newly introduced function called tls_tx_records(). The tx_ready_list is
      polled for any record ready to be transmitted in sendmsg(), sendpage()
      after initiating encryption of new tls records. This achieves parallel
      encryption and transmission of records when async accelerator is
      present.
      
      There could be situation when crypto accelerator completes encryption
      later than polling of tx_ready_list by sendmsg()/sendpage(). Therefore
      we need a deferred work context to be able to transmit records from
      tx_ready_list. The deferred work context gets scheduled if applications
      are not sending much data through the socket. If the applications issue
      sendmsg()/sendpage() in quick succession, then the scheduling of
      tx_work_handler gets cancelled as the tx_ready_list would be polled from
      application's context itself. This saves scheduling overhead of deferred
      work.
      
      The patch also brings some side benefit. We are able to get rid of the
      concept of CLOSED record. This is because the records once closed are
      either encrypted and then placed into tx_ready_list or if encryption
      fails, the socket error is set. This simplifies the kernel tls
      sendpath. However since tls_device.c is still using macros, accessory
      functions for CLOSED records have been retained.
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a42055e8
  14. 17 9月, 2018 2 次提交
    • D
      tls: fix currently broken MSG_PEEK behavior · 50c6b58a
      Daniel Borkmann 提交于
      In kTLS MSG_PEEK behavior is currently failing, strace example:
      
        [pid  2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
        [pid  2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
        [pid  2430] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
        [pid  2430] listen(4, 10)               = 0
        [pid  2430] getsockname(4, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
        [pid  2430] connect(3, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
        [pid  2430] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
        [pid  2430] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
        [pid  2430] accept(4, {sa_family=AF_INET, sin_port=htons(49636), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
        [pid  2430] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
        [pid  2430] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
        [pid  2430] close(4)                    = 0
        [pid  2430] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
        [pid  2430] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
        [pid  2430] recvfrom(5, "test_read_peektest_read_peektest"..., 64, MSG_PEEK, NULL, NULL) = 64
      
      As can be seen from strace, there are two TLS records sent,
      i) 'test_read_peek' and ii) '_mult_recs\0' where we end up
      peeking 'test_read_peektest_read_peektest'. This is clearly
      wrong, and what happens is that given peek cannot call into
      tls_sw_advance_skb() to unpause strparser and proceed with
      the next skb, we end up looping over the current one, copying
      the 'test_read_peek' over and over into the user provided
      buffer.
      
      Here, we can only peek into the currently held skb (current,
      full TLS record) as otherwise we would end up having to hold
      all the original skb(s) (depending on the peek depth) in a
      separate queue when unpausing strparser to process next
      records, minimally intrusive is to return only up to the
      current record's size (which likely was what c46234eb
      ("tls: RX path for ktls") originally intended as well). Thus,
      after patch we properly peek the first record:
      
        [pid  2046] wait4(2075,  <unfinished ...>
        [pid  2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
        [pid  2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
        [pid  2075] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
        [pid  2075] listen(4, 10)               = 0
        [pid  2075] getsockname(4, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
        [pid  2075] connect(3, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
        [pid  2075] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
        [pid  2075] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
        [pid  2075] accept(4, {sa_family=AF_INET, sin_port=htons(45732), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
        [pid  2075] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
        [pid  2075] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
        [pid  2075] close(4)                    = 0
        [pid  2075] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
        [pid  2075] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
        [pid  2075] recvfrom(5, "test_read_peek", 64, MSG_PEEK, NULL, NULL) = 14
      
      Fixes: c46234eb ("tls: RX path for ktls")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      50c6b58a
    • J
      tls: async support causes out-of-bounds access in crypto APIs · 7a3dd8c8
      John Fastabend 提交于
      When async support was added it needed to access the sk from the async
      callback to report errors up the stack. The patch tried to use space
      after the aead request struct by directly setting the reqsize field in
      aead_request. This is an internal field that should not be used
      outside the crypto APIs. It is used by the crypto code to define extra
      space for private structures used in the crypto context. Users of the
      API then use crypto_aead_reqsize() and add the returned amount of
      bytes to the end of the request memory allocation before posting the
      request to encrypt/decrypt APIs.
      
      So this breaks (with general protection fault and KASAN error, if
      enabled) because the request sent to decrypt is shorter than required
      causing the crypto API out-of-bounds errors. Also it seems unlikely the
      sk is even valid by the time it gets to the callback because of memset
      in crypto layer.
      
      Anyways, fix this by holding the sk in the skb->sk field when the
      callback is set up and because the skb is already passed through to
      the callback handler via void* we can access it in the handler. Then
      in the handler we need to be careful to NULL the pointer again before
      kfree_skb. I added comments on both the setup (in tls_do_decryption)
      and when we clear it from the crypto callback handler
      tls_decrypt_done(). After this selftests pass again and fixes KASAN
      errors/warnings.
      
      Fixes: 94524d8f ("net/tls: Add support for async decryption of tls records")
      Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
      Reviewed-by: NVakul Garg <Vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7a3dd8c8
  15. 14 9月, 2018 2 次提交
  16. 12 9月, 2018 1 次提交
  17. 09 9月, 2018 1 次提交
    • V
      net/tls: Set count of SG entries if sk_alloc_sg returns -ENOSPC · 52ea992c
      Vakul Garg 提交于
      tls_sw_sendmsg() allocates plaintext and encrypted SG entries using
      function sk_alloc_sg(). In case the number of SG entries hit
      MAX_SKB_FRAGS, sk_alloc_sg() returns -ENOSPC and sets the variable for
      current SG index to '0'. This leads to calling of function
      tls_push_record() with 'sg_encrypted_num_elem = 0' and later causes
      kernel crash. To fix this, set the number of SG elements to the number
      of elements in plaintext/encrypted SG arrays in case sk_alloc_sg()
      returns -ENOSPC.
      
      Fixes: 3c4d7559 ("tls: kernel TLS support")
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52ea992c
  18. 02 9月, 2018 1 次提交
    • V
      net/tls: Add support for async decryption of tls records · 94524d8f
      Vakul Garg 提交于
      When tls records are decrypted using asynchronous acclerators such as
      NXP CAAM engine, the crypto apis return -EINPROGRESS. Presently, on
      getting -EINPROGRESS, the tls record processing stops till the time the
      crypto accelerator finishes off and returns the result. This incurs a
      context switch and is not an efficient way of accessing the crypto
      accelerators. Crypto accelerators work efficient when they are queued
      with multiple crypto jobs without having to wait for the previous ones
      to complete.
      
      The patch submits multiple crypto requests without having to wait for
      for previous ones to complete. This has been implemented for records
      which are decrypted in zero-copy mode. At the end of recvmsg(), we wait
      for all the asynchronous decryption requests to complete.
      
      The references to records which have been sent for async decryption are
      dropped. For cases where record decryption is not possible in zero-copy
      mode, asynchronous decryption is not used and we wait for decryption
      crypto api to complete.
      
      For crypto requests executing in async fashion, the memory for
      aead_request, sglists and skb etc is freed from the decryption
      completion handler. The decryption completion handler wakesup the
      sleeping user context when recvmsg() flags that it has done sending
      all the decryption requests and there are no more decryption requests
      pending to be completed.
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Reviewed-by: NDave Watson <davejwatson@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      94524d8f
  19. 30 8月, 2018 1 次提交
  20. 13 8月, 2018 1 次提交
    • V
      net/tls: Combined memory allocation for decryption request · 0b243d00
      Vakul Garg 提交于
      For preparing decryption request, several memory chunks are required
      (aead_req, sgin, sgout, iv, aad). For submitting the decrypt request to
      an accelerator, it is required that the buffers which are read by the
      accelerator must be dma-able and not come from stack. The buffers for
      aad and iv can be separately kmalloced each, but it is inefficient.
      This patch does a combined allocation for preparing decryption request
      and then segments into aead_req || sgin || sgout || iv || aad.
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0b243d00
  21. 06 8月, 2018 1 次提交
  22. 02 8月, 2018 1 次提交
  23. 31 7月, 2018 1 次提交
    • V
      net/tls: Use socket data_ready callback on record availability · ad13acce
      Vakul Garg 提交于
      On receipt of a complete tls record, use socket's saved data_ready
      callback instead of state_change callback. In function tls_queue(),
      the TLS record is queued in encrypted state. But the decryption
      happen inline when tls_sw_recvmsg() or tls_sw_splice_read() get invoked.
      So it should be ok to notify the waiting context about the availability
      of data as soon as we could collect a full TLS record. For new data
      availability notification, sk_data_ready callback is more appropriate.
      It points to sock_def_readable() which wakes up specifically for EPOLLIN
      event. This is in contrast to the socket callback sk_state_change which
      points to sock_def_wakeup() which issues a wakeup unconditionally
      (without event mask).
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad13acce
  24. 29 7月, 2018 2 次提交
  25. 27 7月, 2018 2 次提交
  26. 21 7月, 2018 1 次提交
  27. 17 7月, 2018 1 次提交
    • D
      tls: Stricter error checking in zerocopy sendmsg path · 32da1221
      Dave Watson 提交于
      In the zerocopy sendmsg() path, there are error checks to revert
      the zerocopy if we get any error code.  syzkaller has discovered
      that tls_push_record can return -ECONNRESET, which is fatal, and
      happens after the point at which it is safe to revert the iter,
      as we've already passed the memory to do_tcp_sendpages.
      
      Previously this code could return -ENOMEM and we would want to
      revert the iter, but AFAIK this no longer returns ENOMEM after
      a447da7d ("tls: fix waitall behavior in tls_sw_recvmsg"),
      so we fail for all error codes.
      
      Reported-by: syzbot+c226690f7b3126c5ee04@syzkaller.appspotmail.com
      Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com
      Signed-off-by: NDave Watson <davejwatson@fb.com>
      Fixes: 3c4d7559 ("tls: kernel TLS support")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      32da1221
  28. 16 7月, 2018 5 次提交