1. 11 Apr 2022, 6 commits
  2. 08 Apr 2022, 10 commits
  3. 01 Apr 2022, 1 commit
    • net/tls: fix slab-out-of-bounds bug in decrypt_internal · 9381fe8c
      Committed by Ziyang Xuan
      tls_set_sw_offload() allocates only 12 bytes for tls_ctx->rx.iv when the
      cipher is AES128-CCM, but crypto_aead_ivsize() returns 16 for "ccm(aes)".
      A memcpy() that reads 16 bytes out of that 12-byte allocation therefore
      triggers a slab-out-of-bounds bug, as follows:
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in decrypt_internal+0x385/0xc40 [tls]
      Read of size 16 at addr ffff888114e84e60 by task tls/10911
      
      Call Trace:
       <TASK>
       dump_stack_lvl+0x34/0x44
       print_report.cold+0x5e/0x5db
       ? decrypt_internal+0x385/0xc40 [tls]
       kasan_report+0xab/0x120
       ? decrypt_internal+0x385/0xc40 [tls]
       kasan_check_range+0xf9/0x1e0
       memcpy+0x20/0x60
       decrypt_internal+0x385/0xc40 [tls]
       ? tls_get_rec+0x2e0/0x2e0 [tls]
       ? process_rx_list+0x1a5/0x420 [tls]
       ? tls_setup_from_iter.constprop.0+0x2e0/0x2e0 [tls]
       decrypt_skb_update+0x9d/0x400 [tls]
       tls_sw_recvmsg+0x3c8/0xb50 [tls]
      
      Allocated by task 10911:
       kasan_save_stack+0x1e/0x40
       __kasan_kmalloc+0x81/0xa0
       tls_set_sw_offload+0x2eb/0xa20 [tls]
       tls_setsockopt+0x68c/0x700 [tls]
       __sys_setsockopt+0xfe/0x1b0
      
      Fix this by using prot->iv_size + prot->salt_size instead of
      crypto_aead_ivsize() as the memcpy() length when copying the IV in the
      TLS_1_3_VERSION case.
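
      A minimal sketch of the described change (not the verbatim hunk; the
      surrounding context of decrypt_internal() is elided and the local
      variable names are assumed):

        /* TLS 1.3: rx.iv holds salt + per-record IV, i.e. prot->iv_size +
         * prot->salt_size bytes in total, so bound the copy by that instead
         * of by crypto_aead_ivsize(), which reports 16 for "ccm(aes)" and
         * overruns the 12-byte allocation. */
        if (prot->version == TLS_1_3_VERSION)
                memcpy(iv + iv_offset, tls_ctx->rx.iv,
                       prot->iv_size + prot->salt_size);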
      
      Fixes: f295b3ae ("net/tls: Add support of AES128-CCM based ciphers")
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 22 Mar 2022, 2 commits
  5. 04 Feb 2022, 1 commit
    • tls: cap the output scatter list to something reasonable · b93235e6
      Committed by Jakub Kicinski
      TLS recvmsg() passes user pages as destination for decrypt.
      The decrypt operation is repeated record by record, each
      record being 16kB, max. TLS allocates an sg_table and uses
      iov_iter_get_pages() to populate it with enough pages to
      fit the decrypted record.
      
      Even though we decrypt a single message at a time we size
      the sg_table based on the entire length of the iovec.
      This leads to unnecessarily large allocations, risking
      triggering OOM conditions.
      
      Use iov_iter_truncate() / iov_iter_reexpand() to construct a "capped"
      version of iov_iter_npages(). Alternatively we could parametrize
      iov_iter_npages() to take the size as an argument instead of using
      i->count, or do something else.
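
      A sketch of what such a capped helper could look like (the helper name
      is hypothetical; iov_iter_count(), iov_iter_truncate(), iov_iter_npages()
      and iov_iter_reexpand() are the existing iov_iter APIs):

        static int tls_npages_capped(struct iov_iter *iter, int max_pages,
                                     size_t max_bytes)
        {
                size_t orig_count = iov_iter_count(iter);
                int npages;

                if (orig_count <= max_bytes)
                        return iov_iter_npages(iter, max_pages);

                /* Count pages for one record's worth of data only, then
                 * restore the iterator for the actual decrypt/copy. */
                iov_iter_truncate(iter, max_bytes);
                npages = iov_iter_npages(iter, max_pages);
                iov_iter_reexpand(iter, orig_count);
                return npages;
        }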
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 17 Jan 2022, 1 commit
  7. 08 Jan 2022, 1 commit
    • net/tls: Fix skb memory leak when running kTLS traffic · ffef737f
      Committed by Gal Pressman
      The cited Fixes commit introduced a memory leak when running kTLS
      traffic (with/without hardware offloads).
      I'm running nginx on the server side and wrk on the client side and get
      the following:
      
        unreferenced object 0xffff8881935e9b80 (size 224):
        comm "softirq", pid 0, jiffies 4294903611 (age 43.204s)
        hex dump (first 32 bytes):
          80 9b d0 36 81 88 ff ff 00 00 00 00 00 00 00 00  ...6............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000efe2a999>] build_skb+0x1f/0x170
          [<00000000ef521785>] mlx5e_skb_from_cqe_mpwrq_linear+0x2bc/0x610 [mlx5_core]
          [<00000000945d0ffe>] mlx5e_handle_rx_cqe_mpwrq+0x264/0x9e0 [mlx5_core]
          [<00000000cb675b06>] mlx5e_poll_rx_cq+0x3ad/0x17a0 [mlx5_core]
          [<0000000018aac6a9>] mlx5e_napi_poll+0x28c/0x1b60 [mlx5_core]
          [<000000001f3369d1>] __napi_poll+0x9f/0x560
          [<00000000cfa11f72>] net_rx_action+0x357/0xa60
          [<000000008653b8d7>] __do_softirq+0x282/0x94e
          [<00000000644923c6>] __irq_exit_rcu+0x11f/0x170
          [<00000000d4085f8f>] irq_exit_rcu+0xa/0x20
          [<00000000d412fef4>] common_interrupt+0x7d/0xa0
          [<00000000bfb0cebc>] asm_common_interrupt+0x1e/0x40
          [<00000000d80d0890>] default_idle+0x53/0x70
          [<00000000f2b9780e>] default_idle_call+0x8c/0xd0
          [<00000000c7659e15>] do_idle+0x394/0x450
      
      I'm not familiar with these areas of the code, but I've added this
      sk_defer_free_flush() to tls_sw_recvmsg() based on a hunch and it
      resolved the issue.
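
      Roughly, the change amounts to the following (a sketch; the placement is
      shown by analogy with tcp_recvmsg(), and the exact spot in
      tls_sw_recvmsg() may differ):

        release_sock(sk);
        /* free the skbs that were queued on sk->defer_list while the socket
         * lock was held, instead of leaking them */
        sk_defer_free_flush(sk);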
      
      Fixes: f35f8219 ("tcp: defer skb freeing after socket lock is released")
      Signed-off-by: Gal Pressman <gal@nvidia.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220102081253.9123-1-gal@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  8. 30 Nov 2021, 1 commit
  9. 29 Nov 2021, 1 commit
  10. 26 Nov 2021, 3 commits
    • tls: fix replacing proto_ops · f3911f73
      Committed by Jakub Kicinski
      We replace proto_ops whenever TLS is configured for RX. But our
      replacement also overrides sendpage_locked, which will crash
      unless TX is also configured. Similarly we plug both of those
      in for TLS_HW (NIC crypto offload) even though TLS_HW has a completely
      different implementation for TX.
      
      Last but not least we always plug in something based on inet_stream_ops
      even though a few of the callbacks differ for IPv6 (getname, release,
      bind).
      
      Use a callback building method similar to what we do for struct proto.
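
      A sketch of that idea (the builder name is hypothetical; TLS_BASE,
      TLS_SW, TLS_NUM_CONFIG and the tls_sw_* callbacks are existing
      identifiers): start from the socket's current ops so IPv4/IPv6
      differences are preserved, and override only what each configuration
      actually needs.

        static void build_ktls_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
                                         const struct proto_ops *base)
        {
                ops[TLS_BASE][TLS_BASE] = *base;

                /* TX configured: sendpage_locked may be overridden */
                ops[TLS_SW][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
                ops[TLS_SW][TLS_BASE].sendpage_locked = tls_sw_sendpage_locked;

                /* RX configured: only RX-side callbacks change */
                ops[TLS_BASE][TLS_SW] = ops[TLS_BASE][TLS_BASE];
                ops[TLS_BASE][TLS_SW].splice_read = tls_sw_splice_read;

                /* both directions configured */
                ops[TLS_SW][TLS_SW] = ops[TLS_SW][TLS_BASE];
                ops[TLS_SW][TLS_SW].splice_read = tls_sw_splice_read;
        }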
      
      Fixes: c46234eb ("tls: RX path for ktls")
      Fixes: d4ffb02d ("net/tls: enable sk_msg redirect to tls socket egress")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tls: splice_read: fix accessing pre-processed records · e062fe99
      Committed by Jakub Kicinski
      recvmsg() will put peek()ed and partially read records onto the rx_list.
      splice_read() needs to consult that list otherwise it may miss data.
      Align with recvmsg() and also put partially-read records onto rx_list.
      tls_sw_advance_skb() is pretty pointless now and will be removed in
      net-next.
      
      Fixes: 692d7b5d ("tls: Fix recvmsg() to be able to peek across multiple records")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tls: splice_read: fix record type check · 520493f6
      Committed by Jakub Kicinski
      We don't support splicing control records. TLS 1.3 changes moved
      the record type check into the decrypt if(). The skb may already
      be decrypted and still be an alert.
      
      Note that decrypt_skb_update() is idempotent and updates ctx->decrypted
      so the if() is pointless.
      
      Reorder the check for decryption errors with the content type check
      while touching them. This part is not really a bug, because if
      decryption failed in TLS 1.3 the content type will be DATA, and for
      TLS 1.2 it will be correct. Nevertheless, it's strange to touch the
      output before checking whether the function has failed.
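
      Conceptually, the splice_read() path then looks like this (a sketch;
      the local variable names are assumed from the surrounding RX code):

        err = decrypt_skb_update(sk, skb, NULL, &chunk, &zc, false);
        if (err < 0) {
                /* fail on decryption errors first ... */
                tls_err_abort(sk, -EBADMSG);
                goto splice_read_end;
        }

        /* ... then reject control records, even if the skb was already
         * decrypted (e.g. an alert left behind by recvmsg()) */
        if (ctx->control != TLS_RECORD_TYPE_DATA) {
                err = -EINVAL;
                goto splice_read_end;
        }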
      
      Fixes: fedf201e ("net: tls: Refactor control message handling on recv")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  11. 28 Oct 2021, 2 commits
    • net/tls: Fix flipped sign in async_wait.err assignment · 1d9d6fd2
      Committed by Daniel Jordan
      sk->sk_err contains a positive number, yet async_wait.err wants the
      opposite.  Fix the missed sign flip, which Jakub caught by inspection.
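
      In other words (a minimal sketch; the surrounding call site is elided):

        /* sk->sk_err holds a positive errno; async_wait.err expects -errno */
        if (sk->sk_err)
                ctx->async_wait.err = -sk->sk_err;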
      
      Fixes: a42055e8 ("net/tls: Add support for async encryption of records for performance")
      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tls: Fix flipped sign in tls_err_abort() calls · da353fac
      Committed by Daniel Jordan
      sk->sk_err appears to expect a positive value, a convention that ktls
      doesn't always follow and that leads to memory corruption in other code.
      For instance,
      
          [kworker]
          tls_encrypt_done(..., err=<negative error from crypto request>)
            tls_err_abort(.., err)
              sk->sk_err = err;
      
          [task]
          splice_from_pipe_feed
            ...
              tls_sw_do_sendpage
                if (sk->sk_err) {
                  ret = -sk->sk_err;  // ret is positive
      
          splice_from_pipe_feed (continued)
            ret = actor(...)  // ret is still positive and interpreted as bytes
                              // written, resulting in underflow of buf->len and
                              // sd->len, leading to huge buf->offset and bogus
                              // addresses computed in later calls to actor()
      
      Fix all tls_err_abort() callers to pass a negative error code
      consistently and centralize the error-prone sign flip there, throwing in
      a warning to catch future misuse and uninlining the function so it
      really does only warn once.
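
      The centralized helper then ends up shaped roughly like this (a sketch,
      assuming sk_error_report() is available in this tree):

        void tls_err_abort(struct sock *sk, int err)
        {
                /* catch callers that still pass a non-negative value */
                WARN_ON_ONCE(err >= 0);
                /* sk->sk_err conventionally holds a positive errno */
                sk->sk_err = -err;
                sk_error_report(sk);
        }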
      
      Cc: stable@vger.kernel.org
      Fixes: c46234eb ("tls: RX path for ktls")
      Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com
      Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 27 Oct 2021, 1 commit
  13. 25 Oct 2021, 1 commit
  14. 28 Sep 2021, 1 commit
  15. 16 Sep 2021, 1 commit
  16. 22 Jun 2021, 1 commit
  17. 08 Jun 2021, 1 commit
  18. 02 Jun 2021, 2 commits
    • net/tls: Fix use-after-free after the TLS device goes down and up · c55dcdd4
      Committed by Maxim Mikityanskiy
      When a netdev with active TLS offload goes down, tls_device_down is
      called to stop the offload and tear down the TLS context. However, the
      socket stays alive, and it still points to the TLS context, which is now
      deallocated. If a netdev goes up, while the connection is still active,
      and the data flow resumes after a number of TCP retransmissions, it will
      lead to a use-after-free of the TLS context.
      
      This commit addresses this bug by keeping the context alive until its
      normal destruction, and implements the necessary fallbacks, so that the
      connection can resume in software (non-offloaded) kTLS mode.
      
      On the TX side tls_sw_fallback is used to encrypt all packets. The RX
      side already has all the necessary fallbacks, because receiving
      non-decrypted packets is supported. The thing needed on the RX side is
      to block resync requests, which are normally produced after receiving
      non-decrypted packets.
      
      The necessary synchronization is implemented for a graceful teardown:
      first the fallbacks are deployed, then the driver resources are released
      (it used to be possible to have a tls_dev_resync after tls_dev_del).
      
      A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
      mode. It's used to skip the RX resync logic completely, as it becomes
      useless, and some objects may be released (for example, resync_async,
      which is allocated and freed by the driver).
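
      For example, the RX resync entry points can bail out early while the
      flag is set (a sketch; the exact call sites are in the device-offload
      RX path):

        /* device offload torn down: resync state may already be freed */
        if (unlikely(test_bit(TLS_RX_DEV_DEGRADED, &tls_ctx->flags)))
                return;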
      
      Fixes: e8f69799 ("net/tls: Add generic NIC offload infrastructure")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tls: Replace TLS_RX_SYNC_RUNNING with RCU · 05fc8b6c
      Committed by Maxim Mikityanskiy
      RCU synchronization is guaranteed to finish in finite time, unlike a
      busy loop that polls a flag. This patch is a preparation for the bugfix
      in the next patch, where the same synchronize_net() call will also be
      used to sync with the TX datapath.
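
      Schematically (a sketch only; the helper name and arguments, and where
      exactly the read-side section sits, are assumptions):

        /* datapath: the resync request runs inside an RCU read section */
        rcu_read_lock();
        tls_device_resync_rx(tls_ctx, sk, seq, rcd_sn);
        rcu_read_unlock();

        /* teardown: one grace period waits out all in-flight callers in
         * bounded time, instead of polling a TLS_RX_SYNC_RUNNING flag */
        synchronize_net();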
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 15 May 2021, 1 commit
  20. 13 May 2021, 1 commit
  21. 28 Apr 2021, 1 commit