Commits · e54937963fa249595824439dc839c948188dea83 · openeuler / Kernel

24 Feb, 2021 · 1 commit

net: remove cmsg restriction from io_uring based send/recvmsg calls · e5493796

Authored by Jens Axboe on Feb 17, 2021

No need to restrict these anymore, as the worker threads are direct
clones of the original task. Hence we know for a fact that we can
support anything that the regular task can.

Since the only user of proto_ops->flags was to flag PROTO_CMSG_DATA_ONLY,
kill the member and the flag definition too.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e5493796

19 Jan, 2021 · 1 commit

bpf: Remove extra lock_sock for TCP_ZEROCOPY_RECEIVE · 9cacf81f

Authored by Stanislav Fomichev on Jan 15, 2021

Add custom implementation of getsockopt hook for TCP_ZEROCOPY_RECEIVE.
We skip generic hooks for TCP_ZEROCOPY_RECEIVE and have a custom
call in do_tcp_getsockopt using the on-stack data. This removes
3% overhead for locking/unlocking the socket.

Without this patch:
     3.38%     0.07%  tcp_mmap  [kernel.kallsyms]  [k] __cgroup_bpf_run_filter_getsockopt
            |
             --3.30%--__cgroup_bpf_run_filter_getsockopt
                       |
                        --0.81%--__kmalloc

With the patch applied:
     0.52%     0.12%  tcp_mmap  [kernel.kallsyms]  [k] __cgroup_bpf_run_filter_getsockopt_kern

Note, exporting uapi/tcp.h requires removing netinet/tcp.h
from test_progs.h because those headers have conflicting
definitions.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210115163501.805133-2-sdf@google.com

9cacf81f

05 Dec, 2020 · 1 commit

net: Remove the err argument from sock_from_file · dba4a925

Authored by Florent Revest on Dec 04, 2020

Currently, the sock_from_file prototype takes an "err" pointer that is
either not set or set to -ENOTSOCK IFF the returned socket is NULL. This
makes the error redundant and it is ignored by a few callers.

This patch simplifies the API by letting callers deduce the error based
on whether the returned socket is NULL or not.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Florent Revest <revest@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/bpf/20201204113609.1850150-1-revest@google.com

dba4a925
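
The API change described in dba4a925 can be illustrated with a small userspace sketch. Everything prefixed with `demo_` is a hypothetical stand-in (the real kernel types and checks differ); the point is only that a NULL return carries the same information the old "err" out-parameter did.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel's struct socket and struct file. */
struct demo_socket {
	int type;
};

struct demo_file {
	int is_socket;                     /* stands in for the kernel's file-ops check */
	struct demo_socket *private_data;
};

/* New-style lookup: a NULL return is the only failure signal. */
struct demo_socket *demo_sock_from_file(const struct demo_file *file)
{
	assert(file != NULL);
	return file->is_socket ? file->private_data : NULL;
}

/* A caller that needs an errno derives it from the NULL result. */
int demo_socket_type(const struct demo_file *file)
{
	struct demo_socket *sock = demo_sock_from_file(file);

	if (!sock)
		return -ENOTSOCK;
	return sock->type;
}
```

Callers that previously ignored the error can simply test the returned pointer, which is the simplification the commit message argues for.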

24 Nov, 2020 · 2 commits

net: don't include ethtool.h from netdevice.h · cc69837f

Authored by Jakub Kicinski on Nov 20, 2020

linux/netdevice.h is included in very many places, touching any
of its dependencies causes large incremental builds.

Drop the linux/ethtool.h include, linux/netdevice.h just needs
a forward declaration of struct ethtool_ops.

Fix all the places which made use of this implicit include.
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

cc69837f

net: provide __sys_shutdown_sock() that takes a socket · b713c195

Authored by Jens Axboe on Sep 05, 2020

No functional changes in this patch, needed to provide io_uring support
for shutdown(2).

Cc: netdev@vger.kernel.org
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b713c195

18 Nov, 2020 · 1 commit

net: wan: Delete the DLCI / SDLA drivers · f7365919

Authored by Xie He on Nov 14, 2020

The DLCI driver (dlci.c) implements the Frame Relay protocol. However,
we already have another newer and better implementation of Frame Relay
provided by the HDLC_FR driver (hdlc_fr.c).

The DLCI driver's implementation of Frame Relay is used by only one
hardware driver in the kernel - the SDLA driver (sdla.c).

The SDLA driver provides Frame Relay support for the Sangoma S50x devices.
However, the vendor provides their own driver (along with their own
multi-WAN-protocol implementations including Frame Relay), called WANPIPE.
I believe most users of the hardware would use the vendor-provided WANPIPE
driver instead.

(The WANPIPE driver was even once in the kernel, but was deleted in
commit 8db60bcf ("[WAN]: Remove broken and unmaintained Sangoma
drivers.") because the vendor no longer updated the in-kernel WANPIPE
driver.)

Cc: Mike McLagan <mike.mclagan@linux.org>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Link: https://lore.kernel.org/r/20201114150921.685594-1-xie.he.0141@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

f7365919

03 Oct, 2020 · 1 commit

net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send · 7b62d31d

Authored by Coly Li on Oct 02, 2020

If a page passed to kernel_sendpage() is a slab page or has no
ref_count, it is not suitable for the zero-copy sendpage() method.
Such a page might be unexpectedly released in the network code path
and cause an unpredictable panic due to corruption of kernel memory
management data structures.

This patch adds a WARN_ONCE() on the page before it is handed to the
concrete zero-copy sendpage() method. If the page is not suitable for
zero-copy sendpage(), a warning message can be observed before the
resulting unpredictable kernel panic.

This patch does not change the existing kernel_sendpage() behavior for
an improper zero-copy send; it only provides a warning as a hint ahead
of a potential panic due to kernel memory heap corruption.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Cong Wang <amwang@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7b62d31d

27 Aug, 2020 · 1 commit

net: Fix some comments · 645f0897

Authored by Miaohe Lin on Aug 27, 2020

Fix some comments, including wrong function name, duplicated word and so
on.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

645f0897

25 Aug, 2020 · 1 commit

io_uring: allow tcp ancillary data for __sys_recvmsg_sock() · 583bbf06

Authored by Luke Hsiao on Aug 21, 2020

For TCP tx zero-copy, the kernel notifies the process of completions by
queuing completion notifications on the socket error queue. This patch
allows reading these notifications via recvmsg to support TCP tx
zero-copy.

Ancillary data was originally disallowed due to privilege escalation
via io_uring's offloading of sendmsg() onto a kernel thread with kernel
credentials (https://crbug.com/project-zero/1975). So, we must ensure
that the socket type is one where the ancillary data types that are
delivered on recvmsg are plain data (no file descriptors or values that
are translated based on the identity of the calling process).

This was tested by using io_uring to call recvmsg on the MSG_ERRQUEUE
with tx zero-copy enabled. Before this patch, we received -EINVALID from
this specific code path. After this patch, we could read tcp tx
zero-copy completion notifications from the MSG_ERRQUEUE.
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Arjun Roy <arjunroy@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Luke Hsiao <lukehsiao@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

583bbf06

11 Aug, 2020 · 1 commit

net: Revert "net: optimize the sockptr_t for unified kernel/user address spaces" · 519a8a6c

Authored by Christoph Hellwig on Aug 10, 2020

This reverts commits 6d04fe15 and
a31edb20.

It turns out the idea to share a single pointer for both kernel and user
space address causes various kinds of problems.  So use the slightly less
optimal version that uses an extra bit, but which is guaranteed to be safe
everywhere.

Fixes: 6d04fe15 ("net: optimize the sockptr_t for unified kernel/user address spaces")
Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

519a8a6c
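
To make the "extra bit" variant mentioned in 519a8a6c concrete, here is a simplified userspace model of a tagged pointer. The `demo_` names are hypothetical, and in the kernel the user branch would go through copy_from_user() rather than memcpy(); this is a sketch of the idea, not the kernel's definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified model: one pointer slot plus one bit that says which kind it is. */
typedef struct {
	union {
		void *kernel;
		void *user;        /* would be a "__user" pointer in the kernel */
	};
	bool is_kernel : 1;
} demo_sockptr_t;

demo_sockptr_t demo_kernel_sockptr(void *p)
{
	return (demo_sockptr_t){ .kernel = p, .is_kernel = true };
}

demo_sockptr_t demo_user_sockptr(void *p)
{
	return (demo_sockptr_t){ .user = p, .is_kernel = false };
}

/* Dispatch on the tag bit instead of guessing from the address value. */
int demo_copy_from_sockptr(void *dst, demo_sockptr_t src, size_t size)
{
	if (src.is_kernel)
		memcpy(dst, src.kernel, size);   /* plain copy for kernel memory */
	else
		memcpy(dst, src.user, size);     /* the kernel would use copy_from_user() */
	return 0;
}

/* Usage: wrap a kernel-side option value and copy it out. */
void demo_sockptr_example(void)
{
	int optval = 1, copy = 0;

	assert(demo_copy_from_sockptr(&copy, demo_kernel_sockptr(&optval), sizeof(copy)) == 0);
	assert(copy == 1);
}
```

Because the kind of pointer is recorded explicitly, the model never has to infer it from the address, which is the safety property the revert message refers to.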

09 Aug, 2020 · 4 commits

net: Convert to use the fallthrough macro · 7c7ab580

Authored by Miaohe Lin on Aug 08, 2020

Convert the uses of fallthrough comments to fallthrough macro.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c7ab580
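
As a rough illustration of what replacing a fallthrough comment with a macro looks like, the sketch below defines a userspace approximation named `demo_fallthrough` (a hypothetical name; the kernel's own macro is simply `fallthrough`) and uses it in a switch. The option numbers and weights are made up for the example.

```c
#include <assert.h>

/* Use the compiler attribute when available, otherwise fall back to a no-op. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define demo_fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef demo_fallthrough
# define demo_fallthrough do {} while (0)  /* fallthrough */
#endif

/* Case 2 deliberately continues into case 1. */
int demo_option_weight(int opt)
{
	int weight = 0;

	switch (opt) {
	case 2:
		weight += 10;
		demo_fallthrough;
	case 1:
		weight += 1;
		break;
	default:
		weight = -1;
		break;
	}
	return weight;
}

/* Usage: option 2 accumulates both increments. */
void demo_fallthrough_example(void)
{
	assert(demo_option_weight(2) == 11);
	assert(demo_option_weight(1) == 1);
}
```

Unlike a comment, the statement form is visible to the compiler, so an accidental missing `break` elsewhere can still be diagnosed.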

net: Remove meaningless jump label out_fs · 47260ba9

Authored by Miaohe Lin on Aug 06, 2020

The out_fs jump label has nothing to do but goto out.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

47260ba9

net: Set fput_needed iff FDPUT_FPUT is set · ce787a5a

Authored by Miaohe Lin on Aug 06, 2020

We should fput() file iff FDPUT_FPUT is set. So we should set fput_needed
accordingly.

Fixes: 00e188ef ("sockfd_lookup_light(): switch to fdget^W^Waway from fget_light")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce787a5a
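
The "iff FDPUT_FPUT is set" rule in ce787a5a comes down to masking one bit out of a flags word. The sketch below is a userspace model under stated assumptions: the `DEMO_` flag values and the "old" helper are illustrative guesses at the shape of the problem, not a copy of the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits; treat the exact values as an assumption. */
#define DEMO_FDPUT_FPUT       1u   /* a reference was taken, fput() is needed */
#define DEMO_FDPUT_POS_UNLOCK 2u   /* unrelated bit carried in the same word */

struct demo_fd {
	void *file;
	unsigned int flags;
};

/* Assumed pre-fix shape: any set bit makes the caller think fput() is needed. */
int demo_fput_needed_old(struct demo_fd f)
{
	return (int)f.flags;
}

/* Post-fix shape: only the FPUT bit counts. */
int demo_fput_needed(struct demo_fd f)
{
	return (int)(f.flags & DEMO_FDPUT_FPUT);
}

/* Usage: an unrelated flag no longer triggers a release. */
void demo_fput_example(void)
{
	struct demo_fd f = { NULL, DEMO_FDPUT_POS_UNLOCK };

	assert(demo_fput_needed_old(f) != 0);  /* would lead to a spurious fput() */
	assert(demo_fput_needed(f) == 0);
}
```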

net: Use helper function fdput() · 6b07edeb

Authored by Miaohe Lin on Aug 06, 2020

Use helper function fdput() to fput() the file iff FDPUT_FPUT is set.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6b07edeb

29 Jul, 2020 · 1 commit

net: improve the user pointer check in init_user_sockptr · a31edb20

Authored by Christoph Hellwig on Jul 28, 2020

Make sure not just the pointer itself but the whole range lies in
the user address space.  For that pass the length and then use
the access_ok helper to do the check.

Fixes: 6d04fe15 ("net: optimize the sockptr_t for unified kernel/user address spaces")
Reported-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

a31edb20
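
The difference between checking only the pointer and checking the whole range, as described in a31edb20, can be shown with plain integer arithmetic. This is a userspace sketch: `DEMO_USER_END` is a made-up boundary and the `demo_` helpers are hypothetical, not the kernel's access_ok().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical exclusive upper bound of the user address range (illustration only). */
#define DEMO_USER_END UINT64_C(0x0000800000000000)

/* Pointer-only check: looks at the start address and nothing else. */
bool demo_ptr_ok(uint64_t addr)
{
	return addr < DEMO_USER_END;
}

/* Range check: [addr, addr + size) must fit below DEMO_USER_END.
 * Written as a subtraction so that addr + size cannot overflow. */
bool demo_range_ok(uint64_t addr, uint64_t size)
{
	if (size > DEMO_USER_END)
		return false;
	return addr <= DEMO_USER_END - size;
}

/* Usage: a buffer that starts in range but runs past the end. */
void demo_range_example(void)
{
	uint64_t start = DEMO_USER_END - 8;

	assert(demo_ptr_ok(start));          /* the pointer alone passes */
	assert(demo_range_ok(start, 8));     /* 8 bytes still fit */
	assert(!demo_range_ok(start, 16));   /* 16 bytes cross the boundary */
}
```

Passing the length into the check is what lets the second helper reject a buffer whose start address looks valid but whose tail does not.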

25 Jul, 2020 · 3 commits

net: optimize the sockptr_t for unified kernel/user address spaces · 6d04fe15

Authored by Christoph Hellwig on Jul 23, 2020

For architectures like x86 and arm64 we don't need the separate bit to
indicate that a pointer is a kernel pointer as the address spaces are
unified.  That way the sockptr_t can be reduced to a union of two
pointers, which leads to nicer calling conventions.

The only caveat is that we need to check that users don't pass in kernel
address and thus gain access to kernel memory.  Thus the USER_SOCKPTR
helper is replaced with a init_user_sockptr function that does this check
and returns an error if it fails.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

6d04fe15

net: pass a sockptr_t into ->setsockopt · a7b75c5a

Authored by Christoph Hellwig on Jul 23, 2020

Rework the remaining setsockopt code to pass a sockptr_t instead of a
plain user pointer.  This removes the last remaining set_fs(KERNEL_DS)
outside of architecture specific code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154]
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

a7b75c5a

net: switch sock_set_timeout to sockptr_t · c8c1bbb6

Authored by Christoph Hellwig on Jul 23, 2020

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

c8c1bbb6

20 Jul, 2020 · 4 commits

net: make ->{get,set}sockopt in proto_ops optional · a44d9e72

Authored by Christoph Hellwig on Jul 17, 2020

Just check for a NULL method instead of wiring up
sock_no_{get,set}sockopt.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

a44d9e72

net: remove compat_sys_{get,set}sockopt · 55db9c0e

Authored by Christoph Hellwig on Jul 17, 2020

Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.

This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.

It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

55db9c0e

net: streamline __sys_getsockopt · d8a9b38f

Authored by Christoph Hellwig on Jul 17, 2020

Return early when sockfd_lookup_light fails to reduce a level of
indentation for most of the function body.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

d8a9b38f

net: streamline __sys_setsockopt · 4a367299

Authored by Christoph Hellwig on Jul 17, 2020

Return early when sockfd_lookup_light fails to reduce a level of
indentation for most of the function body.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a367299

14 Jul, 2020 · 1 commit

net: socket: Move kerneldoc next to function it documents · 9a8ad9ac

Authored by Andrew Lunn on Jul 13, 2020

Fix the warning "Function parameter or member 'inode' not described in
'__sock_release'' due to the kerneldoc being placed before
__sock_release() not sock_release(), which does not take an inode
parameter.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

9a8ad9ac

05 Jul, 2020 · 1 commit

net: use mptcp setsockopt function for SOL_SOCKET on mptcp sockets · 83f0c10b

Authored by Florian Westphal on Jul 05, 2020

setsockopt(mptcp_fd, SOL_SOCKET, ...)...  appears to work (returns 0),
but it has no effect -- this is because the MPTCP layer never has a
chance to copy the settings to the subflow socket.

Skip the generic handling for the mptcp case and instead call the
mptcp specific handler for SOL_SOCKET too.

Next patch adds more specific handling for SOL_SOCKET to mptcp.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

83f0c10b

30 May, 2020 · 1 commit

net: remove kernel_setsockopt · 5a892ff2

Authored by Christoph Hellwig on May 29, 2020

No users left.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5a892ff2

28 May, 2020 · 1 commit

net: remove kernel_getsockopt · 7a15b2e0

Authored by Christoph Hellwig on May 27, 2020

No users left.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

7a15b2e0

19 May, 2020 · 2 commits

ipv4,appletalk: move SIOCADDRT and SIOCDELRT handling into ->compat_ioctl · dc13c876

Authored by Christoph Hellwig on May 18, 2020

To prepare removing the global routing_ioctl hack start lifting the code
into the ipv4 and appletalk ->compat_ioctl handlers.  Unlike the existing
handler we don't bother copying in the name - there are no compat issues for
char arrays.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc13c876

ipv6: move SIOCADDRT and SIOCDELRT handling into ->compat_ioctl · 3986912f

Authored by Christoph Hellwig on May 18, 2020

To prepare removing the global routing_ioctl hack start lifting the code
into a newly added ipv6 ->compat_ioctl handler.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

3986912f

12 May, 2020 · 1 commit

net: cleanly handle kernel vs user buffers for ->msg_control · 1f466e1f

Authored by Christoph Hellwig on May 11, 2020

The msg_control field in struct msghdr can either contain a user
pointer when used with the recvmsg system call, or a kernel pointer
when used with sendmsg.  To complicate things further kernel_recvmsg
can stuff a kernel pointer in and then use set_fs to make the uaccess
helpers accept it.

Replace it with a union of a kernel pointer msg_control field, and
a user pointer msg_control_user one, and allow kernel_recvmsg operate
on a proper kernel pointer using a bitfield to override the normal
choice of a user pointer for recvmsg.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f466e1f

20 Mar, 2020 · 1 commit

io_uring: make sure accept honor rlimit nofile · 09952e3e

Authored by Jens Axboe on Mar 19, 2020

Just like commit 4022e7af, this fixes the fact that
IORING_OP_ACCEPT ends up using get_unused_fd_flags(), which checks
current->signal->rlim[] for limits.

Add an extra argument to __sys_accept4_file() that allows us to pass
in the proper nofile limit, and grab it at request prep time.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

09952e3e

10 Mar, 2020 · 1 commit

net: abstract out normal and compat msghdr import · 0a384abf

Authored by Jens Axboe on Feb 27, 2020

This splits it into two parts, one that imports the message, and one
that imports the iovec. This allows a caller to only do the first part,
and import the iovec manually afterwards.

No functional changes in this patch.
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0a384abf

09 Jan, 2020 · 1 commit

socket: fix unused-function warning · 542d3065

Authored by Arnd Bergmann on Jan 08, 2020

When procfs is disabled, the fdinfo code causes a harmless
warning:

net/socket.c:1000:13: error: 'sock_show_fdinfo' defined but not used [-Werror=unused-function]
 static void sock_show_fdinfo(struct seq_file *m, struct file *f)

Move the function definition up so we can use a single #ifdef
around it.

Fixes: b4653342 ("net: Allow to show socket-specific information in /proc/[pid]/fdinfo/[fd]")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

542d3065

13 Dec, 2019 · 1 commit

net: Allow to show socket-specific information in /proc/[pid]/fdinfo/[fd] · b4653342

Authored by Kirill Tkhai on Dec 09, 2019

This adds .show_fdinfo to socket_file_ops, so protocols will be able
to print their specific data in fdinfo.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b4653342

11 Dec, 2019 · 1 commit

net: make socket read/write_iter() honor IOCB_NOWAIT · ebfcd895

Authored by Jens Axboe on Dec 09, 2019

The socket read/write helpers only look at the file O_NONBLOCK, not
the iocb IOCB_NOWAIT flag. This breaks users like preadv2/pwritev2
and io_uring that rely on not having the file itself marked nonblocking,
but rather the iocb itself.

Cc: netdev@vger.kernel.org
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ebfcd895
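
The decision described in ebfcd895 is a two-source OR: either the file is marked nonblocking or the individual request is. A minimal userspace model follows, where `DEMO_IOCB_NOWAIT` and `demo_msg_flags()` are hypothetical names with an arbitrary bit value; only O_NONBLOCK and MSG_DONTWAIT come from real headers.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>

/* Hypothetical per-request "do not block" bit; the real value is not assumed here. */
#define DEMO_IOCB_NOWAIT (1u << 0)

/* Derive the message flags from both the file flags and the request flags. */
int demo_msg_flags(int f_flags, unsigned int ki_flags)
{
	int msg_flags = 0;

	if ((f_flags & O_NONBLOCK) || (ki_flags & DEMO_IOCB_NOWAIT))
		msg_flags |= MSG_DONTWAIT;
	return msg_flags;
}

/* Usage: a blocking file with a no-wait request still must not block. */
void demo_nowait_example(void)
{
	assert(demo_msg_flags(0, DEMO_IOCB_NOWAIT) == MSG_DONTWAIT);
	assert(demo_msg_flags(0, 0) == 0);
}
```

Looking only at the first operand of the OR is the gap the commit message describes for preadv2/pwritev2 and io_uring callers.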

07 Dec, 2019 · 1 commit

net: avoid an indirect call in ____sys_recvmsg() · 1af66221

Authored by Eric Dumazet on Dec 06, 2019

CONFIG_RETPOLINE=y made indirect calls expensive.

gcc seems to add an indirect call in ____sys_recvmsg().

Rewriting the code slightly makes sure to avoid this indirection.

Alternative would be to not call sock_recvmsg() and instead
use security_socket_recvmsg() and sock_recvmsg_nosec(),
but this is less readable IMO.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: David Laight <David.Laight@aculab.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1af66221

03 Dec, 2019 · 2 commits

io_uring: ensure async punted connect requests copy data · f499a021

Authored by Jens Axboe on Dec 02, 2019

Just like commit f67676d1 for read/write requests, this one ensures
that the sockaddr data has been copied for IORING_OP_CONNECT if we need
to punt the request to async context.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f499a021

io_uring: ensure async punted sendmsg/recvmsg requests copy data · 03b1230c

Authored by Jens Axboe on Dec 02, 2019

Just like commit f67676d1 for read/write requests, this one ensures
that the msghdr data is fully copied if we need to punt a recvmsg or
sendmsg system call to async context.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

03b1230c

27 Nov, 2019 · 2 commits

net: disallow ancillary data for __sys_{send,recv}msg_file() · d69e0779

Authored by Jens Axboe on Nov 25, 2019

Only io_uring uses (and added) these, and we want to disallow the
use of sendmsg/recvmsg for anything but regular data transfers.
Use the newly added prep helper to split the msghdr copy out from
the core function, to check for msg_control and msg_controllen
settings. If either is set, we return -EINVAL.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d69e0779
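
The validation rule stated in d69e0779 (reject the request if either control field is set) is small enough to model directly on the userspace `struct msghdr`. The helper name `demo_reject_cmsg()` is hypothetical; newer entries in this log (583bbf06, e5493796) later relax and then remove the restriction.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Return -EINVAL when the header carries any ancillary data, 0 otherwise. */
int demo_reject_cmsg(const struct msghdr *msg)
{
	assert(msg != NULL);
	if (msg->msg_control || msg->msg_controllen)
		return -EINVAL;
	return 0;
}

/* Usage: a plain data message passes, one with a control buffer does not. */
void demo_cmsg_example(void)
{
	char cbuf[64];
	struct msghdr msg = { 0 };

	assert(demo_reject_cmsg(&msg) == 0);

	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);
	assert(demo_reject_cmsg(&msg) == -EINVAL);
}
```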

net: separate out the msghdr copy from ___sys_{send,recv}msg() · 4257c8ca

Authored by Jens Axboe on Nov 25, 2019

This is in preparation for enabling the io_uring helpers for sendmsg
and recvmsg to first copy the header for validation before continuing
with the operation.

There should be no functional changes in this patch.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4257c8ca

26 Nov, 2019 · 1 commit

net: add __sys_connect_file() helper · bd3ded31

Authored by Jens Axboe on Nov 23, 2019

This is identical to __sys_connect(), except it takes a struct file
instead of an fd, and it also allows passing in extra file->f_flags
flags. The latter is done to support masking in O_NONBLOCK without
manipulating the original file flags.

No functional changes in this patch.

Cc: netdev@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bd3ded31
