- 29 4月, 2022 1 次提交
-
-
由 Erin MacNeil 提交于
Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK should be included in the ancillary data returned by recvmsg(). Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs(). Signed-off-by: NErin MacNeil <lnx.erin@gmail.com> Reviewed-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Acked-by: NMarc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 12 4月, 2022 1 次提交
-
-
由 Oliver Hartkopp 提交于
The internal recvmsg() functions have two parameters 'flags' and 'noblock' that were merged inside skb_recv_datagram(). As a follow up patch to commit f4b41f06 ("net: remove noblock parameter from skb_recv_datagram()") this patch removes the separate 'noblock' parameter for recvmsg(). Analogue to the referenced patch for skb_recv_datagram() the 'flags' and 'noblock' parameters are unnecessarily split up with e.g. err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); or in err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg, sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); instead of simply using only flags all the time and check for MSG_DONTWAIT where needed (to preserve for the formerly separated no(n)block condition). Signed-off-by: NOliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/r/20220411124955.154876-1-socketcan@hartkopp.netSigned-off-by: NPaolo Abeni <pabeni@redhat.com>
-
- 06 4月, 2022 1 次提交
-
-
由 Oliver Hartkopp 提交于
skb_recv_datagram() has two parameters 'flags' and 'noblock' that are merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)' As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags' into 'flags' and 'noblock' with finally obsolete bit operations like this: skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc); And this is not even done consistently with the 'flags' parameter. This patch removes the obsolete and costly splitting into two parameters and only performs bit operations when really needed on the caller side. One missing conversion thankfully reported by kernel test robot. I missed to enable kunit tests to build the mctp code. Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NOliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 1月, 2022 1 次提交
-
-
由 Eric Dumazet 提交于
For some reason, raw_bind() forgot to lock the socket. BUG: KCSAN: data-race in __ip4_datagram_connect / raw_bind write to 0xffff8881170d4308 of 4 bytes by task 5466 on cpu 0: raw_bind+0x1b0/0x250 net/ipv4/raw.c:739 inet_bind+0x56/0xa0 net/ipv4/af_inet.c:443 __sys_bind+0x14b/0x1b0 net/socket.c:1697 __do_sys_bind net/socket.c:1708 [inline] __se_sys_bind net/socket.c:1706 [inline] __x64_sys_bind+0x3d/0x50 net/socket.c:1706 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff8881170d4308 of 4 bytes by task 5468 on cpu 1: __ip4_datagram_connect+0xb7/0x7b0 net/ipv4/datagram.c:39 ip4_datagram_connect+0x2a/0x40 net/ipv4/datagram.c:89 inet_dgram_connect+0x107/0x190 net/ipv4/af_inet.c:576 __sys_connect_file net/socket.c:1900 [inline] __sys_connect+0x197/0x1b0 net/socket.c:1917 __do_sys_connect net/socket.c:1927 [inline] __se_sys_connect net/socket.c:1924 [inline] __x64_sys_connect+0x3d/0x50 net/socket.c:1924 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000000 -> 0x0003007f Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 5468 Comm: syz-executor.5 Not tainted 5.17.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 1月, 2022 1 次提交
-
-
由 Muchun Song 提交于
Remove PDE_DATA() completely and replace it with pde_data(). [akpm@linux-foundation.org: fix naming clash in drivers/nubus/proc.c] [akpm@linux-foundation.org: now fix it properly] Link: https://lkml.kernel.org/r/20211124081956.87711-2-songmuchun@bytedance.comSigned-off-by: NMuchun Song <songmuchun@bytedance.com> Acked-by: NChristian Brauner <christian.brauner@ubuntu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 11月, 2021 1 次提交
-
-
由 Riccardo Paolo Bestetti 提交于
Add support to inet v4 raw sockets for binding to nonlocal addresses through the IP_FREEBIND and IP_TRANSPARENT socket options, as well as the ipv4.ip_nonlocal_bind kernel parameter. Add helper function to inet_sock.h to check for bind address validity on the base of the address type and whether nonlocal address are enabled for the socket via any of the sockopts/sysctl, deduplicating checks in ipv4/ping.c, ipv4/af_inet.c, ipv6/af_inet6.c (for mapped v4->v6 addresses), and ipv4/raw.c. Add test cases with IP[V6]_FREEBIND verifying that both v4 and v6 raw sockets support binding to nonlocal addresses after the change. Add necessary support for the test cases to nettest. Signed-off-by: NRiccardo Paolo Bestetti <pbl@bestov.io> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211117090010.125393-1-pbl@bestov.ioSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 16 11月, 2021 1 次提交
-
-
由 Eric Dumazet 提交于
This is distracting really, let's make this simpler, because many callers had to take care of this by themselves, even if on x86 this adds more code than really needed. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 9月, 2021 1 次提交
-
-
由 Eric Dumazet 提交于
This reverts commit d7807a9a. As mentioned in https://lkml.org/lkml/2021/9/13/1819 5 years old commit 91948309 ("ipv4: fix memory leaks in ip_cmsg_send() callers") was a correct fix. ip_cmsg_send() can loop over multiple cmsghdr() If IP_RETOPTS has been successful, but following cmsghdr generates an error, we do not free ipc.ok If IP_RETOPTS is not successful, we have freed the allocated temporary space, not the one currently in ipc.opt. Sure, code could be refactored, but let's not bring back old bugs. Fixes: d7807a9a ("Revert "ipv4: fix memory leaks in ip_cmsg_send() callers"") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 9月, 2021 1 次提交
-
-
由 Yajun Deng 提交于
This reverts commit 91948309. There is only when ip_options_get() return zero need to free. It already called kfree() when return error. Fixes: 91948309 ("ipv4: fix memory leaks in ip_cmsg_send() callers") Signed-off-by: NYajun Deng <yajun.deng@linux.dev> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 6月, 2021 1 次提交
-
-
由 Alexander Aring 提交于
This patch introduces a function wrapper to call the sk_error_report callback. That will prepare to add additional handling whenever sk_error_report is called, for example to trace socket errors. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 11月, 2020 1 次提交
-
-
由 Paul Moore 提交于
As pointed out by Herbert in a recent related patch, the LSM hooks do not have the necessary address family information to use the flowi struct safely. As none of the LSMs currently use any of the protocol specific flowi information, replace the flowi pointers with pointers to the address family independent flowi_common struct. Reported-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NJames Morris <jamorris@linux.microsoft.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 27 8月, 2020 1 次提交
-
-
由 Miaohe Lin 提交于
Fix some comments, including wrong function name, duplicated word and so on. Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 8月, 2020 2 次提交
-
-
由 Miaohe Lin 提交于
There is no need to fetch errno and fatal info from icmp_err_convert when icmp code is ICMP_FRAG_NEEDED. Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Randy Dunlap 提交于
Drop duplicate words in comments in net/ipv4/. Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 8月, 2020 1 次提交
-
-
由 Al Viro 提交于
It's always 0. Note that we theoretically could use ~0U as well - result will be the same modulo 0xffff, _if_ the damn thing did the right thing for any value of initial sum; later we'll make use of that when convenient. However, unlike csum_and_copy_..._user(), there are instances that did not work for arbitrary initial sums; c6x is one such. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 25 7月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
Rework the remaining setsockopt code to pass a sockptr_t instead of a plain user pointer. This removes the last remaining set_fs(KERNEL_DS) outside of architecture specific code. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154] Acked-by: NMatthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 7月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
Handle the few cases that need special treatment in-line using in_compat_syscall(). Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 3月, 2020 1 次提交
-
-
由 Jules Irenge 提交于
Sparse reports warnings at raw_seq_start() and raw_seq_stop() warning: context imbalance in raw_seq_start() - wrong count at exit warning: context imbalance in raw_seq_stop() - unexpected unlock The root cause is the missing annotations at raw_seq_start() and raw_seq_stop() Add the missing __acquires(&h->lock) annotation Add the missing __releases(&h->lock) annotation Signed-off-by: NJules Irenge <jbi.octave@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 10月, 2019 1 次提交
-
-
由 Florian Westphal 提交于
commit 174e2381 ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi recycle always drop skb extensions. The additional skb_ext_del() that is performed via nf_reset on napi skb recycle is not needed anymore. Most nf_reset() calls in the stack are there so queued skb won't block 'rmmod nf_conntrack' indefinitely. This removes the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). In a few selected places, add a call to skb_ext_reset to make sure that no active extensions remain. I am submitting this for "net", because we're still early in the release cycle. The patch applies to net-next too, but I think the rename causes needless divergence between those trees. Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 14 9月, 2019 1 次提交
-
-
由 Willem de Bruijn 提交于
Enable setting skb->mark for UDP and RAW sockets using cmsg. This is analogous to existing support for TOS, TTL, txtime, etc. Packet sockets already support this as of commit c7d39e32 ("packet: support per-packet fwmark for af_packet sendmsg"). Similar to other fields, implement by 1. initialize the sockcm_cookie.mark from socket option sk_mark 2. optionally overwrite this in ip_cmsg_send/ip6_datagram_send_ctl 3. initialize inet_cork.mark from sockcm_cookie.mark 4. initialize each (usually just one) skb->mark from inet_cork.mark Step 1 is handled in one location for most protocols by ipcm_init_sk as of commit 35178206 ("ipv4: ipcm_cookie initializers"). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 6月, 2019 1 次提交
-
-
由 Stephen Suryaputra 提交于
In commit 19e4e768 ("ipv4: Fix raw socket lookup for local traffic"), the dif argument to __raw_v4_lookup() is coming from the returned value of inet_iif() but the change was done only for the first lookup. Subsequent lookups in the while loop still use skb->dev->ifIndex. Fixes: 19e4e768 ("ipv4: Fix raw socket lookup for local traffic") Signed-off-by: NStephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 5月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAllison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 20 5月, 2019 1 次提交
-
-
由 Patrick Talbert 提交于
Currently, procfs socket stats format sk_drops as a signed int (%d). For large values this will cause a negative number to be printed. We know the drop count can never be a negative so change the format specifier to %u. Signed-off-by: NPatrick Talbert <ptalbert@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 5月, 2019 1 次提交
-
-
由 David Ahern 提交于
inet_iif should be used for the raw socket lookup. inet_iif considers rt_iif which handles the case of local traffic. As it stands, ping to a local address with the '-I <dev>' option fails ever since ping was changed to use SO_BINDTODEVICE instead of cmsg + IP_PKTINFO. IPv6 works fine. Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 12月, 2018 1 次提交
-
-
由 Willem de Bruijn 提交于
SOF_TIMESTAMPING_OPT_ID is supported on TCP, UDP and RAW sockets. But it was missing on RAW with IPPROTO_IP, PF_PACKET and CAN. Add skb_setup_tx_timestamp that configures both tx_flags and tskey for these paths that do not need corking or use bytestream keys. Fixes: 09c2d251 ("net-timestamp: add key to disambiguate concurrent datagrams") Signed-off-by: NWillem de Bruijn <willemb@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 11月, 2018 1 次提交
-
-
由 David Ahern 提交于
Randy reported when CONFIG_PROC_FS is not enabled: ld: net/ipv4/af_inet.o: in function `inet_init': af_inet.c:(.init.text+0x42d): undefined reference to `raw_init' Fix by moving the endif up to the end of the proc entries Fixes: 6897445f ("net: provide a sysctl raw_l3mdev_accept for raw socket lookup with VRFs") Reported-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Mike Manning <mmanning@vyatta.att-mail.com> Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 11月, 2018 2 次提交
-
-
由 Duncan Eastoe 提交于
When there exist a pair of raw sockets one unbound and one bound to a VRF but equal in all other respects, when a packet is received in the VRF context, __raw_v4_lookup() matches on both sockets. This results in the packet being delivered over both sockets, instead of only the raw socket bound to the VRF. The bound device checks in __raw_v4_lookup() are replaced with a call to raw_sk_bound_dev_eq() which correctly handles whether the packet should be delivered over the unbound socket in such cases. In __raw_v6_lookup() the match on the device binding of the socket is similarly updated to use raw_sk_bound_dev_eq() which matches the handling in __raw_v4_lookup(). Importantly raw_sk_bound_dev_eq() takes the raw_l3mdev_accept sysctl into account. Signed-off-by: NDuncan Eastoe <deastoe@vyatta.att-mail.com> Signed-off-by: NMike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Tested-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mike Manning 提交于
Add a sysctl raw_l3mdev_accept to control raw socket lookup in a manner similar to use of tcp_l3mdev_accept for stream and of udp_l3mdev_accept for datagram sockets. Have this default to enabled for reasons of backwards compatibility. This is so as to specify the output device with cmsg and IP_PKTINFO, but using a socket not bound to the corresponding VRF. This allows e.g. older ping implementations to be run with specifying the device but without executing it in the VRF. If the option is disabled, packets received in a VRF context are only handled by a raw socket bound to the VRF, and correspondingly packets in the default VRF are only handled by a socket not bound to any VRF. Signed-off-by: NMike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Tested-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 10月, 2018 1 次提交
-
-
由 Robert Shearman 提交于
It is useful to be able to use the same socket for listening in a specific VRF, as for sending multicast packets out of a specific interface. However, the bound device on the socket currently takes precedence and results in the packets not being sent. Relax the condition on overriding the output interface to use for sending packets out of UDP, raw and ping sockets to allow multicast packets to be sent using the specified multicast interface. Signed-off-by: NRobert Shearman <rshearma@vyatta.att-mail.com> Signed-off-by: NMike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 7月, 2018 2 次提交
-
-
由 Willem de Bruijn 提交于
skb_shinfo(skb)->tx_flags is derived from sk->sk_tsflags, possibly after modification by __sock_cmsg_send, by calling sock_tx_timestamp. The IPv4 and IPv6 paths do this conversion differently. In IPv4, the individual protocols that support tx timestamps call this function and store the result in ipc.tx_flags. In IPv6, sock_tx_timestamp is called in __ip6_append_data. There is no need to store both tx_flags and ts_flags in the cookie as one is derived from the other. Convert when setting up the cork and remove the redundant field. This is similar to IPv6, only have the conversion happen only once per datagram, in ip(6)_setup_cork. Also change __ip6_append_data to match __ip_append_data. Only update tskey if timestamping is enabled with OPT_ID. The SOCK_.. test is redundant: only valid protocols can have non-zero cork->tx_flags. After this change the IPv4 and IPv6 logic is the same. Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Initialize the cookie in one location to reduce code duplication and avoid bugs from inconsistent initialization, such as that fixed in commit 9887cba1 ("ip: limit use of gso_size to udp"). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 7月, 2018 1 次提交
-
-
由 Jesus Sanchez-Palencia 提交于
Add a transmit_time field to struct inet_cork, then copy the timestamp from the CMSG cookie at ip_setup_cork() so we can safely copy it into the skb later during __ip_make_skb(). For the raw fast path, just perform the copy at raw_send_hdrinc(). Signed-off-by: NRichard Cochran <rcochran@linutronix.de> Signed-off-by: NJesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 5月, 2018 2 次提交
-
-
由 Christoph Hellwig 提交于
Variants of proc_create{,_data} that directly take a struct seq_operations and deal with network namespaces in ->open and ->release. All callers of proc_create + seq_open_net converted over, and seq_{open,release}_net are removed entirely. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Pass the hashtable to the proc private data instead of copying it into the per-file private data. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 28 3月, 2018 1 次提交
-
-
由 Kirill Tkhai 提交于
Synchronous pernet_operations are not allowed anymore. All are asynchronous. So, drop the structure member. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 3月, 2018 1 次提交
-
-
由 Joe Perches 提交于
Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 3月, 2018 1 次提交
-
-
由 Kirill Tkhai 提交于
This reverts commit 1215e51e. Since raw_close() is used on every RAW socket destruction, the changes made by 1215e51e scale sadly. This clearly seen on endless unshare(CLONE_NEWNET) test, and cleanup_net() kwork spends a lot of time waiting for rtnl_lock() introduced by this commit. Previous patch moved IP_ROUTER_ALERT out of rtnl_lock(), so we revert this patch. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 2月, 2018 1 次提交
-
-
由 Kirill Tkhai 提交于
arp_net_ops just addr/removes /proc entry. devinet_ops allocates and frees duplicate of init_net tables and (un)registers sysctl entries. fib_net_ops allocates and frees pernet tables, creates/destroys netlink socket and (un)initializes /proc entries. Foreign pernet_operations do not touch them. ip_rt_proc_ops only modifies pernet /proc entries. xfrm_net_ops creates/destroys /proc entries, allocates/frees pernet statistics, hashes and tables, and (un)initializes sysctl files. These are not touched by foreigh pernet_operations xfrm4_net_ops allocates/frees private pernet memory, and configures sysctls. sysctl_route_ops creates/destroys sysctls. rt_genid_ops only initializes fields of just allocated net. ipv4_inetpeer_ops allocated/frees net private memory. igmp_net_ops just creates/destroys /proc files and socket, noone else interested in. tcp_sk_ops seems to be safe, because tcp_sk_init() does not depend on any other pernet_operations modifications. Iteration over hash table in inet_twsk_purge() is made under RCU lock, and it's safe to iterate the table this way. Removing from the table happen from inet_twsk_deschedule_put(), but this function is safe without any extern locks, as it's synchronized inside itself. There are many examples, it's used in different context. So, it's safe to leave tcp_sk_exit_batch() unlocked. tcp_net_metrics_ops is synchronized on tcp_metrics_lock and safe. udplite4_net_ops only creates/destroys pernet /proc file. icmp_sk_ops creates percpu sockets, not touched by foreign pernet_operations. ipmr_net_ops creates/destroys pernet fib tables, (un)registers fib rules and /proc files. This seem to be safe to execute in parallel with foreign pernet_operations. af_inet_ops just sets up default parameters of newly created net. ipv4_mib_ops creates and destroys pernet percpu statistics. raw_net_ops, tcp4_net_ops, udp4_net_ops, ping_v4_net_ops and ip_proc_ops only create/destroy pernet /proc files. ip4_frags_ops creates and destroys sysctl file. So, it's safe to make the pernet_operations async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Acked-by: NAndrei Vagin <avagin@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 1月, 2018 1 次提交
-
-
由 David Ahern 提交于
Message sends to the local broadcast address (255.255.255.255) require uc_index or sk_bound_dev_if to be set to an egress device. However, responses or only received if the socket is bound to the device. This is overly constraining for processes running in an L3 domain. This patch allows a socket bound to the VRF device to send to the local broadcast address by using IP_UNICAST_IF to set the egress interface with packet receipt handled by the VRF binding. Similar to IP_MULTICAST_IF, relax the constraint on setting IP_UNICAST_IF if a socket is bound to an L3 master device. In this case allow uc_index to be set to an enslaved if sk_bound_dev_if is an L3 master device and is the master device for the ifindex. In udp and raw sendmsg, allow uc_index to override the oif if uc_index master device is oif (ie., the oif is an L3 master and the index is an L3 slave). Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 1月, 2018 1 次提交
-
-
由 Alexey Dobriyan 提交于
/proc has been ignoring struct file_operations::owner field for 10 years. Specifically, it started with commit 786d7e16 ("Fix rmmod/read/write races in /proc entries"). Notice the chunk where inode->i_fop is initialized with proxy struct file_operations for regular files: - if (de->proc_fops) - inode->i_fop = de->proc_fops; + if (de->proc_fops) { + if (S_ISREG(inode->i_mode)) + inode->i_fop = &proc_reg_file_ops; + else + inode->i_fop = de->proc_fops; + } VFS stopped pinning module at this point. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-