- 06 5月, 2023 1 次提交
-
-
由 Chuck Lever 提交于
Dan points out that sock_alloc_file() releases @sock on error, but so do all of svc_setup_socket's callers, resulting in a double- release if sock_alloc_file() returns an error. Rather than allocating a struct file for all new sockets, allocate one only for sockets created during a TCP accept. For the moment, those are the only ones that will ever be used with RPC-with-TLS. Reported-by: NDan Carpenter <dan.carpenter@linaro.org> Fixes: ae0d7770 ("SUNRPC: Ensure server-side sockets have a sock->file") Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 03 5月, 2023 3 次提交
-
-
由 Chuck Lever 提交于
Jiri Slaby says: > I bisected to this ... as it breaks nfs3-only servers in 6.3. > I.e. /etc/nfs.conf containing: > [nfsd] > vers4=no > > The client sees: > mount("10.0.2.15:/tmp", "/mnt", "nfs", 0, "vers=4.2,addr=10.0.2.15,clientad"...) = -1 EIO (Input/output error) > write(2, "mount.nfs: mount system call fai"..., 45 > mount.nfs: mount system call failed for /mnt > > And the kernel says: > nfs4_discover_server_trunking unhandled error -5. Exiting with error EIO Reported-by: NJiri Slaby <jirislaby@kernel.org> Link: https://bugzilla.suse.com/show_bug.cgi?id=1210995 Fixes: 4bcf0343 ("SUNRPC: Set rq_accept_statp inside ->accept methods") Tested-by: NJiri Slaby <jirislaby@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Tom Rix 提交于
gcc with W=1 and ! CONFIG_SYSCTL fs/lockd/svc.c:80:51: error: ‘nlm_port_max’ defined but not used [-Werror=unused-const-variable=] 80 | static const int nlm_port_min = 0, nlm_port_max = 65535; | ^~~~~~~~~~~~ fs/lockd/svc.c:80:33: error: ‘nlm_port_min’ defined but not used [-Werror=unused-const-variable=] 80 | static const int nlm_port_min = 0, nlm_port_max = 65535; | ^~~~~~~~~~~~ The only use of these variables is when CONFIG_SYSCTL is defined, so their definition should be likewise conditional. Signed-off-by: NTom Rix <trix@redhat.com> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Tom Rix 提交于
gcc with W=1 and ! CONFIG_PROC_FS fs/nfsd/nfsctl.c:161:30: error: ‘exports_proc_ops’ defined but not used [-Werror=unused-const-variable=] 161 | static const struct proc_ops exports_proc_ops = { | ^~~~~~~~~~~~~~~~ The only use of exports_proc_ops is when CONFIG_PROC_FS is defined, so its definition should be likewise conditional. Signed-off-by: NTom Rix <trix@redhat.com> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 02 5月, 2023 1 次提交
-
-
由 Ard Biesheuvel 提交于
Scott reports SUNRPC self-test failures regarding the output IV on arm64 when using the SIMD accelerated implementation of AES in CBC mode with ciphertext stealing ("cts(cbc(aes))" in crypto API speak). These failures are the result of the fact that, while RFC 3962 does specify what the output IV should be and includes test vectors for it, the general concept of an output IV is poorly defined, and generally, not specified by the various algorithms implemented by the crypto API. Only algorithms that support transparent chaining (e.g., CBC mode on a block boundary) have requirements on the output IV, but ciphertext stealing (CTS) is fundamentally about how to encapsulate CBC in a way where the length of the entire message may not be an integral multiple of the cipher block size, and the concept of an output IV does not exist here because it has no defined purpose past the end of the message. The generic CTS template takes advantage of this chaining capability of the CBC implementations, and as a result, happens to return an output IV, simply because it passes its IV buffer directly to the encapsulated CBC implementation, which operates on full blocks only, and always returns an IV. This output IV happens to match how RFC 3962 defines it, even though the CTS template itself does not contain any output IV logic whatsoever, and, for this reason, lacks any test vectors that exercise this accidental output IV generation. The arm64 SIMD implementation of cts(cbc(aes)) does not use the generic CTS template at all, but instead, implements the CBC mode and ciphertext stealing directly, and therefore does not encapsule a CBC implementation that returns an output IV in the same way. The arm64 SIMD implementation complies with the specification and passes all internal tests, but when invoked by the SUNRPC code, fails to produce the expected output IV and causes its selftests to fail. Given that the output IV is defined as the penultimate block (where the final block may smaller than the block size), we can quite easily derive it in the caller by copying the appropriate slice of ciphertext after encryption. Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Jeff Layton <jlayton@kernel.org> Reported-by: NScott Mayhew <smayhew@redhat.com> Tested-by: NScott Mayhew <smayhew@redhat.com> Signed-off-by: NArd Biesheuvel <ardb@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 28 4月, 2023 7 次提交
-
-
由 Chuck Lever 提交于
Enable administrators to require clients to use transport layer security when accessing particular exports. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
This patch adds opportunitistic RPC-with-TLS to the Linux in-kernel NFS server. If the client requests RPC-with-TLS and the user space handshake agent is running, the server will set up a TLS session. There are no policy settings yet. For example, the server cannot yet require the use of RPC-with-TLS to access its data. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
Tetsuo Handa points out: > Since GFP_KERNEL is "GFP_NOFS | __GFP_FS", usage like > "GFP_KERNEL | GFP_NOFS" does not make sense. The original intent was to hold the inode lock while estimating the buffer requirements for the requested information. Frank van der Linden, the author of NFSD's xattr code, says: > ... you need inode_lock to get an atomic view of an xattr. Since > both nfsd_getxattr and nfsd_listxattr to the standard trick of > querying the xattr length with a NULL buf argument (just getting > the length back), allocating the right buffer size, and then > querying again, they need to hold the inode lock to avoid having > the xattr changed from under them while doing that. > > From that then flows the requirement that GFP_FS could cause > problems while holding i_rwsem, so I added GFP_NOFS. However, Dave Chinner states: > You can do GFP_KERNEL allocations holding the i_rwsem just fine. > All that it requires is the caller holds a reference to the > inode ... Since these code paths acquire a dentry, they do indeed hold a reference. It is therefore safe to use GFP_KERNEL for these memory allocations. In particular, that's what this code is already doing; but now the C source code looks sane too. At a later time we can revisit in order to remove the inode lock in favor of simply retrying if the estimated buffer size is too small. Reported-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Dai Ngo 提交于
The following request sequence to the same file causes the NFS client and server getting into an infinite loop with COMMIT and NFS4ERR_DELAY: OPEN REMOVE WRITE COMMIT Problem reported by recall11, recall12, recall14, recall20, recall22, recall40, recall42, recall48, recall50 of nfstest suite. This patch restores the handling of race condition in nfsd_file_do_acquire with unlink to that prior of the regression. Fixes: ac3a2585 ("nfsd: rework refcounting in filecache") Signed-off-by: NDai Ngo <dai.ngo@oracle.com> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
This is an eye-catcher for tracepoints that record the XID: it means svc_rqst() has not received a full RPC Call with an XID yet. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
To support kTLS, the server-side TCP socket receive path needs to watch for CMSGs. Acked-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
A single RPC transaction that touches only a couple of pages means rq_pvec will not be even close to full in svc_xpt_release(). This is a common case. Instead, just leave the pages in rq_pvec until it is completely full. This improves the efficiency of the batch release mechanism on workloads that involve small RPC messages. The rq_pvec is also fully emptied just before thread exit. Reviewed-by: NCalum Mackay <calum.mackay@oracle.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
- 26 4月, 2023 26 次提交
-
-
由 Chuck Lever 提交于
Instead of invoking put_page() one-at-a-time, pass the "response" portion of rq_pages directly to release_pages() to reduce the number of times each nfsd thread invokes a page allocator API. Since svc_xprt_release() is not invoked while a client is waiting for an RPC Reply, this is not expected to directly impact mean request latencies on a lightly or moderately loaded server. However as workload intensity increases, I expect somewhat better scalability: the same number of server threads should be able to handle more work. Reviewed-by: NCalum Mackay <calum.mackay@oracle.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
Clean-up: There doesn't seem to be a reason why this function is stuck in a header. One thing it prevents is the convenient addition of tracing. Moving it to a source file also makes the rq_respages clean-up logic easier to find. Reviewed-by: NCalum Mackay <calum.mackay@oracle.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
When queueing a dispose list to the appropriate "freeme" lists, it pointlessly queues the objects one at a time to an intermediate list. Remove a few helpers and just open code a list_move to make it more clear and efficient. Better document the resulting functions with kerneldoc comments. Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
Clean up: All callers of svc_process() ignore its return value, so svc_process() can safely be converted to return void. Ditto for svc_send(). The return value of ->xpo_sendto() is now used only as part of a trace event. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
The TLS handshake upcall mechanism requires a non-NULL sock->file on the socket it hands to user space. svc_sock_free() already releases sock->file properly if one exists. Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
There have been several bugs over the years where the NFSD splice actor has attempted to write outside the rq_pages array. This is a "should never happen" condition, but if for some reason the pipe splice actor should attempt to walk past the end of rq_pages, it needs to terminate the READ operation to prevent corruption of the pointer addresses in the fields just beyond the array. A server crash is thus prevented. Since the code is not behaving, the READ operation returns -EIO to the client. None of the READ payload data can be trusted if the splice actor isn't operating as expected. Suggested-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com> Reviewed-by: NJeff Layton <jlayton@kernel.org>
-
由 Luis Chamberlain 提交于
There is no need to declare two tables to just create directories, this can be easily be done with a prefix path with register_sysctl(). Simplify this registration. Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 NeilBrown 提交于
The get_expiry() function currently returns a timestamp, and uses the special return value of 0 to indicate an error. Unfortunately this causes a problem when 0 is the correct return value. On a system with no RTC it is possible that the boot time will be seen to be "3". When exportfs probes to see if a particular filesystem supports NFS export it tries to cache information with an expiry time of "3". The intention is for this to be "long in the past". Even with no RTC it will not be far in the future (at most a second or two) so this is harmless. But if the boot time happens to have been calculated to be "3", then get_expiry will fail incorrectly as it converts the number to "seconds since bootime" - 0. To avoid this problem we change get_expiry() to report the error quite separately from the expiry time. The error is now the return value. The expiry time is reported through a by-reference parameter. Reported-by: NJerry Zhang <jerry@skydio.com> Tested-by: NJerry Zhang <jerry@skydio.com> Signed-off-by: NNeilBrown <neilb@suse.de> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
lockd needs to be able to hash filehandles for tracepoints. Move the nfs_fhandle_hash() helper to a common nfs include file. Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
Currently lockd just dequeues the block and ignores it if the client sends a GRANT_RES with a status of nlm_lck_denied. That status is an indicator that the client has rejected the lock, so the right thing to do is to unlock the lock we were trying to grant. Reported-by: NYongcheng Yang <yoyang@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
After the wait for a grant is done (for whatever reason), nlmclnt_block updates the status of the nlm_rqst with the status of the block. At the point it does this, however, the block is still queued its status could change at any time. This is particularly a problem when the waiting task is signaled during the wait. We can end up giving up on the lock just before the GRANTED_MSG callback comes in, and accept it even though the lock request gets back an error, leaving a dangling lock on the server. Since the nlm_wait never lives beyond the end of nlmclnt_lock, put it on the stack and add functions to allow us to enqueue and dequeue the block. Enqueue it just before the lock/wait loop, and dequeue it just after we exit the loop instead of waiting until the end of the function. Also, scrape the status at the time that we dequeue it to ensure that it's final. Reported-by: NYongcheng Yang <yoyang@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
The next patch needs struct nlm_wait in fs/lockd/clntproc.c, so move the definition to a shared header file. As an added clean-up, drop the unused b_reclaim field. Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
It's easily possible for the server to have an outstanding lock when we go to shut down. When that happens, we often get a warning like this in the kernel log: lockd: couldn't shutdown host module for net f0000000! This is because the shutdown procedures skip removing any hosts that still have outstanding resources (locks). Eventually, things seem to get cleaned up anyway, but the log message is unsettling, and server shutdown doesn't seem to be working the way it was intended. Ensure that we tear down any resources held on behalf of a client when tearing one down for server shutdown. Reported-by: NYongcheng Yang <yoyang@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Chuck Lever 提交于
While we were converting the nfs4_file hashtable to use the kernel's resizable hashtable data structure, Neil Brown observed that the list variant (rhltable) would be better for managing nfsd_file items as well. The nfsd_file hash table will contain multiple entries for the same inode -- these should be kept together on a list. And, it could be possible for exotic or malicious client behavior to cause the hash table to resize itself on every insertion. A nice simplification is that rhltable_lookup() can return a list that contains only nfsd_file items that match a given inode, which enables us to eliminate specialized hash table helper functions and use the default functions provided by the rhashtable implementation). Since we are now storing nfsd_file items for the same inode on a single list, that effectively reduces the number of hash entries that have to be tracked in the hash table. The mininum bucket count is therefore lowered. Light testing with fstests generic/531 show no regressions. Suggested-by: NNeil Brown <neilb@suse.de> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
On most filesystems, there is no reason to delay reaping an nfsd_file just because its underlying inode is still under writeback. nfsd just relies on client activity or the local flusher threads to do writeback. The main exception is NFS, which flushes all of its dirty data on last close. Add a new EXPORT_OP_FLUSH_ON_CLOSE flag to allow filesystems to signal that they do this, and only skip closing files under writeback on such filesystems. Also, remove a redundant NULL file pointer check in nfsd_file_check_writeback, and clean up nfs's export op flag definitions. Signed-off-by: NJeff Layton <jlayton@kernel.org> Acked-by: NAnna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
The last thing that filp_close does is an fput, so don't bother taking and putting the extra reference. Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
David Howells mentioned that he found this bit of code confusing, so sprinkle in some comments to clarify. Reported-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
An error from break_lease is non-fatal, so we needn't destroy the nfsd_file in that case. Just put the reference like we normally would and return the error. Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
test_bit returns bool, so we can just compare the result of that to the key->gc value without the "!!". Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
Since v4 files are expected to be long-lived, there's little value in closing them out of the cache when there is conflicting access. Change the comparator to also match the gc value in the key. Change both of the current users of that key to set the gc value in the key to "true". Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Jeff Layton 提交于
Signed-off-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
-
由 Paolo Abeni 提交于
commit 4bb7aac7 ("net: phy: fix circular LEDS_CLASS dependencies") solved a build failure, but introduces a new config knob with a default 'y' value: PHYLIB_LEDS. The latter is against the current new config policy. The exception was raised to allow the user to catch bad configurations without led support. Anyway the current definition of PHYLIB_LEDS does not fit the above goal: if LEDS_CLASS is disabled, the new config will be available only with PHYLIB disabled, too. Hide the mentioned config, to preserve the randconfig testing done so far, while respecting the mentioned policy. Suggested-by: NAndrew Lunn <andrew@lunn.ch> Suggested-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/d82489be8ed911c383c3447e9abf469995ccf39a.1682496488.git.pabeni@redhat.comSigned-off-by: NPaolo Abeni <pabeni@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net由 Paolo Abeni 提交于
No conflicts. Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
-
- 25 4月, 2023 2 次提交
-
-
由 wuych 提交于
Pointer variables of void * type do not require type cast. Signed-off-by: Nwuych <yunchuan@nfschina.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kuniyuki Iwashima 提交于
syzkaller reported [0] memory leaks of an UDP socket and ZEROCOPY skbs. We can reproduce the problem with these sequences: sk = socket(AF_INET, SOCK_DGRAM, 0) sk.setsockopt(SOL_SOCKET, SO_TIMESTAMPING, SOF_TIMESTAMPING_TX_SOFTWARE) sk.setsockopt(SOL_SOCKET, SO_ZEROCOPY, 1) sk.sendto(b'', MSG_ZEROCOPY, ('127.0.0.1', 53)) sk.close() sendmsg() calls msg_zerocopy_alloc(), which allocates a skb, sets skb->cb->ubuf.refcnt to 1, and calls sock_hold(). Here, struct ubuf_info_msgzc indirectly holds a refcnt of the socket. When the skb is sent, __skb_tstamp_tx() clones it and puts the clone into the socket's error queue with the TX timestamp. When the original skb is received locally, skb_copy_ubufs() calls skb_unclone(), and pskb_expand_head() increments skb->cb->ubuf.refcnt. This additional count is decremented while freeing the skb, but struct ubuf_info_msgzc still has a refcnt, so __msg_zerocopy_callback() is not called. The last refcnt is not released unless we retrieve the TX timestamped skb by recvmsg(). Since we clear the error queue in inet_sock_destruct() after the socket's refcnt reaches 0, there is a circular dependency. If we close() the socket holding such skbs, we never call sock_put() and leak the count, sk, and skb. TCP has the same problem, and commit e0c8bccd ("net: stream: purge sk_error_queue in sk_stream_kill_queues()") tried to fix it by calling skb_queue_purge() during close(). However, there is a small chance that skb queued in a qdisc or device could be put into the error queue after the skb_queue_purge() call. In __skb_tstamp_tx(), the cloned skb should not have a reference to the ubuf to remove the circular dependency, but skb_clone() does not call skb_copy_ubufs() for zerocopy skb. So, we need to call skb_orphan_frags_rx() for the cloned skb to call skb_copy_ubufs(). [0]: BUG: memory leak unreferenced object 0xffff88800c6d2d00 (size 1152): comm "syz-executor392", pid 264, jiffies 4294785440 (age 13.044s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 cd af e8 81 00 00 00 00 ................ 02 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ backtrace: [<0000000055636812>] sk_prot_alloc+0x64/0x2a0 net/core/sock.c:2024 [<0000000054d77b7a>] sk_alloc+0x3b/0x800 net/core/sock.c:2083 [<0000000066f3c7e0>] inet_create net/ipv4/af_inet.c:319 [inline] [<0000000066f3c7e0>] inet_create+0x31e/0xe40 net/ipv4/af_inet.c:245 [<000000009b83af97>] __sock_create+0x2ab/0x550 net/socket.c:1515 [<00000000b9b11231>] sock_create net/socket.c:1566 [inline] [<00000000b9b11231>] __sys_socket_create net/socket.c:1603 [inline] [<00000000b9b11231>] __sys_socket_create net/socket.c:1588 [inline] [<00000000b9b11231>] __sys_socket+0x138/0x250 net/socket.c:1636 [<000000004fb45142>] __do_sys_socket net/socket.c:1649 [inline] [<000000004fb45142>] __se_sys_socket net/socket.c:1647 [inline] [<000000004fb45142>] __x64_sys_socket+0x73/0xb0 net/socket.c:1647 [<0000000066999e0e>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<0000000066999e0e>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<0000000017f238c1>] entry_SYSCALL_64_after_hwframe+0x63/0xcd BUG: memory leak unreferenced object 0xffff888017633a00 (size 240): comm "syz-executor392", pid 264, jiffies 4294785440 (age 13.044s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 2d 6d 0c 80 88 ff ff .........-m..... backtrace: [<000000002b1c4368>] __alloc_skb+0x229/0x320 net/core/skbuff.c:497 [<00000000143579a6>] alloc_skb include/linux/skbuff.h:1265 [inline] [<00000000143579a6>] sock_omalloc+0xaa/0x190 net/core/sock.c:2596 [<00000000be626478>] msg_zerocopy_alloc net/core/skbuff.c:1294 [inline] [<00000000be626478>] msg_zerocopy_realloc+0x1ce/0x7f0 net/core/skbuff.c:1370 [<00000000cbfc9870>] __ip_append_data+0x2adf/0x3b30 net/ipv4/ip_output.c:1037 [<0000000089869146>] ip_make_skb+0x26c/0x2e0 net/ipv4/ip_output.c:1652 [<00000000098015c2>] udp_sendmsg+0x1bac/0x2390 net/ipv4/udp.c:1253 [<0000000045e0e95e>] inet_sendmsg+0x10a/0x150 net/ipv4/af_inet.c:819 [<000000008d31bfde>] sock_sendmsg_nosec net/socket.c:714 [inline] [<000000008d31bfde>] sock_sendmsg+0x141/0x190 net/socket.c:734 [<0000000021e21aa4>] __sys_sendto+0x243/0x360 net/socket.c:2117 [<00000000ac0af00c>] __do_sys_sendto net/socket.c:2129 [inline] [<00000000ac0af00c>] __se_sys_sendto net/socket.c:2125 [inline] [<00000000ac0af00c>] __x64_sys_sendto+0xe1/0x1c0 net/socket.c:2125 [<0000000066999e0e>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<0000000066999e0e>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<0000000017f238c1>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: f214f915 ("tcp: enable MSG_ZEROCOPY") Fixes: b5947e5d ("udp: msg_zerocopy") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NKuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-