提交 · 4fdccaa0d184c202f98d73b24e3ec8eeee88ab8d · openeuler / Kernel

24 10月, 2021 4 次提交

iomap: Add done_before argument to iomap_dio_rw · 4fdccaa0

由 Andreas Gruenbacher 提交于 7月 24, 2021

Add a done_before argument to iomap_dio_rw that indicates how much of
the request has already been transferred.  When the request succeeds, we
report that done_before additional bytes were tranferred.  This is
useful for finishing a request asynchronously when part of the request
has already been completed synchronously.

We'll use that to allow iomap_dio_rw to be used with page faults
disabled: when a page fault occurs while submitting a request, we
synchronously complete the part of the request that has already been
submitted.  The caller can then take care of the page fault and call
iomap_dio_rw again for the rest of the request, passing in the number of
bytes already tranferred.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>

4fdccaa0

iomap: Support partial direct I/O on user copy failures · 97308f8b

由 Andreas Gruenbacher 提交于 7月 23, 2021

In iomap_dio_rw, when iomap_apply returns an -EFAULT error and the
IOMAP_DIO_PARTIAL flag is set, complete the request synchronously and
return a partial result.  This allows the caller to deal with the page
fault and retry the remainder of the request.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>

97308f8b

iomap: Fix iomap_dio_rw return value for user copies · 42c498c1

由 Andreas Gruenbacher 提交于 7月 15, 2021

When a user copy fails in one of the helpers of iomap_dio_rw, fail with
-EFAULT instead of returning 0.  This matches what iomap_dio_bio_actor
returns when it gets an -EFAULT from bio_iov_iter_get_pages.  With these
changes, iomap_dio_actor now consistently fails with -EFAULT when a user
page cannot be faulted in.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: NDarrick J. Wong <djwong@kernel.org>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

42c498c1

gfs2: Fix mmap + page fault deadlocks for buffered I/O · 00bfe02f

由 Andreas Gruenbacher 提交于 10月 18, 2021

In the .read_iter and .write_iter file operations, we're accessing
user-space memory while holding the inode glock.  There is a possibility
that the memory is mapped to the same file, in which case we'd recurse
on the same glock.

We could detect and work around this simple case of recursive locking,
but more complex scenarios exist that involve multiple glocks,
processes, and cluster nodes, and working around all of those cases
isn't practical or even possible.

Avoid these kinds of problems by disabling page faults while holding the
inode glock.  If a page fault would occur, we either end up with a
partial read or write or with -EFAULT if nothing could be read or
written.  In either case, we know that we're not done with the
operation, so we indicate that we're willing to give up the inode glock
and then we fault in the missing pages.  If that made us lose the inode
glock, we return a partial read or write.  Otherwise, we resume the
operation.

This locking problem was originally reported by Jan Kara.  Linus came up
with the idea of disabling page faults.  Many thanks to Al Viro and
Matthew Wilcox for their feedback.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>

00bfe02f

21 10月, 2021 5 次提交

gfs2: Eliminate ip->i_gh · 1b223f70

由 Andreas Gruenbacher 提交于 8月 25, 2021

Now that gfs2_file_buffered_write is the only remaining user of
ip->i_gh, we can move the glock holder to the stack (or rather, use the
one we already have on the stack); there is no need for keeping the
holder in the inode anymore.

This is slightly complicated by the fact that we're using ip->i_gh for
the statfs inode in gfs2_file_buffered_write as well.  Writing to the
statfs inode isn't very common, so allocate the statfs holder
dynamically when needed.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>

1b223f70

gfs2: Move the inode glock locking to gfs2_file_buffered_write · b924bdab

由 Andreas Gruenbacher 提交于 8月 11, 2021

So far, for buffered writes, we were taking the inode glock in
gfs2_iomap_begin and dropping it in gfs2_iomap_end with the intention of
not holding the inode glock while iomap_write_actor faults in user
pages.  It turns out that iomap_write_actor is called inside iomap_begin
... iomap_end, so the user pages were still faulted in while holding the
inode glock and the locking code in iomap_begin / iomap_end was
completely pointless.

Move the locking into gfs2_file_buffered_write instead.  We'll take care
of the potential deadlocks due to faulting in user pages while holding a
glock in a subsequent patch.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>

b924bdab

gfs2: Introduce flag for glock holder auto-demotion · dc732906

由 Bob Peterson 提交于 8月 19, 2021

This patch introduces a new HIF_MAY_DEMOTE flag and infrastructure that
will allow glocks to be demoted automatically on locking conflicts.
When a locking request comes in that isn't compatible with the locking
state of an active holder and that holder has the HIF_MAY_DEMOTE flag
set, the holder will be demoted before the incoming locking request is
granted.

Note that this mechanism demotes active holders (with the HIF_HOLDER
flag set), while before we were only demoting glocks without any active
holders.  This allows processes to keep hold of locks that may form a
cyclic locking dependency; the core glock logic will then break those
dependencies in case a conflicting locking request occurs.  We'll use
this to avoid giving up the inode glock proactively before faulting in
pages.

Processes that allow a glock holder to be taken away indicate this by
calling gfs2_holder_allow_demote(), which sets the HIF_MAY_DEMOTE flag.
Later, they call gfs2_holder_disallow_demote() to clear the flag again,
and then they check if their holder is still queued: if it is, they are
still holding the glock; if it isn't, they can re-acquire the glock (or
abort).
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>

dc732906

gfs2: Clean up function may_grant · 61444649

由 Andreas Gruenbacher 提交于 8月 10, 2021

Pass the first current glock holder into function may_grant and
deobfuscate the logic there.

While at it, switch from BUG_ON to GLOCK_BUG_ON in may_grant.  To make
that build cleanly, de-constify the may_grant arguments.

We're now using function find_first_holder in do_promote, so move the
function's definition above do_promote.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>

61444649

gfs2: Add wrapper for iomap_file_buffered_write · 2eb7509a

由 Andreas Gruenbacher 提交于 5月 13, 2021

Add a wrapper around iomap_file_buffered_write.  We'll add code for when
the operation needs to be retried here later.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>

2eb7509a

18 10月, 2021 2 次提交

iov_iter: Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable · a6294593

由 Andreas Gruenbacher 提交于 8月 02, 2021

Turn iov_iter_fault_in_readable into a function that returns the number
of bytes not faulted in, similar to copy_to_user, instead of returning a
non-zero value when any of the requested pages couldn't be faulted in.
This supports the existing users that require all pages to be faulted in
as well as new users that are happy if any pages can be faulted in.

Rename iov_iter_fault_in_readable to fault_in_iov_iter_readable to make
sure this change doesn't silently break things.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>

a6294593

gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable} · bb523b40

由 Andreas Gruenbacher 提交于 8月 02, 2021

Turn fault_in_pages_{readable,writeable} into versions that return the
number of bytes not faulted in, similar to copy_to_user, instead of
returning a non-zero value when any of the requested pages couldn't be
faulted in.  This supports the existing users that require all pages to
be faulted in as well as new users that are happy if any pages can be
faulted in.

Rename the functions to fault_in_{readable,writeable} to make sure
this change doesn't silently break things.

Neither of these functions is entirely trivial and it doesn't seem
useful to inline them, so move them to mm/gup.c.
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>

bb523b40

07 10月, 2021 6 次提交

ksmbd: fix oops from fuse driver · 64e78755

由 Namjae Jeon 提交于 10月 03, 2021

Marios reported kernel oops from fuse driver when ksmbd call
mark_inode_dirty(). This patch directly update ->i_ctime after removing
mark_inode_ditry() and notify_change will put inode to dirty list.

Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Cc: Hyunchul Lee <hyc.lee@gmail.com>
Reported-by: NMarios Makassikis <mmakassikis@freebox.fr>
Tested-by: NMarios Makassikis <mmakassikis@freebox.fr>
Acked-by: NHyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

64e78755

ksmbd: fix version mismatch with out of tree · 2db72604

由 Namjae Jeon 提交于 10月 01, 2021

Fix version mismatch with out of tree, This updated version will be
matched with ksmbd-tools.

Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

2db72604

ksmbd: use buf_data_size instead of recalculation in smb3_decrypt_req() · c7705eec

由 Namjae Jeon 提交于 10月 04, 2021

Tom suggested to use buf_data_size that is already calculated, to verify
these offsets.

Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Suggested-by: NTom Talpey <tom@talpey.com>
Acked-by: NHyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

c7705eec

ksmbd: remove the leftover of smb2.0 dialect support · 51a13873

由 Namjae Jeon 提交于 9月 29, 2021

Although ksmbd doesn't send SMB2.0 support in supported dialect list of smb
negotiate response, There is the leftover of smb2.0 dialect.
This patch remove it not to support SMB2.0 in ksmbd.

Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Cc: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

51a13873

ksmbd: check strictly data area in ksmbd_smb2_check_message() · c2e99d47

由 Namjae Jeon 提交于 9月 26, 2021

When invalid data offset and data length in request,
ksmbd_smb2_check_message check strictly and doesn't allow to process such
requests.

Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Acked-by: NHyunchul Lee <hyc.lee@gmail.com>
Reviewed-by: NRalph Boehme <slow@samba.org>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

c2e99d47

NFSD: Keep existing listeners on portlist error · c2010694

由 Benjamin Coddington 提交于 10月 06, 2021

If nfsd has existing listening sockets without any processes, then an error
returned from svc_create_xprt() for an additional transport will remove
those existing listeners. We're seeing this in practice when userspace
attempts to create rpcrdma transports without having the rpcrdma modules
present before creating nfsd kernel processes. Fix this by checking for
existing sockets before calling nfsd_destroy().
Signed-off-by: NBenjamin Coddington <bcodding@redhat.com>
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>

c2010694

06 10月, 2021 1 次提交

ksmbd: add the check to vaildate if stream protocol length exceeds maximum value · 36399990

由 Namjae Jeon 提交于 9月 24, 2021

This patch add MAX_STREAM_PROT_LEN macro and check if stream protocol
length exceeds maximum value. opencode pdu size check in
ksmbd_pdu_size_has_room().

Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Acked-by: NHyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

36399990

05 10月, 2021 7 次提交

afs: Fix afs_launder_page() to set correct start file position · 5c052248

由 David Howells 提交于 8月 12, 2021

Fix afs_launder_page() to set the starting position of the StoreData RPC at
the offset into the page at which the modified data starts instead of at
the beginning of the page (the iov_iter is correctly offset).

The offset got lost during the conversion to passing an iov_iter into
afs_store_data().

Changes:
ver #2:
 - Use page_offset() rather than manually calculating it[1].

Fixes: bd80d8a8 ("afs: Use ITER_XARRAY for writing")
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NJeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/YST/0e92OdSH0zjg@casper.infradead.org/ [1]
Link: https://lore.kernel.org/r/162880783179.3421678.7795105718190440134.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/162937512409.1449272.18441473411207824084.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/162981148752.1901565.3663780601682206026.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/163005741670.2472992.2073548908229887941.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163221839087.3143591.14278359695763025231.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163292980654.4004896.7134735179887998551.stgit@warthog.procyon.org.uk/ # v2

5c052248

netfs: Fix READ/WRITE confusion when calling iov_iter_xarray() · 330de47d

由 David Howells 提交于 7月 26, 2021

Fix netfs_clear_unread() to pass READ to iov_iter_xarray() instead of WRITE
(the flag is about the operation accessing the buffer, not what sort of
access it is doing to the buffer).

Fixes: 3d3c9504 ("netfs: Provide readahead and readpage netfs helpers")
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
cc: linux-afs@lists.infradead.org
cc: ceph-devel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: linux-nfs@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/162729351325.813557.9242842205308443901.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/162886603464.3940407.3790841170414793899.stgit@warthog.procyon.org.uk
Link: https://lore.kernel.org/r/163239074602.1243337.14154704004485867017.stgit@warthog.procyon.org.uk

330de47d

fscache: Remove an unused static variable · ef31499a

由 David Howells 提交于 9月 20, 2021

The fscache object CREATE_OBJECT work state isn't ever referred to, so
remove it and avoid the unused variable warning caused by W=1.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
cc: linux-fsdevel@vger.kernel.org
cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/163214005516.2945267.7000234432243167892.stgit@warthog.procyon.org.uk/ # rfc v1
Link: https://lore.kernel.org/r/163281899704.2790286.9177774252843775348.stgit@warthog.procyon.org.uk/ # rfc v2

ef31499a

fscache: Fix some kerneldoc warnings shown up by W=1 · d9e3f822

由 David Howells 提交于 10月 04, 2021

Fix some kerneldoc warnings in the fscache driver that are shown up by W=1.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
cc: Mauro Carvalho Chehab <mchehab@kernel.org>
cc: linux-fsdevel@vger.kernel.org
cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/163214005516.2945267.7000234432243167892.stgit@warthog.procyon.org.uk/ # rfc v1
Link: https://lore.kernel.org/r/163281899704.2790286.9177774252843775348.stgit@warthog.procyon.org.uk/ # rfc v2

d9e3f822

9p: Fix a bunch of kerneldoc warnings shown up by W=1 · bc868036

由 David Howells 提交于 10月 04, 2021

Fix a bunch of kerneldoc warnings shown up by W=1 in the 9p filesystem:

 (1) Add/remove/fix kerneldoc parameters descriptions.

 (2) Move __add_fid() from between v9fs_fid_add() and its comment.

 (3) 9p's caches_show() doesn't really make sense as an API function, so
     remove the kerneldoc annotation.  It's also not prefixed with 'v9fs_'.
     Also remove the kerneldoc markers from the 9p fscache wrappers.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NDominique Martinet <asmadeus@codewreck.org>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
cc: Mauro Carvalho Chehab <mchehab@kernel.org>
cc: v9fs-developer@lists.sourceforge.net
cc: linux-fsdevel@vger.kernel.org
cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/163214005516.2945267.7000234432243167892.stgit@warthog.procyon.org.uk/ # rfc v1
Link: https://lore.kernel.org/r/163281899704.2790286.9177774252843775348.stgit@warthog.procyon.org.uk/ # rfc v2

bc868036

afs: Fix kerneldoc warning shown up by W=1 · dcb442b1

由 David Howells 提交于 10月 04, 2021

Fix a kerneldoc warning in afs due to a partially documented internal
function by removing the kerneldoc marker.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/163214005516.2945267.7000234432243167892.stgit@warthog.procyon.org.uk/ # rfc v1
Link: https://lore.kernel.org/r/163281899704.2790286.9177774252843775348.stgit@warthog.procyon.org.uk/ # rfc v2

dcb442b1

nfs: Fix kerneldoc warning shown up by W=1 · c0b27c48

由 David Howells 提交于 10月 04, 2021

Fix a kerneldoc warning in nfs due to documentation for a parameter that
isn't present.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna.schumaker@netapp.com>
cc: Mauro Carvalho Chehab <mchehab@kernel.org>
cc: linux-nfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/163214005516.2945267.7000234432243167892.stgit@warthog.procyon.org.uk/ # rfc v1
Link: https://lore.kernel.org/r/163281899704.2790286.9177774252843775348.stgit@warthog.procyon.org.uk/ # rfc v2

c0b27c48

04 10月, 2021 1 次提交

elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings · 9b2f72cc

由 Chen Jingwen 提交于 9月 28, 2021

In commit b212921b ("elf: don't use MAP_FIXED_NOREPLACE for elf
executable mappings") we still leave MAP_FIXED_NOREPLACE in place for
load_elf_interp.

Unfortunately, this will cause kernel to fail to start with:

    1 (init): Uhuuh, elf segment at 00003ffff7ffd000 requested but the memory is mapped already
    Failed to execute /init (error -17)

The reason is that the elf interpreter (ld.so) has overlapping segments.

  readelf -l ld-2.31.so
  Program Headers:
    Type           Offset             VirtAddr           PhysAddr
                   FileSiz            MemSiz              Flags  Align
    LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                   0x000000000002c94c 0x000000000002c94c  R E    0x10000
    LOAD           0x000000000002dae0 0x000000000003dae0 0x000000000003dae0
                   0x00000000000021e8 0x0000000000002320  RW     0x10000
    LOAD           0x000000000002fe00 0x000000000003fe00 0x000000000003fe00
                   0x00000000000011ac 0x0000000000001328  RW     0x10000

The reason for this problem is the same as described in commit
ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments").

Not only executable binaries, elf interpreters (e.g. ld.so) can have
overlapping elf segments, so we better drop MAP_FIXED_NOREPLACE and go
back to MAP_FIXED in load_elf_interp.

Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map")
Cc: <stable@vger.kernel.org> # v4.19
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: NChen Jingwen <chenjingwen6@huawei.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9b2f72cc

02 10月, 2021 1 次提交

io_uring: kill fasync · 3f008385

由 Pavel Begunkov 提交于 10月 01, 2021

We have never supported fasync properly, it would only fire when there
is something polling io_uring making it useless. The original support came
in through the initial io_uring merge for 5.1. Since it's broken and
nobody has reported it, get rid of the fasync bits.
Signed-off-by: NPavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2f7ca3d344d406d34fa6713824198915c41cea86.1633080236.git.asml.silence@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

3f008385

01 10月, 2021 9 次提交

nfsd: Fix a warning for nfsd_file_close_inode · 19598141

由 Trond Myklebust 提交于 9月 30, 2021

Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>

19598141

ext4: recheck buffer uptodate bit under buffer lock · f2c77973

由 Zhang Yi 提交于 9月 10, 2021

Commit 8e33fadf ("ext4: remove an unnecessary if statement in
__ext4_get_inode_loc()") forget to recheck buffer's uptodate bit again
under buffer lock, which may overwrite the buffer if someone else have
already brought it uptodate and changed it.

Fixes: 8e33fadf ("ext4: remove an unnecessary if statement in __ext4_get_inode_loc()")
Cc: stable@kernel.org
Signed-off-by: NZhang Yi <yi.zhang@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210910080316.70421-1-yi.zhang@huawei.com

f2c77973

ext4: fix potential infinite loop in ext4_dx_readdir() · 42cb4474

由 yangerkun 提交于 9月 14, 2021

When ext4_htree_fill_tree() fails, ext4_dx_readdir() can run into an
infinite loop since if info->last_pos != ctx->pos this will reset the
directory scan and reread the failing entry.  For example:

1. a dx_dir which has 3 block, block 0 as dx_root block, block 1/2 as
   leaf block which own the ext4_dir_entry_2
2. block 1 read ok and call_filldir which will fill the dirent and update
   the ctx->pos
3. block 2 read fail, but we has already fill some dirent, so we will
   return back to userspace will a positive return val(see ksys_getdents64)
4. the second ext4_dx_readdir will reset the world since info->last_pos
   != ctx->pos, and will also init the curr_hash which pos to block 1
5. So we will read block1 too, and once block2 still read fail, we can
   only fill one dirent because the hash of the entry in block1(besides
   the last one) won't greater than curr_hash
6. this time, we forget update last_pos too since the read for block2
   will fail, and since we has got the one entry, ksys_getdents64 can
   return success
7. Latter we will trapped in a loop with step 4~6

Cc: stable@kernel.org
Signed-off-by: Nyangerkun <yangerkun@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210914111415.3921954-1-yangerkun@huawei.com

42cb4474

ext4: flush s_error_work before journal destroy in ext4_fill_super · bb9464e0

由 yangerkun 提交于 9月 24, 2021

The error path in ext4_fill_super forget to flush s_error_work before
journal destroy, and it may trigger the follow bug since
flush_stashed_error_work can run concurrently with journal destroy
without any protection for sbi->s_journal.

[32031.740193] EXT4-fs (loop66): get root inode failed
[32031.740484] EXT4-fs (loop66): mount failed
[32031.759805] ------------[ cut here ]------------
[32031.759807] kernel BUG at fs/jbd2/transaction.c:373!
[32031.760075] invalid opcode: 0000 [#1] SMP PTI
[32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded
4.18.0
[32031.765112] Call Trace:
[32031.765375]  ? __switch_to_asm+0x35/0x70
[32031.765635]  ? __switch_to_asm+0x41/0x70
[32031.765893]  ? __switch_to_asm+0x35/0x70
[32031.766148]  ? __switch_to_asm+0x41/0x70
[32031.766405]  ? _cond_resched+0x15/0x40
[32031.766665]  jbd2__journal_start+0xf1/0x1f0 [jbd2]
[32031.766934]  jbd2_journal_start+0x19/0x20 [jbd2]
[32031.767218]  flush_stashed_error_work+0x30/0x90 [ext4]
[32031.767487]  process_one_work+0x195/0x390
[32031.767747]  worker_thread+0x30/0x390
[32031.768007]  ? process_one_work+0x390/0x390
[32031.768265]  kthread+0x10d/0x130
[32031.768521]  ? kthread_flush_work_fn+0x10/0x10
[32031.768778]  ret_from_fork+0x35/0x40

static int start_this_handle(...)
    BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this

Besides, after we enable fast commit, ext4_fc_replay can add work to
s_error_work but return success, so the latter journal destroy in
ext4_load_journal can trigger this problem too.

Fix this problem with two steps:
1. Call ext4_commit_super directly in ext4_handle_error for the case
   that called from ext4_fc_replay
2. Since it's hard to pair the init and flush for s_error_work, we'd
   better add a extras flush_work before journal destroy in
   ext4_fill_super

Besides, this patch will call ext4_commit_super in ext4_handle_error for
any nojournal case too. But it seems safe since the reason we call
schedule_work was that we should save error info to sb through journal
if available. Conversely, for the nojournal case, it seems useless delay
commit superblock to s_error_work.

Fixes: c92dc856 ("ext4: defer saving error info from atomic context")
Fixes: 2d01ddc8 ("ext4: save error info to sb through journal if available")
Cc: stable@kernel.org
Signed-off-by: Nyangerkun <yangerkun@huawei.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210924093917.1953239-1-yangerkun@huawei.com

bb9464e0

ext4: fix loff_t overflow in ext4_max_bitmap_size() · 75ca6ad4

由 Ritesh Harjani 提交于 6月 05, 2021

We should use unsigned long long rather than loff_t to avoid
overflow in ext4_max_bitmap_size() for comparison before returning.
w/o this patch sbi->s_bitmap_maxbytes was becoming a negative
value due to overflow of upper_limit (with has_huge_files as true)

Below is a quick test to trigger it on a 64KB pagesize system.

sudo mkfs.ext4 -b 65536 -O ^has_extents,^64bit /dev/loop2
sudo mount /dev/loop2 /mnt
sudo echo "hello" > /mnt/hello 	-> This will error out with
				"echo: write error: File too large"
Signed-off-by: NRitesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/594f409e2c543e90fd836b78188dfa5c575065ba.1622867594.git.riteshh@linux.ibm.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>

75ca6ad4

ext4: fix reserved space counter leakage · 6fed8395

由 Jeffle Xu 提交于 8月 23, 2021

When ext4_insert_delayed block receives and recovers from an error from
ext4_es_insert_delayed_block(), e.g., ENOMEM, it does not release the
space it has reserved for that block insertion as it should. One effect
of this bug is that s_dirtyclusters_counter is not decremented and
remains incorrectly elevated until the file system has been unmounted.
This can result in premature ENOSPC returns and apparent loss of free
space.

Another effect of this bug is that
/sys/fs/ext4/<dev>/delayed_allocation_blocks can remain non-zero even
after syncfs has been executed on the filesystem.

Besides, add check for s_dirtyclusters_counter when inode is going to be
evicted and freed. s_dirtyclusters_counter can still keep non-zero until
inode is written back in .evict_inode(), and thus the check is delayed
to .destroy_inode().

Fixes: 51865fda ("ext4: let ext4 maintain extent status tree")
Cc: stable@kernel.org
Suggested-by: NGao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: NJeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: NEric Whitney <enwlinux@gmail.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210823061358.84473-1-jefflexu@linux.alibaba.com

6fed8395

ext4: limit the number of blocks in one ADD_RANGE TLV · a2c2f082

由 Hou Tao 提交于 8月 20, 2021

Now EXT4_FC_TAG_ADD_RANGE uses ext4_extent to track the
newly-added blocks, but the limit on the max value of
ee_len field is ignored, and it can lead to BUG_ON as
shown below when running command "fallocate -l 128M file"
on a fast_commit-enabled fs:

  kernel BUG at fs/ext4/ext4_extents.h:199!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 3 PID: 624 Comm: fallocate Not tainted 5.14.0-rc6+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  RIP: 0010:ext4_fc_write_inode_data+0x1f3/0x200
  Call Trace:
   ? ext4_fc_write_inode+0xf2/0x150
   ext4_fc_commit+0x93b/0xa00
   ? ext4_fallocate+0x1ad/0x10d0
   ext4_sync_file+0x157/0x340
   ? ext4_sync_file+0x157/0x340
   vfs_fsync_range+0x49/0x80
   do_fsync+0x3d/0x70
   __x64_sys_fsync+0x14/0x20
   do_syscall_64+0x3b/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Simply fixing it by limiting the number of blocks
in one EXT4_FC_TAG_ADD_RANGE TLV.

Fixes: aa75f4d3 ("ext4: main fast-commit commit path")
Cc: stable@kernel.org
Signed-off-by: NHou Tao <houtao1@huawei.com>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210820044505.474318-1-houtao1@huawei.com

a2c2f082

ksmbd: missing check for NULL in convert_to_nt_pathname() · 87ffb310

由 Dan Carpenter 提交于 9月 30, 2021

The kmalloc() does not have a NULL check.  This code can be re-written
slightly cleaner to just use the kstrdup().

Fixes: 265fd199 ("ksmbd: use LOOKUP_BENEATH to prevent the out of share access")
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Acked-by: NNamjae Jeon <linkinjeon@kernel.org>
Acked-by: NHyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: NSteve French <stfrench@microsoft.com>

87ffb310

nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero · f2e717d6

由 Trond Myklebust 提交于 9月 30, 2021

RFC3530 notes that the 'dircount' field may be zero, in which case the
recommendation is to ignore it, and only enforce the 'maxcount' field.
In RFC5661, this recommendation to ignore a zero valued field becomes a
requirement.

Fixes: aee37764 ("nfsd4: fix rd_dircount enforcement")
Cc: <stable@vger.kernel.org>
Signed-off-by: NTrond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>

f2e717d6

30 9月, 2021 4 次提交

nfsd: fix error handling of register_pernet_subsys() in init_nfsd() · 1d625050

由 Patrick Ho 提交于 8月 21, 2021

init_nfsd() should not unregister pernet subsys if the register fails
but should instead unwind from the last successful operation which is
register_filesystem().

Unregistering a failed register_pernet_subsys() call can result in
a kernel GPF as revealed by programmatically injecting an error in
register_pernet_subsys().

Verified the fix handled failure gracefully with no lingering nfsd
entry in /proc/filesystems.  This change was introduced by the commit
bd5ae928 ("nfsd: register pernet ops last, unregister first"),
the original error handling logic was correct.

Fixes: bd5ae928 ("nfsd: register pernet ops last, unregister first")
Cc: stable@vger.kernel.org
Signed-off-by: NPatrick Ho <Patrick.Ho@netapp.com>
Acked-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>

1d625050

ksmbd: fix transform header validation · 4227f811

由 Namjae Jeon 提交于 9月 29, 2021

Validate that the transform and smb request headers are present
before checking OriginalMessageSize and SessionId fields.

Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: NTom Talpey <tom@talpey.com>
Acked-by: NHyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

4227f811

ksmbd: add buffer validation for SMB2_CREATE_CONTEXT · 8f77150c

由 Hyunchul Lee 提交于 9月 24, 2021

Add buffer validation for SMB2_CREATE_CONTEXT.

Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Reviewed-by: NRalph Boehme <slow@samba.org>
Signed-off-by: NHyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

8f77150c

ksmbd: add validation in smb2 negotiate · 442ff9eb

由 Namjae Jeon 提交于 9月 29, 2021

This patch add validation to check request buffer check in smb2
negotiate and fix null pointer deferencing oops in smb3_preauth_hash_rsp()
that found from manual test.

Cc: Tom Talpey <tom@talpey.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Ralph Böhme <slow@samba.org>
Cc: Hyunchul Lee <hyc.lee@gmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: NRalph Boehme <slow@samba.org>
Signed-off-by: NNamjae Jeon <linkinjeon@kernel.org>
Signed-off-by: NSteve French <stfrench@microsoft.com>

442ff9eb

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功