Commits · 996bb9fddd5b68d1dfb5e27d30ca2c7a72448596 · openeuler / Kernel

04 Apr, 2013 10 commits

ext4: support simple conversion of extent-mapped inodes to use i_blocks · 996bb9fd

Committed by Theodore Ts'o on Apr 03, 2013

In order to make it simpler to test the code which supports
i_blocks/indirect-mapped inodes, support the conversion of inodes
which are less than 12 blocks and which are contained in no more than
a single extent.

The primary intended use of this code is to convert freshly created
zero-length files and empty directories.

Note that the version of chattr in e2fsprogs 1.42.7 and earlier has a
check that prevents the clearing of the extent flag.  A simple patch
which allows "chattr -e <file>" to work will be checked into the
e2fsprogs git repository.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

996bb9fd

ext4/jbd2: don't wait (forever) for stale tid caused by wraparound · d76a3a77

Committed by Theodore Ts'o on Apr 03, 2013

In the case where an inode has a very stale transaction id (tid) in
i_datasync_tid or i_sync_tid, it's possible that after a very large
(2**31) number of transactions, the tid number space might wrap,
causing tid_geq()'s calculations to fail.

Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified
by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily",
attempted to fix this problem, but it only avoided kjournald spinning
forever by fixing the logic in jbd2_log_start_commit().

Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c
that might call jbd2_log_start_commit() with a stale tid, those
functions will subsequently call jbd2_log_wait_commit() with the same
stale tid, and then wait for a very long time.  To fix this, we
replace the calls to jbd2_log_start_commit() and
jbd2_log_wait_commit() with a call to a new function,
jbd2_complete_transaction(), which will correctly handle stale tid's.

As a bonus, jbd2_complete_transaction() will avoid locking
j_state_lock for writing unless a commit needs to be started.  This
should have a small (but probably not measurable) improvement for
ext4's scalability.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Reported-by: George Barnett <gbarnett@atlassian.com>
Cc: stable@vger.kernel.org

d76a3a77
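
The wraparound problem described in this commit can be illustrated with a small user-space sketch. It assumes 32-bit transaction IDs compared through a signed difference, which is the usual idiom for wrapping sequence numbers; needs_wait() is a hypothetical helper for illustration and is not the kernel's jbd2_complete_transaction().

```c
#include <stdint.h>

typedef uint32_t tid_t;   /* assumption: transaction IDs are 32-bit and wrap */

/* Wraparound-aware "x is the same as or newer than y".
 * Only meaningful while the two IDs are less than 2^31 apart.
 * The unsigned-to-signed cast assumes two's complement, as on all
 * common compilers. */
static int tid_geq(tid_t x, tid_t y)
{
    int32_t difference = (int32_t)(tid_t)(x - y);
    return difference >= 0;
}

/* Hypothetical helper: would a waiter for `tid` keep waiting, given the
 * ID of the most recently committed transaction? */
static int needs_wait(tid_t committed, tid_t tid)
{
    return !tid_geq(committed, tid);
}
```

With a tid that is more than 2^31 transactions behind the committed ID, the signed difference turns negative, so needs_wait() reports that the caller must still wait even though the transaction finished long ago. That is the stale-tid case the commit message describes.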

ext4: add might_sleep() annotations · b10a44c3

Committed by Theodore Ts'o on Apr 03, 2013

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>

b10a44c3

ext4: add mutex_is_locked() assertion to ext4_truncate() · 19b5ef61

Committed by Theodore Ts'o on Apr 03, 2013

[ Added fixup from Lukáš Czerner which only checks the assertion when
  the inode is not new and is not being freed. ]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

19b5ef61

ext4: refactor truncate code · 819c4920

Committed by Theodore Ts'o on Apr 03, 2013

Move common code in ext4_ind_truncate() and ext4_ext_truncate() into
ext4_truncate().  This saves over 60 lines of code.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

819c4920

ext4: refactor punch hole code · 26a4c0c6

Committed by Theodore Ts'o on Apr 03, 2013

Move common code in ext4_ind_punch_hole() and ext4_ext_punch_hole()
into ext4_punch_hole(). This saves over 150 lines of code.

This also fixes a potential bug when the punch_hole() code is racing
against indirect-to-extents or extents-to-indirect migration. We are
currently using i_mutex to protect against changes to the inode flag;
specifically, the append-only, immutable, and extents inode flags. So
we need to take i_mutex before deciding whether to use the
extents-specific or indirect-specific punch_hole code.

Also, there was a missing call to ext4_inode_block_unlocked_dio() in
the indirect punch codepath. This was added in commit 02d262df
to block DIO readers racing against the punch operation in the
codepath for extent-mapped inodes, but it was missing for
indirect-block mapped inodes. One of the advantages of refactoring
the code is that it makes such oversights much less likely.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

26a4c0c6

ext4: fold ext4_alloc_blocks() in ext4_alloc_branch() · 781f143e

Committed by Theodore Ts'o on Apr 03, 2013

The older code was far more complicated than it needed to be because
of how we spliced ext4's new multiblock allocator into ext3's
indirect block code. By folding ext4_alloc_blocks() into
ext4_alloc_branch(), we make the code far more understandable, shave off
over 130 lines of code and half a kilobyte of compiled object code.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

781f143e

ext4: fold ext4_generic_write_end() into ext4_write_end() · eed4333f

Committed by Zheng Liu on Apr 03, 2013

After collapsing the handling of data ordered and data writeback
codepath, ext4_generic_write_end() has only one caller,
ext4_write_end().  So we fold it into ext4_write_end().
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>

eed4333f

ext4: collapse handling of data=ordered and data=writeback codepaths · 74d553aa

Committed by Theodore Ts'o on Apr 03, 2013

The only difference between how we handle data=ordered and
data=writeback is a single call to ext4_jbd2_file_inode().  Eliminate
code duplication by factoring out the redundant code paths.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>

74d553aa

ext4: fix big-endian bugs which could cause fs corruptions · 8cde7ad1

Committed by Zheng Liu on Apr 03, 2013

When an extent was zeroed out, we forgot to convert from cpu to le16.
It could make us hit a BUG_ON when we try to write dirty pages out.  So
fix it.

[ Also fix a bug found by Dmitry Monakhov where we were missing
  le32_to_cpu() calls in the new indirect punch hole code.

  There are a number of other big endian warnings found by static code
  analyzers, but we'll wait for the next merge window to fix them all
  up.  These fixes are designed to be Obviously Correct by code
  inspection, and easy to demonstrate that it won't make any
  difference (and hence, won't introduce any bugs) on little endian
  architectures such as x86.  --tytso ]
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>

8cde7ad1
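
The class of bug fixed here, storing a CPU-order value into a little-endian on-disk field, can be sketched in portable user-space C. The helpers to_le16() and from_le16() and the struct disk_extent layout are hypothetical stand-ins for illustration; they are not the kernel's cpu_to_le16()/le16_to_cpu() or ext4's real extent structure.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cpu_to_le16(): returns a value whose in-memory
 * bytes are the little-endian encoding of v, whatever the host byte order. */
static uint16_t to_le16(uint16_t v)
{
    unsigned char b[2] = { (unsigned char)(v & 0xff), (unsigned char)(v >> 8) };
    uint16_t out;
    memcpy(&out, b, sizeof(out));
    return out;
}

/* Hypothetical stand-in for le16_to_cpu(): decodes the stored bytes. */
static uint16_t from_le16(uint16_t le)
{
    unsigned char b[2];
    memcpy(b, &le, sizeof(b));
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Hypothetical on-disk record with a little-endian length field. */
struct disk_extent {
    uint16_t len_le;
};

/* Always convert on the way in and on the way out. */
static void set_len(struct disk_extent *e, uint16_t len)
{
    e->len_le = to_le16(len);
}

static uint16_t get_len(const struct disk_extent *e)
{
    return from_le16(e->len_le);
}
```

On a little-endian host both helpers are effectively no-ops, which is why a missing conversion goes unnoticed on x86 and only shows up on big-endian machines.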

29 Mar, 2013 1 commit

Btrfs: don't drop path when printing out tree errors in scrub · d8fe29e9

Committed by Josef Bacik on Mar 29, 2013

A user reported a panic where we were panicking somewhere in
tree_backref_for_extent from scrub_print_warning.  He only captured the trace
but looking at scrub_print_warning we drop the path right before we mess with
the extent buffer to print out a bunch of stuff, which isn't right.  So fix this
by dropping the path after we use the eb if we need to.  Thanks,

Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

d8fe29e9

28 Mar, 2013 9 commits

Btrfs: fix wrong return value of btrfs_lookup_csum() · 82d130ff

Committed by Miao Xie on Mar 28, 2013

If we don't find the expected csum item, but find a csum item which is
adjacent to the specified extent, we should return -EFBIG; otherwise we should
return -ENOENT. But btrfs_lookup_csum() returns -EFBIG even if the csum item
is not adjacent to the specified extent. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

82d130ff
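
The intended return-value rule can be modelled with a small sketch. It assumes a simplified picture in which a csum item covers `count` consecutive blocks starting at block `first`; the function classify_csum_lookup() and this block-indexed model are hypothetical and do not reproduce btrfs_lookup_csum() itself.

```c
#include <errno.h>

/* Hypothetical model: the nearest csum item covers blocks
 * [first, first + count). Classify a lookup for block `wanted`. */
static int classify_csum_lookup(unsigned long first, unsigned long count,
                                unsigned long wanted)
{
    unsigned long index;

    if (wanted < first)
        return -ENOENT;          /* item starts after the wanted block */

    index = wanted - first;
    if (index < count)
        return 0;                /* the item already covers the block */
    if (index == count)
        return -EFBIG;           /* item ends right before the block: adjacent */
    return -ENOENT;              /* gap between item and block: not adjacent */
}
```

The bug described above corresponds to treating every `index >= count` case as -EFBIG, which also flags items that are not adjacent.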

Btrfs: fix wrong reservation of csums · 39847c4d

Committed by Miao Xie on Mar 28, 2013

We reserve the space for csums only when we write data into a file, in
the other cases, such as tree log, log replay, we don't do reservation,
so we can use the reservation of the transaction handle just for the former.
And for the latter, we should use the tree's own reservation. But the
function btrfs_csum_file_blocks() didn't differentiate between these
two types of cases. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

39847c4d

Btrfs: fix double free in the btrfs_qgroup_account_ref() · a7975026

Committed by Wang Shilong on Mar 25, 2013

The function btrfs_find_all_roots() is responsible for allocating
memory for 'roots' and freeing it if errors happen, so the caller should not
free it again since the work has been done.

Besides, 'tmp' is allocated after the call to btrfs_find_all_roots(),
so we can return directly if btrfs_find_all_roots() fails.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

a7975026

Btrfs: limit the global reserve to 512mb · fdf30d1c

Committed by Josef Bacik on Mar 26, 2013

A user reported a problem where he was getting early ENOSPC with hundreds of
gigs of free data space and 6 gigs of free metadata space. This is because the
global block reserve was taking up the entire free metadata space. This is
ridiculous, we have infrastructure in place to throttle if we start using too
much of the global reserve, so instead of letting it get this huge just limit it
to 512mb so that users can still get work done. This allowed the user to
complete his rsync without issues. Thanks

Cc: stable@vger.kernel.org
Reported-and-tested-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

fdf30d1c
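
The cap itself amounts to a simple clamp. The sketch below assumes the reserve size is computed elsewhere and passed in as a byte count; GLOBAL_RSV_CAP and clamp_global_reserve() are hypothetical names, and only the 512 MB limit comes from the commit message.

```c
#include <stdint.h>

/* 512 MiB limit taken from the commit message; the macro name is made up. */
#define GLOBAL_RSV_CAP (512ULL * 1024 * 1024)

/* Return the reserve size to use: the computed size, but never more
 * than the cap. */
static uint64_t clamp_global_reserve(uint64_t wanted_bytes)
{
    return wanted_bytes < GLOBAL_RSV_CAP ? wanted_bytes : GLOBAL_RSV_CAP;
}
```

In the reported case, a reserve that would otherwise have grown to the full 6 GB of free metadata space would be held at 512 MB, leaving the rest available for normal allocations.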

Btrfs: hold the ordered operations mutex when waiting on ordered extents · db1d607d

Committed by Josef Bacik on Mar 26, 2013

We need to hold the ordered_operations mutex while waiting on ordered extents
since we splice and run the ordered extents list. We need to make sure anybody
else who wants to wait on ordered extents does actually wait for them to be
completed. This will keep us from bailing out of flushing in case somebody is
already waiting on ordered extents to complete. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

db1d607d

Btrfs: fix space accounting for unlink and rename · 6e137ed3

Committed by Josef Bacik on Mar 26, 2013

We are way over-reserving for unlink and rename. Rename is just some random
huge number and unlink accounts for tree log operations that don't actually
happen during unlink, not to mention the tree log doesn't take from the trans
block rsv anyway so it's completely useless. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

6e137ed3

Btrfs: fix space leak when we fail to reserve metadata space · f4881bc7

Committed by Josef Bacik on Mar 25, 2013

Dave reported a warning when running xfstest 275. We have been leaking delalloc
metadata space when our reservations fail. This is because we were improperly
calculating how much space to free for our checksum reservations. The problem
is we would sometimes free up space that had already been freed in another
thread and we would end up with negative usage for the delalloc space. This
patch fixes the problem by calculating how much space the other threads would
have already freed, and then calculate how much space we need to free had we not
done the reservation at all, and then freeing any excess space. This makes
xfstests 275 no longer have leaked space. Thanks

Cc: stable@vger.kernel.org
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

f4881bc7

Btrfs: fix EIO from btrfs send in is_extent_unchanged for punched holes · adaa4b8e

Committed by Jan Schmidt on Mar 21, 2013

When you take a snapshot, punch a hole where there has been data, then take
another snapshot and try to send an incremental stream, btrfs send would
give you EIO. That is because is_extent_unchanged had no support for holes
being punched. With this patch, instead of returning EIO we just return
0 (== the extent is not unchanged) and we're good.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Cc: Alexander Block <ablock84@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

adaa4b8e

vfs/splice: Fix missed checks in new __kernel_write() helper · 3e84f48e

Committed by Al Viro on Mar 27, 2013

Commit 06ae43f3 ("Don't bother with redoing rw_verify_area() from
default_file_splice_from()") lost the checks to test existence of the
write/aio_write methods.  My apologies ;-/

Eventually, we want that in fs/splice.c side of things (no point
repeating it for every buffer, after all), but for now this is the
obvious minimal fix.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3e84f48e
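
The kind of check that was lost can be sketched as follows. This is a simplified user-space model: struct file_ops, checked_write() and count_bytes() are hypothetical, and the choice of -EINVAL as the error code is an assumption rather than something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a file's method table. */
struct file_ops {
    long (*write)(const char *buf, size_t len);
    long (*aio_write)(const char *buf, size_t len);
};

/* Refuse the write up front when neither method exists, instead of
 * calling through a NULL function pointer. */
static long checked_write(const struct file_ops *ops, const char *buf,
                          size_t len)
{
    if (!ops || (!ops->write && !ops->aio_write))
        return -EINVAL;          /* assumed error code */
    if (ops->write)
        return ops->write(buf, len);
    return ops->aio_write(buf, len);
}

/* Trivial method used for illustration: pretends all bytes were written. */
static long count_bytes(const char *buf, size_t len)
{
    (void)buf;
    return (long)len;
}
```

A file whose method table provides neither pointer gets an error back, while a table with either method set dispatches normally.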

27 Mar, 2013 6 commits

userns: Restrict when proc and sysfs can be mounted · 87a8ebd6

Committed by Eric W. Biederman on Mar 24, 2013

Only allow unprivileged mounts of proc and sysfs if they are already
mounted when the user namespace is created.

proc and sysfs are interesting because they have content that is
per namespace, and so fresh mounts are needed when new namespaces
are created while at the same time proc and sysfs have content that
is shared between every instance.

Respect the policy of who may see the shared content of proc and sysfs
by only allowing new mounts if there was an existing mount at the time
the user namespace was created.

In practice there are only two interesting cases: proc and sysfs are
mounted at their usual places, proc and sysfs are not mounted at all
(some form of mount namespace jail).

Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

87a8ebd6

vfs: Carefully propogate mounts across user namespaces · 132c94e3

Committed by Eric W. Biederman on Mar 22, 2013

As a matter of policy MNT_READONLY should not be changeable if the
original mounter had more privileges than creator of the mount
namespace.

Add the flag CL_UNPRIVILEGED to note when we are copying a mount from
a mount namespace that requires more privileges to a mount namespace
that requires fewer privileges.

When the CL_UNPRIVILEGED flag is set cause clone_mnt to set MNT_NO_REMOUNT
if any of the mnt flags that should never be changed are set.

This protects both mount propagation and the initial creation of a less
privileged mount namespace.

Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

132c94e3

vfs: Add a mount flag to lock read only bind mounts · 90563b19

Committed by Eric W. Biederman on Mar 22, 2013

When a read-only bind mount is copied from mount namespace in a higher
privileged user namespace to a mount namespace in a lesser privileged
user namespace, it should not be possible to remove the read-only
restriction.

Add a MNT_LOCK_READONLY mount flag to indicate that a mount must
remain read-only.

CC: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

90563b19
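
How such a lock flag might be enforced on remount can be sketched like this. The SKETCH_* flag values, the function may_change_readonly() and the -EPERM return code are all assumptions made for illustration; only the idea of a flag that pins a mount read-only comes from the commit message.

```c
#include <errno.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define SKETCH_MNT_READONLY       0x01u
#define SKETCH_MNT_LOCK_READONLY  0x02u

/* Decide whether a remount may switch from cur_flags to new_flags.
 * A mount that carries the lock flag must keep the read-only flag. */
static int may_change_readonly(unsigned int cur_flags, unsigned int new_flags)
{
    if ((cur_flags & SKETCH_MNT_LOCK_READONLY) &&
        !(new_flags & SKETCH_MNT_READONLY))
        return -EPERM;           /* assumed error code */
    return 0;
}
```

An unlocked read-only mount can still be remounted read-write; only mounts carrying the lock flag are pinned.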

userns: Don't allow creation if the user is chrooted · 3151527e

Committed by Eric W. Biederman on Mar 15, 2013

Guarantee that the policy of which files may be accessed that is
established by setting the root directory will not be violated
by user namespaces by verifying that the root directory points
to the root of the mount namespace at the time of user namespace
creation.

Changing the root is a privileged operation, and as a matter of policy
it serves to limit unprivileged processes to files below the current
root directory.

For reasons of simplicity and comprehensibility the privilege to
change the root directory is gated solely on the CAP_SYS_CHROOT
capability in the user namespace.  Therefore when creating a user
namespace we must ensure that the policy of which files may be accessed
can not be violated by changing the root directory.

Anyone who runs a process in a chroot and would like to use user
namespaces can set up the same view of filesystems with a mount
namespace instead.  As a result, this is not a practical
limitation for using user namespaces.

Cc: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

3151527e

Nest rename_lock inside vfsmount_lock · 7ea600b5

Committed by Al Viro on Mar 26, 2013

... lest we get livelocks between path_is_under() and d_path() and friends.

The thing is, wrt fairness lglocks are more similar to rwsems than to rwlocks;
it is possible to have thread B spin on attempt to take lock shared while thread
A is already holding it shared, if B is on lower-numbered CPU than A and there's
a thread C spinning on attempt to take the same lock exclusive.

As the result, we need consistent ordering between vfsmount_lock (lglock) and
rename_lock (seq_lock), even though everything that takes both is going to take
vfsmount_lock only shared.
Spotted-by: Brad Spengler <spender@grsecurity.net>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7ea600b5

Btrfs: fix race between mmap writes and compression · 4adaa611

Committed by Chris Mason on Mar 26, 2013

Btrfs uses page_mkwrite to ensure stable pages during
crc calculations and mmap workloads.  We call clear_page_dirty_for_io
before we do any crcs, and this forces any application with the file
mapped to wait for the crc to finish before it is allowed to change
the file.

With compression on, the clear_page_dirty_for_io step is happening after
we've compressed the pages.  This means the applications might be
changing the pages while we are compressing them, and some of those
modifications might not hit the disk.

This commit adds the clear_page_dirty_for_io before compression starts
and makes sure to redirty the page if we have to fallback to
uncompressed IO as well.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Reported-by: Alexandre Oliva <oliva@gnu.org>
cc: stable@vger.kernel.org

4adaa611

23 Mar, 2013 2 commits

nfsd: fix bad offset use · e49dbbf3

Committed by Kent Overstreet on Mar 22, 2013

vfs_writev() updates the offset argument - but the code then passes the
offset to vfs_fsync_range(). Since offset now points to the offset after
what was just written, this is probably not what was intended.

Introduced by face1502 "nfsd: use
vfs_fsync_range(), not O_SYNC, for stable writes".
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: stable@vger.kernel.org
Reviewed-by: Zach Brown <zab@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

e49dbbf3
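
The pitfall can be shown with a small model in which a write helper advances the position it is given. Everything here is hypothetical: fake_write() stands in for a call that updates its offset argument, and write_then_range() shows one way to keep the starting offset for a later range sync by handing the helper a working copy.

```c
#include <stddef.h>

/* Hypothetical write helper that advances *pos past the bytes written,
 * the way the commit message describes vfs_writev() behaving. */
static long fake_write(long long *pos, size_t len)
{
    *pos += (long long)len;
    return (long)len;
}

/* Half-open byte range [start, end) that a later sync should cover. */
struct sync_range {
    long long start;
    long long end;
};

/* Write len bytes at offset and report the range that was written.
 * The helper gets a working copy, so `offset` still names the start. */
static struct sync_range write_then_range(long long offset, size_t len)
{
    long long pos = offset;
    struct sync_range r = { offset, offset };
    long written = fake_write(&pos, len);

    if (written > 0)
        r.end = offset + written;
    return r;
}
```

Passing `&offset` directly to the helper would leave `offset` pointing past the data, so a range sync that starts there would skip the bytes just written.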

vfs,proc: guarantee unique inodes in /proc · 51f0885e

Committed by Linus Torvalds on Mar 22, 2013

Dave Jones found another /proc issue with his Trinity tool: thanks to
the namespace model, we can have multiple /proc dentries that point to
the same inode, aliasing directories in /proc/<pid>/net/ for example.

This ends up being a total disaster, because it acts like hardlinked
directories, and causes locking problems. We rely on the topological
sort of the inodes pointed to by dentries, and if we have aliased
directories, that ordering becomes unreliable.

In short: don't do this. Multiple dentries with the same (directory)
inode is just a bad idea, and the namespace code should never have
exposed things this way. But we're kind of stuck with it.

This solves things by just always allocating a new inode during /proc
dentry lookup, instead of using "iget_locked()" to look up existing
inodes by superblock and number. That actually simplifies the code a bit,
at the cost of potentially doing more inode [de]allocations.

That said, the inode lookup wasn't free either (and did a lot of locking
of inodes), so it is probably not that noticeable. We could easily keep
the old lookup model for non-directory entries, but rather than try to
be excessively clever this just implements the minimal and simplest
workaround for the problem.
Reported-and-tested-by: Dave Jones <davej@redhat.com>
Analyzed-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

51f0885e

22 Mar, 2013 7 commits

Btrfs: fix memory leak in btrfs_create_tree() · 1dd05682

Committed by Tsutomu Itoh on Mar 21, 2013

We should free leaf and root before returning from the error
handling code.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

1dd05682

Btrfs: fix locking on ROOT_REPLACE operations in tree mod log · d9abbf1c

Committed by Jan Schmidt on Mar 20, 2013

To resolve backrefs, ROOT_REPLACE operations in the tree mod log are
required to be tied to at least one KEY_REMOVE_WHILE_FREEING operation.
Therefore, those operations must be enclosed by tree_mod_log_write_lock()
and tree_mod_log_write_unlock() calls.

Those calls are private to the tree_mod_log_* functions, which means that
removal of the elements of an old root node must be logged from
tree_mod_log_insert_root. This partly reverts and corrects commit ba1bfbd5
(Btrfs: fix a tree mod logging issue for root replacement operations).

This fixes the brand-new version of xfstest 276 as of commit cfe73f71.

Cc: stable@vger.kernel.org
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

d9abbf1c

Btrfs: fix missing qgroup reservation before fallocating · 6113077c

Committed by Wang Shilong on Mar 19, 2013

Steps to reproduce:
	mkfs.btrfs <disk>
	mount <disk> <mnt>
	btrfs quota enable <mnt>
	btrfs sub create <mnt>/subv
	btrfs qgroup limit 10M <mnt>/subv
	fallocate --length 20M <mnt>/subv/data

For the above example, fallocate returns successfully, which
is not expected. Fix it by doing the qgroup reservation before
fallocating.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

6113077c

Btrfs: handle a bogus chunk tree nicely · 835d974f

Committed by Josef Bacik on Mar 19, 2013

If you restore a btrfs-image file system and try to mount that file system we'll
panic. That's because btrfs-image restores and just makes one big chunk to
envelope the whole disk, since they are really only meant to be messed with by
our btrfs-progs. So fix up btrfs_rmap_block and the callers of it for mount so
that we no longer panic but instead just return an error and fail to mount.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

835d974f

Btrfs: update to use fs_state bit · d7634482

Committed by Liu Bo on Mar 11, 2013

Now that we use bit operations to check fs_state, update
btrfs_free_fs_root()'s check; otherwise we get back to the
memory leak case.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

d7634482

cifs: ignore everything in SPNEGO blob after mechTypes · f853c616

Committed by Jeff Layton on Mar 11, 2013

We've had several reports of people attempting to mount Windows 8 shares
and getting failures with a return code of -EINVAL. The default sec=
mode changed recently to sec=ntlmssp. With that, we expect and parse a
SPNEGO blob from the server in the NEGOTIATE reply.

The current decode_negTokenInit function first parses all of the
mechTypes and then tries to parse the rest of the negTokenInit reply.
The parser however currently expects a mechListMIC or nothing to follow the
mechTypes, but Windows 8 puts a mechToken field there instead to carry
some info for the new NegoEx stuff.

In practice, we don't do anything with the fields after the mechTypes
anyway so I don't see any real benefit in continuing to parse them.
This patch just has the kernel ignore the fields after the mechTypes.
We'll probably need to reinstate some of this if we ever want to support
NegoEx.
Reported-by: Jason Burgess <jason@jacknife2.dns2go.com>
Reported-by: Yan Li <elliot.li.tech@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>

f853c616

Don't bother with redoing rw_verify_area() from default_file_splice_from() · 06ae43f3

Committed by Al Viro on Mar 20, 2013

default_file_splice_from() ends up calling vfs_write() (via a very convoluted
callchain). It's overkill, since we have already done rw_verify_area()
in the caller, and by the time we call vfs_write() we are under set_fs(KERNEL_DS),
so access_ok() is also pointless. Add a new helper (__kernel_write()),
use it instead of kernel_write() in there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

06ae43f3

21 Mar, 2013 5 commits

NFSv4.1: Add a helper pnfs_commit_and_return_layout · 24028672

Committed by Trond Myklebust on Mar 20, 2013

In order to be able to safely return the layout in nfs4_proc_setattr,
we need to block new uses of the layout, wait for all outstanding
users of the layout to complete, commit the layout and then return it.

This patch adds a helper in order to do all this safely.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>

24028672

NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn · 24956804

Committed by Trond Myklebust on Mar 20, 2013

Note that clearing NFS_INO_LAYOUTCOMMIT is tricky, since it requires
you to also clear the NFS_LSEG_LAYOUTCOMMIT bits from the layout
segments.
The only two sites that need to do this are the ones that call
pnfs_return_layout() without first doing a layout commit.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@vger.kernel.org

24956804

NFSv4.1: Fix a race in pNFS layoutcommit · a073dbff

Committed by Trond Myklebust on Mar 20, 2013

We need to clear the NFS_LSEG_LAYOUTCOMMIT bits atomically with the
NFS_INO_LAYOUTCOMMIT bit, otherwise we may end up with situations
where the two are out of sync.
The first half of the problem is to ensure that pnfs_layoutcommit_inode
clears the NFS_LSEG_LAYOUTCOMMIT bit through pnfs_list_write_lseg.
We still need to keep the reference to those segments until the RPC call
is finished, so in order to make it clear _where_ those references come
from, we add a helper pnfs_list_write_lseg_done() that cleans up after
pnfs_list_write_lseg.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@vger.kernel.org

a073dbff

pnfs-block: removing DM device maybe cause oops when call dev_remove · 4376c946

Committed by fanchaoting on Mar 21, 2013

When pnfs block is using device mapper, a later umount may
cause an oops. We allocate "1 + sizeof(bl_umount_request)" bytes for
msg->data, and that buffer may overflow when we do "memcpy(&dataptr
[sizeof(bl_msg)], &bl_umount_request, sizeof(bl_umount_request))",
because the size of bl_msg is more than 1 byte.

Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

4376c946
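
The sizing mistake can be reproduced in miniature. The struct layouts below are hypothetical placeholders (the real bl_msg and bl_umount_request definitions are not shown in the commit message); the point is only that the buffer must be sized from both structures before the second one is copied in at an offset of sizeof(header).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layouts, for illustration only. */
struct msg_hdr {
    uint8_t  type;
    uint16_t totallen;
};

struct umount_req {
    uint32_t major;
    uint32_t minor;
};

/* Correct payload size: header plus request, not "1 + request". */
static size_t umount_msg_size(void)
{
    return sizeof(struct msg_hdr) + sizeof(struct umount_req);
}

/* Build the payload; the caller frees the result. Returns NULL on
 * allocation failure. */
static unsigned char *build_umount_msg(const struct msg_hdr *hdr,
                                       const struct umount_req *req)
{
    unsigned char *data = calloc(1, umount_msg_size());

    if (!data)
        return NULL;
    memcpy(data, hdr, sizeof(*hdr));
    memcpy(&data[sizeof(*hdr)], req, sizeof(*req));
    return data;
}
```

Sizing the buffer as 1 + sizeof(struct umount_req) would be too small whenever the header is larger than one byte, so the second memcpy() would write past the end of the allocation.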

sysfs: handle failure path correctly for readdir() · e5110f41

Committed by Ming Lei on Mar 20, 2013

In case of 'if (filp->f_pos ==  0 or 1)' of sysfs_readdir(),
the failure from filldir() isn't handled, and the reference counter
of the sysfs_dirent object pointed by filp->private_data will be
released without clearing filp->private_data, so use after free
bug will be triggered later.

This patch returns immediately in that situation to fix the bug,
and it is reasonable to return from readdir() when filldir() fails.
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e5110f41
