提交 · 19ea80603715d473600cd993b9987bc97d042e02 · openeuler / Kernel

17 2月, 2014 1 次提交

ext4: don't leave i_crtime.tv_sec uninitialized · 19ea8060

由 Theodore Ts'o 提交于 2月 16, 2014

If the i_crtime field is not present in the inode, don't leave the
field uninitialized.

Fixes: ef7f3835 ("ext4: Add nanosecond timestamps")
Reported-by: NVegard Nossum <vegard.nossum@oracle.com>
Tested-by: NVegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

19ea8060

16 2月, 2014 2 次提交

ext4: fix online resize with a non-standard blocks per group setting · 3d2660d0

由 Theodore Ts'o 提交于 2月 15, 2014

The set_flexbg_block_bitmap() function assumed that the number of
blocks in a blockgroup was sb->blocksize * 8, which is normally true,
but not always!  Use EXT4_BLOCKS_PER_GROUP(sb) instead, to fix block
bitmap corruption after:

mke2fs -t ext4 -g 3072 -i 4096 /dev/vdd 1G
mount -t ext4 /dev/vdd /vdd
resize2fs /dev/vdd 8G
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Reported-by: NJon Bernard <jbernard@tuxion.com>
Cc: stable@vger.kernel.org

3d2660d0

ext4: fix online resize with very large inode tables · b93c9535

由 Theodore Ts'o 提交于 2月 15, 2014

If a file system has a large number of inodes per block group, all of
the metadata blocks in a flex_bg may be larger than what can fit in a
single block group.  Unfortunately, ext4_alloc_group_tables() in
resize.c was never tested to see if it would handle this case
correctly, and there were a large number of bugs which caused the
following sequence to result in a BUG_ON:

kernel bug at fs/ext4/resize.c:409!
   ...
call trace:
 [<ffffffff81256768>] ext4_flex_group_add+0x1448/0x1830
 [<ffffffff81257de2>] ext4_resize_fs+0x7b2/0xe80
 [<ffffffff8123ac50>] ext4_ioctl+0xbf0/0xf00
 [<ffffffff811c111d>] do_vfs_ioctl+0x2dd/0x4b0
 [<ffffffff811b9df2>] ? final_putname+0x22/0x50
 [<ffffffff811c1371>] sys_ioctl+0x81/0xa0
 [<ffffffff81676aa9>] system_call_fastpath+0x16/0x1b
code: c8 4c 89 df e8 41 96 f8 ff 44 89 e8 49 01 c4 44 29 6d d4 0
rip  [<ffffffff81254fa1>] set_flexbg_block_bitmap+0x171/0x180


This can be reproduced with the following command sequence:

   mke2fs -t ext4 -i 4096 /dev/vdd 1G
   mount -t ext4 /dev/vdd /vdd
   resize2fs /dev/vdd 8G

To fix this, we need to make sure the right thing happens when a block
group's inode table straddles two block groups, which means the
following bugs had to be fixed:

1) Not clearing the BLOCK_UNINIT flag in the second block group in
   ext4_alloc_group_tables --- the was proximate cause of the BUG_ON.

2) Incorrectly determining how many block groups contained contiguous
   free blocks in ext4_alloc_group_tables().

3) Incorrectly setting the start of the next block range to be marked
   in use after a discontinuity in setup_new_flex_group_blocks().
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

b93c9535

13 2月, 2014 2 次提交

ext4: don't try to modify s_flags if the the file system is read-only · 23301410

由 Theodore Ts'o 提交于 2月 12, 2014

If an ext4 file system is created by some tool other than mke2fs
(perhaps by someone who has a pathalogical fear of the GPL) that
doesn't set one or the other of the EXT2_FLAGS_{UN}SIGNED_HASH flags,
and that file system is then mounted read-only, don't try to modify
the s_flags field.  Otherwise, if dm_verity is in use, the superblock
will change, causing an dm_verity failure.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

23301410

ext4: fix error paths in swap_inode_boot_loader() · 30d29b11

由 Zheng Liu 提交于 2月 12, 2014

In swap_inode_boot_loader() we forgot to release ->i_mutex and resume
unlocked dio for inode and inode_bl if there is an error starting the
journal handle.  This commit fixes this issue.
Reported-by: NAhmed Tamrawi <ahmedtamrawi@gmail.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Dr. Tilmann Bubeck <t.bubeck@reinform.de>
Signed-off-by: NZheng Liu <wenqing.lz@taobao.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org  # v3.10+

30d29b11

12 2月, 2014 1 次提交

ext4: fix xfstest generic/299 block validity failures · 15cc1767

由 Eric Whitney 提交于 2月 12, 2014

Commit a115f749 (ext4: remove wait for unwritten extent conversion from
ext4_truncate) exposed a bug in ext4_ext_handle_uninitialized_extents().
It can be triggered by xfstest generic/299 when run on a test file
system created without a journal. This test continuously fallocates and
truncates files to which random dio/aio writes are simultaneously
performed by a separate process. The test completes successfully, but
if the test filesystem is mounted with the block_validity option, a
warning message stating that a logical block has been mapped to an
illegal physical block is posted in the kernel log.

The bug occurs when an extent is being converted to the written state
by ext4_end_io_dio() and ext4_ext_handle_uninitialized_extents()
discovers a mapping for an existing uninitialized extent. Although it
sets EXT4_MAP_MAPPED in map->m_flags, it fails to set map->m_pblk to
the discovered physical block number. Because map->m_pblk is not
otherwise initialized or set by this function or its callers, its
uninitialized value is returned to ext4_map_blocks(), where it is
stored as a bogus mapping in the extent status tree.

Since map->m_pblk can accidentally contain illegal values that are
larger than the physical size of the file system, calls to
check_block_validity() in ext4_map_blocks() that are enabled if the
block_validity mount option is used can fail, resulting in the logged
warning message.
Signed-off-by: NEric Whitney <enwlinux@gmail.com>
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org # 3.11+

15cc1767

10 2月, 2014 1 次提交

fix O_SYNC|O_APPEND syncing the wrong range on write() · d311d79d

由 Al Viro 提交于 2月 09, 2014

It actually goes back to 2004 ([PATCH] Concurrent O_SYNC write support)
when sync_page_range() had been introduced; generic_file_write{,v}() correctly
synced
	pos_after_write - written .. pos_after_write - 1
but generic_file_aio_write() synced
	pos_before_write .. pos_before_write + written - 1
instead.  Which is not the same thing with O_APPEND, obviously.
A couple of years later correct variant had been killed off when
everything switched to use of generic_file_aio_write().

All users of generic_file_aio_write() are affected, and the same bug
has been copied into other instances of ->aio_write().

The fix is trivial; the only subtle point is that generic_write_sync()
ought to be inlined to avoid calculations useless for the majority of
calls.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d311d79d

09 2月, 2014 6 次提交

Btrfs: fix data corruption when reading/updating compressed extents · a2aa75e1

由 Filipe David Borba Manana 提交于 2月 08, 2014

When using a mix of compressed file extents and prealloc extents, it
is possible to fill a page of a file with random, garbage data from
some unrelated previous use of the page, instead of a sequence of zeroes.

A simple sequence of steps to get into such case, taken from the test
case I made for xfstests, is:

   _scratch_mkfs
   _scratch_mount "-o compress-force=lzo"
   $XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar
   $XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar
   $XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar
   $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar

This results in the following file items in the fs tree:

   item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
       inode generation 6 transid 6 size 542872 block group 0 mode 100600
   item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16
       inode ref index 2 namelen 6 name: foobar
   item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
       extent data disk byte 0 nr 0 gen 6
       extent data offset 0 nr 24576 ram 266240
       extent compression 0
   item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53
       prealloc data disk byte 12849152 nr 241664 gen 6
       prealloc data offset 0 nr 241664
   item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53
       extent data disk byte 12845056 nr 4096 gen 6
       extent data offset 0 nr 20480 ram 20480
       extent compression 2
   item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53
       prealloc data disk byte 13090816 nr 405504 gen 6
       prealloc data offset 0 nr 258048

The on disk extent at offset 266240 (which corresponds to 1 single disk block),
contains 5 compressed chunks of file data. Each of the first 4 compress 4096
bytes of file data, while the last one only compresses 3024 bytes of file data.
Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 =
1072 bytes) should always return zeroes (our next extent is a prealloc one).

The solution here is the compression code path to zero the remaining (untouched)
bytes of the last page it uncompressed data into, as the information about how
much space the file data consumes in the last page is not known in the upper layer
fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing
the remainder of the page but only if it corresponds to the last page of the inode
and if the inode's size is not a multiple of the page size.

This would cause not only returning random data on reads, but also permanently
storing random data when updating parts of the region that should be zeroed.
For the example above, it means updating a single byte in the region [285648 ; 286720[
would store that byte correctly but also store random data on disk.

A test case for xfstests follows soon.
Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: NChris Mason <clm@fb.com>

a2aa75e1

Btrfs: don't loop forever if we can't run because of the tree mod log · 27a377db

由 Josef Bacik 提交于 2月 07, 2014

A user reported a 100% cpu hang with my new delayed ref code. Turns out I
forgot to increase the count check when we can't run a delayed ref because of
the tree mod log. If we can't run any delayed refs during this there is no
point in continuing to look, and we need to break out. Thanks,
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NChris Mason <clm@fb.com>

27a377db

btrfs: reserve no transaction units in btrfs_ioctl_set_features · 8051aa1a

由 David Sterba 提交于 2月 07, 2014

Added in patch "btrfs: add ioctls to query/change feature bits online"
modifications to superblock don't need to reserve metadata blocks when
starting a transaction.
Signed-off-by: NDavid Sterba <dsterba@suse.cz>
Signed-off-by: NChris Mason <clm@fb.com>

8051aa1a

btrfs: commit transaction after setting label and features · d0270aca

由 Jeff Mahoney 提交于 2月 07, 2014

The set_fslabel ioctl uses btrfs_end_transaction, which means it's
possible that the change will be lost if the system crashes, same for
the newly set features. Let's use btrfs_commit_transaction instead.
Signed-off-by: NJeff Mahoney <jeffm@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.cz>
Signed-off-by: NChris Mason <clm@fb.com>

d0270aca

Btrfs: fix assert screwup for the pending move stuff · 6cc98d90

由 Josef Bacik 提交于 2月 05, 2014

Wang noticed that he was failing btrfs/030 even though me and Filipe couldn't
reproduce. Turns out this is because Wang didn't have CONFIG_BTRFS_ASSERT set,
which meant that a key part of Filipe's original patch was not being built in.
This appears to be a mess up with merging Filipe's patch as it does not exist in
his original patch. Fix this by changing how we make sure del_waiting_dir_move
asserts that it did not error and take the function out of the ifdef check.
This makes btrfs/030 pass with the assert on or off. Thanks,
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Reviewed-by: NFilipe Manana <fdmanana@gmail.com>
Signed-off-by: NChris Mason <clm@fb.com>

6cc98d90

jfs: fix generic posix ACL regression · c18f7b51

由 Dave Kleikamp 提交于 2月 07, 2014

I missed a couple errors in reviewing the patches converting jfs
to use the generic posix ACL function. Setting ACL's currently
fails with -EOPNOTSUPP.
Signed-off-by: NDave Kleikamp <dave.kleikamp@oracle.com>
Reported-by: NMichael L. Semon <mlsemon35@gmail.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>

c18f7b51

07 2月, 2014 2 次提交

mm: __set_page_dirty uses spin_lock_irqsave instead of spin_lock_irq · 227d53b3

由 KOSAKI Motohiro 提交于 2月 06, 2014

To use spin_{un}lock_irq is dangerous if caller disabled interrupt.
During aio buffer migration, we have a possibility to see the following
call stack.

aio_migratepage  [disable interrupt]
  migrate_page_copy
    clear_page_dirty_for_io
      set_page_dirty
        __set_page_dirty_buffers
          __set_page_dirty
            spin_lock_irq

This mean, current aio migration is a deadlockable.  spin_lock_irqsave
is a safer alternative and we should use it.
Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Rientjes rientjes@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

227d53b3

ocfs2: free allocated clusters if error occurs after ocfs2_claim_clusters · fb951eb5

由 Zongxun Wang 提交于 2月 06, 2014

Even if using the same jbd2 handle, we cannot rollback a transaction.
So once some error occurs after successfully allocating clusters, the
allocated clusters will never be used and it means they are lost.  For
example, call ocfs2_claim_clusters successfully when expanding a file,
but failed in ocfs2_insert_extent.  So we need free the allocated
clusters if they are not used indeed.
Signed-off-by: NZongxun Wang <wangzongxun@huawei.com>
Signed-off-by: NJoseph Qi <joseph.qi@huawei.com>
Acked-by: NJoel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fb951eb5

06 2月, 2014 2 次提交

execve: use 'struct filename *' for executable name passing · c4ad8f98

由 Linus Torvalds 提交于 2月 05, 2014

This changes 'do_execve()' to get the executable name as a 'struct
filename', and to free it when it is done.  This is what the normal
users want, and it simplifies and streamlines their error handling.

The controlled lifetime of the executable name also fixes a
use-after-free problem with the trace_sched_process_exec tracepoint: the
lifetime of the passed-in string for kernel users was not at all
obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize
the pathname allocation lifetime with the execve() having finished,
which in turn meant that the trace point that happened after
mm_release() of the old process VM ended up using already free'd memory.

To solve the kernel string lifetime issue, this simply introduces
"getname_kernel()" that works like the normal user-space getname()
function, except with the source coming from kernel memory.

As Oleg points out, this also means that we could drop the tcomm[] array
from 'struct linux_binprm', since the pathname lifetime now covers
setup_new_exec().  That would be a separate cleanup.
Reported-by: NIgor Zhbanov <i.zhbanov@samsung.com>
Tested-by: NSteven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c4ad8f98

kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag · da9846ae

由 Tejun Heo 提交于 1月 29, 2014

kernfs_deactivate() forgot to check whether KERNFS_LOCKDEP is set
before performing lockdep annotations and ends up feeding
uninitialized lockdep_map to lockdep triggering warning like the
following on USB stick hotunplug.

 usb 1-2: USB disconnect, device number 2
 INFO: trying to register non-static key.
 the code is fine but needs lockdep annotation.
 turning off the locking correctness validator.
 CPU: 1 PID: 62 Comm: khubd Not tainted 3.13.0-work+ #82
 Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
  ffff880065ca7f60 ffff88013a4ffa08 ffffffff81cfb6bd 0000000000000002
  ffff88013a4ffac8 ffffffff810f8530 ffff88013a4fc710 0000000000000002
  ffff880100000000 ffffffff82a3db50 0000000000000001 ffff88013a4fc710
 Call Trace:
  [<ffffffff81cfb6bd>] dump_stack+0x4e/0x7a
  [<ffffffff810f8530>] __lock_acquire+0x1910/0x1e70
  [<ffffffff810f931a>] lock_acquire+0x9a/0x1d0
  [<ffffffff8127c75e>] kernfs_deactivate+0xee/0x130
  [<ffffffff8127d4c8>] kernfs_addrm_finish+0x38/0x60
  [<ffffffff8127d701>] kernfs_remove_by_name_ns+0x51/0xa0
  [<ffffffff8127b4f1>] remove_files.isra.1+0x41/0x80
  [<ffffffff8127b7e7>] sysfs_remove_group+0x47/0xa0
  [<ffffffff8127b873>] sysfs_remove_groups+0x33/0x50
  [<ffffffff8177d66d>] device_remove_attrs+0x4d/0x80
  [<ffffffff8177e25e>] device_del+0x12e/0x1d0
  [<ffffffff819722c2>] usb_disconnect+0x122/0x1a0
  [<ffffffff819749b5>] hub_thread+0x3c5/0x1290
  [<ffffffff810c6a6d>] kthread+0xed/0x110
  [<ffffffff81d0a56c>] ret_from_fork+0x7c/0xb0

Fix it by making kernfs_deactivate() perform lockdep annotations only
if KERNFS_LOCKDEP is set.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NFabio Estevam <festevam@gmail.com>
Reported-by: NAlan Stern <stern@rowland.harvard.edu>
Reported-by: NJiri Kosina <jkosina@suse.cz>
Reported-by: NDave Jones <davej@redhat.com>
Tested-by: NFabio Estevam <fabio.estevam@freescale.com>
Tested-by: NJiri Kosina <jkosina@suse.cz>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

da9846ae

04 2月, 2014 6 次提交

fs: get_acl() must be allowed to return EOPNOTSUPP · 789b663a

由 Trond Myklebust 提交于 1月 31, 2014

posix_acl_xattr_get requires get_acl() to return EOPNOTSUPP if the
filesystem cannot support acls. This is needed for NFS, which can't
know whether or not the server supports acls until it tries to get/set
one.
This patch converts posix_acl_chmod and posix_acl_create to deal with
EOPNOTSUPP return values from get_acl().
Reported-by: NRussell King <linux@arm.linux.org.uk>
Link: http://lkml.kernel.org/r/20140130140834.GW15937@n2100.arm.linux.org.uk
Cc: Al Viro viro@zeniv.linux.org.uk>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Tested-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

789b663a

NFSv3: Fix return value of nfs3_proc_setacls · 8f493b9c

由 Trond Myklebust 提交于 2月 02, 2014

nfs3_proc_setacls is used internally by the NFSv3 create operations
to set the acl after the file has been created. If the operation
fails because the server doesn't support acls, then it must return '0',
not -EOPNOTSUPP.
Reported-by: NRussell King <linux@arm.linux.org.uk>
Link: http://lkml.kernel.org/r/20140201010328.GI15937@n2100.arm.linux.org.uk
Cc: Christoph Hellwig <hch@lst.de>
Tested-by: NTakashi Iwai <tiwai@suse.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

8f493b9c

T
NFSv3: Remove unused function nfs3_proc_set_default_acl · d4c42fb4
由 Trond Myklebust 提交于 2月 02, 2014
```
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>
```
d4c42fb4

Btrfs: use late_initcall instead of module_init · 60efa5eb

由 Filipe David Borba Manana 提交于 2月 01, 2014

It seems that when init_btrfs_fs() is called, crc32c/crc32c-intel might
not always be already initialized, which results in the call to crypto_alloc_shash()
returning -ENOENT, as experienced by Ahmet who reported this.

Therefore make sure init_btrfs_fs() is called after crc32c is initialized (which
is at initialization level 6, module_init), by using late_initcall (which is at
initialization level 7) instead of module_init for btrfs.
Reported-and-Tested-by: NAhmet Inan <ainan@mathematik.uni-freiburg.de>
Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: NChris Mason <clm@fb.com>

60efa5eb

Btrfs: use btrfs_crc32c everywhere instead of libcrc32c · 0b947aff

由 Filipe David Borba Manana 提交于 1月 29, 2014

After the commit titled "Btrfs: fix btrfs boot when compiled as built-in",
LIBCRC32C requirement was removed from btrfs' Kconfig. This made it not
possible to build a kernel with btrfs enabled (either as module or built-in)
if libcrc32c is not enabled as well. So just replace all uses of libcrc32c
with the equivalent function in btrfs hash.h - btrfs_crc32c.
Signed-off-by: NFilipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: NChris Mason <clm@fb.com>

0b947aff

Btrfs: disable snapshot aware defrag for now · 8101c8db

由 Josef Bacik 提交于 1月 29, 2014

It's just broken and it's taking a lot of effort to fix it, so for now just
disable it so people can defrag in peace.  Thanks,

Cc: stable@vger.kernel.org
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NChris Mason <clm@fb.com>

8101c8db

03 2月, 2014 2 次提交

hpfs: optimize quad buffer loading · 1c0b8a7a

由 Mikulas Patocka 提交于 1月 29, 2014

HPFS needs to load 4 consecutive 512-byte sectors when accessing the
directory nodes or bitmaps. We can't switch to 2048-byte block size
because files are allocated in the units of 512-byte sectors.

Previously, the driver would allocate a 2048-byte area using kmalloc,
copy the data from four buffers to this area and eventually copy them
back if they were modified.

In the current implementation of the buffer cache, buffers are allocated
in the pagecache. That means that 4 consecutive 512-byte buffers are
stored in consecutive areas in the kernel address space. So, we don't
need to allocate extra memory and copy the content of the buffers there.

This patch optimizes the code to avoid copying the buffers. It checks
if the four buffers are stored in contiguous memory - if they are not,
it falls back to allocating a 2048-byte area and copying data there.
Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1c0b8a7a

hpfs: remember free space · 2cbe5c76

由 Mikulas Patocka 提交于 1月 29, 2014

Previously, hpfs scanned all bitmaps each time the user asked for free
space using statfs.  This patch changes it so that hpfs scans the
bitmaps only once, remembes the free space and on next invocation of
statfs it returns the value instantly.

New versions of wine are hammering on the statfs syscall very heavily,
making some games unplayable when they're stored on hpfs, with load
times in minutes.

This should be backported to the stable kernels because it fixes
user-visible problem (excessive level load times in wine).
Signed-off-by: NMikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: stable@vger.kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2cbe5c76

02 2月, 2014 3 次提交

NFSv4.1: nfs4_destroy_session must call rpc_destroy_waitqueue · 20b9a902

由 Trond Myklebust 提交于 2月 01, 2014

There may still be timers active on the session waitqueues. Make sure
that we kill them before freeing the memory.

Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

20b9a902

NFSv4: Fix memory corruption in nfs4_proc_open_confirm · 17ead6c8

由 Trond Myklebust 提交于 2月 01, 2014

nfs41_wake_and_assign_slot() relies on the task->tk_msg.rpc_argp and
task->tk_msg.rpc_resp always pointing to the session sequence arguments.

nfs4_proc_open_confirm tries to pull a fast one by reusing the open
sequence structure, thus causing corruption of the NFSv4 slot table.

Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

17ead6c8

afs: proc cells and rootcell are writeable · 1bda2ac0

由 Pali Rohár 提交于 1月 28, 2014

Both proc files are writeable and used for configuring cells. But
there is missing correct mode flag for writeable files. Without
this patch both proc files are read only.

[ It turns out they aren't really read-only, since root can write to
  them even if the write bit isn't set due to CAP_DAC_OVERRIDE ]
Signed-off-by: NPali Rohár <pali.rohar@gmail.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1bda2ac0

01 2月, 2014 6 次提交

nfs: fix setting of ACLs on file creation. · 718360c5

由 Noah Massey 提交于 1月 30, 2014

nfs3_get_acl() tries to skip posix equivalent ACLs, but misinterprets
the return value of posix_acl_equiv_mode(). Fix it.

This is a regression introduced by
"nfs: use generic posix ACL infrastructure for v3 Posix ACLs"

CC: Christoph Hellwig <hch@infradead.org>
CC: linux-nfs@vger.kernel.org
CC: linux-fsdevel@vger.kernel.org
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

718360c5

Fix mountpoint reference leakage in linkat · d22e6338

由 Oleg Drokin 提交于 1月 31, 2014

Recent changes to retry on ESTALE in linkat
(commit 442e31ca)
introduced a mountpoint reference leak and a small memory
leak in case a filesystem link operation returns ESTALE
which is pretty normal for distributed filesystems like
lustre, nfs and so on.
Free old_path in such a case.

[AV: there was another missing path_put() nearby - on the previous
goto retry]
Signed-off-by: NOleg Drokin: <green@linuxhacker.ru>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d22e6338

hfsplus: use xattr handlers for removexattr · b168fff7

由 Christoph Hellwig 提交于 1月 29, 2014

hfsplus was already using the handlers for get and set operations,
and with the removal of can_set_xattr we've now allow operations that
wouldn't otherwise be allowed.

With this we can also centralize the special-casing of the osx.
attrs that don't have prefixes on disk in the osx xattr handlers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

b168fff7

fs/super.c: sync ro remount after blocking writers · 807612db

由 Andrew Ruder 提交于 1月 30, 2014

Move sync_filesystem() after sb_prepare_remount_readonly().  If writers
sneak in anywhere from sync_filesystem() to sb_prepare_remount_readonly()
it can cause inodes to be dirtied and writeback to occur well after
sys_mount() has completely successfully.

This was spotted by corrupted ubifs filesystems on reboot, but appears
that it can cause issues with any filesystem using writeback.

Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Richard Weinberger <richard@nod.at>
Co-authored-by: NRichard Weinberger <richard@nod.at>
Signed-off-by: NAndrew Ruder <andrew.ruder@elecsyscorp.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

807612db

vfs: unexport the getname() symbol · 9115eac2

由 Jeff Layton 提交于 1月 27, 2014

Leaving getname() exported when putname() isn't is a bad idea.
Signed-off-by: NJeff Layton <jlayton@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

9115eac2

ceph: fix missing dput in ceph_set_acl · 77516dc9

由 Sage Weil 提交于 1月 31, 2014

Add matching dput() for d_find_alias().  Move d_find_alias() down a bit
at Julia's suggestion.

[ Introduced by commit 72466d0b: "ceph: fix posix ACL hooks" ]
Reported-by: NFengguang Wu <fengguang.wu@intel.com>
Reported-by: NJulia Lawall <julia.lawall@lip6.fr>
Signed-off-by: NSage Weil <sage@inktank.com>
Reviewed-by: NIlya Dryomov <ilya.dryomov@inktank.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

77516dc9

31 1月, 2014 5 次提交

cifs: Fix check for regular file in couldbe_mf_symlink() · a9a315d4

由 Sachin Prabhu 提交于 1月 31, 2014

MF Symlinks are regular files containing content in a specified format.

The function couldbe_mf_symlink() checks the mode for a set S_IFREG bit
as a test to confirm that it is a regular file. This bit is also set for
other filetypes and simply checking for this bit being set may return
false positives.

We ensure that we are actually checking for a regular file by using the
S_ISREG macro to test instead.
Signed-off-by: NSachin Prabhu <sprabhu@redhat.com>
Reviewed-by: NJeff Layton <jlayton@redhat.com>
Reported-by: NNeil Brown <neilb@suse.de>
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NSteve French <smfrench@gmail.com>

a9a315d4

nfs: initialize the ACL support bits to zero. · a1800aca

由 Malahal Naineni 提交于 1月 27, 2014

Avoid returning incorrect acl mask attributes when the server doesn't
support ACLs.
Signed-off-by: NMalahal Naineni <malahal@us.ibm.com>
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

a1800aca

ceph: simplify ceph_{get,init}_acl · 75858236

由 Christoph Hellwig 提交于 1月 30, 2014

 - ->get_acl only gets called after we checked for a cached ACL, so no
   need to call get_cached_acl again.
 - no need to check IS_POSIXACL in ->get_acl, without that it should
   never get set as all the callers that set it already have the check.
 - you should be able to use the full posix_acl_create in CEPH
Signed-off-by: NChristoph Hellwig <hch@infradead.org>
Signed-off-by: NSage Weil <sage@inktank.com>

75858236

nfs: fix xattr inode op pointers when disabled · 5f13ee9c

由 Christoph Hellwig 提交于 1月 30, 2014

Chris Mason reported a NULL pointer derefernence in generic_getxattr()
that was due to sb->s_xattr being NULL.

The reason is that the nfs #ifdef's for ACL support were misplaced, and
the nfs3 inode operations had the xattr operation pointers set up, even
though xattrs were not actually supported.  As a result, the xattr code
was being called without the infrastructure having been set up.

Move the #ifdef's appropriately.
Reported-and-tested-by: NChris Mason <clm@fb.com>
Acked-by: Al Viro viro@zeniv.linux.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5f13ee9c

P
ceph: remove duplicate declaration of ceph_setattr · 32d35d44
由 Peter Rosin 提交于 1月 30, 2014
```
Signed-off-by: NPeter Rosin <peda@lysator.liu.se>
Signed-off-by: NSage Weil <sage@inktank.com>
```
32d35d44

30 1月, 2014 1 次提交

fs/compat: fix lookup_dcookie() parameter handling · d8d14bd0

由 Heiko Carstens 提交于 1月 29, 2014

Commit d5dc77bf ("consolidate compat lookup_dcookie()") coverted all
architectures to the new compat_sys_lookup_dcookie() syscall.

The "len" paramater of the new compat syscall must have the type
compat_size_t in order to enforce zero extension for architectures where
the ABI requires that the caller of a function performed zero and/or
sign extension to 64 bit of all parameters.
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@vger.kernel.org>	[v3.10+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d8d14bd0

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功