Commits · 8397345172aa5cdcbc133977af9d525f45b874ea · openeuler / raspberrypi-kernel

07 Jun, 2011 2 commits

vfs: make unlink() and rmdir() return ENOENT in preference to EROFS · e6bc45d6

Committed by Theodore Ts'o on Jun 06, 2011

If user space attempts to remove a non-existent file or directory, and
the file system is mounted read-only, return ENOENT instead of EROFS.
Either error code is arguably valid/correct, but ENOENT is a more
specific error message.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
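
The check order described in this message can be sketched as a toy model in Python. The function `remove_entry` and its arguments are hypothetical and only illustrate the precedence of the two error codes; this is not the kernel's code.

```python
import errno

def remove_entry(entries, name, read_only):
    """Toy model of the check order: existence first, then read-only.

    Returns 0 on success or a negative errno value, following the kernel
    convention. `entries` is a set of names standing in for a directory.
    """
    if name not in entries:
        # The more specific error wins, even on a read-only mount.
        return -errno.ENOENT
    if read_only:
        return -errno.EROFS
    entries.remove(name)
    return 0
```

With this ordering, removing a missing name from a read-only mount reports `-ENOENT`, while removing an existing name still reports `-EROFS`.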

lmLogOpen() broken failure exit · 9054760f

Committed by Al Viro on Jun 05, 2011

Callers of lmLogOpen() expect it to return -E... on failure exits, which
is what it returns, except for the case of blkdev_get_by_dev() failure.
In that case lmLogOpen() returns the error with the wrong sign...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
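
As an illustration of why the sign matters, here is a minimal Python sketch of the negative-errno convention. All function names are hypothetical stand-ins, and the buggy variant is only an assumption about the general shape of such a mistake, not a copy of the JFS code.

```python
import errno

def open_device(ok):
    """Hypothetical helper that follows the convention: 0 or -errno."""
    return 0 if ok else -errno.ENXIO

def log_open_buggy(ok):
    rc = open_device(ok)
    if rc:
        return -rc  # bug: flips the sign, so the caller sees a positive value
    return 0

def log_open_fixed(ok):
    rc = open_device(ok)
    if rc:
        return rc  # pass the negative errno through unchanged
    return 0

def caller_sees_failure(rc):
    # Callers that follow the convention test for a negative return value.
    return rc < 0
```

A caller that only checks `rc < 0` would treat the positive value from the buggy variant as success.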

04 Jun, 2011 12 commits

btrfs: fix uninitialized variable warning · aa0467d8

Committed by David Sterba on Jun 03, 2011

With Linus' tree, today's linux-next build (powerpc ppc64_defconfig)
produced this warning:

fs/btrfs/delayed-inode.c: In function 'btrfs_delayed_update_inode':
fs/btrfs/delayed-inode.c:1598:6: warning: 'ret' may be used
uninitialized in this function

Introduced by commit 16cdcec7 ("btrfs: implement delayed inode items
operation").

This fixes a bug in btrfs_update_inode(): if the returned value from
btrfs_delayed_update_inode is a nonzero garbage, inode stat data are not
updated and several call paths may hit a BUG_ON or fail with strange
code.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David Sterba <dsterba@suse.cz>

btrfs: add helper for fs_info->closing · 7841cb28

Committed by David Sterba on May 31, 2011

Wrap checking of the filesystem 'closing' flag and fix a few missing memory
barriers.

Signed-off-by: David Sterba <dsterba@suse.cz>

Btrfs: add mount -o inode_cache · 4b9465cb

Committed by Chris Mason on Jun 03, 2011

This makes the inode map cache default to off until we
fix the overflow problem when the free space crcs don't fit
inside a single page.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: scrub: add explicit plugging · e7786c3a

Committed by Arne Jansen on May 28, 2011

With the removal of the implicit plugging, scrub ends up doing more and
smaller I/O than necessary. This patch adds explicit plugging per chunk.

Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: use btrfs_ino to access inode number · a4689d2b

Committed by David Sterba on May 31, 2011

Commit 4cb5300b ("Btrfs: add mount -o auto_defrag") accesses the inode
number directly, while it should use the helper with the new inode
number allocator.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: don't save the inode cache if we are deleting this root · d132a538

Committed by Josef Bacik on May 31, 2011

With xfstest 254 I can panic the box every time with the inode number caching
stuff on. This is because we clean the inodes out when we delete the subvolume,
but then we write out the inode cache which adds an inode to the subvolume inode
tree, and then when it gets evicted again the root gets added back on the dead
roots list and is deleted again, so we have a double free. To stop this from
happening just return 0 if refs is 0 (and we're not the tree root since tree
root always has refs of 0). With this fix 254 no longer panics. Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: false BUG_ON when degraded · 5f3f302a

Committed by Arne Jansen on May 30, 2011

In degraded mode the struct btrfs_device of a missing dev doesn't have
device->name set. A kstrdup of NULL correctly returns NULL. Don't
BUG in this case.

Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: don't save the inode cache in non-FS roots · ca456ae2

Committed by liubo on Jun 01, 2011

This adds extra checks to make sure the inode map we are caching really
belongs to a FS root instead of a special relocation tree. It
prevents crashes during balancing operations.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: make sure we don't overflow the free space cache crc page · 211f96c2

Committed by Chris Mason on Jun 03, 2011

The free space cache uses only one page for crcs right now,
which means we can't have a cache file bigger than the
crcs we can fit in the first page. This adds a check to
enforce that restriction.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
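
The restriction can be illustrated with a small arithmetic sketch. The page size, the checksum size, and the optional `header_bytes` parameter are assumptions chosen for illustration; the exact expression used by the patch is not given in the message.

```python
PAGE_SIZE = 4096  # assumed page size in bytes
CRC_SIZE = 4      # assumed size of one checksum in bytes

def crcs_fit_in_one_page(num_pages, header_bytes=0):
    """Return True if one checksum per cached page fits in a single page.

    header_bytes models any extra data stored next to the checksums
    (hypothetical; pass 0 to ignore it).
    """
    return num_pages * CRC_SIZE + header_bytes <= PAGE_SIZE
```

Under these assumptions a cache file of up to 1024 pages passes the check, and anything larger is rejected.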

Btrfs: fix uninit variable in the delayed inode code · 17aca1c9

Committed by Chris Mason on Jun 03, 2011

The nitems counter needs to start at zero

Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: scrub: don't reuse bios and pages · 1bc87793

Committed by Arne Jansen on May 28, 2011

The current scrub implementation reuses bios and pages as often as possible,
allocating them only on start and releasing them when finished. This leads
to more problems with the block layer than it's worth. The elevator gets
confused when there are more pages added to the bio than bi_size suggests.
This patch completely rips out the reuse of bios and pages and allocates
them freshly for each submit.

Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

more conservative S_NOSEC handling · 9e1f1de0

Committed by Al Viro on Jun 03, 2011

Caching "we have already removed suid/caps" was overenthusiastic as merged.
On network filesystems we might have had suid/caps set on another client,
silently picked by this client on revalidate, all of that *without* clearing
the S_NOSEC flag.

AFAICS, the only reasonably sane way to deal with that is
	* new superblock flag; unless set, S_NOSEC is not going to be set.
	* local block filesystems set it in their ->mount() (more accurately,
mount_bdev() does, so does btrfs ->mount(), users of mount_bdev() other than
local block ones clear it)
	* if any network filesystem (or a cluster one) wants to use S_NOSEC,
it'll need to set MS_NOSEC in sb->s_flags *AND* take care to clear S_NOSEC when
inode attribute changes are picked from other clients.

It's not an earth-shattering hole (anybody that can set suid on another client
will almost certainly be able to write to the file before doing that anyway),
but it's a bug that needs fixing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
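
The gating rule from the first bullet can be sketched as follows. The flag values are placeholders rather than the kernel's constants, and `maybe_cache_nosec` is a hypothetical helper that only models "S_NOSEC is set only if the superblock opted in".

```python
MS_NOSEC = 0x1  # placeholder value for the superblock flag
S_NOSEC = 0x2   # placeholder value for the per-inode flag

def maybe_cache_nosec(sb_flags, inode_flags, needs_priv_removal):
    """Cache "nothing to strip" on the inode only if the superblock allows it."""
    if not needs_priv_removal and (sb_flags & MS_NOSEC):
        inode_flags |= S_NOSEC
    return inode_flags
```

A filesystem whose superblock never sets the opt-in flag therefore never gets the cached per-inode flag, which is the conservative default the message argues for.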

03 Jun, 2011 6 commits

UBIFS: fix-up free space earlier · 09801194

Committed by Ben Gardiner on May 30, 2011

The free space fixup is currently initiated during mount after the call to
ubifs_write_master() which results in a write to PEBs; this has been observed
with the patch 'assert no fixup when writing a node' applied:

Move the free space fixup on mount to before the calls to
ubifs_recover_inl_heads() and ubifs_write_master(). This results in no
assertions with the previously mentioned patch applied.

Artem: tweaked the patch a bit

Signed-off-by: Ben Gardiner <bengardiner@nanometrics.ca>
Reviewed-by: Matthew L. Creech <mlcreech@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBIFS: intialize LPT earlier · 781c5717

Committed by Ben Gardiner on May 30, 2011

The current 'mount_ubifs()' implementation does not initialize the LPT until
the master node is marked dirty. Move the LPT initialization to before marking
the master node dirty. This is a preparation for the next patch which will move
the free-space-fixup check to before marking the master node dirty, because we
have to fix-up the free space before doing any writes.

Artem: massaged the patch and commit message.

Signed-off-by: Ben Gardiner <bengardiner@nanometrics.ca>
Reviewed-by: Matthew L. Creech <mlcreech@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBIFS: assert no fixup when writing a node · 4f1ab9b0

Committed by Ben Gardiner on May 30, 2011

The current free space fixup can result in some writing to the UBI volume
when the space_fixup flag is set.

To catch instances where UBIFS is writing to the NAND while the space_fixup
flag is set, add an assert to ubifs_write_node().

Artem: tweaked the patch, added similar assertion to the write buffer
       write path.

Signed-off-by: Ben Gardiner <bengardiner@nanometrics.ca>
Reviewed-by: Matthew L. Creech <mlcreech@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBIFS: fix clean znode counter corruption in error cases · 83707237

Committed by Artem Bityutskiy on May 31, 2011

UBIFS maintains per-filesystem and global clean znode counters
('c->clean_zn_cnt' and 'ubifs_clean_zn_cnt'). It is important to maintain
correct values there since the shrinker relies on 'ubifs_clean_zn_cnt'.

However, in case of failures during commit the counters were corrupted. E.g.,
if a failure happens in the middle of 'write_index()', then some nodes in the
commit list ('c->cnext') are marked as clean, and some are marked as dirty. And
'ubifs_destroy_tnc_subtree()' does not return the correct count, and we
end up with non-zero 'c->clean_zn_cnt' when unmounting. This means that if we
have 2 file-systems and one of them fails, and we unmount it,
'ubifs_clean_zn_cnt' stays incorrect and confuses the shrinker.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBIFS: fix memory leak on error path · 812eb258

Committed by Artem Bityutskiy on May 31, 2011

UBIFS leaks memory on error path in 'ubifs_jnl_update()' in case of write
failure because it forgets to free the 'struct ubifs_dent_node *dent' object.
Although the object is small, the alignment can make it large - e.g., 2KiB
if the min. I/O unit is 2KiB.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: stable@kernel.org
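
The size inflation mentioned in this message is plain round-up arithmetic. A minimal sketch, where `align_up` is a hypothetical helper and the 100-byte object size is an arbitrary example, not the real size of the structure:

```python
def align_up(size, unit):
    """Round size up to the next multiple of unit (unit must be positive)."""
    return (size + unit - 1) // unit * unit

# An arbitrary small object rounded up to a 2 KiB min. I/O unit.
leaked_bytes = align_up(100, 2048)
```

Here `leaked_bytes` is 2048, which shows how a small leaked object can cost a full I/O unit per failed write.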

UBIFS: fix shrinker object count reports · cf610bf4

Committed by Artem Bityutskiy on May 31, 2011

Sometimes VM asks the shrinker to return amount of objects it can shrink,
and we return the ubifs_clean_zn_cnt in that case. However, it is possible
that this counter is negative for a short period of time, due to the way
UBIFS TNC code updates it. And I can observe the following warnings sometimes:

shrink_slab: ubifs_shrinker+0x0/0x2b7 [ubifs] negative objects to delete nr=-8541616642706119788

This patch makes sure UBIFS never returns a negative count of objects.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: stable@kernel.org
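
One way to guarantee a non-negative report is to clamp the counter before returning it. The sketch below clamps to zero purely as an illustrative choice; the message does not say which value the patch substitutes, and `reported_object_count` is a hypothetical name.

```python
def reported_object_count(clean_zn_cnt):
    """Never report a negative number of freeable objects."""
    return max(clean_zn_cnt, 0)
```

A transiently negative counter is then reported as 0 instead of being passed on to the caller.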

01 Jun, 2011 5 commits

UBIFS: fix recovery broken by the previous recovery fix · da8b94ea

Committed by Artem Bityutskiy on May 26, 2011

Unfortunately, the recovery fix d1606a59b6be4ea392eabd40d1250aa1eeb19efb
(UBIFS: fix extremely rare mount failure) broke recovery. That commit made
UBIFS drop the last min. I/O unit in all journal heads, but this is needed only
for the GC head. And this does not work for non-GC heads. For example,
suppose we have min. I/O units A and B, and A contains a valid node X, which
was fsynced, and then a group of nodes Y which spans the rest of A and B. In
this case we'll drop not only Y, but also X, which is obviously incorrect.

This patch fixes the issue and additionally makes recovery drop the last min.
I/O unit only for the GC head, and leave things as they have been for ages for
the other heads - this is safer.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBIFS: amend ubifs_recover_leb interface · efcfde54

Committed by Artem Bityutskiy on May 26, 2011

Instead of passing "grouped" parameter to 'ubifs_recover_leb()' which tells
whether the nodes are grouped in the LEB to recover, pass the journal head
number and let 'ubifs_recover_leb()' look at the journal head's 'grouped' flag.

This patch is a preparation to a further fix where we'll need to know the
journal head number for other purposes.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBIFS: introduce a "grouped" journal head flag · 1a0b0699

Committed by Artem Bityutskiy on May 26, 2011

Journal heads differ in how UBIFS writes nodes there. All normal
journal heads receive grouped nodes, while the GC journal head receives
ungrouped nodes. This patch adds a 'grouped' flag to 'struct ubifs_jhead' which
describes this property.

This patch is a preparation to a further recovery fix.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

UBIFS: supress false error messages · ab75950b

Committed by Artem Bityutskiy on May 26, 2011

Commit ab51afe05273741f72383529ef488aa1ea598ec6 was a good clean-up, but
it introduced a regression - now UBIFS prints scary error messages during
recovery on all corrupted nodes, even though the corruptions are expected
(due to a power cut). This patch fixes the issue.

Additionally fix a typo in a comment introduced by the same commit.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

block: blkdev_get() should access ->bd_disk only after success · 4c49ff3f

Committed by Tejun Heo on Jun 01, 2011

d4dc210f (block: don't block events on excl write for non-optical
devices) added dereferencing of bdev->bd_disk to test
GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE; however, bdev->bd_disk can be
%NULL if open failed which can lead to an oops.

Test the flag after testing open was successful, not before.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: David Miller <davem@davemloft.net>
Tested-by: David Miller <davem@davemloft.net>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
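
The reordering described in this message (check that the open succeeded before touching `bd_disk`) can be modelled in Python. The classes, the flag value, and `should_block_events` are hypothetical stand-ins, with `None` playing the role of a NULL pointer.

```python
GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE = 0x1  # placeholder value, not the kernel's

class Disk:
    def __init__(self, flags):
        self.flags = flags

class BlockDevice:
    def __init__(self, disk=None):
        self.bd_disk = disk  # stays None when the open failed

def should_block_events(ret, bdev):
    """Check the open result first, and only then look at bd_disk."""
    if ret != 0:
        return False
    return bool(bdev.bd_disk.flags & GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE)
```

Testing the flag before the result check would raise an `AttributeError` on a failed open in this model, which is the Python analogue of the oops.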

30 May, 2011 15 commits

autofs4: bogus dentry_unhash() added in ->unlink() · c7427d23

Committed by Al Viro on May 30, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: shrink_dcache_parent before rmdir, dir rename · 3cebde24

Committed by Sage Weil on May 29, 2011

The dentry_unhash push-down series missed that shrink_dcache_parent needs to
be called prior to rmdir or dir rename to clear DCACHE_REFERENCED and
allow efficient dentry reclaim.

Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Revert "block: Remove extra discard_alignment from hd_struct." · a1706ac4

Committed by Jens Axboe on May 30, 2011

It was not a good idea to start dereferencing disk->queue from
the fs sysfs strategy for displaying discard alignment. We first ran
into a NULL pointer deref, and after fixing that we sometimes
see invalid disk->queue pointer values.

Since discard is the only one of the bunch actually looking into
the queue, just revert the change.

This reverts commit 23ceb5b7.

Conflicts:
	fs/partitions/check.c

eCryptfs: Remove ecryptfs_header_cache_2 · 30632870

Committed by Tyler Hicks on May 24, 2011

Now that ecryptfs_lookup_interpose() is no longer using
ecryptfs_header_cache_2 to read in metadata, the kmem_cache can be
removed and the ecryptfs_header_cache_1 kmem_cache can be renamed to
ecryptfs_header_cache.

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

eCryptfs: Cleanup and optimize ecryptfs_lookup_interpose() · 778aeb42

Committed by Tyler Hicks on May 24, 2011

ecryptfs_lookup_interpose() has turned into spaghetti code over the
years. This is an effort to clean it up.

 - Shorten overly descriptive variable names such as ecryptfs_dentry
 - Simplify gotos and error paths
 - Create helper function for reading plaintext i_size from metadata

It also includes an optimization when reading i_size from the metadata.
A complete page-sized kmem_cache_alloc() was being done to read in 16
bytes of metadata. The buffer for that is now statically declared.

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

eCryptfs: Return useful code from contains_ecryptfs_marker · 7a86617e

Committed by Tyler Hicks on May 02, 2011

Instead of having the calling functions translate the true/false return
code to either 0 or -EINVAL, have contains_ecryptfs_marker() return 0 or
-EINVAL so that the calling functions can just reuse the return code.

Also, rename the function to ecryptfs_validate_marker() to avoid callers
mistakenly thinking that it returns true/false codes.

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
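
The change in return convention can be sketched as follows. The two-byte `MARKER` and all function names are hypothetical; the sketch only contrasts a boolean-returning check that every caller must translate with a check that returns 0 or `-EINVAL` directly.

```python
import errno

MARKER = b"\x3c\x81"  # hypothetical marker bytes, for illustration only

def contains_marker(data):
    """Old style: returns True/False, callers translate to an error code."""
    return data.startswith(MARKER)

def validate_marker(data):
    """New style: returns 0 on success or -EINVAL on failure."""
    return 0 if data.startswith(MARKER) else -errno.EINVAL

def read_header_old(data):
    # Each caller repeats the translation from a boolean to an error code.
    if not contains_marker(data):
        return -errno.EINVAL
    return 0

def read_header_new(data):
    # The caller simply reuses the return code.
    return validate_marker(data)
```

Both callers behave identically, but the second one has no translation step to get wrong.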

eCryptfs: Fix new inode race condition · 3b06b3eb

Committed by Tyler Hicks on May 24, 2011

Only unlock and d_add() new inodes after the plaintext inode size has
been read from the lower filesystem. This fixes a race condition that
was sometimes seen during a multi-job kernel build in an eCryptfs mount.

https://bugzilla.kernel.org/show_bug.cgi?id=36002

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Reported-by: David <david@unsolicited.net>
Tested-by: David <david@unsolicited.net>

cifs/ubifs: Fix shrinker API change fallout · ef1d5759

Committed by Al Viro on May 29, 2011

Commit 1495f230 ("vmscan: change shrinker API by passing
shrink_control struct") changed the API of ->shrink(), but missed ubifs
and cifs instances.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pnfs-obj: pg_test check for max_io_size · 93420770

Committed by Boaz Harrosh on May 25, 2011

Implement pg_test vector to test for max IO sizes. We calculate
a max_io_size member only once, and cache it in lseg so as not to
do so on every page insert.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[simplify logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>

NFSv4.1: define nfs_generic_pg_test · 5b36c7dc

Committed by Boaz Harrosh on May 29, 2011

By default, unless pnfs is used, coalesce pages until pg_bsize
(rsize or wsize) is reached.

pnfs layout drivers define their own pg_test methods that use
pnfs_generic_pg_test and need to define their own I/O size
limits (e.g. based on the file stripe size).

[Move a check from nfs_pageio_do_add_request to nfs_generic_pg_test]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
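
Coalescing up to a byte limit can be sketched as a predicate plus a batching loop. The names `generic_pg_test` and `coalesce` and the exact boundary condition are assumptions made for illustration; they model the idea of a size test only, not the NFS client's data structures.

```python
def generic_pg_test(current_bytes, req_bytes, pg_bsize):
    """Return True if the request still fits within the pg_bsize limit."""
    return current_bytes + req_bytes <= pg_bsize

def coalesce(request_sizes, pg_bsize):
    """Group request sizes into batches, starting a new batch when the test fails."""
    batches, current, total = [], [], 0
    for size in request_sizes:
        if current and not generic_pg_test(total, size, pg_bsize):
            batches.append(current)
            current, total = [], 0
        current.append(size)
        total += size
    if current:
        batches.append(current)
    return batches
```

A layout driver with its own limit (for example one derived from a stripe size) would supply a different predicate in place of `generic_pg_test`.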

NFSv4.1: use pnfs_generic_pg_test directly by layout driver · 89a58e32

Committed by Benny Halevy on May 25, 2011

Signed-off-by: Benny Halevy <bhalevy@panasas.com>

NFSv4.1: change pg_test return type to bool · 18ad0a9f

Committed by Benny Halevy on May 25, 2011

Signed-off-by: Benny Halevy <bhalevy@panasas.com>

NFSv4.1: unify pnfs_pageio_init functions · dfed206b

Committed by Benny Halevy on May 25, 2011

Use common code for pnfs_pageio_init_{read,write} and use
a common generic pg_test function.

Note that this function always assumes the layout driver's
pg_test method is implemented.

[Fix BUG]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>

pnfs-obj: objlayout_encode_layoutcommit implementation · a0fe8bf4

Committed by Boaz Harrosh on May 22, 2011

* Define API for io-engines to report delta_space_used in IOs
* Encode the osd-layout specific information of the layoutcommit
  XDR buffer.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>

pnfs: encode_layoutcommit · ac7db726

Committed by Benny Halevy on May 22, 2011

Add a layout driver method to encode the layout type specific
opaque part of layout commit in-line in the xdr stream.

Currently, the pnfs-objects layout driver uses it to encode metadata hints
to the MDS and the blocks layout driver to commit provisionally allocated
extents to the file.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>