- 04 Jun 2009, 1 commit
Committed by Yan Zheng:
It was not being properly initialized, and so the size saved to disk was not correct.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
- 15 May 2009, 6 commits
Committed by Sankar P:
Signed-off-by: Sankar P <sankar.curiosity@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Sage Weil:
The notreelog and flushoncommit mount options were being printed slightly differently.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Li Hong:
In Li Zefan's commit dae7b665, a combined kmalloc() and copy_from_user() call was replaced by memdup_user(), so btrfs_ioctl_resize() no longer uses GFP_NOFS.
Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
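
For context, a minimal sketch of the before/after pattern this commit refers to; copy_arg_old(), copy_arg_new(), user_arg, and copy_len are illustrative names, not the btrfs_ioctl_resize() code itself:

```c
#include <linux/err.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/uaccess.h>

/* Old pattern: explicit allocation (GFP_NOFS here) plus a separate copy. */
static void *copy_arg_old(const void __user *user_arg, size_t copy_len)
{
	void *buf = kmalloc(copy_len, GFP_NOFS);

	if (!buf)
		return ERR_PTR(-ENOMEM);
	if (copy_from_user(buf, user_arg, copy_len)) {
		kfree(buf);
		return ERR_PTR(-EFAULT);
	}
	return buf;
}

/* New pattern: memdup_user() allocates and copies in one call and returns an ERR_PTR on failure. */
static void *copy_arg_new(const void __user *user_arg, size_t copy_len)
{
	return memdup_user(user_arg, copy_len);
}
```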
Committed by Chris Mason:
These debugging WARN_ONs make too much console noise during regular IO failures. An IO failure will still generate a number of messages as we verify checksums and so on, but these two are not needed.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Chris Mason:
When a btrfs metadata read fails, the first thing we try to do is find a good copy on another mirror of the block. If this fails, read_tree_block() ends up returning a buffer that isn't up to date. The btrfs btree reading code was reworked to drop locks and repeat the search when IO was done, but the changes didn't add a check for failed reads. The end result was looping forever on buffers that were never going to become up to date.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Chris Mason:
This flag is used to decide when we need to send a given file through the ordered code to make sure it is fully written before a transaction commits. It was not being properly set to zero when the inode was being set up.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
- 28 Apr 2009, 2 commits
Committed by Chris Mason:
This changes btrfs_read_locked_inode() to peek ahead in the btree for acl items. If it is certain a given inode has no acls, it will set the in-memory acl fields to NULL to avoid acl lookups completely.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Chris Mason:
Linus noticed the btrfs code to cache acls wasn't properly caching a NULL acl when the inode didn't have any acls. This meant the common case of no acls resulted in expensive btree searches every time the kernel checked permissions (which is quite often). This is a modified version of Linus' original patch:
- Properly set the initial acl fields to BTRFS_ACL_NOT_CACHED in the inode. This forces an acl lookup when permission checks are done.
- Fix btrfs_get_acl to avoid lookups and locking when the inode acl fields are set to NULL.
- Fix btrfs_get_acl to use the right return value from __btrfs_getxattr when deciding to cache a NULL acl. It was storing a NULL acl when __btrfs_getxattr returned -ENOENT, but __btrfs_getxattr was actually returning -ENODATA for this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
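
A small hedged sketch of the return-value distinction the last point relies on; should_cache_null_acl() is a hypothetical helper, not a btrfs function:

```c
#include <linux/errno.h>

/*
 * A missing xattr value is reported as -ENODATA, not -ENOENT, so -ENODATA is
 * the case where it is safe to remember that the inode has no acl at all.
 */
static int should_cache_null_acl(int getxattr_ret)
{
	return getxattr_ret == -ENODATA;
}
```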
- 27 Apr 2009, 6 commits
Committed by Joel Becker:
Just happened to notice a bunch of %llu vs u64 warnings. Here's a patch to cast them all.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
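
A minimal illustration of the cast pattern such a patch applies (the function and variable names are made up): on architectures where u64 is unsigned long, printing it with %llu without a cast triggers the warning.

```c
#include <linux/kernel.h>
#include <linux/types.h>

static void report_bad_block(u64 bytenr)
{
	/* The cast makes the argument type match %llu on every architecture. */
	printk(KERN_ERR "bad block %llu\n", (unsigned long long)bytenr);
}
```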
Committed by Joel Becker:
A small warning popped up on ia64 because inode-map.c was comparing a u64 object id with the ULL constant FIRST_FREE_OBJECTID. My first thought was that all the OBJECTID constants should contain the u64 cast, because btrfs code deals entirely in u64s. But then I saw how large that change was, and figured I'd just fix the max() call instead.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
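
A hedged sketch of one way to silence such a warning with the kernel's max_t() macro; EXAMPLE_FIRST_FREE_OBJECTID and pick_search_start() are stand-ins, not the btrfs names:

```c
#include <linux/kernel.h>
#include <linux/types.h>

/* Stand-in for the btrfs constant, which is an unsigned long long literal. */
#define EXAMPLE_FIRST_FREE_OBJECTID 256ULL

static u64 pick_search_start(u64 objectid)
{
	/*
	 * max() complains when its two arguments have different types
	 * (u64 is unsigned long on ia64, the literal is unsigned long long);
	 * max_t() casts both sides to the named type.
	 */
	return max_t(u64, objectid, EXAMPLE_FIRST_FREE_OBJECTID);
}
```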
Committed by Chris Mason:
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Chris Mason:
Btrfs has printks for various IO errors, including bad checksums and mismatches between what we expect the block headers to contain and what we actually find on the disk. Longer term we need a real reporting mechanism for this, but for now printk is going to have to do.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Chris Mason:
Btrfs had some old code sitting around under #if 0; this drops it.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Chris Ball:
Previously, we updated a device's size prior to attempting a shrink operation. This patch moves the device resizing logic to happen only if the shrink completes successfully. In the process, it introduces a new field to btrfs_device -- disk_total_bytes -- to track the on-disk size.
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
- 25 Apr 2009, 6 commits
Committed by Chris Mason:
After a transaction commit, the old roots of the subvol btrees are sent through snapshot removal. This is what actually frees up any blocks replaced by COW, and anything the old blocks pointed to. Snapshot deletion will pause when a transaction commit has started, which helps to avoid a huge amount of delayed reference count updates piling up as the transaction is trying to close. But this pause happens after the snapshot deletion process has asked other procs on the system to throttle back a bit so that it can make progress. We don't want to throttle everyone while we're waiting for the transaction commit; that leads to deadlocks in the user transaction ioctls used by Ceph and makes things slower in general. This patch changes things to avoid the throttling while we sleep.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Chris Mason:
The btrfs fallocate call takes an extent lock on the entire range being fallocated, and then runs through insert_reserved_extent on each extent as they are allocated. The problem with this is that btrfs_drop_extents may decide to try to take the same extent lock fallocate was already holding. The solution used here is to push knowledge of the range that is already locked down into btrfs_drop_extents. It turns out that at least one other caller had the same bug.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Christoph Hellwig:
Just use kmem_cache_create directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
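
For reference, a minimal sketch of calling kmem_cache_create() directly; the structure and cache names are illustrative, not the btrfs ones:

```c
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/types.h>

/* An illustrative structure; the names here are not from btrfs. */
struct example_item {
	u64 start;
	u64 len;
};

static struct kmem_cache *example_item_cache;

static int __init example_cache_init(void)
{
	/* Create the slab cache directly instead of going through a wrapper. */
	example_item_cache = kmem_cache_create("example_item_cache",
					       sizeof(struct example_item),
					       0, 0, NULL);
	if (!example_item_cache)
		return -ENOMEM;
	return 0;
}

static void __exit example_cache_exit(void)
{
	kmem_cache_destroy(example_item_cache);
}
```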
Committed by Christoph Hellwig:
Currently the extent_map code is only used by btrfs, so don't export its symbols.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Christoph Hellwig:
Get rid of the hacks for building out of tree, and always use += for assigning to the object lists.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Josef Bacik:
This patch makes the chunk allocator keep a good ratio of metadata vs data block groups. By default, for every 8 data block groups we'll allocate 1 metadata chunk, so about 12% of the disk will be allocated for metadata. This can be changed by specifying the metadata_ratio mount option, which is simply the number of data block groups that have to be allocated to force a metadata chunk allocation. By making sure we allocate metadata chunks more often, we are less likely to get into situations where the whole disk has been allocated as data block groups.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
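
A hedged sketch of the ratio rule only; the field and function names are illustrative, not the btrfs implementation:

```c
#include <linux/types.h>

/*
 * Once metadata_ratio data block groups have been allocated since the last
 * metadata chunk, force a new metadata chunk. A ratio of 0 disables the rule.
 */
static int should_force_metadata_chunk(u64 data_chunks_since_meta,
				       unsigned int metadata_ratio)
{
	return metadata_ratio && data_chunks_since_meta >= metadata_ratio;
}
```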
- 22 Apr 2009, 1 commit
Committed by Chris Mason:
Btrfs fallocate was incorrectly starting a transaction with a lock held on the extent_io tree for the file, which could deadlock. Strictly speaking it was using join_transaction, which would be safe, but it is better to move the transaction outside of the lock. When preallocated extents are overwritten, btrfs_mark_buffer_dirty was being called on an unlocked buffer. This was triggering an assertion and oops because the lock is supposed to be held. The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had been run. btrfs_del_item takes care of dirtying things, so the solution is to skip the btrfs_mark_buffer_dirty call in this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
- 21 Apr 2009, 4 commits
Committed by Chris Mason:
reada_for_balance was using the wrong index into the path node array, so it wasn't reading the right blocks. We never directly used the results of the read done by this function because the btree search is started over at the end. This fixes reada_for_balance to read ahead in the correct node and to avoid searching past the last slot in the node. It also makes sure to hold the parent lock while we are finding the nodes to read.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Committed by Chris Mason:
The extent_io writepage call updates the writepage index in the inode as it makes progress. But it was doing the update after unlocking the page, which isn't legal because page->mapping can't be trusted once the page is unlocked. This led to an oops, especially common with compression turned on. The fix here is to update the writeback index before unlocking the page.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
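
A minimal sketch of the ordering rule behind the fix, with illustrative names; the point is simply that page->index and page->mapping are only stable while the page lock is held:

```c
#include <linux/mm.h>
#include <linux/pagemap.h>

static void finish_one_page(struct page *page, pgoff_t *next_index)
{
	/* Safe: the page lock keeps page->mapping and page->index stable. */
	*next_index = page->index + 1;

	unlock_page(page);
	/* Not safe to touch page->mapping or page->index after this point. */
}
```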
Committed by Chris Mason:
Btrfs is using WRITE_SYNC_PLUG to send down synchronous IOs with a higher priority, but the checksumming helper threads prevent it from being fully effective. There are two problems. First, a big queue of pending checksumming will delay the synchronous IO behind other lower priority writes. Second, the checksumming uses an ordered async work queue. The ordering makes sure that IOs are sent to the block layer in the same order they are sent to the checksumming threads. Usually this gives us less seeky IO. But when we start mixing IO priorities, the lower priority IO can delay the higher priority IO. This patch solves both problems by adding a high priority list to the async helper threads, and a new btrfs_set_work_high_prio(), which is used to put a new async work item onto the higher priority list. The ordering is still done on high priority IO, but all of the high priority bios are ordered separately from the low priority bios. This ordering is purely an IO optimization; it is not involved in data or metadata integrity.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
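
A hedged sketch of the two-list idea, assuming made-up structures rather than the btrfs async-thread code: the worker drains the high-priority list first, so ordering is still preserved within each priority class.

```c
#include <linux/list.h>
#include <linux/spinlock.h>

struct example_work {
	struct list_head list;
};

struct example_worker {
	spinlock_t lock;
	struct list_head prio_pending;	/* e.g. work backing synchronous IO */
	struct list_head pending;	/* everything else */
};

static struct example_work *example_pick_next(struct example_worker *w)
{
	struct list_head *src;
	struct example_work *work = NULL;

	spin_lock(&w->lock);
	/* High-priority entries are always dispatched ahead of the rest. */
	src = !list_empty(&w->prio_pending) ? &w->prio_pending : &w->pending;
	if (!list_empty(src)) {
		work = list_first_entry(src, struct example_work, list);
		list_del_init(&work->list);
	}
	spin_unlock(&w->lock);
	return work;
}
```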
Committed by Chris Mason:
Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for writes we plan on waiting on in the near future. This patch mirrors recent changes in other filesystems and the generic code to use WRITE_SYNC when WB_SYNC_ALL is passed, and to use WRITE_SYNC for other latency-critical writes. Btrfs uses async worker threads for checksumming before the write is done, and then again to actually submit the bios. The bio submission code just runs a per-device list of bios that need to be sent down the pipe. This list is split into low priority and high priority lists so the WRITE_SYNC IO happens first.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
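
A minimal sketch of the flag selection described above, written against the 2.6.30-era flag names mentioned in the commit (later kernels renamed them); choose_write_flags() is an illustrative helper:

```c
#include <linux/fs.h>
#include <linux/writeback.h>

/*
 * WB_SYNC_ALL means the caller will wait on this writeback, so submit the
 * IO as WRITE_SYNC and let the block layer treat it as latency critical.
 */
static int choose_write_flags(struct writeback_control *wbc)
{
	return wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE;
}
```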
- 14 Apr 2009, 3 commits
Committed by Jan Kara:
If two writers allocating blocks to a file race with each other (e.g. because writepages races with an ordinary write, or two writepages calls race with each other), ext2_get_block() can be called on the same inode in parallel. Before we go on to allocate new blocks, we have to recheck the block chain we have obtained so far without holding truncate_mutex. Otherwise we could overwrite the indirect block pointer set by the other writer, leading to data loss. The test program below by Ying is able to reproduce the data loss with ext2 on BRD in a few minutes if the machine is under memory pressure:

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/mman.h>

    long kMemSize = 50 << 20;
    int kPageSize = 4096;

    int main(int argc, char **argv)
    {
        int status;
        int count = 0;
        int i;
        char *fname = "/mnt/test.mmap";
        char *mem;

        unlink(fname);
        int fd = open(fname, O_CREAT | O_EXCL | O_RDWR, 0600);
        status = ftruncate(fd, kMemSize);
        mem = mmap(0, kMemSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        // Fill the memory with 1s.
        memset(mem, 1, kMemSize);
        sleep(2);
        for (i = 0; i < kMemSize; i++) {
            int byte_good = mem[i] != 0;
            if (!byte_good && ((i % kPageSize) == 0)) {
                //printf("%d ", i / kPageSize);
                count++;
            }
        }
        munmap(mem, kMemSize);
        close(fd);
        unlink(fname);
        if (count > 0) {
            printf("Running %d bad page\n", count);
            return 1;
        }
        return 0;
    }

Cc: Ying Han <yinghan@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Mingming Cao <cmm@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Committed by Jan Kara:
Update information about locking in the JBD revoke code.
Reported-by: Lin Tan <tammy000@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Committed by Dave Anderson:
When an HFS filesystem is unmounted, it leaks a 2-page bitmap. Also, under extreme memory pressure, it's possible that hfs_releasepage() may use a tree pointer that has not been initialized; if so, the release request should just be rejected.
[akpm@linux-foundation.org: free_pages(0) is legal, remove obvious comment]
Signed-off-by: Dave Anderson <anderson@redhat.com>
Tested-by: Eugene Teo <eugeneteo@kernel.sg>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- 13 Apr 2009, 8 commits
Committed by Ryusuke Konishi:
The on-disk counters ndirtysegs and ncleansegs of sufile can go wrong after roll-forward recovery because the nilfs_prepare_segment_for_recovery() function marks segments dirty without adjusting the value of these counters. This fixes the problem by adding a function to sufile which does the operation while adjusting the counters, and by letting the recovery function use it.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Committed by Ryusuke Konishi:
This will simplify sufile.c by sharing common code that repeatedly appears in the routines updating a segment usage entry; a wrapper function nilfs_sufile_update() is introduced for the purpose, and counter modifications are integrated into a new function nilfs_sufile_mod_counter(). This is a preparation for the subsequent bugfix patch ("nilfs2: fix possible mismatch of sufile counters on recovery").
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Committed by Ryusuke Konishi:
The nilfs_sufile_set_error() function wrongly adjusts the number of dirty segments instead of the number of clean segments. In addition, the function calls brelse() twice for the same buffer head. This fixes both bugs.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Committed by Ryusuke Konishi:
This fixes a bug in the ("nilfs2: simplify handling of active state of segments") patch. That patch did not take into account that the base index is increased in the nilfs_sufile_get_suinfo() function when the requested entries cross a block boundary on sufile. Due to this bug, the active flag sometimes appears on the wrong segments and has caused garbage collection to malfunction.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Committed by Ryusuke Konishi:
A MODULE_VERSION() macro has been used in out-of-tree nilfs modules, but it is unneeded and not kept up to date in tree. So, this removes it along with the version declaration.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Committed by Ryusuke Konishi:
This fixes the following false positive from lockdep against nilfs metadata files:

    =============================================
    [ INFO: possible recursive locking detected ]
    2.6.29 #26
    ---------------------------------------------
    mount.nilfs2/4185 is trying to acquire lock:
     (&mi->mi_sem){----}, at: [<d0c7925b>] nilfs_sufile_get_stat+0x1e/0x105 [nilfs2]
    but task is already holding lock:
     (&mi->mi_sem){----}, at: [<d0c72026>] nilfs_count_free_blocks+0x48/0x84 [nilfs2]

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Committed by Ryusuke Konishi:
The bmap semaphore of the DAT file can be held while a bmap of other files is locked. This has caused the following false positive from the lockdep check:

    mount.nilfs2/4667 is trying to acquire lock:
     (&bmap->b_sem){..--}, at: [<d0c6c4b4>] nilfs_bmap_lookup_at_level+0x1a/0x74 [nilfs2]
    but task is already holding lock:
     (&bmap->b_sem){..--}, at: [<d0c6c4b4>] nilfs_bmap_lookup_at_level+0x1a/0x74 [nilfs2]

This fixes the false positive by distinguishing the semaphores of the DAT and other files.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Committed by Ryusuke Konishi:
This follows the change of Coly Li's series ("fs: return f_fsid for statfs(2)") and makes nilfs2 return f_fsid info for statfs(2).
Acked-by: Coly Li <coly.li@suse.de>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
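
For reference, a hedged sketch of the common f_fsid pattern from that series; fill_example_fsid() is an illustrative name, not the nilfs2 function:

```c
#include <linux/fs.h>
#include <linux/kdev_t.h>
#include <linux/statfs.h>

/* Derive a 64-bit id from the block device number and split it across the two fsid words. */
static void fill_example_fsid(struct super_block *sb, struct kstatfs *buf)
{
	u64 id = huge_encode_dev(sb->s_bdev->bd_dev);

	buf->f_fsid.val[0] = (u32)id;
	buf->f_fsid.val[1] = (u32)(id >> 32);
}
```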
- 10 Apr 2009, 1 commit
Committed by Stoyan Gaydarov:
Signed-off-by: Stoyan Gaydarov <stoyboyker@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- 09 Apr 2009, 2 commits
Committed by Miklos Szeredi:
A MAP_PRIVATE mmap could return stale data from the cache for "direct_io" files. Fix this by flushing the cache on mmap. Found with a slightly modified fsx-linux.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Committed by Miklos Szeredi:
Fix the following warning:

    fs/fuse/file.c: In function 'fuse_direct_io':
    fs/fuse/file.c:1002: warning: passing argument 3 of 'fuse_get_user_pages' from incompatible pointer type

This was introduced by commit f4975c67 ("fuse: allow kernel to access "direct_io" files").
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>