Commits · 2f48d593b6ceb7bb63d34124ceba77d33be298cf · OpenHarmony / kernel_linux

29 Oct, 2009 3 commits

ocfs2: duplicate inline data properly during reflink. · 2f48d593

Committed by Tao Ma on Oct 15, 2009

The old reflink fails to handle inodes with inline data and will oops
if it encounters them.  This patch copies inline data to the new inode.
Extended attributes may still be refcounted.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Tested-by: Tristan Ye <tristan.ye@oracle.com>

ocfs2: Move ocfs2_complete_reflink to the right place. · 87f4b1bb

Committed by Tao Ma on Oct 15, 2009

As its name ocfs2_complete_reflink indicates, it should
be called after all the work for reflink is done, so
it really should be called after we reflink xattr
successfully.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Tested-by: Tristan Ye <tristan.ye@oracle.com>

ocfs2: Return -EINVAL when a device is not ocfs2. · fb5cbe9e

Committed by Joel Becker on Oct 28, 2009

In case of non-modular kernels the root filesystem is mounted by trying
several filesystems. If ocfs2 was tried before the actual filesystem
type, the mount would fail because ocfs2_sb_probe() returns -EAGAIN
instead of -EINVAL. ocfs2 will now return -EINVAL properly.
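
A minimal userspace sketch of why the error code matters, assuming (as the message above describes) that the root mount loop moves on to the next filesystem only when a probe reports -EINVAL. The names `try_mount_root`, `struct fs_type` and the three probe functions are hypothetical stand-ins, not kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical probe signature: 0 on success, negative errno on failure. */
typedef int (*probe_fn)(const char *dev);

struct fs_type {
    const char *name;
    probe_fn probe;
};

/* Stand-in probes for illustration only. */
static int probe_not_mine_einval(const char *dev) { (void)dev; return -EINVAL; }
static int probe_not_mine_eagain(const char *dev) { (void)dev; return -EAGAIN; }
static int probe_ok(const char *dev) { (void)dev; return 0; }

/*
 * Try each filesystem in order. Only -EINVAL means "not my format, try the
 * next one"; any other error aborts the whole attempt.
 * Returns the index of the filesystem that mounted, or a negative errno.
 */
static int try_mount_root(const struct fs_type *fs, size_t n, const char *dev)
{
    for (size_t i = 0; i < n; i++) {
        int err = fs[i].probe(dev);
        if (err == 0)
            return (int)i;
        if (err != -EINVAL)
            return err;
    }
    return -EINVAL;
}
```

With a first probe that returns -EAGAIN, the loop stops before it ever reaches the filesystem that would have succeeded.
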
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Reported-by: Laszlo Attila Toth <panther@balabit.hu>

22 Oct, 2009 2 commits

nfs: Fix nfs_parse_mount_options() kfree() leak · 4223a4a1

Committed by Yinghai Lu on Oct 20, 2009

Fix a (small) memory leak in one of the error paths of the NFS mount
options parsing code.

Regression introduced in 2.6.30 by commit a67d18f8 (NFS: load the
rpc/rdma transport module automatically).
Reported-by: Yinghai Lu <yinghai@kernel.org>
Reported-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs: pipe.c null pointer dereference · ad396024

Committed by Earl Chew on Oct 19, 2009

This patch fixes a null pointer exception in pipe_rdwr_open() which
generates the stack trace:

> Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
>  [<ffffffff802899a5>] pipe_rdwr_open+0x35/0x70
>  [<ffffffff8028125c>] __dentry_open+0x13c/0x230
>  [<ffffffff8028143d>] do_filp_open+0x2d/0x40
>  [<ffffffff802814aa>] do_sys_open+0x5a/0x100
>  [<ffffffff8021faf3>] sysenter_do_call+0x1b/0x67

The failure mode is triggered by an attempt to open an anonymous
pipe via /proc/pid/fd/* as exemplified by this script:

=============================================================
while : ; do
   { echo y ; sleep 1 ; } | { while read ; do echo z$REPLY; done ; } &
   PID=$!
   OUT=$(ps -efl | grep 'sleep 1' | grep -v grep |
        { read PID REST ; echo $PID; } )
   OUT="${OUT%% *}"
   DELAY=$((RANDOM * 1000 / 32768))
   usleep $((DELAY * 1000 + RANDOM % 1000 ))
   echo n > /proc/$OUT/fd/1                 # Trigger defect
done
=============================================================

Note that the failure window is quite small and I could only
reliably reproduce the defect by inserting a small delay
in pipe_rdwr_open(). For example:

 static int
 pipe_rdwr_open(struct inode *inode, struct file *filp)
 {
       msleep(100);
       mutex_lock(&inode->i_mutex);

Although the defect was observed in pipe_rdwr_open(), I think it
makes sense to replicate the change through all the pipe_*_open()
functions.

The core of the change is to verify that inode->i_pipe has not
been released before attempting to manipulate it. If inode->i_pipe
is no longer present, return ENOENT to indicate so.

The comment about potentially using atomic_t for i_pipe->readers
and i_pipe->writers has also been removed because it is no longer
relevant in this context. The inode->i_mutex lock must be used so
that inode->i_pipe can be dealt with correctly.
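
The core check described above can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions: `struct inode_model`, `struct pipe_info_model` and `pipe_rdwr_open_model` are hypothetical names, locking is shown only as comments, and the real function also looks at the file mode before touching the counters.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (hypothetical). */
struct pipe_info_model {
    int readers;
    int writers;
};

struct inode_model {
    struct pipe_info_model *i_pipe;   /* NULL once the pipe was released */
};

/*
 * Model of an open for read/write. In the kernel, the check and the counter
 * updates happen with inode->i_mutex held; the lock is omitted here.
 */
static int pipe_rdwr_open_model(struct inode_model *inode)
{
    int ret = -ENOENT;

    /* mutex_lock(&inode->i_mutex) would go here */
    if (inode->i_pipe) {
        ret = 0;
        inode->i_pipe->readers++;
        inode->i_pipe->writers++;
    }
    /* mutex_unlock(&inode->i_mutex) would go here */

    return ret;
}
```
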
Signed-off-by: Earl Chew <earl_chew@agilent.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

21 Oct, 2009 1 commit

dnotify: ignore FS_EVENT_ON_CHILD · 94552684

Committed by Andreas Gruenbacher on Oct 15, 2009

Mask off FS_EVENT_ON_CHILD in dnotify_handle_event().  Otherwise, when there
is more than one watch on a directory and dnotify_should_send_event()
succeeds, events with FS_EVENT_ON_CHILD set will trigger all watches and cause
spurious events.

This case was overlooked in commit e42e2773.

	#define _GNU_SOURCE

	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <signal.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <fcntl.h>
	#include <string.h>

	static void create_event(int s, siginfo_t* si, void* p)
	{
		printf("create\n");
	}

	static void delete_event(int s, siginfo_t* si, void* p)
	{
		printf("delete\n");
	}

	int main (void) {
		struct sigaction action;
		char *tmpdir, *file;
		int fd1, fd2;

		sigemptyset (&action.sa_mask);
		action.sa_flags = SA_SIGINFO;

		action.sa_sigaction = create_event;
		sigaction (SIGRTMIN + 0, &action, NULL);

		action.sa_sigaction = delete_event;
		sigaction (SIGRTMIN + 1, &action, NULL);

	#	define TMPDIR "/tmp/test.XXXXXX"
		tmpdir = malloc(strlen(TMPDIR) + 1);
		strcpy(tmpdir, TMPDIR);
		mkdtemp(tmpdir);

	#	define TMPFILE "/file"
		file = malloc(strlen(tmpdir) + strlen(TMPFILE) + 1);
		/* TMPFILE already begins with '/', so no extra separator is needed */
		sprintf(file, "%s%s", tmpdir, TMPFILE);

		fd1 = open (tmpdir, O_RDONLY);
		fcntl(fd1, F_SETSIG, SIGRTMIN);
		fcntl(fd1, F_NOTIFY, DN_MULTISHOT | DN_CREATE);

		fd2 = open (tmpdir, O_RDONLY);
		fcntl(fd2, F_SETSIG, SIGRTMIN + 1);
		fcntl(fd2, F_NOTIFY, DN_MULTISHOT | DN_DELETE);

		if (fork()) {
			/* This triggers a create event */
			creat(file, 0600);
			/* This triggers a create and delete event (!) */
			unlink(file);
		} else {
			sleep(1);
			rmdir(tmpdir);
		}

		return 0;
	}
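
The masking step this commit describes can be illustrated with a small bit-mask sketch. It assumes that each watch mask also carries the "on child" marker bit, which is how a delete event could match a create-only watch. The `EV_*` constants and `watch_matches` are hypothetical, and the bit values are chosen for illustration rather than copied from the kernel headers.

```c
#include <stdint.h>

/* Illustrative bit values; the kernel's real FS_* constants may differ. */
#define EV_CREATE    0x00000100u
#define EV_DELETE    0x00000200u
#define EV_ON_CHILD  0x08000000u   /* "event happened on a child" marker */

/*
 * Returns nonzero if a watch with mask watch_mask should fire for event.
 * strip_flag selects the fixed behaviour (marker bit masked off first).
 */
static int watch_matches(uint32_t watch_mask, uint32_t event, int strip_flag)
{
    if (strip_flag)
        event &= ~EV_ON_CHILD;   /* drop the marker before comparing */
    return (watch_mask & event) != 0;
}
```

Without the masking, a delete event carrying the marker bit overlaps any watch mask that also carries it, which matches the spurious "create" seen in the test program.
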
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>

19 Oct, 2009 2 commits

inotify: fix coalesce duplicate events into a single event in special case · 3de0ef4f

Committed by Wei Yongjun on Oct 14, 2009

If we rename a dir entry, like this:

  rename("/tmp/ino7UrgoJ.rename1", "/tmp/ino7UrgoJ.rename2")
  rename("/tmp/ino7UrgoJ.rename2", "/tmp/ino7UrgoJ")

the duplicate events should be coalesced into a single event. But the two
events are not coalesced, due to a bad check in event_compare(): it cannot
match two events with NULL inodes as the same event.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: do not set group for a mark before it is on the i_list · 9f0d793b

Committed by Eric Paris on Sep 11, 2009

fsnotify_add_mark is supposed to add a mark to the g_list and i_list and to
set the group and inode for the mark.  fsnotify_destroy_mark_by_entry uses
the fact that ->group != NULL to know if this group should be destroyed or
if it's already been done.

But fsnotify_add_mark sets the group and inode before it actually adds the
mark to the i_list and g_list.  This can result in a race in inotify that
requires 3 threads.

sys_inotify_add_watch("file")	sys_inotify_add_watch("file")	sys_inotify_rm_watch([a])
inotify_update_watch()
inotify_new_watch()
inotify_add_to_idr()
   ^--- returns wd = [a]
				inotify_update_watch()
				inotify_new_watch()
				inotify_add_to_idr()
				fsnotify_add_mark()
				   ^--- returns wd = [b]
				returns to userspace;
								inotify_idr_find([a])
								   ^--- gives us the pointer from task 1
fsnotify_add_mark()
   ^--- this is going to set the mark->group and mark->inode fields, but will
return -EEXIST because of the race with [b].
								fsnotify_destroy_mark()
								   ^--- since ->group != NULL we call back
									into inotify_freeing_mark() which calls
								inotify_remove_from_idr([a])

since fsnotify_add_mark() failed we call:
inotify_remove_from_idr([a])     <------WHOOPS it's not in the idr, this could
					have been any entry added later!

The fix is to make sure we don't set mark->group until we are sure the mark is
on the inode and fsnotify_add_mark will return success.
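
The ordering rule in the fix can be sketched with a toy model. Everything here is a hypothetical simplification: `struct mark_model`, `add_mark_model` and `destroy_mark_model` only mimic the idea that teardown keys off a non-NULL group pointer, so the pointer must not be set on a path that can still fail.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a mark. */
struct mark_model {
    const void *group;   /* non-NULL means "fully added, needs teardown" */
    int on_lists;
};

/*
 * Add a mark. The group pointer is published only once we know the add
 * will succeed, so a failed add never leaves group set.
 */
static int add_mark_model(struct mark_model *m, const void *group,
                          int duplicate_exists)
{
    if (duplicate_exists)
        return -EEXIST;      /* bail out before touching m->group */

    m->on_lists = 1;
    m->group = group;
    return 0;
}

/* Teardown runs only for marks whose group is set. Returns 1 if it ran. */
static int destroy_mark_model(struct mark_model *m)
{
    if (m->group == NULL)
        return 0;
    m->group = NULL;
    m->on_lists = 0;
    return 1;
}
```

In this model a mark whose add failed with -EEXIST is skipped by the teardown path, so only the failing caller cleans up after it.
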
Signed-off-by: Eric Paris <eparis@redhat.com>

15 Oct, 2009 2 commits

sysfs: Allow sysfs_notify_dirent to be called from interrupt context. · 83db93f4

Committed by Neil Brown on Sep 15, 2009

sysfs_notify_dirent is a simple atomic operation that can be used to
alert user-space that new data can be read from a sysfs attribute.

Unfortunately it cannot currently be called from non-process context
because of its use of spin_lock which is sometimes taken with
interrupts enabled.

So change all lockers of sysfs_open_dirent_lock to disable interrupts,
thus making sysfs_notify_dirent safe to be called from non-process
context (as drivers/md does in md_safemode_timeout).

sysfs_get_open_dirent is (documented as being) only called from
process context, so it uses spin_lock_irq.  Other places
use spin_lock_irqsave.

The usage for sysfs_notify_dirent in md_safemode_timeout was
introduced in 2.6.28, so this patch is suitable for that and more
recent kernels.
Reported-by: Joel Andres Granados <jgranado@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sysfs: Allow sysfs_move_dir(..., NULL) again. · a6a83577

Committed by Cornelia Huck on Oct 06, 2009

As device_move() and kobject_move() both handle a NULL destination,
sysfs_move_dir() should do this as well (again) and fall back to
sysfs_root in that case.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

14 Oct, 2009 9 commits

Btrfs: always pin metadata in discard mode · 444528b3

Committed by Chris Mason on Oct 14, 2009

We have an optimization in btrfs to allow blocks to be
immediately freed if they were allocated in this transaction and never
written.  Otherwise they are pinned and freed when the transaction
commits.

This isn't optimal for discard mode because immediately freeing
them means immediately discarding them.  It is better to give the
block to the pinning code and let the (slow) discard happen later.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: enable discard support · 06348574

Committed by Christoph Hellwig on Oct 14, 2009

The discard support code in btrfs is currently guarded by ifdefs for
BIO_RW_DISCARD, which is never defined as a macro because it is the name
of an enum member.  Just remove the useless ifdefs to actually enable the code.
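
A short sketch of the underlying C behaviour: the preprocessor only knows about macros, so an `#ifdef` on an enum member is always false. The names `my_rw_flags`, `MY_RW_DISCARD` and `discard_compiled_in` are hypothetical stand-ins for the kernel identifiers.

```c
/* Hypothetical names; MY_RW_DISCARD stands in for an enum member. */
enum my_rw_flags {
    MY_RW_SYNC,
    MY_RW_DISCARD,
};

/* Reports whether the guarded branch was compiled in. */
static int discard_compiled_in(void)
{
#ifdef MY_RW_DISCARD
    return 1;   /* never compiled: the preprocessor does not see enum members */
#else
    return 0;
#endif
}
```

The enum member is a perfectly valid constant at compile time, yet the guarded branch is silently dropped.
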
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: add -o discard option · e244a0ae

Committed by Christoph Hellwig on Oct 14, 2009

Enabling discard by default is not a good idea given the trim speed
of SSD prototypes we've seen, and the characteristics of many high-end
arrays.  Turn off discards by default and require the -o discard option
to enable them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: properly wait log writers during log sync · 86df7eb9

Committed by Yan, Zheng on Oct 14, 2009

A recent fsync optimization makes btrfs_sync_log skip calling
wait_for_writer in the single log writer case. This is incorrect
since the writer count can also be increased by btrfs_pin_log.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix possible ENOSPC problems with truncate · 5d5e103a

Committed by Josef Bacik on Oct 13, 2009

There's a problem where we don't do any space reservation for truncates, which
can cause an oops because you will be allowed to go off in the weeds a bit
since we don't account for the delalloc bytes that are created as a result of
the truncate.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix btrfs acl #ifdef checks · 0eda294d

Committed by Chris Mason on Oct 13, 2009

The btrfs acl code was #ifdefing for a define
that didn't exist.  This correctly matches it
to the values used by the Kconfig file.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: streamline tree-log btree block writeout · 690587d1

Committed by Chris Mason on Oct 13, 2009

Syncing the tree log is a 3 phase operation.

1) write and wait for all the tree log blocks for a given root.

2) write and wait for all the tree log blocks for the
tree of tree log roots.

3) write and wait for the super blocks (barriers here)

This isn't as efficient as it could be because there is
no requirement to wait for the blocks from step one to hit the disk
before we start writing the blocks from step two.  This commit
changes the sequence so that we don't start waiting until
all the tree blocks from both steps one and two have been sent
to disk.

We do this by breaking up btrfs_write_wait_marked_extents into
two functions, which is trivial because it was already broken
up into two parts.
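
The reordering described above can be sketched as a toy model that only records the order of operations. The helpers `write_marked`, `wait_marked`, `sync_serial` and `sync_overlapped` are hypothetical and do no I/O; they merely show the "submit both, then wait for both" shape that splitting a write-and-wait helper makes possible.

```c
#include <stddef.h>

/* Hypothetical event log recording the order of operations. */
#define MAX_EVENTS 8
static char events[MAX_EVENTS + 1];
static size_t n_events;

static void reset_events(void)
{
    n_events = 0;
    events[0] = '\0';
}

static void record(char c)
{
    if (n_events < MAX_EVENTS) {
        events[n_events++] = c;
        events[n_events] = '\0';
    }
}

/* Stand-ins: 'W'/'w' = start writes for tree 1/2, 'A'/'a' = wait for them. */
static void write_marked(int tree) { record(tree == 1 ? 'W' : 'w'); }
static void wait_marked(int tree)  { record(tree == 1 ? 'A' : 'a'); }

/* Old shape: each step writes and waits before the next step begins. */
static void sync_serial(void)
{
    write_marked(1);
    wait_marked(1);
    write_marked(2);
    wait_marked(2);
}

/* New shape: submit both sets of writes, then wait for both. */
static void sync_overlapped(void)
{
    write_marked(1);
    write_marked(2);
    wait_marked(1);
    wait_marked(2);
}
```

In the overlapped variant the second set of writes is already in flight while the first is being waited on, which is where the saving would come from on real hardware.
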
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: avoid tree log commit when there are no changes · 257c62e1

Committed by Chris Mason on Oct 13, 2009

rpm has a habit of running fdatasync when the file hasn't
changed.  We already detect if a file hasn't been changed
in the current transaction but it might have been sent to
the tree-log in this transaction and not changed since
the last call to fsync.

In this case, we want to avoid a tree log sync, which includes
a number of synchronous writes and barriers.  This commit
extends the existing tracking of the last transaction to change
a file to also track the last sub-transaction.

The end result is that rpm -ivh and -Uvh are roughly twice as fast,
and on par with ext3.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: only write one super copy during fsync · 4722607d

Committed by Chris Mason on Oct 13, 2009

During a tree-log commit for fsync, we've been writing at least
two copies of the super block and forcing them to disk.

The other filesystems write only one, and this change brings us on
par with them.  A full transaction commit will write all the super
copies, so we still have redundant info written on a regular
basis.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

13 Oct, 2009 2 commits

ext3: Don't update superblock write time when filesystem is read-only · 96ec2e0a

Committed by Theodore Ts'o on Sep 16, 2009

This avoids updating the superblock write time when we are mounting
the root file system read-only but need to replay the journal. At
that point, for people who are east of GMT and who make their clock
tick in localtime for Windows bug-for-bug compatibility, the update
would cause e2fsck to complain and force a full file system check.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>

NFS: suppress a build warning · a1be9eee

Committed by Stefan Richter on Oct 12, 2009

struct sockaddr_storage * can safely be used as struct sockaddr *.
Suppress an "incompatible pointer type" warning.
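
As a userspace illustration of the cast in question, assuming a POSIX sockets environment: a `struct sockaddr_storage` is sized and aligned to hold any socket address, so passing its address where a `struct sockaddr *` is expected only needs an explicit cast to keep the compiler quiet. The helpers `family_of` and `storage_family` are hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Takes the generic address type, as many socket APIs do. */
static int family_of(const struct sockaddr *sa)
{
    return sa->sa_family;
}

/* Fills a sockaddr_storage as IPv4 and reads it back through sockaddr. */
static int storage_family(void)
{
    struct sockaddr_storage ss;

    memset(&ss, 0, sizeof(ss));
    ((struct sockaddr_in *)&ss)->sin_family = AF_INET;

    /* Without the cast this call draws an "incompatible pointer type" warning. */
    return family_of((const struct sockaddr *)&ss);
}
```
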
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12 Oct, 2009 2 commits

ROMFS: fix length used with romfs_dev_strnlen() function · ef1f7a7e

Committed by Bernd Schmidt on Oct 06, 2009

An interestingly corrupted romfs file system exposed a problem with the
romfs_dev_strnlen function: it's passing the wrong value to its helpers.
Rather than limit the string to the length passed in by the callers, it
uses the size of the device as the limit.
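
A userspace sketch of the intended bounding, under the assumption that the device is modelled as a flat byte array: the scan is limited by the caller's maximum length, clamped to what is left on the device, and never by the device size alone. `dev_strnlen_model` is a hypothetical helper, not the romfs function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Length of the NUL-terminated string at dev[pos], scanning at most maxlen
 * bytes and never past the end of the device. Returns -1 if pos is out of
 * range. Hypothetical userspace model, not the romfs code.
 */
static long dev_strnlen_model(const char *dev, size_t dev_size,
                              size_t pos, size_t maxlen)
{
    if (pos >= dev_size)
        return -1;
    if (maxlen > dev_size - pos)
        maxlen = dev_size - pos;   /* clamp to what is left on the device */

    /* Scan with the caller's (clamped) limit, not with the device size. */
    const char *end = memchr(dev + pos, '\0', maxlen);
    return end ? (long)(end - (dev + pos)) : (long)maxlen;
}
```

On an image with no terminating NUL, a caller asking for at most 4 bytes gets 4 back instead of a length that runs to the end of the device.
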
Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

headers: remove sched.h from interrupt.h · d43c36dc

Committed by Alexey Dobriyan on Oct 07, 2009

Now that m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>

09 Oct, 2009 17 commits

Btrfs: fix file clone ioctl for bookend extents · ac6889cb

Committed by Chris Mason on Oct 09, 2009

The file clone ioctl was incorrectly taking the offset into the
extent on disk into account when calculating the length of the
cloned extent.

The length never changes based on the offset into the physical extent.

Test case:

fallocate -l 1g image
mke2fs image
bcp image image2
e2fsck -f image2

(errors on image2)

The math bug ends up wrapping the length of the extent, and things
go wrong from there.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix uninit compiler warning in cow_file_range_nocow · e9061e21

Committed by Chris Mason on Oct 09, 2009

The extent_type variable was exposed uninit via a goto.  It should be
impossible to trigger because it is protected by a check on another
variable, but this makes sure.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: constify dentry_operations · 82d339d9

Committed by Alexey Dobriyan on Oct 09, 2009

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: optimize back reference update during btrfs_drop_snapshot · 94fcca9f

Committed by Yan, Zheng on Oct 09, 2009

This patch avoids reading level 0 tree blocks that already use full backrefs.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: remove negative dentry when deleting subvolumne · efefb143

Committed by Yan, Zheng on Oct 09, 2009

btrfs_dentry_delete is used to remove dentries from the
dcache when deleting a subvolume. btrfs_dentry_delete ignores
negative dentries. This is incorrect since if we don't remove
the negative dentry, its parent dentry can't be removed.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: optimize fsync for the single writer case · ff782e0a

Committed by Josef Bacik on Oct 08, 2009

This patch optimizes the tree logging stuff so it doesn't always wait 1 jiffie
for new people to join the logging transaction if there is only ever 1 writer.
This helps a little bit with latency where we have something like RPM where it
will fdatasync every file it writes, and so waiting the 1 jiffie for every
fdatasync really starts to add up.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: async delalloc flushing under space pressure · e3ccfa98

Committed by Josef Bacik on Oct 07, 2009

This patch moves the delalloc flushing that occurs when we are under space
pressure off to an async thread pool. This helps since we only free up
metadata space when we actually insert the extent item, which means it takes
quite a while for space to be freed up if we wait on all ordered extents.
However, if space is freed up due to inline extents being inserted, we can
wake people who are waiting up early, and they can finish their work.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: release delalloc reservations on extent item insertion · 32c00aff

Committed by Josef Bacik on Oct 08, 2009

This patch fixes an issue with the delalloc metadata space reservation
code. The problem is we used to free the reservation as soon as we
allocated the delalloc region. The problem with this is if we are not
inserting an inline extent, we don't actually insert the extent item until
after the ordered extent is written out. This patch does 3 things,

1) It moves the reservation clearing stuff into the ordered code, so when
we remove the ordered extent we remove the reservation.
2) It adds an EXTENT_DO_ACCOUNTING flag that gets passed when we clear
delalloc bits in the cases where we want to clear the metadata reservation
when we clear the delalloc extent, in the case that we do an inline extent
or we invalidate the page.
3) It adds another waitqueue to the space info so that when we start a fs
wide delalloc flush, anybody else who also hits that area will simply wait
for the flush to finish and then try to make their allocation.

This has been tested thoroughly to make sure we did not regress on
performance.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: delay clearing EXTENT_DELALLOC for compressed extents · a3429ab7

Committed by Chris Mason on Oct 08, 2009

When compression is on, the cow_file_range code is farmed off to
worker threads.  This allows us to do significant CPU work in parallel
on SMP machines.

But it is a delicate balance around when we clear flags and how.  In
the past we cleared the delalloc flag immediately, which was safe
because the pages stayed locked.

But this is causing problems with the newest ENOSPC code, and with the
recent extent state cleanups we can now clear the delalloc bit at the
same time the uncompressed code does.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: cleanup extent_clear_unlock_delalloc flags · a791e35e

Committed by Chris Mason on Oct 08, 2009

extent_clear_unlock_delalloc has a growing set of ugly parameters
that is very difficult to read and maintain.

This switches to a flag field and well named flag defines.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

xfs: stop calling filemap_fdatawait inside ->fsync · d0800703

Committed by Christoph Hellwig on Sep 26, 2009

Now that the VFS actually waits for the data I/O to complete before
calling into ->fsync we can stop doing it ourselves.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

fix readahead calculations in xfs_dir2_leaf_getdents() · 8e69ce14

Committed by Eric Sandeen on Sep 25, 2009

This is for bug #850,
http://oss.sgi.com/bugzilla/show_bug.cgi?id=850
XFS file system segfaults , repeatedly and 100% reproducable in 2.6.30 , 2.6.31

The above only showed up on a CONFIG_XFS_DEBUG=y kernel, because
xfs_bmapi() ASSERTs that it has been asked for at least one map,
and it was getting 0.

The root cause is that our guesstimated "bufsize" from xfs_file_readdir
was fairly small, and the

		bufsize -= length;

in the loop was going negative - except bufsize is a size_t, so it
was wrapping to a very large number.

Then when we did
		ra_want = howmany(bufsize + mp->m_dirblksize,
				  mp->m_sb.sb_blocksize) - 1;

with that very large number, the (int) ra_want was coming out
negative, and a subsequent compare:

		if (1 + ra_want > map_blocks ...

was coming out -true- (negative int compare w/ uint) and we went
back to xfs_bmapi() for more, even though we did not need more,
and asked for 0 maps, and hit the ASSERT.

We have kind of a type mess here, but just keeping bufsize from
going negative is probably sufficient to avoid the problem.
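
The two arithmetic effects in this chain can be reproduced in a few lines of standalone C. This is a sketch under stated assumptions: `sub_wrapping`, `sub_clamped` and `negative_int_exceeds` are hypothetical helpers, and the clamp shown is just one way of keeping the size from going negative, not necessarily the exact change in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned subtraction wraps around instead of going negative. */
static size_t sub_wrapping(size_t bufsize, size_t length)
{
    return bufsize - length;
}

/* One way to keep the remaining size from dropping below zero. */
static size_t sub_clamped(size_t bufsize, size_t length)
{
    return length < bufsize ? bufsize - length : 0;
}

/*
 * Comparing a negative int with an unsigned value: the int is converted to
 * unsigned first. The cast spells out what C's usual arithmetic conversions
 * do implicitly in an expression like "1 + ra_want > map_blocks".
 */
static int negative_int_exceeds(int ra_want, unsigned int map_blocks)
{
    return (unsigned int)(1 + ra_want) > map_blocks;
}
```

Subtracting 8 from a `size_t` of 4 yields a value just below `SIZE_MAX`, and a negative readahead count compares as larger than any small block count.
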
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: make sure xfs_sync_fsdata covers the log · dce5065a

Committed by Dave Chinner on Oct 06, 2009

We want to always cover the log after writing out the superblock, and
in case of a synchronous writeout make sure we actually wait for the
log to be covered.  That way a filesystem that has been sync()ed can
be considered clean by log recovery.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: mark inodes dirty before issuing I/O · 932640e8

Committed by Dave Chinner on Oct 06, 2009

Mark inodes dirty before issuing I/O to make sure they get properly waited
on in sync when I/O is in flight and we later need to update the inode size.
This requires a new helper to check if an ioend structure is beyond the
current EOF.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: cleanup ->sync_fs · 69961a26

Committed by Christoph Hellwig on Oct 06, 2009

Sort out ->sync_fs to not perform a superblock writeback for the wait = 0 case
as that is just an optional first pass and the superblock will be written back
properly in the next call with wait = 1.  Instead perform an opportunistic
quota writeback to have less work later.  Also remove the freeze special case
as we do a proper wait = 1 call in the freeze code anyway.

Also rename the function to xfs_fs_sync_fs to match the normal naming
convention, update comments and avoid calling into the laptop_mode logic on
an error.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: fix xfs_quiesce_data · c90b07e8

Committed by Dave Chinner on Oct 06, 2009

We need to do a synchronous xfs_sync_fsdata to make sure the superblock
actually is on disk when we return.

Also remove SYNC_BDFLUSH flag to xfs_sync_inodes because that particular
flag is never checked.

Move xfs_filestream_flush call later to only release inodes after they
have been written out.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: implement ->dirty_inode to fix timestamp handling · f9581b14

Committed by Christoph Hellwig on Oct 06, 2009

This is picking up on Felix's repost of Dave's patch to implement a
.dirty_inode method.  We really need this notification because
the VFS keeps writing directly into the inode structure instead
of going through methods to update this state.  In addition to
the long-known atime issue we now also have a caller in VM code
that updates c/mtime that way for shared writeable mmaps.  And
I found another one that no one has noticed in practice in the FIFO
code.

So implement ->dirty_inode to set i_update_core whenever the
inode gets externally dirtied, and switch the c/mtime handling to
the same scheme we already use for atime (always picking up
the value from the Linux inode).

Note that this patch also removes the xfs_synchronize_atime call
in xfs_reclaim; it was superfluous as we already synchronize the time
when writing the inode via the log (xfs_inode_item_format) or the
normal buffers (xfs_iflush_int).

In addition also remove the I_CLEAR check before copying the Linux
timestamps - now that we always have the Linux inode available
we can always use the timestamps in it.

Also switch to just using file_update_time for regular reads/writes -
that will get us all optimization done to it for free and make
sure we notice early when it breaks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

OpenHarmony / kernel_linux · last synced about 4 years ago
