Commits · d3fb612076eebec6f67257db0c7a9666ac7e5892 · openanolis / cloud-kernel

01 Aug, 2011 1 commit
- switch posix_acl_create() to umode_t * · d3fb6120
  Authored by Al Viro on Jul 23, 2011
```
so we can pass &inode->i_mode to it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
  d3fb6120
27 Jul, 2011 1 commit

Authored by Arun Sharma on Jul 26, 2011

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

60063497

26 Jul, 2011 4 commits

fs: take the ACL checks to common code · 4e34e719

Authored by Christoph Hellwig on Jul 23, 2011

Replace the ->check_acl method with a ->get_acl method that simply reads an
ACL from disk after having a cache miss. This means we can replace the ACL
checking boilerplate code with a single implementation in namei.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4e34e719
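
The split described in 4e34e719 can be pictured with a small user-space model. This is only a sketch under stated assumptions: `acl_model`, `inode_model`, `demo_get_acl()` and `check_acl_generic()` are hypothetical names, not kernel code, and both the `-EAGAIN` return for "no ACL" and the place where the result is cached are choices made for this model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct acl_model {
    unsigned int perm;              /* permission bits granted by the ACL */
};

struct inode_model {
    struct acl_model *cached_acl;   /* NULL models a cache miss */
    struct acl_model *(*get_acl)(struct inode_model *inode);
    int disk_reads;                 /* counts calls into the filesystem */
};

/* Example filesystem hook: it only "reads" the ACL, it checks nothing. */
static struct acl_model *demo_get_acl(struct inode_model *inode)
{
    static struct acl_model on_disk = { 04 };   /* read permission only */

    inode->disk_reads++;
    return &on_disk;
}

/* Single generic check shared by every filesystem in this model. */
static int check_acl_generic(struct inode_model *inode, unsigned int mask)
{
    struct acl_model *acl = inode->cached_acl;

    if (!acl && inode->get_acl) {
        acl = inode->get_acl(inode);    /* cache miss: ask the filesystem */
        inode->cached_acl = acl;        /* caching here is a simplification */
    }
    if (!acl)
        return -EAGAIN;                 /* model choice: no ACL available */
    return (acl->perm & mask) == mask ? 0 : -EACCES;
}
```

In this model the filesystem supplies only the read hook, while the permission comparison lives in one shared function, so a second check on the same inode does not call the hook again.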

kill boilerplates around posix_acl_create_masq() · 826cae2f

Authored by Al Viro on Jul 23, 2011

new helper: posix_acl_create(&acl, gfp, mode_p).  Replaces acl with
modified clone, on failure releases acl and replaces with NULL.
Returns 0 or -ve on error.  All callers of posix_acl_create_masq()
switched.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

826cae2f
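
The calling convention described in 826cae2f (pass the ACL pointer by reference, get back either a modified clone or NULL, plus 0 or a negative error) might look roughly like the following user-space sketch. All names here (`acl_model`, `acl_model_masq()`, `acl_model_create()`) are hypothetical, `malloc()`/`free()` stand in for the kernel's allocation and reference counting, and the `gfp` argument is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct posix_acl; not the kernel definition. */
struct acl_model {
    unsigned int perm;   /* permission bits carried by the ACL */
};

/* Hypothetical masking step: restricts *mode_p to what the ACL allows. */
static int acl_model_masq(struct acl_model *acl, unsigned short *mode_p)
{
    *mode_p = (unsigned short)(*mode_p & (acl->perm | ~0777u));
    return 0;
}

/*
 * Replace *acl with a modified clone. On failure *acl becomes NULL.
 * Returns 0 or a negative errno value.
 */
static int acl_model_create(struct acl_model **acl, unsigned short *mode_p)
{
    struct acl_model *clone = malloc(sizeof(*clone));
    int err = -ENOMEM;

    if (clone) {
        memcpy(clone, *acl, sizeof(*clone));
        err = acl_model_masq(clone, mode_p);
        if (err < 0) {
            free(clone);
            clone = NULL;
        }
    }
    free(*acl);      /* in this sketch the original is always dropped */
    *acl = clone;
    return err;
}
```

A caller then needs no cleanup of its own on the error path: after the call, `*acl` is either the clone to use or NULL.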

kill boilerplate around posix_acl_chmod_masq() · bc26ab5f

Authored by Al Viro on Jul 23, 2011

new helper: posix_acl_chmod(&acl, gfp, mode).  Replaces acl with modified
clone or with NULL if that has failed; returns 0 or -ve on error.  All
callers of posix_acl_chmod_masq() switched to that - they'd been doing
exactly the same thing.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bc26ab5f

vfs: move ACL cache lookup into generic code · e77819e5

Authored by Linus Torvalds on Jul 22, 2011

This moves logic for checking the cached ACL values from low-level
filesystems into generic code.  The end result is a streamlined ACL
check that doesn't need to load the inode->i_op->check_acl pointer at
all for the common cached case.

The filesystems also don't need to check for a non-blocking RCU walk
case in their acl_check() functions, because that is all handled at a
VFS layer.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e77819e5

21 Jul, 2011 3 commits

simplify gfs2_lookup() · 6c673ab3

Authored by Al Viro on Jul 17, 2011

d_splice_alias() will DTRT when given NULL or ERR_PTR
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6c673ab3

fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers · 02c24a82

Authored by Josef Bacik on Jul 16, 2011

Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2. For correctness sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

02c24a82

fs: move inode_dio_wait calls into ->setattr · 562c72aa

Authored by Christoph Hellwig on Jun 24, 2011

Let filesystems handle waiting for direct I/O requests themselves instead
of doing it beforehand. This means filesystem-specific locks to prevent
new dio references from appearing can be held. This is important to allow
generalizing i_dio_count to non-DIO_LOCKING filesystems.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

562c72aa

20 Jul, 2011 5 commits
- ->permission() sanitizing: don't pass flags to ->permission() · 10556cb2
  Authored by Al Viro on Jun 20, 2011
```
not used by the instances anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
  10556cb2
- ->permission() sanitizing: don't pass flags to generic_permission() · 2830ba7f
  Authored by Al Viro on Jun 20, 2011
```
redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of
them removes that bit.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
  2830ba7f
- ->permission() sanitizing: don't pass flags to ->check_acl() · 7e40145e
  Authored by Al Viro on Jun 20, 2011
```
not used in the instances anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
  7e40145e
- ->permission() sanitizing: pass MAY_NOT_BLOCK to ->check_acl() · 9c2c7039
  Authored by Al Viro on Jun 20, 2011
```
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
  9c2c7039
- kill check_acl callback of generic_permission() · 178ea735
  Authored by Al Viro on Jun 20, 2011
```
its value depends only on inode and does not change; we might as
well store it in ->i_op->check_acl and be done with that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
```
  178ea735
15 Jul, 2011 4 commits

GFS2: combine duplicated block freeing routines · 46fcb2ed

Authored by Eric Sandeen on Jun 23, 2011

__gfs2_free_data and __gfs2_free_meta are almost identical, and
can be trivially combined.

[This is as per Eric's original patch minus gfs2_free_data() which had
 no callers left and plus the conversion of the bmap.c calls to these
 functions. All in all, a nice clean up]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

46fcb2ed

GFS2: Add S_NOSEC support · 9964afbb

Authored by Steven Whitehouse on Jun 16, 2011

This adds S_NOSEC support to GFS2. We set/reset the flag either when
a user calls setattr or when we have just regained the glock
from another node. The flag is only set if there are no xattrs
on the inode and there is no suid bit set.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>

9964afbb

GFS2: Automatically adjust glock min hold time · 7cf8dcd3

Authored by Bob Peterson on Jun 15, 2011

This patch is a performance improvement for GFS2 in a clustered
environment. It makes the glock hold time self-adjusting.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

7cf8dcd3

GFS2: Cache dir hash table in a contiguous buffer · 17d539f0

Authored by Steven Whitehouse on Jun 15, 2011

This patch adds a cache for the hash table to the directory code
in order to help simplify the way in which the hash table is
accessed. This is intended to be a first step towards introducing
some performance improvements in the directory code.

There are two follow ups that I'm hoping to see fairly shortly. One
is to simplify the hash table reading code now that we always read the
complete hash table, whether we want one entry or all of them. The
other is to introduce readahead on the heads of the hash chains
which are referred to from the table.

The hash table is a maximum of 128k in size, so it is not worth trying
to read it in small chunks.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

17d539f0

14 Jul, 2011 1 commit

GFS2: Resolve inode eviction and ail list interaction bug · 380f7c65

Authored by Steven Whitehouse on Jul 14, 2011

This patch contains a few misc fixes which resolve a recently
reported issue. This patch has been a real team effort and has
received a lot of testing.

The first issue is that the ail lock needs to be held over a few
more operations. The lock that's added into gfs2_releasepage() may
possibly be a candidate for replacing with RCU at some future
point, but at this stage we've gone for the obvious fix.

The second issue is that gfs2_write_inode() can end up calling
a glock recursively when called from gfs2_evict_inode() via the
syncing code, so it needs a guard added.

The third issue is that we either need to not truncate the metadata
pages of inodes which have zero link count, but which we cannot
deallocate due to them still being in use by other nodes, or we need
to ensure that those pages have all made it through the journal and
ail lists first. This patch takes the former approach, but the
latter has also been tested and there is nothing to choose between
them performance-wise. So again, we could revise that decision
in the future.

Also, the inode eviction process is now better documented.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Tested-by: Abhijith Das <adas@redhat.com>
Reported-by: Barry J. Marson <bmarson@redhat.com>
Reported-by: David Teigland <teigland@redhat.com>

380f7c65

12 Jul, 2011 2 commits

GFS2: Fix race during filesystem mount · 3942ae53

Authored by Steven Whitehouse on Jul 11, 2011

There is a potential race during filesystem mounting which has recently
been reported. It occurs when the userland gfs_controld is able to
process requests fast enough that it tries to use the sysfs interface
before the lock module is properly initialised. This is a pretty
unusual case as normally the lock module initialisation is very quick
compared with gfs_controld.

This patch adds an interruptible completion which is used to ensure that
userland will wait for the initialisation of the lock module to
complete.

There are other potential solutions to this problem, but this is the
quickest at this stage and has been tested both with and without
mount.gfs2 present in the system.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: David Booher <dbooher@adams.net>

3942ae53

GFS2: force a log flush when invalidating the rindex glock · 1ce53368

Authored by Benjamin Marzinski on Jun 13, 2011

Right now, there is nothing that forces the log to get flushed when a node
drops its rindex glock so that another node can grow the filesystem. If the
log doesn't get flushed, GFS2 can corrupt the sd_log_le_rg list in the
following way.

A node puts an rgd on the list in rg_lo_add(), and then the rindex glock is
dropped so the other node can grow the filesystem. When the node reacquires the
rindex glock, that rgd gets deleted in clear_rgrpdi() before ever being
removed from the list by gfs2_log_flush().

This code simply forces a log flush when the rindex glock is invalidated,
solving the problem.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

1ce53368

26 May, 2011 1 commit

gfs2: Drop __TIME__ usage · 8d2c50e3

Authored by Michal Marek on Apr 01, 2011

The kernel already prints its build timestamp during boot, no need to
repeat it in random drivers and produce different object files each
time.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: cluster-devel@redhat.com
Signed-off-by: Michal Marek <mmarek@suse.cz>

8d2c50e3

25 May, 2011 2 commits

vmscan: change shrinker API by passing shrink_control struct · 1495f230

Authored by Ying Han on May 24, 2011

Change each shrinker's API by consolidating the existing parameters into
shrink_control struct.  This will simplify any further features added w/o
touching each file of shrinker.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: fix warning]
[kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
[akpm@linux-foundation.org: fix xfs warning]
[akpm@linux-foundation.org: update gfs2]
Signed-off-by: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1495f230

GFS2: Processes waiting on inode glock that no processes are holding · f90e5b5b

Authored by Bob Peterson on May 24, 2011

This patch fixes a race in the GFS2 glock state machine that may
result in lockups.  The symptom is that all nodes but one will
hang, waiting for a particular glock.  All the holder records
will have the "W" (Waiting) bit set.  The other node will
typically have the glock stuck in Exclusive mode (EX) with no
holder records, but the dinode will be cached.  In other words,
an entry with "I:" will appear in the glock dump for that glock,
but nothing else.

The race has to do with the glock "Pending Demote" bit, which
can be set, then immediately reset, thus losing the fact that
another node needs the glock.  The sequence of events is:

1. Something schedules the glock workqueue (e.g. glock request from fs)
2. The glock workqueue gets to the point between the test of the reply pending
bit and the spin lock:

        if (test_and_clear_bit(GLF_REPLY_PENDING, &gl->gl_flags)) {
                finish_xmote(gl, gl->gl_reply);
                drop_ref = 1;
        }
        down_read(&gfs2_umount_flush_sem);         <---- i.e. here
        spin_lock(&gl->gl_spin);

3. In comes (a) the reply to our EX lock request setting GLF_REPLY_PENDING and
            (b) the demote request which sets GLF_PENDING_DEMOTE

4. The following test is executed:

        if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags) &&
            gl->gl_state != LM_ST_UNLOCKED &&
            gl->gl_demote_state != LM_ST_EXCLUSIVE) {

This resets the pending demote flag, and gl->gl_demote_state is not equal to
exclusive, however because the reply from the dlm arrived after we checked for
the GLF_REPLY_PENDING flag, gl->gl_state is still equal to unlocked, so
although we reset the GLF_PENDING_DEMOTE flag, we didn't then set the
GLF_DEMOTE flag or reinstate the GLF_PENDING_DEMOTE_FLAG.

The patch closes the timing window by only transitioning the
"Pending demote" bit to the "demote" flag once we know the
other conditions (not unlocked and not exclusive) are met.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

f90e5b5b
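
The ordering problem described in f90e5b5b can be reproduced in a single-threaded model. The sketch below is an illustration only: `glock_model` and both helper functions are hypothetical, the flag and state values are arbitrary, and the atomic bit operations and locking of the real code are left out. It contrasts clearing the pending bit before the other conditions are checked with clearing it only once they hold.

```c
#include <assert.h>
#include <stdbool.h>

/* Model values only; the kernel's numeric values may differ. */
enum { LM_ST_UNLOCKED, LM_ST_EXCLUSIVE, LM_ST_SHARED };
#define GLF_PENDING_DEMOTE 0x1u
#define GLF_DEMOTE         0x2u

/* Hypothetical reduced glock: just the fields named in the commit text. */
struct glock_model {
    unsigned int gl_flags;
    int gl_state;
    int gl_demote_state;
};

/* Old ordering: the pending bit is cleared before the other checks run. */
static bool promote_demote_racy(struct glock_model *gl)
{
    bool was_pending = (gl->gl_flags & GLF_PENDING_DEMOTE) != 0;

    gl->gl_flags &= ~GLF_PENDING_DEMOTE;
    if (was_pending &&
        gl->gl_state != LM_ST_UNLOCKED &&
        gl->gl_demote_state != LM_ST_EXCLUSIVE) {
        gl->gl_flags |= GLF_DEMOTE;
        return true;
    }
    return false;   /* the pending request may have been lost here */
}

/* Fixed ordering: check the other conditions first, then move the bit. */
static bool promote_demote_fixed(struct glock_model *gl)
{
    if ((gl->gl_flags & GLF_PENDING_DEMOTE) &&
        gl->gl_state != LM_ST_UNLOCKED &&
        gl->gl_demote_state != LM_ST_EXCLUSIVE) {
        gl->gl_flags &= ~GLF_PENDING_DEMOTE;
        gl->gl_flags |= GLF_DEMOTE;
        return true;
    }
    return false;   /* pending bit is left set for a later pass */
}
```

With `gl_state` still unlocked, the first variant drops the pending bit without setting the demote flag, while the second keeps it until the state has changed.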

22 May, 2011 1 commit

GFS2: Wait properly when flushing the ail list · 26b06a69

Authored by Steven Whitehouse on May 21, 2011

The ail flush code has always relied upon log flushing to prevent
it from spinning needlessly. This fixes it to wait on the last
I/O request submitted (we don't need to wait for all of it)
instead of either spinning with io_schedule or sleeping.

As a result cpu usage of gfs2_logd is much reduced with certain
workloads.
Reported-by: Abhijith Das <adas@redhat.com>
Tested-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

26b06a69

21 May, 2011 1 commit

GFS2: Wipe directory hash table metadata when deallocating a directory · 6d3117b4

Authored by Steven Whitehouse on May 21, 2011

The deallocation code for directories in GFS2 is largely divided into
two parts. The first part deallocates any directory leaf blocks and
marks the directory as being a regular file when that is complete. The
second stage was identical to deallocating regular files.

Regular files have their data blocks in a different
address space to directories, and thus what would have been normal data
blocks in a regular file (the hash table in a GFS2 directory) were
deallocated correctly. However, a reference to these blocks was left in the
journal (assuming of course that some previous activity had resulted in
those blocks being in the journal or ail list).

This patch uses the i_depth as a test of whether the inode is an
exhash directory (we cannot test the inode type as that has already
been changed to a regular file at this stage in deallocation)

The original issue was reported by Chris Hertel as an issue he encountered
running bonnie++
Reported-by: Christopher R. Hertel <crh@samba.org>
Cc: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

6d3117b4

13 May, 2011 3 commits

GFS2: Move all locking inside the inode creation function · f2741d98

Authored by Steven Whitehouse on May 13, 2011

Now that there are no longer any exceptions to the normal inode
creation code path, we can move the parts of the locking code
which were duplicated in mkdir/mknod/create/symlink into the
inode create function.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

f2741d98

GFS2: Clean up symlink creation · 160b4026

Authored by Steven Whitehouse on May 13, 2011

This moves the symlink specific parts of inode creation
into the function where we initialise the rest of the
dinode. As a result we have one less place where we need
to look up the inode's buffer.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

160b4026

GFS2: Clean up mkdir · e2d0a13b

Authored by Steven Whitehouse on May 13, 2011

This moves the initialisation of the directory into the inode
creation functions to avoid having to duplicate the lookup
of the inode's buffer.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

e2d0a13b

10 May, 2011 3 commits

GFS2: Use UUID field in generic superblock · 32e471ef

Authored by Steven Whitehouse on May 10, 2011

The VFS superblock structure now has a UUID field, so we can use that
in preference to the UUID field in the GFS2 superblock now.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

32e471ef

GFS2: Rename ops_inode.c to inode.c · 2ab9cd1c

Authored by Steven Whitehouse on May 10, 2011

This is the final part of the ops_inode.c/inode.c reordering. We
are left with a single file called inode.c which now contains
all the inode operations, as expected.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

2ab9cd1c

GFS2: Inode.c is empty now, remove it · 64ea5402
Authored by Steven Whitehouse on May 10, 2011
```
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
```
64ea5402

09 May, 2011 7 commits

GFS2: Move final part of inode.c into super.c · 9eed04cd
Authored by Steven Whitehouse on May 09, 2011
```
Now inode.c is empty.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
```
9eed04cd

GFS2: Move most of the remaining inode.c into ops_inode.c · 194c011f

Authored by Steven Whitehouse on May 09, 2011

This is in preparation to remove inode.c and rename ops_inode.c
to inode.c. Also most of the functions which were left in inode.c
relate to the creation and lookup of inodes. I'm intending to work
on consolidating some of that code, and it's easier when it's all in
one place.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

194c011f

GFS2: Move gfs2_refresh_inode() and friends into glops.c · d4b2cf1b

Authored by Steven Whitehouse on May 09, 2011

Eventually there will only be a single caller of this code, so let's
move it where it can be made static at some future date.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

d4b2cf1b

GFS2: Remove gfs2_dinode_print() function · 94fb763b

Authored by Steven Whitehouse on May 09, 2011

This function was intended for debugging purposes, but it is not very
useful. If we want to know what is on disk then all we need is a
block number and gfs2_edit can give us much better information about
what is there. Otherwise, if we are interested in what is stored in
the in-core inode, it doesn't help us out there either.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

94fb763b

GFS2: When adding a new dir entry, inc link count if it is a subdir · 3d6ecb7d

Authored by Steven Whitehouse on May 09, 2011

This adds an increment of the link count when we add a new directory
entry, if that entry is itself a directory. This means that we no
longer need separate code to perform this operation.

Now that both adding and removing directory entries automatically
update the parent directory's link count if required, that makes
the code shorter and simpler than before.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

3d6ecb7d

GFS2: Make gfs2_dir_del update link count when required · 855d23ce

Authored by Steven Whitehouse on May 09, 2011

When we remove an entry from a directory, we can save ourselves
some trouble if we know the type of the entry in question, since
if it is itself a directory, we can update the link count of the
parent at the same time as removing the directory entry.

In addition this patch also merges the rmdir and unlink code which
was almost identical anyway. This eliminates the calls to remove
the . and .. directory entries on each rmdir (not needed since the
directory will be deallocated, anyway) which was the only thing preventing
passing the dentry to gfs2_dir_del(). The passing of the dentry
rather than just the name allows us to figure out the type of the entry
which is being removed, and thus adjust the link count when required.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

855d23ce

GFS2: Don't use gfs2_change_nlink in link syscall · 2baee03f

Authored by Steven Whitehouse on May 09, 2011

There are three users of gfs2_change_nlink which add to the link
count. Two of these are about to be removed in later patches, so
this means that there will be no callers when that happens, allowing
removal of that function, also in a later patch.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

2baee03f

05 May, 2011 1 commit

GFS2: Don't use a try lock when promoting to a higher mode · 588da3b3

Authored by Steven Whitehouse on May 05, 2011

Previously we marked all locks being promoted to a higher mode
with the try flag to avoid any potential deadlock issues. The
DLM is able to detect these and report them in a way that GFS2 can
deal with them correctly. So we can just request the required mode
and wait for a response without needing to perform this check.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

588da3b3
