提交 · 44ad37d69b2cc421d5b5c7ad7fed16230685b092 · openanolis / cloud-kernel

18 4月, 2011 1 次提交

GFS2: filesystem hang caused by incorrect lock order · 44ad37d6

由 Bob Peterson 提交于 3月 17, 2011

This patch fixes a deadlock in GFS2 where two processes are trying
to reclaim an unlinked dinode:
One holds the inode glock and calls gfs2_lookup_by_inum trying to look
up the inode, which it can't, due to I_FREEING.  The other has set
I_FREEING from vfs and is at the beginning of gfs2_delete_inode
waiting for the glock, which is held by the first.  The solution is to
add a new non_block parameter to the gfs2_iget function that causes it
to return -ENOENT if the inode is being freed.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

44ad37d6

20 9月, 2010 2 次提交

GFS2: Make . and .. qstrs constant · 8d123585

由 Steven Whitehouse 提交于 9月 17, 2010

Rather than calculating the qstrs for . and .. each time
we need them, its better to keep a constant version of
these and just refer to them when required.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@infradead.org>

8d123585

GFS2: Remove i_disksize · a2e0f799

由 Steven Whitehouse 提交于 8月 11, 2010

With the update of the truncate code, ip->i_disksize and
inode->i_size are merely copies of each other. This means
we can remove ip->i_disksize and use inode->i_size exclusively
reducing the size of a GFS2 inode by 8 bytes.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

a2e0f799

29 7月, 2010 2 次提交

GFS2: remove dependency on __GFP_NOFAIL · 4244b52e

由 David Rientjes 提交于 7月 20, 2010

The k[mc]allocs in dr_split_leaf() and dir_double_exhash() are failable,
so remove __GFP_NOFAIL from their masks.

Cc: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4244b52e

GFS2: Use kmalloc when possible for ->readdir() · d2a97a4e

由 Steven Whitehouse 提交于 7月 28, 2010

If we don't need a huge amount of memory in ->readdir() then
we can use kmalloc rather than vmalloc to allocate it. This
should cut down on the greater overheads associated with
vmalloc for smaller directories.

We may be able to eliminate vmalloc entirely at some stage,
but this is easy to do right away.

Also using GFP_NOFS to avoid any issues wrt to deleting inodes
while under a glock, and suggestion from Linus to factor out
the alloc/dealloc.

I've given this a test with a variety of different sized
directories and it seems to work ok.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d2a97a4e

15 7月, 2010 1 次提交

GFS2: rename causes kernel Oops · 728a756b

由 Bob Peterson 提交于 7月 14, 2010

This patch fixes a kernel Oops in the GFS2 rename code.

The problem was in the way the gfs2 directory code was trying
to re-use sentinel directory entries.

In the failing case, gfs2's rename function was renaming a
file to another name that had the same non-trivial length.
The file being renamed happened to be the first directory
entry on the leaf block.

First, the rename code (gfs2_rename in ops_inode.c) found the
original directory entry and decided it could do its job by
simply replacing the directory entry with another.  Therefore
it determined correctly that no block allocations were needed.

Next, the rename code deleted the old directory entry prior to
replacing it with the new name.  Therefore, the soon-to-be
replaced directory entry was temporarily made into a directory
entry "sentinel" or a place holder at the start of a leaf block.

Lastly, it went to re-add the replacement directory entry in
that leaf block.  However, when gfs2_dirent_find_space was
looking for space in the leaf block, it used the wrong value
for the sentinel.  That threw off its calculations so later
it decides it can't really re-use the sentinel and therefore
must allocate a new leaf block.  But because it previously decided
to re-use the directory entry, it didn't waste the time to
grab a new block allocation for the inode.  Therefore, the
inode's i_alloc pointer was still NULL and it crashes trying to
reference it.

In the case of sentinel directory entries, the entire dirent is
reused, not just the "free space" portion of it, and therefore
the function gfs2_dirent_find_space should use the value 0
rather than GFS2_DIRENT_SIZE(0) for the actual dirent size.

Fixing this calculation enables the reproducer programs to work
properly.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

728a756b

14 4月, 2010 1 次提交

GFS2: glock livelock · 1a0eae88

由 Bob Peterson 提交于 4月 14, 2010

This patch fixes a couple gfs2 problems with the reclaiming of
unlinked dinodes.  First, there were a couple of livelocks where
everything would come to a halt waiting for a glock that was
seemingly held by a process that no longer existed.  In fact, the
process did exist, it just had the wrong pid number in the holder
information.  Second, there was a lock ordering problem between
inode locking and glock locking.  Third, glock/inode contention
could sometimes cause inodes to be improperly marked invalid by
iget_failed.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>

1a0eae88

03 12月, 2009 1 次提交

GFS2: Remove dirent_first() function · 1579343a

由 Steven Whitehouse 提交于 11月 06, 2009

This function only had one caller left, and that caller only
called it for leaf blocks, hence one branch of the "if" was
never taken. In addition the call to get_left had already
verified the metadata type, so the function can be reduced
to a single line of code in its caller.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

1579343a

20 5月, 2009 1 次提交

GFS2: Improve resource group error handling · 09010978

由 Steven Whitehouse 提交于 5月 20, 2009

This patch improves the error handling in the case where we
discover that the summary information in the resource group
doesn't match the bitmap information while in the process of
allocating blocks. Originally this resulted in a kernel bug,
but this patch changes that so that we return -EIO and print
some messages explaining what went wrong, and how to fix it.

We also remember locally not to try and allocate from the
same rgrp again, so that a subsequent allocation in a
different rgrp should succeed.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

09010978

24 3月, 2009 1 次提交

GFS2: Merge lock_dlm module into GFS2 · f057f6cd

由 Steven Whitehouse 提交于 1月 12, 2009

This is the big patch that I've been working on for some time
now. There are many reasons for wanting to make this change
such as:
 o Reducing overhead by eliminating duplicated fields between structures
 o Simplifcation of the code (reduces the code size by a fair bit)
 o The locking interface is now the DLM interface itself as proposed
   some time ago.
 o Fewer lookups of glocks when processing replies from the DLM
 o Fewer memory allocations/deallocations for each glock
 o Scope to do further optimisations in the future (but this patch is
   more than big enough for now!)

Please note that (a) this patch relates to the lock_dlm module and
not the DLM itself, that is still a separate module; and (b) that
we retain the ability to build GFS2 as a standalone single node
filesystem with out requiring the DLM.

This patch needs a lot of testing, hence my keeping it I restarted
my -git tree after the last merge window. That way, this has the maximum
exposure before its merged. This is (modulo a few minor bug fixes) the
same patch that I've been posting on and off the the last three months
and its passed a number of different tests so far.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

f057f6cd

05 1月, 2009 3 次提交

GFS2: Banish struct gfs2_dinode_host · 383f01fb

由 Steven Whitehouse 提交于 11月 04, 2008

The final field in gfs2_dinode_host was the i_flags field. Thats
renamed to i_diskflags in order to avoid confusion with the existing
inode flags, and moved into the inode proper at a suitable location
to avoid creating a "hole".

At that point struct gfs2_dinode_host is no longer needed and as
promised (quite some time ago!) it can now be removed completely.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

383f01fb

GFS2: Move i_size from gfs2_dinode_host and rename it to i_disksize · c9e98886

由 Steven Whitehouse 提交于 11月 04, 2008

This patch moved the i_size field from the gfs2_dinode_host and
following the ext3 convention renames it i_disksize.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c9e98886

GFS2: Move "entries" into "proper" inode · ad6203f2

由 Steven Whitehouse 提交于 11月 03, 2008

This moves the directory entry count into the proper inode.
Potentially we could get this to share the space used by
something else in the future, but this is one more step
on the way to removing the gfs2_dinode_host structure.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

ad6203f2

10 4月, 2008 1 次提交

[GFS2] fix GFP_KERNEL misuses · 16c5f06f

由 Josef Bacik 提交于 4月 09, 2008

There are several places where GFP_KERNEL allocations happen under a glock,
which will result in hangs if we're under memory pressure and go to re-enter the
fs in order to flush stuff out. This patch changes the culprits to GFS_NOFS to
keep this problem from happening. Thank you,
Signed-off-by: NJosef Bacik <jbacik@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

16c5f06f

31 3月, 2008 9 次提交

[GFS2] possible null pointer dereference fixup · 182fe5ab

由 Cyrill Gorcunov 提交于 3月 03, 2008

gfs2_alloc_get may fail so we have to check it to prevent
NULL pointer dereference.
Signed-off-by: NCyrill Gorcunov <gorcunov@gamil.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

182fe5ab

[GFS2] Allow bmap to allocate extents · 9b8c81d1

由 Steven Whitehouse 提交于 2月 22, 2008

We've supported mapping of extents when no block allocation is required
for some time. This patch extends that to mapping of extents when an
allocation has been requested. In that case we try to allocate as many
blocks as are requested, but we might return fewer in case there is
something preventing us from returning the complete amount (e.g. an
already allocated block is in the way).

Currently the only code path which can actually request multiple data
blocks in a single bmap call is the page_mkwrite path and even then it
only happens if there are multiple blocks per page. What this patch does
do however, is merge the allocation requests for metadata (growing the
metadata tree in either height or depth) with the allocation of the data
blocks in the case that both are needed. This results in lower overheads
even in the single block allocation case.

The one thing which we can't handle here at the moment is unstuffing. I
would like to be able to do that, but the problem which arises is that
in order to unstuff one has to get a locked page from the page cache
which results in locking problems in the (usual) case that the caller is
holding the page lock on the page it wishes to map. So that case will
have to be addressed in future patches.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

9b8c81d1

[GFS2] be*_add_cpu conversion · bb16b342

由 Marcin Slusarz 提交于 2月 13, 2008

replace all:
big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) +
					expression_in_cpu_byteorder);
with:
	beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder);
generated with semantic patch
Signed-off-by: NMarcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

bb16b342

[GFS2] Eliminate (almost) duplicate field from gfs2_inode · 77658aad

由 Steven Whitehouse 提交于 2月 12, 2008

The blocks counter is almost a duplicate of the i_blocks
field in the VFS inode. The only difference is that i_blocks
can be only 32bits long for 32bit arch without large single file
support. Since GFS2 doesn't handle the non-large single file
case (for 32 bit anyway) this adds a new config dependency on
64BIT || LSF. This has always been the case, however we've never
explicitly said so before.

Even if we do add support for the non-LSF case, we will still
not require this field to be duplicated since we will not be
able to access oversized files anyway.

So the net result of all this is that we shave 8 bytes from a gfs2_inode
and get our config deps correct.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

77658aad

[GFS2] Add extent allocation to block allocator · b45e41d7

由 Steven Whitehouse 提交于 2月 06, 2008

Rather than having to allocate a single block at a time, this patch
allows the block allocator to allocate an extent. Since there is
no difference (so far as the block allocator is concerned) between
data blocks and indirect blocks, it is posible to allocate a single
extent and for the caller to unrevoke just the blocks required
for indirect blocks.

Currently the only bit of GFS2 to make use of this feature is the
build height function. The intention is that gfs2_block_map will
be changed to make use of this feature in future patches.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

b45e41d7

[GFS2] Merge gfs2_alloc_meta and gfs2_alloc_data · 1639431a

由 Steven Whitehouse 提交于 2月 01, 2008

Thanks to the preceeding patches, the only difference between
these two functions is their name. We can thus merge them
and call the new function gfs2_alloc_block to reflect the
fact that it can allocate either kind of block.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

1639431a

[GFS2] Update gfs2_trans_add_unrevoke to accept extents · 5731be53

由 Steven Whitehouse 提交于 2月 01, 2008

By adding an extra argument to gfs2_trans_add_unrevoke we can now
specify an extent length of blocks to unrevoke. This means that
we only need to make one pass through the list for each extent
rather than each block. Currently the only extent length which
is used is 1, but that will change in the future.

Also gfs2_trans_add_unrevoke is removed from gfs2_alloc_meta
since its the only difference between this and gfs2_alloc_data
which is left. This will allow a future patch to merge these
two functions into one (i.e. one call to allocate both data
and metadata in a single extent in the future).
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5731be53

[GFS2] Shrink & rename di_depth · 9a004508

由 Steven Whitehouse 提交于 2月 01, 2008

This patch forms a pair with the previous patch which shrunk
di_height. Like that patch di_depth is renamed i_depth and moved
into struct gfs2_inode directly. Also the field goes from 16 bits
to 8 bits since it is also limited to a max value which is rather
small (17 in this case). In addition we also now validate the field
against this maximum value when its read in.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

9a004508

[GFS2] Get rid of unneeded parameter in gfs2_rlist_alloc · fe6c991c

由 Bob Peterson 提交于 1月 28, 2008

This patch removed the unnecessary parameter from function
gfs2_rlist_alloc.  The parameter was always passed in as 0.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

fe6c991c

08 2月, 2008 1 次提交

Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) · e231c2ee

由 David Howells 提交于 2月 07, 2008

Convert instances of ERR_PTR(PTR_ERR(p)) to ERR_CAST(p) using:

perl -spi -e 's/ERR_PTR[(]PTR_ERR[(](.*)[)][)]/ERR_CAST(\1)/' `grep -rl 'ERR_PTR[(]*PTR_ERR' fs crypto net security`
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e231c2ee

25 1月, 2008 1 次提交

[GFS2] Reduce inode size by moving i_alloc out of line · 6dbd8224

由 Steven Whitehouse 提交于 1月 10, 2008

It is possible to reduce the size of GFS2 inodes by taking the i_alloc
structure out of the gfs2_inode. This patch allocates the i_alloc
structure whenever its needed, and frees it afterward. This decreases
the amount of low memory we use at the expense of requiring a memory
allocation for each page or partial page that we write. A quick test
with postmark shows that the overhead is not measurable and I also note
that OCFS2 use the same approach.

In the future I'd like to solve the problem by shrinking down the size
of the members of the i_alloc structure, but for now, this reduces the
immediate problem of using too much low-memory on x86 and doesn't add
too much overhead.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

6dbd8224

10 10月, 2007 2 次提交

[GFS2] Alternate gfs2_iget to avoid looking up inodes being freed · 7a9f53b3

由 Benjamin Marzinski 提交于 9月 18, 2007

There is a possible deadlock between two processes on the same node, where one
process is deleting an inode, and another process is looking for allocated but
unused inodes to delete in order to create more space.

process A does an iput() on inode X, and it's i_count drops to 0. This causes
iput_final() to be called, which puts an inode into state I_FREEING at
generic_delete_inode(). There no point between when iput_final() is called, and
when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is
set, no other process on that node can successfully look up that inode until
the delete finishes.

process B locks the the resource group for the same inode in get_local_rgrp(),
which is called by gfs2_inplace_reserve_i()

process A tries to lock the resource group for the inode in
gfs2_dinode_dealloc(), but it's already locked by process B

process B waits in find_inode for the inode to have the I_FREEING state cleared.

Deadlock.

This patch solves the problem by adding an alternative to gfs2_iget(),
gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING
state.o The alternate test function is just like the original one, except that
it fails if the inode is being freed, and sets a skipped flag. The alternate
set function is just like the original, except that it fails if the skipped
flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of
gfs2_iget().
Signed-off-by: NBenjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

7a9f53b3

[GFS2] Add a missing gfs2_trans_add_bh() · 382e6e25

由 Steven Whitehouse 提交于 8月 16, 2007

This was missing from the dir_split_leaf() function although in
most cases its not a problem due to other functions having
already previously called gfs2_trans_add_bh. This makes certain
that it is correct.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Wendy Cheng <wcheng@redhat.com>

382e6e25

09 7月, 2007 4 次提交

[GFS2] Obtaining no_formal_ino from directory entry · bb9bcf06

由 Wendy Cheng 提交于 6月 27, 2007

GFS2 lookup code doesn't ask for inode shared glock. This implies during
in-memory inode creation for existing file, GFS2 will not disk-read in
the inode contents. This leaves no_formal_ino un-initialized during
lookup time. The un-initialized no_formal_ino is subsequently encoded
into file handle. Clients will get ESTALE error whenever it tries to
access these files.
Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

bb9bcf06

[GFS2] Add nanosecond timestamp feature · 4bd91ba1

由 Steven Whitehouse 提交于 6月 05, 2007

This adds a nanosecond timestamp feature to the GFS2 filesystem. Due
to the way that the on-disk format works, older filesystems will just
appear to have this field set to zero. When mounted by an older version
of GFS2, the filesystem will simply ignore the extra fields so that
it will again appear to have whole second resolution, so that its
trivially backward compatible.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4bd91ba1

[GFS2] Fix sign problem in quota/statfs and cleanup _host structures · bb8d8a6f

由 Steven Whitehouse 提交于 6月 01, 2007

This patch fixes some sign issues which were accidentally introduced
into the quota & statfs code during the endianess annotation process.
Also included is a general clean up which moves all of the _host
structures out of gfs2_ondisk.h (where they should not have been to
start with) and into the places where they are actually used (often only
one place). Also those _host structures which are not required any more
are removed entirely (which is the eventual plan for all of them).

The conversion routines from ondisk.c are also moved into the places
where they are actually used, which for almost every one, was just one
single place, so all those are now static functions. This also cleans up
the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.

The net result is a reduction of about 100 lines of code, many functions
now marked static plus the bug fixes as mentioned above. For good
measure I ran the code through sparse after making these changes to
check that there are no warnings generated.

This fixes Red Hat bz #239686
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

bb8d8a6f

[GFS2] Clean up inode number handling · dbb7cae2

由 Steven Whitehouse 提交于 5月 15, 2007

This patch cleans up the inode number handling code. The main difference
is that instead of looking up the inodes using a struct gfs2_inum_host
we now use just the no_addr member of this structure. The tests relating
to no_formal_ino can then be done by the calling code. This has
advantages in that we want to do different things in different code
paths if the no_formal_ino doesn't match. In the NFS patch we want to
return -ESTALE, but in the ->lookup() path, its a bug in the fs if the
no_formal_ino doesn't match and thus we can withdraw in this case.

In order to later fix bz #201012, we need to be able to look up an inode
without knowing no_formal_ino, as the only information that is known to
us is the on-disk location of the inode in question.

This patch will also help us to fix bz #236099 at a later date by
cleaning up a lot of the code in that area.

There are no user visible changes as a result of this patch and there
are no changes to the on-disk format either.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

dbb7cae2

01 5月, 2007 2 次提交

[GFS2] printk warning fixes · f391a4ea

由 akpm@linux-foundation.org 提交于 4月 25, 2007

alpha:

fs/gfs2/dir.c: In function 'gfs2_dir_read_leaf':
fs/gfs2/dir.c:1322: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'sector_t'
fs/gfs2/dir.c: In function 'gfs2_dir_read':
fs/gfs2/dir.c:1455: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type '__u64'

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

f391a4ea

[GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks · bdd19a22

由 Steven Whitehouse 提交于 4月 18, 2007

This patch detects when the number of entries in a leaf block or inode
block (in the case of stuffed directories) is corrupt and informs the
user. It prevents us from running off the end of the array thats been
allocated for the sorting in this case,
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

bdd19a22

15 2月, 2007 1 次提交

[PATCH] remove many unneeded #includes of sched.h · cd354f1a

由 Tim Schmielau 提交于 2月 14, 2007

After Al Viro (finally) succeeded in removing the sched.h #include in module.h
recently, it makes sense again to remove other superfluous sched.h includes.
There are quite a lot of files which include it but don't actually need
anything defined in there. Presumably these includes were once needed for
macros that used to live in sched.h, but moved to other header files in the
course of cleaning it up.

To ease the pain, this time I did not fiddle with any header files and only
removed #includes from .c-files, which tend to cause less trouble.

Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
configs in arch/arm/configs on arm. I also checked that no new warnings were
introduced by the patch (actually, some warnings are removed that were emitted
by unnecessarily included header files).
Signed-off-by: NTim Schmielau <tim@physik3.uni-rostock.de>
Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cd354f1a

06 2月, 2007 2 次提交

[GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 · ddfe0627

由 Eric Sandeen 提交于 1月 18, 2007

I was looking something else up and came across this...

I don't honestly have a good reason to change it other than to make it
like every other Linux filesystem in this regard.  ;-)  It doesn't
functionally change anything, but makes some lines shorter. :)

I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but
not in timespec_t format, and only stores second resolutions?  Seems like
you're halfway to sub-second resolutions already.

I suppose if that gets implemented then all of the below should
instead be CURRENT_TIME not CURRENT_TIME_SEC.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

ddfe0627

[GFS2] Clean up/speed up readdir · 3699e3a4

由 Steven Whitehouse 提交于 1月 17, 2007

This removes the extra filldir callback which gfs2 was using to
enclose an attempt at readahead for inodes during readdir. The
code was too complicated and also hurts performance badly in the
case that the getdents64/readdir call isn't being followed by
stat() and it wasn't even getting it right all the time when it
was.

As a result, on my test box an "ls" of a directory containing 250000
files fell from about 7mins (freshly mounted, so nothing cached) to
between about 15 to 25 seconds. When the directory content was cached,
the time taken fell from about 3mins to about 4 or 5 seconds.

Interestingly in the cached case, running "ls -l" once reduced the time
taken for subsequent runs of "ls" to about 6 secs even without this
patch. Now it turns out that there was a special case of glocks being
used for prefetching the metadata, but because of the timeouts for these
locks (set to 10 secs) the metadata was being timed out before it was
being used and this the prefetch code was constantly trying to prefetch
the same data over and over.

Calling "ls -l" meant that the inodes were brought into memory and once
the inodes are cached, the glocks are not disposed of until the inodes
are pushed out of the cache, thus extending the lifetime of the glocks,
and thus bringing down the time for subsequent runs of "ls"
considerably.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

3699e3a4

30 11月, 2006 4 次提交

[GFS2] Make sentinel dirents compatible with gfs1 · 5e7d65cd

由 Steven Whitehouse 提交于 11月 17, 2006

When deleting directory entries, we set the inum.no_addr to zero
in a dirent when its the first dirent in a block and thus cannot
be merged into the previous dirent as is the usual case. In gfs1,
inum.no_formal_ino was used instead.

This patch changes gfs2 to set both inum.no_addr and inum.no_formal_ino
to zero. It also changes the test from just looking at inum.no_addr to
look at both inum.no_addr and inum.no_formal_ino and a sentinel is
now considered to be a dirent in which _either_ (or both) of them
is set to zero.

This resolves Red Hat bugzillas: #215809, #211465
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5e7d65cd

[GFS2] Remove gfs2_inode_attr_in · 9e2dbdac

由 Steven Whitehouse 提交于 11月 08, 2006

This function wasn't really doing the right thing. There was no need
to update the inode size at this point and the updating of the
i_blocks field has now been moved to the places where di_blocks is
updated. A result of this patch and some those preceeding it is that
unlocking a glock is now a much more efficient process, since there
is no longer any requirement to copy data from the gfs2 inode into
the vfs inode at this point.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

9e2dbdac

[GFS2] Shrink gfs2_inode (7) - di_payload_format · a9583c79

由 Steven Whitehouse 提交于 11月 01, 2006

This is almost never used. Its there for backward
compatibility with GFS1. It doesn't need its own
field since it can always be calculated from the
inode mode & flags. This saves a bit more space
in the gfs2_inode.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

a9583c79

[GFS2] Shrink gfs2_inode (6) - di_atime/di_mtime/di_ctime · 1a7b1eed

由 Steven Whitehouse 提交于 11月 01, 2006

Remove the di_[amc]time fields and use inode->i_[amc]time
fields instead. This saves 24 bytes from the gfs2_inode.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

1a7b1eed

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功