提交 · a17d758b661d6fa01a0d466d7bdda3c131bb68f9 · openanolis / cloud-kernel

07 3月, 2014 1 次提交

GFS2: global conversion to pr_foo() · fc554ed3

由 Fabian Frederick 提交于 3月 05, 2014

-All printk(KERN_foo converted to pr_foo().
-Messages updated to fit in 80 columns.
-fs_macros converted as well.
-fs_printk removed.
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

fc554ed3

07 2月, 2014 1 次提交

GFS2: Add meta readahead field in directory entries · 44aaada9

由 Steven Whitehouse 提交于 2月 07, 2014

The intent of this new field in the directory entry is to
allow a subsequent lookup to know how many blocks, which
are contiguous with the inode, contain metadata which relates
to the inode. This will then allow the issuing of a single
read to read these blocks, rather than reading the inode
first, and then issuing a second read for the metadata.

This only works under some fairly strict conditions, since
we do not have back pointers from inodes to directory entries
we must ensure that the blocks referenced in this way will
always belong to the inode.

This rules out being able to use this system for indirect
blocks, as these can change as a result of truncate/rewrite.

So the idea here is to restrict this to xattr blocks only
for the time being. For most inodes, that means only a
single block. Also, when using ACLs and/or SELinux or
other LSMs, these will be added at inode creation time
so that they will be contiguous with the inode on disk and
also will almost always be needed when we read the inode in
for permissions checks.

Once an xattr block for an inode is allocated, it will never
change until the inode is deallocated.

This patch adds the new field, a further patch will add the
readahead in due course.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

44aaada9

08 1月, 2014 2 次提交

GFS2: Add hints to directory leaf blocks · 01bcb0de

由 Steven Whitehouse 提交于 1月 08, 2014

This patch adds four new fields to directory leaf blocks.
The intent is not to use them in the kernel itself, although
perhaps we may be able to use them as hints at some later date,
but instead to provide more information for debug/fsck use.

One new field adds a pointer to the inode to which the leaf
belongs. This can be useful if the pointer to the leaf block
has become corrupt, as it will allow us to know which inode
this block should be associated with. This field is set when
the leaf is created and never changed over its lifetime.

The second field is a "distance from the hash table" field.
The meaning is as follows:
 0  = An old leaf in which this value has not been set
 1  = This leaf is pointed to directly from the hash table
 2+ = This leaf is part of a chain, pointed to by another leaf
      block, the value gives the position in the chain.

The third and fourth fields combine to give a time stamp of
the most recent directory insertion or deletion from this
leaf block. The time stamp is not updated when a new leaf
block is chained from the current one. The code is currently
written such that the timestamp on the dir inode will match
that of the leaf block for the most recent insertion/deletion.

For backwards compatibility, any of these new fields which is
zero should be considered to be "unknown".
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

01bcb0de

GFS2: For exhash conversion, only one block is needed · 22b5a6c0

由 Steven Whitehouse 提交于 1月 08, 2014

For most cases, only a single new block is needed when we reach
the point of converting from stuffed to exhash directory. The
exception being when the file name is so long that it will not
fit within the new leaf block.

So this patch adds a simple test for that situation so that we
do not need to request the full reservation size in this case.

Potentially we could calculate more accurately the value to use
in other cases too, but that is much more complicated to do and
it is doubtful that the benefit would outweigh the extra cost
in code complexity.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

22b5a6c0

06 1月, 2014 2 次提交

GFS2: Remember directory insert point · 2b47dad8

由 Steven Whitehouse 提交于 1月 06, 2014

When we look to see if there is enough space to add a dir
entry without allocation, we have then been repeating the
same search later when we do the actual insertion. This
patch caches the details of the location in the gfs2_diradd
structure, so that we do not have to repeat the search.

This will provide a performance improvement which will be
greater as the size of the directory increases.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

2b47dad8

GFS2: Add directory addition info structure · 3c1c0ae1

由 Steven Whitehouse 提交于 1月 06, 2014

The intent is that this structure will hold the information
required when adding entries to a directory (linking). To
start with, it will contain only the number of blocks which
are required to link the new entry into the directory. The
current calculation returns either 0 or the maximim number of
blocks that can ever be requested by such a transaction.

The intent is that in a later patch, we can update the dir
code to calculate this value more accurately. In addition
further patches will also add further fields to the new
structure to increase its utility.

In addition this patch fixes a bug where the link used during
inode creation was adding requesting too many blocks in
some cases. This is harmless unless the fs is close to being
full.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

3c1c0ae1

20 8月, 2013 1 次提交

treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks · 8be04b93

由 Joe Perches 提交于 6月 19, 2013

Don't emit OOM warnings when k.alloc calls fail when
there there is a v.alloc immediately afterwards.

Converted a kmalloc/vmalloc with memset to kzalloc/vzalloc.
Signed-off-by: NJoe Perches <joe@perches.com>
Acked-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

8be04b93

29 6月, 2013 1 次提交
- A
  [readdir] convert gfs2 · d81a8ef5
  由 Al Viro 提交于 5月 16, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  d81a8ef5
14 6月, 2013 1 次提交

GFS2: fix regression in dir_double_exhash · 512cbf02

由 Bob Peterson 提交于 6月 14, 2013

Recent commit e8830d88 introduced a bug in function dir_double_exhash;
it was failing to set h in the fall-back case. This patch corrects it.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

512cbf02

11 6月, 2013 1 次提交

GFS2: Only do one directory search on create · 5a00f3cc

由 Steven Whitehouse 提交于 6月 11, 2013

Creation of a new inode requires a directory search in order to ensure
that we are not trying to create an inode with the same name as an
existing one. This was hidden away inside the create_ok() function.

In the case that there was an existing inode, and a lookup can be
substituted for a create (which is the case with regular files
when the O_EXCL flag is not in use) then we were doing a second
lookup in order to return the inode.

This patch merges these two lookups into one. This can be done by
passing a flag to gfs2_dir_search() to tell it to just return -EEXIST
in the cases where we don't actually want to look up the inode.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5a00f3cc

03 6月, 2013 1 次提交

GFS2: Fall back to vmalloc if kmalloc fails for dir hash tables · e8830d88

由 Bob Peterson 提交于 5月 30, 2013

This version has one more correction: the vmalloc calls are replaced
by __vmalloc calls to preserve the GFP_NOFS flag.

When GFS2's directory management code allocates buffers for a
directory hash table, if it can't get the memory it needs, it
currently gives a bad return code. Rather than giving an error,
this patch allows it to use virtual memory rather than kernel
memory for the hash table. This should make it possible for
directories to function properly, even when kernel memory becomes
very fragmented.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

e8830d88

13 2月, 2013 1 次提交

gfs2: Split NO_QUOTA_CHANGE inot NO_UID_QUTOA_CHANGE and NO_GID_QUTOA_CHANGE · f4108a60

由 Eric W. Biederman 提交于 1月 31, 2013

Split NO_QUOTA_CHANGE into NO_UID_QUTOA_CHANGE and NO_GID_QUTOA_CHANGE
so the constants may be well typed.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

f4108a60

29 1月, 2013 1 次提交

GFS2: Split gfs2_trans_add_bh() into two · 350a9b0a

由 Steven Whitehouse 提交于 12月 14, 2012

There is little common content in gfs2_trans_add_bh() between the data
and meta classes by the time that the functions which it calls are
taken into account. The intent here is to split this into two
separate functions. Stage one is to introduce gfs2_trans_add_data()
and gfs2_trans_add_meta() and update the callers accordingly.

Later patches will then pull in the content of gfs2_trans_add_bh()
and its dependent functions in order to clean up the code in this
area.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

350a9b0a

13 11月, 2012 1 次提交

GFS2: Use dirty_inode in gfs2_dir_add · 343cd8f0

由 Bob Peterson 提交于 11月 12, 2012

This patch changes the gfs2_dir_add function so that it uses
the dirty_inode function (via mark_inode_dirty) rather than manually
updating the dinode.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

343cd8f0

06 6月, 2012 1 次提交

GFS2: Fold quota data into the reservations struct · 5407e242

由 Bob Peterson 提交于 5月 18, 2012

This patch moves the ancillary quota data structures into the
block reservations structure. This saves GFS2 some time and
effort in allocating and deallocating the qadata structure.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5407e242

11 5月, 2012 1 次提交

vfs: make it possible to access the dentry hash/len as one 64-bit entry · 26fe5750

由 Linus Torvalds 提交于 5月 10, 2012

This allows comparing hash and len in one operation on 64-bit
architectures.  Right now only __d_lookup_rcu() takes advantage of this,
since that is the case we care most about.

The use of anonymous struct/unions hides the alternate 64-bit approach
from most users, the exception being a few cases where we initialize a
'struct qstr' with a static initializer.  This makes the problematic
cases use a new QSTR_INIT() helper function for that (but initializing
just the name pointer with a "{ .name = xyzzy }" initializer remains
valid, as does just copying another qstr structure).
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

26fe5750

05 4月, 2012 1 次提交

GFS2: Make sure rindex is uptodate before starting transactions · 5e2f7d61

由 Bob Peterson 提交于 4月 04, 2012

This patch removes the call from gfs2_blk2rgrd to function
gfs2_rindex_update and replaces it with individual calls.
The former way turned out to be too problematic.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5e2f7d61

22 11月, 2011 1 次提交

GFS2: decouple quota allocations from block allocations · 564e12b1

由 Bob Peterson 提交于 11月 21, 2011

This patch separates the code pertaining to allocations into two
parts: quota-related information and block reservations.
This patch also moves all the block reservation structure allocations to
function gfs2_inplace_reserve to simplify the code, and moves
the frees to function gfs2_inplace_release.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

564e12b1

21 11月, 2011 1 次提交

GFS2: move toward a generic multi-block allocator · 6e87ed0f

由 Bob Peterson 提交于 11月 18, 2011

This patch is a revision of the one I previously posted.
I tried to integrate all the suggestions Steve gave.
The purpose of the patch is to change function gfs2_alloc_block
(allocate either a dinode block or an extent of data blocks)
to a more generic gfs2_alloc_blocks function that can
allocate both a dinode _and_ an extent of data blocks in the
same call. This will ultimately help us create a multi-block
reservation scheme to reduce file fragmentation.

This patch moves more toward a generic multi-block allocator that
takes a pointer to the number of data blocks to allocate, plus whether
or not to allocate a dinode. In theory, it could be called to allocate
(1) a single dinode block, (2) a group of one or more data blocks, or
(3) a dinode plus several data blocks.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

6e87ed0f

15 11月, 2011 1 次提交

GFS2: combine gfs2_alloc_block and gfs2_alloc_di · 3c5d785a

由 Bob Peterson 提交于 11月 14, 2011

GFS2 functions gfs2_alloc_block and gfs2_alloc_di do basically
the same things, with a few exceptions. This patch combines
the two functions into a slightly more generic gfs2_alloc_block.
Having one centralized block allocation function will reduce
code redundancy and make it easier to implement multi-block
reservations to reduce file fragmentation in the future.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

3c5d785a

09 11月, 2011 1 次提交

GFS2: f_ra is always valid in dir readahead function · 79c4c379

由 Steven Whitehouse 提交于 11月 09, 2011

As a result, we don't need to test it each time.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>

79c4c379

08 11月, 2011 1 次提交

GFS2: Add readahead to sequential directory traversal · dfe4d34b

由 Bob Peterson 提交于 10月 27, 2011

This patch adds read-ahead capability to GFS2's
directory hash table management.  It greatly improves
performance for some directory operations.  For example:
In one of my file systems that has 1000 directories, each
of which has 1000 files, time to execute a recursive
ls (time ls -fR /mnt/gfs2 > /dev/null) was reduced
from 2m2.814s on a stock kernel to 0m45.938s.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

dfe4d34b

21 10月, 2011 4 次提交

GFS2: Use cached rgrp in gfs2_rlist_add() · 70b0c365

由 Steven Whitehouse 提交于 9月 02, 2011

Each block which is deallocated, requires a call to gfs2_rlist_add()
and each of those calls was calling gfs2_blk2rgrpd() in order to
figure out which rgrp the block belonged in. This can be speeded up
by making use of the rgrp cached in the inode. We also reset this
cached rgrp in case the block has changed rgrp. This should provide
a big reduction in gfs2_blk2rgrpd() calls during deallocation.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

70b0c365

GFS2: Make resource groups "append only" during life of fs · 8339ee54

由 Steven Whitehouse 提交于 8月 31, 2011

Since we have ruled out supporting online filesystem shrink,
it is possible to make the resource group list append only
during the life of a super block. This gives several benefits:

Firstly, we only need to read new rindex elements as they are added
rather than needing to reread the whole rindex file each time one
element is added.

Secondly, the rindex glock can be held for much shorter periods of
time, and is completely removed from the fast path for allocations.
The lock is taken in shared mode only when updating the resource
groups when the first allocation occurs, and after a grow has
taken place.

Thirdly, this results in a reduction in code size, and everything
gets a lot simpler to understand in this area.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

8339ee54

GFS2: Use ->dirty_inode() · ab9bbda0

由 Steven Whitehouse 提交于 8月 15, 2011

The aim of this patch is to use the newly enhanced ->dirty_inode()
super block operation to deal with atime updates, rather than
piggy backing that code into ->write_inode() as is currently
done.

The net result is a simplification of the code in various places
and a reduction of the number of gfs2_dinode_out() calls since
this is now implied by ->dirty_inode().

Some of the mark_inode_dirty() calls have been moved under glocks
in order to take advantage of then being able to avoid locking in
->dirty_inode() when we already have suitable locks.

One consequence is that generic_write_end() now correctly deals
with file size updates, so that we do not need a separate check
for that afterwards. This also, indirectly, means that fdatasync
should work correctly on GFS2 - the current code always syncs the
metadata whether it needs to or not.

Has survived testing with postmark (with and without atime) and
also fsx.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

ab9bbda0

GFS2: Clean up dir hash table reading · 4c28d338

由 Steven Whitehouse 提交于 7月 26, 2011

Since there is now only a single caller to gfs2_dir_read_data()
and it has a number of constant arguments, we can factor
those out. Also some tests relating to the inode size were
being done twice.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4c28d338

15 7月, 2011 1 次提交

GFS2: Cache dir hash table in a contiguous buffer · 17d539f0

由 Steven Whitehouse 提交于 6月 15, 2011

This patch adds a cache for the hash table to the directory code
in order to help simplify the way in which the hash table is
accessed. This is intended to be a first step towards introducing
some performance improvements in the directory code.

There are two follow ups that I'm hoping to see fairly shortly. One
is to simplify the hash table reading code now that we always read the
complete hash table, whether we want one entry or all of them. The
other is to introduce readahead on the heads of the hash chains
which are referred to from the table.

The hash table is a maximum of 128k in size, so it is not worth trying
to read it in small chunks.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

17d539f0

09 5月, 2011 2 次提交

GFS2: When adding a new dir entry, inc link count if it is a subdir · 3d6ecb7d

由 Steven Whitehouse 提交于 5月 09, 2011

This adds an increment of the link count when we add a new directory
entry, if that entry is itself a directory. This means that we no
longer need separate code to perform this operation.

Now that both adding and removing directory entries automatically
update the parent directory's link count if required, that makes
the code shorter and simpler than before.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

3d6ecb7d

GFS2: Make gfs2_dir_del update link count when required · 855d23ce

由 Steven Whitehouse 提交于 5月 09, 2011

When we remove an entry from a directory, we can save ourselves
some trouble if we know the type of the entry in question, since
if it is itself a directory, we can update the link count of the
parent at the same time as removing the directory entry.

In addition this patch also merges the rmdir and unlink code which
was almost identical anyway. This eliminates the calls to remove
the . and .. directory entries on each rmdir (not needed since the
directory will be deallocated, anyway) which was the only thing preventing
passing the dentry to gfs2_dir_del(). The passing of the dentry
rather than just the name allows us to figure out the type of the entry
which is being removed, and thus adjust the link count when required.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

855d23ce

20 4月, 2011 4 次提交

GFS2: move function foreach_leaf to gfs2_dir_exhash_dealloc · 556bb179

由 Bob Peterson 提交于 3月 22, 2011

The previous patches made function gfs2_dir_exhash_dealloc do nothing
but call function foreach_leaf. This patch simplifies the code by
moving the entire function foreach_leaf into gfs2_dir_exhash_dealloc.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

556bb179

GFS2: pass leaf_bh into leaf_dealloc · ec038c82

由 Bob Peterson 提交于 3月 22, 2011

Function foreach_leaf used to look up the leaf block address and get
a buffer_head.  Then it would call leaf_dealloc which did the same
lookup.  This patch combines the two operations by making foreach_leaf
pass the leaf bh to leaf_dealloc.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

ec038c82

GFS2: Combine transaction from gfs2_dir_exhash_dealloc · d24a7a43

由 Bob Peterson 提交于 3月 22, 2011

At the end of function gfs2_dir_exhash_dealloc, it was setting the dinode
type to "file" to prevent directory corruption in case of a crash.
It was doing so in its own journal transaction. This patch makes the
change occur when the last call is make to leaf_dealloc, since it needs
to rewrite the directory dinode at that time anyway.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

d24a7a43

GFS2: remove *leaf_call_t and simplify leaf_dealloc · 0d95326d

由 Bob Peterson 提交于 3月 22, 2011

Since foreach_leaf is only called with leaf_dealloc as its only possible
call function, we can simplify the code by making it call leaf_dealloc
directly. This simplifies the code and eliminates the need for
leaf_call_t, the generic call method. This is a first small step in
simplifying the directory leaf deallocation code.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

0d95326d

18 4月, 2011 1 次提交

GFS2: filesystem hang caused by incorrect lock order · 44ad37d6

由 Bob Peterson 提交于 3月 17, 2011

This patch fixes a deadlock in GFS2 where two processes are trying
to reclaim an unlinked dinode:
One holds the inode glock and calls gfs2_lookup_by_inum trying to look
up the inode, which it can't, due to I_FREEING.  The other has set
I_FREEING from vfs and is at the beginning of gfs2_delete_inode
waiting for the glock, which is held by the first.  The solution is to
add a new non_block parameter to the gfs2_iget function that causes it
to return -ENOENT if the inode is being freed.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

44ad37d6

20 9月, 2010 2 次提交

GFS2: Make . and .. qstrs constant · 8d123585

由 Steven Whitehouse 提交于 9月 17, 2010

Rather than calculating the qstrs for . and .. each time
we need them, its better to keep a constant version of
these and just refer to them when required.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@infradead.org>

8d123585

GFS2: Remove i_disksize · a2e0f799

由 Steven Whitehouse 提交于 8月 11, 2010

With the update of the truncate code, ip->i_disksize and
inode->i_size are merely copies of each other. This means
we can remove ip->i_disksize and use inode->i_size exclusively
reducing the size of a GFS2 inode by 8 bytes.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

a2e0f799

29 7月, 2010 2 次提交

GFS2: remove dependency on __GFP_NOFAIL · 4244b52e

由 David Rientjes 提交于 7月 20, 2010

The k[mc]allocs in dr_split_leaf() and dir_double_exhash() are failable,
so remove __GFP_NOFAIL from their masks.

Cc: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4244b52e

GFS2: Use kmalloc when possible for ->readdir() · d2a97a4e

由 Steven Whitehouse 提交于 7月 28, 2010

If we don't need a huge amount of memory in ->readdir() then
we can use kmalloc rather than vmalloc to allocate it. This
should cut down on the greater overheads associated with
vmalloc for smaller directories.

We may be able to eliminate vmalloc entirely at some stage,
but this is easy to do right away.

Also using GFP_NOFS to avoid any issues wrt to deleting inodes
while under a glock, and suggestion from Linus to factor out
the alloc/dealloc.

I've given this a test with a variety of different sized
directories and it seems to work ok.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d2a97a4e

15 7月, 2010 1 次提交

GFS2: rename causes kernel Oops · 728a756b

由 Bob Peterson 提交于 7月 14, 2010

This patch fixes a kernel Oops in the GFS2 rename code.

The problem was in the way the gfs2 directory code was trying
to re-use sentinel directory entries.

In the failing case, gfs2's rename function was renaming a
file to another name that had the same non-trivial length.
The file being renamed happened to be the first directory
entry on the leaf block.

First, the rename code (gfs2_rename in ops_inode.c) found the
original directory entry and decided it could do its job by
simply replacing the directory entry with another.  Therefore
it determined correctly that no block allocations were needed.

Next, the rename code deleted the old directory entry prior to
replacing it with the new name.  Therefore, the soon-to-be
replaced directory entry was temporarily made into a directory
entry "sentinel" or a place holder at the start of a leaf block.

Lastly, it went to re-add the replacement directory entry in
that leaf block.  However, when gfs2_dirent_find_space was
looking for space in the leaf block, it used the wrong value
for the sentinel.  That threw off its calculations so later
it decides it can't really re-use the sentinel and therefore
must allocate a new leaf block.  But because it previously decided
to re-use the directory entry, it didn't waste the time to
grab a new block allocation for the inode.  Therefore, the
inode's i_alloc pointer was still NULL and it crashes trying to
reference it.

In the case of sentinel directory entries, the entire dirent is
reused, not just the "free space" portion of it, and therefore
the function gfs2_dirent_find_space should use the value 0
rather than GFS2_DIRENT_SIZE(0) for the actual dirent size.

Fixing this calculation enables the reproducer programs to work
properly.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

728a756b

14 4月, 2010 1 次提交

GFS2: glock livelock · 1a0eae88

由 Bob Peterson 提交于 4月 14, 2010

This patch fixes a couple gfs2 problems with the reclaiming of
unlinked dinodes.  First, there were a couple of livelocks where
everything would come to a halt waiting for a glock that was
seemingly held by a process that no longer existed.  In fact, the
process did exist, it just had the wrong pid number in the holder
information.  Second, there was a lock ordering problem between
inode locking and glock locking.  Third, glock/inode contention
could sometimes cause inodes to be improperly marked invalid by
iget_failed.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>

1a0eae88

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功