Commits · 8e2e00473598dd5379d8408cb974dade000acafc · openeuler / Kernel

19 Jul, 2012 1 commit

GFS2: Reduce file fragmentation · 8e2e0047

Committed by Bob Peterson on Jul 19, 2012

This patch reduces GFS2 file fragmentation by pre-reserving blocks. The
resulting improved on-disk layout greatly speeds up operations in cases
which would previously have resulted in interlaced allocation of blocks.
A typical example of this is 10 parallel dd processes, each writing to a
file in a common directory.

The implementation uses an rbtree of reservations attached to each
resource group (and each inode).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
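
The interlacing effect described in this commit can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the allocator, the function names and the reservation size of 4 blocks are hypothetical and are not taken from the GFS2 code, which keeps its reservations in an rbtree per resource group.

```python
def allocate_interleaved(writers, blocks_each):
    """No reservations: each writer takes the next free block in turn."""
    layout = []
    for _ in range(blocks_each):
        for writer in range(writers):
            layout.append(writer)
    return layout


def allocate_with_reservations(writers, blocks_each, reserve=4):
    """Each writer pre-reserves up to `reserve` contiguous blocks per turn."""
    layout = []
    remaining = [blocks_each] * writers
    while any(remaining):
        for writer in range(writers):
            take = min(reserve, remaining[writer])
            layout.extend([writer] * take)
            remaining[writer] -= take
    return layout


def count_extents(layout, writer):
    """Count the contiguous runs of blocks owned by `writer`."""
    extents = 0
    previous = None
    for owner in layout:
        if owner == writer and previous != writer:
            extents += 1
        previous = owner
    return extents
```

Comparing `count_extents()` on the two layouts for the same writer shows how a file that would otherwise be split into one extent per block ends up in far fewer extents once blocks are reserved ahead of time.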

8e2e0047

18 Jul, 2012 1 commit

GFS2: kernel panic with small gfs2 filesystems - 1 RG · 294f2ad5

Committed by Abhijith Das on Jul 18, 2012

In the unlikely setup where there's only one resource group in the gfs2
filesystem, gfs2_rgrpd_get_next() returns a NULL rgd that is not dealt with
properly, causing a kernel NULL pointer dereference. This patch fixes this issue.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

294f2ad5

28 Jun, 2012 1 commit

GFS2: Fixing double brelse'ing bh allocated in gfs2_meta_read when EIO occurs · 44b8db13

Committed by Masatake YAMATO on Jun 18, 2012

This patch fixes a buffer_head double free in the following code path:

gfs2_block_map
=> gfs2_meta_inode_buffer
 => gfs2_meta_indirect_buffer
  => gfs2_meta_read
=> release_metapath

gfs2_block_map calls gfs2_meta_inode_buffer with &mp.mp_bh[0]
as an argument. mp.mp_bh is zero-filled at the beginning
of gfs2_block_map.

If gfs2_meta_inode_buffer returns a non-zero value, gfs2_block_map
calls release_metapath to free the buffers chained to mp.mp_bh.
release_metapath checks each slot mp.mp_bh[i] and frees it
(with brelse) unless the slot is NULL.

The slot &mp.mp_bh[0] passed to gfs2_meta_inode_buffer is filled in by
gfs2_meta_read, which stores a buffer allocated with gfs2_getbuf there
even if EIO occurs. When EIO occurs, the allocated buffer is brelse'ed,
but a pointer to the released buffer (the wrong pointer) is still
passed back to the caller via the argument bhp.

gfs2_meta_indirect_buffer, the caller, also passes the wrong pointer
to its own caller along with EIO. Finally gfs2_block_map gets both EIO
and &mp.mp_bh[0] filled with the wrong pointer, and release_metapath
calls brelse again on that pointer.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
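
The failure mode described in this commit can be reproduced in a small reference-counting model. This is a sketch, not the kernel code: the classes and functions only borrow the kernel names for readability, and the `clear_on_error` switch is a hypothetical parameter used to contrast the buggy behaviour with one plausible fix (clearing the caller's slot before returning the error).

```python
EIO = 5


class BufferHead:
    def __init__(self):
        self.refcount = 1  # reference held by whoever obtained the buffer


def brelse(bh):
    """Drop one reference; releasing an already released buffer is the bug."""
    if bh is None:
        return
    if bh.refcount == 0:
        raise RuntimeError("double brelse")
    bh.refcount -= 1


def meta_read(slots, index, fail, clear_on_error):
    """Model of a read that publishes the buffer before checking for errors."""
    bh = BufferHead()
    slots[index] = bh  # the pointer is stored in the caller's slot up front
    if fail:
        brelse(bh)
        if clear_on_error:
            slots[index] = None  # do not leave a stale pointer behind
        return -EIO
    return 0


def release_metapath(slots):
    """Release every buffer up to the first empty slot."""
    for i, bh in enumerate(slots):
        if bh is None:
            break
        brelse(bh)
        slots[i] = None


def block_map(fail, clear_on_error):
    slots = [None] * 3
    ret = meta_read(slots, 0, fail, clear_on_error)
    release_metapath(slots)  # runs on both the success and the error path
    return ret
```

Calling `block_map(fail=True, clear_on_error=False)` raises the "double brelse" error in this model, while the same call with `clear_on_error=True` simply returns `-EIO`.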

44b8db13

14 Jun, 2012 1 commit

GFS2: Combine functions get_local_rgrp and gfs2_inplace_reserve · 666d1d8a

Committed by Bob Peterson on Jun 13, 2012

This patch combines rgrp functions get_local_rgrp and
gfs2_inplace_reserve so that the double retry loop is gone.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

666d1d8a

13 Jun, 2012 1 commit

GFS2: Add kobject release method · 0d515210

Committed by Bob Peterson on Jun 13, 2012

This patch adds a kobject release function that properly maintains
the kobject use count, so that accesses to the sysfs files do not
cause an access to freed kernel memory after an unmount.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

0d515210

11 Jun, 2012 2 commits

GFS2: Size seq_file buffer more carefully · 0fe2f1e9

Committed by Steven Whitehouse on Jun 11, 2012

This places a limit on the buffer size for archs with larger
PAGE_SIZE.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>

0fe2f1e9

GFS2: Use seq_vprintf for glocks debugfs file · 1bb49303

Committed by Steven Whitehouse on Jun 11, 2012

Make use of the newly added seq_vprintf() function.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>

1bb49303

08 Jun, 2012 2 commits

GFS2: Use lvbs for storing rgrp information with mount option · 90306c41

Committed by Benjamin Marzinski on May 29, 2012

Instead of reading in the resource groups when gfs2 is checking
for free space to allocate from, gfs2 can store the necessary information
in the resource group's lvb. Also, instead of searching for unlinked
inodes in every resource group that's checked for free space, gfs2 can
store the number of unlinked inodes in the lvb, and only check for
unlinked inodes if it will find some.

The first time a resource group is locked, the lvb must be initialized.
Since this involves counting the unlinked inodes in the resource group,
this takes a little extra time. But after that, if the resource group
is locked with GL_SKIP, the buffer head won't be read in unless it's
actually needed.

Enabling the resource group lvbs is done via the rgrplvb mount option. If
this option isn't set, the lvbs will still be set and updated, but they won't
be verified or used by the filesystem. To safely turn on this option, all of
the nodes mounting the filesystem must be running code with this patch, and
the filesystem must have been completely unmounted since they were updated.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

90306c41

GFS2: Cache last hash bucket for glock seq_files · ba1ddcb6

Committed by Steven Whitehouse on Jun 08, 2012

For the glocks and glstats seq_files, which are exposed via debugfs,
we should cache the most recent hash bucket, along with the offset
into that bucket. This allows us to restart from that point, rather
than having to begin at the beginning each time.

This is an idea from Eric Dumazet, however I've slightly extended it
so that if the position from which we are due to start is at any
point beyond the last cached point, we start from the last cached
point, plus whatever is the appropriate offset. I don't really expect
people to be lseeking around these files, but if they did so with only
positive offsets, then we'd still get some of the benefit of using a
cached offset.

With my simple test of around 200k entries in the file, I'm seeing
an approx 10x speed up.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
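
The caching idea in this commit can be sketched with a small model of a hash table that is read by logical position. This is an illustration only: the class name, its fields and the `steps` cost counter are hypothetical and do not correspond to the seq_file iterator in the kernel.

```python
class CachedBucketIterator:
    """Look up the entry at a logical position in a list of hash chains."""

    def __init__(self, buckets):
        self.buckets = buckets
        self.cached_pos = 0      # logical position of the cached entry
        self.cached_bucket = 0   # bucket holding that entry
        self.cached_offset = 0   # offset of that entry within the bucket
        self.steps = 0           # entries skipped so far (cost counter)

    def seek(self, pos):
        """Return the entry at position `pos`, or None past the end."""
        if pos >= self.cached_pos:
            # Restart from the cached point instead of the first bucket.
            bucket, offset = self.cached_bucket, self.cached_offset
            current = self.cached_pos
        else:
            # Seeking backwards: fall back to a full walk.
            bucket, offset, current = 0, 0, 0
        while bucket < len(self.buckets):
            chain = self.buckets[bucket]
            if offset < len(chain):
                if current == pos:
                    self.cached_pos = pos
                    self.cached_bucket = bucket
                    self.cached_offset = offset
                    return chain[offset]
                offset += 1
                current += 1
                self.steps += 1
            else:
                bucket += 1
                offset = 0
        return None
```

In this model a sequential read skips each entry once in total, whereas restarting from the first bucket on every call would make the cost grow quadratically with the number of entries.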

ba1ddcb6

07 Jun, 2012 1 commit

GFS2: Increase buffer size for glocks and glstats debugfs files · df5d2f55

Committed by Steven Whitehouse on Jun 07, 2012

As per Al Viro's suggestion, this increases the buffer size used
for these two files. This provides a speed up of slightly less than
8x (i.e. proportional to the buffer size) for cases when we have
large numbers of glocks.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

df5d2f55

06 Jun, 2012 4 commits

GFS2: Fix error handling when reading an invalid block from the journal · 1b8ba31a

Committed by Steven Whitehouse on May 29, 2012

When we read an invalid block from the journal, we should not call
withdraw, but simply print a message and return an error. It is
up to the caller to then handle that error. In the case of mount,
that means a failed mount, rather than a withdraw (requiring a
reboot). In the case of recovering another node's journal, we
return an error via the uevent.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

1b8ba31a

GFS2: Add "top dir" flag support · 23d0bb83

Committed by Steven Whitehouse on May 28, 2012

This patch adds support for the "top dir" flag. Currently this is unused
but a subsequent patch is planned which will add support for the
Orlov allocation policy when allocating subdirectories in a parent
with this flag set.

In order to ensure backward compatible behaviour, mkfs.gfs2 does
not currently tag the root directory with this flag; it must always be
set manually.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

23d0bb83

GFS2: Fold quota data into the reservations struct · 5407e242

Committed by Bob Peterson on May 18, 2012

This patch moves the ancillary quota data structures into the
block reservations structure. This saves GFS2 some time and
effort in allocating and deallocating the qadata structure.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

5407e242

GFS2: Extend the life of the reservations · 0a305e49

Committed by Bob Peterson on Jun 06, 2012

This patch lengthens the lifespan of the reservations structure for
inodes. Before, they were allocated and deallocated for every write
operation. With this patch, they are allocated when the first write
occurs, and deallocated when the last process closes the file.
It's more efficient to do it this way because it saves GFS2 a lot of
unnecessary allocates and frees. It also gives us more flexibility
for the future: (1) we can now fold the qadata structure back into
the structure and save those alloc/frees, (2) we can use this for
multi-block reservations.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
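
The lifetime change described in this commit can be sketched as follows. The class, its fields and the `allocations` counter are hypothetical stand-ins chosen for illustration; they are not the GFS2 inode or reservation structures.

```python
class Inode:
    """Toy inode whose reservation lives from first write to last close."""

    def __init__(self):
        self.reservation = None  # stand-in for the reservations structure
        self.openers = 0         # number of processes holding the file open
        self.allocations = 0     # how many times a reservation was allocated

    def open(self):
        self.openers += 1

    def write(self):
        # Allocate lazily on the first write and reuse it afterwards.
        if self.reservation is None:
            self.reservation = {}
            self.allocations += 1

    def close(self):
        self.openers -= 1
        # Free only when the last process closes the file.
        if self.openers == 0:
            self.reservation = None
```

With two openers and many writes, this model allocates the structure once, where a per-write scheme would allocate and free it on every write.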

0a305e49

30 May, 2012 1 commit

->encode_fh() API change · b0b0382b

Committed by Al Viro on Apr 02, 2012

pass inode + parent's inode or NULL instead of dentry + bool saying
whether we want the parent or not.

NOTE: that needs ceph fix folded in.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b0b0382b

16 May, 2012 1 commit

GFS2: Fix quota adjustment return code · 500242ac

Committed by Bob Peterson on May 15, 2012

This patch changes function gfs2_adjust_quota so that it properly
returns a good (zero) return code on the normal path through the code.
Without this, mounting GFS2 with -o quota=account periodically gave
this error message: GFS2: fsid=cluster:fs: gfs2_quotad: sync error -5

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

500242ac

11 May, 2012 3 commits

GFS2: Add rgrp information to block_alloc trace point · 41db1ab9

Committed by Bob Peterson on May 09, 2012

This is a second attempt at a patch that adds rgrp information to the
block allocation trace point for GFS2. As suggested, the patch was
modified to list the rgrp information _after_ the fields that exist today.

Again, the reason for this patch is to allow us to trace and debug
problems with the block reservations patch, which is still in the works.
We can debug problems with reservations if we can see what block allocations
result from the block reservations. It may also be handy in figuring out
if there are problems in rgrp free space accounting. In other words,
we can use it to track the rgrp and its free space alongside the allocations
that are taking place.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

41db1ab9

GFS2: Eliminate unused "new" parameter to gfs2_meta_indirect_buffer · f2f9c812

Committed by Bob Peterson on May 10, 2012

It turns out that the "new" parameter to function gfs2_meta_indirect_buffer
was always being passed in as zero. Therefore, this patch eliminates it
and simplifies the function.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

f2f9c812

vfs: make it possible to access the dentry hash/len as one 64-bit entry · 26fe5750

Committed by Linus Torvalds on May 10, 2012

This allows comparing hash and len in one operation on 64-bit
architectures.  Right now only __d_lookup_rcu() takes advantage of this,
since that is the case we care most about.

The use of anonymous struct/unions hides the alternate 64-bit approach
from most users, the exception being a few cases where we initialize a
'struct qstr' with a static initializer.  This makes the problematic
cases use a new QSTR_INIT() helper function for that (but initializing
just the name pointer with a "{ .name = xyzzy }" initializer remains
valid, as does just copying another qstr structure).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
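
The idea of comparing hash and length in one operation can be sketched by packing two 32-bit fields into one 64-bit value. The helper names below are hypothetical, and placing the length in the upper half is an assumption made for the example; in the kernel the layout of the two fields depends on endianness.

```python
MASK32 = 0xFFFFFFFF


def pack_hash_len(hash_value, length):
    """Combine a 32-bit hash and a 32-bit length into one 64-bit integer."""
    return ((length & MASK32) << 32) | (hash_value & MASK32)


def unpack_hash(hash_len):
    """Recover the 32-bit hash from the packed value."""
    return hash_len & MASK32


def unpack_len(hash_len):
    """Recover the 32-bit length from the packed value."""
    return (hash_len >> 32) & MASK32


def same_hash_and_len(a, b):
    """One integer comparison checks both fields at once."""
    return a == b
```

Two names whose packed values differ cannot match, so a lookup can reject them with a single comparison before looking at the name bytes.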

26fe5750

08 May, 2012 1 commit

GFS2: Remove redundant metadata block type check · 6de1e2f3

Committed by Bob Peterson on Apr 27, 2012

This patch removes a redundant metadata block check. See description below.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

6de1e2f3

06 May, 2012 1 commit

vfs: Rename end_writeback() to clear_inode() · dbd5768f

Committed by Jan Kara on May 03, 2012

After we moved inode_sync_wait() from end_writeback() it doesn't make sense
to call the function end_writeback() anymore. Rename it to clear_inode(),
which describes well what the function really does: set the I_CLEAR flag.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>

dbd5768f

04 May, 2012 1 commit

GFS2: Fix sgid propagation when using ACLs · f9425ad4

Committed by Steven Whitehouse on May 04, 2012

This cleans up the mode setting code when creating inodes. The
SGID bit was being reset by setattr_copy() when the user creating a
subdirectory was not in the owning group. When ACLs are in use this
SGID bit should have been propagated if the ACL allows creation of
a subdirectory. GFS2's behaviour now matches that of the other ACL
supporting filesystems in this regard.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

f9425ad4

03 May, 2012 2 commits

gfs2: fix recovery during unmount · 1a058f52

Committed by David Teigland on May 01, 2012

Journal recovery from lock_dlm should not be ignored
if there is an unmount in progress.  Ignoring it will
cause the recovery to get stuck.  The recovery
process will correctly handle an in-progress unmount.

Signed-off-by: David Teigland <teigland@redhat.com>

1a058f52

dlm: fixes for nodir mode · 4875647a

Committed by David Teigland on Apr 26, 2012

The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progress locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.

Signed-off-by: David Teigland <teigland@redhat.com>

4875647a

02 May, 2012 1 commit

GFS2: eliminate log elements and simplify · c0752aa7

Committed by Bob Peterson on May 01, 2012

This patch eliminates the gfs2_log_element data structure and
rolls its two components into the gfs2_bufdata. This makes the code
easier to understand and makes it easier to migrate to an rbtree
to keep the list sorted.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

c0752aa7

30 Apr, 2012 1 commit

GFS2: Eliminate vestigial sd_log_le_rg · 1c47f095

Committed by Bob Peterson on Apr 27, 2012

This patch eliminates gfs2 superblock variable sd_log_le_rg which
is no longer used.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

1c47f095

27 Apr, 2012 1 commit

GFS2: Eliminate needless parameter from function gfs2_setbit · 06344b91

Committed by Bob Peterson on Apr 26, 2012

This patch eliminates parameter "buf1" from function gfs2_setbit.
This is possible because it was always passed in as bi->bi_bh->b_data.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

06344b91

24 Apr, 2012 13 commits

GFS2: Log code fixes · 144a4c2f

Committed by Steven Whitehouse on Apr 19, 2012

This patch removes a log lock from around an atomic operation where
it is not needed, removes an unused variable, and also changes
a void pointer used incorrectly to a struct page pointer.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

144a4c2f

GFS2: Remove unused argument from gfs2_internal_read · 4306629e

Committed by Andrew Price on Apr 16, 2012

gfs2_internal_read accepts an unused ra_state argument, left over from
when we did readahead on the rindex. Since there are currently no plans
to add back this readahead, this patch removes the ra_state parameter
and updates the functions which call gfs2_internal_read accordingly.

Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

4306629e

GFS2: Remove bd_list_tr · c50b91c4

Committed by Steven Whitehouse on Apr 16, 2012

This is another clean up in the logging code. This per-transaction
list was largely unused. Its main function was to ensure that the
number of buffers in a transaction was correct, however that counter
was only used to check the number of buffers in the bd_list_tr, plus
an assert at the end of each transaction. With the assert now changed
to use the calculated buffer counts, we can remove both bd_list_tr and
its associated counter.

This should make the code easier to understand as well as shrinking
a couple of structures.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

c50b91c4

GFS2: Remove duplicate log code · dad30e90

Committed by Steven Whitehouse on Apr 16, 2012

The main part of this patch merges the two functions used to
write metadata and data buffers to the log. Most of the code
is common between the two functions, so this provides a nice
clean up, and makes the code more readable.

The gfs2_get_log_desc() function is also extended to take two more
arguments, and thus avoid having to set the length and data1
fields of this structure as a separate operation.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

dad30e90

GFS2: Clean up log write code path · e8c92ed7

Committed by Steven Whitehouse on Apr 16, 2012

Prior to this patch, we had two ways of sending i/o to the log.
One of those is used when we need to allocate both the data
to be written itself and also a buffer head to submit it. This
is done via sb_getblk and friends. This is used mostly for writing
log headers.

The other method is used when writing blocks which have some
in-place counterpart. This is the case for all the metadata
blocks which are journaled, and when journaled data is in use,
for unescaped journaled data blocks.

This patch replaces both of those methods, and about half
a dozen separate i/o submission points, with a single i/o
submission function. We also go direct to bio rather than
using buffer heads, since this allows us to build i/o
requests of the maximum size for the block device in
question. It also reduces the memory required for flushing
the log, which can be very useful in low memory situations.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

e8c92ed7

GFS2: Use variable rather than qa to determine if unstuff necessary · 2f7ee358

Committed by Bob Peterson on Apr 12, 2012

In the future, the qadata structure will be eliminated and merged
back in with the block reservation structure, after we extend the
lifespan of that. This patch is a step forward in eliminating the
qadata structure. It adds a variable to the do_grow function to
determine when unstuffing is necessary, and has been done.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

2f7ee358

GFS2: Change variable blk to biblk · 9598d25e

Committed by Bob Peterson on Apr 12, 2012

In the resource group code, we have no fewer than three different
kinds of block references: block relative to the file system (u64),
block relative to the rgrp (u32), and block relative to the bitmap.
This is a small step to making the code more readable; it renames
variable blk to biblk to solidify in my mind that it's relative to
the bitmap and nothing else.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
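
The three kinds of block reference named in this commit can be related to each other with a short sketch. The function names and the example layout are hypothetical; the sketch assumes a bitmap with two bits per block (four blocks per byte) and bitmaps described by a starting byte offset and a length in bytes, which is a simplification of the real rgrp structures.

```python
GFS2_NBBY = 4  # blocks described by one bitmap byte (2 bits per block)


def fs_to_rgrp_block(fs_block, rd_data0):
    """Convert a file-system block number to a block relative to the rgrp."""
    rgrp_block = fs_block - rd_data0
    if rgrp_block < 0:
        raise ValueError("block lies before the rgrp's first data block")
    return rgrp_block


def rgrp_to_bitmap_block(rgrp_block, bitmaps):
    """Convert an rgrp-relative block to (bitmap index, bitmap-relative block).

    `bitmaps` is a list of (bi_start, bi_len) pairs measured in bytes.
    """
    for index, (bi_start, bi_len) in enumerate(bitmaps):
        first = bi_start * GFS2_NBBY
        last = (bi_start + bi_len) * GFS2_NBBY
        if first <= rgrp_block < last:
            return index, rgrp_block - first
    raise ValueError("block is not covered by any bitmap")
```

Keeping the three number spaces in separately named variables, as the rename to biblk does, makes it harder to pass a value from one space where another is expected.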

9598d25e

GFS2: Fix function parameter comments in rgrp.c · 886b1416