提交 · ad781971d9b8b409be09645be56d160865c952a6 · openeuler / raspberrypi-kernel

16 1月, 2014 1 次提交

GFS2: Don't use ENOBUFS when ENOMEM is the correct error code · ac3beb6a

由 Steven Whitehouse 提交于 1月 16, 2014

Al Viro has tactfully pointed out that we are using the incorrect
error code in some cases. This patch fixes that, and also removes
the (unused) return value for glock dumping.

>        * gfs2_iget() - ENOBUFS instead of ENOMEM.  ENOBUFS is
> "No buffer space available (POSIX.1 (XSI STREAMS option))" and since
> we don't support STREAMS it's probably fair game, but... what the hell?
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>

ac3beb6a

15 1月, 2014 3 次提交

GFS2: Move quota bitmap operations under their own lock · 2d9e7230

由 Steven Whitehouse 提交于 12月 13, 2013

Gradually, the global qd_lock is being used for less and less.
After this patch it will only be used for the per super block
list whose purpose is to allow syncing of changes back to the
master quota file from the local quota changes file. Fixing
up that process to make it more efficient will be the subject
of a later patch, however this patch removes another barrier
to doing that.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>

2d9e7230

GFS2: Clean up quota slot allocation · ee2411a8

由 Steven Whitehouse 提交于 12月 12, 2013

Quota slot allocation has historically used a vector of pages
and a set of homegrown find/test/set/clear bit functions. Since
the size of the bitmap is likely to be based on the default
qc file size, thats a couple of pages at most. So we ought
to be able to allocate that as a single chunk, with a vmalloc
fallback, just in case of memory fragmentation.

We are then able to use the kernel's own find/test/set/clear
bit functions, rather than rolling our own.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>

ee2411a8

GFS2: Use RCU/hlist_bl based hash for quotas · c754fbbb

由 Steven Whitehouse 提交于 12月 12, 2013

Prior to this patch, GFS2 kept all the quotas for each
super block in a single linked list. This is rather slow
when there are large numbers of quotas.

This patch introduces a hlist_bl based hash table, similar
to the one used for glocks. The initial look up of the quota
is now lockless in the case where it is already cached,
although we still have to take the per quota spinlock in
order to bump the ref count. Either way though, this is a
big improvement on what was there before.

The qd_lock and the per super block list is preserved, for
the time being. However it is intended that since this is no
longer used for its original role, it should be possible to
shrink the number of items on that list in due course and
remove the requirement to take qd_lock in qd_get.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

c754fbbb

03 1月, 2014 3 次提交

GFS2: Use only a single address space for rgrps · 70d4ee94

由 Steven Whitehouse 提交于 12月 06, 2013

Prior to this patch, GFS2 had one address space for each rgrp,
stored in the glock. This patch changes them to use a single
address space in the super block. This therefore saves
(sizeof(struct address_space) * nr_of_rgrps) bytes of memory
and for large filesystems, that can be significant.

It would be nice to be able to do something similar and merge
the inode metadata address space into the same global
address space. However, that is rather more complicated as the
on-disk location doesn't have a 1:1 mapping with the inodes in
general. So while it could be done, it will be a more complicated
operation as it requires changing a lot more code paths.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

70d4ee94

GFS2: Use range based functions for rgrp sync/invalidation · 7005c3e4

由 Steven Whitehouse 提交于 12月 06, 2013

Each rgrp header is represented as a single extent on disk, so we
can calculate the position within the address space, since we are
using address spaces mapped 1:1 to the disk. This means that it
is possible to use the range based versions of filemap_fdatawrite/wait
and for invalidating the page cache.

Our eventual intent is to then be able to merge the address spaces
used for rgrps into a single address space, rather than to have
one for each glock, saving memory and reducing complexity.

Since during umount, the rgrp structures are disposed of before
the glocks, we need to store the extent information in the glock
so that is is available for a final invalidation. This patch uses
a field which is otherwise unused in rgrp glocks to do that, so
that we do not have to expand the size of a glock.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

7005c3e4

GFS2: Implement a "rgrp has no extents longer than X" scheme · 5ea5050c

由 Bob Peterson 提交于 11月 25, 2013

With the preceding patch, we started accepting block reservations
smaller than the ideal size, which requires a lot more parsing of the
bitmaps. To reduce the amount of bitmap searching, this patch
implements a scheme whereby each rgrp keeps track of the point
at this multi-block reservations will fail.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5ea5050c

04 11月, 2013 2 次提交

GFS2: Use generic list_lru for quota · 2147dbfd

由 Steven Whitehouse 提交于 11月 04, 2013

By using the generic list_lru code, we can now separate the
per sb quota list locking from the lru locking. The lru
lock is made into the inner-most lock.

As a result of this new lock order, we may occasionally see
items on the per-sb quota list which are "dead" so that the
two places where we traverse that list are updated to take
account of that.

As a result of this patch, the gfs2 quota shrinker is now
NUMA zone aware, and we are also laying the foundations for
further improvments in due course.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NAbhijith Das <adas@redhat.com>
Tested-by: NAbhijith Das <adas@redhat.com>
Cc: Dave Chinner <dchinner@redhat.com>

2147dbfd

GFS2: Use reflink for quota data cache · 9b9f039d

由 Steven Whitehouse 提交于 11月 01, 2013

This patch adds reflink support to the quota data cache. It
looks a bit strange because we still don't have a sensible
split in the lookup by id and the lru list. That is coming in
later patches though.

The intent here is just to swap the current ref count for
reflinks in all cases with as little as possible other change.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NAbhijith Das <adas@redhat.com>
Tested-by: NAbhijith Das <adas@redhat.com>

9b9f039d

15 10月, 2013 1 次提交

GFS2: Use lockref for glocks · e66cf161

由 Steven Whitehouse 提交于 10月 15, 2013

Currently glocks have an atomic reference count and also a spinlock
which covers various internal fields, such as the state. This intent of
this patch is to replace the spinlock and the atomic reference count
with a lockref structure. This contains a spinlock which we can continue
to use as before, and a reference counter which is used in conjuction
with the spinlock to replace the previous atomic counter.

As a result of this there are some new rules for reference counting on
glocks. We need to distinguish between reference count changes under
gl_spin (which are now just increment or decrement of the new counter,
provided the count cannot hit zero) and those which are outside of
gl_spin, but which now take gl_spin internally.

The conversion is relatively straight forward. There is probably some
further clean up which can be done, but the priority at this stage is to
make the change in as simple a manner as possible.

A consequence of this change is that the reference count is being
decoupled from the lru list processing. This should allow future
adoption of the lru_list code with glocks in due course.

The reason for using the "dead" state and not just relying on 0 being
the "invalid state" is so that in due course 0 ref counts can be
allowable. The intent is to eventually be able to remove the ref count
changes which are currently hidden away in state_change().
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

e66cf161

04 10月, 2013 2 次提交

GFS2: Protect quota sync generation · e46c772d

由 Steven Whitehouse 提交于 10月 04, 2013

Now that gfs2_quota_sync can be potentially called from multiple
threads, we should protect this bit of code, and the sync generation
number in particular in order to ensure that there are no races
when syncing quotas.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>

e46c772d

GFS2: Remove obsolete quota tunable · bef292a7

由 Steven Whitehouse 提交于 10月 03, 2013

There is no need for a paramater which relates to the internals
of quota to be exposed to users. The only possible use would be
to turn it up so large that the memory allocation fails. So lets
remove it and set it to a sensible value which ensures that we
don't ask for multipage allocations.

Currently the size of struct gfs2_holder means that the caluclated
value is identical to the previous default value, so there should
be no functional change.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Abhijith Das <adas@redhat.com>

bef292a7

02 10月, 2013 1 次提交

GFS2: Add allocation parameters structure · 7b9cff46

由 Steven Whitehouse 提交于 10月 02, 2013

This patch adds a structure to contain allocation parameters with
the intention of future expansion of this structure. The idea is
that we should be able to add more information about the allocation
in the future in order to allow the allocator to make a better job
of placing the requests on-disk.

There is no functional difference from applying this patch.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

7b9cff46

18 9月, 2013 1 次提交

GFS2: Introduce rbm field bii · e579ed4f

由 Bob Peterson 提交于 9月 17, 2013

This is a respin of the original patch. As Steve pointed out, the
introduction of field bii makes it easy to eliminate bi itself.
This revised patch does just that, replacing bi with bii.

This patch adds a new field to the rbm structure, called bii,
which is an index into the array of bitmaps for an rgrp.
This replaces *bi which was a pointer to the bitmap.
This is being done for further optimizations.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

e579ed4f

17 9月, 2013 1 次提交

GFS2: introduce bi_blocks for optimization · 7e230f57

由 Bob Peterson 提交于 9月 11, 2013

This patch introduces a new field in the bitmap structure called
bi_blocks. Its purpose is to save us from constantly multiplying
bi_len by the constant GFS2_NBBY. It also paves the way for more
optimization in a future patch.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

7e230f57

10 4月, 2013 1 次提交

GFS2: Add origin indicator to glock callbacks · 81ffbf65

由 Steven Whitehouse 提交于 4月 10, 2013

This patch adds a bool indicating whether the demote
request was originated locally or remotely. This is then
used by the iopen ->go_callback() to make 100% sure that
it will only respond to remote callbacks.

Since ->evict_inode() uses GL_NOCACHE when it attempts to
get an exclusive lock on the iopen lock, this may result
in extra scheduling of the workqueue in case that the
exclusive promotion request failed. This patch prevents
that from happening.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

81ffbf65

08 4月, 2013 1 次提交

GFS2: replace gfs2_ail structure with gfs2_trans · 16ca9412

由 Benjamin Marzinski 提交于 4月 05, 2013

In order to allow transactions and log flushes to happen at the same
time, gfs2 needs to move the transaction accounting and active items
list code into the gfs2_trans structure.  As a first step toward this,
this patch removes the gfs2_ail structure, and handles the active items
list in the gfs_trans structure.  This keeps gfs2 from allocating an ail
structure on log flushes, and gives us a struture that can later be used
to store the transaction accounting outside of the gfs2 superblock
structure.

With this patch, at the end of a transaction, gfs2 will add the
gfs2_trans structure to the superblock if there is not one already.
This structure now has the active items fields that were previously in
gfs2_ail.  This is not necessary in the case where the transaction was
simply used to add revokes, since these are never written outside of the
journal, and thus, don't need an active items list.

Also, in order to make sure that the transaction structure is not
removed while it's still in use by gfs2_trans_end, unlocking the
sd_log_flush_lock has to happen slightly later in ending the
transaction.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

16ca9412

04 4月, 2013 1 次提交

GFS2: use kmalloc for lvb bitmap · 57c7310b

由 David Teigland 提交于 3月 05, 2013

The temp lvb bitmap was on the stack, which could
be an alignment problem for __set_bit_le.  Use
kmalloc for it instead.
Signed-off-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

57c7310b

13 2月, 2013 2 次提交

gfs2: Store qd_id in struct gfs2_quota_data as a struct kqid · 05e0a60d

由 Eric W. Biederman 提交于 1月 31, 2013

- Change qd_id in struct gfs2_qutoa_data to struct kqid.
- Remove the now unnecessary QDF_USER bit field in qd_flags.
- Propopoage this change through the code generally making
  things simpler along the way.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

05e0a60d

GFS2: Reinstate withdraw ack system · fd95e81c

由 Steven Whitehouse 提交于 2月 13, 2013

This patch reinstates the ack system which withdraw should be using. It
appears to have been accidentally forgotten when the lock module was
merged into GFS2, due to two different sysfs files having the same name.
Reported-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

fd95e81c

29 1月, 2013 3 次提交

GFS2: Use ->writepages for ordered writes · 45138990

由 Steven Whitehouse 提交于 1月 28, 2013

Instead of using a list of buffers to write ahead of the journal
flush, this now uses a list of inodes and calls ->writepages
via filemap_fdatawrite() in order to achieve the same thing. For
most use cases this results in a shorter ordered write list,
as well as much larger i/os being issued.

The ordered write list is sorted by inode number before writing
in order to retain the disk block ordering between inodes as
per the previous code.

The previous ordered write code used to conflict in its assumptions
about how to write out the disk blocks with mpage_writepages()
so that with this updated version we can also use mpage_writepages()
for GFS2's ordered write, writepages implementation. So we will
also send larger i/os from writeback too.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

45138990

GFS2: Clean up freeze code · d564053f

由 Steven Whitehouse 提交于 1月 11, 2013

The freeze code has not been looked at a lot recently. Upstream has
moved on, and this is an attempt to catch us back up again. There
is a vfs level interface for the freeze code which can be called
from our (obsolete, but kept for backward compatibility purposes)
sysfs freeze interface. This means freezing this way vs. doing it
from the ioctl should now work in identical fashion.

As a result of this, the freeze function is only called once
and we can drop our own special purpose code for counting the
number of freezes.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

d564053f

GFS2: Copy gfs2_trans_add_bh into new data/meta functions · 767f433f

由 Steven Whitehouse 提交于 12月 14, 2012

This patch copies the body of gfs2_trans_add_bh into the two newly
added gfs2_trans_add_data and gfs2_trans_add_meta functions. We can
then move the .lo_add functions from lops.c into trans.c and call
them directly.

As a result of this, we no longer need to use the .lo_add functions
at all, so that is removed from the log operations structure.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

767f433f

15 11月, 2012 2 次提交

GFS2: remove redundant lvb pointer · 4e2f8849

由 David Teigland 提交于 11月 14, 2012

The lksb struct already contains a pointer to the lvb,
so another directly from the glock struct is not needed.
Signed-off-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4e2f8849

GFS2: only use lvb on glocks that need it · dba2d70c

由 David Teigland 提交于 11月 14, 2012

Save the effort of allocating, reading and writing
the lvb for most glocks that do not use it.
Signed-off-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

dba2d70c

14 11月, 2012 1 次提交

GFS2: skip dlm_unlock calls in unmount · fb6791d1

由 David Teigland 提交于 11月 13, 2012

When unmounting, gfs2 does a full dlm_unlock operation on every
cached lock. This can create a very large amount of work and can
take a long time to complete. However, the vast majority of these
dlm unlock operations are unnecessary because after all the unlocks
are done, gfs2 leaves the dlm lockspace, which automatically clears
the locks of the leaving node, without unlocking each one individually.
So, gfs2 can skip explicit dlm unlocks, and use dlm_release_lockspace to
remove the locks implicitly. The one exception is when the lock's lvb is
being used. In this case, dlm_unlock is called because it may update the
lvb of the resource.
Signed-off-by: NDavid Teigland <teigland@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

fb6791d1

07 11月, 2012 2 次提交

GFS2: Rename glops go_xmote_th to go_sync · 06dfc306

由 Bob Peterson 提交于 10月 24, 2012

[Editorial: This is a nit, but has been a minor irritation for a long time:]

This patch renames glops structure item for go_xmote_th to go_sync.
The functionality is unchanged; it's just for readability.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

06dfc306

GFS2: Speed up gfs2_rbm_from_block · a68a0a35

由 Bob Peterson 提交于 10月 19, 2012

This patch is a rewrite of function gfs2_rbm_from_block. Rather than
looping to find the right bitmap, the code now does a few simple
math calculations.

I compared the performance of both algorithms side by side and the new
algorithm is noticeably faster. Sample instrumentation output from a
"fast" machine:

5 million calls: millisec spent: Orig: 166 New: 113
5 million calls: millisec spent: Orig: 189 New: 114

In addition, I ran postmark (on a somewhat slowr CPU) before the after
the new algorithm was put in place and postmark showed a decent
improvement:

Before the new algorithm:
-------------------------
Time:
	645 seconds total
	584 seconds of transactions (171 per second)

Files:
	150087 created (232 per second)
		Creation alone: 100000 files (2083 per second)
		Mixed with transactions: 50087 files (85 per second)
	49995 read (85 per second)
	49991 appended (85 per second)
	150087 deleted (232 per second)
		Deletion alone: 100174 files (7705 per second)
		Mixed with transactions: 49913 files (85 per second)

Data:
	273.42 megabytes read (434.08 kilobytes per second)
	852.13 megabytes written (1.32 megabytes per second)

With the new algorithm:
-----------------------
Time:
	599 seconds total
	530 seconds of transactions (188 per second)

Files:
	150087 created (250 per second)
		Creation alone: 100000 files (1886 per second)
		Mixed with transactions: 50087 files (94 per second)
	49995 read (94 per second)
	49991 appended (94 per second)
	150087 deleted (250 per second)
		Deletion alone: 100174 files (6260 per second)
		Mixed with transactions: 49913 files (94 per second)

Data:
	273.42 megabytes read (467.42 kilobytes per second)
	852.13 megabytes written (1.42 megabytes per second)
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

a68a0a35

24 9月, 2012 5 次提交

GFS2: Consolidate free block searching functions · ff7f4cb4

由 Steven Whitehouse 提交于 9月 10, 2012

With the recently added block reservation code, an additional function
was added to search for free blocks. This had a restriction of only being
able to search for aligned extents of free blocks. As a result the
allocation patterns when reserving blocks were suboptimal when the
existing allocation of blocks for an inode was not aligned to the same
boundary.

This patch resolves that problem by adding the ability for gfs2_rbm_find
to search for extents of a particular minimum size. We can then use
gfs2_rbm_find for both looking for reservations, and also looking for
free blocks on an individual basis when we actually come to do the
allocation later on. As a result we only need a single set of code
to deal with both situations.

The function gfs2_rbm_from_block() is moved up rgrp.c so that it
occurs before all of its callers.

Many thanks are due to Bob for helping track down the final issue in
this patch. That fix to the rb_tree traversal and to not share
block reservations from a dirctory to its children is included here.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NBob Peterson <rpeterso@redhat.com>

ff7f4cb4

GFS2: Improve block reservation tracing · 9e733d39

由 Steven Whitehouse 提交于 8月 23, 2012

This patch improves the tracing of block reservations by
removing some corner cases and also providing more useful
detail in the traces.

A new field is added to the reservation structure to contain
the inode number. This is used since in certain contexts it is
not possible to access the inode itself to obtain this information.
As a result we can then display the inode number for all tracepoints
and also in case we dump the resource group.

The "del" tracepoint operation has been removed. This could be called
with the reservation rgrp set to NULL. That resulted in not printing
the device number, and thus making the information largely useless
anyway. Also, the conditional on the rgrp being NULL can then be
removed from the tracepoint. After this change, all the block
reservation tracepoint calls will be called with the rgrp information.

The existing ins,clm and tdel calls to the block reservation tracepoint
are sufficient to track the entire life of the block reservation.

In gfs2_block_alloc() the error detection is updated to print out
the inode number of the problematic inode. This can then be compared
against the information in the glock dump,tracepoints, etc.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

9e733d39

GFS2: Replace rgblk_search with gfs2_rbm_find · 5b924ae2

由 Steven Whitehouse 提交于 8月 01, 2012

This is part of a series of patches which are introducing the
gfs2_rbm structure throughout the block allocation code. The
main aim of this part is to create a search function which can
deal directly with struct gfs2_rbm. In this case it specifies
the initial position at which to start the search and also the
point at which the search terminates.

The net result of this is to clean up the search code and make
it rather more readable, and the various possible exceptions which
may occur during the search are partitioned into their own functions.

There are some bug fixes too. We should not be checking the reservations
while allocating extents - the time for that is when we are searching
for where to put the extent, not when we've already made that decision.

Also, rgblk_search had two uses, and in only one of those cases did
it make sense to check for reservations. This is fixed in the new
gfs2_rbm_find function, which has a cleaner interface.

The reservation checking has been improved by always checking for
contiguous reservations, and returning the first free block after
all contiguous reservations. This is done under the spin lock to
ensure consistancy of the tree.

The allocation of extents is now in all cases done by the existing
allocation code, and if there is an active reservation, that is updated
after the fact. Again this is done under the spin lock, since it entails
changing the lookup key for the reservation in question.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5b924ae2

GFS2: Add structure to contain rgrp, bitmap, offset tuple · 4a993fb1

由 Steven Whitehouse 提交于 7月 31, 2012

This patch introduces a new structure, gfs2_rbm, which is a
tuple of a resource group, a bitmap within the resource group
and an offset within that bitmap. This is designed to make
manipulating these sets of variables easier. There is also a
new helper function which converts this representation back
to a disk block address.

In addition, the rbtree nodes which are used for the reservations
were not being correctly initialised, which is now fixed. Also,
the tracing was not passing through the inode where it should
have been. That is mostly fixed aside from one corner case. This
needs to be revisited since there can also be a NULL rgrp in
some cases which results in the device being incorrect in the
trace.

This is intended to be the first step towards cleaning up some
of the allocation code, and some further bug fixes.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4a993fb1

GFS2: Remove rs_requested field from reservations · 71f890f7

由 Steven Whitehouse 提交于 7月 30, 2012

The rs_requested field is left over from the original allocation
code, however this should have been a parameter passed to the
various functions from gfs2_inplace_reserve() and not a member of the
reservation structure as the value is not required after the
initial allocation.

This also helps simplify the code since we no longer need to set
the rs_requested to zero. Also the gfs2_inplace_release()
function can also be simplified since the reservation structure
will always be defined when it is called, and the only remaining
task is to unlock the rgrp if required. It can also now be
called unconditionally too, resulting in a further simplification.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

71f890f7

19 7月, 2012 1 次提交

GFS2: Reduce file fragmentation · 8e2e0047

由 Bob Peterson 提交于 7月 19, 2012

This patch reduces GFS2 file fragmentation by pre-reserving blocks. The
resulting improved on disk layout greatly speeds up operations in cases
which would have resulted in interlaced allocation of blocks previously.
A typical example of this is 10 parallel dd processes, each writing to a
file in a common dirctory.

The implementation uses an rbtree of reservations attached to each
resource group (and each inode).
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

8e2e0047

08 6月, 2012 1 次提交

GFS2: Use lvbs for storing rgrp information with mount option · 90306c41

由 Benjamin Marzinski 提交于 5月 29, 2012

Instead of reading in the resource groups when gfs2 is checking
for free space to allocate from, gfs2 can store the necessary infromation
in the resource group's lvb. Also, instead of searching for unlinked
inodes in every resource group that's checked for free space, gfs2 can
store the number of unlinked but inodes in the lvb, and only check for
unlinked inodes if it will find some.

The first time a resource group is locked, the lvb must initialized.
Since this involves counting the unlinked inodes in the resource group,
this takes a little extra time. But after that, if the resource group
is locked with GL_SKIP, the buffer head won't be read in unless it's
actually needed.

Enabling the resource groups lvbs is done via the rgrplvb mount option. If
this option isn't set, the lvbs will still be set and updated, but they won't
be verfied or used by the filesystem. To safely turn on this option, all of
the nodes mounting the filesystem must be running code with this patch, and
the filesystem must have been completely unmounted since they were updated.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

90306c41

06 6月, 2012 1 次提交

GFS2: Fold quota data into the reservations struct · 5407e242

由 Bob Peterson 提交于 5月 18, 2012

This patch moves the ancillary quota data structures into the
block reservations structure. This saves GFS2 some time and
effort in allocating and deallocating the qadata structure.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5407e242

03 5月, 2012 1 次提交

dlm: fixes for nodir mode · 4875647a

由 David Teigland 提交于 4月 26, 2012

The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progess locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need grant a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.
Signed-off-by: NDavid Teigland <teigland@redhat.com>

4875647a

02 5月, 2012 1 次提交

GFS2: eliminate log elements and simplify · c0752aa7

由 Bob Peterson 提交于 5月 01, 2012

This patch eliminates the gfs2_log_element data structure and
rolls its two components into the gfs2_bufdata. This makes the code
easier to understand and makes it easier to migrate to a rbtree
to keep the list sorted.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c0752aa7

30 4月, 2012 1 次提交

GFS2: Eliminate vestigial sd_log_le_rg · 1c47f095

由 Bob Peterson 提交于 4月 27, 2012

This patch eliminates gfs2 superblock variable sd_log_le_rg which
is no longer used.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

1c47f095

24 4月, 2012 1 次提交

GFS2: Remove bd_list_tr · c50b91c4

由 Steven Whitehouse 提交于 4月 16, 2012

This is another clean up in the logging code. This per-transaction
list was largely unused. Its main function was to ensure that the
number of buffers in a transaction was correct, however that counter
was only used to check the number of buffers in the bd_list_tr, plus
an assert at the end of each transaction. With the assert now changed
to use the calculated buffer counts, we can remove both bd_list_tr and
its associated counter.

This should make the code easier to understand as well as shrinking
a couple of structures.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c50b91c4