提交 · 24972557b12ce8fd5b6c6847d0e2ee1837ddc13b · openanolis / cloud-kernel

14 5月, 2014 1 次提交

GFS2: remove transaction glock · 24972557

由 Benjamin Marzinski 提交于 5月 01, 2014

GFS2 has a transaction glock, which must be grabbed for every
transaction, whose purpose is to deal with freezing the filesystem.
Aside from this involving a large amount of locking, it is very easy to
make the current fsfreeze code hang on unfreezing.

This patch rewrites how gfs2 handles freezing the filesystem. The
transaction glock is removed. In it's place is a freeze glock, which is
cached (but not held) in a shared state by every node in the cluster
when the filesystem is mounted. This lock only needs to be grabbed on
freezing, and actions which need to be safe from freezing, like
recovery.

When a node wants to freeze the filesystem, it grabs this glock
exclusively. When the freeze glock state changes on the nodes (either
from shared to unlocked, or shared to exclusive), the filesystem does a
special log flush. gfs2_log_flush() does all the work for flushing out
the and shutting down the incore log, and then it tries to grab the
freeze glock in a shared state again. Since the filesystem is stuck in
gfs2_log_flush, no new transaction can start, and nothing can be written
to disk. Unfreezing the filesytem simply involes dropping the freeze
glock, allowing gfs2_log_flush() to grab and then release the shared
lock, so it is cached for next time.

However, in order for the unfreezing ioctl to occur, gfs2 needs to get a
shared lock on the filesystem root directory inode to check permissions.
If that glock has already been grabbed exclusively, fsfreeze will be
unable to get the shared lock and unfreeze the filesystem.

In order to allow the unfreeze, this patch makes gfs2 grab a shared lock
on the filesystem root directory during the freeze, and hold it until it
unfreezes the filesystem. The functions which need to grab a shared
lock in order to allow the unfreeze ioctl to be issued now use the lock
grabbed by the freeze code instead.

The freeze and unfreeze code take care to make sure that this shared
lock will not be dropped while another process is using it.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

24972557

12 3月, 2014 1 次提交

GFS2: Re-add a call to log_flush_wait when flushing the journal · 428fd95d

由 Bob Peterson 提交于 3月 12, 2014

Upstream commit 34cc1781 changed a line of code from calling function
log_flush_commit to calling log_write_header. This had the effect of
eliminating a call to function log_flush_wait. That causes the journal
to skip over log headers, which results in multiple wrap points,
which itself leads to infinite loops in journal replay, both in the
kernel code and fsck.gfs2 code. This patch re-adds that call.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

428fd95d

25 2月, 2014 3 次提交

GFS2: Remove extra "if" in gfs2_log_flush() · b1ab1e44

由 Steven Whitehouse 提交于 2月 25, 2014

By reordering some of the assignments in gfs2_log_flush() it
is possible to remove one of the "if" statements as it can be
merged with one higher up the function.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

b1ab1e44

GFS2: Move log buffer accounting to transaction · 022ef4fe

由 Steven Whitehouse 提交于 2月 21, 2014

Now we have a master transaction into which other transactions
are merged, the accounting can be done using this master
transaction. We no longer require the superblock fields which
were being used for this function.

In addition, this allows for a clean up in calc_reserved()
making it rather easier understand. Also, by reducing the
number of variables used to track the buffers being added
and removed from the journal, a number of error checks are
now no longer required.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

022ef4fe

GFS2: Move log buffer lists into transaction · d69a3c65

由 Steven Whitehouse 提交于 2月 21, 2014

Over time, we hope to be able to improve the concurrency available
in the log code. This is one small step towards that, by moving
the buffer lists from the super block, and into the transaction
structure, so that each transaction builds its own buffer lists.

At transaction commit time, the buffer lists are merged into
the currently accumulating transaction. That transaction then
is passed into the before and after commit functions at journal
flush time. Thus there should be no change in overall behaviour
yet.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

d69a3c65

03 2月, 2014 1 次提交

GFS2: Plug on AIL flush · 885bceca

由 Steven Whitehouse 提交于 2月 03, 2014

When we do a flush of the AIL list, we are writing out what is
likely to be a lot of small I/Os, which are possibly in an order
which is not ideal performance-wise. Since this is done by calling
filemap_fdatatwrite for each individual inode's address space there
is no overall plugging going on.

In addition to that, we do not always wait for AIL i/o when we flush
it, so that it is possible for things to get left behind on the queue.
By adding explicit plugging here, we reduce the chances of this
being an issues. A quick test using the AIL flush tracepoint shows a
small, but measurable improvement.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

885bceca

14 12月, 2013 1 次提交

GFS2: Fix use-after-free race when calling gfs2_remove_from_ail · 9290a9a7

由 Bob Peterson 提交于 12月 10, 2013

Function gfs2_remove_from_ail drops the reference on the bh via
brelse. This patch fixes a race condition whereby bh is deferenced
after the brelse when setting bd->bd_blkno = bh->b_blocknr;
Under certain rare circumstances, bh might be gone or reused,
and bd->bd_blkno is set to whatever that memory happens to be,
which is often 0. Later, in gfs2_trans_add_unrevoke, that bd fails
the test "bd->bd_blkno >= blkno" which causes it to never be freed.
The end result is that the bd is never freed from the bufdata cache,
which results in this error:
slab error in kmem_cache_destroy(): cache `gfs2_bufdata': Can't free all objects
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

9290a9a7

19 6月, 2013 1 次提交

GFS2: aggressively issue revokes in gfs2_log_flush · 5d054964

由 Benjamin Marzinski 提交于 6月 14, 2013

This patch looks at all the outstanding blocks in all the transactions
on the log, and moves the completed ones to the ail2 list.  Then it
issues revokes for these blocks.  This will hopefully speed things up
in situations where there is a lot of contention for glocks, especially
if they are acquired serially.

revoke_lo_before_commit will issue at most one log block's full of these
preemptive revokes. The amount of reserved log space that
gfs2_log_reserve() ignores has been incremented to allow for this extra
block.

This patch also consolidates the common revoke instructions into one
function, gfs2_add_revoke().
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5d054964

08 4月, 2013 1 次提交

GFS2: replace gfs2_ail structure with gfs2_trans · 16ca9412

由 Benjamin Marzinski 提交于 4月 05, 2013

In order to allow transactions and log flushes to happen at the same
time, gfs2 needs to move the transaction accounting and active items
list code into the gfs2_trans structure.  As a first step toward this,
this patch removes the gfs2_ail structure, and handles the active items
list in the gfs_trans structure.  This keeps gfs2 from allocating an ail
structure on log flushes, and gives us a struture that can later be used
to store the transaction accounting outside of the gfs2 superblock
structure.

With this patch, at the end of a transaction, gfs2 will add the
gfs2_trans structure to the superblock if there is not one already.
This structure now has the active items fields that were previously in
gfs2_ail.  This is not necessary in the case where the transaction was
simply used to add revokes, since these are never written outside of the
journal, and thus, don't need an active items list.

Also, in order to make sure that the transaction structure is not
removed while it's still in use by gfs2_trans_end, unlocking the
sd_log_flush_lock has to happen slightly later in ending the
transaction.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

16ca9412

29 1月, 2013 1 次提交

GFS2: Use ->writepages for ordered writes · 45138990

由 Steven Whitehouse 提交于 1月 28, 2013

Instead of using a list of buffers to write ahead of the journal
flush, this now uses a list of inodes and calls ->writepages
via filemap_fdatawrite() in order to achieve the same thing. For
most use cases this results in a shorter ordered write list,
as well as much larger i/os being issued.

The ordered write list is sorted by inode number before writing
in order to retain the disk block ordering between inodes as
per the previous code.

The previous ordered write code used to conflict in its assumptions
about how to write out the disk blocks with mpage_writepages()
so that with this updated version we can also use mpage_writepages()
for GFS2's ordered write, writepages implementation. So we will
also send larger i/os from writeback too.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

45138990

02 5月, 2012 1 次提交

GFS2: eliminate log elements and simplify · c0752aa7

由 Bob Peterson 提交于 5月 01, 2012

This patch eliminates the gfs2_log_element data structure and
rolls its two components into the gfs2_bufdata. This makes the code
easier to understand and makes it easier to migrate to a rbtree
to keep the list sorted.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c0752aa7

24 4月, 2012 4 次提交

GFS2: Log code fixes · 144a4c2f

由 Steven Whitehouse 提交于 4月 19, 2012

This patch removes a log lock from around atomic operation where
it is not needed, removes an unused variable, and also changes
a void pointer used incorrectly to a struct page pointer.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

144a4c2f

GFS2: Remove bd_list_tr · c50b91c4

由 Steven Whitehouse 提交于 4月 16, 2012

This is another clean up in the logging code. This per-transaction
list was largely unused. Its main function was to ensure that the
number of buffers in a transaction was correct, however that counter
was only used to check the number of buffers in the bd_list_tr, plus
an assert at the end of each transaction. With the assert now changed
to use the calculated buffer counts, we can remove both bd_list_tr and
its associated counter.

This should make the code easier to understand as well as shrinking
a couple of structures.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c50b91c4

GFS2: Clean up log write code path · e8c92ed7

由 Steven Whitehouse 提交于 4月 16, 2012

Prior to this patch, we have two ways of sending i/o to the log.
One of those is used when we need to allocate both the data
to be written itself and also a buffer head to submit it. This
is done via sb_getblk and friends. This is used mostly for writing
log headers.

The other method is used when writing blocks which have some
in-place counterpart. This is the case for all the metadata
blocks which are journalled, and when journaled data is in use,
for unescaped journalled data blocks.

This patch replaces both of those two methods, and about half
a dozen separate i/o submission points with a single i/o
submission function. We also go direct to bio rather than
using buffer heads, since this allows us to build i/o
requests of the maximum size for the block device in
question. It also reduces the memory required for flushing
the log, which can be very useful in low memory situations.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

e8c92ed7

GFS2: Drop "pull" argument from log_write_header() · fdb76a42

由 Steven Whitehouse 提交于 4月 02, 2012

The "pull" argument to log_write_header() is only used
for debug purposes and it is not really needed any more. There
are other tests for this particular problem, so I think we can
dispose of it in order to simplify the code.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

fdb76a42

09 3月, 2012 1 次提交

GFS2: Clean up log flush header writing · 34cc1781

由 Steven Whitehouse 提交于 3月 09, 2012

We already send both a pre and post flush to the block device
when writing a journal header. There is no need to wait for
the previous I/O specifically when we do this, unless we've
turned "barriers" off.

As a side effect, this also cleans up the code path for flushing
the journal and makes it more readable.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

34cc1781

29 2月, 2012 3 次提交

GFS2: Make bd_cmp() static · 08728f2d

由 Steven Whitehouse 提交于 2月 21, 2012

Add missing static to bd_cmp()
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

08728f2d

GFS2: Sort the ordered write list · 4a36d08d

由 Bob Peterson 提交于 2月 14, 2012

This patch sorts the ordered write list for GFS2 writes.
This increases the throughput for simultaneous writes.
For example, if you have ten processes, all doing:
dd if=/dev/zero of=/mnt/gfs2/fileX
on different files, the throughput will be much better.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4a36d08d

GFS2: Move two functions from log.c to lops.c · 47ac5537

由 Steven Whitehouse 提交于 2月 03, 2012

gfs2_log_get_buf() and gfs2_log_fake_buf() are both used
only in lops.c, so move them next to their callers and they
can then become static.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

47ac5537

22 11月, 2011 1 次提交

freezer: unexport refrigerator() and update try_to_freeze() slightly · a0acae0e

由 Tejun Heo 提交于 11月 21, 2011

There is no reason to export two functions for entering the
refrigerator.  Calling refrigerator() instead of try_to_freeze()
doesn't save anything noticeable or removes any race condition.

* Rename refrigerator() to __refrigerator() and make it return bool
  indicating whether it scheduled out for freezing.

* Update try_to_freeze() to return bool and relay the return value of
  __refrigerator() if freezing().

* Convert all refrigerator() users to try_to_freeze().

* Update documentation accordingly.

* While at it, add might_sleep() to try_to_freeze().
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Christoph Hellwig <hch@infradead.org>

a0acae0e

08 11月, 2011 1 次提交

GFS2: Fix up REQ flags · 20ed0535

由 Steven Whitehouse 提交于 10月 31, 2011

Christoph has split up REQ_PRIO from REQ_META. That means that
we can drop REQ_PRIO from places where is it not needed. I'm
not at all sure that the combination WRITE_FLUSH_FUA | REQ_PRIO
makes any kind of sense, anyway.

In addition, I've added REQ_META to one place in the code where
it was missing. REQ_PRIO has been left for read/writes triggered
by glock acquisition and writeback only. We can adjust it again
if required, but these are the most important points from a
performance perspective.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>

20ed0535

23 8月, 2011 1 次提交

block: separate priority boosting from REQ_META · 65299a3b

由 Christoph Hellwig 提交于 8月 23, 2011

Add a new REQ_PRIO to let requests preempt others in the cfq I/O schedule,
and lave REQ_META purely for marking requests as metadata in blktrace.

All existing callers of REQ_META except for XFS are updated to also
set REQ_PRIO for now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NNamhyung Kim <namhyung@gmail.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

65299a3b

14 7月, 2011 1 次提交

GFS2: Resolve inode eviction and ail list interaction bug · 380f7c65

由 Steven Whitehouse 提交于 7月 14, 2011

This patch contains a few misc fixes which resolve a recently
reported issue. This patch has been a real team effort and has
received a lot of testing.

The first issue is that the ail lock needs to be held over a few
more operations. The lock thats added into gfs2_releasepage() may
possibly be a candidate for replacing with RCU at some future
point, but at this stage we've gone for the obvious fix.

The second issue is that gfs2_write_inode() can end up calling
a glock recursively when called from gfs2_evict_inode() via the
syncing code, so it needs a guard added.

The third issue is that we either need to not truncate the metadata
pages of inodes which have zero link count, but which we cannot
deallocate due to them still being in use by other nodes, or we need
to ensure that those pages have all made it through the journal and
ail lists first. This patch takes the former approach, but the
latter has also been tested and there is nothing to choose between
them performance-wise. So again, we could revise that decision
in the future.

Also, the inode eviction process is now better documented.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Tested-by: NBob Peterson <rpeterso@redhat.com>
Tested-by: NAbhijith Das <adas@redhat.com>
Reported-by: NBarry J. Marson <bmarson@redhat.com>
Reported-by: NDavid Teigland <teigland@redhat.com>

380f7c65

22 5月, 2011 1 次提交

GFS2: Wait properly when flushing the ail list · 26b06a69

由 Steven Whitehouse 提交于 5月 21, 2011

The ail flush code has always relied upon log flushing to prevent
it from spinning needlessly. This fixes it to wait on the last
I/O request submitted (we don't need to wait for all of it)
instead of either spinning with io_schedule or sleeping.

As a result cpu usage of gfs2_logd is much reduced with certain
workloads.
Reported-by: NAbhijith Das <adas@redhat.com>
Tested-by: NAbhijith Das <adas@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

26b06a69

03 5月, 2011 1 次提交

GFS2: Fix ail list traversal · 4f1de018

由 Steven Whitehouse 提交于 4月 26, 2011

In the recent patches to update the AIL list code, I managed to
forget that the ail list lock got dropped, even though I
added a comment specifically to remind myself :(
Reported-by: NBarry Marson <bmarson@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4f1de018

20 4月, 2011 3 次提交

GFS2: Add an AIL writeback tracepoint · c83ae9ca

由 Steven Whitehouse 提交于 4月 18, 2011

Add a tracepoint for monitoring writeback of the AIL.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c83ae9ca

GFS2: Make writeback more responsive to system conditions · 4667a0ec

由 Steven Whitehouse 提交于 4月 18, 2011

This patch adds writeback_control to writing back the AIL
list. This means that we can then take advantage of the
information we get in ->write_inode() in order to set off
some pre-emptive writeback.

In addition, the AIL code is cleaned up a bit to make it
a bit simpler to understand.

There is still more which can usefully be done in this area,
but this is a good start at least.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4667a0ec

GFS2: Use filemap_fdatawrite() to write back the AIL · 5ac048bb

由 Steven Whitehouse 提交于 3月 30, 2011

In order to ensure that the mapping stats (and thus the bdi) are correctly
updated, this patch changes the AIL writeback to use the filemap_datawrite
function. This helps prevent stalls in balance_dirty_pages() due to
large amounts of dirty metadata when there is little or no dirty data
around.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5ac048bb

14 3月, 2011 1 次提交

GFS2: Update to AIL list locking · c618e87a

由 Steven Whitehouse 提交于 3月 14, 2011

The previous patch missed a couple of places where the AIL list
needed locking, so this fixes up those places, plus a comment
is corrected too.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Dave Chinner <dchinner@redhat.com>

c618e87a

11 3月, 2011 1 次提交

GFS2: introduce AIL lock · d6a079e8

由 Dave Chinner 提交于 3月 11, 2011

The log lock is currently used to protect the AIL lists and
the movements of buffers into and out of them. The lists
are self contained and no log specific items outside the
lists are accessed when starting or emptying the AIL lists.

Hence the operation of the AIL does not require the protection
of the log lock so split them out into a new AIL specific lock
to reduce the amount of traffic on the log lock. This will
also reduce the amount of serialisation that occurs when
the gfs2_logd pushes on the AIL to move it forward.

This reduces the impact of log pushing on sequential write
throughput.
Signed-off-by: NDave Chinner <dchinner@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

d6a079e8

10 3月, 2011 1 次提交

block: kill off REQ_UNPLUG · 721a9602

由 Jens Axboe 提交于 3月 09, 2011

With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

721a9602

17 9月, 2010 1 次提交

GFS2: gfs2_logd should be using interruptible waits · 5f487490

由 Steven Whitehouse 提交于 9月 09, 2010

Looks like this crept in, in a recent update.
Reported-by: NKrzysztof Urbaniak <urban@bash.org.pl>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5f487490

10 9月, 2010 1 次提交

gfs2: replace barriers with explicit flush / FUA usage · f1e4d518

由 Christoph Hellwig 提交于 8月 18, 2010

Switch to the WRITE_FLUSH_FUA flag for log writes, remove the EOPNOTSUPP
detection for barriers and stop setting the barrier flag for discards.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
Acked-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

f1e4d518

08 8月, 2010 2 次提交

block: unify flags for struct bio and struct request · 7b6d91da

由 Christoph Hellwig 提交于 8月 07, 2010

Remove the current bio flags and reuse the request flags for the bio, too.
This allows to more easily trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superflous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

7b6d91da

block: BARRIER request should imply SYNC · 41f2df62

由 Christoph Hellwig 提交于 6月 17, 2010

A barrier request should by defintion have priority in get_request
and let the queue be unplugged immediately as it's blocking all forward
progress due to the queue draining.

Most filesystems already get this implicitly by the way how submit_bh
treats the buffer_ordered flag, and gfs2 sets it explicitly.  But btrfs
and XFS are still forgetting to set the flag, as is blkdev_issue_flush
and some places in DM/MD.

For XFS on metadata heavy workloads this gives a consistent speedup
in the 2-3% range.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

41f2df62

21 5月, 2010 1 次提交

GFS2: Rework reclaiming unlinked dinodes · ed4878e8

由 Bob Peterson 提交于 5月 20, 2010

The previous patch I wrote for reclaiming unlinked dinodes
had some shortcomings and did not prevent all hangs.
This version is much cleaner and more logical, and has
passed very difficult testing.  Sorry for the churn.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

ed4878e8

06 5月, 2010 1 次提交

GFS2: Add some useful messages · 913a71d2

由 Steven Whitehouse 提交于 5月 06, 2010

The following patch adds a message to indicate when barriers have been
disabled due to a block device which doesn't support them. You could
already tell this via the mount options in /proc/mounts, but all the
other filesystems also log a message at the same time.

Also, the same mechanisms are used to indicate when the lock
demote interface has been used (only ever used for debugging)
which is a request from our support team.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

913a71d2

05 5月, 2010 1 次提交

GFS2: Various gfs2_logd improvements · 5e687eac

由 Benjamin Marzinski 提交于 5月 04, 2010

This patch contains various tweaks to how log flushes and active item writeback
work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reseve now waits
for gfs2_logd to do the log flushing. Multiple functions were rewritten to
remove the need to call gfs2_log_lock(). Instead of using one test to see if
gfs2_logd had work to do, there are now seperate tests to check if there
are two many buffers in the incore log or if there are two many items on the
active items list.

This patch is a port of a patch Steve Whitehouse wrote about a year ago, with
some minor changes. Since gfs2_ail1_start always submits all the active items,
it no longer needs to keep track of the first ai submitted, so this has been
removed. In gfs2_log_reserve(), the order of the calls to
prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has
been switched. If it called wake_up first there was a small window for a race,
where logd could run and return before gfs2_log_reserve was ready to get woken
up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve()
would be left waiting for gfs2_logd to eventualy run because it timed out.
Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times
out, and flushes the log, can now be set on mount with ar_commit.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5e687eac

11 3月, 2010 1 次提交

GFS2: Allow the number of committed revokes to temporarily be negative · 2e95e3f6

由 Benjamin Marzinski 提交于 3月 10, 2010

GFS2 tracks the number of revokes and unrevokes that are part of committed
transactions via sd_log_commited_revoke. It is possible for one process to add
revokes during its transaction, while another process unrevokes them during its
transaction. If the second process finishes its transaction first,
sd_log_commited_revoke will be decremented by the number of unrevokes that the
second process did, without first being incremented by the number of revokes
the first process did. This is fine, since all started transactions must be
completed before the journal can be flushed.  However, sd_log_commited_revoke
is an unsigned integer, and log_refund() causes an assertion failure if it
would go negative at the end of a transaction.  This patch makes
sd_log_commited_revoke a signed integer and allows it to go negative.
__gfs2_log_flush() still checks that it mataches the actual number of revokes.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

2e95e3f6

03 12月, 2009 1 次提交

GFS2: Tag all metadata with jid · 0ab7d13f

由 Steven Whitehouse 提交于 11月 06, 2009

There are two spare field in the header common to all GFS2
metadata. One is just the right size to fit a journal id
in it, and this patch updates the journal code so that each
time a metadata block is modified, we tag it with the journal
id of the node which is performing the modification.

The reason for this is that it should make it much easier to
debug issues which arise if we can tell which node was the
last to modify a particular metadata block.

Since the field is updated before the block is written into
the journal, each journal should only contain metadata which
is tagged with its own journal id. The one exception to this
is the journal header block, which might have a different node's
id in it, if that journal was recovered by another node in the
cluster.

Thus each journal will contain a record of which nodes recovered
it, via the journal header.

The other field in the metadata header could potentially be
used to hold information about what kind of operation was
performed, but for the time being we just zero it on each
transaction so that if we use it for that in future, we'll
know that the information (where it exists) is reliable.

I did consider using the other field to hold the journal
sequence number, however since in GFS2's journaling we write
the modified data into the journal and not the original
data, this gives no information as to what action caused the
modification, so I think we can probably come up with a better
use for those 64 bits in the future.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

0ab7d13f

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功