提交 · 2457aec63745e235bcafb7ef312b182d8682f0fc · openanolis / cloud-kernel

05 6月, 2014 1 次提交

mm: non-atomically mark page accessed during page cache allocation where possible · 2457aec6

由 Mel Gorman 提交于 6月 04, 2014

aops->write_begin may allocate a new page and make it visible only to have
mark_page_accessed called almost immediately after.  Once the page is
visible the atomic operations are necessary which is noticable overhead
when writing to an in-memory filesystem like tmpfs but should also be
noticable with fast storage.  The objective of the patch is to initialse
the accessed information with non-atomic operations before the page is
visible.

The bulk of filesystems directly or indirectly use
grab_cache_page_write_begin or find_or_create_page for the initial
allocation of a page cache page.  This patch adds an init_page_accessed()
helper which behaves like the first call to mark_page_accessed() but may
called before the page is visible and can be done non-atomically.

The primary APIs of concern in this care are the following and are used
by most filesystems.

	find_get_page
	find_lock_page
	find_or_create_page
	grab_cache_page_nowait
	grab_cache_page_write_begin

All of them are very similar in detail to the patch creates a core helper
pagecache_get_page() which takes a flags parameter that affects its
behavior such as whether the page should be marked accessed or not.  Then
old API is preserved but is basically a thin wrapper around this core
function.

Each of the filesystems are then updated to avoid calling
mark_page_accessed when it is known that the VM interfaces have already
done the job.  There is a slight snag in that the timing of the
mark_page_accessed() has now changed so in rare cases it's possible a page
gets to the end of the LRU as PageReferenced where as previously it might
have been repromoted.  This is expected to be rare but it's worth the
filesystem people thinking about it in case they see a problem with the
timing change.  It is also the case that some filesystems may be marking
pages accessed that previously did not but it makes sense that filesystems
have consistent behaviour in this regard.

The test case used to evaulate this is a simple dd of a large file done
multiple times with the file deleted on each iterations.  The size of the
file is 1/10th physical memory to avoid dirty page balancing.  In the
async case it will be possible that the workload completes without even
hitting the disk and will have variable results but highlight the impact
of mark_page_accessed for async IO.  The sync results are expected to be
more stable.  The exception is tmpfs where the normal case is for the "IO"
to not hit the disk.

The test machine was single socket and UMA to avoid any scheduling or NUMA
artifacts.  Throughput and wall times are presented for sync IO, only wall
times are shown for async as the granularity reported by dd and the
variability is unsuitable for comparison.  As async results were variable
do to writback timings, I'm only reporting the maximum figures.  The sync
results were stable enough to make the mean and stddev uninteresting.

The performance results are reported based on a run with no profiling.
Profile data is based on a separate run with oprofile running.

async dd
                                    3.15.0-rc3            3.15.0-rc3
                                       vanilla           accessed-v2
ext3    Max      elapsed     13.9900 (  0.00%)     11.5900 ( 17.16%)
tmpfs	Max      elapsed      0.5100 (  0.00%)      0.4900 (  3.92%)
btrfs   Max      elapsed     12.8100 (  0.00%)     12.7800 (  0.23%)
ext4	Max      elapsed     18.6000 (  0.00%)     13.3400 ( 28.28%)
xfs	Max      elapsed     12.5600 (  0.00%)      2.0900 ( 83.36%)

The XFS figure is a bit strange as it managed to avoid a worst case by
sheer luck but the average figures looked reasonable.

        samples percentage
ext3       86107    0.9783  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
ext3       23833    0.2710  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
ext3        5036    0.0573  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
ext4       64566    0.8961  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
ext4        5322    0.0713  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
ext4        2869    0.0384  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
xfs        62126    1.7675  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
xfs         1904    0.0554  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
xfs          103    0.0030  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
btrfs      10655    0.1338  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
btrfs       2020    0.0273  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
btrfs        587    0.0079  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed
tmpfs      59562    3.2628  vmlinux-3.15.0-rc4-vanilla        mark_page_accessed
tmpfs       1210    0.0696  vmlinux-3.15.0-rc4-accessed-v3r25 init_page_accessed
tmpfs         94    0.0054  vmlinux-3.15.0-rc4-accessed-v3r25 mark_page_accessed

[akpm@linux-foundation.org: don't run init_page_accessed() against an uninitialised pointer]
Signed-off-by: NMel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Tested-by: NPrabhakar Lad <prabhakar.csengg@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2457aec6

01 4月, 2014 1 次提交

GFS2: Fix address space from page function · 1b2ad412

由 Steven Whitehouse 提交于 3月 31, 2014

Now that rgrps use the address space which is part of the super
block, we need to update gfs2_mapping2sbd() to take account of
that. The only way to do that easily is to use a different set
of address_space_operations for rgrps.
Reported-by: NAbhi Das <adas@redhat.com>
Tested-by: NAbhi Das <adas@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

1b2ad412

25 2月, 2014 1 次提交

GFS2: Move log buffer accounting to transaction · 022ef4fe

由 Steven Whitehouse 提交于 2月 21, 2014

Now we have a master transaction into which other transactions
are merged, the accounting can be done using this master
transaction. We no longer require the superblock fields which
were being used for this function.

In addition, this allows for a clean up in calc_reserved()
making it rather easier understand. Also, by reducing the
number of variables used to track the buffers being added
and removed from the journal, a number of error checks are
now no longer required.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

022ef4fe

03 1月, 2014 1 次提交

GFS2: Use only a single address space for rgrps · 70d4ee94

由 Steven Whitehouse 提交于 12月 06, 2013

Prior to this patch, GFS2 had one address space for each rgrp,
stored in the glock. This patch changes them to use a single
address space in the super block. This therefore saves
(sizeof(struct address_space) * nr_of_rgrps) bytes of memory
and for large filesystems, that can be significant.

It would be nice to be able to do something similar and merge
the inode metadata address space into the same global
address space. However, that is rather more complicated as the
on-disk location doesn't have a 1:1 mapping with the inodes in
general. So while it could be done, it will be a more complicated
operation as it requires changing a lot more code paths.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

70d4ee94

14 12月, 2013 1 次提交

GFS2: Fix slab memory leak in gfs2_bufdata · 502be2a3

由 Bob Peterson 提交于 12月 13, 2013

This patch fixes a slab memory leak that sometimes can occur
for files with a very short lifespan. The problem occurs when
a dinode is deleted before it has gotten to the journal properly.
In the leak scenario, the bd object is pinned for journal
committment (queued to the metadata buffers queue: sd_log_le_buf)
but is subsequently unpinned and dequeued before it finds its way
to the ail or the revoke queue. In this rare circumstance, the bd
object needs to be freed from slab memory, or it is forgotten.
We have to be very careful how we do it, though, because
multiple processes can call gfs2_remove_from_journal. In order to
avoid double-frees, only the process that does the unpinning is
allowed to free the bd.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

502be2a3

20 8月, 2013 1 次提交

GFS2: Move gfs2_sync_meta to lops.c · 7c0ef28a

由 Steven Whitehouse 提交于 7月 03, 2013

Since gfs2_sync_meta() is only called from a single file, lets move
it to lops.c where it is used, and mark it static. At the same
time, we can clean up the meta_io.h header too.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

7c0ef28a

19 6月, 2013 1 次提交

GFS2: aggressively issue revokes in gfs2_log_flush · 5d054964

由 Benjamin Marzinski 提交于 6月 14, 2013

This patch looks at all the outstanding blocks in all the transactions
on the log, and moves the completed ones to the ail2 list.  Then it
issues revokes for these blocks.  This will hopefully speed things up
in situations where there is a lot of contention for glocks, especially
if they are acquired serially.

revoke_lo_before_commit will issue at most one log block's full of these
preemptive revokes. The amount of reserved log space that
gfs2_log_reserve() ignores has been incremented to allow for this extra
block.

This patch also consolidates the common revoke instructions into one
function, gfs2_add_revoke().
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5d054964

08 4月, 2013 1 次提交

GFS2: replace gfs2_ail structure with gfs2_trans · 16ca9412

由 Benjamin Marzinski 提交于 4月 05, 2013

In order to allow transactions and log flushes to happen at the same
time, gfs2 needs to move the transaction accounting and active items
list code into the gfs2_trans structure.  As a first step toward this,
this patch removes the gfs2_ail structure, and handles the active items
list in the gfs_trans structure.  This keeps gfs2 from allocating an ail
structure on log flushes, and gives us a struture that can later be used
to store the transaction accounting outside of the gfs2 superblock
structure.

With this patch, at the end of a transaction, gfs2 will add the
gfs2_trans structure to the superblock if there is not one already.
This structure now has the active items fields that were previously in
gfs2_ail.  This is not necessary in the case where the transaction was
simply used to add revokes, since these are never written outside of the
journal, and thus, don't need an active items list.

Also, in order to make sure that the transaction structure is not
removed while it's still in use by gfs2_trans_end, unlocking the
sd_log_flush_lock has to happen slightly later in ending the
transaction.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

16ca9412

29 1月, 2013 1 次提交

GFS2: Merge gfs2_attach_bufdata() into trans.c · c76c4d96

由 Steven Whitehouse 提交于 12月 14, 2012

The locking in gfs2_attach_bufdata() was type specific (data/meta)
which made the function rather confusing. This patch moves the core
of gfs2_attach_bufdata() into trans.c renaming it gfs2_alloc_bufdata()
and moving the locking into gfs2_trans_add_data()/gfs2_trans_add_meta()

As a result all of the locking related to adding data and metadata to
the journal is now in these two functions. This should help to clarify
what is going on, and give us some opportunities to simplify in
some cases.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c76c4d96

04 8月, 2012 1 次提交

gfs2: nuke pdflush from comments · e76e0ec9

由 Artem Bityutskiy 提交于 7月 25, 2012

The pdflush thread is long gone, so this patch removes references to pdflush
from gfs comments.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

e76e0ec9

28 6月, 2012 1 次提交

GFS2: Fixing double brelse'ing bh allocated in gfs2_meta_read when EIO occurs · 44b8db13

由 Masatake YAMATO 提交于 6月 18, 2012

This patch fixes buffer_head double free in following code path:

gfs2_block_map
=> gfs2_meta_inode_buffer
 => gfs2_meta_indirect_buffer
  => gfs2_meta_read
=> release_metapath

gfs2_block_map calls gfs2_meta_inode_buffer with &mp.mp_bh[0]
as an argument. mp.mp_bh are filled with zero at the beginning
of gfs2_block_map.

If gfs2_meta_inode_buffer returns non-zero value, gfs2_block_map
calls release_metapath to free buffers chained to mp.mp_bh.
release_metapath checks each slot of mp.mp_bh[i] and
free(with brelse) unless the slot is filled with NULL.

&mp.mp_bh[0] passed to gfs2_meta_inode_buffer is filled at
gfs2_meta_read. gfs2_meta_read is filled a buffer allocated with
gfs2_getbuf even if EIO occurs. When EIO occurs, the allocated buffer
is brelse'ed though the pointer(wrong poiner) points the brelse'ed is
passed back to caller via an argument bhp.

gfs2_meta_indirect_buffer, the caller also pass the wrong pointer
to its caller with EIO. Finally gfs2_block_map gets both EIO and
&mp.mp_bh[0] filled with the wrong pointer. release_metapath
calls brelse again on the wrong pointer.
Signed-off-by: NMasatake YAMATO <yamato@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

44b8db13

11 5月, 2012 1 次提交

GFS2: Eliminate unused "new" parameter to gfs2_meta_indirect_buffer · f2f9c812

由 Bob Peterson 提交于 5月 10, 2012

It turns out that the "new" parameter to function gfs2_meta_indirect_buffer
was always being passed in as zero. Therefore, this patch eliminates it
and simplifies the function.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

f2f9c812

02 5月, 2012 1 次提交

GFS2: eliminate log elements and simplify · c0752aa7

由 Bob Peterson 提交于 5月 01, 2012

This patch eliminates the gfs2_log_element data structure and
rolls its two components into the gfs2_bufdata. This makes the code
easier to understand and makes it easier to migrate to a rbtree
to keep the list sorted.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c0752aa7

24 4月, 2012 1 次提交

GFS2: Remove bd_list_tr · c50b91c4

由 Steven Whitehouse 提交于 4月 16, 2012

This is another clean up in the logging code. This per-transaction
list was largely unused. Its main function was to ensure that the
number of buffers in a transaction was correct, however that counter
was only used to check the number of buffers in the bd_list_tr, plus
an assert at the end of each transaction. With the assert now changed
to use the calculated buffer counts, we can remove both bd_list_tr and
its associated counter.

This should make the code easier to understand as well as shrinking
a couple of structures.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c50b91c4

08 11月, 2011 1 次提交

GFS2: Fix up REQ flags · 20ed0535

由 Steven Whitehouse 提交于 10月 31, 2011

Christoph has split up REQ_PRIO from REQ_META. That means that
we can drop REQ_PRIO from places where is it not needed. I'm
not at all sure that the combination WRITE_FLUSH_FUA | REQ_PRIO
makes any kind of sense, anyway.

In addition, I've added REQ_META to one place in the code where
it was missing. REQ_PRIO has been left for read/writes triggered
by glock acquisition and writeback only. We can adjust it again
if required, but these are the most important points from a
performance perspective.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>

20ed0535

23 8月, 2011 1 次提交

block: separate priority boosting from REQ_META · 65299a3b

由 Christoph Hellwig 提交于 8月 23, 2011

Add a new REQ_PRIO to let requests preempt others in the cfq I/O schedule,
and lave REQ_META purely for marking requests as metadata in blktrace.

All existing callers of REQ_META except for XFS are updated to also
set REQ_PRIO for now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NNamhyung Kim <namhyung@gmail.com>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

65299a3b

20 4月, 2011 1 次提交

GFS2: Improve tracing support (adds two flags) · 627c10b7

由 Steven Whitehouse 提交于 4月 14, 2011

This adds support for two new flags. One keeps track of whether
the glock is on the LRU list or not. The other isn't really a
flag as such, but an indication of whether the glock has an
attached object or not. This indication is reported without
any locking, which is ok since we do not dereference the object
pointer but merely report whether it is NULL or not.

Also, this fixes one place where a tracepoint was missing, which
was at the point we remove deallocated blocks from the journal.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

627c10b7

14 3月, 2011 1 次提交

GFS2: Update to AIL list locking · c618e87a

由 Steven Whitehouse 提交于 3月 14, 2011

The previous patch missed a couple of places where the AIL list
needed locking, so this fixes up those places, plus a comment
is corrected too.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Dave Chinner <dchinner@redhat.com>

c618e87a

10 3月, 2011 2 次提交

block: kill off REQ_UNPLUG · 721a9602

由 Jens Axboe 提交于 3月 09, 2011

With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

721a9602

block: remove per-queue plugging · 7eaceacc

由 Jens Axboe 提交于 3月 10, 2011

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

7eaceacc

27 10月, 2010 1 次提交

writeback: remove nonblocking/encountered_congestion references · 1b430bee

由 Wu Fengguang 提交于 10月 26, 2010

This removes more dead code that was somehow missed by commit 0d99519e
(writeback: remove unused nonblocking and congestion checks).  There are
no behavior change except for the removal of two entries from one of the
ext4 tracing interface.

The nonblocking checks in ->writepages are no longer used because the
flusher now prefer to block on get_request_wait() than to skip inodes on
IO congestion.  The latter will lead to more seeky IO.

The nonblocking checks in ->writepage are no longer used because it's
redundant with the WB_SYNC_NONE check.

We no long set ->nonblocking in VM page out and page migration, because
a) it's effectively redundant with WB_SYNC_NONE in current code
b) it's old semantic of "Don't get stuck on request queues" is mis-behavior:
   that would skip some dirty inodes on congestion and page out others, which
   is unfair in terms of LRU age.

Inspired by Christoph Hellwig. Thanks!
Signed-off-by: NWu Fengguang <fengguang.wu@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: David Howells <dhowells@redhat.com>
Cc: Sage Weil <sage@newdream.net>
Cc: Steve French <sfrench@samba.org>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1b430bee

08 8月, 2010 1 次提交

block: unify flags for struct bio and struct request · 7b6d91da

由 Christoph Hellwig 提交于 8月 07, 2010

Remove the current bio flags and reuse the request flags for the bio, too.
This allows to more easily trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superflous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <jaxboe@fusionio.com>

7b6d91da

12 5月, 2010 1 次提交

GFS2: Eliminate useless err variable · eaefbf96

由 Bob Peterson 提交于 5月 11, 2010

This patch removes an unneeded "err" variable that is always
returned as zero.

Signed-off-by: Bob Peterson <rpeterso@redhat.com> 
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

eaefbf96

05 5月, 2010 1 次提交

GFS2: Various gfs2_logd improvements · 5e687eac

由 Benjamin Marzinski 提交于 5月 04, 2010

This patch contains various tweaks to how log flushes and active item writeback
work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reseve now waits
for gfs2_logd to do the log flushing. Multiple functions were rewritten to
remove the need to call gfs2_log_lock(). Instead of using one test to see if
gfs2_logd had work to do, there are now seperate tests to check if there
are two many buffers in the incore log or if there are two many items on the
active items list.

This patch is a port of a patch Steve Whitehouse wrote about a year ago, with
some minor changes. Since gfs2_ail1_start always submits all the active items,
it no longer needs to keep track of the first ai submitted, so this has been
removed. In gfs2_log_reserve(), the order of the calls to
prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has
been switched. If it called wake_up first there was a small window for a race,
where logd could run and return before gfs2_log_reserve was ready to get woken
up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve()
would be left waiting for gfs2_logd to eventualy run because it timed out.
Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times
out, and flushes the log, can now be set on mount with ar_commit.
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5e687eac

01 3月, 2010 1 次提交

GFS2: Metadata address space clean up · 009d8518

由 Steven Whitehouse 提交于 12月 08, 2009

Since the start of GFS2, an "extra" inode has been used to store
the metadata belonging to each inode. The only reason for using
this inode was to have an extra address space, the other fields
were unused. This means that the memory usage was rather inefficient.

The reason for keeping each inode's metadata in a separate address
space is that when glocks are requested on remote nodes, we need to
be able to efficiently locate the data and metadata which relating
to that glock (inode) in order to sync or sync and invalidate it
(depending on the remotely requested lock mode).

This patch adds a new type of glock, which has in addition to
its normal fields, has an address space. This applies to all
inode and rgrp glocks (but to no other glock types which remain
as before). As a result, we no longer need to have the second
inode.

This results in three major improvements:
 1. A saving of approx 25% of memory used in caching inodes
 2. A removal of the circular dependency between inodes and glocks
 3. No confusion between "normal" and "metadata" inodes in super.c

Although the first of these is the more immediately apparent, the
second is just as important as it now enables a number of clean
ups at umount time. Those will be the subject of future patches.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

009d8518

11 1月, 2010 1 次提交

GFS2: Use MAX_LFS_FILESIZE for meta inode size · ba198098

由 Steven Whitehouse 提交于 1月 08, 2010

Using ~0ULL was cauing sign issues in filemap_fdatawrite_range, so
use MAX_LFS_FILESIZE instead.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

ba198098

22 5月, 2009 1 次提交

GFS2: Clean up some file names · b1e71b06

由 Steven Whitehouse 提交于 5月 22, 2009

This patch renames the ops_*.c files which have no counterpart
without the ops_ prefix in order to shorten the name and make
it more readable. In addition, ops_address.h (which was very
small) is moved into inode.h and inode.h is cleaned up by
adding extern where required.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

b1e71b06

11 5月, 2009 2 次提交

GFS2: Optimise writepage for metadata · 4a0f9a32

由 Steven Whitehouse 提交于 4月 20, 2009

This adds a GFS2 specific writepage for metadata, rather than
continuing to use the VFS function. As a result we now tag all
our metadata I/O with the correct flag so that blktraces will
now be less confusing.

Also, the generic function was checking for a number of corner
cases which cannot happen on the metadata address spaces so that
this should be faster too.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

4a0f9a32

GFS2: Update the rw flags · c969f58c

由 Steven Whitehouse 提交于 4月 07, 2009

After Jens recent updates:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a1f242524c3c1f5d40f1c9c343427e34d1aadd6e
et al. this is a patch to bring gfs2 uptodate with the core
code. Also I've managed to squash another call to ll_rw_block()
along the way.

There is still one part of the GFS2 I/O paths which are not correctly
annotated and that is due to the sharing of the writeback code between
the data and metadata address spaces. I would like to change that too,
but this patch is still worth doing on its own, I think.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c969f58c

24 3月, 2009 2 次提交

GFS2: Clean up of glops.c · 6bac243f

由 Steven Whitehouse 提交于 3月 09, 2009

This cleans up a number of bits of code mostly based in glops.c.
A couple of simple functions have been merged into the callers
to make it more obvious what is going on, the mysterious raising
of i_writecount around the truncate_inode_pages() call has been
removed. The meta_go_* operations have been renamed rgrp_go_*
since that is the only lock type that they are used with.

The unused argument of gfs2_read_sb has been removed. Also
a bug has been fixed where a check for the rindex inode was
in the wrong callback. More comments are added, and the
debugging code is improved too.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

6bac243f

GFS2: Merge lock_dlm module into GFS2 · f057f6cd

由 Steven Whitehouse 提交于 1月 12, 2009

This is the big patch that I've been working on for some time
now. There are many reasons for wanting to make this change
such as:
 o Reducing overhead by eliminating duplicated fields between structures
 o Simplifcation of the code (reduces the code size by a fair bit)
 o The locking interface is now the DLM interface itself as proposed
   some time ago.
 o Fewer lookups of glocks when processing replies from the DLM
 o Fewer memory allocations/deallocations for each glock
 o Scope to do further optimisations in the future (but this patch is
   more than big enough for now!)

Please note that (a) this patch relates to the lock_dlm module and
not the DLM itself, that is still a separate module; and (b) that
we retain the ability to build GFS2 as a standalone single node
filesystem with out requiring the DLM.

This patch needs a lot of testing, hence my keeping it I restarted
my -git tree after the last merge window. That way, this has the maximum
exposure before its merged. This is (modulo a few minor bug fixes) the
same patch that I've been posting on and off the the last three months
and its passed a number of different tests so far.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

f057f6cd

27 6月, 2008 1 次提交

[GFS2] Clean up the glock core · 6802e340

由 Steven Whitehouse 提交于 5月 21, 2008

This patch implements a number of cleanups to the core of the
GFS2 glock code. As a result a lot of code is removed. It looks
like a really big change, but actually a large part of this patch
is either removing or moving existing code.

There are some new bits too though, such as the new run_queue()
function which is considerably streamlined. Highlights of this
patch include:

 o Fixes a cluster coherency bug during SH -> EX lock conversions
 o Removes the "glmutex" code in favour of a single bit lock
 o Removes the ->go_xmote_bh() for inodes since it was duplicating
   ->go_lock()
 o We now only use the ->lm_lock() function for both locks and
   unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED)
 o The fast path is considerably shortly, giving performance gains
   especially with lock_nolock
 o The glock_workqueue is now used for all the callbacks from the DLM
   which allows us to simplify the lock_dlm module (see following patch)
 o The way is now open to make further changes such as eliminating the two
   threads (gfs2_glockd and gfs2_scand) in favour of a more efficient
   scheme.

This patch has undergone extensive testing with various test suites
so it should be pretty stable by now.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>

6802e340

12 5月, 2008 1 次提交

[GFS2] filesystem consistency error from do_strip · 091806ed

由 Bob Peterson 提交于 4月 29, 2008

This patch fixes a GFS2 filesystem consistency error reported from
function do_strip.  The problem was caused by a timing window
that allowed two vfs inodes to be created in memory that point
to the same file.  The problem is fixed by making the vfs's
iget_test, iget_set mechanism check and set a new bit in the
in-core gfs2_inode structure while the vfs inode spin_lock is held.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

091806ed

25 1月, 2008 4 次提交

[GFS2] Fix typo · 3e5cd087

由 Bob Peterson 提交于 1月 16, 2008

This patch fixes a minor typo.  Surprisingly, it still compiled.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

3e5cd087

[GFS2] Function meta_read optimization · 15c7cee7

由 Bob Peterson 提交于 12月 11, 2007

This patch optimizes function gfs2_meta_read.  Basically, gfs2_meta_wait
was being called regardless of whether a disk read was requested.
This just pulls that wait into the if that triggers the read.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

15c7cee7

[GFS2] Add sync_page to metadata address space operations · 52d4c74b

由 Steven Whitehouse 提交于 11月 01, 2007

This set of address space operations was missing a sync_page
operation.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

52d4c74b

[GFS2] Remove useless i_cache from inodes · f91a0d3e

由 Steven Whitehouse 提交于 10月 15, 2007

The i_cache was designed to keep references to the indirect blocks
used during block mapping so that they didn't have to be looked
up continually. The idea failed because there are too many places
where the i_cache needs to be freed, and this has in the past been
the cause of many bugs.

In addition there was no performance benefit being gained since the
disk blocks in question were cached anyway. So this patch removes
it in order to simplify the code to prepare for other changes which
would otherwise have had to add further support for this feature.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

f91a0d3e

10 10月, 2007 3 次提交

[GFS2] Data corruption fix · de986e85

由 Wendy Cheng 提交于 9月 18, 2007

* GFS2 has been using i_cache array to store its indirect meta blocks.
Its flush routine doesn't correctly clean up all the entries. The
problem would show while multiple nodes do simultaneous writes to the
same file. Upon glock exclusive lock transfer, if the file is a sparse
file with large file size where the indirect meta blocks span multiple
array entries with "zero" entries in between. The flush routine
prematurely stops the flushing that leaves old (stale) entries around.
This leads to several nasty issues, including data corruption.
* Fix gfs2_get_block_noalloc checking to correctly return EIO upon
unmapped buffer.
Signed-off-by: NWendy Cheng <wcheng@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

de986e85

[GFS2] Clean up journaled data writing · 16615be1

由 Steven Whitehouse 提交于 9月 17, 2007

This patch cleans up the code for writing journaled data into the log.
It also removes the need to allocate a small "tag" structure for each
block written into the log. Instead we just keep count of the outstanding
I/O so that we can be sure that its all been written at the correct time.
Another result of this patch is that a number of ll_rw_block() calls
have become submit_bh() calls, closing some races at the same time.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

16615be1

[GFS2] Clean up gfs2_trans_add_revoke() · 1ad38c43

由 Steven Whitehouse 提交于 9月 03, 2007

The following alters gfs2_trans_add_revoke() to take a struct
gfs2_bufdata as an argument. This eliminates the memory allocation which
was previously required by making use of the already existing struct
gfs2_bufdata. It makes some sanity checks to ensure that the
gfs2_bufdata has been removed from all the lists before its recycled as
a revoke structure. This saves one memory allocation and one free per
revoke structure.

Also as a result, and to simplify the locking, since there is no longer
any blocking code in gfs2_trans_add_revoke() we must hold the log lock
whenever this function is called. This reduces the amount of times we
take and unlock the log lock.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

1ad38c43

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功