Commits · e7d4cb6bc19658646357eeff134645cd9bc3479f · openeuler / raspberrypi-kernel

14 Oct, 2008 3 commits

ocfs2: Abstract ocfs2_extent_tree in b-tree operations. · e7d4cb6b

Authored by Tao Ma on Aug 18, 2008

The old extent tree operations assume that the ocfs2_extent_list in
ocfs2_dinode is the tree root. Since xattr will also use
ocfs2_extent_list to store large values for an xattr entry, we refactor
the tree operations so that xattr can use them directly.

The refactoring includes 4 steps:
1. Abstract set/get of last_eb_blk and update_clusters since they may
   be stored in different locations for dinode and xattr.
2. Add a new structure named ocfs2_extent_tree to indicate the
   extent tree the operation will work on.
3. Remove all uses of fe_bh and di; use root_bh and root_el in the
   extent tree instead. All uses of fe_bh are now replaced with
   et->root_bh, and el with root_el accordingly.
4. Make ocfs2_lock_allocators generic. Currently it can only be used
   for file extend allocation, but the whole function is also useful when
   we want to store large EAs.

Note: This patch doesn't touch ocfs2_commit_truncate() since it is not used
for anything other than truncating inode data btrees.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
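
The abstraction described in steps 1 and 2 of this commit can be pictured with a small user-space sketch. It is an illustration only, not the kernel's definitions: every `demo_` name, the field layouts, and the `demo_record_growth()` helper are hypothetical and chosen just to show how generic tree code can update `last_eb_blk` and the cluster count without knowing which structure roots the tree.

```c
#include <stdint.h>

/* Hypothetical stand-ins for two roots that keep the same bookkeeping
 * fields in different places. These are not the on-disk ocfs2 layouts. */
struct demo_dinode {
    uint32_t i_clusters;
    uint64_t i_last_eb_blk;
};

struct demo_xattr_root {
    uint64_t xr_last_eb_blk;
    uint32_t xr_clusters;
};

struct demo_extent_tree;

/* Step 1: the accessors are hidden behind per-root operations. */
struct demo_et_ops {
    void (*set_last_eb_blk)(struct demo_extent_tree *et, uint64_t blkno);
    uint64_t (*get_last_eb_blk)(const struct demo_extent_tree *et);
    void (*update_clusters)(struct demo_extent_tree *et, uint32_t added);
};

/* Step 2: the handle that generic b-tree code works on. */
struct demo_extent_tree {
    const struct demo_et_ops *ops;
    void *root; /* whichever structure roots this tree */
};

static void di_set_last_eb_blk(struct demo_extent_tree *et, uint64_t blkno)
{
    ((struct demo_dinode *)et->root)->i_last_eb_blk = blkno;
}

static uint64_t di_get_last_eb_blk(const struct demo_extent_tree *et)
{
    return ((const struct demo_dinode *)et->root)->i_last_eb_blk;
}

static void di_update_clusters(struct demo_extent_tree *et, uint32_t added)
{
    ((struct demo_dinode *)et->root)->i_clusters += added;
}

static void xr_set_last_eb_blk(struct demo_extent_tree *et, uint64_t blkno)
{
    ((struct demo_xattr_root *)et->root)->xr_last_eb_blk = blkno;
}

static uint64_t xr_get_last_eb_blk(const struct demo_extent_tree *et)
{
    return ((const struct demo_xattr_root *)et->root)->xr_last_eb_blk;
}

static void xr_update_clusters(struct demo_extent_tree *et, uint32_t added)
{
    ((struct demo_xattr_root *)et->root)->xr_clusters += added;
}

static const struct demo_et_ops demo_dinode_ops = {
    di_set_last_eb_blk, di_get_last_eb_blk, di_update_clusters
};

static const struct demo_et_ops demo_xattr_ops = {
    xr_set_last_eb_blk, xr_get_last_eb_blk, xr_update_clusters
};

/* Generic code: it has no knowledge of which root it is updating. */
static void demo_record_growth(struct demo_extent_tree *et,
                               uint64_t new_last_eb_blk, uint32_t added)
{
    et->ops->set_last_eb_blk(et, new_last_eb_blk);
    et->ops->update_clusters(et, added);
}
```

A caller would build a `struct demo_extent_tree` with either ops table and pass it to `demo_record_growth()`; the same call then updates a dinode-style root or an xattr-style root.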

ocfs2: Use ocfs2_extent_list instead of ocfs2_dinode. · 811f933d

Authored by Tao Ma on Aug 18, 2008

ocfs2_extend_meta_needed(), ocfs2_calc_extend_credits() and
ocfs2_reserve_new_metadata() are all useful for extent tree operations. But
they are all limited to an inode btree because they use a struct
ocfs2_dinode parameter. Change their parameter to struct ocfs2_extent_list
(the part of an ocfs2_dinode they actually use) so that the xattr btree code
can use these functions.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Modify ocfs2_num_free_extents for future xattr usage. · 231b87d1

Authored by Tao Ma on Aug 18, 2008

ocfs2_num_free_extents() is used to find the number of free extent records
in an inode btree. Hence, it takes an "ocfs2_dinode" parameter. We want to
use this for extended attribute trees in the future, so genericize the
interface to take a buffer head. A future patch will allow that buffer_head
to contain any structure rooting an ocfs2 btree.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

10 Sep, 2008 1 commit

ocfs2: Fix a bug in direct IO read. · 0e116227

Authored by Tao Ma on Sep 03, 2008

ocfs2 will become read-only if we try to read bytes past the end of
i_size. This can be easily reproduced with the following steps:
1. mkfs an ocfs2 volume with bs=4k cs=4k and nosparse.
2. Create a small file (say less than 100 bytes); the file is allocated
   1 cluster.
3. Read 8196 bytes from the file using O_DIRECT, which exceeds the limit.
4. The ocfs2 volume becomes read-only and dmesg shows:
OCFS2: ERROR (device sda13): ocfs2_direct_IO_get_blocks:
Inode 66010 has a hole at block 1
File system is now read-only due to the potential of on-disk corruption.
Please run fsck.ocfs2 once the file system is unmounted.

So suppress the ERROR message.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

01 Aug, 2008 1 commit

[PATCH] ocfs2: Fix oops when racing files truncates with writes into an mmap region · 961cecbe

Authored by Sunil Mushran on Jul 16, 2008

This patch fixes an oops that is reproduced when one races writes to a mmap-ed
region with another process truncating the file.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

17 Jul, 2008 1 commit

[PATCH] ocfs2: fix oops in mmap_truncate testing · c0420ad2

Authored by Coly Li on Jun 30, 2008

This patch fixes a mmap_truncate bug which was found by the ocfs2 test suite.

In an ocfs2 cluster with more than 1 node, run the program mmap_truncate,
which races mmap writes and truncates from multiple processes. While the
test is running, a stat from another node forces writeout, causing an oops in
ocfs2_get_block() because it sees a buffer to write which isn't allocated.

This patch fixes the bug by clearing the dirty and uptodate bits in the
buffer, leaving the buffer unmapped, and returning.

The fix was suggested by Mark Fasheh, and I coded up the patch.
Signed-off-by: Coly Li <coyli@suse.de>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

18 Apr, 2008 1 commit

fs/ocfs2/aops.c: test for IS_ERR rather than 0 · 58dadcdb

Authored by Julia Lawall on Mar 28, 2008

The function ocfs2_start_trans always returns either a valid pointer or a
value made with ERR_PTR, so its result should be tested with IS_ERR, not
with a test for 0.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
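
The reason a test for 0 misses the failure can be shown with a user-space imitation of the error-pointer convention. This is a sketch under stated assumptions: the `demo_` helpers only mimic the idea behind the kernel's ERR_PTR/IS_ERR/PTR_ERR, and `demo_start_trans()` is a hypothetical stand-in for a function that returns either a valid pointer or an encoded errno, never NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (imitation of ERR_PTR). */
static void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer (imitation of PTR_ERR). */
static long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True when the pointer lies in the top DEMO_MAX_ERRNO addresses,
 * which is where encoded errors land (imitation of IS_ERR). */
static int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static int demo_handle_storage;

/* Hypothetical allocator: a valid handle on success, an encoded
 * -ENOMEM on failure. It never returns NULL. */
static void *demo_start_trans(int fail)
{
    if (fail)
        return demo_err_ptr(-ENOMEM);
    return &demo_handle_storage;
}

/* Wrong check: a failed call is not NULL, so this reports success. */
static int demo_failed_by_null_test(const void *handle)
{
    return handle == NULL;
}

/* Right check for this convention. */
static int demo_failed_by_is_err(const void *handle)
{
    return demo_is_err(handle);
}
```

With `demo_start_trans(1)`, the NULL test reports no failure while the IS_ERR-style test does, which is the mismatch the commit corrects.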

04 Mar, 2008 1 commit

[PATCH] fs/ocfs2/aops.c: Correct use of ! and & · 86c838b0

Authored by Julia Lawall on Feb 26, 2008

In commit e6bafba5, a bug was fixed that
involved converting !x & y to !(x & y).  The code below shows the same
pattern, and thus should perhaps be fixed in the same way.

This is not tested and clearly changes the semantics, so it is only
something to consider.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
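
The difference between the two expressions comes from C operator precedence: unary `!` binds tighter than binary `&`, so `!x & y` means `(!x) & y`. A minimal sketch, with a hypothetical flag value `DEMO_FLAG` that is not taken from the ocfs2 source:

```c
#define DEMO_FLAG 0x4u

/* Parsed as (!x) & DEMO_FLAG. Since !x is 0 or 1 and DEMO_FLAG has
 * only bit 2 set, the result is 0 for every x. */
static int demo_unparenthesized(unsigned x)
{
    return !x & DEMO_FLAG;
}

/* True exactly when the flag bit is clear in x. */
static int demo_parenthesized(unsigned x)
{
    return !(x & DEMO_FLAG);
}
```

For `x == 0` the first form yields 0 and the second yields 1, so the two are not interchangeable, which is why the commit message notes that the change alters semantics.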

06 Feb, 2008 1 commit

Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user · eebd2aa3

Authored by Christoph Lameter on Feb 04, 2008

Simplify page cache zeroing of segments of pages through 3 functions

zero_user_segments(page, start1, end1, start2, end2)

        Zeros two segments of the page. It takes the position where to
        start and end the zeroing which avoids length calculations and
        makes code clearer.

zero_user_segment(page, start, end)

        Same for a single segment.

zero_user(page, start, length)

        Length variant for the case where we know the length.
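
The relationship between the three helpers can be sketched in user space on a plain byte buffer. This is an illustration only, assuming a 4096-byte page and exclusive end offsets; the `demo_` names are hypothetical, and the kmap and cache-flush handling that the real helpers need is left out.

```c
#include <assert.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u

/* Zero [start1, end1) and [start2, end2) within one page-sized buffer.
 * Taking start/end positions avoids length arithmetic at call sites. */
static void demo_zero_segments(unsigned char *page,
                               unsigned start1, unsigned end1,
                               unsigned start2, unsigned end2)
{
    assert(end1 <= DEMO_PAGE_SIZE && end2 <= DEMO_PAGE_SIZE);

    if (end1 > start1)
        memset(page + start1, 0, end1 - start1);
    if (end2 > start2)
        memset(page + start2, 0, end2 - start2);
}

/* Single-segment variant: the second segment is empty. */
static void demo_zero_segment(unsigned char *page,
                              unsigned start, unsigned end)
{
    demo_zero_segments(page, start, end, 0, 0);
}

/* Length variant for callers that already know the length. */
static void demo_zero(unsigned char *page, unsigned start, unsigned length)
{
    demo_zero_segments(page, start, start + length, 0, 0);
}
```

A caller that needs to clear the head and tail of a page around a region it is about to fill would use the two-segment form once rather than computing two lengths.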

We remove the zero_user_page macro. Issues:

1. It's a macro. Inline functions are preferable.

2. The KM_USER0 macro is only defined for HIGHMEM.

   Having to treat this special case everywhere makes the
   code needlessly complex. The parameter for zeroing is always
   KM_USER0 except in one single case that we open code.

Avoiding KM_USER0 means a lot of code no longer has to deal
with the special casing for HIGHMEM. Dealing with
kmap is only necessary for HIGHMEM configurations. In those
configurations we use KM_USER0 like we do for a series of other
functions defined in highmem.h.

Since KM_USER0 depends on HIGHMEM, the existing zero_user_page
had to be a macro and could not be an inline function. The zero_user_*
functions introduced here can be inline because that constant is not
used when these functions are called.

Also extract the flushing of the caches to be outside of the kmap.

[akpm@linux-foundation.org: fix nfs and ntfs build]
[akpm@linux-foundation.org: fix ntfs build some more]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Steven French <sfrench@us.ibm.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Steven French <sfrench@us.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

26 Jan, 2008 4 commits

ocfs2: Safer read_inline_data() · d2849fb2

Authored by Jan Kara on Dec 19, 2007

In ocfs2_read_inline_data() we should store the file size in loff_t. Although
the file size should fit in 32 bits, we cannot be sure of that if the
filesystem is corrupted.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Readpages support · 628a24f5

Authored by Mark Fasheh on Oct 30, 2007

Add ->readpages support to Ocfs2. This is rather trivial - all it required
is a small update to ocfs2_get_block (for mapping full extents via b_size)
and an ocfs2_readpages() function which partially mirrors ocfs2_readpage().
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Rename ocfs2_meta_[un]lock · e63aecb6

Authored by Mark Fasheh on Oct 18, 2007

Call this the "inode_lock" now, since it covers both data and meta data.
This patch makes no functional changes.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Remove data locks · c934a92d

Authored by Mark Fasheh on Oct 18, 2007

The meta lock now covers both meta data and data, so this just removes the
now-redundant data lock.

Combining locks saves us a round of lock mastery per inode and one less lock
to ping between nodes during read/write.

We don't lose much - since meta locks were always held before a data lock
(and at the same level) ordered writeout mode (the default) ensured that
flushing for the meta data lock also pushed out data anyways.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

28 Nov, 2007 1 commit

ocfs2: Fix comparison in ocfs2_size_fits_inline_data() · 0d8a4e0c

Authored by Mark Fasheh on Nov 20, 2007

This was causing us to prematurely push out inline data by one byte.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
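
An off-by-one of this kind is what separates a strict comparison from an inclusive one. The sketch below is not the ocfs2 function: both `demo_` helpers and the `capacity` parameter are hypothetical, and the exact form of the pre-fix comparison is an assumption made to illustrate how data that exactly fills the inline area could be rejected one byte too early.

```c
#include <stdint.h>

/* Assumed pre-fix shape: a size equal to the capacity is rejected,
 * so the data is pushed out to an extent one byte early. */
static int demo_fits_strict(uint64_t new_size, uint16_t capacity)
{
    return new_size < capacity;
}

/* Inclusive comparison: data that exactly fills the area still fits. */
static int demo_fits_inclusive(uint64_t new_size, uint16_t capacity)
{
    return new_size <= capacity;
}
```

The two helpers agree for every size except `new_size == capacity`, which is the single byte the commit message refers to.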

07 Nov, 2007 1 commit

ocfs2: fix write() performance regression · 4e9563fd

Authored by Mark Fasheh on Nov 01, 2007

On file systems which don't support sparse files, ocfs2_map_page_blocks()
was reading blocks on appending writes. This caused write performance to
suffer dramatically. Fix this by detecting an appending write on a nonsparse
fs and skipping the read.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

17 Oct, 2007 1 commit

ocfs2: convert to new aops · b6af1bcd

Authored by Nick Piggin on Oct 16, 2007

Plug ocfs2 into the ->write_begin and ->write_end aops.

A bunch of custom code is now gone - the iovec iteration stuff during write
and the ocfs2 splice write actor.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

13 Oct, 2007 4 commits

ocfs2: Write support for inline data · 1afc32b9

Authored by Mark Fasheh on Sep 07, 2007

This fixes up write, truncate, mmap, and RESVSP/UNRESVSP to understand inline
inode data.

For the most part, the changes to the core write code can be relied on to do
the heavy lifting. Any code calling ocfs2_write_begin (including shared
writeable mmap) can count on it doing the right thing with respect to
growing inline data to an extent tree.

Size reducing truncates, including UNRESVSP, can simply zero that portion of
the inode block being removed. Size increasing truncates, including RESVSP,
have to be a little bit smarter and grow the inode to an extent tree if
necessary.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

ocfs2: Read support for inline data · 6798d35a

Authored by Mark Fasheh on Sep 07, 2007

This hooks up ocfs2_readpage() to populate a page with data from an inode
block. Direct IO reads from inline data are modified to fall back to
buffered I/O. Appropriate checks are also placed in the extent map code to
avoid reading an extent list when inline data might be stored.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

ocfs2: Small refactor of truncate zeroing code · 1d410a6e

Authored by Mark Fasheh on Sep 07, 2007

We'll want to reuse most of this when pushing inline data back out to an
extent. Keeping this part as a separate patch helps to keep the upcoming
changes for write support uncluttered.

The core portion of ocfs2_zero_cluster_pages() responsible for making sure a
page is mapped and properly dirtied is abstracted out into its own
function, ocfs2_map_and_dirty_page(). Actual functionality doesn't change,
though zeroing becomes optional.

We also turn part of ocfs2_free_write_ctxt() into a common function for
unlocking and freeing a page array. This operation is very common (and
uniform) for Ocfs2 cluster sizes greater than page size, so it makes sense
to keep the code in one place.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

ocfs2: move nonsparse hole-filling into ocfs2_write_begin() · 65ed39d6

Authored by Mark Fasheh on Aug 28, 2007

By doing this, we can remove any higher level logic which has to have
knowledge of btree functionality - any callers of ocfs2_write_begin() can
now expect it to do anything necessary to prepare the inode for new data.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

21 Sep, 2007 2 commits

ocfs2: Don't double set write parameters · 5c26a7b7

Authored by Mark Fasheh on Sep 18, 2007

The target page offsets were being incorrectly set a second time in
ocfs2_prepare_page_for_write(), which was causing problems on a 16k page
size kernel. Additionally, ocfs2_write_failure() was incorrectly using those
parameters instead of the parameters for the individual page being cleaned
up.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Fix pos/len passed to ocfs2_write_cluster · db56246c

Authored by Mark Fasheh on Sep 17, 2007

This was broken for file systems whose cluster size is greater than page
size. Pos needs to be incremented as we loop through the descriptors, and
len needs to be capped to the size of a single cluster.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
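
The two requirements named in this commit, advancing pos on each iteration and capping len to one cluster, can be sketched as a loop that splits a byte range into per-cluster pieces. This is a user-space illustration, not the ocfs2 write path: `struct demo_chunk` and `demo_split_write()` are hypothetical, and the cluster size is assumed to be a power of two given as `cluster_bits`.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_chunk {
    uint64_t pos; /* starting byte offset of this piece */
    uint32_t len; /* bytes in this piece, never crossing a cluster */
};

/* Split [pos, pos + len) into pieces that each stay inside one cluster.
 * Returns the number of pieces written to out (at most max_out). */
static size_t demo_split_write(uint64_t pos, uint32_t len,
                               unsigned cluster_bits,
                               struct demo_chunk *out, size_t max_out)
{
    const uint64_t csize = 1ull << cluster_bits;
    size_t n = 0;

    while (len > 0 && n < max_out) {
        /* Bytes left in the current cluster, starting at pos. */
        uint64_t room = csize - (pos & (csize - 1));
        uint32_t chunk = room < len ? (uint32_t)room : len;

        out[n].pos = pos;
        out[n].len = chunk;
        n++;

        pos += chunk; /* advance pos for the next cluster */
        len -= chunk;
    }
    return n;
}
```

With 4 KiB clusters, a 5000-byte write starting at offset 4000 is split into three pieces, and no piece is longer than a single cluster.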

12 Sep, 2007 1 commit

[PATCH] ocfs2: Fix a wrong cluster calculation. · 30b8548f

Authored by tao.ma@oracle.com on Sep 06, 2007

In ocfs2_alloc_write_ctxt, the number of clusters to write is calculated
from the byte length only. This causes problems if a write starts near
the end of one cluster and extends into a second cluster while "len" is
smaller than the cluster size. In that case, we actually have to write
2 clusters.
So we have to take the start position into consideration as well.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
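
The arithmetic behind this fix can be checked with a small sketch. It is an illustration under assumptions, not the ocfs2 code: both `demo_` helpers are hypothetical, and the cluster size is assumed to be a power of two expressed as `cluster_bits`.

```c
#include <stdint.h>

/* Length-only estimate: rounds len up to whole clusters, ignoring pos. */
static uint32_t demo_clusters_by_len(uint32_t len, unsigned cluster_bits)
{
    const uint64_t csize = 1ull << cluster_bits;

    return (uint32_t)((len + csize - 1) >> cluster_bits);
}

/* Clusters actually touched by the byte range [pos, pos + len). */
static uint32_t demo_clusters_spanned(uint64_t pos, uint32_t len,
                                      unsigned cluster_bits)
{
    uint64_t first, last;

    if (len == 0)
        return 0;

    first = pos >> cluster_bits;
    last = (pos + len - 1) >> cluster_bits;
    return (uint32_t)(last - first + 1);
}
```

With 4 KiB clusters, a 200-byte write at offset 4000 covers bytes 4000 through 4199, so it touches clusters 0 and 1; the length-only estimate reports 1 cluster while the position-aware count reports 2.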

20 Jul, 2007 1 commit

mm: merge populate and nopage into fault (fixes nonlinear) · 54cb8821

Authored by Nick Piggin on Jul 19, 2007

Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
the virtual address -> file offset differently from linear mappings.

->populate is a layering violation because the filesystem/pagecache code
should not need to know anything about the virtual memory mapping.  The hitch
here is that the ->nopage handler didn't pass down enough information (i.e.
pgoff).  But it is more logical to pass pgoff rather than have the ->nopage
function calculate it itself anyway (because that's a similar layering
violation).

Having the populate handler install the pte itself is likewise a nasty thing
to be doing.

This patch introduces a new fault handler that replaces ->nopage and
->populate and (later) ->nopfn.  Most of the old mechanism is still in place
so there is a lot of duplication and nice cleanups that can be removed if
everyone switches over.

The rationale for doing this in the first place is that nonlinear mappings are
subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
to duplicate the synchronisation logic rather than just consolidate the two.

After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
pagecache.  Seems like a fringe functionality anyway.

NOPAGE_REFAULT is removed.  This should be implemented with ->fault, and no
users have hit mainline yet.

[akpm@linux-foundation.org: cleanup]
[randy.dunlap@oracle.com: doc. fixes for readahead]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 Jul, 2007 9 commits

[PATCH] ocfs2: zero_user_page conversion · 54c57dc3

Authored by Eric Sandeen on Jun 20, 2007

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: Support creation of unwritten extents · 2ae99a60

Authored by Mark Fasheh on Mar 09, 2007

This can now be trivially supported with re-use of our existing extend code.

ocfs2_allocate_unwritten_extents() takes a start offset and a byte length
and iterates over the inode, adding extents (marked as unwritten) until len
is reached. Existing extents are skipped over.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: support writing of unwritten extents · b27b7cbc

Authored by Mark Fasheh on Jun 18, 2007

Update the write code to detect when the user is asking to write to an
unwritten extent. Like writing to a hole, we must zero the region between
the write and the cluster boundaries. Most of the existing cluster zeroing
logic can be re-used with some additional checks for the unwritten flag on
extent records.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: small cleanup of ocfs2_write_begin_nolock() · 0d172baa

Authored by Mark Fasheh on May 14, 2007

We can easily separate out the write descriptor setup and manipulation
into helper functions.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: plug truncate into cached dealloc routines · 59a5e416

Authored by Mark Fasheh on Jun 22, 2007

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: harden buffer check during mapping of page blocks · bce99768

Authored by Mark Fasheh on Jun 18, 2007

We don't want to submit buffer_new blocks for read i/o. This actually won't
happen right now because those requests during an allocating write are all nicely
aligned. It's probably a good idea to provide an explicit check though.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: shared writeable mmap · 7307de80

Authored by Mark Fasheh on May 09, 2007

Implement cluster consistent shared writeable mappings using the
->page_mkwrite() callback.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: factor out write aops into nolock variants · 607d44aa

Authored by Mark Fasheh on May 09, 2007

ocfs2_mkwrite() will want this so that it can add some mmap specific checks
before asking for a write.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: rework ocfs2_buffered_write_cluster() · 3a307ffc

Authored by Mark Fasheh on May 08, 2007

Use some ideas from the new-aops patch series and turn
ocfs2_buffered_write_cluster() into a 2 stage operation with the caller
copying data in between. The code now understands multiple cluster writes as
a result of having to deal with a full page write for greater than 4k pages.

This sets us up to easily call into the write path during ->page_mkwrite().
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

07 Jun, 2007 1 commit

ocfs2: Fix invalid assertion during write on 64k pages · eeb47d12

Authored by Mark Fasheh on Jun 06, 2007

The write path code intends to bug if a math error (or unhandled case)
results in a write outside of the current cluster boundaries. The actual
BUG_ON() statements however are incorrect, leading to a crash on kernels
with 64k page size. Fix those by checking against the right variables.

Also, move the assertions higher up within the functions so that they trip
*before* the code starts to mark buffers.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

26 May, 2007 2 commits

[PATCH] ocfs2: use zero_user_page · 5c3c6bb7

Authored by Nate Diller on May 10, 2007

Use zero_user_page() instead of open-coding it.
Signed-off-by: Nate Diller <nate.diller@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: trylock in ocfs2_readpage() · e9dfc0b2

Authored by Mark Fasheh on May 14, 2007

Similarly to the page lock / cluster lock inversion in ocfs2_readpage, we
can deadlock on ip_alloc_sem. We can down_read_trylock() instead and just
return AOP_TRUNCATED_PAGE if the operation fails.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
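
The control flow of "try the lock, and hand the page back to the caller if that fails" can be sketched with a toy, single-threaded stand-in for a read-write semaphore. Nothing here is kernel code: `struct demo_rwsem`, the `demo_` functions, and the value given to `DEMO_AOP_TRUNCATED_PAGE` are placeholders for illustration, and the toy lock is not safe for real concurrent use.

```c
/* Placeholder return code meaning "retry this page later". */
#define DEMO_AOP_TRUNCATED_PAGE 0x80001

/* Toy lock state: one writer flag and a reader count. */
struct demo_rwsem {
    int writer;
    int readers;
};

/* Returns 1 if the read lock was taken, 0 without blocking otherwise. */
static int demo_down_read_trylock(struct demo_rwsem *sem)
{
    if (sem->writer)
        return 0;
    sem->readers++;
    return 1;
}

static void demo_up_read(struct demo_rwsem *sem)
{
    sem->readers--;
}

/* Instead of blocking on the semaphore while a page lock is held,
 * give up and let the caller retry. */
static int demo_readpage(struct demo_rwsem *alloc_sem)
{
    if (!demo_down_read_trylock(alloc_sem))
        return DEMO_AOP_TRUNCATED_PAGE;

    /* ... the blocks would be mapped and read here ... */

    demo_up_read(alloc_sem);
    return 0;
}
```

Because the function never sleeps on the semaphore, it cannot take part in a lock-order inversion with whoever currently holds it for writing.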

03 May, 2007 3 commits

ocfs2: Force use of GFP_NOFS in ocfs2_write() · 9315f130

Authored by Mark Fasheh on May 01, 2007

We can otherwise recurse into the file system.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: fix sparse warnings in fs/ocfs2 · 1ca1a111

Authored by Mark Fasheh on Apr 27, 2007

None of these are actually harmful, but the noise makes looking for real
problems difficult.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

[PATCH] fs/ocfs2/: make 3 functions static · 6cb129f5

Authored by Adrian Bunk on Apr 26, 2007

This patch makes the following needlessly global functions static:
- aops.c: ocfs2_write_data_page()
- dlmglue.c: ocfs2_dump_meta_lvb_info()
- file.c: ocfs2_set_inode_size()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

27 Apr, 2007 1 commit

ocfs2: Remember rw lock level during direct io · 7cdfc3a1

Authored by Mark Fasheh on Apr 16, 2007

Cluster locking might have been redone because a direct write won't
complete, so this needs to be reflected in the iocb.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>