Commits · c1d00b2d9c4fc821e33c5cdfbdbc32677cb0e2e0 · openeuler / raspberrypi-kernel

20 Apr, 2015 1 commit

ceph: properly release page upon error · c1d00b2d

Committed by Taesoo Kim on Mar 20, 2015

When ceph_update_writeable_page fails (including -EAGAIN), it
unlocks (w/ unlock_page) the page but does not 'release'
(w/ page_cache_release) properly.

Upon error, properly set *pagep to NULL, indicating an error.
Signed-off-by: Taesoo Kim <tsgatesv@gmail.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>

c1d00b2d
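
The error path described in this commit can be modelled in user space. The following is a minimal sketch, not the kernel code: `struct fake_page`, `update_writeable_page()` and `write_begin()` are hypothetical stand-ins, and the decrement of `refcount` only models what `page_cache_release()` does to the page reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only a lock flag and a refcount. */
struct fake_page {
    int locked;
    int refcount;
};

/* Hypothetical stand-in for ceph_update_writeable_page(): on failure it
 * unlocks the page itself and returns a negative errno-style value. */
static int update_writeable_page(struct fake_page *page, int simulated_err)
{
    if (simulated_err) {
        page->locked = 0;
        return simulated_err;
    }
    return 0;
}

/* Sketch of a write_begin-style caller (assumed shape, not the real one). */
int write_begin(struct fake_page *cached, int simulated_err,
                struct fake_page **pagep)
{
    int r;

    assert(cached != NULL && pagep != NULL);

    /* Models grabbing the page from the page cache: locked, one extra ref. */
    cached->locked = 1;
    cached->refcount++;
    *pagep = cached;

    r = update_writeable_page(cached, simulated_err);
    if (r < 0) {
        cached->refcount--;   /* models page_cache_release() */
        *pagep = NULL;        /* tell the caller there is no page */
    }
    return r;
}
```

With this shape, a failing call leaves the reference count where it started and hands back no page, while a successful call returns a locked page that still holds its extra reference.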

19 Feb, 2015 1 commit

ceph: fix reading inline data when i_size > PAGE_SIZE · fcc02d2a

Committed by Yan, Zheng on Jan 10, 2015

When an inode has inline data but its size is > PAGE_SIZE (it was truncated
to a larger size), the previous direct read code returns -EIO. This patch adds
code to return zeros for data whose offset is > PAGE_SIZE.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

fcc02d2a
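
The zero-fill behaviour can be illustrated with a small user-space sketch. It assumes a simplified model in which the file's first `inline_len` bytes are stored inline and everything between that point and `i_size` is a hole; `read_inline()` and its parameters are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read up to `len` bytes at offset `pos` from a file of size `i_size`
 * whose first `inline_len` bytes live in `inline_data`. Bytes past the
 * inline data are returned as zeros instead of failing the read.
 * Returns the number of bytes written to `out`. */
size_t read_inline(const char *inline_data, size_t inline_len,
                   size_t i_size, size_t pos, char *out, size_t len)
{
    size_t copied = 0;

    assert(out != NULL && inline_len <= i_size);
    assert(inline_data != NULL || inline_len == 0);

    if (pos >= i_size)
        return 0;               /* read starts at or past EOF */
    if (len > i_size - pos)
        len = i_size - pos;     /* clamp the read to EOF */

    if (pos < inline_len) {
        copied = inline_len - pos;
        if (copied > len)
            copied = len;
        memcpy(out, inline_data + pos, copied);
    }
    memset(out + copied, 0, len - copied);  /* hole after the inline data */
    return len;
}
```

A read that starts inside the inline data and runs past it gets the real bytes followed by zeros; a read that starts entirely past the inline data gets only zeros.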

11 Feb, 2015 1 commit

mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub · d83a08db

Committed by Kirill A. Shutemov on Feb 10, 2015

Nobody uses it anymore.

[akpm@linux-foundation.org: fix filemap_xip.c]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d83a08db

09 Jan, 2015 1 commit

ceph: use %zu for len in ceph_fill_inline_data() · 0668ff52

Committed by Ilya Dryomov on Dec 19, 2014

len is size_t, should be printed with %zu.
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>

0668ff52
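
As a short illustration of the format specifier involved, here is a user-space sketch using standard C `snprintf()` rather than the kernel's `dout()`; `format_len()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a size_t with %zu, the conversion specifier C99 defines for
 * size_t. Using %lu or %u instead is only correct on some ABIs.
 * Returns what snprintf() returns: the length the full string needs. */
int format_len(char *buf, size_t bufsize, size_t len)
{
    assert(buf != NULL && bufsize > 0);
    return snprintf(buf, bufsize, "len %zu", len);
}
```

Because `%zu` always matches `size_t`, the same format string is correct on both 32-bit and 64-bit builds.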

18 Dec, 2014 6 commits

ceph: do_sync is never initialized · 021b77be

Committed by Dan Carpenter on Nov 28, 2014

Probably this code was syncing a lot more often than intended because
the do_sync variable wasn't set to zero.

Cc: stable@vger.kernel.org # v3.11+
Fixes: c62988ec ('ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>

021b77be

ceph: convert inline data to normal data before data write · 28127bdd

Committed by Yan, Zheng on Nov 14, 2014

Before any data write, convert inline data to normal data and set
i_inline_version to CEPH_INLINE_NONE. The OSD request that saves
inline data to an object contains 3 operations (CMPXATTR, WRITE and
SETXATTR). It compares an xattr named 'inline_version' to prevent
old data from overwriting newer data.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

28127bdd

ceph: sync read inline data · 83701246

Committed by Yan, Zheng on Nov 14, 2014

We can't use getattr to fetch inline data while holding the Fr cap,
because it can cause a deadlock. If we need to sync read inline data,
drop cap refs first, then use getattr to fetch inline data.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

83701246

ceph: fetch inline data when getting Fcr cap refs · 3738daa6

Committed by Yan, Zheng on Nov 14, 2014

We can't use getattr to fetch inline data after getting Fcr caps,
because it can cause a deadlock. The solution is to try bringing inline
data into the page cache while not holding any cap, and hope the inline
data page is still there after getting the Fcr caps. If the page
is still there, pin it in the page cache for later IO.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

3738daa6

ceph: add inline data to pagecache · 31c542a1

Committed by Yan, Zheng on Nov 14, 2014

Request replies and cap messages can contain inline data. Add inline data
to the page cache if there is an Fc cap.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

31c542a1

libceph: specify position of extent operation · 715e4cd4

Committed by Yan, Zheng on Nov 13, 2014

Allow specifying the position of an extent operation in a multi-operation
osd request. This is required for cephfs to convert inline data to
normal data (compare xattr, then write object).
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

715e4cd4

15 Oct, 2014 1 commit

ceph: remove redundant code for max file size verification · a4483e8a

Committed by Chao Yu on Sep 17, 2014

Both ceph_update_writeable_page and ceph_setattr verify the file size
against the maximum size ceph supports.
There are two callers of ceph_update_writeable_page: ceph_write_begin and
ceph_page_mkwrite. For ceph_write_begin, we have already verified the size in
generic_write_checks of ceph_write_iter; for ceph_page_mkwrite, we have no
chance to change the file size during mmap. Likewise, we have already verified
the size in inode_change_ok when we call ceph_setattr.
So let's remove the redundant code for max file size verification.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>

a4483e8a

07 Jun, 2014 1 commit

fs/ceph: replace pr_warning by pr_warn · f3ae1b97

Committed by Fabian Frederick on Jun 06, 2014

Update the last pr_warning callsites in fs branch
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Sage Weil <sage@inktank.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f3ae1b97

06 Jun, 2014 1 commit

ceph: refactor readpage_nounlock() to make the logic clearer · 23cd573b

Committed by Zhang Zhen on May 28, 2014

If the return value of ceph_osdc_readpages() is not negative,
it is certainly greater than or equal to zero.

Remove the useless condition judgment and redundant braces.
Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>

23cd573b

07 May, 2014 1 commit

pass iov_iter to ->direct_IO() · d8d3d94b

Committed by Al Viro on Mar 04, 2014

unmodified, for now
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d8d3d94b

29 Jan, 2014 1 commit

ceph: fix dout() compile warnings in ceph_filemap_fault() · 37b52fe6

Committed by Ilya Dryomov on Jan 28, 2014

PAGE_CACHE_SIZE is unsigned long on all architectures, however size_t
is either unsigned int or unsigned long.  Rather than change format
strings, cast PAGE_CACHE_SIZE to size_t to be in line with dout()s in
ceph_page_mkwrite().

Cc: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

37b52fe6

01 Jan, 2014 2 commits

ceph fscache: Uncaching no data page from fscache in readpage() · 18302805

Committed by Li Wang on Dec 19, 2013

Currently, if a new page is allocated in fscache in readpage() but no data
is read into it because of an error encountered while reading from the OSDs,
the slot in fscache is not uncached. This patch fixes this.
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Milosz Tanski <milosz@adfin.com>

18302805

ceph: check caps in filemap_fault and page_mkwrite · 61f68816

Committed by Yan, Zheng on Nov 28, 2013

Add a cap check to the page fault handler. The check prevents the page
fault handler from adding new pages to the page cache while Fcb caps
are being revoked. This solves an Fc revoking hang in multiple-client
mmap IO workloads.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

61f68816

14 Dec, 2013 2 commits

ceph: Clean up if error occurred in finish_read() · f36132a7

Committed by Li Wang on Nov 27, 2013

Clean up if an error occurred rather than going through the normal process.
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Sage Weil <sage@inktank.com>

f36132a7

ceph: Avoid data inconsistency due to d-cache aliasing in readpage() · 56f91aad

Committed by Li Wang on Nov 13, 2013

If the length of data to be read in readpage() is exactly
PAGE_CACHE_SIZE, the original code does not flush the d-cache
for data consistency after finishing reading. This patch fixes
this.
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Signed-off-by: Sage Weil <sage@inktank.com>

56f91aad

24 Nov, 2013 1 commit

ceph: allocate non-zero page to fscache in readpage() · ff638b7d

Committed by Li Wang on Nov 09, 2013

ceph_osdc_readpages() returns the number of bytes read. Currently,
the code only allocates full-zero pages into fscache; this patch
fixes this.
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Milosz Tanski <milosz@adfin.com>
Reviewed-by: Sage Weil <sage@inktank.com>

ff638b7d

07 Sep, 2013 3 commits

ceph: page still marked private_2 · d4d3aa38

Committed by Milosz Tanski on Sep 03, 2013

The previous patch allowed us to clean up most of the issues with pages marked
as private_2 when calling ceph_readpages. However, there seems to be a case in
the error-case cleanup in start read that still triggers this from time to
time. I've only seen this one a couple of times.

BUG: Bad page state in process petabucket  pfn:335b82
page:ffffea000cd6e080 count:0 mapcount:0 mapping:          (null) index:0x0
page flags: 0x200000000001000(private_2)
Call Trace:
 [<ffffffff81563442>] dump_stack+0x46/0x58
 [<ffffffff8112c7f7>] bad_page+0xc7/0x120
 [<ffffffff8112cd9e>] free_pages_prepare+0x10e/0x120
 [<ffffffff8112e580>] free_hot_cold_page+0x40/0x160
 [<ffffffff81132427>] __put_single_page+0x27/0x30
 [<ffffffff81132d95>] put_page+0x25/0x40
 [<ffffffffa02cb409>] ceph_readpages+0x2e9/0x6f0 [ceph]
 [<ffffffff811313cf>] __do_page_cache_readahead+0x1af/0x260
Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>

d4d3aa38

ceph: clean PgPrivate2 on returning from readpages · 76be778b

Committed by Milosz Tanski on Aug 21, 2013

In some cases the ceph readpages code bails without filling all the pages
already marked by fscache. When we return to the readahead code this causes
a BUG.
Signed-off-by: Milosz Tanski <milosz@adfin.com>

76be778b

ceph: use fscache as a local persistent cache · 99ccbd22

Committed by Milosz Tanski on Aug 21, 2013

Add support for fscache to the Ceph filesystem. This brings it on
par with some of the other network filesystems in Linux (like NFS, AFS, etc.).

In order to mount the filesystem with fscache, the 'fsc' mount option must be
passed.
Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>

99ccbd22

28 Aug, 2013 1 commit

ceph: use vfs __set_page_dirty_nobuffers interface instead of doing it inside filesystem · 7d6e1f54

Committed by Sha Zhengju on Aug 21, 2013

Next we will begin to add memcg dirty page accounting around
__set_page_dirty_{buffers,nobuffers} in the vfs layer, so we'd better use the vfs
interface to avoid exporting those details to filesystems.

Since vfs set_page_dirty() should be called under the page lock, we no longer need
elaborate code to handle races here, and two WARN_ON()s are added to detect such exceptions.
Thanks very much to Sage and Yan Zheng for their coaching!

I tested it in a two-server ceph environment in which one server is the client and
the other is mds/osd/mon, and ran the following fsx tests from xfstests:
  ./fsx   1MB -N 50000 -p 10000 -l 1048576
  ./fsx  10MB -N 50000 -p 10000 -l 10485760
  ./fsx 100MB -N 50000 -p 10000 -l 104857600

fsx does lots of mmap-read/mmap-write/truncate operations and the tests completed
successfully without triggering any of the WARN_ON()s.
Signed-off-by: Sha Zhengju <handai.szj@taobao.com>
Reviewed-by: Sage Weil <sage@inktank.com>

7d6e1f54

16 Aug, 2013 1 commit

ceph: cleanup the logic in ceph_invalidatepage · b150f5c1

Committed by Milosz Tanski on Aug 09, 2013

The invalidatepage code bails if it encounters a non-zero page offset. The
current logic that does this is non-obvious, with multiple if statements.

This should be logically and functionally equivalent.
Signed-off-by: Milosz Tanski <milosz@adfin.com>
Reviewed-by: Sage Weil <sage@inktank.com>

b150f5c1

10 Aug, 2013 1 commit

ceph: Remove bogus check in invalidatepage · fe2a801b

Committed by Milosz Tanski on Aug 09, 2013

The early bug checks are moot because the VMA layer ensures those things.

1. It will not call invalidatepage unless PagePrivate (or PagePrivate2) are set
2. It will not call invalidatepage without taking a PageLock first.
3. It guarantees that the inode page is mapped.
Signed-off-by: Milosz Tanski <milosz@adfin.com>
Reviewed-by: Sage Weil <sage@inktank.com>

fe2a801b

04 Jul, 2013 2 commits

ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL. · c62988ec

Committed by majianpeng on Jun 19, 2013

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>

c62988ec

ceph: fix race between page writeback and truncate · fc2744aa

Committed by Yan, Zheng on May 31, 2013

The client can receive a truncate request from the MDS at any time,
so the page writeback code needs to get i_size, truncate_seq and
truncate_size atomically.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

fc2744aa

22 May, 2013 2 commits

ceph: use ->invalidatepage() length argument · 569d39fc

Committed by Lukas Czerner on May 21, 2013

->invalidatepage() aop now accepts range to invalidate so we can make
use of it in ceph_invalidatepage().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Acked-by: Sage Weil <sage@inktank.com>
Cc: ceph-devel@vger.kernel.org

569d39fc

mm: change invalidatepage prototype to accept length · d47992f8

Committed by Lukas Czerner on May 21, 2013

Currently there is no way to truncate partial page where the end
truncate point is not at the end of the page. This is because it was not
needed and the functionality was enough for file system truncate
operation to work properly. However more file systems now support punch
hole feature and it can benefit from mm supporting truncating page just
up to the certain point.

Specifically, with this functionality truncate_inode_pages_range() can
be changed so it supports truncating partial page at the end of the
range (currently it will BUG_ON() if 'end' is not at the end of the
page).

This commit changes the invalidatepage() address space operation
prototype to accept range to be invalidated and update all the instances
for it.

We also change the block_invalidatepage() in the same way and actually
make a use of the new length argument implementing range invalidation.

Actual file system implementations will follow, except for the file systems
where the changes are really simple and should not change the behaviour
in any way. An implementation of truncate_page_range(), which will be able
to accept page-unaligned ranges, will follow as well.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>

d47992f8

02 May, 2013 10 commits

libceph: kill off osd data write_request parameters · 406e2c9f

Committed by Alex Elder on Apr 15, 2013

In the incremental move toward supporting distinct data items in an
osd request some of the functions had "write_request" parameters to
indicate, basically, whether the data belonged to in_data or the
out_data.  Now that we maintain the data fields in the op structure
there is no need to indicate the direction, so get rid of the
"write_request" parameters.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

406e2c9f

ceph: fix race between writepages and truncate · 1ac0fc8a

Committed by Yan, Zheng on Apr 12, 2013

ceph_writepages_start() reads inode->i_size in two places. It can get
different values between successive reads, because truncate can change
inode->i_size at any time. The race can lead to a mismatch between the data
length of the osd request and the pages marked as writeback. When the osd
request finishes, it clears writeback pages according to its data length, so
some pages can be left in writeback state forever. The fix is to read
inode->i_size only once, save its value to a local variable and use
the local variable when i_size is needed.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Alex Elder <elder@inktank.com>

1ac0fc8a

libceph: combine initializing and setting osd data · a4ce40a9

Committed by Alex Elder on Apr 05, 2013

This ends up being a rather large patch but what it's doing is
somewhat straightforward.

Basically, this is replacing two calls with one.  The first of the
two calls is initializing a struct ceph_osd_data with data (either a
page array, a page list, or a bio list); the second is setting an
osd request op so it associates that data with one of the op's
parameters.  In place of those two will be a single function that
initializes the op directly.

That means we sort of fan out a set of the needed functions:
    - extent ops with pages data
    - extent ops with pagelist data
    - extent ops with bio list data
and
    - class ops with page data for receiving a response

We also have to define another one, but it's only used internally:
    - class ops with pagelist data for request parameters

Note that we *still* haven't gotten rid of the osd request's
r_data_in and r_data_out fields.  All the osd ops refer to them for
their data.  For now, these data fields are pointers assigned to the
appropriate r_data_* field when these new functions are called.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

a4ce40a9

libceph: specify osd op by index in request · c99d2d4a

Committed by Alex Elder on Apr 05, 2013

An osd request now holds all of its source op structures, and every
place that initializes one of these is in fact initializing one
of the entries in the osd request's array.

So rather than supplying the address of the op to initialize, have
the caller specify the osd request and an indication of which op it
would like to initialize.  This better hides the details of the
op structure (and facilitates moving the data pointers they use).

Since osd_req_op_init() is a common routine, and it's not used
outside the osd client code, give it static scope.  Also make
it return the address of the specified op (so all the other
init routines don't have to repeat that code).
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

c99d2d4a
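
The interface shape described above (request plus index in, op address out of a file-local helper) might look roughly like the sketch below. All names here (`struct sketch_op`, `struct sketch_request`, `op_init()`, `op_extent_init()`, `SKETCH_MAX_OPS`) are hypothetical and the structures are heavily simplified; the real libceph definitions differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_OPS 2  /* assumed array size for this sketch */

/* Simplified op: an opcode plus the extent it applies to. */
struct sketch_op {
    unsigned short opcode;
    unsigned long long offset;
    unsigned long long length;
};

/* The request embeds its array of source ops. */
struct sketch_request {
    unsigned int num_ops;
    struct sketch_op ops[SKETCH_MAX_OPS];
};

/* Common init routine with static scope: the caller names the request
 * and an index, and gets back the address of the freshly reset op. */
static struct sketch_op *op_init(struct sketch_request *req,
                                 unsigned int which, unsigned short opcode)
{
    struct sketch_op *op;

    assert(req != NULL && which < req->num_ops);

    op = &req->ops[which];
    memset(op, 0, sizeof(*op));
    op->opcode = opcode;
    return op;
}

/* A type-specific initializer builds on the returned address instead of
 * repeating the lookup. */
void op_extent_init(struct sketch_request *req, unsigned int which,
                    unsigned short opcode,
                    unsigned long long offset, unsigned long long length)
{
    struct sketch_op *op = op_init(req, which, opcode);

    op->offset = offset;
    op->length = length;
}
```

Callers never hold a pointer into the array themselves, which is what lets the layout of the op structure change later without touching them.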

libceph: add data pointers in osd op structures · 8c042b0d

Committed by Alex Elder on Apr 03, 2013

An extent type osd operation currently implies that there will
be corresponding data supplied in the data portion of the request
(for write) or response (for read) message.  Similarly, an osd class
method operation implies a data item will be supplied to receive
the response data from the operation.

Add a ceph_osd_data pointer to each of those structures, and assign
it to point to either the incoming or the outgoing data structure in
the osd message.  The data is not always available when an op is
initially set up, so add two new functions to allow setting them
after the op has been initialized.

Begin to make use of the data item pointer available in the osd
operation rather than the request data in or out structure in
places where it's convenient.  Add some assertions to verify
pointers are always set the way they're expected to be.

This is a sort of stepping stone toward really moving the data
into the osd request ops, to allow for some validation before
making that jump.

This is the first in a series of patches that resolve:
    http://tracker.ceph.com/issues/4657
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

8c042b0d

libceph: keep source rather than message osd op array · 79528734

Committed by Alex Elder on Apr 03, 2013

An osd request keeps a pointer to the osd operations (ops) array
that it builds in its request message.

In order to allow each op in the array to have its own distinct
data, we will need to keep track of each op's data, and that
information does not go over the wire.

As long as we're tracking the data we might as well just track the
entire (source) op definition for each of the ops.  And if we're
doing that, we'll have no more need to keep a pointer to the
wire-encoded version.

This patch makes the array of source ops be kept with the osd
request structure, and uses that instead of the version encoded in
the message in places where that was previously used.  The array
will be embedded in the request structure, and the maximum number of
ops we ever actually use is currently 2.  So reduce CEPH_OSD_MAX_OP
to 2 to reduce the size of the structure.

The result of doing this sort of ripples back up, and as a result
various function parameters and local variables become unnecessary.

Make r_num_ops be unsigned, and move the definition of struct
ceph_osd_req_op earlier to ensure it's defined where needed.

It does not yet add per-op data, that's coming soon.

This resolves:
    http://tracker.ceph.com/issues/4656
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

79528734

libceph: a few more osd data cleanups · 87060c10

Committed by Alex Elder on Apr 03, 2013

These are very small changes that make use of osd_data local pointers
as shorthand for structures being operated on.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

87060c10

libceph: define osd data initialization helpers · 43bfe5de

Committed by Alex Elder on Apr 03, 2013

Define and use functions that encapsulate the initialization of a
ceph_osd_data structure.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

43bfe5de

ceph: build osd request message later for writepages · e5975c7c

Committed by Alex Elder on Mar 14, 2013

Hold off building the osd request message in ceph_writepages_start()
until just before it will be submitted to the osd client for
execution.

We'll still create the request and allocate the page pointer array
after we learn we have at least one page to write.  A local variable
will be used to keep track of the allocated array of pages.  Wait
until just before submitting the request for assigning that page
array pointer to the request message.

Create and use a new function osd_req_op_extent_update() whose
purpose is to serve this one spot where the length value supplied
when an osd request's op was initially formatted might need to get
changed (reduced, never increased) before submitting the request.

Previously, ceph_writepages_start() assigned the message header's
data length because of this update.  That's no longer necessary,
because ceph_osdc_build_request() will recalculate the right
value to use based on the content of the ops in the request.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

e5975c7c

libceph: hold off building osd request · 02ee07d3

Committed by Alex Elder on Mar 14, 2013

Defer building the osd request until just before submitting it in
all callers except ceph_writepages_start().  (That caller will be
handled in the next patch.)
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

02ee07d3