Commits · 3738daa68a5121ad7dd0318bca931e2a6afb0e8c · openeuler / raspberrypi-kernel

18 Dec, 2014 2 commits

ceph: fetch inline data when getting Fcr cap refs · 3738daa6

Authored by Yan, Zheng on Nov 14, 2014

We can't use getattr to fetch inline data after getting Fcr caps,
because it can cause a deadlock. The solution is to try bringing the inline
data into the page cache while not holding any caps, and hope the inline
data page is still there after getting the Fcr caps. If the page
is still there, pin it in the page cache for later IO.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

3738daa6

libceph: specify position of extent operation · 715e4cd4

Authored by Yan, Zheng on Nov 13, 2014

Allow specifying the position of an extent operation in a multi-operation
OSD request. This is required for cephfs to convert inline data to
normal data (compare xattr, then write object).
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>

715e4cd4

15 Oct, 2014 3 commits

ceph: include the initial ACL in create/mkdir/mknod MDS requests · b1ee94aa

Authored by Yan, Zheng on Sep 16, 2014

The current code sets a new file/directory's initial ACL in a non-atomic
manner.
The client first sends a request to the MDS to create the new file/directory, then sets
the initial ACL after the new file/directory is successfully created.

The fix is to include the initial ACL in create/mkdir/mknod MDS requests,
so the MDS can handle creating the file/directory and setting the initial ACL in
one request.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>

b1ee94aa

ceph: remove redundant io_iter_advance() · 3b70b388

Authored by Yan, Zheng on Sep 17, 2014

ceph_sync_read() and generic_file_read_iter() have already advanced the
IO iterator.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

3b70b388

ceph: request xattrs if xattr_version is zero · 508b32d8

Authored by Yan, Zheng on Sep 16, 2014

The following sequence of events can happen:
  - The client releases an inode and queues a cap release message.
  - A 'lookup' reply brings the same inode back, but the reply
    doesn't contain xattrs because the MDS didn't receive the cap release
    message and thought the client already had up-to-date xattrs.

The fix is to force sending a getattr request to the MDS if xattrs_version
is 0. The getattr mask is set to CEPH_STAT_CAP_XATTR, so the MDS knows the client
does not have the xattrs.
Signed-off-by: Yan, Zheng <zyan@redhat.com>

508b32d8

28 Jul, 2014 1 commit

ceph: fix append mode write · 06fee30f

Authored by Yan, Zheng on Jul 28, 2014

generic_write_checks() may update 'pos', so we need to pass 'pos'
to ceph_sync_write() and ceph_sync_direct_write().
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

06fee30f

21 Jul, 2014 1 commit

ceph: check zero length in ceph_sync_read() · d0d0db22

Authored by Yan, Zheng on Jul 21, 2014

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

d0d0db22

08 Jul, 2014 2 commits

ceph: pass proper page offset to copy_page_to_iter() · 5aaa432a

Authored by Yan, Zheng on Jul 02, 2014

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

5aaa432a

ceph: check unsupported fallocate mode · 494d77bf

Authored by Yan, Zheng on Jun 26, 2014

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

494d77bf

12 Jun, 2014 1 commit

ceph: switch to iter_file_splice_write() · 3551dd79

Authored by Al Viro on Apr 05, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3551dd79

07 May, 2014 9 commits

ceph: switch to ->write_iter() · 4908b822

Authored by Al Viro on Apr 03, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

4908b822

ceph_sync_direct_write: stop poking into iov_iter guts · 64c31311

Authored by Al Viro on Apr 03, 2014

all needed primitives are there...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

64c31311

ceph_sync_read: stop poking into iov_iter guts · 2b777c9d

Authored by Al Viro on Apr 03, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2b777c9d

ceph: switch to ->read_iter() · 3644424d

Authored by Al Viro on Apr 02, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3644424d

start adding the tag to iov_iter · 71d8e532

Authored by Al Viro on Mar 05, 2014

For now, just use the same thing we pass to ->direct_IO() - it's all
iovec-based at the moment.  Pass it explicitly to iov_iter_init() and
account for kvec vs. iovec in there, by the same kludge NFS ->direct_IO()
uses.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

71d8e532

new helper: generic_file_read_iter() · ed978a81

Authored by Al Viro on Mar 05, 2014

iov_iter-using variant of generic_file_aio_read(). Some callers
converted. Note that it's still not quite there for use as ->read_iter() -
we depend on having zero iter->iov_offset in O_DIRECT case. Fortunately,
that's true for all converted callers (and for generic_file_aio_read() itself).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ed978a81

ceph_aio_read(): keep iov_iter across retries · 05bb2e0b

Authored by Al Viro on Mar 05, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

05bb2e0b

kill generic_segment_checks() · cb66a7a1

Authored by Al Viro on Mar 04, 2014

all callers of ->aio_read() and ->aio_write() have iov/nr_segs already
checked - generic_segment_checks() done after that is just an odd way
to spell iov_length().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cb66a7a1

kill iov_iter_copy_from_user() · e7c24607

Authored by Al Viro on Apr 10, 2014

all callers can use copy_page_from_iter() and it actually simplifies
them.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e7c24607

12 Apr, 2014 2 commits

fs: disallow all fallocate operation on active swapfile · 0790b31b

Authored by Lukas Czerner on Apr 12, 2014

Currently some file systems have an IS_SWAPFILE check in their fallocate
implementations and some do not. However, we should really prevent any
fallocate operation on a swapfile, so move the check to the VFS and remove the
redundant checks from the file systems' fallocate implementations.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0790b31b

ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure · eab87235

Authored by Al Viro on Apr 03, 2014

ceph_osdc_put_request(ERR_PTR(-error)) oopses.  What we want there
is break, not goto out.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

eab87235
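
Why the `goto out` path oopsed can be illustrated with a userspace sketch. The kernel encodes errors in pointer values, and a release function that dereferences its argument must never be handed such a pointer. The `ERR_PTR`, `IS_ERR` and `PTR_ERR` helpers below are simplified userspace stand-ins for the kernel ones, and `struct request`, `new_request()`, `put_request()` and `submit_chunks()` are hypothetical names for illustration, not the Ceph code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical refcounted request, standing in for an OSD request. */
struct request { int refcount; };

static int live_requests; /* leak detector for the sketch */

/* Hypothetical allocator: fails with -ENOMEM when fail is nonzero. */
static struct request *new_request(int fail)
{
	struct request *req;

	if (fail)
		return ERR_PTR(-ENOMEM);
	req = malloc(sizeof(*req));
	if (!req)
		return ERR_PTR(-ENOMEM);
	req->refcount = 1;
	live_requests++;
	return req;
}

/* Dereferences req, so it must only ever see a real pointer. */
static void put_request(struct request *req)
{
	assert(!IS_ERR(req));
	if (--req->refcount == 0) {
		free(req);
		live_requests--;
	}
}

/*
 * Submit nchunks chunks; allocation fails at chunk fail_at (-1: never).
 * Returns the number of chunks submitted, or a negative errno if none were.
 */
static long submit_chunks(int nchunks, int fail_at)
{
	long done = 0;
	long ret = 0;

	while (done < nchunks) {
		struct request *req = new_request(done == fail_at);

		if (IS_ERR(req)) {
			ret = PTR_ERR(req);
			break; /* leave the loop; never "put" an error pointer */
		}
		/* ... the request would be sent and waited for here ... */
		put_request(req);
		done++;
	}
	return done ? done : ret;
}
```

Calling `submit_chunks(3, 2)` in this sketch stops at the failing allocation and reports the two chunks already submitted, without the error pointer ever reaching `put_request()`.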

05 Apr, 2014 1 commit

ceph: drop extra open file reference in ceph_atomic_open() · ab866549

Authored by Yan, Zheng on Apr 01, 2014

ceph_atomic_open() calls ceph_open() after receiving the MDS reply.
ceph_open() grabs an extra open file reference. (The open request
already holds an open file reference.)
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

ab866549

03 Apr, 2014 2 commits

ceph: fscache: Update object store limit after file writing · 32d3e148

Authored by Yunchuan Wen on Dec 26, 2013

Synchronize object->store_limit[_l] with the new inode->i_size after a file write.
Tested-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Min Chen <minchen@ubuntukylin.com>
Signed-off-by: Li Wang <liwang@ubuntukylin.com>

32d3e148

ceph: do not chain inode updates to parent fsync · 752c8bdc

Authored by Sage Weil on Feb 05, 2013

The fsync(dirfd) only covers namespace operations, not inode updates.
We do not need to cover setattr variants or O_TRUNC.
Reported-by: Al Viro <viro@xeniv.linux.org.uk>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>

752c8bdc

02 Apr, 2014 2 commits

ceph_aio_write(): switch to generic_perform_write() · aec605f4

Authored by Al Viro on Feb 11, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

aec605f4

kill the 5th argument of generic_file_buffered_write() · fcacafd2

Authored by Al Viro on Feb 09, 2014

same story - it's &iocb->ki_pos in all cases
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fcacafd2

18 Feb, 2014 1 commit

ceph: add missing init_acl() for mkdir() and atomic_open() · b20a95a0

Authored by Yan, Zheng on Feb 11, 2014

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

b20a95a0

29 Jan, 2014 1 commit

ceph: cast PAGE_SIZE to size_t in ceph_sync_write() · 125d725c

Authored by Ilya Dryomov on Jan 28, 2014

Use min_t(size_t, ...) instead of plain min(), which does strict type
checking, to avoid a compile warning on i386.

Cc: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

125d725c
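
The `min_t(size_t, ...)` idiom from this commit can be approximated in userspace. The kernel's `min()` warns when its two operands have different types; `PAGE_SIZE` is an `unsigned long`, while `size_t` is an `unsigned int` on i386, which is presumably the mismatch behind the warning. The macro below is a simplified stand-in for the kernel's `min_t()`, and `SKETCH_PAGE_SIZE` and `chunk_len()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace approximation of the kernel's min_t(): both operands are
 * converted to the named type before comparing. Uses GNU C statement
 * expressions, so it assumes GCC or Clang.
 */
#define min_t(type, x, y) ({		\
	type lhs_ = (x);		\
	type rhs_ = (y);		\
	lhs_ < rhs_ ? lhs_ : rhs_; })

#define SKETCH_PAGE_SIZE 4096UL /* hypothetical; unsigned long like PAGE_SIZE */

/* How many bytes of a request of 'left' bytes fit in the current page. */
static size_t chunk_len(size_t left, size_t page_off)
{
	assert(page_off < SKETCH_PAGE_SIZE);
	return min_t(size_t, SKETCH_PAGE_SIZE - page_off, left);
}
```

Because both sides are converted to `size_t` first, the comparison is well-typed regardless of how wide `unsigned long` is on the target.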

14 Dec, 2013 3 commits

fs: ceph: new helper: file_inode(file) · aa8b60e0

Authored by Libo Chen on Dec 11, 2013

Signed-off-by: Libo Chen <clbchenlibo.chen@huawei.com>
Signed-off-by: Sage Weil <sage@inktank.com>

aa8b60e0

ceph: implement readv/preadv for sync operation · 8eb4efb0

Authored by majianpeng on Sep 26, 2013

For sync readv/preadv operations, ceph only handled the first iov.
Handle all of them now.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>

8eb4efb0

ceph: Implement writev/pwritev for sync operation. · e8344e66

Authored by majianpeng on Sep 12, 2013

For sync writev/pwritev operations, ceph only handled the first iov.

I divided the sync write operation into two functions: one for
direct writes, the other for non-direct sync writes. This is because for
a non-direct sync write we can merge the iovs into one, but for a direct write
we can't merge the iovs.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

e8344e66

07 Sep, 2013 1 commit

ceph: use fscache as a local presisent cache · 99ccbd22

Authored by Milosz Tanski on Aug 21, 2013

Add support for fscache to the Ceph filesystem. This brings it on
par with some of the other network filesystems in Linux (like NFS, AFS, etc.).

To mount the filesystem with fscache, the 'fsc' mount option must be
passed.
Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>

99ccbd22

28 Aug, 2013 3 commits

ceph: allow sync_read/write return partial successed size of read/write. · ee7289bf

Authored by majianpeng on Aug 21, 2013

A sync_read/write may perform operations on multiple stripes. If one of those
hits an error, we return the size that succeeded so far rather than an error value.
There is one exception: a write operation that hits -EOLDSNAPC. If this occurs, we
retry the whole write.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>

ee7289bf
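
The partial-success rule described in this commit can be sketched as a loop over stripe units. This is a minimal userspace model under stated assumptions: `stripe_io_fn`, `striped_io()` and `fail_from()` are hypothetical names, and the -EOLDSNAPC retry mentioned in the message is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-stripe I/O callback: bytes transferred or negative errno. */
typedef long (*stripe_io_fn)(size_t off, size_t len, void *ctx);

/*
 * Transfer [off, off + len) one stripe unit at a time. If a stripe fails
 * after earlier ones succeeded, report the bytes already transferred
 * instead of the error.
 */
static long striped_io(size_t off, size_t len, size_t stripe_unit,
		       stripe_io_fn io, void *ctx)
{
	size_t done = 0;

	assert(stripe_unit > 0);
	while (done < len) {
		size_t pos = off + done;
		size_t in_stripe = stripe_unit - pos % stripe_unit;
		size_t want = len - done < in_stripe ? len - done : in_stripe;
		long got = io(pos, want, ctx);

		if (got < 0)
			return done ? (long)done : got;
		done += (size_t)got;
		if ((size_t)got < want)
			break; /* short transfer: stop here */
	}
	return (long)done;
}

/* Example callback: fails with -EIO for any stripe starting at or after *ctx. */
static long fail_from(size_t off, size_t len, void *ctx)
{
	size_t limit = *(size_t *)ctx;

	return off >= limit ? -EIO : (long)len;
}
```

With a stripe unit of 4 and a failure injected at offset 8, a 12-byte request in this model returns 8 rather than -EIO; the error is only returned when nothing was transferred at all.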

ceph: fix bugs about handling short-read for sync read mode. · 02ae66d8

Authored by majianpeng on Aug 06, 2013

cephfs . show_layout
>layout.data_pool:     0
>layout.object_size:   4194304
>layout.stripe_unit:   4194304
>layout.stripe_count:  1

TestA:
>dd if=/dev/urandom of=test bs=1M count=2 oflag=direct
>dd if=/dev/urandom of=test bs=1M count=2 seek=4  oflag=direct
>dd if=test of=/dev/null bs=6M count=1 iflag=direct
The messages from func striped_read are:
ceph:           file.c:350  : striped_read 0~6291456 (read 0) got 2097152 HITSTRIPE SHORT
ceph:           file.c:350  : striped_read 2097152~4194304 (read 2097152) got 0 HITSTRIPE SHORT
ceph:           file.c:381  : zero tail 4194304
ceph:           file.c:390  : striped_read returns 6291456
The hole in the file is from 2M to 4M, but the read actually zeroes the last 4M, including
the final 2M, which is not a hole.
With this patch, the messages are:
ceph:           file.c:350  : striped_read 0~6291456 (read 0) got 2097152 HITSTRIPE SHORT
ceph:           file.c:358  :  zero gap 2097152 to 4194304
ceph:           file.c:350  : striped_read 4194304~2097152 (read 4194304) got 2097152
ceph:           file.c:384  : striped_read returns 6291456

TestB:
>echo majianpeng > test
>dd if=test of=/dev/null bs=2M count=1 iflag=direct
The messages are:
ceph:           file.c:350  : striped_read 0~6291456 (read 0) got 11 HITSTRIPE SHORT
ceph:           file.c:350  : striped_read 11~6291445 (read 11) got 0 HITSTRIPE SHORT
ceph:           file.c:390  : striped_read returns 11
In this case, striped_read was called one more time, which is pointless.
With this patch, the messages are:
ceph:           file.c:350  : striped_read 0~6291456 (read 0) got 11 HITSTRIPE SHORT
ceph:           file.c:384  : striped_read returns 11

Big thanks to Yan Zheng for the patch.
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>

02ae66d8
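
The intended behaviour in TestA and TestB can be modelled with a toy reader: a short read inside the file zeroes only the gap up to the next object boundary, and a short read that reaches the end of the file simply stops. This is a simplified model under stated assumptions, not the kernel's `striped_read()`; `struct toy_file` and `toy_striped_read()` are hypothetical names, and the layout is reduced to one object per stripe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy model, not the kernel code: a file is an array of fixed-size
 * objects, and stored[i] says how many bytes object i really holds
 * (the rest of the object is a hole).
 */
struct toy_file {
	size_t object_size;
	size_t i_size;            /* logical file size */
	const size_t *stored;     /* bytes present in each object */
	const char *const *data;  /* contents of each object */
};

/* Read [off, off + len) into buf; returns the number of bytes produced. */
static size_t toy_striped_read(const struct toy_file *f, size_t off,
			       size_t len, char *buf)
{
	size_t done = 0;

	while (done < len && off + done < f->i_size) {
		size_t pos = off + done;
		size_t obj = pos / f->object_size;
		size_t obj_off = pos % f->object_size;
		size_t want = f->object_size - obj_off;
		size_t have = f->stored[obj] > obj_off ?
			      f->stored[obj] - obj_off : 0;
		size_t got;

		if (want > len - done)
			want = len - done;
		if (want > f->i_size - pos)
			want = f->i_size - pos; /* never read past EOF */
		got = have < want ? have : want;
		if (got)
			memcpy(buf + done, f->data[obj] + obj_off, got);
		/*
		 * Short read inside the file: the missing range is a hole,
		 * so zero only that gap and carry on with the next object.
		 */
		if (got < want)
			memset(buf + done + got, 0, want - got);
		done += want;
	}
	return done;
}
```

With an object size of 4, a file size of 6, and two objects holding 2 bytes each, reading 6 bytes in this model yields data, a 2-byte zeroed gap, then data again, which mirrors the "zero gap 2097152 to 4194304" line in the patched log.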

ceph: fix fallocate division · b314a90d

Authored by Sage Weil on Aug 27, 2013

We need to use do_div to divide by a 64-bit value.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

b314a90d
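
For readers unfamiliar with `do_div()`, its calling convention can be shown with a userspace stand-in. In the kernel, `do_div(n, base)` takes a 64-bit dividend as an lvalue and a 32-bit divisor, stores the quotient back into `n`, and returns the remainder; it exists because 32-bit builds cannot rely on a plain `/` for 64-bit operands. `sketch_do_div()` and `object_index()` below are hypothetical names, and the pointer argument replaces the macro's lvalue.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's do_div(): divide the 64-bit value
 * *n by a 32-bit base in place and return the remainder. The kernel
 * version is a macro that takes n as an lvalue rather than a pointer.
 */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* Hypothetical use: which object holds a byte offset, and where in it. */
static uint64_t object_index(uint64_t offset, uint32_t object_size,
			     uint32_t *offset_in_object)
{
	uint64_t idx = offset;

	*offset_in_object = sketch_do_div(&idx, object_size);
	return idx;
}
```

The in-place update is the part that most often surprises callers: after the call the original variable holds the quotient, so a copy is needed if the undivided value is still wanted.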

16 Aug, 2013 2 commits

ceph: punch hole support · ad7a60de

Authored by Li Wang on Aug 15, 2013

This patch implements fallocate and punch hole support for the Ceph kernel client.
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>

ad7a60de

ceph: introduce i_truncate_mutex · b0d7c223

Authored by Yan, Zheng on Aug 12, 2013

I encountered the deadlock below when running fsstress:

wmtruncate work      truncate                 MDS
---------------  ------------------  --------------------------
                   lock i_mutex
                                      <- truncate file
lock i_mutex (blocked)
                                      <- revoking Fcb (filelock to MIX)
                   send request ->
                                         handle request (xlock filelock)

Initially, there are some dirty pages in the page cache.
When the kclient receives the truncate message, it reduces the inode size
and creates some 'out of i_size' dirty pages. The wmtruncate work can't
truncate these dirty pages because it's blocked by the i_mutex. Later,
when the kclient receives the cap message that revokes Fcb caps, it
can't flush all dirty pages because writepages() only flushes dirty
pages within the inode size.

When the MDS handles the 'truncate' request from the kclient, it waits
for the filelock to become stable. But the filelock is stuck in an
unstable state because it can't finish revoking the kclient's Fcb caps.

The truncate pagecache locking has already caused lots of trouble
for us. I think it's time to simplify it by introducing a new mutex.
We use the new mutex to prevent concurrent truncate_inode_pages().
There is no need to worry about a race between buffered write and
truncate_inode_pages(), because our "get caps" mechanism prevents
them from executing concurrently.
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

b0d7c223

10 Aug, 2013 3 commits

ceph: replace hold_mutex flag with goto · 2f75e9e1

Authored by Sage Weil on Aug 09, 2013

All of the early exit paths need to drop the mutex; it is only the normal
path through the function that does not.  Skip the unlock in that case
with a goto out_unlocked.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Jianpeng Ma <majianpeng@gmail.com>

2f75e9e1
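
The control-flow shape this commit describes, with a shared unlock label for early exits and a second label that skips it, can be sketched as follows. This is an illustration under stated assumptions, not the code from `ceph_aio_write()`: `struct toy_mutex`, `toy_lock()`, `toy_unlock()` and `toy_write()` are hypothetical, and the sketch assumes the normal path has already released the lock before reaching the end.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical lock with a flag so the sketch can check balance. */
struct toy_mutex { int held; };
static void toy_lock(struct toy_mutex *m)   { assert(!m->held); m->held = 1; }
static void toy_unlock(struct toy_mutex *m) { assert(m->held);  m->held = 0; }

/*
 * Early exits jump to 'out', which drops the mutex. The normal path has
 * already dropped it before the slow part, so it jumps past the unlock.
 */
static int toy_write(struct toy_mutex *m, int len, int readonly)
{
	int err;

	toy_lock(m);
	if (len == 0) {
		err = 0;
		goto out;
	}
	if (readonly) {
		err = -EROFS;
		goto out;
	}

	toy_unlock(m); /* normal path: lock released before the slow part */
	/* ... the actual write would happen here, without the mutex ... */
	err = len;
	goto out_unlocked;

out:
	toy_unlock(m);
out_unlocked:
	return err;
}
```

Compared with a `hold_mutex` flag tested at the end, the two labels make it visible at each exit whether the lock is still held.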

ceph: Move the place for EOLDSNAPC handle in ceph_aio_write to easily understand · 0e5dd45c

Authored by majianpeng on Aug 08, 2013

Only ceph_sync_write can get EOLDSNAPC back from the OSD, so move the
related code to just after the call to ceph_sync_write.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>

0e5dd45c

ceph: Don't use ceph-sync-mode for synchronous-fs. · 7ab9b380

Authored by majianpeng on Jun 27, 2013

Sending reads and writes through the sync read/write paths bypasses the
page cache, which is not expected or generally a good idea.  Removing
the write check is safe as there is a conditional vfs_fsync_range() later
in ceph_aio_write that already checks for the same flag (via
IS_SYNC(inode)).
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>

7ab9b380