Commits · a7400222e3eb7d5ce3820d2234905bbeafabd171 · openeuler / Kernel

15 Oct, 2014 · 2 commits

ceph: remove redundant code for max file size verification · a4483e8a

Authored by Chao Yu on Sep 17, 2014

Both ceph_update_writeable_page and ceph_setattr verify the file size
against the max size ceph supports.
There are two callers of ceph_update_writeable_page: ceph_write_begin and
ceph_page_mkwrite. For ceph_write_begin, we have already verified the size in
generic_write_checks of ceph_write_iter; for ceph_page_mkwrite, we have no
chance to change file size when mmap. Likewise we have already verified the size
in inode_change_ok when we call ceph_setattr.
So let's remove the redundant code for max file size verification.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>

a4483e8a

ceph: request xattrs if xattr_version is zero · 508b32d8

Authored by Yan, Zheng on Sep 16, 2014

The following sequence of events can happen.
  - Client releases an inode, queues cap release message.
  - A 'lookup' reply brings the same inode back, but the reply
    doesn't contain xattrs because MDS didn't receive the cap release
    message and thought client already has up-to-date xattrs.

The fix is to force sending a getattr request to MDS if xattrs_version
is 0. The getattr mask is set to CEPH_STAT_CAP_XATTR, so MDS knows client
does not have xattr.

Signed-off-by: Yan, Zheng <zyan@redhat.com>

508b32d8

08 Jun, 2014 · 1 commit

ceph: use truncate_pagecache() instead of truncate_inode_pages() · 4e217b5d

Authored by Yan, Zheng on Jun 08, 2014

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

4e217b5d

07 Jun, 2014 · 1 commit

fs/ceph: replace pr_warning by pr_warn · f3ae1b97

Authored by Fabian Frederick on Jun 06, 2014

Update the last pr_warning callsites in fs branch

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Sage Weil <sage@inktank.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f3ae1b97

06 Jun, 2014 · 4 commits

ceph: remember subtree root dirfrag's auth MDS · 8d08503c

Authored by Yan, Zheng on Apr 18, 2014

Remember dirfrag's auth MDS when it's different from its parent inode's
auth MDS.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

8d08503c

ceph: introduce ceph_fill_fragtree() · 3e7fbe9c

Authored by Yan, Zheng on Apr 18, 2014

Move the code that updates the i_fragtree into a separate function.
Also add a simple probabilistic test to decide whether the i_fragtree
should be updated.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

3e7fbe9c

ceph: pre-allocate ceph_cap struct for ceph_add_cap() · d9df2783

Authored by Yan, Zheng on Apr 18, 2014

So that ceph_add_cap() can be used while i_ceph_lock is locked.
This simplifies the code that handles cap import/export.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

d9df2783

ceph: update inode fields according to issued caps · f98a128a

Authored by Yan, Zheng on Apr 17, 2014

Cap message and request reply from non-auth MDS may carry stale
information (corresponding locks are in LOCK states) even if they
have the newest inode version. So client should update inode fields
according to issued caps.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

f98a128a

29 Apr, 2014 · 1 commit

ceph: clear directory's completeness when creating file · 0a8a70f9

Authored by Yan, Zheng on Apr 14, 2014

When creating a file, ceph_set_dentry_offset() puts the new dentry
at the end of directory's d_subdirs, then sets the dentry's offset
based on directory's max offset. The offset does not reflect the
real position of the dentry in directory. Later readdir reply from
MDS may change the dentry's position/offset. This inconsistency
can cause missing/duplicate entries in readdir result if readdir
is partly satisfied by dcache_readdir().

The fix is to clear directory's completeness after creating/renaming
file. It prevents later readdir from using dcache_readdir().

Fixes: http://tracker.ceph.com/issues/8025
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

0a8a70f9

05 Apr, 2014 · 3 commits

ceph: don't grabs open file reference for aborted request · 48193012

Authored by Yan, Zheng on Apr 01, 2014

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

48193012

ceph: Remove get/set acl on symlinks · 5f75ce57

Authored by Fabian Frederick on Mar 21, 2014

Remove unsupported symlink operations.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>

5f75ce57

ceph: update i_max_size even if inode version does not change · 8c93cd61

Authored by Yan, Zheng on Mar 08, 2014

Handle the following sequence of events:
 - client releases an inode with i_max_size > 0. The release message
   is queued. (is not sent to the auth MDS)
 - a 'lookup' request reply from non-auth MDS returns the same inode.
 - client opens the inode in write mode. The version of inode trace
   in 'open' request reply is equal to the cached inode's version.
 - client requests new max size. The MDS ignores the request because
   it does not affect client's write range

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

8c93cd61

03 Apr, 2014 · 2 commits

ceph: add get_name() NFS export callback · 19913b4e

Authored by Yan, Zheng on Mar 06, 2014

Use the newly introduced LOOKUPNAME MDS request to connect child
inode to its parent directory.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

19913b4e

ceph: do not chain inode updates to parent fsync · 752c8bdc

Authored by Sage Weil on Feb 05, 2013

The fsync(dirfd) only covers namespace operations, not inode updates.
We do not need to cover setattr variants or O_TRUNC.

Reported-by: Al Viro <viro@xeniv.linux.org.uk>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yan, Zheng <zheng.z.yan@intel.com>

752c8bdc

30 Jan, 2014 · 1 commit

ceph: fix posix ACL hooks · 72466d0b

Authored by Sage Weil on Jan 29, 2014

The merge of commit 7221fe4c ("ceph: add acl for cephfs") raced with
upstream changes in the generic POSIX ACL code (eg commit 2aeccbe9
"fs: add generic xattr_acl handlers" and others).

Some of the fallout was fixed in commit 4db658ea ("ceph: Fix up after
semantic merge conflict"), but it was incomplete: the set_acl
inode_operation wasn't getting set, and the prototype needed to be
adjusted a bit (it doesn't take a dentry anymore).

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

72466d0b

29 Jan, 2014 · 1 commit

ceph: Fix up after semantic merge conflict · 4db658ea

Authored by Linus Torvalds on Jan 28, 2014

The previous ceph-client merge resulted in ceph not even building,
because there was a merge conflict that wasn't visible as an actual data
conflict: commit 7221fe4c ("ceph: add acl for cephfs") added support
for POSIX ACL's into Ceph, but unluckily we also had the VFS tree change
a lot of the POSIX ACL helper functions to be much more helpful to
filesystems (see for example commits 2aeccbe9 "fs: add generic
xattr_acl handlers", 5bf3258f "fs: make posix_acl_chmod more useful"
and 37bc1539 "fs: make posix_acl_create more useful")

The reason this conflict wasn't obvious was many-fold: because it was a
semantic conflict rather than a data conflict, it wasn't visible in the
git merge as a conflict.  And because the VFS tree hadn't been in
linux-next, people hadn't become aware of it that way.  And because I
was at jury duty this morning, I was using my laptop and as a result not
doing constant "allmodconfig" builds.

Anyway, this fixes the build and generally removes a fair chunk of the
Ceph POSIX ACL support code, since the improved helpers seem to match
really well for Ceph too.  But I don't actually have any way to *test*
the end result, and I was really hoping for some ACK's for this.  Oh,
well.

Not compiling certainly doesn't make things easier to test, so I'm
committing this without the acks after having waited for four hours...
Plus it's what I would have done for the merge had I noticed the
semantic conflict..

Reported-by: Dave Jones <davej@redhat.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Guangliang Zhao <lucienchao@gmail.com>
Cc: Li Wang <li.wang@ubuntykylin.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4db658ea

21 Jan, 2014 · 2 commits

ceph: add imported caps when handling cap export message · 11df2dfb

Authored by Yan, Zheng on Nov 24, 2013

Version 3 cap export message includes information about the imported
caps. It allows us to add the imported caps if the corresponding cap
import message still hasn't been received.

This allows us to handle the situation where the importer MDS crashes
and the cap import message is missing.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

11df2dfb

ceph: fix cache revoke race · 9563f88c

Authored by Yan, Zheng on Nov 22, 2013

Handle the following sequence of events:

- non-auth MDS revokes Fc cap. queue invalidate work
- auth MDS issues Fc cap through request reply. i_rdcache_gen gets
  increased.
- invalidate work runs. it finds i_rdcache_revoking != i_rdcache_gen,
  so it does nothing.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

9563f88c

01 Jan, 2014 · 1 commit

ceph: add acl for cephfs · 7221fe4c

Authored by Guangliang Zhao on Nov 11, 2013

Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
Reviewed-by: Li Wang <li.wang@ubuntykylin.com>
Reviewed-by: Zheng Yan <zheng.z.yan@intel.com>

7221fe4c

14 Dec, 2013 · 2 commits

ceph: drop unconnected inodes · 9f12bd11

Authored by Yan, Zheng on Sep 20, 2013

A positive dentry and its corresponding inode always come together in an
MDS reply. So there is no need to keep the inode in the cache after
dropping all its aliases.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

9f12bd11

ceph: initialize inode before instantiating dentry · 86b58d13

Authored by Yan, Zheng on Dec 05, 2013

commit b18825a7 (Put a small type field into struct dentry::d_flags)
put a type field into struct dentry::d_flags. __d_instantiate() sets the
field by checking inode->i_mode. So we should initialize the inode before
instantiating the dentry when handling an MDS reply.

Fixes: http://tracker.ceph.com/issues/6930
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

86b58d13

01 Oct, 2013 · 2 commits

ceph: handle frag mismatch between readdir request and reply · 81c6aea5

Authored by Yan, Zheng on Sep 18, 2013

If the client has outdated directory fragment information, it may send a
readdir request for a non-existent directory fragment. In this case, the
MDS finds an approximate directory fragment and sends its contents back to
the client. When receiving a reply with a fragment that is different from
the requested one, the client needs to reset the 'readdir offset'.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

81c6aea5

ceph: remove outdated frag information · 53e879a4

Authored by Yan, Zheng on Sep 18, 2013

If directory fragments change, fill_inode() inserts new frags into
the fragtree, but it does not remove outdated frags from the fragtree.
This patch fixes it.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

53e879a4
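
The behaviour 53e879a4 describes (insert new frags and also drop outdated ones) can be illustrated outside the kernel. This is a minimal Python sketch under stated assumptions, not the fill_inode() code: the fragtree is modeled as a plain dict, and `update_fragtree`, the integer frag ids, and the `split_by` values are hypothetical names chosen for the example.

```python
def update_fragtree(fragtree, reply_frags):
    """Bring a cached fragtree in line with the frags listed in a reply.

    fragtree    -- dict mapping frag id -> split_by (the cached state)
    reply_frags -- dict mapping frag id -> split_by (taken from the reply)
    """
    # Drop cached frags that the reply no longer mentions. Skipping this
    # step is the bug the commit message describes.
    for frag_id in list(fragtree):
        if frag_id not in reply_frags:
            del fragtree[frag_id]

    # Insert new frags and refresh the ones that are already cached.
    for frag_id, split_by in reply_frags.items():
        fragtree[frag_id] = split_by
    return fragtree
```

For example, a cached tree `{1: 1, 2: 2, 3: 1}` updated with a reply `{2: 3, 4: 1}` ends up as `{2: 3, 4: 1}`: frags 1 and 3 are removed, frag 2 is refreshed, and frag 4 is added.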

07 Sep, 2013 · 2 commits

ceph: remove ceph_lookup_inode() · ed284c49

Authored by Yan, Zheng on Sep 02, 2013

commit 6f60f889 (ceph: fix freeing inode vs removing session caps race)
introduced ceph_lookup_inode(). But there is already a ceph_find_inode()
which provides similar function. So remove ceph_lookup_inode(), use
ceph_find_inode() instead.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Alex Elder <alex.elder@linary.org>
Reviewed-by: Sage Weil <sage@inktank.com>

ed284c49

ceph: use fscache as a local presisent cache · 99ccbd22

Authored by Milosz Tanski on Aug 21, 2013

Adding support for fscache to the Ceph filesystem. This would bring it on
par with some of the other network filesystems in Linux (like NFS, AFS, etc...)

In order to mount the filesystem with fscache the 'fsc' mount option must be
passed.

Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>

99ccbd22

16 Aug, 2013 · 1 commit

ceph: introduce i_truncate_mutex · b0d7c223

Authored by Yan, Zheng on Aug 12, 2013

I encountered the deadlock below when running fsstress

vmtruncate work      truncate                 MDS
---------------  ------------------  --------------------------
                   lock i_mutex
                                      <- truncate file
lock i_mutex (blocked)
                                      <- revoking Fcb (filelock to MIX)
                   send request ->
                                         handle request (xlock filelock)

At the initial time, there are some dirty pages in the page cache.
When the kclient receives the truncate message, it reduces inode size
and creates some 'out of i_size' dirty pages. vmtruncate work can't
truncate these dirty pages because it's blocked by the i_mutex. Later
when the kclient receives the cap message that revokes Fcb caps, it
can't flush all dirty pages because writepages() only flushes dirty
pages within the inode size.

When the MDS handles the 'truncate' request from kclient, it waits
for the filelock to become stable. But the filelock is stuck in
unstable state because it can't finish revoking kclient's Fcb caps.

The truncate pagecache locking has already caused lots of trouble
for us. I think it's time to simplify it by introducing a new mutex.
We use the new mutex to prevent concurrent truncate_inode_pages().
There is no need to worry about race between buffered write and
truncate_inode_pages(), because our "get caps" mechanism prevents
them from concurrent execution.

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

b0d7c223

10 Aug, 2013 · 2 commits

ceph: fix freeing inode vs removing session caps race · 6f60f889

Authored by Yan, Zheng on Jul 24, 2013

remove_session_caps() uses iterate_session_caps() to remove caps,
but iterate_session_caps() skips inodes that are being deleted.
So session->s_nr_caps can be non-zero after iterate_session_caps()
returns.

We can fix the issue by waiting until deletions are complete.
__wait_on_freeing_inode() is designed for the job, but it is not
exported, so we use lookup inode function to access it.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

6f60f889

ceph: wake up writer if vmtruncate work get blocked · 85ce127a

Authored by Yan, Zheng on Jul 21, 2013

To write data, the writer first acquires the i_mutex, then tries to get
caps. The writer may sleep while holding the i_mutex. If the MDS revokes
Fb cap in this case, vmtruncate work can't do its job because i_mutex
is locked. We should wake up the writer and let it truncate the pages.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

85ce127a

05 Jul, 2013 · 1 commit

helper for reading ->d_count · 84d08fa8

Authored by Al Viro on Jul 05, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

84d08fa8

04 Jul, 2013 · 1 commit

ceph: fix pending vmtruncate race · b415bf4f

Authored by Yan, Zheng on Jul 02, 2013

The locking order for pending vmtruncate is wrong, it can lead to
the following race:

        write                  vmtruncate work
------------------------    ----------------------
lock i_mutex
check i_truncate_pending   check i_truncate_pending
truncate_inode_pages()     lock i_mutex (blocked)
copy data to page cache
unlock i_mutex
                           truncate_inode_pages()

The fix is to take i_mutex before calling __ceph_do_pending_vmtruncate()

Fixes: http://tracker.ceph.com/issues/5453
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

b415bf4f
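
The ordering b415bf4f asks for (take the lock first, then check the pending flag) can be sketched in a few lines. This is a simplified userspace model in Python, not the kernel code: `Inode`, `write`, and `vmtruncate_work` are hypothetical stand-ins, a `threading.Lock` plays the role of i_mutex, and a `bytearray` plays the role of the page cache.

```python
import threading


class Inode:
    """Toy inode: a lock, a pending-truncate flag, and a fake page cache."""

    def __init__(self, data=b""):
        self.i_mutex = threading.Lock()
        self.i_truncate_pending = 0
        self.i_truncate_size = 0
        self.pages = bytearray(data)


def do_pending_vmtruncate(inode):
    """Apply a pending truncate. The caller must already hold i_mutex."""
    if inode.i_truncate_pending:
        del inode.pages[inode.i_truncate_size:]
        inode.i_truncate_pending = 0


def vmtruncate_work(inode):
    # Lock first, then check the flag. Checking before locking is the
    # ordering that allowed the race shown in the commit message.
    with inode.i_mutex:
        do_pending_vmtruncate(inode)


def write(inode, offset, data):
    with inode.i_mutex:
        do_pending_vmtruncate(inode)
        end = offset + len(data)
        if len(inode.pages) < end:
            inode.pages.extend(b"\0" * (end - len(inode.pages)))
        inode.pages[offset:end] = data
```

Because the flag is only read and cleared under the lock in this model, a worker that runs after a write sees the flag already cleared and leaves the freshly written data alone.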

02 May, 2013 · 4 commits

ceph: fix symlink inode operations · 0b932672

Authored by Yan, Zheng on Apr 07, 2013

Add getattr/setattr and xattr-related methods.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

0b932672

ceph: use i_release_count to indicate dir's completeness · 2f276c51

Authored by Yan, Zheng on Mar 13, 2013

Current ceph code tracks directory's completeness in two places.
ceph_readdir() checks i_release_count to decide if it can set the
I_COMPLETE flag in i_ceph_flags. All other places check the I_COMPLETE
flag. This indirection introduces locking complexity.

This patch adds a new variable i_complete_count to ceph_inode_info.
Set i_release_count's value to it when marking a directory complete.
By comparing the two variables, we know if a directory is complete.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

2f276c51
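
The two-counter scheme described in 2f276c51 can be modeled outside the kernel. The following is a minimal Python sketch, not the kernel code: the class and function names are hypothetical, the initial counter values are an assumption chosen so that a fresh directory starts out incomplete, and atomics and locking are left out.

```python
class DirInode:
    """Toy stand-in for the directory-related part of ceph_inode_info."""

    def __init__(self):
        # Assumed starting values: the counters differ, so the directory
        # is not considered complete until a readdir has finished.
        self.i_release_count = 1
        self.i_complete_count = 0


def dir_clear_complete(d):
    """Invalidate completeness, e.g. after a dentry is dropped."""
    d.i_release_count += 1


def dir_set_complete(d, release_count_seen):
    """Mark complete, using the release count sampled when readdir began."""
    d.i_complete_count = release_count_seen


def dir_is_complete(d):
    """The directory counts as complete only while both counters match."""
    return d.i_complete_count == d.i_release_count
```

If a release happens between sampling the counter and calling `dir_set_complete()`, the stored value no longer matches `i_release_count`, so a stale readdir cannot mark the directory complete in this model.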

ceph: acquire i_mutex in __ceph_do_pending_vmtruncate · 3f99969f

Authored by Yan, Zheng on Mar 01, 2013

Make __ceph_do_pending_vmtruncate() acquire the i_mutex if the caller
does not hold the i_mutex, so ceph_aio_read() can call it safely.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

3f99969f

ceph: use I_COMPLETE inode flag instead of D_COMPLETE flag · a8673d61

Authored by Yan, Zheng on Feb 18, 2013

commit c6ffe100 moved the flag that tracks if the dcache contents
for a directory are complete to dentry. The problem is there are
lots of places that use ceph_dir_{set,clear,test}_complete() while
holding i_ceph_lock, but ceph_dir_{set,clear,test}_complete() may
sleep because they call dput().

This patch basically reverts that commit. For ceph_d_prune(), it's
called with both the dentry to prune and the parent dentry
locked. So it's safe to access the parent dentry's d_inode and
clear I_COMPLETE flag.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

a8673d61

26 Feb, 2013 · 1 commit

ceph: prepopulate inodes only when request is aborted · 79f9f99a

Authored by Sage Weil on Jan 29, 2013

If r_aborted is true, we do not hold the dir i_mutex, and cannot touch
the dcache.  However, we still need to update the inodes with the state
returned by the MDS.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

79f9f99a

12 Feb, 2013 · 2 commits

ceph: Convert kuids and kgids before printing them. · bd2bae6a

Authored by Eric W. Biederman on Jan 31, 2013

Before printing kuid and kgid values, convert them into
the initial user namespace.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

bd2bae6a

ceph: Translate inode uid and gid attributes to/from kuids and kgids. · ab871b90

Authored by Eric W. Biederman on Jan 31, 2013

- In fill_inode() translate uids and gids in the initial user namespace
  into kuids and kgids stored in inode->i_uid and inode->i_gid.

- In ceph_setattr() if they have changed convert inode->i_uid and
  inode->i_gid into initial user namespace uids and gids for
  transmission.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

ab871b90
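
The two directions of translation in ab871b90 can be pictured with a small model. This is a hedged Python sketch, loosely modeled on the kernel helpers make_kuid() and from_kuid(), not a copy of them: `UserNamespace`, the extent tuples, and `INVALID_ID` are assumptions made for the example, and the identity map used for the initial namespace is a simplification.

```python
INVALID_ID = -1  # hypothetical sentinel for "no mapping exists"


class UserNamespace:
    """Toy user namespace holding a list of uid map extents.

    Each extent is (first id inside the namespace,
                    first kernel-global id,
                    number of ids in the range).
    """

    def __init__(self, extents):
        self.extents = list(extents)


# Assumption: the initial namespace maps every id onto itself.
INIT_USER_NS = UserNamespace([(0, 0, 2**32 - 1)])


def make_kuid(ns, uid):
    """Translate a uid as seen inside `ns` into a kernel-global id."""
    for first, lower_first, count in ns.extents:
        if first <= uid < first + count:
            return lower_first + (uid - first)
    return INVALID_ID


def from_kuid(ns, kuid):
    """Translate a kernel-global id back into a uid as seen inside `ns`."""
    for first, lower_first, count in ns.extents:
        if lower_first <= kuid < lower_first + count:
            return first + (kuid - lower_first)
    return INVALID_ID
```

In this model, the fill_inode() direction corresponds to `make_kuid(INIT_USER_NS, uid_from_wire)` and the ceph_setattr() direction to `from_kuid(INIT_USER_NS, kuid)`; with the identity map both return the number unchanged, and the explicit conversion only makes the namespace boundary visible.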

13 Dec, 2012 · 1 commit

ceph: Fix __ceph_do_pending_vmtruncate · a85f50b6

Authored by Yan, Zheng on Nov 19, 2012

We should set i_truncate_pending to 0 after page cache is truncated
to i_truncate_size.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>

a85f50b6

27 Sep, 2012 · 1 commit

ceph: don't abuse d_delete() on failure exits · 2744c171

Authored by Al Viro on Sep 26, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2744c171

22 Aug, 2012 · 1 commit

ceph: tolerate (and warn on) extraneous dentry from mds · 6c5e50fa

Authored by Sage Weil on Aug 21, 2012

If the MDS gives us a dentry and we weren't prepared to handle it,
WARN_ON_ONCE instead of crashing.

Reported-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>

6c5e50fa
