Commits · 2cd698be9a3d3a0f8f3c66814eac34144c31954c · openeuler / raspberrypi-kernel

06 Jun, 2014 4 commits

ceph: handle cap import atomically · 2cd698be

Authored by Yan, Zheng on Apr 18, 2014

Cap import messages are processed by both handle_cap_import() and
handle_cap_grant(). These two functions are not executed in the same
atomic context, so they can race with cap release.

The fix is to make handle_cap_import() not release the i_ceph_lock when
it returns. Let handle_cap_grant() release the lock after it finishes
its job.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

2cd698be
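
The lock hand-off described in this commit can be illustrated with a small user-space sketch. This is not the kernel code: `toy_lock`, `toy_inode`, `import_step()` and `grant_step()` are hypothetical stand-ins, and the "lock" is only a flag that asserts correct pairing. It shows the first step returning with the lock still held and the second step releasing it, so nothing else can run between the two.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock: a flag that checks pairing. */
struct toy_lock { bool held; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = false; }

/* Hypothetical inode state; field names are illustrative only. */
struct toy_inode {
    struct toy_lock lock;
    int caps_added;   /* caps recorded by the import step */
    int issued;       /* caps made visible by the grant step */
};

/* Import step: takes the lock and returns with it still held. */
static void import_step(struct toy_inode *in, int caps)
{
    toy_lock_acquire(&in->lock);
    in->caps_added = caps;
    /* Deliberately no release: the caller's next step owns the unlock. */
}

/* Grant step: must be entered with the lock held; releases it when done. */
static void grant_step(struct toy_inode *in)
{
    assert(in->lock.held);
    in->issued |= in->caps_added;
    toy_lock_release(&in->lock);
}

/* Both steps now run inside one critical section. */
static void handle_import_message(struct toy_inode *in, int caps)
{
    import_step(in, caps);
    grant_step(in);
}
```

Calling `handle_import_message()` on a zero-initialized `toy_inode` leaves `issued` equal to the imported caps and the lock released, with no window between the two steps in which a release path could take the lock.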

ceph: pre-allocate ceph_cap struct for ceph_add_cap() · d9df2783

Authored by Yan, Zheng on Apr 18, 2014

So that ceph_add_cap() can be used while i_ceph_lock is locked.
This simplifies the code that handles cap import/export.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

d9df2783
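
A minimal sketch of the pre-allocation pattern, assuming the general idea is "allocate before taking the lock, then hand the object in". All names here (`toy_cap`, `locked_inode`, `cap_prealloc()`, `add_cap_locked()`, `import_cap()`) are hypothetical, and `calloc()` stands in for an allocation that might sleep in kernel context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical cap object and inode; not the kernel structures. */
struct toy_cap { int mds; int issued; };
struct locked_inode { bool locked; struct toy_cap *cap; };

/* Allocation may block, so it happens before the lock is taken. */
static struct toy_cap *cap_prealloc(void)
{
    return calloc(1, sizeof(struct toy_cap));
}

/*
 * Runs with the lock held and never allocates. If the inode has no
 * cap yet, it consumes *new_cap and sets it to NULL.
 */
static void add_cap_locked(struct locked_inode *in, int mds, int issued,
                           struct toy_cap **new_cap)
{
    assert(in->locked);
    if (!in->cap) {
        in->cap = *new_cap;
        *new_cap = NULL;
    }
    in->cap->mds = mds;
    in->cap->issued |= issued;
}

/* Caller: allocate, lock, add, unlock, then drop the spare if unused. */
static int import_cap(struct locked_inode *in, int mds, int issued)
{
    struct toy_cap *new_cap = cap_prealloc();
    if (!new_cap)
        return -1;
    in->locked = true;                 /* stand-in for taking the lock */
    add_cap_locked(in, mds, issued, &new_cap);
    in->locked = false;                /* stand-in for releasing it */
    free(new_cap);                     /* NULL if it was consumed */
    return 0;
}
```

The trade-off is one possibly wasted allocation per call in exchange for a critical section that cannot block.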

ceph: update inode fields according to issued caps · f98a128a

Authored by Yan, Zheng on Apr 17, 2014

Cap messages and request replies from a non-auth MDS may carry stale
information (the corresponding locks are in LOCK states) even if they
have the newest inode version. So the client should update inode fields
according to issued caps.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

f98a128a

ceph: queue vmtruncate if necessary when handling cap grant/revoke · c6bcda6f

Authored by Yan, Zheng on Apr 11, 2014

A cap grant/revoke message from a non-auth MDS can update the inode's
size and truncate_seq/truncate_size (the message arrives before the
auth MDS's cap trunc message).
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

c6bcda6f

29 Apr, 2014 1 commit

ceph: avoid releasing caps that are being used · fd7b95cd

Authored by Yan, Zheng on Apr 17, 2014

To avoid releasing caps that are being used, encode_inode_release()
should send implemented caps to the MDS.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

fd7b95cd

05 Apr, 2014 2 commits

ceph: set mds_wanted when MDS reply changes a cap to auth cap · d9ffc4f7

Authored by Yan, Zheng on Mar 18, 2014

When adjusting the caps a client wants, the MDS does not record caps
that are not allowed. A non-auth MDS does not record WR caps. So when
an MDS reply changes a non-auth cap to an auth cap, the client needs to
set the cap's mds_wanted according to the reply.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

d9ffc4f7

ceph: make sure write caps are registered with auth MDS · a2550604

Authored by Yan, Zheng on Mar 08, 2014

Only the auth MDS can issue write caps to clients, so don't consider
write caps registered with a non-auth MDS as valid.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

a2550604

21 Jan, 2014 6 commits

ceph: add imported caps when handling cap export message · 11df2dfb

Authored by Yan, Zheng on Nov 24, 2013

The version 3 cap export message includes information about the imported
caps. It allows us to add the imported caps if the corresponding cap
import message still hasn't been received.

This allows us to handle the situation where the importer MDS crashes
and the cap import message is missing.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

11df2dfb

ceph: remove exported caps when handling cap import message · 4ee6a914

Authored by Yan, Zheng on Nov 24, 2013

The version 3 cap import message includes the ID of the exported
caps. It allows us to remove the exported caps if we still haven't
received the corresponding cap export message.

We remove the exported caps because they are stale; keeping them
can compromise consistency.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

4ee6a914

ceph: check inode caps in ceph_d_revalidate · 9215aeea

Authored by Yan, Zheng on Nov 30, 2013

Some inodes in a readdir reply may have no caps. A getattr MDS request
for these inodes can return -ESTALE. The fix is to consider a dentry
that links to an inode with no caps as invalid. An invalid dentry causes
a lookup request to be sent to the MDS, and the MDS will send caps back.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

9215aeea

ceph: fix cache revoke race · 9563f88c

Authored by Yan, Zheng on Nov 22, 2013

Handle the following sequence of events:

- The non-auth MDS revokes the Fc cap; invalidate work is queued.
- The auth MDS issues the Fc cap through a request reply. i_rdcache_gen
  gets increased.
- The invalidate work runs. It finds i_rdcache_revoking != i_rdcache_gen,
  so it does nothing.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

9563f88c

ceph: use ceph_seq_cmp() to compare migrate_seq · d1b87809

Authored by Yan, Zheng on Nov 13, 2013

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

d1b87809
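
This commit has no message body, so the following is only a sketch of why a dedicated comparison helper matters for a 32-bit sequence number. It assumes ceph_seq_cmp() is a wraparound-aware comparison; the helper below is a hypothetical `seq_cmp_sketch()` and does not reproduce the kernel definition.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Wraparound-aware comparison of two 32-bit sequence numbers.
 * Returns <0 if a is older than b, 0 if equal, >0 if a is newer,
 * provided the two values are less than 2^31 apart.
 * Hypothetical helper; not the kernel's ceph_seq_cmp().
 */
static int seq_cmp_sketch(uint32_t a, uint32_t b)
{
    /*
     * Unsigned subtraction wraps modulo 2^32. Converting the result
     * to int32_t reads it as a signed distance (this conversion is
     * implementation-defined in C, but behaves this way on common
     * two's-complement compilers).
     */
    int32_t d = (int32_t)(a - b);
    return (d > 0) - (d < 0);
}

/* A plain '>' gets the order wrong once the counter wraps. */
static int naive_newer(uint32_t a, uint32_t b)
{
    return a > b;
}
```

With a = 2 and b = 0xFFFFFFFE (a counter that wrapped four steps ago), `seq_cmp_sketch()` reports a as newer, while `naive_newer()` reports it as older.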

ceph: handle cap export race in try_flush_caps() · 4fe59789

Authored by Yan, Zheng on Oct 31, 2013

The auth cap may change after releasing the i_ceph_lock.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

4fe59789

17 Jan, 2014 1 commit

ceph: trivial comment fix · fc12c80a

Authored by J. Bruce Fields on Jan 16, 2014

"disconnected" is too easily confused with "DCACHE_DISCONNECTED".  I
think "unhashed" is the more precise term here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Sage Weil <sage@inktank.com>

fc12c80a

01 Jan, 2014 1 commit

ceph: add acl for cephfs · 7221fe4c

Authored by Guangliang Zhao on Nov 11, 2013

Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
Reviewed-by: Li Wang <li.wang@ubuntykylin.com>
Reviewed-by: Zheng Yan <zheng.z.yan@intel.com>

7221fe4c

24 Nov, 2013 2 commits

ceph: handle race between cap reconnect and cap release · 99a9c273

Authored by Yan, Zheng on Sep 22, 2013

When a cap gets released while composing the cap reconnect message,
we should skip queuing the release message if the cap hasn't been
added to the cap reconnect message.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

99a9c273

ceph: queue cap release in __ceph_remove_cap() · a096b09a

Authored by Yan, Zheng on Sep 22, 2013

Call __queue_cap_release() in __ceph_remove_cap(); this avoids
acquiring s_cap_lock twice.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

a096b09a

07 Sep, 2013 2 commits

ceph: use d_invalidate() to invalidate aliases · a8d436f0

Authored by Yan, Zheng on Sep 02, 2013

d_invalidate() is the standard VFS method to invalidate a dentry.
Compared to d_delete(), it also tries shrinking children dentries.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

a8d436f0

ceph: use fscache as a local persistent cache · 99ccbd22

Authored by Milosz Tanski on Aug 21, 2013

Adding support for fscache to the Ceph filesystem. This would bring it on
par with some of the other network filesystems in Linux (like NFS, AFS, etc.).

In order to mount the filesystem with fscache, the 'fsc' mount option must be
passed.
Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>

99ccbd22

28 Aug, 2013 1 commit

ceph: remove useless variable revoked_rdcache · e9075743

Authored by Li Wang on Aug 15, 2013

Cleanup in handle_cap_grant().
Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Reviewed-by: Sage Weil <sage@inktank.com>

e9075743

16 Aug, 2013 2 commits

ceph: fix request max size · 3871cbb9

Authored by Yan, Zheng on Aug 05, 2013

ceph_check_caps() requests a new max size only when there is an Fw cap.
If we call check_max_size() while there is no Fw cap, it updates
i_wanted_max_size and calls ceph_check_caps(), but ceph_check_caps()
does nothing. Later, when the Fw cap is issued, we call check_max_size()
again. But i_wanted_max_size is equal to 'endoff' at this time, so
check_max_size() doesn't call ceph_check_caps() and we end up
waiting for the new max size forever.

The fix is to duplicate ceph_check_caps()'s "request max size" code in
check_max_size(), and make try_get_cap_refs() wait for the Fw cap
before retrying the request for a new max size.

This patch also removes the "endoff > (inode->i_size << 1)" check
in check_max_size(). It's useless because there is no corresponding
logic in ceph_check_caps().
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

3871cbb9

ceph: introduce i_truncate_mutex · b0d7c223

Authored by Yan, Zheng on Aug 12, 2013

I encountered the deadlock below when running fsstress:

vmtruncate work      truncate                 MDS
---------------  ------------------  --------------------------
                   lock i_mutex
                                      <- truncate file
lock i_mutex (blocked)
                                      <- revoking Fcb (filelock to MIX)
                   send request ->
                                         handle request (xlock filelock)

Initially, there are some dirty pages in the page cache.
When the kclient receives the truncate message, it reduces the inode size
and creates some 'out of i_size' dirty pages. The vmtruncate work can't
truncate these dirty pages because it's blocked by the i_mutex. Later,
when the kclient receives the cap message that revokes Fcb caps, it
can't flush all dirty pages because writepages() only flushes dirty
pages within the inode size.

When the MDS handles the 'truncate' request from the kclient, it waits
for the filelock to become stable. But the filelock is stuck in an
unstable state because it can't finish revoking the kclient's Fcb caps.

The truncate pagecache locking has already caused lots of trouble
for us. I think it's time to simplify it by introducing a new mutex.
We use the new mutex to prevent concurrent truncate_inode_pages().
There is no need to worry about a race between buffered write and
truncate_inode_pages(), because our "get caps" mechanism prevents
them from executing concurrently.
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

b0d7c223

10 Aug, 2013 1 commit

ceph: trim deleted inode · ca20c991

Authored by Yan, Zheng on Jul 21, 2013

The MDS uses a caps message to notify clients about a deleted inode.
When receiving such a message, invalidate any alias of the inode.
This makes the kernel release the inode ASAP.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

ca20c991

04 Jul, 2013 7 commits

ceph: fix race between cap issue and revoke · 6ee6b953

Authored by Yan, Zheng on Jul 02, 2013

If we receive new caps from the auth MDS and the non-auth MDS is
revoking the newly issued caps, we should release the caps from
the non-auth MDS. The scenario is that the filelock's state changes
from SYNC to LOCK: the non-auth MDS revokes the Fc cap, and the client
gets the Fc cap from the auth MDS at the same time.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

6ee6b953

ceph: fix cap revoke race · b1530f57

Authored by Yan, Zheng on Jul 02, 2013

If caps are being revoked by the auth MDS, don't consider them as
issued even if they are still issued by a non-auth MDS. The non-auth
MDS should also be revoking/exporting these caps; the client just
hasn't received the cap revoke/export message.

The race I encountered is: when caps are being exported to a new MDS,
the client receives the cap import message and cap revoke message from
the new MDS, then receives the cap export message from the old MDS. When
the client receives the cap revoke message from the new MDS, the caps
being revoked are still issued by the old MDS, so the client does nothing.
Later, when the cap export message is received, the client removes
the caps issued by the old MDS. (Another way to fix the race is
calling ceph_check_caps() in handle_cap_export().)
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

b1530f57
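
The rule in this commit ("caps being revoked by the auth MDS are not issued, even if a non-auth MDS still issues them") can be sketched as bitmask arithmetic. This is an illustration under assumptions, not the kernel implementation: `mds_cap_sketch` and `caps_issued_sketch()` are hypothetical, and "caps being revoked" is modeled as bits present in `implemented` but no longer in `issued`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-MDS cap record; field meanings are assumptions. */
struct mds_cap_sketch {
    unsigned issued;       /* caps this MDS currently grants */
    unsigned implemented;  /* issued caps plus caps still being revoked */
};

/*
 * Union of the caps issued by every MDS, minus whatever the auth MDS
 * (at index 'auth') is in the middle of revoking.
 */
static unsigned caps_issued_sketch(const struct mds_cap_sketch *caps,
                                   size_t n, size_t auth)
{
    unsigned have = 0;
    for (size_t i = 0; i < n; i++)
        have |= caps[i].issued;

    /* Bits the auth MDS has withdrawn but the client has not dropped yet. */
    unsigned revoking = caps[auth].implemented & ~caps[auth].issued;
    return have & ~revoking;
}
```

For example, if the auth MDS has reduced its grant from 0x3 to 0x1 while a non-auth MDS still lists 0x3, the sketch reports only 0x1 as issued, which matches the behavior the commit message asks for.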

ceph: fix pending vmtruncate race · b415bf4f

Authored by Yan, Zheng on Jul 02, 2013

The locking order for pending vmtruncate is wrong; it can lead to
the following race:

        write                  vmtruncate work
------------------------    ----------------------
lock i_mutex
check i_truncate_pending   check i_truncate_pending
truncate_inode_pages()     lock i_mutex (blocked)
copy data to page cache
unlock i_mutex
                           truncate_inode_pages()

The fix is to take i_mutex before calling __ceph_do_pending_vmtruncate().

Fixes: http://tracker.ceph.com/issues/5453
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

b415bf4f

ceph: Reconstruct the func ceph_reserve_caps. · 93faca6e

Authored by majianpeng on Jun 26, 2013

Drop ignored return value.  Fix allocation failure case to not leak.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>

93faca6e

ceph: move inode to proper flushing list when auth MDS changes · 005c4697

Authored by Yan, Zheng on May 31, 2013

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

005c4697

ceph: check migrate seq before changing auth cap · b8c2f3ae

Authored by Yan, Zheng on May 31, 2013

We may receive an old request reply from the exporter MDS after receiving
the importer MDS's cap import message.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

b8c2f3ae

ceph: fix cap release race · bb137f84

Authored by Yan, Zheng on Jun 03, 2013

ceph_encode_inode_release() can race with ceph_open() and release
caps wanted by open files. So it should call __ceph_caps_wanted()
to get the wanted caps.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

bb137f84

02 May, 2013 5 commits

ceph: take i_mutex before getting Fw cap · 37505d57

Authored by Yan, Zheng on Apr 12, 2013

There is a deadlock, as illustrated below. The fix is to take i_mutex
before getting the Fw cap reference.

      write                    truncate                 MDS
---------------------     --------------------      --------------
get Fw cap
                          lock i_mutex
lock i_mutex (blocked)
                          request setattr.size  ->
                                                <-   revoke Fw cap
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

37505d57

ceph: use i_release_count to indicate dir's completeness · 2f276c51

Authored by Yan, Zheng on Mar 13, 2013

The current ceph code tracks a directory's completeness in two places.
ceph_readdir() checks i_release_count to decide if it can set the
I_COMPLETE flag in i_ceph_flags. All other places check the I_COMPLETE
flag. This indirection introduces locking complexity.

This patch adds a new variable i_complete_count to ceph_inode_info.
Set i_release_count's value to it when marking a directory complete.
By comparing the two variables, we know if a directory is complete.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

2f276c51
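
The two-counter scheme from this commit message can be sketched in a few lines of user-space C. The struct and function names below are hypothetical, and the initial values (release count 1, complete count 0, so a new directory starts out incomplete) are an assumption made for the sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two counters named in the commit message. */
struct dir_state_sketch {
    long release_count;   /* bumped whenever cached entries may be dropped */
    long complete_count;  /* copy of release_count taken when marked complete */
};

/* Assumed initial state: counters differ, so the directory is incomplete. */
static void dir_init(struct dir_state_sketch *d)
{
    d->release_count = 1;
    d->complete_count = 0;
}

/* Marking complete records the current release count. */
static void dir_set_complete(struct dir_state_sketch *d)
{
    d->complete_count = d->release_count;
}

/* Any release invalidates completeness by moving the counter on. */
static void dir_clear_complete(struct dir_state_sketch *d)
{
    d->release_count++;
}

/* Complete if nothing has been released since it was marked complete. */
static bool dir_is_complete(const struct dir_state_sketch *d)
{
    return d->complete_count == d->release_count;
}
```

Because completeness is derived from a comparison instead of a separately maintained flag, clearing it is a single increment and there is no flag that can fall out of step with the counter.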

ceph: use I_COMPLETE inode flag instead of D_COMPLETE flag · a8673d61

Authored by Yan, Zheng on Feb 18, 2013

Commit c6ffe100 moved the flag that tracks whether the dcache contents
for a directory are complete to the dentry. The problem is that there
are lots of places that use ceph_dir_{set,clear,test}_complete() while
holding i_ceph_lock, but ceph_dir_{set,clear,test}_complete() may
sleep because they call dput().

This patch basically reverts that commit. ceph_d_prune() is called
with both the dentry to prune and the parent dentry locked, so it's
safe to access the parent dentry's d_inode and clear the I_COMPLETE
flag.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

a8673d61

ceph: set mds_want according to cap import message · 964266cc

Authored by Yan, Zheng on Feb 27, 2013

The MDS ignores a cap update message if migrate_seq mismatches, so when
receiving a cap import message with a higher migrate_seq, set mds_want
according to the cap import message.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

964266cc

ceph: queue cap release when trimming cap · d40ee0dc

Authored by Yan, Zheng on Feb 18, 2013

So the client will later send a cap release message to the MDS.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Greg Farnum <greg@inktank.com>

d40ee0dc

12 Feb, 2013 2 commits

ceph: Convert kuids and kgids before printing them. · bd2bae6a

Authored by Eric W. Biederman on Jan 31, 2013

Before printing kuid and kgid values, convert them into
the initial user namespace.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

bd2bae6a

ceph: Translate between uid and gids in cap messages and kuids and kgids · 05cb11c1

Authored by Eric W. Biederman on Jan 31, 2013

- Make the uid and gid arguments of send_cap_msg() used to compose
  ceph_mds_caps messages of type kuid_t and kgid_t.

- Pass inode->i_uid and inode->i_gid in __send_cap to send_cap_msg()
  through variables of type kuid_t and kgid_t.

- Modify struct ceph_cap_snap to store uids and gids in types kuid_t
  and kgid_t.  This allows capturing inode->i_uid and inode->i_gid in
  ceph_queue_cap_snap() without loss and passing them to
  __ceph_flush_snaps() where they are removed from struct
  ceph_cap_snap and passed to send_cap_msg().

- In handle_cap_grant translate uid and gids in the initial user
  namespace stored in struct ceph_mds_cap into kuids and kgids
  before setting inode->i_uid and inode->i_gid.

Cc: Sage Weil <sage@inktank.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

05cb11c1

18 Jan, 2013 3 commits

ceph: check mds_wanted for imported cap · 390306c3

Authored by Yan, Zheng on Jan 04, 2013

The MDS may have incorrect wanted caps after importing caps. So the
client should check the value the MDS has and send a cap update if
necessary.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

390306c3

ceph: allocate cap_release message when receiving cap import · 66f58691

Authored by Yan, Zheng on Jan 04, 2013

When a client wants to release an imported cap, it's possible there
is no reserved cap_release message in the corresponding MDS session,
so __queue_cap_release() causes a kernel panic.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

66f58691

ceph: allow revoking duplicated caps issued by non-auth MDS · 395c312b

Authored by Yan, Zheng on Jan 04, 2013

Allow revoking duplicated caps issued by a non-auth MDS if these caps
are also issued by the auth MDS.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>

395c312b