Commits · 4c32f5dda5ffe23687a55da1538b7cc426710d1a · openeuler / Kernel

21 Oct, 2010 1 commit

ceph: factor out libceph from Ceph file system · 3d14c5d2

Committed by Yehuda Sadeh on Apr 06, 2010

This factors out protocol and low-level storage parts of ceph into a
separate libceph module living in net/ceph and include/linux/ceph.  This
is mostly a matter of moving files around.  However, a few key pieces
of the interface change as well:

 - ceph_client becomes ceph_fs_client and ceph_client, where the latter
   captures the mon and osd clients, and the fs_client gets the mds client
   and file system specific pieces.
 - Mount option parsing and debugfs setup are correspondingly broken into
   two pieces.
 - The mon client gets a generic handler callback for otherwise unknown
   messages (mds map, in this case).
 - The basic supported/required feature bits can be expanded (and are by
   ceph_fs_client).

No functional change, aside from some subtle error handling cases that got
cleaned up in the refactoring process.
Signed-off-by: Sage Weil <sage@newdream.net>

3d14c5d2

07 Oct, 2010 2 commits

ceph: update issue_seq on cap grant · d91f2438

Committed by Sage Weil on Sep 22, 2010

We need to update the issue_seq on any grant operation, be it via an MDS
reply or a separate grant message.  The update in the grant path was
missing.  This broke cap release for inodes in which the MDS sent an
explicit grant message that was not soon followed by a successful
MDS reply on the same inode.

Also fix the signedness on seq locals.
Signed-off-by: Sage Weil <sage@newdream.net>

d91f2438

ceph: send cap release message early on failed revoke. · 21b559de

Committed by Greg Farnum on Oct 06, 2010

If an MDS tries to revoke caps that we don't have, we want to send
releases early since they probably contain the caps message the MDS
is looking for.

Previously, we only sent the messages if we didn't have the inode either. But
in a multi-mds system we can retain the inode after dropping all caps for
a single MDS.
Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

21b559de

18 Sep, 2010 1 commit

ceph: check mapping to determine if FILE_CACHE cap is used · a43fb731

Committed by Sage Weil on Sep 17, 2010

See if the i_data mapping has any pages to determine if the FILE_CACHE
capability is currently in use, instead of assuming it is any time the
rdcache_gen value is set (i.e., issued -> used).

This allows the MDS RECALL_STATE process to work for inodes that have cached
pages.
Signed-off-by: Sage Weil <sage@newdream.net>

a43fb731

17 Sep, 2010 1 commit

ceph: only send one flushsnap per cap_snap per mds session · e835124c

Committed by Sage Weil on Sep 17, 2010

Sending multiple flushsnap messages is problematic because we ignore
the response if the tid doesn't match, and the server may only respond to
each one once.  It's also a waste.

So, skip cap_snaps that are already on the flushing list, unless the caller
tells us to resend (because we are reconnecting).
Signed-off-by: Sage Weil <sage@newdream.net>

e835124c
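
The skip rule this commit describes can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct cap_snap_model`, its `on_flushing_list` field, and `should_send_flushsnap()` are hypothetical names chosen for illustration and are not the kernel's actual structures or functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a cap_snap; not the kernel structure. */
struct cap_snap_model {
	bool on_flushing_list;	/* a FLUSHSNAP was already sent for this entry */
};

/*
 * Decide whether a FLUSHSNAP should be sent for this cap_snap.
 * Entries already on the flushing list are skipped unless the caller
 * explicitly asks for a resend (for example, after a reconnect).
 */
static bool should_send_flushsnap(const struct cap_snap_model *cs, bool resend)
{
	if (cs->on_flushing_list && !resend)
		return false;
	return true;
}
```

In this model, an entry that is already flushing is sent again only when `resend` is true, which mirrors the "unless the caller tells us to resend" wording of the commit message.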

15 Sep, 2010 1 commit

ceph: stop sending FLUSHSNAPs when we hit a dirty capsnap · cfc0bf66

Committed by Sage Weil on Sep 14, 2010

Stop sending FLUSHSNAP messages when we hit a capsnap that has dirty_pages
or is still writing.  We'll send the newer capsnaps only after the older
ones complete.
Signed-off-by: Sage Weil <sage@newdream.net>

cfc0bf66
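
The ordering rule above (older capsnaps must complete before newer ones are sent) can be sketched as a loop that stops at the first entry that is not ready. The sketch assumes the capsnaps are held oldest-first in an array; `struct capsnap_model` and `count_flushable()` are hypothetical names, not the kernel's list-based implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a capsnap; field names are illustrative only. */
struct capsnap_model {
	int dirty_pages;	/* pages not yet written back */
	bool writing;		/* a write is still in progress */
};

/*
 * Given capsnaps ordered oldest first, return how many leading entries
 * may have a FLUSHSNAP sent now.  Scanning stops at the first entry
 * that still has dirty pages or is still being written, so newer
 * entries wait until the older ones complete.
 */
static size_t count_flushable(const struct capsnap_model *snaps, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (snaps[i].dirty_pages > 0 || snaps[i].writing)
			break;
	}
	return i;
}
```

Note that a clean capsnap sitting behind a dirty one is deliberately not counted; that is the point of stopping rather than skipping.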

25 Aug, 2010 1 commit

ceph: maintain i_head_snapc when any caps are dirty, not just for data · 7d8cb26d

Committed by Sage Weil on Aug 24, 2010

We used to use i_head_snapc to keep track of which snapc the current epoch
of dirty data was dirtied under.  It is used by queue_cap_snap to set up
the cap_snap.  However, since we queue cap snaps for any dirty caps, not
just for dirty file data, we need to keep a valid i_head_snapc anytime
we have dirty|flushing caps.  This fixes a NULL pointer deref in
queue_cap_snap when writing back dirty caps without data (e.g.,
snaptest-authwb.sh).
Signed-off-by: Sage Weil <sage@newdream.net>

7d8cb26d

23 Aug, 2010 2 commits

ceph: include dirty xattrs state in snapped caps · 4a625be4

Committed by Sage Weil on Aug 22, 2010

When we snapshot dirty metadata that needs to be written back to the MDS,
include dirty xattr metadata.  Make the capsnap reference the encoded
xattr blob so that it will be written back in the FLUSHSNAP op.

Also fix the capsnap creation guard to include dirty auth or file bits,
not just tests specific to dirty file data or file writes in progress
(this fixes auth metadata writeback).
Signed-off-by: Sage Weil <sage@newdream.net>

4a625be4

ceph: fix xattr cap writeback · 082afec9

Committed by Sage Weil on Aug 22, 2010

We should include the xattr metadata blob in the cap update message any
time we are flushing dirty state, NOT just when we are also dropping the
cap.  This fixes async xattr writeback.

Also, clean up the code slightly to avoid duplicating the bit test.
Signed-off-by: Sage Weil <sage@newdream.net>

082afec9

06 Aug, 2010 1 commit

ceph: only queue async writeback on cap revocation if there is dirty data · 0eb6cd49

Committed by Sage Weil on Aug 05, 2010

Normally, if the Fb cap bit is being revoked, we queue an async writeback.
If there is no dirty data but we still hold the cap, this leaves the
client sitting around doing nothing until the cap timeouts expire and the
cap is released on its own (as it would have been without the revocation).

Instead, only queue writeback if the bit is actually used (i.e., we have
dirty data). If not, we can reply to the revocation immediately.
Signed-off-by: Sage Weil <sage@newdream.net>

0eb6cd49
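
The decision described in this commit reduces to a single branch on whether the Fb cap is actually in use. The following is a minimal sketch of that branch only; `enum revoke_action` and `on_fb_revoke()` are hypothetical names, and the real code path involves considerably more state than one boolean.

```c
#include <stdbool.h>

/* Hypothetical outcome of handling a revocation of the Fb cap bit. */
enum revoke_action {
	REVOKE_QUEUE_WRITEBACK,	/* dirty data exists: flush it first */
	REVOKE_ACK_NOW		/* nothing dirty: reply immediately */
};

/*
 * Choose how to respond when the Fb bit is being revoked.
 * Writeback is queued only if there is dirty data to write; otherwise
 * the revocation can be acknowledged right away.
 */
static enum revoke_action on_fb_revoke(bool have_dirty_data)
{
	return have_dirty_data ? REVOKE_QUEUE_WRITEBACK : REVOKE_ACK_NOW;
}
```

The earlier behavior, as the message describes it, corresponds to always taking the writeback branch, which left the client idle when there was nothing to write.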

03 Aug, 2010 1 commit

ceph: support v2 client_caps encoding · ce1fbc8d

Committed by Sage Weil on Aug 02, 2010

Add support for v2 encoding of MClientCaps, which includes a flock blob.
Signed-off-by: Sage Weil <sage@newdream.net>

ce1fbc8d

02 Aug, 2010 10 commits

ceph: warn on missing snap realm · b8cd07e7

Committed by Sage Weil on Jul 16, 2010

Well, this Shouldn't Happen, so it would be helpful to know the caller when
it does.
Signed-off-by: Sage Weil <sage@newdream.net>

b8cd07e7

ceph: add ceph_get_cap_for_mds function. · 2bc50259

Committed by Greg Farnum on Jun 30, 2010

Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

2bc50259

ceph: connect to export targets on cap export · 154f42c2

Committed by Sage Weil on Jun 21, 2010

When we get a cap EXPORT message, make sure we are connected to all export
targets to ensure we can handle the matching IMPORT.
Signed-off-by: Sage Weil <sage@newdream.net>

154f42c2

ceph: do caps accounting per mds_client · 37151668

Committed by Yehuda Sadeh on Jun 17, 2010

Caps related accounting is now being done per mds client instead
of just being global. This lays the groundwork for a later revision
of the caps preallocated reservation list.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

37151668

ceph: code cleanup · cd84db6e

Committed by Yehuda Sadeh on Jun 11, 2010

Mainly fixing minor issues reported by sparse.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

cd84db6e

ceph: skip if no auth cap in flush_snaps · ca81f3f6

Committed by Sage Weil on Jun 10, 2010

If we have a capsnap but no auth cap (e.g. because it is migrating to
another mds), bail out and do nothing for now.  Do NOT remove the capsnap
from the flush list.
Signed-off-by: Sage Weil <sage@newdream.net>

ca81f3f6

ceph: simplify caps revocation, fix for multimds · 3b454c49

Committed by Sage Weil on Jun 10, 2010

The caps revocation should either initiate writeback, invalidation, or
call check_caps to ack or do the dirty work.  The primary question is
whether we can get away with only checking the auth cap or whether all
caps need to be checked.

The old code was doing...something else.  At the very least, revocations
from non-auth MDSs could break by triggering the "check auth cap only"
case.
Signed-off-by: Sage Weil <sage@newdream.net>

3b454c49

ceph: drop unused argument · ee6b272b

Committed by Sage Weil on Jun 10, 2010

Signed-off-by: Sage Weil <sage@newdream.net>

ee6b272b

ceph: perform lazy reads when file mode and caps permit · 2962507c

Committed by Sage Weil on May 27, 2010

If the file mode is marked as "lazy," perform cached/buffered reads when
the caps permit it.  Adjust the rdcache_gen and invalidation logic
accordingly so that we manage our cache based on the FILE_CACHE -or-
FILE_LAZYIO cap bits.
Signed-off-by: Sage Weil <sage@newdream.net>

2962507c

ceph: perform lazy writes when file mode and caps permit · 33caad32

Committed by Sage Weil on May 26, 2010

If we have marked a file as "lazy" (using the ceph ioctl), perform buffered
writes when the MDS caps allow it.
Signed-off-by: Sage Weil <sage@newdream.net>

33caad32

28 Jul, 2010 1 commit

ceph: use complete_all and wake_up_all · 03066f23

Committed by Yehuda Sadeh on Jul 27, 2010

This fixes an issue triggered by running concurrent syncs. One of the syncs
would go through while the other would just hang indefinitely. In any case, we
never actually want to wake a single waiter, so the *_all functions should
be used.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

03066f23

24 Jul, 2010 1 commit

ceph: fix dentry lease release · 1dadcce3

Committed by Sage Weil on Jul 23, 2010

When we embed a dentry lease release notification in a request, invalidate
our lease so we don't think we still have it.  Otherwise we can get all
sorts of incorrect client behavior when multiple clients are interacting
with the same part of the namespace.
Signed-off-by: Sage Weil <sage@newdream.net>

1dadcce3

30 Jun, 2010 2 commits

ceph: fix caps usage accounting for import (non-reserved) case · 443b3760

Committed by Sage Weil on Jun 29, 2010

We need to increase the total and used counters when allocating a new cap
in the non-reserved (cap import) case.
Signed-off-by: Sage Weil <sage@newdream.net>

443b3760

ceph: only release clean, unused caps with mds requests · ec97f88b

Committed by Sage Weil on Jun 24, 2010

We can drop caps with an mds request.  Ensure we only drop unused AND
clean caps, since the MDS doesn't support cap writeback in that context,
nor do we track it.  If caps are dirty, and the MDS needs them back,
it will revoke and we will flush in the normal fashion.

This fixes a possible loss of metadata.
Signed-off-by: Sage Weil <sage@newdream.net>

ec97f88b
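
The "unused AND clean" condition is naturally expressed as a bitmask operation. The sketch below assumes caps are tracked as bit flags in an unsigned integer; `caps_safe_to_release()` and its parameter names are hypothetical and are not taken from the kernel source.

```c
/*
 * Hypothetical helper: from the set of caps a request would like to
 * drop, keep only those that are neither in use nor dirty.
 *
 *   drop  - caps the request is willing to release
 *   used  - caps currently in use by the client
 *   dirty - caps with state not yet written back to the MDS
 */
static unsigned int caps_safe_to_release(unsigned int drop,
					 unsigned int used,
					 unsigned int dirty)
{
	return drop & ~(used | dirty);
}
```

With illustrative values `drop = 0x7`, `used = 0x1`, and `dirty = 0x2`, only bit `0x4` remains releasable; the dirty bit stays held until it is flushed through the normal revoke path.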

11 Jun, 2010 3 commits

ceph: try to send partial cap release on cap message on missing inode · 2b2300d6

Committed by Sage Weil on Jun 09, 2010

If we have enough memory to allocate a new cap release message, do so, so
that we can send a partial release message immediately.  This keeps us from
making the MDS wait when the cap release it needs is in a partially full
release message.

If we fail because of ENOMEM, oh well, they'll just have to wait a bit
longer.
Signed-off-by: Sage Weil <sage@newdream.net>

2b2300d6

ceph: release cap on import if we don't have the inode · 3d7ded4d

Committed by Sage Weil on Jun 09, 2010

If we get an IMPORT that gives us a cap, but we don't have the inode, queue
a release (and try to send it immediately) so that the MDS doesn't get
stuck waiting for us.
Signed-off-by: Sage Weil <sage@newdream.net>

3d7ded4d

ceph: fix misleading/incorrect debug message · 9dbd412f

Committed by Sage Weil on Jun 10, 2010

Nothing is released here: the caps message is simply ignored in this case.
Signed-off-by: Sage Weil <sage@newdream.net>

9dbd412f

28 May, 2010 1 commit

drop unused dentry argument to ->fsync · 7ea80859

Committed by Christoph Hellwig on May 26, 2010

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

7ea80859

18 May, 2010 5 commits

ceph: all allocation functions should get gfp_mask · 34d23762

Committed by Yehuda Sadeh on Apr 06, 2010

This is essential, as for the rados block device we'll need
to run in different contexts that would need flags that
are other than GFP_NOFS.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

34d23762

ceph: cleanup: remove unused assignement · a5ee751c

Committed by Dan Carpenter on May 07, 2010

We don't ever use "dirty" so we can remove it.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

a5ee751c

ceph: simplify ceph_msg_new · bb257664

Committed by Sage Weil on Apr 01, 2010

We only need to pass in front_len.  Callers can attach any other payload
pieces (middle, data) as they see fit.
Signed-off-by: Sage Weil <sage@newdream.net>

bb257664

ceph: make ceph_msg_new return NULL on failure; clean up, fix callers · a79832f2

Committed by Sage Weil on Apr 01, 2010

Returning ERR_PTR(-ENOMEM) is useless extra work. Return NULL on failure
instead, and fix up the callers (about half of which were wrong anyway).
Signed-off-by: Sage Weil <sage@newdream.net>

a79832f2
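
One way callers of an error-pointer API can go wrong is by testing the result against NULL. The commit message does not say how the callers were wrong, so the following is an assumption used for illustration only. It is a userspace model: `model_err_ptr()`, `model_is_err()`, and the two allocator stubs are hypothetical names that imitate the idea behind the kernel's `ERR_PTR()`/`IS_ERR()` without being the kernel's definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (userspace imitation). */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the range reserved for encoded errors. */
static bool model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-MODEL_MAX_ERRNO);
}

static int dummy_msg;	/* placeholder for a successfully allocated message */

/* Old style: failure is reported as an encoded error pointer. */
static void *alloc_old_style(bool fail)
{
	return fail ? model_err_ptr(-ENOMEM) : &dummy_msg;
}

/* New style: failure is reported as NULL. */
static void *alloc_new_style(bool fail)
{
	return fail ? NULL : &dummy_msg;
}
```

In this model a failed `alloc_old_style()` call returns a non-NULL pointer, so a caller that checks only `if (!msg)` would treat the failure as success. When the only possible error is out-of-memory, returning NULL removes that trap and lets every caller use the same simple check.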

ceph: use ceph_sb_to_client instead of ceph_client · 640ef79d

Committed by Cheng Renquan on Mar 26, 2010

ceph_sb_to_client and ceph_client are really identical, so we need to drop
one.  The function name ceph_client is easily confused with
"struct ceph_client", while ceph_sb_to_client's definition is clearer, so
we'd better switch all callers to ceph_sb_to_client.

  -static inline struct ceph_client *ceph_client(struct super_block *sb)
  -{
  -	return sb->s_fs_info;
  -}
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

640ef79d

12 May, 2010 1 commit

ceph: fix cap removal races · f818a736

Committed by Sage Weil on May 11, 2010

The iterate_session_caps helper traverses the session caps list and tries
to grab an inode reference.  However, the __ceph_remove_cap was clearing
the inode backpointer _before_ removing itself from the session list,
causing a null pointer dereference.

Clear cap->ci under protection of s_cap_lock to avoid the race, and to
tightly couple the list and backpointer state.  Use a local flag to
indicate whether we are releasing the cap, as cap->session may be modified
by a racing thread in iterate_session_caps.
Signed-off-by: Sage Weil <sage@newdream.net>

f818a736

04 May, 2010 1 commit

ceph: fix leaked spinlock during mds reconnect · 0b0c06d1

Committed by Sage Weil on Apr 20, 2010

Signed-off-by: Sage Weil <sage@newdream.net>

0b0c06d1

02 Apr, 2010 1 commit

ceph: fix leaked inode ref due to snap metadata writeback race · 819ccbfa

Committed by Sage Weil on Apr 01, 2010

We create a ceph_cap_snap if there is dirty cap metadata (for writeback to
mds) OR dirty pages (for writeback to osd). It is thus possible that the
metadata has been written back to the MDS but the OSD data has not when
the cap_snap is created. This results in a cap_snap with dirty(caps) == 0.
The problem is that cap writeback to the MDS isn't necessary, and a
FLUSHSNAP cap op gets no ack from the MDS. This leaves the cap_snap
attached to the inode along with its inode reference.

Fix the problem by dropping the cap_snap if it becomes 'complete' (all
pages written out) and dirty(caps) == 0 in ceph_put_wrbuffer_cap_refs().

Also, BUG() in __ceph_flush_snaps() if we encounter a cap_snap with
dirty(caps) == 0.
Signed-off-by: Sage Weil <sage@newdream.net>

819ccbfa

30 Mar, 2010 1 commit

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

Committed by Tejun Heo on Mar 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

The percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

23 Mar, 2010 2 commits

ceph: only release unused caps with mds requests · 916623da

Committed by Sage Weil on Mar 16, 2010

We were releasing used caps (e.g. FILE_CACHE) from encode_inode_release
with MDS requests (e.g. setattr).  We don't carry refs on most caps, so
this code worked most of the time, but for setattr (utimes) we try to
drop Fscr.

This causes cap state to get slightly out of sync with reality, and may
result in subsequent mds revoke messages getting ignored.

Fix by only releasing unused caps.
Signed-off-by: Sage Weil <sage@newdream.net>

916623da

ceph: clean up handle_cap_grant, handle_caps wrt session mutex · 15637c8b

Committed by Sage Weil on Mar 16, 2010

Drop session mutex unconditionally in handle_cap_grant, and do the
check_caps from the handle_cap_grant helper.  This avoids using a magic
return value.

Also avoid using a flag variable in the IMPORT case and call
check_caps at the appropriate point.
Signed-off-by: Sage Weil <sage@newdream.net>

15637c8b

openeuler / Kernel synced successfully about 1 year ago