Commits · 8031049147c58d9d8b6226c3ac31a9d72d053e25 · openeuler / Kernel

12 Feb, 2010 14 commits

ceph: remove bogus invalidate_mapping_pages · 80310491

Committed by Sage Weil on Feb 09, 2010

We were invalidating mapping pages when dropping FILE_CACHE in
__send_cap().  But ceph_check_caps attempts to invalidate already, and
also checks for success, so we should never get to this point.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: invalidate pages even if truncate is pending · 0840d8af

Committed by Sage Weil on Feb 09, 2010

There is no reason not to invalidate pages when a truncate is pending.
Both throw out page cache pages.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: cleanup async writeback, truncation, invalidate helpers · 3c6f6b79

Committed by Sage Weil on Feb 09, 2010

Grab inode ref in helper.  Make work functions static, with consistent
naming.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix sync read eof check deadlock · 6a026589

Committed by Sage Weil on Feb 09, 2010

If a sync read gets a short result from the OSD, it may need to do a
getattr to see if it is short due to reaching end-of-file. The getattr
was being done while holding a reference to FILE_RD, which can lead to
a deadlock if the MDS is revoking that capability bit and can't process
the getattr until it does.

We fix this by setting a flag if EOF size validation is needed, and doing
the getattr in ceph_aio_read, after the RD cap ref is dropped. If the
read needs to be continued, we loop and continue traversing the file.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: do not retain caps that are being revoked · 68c28323

Committed by Sage Weil on Feb 09, 2010

Never retain caps in __send_cap() that are being revoked.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: cap revocation fixes · cbd03635

Committed by Sage Weil on Feb 09, 2010

Try to invalidate pages in ceph_check_caps() if FILE_CACHE is being
revoked. If we fail, queue an immediate async invalidate if FILE_CACHE
is being revoked. (If it's not being revoked, we just queue the caps
for later evaluation, as per the old behavior.)
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: sync read/write considers page cache · 29065a51

Committed by Yehuda Sadeh on Feb 09, 2010

In the cases where we either do a sync read or a write, we
need to make sure that everything in the page cache is flushed.
In the case of a sync write we invalidate the relevant pages,
so that subsequent read/write reflects the new data written.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix truncation when not holding caps · 3d497d85

Committed by Yehuda Sadeh on Feb 09, 2010

A truncation should occur when either we have the
specified caps for the file, or (in cases where we are
not the only ones referencing the file) when it is mapped
or when it is opened. The latter two cases were not
handled.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: refactor ceph_write_begin, fix ceph_page_mkwrite · 4af6b225

Committed by Yehuda Sadeh on Feb 09, 2010

Originally ceph_page_mkwrite called ceph_write_begin, hoping that
the returned locked page would be the page that it was requested
to mkwrite. Factored out relevant part of ceph_page_mkwrite and
we lock the right page anyway.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix short synchronous reads · 972f0d3a

Committed by Yehuda Sadeh on Feb 04, 2010

Zeroing of holes was not done correctly: page_off was miscalculated and
zeroing the tail did not adjust the 'read' value to include the zeroed
portion.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
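As a rough illustration of the behavior described in this message (not the kernel code itself), the sketch below zero-fills the gap between a short read result and end-of-file, and includes the zeroed bytes in the returned count. The function name and parameters are hypothetical.

```python
def finish_short_read(buf: bytearray, got: int, pos: int, file_size: int) -> int:
    """Zero-fill the part of buf that lies before EOF but was not returned.

    buf:       destination buffer; buf[:got] already holds data from the read
    got:       bytes actually returned by the (hypothetical) OSD read
    pos:       file offset at which the read started
    file_size: file size as currently known to the client
    Returns the byte count to report, including any zeroed bytes.
    """
    want = len(buf)
    # Bytes between pos and EOF that the caller is entitled to see.
    avail = max(0, min(want, file_size - pos))
    if got < avail:
        # The result was short but we are not at EOF: the gap is a hole
        # and must read back as zeros.
        buf[got:avail] = bytes(avail - got)
        return avail
    return got
```

For example, an 8-byte read at offset 0 of a 6-byte file that returns only 3 bytes would report 6 bytes, the last 3 of them zero.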

ceph: add uid field to ceph_pg_pool · 02f90c61

Committed by Sage Weil on Feb 04, 2010

Also verify encoding version as we go.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: put unused osd connections on lru · f5a2041b

Committed by Yehuda Sadeh on Feb 03, 2010

Instead of removing osd connection immediately when the
requests list is empty, put the osd connection on an lru.
Only if that osd has not been used for more than a specified
time, will it be removed.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
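The idle-LRU idea in this message can be sketched as follows. This is a simplified model under assumed names (`IdleOsdLru`, `mark_idle`, `mark_busy`, `expire` are all hypothetical), and it assumes the caller passes non-decreasing timestamps.

```python
from collections import OrderedDict


class IdleOsdLru:
    """Keeps idle OSD connections around until they exceed an idle timeout."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self._idle = OrderedDict()  # osd id -> time it became idle, oldest first

    def mark_idle(self, osd_id, now):
        # The request list is empty: park the connection instead of closing it.
        self._idle.pop(osd_id, None)
        self._idle[osd_id] = now

    def mark_busy(self, osd_id):
        # A new request arrived: the connection is in use again.
        self._idle.pop(osd_id, None)

    def expire(self, now):
        """Remove and return OSDs idle for more than idle_timeout."""
        removed = []
        while self._idle:
            osd_id, since = next(iter(self._idle.items()))
            if now - since <= self.idle_timeout:
                break  # everything after this entry became idle later
            del self._idle[osd_id]
            removed.append(osd_id)
        return removed
```

The benefit of this design is that a connection reused shortly after going idle does not pay the cost of being torn down and re-established.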

ceph: remove unused variable · b056c876

Committed by Yehuda Sadeh on Feb 03, 2010

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: add support for auth_x authentication protocol · ec0994e4

Committed by Sage Weil on Feb 02, 2010

The auth_x protocol implements support for a kerberos-like mutual
authentication infrastructure used by Ceph.  We do not simply use vanilla
kerberos because of scalability and performance issues when dealing with
a large cluster of nodes providing a single logical service.

Auth_x provides mutual authentication of client and server and protects
against replay and man in the middle attacks.  It does not encrypt
the full session over the wire, however, so data payload may still be
snooped.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

11 Feb, 2010 4 commits

ceph: add struct version to auth encoding · 07c8739c

Committed by Sage Weil on Feb 04, 2010

Include struct version in encoding. This will streamline future protocol
changes.
Signed-off-by: Sage Weil <sage@newdream.net>
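To illustrate why a leading struct version helps with later protocol changes, here is a minimal sketch. The one-byte version prefix, the `key_id` field, and both function names are assumptions made for the example, not the actual Ceph auth encoding.

```python
import struct

STRUCT_VERSION = 1  # hypothetical current version of this encoding


def encode_record(key_id: int) -> bytes:
    # Version byte first, then the payload (here a single u64, little-endian).
    return struct.pack("<BQ", STRUCT_VERSION, key_id)


def decode_record(raw: bytes) -> int:
    (version,) = struct.unpack_from("<B", raw, 0)
    if version != STRUCT_VERSION:
        # A decoder can reject (or branch on) versions it does not understand
        # instead of misreading the bytes that follow.
        raise ValueError("unsupported struct version %d" % version)
    (key_id,) = struct.unpack_from("<Q", raw, 1)
    return key_id
```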

ceph: allow renewal of auth credentials · 9bd2e6f8

Committed by Sage Weil on Feb 02, 2010

Add infrastructure to allow the mon_client to periodically renew its auth
credentials.  Also add a messenger callback that will force such a renewal
if a peer rejects our authenticator.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: aes crypto and base64 encode/decode helpers · 8b6e4f2d

Committed by Sage Weil on Feb 02, 2010

Helpers to encrypt/decrypt AES and base64.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: buffer decoding helpers · c7e337d6

Committed by Sage Weil on Feb 02, 2010

Helper for decoding into a ceph_buffer, and other misc decoding helpers
we will need.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

03 Feb, 2010 2 commits

ceph: release all pages after successful osd write response · 79788c69

Committed by Sage Weil on Feb 02, 2010

We release all the pages, even if the osd response was
different than the number of pages written. This could only
happen due to truncation that arrives at the osd in a
different order, for which we want the pages released anyway.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: always send truncation info with read and write osd ops · 0c948992

Committed by Yehuda Sadeh on Feb 01, 2010

This fixes a bug where the read/write ops arrive at the osd after
a following truncation request.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

30 Jan, 2010 2 commits

ceph: remove unreachable code · 0f26c4b2

Committed by Yehuda Sadeh on Jan 29, 2010

We never truncate to a smaller size without contacting the MDS.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: include type in ceph_entity_addr, filepath · ac8839d7

Committed by Sage Weil on Jan 27, 2010

Include a type/version in ceph_entity_addr and filepath.  Include extra
byte in filepath encoding as necessary.
Signed-off-by: Sage Weil <sage@newdream.net>

26 Jan, 2010 8 commits

ceph: precede encoded ceph_pg_pool struct with version · 361be860

Committed by Sage Weil on Jan 25, 2010

Signed-off-by: Sage Weil <sage@newdream.net>

ceph: keep reserved replies on the request structure · 0d59ab81

Committed by Yehuda Sadeh on Jan 13, 2010

This includes treating all the data preallocation and revocation
at the same place, not having to have a special case for
the reserved pages.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

ceph: alloc message data pages and check if tid exists · 0547a9b3

Committed by Yehuda Sadeh on Jan 11, 2010

Now doing it in the same callback that is also responsible for
allocating the 'front' part of the message. If we get a message
that we haven't got a corresponding tid for, mark it for skipping.

Moving the mutex unlock/lock from the osd alloc_msg callback
to the calling function in the messenger.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

ceph: refactor messages data section allocation · 9d7f0f13

Committed by Yehuda Sadeh on Jan 11, 2010

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

ceph: allocate middle of message before starting to read · 2450418c

Committed by Yehuda Sadeh on Jan 08, 2010

Both front and middle parts of the message are now being
allocated at the ceph_alloc_msg().
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

ceph: properly handle aborted mds requests · 5b1daecd

Committed by Sage Weil on Jan 25, 2010

Previously, if the MDS request was interrupted, we would unregister the
request and ignore any reply. This could cause the caps or other cache
state to become out of sync. (For instance, aborting dbench and doing
rm -r on clients would complain about a non-empty directory because the
client didn't realize its aborted file create request had completed.)

Even if we don't unregister, we still can't process the reply normally because
we are no longer holding the caller's locks (like the dir i_mutex).

So, mark aborted operations with r_aborted, and in the reply handler, be
sure to process all the caps. Do not process the namespace changes,
though, since we no longer will hold the dir i_mutex. The dentry lease
state can also be ignored as it's more forgiving.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: mark MDS CREATE as a write op · 3ea25f94

Committed by Sage Weil on Jan 25, 2010

CEPH_MDS_OP_CREATE was not correctly marked as a write operation.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: remove duplicate variable initialization · ec7384ec

Committed by Julia Lawall on Jan 20, 2010

The variable client is initialized twice to the same (side effect-free)
expression.  Drop one initialization.

A simplified version of the semantic match that finds this problem is:
(http://coccinelle.lip6.fr/)

// <smpl>
@forall@
idexpression *x;
identifier f!=ERR_PTR;
@@

x = f(...)
... when != x
(
x = f(...,<+...x...+>,...)
|
* x = f(...)
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Sage Weil <sage@newdream.net>

15 Jan, 2010 3 commits

ceph: display pgid in debugfs osd request dump · 7740a42f

Committed by Sage Weil on Jan 08, 2010

Signed-off-by: Sage Weil <sage@newdream.net>

ceph: remove unused erank field · 103e2d3a

Committed by Sage Weil on Jan 07, 2010

The ceph_entity_addr erank field is obsolete; remove it.  Get rid of
trivial addr comparison helpers while we're at it.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: change dentry offset and position after splice_dentry · 4baa75ef

Committed by Yehuda Sadeh on Jan 07, 2010

This fixes a bug where the parent list had dentries with
offsets that are not monotonically increasing, which caused the ceph
dcache_readdir to skip entries.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

07 Jan, 2010 1 commit

ceph: fix copy_user_to_page_vector() · 6a4ef481

Committed by Yehuda Sadeh on Dec 31, 2009

The function was broken in the case where there was more than one page
involved, which broke the ceph sync_write case.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

24 Dec, 2009 6 commits

ceph: use ceph_pagelist for mds reconnect message; change encoding (protocol change) · 93cea5be

Committed by Sage Weil on Dec 23, 2009

Use the ceph_pagelist to encode the MDS reconnect message. We change the
message encoding (protocol change!) at the same time to make our life
easier (we don't know how many snaprealms we have when we start encoding).

An empty message implies the session is closed/does not exist.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: support ceph_pagelist for message payload · 58bb3b37

Committed by Sage Weil on Dec 23, 2009

The ceph_pagelist is a simple list of whole pages, strung together via
their lru list_head.  It facilitates encoding to a "buffer" of unknown
size.  Allow its use in place of the ceph_msg page vector.

This will be used to fix the huge buffer preallocation woes of MDS
reconnection.
Signed-off-by: Sage Weil <sage@newdream.net>
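A simplified model of the pagelist idea described in this message: data of unknown total size is appended into a growing list of fixed-size pages, so no large buffer has to be preallocated. The `PageList` class and the 4096-byte page size are assumptions for illustration; the kernel structure links real pages through their lru list_head.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PageList:
    """Append-only encoding target made of whole, fixed-size pages."""

    def __init__(self):
        self.pages = []   # list of bytearray(PAGE_SIZE)
        self.length = 0   # total bytes appended so far

    def append(self, data: bytes) -> None:
        view = memoryview(data)
        while len(view):
            off = self.length % PAGE_SIZE
            if off == 0:
                # The last page is full (or there is none yet): add a page.
                self.pages.append(bytearray(PAGE_SIZE))
            n = min(PAGE_SIZE - off, len(view))
            self.pages[-1][off:off + n] = view[:n]
            self.length += n
            view = view[n:]
```

An encoder can call `append` repeatedly without knowing in advance how many records it will emit, which is the situation the message describes for MDS reconnect.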

ceph: add feature bits to connection handshake (protocol change) · 04a419f9

Committed by Sage Weil on Dec 23, 2009

Define supported and required feature set.  Fail connection if the server
requires features we do not support (TAG_FEATURES), or if the server does
not support features we require.
Signed-off-by: Sage Weil <sage@newdream.net>
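The two failure conditions in this message reduce to a pair of bitmask checks. The sketch below uses made-up feature bits (`FEATURE_A`, `FEATURE_B`) and a hypothetical `check_features` helper; it does not reproduce the real handshake or real Ceph feature values.

```python
FEATURE_A = 1 << 0  # hypothetical feature bits, not real Ceph values
FEATURE_B = 1 << 1


def check_features(local_supported, local_required, peer_supported, peer_required):
    """Return (ok, reason) for a connection handshake."""
    # The peer requires something we do not support.
    missing_here = peer_required & ~local_supported
    if missing_here:
        return False, "peer requires unsupported features %#x" % missing_here
    # We require something the peer does not support.
    missing_there = local_required & ~peer_supported
    if missing_there:
        return False, "peer lacks required features %#x" % missing_there
    return True, ""
```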

ceph: include transaction id in ceph_msg_header (protocol change) · 6df058c0

Committed by Sage Weil on Dec 22, 2009

Many (most?) message types include a transaction id.  By including it in
the fixed size header, we always have it available even when we are unable
to allocate memory for the (larger, variable sized) message body.  This
will allow us to error out the appropriate request instead of (silently)
dropping the reply.
Signed-off-by: Sage Weil <sage@newdream.net>
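The reasoning in this message can be shown with a toy receiver: because the transaction id sits in the fixed-size header, the matching request can be failed even when the body cannot be buffered. The header layout (`tid`, `type`, `body_len`) and the `handle_incoming` helper are hypothetical and much simpler than the real ceph_msg_header; a size limit stands in for an allocation failure.

```python
import struct

# Hypothetical fixed-size header: u64 tid, u16 type, u32 body length.
HEADER = struct.Struct("<QHI")


def handle_incoming(raw: bytes, pending: dict, max_body: int):
    """Parse one message; fail the matching request if the body is too large."""
    tid, msg_type, body_len = HEADER.unpack_from(raw, 0)
    if body_len > max_body:
        # Stand-in for "unable to allocate memory for the message body".
        # The tid from the header still lets us error out the right request.
        if tid in pending:
            pending[tid] = "error: reply dropped, body not allocated"
        return None
    body = raw[HEADER.size:HEADER.size + body_len]
    if tid in pending:
        pending[tid] = "done"
    return body
```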

ceph: more informative msgpool errors · 0cf90ab5

Committed by Sage Weil on Dec 22, 2009

Signed-off-by: Sage Weil <sage@newdream.net>

ceph: control access to page vector for incoming data · 350b1c32

Committed by Sage Weil on Dec 22, 2009

When we issue an OSD read, we specify a vector of pages that the data is to
be read into. The request may be sent multiple times, to multiple OSDs, if
the osdmap changes, which means we can get more than one reply.

Only read data into the page vector if the reply is coming from the
OSD we last sent the request to. Keep track of which connection is using
the vector by taking a reference. If another connection was already
using the vector before and a new reply comes in on the right connection,
revoke the pages from the other connection.
Signed-off-by: Sage Weil <sage@newdream.net>
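The access rule described in this message can be modeled roughly as below. `ReadRequest`, `send`, and `claim_pages` are hypothetical names; the real code tracks connections by taking references rather than comparing identifiers.

```python
class ReadRequest:
    """Hypothetical stand-in for an OSD read request and its page vector."""

    def __init__(self, tid):
        self.tid = tid
        self.sent_to = None  # OSD the request was most recently sent to
        self.reader = None   # connection currently allowed to fill the pages

    def send(self, osd):
        # May be called again if the osdmap changes and the request is resent.
        self.sent_to = osd

    def claim_pages(self, from_osd):
        """Return (granted, revoked_from) for a reply arriving from from_osd."""
        if from_osd != self.sent_to:
            # Reply from an OSD we are no longer targeting: do not read data.
            return False, None
        revoked = None
        if self.reader is not None and self.reader != from_osd:
            # Another connection held the pages earlier: take them back.
            revoked = self.reader
        self.reader = from_osd
        return True, revoked
```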

openeuler / Kernel · synced successfully more than 1 year ago
