Commits · 47474d0b011bb385719e91a60bb9ff7649d66526 · openeuler / Kernel

02 Apr, 2018 · 40 commits

ceph: optimize mds session register · 47474d0b

Authored by Chengguang Xu on Mar 13, 2018

Do the memory allocation first, so as to avoid unnecessary
initialization of the newly allocated session in the error case.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph, ceph: add __init attribution to init funcitons · 57a35dfb

Authored by Chengguang Xu on Mar 10, 2018

Add the __init attribute to functions that are called only once
during initialization/registration, and delete unnecessary
symbol exports.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: filter out used flags when printing unused open flags · 51b10f3f

Authored by Chengguang Xu on Mar 09, 2018

Filter out used access mode flags when printing unused open flags.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: don't wait on writeback when there is no more dirty pages · 1582af2e

Authored by Yan, Zheng on Mar 06, 2018

In sync mode, writepages() needs to write all dirty pages. But
it can only write dirty pages associated with the oldest snapc.
To write dirty pages associated with the next snapc, it needs to wait
until the current writes complete.

If there are no more dirty pages, writepages() should not wait on
writeback. Otherwise, dirty page writeback becomes very slow.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: invalidate pages that beyond EOF in ceph_writepages_start() · af9cc401

Authored by Yan, Zheng on Mar 04, 2018

Dirty pages can be associated with different capsnaps. Different capsnaps
may have different EOF values. So invalidating dirty pages according to
the largest EOF value is wrong. Dirty pages beyond EOF, but associated
with other capsnaps, do not get invalidated.
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: mark the cap cache as unreclaimable · bc4b5ad3

Authored by Chengguang Xu on Feb 27, 2018

Releasing caps is affected by many factors (e.g., avail_count/reserve_count/min_count),
and min_count can be set to a large value via a client mount option. Hence it's better
to mark the cap cache as unreclaimable, to avoid non-trivial discrepancies between memory
shown as reclaimable and what is actually reclaimed.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: change variable name to follow common rule · 73737682

Authored by Chengguang Xu on Feb 28, 2018

Variable name ci is mostly used for ceph_inode_info.
Variable name fi is mostly used for ceph_file_info.
Variable name cf is mostly used for ceph_cap_flush.

Change variable names to follow the above common rules
to avoid confusion.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: optimizing cap reservation · 79cd674a

Authored by Chengguang Xu on Feb 24, 2018

When caps_avail_count is low, most newly
trimmed caps will probably go into ->caps_list and
caps_avail_count will be increased. Hence after trimming,
recheck caps_avail_count to effectively reuse
newly trimmed caps. Also, when releasing unnecessary
caps, follow the same rule as ceph_put_cap.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: release unreserved caps if having enough available caps · b517c1d8

Authored by Chengguang Xu on Feb 25, 2018

When unreserving caps, check whether there are too many available caps
in ->caps_list; if so, release the unreserved caps.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: optimizing cap allocation · e327ce06

Authored by Chengguang Xu on Feb 24, 2018

When caps_min_count is set to a large value or there are many
unreserved caps, unused caps may stay in ->caps_list indefinitely,
even when a new cap can't be obtained from kmem_cache_alloc, because
caps_avail_count has no upper limit. Hence reuse caps in
->caps_list if available; this is probably better than setting a
maximum for caps_avail_count and releasing unused caps when
the limit is reached.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: adding protection for showing cap reservation info · b884014a

Authored by Chengguang Xu on Feb 23, 2018

Add spinlock protection when reading cap reservation
related fields so that the numbers match the BUG_ON condition
below, taken from the code.

BUG_ON(mdsc->caps_total_count != mdsc->caps_use_count +
				 mdsc->caps_reserve_count +
				 mdsc->caps_avail_count);
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
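
The idea in the commit above (read all related counters under one lock so the reported numbers are mutually consistent) can be illustrated with a small userspace sketch. This is not the kernel code: `struct cap_counters`, `cap_counters_init()` and `cap_counters_read()` are hypothetical names, and a C11 `atomic_flag` stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the cap counters kept in mdsc. */
struct cap_counters {
	atomic_flag lock;	/* plays the role of the kernel spinlock */
	int total, use, reserve, avail;
};

/* Plain copy of the counters, taken at a single point in time. */
struct cap_snapshot {
	int total, use, reserve, avail;
};

static void cap_lock(struct cap_counters *c)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
}

static void cap_unlock(struct cap_counters *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

static void cap_counters_init(struct cap_counters *c, int use, int reserve,
			      int avail)
{
	atomic_flag_clear(&c->lock);
	c->use = use;
	c->reserve = reserve;
	c->avail = avail;
	c->total = use + reserve + avail;
}

/* Copy every field while holding the lock so they describe one state. */
static struct cap_snapshot cap_counters_read(struct cap_counters *c)
{
	struct cap_snapshot s;

	cap_lock(c);
	s.total = c->total;
	s.use = c->use;
	s.reserve = c->reserve;
	s.avail = c->avail;
	cap_unlock(c);

	/* Mirrors the BUG_ON invariant quoted in the commit message. */
	assert(s.total == s.use + s.reserve + s.avail);
	return s;
}
```

Without the lock, a writer could update one counter between two of the reads, and the printed values might not add up even though the invariant holds at every instant.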

libceph: adding missing message types to ceph_msg_type_name() · f2f87877

Authored by Chengguang Xu on Feb 22, 2018

Some message types are missing in ceph_msg_type_name();
add them so that the output is easier to understand.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: get the latest osdmap when using an existing client · dd435855

Authored by Ilya Dryomov on Feb 22, 2018

Currently we request the latest osdmap only if ceph_pg_poolid_by_name()
fails with -ENOENT.  This is effective with newly created pools, but we
also want to avoid attempting to map from pools that were recently
deleted and report "pool does not exist" instead.  (Such an attempt
eventually fails in the OSD client after map check code kicks in, but
the error message is confusing.)

Request the latest osdmap unconditionally after bumping a ref on an
existing client in rbd_client_find().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: move rbd_get_client() below rbd_put_client() · 5feb0d8d

Authored by Ilya Dryomov on Feb 22, 2018

... to avoid a forward declaration in the next commit.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: remove redundant declaration of rbd_spec_put() · 0a4a1e68

Authored by Ilya Dryomov on Feb 12, 2018

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: use seq_show_option for string type options · 4d8969af

Authored by Chengguang Xu on Feb 15, 2018

Use seq_show_option instead of seq_printf for string type options.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: fix misjudgement of maximum monitor number · 7377324e

Authored by Chengguang Xu on Feb 11, 2018

num_mon should allow up to CEPH_MAX_MON in ceph_monmap_decode().
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
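
The commit message above describes a boundary condition: a monitor count equal to the maximum should be accepted. A minimal sketch of such an inclusive check follows. `EXAMPLE_MAX_MON` and `num_mon_ok()` are hypothetical names, the value 31 is chosen only for illustration, and the actual check in ceph_monmap_decode() is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limit used only for this illustration. */
#define EXAMPLE_MAX_MON 31

static_assert(EXAMPLE_MAX_MON > 0, "limit must be positive");

/*
 * Inclusive upper bound: exactly EXAMPLE_MAX_MON monitors is still valid.
 * An exclusive comparison (num_mon < EXAMPLE_MAX_MON) would wrongly
 * reject a map that uses every available slot.
 */
static bool num_mon_ok(unsigned int num_mon)
{
	return num_mon <= EXAMPLE_MAX_MON;
}
```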

libceph, ceph: change permission for readonly debugfs entries · 11e1478d

Authored by Chengguang Xu on Feb 10, 2018

Remove write permission from debugfs entries that only provide
read-only functionality.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: keep consistent semantic in fscache related option combination · 7ae7a828

Authored by Chengguang Xu on Feb 07, 2018

When multiple fscache related options are specified, the result doesn't
always follow the option order. This fix makes the result strictly
consistent with the order.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

ceph: add newline to end of debug message format · 4c069a58

Authored by Chengguang Xu on Jan 30, 2018

Some dout formats do not end with a newline. Fix this for
the files in the fs/ceph and net/ceph directories,
and change printk to dout for printing debug info in super.c.
Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: allow "fancy" striping · b1331852

Authored by Ilya Dryomov on Feb 07, 2018

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jason Dillaman <dillaman@redhat.com>

rbd: introduce OWN_BVECS data type · afb97888

Authored by Ilya Dryomov on Feb 06, 2018

If the layout is "fancy", we need to be able to rearrange the provided
bio_vecs in stripe unit chunks to make it possible for the messenger to
read/write directly from/to the provided data buffer, without employing
a temporary data buffer for assembling the result.

Higher level bio_vec arrays are generally immutable, so this requires
copying into a private array. Only the bio_vecs themselves are shuffled
around, not the actual data. OWN_BVECS doesn't own any pages.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: remove rbd_parent_request_{create,destroy}() · e93aca0a

Authored by Ilya Dryomov on Feb 06, 2018

rbd_parent_request_create() takes a ref on obj_req for child_img_req.
There is no point in doing that because child_img_req is created on
behalf of obj_req -- obj_req is the initiator and can't be completed
before child_img_req.

Open-code the rest of rbd_parent_request_create() and remove it along
with rbd_parent_request_destroy().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: get rid of img_req->{offset,length} · dfd9875f

Authored by Ilya Dryomov on Feb 06, 2018

These are set, but no longer used.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: remove rbd_img_request_fill() and helpers · 0420c5dd

Authored by Ilya Dryomov on Feb 06, 2018

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: switch to common striping framework · 5a237819

Authored by Ilya Dryomov on Feb 06, 2018

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: create+truncate for whole-object layered discards · 2bb1e56e

Authored by Ilya Dryomov on Feb 06, 2018

A whole-object layered discard is implemented as a truncate rather
than a delete: a dummy object is needed to prevent the CoW machinery
from kicking in.  However, a truncate on a non-existent object is
a no-op.  If the object doesn't exist in HEAD, a discard request is
effectively ignored, which violates our "discard zeroes data" promise
and breaks REQ_OP_WRITE_ZEROES implementation.

A non-exclusive create on an existing object is also a no-op, so the
fix is to do a compound create+truncate instead.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: move to obj_req->img_extents · 86bd7998

Authored by Ilya Dryomov on Feb 06, 2018

In preparation for rbd "fancy" striping, replace obj_req->img_offset
with obj_req->img_extents. A single starting offset isn't sufficient
because we want only one OSD request per object and will merge adjacent
object extents in ceph_file_to_extents(). The final object extent may
map into multiple different byte ranges in the image.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
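
To see why one object extent can correspond to several image byte ranges, it helps to work through a striped layout by hand. The sketch below assumes the usual Ceph layout parameters (stripe unit, stripe count, object size) and maps an image offset to an object number and an offset within that object. `struct layout`, `struct obj_extent` and `map_extent()` are hypothetical names for illustration; this is not the kernel's ceph_calc_file_object_mapping() or ceph_file_to_extents().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout description: all sizes in bytes. */
struct layout {
	uint32_t stripe_unit;	/* bytes written to one object before moving on */
	uint32_t stripe_count;	/* number of objects in an object set */
	uint32_t object_size;	/* must be a multiple of stripe_unit */
};

/* Where a piece of the image lands: object number, offset, length. */
struct obj_extent {
	uint64_t objno;
	uint64_t off;
	uint32_t len;
};

/*
 * Map the start of the image range [off, off + len) to an object extent.
 * The returned length is clipped to the end of the current stripe unit,
 * so a caller would loop to cover a longer range.
 */
static struct obj_extent map_extent(const struct layout *l, uint64_t off,
				    uint64_t len)
{
	assert(l->stripe_unit > 0 && l->stripe_count > 0);
	assert(l->object_size % l->stripe_unit == 0);

	uint32_t su_per_obj = l->object_size / l->stripe_unit;
	uint64_t blockno = off / l->stripe_unit;	/* which stripe unit */
	uint32_t blockoff = off % l->stripe_unit;	/* offset inside it */
	uint64_t stripeno = blockno / l->stripe_count;	/* which stripe (row) */
	uint32_t stripepos = blockno % l->stripe_count;	/* which object (column) */
	uint64_t objsetno = stripeno / su_per_obj;	/* which object set */
	uint64_t room = l->stripe_unit - blockoff;	/* bytes left in the unit */
	struct obj_extent ex;

	ex.objno = objsetno * l->stripe_count + stripepos;
	ex.off = (stripeno % su_per_obj) * l->stripe_unit + blockoff;
	ex.len = (uint32_t)(len < room ? len : room);
	return ex;
}
```

With a stripe unit of 4, a stripe count of 2 and an object size of 8, image bytes 0..3 and 8..11 both land in object 0, at object offsets 0..3 and 4..7. Merged into a single object extent of length 8, they still refer to two separate byte ranges in the image, which is the situation the commit message describes.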

rbd: incorporate ceph_object_extent · 43df3d35

Authored by Ilya Dryomov on Feb 02, 2018

obj_req->object_no -> obj_req->ex.oe_objno
obj_req->offset -> obj_req->ex.oe_off
obj_req->length -> obj_req->ex.oe_len

... and use ex for linking object requests to image requests.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph, ceph: move ceph_calc_file_object_mapping() to striper.c · 08c1ac50

Authored by Ilya Dryomov on Feb 17, 2018

ceph_calc_file_object_mapping() has nothing to do with osdmaps.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

libceph: striping framework implementation · ed0811d2

Authored by Ilya Dryomov on Feb 02, 2018

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: store data_type in img_req instead of obj_req · ecc633ca

Authored by Ilya Dryomov on Feb 01, 2018

All object requests are associated with an image request now -- avoid
duplicating the same info in each object request.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: remove obj_req->flags field · 0be2d60e

Authored by Ilya Dryomov on Feb 01, 2018

There are no standalone (!IMG_DATA) object requests anymore.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: remove old request completion code · 15961b44

Authored by Ilya Dryomov on Feb 01, 2018

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: new request completion code · 7114edac

Authored by Ilya Dryomov on Feb 01, 2018

Do away with partial request completions and all the associated
complexity.  Individual object requests no longer need to be completed
in order -- when the last one becomes ready, we complete the entire
higher level request all at once.

This also wraps up the conversion to a state machine model and
eliminates the recursion described in commit 6d69bb53 ("rbd:
prevent kernel stack blow up on rbd map").
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
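
The "complete everything when the last object request is ready" scheme described above can be sketched with a simple pending counter. This is a single-threaded illustration with hypothetical names (`struct img_request_sketch`, `obj_request_done()`); the kernel code would additionally need locking or atomics, and its real structures are not shown here. Keeping the first error as the overall result is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an image request that fans out to objects. */
struct img_request_sketch {
	unsigned int pending;	/* object requests still outstanding */
	int result;		/* 0, or the first error seen (assumption) */
	bool completed;		/* set once, when pending drops to zero */
};

/*
 * Called when one object request finishes, in any order.
 * The higher level request completes only when the last one is done.
 */
static void obj_request_done(struct img_request_sketch *img, int result)
{
	assert(img->pending > 0);

	if (result < 0 && img->result == 0)
		img->result = result;	/* remember the first failure */

	if (--img->pending == 0)
		img->completed = true;	/* complete the whole request at once */
}
```

Because nothing is reported upward until the counter reaches zero, there is no need to track which prefix of the request has finished, which is the ordering bookkeeping that partial completions require.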

rbd: update rbd_img_request_submit() signature · efbd1a11

Authored by Ilya Dryomov on Jan 30, 2018

It should be void now.  Also, object requests are unlinked only in
image request destructor, which can't run before rbd_img_request_put(),
so no need for _safe.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: add img_req->op_type field · 9bb0248d

Authored by Ilya Dryomov on Jan 30, 2018

Store op_type in its own field instead of packing it into flags.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: simplify rbd_osd_req_create() · a162b308

Authored by Ilya Dryomov on Jan 30, 2018

No need to pass rbd_dev and op_type to rbd_osd_req_create(): there are
no standalone (!IMG_DATA) object requests anymore and osd_req->r_flags
can be set in rbd_osd_req_format_{read,write}().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: remove old request handling code · 51c3509e

Authored by Ilya Dryomov on Jan 29, 2018

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

rbd: new request handling code · 3da691bf

Authored by Ilya Dryomov on Jan 29, 2018

The notable changes are:

- instead of explicitly stat'ing the object to see if it exists before
  issuing the write, send the write optimistically along with the stat
  in a single OSD request
- zero copyup optimization
- all object requests are associated with an image request and have
  a valid ->img_request pointer; there are no standalone (!IMG_DATA)
  object requests anymore
- code is structured as a state machine (vs a bunch of callbacks with
  implicit state)
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
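
The last bullet above, an explicit state machine instead of callbacks with implicit state, can be illustrated with a small sketch of a guarded write. The state names, `struct obj_write_sketch` and `obj_write_advance()` are hypothetical, and the transition rule (a combined stat+write that fails with -ENOENT moves on to a copyup step) is an assumption made for illustration rather than a description of the actual rbd states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states of one object write. */
enum write_state {
	WRITE_GUARD,	/* stat + write sent together in one request */
	WRITE_COPYUP,	/* object was missing: copyup, then write */
	WRITE_DONE,
};

struct obj_write_sketch {
	enum write_state state;
	int result;
};

/*
 * Feed the result of the last OSD reply into the state machine.
 * Returns true once the object write is finished.
 */
static bool obj_write_advance(struct obj_write_sketch *w, int osd_result)
{
	assert(w != NULL);

	switch (w->state) {
	case WRITE_GUARD:
		if (osd_result == -ENOENT) {
			/* Assumed rule: missing object triggers copyup. */
			w->state = WRITE_COPYUP;
			return false;
		}
		break;
	case WRITE_COPYUP:
		break;
	case WRITE_DONE:
		return true;	/* already finished, nothing to do */
	}

	w->result = osd_result;
	w->state = WRITE_DONE;
	return true;
}
```

The state lives in one field and every transition happens in one function, so the next step of a request can be read from the `switch` instead of being inferred from which callback happens to be installed.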

openeuler / Kernel · synced successfully more than a year ago