1. 27 Feb 2013, 2 commits
  2. 18 Jan 2013, 2 commits
    • libceph: for chooseleaf rules, retry CRUSH map descent from root if leaf is failed · 1604f488
      Committed by Jim Schutt
      Add libceph support for a new CRUSH tunable recently added to Ceph servers.
      
      Consider the CRUSH rule
        step chooseleaf firstn 0 type <node_type>
      
      This rule means that <n> replicas will be chosen in a manner such that
      each chosen leaf's branch will contain a unique instance of <node_type>.
      
      When an object is re-replicated after a leaf failure, if the CRUSH map uses
      a chooseleaf rule, the remapped replica ends up under the <node_type> bucket
      that held the failed leaf.  This causes uneven data distribution across the
      storage cluster, to the point that when all the leaves but one fail under a
      particular <node_type> bucket, that remaining leaf holds all the data from
      its failed peers.
      
      This behavior also limits the number of peers that can participate in the
      re-replication of the data held by the failed leaf, which increases the
      time required to re-replicate after a failure.
      
      For a chooseleaf CRUSH rule, the tree descent has two steps: call them the
      inner and outer descents.
      
      If the tree descent down to <node_type> is the outer descent, and the descent
      from <node_type> down to a leaf is the inner descent, the issue is that a
      down leaf is detected on the inner descent, so only the inner descent is
      retried.
      
      In order to disperse re-replicated data as widely as possible across a
      storage cluster after a failure, we want to retry the outer descent. So,
      fix up crush_choose() to allow the inner descent to return immediately on
      choosing a failed leaf.  Wire this up as a new CRUSH tunable.
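      
      A toy model of this behaviour change (this is not the actual
      net/ceph/crush/mapper.c code; the tunable name chooseleaf_descend_once,
      the retry budgets, and every helper here are illustrative assumptions):
      
        #include <stdbool.h>
        
        #define NUM_BUCKETS        4    /* <node_type> buckets under the root */
        #define LEAVES_PER_BUCKET  4    /* leaves (devices) under each bucket */
        #define OUTER_TRIES       19    /* arbitrary retry budgets            */
        #define INNER_TRIES       19
        
        struct toy_map {
            bool chooseleaf_descend_once;   /* the new tunable (name assumed) */
            bool leaf_down[NUM_BUCKETS][LEAVES_PER_BUCKET];
        };
        
        /* Pick one leaf for a replica, modelling the inner/outer descents. */
        static int toy_chooseleaf(const struct toy_map *map, unsigned int seed)
        {
            for (int outer = 0; outer < OUTER_TRIES; outer++) {
                /* outer descent: choose a <node_type> bucket */
                int bucket = (seed + outer) % NUM_BUCKETS;
        
                for (int inner = 0; inner < INNER_TRIES; inner++) {
                    /* inner descent: choose a leaf under that bucket */
                    int leaf = (seed + inner) % LEAVES_PER_BUCKET;
        
                    if (!map->leaf_down[bucket][leaf])
                        return bucket * LEAVES_PER_BUCKET + leaf;
        
                    if (map->chooseleaf_descend_once)
                        break;  /* new: give up here, retry from the root   */
                    /* legacy: retry only the inner descent, so the replica
                     * stays under the same bucket as the failed leaf       */
                }
            }
            return -1;      /* no usable leaf found */
        }
        
        int main(void)
        {
            struct toy_map map = { .chooseleaf_descend_once = true };
        
            map.leaf_down[1][1] = true;             /* fail one leaf           */
            return toy_chooseleaf(&map, 5) < 0;     /* exits 0 if a leaf found */
        }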
      
      Note that after this change, for a chooseleaf rule, if the primary OSD
      in a placement group has failed, choosing a replacement may result in
      one of the other OSDs in the PG colliding with the new primary.  This
      means that OSD's data for that PG must be moved as well.  This
      seems unavoidable but should be relatively rare.
      
      This corresponds to ceph.git commit 88f218181a9e6d2292e2697fc93797d0f6d6e5dc.
      Signed-off-by: Jim Schutt <jaschut@sandia.gov>
      Reviewed-by: Sage Weil <sage@inktank.com>
    • ceph: Check for created flag in response from mds · 6e8575fa
      Committed by Sam Lang
      The mds now sends back a created inode if the create request
      performed the create.  If the file already existed, no inode is
      returned in the reply.  This allows ceph to set the created flag
      in atomic_open so that permissions are properly checked in the case
      that the file wasn't created by the create call to the mds.
      
      To ensure compatibility with previous kernels, a feature for sending
      back the inode in the create reply was added, so that the mds will
      only send back the inode if the client indicates it supports the
      feature.
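      
      A minimal sketch of the client-side decision described above (the field
      name has_created_ino and the FILE_CREATED_FLAG constant are illustrative
      assumptions, not the actual fs/ceph identifiers):
      
        #include <stdbool.h>
        #include <stdio.h>
        
        #define FILE_CREATED_FLAG 0x1   /* stands in for the VFS "created" bit */
        
        /* Simplified model of the MDS create reply. */
        struct mds_create_reply {
            bool has_created_ino;   /* MDS included the newly created inode */
        };
        
        /* Set the created flag only when the MDS reports it actually created
         * the file; if the file already existed, no inode comes back and the
         * flag stays clear, so permissions are checked normally. */
        static void handle_atomic_open_reply(bool requested_create,
                                             const struct mds_create_reply *reply,
                                             int *opened)
        {
            if (requested_create && reply->has_created_ino)
                *opened |= FILE_CREATED_FLAG;
        }
        
        int main(void)
        {
            struct mds_create_reply existed = { .has_created_ino = false };
            struct mds_create_reply created = { .has_created_ino = true };
            int opened = 0;
        
            handle_atomic_open_reply(true, &existed, &opened);
            printf("pre-existing file, created flag: %d\n", opened);
        
            handle_atomic_open_reply(true, &created, &opened);
            printf("newly created file, created flag: %d\n", opened);
            return 0;
        }
      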
      Signed-off-by: Sam Lang <sam.lang@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
  3. 31 Jul 2012, 2 commits