Commits · 4452226ea276e74fc3e252c88d9bb7e8f8e44bf0 · openeuler / Kernel

02 Jun, 2015 1 commit

writeback: move backing_dev_info->state into bdi_writeback · 4452226e

Committed by Tejun Heo on May 22, 2015

Currently, a bdi (backing_dev_info) embeds a single wb (bdi_writeback)
and the role of the separation is unclear.  For cgroup support for
writeback IOs, a bdi will be updated to host multiple wb's where each
wb serves writeback IOs of a different cgroup on the bdi.  To achieve
that, a wb should carry all states necessary for servicing writeback
IOs for a cgroup independently.

This patch moves bdi->state into wb.

* enum bdi_state is renamed to wb_state and the prefix of all enums is
  changed from BDI_ to WB_.

* Explicit zeroing of bdi->state is removed without adding zeroing of
  wb->state as the whole data structure is zeroed on init anyway.

* As there's still only one bdi_writeback per backing_dev_info, all
  uses of bdi->state are mechanically replaced with bdi->wb.state
  introducing no behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: drbd-dev@lists.linbit.com
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

4452226e
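
The layout change described in this commit can be pictured with a small userspace sketch. The struct and enum names follow the commit message; the particular flag names and the helpers (`bdi_init_sketch`, `wb_set_flag`, `wb_test_flag`) are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Flag bits carrying the WB_ prefix; the exact set shown is an assumption. */
enum wb_state {
	WB_registered,
	WB_writeback_running,
};

struct bdi_writeback {
	unsigned long state;		/* formerly backing_dev_info->state */
};

struct backing_dev_info {
	struct bdi_writeback wb;	/* still exactly one wb per bdi */
};

/* Hypothetical init helper: zeroing the whole structure also clears
 * wb.state, so no separate "state = 0" assignment is needed. */
static void bdi_init_sketch(struct backing_dev_info *bdi)
{
	assert(bdi != NULL);
	memset(bdi, 0, sizeof(*bdi));
}

static void wb_set_flag(struct bdi_writeback *wb, enum wb_state bit)
{
	wb->state |= 1UL << bit;
}

static bool wb_test_flag(const struct bdi_writeback *wb, enum wb_state bit)
{
	return (wb->state & (1UL << bit)) != 0;
}
```

A caller that used to test a bit in `bdi->state` would now go through `bdi->wb.state`, which is the mechanical replacement the message refers to.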

16 Apr, 2015 1 commit

VFS: assorted weird filesystems: d_inode() annotations · 75c3cfa8

Committed by David Howells on Mar 17, 2015

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

75c3cfa8

25 Mar, 2015 2 commits

block, drbd: use mempool_create_slab_pool() · cbc4ffdb

Committed by David Rientjes on Mar 24, 2015

Mempools created for slab caches should use
mempool_create_slab_pool().

Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

cbc4ffdb

block, drbd: fix drbd_req_new() initialization · 23fe8f8b

Committed by David Rientjes on Mar 24, 2015

mempool_alloc() does not support __GFP_ZERO since elements may come from
memory that has already been released by mempool_free().

Remove __GFP_ZERO from mempool_alloc() in drbd_req_new() and properly
initialize it to 0.

Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jens Axboe <axboe@fb.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

23fe8f8b
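
Why a pool cannot honor a "give me zeroed memory" request is easier to see with a toy model. The sketch below is plain userspace C, not the kernel mempool API: `toy_pool`, `request_sketch` and the helper names are all hypothetical. It shows a recycled element coming back with stale contents, and the caller clearing it explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct request_sketch {
	int flags;
	int refs;
};

/* Toy pool that keeps at most one released element for reuse. */
struct toy_pool {
	struct request_sketch *reserved;
};

static struct request_sketch *toy_pool_alloc(struct toy_pool *pool)
{
	struct request_sketch *req;

	assert(pool != NULL);
	req = pool->reserved;
	if (req) {
		/* Recycled element: its contents are whatever the
		 * previous user left behind. */
		pool->reserved = NULL;
		return req;
	}
	return malloc(sizeof(*req));
}

static void toy_pool_free(struct toy_pool *pool, struct request_sketch *req)
{
	if (!pool->reserved)
		pool->reserved = req;	/* keep it around for reuse */
	else
		free(req);
}

/* Allocation wrapper that always initializes the element itself. */
static struct request_sketch *req_new_sketch(struct toy_pool *pool)
{
	struct request_sketch *req = toy_pool_alloc(pool);

	if (!req)
		return NULL;
	memset(req, 0, sizeof(*req));
	return req;
}
```

The explicit `memset()` in the wrapper plays the role of the "properly initialize it to 0" step in the message.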

22 Jan, 2015 1 commit

block: Add discard flag to blkdev_issue_zeroout() function · d93ba7a5

Committed by Martin K. Petersen on Jan 20, 2015

blkdev_issue_zeroout() will zero a given block range. This is done by
way of explicit writing, thus provisioning or allocating the blocks on
disk.

There are use cases where the desired behavior is to zero the blocks but
unprovision them if possible. The blocks must deterministically contain
zeroes when they are subsequently read back.

This patch adds a flag to blkdev_issue_zeroout() that provides this
variant. If the discard flag is set and a block device guarantees
discard_zeroes_data we will use REQ_DISCARD to clear the block range. If
the device does not support discard_zeroes_data or if the discard
request fails we will fall back to first REQ_WRITE_SAME and then a
regular REQ_WRITE.

Also update the callers of blkdev_issue_zeroout() to reflect the new flag
and make sb_issue_zeroout() prefer the discard approach.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

d93ba7a5

24 Nov, 2014 1 commit

drbd: use generic io stats accounting functions to simplify io stat accounting · 24480854

Committed by Gu Zheng on Nov 24, 2014

Use generic io stats accounting helper functions (generic_{start,end}_io_acct)
to simplify io stat accounting.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

24480854

20 Nov, 2014 1 commit

kill f_dentry uses · b583043e

Committed by Al Viro on Oct 31, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b583043e

11 Nov, 2014 7 commits

drbd: Remove an useless copy of kernel_setsockopt() · e805b983

Committed by Philipp Reisner on Nov 10, 2014

Old backward-compat cruft

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

e805b983

drbd: Fix state change in case of connection timeout · 9581f97a

Committed by Philipp Reisner on Nov 10, 2014

A connection timeout affects all volumes of a resource!
Under the following conditions:

 A resource with multiple volumes
  AND
 ko-count >=1
  AND
 a write request triggers the timeout (ko-count * timeout)

DRBD's internal state gets confused. That in turn may
lead to very misleading follow-up failures, e.g.
"BUG: scheduling while atomic"

CC: stable@kernel.org # v3.17
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

9581f97a

drbd: merge_bvec_fn: properly remap bvm->bi_bdev · 3b9d35d7

Committed by Lars Ellenberg on Nov 10, 2014

This was not noticed for many years. Affects operation if
md raid is used as a backing device for DRBD.

CC: stable@kernel.org # v3.2+
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

3b9d35d7

drbd: fix resync throttling initialization · ff8bd88b

Committed by Lars Ellenberg on Nov 10, 2014

If for some reason DRBD resync was the only activity on a backend
device, drbd_rs_c_min_rate_throttle() would mistakenly decide that it is
still initialization time, and keep throttling the resync.

This patch explicitly initializes ->rs_last_events to the current
backend event counters, and drops the rs_last_events == 0 from the
throttle condition.

Reported-by: Mikhail Sugakov <msugakov@amazon.de>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

ff8bd88b

drbd: fix race between role change and handshake · a8821531

Committed by Philipp Reisner on Nov 10, 2014

Symptoms:
If DRBD was "cleanly shut down" (all in sync, both Secondary before
disconnect, identical data generation uuids), and then one side was
promoted *during* the next connection handshake, the role change
could confuse the handshake.

The Primary would get stuck in WFBitmapS, the Secondary would log
unexpected cstate (Connected) in receive_bitmap
and get stuck in WFBitmapT.

Fix:
The test in is_valid_soft_transition was wrong. It works because
the disallowed actions (promote/attach) do not touch the
cstate. The previous condition failed to demand a cstate change
in one clause.

To avoid deadlocks, give up the state_mutex while waiting
for the transient state to go away.

Conflicts:
	drbd/drbd_state.c
	drbd/drbd_state.h
	drbd/drbd_wrappers.h

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

a8821531

drbd: Only use drbd_msg_put_info() in drbd_nl.c · f221f4bc

Committed by Andreas Gruenbacher on Nov 10, 2014

Avoid generic netlink calls in other parts of the code base.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

f221f4bc

drbd: Minor cleanups · 179e20b8

Committed by Andreas Gruenbacher on Nov 10, 2014

 . Update comments
 . drbd_set_{in,out_of}_sync(): Remove unused parameters
 . Move common code into adm_del_resource()
 . Redefine ERR_MINOR_EXISTS -> ERR_MINOR_OR_VOLUME_EXISTS

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

179e20b8

18 Sep, 2014 2 commits

drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks · e9f05b4c

Committed by Lai Jiangshan on Sep 18, 2014

The original code is the same as RB_DECLARE_CALLBACKS().

CC: Michel Lespinasse <walken@google.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

e9f05b4c

drbd: compute the end before rb_insert_augmented() · 82cfb90b

Committed by Lai Jiangshan on Sep 18, 2014

Commit 98683650 "Merge branch 'drbd-8.4_ed6' into
for-3.8-drivers-drbd-8.4_ed6" switched to the new augment API, but the
new API requires that the tree is augmented before rb_insert_augmented()
is called, and that step was missing.

So we add the augment code to drbd_insert_interval(), where it walks the
tree from top to bottom before calling rb_insert_augmented().  See the
example in include/linux/interval_tree_generic.h or
Documentation/rbtree.txt.

drbd_insert_interval() may cancel the insertion during the walk. In that
case the newly added augment code has changed nothing before the cancel,
since the @this node is already in the subtrees.

CC: Michel Lespinasse <walken@google.com>
CC: stable@kernel.org # v3.10+
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

82cfb90b
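
The idea of "augmenting on the way down" can be shown without a red-black tree. The sketch below uses a plain unbalanced binary search tree as a stand-in, so rebalancing and the rb_insert_augmented() call itself are left out, and the duplicate-node cancel path is not modeled. `interval_node` and `insert_interval_sketch` are hypothetical names, not the DRBD ones.

```c
#include <assert.h>
#include <stddef.h>

struct interval_node {
	unsigned long start;
	unsigned long size;
	unsigned long end;	/* augmented value: highest end in this subtree */
	struct interval_node *left, *right;
};

static unsigned long interval_end(const struct interval_node *n)
{
	return n->start + n->size;
}

/* Insert @this keyed by start.  Every node passed on the way down will
 * have @this in its subtree, so its cached subtree end is raised before
 * the new node is linked in. */
static void insert_interval_sketch(struct interval_node **root,
				   struct interval_node *this)
{
	struct interval_node **link = root;
	unsigned long this_end;

	assert(root != NULL && this != NULL);
	this_end = interval_end(this);
	this->left = this->right = NULL;
	this->end = this_end;

	while (*link) {
		struct interval_node *here = *link;

		if (here->end < this_end)
			here->end = this_end;	/* the augment step */
		link = this->start < here->start ? &here->left : &here->right;
	}
	*link = this;
}
```

If the augment step were skipped, an ancestor could keep a cached end that is smaller than an interval stored beneath it, and overlap searches that prune by that value would miss the interval.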

11 Sep, 2014 9 commits

drbd: Add missing newline in resync progress display in /proc/drbd · 590001c2

Committed by Philipp Reisner on Sep 11, 2014

Was broken in 2010 with commit 4b0715f0

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

590001c2

drbd: reduce lock contention in drbd_worker · 729e8b87

Committed by Lars Ellenberg on Sep 11, 2014

The worker may now dequeue work items in batches.
This should reduce lock contention during busy periods.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

729e8b87
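
One way batching can cut lock traffic is to take the whole pending list in a single critical section and process it unlocked. The sketch below illustrates that pattern in userspace C; it is an assumption about the general technique, not a copy of the DRBD worker. The lock is replaced by a counter (`lock_acquisitions`) so the effect is visible, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct work_item {
	struct work_item *next;
	int id;
};

struct work_queue {
	struct work_item *head, *tail;
	unsigned int lock_acquisitions;	/* stand-in for a real spinlock */
};

static void queue_lock(struct work_queue *q)
{
	q->lock_acquisitions++;
}

static void queue_unlock(struct work_queue *q)
{
	(void)q;
}

/* Producer side: append one item under the lock. */
static void queue_work_sketch(struct work_queue *q, struct work_item *w)
{
	assert(q != NULL && w != NULL);
	w->next = NULL;
	queue_lock(q);
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
	queue_unlock(q);
}

/* Consumer side: detach everything that is queued with one lock
 * round trip; the caller then walks the returned list unlocked. */
static struct work_item *dequeue_batch_sketch(struct work_queue *q)
{
	struct work_item *batch;

	queue_lock(q);
	batch = q->head;
	q->head = q->tail = NULL;
	queue_unlock(q);
	return batch;
}
```

Dequeuing items one at a time would cost one lock acquisition per item; here the cost is one per batch, regardless of how many items were waiting.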

drbd: Improve asender performance · abde9cc6

Committed by Lars Ellenberg on Sep 11, 2014

Shorten receive path in the asender thread. Reduces CPU utilisation
of asender when receiving packets, and with that increases IOPs.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

abde9cc6

drbd: Get rid of the WORK_PENDING macro · b47a06d1

Committed by Andreas Gruenbacher on Sep 11, 2014

This macro doesn't add any value; just use test_bit() instead.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

b47a06d1

drbd: Get rid of the __no_warn and __cond_lock macros · d1b80853

Committed by Andreas Gruenbacher on Sep 11, 2014

These macros can easily be replaced with their definitions.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

d1b80853

drbd: Avoid inconsistent locking warning · 8d4ba3f0

Committed by Andreas Gruenbacher on Sep 11, 2014

request_timer_fn() takes resource->req_lock via the device and releases it via
the connection.  Avoid this as it confuses static code checkers.

Reported-by: "Dan Carpenter" <dan.carpenter@oracle.com>
Signed-off-by: Andreas Gruenbacher <agruen@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

8d4ba3f0

drbd: Remove superfluous newline from "resync_extents" debugfs entry. · f0c21e62

Committed by Philipp Marek on Sep 11, 2014

See "drbd/resources/*/volumes/*/resync_extents".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

f0c21e62

drbd: Use consistent names for all the bi_end_io callbacks · ed15b795

Committed by Andreas Gruenbacher on Sep 11, 2014

Now they follow the _endio naming scheme.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

ed15b795

drbd: Use better variable names · 11f8b2b6

Committed by Andreas Gruenbacher on Sep 11, 2014

Rename local variable 'ds' to 'disk_state' or 'data_size',
and 'dgs' to 'digest_size'.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

11f8b2b6

11 Jul, 2014 15 commits

drbd: silence underflow warning in read_in_block() · bf0d6e4a

Committed by Dan Carpenter on May 06, 2014

My static checker warns that "data_size" could be negative and underflow
the limit check.  The code looks suspicious but I don't know if it is a
real bug.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

bf0d6e4a
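
The class of warning described here can be reproduced in a few lines. The sketch assumes the usual shape of such code, a size reduced by a digest length and then compared against an upper limit; the message does not show the actual code or the fix, so the function names and `MAX_BIO_SIZE_SKETCH` are hypothetical. Making the size unsigned is one common way to close the gap, shown here for comparison.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BIO_SIZE_SKETCH (1 << 20)	/* hypothetical upper limit */

static_assert(MAX_BIO_SIZE_SKETCH > 0, "limit must be positive");

/* Signed variant: a negative result slips under the upper bound. */
static bool size_ok_signed(int data_size, int digest_size)
{
	data_size -= digest_size;
	return data_size <= MAX_BIO_SIZE_SKETCH;
}

/* Unsigned variant: the subtraction wraps to a very large value,
 * which the same upper-bound check then rejects. */
static bool size_ok_unsigned(unsigned int data_size, unsigned int digest_size)
{
	data_size -= digest_size;
	return data_size <= MAX_BIO_SIZE_SKETCH;
}
```

With a 4-byte payload and a 16-byte digest, the signed check accepts a size of -12 while the unsigned check refuses it.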

drbd: implicitly truncate cpu-mask · 1e39152f

Committed by Lars Ellenberg on May 19, 2014

Don't error out with misleading "out of memory"
if the cpu-mask has more bits set than there are CPUs.
Just truncate to nr_cpu_ids implicitly.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

1e39152f
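
Truncating a mask to the number of possible CPUs amounts to clearing the bits at and above that count. A minimal sketch with a single-word mask follows; real cpumasks can span several words, and `truncate_cpu_mask_sketch` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <limits.h>

/* Keep only the low nr_cpu_ids bits of a one-word CPU mask. */
static unsigned long truncate_cpu_mask_sketch(unsigned long requested,
					      unsigned int nr_cpu_ids)
{
	assert(nr_cpu_ids > 0);
	/* Shifting by the full word width is undefined, so handle the
	 * "every bit is valid" case separately. */
	if (nr_cpu_ids >= sizeof(requested) * CHAR_BIT)
		return requested;
	return requested & ((1UL << nr_cpu_ids) - 1);
}
```

On a hypothetical 4-CPU system a requested mask of 0xFF becomes 0x0F instead of producing an error.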

drbd: drop spurious parameters from _drbd_md_sync_page_io · 193cb00c

Committed by Lars Ellenberg on Apr 02, 2014

size is always 4096,
page is always device->md_io.page.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

193cb00c

drbd: resync should only lock out specific ranges · f5b90b6b

Committed by Lars Ellenberg on May 07, 2014

During resync, if we need to block some specific incoming write because
of active resync requests to that same range, we potentially caused
*all* new application writes (to "cold" activity log extents) to block
until this one request has been processed.

Improve the do_submit() logic to
 * grab all incoming requests to some "incoming" list
 * process this list
   - move aside requests that are blocked by resync
   - prepare activity log transactions,
   - commit transactions and submit corresponding requests
   - if there are remaining requests that only wait for
     activity log extents to become free, stop the fast path
     (mark activity log as "starving")
   - iterate until no more requests are waiting for the activity log,
     but all potentially remaining requests are only blocked by resync
 * only then grab new incoming requests

That way, very busy IO on currently "hot" activity log extents cannot
starve scattered IO to "cold" extents. And blocked-by-resync requests
are processed once resync traffic on the affected region has ceased,
without blocking anything else.

The only blocking mode left is when we cannot start requests to "cold"
extents because all currently "hot" extents are actually used.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

f5b90b6b

drbd: debugfs: add per device data_gen_id · cc356f85

Committed by Lars Ellenberg on May 14, 2014

The data generation identifiers used to be exposed via sysfs
at /sys/block/drbdX/drbd/meta_data/data_gen_id (out-of-tree),
for advanced policy scripting.
Bring that information over to debugfs.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

cc356f85

drbd: debugfs: add per connection oldest requests · 3d299f48

Committed by Lars Ellenberg on May 14, 2014

The information formerly in /sys/block/drbdX/drbd/oldest_requests
is already available, in more detail, in these files:
 debugfs/drbd/resource/$name/in_flight_summary,
 debugfs/drbd/resource/$name/volumes/$vnr/oldest_requests

This patch adds
 debugfs/drbd/resource/$name/connections/peer/oldest_requests

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

3d299f48

drbd: debugfs: add version tag to debugfs files · b44e1184

Committed by Lars Ellenberg on May 06, 2014

Make the first line of debugfs files a version number,
starting now with "v: 0".

If we change content or presentation, we will bump that.
Monitoring or diagnostic scripts that may parse these files
can then easily know when they need to be reviewed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

b44e1184
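
A monitoring script or tool consuming these files could check the version line before parsing anything else. The sketch below assumes only what the message states, that the first line has the form "v: N"; the function name and the choice of -1 as the "no version found" result are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Return the version from a leading "v: N" line, or -1 if the
 * contents do not start with such a line. */
static int parse_debugfs_version_sketch(const char *contents)
{
	int version;

	assert(contents != NULL);
	if (sscanf(contents, "v: %d", &version) != 1)
		return -1;
	if (version < 0)
		return -1;
	return version;
}
```

A parser written against version 0 can then refuse, or at least warn, when it sees any other number, which is the review trigger the message describes.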

drbd: debugfs: add per volume oldest_requests · 54e6fc38

Committed by Lars Ellenberg on May 08, 2014

Show oldest requests
 * pending master bio completion and,
 * if different, local disk bio completion.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

54e6fc38

drbd: debugfs: add callback_history · 944410e9

Committed by Lars Ellenberg on May 06, 2014

Add a per-connection worker thread callback_history
with timing details, call site and callback function.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

944410e9

drbd: debugfs: Add in_flight_summary · f418815f

Committed by Lars Ellenberg on May 05, 2014

* Add details about pending meta data operations to in_flight_summary.

* Report number of requests waiting for activity log transactions.

* Add timing details of peer_requests to in_flight_summary.

* FLUSH details
  DRBD divides the incoming request stream into "epochs",
  in which peers are allowed to re-order writes independently.

  These epochs are separated by P_BARRIER on the replication link.
  Such barrier packets, depending on configuration, may cause
  the receiving side to drain the lower level device request queues
  and call blkdev_issue_flush().

  This is known to be another major source of latency in DRBD.

  Track timing details of calls to blkdev_issue_flush(),
  and add them to in_flight_summary.

* data socket stats
  To be able to diagnose bottlenecks and root causes of "slow" IO on DRBD,
  it is useful to see network buffer stats along with the timing details of
  requests, peer requests, and meta data IO.

* Add pending bitmap IO timing details to in_flight_summary.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

f418815f

drbd: debugfs: deal with destructor racing with open of debugfs file · 4a521cca

Committed by Lars Ellenberg on May 05, 2014

Try to close the race between open() and debugfs_remove_recursive()
from inside an object destructor.
Once open succeeds, the object should stay around.
Open should not succeed if the object has already reached its destructor.

This may be overkill, but to make that happen, we check for existence of
a parent directory, "stale-ness" of "this" dentry, and serialize
kref_get_unless_zero() on the outermost object relevant for this file
with d_delete() on this dentry (using the parent's i_mutex).

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

4a521cca
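
The "take a reference only if the object is still alive" rule at the heart of this fix can be sketched with C11 atomics. This models only the counting part; the dentry checks and the serialization against d_delete() under the parent's i_mutex are not represented. `object_sketch`, `get_unless_zero_sketch` and `put_sketch` are hypothetical userspace stand-ins for the kref helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct object_sketch {
	atomic_int refcount;
};

/* Take a reference only while the count is still non-zero.  Once the
 * count has dropped to zero the destructor is running (or has run),
 * so the caller must not start using the object. */
static bool get_unless_zero_sketch(struct object_sketch *obj)
{
	int old = atomic_load(&obj->refcount);

	assert(old >= 0);
	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&obj->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when this was the last one. */
static bool put_sketch(struct object_sketch *obj)
{
	return atomic_fetch_sub(&obj->refcount, 1) == 1;
}
```

An open() handler built on this pattern fails cleanly when the object is already being torn down, and otherwise pins the object until the matching release.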

drbd: debugfs: add in_flight_summary data · db1866ff

Committed by Lars Ellenberg on May 02, 2014

To help diagnose "high latency" or "hung" IO situations on DRBD,
present per drbd resource group a summary of operations currently in progress.

First item is a list of oldest drbd_request objects
waiting for various things:
 * still being prepared
 * waiting for activity log transaction
 * waiting for local disk
 * waiting to be sent
 * waiting for peer acknowledgement ("receive ack", "write ack")
 * waiting for peer epoch acknowledgement ("barrier ack")

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

db1866ff

drbd: debugfs: add basic hierarchy · 4d3d5aa8

Committed by Lars Ellenberg on May 02, 2014

Add new debugfs hierarchy /sys/kernel/debug/
  drbd/
    resources/
      $resource_name/connections/peer/$volume_number/
      $resource_name/volumes/$volume_number/
    minors/$minor_number -> ../resources/$resource_name/volumes/$volume_number/

Followup commits will populate this hierarchy with files containing
statistics, diagnostic information and some attribute data.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

4d3d5aa8

drbd: track details of bitmap IO · 4ce49266

Committed by Lars Ellenberg on May 06, 2014

Track start and submit time of bitmap operations, and
add pending bitmap IO contexts to a new pending_bitmap_io list.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

4ce49266

drbd: register peer requests on read_ee early · c5a2c150

Committed by Lars Ellenberg on May 08, 2014

Initialize peer_request with timestamp and proper empty list head.
Add peer_request to list early, so debugfs can find this request and
report it as "preparing", even if we sleep before we actually submit it.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

c5a2c150

openeuler / Kernel synced successfully more than 1 year ago