提交 · 9b6e54232992a2e39790d93df4581a2dcb8a5429 · openanolis / cloud-kernel

15 6月, 2016 8 次提交

dm raid: bump to v1.9.0 and make the extended SB feature flag reflect it · 9b6e5423

由 Mike Snitzer 提交于 6月 02, 2016

No idea what Heinz was doing with the versioning but upstream commit
4c9971ca ("dm raid: make sure no feature flags are set in metadata")
bumped to 1.8.0 already.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

9b6e5423

dm raid: remove ti_error_* wrappers · bd83a4c4

由 Mike Snitzer 提交于 5月 31, 2016

There ti_error_* wrappers added very little.  No other DM target has
ever gone to such lengths to wrap setting ti->error.

Also fixes some NULL derefences via rs->ti->error.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

bd83a4c4

dm raid: tabify appropriate whitespace · 43157840

由 Mike Snitzer 提交于 5月 30, 2016

Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

43157840

dm raid: enhance status interface and fixup takeover/raid0 · 3a1c1ef2

由 Heinz Mauelshagen 提交于 5月 19, 2016

The target's status interface has to provide the new 'data_offset' value
to allow userspace to retrieve the kernels offset to the data on each
raid device of a raid set.  This is the base for out-of-place reshaping
required to not write over any data during reshaping (e.g. change
raid6_zr -> raid6_nc):

 - add rs_set_cur() to be able to start up existing array in case of no
   takeover; use in ctr on takeover check

 - enhance raid_status()

 - add supporting functions to get resync/reshape progress and raid
   device status chars

 - fixup rebuild table line output race, which does miss to emit
   'rebuild N' on fully synced/rebuild devices, because it is relying on
   the transient 'In_sync' raid device flag

 - add new status line output for 'data_offset', which'll later be used
   for out-of-place reshaping

 - fixup takeover not working for all levels

 - fixup raid0 message interface oops caused by missing checks
   for the md threads, which don't exist in case of raid0

 - remove ALL_FREEZE_FLAGS not needed for takeover

 - adjust comments
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

3a1c1ef2

dm raid: add raid level takeover support · ecbfb9f1

由 Heinz Mauelshagen 提交于 5月 19, 2016

Add raid level takeover support allowing arbitrary takeovers between
raid levels supported by md personalities (i.e. raid0, raid1/10 and
raid4/5/6):

 - add rs_config_{backup|restore} function to allow for temporary
   storing ctr requested layout changes and restore them for takeover
   conersion decision after the superblocks got loaded and analyzed

 - add members to store layout to 'struct raid_set' (not mandatory
   for takeover but needed for reshape in later patch)

 - add rebuild_disks bitfield to 'struct raid_set' and set bits in ctr
   to use in setting up takeover (base to address a 'rebuild' related
   raid_status() table line bug and needed as well for reshape in future
   patch)

 - add runtime flags and respective manipulation functions to be able to
   control e.g. wrting of superlocks to the preresume function on
   takeover and (later) reshape

 - add functions to detect takeover, check it's valid (mandatory here to
   avoid failing on md_run()), setup for it and use in the ctr; those
   will be likely moved out once reshaping gets added to simplify the
   ctr

 - start raid set readonly in ctr and switch to readwrite, optionally
   updating superblocks, in preresume in order to allow suspend to
   quiesce any active table before (which involves superblock updates);
   this ensures the proper sequence of writing the current and any new
   takeover(/reshape) metadata
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

ecbfb9f1

dm raid: enhance super_sync() to support new superblock members · 7b34df74

由 Heinz Mauelshagen 提交于 5月 19, 2016

Add transferring the new takeover/reshape related superblock
members introduced to the super_sync() function:

 - add/move supporting functions

 - add failed devices bitfield transfer functions to retrieve the
   bitfield from superblock format or update it in the superblock

 - add code to transfer all new members
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

7b34df74

dm raid: add new reshaping/raid10 format table line options to parameter parser · 4763e543

由 Heinz Mauelshagen 提交于 5月 19, 2016

Support the follwoing arguments in the ctr parameter parser:

 - add 'delta_disks', 'data_offset' taking int and sector respectively

 - 'raid10_use_near_sets' bool argument to optionally select
   near sets with supporting raid10 mappings
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

4763e543

dm raid: introduce extended superblock and new raid types to support takeover/reshaping · 33e53f06

由 Heinz Mauelshagen 提交于 5月 19, 2016

Add new members to the dm-raid superblock and new raid types to support
takeover/reshape.

Add all necessary members needed to support takeover and reshape in one
go -- aiming to limit the amount of changes to the superblock layout.

This is a larger patch due to the new superblock members, their related
flags, validation of both and involved API additions/changes:

 - add additional members to keep track of:
   - state about forward/backward reshaping
   - reshape position
   - new level, layout, stripe size and delta disks
   - data offset to current and new data for out-of-place reshapes
   - failed devices bitfield extensions to keep track of max raid devices

 - adjust super_validate() to cope with new superblock members

 - adjust super_init_validation() to cope with new superblock members

 - add definitions for ctr flags supporting delta disks etc.

 - add new raid types (raid6_n_6 etc.)

 - add new raid10 supporting function API (_is_raid10_*())

 - adjust to changed raid10 supporting function API
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

33e53f06

14 6月, 2016 6 次提交

dm raid: use rt_is_raid*() in all appropriate checks · 676fa5ad

由 Heinz Mauelshagen 提交于 5月 19, 2016

Make use if raid type rt_is_*() bool functions for simplification and
consistency reasons.
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

676fa5ad

dm raid: more use of flag testing wrappers · ad51d7f1

由 Heinz Mauelshagen 提交于 5月 19, 2016

 - add _test_flags() function

 - use it to simplify rs_check_for_invalid_flags()

 - use _test_flag() throughout
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

ad51d7f1

dm raid: check constructor arguments for invalid raid level/argument combinations · f090279e

由 Heinz Mauelshagen 提交于 5月 19, 2016

Reject invalid flag combinations to avoid potential data corruption or
failing raid set construction:

 - add definitions for constructor flag combinations and invalid flags
   per level

 - add bool test functions for the various raid types
   (also will be used by future reshaping enhancements)

 - introduce rs_check_for_invalid_flags() and _invalid_flags()
   to perform the validity checks
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

f090279e

dm raid: cleanup / provide infrastructure · 702108d1

由 Heinz Mauelshagen 提交于 5月 19, 2016

Provide necessary infrastructure to handle ctr flags and their names
and cleanup setting ti->error:

 - comment constructor flags

 - introduce constructor flag manipulation

 - introduce ti_error_*() functions to simplify
   setting the error message (use in other targets?)

 - introduce array to hold ctr flag <-> flag name mapping

 - introduce argument name by flag functions for that array

 - use those functions throughout the ctr call path
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

702108d1

dm raid: use dm_arg_set API in constructor · 92c83d79

由 Heinz Mauelshagen 提交于 5月 19, 2016

- use dm_arg_set API in ctr and its callees parse_raid_params() and dev_parms()

- introduce _in_range() function to check a value is in a [ min, max ] range;
  this is to support more callers in parsing parameters etc. in the future

- correct comment on MAX_RAID_DEVICES
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

92c83d79

H
dm raid: rename variable 'ret' to 'r' to conform to other dm code · 73c6f239
由 Heinz Mauelshagen 提交于 5月 19, 2016
```
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
```
73c6f239

11 6月, 2016 4 次提交

dm mpath: add optional "queue_mode" feature · e83068a5

由 Mike Snitzer 提交于 5月 24, 2016

Allow a user to specify an optional feature 'queue_mode <mode>' where
<mode> may be "bio", "rq" or "mq" -- which corresponds to bio-based,
request_fn rq-based, and blk-mq rq-based respectively.

If the queue_mode feature isn't specified the default for the
"multipath" target is still "rq" but if dm_mod.use_blk_mq is set to Y
it'll default to mode "mq".

This new queue_mode feature introduces the ability for each multipath
device to have its own queue_mode (whereas before this feature all
multipath devices effectively had to have the same queue_mode).

This commit also goes a long way to eliminate the awkward (ab)use of
DM_TYPE_*, the associated filter_md_type() and other relatively fragile
and difficult to maintain code.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

e83068a5

M
dm mpath: remove bio-based bloat from struct dm_mpath_io · bf661be1
由 Mike Snitzer 提交于 5月 24, 2016
```
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
```
bf661be1

dm mpath: reinstate bio-based support · 76e33fe4

由 Mike Snitzer 提交于 5月 19, 2016

Add "multipath-bio" target that offers a bio-based multipath target as
an alternative to the request-based "multipath" target -- but in a
following commit "multipath-bio" will immediately be replaced by a new
"queue_mode" feature for the "multipath" target which will allow
bio-based mode to be selected.

When DM multipath was originally converted from bio-based to
request-based the motivation for the change was better dynamic load
balancing (by leveraging block core's request-based IO schedulers, for
merging and sorting, _before_ DM multipath would make the decision on
where to steer the IO -- based on path load and/or availability).

More background is available in this "Request-based Device-mapper
multipath and Dynamic load balancing" paper:
https://www.kernel.org/doc/ols/2007/ols2007v2-pages-235-244.pdf

But we've now come full circle where significantly faster storage
devices no longer need IOs to be made larger to drive optimal IO
performance. And even if they do there have been changes to the block
and filesystem layers that help ensure upper layers are constructing
larger IOs. In addition, SCSI's differentiated IO errors will propagate
through to bio-based IO completion hooks -- so that eliminates another
historic justiciation for request-based DM multipath. Lastly, the block
layer's immutable biovec changes have made bio cloning cheaper than it
has ever been; whereas request cloning is still relatively expensive
(both on a CPU usage and memory footprint level).

As such, bio-based DM multipath offers the promise of a more efficient
IO path for high IOPs devices that are, or will be, emerging.
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

76e33fe4

dm: move request-based code out to dm-rq.[hc] · 4cc96131

由 Mike Snitzer 提交于 5月 12, 2016

Add some seperation between bio-based and request-based DM core code.

'struct mapped_device' and other DM core only structures and functions
have been moved to dm-core.h and all relevant DM core .c files have been
updated to include dm-core.h rather than dm.h

DM targets should _never_ include dm-core.h!

[block core merge conflict resolution from Stephen Rothwell]
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>

4cc96131

10 6月, 2016 9 次提交

block: bio: kill BIO_MAX_SIZE · 1a89694f

由 Ming Lei 提交于 6月 10, 2016

No one need this macro now, so remove it. Basically
only how many bvecs in one bio matters instead
of how many bytes in this bio.

The motivation is for supporting multipage bvecs, in
which we only know what the max count of bvecs is supported
in the bio.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

1a89694f

cfq-iosched: temporarily boost queue priority for idle classes · b8269db4

由 Jens Axboe 提交于 6月 09, 2016

If we're queuing REQ_PRIO IO and the task is running at an idle IO
class, then temporarily boost the priority. This prevents livelocks
due to priority inversion, when a low priority task is holding file
system resources while attempting to do IO.

An example of that is shown below. An ioniced idle task is holding
the directory mutex, while a normal priority task is trying to do
a directory lookup.

[478381.198925] ------------[ cut here ]------------
[478381.200315] INFO: task ionice:1168369 blocked for more than 120 seconds.
[478381.201324]       Not tainted 4.0.9-38_fbk5_hotfix1_2936_g85409c6 #1
[478381.202278] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[478381.203462] ionice          D ffff8803692736a8     0 1168369      1 0x00000080
[478381.203466]  ffff8803692736a8 ffff880399c21300 ffff880276adcc00 ffff880369273698
[478381.204589]  ffff880369273fd8 0000000000000000 7fffffffffffffff 0000000000000002
[478381.205752]  ffffffff8177d5e0 ffff8803692736c8 ffffffff8177cea7 0000000000000000
[478381.206874] Call Trace:
[478381.207253]  [<ffffffff8177d5e0>] ? bit_wait_io_timeout+0x80/0x80
[478381.208175]  [<ffffffff8177cea7>] schedule+0x37/0x90
[478381.208932]  [<ffffffff8177f5fc>] schedule_timeout+0x1dc/0x250
[478381.209805]  [<ffffffff81421c17>] ? __blk_run_queue+0x37/0x50
[478381.210706]  [<ffffffff810ca1c5>] ? ktime_get+0x45/0xb0
[478381.211489]  [<ffffffff8177c407>] io_schedule_timeout+0xa7/0x110
[478381.212402]  [<ffffffff810a8c2b>] ? prepare_to_wait+0x5b/0x90
[478381.213280]  [<ffffffff8177d616>] bit_wait_io+0x36/0x50
[478381.214063]  [<ffffffff8177d325>] __wait_on_bit+0x65/0x90
[478381.214961]  [<ffffffff8177d5e0>] ? bit_wait_io_timeout+0x80/0x80
[478381.215872]  [<ffffffff8177d47c>] out_of_line_wait_on_bit+0x7c/0x90
[478381.216806]  [<ffffffff810a89f0>] ? wake_atomic_t_function+0x40/0x40
[478381.217773]  [<ffffffff811f03aa>] __wait_on_buffer+0x2a/0x30
[478381.218641]  [<ffffffff8123c557>] ext4_bread+0x57/0x70
[478381.219425]  [<ffffffff8124498c>] __ext4_read_dirblock+0x3c/0x380
[478381.220467]  [<ffffffff8124665d>] ext4_dx_find_entry+0x7d/0x170
[478381.221357]  [<ffffffff8114c49e>] ? find_get_entry+0x1e/0xa0
[478381.222208]  [<ffffffff81246bd4>] ext4_find_entry+0x484/0x510
[478381.223090]  [<ffffffff812471a2>] ext4_lookup+0x52/0x160
[478381.223882]  [<ffffffff811c401d>] lookup_real+0x1d/0x60
[478381.224675]  [<ffffffff811c4698>] __lookup_hash+0x38/0x50
[478381.225697]  [<ffffffff817745bd>] lookup_slow+0x45/0xab
[478381.226941]  [<ffffffff811c690e>] link_path_walk+0x7ae/0x820
[478381.227880]  [<ffffffff811c6a42>] path_init+0xc2/0x430
[478381.228677]  [<ffffffff813e6e26>] ? security_file_alloc+0x16/0x20
[478381.229776]  [<ffffffff811c8c57>] path_openat+0x77/0x620
[478381.230767]  [<ffffffff81185c6e>] ? page_add_file_rmap+0x2e/0x70
[478381.232019]  [<ffffffff811cb253>] do_filp_open+0x43/0xa0
[478381.233016]  [<ffffffff8108c4a9>] ? creds_are_invalid+0x29/0x70
[478381.234072]  [<ffffffff811c0cb0>] do_open_execat+0x70/0x170
[478381.235039]  [<ffffffff811c1bf8>] do_execveat_common.isra.36+0x1b8/0x6e0
[478381.236051]  [<ffffffff811c214c>] do_execve+0x2c/0x30
[478381.236809]  [<ffffffff811ca392>] ? getname+0x12/0x20
[478381.237564]  [<ffffffff811c23be>] SyS_execve+0x2e/0x40
[478381.238338]  [<ffffffff81780a1d>] stub_execve+0x6d/0xa0
[478381.239126] ------------[ cut here ]------------
[478381.239915] ------------[ cut here ]------------
[478381.240606] INFO: task python2.7:1168375 blocked for more than 120 seconds.
[478381.242673]       Not tainted 4.0.9-38_fbk5_hotfix1_2936_g85409c6 #1
[478381.243653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[478381.244902] python2.7       D ffff88005cf8fb98     0 1168375 1168248 0x00000080
[478381.244904]  ffff88005cf8fb98 ffff88016c1f0980 ffffffff81c134c0 ffff88016c1f11a0
[478381.246023]  ffff88005cf8ffd8 ffff880466cd0cbc ffff88016c1f0980 00000000ffffffff
[478381.247138]  ffff880466cd0cc0 ffff88005cf8fbb8 ffffffff8177cea7 ffff88005cf8fcc8
[478381.248252] Call Trace:
[478381.248630]  [<ffffffff8177cea7>] schedule+0x37/0x90
[478381.249382]  [<ffffffff8177d08e>] schedule_preempt_disabled+0xe/0x10
[478381.250465]  [<ffffffff8177e892>] __mutex_lock_slowpath+0x92/0x100
[478381.251409]  [<ffffffff8177e91b>] mutex_lock+0x1b/0x2f
[478381.252199]  [<ffffffff817745ae>] lookup_slow+0x36/0xab
[478381.253023]  [<ffffffff811c690e>] link_path_walk+0x7ae/0x820
[478381.253877]  [<ffffffff811aeb41>] ? try_charge+0xc1/0x700
[478381.254690]  [<ffffffff811c6a42>] path_init+0xc2/0x430
[478381.255525]  [<ffffffff813e6e26>] ? security_file_alloc+0x16/0x20
[478381.256450]  [<ffffffff811c8c57>] path_openat+0x77/0x620
[478381.257256]  [<ffffffff8115b2fb>] ? lru_cache_add_active_or_unevictable+0x2b/0xa0
[478381.258390]  [<ffffffff8117b623>] ? handle_mm_fault+0x13f3/0x1720
[478381.259309]  [<ffffffff811cb253>] do_filp_open+0x43/0xa0
[478381.260139]  [<ffffffff811d7ae2>] ? __alloc_fd+0x42/0x120
[478381.260962]  [<ffffffff811b95ac>] do_sys_open+0x13c/0x230
[478381.261779]  [<ffffffff81011393>] ? syscall_trace_enter_phase1+0x113/0x170
[478381.262851]  [<ffffffff811b96c2>] SyS_open+0x22/0x30
[478381.263598]  [<ffffffff81780532>] system_call_fastpath+0x12/0x17
[478381.264551] ------------[ cut here ]------------
[478381.265377] ------------[ cut here ]------------
Signed-off-by: NJens Axboe <axboe@fb.com>
Reviewed-by: NJeff Moyer <jmoyer@redhat.com>

b8269db4

block: drbd: avoid to use BIO_MAX_SIZE · 8bf223c2

由 Ming Lei 提交于 5月 30, 2016

Use BIO_MAX_PAGES instead and we will remove BIO_MAX_SIZE.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Tested-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

8bf223c2

block: bio: remove BIO_MAX_SECTORS · 30ac4607

由 Ming Lei 提交于 6月 09, 2016

No one need this macro, so remove it. The motivation is for supporting
multipage bvecs, in which we only know what the max count of bvecs is
supported in the bio, instead of max size or max sectors.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Tested-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

30ac4607

fs: xfs: replace BIO_MAX_SECTORS with BIO_MAX_PAGES · c908e380

由 Ming Lei 提交于 5月 30, 2016

BIO_MAX_PAGES is used as maximum count of bvecs, so
replace BIO_MAX_SECTORS with BIO_MAX_PAGES since
BIO_MAX_SECTORS is to be removed.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Tested-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

c908e380

iov_iter: use bvec iterator to implement iterate_bvec() · 1bdc76ae

由 Ming Lei 提交于 5月 30, 2016

bvec has one native/mature iterator for long time, so not
necessary to use the reinvented wheel for iterating bvecs
in lib/iov_iter.c.

Two ITER_BVEC test cases are run:
	- xfstest(-g auto) on loop dio/aio, no regression found
	- swap file works well under extreme stress(stress-ng --all 64 -t
	  800 -v), and lots of OOMs are triggerd, and the whole
	system still survives
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Tested-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

1bdc76ae

block: mark 1st parameter of bvec_iter_advance as const · 80f162ff

由 Ming Lei 提交于 5月 30, 2016

bvec_iter_advance() only writes the parameter of iterator,
so the base address of bvec can be marked as const safely.

Without the change, we can see compiling warning in the
following patch for implementing iterate_bvec(): lib/iov_iter.c
with bvec iterator.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Tested-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

80f162ff

block: move two bvec structure into bvec.h · 0781e79e

由 Ming Lei 提交于 5月 30, 2016

This patch moves 'struct bio_vec' and 'struct bvec_iter'
into 'include/linux/bvec.h', then always include this header
into 'include/linux/blk_types.h'.

With this change, both 'struct bvec_iter' and bvec iterator
helpers don't depend on CONFIG_BLOCK any more, then we can
use bvec iterator to implement iterate_bvec(): lib/iov_iter.c.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Suggested-by: NChristoph Hellwig <hch@infradead.org>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Tested-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

0781e79e

block: move bvec iterator into include/linux/bvec.h · 8fc55455

由 Ming Lei 提交于 6月 09, 2016

bvec iterator helpers should be used to implement by
iterate_bvec():lib/iov_iter.c too, and move them into
one header, so that we can keep bvec iterator header
out of CONFIG_BLOCK. Then we can remove the reinventing
of wheel in iterate_bvec().
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMing Lei <ming.lei@canonical.com>
Tested-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

8fc55455

09 6月, 2016 3 次提交

blk-mq: actually hook up defer list when running requests · 52b9c330

由 Omar Sandoval 提交于 6月 08, 2016

If ->queue_rq() returns BLK_MQ_RQ_QUEUE_OK, we use continue and skip
over the rest of the loop body. However, dptr is assigned later in the
loop body, and the BLK_MQ_RQ_QUEUE_OK case is exactly the case that we'd
want it for.

NVMe isn't actually using BLK_MQ_F_DEFER_ISSUE yet, nor is any other
in-tree driver, but if the code's going to be there, it might as well
work.

Fixes: 74c45052 ("blk-mq: add a 'list' parameter to ->queue_rq()")
Signed-off-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

52b9c330

block: better packing for struct request · ca93e453

由 Christoph Hellwig 提交于 6月 09, 2016

Keep the 32-bit CPU and cmd_type flags together to avoid holes on 64-bit
architectures.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@fb.com>

ca93e453

ext4: use bio op helprs in ext4 crypto code · 60a40096

由 Mike Christie 提交于 6月 08, 2016

This was missed from my last patchset.

This patch has ext4 crypto code use the bio op helper
to set the operation. The operation (discard, write, writesame,
etc) is now defined seperately from the other REQ bits. They
still share the bi_rw field to save space, so we use these
helpers so modules do not have to worry about setting/overwriting
info.

Jens, I am not sure how you handle patches on top of patches
in the next branches. If you merge patches that fix issues
in previous patches in next, then this patch could be part
of

commit 95fe6c1a
Author: Mike Christie <mchristi@redhat.com>
Date:   Sun Jun 5 14:31:48 2016 -0500

    block, fs, mm, drivers: use bio set/get op accessors
Signed-off-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

60a40096

08 6月, 2016 10 次提交

cfq-iosched: Convert to use highres timers · 91148325

由 Jan Kara 提交于 6月 08, 2016

Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
Signed-off-by: NJan Kara <jack@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

91148325

cfq-iosched: Expose microsecond interfaces · d2d481d0

由 Jeff Moyer 提交于 6月 08, 2016

Expose interfaces to tune time slices of CFQ IO scheduler in
microseconds.
Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

d2d481d0

cfq-iosched: Convert from jiffies to nanoseconds · 9a7f38c4

由 Jeff Moyer 提交于 6月 08, 2016

Convert all time-keeping in CFQ IO scheduler from jiffies to nanoseconds
so that we can later make the intervals more fine-grained than jiffies.
One jiffie is several miliseconds and even for today's rotating disks
that is a noticeable amount of time and thus we leave disk unnecessarily
idle.
Signed-off-by: NJeff Moyer <jmoyer@redhat.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

9a7f38c4

block, drivers, fs: rename REQ_FLUSH to REQ_PREFLUSH · 28a8f0d3

由 Mike Christie 提交于 6月 05, 2016

To avoid confusion between REQ_OP_FLUSH, which is handled by
request_fn drivers, and upper layers requesting the block layer
perform a flush sequence along with possibly a WRITE, this patch
renames REQ_FLUSH to REQ_PREFLUSH.
Signed-off-by: NMike Christie <mchristi@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

28a8f0d3

block: do not use REQ_FLUSH for tracking flush support · a418090a

由 Mike Christie 提交于 6月 05, 2016

The last patch added a REQ_OP_FLUSH for request_fn drivers
and the next patch renames REQ_FLUSH to REQ_PREFLUSH which
will be used by file systems and make_request_fn drivers so
they can send a write/flush combo.

This patch drops xen's use of REQ_FLUSH to track if it supports
REQ_OP_FLUSH requests, so REQ_FLUSH can be deleted.
Signed-off-by: NMike Christie <mchristi@redhat.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJuergen Gross <kernel@pfupf.net>
Signed-off-by: NJens Axboe <axboe@fb.com>

a418090a

block, drivers: add REQ_OP_FLUSH operation · 3a5e02ce

由 Mike Christie 提交于 6月 05, 2016

This adds a REQ_OP_FLUSH operation that is sent to request_fn
based drivers by the block layer's flush code, instead of
sending requests with the request->cmd_flags REQ_FLUSH bit set.
Signed-off-by: NMike Christie <mchristi@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

3a5e02ce

block, fs, drivers: remove REQ_OP compat defs and related code · 4e1b2d52

由 Mike Christie 提交于 6月 05, 2016

This patch drops the compat definition of req_op where it matches
the rq_flag_bits definitions, and drops the related old and compat
code that allowed users to set either the op or flags for the operation.

We also then store the operation in the bi_rw/cmd_flags field similar
to how we used to store the bio ioprio where it sat in the upper bits
of the field.
Signed-off-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

4e1b2d52

block, drivers, fs: shrink bi_rw from long to int · 6296b960

由 Mike Christie 提交于 6月 05, 2016

We don't need bi_rw to be so large on 64 bit archs, so
reduce it to unsigned int.
Signed-off-by: NMike Christie <mchristi@redhat.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

6296b960

block: move bio io prio to a new field · 43b62ce3

由 Mike Christie 提交于 6月 05, 2016

In the next patch, we move drop the compat code and make
the op a separate value that is hidden in bi_rw. To give
the op and rq bits flags room to grow this moves prio to
its own field.
Signed-off-by: NMike Christie <mchristi@redhat.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

43b62ce3

ide cd: do not set REQ_WRITE on requests. · 8e45c6f8

由 Mike Christie 提交于 6月 05, 2016

The block layer will set the correct READ/WRITE operation flags/fields
when creating a request, so there is not need for drivers to set the
REQ_WRITE flag.
Signed-off-by: NMike Christie <mchristi@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@fb.com>

8e45c6f8

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功