1. 05 Aug 2016 (2 commits)
  2. 21 Jul 2016 (5 commits)
  3. 13 Jul 2016 (1 commit)
    •
      pmem: kill __pmem address space · 7a9eb206
      Authored by Dan Williams
      The __pmem address space was meant to annotate codepaths that touch
      persistent memory and need to coordinate a call to wmb_pmem().  Now that
      wmb_pmem() is gone, there is little need to keep this annotation.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      7a9eb206
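      For context, a hedged sketch of what the removed annotation looked
      like: the sparse definition below matches include/linux/compiler.h of
      this era, while the pointer declarations are purely illustrative.

        /* sparse-only address-space annotation for persistent memory */
        #ifdef __CHECKER__
        # define __pmem __attribute__((noderef, address_space(5)))
        #else
        # define __pmem
        #endif

        void __pmem *pmem_addr;   /* before: sparse flags mixed-space derefs */
        void *plain_addr;         /* after this commit: a plain pointer */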
  4. 28 Jun 2016 (1 commit)
    •
      block: Convert fifo_time from ulong to u64 · 9828c2c6
      Authored by Jan Kara
      Currently rq->fifo_time is unsigned long, but CFQ stores a nanosecond
      timestamp in it, which would overflow on 32-bit archs. Convert it to
      u64 to avoid the overflow. Since rq->fifo_time is unioned with struct
      call_single_data, this does not change the size of struct request in
      any way.
      
      We have to slightly fix up block/deadline-iosched.c so that the
      comparison happens in the right types.
      
      Fixes: 9a7f38c4
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      9828c2c6
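      To make the overflow concrete, a minimal sketch (the helper name is
      illustrative, not from the patch): on a 32-bit arch unsigned long is
      32 bits, so a nanosecond clock stored in it wraps after 2^32 ns,
      roughly 4.3 seconds, while a u64 holds centuries of nanoseconds.

        #include <linux/types.h>

        /* expiry test on the widened field, done in a 64-bit type */
        static inline bool sketch_fifo_expired(u64 fifo_time, u64 now_ns)
        {
                return now_ns >= fifo_time;  /* no 32-bit wraparound */
        }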
  5. 09 Jun 2016 (2 commits)
  6. 08 Jun 2016 (6 commits)
  7. 19 May 2016 (1 commit)
  8. 17 May 2016 (3 commits)
    •
      block: Update blkdev_dax_capable() for consistency · a8078b1f
      Authored by Toshi Kani
      blkdev_dax_capable() is similar to bdev_dax_supported(), but needs
      to remain a separate interface for checking the dax capability of
      a raw block device.
      
      Rename and relocate blkdev_dax_capable() so that the two are
      maintained consistently, and call bdev_direct_access() for the dax
      capability check.
      
      There is no change in behavior.
      
      Link: https://lkml.org/lkml/2016/5/9/950
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      a8078b1f
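      A hedged sketch of the kind of capability probe described, using the
      blk_dax_ctl/bdev_direct_access() interface introduced by commit
      b2e0d162 further down this page; the helper name is hypothetical and
      error handling is simplified.

        /* Probe dax capability by requesting a one-page direct_access
         * mapping at sector 0; a negative return means no dax. */
        static bool sketch_dax_capable(struct block_device *bdev)
        {
                struct blk_dax_ctl dax = {
                        .sector = 0,
                        .size = PAGE_SIZE,
                };

                return bdev_direct_access(bdev, &dax) >= 0;
        }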
    •
      block: Add bdev_dax_supported() for dax mount checks · 2d96afc8
      Authored by Toshi Kani
      DAX imposes additional requirements on a device.  Add
      bdev_dax_supported(), which performs all the precondition checks
      necessary for a filesystem to mount the device with the dax option.
      
      Also add a new check to verify that a partition is aligned to 4KB.
      When a partition is unaligned, any dax read/write access fails,
      except for metadata updates.
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      2d96afc8
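      A hedged usage sketch from the filesystem side; the signature of this
      era took the superblock and blocksize, and sketch_clear_dax_opt() is
      a hypothetical stand-in for clearing the dax mount option.

        /* Mount-time precondition check, as a filesystem might do it */
        static void sketch_mount_dax_check(struct super_block *sb)
        {
                if (bdev_dax_supported(sb, sb->s_blocksize)) {
                        pr_warn("DAX unsupported by block device. Turning off DAX.\n");
                        sketch_clear_dax_opt(sb);  /* fall back to non-dax I/O */
                }
        }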
    •
      block: Add vfs_msg() interface · 2af3a815
      Authored by Toshi Kani
      In preparation for moving DAX capability checks from filesystem code
      to the block layer, add a VFS message interface that aligns with the
      filesystems' message format.
      
      For instance, a vfs_msg() message followed by XFS messages in the
      case of a dax mount error may look like:
      
        VFS (pmem0p1): error: unaligned partition for dax
        XFS (pmem0p1): DAX unsupported by block device. Turning off DAX.
        XFS (pmem0p1): Mounting V5 Filesystem
         :
      
      vfs_msg() is largely based on ext4_msg().
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      2af3a815
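      Since vfs_msg() is modeled on ext4_msg(), a hedged sketch of that
      printk pattern (argument list approximated; the real function lives
      in fs/block_dev.c):

        /* Prefix messages with "VFS (<device>):" via the %pV vaf idiom */
        static void sketch_vfs_msg(struct block_device *bdev,
                                   const char *prefix, const char *fmt, ...)
        {
                struct va_format vaf;
                va_list args;

                va_start(args, fmt);
                vaf.fmt = fmt;
                vaf.va = &args;
                printk("%sVFS (%pg): %pV\n", prefix, bdev, &vaf);
                va_end(args);
        }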
  9. 02 May 2016 (1 commit)
  10. 14 Apr 2016 (1 commit)
  11. 13 Apr 2016 (3 commits)
  12. 05 Apr 2016 (1 commit)
    •
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Authored by Kirill A. Shutemov
      The PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long*
      time ago with the promise that one day it would be possible to
      implement the page cache with bigger chunks than PAGE_SIZE.
      
      That promise never materialized, and it is unlikely it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE, and it is a constant source of confusion whether the
      PAGE_CACHE_* or PAGE_* constants should be used in a particular case,
      especially on the border between fs and mm.
      
      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too
      much breakage to be doable.
      
      Let's stop pretending that pages in the page cache are special.  They
      are not.
      
      The changes are pretty straightforward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header
      files, so I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
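      The semantic patch above rewrites call sites mechanically; an
      illustrative before/after of the kind of code it transforms (not
      taken from any specific file):

        /* before: page-cache-specific constants and helpers */
        index  = pos >> PAGE_CACHE_SHIFT;
        offset = pos & (PAGE_CACHE_SIZE - 1);
        page_cache_get(page);
        page_cache_release(page);

        /* after: the plain PAGE_* equivalents */
        index  = pos >> PAGE_SHIFT;
        offset = pos & (PAGE_SIZE - 1);
        get_page(page);
        put_page(page);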
  13. 05 Mar 2016 (1 commit)
  14. 04 Mar 2016 (3 commits)
  15. 19 Feb 2016 (1 commit)
    •
      block: Add blk_set_runtime_active() · d07ab6d1
      Authored by Mika Westerberg
      If a block device is left runtime suspended during system suspend, the
      driver's resume hook typically corrects the runtime PM status of the
      device back to "active" after it is resumed. However, this is not
      enough, as the queue's runtime PM status is still "suspended". As long
      as it is in this state, blk_pm_peek_request() returns NULL and thus
      prevents new requests from being processed.
      
      Add a new function, blk_set_runtime_active(), that can be used to
      force the queue status back to "active" as needed.
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      d07ab6d1
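      A hedged sketch of the intended call site in a driver's system-resume
      path; sketch_get_queue() is a hypothetical helper and the surrounding
      runtime-PM calls are illustrative.

        /* After system resume, mark both the device and its request
         * queue runtime-PM "active" so requests flow again. */
        static int sketch_driver_resume(struct device *dev)
        {
                struct request_queue *q = sketch_get_queue(dev);

                pm_runtime_disable(dev);
                pm_runtime_set_active(dev);
                pm_runtime_enable(dev);
                blk_set_runtime_active(q);  /* queue side of "active" */
                return 0;
        }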
  16. 05 Feb 2016 (1 commit)
  17. 16 Jan 2016 (2 commits)
    •
      mm, dax, pmem: introduce pfn_t · 34c0fd54
      Authored by Dan Williams
      For the purpose of communicating the optional presence of a 'struct
      page' for the pfn returned from ->direct_access(), introduce a type
      that encapsulates a page-frame-number plus flags.  These flags contain
      the historical "page_link" encoding for a scatterlist entry, but can
      also denote "device memory", a set of pfns that are not part of the
      kernel's linear mapping by default but are accessed via the same
      memory controller as RAM.
      
      The motivation for this new type is large capacity persistent memory
      that needs struct page entries in the 'memmap' to support 3rd party DMA
      (i.e.  O_DIRECT I/O with a persistent memory source/target).  However,
      we also need it in support of maintaining a list of mapped inodes which
      need to be unmapped at driver teardown or freeze_bdev() time.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      34c0fd54
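      A hedged sketch of the introduced type; the flag layout approximates
      include/linux/pfn_t.h as added by this commit, with the flags packed
      into the top bits of the 64-bit value.

        typedef struct {
                u64 val;  /* pfn in the low bits, flags in the high bits */
        } pfn_t;

        #define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1)) /* sg chain */
        #define PFN_SG_LAST  (1ULL << (BITS_PER_LONG_LONG - 2)) /* sg end */
        #define PFN_DEV      (1ULL << (BITS_PER_LONG_LONG - 3)) /* device memory */
        #define PFN_MAP      (1ULL << (BITS_PER_LONG_LONG - 4)) /* has memmap */

        /* a struct page exists unless the pfn is unmapped device memory */
        static inline bool sketch_pfn_t_has_page(pfn_t pfn)
        {
                return (pfn.val & PFN_MAP) == PFN_MAP ||
                       (pfn.val & PFN_DEV) == 0;
        }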
    •
      dax: fix lifetime of in-kernel dax mappings with dax_map_atomic() · b2e0d162
      Authored by Dan Williams
      The DAX implementation needs to protect new calls to ->direct_access(),
      and usage of their return values, against the driver for the underlying
      block device being disabled.  Use blk_queue_enter()/blk_queue_exit() to
      hold off blk_cleanup_queue() from proceeding, or otherwise fail new
      mapping requests if the request_queue is being torn down.
      
      This also introduces blk_dax_ctl to simplify the interface from fs/dax.c
      through dax_map_atomic() to bdev_direct_access().
      
      [willy@linux.intel.com: fix read() of a hole]
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: Jan Kara <jack@suse.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b2e0d162
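      A hedged sketch of the described pattern, simplified from fs/dax.c;
      blk_queue_enter() of this era took a gfp argument, and the __pmem
      annotation on the returned address is dropped here for brevity.

        /* Hold off blk_cleanup_queue() across a ->direct_access() call
         * by taking a request_queue usage reference. */
        static void *sketch_dax_map(struct block_device *bdev,
                                    struct blk_dax_ctl *dax)
        {
                struct request_queue *q = bdev->bd_queue;
                long avail;

                if (blk_queue_enter(q, GFP_NOWAIT))  /* queue being torn down */
                        return ERR_PTR(-ENXIO);

                avail = bdev_direct_access(bdev, dax);
                if (avail < 0) {
                        blk_queue_exit(q);  /* drop the usage reference */
                        return ERR_PTR(avail);
                }
                return dax->addr;  /* caller pairs with blk_queue_exit() */
        }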
  18. 29 Dec 2015 (1 commit)
  19. 23 Dec 2015 (1 commit)
    •
      block: defer timeouts to a workqueue · 287922eb
      Authored by Christoph Hellwig
      Timer context is not very useful for drivers to perform any meaningful
      abort action from.  So instead of calling the driver from this useless
      context, defer the work to a workqueue as soon as possible.
      
      Note that while a delayed_work item would seem the right thing here, I
      didn't dare to use it due to the magic in blk_add_timer() that pokes
      deep into timer internals.  But maybe this encourages Tejun to add a
      sensible API for that to the workqueue API and we'll all be fine in
      the end :)
      
      Contains a major update from Keith Busch:
      
      "This patch removes synchronizing the timeout work so that the timer can
       start a freeze on its own queue. The timer enters the queue, so timer
       context can only start a freeze, but not wait for frozen."
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      287922eb
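      A hedged sketch of the deferral: the timer callback runs in softirq
      context and only kicks a work item, so the real abort handling runs
      later from kblockd in process context (names approximate blk-core).

        static void sketch_rq_timed_out_timer(unsigned long data)
        {
                struct request_queue *q = (struct request_queue *)data;

                /* defer to the workqueue; no driver callbacks from here */
                kblockd_schedule_work(&q->timeout_work);
        }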
  20. 02 Dec 2015 (1 commit)
  21. 30 Nov 2015 (1 commit)
    •
      block: Always check queue limits for cloned requests · bf4e6b4e
      Authored by Hannes Reinecke
      When a cloned request is retried on another queue, it always needs to
      be checked against the limits of that queue.  Otherwise the
      calculations for nr_phys_segments might be wrong, leading to a crash
      in scsi_init_sgtable().
      
      To clarify this, the patch renames blk_rq_check_limits() to
      blk_cloned_rq_check_limits() and removes the symbol export, as the
      new function should only be used for cloned requests and never
      exported.
      
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Ewan Milne <emilne@redhat.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Fixes: e2a60da7 ("block: Clean up special command handling logic")
      Cc: stable@vger.kernel.org # 3.7+
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      bf4e6b4e
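      A hedged sketch of the call site the rename clarifies, modeled on the
      cloned-request insertion path (helper name and error value are
      illustrative):

        static int sketch_insert_cloned(struct request_queue *q,
                                        struct request *rq)
        {
                /* re-validate against the limits of the queue the request
                 * is being retried on, before dispatching it there */
                if (blk_cloned_rq_check_limits(q, rq))
                        return -EIO;

                /* proceed to queue rq on q */
                return 0;
        }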
  22. 26 Nov 2015 (1 commit)
    •
      block/sd: Fix device-imposed transfer length limits · ca369d51
      Authored by Martin K. Petersen
      Commit 4f258a46 ("sd: Fix maximum I/O size for BLOCK_PC requests")
      had the unfortunate side-effect of removing an implicit clamp to
      BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer
      code. This caused problems for some SMR drives.
      
      Debugging this issue revealed a few problems with the existing
      infrastructure since the block layer didn't know how to deal with
      device-imposed limits, only limits set by the I/O controller.
      
       - Introduce a new queue limit, max_dev_sectors, which is used by the
         ULD to signal the maximum sectors for a REQ_TYPE_FS request.
      
       - Ensure that max_dev_sectors is correctly stacked and taken into
         account when overriding max_sectors through sysfs.
      
       - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer
         values for later processing.
      
       - In sd_revalidate() set the queue's max_dev_sectors based on the
         MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value
         is not reported, fall back to a cap based on the CDB TRANSFER LENGTH
         field size.
      
       - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits
         VPD--if reported and sane--to signal the preferred device transfer
         size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS.
      
       - blk_limits_max_hw_sectors() is no longer used and can be removed.
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Tested-by: sweeneygj@gmx.com
      Tested-by: Arzeets <anatol.pomozov@gmail.com>
      Tested-by: David Eisner <david.eisner@oriel.oxon.org>
      Tested-by: Mario Kicherer <dev@kicherer.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      ca369d51
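      A hedged sketch of how the new limit participates in the max_sectors
      computation, modeled loosely on the queue-limits code of this era
      (helper name illustrative; 0 in max_dev_sectors means "not reported"):

        static unsigned int sketch_effective_max_sectors(struct queue_limits *lim)
        {
                unsigned int max = min_t(unsigned int, lim->max_hw_sectors,
                                         BLK_DEF_MAX_SECTORS);

                if (lim->max_dev_sectors)  /* device-imposed cap, if any */
                        max = min(max, lim->max_dev_sectors);

                return max;
        }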