1. 02 Jun, 2015 1 commit
    • writeback: move backing_dev_info->state into bdi_writeback · 4452226e
      Committed by Tejun Heo
      Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback)
      and the role of the separation is unclear.  For cgroup support for
      writeback IOs, a bdi will be updated to host multiple wb's where each
      wb serves writeback IOs of a different cgroup on the bdi.  To achieve
      that, a wb should carry all states necessary for servicing writeback
      IOs for a cgroup independently.
      
      This patch moves bdi->state into wb.
      
      * enum bdi_state is renamed to wb_state and the prefix of all enums is
        changed from BDI_ to WB_.
      
      * Explicit zeroing of bdi->state is removed without adding zeroing of
        wb->state, as the whole data structure is zeroed on init anyway.
      
      * As there's still only one bdi_writeback per backing_dev_info, all
        uses of bdi->state are mechanically replaced with bdi->wb.state,
        introducing no behavior changes (see the sketch below).
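      
      An editorial sketch in C of the shape of the change (flag names
      abridged; the exact WB_* set follows the kernel tree of the time):
      
          /* Before: per-device state lived on the bdi itself. */
          enum bdi_state {
                  BDI_async_congested,
                  BDI_sync_congested,
                  BDI_registered,
          };
          
          /* After: the same flags, renamed WB_* and carried by each wb. */
          enum wb_state {
                  WB_async_congested,
                  WB_sync_congested,
                  WB_registered,
          };
          
          struct bdi_writeback {
                  unsigned long state;            /* WB_* flags */
          };
          
          struct backing_dev_info {
                  struct bdi_writeback wb;        /* single embedded wb */
          };
          
          /* Call sites are converted mechanically, e.g.
           *   test_bit(BDI_registered, &bdi->state)
           * becomes
           *   test_bit(WB_registered, &bdi->wb.state)
           */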
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: drbd-dev@lists.linbit.com
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  2. 06 May, 2015 2 commits
  3. 02 May, 2015 1 commit
    • rbd: end I/O the entire obj_request on error · 082a75da
      Committed by Ilya Dryomov
      When we end the I/O struct request with an error, we need to pass
      obj_request->length as @nr_bytes so that the entire obj_request worth
      of bytes is completed.  Otherwise the block layer ends up confused and
      we trip on
      
          rbd_assert(more ^ (which == img_request->obj_request_count));
      
      in rbd_img_obj_callback() due to more being true no matter what.  We
      already do this in most cases, but we are missing some, in particular
      those where we don't even get a chance to submit any obj_requests, due
      to an early -ENOMEM for example.
      
      A number of obj_request->xferred assignments seem to be redundant but
      I haven't touched any of obj_request->xferred stuff to keep this small
      and isolated.
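      
      A hedged sketch of the fix's shape (simplified; img_request->rq names
      the struct request backing the image request in the driver of the
      time):
      
          /* Complete the whole obj_request worth of bytes on error, so
           * the block layer's remaining-byte accounting reaches zero and
           * `more` eventually goes false in rbd_img_obj_callback(). */
          blk_end_request(img_request->rq, result,
                          obj_request->length); /* not obj_request->xferred */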
      
      Cc: Alex Elder <elder@linaro.org>
      Cc: stable@vger.kernel.org # 3.10+
      Reported-by: Shawn Edwards <lesser.evil@gmail.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  4. 22 Apr, 2015 1 commit
  5. 20 Apr, 2015 2 commits
  6. 16 Apr, 2015 11 commits
  7. 15 Apr, 2015 1 commit
  8. 12 Apr, 2015 1 commit
  9. 11 Apr, 2015 1 commit
    • sd, mmc, virtio_blk, string_helpers: fix block size units · b9f28d86
      Committed by James Bottomley
      The current string_get_size() overflows when the device size goes over
      2^64 bytes because the string helper routine computes the suffix from
      the size in bytes.  However, the entirety of Linux thinks in terms of
      blocks, not bytes, so this will artificially induce an overflow on very
      large devices.  Fix this by making the function string_get_size() take
      blocks and the block size instead of bytes.  This should allow us to
      keep working until the current SCSI standard overflows.
      
      Also fix virtio_blk and mmc (both of which were also artificially
      multiplying by the block size to pass a byte size to string_get_size()).
      
      The mathematics of this is pretty simple:  we're taking a product of
      size in blocks (S) and block size (B) and trying to re-express this in
      exponential form: S*B = R*N^E (where N, the base, is either 1000 or
      1024) with R < N.  Mathematically, S = RS*N^ES and B = RB*N^EB, so if
      RS*RB < N it's easy to see that S*B = RS*RB*N^(ES+EB).  However, if
      RS*RB >= N, that product can be re-expressed as RS*RB = R*N (where
      R = RS*RB/N < N), so the whole expression becomes S*B = R*N^(ES+EB+1).
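      
      A runnable userspace model of that exponent-combining step (decimal
      base N = 1000; remainder digits are dropped here for brevity, whereas
      the real helper keeps them for rounding):
      
          #include <stdint.h>
          #include <stdio.h>
          
          static void print_size(uint64_t blocks, uint64_t blk_size)
          {
                  const uint64_t N = 1000;   /* 1024 for binary units */
                  unsigned int exp = 0;
          
                  /* Reduce each factor below N, pooling the exponent. */
                  while (blocks >= N)   { blocks /= N;   exp++; }
                  while (blk_size >= N) { blk_size /= N; exp++; }
          
                  /* Both factors < N, so the product < N*N: no overflow. */
                  uint64_t r = blocks * blk_size;
                  if (r >= N) {              /* the RS*RB >= N carry case */
                          r /= N;
                          exp++;
                  }
                  printf("~%llu * 10^%u bytes\n",
                         (unsigned long long)r, 3 * exp);
          }
          
          int main(void)
          {
                  /* 2^32 blocks of 4 KiB: the byte count itself is never
                   * formed, so sizes past 2^64 bytes cannot overflow. */
                  print_size(4294967296ULL, 4096);
                  return 0;
          }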
      
      [jejb: fix incorrect 32 bit do_div spotted by kbuild test robot <fengguang.wu@intel.com>]
      Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
  10. 08 Apr, 2015 4 commits
  11. 07 Apr, 2015 2 commits
  12. 03 Apr, 2015 7 commits
  13. 01 Apr, 2015 6 commits
    • drivers/block/pmem: Fix 32-bit build warning in pmem_alloc() · 4c1eaa23
      Committed by Ingo Molnar
      Fix:
      
        drivers/block/pmem.c: In function ‘pmem_alloc’:
        drivers/block/pmem.c:138:7: warning: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘phys_addr_t’ [-Wformat=]
      
      Fix it by using %pa, the proper format specifier for 'phys_addr_t'
      arguments.
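      
      A minimal before/after illustration (the message text is approximate;
      what matters is that %pa takes the phys_addr_t by reference):
      
          /* Before: %llx assumes a 64-bit integer, so this warns when
           * phys_addr_t is 32 bits wide. */
          dev_warn(dev, "could not ioremap %llx\n", pmem->phys_addr);
          
          /* After: %pa prints a phys_addr_t at its native width on
           * every architecture and takes a pointer to the value. */
          dev_warn(dev, "could not ioremap %pa\n", &pmem->phys_addr);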
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-nvdimm@ml01.01.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • drivers/block/pmem: Add a driver for persistent memory · 9e853f23
      Committed by Ross Zwisler
      PMEM is a new driver that presents a reserved range of memory as
      a block device.  This is useful for developing with NV-DIMMs,
      and can be used with volatile memory as a development platform.
      
      This patch contains the initial driver from Ross Zwisler, with
      various changes: converted it to use a platform_device for
      discovery, fixed partition support and merged various patches
      from Boaz Harrosh.
      Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boaz Harrosh <boaz@plexistor.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-nvdimm@ml01.01.org
      Link: http://lkml.kernel.org/r/1427872339-6688-3-git-send-email-hch@lst.de
      [ Minor cleanups. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • NVMe: increase depth of admin queue · d31af0a3
      Committed by Jens Axboe
      Usually the admin queue depth of 64 is plenty, but for some use cases
      we really need it larger.  Examples are use cases like MAT, where you
      have to touch all of the NAND for init/format-like purposes.  In those
      cases, we see a good 2x performance increase with an increased queue
      depth.
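      
      The change itself is a one-liner; assuming the NVME_AQ_DEPTH macro
      name used by the driver of that era, its shape is:
      
          #define NVME_AQ_DEPTH   256     /* previously 64 */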
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
    • nvme: Fix PRP list calculation for non-4k system page size · f137e0f1
      Committed by Murali Iyer
      PRP list calculation is supposed to be based on the device's page
      size.  Without this fix, systems with a page size larger than the
      device's page size corrupt the namespace as well as system memory.
      Systems like x86 might not experience this issue because they use a
      PAGE_SIZE of 4K, whereas powerpc uses a PAGE_SIZE of 64K while an NVMe
      device's page size varies depending upon the vendor.
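      
      A simplified model of the corrected calculation (names hypothetical;
      the device page size comes from the controller configuration field
      CC.MPS, which encodes 2^(12+mps) bytes):
      
          /* Size PRP entries by the *device's* page size, not the CPU's
           * PAGE_SIZE.  On ppc64, PAGE_SIZE is 64 KiB while the device
           * may use 4 KiB pages; dividing by PAGE_SIZE undercounts the
           * entries and the transfer scribbles past the PRP list. */
          static unsigned int nvme_prp_entries(size_t len, unsigned int mps)
          {
                  size_t dev_page_size = (size_t)1 << (12 + mps);
          
                  return (len + dev_page_size - 1) / dev_page_size;
          }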
      Signed-off-by: Murali Iyer <mniyer@us.ibm.com>
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • NVMe: Fix blk-mq hot cpu notification · 1efccc9d
      Committed by Keith Busch
      The driver may issue commands to a device that may never return, so its
      request_queue could always have active requests while the controller is
      running. Waiting for the queue to freeze could block forever, which is
      what blk-mq's hot cpu notification handler was doing when nvme drives
      were in use.
      
      This patch has the nvme driver reserve the asynchronous event
      command's tag and not keep the request active.  We can't have more
      than one outstanding since the request is released back to the
      request_queue before the command is completed.  Having only one avoids
      potential tag collisions, and reserving the tag for this purpose
      prevents other admin tasks from reusing it.
      
      I also couldn't think of a scenario where issuing AEN requests at a
      depth of one is worse than issuing them in batches, so I don't think
      we lose anything with this change.
      
      As an added bonus, doing it this way removes "Cancelling I/O" warnings
      observed when unbinding the nvme driver from a device.
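      
      A hedged sketch of the scheme against the blk-mq API of that era
      (field and call shapes simplified):
      
          /* Carve one reserved tag out of the admin tag set for the AEN. */
          dev->admin_tagset.queue_depth = NVME_AQ_DEPTH - 1;
          dev->admin_tagset.reserved_tags = 1;
          
          /* Allocate the async event command from the reserved pool, so
           * it can never collide with, or starve, normal admin commands. */
          req = blk_mq_alloc_request(dev->admin_q, WRITE, GFP_ATOMIC,
                                     true /* reserved */);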
      Reported-by: Yigal Korman <yigal@plexistor.com>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • NVMe: embedded iod mask cleanup · fda631ff
      Committed by Chong Yuan
      Remove the unused mask in nvme_alloc_iod().
      Signed-off-by: Chong Yuan <chong.yuan@memblaze.com>
      Reviewed-by: Wenbo Wang <wenbo.wang@memblaze.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>