Commits · 6600593cbd9340b3d4fcde8e58d17653732620c4 · openanolis / cloud-kernel

May 29, 2018 · 8 commits

block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE · 6600593c

Committed by Christoph Hellwig on May 29, 2018

BLK_EH_NOT_HANDLED implies that nothing happened, but very often that
is not the case: instead, the driver has already completed the
command.  Fix the symbolic name to reflect that a little better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6600593c

blk-mq: Remove generation seqeunce · 12f5b931

Committed by Keith Busch on May 29, 2018

This patch simplifies the timeout handling by relying on the request
reference counting to ensure the iterator is operating on an inflight
and truly timed out request. Since the reference counting prevents the
tag from being reallocated, the block layer no longer needs to prevent
drivers from completing their requests while the timeout handler is
operating on it: a driver completing a request is allowed to proceed to
the next state without additional synchronization with the block layer.

This also removes any need for generation sequence numbers since the
request lifetime is prevented from being reallocated as a new sequence
while timeout handling is operating on it.

To enable this, a refcount is added to struct request so that request
users can be sure they're operating on the same request without it
changing while they're processing it.  The request's tag won't be
released for reuse until both the timeout handler and the completion
are done with it.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[hch: slight cleanups, added back submission side hctx lock, use cmpxchg
 for completions]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

12f5b931
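
Illustration for 12f5b931 (not part of the commit): a minimal Python sketch of the reference-counting idea described in the message. The `Request` class and its `get()`/`put()` helpers are hypothetical names for this toy model, not the kernel code. It only shows why the tag cannot be reused while the timeout handler still holds a reference.

```python
class Request:
    """Toy model: the tag is released only when the last reference is dropped."""

    def __init__(self, tag):
        self.tag = tag
        self.refs = 1              # reference held by the submission/completion path
        self.tag_released = False

    def get(self):
        """Take a reference; fails once the count has already reached zero."""
        if self.refs == 0:
            return False
        self.refs += 1
        return True

    def put(self):
        """Drop a reference; the tag becomes reusable only at zero."""
        self.refs -= 1
        if self.refs == 0:
            self.tag_released = True


rq = Request(tag=7)
pinned = rq.get()                            # timeout handler pins the request
rq.put()                                     # driver completes the request meanwhile
held_during_timeout = not rq.tag_released    # tag is still owned by this request
rq.put()                                     # timeout handler is done
released_after = rq.tag_released
```

In this model the completion can run at any point without extra locking; whichever side drops the last reference is the one that releases the tag.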

blk-mq: Fix timeout and state order · ad103e79

Committed by Keith Busch on May 29, 2018

The block layer had been setting the state to in-flight prior to updating
the timer. This is the wrong order since the timeout handler could observe
the in-flight state with the older timeout, believing the request had
expired when in fact it is just getting started.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ad103e79
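
Illustration for ad103e79 (not part of the commit): a small Python sketch of the ordering problem, assuming a toy request with just a state and a deadline. The function names and the numbers are made up for the example. It lists what a concurrent timeout scan could observe after each step of starting a request.

```python
IDLE, IN_FLIGHT = "idle", "in-flight"


def looks_expired(state, deadline, now):
    """What a timeout scan concludes from one snapshot of a request."""
    return state == IN_FLIGHT and now >= deadline


def start_request(set_deadline_first, now, timeout, stale_deadline):
    """Yield the (state, deadline) snapshots visible after each step."""
    state, deadline = IDLE, stale_deadline
    steps = ["deadline", "state"] if set_deadline_first else ["state", "deadline"]
    for step in steps:
        if step == "deadline":
            deadline = now + timeout
        else:
            state = IN_FLIGHT
        yield state, deadline


now = 1000
# Old order: state first, so one snapshot pairs in-flight with the stale deadline.
old_order = [looks_expired(s, d, now) for s, d in start_request(False, now, 30, 900)]
# Fixed order: deadline first, so no snapshot looks expired.
new_order = [looks_expired(s, d, now) for s, d in start_request(True, now, 30, 900)]
```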

libata: remove ata_scsi_timed_out · 01fc27d9

Committed by Christoph Hellwig on May 29, 2018

As far as I can tell this function can't even be called any more, given
that ATA implements its own eh_strategy_handler with ata_scsi_error, which
never calls ->eh_timed_out.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

01fc27d9

bcache: Replace bch_read_string_list() by __sysfs_match_string() · ce4c3e19

Committed by Andy Shevchenko on May 28, 2018

The kernel library has a common function to match user input from sysfs
against an array of strings. Thus, replace bch_read_string_list() by
__sysfs_match_string().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ce4c3e19
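
Illustration for ce4c3e19 (not part of the commit): a rough Python analogue of matching sysfs input against an array of strings. The helper names, the `-1` failure value, and the sample option list are assumptions for this sketch; the kernel helper reports failure with a negative errno instead.

```python
def strip_newline(text):
    """Drop one trailing newline, as written by `echo value > file`."""
    return text[:-1] if text.endswith("\n") else text


def match_string(options, user_input):
    """Return the index of the option equal to the input, or -1 if none."""
    wanted = strip_newline(user_input)
    for index, option in enumerate(options):
        if option == wanted:
            return index
    return -1


# Illustrative option list only.
modes = ["writethrough", "writeback", "writearound", "none"]
selected = match_string(modes, "writeback\n")
rejected = match_string(modes, "bogus\n")
```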

bcache: Move couple of functions to sysfs.c · ecb37ce9

Committed by Andy Shevchenko on May 28, 2018

There are a couple of functions that are used exclusively in sysfs.c.
Move them there and make them static.

Besides the above, this will allow further cleanup.

No functional change intended.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ecb37ce9

bcache: Move couple of string arrays to sysfs.c · 04cbc211

Committed by Andy Shevchenko on May 28, 2018

There are a couple of string arrays that are used exclusively in sysfs.c.
Move them there and make them static.

Besides the above, this will allow further cleanup.

No functional change intended.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

04cbc211

bcache: stop bcache device when backing device is offline · 0f0709e6

Committed by Coly Li on May 28, 2018

Currently bcache does not handle backing device failure: if the backing
device is offline and disconnected from the system, its bcache device can
still be accessed. If the bcache device is in writeback mode, I/O requests
can even succeed if they hit on the cache device. That is to say, when and
how bcache handles an offline backing device is undefined.

This patch tries to handle an offline backing device in a rather simple way:
- Add the cached_dev->status_update_thread kernel thread to update the
  backing device status every second.
- Add cached_dev->offline_seconds to record how many seconds the backing
  device has been observed to be offline. If the backing device is offline
  for BACKING_DEV_OFFLINE_TIMEOUT (30) seconds, set dc->io_disable to 1 and
  call bcache_device_stop() to stop the bcache device linked to the
  offline backing device.

Now if a backing device is offline for BACKING_DEV_OFFLINE_TIMEOUT seconds,
its bcache device will be removed; a user space application writing to
it will then get an error immediately and can handle the device failure in
time.

This patch is quite simple and does not handle more complicated situations.
Once the bcache device is stopped, users need to recover the backing
device, then register and attach it manually.

Changelog:
v3: call wait_for_kthread_stop() before exiting the kernel thread.
v2: remove "bcache: " prefix when calling pr_warn().
v1: initial version.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Junhui Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0f0709e6
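
Illustration for 0f0709e6 (not part of the commit): a toy Python model of the once-per-second status check described in the message. The `CachedDev` class, `tick()` and `run()` are hypothetical names. Resetting the counter when the device is seen online again is an assumption of this sketch, and setting `stopped` merely stands in for the call to bcache_device_stop().

```python
BACKING_DEV_OFFLINE_TIMEOUT = 30  # seconds, value quoted in the commit message


class CachedDev:
    def __init__(self):
        self.offline_seconds = 0
        self.io_disable = False
        self.stopped = False

    def tick(self, backing_dev_online):
        """One iteration of the status thread, run once per second."""
        if self.stopped:
            return
        if backing_dev_online:
            self.offline_seconds = 0      # assumption: counter restarts when online
            return
        self.offline_seconds += 1
        if self.offline_seconds >= BACKING_DEV_OFFLINE_TIMEOUT:
            self.io_disable = True
            self.stopped = True           # stands in for bcache_device_stop()


def run(samples):
    """Feed one online/offline sample per second into a fresh device."""
    dev = CachedDev()
    for online in samples:
        dev.tick(online)
    return dev
```

For example, `run([False] * 30).stopped` is true, while 29 offline seconds followed by one online second start the count over.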

May 25, 2018 · 3 commits

null_blk: add blocking description and remove lightnvm · 6723d8dc

Committed by Liu Bo on May 25, 2018

- The description of 'blocking' is missing in null_blk.txt

- The 'lightnvm' parameter has been removed in null_blk.c

This updates both in null_blk.txt.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6723d8dc

block drivers/block: Use octal not symbolic permissions · 5657a819

Committed by Joe Perches on May 24, 2018

Convert the S_<FOO> symbolic permissions to their octal equivalents as
using octal and not symbolic permissions is preferred by many as more
readable.

see: https://lkml.org/lkml/2016/8/2/1945

Done with automated conversion via:
$ ./scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace <files...>

Miscellanea:

o Wrapped modified multi-line calls to a single line where appropriate
o Realign modified multi-line calls to open parenthesis
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5657a819
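
Illustration for 5657a819 (not part of the commit): a short Python check of how symbolic permission bits map to octal values, using the standard `stat` module. `S_IRUGO` is rebuilt here from the individual read bits because Python's module does not define that combined name.

```python
import stat

# Combined "read for user, group and others" mask built from single bits.
S_IRUGO = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH

read_only = S_IRUGO                      # same value as octal 0444
owner_writable = S_IRUGO | stat.S_IWUSR  # same value as octal 0644

# Human-readable form of the second mode, for a regular file.
as_text = stat.filemode(stat.S_IFREG | owner_writable)
```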

blk-mq: avoid starving tag allocation after allocating process migrates · e6fc4649

Committed by Ming Lei on May 24, 2018

When the allocation process is scheduled back and the mapped hw queue is
changed, fake one extra wake up on previous queue for compensating wake
up miss, so other allocations on the previous queue won't be starved.

This patch fixes one request allocation hang issue, which can be
triggered easily in case of very low nr_request.

The race is as follows:

1) 2 hw queues, nr_requests are 2, and wake_batch is one

2) there are 3 waiters on hw queue 0

3) two in-flight requests in hw queue 0 are completed, and only two
   waiters of 3 are woken up because of wake_batch, but both of these
   waiters can be scheduled to another CPU and so switch to hw
   queue 1

4) then the 3rd waiter will wait forever, since no in-flight request
   is in hw queue 0 any more.

5) this patch fixes it by the fake wakeup when waiter is scheduled to
   another hw queue

Cc: <stable@vger.kernel.org>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>

Modified commit message to make it clearer, and make it apply on
top of the 4.18 branch.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e6fc4649
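
Illustration for e6fc4649 (not part of the commit): a simplified Python replay of the race listed in the message. Waiter names, the `simulate()` function, and the assumption that hw queue 1 has free tags are all made up for this sketch. Real wakeups go through the tag wait queues, which are not modelled here.

```python
from collections import deque


def simulate(fake_wakeup):
    """Replay the scenario; return (waiter -> hw queue used, still sleeping)."""
    sleeping = deque(["w1", "w2", "w3"])   # three waiters parked on hw queue 0
    free_tags = [2, 2]                     # queue 0's two requests just completed
    migrated = {"w1", "w2"}                # these wake up on a CPU mapped to queue 1
    runnable = deque()
    for _ in range(2):                     # two completions give two wakeups
        runnable.append(sleeping.popleft())

    allocated = {}
    while runnable:
        waiter = runnable.popleft()
        queue = 1 if waiter in migrated else 0
        if queue != 0 and fake_wakeup and sleeping:
            # The fix: compensate on the previous queue for the wakeup
            # this waiter consumed but will not use there.
            runnable.append(sleeping.popleft())
        free_tags[queue] -= 1
        allocated[waiter] = queue
    return allocated, list(sleeping)


without_fix = simulate(fake_wakeup=False)
with_fix = simulate(fake_wakeup=True)
```

Without the compensating wakeup, `w3` stays asleep even though hw queue 0 has free tags; with it, `w3` is woken and allocates from queue 0.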

May 24, 2018 · 2 commits

bdi: Move cgroup bdi_writeback to a dedicated low concurrency workqueue · f1834646

Committed by Tejun Heo on May 23, 2018

From 0aa2e9b921d6db71150633ff290199554f0842a8 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 23 May 2018 10:29:00 -0700

cgwb_release() punts the actual release to cgwb_release_workfn() on
system_wq.  Depending on the number of cgroups or block devices, there
can be a lot of cgwb_release_workfn() in flight at the same time.

We're periodically seeing close to 256 kworkers getting stuck with the
following stack trace and over time the entire system gets stuck.

  [<ffffffff810ee40c>] _synchronize_rcu_expedited.constprop.72+0x2fc/0x330
  [<ffffffff810ee634>] synchronize_rcu_expedited+0x24/0x30
  [<ffffffff811ccf23>] bdi_unregister+0x53/0x290
  [<ffffffff811cd1e9>] release_bdi+0x89/0xc0
  [<ffffffff811cd645>] wb_exit+0x85/0xa0
  [<ffffffff811cdc84>] cgwb_release_workfn+0x54/0xb0
  [<ffffffff810a68d0>] process_one_work+0x150/0x410
  [<ffffffff810a71fd>] worker_thread+0x6d/0x520
  [<ffffffff810ad3dc>] kthread+0x12c/0x160
  [<ffffffff81969019>] ret_from_fork+0x29/0x40
  [<ffffffffffffffff>] 0xffffffffffffffff

The events leading to the lockup are...

1. A lot of cgwb_release_workfn() is queued at the same time and all
   system_wq kworkers are assigned to execute them.

2. They all end up calling synchronize_rcu_expedited().  One of them
   wins and tries to perform the expedited synchronization.

3. However, that involves queueing rcu_exp_work to system_wq and
   waiting for it.  Because #1 is holding all available kworkers on
   system_wq, rcu_exp_work can't be executed.  cgwb_release_workfn()
   is waiting for synchronize_rcu_expedited() which in turn is waiting
   for cgwb_release_workfn() to free up some of the kworkers.

We shouldn't be scheduling hundreds of cgwb_release_workfn() at the
same time.  There's nothing to be gained from that.  This patch
updates cgwb release path to use a dedicated percpu workqueue with
@max_active of 1.

While this resolves the problem at hand, it might be a good idea to
isolate rcu_exp_work to its own workqueue too as it can be used from
various paths and is prone to this sort of indirect A-A deadlocks.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f1834646
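
Illustration for f1834646 (not part of the commit): a deliberately abstract Python model of the lockup described in the message. It treats the shared queue as a fixed number of execution slots, which is a simplification, and `helper_can_run()` and its parameters are hypothetical names. The figure 256 is taken from the message and used only as an example.

```python
def helper_can_run(shared_slots, release_jobs, dedicated_max_active=None):
    """Can a helper work item still get a slot on the shared queue?

    Every running release job waits for one helper item that must run on
    the shared queue. With dedicated_max_active set, release jobs run on
    their own queue and take no shared slots.
    """
    if dedicated_max_active is None:
        occupied = min(release_jobs, shared_slots)
    else:
        occupied = 0
    return shared_slots - occupied > 0


stuck_before = not helper_can_run(shared_slots=256, release_jobs=300)
fine_after = helper_can_run(shared_slots=256, release_jobs=300,
                            dedicated_max_active=1)
```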

nbd: set discard granularity properly · 6df133a1

Committed by Josef Bacik on May 23, 2018

For some reason we had discard granularity set to 512 always even when
discards were disabled.  Fix this by having the default be 0, and then
if we turn it on set the discard granularity to the blocksize.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6df133a1

May 23, 2018 · 2 commits

blkdev_report_zones_ioctl(): Use vmalloc() to allocate large buffers · 327ea4ad

Committed by Bart Van Assche on May 22, 2018

Avoid that complaints similar to the following appear in the kernel log
if the number of zones is sufficiently large:

  fio: page allocation failure: order:9, mode:0x140c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
  Call Trace:
  dump_stack+0x63/0x88
  warn_alloc+0xf5/0x190
  __alloc_pages_slowpath+0x8f0/0xb0d
  __alloc_pages_nodemask+0x242/0x260
  alloc_pages_current+0x6a/0xb0
  kmalloc_order+0x18/0x50
  kmalloc_order_trace+0x26/0xb0
  __kmalloc+0x20e/0x220
  blkdev_report_zones_ioctl+0xa5/0x1a0
  blkdev_ioctl+0x1ba/0x930
  block_ioctl+0x41/0x50
  do_vfs_ioctl+0xaa/0x610
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x79/0x1b0
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Fixes: 3ed05a98 ("blk-zoned: implement ioctls")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Shaun Tancheff <shaun.tancheff@seagate.com>
Cc: Damien Le Moal <damien.lemoal@hgst.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

327ea4ad
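
Illustration for 327ea4ad (not part of the commit): a worked Python calculation of why a large zone report turns into a high-order page allocation such as the `order:9` in the log. The 4 KiB page size and the 64 bytes per zone descriptor are assumptions of this sketch, and any header in the report buffer is ignored.

```python
import math

PAGE_SIZE = 4096        # assumption: 4 KiB pages
ZONE_DESC_SIZE = 64     # assumption: bytes needed per reported zone


def alloc_order(nbytes):
    """Smallest order such that PAGE_SIZE << order covers nbytes."""
    pages = max(1, math.ceil(nbytes / PAGE_SIZE))
    return (pages - 1).bit_length()


small = alloc_order(64 * ZONE_DESC_SIZE)        # fits in a single page
large = alloc_order(32768 * ZONE_DESC_SIZE)     # 2 MiB of contiguous memory
```

A physically contiguous allocation of order 9 means 512 adjacent pages, which is hard to find on a fragmented system; a virtually contiguous buffer avoids that requirement.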

block/ndb: add WQ_UNBOUND to the knbd-recv workqueue · 2189c97c

Committed by Dan Melnic on Sep 18, 2017

Add WQ_UNBOUND to the knbd-recv workqueue so we're not bound
to a single CPU that is selected at device creation time.
Signed-off-by: Dan Melnic <dmm@fb.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2189c97c

May 22, 2018 · 1 commit

blk-mq: remove wrong 'unlikely' check · b4f6f38d

Committed by huhai on May 22, 2018

When dispatch_rq_from_ctx is called, in the vast majority of cases
the ctx->rq_list is not empty.
Signed-off-by: huhai <huhai@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b4f6f38d

May 21, 2018 · 2 commits

nvme-pci: fix race between poll and IRQ completions · 68fa9dbe

Committed by Jens Axboe on May 21, 2018

If polling completions are racing with the IRQ triggered by a
completion, the IRQ handler will find no work and return IRQ_NONE.
This can trigger complaints about spurious interrupts:

[  560.169153] irq 630: nobody cared (try booting with the "irqpoll" option)
[  560.175988] CPU: 40 PID: 0 Comm: swapper/40 Not tainted 4.17.0-rc2+ #65
[  560.175990] Hardware name: Intel Corporation S2600STB/S2600STB, BIOS SE5C620.86B.00.01.0010.010920180151 01/09/2018
[  560.175991] Call Trace:
[  560.175994]  <IRQ>
[  560.176005]  dump_stack+0x5c/0x7b
[  560.176010]  __report_bad_irq+0x30/0xc0
[  560.176013]  note_interrupt+0x235/0x280
[  560.176020]  handle_irq_event_percpu+0x51/0x70
[  560.176023]  handle_irq_event+0x27/0x50
[  560.176026]  handle_edge_irq+0x6d/0x180
[  560.176031]  handle_irq+0xa5/0x110
[  560.176036]  do_IRQ+0x41/0xc0
[  560.176042]  common_interrupt+0xf/0xf
[  560.176043]  </IRQ>
[  560.176050] RIP: 0010:cpuidle_enter_state+0x9b/0x2b0
[  560.176052] RSP: 0018:ffffa0ed4659fe98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[  560.176055] RAX: ffff9527beb20a80 RBX: 000000826caee491 RCX: 000000000000001f
[  560.176056] RDX: 000000826caee491 RSI: 00000000335206ee RDI: 0000000000000000
[  560.176057] RBP: 0000000000000001 R08: 00000000ffffffff R09: 0000000000000008
[  560.176059] R10: ffffa0ed4659fe78 R11: 0000000000000001 R12: ffff9527beb29358
[  560.176060] R13: ffffffffa235d4b8 R14: 0000000000000000 R15: 000000826caed593
[  560.176065]  ? cpuidle_enter_state+0x8b/0x2b0
[  560.176071]  do_idle+0x1f4/0x260
[  560.176075]  cpu_startup_entry+0x6f/0x80
[  560.176080]  start_secondary+0x184/0x1d0
[  560.176085]  secondary_startup_64+0xa5/0xb0
[  560.176088] handlers:
[  560.178387] [<00000000efb612be>] nvme_irq [nvme]
[  560.183019] Disabling IRQ #630

A previous commit removed ->cqe_seen that was handling this case,
but we need to handle this a bit differently due to completions
now running outside the queue lock. Return IRQ_HANDLED from the
IRQ handler, if the completion ring head was moved since we last
saw it.

Fixes: 5cb525c8 ("nvme-pci: handle completions outside of the queue lock")
Reported-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Tested-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

68fa9dbe
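
Illustration for 68fa9dbe (not part of the commit): a toy Python model of the fix described in the message, where the IRQ handler reports the interrupt as handled if the completion ring head moved since it last looked. The `Queue` class and its fields are simplified stand-ins, and ring wrap-around is not modelled.

```python
IRQ_NONE, IRQ_HANDLED = 0, 1


class Queue:
    def __init__(self):
        self.cq_head = 0        # advanced by whoever reaps completions
        self.last_cq_head = 0   # head value the IRQ handler saw last time
        self.pending = 0        # completions posted but not yet reaped

    def reap(self):
        """Consume all posted completions and advance the head."""
        count, self.pending = self.pending, 0
        self.cq_head += count
        return count

    def poll(self):
        return self.reap()

    def irq(self):
        moved = self.cq_head != self.last_cq_head
        found = self.reap()
        self.last_cq_head = self.cq_head
        return IRQ_HANDLED if (found or moved) else IRQ_NONE


q = Queue()
q.pending = 1                 # device posts a completion and raises an IRQ
polled = q.poll()             # polling wins the race and reaps it first
first_irq = q.irq()           # handler finds no work, but the head has moved
second_irq = q.irq()          # a truly spurious interrupt is still reported
```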

Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-4.18/block · 81b1dab4

Committed by Jens Axboe on May 21, 2018

Pull NVMe changes from Keith:

"This is just the first nvme pull request for 4.18. There are several
fabrics and target patches that I missed, so there will be more to
come."

* 'nvme-4.18' of git://git.infradead.org/nvme:
  nvme-pci: drop IRQ disabling on submission queue lock
  nvme-pci: split the nvme queue lock into submission and completion locks
  nvme-pci: handle completions outside of the queue lock
  nvme-pci: move ->cq_vector == -1 check outside of ->q_lock
  nvme-pci: remove cq check after submission
  nvme-pci: simplify nvme_cqe_valid
  nvme: mark the result argument to nvme_complete_async_event volatile
  nvme/pci: Sync controller reset for AER slot_reset
  nvme/pci: Hold controller reference during async probe
  nvme: only reconfigure discard if necessary
  nvme/pci: Use async_schedule for initial reset work
  nvme: lightnvm: add granby support
  NVMe: Add Quirk Delay before CHK RDY for Seagate Nytro Flash Storage
  nvme: change order of qid and cmdid in completion trace
  nvme: fc: provide a descriptive error

81b1dab4

May 19, 2018 · 7 commits

nvme-pci: drop IRQ disabling on submission queue lock · 1eae349d

Committed by Jens Axboe on May 17, 2018

Since we aren't sharing the lock for completions now, we don't
have to make it IRQ safe.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>

1eae349d

nvme-pci: split the nvme queue lock into submission and completion locks · 1ab0cd69

Committed by Jens Axboe on May 17, 2018

This is now feasible. We protect the submission queue ring with
->sq_lock, and the completion side with ->cq_lock.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>

1ab0cd69

nvme-pci: handle completions outside of the queue lock · 5cb525c8

Committed by Jens Axboe on May 17, 2018

Split the completion of events into a two part process:

1) Reap the events inside the queue lock
2) Complete the events outside the queue lock

Since we never wrap the queue, we can access it locklessly after we've
updated the completion queue head. This patch started off with batching
events on the stack, but with this trick we don't have to. Keith Busch
<keith.busch@intel.com> came up with that idea.

Note that this kills the ->cqe_seen as well. I haven't been able to
trigger any ill effects of this. If we do race with polling every so
often, it should be rare enough NOT to trigger any issues.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <keith.busch@intel.com>
[hch: refactored, restored poll early exit optimization]
Signed-off-by: Christoph Hellwig <hch@lst.de>

5cb525c8
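
Illustration for 5cb525c8 (not part of the commit): a minimal Python sketch of the two-part completion process from the message, where entries are reaped under the lock and completed after it is dropped. The `CompletionQueue` class is hypothetical, and a growing list stands in for the hardware ring.

```python
import threading


class CompletionQueue:
    def __init__(self, entries):
        self.entries = entries          # posted completions, in order
        self.head = 0                   # next entry to reap
        self.lock = threading.Lock()

    def reap(self):
        """Part 1: under the lock, only claim the range [start, end)."""
        with self.lock:
            start, end = self.head, len(self.entries)
            self.head = end
        return start, end

    def complete(self, start, end, handler):
        """Part 2: run the completion handlers with the lock dropped."""
        for index in range(start, end):
            handler(self.entries[index])


cq = CompletionQueue(["cmd-a", "cmd-b"])
done = []
start, end = cq.reap()
# Record, for each entry, whether the lock was held while it completed.
cq.complete(start, end, lambda entry: done.append((entry, cq.lock.locked())))
```

Because the claimed range belongs to one caller once the head is updated, the handlers can read those entries without holding the lock.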

nvme-pci: move ->cq_vector == -1 check outside of ->q_lock · d1f06f4a

Committed by Jens Axboe on May 17, 2018

We only clear it dynamically in nvme_suspend_queue(). When we do, ensure
to do a full flush so that any nvme_queue_rq() invocation will see it.

Ideally we'd kill this check completely, but we're using it to flush
requests on a dying queue.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>

d1f06f4a

nvme-pci: remove cq check after submission · f9dde187

Committed by Jens Axboe on May 17, 2018

We always check the completion queue after submitting, but in my testing
this isn't a win even on DRAM/xpoint devices. In some cases it's
actually worse. Kill it.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

f9dde187

nvme-pci: simplify nvme_cqe_valid · 750dde44

Committed by Christoph Hellwig on May 18, 2018

We always look at the current CQ head and phase, so don't pass these
as separate arguments, and rename the function to nvme_cqe_pending.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

750dde44
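
Illustration for 750dde44 (not part of the commit): a small Python model of a "completion entry pending" check that looks only at the current head and phase. The `CQ` class is a hypothetical stand-in. It assumes the phase tag is bit 0 of each status word and that the expected phase flips on every wrap of the ring.

```python
class CQ:
    def __init__(self, depth):
        self.status = [0] * depth   # bit 0 of each status word is the phase tag
        self.head = 0
        self.phase = 1              # phase value that marks a new entry

    def cqe_pending(self):
        """True if the entry at the current head carries the expected phase."""
        return (self.status[self.head] & 1) == self.phase

    def advance(self):
        """Consume the head entry; flip the expected phase on wrap."""
        self.head += 1
        if self.head == len(self.status):
            self.head = 0
            self.phase ^= 1


cq = CQ(depth=2)
empty_at_start = cq.cqe_pending()
cq.status[0] = 1                    # device posts an entry in slot 0
first_seen = cq.cqe_pending()
cq.advance()
cq.status[1] = 1                    # device posts an entry in slot 1
cq.advance()                        # wrap: the expected phase is now 0
stale_ignored = not cq.cqe_pending()   # slot 0 still holds the old entry
```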

nvme: mark the result argument to nvme_complete_async_event volatile · 287a63eb

Committed by Christoph Hellwig on May 17, 2018

We'll need that in the PCIe driver soon as we'll read it straight off the
CQ.
Signed-off-by: Christoph Hellwig <hch@lst.de>

287a63eb

May 18, 2018 · 1 commit

blk-mq: clear hctx->dispatch_from when mappings change · d416c92c

Committed by huhai on May 18, 2018

When the number of hardware queues is changed, the drivers will call
blk_mq_update_nr_hw_queues() to remap hardware queues. This changes
the ctx mappings, but the current code doesn't clear the
->dispatch_from hint. This can result in dispatch_from pointing to
a ctx that isn't mapped to the hctx anymore.

Fixes: b347689f ("blk-mq-sched: improve dispatching from sw queue")
Signed-off-by: huhai <huhai@kylinos.cn>
Reviewed-by: Ming Lei <ming.lei@redhat.com>

Moved the placement of the clearing to where we clear other items
pertaining to the existing mapping, added Fixes line, and reworded
the commit message.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d416c92c

May 17, 2018 · 7 commits

nbd: call nbd_bdev_reset instead of bd_set_size on disconnect · 76aa1d34

Committed by Josef Bacik on May 16, 2018

We need to make sure we don't just set the size of the bdev to 0 while
it's being used by a file system.  We have the appropriate check in
nbd_bdev_reset, simply use that helper instead.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

76aa1d34

nbd: fix how we set bd_invalidated · fe1f9e66

Committed by Josef Bacik on May 16, 2018

bd_invalidated is kind of a pain wrt partitions as it really only
triggers the partition rescan if it is set after bd_ops->open() runs, so
setting it when we reset the device isn't useful. We also sporadically
would still have partitions left over in some disconnect cases, so fix
this by always setting bd_invalidated on open if there's no
configuration or if we've had a disconnect action happen, that way the
partition table gets invalidated and rescanned properly.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fe1f9e66

nbd: clear_sock on netlink disconnect · 96d97e17

Committed by Josef Bacik on May 16, 2018

This is what the ioctl based nbd disconnect does as well.  Without this
the device will just sit there and wait for the connection to go away
(or IO to occur) before the device gets torn down.  Instead clean
everything up on our end so the configuration goes away as quickly as
possible.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

96d97e17

nbd: use bd_set_size when updating disk size · 9e2b1967

Committed by Josef Bacik on May 16, 2018

When we stopped relying on the bdev everywhere I broke updating the
block device size on the fly, which ceph relies on.  We can't just do
set_capacity, we also have to do bd_set_size so things like parted will
notice the device size change.

Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9e2b1967

nbd: update size when connected · c3f7c939

Committed by Josef Bacik on May 16, 2018

I messed up changing the size of an NBD device while it was connected by
not actually updating the device or doing the uevent.  Fix this by
updating everything if we're connected and we change the size.

cc: stable@vger.kernel.org
Fixes: 639812a1 ("nbd: don't set the device size until we're connected")
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c3f7c939

nbd: fix nbd device deletion · 8364da47

Committed by Josef Bacik on May 16, 2018

This fixes a use-after-free bug: we shouldn't be accessing disk->queue right
after we do del_gendisk(disk).  Save the queue and do the cleanup after
the del_gendisk.

Fixes: c6a4759e ("nbd: add device refcounting")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8364da47

block: fix MAINTAINERS email for nbd · 3de9beee

Committed by Josef Bacik on May 16, 2018

I've been missing stuff because it's been going into my work email which
is a black hole.  Update to the email I actually use so I stop missing
patches and bug reports.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3de9beee

May 16, 2018 · 2 commits

blk-mq: remove redundant insert case in blk_mq_make_request() · 8fa9f556

Committed by huhai on May 16, 2018

We can use blk_mq_sched_insert_request() even if we don't have
an IO scheduler attached, since that case will end up being
exactly the same as what blk_mq_queue_io() was doing now.
Signed-off-by: huhai <huhai@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8fa9f556

Remove jsflash driver · da3c6efe

Committed by Jens Axboe on May 15, 2018

Nobody is using it anymore, and it's been abandoned. Since David
is fine with removing it, kill it.
Suggested-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

da3c6efe

May 15, 2018 · 5 commits

block: Add sysfs entry for fua support · 6fcefbe5

Committed by Kent Overstreet on May 8, 2018

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6fcefbe5

block: Export bio check/set pages_dirty · 1900fcc4

Committed by Kent Overstreet on May 8, 2018

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

1900fcc4

block: Add warning for bi_next not NULL in bio_endio() · 0ba99ca4

Committed by Kent Overstreet on May 8, 2018

Recently found a bug where a driver left bi_next not NULL and then
called bio_endio(), and then the submitter of the bio used
bio_copy_data() which was treating src and dst as lists of bios.

Fixed that bug by splitting out bio_list_copy_data(), but in case other
things are depending on bi_next in weird ways, add a warning to help
avoid more bugs like that in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0ba99ca4

block: Add missing flush_dcache_page() call · 6e6e811d

Committed by Kent Overstreet on May 8, 2018

Since a bio can point to userspace pages (e.g. direct IO), this is
generally necessary.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6e6e811d

block: Split out bio_list_copy_data() · 45db54d5

Committed by Kent Overstreet on May 8, 2018

Found a bug (with ASAN) where we were passing a bio to bio_copy_data()
with bi_next not NULL, when it should have been - a driver had left
bi_next set to something after calling bio_endio().

Since the normal case is only copying single bios, split out
bio_list_copy_data() to avoid more bugs like this in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

45db54d5
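
Illustration for 45db54d5 (not part of the commit): a toy Python sketch of the split between copying a single bio and copying along a chain. The `Bio` class is hypothetical and the function bodies only mirror the names in the message; the list variant here pairs bios one to one, which is a simplification of however the kernel walks the chains.

```python
class Bio:
    def __init__(self, data, bi_next=None):
        self.data = bytearray(data)
        self.bi_next = bi_next


def bio_copy_data(dst, src):
    """Copy one bio's payload; bi_next is deliberately ignored."""
    count = min(len(dst.data), len(src.data))
    dst.data[:count] = src.data[:count]


def bio_list_copy_data(dst, src):
    """Copy along two chains of bios, pairing them up in order."""
    while dst is not None and src is not None:
        bio_copy_data(dst, src)
        dst, src = dst.bi_next, src.bi_next


stale = Bio(b"XX")                       # leftover link a driver forgot to clear
src = Bio(b"ab", bi_next=stale)
dst = Bio(b"\0\0")
bio_copy_data(dst, src)                  # the stale bi_next cannot affect this copy
single_result = bytes(dst.data)
```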

openanolis / cloud-kernel · synced successfully over a year ago