1. 11 Feb 2021, 1 commit
  2. 26 Jan 2021, 1 commit
    • nbd: freeze the queue while we're adding connections · b98e762e
      By Josef Bacik
      When setting up a device, we can krealloc the config->socks array to add
      new sockets to the configuration.  However, if an IO request comes in at
      this point, even though we aren't set up yet, we could hit a UAF: we
      deref config->socks without any locking, assuming that the configuration
      was already set up and that ->socks is safe to access because we hold a
      reference on the configuration.
      
      But there's nothing really preventing IO from occurring at this point of
      device setup, and we don't want to incur the overhead of a lock to
      access ->socks when it will never change while the device is running.
      To fix this UAF scenario, simply freeze the queue while we are adding
      sockets.  This protects us from this particular case without adding any
      overhead to the normal running case.
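      The shape of the fix, as a simplified kernel-style sketch (condensed
      from the real nbd_add_socket; error handling and the surrounding config
      locking are elided, and the function name here is ours):

```c
/* Sketch: freeze the request queue so no in-flight IO can deref
 * config->socks while we krealloc it. Simplified from nbd_add_socket(). */
static int nbd_add_socket_sketch(struct nbd_device *nbd, struct socket *sock)
{
	struct nbd_config *config = nbd->config;
	struct nbd_sock **socks;

	/* Block new IO and wait for in-flight IO to drain. */
	blk_mq_freeze_queue(nbd->disk->queue);

	socks = krealloc(config->socks,
			 (config->num_connections + 1) * sizeof(*socks),
			 GFP_KERNEL);
	if (!socks) {
		blk_mq_unfreeze_queue(nbd->disk->queue);
		return -ENOMEM;
	}
	config->socks = socks;
	/* ... allocate and initialize the new nbd_sock entry here ... */

	blk_mq_unfreeze_queue(nbd->disk->queue);
	return 0;
}
```

      Freezing is a one-time cost at configuration time, so the hot IO path
      stays lock-free.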
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      b98e762e
  3. 17 Dec 2020, 1 commit
  4. 02 Dec 2020, 2 commits
  5. 16 Nov 2020, 5 commits
  6. 10 Nov 2020, 1 commit
  7. 29 Oct 2020, 1 commit
  8. 15 Oct 2020, 1 commit
  9. 03 Oct 2020, 1 commit
  10. 24 Sep 2020, 1 commit
  11. 02 Sep 2020, 2 commits
  12. 26 Aug 2020, 1 commit
    • nbd: restore default timeout when setting it to zero · acb19e17
      By Hou Pu
      Suppose we configure the io timeout of nbd0 to 100s. Later, after we
      have finished using it, we configure nbd0 again and set the io timeout
      to 0, expecting it to time out after 30 seconds and keep retrying. But
      in fact the timeout cannot be changed by setting it to 0; it remains
      the original 100s.
      
      So change the timeout to the default 30s when it is set to zero. This
      also matches the behavior of commit 2da22da5 ("nbd: fix zero
      cmd timeout handling v2").
      
      This matters more when reconfiguring an nbd device with the io timeout
      set to zero, because detecting the new socket then takes at most 30s,
      so io completes more quickly than it would with the old 100s timeout.
      Signed-off-by: Hou Pu <houpu@bytedance.com>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      acb19e17
  13. 09 Jul 2020, 1 commit
  14. 24 Jun 2020, 1 commit
  15. 12 Mar 2020, 2 commits
  16. 30 Jan 2020, 1 commit
    • nbd: add a flush_workqueue in nbd_start_device · 5c0dd228
      By Sun Ke
      When kzalloc fails, we may end up trying to destroy the workqueue from
      inside the workqueue.

      If num_connections is m (m > 2), kzalloc calls No.1 through No.n
      (1 < n < m) succeed, and No.(n + 1) fails, then nbd_start_device
      returns -ENOMEM to nbd_start_device_ioctl, and nbd_start_device_ioctl
      returns immediately without running flush_workqueue. However, we still
      have n recv threads. If nbd_release runs first, a recv thread may drop
      the last config_refs and try to destroy the workqueue from inside the
      workqueue.
      
      To fix it, add a flush_workqueue in nbd_start_device.
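      A simplified sketch of the fixed error path (the loop body is condensed
      and the real nbd_start_device differs in detail):

```c
/* Sketch: if a per-connection allocation fails partway through, shut
 * the sockets down and flush the recv workqueue before returning, so
 * the n already-queued recv works cannot drop the last config ref and
 * call destroy_workqueue() from inside the workqueue. */
for (i = 0; i < num_connections; i++) {
	struct recv_thread_args *args;

	args = kzalloc(sizeof(*args), GFP_KERNEL);
	if (!args) {
		sock_shutdown(nbd);
		/* Wait here for the recv works already queued to exit. */
		flush_workqueue(nbd->recv_workq);
		return -ENOMEM;
	}
	/* ... set up args and queue_work(nbd->recv_workq, &args->work) ... */
}
```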
      
      Fixes: e9e006f5 ("nbd: fix max number of supported devs")
      Signed-off-by: Sun Ke <sunke32@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      5c0dd228
  17. 17 Dec 2019, 1 commit
    • nbd: fix shutdown and recv work deadlock v2 · 1c05839a
      By Mike Christie
      This fixes a regression added with:
      
      commit e9e006f5
      Author: Mike Christie <mchristi@redhat.com>
      Date:   Sun Aug 4 14:10:06 2019 -0500
      
          nbd: fix max number of supported devs
      
      where we can deadlock during device shutdown. The problem occurs if
      recv_work's nbd_config_put runs after nbd_start_device_ioctl has
      returned and the userspace app has dropped its reference by closing
      the device and running nbd_release. The recv_work nbd_config_put call
      then drops the refcount to zero and tries to destroy the config, which
      tries to do destroy_workqueue from within the recv work.

      This patch just has nbd_start_device_ioctl do a flush_workqueue when it
      wakes, so we know running works have exited by the time the ioctl
      returns. This also fixes a possible race where we could try to reuse
      the device while old recv_works are still running.
      
      Cc: stable@vger.kernel.org
      Fixes: e9e006f5 ("nbd: fix max number of supported devs")
      Signed-off-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      1c05839a
  18. 22 Nov 2019, 1 commit
  19. 20 Nov 2019, 1 commit
  20. 26 Oct 2019, 3 commits
    • nbd: verify socket is supported during setup · cf1b2326
      By Mike Christie
      nbd requires socket families to support the shutdown method so the nbd
      recv workqueue can be woken up from its sock_recvmsg call. If the socket
      does not support the callout, we will leave recv works running or hit
      hangs later when the device or module is removed.
      
      This adds a check during socket connection/reconnection to make sure the
      socket being passed in supports the needed callout.
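      A simplified sketch of the added check (condensed from the helper the
      commit introduces; the exact error message and context are
      illustrative):

```c
/* Sketch: reject address families whose proto_ops lack a real shutdown
 * method, since nbd relies on kernel_sock_shutdown() to wake the recv
 * workqueue out of sock_recvmsg(). */
static struct socket *nbd_get_socket(struct nbd_device *nbd,
				     unsigned long fd, int *err)
{
	struct socket *sock;

	sock = sockfd_lookup(fd, err);
	if (!sock)
		return NULL;

	if (sock->ops->shutdown == sock_no_shutdown) {
		dev_err(disk_to_dev(nbd->disk),
			"Unsupported socket: shutdown callout must be supported.\n");
		*err = -EINVAL;
		sockfd_put(sock);
		return NULL;
	}
	return sock;
}
```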
      
      Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
      Fixes: e9e006f5 ("nbd: fix max number of supported devs")
      Tested-by: Richard W.M. Jones <rjones@redhat.com>
      Signed-off-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      cf1b2326
    • nbd: handle racing with error'ed out commands · 7ce23e8e
      By Josef Bacik
      We hit the following warning in production
      
      print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700
      ------------[ cut here ]------------
      refcount_t: underflow; use-after-free.
      WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
      Workqueue: knbd-recv recv_work [nbd]
      RIP: 0010:refcount_sub_and_test_checked+0x53/0x60
      Call Trace:
       blk_mq_free_request+0xb7/0xf0
       blk_mq_complete_request+0x62/0xf0
       recv_work+0x29/0xa1 [nbd]
       process_one_work+0x1f5/0x3f0
       worker_thread+0x2d/0x3d0
       ? rescuer_thread+0x340/0x340
       kthread+0x111/0x130
       ? kthread_create_on_node+0x60/0x60
       ret_from_fork+0x1f/0x30
      ---[ end trace b079c3c67f98bb7c ]---
      
      This was preceded by us timing out everything and shutting down the
      sockets for the device. The problem is we had a request in the queue at
      the same time, so we completed the request twice. This can actually
      happen in a lot of cases: we fail to get a ref on our config, we only
      have one connection and just error out the command, and so on.
      
      Fix this by checking cmd->status in nbd_read_stat. We only change this
      under cmd->lock, so it is safe to check here whether we've already
      error'ed this command out, which would indicate that we've completed
      it as well.
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      7ce23e8e
    • nbd: protect cmd->status with cmd->lock · de6346ec
      By Josef Bacik
      We already do this for the most part, except in timeout and clear_req.
      For the timeout case we take the lock after we grab a ref on the config,
      but that isn't really necessary because we're safe to touch the cmd at
      this point, so just reorder things.

      For the clear_req case, this is initiated by the user, so again it is
      safe.
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      de6346ec
  21. 10 Oct 2019, 1 commit
  22. 18 Sep 2019, 2 commits
    • nbd: fix possible page fault for nbd disk · 8454d685
      By Xiubo Li
      When the NBD_CFLAG_DESTROY_ON_DISCONNECT flag is set and the socket is
      closed at the same time because the server daemon has restarted,
      starting a new connection that reuses the old nbd_index just before the
      last DISCONNECT has fully completed will crash randomly, like:
      
      <3>[  110.151949] block nbd1: Receive control failed (result -32)
      <1>[  110.152024] BUG: unable to handle page fault for address: 0000058000000840
      <1>[  110.152063] #PF: supervisor read access in kernel mode
      <1>[  110.152083] #PF: error_code(0x0000) - not-present page
      <6>[  110.152094] PGD 0 P4D 0
      <4>[  110.152106] Oops: 0000 [#1] SMP PTI
      <4>[  110.152120] CPU: 0 PID: 6698 Comm: kworker/u5:1 Kdump: loaded Not tainted 5.3.0-rc4+ #2
      <4>[  110.152136] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      <4>[  110.152166] Workqueue: knbd-recv recv_work [nbd]
      <4>[  110.152187] RIP: 0010:__dev_printk+0xd/0x67
      <4>[  110.152206] Code: 10 e8 c5 fd ff ff 48 8b 4c 24 18 65 48 33 0c 25 28 00 [...]
      <4>[  110.152244] RSP: 0018:ffffa41581f13d18 EFLAGS: 00010206
      <4>[  110.152256] RAX: ffffa41581f13d30 RBX: ffff96dd7374e900 RCX: 0000000000000000
      <4>[  110.152271] RDX: ffffa41581f13d20 RSI: 00000580000007f0 RDI: ffffffff970ec24f
      <4>[  110.152285] RBP: ffffa41581f13d80 R08: ffff96dd7fc17908 R09: 0000000000002e56
      <4>[  110.152299] R10: ffffffff970ec24f R11: 0000000000000003 R12: ffff96dd7374e900
      <4>[  110.152313] R13: 0000000000000000 R14: ffff96dd7374e9d8 R15: ffff96dd6e3b02c8
      <4>[  110.152329] FS:  0000000000000000(0000) GS:ffff96dd7fc00000(0000) knlGS:0000000000000000
      <4>[  110.152362] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[  110.152383] CR2: 0000058000000840 CR3: 0000000067cc6002 CR4: 00000000001606f0
      <4>[  110.152401] Call Trace:
      <4>[  110.152422]  _dev_err+0x6c/0x83
      <4>[  110.152435]  nbd_read_stat.cold+0xda/0x578 [nbd]
      <4>[  110.152448]  ? __switch_to_asm+0x34/0x70
      <4>[  110.152468]  ? __switch_to_asm+0x40/0x70
      <4>[  110.152478]  ? __switch_to_asm+0x34/0x70
      <4>[  110.152491]  ? __switch_to_asm+0x40/0x70
      <4>[  110.152501]  ? __switch_to_asm+0x34/0x70
      <4>[  110.152511]  ? __switch_to_asm+0x40/0x70
      <4>[  110.152522]  ? __switch_to_asm+0x34/0x70
      <4>[  110.152533]  recv_work+0x35/0x9e [nbd]
      <4>[  110.152547]  process_one_work+0x19d/0x340
      <4>[  110.152558]  worker_thread+0x50/0x3b0
      <4>[  110.152568]  kthread+0xfb/0x130
      <4>[  110.152577]  ? process_one_work+0x340/0x340
      <4>[  110.152609]  ? kthread_park+0x80/0x80
      <4>[  110.152637]  ret_from_fork+0x35/0x40
      
      This is very easy to reproduce by running nbd-runner.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Xiubo Li <xiubli@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      8454d685
    • nbd: rename the runtime flags as NBD_RT_ prefixed · ec76a7b9
      By Xiubo Li
      Preparation for fixing the destroy-on-disconnect crash.
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Xiubo Li <xiubli@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      ec76a7b9
  23. 21 Aug 2019, 5 commits
  24. 31 Jul 2019, 1 commit
    • nbd: replace kill_bdev() with __invalidate_device() again · 2b5c8f00
      By Munehisa Kamata
      Commit abbbdf12 ("replace kill_bdev() with __invalidate_device()")
      once did this, but 29eaadc0 ("nbd: stop using the bdev everywhere")
      resurrected kill_bdev(), and it has been there since. So buffer_head
      mappings still get killed on a server disconnection, and we can still
      hit the BUG_ON on a filesystem on top of the nbd device.
      
        EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
        block nbd0: Receive control failed (result -32)
        block nbd0: shutting down sockets
        print_req_error: I/O error, dev nbd0, sector 66264 flags 3000
        EXT4-fs warning (device nbd0): htree_dirblock_to_tree:979: inode #2: lblock 0: comm ls: error -5 reading directory block
        print_req_error: I/O error, dev nbd0, sector 2264 flags 3000
        EXT4-fs error (device nbd0): __ext4_get_inode_loc:4690: inode #2: block 283: comm ls: unable to read itable block
        EXT4-fs error (device nbd0) in ext4_reserve_inode_write:5894: IO failure
        ------------[ cut here ]------------
        kernel BUG at fs/buffer.c:3057!
        invalid opcode: 0000 [#1] SMP PTI
        CPU: 7 PID: 40045 Comm: jbd2/nbd0-8 Not tainted 5.1.0-rc3+ #4
        Hardware name: Amazon EC2 m5.12xlarge/, BIOS 1.0 10/16/2017
        RIP: 0010:submit_bh_wbc+0x18b/0x190
        ...
        Call Trace:
         jbd2_write_superblock+0xf1/0x230 [jbd2]
         ? account_entity_enqueue+0xc5/0xf0
         jbd2_journal_update_sb_log_tail+0x94/0xe0 [jbd2]
         jbd2_journal_commit_transaction+0x12f/0x1d20 [jbd2]
         ? __switch_to_asm+0x40/0x70
         ...
         ? lock_timer_base+0x67/0x80
         kjournald2+0x121/0x360 [jbd2]
         ? remove_wait_queue+0x60/0x60
         kthread+0xf8/0x130
         ? commit_timeout+0x10/0x10 [jbd2]
         ? kthread_bind+0x10/0x10
         ret_from_fork+0x35/0x40
      
      With __invalidate_device(), I no longer hit the BUG_ON with sync or
      unmount on the disconnected device.
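      The change itself is essentially a one-liner at the disconnect path;
      a hedged sketch of the affected spot (surrounding context elided):

```c
/* Before: buffer_head mappings were torn down on disconnect. */
/* kill_bdev(bdev); */

/* After: invalidate cached pages instead; the second argument 'true'
 * also discards dirty buffers, but the mapping stays usable, so jbd2's
 * later superblock writes no longer hit the BUG_ON in submit_bh_wbc(). */
__invalidate_device(bdev, true);
```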
      
      Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
      Cc: linux-block@vger.kernel.org
      Cc: Ratna Manoj Bolla <manoj.br@gmail.com>
      Cc: nbd@other.debian.org
      Cc: stable@vger.kernel.org
      Cc: David Woodhouse <dwmw@amazon.com>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      2b5c8f00
  25. 11 Jul 2019, 2 commits