提交 · 1f032ca298dda0cc16eb121bce900f15373f346e · openanolis / cloud-kernel

10 11月, 2019 2 次提交

nbd: handle racing with error'ed out commands · 1f032ca2

由 Josef Bacik 提交于 10月 21, 2019

[ Upstream commit 7ce23e8e0a9cd38338fc8316ac5772666b565ca9 ]

We hit the following warning in production

print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
Workqueue: knbd-recv recv_work [nbd]
RIP: 0010:refcount_sub_and_test_checked+0x53/0x60
Call Trace:
 blk_mq_free_request+0xb7/0xf0
 blk_mq_complete_request+0x62/0xf0
 recv_work+0x29/0xa1 [nbd]
 process_one_work+0x1f5/0x3f0
 worker_thread+0x2d/0x3d0
 ? rescuer_thread+0x340/0x340
 kthread+0x111/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x1f/0x30
---[ end trace b079c3c67f98bb7c ]---

This was preceded by us timing out everything and shutting down the
sockets for the device.  The problem is we had a request in the queue at
the same time, so we completed the request twice.  This can actually
happen in a lot of cases, we fail to get a ref on our config, we only
have one connection and just error out the command, etc.

Fix this by checking cmd->status in nbd_read_stat.  We only change this
under the cmd->lock, so we are safe to check this here and see if we've
already error'ed this command out, which would indicate that we've
completed it as well.
Reviewed-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NSasha Levin <sashal@kernel.org>

1f032ca2

nbd: protect cmd->status with cmd->lock · 82b7c99e

由 Josef Bacik 提交于 10月 21, 2019

[ Upstream commit de6346ecbc8f5591ebd6c44ac164e8b8671d71d7 ]

We already do this for the most part, except in timeout and clear_req.
For the timeout case we take the lock after we grab a ref on the config,
but that isn't really necessary because we're safe to touch the cmd at
this point, so just move the order around.

For the clear_req cause this is initiated by the user, so again is safe.
Reviewed-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NSasha Levin <sashal@kernel.org>

82b7c99e

06 11月, 2019 2 次提交

nbd: verify socket is supported during setup · 08332245

由 Mike Christie 提交于 10月 17, 2019

[ Upstream commit cf1b2326b734896734c6e167e41766f9cee7686a ]

nbd requires socket families to support the shutdown method so the nbd
recv workqueue can be woken up from its sock_recvmsg call. If the socket
does not support the callout we will leave recv works running or get hangs
later when the device or module is removed.

This adds a check during socket connection/reconnection to make sure the
socket being passed in supports the needed callout.

Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Tested-by: NRichard W.M. Jones <rjones@redhat.com>
Signed-off-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NSasha Levin <sashal@kernel.org>

08332245

nbd: fix possible sysfs duplicate warning · cad4448d

由 Xiubo Li 提交于 9月 19, 2019

[ Upstream commit 862488105b84ca744b3d8ff131e0fcfe10644be1 ]

1. nbd_put takes the mutex and drops nbd->ref to 0. It then does
idr_remove and drops the mutex.

2. nbd_genl_connect takes the mutex. idr_find/idr_for_each fails
to find an existing device, so it does nbd_dev_add.

3. just before the nbd_put could call nbd_dev_remove or not finished
totally, but if nbd_dev_add try to add_disk, we can hit:

debugfs: Directory 'nbd1' with parent 'block' already present!

This patch will make sure all the disk add/remove stuff are done
by holding the nbd_index_mutex lock.
Reported-by: NMike Christie <mchristi@redhat.com>
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NXiubo Li <xiubli@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NSasha Levin <sashal@kernel.org>

cad4448d

12 10月, 2019 2 次提交

nbd: fix crash when the blksize is zero · c688982f

由 Xiubo Li 提交于 5月 29, 2019

[ Upstream commit 553768d1169a48c0cd87c4eb4ab57534ee663415 ]

This will allow the blksize to be set zero and then use 1024 as
default.
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NXiubo Li <xiubli@redhat.com>
[fix to use goto out instead of return in genl_connect]
Signed-off-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NSasha Levin <sashal@kernel.org>

c688982f

nbd: fix max number of supported devs · 9f0f39c9

由 Mike Christie 提交于 8月 04, 2019

commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4 upstream.

This fixes a bug added in 4.10 with commit:

commit 9561a7ad
Author: Josef Bacik <jbacik@fb.com>
Date:   Tue Nov 22 14:04:40 2016 -0500

    nbd: add multi-connection support

that limited the number of devices to 256. Before the patch we could
create 1000s of devices, but the patch switched us from using our
own thread to using a work queue which has a default limit of 256
active works.

The problem is that our recv_work function sits in a loop until
disconnection but only handles IO for one connection. The work is
started when the connection is started/restarted, but if we end up
creating 257 or more connections, the queue_work call just queues
connection257+'s recv_work and that waits for connection 1 - 256's
recv_work to be disconnected and that work instance completing.

Instead of reverting back to kthreads, this has us allocate a
workqueue_struct per device, so we can block in the work.

Cc: stable@vger.kernel.org
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

9f0f39c9

05 10月, 2019 1 次提交

nbd: add missing config put · d093d318

由 Mike Christie 提交于 8月 13, 2019

[ Upstream commit 887e975c4172d0d5670c39ead2f18ba1e4ec8133 ]

Fix bug added with the patch:

commit 8f3ea359
Author: Josef Bacik <josef@toxicpanda.com>
Date:   Mon Jul 16 12:11:35 2018 -0400

    nbd: handle unexpected replies better

where if the timeout handler runs when the completion path is and we fail
to grab the mutex in the timeout handler we will leave a config reference
and cannot free the config later.
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NMike Christie <mchristi@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NSasha Levin <sashal@kernel.org>

d093d318

07 8月, 2019 1 次提交

nbd: replace kill_bdev() with __invalidate_device() again · eb828241

由 Munehisa Kamata 提交于 7月 31, 2019

commit 2b5c8f0063e4b263cf2de82029798183cf85c320 upstream.

Commit abbbdf12 ("replace kill_bdev() with __invalidate_device()")
once did this, but 29eaadc0 ("nbd: stop using the bdev everywhere")
resurrected kill_bdev() and it has been there since then. So buffer_head
mappings still get killed on a server disconnection, and we can still
hit the BUG_ON on a filesystem on the top of the nbd device.

  EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
  block nbd0: Receive control failed (result -32)
  block nbd0: shutting down sockets
  print_req_error: I/O error, dev nbd0, sector 66264 flags 3000
  EXT4-fs warning (device nbd0): htree_dirblock_to_tree:979: inode #2: lblock 0: comm ls: error -5 reading directory block
  print_req_error: I/O error, dev nbd0, sector 2264 flags 3000
  EXT4-fs error (device nbd0): __ext4_get_inode_loc:4690: inode #2: block 283: comm ls: unable to read itable block
  EXT4-fs error (device nbd0) in ext4_reserve_inode_write:5894: IO failure
  ------------[ cut here ]------------
  kernel BUG at fs/buffer.c:3057!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 7 PID: 40045 Comm: jbd2/nbd0-8 Not tainted 5.1.0-rc3+ #4
  Hardware name: Amazon EC2 m5.12xlarge/, BIOS 1.0 10/16/2017
  RIP: 0010:submit_bh_wbc+0x18b/0x190
  ...
  Call Trace:
   jbd2_write_superblock+0xf1/0x230 [jbd2]
   ? account_entity_enqueue+0xc5/0xf0
   jbd2_journal_update_sb_log_tail+0x94/0xe0 [jbd2]
   jbd2_journal_commit_transaction+0x12f/0x1d20 [jbd2]
   ? __switch_to_asm+0x40/0x70
   ...
   ? lock_timer_base+0x67/0x80
   kjournald2+0x121/0x360 [jbd2]
   ? remove_wait_queue+0x60/0x60
   kthread+0xf8/0x130
   ? commit_timeout+0x10/0x10 [jbd2]
   ? kthread_bind+0x10/0x10
   ret_from_fork+0x35/0x40

With __invalidate_device(), I no longer hit the BUG_ON with sync or
unmount on the disconnected device.

Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
Cc: linux-block@vger.kernel.org
Cc: Ratna Manoj Bolla <manoj.br@gmail.com>
Cc: nbd@other.debian.org
Cc: stable@vger.kernel.org
Cc: David Woodhouse <dwmw@amazon.com>
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NMunehisa Kamata <kamatam@amazon.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

eb828241

23 1月, 2019 1 次提交

nbd: Use set_blocksize() to set device blocksize · 9a9c3c02

由 Jan Kara 提交于 1月 14, 2019

commit c8a83a6b54d0ca078de036aafb3f6af58c1dc5eb upstream.

NBD can update block device block size implicitely through
bd_set_size(). Make it explicitely set blocksize with set_blocksize() as
this behavior of bd_set_size() is going away.

CC: Josef Bacik <jbacik@fb.com>
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@kernel.dk>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

9a9c3c02

05 9月, 2018 1 次提交

nbd: don't allow invalid blocksize settings · bc811f05

由 Jens Axboe 提交于 9月 04, 2018

syzbot reports a divide-by-zero off the NBD_SET_BLKSIZE ioctl.
We need proper validation of the input here. Not just if it's
zero, but also if the value is a power-of-2 and in a valid
range. Add that.

Cc: stable@vger.kernel.org
Reported-by: Nsyzbot <syzbot+25dbecbec1e62c6b0dd4@syzkaller.appspotmail.com>
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

bc811f05

21 7月, 2018 1 次提交

nbd: constify nla_policy · a86c4120

由 Stephen Hemminger 提交于 7月 18, 2018

The netlink policy should be const like other drivers.
Signed-off-by: NStephen Hemminger <stephen@networkplumber.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a86c4120

17 7月, 2018 2 次提交

nbd: handle unexpected replies better · 8f3ea359

由 Josef Bacik 提交于 7月 16, 2018

If the server or network is misbehaving and we get an unexpected reply
we can sometimes miss the request not being started and wait on a
request and never get a response, or even double complete the same
request. Fix this by replacing the send_complete completion with just a
per command lock. Add a per command cookie as well so that we can know
if we're getting a double completion for a previous event. Also check
to make sure we dont have REQUEUED set as that means we raced with the
timeout handler and need to just let the retry occur.
Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

8f3ea359

nbd: don't requeue the same request twice. · d7d94d48

由 Josef Bacik 提交于 7月 16, 2018

We can race with the snd timeout and the per-request timeout and end up
requeuing the same request twice. We can't use the send_complete
completion to tell if everything is ok because we hold the tx_lock
during send, so the timeout stuff will block waiting to mark the socket
dead, and we could be marked complete and still requeue. Instead add a
flag to the socket so we know whether we've been requeued yet.
Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d7d94d48

21 6月, 2018 1 次提交

nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag. · 08ba91ee

由 Doron Roberts-Kedes 提交于 6月 15, 2018

If NBD_DISCONNECT_ON_CLOSE is set on a device, then the driver will
issue a disconnect from nbd_release if the device has no remaining
bdev->bd_openers.

Fix ret val so reconfigure with only setting the flag succeeds.
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NDoron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

08ba91ee

05 6月, 2018 2 次提交

nbd: set discard_alignment to the granularity · 07ce213f

由 Josef Bacik 提交于 6月 05, 2018

Technically we should be able to get away with 0 as the
discard_alignment, but there's no way currently for the protocol to
indicate different alignments, and in real life most disks have
discard_alignment == discard_granularity.  Just set our alignment to our
blocksize to make sure discards will actually work properly with 4k
drives.
Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

07ce213f

nbd: Consistently use request pointer in debug messages. · ee57a05c

由 Kevin Vigor 提交于 6月 04, 2018

Existing dev_dbg messages sometimes identify request using request
pointer, sometimes using nbd_cmd pointer. This makes it hard to
follow request flow. Consistently use request pointer instead.
Reviewed-by: NJosef Bacik <jbacik@toxicpanda.com>
Signed-off-by: NKevin Vigor <kvigor@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ee57a05c

31 5月, 2018 2 次提交

blk-mq: only iterate over inflight requests in blk_mq_tagset_busy_iter · d250bf4e

由 Christoph Hellwig 提交于 5月 30, 2018

We already check for started commands in all callbacks, but we should
also protect against already completed commands.  Do this by taking
the checks to common code.
Acked-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d250bf4e

nbd: clear DISCONNECT_REQUESTED flag once disconnection occurs. · 5e3c3a7e

由 Kevin Vigor 提交于 5月 30, 2018

When a userspace client requests a NBD device be disconnected, the
DISCONNECT_REQUESTED flag is set. While this flag is set, the driver
will not inform userspace when a connection is closed.

Unfortunately the flag was never cleared, so once a disconnect was
requested the driver would thereafter never tell userspace about a
closed connection. Thus when connections failed due to timeout, no
attempt to reconnect was made and eventually the device would fail.

Fix by clearing the DISCONNECT_REQUESTED flag (and setting the
DISCONNECTED flag) once all connections are closed.
Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
Signed-off-by: NKevin Vigor <kvigor@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

5e3c3a7e

29 5月, 2018 2 次提交

nbd: complete requests from ->timeout · e5eab017

由 Christoph Hellwig 提交于 5月 29, 2018

By completing the request entirely in the driver we can remove the
BLK_EH_HANDLED return value and thus the split responsibility between the
driver and the block layer that has been causing trouble.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

e5eab017

block: rename BLK_EH_NOT_HANDLED to BLK_EH_DONE · 6600593c

由 Christoph Hellwig 提交于 5月 29, 2018

The BLK_EH_NOT_HANDLED implies nothing happen, but very often that
is not what is happening - instead the driver already completed the
command.  Fix the symbolic name to reflect that a little better.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6600593c

25 5月, 2018 1 次提交

block drivers/block: Use octal not symbolic permissions · 5657a819

由 Joe Perches 提交于 5月 24, 2018

Convert the S_<FOO> symbolic permissions to their octal equivalents as
using octal and not symbolic permissions is preferred by many as more
readable.

see: https://lkml.org/lkml/2016/8/2/1945

Done with automated conversion via:
$ ./scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace <files...>

Miscellanea:

o Wrapped modified multi-line calls to a single line where appropriate
o Realign modified multi-line calls to open parenthesis
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

5657a819

24 5月, 2018 1 次提交

nbd: set discard granularity properly · 6df133a1

由 Josef Bacik 提交于 5月 23, 2018

For some reason we had discard granularity set to 512 always even when
discards were disabled.  Fix this by having the default be 0, and then
if we turn it on set the discard granularity to the blocksize.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6df133a1

23 5月, 2018 1 次提交

block/ndb: add WQ_UNBOUND to the knbd-recv workqueue · 2189c97c

由 Dan Melnic 提交于 9月 18, 2017

Add WQ_UNBOUND to the knbd-recv workqueue so we're not bound
to a single CPU that is selected at device creation time.
Signed-off-by: NDan Melnic <dmm@fb.com>
Reviewed-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

2189c97c

17 5月, 2018 6 次提交

nbd: call nbd_bdev_reset instead of bd_set_size on disconnect · 76aa1d34

由 Josef Bacik 提交于 5月 16, 2018

We need to make sure we don't just set the size of the bdev to 0 while
it's being used by a file system.  We have the appropriate check in
nbd_bdev_reset, simply use that helper instead.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

76aa1d34

nbd: fix how we set bd_invalidated · fe1f9e66

由 Josef Bacik 提交于 5月 16, 2018

bd_invalidated is kind of a pain wrt partitions as it really only
triggers the partition rescan if it is set after bd_ops->open() runs, so
setting it when we reset the device isn't useful. We also sporadically
would still have partitions left over in some disconnect cases, so fix
this by always setting bd_invalidated on open if there's no
configuration or if we've had a disconnect action happen, that way the
partition table gets invalidated and rescanned properly.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

fe1f9e66

nbd: clear_sock on netlink disconnect · 96d97e17

由 Josef Bacik 提交于 5月 16, 2018

This is what the ioctl based nbd disconnect does as well.  Without this
the device will just sit there and wait for the connection to go away
(or IO to occur) before the device gets torn down.  Instead clear
everything up on our end so the configuration goes away as quickly as
possible.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

96d97e17

nbd: use bd_set_size when updating disk size · 9e2b1967

由 Josef Bacik 提交于 5月 16, 2018

When we stopped relying on the bdev everywhere I broke updating the
block device size on the fly, which ceph relies on.  We can't just do
set_capacity, we also have to do bd_set_size so things like parted will
notice the device size change.

Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
cc: stable@vger.kernel.org
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9e2b1967

nbd: update size when connected · c3f7c939

由 Josef Bacik 提交于 5月 16, 2018

I messed up changing the size of an NBD device while it was connected by
not actually updating the device or doing the uevent.  Fix this by
updating everything if we're connected and we change the size.

cc: stable@vger.kernel.org
Fixes: 639812a1 ("nbd: don't set the device size until we're connected")
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

c3f7c939

nbd: fix nbd device deletion · 8364da47

由 Josef Bacik 提交于 5月 16, 2018

This fixes a use after free bug, we shouldn't be doing disk->queue right
after we do del_gendisk(disk).  Save the queue and do the cleanup after
the del_gendisk.

Fixes: c6a4759e ("nbd: add device refcounting")
cc: stable@vger.kernel.org
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

8364da47

09 3月, 2018 1 次提交

block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() · 8b904b5b

由 Bart Van Assche 提交于 3月 07, 2018

This patch has been generated as follows:

for verb in set_unlocked clear_unlocked set clear; do
  replace-in-files queue_flag_${verb} blk_queue_flag_${verb%_unlocked} \
    $(git grep -lw queue_flag_${verb} drivers block/bsg*)
done

Except for protecting all queue flag changes with the queue lock
this patch does not change any functionality.

Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

8b904b5b

28 2月, 2018 1 次提交

nbd: fix return value in error handling path · 0979962f

由 Gustavo A. R. Silva 提交于 2月 12, 2018

It seems that the proper value to return in this particular case is the
one contained into variable new_index instead of ret.

Addresses-Coverity-ID: 1465148 ("Copy-paste error")
Fixes: e46c7287 ("nbd: add a basic netlink interface")
Reviewed-by: NOmar Sandoval <osandov@fb.com>
Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0979962f

07 11月, 2017 2 次提交

nbd: don't start req until after the dead connection logic · 6a468d59

由 Josef Bacik 提交于 11月 06, 2017

We can end up sleeping for a while waiting for the dead timeout, which
means we could get the per request timer to fire.  We did handle this
case, but if the dead timeout happened right after we submitted we'd
either tear down the connection or possibly requeue as we're handling an
error and race with the endio which can lead to panics and other
hilarity.

Fixes: 560bc4b3 ("nbd: handle dead connections")
Cc: stable@vger.kernel.org
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6a468d59

nbd: wait uninterruptible for the dead timeout · ff57dc94

由 Josef Bacik 提交于 11月 06, 2017

If we have a pending signal or the user kills their application then
it'll bring down the whole device, which is less than awesome.  Instead
wait uninterruptible for the dead timeout so we're sure we gave it our
best shot.

Fixes: 560bc4b3 ("nbd: handle dead connections")
Cc: stable@vger.kernel.org
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ff57dc94

25 10月, 2017 1 次提交

nbd: handle interrupted sendmsg with a sndtimeo set · 32e67a3a

由 Josef Bacik 提交于 10月 24, 2017

If you do not set sk_sndtimeo you will get -ERESTARTSYS if there is a
pending signal when you enter sendmsg, which we handle properly.
However if you set a timeout for your commands we'll set sk_sndtimeo to
that timeout, which means that sendmsg will start returning -EINTR
instead of -ERESTARTSYS.  Fix this by checking either cases and doing
the correct thing.

Cc: stable@vger.kernel.org
Fixes: dc88e34d ("nbd: set sk->sk_sndtimeo for our sockets")
Reported-and-tested-by: NDaniel Xu <dlxu@fb.com>
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

32e67a3a

10 10月, 2017 1 次提交

nbd: don't set the device size until we're connected · 639812a1

由 Josef Bacik 提交于 10月 09, 2017

A user reported a regression with using the normal ioctl interface on
newer kernels.  This happens because I was setting the device size
before the device was actually connected, which caused us to error out
and close everything down.  This didn't happen on netlink because we
hold the device lock the whole time we're setting things up, but we
don't do that for the ioctl path.  This fixes the problem.

Cc: stable@vger.kernel.org
Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

639812a1

03 10月, 2017 1 次提交

nbd: fix -ERESTARTSYS handling · 6e60a3bb

由 Josef Bacik 提交于 10月 02, 2017

Christoph made it so that if we return'ed BLK_STS_RESOURCE whenever we
got ERESTARTSYS from sending our packets we'd return BLK_STS_OK, which
means we'd never requeue and just hang.  We really need to return the
right value from the upper layer.

Fixes: fc17b653 ("blk-mq: switch ->queue_rq return value to blk_status_t")
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6e60a3bb

25 9月, 2017 1 次提交

nbd: ignore non-nbd ioctl's · 1dae69be

由 Josef Bacik 提交于 5月 05, 2017

In testing we noticed that nbd would spew if you ran a fio job against
the raw device itself.  This is because fio calls a block device
specific ioctl, however the block layer will first pass this back to the
driver ioctl handler in case the driver wants to do something special.
Since the device was setup using netlink this caused us to spew every
time fio called this ioctl.  Since we don't have special handling, just
error out for any non-nbd specific ioctl's that come in.  This fixes the
spew.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

1dae69be

29 8月, 2017 1 次提交

nbd: make device_attribute const · dfbde552

由 Bhumika Goyal 提交于 8月 21, 2017

Make this const as is is only passed as an argument to the
function device_create_file and device_remove_file and the corresponding
arguments are of type const.
Done using Coccinelle
Signed-off-by: NBhumika Goyal <bhumirks@gmail.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

dfbde552

18 8月, 2017 2 次提交

nbd: change the default nbd partitions · 7a8362a0

由 Josef Bacik 提交于 8月 14, 2017

There's no reason to have partitions disabled for nbd by default, it costs us
nothing to have it enabled and is just confusing/obnoxious to users who try to
use partitions with nbd.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

7a8362a0

nbd: allow device creation at a specific index · e6a76272

由 Josef Bacik 提交于 8月 14, 2017

If users really want to use a particular index for their nbd device and it
doesn't already exist there's no reason we can't just create it for them.  Do
this instead of erroring out.
Signed-off-by: NJosef Bacik <jbacik@fb.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

e6a76272

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功