Commits · 1cc1f17aab7e596d0a7373fc1ed11dbddfa82bc9 · openanolis / cloud-kernel

21 Apr, 2017 3 commits

nbd: set the max segments to USHRT_MAX · 1cc1f17a

Committed by Josef Bacik on Apr 20, 2017

I lack the basic understanding of what segments mean, so we were being
limited to 512kib requests even with higher max_sectors sizes set.
Setting the maximum number of segments to unlimited allows us to
actually have arbitrarily large IO's go through NBD.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

blk-mq: remove the error argument to blk_mq_complete_request · 08e0029a

Committed by Christoph Hellwig on Apr 20, 2017

Now that all drivers that call blk_mq_complete_requests have a
->complete callback we can remove the direct call to blk_mq_end_request,
as well as the error argument to blk_mq_complete_request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: don't use req->errors · 1e388ae0

Committed by Christoph Hellwig on Apr 20, 2017

Add a nbd-specific field instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

19 Apr, 2017 1 commit

nbd: set the max segment size to UINT_MAX · ebb16d0d

Committed by Josef Bacik on Apr 18, 2017

NBD doesn't care about limiting the segment size, let the user push the
largest bio's they want.  This allows us to control the request size
solely through max_sectors_kb.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

17 Apr, 2017 12 commits

nbd: add a flag to destroy an nbd device on disconnect · a2c97909

Committed by Josef Bacik on Apr 06, 2017

For ease of management it would be nice for users to specify that the
device node for a nbd device is destroyed once it is disconnected and
there are no more users.  Add a client flag and enable this operation to
happen.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: add device refcounting · c6a4759e

Committed by Josef Bacik on Apr 06, 2017

In order to support deleting the device on disconnect we need to
refcount the actual nbd_device struct.  So add the refcounting framework
and change how we free the normal devices at rmmod time so we can catch
reference leaks.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

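The get/put pattern this commit message describes can be sketched in userspace as follows. This is only an illustration, assuming C11 atomics in place of the kernel's own reference-counting helpers; the names `demo_dev`, `demo_init`, `demo_get`, and `demo_put` are hypothetical and are not taken from nbd.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted device structure. */
struct demo_dev {
    atomic_int refs;   /* number of live references */
    bool freed;        /* set when the last reference is dropped */
};

static void demo_init(struct demo_dev *dev)
{
    atomic_init(&dev->refs, 1);   /* the creator holds the first reference */
    dev->freed = false;
}

/* Take a reference only if the object is still alive. */
static bool demo_get(struct demo_dev *dev)
{
    int old = atomic_load(&dev->refs);

    while (old > 0) {
        /* On failure, `old` is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&dev->refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool demo_put(struct demo_dev *dev)
{
    int old = atomic_fetch_sub(&dev->refs, 1);

    assert(old > 0);              /* an unbalanced put is a bug */
    if (old == 1) {
        dev->freed = true;        /* real code would release the structure here */
        return true;
    }
    return false;
}
```

With a scheme like this, a count that has not dropped to zero at teardown time points to a leaked reference, which is the kind of problem the message says it wants to catch at rmmod.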

nbd: add a status netlink command · 47d902b9

Committed by Josef Bacik on Apr 06, 2017

Allow users to query the status of existing nbd devices.  Right now this
only returns whether or not the device is connected, but could be
extended in the future to include more information.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: handle dead connections · 560bc4b3

Committed by Josef Bacik on Apr 06, 2017

Sometimes we like to upgrade our server without making all of our
clients freak out and reconnect.  This patch provides a way to specify a
dead connection timeout to allow us to pause all requests and wait for
new connections to be opened.  With this in place I can take down the
nbd server for less than the dead connection timeout time and bring it
back up and everything resumes gracefully.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: only clear the queue on device teardown · 2516ab15

Committed by Josef Bacik on Apr 06, 2017

When running a disconnect torture test I noticed that sometimes we would
crash with a negative ref count on our queue.  This was because we were
ending the same request twice.  Turns out we were racing with
NBD_CLEAR_SOCK clearing the requests as well as the teardown of the
device clearing the requests.  So instead make the ioctl only shut down
the sockets and make it so that we only ever run nbd_clear_que from the
device teardown.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: multicast dead link notifications · 799f9a38

Committed by Josef Bacik on Apr 06, 2017

Provide a mechanism to notify userspace that there's been a link problem
on a NBD device.  This will allow userspace to re-establish a connection
and provide the new socket to the device without disrupting the device.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: add a reconfigure netlink command · b7aa3d39

Committed by Josef Bacik on Apr 06, 2017

We want to be able to reconnect dead connections to existing block
devices, so add a reconfigure netlink command.  We will also allow users
to change their timeout on the fly, but everything else will require a
disconnect and reconnect.  You won't be able to add more connections
either, simply replace dead connections with new more lively
connections.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: add a basic netlink interface · e46c7287

Committed by Josef Bacik on Apr 06, 2017

The existing ioctl interface for configuring NBD devices is a bit
cumbersome and hard to extend.  The other problem is we leave a
userspace app sitting in its syscall until the device disconnects,
which is less than ideal.

This patch introduces a netlink interface for adding and disconnecting
nbd devices.  This has the benefits of being easily extendable without
breaking older userspace applications, and allows us to configure a nbd
device without leaving a userspace app sitting waiting for the device to
disconnect.

With this interface we also gain the ability to configure more devices
than are preallocated at insmod time.  We also have gained the ability
to not specify a particular device and be provided one for us so that
userspace doesn't need to find a free device to configure.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: stop using the bdev everywhere · 29eaadc0

Committed by Josef Bacik on Apr 06, 2017

In preparation for the upcoming netlink interface we need to not rely on
already having the bdev for the NBD device we are doing operations on.
Instead of passing the bdev around, just use it in places where we know
we already have the bdev.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: separate out the config information · 5ea8d108

Committed by Josef Bacik on Apr 06, 2017

In order to properly refcount the various aspects of a NBD device we
need to separate out the configuration elements of the nbd device. The
configuration of a NBD device has a different lifetime from the actual
device, so it doesn't make sense to bundle these two concepts. Add a
config_refs to keep track of the configuration structure, that way we
can be sure that we never access it when we've torn down the device.
Add a new nbd_config structure to hold all of the transient
configuration information. Finally create this when we open the device
so that it is in place when we start to configure the device. This has
a nice side-effect of fixing a long standing problem where you could end
up with a half-configured nbd device that needed to be "disconnected" in
order to be usable again. Now once we close our device the
configuration will be discarded.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: handle single path failures gracefully · f3733247

Committed by Josef Bacik on Apr 06, 2017

Currently if we have multiple connections and one of them goes down we will tear
down the whole device. However there's no reason we need to do this as we
could have other connections that are working fine. Deal with this by keeping
track of the state of the different connections, and if we lose one we mark it
as dead and send all IO destined for that socket to one of the other healthy
sockets. Any outstanding requests that were on the dead socket will timeout and
be re-submitted properly.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

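The fallback behaviour described in this message (mark a connection dead, steer IO to a healthy one) might look roughly like the sketch below. It assumes a simple round-robin scan; `demo_conn` and `demo_pick_conn` are hypothetical names, and the real driver's selection logic may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection state; only liveness matters here. */
struct demo_conn {
    bool dead;
};

/*
 * Return the index of a usable connection, preferring `want`.
 * If `want` is dead, scan the others in round-robin order.
 * Returns -1 when every connection is dead.
 */
static int demo_pick_conn(const struct demo_conn *conns, size_t n, size_t want)
{
    assert(n > 0 && want < n);
    for (size_t step = 0; step < n; step++) {
        size_t idx = (want + step) % n;

        if (!conns[idx].dead)
            return (int)idx;
    }
    return -1;
}
```

Only when this returns -1 would the whole device need to be treated as failed; a single dead path just redirects traffic.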

nbd: put socket in error cases · 9b1355d5

Committed by Josef Bacik on Apr 06, 2017

When adding a new socket we look it up and then try to add it to our
configuration.  If any of those steps fail we need to make sure we put
the socket so we don't leak them.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

09 Apr, 2017 1 commit

block: remove the discard_zeroes_data flag · 48920ff2

Committed by Christoph Hellwig on Apr 05, 2017

Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
kill this hack.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

31 Mar, 2017 1 commit

blk-mq: constify struct blk_mq_ops · f363b089

Committed by Eric Biggers on Mar 30, 2017

Constify all instances of blk_mq_ops, as they are never modified.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

25 Mar, 2017 4 commits

nbd: replace kill_bdev() with __invalidate_device() · abbbdf12

Committed by Ratna Manoj Bolla on Mar 24, 2017

When a filesystem is mounted on an nbd device and the device disconnects,
kill_bdev() and the reset of the bdev size to zero destroy buffer_head
mappings under the mounted filesystem.

After a bdev size reset (i.e. bdev->bd_inode->i_size = 0) on a disconnect,
followed by a sys_umount(),
        generic_shutdown_super()->...
        ->__sync_blockdev()->...
        ->blkdev_writepages()->...
        ->do_invalidatepage()->...
        ->discard_buffer()   is discarding superblock buffer_head assumed
to be in mapped state by ext4_commit_super().

[mlin: ported to 4.11-rc2]
Signed-off-by: Ratna Manoj Bolla <manoj.br@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: set queue timeout properly · f8586855

Committed by Josef Bacik on Mar 24, 2017

We can't just set the timeout on the tagset, we have to set it on the
queue as it would have been setup already at this point.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: set rq->errors to actual error code · c103b4da

Committed by Josef Bacik on Mar 24, 2017

We've been relying on the block layer to assume rq->errors being set
translates into -EIO.  I noticed in testing that sometimes this isn't
true, and really there's not much of a reason to have a counter instead
of just using -EIO.  So set it properly so we don't leak random numbers
to unsuspecting victims.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: handle ERESTARTSYS properly · 9dd5d3ab

Committed by Josef Bacik on Mar 24, 2017

We can submit IO in a process's context, which means there can be
pending signals. This isn't a fatal error for NBD, but it does require
some finesse. If the signal happens before we transmit anything then we
are ok, just requeue the request and carry on. However if we've done a
partial transmit we can't allow anything else to be transmitted on this
socket until we transmit the remaining part of the request. Deal with
this by keeping track of how much we've sent for the current request,
and if we get an ERESTARTSYS during any part of our transmission save
the state of that request and requeue the IO. If anybody tries to
submit a request that isn't our pending request then requeue that
request until we are able to service the one that is pending.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

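The bookkeeping described in this message can be modelled in userspace along the following lines. This is a simplified sketch, not the driver's code: `demo_sock`, `demo_send`, and the `budget` parameter (which stands in for "how many bytes get out before a signal interrupts the send") are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum demo_result { DEMO_DONE, DEMO_REQUEUE };

/* Hypothetical per-socket state: at most one partially sent request. */
struct demo_sock {
    int pending_id;    /* id of the partially sent request, or -1 */
    size_t sent;       /* bytes of that request already transmitted */
};

/*
 * Try to send request `id` of `len` bytes. `budget` models how many
 * bytes can be written before the send is interrupted.
 */
static enum demo_result demo_send(struct demo_sock *s, int id,
                                  size_t len, size_t budget)
{
    size_t start = 0;

    if (s->pending_id != -1) {
        if (s->pending_id != id)
            return DEMO_REQUEUE;   /* another request owns the socket */
        start = s->sent;           /* resume where the last attempt stopped */
    }
    assert(start <= len);

    size_t remaining = len - start;

    if (budget >= remaining) {
        s->pending_id = -1;        /* fully sent: release the socket */
        s->sent = 0;
        return DEMO_DONE;
    }
    if (start + budget > 0) {
        s->pending_id = id;        /* partial send: remember progress */
        s->sent = start + budget;
    }
    return DEMO_REQUEUE;           /* nothing, or only part, was sent */
}
```

An interruption before any byte is sent leaves the socket free for any request, while a partial send pins the socket to that request until the remainder goes out.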

02 Mar, 2017 1 commit

nbd: stop leaking sockets · 6a8a2154

Committed by Josef Bacik on Mar 01, 2017

This was introduced in the multi-connection patch; we've been leaking
sockets ever since.

Fixes: 9561a7ad ("nbd: add multi-connection support")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

22 Feb, 2017 3 commits

nbd: cleanup workqueue on error properly · 6330a2d0

Committed by Josef Bacik on Feb 15, 2017

If we fail to register the blockdev we need to make sure to destroy the
recv workqueue.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: set the logical and physical blocksize properly · e544541b

Committed by Josef Bacik on Feb 13, 2017

We noticed when trying to do O_DIRECT to an export on the server side
that we were getting requests smaller than the 4k sectorsize of the
device.  This is because the client isn't setting the logical and
physical blocksizes properly for the underlying device.  Fix this up by
setting the queue blocksizes and then calling bd_set_size.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: cleanup ioctl handling · 9442b739

Committed by Josef Bacik on Feb 07, 2017

Break the ioctl handling out into helper functions, some of these things
are getting pretty big and unwieldy.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

02 Feb, 2017 2 commits

nbd: use an idr to keep track of nbd devices · b0d9111a

Committed by Josef Bacik on Feb 01, 2017

To prepare for dynamically adding new nbd devices to the system switch
from using an array for the nbd devices and instead use an idr.  This
copies what loop does for keeping track of its devices.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: use our own workqueue for recv threads · 124d6db0

Committed by Josef Bacik on Feb 01, 2017

Since we are in the memory reclaim path we need our recv work to be on a
workqueue that has WQ_MEM_RECLAIM set so we can avoid deadlocks.  Also
set WQ_HIGHPRI since we are in the completion path for IO.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

01 Feb, 2017 3 commits

block: fold cmd_type into the REQ_OP_ space · aebf526b

Committed by Christoph Hellwig on Jan 31, 2017

Instead of keeping two levels of indirection for request types, fold it
all into the operations.  The little caveat here is that previously
cmd_type only applied to struct request, while the request and bio op
fields were set to plain REQ_OP_READ/WRITE even for passthrough
operations.

Instead this patch adds new REQ_OP_* for SCSI passthrough and driver
private requests, although it has to add two for each so that we
can communicate the data in/out nature of the request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: move request validity checking into nbd_send_cmd · 09fc54cc

Committed by Christoph Hellwig on Jan 31, 2017

This is where we do the rest of the request handling, which will
become much simpler soon, too.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: remove REQ_TYPE_DRV_PRIV leftovers · 27410a89

Committed by Christoph Hellwig on Jan 31, 2017

Disconnects don't use block layer requests these days, so all handling
of private requests is dead code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

20 Jan, 2017 1 commit

nbd: only set MSG_MORE when we have more to send · d61b7f97

Committed by Josef Bacik on Jan 19, 2017

A user noticed that write performance was horrible over loopback and we
traced it to an inversion of when we need to set MSG_MORE.  It should be
set when we have more bvec's to send, not when we are on the last bvec.
This patch made the test go from 20 iops to 78k iops.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Fixes: 429a787b ("nbd: fix use-after-free of rq/bio in the xmit path")
Signed-off-by: Jens Axboe <axboe@fb.com>

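The rule stated in this message (set MSG_MORE while more segments follow, clear it on the last one) can be written as a small helper. This is a userspace sketch for Linux; `demo_send_flags` is a hypothetical name and the driver's own code is structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Flags for sending segment `i` of `n` segments of one request.
 * MSG_MORE tells the stack that more data follows, so it is set on
 * every segment except the last one.
 */
static int demo_send_flags(size_t i, size_t n)
{
    assert(i < n);
    return (i + 1 < n) ? MSG_MORE : 0;
}
```

The inverted condition, `(i + 1 == n) ? MSG_MORE : 0`, is the kind of mistake the message describes: it asks the stack to hold back exactly the final segment.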

11 Jan, 2017 1 commit

nbd: blk_mq_init_queue returns an error code on failure, not NULL · 25b4acfc

Committed by Jeff Moyer on Jan 09, 2017

Additionally, don't assign directly to disk->queue, otherwise
blk_put_queue (called via put_disk) will choke (panic) on the errno
stored there.

Bug found by code inspection after Omar found a similar issue in
virtio_blk.  Compile-tested only.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

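The two points in this message (check for an encoded error rather than NULL, and assign to the long-lived field only on success) can be illustrated in userspace. The helpers below are stand-ins modelled on the kernel's error-pointer convention, not the kernel's own definitions; every `demo_` name and the `DEMO_MAX_ERRNO` constant are assumptions made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *demo_err_ptr(long err)
{
    assert(err < 0 && err >= -DEMO_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* Decode the errno value from an error pointer. */
static long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if `p` lies in the top DEMO_MAX_ERRNO addresses, i.e. is an error. */
static bool demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-DEMO_MAX_ERRNO);
}

/* Hypothetical allocator that reports failure as an encoded errno. */
static void *demo_init_queue(bool fail)
{
    static int queue;

    return fail ? demo_err_ptr(-ENOMEM) : (void *)&queue;
}

/* Store the result only on success, so no errno is ever left in *slot. */
static int demo_setup(void **slot, bool fail)
{
    void *q = demo_init_queue(fail);

    if (demo_is_err(q))           /* a NULL check would miss this */
        return (int)demo_ptr_err(q);
    *slot = q;
    return 0;
}
```

Going through a local variable means later cleanup code never sees an encoded errno where it expects a valid pointer.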

27 Dec, 2016 2 commits

[nbd] pass iov_iter to nbd_xmit() · c9f2b6ae

Committed by Al Viro on Nov 12, 2015

... and don't mess with kmap() - just use BVEC_ITER for those parts.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

[nbd] switch sock_xmit() to sock_{send,recv}msg() · c1696cab

Committed by Al Viro on Nov 12, 2015

Step 1 - don't reinitialize ->msg_iter on each iteration.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

25 Dec, 2016 1 commit

Replace <asm/uaccess.h> with <linux/uaccess.h> globally · 7c0f6ba6

Committed by Linus Torvalds on Dec 24, 2016

This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

09 Dec, 2016 2 commits

nbd: use dev_err_ratelimited in io path · a897b666

Committed by Josef Bacik on Dec 05, 2016

While doing stress tests we noticed that we'd get a lot of dmesg spam if
we suddenly disconnected the nbd device out of band.  Rate limit the
messages in the io path in order to deal with this.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: reset the setup task for NBD_CLEAR_SOCK · 20032ec3

Committed by Josef Bacik on Dec 08, 2016

If an app exits before running NBD_DO_IT but after adding sockets we can
end up not being allowed to set up a new nbd device.  Fix this by making
NBD_CLEAR_SOCK reset the setup_task.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

04 Dec, 2016 1 commit

nbd: fix 64-bit division · e88f72cb

Committed by Jens Axboe on Dec 03, 2016

We have this:

ERROR: "__aeabi_ldivmod" [drivers/block/nbd.ko] undefined!
ERROR: "__divdi3" [drivers/block/nbd.ko] undefined!
nbd.c:(.text+0x247c72): undefined reference to `__divdi3'

due to a recent commit, that did 64-bit division. Use the proper
divider function so that 32-bit compiles don't break.

Fixes: ef77b515 ("nbd: use loff_t for blocksize and nbd_set_size args")
Signed-off-by: Jens Axboe <axboe@fb.com>

03 Dec, 2016 1 commit

nbd: use loff_t for blocksize and nbd_set_size args · ef77b515

Committed by Josef Bacik on Dec 02, 2016

If we have large devices (say like the 40t drive I was trying to test with) we
will end up overflowing the int arguments to nbd_set_size and not get the right
size for our device.  Fix this by using loff_t everywhere so I don't have to
think about this again.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

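The overflow this message refers to is easy to reproduce with plain arithmetic. The sketch below assumes a 32-bit `int`, reads "40t" as 40 TiB, and picks a 4096-byte block size for illustration; `demo_nr_blocks` and `demo_fits_in_int` are hypothetical helpers, not functions from nbd.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of blocks on a device of `bytes` bytes, computed in 64 bits. */
static int64_t demo_nr_blocks(int64_t bytes, int64_t blocksize)
{
    assert(blocksize > 0);
    return bytes / blocksize;
}

/* True if `v` could be passed through an int argument without loss. */
static bool demo_fits_in_int(int64_t v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

For a 40 TiB device, `demo_nr_blocks((int64_t)40 << 40, 4096)` is 10737418240, which already exceeds a 32-bit `INT_MAX` of 2147483647, so neither the byte size nor the block count survives a trip through an `int` parameter.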
