Commits · 76451d79bde6bed17e113f057e58e1fa5fb79e78 · openeuler / Kernel

23 Jul, 2017 · 3 commits

nbd: only set sndtimeo if we have a timeout set · a7ee8cf1

Committed by Josef Bacik on Jul 21, 2017

A user reported that he was getting immediate disconnects with my
sndtimeo patch applied.  This is because by default the OSS nbd client
doesn't set a timeout, so we end up setting the sndtimeo to 0, which of
course means we have send errors a lot.  Instead only set our sndtimeo
if the user specified a timeout, otherwise we'll just wait forever like
we did previously.

Fixes: dc88e34d ("nbd: set sk->sk_sndtimeo for our sockets")
Reported-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
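
The decision this commit describes can be sketched in plain C. This is a userspace model under stated assumptions, not the driver code: `WAIT_FOREVER` stands in for the kernel's "no timeout" value, and `pick_sndtimeo` is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: stand-in for the kernel's "wait forever" timeout value. */
#define WAIT_FOREVER LONG_MAX

/*
 * Hypothetical helper: choose the send timeout for a socket.
 * A configured timeout of 0 means "the user did not set one", so the
 * existing value is kept rather than overwritten with 0, which would
 * make sends give up immediately.
 */
static long pick_sndtimeo(long current_sndtimeo, long configured_timeout)
{
    if (configured_timeout != 0)
        return configured_timeout;
    return current_sndtimeo;
}
```

With no timeout configured the socket keeps waiting forever, as it did before the sndtimeo patch. A non-zero timeout is applied as-is.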

nbd: take tx_lock before disconnecting · b4b2aecc

Committed by Josef Bacik on Jul 21, 2017

We need to take the tx_lock so we don't interleave our disconnect
request between real data going down the wire.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

nbd: allow multiple disconnects to be sent · 2e13456f

Committed by Josef Bacik on Jul 21, 2017

There's no reason to limit ourselves to one disconnect message per
socket.  Sometimes networks do strange things, might as well let
sysadmins hit the panic button as much as they want.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

13 Jul, 2017 · 1 commit

nbd: kill unused ret in recv_work · 76851689

Committed by Kefeng Wang on Jul 13, 2017

There is no need to return a value from the queued work function, so
remove the ret variable.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

06 Jul, 2017 · 1 commit

nbd: quiesce request queues to make sure no submissions are inflight · b52c2e92

Committed by Sagi Grimberg on Jul 04, 2017

Unlike blk_mq_stop_hw_queues, blk_mq_quiesce_queue respects the
submission path RCU grace period. Quiesce the queue before iterating
on live tags.

Reviewed-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>

09 Jun, 2017 · 3 commits

blk-mq: switch ->queue_rq return value to blk_status_t · fc17b653

Committed by Christoph Hellwig on Jun 03, 2017

Use the same values as for request completion errors for the return
value from ->queue_rq.  BLK_STS_RESOURCE is special-cased to cause
a requeue, and all the others are completed as-is.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

block: introduce new block status code type · 2a842aca

Committed by Christoph Hellwig on Jun 03, 2017

Currently we use normal Linux errno values in the block layer, and while
we accept any error a few have overloaded magic meanings. This patch
instead introduces a new blk_status_t value that holds block layer specific
status codes and explicitly explains their meaning. Helpers to convert from
and to the previous special meanings are provided for now, but I suspect
we want to get rid of them in the long run - those drivers that have an
errno input (e.g. networking) usually get errnos that don't know about
the special block layer overloads, and similarly returning them to userspace
will usually return something that strictly speaking isn't correct
for file system operations, but that's left as an exercise for later.

For now the set of errors is a very limited set that closely corresponds
to the previous overloaded errno values, but there is some low hanging
fruit to improve it.

blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
typechecking, so that we can easily catch places passing the wrong values.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
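
The idea of a dedicated status type with errno conversion helpers can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: the `demo_` names and numeric codes are hypothetical and do not reproduce the kernel's definitions, and the struct wrapper only approximates the type checking that sparse's `__bitwise` annotation provides.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Wrapping the code in a struct stops plain ints from being passed by mistake. */
typedef struct { unsigned char code; } demo_status_t;

/* Hypothetical status codes; the values are illustrative only. */
enum { DEMO_OK = 0, DEMO_NOSPC = 1, DEMO_TIMEOUT = 2, DEMO_IOERR = 3 };

/* One table drives both conversion directions. */
static const struct { int err; unsigned char code; } demo_map[] = {
    { 0,          DEMO_OK },
    { -ENOSPC,    DEMO_NOSPC },
    { -ETIMEDOUT, DEMO_TIMEOUT },
    { -EIO,       DEMO_IOERR },
};

#define DEMO_MAP_LEN (sizeof(demo_map) / sizeof(demo_map[0]))

/* Convert a negative errno to a status; unknown errnos become a generic I/O error. */
static demo_status_t demo_errno_to_status(int err)
{
    size_t i;

    for (i = 0; i < DEMO_MAP_LEN; i++)
        if (demo_map[i].err == err)
            return (demo_status_t){ demo_map[i].code };
    return (demo_status_t){ DEMO_IOERR };
}

/* Convert a status back to a negative errno; unknown codes map to -EIO. */
static int demo_status_to_errno(demo_status_t st)
{
    size_t i;

    for (i = 0; i < DEMO_MAP_LEN; i++)
        if (demo_map[i].code == st.code)
            return demo_map[i].err;
    return -EIO;
}
```

Keeping a single table for both directions means a newly added status cannot be convertible one way only.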

nbd: set sk->sk_sndtimeo for our sockets · dc88e34d

Committed by Josef Bacik on Jun 08, 2017

If the nbd server stops receiving packets altogether we will get stuck
waiting for them to receive indefinitely as the tcp buffer will never
empty, which looks like a deadlock.  Fix this by setting the sk send
timeout to our configured timeout, that way if the server really
misbehaves we'll disconnect cleanly instead of waiting forever.

Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

30 May, 2017 · 3 commits

nbd: add FUA op support · 685c9b24

Committed by Shaun McDowell on May 25, 2017

NBD userland client and server have FUA (forced unit access) support
and flags defined. Make NBD kernel module recognize NBD_FLAG_SEND_FUA,
enable FUA on the queue, and forward FUA requests to the server.

Signed-off-by: Shaun McDowell <shaunjmcdowell@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: don't leak nbd_config · fa976532

Committed by Ilya Dryomov on May 23, 2017

nbd_config is allocated in nbd_alloc_config(), but never freed.

Fixes: 5ea8d108 ("nbd: separate out the config information")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: nbd_reset() call in nbd_dev_add() is redundant · af622b86

Committed by Ilya Dryomov on May 23, 2017

There is nothing to clear -- nbd_device has just been allocated.
Fold nbd_reset() into its other caller, nbd_config_put().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

09 May, 2017 · 1 commit

treewide: convert PF_MEMALLOC manipulations to new helpers · f1083048

Committed by Vlastimil Babka on May 08, 2017

We now have memalloc_noreclaim_{save,restore} helpers for robust setting
and clearing of PF_MEMALLOC.  Let's convert the code which was using the
generic tsk_restore_flags().  No functional change.

[vbabka@suse.cz: in net/core/sock.c the hunk is missing]
Link: http://lkml.kernel.org/r/20170405074700.29871-4-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Wouter Verhelst <w@uter.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

02 May, 2017 · 1 commit

blk-mq: update ->init_request and ->exit_request prototypes · d6296d39

Committed by Christoph Hellwig on May 01, 2017

Remove the request_idx parameter, which can't be used safely now that we
support I/O schedulers with blk-mq.  Except for a superfluous check in
mtip32xx it was unused anyway.

Also pass the tag_set instead of just the driver data - this allows drivers
to avoid some code duplication in a follow-on cleanup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

28 Apr, 2017 · 1 commit

nbd: fix use after free on module unload · 60ae36ad

Committed by Josef Bacik on Apr 28, 2017

list_for_each_entry() isn't super safe if we're freeing the objects
while we traverse the list.  Also don't bother taking the extra
reference, the module refcounting stuff will save us from having anybody
messing with the device while we're trying to unload.

Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
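
The hazard named in this commit, freeing list entries while walking the list, can be shown with a minimal userspace sketch. It uses a hand-rolled singly linked list with hypothetical `demo_` names rather than the kernel's `struct list_head`, and it illustrates only the general remedy of saving the next pointer before freeing (the idea behind `list_for_each_entry_safe()`), not the exact change made in the driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node for illustration. */
struct demo_node {
    int id;
    struct demo_node *next;
};

/* Prepend a node; returns the new head (or the old head if allocation fails). */
static struct demo_node *demo_push(struct demo_node *head, int id)
{
    struct demo_node *n = malloc(sizeof(*n));

    if (!n)
        return head;
    n->id = id;
    n->next = head;
    return n;
}

/*
 * Free every node and return how many were freed.
 * The next pointer is read BEFORE free(); reading cur->next after
 * free(cur) would be a use after free.
 */
static int demo_free_all(struct demo_node *head)
{
    struct demo_node *cur = head;
    struct demo_node *next;
    int freed = 0;

    while (cur) {
        next = cur->next;
        free(cur);
        cur = next;
        freed++;
    }
    return freed;
}
```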

21 Apr, 2017 · 3 commits

nbd: set the max segments to USHRT_MAX · 1cc1f17a

Committed by Josef Bacik on Apr 20, 2017

I lack the basic understanding of what segments mean, so we were being
limited to 512 KiB requests even with higher max_sectors sizes set.
Setting the maximum number of segments to unlimited allows us to
actually have arbitrarily large IOs go through NBD.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

blk-mq: remove the error argument to blk_mq_complete_request · 08e0029a

Committed by Christoph Hellwig on Apr 20, 2017

Now that all drivers that call blk_mq_complete_request have a
->complete callback we can remove the direct call to blk_mq_end_request,
as well as the error argument to blk_mq_complete_request.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: don't use req->errors · 1e388ae0

Committed by Christoph Hellwig on Apr 20, 2017

Add an nbd-specific field instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

19 Apr, 2017 · 1 commit

nbd: set the max segment size to UINT_MAX · ebb16d0d

Committed by Josef Bacik on Apr 18, 2017

NBD doesn't care about limiting the segment size, so let the user push
the largest bios they want.  This allows us to control the request size
solely through max_sectors_kb.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

17 Apr, 2017 · 12 commits

nbd: add a flag to destroy an nbd device on disconnect · a2c97909

Committed by Josef Bacik on Apr 06, 2017

For ease of management it would be nice for users to specify that the
device node for an nbd device is destroyed once it is disconnected and
there are no more users.  Add a client flag and enable this operation to
happen.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: add device refcounting · c6a4759e

Committed by Josef Bacik on Apr 06, 2017

In order to support deleting the device on disconnect we need to
refcount the actual nbd_device struct.  So add the refcounting framework
and change how we free the normal devices at rmmod time so we can catch
reference leaks.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
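
Reference counting of a device object, as this commit introduces, follows a familiar get/put pattern. The sketch below is a single-threaded userspace model with hypothetical names (`demo_dev`, `demo_get`, `demo_put`). Real kernel code would use an atomic counter type, and the actual nbd functions are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device object with a plain (non-atomic) reference count. */
struct demo_dev {
    int refs;
    bool released;   /* set when the last reference is dropped */
};

/* Take an extra reference. */
static void demo_get(struct demo_dev *d)
{
    d->refs++;
}

/*
 * Drop a reference. Returns true if this call dropped the last one,
 * which is the point where the object would be torn down.
 */
static bool demo_put(struct demo_dev *d)
{
    if (--d->refs == 0) {
        d->released = true;
        return true;
    }
    return false;
}
```

If every user pairs a get with a put, a non-zero count at unload time signals a leaked reference, which is the kind of leak the commit message says the new scheme helps catch.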

nbd: add a status netlink command · 47d902b9

Committed by Josef Bacik on Apr 06, 2017

Allow users to query the status of existing nbd devices.  Right now this
only returns whether or not the device is connected, but could be
extended in the future to include more information.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: handle dead connections · 560bc4b3

Committed by Josef Bacik on Apr 06, 2017

Sometimes we like to upgrade our server without making all of our
clients freak out and reconnect.  This patch provides a way to specify a
dead connection timeout to allow us to pause all requests and wait for
new connections to be opened.  With this in place I can take down the
nbd server for less than the dead connection timeout time and bring it
back up and everything resumes gracefully.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: only clear the queue on device teardown · 2516ab15

Committed by Josef Bacik on Apr 06, 2017

When running a disconnect torture test I noticed that sometimes we would
crash with a negative ref count on our queue.  This was because we were
ending the same request twice.  Turns out we were racing with
NBD_CLEAR_SOCK clearing the requests as well as the teardown of the
device clearing the requests.  So instead make the ioctl only shut down
the sockets and make it so that we only ever run nbd_clear_que from the
device teardown.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: multicast dead link notifications · 799f9a38

Committed by Josef Bacik on Apr 06, 2017

Provide a mechanism to notify userspace that there's been a link problem
on an NBD device.  This will allow userspace to re-establish a connection
and provide the new socket to the device without disrupting the device.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: add a reconfigure netlink command · b7aa3d39

Committed by Josef Bacik on Apr 06, 2017

We want to be able to reconnect dead connections to existing block
devices, so add a reconfigure netlink command.  We will also allow users
to change their timeout on the fly, but everything else will require a
disconnect and reconnect.  You won't be able to add more connections
either, simply replace dead connections with new more lively
connections.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: add a basic netlink interface · e46c7287

Committed by Josef Bacik on Apr 06, 2017

The existing ioctl interface for configuring NBD devices is a bit
cumbersome and hard to extend.  The other problem is we leave a
userspace app sitting in its syscall until the device disconnects,
which is less than ideal.

This patch introduces a netlink interface for adding and disconnecting
nbd devices.  This has the benefits of being easily extensible without
breaking older userspace applications, and allows us to configure an nbd
device without leaving a userspace app sitting waiting for the device to
disconnect.

With this interface we also gain the ability to configure more devices
than are preallocated at insmod time.  We also have gained the ability
to not specify a particular device and be provided one for us so that
userspace doesn't need to find a free device to configure.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: stop using the bdev everywhere · 29eaadc0

Committed by Josef Bacik on Apr 06, 2017

In preparation for the upcoming netlink interface we need to not rely on
already having the bdev for the NBD device we are doing operations on.
Instead of passing the bdev around, just use it in places where we know
we already have the bdev.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: separate out the config information · 5ea8d108

Committed by Josef Bacik on Apr 06, 2017

In order to properly refcount the various aspects of an NBD device we
need to separate out the configuration elements of the nbd device. The
configuration of an NBD device has a different lifetime from the actual
device, so it doesn't make sense to bundle these two concepts. Add a
config_refs to keep track of the configuration structure, that way we
can be sure that we never access it when we've torn down the device.
Add a new nbd_config structure to hold all of the transient
configuration information. Finally create this when we open the device
so that it is in place when we start to configure the device. This has
a nice side-effect of fixing a long standing problem where you could end
up with a half-configured nbd device that needed to be "disconnected" in
order to be usable again. Now once we close our device the
configuration will be discarded.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: handle single path failures gracefully · f3733247

Committed by Josef Bacik on Apr 06, 2017

Currently if we have multiple connections and one of them goes down we will tear
down the whole device. However there's no reason we need to do this as we
could have other connections that are working fine. Deal with this by keeping
track of the state of the different connections, and if we lose one we mark it
as dead and send all IO destined for that socket to one of the other healthy
sockets. Any outstanding requests that were on the dead socket will time out and
be re-submitted properly.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
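
The fallback behaviour this commit describes, redirecting IO from a dead connection to a healthy one, can be sketched as a simple selection function. This is an illustrative userspace model: `demo_pick_live`, the `dead` array, and the round-robin search order are assumptions, not the driver's actual data structures or policy.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: return the index of a usable connection,
 * starting the search at the preferred one and wrapping around.
 * Returns -1 if every connection is marked dead (or the input is invalid),
 * which is the case where the whole device would have to be torn down.
 */
static int demo_pick_live(const bool *dead, int nconn, int preferred)
{
    int i;

    if (nconn <= 0 || preferred < 0)
        return -1;
    for (i = 0; i < nconn; i++) {
        int idx = (preferred + i) % nconn;

        if (!dead[idx])
            return idx;
    }
    return -1;
}
```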

nbd: put socket in error cases · 9b1355d5

Committed by Josef Bacik on Apr 06, 2017

When adding a new socket we look it up and then try to add it to our
configuration.  If any of those steps fail we need to make sure we put
the socket so we don't leak them.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

11 Apr, 2017 · 1 commit

sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags() · 717a94b5

Committed by NeilBrown on Apr 07, 2017

It is not safe for one thread to modify the ->flags
of another thread as there is no locking that can protect
the update.

So tsk_restore_flags(), which takes a task pointer and modifies
the flags, is an invitation to do the wrong thing.

All current users pass "current" as the task, so no developers have
accepted that invitation.  It would be best to ensure it remains
that way.

So rename tsk_restore_flags() to current_restore_flags() and don't
pass in a task_struct pointer.  Always operate on current->flags.

Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

09 Apr, 2017 · 1 commit

block: remove the discard_zeroes_data flag · 48920ff2

Committed by Christoph Hellwig on Apr 05, 2017

Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
kill this hack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

31 Mar, 2017 · 1 commit

blk-mq: constify struct blk_mq_ops · f363b089

Committed by Eric Biggers on Mar 30, 2017

Constify all instances of blk_mq_ops, as they are never modified.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

25 Mar, 2017 · 4 commits

nbd: replace kill_bdev() with __invalidate_device() · abbbdf12

Committed by Ratna Manoj Bolla on Mar 24, 2017

When a filesystem is mounted on an nbd device and the device disconnects,
kill_bdev() and the reset of the bdev size to zero destroy buffer_head
mappings under the mounted filesystem.

After a bdev size reset (i.e. bdev->bd_inode->i_size = 0) on a disconnect,
followed by a sys_umount(),
        generic_shutdown_super()->...
        ->__sync_blockdev()->...
        ->blkdev_writepages()->...
        ->do_invalidatepage()->...
        ->discard_buffer()
discards the superblock buffer_head that ext4_commit_super() assumes to
be in the mapped state.

[mlin: ported to 4.11-rc2]
Signed-off-by: Ratna Manoj Bolla <manoj.br@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: set queue timeout properly · f8586855

Committed by Josef Bacik on Mar 24, 2017

We can't just set the timeout on the tagset, we have to set it on the
queue as it would have been setup already at this point.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: set rq->errors to actual error code · c103b4da

Committed by Josef Bacik on Mar 24, 2017

We've been relying on the block layer to assume rq->errors being set
translates into -EIO.  I noticed in testing that sometimes this isn't
true, and really there's not much of a reason to have a counter instead
of just using -EIO.  So set it properly so we don't leak random numbers
to unsuspecting victims.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: handle ERESTARTSYS properly · 9dd5d3ab

Committed by Josef Bacik on Mar 24, 2017

We can submit IO in a process's context, which means there can be
pending signals. This isn't a fatal error for NBD, but it does require
some finesse. If the signal happens before we transmit anything then we
are ok, just requeue the request and carry on. However if we've done a
partial transmit we can't allow anything else to be transmitted on this
socket until we transmit the remaining part of the request. Deal with
this by keeping track of how much we've sent for the current request,
and if we get an ERESTARTSYS during any part of our transmission save
the state of that request and requeue the IO. If anybody tries to
submit a request that isn't our pending request then requeue that
request until we are able to service the one that is pending.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
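
The bookkeeping this commit describes, remembering how much of a request has been sent and refusing other requests until it completes, can be modelled in a few lines of userspace C. All names are hypothetical, and `budget` is an assumption of the sketch: it simulates how many bytes the socket accepts before a signal interrupts the send.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-socket state: which request is half-sent, and how far. */
struct demo_conn {
    int pending_id;   /* -1 when no request is partially transmitted */
    size_t sent;      /* bytes of the pending request already on the wire */
};

enum demo_result { DEMO_DONE, DEMO_REQUEUE };

/*
 * Try to send request req_id of len bytes. At most budget bytes go out
 * before a simulated interruption.
 */
static enum demo_result demo_send(struct demo_conn *c, int req_id,
                                  size_t len, size_t budget)
{
    size_t remaining;

    /* Another request is half-sent: nothing else may use this socket. */
    if (c->pending_id != -1 && c->pending_id != req_id)
        return DEMO_REQUEUE;

    if (c->pending_id == -1)
        c->sent = 0;

    remaining = len - c->sent;
    if (budget < remaining) {
        /* Interrupted: save progress and requeue. If nothing was sent,
         * the socket stays free for other requests. */
        c->sent += budget;
        c->pending_id = c->sent ? req_id : -1;
        return DEMO_REQUEUE;
    }

    /* Whole request transmitted: clear the saved state. */
    c->sent = 0;
    c->pending_id = -1;
    return DEMO_DONE;
}
```

A request interrupted after 40 of 100 bytes blocks every other request on that socket until a later call sends the remaining 60 bytes.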

02 Mar, 2017 · 1 commit

nbd: stop leaking sockets · 6a8a2154

Committed by Josef Bacik on Mar 01, 2017

This was introduced in the multi-connection patch; we've been leaking
sockets ever since.

Fixes: 9561a7ad ("nbd: add multi-connection support")
Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

22 Feb, 2017 · 2 commits

nbd: cleanup workqueue on error properly · 6330a2d0

Committed by Josef Bacik on Feb 15, 2017

If we fail to register the blockdev we need to make sure to destroy the
recv workqueue.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: set the logical and physical blocksize properly · e544541b

Committed by Josef Bacik on Feb 13, 2017

We noticed when trying to do O_DIRECT to an export on the server side
that we were getting requests smaller than the 4k sectorsize of the
device.  This is because the client isn't setting the logical and
physical blocksizes properly for the underlying device.  Fix this up by
setting the queue blocksizes and then calling bd_set_size.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
