1. 03 Oct, 2017 1 commit
    • J
      nbd: fix -ERESTARTSYS handling · 6e60a3bb
      Josef Bacik committed
      After Christoph's conversion, whenever we got ERESTARTSYS from sending
      our packets we'd return BLK_STS_OK instead of BLK_STS_RESOURCE, which
      means we'd never requeue and would just hang.  We really need to return
      the right value from the upper layer.
      
      Fixes: fc17b653 ("blk-mq: switch ->queue_rq return value to blk_status_t")
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
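The mapping described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: `sketch_status`, `queue_status_from_send` and `must_requeue` are hypothetical names, not the driver's code, and the enum merely stands in for the kernel's `blk_status_t` values.

```c
#include <assert.h>
#include <errno.h>

/* ERESTARTSYS is kernel-internal and missing from userspace <errno.h>;
 * 512 is the value the kernel uses. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical stand-ins for BLK_STS_OK, BLK_STS_RESOURCE, BLK_STS_IOERR. */
enum sketch_status { STS_OK, STS_RESOURCE, STS_IOERR };

/* Translate the result of a (simulated) packet send into a queue status. */
static enum sketch_status queue_status_from_send(int send_ret)
{
    assert(send_ret <= 0);      /* 0 on success, negative errno on failure */

    if (send_ret == 0)
        return STS_OK;
    if (send_ret == -ERESTARTSYS)
        return STS_RESOURCE;    /* interrupted: the request must be requeued */
    return STS_IOERR;           /* anything else completes with an error */
}

/* Dispatcher side: only the "resource" status triggers a requeue. */
static int must_requeue(enum sketch_status st)
{
    return st == STS_RESOURCE;
}
```

Returning `STS_OK` for the interrupted case would tell the dispatcher the request was sent, which is the hang the commit message describes.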
  2. 25 Sep, 2017 1 commit
    • J
      nbd: ignore non-nbd ioctl's · 1dae69be
      Josef Bacik committed
      In testing we noticed that nbd would spew if you ran a fio job against
      the raw device itself.  This is because fio calls a block device
      specific ioctl, however the block layer will first pass this back to the
      driver ioctl handler in case the driver wants to do something special.
      Since the device was set up using netlink this caused us to spew every
      time fio called this ioctl.  Since we don't have special handling, just
      error out for any non-nbd specific ioctls that come in.  This fixes the
      spew.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
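One way to picture the check is to filter on the "type" byte of the ioctl number. The sketch below assumes the generic Linux ioctl encoding (type in bits 8 to 15) and the `0xab` type used by the NBD ioctls in `<linux/nbd.h>`; `ioc_type` and `nbd_ioctl_filter` are hypothetical helper names, and the real handler may differ.

```c
#include <errno.h>

/* NBD ioctls are declared as _IO(0xab, nr) in <linux/nbd.h>. */
#define SKETCH_NBD_IOC_TYPE 0xabu

/* Local stand-in for the kernel's _IOC_TYPE(): in the generic Linux
 * encoding the type byte occupies bits 8..15 of the ioctl number. */
static unsigned int ioc_type(unsigned int cmd)
{
    return (cmd >> 8) & 0xffu;
}

/* Return 0 when the command belongs to nbd, -EINVAL for anything else. */
static int nbd_ioctl_filter(unsigned int cmd)
{
    if (ioc_type(cmd) != SKETCH_NBD_IOC_TYPE)
        return -EINVAL;
    return 0;
}
```

For example, `0xab00` (an `_IO(0xab, 0)` command) passes the filter, while a block-layer command with type `0x12` such as `0x1261` is rejected.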
  3. 29 Aug, 2017 1 commit
  4. 18 Aug, 2017 2 commits
  5. 26 Jul, 2017 1 commit
    • J
      nbd: clear disconnected on reconnect · 7a362ea9
      Josef Bacik committed
      If our device loses its connection for longer than the dead timeout we
      will set NBD_DISCONNECTED in order to quickly fail any pending IO's that
      flood in after the IO's that were waiting during the dead timer.
      However if we re-connect at some point in the future we'll still see
      this DISCONNECTED flag set if we then lose our connection again after
      that, which means we won't get notifications for our newly lost
      connections.  Fix this by just clearing the DISCONNECTED flag on
      reconnect in order to make sure everything works as it's supposed to.
      Reported-by: Dan Melnic <dmm@fb.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
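The flag lifecycle can be modelled in a few lines. This is a simplified sketch, not the driver's code: only the flag name NBD_DISCONNECTED comes from the commit message, while `sketch_config` and the three functions are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for NBD_DISCONNECTED. */
enum { SKETCH_DISCONNECTED = 1 << 0 };

struct sketch_config {
    unsigned int runtime_flags;
};

/* The dead timeout expired without a replacement connection. */
static void on_dead_timeout(struct sketch_config *c)
{
    c->runtime_flags |= SKETCH_DISCONNECTED;
}

/* A new socket was installed: clear the stale flag (the fix). */
static void on_reconnect(struct sketch_config *c)
{
    c->runtime_flags &= ~(unsigned int)SKETCH_DISCONNECTED;
}

/* In this model, link-loss notifications go out only while the flag is clear. */
static bool should_notify_link_loss(const struct sketch_config *c)
{
    return !(c->runtime_flags & SKETCH_DISCONNECTED);
}
```

Without the clear in `on_reconnect`, a second link loss after a successful reconnect would find the flag still set and stay silent.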
  6. 23 Jul, 2017 3 commits
  7. 13 Jul, 2017 1 commit
  8. 06 Jul, 2017 1 commit
  9. 09 Jun, 2017 3 commits
    • C
      blk-mq: switch ->queue_rq return value to blk_status_t · fc17b653
      Christoph Hellwig committed
      Use the same values for the return value from ->queue_rq as are used
      for request completion errors.  BLK_STS_RESOURCE is special cased to
      cause a requeue, and all the others are completed as-is.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • C
      block: introduce new block status code type · 2a842aca
      Christoph Hellwig committed
      Currently we use normal Linux errno values in the block layer, and while
      we accept any error a few have overloaded magic meanings.  This patch
      instead introduces a new blk_status_t value that holds block layer specific
      status codes and explicitly explains their meaning.  Helpers to convert from
      and to the previous special meanings are provided for now, but I suspect
      we want to get rid of them in the long run - those drivers that have an
      errno input (e.g. networking) usually get errnos that don't know about
      the special block layer overloads, and similarly returning them to userspace
      will usually return something that strictly speaking isn't correct
      for file system operations, but that's left as an exercise for later.
      
      For now the set of errors is a very limited set that closely corresponds
      to the previous overloaded errno values, but there is some low hanging
      fruit to improve it.
      
      blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse
      typechecking, so that we can easily catch places passing the wrong values.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
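The conversion helpers mentioned above can be pictured as a small lookup table. The sketch below is illustrative only: the `sk_` names are hypothetical, the table holds just a few entries chosen for the example, and it does not reproduce the kernel's actual table or its sparse annotations.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical status type; a small integer is enough for this sketch. */
typedef unsigned char sk_status_t;

enum { SK_STS_OK = 0, SK_STS_NOSPC, SK_STS_RESOURCE, SK_STS_IOERR };

/* Illustrative errno <-> status pairs (an assumed subset). */
static const struct {
    int err;
    sk_status_t st;
} sk_map[] = {
    { 0,       SK_STS_OK },
    { -ENOSPC, SK_STS_NOSPC },
    { -ENOMEM, SK_STS_RESOURCE },
    { -EIO,    SK_STS_IOERR },
};

#define SK_MAP_LEN (sizeof(sk_map) / sizeof(sk_map[0]))

/* Unknown errnos collapse to the generic I/O error status. */
static sk_status_t sk_errno_to_status(int err)
{
    for (size_t i = 0; i < SK_MAP_LEN; i++)
        if (sk_map[i].err == err)
            return sk_map[i].st;
    return SK_STS_IOERR;
}

/* Unknown status values are reported as -EIO. */
static int sk_status_to_errno(sk_status_t st)
{
    for (size_t i = 0; i < SK_MAP_LEN; i++)
        if (sk_map[i].st == st)
            return sk_map[i].err;
    return -EIO;
}
```

The round trip loses information for errnos outside the table, which illustrates why the commit message treats such helpers as a stopgap.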
    • J
      nbd: set sk->sk_sndtimeo for our sockets · dc88e34d
      Josef Bacik committed
      If the nbd server stops receiving packets altogether we will get stuck
      waiting indefinitely for it to receive them, as the tcp buffer will never
      empty, which looks like a deadlock.  Fix this by setting the sk send
      timeout to our configured timeout, that way if the server really
      misbehaves we'll disconnect cleanly instead of waiting forever.
      Reported-by: Dan Melnic <dmm@fb.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
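The driver sets the timeout on the kernel socket directly; the closest userspace analogue is the `SO_SNDTIMEO` socket option. The following is a sketch of that analogue, not the patch itself, and `set_send_timeout` is a hypothetical helper name.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Bound how long a blocking send may wait for buffer space.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
static int set_send_timeout(int fd, long seconds)
{
    struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}
```

With the option set, a blocking send on a socket whose peer has stopped reading fails once the timeout expires instead of waiting forever, so the caller can tear the connection down.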
  10. 30 May, 2017 3 commits
  11. 09 May, 2017 1 commit
  12. 02 May, 2017 1 commit
  13. 28 Apr, 2017 1 commit
  14. 21 Apr, 2017 3 commits
  15. 19 Apr, 2017 1 commit
  16. 17 Apr, 2017 12 commits
    • J
      nbd: add a flag to destroy an nbd device on disconnect · a2c97909
      Josef Bacik committed
      For ease of management it would be nice for users to be able to specify
      that the device node for an nbd device is destroyed once it is
      disconnected and there are no more users.  Add a client flag and enable
      this operation to happen.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • J
      nbd: add device refcounting · c6a4759e
      Josef Bacik committed
      In order to support deleting the device on disconnect we need to
      refcount the actual nbd_device struct.  So add the refcounting framework
      and change how we free the normal devices at rmmod time so we can catch
      reference leaks.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
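A minimal get/put refcount, written with C11 atomics, gives the general shape of such a framework. Everything here (`sketch_dev`, `dev_get`, `dev_put`, `dev_leaked`) is hypothetical and only illustrates the idea; the driver uses the kernel's own primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_dev {
    atomic_int refs;
    bool freed;          /* stands in for actually freeing the device */
};

/* A new device starts with one reference held by its creator. */
static void dev_init(struct sketch_dev *d)
{
    atomic_init(&d->refs, 1);
    d->freed = false;
}

static void dev_get(struct sketch_dev *d)
{
    atomic_fetch_add(&d->refs, 1);
}

/* Drop one reference; returns true if it was the last one. */
static bool dev_put(struct sketch_dev *d)
{
    int old = atomic_fetch_sub(&d->refs, 1);

    assert(old > 0);     /* a put without a matching get is a bug */
    if (old == 1) {
        d->freed = true;
        return true;
    }
    return false;
}

/* At teardown, a device that was never freed still has outstanding refs. */
static bool dev_leaked(const struct sketch_dev *d)
{
    return !d->freed;
}
```

Checking `dev_leaked` at teardown is the kind of test that lets a module unload path report reference leaks rather than silently freeing a device that is still in use.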
    • J
      nbd: add a status netlink command · 47d902b9
      Josef Bacik committed
      Allow users to query the status of existing nbd devices.  Right now this
      only returns whether or not the device is connected, but could be
      extended in the future to include more information.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • J
      nbd: handle dead connections · 560bc4b3
      Josef Bacik committed
      Sometimes we like to upgrade our server without making all of our
      clients freak out and reconnect.  This patch provides a way to specify a
      dead connection timeout to allow us to pause all requests and wait for
      new connections to be opened.  With this in place I can take down the
      nbd server for less than the dead connection timeout time and bring it
      back up and everything resumes gracefully.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
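The behaviour can be reduced to a three-way decision per request. The sketch below is a deliberately simplified model with hypothetical names (`sketch_action`, `decide`), and it assumes that a timeout of 0 means the feature is disabled; the driver's real logic involves more state.

```c
enum sketch_action { SK_SEND, SK_WAIT, SK_FAIL };

/*
 * live_conns:   number of usable connections right now
 * dead_for:     seconds since the last connection went away
 * dead_timeout: configured dead connection timeout, 0 = disabled (assumed)
 */
static enum sketch_action decide(int live_conns, unsigned int dead_for,
                                 unsigned int dead_timeout)
{
    if (live_conns > 0)
        return SK_SEND;            /* normal path */
    if (dead_timeout != 0 && dead_for < dead_timeout)
        return SK_WAIT;            /* pause and hope for a reconnect */
    return SK_FAIL;                /* waited long enough, fail the IO */
}
```

A server restart that completes within `dead_timeout` therefore only delays requests, which matches the graceful resume described in the commit message.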
    • J
      nbd: only clear the queue on device teardown · 2516ab15
      Josef Bacik committed
      When running a disconnect torture test I noticed that sometimes we would
      crash with a negative ref count on our queue.  This was because we were
      ending the same request twice.  Turns out we were racing with
      NBD_CLEAR_SOCK clearing the requests as well as the teardown of the
      device clearing the requests.  So instead make the ioctl only shut down
      the sockets and make it so that we only ever run nbd_clear_que from the
      device teardown.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • J
      nbd: multicast dead link notifications · 799f9a38
      Josef Bacik committed
      Provide a mechanism to notify userspace that there's been a link problem
      on an NBD device.  This will allow userspace to re-establish a connection
      and provide the new socket to the device without disrupting the device.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • J
      nbd: add a reconfigure netlink command · b7aa3d39
      Josef Bacik committed
      We want to be able to reconnect dead connections to existing block
      devices, so add a reconfigure netlink command.  We will also allow users
      to change their timeout on the fly, but everything else will require a
      disconnect and reconnect.  You won't be able to add more connections
      either, only replace dead connections with new, livelier
      connections.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • J
      nbd: add a basic netlink interface · e46c7287
      Josef Bacik committed
      The existing ioctl interface for configuring NBD devices is a bit
      cumbersome and hard to extend.  The other problem is we leave a
      userspace app sitting in its syscall until the device disconnects,
      which is less than ideal.
      
      This patch introduces a netlink interface for adding and disconnecting
      nbd devices.  This has the benefits of being easily extendable without
      breaking older userspace applications, and allows us to configure an nbd
      device without leaving a userspace app sitting waiting for the device to
      disconnect.
      
      With this interface we also gain the ability to configure more devices
      than are preallocated at insmod time.  We also gain the ability to not
      specify a particular device and have one provided for us, so that
      userspace doesn't need to find a free device to configure.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • J
      nbd: stop using the bdev everywhere · 29eaadc0
      Josef Bacik committed
      In preparation for the upcoming netlink interface we need to not rely on
      already having the bdev for the NBD device we are doing operations on.
      Instead of passing the bdev around, just use it in places where we know
      we already have the bdev.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • J
      nbd: separate out the config information · 5ea8d108
      Josef Bacik committed
      In order to properly refcount the various aspects of an NBD device we
      need to separate out the configuration elements of the nbd device.  The
      configuration of an NBD device has a different lifetime from the actual
      device, so it doesn't make sense to bundle these two concepts.  Add a
      config_refs to keep track of the configuration structure, that way we
      can be sure that we never access it when we've torn down the device.
      Add a new nbd_config structure to hold all of the transient
      configuration information.  Finally create this when we open the device
      so that it is in place when we start to configure the device.  This has
      a nice side-effect of fixing a long standing problem where you could end
      up with a half-configured nbd device that needed to be "disconnected" in
      order to be usable again.  Now once we close our device the
      configuration will be discarded.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • J
      nbd: handle single path failures gracefully · f3733247
      Josef Bacik committed
      Currently if we have multiple connections and one of them goes down we will tear
      down the whole device.  However there's no reason we need to do this as we
      could have other connections that are working fine.  Deal with this by keeping
      track of the state of the different connections, and if we lose one we mark it
      as dead and send all IO destined for that socket to one of the other healthy
      sockets.  Any outstanding requests that were on the dead socket will time out and
      be re-submitted properly.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
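The fallback can be sketched as a search for the next live connection, starting from the one a request would normally use. `sketch_conn` and `pick_conn` are hypothetical names, and the wrap-around search order is an assumption made for the example; the commit message only says IO goes to "one of the other healthy sockets".

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_conn {
    bool dead;           /* set when the connection is marked dead */
};

/* Return the preferred index if that connection is alive, otherwise the
 * next live one (wrapping around), or -1 if every connection is dead. */
static int pick_conn(const struct sketch_conn *conns, size_t n,
                     size_t preferred)
{
    for (size_t i = 0; i < n; i++) {
        size_t idx = (preferred + i) % n;

        if (!conns[idx].dead)
            return (int)idx;
    }
    return -1;
}
```

Only when `pick_conn` returns -1 is there nothing left to send on, which is the point at which tearing down or pausing the device becomes necessary.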
    • J
      nbd: put socket in error cases · 9b1355d5
      Josef Bacik committed
      When adding a new socket we look it up and then try to add it to our
      configuration.  If any of those steps fails we need to make sure we put
      the socket so we don't leak it.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  17. 11 Apr, 2017 1 commit
    • N
      sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags() · 717a94b5
      NeilBrown committed
      It is not safe for one thread to modify the ->flags
      of another thread as there is no locking that can protect
      the update.
      
      So tsk_restore_flags(), which takes a task pointer and modifies
      the flags, is an invitation to do the wrong thing.
      
      All current users pass "current" as the task, so no developers have
      accepted that invitation.  It would be best to ensure it remains
      that way.
      
      So rename tsk_restore_flags() to current_restore_flags() and don't
      pass in a task_struct pointer.  Always operate on current->flags.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
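The save/modify/restore pattern behind this helper can be modelled in userspace with a thread-local flags word. This is a sketch under that assumption: `sketch_current_flags` stands in for current->flags, and `sketch_current_restore_flags` is a hypothetical name rather than the kernel function.

```c
/* Stand-in for current->flags: each thread has its own copy, so a
 * thread can only ever modify its own flags. */
static _Thread_local unsigned long sketch_current_flags;

/* Restore only the bits selected by mask to their value in orig_flags,
 * leaving every other bit of the current thread's flags untouched. */
static void sketch_current_restore_flags(unsigned long orig_flags,
                                         unsigned long mask)
{
    sketch_current_flags &= ~mask;
    sketch_current_flags |= orig_flags & mask;
}
```

A typical use is to save the flags, set a temporary bit around a critical section, and then restore just that bit; because there is no task argument, the helper cannot be pointed at another thread's flags.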
  18. 09 Apr, 2017 1 commit
  19. 31 Mar, 2017 1 commit
  20. 25 Mar, 2017 1 commit
    • R
      nbd: replace kill_bdev() with __invalidate_device() · abbbdf12
      Ratna Manoj Bolla committed
      When a filesystem is mounted on an nbd device and the device disconnects,
      kill_bdev() and the reset of the bdev size to zero destroy buffer_head
      mappings underneath the mounted filesystem.
      
      After a bdev size reset (i.e. bdev->bd_inode->i_size = 0) on a disconnect,
      followed by a sys_umount(), the call chain
              generic_shutdown_super()->...
              ->__sync_blockdev()->...
              ->blkdev_writepages()->...
              ->do_invalidatepage()->...
              ->discard_buffer()
      discards the superblock buffer_head that ext4_commit_super() assumes
      to be in the mapped state.
      
      [mlin: ported to 4.11-rc2]
      Signed-off-by: Ratna Manoj Bolla <manoj.br@gmail.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>