1. 07 Aug, 2019 1 commit
    • nbd: replace kill_bdev() with __invalidate_device() again · eb828241
      Munehisa Kamata committed
      commit 2b5c8f0063e4b263cf2de82029798183cf85c320 upstream.
      
      Commit abbbdf12 ("replace kill_bdev() with __invalidate_device()")
      once did this, but 29eaadc0 ("nbd: stop using the bdev everywhere")
      resurrected kill_bdev() and it has been there since then. So buffer_head
      mappings still get killed on a server disconnection, and we can still
      hit the BUG_ON on a filesystem on top of the nbd device.
      
        EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
        block nbd0: Receive control failed (result -32)
        block nbd0: shutting down sockets
        print_req_error: I/O error, dev nbd0, sector 66264 flags 3000
        EXT4-fs warning (device nbd0): htree_dirblock_to_tree:979: inode #2: lblock 0: comm ls: error -5 reading directory block
        print_req_error: I/O error, dev nbd0, sector 2264 flags 3000
        EXT4-fs error (device nbd0): __ext4_get_inode_loc:4690: inode #2: block 283: comm ls: unable to read itable block
        EXT4-fs error (device nbd0) in ext4_reserve_inode_write:5894: IO failure
        ------------[ cut here ]------------
        kernel BUG at fs/buffer.c:3057!
        invalid opcode: 0000 [#1] SMP PTI
        CPU: 7 PID: 40045 Comm: jbd2/nbd0-8 Not tainted 5.1.0-rc3+ #4
        Hardware name: Amazon EC2 m5.12xlarge/, BIOS 1.0 10/16/2017
        RIP: 0010:submit_bh_wbc+0x18b/0x190
        ...
        Call Trace:
         jbd2_write_superblock+0xf1/0x230 [jbd2]
         ? account_entity_enqueue+0xc5/0xf0
         jbd2_journal_update_sb_log_tail+0x94/0xe0 [jbd2]
         jbd2_journal_commit_transaction+0x12f/0x1d20 [jbd2]
         ? __switch_to_asm+0x40/0x70
         ...
         ? lock_timer_base+0x67/0x80
         kjournald2+0x121/0x360 [jbd2]
         ? remove_wait_queue+0x60/0x60
         kthread+0xf8/0x130
         ? commit_timeout+0x10/0x10 [jbd2]
         ? kthread_bind+0x10/0x10
         ret_from_fork+0x35/0x40
      
      With __invalidate_device(), I no longer hit the BUG_ON with sync or
      unmount on the disconnected device.
      
      Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
      Cc: linux-block@vger.kernel.org
      Cc: Ratna Manoj Bolla <manoj.br@gmail.com>
      Cc: nbd@other.debian.org
      Cc: stable@vger.kernel.org
      Cc: David Woodhouse <dwmw@amazon.com>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
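The difference between the two teardown styles described in this commit can be illustrated with a small userspace model. This is a sketch only, not kernel code: `Buffer`, `kill_all`, `invalidate_unused`, and `submit` are hypothetical names, and the assumption that the invalidate-style path skips buffers a filesystem still holds is a simplification of what `__invalidate_device()` does.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Toy stand-in for a buffer_head (hypothetical, heavily simplified)."""
    mapped: bool = True   # still associated with a block on the device
    in_use: bool = False  # a filesystem (e.g. the journal) holds a reference


def kill_all(buffers):
    """Models a kill_bdev()-style teardown: every mapping is dropped."""
    for buf in buffers:
        buf.mapped = False


def invalidate_unused(buffers):
    """Models an invalidate-style teardown (assumed): busy buffers are skipped."""
    for buf in buffers:
        if not buf.in_use:
            buf.mapped = False


def submit(buf):
    """Models a mapped-buffer sanity check (assumed to match the BUG_ON in the log)."""
    if not buf.mapped:
        raise RuntimeError("BUG: submitting an unmapped buffer")
    return "submitted"
```

In this model, a journal buffer that is still in use survives `invalidate_unused()` and can be submitted later (the I/O itself would still fail on a disconnected device), whereas after `kill_all()` the same submission trips the check.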
  2. 23 Jan, 2019 1 commit
  3. 05 Sep, 2018 1 commit
  4. 21 Jul, 2018 1 commit
  5. 17 Jul, 2018 2 commits
    • nbd: handle unexpected replies better · 8f3ea359
      Josef Bacik committed
      If the server or network is misbehaving and we get an unexpected reply
      we can sometimes miss the request not being started and wait on a
      request and never get a response, or even double complete the same
      request.  Fix this by replacing the send_complete completion with just a
      per command lock.  Add a per command cookie as well so that we can know
      if we're getting a double completion for a previous event.  Also check
      to make sure we don't have REQUEUED set as that means we raced with the
      timeout handler and need to just let the retry occur.
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nbd: don't requeue the same request twice. · d7d94d48
      Josef Bacik committed
      We can race with the snd timeout and the per-request timeout and end up
      requeuing the same request twice.  We can't use the send_complete
      completion to tell if everything is ok because we hold the tx_lock
      during send, so the timeout stuff will block waiting to mark the socket
      dead, and we could be marked complete and still requeue.  Instead add a
      flag to the socket so we know whether we've been requeued yet.
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
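The two commits above combine a per-command lock, a per-command cookie, and a requeued flag. A minimal userspace sketch of how those three pieces can interact follows. All names (`Command`, `start`, `requeue`, `handle_reply`) and the handle layout (cookie in the upper bits, tag in the lower 32) are assumptions for illustration, and the model keeps the requeued flag on the command for simplicity; it is not the driver's code.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Command:
    """Toy per-request state (hypothetical names and layout)."""
    tag: int
    cookie: int = 0          # bumped every time the tag is (re)used
    requeued: bool = False   # set once by the timeout path
    completed: bool = False
    lock: threading.Lock = field(default_factory=threading.Lock)


def start(cmd):
    """Begin a new use of this tag and return the handle sent to the server."""
    with cmd.lock:
        cmd.cookie += 1
        cmd.requeued = False
        cmd.completed = False
        return (cmd.cookie << 32) | cmd.tag


def requeue(cmd):
    """Timeout path: returns True only for the first requeue of this use."""
    with cmd.lock:
        if cmd.requeued:
            return False
        cmd.requeued = True
        return True


def handle_reply(cmd, handle):
    """Reply path: classify a reply under the per-command lock."""
    with cmd.lock:
        if (handle >> 32) != cmd.cookie:
            return "stale"               # reply for an earlier use of the tag
        if cmd.requeued:
            return "raced-with-timeout"  # let the retry occur instead
        if cmd.completed:
            return "double"              # same use completed twice
        cmd.completed = True
        return "completed"
```

Because both paths take the same lock and consult the same flag and cookie, a late or duplicated reply is classified rather than completing the request a second time, and a second requeue attempt is refused.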
  6. 21 Jun, 2018 1 commit
  7. 05 Jun, 2018 2 commits
  8. 31 May, 2018 2 commits
  9. 29 May, 2018 2 commits
  10. 25 May, 2018 1 commit
  11. 24 May, 2018 1 commit
  12. 23 May, 2018 1 commit
  13. 17 May, 2018 6 commits
  14. 09 Mar, 2018 1 commit
  15. 28 Feb, 2018 1 commit
  16. 07 Nov, 2017 2 commits
  17. 25 Oct, 2017 1 commit
  18. 10 Oct, 2017 1 commit
    • nbd: don't set the device size until we're connected · 639812a1
      Josef Bacik committed
      A user reported a regression with using the normal ioctl interface on
      newer kernels.  This happens because I was setting the device size
      before the device was actually connected, which caused us to error out
      and close everything down.  This didn't happen on netlink because we
      hold the device lock the whole time we're setting things up, but we
      don't do that for the ioctl path.  This fixes the problem.
      
      Cc: stable@vger.kernel.org
      Fixes: 29eaadc0 ("nbd: stop using the bdev everywhere")
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  19. 03 Oct, 2017 1 commit
    • nbd: fix -ERESTARTSYS handling · 6e60a3bb
      Josef Bacik committed
      Christoph made it so that if we returned BLK_STS_RESOURCE whenever we
      got ERESTARTSYS from sending our packets we'd return BLK_STS_OK, which
      means we'd never requeue and just hang.  We really need to return the
      right value from the upper layer.
      
      Fixes: fc17b653 ("blk-mq: switch ->queue_rq return value to blk_status_t")
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
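The point of this fix is that the status chosen by the lower layer has to survive the trip through the upper layer. A minimal sketch of such a mapping follows, assuming a lower layer that returns 0 on success, a positive block status when the send was interrupted, and a negative errno otherwise. The function names and the numeric status values are illustrative assumptions, not taken from the driver.

```python
ERESTARTSYS = 512        # kernel-internal "restart the syscall" errno

BLK_STS_OK = 0
BLK_STS_RESOURCE = 9     # illustrative value: "busy, please requeue"
BLK_STS_IOERR = 10       # illustrative value: hard I/O error


def send_cmd(send_result):
    """Lower layer (hypothetical): translate the raw socket send result."""
    if send_result == -ERESTARTSYS:
        return BLK_STS_RESOURCE   # interrupted: ask the block layer to retry
    return 0 if send_result >= 0 else send_result


def queue_rq(send_result):
    """Upper layer (hypothetical): map to a block status without losing it."""
    ret = send_cmd(send_result)
    if ret < 0:
        return BLK_STS_IOERR      # any negative errno becomes an I/O error
    if ret == 0:
        return BLK_STS_OK
    return ret                    # positive statuses pass through unchanged
```

If the upper layer instead collapsed every non-negative value to `BLK_STS_OK`, an interrupted send would be reported as success and the request would never be requeued, which is the hang the commit message describes.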
  20. 25 Sep, 2017 1 commit
    • nbd: ignore non-nbd ioctl's · 1dae69be
      Josef Bacik committed
      In testing we noticed that nbd would spew if you ran a fio job against
      the raw device itself.  This is because fio calls a block device
      specific ioctl, however the block layer will first pass this back to the
      driver ioctl handler in case the driver wants to do something special.
      Since the device was set up using netlink this caused us to spew every
      time fio called this ioctl.  Since we don't have special handling, just
      error out for any non-nbd specific ioctl's that come in.  This fixes the
      spew.
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
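Rejecting foreign ioctls early can be done by looking at the "type" byte encoded in the ioctl number. The sketch below assumes the common Linux ioctl encoding (type in bits 8 to 15) and the 0xab type used by the nbd ioctl definitions. The handler name, the `-EINVAL` return value, and the choice of `BLKGETSIZE64` as the sample foreign ioctl are assumptions for illustration; the commit message does not say which ioctl fio issued.

```python
import errno

NBD_IOCTL_TYPE = 0xAB                 # type byte of the nbd ioctl family


def ioc_type(cmd):
    """Extract the type byte, assuming the generic Linux ioctl layout."""
    return (cmd >> 8) & 0xFF


def nbd_ioctl(cmd):
    """Hypothetical handler: error out for anything that is not an nbd ioctl."""
    if ioc_type(cmd) != NBD_IOCTL_TYPE:
        return -errno.EINVAL          # assumed error code
    return 0                          # placeholder for the real handling


NBD_SET_SOCK = (NBD_IOCTL_TYPE << 8) | 0   # _IO(0xab, 0)
BLKGETSIZE64 = 0x80081272                  # generic block ioctl (x86-64 value)
```

With this filter, `nbd_ioctl(NBD_SET_SOCK)` reaches the normal handling path while `nbd_ioctl(BLKGETSIZE64)` is rejected before any nbd-specific code or logging runs.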
  21. 29 Aug, 2017 1 commit
  22. 18 Aug, 2017 2 commits
  23. 26 Jul, 2017 1 commit
    • nbd: clear disconnected on reconnect · 7a362ea9
      Josef Bacik committed
      If our device loses its connection for longer than the dead timeout we
      will set NBD_DISCONNECTED in order to quickly fail any pending IO's that
      flood in after the IO's that were waiting during the dead timer.
      However if we re-connect at some point in the future we'll still see
      this DISCONNECTED flag set if we then lose our connection again after
      that, which means we won't get notifications for our newly lost
      connections.  Fix this by just clearing the DISCONNECTED flag on
      reconnect in order to make sure everything works as it's supposed to.
      Reported-by: Dan Melnic <dmm@fb.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
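The stale-flag problem described in this commit can be reproduced with a tiny state model. This is a sketch under stated assumptions: the `Device` class, its method names, and the rule that a lost connection only produces a notification while the disconnected flag is clear are simplifications for illustration, not the driver's code.

```python
class Device:
    """Toy model of the disconnected flag (hypothetical names)."""

    def __init__(self, clear_on_reconnect):
        self.disconnected = False
        self.notifications = 0
        self.clear_on_reconnect = clear_on_reconnect

    def lose_connection(self):
        # Assumed rule: only notify while the device is not marked disconnected.
        if not self.disconnected:
            self.notifications += 1

    def dead_timeout_expired(self):
        # Connection stayed down past the dead timeout: fail pending I/O fast.
        self.disconnected = True

    def reconnect(self):
        # The fix: forget the old disconnected state once we are back.
        if self.clear_on_reconnect:
            self.disconnected = False


def run_scenario(clear_on_reconnect):
    """Lose the link, time out, reconnect, then lose the link again."""
    dev = Device(clear_on_reconnect)
    dev.lose_connection()
    dev.dead_timeout_expired()
    dev.reconnect()
    dev.lose_connection()
    return dev.notifications
```

Running the scenario with `clear_on_reconnect=False` yields one notification because the second loss is silently dropped, while `clear_on_reconnect=True` yields two.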
  24. 23 Jul, 2017 3 commits
  25. 13 Jul, 2017 1 commit
  26. 06 Jul, 2017 1 commit
  27. 09 Jun, 2017 1 commit