1. 20 Aug, 2020 4 commits
    • io_uring: kill extra iovec=NULL in import_iovec() · 867a23ea
      Pavel Begunkov committed
      If io_import_iovec() returns an error, the returned iovec is undefined
      and must not be used, so don't set it to NULL when failing.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
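The calling convention described in this commit can be illustrated with a user-space sketch. This is not the kernel code: `import_segments()` and `use_segments()` are hypothetical names, and `calloc()` stands in for the kernel allocator. The point is only that the output pointer is left untouched on failure, so the caller must check the return value before using it.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/uio.h>

/*
 * Hypothetical import helper (an assumption for illustration).
 * On success it stores a heap-allocated array in *out and returns the
 * segment count. On failure it returns a negative errno value and
 * leaves *out untouched, so its content is undefined for the caller.
 */
int import_segments(size_t nr_segs, struct iovec **out)
{
    struct iovec *iov;

    if (nr_segs == 0)
        return -EINVAL;

    iov = calloc(nr_segs, sizeof(*iov));
    if (!iov)
        return -ENOMEM;

    *out = iov;
    return (int)nr_segs;
}

/* Hypothetical caller: checks the return value before touching iov. */
int use_segments(size_t nr_segs)
{
    struct iovec *iov; /* deliberately not initialised */
    int ret = import_segments(nr_segs, &iov);

    if (ret < 0)
        return ret; /* iov is undefined here: do not read or free it */

    free(iov);
    return ret;
}
```

With this convention, resetting the pointer to NULL on the error path adds nothing, because no correct caller looks at it after a failure.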
    • io_uring: comment on kfree(iovec) checks · f261c168
      Pavel Begunkov committed
      kfree() handles NULL pointers well, but io_{read,write}() check for
      NULL anyway for performance reasons. Leave a comment there for those
      who are tempted to patch it.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
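A user-space analogue of the pattern this commit documents might look as follows. It is a sketch under assumptions: `release_iovec()` and the `free_calls` counter are hypothetical, and `free()` stands in for `kfree()`. Both accept NULL, so the check is an optimisation rather than a correctness requirement.

```c
#include <stdlib.h>
#include <sys/uio.h>

/* Hypothetical counter, present only so the effect can be observed. */
int free_calls;

/* Hypothetical helper: release an iovec array only if one was allocated. */
void release_iovec(struct iovec *iovec)
{
    /*
     * free(NULL) is a no-op, so this check is not needed for correctness.
     * It only skips the function call in the common case where no array
     * was allocated.
     */
    if (iovec) {
        free_calls++;
        free(iovec);
    }
}
```

The comment inside the function plays the same role as the one the commit adds: it tells a later reader that the seemingly redundant check is intentional.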
    • io_uring: fix racy req->flags modification · bb175342
      Pavel Begunkov committed
      Setting and clearing REQ_F_OVERFLOW in io_uring_cancel_files() and
      io_cqring_overflow_flush() are racy, because they might be called
      asynchronously.
      
      REQ_F_OVERFLOW flag is only needed for files cancellation, so if it can
      be guaranteed that requests _currently_ marked inflight can't be
      overflown, the problem will be solved with removing the flag
      altogether.
      
      That's how the patch works: it removes the inflight status of a request
      in io_cqring_fill_event() whenever it should be thrown into the
      CQ-overflow list. That's OK to do, because no opcode-specific handling
      can be done after io_cqring_fill_event(), the same assumption as with
      the "struct io_completion" patches.
      And it already has a good place for such cleanups, which is
      io_clean_op(). A nice side effect of this is removing this inflight
      check from the hot path.
      
      Note on synchronisation: now __io_cqring_fill_event() may be taking two
      spinlocks simultaneously, completion_lock and inflight_lock. It's fine,
      because we never do that in reverse order, and CQ-overflow of inflight
      requests shouldn't happen often.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: use system_unbound_wq for ring exit work · fc666777
      Jens Axboe committed
      We currently use system_wq, which is unbounded in terms of number of
      workers. This means that if we're exiting tons of rings at the same
      time, then we'll briefly spawn tons of event kworkers just for a very
      short blocking time as the rings exit.
      
      Use system_unbound_wq instead, which has a sane cap on the concurrency
      level.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 19 Aug, 2020 1 commit
    • io_uring: cleanup io_import_iovec() of pre-mapped request · 8452fd0c
      Jens Axboe committed
      io_rw_prep_async() goes through a dance of clearing req->io, calling
      the iovec import, then re-setting req->io. Provide an internal helper
      that does the right thing without needing state tweaked to get there.
      
      This enables further cleanups in io_read, io_write, and
      io_resubmit_prep(), but that's left for another time.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 17 Aug, 2020 2 commits
    • io_uring: get rid of kiocb_wait_page_queue_init() · 3b2a4439
      Jens Axboe committed
      The 5.9 merge moved this function into io_uring, which means that we don't
      need to retain the generic nature of it. Clean up this part by removing
      redundant checks, and just inlining the small remainder in
      io_rw_should_retry().
      
      No functional changes in this patch.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: find and cancel head link async work on files exit · b711d4ea
      Jens Axboe committed
      Commit f254ac04 ("io_uring: enable lookup of links holding inflight files")
      only handled two of the three head link cases we have; we also need to
      look up and cancel work that is blocked in io-wq if that work has a link
      that's holding a reference to the files structure.
      
      Put the "cancel head links that hold this request pending" logic into
      io_attempt_cancel(), which will go through the motions of finding and
      canceling head links that hold the current inflight files stable request
      pending.
      
      Cc: stable@vger.kernel.org
      Reported-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 16 Aug, 2020 2 commits
    • io_uring: short circuit -EAGAIN for blocking read attempt · f91daf56
      Jens Axboe committed
      One case was missed in the short IO retry handling, and that's hitting
      -EAGAIN on a blocking read attempt (e.g. from io-wq context). This is a
      problem on sockets that are marked as non-blocking when created: they
      don't carry any REQ_F_NOWAIT information to help us terminate them
      instead of perpetually retrying.
      
      Fixes: 227c0c96 ("io_uring: internally retry short reads")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
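The situation this commit addresses can be reproduced in user space: a descriptor opened as non-blocking returns EAGAIN even when the caller is prepared to block, so a loop that retries on EAGAIN would spin forever. The sketch below is an analogue, not the kernel fix; `read_once_or_fail()` is a hypothetical helper that treats EAGAIN as a terminal result.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: one read attempt made from a context that is
 * allowed to block. Returns the number of bytes read, or a negative
 * errno value.
 */
ssize_t read_once_or_fail(int fd, void *buf, size_t len)
{
    ssize_t n;

    /* Only restart when a signal interrupted the call. */
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -errno; /* includes -EAGAIN: report it, do not loop on it */

    return n;
}
```

Called on an empty pipe whose read end has `O_NONBLOCK` set, the helper returns `-EAGAIN` immediately instead of retrying.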
    • io_uring: sanitize double poll handling · d4e7cd36
      Jens Axboe committed
      There's a bit of confusion on the matching pairs of poll vs double poll,
      depending on if the request is a pure poll (IORING_OP_POLL_ADD) or
      poll driven retry.
      
      Add io_poll_get_double() that returns the double poll waitqueue, if any,
      and io_poll_get_single() that returns the original poll waitqueue. With
      that, remove the argument to io_poll_remove_double().
      
      Finally ensure that wait->private is cleared once the double poll handler
      has run, so that remove knows it's already been seen.
      
      Cc: stable@vger.kernel.org # v5.8
      Reported-by: syzbot+7f617d4a9369028b8a2c@syzkaller.appspotmail.com
      Fixes: 18bceab1 ("io_uring: allow POLL_ADD with double poll_wait() users")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 15 Aug, 2020 2 commits
  6. 14 Aug, 2020 3 commits
    • SMB3: Fix mkdir when idsfromsid configured on mount · c8c412f9
      Steve French committed
      mkdir uses a compounded create operation which was not setting
      the security descriptor on create of a directory. Fix so
      mkdir now sets the mode and owner info properly when idsfromsid
      and modefromsid are configured on the mount.
      Signed-off-by: Steve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org> # v5.8
      Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
    • io_uring: internally retry short reads · 227c0c96
      Jens Axboe committed
      We've had a few application cases of not handling short reads properly,
      and it is understandable as short reads aren't really expected if the
      application isn't doing non-blocking IO.
      
      Now that we retain the iov_iter over retries, we can implement internal
      retry pretty trivially. This ensures that we don't return a short read,
      even for buffered reads on page cache conflicts.
      
      Clean up the deep nesting and hard-to-read nature of io_read() as well;
      it's much more straightforward now to read and understand. Added a
      few comments explaining the logic as well.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
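For context, the loop below is roughly what an application has to write when short reads are not retried for it. It is a user-space sketch, not the kernel implementation: `read_full()` is a hypothetical helper built on POSIX `read()`, and it keeps its own offset much as the kernel retains the iov_iter between attempts.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read() until `len` bytes have
 * arrived, end-of-file is reached, or a real error occurs.
 * Returns the number of bytes read, or -1 with errno set.
 */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted by a signal, try again */
            return -1;     /* real error, errno was set by read() */
        }
        if (n == 0)
            break;         /* end of file: return what we have */

        done += (size_t)n; /* short read: advance and retry */
    }
    return (ssize_t)done;
}
```

Applications that skipped a loop like this are the ones the commit message refers to as not handling short reads properly.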
    • io_uring: retain iov_iter state over io_read/io_write calls · ff6165b2
      Jens Axboe committed
      Instead of maintaining (and setting/remembering) iov_iter size and
      segment counts, just put the iov_iter in the async part of the IO
      structure.
      
      This is mostly a preparation patch for doing appropriate internal retries
      for short reads, but it also cleans up the state handling nicely and
      simplifies it quite a bit.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  7. 13 Aug, 2020 25 commits
  8. 12 Aug, 2020 1 commit
    • NFS: Fix flexfiles read failover · 563c53e7
      Trond Myklebust committed
      The current mirrored read failover code is correctly resetting the mirror
      index between failed reads, however it is not able to actually flip the
      RPC call over to the next RPC client.
      The end result is that we keep resending the RPC call to the same client
      over and over.
      
      The fix is to use the pnfs_read_resend_pnfs() mechanism to schedule a
      new RPC call, but we need to add the ability to pass in a mirror
      index so that we always retry the next mirror in the list.
      
      Fixes: 166bd5b8 ("pNFS/flexfiles: Fix layoutstats handling during read failovers")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
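The failover idea in this commit, always moving on to the next mirror rather than resending to the same one, can be sketched in plain C. Everything here is hypothetical and unrelated to the real pNFS data structures: `read_with_failover()`, the `mirror_read_fn` callback type, and the `fail_below()` stub exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success, nonzero on failure. */
typedef int (*mirror_read_fn)(size_t mirror_idx, void *ctx);

/*
 * Hypothetical helper: try each mirror in turn, starting at `start`.
 * Returns the index of the mirror that served the read, or -1 if every
 * remaining mirror failed.
 */
int read_with_failover(size_t start, size_t mirror_count,
                       mirror_read_fn do_read, void *ctx)
{
    for (size_t idx = start; idx < mirror_count; idx++) {
        if (do_read(idx, ctx) == 0)
            return (int)idx;
        /* Failure: the next attempt is issued to idx + 1, never to idx. */
    }
    return -1;
}

/* Hypothetical stub: every mirror below *ctx fails, the rest succeed. */
int fail_below(size_t mirror_idx, void *ctx)
{
    size_t first_good = *(size_t *)ctx;

    return mirror_idx < first_good ? -1 : 0;
}
```

Passing the index explicitly into each attempt is the part that mirrors the commit: the retry is bound to a specific next mirror instead of being reissued to whichever client handled the failed call.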