1. 13 9月, 2019 1 次提交
    • J
      io_uring: extend async work merging · 6d5d5ac5
      Jens Axboe 提交于
      We currently merge async work items if we see a strict sequential hit.
      This helps avoid unnecessary workqueue switches when we don't need
      them. We can extend this merging to cover cases where it's not a strict
      sequential hit, but the IO still fits within the same page. If an
      application is doing multiple requests within the same page, we don't
      want separate workers waiting on the same page to complete IO. It's much
      faster to let the first worker bring in the page, then operate on that
      page from the same worker to complete the next request(s).
      Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      6d5d5ac5
  2. 10 9月, 2019 5 次提交
    • J
      io_uring: limit parallelism of buffered writes · 54a91f3b
      Jens Axboe 提交于
      All the popular filesystems need to grab the inode lock for buffered
      writes. With io_uring punting buffered writes to async context, we
      observe a lot of contention with all workers hamming this mutex.
      
      For buffered writes, we generally don't need a lot of parallelism on
      the submission side, as the flushing will take care of that for us.
      Hence we don't need a deep queue on the write side, as long as we
      can safely punt from the original submission context.
      
      Add a workqueue with a limit of 2 that we can use for buffered writes.
      This greatly improves the performance and efficiency of higher queue
      depth buffered async writes with io_uring.
      Reported-by: NAndres Freund <andres@anarazel.de>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      54a91f3b
    • J
      io_uring: add io_queue_async_work() helper · 18d9be1a
      Jens Axboe 提交于
      Add a helper for queueing a request for async execution, in preparation
      for optimizing it.
      
      No functional change in this patch.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      18d9be1a
    • J
      io_uring: optimize submit_and_wait API · c5766668
      Jens Axboe 提交于
      For some applications that end up using a submit-and-wait type of
      approach for certain batches of IO, we can make that a bit more
      efficient by allowing the application to block for the last IO
      submission. This prevents an async when we don't need it, as the
      application will be blocking for the completion event(s) anyway.
      
      Typical use cases are using the liburing
      io_uring_submit_and_wait() API, or just using io_uring_enter()
      doing both submissions and completions. As a specific example,
      RocksDB doing MultiGet() is sped up quite a bit with this
      change.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      c5766668
    • J
      io_uring: add support for link with drain · 4fe2c963
      Jackie Liu 提交于
      To support the link with drain, we need to do two parts.
      
      There is an sqes:
      
          0     1     2     3     4     5     6
       +-----+-----+-----+-----+-----+-----+-----+
       |  N  |  L  |  L  | L+D |  N  |  N  |  N  |
       +-----+-----+-----+-----+-----+-----+-----+
      
      First, we need to ensure that the io before the link is completed,
      there is a easy way is set drain flag to the link list's head, so
      all subsequent io will be inserted into the defer_list.
      
      	+-----+
          (0) |  N  |
      	+-----+
                 |          (2)         (3)         (4)
      	+-----+     +-----+     +-----+     +-----+
          (1) | L+D | --> |  L  | --> | L+D | --> |  N  |
      	+-----+     +-----+     +-----+     +-----+
                 |
      	+-----+
          (5) |  N  |
      	+-----+
                 |
      	+-----+
          (6) |  N  |
      	+-----+
      
      Second, ensure that the following IO will not be completed first,
      an easy way is to create a mirror of drain io and insert it into
      defer_list, in this way, as long as drain io is not processed, the
      following io in the defer_list will not be actively process.
      
      	+-----+
          (0) |  N  |
      	+-----+
                 |          (2)         (3)         (4)
      	+-----+     +-----+     +-----+     +-----+
          (1) | L+D | --> |  L  | --> | L+D | --> |  N  |
      	+-----+     +-----+     +-----+     +-----+
                 |
      	+-----+
         ('3) |  D  |   <== This is a shadow of (3)
      	+-----+
                 |
      	+-----+
          (5) |  N  |
      	+-----+
                 |
      	+-----+
          (6) |  N  |
      	+-----+
      Signed-off-by: NJackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      4fe2c963
    • J
      io_uring: fix wrong sequence setting logic · 8776f3fa
      Jackie Liu 提交于
      Sqo_thread will get sqring in batches, which will cause
      ctx->cached_sq_head to be added in batches. if one of these
      sqes is set with the DRAIN flag, then he will never get a
      chance to process, and finally sqo_thread will not exit.
      
      Fixes: de0617e4 ("io_uring: add support for marking commands as draining")
      Signed-off-by: NJackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      8776f3fa
  3. 07 9月, 2019 1 次提交
    • J
      io_uring: expose single mmap capability · ac90f249
      Jens Axboe 提交于
      After commit 75b28aff we can get by with just a single mmap to
      map both the sq and cq ring. However, userspace doesn't know that.
      
      Add a features variable to io_uring_params, and notify userspace
      that the kernel has this ability. This can then be used in liburing
      (or in applications directly) to avoid the second mmap.
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      ac90f249
  4. 28 8月, 2019 2 次提交
  5. 26 8月, 2019 10 次提交
  6. 25 8月, 2019 18 次提交
  7. 24 8月, 2019 3 次提交
    • W
      Merge tag 'kvmarm-fixes-for-5.3-3' of... · 087eeea9
      Will Deacon 提交于
      Merge tag 'kvmarm-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm/fixes
      
      Pull KVM/arm fixes from Marc Zyngier as per Paulo's request at:
      
        https://lkml.kernel.org/r/21ae69a2-2546-29d0-bff6-2ea825e3d968@redhat.com
      
        "One (hopefully last) set of fixes for KVM/arm for 5.3: an embarassing
         MMIO emulation regression, and a UBSAN splat. Oh well...
      
         - Don't overskip instructions on MMIO emulation
      
         - Fix UBSAN splat when initializing PPI priorities"
      
      * tag 'kvmarm-fixes-for-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
        KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
        KVM: arm/arm64: Only skip MMIO insn once
      087eeea9
    • D
      drm/mediatek: include dma-mapping header · 7837951a
      Dave Airlie 提交于
      Although it builds fine here in my arm cross compile, it seems
      either via some other patches in -next or some Kconfig combination,
      this fails to build for everyone.
      
      Include linux/dma-mapping.h should fix it.
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      7837951a
    • L
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 9140d8bd
      Linus Torvalds 提交于
      Pull rdma fixes from Doug Ledford:
       "No beating around the bush: this is a monster pull request for an -rc5
        kernel. Intel hit me with a series of fixes for TID processing.
        Mellanox hit me with a series for their UMR memory support.
      
        And we had one fix for siw that fixes the 32bit build warnings and
        because of the number of casts that had to be changed to properly
        silence the warnings, that one patch alone is a full 40% of the LOC of
        this entire pull request. Given that this is the initial release
        kernel for siw, I'm trying to fix anything in it that we can, so that
        adds to the impetus to take fixes for it like this one.
      
        I had to do a rebase early in the week. Jason had thought he put a
        patch on the rc queue that he needed to be there so he could base some
        work off of it, and it had actually not been placed there. So he asked
        me (on Tuesday) to fix that up before pushing my wip branch to the
        official rc branch. I did, and that's why the early patches look like
        they were all committed at the same time on Tuesday. That bunch had
        been in my queue prior.
      
        The various patches all pass my test for being legitimate fixes and
        not attempts to slide new features or development into a late rc.
        Well, they were all fixes with the exception of a couple clean up
        patches people wrote for making the fixes they also wrote better (like
        a cleanup patch to move UMR checking into a function so that the
        remaining UMR fix patches can reference that function), so I left
        those in place too.
      
        My apologies for the LOC count and the number of patches here, it's
        just how the cards fell this cycle.
      
        Summary:
      
         - Fix siw buffer mapping issue
      
         - Fix siw 32/64 casting issues
      
         - Fix a KASAN access issue in bnxt_re
      
         - Fix several memory leaks (hfi1, mlx4)
      
         - Fix a NULL deref in cma_cleanup
      
         - Fixes for UMR memory support in mlx5 (4 patch series)
      
         - Fix namespace check for restrack
      
         - Fixes for counter support
      
         - Fixes for hfi1 TID processing (5 patch series)
      
         - Fix potential NULL deref in siw
      
         - Fix memory page calculations in mlx5"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (21 commits)
        RDMA/siw: Fix 64/32bit pointer inconsistency
        RDMA/siw: Fix SGL mapping issues
        RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
        infiniband: hfi1: fix memory leaks
        infiniband: hfi1: fix a memory leak bug
        IB/mlx4: Fix memory leaks
        RDMA/cma: fix null-ptr-deref Read in cma_cleanup
        IB/mlx5: Block MR WR if UMR is not possible
        IB/mlx5: Fix MR re-registration flow to use UMR properly
        IB/mlx5: Report and handle ODP support properly
        IB/mlx5: Consolidate use_umr checks into single function
        RDMA/restrack: Rewrite PID namespace check to be reliable
        RDMA/counters: Properly implement PID checks
        IB/core: Fix NULL pointer dereference when bind QP to counter
        IB/hfi1: Drop stale TID RDMA packets that cause TIDErr
        IB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet
        IB/hfi1: Add additional checks when handling TID RDMA READ RESP packet
        IB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet
        IB/hfi1: Drop stale TID RDMA packets
        RDMA/siw: Fix potential NULL de-ref
        ...
      9140d8bd