1. 28 Oct, 2019 2 commits
  2. 26 Oct, 2019 2 commits
  3. 25 Oct, 2019 3 commits
  4. 24 Oct, 2019 3 commits
  5. 18 Oct, 2019 2 commits
  6. 15 Oct, 2019 1 commit
    • io_uring: consider the overflow of sequence for timeout req · 5da0fb1a
      yangerkun committed
      Currently we compute the sequence of a timeout as 'req->sequence =
      ctx->cached_sq_head + count - 1', and find the right place to insert
      it into timeout_list by comparing the number of requests each timeout
      still expects to complete. But this does not account for overflow:
      
      1. ctx->cached_sq_head + count - 1 may overflow, so a new timeout req
      with a bigger count can end up with a smaller req->sequence.
      
      2. cached_sq_head may have overflowed since an earlier req was
      queued, which also gives the new timeout req a smaller req->sequence.
      
      Either overflow misorders timeout_list, which can lead to timeouts
      completing in the wrong order. Fix it by reusing
      req->submit.sequence to store the count, and changing the
      insertion-sort logic in io_timeout.
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
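The wraparound described in this commit can be reproduced in user space with plain 32-bit unsigned arithmetic. The following is a minimal sketch, not the kernel's code: it assumes `cached_sq_head` and the sequence are 32-bit unsigned values and that `count` is at least 1, and the helper names `old_sequence`, `naive_before`, and `wide_sequence` are hypothetical. It shows how comparing raw sequences misorders two timeouts once the sum wraps, and how widening the sum to 64 bits keeps the order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sequence as described for the old code: head + count - 1, modulo 2^32. */
static uint32_t old_sequence(uint32_t cached_sq_head, uint32_t count)
{
    return cached_sq_head + count - 1;
}

/* Naive ordering on raw 32-bit sequences; breaks when one of them wrapped. */
static bool naive_before(uint32_t seq_a, uint32_t seq_b)
{
    return seq_a < seq_b;
}

/*
 * Illustrative alternative: do the addition in 64 bits so the sum cannot
 * wrap. Assumes count >= 1. This is a sketch of the idea only, not the
 * logic the commit added to io_timeout.
 */
static uint64_t wide_sequence(uint32_t cached_sq_head, uint32_t count)
{
    return (uint64_t)cached_sq_head + count - 1;
}
```

For example, with a head of 0xFFFFFFF0, counts of 8 and 32 give old sequences 0xFFFFFFF7 and 0x0000000F, so the timeout waiting for more completions would sort first under `naive_before`, while `wide_sequence` orders them correctly.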
  7. 11 Oct, 2019 1 commit
  8. 10 Oct, 2019 1 commit
  9. 08 Oct, 2019 1 commit
  10. 04 Oct, 2019 1 commit
  11. 01 Oct, 2019 1 commit
    • io_uring: use __kernel_timespec in timeout ABI · bdf20073
      Arnd Bergmann committed
      All system calls use struct __kernel_timespec instead of the old struct
      timespec, but this one was just added with the old-style ABI. Change it
      now to enforce the use of __kernel_timespec, avoiding ABI confusion and
      the need for compat handlers on 32-bit architectures.
      
      Any user space caller will have to use __kernel_timespec now, but this
      is unambiguous and works for any C library regardless of the time_t
      definition. A nicer way to specify the timeout would have been a less
      ambiguous 64-bit nanosecond value, but I suppose it's too late now to
      change that as this would impact both 32-bit and 64-bit users.
      
      Fixes: 5262f567 ("io_uring: IORING_OP_TIMEOUT support")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
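The layout argument in this commit can be illustrated with a local mirror of the structure. This is a sketch under stated assumptions: `struct ktimespec_sketch` is a hypothetical stand-in that copies the shape of `struct __kernel_timespec` (a 64-bit `tv_sec` followed by a `long long` `tv_nsec`), and `ns_to_ktimespec` is a hypothetical helper, not a kernel or liburing function. Because both fields have a fixed width, the layout does not depend on how a C library defines `time_t`.

```c
#include <stdint.h>

/* Hypothetical mirror of struct __kernel_timespec: fixed-width fields. */
struct ktimespec_sketch {
    int64_t   tv_sec;   /* seconds */
    long long tv_nsec;  /* nanoseconds */
};

/* Split a 64-bit nanosecond count into seconds and leftover nanoseconds. */
static struct ktimespec_sketch ns_to_ktimespec(uint64_t ns)
{
    struct ktimespec_sketch ts;

    ts.tv_sec = (int64_t)(ns / 1000000000ULL);
    ts.tv_nsec = (long long)(ns % 1000000000ULL);
    return ts;
}
```

A caller that thinks in nanoseconds, as the commit message suggests would have been nicer, can convert once and pass the resulting structure along.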
  12. 26 Sep, 2019 1 commit
  13. 25 Sep, 2019 1 commit
  14. 24 Sep, 2019 2 commits
  15. 19 Sep, 2019 6 commits
    • io_uring: IORING_OP_TIMEOUT support · 5262f567
      Jens Axboe committed
      There's been a few requests for functionality similar to io_getevents()
      and epoll_wait(), where the user can specify a timeout for waiting on
      events. I deliberately did not add support for this through the system
      call initially to avoid overloading the args, but I can see that the use
      cases for this are valid.
      
      This adds support for IORING_OP_TIMEOUT. If a user wants to get woken
      when waiting for events, simply submit one of these timeout commands
      with your wait call (or before). This ensures that the application
      sleeping on the CQ ring waiting for events will get woken. The timeout
      command is passed in as a pointer to a struct timespec. Timeouts are
      relative. The timeout command also includes a way to auto-cancel after
      N events have passed.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: use cond_resched() in sqthread · 9831a90c
      Jens Axboe committed
      If preempt isn't enabled in the kernel, we can run into hang issues with
      sqthread submissions. Use cond_resched() to play nice instead of
      cpu_relax(), if we end up starting the loop and not having any events
      pending for submissions.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix potential crash issue due to io_get_req failure · a1041c27
      Jackie Liu committed
      Sometimes io_get_req will return NULL; we then need to do the
      correct error handling, otherwise it will cause a kernel NULL
      pointer dereference.
      
      Fixes: 4fe2c963 ("io_uring: add support for link with drain")
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: ensure poll commands clear ->sqe · 6cc47d1d
      Jens Axboe committed
      If we end up getting woken in poll (due to a signal), then we may need
      to punt the poll request to an async worker. When we do that, we look up
      the list to queue at, dereferencing req->submit.sqe, however that is
      only set for requests we initially decided to queue async.
      
      This fixes a crash with poll command usage and wakeups that need to punt
      to async context.
      
      Fixes: 54a91f3b ("io_uring: limit parallelism of buffered writes")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix use-after-free of shadow_req · 5f5ad9ce
      Jackie Liu committed
      There is a potential dangling pointer problem: shadow_req is never
      reset. If a series of sqes contains multiple link lists, shadow_req
      is not reallocated and the previous one keeps being used, even
      though its memory has already been released, leaving a dangling
      pointer. Clear it so that every new link list allocates a fresh
      shadow_req.
      
      Fixes: 4fe2c963 ("io_uring: add support for link with drain")
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: use kmemdup instead of kmalloc and memcpy · 954dab19
      Jackie Liu committed
      Just a code cleanup, no functional changes.
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
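For readers unfamiliar with the helper named in this commit title, the pattern it replaces can be sketched in user space. This is only an analogy: `memdup_sketch` is a hypothetical name, it uses `malloc` where the kernel helper allocates with `kmalloc`, and it omits the allocation-flags argument that the kernel's `kmemdup(src, len, gfp)` takes.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for kmemdup(): allocate len bytes and
 * copy src into them in one step. Returns NULL if the allocation fails.
 */
static void *memdup_sketch(const void *src, size_t len)
{
    void *p = malloc(len);

    if (p)
        memcpy(p, src, len);
    return p;
}
```

Folding the allocation and the copy into one call removes a place where a caller could copy into a buffer without first checking that the allocation succeeded.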
  16. 15 Sep, 2019 1 commit
  17. 13 Sep, 2019 2 commits
  18. 10 Sep, 2019 5 commits
    • io_uring: limit parallelism of buffered writes · 54a91f3b
      Jens Axboe committed
      All the popular filesystems need to grab the inode lock for buffered
      writes. With io_uring punting buffered writes to async context, we
      observe a lot of contention with all workers hammering this mutex.
      
      For buffered writes, we generally don't need a lot of parallelism on
      the submission side, as the flushing will take care of that for us.
      Hence we don't need a deep queue on the write side, as long as we
      can safely punt from the original submission context.
      
      Add a workqueue with a limit of 2 that we can use for buffered writes.
      This greatly improves the performance and efficiency of higher queue
      depth buffered async writes with io_uring.
      Reported-by: Andres Freund <andres@anarazel.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add io_queue_async_work() helper · 18d9be1a
      Jens Axboe committed
      Add a helper for queueing a request for async execution, in preparation
      for optimizing it.
      
      No functional change in this patch.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: optimize submit_and_wait API · c5766668
      Jens Axboe committed
      For some applications that end up using a submit-and-wait type of
      approach for certain batches of IO, we can make that a bit more
      efficient by allowing the application to block for the last IO
      submission. This prevents an async punt when we don't need it, as the
      application will be blocking for the completion event(s) anyway.
      
      Typical use cases are using the liburing
      io_uring_submit_and_wait() API, or just using io_uring_enter()
      doing both submissions and completions. As a specific example,
      RocksDB doing MultiGet() is sped up quite a bit with this
      change.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for link with drain · 4fe2c963
      Jackie Liu committed
      To support link with drain, we need to do two things.
      
      Consider the following sqes:
      
          0     1     2     3     4     5     6
       +-----+-----+-----+-----+-----+-----+-----+
       |  N  |  L  |  L  | L+D |  N  |  N  |  N  |
       +-----+-----+-----+-----+-----+-----+-----+
      
      First, we need to ensure that the IO before the link has completed.
      An easy way is to set the drain flag on the link list's head, so
      that all subsequent IO is inserted into the defer_list.
      
      	+-----+
          (0) |  N  |
      	+-----+
                 |          (2)         (3)         (4)
      	+-----+     +-----+     +-----+     +-----+
          (1) | L+D | --> |  L  | --> | L+D | --> |  N  |
      	+-----+     +-----+     +-----+     +-----+
                 |
      	+-----+
          (5) |  N  |
      	+-----+
                 |
      	+-----+
          (6) |  N  |
      	+-----+
      
      Second, we need to ensure that the following IO does not complete
      first. An easy way is to create a mirror of the drain IO and insert
      it into the defer_list; as long as the drain IO has not been
      processed, the following IO in the defer_list will not be processed
      either.
      
      	+-----+
          (0) |  N  |
      	+-----+
                 |          (2)         (3)         (4)
      	+-----+     +-----+     +-----+     +-----+
          (1) | L+D | --> |  L  | --> | L+D | --> |  N  |
      	+-----+     +-----+     +-----+     +-----+
                 |
      	+-----+
         ('3) |  D  |   <== This is a shadow of (3)
      	+-----+
                 |
      	+-----+
          (5) |  N  |
      	+-----+
                 |
      	+-----+
          (6) |  N  |
      	+-----+
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix wrong sequence setting logic · 8776f3fa
      Jackie Liu committed
      Sqo_thread fetches sqes from the sqring in batches, which causes
      ctx->cached_sq_head to be advanced in batches. If one of these
      sqes has the DRAIN flag set, it will never get a chance to be
      processed, and sqo_thread will never exit.
      
      Fixes: de0617e4 ("io_uring: add support for marking commands as draining")
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  19. 07 Sep, 2019 1 commit
    • io_uring: expose single mmap capability · ac90f249
      Jens Axboe committed
      After commit 75b28aff we can get by with just a single mmap to
      map both the sq and cq ring. However, userspace doesn't know that.
      
      Add a features variable to io_uring_params, and notify userspace
      that the kernel has this ability. This can then be used in liburing
      (or in applications directly) to avoid the second mmap.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
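The feature check this commit enables on the user-space side can be sketched as follows. The sketch makes two assumptions: `FEAT_SINGLE_MMAP_SKETCH` is a local constant meant to mirror the `IORING_FEAT_SINGLE_MMAP` bit from `<linux/io_uring.h>`, redefined here so the example builds without kernel headers, and `ring_mmaps_needed` is a hypothetical helper. In a real program the `features` value would come from the `io_uring_params` structure filled in during ring setup.

```c
#include <stdint.h>

/* Local mirror of the IORING_FEAT_SINGLE_MMAP bit (an assumption here). */
#define FEAT_SINGLE_MMAP_SKETCH (1U << 0)

/*
 * Number of mmap() calls needed to map the SQ and CQ rings: one if the
 * kernel advertises the single-mmap feature, two otherwise. The mapping
 * of the SQE array is separate and not counted here.
 */
static int ring_mmaps_needed(uint32_t features)
{
    return (features & FEAT_SINGLE_MMAP_SKETCH) ? 1 : 2;
}
```

Kernels that predate the features field leave it at zero, so the helper falls back to two ring mappings there.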
  20. 28 Aug, 2019 2 commits
  21. 23 Aug, 2019 1 commit