1. 19 March 2022, 2 commits
2. 18 March 2022, 1 commit
  • io_uring: manage provided buffers strictly ordered · dbc7d452
    Jens Axboe authored
    Workloads using provided buffers benefit from using and returning buffers
    in the right order, and so do TLBs for that matter. Manage the internal
    buffer list as a straight list, rather than using the head buffer as the
    insertion node. Use a hashed list for the buffer group IDs instead of an
    xarray; the overhead is much lower this way. xarray provides internal
    locking and other trickery that is handy for some use cases, but
    io_uring already locks internally for the buffer manipulation and needs
    none of that.

    This is good for about a 2% reduction in overhead, a combination of the
    improved management and the fact that the workload has an easier time
    bundling back provided buffers.
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
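    A minimal userspace sketch of the data-structure change described above, not the
    io_uring code itself: all names here (demo_buf, demo_buf_group, find_group,
    NR_BGID_BUCKETS) are invented for the example. Buffer groups hang off a small
    hash-bucket array instead of an xarray, and each group keeps its provided
    buffers in a plain FIFO list so they are handed out and returned in order;
    locking is left to the caller, mirroring the point that io_uring already
    serializes buffer manipulation itself.

    #include <stddef.h>
    #include <stdint.h>

    #define NR_BGID_BUCKETS 16      /* illustrative size, not the kernel's */

    struct demo_buf {
        struct demo_buf *next;      /* FIFO linkage within a group */
        void *addr;
        uint32_t len;
    };

    struct demo_buf_group {
        struct demo_buf_group *hash_next;   /* chain within a hash bucket */
        struct demo_buf *head;              /* oldest provided buffer */
        struct demo_buf *tail;              /* newest provided buffer */
        uint16_t bgid;
    };

    static struct demo_buf_group *bgid_buckets[NR_BGID_BUCKETS];

    /* Hash the group ID into a small bucket array and walk the short chain;
     * no internal locking, since the caller is assumed to serialize access. */
    static struct demo_buf_group *find_group(uint16_t bgid)
    {
        struct demo_buf_group *bl;

        for (bl = bgid_buckets[bgid % NR_BGID_BUCKETS]; bl; bl = bl->hash_next)
            if (bl->bgid == bgid)
                return bl;
        return NULL;
    }

    /* Hand out the oldest buffer first, so buffers are consumed in order. */
    static struct demo_buf *take_buf(struct demo_buf_group *bl)
    {
        struct demo_buf *buf = bl->head;

        if (buf) {
            bl->head = buf->next;
            if (!bl->head)
                bl->tail = NULL;
        }
        return buf;
    }

    /* Returned buffers go on the tail, preserving FIFO order. */
    static void return_buf(struct demo_buf_group *bl, struct demo_buf *buf)
    {
        buf->next = NULL;
        if (bl->tail)
            bl->tail->next = buf;
        else
            bl->head = buf;
        bl->tail = buf;
    }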
3. 17 March 2022, 11 commits
4. 16 March 2022, 2 commits
  • io_uring: make tracing format consistent · 052ebf1f
    Dylan Yudaken authored
    Make the tracing formatting for user_data and flags consistent.

    Having consistent formatting allows one, for example, to grep for a
    specific user_data/flags value and trace a single sqe through easily.

    Change user_data to 0x%llx and flags to 0x%x everywhere. The '0x' prefix
    is useful to disambiguate, for example, "user_data 100".

    Additionally, remove the '=' for flags in io_uring_req_failed, again for
    consistency.
    Signed-off-by: Dylan Yudaken <dylany@fb.com>
    Link: https://lore.kernel.org/r/20220316095204.2191498-1-dylany@fb.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
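    A rough, hypothetical illustration of the formatting change, not the actual
    trace event definitions; print_req() is invented for this sketch and only
    contrasts the old ambiguous decimal output with the consistent
    0x%llx / 0x%x form the commit switches to.

    #include <stdio.h>
    #include <stdint.h>

    /* Stand-in for the trace format strings. */
    static void print_req(uint64_t user_data, uint32_t flags)
    {
        /* Old style: plain decimal, where "user_data 100" is ambiguous. */
        printf("user_data %llu, flags %u\n",
               (unsigned long long)user_data, flags);
        /* New style: 0x%llx for user_data and 0x%x for flags everywhere,
         * so a grep for one value matches the same sqe in every event. */
        printf("user_data 0x%llx, flags 0x%x\n",
               (unsigned long long)user_data, flags);
    }

    int main(void)
    {
        print_req(0x100, 0x4);  /* "user_data 0x100" cannot be misread as decimal 100 */
        return 0;
    }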
  • io_uring: recycle apoll_poll entries · 4d9237e3
    Jens Axboe authored
    Particularly for networked workloads, io_uring intensively uses its
    poll-based backend to get a notification when data/space is available.
    Profiling workloads, we see 3-4% of alloc+free directly attributed to
    just the apoll allocation and free (with the rest being skb alloc+free).

    For the fast path, we have ctx->uring_lock held already for both issue
    and the inline completions, and we can utilize that to avoid any extra
    locking needed to have a basic recycling cache for the apoll entries on
    both the alloc and free side.

    Double poll still requires an allocation, but those are rare and not a
    fast-path item.

    With the simple cache in place, we see a 3-4% reduction in overhead for
    the workload.
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
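    A minimal sketch of the recycling idea in plain C, assuming invented names
    (demo_apoll, apoll_get, apoll_put) rather than the io_uring implementation;
    in io_uring the cache would be reached under ctx->uring_lock, which is
    already held on the issue and inline-completion paths, so no locking is
    shown here.

    #include <stdlib.h>

    struct demo_apoll {
        struct demo_apoll *next;    /* free-list linkage while cached */
        /* poll state would live here */
    };

    /* Single free list standing in for a per-ring cache. */
    static struct demo_apoll *apoll_cache;

    /* Alloc side: reuse a cached entry when possible, otherwise fall back
     * to a fresh allocation (as double poll still would). */
    static struct demo_apoll *apoll_get(void)
    {
        struct demo_apoll *apoll = apoll_cache;

        if (apoll) {
            apoll_cache = apoll->next;
            return apoll;
        }
        return malloc(sizeof(*apoll));
    }

    /* Free side: push the entry back onto the cache instead of freeing it. */
    static void apoll_put(struct demo_apoll *apoll)
    {
        apoll->next = apoll_cache;
        apoll_cache = apoll;
    }

    Both paths are O(1) pointer pushes and pops, which is why the cache removes
    the allocator from the hot path while leaving malloc() (kmalloc in the
    kernel) only as a fallback when the cache is empty.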
5. 12 March 2022, 1 commit
6. 11 March 2022, 7 commits
7. 10 March 2022, 16 commits