1. 04 2月, 2020 5 次提交
    • Z
      io_uring: fix the sequence comparison in io_sequence_defer · 170a1188
      Zhengyuan Liu 提交于
      commit dbd0f6d6c2a11eb9c31ca9cd454f95bb5713e92e upstream.
      
      sq->cached_sq_head and cq->cached_cq_tail are both unsigned int. If
      cached_sq_head overflows before cached_cq_tail, then we may miss a
      barrier req. As cached_cq_tail always follows cached_sq_head, the NQ
      should be enough.
      
      Cc: stable@vger.kernel.org
      Fixes: de0617e46717 ("io_uring: add support for marking commands as draining")
      Signed-off-by: NZhengyuan Liu <liuzhengyuan@kylinos.cn>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      170a1188
    • J
      io_uring: fix io_sq_thread_stop running in front of io_sq_thread · db4c235e
      Jackie Liu 提交于
      commit a4c0b3decb33fb4a2b5ecc6234a50680f0b21e7d upstream.
      
      INFO: task syz-executor.5:8634 blocked for more than 143 seconds.
             Not tainted 5.2.0-rc5+ #3
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor.5  D25632  8634   8224 0x00004004
      Call Trace:
        context_switch kernel/sched/core.c:2818 [inline]
        __schedule+0x658/0x9e0 kernel/sched/core.c:3445
        schedule+0x131/0x1d0 kernel/sched/core.c:3509
        schedule_timeout+0x9a/0x2b0 kernel/time/timer.c:1783
        do_wait_for_common+0x35e/0x5a0 kernel/sched/completion.c:83
        __wait_for_common kernel/sched/completion.c:104 [inline]
        wait_for_common kernel/sched/completion.c:115 [inline]
        wait_for_completion+0x47/0x60 kernel/sched/completion.c:136
        kthread_stop+0xb4/0x150 kernel/kthread.c:559
        io_sq_thread_stop fs/io_uring.c:2252 [inline]
        io_finish_async fs/io_uring.c:2259 [inline]
        io_ring_ctx_free fs/io_uring.c:2770 [inline]
        io_ring_ctx_wait_and_kill+0x268/0x880 fs/io_uring.c:2834
        io_uring_release+0x5d/0x70 fs/io_uring.c:2842
        __fput+0x2e4/0x740 fs/file_table.c:280
        ____fput+0x15/0x20 fs/file_table.c:313
        task_work_run+0x17e/0x1b0 kernel/task_work.c:113
        tracehook_notify_resume include/linux/tracehook.h:185 [inline]
        exit_to_usermode_loop arch/x86/entry/common.c:168 [inline]
        prepare_exit_to_usermode+0x402/0x4f0 arch/x86/entry/common.c:199
        syscall_return_slowpath+0x110/0x440 arch/x86/entry/common.c:279
        do_syscall_64+0x126/0x140 arch/x86/entry/common.c:304
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x412fb1
      Code: 80 3b 7c 0f 84 c7 02 00 00 c7 85 d0 00 00 00 00 00 00 00 48 8b 05 cf
      a6 24 00 49 8b 14 24 41 b9 cb 2a 44 00 48 89 ee 48 89 df <48> 85 c0 4c 0f
      45 c8 45 31 c0 31 c9 e8 0e 5b 00 00 85 c0 41 89 c7
      RSP: 002b:00007ffe7ee6a180 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
      RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000412fb1
      RDX: 0000001b2d920000 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 0000000000000001 R08: 00000000f3a3e1f8 R09: 00000000f3a3e1fc
      R10: 00007ffe7ee6a260 R11: 0000000000000293 R12: 000000000075c9a0
      R13: 000000000075c9a0 R14: 0000000000024c00 R15: 000000000075bf2c
      
      =============================================
      
      There is an wrong logic, when kthread_park running
      in front of io_sq_thread.
      
      CPU#0					CPU#1
      
      io_sq_thread_stop:			int kthread(void *_create):
      
      kthread_park()
      					__kthread_parkme(self);	 <<< Wrong
      kthread_stop()
          << wait for self->exited
          << clear_bit KTHREAD_SHOULD_PARK
      
      					ret = threadfn(data);
      					   |
      					   |- io_sq_thread
      					       |- kthread_should_park()	<< false
      					       |- schedule() <<< nobody wake up
      
      stuck CPU#0				stuck CPU#1
      
      So, use a new variable sqo_thread_started to ensure that io_sq_thread
      run first, then io_sq_thread_stop.
      
      Reported-by: syzbot+94324416c485d422fe15@syzkaller.appspotmail.com
      Suggested-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NJackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      db4c235e
    • J
      io_uring: add support for recvmsg() · 7cabfcb1
      Jens Axboe 提交于
      commit aa1fa28fc73ea6b740ee7b62bf3b07141883dbb8 upstream.
      
      This is done through IORING_OP_RECVMSG. This opcode uses the same
      sqe->msg_flags that IORING_OP_SENDMSG added, and we pass in the
      msghdr struct in the sqe->addr field as well.
      
      We use MSG_DONTWAIT to force an inline fast path if recvmsg() doesn't
      block, and punt to async execution if it would have.
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      7cabfcb1
    • J
      io_uring: add support for sendmsg() · ce630b99
      Jens Axboe 提交于
      commit 0fa03c624d8fc9932d0f27c39a9deca6a37e0e17 upstream.
      
      This is done through IORING_OP_SENDMSG. There's a new sqe->msg_flags
      for the flags argument, and the msghdr struct is passed in the
      sqe->addr field.
      
      We use MSG_DONTWAIT to force an inline fast path if sendmsg() doesn't
      block, and punt to async execution if it would have.
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      ce630b99
    • C
      block: never take page references for ITER_BVEC · 3f9da4d9
      Christoph Hellwig 提交于
      Cherry-pick from commit b620743077e291ae7d0debd21f50413a8c266229 upstream.
      
      If we pass pages through an iov_iter we always already have a reference
      in the caller.  Thus remove the ITER_BVEC_FLAG_NO_REF and don't take
      reference to pages by default for bvec backed iov_iters.
      
      [Joseph] Resolve conflicts since we don't have:
      81ba6abd2bcd "block: loop: mark bvec as ITER_BVEC_FLAG_NO_REF"
      7321ecbfc7cf "block: change how we get page references in bio_iov_iter_get_pages"
      Reviewed-by: NMinwoo Im <minwoo.im.dev@gmail.com>
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NJoseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: NXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      3f9da4d9
  2. 03 2月, 2020 24 次提交
  3. 19 1月, 2020 10 次提交
  4. 17 1月, 2020 1 次提交
    • X
      alinux: mm: kidled: fix frame-larger-than build warning · 042cecca
      Xu Yu 提交于
      This fix the following build warning:
      
      mm/memcontrol.c: In function 'mem_cgroup_idle_page_stats_show':
      mm/memcontrol.c:3866:1: warning: the frame size of 2160 bytes is larger than 2048 bytes [-Wframe-larger-than=]
      
      The root cause is that "mem_cgroup_idle_page_stats_show" has two
      "struct idle_page_stats" variables, each of which is 1056 bytes in
      size, on the stack, thus exceeding the 2048 max frame size.
      
      This fix the build warning by dynamically allocating memory to these two
      variables with kmalloc.
      
      Fixes: f55ac551 ("alinux: mm: Support kidled")
      Signed-off-by: NXu Yu <xuyu@linux.alibaba.com>
      Reviewed-by: NXunlei Pang <xlpang@linux.alibaba.com>
      042cecca