1. 05 Aug, 2020 1 commit
    • io_uring: Fix NULL pointer dereference in loop_rw_iter() · 2dd2111d
      Committed by Guoyu Huang
      loop_rw_iter() does not check whether the file has a read or
      write function. This can lead to a NULL pointer dereference
      when the user passes in a file descriptor whose file does not
      have a read or write function.
      
      The crash log looks like this:
      
      [   99.834071] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [   99.835364] #PF: supervisor instruction fetch in kernel mode
      [   99.836522] #PF: error_code(0x0010) - not-present page
      [   99.837771] PGD 8000000079d62067 P4D 8000000079d62067 PUD 79d8c067 PMD 0
      [   99.839649] Oops: 0010 [#2] SMP PTI
      [   99.840591] CPU: 1 PID: 333 Comm: io_wqe_worker-0 Tainted: G      D           5.8.0 #2
      [   99.842622] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
      [   99.845140] RIP: 0010:0x0
      [   99.845840] Code: Bad RIP value.
      [   99.846672] RSP: 0018:ffffa1c7c01ebc08 EFLAGS: 00010202
      [   99.848018] RAX: 0000000000000000 RBX: ffff92363bd67300 RCX: ffff92363d461208
      [   99.849854] RDX: 0000000000000010 RSI: 00007ffdbf696bb0 RDI: ffff92363bd67300
      [   99.851743] RBP: ffffa1c7c01ebc40 R08: 0000000000000000 R09: 0000000000000000
      [   99.853394] R10: ffffffff9ec692a0 R11: 0000000000000000 R12: 0000000000000010
      [   99.855148] R13: 0000000000000000 R14: ffff92363d461208 R15: ffffa1c7c01ebc68
      [   99.856914] FS:  0000000000000000(0000) GS:ffff92363dd00000(0000) knlGS:0000000000000000
      [   99.858651] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   99.860032] CR2: ffffffffffffffd6 CR3: 000000007ac66000 CR4: 00000000000006e0
      [   99.861979] Call Trace:
      [   99.862617]  loop_rw_iter.part.0+0xad/0x110
      [   99.863838]  io_write+0x2ae/0x380
      [   99.864644]  ? kvm_sched_clock_read+0x11/0x20
      [   99.865595]  ? sched_clock+0x9/0x10
      [   99.866453]  ? sched_clock_cpu+0x11/0xb0
      [   99.867326]  ? newidle_balance+0x1d4/0x3c0
      [   99.868283]  io_issue_sqe+0xd8f/0x1340
      [   99.869216]  ? __switch_to+0x7f/0x450
      [   99.870280]  ? __switch_to_asm+0x42/0x70
      [   99.871254]  ? __switch_to_asm+0x36/0x70
      [   99.872133]  ? lock_timer_base+0x72/0xa0
      [   99.873155]  ? switch_mm_irqs_off+0x1bf/0x420
      [   99.874152]  io_wq_submit_work+0x64/0x180
      [   99.875192]  ? kthread_use_mm+0x71/0x100
      [   99.876132]  io_worker_handle_work+0x267/0x440
      [   99.877233]  io_wqe_worker+0x297/0x350
      [   99.878145]  kthread+0x112/0x150
      [   99.878849]  ? __io_worker_unuse+0x100/0x100
      [   99.879935]  ? kthread_park+0x90/0x90
      [   99.880874]  ret_from_fork+0x22/0x30
      [   99.881679] Modules linked in:
      [   99.882493] CR2: 0000000000000000
      [   99.883324] ---[ end trace 4453745f4673190b ]---
      [   99.884289] RIP: 0010:0x0
      [   99.884837] Code: Bad RIP value.
      [   99.885492] RSP: 0018:ffffa1c7c01ebc08 EFLAGS: 00010202
      [   99.886851] RAX: 0000000000000000 RBX: ffff92363acd7f00 RCX: ffff92363d461608
      [   99.888561] RDX: 0000000000000010 RSI: 00007ffe040d9e10 RDI: ffff92363acd7f00
      [   99.890203] RBP: ffffa1c7c01ebc40 R08: 0000000000000000 R09: 0000000000000000
      [   99.891907] R10: ffffffff9ec692a0 R11: 0000000000000000 R12: 0000000000000010
      [   99.894106] R13: 0000000000000000 R14: ffff92363d461608 R15: ffffa1c7c01ebc68
      [   99.896079] FS:  0000000000000000(0000) GS:ffff92363dd00000(0000) knlGS:0000000000000000
      [   99.898017] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   99.899197] CR2: ffffffffffffffd6 CR3: 000000007ac66000 CR4: 00000000000006e0
      
      Fixes: 32960613 ("io_uring: correctly handle non ->{read,write}_iter() file_operations")
      Cc: stable@vger.kernel.org
      Signed-off-by: Guoyu Huang <hgy5945@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
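The missing check described in this commit can be illustrated with a small userspace sketch. It is not the kernel patch: `struct demo_file_ops`, `demo_rw_once()` and the choice of `-EINVAL` as the error code are assumptions made for illustration. The point is only that a handler pointer is tested before it is called, so a file without a read or write function yields an error instead of a jump to address 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the kernel's struct file_operations. */
struct demo_file_ops {
    ssize_t (*read)(char *buf, size_t len);
    ssize_t (*write)(const char *buf, size_t len);
};

enum demo_dir { DEMO_READ, DEMO_WRITE };

/* A write handler that pretends every byte was consumed. */
static ssize_t demo_sink_write(const char *buf, size_t len)
{
    (void)buf;
    return (ssize_t)len;
}

/*
 * Dispatch one transfer, refusing to call through a NULL handler.
 * Returns the handler's result, or -EINVAL (an assumed error code)
 * when the file has no handler for the requested direction.
 */
static ssize_t demo_rw_once(const struct demo_file_ops *ops,
                            enum demo_dir dir, char *buf, size_t len)
{
    assert(ops != NULL);

    if (dir == DEMO_READ) {
        if (!ops->read)
            return -EINVAL;
        return ops->read(buf, len);
    }
    if (!ops->write)
        return -EINVAL;
    return ops->write(buf, len);
}
```

Calling `demo_rw_once()` with an ops table whose `write` member is NULL returns `-EINVAL`; with `demo_sink_write` installed it returns the number of bytes passed in.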
  2. 04 Aug, 2020 2 commits
  3. 02 Aug, 2020 1 commit
  4. 31 Jul, 2020 7 commits
    • io_uring: don't touch 'ctx' after installing file descriptor · d1719f70
      Committed by Jens Axboe
      As soon as we install the file descriptor, we have to assume that it
      can get arbitrarily closed. We currently account memory (and note that
      we did) after installing the ring fd, which means that it could be a
      potential use-after-free condition if the fd is closed right after
      being installed, but before we fiddle with the ctx.
      
      In fact, syzbot reported this exact scenario:
      
      BUG: KASAN: use-after-free in io_account_mem fs/io_uring.c:7397 [inline]
      BUG: KASAN: use-after-free in io_uring_create fs/io_uring.c:8369 [inline]
      BUG: KASAN: use-after-free in io_uring_setup+0x2797/0x2910 fs/io_uring.c:8400
      Read of size 1 at addr ffff888087a41044 by task syz-executor.5/18145
      
      CPU: 0 PID: 18145 Comm: syz-executor.5 Not tainted 5.8.0-rc7-next-20200729-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x18f/0x20d lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
       __kasan_report mm/kasan/report.c:513 [inline]
       kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
       io_account_mem fs/io_uring.c:7397 [inline]
       io_uring_create fs/io_uring.c:8369 [inline]
       io_uring_setup+0x2797/0x2910 fs/io_uring.c:8400
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45c429
      Code: 8d b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f8f121d0c78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
      RAX: ffffffffffffffda RBX: 0000000000008540 RCX: 000000000045c429
      RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000196
      RBP: 000000000078bf38 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000078bf0c
      R13: 00007fff86698cff R14: 00007f8f121d19c0 R15: 000000000078bf0c
      
      Move the accounting of the ring used locked memory before we get and
      install the ring file descriptor.
      
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+9d46305e76057f30c74e@syzkaller.appspotmail.com
      Fixes: 30975825 ("io_uring: report pinned memory usage")
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
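The ordering rule from this commit (finish every write to the context before the descriptor is published) can be sketched in userspace C. All names here (`struct demo_ctx`, `demo_install_fd()`, `demo_setup()`) are hypothetical, and the descriptor number is arbitrary; the sketch only shows the sequence, not the kernel's actual setup path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ring context. */
struct demo_ctx {
    size_t accounted_pages;
    bool   published;   /* true once a descriptor refers to this ctx */
};

/* Hypothetical fd-install step; can_install simulates success or failure. */
static int demo_install_fd(struct demo_ctx *ctx, bool can_install)
{
    if (!can_install)
        return -1;
    ctx->published = true;
    return 3; /* arbitrary descriptor number for the sketch */
}

static int demo_setup(struct demo_ctx *ctx, size_t pages, bool can_install)
{
    int fd;

    assert(ctx != NULL && !ctx->published);

    /* 1. Account memory while we still own ctx exclusively. */
    ctx->accounted_pages = pages;

    /* 2. Publishing the descriptor is the last step that touches ctx. */
    fd = demo_install_fd(ctx, can_install);
    if (fd < 0) {
        /* ctx was never published, so undoing the accounting is safe. */
        ctx->accounted_pages = 0;
        return fd;
    }

    /*
     * After a successful install, a concurrent close() could already
     * have released ctx, so nothing below this point may touch it.
     */
    return fd;
}
```

The failure branch is the only place where the context is modified after the install attempt, and that is safe because a failed install means no descriptor ever referred to it.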
    • io_uring: get rid of atomic FAA for cq_timeouts · 01cec8c1
      Committed by Pavel Begunkov
      If ->cq_timeouts modifications are done under ->completion_lock, we
      don't really need any fetch-and-add or other complex atomics. Replace it
      with a non-atomic FAA, which saves an implicit full memory barrier.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
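The trade-off in this commit can be sketched with C11 atomics in userspace. This is an illustration under assumptions, not the kernel code: `struct demo_ring` and the `demo_*` helpers are hypothetical, and a simple `atomic_flag` spinlock stands in for `->completion_lock`. When every writer already holds the lock, a plain increment is enough and the atomic read-modify-write is redundant.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_ring {
    atomic_flag  completion_lock;    /* minimal spinlock for the sketch */
    unsigned int cq_timeouts;        /* plain counter, written under the lock */
    atomic_uint  cq_timeouts_atomic; /* the "before" variant, for comparison */
};

static void demo_ring_init(struct demo_ring *r)
{
    assert(r != NULL);
    atomic_flag_clear(&r->completion_lock);
    r->cq_timeouts = 0;
    atomic_init(&r->cq_timeouts_atomic, 0);
}

static void demo_lock(struct demo_ring *r)
{
    while (atomic_flag_test_and_set_explicit(&r->completion_lock,
                                             memory_order_acquire))
        ; /* spin until the holder releases the lock */
}

static void demo_unlock(struct demo_ring *r)
{
    atomic_flag_clear_explicit(&r->completion_lock, memory_order_release);
}

/* Before: atomic fetch-and-add, correct without a lock but more costly. */
static unsigned int demo_timeout_atomic(struct demo_ring *r)
{
    return atomic_fetch_add(&r->cq_timeouts_atomic, 1) + 1;
}

/* After: plain increment, valid because all writers hold the lock. */
static unsigned int demo_timeout_locked(struct demo_ring *r)
{
    unsigned int now;

    demo_lock(r);
    now = ++r->cq_timeouts;
    demo_unlock(r);
    return now;
}
```

The plain variant is only correct as long as every writer takes the lock; a reader that inspects the counter without holding it would still need a suitable load.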
    • io_uring: consolidate *_check_overflow accounting · 46930143
      Committed by Pavel Begunkov
      Add a helper to mark ctx->{cq,sq}_check_overflow to get rid of
      duplicates; it's also clearer to check cq_overflow_list directly.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix stalled deferred requests · dd9dfcdf
      Committed by Pavel Begunkov
      Always do io_commit_cqring() after completing a request, even if it was
      accounted as overflowed on the CQ side. Failing to do that may lead to
      deferred requests not being pushed when needed, stalling the whole
      ring.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
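A toy model may help show why the commit step has to run on the overflow path as well. Everything below is a simplification with hypothetical names (`struct demo_cq`, `demo_commit()`, `demo_complete()`); the real io_commit_cqring() does more than release deferred work. The sketch assumes only what the message states: committing is what pushes deferred requests forward.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified completion-queue state. */
struct demo_cq {
    unsigned int used;       /* entries currently in the CQ */
    unsigned int size;       /* CQ capacity */
    unsigned int overflowed; /* completions that did not fit */
    unsigned int deferred;   /* requests waiting for earlier completions */
    unsigned int issued;     /* deferred requests that were pushed out */
};

/* Stand-in for the commit step: release whatever was deferred. */
static void demo_commit(struct demo_cq *cq)
{
    cq->issued += cq->deferred;
    cq->deferred = 0;
}

/* Complete one request; the commit runs on both paths. */
static void demo_complete(struct demo_cq *cq)
{
    assert(cq != NULL);

    if (cq->used < cq->size)
        cq->used++;
    else
        cq->overflowed++; /* accounted as overflow instead of posted */

    /* Unconditional: skipping this on overflow would strand cq->deferred. */
    demo_commit(cq);
}
```

With a full queue and two deferred requests, one call to `demo_complete()` records the overflow and still releases both deferred requests.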
    • io_uring: fix racy overflow count reporting · b2bd1cf9
      Committed by Pavel Begunkov
      All ->cq_overflow modifications should be done under completion_lock,
      otherwise a wrong number can be reported to userspace. Fix it in
      io_uring_cancel_files().
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: deduplicate __io_complete_rw() · 81b68a5c
      Committed by Pavel Begunkov
      Call __io_complete_rw() in io_iopoll_queue() instead of hand coding it.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: de-unionise io_kiocb · 010e8e6b
      Committed by Pavel Begunkov
      As io_kiocb has enough space, move ->work out of a union. It's safer
      this way and removes ->work memcpy bouncing.
      While at it, make the tabulation in struct io_kiocb consistent.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
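The hazard of keeping ->work inside a union can be shown with a small C11 sketch. The types below are hypothetical and much smaller than the real io_kiocb; the member that shares storage with the work item is an assumption chosen for illustration. In the union layout, writing the per-operation data overwrites the work item, so code has to copy it aside and back; in the flat layout both members coexist.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, minimal stand-ins for the real structures. */
struct demo_work { int flags; };
struct demo_poll { int events; };

/* Before: work shares storage with per-operation data. */
struct demo_req_union {
    union {
        struct demo_work work;
        struct demo_poll poll;
    };
};

/* After: work has its own slot, at the cost of a larger struct. */
struct demo_req_flat {
    struct demo_work work;
    struct demo_poll poll;
};

/*
 * With the union, arming poll data destroys the work item, so the
 * caller must bounce it through a side copy (the "memcpy bouncing").
 */
static void demo_union_arm_poll(struct demo_req_union *req, int events,
                                struct demo_work *saved)
{
    memcpy(saved, &req->work, sizeof(*saved));
    req->poll.events = events; /* overwrites req->work */
}

/* With the flat layout, no copy is needed. */
static void demo_flat_arm_poll(struct demo_req_flat *req, int events)
{
    req->poll.events = events; /* req->work is untouched */
}
```

The flat layout trades a few bytes for the removal of the save and restore step, which matches the commit's reasoning that the structure has enough space.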
  5. 25 Jul, 2020 29 commits