1. 15 Mar, 2022 1 commit
  2. 09 Mar, 2022 1 commit
  3. 28 Feb, 2022 1 commit
    • blktrace: fix use after free for struct blk_trace · 30939293
      Committed by Yu Kuai
      When tracing the whole disk, 'dropped' and 'msg' will be created
      under 'q->debugfs_dir' and 'bt->dir' is NULL, thus blk_trace_free()
      won't remove those files. What's worse, the following UAF can be
      triggered because of accessing stale 'dropped' and 'msg':
      
      ==================================================================
      BUG: KASAN: use-after-free in blk_dropped_read+0x89/0x100
      Read of size 4 at addr ffff88816912f3d8 by task blktrace/1188
      
      CPU: 27 PID: 1188 Comm: blktrace Not tainted 5.17.0-rc4-next-20220217+ #469
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
      Call Trace:
       <TASK>
       dump_stack_lvl+0x34/0x44
       print_address_description.constprop.0.cold+0xab/0x381
       ? blk_dropped_read+0x89/0x100
       ? blk_dropped_read+0x89/0x100
       kasan_report.cold+0x83/0xdf
       ? blk_dropped_read+0x89/0x100
       kasan_check_range+0x140/0x1b0
       blk_dropped_read+0x89/0x100
       ? blk_create_buf_file_callback+0x20/0x20
       ? kmem_cache_free+0xa1/0x500
       ? do_sys_openat2+0x258/0x460
       full_proxy_read+0x8f/0xc0
       vfs_read+0xc6/0x260
       ksys_read+0xb9/0x150
       ? vfs_write+0x3d0/0x3d0
       ? fpregs_assert_state_consistent+0x55/0x60
       ? exit_to_user_mode_prepare+0x39/0x1e0
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7fbc080d92fd
      Code: ce 20 00 00 75 10 b8 00 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 1
      RSP: 002b:00007fbb95ff9cb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
      RAX: ffffffffffffffda RBX: 00007fbb95ff9dc0 RCX: 00007fbc080d92fd
      RDX: 0000000000000100 RSI: 00007fbb95ff9cc0 RDI: 0000000000000045
      RBP: 0000000000000045 R08: 0000000000406299 R09: 00000000fffffffd
      R10: 000000000153afa0 R11: 0000000000000293 R12: 00007fbb780008c0
      R13: 00007fbb78000938 R14: 0000000000608b30 R15: 00007fbb780029c8
       </TASK>
      
      Allocated by task 1050:
       kasan_save_stack+0x1e/0x40
       __kasan_kmalloc+0x81/0xa0
       do_blk_trace_setup+0xcb/0x410
       __blk_trace_setup+0xac/0x130
       blk_trace_ioctl+0xe9/0x1c0
       blkdev_ioctl+0xf1/0x390
       __x64_sys_ioctl+0xa5/0xe0
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Freed by task 1050:
       kasan_save_stack+0x1e/0x40
       kasan_set_track+0x21/0x30
       kasan_set_free_info+0x20/0x30
       __kasan_slab_free+0x103/0x180
       kfree+0x9a/0x4c0
       __blk_trace_remove+0x53/0x70
       blk_trace_ioctl+0x199/0x1c0
       blkdev_common_ioctl+0x5e9/0xb30
       blkdev_ioctl+0x1a5/0x390
       __x64_sys_ioctl+0xa5/0xe0
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The buggy address belongs to the object at ffff88816912f380
       which belongs to the cache kmalloc-96 of size 96
      The buggy address is located 88 bytes inside of
       96-byte region [ffff88816912f380, ffff88816912f3e0)
      The buggy address belongs to the page:
      page:000000009a1b4e7c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0f
      flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
      raw: 0017ffffc0000200 ffffea00044f1100 dead000000000002 ffff88810004c780
      raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88816912f280: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff88816912f300: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      >ffff88816912f380: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                                                          ^
       ffff88816912f400: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff88816912f480: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      ==================================================================
      
      Fixes: c0ea5760 ("blktrace: remove debugfs file dentries from struct blk_trace")
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Link: https://lore.kernel.org/r/20220228034354.4047385-1-yukuai3@huawei.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 24 Feb, 2022 1 commit
  5. 23 Feb, 2022 3 commits
  6. 22 Feb, 2022 1 commit
  7. 17 Feb, 2022 3 commits
  8. 12 Feb, 2022 2 commits
  9. 11 Feb, 2022 1 commit
  10. 10 Feb, 2022 1 commit
  11. 09 Feb, 2022 2 commits
  12. 04 Feb, 2022 3 commits
  13. 03 Feb, 2022 2 commits
    • nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts() · 6a51abde
      Committed by Uday Shankar
      Controller deletion/reset, immediately followed by or concurrent with
      a reconnect, is hard failing the connect attempt resulting in a
      complete loss of connectivity to the controller.
      
      In the connect request, fabrics looks for an existing controller with
      the same address components and aborts the connect if a controller
      already exists and the duplicate connect option isn't set. The match
      routine filters out controllers that are dead or dying, so they don't
      interfere with the new connect request.
      
      When NVME_CTRL_DELETING_NOIO was added, the state filters in the
      nvmf_ctlr_matches_baseopts() routine were not updated. Thus, a controller
      in this new state is seen as live and causes the connect request to fail.
      
      Correct this by adding the DELETING_NOIO state to the match checks.
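
The filter described above can be sketched in plain C. This is a simplified userspace model, not the driver code: the enum is a reduced, hypothetical set of controller states, and `ctrl_is_going_away()` and `connect_is_rejected()` are illustrative names that do not exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of controller states for illustration only. */
enum ctrl_state {
    CTRL_LIVE,
    CTRL_RESETTING,
    CTRL_CONNECTING,
    CTRL_DELETING,
    CTRL_DELETING_NOIO,
    CTRL_DEAD,
};

/* True when the controller is dead or dying and must not block a new connect. */
bool ctrl_is_going_away(enum ctrl_state state)
{
    switch (state) {
    case CTRL_DELETING:
    case CTRL_DELETING_NOIO: /* the state the filter originally missed */
    case CTRL_DEAD:
        return true;
    default:
        return false;
    }
}

/*
 * A connect is rejected only when duplicates are not allowed, an existing
 * controller has the same address components, and that controller is not
 * already on its way out.
 */
bool connect_is_rejected(bool address_matches, enum ctrl_state state,
                         bool duplicate_connect)
{
    if (duplicate_connect || !address_matches)
        return false;
    return !ctrl_is_going_away(state);
}
```

In this model, an existing controller in `CTRL_DELETING_NOIO` no longer blocks a reconnect to the same address, while a `CTRL_LIVE` one still does.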
      
      Fixes: ecca390e ("nvme: fix deadlock in disconnect during scan_work and/or ana_work")
      Cc: <stable@vger.kernel.org> # v5.7+
      Signed-off-by: Uday Shankar <ushankar@purestorage.com>
      Reviewed-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • md: fix NULL pointer deref with nowait but no mddev->queue · 0f9650bd
      Committed by Song Liu
      Leon reported NULL pointer deref with nowait support:
      
      [   15.123761] device-mapper: raid: Loading target version 1.15.1
      [   15.124185] device-mapper: raid: Ignoring chunk size parameter for RAID 1
      [   15.124192] device-mapper: raid: Choosing default region size of 4MiB
      [   15.129524] BUG: kernel NULL pointer dereference, address: 0000000000000060
      [   15.129530] #PF: supervisor write access in kernel mode
      [   15.129533] #PF: error_code(0x0002) - not-present page
      [   15.129535] PGD 0 P4D 0
      [   15.129538] Oops: 0002 [#1] PREEMPT SMP NOPTI
      [   15.129541] CPU: 5 PID: 494 Comm: ldmtool Not tainted 5.17.0-rc2-1-mainline #1 9fe89d43dfcb215d2731e6f8851740520778615e
      [   15.129546] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F36e 10/14/2021
      [   15.129549] RIP: 0010:blk_queue_flag_set+0x7/0x20
      [   15.129555] Code: 00 00 00 0f 1f 44 00 00 48 8b 35 e4 e0 04 02 48 8d 57 28 bf 40 01 \
             00 00 e9 16 c1 be ff 66 0f 1f 44 00 00 0f 1f 44 00 00 89 ff <f0> 48 0f ab 7e 60 \
             31 f6 89 f7 c3 66 66 2e 0f 1f 84 00 00 00 00 00
      [   15.129559] RSP: 0018:ffff966b81987a88 EFLAGS: 00010202
      [   15.129562] RAX: ffff8b11c363a0d0 RBX: ffff8b11e294b070 RCX: 0000000000000000
      [   15.129564] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000001d
      [   15.129566] RBP: ffff8b11e294b058 R08: 0000000000000000 R09: 0000000000000000
      [   15.129568] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b11e294b070
      [   15.129570] R13: 0000000000000000 R14: ffff8b11e294b000 R15: 0000000000000001
      [   15.129572] FS:  00007fa96e826780(0000) GS:ffff8b18deb40000(0000) knlGS:0000000000000000
      [   15.129575] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   15.129577] CR2: 0000000000000060 CR3: 000000010b8ce000 CR4: 00000000003506e0
      [   15.129580] Call Trace:
      [   15.129582]  <TASK>
      [   15.129584]  md_run+0x67c/0xc70 [md_mod 1e470c1b6bcf1114198109f42682f5a2740e9531]
      [   15.129597]  raid_ctr+0x134a/0x28ea [dm_raid 6a645dd7519e72834bd7e98c23497eeade14cd63]
      [   15.129604]  ? dm_split_args+0x63/0x150 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129615]  dm_table_add_target+0x188/0x380 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129625]  table_load+0x13b/0x370 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129635]  ? dev_suspend+0x2d0/0x2d0 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129644]  ctl_ioctl+0x1bd/0x460 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129655]  dm_ctl_ioctl+0xa/0x20 [dm_mod 0d7b0bc3414340a79c4553bae5ca97294b78336e]
      [   15.129663]  __x64_sys_ioctl+0x8e/0xd0
      [   15.129667]  do_syscall_64+0x5c/0x90
      [   15.129672]  ? syscall_exit_to_user_mode+0x23/0x50
      [   15.129675]  ? do_syscall_64+0x69/0x90
      [   15.129677]  ? do_syscall_64+0x69/0x90
      [   15.129679]  ? syscall_exit_to_user_mode+0x23/0x50
      [   15.129682]  ? do_syscall_64+0x69/0x90
      [   15.129684]  ? do_syscall_64+0x69/0x90
      [   15.129686]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [   15.129689] RIP: 0033:0x7fa96ecd559b
      [   15.129692] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c \
          c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff \
          ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89 01 48
      [   15.129696] RSP: 002b:00007ffcaf85c258 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
      [   15.129699] RAX: ffffffffffffffda RBX: 00007fa96f1b48f0 RCX: 00007fa96ecd559b
      [   15.129701] RDX: 00007fa97017e610 RSI: 00000000c138fd09 RDI: 0000000000000003
      [   15.129702] RBP: 00007fa96ebab583 R08: 00007fa97017c9e0 R09: 00007ffcaf85bf27
      [   15.129704] R10: 0000000000000001 R11: 0000000000000206 R12: 00007fa97017e610
      [   15.129706] R13: 00007fa97017e640 R14: 00007fa97017e6c0 R15: 00007fa97017e530
      [   15.129709]  </TASK>
      
      This is caused by a missing mddev->queue check before setting
      QUEUE_FLAG_NOWAIT. Fix this by moving the QUEUE_FLAG_NOWAIT logic
      under the mddev->queue check.
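
The shape of this fix can be illustrated with a small userspace model. Everything here is an assumption made for illustration: `struct queue`, `struct md_device`, `Q_FLAG_NOWAIT`, and `md_enable_nowait()` are hypothetical stand-ins, and the `all_members_nowait` field only approximates whatever condition the driver uses to decide that nowait is supported.

```c
#include <stdbool.h>
#include <stddef.h>

#define Q_FLAG_NOWAIT (1UL << 0) /* hypothetical flag bit */

struct queue {
    unsigned long flags;
};

struct md_device {
    struct queue *queue;     /* may be NULL, e.g. in the dm-raid path in the trace */
    bool all_members_nowait; /* assumed precondition for enabling nowait */
};

/*
 * Set the nowait flag only after confirming the queue exists.
 * Returns true when the flag was set.
 */
bool md_enable_nowait(struct md_device *md)
{
    if (!md->queue)
        return false; /* no queue: nothing to flag, and nothing to dereference */

    if (!md->all_members_nowait)
        return false;

    md->queue->flags |= Q_FLAG_NOWAIT;
    return true;
}
```

Placing the flag update behind the NULL check means a device without a queue skips the nowait setup entirely instead of writing through a NULL pointer.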
      
      Fixes: f51d46d0 ("md: add support for REQ_NOWAIT")
      Reported-by: Leon Möller <jkhsjdhjs@totally.rip>
      Tested-by: Leon Möller <jkhsjdhjs@totally.rip>
      Cc: Vishal Verma <vverma@digitalocean.com>
      Signed-off-by: Song Liu <song@kernel.org>
  14. 02 Feb, 2022 4 commits
    • block: fix DIO handling regressions in blkdev_read_iter() · 3e1f941d
      Committed by Ilya Dryomov
      Commit ceaa7625 ("block: move direct_IO into our own read_iter
      handler") introduced several regressions for bdev DIO:
      
      1. read spanning EOF always returns 0 instead of the number of bytes
         read.  This is because "count" is assigned early and isn't updated
         when the iterator is truncated:
      
           $ lsblk -o name,size /dev/vdb
           NAME SIZE
           vdb    1G
           $ xfs_io -d -c 'pread -b 4M 1021M 4M' /dev/vdb
           read 0/4194304 bytes at offset 1070596096
           0.000000 bytes, 0 ops; 0.0007 sec (0.000000 bytes/sec and 0.0000 ops/sec)
      
           instead of
      
           $ xfs_io -d -c 'pread -b 4M 1021M 4M' /dev/vdb
           read 3145728/4194304 bytes at offset 1070596096
           3 MiB, 1 ops; 0.0007 sec (3.865 GiB/sec and 1319.2612 ops/sec)
      
      2. truncated iterator isn't reexpanded
      3. iterator isn't reverted on blkdev_direct_IO() error
      4. zero size read no longer skips atime update
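
The byte count expected in item 1 is simple arithmetic, sketched below as a userspace helper. `bdev_read_len()` is a hypothetical name, not a kernel function, and it models only the EOF truncation of the returned count, not the iterator reexpand or revert handling from items 2 and 3.

```c
#include <stdint.h>

/*
 * Number of bytes a read of `requested` bytes at offset `pos` should
 * report on a device of `dev_size` bytes: the request is truncated at
 * end of device, and the result reflects the truncated length.
 */
uint64_t bdev_read_len(uint64_t dev_size, uint64_t pos, uint64_t requested)
{
    if (pos >= dev_size)
        return 0; /* starting at or past EOF reads nothing */

    uint64_t remaining = dev_size - pos;
    return requested < remaining ? requested : remaining;
}
```

With the values from the xfs_io example (a 1 GiB device, offset 1021 MiB, a 4 MiB request), the helper yields 3145728 bytes, matching the expected "read 3145728/4194304" output.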
      
      Fixes: ceaa7625 ("block: move direct_IO into our own read_iter handler")
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220201100420.25875-1-idryomov@gmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvme-rdma: fix possible use-after-free in transport error_recovery work · b6bb1722
      Committed by Sagi Grimberg
      nvme_rdma_submit_async_event_work checks the ctrl and queue state
      before preparing the AER command and scheduling io_work, but that
      check alone is not reliable. To fully prevent the race, the error
      recovery work must flush async_event_work after setting the ctrl
      state to RESETTING and before it continues to destroy the admin
      queue, so that there is no race between .submit_async_event and the
      error recovery handler itself changing the ctrl state.
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme-tcp: fix possible use-after-free in transport error_recovery work · ff9fc7eb
      Committed by Sagi Grimberg
      nvme_tcp_submit_async_event_work checks the ctrl and queue state
      before preparing the AER command and scheduling io_work, but that
      check alone is not reliable. To fully prevent the race, the error
      recovery work must flush async_event_work after setting the ctrl
      state to RESETTING and before it continues to destroy the admin
      queue, so that there is no race between .submit_async_event and the
      error recovery handler itself changing the ctrl state.
      Tested-by: Chris Leech <cleech@redhat.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    • nvme: fix a possible use-after-free in controller reset during load · 0fa0f99f
      Committed by Sagi Grimberg
      Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl
      readiness for AER submission. This may lead to a use-after-free
      condition that was observed with nvme-tcp.
      
      The race condition may happen in the following scenario:
      1. driver executes its reset_ctrl_work
      2. -> nvme_stop_ctrl - flushes ctrl async_event_work
      3. ctrl sends AEN which is received by the host, which in turn
         schedules AEN handling
      4. teardown admin queue (which releases the queue socket)
      5. AEN processed, submits another AER, calling the driver to submit
      6. driver attempts to send the cmd
      ==> use-after-free
      
      To fix that, add a ctrl state check to validate that the ctrl
      is actually able to accept the AER submission.
      
      This addresses the above race in controller resets because the driver
      during teardown should:
      1. change ctrl state to RESETTING
      2. flush async_event_work (as well as other async work elements)
      
      So after steps 1 and 2, any later AER submission will find the
      ctrl state to be RESETTING and bail out without submitting the AER.
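
A minimal single-threaded model of that guard is sketched below. It does not reproduce the race or the flushing of async work. It only shows the state check that makes a late submission bail out. `struct aer_ctrl`, the `AER_CTRL_*` states, and `submit_aer()` are hypothetical names chosen for this illustration.

```c
#include <stdbool.h>

/* Hypothetical two-state model; the real controller has more states. */
enum aer_ctrl_state {
    AER_CTRL_LIVE,
    AER_CTRL_RESETTING,
};

struct aer_ctrl {
    enum aer_ctrl_state state;
    int aers_submitted; /* counts submissions that reached the transport */
};

/*
 * Submit an AER only when the controller is live. Once teardown has moved
 * the controller to RESETTING, the submission is dropped before it can
 * touch queue resources that may already be released.
 */
bool submit_aer(struct aer_ctrl *ctrl)
{
    if (ctrl->state != AER_CTRL_LIVE)
        return false;

    ctrl->aers_submitted++;
    return true;
}
```

In this model a submission made after the state change leaves `aers_submitted` untouched, which corresponds to step 5 of the scenario never reaching step 6.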
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
  15. 29 Jan, 2022 3 commits
  16. 28 Jan, 2022 1 commit
  17. 27 Jan, 2022 3 commits
  18. 26 Jan, 2022 1 commit
  19. 24 Jan, 2022 1 commit
  20. 23 Jan, 2022 5 commits
    • Merge tag 'powerpc-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · dd81e1c7
      Committed by Linus Torvalds
      Pull powerpc fixes from Michael Ellerman:
      
       - A series of bpf fixes, including an oops fix and some codegen fixes.
      
       - Fix a regression in syscall_get_arch() for compat processes.
      
       - Fix boot failure on some 32-bit systems with KASAN enabled.
      
       - A couple of other build/minor fixes.
      
      Thanks to Athira Rajeev, Christophe Leroy, Dmitry V. Levin, Jiri Olsa,
      Johan Almbladh, Maxime Bizon, Naveen N. Rao, and Nicholas Piggin.
      
      * tag 'powerpc-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s: Mask SRR0 before checking against the masked NIP
        powerpc/perf: Only define power_pmu_wants_prompt_pmi() for CONFIG_PPC64
        powerpc/32s: Fix kasan_init_region() for KASAN
        powerpc/time: Fix build failure due to do_hard_irq_enable() on PPC32
        powerpc/audit: Fix syscall_get_arch()
        powerpc64/bpf: Limit 'ldbrx' to processors compliant with ISA v2.06
        tools/bpf: Rename 'struct event' to avoid naming conflict
        powerpc/bpf: Update ldimm64 instructions during extra pass
        powerpc32/bpf: Fix codegen for bpf-to-bpf calls
        bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
    • Merge tag 'irq_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ac5a9bb6
      Committed by Linus Torvalds
      Pull irq fix from Borislav Petkov:
       "A single use-after-free fix in the PCI MSI irq domain allocation path"
      
      * tag 'irq_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        PCI/MSI: Prevent UAF in error path
    • Merge tag 'sched_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 10c64a0f
      Committed by Linus Torvalds
      Pull scheduler fixes from Borislav Petkov:
       "A bunch of fixes: forced idle time accounting, utilization values
        propagation in the sched hierarchies and other minor cleanups and
        improvements"
      
      * tag 'sched_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        kernel/sched: Remove dl_boosted flag comment
        sched: Avoid double preemption in __cond_resched_*lock*()
        sched/fair: Fix all kernel-doc warnings
        sched/core: Accounting forceidle time for all tasks except idle task
        sched/pelt: Relax the sync of load_sum with load_avg
        sched/pelt: Relax the sync of runnable_sum with runnable_avg
        sched/pelt: Continue to relax the sync of util_sum with util_avg
        sched/pelt: Relax the sync of util_sum with util_avg
        psi: Fix uaf issue when psi trigger is destroyed while being polled
    • Merge tag 'perf_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0f9e0422
      Committed by Linus Torvalds
      Pull perf fixes from Borislav Petkov:
      
       - Add support for accessing the general purpose counters on Alder Lake
         via MMIO
      
       - Add new LBR format v7 support which is v5 modulo TSX
      
       - Fix counter enumeration on Alder Lake hybrids
      
       - Overhaul how context time updates are done and get rid of
         perf_event::shadow_ctx_time.
      
       - The usual amount of fixes: event mask correction, supported event
         types reporting, etc.
      
      * tag 'perf_urgent_for_v5.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/perf: Avoid warning for Arch LBR without XSAVE
        perf/x86/intel/uncore: Add IMC uncore support for ADL
        perf/x86/intel/lbr: Add static_branch for LBR INFO flags
        perf/x86/intel/lbr: Support LBR format V7
        perf/x86/rapl: fix AMD event handling
        perf/x86/intel/uncore: Fix CAS_COUNT_WRITE issue for ICX
        perf/x86/intel: Add a quirk for the calculation of the number of counters on Alder Lake
        perf: Fix perf_event_read_local() time
    • Linux 5.17-rc1 · e783362e
      Committed by Linus Torvalds