1. 29 Aug, 2022 1 commit
    • genetlink: start to validate reserved header bytes · 9c5d03d3
      Jakub Kicinski authored
      We had historically not checked that genlmsghdr.reserved
      is 0 on input which prevents us from using those precious
      bytes in the future.
      
      One use case would be to extend the cmd field, which is
      currently just 8 bits wide and 256 is not a lot of commands
      for some core families.
      
      To make sure that new families do the right thing by default
      put the onus of opting out of validation on existing families.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
      Signed-off-by: David S. Miller <davem@davemloft.net>
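The check described in this commit message can be sketched in userspace C. The struct mirrors the uapi genetlink header layout; the validator function and its name are illustrative, not the actual kernel code:

```c
#include <stdint.h>
#include <errno.h>

/* Mirrors the uapi genetlink header layout (linux/genetlink.h). */
struct genlmsghdr_ {
        uint8_t  cmd;
        uint8_t  version;
        uint16_t reserved;
};

/* Hypothetical validator: reject input whose reserved bytes are not 0,
 * keeping the field usable for future extensions (e.g. a wider cmd). */
static int validate_genl_header(const struct genlmsghdr_ *hdr)
{
        if (hdr->reserved != 0)
                return -EINVAL;
        return 0;
}
```

Rejecting non-zero reserved bytes on input is what lets those bytes be given a meaning later without breaking old userspace.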
  2. 20 May, 2022 1 commit
  3. 03 May, 2022 1 commit
    • scsi: target: tcmu: Fix possible data corruption · bb9b9eb0
      Xiaoguang Wang authored
      When tcmu_vma_fault() gets a page successfully, find_free_blocks() may
      run before the current context completes the page fault procedure and
      call unmap_mapping_range() to unmap the page. If find_free_blocks()
      completes first and the interrupted page fault procedure then resumes
      and completes, one truncated page has been mapped to userspace. But
      note that tcmu_vma_fault() has taken a refcount on the page, so no
      other subsystem will be able to use the page unless the userspace
      address is unmapped later.
      
      If another command subsequently runs and needs to extend dbi_thresh,
      it may reuse the corresponding slot for the previous page in
      data_bitmap. Then, although we allocate a new page for this slot in
      data_area, no page fault will happen because a valid mapping still
      exists, and the real request's data will be lost.
      
      Filesystem implementations also run into this issue, but they usually
      lock the page when vm_operations_struct->fault gets a page and unlock
      it after finish_fault() completes. For truncation, filesystems lock
      pages in truncate_inode_pages() to protect against racing page faults.
      
      To fix this possible data corruption, apply a method similar to the
      filesystems': tcmu_blocks_release() locks and unlocks each page that
      is to be freed, and tcmu_vma_fault() also locks the found page under
      cmdr_lock. At the same time, since tcmu_vma_fault() takes an extra
      page refcount, tcmu_blocks_release() won't free pages that are still
      in the page fault procedure, which means it is safe to call
      tcmu_blocks_release() before unmap_mapping_range().
      
      With these changes tcmu_blocks_release() will wait for all page faults to
      be completed before calling unmap_mapping_range(). And later, if
      unmap_mapping_range() is called, it will ensure stale mappings are removed.
      
      Link: https://lore.kernel.org/r/20220421023735.9018-1-xiaoguang.wang@linux.alibaba.com
      Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
      Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
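The locking discipline in this fix (the free path locks each page before reclaiming it; the fault path locks the page it found) can be sketched with a userspace analogue in which a per-page mutex stands in for the kernel page lock. All names here are illustrative, not the tcmu code:

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for a data-area page; the mutex plays the role of the
 * kernel page lock. */
struct da_page {
        pthread_mutex_t lock;
        bool            truncated;
};

/* Free path (cf. tcmu_blocks_release): lock each page before marking
 * it reclaimed, so a concurrent fault handler cannot race past us. */
static void release_page(struct da_page *p)
{
        pthread_mutex_lock(&p->lock);
        p->truncated = true;
        pthread_mutex_unlock(&p->lock);
}

/* Fault path (cf. tcmu_vma_fault): take the page lock and refuse to
 * map a page the free path has already reclaimed. */
static bool fault_in_page(struct da_page *p)
{
        bool ok;

        pthread_mutex_lock(&p->lock);
        ok = !p->truncated;
        pthread_mutex_unlock(&p->lock);
        return ok;      /* caller maps the page only if ok */
}
```

Serializing both paths on the same per-page lock is what guarantees a truncated page can never end up mapped to userspace.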
  4. 30 Mar, 2022 1 commit
  5. 23 Feb, 2022 1 commit
  6. 19 Oct, 2021 1 commit
  7. 05 Oct, 2021 1 commit
  8. 03 Aug, 2021 1 commit
  9. 22 May, 2021 2 commits
    • scsi: target: tcmu: Fix boolreturn.cocci warnings · 82473125
      kernel test robot authored
      drivers/target/target_core_user.c:1424:9-10: WARNING: return of 0/1 in function 'tcmu_handle_completions' with return type bool
      
       Return statements in functions returning bool should use
       true/false instead of 1/0.
      
      Generated by: scripts/coccinelle/misc/boolreturn.cocci
      
      Link: https://lore.kernel.org/r/20210515230358.GA97544@60d1edce16e0
      Fixes: 9814b55c ("scsi: target: tcmu: Return from tcmu_handle_completions() if cmd_id not found")
      CC: Bodo Stroesser <bostroesser@gmail.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Acked-by: Bodo Stroesser <bostroesser@gmail.com>
      Signed-off-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
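The class of fix flagged by boolreturn.cocci is mechanical; a minimal illustration (the function names are made up, not the tcmu code):

```c
#include <stdbool.h>

/* Before: returns 0/1 from a bool function -- what boolreturn.cocci
 * warns about, even though it behaves identically. */
static bool ring_is_empty_bad(int pending)
{
        return pending == 0 ? 1 : 0;
}

/* After: bool-returning functions use true/false. */
static bool ring_is_empty(int pending)
{
        if (pending == 0)
                return true;
        return false;
}
```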
    • scsi: target: tcmu: Fix xarray RCU warning · b4150b68
      Bodo Stroesser authored
      Commit f5ce815f ("scsi: target: tcmu: Support DATA_BLOCK_SIZE = N *
      PAGE_SIZE") introduced xas_next() calls to iterate xarray elements.  These
      calls triggered the WARNING "suspicious RCU usage" at tcmu device set up
      [1]. In the call stack of xas_next(), xas_load() was called.  According to
      its comment, this function requires "the xa_lock or the RCU lock".
      
      To avoid the warning:
      
       - Guard the small loop calling xas_next() in tcmu_get_empty_block with RCU
         lock.
      
       - In the large loop in tcmu_copy_data, holding the RCU lock could
         disable preemption for a long time (copying multiple MBs).
         Therefore replace XA_STATE, xas_set and xas_next with a single
         xa_load.
      
      [1]
      
      [ 1899.867091] =============================
      [ 1899.871199] WARNING: suspicious RCU usage
      [ 1899.875310] 5.13.0-rc1+ #41 Not tainted
      [ 1899.879222] -----------------------------
      [ 1899.883299] include/linux/xarray.h:1182 suspicious rcu_dereference_check() usage!
      [ 1899.890940] other info that might help us debug this:
      [ 1899.899082] rcu_scheduler_active = 2, debug_locks = 1
      [ 1899.905719] 3 locks held by kworker/0:1/1368:
      [ 1899.910161]  #0: ffffa1f8c8b98738 ((wq_completion)target_submission){+.+.}-{0:0}, at: process_one_work+0x1ee/0x580
      [ 1899.920732]  #1: ffffbd7040cd7e78 ((work_completion)(&q->sq.work)){+.+.}-{0:0}, at: process_one_work+0x1ee/0x580
      [ 1899.931146]  #2: ffffa1f8d1c99768 (&udev->cmdr_lock){+.+.}-{3:3}, at: tcmu_queue_cmd+0xea/0x160 [target_core_user]
      [ 1899.941678] stack backtrace:
      [ 1899.946093] CPU: 0 PID: 1368 Comm: kworker/0:1 Not tainted 5.13.0-rc1+ #41
      [ 1899.953070] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 1302 03/15/2018
      [ 1899.962459] Workqueue: target_submission target_queued_submit_work [target_core_mod]
      [ 1899.970337] Call Trace:
      [ 1899.972839]  dump_stack+0x6d/0x89
      [ 1899.976222]  xas_descend+0x10e/0x120
      [ 1899.979875]  xas_load+0x39/0x50
      [ 1899.983077]  tcmu_get_empty_blocks+0x115/0x1c0 [target_core_user]
      [ 1899.989318]  queue_cmd_ring+0x1da/0x630 [target_core_user]
      [ 1899.994897]  ? rcu_read_lock_sched_held+0x3f/0x70
      [ 1899.999695]  ? trace_kmalloc+0xa6/0xd0
      [ 1900.003501]  ? __kmalloc+0x205/0x380
      [ 1900.007167]  tcmu_queue_cmd+0x12f/0x160 [target_core_user]
      [ 1900.012746]  __target_execute_cmd+0x23/0xa0 [target_core_mod]
      [ 1900.018589]  transport_generic_new_cmd+0x1f3/0x370 [target_core_mod]
      [ 1900.025046]  transport_handle_cdb_direct+0x34/0x50 [target_core_mod]
      [ 1900.031517]  target_queued_submit_work+0x43/0xe0 [target_core_mod]
      [ 1900.037837]  process_one_work+0x268/0x580
      [ 1900.041952]  ? process_one_work+0x580/0x580
      [ 1900.046195]  worker_thread+0x55/0x3b0
      [ 1900.049921]  ? process_one_work+0x580/0x580
      [ 1900.054192]  kthread+0x143/0x160
      [ 1900.057499]  ? kthread_create_worker_on_cpu+0x40/0x40
      [ 1900.062661]  ret_from_fork+0x1f/0x30
      
      Link: https://lore.kernel.org/r/20210519135440.26773-1-bostroesser@gmail.com
      Fixes: f5ce815f ("scsi: target: tcmu: Support DATA_BLOCK_SIZE = N * PAGE_SIZE")
      Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  10. 15 May, 2021 1 commit
  11. 29 Apr, 2021 1 commit
  12. 13 Apr, 2021 6 commits
  13. 16 Mar, 2021 1 commit
  14. 10 Mar, 2021 3 commits
  15. 05 Mar, 2021 1 commit
  16. 23 Feb, 2021 2 commits
  17. 15 Jan, 2021 1 commit
    • scsi: target: tcmu: Fix use-after-free of se_cmd->priv · 780e1384
      Shin'ichiro Kawasaki authored
      Commit a3512902 ("scsi: target: tcmu: Use priv pointer in se_cmd")
      modified tcmu_free_cmd() to set the priv pointer in se_cmd to NULL.
      However, se_cmd can already have been freed by the work queue
      triggered in target_complete_cmd(). This caused a KASAN
      use-after-free BUG [1].
      
      To fix the bug, do not touch the priv pointer in tcmu_free_cmd().
      Instead, set the priv pointer to NULL before the target_complete_cmd()
      calls. Also, to avoid an unnecessary priv pointer change in
      tcmu_queue_cmd(), modify the priv pointer in that function only when
      tcmu_free_cmd() is not called.
      
      [1]
      BUG: KASAN: use-after-free in tcmu_handle_completions+0x1172/0x1770 [target_core_user]
      Write of size 8 at addr ffff88814cf79a40 by task cmdproc-uio0/14842
      
      CPU: 2 PID: 14842 Comm: cmdproc-uio0 Not tainted 5.11.0-rc2 #1
      Hardware name: Supermicro Super Server/X10SRL-F, BIOS 3.2 11/22/2019
      Call Trace:
       dump_stack+0x9a/0xcc
       ? tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       print_address_description.constprop.0+0x18/0x130
       ? tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       ? tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       kasan_report.cold+0x7f/0x10e
       ? tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       ? queue_tmr_ring+0x5d0/0x5d0 [target_core_user]
       tcmu_irqcontrol+0x28/0x60 [target_core_user]
       uio_write+0x155/0x230
       ? uio_vma_fault+0x460/0x460
       ? security_file_permission+0x4f/0x440
       vfs_write+0x1ce/0x860
       ksys_write+0xe9/0x1b0
       ? __ia32_sys_read+0xb0/0xb0
       ? syscall_enter_from_user_mode+0x27/0x70
       ? trace_hardirqs_on+0x1c/0x110
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x7fcf8b61905f
      Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 b9 fc ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 0c fd ff ff 48
      RSP: 002b:00007fcf7b3e6c30 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf8b61905f
      RDX: 0000000000000004 RSI: 00007fcf7b3e6c78 RDI: 000000000000000c
      RBP: 00007fcf7b3e6c80 R08: 0000000000000000 R09: 00007fcf7b3e6aa8
      R10: 000000000b01c000 R11: 0000000000000293 R12: 00007ffe0c32a52e
      R13: 00007ffe0c32a52f R14: 0000000000000000 R15: 00007fcf7b3e7640
      
      Allocated by task 383:
       kasan_save_stack+0x1b/0x40
       ____kasan_kmalloc.constprop.0+0x84/0xa0
       kmem_cache_alloc+0x142/0x330
       tcm_loop_queuecommand+0x2a/0x4e0 [tcm_loop]
       scsi_queue_rq+0x12ec/0x2d20
       blk_mq_dispatch_rq_list+0x30a/0x1db0
       __blk_mq_do_dispatch_sched+0x326/0x830
       __blk_mq_sched_dispatch_requests+0x2c8/0x3f0
       blk_mq_sched_dispatch_requests+0xca/0x120
       __blk_mq_run_hw_queue+0x93/0xe0
       process_one_work+0x7b6/0x1290
       worker_thread+0x590/0xf80
       kthread+0x362/0x430
       ret_from_fork+0x22/0x30
      
      Freed by task 11655:
       kasan_save_stack+0x1b/0x40
       kasan_set_track+0x1c/0x30
       kasan_set_free_info+0x20/0x30
       ____kasan_slab_free+0xec/0x120
       slab_free_freelist_hook+0x53/0x160
       kmem_cache_free+0xf4/0x5c0
       target_release_cmd_kref+0x3ea/0x9e0 [target_core_mod]
       transport_generic_free_cmd+0x28b/0x2f0 [target_core_mod]
       target_complete_ok_work+0x250/0xac0 [target_core_mod]
       process_one_work+0x7b6/0x1290
       worker_thread+0x590/0xf80
       kthread+0x362/0x430
       ret_from_fork+0x22/0x30
      
      Last potentially related work creation:
       kasan_save_stack+0x1b/0x40
       kasan_record_aux_stack+0xa3/0xb0
       insert_work+0x48/0x2e0
       __queue_work+0x4e8/0xdf0
       queue_work_on+0x78/0x80
       tcmu_handle_completions+0xad0/0x1770 [target_core_user]
       tcmu_irqcontrol+0x28/0x60 [target_core_user]
       uio_write+0x155/0x230
       vfs_write+0x1ce/0x860
       ksys_write+0xe9/0x1b0
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Second to last potentially related work creation:
       kasan_save_stack+0x1b/0x40
       kasan_record_aux_stack+0xa3/0xb0
       insert_work+0x48/0x2e0
       __queue_work+0x4e8/0xdf0
       queue_work_on+0x78/0x80
       tcm_loop_queuecommand+0x1c3/0x4e0 [tcm_loop]
       scsi_queue_rq+0x12ec/0x2d20
       blk_mq_dispatch_rq_list+0x30a/0x1db0
       __blk_mq_do_dispatch_sched+0x326/0x830
       __blk_mq_sched_dispatch_requests+0x2c8/0x3f0
       blk_mq_sched_dispatch_requests+0xca/0x120
       __blk_mq_run_hw_queue+0x93/0xe0
       process_one_work+0x7b6/0x1290
       worker_thread+0x590/0xf80
       kthread+0x362/0x430
       ret_from_fork+0x22/0x30
      
      The buggy address belongs to the object at ffff88814cf79800 which belongs
      to the cache tcm_loop_cmd_cache of size 896.
      
      Link: https://lore.kernel.org/r/20210113024508.1264992-1-shinichiro.kawasaki@wdc.com
      Fixes: a3512902 ("scsi: target: tcmu: Use priv pointer in se_cmd")
      Cc: stable@vger.kernel.org # v5.9+
      Acked-by: Bodo Stroesser <bostroesser@gmail.com>
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
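The pattern in the fix (clear the back-pointer while you still own the object, before the completion path may free it) looks like this in a userspace sketch; every name here is illustrative, not the tcmu code:

```c
#include <stdlib.h>
#include <stddef.h>

struct se_cmd_ {
        void *priv;             /* back-pointer to the driver cmd */
};

struct driver_cmd {
        struct se_cmd_ *se_cmd;
};

/* Completion path: after this returns, se_cmd may be freed at any
 * time by another context, so it must not be touched again. */
static void complete_cmd(struct se_cmd_ *se_cmd)
{
        /* hand-off point: ownership passes to the completion side */
        (void)se_cmd;
}

/* Fixed ordering: clear priv while we still own se_cmd, *then*
 * complete; freeing the driver cmd no longer touches se_cmd. */
static void finish_cmd(struct driver_cmd *cmd)
{
        struct se_cmd_ *se_cmd = cmd->se_cmd;

        se_cmd->priv = NULL;    /* before the hand-off, not after */
        complete_cmd(se_cmd);
        free(cmd);              /* cf. tcmu_free_cmd(): no se_cmd access */
}
```

The buggy ordering did the `priv = NULL` store after the hand-off, when the other side may already have freed `se_cmd` -- exactly the write KASAN reported.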
  18. 30 Oct, 2020 1 commit
  19. 27 Oct, 2020 1 commit
    • scsi: target: tcmu: scatter_/gather_data_area() rework · c8ed1ff8
      Bodo Stroesser authored
      scatter_data_area() and gather_data_area() are not easy to understand since
      data is copied in nested loops over sg_list and tcmu dbi list. Since sg
      list can contain only partly filled pages, the loop has to be prepared to
      handle sg pages not matching dbi pages one by one.
      
      The existing implementation uses kmap_atomic()/kunmap_atomic() for
      performance reasons. But instead of keeping these calls strictly
      nested for sg and dbi pages, the code holds the mappings in an
      overlapping way, which indeed is a bug that would trigger on archs
      using highmem.
      
      The scatterlist lib contains the sg_miter_start/_next/_stop functions which
      can be used to simplify such complicated loops.
      
      The new code now processes the dbi list in the outer loop, while sg list is
      handled by the inner one. That way the code can take advantage of the
      sg_miter_* family calls.
      
      Calling sg_miter_stop() after the end of the inner loop enforces strict
      nesting of atomic kmaps.
      
      Since the nested loops in scatter_/gather_data_area were very similar, I
      replaced them by the new helper function tcmu_copy_data().
      
      Link: https://lore.kernel.org/r/20201019115118.11949-1-bostroesser@gmail.com
      Acked-by: Mike Christie <michael.christie@oracle.com>
      Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
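The loop shape described above -- the destination consumed block by block while variably filled source chunks are drained in an inner position -- can be sketched as a userspace copy between mismatched chunk lists. This is an illustration of the structure only, not the sg_miter API:

```c
#include <string.h>
#include <stddef.h>

struct chunk { const char *buf; size_t len; };

/* Copy a list of variably sized source chunks into a flat destination,
 * advancing both sides independently -- the same shape as iterating sg
 * pages against fixed-size data-area (dbi) pages. */
static size_t copy_chunks(char *dst, size_t dst_len,
                          const struct chunk *src, size_t nsrc)
{
        size_t done = 0;

        for (size_t i = 0; i < nsrc && done < dst_len; i++) {
                size_t n = src[i].len;

                if (n > dst_len - done)
                        n = dst_len - done;
                /* in the kernel: map the page, copy, and unmap before
                 * touching the next chunk, so that atomic kmaps stay
                 * strictly nested (the bug the rework removes) */
                memcpy(dst + done, src[i].buf, n);
                done += n;
        }
        return done;
}
```

In the kernel version the inner iteration is driven by sg_miter_next(), and calling sg_miter_stop() when the inner loop ends is what enforces the strict nesting of atomic mappings.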
  20. 03 Oct, 2020 2 commits
  21. 23 Sep, 2020 3 commits
    • scsi: target: tcmu: Optimize scatter_data_area() · 3c9a7c58
      Bodo Stroesser authored
      scatter_data_area() has two purposes:
      
       1) Create the iovs for the data area buffer of a SCSI cmd.
      
       2) If there is data in DMA_TO_DEVICE direction, copy
          the data from sg_list to data area buffer.
      
      Both are done in a common loop.
      
      In case of a DMA_FROM_DEVICE data transfer, scatter_data_area() is
      called with parameter copy_data = false. But this flag is only used to
      skip the memcpy() for data, while radix_tree_lookup is still called
      for every dbi of the data area buffer, and kmap and kunmap are called
      for every page from sg_list and data_area, as well as
      flush_dcache_page() for the data area pages.  Since the only thing to
      do with copy_data = false would be to set up the iovs, this is a
      noticeable overhead.  Rework the iov creation in the main loop of
      scatter_data_area() providing the new function new_block_to_iov().
      Based on this, create the short new function tcmu_setup_iovs() that
      only writes the iovs with no overhead.  This new function is now
      called instead of scatter_data_area() for bidi buffers and for data
      buffers in those cases where memcpy() would have been skipped.
      
      Link: https://lore.kernel.org/r/20200910155041.17654-4-bstroesser@ts.fujitsu.com
      Acked-by: Mike Christie <michael.christie@oracle.com>
      Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
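Writing only the iovs, with no page lookups or copies, can be sketched as follows: consecutive data-area blocks are merged into one iov entry. This is a simplified stand-in for the tcmu_setup_iovs()/new_block_to_iov() pair; the block size and names are illustrative:

```c
#include <stddef.h>

#define BLOCK_SIZE 4096

struct iov { size_t off; size_t len; };

/* Build iovs from a list of data-area block indices: each new iov
 * starts at a block boundary; an adjacent block extends the last iov
 * instead of opening a new one. */
static size_t setup_iovs(struct iov *iovs, const int *dbi, size_t cnt)
{
        size_t n = 0;

        for (size_t i = 0; i < cnt; i++) {
                if (n && dbi[i] == dbi[i - 1] + 1) {
                        iovs[n - 1].len += BLOCK_SIZE;  /* contiguous */
                } else {
                        iovs[n].off = (size_t)dbi[i] * BLOCK_SIZE;
                        iovs[n].len = BLOCK_SIZE;
                        n++;
                }
        }
        return n;       /* number of iovs actually needed */
}
```

Because this walk touches only indices, it avoids the per-page lookup, kmap/kunmap and cache-flush overhead that made the copy_data = false path expensive.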
    • scsi: target: tcmu: Optimize queue_cmd_ring() · 7e98905e
      Bodo Stroesser authored
      queue_cmd_ring() needs to check whether there is enough space in cmd ring
      and data area for the cmd to queue.
      
      Currently the sequence is:
      
       1) Calculate size the cmd will occupy on the ring based on estimation of
          needed iovs.
      
       2) Check whether there is enough space on the ring based on size from 1)
      
       3) Allocate buffers in data area.
      
       4) Calculate number of iovs the command really needs while copying
          incoming data (if any) to data area.
      
       5) Re-calculate real size of cmd on ring based on real number of iovs.
      
       6) Set up possible padding and cmd on the ring.
      
      Step 1) must not underestimate the cmd size, so it uses the maximum
      possible number of iovs for the given I/O data size. The resulting
      overestimation can be really high, so this sequence is not ideal. The
      earliest point at which the real number of iovs can be calculated is
      after data buffer allocation. Therefore rework the code to implement
      the following sequence:
      
       A) Allocate buffers on data area and calculate number of necessary iovs
          during this.
      
       B) Calculate real size of cmd on ring based on number of iovs.
      
       C) Check whether there is enough space on the ring.
      
       D) Set up possible padding and cmd on the ring.
      
      The new sequence enforces the split of new function tcmu_alloc_data_space()
      from is_ring_space_avail(). Using this function, change queue_cmd_ring()
      according to the new sequence.
      
      Change routines called by tcmu_alloc_data_space() to allow calculating and
      returning the iov count. Remove counting of iovs in scatter_data_area().
      
      Link: https://lore.kernel.org/r/20200910155041.17654-3-bstroesser@ts.fujitsu.com
      Acked-by: Mike Christie <michael.christie@oracle.com>
      Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
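The reworked A)-D) sequence can be sketched with stubs. Every function and size below is a stand-in for illustration, not the tcmu code: allocate first and count iovs while doing so, compute the real ring size from that count, and only then check ring space:

```c
#include <stdbool.h>
#include <stddef.h>

/* A) Allocate data-area blocks, counting iovs as we go.  Here the
 * worst case of one iov per 4 KiB block stands in for the real count
 * discovered during allocation. */
static size_t alloc_data_space(size_t data_len, size_t *iov_cnt)
{
        *iov_cnt = (data_len + 4095) / 4096;
        return *iov_cnt;                /* blocks allocated */
}

/* B) Real size of the cmd on the ring: header plus iov entries
 * (illustrative sizes). */
static size_t cmd_ring_size(size_t iov_cnt)
{
        return 64 + iov_cnt * 16;
}

/* C) Space check against the real size, not an overestimate. */
static bool ring_space_avail(size_t need, size_t free_bytes)
{
        return need <= free_bytes;
}

static bool queue_cmd(size_t data_len, size_t ring_free)
{
        size_t iov_cnt;

        alloc_data_space(data_len, &iov_cnt);           /* A */
        size_t need = cmd_ring_size(iov_cnt);           /* B */
        return ring_space_avail(need, ring_free);       /* C; D would
                                                           write padding
                                                           and the cmd */
}
```

The point of the reordering is that the space check in C) uses the exact iov count from A) rather than the worst-case estimate the old step 1) had to assume.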
    • scsi: target: tcmu: Join tcmu_cmd_get_data_length() and tcmu_cmd_get_block_cnt() · 52ef2743
      Bodo Stroesser authored
      Simplify code by joining tcmu_cmd_get_data_length() and
      tcmu_cmd_get_block_cnt() into tcmu_cmd_set_block_cnts().  The new function
      sets tcmu_cmd->dbi_cnt and also the new field tcmu_cmd->dbi_bidi_cnt which
      is needed for further enhancements in following patches.  Simplify some
      code by using tcmu_cmd->dbi(_bidi)_cnt instead of calculation from length.
      
      Please note: The calculation of the number of dbis needed for bidi
      was wrong. It was based on the length of the first bidi sg only.
      Change it to correctly sum up the entire length of all bidi sgs.
      
      Link: https://lore.kernel.org/r/20200910155041.17654-2-bstroesser@ts.fujitsu.com
      Acked-by: Mike Christie <michael.christie@oracle.com>
      Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
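The corrected bidi count (sum the lengths of all bidi sgs, then convert the total to whole data-area blocks) can be sketched like this; DATA_BLOCK_SIZE and the names are illustrative:

```c
#include <stddef.h>

#define DATA_BLOCK_SIZE 4096

struct sg { size_t length; };

/* Sum the length of *all* sgs -- the bug was counting only the first
 * one -- then round up to whole data-area blocks (dbis). */
static size_t sg_block_count(const struct sg *sgs, size_t nents)
{
        size_t total = 0;

        for (size_t i = 0; i < nents; i++)
                total += sgs[i].length;
        return (total + DATA_BLOCK_SIZE - 1) / DATA_BLOCK_SIZE;
}
```

With only the first sg counted, a bidi list of {4096, 100} bytes would have reserved one block instead of the two it actually needs.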
  22. 16 Sep, 2020 1 commit
  23. 29 Jul, 2020 6 commits