1. 05 Nov, 2017 3 commits
    • K
      target: Add netlink command reply supported option for each device · b849b456
      Kenjiro Nakayama committed
      Currently the netlink command reply support option
      (TCMU_ATTR_SUPP_KERN_CMD_REPLY) can be enabled only at module
      scope. Because of that, once an application enables netlink
      command reply support, all applications using target_core_user.ko
      are expected to support the netlink reply. To make matters worse,
      users are then unable to add a device via configfs manually.
      
      To fix these issues, this patch adds an option to disable the netlink
      command reply on a per-device basis through configfs. The original
      TCMU_ATTR_SUPP_KERN_CMD_REPLY is still enabled at module scope to keep
      backward compatibility and is used by default. However, once users set
      nl_reply_supported=<NEGATIVE_VALUE> via configfs for a particular
      device, that device disables netlink command reply support.
      Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      b849b456
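The interaction between the module-wide option and the per-device override described above can be sketched as follows. This is a minimal userspace illustration, not the driver's code: `struct demo_dev`, `demo_store_nl_reply_supported` and `demo_should_wait_for_reply` are hypothetical names, and treating only negative values as "disable" is an assumption taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-device state; not the kernel's struct. */
struct demo_dev {
    bool nl_reply_supported; /* per-device setting, enabled by default */
};

/* Models a configfs write: a negative value disables reply support
 * for this one device (assumption based on the commit message). */
static void demo_store_nl_reply_supported(struct demo_dev *dev, long val)
{
    assert(dev != NULL);
    if (val < 0)
        dev->nl_reply_supported = false;
}

/* The kernel side waits for a netlink reply only when the module-wide
 * option is on AND the device has not opted out. */
static bool demo_should_wait_for_reply(bool module_supports_reply,
                                       const struct demo_dev *dev)
{
    assert(dev != NULL);
    return module_supports_reply && dev->nl_reply_supported;
}
```

With this shape, one device can opt out while every other device keeps the module-wide default.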
    • K
      target/tcmu: Use macro to call container_of in tcmu_cmd_time_out_show · b5ab697c
      Kenjiro Nakayama committed
      This patch makes a tiny change: it uses TCMU_DEV in
      tcmu_cmd_time_out_show so that it is consistent with other functions.
      Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      b5ab697c
    • X
      tcmu: fix crash when removing the tcmu device · c22adc0b
      Xiubo Li committed
      Before the nl REMOVE msg has been sent to userspace, the ring
      and other resources have already been released, but userspace may
      still be using them. We can then see crash messages like:
      
      ring broken, not handling completions
      BUG: unable to handle kernel paging request at ffffffffffffffd0
      IP: tcmu_handle_completions+0x134/0x2f0 [target_core_user]
      PGD 11bdc0c067
      P4D 11bdc0c067
      PUD 11bdc0e067
      PMD 0
      
      Oops: 0000 [#1] SMP
      cmd_id not found, ring is broken
      RIP: 0010:tcmu_handle_completions+0x134/0x2f0 [target_core_user]
      RSP: 0018:ffffb8a2d8983d88 EFLAGS: 00010296
      RAX: 0000000000000000 RBX: ffffb8a2aaa4e000 RCX: 00000000ffffffff
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000220
      R10: 0000000076c71401 R11: ffff8d2e76c713f0 R12: ffffb8a2aad56bc0
      R13: 000000000000001c R14: ffff8d2e32c90000 R15: ffff8d2e76c713f0
      FS:  00007f411ffff700(0000) GS:ffff8d1e7fdc0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffffffffd0 CR3: 0000001027070000 CR4: 00000000001406e0
      Call Trace:
      ? tcmu_irqcontrol+0x2a/0x40 [target_core_user]
      ? uio_write+0x7b/0xc0 [uio]
      ? __vfs_write+0x37/0x150
      ? __getnstimeofday64+0x3b/0xd0
      ? vfs_write+0xb2/0x1b0
      ? syscall_trace_enter+0x1d0/0x2b0
      ? SyS_write+0x55/0xc0
      ? do_syscall_64+0x67/0x150
      ? entry_SYSCALL64_slow_path+0x25/0x25
      Code: 41 5d 41 5e 41 5f 5d c3 83 f8 01 0f 85 cf 01 00
      00 48 8b 7d d0 e8 dd 5c 1d f3 41 0f b7 74 24 04 48 8b
      7d c8 31 d2 e8 5c c7 1b f3 <48> 8b 7d d0 49 89 c7 c6 07
      00 0f 1f 40 00 4d 85 ff 0f 84 82 01
      RIP: tcmu_handle_completions+0x134/0x2f0 [target_core_user]
      RSP: ffffb8a2d8983d88
      CR2: ffffffffffffffd0
      
      The crash could also happen in tcmu_page_fault and other places.
      Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>
      Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      c22adc0b
  2. 31 Jul, 2017 2 commits
  3. 12 Jul, 2017 2 commits
    • X
      tcmu: clean up the code and with one small fix · daf78c30
      Xiubo Li committed
      Remove useless blank lines and code, and at the same time add one
      error path to catch the errors.
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      daf78c30
    • X
      tcmu: Fix possible memory leak / OOPs when recalculating cmd base size · b3743c71
      Xiubo Li committed
      For all the entries allocated from the ring cmd area, the memory
      behaves like stack memory: it always retains the old data, so
      entry->req.iov_bidi_cnt may be nonzero.
      
      On some environments the crash can be reproduced very easily, and on
      others not. The following is the crash trace as reported by Damien:
      
      [  240.143969] CPU: 0 PID: 1285 Comm: iscsi_trx Not tainted 4.12.0-rc1+ #3
      [  240.150607] Hardware name: ASUS All Series/H87-PRO, BIOS 2104 10/28/2014
      [  240.157331] task: ffff8807de4f5800 task.stack: ffffc900047dc000
      [  240.163270] RIP: 0010:memcpy_erms+0x6/0x10
      [  240.167377] RSP: 0018:ffffc900047dfc68 EFLAGS: 00010202
      [  240.172621] RAX: ffffc9065db85540 RBX: ffff8807f7980000 RCX: 0000000000000010
      [  240.179771] RDX: 0000000000000010 RSI: ffff8807de574fe0 RDI: ffffc9065db85540
      [  240.186930] RBP: ffffc900047dfd30 R08: ffff8807de41b000 R09: 0000000000000000
      [  240.194088] R10: 0000000000000040 R11: ffff8807e9b726f0 R12: 00000006565726b0
      [  240.201246] R13: ffffc90007612ea0 R14: 000000065657d540 R15: 0000000000000000
      [  240.208397] FS:  0000000000000000(0000) GS:ffff88081fa00000(0000) knlGS:0000000000000000
      [  240.216510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  240.222280] CR2: ffffc9065db85540 CR3: 0000000001c0f000 CR4: 00000000001406f0
      [  240.229430] Call Trace:
      [  240.231887]  ? tcmu_queue_cmd+0x83c/0xa80
      [  240.235916]  ? target_check_reservation+0xcd/0x6f0
      [  240.240725]  __target_execute_cmd+0x27/0xa0
      [  240.244918]  target_execute_cmd+0x232/0x2c0
      [  240.249124]  ? __local_bh_enable_ip+0x64/0xa0
      [  240.253499]  iscsit_execute_cmd+0x20d/0x270
      [  240.257693]  iscsit_sequence_cmd+0x110/0x190
      [  240.261985]  iscsit_get_rx_pdu+0x360/0xc80
      [  240.267565]  ? iscsi_target_rx_thread+0x54/0xd0
      [  240.273571]  iscsi_target_rx_thread+0x9a/0xd0
      [  240.279413]  kthread+0x113/0x150
      [  240.284120]  ? iscsi_target_tx_thread+0x1e0/0x1e0
      [  240.290297]  ? kthread_create_on_node+0x40/0x40
      [  240.296297]  ret_from_fork+0x2e/0x40
      [  240.301332] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48
      c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48
      89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
      [  240.321751] RIP: memcpy_erms+0x6/0x10 RSP: ffffc900047dfc68
      [  240.328838] CR2: ffffc9065db85540
      [  240.333667] ---[ end trace b7e5354cfb54d08b ]---
      
      To fix this, memset all the entry memory before using it. The bidi
      code is also adjusted to be more readable.
      
      Fixed: fe25cc34 (tcmu: Recalculate the tcmu_cmd size to save cmd area memories)
      Reported-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Reported-by: Damien Le Moal <damien.lemoal@wdc.com>
      Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
      Cc: <stable@vger.kernel.org> # 4.12+
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      b3743c71
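The idea of zeroing a ring entry before reuse, so that stale fields such as a leftover bidi iovec count cannot be misread, might look like the sketch below. `struct demo_entry` and `demo_claim_entry` are hypothetical simplifications and do not mirror the real `struct tcmu_cmd_entry` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified ring entry. */
struct demo_entry {
    uint32_t iov_cnt;
    uint32_t iov_bidi_cnt;
};

/* Claim a slot in the command ring. The slot may still hold data from
 * an earlier command, so clear it before any field is filled in. */
static struct demo_entry *demo_claim_entry(void *slot)
{
    struct demo_entry *entry = slot;

    assert(slot != NULL);
    memset(entry, 0, sizeof(*entry));
    return entry;
}
```

Any field the caller does not set explicitly then reads as zero rather than as whatever the previous command left behind.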
  4. 10 Jul, 2017 1 commit
    • B
      tcmu: Fix dev_config_store · de8c5221
      Bryant G. Ly committed
      Currently, when there is a reconfig, uio_info->name
      does not get updated to reflect the change in the dev_config
      name.
      
      When tcmu-runner restarts, there will be a mismatch between
      the dev_config string in uio and the tcmu structure that contains
      the string. When this occurs, it reloads the one in uio
      and the reconfigured device path is lost.
      
      v2: Created a helper function for updating uio_info
      Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      de8c5221
  5. 07 Jul, 2017 14 commits
  6. 24 May, 2017 1 commit
    • M
      tcmu: fix crash during device removal · f3cdbe39
      Mike Christie committed
      We currently do
      
      tcmu_free_device ->tcmu_netlink_event(TCMU_CMD_REMOVED_DEVICE) ->
      uio_unregister_device -> kfree(tcmu_dev).
      
      The problem is that the kernel does not wait for userspace to
      do the close() on the uio device before freeing the tcmu_dev.
      We can then hit a race where the kernel frees the tcmu_dev before
      userspace does close() and so when close() -> release -> tcmu_release
      is done, we try to access a freed tcmu_dev.
      
      This patch made over the target-pending master branch moves the freeing
      of the tcmu_dev to when the last reference has been dropped.
      
      This also fixes a leak where if tcmu_configure_device was not called on a
      device we did not free udev->name which was allocated at tcmu_alloc_device time.
      Signed-off-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      f3cdbe39
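Deferring the free until the last reference is dropped, as the commit describes, can be illustrated with a small reference-counting sketch. This is a single-threaded userspace model with hypothetical names (`demo_dev_alloc`, `demo_dev_get`, `demo_dev_put`); kernel code would typically use `struct kref`, which this sketch only imitates with a plain integer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device object; a plain int stands in for a kref. */
struct demo_dev {
    int refcount;
    char *name;
};

/* Allocate a device holding one initial reference (the kernel side). */
static struct demo_dev *demo_dev_alloc(void)
{
    struct demo_dev *dev = calloc(1, sizeof(*dev));

    if (dev)
        dev->refcount = 1;
    return dev;
}

/* Take a reference, e.g. when userspace opens the uio device. */
static void demo_dev_get(struct demo_dev *dev)
{
    dev->refcount++;
}

/* Drop a reference; the object is freed only when the count reaches
 * zero. Returns 1 if the object was freed, 0 otherwise. */
static int demo_dev_put(struct demo_dev *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount > 0)
        return 0;
    free(dev->name); /* also covers a name set at allocation time */
    free(dev);
    return 1;
}
```

In this model the device-removal path drops the kernel's reference, and the memory goes away only when the final close() from userspace drops the last one.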
  7. 05 May, 2017 1 commit
  8. 03 May, 2017 1 commit
    • X
      tcmu: Recalculate the tcmu_cmd size to save cmd area memories · fe25cc34
      Xiubo Li committed
      For the "struct tcmu_cmd_entry" in the cmd area, the minimum size
      is sizeof(struct tcmu_cmd_entry) == 112 bytes, and by default it can
      hold about (sizeof(struct rsp) - sizeof(struct req)) /
      sizeof(struct iovec) == 68 / 16 ~= 4 data regions (iov[4]).
      
      For most tcmu_cmds, the data block indexes allocated from the
      data area are contiguous, and contiguous blocks are merged into
      the same region using only one iovec. The current code always
      allocates as many iovecs as there are blocks for each tcmu_cmd,
      which wastes a lot of memory.
      
      For example, when the block size is 4K and the DATA_OUT buffer
      size is 64K, fewer than 5 regions are needed (in my environment,
      in almost 99.7% of cases). The current code will allocate
      about 16 iovecs, so (16 - 4) * sizeof(struct iovec) = 192 bytes
      of cmd area memory are wasted.
      
      This patch adds two helpers to calculate the base size and full size
      of the tcmu_cmd, and recalculates them once it is known how many
      iovs are needed, before inserting the command into the cmd area.
      Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
      Acked-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      fe25cc34
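The 192-byte figure in the example above can be reproduced with a short calculation. The sketch assumes the sizes quoted in the commit message (112-byte entry with room for 4 iovecs, 16 bytes per iovec); `demo_cmd_size` and `demo_wasted_bytes` are hypothetical helpers, not the two helpers the patch adds.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes as quoted in the commit message. */
#define DEMO_ENTRY_BASE_SIZE 112u /* entry size, including 4 inline iovecs */
#define DEMO_IOVEC_SIZE      16u
#define DEMO_INLINE_IOVS     4u

/* Size of a command entry carrying iov_cnt iovecs: the base size plus
 * 16 bytes for every iovec beyond the four that fit inline. */
static size_t demo_cmd_size(size_t iov_cnt)
{
    size_t extra = iov_cnt > DEMO_INLINE_IOVS ? iov_cnt - DEMO_INLINE_IOVS : 0;

    return DEMO_ENTRY_BASE_SIZE + extra * DEMO_IOVEC_SIZE;
}

/* Bytes wasted when one iovec per block is reserved although the
 * blocks merge into fewer contiguous regions. */
static size_t demo_wasted_bytes(size_t blocks, size_t regions)
{
    assert(regions <= blocks);
    return demo_cmd_size(blocks) - demo_cmd_size(regions);
}
```

For a 64K buffer in 4K blocks (16 blocks) that merges into 4 regions, `demo_wasted_bytes(16, 4)` evaluates to (16 - 4) * 16 = 192.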
  9. 02 May, 2017 2 commits
    • X
      tcmu: Add global data block pool support · b6df4b79
      Xiubo Li committed
      There is one ring for each target. As the number of targets
      grows larger and larger, the rings could eventually exhaust
      system memory.
      
      With this patch, for each target ring the cmd area size is
      currently fixed to 8MB, and the data area size grows from 0 to
      a maximum of 256K * PAGE_SIZE (1G for 4K page size).
      
      The data areas of all targets get empty blocks from the
      "global data block pool", which is limited to 512K *
      PAGE_SIZE (2G for 4K page size) for now.
      
      When the "global data block pool" has been used up, any
      target can wake up the unmap thread routine to shrink other
      targets' data area memory. The unmap thread routine always
      tries to truncate the ring vma from the offset of the last
      block in use.
      
      When user space touches data blocks outside the tcmu_cmd
      iov[], tcmu_page_fault() will try to return one zeroed block.
      
      Here we move the timeout's tcmu_handle_completions() into the unmap
      thread routine. That is to say, when the timeout fires, it
      only does tcmu_check_expired_cmd() and then wakes up the unmap
      thread to do the completions and then try to shrink its idle
      memory. The cmdr_lock can then be a mutex, which simplifies
      this patch because unmap_mapping_range() or zap_* may go to
      sleep.
      Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
      Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com>
      Acked-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      b6df4b79
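The two limits in this commit message (a per-target data area cap and a shared global pool cap) can be modelled with simple block accounting. This is only an illustrative sketch: `demo_get_block` and the structs are hypothetical, counts are in page-sized blocks, and the real driver's allocation and unmap-thread wake-up logic are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Limits quoted in the commit message, counted in page-sized blocks. */
#define DEMO_DEV_MAX_BLOCKS    (256UL * 1024UL) /* 1G with 4K pages */
#define DEMO_GLOBAL_MAX_BLOCKS (512UL * 1024UL) /* 2G with 4K pages */

struct demo_pool { unsigned long used; }; /* shared by all targets */
struct demo_ring { unsigned long used; }; /* one per target */

enum demo_result { DEMO_OK, DEMO_DEV_FULL, DEMO_POOL_EXHAUSTED };

/* Account for one more data block on a ring. On DEMO_POOL_EXHAUSTED a
 * caller would, per the commit message, wake the unmap thread so that
 * idle blocks of other targets can be reclaimed. */
static enum demo_result demo_get_block(struct demo_pool *pool,
                                       struct demo_ring *ring)
{
    assert(pool != NULL && ring != NULL);
    if (ring->used >= DEMO_DEV_MAX_BLOCKS)
        return DEMO_DEV_FULL;
    if (pool->used >= DEMO_GLOBAL_MAX_BLOCKS)
        return DEMO_POOL_EXHAUSTED;
    ring->used++;
    pool->used++;
    return DEMO_OK;
}
```

Because the global cap is twice the per-target cap, two fully grown targets are enough to exhaust the pool for every other target.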
    • X
      tcmu: Add dynamic growing data area feature support · 141685a3
      Xiubo Li committed
      Currently for TCMU, the ring buffer size is fixed to a 64K cmd
      area + 1M data area, and this becomes a bottleneck for high IOPS.
      
      The size of struct tcmu_cmd_entry {} is fixed at about 112 bytes with
      iovec[N] & N <= 4, and the size of struct iovec is about 16 bytes.
      
      If N == 0, the ratio will be sizeof(cmd entry) : sizeof(datas) ==
      112Bytes : (N * 4096)Bytes = 28 : 0, so no data area is needed.
      
      If 0 < N <= 4, the ratio will be sizeof(cmd entry) : sizeof(datas)
      == 112Bytes : (N * 4096)Bytes = 28 : (N * 1024), so the max will
      be 28 : 1024.
      
      If N > 4, sizeof(cmd entry) will be [(N - 4) * 16 + 112] bytes,
      and its corresponding data size will be [N * 4096] bytes, so the ratio
      of sizeof(cmd entry) : sizeof(datas) == [(N - 4) * 16 + 112]Bytes
      : (N * 4096)Bytes == 4/1024 + 12/(N * 1024), which approaches
      4 : 1024 as N grows.
      
      When N is bigger, the ratio will be smaller.
      
      As the initial patch, we set the cmd area size to 2M and
      the data area size to 32M. TCMU dynamically grows the data
      area from 0 to a maximum of 32M as needed.
      
      The cmd area memory is allocated through vmalloc(), and the
      data area's blocks are allocated individually later when needed.
      
      The allocated data area block memory is managed via a radix tree.
      For now the bitmap is still the most efficient way to search and
      manage the block index; this could be updated later.
      Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
      Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com>
      Acked-by: Mike Christie <mchristi@redhat.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      141685a3
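For reference, the N > 4 case of the cmd-entry-to-data ratio can be expanded term by term, using only the sizes stated in the commit message (112-byte entry with 4 inline iovecs, 16 bytes per extra iovec, 4096 bytes of data per iovec):

```latex
\[
\frac{\text{sizeof(cmd entry)}}{\text{sizeof(data)}}
  = \frac{16\,(N-4) + 112}{4096\,N}
  = \frac{16N + 48}{4096\,N}
  = \frac{4}{1024} + \frac{12}{1024\,N},
  \qquad N > 4 .
\]
```

The second term shrinks as N grows, so the ratio decreases toward 4 : 1024; at N = 5 it is 6.4 : 1024.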
  10. 03 Apr, 2017 1 commit
  11. 30 Mar, 2017 3 commits
  12. 19 Mar, 2017 5 commits
  13. 25 Feb, 2017 1 commit
  14. 14 Feb, 2017 1 commit
  15. 15 Dec, 2016 1 commit
  16. 10 Dec, 2016 1 commit