  1. 19 Jul, 2021 1 commit
  2. 02 Jun, 2021 7 commits
  3. 08 Apr, 2021 1 commit
  4. 30 Mar, 2021 1 commit
  5. 05 Mar, 2021 3 commits
  6. 09 Feb, 2021 1 commit
  7. 08 Dec, 2020 1 commit
  8. 29 Jul, 2020 1 commit
  9. 08 Jul, 2020 1 commit
  10. 03 Jul, 2020 4 commits
  11. 27 May, 2020 2 commits
    • scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim · 7e7cd796
      Gabriel Krisman Bertazi committed
      iSCSI suffers from a deadlock in case a management command submitted via
      the netlink socket sleeps on an allocation while holding the rx_queue_mutex
      if that allocation causes a memory reclaim that writebacks to a failed
      iSCSI device.  The recovery procedure can never make progress to recover
      the failed disk or abort outstanding IO operations to complete the reclaim
      (since rx_queue_mutex is locked), thus locking the system.
      
      Nevertheless, just marking all allocations under rx_queue_mutex as GFP_NOIO
      (or locking the userspace process with something like PF_MEMALLOC_NOIO) is
      not enough, since the iSCSI command code relies on other subsystems that
      try to grab locked mutexes, whose threads are GFP_IO, leading to the same
      deadlock. One instance where this situation can be observed is in the
      backtraces below, stitched from multiple bug reports, involving the kobj
      uevent sent when a session is created.
      
      The root of the problem is not the fact that iSCSI does GFP_IO allocations,
      that is acceptable. The actual problem is that rx_queue_mutex has a very
      large granularity, covering every unrelated netlink command execution at
      the same time as the error recovery path.
      
      The proposed fix leverages the recently added mechanism to stop failed
      connections from the kernel, by enabling it to execute even though a
      management command from the netlink socket is being run (rx_queue_mutex is
      held), provided that the command is known to be safe.  It splits the
      rx_queue_mutex in two mutexes, one protecting from concurrent command
      execution from the netlink socket, and one protecting stop_conn from racing
      with other connection management operations that might conflict with it.
      
      It is not very pretty, but it is the simplest way to resolve the deadlock.
      I considered making it a lock per connection, but some external mutex would
      still be needed to deal with iscsi_if_destroy_conn.
      
      The patch was tested by forcing a memory shrinker (unrelated, but used
      bufio/dm-verity) to reclaim iSCSI pages every time
      ISCSI_UEVENT_CREATE_SESSION happens, which is reasonable to simulate
      reclaims that might happen with GFP_KERNEL on that path.  Then, a faulty
      hung target causes a connection to fail during intensive IO, at the same
      time a new session is added by iscsid.
      
      The following stack traces are stitched from several bug reports, showing a
      case where the deadlock can happen.
      
       iSCSI-write
               holding: rx_queue_mutex
               waiting: uevent_sock_mutex
      
               kobject_uevent_env+0x1bd/0x419
               kobject_uevent+0xb/0xd
               device_add+0x48a/0x678
               scsi_add_host_with_dma+0xc5/0x22d
               iscsi_host_add+0x53/0x55
               iscsi_sw_tcp_session_create+0xa6/0x129
               iscsi_if_rx+0x100/0x1247
               netlink_unicast+0x213/0x4f0
               netlink_sendmsg+0x230/0x3c0
      
       iscsi_fail iscsi_conn_failure
               waiting: rx_queue_mutex
      
               schedule_preempt_disabled+0x325/0x734
               __mutex_lock_slowpath+0x18b/0x230
               mutex_lock+0x22/0x40
               iscsi_conn_failure+0x42/0x149
               worker_thread+0x24a/0xbc0
      
       EventManager_
               holding: uevent_sock_mutex
               waiting: dm_bufio_client->lock
      
               dm_bufio_lock+0xe/0x10
               shrink+0x34/0xf7
               shrink_slab+0x177/0x5d0
               do_try_to_free_pages+0x129/0x470
               try_to_free_mem_cgroup_pages+0x14f/0x210
               memcg_kmem_newpage_charge+0xa6d/0x13b0
               __alloc_pages_nodemask+0x4a3/0x1a70
               fallback_alloc+0x1b2/0x36c
               __kmalloc_node_track_caller+0xb9/0x10d0
               __alloc_skb+0x83/0x2f0
               kobject_uevent_env+0x26b/0x419
               dm_kobject_uevent+0x70/0x79
               dev_suspend+0x1a9/0x1e7
               ctl_ioctl+0x3e9/0x411
               dm_ctl_ioctl+0x13/0x17
               do_vfs_ioctl+0xb3/0x460
               SyS_ioctl+0x5e/0x90
      
       MemcgReclaimerD"
               holding: dm_bufio_client->lock
               waiting: stuck io to finish (needs iscsi_fail thread to progress)
      
               schedule at ffffffffbd603618
               io_schedule at ffffffffbd603ba4
               do_io_schedule at ffffffffbdaf0d94
               __wait_on_bit at ffffffffbd6008a6
               out_of_line_wait_on_bit at ffffffffbd600960
               wait_on_bit.constprop.10 at ffffffffbdaf0f17
               __make_buffer_clean at ffffffffbdaf18ba
               __cleanup_old_buffer at ffffffffbdaf192f
               shrink at ffffffffbdaf19fd
               do_shrink_slab at ffffffffbd6ec000
               shrink_slab at ffffffffbd6ec24a
               do_try_to_free_pages at ffffffffbd6eda09
               try_to_free_mem_cgroup_pages at ffffffffbd6ede7e
               mem_cgroup_resize_limit at ffffffffbd7024c0
               mem_cgroup_write at ffffffffbd703149
               cgroup_file_write at ffffffffbd6d9c6e
               sys_write at ffffffffbd6662ea
               system_call_fastpath at ffffffffbdbc34a2
      
      Link: https://lore.kernel.org/r/20200520022959.1912856-1-krisman@collabora.com
      Reported-by: Khazhismel Kumykov <khazhy@google.com>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
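The lock split described in commit 7e7cd796 can be illustrated with a small userspace sketch using pthreads. This is not the kernel code: `cmd_mutex`, `conn_mutex`, `struct conn`, and every function name below are hypothetical stand-ins chosen for illustration. The point it shows is that a recovery path which needs only the connection lock can still make progress while a long-running "safe" command holds the command lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two mutexes described in the commit message. */
static pthread_mutex_t cmd_mutex = PTHREAD_MUTEX_INITIALIZER;  /* serializes management commands */
static pthread_mutex_t conn_mutex = PTHREAD_MUTEX_INITIALIZER; /* guards connection state vs. stop_conn */

struct conn {
    bool stopped;
};

/* Recovery path: takes only conn_mutex, so a command that is blocked
 * while holding cmd_mutex cannot prevent it from running. */
void stop_conn(struct conn *c)
{
    pthread_mutex_lock(&conn_mutex);
    c->stopped = true;
    pthread_mutex_unlock(&conn_mutex);
}

/* A command known to be safe: it holds only cmd_mutex while it runs. */
int run_safe_command(int (*cmd)(void *), void *arg)
{
    pthread_mutex_lock(&cmd_mutex);
    int ret = cmd(arg);
    pthread_mutex_unlock(&cmd_mutex);
    return ret;
}

/* A command that touches connection state: takes both locks, always in
 * the same order, so it cannot race with stop_conn. */
int run_conn_command(int (*cmd)(struct conn *), struct conn *c)
{
    pthread_mutex_lock(&cmd_mutex);
    pthread_mutex_lock(&conn_mutex);
    int ret = cmd(c);
    pthread_mutex_unlock(&conn_mutex);
    pthread_mutex_unlock(&cmd_mutex);
    return ret;
}

/* Demo command: models recovery having to run while cmd_mutex is held.
 * With a single coarse mutex this call would self-deadlock. */
int create_session_cmd(void *arg)
{
    stop_conn((struct conn *)arg);
    return 0;
}
```

Calling `run_safe_command(create_session_cmd, &c)` returns 0 and leaves `c.stopped` set. If `stop_conn` took the same mutex as the command, the same call would never return.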
    • scsi: iscsi: Register sysfs for iscsi workqueue · 3ce41966
      Bob Liu committed
      This patch enables setting CPU affinity through "cpumask" for iscsi
      workqueues (iscsi_q_xx and iscsi_eh), so as to get performance isolation.
      
      The maximum number of active workers was changed from 1 to 2, because the
      "cpumask" of an ordered workqueue isn't allowed to change.
      
      Link: https://lore.kernel.org/r/20200505011908.15538-1-bob.liu@oracle.com
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Signed-off-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
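As a rough illustration of how the "cpumask" attribute from commit 3ce41966 might be used from userspace, the sketch below builds a hexadecimal mask from a list of CPU numbers and writes it to a workqueue's sysfs file. The path `/sys/devices/virtual/workqueue/<name>/cpumask` is an assumption based on where sysfs-exported workqueues are generally found. The helper names are hypothetical, and the sketch handles only CPUs 0 to 31 to avoid multi-word mask formatting.

```c
#include <stdio.h>
#include <stddef.h>

/* Build a hex cpumask string (e.g. CPUs {0, 2, 3} -> "d").
 * Only CPUs 0..31 are supported in this sketch. Returns 0 on success. */
int format_cpumask(const unsigned int *cpus, size_t n, char *buf, size_t len)
{
    unsigned int mask = 0;

    for (size_t i = 0; i < n; i++) {
        if (cpus[i] >= 32)
            return -1;
        mask |= 1u << cpus[i];
    }
    int w = snprintf(buf, len, "%x", mask);
    return (w < 0 || (size_t)w >= len) ? -1 : 0;
}

/* Write the mask to the workqueue's cpumask attribute.
 * The sysfs location is an assumption; writing usually needs root. */
int set_wq_cpumask(const char *wq_name, const char *mask)
{
    char path[256];
    int w = snprintf(path, sizeof(path),
                     "/sys/devices/virtual/workqueue/%s/cpumask", wq_name);
    if (w < 0 || (size_t)w >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fputs(mask, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would format the mask first and then pass it to `set_wq_cpumask("iscsi_eh", buf)`, checking the return value, since the write fails if the attribute does not exist or the process lacks permission.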
  12. 01 Apr, 2020 1 commit
  13. 27 Mar, 2020 1 commit
  14. 12 Mar, 2020 1 commit
  15. 11 Feb, 2020 1 commit
  16. 16 Jan, 2020 2 commits
    • scsi: iscsi: Fail session and connection on transport registration failure · f3c893e3
      Gabriel Krisman Bertazi committed
      If the transport cannot be registered, the session/connection creation
      needs to be failed early to let the initiator know.  Otherwise, the system
      will have an outstanding connection that can be neither used nor removed by
      open-iscsi. The result is similar to the error below, triggered by
      injecting a failure in the transport's registration path.
      
      open-iscsi reports success:
      
      root@debian-vm:~#  iscsiadm -m node -T iqn:lun1 -p 127.0.0.1 -l
      Logging in to [iface: default, target: iqn:lun1, portal: 127.0.0.1,3260]
      Login to [iface: default, target: iqn:lun1, portal:127.0.0.1,3260] successful.
      
      But the session cannot be removed afterwards, since the kernel is in an
      inconsistent state.
      
      root@debian-vm:~#  iscsiadm -m node -T iqn:lun1 -p 127.0.0.1 -u
      iscsiadm: No matching sessions found
      
      Link: https://lore.kernel.org/r/20200106185817.640331-4-krisman@collabora.com
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
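The "fail early" behaviour described in commit f3c893e3 follows a common error-unwinding pattern, sketched below in plain C. Nothing here is taken from the kernel source: `struct session`, `register_transport`, and the `fail_register` flag (which stands in for the injected registration failure) are hypothetical. The idea is that creation reports failure and releases what it allocated, instead of returning an object that was never registered.

```c
#include <stdlib.h>

struct session {
    int registered;
};

/* Hypothetical registration step; fail_register simulates an injected failure. */
static int register_transport(struct session *s, int fail_register)
{
    if (fail_register)
        return -1;
    s->registered = 1;
    return 0;
}

/* Create a session. If registration fails, undo the allocation and
 * report the failure so the caller never sees a half-built session. */
struct session *session_create(int fail_register)
{
    struct session *s = calloc(1, sizeof(*s));

    if (!s)
        return NULL;
    if (register_transport(s, fail_register) != 0)
        goto err_free;
    return s;

err_free:
    free(s);
    return NULL;
}
```

With this shape, `session_create(1)` returns `NULL`, so the caller learns that the login did not succeed, while `session_create(0)` returns a session whose `registered` field is set.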
    • scsi: iscsi: Don't destroy session if there are outstanding connections · 54155ed4
      Nick Black committed
      A faulty userspace that calls destroy_session() before destroying the
      connections can trigger the failure below.  This patch prevents the issue
      by refusing to destroy the session if there are outstanding connections.
      
      ------------[ cut here ]------------
      kernel BUG at mm/slub.c:306!
      invalid opcode: 0000 [#1] SMP PTI
      CPU: 1 PID: 1224 Comm: iscsid Not tainted 5.4.0-rc2.iscsi+ #7
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
      RIP: 0010:__slab_free+0x181/0x350
      [...]
      [ 1209.686056] RSP: 0018:ffffa93d4074fae0 EFLAGS: 00010246
      [ 1209.686694] RAX: ffff934efa5ad800 RBX: 000000008010000a RCX: ffff934efa5ad800
      [ 1209.687651] RDX: ffff934efa5ad800 RSI: ffffeb4041e96b00 RDI: ffff934efd402c40
      [ 1209.688582] RBP: ffffa93d4074fb80 R08: 0000000000000001 R09: ffffffffbb5dfa26
      [ 1209.689425] R10: ffff934efa5ad800 R11: 0000000000000001 R12: ffffeb4041e96b00
      [ 1209.690285] R13: ffff934efa5ad800 R14: ffff934efd402c40 R15: 0000000000000000
      [ 1209.691213] FS:  00007f7945dfb540(0000) GS:ffff934efda80000(0000) knlGS:0000000000000000
      [ 1209.692316] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1209.693013] CR2: 000055877fd3da80 CR3: 0000000077384000 CR4: 00000000000006e0
      [ 1209.693897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1209.694773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1209.695631] Call Trace:
      [ 1209.695957]  ? __wake_up_common_lock+0x8a/0xc0
      [ 1209.696712]  iscsi_pool_free+0x26/0x40
      [ 1209.697263]  iscsi_session_teardown+0x2f/0xf0
      [ 1209.698117]  iscsi_sw_tcp_session_destroy+0x45/0x60
      [ 1209.698831]  iscsi_if_rx+0xd88/0x14e0
      [ 1209.699370]  netlink_unicast+0x16f/0x200
      [ 1209.699932]  netlink_sendmsg+0x21a/0x3e0
      [ 1209.700446]  sock_sendmsg+0x4f/0x60
      [ 1209.700902]  ___sys_sendmsg+0x2ae/0x320
      [ 1209.701451]  ? cp_new_stat+0x150/0x180
      [ 1209.701922]  __sys_sendmsg+0x59/0xa0
      [ 1209.702357]  do_syscall_64+0x52/0x160
      [ 1209.702812]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [ 1209.703419] RIP: 0033:0x7f7946433914
      [...]
      [ 1209.706084] RSP: 002b:00007fffb99f2378 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [ 1209.706994] RAX: ffffffffffffffda RBX: 000055bc869eac20 RCX: 00007f7946433914
      [ 1209.708082] RDX: 0000000000000000 RSI: 00007fffb99f2390 RDI: 0000000000000005
      [ 1209.709120] RBP: 00007fffb99f2390 R08: 000055bc84fe9320 R09: 00007fffb99f1f07
      [ 1209.710110] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000038
      [ 1209.711085] R13: 000055bc8502306e R14: 0000000000000000 R15: 0000000000000000
       Modules linked in:
       ---[ end trace a2d933ede7f730d8 ]---
      
      Link: https://lore.kernel.org/r/20191226203148.2172200-1-krisman@collabora.com
      Signed-off-by: Nick Black <nlb@google.com>
      Co-developed-by: Salman Qazi <sqazi@google.com>
      Signed-off-by: Salman Qazi <sqazi@google.com>
      Co-developed-by: Junho Ryu <jayr@google.com>
      Signed-off-by: Junho Ryu <jayr@google.com>
      Co-developed-by: Khazhismel Kumykov <khazhy@google.com>
      Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
      Co-developed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
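The guard described in commit 54155ed4 amounts to refusing teardown while child objects still exist. A minimal sketch of that idea follows. It assumes a simple connection counter and an `-EBUSY` return value; both are illustrative choices, and the type and function names are hypothetical rather than the kernel's.

```c
#include <errno.h>

struct session {
    int nr_conns;   /* connections still attached to this session */
    int destroyed;
};

void conn_create(struct session *s)
{
    s->nr_conns++;
}

void conn_destroy(struct session *s)
{
    if (s->nr_conns > 0)
        s->nr_conns--;
}

/* Refuse to tear the session down while connections are outstanding,
 * so resources shared with those connections are not freed too early. */
int session_destroy(struct session *s)
{
    if (s->nr_conns > 0)
        return -EBUSY;
    s->destroyed = 1;
    return 0;
}
```

A misbehaving caller that invokes `session_destroy` first gets an error back and the session stays intact. Once every connection has been destroyed, the same call succeeds.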
  17. 10 Dec, 2019 1 commit
    • scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func · bba340c7
      Bo Wu committed
      In iscsi_if_rx func, after receiving one request through
      iscsi_if_recv_msg func, iscsi_if_send_reply will be called to try to
      reply to the request in a do-while loop.  If the iscsi_if_send_reply
      function keeps returning -EAGAIN, a deadlock will occur.
      
      For example, if a client only sends messages without ever calling recvmsg,
      the result is a watchdog soft lockup.  The details are given as
      follows:
      
      	/* src_addr is a struct sockaddr_nl and msg a struct msghdr, set up elsewhere. */
      	sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI);
      	retval = bind(sock_fd, (struct sockaddr *)&src_addr, sizeof(src_addr));
      	while (1) {
      		state_msg = sendmsg(sock_fd, &msg, 0);
      		/* Note: recvmsg(sock_fd, &msg, 0) is never called here. */
      	}
      	close(sock_fd);
      
      watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [netlink_test:253305] Sample time: 4000897528 ns(HZ: 250) Sample stat:
      curr: user: 675503481560, nice: 321724050, sys: 448689506750, idle: 4654054240530, iowait: 40885550700, irq: 14161174020, softirq: 8104324140, st: 0
      deta: user: 0, nice: 0, sys: 3998210100, idle: 0, iowait: 0, irq: 1547170, softirq: 242870, st: 0 Sample softirq:
               TIMER:        992
               SCHED:          8
      Sample irqstat:
               irq    2: delta       1003, curr:    3103802, arch_timer
      CPU: 7 PID: 253305 Comm: netlink_test Kdump: loaded Tainted: G           OE
      Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      pstate: 40400005 (nZcv daif +PAN -UAO)
      pc : __alloc_skb+0x104/0x1b0
      lr : __alloc_skb+0x9c/0x1b0
      sp : ffff000033603a30
      x29: ffff000033603a30 x28: 00000000000002dd
      x27: ffff800b34ced810 x26: ffff800ba7569f00
      x25: 00000000ffffffff x24: 0000000000000000
      x23: ffff800f7c43f600 x22: 0000000000480020
      x21: ffff0000091d9000 x20: ffff800b34eff200
      x19: ffff800ba7569f00 x18: 0000000000000000
      x17: 0000000000000000 x16: 0000000000000000
      x15: 0000000000000000 x14: 0001000101000100
      x13: 0000000101010000 x12: 0101000001010100
      x11: 0001010101010001 x10: 00000000000002dd
      x9 : ffff000033603d58 x8 : ffff800b34eff400
      x7 : ffff800ba7569200 x6 : ffff800b34eff400
      x5 : 0000000000000000 x4 : 00000000ffffffff
      x3 : 0000000000000000 x2 : 0000000000000001
      x1 : ffff800b34eff2c0 x0 : 0000000000000300
      Call trace:
      __alloc_skb+0x104/0x1b0
      iscsi_if_rx+0x144/0x12bc [scsi_transport_iscsi]
      netlink_unicast+0x1e0/0x258
      netlink_sendmsg+0x310/0x378
      sock_sendmsg+0x4c/0x70
      sock_write_iter+0x90/0xf0
      __vfs_write+0x11c/0x190
      vfs_write+0xac/0x1c0
      ksys_write+0x6c/0xd8
      __arm64_sys_write+0x24/0x30
      el0_svc_common+0x78/0x130
      el0_svc_handler+0x38/0x78
      el0_svc+0x8/0xc
      
      Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E3D4D2@dggeml505-mbx.china.huawei.com
      Signed-off-by: Bo Wu <wubo40@huawei.com>
      Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
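The lockup described in commit bba340c7 comes from retrying without limit while the reply keeps failing with -EAGAIN. One common way to avoid that is to bound the number of retries, sketched below. The bound `SEND_MAX_RETRIES`, the callback signature, and the demo sender are assumptions made for illustration; this is not presented as the actual change made by the patch.

```c
#include <errno.h>

#define SEND_MAX_RETRIES 10 /* hypothetical bound, chosen for illustration */

/* Retry a reply while it reports -EAGAIN, but give up after a fixed
 * number of attempts instead of spinning forever. */
int send_reply_bounded(int (*send_reply)(void *), void *ctx)
{
    int err;
    int attempts = 0;

    do {
        err = send_reply(ctx);
    } while (err == -EAGAIN && ++attempts < SEND_MAX_RETRIES);

    return err;
}

/* Demo sender modelling a client that never calls recvmsg: the receive
 * queue never drains, so every attempt fails with -EAGAIN.
 * ctx points to an int that counts the attempts made. */
int never_drained(void *ctx)
{
    (*(int *)ctx)++;
    return -EAGAIN;
}
```

With the demo sender, `send_reply_bounded` makes exactly `SEND_MAX_RETRIES` attempts and then returns `-EAGAIN` to its caller, which can drop the reply or report the error rather than holding the CPU.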
  18. 31 May, 2019 1 commit
  19. 21 May, 2019 1 commit
  20. 19 Mar, 2019 1 commit
    • scsi: iscsi: flush running unbind operations when removing a session · 165aa2bf
      Maurizio Lombardi committed
      In some cases, the iscsi_remove_session() function is called while an
      unbind_work operation is still running.  This may cause a situation where
      sysfs objects are removed in an incorrect order, triggering a kernel
      warning.
      
      [  605.249442] ------------[ cut here ]------------
      [  605.259180] sysfs group 'power' not found for kobject 'target2:0:0'
      [  605.321371] WARNING: CPU: 1 PID: 26794 at fs/sysfs/group.c:235 sysfs_remove_group+0x76/0x80
      [  605.341266] Modules linked in: dm_service_time target_core_user target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod nls_utf8 isofs ppdev bochs_drm nfit ttm libnvdimm drm_kms_helper syscopyarea sysfillrect sysimgblt joydev pcspkr fb_sys_fops drm i2c_piix4 sg parport_pc parport xfs libcrc32c dm_multipath sr_mod sd_mod cdrom ata_generic 8021q garp mrp ata_piix stp crct10dif_pclmul crc32_pclmul llc libata crc32c_intel virtio_net net_failover ghash_clmulni_intel serio_raw failover sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
      [  605.627479] CPU: 1 PID: 26794 Comm: kworker/u32:2 Not tainted 4.18.0-60.el8.x86_64 #1
      [  605.721401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
      [  605.823651] Workqueue: scsi_wq_2 __iscsi_unbind_session [scsi_transport_iscsi]
      [  605.830940] RIP: 0010:sysfs_remove_group+0x76/0x80
      [  605.922907] Code: 48 89 df 5b 5d 41 5c e9 38 c4 ff ff 48 89 df e8 e0 bf ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 38 73 cb a7 e8 24 77 d7 ff <0f> 0b 5b 5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55
      [  606.122304] RSP: 0018:ffffbadcc8d1bda8 EFLAGS: 00010286
      [  606.218492] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [  606.326381] RDX: ffff98bdfe85eb40 RSI: ffff98bdfe856818 RDI: ffff98bdfe856818
      [  606.514498] RBP: ffffffffa7ab73e0 R08: 0000000000000268 R09: 0000000000000007
      [  606.529469] R10: 0000000000000000 R11: ffffffffa860d9ad R12: ffff98bdf978e838
      [  606.630535] R13: ffff98bdc2cd4010 R14: ffff98bdc2cd3ff0 R15: ffff98bdc2cd4000
      [  606.824707] FS:  0000000000000000(0000) GS:ffff98bdfe840000(0000) knlGS:0000000000000000
      [  607.018333] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  607.117844] CR2: 00007f84b78ac024 CR3: 000000002c00a003 CR4: 00000000003606e0
      [  607.117844] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  607.420926] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  607.524236] Call Trace:
      [  607.530591]  device_del+0x56/0x350
      [  607.624393]  ? ata_tlink_match+0x30/0x30 [libata]
      [  607.727805]  ? attribute_container_device_trigger+0xb4/0xf0
      [  607.829911]  scsi_target_reap_ref_release+0x39/0x50
      [  607.928572]  scsi_remove_target+0x1a2/0x1d0
      [  608.017350]  __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
      [  608.117435]  process_one_work+0x1a7/0x360
      [  608.132917]  worker_thread+0x30/0x390
      [  608.222900]  ? pwq_unbound_release_workfn+0xd0/0xd0
      [  608.323989]  kthread+0x112/0x130
      [  608.418318]  ? kthread_bind+0x30/0x30
      [  608.513821]  ret_from_fork+0x35/0x40
      [  608.613909] ---[ end trace 0b98c310c8a6138c ]---
      Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
      Acked-by: Chris Leech <cleech@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  21. 21 Dec, 2018 1 commit
  22. 08 Nov, 2018 2 commits
  23. 30 May, 2018 1 commit
  24. 19 Apr, 2018 1 commit
    • scsi: iscsi: respond to netlink with unicast when appropriate · af170928
      Chris Leech committed
      Instead of always multicasting responses, send a unicast netlink message
      directed at the correct pid.  This will be needed if we ever want to
      support multiple userspace processes interacting with the kernel over
      iSCSI netlink simultaneously.  Limitations can currently be seen if you
      attempt to run multiple iscsistart commands in parallel.
      
      We've fixed up the userspace issues in iscsistart that prevented
      multiple instances from running, so now attempts to speed up booting by
      bringing up multiple iscsi sessions at once in the initramfs are just
      running into misrouted responses that this fixes.
      Signed-off-by: Chris Leech <cleech@redhat.com>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  25. 01 Nov, 2017 1 commit
  26. 03 Oct, 2017 1 commit