1. 12 Nov, 2016 1 commit
    • A
      scsi: mpt3sas: Fix secure erase premature termination · 18f6084a
      Andrey Grodzovsky authored
      This is a work around for a bug with LSI Fusion MPT SAS2 when performing
      secure erase. Due to the very long time the operation takes, commands
      issued during the erase will time out and will trigger execution of the
      abort hook. Even though the abort hook is called for the specific
      command which timed out, this leads to entire device halt
      (scsi_state terminated) and premature termination of the secure erase.
      
      Set device state to busy while ATA passthrough commands are in progress.
      
      [mkp: hand applied to 4.9/scsi-fixes, tweaked patch description]
      Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
      Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
      Cc: <linux-scsi@vger.kernel.org>
      Cc: Sathya Prakash <sathya.prakash@broadcom.com>
      Cc: Chaitra P B <chaitra.basappa@broadcom.com>
      Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
      Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 10 Nov, 2016 1 commit
  3. 09 Nov, 2016 3 commits
  4. 02 Nov, 2016 5 commits
    • B
      scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init · a5dd506e
      Bill Kuzeja authored
      A system can get hung task timeouts if a qlogic board fails during
      initialization (if the board breaks again or fails the init). The hang
      involves the scsi scan.
      
      In a nutshell, since commit beb9e315 ("qla2xxx: Prevent removal and
      board_disable race"):
      
      ...it is possible to have freed ha (base_vha->hw) early by a call to
      qla2x00_remove_one when pdev->enable_cnt equals zero:
      
             if (!atomic_read(&pdev->enable_cnt)) {
                     scsi_host_put(base_vha->host);
                     kfree(ha);
                     pci_set_drvdata(pdev, NULL);
                     return;
      
      Almost always, the scsi_host_put above frees the vha structure
      (attached to the end of the Scsi_Host we're putting) since it's the last
      put, and life is good.  However, if we are entering this routine because
      the adapter has broken sometime during initialization AND a scsi scan is
      already in progress (and has done its own scsi_host_get), vha will not
      be freed. What's worse, the scsi scan will access the freed ha structure
      through qla2xxx_scan_finished:
      
              if (time > vha->hw->loop_reset_delay * HZ)
                      return 1;
      
      The scsi scan keeps checking to see if a scan is complete by calling
      qla2xxx_scan_finished. There is a timeout value that limits the length
      of time a scan can take (hw->loop_reset_delay, usually set to 5
      seconds), but this definition is in the data structure (hw) that can get
      freed early.
      
      This can yield unpredictable results, the worst of which is that the
      scsi scan can hang indefinitely. This happens when the freed structure
      gets reused and loop_reset_delay gets overwritten with garbage, which
      the scan obliviously uses as its timeout value.
      
      The fix for this is simple: at the top of qla2xxx_scan_finished, check
      for the UNLOADING bit in the vha structure (_vha is not freed at this
      point).  If UNLOADING is set, we exit the scan for this adapter
      immediately. After this last reference to the ha structure, we'll exit
      the scan for this adapter, and continue on.
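
The early-exit check described above can be sketched in plain C as follows. This is a minimal userspace model, not the driver's code: the `model_*` types, `UNLOADING_BIT`, and the `HZ` value are assumptions introduced for illustration only.

```c
/* Simplified stand-ins for the driver structures (hypothetical names). */
#define HZ            100UL  /* illustrative tick rate, not a real kernel config */
#define UNLOADING_BIT 0x1UL  /* stand-in for the UNLOADING flag kept in vha */

struct model_hw {
    unsigned long loop_reset_delay;  /* scan timeout, in seconds */
};

struct model_vha {
    unsigned long flags;   /* lives in vha, which is still valid during the scan */
    struct model_hw *hw;   /* may already be freed when UNLOADING is set */
};

/* Returns 1 when the scan should stop, 0 when polling should continue. */
int model_scan_finished(const struct model_vha *vha, unsigned long time)
{
    /* Check the flag first so a possibly freed hw is never dereferenced. */
    if (vha->flags & UNLOADING_BIT)
        return 1;

    if (time > vha->hw->loop_reset_delay * HZ)
        return 1;

    return 0;
}
```

The ordering is the point of the sketch: the flag test reads only `vha`, so the timeout stored in `hw` is consulted only while `hw` is known to be live.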
      
      This problem is hard to hit, but I have run into it doing negative
      testing many times now (with a test specifically designed to bring it
      out), so I can verify that this fix works. My testing has been against a
      RHEL7 driver variant, but the bug and patch are equally relevant to
      the upstream driver.
      
      Fixes: beb9e315 ("qla2xxx: Prevent removal and board_disable race")
      Cc: <stable@vger.kernel.org> # v3.18+
      Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
      Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • B
      scsi: scsi_dh_alua: Fix a reference counting bug · df3d422c
      Bart Van Assche authored
      The code at the end of alua_rtpg_work() is as follows:
      
      	scsi_device_put(sdev);
      	kref_put(&pg->kref, release_port_group);
      
      In other words, alua_rtpg_queue() must hold an sdev reference and a pg
      reference before queueing rtpg work. If no rtpg work is queued no
      additional references should be held when alua_rtpg_queue() returns. If
      no rtpg work is queued, ensure that alua_rtpg_queue() only gives up the
      sdev reference if that reference was obtained by the same
      alua_rtpg_queue() call.
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Reported-by: Tang Junhui <tang.junhui@zte.com.cn>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Tang Junhui <tang.junhui@zte.com.cn>
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • D
      scsi: vmw_pvscsi: return SUCCESS for successful command aborts · aac173e9
      David Jeffery authored
      The vmw_pvscsi driver reports most successful aborts as FAILED to the
      scsi error handler.  This is due to a misunderstanding of how
      completion_done() works and its interaction with a successful wait using
      wait_for_completion_timeout().  The vmw_pvscsi driver is expecting
      completion_done() to always return true if complete() has been called on
      the completion structure.  But completion_done() returns true after
      complete() has been called only if no function like
      wait_for_completion_timeout() has seen the completion and cleared it as
      part of successfully waiting for the completion.
      
      Instead of using completion_done(), vmw_pvscsi should just use the
      return value from wait_for_completion_timeout() to know if the wait
      timed out or not.
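
The interaction described above can be illustrated with a small userspace model of a completion. This is only a sketch: the `model_*` functions and the `MODEL_SUCCESS`/`MODEL_FAILED` values are hypothetical stand-ins, and the wait never blocks (it assumes nothing else arrives while waiting).

```c
#include <stdbool.h>

/* Toy completion: 'done' counts complete() calls not yet consumed by a waiter. */
struct model_completion {
    unsigned int done;
};

enum { MODEL_FAILED = 0, MODEL_SUCCESS = 1 };  /* illustrative values only */

void model_complete(struct model_completion *c)
{
    c->done++;
}

/* True only while a completion is pending and has not been consumed. */
bool model_completion_done(const struct model_completion *c)
{
    return c->done != 0;
}

/* Returns 0 on timeout, non-zero on success. A successful wait consumes
 * the completion, which is what clears the state seen by completion_done. */
unsigned long model_wait_for_completion_timeout(struct model_completion *c,
                                                unsigned long timeout)
{
    if (c->done == 0)
        return 0;               /* model: nothing arrives while waiting */
    c->done--;
    return timeout ? timeout : 1;
}

/* Pattern described as broken: ask completion_done() after waiting. */
int model_abort_buggy(struct model_completion *c)
{
    model_wait_for_completion_timeout(c, 10);
    return model_completion_done(c) ? MODEL_SUCCESS : MODEL_FAILED;
}

/* Pattern described as the fix: trust the wait's return value. */
int model_abort_fixed(struct model_completion *c)
{
    return model_wait_for_completion_timeout(c, 10) ? MODEL_SUCCESS
                                                    : MODEL_FAILED;
}
```

In this model a completed abort is reported as failed by the first variant, because the successful wait has already consumed the completion by the time it is queried.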
      
      [mkp: bumped driver version per request]
      Signed-off-by: David Jeffery <djeffery@redhat.com>
      Reviewed-by: Laurence Oberman <loberman@redhat.com>
      Reviewed-by: Ewan D. Milne <emilne@redhat.com>
      Acked-by: Jim Gill <jgill@vmware.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • S
      scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk · 6d3a56ed
      Sreekanth Reddy authored
      While merging mpt3sas & mpt2sas code, we added the is_warpdrive check
      condition on the wrong line:
      
      ---------------------------------------------------------------------------
       scsih_target_alloc(struct scsi_target *starget)
                              sas_target_priv_data->handle = raid_device->handle;
                              sas_target_priv_data->sas_address = raid_device->wwid;
                              sas_target_priv_data->flags |= MPT_TARGET_FLAGS_VOLUME;
      -                       raid_device->starget = starget;
      +                       sas_target_priv_data->raid_device = raid_device;
      +                       if (ioc->is_warpdrive)
      +                               raid_device->starget = starget;
                      }
                      spin_unlock_irqrestore(&ioc->raid_device_lock, flags);
                      return 0;
      ------------------------------------------------------------------------------
      
      That check should instead guard the line sas_target_priv_data->raid_device =
      raid_device;

      Because of the hunk above, raid_device's starget is never initialized
      for RAID volumes. As a result, during RAID disk deletion the driver
      does not call scsi_remove_target(), because it sees the starget field
      of the raid_device structure as NULL.
      Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
      Cc: <stable@vger.kernel.org> # v4.4+
      Fixes: 7786ab6a ("mpt3sas: Ported WarpDrive product SSS6200 support")
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • T
      scsi: scsi_dh_alua: fix missing kref_put() in alua_rtpg_work() · 1fdd1427
      tang.junhui authored
      The reference count of pg leaks in alua_rtpg_work() because kref_put()
      is not called to drop the reference when the condition
      pg->rtpg_sdev == NULL is satisfied (which happens easily). This leaks
      the memory of pg.
      Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  5. 27 Oct, 2016 2 commits
  6. 25 Oct, 2016 1 commit
  7. 18 Oct, 2016 3 commits
    • A
      scsi: NCR5380: no longer mark irq probing as __init · 77f18a87
      Arnd Bergmann authored
      The g_NCR5380 has been converted to more regular probing, which
      means its probe function can now be invoked after the __init section
      is discarded, as pointed out by this kbuild warning:
      
      WARNING: drivers/scsi/built-in.o(.text+0x3a105): Section mismatch in reference from the function generic_NCR5380_isa_match() to the function .init.text:probe_intr()
      WARNING: drivers/scsi/built-in.o(.text+0x3a145): Section mismatch in reference from the function generic_NCR5380_isa_match() to the variable .init.data:probe_irq
      
      To make sure this works correctly in all cases, let's remove
      the __init and __initdata annotations.
      
      Fixes: a8cfbcae ("scsi: g_NCR5380: Stop using scsi_module.c")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • J
      scsi: be2iscsi: Replace _bh with _irqsave/irqrestore · 7d2c0d64
      Jitendra Bhivare authored
      [ 3843.132217] WARNING: CPU: 20 PID: 1227 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x90
      [ 3843.142815] Modules linked in:
      ...
      [ 3843.294328] CPU: 20 PID: 1227 Comm: kworker/20:1H Tainted: G            E   4.8.0-rc1+ #3
      [ 3843.304944] Hardware name: Dell Inc. PowerEdge R720/0X6H47, BIOS 1.4.8 10/25/2012
      [ 3843.314798] Workqueue: kblockd blk_timeout_work
      [ 3843.321350]  0000000000000086 00000000a32f4533 ffff8802216d7bd8 ffffffff8135c3cf
      [ 3843.331146]  0000000000000000 0000000000000000 ffff8802216d7c18 ffffffff8108d661
      [ 3843.340918]  00000096216d7c50 0000000000000200 ffff8802d07cc828 ffff8801b3632550
      [ 3843.350687] Call Trace:
      [ 3843.354866]  [<ffffffff8135c3cf>] dump_stack+0x63/0x84
      [ 3843.362061]  [<ffffffff8108d661>] __warn+0xd1/0xf0
      [ 3843.368851]  [<ffffffff8108d79d>] warn_slowpath_null+0x1d/0x20
      [ 3843.376791]  [<ffffffff810930eb>] __local_bh_enable_ip+0x6b/0x90
      [ 3843.384903]  [<ffffffff816fe7be>] _raw_spin_unlock_bh+0x1e/0x20
      [ 3843.392940]  [<ffffffffa085f710>] beiscsi_alloc_pdu+0x2f0/0x6e0 [be2iscsi]
      [ 3843.402076]  [<ffffffffa06bc358>] __iscsi_conn_send_pdu+0xf8/0x370 [libiscsi]
      [ 3843.411549]  [<ffffffffa06bc6fe>] iscsi_send_nopout+0xbe/0x110 [libiscsi]
      [ 3843.420639]  [<ffffffffa06bd98b>] iscsi_eh_cmd_timed_out+0x29b/0x2b0 [libiscsi]
      [ 3843.430339]  [<ffffffff814cd1de>] scsi_times_out+0x5e/0x250
      [ 3843.438119]  [<ffffffff813374af>] blk_rq_timed_out+0x1f/0x60
      [ 3843.446009]  [<ffffffff8133759d>] blk_timeout_work+0xad/0x150
      [ 3843.454010]  [<ffffffff810a6642>] process_one_work+0x152/0x400
      [ 3843.462114]  [<ffffffff810a6f35>] worker_thread+0x125/0x4b0
      [ 3843.469961]  [<ffffffff810a6e10>] ? rescuer_thread+0x380/0x380
      [ 3843.478116]  [<ffffffff810aca28>] kthread+0xd8/0xf0
      [ 3843.485212]  [<ffffffff816fedff>] ret_from_fork+0x1f/0x40
      [ 3843.492908]  [<ffffffff810ac950>] ? kthread_park+0x60/0x60
      [ 3843.500715] ---[ end trace 57ec0a1d8f0dd3a0 ]---
      [ 3852.328667] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1Kernel panic - not syncing: Hard LOCKUP
      
      blk_timeout_work takes queue_lock spin_lock with interrupts disabled
      before invoking iscsi_eh_cmd_timed_out. This causes a WARN_ON_ONCE in
      spin_unlock_bh for wrb_lock/io_sgl_lock/mgmt_sgl_lock.
      
      The CPU was kept busy with a lot of bottom-half work while interrupts
      were disabled, causing the hard lockup.
      Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Chris Leech <cleech@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • J
      scsi: libiscsi: Fix locking in __iscsi_conn_send_pdu · 4fa50799
      Jitendra Bhivare authored
      The code at the free_task label in __iscsi_conn_send_pdu can be executed
      from blk_timeout_work, which takes queue_lock using spin_lock_irq.
      Releasing back_lock with spin_unlock_bh in that context triggers a
      WARN_ON_ONCE.  The code is always executed with either bottom halves or
      IRQs disabled, so using spin_lock/spin_unlock for back_lock is safe.
      Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Chris Leech <cleech@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  8. 15 Oct, 2016 2 commits
    • B
      scsi: ipr: Fix async error WARN_ON · 8a4236a2
      Brian King authored
      Commit afc3f83c ("scsi: ipr: Add asynchronous error notification")
      introduced the WARN_ON shown below. To fix this, rather than attempting
      to send the KOBJ_CHANGE uevent from interrupt context, which is what is
      causing the WARN_ON, just wake the ipr worker thread which will send a
      KOBJ_CHANGE uevent.
      
      [  142.278120] WARNING: CPU: 15 PID: 0 at kernel/softirq.c:161 __local_bh_enable_ip+0x7c/0xd0
      [  142.278124] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ses enclosure scsi_transport_sas sg pseries_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod sd_mod cdrom ipr libata ibmvscsi scsi_transport_srp ibmveth dm_mirror dm_region_hash dm_log dm_mod
      [  142.278208] CPU: 15 PID: 0 Comm: swapper/15 Not tainted 4.8.0.ipr+ #21
      [  142.278213] task: c00000010cf24480 task.stack: c00000010cfec000
      [  142.278217] NIP: c0000000000c0c7c LR: c000000000881778 CTR: c0000000003c5bf0
      [  142.278221] REGS: c00000010cfef080 TRAP: 0700   Not tainted  (4.8.0.ipr+)
      [  142.278224] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28008022  XER: 2000000f
      [  142.278236] CFAR: c0000000000c0c20 SOFTE: 0
      GPR00: c000000000706c78 c00000010cfef300 c000000000f91d00 c000000000706c78
      GPR04: 0000000000000200 c000000000f7bc80 0000000000000000 00000000024000c0
      GPR08: 0000000000000000 0000000000000001 c000000000ee1d00 c000000000a9bdd0
      GPR12: c0000000003c5bf0 c00000000eb22d00 c000000100ca3880 c00000020ed38400
      GPR16: 0000000000000000 0000000000000000 c000000100940508 0000000000000000
      GPR20: 0000000000000000 0000000000000000 0000000000000000 00000000024000c0
      GPR24: c0000000004588e0 c00000010863bd00 c00000010863bd00 c0000000013773f8
      GPR28: c000000000f7bc80 0000000000000000 ffffffffffffffff c000000000f7bcd8
      [  142.278290] NIP [c0000000000c0c7c] __local_bh_enable_ip+0x7c/0xd0
      [  142.278296] LR [c000000000881778] _raw_spin_unlock_bh+0x38/0x60
      [  142.278299] Call Trace:
      [  142.278303] [c00000010cfef300] [c000000000f7bc80] init_net+0x0/0x1900 (unreliable)
      [  142.278310] [c00000010cfef320] [c000000000706c78] peernet2id+0x58/0x80
      [  142.278316] [c00000010cfef370] [c00000000075caec] netlink_broadcast_filtered+0x30c/0x550
      [  142.278323] [c00000010cfef430] [c000000000459078] kobject_uevent_env+0x588/0x780
      [  142.278331] [c00000010cfef510] [d000000003163a6c] ipr_process_error+0x11c/0x240 [ipr]
      [  142.278337] [c00000010cfef5c0] [d000000003152298] ipr_fail_all_ops+0x108/0x220 [ipr]
      [  142.278343] [c00000010cfef670] [d0000000031643f8] ipr_reset_restore_cfg_space+0xa8/0x240 [ipr]
      [  142.278350] [c00000010cfef6f0] [d000000003158a00] ipr_reset_ioa_job+0x80/0xe0 [ipr]
      [  142.278356] [c00000010cfef720] [d000000003153f78] ipr_reset_timer_done+0xa8/0xe0 [ipr]
      [  142.278363] [c00000010cfef770] [c000000000149c88] call_timer_fn+0x58/0x1c0
      [  142.278368] [c00000010cfef800] [c000000000149f60] expire_timers+0x140/0x200
      [  142.278373] [c00000010cfef870] [c00000000014a0e8] run_timer_softirq+0xc8/0x230
      [  142.278379] [c00000010cfef900] [c0000000000c0844] __do_softirq+0x164/0x3c0
      [  142.278384] [c00000010cfef9f0] [c0000000000c0f18] irq_exit+0x1a8/0x1c0
      [  142.278389] [c00000010cfefa20] [c000000000020b54] timer_interrupt+0xa4/0xe0
      [  142.278394] [c00000010cfefa50] [c000000000002414] decrementer_common+0x114/0x180
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • D
      scsi: zfcp: spin_lock_irqsave() is not nestable · e7cb08e8
      Dan Carpenter authored
      We accidentally overwrite the original saved value of "flags" so that we
      can't re-enable IRQs at the end of the function.  Presumably this
      function is mostly called with IRQs disabled or it would be obvious in
      testing.
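
Why a shared "flags" variable breaks nesting can be shown with a toy model of the interrupt state. This is a sketch under stated assumptions: `model_irqsave()` and `model_irqrestore()` are hypothetical stand-ins for the save/restore halves of spin_lock_irqsave() and spin_unlock_irqrestore(), and no real locking is performed.

```c
#include <stdbool.h>

/* Toy CPU interrupt state for the model. */
static bool model_irqs_enabled = true;

/* Save the current state into *flags, then disable interrupts. */
static void model_irqsave(unsigned long *flags)
{
    *flags = model_irqs_enabled ? 1UL : 0UL;
    model_irqs_enabled = false;
}

/* Restore whatever state was saved in flags. */
static void model_irqrestore(unsigned long flags)
{
    model_irqs_enabled = (flags != 0);
}

/* Nested save into the SAME variable: the inner call overwrites the
 * outer saved value with "disabled". Returns the final IRQ state. */
bool model_nested_shared_flags(void)
{
    unsigned long flags;

    model_irqs_enabled = true;
    model_irqsave(&flags);      /* outer: saves "enabled" */
    model_irqsave(&flags);      /* inner: overwrites it with "disabled" */
    model_irqrestore(flags);    /* inner unlock */
    model_irqrestore(flags);    /* outer unlock restores the wrong value */
    return model_irqs_enabled;
}

/* Inner section that leaves flags alone, as a plain lock would. */
bool model_nested_plain_inner(void)
{
    unsigned long flags;

    model_irqs_enabled = true;
    model_irqsave(&flags);      /* outer: saves "enabled" */
    /* inner critical section runs with IRQs already off */
    model_irqrestore(flags);    /* outer unlock restores "enabled" */
    return model_irqs_enabled;
}
```

In the model, the shared-variable version ends with interrupts still disabled, while the version whose inner section does not touch `flags` ends with them re-enabled.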
      
      Fixes: aceeffbb ("zfcp: trace full payload of all SAN records (req,resp,iels)")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  9. 12 Oct, 2016 3 commits
    • M
      scsi: Remove one useless stack variable · 03eb6b8d
      Ming Lei authored
      The local variable 'devname' in scsi_report_lun_scan() isn't used any
      more, so remove it.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • M
      scsi: Fix use-after-free · bcd8f2e9
      Ming Lei authored
      This patch fixes one use-after-free report[1] by KASAN.
      
      In __scsi_scan_target(), when a type 31 device is probed,
      SCSI_SCAN_TARGET_PRESENT is returned and the target will be scanned
      again.
      
      Inside the subsequent scsi_report_lun_scan(), a new scsi_device
      instance is allocated, and scsi_probe_and_add_lun() is called again to
      probe the target, which still reports a type 31 device. Finally,
      __scsi_remove_device() is called to remove and free the device at the
      end of scsi_probe_and_add_lun(), which causes a use-after-free in
      scsi_report_lun_scan().
      
      And the following SCSI log can be observed:
      
      	scsi 0:0:2:0: scsi scan: INQUIRY pass 1 length 36
      	scsi 0:0:2:0: scsi scan: INQUIRY successful with code 0x0
      	scsi 0:0:2:0: scsi scan: peripheral device type of 31, no device added
      	scsi 0:0:2:0: scsi scan: Sending REPORT LUNS to (try 0)
      	scsi 0:0:2:0: scsi scan: REPORT LUNS successful (try 0) result 0x0
      	scsi 0:0:2:0: scsi scan: REPORT LUN scan
      	scsi 0:0:2:0: scsi scan: INQUIRY pass 1 length 36
      	scsi 0:0:2:0: scsi scan: INQUIRY successful with code 0x0
      	scsi 0:0:2:0: scsi scan: peripheral device type of 31, no device added
      	BUG: KASAN: use-after-free in __scsi_scan_target+0xbf8/0xe40 at addr ffff88007b44a104
      
      This patch fixes the issue by moving the reference put to
      the end of scsi_report_lun_scan().
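
The effect of moving the put can be sketched with a toy reference count. This is a model only: `model_sdev` and the `model_*` functions are hypothetical, and the object is merely marked as freed rather than actually released, so the "stale" access in the model is safe to run.

```c
#include <stdbool.h>

/* Toy device object with a reference count. */
struct model_sdev {
    int refs;
    bool freed;  /* set when the last reference is dropped */
};

static void model_put(struct model_sdev *sdev)
{
    if (--sdev->refs == 0)
        sdev->freed = true;  /* real code would release the memory here */
}

/* Put first, use afterwards. Returns true if the device was still
 * live at the time of the last access. */
bool model_scan_put_early(struct model_sdev *sdev)
{
    model_put(sdev);          /* reference dropped too soon */
    return !sdev->freed;      /* later access sees a freed object */
}

/* Use first, put as the final step. */
bool model_scan_put_late(struct model_sdev *sdev)
{
    bool live = !sdev->freed; /* access happens while the reference is held */

    model_put(sdev);          /* drop the reference at the very end */
    return live;
}
```

When the caller holds the last reference, the early-put variant touches the object after it has been released, whereas the late-put variant finishes all accesses before the release.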
      
      [1] KASAN report
      ==================================================================
      [    3.274597] PM: Adding info for serio:serio1
      [    3.275127] BUG: KASAN: use-after-free in __scsi_scan_target+0xd87/0xdf0 at addr ffff880254d8c304
      [    3.275653] Read of size 4 by task kworker/u10:0/27
      [    3.275903] CPU: 3 PID: 27 Comm: kworker/u10:0 Not tainted 4.8.0 #2121
      [    3.276258] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [    3.276797] Workqueue: events_unbound async_run_entry_fn
      [    3.277083]  ffff880254d8c380 ffff880259a37870 ffffffff94bbc6c1 ffff880078402d80
      [    3.277532]  ffff880254d8bb80 ffff880259a37898 ffffffff9459fec1 ffff880259a37930
      [    3.277989]  ffff880254d8bb80 ffff880078402d80 ffff880259a37920 ffffffff945a0165
      [    3.278436] Call Trace:
      [    3.278528]  [<ffffffff94bbc6c1>] dump_stack+0x65/0x84
      [    3.278797]  [<ffffffff9459fec1>] kasan_object_err+0x21/0x70
      [    3.279063] device: 'psaux': device_add
      [    3.279616]  [<ffffffff945a0165>] kasan_report_error+0x205/0x500
      [    3.279651] PM: Adding info for No Bus:psaux
      [    3.280202]  [<ffffffff944ecd22>] ? kfree_const+0x22/0x30
      [    3.280486]  [<ffffffff94bc2dc9>] ? kobject_release+0x119/0x370
      [    3.280805]  [<ffffffff945a0543>] __asan_report_load4_noabort+0x43/0x50
      [    3.281170]  [<ffffffff9507e1f7>] ? __scsi_scan_target+0xd87/0xdf0
      [    3.281506]  [<ffffffff9507e1f7>] __scsi_scan_target+0xd87/0xdf0
      [    3.281848]  [<ffffffff9507d470>] ? scsi_add_device+0x30/0x30
      [    3.282156]  [<ffffffff94f7f660>] ? pm_runtime_autosuspend_expiration+0x60/0x60
      [    3.282570]  [<ffffffff956ddb07>] ? _raw_spin_lock+0x17/0x40
      [    3.282880]  [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160
      [    3.283200]  [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0
      [    3.283563]  [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250
      [    3.283882]  [<ffffffff9507efc1>] do_scan_async+0x41/0x450
      [    3.284173]  [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610
      [    3.284492]  [<ffffffff941a8954>] ? pwq_dec_nr_in_flight+0x124/0x2a0
      [    3.284876]  [<ffffffff941d1770>] ? preempt_count_add+0x130/0x160
      [    3.285207]  [<ffffffff941a9a84>] process_one_work+0x544/0x12d0
      [    3.285526]  [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0
      [    3.285844]  [<ffffffff941aa810>] ? process_one_work+0x12d0/0x12d0
      [    3.286182]  [<ffffffff941bb365>] kthread+0x1c5/0x260
      [    3.286443]  [<ffffffff940855cd>] ? __switch_to+0x88d/0x1430
      [    3.286745]  [<ffffffff941bb1a0>] ? kthread_worker_fn+0x5a0/0x5a0
      [    3.287085]  [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40
      [    3.287368]  [<ffffffff941bb1a0>] ? kthread_worker_fn+0x5a0/0x5a0
      [    3.287697] Object at ffff880254d8bb80, in cache kmalloc-2048 size: 2048
      [    3.288064] Allocated:
      [    3.288147] PID = 27
      [    3.288218]  [<ffffffff940b27ab>] save_stack_trace+0x2b/0x50
      [    3.288531]  [<ffffffff9459f246>] save_stack+0x46/0xd0
      [    3.288806]  [<ffffffff9459f4bd>] kasan_kmalloc+0xad/0xe0
      [    3.289098]  [<ffffffff9459c07e>] __kmalloc+0x13e/0x250
      [    3.289378]  [<ffffffff95078e5a>] scsi_alloc_sdev+0xea/0xcf0
      [    3.289701]  [<ffffffff9507de76>] __scsi_scan_target+0xa06/0xdf0
      [    3.290034]  [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160
      [    3.290362]  [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0
      [    3.290724]  [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250
      [    3.291055]  [<ffffffff9507efc1>] do_scan_async+0x41/0x450
      [    3.291354]  [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610
      [    3.291695]  [<ffffffff941a9a84>] process_one_work+0x544/0x12d0
      [    3.292022]  [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0
      [    3.292325]  [<ffffffff941bb365>] kthread+0x1c5/0x260
      [    3.292594]  [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40
      [    3.292886] Freed:
      [    3.292945] PID = 27
      [    3.293016]  [<ffffffff940b27ab>] save_stack_trace+0x2b/0x50
      [    3.293327]  [<ffffffff9459f246>] save_stack+0x46/0xd0
      [    3.293600]  [<ffffffff9459fa61>] kasan_slab_free+0x71/0xb0
      [    3.293916]  [<ffffffff9459bac2>] kfree+0xa2/0x1f0
      [    3.294168]  [<ffffffff9508158a>] scsi_device_dev_release_usercontext+0x50a/0x730
      [    3.294598]  [<ffffffff941ace9a>] execute_in_process_context+0xda/0x130
      [    3.294974]  [<ffffffff9508107c>] scsi_device_dev_release+0x1c/0x20
      [    3.295322]  [<ffffffff94f566f6>] device_release+0x76/0x1e0
      [    3.295626]  [<ffffffff94bc2db7>] kobject_release+0x107/0x370
      [    3.295942]  [<ffffffff94bc29ce>] kobject_put+0x4e/0xa0
      [    3.296222]  [<ffffffff94f56e17>] put_device+0x17/0x20
      [    3.296497]  [<ffffffff9505201c>] scsi_device_put+0x7c/0xa0
      [    3.296801]  [<ffffffff9507e1bc>] __scsi_scan_target+0xd4c/0xdf0
      [    3.297132]  [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160
      [    3.297458]  [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0
      [    3.297829]  [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250
      [    3.298156]  [<ffffffff9507efc1>] do_scan_async+0x41/0x450
      [    3.298453]  [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610
      [    3.298777]  [<ffffffff941a9a84>] process_one_work+0x544/0x12d0
      [    3.299105]  [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0
      [    3.299408]  [<ffffffff941bb365>] kthread+0x1c5/0x260
      [    3.299676]  [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40
      [    3.299967] Memory state around the buggy address:
      [    3.300209]  ffff880254d8c200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [    3.300608]  ffff880254d8c280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [    3.300986] >ffff880254d8c300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [    3.301408]                    ^
      [    3.301550]  ffff880254d8c380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [    3.301987]  ffff880254d8c400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [    3.302396]
      ==================================================================
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • X
      scsi: Replace wrong device handler name for CLARiiON arrays · 0ba43a81
      Xose Vazquez Perez authored
      At drivers/scsi/device_handler/scsi_dh_emc.c it was defined as:
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>
      Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: SCSI ML <linux-scsi@vger.kernel.org>
      Cc: device-mapper development <dm-devel@redhat.com>
      Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  10. 10 Oct, 2016 1 commit
    • L
      printk: reinstate KERN_CONT for printing continuation lines · 4bcc595c
      Linus Torvalds authored
      Long long ago the kernel log buffer was a buffered stream of bytes, very
      much like stdio in user space.  It supported log levels by scanning the
      stream and noticing the log level markers at the beginning of each line,
      but if you wanted to print a partial line in multiple chunks, you just
      did multiple printk() calls, and it just automatically worked.
      
      Except when it didn't, and you had very confusing output when different
      lines got all mixed up with each other.  Then you got fragment lines
      mixing with each other, or with non-fragment lines, because it was
      traditionally impossible to tell whether a printk() call was a
      continuation or not.
      
      To at least help clarify the issue of continuation lines, we added a
      KERN_CONT marker back in 2007 to mark continuation lines:
      
        47492527 ("printk: add KERN_CONT annotation").
      
      That continuation marker was initially an empty string, and didn't
      actually make any semantic difference.  But it at least made it possible
      to annotate the source code, and have check-patch notice that a printk()
      didn't need or want a log level marker, because it was a continuation of
      a previous line.
      
      To avoid the ambiguity between a continuation line that had that
      KERN_CONT marker, and a printk with no level information at all, we then
      in 2009 made KERN_CONT be a real log level marker which meant that we
      could now reliably tell the difference between the two cases.
      
        5fd29d6c ("printk: clean up handling of log-levels and newlines")
      
      and we could take advantage of that to make sure we didn't mix up
      continuation lines with lines that just didn't have any loglevel at all.
      
      Then, in 2012, the kernel log buffer was changed to be a "record" based
      log, where each line was a record that has a loglevel and a timestamp.
      
      You can see the beginning of that conversion in commits
      
        e11fea92 ("kmsg: export printk records to the /dev/kmsg interface")
        7ff9554b ("printk: convert byte-buffer to variable-length record buffer")
      
      with a number of follow-up commits to fix some painful fallout from that
      conversion.  Over all, it took a couple of months to sort out most of
      it.  But the upside was that you could have concurrent readers (and
      writers) of the kernel log and not have lines with mixed output in them.
      
      And one particular pain-point for the record-based kernel logging was
      exactly the fragmentary lines that are generated in smaller chunks.  In
      order to still log them as one record, the continuation lines need to be
      attached to the previous record properly.
      
      However the explicit continuation record marker that is actually useful
      for this exact case was removed at around the same time by commit
      
        61e99ab8 ("printk: remove the now unnecessary "C" annotation for KERN_CONT")
      
      due to the incorrect belief that KERN_CONT wasn't meaningful.  The
      ambiguity between "is this a continuation line" or "is this a plain
      printk with no log level information" was reintroduced, and in fact
      became an even bigger pain point because there was now the whole
      record-level merging of kernel messages going on.
      
      This patch reinstates the KERN_CONT as a real non-empty string marker,
      so that the ambiguity is fixed once again.
      
      But it's not a plain revert of that original removal: in the four years
      since we made KERN_CONT an empty string again, not only has the format
      of the log level markers changed, we've also had some usage changes in
      this area.
      
      For example, some ACPI code seems to use KERN_CONT _together_ with a log
      level, and now uses both the KERN_CONT marker and (for example) a
      KERN_INFO marker to show that it's an informational continuation of a
      line.
      
      Which is actually not a bad idea - if the continuation line cannot be
      attached to its predecessor, without the log level information we don't
      know what log level to assign to it (and we traditionally just assigned
      it the default loglevel).  So having both a log level and the KERN_CONT
      marker is not necessarily a bad idea, but it does mean that we need to
      actually iterate over potentially multiple markers, rather than just a
      single one.
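
      The need to iterate over several markers can be illustrated with a
      small userspace sketch.  It assumes the marker encoding used by
      kernels that carry this change (an ASCII SOH byte followed by a level
      digit, or by 'c' for KERN_CONT).  The struct and function names are
      hypothetical, and this is not the kernel's own parser.

```c
#include <stdbool.h>
#include <stddef.h>

#define SOH '\001'          /* start-of-header byte that introduces a marker */

struct parsed_prefix {      /* hypothetical result type for this sketch */
    int level;              /* 0..7, or -1 if no level marker was present */
    bool cont;              /* true if a continuation marker was present */
    const char *text;       /* first byte after all recognised markers */
};

/* Consume every leading marker instead of stopping after the first one. */
static struct parsed_prefix parse_prefix(const char *msg)
{
    struct parsed_prefix p = { -1, false, msg };

    while (p.text[0] == SOH && p.text[1] != '\0') {
        char c = p.text[1];

        if (c >= '0' && c <= '7')
            p.level = c - '0';      /* log level marker */
        else if (c == 'c')
            p.cont = true;          /* continuation marker */
        else
            break;                  /* unknown marker: leave it in the text */
        p.text += 2;
    }
    return p;
}
```

      With this loop, a fragment that carries both a continuation marker
      and a level marker yields both pieces of information, so a fragment
      that cannot be merged still has a usable log level.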
      
      Also, since KERN_CONT was still conceptually needed, and encouraged, but
      didn't actually _do_ anything, we've also had the reverse problem:
      rather than having too many annotations it has too few, and there is bit
      rot with code that no longer marks the continuation lines with the
      KERN_CONT marker.
      
      So this patch not only re-instates the non-empty KERN_CONT marker, it
      also fixes up the cases of bit-rot I noticed in my own logs.
      
      There are probably other cases where KERN_CONT will need to be
      added, either because it is new code that never dealt with the need for
      KERN_CONT, or old code that has bitrotted without anybody noticing.
      
      That said, we should strive to avoid the need for KERN_CONT.  It does
      result in real problems for logging, and should generally not be seen as
      a good feature.  If we some day can get rid of the feature entirely,
      because nobody does any fragmented printk calls, that would be lovely.
      
      But until that point, let's at least mark the code that relies on the hacky
      multi-fragment kernel printk's.  Not only does it avoid the ambiguity,
      it also annotates code as "maybe this would be good to fix some day".
      
      (That said, particularly during single-threaded bootup, the downsides of
      KERN_CONT are very limited.  Things get much hairier when you have
      multiple threads going on and user level reading and writing logs too).
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4bcc595c
  11. 09 Oct, 2016 5 commits
  12. 08 Oct, 2016 13 commits
    • C
      wan/fsl_ucc_hdlc: Fix size used in dma_free_coherent() · 776482cd
      Christophe Jaillet committed
      Size used with 'dma_alloc_coherent()' and 'dma_free_coherent()' should be
      consistent.
      Here, the size of a pointer is used in dma_alloc... and the size of the
      pointed structure is used in dma_free...
      
      This has been spotted with coccinelle, using the following script:
      ////////////////////
      @r@
      expression x0, x1, y0, y1, z0, z1, t0, t1, ret;
      @@
      
      *   ret = dma_alloc_coherent(x0, y0, z0, t0);
          ...
      *   dma_free_coherent(x1, y1, ret, t1);
      
      @script:python@
      y0 << r.y0;
      y1 << r.y1;
      
      @@
      if y1.find(y0) == -1:
       print("WARNING: sizes look different:  '%s'   vs   '%s'" % (y0, y1))
      ////////////////////
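
      One way to keep the two sizes consistent is to record the size once
      at allocation time and reuse it when freeing.  The sketch below is a
      userspace illustration only: the wrapper type and function names are
      hypothetical, and calloc()/free() stand in for dma_alloc_coherent()
      and dma_free_coherent().

```c
#include <stdlib.h>

struct coherent_buf {       /* hypothetical wrapper, not a kernel type */
    void *cpu_addr;
    size_t size;            /* recorded once, reused on free */
};

/* Allocate nmemb elements of elem_size bytes and remember the total size. */
static int buf_alloc(struct coherent_buf *b, size_t nmemb, size_t elem_size)
{
    b->size = nmemb * elem_size;
    b->cpu_addr = calloc(1, b->size);   /* stands in for dma_alloc_coherent() */
    return b->cpu_addr ? 0 : -1;
}

/* Release the buffer with the recorded size and return that size. */
static size_t buf_free(struct coherent_buf *b)
{
    size_t freed = b->size;             /* same size as at allocation time */

    free(b->cpu_addr);                  /* stands in for dma_free_coherent() */
    b->cpu_addr = NULL;
    b->size = 0;
    return freed;
}
```

      Because the free path never recomputes the size, a sizeof(pointer)
      versus sizeof(structure) mismatch cannot creep in between the two
      calls.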
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      776482cd
    • N
      net: macb: NULL out phydev after removing mdio bus · fa6114d4
      Nathan Sullivan committed
      To ensure the dev->phydev pointer is not used after becoming invalid in
      mdiobus_unregister, set it to NULL. This happens when removing the macb
      driver without first taking its interface down, since unregister_netdev
      will end up calling macb_close.
      Signed-off-by: Xander Huff <xander.huff@ni.com>
      Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
      Signed-off-by: Brad Mouring <brad.mouring@ni.com>
      Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa6114d4
    • P
      xen-netback: make sure that hashes are not send to unaware frontends · 912e27e8
      Paul Durrant committed
      In the case when a frontend only negotiates a single queue with xen-
      netback it is possible for a skbuff with a s/w hash to result in a
      hash extra_info segment being sent to the frontend even when no hash
      algorithm has been configured. (The ndo_select_queue() entry point makes
      sure the hash is not set if no algorithm is configured, but this entry
      point is not called when there is only a single queue). This can result
      in a frontend that is unable to handle extra_info segments being given
      such a segment, causing it to crash.
      
      This patch fixes the problem by clearing the hash in ndo_start_xmit()
      instead, which is clearly guaranteed to be called irrespective of the
      number of queues.
      Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
      Cc: Wei Liu <wei.liu2@citrix.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      912e27e8
    • A
      vfs: Remove {get,set,remove}xattr inode operations · fd50ecad
      Andreas Gruenbacher committed
      These inode operations are no longer used; remove them.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      fd50ecad
    • P
      console: don't prefer first registered if DT specifies stdout-path · 05fd007e
      Paul Burton committed
      If a device tree specifies a preferred device for kernel console output
      via the stdout-path or linux,stdout-path chosen node properties or the
      stdout alias then the kernel ought to honor it & output the kernel
      console to that device.  As it stands, this isn't the case.  Whilst we
      parse the stdout-path properties & set an of_stdout variable from
      of_alias_scan(), and use that from of_console_check() to determine
      whether to add a console device as a preferred console whilst
      registering it, we also prefer the first registered console if no other
      has been selected at the time of its registration.
      
      This means that if a console other than the one the device tree selects
      via stdout-path is registered first, we will switch to using it & when
      the stdout-path console is later registered the call to
      add_preferred_console() via of_console_check() is too late to do
      anything useful.  In practice this seems to mean that we switch to the
      dummy console device fairly early & see no further console output:
      
          Console: colour dummy device 80x25
          console [tty0] enabled
          bootconsole [ns16550a0] disabled
      
      Fix this by not automatically preferring the first registered console if
      one is specified by the device tree.  This allows consoles to be
      registered but not enabled, and once the driver for the console selected
      by stdout-path calls of_console_check() the driver will be added to the
      list of preferred consoles before any other console has been enabled.
      When that console is then registered via register_console() it will be
      enabled as expected.
      
      Link: http://lkml.kernel.org/r/20160809151937.26118-1-paul.burton@imgtec.com
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Ivan Delalande <colona@arista.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jan Kara <jack@suse.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      05fd007e
    • A
      cred: simpler, 1D supplementary groups · 81243eac
      Alexey Dobriyan committed
      Current supplementary groups code can massively overallocate memory and
      is implemented in a way so that access to individual gid is done via 2D
      array.
      
      If number of gids is <= 32, memory allocation is more or less tolerable
      (140/148 bytes).  But if it is not, code allocates full page (!)
      regardless and, what's even more fun, doesn't reuse small 32-entry
      array.
      
      2D array means dependent shifts, loads and LEAs without possibility to
      optimize them (gid is never known at compile time).
      
      All of the above is unnecessary.  Switch to the usual
      trailing-zero-len-array scheme.  Memory is allocated with
      kmalloc/vmalloc() and only as much as needed.  Accesses become simpler
      (LEA 8(gi,idx,4) or even without displacement).
      
      Maximum number of gids is 65536 which translates to 256KB+8 bytes.  I
      think kernel can handle such allocation.
      
      On my usual desktop system with whole 9 (nine) aux groups, struct
      group_info shrinks from 148 bytes to 44 bytes, yay!
      
      Nice side effects:
      
       - "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,
      
       - fix little mess in net/ipv4/ping.c
         should have been using GROUP_AT macro but this point becomes moot,
      
       - aux group allocation is persistent and should be accounted as such.
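
      The trailing-array layout and the size figures quoted above can be
      checked with a short userspace sketch.  It assumes 4-byte int and
      4-byte gid values; the type and function names are hypothetical
      stand-ins, not the kernel definitions.

```c
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int gid_sketch_t;      /* stand-in for a 32-bit gid type */

struct group_info_sketch {              /* hypothetical, mirrors the idea only */
    int usage;                          /* reference count */
    int ngroups;                        /* number of entries in gid[] */
    gid_sketch_t gid[];                 /* trailing flexible array, 1D access */
};

/* Bytes needed for a header plus exactly ngroups entries. */
static size_t group_info_size(int ngroups)
{
    return sizeof(struct group_info_sketch) +
           (size_t)ngroups * sizeof(gid_sketch_t);
}

/* Allocate only as much as needed; the caller releases it with free(). */
static struct group_info_sketch *groups_alloc_sketch(int ngroups)
{
    struct group_info_sketch *gi = malloc(group_info_size(ngroups));

    if (!gi)
        return NULL;
    gi->usage = 1;
    gi->ngroups = ngroups;
    return gi;
}
```

      Under those assumptions, nine groups need 8 + 9 * 4 = 44 bytes and
      65536 groups need 8 + 65536 * 4 bytes, that is 256KB+8, which matches
      the numbers given in the description.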
      
      Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Vasily Kulikov <segoon@openwall.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      81243eac
    • C
      nmi_backtrace: generate one-line reports for idle cpus · 6727ad9e
      Chris Metcalf committed
      When doing an nmi backtrace of many cores, most of which are idle, the
      output is a little overwhelming and very uninformative.  Suppress
      messages for cpus that are idling when they are interrupted and just
      emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".
      
      We do this by grouping all the cpuidle code together into a new
      .cpuidle.text section, and then checking the address of the interrupted
      PC to see if it lies within that section.
      
      This commit suitably tags x86 and tile idle routines, and only adds in
      the minimal framework for other architectures.
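
      The section check described above amounts to an address range test.
      A minimal userspace sketch follows; the struct, the function names
      and the wording of the non-idle line are hypothetical, and in a real
      kernel build the section bounds would come from linker-provided
      symbols rather than from a struct filled in by hand.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical description of a text section as a half-open range. */
struct text_section {
    uintptr_t start;        /* first address inside the section */
    uintptr_t end;          /* first address past the section */
};

/* True if the interrupted PC lies within [start, end). */
static bool pc_in_section(const struct text_section *s, uintptr_t pc)
{
    return pc >= s->start && pc < s->end;
}

/* Write a one-line report for idle cpus; returns 1 if the cpu was skipped. */
static int report_cpu(const struct text_section *idle, int cpu, uintptr_t pc,
                      char *buf, size_t len)
{
    if (pc_in_section(idle, pc)) {
        snprintf(buf, len, "NMI backtrace for %d skipped: idling at pc 0x%lx",
                 cpu, (unsigned long)pc);
        return 1;
    }
    snprintf(buf, len, "NMI backtrace for cpu %d", cpu);
    return 0;
}
```

      Grouping all idle routines into one section is what makes a single
      comparison pair sufficient, instead of a lookup per idle function.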
      
      Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
      Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
      Tested-by: Petr Mladek <pmladek@suse.com>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6727ad9e
    • R
      memory-hotplug: fix store_mem_state() return value · d66ba15b
      Reza Arbab committed
      If store_mem_state() is called to online memory which is already online,
      it will return 1, the value it got from device_online().
      
      This is wrong because store_mem_state() is a device_attribute .store
      function.  Thus a non-negative return value represents input bytes read.
      
      Set the return value to -EINVAL in this case.
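
      The return-value convention can be sketched in userspace as follows.
      The helper standing in for device_online() and the function names
      are hypothetical; only the mapping of return values is the point.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>      /* ssize_t */

/*
 * Hypothetical stand-in for device_online(): negative on error,
 * 0 when the device was brought online, 1 when it was already online.
 */
static int fake_device_online(int already_online)
{
    return already_online ? 1 : 0;
}

/* A .store-style handler: negative errno on failure, else bytes consumed. */
static ssize_t store_state_sketch(int already_online, size_t count)
{
    int ret = fake_device_online(already_online);

    if (ret < 0)
        return ret;         /* propagate a real error unchanged */
    if (ret)
        return -EINVAL;     /* already online: not "1 byte consumed" */
    return (ssize_t)count;  /* success: the whole input was consumed */
}
```

      Without the middle check, a caller that wrote a longer string would
      see a return of 1, treat it as a short write, and retry with the
      remaining bytes.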
      
      Link: http://lkml.kernel.org/r/1472743777-24266-1-git-send-email-arbab@linux.vnet.ibm.com
      Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Chen Yucong <slaoub@gmail.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d66ba15b
    • D
      staging/lustre: Disable InfiniBand support · 2937f375
      Doug Ledford committed
      We changed one of the RDMA APIs and Lustre's InfiniBand transport
      has not been updated to match.  Disable it for now.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      2937f375
    • S
      iw_cxgb4: add fast-path for small REG_MR operations · 49b53a93
      Steve Wise committed
      When processing a REG_MR work request, if fw supports the
      FW_RI_NSMR_TPTE_WR work request, and if the page list for this
      registration is <= 2 pages, and the current state of the mr is INVALID,
      then use FW_RI_NSMR_TPTE_WR to pass down a fully populated TPTE for FW
      to write.  This avoids FW having to do an async read of the TPTE blocking
      the SQ until the read completes.
      
      To know if the current MR state is INVALID or not, iw_cxgb4 must track the
      state of each fastreg MR.  The c4iw_mr struct state is updated as REG_MR
      and LOCAL_INV WRs are posted and completed, when a reg_mr is destroyed,
      and when RECV completions are processed that include a local invalidation.
      
      This optimization increases small IO IOPS for both iSER and NVMF.
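
      The three conditions for taking the fast path can be written as a
      single predicate.  This is a sketch under stated assumptions: the
      struct, its fields and the function name are hypothetical and do not
      reflect the driver's actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-MR bookkeeping for this sketch. */
struct mr_sketch {
    bool state_invalid;     /* tracked state of the fastreg MR */
    size_t npages;          /* length of the page list being registered */
};

/* Use the fast path only when all three conditions hold. */
static bool use_tpte_fast_path(bool fw_supports_wr, const struct mr_sketch *mr)
{
    return fw_supports_wr &&        /* firmware advertises the work request */
           mr->npages <= 2 &&       /* small registration */
           mr->state_invalid;       /* MR currently in the INVALID state */
}
```

      The third condition is the reason the driver has to track MR state
      at all: the first two can be evaluated from the request alone.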
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      49b53a93
    • S
      cxgb4: advertise support for FR_NSMR_TPTE_WR · 086de575
      Steve Wise committed
      Query firmware for the FW_PARAMS_PARAM_DEV_RI_FR_NSMR_TPTE_WR parameter.
      If it exists and is 1, then advertise support for FR_NSMR_TPTE_WR to
      the ULDs.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      086de575
    • S
      IB/core: correctly handle rdma_rw_init_mrs() failure · b6bc1c73
      Steve Wise committed
      Function ib_create_qp() was failing to return an error when
      rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp()
      when trying to dereference the qp pointer which was actually a negative
      errno.
      
      The crash:
      
      crash> log|grep BUG
      [  136.458121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
      crash> bt
      PID: 3736   TASK: ffff8808543215c0  CPU: 2   COMMAND: "kworker/u64:2"
       #0 [ffff88084d323340] machine_kexec at ffffffff8105fbb0
       #1 [ffff88084d3233b0] __crash_kexec at ffffffff81116758
       #2 [ffff88084d323480] crash_kexec at ffffffff8111682d
       #3 [ffff88084d3234b0] oops_end at ffffffff81032bd6
       #4 [ffff88084d3234e0] no_context at ffffffff8106e431
       #5 [ffff88084d323530] __bad_area_nosemaphore at ffffffff8106e610
       #6 [ffff88084d323590] bad_area_nosemaphore at ffffffff8106e6f4
       #7 [ffff88084d3235a0] __do_page_fault at ffffffff8106ebdc
       #8 [ffff88084d323620] do_page_fault at ffffffff8106f057
       #9 [ffff88084d323660] page_fault at ffffffff816e3148
          [exception RIP: ib_create_qp+427]
          RIP: ffffffffa02554fb  RSP: ffff88084d323718  RFLAGS: 00010246
          RAX: 0000000000000004  RBX: fffffffffffffff4  RCX: 000000018020001f
          RDX: ffff880830997fc0  RSI: 0000000000000001  RDI: ffff88085f407200
          RBP: ffff88084d323778   R8: 0000000000000001   R9: ffffea0020bae210
          R10: ffffea0020bae218  R11: 0000000000000001  R12: ffff88084d3237c8
          R13: 00000000fffffff4  R14: ffff880859fa5000  R15: ffff88082eb89800
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
      #10 [ffff88084d323780] rdma_create_qp at ffffffffa0782681 [rdma_cm]
      #11 [ffff88084d3237b0] nvmet_rdma_create_queue_ib at ffffffffa07c43f3 [nvmet_rdma]
      #12 [ffff88084d323860] nvmet_rdma_alloc_queue at ffffffffa07c5ba9 [nvmet_rdma]
      #13 [ffff88084d323900] nvmet_rdma_queue_connect at ffffffffa07c5c96 [nvmet_rdma]
      #14 [ffff88084d323980] nvmet_rdma_cm_handler at ffffffffa07c6450 [nvmet_rdma]
      #15 [ffff88084d3239b0] iw_conn_req_handler at ffffffffa0787480 [rdma_cm]
      #16 [ffff88084d323a60] cm_conn_req_handler at ffffffffa0775f06 [iw_cm]
      #17 [ffff88084d323ab0] process_event at ffffffffa0776019 [iw_cm]
      #18 [ffff88084d323af0] cm_work_handler at ffffffffa0776170 [iw_cm]
      #19 [ffff88084d323cb0] process_one_work at ffffffff810a1483
      #20 [ffff88084d323d90] worker_thread at ffffffff810a211d
      #21 [ffff88084d323ec0] kthread at ffffffff810a6c5c
      #22 [ffff88084d323f50] ret_from_fork at ffffffff816e1ebf
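
      The failure mode, a pointer that really holds a negative errno, can
      be reproduced in a userspace sketch.  The helpers below imitate the
      kernel's error-pointer convention under different names; every name
      here is hypothetical and the code is not the ib_create_qp()
      implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095   /* top 4095 addresses are treated as errors */

static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

struct qp_sketch { int max_rd_sge; };

/* Hypothetical init step: returns 0 or the negative errno it is told to. */
static int init_mrs_sketch(struct qp_sketch *qp, int fail_with)
{
    (void)qp;
    return fail_with;
}

static struct qp_sketch *create_qp_sketch(struct qp_sketch *storage,
                                          int fail_with)
{
    struct qp_sketch *qp = storage;
    int ret = init_mrs_sketch(qp, fail_with);

    if (ret)
        return err_ptr(ret);    /* return at once; never touch qp again */
    qp->max_rd_sge = 1;         /* only reached with a valid pointer */
    return qp;
}
```

      A value such as 0xfffffffffffffff4 in the register dump above reads
      as -12 when taken as a signed 64-bit integer, which is the shape an
      encoded errno has under this convention.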
      
      Fixes: 632bc3f6 ("IB/core, RDMA RW API: Do not exceed QP SGE send limit")
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      b6bc1c73
    • B
      IB/srp: Fix infinite loop when FMR sg[0].offset != 0 · 681cc360
      Bart Van Assche committed
      Avoid that mapping an sg-list in which the first element has a
      non-zero offset triggers an infinite loop when using FMR. This
      patch makes the FMR mapping code similar to that of ib_sg_to_pages().
      
      Note: older Mellanox HCAs do not support non-zero offsets for FMR.
      See also commit 8c4037b5 ("IB/srp: always avoid non-zero offsets
      into an FMR").
      Reported-by: Alex Estrin <alex.estrin@intel.com>
      Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      681cc360