1. 05 Dec, 2017 1 commit
  2. 22 Nov, 2017 1 commit
    • K
      treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts · 841b86f3
      Kees Cook committed
      With all callbacks converted, and the timer callback prototype
      switched over, the TIMER_FUNC_TYPE cast is no longer needed,
      so remove it. Conversion was done with the following scripts:
      
          perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
              $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)
      
          perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
              $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)
      
      The now unused macros are also dropped from include/linux/timer.h.
      Signed-off-by: Kees Cook <keescook@chromium.org>
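The effect of the two perl one-liners above can be sketched in Python. This is only an illustration of the substitution on a single string, not the tooling used for the commit; the function name `strip_timer_casts` and the sample input line are made up for the example.

```python
import re

# Same patterns as the two perl one-liners: match the literal cast tokens.
CAST_PATTERNS = [
    re.compile(r"\(TIMER_FUNC_TYPE\)"),
    re.compile(r"\(TIMER_DATA_TYPE\)"),
]


def strip_timer_casts(source: str) -> str:
    """Remove every TIMER_FUNC_TYPE / TIMER_DATA_TYPE cast from the text."""
    for pattern in CAST_PATTERNS:
        source = pattern.sub("", source)
    return source


# Hypothetical input line, written for illustration only.
before = "timer.function = (TIMER_FUNC_TYPE)my_timeout;"
after = strip_timer_casts(before)
print(after)
```

The perl version applies the same replacement in place to every file reported by `git grep`; the sketch leaves file handling out.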
  3. 02 Nov, 2017 1 commit
  4. 23 Aug, 2017 1 commit
  5. 08 Aug, 2017 1 commit
    • B
      scsi: ipr: Fix scsi-mq lockdep issue · b0e17a9b
      Brian King committed
      Fixes the following lockdep warning, which can occur when scsi-mq is
      enabled with ipr because ipr calls scsi_unblock_requests from irq
      context. The fix is to move the call to scsi_unblock_requests to ipr's
      existing workqueue.
      
      stack backtrace:
      CPU: 28 PID: 0 Comm: swapper/28 Not tainted 4.13.0-rc2-gcc6x-gf74c89bd #1
      Call Trace:
      [c000001fffe97550] [c000000000b50818] dump_stack+0xe8/0x160 (unreliable)
      [c000001fffe97590] [c0000000001586d0] print_usage_bug+0x2d0/0x390
      [c000001fffe97640] [c000000000158f34] mark_lock+0x7a4/0x8e0
      [c000001fffe976f0] [c00000000015a000] __lock_acquire+0x6a0/0x1a70
      [c000001fffe97860] [c00000000015befc] lock_acquire+0xec/0x2e0
      [c000001fffe97930] [c000000000b71514] _raw_spin_lock+0x44/0x70
      [c000001fffe97960] [c0000000005b60f4] blk_mq_sched_dispatch_requests+0xa4/0x2a0
      [c000001fffe979c0] [c0000000005acac0] __blk_mq_run_hw_queue+0x100/0x2c0
      [c000001fffe97a00] [c0000000005ad478] __blk_mq_delay_run_hw_queue+0x118/0x130
      [c000001fffe97a40] [c0000000005ad61c] blk_mq_start_hw_queues+0x6c/0xa0
      [c000001fffe97a80] [c000000000797aac] scsi_kick_queue+0x2c/0x60
      [c000001fffe97aa0] [c000000000797cf0] scsi_run_queue+0x210/0x360
      [c000001fffe97b10] [c00000000079b888] scsi_run_host_queues+0x48/0x80
      [c000001fffe97b40] [c0000000007b6090] ipr_ioa_bringdown_done+0x70/0x1e0
      [c000001fffe97bc0] [c0000000007bc860] ipr_reset_ioa_job+0x80/0xf0
      [c000001fffe97bf0] [c0000000007b4d50] ipr_reset_timer_done+0xd0/0x100
      [c000001fffe97c30] [c0000000001937bc] call_timer_fn+0xdc/0x4b0
      [c000001fffe97cf0] [c000000000193d08] expire_timers+0x178/0x330
      [c000001fffe97d60] [c0000000001940c8] run_timer_softirq+0xb8/0x120
      [c000001fffe97de0] [c000000000b726a8] __do_softirq+0x168/0x6d8
      [c000001fffe97ef0] [c0000000000df2c8] irq_exit+0x108/0x150
      [c000001fffe97f10] [c000000000017bf4] __do_irq+0x2a4/0x4a0
      [c000001fffe97f90] [c00000000002da50] call_do_irq+0x14/0x24
      [c0000007fad93aa0] [c000000000017e8c] do_IRQ+0x9c/0x140
      [c0000007fad93af0] [c000000000008b98] hardware_interrupt_common+0x138/0x140
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
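The general pattern behind this fix, where a handler running in a restricted context only queues the work and a worker performs it later, can be sketched in user space. This is a simplified analogy using a Python thread and queue, not kernel code; `interrupt_handler`, `unblock_requests`, and the worker name are hypothetical stand-ins.

```python
import queue
import threading

work_queue = queue.Queue()
unblocked = []


def unblock_requests(host):
    # Stand-in for the call that must not run in interrupt context.
    unblocked.append((host, threading.current_thread().name))


def worker():
    # Runs queued work items until it receives the None sentinel.
    while True:
        item = work_queue.get()
        if item is None:
            break
        func, arg = item
        func(arg)


def interrupt_handler(host):
    # The handler does no heavy work itself; it only queues the call.
    work_queue.put((unblock_requests, host))


thread = threading.Thread(target=worker, name="ipr-worker")
thread.start()
interrupt_handler("host2")
work_queue.put(None)
thread.join()
print(unblocked)
```

The point of the sketch is that the call is recorded as having run on the worker thread, not in the caller of `interrupt_handler`.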
  6. 12 Apr, 2017 1 commit
    • M
      scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION · 785a4704
      Mauricio Faria de Oliveira committed
      On a dual controller setup with multipath enabled, some MEDIUM ERRORs
      caused both paths to be failed, thus I/O got queued/blocked since the
      'queue_if_no_path' feature is enabled by default on IPR controllers.
      
      This example disabled 'queue_if_no_path' so the I/O failure is seen at
      the sg_dd program.  Notice that after the sg_dd test-case, both paths
      are in 'failed' state, and both path/priority groups are in 'enabled'
      state (not 'active') -- which would block I/O with 'queue_if_no_path'.
      
          # sg_dd if=/dev/dm-2 bs=4096 count=1 dio=1 verbose=4 blk_sgio=0
          <...>
          read(unix): count=4096, res=-1
          sg_dd: reading, skip=0 : Input/output error
          <...>
      
          # dmesg
          [...] sd 2:2:16:0: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
          [...] sd 2:2:16:0: [sds] Sense Key : Medium Error [current]
          [...] sd 2:2:16:0: [sds] Add. Sense: Unrecovered read error - recommend rewrite the data
          [...] sd 2:2:16:0: [sds] CDB: Read(10) 28 00 00 00 00 00 00 00 20 00
          [...] blk_update_request: I/O error, dev sds, sector 0
          [...] device-mapper: multipath: Failing path 65:32.
          <...>
          [...] device-mapper: multipath: Failing path 65:224.
      
          # multipath -l
          1IBM_IPR-0_59C2AE0000001F80 dm-2 IBM     ,IPR-0   59C2AE00
          size=5.2T features='0' hwhandler='1 alua' wp=rw
          |-+- policy='service-time 0' prio=0 status=enabled
          | `- 2:2:16:0 sds  65:32  failed undef running
          `-+- policy='service-time 0' prio=0 status=enabled
            `- 1:2:7:0  sdae 65:224 failed undef running
      
      This is not the desired behavior. dm-multipath explicitly checks
      for the MEDIUM ERROR case (and a few others) so as not to fail the path
      (e.g., I/O to other sectors could potentially happen without problems).
      See dm-mpath.c :: do_end_io_bio() -> noretry_error() !->! fail_path().
      
      The problem trace is:
      
      1) ipr_scsi_done()  // SENSE KEY/CHECK CONDITION detected, go to..
      2) ipr_erp_start()  // ipr_is_gscsi() and masked_ioasc OK, go to..
      3) ipr_gen_sense()  // masked_ioasc is IPR_IOASC_MED_DO_NOT_REALLOC,
                          // so set DID_PASSTHROUGH.
      
      4) scsi_decide_disposition()  // check for DID_PASSTHROUGH and return
                                    // early on, faking a DID_OK.. *instead*
                                    // of reaching scsi_check_sense().
      
                                    // Had it reached the latter, that would
                                    // set host_byte to DID_MEDIUM_ERROR.
      
      5) scsi_finish_command()
      6) scsi_io_completion()
      7) __scsi_error_from_host_byte()  // That would be converted to -ENODATA
      <...>
      8) dm_softirq_done()
      9) multipath_end_io()
      10) do_end_io()
      11) noretry_error()  // And that is checked in dm-mpath :: noretry_error()
                           // which would cause fail_path() not to be called.
      
      With this patch applied, the I/O is failed but the paths are not.  This
      multipath device continues accepting more I/O requests without blocking.
      (and notice the different host byte/driver byte handling per SCSI layer).
      
          # dmesg
          [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK
          [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 40 00
          [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current]
          [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data
          [...] blk_update_request: critical medium error, dev sdaf, sector 0
          [...] blk_update_request: critical medium error, dev dm-6, sector 0
          [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK
          [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 10 00
          [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current]
          [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data
          [...] blk_update_request: critical medium error, dev sdaf, sector 0
          [...] blk_update_request: critical medium error, dev dm-6, sector 0
          [...] Buffer I/O error on dev dm-6, logical block 0, async page read
      
          # multipath -l 1IBM_IPR-0_59C2AE0000001F80
          1IBM_IPR-0_59C2AE0000001F80 dm-6 IBM     ,IPR-0   59C2AE00
          size=5.2T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
          |-+- policy='service-time 0' prio=0 status=active
          | `- 2:2:7:0  sdaf 65:240 active undef running
          `-+- policy='service-time 0' prio=0 status=enabled
            `- 1:2:7:0  sdh  8:112  active undef running
      Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Acked-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
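The problem trace in this commit message can be modeled as a small decision chain. The sketch below is a heavily simplified model and not the kernel's logic: the function names are hypothetical, error codes are represented as strings, and the set of no-retry errors is reduced to the single case discussed in the message. The `DID_*` values follow the kernel's host byte definitions, and 0x13 matches the `hostbyte=0x13` shown in the dmesg output.

```python
DID_OK = 0x00
DID_PASSTHROUGH = 0x0A
DID_MEDIUM_ERROR = 0x13

# Assumption: only the error relevant to this commit is modeled here.
NORETRY_ERRORS = {"ENODATA"}


def decide_host_byte(host_byte, sense_key):
    """Model of step 4: DID_PASSTHROUGH returns early, skipping the sense check."""
    if host_byte == DID_PASSTHROUGH:
        return DID_OK  # sense data is never inspected on this path
    if sense_key == "MEDIUM_ERROR":
        return DID_MEDIUM_ERROR
    return host_byte


def error_from_host_byte(host_byte):
    """Model of step 7: only DID_MEDIUM_ERROR maps to ENODATA."""
    return "ENODATA" if host_byte == DID_MEDIUM_ERROR else "EIO"


def path_is_failed(error):
    """Model of step 11: no-retry errors do not fail the path."""
    return error not in NORETRY_ERRORS


def complete_read(driver_host_byte):
    host_byte = decide_host_byte(driver_host_byte, "MEDIUM_ERROR")
    error = error_from_host_byte(host_byte)
    return error, path_is_failed(error)


print(complete_read(DID_PASSTHROUGH))  # driver sets DID_PASSTHROUGH
print(complete_read(DID_OK))           # driver leaves the host byte alone
```

In this model the first call ends with the path failed, while the second reports the medium error without failing the path, which mirrors the before and after behavior described in the message.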
  7. 24 Mar, 2017 5 commits
    • B
      scsi: ipr: Fix SATA EH hang · ef97d8ae
      Brian King committed
      This patch fixes a hang that can occur in ATA EH with ipr. With ipr's
      usage of libata, commands should never end up on ap->eh_done_q. The
      timeout function we use for ipr, even for SATA devices, is
      scsi_times_out, so ATA_QCFLAG_EH_SCHEDULED never gets set for ipr and EH
      is driven completely by ipr and SCSI. The SCSI EH thread ends up calling
      ipr's eh_device_reset_handler, which then calls
      ata_std_error_handler. This ends up calling ipr_sata_reset, which issues
      a reset to the device. This should result in all pending commands
      getting failed back and having ata_qc_complete called for them, which
      should end up clearing ATA_QCFLAG_FAILED as qc->flags gets zeroed in
      ata_qc_free.  This ensures that when we end up in ata_eh_finish, we
      don't do anything more with the command.
      
      On adapters that only support a single interrupt and when running with
      two MSI-X vectors or fewer, the adapter firmware guarantees that
      responses to all outstanding commands are sent back prior to sending the
      response to the SATA reset command.  On newer adapters supporting
      multiple HRRQs, however, this can no longer be guaranteed, since the
      command responses and reset response may be processed on different
      HRRQs.
      
      If ipr returns from ipr_sata_reset before the outstanding command was
      returned, this sends us down the path of __ata_eh_qc_complete which then
      moves the associated scsi_cmd from the work_q in
      scsi_eh_bus_device_reset to ap->eh_done_q, which then will sit there
      forever and we will be wedged.
      
      This patch fixes this up by ensuring that any outstanding commands are
      flushed before returning from eh_device_reset_handler for a SATA device.
      Reported-by: David Jeffery <djeffery@redhat.com>
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • B
      scsi: ipr: Error path locking fixes · f646f325
      Brian King committed
      This patch closes up some potential race conditions observed in the
      error handling paths in ipr while debugging an issue resulting in a hang
      with SATA error handling. These patches ensure we are holding the
      correct lock when adding and removing commands from the free and pending
      queues in some error scenarios.
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • B
      scsi: ipr: Fix abort path race condition · 439ae285
      Brian King committed
      This fixes a race condition in the error handling paths of ipr. While a
      command is outstanding to the adapter, it is placed on a pending queue
      for the hrrq it is associated with, while holding the HRRQ lock. When a
      command is completed, it is removed from the pending queue, under HRRQ
      lock, and placed on a local list.  This list is then iterated through
      without any locks and each command's done function is invoked, inside of
      which, the command gets returned to the free list while grabbing the
      HRRQ lock. This fixes two race conditions when commands have been
      removed from the pending list but have not yet been added to the free
      list. Both of these changes fix race conditions that could result in
      returning success from eh_abort_handler and then later calling scsi_done
      for the same request.
      
      The first race condition is in ipr_cancel_op. It looks through each
      pending queue to see if the command to be aborted is still outstanding
      or not. Rather than looking on the pending queue, reverse the logic to
      look for commands that are NOT on the free queue.  The second
      race condition can occur in ipr_wait_for_ops, where we are waiting
      for responses to commands we've aborted.
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
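The window described in this commit message, where a command has left the pending queue but has not yet been returned to the free list, can be illustrated with a toy model. This is a single-threaded sketch with no locking, intended only to show why checking "not on the free queue" catches a state that checking "on the pending queue" misses; the class and function names are hypothetical.

```python
class Hrrq:
    """Toy model of one queue pair: a free set and a pending set."""

    def __init__(self, commands):
        self.free = set(commands)
        self.pending = set()

    def send(self, cmd):
        self.free.discard(cmd)
        self.pending.add(cmd)

    def begin_completion(self, cmd):
        # Removed from pending and moved to a local list; not yet free.
        self.pending.discard(cmd)

    def finish_completion(self, cmd):
        # The done function returns the command to the free list.
        self.free.add(cmd)


def outstanding_by_pending(hrrq, cmd):
    # Old-style check: misses commands in the completion window.
    return cmd in hrrq.pending


def outstanding_by_not_free(hrrq, cmd):
    # Reversed check: anything not on the free list is still in flight.
    return cmd not in hrrq.free


hrrq = Hrrq(["cmd0", "cmd1"])
hrrq.send("cmd0")
hrrq.begin_completion("cmd0")  # cmd0 is now in the window

print(outstanding_by_pending(hrrq, "cmd0"))
print(outstanding_by_not_free(hrrq, "cmd0"))
```

In the window, the pending-queue check reports the command as no longer outstanding while the free-list check still reports it as in flight.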
    • B
      scsi: ipr: Remove redundant initialization · 960e9648
      Brian King committed
      Removes some code in __ipr_eh_dev_reset which was modifying the ipr_cmd
      done function. This should have already been set up at command allocation
      time, and if it has since been changed, it means we are in the ipr_erp*
      functions and need to wait for them to complete, so we don't want to
      override that here.
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • B
      scsi: ipr: Fix missed EH wakeup · 66a0d59c
      Brian King committed
      Following a command abort or device reset, ipr's EH handlers wait for
      the commands getting aborted to get sent back from the adapter prior to
      returning from the EH handler. This fixes up some cases where the
      completion handler was not getting called, which would have resulted in
      the EH thread waiting until it timed out, greatly extending EH time.
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
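The cost of a missed wakeup can be shown with a user-space analogy: a waiter that blocks on a completion with a timeout returns promptly when the completion is signalled, and only after the full timeout when it is not. This sketch uses Python's `threading.Event` as a stand-in for a completion; `eh_wait` and the timeout values are hypothetical and chosen only to keep the example fast.

```python
import threading


def eh_wait(done: threading.Event, timeout_s: float) -> bool:
    """Return True if the completion was signalled, False on timeout."""
    return done.wait(timeout_s)


# Case 1: the completion handler runs, so the waiter is woken early.
done = threading.Event()
threading.Timer(0.01, done.set).start()
signalled = eh_wait(done, 1.0)

# Case 2: the completion handler is never called, so the waiter
# can only return after the whole timeout has elapsed.
never_signalled = threading.Event()
timed_out = not eh_wait(never_signalled, 0.05)

print(signalled, timed_out)
```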
  8. 28 Feb, 2017 1 commit
  9. 01 Dec, 2016 1 commit
  10. 09 Nov, 2016 1 commit
  11. 15 Oct, 2016 1 commit
    • B
      scsi: ipr: Fix async error WARN_ON · 8a4236a2
      Brian King committed
      Commit afc3f83c ("scsi: ipr: Add asynchronous error notification")
      introduced the WARN_ON shown below. To fix this, rather than attempting
      to send the KOBJ_CHANGE uevent from interrupt context, which is what is
      causing the WARN_ON, just wake the ipr worker thread, which will send the
      KOBJ_CHANGE uevent.
      
      [  142.278120] WARNING: CPU: 15 PID: 0 at kernel/softirq.c:161 __local_bh_enable_ip+0x7c/0xd0
      [  142.278124] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ses enclosure scsi_transport_sas sg pseries_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod sd_mod cdrom ipr libata ibmvscsi scsi_transport_srp ibmveth dm_mirror dm_region_hash dm_log dm_mod
      [  142.278208] CPU: 15 PID: 0 Comm: swapper/15 Not tainted 4.8.0.ipr+ #21
      [  142.278213] task: c00000010cf24480 task.stack: c00000010cfec000
      [  142.278217] NIP: c0000000000c0c7c LR: c000000000881778 CTR: c0000000003c5bf0
      [  142.278221] REGS: c00000010cfef080 TRAP: 0700   Not tainted  (4.8.0.ipr+)
      [  142.278224] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28008022  XER: 2000000f
      [  142.278236] CFAR: c0000000000c0c20 SOFTE: 0
      GPR00: c000000000706c78 c00000010cfef300 c000000000f91d00 c000000000706c78
      GPR04: 0000000000000200 c000000000f7bc80 0000000000000000 00000000024000c0
      GPR08: 0000000000000000 0000000000000001 c000000000ee1d00 c000000000a9bdd0
      GPR12: c0000000003c5bf0 c00000000eb22d00 c000000100ca3880 c00000020ed38400
      GPR16: 0000000000000000 0000000000000000 c000000100940508 0000000000000000
      GPR20: 0000000000000000 0000000000000000 0000000000000000 00000000024000c0
      GPR24: c0000000004588e0 c00000010863bd00 c00000010863bd00 c0000000013773f8
      GPR28: c000000000f7bc80 0000000000000000 ffffffffffffffff c000000000f7bcd8
      [  142.278290] NIP [c0000000000c0c7c] __local_bh_enable_ip+0x7c/0xd0
      [  142.278296] LR [c000000000881778] _raw_spin_unlock_bh+0x38/0x60
      [  142.278299] Call Trace:
      [  142.278303] [c00000010cfef300] [c000000000f7bc80] init_net+0x0/0x1900 (unreliable)
      [  142.278310] [c00000010cfef320] [c000000000706c78] peernet2id+0x58/0x80
      [  142.278316] [c00000010cfef370] [c00000000075caec] netlink_broadcast_filtered+0x30c/0x550
      [  142.278323] [c00000010cfef430] [c000000000459078] kobject_uevent_env+0x588/0x780
      [  142.278331] [c00000010cfef510] [d000000003163a6c] ipr_process_error+0x11c/0x240 [ipr]
      [  142.278337] [c00000010cfef5c0] [d000000003152298] ipr_fail_all_ops+0x108/0x220 [ipr]
      [  142.278343] [c00000010cfef670] [d0000000031643f8] ipr_reset_restore_cfg_space+0xa8/0x240 [ipr]
      [  142.278350] [c00000010cfef6f0] [d000000003158a00] ipr_reset_ioa_job+0x80/0xe0 [ipr]
      [  142.278356] [c00000010cfef720] [d000000003153f78] ipr_reset_timer_done+0xa8/0xe0 [ipr]
      [  142.278363] [c00000010cfef770] [c000000000149c88] call_timer_fn+0x58/0x1c0
      [  142.278368] [c00000010cfef800] [c000000000149f60] expire_timers+0x140/0x200
      [  142.278373] [c00000010cfef870] [c00000000014a0e8] run_timer_softirq+0xc8/0x230
      [  142.278379] [c00000010cfef900] [c0000000000c0844] __do_softirq+0x164/0x3c0
      [  142.278384] [c00000010cfef9f0] [c0000000000c0f18] irq_exit+0x1a8/0x1c0
      [  142.278389] [c00000010cfefa20] [c000000000020b54] timer_interrupt+0xa4/0xe0
      [  142.278394] [c00000010cfefa50] [c000000000002414] decrementer_common+0x114/0x180
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  12. 19 Sep, 2016 2 commits
  13. 26 Aug, 2016 1 commit
  14. 11 Aug, 2016 1 commit
  15. 10 Aug, 2016 1 commit
  16. 02 Aug, 2016 1 commit
  17. 27 Jul, 2016 1 commit
  18. 14 Jul, 2016 2 commits
  19. 29 Jun, 2016 1 commit
  20. 27 Feb, 2016 1 commit
  21. 08 Jan, 2016 1 commit
  22. 12 Dec, 2015 3 commits
  23. 10 Nov, 2015 5 commits
  24. 29 Aug, 2015 2 commits
  25. 31 Jul, 2015 3 commits