1. 03 Oct, 2011 2 commits
    • M
      [SCSI] libsas: fix panic when single phy is disabled on a wide port · a73914c3
      Mark Salyzyn committed
      When a wide port is being utilized to a target, if one disables only one
      of the phys, we get an OS crash:
      
      BUG: unable to handle kernel NULL pointer dereference at
      0000000000000238
      IP: [<ffffffff814ca9b1>] mutex_lock+0x21/0x50
      PGD 4103f5067 PUD 41dba9067 PMD 0
      Oops: 0002 [#1] SMP
      last sysfs file: /sys/bus/pci/slots/5/address
      CPU 0
      Modules linked in: pm8001(U) ses enclosure fuse nfsd exportfs autofs4
      ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl
      auth_rpcgss 8021q fcoe libfcoe garp libfc scsi_transport_fc stp scsi_tgt
      llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 sr_mod cdrom
      dm_mirror dm_region_hash dm_log uinput sg i2c_i801 i2c_core iTCO_wdt
      iTCO_vendor_support e1000e mlx4_ib ib_mad ib_core mlx4_en mlx4_core ext3
      jbd mbcache sd_mod crc_t10dif usb_storage ata_generic pata_acpi ata_piix
      libsas(U) scsi_transport_sas dm_mod [last unloaded: pm8001]
      
      Modules linked in: pm8001(U) ses enclosure fuse nfsd exportfs autofs4
      ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache nfs_acl
      auth_rpcgss 8021q fcoe libfcoe garp libfc scsi_transport_fc stp scsi_tgt
      llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 sr_mod cdrom
      dm_mirror dm_region_hash dm_log uinput sg i2c_i801 i2c_core iTCO_wdt
      iTCO_vendor_support e1000e mlx4_ib ib_mad ib_core mlx4_en mlx4_core ext3
      jbd mbcache sd_mod crc_t10dif usb_storage ata_generic pata_acpi ata_piix
      libsas(U) scsi_transport_sas dm_mod [last unloaded: pm8001]
      Pid: 5146, comm: scsi_wq_5 Not tainted
      2.6.32-71.29.1.el6.lustre.7.x86_64 #1 Storage Server
      RIP: 0010:[<ffffffff814ca9b1>]  [<ffffffff814ca9b1>]
      mutex_lock+0x21/0x50
      RSP: 0018:ffff8803e4e33d30  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000000000000238 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff8803e664c800 RDI: 0000000000000238
      RBP: ffff8803e4e33d40 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      R13: 0000000000000238 R14: ffff88041acb7200 R15: ffff88041c51ada0
      FS:  0000000000000000(0000) GS:ffff880028200000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000000238 CR3: 0000000410143000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process scsi_wq_5 (pid: 5146, threadinfo ffff8803e4e32000, task
      ffff8803e4e294a0)
      Stack:
       ffff8803e664c800 0000000000000000 ffff8803e4e33d70 ffffffffa001f06e
      <0> ffff8803e4e33d60 ffff88041c51ada0 ffff88041acb7200 ffff88041bc0aa00
      <0> ffff8803e4e33d90 ffffffffa0032b6c 0000000000000014 ffff88041acb7200
      Call Trace:
       [<ffffffffa001f06e>] sas_port_delete_phy+0x2e/0xa0 [scsi_transport_sas]
       [<ffffffffa0032b6c>] sas_unregister_devs_sas_addr+0xac/0xe0 [libsas]
       [<ffffffffa0034914>] sas_ex_revalidate_domain+0x204/0x330 [libsas]
       [<ffffffffa00307f0>] ? sas_revalidate_domain+0x0/0x90 [libsas]
       [<ffffffffa0030855>] sas_revalidate_domain+0x65/0x90 [libsas]
       [<ffffffff8108c7d0>] worker_thread+0x170/0x2a0
       [<ffffffff81091ea0>] ? autoremove_wake_function+0x0/0x40
       [<ffffffff8108c660>] ? worker_thread+0x0/0x2a0
       [<ffffffff81091b36>] kthread+0x96/0xa0
       [<ffffffff810141ca>] child_rip+0xa/0x20
       [<ffffffff81091aa0>] ? kthread+0x0/0xa0
       [<ffffffff810141c0>] ? child_rip+0x0/0x20
      Code: ff ff 85 c0 75 ed eb d6 66 90 55 48 89 e5 48 83 ec 10 48 89 1c 24
      4c 89 64 24 08 0f 1f 44 00 00 48 89 fb e8 92 f4 ff ff 48 89 df <f0> ff
      0f 79 05 e8 25 00 00 00 65 48 8b 04 25 08 cc 00 00 48 2d
      RIP  [<ffffffff814ca9b1>] mutex_lock+0x21/0x50
       RSP <ffff8803e4e33d30>
      CR2: 0000000000000238
      
      The following patch is admittedly a band-aid, and does not solve the
      root cause, but it still is a good candidate for hardening as a pointer
      check before reference.
      Signed-off-by: Mark Salyzyn <mark_salyzyn@us.xyratex.com>
      Tested-by: Jack Wang <jack_wang@usish.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      a73914c3
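The "pointer check before reference" hardening mentioned above can be illustrated with a minimal sketch. The structures and the function name below are hypothetical stand-ins, not the real libsas or scsi_transport_sas types, and this is not the patch itself; it only shows the general shape of guarding a dereference.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real libsas structures. */
struct port_sketch {
    int num_phys;
};

struct phy_sketch {
    struct port_sketch *port;   /* may be NULL if the phy has no port */
};

/* Returns 1 if the phy was detached, 0 if there was nothing to detach. */
int detach_phy_checked(struct phy_sketch *phy)
{
    if (phy == NULL || phy->port == NULL)
        return 0;               /* guard: never dereference a NULL port */

    phy->port->num_phys--;
    phy->port = NULL;
    return 1;
}
```

A second call on the same phy returns 0 instead of faulting, which is the behavior a band-aid check of this kind is after.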
    • R
      [SCSI] qla2xxx: Fix crash in qla2x00_abort_all_cmds() on unload · 9bfacd01
      Roland Dreier committed
      I hit a crash in qla2x00_abort_all_cmds() if the qla2xxx module is
      unloaded right after it is loaded.  I debugged this down to the abort
      handling improperly treating a command of type SRB_ADISC_CMD as if it
      had a bsg_job to complete when that command actually uses the iocb_cmd
      part of the union.  (I guess to hit this one has to unload the module
      while the async FC initialization is still in progress)
      
      It seems we should only look for a bsg_job if type is SRB_ELS_CMD_RPT,
      SRB_ELS_CMD_HST or SRB_CT_CMD, so switch the test to make that explicit.
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      9bfacd01
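The idea of making the type test explicit before touching a union member can be sketched as follows. The enum values, struct layout, and function name are hypothetical and only loosely modeled on the description above; the real qla2xxx definitions differ.

```c
#include <stddef.h>

/* Hypothetical command types, named after the ones in the message. */
enum srb_type_sketch {
    SKETCH_ELS_CMD_RPT,
    SKETCH_ELS_CMD_HST,
    SKETCH_CT_CMD,
    SKETCH_ADISC_CMD
};

struct srb_sketch {
    enum srb_type_sketch type;
    union {
        int bsg_job_id;     /* meaningful only for the ELS/CT types */
        int iocb_timeout;   /* meaningful for the other types */
    } u;
};

/* Returns 1 only for the types whose union member is a bsg job. */
int srb_has_bsg_job(const struct srb_sketch *sp)
{
    switch (sp->type) {
    case SKETCH_ELS_CMD_RPT:
    case SKETCH_ELS_CMD_HST:
    case SKETCH_CT_CMD:
        return 1;
    default:
        return 0;           /* e.g. ADISC uses the other union member */
    }
}
```

Listing the accepted types positively means a newly added command type falls into the safe default branch rather than being treated as a bsg job.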
  2. 26 Sep, 2011 2 commits
    • J
      [SCSI] 3w-9xxx: fix iommu_iova leak · 96067723
      James Bottomley committed
      Following reports on the list, it looks like the 3w-9xxx driver will leak dma
      mappings every time we get a transient queueing error back from the card.
      This is because it maps the sg list in the routine that sends the command, but
      doesn't unmap again in the transient failure path (even though the command is
      sent back to the block layer).  Fix by unmapping before returning the status.
      Reported-by: Chris Boot <bootc@bootc.net>
      Tested-by: Chris Boot <bootc@bootc.net>
      Acked-by: Adam Radford <aradford@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      96067723
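The leak pattern described above (map in the send routine, forget to unmap on the transient failure path) can be modeled with a counter. Everything here is a hypothetical model: the struct, the status codes, and the helper names are assumptions for illustration and do not correspond to the 3w-9xxx driver's code or to the kernel DMA API.

```c
/* Hypothetical model: a counter stands in for outstanding DMA mappings. */
struct card_sketch {
    int mappings_in_use;
    int queue_full;     /* nonzero simulates a transient queueing error */
};

enum { SKETCH_OK = 0, SKETCH_BUSY = 1 };

static void map_sg(struct card_sketch *c)   { c->mappings_in_use++; }
static void unmap_sg(struct card_sketch *c) { c->mappings_in_use--; }

/* Maps the buffer, then tries to post the command to the card. */
int post_command(struct card_sketch *c)
{
    map_sg(c);
    if (c->queue_full) {
        unmap_sg(c);        /* undo the mapping before handing the command back */
        return SKETCH_BUSY;
    }
    return SKETCH_OK;       /* mapping stays in place until completion */
}
```

Without the `unmap_sg()` call in the busy branch, every retry would bump `mappings_in_use` again, which is the leak in miniature.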
    • N
      [SCSI] cxgb3i: convert cdev->l2opt to use rcu to prevent NULL dereference · e48f129c
      Neil Horman committed
      This oops was reported recently:
      d:mon> e
      cpu 0xd: Vector: 300 (Data Access) at [c0000000fd4c7120]
          pc: d00000000076f194: .t3_l2t_get+0x44/0x524 [cxgb3]
          lr: d000000000b02108: .init_act_open+0x150/0x3d4 [cxgb3i]
          sp: c0000000fd4c73a0
         msr: 8000000000009032
         dar: 0
       dsisr: 40000000
        current = 0xc0000000fd640d40
        paca    = 0xc00000000054ff80
          pid   = 5085, comm = iscsid
      d:mon> t
      [c0000000fd4c7450] d000000000b02108 .init_act_open+0x150/0x3d4 [cxgb3i]
      [c0000000fd4c7500] d000000000e45378 .cxgbi_ep_connect+0x784/0x8e8 [libcxgbi]
      [c0000000fd4c7650] d000000000db33f0 .iscsi_if_rx+0x71c/0xb18
      [scsi_transport_iscsi2]
      [c0000000fd4c7740] c000000000370c9c .netlink_data_ready+0x40/0xa4
      [c0000000fd4c77c0] c00000000036f010 .netlink_sendskb+0x4c/0x9c
      [c0000000fd4c7850] c000000000370c18 .netlink_sendmsg+0x358/0x39c
      [c0000000fd4c7950] c00000000033be24 .sock_sendmsg+0x114/0x1b8
      [c0000000fd4c7b50] c00000000033d208 .sys_sendmsg+0x218/0x2ac
      [c0000000fd4c7d70] c00000000033f55c .sys_socketcall+0x228/0x27c
      [c0000000fd4c7e30] c0000000000086a4 syscall_exit+0x0/0x40
      --- Exception: c01 (System Call) at 00000080da560cfc
      
      The root cause was an EEH error, which sent us down the offload_close path in
      the cxgb3 driver, which in turn sets cdev->l2opt to NULL without regard for
      upper layer drivers (like the cxgbi drivers) that might have execution contexts
      in the middle of using it. The result is the oops above, when t3_l2t_get attempts
      to dereference L2DATA(cdev)->nentries in arp_hash right after the EEH error
      handler sets it to NULL.
      
      The fix is to prevent the setting of the NULL pointer until after there are no
      further users of it.  The t3cdev->l2opt pointer is now converted to be an rcu
      pointer and the L2DATA macro is now called under the protection of the
      rcu_read_lock().  When the EEH error path:
      t3_adapter_error->offload_close->cxgb3_offload_deactivate
      is executed, setting that l2opt pointer to NULL is now gated on an rcu
      quiescence point, allowing L2DATA callers to safely check for a NULL
      pointer without concern that the underlying data will be freed before the
      pointer is dereferenced.
      
      This has been tested by the reporter and shown to fix the reported oops.
      
      [nhorman: fix up uninitialised variable reported by Dan Carpenter]
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: Karen Xie <kxie@chelsio.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      e48f129c
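The reader-side half of the pattern described above, "load the shared pointer once, test that snapshot for NULL, then use the same snapshot", can be sketched in userspace with C11 atomics. This is deliberately not RCU: it has no read-side critical section and no grace period, so it does not by itself make freeing the table safe. The struct and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the L2 table and the device that owns it. */
struct l2data_sketch {
    unsigned int nentries;
};

struct cdev_sketch {
    _Atomic(struct l2data_sketch *) l2opt;
};

/* Returns the table size, or 0 when the table has already been unpublished. */
unsigned int l2_nentries(struct cdev_sketch *cdev)
{
    /* Load the pointer exactly once; test and use the same snapshot. */
    struct l2data_sketch *d =
        atomic_load_explicit(&cdev->l2opt, memory_order_acquire);

    if (d == NULL)
        return 0;
    return d->nentries;
}

/*
 * Unpublishes the table and returns the old pointer. A real implementation
 * must still wait until no reader can hold the old pointer before freeing it.
 */
struct l2data_sketch *l2_unpublish(struct cdev_sketch *cdev)
{
    return atomic_exchange_explicit(&cdev->l2opt,
                                    (struct l2data_sketch *)NULL,
                                    memory_order_acq_rel);
}
```

The point of the single load is that a reader can never see a non-NULL pointer in the test and a NULL pointer in the dereference; the waiting-for-readers step is what the RCU quiescence point adds in the kernel fix.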
  3. 24 Sep, 2011 2 commits
  4. 22 Sep, 2011 3 commits
    • R
      [SCSI] scsi: qla4xxx needs libiscsi.o · 3538a001
      Randy Dunlap committed
      qla4xxx driver needs to be linked with libiscsi.o to fix
      build errors.  This happens when no other drivers that use
      libiscsi.o are enabled.
      
      ERROR: "iscsi_conn_stop" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_conn_get_addr_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_session_teardown" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_host_alloc" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_conn_start" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_conn_send_pdu" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_session_get_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_conn_get_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_set_param" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_session_failure" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_complete_pdu" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_session_setup" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_conn_bind" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_conn_setup" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      ERROR: "iscsi_itt_to_task" [drivers/scsi/qla4xxx/qla4xxx.ko] undefined!
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      3538a001
    • M
      [SCSI] libsas: fix failure to revalidate domain for anything but the first expander child. · 24926dad
      Mark Salyzyn committed
      In an enclosure model where there are chaining expanders to a large body
      of storage, it was discovered that libsas, responding to a broadcast
      event change, would only revalidate the domain of the first child expander
      in the list.
      
      The issue is that the pointer value to the discovered source device was
      used to break out of the loop, rather than the content of the pointer.
      
      This still remains non-compliant as the revalidate domain code is
      supposed to loop through all child expanders, and not stop at the first
      one it finds that reports a change count. However, the design of this
      routine does not allow multiple device discoveries and that would be a
      more complicated set of patches reserved for another day. We are fixing
      the glaring bug rather than refactoring the code.
      Signed-off-by: Mark Salyzyn <msalyzyn@us.xyratex.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      24926dad
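The difference between testing a pointer and testing what it points to, which is the bug described above, can be shown with a small loop. The types and helper names below are hypothetical; the real libsas revalidation code is more involved.

```c
#include <stddef.h>

struct expander_sketch {
    int changed;    /* nonzero if this expander reports a change */
};

/* Hypothetical finder: stores the changed expander in *src, or NULL. */
static void find_change(struct expander_sketch *ex,
                        struct expander_sketch **src)
{
    *src = ex->changed ? ex : NULL;
}

/* Returns the index of the first changed child, or -1 if none changed. */
int first_changed_child(struct expander_sketch *children, int n)
{
    struct expander_sketch *found = NULL;
    struct expander_sketch **src = &found;
    int i;

    for (i = 0; i < n; i++) {
        find_change(&children[i], src);
        /*
         * Test the content (*src). Testing `src` itself is always true
         * here and would stop the loop at the first child.
         */
        if (*src != NULL)
            return i;
    }
    return -1;
}
```

With `if (src)` in place of `if (*src != NULL)`, the function would return 0 for every input, which mirrors the "only the first child expander" symptom.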
    • V
      [SCSI] aacraid: reset should disable MSI interrupt · d0efab26
      Vasily Averin committed
      A SCSI reset on hardware with MSI interrupts enabled generates a WARNING message:
      
      [11027.798722] aacraid: Host adapter abort request (0,0,0,0)
      [11027.798814] aacraid: Host adapter reset request. SCSI hang ?
      [11087.762237] aacraid: SCSI bus appears hung
      [11135.082543] ------------[ cut here ]------------
      [11135.082646] WARNING: at drivers/pci/msi.c:658 pci_enable_msi_block+0x251/0x290()
      Signed-off-by: Vasily Averin <vvs@sw.ru>
      Acked-by: Mark Salyzyn <mark_salyzyn@us.xyratex.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      d0efab26
  5. 11 Sep, 2011 1 commit
    • R
      scsi: qla4xxx driver depends on NET · d7a210f3
      Randy Dunlap committed
      When CONFIG_NET is disabled, SCSI_QLA_ISCSI selects SCSI_ISCSI_ATTRS,
      which uses network interfaces, so the build fails with multiple errors:
      
        warning: (ISCSI_TCP && SCSI_CXGB3_ISCSI && SCSI_CXGB4_ISCSI && SCSI_QLA_ISCSI && INFINIBAND_ISER) selects SCSI_ISCSI_ATTRS which has unmet direct dependencies (SCSI && NET)
      
        ERROR: "skb_trim" [drivers/scsi/scsi_transport_iscsi.ko] undefined!
        ERROR: "netlink_kernel_create" [drivers/scsi/scsi_transport_iscsi.ko] undefined!
        ERROR: "netlink_kernel_release" [drivers/scsi/scsi_transport_iscsi.ko] undefined!
        ...
      
      so make SCSI_QLA_ISCSI also depend on NET to prevent the build errors.
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Cc:	Ravi Anand <ravi.anand@qlogic.com>
      Cc:	Vikas Chaudhary <vikas.chaudhary@qlogic.com>
      Cc:	iscsi-driver@qlogic.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d7a210f3
  6. 29 Aug, 2011 5 commits
    • E
      [SCSI] bnx2i: Fixed the endian on TTT for NOP out transmission · 610602f3
      Eddie Wai committed
      The iscsi_nopout task's TTT is defined as __be32, while the DMA
      memory passed to the chip is in CPU byte order.  This creates a
      problem for unsolicited NOP-In responses where the TTT is not the
      RESERVED tag of 0xFFs.  This patch adds a call to be32_to_cpu for
      the TTT specified.
      Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      610602f3
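What a big-endian to CPU conversion does can be sketched portably in userspace. The function below is a hypothetical stand-in for the kernel's be32_to_cpu(), written with byte shifts so that it gives the same result on little-endian and big-endian hosts; it is not the kernel implementation.

```c
#include <stdint.h>
#include <string.h>

/*
 * Interprets the four bytes stored in be_value as a big-endian number
 * and returns it in the host's native representation.
 */
uint32_t be32_to_cpu_sketch(uint32_t be_value)
{
    unsigned char b[4];

    memcpy(b, &be_value, sizeof b);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```

An all-ones value reads the same in either byte order, so a missing conversion goes unnoticed for the reserved tag and only shows up for other TTT values.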
    • Y
      [SCSI] libfc: fix referencing to fc_fcp_pkt from the frame pointer via fr_fsp() · 3ee17f59
      Yi Zou committed
      Commit 6a716a85 releases the DDP context when frame_send() fails, but by
      then the frame may already be freed. Store the pointer to fc_fcp_pkt and
      release the DDP context using the locally stored fsp instead of getting
      fsp from fr_fsp(fp) on the frame.
      Signed-off-by: Yi Zou <yi.zou@intel.com>
      Reported-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      3ee17f59
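The "keep a local copy before the callee may free the container" idea can be sketched as follows. The structs and the always-failing send function are hypothetical and exist only to make the ordering visible; they are not the libfc types or its frame_send() callback.

```c
#include <stdlib.h>

struct pkt_sketch {
    int ddp_active;
};

struct frame_sketch {
    struct pkt_sketch *fsp;     /* back-pointer stored in the frame */
};

/* Hypothetical send that always fails and frees the frame on its way out. */
static int frame_send_fail(struct frame_sketch *fp)
{
    free(fp);
    return -1;
}

/* Builds a frame for pkt, sends it, and cleans up DDP state on failure. */
int send_and_cleanup(struct pkt_sketch *pkt)
{
    struct frame_sketch *fp = malloc(sizeof *fp);
    struct pkt_sketch *fsp;
    int rc;

    if (fp == NULL)
        return -1;
    fp->fsp = pkt;
    fsp = fp->fsp;              /* keep a local copy before sending */

    rc = frame_send_fail(fp);   /* fp must not be touched after this */
    if (rc != 0)
        fsp->ddp_active = 0;    /* release via the local copy, not fp->fsp */
    return rc;
}
```

Reading `fp->fsp` in the failure branch instead would be a use-after-free in this sketch, since the send path has already released the frame.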
    • V
      [SCSI] libfc: block SCSI eh thread for blocked rports · 21cc0bd3
      Vasu Dev committed
      Call fc_block_scsi_eh() in every fcoe eh handler to block
      the scsi_eh thread for blocked rports.
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Reviewed-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      21cc0bd3
    • V
      [SCSI] libfc: fix fc_eh_host_reset · 77a2b73a
      Vasu Dev committed
      The current fc_eh_host_reset leaves the lport offline
      permanently because the LOGO response from the last
      reset is taken as the FLOGI response, since both
      used the same exchange id.
      
      Fix this by cleaning up exchanges end to end,
      using exchange abort along with the exchange reset
      done from fc_eh_host_reset. This avoids exchange
      collisions between sessions across the reset.
      In this case the implicit login should have
      done that, but there is no abort support for FIP
      frames, so just wait for lport->r_a_tov before
      restarting the next FLOGI to ensure all exchanges
      are good to use again for the next session.
      
      Below is a trace of the LOGO response from the older
      session arriving ahead of the FLOGI response with the
      same exchange id 0x203:
      
      
      617  86.435165     4e.00.0b -> ff.ff.fc     FC ELS LOGO 0x203
      618  86.435195     4e.00.0b -> b6.02.00     FC ELS LOGO 0x213
      619  86.435220     4e.00.0b -> 18.03.00     FC ELS LOGO 0x223
      620  86.435244     4e.00.0b -> 18.02.00     FC ELS LOGO 0x233
      621  86.435267     4e.00.0b -> 18.01.00     FC ELS LOGO 0x243
      622  86.435349     00.00.00 -> ff.ff.fe     FC ELS FLOGI 0x203
      623  86.435549     ff.ff.fc -> 4e.00.0b     FC ELS ACC (LOGO) 0x203
      624  86.438721     ff.ff.fe -> 4e.00.0b     FC ELS ACC (FLOGI) 0x203
      625  86.442059     18.03.00 -> 4e.00.0b     FC ELS ACC (LOGO) 0x223
      626  86.443683     b6.02.00 -> 4e.00.0b     FC ELS ACC (LOGO) 0x213
      627  86.447693     18.01.00 -> 4e.00.0b     FC ELS ACC (LOGO) 0x243
      628  86.453499     18.02.00 -> 4e.00.0b     FC ELS ACC (LOGO) 0x233
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Reviewed-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      77a2b73a
    • R
      [SCSI] fcoe: Fix deadlock between fip's recv_work and rtnl · 848e7d5b
      Robert Love committed
      The rtnl cannot be held during the fcoe_interface_put.
      If it is the last reference on the fcoe_interface the
      fcoe_ctlr_destroy will be called as a part of the
      cleanup, ultimately calling cancel_work_sync(&fip->recv_work);
      
      If we are processing a flogi response we will be in
      the recv_work context and we will lock the rtnl to
      add a new unicast MAC address. This is how the deadlock
      can occur.
      
      The fix is simply to move the rtnl_lock/unlock into
      fcoe_interface_cleanup so that it can be unlocked before
      fcoe_interface_put is called.
      
      Here is the lockdep report:
      
      Jul 21 11:26:35 bubba [  223.870702]
      Jul 21 11:26:35 bubba [  223.870704] =======================================================
      Jul 21 11:26:35 bubba [  223.871255] [ INFO: possible circular locking dependency detected ]
      Jul 21 11:26:35 bubba [  223.871530] 3.0.0-rc7+ #1
      Jul 21 11:26:35 bubba [  223.871797] -------------------------------------------------------
      Jul 21 11:26:35 bubba [  223.872072] lockdeptest.sh/3464 is trying to acquire lock:
      Jul 21 11:26:35 bubba [  223.872345]  ((&fip->recv_work)
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba , at:
      Jul 21 11:26:35 bubba [<ffffffff810531f1>] wait_on_work+0x0/0xbd
      Jul 21 11:26:35 bubba [  223.873022]
      Jul 21 11:26:35 bubba [  223.873023] but task is already holding lock:
      Jul 21 11:26:35 bubba [  223.873555]  (rtnl_mutex
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba , at:
      Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
      Jul 21 11:26:35 bubba [  223.874229]
      Jul 21 11:26:35 bubba [  223.874230] which lock already depends on the new lock.
      Jul 21 11:26:35 bubba [  223.874231]
      Jul 21 11:26:35 bubba [  223.875032]
      Jul 21 11:26:35 bubba [  223.875033] the existing dependency chain (in reverse order) is:
      Jul 21 11:26:35 bubba [  223.875573]
      Jul 21 11:26:35 bubba [  223.875573] -> #1
      Jul 21 11:26:35 bubba (rtnl_mutex
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba :
      Jul 21 11:26:35 bubba [  223.876301]
      Jul 21 11:26:35 bubba [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
      Jul 21 11:26:35 bubba [  223.876645]
      Jul 21 11:26:35 bubba [<ffffffff8151d975>] __mutex_lock_common+0x47/0x30d
      Jul 21 11:26:35 bubba [  223.876991]
      Jul 21 11:26:35 bubba [<ffffffff8151dd36>] mutex_lock_nested+0x3b/0x40
      Jul 21 11:26:35 bubba [  223.877334]
      Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
      Jul 21 11:26:35 bubba [  223.877675]
      Jul 21 11:26:35 bubba [<ffffffffa003d5a0>] fcoe_update_src_mac+0x2b/0x80 [fcoe]
      Jul 21 11:26:35 bubba [  223.878022]
      Jul 21 11:26:35 bubba [<ffffffffa003d698>] fcoe_flogi_resp+0x5e/0x79 [fcoe]
      Jul 21 11:26:35 bubba [  223.878366]
      Jul 21 11:26:35 bubba [<ffffffffa001566f>] fc_exch_recv+0x7f5/0x9da [libfc]
      Jul 21 11:26:35 bubba [  223.878713]
      Jul 21 11:26:35 bubba [<ffffffffa00327d8>] fcoe_ctlr_recv_work+0x71f/0x10dc [libfcoe]
      Jul 21 11:26:35 bubba [  223.879258]
      Jul 21 11:26:35 bubba [<ffffffff81053761>] process_one_work+0x1d7/0x347
      Jul 21 11:26:35 bubba [  223.879601]
      Jul 21 11:26:35 bubba [<ffffffff81054ade>] worker_thread+0xf8/0x17c
      Jul 21 11:26:35 bubba [  223.879944]
      Jul 21 11:26:35 bubba [<ffffffff81058184>] kthread+0x7d/0x85
      Jul 21 11:26:35 bubba [  223.880287]
      Jul 21 11:26:35 bubba [<ffffffff81526414>] kernel_thread_helper+0x4/0x10
      Jul 21 11:26:35 bubba [  223.880634]
      Jul 21 11:26:35 bubba [  223.880635] -> #0
      Jul 21 11:26:35 bubba ((&fip->recv_work)
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba :
      Jul 21 11:26:35 bubba [  223.881357]
      Jul 21 11:26:35 bubba [<ffffffff8106b93e>] __lock_acquire+0xb1d/0xe2c
      Jul 21 11:26:35 bubba [  223.881695]
      Jul 21 11:26:35 bubba [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
      Jul 21 11:26:35 bubba [  223.882033]
      Jul 21 11:26:35 bubba [<ffffffff81053241>] wait_on_work+0x50/0xbd
      Jul 21 11:26:35 bubba [  223.882378]
      Jul 21 11:26:35 bubba [<ffffffff81053b32>] __cancel_work_timer+0xb6/0xf4
      Jul 21 11:26:35 bubba [  223.882718]
      Jul 21 11:26:35 bubba [<ffffffff81053b8a>] cancel_work_sync+0xb/0xd
      Jul 21 11:26:35 bubba [  223.883057]
      Jul 21 11:26:35 bubba [<ffffffffa00317e6>] fcoe_ctlr_destroy+0x1d/0x67 [libfcoe]
      Jul 21 11:26:35 bubba [  223.883399]
      Jul 21 11:26:35 bubba [<ffffffffa003e51e>] fcoe_interface_release+0x21/0x45 [fcoe]
      Jul 21 11:26:35 bubba [  223.883940]
      Jul 21 11:26:35 bubba [<ffffffff811fbbe6>] kref_put+0x43/0x4d
      Jul 21 11:26:35 bubba [  223.884280]
      Jul 21 11:26:35 bubba [<ffffffffa003ebba>] fcoe_interface_put+0x17/0x19 [fcoe]
      Jul 21 11:26:35 bubba [  223.884624]
      Jul 21 11:26:35 bubba [<ffffffffa003f2a6>] fcoe_interface_cleanup+0x188/0x193 [fcoe]
      Jul 21 11:26:35 bubba [  223.885163]
      Jul 21 11:26:35 bubba [<ffffffffa003f303>] fcoe_destroy+0x52/0x72 [fcoe]
      Jul 21 11:26:35 bubba [  223.885502]
      Jul 21 11:26:35 bubba [<ffffffffa00340a4>] fcoe_transport_destroy+0xab/0x110 [libfcoe]
      Jul 21 11:26:35 bubba [  223.886045]
      Jul 21 11:26:35 bubba [<ffffffff81056153>] param_attr_store+0x43/0x62
      Jul 21 11:26:35 bubba [  223.886385]
      Jul 21 11:26:35 bubba [<ffffffff8105602d>] module_attr_store+0x21/0x25
      Jul 21 11:26:35 bubba [  223.886728]
      Jul 21 11:26:35 bubba [<ffffffff8114c23d>] sysfs_write_file+0x103/0x13f
      Jul 21 11:26:35 bubba [  223.887068]
      Jul 21 11:26:35 bubba [<ffffffff810f3e7b>] vfs_write+0xa7/0xfa
      Jul 21 11:26:35 bubba [  223.887406]
      Jul 21 11:26:35 bubba [<ffffffff810f4073>] sys_write+0x45/0x69
      Jul 21 11:26:35 bubba [  223.887742]
      Jul 21 11:26:35 bubba [<ffffffff815252bb>] system_call_fastpath+0x16/0x1b
      Jul 21 11:26:35 bubba [  223.888083]
      Jul 21 11:26:35 bubba [  223.888084] other info that might help us debug this:
      Jul 21 11:26:35 bubba [  223.888085]
      Jul 21 11:26:35 bubba [  223.888879]  Possible unsafe locking scenario:
      Jul 21 11:26:35 bubba [  223.888881]
      Jul 21 11:26:35 bubba [  223.889411]        CPU0                    CPU1
      Jul 21 11:26:35 bubba [  223.889683]        ----                    ----
      Jul 21 11:26:35 bubba [  223.889955]   lock(
      Jul 21 11:26:35 bubba rtnl_mutex
      Jul 21 11:26:35 bubba );
      Jul 21 11:26:35 bubba [  223.890349]                                lock(
      Jul 21 11:26:35 bubba (&fip->recv_work)
      Jul 21 11:26:35 bubba );
      Jul 21 11:26:35 bubba [  223.890751]                                lock(
      Jul 21 11:26:35 bubba rtnl_mutex
      Jul 21 11:26:35 bubba );
      Jul 21 11:26:35 bubba [  223.891154]   lock(
      Jul 21 11:26:35 bubba (&fip->recv_work)
      Jul 21 11:26:35 bubba );
      Jul 21 11:26:35 bubba [  223.891549]
      Jul 21 11:26:35 bubba [  223.891550]  *** DEADLOCK ***
      Jul 21 11:26:35 bubba [  223.891551]
      Jul 21 11:26:35 bubba [  223.892347] 6 locks held by lockdeptest.sh/3464:
      Jul 21 11:26:35 bubba [  223.892621]  #0:
      Jul 21 11:26:35 bubba (&buffer->mutex
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba , at:
      Jul 21 11:26:35 bubba [<ffffffff8114c171>] sysfs_write_file+0x37/0x13f
      Jul 21 11:26:35 bubba [  223.893359]  #1:
      Jul 21 11:26:35 bubba (s_active
      Jul 21 11:26:35 bubba ){++++.+}
      Jul 21 11:26:35 bubba , at:
      Jul 21 11:26:35 bubba [<ffffffff8114c21c>] sysfs_write_file+0xe2/0x13f
      Jul 21 11:26:35 bubba [  223.894094]  #2:
      Jul 21 11:26:35 bubba (param_lock
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba , at:
      Jul 21 11:26:35 bubba [<ffffffff81056146>] param_attr_store+0x36/0x62
      Jul 21 11:26:35 bubba [  223.894835]  #3:
      Jul 21 11:26:35 bubba (ft_mutex
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba , at:
      Jul 21 11:26:35 bubba [<ffffffffa0034017>] fcoe_transport_destroy+0x1e/0x110 [libfcoe]
      Jul 21 11:26:35 bubba [  223.895574]  #4:
      Jul 21 11:26:35 bubba (fcoe_config_mutex
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba , at:
      Jul 21 11:26:35 bubba [<ffffffffa003f2c9>] fcoe_destroy+0x18/0x72 [fcoe]
      Jul 21 11:26:35 bubba [  223.896314]  #5:
      Jul 21 11:26:35 bubba (rtnl_mutex
      Jul 21 11:26:35 bubba ){+.+.+.}
      Jul 21 11:26:35 bubba , at:
      Jul 21 11:26:35 bubba [<ffffffff813e8233>] rtnl_lock+0x12/0x14
      Jul 21 11:26:35 bubba [  223.897047]
      Jul 21 11:26:35 bubba [  223.897048] stack backtrace:
      Jul 21 11:26:35 bubba [  223.897578] Pid: 3464, comm: lockdeptest.sh Not tainted 3.0.0-rc7+ #1
      Jul 21 11:26:35 bubba [  223.897853] Call Trace:
      Jul 21 11:26:35 bubba [  223.898128]  [<ffffffff81068e16>] print_circular_bug+0x1f8/0x209
      Jul 21 11:26:35 bubba [  223.898416]  [<ffffffff8106b93e>] __lock_acquire+0xb1d/0xe2c
      Jul 21 11:26:35 bubba [  223.898699]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
      Jul 21 11:26:35 bubba [  223.898982]  [<ffffffff8106c14a>] lock_acquire+0xd2/0xf7
      Jul 21 11:26:35 bubba [  223.899263]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
      Jul 21 11:26:35 bubba [  223.899547]  [<ffffffff8104a097>] ? mod_timer+0x8f/0x98
      Jul 21 11:26:35 bubba [  223.899827]  [<ffffffff81053241>] wait_on_work+0x50/0xbd
      Jul 21 11:26:35 bubba [  223.900108]  [<ffffffff810531f1>] ? wait_on_cpu_work+0xe6/0xe6
      Jul 21 11:26:35 bubba [  223.900390]  [<ffffffff81053b32>] __cancel_work_timer+0xb6/0xf4
      Jul 21 11:26:35 bubba [  223.900671]  [<ffffffff81053b8a>] cancel_work_sync+0xb/0xd
      Jul 21 11:26:35 bubba [  223.900953]  [<ffffffffa00317e6>] fcoe_ctlr_destroy+0x1d/0x67 [libfcoe]
      Jul 21 11:26:35 bubba [  223.901237]  [<ffffffffa003e51e>] fcoe_interface_release+0x21/0x45 [fcoe]
      Jul 21 11:26:35 bubba [  223.901522]  [<ffffffffa003e4fd>] ? fcoe_enable+0x6b/0x6b [fcoe]
      Jul 21 11:26:35 bubba [  223.901803]  [<ffffffff811fbbe6>] kref_put+0x43/0x4d
      Jul 21 11:26:35 bubba [  223.902083]  [<ffffffffa003ebba>] fcoe_interface_put+0x17/0x19 [fcoe]
      Jul 21 11:26:35 bubba [  223.902367]  [<ffffffffa003f2a6>] fcoe_interface_cleanup+0x188/0x193 [fcoe]
      Jul 21 11:26:35 bubba [  223.902653]  [<ffffffff8151dd36>] ? mutex_lock_nested+0x3b/0x40
      Jul 21 11:26:35 bubba [  223.902939]  [<ffffffffa003f303>] fcoe_destroy+0x52/0x72 [fcoe]
      Jul 21 11:26:35 bubba [  223.903223]  [<ffffffffa00340a4>] fcoe_transport_destroy+0xab/0x110 [libfcoe]
      Jul 21 11:26:35 bubba [  223.903508]  [<ffffffff81056153>] param_attr_store+0x43/0x62
      Jul 21 11:26:35 bubba [  223.903792]  [<ffffffff8105602d>] module_attr_store+0x21/0x25
      Jul 21 11:26:35 bubba [  223.904075]  [<ffffffff8114c23d>] sysfs_write_file+0x103/0x13f
      Jul 21 11:26:35 bubba [  223.904357]  [<ffffffff810f3e7b>] vfs_write+0xa7/0xfa
      Jul 21 11:26:35 bubba [  223.904642]  [<ffffffff810f51d6>] ? fget_light+0x35/0x96
      Jul 21 11:26:35 bubba [  223.904923]  [<ffffffff810f4073>] sys_write+0x45/0x69
      Jul 21 11:26:35 bubba [  223.905204]  [<ffffffff815252bb>] system_call_fastpath+0x16/0x1b
      Jul 21 11:26:36 bubba [  223.964438] ixgbe 0000:05:00.0: eth3: detected SFP+: 5
      Jul 21 11:26:37 bubba [  225.196702] ixgbe 0000:05:00.0: eth3: NIC Link is Up 10 Gbps, Flow Control: None
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Reviewed-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      848e7d5b
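The ordering fix described in the fcoe deadlock commit (release the lock before dropping what may be the last reference) can be modeled without threads. In this hypothetical model an int stands in for rtnl_mutex and the release path simply records whether it ran with the "lock" held; none of these names come from the fcoe driver.

```c
/* Hypothetical model of an interface with a reference count. */
struct iface_sketch {
    int refcount;
    int destroyed_under_lock;   /* set if the release ran with the lock held */
};

static int rtnl_held_sketch;    /* stands in for rtnl_mutex */

static void rtnl_lock_sketch(void)   { rtnl_held_sketch = 1; }
static void rtnl_unlock_sketch(void) { rtnl_held_sketch = 0; }

/*
 * Dropping the last reference runs the release path. In the scenario from
 * the commit message that path waits for work which itself needs the lock.
 */
static void iface_put(struct iface_sketch *f)
{
    if (--f->refcount == 0)
        f->destroyed_under_lock = rtnl_held_sketch;
}

/* Cleanup with the lock scope ending before the reference is dropped. */
void iface_cleanup(struct iface_sketch *f)
{
    rtnl_lock_sketch();
    /* ... teardown steps that need the lock would go here ... */
    rtnl_unlock_sketch();       /* unlock first */
    iface_put(f);               /* then drop the reference */
}
```

Swapping the last two calls in `iface_cleanup()` would set `destroyed_under_lock` to 1, which is the situation lockdep flags in the report above.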
  7. 27 Aug, 2011 11 commits
  8. 24 Aug, 2011 8 commits
  9. 28 Jul, 2011 6 commits