1. 20 Apr, 2013 1 commit
  2. 26 Mar, 2013 1 commit
    • fcoe: Fix deadlock between create and destroy paths · f9c4358e
      Robert Love authored
      We can deadlock (s_active and fcoe_config_mutex) if a
      port is being destroyed at the same time one is being created.
      
      [ 4200.503113] ======================================================
      [ 4200.503114] [ INFO: possible circular locking dependency detected ]
      [ 4200.503116] 3.8.0-rc5+ #8 Not tainted
      [ 4200.503117] -------------------------------------------------------
      [ 4200.503118] kworker/3:2/2492 is trying to acquire lock:
      [ 4200.503119]  (s_active#292){++++.+}, at: [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
      [ 4200.503127]
      but task is already holding lock:
      [ 4200.503128]  (fcoe_config_mutex){+.+.+.}, at: [<ffffffffa02f3338>] fcoe_destroy_work+0xe8/0x120 [fcoe]
      [ 4200.503133]
      which lock already depends on the new lock.
      
      [ 4200.503135]
      the existing dependency chain (in reverse order) is:
      [ 4200.503136]
      -> #1 (fcoe_config_mutex){+.+.+.}:
      [ 4200.503139]        [<ffffffff810c7711>] lock_acquire+0xa1/0x140
      [ 4200.503143]        [<ffffffff816ca7be>] mutex_lock_nested+0x6e/0x360
      [ 4200.503146]        [<ffffffffa02f11bd>] fcoe_enable+0x1d/0xb0 [fcoe]
      [ 4200.503148]        [<ffffffffa02f127d>] fcoe_ctlr_enabled+0x2d/0x50 [fcoe]
      [ 4200.503151]        [<ffffffffa02ffbe8>] store_ctlr_enabled+0x38/0x90 [libfcoe]
      [ 4200.503154]        [<ffffffff81424878>] dev_attr_store+0x18/0x30
      [ 4200.503157]        [<ffffffff8122b750>] sysfs_write_file+0xe0/0x150
      [ 4200.503160]        [<ffffffff811b334c>] vfs_write+0xac/0x180
      [ 4200.503162]        [<ffffffff811b3692>] sys_write+0x52/0xa0
      [ 4200.503164]        [<ffffffff816d7159>] system_call_fastpath+0x16/0x1b
      [ 4200.503167]
      -> #0 (s_active#292){++++.+}:
      [ 4200.503170]        [<ffffffff810c680f>] __lock_acquire+0x135f/0x1c90
      [ 4200.503172]        [<ffffffff810c7711>] lock_acquire+0xa1/0x140
      [ 4200.503174]        [<ffffffff8122c626>] sysfs_deactivate+0x116/0x160
      [ 4200.503176]        [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
      [ 4200.503178]        [<ffffffff8122b2eb>] sysfs_hash_and_remove+0x5b/0xb0
      [ 4200.503180]        [<ffffffff8122f3d1>] sysfs_remove_group+0x61/0x100
      [ 4200.503183]        [<ffffffff814251eb>] device_remove_groups+0x3b/0x60
      [ 4200.503185]        [<ffffffff81425534>] device_remove_attrs+0x44/0x80
      [ 4200.503187]        [<ffffffff81425e97>] device_del+0x127/0x1c0
      [ 4200.503189]        [<ffffffff81425f52>] device_unregister+0x22/0x60
      [ 4200.503191]        [<ffffffffa0300970>] fcoe_ctlr_device_delete+0xe0/0xf0 [libfcoe]
      [ 4200.503194]        [<ffffffffa02f1b5c>] fcoe_interface_cleanup+0x6c/0xa0 [fcoe]
      [ 4200.503196]        [<ffffffffa02f3355>] fcoe_destroy_work+0x105/0x120 [fcoe]
      [ 4200.503198]        [<ffffffff8107ee91>] process_one_work+0x1a1/0x580
      [ 4200.503203]        [<ffffffff81080c6e>] worker_thread+0x15e/0x440
      [ 4200.503205]        [<ffffffff8108715a>] kthread+0xea/0xf0
      [ 4200.503207]        [<ffffffff816d70ac>] ret_from_fork+0x7c/0xb0
      
      [ 4200.503209]
      other info that might help us debug this:
      
      [ 4200.503211]  Possible unsafe locking scenario:
      
      [ 4200.503212]        CPU0                    CPU1
      [ 4200.503213]        ----                    ----
      [ 4200.503214]   lock(fcoe_config_mutex);
      [ 4200.503215]                                lock(s_active#292);
      [ 4200.503218]                                lock(fcoe_config_mutex);
      [ 4200.503219]   lock(s_active#292);
      [ 4200.503221]
       *** DEADLOCK ***
      
      [ 4200.503223] 3 locks held by kworker/3:2/2492:
      [ 4200.503224]  #0:  (fcoe){.+.+.+}, at: [<ffffffff8107ee2b>] process_one_work+0x13b/0x580
      [ 4200.503228]  #1:  ((&port->destroy_work)){+.+.+.}, at: [<ffffffff8107ee2b>] process_one_work+0x13b/0x580
      [ 4200.503232]  #2:  (fcoe_config_mutex){+.+.+.}, at: [<ffffffffa02f3338>] fcoe_destroy_work+0xe8/0x120 [fcoe]
      [ 4200.503236]
      stack backtrace:
      [ 4200.503238] Pid: 2492, comm: kworker/3:2 Not tainted 3.8.0-rc5+ #8
      [ 4200.503240] Call Trace:
      [ 4200.503243]  [<ffffffff816c2f09>] print_circular_bug+0x1fb/0x20c
      [ 4200.503246]  [<ffffffff810c680f>] __lock_acquire+0x135f/0x1c90
      [ 4200.503248]  [<ffffffff810c463a>] ? debug_check_no_locks_freed+0x9a/0x180
      [ 4200.503250]  [<ffffffff810c7711>] lock_acquire+0xa1/0x140
      [ 4200.503253]  [<ffffffff8122d20b>] ? sysfs_addrm_finish+0x3b/0x70
      [ 4200.503255]  [<ffffffff8122c626>] sysfs_deactivate+0x116/0x160
      [ 4200.503258]  [<ffffffff8122d20b>] ? sysfs_addrm_finish+0x3b/0x70
      [ 4200.503260]  [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
      [ 4200.503262]  [<ffffffff8122b2eb>] sysfs_hash_and_remove+0x5b/0xb0
      [ 4200.503265]  [<ffffffff8122f3d1>] sysfs_remove_group+0x61/0x100
      [ 4200.503273]  [<ffffffff814251eb>] device_remove_groups+0x3b/0x60
      [ 4200.503275]  [<ffffffff81425534>] device_remove_attrs+0x44/0x80
      [ 4200.503277]  [<ffffffff81425e97>] device_del+0x127/0x1c0
      [ 4200.503279]  [<ffffffff81425f52>] device_unregister+0x22/0x60
      [ 4200.503282]  [<ffffffffa0300970>] fcoe_ctlr_device_delete+0xe0/0xf0 [libfcoe]
      [ 4200.503285]  [<ffffffffa02f1b5c>] fcoe_interface_cleanup+0x6c/0xa0 [fcoe]
      [ 4200.503287]  [<ffffffffa02f3355>] fcoe_destroy_work+0x105/0x120 [fcoe]
      [ 4200.503290]  [<ffffffff8107ee91>] process_one_work+0x1a1/0x580
      [ 4200.503292]  [<ffffffff8107ee2b>] ? process_one_work+0x13b/0x580
      [ 4200.503295]  [<ffffffffa02f3250>] ? fcoe_if_destroy+0x230/0x230 [fcoe]
      [ 4200.503297]  [<ffffffff81080c6e>] worker_thread+0x15e/0x440
      [ 4200.503299]  [<ffffffff81080b10>] ? busy_worker_rebind_fn+0x100/0x100
      [ 4200.503301]  [<ffffffff8108715a>] kthread+0xea/0xf0
      [ 4200.503304]  [<ffffffff81087070>] ? kthread_create_on_node+0x160/0x160
      [ 4200.503306]  [<ffffffff816d70ac>] ret_from_fork+0x7c/0xb0
      [ 4200.503308]  [<ffffffff81087070>] ? kthread_create_on_node+0x160/0x160
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Tested-by: Jack Morgan <jack.morgan@intel.com>
  3. 29 Jan, 2013 2 commits
    • fcoe: Fix deadlock while deleting FCoE interface with NPIV ports · 94aa743a
      Neerav Parikh authored
      This patch fixes the following deadlock, which is caused by destroying
      an FCoE interface that has active NPIV ports.
      
          Call Trace:
          [<ffffffff814b7e88>] schedule+0x64/0x66
          [<ffffffff814b6b4f>] schedule_timeout+0x36/0xe3
          [<ffffffff81070c55>] ? update_curr+0xd6/0x110
          [<ffffffff81071f6b>] ? hrtick_update+0x1b/0x4d
          [<ffffffff81072405>] ? dequeue_task_fair+0x1ca/0x1d9
          [<ffffffff8106a369>] ? need_resched+0x1e/0x28
          [<ffffffff814b7d14>] wait_for_common+0x9b/0xf1
          [<ffffffff8106e7be>] ? try_to_wake_up+0x1e0/0x1e0
          [<ffffffff814b7e22>] wait_for_completion+0x1d/0x1f
          [<ffffffff8105ae82>] flush_workqueue+0x116/0x2a1
          [<ffffffff8105b357>] drain_workqueue+0x66/0x14c
          [<ffffffff8105b8ef>] destroy_workqueue+0x1a/0xcf
          [<ffffffffa009211e>] fc_remove_host+0x154/0x17f [scsi_transport_fc]
          [<ffffffffa00edbb8>] fcoe_if_destroy+0x184/0x1c9 [fcoe]
          [<ffffffffa00edc28>] fcoe_destroy_work+0x2b/0x44 [fcoe]
          [<ffffffff8105a82a>] process_one_work+0x1a8/0x2a4
          [<ffffffffa00edbfd>] ? fcoe_if_destroy+0x1c9/0x1c9 [fcoe]
          [<ffffffff8105c396>] worker_thread+0x1db/0x268
          [<ffffffff810604a3>] ? wake_up_bit+0x2a/0x2a
          [<ffffffff8105c1bb>] ? manage_workers.clone.16+0x1f6/0x1f6
          [<ffffffff8105ffd6>] kthread+0x6f/0x77
          [<ffffffff814c0304>] kernel_thread_helper+0x4/0x10
          [<ffffffff8105ff67>] ? kthread_freezable_should_stop+0x4b/0x4b
      
          Call Trace:
          [<ffffffff814b7e88>] schedule+0x64/0x66
          [<ffffffff814b8041>] schedule_preempt_disabled+0xe/0x10
          [<ffffffff814b70a1>] __mutex_lock_common.clone.5+0x117/0x17a
          [<ffffffff814b7117>] __mutex_lock_slowpath+0x13/0x15
          [<ffffffff814b6f76>] mutex_lock+0x23/0x37
          [<ffffffff8125b890>] ? list_del+0x11/0x30
          [<ffffffffa00edc84>] fcoe_vport_destroy+0x43/0x5f [fcoe]
          [<ffffffffa009130a>] fc_vport_terminate+0x48/0x110 [scsi_transport_fc]
          [<ffffffffa00913ef>] fc_vport_sched_delete+0x1d/0x79 [scsi_transport_fc]
          [<ffffffff8105a82a>] process_one_work+0x1a8/0x2a4
          [<ffffffffa00913d2>] ? fc_vport_terminate+0x110/0x110 [scsi_transport_fc]
          [<ffffffff8105c396>] worker_thread+0x1db/0x268
          [<ffffffff8105c1bb>] ? manage_workers.clone.16+0x1f6/0x1f6
          [<ffffffff8105ffd6>] kthread+0x6f/0x77
          [<ffffffff814c0304>] kernel_thread_helper+0x4/0x10
          [<ffffffff8105ff67>] ? kthread_freezable_should_stop+0x4b/0x4b
          [<ffffffff814c0300>] ? gs_change+0x13/0x13
      
      A prior attempt to fix this issue is posted here:
      http://lists.open-fcoe.org/pipermail/devel/2012-October/012318.html
      or
      http://article.gmane.org/gmane.linux.scsi.open-fcoe.devel/11924
      
      Based on feedback and discussion with Neil Horman, it seems that the above
      patch may have a case where fcoe_vport_destroy() and fcoe_destroy_work() can
      race; hence that patch has been withdrawn in favor of this one, which tries
      to solve the same problem in a different way.
      
      In the current approach, instead of removing the fcoe_config_mutex from the
      vport_delete callback function, I've chosen to delete all the NPIV ports on a
      given root lport first, before continuing with the removal of the root lport.
      Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
      Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
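The ordering described in this commit, deleting every NPIV port before tearing down the root lport, can be sketched in user-space C. This is a minimal model under stated assumptions, not the fcoe driver code: `struct port`, `port_add_vport()` and `port_destroy_root()` are hypothetical names, and a plain singly linked list stands in for the lport's vport list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an lport; not the kernel's struct fc_lport. */
struct port {
    struct port *vports;   /* head of the NPIV (child) port list; root only */
    struct port *next;     /* sibling link within the parent's list */
};

/* Attach a newly allocated NPIV port to a root port. */
struct port *port_add_vport(struct port *root)
{
    struct port *vp;

    assert(root != NULL);
    vp = calloc(1, sizeof(*vp));
    if (!vp)
        return NULL;
    vp->next = root->vports;
    root->vports = vp;
    return vp;
}

/*
 * Destroy a root port: remove every NPIV port first, then the root.
 * Returns the number of NPIV ports that were deleted.
 */
int port_destroy_root(struct port *root)
{
    int deleted = 0;

    assert(root != NULL);
    while (root->vports) {
        struct port *vp = root->vports;

        root->vports = vp->next;
        free(vp);
        deleted++;
    }
    /* Only now is the root itself torn down. */
    free(root);
    return deleted;
}
```

Because the children are gone before the root teardown starts, nothing in the root's teardown has to wait for a child deletion that needs a lock the teardown already holds.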
    • fcoe: close race on link speed detection in fcoe code · f9184df3
      Neil Horman authored
      When creating an fcoe interface, we call fcoe_link_speed_update before we add
      the lport's fcoe interface to the fc_hostlist.  Network device events like
      NETDEV_CHANGE are only processed if an fcoe interface is found with an
      underlying netdev that matches the netdev of the event.  Since this processing
      in fcoe_device_notification is how link_speed changes get communicated to the
      libfc code (via fcoe_link_speed_update), we have a race condition: if a
      NETDEV_CHANGE event is sent after the call to fcoe_link_speed_update in
      fcoe_netdev_config, but before we add the interface to the fc_hostlist, we will
      lose the event, and attributes like /sys/class/fc_host/hostX/speed will not get
      updated properly.
      
      Fix this by moving the add to the fc_hostlist above the serialized call to
      fcoe_netdev_config, ensuring that we catch netdev events before we make a
      direct call to fcoe_link_speed_update.
      
      Also use this opportunity to clean up access to the fc_hostlist a bit by
      creating a fcoe_hostlist_del accessor and replacing the cleanup in fcoe_exit to
      use it properly.
      
      Tested by myself successfully
      
      [ Comment over 80 chars broken into multi-line by Robert Love to
        satisfy checkpatch.pl ]
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
  4. 15 Dec, 2012 5 commits
    • libfcoe, fcoe: consolidate the fcoe_ctlr_get_lesb/fcoe_get_lesb · 57c2728f
      Yi Zou authored
      Similarly they can be moved into libfcoe instead of being private to fcoe now.
      Also add comments particularly on the term LESB to the corresponding function.
      Signed-off-by: Yi Zou <yi.zou@intel.com>
      Cc: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
      Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
    • libfcoe, fcoe: move fcoe_link_speed_update() to libfcoe and export it · 03702689
      Yi Zou authored
      With the previous patch, fcoe_link_speed_update() can be moved into libfcoe and
      exported to be used by fcoe, bnx2fc, etc.
      Signed-off-by: Yi Zou <yi.zou@intel.com>
      Cc: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
      Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
    • fcoe: add support to the get_netdev() for fcoe_interface · 66524ec9
      Yi Zou authored
      Adds support for fcoe_port's newly added get_netdev function pointer.
      Signed-off-by: Yi Zou <yi.zou@intel.com>
      Cc: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
      Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
    • fcoe: Use the fcoe_sysfs control interface · 435c8667
      Robert Love authored
      This patch adds support for the new fcoe_sysfs
      control interface to fcoe.ko. It keeps the deprecated
      interface intact and therefore either the legacy
      or the new control interfaces can be used. A mixed mode
      is not supported. A user must either use the new
      interfaces or the old ones, but not both.
      
      The fcoe_ctlr's link state is now driven by both the
      netdev link state as well as the fcoe_ctlr_device's
      enabled attribute. The link must be up and the
      fcoe_ctlr_device must be enabled before the FCoE
      Controller starts discovery or login.
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
    • libfcoe, fcoe, bnx2fc: Add new fcoe control interface · 6a891b07
      Robert Love authored
      This patch does a few things.
      
      1) Makes /sys/bus/fcoe/ctlr_{create,destroy} interfaces.
         These interfaces take an <ifname> and will either
         create an FCoE Controller or destroy an FCoE
         Controller depending on which file is written to.
      
         The new FCoE Controller will start in a DISABLED
         state and will not do discovery or login until it
         is ENABLED. This pause will allow us to configure
         the FCoE Controller before enabling it.
      
      2) Makes the 'mode' attribute of a fcoe_ctlr_device
         writable. This allows the user to configure the mode
         in which the FCoE Controller will start when it
         is ENABLED.
      
         Possible modes are 'Fabric' or 'VN2VN'.
      
         The default mode for a fcoe_ctlr{,_device} is 'Fabric'.
         Drivers must implement the set_fcoe_ctlr_mode routine
         to support this feature.
      
         libfcoe offers an exported routine to set a FCoE
         Controller's mode. The mode can only be changed
         when the FCoE Controller is DISABLED.
      
         This patch also removes the get_fcoe_ctlr_mode pointer
         in the fcoe_sysfs function template, the code in
         fcoe_ctlr.c to get the mode and the assignment of
         the fcoe_sysfs function pointer to the fcoe_ctlr.c
         implementation (in fcoe and bnx2fc). fcoe_sysfs can
         return that value for the mode without consulting the
         LLD.
      
      3) Makes an 'enabled' attribute of a fcoe_ctlr_device. On a
         read, fcoe_sysfs will return the attribute's value. On
         a write, fcoe_sysfs will call the LLD (if there is a
         callback) to notify it that the enabled state has changed.
      
      This patch maintains the old FCoE control interfaces as
      module parameters, but it adds comments pointing out that
      the old interfaces are deprecated.
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
  5. 07 Oct, 2012 1 commit
  6. 20 Jul, 2012 2 commits
  7. 23 May, 2012 2 commits
    • [SCSI] fcoe, bnx2fc, libfcoe: SW FCoE and bnx2fc use FCoE Syfs · 8d55e507
      Robert Love authored
      This patch has the SW FCoE driver and the bnx2fc
      driver make use of the new fcoe_sysfs API added
      earlier in this patch series.
      
      After this patch a fcoe_ctlr_device is allocated with
      private data in this order.
      
      +------------------+   +------------------+
      | fcoe_ctlr_device |   | fcoe_ctlr_device |
      +------------------+   +------------------+
      | fcoe_ctlr        |   | fcoe_ctlr        |
      +------------------+   +------------------+
      | fcoe_interface   |   | bnx2fc_interface |
      +------------------+   +------------------+
      
      libfcoe also takes part in this new model since it
      discovers and manages fcoe_fcf instances. The memory
      allocation is different for FCFs. I didn't want to
      impact libfcoe's fcoe_fcf processing, so this patch
      creates fcoe_fcf_device instances for each discovered
      fcoe_fcf. The two are paired using a (void * priv)
      member of the fcoe_ctlr_device. This allows libfcoe
      to continue maintaining its list of fcoe_fcf instances
      and simply attaches and detaches them from existing
      or new fcoe_fcf_device instances.
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] fcoe: Allocate fcoe_ctlr with fcoe_interface, not as a member · 619fe4be
      Robert Love authored
      Currently the fcoe_ctlr associated with an interface is allocated
      as a member of struct fcoe_interface. This causes problems when
      attempting to use the new fcoe_sysfs APIs which allow us to allocate
      the fcoe_interface as private data to the fcoe_ctlr_device instance.
      The problem is that libfcoe wants to be able to use pointer math to find a
      fcoe_ctlr's fcoe_ctlr_device as well as a fcoe_ctlr_device's
      associated fcoe_ctlr. To do this we need to allocate the
      fcoe_ctlr_device with private data for the LLD. The private data
      contains the fcoe_ctlr, and its private data is the fcoe_interface.
      This patch only allocates the fcoe_interface with the fcoe_ctlr; the
      fcoe_ctlr_device will be added in a later patch, which will complete
      the diagram below:
      
      +------------------+
      | fcoe_ctlr_device |
      +------------------+
      | fcoe_ctlr        |
      +------------------+
      | fcoe_interface   |
      +------------------+
      
      This prep work will allow us to go from a fcoe_ctlr_device instance
      to its fcoe_ctlr as well as from a fcoe_ctlr to its fcoe_ctlr_device
      once the fcoe_sysfs API is in use (later patches in this series).
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
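The stacked layout in the diagram, and the "pointer math" that moves between its layers, can be illustrated with a user-space C sketch. The three structs here are simplified, hypothetical stand-ins (each holds a single `long` so that the back-to-back placement stays suitably aligned), and the helper names are invented for this example rather than taken from libfcoe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the three stacked structures. */
struct dev_model   { long id; };
struct ctlr_layer  { long state; };
struct iface_layer { long netdev_index; };

/* Allocate all three back to back in one block: device | ctlr | interface. */
struct dev_model *dev_model_alloc(void)
{
    return calloc(1, sizeof(struct dev_model) + sizeof(struct ctlr_layer) +
                     sizeof(struct iface_layer));
}

/* The controller lives directly after the device. */
struct ctlr_layer *dev_to_ctlr(struct dev_model *dev)
{
    assert(dev != NULL);
    return (struct ctlr_layer *)(dev + 1);
}

/* The interface lives directly after the controller. */
struct iface_layer *ctlr_to_iface(struct ctlr_layer *c)
{
    assert(c != NULL);
    return (struct iface_layer *)(c + 1);
}

/* Walking backwards: the device sits immediately before the controller. */
struct dev_model *ctlr_to_dev(struct ctlr_layer *c)
{
    assert(c != NULL);
    return (struct dev_model *)c - 1;
}
```

No layer stores a pointer to its neighbour; the fixed layout is what makes both directions of the lookup possible, which is why the controller cannot remain an embedded member of the interface.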
  8. 10 May, 2012 4 commits
  9. 28 Mar, 2012 5 commits
    • [SCSI] fcoe: Drop the rtnl_mutex before calling fcoe_ctlr_link_up · 22805123
      Robert Love authored
      The rtnl_lock is primarily used to serialize networking
      driver changes as well as to ensure that a networking driver
      is not removed when making changes to it. fcoe also uses
      the rtnl_lock to protect the fcoe hostlist.
      
      fcoe_create holds the rtnl_lock over the entirety of the
      routine, including the call to fcoe_ctlr_link_up.
      This causes the deadlock below because fcoe_ctlr_link_up
      acquires the fcoe_ctlr ctlr_mutex, and this deadlocks with
      a libfcoe thread that acquires the fcoe_ctlr ctlr_mutex and
      then the rtnl_lock (to update a MAC address).
      
      This patch drops the rtnl_lock before calling
      fcoe_ctlr_link_up and therefore the deadlock is prevented.
      
      https://bugzilla.kernel.org/show_bug.cgi?id=42918
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&fip->ctlr_mutex){+.+...}:
             [<c1091f70>] lock_acquire+0x80/0x1b0
             [<c147655d>] mutex_lock_nested+0x6d/0x340
             [<f8970c32>] fcoe_ctlr_link_up+0x22/0x180 [libfcoe]
             [<f894620e>] fcoe_create+0x47e/0x6e0 [fcoe]
             [<f8973dd3>] fcoe_transport_create+0x143/0x250 [libfcoe]
             [<c10527e0>] param_attr_store+0x30/0x60
             [<c1052696>] module_attr_store+0x26/0x40
             [<c11a201e>] sysfs_write_file+0xae/0x100
             [<c11449df>] vfs_write+0x8f/0x160
             [<c1144cbd>] sys_write+0x3d/0x70
             [<c147a0c4>] syscall_call+0x7/0xb
      
      -> #0 (rtnl_mutex){+.+.+.}:
             [<c109164b>] __lock_acquire+0x140b/0x1720
             [<c1091f70>] lock_acquire+0x80/0x1b0
             [<c147655d>] mutex_lock_nested+0x6d/0x340
             [<c13a10c4>] rtnl_lock+0x14/0x20
             [<f89445ac>] fcoe_update_src_mac+0x2c/0xb0 [fcoe]
             [<f8971712>] fcoe_ctlr_timer_work+0x712/0xb60 [libfcoe]
             [<c104fb69>] process_one_work+0x179/0x5d0
             [<c10502f1>] worker_thread+0x121/0x2d0
             [<c10550ed>] kthread+0x7d/0x90
             [<c1481a82>] kernel_thread_helper+0x6/0x10
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&fip->ctlr_mutex);
                                     lock(rtnl_mutex);
                                     lock(&fip->ctlr_mutex);
        lock(rtnl_mutex);
      
       *** DEADLOCK ***
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
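The fix in this commit removes one of the two lock orderings, so the cycle reported by lockdep can no longer form. The single-threaded C model below shows only that ordering change. It is a sketch under stated assumptions: flags stand in for the real mutexes, and every name in it (`toy_lock_t`, `create_before_fix()`, `create_after_fix()`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy locks: a flag is enough to show the ordering within one thread. */
typedef struct { bool held; } toy_lock_t;

static toy_lock_t rtnl_model;        /* stands in for rtnl_mutex */
static toy_lock_t ctlr_mutex_model;  /* stands in for fip->ctlr_mutex */

static void toy_acquire(toy_lock_t *l) { assert(!l->held); l->held = true; }
static void toy_release(toy_lock_t *l) { assert(l->held);  l->held = false; }

/* Set if ctlr_mutex was taken while rtnl was still held. */
static bool nested_rtnl_then_ctlr;

static void link_up_model(void)
{
    toy_acquire(&ctlr_mutex_model);
    if (rtnl_model.held)
        nested_rtnl_then_ctlr = true;
    toy_release(&ctlr_mutex_model);
}

/* Before the fix: link-up runs with rtnl still held. */
bool create_before_fix(void)
{
    nested_rtnl_then_ctlr = false;
    toy_acquire(&rtnl_model);
    /* ... hostlist and netdev setup would happen here ... */
    link_up_model();
    toy_release(&rtnl_model);
    return nested_rtnl_then_ctlr;
}

/* After the fix: rtnl is dropped first, so the two locks never nest. */
bool create_after_fix(void)
{
    nested_rtnl_then_ctlr = false;
    toy_acquire(&rtnl_model);
    /* ... hostlist and netdev setup would happen here ... */
    toy_release(&rtnl_model);
    link_up_model();
    return nested_rtnl_then_ctlr;
}
```

Once the create path never holds both locks at once, the timer path that takes the controller mutex and then the rtnl lock is the only nesting left, and a single ordering cannot deadlock against itself.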
    • [SCSI] fcoe: reduce contention for fcoe_rx_list lock [v2] · 20dc3811
      Neil Horman authored
      There is potentially lots of contention for the rx_list_lock.  On a cpu that is
      receiving lots of fcoe traffic, the softirq context has to acquire and release the
      lock for every frame it receives, as does the receiving per-cpu thread.  We can
      reduce this contention somewhat by altering the per-cpu thread's loop such that
      when traffic is detected on the fcoe_rx_list, we splice it to a temporary list.
      In this way, we can process multiple skbs while only having to acquire and
      release the fcoe_rx_list lock once.
      
      [ Braces around single statement while loop removed by Robert Love
        to satisfy checkpatch.pl. ]
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vasu Dev <vasu.dev@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
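The splice technique from this commit can be sketched in user-space C as follows. This is an illustrative model, not the driver code: `struct frame` and `struct frame_queue` are hypothetical types (the driver works on sk_buff lists), and a counter takes the place of the real spinlock so the number of lock round trips can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame and queue types for this sketch. */
struct frame {
    struct frame *next;
    int id;
};

struct frame_queue {
    struct frame *head;
    struct frame *tail;
    unsigned long lock_acquisitions;  /* counts simulated lock round trips */
};

/* Producer side: one lock round trip per enqueued frame. */
void queue_push(struct frame_queue *q, struct frame *f)
{
    assert(q != NULL && f != NULL);
    q->lock_acquisitions++;           /* lock would be taken here */
    f->next = NULL;
    if (q->tail)
        q->tail->next = f;
    else
        q->head = f;
    q->tail = f;
    /* lock would be released here */
}

/*
 * Consumer side: take the lock once, move the whole list to a private
 * head, and release. The caller then walks the returned list unlocked.
 */
struct frame *queue_splice_all(struct frame_queue *q)
{
    struct frame *batch;

    assert(q != NULL);
    q->lock_acquisitions++;           /* lock would be taken here */
    batch = q->head;
    q->head = NULL;
    q->tail = NULL;
    /* lock would be released here */
    return batch;
}
```

The consumer pays for one lock round trip per batch instead of one per frame, which is where the reduction in contention comes from.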
    • [SCSI] fcoe: remove frame dropping code from fcoe_percpu_clean · dd060e74
      Neil Horman authored
      commit e7a51997 ([SCSI] fcoe: flush per-cpu
      thread work when destroying interface) added a skb flush to the fcoe_rx_list,
      which ensures that we push any pending frames on the list through the per-cpu
      receive thread.  Because of this, it's redundant to lock and scan the list
      first, dropping any arriving frames.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vasu Dev <vasu.dev@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] foce: remove bh disable from fcoe sw transport rcv function · 94aa29f2
      Neil Horman authored
      The fcoe sw receive packet function (fcoe_rcv) only ever executes in softirq
      context.  Given that, and the fact that no use of the fcoe_rx_list is made in
      irq context, it's not necessary to disable bottom halves while actually
      receiving the frame.  Convert spin_*_bh calls in that function to their
      lock-only equivalents.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Vasu Dev <vasu.dev@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] fcoe: Ensure fcoe_recv_frame is always called in process context · 5e70c4c4
      Neil Horman authored
      commit 859b7b64 introduced the ability to call
      fcoe_recv_frame in softirq context.  While this is beneficial to performance,
      it's not safe to do, as it breaks the serialization of access to the lport
      structure (i.e. when an fcoe interface is being torn down, there's no way to
      serialize the teardown effort with the completion of receive operations
      occurring in softirq context).  As a result, lport (and other) data structures can
      be read and modified in parallel, leading to corruption.  Most notable is the
      vport list, which is protected by a mutex, and which will cause a panic if a softirq
      receive occurs while said mutex is locked.  Additionally, the ema_list, discussed here:
      
      http://lists.open-fcoe.org/pipermail/devel/2012-February/011947.html
      
      can be corrupted if a list traversal occurs in softirq context at the same time
      as a list delete in process context.  And generally the lport state variables
      will not be stable, and may lead to unpredictable results.
      
      The most direct fix is to remove the bits from the above commit that allowed
      fcoe_recv_frame to be called in softirq context.  We just force all frames to be
      handled by the per-cpu rx threads.  This will allow the fcoe_if_destroy's use of
      fcoe_percpu_clean to function properly, ensuring that no frames are being
      received while the lport is being torn down.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: Vasu Dev <vasu.dev@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
  10. 20 Mar, 2012 2 commits
  11. 17 Mar, 2012 1 commit
  12. 19 Feb, 2012 6 commits
  13. 16 Jan, 2012 2 commits
  14. 11 Jan, 2012 1 commit
  15. 15 Dec, 2011 1 commit
  16. 14 Dec, 2011 1 commit
  17. 31 Oct, 2011 1 commit
  18. 16 Oct, 2011 1 commit
  19. 03 Oct, 2011 1 commit