1. 12 Feb, 2013 · 1 commit
  2. 29 Jan, 2013 · 2 commits
    • fcoe: Fix deadlock while deleting FCoE interface with NPIV ports · 94aa743a
      Committed by Neerav Parikh
      This patch fixes the following deadlock, caused by destroying
      an FCoE interface with active NPIV ports on that interface.
      
          Call Trace:
          [<ffffffff814b7e88>] schedule+0x64/0x66
          [<ffffffff814b6b4f>] schedule_timeout+0x36/0xe3
          [<ffffffff81070c55>] ? update_curr+0xd6/0x110
          [<ffffffff81071f6b>] ? hrtick_update+0x1b/0x4d
          [<ffffffff81072405>] ? dequeue_task_fair+0x1ca/0x1d9
          [<ffffffff8106a369>] ? need_resched+0x1e/0x28
          [<ffffffff814b7d14>] wait_for_common+0x9b/0xf1
          [<ffffffff8106e7be>] ? try_to_wake_up+0x1e0/0x1e0
          [<ffffffff814b7e22>] wait_for_completion+0x1d/0x1f
          [<ffffffff8105ae82>] flush_workqueue+0x116/0x2a1
          [<ffffffff8105b357>] drain_workqueue+0x66/0x14c
          [<ffffffff8105b8ef>] destroy_workqueue+0x1a/0xcf
          [<ffffffffa009211e>] fc_remove_host+0x154/0x17f [scsi_transport_fc]
          [<ffffffffa00edbb8>] fcoe_if_destroy+0x184/0x1c9 [fcoe]
          [<ffffffffa00edc28>] fcoe_destroy_work+0x2b/0x44 [fcoe]
          [<ffffffff8105a82a>] process_one_work+0x1a8/0x2a4
          [<ffffffffa00edbfd>] ? fcoe_if_destroy+0x1c9/0x1c9 [fcoe]
          [<ffffffff8105c396>] worker_thread+0x1db/0x268
          [<ffffffff810604a3>] ? wake_up_bit+0x2a/0x2a
          [<ffffffff8105c1bb>] ? manage_workers.clone.16+0x1f6/0x1f6
          [<ffffffff8105ffd6>] kthread+0x6f/0x77
          [<ffffffff814c0304>] kernel_thread_helper+0x4/0x10
          [<ffffffff8105ff67>] ? kthread_freezable_should_stop+0x4b/0x4b
      
          Call Trace:
          [<ffffffff814b7e88>] schedule+0x64/0x66
          [<ffffffff814b8041>] schedule_preempt_disabled+0xe/0x10
          [<ffffffff814b70a1>] __mutex_lock_common.clone.5+0x117/0x17a
          [<ffffffff814b7117>] __mutex_lock_slowpath+0x13/0x15
          [<ffffffff814b6f76>] mutex_lock+0x23/0x37
          [<ffffffff8125b890>] ? list_del+0x11/0x30
          [<ffffffffa00edc84>] fcoe_vport_destroy+0x43/0x5f [fcoe]
          [<ffffffffa009130a>] fc_vport_terminate+0x48/0x110 [scsi_transport_fc]
          [<ffffffffa00913ef>] fc_vport_sched_delete+0x1d/0x79 [scsi_transport_fc]
          [<ffffffff8105a82a>] process_one_work+0x1a8/0x2a4
          [<ffffffffa00913d2>] ? fc_vport_terminate+0x110/0x110 [scsi_transport_fc]
          [<ffffffff8105c396>] worker_thread+0x1db/0x268
          [<ffffffff8105c1bb>] ? manage_workers.clone.16+0x1f6/0x1f6
          [<ffffffff8105ffd6>] kthread+0x6f/0x77
          [<ffffffff814c0304>] kernel_thread_helper+0x4/0x10
          [<ffffffff8105ff67>] ? kthread_freezable_should_stop+0x4b/0x4b
          [<ffffffff814c0300>] ? gs_change+0x13/0x13
      
      A prior attempt to fix this issue is posted here:
      http://lists.open-fcoe.org/pipermail/devel/2012-October/012318.html
      or
      http://article.gmane.org/gmane.linux.scsi.open-fcoe.devel/11924
      
      Based on feedback and discussion with Neil Horman, it seems that the above patch
      may leave a case where fcoe_vport_destroy() and fcoe_destroy_work() can
      race; hence that patch has been withdrawn in favor of this patch, which tries to
      solve the same problem in a different way.
      
      In the current approach, instead of removing the fcoe_config_mutex from the
      vport_delete callback function, I've chosen to delete all the NPIV ports first
      on a given root lport before continuing with the removal of the root lport.
      Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
      Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      94aa743a
    • fcoe: close race on link speed detection in fcoe code · f9184df3
      Committed by Neil Horman
      When creating an fcoe interface, we call fcoe_link_speed_update before we add the
      lport's fcoe interface to the fc_hostlist.  Network device events like
      NETDEV_CHANGE are only processed if an fcoe interface is found with an
      underlying netdev that matches the netdev of the event.  Since this processing
      in fcoe_device_notification is how link_speed changes get communicated to the
      libfc code (via fcoe_link_speed_update), we have a race condition: if a
      NETDEV_CHANGE event is sent after the call to fcoe_link_speed_update in
      fcoe_netdev_config, but before we add the interface to the fc_hostlist, we will
      lose the event and attributes like /sys/class/fc_host/hostX/speed will not get
      updated properly.
      
      Fix this by moving the add to the fc_hostlist above the serialized call to
      fcoe_netdev_config, ensuring that we catch netdev events before we make a
      direct call to fcoe_link_speed_update.
      
      Also use this opportunity to clean up access to the fc_hostlist a bit by
      creating a fcoe_hostlist_del accessor and replacing the cleanup in fcoe_exit to
      use it properly.
      
      Tested by myself successfully
      
      [ Comment over 80 chars broken into multi-line by Robert Love to
        satisfy checkpatch.pl ]
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      f9184df3
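The ordering problem described in this commit can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions: `struct netdev`, `struct iface`, and every function name below are hypothetical stand-ins, not the actual fcoe code, and the concurrent event is modeled by injecting it at the point where the race window would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a network device and an fcoe interface. */
struct netdev { int link_speed; };
struct iface  { struct netdev *dev; int reported_speed; bool on_hostlist; };

/* Copy the device's current link speed into the reported attribute. */
static void link_speed_update(struct iface *ifc)
{
    ifc->reported_speed = ifc->dev->link_speed;
}

/* Event handler: only interfaces already on the host list see the event. */
static void netdev_change_event(struct iface *ifc, int new_speed)
{
    ifc->dev->link_speed = new_speed;
    if (ifc->on_hostlist)
        link_speed_update(ifc);
}

/* Old order: read the speed, then register. An event in between is lost. */
static int create_read_then_register(struct netdev *dev, int speed_during_setup)
{
    struct iface ifc = { .dev = dev };

    assert(dev != NULL);
    link_speed_update(&ifc);
    netdev_change_event(&ifc, speed_during_setup); /* arrives in the window */
    ifc.on_hostlist = true;
    return ifc.reported_speed;
}

/* New order: register first, so an event during setup is delivered. */
static int create_register_then_read(struct netdev *dev, int speed_during_setup)
{
    struct iface ifc = { .dev = dev };

    assert(dev != NULL);
    ifc.on_hostlist = true;
    netdev_change_event(&ifc, speed_during_setup); /* now seen by the handler */
    link_speed_update(&ifc);
    return ifc.reported_speed;
}
```

With a device that starts at speed 1000 and changes to 10000 during setup, the first variant keeps reporting the stale value while the second reports the new one.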
  3. 15 Dec, 2012 · 12 commits
  4. 05 Dec, 2012 · 2 commits
    • libfcoe: Save some memory and optimize name lookups · ef60f674
      Committed by Robert Love
      Instead of creating a structure with an enum and a pointer
      to a string, simply allocate an array of strings and use
      the enum values for the indices.
      
      This means that we do not need to iterate through the list
      of entries when looking up a string name by its enum key.
      
      This will also help with a later patch that will add
      more fcoe_sysfs attributes that will also use the
      fcoe_enum_name_search macro. One attribute will also do
      a reverse lookup, which requires less code when the
      enum-to-string mappings are organized the way this patch
      organizes them.
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      ef60f674
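The enum-indexed string array described in this commit can be sketched as follows. This is a minimal illustration, not the libfcoe code: the `port_state` enum, its names, and both lookup functions are hypothetical and exist only to show the technique.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for illustration; not the actual libfcoe definitions. */
enum port_state {
    PORT_STATE_UNKNOWN,
    PORT_STATE_ONLINE,
    PORT_STATE_OFFLINE,
    PORT_STATE_MAX          /* number of states, not a state itself */
};

/* Array indexed directly by enum value (C99 designated initializers). */
static const char *const port_state_names[PORT_STATE_MAX] = {
    [PORT_STATE_UNKNOWN] = "Unknown",
    [PORT_STATE_ONLINE]  = "Online",
    [PORT_STATE_OFFLINE] = "Offline",
};

/* Forward lookup: a bounds check and one array access, no iteration. */
static const char *port_state_name(enum port_state state)
{
    if ((unsigned int)state >= PORT_STATE_MAX)
        return NULL;
    return port_state_names[state];
}

/* Reverse lookup: the loop index is already the enum value. */
static enum port_state port_state_from_name(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < PORT_STATE_MAX; i++) {
        if (strcmp(port_state_names[i], name) == 0)
            return (enum port_state)i;
    }
    return PORT_STATE_UNKNOWN;  /* fallback chosen for this sketch */
}
```

Compared with an array of `{enum, string}` pairs, this layout drops the stored enum field from every entry, and the reverse lookup needs no separate table.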
    • libfc: fix REC handling · 5b97fabd
      Committed by Vasu Dev
      Currently fc_fcp_timeout doesn't check the FC_RP_FLAGS_REC_SUPPORTED
      flag first, which prevents a REC request from ever going out
      to a target that has REC support. This patch fixes
      fc_fcp_timeout by checking the FC_RP_FLAGS_REC_SUPPORTED flag first.
      
      The changed order won't cause any issue when clearing
      FC_RP_FLAGS_REC_SUPPORTED on failed IO with a target that does not support
      REC, since the retry on failed IO would succeed.
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      Tested-by: Ross Brattain <ross.b.brattain@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      5b97fabd
  5. 02 Dec, 2012 · 1 commit
  6. 01 Dec, 2012 · 2 commits
    • drivers/rtc/rtc-tps65910.c: fix invalid pointer access on _remove() · 1430e178
      Committed by Kim, Milo
      The tps65910_rtc data is registered as the platform driver data in
      _probe().  Therefore the tps65910_rtc should be used when unregistering
      the rtc device, and the device pointer should be retrieved from the
      platform_device structure.
      
      This patch fixes the below oops:
      
       Unable to handle kernel NULL pointer dereference at virtual address 00000008
       Modules linked in: rtc_tps65910(-)
       CPU: 0    Not tainted  (3.7.0-rc7-next-20121128-g6b1f974-dirty #7)
       PC is at tps65910_rtc_alarm_irq_enable+0x20/0x2c [rtc_tps65910]
           (tps65910_rtc_alarm_irq_enable+0x20/0x2c [rtc_tps65910])
           (tps65910_rtc_remove+0x18/0x28 [rtc_tps65910])
           (platform_drv_remove+0x18/0x1c)
           (__device_release_driver+0x70/0xcc)
           (driver_detach+0xb4/0xb8)
           (bus_remove_driver+0x7c/0xc0)
           (sys_delete_module+0x148/0x21c)
      Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1430e178
    • revert "Revert "mm: remove __GFP_NO_KSWAPD"" · a5091539
      Committed by Andrew Morton
      It appears that this patch was innocent, and we hope that "mm: avoid
      waking kswapd for THP allocations when compaction is deferred or
      contended" will fix the final kswapd-spinning cause.
      
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a5091539
  7. 30 Nov, 2012 · 4 commits
    • blockdev: remove bd_block_size_semaphore again · 1e8b3332
      Committed by Linus Torvalds
      This reverts the block-device direct access code to the previous
      unlocked code, now that fs/buffer.c no longer needs external locking.
      
      With this, fs/block_dev.c is back to the original version, apart from a
      whitespace cleanup that I didn't want to revert.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1e8b3332
    • bonding: fix race condition in bonding_store_slaves_active · e196c0e5
      Committed by nikolay@redhat.com
      There is a race between bonding_store_slaves_active() and the slave
       manipulation functions. The bond_for_each_slave use in
       bonding_store_slaves_active() is not protected by any synchronization
       mechanism, so a NULL pointer dereference is easy to reach.
       Fixed by acquiring the bond->lock for the slave walk.
      
       v2: Make description text < 75 columns
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e196c0e5
    • bonding: make arp_ip_target parameter checks consistent with sysfs · 90fb6250
      Committed by nikolay@redhat.com
      The module can be loaded with arp_ip_target="255.255.255.255" which makes
       it impossible to remove as the function in sysfs checks for that value,
       so we make the parameter checks consistent with sysfs.
      
       v2: Fix formatting
       v3: Make description text < 75 columns
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      90fb6250
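One way to keep two entry points consistent, as this commit aims to do, is to route both through a single validator. The sketch below is a userspace illustration only: every function name is hypothetical, it is not the bonding driver's code, and rejecting 0.0.0.0 in addition to 255.255.255.255 is an assumption of the sketch rather than something the commit message states.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical shared check on an address in network byte order.
 * Rejecting the broadcast address follows the commit message;
 * rejecting 0.0.0.0 as well is an assumption made for this sketch.
 */
static bool ip_target_usable(uint32_t addr_be)
{
    return addr_be != htonl(INADDR_BROADCAST) &&
           addr_be != htonl(INADDR_ANY);
}

/* Parse dotted-quad text and apply the shared check. */
static bool parse_ip_target(const char *text, uint32_t *out)
{
    struct in_addr addr;

    assert(text != NULL && out != NULL);
    if (inet_pton(AF_INET, text, &addr) != 1)
        return false;               /* not a valid IPv4 literal */
    if (!ip_target_usable(addr.s_addr))
        return false;               /* same rule for every caller */
    *out = addr.s_addr;
    return true;
}

/* Both hypothetical entry points funnel through the same parser. */
static bool module_param_accepts(const char *text)
{
    uint32_t addr;
    return parse_ip_target(text, &addr);
}

static bool sysfs_store_accepts(const char *text)
{
    uint32_t addr;
    return parse_ip_target(text, &addr);
}
```

Because the rule lives in one place, a value that one path would refuse to remove can never be accepted by the other path.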
    • bonding: fix miimon and arp_interval delayed work race conditions · fbb0c41b
      Committed by nikolay@redhat.com
      First I would give three observations which will be used later.
      Observation 1: if (delayed_work_pending(wq)) cancel_delayed_work(wq)
       This usage is wrong because the pending bit is cleared just before the
       work's fn is executed and if the function re-arms itself we might end up
       with the work still running. It's safe to call cancel_delayed_work_sync()
       even if the work is not queued at all.
      Observation 2: Use of INIT_DELAYED_WORK()
       Work needs to be initialized only once prior to (de/en)queueing.
      Observation 3: IFF_UP is set only after ndo_open is called
      
      Related race conditions:
      1. Race between bonding_store_miimon() and bonding_store_arp_interval()
       Because of Obs.1 we can end up having both works enqueued.
      2. Multiple races with INIT_DELAYED_WORK()
       Since the works are not protected by anything between INIT_DELAYED_WORK()
       and calls to (en/de)queue it is possible for races between the following
       functions:
       (races are also possible between the calls to INIT_DELAYED_WORK()
        and workqueue code)
       bonding_store_miimon() - bonding_store_arp_interval(), bond_close(),
      			  bond_open(), enqueued functions
       bonding_store_arp_interval() - bonding_store_miimon(), bond_close(),
      				bond_open(), enqueued functions
      3. By Obs.1 we need to change bond_cancel_all()
      
      Bugs 1 and 2 are fixed by moving all work initializations into bond_open,
      which, by Obs. 2 and Obs. 3 and the fact that we make sure that all works
      are cancelled in bond_close(), is guaranteed not to have any work
      enqueued.
      Also RTNL lock is now acquired in bonding_store_miimon/arp_interval so
      they can't race with bond_close and bond_open. The opposing work is
      cancelled only if the IFF_UP flag is set and it is cancelled
      unconditionally. The opposing work is already cancelled if the interface
      is down so no need to cancel it again. This way we don't need new
      synchronizations for the bonding workqueue. These bugs (and fixes) are
      tied together and belong in the same patch.
      Note: I have left 1 line intentionally over 80 characters (84) because I
            didn't like how it looks broken down. If you'd prefer it otherwise,
            then simply break it.
      
       v2: Make description text < 75 columns
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fbb0c41b
  8. 29 Nov, 2012 · 6 commits
  9. 28 Nov, 2012 · 2 commits
  10. 27 Nov, 2012 · 8 commits