1. 25 July 2012 (1 commit)
    • irqdomain: eliminate slow-path revmap lookups · 4c0946c4
      Committed by Grant Likely
      With the current state of irq_domain, the reverse map is always updated
      when new IRQs get mapped.  This means that the irq_find_mapping() function
      can be simplified to execute the revmap lookup functions unconditionally.
      
      This patch adds lookup functions for the revmaps that don't yet have one
      and removes the slow path lookup code path.
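
      A rough sketch of what the unconditional lookup can look like, open-coded
      per revmap type (illustrative only; helper and field names follow the
      irq_domain internals of the time and are not quoted from the patch):

        unsigned int irq_find_mapping(struct irq_domain *domain,
                                      irq_hw_number_t hwirq)
        {
            struct irq_data *data;

            switch (domain->revmap_type) {
            case IRQ_DOMAIN_MAP_LEGACY:
                return irq_domain_legacy_revmap(domain, hwirq);
            case IRQ_DOMAIN_MAP_LINEAR:
                return irq_linear_revmap(domain, hwirq);
            case IRQ_DOMAIN_MAP_TREE:
                /* radix tree lookup under RCU */
                rcu_read_lock();
                data = radix_tree_lookup(&domain->revmap_data.tree, hwirq);
                rcu_read_unlock();
                if (data)
                    return data->irq;
                break;
            case IRQ_DOMAIN_MAP_NOMAP:
                /* hwirq doubles as the irq number */
                data = irq_get_irq_data(hwirq);
                if (data && data->domain == domain && data->hwirq == hwirq)
                    return hwirq;
                break;
            }
            return 0;   /* no mapping; no slow-path fallback anymore */
        }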
      
      v8: Broke out unrelated changes into separate patches.  Rebased on Paul's irq
          association patches.
      v7: Rebased to irqdomain/next for v3.4 and applied before the removal of 'hint'
      v6: Remove the slow path entirely.  The only place where the slow path
          could get called is for a linear mapping if the hwirq number is larger
          than the linear revmap size.  There shouldn't be any interrupt
          controllers that do that.
      v5: rewrite to not use a ->revmap() callback.  It is simpler, smaller,
          safer and faster to open code each of the revmap lookups directly into
          irq_find_mapping() via a switch statement.
      v4: Fix build failure on incorrect variable reference.
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Milton Miller <miltonm@bga.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Rob Herring <rob.herring@calxeda.com>
  2. 23 July 2012 (9 commits)
  3. 22 July 2012 (4 commits)
  4. 20 July 2012 (2 commits)
  5. 19 July 2012 (2 commits)
    • Make wait_for_device_probe() also do scsi_complete_async_scans() · eea03c20
      Committed by Linus Torvalds
      Commit a7a20d10 ("sd: limit the scope of the async probe domain")
      made the SCSI device probing run device discovery in its own async
      domain.
      
      However, as a result, the partition detection was no longer synchronized
      by async_synchronize_full() (which, despite the name, only synchronizes
      the global async space, not all of them).  Which in turn meant that
      "wait_for_device_probe()" would not wait for the SCSI partitions to be
      parsed.
      
      And "wait_for_device_probe()" was what the boot time init code relied on
      for mounting the root filesystem.
      
      Now, most people never noticed this, because not only is it
      timing-dependent, but modern distributions all use initrd.  So the root
      filesystem isn't actually on a disk at all.  And then before they
      actually mount the final disk filesystem, they will have loaded the
      scsi-wait-scan module, which not only does the expected
      wait_for_device_probe(), but also does scsi_complete_async_scans().
      
      [ Side note: scsi_complete_async_scans() had also been partially broken,
        but that was fixed in commit 43a8d39d ("fix async probe
        regression"), so that same commit a7a20d10 had actually broken
        setups even if you used scsi-wait-scan explicitly ]
      
      Solve this problem by just moving the scsi_complete_async_scans() call
      into wait_for_device_probe().  Everybody who wants to wait for device
      probing to finish really wants the SCSI probing to complete, so there's
      no reason not to do this.
      
      So now "wait_for_device_probe()" really does what the name implies, and
      properly waits for device probing to finish.  This also removes the now
      unnecessary extra calls to scsi_complete_async_scans().
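
      A minimal sketch of the resulting helper (illustrative; the surrounding
      driver-core wait machinery is simplified and the exact placement of the
      new call is not quoted from the patch):

        void wait_for_device_probe(void)
        {
            /* wait for the known devices to complete their probing */
            wait_event(probe_waitqueue, atomic_read(&probe_count) == 0);
            async_synchronize_full();
            /* new: also wait for the asynchronous SCSI scans to finish */
            scsi_complete_async_scans();
        }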
      Reported-and-tested-by: Artem S. Tashkinov <t.artem@mailcity.com>
      Cc: Dan Williams <dan.j.williams@gmail.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: Borislav Petkov <bp@amd64.org>
      Cc: linux-scsi <linux-scsi@vger.kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • PM / Sleep: Require CAP_BLOCK_SUSPEND to use wake_lock/wake_unlock · 11388c87
      Committed by Rafael J. Wysocki
      Require processes wanting to use the wake_lock/wake_unlock sysfs
      files to have the CAP_BLOCK_SUSPEND capability, which also is
      required for the eventpoll EPOLLWAKEUP flag to be effective, so that
      all interfaces related to blocking autosleep depend on the same
      capability.
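
      In practice this amounts to a capability check in front of the existing
      handlers; a minimal sketch, assuming pm_wake_lock() is the helper behind
      the wake_lock file (placement of the check is illustrative):

        static ssize_t wake_lock_store(struct kobject *kobj,
                                       struct kobj_attribute *attr,
                                       const char *buf, size_t n)
        {
            int error;

            if (!capable(CAP_BLOCK_SUSPEND))
                return -EPERM;

            error = pm_wake_lock(buf);
            return error ? error : n;
        }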
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: stable@vger.kernel.org
      Acked-by: Michael Kerrisk <mtk.man-pages@gmail.com>
  6. 18 July 2012 (9 commits)
    • workqueue: simplify CPU hotplug code · 8db25e78
      Committed by Tejun Heo
      With trustee gone, CPU hotplug code can be simplified.
      
      * gcwq_claim/release_management() now also grab and release the gcwq
        lock, respectively, and gained _and_lock and _and_unlock postfixes.
      
      * All CPU hotplug logic was implemented in workqueue_cpu_callback()
        which was called by workqueue_cpu_up/down_callback() for the correct
        priority.  This was because up and down paths shared a lot of logic,
        which is no longer true.  Remove workqueue_cpu_callback() and move
        all hotplug logic into the two actual callbacks.
      
      This patch doesn't make any functional changes.
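
      A condensed sketch of the resulting shape of the online-side callback
      (illustrative, most details elided):

        static int __cpuinit workqueue_cpu_up_callback(struct notifier_block *nfb,
                                                       unsigned long action, void *hcpu)
        {
            struct global_cwq *gcwq = get_gcwq((unsigned long)hcpu);

            switch (action & ~CPU_TASKS_FROZEN) {
            case CPU_UP_PREPARE:
                /* make sure each pool on this gcwq has a worker */
                break;
            case CPU_DOWN_FAILED:
            case CPU_ONLINE:
                gcwq_claim_management_and_lock(gcwq);
                gcwq->flags &= ~GCWQ_DISASSOCIATED;
                rebind_workers(gcwq);
                gcwq_release_management_and_unlock(gcwq);
                break;
            }
            return NOTIFY_OK;
        }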
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    • workqueue: remove CPU offline trustee · 628c78e7
      Committed by Tejun Heo
      With the previous changes, a disassociated global_cwq now can run as
      an unbound one on its own - it can create workers as necessary to
      drain remaining works after the CPU has been brought down and manage
      the number of workers using the usual idle timer mechanism making
      trustee completely redundant except for the actual unbinding
      operation.
      
      This patch removes the trustee and lets a disassociated global_cwq
      manage itself.  Unbinding is moved to a work item (for CPU affinity)
      which is scheduled and flushed from CPU_DOWN_PREPARE.
      
      This patch moves nr_running clearing outside gcwq and manager locks to
      simplify the code.  As nr_running is unused at that point, this is
      safe.
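
      The CPU_DOWN_PREPARE side then boils down to running the unbind step on
      the dying CPU and waiting for it; a sketch (gcwq_unbind_fn is an assumed
      name for the unbinding work function):

        struct work_struct unbind_work;

        INIT_WORK_ONSTACK(&unbind_work, gcwq_unbind_fn);
        schedule_work_on(cpu, &unbind_work);    /* runs with the right CPU affinity */
        flush_work(&unbind_work);               /* wait until the gcwq is disassociated */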
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    • workqueue: don't butcher idle workers on an offline CPU · 3ce63377
      Committed by Tejun Heo
      Currently, during CPU offlining, after all pending work items are
      drained, the trustee butchers all workers.  Also, on CPU onlining
      failure, workqueue_cpu_callback() ensures that the first idle worker
      is destroyed.  Combined, these guarantee that an offline CPU doesn't
      have any worker for it once all the lingering work items are finished.
      
      This guarantee isn't really necessary and makes CPU on/offlining more
      expensive than it needs to be, especially for platforms which use CPU
      hotplug for powersaving.
      
      This patch removes idle worker butchering from the trustee and lets
      a CPU which failed onlining keep the created first
      worker.  The first worker is created if the CPU doesn't have any
      during CPU_DOWN_PREPARE and started right away.  If onlining succeeds,
      the rebind_workers() call in CPU_ONLINE will rebind it like any other
      workers.  If onlining fails, the worker is left alone till the next
      try.
      
      This makes CPU hotplugs cheaper by allowing global_cwqs to keep
      workers across them and simplifies code.
      
      Note that trustee doesn't re-arm idle timer when it's done and thus
      the disassociated global_cwq will keep all workers until it comes back
      online.  This will be improved by further patches.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    • workqueue: reimplement CPU online rebinding to handle idle workers · 25511a47
      Committed by Tejun Heo
      Currently, if there are workers left when a CPU is being brought back
      online, the trustee kills all idle workers and schedules rebind_work
      so that they re-bind to the CPU after the currently executing work is
      finished.  This works for busy workers because concurrency management
      doesn't try to wake them up from scheduler callbacks, which require
      the target task to be on the local run queue.  The busy worker bumps
      concurrency counter appropriately as it clears WORKER_UNBOUND from the
      rebind work item and it's bound to the CPU before returning to the
      idle state.
      
      To reduce CPU on/offlining overhead (as many embedded systems use it
      for powersaving) and simplify the code path, workqueue is planned to
      be modified to retain idle workers across CPU on/offlining.  This
      patch reimplements CPU online rebinding such that it can also handle
      idle workers.
      
      As noted earlier, due to the local wakeup requirement, rebinding idle
      workers is tricky.  All idle workers must be re-bound before scheduler
      callbacks are enabled.  This is achieved by interlocking idle
      re-binding.  Idle workers are requested to re-bind and then hold until
      all idle re-binding is complete so that no bound worker starts
      executing work items.  Only after all idle workers are re-bound and
      parked, CPU_ONLINE proceeds to release them and queue rebind work item
      to busy workers thus guaranteeing scheduler callbacks aren't invoked
      until all idle workers are ready.
      
      worker_rebind_fn() is renamed to busy_worker_rebind_fn() and
      idle_worker_rebind() for idle workers is added.  Rebinding logic is
      moved to rebind_workers() and now called from CPU_ONLINE after
      flushing trustee.  While at it, add CPU sanity check in
      worker_thread().
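
      A minimal sketch of the interlock from the idle worker's side (field
      names such as idle_rebind and rebind_hold are assumptions for
      illustration, not necessarily the committed identifiers):

        static void idle_worker_rebind(struct worker *worker)
        {
            struct global_cwq *gcwq = worker->pool->gcwq;

            /* re-bind to the now-online CPU and report back */
            worker_maybe_bind_and_lock(worker);
            if (!--worker->idle_rebind->cnt)
                complete(&worker->idle_rebind->done);
            spin_unlock_irq(&gcwq->lock);

            /* park until CPU_ONLINE has re-bound every idle worker */
            wait_event(gcwq->rebind_hold, !(worker->flags & WORKER_REBIND));
        }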
      
      Note that now a worker may become idle or the manager between trustee
      release and rebinding during CPU_ONLINE.  As the previous patch
      updated create_worker() so that it can be used by regular manager
      while unbound and this patch implements idle re-binding, this is safe.
      
      This prepares for removal of trustee and keeping idle workers across
      CPU hotplugs.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    • workqueue: drop @bind from create_worker() · bc2ae0f5
      Committed by Tejun Heo
      Currently, create_worker()'s callers are responsible for deciding
      whether the newly created worker should be bound to the associated CPU
      and create_worker() sets WORKER_UNBOUND only for the workers for the
      unbound global_cwq.  Creation during normal operation is always via
      maybe_create_worker() and @bind is true.  For workers created during
      hotplug, @bind is false.
      
      Normal operation path is planned to be used even while the CPU is
      going through hotplug operations or offline and this static decision
      won't work.
      
      Drop @bind from create_worker() and decide whether to bind by looking
      at GCWQ_DISASSOCIATED.  create_worker() will also set WORKER_UNBOUND
      automatically if disassociated.  To avoid flipping GCWQ_DISASSOCIATED
      while create_worker() is in progress, the flag is now allowed to be
      changed only while holding all manager_mutexes on the global_cwq.
      
      This requires that GCWQ_DISASSOCIATED is not cleared behind trustee's
      back.  CPU_ONLINE no longer clears DISASSOCIATED before flushing
      trustee, which clears DISASSOCIATED before rebinding remaining workers
      if asked to release.  For cases where trustee isn't around, CPU_ONLINE
      clears DISASSOCIATED after flushing trustee.  Also, now, first_idle
      has UNBOUND set on creation which is explicitly cleared by CPU_ONLINE
      while binding it.  These convolutions will soon be removed by further
      simplification of CPU hotplug path.
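
      Conceptually the decision inside create_worker() becomes something like
      this sketch (fragment, not the full function):

        bool on_unbound_cpu = gcwq->cpu == WORK_CPU_UNBOUND ||
                              (gcwq->flags & GCWQ_DISASSOCIATED);

        if (!on_unbound_cpu)
            kthread_bind(worker->task, gcwq->cpu);  /* still associated: bind */
        else
            worker->flags |= WORKER_UNBOUND;        /* skip concurrency management */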
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    • workqueue: use mutex for global_cwq manager exclusion · 60373152
      Committed by Tejun Heo
      POOL_MANAGING_WORKERS is used to ensure that at most one worker takes
      the manager role at any given time on a given global_cwq.  The
      trustee later hitched onto it to assume the manager role, adding a
      blocking wait for the bit.  As the trustee already needed a custom
      wait mechanism, waiting for
      MANAGING_WORKERS was rolled into the same mechanism.
      
      Trustee is scheduled to be removed.  This patch separates out
      MANAGING_WORKERS wait into a per-pool mutex.  Workers use
      mutex_trylock() to test for manager role and trustee uses mutex_lock()
      to claim manager roles.
      
      gcwq_claim/release_management() helpers are added to grab and release
      manager roles of all pools on a global_cwq.  gcwq_claim_management()
      always grabs pool manager mutexes in ascending pool index order and
      uses pool index as lockdep subclass.
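
      A sketch of the claim/release helpers described above (illustrative;
      the pool index doubles as the lockdep subclass so the ascending lock
      order validates):

        static void gcwq_claim_management(struct global_cwq *gcwq)
        {
            struct worker_pool *pool;

            for_each_worker_pool(pool, gcwq)
                mutex_lock_nested(&pool->manager_mutex, pool - gcwq->pools);
        }

        static void gcwq_release_management(struct global_cwq *gcwq)
        {
            struct worker_pool *pool;

            for_each_worker_pool(pool, gcwq)
                mutex_unlock(&pool->manager_mutex);
        }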
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    • workqueue: ROGUE workers are UNBOUND workers · 403c821d
      Committed by Tejun Heo
      Currently, WORKER_UNBOUND is used to mark workers for the unbound
      global_cwq and WORKER_ROGUE is used to mark workers for disassociated
      per-cpu global_cwqs.  Both are used to make the marked worker skip
      concurrency management and the only place they make any difference is
      in worker_enter_idle() where WORKER_ROGUE is used to skip scheduling
      idle timer, which can easily be replaced with trustee state testing.
      
      This patch replaces WORKER_ROGUE with WORKER_UNBOUND and drops
      WORKER_ROGUE.  This is to prepare for removing trustee and handling
      disassociated global_cwqs as unbound.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    • workqueue: drop CPU_DYING notifier operation · f2d5a0ee
      Committed by Tejun Heo
      Workqueue used CPU_DYING notification to mark GCWQ_DISASSOCIATED.
      This was necessary because workqueue's CPU_DOWN_PREPARE happened
      before other DOWN_PREPARE notifiers and workqueue needed to stay
      associated across the rest of DOWN_PREPARE.
      
      After the previous patch, workqueue's DOWN_PREPARE happens after
      others and can set GCWQ_DISASSOCIATED directly.  Drop CPU_DYING and
      let the trustee set GCWQ_DISASSOCIATED after disabling concurrency
      management.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    • workqueue: perform cpu down operations from low priority cpu_notifier() · 65758202
      Committed by Tejun Heo
      Currently, all workqueue cpu hotplug operations run off
      CPU_PRI_WORKQUEUE which is higher than normal notifiers.  This is to
      ensure that workqueue is up and running while bringing up a CPU before
      other notifiers try to use workqueue on the CPU.
      
      Per-cpu workqueues are supposed to remain working and bound to the CPU
      for normal CPU_DOWN_PREPARE notifiers.  This holds mostly true even
      with workqueue offlining running with higher priority because
      workqueue CPU_DOWN_PREPARE only creates a bound trustee thread which
      runs the per-cpu workqueue without concurrency management and without
      explicitly detaching the existing workers.
      
      However, if the trustee needs to create new workers, it creates
      unbound workers which may wander off to other CPUs while
      CPU_DOWN_PREPARE notifiers are in progress.  Furthermore, if the CPU
      down is cancelled, the per-CPU workqueue may end up with workers which
      aren't bound to the CPU.
      
      While reliably reproducible with a convoluted artificial test-case
      involving scheduling and flushing CPU burning work items from CPU down
      notifiers, this isn't very likely to happen in the wild, and, even
      when it happens, the effects are likely to be hidden by the following
      successful CPU down.
      
      Fix it by using different priorities for up and down notifiers - high
      priority for up operations and low priority for down operations.
      
      Workqueue cpu hotplug operations will soon go through further cleanup.
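
      Registration then ends up along these lines (sketch; the
      CPU_PRI_WORKQUEUE_UP/DOWN constants are assumed names for the split
      priorities):

        /* high priority: workqueue must be working before other CPU_UP notifiers */
        cpu_notifier(workqueue_cpu_up_callback, CPU_PRI_WORKQUEUE_UP);
        /* low priority: per-cpu workqueues stay usable for other CPU_DOWN notifiers */
        hotcpu_notifier(workqueue_cpu_down_callback, CPU_PRI_WORKQUEUE_DOWN);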
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
  7. 17 July 2012 (1 commit)
  8. 15 July 2012 (8 commits)
  9. 14 July 2012 (4 commits)
    • VFS: Pass mount flags to sget() · 9249e17f
      Committed by David Howells
      Pass mount flags to sget() so that it can use them in initialising a new
      superblock before the set function is called.  They could also be passed to the
      compare function.
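
      The prototype then takes the flags alongside the callbacks, roughly:

        struct super_block *sget(struct file_system_type *type,
                                 int (*test)(struct super_block *, void *),
                                 int (*set)(struct super_block *, void *),
                                 int flags, void *data);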
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors · be34d1a3
      Committed by David Howells
      copy_tree() can theoretically fail in a case other than ENOMEM, but always
      returns NULL which is interpreted by callers as -ENOMEM.  Change it to return
      an explicit error.
      
      Also change clone_mnt() for consistency and because union mounts will add new
      error cases.
      
      Thanks to Andreas Gruenbacher <agruen@suse.de> for a bug fix.
      [AV: folded braino fix by Dan Carpenter]
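
      For callers this means switching from a NULL check to IS_ERR(), e.g.
      (illustrative fragment):

        struct mount *tree;

        tree = copy_tree(old, old->mnt.mnt_root, CL_COPY_ALL);
        if (IS_ERR(tree))
            return ERR_CAST(tree);  /* propagate the real error, not -ENOMEM */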
      
      Original-author: Valerie Aurora <vaurora@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Valerie Aurora <valerie.aurora@gmail.com>
      Cc: Andreas Gruenbacher <agruen@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • get rid of kern_path_parent() · 79714f72
      Committed by Al Viro
      all callers want the same thing, actually - a kinda-sorta analog of
      kern_path_create().  I.e. they want parent vfsmount/dentry (with
      ->i_mutex held, to make sure the child dentry is still their child)
      + the child dentry.
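
      The shape of the helper the callers are converted to is roughly the
      following (sketch; treat the exact name and signature as illustrative
      rather than quoted from the patch):

        /* returns the child dentry; fills *path with the parent vfsmount/dentry
         * and leaves the parent's ->i_mutex held on success */
        struct dentry *kern_path_locked(const char *name, struct path *path);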
      
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • stop passing nameidata to ->lookup() · 00cd8dd3
      Committed by Al Viro
      Just the flags; only NFS cares even about that, but there are
      legitimate uses for such an argument.  And getting rid of that
      completely would require splitting ->lookup() into a couple
      of methods (at least), so let's leave that alone for now...
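
      The method therefore goes from taking a struct nameidata * to taking
      just the lookup flags:

        struct dentry *(*lookup)(struct inode *dir, struct dentry *dentry,
                                 unsigned int flags);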
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>