1. 14 April 2012, 1 commit
  2. 13 April 2012, 2 commits
    • irq_domain: fix type mismatch in debugfs output format · 5269a9ab
      Committed by Grant Likely
      sizeof(void*) yields a size_t (an unsigned long on 64-bit platforms), but it was being used as a width parameter to a "%-*s" format string, which requires an int.  On 64 bit platforms this causes a type mismatch:
      
          linux/kernel/irq/irqdomain.c:575: warning: field width should have type
          'int', but argument 6 has type 'long unsigned int'
      
      This change casts the size to an int so printf gets the right data type.
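      The fix can be reproduced in a few lines of userspace C. This is a minimal sketch, not the kernel code: `format_padded()` is a hypothetical helper standing in for the debugfs printout, and the width of 8 stands in for `sizeof(void *)` on a 64-bit platform.

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      /* The '*' width of "%-*s" consumes an int argument; sizeof() yields a
       * size_t, so an explicit (int) cast is needed, as the commit describes. */
      static void format_padded(char *buf, size_t n, size_t width, const char *s)
      {
              snprintf(buf, n, "%-*s|", (int)width, s);
      }

      int main(void)
      {
              char buf[32];

              /* width 8, e.g. sizeof(void *) on a 64-bit platform */
              format_padded(buf, sizeof(buf), 8, "irq");
              assert(strcmp(buf, "irq     |") == 0);
              return 0;
      }
      ```

      Without the cast, passing a size_t where varargs promotion expects an int is undefined behavior on LP64 targets, which is exactly what the compiler warning flags.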
      Reported-by: Andreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Cc: David Daney <david.daney@cavium.com>
      5269a9ab
    • panic: fix stack dump print on direct call to panic() · 026ee1f6
      Committed by Jason Wessel
      Commit 6e6f0a1f ("panic: don't print redundant backtraces on oops")
      causes a regression where no stack trace will be printed at all for the
      case where kernel code calls panic() directly while not processing an
      oops, and of course there are hundreds of instances of this type of call.
      
      The original commit executed the check (!oops_in_progress), but this will
      always be false because just before the dump_stack() there is a call to
      bust_spinlocks(1), which does the following:
      
        void __attribute__((weak)) bust_spinlocks(int yes)
        {
      	if (yes) {
      		++oops_in_progress;
      
      The proper way to resolve the problem the original commit tried to
      solve is to avoid printing a stack dump from panic() when either of
      the following conditions is true:
      
        1) TAINT_DIE has been set (this is done by oops_end())
           This indicates an oops has already been printed.
        2) oops_in_progress > 1
           This guards against the rare case where panic() is invoked
           a second time, or in between oops_begin() and oops_end()
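      The two conditions can be distilled into a tiny userspace mock. This is an illustration only: `taint_die` and `oops_in_progress` here are plain globals standing in for kernel state, and `panic_should_dump_stack()` is a hypothetical name for the predicate the patch implements.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Stand-ins for kernel state; names mirror the kernel but this is a mock. */
      static bool taint_die;
      static int oops_in_progress;

      /* panic() should dump a stack only if no oops already printed one. */
      static bool panic_should_dump_stack(void)
      {
              return !taint_die && oops_in_progress <= 1;
      }

      int main(void)
      {
              /* Direct call to panic(): bust_spinlocks(1) bumped the count to 1. */
              taint_die = false; oops_in_progress = 1;
              assert(panic_should_dump_stack());

              /* panic() after an oops: oops_end() set TAINT_DIE. */
              taint_die = true;
              assert(!panic_should_dump_stack());

              /* Nested panic between oops_begin() and oops_end(). */
              taint_die = false; oops_in_progress = 2;
              assert(!panic_should_dump_stack());
              return 0;
      }
      ```

      Note why the original `!oops_in_progress` check could never fire: bust_spinlocks(1) has already incremented the counter, so testing against 1 rather than 0 is the essential change.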
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: <stable@vger.kernel.org>	[3.3+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      026ee1f6
  3. 12 April 2012, 1 commit
    • irq_domain: Move irq_virq_count into NOMAP revmap · 6fa6c8e2
      Committed by Grant Likely
      This patch replaces the old global setting of irq_virq_count that is only
      used by the NOMAP mapping and instead uses a revmap_data property so that
      the maximum NOMAP allocation can be set per NOMAP irq_domain.
      
      There is exactly one user of irq_virq_count in-tree right now: PS3.
      Also, irq_virq_count is only useful for the NOMAP mapping.  So,
      instead of having a single global irq_virq_count value, this change
      drops it entirely and adds a max_irq argument to irq_domain_add_nomap().
      That makes it a property of an individual nomap irq domain instead of
      a global system setting.
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Milton Miller <miltonm@bga.com>
      6fa6c8e2
  4. 11 April 2012, 4 commits
  5. 10 April 2012, 2 commits
  6. 06 April 2012, 2 commits
    • nohz: Fix stale jiffies update in tick_nohz_restart() · 6f103929
      Committed by Neal Cardwell
      Fix tick_nohz_restart() to not use a stale ktime_t "now" value when
      calling tick_do_update_jiffies64(now).
      
      If we reach this point in the loop it means that we crossed a tick
      boundary since we grabbed the "now" timestamp, so at this point "now"
      refers to a time in the old jiffy, so using the old value for "now" is
      incorrect, and is likely to give us a stale jiffies value.
      
      In particular, the first time through the loop the
      tick_do_update_jiffies64(now) call is always a no-op, since the
      caller, tick_nohz_restart_sched_tick(), will have already called
      tick_do_update_jiffies64(now) with that "now" value.
      
      Note that tick_nohz_stop_sched_tick() already uses the correct
      approach: when we notice we cross a jiffy boundary, grab a new
      timestamp with ktime_get(), and *then* update jiffies.
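      The stale-read pattern and its fix can be sketched in userspace. Everything below is a mock: `ktime_get_mock()` and `tick_do_update_jiffies64_mock()` are hypothetical stand-ins for the kernel functions, with a fake clock instead of real time.

      ```c
      #include <assert.h>

      /* Mock clock: returns whatever fake_now currently holds
       * (a stand-in for ktime_get()). */
      static long long fake_now;
      static long long ktime_get_mock(void) { return fake_now; }

      static long long last_jiffies_update;

      /* Mock of tick_do_update_jiffies64(): only moves time forward. */
      static void tick_do_update_jiffies64_mock(long long now)
      {
              if (now > last_jiffies_update)
                      last_jiffies_update = now;
      }

      int main(void)
      {
              fake_now = 100;
              long long now = ktime_get_mock();
              last_jiffies_update = now;

              /* Time advances while we loop: a tick boundary is crossed. */
              fake_now = 200;

              /* Buggy pattern: reusing the stale "now" is a no-op here. */
              tick_do_update_jiffies64_mock(now);
              assert(last_jiffies_update == 100);

              /* Fixed pattern: re-read the clock, *then* update jiffies. */
              now = ktime_get_mock();
              tick_do_update_jiffies64_mock(now);
              assert(last_jiffies_update == 200);
              return 0;
      }
      ```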
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1332875377-23014-1-git-send-email-ncardwell@google.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      6f103929
    • simple_open: automatically convert to simple_open() · 234e3405
      Committed by Stephen Boyd
      Many users of debugfs copy the implementation of default_open() when
      they want to support a custom read/write function op.  This leads to a
      proliferation of the default_open() implementation across the entire
      tree.
      
      Now that the common implementation has been consolidated into libfs we
      can replace all the users of this function with simple_open().
      
      This replacement was done with the following semantic patch:
      
      <smpl>
      @ open @
      identifier open_f != simple_open;
      identifier i, f;
      @@
      -int open_f(struct inode *i, struct file *f)
      -{
      (
      -if (i->i_private)
      -f->private_data = i->i_private;
      |
      -f->private_data = i->i_private;
      )
      -return 0;
      -}
      
      @ has_open depends on open @
      identifier fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...
      -.open = open_f,
      +.open = simple_open,
      ...
      };
      </smpl>
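      For reference, the consolidated helper simply hands inode->i_private to file->private_data. A userspace mock with minimal stand-in structs (only the fields used here, not the real kernel definitions) shows the behavior the semantic patch relies on:

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Minimal stand-ins for the kernel structs; only the fields used here. */
      struct inode { void *i_private; };
      struct file  { void *private_data; };

      /* Mirrors the libfs simple_open() logic the tree converged on. */
      static int simple_open_mock(struct inode *inode, struct file *file)
      {
              if (inode->i_private)
                      file->private_data = inode->i_private;
              return 0;
      }

      int main(void)
      {
              int cookie = 42;
              struct inode inode = { .i_private = &cookie };
              struct file file = { .private_data = NULL };

              assert(simple_open_mock(&inode, &file) == 0);
              assert(file.private_data == &cookie);

              /* NULL i_private leaves private_data untouched. */
              inode.i_private = NULL;
              file.private_data = NULL;
              assert(simple_open_mock(&inode, &file) == 0);
              assert(file.private_data == NULL);
              return 0;
      }
      ```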
      
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      234e3405
  7. 05 April 2012, 1 commit
  8. 02 April 2012, 1 commit
  9. 31 March 2012, 3 commits
  10. 30 March 2012, 5 commits
  11. 29 March 2012, 18 commits
    • padata: Fix cpu hotplug · 96120905
      Committed by Steffen Klassert
      We don't remove the cpu that went offline from our cpumasks
      on cpu hotplug. This got lost somewhere along the line, so
      restore it. This fixes a hang of the padata instance on cpu
      hotplug.
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      96120905
    • padata: Use the online cpumask as the default · 13614e0f
      Committed by Steffen Klassert
      We use the active cpumask to determine the superset of cpus
      to use for parallelization. However, the active cpumask is
      for internal usage of the scheduler and therefore not the
      appropriate cpumask for these purposes. So use the online
      cpumask instead.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      13614e0f
    • padata: Add a reference to the api documentation · 107f8bda
      Committed by Steffen Klassert
      Add a reference to the padata api documentation at Documentation/padata.txt
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      107f8bda
    • futex: Mark get_robust_list as deprecated · ec0c4274
      Committed by Kees Cook
      Notify get_robust_list users that the syscall is going away.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Darren Hart <dvhart@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
      Cc: kernel-hardening@lists.openwall.com
      Cc: spender@grsecurity.net
      Link: http://lkml.kernel.org/r/20120323190855.GA27213@www.outflux.net
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      ec0c4274
    • futex: Do not leak robust list to unprivileged process · bdbb776f
      Committed by Kees Cook
      It was possible to extract the robust list head address from a setuid
      process if it had used set_robust_list(), allowing an ASLR info leak. This
      changes the permission checks to be the same as those used for similar
      info that comes out of /proc.
      
      Running a setuid program that uses robust futexes would have had:
        cred->euid != pcred->euid
        cred->euid == pcred->uid
      so the old permissions check would allow it. I'm not aware of any setuid
      programs that use robust futexes, so this is just a preventative measure.
      
      (This patch is based on changes from grsecurity.)
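      The euid/uid mismatch described above can be made concrete with a userspace sketch. The structs and both predicates are illustrative mocks, not the kernel's actual check (the real patch moved to a ptrace_may_access-style test); the point is only why "euid matches either uid" is too weak for a setuid target.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* Toy credential struct; the kernel's struct cred has far more fields. */
      struct cred { unsigned int uid, euid; };

      /* Old-style check (illustrative): euid match with either id of target. */
      static bool old_may_read(const struct cred *cur, const struct cred *tgt)
      {
              return cur->euid == tgt->euid || cur->euid == tgt->uid;
      }

      /* Stricter /proc-style check (sketch, not the exact kernel predicate):
       * require both ids to match. */
      static bool proc_style_may_read(const struct cred *cur, const struct cred *tgt)
      {
              return cur->euid == tgt->euid && cur->euid == tgt->uid;
      }

      int main(void)
      {
              /* Unprivileged user probing a setuid-root process it launched:
               * the target's euid was raised to 0, but its real uid is still 1000. */
              struct cred attacker = { .uid = 1000, .euid = 1000 };
              struct cred setuid_proc = { .uid = 1000, .euid = 0 };

              /* Old check leaks: attacker euid(1000) == target uid(1000). */
              assert(old_may_read(&attacker, &setuid_proc));
              /* The stricter check refuses. */
              assert(!proc_style_may_read(&attacker, &setuid_proc));
              return 0;
      }
      ```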
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Darren Hart <dvhart@linux.intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
      Cc: kernel-hardening@lists.openwall.com
      Cc: spender@grsecurity.net
      Link: http://lkml.kernel.org/r/20120319231253.GA20893@www.outflux.net
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      bdbb776f
    • genirq: Respect NUMA node affinity in setup_irq_irq affinity() · 241fc640
      Committed by Prarit Bhargava
      We respect node affinity of devices already in the irq descriptor
      allocation, but we ignore it for the initial interrupt affinity
      setup, so the interrupt might be routed to a different node.
      
      Restrict the default affinity mask to the node on which the irq
      descriptor is allocated.
      
      [ tglx: Massaged changelog ]
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Link: http://lkml.kernel.org/r/1332788538-17425-1-git-send-email-prarit@redhat.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      241fc640
    • genirq: Get rid of unneeded force parameter in irq_finalize_oneshot() · f3f79e38
      Committed by Alexander Gordeev
      The only place irq_finalize_oneshot() is called with force parameter set
      is the threaded handler error exit path. But IRQTF_RUNTHREAD is dropped
      at this point and irq_wake_thread() is not going to set it again,
      since PF_EXITING is set for this thread already. So irq_finalize_oneshot()
      will drop the threads bit in threads_oneshot anyway and hence the force
      parameter is superfluous.
      Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
      Link: http://lkml.kernel.org/r/20120321162234.GP24806@dhcp-26-207.brq.redhat.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      f3f79e38
    • genirq: Minor readability improvement in irq_wake_thread() · 69592db2
      Committed by Alexander Gordeev
      exit_irq_thread() clears IRQTF_RUNTHREAD flag and drops the thread's bit in
      desc->threads_oneshot then. The bit must not be set again in between and it
      does not, since irq_wake_thread() sees PF_EXITING flag first and returns.
      
      Given the above, the order of checking the PF_EXITING and IRQTF_RUNTHREAD
      flags in irq_wake_thread() is important. This change just makes it more
      visible in the source code.
      Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
      Link: http://lkml.kernel.org/r/20120321162212.GO24806@dhcp-26-207.brq.redhat.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      69592db2
    • sched: Fix __schedule_bug() output when called from an interrupt · 6135fc1e
      Committed by Stephen Boyd
      If schedule is called from an interrupt handler __schedule_bug()
      will call show_regs() with the registers saved during the
      interrupt handling done in do_IRQ(). This means we'll see the
      registers and the backtrace for the process that was interrupted
      and not the full backtrace explaining who called schedule().
      
      This is due to 838225b4 ("sched: use show_regs() to improve
      __schedule_bug() output", 2007-10-24) which improperly assumed
      that get_irq_regs() would return the registers for the current
      stack because it is being called from within an interrupt
      handler. Simply remove the show_regs() code so that we dump a
      backtrace for the interrupt handler that called schedule().
      
      [ I ran across this when I was presented with a scheduling while
        atomic log with a stacktrace pointing at spin_unlock_irqrestore().
        It made no sense and I had to guess what interrupt handler could
        be called and poke around for someone calling schedule() in an
        interrupt handler. A simple test of putting an msleep() in
        an interrupt handler works better with this patch because you
        can actually see the msleep() call in the backtrace. ]
      Also-reported-by: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Satyam Sharma <satyam@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1332979847-27102-1-git-send-email-sboyd@codeaurora.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6135fc1e
    • documentation: remove references to cpu_*_map. · 5f054e31
      Committed by Rusty Russell
      This has been obsolescent for a while, fix documentation and
      misc comments.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      5f054e31
    • pidns: add reboot_pid_ns() to handle the reboot syscall · cf3f8921
      Committed by Daniel Lezcano
      In the case of a child pid namespace, rebooting the system does not really
      make sense.  When the pid namespace is used in conjunction with the other
      namespaces in order to create a linux container, the reboot syscall leads
      to some problems.
      
      A container can reboot the host.  That can be fixed by dropping the
      sys_reboot capability, but then we are unable to correctly poweroff/
      halt/reboot a container, and the container stays stuck at shutdown time
      with the container's init process waiting indefinitely.
      
      After several attempts, no solution from userspace was found to reliably
      handle the shutdown from a container.
      
      This patch proposes making the init process of the child pid namespace
      exit with a signal status set to: SIGINT if the child pid namespace
      called "halt/poweroff" and SIGHUP if the child pid namespace called
      "reboot".  When the reboot syscall is called and we are not in the initial
      pid namespace, we kill the pid namespace for "HALT", "POWEROFF",
      "RESTART", and "RESTART2".  Otherwise we return EINVAL.
      
      Returning EINVAL is also an easy way to check whether this feature is
      supported by the kernel when invoking another 'reboot' option like CAD.
      
      This way the parent process of the child pid namespace knows whether it
      rebooted or not and can take the right decision.
      
      Test case:
      ==========
      
      #include <alloca.h>
      #include <stdio.h>
      #include <sched.h>
      #include <unistd.h>
      #include <signal.h>
      #include <sys/reboot.h>
      #include <sys/types.h>
      #include <sys/wait.h>
      
      #include <linux/reboot.h>
      
      static int do_reboot(void *arg)
      {
              int *cmd = arg;
      
              if (reboot(*cmd))
                      printf("failed to reboot(%d): %m\n", *cmd);
              return 0;
      }
      
      int test_reboot(int cmd, int sig)
      {
              long stack_size = 4096;
              void *stack = alloca(stack_size) + stack_size;
              int status;
              pid_t ret;
      
              ret = clone(do_reboot, stack, CLONE_NEWPID | SIGCHLD, &cmd);
              if (ret < 0) {
                      printf("failed to clone: %m\n");
                      return -1;
              }
      
              if (wait(&status) < 0) {
                      printf("unexpected wait error: %m\n");
                      return -1;
              }
      
              if (!WIFSIGNALED(status)) {
                      printf("child process exited but was not signaled\n");
                      return -1;
              }
      
              if (WTERMSIG(status) != sig) {
                      printf("signal termination is not the one expected\n");
                      return -1;
              }
      
              return 0;
      }
      
      int main(int argc, char *argv[])
      {
              int status;
      
              status = test_reboot(LINUX_REBOOT_CMD_RESTART, SIGHUP);
              if (status < 0)
                      return 1;
              printf("reboot(LINUX_REBOOT_CMD_RESTART) succeed\n");
      
              status = test_reboot(LINUX_REBOOT_CMD_RESTART2, SIGHUP);
              if (status < 0)
                      return 1;
              printf("reboot(LINUX_REBOOT_CMD_RESTART2) succeed\n");
      
              status = test_reboot(LINUX_REBOOT_CMD_HALT, SIGINT);
              if (status < 0)
                      return 1;
              printf("reboot(LINUX_REBOOT_CMD_HALT) succeed\n");
      
              status = test_reboot(LINUX_REBOOT_CMD_POWER_OFF, SIGINT);
              if (status < 0)
                      return 1;
              printf("reboot(LINUX_REBOOT_CMD_POWER_OFF) succeed\n");
      
              status = test_reboot(LINUX_REBOOT_CMD_CAD_ON, -1);
              if (status >= 0) {
                      printf("reboot(LINUX_REBOOT_CMD_CAD_ON) should have failed\n");
                      return 1;
              }
              printf("reboot(LINUX_REBOOT_CMD_CAD_ON) has failed as expected\n");
      
              return 0;
      }
      
      [akpm@linux-foundation.org: tweak and add comments]
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Tested-by: Serge Hallyn <serge.hallyn@canonical.com>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cf3f8921
    • sysctl: use bitmap library functions · 5a04cca6
      Committed by Akinobu Mita
      Use bitmap_set() instead of using set_bit() for each bit.  This conversion
      is valid because the bitmap is private in the function call and atomic
      bitops were unnecessary.
      
      This also includes a minor change:
      - Use bitmap_copy() for shorter typing
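      A userspace sketch of bitmap_set() semantics, assuming nothing beyond standard C: `bitmap_set_mock` is a hypothetical per-bit implementation (the real kernel helper fills whole words at once, but both are plain non-atomic stores, which is what makes the conversion from atomic set_bit() valid here).

      ```c
      #include <assert.h>
      #include <limits.h>
      #include <string.h>

      #define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

      /* Set "len" bits starting at bit "start" with plain (non-atomic) ORs.
       * Bit-by-bit for clarity; the kernel version is word-at-a-time. */
      static void bitmap_set_mock(unsigned long *map, unsigned start, unsigned len)
      {
              for (unsigned bit = start; bit < start + len; bit++)
                      map[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
      }

      int main(void)
      {
              unsigned long map[2];
              memset(map, 0, sizeof(map));

              bitmap_set_mock(map, 2, 3);        /* bits 2, 3, 4 */
              assert(map[0] == ((1UL << 2) | (1UL << 3) | (1UL << 4)));
              assert(map[1] == 0);
              return 0;
      }
      ```

      The atomicity point is the crux: set_bit() pays for a locked RMW per bit, which is wasted work when the bitmap is still private to one function.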
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5a04cca6
    • kexec: add further check to crashkernel · eaa3be6a
      Committed by Zhenzhong Duan
      When using crashkernel=2M-256M, the kernel doesn't give any warning.  This
      is misleading sometimes.
      Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eaa3be6a
    • kexec: crash: don't save swapper_pg_dir for !CONFIG_MMU configurations · d034cfab
      Committed by Will Deacon
      nommu platforms don't have very interesting swapper_pg_dir pointers and
      usually just #define them to NULL, meaning that we can't include them in
      the vmcoreinfo on the kexec crash path.
      
      This patch only saves the swapper_pg_dir if we have an MMU.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Simon Horman <horms@verge.net.au>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d034cfab
    • smp: add func to IPI cpus based on parameter func · b3a7e98e
      Committed by Gilad Ben-Yossef
      Add the on_each_cpu_cond() function that wraps on_each_cpu_mask() and
      calculates the cpumask of cpus to IPI by calling a function supplied as a
      parameter in order to determine whether to IPI each specific cpu.
      
      The function works around allocation failure of the cpumask variable
      with CONFIG_CPUMASK_OFFSTACK=y by iterating over the cpus and sending
      an IPI one at a time via smp_call_function_single().
      
      The function is useful since it allows separating the specific code that
      decides, in each case, whether to IPI a specific cpu for a specific
      request from the common boilerplate code of creating the mask, handling
      failures, etc.
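      The shape of the API can be illustrated with a userspace mock. This is a sketch only: the "cpus" are loop indices, `func(info)` stands in for the per-cpu IPI, and `on_each_cpu_cond_mock`, `is_even`, and `count_call` are all hypothetical names.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      #define NR_CPUS 4

      /* Sketch of the on_each_cpu_cond() idea: invoke func for each "cpu"
       * for which cond_func returns true. */
      static void on_each_cpu_cond_mock(bool (*cond_func)(int cpu, void *info),
                                        void (*func)(void *info),
                                        void *info)
      {
              for (int cpu = 0; cpu < NR_CPUS; cpu++)
                      if (cond_func(cpu, info))
                              func(info);   /* stands in for the IPI */
      }

      static bool is_even(int cpu, void *info) { (void)info; return cpu % 2 == 0; }
      static void count_call(void *info) { (*(int *)info)++; }

      int main(void)
      {
              int calls = 0;
              on_each_cpu_cond_mock(is_even, count_call, &calls);
              assert(calls == 2);   /* cpus 0 and 2 */
              return 0;
      }
      ```

      The caller supplies only the predicate and the work function; everything about building (or failing to build) a cpumask stays inside the helper, which is exactly the separation the changelog describes.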
      
      [akpm@linux-foundation.org: s/gfpflags/gfp_flags/]
      [akpm@linux-foundation.org: avoid double-evaluation of `info' (per Michal), parenthesise evaluation of `cond_func']
      [akpm@linux-foundation.org: s/CPU/CPUs, use all 80 cols in comment]
      Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Sasha Levin <levinsasha928@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Avi Kivity <avi@redhat.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.org>
      Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com>
      Cc: Milton Miller <miltonm@bga.com>
      Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b3a7e98e
    • smp: introduce a generic on_each_cpu_mask() function · 3fc498f1
      Committed by Gilad Ben-Yossef
      We have lots of infrastructure in place to partition multi-core systems
      such that we have a group of CPUs that are dedicated to specific task:
      cgroups, scheduler and interrupt affinity, and cpuisol= boot parameter.
      Still, kernel code will at times interrupt all CPUs in the system via IPIs
      for various needs.  These IPIs are useful and cannot be avoided
      altogether, but in certain cases it is possible to interrupt only specific
      CPUs that have useful work to do and not the entire system.
      
      This patch set, inspired by discussions with Peter Zijlstra and Frederic
      Weisbecker when testing the nohz task patch set, is a first stab at trying
      to explore doing this by locating the places where such global IPI calls
      are being made and turning the global IPI into an IPI for a specific group
      of CPUs.  The purpose of the patch set is to get feedback if this is the
      right way to go for dealing with this issue and indeed, if the issue is
      even worth dealing with at all.  Based on the feedback from this patch set
      I plan to offer further patches that address similar issue in other code
      paths.
      
      This patch creates an on_each_cpu_mask() and on_each_cpu_cond()
      infrastructure API (the former derived from existing arch specific
      versions in Tile and Arm) and uses them to turn several global IPI
      invocation to per CPU group invocations.
      
      Core kernel:
      
      on_each_cpu_mask() calls a function on processors specified by cpumask,
      which may or may not include the local processor.
      
      You must not call this function with disabled interrupts or from a
      hardware interrupt handler or from a bottom half handler.
      
      arch/arm:
      
      Note that the generic version is a little different than the Arm one:
      
      1. It has the mask as first parameter
      2. It calls the function on the calling CPU with interrupts disabled,
         but this should be OK since the function is called on the other CPUs
         with interrupts disabled anyway.
      
      arch/tile:
      
      The API is the same as the tile private one, but the generic version
      also calls the function on the calling CPU with interrupts disabled in
      the UP case.
      
      This is OK since the function is called on the other CPUs
      with interrupts disabled anyway.
      Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Sasha Levin <levinsasha928@gmail.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Avi Kivity <avi@redhat.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.org>
      Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com>
      Cc: Milton Miller <miltonm@bga.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3fc498f1
    • PM / QoS: add pm_qos_update_request_timeout() API · c4772d19
      Committed by MyungJoo Ham
      The new API, pm_qos_update_request_timeout() is to provide a timeout
      with pm_qos_update_request.
      
      For example, pm_qos_update_request_timeout(req, 100, 1000), means that
      QoS request on req with value 100 will be active for 1000 microseconds.
      After 1000 microseconds, the QoS request through req is reset. If there
      is another pm_qos_update_request(req, x) during those 1000 us, the
      new request with value x takes over, as it is another request on the
      same req handle. A new request on the same req handle always
      overrides the previous request, whether it is a conventional request or
      a timeout request.
      Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Acked-by: Mark Gross <markgross@thegnar.org>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      c4772d19
    • PM / Sleep: Mitigate race between the freezer and request_firmware() · 247bc037
      Committed by Rafael J. Wysocki
      There is a race condition between the freezer and request_firmware():
      if request_firmware() runs on one CPU while freeze_processes() runs on
      another, and usermodehelper_disable() called by the latter succeeds in
      grabbing umhelper_sem for writing before usermodehelper_read_trylock()
      called from request_firmware() acquires it for reading, then
      request_firmware() will fail and trigger a WARN_ON() complaining that
      it was called at a wrong time.  However, in fact, it wasn't called at a
      wrong time; freeze_processes() simply happened to be executing
      simultaneously.
      
      To avoid this race, at least in some cases, modify
      usermodehelper_read_trylock() so that it doesn't fail if the
      freezing of tasks has just started and hasn't been completed yet.
      Instead, during the freezing of tasks, it will try to freeze the
      task that has called it so that it can wait until user space is
      thawed without triggering the scary warning.
      
      For this purpose, change usermodehelper_disabled so that it can
      take three different values, UMH_ENABLED (0), UMH_FREEZING and
      UMH_DISABLED.  The first one means that usermode helpers are
      enabled, the last one means "hard disable" (i.e. the system is not
      ready for usermode helpers to be used) and the second one
      is reserved for the freezer.  Namely, when freeze_processes() is
      started, it sets usermodehelper_disabled to UMH_FREEZING which
      tells usermodehelper_read_trylock() that it shouldn't fail just
      yet and should call try_to_freeze() if woken up and cannot
      return immediately.  This way all freezable tasks that happen
      to call request_firmware() right before freeze_processes() is
      started and lose the race for umhelper_sem with it will be
      frozen and will sleep until thaw_processes() unsets
      usermodehelper_disabled.  [For the non-freezable callers of
      request_firmware() the race for umhelper_sem against
      freeze_processes() is unfortunately unavoidable.]
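      The tri-state logic can be distilled into a userspace mock. This is a hypothetical simplification, not the kernel's rwsem-based implementation: `umh_read_trylock_mock` collapses the lock acquisition into a return code plus a "go freeze yourself" flag for the UMH_FREEZING window.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* The three states described above; values follow the changelog. */
      enum { UMH_ENABLED = 0, UMH_FREEZING, UMH_DISABLED };

      /* Mock of the trylock decision: succeed when enabled, fail hard when
       * disabled, and during freezing tell a freezable caller to freeze and
       * wait for thaw instead of failing with a warning. */
      static int umh_read_trylock_mock(int state, bool freezable,
                                       bool *should_freeze)
      {
              *should_freeze = false;
              switch (state) {
              case UMH_ENABLED:
                      return 0;                   /* lock acquired */
              case UMH_FREEZING:
                      if (freezable)
                              *should_freeze = true;  /* sleep until thaw */
                      return -1;                  /* non-freezable still loses */
              default:
                      return -1;                  /* hard disable */
              }
      }

      int main(void)
      {
              bool freeze;

              assert(umh_read_trylock_mock(UMH_ENABLED, true, &freeze) == 0);
              assert(!freeze);
              assert(umh_read_trylock_mock(UMH_FREEZING, true, &freeze) == -1);
              assert(freeze);   /* freezable caller waits, no WARN_ON */
              assert(umh_read_trylock_mock(UMH_FREEZING, false, &freeze) == -1);
              assert(!freeze);
              assert(umh_read_trylock_mock(UMH_DISABLED, true, &freeze) == -1);
              assert(!freeze);
              return 0;
      }
      ```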
      Reported-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: stable@vger.kernel.org
      247bc037