1. 13 Feb, 2015 2 commits
    • kernel/sched/clock.c: add another clock for use with the soft lockup watchdog · 545a2bf7
      Cyril Bur committed
      When the hypervisor pauses a virtualised kernel, the kernel will observe a
      jump in timebase; this can cause spurious messages from the softlockup
      detector.
      
      Whilst these messages are harmless, they are accompanied by a stack
      trace which causes undue concern and, more problematically, the stack trace
      in the guest has nothing to do with the observed problem and can only be
      misleading.
      
      Furthermore, on POWER8 this is completely avoidable with the introduction
      of the Virtual Time Base (VTB) register.
      
      This patch (of 2):
      
      This permits the use of arch-specific clocks, so that virtualised kernels
      can use their own notion of 'running' time rather than the elapsed wall
      time, which includes host execution time.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrew Jones <drjones@redhat.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: chai wen <chaiw.fnst@cn.fujitsu.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Ben Zhang <benzh@chromium.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
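To illustrate why a 'running' clock avoids the spurious reports described in the commit above, here is a minimal user-space sketch. It is not the kernel code: every name in it (`struct vclock`, `wall_clock`, `running_clock`, `is_softlockup`, `guest_runs`, `guest_paused`) and the threshold used in the usage note are assumptions made for illustration. The model simply treats the time during which the hypervisor paused the guest as "stolen".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a guest's view of time, in seconds. */
struct vclock {
    uint64_t wall;    /* elapsed wall time, includes host execution time */
    uint64_t stolen;  /* time during which the hypervisor paused the guest */
};

/* Wall clock: jumps forward when the guest is paused. */
uint64_t wall_clock(const struct vclock *c)
{
    return c->wall;
}

/* 'Running' clock: only advances while the guest actually executes. */
uint64_t running_clock(const struct vclock *c)
{
    assert(c->stolen <= c->wall);
    return c->wall - c->stolen;
}

/* Watchdog check: has too much time passed since the last touch? */
bool is_softlockup(uint64_t now, uint64_t touch_ts, uint64_t thresh)
{
    return now - touch_ts > thresh;
}

/* Advance time by `secs` of real guest execution. */
void guest_runs(struct vclock *c, uint64_t secs)
{
    c->wall += secs;
}

/* Advance time by `secs` during which the host has paused the guest. */
void guest_paused(struct vclock *c, uint64_t secs)
{
    c->wall += secs;
    c->stolen += secs;
}
```

If the watchdog timestamp is taken, the guest is then paused for longer than the threshold, and the check runs shortly after resume, a check based on `wall_clock()` fires while one based on `running_clock()` does not.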
    • all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski committed
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: Richard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 14 Dec, 2014 2 commits
    • syscalls: implement execveat() system call · 51f39a1f
      David Drysdale committed
      This patchset adds execveat(2) for x86, and is derived from Meredydd
      Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528).
      
      The primary aim of adding an execveat syscall is to allow an
      implementation of fexecve(3) that does not rely on the /proc filesystem,
      at least for executables (rather than scripts).  The current glibc version
      of fexecve(3) is implemented via /proc, which causes problems in sandboxed
      or otherwise restricted environments.
      
      Given the desire for a /proc-free fexecve() implementation, HPA suggested
      (https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be
      an appropriate generalization.
      
      Also, having a new syscall means that it can take a flags argument without
      back-compatibility concerns.  The current implementation just defines the
      AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be
      added in future -- for example, flags for new namespaces (as suggested at
      https://lkml.org/lkml/2006/7/11/474).
      
      Related history:
       - https://lkml.org/lkml/2006/12/27/123 is an example of someone
         realizing that fexecve() is likely to fail in a chroot environment.
       - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered
         documenting the /proc requirement of fexecve(3) in its manpage, to
         "prevent other people from wasting their time".
       - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a
         problem where a process that did setuid() could not fexecve()
         because it no longer had access to /proc/self/fd; this has since
         been fixed.
      
      This patch (of 4):
      
      Add a new execveat(2) system call.  execveat() is to execve() as openat()
      is to open(): it takes a file descriptor that refers to a directory, and
      resolves the filename relative to that.
      
      In addition, if the filename is empty and AT_EMPTY_PATH is specified,
      execveat() executes the file to which the file descriptor refers.  This
      replicates the functionality of fexecve(), which is a system call in other
      UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/<fd>" (and
      so relies on /proc being mounted).
      
      The filename fed to the executed program as argv[0] (or the name of the
      script fed to a script interpreter) will be of the form "/dev/fd/<fd>"
      (for an empty filename) or "/dev/fd/<fd>/<filename>", effectively
      reflecting how the executable was found.  This does however mean that
      execution of a script in a /proc-less environment won't work; also, script
      execution via an O_CLOEXEC file descriptor fails (as the file will not be
      accessible after exec).
      
      Based on patches by Meredydd Luff.
      Signed-off-by: David Drysdale <drysdale@google.com>
      Cc: Meredydd Luff <meredydd@senatehouse.org>
      Cc: Shuah Khan <shuah.kh@samsung.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Rich Felker <dalias@aerifal.cx>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
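The two calling conventions described in the commit above (directory-relative lookup, and executing the descriptor itself with AT_EMPTY_PATH) can be sketched from user space as follows. This is a hedged example, not part of the patch: the helper names `execveat_raw` and `run_execveat` are made up here, the raw `syscall()` form is used on the assumption that the C library may not provide a wrapper, and the fallback value for `AT_EMPTY_PATH` only applies when the headers do not define it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>        /* open(), AT_EMPTY_PATH */
#include <sys/syscall.h>  /* SYS_execveat */
#include <sys/wait.h>     /* waitpid() */
#include <unistd.h>       /* syscall(), fork(), _exit() */

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000  /* Linux value, used only if headers lack it */
#endif

/*
 * Thin wrapper around the raw system call. Like execve(2), it returns
 * only on failure (-1 with errno set).
 */
int execveat_raw(int dirfd, const char *path,
                 char *const argv[], char *const envp[], int flags)
{
    return (int)syscall(SYS_execveat, dirfd, path, argv, envp, flags);
}

/*
 * Hypothetical helper: fork, run execveat() in the child, and return the
 * child's exit status (or -1 if it could not be started or was killed).
 */
int run_execveat(int dirfd, const char *path,
                 char *const argv[], char *const envp[], int flags)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execveat_raw(dirfd, path, argv, envp, flags);
        _exit(127);  /* only reached if the exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_execveat(fd, "", argv, envp, AT_EMPTY_PATH)` executes the binary that `fd` refers to without touching /proc, while `run_execveat(dirfd, "sh", argv, envp, 0)` resolves `sh` relative to the directory behind `dirfd`.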
    • memcg: turn memcg_kmem_skip_account into a bit field · 6f185c29
      Vladimir Davydov committed
      It isn't supposed to stack, so turn it into a bit-field to save 4 bytes on
      the task_struct.
      
      Also, remove the memcg_stop/resume_kmem_account helpers - it is clearer to
      set/clear the flag inline.  As for the lengthy comment on the helpers,
      which this patch removes too, we already have a compact yet accurate
      explanation in memcg_schedule_cache_create, so there is no need for yet
      another one.
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
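The space saving claimed in the commit above comes from letting the flag share a storage unit with neighbouring bit-fields. The sketch below shows the effect on a toy structure; `struct task_before`, `struct task_after` and `bytes_saved` are illustrative names, the two neighbouring flags are stand-ins, and the exact number of bytes saved depends on the compiler and ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the flag is a full int counter next to existing bit-fields. */
struct task_before {
    unsigned in_execve:1;          /* stand-in for existing task flags */
    unsigned in_iowait:1;
    int memcg_kmem_skip_account;   /* nesting counter, a whole int */
};

/* After: the flag shares the storage unit of the neighbouring bit-fields. */
struct task_after {
    unsigned in_execve:1;
    unsigned in_iowait:1;
    unsigned memcg_kmem_skip_account:1;
};

/* Bytes saved by the conversion on the compiler/ABI in use. */
size_t bytes_saved(void)
{
    return sizeof(struct task_before) - sizeof(struct task_after);
}
```

With a bit-field the flag is set and cleared inline (`t.memcg_kmem_skip_account = 1; ... = 0;`), which matches the commit's point that the flag is not meant to nest.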
  3. 04 Nov, 2014 1 commit
  4. 30 Oct, 2014 1 commit
  5. 28 Oct, 2014 4 commits
  6. 10 Oct, 2014 1 commit
    • mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set · 934f3072
      Junxiao Bi committed
      Commit 21caf2fc ("mm: teach mm by current context info to not do I/O
      during memory allocation") introduced the PF_MEMALLOC_NOIO flag to avoid
      doing I/O inside memory allocation.  __GFP_IO is cleared when this flag
      is set, but since __GFP_FS implies __GFP_IO, it should be cleared as
      well.  Otherwise the allocation may still run into I/O, as in the
      superblock shrinker, and the kernel will hit the deadlock case described
      in that commit.
      
      See Dave Chinner's comment about IO in the superblock shrinker:
      
      Filesystem shrinkers do indeed perform IO from the superblock shrinker and
      have for years.  Even clean inodes can require IO before they can be freed
      - e.g.  on an orphan list, need truncation of post-eof blocks, need to
      wait for ordered operations to complete before it can be freed, etc.
      
      IOWs, Ext4, btrfs and XFS all can issue and/or block on arbitrary amounts
      of IO in the superblock shrinker context.  XFS, in particular, has been
      doing transactions and IO from the VFS inode cache shrinker since it was
      first introduced....
      
      Fix this by clearing __GFP_FS in memalloc_noio_flags(); this function
      already masks every gfp_mask that will be passed into the fs for
      processes that set PF_MEMALLOC_NOIO in the direct reclaim path.
      
      v1 thread at: https://lkml.org/lkml/2014/9/3/32
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: joyce.xue <xuejiufei@huawei.com>
      Cc: Ming Lei <ming.lei@canonical.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
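The masking described in the commit above can be sketched in user space as follows. The function name and the flag names come from the commit message; the numeric flag values, the `gfp_t` typedef and the `task_flags` variable (a stand-in for the current task's flags) are assumptions made so the example compiles on its own.

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Flag values below are illustrative, not the kernel's definitions. */
#define __GFP_WAIT        0x10u
#define __GFP_IO          0x40u
#define __GFP_FS          0x80u
#define PF_MEMALLOC_NOIO  0x00080000u

/* Stand-in for the current task's flags in this user-space sketch. */
unsigned int task_flags;

/*
 * When the task has asked for no I/O during allocation, strip both
 * __GFP_IO and __GFP_FS, since filesystem reclaim may itself issue I/O.
 */
gfp_t memalloc_noio_flags(gfp_t flags)
{
    if (task_flags & PF_MEMALLOC_NOIO)
        flags &= ~(__GFP_IO | __GFP_FS);
    return flags;
}
```

Clearing only `__GFP_IO` would leave `__GFP_FS` set, so reclaim could still call into filesystem shrinkers that perform I/O.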
  7. 25 Sep, 2014 3 commits
  8. 21 Sep, 2014 1 commit
  9. 19 Sep, 2014 3 commits
  10. 08 Sep, 2014 4 commits
    • time, signal: Protect resource use statistics with seqlock · e78c3496
      Rik van Riel committed
      Both times() and clock_gettime(CLOCK_PROCESS_CPUTIME_ID) have scalability
      issues on large systems, due to both functions being serialized with a
      lock.
      
      The lock protects against reporting a wrong value, due to a thread in the
      task group exiting, its statistics reporting up to the signal struct, and
      that exited task's statistics being counted twice (or not at all).
      
      Protecting that with a lock results in times() and clock_gettime() being
      completely serialized on large systems.
      
      This can be fixed by using a seqlock around the events that gather and
      propagate statistics. As an additional benefit, the protection code can
      be moved into thread_group_cputime(), slightly simplifying the calling
      functions.
      
      In the case of posix_cpu_clock_get_task() things can be simplified a
      lot, because the calling function already ensures that the task sticks
      around, and the rest is now taken care of in thread_group_cputime().
      
      This way the statistics reporting code can run lockless.
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Daeseok Youn <daeseok.youn@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guillaume Morin <guillaume@morinfr.org>
      Cc: Ionut Alexa <ionut.m.alexa@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Michal Schmidt <mschmidt@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: umgwanakikbuti@gmail.com
      Cc: fweisbec@gmail.com
      Cc: srao@redhat.com
      Cc: lwoodman@redhat.com
      Cc: atheurer@redhat.com
      Link: http://lkml.kernel.org/r/20140816134010.26a9b572@annuminas.surriel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
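The seqlock idea from the commit above, where readers retry instead of taking a lock, can be sketched with C11 atomics. This is a simplified user-space model rather than the kernel's seqlock API: `struct stats`, `stats_add` and `stats_read` are hypothetical names, and writers are assumed to be serialized by some outer lock that is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the accumulated thread-group times. */
struct stats {
    atomic_uint seq;          /* even: stable, odd: update in progress */
    _Atomic uint64_t utime;
    _Atomic uint64_t stime;
};

/* Writer side, e.g. an exiting thread folding its times into the group. */
void stats_add(struct stats *s, uint64_t ut, uint64_t st)
{
    unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    /* Mark the update as in progress (sequence becomes odd). */
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&s->utime,
        atomic_load_explicit(&s->utime, memory_order_relaxed) + ut,
        memory_order_relaxed);
    atomic_store_explicit(&s->stime,
        atomic_load_explicit(&s->stime, memory_order_relaxed) + st,
        memory_order_relaxed);

    /* Publish the update (sequence becomes even again). */
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: lockless, retries if a writer was active meanwhile. */
void stats_read(struct stats *s, uint64_t *ut, uint64_t *st)
{
    unsigned begin, end;

    do {
        begin = atomic_load_explicit(&s->seq, memory_order_acquire);
        *ut = atomic_load_explicit(&s->utime, memory_order_relaxed);
        *st = atomic_load_explicit(&s->stime, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        end = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((begin & 1u) || begin != end);
}
```

Readers never block each other or the writer; they only repeat the read when the sequence number shows that an update overlapped with it, which is what removes the serialization on large systems.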
    • rcu: Remove local_irq_disable() in rcu_preempt_note_context_switch() · 1d082fd0
      Paul E. McKenney committed
      The rcu_preempt_note_context_switch() function is on a scheduling fast
      path, so it would be good to avoid disabling irqs.  The reason that irqs
      are disabled is to synchronize process-level and irq-handler access to
      the task_struct ->rcu_read_unlock_special bitmask.  This commit therefore
      makes ->rcu_read_unlock_special instead be a union of bools with a short
      allowing single-access checks in RCU's __rcu_read_unlock().  This results
      in the process-level and irq-handler accesses being simple loads and
      stores, so that irqs need no longer be disabled.  This commit therefore
      removes the irq disabling from rcu_preempt_note_context_switch().
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
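The "union of bools with a short" from the commit above can be sketched as below. The layout follows the description in the commit message; the field names and the helper `needs_special_handling` are assumptions for illustration, and the static assertion documents the ABI assumption that two bools occupy exactly one short.

```c
#include <assert.h>
#include <stdbool.h>

/* Each flag is its own byte, so it can be set with a plain store. */
union rcu_special {
    struct {
        bool blocked;   /* task was preempted inside a read-side section */
        bool need_qs;   /* a quiescent state is wanted from this task */
    } b;
    short s;            /* both flags viewed as a single value */
};

_Static_assert(sizeof(short) == 2 * sizeof(bool),
               "sketch assumes two bools fit exactly in a short");

/* Single-access check: is any special handling pending at all? */
bool needs_special_handling(const union rcu_special *rs)
{
    return rs->s != 0;
}
```

Because each flag is updated by a simple store to its own byte, process-level code and interrupt handlers do not need a read-modify-write on a shared bitmask, while the fast path can still test both flags with one load of `s`.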
    • rcu: Make TASKS_RCU handle nohz_full= CPUs · 176f8f7a
      Paul E. McKenney committed
      Currently TASKS_RCU would ignore a CPU running a task in nohz_full=
      usermode execution.  There would be neither a context switch nor a
      scheduling-clock interrupt to tell TASKS_RCU that the task in question
      had passed through a quiescent state.  The grace period would therefore
      extend indefinitely.  This commit therefore makes RCU's dyntick-idle
      subsystem record the task_struct structure of the task that is running
      in dyntick-idle mode on each CPU.  The TASKS_RCU grace period can
      then access this information and record a quiescent state on
      behalf of any CPU running in dyntick-idle usermode.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Add call_rcu_tasks() · 8315f422
      Paul E. McKenney committed
      This commit adds a new RCU-tasks flavor of RCU, which provides
      call_rcu_tasks().  This RCU flavor's quiescent states are voluntary
      context switch (not preemption!) and userspace execution (not the idle
      loop -- use some sort of schedule_on_each_cpu() if you need to handle the
      idle tasks).  Note that unlike other RCU flavors, these quiescent states
      occur in tasks, not necessarily CPUs.  Includes fixes from Steven Rostedt.
      
      This RCU flavor is assumed to have very infrequent latency-tolerant
      updaters.  This assumption permits significant simplifications, including
      a single global callback list protected by a single global lock, along
      with a single task-private linked list containing all tasks that have not
      yet passed through a quiescent state.  If experience shows this assumption
      to be incorrect, the required additional complexity will be added.
      Suggested-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  11. 25 Aug, 2014 1 commit
  12. 09 Aug, 2014 3 commits
    • shm: make exit_shm work proportional to task activity · ab602f79
      Jack Miller committed
      This is a small set of patches our team has had kicking around for a few
      versions internally that fixes tasks getting hung on shm_exit when there
      are many threads hammering it at once.
      
      Anton wrote a simple test to cause the issue:
      
        http://ozlabs.org/~anton/junkcode/bust_shm_exit.c
      
      Before applying this patchset, this test code will cause either hanging
      tracebacks or pthread out of memory errors.
      
      After this patchset, it will still produce output like:
      
        root@somehost:~# ./bust_shm_exit 1024 160
        ...
        INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 116, t=2111 jiffies, g=241, c=240, q=7113)
        INFO: Stall ended before state dump start
        ...
      
      But the task will continue to run along happily, so we consider this an
      improvement over hanging, even if it's a bit noisy.
      
      This patch (of 3):
      
      exit_shm obtains the ipc_ns shm rwsem for write and holds it while it
      walks every shared memory segment in the namespace.  Thus the amount of
      work is related to the number of shm segments in the namespace not the
      number of segments that might need to be cleaned.
      
      In addition, this occurs after the task has been notified the thread has
      exited, so the number of tasks waiting for the ns shm rwsem can grow
      without bound until memory is exhausted.
      
      Add a list to the task struct of all shmids allocated by this task.  Init
      the list head in copy_process.  Use the ns->rwsem for locking.  Add
      segments after id is added, remove before removing from id.
      
      On unshare of NEW_IPCNS orphan any ids as if the task had exited, similar
      to handling of semaphore undo.
      
      I chose a define for the init sequence since it's a simple list init,
      otherwise it would require a function call to avoid include loops between
      the semaphore code and the task struct.  Converting the list_del to
      list_del_init for the unshare cases would remove the exit followed by
      init, but I left it to blow up if not inited.
      Signed-off-by: Milton Miller <miltonm@bga.com>
      Signed-off-by: Jack Miller <millerjo@us.ibm.com>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Anton Blanchard <anton@samba.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/fork.c: make mm_init_owner static · 33144e84
      Vladimir Davydov committed
      It's only used in fork.c:mm_init().
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: use page lists for uncharge batching · 747db954
      Johannes Weiner committed
      Pages are now uncharged at release time, and all sources of batched
      uncharges operate on lists of pages.  Directly use those lists, and
      get rid of the per-task batching state.
      
      This also batches statistics accounting, in addition to the res
      counter charges, to reduce IRQ-disabling and re-enabling.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 07 Aug, 2014 1 commit
  14. 06 Aug, 2014 1 commit
  15. 24 Jul, 2014 3 commits
  16. 19 Jul, 2014 1 commit
  17. 16 Jul, 2014 2 commits
    • sched: Transform resched_task() into resched_curr() · 8875125e
      Kirill Tkhai committed
      We always use resched_task() with rq->curr argument.
      It's not possible to reschedule any task but rq's current.
      
      The patch introduces resched_curr(struct rq *) to
      replace all of the repeating patterns. The main aim
      is cleanup, but there is a little size profit too:
      
        (before)
      	$ size kernel/sched/built-in.o
      	   text	   data	    bss	    dec	    hex	filename
      	155274	  16445	   7042	 178761	  2ba49	kernel/sched/built-in.o
      
      	$ size vmlinux
      	   text	   data	    bss	    dec	    hex	filename
      	7411490	1178376	 991232	9581098	 92322a	vmlinux
      
        (after)
      	$ size kernel/sched/built-in.o
      	   text	   data	    bss	    dec	    hex	filename
      	155130	  16445	   7042	 178617	  2b9b9	kernel/sched/built-in.o
      
      	$ size vmlinux
      	   text	   data	    bss	    dec	    hex	filename
      	7411362	1178376	 991232	9580970	 9231aa	vmlinux
      
      	I was choosing between resched_curr() and resched_rq(),
      	and the first name looks better for me.
      
      A little lie in Documentation/trace/ftrace.txt. I have not
      actually collected the tracing again. With a hope the patch
      won't make execution times much worse :)
      Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20140628200219.1778.18735.stgit@localhost
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/deadline: Kill task_struct->pi_top_task · 466af29b
      Oleg Nesterov committed
      Remove task_struct->pi_top_task. The only user, rt_mutex_setprio(),
      can use a local.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Juri Lelli <juri.lelli@gmail.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Daeseok Youn <daeseok.youn@gmail.com>
      Cc: Dario Faggioli <raistlin@linux.it>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Dempsky <mdempsky@chromium.org>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/20140606165206.GB29465@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  18. 10 Jul, 2014 1 commit
  19. 02 Jul, 2014 1 commit
  20. 07 Jun, 2014 1 commit
    • signals: mv {dis,}allow_signal() from sched.h/exit.c to signal.[ch] · 0341729b
      Oleg Nesterov committed
      Move the declaration/definition of allow_signal/disallow_signal to
      signal.h/signal.c.  The new place is more logical and allows the use of
      the static helpers in signal.c (see the next changes).
      
      While at it, make them return void and remove the valid_signal() check.
      Nobody checks the returned value, and in-kernel users must not pass the
      wrong signal number.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. 06 Jun, 2014 1 commit
    • perf: Differentiate exec() and non-exec() comm events · 82b89778
      Adrian Hunter committed
      perf tools like 'perf report' can aggregate samples by comm strings,
      which generally works.  However, there are other potential use-cases.
      For example, to pair up 'calls' with 'returns' accurately (from branch
      events like Intel BTS) it is necessary to identify whether the process
      has exec'd.  Although a comm event is generated when an 'exec' happens
      it is also generated whenever the comm string is changed on a whim
      (e.g. by prctl PR_SET_NAME).  This patch adds a flag to the comm event
      to differentiate one case from the other.
      
      In order to determine whether the kernel supports the new flag, a
      selection bit named 'exec' is added to struct perf_event_attr.  The
      bit does nothing but will cause perf_event_open() to fail if the bit
      is set on kernels that do not have it defined.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/537D9EBE.7030806@intel.com
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  22. 05 Jun, 2014 2 commits
    • sched: Rename capacity related flags · 5d4dfddd
      Nicolas Pitre committed
      It is better not to think about compute capacity as being equivalent
      to "CPU power".  The upcoming "power aware" scheduler work may create
      confusion with the notion of energy consumption if "power" is used too
      liberally.
      
      Let's rename the following feature flags since they do relate to capacity:
      
      	SD_SHARE_CPUPOWER  -> SD_SHARE_CPUCAPACITY
      	ARCH_POWER         -> ARCH_CAPACITY
      	NONTASK_POWER      -> NONTASK_CAPACITY
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Andy Fleming <afleming@freescale.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: devicetree@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/n/tip-e93lpnxb87owfievqatey6b5@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched: Final power vs. capacity cleanups · ca8ce3d0
      Nicolas Pitre committed
      It is better not to think about compute capacity as being equivalent
      to "CPU power".  The upcoming "power aware" scheduler work may create
      confusion with the notion of energy consumption if "power" is used too
      liberally.
      
      This contains the architecture visible changes.  Incidentally, only ARM
      takes advantage of the available pow^H^H^Hcapacity scaling hooks and
      therefore those changes outside kernel/sched/ are confined to one ARM
      specific file.  The default arch_scale_smt_power() hook is not overridden
      by anyone.
      
      Replacements are as follows:
      
      	arch_scale_freq_power  --> arch_scale_freq_capacity
      	arch_scale_smt_power   --> arch_scale_smt_capacity
      	SCHED_POWER_SCALE      --> SCHED_CAPACITY_SCALE
      	SCHED_POWER_SHIFT      --> SCHED_CAPACITY_SHIFT
      
      The local usage of "power" in arch/arm/kernel/topology.c is also changed
      to "capacity" as appropriate.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linaro-kernel@lists.linaro.org
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Brown <broonie@linaro.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: devicetree@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/n/tip-48zba9qbznvglwelgq2cfygh@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>