1. 29 Nov 2012, 2 commits
    • cputime: Consolidate cputime adjustment code · d37f761d
      Authored by Frederic Weisbecker
      task_cputime_adjusted() and thread_group_cputime_adjusted()
      essentially share the same code. They just don't use the same
      source:
      
      * The first function uses the cputime in the task struct and the
      previous adjusted snapshot that ensures monotonicity.
      
      * The second adds the cputime of all tasks in the group and the
      previous adjusted snapshot of the whole group from the signal
      structure.
      
      Consolidate the common adjustment code so that each function only
      needs to fetch its values from the appropriate source. (A
      simplified sketch of the shared helper follows this entry.)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      d37f761d
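      A minimal user-space sketch of the consolidated helper this commit
      describes. The struct layout, unit handling, and scaling here are
      simplified assumptions (the kernel converts sum_exec_runtime from
      nanoseconds and scales with overflow care); only the shape of the
      logic is intended to match:

          #include <stdint.h>

          typedef uint64_t cputime_t;

          struct task_cputime {               /* raw, tick-based sample */
                  cputime_t utime, stime;
                  uint64_t  sum_exec_runtime; /* precise CFS runtime    */
          };

          struct prev_cputime {               /* last adjusted snapshot */
                  cputime_t utime, stime;
          };

          static cputime_t max_c(cputime_t a, cputime_t b)
          {
                  return a > b ? a : b;
          }

          /* One helper does the scaling and the monotonicity clamp; the
           * two callers differ only in where curr and prev come from
           * (task struct vs. signal struct). */
          static void cputime_adjust(const struct task_cputime *curr,
                                     struct prev_cputime *prev,
                                     cputime_t *ut, cputime_t *st)
          {
                  cputime_t rtime = curr->sum_exec_runtime; /* assume 1:1 units */
                  cputime_t total = curr->utime + curr->stime;
                  cputime_t utime = total ? rtime * curr->utime / total : rtime;

                  /* Never report less than the previous snapshot. */
                  prev->utime = max_c(prev->utime, utime);
                  prev->stime = max_c(prev->stime, rtime - prev->utime);

                  *ut = prev->utime;
                  *st = prev->stime;
          }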
    • cputime: Rename thread_group_times to thread_group_cputime_adjusted · e80d0a1a
      Authored by Frederic Weisbecker
      We have thread_group_cputime() and thread_group_times(). The naming
      doesn't provide enough information about the difference between
      these two APIs.
      
      To reduce the confusion, rename thread_group_times() to
      thread_group_cputime_adjusted(). The new name better suggests that
      it is a version of thread_group_cputime() that stabilizes the raw
      cputime values, i.e. it scales them against precise CFS runtime
      stats and bounds them from below to keep them monotonic.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      e80d0a1a
  2. 24 Oct 2012, 5 commits
  3. 13 Oct 2012, 1 commit
  4. 09 Oct 2012, 1 commit
  5. 06 Oct 2012, 1 commit
  6. 03 Oct 2012, 1 commit
  7. 01 Oct 2012, 1 commit
    • preparation for generic kernel_thread() · 2aa3a7f8
      Authored by Al Viro
      Let architectures select GENERIC_KERNEL_THREAD and have their
      copy_thread() treat NULL regs as "it came from kernel_thread();
      the sp argument contains the function the new thread will be
      calling, and stack_size the argument for that function".
      Switching the architectures begins shortly... (A sketch of the
      convention follows this entry.)
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      2aa3a7f8
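      A sketch of the NULL-regs convention described above; the helpers
      set_thread_start(), copy_user_regs(), and kernel_thread_helper are
      placeholders for per-architecture code, not real kernel APIs:

          int copy_thread(unsigned long clone_flags, unsigned long sp,
                          unsigned long stack_size, struct task_struct *p,
                          struct pt_regs *regs)
          {
                  if (!regs) {
                          /* Came from kernel_thread(): sp holds the function
                           * the new thread will call, stack_size holds its
                           * argument. Start in a kernel-mode trampoline. */
                          set_thread_start(p, kernel_thread_helper,
                                           sp, stack_size);
                          return 0;
                  }

                  /* Normal fork()/clone() path: duplicate user registers. */
                  copy_user_regs(p, regs, sp);
                  return 0;
          }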
  8. 26 Sep 2012, 1 commit
    • rcu: Switch task's syscall hooks on context switch · 04e7e951
      Authored by Frederic Weisbecker
      Clear the syscall hook of a task when it is scheduled out so that
      if the task migrates, it does not run the syscall slow path on a
      CPU that might not need it.

      Also set the syscall hook on the next task if needed (a sketch
      follows this entry).
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Kevin Hilman <khilman@ti.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      04e7e951
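      A sketch of the switch-time hand-off; TIF_SYSCALL_HOOK is a
      placeholder flag name (not necessarily what the patch uses), while
      the test/clear/set helpers are the usual thread-flag accessors:

          static inline void rcu_user_hooks_switch(struct task_struct *prev,
                                                   struct task_struct *next)
          {
                  /* Hand the per-task syscall hook from prev to next so a
                   * migrating task does not leave the syscall slow path
                   * armed on a CPU that no longer needs it. */
                  if (test_tsk_thread_flag(prev, TIF_SYSCALL_HOOK)) {
                          clear_tsk_thread_flag(prev, TIF_SYSCALL_HOOK);
                          set_tsk_thread_flag(next, TIF_SYSCALL_HOOK);
                  }
          }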
  9. 25 Sep 2012, 1 commit
    • net: use a per task frag allocator · 5640f768
      Authored by Eric Dumazet
      We currently use a per socket order-0 page cache for tcp_sendmsg()
      operations.
      
      This page is used to build fragments for skbs.
      
      This is done to increase the probability of coalescing small
      write()s into single segments in skbs still in the write queue
      (not yet sent).

      But it wastes a lot of memory for applications handling many
      mostly idle sockets, since each socket holds one page in
      sk->sk_sndmsg_page.

      It is also quite inefficient for building 64KB TSO packets,
      because we need about 16 pages per skb on arches where
      PAGE_SIZE = 4096, so we hit the page allocator more often than
      desired.
      
      This patch adds a per-task frag allocator and uses bigger pages,
      if available, with an automatic fallback under memory pressure
      (up to 32768 bytes per frag, that is order-3 pages on x86). A
      simplified sketch of the refill logic follows this entry.
      
      This increases TCP stream performance by 20% on the loopback
      device, but it also benefits other network devices, since 8x fewer
      frags are mapped on transmit and unmapped on tx completion.
      Alexander Duyck mentioned a probable performance win on systems
      with IOMMU enabled.
      
      It is possible that some SG-enabled hardware can't cope with
      bigger fragments, but its ndo_start_xmit() should already handle
      this by splitting a fragment into sub-fragments, since some arches
      have PAGE_SIZE=65536.
      
      Successfully tested on various ethernet devices.
      (ixgbe, igb, bnx2x, tg3, mellanox mlx4)
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5640f768
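      A simplified sketch of the refill logic (field names approximate;
      the real helper also checks whether the pending request still fits
      in the current page):

          struct page_frag {
                  struct page  *page;
                  unsigned int offset;   /* bytes already handed out        */
                  unsigned int size;     /* total bytes in the current page */
          };

          static bool task_frag_refill(struct page_frag *pfrag, gfp_t gfp)
          {
                  if (pfrag->page && pfrag->offset < pfrag->size)
                          return true;   /* room left in the current page */

                  /* Prefer a 32KB compound page (order-3 with 4K pages)... */
                  pfrag->page = alloc_pages(gfp | __GFP_COMP | __GFP_NOWARN, 3);
                  pfrag->size = pfrag->page ? PAGE_SIZE << 3 : 0;

                  /* ...and fall back to order-0 under memory pressure. */
                  if (!pfrag->page) {
                          pfrag->page = alloc_page(gfp);
                          pfrag->size = pfrag->page ? PAGE_SIZE : 0;
                  }
                  pfrag->offset = 0;
                  return pfrag->page != NULL;
          }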
  10. 18 Sep 2012, 1 commit
    • userns: Convert the audit loginuid to be a kuid · e1760bd5
      Authored by Eric W. Biederman
      Always store audit loginuids in type kuid_t.
      
      Print loginuids by converting them into uids in the appropriate user
      namespace, and then printing the resulting uid.
      
      Modify audit_get_loginuid to return a kuid_t.
      
      Modify audit_set_loginuid to take a kuid_t.
      
      Modify /proc/<pid>/loginuid on read to convert the loginuid into
      the user namespace of the opener of the file.

      Modify /proc/<pid>/loginuid on write to convert the loginuid from
      the user namespace of the opener of the file. (A sketch of the
      read-side conversion follows this entry.)
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      e1760bd5
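      A sketch of the read-side conversion; the opener_ns parameter
      stands in for however the real code reaches the opener's user
      namespace, and from_kuid_munged() maps unmappable ids to the
      overflow uid:

          static ssize_t loginuid_show_sketch(struct user_namespace *opener_ns,
                                              struct task_struct *task,
                                              char *buf, size_t len)
          {
                  /* Convert the kernel-internal kuid_t into a uid_t that
                   * is meaningful in the opener's user namespace. */
                  uid_t uid = from_kuid_munged(opener_ns,
                                               audit_get_loginuid(task));

                  return scnprintf(buf, len, "%u", uid);
          }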
  11. 17 Sep 2012, 1 commit
  12. 15 Sep 2012, 1 commit
    • uprobes: Introduce MMF_RECALC_UPROBES · 9f68f672
      Authored by Oleg Nesterov
      Add the new MMF_RECALC_UPROBES flag. It means that MMF_HAS_UPROBES
      can be a false positive after remove_breakpoint() or
      uprobe_munmap(). It is also set by uprobe_dup_mmap(); this is not
      optimal but simple. We could add a new hook, uprobe_dup_vma(), to
      set MMF_HAS_UPROBES only if the new mm actually has uprobes, but I
      don't think this makes sense.
      
      The next patch will use this flag to clear MMF_HAS_UPROBES; a
      sketch of that recalculation follows this entry.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      9f68f672
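      A sketch of the lazy recalculation the next patch can perform;
      vma_has_uprobes() stands in for the actual "does this vma contain
      probes" check:

          static void uprobe_mmf_recalc_sketch(struct mm_struct *mm)
          {
                  struct vm_area_struct *vma;

                  for (vma = mm->mmap; vma; vma = vma->vm_next) {
                          if (vma_has_uprobes(vma))
                                  return;  /* MMF_HAS_UPROBES is accurate */
                  }

                  /* No probed vma found: the flag was a false positive. */
                  clear_bit(MMF_HAS_UPROBES, &mm->flags);
                  clear_bit(MMF_RECALC_UPROBES, &mm->flags);
          }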
  13. 13 Sep 2012, 2 commits
  14. 29 Aug 2012, 1 commit
    • uprobes: Introduce MMF_HAS_UPROBES · f8ac4ec9
      Authored by Oleg Nesterov
      Add the new MMF_HAS_UPROBES flag. It is set by install_breakpoint()
      and copied by dup_mmap(); uprobe_pre_sstep_notifier() checks it to
      avoid the slow path if the task was never probed (see the sketch
      after this entry). Perhaps it makes sense to check it in
      valid_vma(is_register => false) as well.
      
      This needs the new dup_mmap()->uprobe_dup_mmap() hook. We can't
      use uprobe_reset_state() or put MMF_HAS_UPROBES into
      MMF_INIT_MASK; we need oldmm->mmap_sem to avoid the race with
      uprobe_register() or mmap() from another thread.
      
      Currently we never clear this bit; it can be a false positive
      after uprobe_unregister() or uprobe_munmap(), or if dup_mmap()
      hits the probed VM_DONTCOPY vma. But this is fine correctness-wise
      and has no effect unless the task hits a non-uprobe breakpoint.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      f8ac4ec9
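      A simplified sketch of the fast-path check in
      uprobe_pre_sstep_notifier(); setting TIF_UPROBE is how the slow
      path gets requested for the return to user space:

          int uprobe_pre_sstep_notifier_sketch(struct pt_regs *regs)
          {
                  /* Never-probed tasks (and kernel threads) skip all
                   * uprobe work: this is the common fast path. */
                  if (!current->mm ||
                      !test_bit(MMF_HAS_UPROBES, &current->mm->flags))
                          return 0;

                  set_thread_flag(TIF_UPROBE); /* take the slow path */
                  return 1;
          }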
  15. 14 Aug 2012, 1 commit
  16. 09 Aug 2012, 1 commit
  17. 01 Aug 2012, 2 commits
    • mm: allow PF_MEMALLOC from softirq context · 907aed48
      Authored by Mel Gorman
      This is needed to allow network softirq packet processing to make use of
      PF_MEMALLOC.
      
      Currently softirq context cannot use PF_MEMALLOC because it is not
      associated with a task and therefore has no task flags to fiddle
      with; thus the gfp-to-alloc-flags mapping ignores the task flags
      when in interrupt (hard or soft) context.
      
      Allowing softirqs to make use of PF_MEMALLOC therefore requires some
      trickery.  This patch borrows the task flags from whatever process happens
      to be preempted by the softirq.  It then modifies the gfp to alloc flags
      mapping to not exclude task flags in softirq context, and modify the
      softirq code to save, clear and restore the PF_MEMALLOC flag.
      
      The save and clear ensure the preempted task's PF_MEMALLOC flag
      doesn't leak into the softirq.  The restore ensures a softirq's
      PF_MEMALLOC flag cannot leak back into the preempted process (a
      sketch of this dance follows this entry).  This should be safe for
      the following reasons:
      
      Softirqs can run on multiple CPUs, but the same task should not be
      	executing the same softirq code, and a softirq handler cannot
      	be preempted by any other softirq handler, so the flags
      	should not leak to an unrelated softirq.
      
      Softirqs re-enable hardware interrupts in __do_softirq(), so they
      	can be preempted by hardware interrupts and PF_MEMALLOC is
      	inherited by the hard IRQ. However, this is similar to a
      	process in reclaim being preempted by a hardirq. While
      	PF_MEMALLOC is set, gfp_to_alloc_flags() distinguishes
      	between hard and soft irqs and avoids giving a hardirq the
      	ALLOC_NO_WATERMARKS flag.
      
      If the softirq is deferred to ksoftirqd then its flags may be used
              instead of a normal task's, but as the softirq cannot be
              preempted, the PF_MEMALLOC flag does not leak to other
              code by accident.
      
      [davem@davemloft.net: Document why PF_MEMALLOC is safe]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: David Miller <davem@davemloft.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      907aed48
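      A sketch of the save/clear/restore dance in __do_softirq();
      handle_pending_softirqs() is a placeholder for the actual softirq
      dispatch loop:

          asmlinkage void __do_softirq_sketch(void)
          {
                  /* Borrow the preempted task's flags, but remember and
                   * clear its PF_MEMALLOC so it does not leak into the
                   * softirq. */
                  unsigned long old = current->flags & PF_MEMALLOC;

                  current->flags &= ~PF_MEMALLOC;

                  handle_pending_softirqs(); /* may set PF_MEMALLOC itself */

                  /* Restore the task's original bit, dropping any
                   * PF_MEMALLOC the softirq set, so it cannot leak back
                   * into the preempted task. */
                  current->flags = (current->flags & ~PF_MEMALLOC) | old;
          }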
    • memcg: rename config variables · c255a458
      Authored by Andrew Morton
      Sanity:
      
      CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG
      CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP
      CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED
      CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM
      
      [mhocko@suse.cz: fix missed bits]
      Cc: Glauber Costa <glommer@parallels.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c255a458
  18. 31 Jul 2012, 2 commits
    • NMI watchdog: fix for lockup detector breakage on resume · 45226e94
      Authored by Sameer Nanda
      On the suspend/resume path the boot CPU does not go through an
      offline->online transition.  This breaks the NMI detector
      post-resume since it depends on PMU state that is lost when the
      system gets suspended.
      
      Fix this by forcing a CPU offline->online transition for the
      lockup detector on the boot CPU during resume (a sketch follows
      this entry).
      
      To provide more context: we enable the NMI watchdog on Chrome OS.
      We have seen several reports of systems freezing up completely,
      which indicated that the NMI watchdog was not firing for some
      reason.
      
      Debugging further, we found a simple way of reproducing the system
      freezes: issuing the command 'taskset 1 sh -c "echo nmilockup >
      /proc/breakme"' after the system has been suspended/resumed one or
      more times.

      With this patch in place, the system freezes result in panics, as
      expected.
      
      These panics provide a nice stack trace for us to debug the actual issue
      causing the freeze.
      
      [akpm@linux-foundation.org: fiddle with code comment]
      [akpm@linux-foundation.org: make lockup_detector_bootcpu_resume() conditional on CONFIG_SUSPEND]
      [akpm@linux-foundation.org: fix section errors]
      Signed-off-by: Sameer Nanda <snanda@chromium.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Mandeep Singh Baines <msb@chromium.org>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      45226e94
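      A sketch mirroring the patch's approach (internal names
      approximate): replay the CPU hotplug notifier as if the boot CPU
      had gone offline and back online, which tears down and re-creates
      the watchdog's perf NMI event:

          #ifdef CONFIG_SUSPEND
          void lockup_detector_bootcpu_resume(void)
          {
                  void *cpu = (void *)(long)smp_processor_id();

                  cpu_callback(&cpu_nfb, CPU_DEAD_FROZEN, cpu);   /* fake offline */
                  cpu_callback(&cpu_nfb, CPU_ONLINE_FROZEN, cpu); /* fake online  */
          }
          #endif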
    • coredump: warn about unsafe suid_dumpable / core_pattern combo · 54b50199
      Authored by Kees Cook
      When suid_dumpable=2, detect unsafe core_pattern settings and warn
      when they are seen (a sketch of the check follows this entry).
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Suggested-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      54b50199
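      A sketch of the check (warning text approximate): with
      suid_dumpable=2, a core_pattern that is neither an absolute path
      nor a pipe would drop root-owned cores into directories the
      crashing process chose:

          static void check_unsafe_core_pattern(const char *core_pattern)
          {
                  /* suid_dumpable is the existing sysctl value. */
                  if (suid_dumpable == 2 &&
                      core_pattern[0] != '/' && core_pattern[0] != '|')
                          pr_warn("Unsafe core_pattern used with suid_dumpable=2: "
                                  "pipe handler or fully qualified core dump path required\n");
          }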
  19. 24 Jul 2012, 2 commits
  20. 23 Jul 2012, 3 commits
  21. 06 Jul 2012, 1 commit
  22. 03 Jul 2012, 1 commit
  23. 08 Jun 2012, 1 commit
  24. 06 Jun 2012, 2 commits
  25. 02 Jun 2012, 3 commits
  26. 30 May 2012, 1 commit