1. 20 2月, 2015 8 次提交
    • C
      debug: prevent entering debug mode on panic/exception. · 5516fd7b
      Colin Cross 提交于
      On non-developer devices, kgdb prevents the device from rebooting
      after a panic.
      
      Incase of panics and exceptions, to allow the device to reboot, prevent
      entering debug mode to avoid getting stuck waiting for the user to
      interact with debugger.
      
      To avoid entering the debugger on panic/exception without any extra
      configuration, panic_timeout is being used which can be set via
      /proc/sys/kernel/panic at run time and CONFIG_PANIC_TIMEOUT sets the
      default value.
      
      Setting panic_timeout indicates that the user requested machine to
      perform unattended reboot after panic. We dont want to get stuck waiting
      for the user input incase of panic.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: linux-kernel@vger.kernel.org
      Cc: Android Kernel Team <kernel-team@android.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Signed-off-by: NColin Cross <ccross@android.com>
      [Kiran: Added context to commit message.
      panic_timeout is used instead of break_on_panic and
      break_on_exception to honor CONFIG_PANIC_TIMEOUT
      Modified the commit as per community feedback]
      Signed-off-by: NKiran Raparthy <kiran.kumar@linaro.org>
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      5516fd7b
    • D
      kdb: Const qualifier for kdb_getstr's prompt argument · 32d375f6
      Daniel Thompson 提交于
      All current callers of kdb_getstr() can pass constant pointers via the
      prompt argument. This patch adds a const qualification to make explicit
      the fact that this is safe.
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      32d375f6
    • D
      kdb: Provide forward search at more prompt · fb6daa75
      Daniel Thompson 提交于
      Currently kdb allows the output of comamnds to be filtered using the
      | grep feature. This is useful but does not permit the output emitted
      shortly after a string match to be examined without wading through the
      entire unfiltered output of the command. Such a feature is particularly
      useful to navigate function traces because these traces often have a
      useful trigger string *before* the point of interest.
      
      This patch reuses the existing filtering logic to introduce a simple
      forward search to kdb that can be triggered from the more prompt.
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      fb6daa75
    • D
      kdb: Fix a prompt management bug when using | grep · ab08e464
      Daniel Thompson 提交于
      Currently when the "| grep" feature is used to filter the output of a
      command then the prompt is not displayed for the subsequent command.
      Likewise any characters typed by the user are also not echoed to the
      display. This rather disconcerting problem eventually corrects itself
      when the user presses Enter and the kdb_grepping_flag is cleared as
      kdb_parse() tries to make sense of whatever they typed.
      
      This patch resolves the problem by moving the clearing of this flag
      from the middle of command processing to the beginning.
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      ab08e464
    • D
      kdb: Remove stack dump when entering kgdb due to NMI · 54543881
      Daniel Thompson 提交于
      Issuing a stack dump feels ergonomically wrong when entering due to NMI.
      
      Entering due to NMI is normally a reaction to a user request, either the
      NMI button on a server or a "magic knock" on a UART. Therefore the
      backtrace behaviour on entry due to NMI should be like SysRq-g (no stack
      dump) rather than like oops.
      
      Note also that the stack dump does not offer any information that
      cannot be trivial retrieved using the 'bt' command.
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      54543881
    • D
      kdb: Avoid printing KERN_ levels to consoles · f7d4ca8b
      Daniel Thompson 提交于
      Currently when kdb traps printk messages then the raw log level prefix
      (consisting of '\001' followed by a numeral) does not get stripped off
      before the message is issued to the various I/O handlers supported by
      kdb. This causes annoying visual noise as well as causing problems
      grepping for ^. It is also a change of behaviour compared to normal usage
      of printk() usage. For example <SysRq>-h ends up with different output to
      that of kdb's "sr h".
      
      This patch addresses the problem by stripping log levels from messages
      before they are issued to the I/O handlers. printk() which can also
      act as an i/o handler in some cases is special cased; if the caller
      provided a log level then the prefix will be preserved when sent to
      printk().
      
      The addition of non-printable characters to the output of kdb commands is a
      regression, albeit and extremely elderly one, introduced by commit
      04d2c8c8 ("printk: convert the format for KERN_<LEVEL> to a 2 byte
      pattern"). Note also that this patch does *not* restore the original
      behaviour from v3.5. Instead it makes printk() from within a kdb command
      display the message without any prefix (i.e. like printk() normally does).
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      f7d4ca8b
    • J
      kdb: Fix off by one error in kdb_cpu() · df0036d1
      Jason Wessel 提交于
      There was a follow on replacement patch against the prior
      "kgdb: Timeout if secondary CPUs ignore the roundup".
      
      See: https://lkml.org/lkml/2015/1/7/442
      
      This patch is the delta vs the patch that was committed upstream:
        * Fix an off-by-one error in kdb_cpu().
        * Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we
          really want a static limit.
        * Removed the "KGDB: " prefix from the pr_crit() in debug_core.c
          (kgdb-next contains a patch which introduced pr_fmt() to this file
          to the tag will now be applied automatically).
      
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      df0036d1
    • J
      kdb: fix incorrect counts in KDB summary command output · 14675592
      Jay Lan 提交于
      The output of KDB 'summary' command should report MemTotal, MemFree
      and Buffers output in kB. Current codes report in unit of pages.
      
      A define of K(x) as
      is defined in the code, but not used.
      
      This patch would apply the define to convert the values to kB.
      Please include me on Cc on replies. I do not subscribe to linux-kernel.
      Signed-off-by: NJay Lan <jlan@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
      14675592
  2. 18 2月, 2015 7 次提交
  3. 16 2月, 2015 2 次提交
    • R
      PM / sleep: Make it possible to quiesce timers during suspend-to-idle · 124cf911
      Rafael J. Wysocki 提交于
      The efficiency of suspend-to-idle depends on being able to keep CPUs
      in the deepest available idle states for as much time as possible.
      Ideally, they should only be brought out of idle by system wakeup
      interrupts.
      
      However, timer interrupts occurring periodically prevent that from
      happening and it is not practical to chase all of the "misbehaving"
      timers in a whack-a-mole fashion.  A much more effective approach is
      to suspend the local ticks for all CPUs and the entire timekeeping
      along the lines of what is done during full suspend, which also
      helps to keep suspend-to-idle and full suspend reasonably similar.
      
      The idea is to suspend the local tick on each CPU executing
      cpuidle_enter_freeze() and to make the last of them suspend the
      entire timekeeping.  That should prevent timer interrupts from
      triggering until an IO interrupt wakes up one of the CPUs.  It
      needs to be done with interrupts disabled on all of the CPUs,
      though, because otherwise the suspended clocksource might be
      accessed by an interrupt handler which might lead to fatal
      consequences.
      
      Unfortunately, the existing ->enter callbacks provided by cpuidle
      drivers generally cannot be used for implementing that, because some
      of them re-enable interrupts temporarily and some idle entry methods
      cause interrupts to be re-enabled automatically on exit.  Also some
      of these callbacks manipulate local clock event devices of the CPUs
      which really shouldn't be done after suspending their ticks.
      
      To overcome that difficulty, introduce a new cpuidle state callback,
      ->enter_freeze, that will be guaranteed (1) to keep interrupts
      disabled all the time (and return with interrupts disabled) and (2)
      not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
      look for the deepest available idle state with ->enter_freeze present
      and to make the CPU execute that callback with suspended tick (and the
      last of the online CPUs to execute it with suspended timekeeping).
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      124cf911
    • R
      timekeeping: Make it safe to use the fast timekeeper while suspended · 060407ae
      Rafael J. Wysocki 提交于
      Theoretically, ktime_get_mono_fast_ns() may be executed after
      timekeeping has been suspended (or before it is resumed) which
      in turn may lead to undefined behavior, for example, when the
      clocksource read from timekeeping_get_ns() called by it is
      not accessible at that time.
      
      Prevent that from happening by setting up a dummy readout base for
      the fast timekeeper during timekeeping_suspend() such that it will
      always return the same number of cycles.
      
      After the last timekeeping_update() in timekeeping_suspend() the
      clocksource is read and the result is stored as cycles_at_suspend.
      The readout base from the current timekeeper is copied onto the
      dummy and the ->read pointer of the dummy is set to a routine
      unconditionally returning cycles_at_suspend.  Next, the dummy is
      passed to update_fast_timekeeper().
      
      Then, ktime_get_mono_fast_ns() will work until the subsequent
      timekeeping_resume() and the proper readout base for the fast
      timekeeper will be restored by the timekeeping_update() called
      right after clearing timekeeping_suspended.
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: NJohn Stultz <john.stultz@linaro.org>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      060407ae
  4. 14 2月, 2015 15 次提交
  5. 13 2月, 2015 6 次提交
    • J
      printk: correct timeout comment, neaten MODULE_PARM_DESC · 205bd3d2
      Joe Perches 提交于
      Neaten the MODULE_PARAM_DESC message.
      Use 30 seconds in the comment for the zap console locks timeout.
      Signed-off-by: NJoe Perches <joe@perches.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      205bd3d2
    • C
      kernel/sched/clock.c: add another clock for use with the soft lockup watchdog · 545a2bf7
      Cyril Bur 提交于
      When the hypervisor pauses a virtualised kernel the kernel will observe a
      jump in timebase, this can cause spurious messages from the softlockup
      detector.
      
      Whilst these messages are harmless, they are accompanied with a stack
      trace which causes undue concern and more problematically the stack trace
      in the guest has nothing to do with the observed problem and can only be
      misleading.
      
      Futhermore, on POWER8 this is completely avoidable with the introduction
      of the Virtual Time Base (VTB) register.
      
      This patch (of 2):
      
      This permits the use of arch specific clocks for which virtualised kernels
      can use their notion of 'running' time, not the elpased wall time which
      will include host execution time.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrew Jones <drjones@redhat.com>
      Acked-by: NDon Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: chai wen <chaiw.fnst@cn.fujitsu.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Ben Zhang <benzh@chromium.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      545a2bf7
    • A
      all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski 提交于
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56141e3
    • R
      kernel/cpuset.c: Mark cpuset_init_current_mems_allowed as __init · 8f4ab07f
      Rasmus Villemoes 提交于
      The only caller of cpuset_init_current_mems_allowed is the __init
      annotated build_all_zonelists_init, so we can also make the former __init.
      Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vishnu Pratap Singh <vishnu.ps@samsung.com>
      Cc: Pintu Kumar <pintu.k@samsung.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8f4ab07f
    • K
      mm: do not use mm->nr_pmds on !MMU configurations · 2d2f5119
      Kirill A. Shutemov 提交于
      mm->nr_pmds doesn't make sense on !MMU configurations
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2d2f5119
    • V
      cgroup: release css->id after css_free · 01e58659
      Vladimir Davydov 提交于
      Currently, we release css->id in css_release_work_fn, right before calling
      css_free callback, so that when css_free is called, the id may have
      already been reused for a new cgroup.
      
      I am going to use css->id to create unique names for per memcg kmem
      caches.  Since kmem caches are destroyed only on css_free, I need css->id
      to be freed after css_free was called to avoid name clashes.  This patch
      therefore moves css->id removal to css_free_work_fn.  To prevent
      css_from_id from returning a pointer to a stale css, it makes
      css_release_work_fn replace the css ptr at css_idr:css->id with NULL.
      Signed-off-by: NVladimir Davydov <vdavydov@parallels.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Acked-by: NTejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      01e58659
  6. 12 2月, 2015 2 次提交
    • K
      mm: fix false-positive warning on exit due mm_nr_pmds(mm) · b30fe6c7
      Kirill A. Shutemov 提交于
      The problem is that we check nr_ptes/nr_pmds in exit_mmap() which happens
      *before* pgd_free().  And if an arch does pte/pmd allocation in
      pgd_alloc() and frees them in pgd_free() we see offset in counters by the
      time of the checks.
      
      We tried to workaround this by offsetting expected counter value according
      to FIRST_USER_ADDRESS for both nr_pte and nr_pmd in exit_mmap().  But it
      doesn't work in some cases:
      
      1. ARM with LPAE enabled also has non-zero USER_PGTABLES_CEILING, but
         upper addresses occupied with huge pmd entries, so the trick with
         offsetting expected counter value will get really ugly: we will have
         to apply it nr_pmds, but not nr_ptes.
      
      2. Metag has non-zero FIRST_USER_ADDRESS, but doesn't do allocation
         pte/pmd page tables allocation in pgd_alloc(), just setup a pgd entry
         which is allocated at boot and shared accross all processes.
      
      The proposal is to move the check to check_mm() which happens *after*
      pgd_free() and do proper accounting during pgd_alloc() and pgd_free()
      which would bring counters to zero if nothing leaked.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: NTyler Baker <tyler.baker@linaro.org>
      Tested-by: NTyler Baker <tyler.baker@linaro.org>
      Tested-by: NNishanth Menon <nm@ti.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b30fe6c7
    • K
      mm: account pmd page tables to the process · dc6c9a35
      Kirill A. Shutemov 提交于
      Dave noticed that unprivileged process can allocate significant amount of
      memory -- >500 MiB on x86_64 -- and stay unnoticed by oom-killer and
      memory cgroup.  The trick is to allocate a lot of PMD page tables.  Linux
      kernel doesn't account PMD tables to the process, only PTE.
      
      The use-cases below use few tricks to allocate a lot of PMD page tables
      while keeping VmRSS and VmPTE low.  oom_score for the process will be 0.
      
      	#include <errno.h>
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <unistd.h>
      	#include <sys/mman.h>
      	#include <sys/prctl.h>
      
      	#define PUD_SIZE (1UL << 30)
      	#define PMD_SIZE (1UL << 21)
      
      	#define NR_PUD 130000
      
      	int main(void)
      	{
      		char *addr = NULL;
      		unsigned long i;
      
      		prctl(PR_SET_THP_DISABLE);
      		for (i = 0; i < NR_PUD ; i++) {
      			addr = mmap(addr + PUD_SIZE, PUD_SIZE, PROT_WRITE|PROT_READ,
      					MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
      			if (addr == MAP_FAILED) {
      				perror("mmap");
      				break;
      			}
      			*addr = 'x';
      			munmap(addr, PMD_SIZE);
      			mmap(addr, PMD_SIZE, PROT_WRITE|PROT_READ,
      					MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0);
      			if (addr == MAP_FAILED)
      				perror("re-mmap"), exit(1);
      		}
      		printf("PID %d consumed %lu KiB in PMD page tables\n",
      				getpid(), i * 4096 >> 10);
      		return pause();
      	}
      
      The patch addresses the issue by account PMD tables to the process the
      same way we account PTE.
      
      The main place where PMD tables is accounted is __pmd_alloc() and
      free_pmd_range(). But there're few corner cases:
      
       - HugeTLB can share PMD page tables. The patch handles by accounting
         the table to all processes who share it.
      
       - x86 PAE pre-allocates few PMD tables on fork.
      
       - Architectures with FIRST_USER_ADDRESS > 0. We need to adjust sanity
         check on exit(2).
      
      Accounting only happens on configuration where PMD page table's level is
      present (PMD is not folded).  As with nr_ptes we use per-mm counter.  The
      counter value is used to calculate baseline for badness score by
      oom-killer.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: NDave Hansen <dave.hansen@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Reviewed-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Cc: David Rientjes <rientjes@google.com>
      Tested-by: NSedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dc6c9a35