1. 16 Sep, 2009 4 commits
    • HWPOISON: Add poison check to page fault handling · a3b947ea
      Andi Kleen committed
      Bail out early when hardware poisoned pages are found in page fault handling.
      Since they are poisoned they should not be mapped freshly into processes,
      because that would cause another (potentially deadly) machine check.
      
      This is generally handled in the same way as OOM, except that a different
      error code is returned to the architecture code.
      
      v2: Do a page unlock if needed (Fengguang Wu)
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
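
The early bail-out described in this commit message can be sketched in plain C. This is a minimal user-space model under stated assumptions, not the kernel code: every `sk_` name, the struct layout, and the flag values are hypothetical stand-ins for the kernel's page flags and `VM_FAULT_*` codes.

```c
#include <stdbool.h>

/* Illustrative flag values; the kernel defines its own VM_FAULT_* bits. */
#define SK_FAULT_HWPOISON 0x0010u
#define SK_FAULT_LOCKED   0x0200u

/* Hypothetical stand-in for struct page. */
struct sk_page {
    bool hwpoison;  /* hardware reported this page as corrupted */
    bool locked;    /* models the page lock */
    bool mapped;    /* set once the page is mapped into a process */
};

/*
 * Finish a fault for 'page'. 'ret' is the status from a lower-level
 * fault callback and may carry SK_FAULT_LOCKED if the page came back locked.
 */
static unsigned int sk_finish_fault(struct sk_page *page, unsigned int ret)
{
    if (page->hwpoison) {
        /* Drop the lock before bailing out (the "v2" note in the message). */
        if (ret & SK_FAULT_LOCKED)
            page->locked = false;
        return SK_FAULT_HWPOISON;   /* never map a poisoned page */
    }

    /* Healthy page: map it, then release the lock if we hold it. */
    page->mapped = true;
    if (ret & SK_FAULT_LOCKED)
        page->locked = false;
    return 0;
}
```

In this model a caller would treat `SK_FAULT_HWPOISON` like an out-of-memory result, with only the error code differing, which mirrors the description above.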
    • HWPOISON: Add basic support for poisoned pages in fault handler v3 · d1737fdb
      Andi Kleen committed
      - Add a new VM_FAULT_HWPOISON error code to handle_mm_fault. Right now
      architectures have to explicitly enable poison page support, so
      this is forward compatible with all architectures. They only need
      to add it when they enable poison page support.
      - Add poison page handling in the swap-in fault code
      
      v2: Add missing delayacct_clear_flag (Hidehiro Kawai)
      v3: Really use delayacct_clear_flag (Hidehiro Kawai)
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
    • HWPOISON: Add support for poison swap entries v2 · a7420aa5
      Andi Kleen committed
      Memory migration uses special swap entry types to trigger special actions on
      page faults. Extend this mechanism to also support poisoned swap entries, to
      trigger poison handling on page faults. This allows follow-on patches to
      prevent processes from faulting in poisoned pages again.
      
      v2: Fix overflow in MAX_SWAPFILES (Fengguang Wu)
      v3: Better overflow fix (Hidehiro Kawai)
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
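
One way to picture the special swap entry types is a small type field whose highest values are reserved for non-swap uses. The sketch below is a simplified user-space model, not the kernel's definitions: the `SK_`/`sk_` names, the 5-bit type field, and the 27-bit offset are assumptions chosen for illustration.

```c
#include <stdbool.h>

/* Assumed layout: 5 type bits above 27 offset bits. */
#define SK_TYPE_SHIFT    5
#define SK_OFFSET_BITS   27

/* Type values reserved at the top of the range for special entries. */
#define SK_MIGRATION_NUM 2
#define SK_HWPOISON_NUM  1

/* Real swap files must leave room for the reserved types. */
#define SK_MAX_SWAPFILES \
    ((1 << SK_TYPE_SHIFT) - SK_MIGRATION_NUM - SK_HWPOISON_NUM)

#define SK_SWP_HWPOISON        SK_MAX_SWAPFILES
#define SK_SWP_MIGRATION_READ  (SK_MAX_SWAPFILES + SK_HWPOISON_NUM)
#define SK_SWP_MIGRATION_WRITE (SK_SWP_MIGRATION_READ + 1)

typedef struct { unsigned long val; } sk_swp_entry_t;

/* Pack a type and an offset into one entry value. */
static sk_swp_entry_t sk_swp_entry(unsigned long type, unsigned long offset)
{
    sk_swp_entry_t e;
    e.val = (type << SK_OFFSET_BITS) | (offset & ((1UL << SK_OFFSET_BITS) - 1));
    return e;
}

static unsigned long sk_swp_type(sk_swp_entry_t e)
{
    return e.val >> SK_OFFSET_BITS;
}

static unsigned long sk_swp_offset(sk_swp_entry_t e)
{
    return e.val & ((1UL << SK_OFFSET_BITS) - 1);
}

/* True if a fault on this entry should trigger poison handling. */
static bool sk_is_hwpoison_entry(sk_swp_entry_t e)
{
    return sk_swp_type(e) == SK_SWP_HWPOISON;
}
```

The point of subtracting the reserved counts from the swap file limit is that the highest special type must still fit in the type field; in this model that is `SK_SWP_MIGRATION_WRITE`, which must stay below `1 << SK_TYPE_SHIFT`.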
    • HWPOISON: Export some rmap vma locking to outside world · 10be22df
      Andi Kleen committed
      Needed for later patch that walks rmap entries on its own.
      
      This used to be very frowned upon, but memory-failure.c does
      some rather specialized rmap walking and rmap has been stable
      for quite some time, so I think it's ok now to export it.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
  2. 14 Sep, 2009 8 commits
  3. 11 Sep, 2009 7 commits
    • kmemleak: Improve the "Early log buffer exceeded" error message · addd72c1
      Catalin Marinas committed
      Based on a suggestion from Jaswinder, clarify what the user would need
      to do to avoid this error message from kmemleak.
      Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • writeback: check for registered bdi in flusher add and inode dirty · 500b067c
      Jens Axboe committed
      Also a debugging aid. We want to catch dirty inodes being added to
      backing devices that don't do writeback.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • writeback: add name to backing_dev_info · d993831f
      Jens Axboe committed
      This enables us to track who does what and print info. Its main use
      is catching dirty inodes on the default_backing_dev_info, so we can
      fix that up.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • writeback: add some debug inode list counters to bdi stats · f09b00d3
      Jens Axboe committed
      Add some debug entries to be able to inspect the internal state of
      the writeback details.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • writeback: get rid of pdflush completely · d0bceac7
      Jens Axboe committed
      It is now unused, so kill it off.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • writeback: switch to per-bdi threads for flushing data · 03ba3782
      Jens Axboe committed
      This gets rid of pdflush for bdi writeout and kupdated style cleaning.
      pdflush writeout suffers from lack of locality and also requires more
      threads to handle the same workload, since it has to work in a
      non-blocking fashion against each queue. This also introduces lumpy
      behaviour and potential request starvation, since pdflush can be starved
      for queue access if others are accessing it. A sample ffsb workload that
      does random writes to files is about 8% faster here on a simple SATA drive
      during the benchmark phase. File layout also seems a LOT more smooth in
      vmstat:
      
       r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
       0  1      0 608848   2652 375372    0    0     0 71024  604    24  1 10 48 42
       0  1      0 549644   2712 433736    0    0     0 60692  505    27  1  8 48 44
       1  0      0 476928   2784 505192    0    0     4 29540  553    24  0  9 53 37
       0  1      0 457972   2808 524008    0    0     0 54876  331    16  0  4 38 58
       0  1      0 366128   2928 614284    0    0     4 92168  710    58  0 13 53 34
       0  1      0 295092   3000 684140    0    0     0 62924  572    23  0  9 53 37
       0  1      0 236592   3064 741704    0    0     4 58256  523    17  0  8 48 44
       0  1      0 165608   3132 811464    0    0     0 57460  560    21  0  8 54 38
       0  1      0 102952   3200 873164    0    0     4 74748  540    29  1 10 48 41
       0  1      0  48604   3252 926472    0    0     0 53248  469    29  0  7 47 45
      
      where vanilla tends to fluctuate a lot in the creation phase:
      
       r  b   swpd   free   buff  cache   si   so    bi     bo   in    cs us sy id wa
       1  1      0 678716   5792 303380    0    0     0  74064  565    50  1 11 52 36
       1  0      0 662488   5864 319396    0    0     4    352  302   329  0  2 47 51
       0  1      0 599312   5924 381468    0    0     0  78164  516    55  0  9 51 40
       0  1      0 519952   6008 459516    0    0     4  78156  622    56  1 11 52 37
       1  1      0 436640   6092 541632    0    0     0  82244  622    54  0 11 48 41
       0  1      0 436640   6092 541660    0    0     0      8  152    39  0  0 51 49
       0  1      0 332224   6200 644252    0    0     4 102800  728    46  1 13 49 36
       1  0      0 274492   6260 701056    0    0     4  12328  459    49  0  7 50 43
       0  1      0 211220   6324 763356    0    0     0 106940  515    37  1 10 51 39
       1  0      0 160412   6376 813468    0    0     0   8224  415    43  0  6 49 45
       1  1      0  85980   6452 886556    0    0     4 113516  575    39  1 11 54 34
       0  2      0  85968   6452 886620    0    0     0   1640  158   211  0  0 46 54
      
      A 10 disk test with btrfs performs 26% faster with per-bdi flushing. A
      SSD based writeback test on XFS performs over 20% better as well, with
      the throughput being very stable around 1GB/sec, where pdflush only
      manages 750MB/sec and fluctuates wildly while doing so. Random buffered
      writes to many files behave a lot better as well, as do random mmap'ed
      writes.
      
      A separate thread is added to sync the super blocks. In the long term,
      adding sync_supers_bdi() functionality could get rid of this thread again.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
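
The smoothness claim in this commit message can be checked against the `bo` (blocks written out) column of the two vmstat samples it quotes. The sketch below is a small illustration, not part of the commit: the `sk_` helper names are hypothetical, and the spread is measured as the squared coefficient of variation (variance divided by mean squared), a metric chosen here for simplicity.

```c
#include <stddef.h>

/* "bo" column copied from the per-bdi vmstat sample. */
static const double sk_bo_per_bdi[] = {
    71024, 60692, 29540, 54876, 92168, 62924, 58256, 57460, 74748, 53248
};

/* "bo" column copied from the vanilla (pdflush) vmstat sample. */
static const double sk_bo_vanilla[] = {
    74064, 352, 78164, 78156, 82244, 8, 102800, 12328, 106940, 8224,
    113516, 1640
};

#define SK_LEN(a) (sizeof(a) / sizeof((a)[0]))

static double sk_mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

static double sk_min(const double *v, size_t n)
{
    double m = v[0];
    for (size_t i = 1; i < n; i++)
        if (v[i] < m)
            m = v[i];
    return m;
}

/* Squared coefficient of variation: population variance / mean^2. */
static double sk_cv2(const double *v, size_t n)
{
    double mean = sk_mean(v, n);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (v[i] - mean) * (v[i] - mean);
    return (acc / (double)n) / (mean * mean);
}
```

Calling `sk_cv2()` on both arrays gives a noticeably larger value for the vanilla sample, and `sk_min()` shows why: the vanilla run drops to single-digit `bo` values in some intervals, while the per-bdi run never falls below 29540.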
    • writeback: move dirty inodes from super_block to backing_dev_info · 66f3b8e2
      Jens Axboe committed
      This is a first step at introducing per-bdi flusher threads. We should
      have no change in behaviour, although sb_has_dirty_inodes() is now
      ridiculously expensive, as there's no easy way to answer that question.
      Not a huge problem, since it'll be deleted in subsequent patches.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
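
To see why answering "does this super block have dirty inodes?" becomes expensive once the dirty lists hang off the backing device, consider the following model. It is a hedged sketch with hypothetical `sk_` types and arrays in place of the kernel's linked lists, not the code from this commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct sk_super_block { int id; };

struct sk_inode {
    const struct sk_super_block *sb;    /* owning super block */
};

struct sk_bdi {
    const struct sk_inode *dirty;       /* dirty inodes queued on this device */
    size_t nr_dirty;
};

/*
 * With dirty inodes tracked per backing device, finding out whether a
 * given super block has any means scanning every device's dirty list:
 * O(total dirty inodes) instead of a single list-empty check.
 */
static bool sk_sb_has_dirty_inodes(const struct sk_bdi *bdis, size_t nr_bdis,
                                   const struct sk_super_block *sb)
{
    for (size_t i = 0; i < nr_bdis; i++)
        for (size_t j = 0; j < bdis[i].nr_dirty; j++)
            if (bdis[i].dirty[j].sb == sb)
                return true;
    return false;
}
```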
  4. 10 Sep, 2009 1 commit
  5. 09 Sep, 2009 4 commits
  6. 08 Sep, 2009 3 commits
  7. 06 Sep, 2009 2 commits
  8. 04 Sep, 2009 4 commits
  9. 01 Sep, 2009 2 commits
    • percpu: don't assume existence of cpu0 · 04a13c7c
      Tejun Heo committed
      percpu incorrectly assumed that cpu0 was always there which led to the
      following warning and eventual oops on sparc machines w/o cpu0.
      
        WARNING: at mm/percpu.c:651 pcpu_map+0xdc/0x100()
        Modules linked in:
        Call Trace:
          [000000000045eb70] warn_slowpath_common+0x50/0xa0
          [000000000045ebdc] warn_slowpath_null+0x1c/0x40
          [00000000004d493c] pcpu_map+0xdc/0x100
          [00000000004d59a4] pcpu_alloc+0x3e4/0x4e0
          [00000000004d5af8] __alloc_percpu+0x18/0x40
          [00000000005b112c] __percpu_counter_init+0x4c/0xc0
        ...
        Unable to handle kernel NULL pointer dereference
        ...
         I7: <sysfs_new_dirent+0x30/0x120>
         Disabling lock debugging due to kernel taint
         Caller[000000000053c1b0]: sysfs_new_dirent+0x30/0x120
         Caller[000000000053c7a4]: create_dir+0x24/0xc0
         Caller[000000000053c870]: sysfs_create_dir+0x30/0x80
         Caller[00000000005990e8]: kobject_add_internal+0xc8/0x200
        ...
         Kernel panic - not syncing: Attempted to kill the idle task!
      
      This patch fixes the problem by backporting parts from devel branch to
      make percpu core not depend on the existence of cpu0.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Meelis Roos <mroos@linux.ee>
      Cc: David Miller <davem@davemloft.net>
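
The general idea of not hard-coding CPU 0 can be illustrated with a bitmask of possible CPUs. This is a minimal sketch under stated assumptions, not the backported patch: the `sk_` names, the 32-CPU limit, and the plain `unsigned long` mask are hypothetical simplifications of the kernel's cpumask API.

```c
#include <stdbool.h>

#define SK_NR_CPUS 32   /* assumed upper bound for this model */

/* True if 'cpu' is present in the possible-CPU mask. */
static bool sk_cpu_possible(unsigned long possible_mask, int cpu)
{
    if (cpu < 0 || cpu >= SK_NR_CPUS)
        return false;
    return (possible_mask & (1UL << cpu)) != 0;
}

/*
 * Return the lowest-numbered possible CPU, or -1 if the mask is empty.
 * Code that needs "some valid CPU" as a reference point should use this
 * rather than assuming CPU 0 exists.
 */
static int sk_first_possible_cpu(unsigned long possible_mask)
{
    for (int cpu = 0; cpu < SK_NR_CPUS; cpu++)
        if (sk_cpu_possible(possible_mask, cpu))
            return cpu;
    return -1;
}
```

On a machine whose possible mask is `0x6` (CPUs 1 and 2 only), this model picks CPU 1 as the reference, whereas code indexing by a literal 0 would touch a CPU that does not exist.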
    • mm: remove !NUMA condition from PAGEFLAGS_EXTENDED condition set · a269cca9
      H. Peter Anvin committed
      CONFIG_PAGEFLAGS_EXTENDED disables a trick to conserve pageflags.
      This trick is intended to be enabled when the pressure on page flags
      is very high.
      
      The previous condition was:
      
      -       depends on 64BIT || SPARSEMEM_VMEMMAP || !NUMA || !SPARSEMEM
      
      ... however, the sparsemem code already has a way to crowd out the
      node number from the pageflags, which means that !NUMA actually
      doesn't contribute to hard pageflags exhaustion.
      
      This is required for the new PG_uncached flag to not cause pageflags
      exhaustion on x86_32 + PAE + SPARSEMEM + !NUMA.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4A9828F4.4040905@zytor.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Suresh Siddha <suresh.siddha@intel.com>
  10. 30 Aug, 2009 1 commit
  11. 27 Aug, 2009 4 commits