1. 09 December 2011, 2 commits
    • mm: Ensure that pfn_valid() is called once per pageblock when reserving pageblocks · d0215638
      Committed by Michal Hocko
      setup_zone_migrate_reserve() expects zone->start_pfn to start at a
      pageblock_nr_pages-aligned pfn; otherwise we can access beyond an
      existing memblock, resulting in the following panic when
      CONFIG_HOLES_IN_ZONE is not configured and pfn_valid is not checked:
      
        IP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180
        *pdpt = 0000000000000000 *pde = f000ff53f000ff53
        Oops: 0000 [#1] SMP
        Pid: 1, comm: swapper Not tainted 3.0.7-0.7-pae #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
        EIP: 0060:[<c02d331d>] EFLAGS: 00010006 CPU: 0
        EIP is at setup_zone_migrate_reserve+0xcd/0x180
        EAX: 000c0000 EBX: f5801fc0 ECX: 000c0000 EDX: 00000000
        ESI: 000c01fe EDI: 000c01fe EBP: 00140000 ESP: f2475f58
        DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
        Process swapper (pid: 1, ti=f2474000 task=f2472cd0 task.ti=f2474000)
        Call Trace:
        [<c02d389c>] __setup_per_zone_wmarks+0xec/0x160
        [<c02d3a1f>] setup_per_zone_wmarks+0xf/0x20
        [<c08a771c>] init_per_zone_wmark_min+0x27/0x86
        [<c020111b>] do_one_initcall+0x2b/0x160
        [<c086639d>] kernel_init+0xbe/0x157
        [<c05cae26>] kernel_thread_helper+0x6/0xd
        Code: a5 39 f5 89 f7 0f 46 fd 39 cf 76 40 8b 03 f6 c4 08 74 32 eb 91 90 89 c8 c1 e8 0e 0f be 80 80 2f 86 c0 8b 14 85 60 2f 86 c0 89 c8 <2b> 82 b4 12 00 00 c1 e0 05 03 82 ac 12 00 00 8b 00 f6 c4 08 0f
        EIP: [<c02d331d>] setup_zone_migrate_reserve+0xcd/0x180 SS:ESP 0068:f2475f58
        CR2: 00000000000012b4
      
      We crashed in pageblock_is_reserved() when accessing pfn 0xc0000 because
      highstart_pfn = 0x36ffe.
      
      The issue was introduced in 3.0-rc1 by 6d3163ce ("mm: check if any page
      in a pageblock is reserved before marking it MIGRATE_RESERVE").
      
      Make sure that start_pfn is always aligned to pageblock_nr_pages so
      that pfn_valid() is always called at the start of each pageblock.
      Architectures with holes in pageblocks will be handled correctly by
      pfn_valid_within() in pageblock_is_reserved().
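      A minimal sketch of the alignment fix in setup_zone_migrate_reserve(),
      assuming the roundup() helper from kernel.h; abbreviated, not the exact
      diff:

        /* mm/page_alloc.c: setup_zone_migrate_reserve(), sketch */
        unsigned long pfn, start_pfn, end_pfn;

        start_pfn = zone->zone_start_pfn;
        end_pfn = start_pfn + zone->spanned_pages;
        /*
         * Round up so that pfn_valid() below is always asked about the
         * first pfn of a pageblock, never one in the middle of it.
         */
        start_pfn = roundup(start_pfn, pageblock_nr_pages);

        for (pfn = start_pfn; pfn < end_pfn; pfn += pageblock_nr_pages) {
                if (!pfn_valid(pfn))
                        continue;
                /* page = pfn_to_page(pfn); pageblock_is_reserved() etc. */
        }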
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Tested-by: Dang Bo <bdang@vmware.com>
      Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Arve Hjønnevåg <arve@android.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org>	[3.0+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: set compound tail page _count to zero · 58a84aa9
      Committed by Youquan Song
      Commit 70b50f94 ("mm: thp: tail page refcounting fix") keeps all
      page_tail->_count values at zero at all times.  But the current kernel
      does not set page_tail->_count to zero when a 1GB page is set up, so
      when an IOMMU 1GB page is used by KVM it results in a kernel oops,
      because a tail page's _count does not equal zero:
      
        kernel BUG at include/linux/mm.h:386!
        invalid opcode: 0000 [#1] SMP
        Call Trace:
          gup_pud_range+0xb8/0x19d
          get_user_pages_fast+0xcb/0x192
          ? trace_hardirqs_off+0xd/0xf
          hva_to_pfn+0x119/0x2f2
          gfn_to_pfn_memslot+0x2c/0x2e
          kvm_iommu_map_pages+0xfd/0x1c1
          kvm_iommu_map_memslots+0x7c/0xbd
          kvm_iommu_map_guest+0xaa/0xbf
          kvm_vm_ioctl_assigned_device+0x2ef/0xa47
          kvm_vm_ioctl+0x36c/0x3a2
          do_vfs_ioctl+0x49e/0x4e4
          sys_ioctl+0x5a/0x7c
          system_call_fastpath+0x16/0x1b
        RIP  gup_huge_pud+0xf2/0x159
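      A sketch of the fixed loop in prep_compound_gigantic_page() in
      mm/hugetlb.c, assuming the mem_map_next() helper from mm/internal.h:

        /* mm/hugetlb.c: prep_compound_gigantic_page(), sketch */
        for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
                __SetPageTail(p);
                /*
                 * The tail-page refcounting scheme from 70b50f94 expects
                 * _count == 0 on every tail page; 1GB (gigantic) pages were
                 * the one path that left it uninitialised.
                 */
                set_page_count(p, 0);
                p->first_page = page;
        }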
      Signed-off-by: Youquan Song <youquan.song@intel.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 01 November 2011, 2 commits
  3. 04 August 2011, 1 commit
    • fault-injection: add ability to export fault_attr in arbitrary directory · dd48c085
      Committed by Akinobu Mita
      init_fault_attr_dentries() is used to export a fault_attr via debugfs,
      but it can only export it under the debugfs root directory.

      Per Forlin is working on mmc_fail_request, which adds support for
      injecting data errors after a completed host transfer in the MMC
      subsystem.

      The fault_attr for mmc_fail_request should be defined per mmc host and
      exported in a per-host debugfs directory such as
      /sys/kernel/debug/mmc0/mmc_fail_request.

      init_fault_attr_dentries() cannot do this, so introduce
      fault_create_debugfs_attr(), which can create the attribute in an
      arbitrary directory, and replace init_fault_attr_dentries() with it.
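      A hedged usage sketch for a per-host attribute; the helper
      mmc_add_fail_request() is hypothetical, while DECLARE_FAULT_ATTR() and
      the new function's signature are from include/linux/fault-inject.h:

        #include <linux/err.h>
        #include <linux/fault-inject.h>

        static DECLARE_FAULT_ATTR(fail_mmc_request);

        /* hypothetical helper: export under the per-host debugfs directory */
        static int mmc_add_fail_request(struct mmc_host *host)
        {
                struct dentry *dir;

                dir = fault_create_debugfs_attr("mmc_fail_request",
                                                host->debugfs_root,
                                                &fail_mmc_request);
                return IS_ERR(dir) ? PTR_ERR(dir) : 0;
        }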
      
      [akpm@linux-foundation.org: extraneous semicolon, per Randy]
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Tested-by: Per Forlin <per.forlin@linaro.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 27 July 2011, 2 commits
  5. 26 July 2011, 2 commits
    • mm: page allocator: reconsider zones for allocation after direct reclaim · 76d3fbf8
      Committed by Mel Gorman
      With zone_reclaim_mode enabled, it's possible for zones to be considered
      full in the zonelist_cache so they are skipped in the future.  If the
      process enters direct reclaim, the ZLC may still consider zones to be full
      even after reclaiming pages.  Reconsider all zones for allocation if
      direct reclaim returns successfully.
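      The change, sketched at the end of __alloc_pages_direct_reclaim() in
      mm/page_alloc.c (abbreviated):

        /* __alloc_pages_direct_reclaim(), after try_to_free_pages() */
        if (unlikely(!(*did_some_progress)))
                return NULL;

        /*
         * Reclaim made progress, so any "full" marks the zonelist cache
         * recorded before reclaim may now be stale; reconsider all zones.
         */
        if (NUMA_BUILD)
                zlc_clear_zones_full(zonelist);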
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: page allocator: initialise ZLC for first zone eligible for zone_reclaim · cd38b115
      Committed by Mel Gorman
      There have been a small number of complaints about significant stalls
      while copying large amounts of data on NUMA machines, reported on a
      distribution bugzilla.  In these cases, zone_reclaim was enabled by
      default due to large NUMA distances.  In general, the complaints have not
      been about the workload itself unless it was a file server (in which case
      the recommendation was to disable zone_reclaim).
      
      The stalls are mostly due to significant amounts of time spent scanning
      the preferred zone for pages to free.  After a failure, it might fall back
      to another node (as zonelists are often node-ordered rather than
      zone-ordered) but stall quickly again when the next allocation attempt
      occurs.  In bad cases, each page allocated results in a full scan of the
      preferred zone.
      
      Patch 1 checks the preferred zone for recent allocation failure
              which is particularly important if zone_reclaim has failed
              recently.  This avoids rescanning the zone in the near future and
              instead falling back to another node.  This may hurt node locality
              in some cases but a failure to zone_reclaim is more expensive than
              a remote access.
      
      Patch 2 clears the zlc information after direct reclaim.
              Otherwise, zone_reclaim can mark zones full, direct reclaim can
              reclaim enough pages but the zone is still not considered for
              allocation.
      
      This was tested on a 24-thread 2-node x86_64 machine.  The tests were
      focused on large amounts of IO.  All tests were bound to the CPUs on
      node-0 to avoid disturbances due to processes being scheduled on different
      nodes.  The kernels tested are
      
      3.0-rc6-vanilla		Vanilla 3.0-rc6
      zlcfirst		Patch 1 applied
      zlcreconsider		Patches 1+2 applied
      
      FS-Mark
      ./fs_mark  -d  /tmp/fsmark-10813  -D  100  -N  5000  -n  208  -L  35  -t  24  -S0  -s  524288
                              3.0-rc6              3.0-rc6              3.0-rc6
                              vanilla             zlcfirst        zlcreconsider
      Files/s  min          54.90 ( 0.00%)       49.80 (-10.24%)       49.10 (-11.81%)
      Files/s  mean        100.11 ( 0.00%)      135.17 (25.94%)      146.93 (31.87%)
      Files/s  stddev       57.51 ( 0.00%)      138.97 (58.62%)      158.69 (63.76%)
      Files/s  max         361.10 ( 0.00%)      834.40 (56.72%)      802.40 (55.00%)
      Overhead min       76704.00 ( 0.00%)    76501.00 ( 0.27%)    77784.00 (-1.39%)
      Overhead mean    1485356.51 ( 0.00%)  1035797.83 (43.40%)  1594680.26 (-6.86%)
      Overhead stddev  1848122.53 ( 0.00%)   881489.88 (109.66%)  1772354.90 ( 4.27%)
      Overhead max     7989060.00 ( 0.00%)  3369118.00 (137.13%) 10135324.00 (-21.18%)
      MMTests Statistics: duration
      User/Sys Time Running Test (seconds)        501.49    493.91    499.93
      Total Elapsed Time (seconds)               2451.57   2257.48   2215.92
      
      MMTests Statistics: vmstat
      Page Ins                                       46268       63840       66008
      Page Outs                                   90821596    90671128    88043732
      Swap Ins                                           0           0           0
      Swap Outs                                          0           0           0
      Direct pages scanned                        13091697     8966863     8971790
      Kswapd pages scanned                               0     1830011     1831116
      Kswapd pages reclaimed                             0     1829068     1829930
      Direct pages reclaimed                      13037777     8956828     8648314
      Kswapd efficiency                               100%         99%         99%
      Kswapd velocity                                0.000     810.643     826.346
      Direct efficiency                                99%         99%         96%
      Direct velocity                             5340.128    3972.068    4048.788
      Percentage direct scans                         100%         83%         83%
      Page writes by reclaim                             0           3           0
      Slabs scanned                                 796672      720640      720256
      Direct inode steals                          7422667     7160012     7088638
      Kswapd inode steals                                0     1736840     2021238
      
      Test completes far faster with a large increase in the number of files
      created per second.  Standard deviation is high as a small number of
      iterations were much higher than the mean.  The number of pages scanned by
      zone_reclaim is reduced and kswapd is used for more work.
      
      LARGE DD
                          3.0-rc6      3.0-rc6      3.0-rc6
                          vanilla     zlcfirst     zlcreconsider
      download tar           59 ( 0.00%)   59 ( 0.00%)   55 ( 7.27%)
      dd source files       527 ( 0.00%)  296 (78.04%)  320 (64.69%)
      delete source          36 ( 0.00%)   19 (89.47%)   20 (80.00%)
      MMTests Statistics: duration
      User/Sys Time Running Test (seconds)        125.03    118.98    122.01
      Total Elapsed Time (seconds)                624.56    375.02    398.06
      
      MMTests Statistics: vmstat
      Page Ins                                     3594216      439368      407032
      Page Outs                                   23380832    23380488    23377444
      Swap Ins                                           0           0           0
      Swap Outs                                          0         436         287
      Direct pages scanned                        17482342    69315973    82864918
      Kswapd pages scanned                               0      519123      575425
      Kswapd pages reclaimed                             0      466501      522487
      Direct pages reclaimed                       5858054     2732949     2712547
      Kswapd efficiency                               100%         89%         90%
      Kswapd velocity                                0.000    1384.254    1445.574
      Direct efficiency                                33%          3%          3%
      Direct velocity                            27991.453  184832.737  208171.929
      Percentage direct scans                         100%         99%         99%
      Page writes by reclaim                             0        5082       13917
      Slabs scanned                                  17280       29952       35328
      Direct inode steals                           115257     1431122      332201
      Kswapd inode steals                                0           0      979532
      
      This test downloads a large tarfile and copies it with dd a number of
      times - similar to the most recent bug report I've dealt with.  Time to
      completion is reduced.  The number of pages scanned directly is still
      disturbingly high with a low efficiency but this is likely due to the
      number of dirty pages encountered.  The figures could probably be improved
      with more work around how kswapd is used and how dirty pages are handled
      but that is separate work and this result is significant on its own.
      
      Streaming Mapped Writer
      MMTests Statistics: duration
      User/Sys Time Running Test (seconds)        124.47    111.67    112.64
      Total Elapsed Time (seconds)               2138.14   1816.30   1867.56
      
      MMTests Statistics: vmstat
      Page Ins                                       90760       89124       89516
      Page Outs                                  121028340   120199524   120736696
      Swap Ins                                           0          86          55
      Swap Outs                                          0           0           0
      Direct pages scanned                       114989363    96461439    96330619
      Kswapd pages scanned                        56430948    56965763    57075875
      Kswapd pages reclaimed                      27743219    27752044    27766606
      Direct pages reclaimed                         49777       46884       36655
      Kswapd efficiency                                49%         48%         48%
      Kswapd velocity                            26392.541   31363.631   30561.736
      Direct efficiency                                 0%          0%          0%
      Direct velocity                            53780.091   53108.759   51581.004
      Percentage direct scans                          67%         62%         62%
      Page writes by reclaim                           385         122        1513
      Slabs scanned                                  43008       39040       42112
      Direct inode steals                                0          10           8
      Kswapd inode steals                              733         534         477
      
      This test just creates a large file mapping and writes to it linearly.
      Time to completion is again reduced.
      
      The gains are mostly down to two things.  In many cases, there is less
      scanning as zone_reclaim simply gives up faster due to recent failures.
      The second reason is that memory is used more efficiently.  Instead of
      scanning the preferred zone every time, the allocator falls back to
      another zone and uses it instead improving overall memory utilisation.
      
      This patch: initialise ZLC for first zone eligible for zone_reclaim.
      
      The zonelist cache (ZLC) is used among other things to record if
      zone_reclaim() failed for a particular zone recently.  The intention is to
      avoid a high cost scanning extremely long zonelists or scanning within the
      zone uselessly.
      
      Currently the zonelist cache is setup only after the first zone has been
      considered and zone_reclaim() has been called.  The objective was to avoid
      a costly setup but zone_reclaim is itself quite expensive.  If it is
      failing regularly such as the first eligible zone having mostly mapped
      pages, the cost in scanning and allocation stalls is far higher than the
      ZLC initialisation step.
      
      This patch initialises ZLC before the first eligible zone calls
      zone_reclaim().  Once initialised, it is checked whether the zone failed
      zone_reclaim recently.  If it has, the zone is skipped.  As the first zone
      is now being checked, additional care has to be taken about zones marked
      full.  A zone can be marked "full" because it should not have enough
      unmapped pages for zone_reclaim but this is excessive as direct reclaim or
      kswapd may succeed where zone_reclaim fails.  Only mark zones "full" after
      zone_reclaim fails if it failed to reclaim enough pages after scanning.
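      Sketched against the zonelist walk in get_page_from_freelist();
      abbreviated and reordered for clarity, not the literal diff:

        for_each_zone_zonelist_nodemask(zone, z, zonelist,
                                        high_zoneidx, nodemask) {
                /* the ZLC is now consulted before the first zone_reclaim() */
                if (NUMA_BUILD && zlc_active &&
                    !zlc_zone_worth_trying(zonelist, z, allowednodes))
                        continue;

                /* ... watermark checks fail, zone_reclaim is an option ... */
                if (NUMA_BUILD && !did_zlc_setup && nr_online_nodes > 1) {
                        /* pay the one-off setup cost before zone_reclaim */
                        allowednodes = zlc_setup(zonelist, alloc_flags);
                        zlc_active = 1;
                        did_zlc_setup = 1;
                }

                /*
                 * Only mark a zone full if zone_reclaim scanned but failed
                 * to reclaim enough pages, not merely because it has too
                 * few unmapped pages.
                 */
        }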
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Minchan Kim <minchan.kim@gmail.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 13 July 2011, 1 commit
    • x86, numa: Implement pfn -> nid mapping granularity check · 1e01979c
      Committed by Tejun Heo
      SPARSEMEM w/o VMEMMAP and DISCONTIGMEM, both used only on 32bit, use a
      sections array to map pfn to nid, which is limited in granularity.  If
      NUMA nodes are laid out such that the mapping cannot be accurate, boot
      will fail, triggering the BUG_ON() in mminit_verify_page_links().

      On 32bit, the granularity is 512MiB with PAE and SPARSEMEM.  This seems
      to have been granular enough until commit 2706a0bf ("x86, NUMA: Enable
      CONFIG_AMD_NUMA on 32bit too").  Apparently, there is a machine which
      aligns NUMA nodes to 128MiB and has only AMD NUMA but no SRAT.  This
      led to the following BUG_ON():
      
       On node 0 totalpages: 2096615
         DMA zone: 32 pages used for memmap
         DMA zone: 0 pages reserved
         DMA zone: 3927 pages, LIFO batch:0
         Normal zone: 1740 pages used for memmap
         Normal zone: 220978 pages, LIFO batch:31
         HighMem zone: 16405 pages used for memmap
         HighMem zone: 1853533 pages, LIFO batch:31
       BUG: Int 6: CR2   (null)
            EDI   (null)  ESI 00000002  EBP 00000002  ESP c1543ecc
            EBX f2400000  EDX 00000006  ECX   (null)  EAX 00000001
            err   (null)  EIP c16209aa   CS 00000060  flg 00010002
       Stack: f2400000 00220000 f7200800 c1620613 00220000 01000000 04400000 00238000
                (null) f7200000 00000002 f7200b58 f7200800 c1620929 000375fe   (null)
              f7200b80 c16395f0 00200a02 f7200a80   (null) 000375fe 00000002   (null)
       Pid: 0, comm: swapper Not tainted 2.6.39-rc5-00181-g2706a0bf #17
       Call Trace:
        [<c136b1e5>] ? early_fault+0x2e/0x2e
        [<c16209aa>] ? mminit_verify_page_links+0x12/0x42
        [<c1620613>] ? memmap_init_zone+0xaf/0x10c
        [<c1620929>] ? free_area_init_node+0x2b9/0x2e3
        [<c1607e99>] ? free_area_init_nodes+0x3f2/0x451
        [<c1601d80>] ? paging_init+0x112/0x118
        [<c15f578d>] ? setup_arch+0x791/0x82f
        [<c15f43d9>] ? start_kernel+0x6a/0x257
      
      This patch implements node_map_pfn_alignment(), which determines the
      maximum internode alignment, and updates numa_register_memblks() to
      reject the NUMA configuration if the alignment exceeds the pfn -> nid
      mapping granularity of the memory model, as determined by
      PAGES_PER_SECTION.

      This makes the problematic machine boot with flatmem by rejecting the
      NUMA config, and provides protection against crazy NUMA configurations.
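      The check, roughly as it lands in numa_register_memblks():

        unsigned long pfn_align = node_map_pfn_alignment();

        /* reject configs the memory model cannot represent accurately */
        if (pfn_align && pfn_align < PAGES_PER_SECTION) {
                printk(KERN_WARNING "Node alignment %LuMB < min %LuMB, rejecting NUMA config\n",
                       PFN_PHYS(pfn_align) >> 20,
                       PFN_PHYS(PAGES_PER_SECTION) >> 20);
                return -EINVAL;
        }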
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Link: http://lkml.kernel.org/r/20110712074534.GB2872@htj.dyndns.org
      LKML-Reference: <20110628174613.GP478@escobedo.osrc.amd.com>
      Reported-and-Tested-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Cc: Conny Seidel <conny.seidel@amd.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  7. 02 June 2011, 1 commit
  8. 27 May 2011, 1 commit
    • memcg: fix get_scan_count() for small targets · 246e87a9
      Committed by KAMEZAWA Hiroyuki
      During memory reclaim we determine the number of pages to be scanned per
      zone as
      
      	(anon + file) >> priority.
      Assume
      	scan = (anon + file) >> priority.
      
      If scan < SWAP_CLUSTER_MAX, the scan is skipped for this round and
      priority gets higher.  This has some problems.

        1. This raises priority by 1 without doing any scan.
           To do a scan at this priority, the amount of pages should be larger than 512M.
           If pages >> priority < SWAP_CLUSTER_MAX, it's recorded and the scan will be
           batched later.  (But we lose 1 priority.)
           If memory size is below 16M, pages >> priority is 0 and there is no scan at
           DEF_PRIORITY, forever.

        2. If zone->all_unreclaimable == true, it's scanned only when priority == 0.
           So x86's ZONE_DMA will never be recovered until the user of the pages
           frees memory by itself.

        3. With memcg, the memory limit can be small.  A small memcg reaches
           priority < DEF_PRIORITY-2 very easily and needs to call
           wait_iff_congested().  For a scan to happen before priority=9, 64MB of
           memory would have to be in use.
      
      This patch therefore forcibly scans SWAP_CLUSTER_MAX pages when

        1. the target is small enough, and
        2. it's kswapd or memcg reclaim.

      Then we can avoid a rapid priority drop and may be able to recover
      all_unreclaimable in small zones.  This patch also removes nr_saved_scan,
      which allows scanning at this priority even when pages >> priority is
      very small.
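      A minimal sketch of the forced batch, assuming the surrounding names
      from get_scan_count() (anon, file, sc, priority); the real patch also
      reworks the per-list bookkeeping:

        unsigned long scan = (anon + file) >> priority;

        /*
         * Small targets round down to zero and would just burn a priority
         * level; force a minimal batch for kswapd and for memcg reclaim.
         */
        if (scan < SWAP_CLUSTER_MAX &&
            (current_is_kswapd() || !scanning_global_lru(sc)))
                scan = min(anon + file, (unsigned long)SWAP_CLUSTER_MAX);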
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Ying Han <yinghan@google.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 25 May 2011, 10 commits
  10. 21 May 2011, 1 commit
    • sanitize <linux/prefetch.h> usage · 268bb0ce
      Committed by Linus Torvalds
      Commit e66eed65 ("list: remove prefetching from regular list
      iterators") removed the include of prefetch.h from list.h, which
      uncovered several cases that had apparently relied on that rather
      obscure header file dependency.
      
      So this fixes things up a bit, using
      
         grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
         grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')
      
      to guide us in finding files that either need <linux/prefetch.h>
      inclusion, or have it despite not needing it.
      
      There are more of them around (mostly network drivers), but this gets
      many core ones.
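      The resulting fixes are mechanical; for a file that uses prefetchw(),
      the change is just the explicit include (warm_page() below is a made-up
      illustration, not code from the patch):

        #include <linux/prefetch.h>  /* no longer pulled in via <linux/list.h> */

        static inline void warm_page(struct page *page)
        {
                prefetchw(&page->flags);  /* prefetch for write before touching */
        }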
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 17 May 2011, 1 commit
  12. 12 May 2011, 2 commits
    • mm: add alloc_pages_exact_nid() · ee85c2e1
      Committed by Andi Kleen
      Add an alloc_pages_exact_nid() that allocates on a specific node.
      
      The naming is quite broken, but fixing that would need a larger renaming
      action.
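      A hedged usage sketch; the 40 KiB size and the nid are illustrative:

        /* exact-size, node-local allocation; pairs with free_pages_exact() */
        void *buf = alloc_pages_exact_nid(nid, 40 * 1024, GFP_KERNEL);

        if (buf) {
                /* ... use the buffer ... */
                free_pages_exact(buf, 40 * 1024);
        }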
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: tweak comment]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: David Rientjes <rientjes@google.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use alloc_bootmem_node_nopanic() on really needed path · 8f389a99
      Committed by Yinghai Lu
      Stefan found that nobootmem does not work on his system, which has only
      8M of RAM.  This causes an early panic:
      
        BIOS-provided physical RAM map:
         BIOS-88: 0000000000000000 - 000000000009f000 (usable)
         BIOS-88: 0000000000100000 - 0000000000840000 (usable)
        bootconsole [earlyser0] enabled
        Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS!
        DMI not present or invalid.
        last_pfn = 0x840 max_arch_pfn = 0x100000
        init_memory_mapping: 0000000000000000-0000000000840000
        8MB LOWMEM available.
          mapped low ram: 0 - 00840000
          low ram: 0 - 00840000
        Zone PFN ranges:
          DMA      0x00000001 -> 0x00001000
          Normal   empty
        Movable zone start PFN for each node
        early_node_map[2] active PFN ranges
            0: 0x00000001 -> 0x0000009f
            0: 0x00000100 -> 0x00000840
        BUG: Int 6: CR2 (null)
             EDI c034663c  ESI (null)  EBP c0329f38  ESP c0329ef4
             EBX c0346380  EDX 00000006  ECX ffffffff  EAX fffffff4
             err (null)  EIP c0353191   CS c0320060  flg 00010082
        Stack: (null) c030c533 000007cd (null) c030c533 00000001 (null) (null)
               00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null)
               (null) 00000100 00000840 (null) c0329f64 00000001 00001000 (null)
        Pid: 0, comm: swapper Not tainted 2.6.36 #5
        Call Trace:
         [<c02e3707>] ? 0xc02e3707
         [<c035e6e5>] 0xc035e6e5
         [<c0353191>] ? 0xc0353191
         [<c03534d6>] 0xc03534d6
         [<c034f1cd>] 0xc034f1cd
         [<c034a824>] 0xc034a824
         [<c03513cb>] ? 0xc03513cb
         [<c0349432>] 0xc0349432
         [<c0349066>] 0xc0349066
      
      It turns out that we should ignore the low limit of 16M.
      
      Use alloc_bootmem_node_nopanic() in this case.
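      A sketch of the fallback this enables, assuming the
      __alloc_memory_core_early() helper in mm/nobootmem.c of this era; the
      exact retry structure is an assumption, not the literal diff:

        /* mm/nobootmem.c, _nopanic path: retry without the 16M lower bound */
        ptr = __alloc_memory_core_early(nid, size, align, goal, limit);
        if (ptr)
                return ptr;

        /* ignore the MAX_DMA_ADDRESS (16M) goal and allocate anywhere */
        return __alloc_memory_core_early(nid, size, align, 0, limit);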
      
      [akpm@linux-foundation.org: less mess]
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Reported-by: Stefan Hellermann <stefan@the2masters.de>
      Tested-by: Stefan Hellermann <stefan@the2masters.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@kernel.org>		[2.6.34+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 15 April 2011, 1 commit
  14. 10 April 2011, 1 commit
  15. 31 March 2011, 1 commit
  16. 25 March 2011, 1 commit
  17. 24 March 2011, 1 commit
  18. 23 March 2011, 7 commits
  19. 18 March 2011, 1 commit
  20. 26 February 2011, 1 commit