1. 19 May 2009 — 1 commit
  2. 15 May 2009 — 2 commits
    • T
      sched, timers: cleanup avenrun users · 2d02494f
      Thomas Gleixner authored
      avenrun is a rough estimate, so we don't have to worry about the
      consistency of the three avenrun values. Remove the xtime_lock
      dependency, provide a function to scale the values, and clean up
      the users.
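      
      A minimal sketch of such a scaling helper, assuming the avenrun values
      are FSHIFT-bit fixed-point numbers (the signature below is inferred
      from the message, not quoted from the patch):
      
          #include <linux/sched.h>    /* FSHIFT, avenrun[] */
      
          /* Hand out a snapshot of the load averages, scaled as the
           * caller asks; readers tolerate a racy copy. */
          void get_avenrun(unsigned long *loads, unsigned long offset,
                           int shift)
          {
                  loads[0] = (avenrun[0] + offset) << shift;
                  loads[1] = (avenrun[1] + offset) << shift;
                  loads[2] = (avenrun[2] + offset) << shift;
          }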
      
      [ Impact: cleanup ]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      2d02494f
    • T
      sched, timers: move calc_load() to scheduler · dce48a84
      Thomas Gleixner authored
      Dimitri Sivanich noticed that xtime_lock is held write-locked across
      calc_load(), which iterates over all online CPUs. That can cause long
      latencies for xtime_lock readers on large SMP systems.
      
      The load average calculation is a rough estimate anyway, so there is
      no real need to protect the readers from the update. It's not a problem
      when the avenrun array is updated while a reader copies the values.
      
      Instead of iterating over all online CPUs let the scheduler_tick code
      update the number of active tasks shortly before the avenrun update
      happens. The avenrun update itself is handled by the CPU which calls
      do_timer().
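      
      For reference, the classic fixed-point update applied to each of the
      three averages looks like this (constants as in include/linux/sched.h;
      a sketch of the existing mechanism, not the patch itself):
      
          #define FSHIFT   11                   /* bits of precision */
          #define FIXED_1  (1 << FSHIFT)        /* 1.0 in fixed-point */
          #define EXP_1    1884                 /* 1/exp(5sec/1min) */
          #define EXP_5    2014                 /* 1/exp(5sec/5min) */
          #define EXP_15   2037                 /* 1/exp(5sec/15min) */
      
          #define CALC_LOAD(load, exp, n)              \
                  load *= (exp);                       \
                  load += (n) * (FIXED_1 - (exp));     \
                  load >>= FSHIFT;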
      
      [ Impact: reduce xtime_lock write locked section ]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      dce48a84
  3. 12 May 2009 — 1 commit
    • V
      sched: Don't export sched_mc_power_savings on multi-socket single core system · 2ff799d3
      Vaidyanathan Srinivasan authored
      Prevent sched_mc_power_savings from being exported through sysfs on
      multi-socket single-core systems: it should only be exported when the
      maximum number of cores per package is greater than one. My earlier
      patch, which fixed the export of 'sched_mc_power_savings' on laptops,
      broke it on multi-socket single-core systems. This fix addresses the
      issue on both.
      Below are the test results:
      
      1. Single socket - multi-core
             Before Patch: Does not export 'sched_mc_power_savings'
             After Patch: Does not export 'sched_mc_power_savings'
             Result: Pass
      
      2. Multi socket - single core
            Before Patch: exports 'sched_mc_power_savings'
            After Patch: Does not export 'sched_mc_power_savings'
            Result: Pass
      
      3. Multi socket - multi-core
            Before Patch: exports 'sched_mc_power_savings'
            After Patch: exports 'sched_mc_power_savings'
            Result: Pass
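      
      One plausible shape for the underlying gate, assuming the x86
      mc_capable() topology macro is what decides the export (the exact
      fields below are assumptions, not quoted from the patch):
      
          /* Only claim multi-core capability when each package really
           * has more than one core. */
          #define mc_capable()                                    \
                  (boot_cpu_data.x86_max_cores > 1 &&             \
                   cpumask_weight(cpu_coregroup_mask(0)) != nr_cpu_ids)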
      
      [ Impact: make the sched_mc_power_saving control available more consistently ]
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090511143914.GB4853@dirshya.in.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2ff799d3
  4. 11 May 2009 — 1 commit
  5. 09 May 2009 — 8 commits
  6. 08 May 2009 — 13 commits
    • P
      mtd: fix timeout in M25P80 driver · cd1a6de7
      Peter Horton authored
      Extend the erase timeout in the M25P80 SPI flash driver.
      
      The M25P80 driver fails to erase sectors on an M25P128 because the ready
      wait timeout is too short. Change the timeout from a simple loop count to a
      suitable number of seconds.
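      
      A sketch of the approach, assuming the driver's wait_till_ready()
      helper and SR_WIP status bit (names from the m25p80 driver; the exact
      timeout constant is an assumption):
      
          #define MAX_READY_WAIT_JIFFIES  (40 * HZ)  /* M25P128 erase is slow */
      
          static int wait_till_ready(struct m25p *flash)
          {
                  unsigned long deadline = jiffies + MAX_READY_WAIT_JIFFIES;
      
                  do {
                          int sr = read_sr(flash);
      
                          if (sr < 0)
                                  break;
                          if (!(sr & SR_WIP))
                                  return 0;       /* erase finished */
                          cond_resched();
                  } while (!time_after_eq(jiffies, deadline));
      
                  return 1;                       /* timed out */
          }
      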
      Signed-off-by: Peter Horton <zero@colonel-panic.org>
      Tested-by: Martin Michlmayr <tbm@cyrius.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      cd1a6de7
    • H
      x86: MCE: make cmci_discover_lock irq-safe · e5299926
      Hidetoshi Seto authored
      Lockdep reports the warning below when Shaohua Li tries to offline a CPU:
      
      [  110.835487] =================================
      [  110.835616] [ INFO: inconsistent lock state ]
      [  110.835688] 2.6.30-rc4-00336-g8c9ed899 #52
      [  110.835757] ---------------------------------
      [  110.835828] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
      [  110.835908] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
      [  110.835982]  (cmci_discover_lock){?.+...}, at: [<ffffffff80236dc0>] cmci_clear+0x30/0x9b
      
      cmci_clear() can be called via smp_call_function_single().
      
      It is better to disable interrupts while holding cmci_discover_lock,
      turning it into an irq-safe lock - we can deadlock otherwise.
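      
      A minimal sketch of the pattern (the lock name is from the message;
      the surrounding code is assumed):
      
          unsigned long flags;
      
          /* Safe from both process context and the CMCI interrupt path. */
          spin_lock_irqsave(&cmci_discover_lock, flags);
          /* ... discover or clear CMCI banks ... */
          spin_unlock_irqrestore(&cmci_discover_lock, flags);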
      
      [ Impact: fix possible deadlock in the MCE code ]
      Reported-by: Shaohua Li <shaohua.li@intel.com>
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A03ED38.8000700@jp.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e5299926
    • J
      x86: xen, i386: reserve Xen pagetables · 33df4db0
      Jeremy Fitzhardinge authored
      The Xen pagetables are no longer implicitly reserved as part of the other
      i386_start_kernel reservations, so make sure we explicitly reserve them.
      This prevents them from being released into the general kernel free page
      pool and reused.
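      
      A sketch of the reservation, assuming the xen_start_info fields that
      describe the boot pagetables (close to, but not quoted from, the
      patch):
      
          /* Keep the pagetables Xen built for us out of the free pool. */
          reserve_early(__pa(xen_start_info->pt_base),
                        __pa(xen_start_info->pt_base) +
                                xen_start_info->nr_pt_frames * PAGE_SIZE,
                        "XEN PAGETABLES");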
      
      [ Impact: fix Xen guest crash ]
      Also-Bisected-by: Bryan Donlan <bdonlan@gmail.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Xen-devel <xen-devel@lists.xensource.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4A032EEC.30509@goop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      33df4db0
    • H
      x86, kexec: fix crashdump panic with CONFIG_KEXEC_JUMP · 6407df5c
      Huang Ying authored
      Tim Starling reported that crashdump will panic with a kernel compiled
      with CONFIG_KEXEC_JUMP, due to a null pointer dereference in
      machine_kexec_32.c:machine_kexec() when dereferencing
      kexec_image. See:
      
      http://bugzilla.kernel.org/show_bug.cgi?id=13265
      
      This patch fixes the BUG by replacing the global variable reference
      (kexec_image) in machine_kexec() with the local reference (image),
      which is more appropriate and will not be NULL.
      
      The same BUG exists in machine_kexec_64.c, so it is fixed there in
      the same way.
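      
      The shape of the change (surrounding code assumed; kexec_image is NULL
      on the crashdump path, while the image parameter is always valid):
      
          void machine_kexec(struct kimage *image)
          {
                  /* Before: if (kexec_image->preserve_context)  -- NULL deref
                   * After:  use the parameter, which is always valid:   */
                  if (image->preserve_context) {
                          /* ... jump-back setup ... */
                  }
                  /* ... */
          }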
      
      [ Impact: fix crash on kexec ]
      Reported-by: Tim Starling <tstarling@wikimedia.org>
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      LKML-Reference: <1241751101.6259.85.camel@yhuang-dev.sh.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      6407df5c
    • J
      x86-64: finish cleanup_highmaps()'s job wrt. _brk_end · 49834396
      Jan Beulich authored
      With the introduction of the .brk section, special care must be taken
      that no unused page table entries remain if _brk_end and _end are
      separated by a 2M page boundary. cleanup_highmap() runs very early and
      hence cannot take care of that, so potential entries past _brk_end
      must be cleared once the brk allocator has done
      its job.
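      
      A sketch of the cleanup once brk allocation is final (placement and
      exact form assumed):
      
          /* _brk_end can no longer change; destroy any kernel mappings
           * between its 2M page and _end's, which cleanup_highmap()
           * could not have known about. */
          pud_t *pud = pud_offset(pgd_offset_k(_brk_end), _brk_end);
          pmd_t *pmd = pmd_offset(pud, _brk_end - 1);
      
          while (++pmd <= pmd_offset(pud, (unsigned long)_end - 1))
                  pmd_clear(pmd);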
      
      [ Impact: avoids undesirable TLB aliases ]
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      49834396
    • J
      x86: fix boot hang in early_reserve_e820() · 61438766
      Jan Beulich authored
      If the first non-reserved (sub-)range doesn't fit the size requested,
      an endless loop will be entered. If a range returned from
      find_e820_area_size() turns out to be too small, the range must
      be skipped before calling the function again.
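      
      A sketch of the fixed loop (find_e820_area_size() is the function
      named in the message; the surrounding details are assumed):
      
          u64 addr, size;        /* 'start', 'want', 'align' from caller */
      
          for (addr = start; ; addr += size) {
                  addr = find_e820_area_size(addr, &size, align);
                  if (addr == 0)
                          break;          /* nothing left to scan */
                  if (size >= want)
                          break;          /* large enough: success */
                  /* else: next iteration's addr += size skips the
                   * too-small range instead of retrying it forever */
          }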
      
      [ Impact: fixes boot hang on some platforms ]
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      61438766
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 · d7a59269
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (32 commits)
        [CIFS] Fix double list addition in cifs posix open code
        [CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmssp
        [CIFS] Fix SMB uid in NTLMSSP authenticate request
        [CIFS] NTLMSSP reenabled after move from connect.c to sess.c
        [CIFS] Remove sparse warning
        [CIFS] remove checkpatch warning
        [CIFS] Fix final user of old string conversion code
        [CIFS] remove cifs_strfromUCS_le
        [CIFS] NTLMSSP support moving into new file, old dead code removed
        [CIFS] Fix endian conversion of vcnum field
        [CIFS] Remove trailing whitespace
        [CIFS] Remove sparse endian warnings
        [CIFS] Add remaining ntlmssp flags and standardize field names
        [CIFS] Fix build warning
        cifs: fix length handling in cifs_get_name_from_search_buf
        [CIFS] Remove unneeded QuerySymlink call and fix mapping for unmapped status
        [CIFS] rename cifs_strndup to cifs_strndup_from_ucs
        Added loop check when mounting DFS tree.
        Enable dfs submounts to handle remote referrals.
        [CIFS] Remove older session setup implementation
        ...
      d7a59269
    • S
      [CIFS] Fix double list addition in cifs posix open code · 90e4ee5d
      Steve French authored
      Remove the open file entry being added twice to the lists in the file.
      Do not fill in the file info twice in the case of posix opens and creates.
      Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
      90e4ee5d
    • D
      NOMMU: Don't check vm_region::vm_start is page aligned in add_nommu_region() · 8c9ed899
      David Howells authored
      Don't check vm_region::vm_start is page aligned in add_nommu_region() because
      the region may reflect some non-page-aligned mapped file, such as could be
      obtained from RomFS XIP.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Greg Ungerer <gerg@uclinux.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8c9ed899
    • L
      Merge branch 'for-linus' of git://neil.brown.name/md · ee7fee0b
      Linus Torvalds authored
      * 'for-linus' of git://neil.brown.name/md:
        md: remove rd%d links immediately after stopping an array.
        md: remove ability to explicit set an inactive array to 'clean'.
        md: constify VFTs
        md: tidy up status_resync to handle large arrays.
        md: fix some (more) errors with bitmaps on devices larger than 2TB.
        md/raid10: don't clear bitmap during recovery if array will still be degraded.
        md: fix loading of out-of-date bitmap.
      ee7fee0b
    • L
      random: make get_random_int() more random · 8a0a9bd4
      Linus Torvalds authored
      It's a really simple patch that basically just open-codes the current
      "secure_ip_id()" call, but when open-coding it we now use a _static_
      hashing area, so that it gets updated every time.
      
      And to make sure somebody can't just start from the same original seed of
      all-zeroes, and then do the "half_md4_transform()" over and over until
      they get the same sequence as the kernel has, each iteration also mixes in
      the same old "current->pid + jiffies" we used - so we should now have a
      regular strong pseudo-random number generator, but one that doesn't
      have a single seed.
      
      Note: the "pid + jiffies" is just meant to be a tiny tiny bit of noise. It
      has no real meaning. It could be anything. I just picked the previous
      seed, it's just that now we keep the state in between calls and that will
      feed into the next result, and that should make all the difference.
      
      I made that hash per-cpu data just to avoid cache-line ping-pong:
      having multiple CPUs write to the same data would be fine for randomness,
      and add yet another layer of chaos to it, but since get_random_int() is
      supposed to be a fast interface I did it that way instead. I considered
      using "__raw_get_cpu_var()" to avoid any preemption overhead while still
      keeping the hash _mostly_ ping-pong free, but in the end good taste won
      out.
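      
      A sketch of the resulting shape (names as described above; the exact
      details of the committed code are assumed):
      
          static DEFINE_PER_CPU(__u32 [4], get_random_int_hash);
      
          unsigned int get_random_int(void)
          {
                  __u32 *hash = get_cpu_var(get_random_int_hash);
                  unsigned int ret;
      
                  /* Tiny bit of per-call noise; the real strength is the
                   * persistent per-cpu state carried between calls. */
                  hash[0] += current->pid + jiffies + get_cycles();
      
                  ret = half_md4_transform(hash, get_keyptr()->secret);
                  put_cpu_var(get_random_int_hash);
      
                  return ret;
          }
      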
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8a0a9bd4
    • L
      Merge master.kernel.org:/home/rmk/linux-2.6-arm · 2c66fa7e
      Linus Torvalds authored
      * master.kernel.org:/home/rmk/linux-2.6-arm:
        [ARM] 5507/1: support R_ARM_MOVW_ABS_NC and MOVT_ABS relocation types
        [ARM] 5506/1: davinci: DMA_32BIT_MASK --> DMA_BIT_MASK(32)
        i.MX31: Disable CPU_32v6K in mx3_defconfig.
        mx3fb: Fix compilation with CONFIG_PM
        mx27ads: move PBC mapping out of vmalloc space
        MXC: remove BUG_ON in interrupt handler
        mx31: remove mx31moboard_defconfig
        ARM: ARCH_MXC should select HAVE_CLK
        mxc : BUG in imx_dma_request
        mxc : Clean up properly when imx_dma_free() used without imx_dma_disable()
        [ARM] mv78xx0: update defconfig
        [ARM] orion5x: update defconfig
        [ARM] Kirkwood: update defconfig
        [ARM] Kconfig typo fix:  "PXA930" -> "CPU_PXA930".
        [ARM] S3C2412: Add missing cache flush in suspend code
        [ARM] S3C: Add UDIVSLOT support for newer UARTS
        [ARM] S3C64XX: Add S3C64XX_PA_IIS{0,1} to <mach/map.h>
      2c66fa7e
    • P
      [ARM] 5507/1: support R_ARM_MOVW_ABS_NC and MOVT_ABS relocation types · ae51e609
      Paul Gortmaker authored
      From: Bruce Ashfield <bruce.ashfield@windriver.com>
      
      To fully support the armv7-a instruction set/optimizations, support
      for the R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS relocation types is
      required.
      
      The MOVW and MOVT are both load-immediate instructions, MOVW loads 16
      bits into the bottom half of a register, and MOVT loads 16 bits into the
      top half of a register.
      
      The relocation information for these instructions has a full 32 bit
      value, plus an addend which is stored in the 16 immediate bits in the
      instruction itself.  The immediate bits in the instruction are not
      contiguous (the register # splits it into a 4 bit and 12 bit value),
      so the addend has to be extracted accordingly and added to the value.
      The value is then split and put into the instruction; a MOVW uses the
      bottom 16 bits of the value, and a MOVT uses the top 16 bits.
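      
      A sketch of the relocation handling in the module loader (masks per
      the ARM instruction encoding; the surrounding code is assumed):
      
          case R_ARM_MOVW_ABS_NC:
          case R_ARM_MOVT_ABS:
                  /* Extract the non-contiguous imm4:imm12 addend. */
                  offset = ((insn & 0x000f0000) >> 4) | (insn & 0x00000fff);
                  offset = (offset ^ 0x8000) - 0x8000;    /* sign-extend */
      
                  offset += sym->st_value;
                  if (ELF32_R_TYPE(rel->r_info) == R_ARM_MOVT_ABS)
                          offset >>= 16;          /* MOVT takes the top half */
      
                  /* Re-split the 16-bit result back into imm4:imm12. */
                  insn &= 0xfff0f000;
                  insn |= ((offset & 0xf000) << 4) | (offset & 0x0fff);
                  *(u32 *)loc = insn;
                  break;
      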
      Signed-off-by: David Borman <david.borman@windriver.com>
      Signed-off-by: Bruce Ashfield <bruce.ashfield@windriver.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      ae51e609
  7. 07 May 2009 — 14 commits
    • K
      [ARM] 5506/1: davinci: DMA_32BIT_MASK --> DMA_BIT_MASK(32) · a029b706
      Kevin Hilman authored
      As per commit 284901a9, use DMA_BIT_MASK(n) in place of the old
      fixed-width mask macros.
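      
      The change is mechanical; for example (the field name here is only
      illustrative):
      
          -       .coherent_dma_mask = DMA_32BIT_MASK,
          +       .coherent_dma_mask = DMA_BIT_MASK(32),
      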
      Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      a029b706
    • D
      sched: emit thread info flags with stack trace · aa47b7e0
      David Rientjes authored
      When a thread is oom killed and fails to exit, it's helpful to know which
      threads have access to memory reserves if the machine livelocks.  This is
      done by testing for the TIF_MEMDIE thread info flag and should be
      displayed alongside stack traces to identify tasks that have access to
      such reserves but are still stuck allocating pages, for instance.
      
      It would probably be helpful in other cases as well, so all thread info
      flags are emitted when showing a task.
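      
      A sketch of the extra output in sched_show_task() (the format details
      are assumed):
      
          /* Emit the raw thread info flags next to the usual task line,
           * so e.g. TIF_MEMDIE holders are visible in the dump. */
          printk(KERN_CONT "%5lu %5d %6d 0x%08lx\n", free,
                 task_pid_nr(p), task_pid_nr(p->real_parent),
                 (unsigned long)task_thread_info(p)->flags);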
      
      ( v2: fix warning reported by Stephen Rothwell )
      
      [ Impact: extend debug printout info ]
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      LKML-Reference: <alpine.DEB.2.00.0905040136390.15831@chino.kir.corp.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      aa47b7e0
    • N
      md: remove rd%d links immediately after stopping an array. · c4647292
      NeilBrown authored
      md maintains links in /sys/block/mdXX/md/ to identify which device has
      which role in the array, e.g.
         rd2 -> dev-sda
      
      indicates that the device with role '2' in the array is sda.
      
      These links are only present when the array is active.  They are
      created immediately after ->run is called, and so should be removed
      immediately after ->stop is called.
      However they are currently removed a little bit later, and it is
      possible for ->run to be called again, thus adding these links, before
      they are removed.
      
      So move the removal earlier so they are consistently only present when
      the array is active.
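      
      A sketch of the removal at stop time (do_md_stop() context assumed):
      
          mdk_rdev_t *rdev;
          char nm[20];
      
          /* Drop the role links as soon as ->stop has run. */
          list_for_each_entry(rdev, &mddev->disks, same_set)
                  if (rdev->raid_disk >= 0) {
                          sprintf(nm, "rd%d", rdev->raid_disk);
                          sysfs_remove_link(&mddev->kobj, nm);
                  }
      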
      Signed-off-by: NeilBrown <neilb@suse.de>
      c4647292
    • N
      md: remove ability to explicit set an inactive array to 'clean'. · 5bf29597
      NeilBrown authored
      Being able to write 'clean' to an 'array_state' of an inactive array
      to activate it in 'clean' mode is both unnecessary and inconvenient.
      
      It is unnecessary because the same can be achieved by writing
      'active'.  This activates an array, but it still remains 'clean'
      until the first write.
      
      It is inconvenient because writing 'clean' is more often used to
      cause an 'active' array to revert to 'clean' mode (thus blocking
      any writes until a 'write-pending' is promoted to 'active').
      
      Allowing 'clean' to both activate an array and mark an active array as
      clean can lead to races: one program writes 'clean' to mark the
      active array as clean at the same time as another program writes
      'inactive' to deactivate (stop) an active array.  Depending on which
      write lands first, the array could be deactivated and immediately
      reactivated, which isn't what was desired.
      
      So just disable the use of 'clean' to activate an array.
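      
      A sketch of the behavioural change in array_state_store() (an
      assumption about the exact code, following the description above):
      
          case clean:
                  if (mddev->pers) {
                          /* active -> clean transition still allowed */
                          /* ... existing restripe-to-clean path ... */
                  } else
                          err = -EINVAL;  /* no longer activates the array */
                  break;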
      
      This avoids a race that can be triggered with mdadm-3.0 and external
      metadata, so it is suitable for -stable.
      Reported-by: Rafal Marszewski <rafal.marszewski@intel.com>
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NeilBrown <neilb@suse.de>
      5bf29597
    • J
      md: constify VFTs · 110518bc
      Jan Engelhardt authored
      Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
      110518bc
    • N
      md: tidy up status_resync to handle large arrays. · dd71cf6b
      NeilBrown authored
      Two problems in status_resync:
      1/ It still used kilobytes as the basic block unit, while most code
         now uses sectors uniformly.
      2/ It doesn't allow for the possibility that max_sectors exceeds
         the range of "unsigned long".
      
      So
       - change "max_blocks" to "max_sectors", and store sector numbers
         in there and in 'resync'
       - Make 'rt' a 'sector_t' so it can temporarily hold the number of
         remaining sectors.
       - use sector_div rather than normal division.
       - change the magic '100' used to preserve precision to '32'.
         + making it a power of 2 makes division easier
         + it doesn't need to be as large as it was chosen when we averaged
           speed over the entire run.  Now we average speed over the last 30
           seconds or so.
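      
      A sketch of the resulting arithmetic (variable names from the list
      above; the exact context is assumed):
      
          sector_t rt = max_sectors - resync;   /* remaining sectors */
      
          /* db = sectors done in the recent window, dt = elapsed seconds */
          sector_div(rt, db/32 + 1);
          rt *= dt;
          rt >>= 5;                             /* undo the 32x scaling */
      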
      Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
      Signed-off-by: NeilBrown <neilb@suse.de>
      dd71cf6b
    • N
      md: fix some (more) errors with bitmaps on devices larger than 2TB. · db305e50
      NeilBrown authored
      If a write-intent bitmap covers more than 2TB, we sometimes work with
      values beyond 32 bits, so these need to be sector_t.  This patch
      adds the required casts to some unsigned longs that are being shifted
      up.
      
      This will affect any raid10 larger than 2TB, or any raid1/4/5/6 with
      member devices that are larger than 2TB.
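      
      The kind of fix involved (a sketch; the macro name is from md's bitmap
      code, but the exact sites are assumed):
      
          /* Widen before shifting so the sector offset can't truncate
           * at 32 bits on large arrays. */
          offset = (sector_t)chunk << CHUNK_BLOCK_SHIFT(bitmap);
      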
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
      Cc: stable@kernel.org
      db305e50
    • N
      md/raid10: don't clear bitmap during recovery if array will still be degraded. · 18055569
      NeilBrown authored
      If we have a raid10 with multiple missing devices, and we recover just
      one of these to a spare, then we risk (depending on the bitmap and
      array chunk size) clearing bits of the bitmap for which recovery isn't
      complete (because a device is still missing).
      
      This can lead to a subsequent "re-add" being recovered without
      any IO happening, which would result in loss of data.
      
      This patch takes the safe approach of not clearing bitmap bits
      if the array will still be degraded.
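      
      A sketch of the safety check in the raid10 resync path (field names
      assumed):
      
          int j, still_degraded = 0;
      
          /* If any mirror in this set is still missing, recovery cannot
           * make the range fully clean; tell the bitmap code so it keeps
           * the bits set. */
          for (j = 0; j < conf->copies; j++)
                  if (conf->mirrors[j].rdev == NULL ||
                      test_bit(Faulty, &conf->mirrors[j].rdev->flags))
                          still_degraded = 1;
      
          bitmap_start_sync(mddev->bitmap, sect, &sync_blocks,
                            still_degraded);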
      
      This patch is suitable for all active -stable kernels.
      
      Cc: stable@kernel.org
      Signed-off-by: NeilBrown <neilb@suse.de>
      18055569
    • N
      md: fix loading of out-of-date bitmap. · b74fd282
      NeilBrown authored
      When md is loading a bitmap which it knows is out of date, it fills
      each page with 1s and writes it back out again.  However the
      write_page call makes use of bitmap->file_pages and
      bitmap->last_page_size, which haven't been set correctly yet.  So this
      can sometimes fail.
      
      Move the setting of file_pages and last_page_size to before the call
      to write_page.
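      
      The shape of the reorder (a sketch, not the literal hunk):
      
          /* Publish the accounting write_page() depends on... */
          bitmap->file_pages++;
          bitmap->last_page_size = count;
      
          /* ...before the freshly filled page goes out. */
          write_page(bitmap, page, 1);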
      
      This bug can cause the assembly on an array to fail, thus making the
      data inaccessible.  Hence I think it is a suitable candidate for
      -stable.
      
      Cc: stable@kernel.org
      Reported-by: Vojtech Pavlik <vojtech@suse.cz>
      Signed-off-by: NeilBrown <neilb@suse.de>
      b74fd282
    • A
      drivers/base/iommu.c: add missing includes · 60db4027
      Andrew Morton authored
      Fix zillions of -mm x86_64 allmodconfig build errors - the file uses
      EXPORT_SYMBOL() and kmalloc() but lacks the needed includes.
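      
      The usual homes for those symbols (likely what was added):
      
          #include <linux/module.h>   /* EXPORT_SYMBOL() */
          #include <linux/slab.h>     /* kmalloc() */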
      
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      60db4027
    • E
      initramfs: clean up messages related to initramfs unpacking · a1e6b6c1
      Eric Piel authored
      With the removal of the duplicate unpack_to_rootfs() (commit
      df52092f), the messages displayed do not
      actually correspond to what the kernel is doing.  In addition, depending
      on whether ramdisks are supported, the messages are not at all the same.
      
      So keep the messages more in sync with what the kernel is really doing,
      and only display a second message in case of failure.  This also ensures
      that the printk message cannot be split by other printks.
      Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a1e6b6c1
    • D
      nommu: make the initial mmap allocation excess behaviour Kconfig configurable · fc4d5c29
      David Howells authored
      NOMMU mmap() has an option controlled by a sysctl variable that determines
      whether the allocations made by do_mmap_private() should have the excess
      space trimmed off and returned to the allocator.  Make the initial setting
      of this variable a Kconfig configuration option.
      
      The reason there can be excess space is that the allocator only allocates
      in power-of-2 sized chunks, but mmap() requests can be of sizes that aren't
      a power of 2.
      
      There are two alternatives:
      
       (1) Keep the excess as dead space.  The dead space then remains unused for the
           lifetime of the mapping.  Mappings of shared objects such as libc, ld.so
           or busybox's text segment may retain their dead space forever.
      
       (2) Return the excess to the allocator.  This means that the dead space is
           limited to less than a page per mapping, but it means that for a transient
           process, there's more chance of fragmentation as the excess space may be
           reused fairly quickly.
      
      During the boot process, a lot of transient processes are created, and
      this can cause a lot of fragmentation as the pagecache and various slabs
      grow greatly during this time.
      
      By turning off the trimming of excess space during boot and disabling
      batching of frees, Coldfire can manage to boot.
      
      A better way of doing things might be to have /sbin/init turn this option
      off.  By that point libc, ld.so and init - which are all long-duration
      processes - have all been loaded and trimmed.
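      
      A sketch of the wiring, assuming a Kconfig symbol along the lines of
      NOMMU_INITIAL_TRIM_EXCESS seeding the existing sysctl:
      
          /* mm/nommu.c: boot-time default for /proc/sys/vm/nr_trim_pages;
           * 0 disables trimming, 1 trims everything above one page. */
          int sysctl_nr_trim_pages = CONFIG_NOMMU_INITIAL_TRIM_EXCESS;
      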
      Reported-by: Lanttor Guo <lanttor.guo@freescale.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fc4d5c29
    • D
      nommu: clamp zone_batchsize() to 0 under NOMMU conditions · 3a6be87f
      David Howells authored
      Clamp zone_batchsize() to 0 under NOMMU conditions to stop
      free_hot_cold_page() from queueing and batching frees.
      
      The problem is that under NOMMU conditions it is really important to be
      able to allocate large contiguous chunks of memory, but when munmap() or
      exit_mmap() releases big stretches of memory, return of these to the buddy
      allocator can be deferred, and when it does finally happen, it can be in
      small chunks.
      
      Whilst the fragmentation this incurs isn't so much of a problem under MMU
      conditions as userspace VM is glued together from individual pages with
      the aid of the MMU, it is a real problem if there isn't an MMU.
      
      By clamping the page freeing queue size to 0, pages are returned to the
      allocator immediately, and the buddy detector is more likely to be able to
      glue them together into large chunks immediately, and fragmentation is
      less likely to occur.
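      
      A sketch of the clamp in zone_batchsize() (mm/page_alloc.c; the
      #ifdef placement is assumed):
      
          static int zone_batchsize(struct zone *zone)
          {
          #ifdef CONFIG_MMU
                  int batch;
      
                  /* ... existing sizing heuristic ... */
                  return batch;
          #else
                  /* NOMMU: deferred batching fragments the buddy pool,
                   * so hand pages straight back. */
                  return 0;
          #endif
          }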
      
      By disabling batching of frees, and by turning off the trimming of excess
      space during boot, Coldfire can manage to boot.
      Reported-by: Lanttor Guo <lanttor.guo@freescale.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3a6be87f
    • D
      mm: use rounddown_pow_of_two() in zone_batchsize() · 9155203a
      David Howells authored
      Use rounddown_pow_of_two(N) in zone_batchsize() rather than (1 <<
      (fls(N)-1)), as they are equivalent and the former makes it easier to
      see what is going on.
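      
      The substitution, plausibly (the expression is from zone_batchsize();
      the exact line is assumed):
      
          /* Before: batch = (1 << (fls(batch + batch/2) - 1)) - 1; */
          batch = rounddown_pow_of_two(batch + batch/2) - 1;
      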
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9155203a