1. 28 10月, 2005 2 次提交
  2. 27 10月, 2005 1 次提交
  3. 21 10月, 2005 1 次提交
  4. 20 10月, 2005 3 次提交
    • Y
      [PATCH] swiotlb: make sure initial DMA allocations really are in DMA memory · 281dd25c
      Yasunori Goto 提交于
      This introduces a limit parameter to the core bootmem allocator; The new
      parameter indicates that physical memory allocated by the bootmem
      allocator should be within the requested limit.
      
      We also introduce alloc_bootmem_low_pages_limit, alloc_bootmem_node_limit,
      alloc_bootmem_low_pages_node_limit apis, but alloc_bootmem_low_pages_limit
      is the only api used for swiotlb.
      
      The existing alloc_bootmem_low_pages() api could instead have been
      changed and made to pass right limit to the core allocator.  But that
      would make the patch more intrusive for 2.6.14, as other arches use
      alloc_bootmem_low_pages().  We may be done that post 2.6.14 as a
      cleanup.
      
      With this, swiotlb gets memory within 4G for both x86_64 and ia64
      arches.
      Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Ravikiran G Thirumalai <kiran@scalex86.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      281dd25c
    • H
      [PATCH] mm: hugetlb truncation fixes · 1c59827d
      Hugh Dickins 提交于
      hugetlbfs allows truncation of its files (should it?), but hugetlb.c often
      forgets that: crashes and misaccounting ensue.
      
      copy_hugetlb_page_range better grab the src page_table_lock since we don't
      want to guess what happens if concurrently truncated.  unmap_hugepage_range
      rss accounting must not assume the full range was mapped.  follow_hugetlb_page
      must guard with page_table_lock and be prepared to exit early.
      
      Restyle copy_hugetlb_page_range with a for loop like the others there.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1c59827d
    • S
      [PATCH] Handle spurious page fault for hugetlb region · 3359b54c
      Seth, Rohit 提交于
      The hugetlb pages are currently pre-faulted.  At the time of mmap of
      hugepages, we populate the new PTEs.  It is possible that HW has already
      cached some of the unused PTEs internally.  These stale entries never
      get a chance to be purged in existing control flow.
      
      This patch extends the check in page fault code for hugepages.  Check if
      a faulted address falls with in size for the hugetlb file backing it.
      We return VM_FAULT_MINOR for these cases (assuming that the arch
      specific page-faulting code purges the stale entry for the archs that
      need it).
      Signed-off-by: NRohit Seth <rohit.seth@intel.com>
      
      [ This is apparently arguably an ia64 port bug. But the code won't
        hurt, and for now it fixes a real problem on some ia64 machines ]
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3359b54c
  5. 17 10月, 2005 1 次提交
    • L
      Fix memory ordering bug in page reclaim · 3d80636a
      Linus Torvalds 提交于
      As noticed by Nick Piggin, we need to make sure that we check the page
      count before we check for PageDirty, since the dirty check is only valid
      if the count implies that we're the only possible ones holding the page.
      
      We always did do this, but the code needs a read-memory-barrier to make
      sure that the orderign is also honored by the CPU.
      
      (The writer side is ordered due to the atomic decrement and test on the
      page count, see the discussion on linux-kernel)
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      3d80636a
  6. 12 10月, 2005 2 次提交
  7. 09 10月, 2005 1 次提交
  8. 01 10月, 2005 1 次提交
  9. 28 9月, 2005 2 次提交
  10. 24 9月, 2005 1 次提交
  11. 23 9月, 2005 4 次提交
    • R
      [PATCH] Fix bd_claim() error code. · f7b3a435
      Rob Landley 提交于
      Problem: In some circumstances, bd_claim() is returning the wrong error
      code.
      
      If we try to swapon an unused block device that isn't swap formatted, we
      get -EINVAL.  But if that same block device is already mounted, we instead
      get -EBUSY, even though it still isn't a valid swap device.
      
      This issue came up on the busybox list trying to get the error message
      from "swapon -a" right.  If a swap device is already enabled, we get -EBUSY,
      and we shouldn't report this as an error.  But we can't distinguish the two
      -EBUSY conditions, which are very different errors.
      
      In the code, bd_claim() returns either 0 or -EBUSY, but in this case busy
      means "somebody other than sys_swapon has already claimed this", and
      _that_ means this block device can't be a valid swap device.  So return
      -EINVAL there.
      Signed-off-by: NRob Landley <rob@landley.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      f7b3a435
    • C
      [PATCH] __kmalloc: Generate BUG if size requested is too large. · eafb4270
      Christoph Lameter 提交于
      I had an issue on ia64 where I got a bug in kernel/workqueue because
      kzalloc returned a NULL pointer due to the task structure getting too big
      for the slab allocator.  Usually these cases are caught by the kmalloc
      macro in include/linux/slab.h.
      
      Compilation will fail if a too big value is passed to kmalloc.
      
      However, kzalloc uses __kmalloc which has no check for that.  This patch
      makes __kmalloc bug if a too large entity is requested.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      eafb4270
    • C
      [PATCH] slab: fix handling of pages from foreign NUMA nodes · ff69416e
      Christoph Lameter 提交于
      The numa slab allocator may allocate pages from foreign nodes onto the
      lists for a particular node if a node runs out of memory.  Inspecting the
      slab->nodeid field will not reflect that the page is now in use for the
      slabs of another node.
      
      This patch fixes that issue by adding a node field to free_block so that
      the caller can indicate which node currently uses a slab.
      
      Also removes the check for the current node from kmalloc_cache_node since
      the process may shift later to another node which may lead to an allocation
      on another node than intended.
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      ff69416e
    • I
      [PATCH] slab: alpha inlining fix · 7243cc05
      Ivan Kokshaysky 提交于
      It is essential that index_of() be inlined.  But alpha undoes the gcc
      inlining hackery and index_of() ends up out-of-line.  So fiddle with things
      to make that function inline again.
      
      Cc: Richard Henderson <rth@twiddle.net>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      7243cc05
  12. 22 9月, 2005 2 次提交
  13. 18 9月, 2005 1 次提交
  14. 15 9月, 2005 2 次提交
    • A
      [PATCH] Fix slab BUG_ON() triggered by change in array cache size · c7e43c78
      Alok Kataria 提交于
      With the new changes that we made in the initialization of the slab
      allocator, we first setup the cache from which array caches are allocated,
      and then the cache, from which kmem_list3's are allocated.
      
      Now if the array cache comes from a cache in which objsize > 32, (in this
      instance size-64) then, first size-64 cache will be allocated and then the
      size-128 (if this is the cache from which kmem_list3's are going to be
      allocated).
      
      So with these new changes, we are not guaranteed that we will be
      initializing the malloc_sizes array in a serialized order. Thus there is
      a bug in __find_general_cachep, as we are checking whether the first
      cache_sizes ptr is NULL.
      
      This is replaced by checking whether the array-cache cache is initialized.
      Attached is a patch which does that.  Boots fine on a x86-64, with
      DEBUG_SPIN, DEBUG_SLAB, and preempt.
      
      Attached is a patch which does that.  Boots fine on a x86-64, with
      DEBUG_SPIN, DEBUG_SLAB, and preempt.Thanks & Regards, Alok
      Signed-off-by: NAlok N Kataria <alokk@calsoftinc.com>
      Signed-off-by: Shobhit Dayal <shobhitdayal.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Christoph Lameter <christoph@lameter.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      c7e43c78
    • H
      [PATCH] error path in setup_arg_pages() misses vm_unacct_memory() · 2fd4ef85
      Hugh Dickins 提交于
      Pavel Emelianov and Kirill Korotaev observe that fs and arch users of
      security_vm_enough_memory tend to forget to vm_unacct_memory when a
      failure occurs further down (typically in setup_arg_pages variants).
      
      These are all users of insert_vm_struct, and that reservation will only
      be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some
      cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't.
      
      So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into
      Committed_AS each time they're run.  But don't add VM_ACCOUNT to them,
      it's inappropriate to reserve against the very unlikely case that gdb
      be used to COW a vDSO page - we ought to do something about that in
      do_wp_page, but there are yet other inconsistencies to be resolved.
      
      The safe and economical way to fix this is to let insert_vm_struct do
      the security_vm_enough_memory check when it finds VM_ACCOUNT is set.
      
      And the MIPS irix_brk has been calling security_vm_enough_memory before
      calling do_brk which repeats it, doubly accounting and so also leaking.
      Remove that, and all the fs and arch calls to security_vm_enough_memory:
      give it a less misleading name later on.
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-Off-By: NKirill Korotaev <dev@sw.ru>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      2fd4ef85
  15. 13 9月, 2005 4 次提交
  16. 12 9月, 2005 1 次提交
    • G
      [PATCH] uclinux: add NULL check, 0 end valid check and some more exports to nommu.c · 66aa2b4b
      Greg Ungerer 提交于
      Move call to get_mm_counter() in update_mem_hiwater() to be
      inside the check for tsk->mm being null. Otherwise you can be
      following a null pointer here. This patch submitted by
      Javier Herrero <jherrero@hvsistemas.es>.
      
      Modify the end check for munmap regions to allow for the
      legacy behavior of 0 being valid. Pretty much all current
      uClinux system libc malloc's pass in 0 as the end point.
      A hard check will fail on these, so change the check so
      that if it is non-zero it must be valid otherwise it fails.
      A passed in value will always succeed (as it used too).
      
      Also export a few more mm system functions - to be consistent
      with the VM code exports.
      Signed-off-by: NGreg Ungerer <gerg@uclinux.com>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      66aa2b4b
  17. 11 9月, 2005 5 次提交
  18. 10 9月, 2005 5 次提交
    • I
      [PATCH] timer initialization cleanup: DEFINE_TIMER · 8d06afab
      Ingo Molnar 提交于
      Clean up timer initialization by introducing DEFINE_TIMER a'la
      DEFINE_SPINLOCK.  Build and boot-tested on x86.  A similar patch has been
      been in the -RT tree for some time.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      8d06afab
    • P
      [PATCH] update kfree, vfree, and vunmap kerneldoc · 80e93eff
      Pekka Enberg 提交于
      This patch clarifies NULL handling of kfree() and vfree().  I addition,
      wording of calling context restriction for vfree() and vunmap() are changed
      from "may not" to "must not."
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: NManfred Spraul <manfred@colorfullife.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      80e93eff
    • C
      [PATCH] Numa-aware slab allocator V5 · e498be7d
      Christoph Lameter 提交于
      The NUMA API change that introduced kmalloc_node was accepted for
      2.6.12-rc3.  Now it is possible to do slab allocations on a node to
      localize memory structures.  This API was used by the pageset localization
      patch and the block layer localization patch now in mm.  The existing
      kmalloc_node is slow since it simply searches through all pages of the slab
      to find a page that is on the node requested.  The two patches do a one
      time allocation of slab structures at initialization and therefore the
      speed of kmalloc node does not matter.
      
      This patch allows kmalloc_node to be as fast as kmalloc by introducing node
      specific page lists for partial, free and full slabs.  Slab allocation
      improves in a NUMA system so that we are seeing a performance gain in AIM7
      of about 5% with this patch alone.
      
      More NUMA localizations are possible if kmalloc_node operates in an fast
      way like kmalloc.
      
      Test run on a 32p systems with 32G Ram.
      
      w/o patch
      Tasks    jobs/min  jti  jobs/min/task      real       cpu
          1      485.36  100       485.3640     11.99      1.91   Sat Apr 30 14:01:51 2005
        100    26582.63   88       265.8263     21.89    144.96   Sat Apr 30 14:02:14 2005
        200    29866.83   81       149.3342     38.97    286.08   Sat Apr 30 14:02:53 2005
        300    33127.16   78       110.4239     52.71    426.54   Sat Apr 30 14:03:46 2005
        400    34889.47   80        87.2237     66.72    568.90   Sat Apr 30 14:04:53 2005
        500    35654.34   76        71.3087     81.62    714.55   Sat Apr 30 14:06:15 2005
        600    36460.83   75        60.7681     95.77    853.42   Sat Apr 30 14:07:51 2005
        700    35957.00   75        51.3671    113.30    990.67   Sat Apr 30 14:09:45 2005
        800    33380.65   73        41.7258    139.48   1140.86   Sat Apr 30 14:12:05 2005
        900    35095.01   76        38.9945    149.25   1281.30   Sat Apr 30 14:14:35 2005
       1000    36094.37   74        36.0944    161.24   1419.66   Sat Apr 30 14:17:17 2005
      
      w/patch
      Tasks    jobs/min  jti  jobs/min/task      real       cpu
          1      484.27  100       484.2736     12.02      1.93   Sat Apr 30 15:59:45 2005
        100    28262.03   90       282.6203     20.59    143.57   Sat Apr 30 16:00:06 2005
        200    32246.45   82       161.2322     36.10    282.89   Sat Apr 30 16:00:42 2005
        300    37945.80   83       126.4860     46.01    418.75   Sat Apr 30 16:01:28 2005
        400    40000.69   81       100.0017     58.20    561.48   Sat Apr 30 16:02:27 2005
        500    40976.10   78        81.9522     71.02    696.95   Sat Apr 30 16:03:38 2005
        600    41121.54   78        68.5359     84.92    834.86   Sat Apr 30 16:05:04 2005
        700    44052.77   78        62.9325     92.48    971.53   Sat Apr 30 16:06:37 2005
        800    41066.89   79        51.3336    113.38   1111.15   Sat Apr 30 16:08:31 2005
        900    38918.77   79        43.2431    134.59   1252.57   Sat Apr 30 16:10:46 2005
       1000    41842.21   76        41.8422    139.09   1392.33   Sat Apr 30 16:13:05 2005
      
      These are measurement taken directly after boot and show a greater
      improvement than 5%.  However, the performance improvements become less
      over time if the AIM7 runs are repeated and settle down at around 5%.
      
      Links to earlier discussions:
      http://marc.theaimsgroup.com/?t=111094594500003&r=1&w=2
      http://marc.theaimsgroup.com/?t=111603406600002&r=1&w=2
      
      Changelog V4-V5:
      - alloc_arraycache and alloc_aliencache take node parameter instead of cpu
      - fix initialization so that nodes without cpus are properly handled.
      - simplify code in kmem_cache_init
      - patch against Andrews temp mm3 release
      - Add Shai to credits
      - fallback to __cache_alloc from __cache_alloc_node if the node's cache
        is not available yet.
      
      Changelog V3-V4:
      - Patch against 2.6.12-rc5-mm1
      - Cleanup patch integrated
      - More and better use of for_each_node and for_each_cpu
      - GCC 2.95 fix (do not use [] use [0])
      - Correct determination of INDEX_AC
      - Remove hack to cause an error on platforms that have no CONFIG_NUMA but nodes.
      - Remove list3_data and list3_data_ptr macros for better readability
      
      Changelog V2-V3:
      - Made to patch against 2.6.12-rc4-mm1
      - Revised bootstrap mechanism so that larger size kmem_list3 structs can be
        supported. Do a generic solution so that the right slab can be found
        for the internal structs.
      - use for_each_online_node
      
      Changelog V1-V2:
      - Batching for freeing of wrong-node objects (alien caches)
      - Locking changes and NUMA #ifdefs as requested by Manfred
      Signed-off-by: NAlok N Kataria <alokk@calsoftinc.com>
      Signed-off-by: NShobhit Dayal <shobhit@calsoftinc.com>
      Signed-off-by: NShai Fultheim <Shai@Scalex86.org>
      Signed-off-by: NChristoph Lameter <clameter@sgi.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      e498be7d
    • S
      [PATCH] tmpfs: Enable atomic inode security labeling · 570bc1c2
      Stephen Smalley 提交于
      This patch modifies tmpfs to call the inode_init_security LSM hook to set
      up the incore inode security state for new inodes before the inode becomes
      accessible via the dcache.
      
      As there is no underlying storage of security xattrs in this case, it is
      not necessary for the hook to return the (name, value, len) triple to the
      tmpfs code, so this patch also modifies the SELinux hook function to
      correctly handle the case where the (name, value, len) pointers are NULL.
      
      The hook call is needed in tmpfs in order to support proper security
      labeling of tmpfs inodes (e.g.  for udev with tmpfs /dev in Fedora).  With
      this change in place, we should then be able to remove the
      security_inode_post_create/mkdir/...  hooks safely.
      Signed-off-by: NStephen Smalley <sds@tycho.nsa.gov>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      570bc1c2
    • M
      [PATCH] update filesystems for new delete_inode behavior · fef26658
      Mark Fasheh 提交于
      Update the file systems in fs/ implementing a delete_inode() callback to
      call truncate_inode_pages().  One implementation note: In developing this
      patch I put the calls to truncate_inode_pages() at the very top of those
      filesystems delete_inode() callbacks in order to retain the previous
      behavior.  I'm guessing that some of those could probably be optimized.
      Signed-off-by: NMark Fasheh <mark.fasheh@oracle.com>
      Acked-by: NChristoph Hellwig <hch@infradead.org>
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fef26658
  19. 09 9月, 2005 1 次提交