1. 15 Sep, 2005 1 commit
    • A
      [PATCH] Fix slab BUG_ON() triggered by change in array cache size · c7e43c78
      Alok Kataria committed
      With the new changes that we made in the initialization of the slab
      allocator, we first set up the cache from which array caches are allocated,
      and then the cache from which kmem_list3's are allocated.
      
      Now if the array cache comes from a cache in which objsize > 32 (in this
      instance size-64), then the size-64 cache will be allocated first, followed
      by size-128 (if this is the cache from which kmem_list3's are going to be
      allocated).
      
      So with these new changes, we are not guaranteed that we will be
      initializing the malloc_sizes array in a serialized order. Thus there is
      a bug in __find_general_cachep, as we are checking whether the first
      cache_sizes ptr is NULL.
      
      This is replaced by checking whether the array-cache cache is initialized.
      Attached is a patch which does that.  Boots fine on an x86-64, with
      DEBUG_SPIN, DEBUG_SLAB, and preempt.
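
The ordering problem described above can be illustrated with a small userspace sketch. It is not the kernel code: every `sketch_` name is hypothetical, the table has only four size classes, and it assumes (as in the instance described) that array caches come from the size-64 entry. It only shows why a readiness check on the first table entry misfires once entries are no longer initialized in order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's size-class table. */
struct sketch_cache { size_t objsize; };

struct sketch_cache_sizes {
    size_t size;
    struct sketch_cache *cachep;   /* NULL until this size class is set up */
};

enum { SKETCH_NR_SIZES = 4 };
enum { SKETCH_INDEX_AC = 1 };      /* assumption: array caches use size-64 */

static struct sketch_cache sketch_backing[SKETCH_NR_SIZES];
static struct sketch_cache_sizes sketch_sizes[SKETCH_NR_SIZES] = {
    { 32, NULL }, { 64, NULL }, { 128, NULL }, { 256, NULL },
};

/* Set up one size class; bootstrap may call this in any order. */
static void sketch_init_size(int idx)
{
    assert(idx >= 0 && idx < SKETCH_NR_SIZES);
    sketch_backing[idx].objsize = sketch_sizes[idx].size;
    sketch_sizes[idx].cachep = &sketch_backing[idx];
}

/* Old-style check: assumes the first entry is always initialized first. */
static int sketch_ready_old(void)
{
    return sketch_sizes[0].cachep != NULL;
}

/* New-style check: tests the cache that bootstrap sets up first. */
static int sketch_ready_new(void)
{
    return sketch_sizes[SKETCH_INDEX_AC].cachep != NULL;
}
```

After `sketch_init_size(SKETCH_INDEX_AC)` alone, the old-style check still reports "not ready" while the new-style check reports "ready", which mirrors the spurious BUG_ON() the patch removes.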
      
      Thanks & Regards, Alok
      Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
      Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Christoph Lameter <christoph@lameter.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 11 Sep, 2005 1 commit
  3. 10 Sep, 2005 2 commits
    • P
      [PATCH] update kfree, vfree, and vunmap kerneldoc · 80e93eff
      Pekka Enberg committed
      This patch clarifies NULL handling of kfree() and vfree().  In addition,
      the wording of the calling context restriction for vfree() and vunmap() is
      changed from "may not" to "must not."
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • C
      [PATCH] Numa-aware slab allocator V5 · e498be7d
      Christoph Lameter committed
      The NUMA API change that introduced kmalloc_node was accepted for
      2.6.12-rc3.  Now it is possible to do slab allocations on a node to
      localize memory structures.  This API was used by the pageset localization
      patch and the block layer localization patch now in mm.  The existing
      kmalloc_node is slow since it simply searches through all pages of the slab
      to find a page that is on the node requested.  The two patches do a one
      time allocation of slab structures at initialization and therefore the
      speed of kmalloc_node does not matter.
      
      This patch allows kmalloc_node to be as fast as kmalloc by introducing node
      specific page lists for partial, free and full slabs.  Slab allocation
      improves in a NUMA system so that we are seeing a performance gain in AIM7
      of about 5% with this patch alone.
      
      More NUMA localizations are possible if kmalloc_node operates in a fast
      way like kmalloc.
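
The idea of node-specific partial, free and full lists can be sketched in userspace as follows. This is a minimal model, not the patch itself: all `sketch_` names, the fixed node count and the singly linked lists are assumptions made for illustration. The point it shows is that a request for a given node only looks at that node's lists instead of searching every slab.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 4   /* assumption: small fixed node count */

struct sketch_slab {
    struct sketch_slab *next;
    int node;       /* node whose memory backs this slab */
    int inuse;      /* objects currently handed out */
    int capacity;   /* objects per slab */
};

/* One set of lists per node, instead of one global set. */
struct sketch_node_lists {
    struct sketch_slab *partial;
    struct sketch_slab *full;
    struct sketch_slab *free;
};

struct sketch_cache {
    struct sketch_node_lists nodelists[SKETCH_MAX_NODES];
};

static void sketch_push(struct sketch_slab **head, struct sketch_slab *s)
{
    s->next = *head;
    *head = s;
}

static struct sketch_slab *sketch_pop(struct sketch_slab **head)
{
    struct sketch_slab *s = *head;
    if (s) {
        *head = s->next;
        s->next = NULL;
    }
    return s;
}

/* Register an empty slab on the free list of the node it belongs to. */
void sketch_add_free_slab(struct sketch_cache *c, struct sketch_slab *s)
{
    assert(s->node >= 0 && s->node < SKETCH_MAX_NODES && s->capacity > 0);
    s->inuse = 0;
    sketch_push(&c->nodelists[s->node].free, s);
}

/* Take one object from the given node: try partial slabs, then free ones.
 * Returns the slab that served the request, or NULL if the node has none. */
struct sketch_slab *sketch_alloc_on_node(struct sketch_cache *c, int node)
{
    if (node < 0 || node >= SKETCH_MAX_NODES)
        return NULL;

    struct sketch_node_lists *l = &c->nodelists[node];
    struct sketch_slab *s = sketch_pop(&l->partial);
    if (!s)
        s = sketch_pop(&l->free);
    if (!s)
        return NULL;

    s->inuse++;
    sketch_push(s->inuse == s->capacity ? &l->full : &l->partial, s);
    return s;
}
```

A real allocator would grow the cache or fall back to another node when `sketch_alloc_on_node` returns NULL; the sketch leaves that decision to the caller.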
      
      Test run on a 32p system with 32G RAM.
      
      w/o patch
      Tasks    jobs/min  jti  jobs/min/task      real       cpu
          1      485.36  100       485.3640     11.99      1.91   Sat Apr 30 14:01:51 2005
        100    26582.63   88       265.8263     21.89    144.96   Sat Apr 30 14:02:14 2005
        200    29866.83   81       149.3342     38.97    286.08   Sat Apr 30 14:02:53 2005
        300    33127.16   78       110.4239     52.71    426.54   Sat Apr 30 14:03:46 2005
        400    34889.47   80        87.2237     66.72    568.90   Sat Apr 30 14:04:53 2005
        500    35654.34   76        71.3087     81.62    714.55   Sat Apr 30 14:06:15 2005
        600    36460.83   75        60.7681     95.77    853.42   Sat Apr 30 14:07:51 2005
        700    35957.00   75        51.3671    113.30    990.67   Sat Apr 30 14:09:45 2005
        800    33380.65   73        41.7258    139.48   1140.86   Sat Apr 30 14:12:05 2005
        900    35095.01   76        38.9945    149.25   1281.30   Sat Apr 30 14:14:35 2005
       1000    36094.37   74        36.0944    161.24   1419.66   Sat Apr 30 14:17:17 2005
      
      w/patch
      Tasks    jobs/min  jti  jobs/min/task      real       cpu
          1      484.27  100       484.2736     12.02      1.93   Sat Apr 30 15:59:45 2005
        100    28262.03   90       282.6203     20.59    143.57   Sat Apr 30 16:00:06 2005
        200    32246.45   82       161.2322     36.10    282.89   Sat Apr 30 16:00:42 2005
        300    37945.80   83       126.4860     46.01    418.75   Sat Apr 30 16:01:28 2005
        400    40000.69   81       100.0017     58.20    561.48   Sat Apr 30 16:02:27 2005
        500    40976.10   78        81.9522     71.02    696.95   Sat Apr 30 16:03:38 2005
        600    41121.54   78        68.5359     84.92    834.86   Sat Apr 30 16:05:04 2005
        700    44052.77   78        62.9325     92.48    971.53   Sat Apr 30 16:06:37 2005
        800    41066.89   79        51.3336    113.38   1111.15   Sat Apr 30 16:08:31 2005
        900    38918.77   79        43.2431    134.59   1252.57   Sat Apr 30 16:10:46 2005
       1000    41842.21   76        41.8422    139.09   1392.33   Sat Apr 30 16:13:05 2005
      
      These are measurements taken directly after boot and show a greater
      improvement than 5%.  However, the performance improvements become less
      over time if the AIM7 runs are repeated and settle down at around 5%.
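
The relative gain between the two tables above can be checked with a one-line calculation. The helper below is a hypothetical illustration (the name `sketch_gain_percent` is not from the patch); it takes jobs/min without and with the patch and returns the change in percent.

```c
#include <assert.h>

/* Relative change in throughput, in percent, from `before` to `after`. */
double sketch_gain_percent(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

Fed with the jobs/min columns above, `sketch_gain_percent(26582.63, 28262.03)` gives roughly 6.3% for 100 tasks and `sketch_gain_percent(36094.37, 41842.21)` roughly 15.9% for 1000 tasks, consistent with the statement that the post-boot runs show more than 5%.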
      
      Links to earlier discussions:
      http://marc.theaimsgroup.com/?t=111094594500003&r=1&w=2
      http://marc.theaimsgroup.com/?t=111603406600002&r=1&w=2
      
      Changelog V4-V5:
      - alloc_arraycache and alloc_aliencache take node parameter instead of cpu
      - fix initialization so that nodes without cpus are properly handled.
      - simplify code in kmem_cache_init
      - patch against Andrew's temp mm3 release
      - Add Shai to credits
      - fall back to __cache_alloc from __cache_alloc_node if the node's cache
        is not available yet.
      
      Changelog V3-V4:
      - Patch against 2.6.12-rc5-mm1
      - Cleanup patch integrated
      - More and better use of for_each_node and for_each_cpu
      - GCC 2.95 fix (do not use [] use [0])
      - Correct determination of INDEX_AC
      - Remove hack to cause an error on platforms that have no CONFIG_NUMA but nodes.
      - Remove list3_data and list3_data_ptr macros for better readability
      
      Changelog V2-V3:
      - Made to patch against 2.6.12-rc4-mm1
      - Revised bootstrap mechanism so that larger size kmem_list3 structs can be
        supported. Do a generic solution so that the right slab can be found
        for the internal structs.
      - use for_each_online_node
      
      Changelog V1-V2:
      - Batching for freeing of wrong-node objects (alien caches)
      - Locking changes and NUMA #ifdefs as requested by Manfred
      Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
      Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
      Signed-off-by: Shai Fultheim <Shai@Scalex86.org>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  4. 08 Sep, 2005 1 commit
  5. 05 Sep, 2005 4 commits
  6. 08 Jul, 2005 1 commit
  7. 07 Jul, 2005 1 commit
  8. 24 Jun, 2005 1 commit
  9. 22 Jun, 2005 1 commit
    • C
      [PATCH] Periodically drain non local pagesets · 4ae7c039
      Christoph Lameter committed
      The pageset array can potentially acquire a huge amount of memory on large
      NUMA systems.  For example, on a system with 512 processors and 256 nodes
      there will be 256*512 pagesets.  If each pageset only holds 5 pages then we
      are talking about 655360 pages.  With a 16K page size on IA64 this results in
      potentially 10 Gigabytes of memory being trapped in pagesets.  The typical
      cases are much less for smaller systems but there is still the potential of
      memory being trapped in off node pagesets.  Off node memory may be rarely
      used if local memory is available and so we may potentially have memory in
      seldom used pagesets without this patch.
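
The arithmetic in the paragraph above can be reproduced with a short calculation. The helpers below are hypothetical (the `sketch_` names are not from the patch) and simply multiply out the figures given in the text: processors times nodes pagesets, pages per pageset, and page size.

```c
#include <assert.h>

/* Number of pagesets: one per (processor, node) pair, as counted above. */
unsigned long long sketch_pageset_count(unsigned int cpus, unsigned int nodes)
{
    return (unsigned long long)cpus * nodes;
}

/* Bytes held if every pageset keeps `pages_per_pageset` pages. */
unsigned long long sketch_trapped_bytes(unsigned int cpus, unsigned int nodes,
                                        unsigned int pages_per_pageset,
                                        unsigned long long page_size_bytes)
{
    assert(page_size_bytes > 0);
    return sketch_pageset_count(cpus, nodes) * pages_per_pageset
           * page_size_bytes;
}
```

With 512 processors, 256 nodes, 5 pages per pageset and 16384-byte pages this yields 131072 pagesets, 655360 pages and 10 * 2^30 bytes, matching the figures quoted in the text.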
      
      The slab allocator flushes its per cpu caches every 2 seconds.  The
      following patch flushes the off node pageset caches in the same way by
      tying into the slab flush.
      
      The patch also changes /proc/zoneinfo to include the number of pages
      currently in each pageset.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  10. 19 Jun, 2005 1 commit
  11. 01 May, 2005 2 commits
    • P
      [PATCH] Change synchronize_kernel to _rcu and _sched · fbd568a3
      Paul E. McKenney committed
      This patch changes calls to synchronize_kernel(), deprecated in the earlier
      "Deprecate synchronize_kernel, GPL replacement" patch, to instead call the
      new synchronize_rcu() and synchronize_sched() APIs.
      Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • M
      [PATCH] add kmalloc_node, inline cleanup · 97e2bde4
      Manfred Spraul committed
      The patch makes the following function calls available to allocate memory
      on a specific node without changing the basic operation of the slab
      allocator:
      
       kmem_cache_alloc_node(kmem_cache_t *cachep, unsigned int flags, int node);
       kmalloc_node(size_t size, unsigned int flags, int node);
      
      in a similar way to the existing node-blind functions:
      
       kmem_cache_alloc(kmem_cache_t *cachep, unsigned int flags);
       kmalloc(size, flags);
      
      kmem_cache_alloc_node was changed to pass flags and the node information
      through the existing layers of the slab allocator (which led to some minor
      rearrangements).  The functions at the lowest layer (kmem_getpages,
      cache_grow) are already node aware.  Also __alloc_percpu can call
      kmalloc_node now.
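
How a node argument can be threaded from a top-level call down to a node-aware lowest layer is sketched below in userspace. This is not the kernel implementation: every `sketch_` name is hypothetical, `malloc` stands in for page allocation, and the `-1` "no preference" sentinel and the fixed "local" node are assumptions of the sketch. It only illustrates a node-blind entry point sitting next to a node-aware one with the same shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NO_NODE  (-1)   /* assumption: sentinel for "no preference" */
#define SKETCH_NR_NODES 2

/* Bookkeeping so a caller can see which node served each request. */
static int sketch_allocs_per_node[SKETCH_NR_NODES];
static int sketch_local_node = 0;   /* pretend the caller runs on node 0 */

/* Lowest layer: already node aware (malloc stands in for page allocation). */
static void *sketch_getpages(size_t size, int node)
{
    assert(node >= 0 && node < SKETCH_NR_NODES);
    sketch_allocs_per_node[node]++;
    return malloc(size);
}

/* Node-aware entry point: flags and node are passed through unchanged. */
void *sketch_alloc_node(size_t size, unsigned int flags, int node)
{
    (void)flags;                      /* unused in this sketch */
    if (node == SKETCH_NO_NODE)
        node = sketch_local_node;
    if (node < 0 || node >= SKETCH_NR_NODES)
        return NULL;
    return sketch_getpages(size, node);
}

/* Node-blind entry point with the same shape minus the node argument. */
void *sketch_alloc(size_t size, unsigned int flags)
{
    return sketch_alloc_node(size, flags, SKETCH_NO_NODE);
}
```

A caller that wants memory on node 1 would use `sketch_alloc_node(64, 0, 1)`, while `sketch_alloc(64, 0)` lets the sketch pick its local node; both results are released with `free`.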
      
      Performance measurements (using the pageset localization patch) yield:
      
      w/o patches:
      Tasks    jobs/min  jti  jobs/min/task      real       cpu
          1      484.27  100       484.2736     12.02      1.97   Wed Mar 30 20:50:43 2005
        100    25170.83   91       251.7083     23.12    150.10   Wed Mar 30 20:51:06 2005
        200    34601.66   84       173.0083     33.64    294.14   Wed Mar 30 20:51:40 2005
        300    37154.47   86       123.8482     46.99    436.56   Wed Mar 30 20:52:28 2005
        400    39839.82   80        99.5995     58.43    580.46   Wed Mar 30 20:53:27 2005
        500    40036.32   79        80.0726     72.68    728.60   Wed Mar 30 20:54:40 2005
        600    44074.21   79        73.4570     79.23    872.10   Wed Mar 30 20:55:59 2005
        700    44016.60   78        62.8809     92.56   1015.84   Wed Mar 30 20:57:32 2005
        800    40411.05   80        50.5138    115.22   1161.13   Wed Mar 30 20:59:28 2005
        900    42298.56   79        46.9984    123.83   1303.42   Wed Mar 30 21:01:33 2005
       1000    40955.05   80        40.9551    142.11   1441.92   Wed Mar 30 21:03:55 2005
      
      with pageset localization and slab API patches:
      Tasks    jobs/min  jti  jobs/min/task      real       cpu
          1      484.19  100       484.1930     12.02      1.98   Wed Mar 30 21:10:18 2005
        100    27428.25   92       274.2825     21.22    149.79   Wed Mar 30 21:10:40 2005
        200    37228.94   86       186.1447     31.27    293.49   Wed Mar 30 21:11:12 2005
        300    41725.42   85       139.0847     41.84    434.10   Wed Mar 30 21:11:54 2005
        400    43032.22   82       107.5805     54.10    582.06   Wed Mar 30 21:12:48 2005
        500    42211.23   83        84.4225     68.94    722.61   Wed Mar 30 21:13:58 2005
        600    40084.49   82        66.8075     87.12    873.11   Wed Mar 30 21:15:25 2005
        700    44169.30   79        63.0990     92.24   1008.77   Wed Mar 30 21:16:58 2005
        800    43097.94   79        53.8724    108.03   1155.88   Wed Mar 30 21:18:47 2005
        900    41846.75   79        46.4964    125.17   1303.38   Wed Mar 30 21:20:52 2005
       1000    40247.85   79        40.2478    144.60   1442.21   Wed Mar 30 21:23:17 2005
      Signed-off-by: Christoph Lameter <christoph@lameter.com>
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  12. 17 Apr, 2005 1 commit
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!