1. 09 Sep, 2015 6 commits
    • zsmalloc: account the number of compacted pages · 860c707d
      Sergey Senozhatsky committed
      Compaction returns back to zram the number of migrated objects, which is
      quite uninformative -- we have objects of different sizes so user space
      cannot obtain any valuable data from that number.  Change compaction to
      operate in terms of pages and return back to compaction issuer the
      number of pages that were freed during compaction.  So from now on we
      will export more meaningful value in zram<id>/mm_stat -- the number of
      freed (compacted) pages.
      
      This requires:
       (a) a rename of `num_migrated' to `pages_compacted'
       (b) an internal API change -- return first_page's fullness_group from
           putback_zspage(), so we know when putback_zspage() did
           free_zspage().  It helps us to account compaction stats correctly.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc/zram: introduce zs_pool_stats api · 7d3f3938
      Sergey Senozhatsky committed
      `zs_compact_control' accounts the number of migrated objects but it has
      a limited lifespan -- we lose it as soon as zs_compaction() returns back
      to zram.  It worked fine, because (a) zram had its own counter of
      migrated objects and (b) only zram could trigger compaction.  However,
      this does not work for automatic pool compaction (not issued by zram).
      To account objects migrated during auto-compaction (issued by the
      shrinker) we need to store this number in zs_pool.
      
      Define a new `struct zs_pool_stats' structure to keep zs_pool's stats
      there.  It provides only `num_migrated', as of this writing, but it
      surely can be extended.
      
      A new zsmalloc zs_pool_stats() symbol exports zs_pool's stats back to
      caller.
      
      Use zs_pool_stats() in zram and remove `num_migrated' from zram_stats.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
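A minimal user-space sketch of the idea described in this commit: the counter lives in the pool itself, so it survives across compaction runs no matter who triggered them, and callers receive a copy of the stats. The `*_model` types and the helper names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the per-pool statistics block. */
struct zs_pool_stats_model {
    unsigned long num_migrated; /* objects moved by compaction so far */
};

/* Hypothetical model of a pool: only the stats are represented. */
struct zs_pool_model {
    struct zs_pool_stats_model stats;
};

/* Called by any compaction path (manual or shrinker-driven). */
static void pool_account_migrated(struct zs_pool_model *pool, unsigned long nr)
{
    assert(pool);
    pool->stats.num_migrated += nr;
}

/* Copy the pool's stats out to the caller, e.g. a zram-like user. */
static void pool_stats_snapshot(const struct zs_pool_model *pool,
                                struct zs_pool_stats_model *out)
{
    assert(pool && out);
    memcpy(out, &pool->stats, sizeof(*out));
}
```

A caller would zero-initialize a `struct zs_pool_model`, let each compaction run call `pool_account_migrated()`, and read the running total through `pool_stats_snapshot()` instead of keeping its own counter.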
    • zsmalloc: cosmetic compaction code adjustments · 0dc63d48
      Sergey Senozhatsky committed
      Change zs_object_copy() argument order to be (DST, SRC) rather than
      (SRC, DST).  Copy/move functions usually take their arguments in
      (to, from) order.
      
      Rename alloc_target_page() to isolate_target_page().  This function
      doesn't allocate anything, it isolates target page, pretty much like
      isolate_source_page().
      
      Tweak __zs_compact() comment.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: introduce zs_can_compact() function · 04f05909
      Sergey Senozhatsky committed
      This function checks if class compaction will free any pages.
      Rephrasing -- do we have enough unused objects to form at least one
      ZS_EMPTY page and free it.  It aborts compaction if class compaction
      will not result in any (further) savings.
      
      EXAMPLE (this debug output is not part of this patch set):
      
       - class size
       - number of allocated objects
       - number of used objects
       - max objects per zspage
       - pages per zspage
       - estimated number of pages that will be freed
      
      [..]
      class-512 objs:544 inuse:540 maxobj-per-zspage:8  pages-per-zspage:1 zspages-to-free:0
       ... class-512 compaction is useless. break
      class-496 objs:660 inuse:570 maxobj-per-zspage:33 pages-per-zspage:4 zspages-to-free:2
      class-496 objs:627 inuse:570 maxobj-per-zspage:33 pages-per-zspage:4 zspages-to-free:1
      class-496 objs:594 inuse:570 maxobj-per-zspage:33 pages-per-zspage:4 zspages-to-free:0
       ... class-496 compaction is useless. break
      class-448 objs:657 inuse:617 maxobj-per-zspage:9  pages-per-zspage:1 zspages-to-free:4
      class-448 objs:648 inuse:617 maxobj-per-zspage:9  pages-per-zspage:1 zspages-to-free:3
      class-448 objs:639 inuse:617 maxobj-per-zspage:9  pages-per-zspage:1 zspages-to-free:2
      class-448 objs:630 inuse:617 maxobj-per-zspage:9  pages-per-zspage:1 zspages-to-free:1
      class-448 objs:621 inuse:617 maxobj-per-zspage:9  pages-per-zspage:1 zspages-to-free:0
       ... class-448 compaction is useless. break
      class-432 objs:728 inuse:685 maxobj-per-zspage:28 pages-per-zspage:3 zspages-to-free:1
      class-432 objs:700 inuse:685 maxobj-per-zspage:28 pages-per-zspage:3 zspages-to-free:0
       ... class-432 compaction is useless. break
      class-416 objs:819 inuse:705 maxobj-per-zspage:39 pages-per-zspage:4 zspages-to-free:2
      class-416 objs:780 inuse:705 maxobj-per-zspage:39 pages-per-zspage:4 zspages-to-free:1
      class-416 objs:741 inuse:705 maxobj-per-zspage:39 pages-per-zspage:4 zspages-to-free:0
       ... class-416 compaction is useless. break
      class-400 objs:690 inuse:674 maxobj-per-zspage:10 pages-per-zspage:1 zspages-to-free:1
      class-400 objs:680 inuse:674 maxobj-per-zspage:10 pages-per-zspage:1 zspages-to-free:0
       ... class-400 compaction is useless. break
      class-384 objs:736 inuse:709 maxobj-per-zspage:32 pages-per-zspage:3 zspages-to-free:0
       ... class-384 compaction is useless. break
      [..]
      
      Every "compaction is useless" indicates that we saved CPU cycles.
      
      class-512 has
      	544	objects allocated
      	540	objects used
      	8	objects per-page
      
      With only 544 - 540 = 4 unused objects, fewer than the 8 that one zspage
      holds, even if we have an ALMOST_EMPTY zspage we still don't have enough
      room to migrate all of its objects and free this zspage; so compaction
      will not make a lot of sense, it's better to just leave it as is.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
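The `zspages-to-free` column in the debug output above is consistent with dividing the number of unused objects by the number of objects one zspage holds. The following is a small sketch of that estimate under this assumption; the struct and function names are hypothetical and only model the per-class counters named in the commit message.

```c
#include <assert.h>

/* Hypothetical model of the per-class counters shown in the debug output. */
struct class_stat_model {
    unsigned long obj_allocated;     /* objs */
    unsigned long obj_used;          /* inuse */
    unsigned long maxobj_per_zspage; /* objects one zspage can hold */
    unsigned long pages_per_zspage;  /* pages backing one zspage */
};

/* Estimate how many whole zspages compaction could empty in this class. */
static unsigned long zspages_to_free(const struct class_stat_model *c)
{
    assert(c && c->maxobj_per_zspage > 0);
    if (c->obj_allocated <= c->obj_used)
        return 0;
    /* Unused slots, rounded down to whole zspages. */
    return (c->obj_allocated - c->obj_used) / c->maxobj_per_zspage;
}

/* Same estimate expressed in pages, which is what a shrinker cares about. */
static unsigned long pages_to_free(const struct class_stat_model *c)
{
    return zspages_to_free(c) * c->pages_per_zspage;
}
```

When the estimate is zero, as for class-512 with 544 allocated and 540 used objects, compacting the class cannot release a zspage and can be skipped.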
    • zsmalloc: always keep per-class stats · 57244594
      Sergey Senozhatsky committed
      Always account per-class `zs_size_stat' stats.  This data will help us
      make better decisions during compaction.  We are especially interested
      in OBJ_ALLOCATED and OBJ_USED, which can tell us if class compaction
      will result in any memory gain.
      
      For instance, we know the number of allocated objects in the class, the
      number of objects being used (so we also know how many objects are not
      used) and the number of objects per-page.  So we can check whether we have
      enough unused objects to form at least one ZS_EMPTY zspage during
      compaction.
      
      We calculate this value on per-class basis so we can calculate a total
      number of zspages that can be released.  Which is exactly what a
      shrinker wants to know.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: drop unused variable `nr_to_migrate' · b430d1fd
      Sergey Senozhatsky committed
      This patchset tweaks compaction and makes it possible to trigger pool
      compaction automatically when the system is getting low on memory.
      
      zsmalloc in some cases can suffer from a notable fragmentation and
      compaction can release some considerable amount of memory.  The problem
      here is that currently we fully rely on user space to perform compaction
      when needed.  However, performing zsmalloc compaction is not always an
      obvious thing to do.  For example, suppose we have an `idle' fragmented
      (compaction was never performed) zram device and the system is getting low
      on memory due to some 3rd party user processes (gcc LTO, or firefox,
      etc.).  It's quite unlikely that user space will issue zpool compaction
      in this case.  Besides, user space cannot tell for sure how badly pool
      is fragmented; however, this info is known to zsmalloc and, hence, to a
      shrinker.
      
      This patch (of 7):
      
      __zs_compact() does not use `nr_to_migrate', drop it.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 26 Jun, 2015 2 commits
  3. 11 Jun, 2015 1 commit
  4. 16 Apr, 2015 13 commits
    • zsmalloc: remove extra cond_resched() in __zs_compact · 160a117f
      Sergey Senozhatsky committed
      Do not perform cond_resched() before the busy compaction loop in
      __zs_compact(), because this loop does it when needed.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: fix fatal corruption due to wrong size class selection · 81da9b13
      Heesub Shin committed
      There is no point in overriding the size class below.  It causes fatal
      corruption on the next chunk on the 3264-bytes size class, which is the
      last size class that is not huge.
      
      For example, if the requested size was exactly 3264 bytes, current
      zsmalloc allocates and returns a chunk from the size class of 3264 bytes,
      not 4096.  User access to this chunk may overwrite head of the next
      adjacent chunk.
      
      Here is the panic log captured when freelist was corrupted due to this:
      
          Kernel BUG at ffffffc00030659c [verbose debug info unavailable]
          Internal error: Oops - BUG: 96000006 [#1] PREEMPT SMP
          Modules linked in:
          exynos-snapshot: core register saved(CPU:5)
          CPUMERRSR: 0000000000000000, L2MERRSR: 0000000000000000
          exynos-snapshot: context saved(CPU:5)
          exynos-snapshot: item - log_kevents is disabled
          CPU: 5 PID: 898 Comm: kswapd0 Not tainted 3.10.61-4497415-eng #1
          task: ffffffc0b8783d80 ti: ffffffc0b71e8000 task.ti: ffffffc0b71e8000
          PC is at obj_idx_to_offset+0x0/0x1c
          LR is at obj_malloc+0x44/0xe8
          pc : [<ffffffc00030659c>] lr : [<ffffffc000306604>] pstate: a0000045
          sp : ffffffc0b71eb790
          x29: ffffffc0b71eb790 x28: ffffffc00204c000
          x27: 000000000001d96f x26: 0000000000000000
          x25: ffffffc098cc3500 x24: ffffffc0a13f2810
          x23: ffffffc098cc3501 x22: ffffffc0a13f2800
          x21: 000011e1a02006e3 x20: ffffffc0a13f2800
          x19: ffffffbc02a7e000 x18: 0000000000000000
          x17: 0000000000000000 x16: 0000000000000feb
          x15: 0000000000000000 x14: 00000000a01003e3
          x13: 0000000000000020 x12: fffffffffffffff0
          x11: ffffffc08b264000 x10: 00000000e3a01004
          x9 : ffffffc08b263fea x8 : ffffffc0b1e611c0
          x7 : ffffffc000307d24 x6 : 0000000000000000
          x5 : 0000000000000038 x4 : 000000000000011e
          x3 : ffffffbc00003e90 x2 : 0000000000000cc0
          x1 : 00000000d0100371 x0 : ffffffbc00003e90
      Reported-by: Sooyong Suk <s.suk@samsung.com>
      Signed-off-by: Heesub Shin <heesub.shin@samsung.com>
      Tested-by: Sooyong Suk <s.suk@samsung.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: remove unnecessary insertion/removal of zspage in compaction · 839373e6
      Minchan Kim committed
      In putback_zspage(), we don't need to reinsert a zspage into the list of
      zspages in its size_class just to fix its fullness group.  We can fix it
      directly without reinsertion and so save some instructions.
      Reported-by: Heesub Shin <heesub.shin@samsung.com>
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Ganesh Mahendran <opensource.ganesh@gmail.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Gunho Lee <gunho.lee@lge.com>
      Cc: Juneho Choi <juno.choi@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: micro-optimize zs_object_copy() · 495819ea
      Sergey Senozhatsky committed
      A micro-optimization.  Avoid additional branching and slightly reduce
      register pressure (e.g., s_off += size; d_off += size; may be calculated
      twice: first for the >= PAGE_SIZE check and later for the offset update
      in the "else" clause).
      
      scripts/bloat-o-meter shows some improvement
      
      add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-10 (-10)
      function                          old     new   delta
      zs_object_copy                    550     540     -10
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: remove synchronize_rcu from zs_compact() · 1ec7cfb1
      Sergey Senozhatsky committed
      Do not synchronize RCU in zs_compact().  Neither zsmalloc nor
      zram uses RCU.
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Y
    • zsmalloc: zsmalloc documentation · d02be50d
      Minchan Kim committed
      Create zsmalloc documentation that explains the design concept and the
      stat information.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Juneho Choi <juno.choi@lge.com>
      Cc: Gunho Lee <gunho.lee@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: add fullness into stat · 248ca1b0
      Minchan Kim committed
      While investigating compaction, the fullness information of each class
      helps show how well compaction works.  With it, we can see more clearly
      how compaction behaves for each size class.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Juneho Choi <juno.choi@lge.com>
      Cc: Gunho Lee <gunho.lee@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: record handle in page->private for huge object · 7b60a685
      Minchan Kim committed
      We store the handle in the header of each allocated object, which
      increases the size of each object by sizeof(unsigned long).
      
      If zram stores 4096 bytes to zsmalloc (i.e., bad compression), zsmalloc
      needs a 4104B class to hold the handle as well.
      
      However, 4104B-class has 1-pages_per_zspage so the size wasted by internal
      fragmentation is 8192 - 4104, which is terrible.
      
      So this patch records the handle in page->private for such huge objects
      (i.e., pages_per_zspage == 1 && maxobj_per_zspage == 1) instead of in the
      header of each object, so we can use the 4096B class, not the 4104B class.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Juneho Choi <juno.choi@lge.com>
      Cc: Gunho Lee <gunho.lee@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
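The waste figure quoted in this commit can be checked with a short calculation. This sketch assumes a 4096-byte page and, as the 8192 figure implies, that a 4104-byte object needs two pages of backing storage; the function name is hypothetical.

```c
#include <assert.h>

/*
 * Bytes left unused in one zspage of the given class: the zspage size
 * minus the space taken by as many whole objects as fit into it.
 */
static unsigned long zspage_waste_bytes(unsigned long class_size,
                                        unsigned long pages_per_zspage,
                                        unsigned long page_size)
{
    unsigned long zspage_bytes = pages_per_zspage * page_size;
    unsigned long objs;

    assert(class_size > 0 && class_size <= zspage_bytes);
    objs = zspage_bytes / class_size;
    return zspage_bytes - objs * class_size;
}
```

Under these assumptions a 4104-byte class wastes 8192 - 4104 = 4088 bytes per stored object, while a 4096-byte class backed by a single page wastes nothing, which is the saving gained by moving the handle out of the object header.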
    • zsmalloc: adjust ZS_ALMOST_FULL · d3d07c92
      Minchan Kim committed
      Currently, zsmalloc regards a zspage as ZS_ALMOST_EMPTY if fewer than 1/4
      of its objects are in use (i.e., fullness_threshold_frac).  This can result
      in loose packing since zsmalloc migrates only ZS_ALMOST_EMPTY zspages out.
      
      This patch changes the rule so that zsmalloc treats a zspage as
      ZS_ALMOST_FULL only when more than 3/4 of its objects are in use, which
      allows tighter packing.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Juneho Choi <juno.choi@lge.com>
      Cc: Gunho Lee <gunho.lee@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
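The rule described in this commit can be sketched as a small classifier. This is a user-space illustration under the assumption that "above 3/4 used" is a strict comparison; the enum and function names are hypothetical and the kernel's own code may be written differently.

```c
#include <assert.h>

/* Hypothetical fullness groups, mirroring the names used in the message. */
enum fullness_model {
    FG_EMPTY,
    FG_ALMOST_EMPTY,
    FG_ALMOST_FULL,
    FG_FULL
};

/* Classify a zspage by how many of its object slots are in use. */
static enum fullness_model classify_fullness(unsigned int inuse,
                                             unsigned int max_objects)
{
    assert(max_objects > 0 && inuse <= max_objects);

    if (inuse == 0)
        return FG_EMPTY;
    if (inuse == max_objects)
        return FG_FULL;
    /* Strictly more than 3/4 in use: keep it as a migration target. */
    if (4UL * inuse > 3UL * max_objects)
        return FG_ALMOST_FULL;
    /* Everything else is a candidate to be migrated out. */
    return FG_ALMOST_EMPTY;
}
```

With 8 objects per zspage, a page holding 6 objects is classified as almost empty and can be drained, while one holding 7 is almost full and serves as a destination.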
    • zsmalloc: support compaction · 312fcae2
      Minchan Kim committed
      This patch provides core functions for migration of zsmalloc.  The
      migration policy is simple, as follows.
      
      for each size class {
              while {
                      src_page = get zs_page from ZS_ALMOST_EMPTY
                      if (!src_page)
                              break;
                      dst_page = get zs_page from ZS_ALMOST_FULL
                      if (!dst_page)
                              dst_page = get zs_page from ZS_ALMOST_EMPTY
                      if (!dst_page)
                              break;
                      migrate(from src_page, to dst_page);
              }
      }
      
      For migration, we need to identify which objects in a zspage are allocated
      so that we can migrate them out.  We could find out by iterating over the
      free objects in a zspage, because the first_page of a zspage keeps the
      free objects in a singly-linked list, but that is not efficient.  Instead,
      this patch adds a tag (i.e., OBJ_ALLOCATED_TAG) in the header of each
      object (i.e., the handle) so we can easily check whether the object is
      allocated.
      
      This patch adds another status bit in the handle to synchronize between
      user access through zs_map_object and migration.  During migration, we
      cannot move objects that users are accessing, in order to keep data
      coherent between the old object and the new object.
      
      [akpm@linux-foundation.org: zsmalloc.c needs sched.h for cond_resched()]
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Juneho Choi <juno.choi@lge.com>
      Cc: Gunho Lee <gunho.lee@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
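The migration policy in this commit message can be illustrated with a simplified user-space model of one size class. It is only a sketch: instead of the ZS_ALMOST_EMPTY and ZS_ALMOST_FULL lists it picks the least-occupied page as the source and the most-occupied page with free room as the destination, and it tracks occupancy counts rather than real objects. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a zspage: only occupancy is tracked. */
struct zspage_model {
    unsigned int inuse; /* objects currently stored */
    unsigned int max;   /* capacity in objects */
};

/* Least-occupied page that is neither empty nor full, or -1. */
static int pick_src(const struct zspage_model *p, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (p[i].inuse == 0 || p[i].inuse == p[i].max)
            continue;
        if (best < 0 || p[i].inuse < p[best].inuse)
            best = (int)i;
    }
    return best;
}

/* Most-occupied page with free room, other than src, or -1. */
static int pick_dst(const struct zspage_model *p, size_t n, int src)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if ((int)i == src || p[i].inuse == 0 || p[i].inuse == p[i].max)
            continue;
        if (best < 0 || p[i].inuse > p[best].inuse)
            best = (int)i;
    }
    return best;
}

/* Move objects from sparse pages into dense ones; return pages emptied. */
static unsigned int compact_model(struct zspage_model *p, size_t n)
{
    unsigned int freed = 0;

    for (;;) {
        int src = pick_src(p, n);
        int dst = src < 0 ? -1 : pick_dst(p, n, src);
        unsigned int room, moved;

        if (src < 0 || dst < 0)
            break;
        room = p[dst].max - p[dst].inuse;
        moved = p[src].inuse < room ? p[src].inuse : room;
        assert(moved > 0);
        p[src].inuse -= moved;
        p[dst].inuse += moved;
        if (p[src].inuse == 0)
            freed++; /* the source page is now empty and can be released */
    }
    return freed;
}
```

The loop stops once no pair of partially filled pages remains, so at most one page in the class is left partially filled.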
    • zsmalloc: factor out obj_[malloc|free] · c7806261
      Minchan Kim committed
      In a later patch, migration needs some parts of zs_malloc and
      zs_free, so this patch factors them out.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Juneho Choi <juno.choi@lge.com>
      Cc: Gunho Lee <gunho.lee@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zsmalloc: decouple handle and object · 2e40e163
      Minchan Kim committed
      Recently, we started to use zram heavily and some issues
      popped up.
      
      1) external fragmentation
      
      I got a report from Juneho Choi that fork failed although there were
      plenty of free pages in the system.  His investigation revealed that zram
      is one of the culprits causing heavy fragmentation, so there was no
      contiguous 16K page left for the pgd needed to fork on ARM.
      
      2) non-movable pages
      
      Another problem of zram now is that, inherently, users want to use zram as
      swap on small-memory systems, so they use zRAM with CMA to use memory
      efficiently.  Unfortunately, it doesn't work well because zRAM cannot use
      CMA's movable pages unless it supports compaction.  I got several reports
      that OOM happened with zram although there was lots of swap space and
      free space in the CMA area.
      
      3) internal fragmentation
      
      zRAM has started to support a memory limitation feature to limit memory
      usage, and I sent a patchset (https://lkml.org/lkml/2014/9/21/148) for the
      VM to be harmonized with zram-swap so that it stops anonymous page reclaim
      if zram has consumed memory up to the limit, even though there is free
      space on the swap.  One problem with that direction is that zram has no
      way to know about holes in the memory space zsmalloc allocated that are
      caused by internal fragmentation, so zram would regard swap as full
      although there is free space in zsmalloc.  To solve the issue, zram wants
      to trigger compaction of zsmalloc before it decides whether it is full or
      not.
      
      This patchset is the first step to address the above issues.  For that, it
      adds an indirection layer between handle and object location and supports
      manual compaction to solve the 3rd problem first of all.
      
      After this patchset got merged, next step is to make VM aware of zsmalloc
      compaction so that generic compaction will move zsmalloced-pages
      automatically in runtime.
      
      In my imaginary experiment(ie, high compress ratio data with heavy swap
      in/out on 8G zram-swap), data is as follows,
      
      Before =
      zram allocated object :      60212066 bytes
      zram total used:     140103680 bytes
      ratio:         42.98 percent
      MemFree:          840192 kB
      
      Compaction
      
      After =
      frag ratio after compaction
      zram allocated object :      60212066 bytes
      zram total used:      76185600 bytes
      ratio:         79.03 percent
      MemFree:          901932 kB
      
      Juneho reported the results below from his real platform after a short
      aging run.  So, I think the benefit would be bigger on a real system that
      has been aging for a long time.
      
      - frag_ratio increased 3% (ie, higher is better)
      - memfree increased about 6MB
      - In buddy info, Normal 2^3: 4, 2^2: 1: 2^1 increased, Highmem: 2^1 21 increased
      
      frag ratio after swap fragment
      used :        156677 kbytes
      total:        166092 kbytes
      frag_ratio :  94
      meminfo before compaction
      MemFree:           83724 kB
      Node 0, zone   Normal  13642   1364     57     10     61     17      9      5      4      0      0
      Node 0, zone  HighMem    425     29      1      0      0      0      0      0      0      0      0
      
      num_migrated :  23630
      compaction done
      
      frag ratio after compaction
      used :        156673 kbytes
      total:        160564 kbytes
      frag_ratio :  97
      meminfo after compaction
      MemFree:           89060 kB
      Node 0, zone   Normal  14076   1544     67     14     61     17      9      5      4      0      0
      Node 0, zone  HighMem    863     50      1      0      0      0      0      0      0      0      0
      
      This patchset adds more logic (about 480 lines) to zsmalloc, but when I
      tested a heavy swapin/out program, the regression in swapin/out speed was
      marginal because most of the overhead was caused by compress/decompress
      and other MM reclaim stuff.
      
      This patch (of 7):
      
      Currently, the handle of zsmalloc encodes the object's location directly,
      which makes it hard to support migration.
      
      This patch decouples handle and object by adding an indirection layer.
      For that, it allocates the handle dynamically and returns it to the user.
      The handle is an address allocated by the slab allocator, so it's unique,
      and we can keep the object's location in the memory space allocated for
      the handle.
      
      With it, we can change an object's position without changing the handle
      itself.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Juneho Choi <juno.choi@lge.com>
      Cc: Gunho Lee <gunho.lee@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
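The indirection this commit describes can be sketched in user space: the handle is the address of a small, separately allocated slot, and the slot stores where the object currently lives. Here `malloc()` stands in for the slab allocation mentioned in the message, and the struct layout and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical encoding of where an object currently lives. */
struct obj_location {
    unsigned long page_index; /* which page holds the object */
    unsigned int obj_index;   /* slot within that page */
};

/* The handle is the address of the slot, so it never changes. */
typedef struct obj_location *handle_model_t;

/* Allocate a slot and record the object's initial location in it. */
static handle_model_t handle_alloc(struct obj_location where)
{
    handle_model_t h = malloc(sizeof(*h));

    if (h)
        *h = where;
    return h;
}

/* Resolve a handle to the object's current location. */
static struct obj_location handle_lookup(handle_model_t h)
{
    assert(h);
    return *h;
}

/* Migration updates the slot; holders of the handle are unaffected. */
static void handle_relocate(handle_model_t h, struct obj_location new_where)
{
    assert(h);
    *h = new_where;
}

static void handle_free(handle_model_t h)
{
    free(h);
}
```

A user keeps only the handle value; after `handle_relocate()` the same handle resolves to the new location, which is what lets compaction move objects underneath their users.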
  5. 13 Feb, 2015 2 commits
    • mm/zsmalloc: add statistics support · 0f050d99
      Ganesh Mahendran committed
      Keeping fragmentation of zsmalloc at a low level is our target.  But for
      now we still need to add debug code to zsmalloc to get quantitative
      data.
      
      This patch adds a new configuration option, CONFIG_ZSMALLOC_STAT, to
      enable statistics collection for developers.  Currently only the object
      statistics in each class are collected.  Users can get the information via
      debugfs.
      
           cat /sys/kernel/debug/zsmalloc/zram0/...
      
      For example:
      
      After I copied "jdk-8u25-linux-x64.tar.gz" to zram with ext4 filesystem:
       class  size obj_allocated   obj_used pages_used
           0    32             0          0          0
           1    48           256         12          3
           2    64            64         14          1
           3    80            51          7          1
           4    96           128          5          3
           5   112            73          5          2
           6   128            32          4          1
           7   144             0          0          0
           8   160             0          0          0
           9   176             0          0          0
          10   192             0          0          0
          11   208             0          0          0
          12   224             0          0          0
          13   240             0          0          0
          14   256            16          1          1
          15   272            15          9          1
          16   288             0          0          0
          17   304             0          0          0
          18   320             0          0          0
          19   336             0          0          0
          20   352             0          0          0
          21   368             0          0          0
          22   384             0          0          0
          23   400             0          0          0
          24   416             0          0          0
          25   432             0          0          0
          26   448             0          0          0
          27   464             0          0          0
          28   480             0          0          0
          29   496            33          1          4
          30   512             0          0          0
          31   528             0          0          0
          32   544             0          0          0
          33   560             0          0          0
          34   576             0          0          0
          35   592             0          0          0
          36   608             0          0          0
          37   624             0          0          0
          38   640             0          0          0
          40   672             0          0          0
          42   704             0          0          0
          43   720            17          1          3
          44   736             0          0          0
          46   768             0          0          0
          49   816             0          0          0
          51   848             0          0          0
          52   864            14          1          3
          54   896             0          0          0
          57   944            13          1          3
          58   960             0          0          0
          62  1024             4          1          1
          66  1088            15          2          4
          67  1104             0          0          0
          71  1168             0          0          0
          74  1216             0          0          0
          76  1248             0          0          0
          83  1360             3          1          1
          91  1488            11          1          4
          94  1536             0          0          0
         100  1632             5          1          2
         107  1744             0          0          0
         111  1808             9          1          4
         126  2048             4          4          2
         144  2336             7          3          4
         151  2448             0          0          0
         168  2720            15         15         10
         190  3072            28         27         21
         202  3264             0          0          0
         254  4096         36209      36209      36209
      
       Total               37022      36326      36288
      
      We can calculate the overall fragmentation from the last line:
          Total               37022      36326      36288
          (37022 - 36326) / 37022 = 1.88%

      By analysing the objects allocated in every class we can also see why
      fragmentation is so low: most of the allocated objects are in class 254,
      and a class 254 zspage consists of only 1 page.  So no fragmentation is
      introduced by allocating objs in class 254.
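The calculation above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and it assumes the first two numbers on the Total line are the allocated and used object counts.

```c
/*
 * Fraction of allocated object slots that are not in use.
 * Hypothetical helper; assumes the two inputs are the allocated and
 * used object totals from the statistics output.
 */
static double fragmentation(unsigned long obj_allocated, unsigned long obj_used)
{
    if (obj_allocated == 0)
        return 0.0;   /* nothing allocated, nothing wasted */
    return (double)(obj_allocated - obj_used) / (double)obj_allocated;
}
```

Calling fragmentation(37022, 36326) reproduces the ratio quoted above.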
      
      And in future, we can collect other zsmalloc statistics as we need and
      analyse them.
      Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0f050d99
    • G
      mm/zpool: add name argument to create zpool · 3eba0c6a
      Ganesh Mahendran committed
      Currently the allocators underlying zpool (zsmalloc/zbud) do not know who
      creates them.  There is no way for zsmalloc/zbud to find out which caller
      they belong to.

      Now we want to add statistics collection to zsmalloc, so we need to name
      the debugfs dir for each pool created.  The way suggested by Minchan Kim
      is to use a name passed by the caller (such as zram) to create the
      zsmalloc pool.
      
          /sys/kernel/debug/zsmalloc/zram0
      
      This patch adds an argument `name' to zs_create_pool() and other related
      functions.
      Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3eba0c6a
  6. 19 Dec, 2014 1 commit
    • G
      mm/zsmalloc: adjust order of functions · 66cdef66
      Ganesh Mahendran committed
      Currently the functions in zsmalloc.c are not arranged in a readable and
      reasonable sequence.  As more and more functions are added, we may run
      into the inconvenience below.  For example:
      
      Current functions:
      
          void zs_init()
          {
          }
      
          static void get_maxobj_per_zspage()
          {
          }
      
      Then I want to add a func_1() which is called from zs_init(), and this
      newly added function func_1() will use get_maxobj_per_zspage(), which is
      defined below zs_init().
      
          void func_1()
          {
              get_maxobj_per_zspage();
          }
      
          void zs_init()
          {
              func_1();
          }
      
          static void get_maxobj_per_zspage()
          {
          }
      
      This causes a compile error, so we must add a declaration:
      
          static void get_maxobj_per_zspage();
      
      before func_1() if we do not put get_maxobj_per_zspage() before
      func_1().
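A compilable sketch of the forward-declaration workaround described above. The parameter lists and return values are assumptions added so the example does something observable; the commit message only shows empty function bodies.

```c
/* Forward declaration: lets func_1() call the helper before its definition. */
static int get_maxobj_per_zspage(int size, int pages_per_zspage);

/* Hypothetical new function placed above zs_init(). */
static int func_1(void)
{
    return get_maxobj_per_zspage(1488, 4);
}

int zs_init(void)
{
    return func_1();
}

/* Defined last, as in the original file ordering; assumes 4096-byte pages. */
static int get_maxobj_per_zspage(int size, int pages_per_zspage)
{
    return pages_per_zspage * 4096 / size;
}
```

Without the first declaration the compiler rejects (or at best warns about) the call inside func_1(); reordering the definitions, as this patch does, makes the declaration unnecessary.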
      
      In addition, putting the module_[init|exit] functions at the bottom of
      the file follows the usual convention.

      So, this patch adjusts the function sequence as follows:
      
          /* helper functions */
          ...
          obj_location_to_handle()
          ...
      
          /* Some exported functions */
          ...
      
          zs_map_object()
          zs_unmap_object()
      
          zs_malloc()
          zs_free()
      
          zs_init()
          zs_exit()
      Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      66cdef66
  7. 14 Dec, 2014 6 commits
    • G
      mm/zsmalloc: allocate exactly size of struct zs_pool · 18136656
      Ganesh Mahendran committed
      In zs_create_pool(), we allocate more memory than sizeof(struct zs_pool):
        ovhd_size = roundup(sizeof(*pool), PAGE_SIZE);
      
      This patch allocates exactly the size needed.
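To illustrate how much the rounding costs, here is a minimal userspace sketch. The struct is a hypothetical stand-in (the real struct zs_pool layout differs), the page size is assumed to be 4096, and roundup_to() only mimics the idea of the kernel's roundup().

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumed page size */

/* Round x up to the next multiple of y. */
static size_t roundup_to(size_t x, size_t y)
{
    return ((x + y - 1) / y) * y;
}

/* Hypothetical stand-in for struct zs_pool; not the real layout. */
struct pool_sketch {
    void *size_class[255];
    unsigned long flags;
};

/* Bytes allocated but never used when the size is rounded up to a page. */
static size_t wasted_bytes(void)
{
    return roundup_to(sizeof(struct pool_sketch), SKETCH_PAGE_SIZE)
           - sizeof(struct pool_sketch);
}
```

Allocating sizeof(struct pool_sketch) directly would leave wasted_bytes() at zero, which is the effect this patch is after.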
      Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      18136656
    • G
      mm/zsmalloc: avoid duplicate assignment of prev_class · df8b5bb9
      Ganesh Mahendran committed
      In zs_create_pool(), prev_class is assigned (ZS_SIZE_CLASSES - 1) times,
      yet prev_class only refers to the previous size_class.  So the repeated
      assignment is unnecessary.
      
      This patch assigns *prev_class* when a new size_class structure is
      allocated and uses prev_class to check whether the first class has been
      allocated.
      
      [akpm@linux-foundation.org: remove now-unused ZS_SIZE_CLASSES]
      Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Reviewed-by: Dan Streetman <ddstreet@ieee.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      df8b5bb9
    • M
      mm/zsmalloc: support allocating obj with size of ZS_MAX_ALLOC_SIZE · 40f9fb8c
      Mahendran Ganesh committed
      I sent a patch [1] for an unnecessary check in zsmalloc, and Minchan Kim
      found that zsmalloc does not even support allocating an obj of size
      ZS_MAX_ALLOC_SIZE in some situations.
      
      For example:
         On a system with a 64KB PAGE_SIZE and 32-bit physical addresses:
         ZS_MIN_ALLOC_SIZE is 32 bytes which is calculated by:
            MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
         ZS_MAX_ALLOC_SIZE is 64KB(in current code, is PAGE_SIZE)
         ZS_SIZE_CLASS_DELTA is 256 bytes
         So, ZS_SIZE_CLASSES = (ZS_MAX_ALLOC_SIZE - ZS_MIN_ALLOC_SIZE) /
                                ZS_SIZE_CLASS_DELTA + 1
                             = 256
      
         In zs_create_pool(), the max size obj which can be allocated will be:
            ZS_MIN_ALLOC_SIZE + i * ZS_SIZE_CLASS_DELTA = 32 + 255*256 = 65312
      
         We can see that 65312 < 65536 (ZS_MAX_ALLOC_SIZE), so we can NOT
         allocate objs of size ZS_MAX_ALLOC_SIZE (65536), which we promise
         upper-layer users we can do.
      
       [1]  http://lkml.iu.edu/hypermail/linux/kernel/1411.2/03835.html
       [2]  http://lkml.iu.edu/hypermail/linux/kernel/1411.2/04534.html
      
      This patch fixes the issue by dynamically calculating zs_size_classes
      when the module is loaded and allocating a buffer of size
      ZS_MAX_ALLOC_SIZE, so that the largest obj (of size ZS_MAX_ALLOC_SIZE)
      can be stored in it.
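The arithmetic in the example can be checked with a short sketch. The constants are the ones quoted above for 64KB pages; the rounded-up class count and the clamping in class_size() are assumptions about one way to cover ZS_MAX_ALLOC_SIZE, not a copy of the patch.

```c
#define MIN_ALLOC   32u      /* ZS_MIN_ALLOC_SIZE in the example */
#define MAX_ALLOC   65536u   /* ZS_MAX_ALLOC_SIZE with 64KB pages */
#define CLASS_DELTA 256u     /* ZS_SIZE_CLASS_DELTA in the example */

/* Class count as computed in the example: truncating division plus one. */
static unsigned int classes_truncating(void)
{
    return (MAX_ALLOC - MIN_ALLOC) / CLASS_DELTA + 1;
}

/* Assumed fix: add one more class when the division leaves a remainder. */
static unsigned int classes_rounded_up(void)
{
    unsigned int nr = (MAX_ALLOC - MIN_ALLOC) / CLASS_DELTA + 1;

    if ((MAX_ALLOC - MIN_ALLOC) % CLASS_DELTA)
        nr += 1;
    return nr;
}

/* Object size served by class index i, clamped to MAX_ALLOC. */
static unsigned int class_size(unsigned int i)
{
    unsigned int size = MIN_ALLOC + i * CLASS_DELTA;

    return size > MAX_ALLOC ? MAX_ALLOC : size;
}
```

With the truncating count the last class index is 255, whose size falls short of MAX_ALLOC; with the rounded-up count the last index is 256, whose clamped size is exactly MAX_ALLOC.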
      
      [akpm@linux-foundation.org: restore ZS_SIZE_CLASSES to fix bisectability]
      Signed-off-by: Mahendran Ganesh <opensource.ganesh@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      40f9fb8c
    • M
      zsmalloc: correct fragile [kmap|kunmap]_atomic use · af4ee5e9
      Minchan Kim committed
      kunmap_atomic should be passed the virtual address returned by
      kmap_atomic.  However, some pieces of code in zsmalloc pass
      kunmap_atomic a modified address rather than the one returned by
      kmap_atomic.

      This happens to work because zsmalloc only modifies the address within
      the PAGE_SIZE boundary, which the current kmap_atomic implementation
      tolerates.  But it is fragile if kmap_atomic ever changes, so let's
      correct it.
      
      I got a subtle bug when I implemented a new feature of zsmalloc
      (compaction) due to a link's mishandling (the link was over page
      boundary).  Although it was totally my mistake, it took a while to find
      the cause because an unpredictable kmapped address was unmapped causing an
      almost random crash.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      af4ee5e9
    • S
      zsmalloc: fix zs_init cpu notifier error handling · b1b00a5b
      Sergey Senozhatsky committed
      Mahendran Ganesh reported that zpool-enabled zsmalloc should not call
      zpool_unregister_driver() from zs_init() if cpu notifier registration has
      failed, because error handling is performed before we register the driver
      via zpool_register_driver() call.
      
      Factor out cpu notifier registration and unregistration code and fix
      zs_init() error handling.
      
      link: http://lkml.iu.edu//hypermail/linux/kernel/1411.1/04156.html
      [akpm@linux-foundation.org: squash bogus gcc warning]
      [akpm@linux-foundation.org: use __init and __exit]
      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: Mahendran Ganesh <opensource.ganesh@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b1b00a5b
    • J
      zsmalloc: merge size_class to reduce fragmentation · 9eec4cd5
      Joonsoo Kim committed
      zsmalloc has many size_classes to reduce fragmentation, spaced in units
      of 16 bytes (for example, 16, 32, 48, etc.) if PAGE_SIZE is 4096.
      zsmalloc also has the constraint that each zspage has at most 4 pages.

      In this situation, we can see an interesting aspect.  Consider the
      size_classes for 1488, 1472, ..., 1376.  To prevent external
      fragmentation, they use 4 pages per zspage, and so each of them can
      contain at most 11 objects.
      
      16384 (4096 * 4) = 1488 * 11 + remains
      16384 (4096 * 4) = 1472 * 11 + remains
      16384 (4096 * 4) = ...
      16384 (4096 * 4) = 1376 * 11 + remains
      
      It means that they have the same characteristics, and distinguishing
      between them isn't needed.  If we use one size_class for them, we can
      reduce fragmentation and save some memory.  Since both the 1488 and 1472
      sized classes can only fit 11 objects into 4 pages, and an object that's
      1472 bytes can fit into an object that's 1488 bytes, merging these
      classes to always use objects that are 1488 bytes will reduce the total
      number of size classes.  And reducing the total number of size classes
      reduces overall fragmentation, because a wider range of compressed pages
      can fit into a single size class, leaving fewer unused objects in each
      size class.
      
      For this purpose, this patch implements size_class merging.  If a
      size_class has the same pages_per_zspage and the same number of objects
      per zspage as the previous size_class, we don't create a new size_class.
      Instead, we reuse the previous size_class with the same characteristics.
      This way, the example sizes above (1488, 1472, ..., 1376) use just one
      size_class, so we get much better memory utilization.
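The merge condition can be sketched in userspace as follows. This is a sketch under stated assumptions: the page size is 4096, the page-count selection (pick the zspage length with the highest used percentage) is an assumption about how pages_per_zspage is chosen, and all function names are hypothetical.

```c
#define SKETCH_PAGE_SIZE     4096   /* assumed page size */
#define MAX_PAGES_PER_ZSPAGE 4      /* limit stated in the commit message */

/* Pick the zspage length (in pages) that wastes the least space. */
static int pages_per_zspage(int class_size)
{
    int best_pages = 1, best_usedpc = 0;

    for (int i = 1; i <= MAX_PAGES_PER_ZSPAGE; i++) {
        int zspage_size = i * SKETCH_PAGE_SIZE;
        int waste = zspage_size % class_size;
        int usedpc = (zspage_size - waste) * 100 / zspage_size;

        if (usedpc > best_usedpc) {
            best_usedpc = usedpc;
            best_pages = i;
        }
    }
    return best_pages;
}

/* Number of whole objects that fit into one zspage of this class. */
static int objs_per_zspage(int class_size)
{
    return pages_per_zspage(class_size) * SKETCH_PAGE_SIZE / class_size;
}

/* Two sizes can share one size_class when both characteristics match. */
static int can_merge(int size_a, int size_b)
{
    return pages_per_zspage(size_a) == pages_per_zspage(size_b) &&
           objs_per_zspage(size_a) == objs_per_zspage(size_b);
}
```

Under these assumptions can_merge(1488, 1376) holds, while 1360 and 1504 fall into different groups because their zspages hold a different number of objects.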
      
      Below is the result of my simple test.

      TEST ENV: EXT4 on zram, mounted with the discard option
      WORKLOAD: untar kernel source code, then remove directories in
      descending order of size (drivers arch fs sound include net
      Documentation firmware kernel tools)

      Each line shows orig_data_size, compr_data_size, mem_used_total,
      fragmentation overhead (mem_used - compr_data_size) and overhead ratio
      (overhead to compr_data_size), respectively, after each untar and
      remove operation is executed.
      
      * untar-nomerge.out
      
      orig_size compr_size used_size overhead overhead_ratio
      525.88MB 199.16MB 210.23MB  11.08MB 5.56%
      288.32MB  97.43MB 105.63MB   8.20MB 8.41%
      177.32MB  61.12MB  69.40MB   8.28MB 13.55%
      146.47MB  47.32MB  56.10MB   8.78MB 18.55%
      124.16MB  38.85MB  48.41MB   9.55MB 24.58%
      103.93MB  31.68MB  40.93MB   9.25MB 29.21%
       84.34MB  22.86MB  32.72MB   9.86MB 43.13%
       66.87MB  14.83MB  23.83MB   9.00MB 60.70%
       60.67MB  11.11MB  18.60MB   7.49MB 67.48%
       55.86MB   8.83MB  16.61MB   7.77MB 88.03%
       53.32MB   8.01MB  15.32MB   7.31MB 91.24%
      
      * untar-merge.out
      
      orig_size compr_size used_size overhead overhead_ratio
      526.23MB 199.18MB 209.81MB  10.64MB 5.34%
      288.68MB  97.45MB 104.08MB   6.63MB 6.80%
      177.68MB  61.14MB  66.93MB   5.79MB 9.47%
      146.83MB  47.34MB  52.79MB   5.45MB 11.51%
      124.52MB  38.87MB  44.30MB   5.43MB 13.96%
      104.29MB  31.70MB  36.83MB   5.13MB 16.19%
       84.70MB  22.88MB  27.92MB   5.04MB 22.04%
       67.11MB  14.83MB  19.26MB   4.43MB 29.86%
       60.82MB  11.10MB  14.90MB   3.79MB 34.17%
       55.90MB   8.82MB  12.61MB   3.79MB 42.97%
       53.32MB   8.01MB  11.73MB   3.73MB 46.53%
      
      As the results above show, the merged version has better utilization
      (overhead ratio, 5th column) and uses less memory (mem_used_total, 3rd
      column).
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: Dan Streetman <ddstreet@ieee.org>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: <juno.choi@lge.com>
      Cc: "seungho1.park" <seungho1.park@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9eec4cd5
  8. 10 Oct, 2014 4 commits
    • D
      zsmalloc: simplify init_zspage free obj linking · 5538c562
      Dan Streetman committed
      Change zsmalloc init_zspage() logic to iterate through each object on each
      of its pages, checking the offset to verify the object is on the current
      page before linking it into the zspage.
      
      The current zsmalloc init_zspage free object linking code has logic that
      relies on there only being one page per zspage when PAGE_SIZE is a
      multiple of class->size.  It calculates the number of objects for the
      current page, and iterates through all of them plus one, to account for
      the assumed partial object at the end of the page.  While this currently
      works, the logic can be simplified to just link the object at each
      successive offset until the offset is larger than PAGE_SIZE, which does
      not rely on PAGE_SIZE being a multiple of class->size.
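The simplified iteration can be sketched in userspace by counting, for each page, the objects that start on it. This is not the kernel's init_zspage(): the function name is hypothetical, the page size is assumed to be 4096, and instead of writing free-list links the sketch only counts slots, dropping a trailing slot that would run past the end of the zspage.

```c
#define SKETCH_PAGE_SIZE 4096   /* assumed page size */

/*
 * Count the objects whose first byte lies on each page of a zspage.
 * counts[] receives one entry per page; the return value is the total.
 */
static int count_linked_objs(int class_size, int nr_pages, int counts[])
{
    int zspage_size = nr_pages * SKETCH_PAGE_SIZE;
    int off = 0;     /* offset of the next object within the current page */
    int total = 0;

    for (int page = 0; page < nr_pages; page++) {
        int base = page * SKETCH_PAGE_SIZE;

        counts[page] = 0;
        /* Visit each successive offset until it leaves the page. */
        while (off < SKETCH_PAGE_SIZE) {
            /* Drop a trailing slot that would run past the zspage. */
            if (base + off + class_size > zspage_size)
                break;
            counts[page]++;
            total++;
            off += class_size;
        }
        /* Carry the overshoot into the next page. */
        off %= SKETCH_PAGE_SIZE;
    }
    return total;
}
```

The loop never asks whether PAGE_SIZE is a multiple of class_size; the offset simply wraps into the next page, which is the point of the simplification.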
      Signed-off-by: Dan Streetman <ddstreet@ieee.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5538c562
    • W
      mm/zsmalloc.c: correct comment for fullness group computation · 6dd9737e
      Wang Sheng-Hui committed
      The letter 'f' in "n <= N/f" stands for fullness_threshold_frac, not
      1/fullness_threshold_frac.
      Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6dd9737e
    • M
      zsmalloc: change return value unit of zs_get_total_size_bytes · 722cdc17
      Minchan Kim committed
      zs_get_total_size_bytes returns the amount of memory zsmalloc has
      consumed in *bytes*, but zsmalloc operates in *pages* rather than bytes.
      Let's change the API so that we avoid the unnecessary overhead (i.e.,
      converting pages to bytes) in zsmalloc.

      Since the returned value is now in pages, "zs_get_total_pages" is a
      better name than "zs_get_total_size_bytes".
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Dan Streetman <ddstreet@ieee.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: <juno.choi@lge.com>
      Cc: <seungho1.park@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: David Horner <ds2horner@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      722cdc17
    • M
      zsmalloc: move pages_allocated to zs_pool · 13de8933
      Minchan Kim committed
      Currently, zram has no feature to limit memory so theoretically zram can
      deplete system memory.  Users have asked for a limit several times as even
      without exhaustion zram makes it hard to control memory usage of the
      platform.  This patchset adds the feature.
      
      Patch 1 makes zs_get_total_size_bytes faster because it would be used
      frequently in later patches for the new feature.
      
      Patch 2 changes zs_get_total_size_bytes's return unit from bytes to
      pages so that zsmalloc doesn't need an unnecessary operation (i.e.,
      << PAGE_SHIFT).

      Patch 3 adds the new feature.  I added the feature to the zram layer,
      not zsmalloc, because the limitation is zram's requirement, not
      zsmalloc's, so any other user of zsmalloc (i.e., zpool) shouldn't be
      affected by an unnecessary branch in zsmalloc.  In future, if every user
      of zsmalloc wants the feature, we could easily move it from the client
      side to zsmalloc, but the reverse would be painful.

      Patch 4 adds a new facility to report the maximum memory usage of zram,
      so that users need not poll /sys/block/zram0/mem_used_total frequently
      and transient maxima are not missed.
      
      This patch (of 4):
      
      pages_allocated has been counted in the size_class structure, so when a
      user of zsmalloc wants to see total_size_bytes, it has to gather the
      count from each size_class and report the sum.

      That is not bad if the user doesn't read the value often, but if the
      user starts to read it frequently, it is a poor deal from a performance
      point of view.

      This patch moves the count from size_class to zs_pool, which also
      reduces the memory footprint (from [255 * 8byte] to
      [sizeof(atomic_long_t)]).
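The difference between the two accounting schemes can be sketched with C11 atomics in userspace. The struct and function names are hypothetical, and atomic_long from <stdatomic.h> stands in for the kernel's atomic_long_t.

```c
#include <stdatomic.h>

#define NR_CLASSES 255   /* class count quoted in the commit message */

/* Before: one counter per size class, summed on every query. */
struct pool_per_class {
    unsigned long pages_allocated[NR_CLASSES];
};

static unsigned long total_pages_per_class(const struct pool_per_class *pool)
{
    unsigned long sum = 0;

    for (int i = 0; i < NR_CLASSES; i++)
        sum += pool->pages_allocated[i];
    return sum;
}

/* After: a single pool-wide atomic counter, read in one step. */
struct pool_single {
    atomic_long pages_allocated;
};

/* nr may be negative when pages are released. */
static void pool_add_pages(struct pool_single *pool, long nr)
{
    atomic_fetch_add(&pool->pages_allocated, nr);
}

static long total_pages_single(struct pool_single *pool)
{
    return atomic_load(&pool->pages_allocated);
}
```

Reading the total becomes a single atomic load instead of a loop over every class, at the cost of one atomic update per page allocation or release.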
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Dan Streetman <ddstreet@ieee.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: <juno.choi@lge.com>
      Cc: <seungho1.park@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Reviewed-by: David Horner <ds2horner@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      13de8933
  9. 30 Aug, 2014 1 commit
  10. 07 Aug, 2014 3 commits
  11. 05 Jun, 2014 1 commit