1. 26 6月, 2015 11 次提交
  2. 11 6月, 2015 1 次提交
  3. 06 5月, 2015 1 次提交
  4. 16 4月, 2015 8 次提交
  5. 01 3月, 2015 1 次提交
  6. 13 2月, 2015 8 次提交
    • G
      mm/zpool: add name argument to create zpool · 3eba0c6a
      Ganesh Mahendran 提交于
      Currently the underlay of zpool: zsmalloc/zbud, do not know who creates
      them.  There is not a method to let zsmalloc/zbud find which caller they
      belong to.
      
      Now we want to add statistics collection in zsmalloc.  We need to name the
      debugfs dir for each pool created.  The way suggested by Minchan Kim is to
      use a name passed by caller(such as zram) to create the zsmalloc pool.
      
          /sys/kernel/debug/zsmalloc/zram0
      
      This patch adds an argument `name' to zs_create_pool() and other related
      functions.
      Signed-off-by: NGanesh Mahendran <opensource.ganesh@gmail.com>
      Acked-by: NMinchan Kim <minchan@kernel.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3eba0c6a
    • S
      zram: remove request_queue from struct zram · ee980160
      Sergey Senozhatsky 提交于
      `struct zram' contains both `struct gendisk' and `struct request_queue'.
      the latter can be deleted, because zram->disk carries ->queue pointer, and
      ->queue carries zram pointer:
      
      create_device()
      	zram->queue->queuedata = zram
      	zram->disk->queue = zram->queue
      	zram->disk->private_data = zram
      
      so zram->queue is not needed, we can access all necessary data anyway.
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ee980160
    • M
      zram: remove init_lock in zram_make_request · 08eee69f
      Minchan Kim 提交于
      Admin could reset zram during I/O operation going on so we have used
      zram->init_lock as read-side lock in I/O path to prevent sudden zram
      meta freeing.
      
      However, the init_lock is really troublesome.  We can't do call
      zram_meta_alloc under init_lock due to lockdep splat because
      zram_rw_page is one of the function under reclaim path and hold it as
      read_lock while other places in process context hold it as write_lock.
      So, we have used allocation out of the lock to avoid lockdep warn but
      it's not good for readability and fainally, I met another lockdep splat
      between init_lock and cpu_hotplug from kmem_cache_destroy during working
      zsmalloc compaction.  :(
      
      Yes, the ideal is to remove horrible init_lock of zram in rw path.  This
      patch removes it in rw path and instead, add atomic refcount for meta
      lifetime management and completion to free meta in process context.
      It's important to free meta in process context because some of resource
      destruction needs mutex lock, which could be held if we releases the
      resource in reclaim context so it's deadlock, again.
      
      As a bonus, we could remove init_done check in rw path because
      zram_meta_get will do a role for it, instead.
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: NMinchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Ganesh Mahendran <opensource.ganesh@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      08eee69f
    • M
      zram: check bd_openers instead of bd_holders · 2b269ce6
      Minchan Kim 提交于
      bd_holders is increased only when user open the device file as FMODE_EXCL
      so if something opens zram0 as !FMODE_EXCL and request I/O while another
      user reset zram0, we can see following warning.
      
        zram0: detected capacity change from 0 to 64424509440
        Buffer I/O error on dev zram0, logical block 180823, lost async page write
        Buffer I/O error on dev zram0, logical block 180824, lost async page write
        Buffer I/O error on dev zram0, logical block 180825, lost async page write
        Buffer I/O error on dev zram0, logical block 180826, lost async page write
        Buffer I/O error on dev zram0, logical block 180827, lost async page write
        Buffer I/O error on dev zram0, logical block 180828, lost async page write
        Buffer I/O error on dev zram0, logical block 180829, lost async page write
        Buffer I/O error on dev zram0, logical block 180830, lost async page write
        Buffer I/O error on dev zram0, logical block 180831, lost async page write
        Buffer I/O error on dev zram0, logical block 180832, lost async page write
        ------------[ cut here ]------------
        WARNING: CPU: 11 PID: 1996 at fs/block_dev.c:57 __blkdev_put+0x1d7/0x210()
        Modules linked in:
        CPU: 11 PID: 1996 Comm: dd Not tainted 3.19.0-rc6-next-20150202+ #1125
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
          dump_stack+0x45/0x57
          warn_slowpath_common+0x8a/0xc0
          warn_slowpath_null+0x1a/0x20
          __blkdev_put+0x1d7/0x210
          blkdev_put+0x50/0x130
          blkdev_close+0x25/0x30
          __fput+0xdf/0x1e0
          ____fput+0xe/0x10
          task_work_run+0xa7/0xe0
          do_notify_resume+0x49/0x60
          int_signal+0x12/0x17
        ---[ end trace 274fbbc5664827d2 ]---
      
      The warning comes from bdev_write_node in blkdev_put path.
      
         static void bdev_write_inode(struct inode *inode)
         {
              spin_lock(&inode->i_lock);
              while (inode->i_state & I_DIRTY) {
                      spin_unlock(&inode->i_lock);
                      WARN_ON_ONCE(write_inode_now(inode, true)); <========= here.
                      spin_lock(&inode->i_lock);
              }
              spin_unlock(&inode->i_lock);
         }
      
      The reason is dd process encounters I/O fails due to sudden block device
      disappear so in filemap_check_errors in __writeback_single_inode returns
      -EIO.
      
      If we check bd_openers instead of bd_holders, we could address the
      problem.  When I see the brd, it already have used it rather than
      bd_holders so although I'm not a expert of block layer, it seems to be
      better.
      
      I can make following warning with below simple script.  In addition, I
      added msleep(2000) below set_capacity(zram->disk, 0) after applying your
      patch to make window huge(Kudos to Ganesh!)
      
      script:
      
         echo $((60<<30)) > /sys/block/zram0/disksize
         setsid dd if=/dev/zero of=/dev/zram0 &
         sleep 1
         setsid echo 1 > /sys/block/zram0/reset
      Signed-off-by: NMinchan Kim <minchan@kernel.org>
      Acked-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Ganesh Mahendran <opensource.ganesh@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2b269ce6
    • S
      zram: rework reset and destroy path · a096cafc
      Sergey Senozhatsky 提交于
      We need to return set_capacity(disk, 0) from reset_store() back to
      zram_reset_device(), a catch by Ganesh Mahendran.  Potentially, we can
      race set_capacity() calls from init and reset paths.
      
      The problem is that zram_reset_device() is also getting called from
      zram_exit(), which performs operations in misleading reversed order -- we
      first create_device() and then init it, while zram_exit() perform
      destroy_device() first and then does zram_reset_device().  This is done to
      remove sysfs group before we reset device, so we can continue with device
      reset/destruction not being raced by sysfs attr write (f.e.  disksize).
      
      Apart from that, destroy_device() releases zram->disk (but we still have
      ->disk pointer), so we cannot acces zram->disk in later
      zram_reset_device() call, which may cause additional errors in the future.
      
      So, this patch rework and cleanup destroy path.
      
      1) remove several unneeded goto labels in zram_init()
      
      2) factor out zram_init() error path and zram_exit() into
         destroy_devices() function, which takes the number of devices to
         destroy as its argument.
      
      3) remove sysfs group in destroy_devices() first, so we can reorder
         operations -- reset device (as expected) goes before disk destroy and
         queue cleanup.  So we can always access ->disk in zram_reset_device().
      
      4) and, finally, return set_capacity() back under ->init_lock.
      
      [akpm@linux-foundation.org: tweak comment]
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: NGanesh Mahendran <opensource.ganesh@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a096cafc
    • S
      zram: fix umount-reset_store-mount race condition · ba6b17d6
      Sergey Senozhatsky 提交于
      Ganesh Mahendran was the first one who proposed to use bdev->bd_mutex to
      avoid ->bd_holders race condition:
      
              CPU0                            CPU1
      umount /* zram->init_done is true */
      reset_store()
      bdev->bd_holders == 0                   mount
      ...                                     zram_make_request()
      zram_reset_device()
      
      However, his solution required some considerable amount of code movement,
      which we can avoid.
      
      Apart from using bdev->bd_mutex in reset_store(), this patch also
      simplifies zram_reset_device().
      
      zram_reset_device() has a bool parameter reset_capacity which tells it
      whether disk capacity and itself disk should be reset.  There are two
      zram_reset_device() callers:
      
      -- zram_exit() passes reset_capacity=false
      -- reset_store() passes reset_capacity=true
      
      So we can move reset_capacity-sensitive work out of zram_reset_device()
      and perform it unconditionally in reset_store().  This also lets us drop
      reset_capacity parameter from zram_reset_device() and pass zram pointer
      only.
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: NGanesh Mahendran <opensource.ganesh@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ba6b17d6
    • G
      zram: free meta table in zram_meta_free · 1fec1172
      Ganesh Mahendran 提交于
      zram_meta_alloc() and zram_meta_free() are a pair.  In
      zram_meta_alloc(), meta table is allocated.  So it it better to free it
      in zram_meta_free().
      Signed-off-by: NGanesh Mahendran <opensource.ganesh@gmail.com>
      Acked-by: NMinchan Kim <minchan@kernel.org>
      Acked-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1fec1172
    • S
      zram: clean up zram_meta_alloc() · b8179958
      Sergey Senozhatsky 提交于
      A trivial cleanup of zram_meta_alloc() error handling.
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: NMinchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b8179958
  7. 14 12月, 2014 4 次提交
  8. 14 11月, 2014 1 次提交
  9. 30 10月, 2014 1 次提交
  10. 10 10月, 2014 4 次提交
    • S
      zram: use notify_free to account all free notifications · 015254da
      Sergey Senozhatsky 提交于
      `notify_free' device attribute accounts the number of slot free
      notifications and internally represents the number of zram_free_page()
      calls.  Slot free notifications are sent only when device is used as a
      swap device, hence `notify_free' is used only for swap devices.  Since
      f4659d8e (zram: support REQ_DISCARD) ZRAM handles yet another one
      free notification (also via zram_free_page() call) -- REQ_DISCARD
      requests, which are sent by a filesystem, whenever some data blocks are
      discarded.  However, there is no way to know the number of notifications
      in the latter case.
      
      Use `notify_free' to account the number of pages freed by
      zram_bio_discard() and zram_slot_free_notify().  Depending on usage
      scenario `notify_free' represents:
      
       a) the number of pages freed because of slot free notifications, which is
         equal to the number of swap_slot_free_notify() calls, so there is no
         behaviour change
      
       b) the number of pages freed because of REQ_DISCARD notifications
      Signed-off-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: NMinchan Kim <minchan@kernel.org>
      Acked-by: NJerome Marchand <jmarchan@redhat.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Chao Yu <chao2.yu@samsung.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      015254da
    • M
      zram: report maximum used memory · 461a8eee
      Minchan Kim 提交于
      Normally, zram user could get maximum memory usage zram consumed via
      polling mem_used_total with sysfs in userspace.
      
      But it has a critical problem because user can miss peak memory usage
      during update inverval of polling.  For avoiding that, user should poll it
      with shorter interval(ie, 0.0000000001s) with mlocking to avoid page fault
      delay when memory pressure is heavy.  It would be troublesome.
      
      This patch adds new knob "mem_used_max" so user could see the maximum
      memory usage easily via reading the knob and reset it via "echo 0 >
      /sys/block/zram0/mem_used_max".
      Signed-off-by: NMinchan Kim <minchan@kernel.org>
      Reviewed-by: NDan Streetman <ddstreet@ieee.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: <juno.choi@lge.com>
      Cc: <seungho1.park@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Reviewed-by: NDavid Horner <ds2horner@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      461a8eee
    • M
      zram: zram memory size limitation · 9ada9da9
      Minchan Kim 提交于
      Since zram has no control feature to limit memory usage, it makes hard to
      manage system memrory.
      
      This patch adds new knob "mem_limit" via sysfs to set up the a limit so
      that zram could fail allocation once it reaches the limit.
      
      In addition, user could change the limit in runtime so that he could
      manage the memory more dynamically.
      
      Initial state is no limit so it doesn't break old behavior.
      
      [akpm@linux-foundation.org: fix typo, per Sergey]
      Signed-off-by: NMinchan Kim <minchan@kernel.org>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: <juno.choi@lge.com>
      Cc: <seungho1.park@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: David Horner <ds2horner@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9ada9da9
    • M
      zsmalloc: change return value unit of zs_get_total_size_bytes · 722cdc17
      Minchan Kim 提交于
      zs_get_total_size_bytes returns a amount of memory zsmalloc consumed with
      *byte unit* but zsmalloc operates *page unit* rather than byte unit so
      let's change the API so benefit we could get is that reduce unnecessary
      overhead (ie, change page unit with byte unit) in zsmalloc.
      
      Since return type is pages, "zs_get_total_pages" is better than
      "zs_get_total_size_bytes".
      Signed-off-by: NMinchan Kim <minchan@kernel.org>
      Reviewed-by: NDan Streetman <ddstreet@ieee.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: <juno.choi@lge.com>
      Cc: <seungho1.park@lge.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: David Horner <ds2horner@gmail.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      722cdc17