1. 13 Jan, 2016 · 1 commit
    • virtio_balloon: fix race by fill and leak · f68b992b
      Committed by Minchan Kim
      During my compaction-related stuff, I encountered a bug
      with ballooning.
      
      With repeated inflating and deflating cycles, guest memory
      (i.e., cat /proc/meminfo | grep MemTotal) decreases and
      cannot be recovered.
      
      The reason is that balloon_lock doesn't cover release_pages_balloon,
      so struct virtio_balloon fields could be overwritten by a race
      with fill_balloon (e.g., vb->*pfns could be critical).
      
      This patch fixes it in my test.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
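The locking rule behind this fix can be sketched in user space. What follows is a minimal Python model, not the driver code: the `Balloon` class, its fields, and the `stress` helper are hypothetical names, with `lock` standing in for `balloon_lock` and `pfns` for the shared `vb->pfns` scratch array. The point it illustrates is that the release step reads `pfns` while the lock is still held, so a concurrent fill cannot overwrite the buffer.

```python
import threading


class Balloon:
    """Toy model of the shared state in struct virtio_balloon (hypothetical names)."""

    def __init__(self, total_pages):
        self.lock = threading.Lock()          # stands in for vb->balloon_lock
        self.free = list(range(total_pages))  # guest pages outside the balloon
        self.pages = []                       # pages currently held by the balloon
        self.pfns = []                        # shared scratch buffer, like vb->pfns

    def fill(self, count):
        """Inflate: move up to `count` free pages into the balloon."""
        with self.lock:
            take = min(count, len(self.free))
            self.pfns = [self.free.pop() for _ in range(take)]
            self.pages.extend(self.pfns)

    def leak(self, count):
        """Deflate: move up to `count` pages back to the guest."""
        with self.lock:
            take = min(count, len(self.pages))
            self.pfns = [self.pages.pop() for _ in range(take)]
            # The release step consumes self.pfns. It runs before the lock
            # is dropped, so fill() cannot overwrite the buffer in between.
            self.free.extend(self.pfns)


def stress(balloon, rounds):
    """Run fill and leak concurrently and wait for both to finish."""
    def inflate():
        for _ in range(rounds):
            balloon.fill(8)

    def deflate():
        for _ in range(rounds):
            balloon.leak(8)

    workers = [threading.Thread(target=inflate), threading.Thread(target=deflate)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

After any number of concurrent rounds, every page should still be either free or in the balloon, which is the invariant the reported MemTotal loss violated.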
  2. 08 Sep, 2015 · 2 commits
    • virtio_balloon: do not change memory amount visible via /proc/meminfo · 997e1208
      Committed by Denis V. Lunev
      The balloon device is frequently used as a means of cooperative memory
      control between guest and host to manage memory overcommitment. This is
      the typical case for any hosting workload where a KVM guest is provided
      to the end-user.
      
      There is a problem in this setup, though. The end-user and the hosting
      provider have signed an SLA in which some amount of memory is guaranteed
      for the guest. The good thing is that this memory will be given to the
      guest when the guest really needs it (e.g., on OOM in the guest with the
      VIRTIO_BALLOON_F_DEFLATE_ON_OOM configuration flag set). The bad thing
      is that the end-user does not know this.
      
      By default, the balloon changes the amount of memory exposed to the
      end-user each time a page is stolen from the guest or returned to it,
      by calling adjust_managed_page_count, and thus /proc/meminfo shows a
      reduced amount of memory.
      
      Fortunately the solution is simple: we should just avoid calling
      adjust_managed_page_count with VIRTIO_BALLOON_F_DEFLATE_ON_OOM set.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
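The effect of this change on the visible memory total can be illustrated with a small model. This is a sketch under stated assumptions, not the driver: `GuestModel` and its methods are hypothetical, `managed_pages` stands in for the counter behind MemTotal, and the per-page `adjust_managed_page_count` calls are collapsed into one arithmetic update.

```python
VIRTIO_BALLOON_F_DEFLATE_ON_OOM = 2  # feature bit number in the virtio balloon ABI


class GuestModel:
    """Hypothetical model of how inflation affects the guest-visible total."""

    def __init__(self, managed_pages, features=frozenset()):
        self.managed_pages = managed_pages  # what MemTotal is derived from
        self.features = features            # negotiated feature bits
        self.balloon_pages = 0

    def _adjusts_meminfo(self):
        # With DEFLATE_ON_OOM negotiated, ballooned pages stay "visible"
        # because the guest can get them back when it really needs them.
        return VIRTIO_BALLOON_F_DEFLATE_ON_OOM not in self.features

    def inflate(self, count):
        self.balloon_pages += count
        if self._adjusts_meminfo():
            self.managed_pages -= count  # models adjust_managed_page_count(page, -1)

    def deflate(self, count):
        count = min(count, self.balloon_pages)
        self.balloon_pages -= count
        if self._adjusts_meminfo():
            self.managed_pages += count  # models adjust_managed_page_count(page, 1)
```

In this model a guest without the feature sees its total shrink on inflation, while a guest with the feature keeps a constant total regardless of balloon size.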
    • virtio_ballon: change stub of release_pages_by_pfn · b4d34037
      Committed by Denis V. Lunev
      and rename it to release_pages_balloon. The function originally took
      an array of pfns; it now takes a pointer to struct virtio_balloon.
      This change is necessary to conditionally call adjust_managed_page_count
      in the next patch.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  3. 15 Apr, 2015 · 1 commit
  4. 10 Mar, 2015 · 2 commits
  5. 21 Jan, 2015 · 2 commits
  6. 10 Dec, 2014 · 1 commit
  7. 09 Dec, 2014 · 1 commit
  8. 11 Nov, 2014 · 2 commits
    • virtio_balloon: free some memory from balloon on OOM · 5a10b7db
      Committed by Raushaniya Maksudova
      Excessive virtio_balloon inflation can cause invocation of the OOM-killer
      when Linux is under severe memory pressure. Various mechanisms are
      responsible for correct virtio_balloon memory management. Nevertheless,
      it is often the case that these control tools do not have enough time
      to react to a fast-changing memory load. As a result, the OS runs out of
      memory and invokes the OOM-killer. The balancing of memory by use of the
      virtio balloon should not cause the termination of processes while there
      are pages in the balloon. Currently there is no way for the virtio balloon
      driver to free some memory at the last moment before some process is
      killed by the OOM-killer.
      
      This does not introduce a security breach, as the balloon itself runs
      inside the guest OS and works in cooperation with the host. Thus
      some improvements from the guest side should be considered normal.
      
      To solve the problem, introduce a virtio_balloon callback which is
      expected to be called from the oom notifier call chain in the
      out_of_memory() function. If the virtio balloon can release some memory,
      it will make the system return and retry the allocation that forced the
      out-of-memory killer to run.
      
      Allocate a virtio feature bit for this: it is not set by default, so
      the guest will not deflate the virtio balloon on OOM without explicit
      permission from the host.
      Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
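The control flow described in this message can be modelled roughly as follows. This is an illustrative Python sketch, not kernel code: `OomNotifierChain`, `BalloonModel`, and this `out_of_memory` function are hypothetical stand-ins, and the 256-page chunk released per notification is an assumed value chosen for the example.

```python
class OomNotifierChain:
    """Hypothetical stand-in for the kernel's OOM notifier call chain."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)

    def call(self):
        # Each callback reports how many pages it managed to free.
        return sum(callback() for callback in self._callbacks)


class BalloonModel:
    """Toy balloon that gives pages back when notified of OOM."""

    def __init__(self, pages, deflate_on_oom, oom_pages=256):
        self.pages = pages                    # pages currently in the balloon
        self.deflate_on_oom = deflate_on_oom  # models the negotiated feature bit
        self.oom_pages = oom_pages            # assumed chunk size per notification

    def oom_notify(self):
        if not self.deflate_on_oom:
            return 0  # no permission from the host: leave the balloon alone
        freed = min(self.oom_pages, self.pages)
        self.pages -= freed
        return freed


def out_of_memory(chain):
    """Return "retry" if a notifier freed memory, else fall through to "kill"."""
    if chain.call() > 0:
        return "retry"
    return "kill"
```

A balloon holding 300 pages would, in this model, absorb two OOM events before the killer is reached; a balloon without the feature bit never changes size.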
    • virtio_balloon: return the amount of freed memory from leak_balloon() · 1fd9c672
      Committed by Raushaniya Maksudova
      This value would be useful in the next patch to provide the amount of
      the freed memory for OOM killer.
      Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  9. 15 Oct, 2014 · 1 commit
  10. 10 Oct, 2014 · 3 commits
    • mm/balloon_compaction: add vmstat counters and kpageflags bit · 09316c09
      Committed by Konstantin Khlebnikov
      Always mark pages with PageBalloon even if balloon compaction is disabled
      and expose this mark in /proc/kpageflags as KPF_BALLOON.
      
      Also this patch adds three counters into /proc/vmstat: "balloon_inflate",
      "balloon_deflate" and "balloon_migrate".  They accumulate balloon
      activity.  Current size of balloon is (balloon_inflate - balloon_deflate)
      pages.
      
      All generic balloon code is now gathered under the option
      CONFIG_MEMORY_BALLOON. It should be selected by any ballooning driver
      that wants to use this feature. Currently virtio-balloon is the only user.
      Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
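The formula in this message (current balloon size is balloon_inflate minus balloon_deflate) can be applied from user space by reading the two counters. A minimal Python sketch follows; the helper name `balloon_pages` is hypothetical, and it assumes the usual "name value" line layout of /proc/vmstat.

```python
def balloon_pages(vmstat_text):
    """Return balloon_inflate - balloon_deflate, or None if the counters are absent."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # /proc/vmstat lines are assumed to look like "<name> <value>".
        if len(parts) == 2 and parts[0].startswith("balloon_"):
            counters[parts[0]] = int(parts[1])
    if "balloon_inflate" not in counters or "balloon_deflate" not in counters:
        return None  # e.g. a kernel built without CONFIG_MEMORY_BALLOON
    return counters["balloon_inflate"] - counters["balloon_deflate"]
```

On a live guest the input would come from something like `open("/proc/vmstat").read()`; the result is in pages, so it needs to be multiplied by the page size to get bytes.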
    • mm/balloon_compaction: remove balloon mapping and flag AS_BALLOON_MAP · 9d1ba805
      Committed by Konstantin Khlebnikov
      Now ballooned pages are detected using PageBalloon().  Fake mapping is no
      longer required.  This patch links ballooned pages to balloon device using
      field page->private instead of page->mapping.  Also this patch embeds
      balloon_dev_info directly into struct virtio_balloon.
      Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/balloon_compaction: redesign ballooned pages management · d6d86c0a
      Committed by Konstantin Khlebnikov
      Sasha Levin reported KASAN splash inside isolate_migratepages_range().
      The problem is in the function __is_movable_balloon_page(), which tests
      AS_BALLOON_MAP in page->mapping->flags.  This function has no protection
      against anonymous pages.  As a result it tried to check address space
      flags inside struct anon_vma.
      
      Further investigation shows more problems in the current implementation:
      
      * The special branch in __unmap_and_move() never works:
        balloon_page_movable() checks page flags and page_count.  In
        __unmap_and_move() the page is locked and the reference counter is
        elevated, thus balloon_page_movable() always fails.  As a result
        execution goes to the normal migration path.  virtballoon_migratepage()
        returns MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
        move_to_new_page() thinks this is an error code and assigns
        newpage->mapping to NULL.  The newly migrated page loses connectivity
        with the balloon and all ability for further migration.
      
      * lru_lock is erroneously required in isolate_migratepages_range() for
        isolating a ballooned page.  This function releases lru_lock
        periodically, which makes migration mostly impossible for some pages.
      
      * balloon_page_dequeue has a tight race with balloon_page_isolate:
        balloon_page_isolate could be executed in parallel with dequeue between
        picking the page from the list and locking page_lock.  The race is rare
        because they use trylock_page() for locking.
      
      This patch fixes all of them.
      
      Instead of fake mapping with special flag this patch uses special state of
      page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256.  Buddy allocator uses
      PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose.  Storing mark
      directly in struct page makes everything safer and easier.
      
      PagePrivate is used to mark pages present in page list (i.e.  not
      isolated, like PageLRU for normal pages).  It replaces special rules for
      reference counter and makes balloon migration similar to migration of
      normal pages.  This flag is protected by page_lock together with link to
      the balloon device.
      Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: <stable@vger.kernel.org>	[3.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
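The marking scheme described here, a special `_mapcount` value plus a "present in the page list" flag, can be sketched as a small model. This is only an illustration: the `Page` class and the helper functions are hypothetical Python stand-ins, the two constants are the values quoted in the message, and -1 is assumed as the "not mapped" initial `_mapcount`.

```python
PAGE_BUDDY_MAPCOUNT_VALUE = -128    # marker used by the buddy allocator
PAGE_BALLOON_MAPCOUNT_VALUE = -256  # marker introduced for ballooned pages


class Page:
    """Hypothetical stand-in for the struct page fields involved."""

    def __init__(self):
        self.mapcount = -1    # assumed "not mapped" initial value
        self.private = None   # link to the balloon device (page->private)
        self.in_list = False  # models PagePrivate: page is on the balloon list


def page_is_balloon(page):
    # The check reads only the page itself; no page->mapping dereference.
    return page.mapcount == PAGE_BALLOON_MAPCOUNT_VALUE


def balloon_insert(page, device):
    page.mapcount = PAGE_BALLOON_MAPCOUNT_VALUE
    page.private = device
    page.in_list = True


def balloon_isolate(page):
    """Take the page off the list for migration; fails if it is not eligible."""
    if not (page_is_balloon(page) and page.in_list):
        return False
    page.in_list = False
    return True


def balloon_delete(page):
    page.mapcount = -1
    page.private = None
    page.in_list = False
```

Because the marker lives in the page itself, an anonymous page (whose mapping field points at an anon_vma) simply fails the check instead of having unrelated memory interpreted as flags.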
  11. 13 Mar, 2014 · 2 commits
  12. 16 Jan, 2014 · 1 commit
  13. 05 Dec, 2013 · 1 commit
  14. 17 Oct, 2013 · 1 commit
  15. 23 Sep, 2013 · 1 commit
  16. 04 Jul, 2013 · 1 commit
    • mm: correctly update zone->managed_pages · 3dcc0571
      Committed by Jiang Liu
      Enhance adjust_managed_page_count() to adjust totalhigh_pages for
      highmem pages.  And change code which directly adjusts totalram_pages to
      use adjust_managed_page_count() because it adjusts totalram_pages,
      totalhigh_pages and zone->managed_pages altogether in a safe way.
      
      Remove inc_totalhigh_pages() and dec_totalhigh_pages() from xen/balloon
      driver because adjust_managed_page_count() has already adjusted
      totalhigh_pages.
      
      This patch also fixes two bugs:
      
      1) enhance virtio_balloon driver to adjust totalhigh_pages when
         reserving/unreserving pages.
      2) enhance memory_hotplug.c to adjust totalhigh_pages when hot-removing
         memory.
      
      We still need to deal with modifications of totalram_pages in file
      arch/powerpc/platforms/pseries/cmm.c, but need help from PPC experts.
      
      [akpm@linux-foundation.org: remove ifdef, per Wanpeng Li, virtio_balloon.c cleanup, per Sergei]
      [akpm@linux-foundation.org: export adjust_managed_page_count() to modules, for drivers/virtio/virtio_balloon.c]
      Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <sworddragon2@aol.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
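The idea of updating all three counters in one place can be sketched as below. This is a simplified Python model, not the kernel function: `Zone` and `MemoryCounters` are hypothetical classes, the zone is passed directly instead of being looked up from a page, and a `threading.Lock` stands in for whatever lock the kernel uses to keep the counters consistent.

```python
import threading


class Zone:
    """Hypothetical memory zone with its own managed page count."""

    def __init__(self, managed_pages, is_highmem=False):
        self.managed_pages = managed_pages
        self.is_highmem = is_highmem


class MemoryCounters:
    """Global totals that must stay in step with the per-zone counts."""

    def __init__(self, zones):
        self.lock = threading.Lock()
        self.totalram_pages = sum(z.managed_pages for z in zones)
        self.totalhigh_pages = sum(z.managed_pages for z in zones if z.is_highmem)

    def adjust_managed_page_count(self, zone, count):
        # One helper updates every counter together, so callers such as a
        # balloon driver cannot forget totalhigh_pages for highmem pages.
        with self.lock:
            zone.managed_pages += count
            self.totalram_pages += count
            if zone.is_highmem:
                self.totalhigh_pages += count
```

A driver that only decremented the global total directly would leave the highmem and per-zone counts stale, which is the class of bug this message says it fixes for virtio_balloon and memory hot-remove.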
  17. 02 Jul, 2013 · 1 commit
  18. 20 Mar, 2013 · 1 commit
  19. 13 Feb, 2013 · 1 commit
  20. 04 Jan, 2013 · 1 commit
    • Drivers: virtio: remove __dev* attributes. · 8590dbc7
      Committed by Greg Kroah-Hartman
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      
      This change removes the use of __devinit, __devexit_p, and __devexit
      from these drivers.
      
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  21. 18 Dec, 2012 · 1 commit
  22. 12 Dec, 2012 · 1 commit
    • virtio_balloon: introduce migration primitives to balloon pages · e2250429
      Committed by Rafael Aquini
      Memory fragmentation introduced by ballooning might reduce significantly
      the number of 2MB contiguous memory blocks that can be used within a guest,
      thus imposing performance penalties associated with the reduced number of
      transparent huge pages that could be used by the guest workload.
      
      Besides making balloon pages movable at allocation time and introducing
      the necessary primitives to perform balloon page migration/compaction,
      this patch also introduces the following locking scheme, in order to
      enhance the synchronization methods for accessing elements of struct
      virtio_balloon, thus providing protection against concurrent access
      introduced by parallel memory migration threads.
      
       - balloon_lock (mutex) : synchronizes the access demand to elements of
                                struct virtio_balloon and its queue operations;
      
      [yongjun_wei@trendmicro.com.cn: fix missing unlock on error in fill_balloon()]
      [akpm@linux-foundation.org: avoid having multiple return points in fill_balloon()]
      [akpm@linux-foundation.org: fix printk warning]
      Signed-off-by: Rafael Aquini <aquini@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 09 Jul, 2012 · 1 commit
    • virtio-balloon: fix add/get API use · 9c378abc
      Committed by Michael S. Tsirkin
      Since ee7cd898 'virtio: expose added
      descriptors immediately.', in virtio balloon virtqueue_get_buf might
      now run concurrently with virtqueue_kick.  I audited both and this
      seems safe in practice but this is not guaranteed by the API.
      Additionally, a spurious interrupt might in theory make
      virtqueue_get_buf run in parallel with virtqueue_add_buf, which is
      racy.
      
      While we might try to protect against spurious callbacks it's
      easier to fix the driver: balloon seems to be the only one
      (mis)using the API like this, so let's just fix balloon.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed unused var)
  24. 22 May, 2012 · 2 commits
  25. 17 May, 2012 · 1 commit
  26. 15 Apr, 2012 · 2 commits
  27. 31 Mar, 2012 · 2 commits
  28. 01 Mar, 2012 · 1 commit
    • virtio: balloon: leak / fill balloon across S4 · 4eb05d56
      Committed by Amit Shah
      commit e562966d added support for S4 to
      the balloon driver.  The freeze function did nothing to free the pages,
      since reclaiming the pages from the host to immediately give them back
      (if S4 was successful) seemed wasteful.  Also, if S4 wasn't successful,
      the guest would have to re-fill the balloon.  On restore, the pages were
      supposed to be marked freed and the free page counters were incremented
      to reflect the balloon was totally deflated.
      
      However, this wasn't done right.  The pages that were earlier taken away
      from the guest during a balloon inflation operation were just shown as
      used pages after a successful restore from S4.  Just a fancy way of
      leaking lots of memory.
      
      Instead of trying that, just leak the balloon on freeze and fill it on
      restore/thaw paths.  This works properly now.  The optimisation to not
      leak can be added later on after a bit of refactoring of the code.
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
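The "leak on freeze, fill on restore/thaw" approach can be summarised with a toy state machine. This Python sketch is purely illustrative: `HibernatingBalloon` and its fields are hypothetical, and the host's requested size is modelled as a plain `target` integer.

```python
class HibernatingBalloon:
    """Toy model of balloon accounting across an S4 (hibernate) cycle."""

    def __init__(self, guest_pages):
        self.guest_pages = guest_pages  # pages usable by the guest
        self.num_pages = 0              # pages currently in the balloon
        self.target = 0                 # balloon size requested by the host

    def _fill_to_target(self):
        grow = min(self.target - self.num_pages, self.guest_pages)
        if grow > 0:
            self.num_pages += grow
            self.guest_pages -= grow

    def set_target(self, target):
        self.target = target
        self._fill_to_target()

    def freeze(self):
        # Leak the whole balloon so no page is left in limbo in the image.
        self.guest_pages += self.num_pages
        self.num_pages = 0

    def restore(self):
        # Re-inflate towards the host's target after a successful resume.
        self._fill_to_target()

    # A failed hibernation takes the thaw path, which refills the same way.
    thaw = restore
```

The total of `guest_pages + num_pages` stays constant through the whole cycle, which is exactly what the earlier counter-fudging approach failed to guarantee.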
  29. 12 Jan, 2012 · 2 commits
    • virtio: balloon: Add freeze, restore handlers to support S4 · e562966d
      Committed by Amit Shah
      Handling balloon hibernate / restore is tricky.  If the balloon was
      inflated before going into the hibernation state, upon resume, the host
      will not have any memory of that.  Any pages that were passed on to the
      host earlier would most likely be invalid, and the host will have to
      re-balloon to the previous value to get in the pre-hibernate state.
      
      So the only sane thing for the guest to do here is to discard all the
      pages that were put in the balloon.  When to discard the pages is the
      next question.
      
      One solution is to deflate the balloon just before writing the image to
      the disk (in the freeze() PM callback).  However, asking for pages from
      the host just to discard them immediately after seems wasteful of
      resources.  Hence, it makes sense to do this by just fudging our
      counters soon after wakeup.  This means we don't deflate the balloon
      before sleep, and also don't put unnecessary pressure on the host.
      
      This also helps in the thaw case: if the freeze fails for whatever
      reason, the balloon should continue to remain in the inflated state.
      This was tested by issuing 'swapoff -a' and trying to go into the S4
      state.  That fails, and the balloon stays inflated, as expected.  Both
      the host and the guest are happy.
      
      Finally, in the restore() callback, we empty the list of pages that were
      previously given off to the host, add the appropriate number of pages to
      the totalram_pages counter, reset the num_pages counter to 0, and
      all is fine.
      
      As a last step, delete the vqs on the freeze callback to prepare for
      hibernation, and re-create them in the restore and thaw callbacks to
      resume normal operation.
      
      The kthread doesn't race with any operations here, since it's frozen
      before the freeze() call and is thawed after the thaw() and restore()
      callbacks, so we're safe with that.
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • virtio: balloon: Move vq initialization into separate function · be91c33d
      Committed by Amit Shah
      The probe and PM restore functions will share this code.
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>