1. 11 Apr, 2018 1 commit
  2. 31 Jan, 2018 1 commit
  3. 01 Dec, 2017 1 commit
  4. 15 Nov, 2017 1 commit
    • virtio_balloon: fix deadlock on OOM · c7cdff0e
      Michael S. Tsirkin committed
      Doing memory allocations in fill_balloon under balloon_lock
      can cause a deadlock when leak_balloon is called from
      virtballoon_oom_notify and tries to take the same lock.
      
      To fix, split page allocation and enqueue and do allocations outside the lock.
      
      Here's a detailed analysis of the deadlock by Tetsuo Handa:
      
      In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
      serialize against fill_balloon(). But in fill_balloon(),
      alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
      called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
      implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, even though __GFP_NORETRY
      is specified, this allocation attempt might indirectly depend on somebody
      else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
      __GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
      virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
      out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
      mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
      will cause OOM lockup.
      
        Thread1                                       Thread2
          fill_balloon()
            takes a balloon_lock
            balloon_page_enqueue()
              alloc_page(GFP_HIGHUSER_MOVABLE)
                direct reclaim (__GFP_FS context)       takes a fs lock
                  waits for that fs lock                  alloc_page(GFP_NOFS)
                                                            __alloc_pages_may_oom()
                                                              takes the oom_lock
                                                              out_of_memory()
                                                                blocking_notifier_call_chain()
                                                                  leak_balloon()
                                                                    tries to take that balloon_lock and deadlocks
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Wei Wang <wei.w.wang@intel.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
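The fix described in the commit above (allocate outside the lock, enqueue under it) can be sketched in userspace C. This is a minimal illustration, not the driver's code: `struct balloon_sim`, `fill_balloon_sim()`, `leak_balloon_sim()` and `BALLOON_CAP` are hypothetical names, `malloc()` stands in for `alloc_page()`, and a pthread mutex stands in for `vb->balloon_lock`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define BALLOON_CAP 64  /* hypothetical capacity of the simulated balloon */

/* Hypothetical userspace stand-in for the balloon state. */
struct balloon_sim {
    pthread_mutex_t lock;      /* plays the role of vb->balloon_lock */
    void *pages[BALLOON_CAP];  /* pages currently held by the balloon */
    size_t num_pages;
};

/* Allocate with no lock held, then take the lock only to enqueue. */
static size_t fill_balloon_sim(struct balloon_sim *b, size_t want)
{
    void *fresh[BALLOON_CAP];
    size_t got = 0, queued = 0;

    if (want > BALLOON_CAP)
        want = BALLOON_CAP;

    /* Step 1: allocations happen outside the lock, so an allocation that
     * stalls cannot block a thread that needs the lock to free memory. */
    while (got < want) {
        void *p = malloc(4096);
        if (!p)
            break;
        fresh[got++] = p;
    }

    /* Step 2: a short critical section that only links the pages in. */
    pthread_mutex_lock(&b->lock);
    while (queued < got && b->num_pages < BALLOON_CAP)
        b->pages[b->num_pages++] = fresh[queued++];
    pthread_mutex_unlock(&b->lock);

    /* Pages that did not fit go back to the allocator. */
    for (size_t i = queued; i < got; i++)
        free(fresh[i]);

    return queued;
}

/* Release up to `want` pages; this is what an OOM callback would call. */
static size_t leak_balloon_sim(struct balloon_sim *b, size_t want)
{
    size_t freed = 0;

    pthread_mutex_lock(&b->lock);
    while (freed < want && b->num_pages > 0) {
        free(b->pages[--b->num_pages]);
        freed++;
    }
    pthread_mutex_unlock(&b->lock);

    return freed;
}
```

Because the lock is never held across an allocation in this sketch, a caller of `leak_balloon_sim()` can always make progress, which is the property the commit message is after.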
  5. 25 Jul, 2017 2 commits
  6. 19 Jun, 2017 1 commit
  7. 03 May, 2017 1 commit
  8. 29 Mar, 2017 3 commits
    • virtio_balloon: prevent uninitialized variable use · f0bb2d50
      Arnd Bergmann committed
      The latest gcc-7.0.1 snapshot reports a new warning:
      
      virtio/virtio_balloon.c: In function 'update_balloon_stats':
      virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in this function [-Werror=uninitialized]
      virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in this function [-Werror=uninitialized]
      virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in this function [-Werror=uninitialized]
      virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in this function [-Werror=uninitialized]
      
      This seems absolutely right, so we should add an extra check to
      prevent copying uninitialized stack data into the statistics.
      From all I can tell, this has been broken since the statistics code
      was originally added in 2.6.34.
      
      Fixes: 9564e138 ("virtio: Add memory statistics reporting to the balloon driver (V4)")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Ladi Prosek <lprosek@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • virtio-balloon: use actual number of stats for stats queue buffers · 9646b26e
      Ladi Prosek committed
      The virtio balloon driver contained a not-so-obvious invariant that
      update_balloon_stats has to update exactly VIRTIO_BALLOON_S_NR counters
      in order to send valid stats to the host. This commit fixes it by having
      update_balloon_stats return the actual number of counters, and its
      callers use it when pushing buffers to the stats virtqueue.
      
      Note that it is still out of spec to change the number of counters
      at run-time. "Driver MUST supply the same subset of statistics in all
      buffers submitted to the statsq."
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Ladi Prosek <lprosek@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
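The two statistics commits above can be illustrated together: fill only the counters whose source data exists, return how many were written, and size the queue buffer from that count. The sketch below is a userspace approximation under stated assumptions. `struct stat_entry_sim`, `update_stats_sim()`, `stats_buf_len_sim()` and the tag numbers are hypothetical, and a NULL `events` pointer is an assumed way of modelling "event counters unavailable". As the commit notes, the set of counters should not change at run-time, so in practice the choice would be fixed for a given build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STAT_MAX_SIM 6  /* hypothetical cap, standing in for VIRTIO_BALLOON_S_NR */

/* One tagged counter, loosely modelled on a stats queue entry. */
struct stat_entry_sim {
    uint16_t tag;
    uint64_t val;
};

/* Fill only the counters that have valid source data and return the
 * number of entries written. `events` is NULL when event counters are
 * unavailable; otherwise it points to 4 values. */
static size_t update_stats_sim(struct stat_entry_sim stats[STAT_MAX_SIM],
                               const uint64_t *events,
                               uint64_t mem_free, uint64_t mem_total)
{
    size_t idx = 0;

    if (events) {
        /* Never read the events array unless it was really filled in. */
        for (size_t i = 0; i < 4; i++) {
            stats[idx].tag = (uint16_t)i;
            stats[idx].val = events[i];
            idx++;
        }
    }

    stats[idx].tag = 4;
    stats[idx].val = mem_free;
    idx++;

    stats[idx].tag = 5;
    stats[idx].val = mem_total;
    idx++;

    return idx;
}

/* The buffer pushed to the stats queue covers only the entries written. */
static size_t stats_buf_len_sim(size_t num_stats)
{
    return num_stats * sizeof(struct stat_entry_sim);
}
```

Returning the count removes the hidden requirement that the update function always writes exactly the maximum number of entries.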
    • virtio_balloon: init 1st buffer in stats vq · fc865322
      Ladi Prosek committed
      When init_vqs runs, virtio_balloon.stats is either uninitialized or
      contains stale values. The host updates its state with garbage data
      because it has no way of knowing that this is just a marker buffer
      used for signaling.
      
      This patch updates the stats before pushing the initial buffer.
      
      Alternative fixes:
      * Push an empty buffer in init_vqs. Not easily done with the current
        virtio implementation and violates the spec "Driver MUST supply the
        same subset of statistics in all buffers submitted to the statsq".
      * Push a buffer with invalid tags in init_vqs. Violates the same
        spec clause, plus "invalid tag" is not really defined.
      
      Note: the spec says:
      	When using the legacy interface, the device SHOULD ignore all values in
      	the first buffer in the statsq supplied by the driver after device
      	initialization. Note: Historically, drivers supplied an uninitialized
      	buffer in the first buffer.
      
      Unfortunately QEMU does not seem to implement the recommendation
      even for the legacy interface.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Ladi Prosek <lprosek@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  9. 02 Mar, 2017 1 commit
  10. 28 Feb, 2017 1 commit
  11. 25 Feb, 2017 1 commit
  12. 31 Oct, 2016 1 commit
    • virtio: update balloon size in balloon "probe" · 8424af53
      Konstantin Neumoin committed
      Commit 'fad7b7b2 (virtio_balloon: Use a workqueue instead of
      "vballoon" kthread)' introduced a regression. The original kthread-based
      code started the thread inside probe, and the thread immediately checked
      whether the balloon needed to be updated.
      
      The current code behaves differently. Work is queued only on the first
      command from the host after the negotiation. Thus there is a window,
      especially at guest startup or on module reload, in which the balloon
      size is not updated until the host sends a notification.
      
      This patch adds a balloon size check at the end of probe to match the
      original behaviour.
      Signed-off-by: Konstantin Neumoin <kneumoin@virtuozzo.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  13. 02 Aug, 2016 1 commit
  14. 27 Jul, 2016 2 commits
  15. 23 May, 2016 1 commit
  16. 18 Mar, 2016 1 commit
  17. 11 Mar, 2016 2 commits
    • virtio_balloon: Allow to resize and update the balloon stats in parallel · fd0e21c3
      Petr Mladek committed
      The virtio balloon statistics are not updated when the balloon
      is being resized. But it seems that both tasks could be done
      in parallel.
      
      stats_handle_request() updates the statistics in the balloon
      structure and then communicates with the host.
      
      update_balloon_stats() calls all_vm_events() that just reads
      some per-CPU variables. The values might change during and
      after the call but it is expected and happens even without
      this patch.
      
      update_balloon_stats() also calls si_meminfo(). It is a slightly
      more complex function. It too just reads some variables and
      looks safe to call without locks. In any case, it is already called
      without locks in several similar places, e.g. from post_status()
      in dm_thread_func(), or from vmballoon_send_get_target().
      
      The communication with the host is done via a separate virtqueue,
      see vb->stats_vq vs. vb->inflate_vq and vb->deflate_vq. Therefore
      it could be used in parallel with fill_balloon() and leak_balloon().
      
      This patch splits the existing work into two pieces. One updates
      the balloon stats. The other resizes the balloon.
      It seems that they can proceed in parallel without any
      extra locking.
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • virtio_balloon: Use a workqueue instead of "vballoon" kthread · fad7b7b2
      Petr Mladek committed
      This patch moves the deferred work from the "vballoon" kthread into a
      system freezable workqueue.
      
      We do not need to maintain and run a dedicated kthread. Also, the
      event-driven workqueue API makes the logic much simpler. In particular,
      we no longer need our own wait queue, wait function, and freeze point.
      
      The conversion is pretty straightforward. One cycle of the main loop
      is put into a work. The work is queued instead of waking the kthread.
      
      fill_balloon() and leak_balloon() limit the number of pages they modify
      per call. The work re-queues itself when necessary. For this, we make
      fill_balloon() return the number of pages actually modified.
      Note that leak_balloon() already did this.
      
      virtballoon_restore() queues the work only when really needed.
      
      The only complication is that we need to prevent queuing the work
      when the balloon is being removed. It was easier before because the
      kthread simply removed itself from the wait queue. We need an
      extra boolean and spin lock now.
      
      My initial idea was to use a dedicated workqueue. Michael S. Tsirkin
      suggested using a system one. Tejun Heo confirmed that the system
      workqueue has a pretty high concurrency level (256) by default.
      Therefore we need not be afraid of too long blocking.
      Signed-off-by: Petr Mladek <pmladek@suse.cz>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
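The re-queue logic described in the commit above (each pass modifies a bounded number of pages, and the work asks to run again while the target has not been reached) can be sketched as a plain loop. This is a userspace simulation under assumptions: `struct resize_sim`, `fill_sim()`, `leak_sim()`, `resize_work_sim()`, `run_until_idle_sim()` and the `PAGES_PER_PASS` limit are hypothetical, and a `while` loop stands in for the workqueue.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_PASS 256  /* hypothetical per-call limit */

struct resize_sim {
    long num_pages;  /* current balloon size */
    long target;     /* size requested by the host */
};

/* Inflate by at most PAGES_PER_PASS; return the pages actually added. */
static size_t fill_sim(struct resize_sim *b, size_t want)
{
    if (want > PAGES_PER_PASS)
        want = PAGES_PER_PASS;
    b->num_pages += (long)want;
    return want;
}

/* Deflate by at most PAGES_PER_PASS; return the pages actually removed. */
static size_t leak_sim(struct resize_sim *b, size_t want)
{
    if (want > PAGES_PER_PASS)
        want = PAGES_PER_PASS;
    if ((long)want > b->num_pages)
        want = (size_t)b->num_pages;
    b->num_pages -= (long)want;
    return want;
}

/* One run of the work item. Returns 1 if it needs to be queued again. */
static int resize_work_sim(struct resize_sim *b)
{
    long diff = b->target - b->num_pages;

    if (diff > 0)
        diff -= (long)fill_sim(b, (size_t)diff);
    else if (diff < 0)
        diff += (long)leak_sim(b, (size_t)(-diff));

    return diff != 0;
}

/* Stand-in for the workqueue: rerun the item until it stops asking. */
static int run_until_idle_sim(struct resize_sim *b)
{
    int runs = 1;

    while (resize_work_sim(b))
        runs++;
    return runs;
}
```

With a limit of 256 pages per pass, growing from 0 to 600 pages takes three runs in this model, which is why both helpers need to report how many pages they really changed.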
  18. 13 Jan, 2016 2 commits
    • virtio: make find_vqs() checkpatch.pl-friendly · f7ad26ff
      Stefan Hajnoczi committed
      checkpatch.pl wants arrays of strings declared as follows:
      
        static const char * const names[] = { "vq-1", "vq-2", "vq-3" };
      
      Currently the find_vqs() function takes a const char *names[] argument
      so passing checkpatch.pl's const char * const names[] results in a
      compiler error due to losing the second const.
      
      This patch adjusts the find_vqs() prototype and updates all virtio
      transports.  This makes it possible for virtio_balloon.c, virtio_input.c,
      virtgpu_kms.c, and virtio_rpmsg_bus.c to use the checkpatch.pl-friendly
      type.
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
    • virtio_balloon: fix race by fill and leak · f68b992b
      Minchan Kim committed
      While working on compaction, I encountered a bug
      with ballooning.
      
      With repeated inflate and deflate cycles, guest memory
      (i.e., cat /proc/meminfo | grep MemTotal) decreases and
      cannot be recovered.
      
      The reason is that balloon_lock doesn't cover release_pages_balloon,
      so struct virtio_balloon fields can be overwritten by a racing
      fill_balloon (e.g., vb->*pfns could be critical).
      
      This patch fixes it in my test.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  19. 08 Sep, 2015 2 commits
    • virtio_balloon: do not change memory amount visible via /proc/meminfo · 997e1208
      Denis V. Lunev committed
      The balloon device is frequently used as a means of cooperative memory
      control between guest and host to manage memory overcommitment. This is
      the typical case for any hosting workload where a KVM guest is provided
      to an end-user.
      
      There is a problem with this setup, though. The end-user and the hosting
      provider have signed an SLA in which some amount of memory is guaranteed
      for the guest. The good thing is that this memory will be given to the
      guest when the guest really needs it (e.g. on OOM in the guest with the
      VIRTIO_BALLOON_F_DEFLATE_ON_OOM configuration flag set). The bad thing
      is that the end-user does not know this.
      
      By default, the balloon adjusts the amount of memory exposed to the
      end-user each time a page is stolen from the guest or returned to it,
      by calling adjust_managed_page_count, and thus /proc/meminfo shows a
      reduced amount of memory.
      
      Fortunately the solution is simple: we should just avoid calling
      adjust_managed_page_count when VIRTIO_BALLOON_F_DEFLATE_ON_OOM is set.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
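The accounting rule from the commit above can be shown with a small model: the count of managed pages (what the guest reports as total memory) is adjusted on inflate and deflate only when the deflate-on-OOM feature is not negotiated. Everything here is a hypothetical simulation: `struct guest_sim`, `inflate_sim()`, `deflate_sim()` and `F_DEFLATE_ON_OOM_SIM` are illustration names, and `managed_pages` merely stands in for what `adjust_managed_page_count` would change.

```c
#include <assert.h>

#define F_DEFLATE_ON_OOM_SIM (1u << 2)  /* hypothetical feature flag for this sketch */

struct guest_sim {
    unsigned features;   /* negotiated feature bits (simulated) */
    long managed_pages;  /* memory the guest reports as available to it */
    long balloon_pages;  /* pages currently held by the balloon */
};

/* Move n pages into the balloon. */
static void inflate_sim(struct guest_sim *g, long n)
{
    g->balloon_pages += n;
    /* Hide the pages from the guest's total only without deflate-on-OOM. */
    if (!(g->features & F_DEFLATE_ON_OOM_SIM))
        g->managed_pages -= n;
}

/* Return up to n pages from the balloon to the guest. */
static void deflate_sim(struct guest_sim *g, long n)
{
    if (n > g->balloon_pages)
        n = g->balloon_pages;
    g->balloon_pages -= n;
    if (!(g->features & F_DEFLATE_ON_OOM_SIM))
        g->managed_pages += n;
}
```

In this model a guest with the feature set keeps reporting its full memory size, since the ballooned pages remain reclaimable on OOM.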
    • virtio_ballon: change stub of release_pages_by_pfn · b4d34037
      Denis V. Lunev committed
      and rename it to release_pages_balloon. The function originally took
      an array of pfns; it now takes a pointer to struct virtio_balloon.
      This change is necessary to conditionally call adjust_managed_page_count
      in the next patch.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  20. 15 Apr, 2015 1 commit
  21. 10 Mar, 2015 2 commits
  22. 21 Jan, 2015 2 commits
  23. 10 Dec, 2014 1 commit
  24. 09 Dec, 2014 1 commit
  25. 11 Nov, 2014 2 commits
    • virtio_balloon: free some memory from balloon on OOM · 5a10b7db
      Raushaniya Maksudova committed
      Excessive virtio_balloon inflation can cause invocation of the OOM-killer
      when Linux is under severe memory pressure. Various mechanisms are
      responsible for correct virtio_balloon memory management. Nevertheless,
      it is often the case that these control tools do not have enough time
      to react to a fast-changing memory load. As a result, the OS runs out of
      memory and invokes the OOM-killer. The balancing of memory by use of the
      virtio balloon should not cause the termination of processes while there
      are pages in the balloon. Currently there is no way for the virtio balloon
      driver to free some memory at the last moment before some process gets
      killed by the OOM-killer.
      
      This does not provide a security breach as balloon itself is running
      inside guest OS and is working in the cooperation with the host. Thus
      some improvements from guest side should be considered as normal.
      
      To solve the problem, introduce a virtio_balloon callback which is
      expected to be called from the oom notifier call chain in the
      out_of_memory() function. If the virtio balloon can release some memory,
      the system returns and retries the allocation that forced the out of
      memory killer to run.
      
      Allocate a virtio feature bit for this: it is not set by default, so
      the guest will not deflate the virtio balloon on OOM without explicit
      permission from the host.
      Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
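The callback flow in the commit above can be approximated as follows: on an out-of-memory event a notifier gets a chance to release balloon pages and report how many it freed, and the caller retries the allocation instead of killing a process when that number is non-zero. This is a simplified userspace model. `struct oom_balloon_sim`, `oom_notify_sim()`, `out_of_memory_sim()`, the feature flag and the `OOM_RELEASE_PAGES` batch size are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define F_DEFLATE_ON_OOM_SIM (1u << 2)  /* hypothetical feature flag */
#define OOM_RELEASE_PAGES    256        /* hypothetical release batch size */

struct oom_balloon_sim {
    unsigned features;  /* negotiated feature bits (simulated) */
    size_t num_pages;   /* pages currently held by the balloon */
};

/* Notifier callback: release a batch of pages and add it to *freed.
 * It does nothing unless the host allowed deflate-on-OOM. */
static void oom_notify_sim(struct oom_balloon_sim *b, size_t *freed)
{
    size_t n;

    if (!(b->features & F_DEFLATE_ON_OOM_SIM))
        return;

    n = b->num_pages < OOM_RELEASE_PAGES ? b->num_pages : OOM_RELEASE_PAGES;
    b->num_pages -= n;
    *freed += n;
}

/* Returns 1 if the failed allocation should be retried,
 * 0 if nothing was freed and a process would have to be killed. */
static int out_of_memory_sim(struct oom_balloon_sim *b)
{
    size_t freed = 0;

    oom_notify_sim(b, &freed);
    return freed > 0;
}
```

Gating the callback on the feature bit mirrors the commit's point that the guest deflates on OOM only with explicit permission from the host.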
    • virtio_balloon: return the amount of freed memory from leak_balloon() · 1fd9c672
      Raushaniya Maksudova committed
      This value would be useful in the next patch to provide the amount of
      the freed memory for OOM killer.
      Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com>
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  26. 15 Oct, 2014 1 commit
  27. 10 Oct, 2014 3 commits
    • mm/balloon_compaction: add vmstat counters and kpageflags bit · 09316c09
      Konstantin Khlebnikov committed
      Always mark pages with PageBalloon even if balloon compaction is disabled
      and expose this mark in /proc/kpageflags as KPF_BALLOON.
      
      Also this patch adds three counters into /proc/vmstat: "balloon_inflate",
      "balloon_deflate" and "balloon_migrate".  They accumulate balloon
      activity.  Current size of balloon is (balloon_inflate - balloon_deflate)
      pages.
      
      All generic balloon code is now gathered under the option CONFIG_MEMORY_BALLOON.
      It should be selected by any ballooning driver that wants to use this feature.
      Currently virtio-balloon is the only user.
      Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/balloon_compaction: remove balloon mapping and flag AS_BALLOON_MAP · 9d1ba805
      Konstantin Khlebnikov committed
      Now ballooned pages are detected using PageBalloon().  Fake mapping is no
      longer required.  This patch links ballooned pages to balloon device using
      field page->private instead of page->mapping.  Also this patch embeds
      balloon_dev_info directly into struct virtio_balloon.
      Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/balloon_compaction: redesign ballooned pages management · d6d86c0a
      Konstantin Khlebnikov committed
      Sasha Levin reported KASAN splash inside isolate_migratepages_range().
      Problem is in the function __is_movable_balloon_page() which tests
      AS_BALLOON_MAP in page->mapping->flags.  This function has no protection
      against anonymous pages.  As result it tried to check address space flags
      inside struct anon_vma.
      
      Further investigation shows more problems in current implementation:
      
      * Special branch in __unmap_and_move() never works:
        balloon_page_movable() checks page flags and page_count.  In
        __unmap_and_move() page is locked, reference counter is elevated, thus
        balloon_page_movable() always fails.  As a result execution goes to the
        normal migration path.  virtballoon_migratepage() returns
        MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
        move_to_new_page() thinks this is an error code and assigns
        newpage->mapping to NULL.  The newly migrated page loses connectivity with
        the balloon and all ability for further migration.
      
      * lru_lock is erroneously required in isolate_migratepages_range() for
        isolating a ballooned page.  This function releases lru_lock periodically,
        which makes migration mostly impossible for some pages.
      
      * balloon_page_dequeue has a tight race with balloon_page_isolate:
        balloon_page_isolate could be executed in parallel with dequeue between
        picking page from list and locking page_lock.  Race is rare because they
        use trylock_page() for locking.
      
      This patch fixes all of them.
      
      Instead of fake mapping with special flag this patch uses special state of
      page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256.  Buddy allocator uses
      PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose.  Storing mark
      directly in struct page makes everything safer and easier.
      
      PagePrivate is used to mark pages present in page list (i.e.  not
      isolated, like PageLRU for normal pages).  It replaces special rules for
      reference counter and makes balloon migration similar to migration of
      normal pages.  This flag is protected by page_lock together with link to
      the balloon device.
      Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: <stable@vger.kernel.org>	[3.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
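The marking scheme from the commit above (a reserved `_mapcount` value identifies a ballooned page, and a separate flag says whether it is still on the balloon's page list) can be sketched with a simplified page structure. Only the two constants come from the commit message. `struct page_sim`, its fields, the helper names, and the use of -1 as the "unmarked" value are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BUDDY_MAPCOUNT_VALUE   (-128)  /* value quoted in the commit message */
#define PAGE_BALLOON_MAPCOUNT_VALUE (-256)  /* value quoted in the commit message */

/* Simplified stand-in for struct page. */
struct page_sim {
    int mapcount;        /* -1 is assumed here to mean "no special mark" */
    int on_list;         /* plays the role of PagePrivate */
    void *private_data;  /* link to the owning balloon device */
};

/* The test reads only the page itself and never follows a mapping
 * pointer, so it is safe for any kind of page. */
static int page_balloon_sim(const struct page_sim *p)
{
    return p->mapcount == PAGE_BALLOON_MAPCOUNT_VALUE;
}

/* A page can be isolated for migration only while it is on the list. */
static int balloon_page_movable_sim(const struct page_sim *p)
{
    return page_balloon_sim(p) && p->on_list;
}

static void balloon_page_insert_sim(struct page_sim *p, void *dev)
{
    p->mapcount = PAGE_BALLOON_MAPCOUNT_VALUE;
    p->on_list = 1;
    p->private_data = dev;
}

static void balloon_page_delete_sim(struct page_sim *p)
{
    p->mapcount = -1;
    p->on_list = 0;
    p->private_data = NULL;
}
```

Keeping the mark in the page itself is what avoids the original bug, where the check interpreted another structure's memory as address space flags.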
  28. 13 Mar, 2014 1 commit