1. 09 12月, 2014 11 次提交
  2. 26 11月, 2014 1 次提交
    • A
      kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn() · d3fccc7e
      Ard Biesheuvel 提交于
      This reverts commit 85c8555f ("KVM: check for !is_zero_pfn() in
      kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn.
      
      The problem being addressed by the patch above was that some ARM code
      based the memory mapping attributes of a pfn on the return value of
      kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should
      be mapped as device memory.
      
      However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin,
      and the existing non-ARM users were already using it in a way which
      suggests that its name should probably have been 'kvm_is_reserved_pfn'
      from the beginning, e.g., whether or not to call get_page/put_page on
      it etc. This means that returning false for the zero page is a mistake
      and the patch above should be reverted.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      d3fccc7e
  3. 24 11月, 2014 2 次提交
  4. 18 11月, 2014 2 次提交
    • D
      can: dev: add can_is_canfd_skb() API · 98e69016
      Dong Aisheng 提交于
      The CAN device drivers can use can_is_canfd_skb() to check if the frame to send
      is on CAN FD mode or normal CAN mode.
      Acked-by: NOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: NDong Aisheng <b29396@freescale.com>
      Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      98e69016
    • J
      clk-divider: Fix READ_ONLY when divider > 1 · e6d5e7d9
      James Hogan 提交于
      Commit 79c6ab50 (clk: divider: add CLK_DIVIDER_READ_ONLY flag) in
      v3.16 introduced the CLK_DIVIDER_READ_ONLY flag which caused the
      recalc_rate() and round_rate() clock callbacks to be omitted.
      
      However using this flag has the unfortunate side effect of causing the
      clock recalculation code when a clock rate change is attempted to always
      treat it as a pass-through clock, i.e. with a fixed divide of 1, which
      may not be the case. Child clock rates are then recalculated using the
      wrong parent rate.
      
      Therefore instead of dropping the recalc_rate() and round_rate()
      callbacks, alter clk_divider_bestdiv() to always report the current
      divider as the best divider so that it is never altered.
      
      For me the read only clock was the system clock, which divided the PLL
      rate by 2, from which both the UART and the SPI clocks were divided.
      Initial setting of the UART rate set it correctly, but when the SPI
      clock was set, the other child clocks were miscalculated. The UART clock
      was recalculated using the PLL rate as the parent rate, resulting in a
      UART new_rate of double what it should be, and a UART which spewed forth
      garbage when the rate changes were propagated.
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Cc: Thomas Abraham <thomas.ab@samsung.com>
      Cc: Tomasz Figa <t.figa@samsung.com>
      Cc: Max Schwarz <max.schwarz@online.de>
      Cc: <stable@vger.kernel.org> # v3.16+
      Acked-by: NHaojian Zhuang <haojian.zhuang@gmail.com>
      Signed-off-by: NMichael Turquette <mturquette@linaro.org>
      e6d5e7d9
  5. 16 11月, 2014 3 次提交
  6. 15 11月, 2014 1 次提交
  7. 14 11月, 2014 2 次提交
    • T
      mem-hotplug: reset node managed pages when hot-adding a new pgdat · f784a3f1
      Tang Chen 提交于
      In free_area_init_core(), zone->managed_pages is set to an approximate
      value for lowmem, and will be adjusted when the bootmem allocator frees
      pages into the buddy system.
      
      But free_area_init_core() is also called by hotadd_new_pgdat() when
      hot-adding memory.  As a result, zone->managed_pages of the newly added
      node's pgdat is set to an approximate value in the very beginning.
      
      Even if the memory on that node has node been onlined,
      /sys/device/system/node/nodeXXX/meminfo has wrong value:
      
        hot-add node2 (memory not onlined)
        cat /sys/device/system/node/node2/meminfo
        Node 2 MemTotal:       33554432 kB
        Node 2 MemFree:               0 kB
        Node 2 MemUsed:        33554432 kB
        Node 2 Active:                0 kB
      
      This patch fixes this problem by reset node managed pages to 0 after
      hot-adding a new node.
      
      1. Move reset_managed_pages_done from reset_node_managed_pages() to
         reset_all_zones_managed_pages()
      2. Make reset_node_managed_pages() non-static
      3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat
         is initialized
      Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>	[3.16+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f784a3f1
    • J
      mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype · ad53f92e
      Joonsoo Kim 提交于
      Before describing bugs itself, I first explain definition of freepage.
      
       1. pages on buddy list are counted as freepage.
       2. pages on isolate migratetype buddy list are *not* counted as freepage.
       3. pages on cma buddy list are counted as CMA freepage, too.
      
      Now, I describe problems and related patch.
      
      Patch 1: There is race conditions on getting pageblock migratetype that
      it results in misplacement of freepages on buddy list, incorrect
      freepage count and un-availability of freepage.
      
      Patch 2: Freepages on pcp list could have stale cached information to
      determine migratetype of buddy list to go.  This causes misplacement of
      freepages on buddy list and incorrect freepage count.
      
      Patch 4: Merging between freepages on different migratetype of
      pageblocks will cause freepages accouting problem.  This patch fixes it.
      
      Without patchset [3], above problem doesn't happens on my CMA allocation
      test, because CMA reserved pages aren't used at all.  So there is no
      chance for above race.
      
      With patchset [3], I did simple CMA allocation test and get below
      result:
      
       - Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
       - run kernel build (make -j16) on background
       - 30 times CMA allocation(8MB * 30 = 240MB) attempts in 5 sec interval
       - Result: more than 5000 freepage count are missed
      
      With patchset [3] and this patchset, I found that no freepage count are
      missed so that I conclude that problems are solved.
      
      On my simple memory offlining test, these problems also occur on that
      environment, too.
      
      This patch (of 4):
      
      There are two paths to reach core free function of buddy allocator,
      __free_one_page(), one is free_one_page()->__free_one_page() and the
      other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
      Each paths has race condition causing serious problems.  At first, this
      patch is focused on first type of freepath.  And then, following patch
      will solve the problem in second type of freepath.
      
      In the first type of freepath, we got migratetype of freeing page
      without holding the zone lock, so it could be racy.  There are two cases
      of this race.
      
       1. pages are added to isolate buddy list after restoring orignal
          migratetype
      
          CPU1                                   CPU2
      
          get migratetype => return MIGRATE_ISOLATE
          call free_one_page() with MIGRATE_ISOLATE
      
                                      grab the zone lock
                                      unisolate pageblock
                                      release the zone lock
      
          grab the zone lock
          call __free_one_page() with MIGRATE_ISOLATE
          freepage go into isolate buddy list,
          although pageblock is already unisolated
      
      This may cause two problems.  One is that we can't use this page anymore
      until next isolation attempt of this pageblock, because freepage is on
      isolate buddy list.  The other is that freepage accouting could be wrong
      due to merging between different buddy list.  Freepages on isolate buddy
      list aren't counted as freepage, but ones on normal buddy list are
      counted as freepage.  If merge happens, buddy freepage on normal buddy
      list is inevitably moved to isolate buddy list without any consideration
      of freepage accouting so it could be incorrect.
      
       2. pages are added to normal buddy list while pageblock is isolated.
          It is similar with above case.
      
      This also may cause two problems.  One is that we can't keep these
      freepages from being allocated.  Although this pageblock is isolated,
      freepage would be added to normal buddy list so that it could be
      allocated without any restriction.  And the other problem is same as
      case 1, that it, incorrect freepage accouting.
      
      This race condition would be prevented by checking migratetype again
      with holding the zone lock.  Because it is somewhat heavy operation and
      it isn't needed in common case, we want to avoid rechecking as much as
      possible.  So this patch introduce new variable, nr_isolate_pageblock in
      struct zone to check if there is isolated pageblock.  With this, we can
      avoid to re-check migratetype in common case and do it only if there is
      isolated pageblock or migratetype is MIGRATE_ISOLATE.  This solve above
      mentioned problems.
      
      Changes from v3:
      Add one more check in free_one_page() that checks whether migratetype is
      MIGRATE_ISOLATE or not. Without this, abovementioned case 1 could happens.
      Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: NMinchan Kim <minchan@kernel.org>
      Acked-by: NMichal Nazarewicz <mina86@mina86.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Heesub Shin <heesub.shin@samsung.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Ritesh Harjani <ritesh.list@gmail.com>
      Cc: Gioh Kim <gioh.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ad53f92e
  8. 13 11月, 2014 1 次提交
  9. 12 11月, 2014 1 次提交
    • U
      PM / Domains: Fix initial default state of the need_restore flag · 67732cd3
      Ulf Hansson 提交于
      The initial state of the device's need_restore flag should'nt depend on
      the current state of the PM domain. For example it should be perfectly
      valid to attach an inactive device to a powered PM domain.
      
      The pm_genpd_dev_need_restore() API allow us to update the need_restore
      flag to somewhat cope with such scenarios. Typically that should have
      been done from drivers/buses ->probe() since it's those that put the
      requirements on the value of the need_restore flag.
      
      Until recently, the Exynos SOCs were the only user of the
      pm_genpd_dev_need_restore() API, though invoking it from a centralized
      location while adding devices to their PM domains.
      
      Due to that Exynos now have swithed to the generic OF-based PM domain
      look-up, it's no longer possible to invoke the API from a centralized
      location. The reason is because devices are now added to their PM
      domains during the probe sequence.
      
      Commit "ARM: exynos: Move to generic PM domain DT bindings"
      did the switch for Exynos to the generic OF-based PM domain look-up,
      but it also removed the call to pm_genpd_dev_need_restore(). This
      caused a regression for some of the Exynos drivers.
      
      To handle things more properly in the generic PM domain, let's change
      the default initial value of the need_restore flag to reflect that the
      state is unknown. As soon as some of the runtime PM callbacks gets
      invoked, update the initial value accordingly.
      
      Moreover, since the generic PM domain is verifying that all devices
      are both runtime PM enabled and suspended, using pm_runtime_suspended()
      while pm_genpd_poweroff() is invoked from the scheduled work, we can be
      sure of that the PM domain won't be powering off while having active
      devices.
      
      Do note that, the generic PM domain can still only know about active
      devices which has been activated through invoking its runtime PM resume
      callback. In other words, buses/drivers using pm_runtime_set_active()
      during ->probe() will still suffer from a race condition, potentially
      probing a device without having its PM domain being powered. That issue
      will have to be solved using a different approach.
      
      This a log from the boot regression for Exynos5, which is being fixed in
      this patch.
      
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 308 at ../drivers/clk/clk.c:851 clk_disable+0x24/0x30()
      Modules linked in:
      CPU: 0 PID: 308 Comm: kworker/0:1 Not tainted 3.18.0-rc3-00569-gbd9449f-dirty #10
      Workqueue: pm pm_runtime_work
      [<c0013c64>] (unwind_backtrace) from [<c0010dec>] (show_stack+0x10/0x14)
      [<c0010dec>] (show_stack) from [<c03ee4cc>] (dump_stack+0x70/0xbc)
      [<c03ee4cc>] (dump_stack) from [<c0020d34>] (warn_slowpath_common+0x64/0x88)
      [<c0020d34>] (warn_slowpath_common) from [<c0020d74>] (warn_slowpath_null+0x1c/0x24)
      [<c0020d74>] (warn_slowpath_null) from [<c03107b0>] (clk_disable+0x24/0x30)
      [<c03107b0>] (clk_disable) from [<c02cc834>] (gsc_runtime_suspend+0x128/0x160)
      [<c02cc834>] (gsc_runtime_suspend) from [<c0249024>] (pm_generic_runtime_suspend+0x2c/0x38)
      [<c0249024>] (pm_generic_runtime_suspend) from [<c024f44c>] (pm_genpd_default_save_state+0x2c/0x8c)
      [<c024f44c>] (pm_genpd_default_save_state) from [<c024ff2c>] (pm_genpd_poweroff+0x224/0x3ec)
      [<c024ff2c>] (pm_genpd_poweroff) from [<c02501b4>] (pm_genpd_runtime_suspend+0x9c/0xcc)
      [<c02501b4>] (pm_genpd_runtime_suspend) from [<c024a4f8>] (__rpm_callback+0x2c/0x60)
      [<c024a4f8>] (__rpm_callback) from [<c024a54c>] (rpm_callback+0x20/0x74)
      [<c024a54c>] (rpm_callback) from [<c024a930>] (rpm_suspend+0xd4/0x43c)
      [<c024a930>] (rpm_suspend) from [<c024bbcc>] (pm_runtime_work+0x80/0x90)
      [<c024bbcc>] (pm_runtime_work) from [<c0032a9c>] (process_one_work+0x12c/0x314)
      [<c0032a9c>] (process_one_work) from [<c0032cf4>] (worker_thread+0x3c/0x4b0)
      [<c0032cf4>] (worker_thread) from [<c003747c>] (kthread+0xcc/0xe8)
      [<c003747c>] (kthread) from [<c000e738>] (ret_from_fork+0x14/0x3c)
      ---[ end trace 40cd58bcd6988f12 ]---
      
      Fixes: a4a8c2c4 (ARM: exynos: Move to generic PM domain DT bindings)
      Reported-and-tested0by: Sylwester Nawrocki <s.nawrocki@samsung.com>
      Reviewed-by: NSylwester Nawrocki <s.nawrocki@samsung.com>
      Reviewed-by: NKevin Hilman <khilman@linaro.org>
      Signed-off-by: NUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      67732cd3
  10. 11 11月, 2014 1 次提交
    • R
      tracing: Do not busy wait in buffer splice · e30f53aa
      Rabin Vincent 提交于
      On a !PREEMPT kernel, attempting to use trace-cmd results in a soft
      lockup:
      
       # trace-cmd record -e raw_syscalls:* -F false
       NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [trace-cmd:61]
       ...
       Call Trace:
        [<ffffffff8105b580>] ? __wake_up_common+0x90/0x90
        [<ffffffff81092e25>] wait_on_pipe+0x35/0x40
        [<ffffffff810936e3>] tracing_buffers_splice_read+0x2e3/0x3c0
        [<ffffffff81093300>] ? tracing_stats_read+0x2a0/0x2a0
        [<ffffffff812d10ab>] ? _raw_spin_unlock+0x2b/0x40
        [<ffffffff810dc87b>] ? do_read_fault+0x21b/0x290
        [<ffffffff810de56a>] ? handle_mm_fault+0x2ba/0xbd0
        [<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
        [<ffffffff810951e2>] ? trace_buffer_lock_reserve+0x22/0x60
        [<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
        [<ffffffff8112415d>] do_splice_to+0x6d/0x90
        [<ffffffff81126971>] SyS_splice+0x7c1/0x800
        [<ffffffff812d1edd>] tracesys_phase2+0xd3/0xd8
      
      The problem is this: tracing_buffers_splice_read() calls
      ring_buffer_wait() to wait for data in the ring buffers.  The buffers
      are not empty so ring_buffer_wait() returns immediately.  But
      tracing_buffers_splice_read() calls ring_buffer_read_page() with full=1,
      meaning it only wants to read a full page.  When the full page is not
      available, tracing_buffers_splice_read() tries to wait again with
      ring_buffer_wait(), which again returns immediately, and so on.
      
      Fix this by adding a "full" argument to ring_buffer_wait() which will
      make ring_buffer_wait() wait until the writer has left the reader's
      page, i.e.  until full-page reads will succeed.
      
      Link: http://lkml.kernel.org/r/1415645194-25379-1-git-send-email-rabin@rab.in
      
      Cc: stable@vger.kernel.org # 3.16+
      Fixes: b1169cc6 ("tracing: Remove mock up poll wait function")
      Signed-off-by: NRabin Vincent <rabin@rab.in>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      e30f53aa
  11. 10 11月, 2014 1 次提交
    • K
      mfd: max77693: Fix always masked MUIC interrupts · c0acb814
      Krzysztof Kozlowski 提交于
      All interrupts coming from MUIC were ignored because interrupt source
      register was masked.
      
      The Maxim 77693 has a "interrupt source" - a separate register and interrupts
      which give information about PMIC block triggering the individual
      interrupt (charger, topsys, MUIC, flash LED).
      
      By default bootloader could initialize this register to "mask all"
      value. In such case (observed on Trats2 board) MUIC interrupts won't be
      generated regardless of their mask status. Regmap irq chip was unmasking
      individual MUIC interrupts but the source was masked
      
      Before introducing regmap irq chip this interrupt source was unmasked,
      read and acked. Reading and acking is not necessary but unmasking is.
      
      Fixes: 342d669c ("mfd: max77693: Handle IRQs using regmap")
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Reviewed-by: NChanwoo Choi <cw00.choi@samsung.com>
      Signed-off-by: NLee Jones <lee.jones@linaro.org>
      c0acb814
  12. 08 11月, 2014 1 次提交
  13. 06 11月, 2014 2 次提交
  14. 04 11月, 2014 1 次提交
    • G
      of: Fix overflow bug in string property parsing functions · a87fa1d8
      Grant Likely 提交于
      The string property read helpers will run off the end of the buffer if
      it is handed a malformed string property. Rework the parsers to make
      sure that doesn't happen. At the same time add new test cases to make
      sure the functions behave themselves.
      
      The original implementations of of_property_read_string_index() and
      of_property_count_strings() both open-coded the same block of parsing
      code, each with it's own subtly different bugs. The fix here merges
      functions into a single helper and makes the original functions static
      inline wrappers around the helper.
      
      One non-bugfix aspect of this patch is the addition of a new wrapper,
      of_property_read_string_array(). The new wrapper is needed by the
      device_properties feature that Rafael is working on and planning to
      merge for v3.19. The implementation is identical both with and without
      the new static inline wrapper, so it just got left in to reduce the
      churn on the header file.
      Signed-off-by: NGrant Likely <grant.likely@linaro.org>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Darren Hart <darren.hart@intel.com>
      Cc: <stable@vger.kernel.org>  # v3.3+: Drop selftest hunks that don't apply
      a87fa1d8
  15. 31 10月, 2014 2 次提交
  16. 30 10月, 2014 4 次提交
    • J
      mm: memcontrol: fix missed end-writeback page accounting · d7365e78
      Johannes Weiner 提交于
      Commit 0a31bc97 ("mm: memcontrol: rewrite uncharge API") changed
      page migration to uncharge the old page right away.  The page is locked,
      unmapped, truncated, and off the LRU, but it could race with writeback
      ending, which then doesn't unaccount the page properly:
      
      test_clear_page_writeback()              migration
                                                 wait_on_page_writeback()
        TestClearPageWriteback()
                                                 mem_cgroup_migrate()
                                                   clear PCG_USED
        mem_cgroup_update_page_stat()
          if (PageCgroupUsed(pc))
            decrease memcg pages under writeback
      
        release pc->mem_cgroup->move_lock
      
      The per-page statistics interface is heavily optimized to avoid a
      function call and a lookup_page_cgroup() in the file unmap fast path,
      which means it doesn't verify whether a page is still charged before
      clearing PageWriteback() and it has to do it in the stat update later.
      
      Rework it so that it looks up the page's memcg once at the beginning of
      the transaction and then uses it throughout.  The charge will be
      verified before clearing PageWriteback() and migration can't uncharge
      the page as long as that is still set.  The RCU lock will protect the
      memcg past uncharge.
      
      As far as losing the optimization goes, the following test results are
      from a microbenchmark that maps, faults, and unmaps a 4GB sparse file
      three times in a nested fashion, so that there are two negative passes
      that don't account but still go through the new transaction overhead.
      There is no actual difference:
      
       old:     33.195102545 seconds time elapsed       ( +-  0.01% )
       new:     33.199231369 seconds time elapsed       ( +-  0.03% )
      
      The time spent in page_remove_rmap()'s callees still adds up to the
      same, but the time spent in the function itself seems reduced:
      
           # Children      Self  Command        Shared Object       Symbol
       old:     0.12%     0.11%  filemapstress  [kernel.kallsyms]   [k] page_remove_rmap
       new:     0.12%     0.08%  filemapstress  [kernel.kallsyms]   [k] page_remove_rmap
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: <stable@vger.kernel.org>	[3.17.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d7365e78
    • J
      mm: page-writeback: inline account_page_dirtied() into single caller · 3a3c02ec
      Johannes Weiner 提交于
      A follow-up patch would have changed the call signature.  To save the
      trouble, just fold it instead.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.cz>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: <stable@vger.kernel.org>	[3.17.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3a3c02ec
    • D
      mm, thp: fix collapsing of hugepages on madvise · 6d50e60c
      David Rientjes 提交于
      If an anonymous mapping is not allowed to fault thp memory and then
      madvise(MADV_HUGEPAGE) is used after fault, khugepaged will never
      collapse this memory into thp memory.
      
      This occurs because the madvise(2) handler for thp, hugepage_madvise(),
      clears VM_NOHUGEPAGE on the stack and it isn't stored in vma->vm_flags
      until the final action of madvise_behavior().  This causes the
      khugepaged_enter_vma_merge() to be a no-op in hugepage_madvise() when
      the vma had previously had VM_NOHUGEPAGE set.
      
      Fix this by passing the correct vma flags to the khugepaged mm slot
      handler.  There's no chance khugepaged can run on this vma until after
      madvise_behavior() returns since we hold mm->mmap_sem.
      
      It would be possible to clear VM_NOHUGEPAGE directly from vma->vm_flags
      in hugepage_advise(), but I didn't want to introduce special case
      behavior into madvise_behavior().  I think it's best to just let it
      always set vma->vm_flags itself.
      Signed-off-by: NDavid Rientjes <rientjes@google.com>
      Reported-by: NSuleiman Souhlal <suleiman@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6d50e60c
    • M
      drivers: of: add return value to of_reserved_mem_device_init() · 47f29df7
      Marek Szyprowski 提交于
      Driver calling of_reserved_mem_device_init() might be interested if the
      initialization has been successful or not, so add support for returning
      error code.
      
      This fixes a build warining caused by commit 7bfa5ab6 ("drivers:
      dma-coherent: add initialization from device tree"), which has been
      merged without this change and without fixing function return value.
      
      Fixes: 7bfa5ab6 ("drivers: dma-coherent: add initialization from device tree")
      Signed-off-by: NMarek Szyprowski <m.szyprowski@samsung.com>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Josh Cartwright <joshc@codeaurora.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      47f29df7
  17. 29 10月, 2014 4 次提交