- 07 Jul 2017, 9 commits
-
-
Committed by Michal Hocko

Commit 20b2f52b ("numa: add CONFIG_MOVABLE_NODE for movable-dedicated node") has introduced CONFIG_MOVABLE_NODE without a good explanation of why it is actually useful. It makes a lot of sense to make the movable node semantic opt-in, but we already have that because the feature has to be explicitly enabled on the kernel command line. A config option on top only makes the configuration space larger without a good reason. It also adds additional ifdefery that pollutes the code.

Just drop the config option and make it de facto always enabled. This shouldn't introduce any change to the semantic.

Link: http://lkml.kernel.org/r/20170529114141.536-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Kani Toshimitsu <toshi.kani@hpe.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Johannes Weiner

Patch series "mm: per-lruvec slab stats"

Josef is working on a new approach to balancing slab caches and the page cache. For this to work, he needs slab cache statistics on the lruvec level. These patches implement that by adding infrastructure that allows updating and reading generic VM stat items per lruvec, then switch some existing VM accounting sites, including the slab accounting ones, to this new cgroup-aware API.

I'll follow up with more patches on this, because there is actually substantial simplification that can be done to the memory controller when we replace private memcg accounting with making the existing VM accounting sites cgroup-aware. But this is enough for Josef to base his slab reclaim work on, so here goes.

This patch (of 5):

To re-implement slab cache vs. page cache balancing, we'll need the slab counters at the lruvec level, which, ever since lru reclaim was moved from the zone to the node, is the intersection of the node, not the zone, and the memcg.

We could retain the per-zone counters for when the page allocator dumps its memory information on failures, and have counters on both levels - which on all but NUMA node 0 is usually redundant. But let's keep it simple for now and just move them. If anybody complains we can restore the per-zone counters.

[hannes@cmpxchg.org: fix oops]
Link: http://lkml.kernel.org/r/20170605183511.GA8915@cmpxchg.org
Link: http://lkml.kernel.org/r/20170530181724.27197-3-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
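A simplified sketch of the kind of cgroup-aware stat updater this infrastructure enables; the per-node memcg structure and field names here are illustrative, not necessarily the final API:

```c
/*
 * Sketch only: update a VM stat item at the lruvec level, i.e. the
 * intersection of a NUMA node and a memcg. Field names are illustrative.
 */
static inline void __mod_lruvec_state(struct lruvec *lruvec,
				      enum node_stat_item idx, int val)
{
	struct mem_cgroup_per_node *pn;

	/* The node-wide counter is updated unconditionally... */
	__mod_node_page_state(lruvec_pgdat(lruvec), idx, val);

	if (mem_cgroup_disabled())
		return;

	/* ...and the owning memcg sees the same delta. */
	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
	__this_cpu_add(pn->lruvec_stat->count[idx], val);
}
```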
-
Committed by Michal Hocko

Heiko Carstens has noticed that he can generate overlapping zones for ZONE_DMA and ZONE_NORMAL:

DMA      [mem 0x0000000000000000-0x000000007fffffff]
Normal   [mem 0x0000000080000000-0x000000017fffffff]

$ cat /sys/devices/system/memory/block_size_bytes
10000000
$ cat /sys/devices/system/memory/memory5/valid_zones
DMA
$ echo 0 > /sys/devices/system/memory/memory5/online
$ cat /sys/devices/system/memory/memory5/valid_zones
Normal
$ echo 1 > /sys/devices/system/memory/memory5/online
Normal

$ cat /proc/zoneinfo
Node 0, zone      DMA
  spanned  524288 <-----
  present  458752
  managed  455078
  start_pfn: 0 <-----
Node 0, zone   Normal
  spanned  720896
  present  589824
  managed  571648
  start_pfn: 327680 <-----

The reason is that we assume that the default zone for kernel onlining is ZONE_NORMAL. This was a simplification introduced by the memory hotplug rework and it is easily fixable by checking the range overlap in the zone order and considering the first matching zone as the default one. If there is no such zone then assume ZONE_NORMAL as we have been doing so far.

Fixes: "mm, memory_hotplug: do not associate hotadded memory to zones until online"
Link: http://lkml.kernel.org/r/20170601083746.4924-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
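A sketch of the described fix: walk the kernel zones in order, take the first one that overlaps the onlined range, and fall back to ZONE_NORMAL. The helper name is illustrative; it relies on a zone_intersects()-style overlap check like the one shown under the next commit:

```c
static struct zone *default_zone_for_pfn_sketch(int nid,
						unsigned long start_pfn,
						unsigned long nr_pages)
{
	struct pglist_data *pgdat = NODE_DATA(nid);
	int zid;

	/* Walk the kernel zones in order; first overlapping zone wins. */
	for (zid = 0; zid <= ZONE_NORMAL; zid++) {
		struct zone *zone = &pgdat->node_zones[zid];

		if (zone_intersects(zone, start_pfn, nr_pages))
			return zone;
	}

	/* No overlap: keep the old default. */
	return &pgdat->node_zones[ZONE_NORMAL];
}
```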
-
Committed by Michal Hocko

The current memory hotplug implementation relies on having all the struct pages associated with a zone/node during the physical hotplug phase (arch_add_memory->__add_pages->__add_section->__add_zone). In the vast majority of cases this means that they are added to ZONE_NORMAL. This has been so since 9d99aaa3 ("[PATCH] x86_64: Support memory hotadd without sparsemem") and it wasn't a big deal back then because movable onlining didn't exist yet.

Much later memory hotplug wanted to (ab)use ZONE_MOVABLE for movable onlining 511c2aba ("mm, memory-hotplug: dynamic configure movable memory and portion memory") and then things got more complicated. Rather than reconsidering the zone association, which was no longer needed (because the memory hotplug already depended on SPARSEMEM), a convoluted semantic of zone shifting has been developed. Only the currently last memblock or the one adjacent to the zone_movable can be onlined movable. This essentially means that the online type changes as the new memblocks are added.

Let's simulate memory hot online manually:

$ echo 0x100000000 > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory32/valid_zones
Normal Movable

$ echo $((0x100000000+(128<<20))) > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable

$ echo $((0x100000000+2*(128<<20))) > /sys/devices/system/memory/probe
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal
/sys/devices/system/memory/memory34/valid_zones:Normal Movable

$ echo online_movable > /sys/devices/system/memory/memory34/state
$ grep . /sys/devices/system/memory/memory3?/valid_zones
/sys/devices/system/memory/memory32/valid_zones:Normal
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable Normal

This is an awkward semantic because a udev event is sent as soon as the block is onlined and a udev handler might want to online it based on some policy (e.g. association with a node) but it will inherently race with new blocks showing up.

This patch changes the physical online phase to not associate pages with any zone at all. All the pages are just marked reserved and wait for the onlining phase to be associated with the zone as per the online request. There are only two requirements:

- existing ZONE_NORMAL and ZONE_MOVABLE cannot overlap
- ZONE_NORMAL precedes ZONE_MOVABLE in physical addresses

The latter one is not an inherent requirement and can be changed in the future. It preserves the current behavior and makes the code slightly simpler. This is subject to change in the future.

This means that the same physical online steps as above will lead to the following state:

Normal Movable

/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable

/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Normal Movable

/sys/devices/system/memory/memory32/valid_zones:Normal Movable
/sys/devices/system/memory/memory33/valid_zones:Normal Movable
/sys/devices/system/memory/memory34/valid_zones:Movable

Implementation: the current move_pfn_range is reimplemented to check the above requirements (allow_online_pfn_range) and then update the respective zone (move_pfn_range_to_zone), the pgdat, and link all the pages in the pfn range with the zone/node. __add_pages is updated to not require the zone and only initializes sections in the range. This allowed simplifying the arch_add_memory code (s390 could get rid of quite some code).

devm_memremap_pages is the only user of arch_add_memory which relies on the zone association, because it hooks into memory hotplug only half way. It uses it to associate the new memory with ZONE_DEVICE but doesn't allow it to be {on,off}lined via sysfs. This means that this particular code path has to call move_pfn_range_to_zone explicitly.

The original zone shifting code is kept in place and will be removed in the follow-up patch for an easier review.

Please note that this patch also changes the original behavior: offlining a memory block adjacent to another zone (Normal vs. Movable) used to allow changing its movable type. This will be handled later.

[richard.weiyang@gmail.com: simplify zone_intersects()]
Link: http://lkml.kernel.org/r/20170616092335.5177-1-richard.weiyang@gmail.com
[richard.weiyang@gmail.com: remove duplicate call for set_page_links]
Link: http://lkml.kernel.org/r/20170616092335.5177-2-richard.weiyang@gmail.com
[akpm@linux-foundation.org: remove unused local `i']
Link: http://lkml.kernel.org/r/20170515085827.16474-12-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # For s390 bits
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Tobias Regnery <tobias.regnery@gmail.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
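The zone_intersects() simplification credited to Wei Yang above boils down to a standard interval-overlap test; roughly:

```c
/* Does [start_pfn, start_pfn + nr_pages) overlap a non-empty zone? */
static inline bool zone_intersects(struct zone *zone,
				   unsigned long start_pfn,
				   unsigned long nr_pages)
{
	if (zone_is_empty(zone))
		return false;
	if (start_pfn >= zone_end_pfn(zone) ||
	    start_pfn + nr_pages <= zone->zone_start_pfn)
		return false;

	return true;
}
```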
-
Committed by Michal Hocko

is_pageblock_removable_nolock() relies on having a zone association to examine all the page blocks to check whether they are movable or free. This just wastes cycles when the memblock is offline. A later patch in the series will also change the time when the page is associated with a zone, so let's bail out early if the memblock is offline.

Link: http://lkml.kernel.org/r/20170515085827.16474-7-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Tobias Regnery <tobias.regnery@gmail.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
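A sketch of the early bail-out in the sysfs "removable" handler; simplified from the description, details may differ from the final code:

```c
static ssize_t show_mem_removable(struct device *dev,
				  struct device_attribute *attr, char *buf)
{
	struct memory_block *mem = to_memory_block(dev);
	unsigned long pfn;
	int ret = 1, i;

	/* An offline block has no zone association: skip the expensive walk. */
	if (mem->state != MEM_ONLINE)
		goto out;

	for (i = 0; i < sections_per_block; i++) {
		pfn = section_nr_to_pfn(mem->start_section_nr + i);
		ret &= is_mem_section_removable(pfn, PAGES_PER_SECTION);
	}
out:
	return sprintf(buf, "%d\n", ret);
}
```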
-
Committed by Michal Hocko

Memory hotplug (add_memory_resource) has to reinitialize the node infrastructure if the node is offline (i.e. one which went through the complete add_memory(); remove_memory() cycle). That involves node registration to the kobj infrastructure (register_node), the proper association with cpus (register_cpu_under_node) and finally the creation of node<->memblock symlinks (link_mem_sections).

The last part requires knowing node_start_pfn and node_spanned_pages, which we currently have, but a later patch will postpone this initialization to the onlining phase, which happens later. In fact we do not need to rely on the early pgdat initialization even now, because the currently hot-added pfn range is known.

Split register_one_node into the core which does all the common work for the boot time NUMA initialization and the hotplug (__register_one_node). register_one_node keeps the full initialization while hotplug calls __register_one_node and manually calls link_mem_sections for the proper range.

This shouldn't introduce any functional change.

Link: http://lkml.kernel.org/r/20170515085827.16474-6-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Tobias Regnery <tobias.regnery@gmail.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
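A sketch of the resulting split: the boot path keeps the all-in-one behaviour while hotplug calls __register_one_node() plus link_mem_sections() for the hot-added range itself (error handling trimmed):

```c
static inline int register_one_node(int nid)
{
	int error = 0;

	if (node_online(nid)) {
		struct pglist_data *pgdat = NODE_DATA(nid);

		error = __register_one_node(nid);	/* kobj + cpu links */
		if (error)
			return error;

		/* boot time: the pgdat is fully initialized already */
		error = link_mem_sections(nid, pgdat->node_start_pfn,
					  pgdat->node_spanned_pages);
	}

	return error;
}
```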
-
Committed by Michal Hocko

Device memory hotplug hooks into regular memory hotplug only half way. It needs memory sections to track struct pages, but there is no need/desire to associate those sections with memory blocks and export them to userspace via sysfs because they cannot be onlined anyway. This is currently expressed by the for_device argument to arch_add_memory, which then makes sure to associate the given memory range with ZONE_DEVICE. register_new_memory then relies on is_zone_device_section to distinguish special memory hotplug from the regular one. While this works now, later patches in this series want to move __add_zone outside of the arch_add_memory path, so we have to come up with something else.

Add want_memblock down the __add_pages path and use it to control whether the section->memblock association should be done. arch_add_memory then trivially wants memblock for everything but for_device hotplug.

remove_memory_section doesn't need is_zone_device_section either. We can simply skip all the memblock-specific cleanup if there is no memblock for the given section.

This shouldn't introduce any functional change.

Link: http://lkml.kernel.org/r/20170515085827.16474-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Tobias Regnery <tobias.regnery@gmail.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
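A sketch of the want_memblock plumbing (argument lists simplified; at this point in the series the zone argument still exists but is omitted here):

```c
int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
{
	unsigned long start_pfn = start >> PAGE_SHIFT;
	unsigned long nr_pages = size >> PAGE_SHIFT;

	/* ... arch-specific page table setup ... */

	/* Everything but device memory wants the memblock association. */
	return __add_pages(nid, start_pfn, nr_pages, !for_device);
}
```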
-
Committed by Michal Hocko

Commit c04fc586 ("mm: show node to memory section relationship with symlinks in sysfs") has added means to export the memblock<->node association into sysfs. It has also introduced get_nid_for_pfn, which is a rather confusing counterpart of pfn_to_nid that also checks whether the pfn page is already initialized (page_initialized). This is done by checking page::lru != NULL, which doesn't make any sense at all. Nothing in this path really relies on the lru list being used or initialized. Just remove it, because this will become a problem with later patches.

Thanks to Reza Arbab for testing which revealed this to be a problem (http://lkml.kernel.org/r/20170403202337.GA12482@dhcp22.suse.cz)

Link: http://lkml.kernel.org/r/20170515085827.16474-4-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Tobias Regnery <tobias.regnery@gmail.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
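After dropping the bogus check, get_nid_for_pfn() reduces to roughly this sketch:

```c
static int get_nid_for_pfn(unsigned long pfn)
{
	if (!pfn_valid_within(pfn))
		return -1;
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
	/* struct pages may not carry node info this early in boot */
	if (system_state == SYSTEM_BOOTING)
		return early_pfn_to_nid(pfn);
#endif
	return pfn_to_nid(pfn);
}
```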
-
Committed by Dave Hansen

There are a number of times that we loop over NR_MEM_SECTIONS, looking for section_present() on each section. But, when we have very large physical address spaces (large MAX_PHYSMEM_BITS), NR_MEM_SECTIONS becomes very large, making the loops quite long.

With MAX_PHYSMEM_BITS=46 and a section size of 128MB, the current loops are 512k iterations, which we barely notice on modern hardware. But, raising MAX_PHYSMEM_BITS higher (like we will see on systems that support 5-level paging) makes this 64x longer and we start to notice, especially on slower systems like simulators. A 10-second delay for 512k iterations is annoying. But, a 640-second delay is crippling.

This does not help if we have extremely sparse physical address spaces, but those are quite rare. We expect that most of the "slow" systems where this matters will also be quite small and non-sparse.

To fix this, we track the highest section we've ever encountered. This lets us know when we will *never* see another section_present(), and lets us break out of the loops earlier. Doing the whole for_each_present_section_nr() macro is probably overkill, but it will ensure that any future loop iterations that we grow are more likely to be correct.

Kirill said "It shaved almost 40 seconds from boot time in qemu with 5-level paging enabled for me".

Link: http://lkml.kernel.org/r/20170504174434.C45A4735@viggo.jf.intel.com
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
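A sketch of the approach: record the highest section ever marked present and use it as the loop bound; details may differ from the final code:

```c
static int __highest_present_section_nr;

static void section_mark_present(struct mem_section *ms)
{
	int section_nr = __section_nr(ms);

	if (section_nr > __highest_present_section_nr)
		__highest_present_section_nr = section_nr;

	ms->section_mem_map |= SECTION_MARKED_PRESENT;
}

/* Returns the next present section after 'section_nr', or -1 when done. */
static int next_present_section_nr(int section_nr)
{
	while (++section_nr <= __highest_present_section_nr)
		if (present_section_nr(section_nr))
			return section_nr;

	return -1;
}

#define for_each_present_section_nr(start, section_nr)			\
	for (section_nr = next_present_section_nr((start) - 1);	\
	     section_nr != -1;						\
	     section_nr = next_present_section_nr(section_nr))
```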
-
- 03 Jul 2017, 1 commit
-
-
Committed by Jan Kiszka

Aligns us with device_add_properties, the function we call.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
-
- 01 Jul 2017, 1 commit
-
-
Committed by Vladimir Murzin

Currently, the internals of dma_common_mmap() are compiled out if the build is done for either NOMMU or a target which explicitly says it does not have/want coherent DMA mmap. It turned out that dma_common_mmap() can be handy in a NOMMU setup (at least for ARM).

This patch converts the existing NOMMU targets to use ARCH_NO_COHERENT_DMA_MMAP, thus when CONFIG_MMU is gone from dma_common_mmap() their behaviour stays unchanged.

ARM is not converted to ARCH_NO_COHERENT_DMA_MMAP because it 1) already has an mmap callback which can handle (to some extent) NOMMU and 2) already defines a dummy pgprot_noncached() for NOMMU builds.

c6x and frv stay untouched since they already have ARCH_NO_COHERENT_DMA_MMAP.

Cc: Steven Miao <realmz6@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
-
- 29 Jun 2017, 7 commits
-
-
Committed by Krzysztof Kozlowski

Commit fc5cbf0c (PM / Domains: Support for multiple states) split some code out of the default_power_down_ok() function, so the documentation has to be moved to the appropriate place.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Krzysztof Kozlowski

of_genpd_remove_last() iterates over the list of domains and removes the matching element, thus it has to use the safe version of list iteration.

Fixes: 17926551 (PM / Domains: Add support for removing nested PM domains by provider)
Cc: 4.9+ <stable@vger.kernel.org> # 4.9+
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
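This commit and the two below fix the same bug class. A minimal generic illustration (a simplified stand-in, not the actual genpd code):

```c
#include <linux/list.h>
#include <linux/slab.h>

struct gpd_link {			/* simplified stand-in */
	struct list_head master_node;
	/* ... */
};

static void remove_matching_links(struct list_head *master_links,
				  bool (*match)(struct gpd_link *))
{
	struct gpd_link *link, *l;

	/*
	 * list_for_each_entry() would dereference 'link' after it has been
	 * freed; the _safe variant caches the next element in 'l' first.
	 */
	list_for_each_entry_safe(link, l, master_links, master_node) {
		if (match(link)) {
			list_del(&link->master_node);
			kfree(link);
		}
	}
}
```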
-
Committed by Krzysztof Kozlowski

of_genpd_del_provider() iterates over the list of domain providers and removes the matching element, thus it has to use the safe version of list iteration.

Fixes: aa42240a (PM / Domains: Add generic OF-based PM domain look-up)
Cc: 3.19+ <stable@vger.kernel.org> # 3.19+
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Krzysztof Kozlowski

pm_genpd_remove_subdomain() iterates over the domain's master_links list and removes the matching element, thus it has to use the safe version of list iteration.

Fixes: f721889f ("PM / Domains: Support for generic I/O PM domains (v8)")
Cc: 3.1+ <stable@vger.kernel.org> # 3.1+
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Krzysztof Kozlowski

genpd_syscore_switch() had two problems:

1. It silently assumed that the device it is being called for belongs to a generic power domain and used container_of() on its power domain pointer. Such an assumption might not always be true.

2. It iterated over the list of generic power domains without holding the gpd_list_lock mutex, thus the list could have been modified at the same time.

Usage of genpd_lookup_dev() solves both problems, as it is a safe call for non-generic power domains and uses the mutex when iterating.

Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Mikko Perttunen

Currently genpd installs its own noirq callbacks, but never calls down to the driver's corresponding callbacks. Add these calls.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Michael Grzeschik

Some irq controllers have write-only/multipurpose register layouts. In those cases we read invalid data back. Here we add the option mask_writeonly as a masking option.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
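A hypothetical usage sketch; the FOO_* registers and irq table are illustrative, and only the mask_writeonly flag is the new option added here:

```c
static const struct regmap_irq foo_irqs[] = {
	REGMAP_IRQ_REG(0, 0, BIT(0)),		/* illustrative irq */
};

static const struct regmap_irq_chip foo_irq_chip = {
	.name		= "foo",
	.status_base	= FOO_IRQ_STATUS,	/* illustrative registers */
	.mask_base	= FOO_IRQ_MASK,
	.mask_writeonly	= true,	/* mask register cannot be read back */
	.num_regs	= 1,
	.irqs		= foo_irqs,
	.num_irqs	= ARRAY_SIZE(foo_irqs),
};
```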
-
- 28 Jun 2017, 9 commits
-
-
Committed by Vladimir Murzin

This patch introduces a default coherent DMA pool, similar to the default CMA area concept. To keep other users safe, the code is kept under CONFIG_ARM.

Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Committed by Vladimir Murzin

dma_declare_coherent_memory() and friends are designed to account for the difference between CPU and device addresses. However, when they are used with reserved memory regions there is an assumption that CPU and device have the same view of the address space. This assumption becomes invalid when the reserved memory for coherent DMA allocations is referenced by a device with a non-empty "dma-ranges" property.

Simply feeding the device address as rmem->base + dev->dma_pfn_offset would not work, because the reserved memory region can be shared. So this patch expresses the device address with the help of the CPU address and the device's dma_pfn_offset in case the memory reservation has been done via device tree; non-device-tree users continue to use the old scheme.

Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Roger Quadros <rogerq@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
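The address fix-up described above amounts to subtracting the device's pfn offset from the CPU-visible base; a sketch (the helper name follows the description, not necessarily the tree):

```c
static dma_addr_t dma_get_device_base(struct device *dev,
				      struct reserved_mem *rmem)
{
	/* Device-visible base = CPU-visible base minus the pfn offset. */
	if (dev && dev->dma_pfn_offset)
		return rmem->base -
			((dma_addr_t)dev->dma_pfn_offset << PAGE_SHIFT);

	return rmem->base;
}
```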
-
Committed by Christoph Hellwig

dmam_alloc_noncoherent is a trivial wrapper around dmam_alloc_attrs that hardcodes one particular flag. Make the devres code more flexible by allowing the callers to pass arbitrary flags.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
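A usage sketch of the resulting interface; foo_probe() and the chosen attribute are illustrative:

```c
static int foo_probe(struct device *dev)
{
	dma_addr_t dma_handle;
	void *cpu_addr;

	/* Any DMA attribute can now be passed through the managed API. */
	cpu_addr = dmam_alloc_attrs(dev, SZ_4K, &dma_handle, GFP_KERNEL,
				    DMA_ATTR_NON_CONSISTENT);
	if (!cpu_addr)
		return -ENOMEM;

	/* Released automatically on driver detach, like other devres resources. */
	return 0;
}
```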
-
Committed by Christoph Hellwig

This function was never used since it was added.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
-
Committed by Arvind Yadav

File size before:
   text    data     bss     dec     hex  filename
   3890    1152       8    5050    13ba  drivers/base/power/sysfs.o

File size after adding 'const':
   text    data     bss     dec     hex  filename
   4250     800       8    5058    13c2  drivers/base/power/sysfs.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Krzysztof Kozlowski

Local instances of struct attribute_group are not modified, so they can be made const to increase code safety.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Krzysztof Kozlowski

The 'info' string appearing in many places points to a .rodata string, so it should be passed as a pointer to const.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Krzysztof Kozlowski

pm_verb() returns a pointer to a string in .rodata, so it should be marked as const.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Krzysztof Kozlowski

Mark pointers to struct generic_pm_domain const (whether passed as an argument or used locally in a function) whenever they are not modified by the function itself.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 27 Jun 2017, 1 commit
-
-
Committed by Thomas Gleixner

The wakeirq infrastructure uses RCU to protect the list of wakeirqs. That breaks the irq bus locking infrastructure, which allows sleeping functions to be called so that interrupt controllers behind slow busses, e.g. i2c, can be handled.

The wakeirq functions hold rcu_read_lock and call into irq functions, which in the case of interrupts using the irq bus locking will trigger a might_sleep() splat.

Convert the wakeirq infrastructure to Sleepable RCU and unbreak it.

Fixes: 4990d4fe (PM / Wakeirq: Add automated device wake IRQ handling)
Reported-by: Brian Norris <briannorris@chromium.org>
Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Brian Norris <briannorris@chromium.org>
Cc: 4.2+ <stable@vger.kernel.org> # 4.2+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
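A sketch of the conversion; SRCU provides a sleepable read side, so the callees may take sleeping irq bus locks (simplified):

```c
DEFINE_STATIC_SRCU(wakeup_srcu);

static void wakeup_walk_sketch(void)
{
	int idx;

	idx = srcu_read_lock(&wakeup_srcu);	/* was: rcu_read_lock() */
	/* ... iterate the wakeirq list; callees may now sleep ... */
	srcu_read_unlock(&wakeup_srcu, idx);	/* was: rcu_read_unlock() */
}
```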
-
- 24 Jun 2017, 1 commit
-
-
Committed by Viresh Kumar

In order to support OPP switching, the OPP layer needs to get a pointer to the clock for the device. Simple cases work fine without using the routines added by this patch (i.e. by passing the connection-id as NULL), but for a device with multiple clocks available, the OPP core needs to know the exact name of the clk to use. Add a new set of APIs to get that done.

Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
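A hypothetical usage sketch, assuming the new APIs follow the OPP core's usual set/put pattern and accept a clock connection-id ("mclk" and foo_opp_init() are illustrative):

```c
static int foo_opp_init(struct device *dev)
{
	struct opp_table *opp_table;

	/* Pick the "mclk" clock of a multi-clock device for OPP scaling. */
	opp_table = dev_pm_opp_set_clkname(dev, "mclk");
	if (IS_ERR(opp_table))
		return PTR_ERR(opp_table);

	/* ... later, on teardown ... */
	dev_pm_opp_put_clkname(opp_table);
	return 0;
}
```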
-
- 23 Jun 2017, 1 commit
-
-
Committed by Marc Zyngier

Now that we have irq_domain_update_bus_token(), switch everyone over to it. The debugfs code thanks you for your continued support.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 22 Jun 2017, 5 commits
-
-
Committed by Viresh Kumar

We create a "supply-0" debugfs directory even if the device doesn't do voltage scaling. That looks confusing, as if the regulator was found but we never managed to get voltage levels for it. Avoid creating such a directory unnecessarily.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Viresh Kumar

If dev_pm_opp_set_regulators() is called for a device and its regulators are set in the OPP core, the OPP nodes for the device must contain the "opp-microvolt" property; otherwise there is something wrong and we'd better error out.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Viresh Kumar

This code was required while the OPP core was managed with the help of RCU, but not anymore. Get rid of the unnecessary alloc/memcpy operations.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Viresh Kumar

The code here was overly complicated because of the limitations that we had with RCU (we couldn't use the opp-table and OPPs outside of an RCU-protected section and couldn't call sleepable routines from within one). But that is long gone now.

Reorganize _generic_set_opp_regulator() in order to avoid using "struct dev_pm_set_opp_data" and copying data into it for the case where opp_table->set_opp is not set.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Viresh Kumar

The pm_domain_data (pdd) pointer is set from genpd_alloc_dev_data() and pdd->dev is guaranteed to be valid. There is no need to check pdd and pdd->dev in the rest of the code, as pdd->dev will always be valid for a non-NULL pdd pointer.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 15 Jun 2017, 2 commits
-
-
Committed by Rafael J. Wysocki

The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ during suspend-to-idle transitions and, consequently, any events signaled through it wake up the system from that state. However, on some systems some of the events signaled via the ACPI SCI while suspended to idle should not cause the system to wake up. In fact, quite often they should just be discarded.

Arguably, systems should not resume entirely on such events, but in order to decide which events really should cause the system to resume and which are spurious, it is necessary to resume up to the point when ACPI SCIs are actually handled and processed, which is after executing dpm_resume_noirq() in the system resume path.

For this reason, add a loop around freeze_enter() in which the platforms can process events signaled via multiplexed IRQ lines like the ACPI SCI, and add suspend-to-idle hooks that can be used for this purpose to struct platform_freeze_ops.

In the ACPI case, the ->wake hook is used for checking if the SCI has triggered while suspended and deferring the interrupt-induced system wakeup until the events signaled through it are actually processed sufficiently to decide whether or not the system should resume. In turn, the ->sync hook allows all of the relevant event queues to be flushed so as to prevent events from being missed due to race conditions.

In addition to that, some ACPI code processing wakeup events needs to be modified to use the "hard" version of wakeup triggers, so that it will cause a system resume to happen on device-induced wakeup events even if the "soft" mechanism to prevent the system from suspending is not enabled. However, to preserve the existing behavior with respect to suspend-to-RAM, this is only done in the suspend-to-idle case and only if an SCI has occurred while suspended.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
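A sketch of the loop described above; the structure follows the text and the details are simplified (freeze_ops here stands for the platform hook table):

```c
static void s2idle_loop_sketch(void)
{
	do {
		freeze_enter();			/* idle until an IRQ fires */

		if (freeze_ops && freeze_ops->wake)
			freeze_ops->wake();	/* e.g. ACPI: did the SCI trigger? */

		dpm_resume_noirq(PMSG_RESUME);	/* let the SCI handlers run */

		if (freeze_ops && freeze_ops->sync)
			freeze_ops->sync();	/* flush wakeup event queues */

		if (pm_wakeup_pending())
			break;			/* genuine wakeup: resume fully */

		/* spurious event: quiesce devices again and go back to sleep */
	} while (!dpm_suspend_noirq(PMSG_SUSPEND));
}
```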
-
Committed by Rafael J. Wysocki

Avoid printing the device suspend/resume timing information if CONFIG_PM_DEBUG is not set, to reduce the log noise level.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- 13 Jun 2017, 3 commits
-
-
Committed by Thierry Reding

Allow generic power domain providers to override the ->xlate() callback in case the default genpd_xlate_onecell() translation callback is not good enough. One potential use-case for this is to allow generic power domains to be specified by an ID rather than an index.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Johan Hovold

Commit ab78029e ("drivers/pinctrl: grab default handles from device core") added automatic pin-control management to the driver core by looking up and setting any default pinctrl state found in the device tree while a device is being probed. This obviously runs into problems as soon as device-tree nodes are reused for child devices which are later also probed, as the pins would already have been claimed by the ancestor device.

For example, if a USB host controller claims a pin, its root hub would consequently fail to probe when its device-tree node is set to the node of the controller:

pinctrl-single 48002030.pinmux: pin PIN204 already requested by 48064800.ehci; cannot claim for usb1
pinctrl-single 48002030.pinmux: pin-204 (usb1) status -22
pinctrl-single 48002030.pinmux: could not request pin 204 (PIN204) from group usb_dbg_pins on device pinctrl-single
usb usb1: Error applying setting, reverse things back
usb: probe of usb1 failed with error -22

Fix this by checking the new of_node_reused flag and skipping automatic pinctrl configuration during probe if set.

Note that the flag is checked in the driver core rather than in pinctrl (e.g. in pinctrl_dt_to_map()), which would specifically have prevented intentional use of a parent's pinctrl properties by a child device (should such a need ever arise).

Fixes: ab78029e ("drivers/pinctrl: grab default handles from device core")
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Johan Hovold

Add a helper function to be used when reusing the device-tree node of another device. It is fairly common for drivers to reuse the device-tree node of a parent (or other ancestor) device when creating class or bus devices (e.g. gpio chips, i2c adapters, iio chips, spi masters, serdev, phys, usb root hubs).

But reusing a device-tree node may cause problems if the new device is later probed, as for example the driver core would currently attempt to reinitialise an already active associated pinmux configuration. Other potential issues include the platform-bus code unconditionally dropping the device-tree node reference in its device destructor, reinitialisation of other bus-managed resources such as clocks, and the recently added DMA setup in the driver core.

Note that for most examples above this is currently not an issue, as the devices are never probed, but it is a problem for the USB bus, which has recently gained device-tree support. This was discovered and worked around in a rather ad-hoc fashion by commit dc5878ab ("usb: core: move root hub's device node assignment after it is added to bus") by not setting the of_node pointer until after the root-hub device has been registered.

Instead we can allow devices to reuse a device-tree node by setting a flag in their struct device that can be used by core, bus and driver code to avoid resources from being over-allocated.

Note that the helper also grabs an extra reference to the device node, which specifically balances the unconditional put in the platform-device destructor.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
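A sketch of the helper and of the consumer check from the previous commit (simplified from the descriptions):

```c
void device_set_of_node_from_dev(struct device *dev, const struct device *dev2)
{
	of_node_put(dev->of_node);		   /* drop any previous node */
	dev->of_node = of_node_get(dev2->of_node); /* extra ref balances the
						      platform-bus destructor */
	dev->of_node_reused = true;
}

/* ...and in the driver core's pinctrl bind path (previous commit): */
int pinctrl_bind_pins(struct device *dev)
{
	if (dev->of_node_reused)
		return 0;	/* the node's owner already claimed the pins */

	/* ... normal default-state pinctrl setup ... */
	return 0;
}
```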
-