- 27 2月, 2016 1 次提交
-
-
由 Paul Mackerras 提交于
No code changes. Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 22 2月, 2016 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
We can get a hash pte fault with 4k base page size and find the pte already inserted with 64K base page size. In that case we need to clear the existing slot information from the old pte. Fix this correctly With THP, we also clear the slot information with respect to all the 64K hash pte mapping that 16MB page. They are all invalid now. This make sure we don't find the slot valid when we fault with 4k base page size. Finding the slot valid should not result in any wrong behavior because we do check again in hash page table for the validity. But we can avoid that check completely. Fixes: a43c0eb8 ("powerpc/mm: Convert 4k hash insert to C") Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 15 2月, 2016 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
With ppc64 we use the deposited pgtable_t to store the hash pte slot information. We should not withdraw the deposited pgtable_t without marking the pmd none. This ensure that low level hash fault handling will skip this huge pte and we will handle them at upper levels. Recent change to pmd splitting changed the above in order to handle the race between pmd split and exit_mmap. The race is explained below. Consider following race: CPU0 CPU1 shrink_page_list() add_to_swap() split_huge_page_to_list() __split_huge_pmd_locked() pmdp_huge_clear_flush_notify() // pmd_none() == true exit_mmap() unmap_vmas() zap_pmd_range() // no action on pmd since pmd_none() == true pmd_populate() As result the THP will not be freed. The leak is detected by check_mm(): BUG: Bad rss-counter state mm:ffff880058d2e580 idx:1 val:512 The above required us to not mark pmd none during a pmd split. The fix for ppc is to clear the huge pte of _PAGE_USER, so that low level fault handling code skip this pte. At higher level we do take ptl lock. That should serialze us against the pmd split. Once the lock is acquired we do check the pmd again using pmd_same. That should always return false for us and hence we should retry the access. We do the pmd_same check in all case after taking plt with THP (do_huge_pmd_wp_page, do_huge_pmd_numa_page and huge_pmd_set_accessed) Also make sure we wait for irq disable section in other cpus to finish before flipping a huge pte entry with a regular pmd entry. Code paths like find_linux_pte_or_hugepte depend on irq disable to get a stable pte_t pointer. A parallel thp split need to make sure we don't convert a pmd pte to a regular pmd entry without waiting for the irq disable section to finish. Fixes: eef1b3ba ("thp: implement split_huge_pmd()") Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 25 1月, 2016 1 次提交
-
-
由 Vasant Hegde 提交于
With commit 90a545e9 (restrict /dev/mem to idle io memory ranges) mapping rtas_rmo_buf from user space is failing. Hence we are not able to make RTAS syscall. This patch calls page_is_rtas_user_buf before calling iomem_is_exclusive in devmem_is_allowed(). This will allow user space to map rtas_rmo_buf and we are able to make RTAS syscall. Reported-by: NBharata B Rao <bharata@linux.vnet.ibm.com> CC: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 16 1月, 2016 3 次提交
-
-
由 Kirill A. Shutemov 提交于
With new refcounting we don't need to mark PMDs splitting. Let's drop code to handle this. pmdp_splitting_flush() is not needed too: on splitting PMD we will do pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do IPI as needed for fast_gup. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
Tail page refcounting is utterly complicated and painful to support. It uses ->_mapcount on tail pages to store how many times this page is pinned. get_page() bumps ->_mapcount on tail page in addition to ->_count on head. This information is required by split_huge_page() to be able to distribute pins from head of compound page to tails during the split. We will need ->_mapcount to account PTE mappings of subpages of the compound page. We eliminate need in current meaning of ->_mapcount in tail pages by forbidding split entirely if the page is pinned. The only user of tail page refcounting is THP which is marked BROKEN for now. Let's drop all this mess. It makes get_page() and put_page() much simpler. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: NSasha Levin <sasha.levin@oracle.com> Tested-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NJerome Marchand <jmarchan@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kirill A. Shutemov 提交于
We are going to decouple splitting THP PMD from splitting underlying compound page. This patch renames split_huge_page_pmd*() functions to split_huge_pmd*() to reflect the fact that it doesn't imply page splitting, only PMD. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Tested-by: NSasha Levin <sasha.levin@oracle.com> Tested-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: NJerome Marchand <jmarchan@redhat.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 1月, 2016 1 次提交
-
-
由 Michael Ellerman 提交于
Commit 2fc251a8 ("powerpc: Copy only required pieces of the mm_context_t to the paca") broke the build for CONFIG_PPC_STD_MMU_64=y and CONFIG_PPC_MM_SLICES=n. That only happens for a kernel built with 4K pages and HUGETLB disabled, which is why we missed it. Fix it by adding a mm_ctx_user_psize member to the paca and populating it in the appropriate places. Fixes: 2fc251a8 ("powerpc: Copy only required pieces of the mm_context_t to the paca") Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 27 12月, 2015 1 次提交
-
-
由 Michael Neuling 提交于
Currently we copy the whole mm_context_t to the paca but only access a few bits of it. This is wasteful of space paca and also takes quite some time in the hot path of context switching. This patch pulls in only the required bits from the mm_context_t to the paca and on context switch, copies only those. Benchmarking this (On top of Anton's recent MSR context switching changes [1]) using processes and yield shows an improvement of almost 3% on POWER8: http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield --process 0 0 1. https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135700.htmlSigned-off-by: NMichael Neuling <mikey@neuling.org> [mpe: Rename paca fields to be mm_ctx_foo rather than context_foo] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 23 12月, 2015 1 次提交
-
-
由 Scott Wood 提交于
e6500 has threads but does not have TLB write conditional. Thus, the hugetlb code needs to take the same lock that the normal TLB miss handlers take, to ensure that the tlbsx and tlbwe are atomic. Signed-off-by: NScott Wood <scottwood@freescale.com>
-
- 19 12月, 2015 1 次提交
-
-
由 Michael Neuling 提交于
This adds a function to copy the mm->context to the paca. This is only a basic conversion for now but will be used more extensively in the next patch. This also adds #ifdef CONFIG_PPC_BOOK3S around this code since it's not used elsewhere. Signed-off-by: NMichael Neuling <mikey@neuling.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 14 12月, 2015 15 次提交
-
-
由 Aneesh Kumar K.V 提交于
The slot information of base page size hash pte is stored in the pgtable_t w.r.t transparent hugepage. We need to make sure we don't index beyond pgtable_t size. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
pte and pmd table size are dependent on config items. Don't hard code the same. This make sure we use the right value when masking pmd entries and also while checking pmd_bad Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
For a pte entry we will have _PAGE_PTE set. Our pte page address have a minimum alignment requirement of HUGEPD_SHIFT_MASK + 1. We use the lower 7 bits to indicate hugepd. ie. For pmd and pgd we can find: 1) _PAGE_PTE set pte -> indicate PTE 2) bits [2..6] non zero -> indicate hugepd. They also encode the size. We skip bit 1 (_PAGE_PRESENT). 3) othewise pointer to next table. Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
We support THP only with book3s_64 and 64K page size. Move THP details to hash64-64k.h to clarify the same. Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
W.r.t hugetlb, we support two format for pmd. With book3s_64 and 64K linux page size, we can have pte at the pmd level. Hence we don't need to support hugepd there. For everything else hugepd is supported and pmd_huge is (0). Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Only difference here is, we apply the WIMG mapping early, so rflags passed to updatepp will also be changed. Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Instead of open coding it in multiple code paths, export the helper and add more documentation. Also make sure we don't make assumption regarding pte bit position Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
This is similar to 64K insert. May be we want to consolidate Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Convert from asm to C Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
This free up 11 bits in pte_t. In the later patch we also change the pte_t format so that we can start supporting migration pte at pmd level. We now track 4k subpage valid bit as below If we have _PAGE_COMBO set, we override the _PAGE_F_GIX_SHIFT and _PAGE_F_SECOND. Together we have 4 bits, each of them used to indicate whether any of the 4 4k subpage in that group is valid. ie, [ group 1 bit ] [ group 2 bit ] ..... [ group 4 ] [ subpage 1 - 4] [ subpage 5- 8] ..... [ subpage 13 - 16] We still track each 4k subpage slot number and secondary hash information in the second half of pgtable_t. Removing the subpage tracking have some significant overhead on aim9 and ebizzy benchmark and to support THP with 4K subpage, we do need a pgtable_t of 4096 bytes. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
We should not expect pte bit position in asm code. Simply by moving part of that to C Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
We convert them static inline function here as we did with pte_val in the previous patch Acked-by: NScott Wood <scottwood@freescale.com> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
This is the same bug we fixed as part of 09567e7f ("powerpc/mm: Check paca psize is up to date for huge mappings"). Please check that for details. The difference here is that faults were happening on a 4K page at an address previously mapped by hugetlb. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 06 11月, 2015 1 次提交
-
-
由 Raghavendra K T 提交于
With the setup_nr_nodes(), we have already initialized node_possible_map. So it is safe to use for_each_node here. There are many places in the kernel that use hardcoded 'for' loop with nr_node_ids, because all other architectures have numa nodes populated serially. That should be reason we had maintained the same for powerpc. But, since sparse numa node ids possible on powerpc, we unnecessarily allocate memory for non existent numa nodes. For e.g., on a system with 0,1,16,17 as numa nodes nr_node_ids=18 and we allocate memory for nodes 2-14. This patch we allocate memory for only existing numa nodes. The patch is boot tested on a 4 node tuleta, confirming with printks that it works as expected. Signed-off-by: NRaghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Anton Blanchard <anton@samba.org> Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 10月, 2015 2 次提交
-
-
由 Kevin Hao 提交于
In order to workaround Erratum A-008139, we have to invalidate the tlb entry with tlbilx before overwriting. Due to the performance consideration, we don't add any memory barrier when acquire/release the tcd lock. This means the two load instructions for esel_next do have the possibility to return different value. This is definitely not acceptable due to the Erratum A-008139. We have two options to fix this issue: a) Add memory barrier when acquire/release tcd lock to order the load/store to esel_next. b) Just make sure to invalidate and write to the same tlb entry and tolerate the race that we may get the wrong value and overwrite the tlb entry just updated by the other thread. We observe better performance using option b. So reserve an additional register to save the value of the esel_next. Signed-off-by: NKevin Hao <haokexin@gmail.com> Signed-off-by: NScott Wood <scottwood@freescale.com>
-
由 Scott Wood 提交于
This is required for kdump to work when loaded at at an address that does not fall within the first TLB entry -- which can easily happen because while the lower limit is enforced via reserved memory, which doesn't affect how much is mapped, the upper limit is enforced via a different mechanism that does. Thus, more TLB entries are needed than would normally be used, as the total memory to be mapped might not be a power of two. Signed-off-by: NScott Wood <scottwood@freescale.com>
-
- 23 10月, 2015 1 次提交
-
-
由 Scott Wood 提交于
Use an AS=1 trampoline TLB entry to allow all normal TLB1 entries to be loaded at once. This avoids the need to keep the translation that code is executing from in the same TLB entry in the final TLB configuration as during early boot, which in turn is helpful for relocatable kernels (e.g. kdump) where the kernel is not running from what would be the first TLB entry. On e6500, we limit map_mem_in_cams() to the primary hwthread of a core (the boot cpu is always considered primary, as a kdump kernel can be entered on any cpu). Each TLB only needs to be set up once, and when we do, we don't want another thread to be running when we create a temporary trampoline TLB1 entry. Signed-off-by: NScott Wood <scottwood@freescale.com>
-
- 15 10月, 2015 1 次提交
-
-
由 Christophe Jaillet 提交于
of_get_next_parent can be used to simplify the while() loop and avoid the need of a temp variable. Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 12 10月, 2015 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
We need to properly identify whether a hugepage is an explicit or a transparent hugepage in follow_huge_addr(). We used to depend on hugepage shift argument to do that. But in some case that can result in wrong results. For ex: On finding a transparent hugepage we set hugepage shift to PMD_SHIFT. But we can end up clearing the thp pte, via pmdp_huge_get_and_clear. We do prevent reusing the pfn page via the usage of kick_all_cpus_sync(). But that happens after we updated the pte to 0. Hence in follow_huge_addr() we can find hugepage shift set, but transparent huge page check fail for a thp pte. NOTE: We fixed a variant of this race against thp split in commit 691e95fd ("powerpc/mm/thp: Make page table walk safe against thp split/collapse") Without this patch, we may hit the BUG_ON(flags & FOLL_GET) in follow_page_mask occasionally. In the long term, we may want to switch ppc64 64k page size config to enable CONFIG_ARCH_WANT_GENERAL_HUGETLB Reported-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
After commit e2b3d202 ("powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format"), we don't need to support is_hugepd() for 64K page size. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 09 10月, 2015 1 次提交
-
-
由 Cyril Bur 提交于
native_hpte_clear() is called in real mode from two places: - Early in boot during htab initialisation if firmware assisted dump is active. - Late in the kexec path. In both contexts there is no need to disable interrupts are they are already disabled. Furthermore, locking around the tlbie() is only required for pre POWER5 hardware. On POWER5 or newer hardware concurrent tlbie()s work as expected and on pre POWER5 hardware concurrent tlbie()s could result in deadlock. This code would only be executed at crashdump time, during which all bets are off, concurrent tlbie()s are unlikely and taking locks is unsafe therefore the best course of action is to simply do nothing. Concurrent tlbie()s are not possible in the first case as secondary CPUs have not come up yet. Signed-off-by: NCyril Bur <cyrilbur@gmail.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 01 10月, 2015 2 次提交
-
-
由 Michael Ellerman 提交于
For no reason other than it looks ugly. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Anshuman Khandual 提交于
This patch defines macros for the three bolted SLB indexes we use. Switch the functions that take the indexes as an argument to use the enum. Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 16 9月, 2015 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
If we had secondary hash flag set, we ended up modifying hash value in the updatepp code path. Hence with a failed updatepp we will be using a wrong hash value for the following hash insert. Fix this by recomputing hash before insert. Without this patch we can end up with using wrong slot number in linux pte. That can result in us missing an hash pte update or invalidate which can cause memory corruption or even machine check. Fixes: 6d492ecc ("powerpc/THP: Add code to handle HPTE faults for hugepages") Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 28 8月, 2015 1 次提交
-
-
由 Dan Williams 提交于
While pmem is usable as a block device or via DAX mappings to userspace there are several usage scenarios that can not target pmem due to its lack of struct page coverage. In preparation for "hot plugging" pmem into the vmemmap add ZONE_DEVICE as a new zone to tag these pages separately from the ones that are subject to standard page allocations. Importantly "device memory" can be removed at will by userspace unbinding the driver of the device. Having a separate zone prevents allocation and otherwise marks these pages that are distinct from typical uniform memory. Device memory has different lifetime and performance characteristics than RAM. However, since we have run out of ZONES_SHIFT bits this functionality currently depends on sacrificing ZONE_DMA. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Jerome Glisse <j.glisse@gmail.com> [hch: various simplifications in the arch interface] Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 18 8月, 2015 2 次提交
-
-
由 Nikunj A Dadhania 提交于
In some situations, a NUMA guest that supports ibm,dynamic-memory-reconfiguration node will end up having flat NUMA distances between nodes. This is because of two problems in the current code. 1) Different representations of associativity lists. There is an assumption about the associativity list in initialize_distance_lookup_table(). Associativity list has two forms: a) [cpu,memory]@x/ibm,associativity has following format: <N> <N integers> b) ibm,dynamic-reconfiguration-memory/ibm,associativity-lookup-arrays <M> <N> <M associativity lists each having N integers> M = the number of associativity lists N = the number of entries per associativity list Fix initialize_distance_lookup_table() so that it does not assume "case a". And update the caller to skip the length field before sending the associativity list. 2) Distance table not getting updated from drconf path. Node distance table will not get initialized in certain cases as ibm,dynamic-reconfiguration-memory path does not initialize the lookup table. Call initialize_distance_lookup_table() from drconf path with appropriate associativity list. Reported-by: NBharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: NNikunj A Dadhania <nikunj@linux.vnet.ibm.com> Acked-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
The relation between CONFIG_PPC_HAS_HASH_64K and CONFIG_PPC_64K_PAGES is painfully complicated. But if we rearrange it enough we can see that PPC_HAS_HASH_64K essentially depends on PPC_STD_MMU_64 && PPC_64K_PAGES. We can then notice that PPC_HAS_HASH_64K is used in files that are only built for PPC_STD_MMU_64, meaning it's equivalent to PPC_64K_PAGES. So replace all uses and drop it. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
-