- 29 7月, 2013 1 次提交
-
-
由 Martin Schwidefsky 提交于
Improve the code to upgrade the standard 2K page tables to 4K page tables with PGSTEs to allow the operation to happen when the program is already multi-threaded. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 29 6月, 2013 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 20 6月, 2013 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
This will be later used by powerpc THP support. In powerpc we want to use pgtable for storing the hash index values. So instead of adding them to mm_context list, we would like to store them in the second half of pmd Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: NAndrea Arcangeli <aarcange@redhat.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 05 6月, 2013 3 次提交
-
-
由 Christian Borntraeger 提交于
Getting and Releasing the pgste lock has lock semantics. Make the code an explicit barrier. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
In modify_prot_start we update the pgste value but never store it back into the original location. Lets save the calculated result, since modify_prot_commit will use the value of the pgste. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Christian Borntraeger 提交于
When doing the transition invalid->valid in the host page table for a guest, then the guest view of C/R is in the pgste. After validation the view is pgste OR real key. We must zero out the real key C/R to avoid guest over-indication for change (and reference). Touching the real key is ok also for the host: The change bit is tracked via write protection and the reference bit is also ok because set_pte_at was called and the page will be touched anyway soon. Furthermore architecture defines reference as "substantially accurate", over- and underindication are ok. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 28 5月, 2013 1 次提交
-
-
由 Christian Borntraeger 提交于
pte_present might return true on PAGE_TYPE_NONE, even if the invalid bit is on. Modify the existing check of the pgste functions to avoid crashes. [ Martin Schwidefsky: added ptep_modify_prot_[start|commit] bits ] Reported-by: NMartin Schwidefky <schwidefsky@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> CC: stable@vger.kernel.org Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 21 5月, 2013 3 次提交
-
-
由 Christian Borntraeger 提交于
The guest prefix pages must be mapped writeable all the time while SIE is running, otherwise the guest might see random behaviour. (pinned at the pte level) Turns out that mlocking is not enough, the page table entry (not the page) might change or become r/o. This patch uses the gmap notifiers to kick guest cpus out of SIE. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Martin Schwidefsky 提交于
The RCP byte is a part of the PGSTE value, the existing RCP_xxx names are inaccurate. As the defines describe bits and pieces of the PGSTE, the names should start with PGSTE_. The KVM_UR_BIT and KVM_UC_BIT are part of the PGSTE as well, give them better names as well. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
由 Christian Borntraeger 提交于
Dont use the same bit as user referenced. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NGleb Natapov <gleb@redhat.com>
-
- 17 5月, 2013 1 次提交
-
-
由 Christian Borntraeger 提交于
Dont use the same bit as user referenced. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
-
- 03 5月, 2013 1 次提交
-
-
由 Martin Schwidefsky 提交于
Add a notifier for kvm to get control before a page table entry is invalidated. The notifier is only called for ptes of an address space with pgstes that have been explicitly marked to require notification. Kvm will use this to get control before prefix pages of virtual CPU are unmapped. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 30 4月, 2013 1 次提交
-
-
由 Gerald Schaefer 提交于
Commit abf09bed ("s390/mm: implement software dirty bits") introduced another difference in the pte layout vs. the pmd layout on s390, thoroughly breaking the s390 support for hugetlbfs. This requires replacing some more pte_xxx functions in mm/hugetlbfs.c with a huge_pte_xxx version. This patch introduces those huge_pte_xxx functions and their generic implementation in asm-generic/hugetlb.h, which will now be included on all architectures supporting hugetlbfs apart from s390. This change will be a no-op for those architectures. [akpm@linux-foundation.org: fix warning] Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Hillf Danton <dhillf@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> [for !s390 parts] Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 4月, 2013 1 次提交
-
-
由 Heiko Carstens 提交于
Implement gmap_translate() function which translates a guest absolute address to a user space process address without establishing the guest page table entries. This is useful for kvm guest address translations where no memory access is expected to happen soon (e.g. tprot exception handler). Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 17 4月, 2013 1 次提交
-
-
由 Linus Torvalds 提交于
Commit b4cbb197 ("vm: add vm_iomap_memory() helper function") added a helper function wrapper around io_remap_pfn_range(), and every other architecture defined it in <asm/pgtable.h>. The s390 choice of <asm/io.h> may make sense, but is not very convenient for this case, and gratuitous differences like that cause unexpected errors like this: mm/memory.c: In function 'vm_iomap_memory': mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration] Glory be the kbuild test robot who noticed this, bisected it, and reported it to the guilty parties (ie me). Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 4月, 2013 2 次提交
-
-
由 Heiko Carstens 提交于
All architectures need to provide a check_pgt_cache() function. The s390 one got lost somewhere. So reintroduce it to prevent future compile errors e.g. if Thomas Gleixner's idle loop rework patches get merged. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
When translating user space addresses to kernel addresses the follow_table() function had two bugs: - PROT_NONE mappings could be read accessed via the kernel mapping. That is e.g. putting a filename into a user page, then protecting the page with PROT_NONE and afterwards issuing the "open" syscall with a pointer to the filename would incorrectly succeed. - when walking the page tables it used the pgd/pud/pmd/pte primitives which with dynamic page tables give no indication which real level of page tables is being walked (region2, region3, segment or page table). So in case of an exception the translation exception code passed to __handle_fault() is not necessarily correct. This is not really an issue since __handle_fault() doesn't evaluate the code. Only in case of e.g. a SIGBUS this code gets passed to user space. If user space can do something sane with the value is a different question though. To fix these issues don't use any Linux primitives. Only walk the page tables like the hardware would do it, however we leave quite some checks away since we know that we only have full size page tables and each index is within bounds. In theory this should fix all issues... Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 08 3月, 2013 1 次提交
-
-
由 Heiko Carstens 提交于
Implement gmap_translate() function which translates a guest absolute address to a user space process address without establishing the guest page table entries. This is useful for kvm guest address translations where no memory access is expected to happen soon (e.g. tprot exception handler). Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMarcelo Tosatti <mtosatti@redhat.com>
-
- 28 2月, 2013 1 次提交
-
-
由 Heiko Carstens 提交于
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 14 2月, 2013 2 次提交
-
-
由 Martin Schwidefsky 提交于
The s390 architecture is unique in respect to dirty page detection, it uses the change bit in the per-page storage key to track page modifications. All other architectures track dirty bits by means of page table entries. This property of s390 has caused numerous problems in the past, e.g. see git commit ef5d437f "mm: fix XFS oops due to dirty pages without buffers on s390". To avoid future issues in regard to per-page dirty bits convert s390 to a fault based software dirty bit detection mechanism. All user page table entries which are marked as clean will be hardware read-only, even if the pte is supposed to be writable. A write by the user process will trigger a protection fault which will cause the user pte to be marked as dirty and the hardware read-only bit is removed. With this change the dirty bit in the storage key is irrelevant for Linux as a host, but the storage key is still required for KVM guests. The effect is that page_test_and_clear_dirty and the related code can be removed. The referenced bit in the storage key is still used by the page_test_and_clear_young primitive to provide page age information. For page cache pages of mappings with mapping_cap_account_dirty there will not be any change in behavior as the dirty bit tracking already uses read-only ptes to control the amount of dirty pages. Only for swap cache pages and pages of mappings without mapping_cap_account_dirty there can be additional protection faults. To avoid an excessive number of additional faults the mk_pte primitive checks for PageDirty if the pgprot value allows for writes and pre-dirties the pte. That avoids all additional faults for tmpfs and shmem pages until these pages are added to the swap cache. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Only needed to make some drivers compile... Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 22 1月, 2013 1 次提交
-
-
由 Gerald Schaefer 提交于
On s390, an architecture-specific implementation of the function pmdp_set_wrprotect() is missing and the generic version is currently being used. The generic version does not flush the tlb as it would be needed on s390 when modifying an active pmd, which can lead to subtle tlb errors on s390 when using transparent hugepages. This patch adds an s390-specific implementation of pmdp_set_wrprotect() including the missing tlb flush. Cc: stable@vger.kernel.org Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 13 1月, 2013 1 次提交
-
-
由 Gerald Schaefer 提交于
The pfn calculation in pmd_pfn() is broken for thp, because it uses HPAGE_SHIFT instead of the normal PAGE_SHIFT. This is fixed by removing the distinction between thp and normal pmds in that function, and always using PAGE_SHIFT. Reported-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 13 12月, 2012 1 次提交
-
-
由 Kirill A. Shutemov 提交于
We have two different implementation of is_zero_pfn() and my_zero_pfn() helpers: for architectures with and without zero page coloring. Let's consolidate them in <asm-generic/pgtable.h>. Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 11月, 2012 2 次提交
-
-
由 Heiko Carstens 提交于
Just convert fault_init() to an early initcall. That's still early enough since it only needs be called before user space processes get executed. No reason to externalize it. Also add the function to the init section and move the store_indication variable to the read_mostly section. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
Use 2GB frames for indentity mapping if EDAT2 is available to reduce TLB pressure. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 26 10月, 2012 1 次提交
-
-
由 Gerald Schaefer 提交于
Similar to pte_none() and pte_present(), the pmd functions should also respect page protection of huge pages, especially PROT_NONE. This patch also simplifies massage_pgprot_pmd() by adding new definitions for huge page protection. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 09 10月, 2012 6 次提交
-
-
由 Heiko Carstens 提交于
Add a special module area on top of the vmalloc area, which may be only used for modules and bpf jit generated code. This makes sure that inter module branches will always happen without a trampoline and in addition having all the code within a 2GB frame is branch prediction unit friendly. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Heiko Carstens 提交于
pmd_huge() will always return 0 on !HUGETLBFS, however we use that helper function when walking the kernel page tables to decide if we have a 1MB page frame or not. Since we create 1MB frames for the kernel 1:1 mapping independently of HUGETLBFS this can lead to incorrect storage accesses since the code can assume that we have a pointer to a page table instead of a pointer to a 1MB frame. Fix this by adding a pmd_large() primitive like other architectures have it already and remove all references to HUGETLBFS/HUGETLBPAGE from the code that walks kernel page tables. Reviewed-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 David Miller 提交于
The transparent huge page code passes a PMD pointer in as the third argument of update_mmu_cache(), which expects a PTE pointer. This never got noticed because X86 implements update_mmu_cache() as a macro and thus we don't get any type checking, and X86 is the only architecture which supports transparent huge pages currently. Before other architectures can support transparent huge pages properly we need to add a new interface which will take a PMD pointer as the third argument rather than a PTE pointer. [akpm@linux-foundation.org: implement update_mm_cache_pmd() for s390] Signed-off-by: NDavid S. Miller <davem@davemloft.net> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Gerald Schaefer 提交于
This implements the architecture backend for transparent hugepages on s390. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Gerald Schaefer 提交于
This patch is part of the architecture backend for thp on s390. It provides the pagetable pre-allocation functions pgtable_trans_huge_deposit() and pgtable_trans_huge_withdraw(). Unlike other archs, s390 has no struct page * as pgtable_t, but rather a pointer to the page table. So instead of saving the pagetable pre- allocation list info inside the struct page, it is being saved within the pagetable itself. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Gerald Schaefer 提交于
This patch is part of the architecture backend for thp on s390. It provides the functions related to thp splitting, including serialization against gup. Unlike other archs, pmdp_splitting_flush() cannot use a tlb flushing operation to serialize against gup on s390, because that wouldn't be stopped by the disabled IRQs. So instead, smp_call_function() is called with an empty function, which will have the expected effect. Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 7月, 2012 1 次提交
-
-
由 Heiko Carstens 提交于
Remove the file name from the comment at top of many files. In most cases the file name was wrong anyway, so it's rather pointless. Also unify the IBM copyright statement. We did have a lot of sightly different statements and wanted to change them one after another whenever a file gets touched. However that never happened. Instead people start to take the old/"wrong" statements to use as a template for new files. So unify all of them in one go. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
-
- 24 5月, 2012 1 次提交
-
-
由 Heiko Carstens 提交于
Replace __s390x__ with CONFIG_64BIT in all places that are not exported to userspace or guarded with #ifdef __KERNEL__. Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 27 12月, 2011 1 次提交
-
-
由 Martin Schwidefsky 提交于
The kernel address space of a 64 bit kernel currently uses a three level page table and the vmemmap array has a fixed address and a fixed maximum size. A three level page table is good enough for systems with less than 3.8TB of memory, for bigger systems four page table levels need to be used. Each page table level costs a bit of performance, use 3 levels for normal systems and 4 levels only for the really big systems. To avoid bloating sparse.o too much set MAX_PHYSMEM_BITS to 46 for a maximum of 64TB of memory. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 01 12月, 2011 1 次提交
-
-
由 Carsten Otte 提交于
This patch makes sure we don't underindicate _PAGE_CHANGED in case we have a race between an operation that changes the page and this code path that hits us between page_get_storage_key and page_set_storage_key. Note that we still have a potential underindication on _PAGE_REFERENCED in the unlikely event that the page was changed but not referenced _and_ someone references the page in the race window. That's not considered to be a problem. Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 14 11月, 2011 1 次提交
-
-
由 Martin Schwidefsky 提交于
The pgste_update_all / pgste_update_young and pgste_set_pte need to check if the pte entry contains a valid page address before the storage key can be accessed. In addition pgste_set_pte needs to set the access key and fetch protection bit of the new pte entry, not the old entry. Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
- 30 10月, 2011 2 次提交
-
-
由 Christian Borntraeger 提交于
Linux on System z uses a ballooner based on diagnose 0x10. (aka as collaborative memory management). This patch implements diagnose 0x10 on the guest address space. Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-
由 Carsten Otte 提交于
gmap_fault needs to walk the guest page table. However, parts of that may change if some other thread does munmap. In that case gmap_unmap_notifier will also unmap the corresponding parts from the guest page table. We need to take mmap_sem in order to serialize these operations. do_exception now calls __gmap_fault with mmap_sem held which does not get exported to modules. The exported function, which is called from KVM, now takes mmap_sem. Reported-by: NHeiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: NCarsten Otte <cotte@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
-