- 05 5月, 2020 4 次提交
-
-
由 Aneesh Kumar K.V 提交于
This adds _PAGE_PTE check and makes sure we validate the pte value returned via find_kvm_host_pte. NOTE: this also considers _PAGE_INVALID to the software valid bit. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200505071729.54912-20-aneesh.kumar@linux.ibm.com
-
由 Aneesh Kumar K.V 提交于
Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200505071729.54912-13-aneesh.kumar@linux.ibm.com
-
由 Aneesh Kumar K.V 提交于
update kvmppc_hv_handle_set_rc to use find_kvm_nested_guest_pte and find_kvm_secondary_pte Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200505071729.54912-12-aneesh.kumar@linux.ibm.com
-
由 Aneesh Kumar K.V 提交于
The locking rules for walking partition scoped table is different from process scoped table. Hence add a helper for secondary linux page table walk and also add check whether we are holding the right locks. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200505071729.54912-10-aneesh.kumar@linux.ibm.com
-
- 02 8月, 2019 1 次提交
-
-
由 Joel Fernandes (Google) 提交于
The rcu_dereference_raw_notrace() API name is confusing. It is equivalent to rcu_dereference_raw() except that it also does sparse pointer checking. There are only a few users of rcu_dereference_raw_notrace(). This patches renames all of them to be rcu_dereference_raw_check() with the "_check()" indicating sparse checking. Signed-off-by: NJoel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Fix checkpatch warnings about parentheses. ] Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com>
-
- 05 6月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation 51 franklin street fifth floor boston ma 02110 1301 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 67 file(s). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NAllison Randal <allison@lohutok.net> Reviewed-by: NRichard Fontana <rfontana@redhat.com> Reviewed-by: NAlexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141333.953658117@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 21 12月, 2018 1 次提交
-
-
由 Suraj Jitindar Singh 提交于
Introduce a function kvmhv_update_nest_rmap_rc_list() which for a given nest_rmap list will traverse it, find the corresponding pte in the shadow page tables, and if it still maps the same host page update the rc bits accordingly. Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 17 12月, 2018 2 次提交
-
-
由 Suraj Jitindar Singh 提交于
There exists a function kvm_is_radix() which is used to determine if a kvm instance is using the radix mmu. However this only applies to the first level (L1) guest. Add a function kvmhv_vcpu_is_radix() which can be used to determine if the current execution context of the vcpu is radix, accounting for if the vcpu is running a nested guest. Currently all nested guests must be radix but this may change in the future. Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Paul Mackerras 提交于
This adds 'const' to the declarations for the struct kvm_memory_slot pointer parameters of some functions, which will make it possible to call those functions from kvmppc_core_commit_memory_region_hv() in the next patch. This also fixes some comments about locking. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Reviewed-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 09 10月, 2018 8 次提交
-
-
由 Paul Mackerras 提交于
This adds a list of valid shadow PTEs for each nested guest to the 'radix' file for the guest in debugfs. This can be useful for debugging. Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Suraj Jitindar Singh 提交于
This is only done at level 0, since only level 0 knows which physical CPU a vcpu is running on. This does for nested guests what L0 already did for its own guests, which is to flush the TLB on a pCPU when it goes to run a vCPU there, and there is another vCPU in the same VM which previously ran on this pCPU and has now started to run on another pCPU. This is to handle the situation where the other vCPU touched a mapping, moved to another pCPU and did a tlbiel (local-only tlbie) on that new pCPU and thus left behind a stale TLB entry on this pCPU. This introduces a limit on the the vcpu_token values used in the H_ENTER_NESTED hcall -- they must now be less than NR_CPUS. [paulus@ozlabs.org - made prev_cpu array be short[] to reduce memory consumption.] Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
This adds code to call the H_TLB_INVALIDATE hypercall when running as a guest, in the cases where we need to invalidate TLBs (or other MMU caches) as part of managing the mappings for a nested guest. Calling H_TLB_INVALIDATE lets the nested hypervisor inform the parent hypervisor about changes to partition-scoped page tables or the partition table without needing to do hypervisor-privileged tlbie instructions. Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Suraj Jitindar Singh 提交于
When a host (L0) page which is mapped into a (L1) guest is in turn mapped through to a nested (L2) guest we keep a reverse mapping (rmap) so that these mappings can be retrieved later. Whenever we create an entry in a shadow_pgtable for a nested guest we create a corresponding rmap entry and add it to the list for the L1 guest memslot at the index of the L1 guest page it maps. This means at the L1 guest memslot we end up with lists of rmaps. When we are notified of a host page being invalidated which has been mapped through to a (L1) guest, we can then walk the rmap list for that guest page, and find and invalidate all of the corresponding shadow_pgtable entries. In order to reduce memory consumption, we compress the information for each rmap entry down to 52 bits -- 12 bits for the LPID and 40 bits for the guest real page frame number -- which will fit in a single unsigned long. To avoid a scenario where a guest can trigger unbounded memory allocations, we scan the list when adding an entry to see if there is already an entry with the contents we need. This can occur, because we don't ever remove entries from the middle of a list. A struct nested guest rmap is a list pointer and an rmap entry; ---------------- | next pointer | ---------------- | rmap entry | ---------------- Thus the rmap pointer for each guest frame number in the memslot can be either NULL, a single entry, or a pointer to a list of nested rmap entries. gfn memslot rmap array ------------------------- 0 | NULL | (no rmap entry) ------------------------- 1 | single rmap entry | (rmap entry with low bit set) ------------------------- 2 | list head pointer | (list of rmap entries) ------------------------- The final entry always has the lowest bit set and is stored in the next pointer of the last list entry, or as a single rmap entry. With a list of rmap entries looking like; ----------------- ----------------- ------------------------- | list head ptr | ----> | next pointer | ----> | single rmap entry | ----------------- ----------------- ------------------------- | rmap entry | | rmap entry | ----------------- ------------------------- Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Suraj Jitindar Singh 提交于
Consider a normal (L1) guest running under the main hypervisor (L0), and then a nested guest (L2) running under the L1 guest which is acting as a nested hypervisor. L0 has page tables to map the address space for L1 providing the translation from L1 real address -> L0 real address; L1 | | (L1 -> L0) | ----> L0 There are also page tables in L1 used to map the address space for L2 providing the translation from L2 real address -> L1 read address. Since the hardware can only walk a single level of page table, we need to maintain in L0 a "shadow_pgtable" for L2 which provides the translation from L2 real address -> L0 real address. Which looks like; L2 L2 | | | (L2 -> L1) | | | ----> L1 | (L2 -> L0) | | | (L1 -> L0) | | | ----> L0 --------> L0 When a page fault occurs while running a nested (L2) guest we need to insert a pte into this "shadow_pgtable" for the L2 -> L0 mapping. To do this we need to: 1. Walk the pgtable in L1 memory to find the L2 -> L1 mapping, and provide a page fault to L1 if this mapping doesn't exist. 2. Use our L1 -> L0 pgtable to convert this L1 address to an L0 address, or try to insert a pte for that mapping if it doesn't exist. 3. Now we have a L2 -> L0 mapping, insert this into our shadow_pgtable Once this mapping exists we can take rc faults when hardware is unable to automatically set the reference and change bits in the pte. On these we need to: 1. Check the rc bits on the L2 -> L1 pte match, and otherwise reflect the fault down to L1. 2. Set the rc bits in the L1 -> L0 pte which corresponds to the same host page. 3. Set the rc bits in the L2 -> L0 pte. As we reuse a large number of functions in book3s_64_mmu_radix.c for this we also needed to refactor a number of these functions to take an lpid parameter so that the correct lpid is used for tlb invalidations. The functionality however has remained the same. Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NSuraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
This starts the process of adding the code to support nested HV-style virtualization. It defines a new H_SET_PARTITION_TABLE hypercall which a nested hypervisor can use to set the base address and size of a partition table in its memory (analogous to the PTCR register). On the host (level 0 hypervisor) side, the H_SET_PARTITION_TABLE hypercall from the guest is handled by code that saves the virtual PTCR value for the guest. This also adds code for creating and destroying nested guests and for reading the partition table entry for a nested guest from L1 memory. Each nested guest has its own shadow LPID value, different in general from the LPID value used by the nested hypervisor to refer to it. The shadow LPID value is allocated at nested guest creation time. Nested hypervisor functionality is only available for a radix guest, which therefore means a radix host on a POWER9 (or later) processor. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
When the 'regs' field was added to struct kvm_vcpu_arch, the code was changed to use several of the fields inside regs (e.g., gpr, lr, etc.) but not the ccr field, because the ccr field in struct pt_regs is 64 bits on 64-bit platforms, but the cr field in kvm_vcpu_arch is only 32 bits. This changes the code to use the regs.ccr field instead of cr, and changes the assembly code on 64-bit platforms to use 64-bit loads and stores instead of 32-bit ones. Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Paul Mackerras 提交于
This adds a file called 'radix' in the debugfs directory for the guest, which when read gives all of the valid leaf PTEs in the partition-scoped radix tree for a radix guest, in human-readable format. It is analogous to the existing 'htab' file which dumps the HPT entries for a HPT guest. Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 18 5月, 2018 2 次提交
-
-
由 Simon Guo 提交于
This patch moves nip/ctr/lr/xer registers from scattered places in kvm_vcpu_arch to pt_regs structure. cr register is "unsigned long" in pt_regs and u32 in vcpu->arch. It will need more consideration and may move in later patches. Signed-off-by: NSimon Guo <wei.guo.simon@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Simon Guo 提交于
Current regs are scattered at kvm_vcpu_arch structure and it will be more neat to organize them into pt_regs structure. Also it will enable reimplementation of MMIO emulation code with analyse_instr() later. Signed-off-by: NSimon Guo <wei.guo.simon@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 23 3月, 2018 1 次提交
-
-
由 Paul Mackerras 提交于
POWER9 has hardware bugs relating to transactional memory and thread reconfiguration (changes to hardware SMT mode). Specifically, the core does not have enough storage to store a complete checkpoint of all the architected state for all four threads. The DD2.2 version of POWER9 includes hardware modifications designed to allow hypervisor software to implement workarounds for these problems. This patch implements those workarounds in KVM code so that KVM guests see a full, working transactional memory implementation. The problems center around the use of TM suspended state, where the CPU has a checkpointed state but execution is not transactional. The workaround is to implement a "fake suspend" state, which looks to the guest like suspended state but the CPU does not store a checkpoint. In this state, any instruction that would cause a transition to transactional state (rfid, rfebb, mtmsrd, tresume) or would use the checkpointed state (treclaim) causes a "soft patch" interrupt (vector 0x1500) to the hypervisor so that it can be emulated. The trechkpt instruction also causes a soft patch interrupt. On POWER9 DD2.2, we avoid returning to the guest in any state which would require a checkpoint to be present. The trechkpt in the guest entry path which would normally create that checkpoint is replaced by either a transition to fake suspend state, if the guest is in suspend state, or a rollback to the pre-transactional state if the guest is in transactional state. Fake suspend state is indicated by a flag in the PACA plus a new bit in the PSSCR. The new PSSCR bit is write-only and reads back as 0. On exit from the guest, if the guest is in fake suspend state, we still do the treclaim instruction as we would in real suspend state, in order to get into non-transactional state, but we do not save the resulting register state since there was no checkpoint. Emulation of the instructions that cause a softpatch interrupt is handled in two paths. If the guest is in real suspend mode, we call kvmhv_p9_tm_emulation_early() to handle the cases where the guest is transitioning to transactional state. This is called before we do the treclaim in the guest exit path; because we haven't done treclaim, we can get back to the guest with the transaction still active. If the instruction is a case that kvmhv_p9_tm_emulation_early() doesn't handle, or if the guest is in fake suspend state, then we proceed to do the complete guest exit path and subsequently call kvmhv_p9_tm_emulation() in host context with the MMU on. This handles all the cases including the cases that generate program interrupts (illegal instruction or TM Bad Thing) and facility unavailable interrupts. The emulation is reasonably straightforward and is mostly concerned with checking for exception conditions and updating the state of registers such as MSR and CR0. The treclaim emulation takes care to ensure that the TEXASR register gets updated as if it were the guest treclaim instruction that had done failure recording, not the treclaim done in hypervisor state in the guest exit path. With this, the KVM_CAP_PPC_HTM capability returns true (1) even if transactional memory is not available to host userspace. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 23 11月, 2017 1 次提交
-
-
由 Paul Mackerras 提交于
The kvmppc_hpte_page_shifts function decodes the actual and base page sizes for a HPTE, returning -1 if it doesn't recognize the page size encoding. This then gets used as a shift amount in various places, which is undefined behaviour. This was reported by Coverity. In fact this should never occur, since we should only get HPTEs in the HPT which have a recognized page size encoding. The only place where this might not be true is in the call to kvmppc_actual_pgsz() near the beginning of kvmppc_do_h_enter(), where we are validating the HPTE value passed in from the guest. So to fix this and eliminate the undefined behaviour, we make kvmppc_hpte_page_shifts return 0 for unrecognized page size encodings, and make kvmppc_actual_pgsz() detect that case and return 0 for the page size, which will then cause kvmppc_do_h_enter() to return an error and refuse to insert any HPTE with an unrecognized page size encoding. To ensure that we don't get undefined behaviour in compute_tlbie_rb(), we take the 4k page size path for any unrecognized page size encoding. This should never be hit in practice because it is only used on HPTE values which have previously been checked for having a recognized page size encoding. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 01 11月, 2017 2 次提交
-
-
由 Paul Mackerras 提交于
Currently, the HPT code in HV KVM maintains a dirty bit per guest page in the rmap array, whether or not dirty page tracking has been enabled for the memory slot. In contrast, the radix code maintains a dirty bit per guest page in memslot->dirty_bitmap, and only does so when dirty page tracking has been enabled. This changes the HPT code to maintain the dirty bits in the memslot dirty_bitmap like radix does. This results in slightly less code overall, and will mean that we do not lose the dirty bits when transitioning between HPT and radix mode in future. There is one minor change to behaviour as a result. With HPT, when dirty tracking was enabled for a memslot, we would previously clear all the dirty bits at that point (both in the HPT entries and in the rmap arrays), meaning that a KVM_GET_DIRTY_LOG ioctl immediately following would show no pages as dirty (assuming no vcpus have run in the meantime). With this change, the dirty bits on HPT entries are not cleared at the point where dirty tracking is enabled, so KVM_GET_DIRTY_LOG would show as dirty any guest pages that are resident in the HPT and dirty. This is consistent with what happens on radix. This also fixes a bug in the mark_pages_dirty() function for radix (in the sense that the function no longer exists). In the case where a large page of 64 normal pages or more is marked dirty, the addressing of the dirty bitmap was incorrect and could write past the end of the bitmap. Fortunately this case was never hit in practice because a 2MB large page is only 32 x 64kB pages, and we don't support backing the guest with 1GB huge pages at this point. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Paul Mackerras 提交于
This removes the dependence of KVM on the mmu_psize_defs array (which stores information about hardware support for various page sizes) and the things derived from it, chiefly hpte_page_sizes[], hpte_page_size(), hpte_actual_page_size() and get_sllp_encoding(). We also no longer rely on the mmu_slb_size variable or the MMU_FTR_1T_SEGMENTS feature bit. The reason for doing this is so we can support a HPT guest on a radix host. In a radix host, the mmu_psize_defs array contains information about page sizes supported by the MMU in radix mode rather than the page sizes supported by the MMU in HPT mode. Similarly, mmu_slb_size and the MMU_FTR_1T_SEGMENTS bit are not set. Instead we hard-code knowledge of the behaviour of the HPT MMU in the POWER7, POWER8 and POWER9 processors (which are the only processors supported by HV KVM) - specifically the encoding of the LP fields in the HPT and SLB entries, and the fact that they have 32 SLB entries and support 1TB segments. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 01 4月, 2017 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
Now that we use all the available virtual address range, we need to make sure we don't generate VSID such that it overlaps with the reserved vsid range. Reserved vsid range include the virtual address range used by the adjunct partition and also the VRMA virtual segment. We find the context value that can result in generating such a VSID and reserve it early in boot. We don't look at the adjunct range, because for now we disable the adjunct usage in a Linux LPAR via CAS interface. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Rewrite hash__reserve_context_id(), move the rest into pseries] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 31 1月, 2017 3 次提交
-
-
由 David Gibson 提交于
Currently, kvmppc_alloc_hpt() both allocates a new hashed page table (HPT) and sets it up as the active page table for a VM. For the upcoming HPT resize implementation we're going to want to allocate HPTs separately from activating them. So, split the allocation itself out into kvmppc_allocate_hpt() and perform the activation with a new kvmppc_set_hpt() function. Likewise we split kvmppc_free_hpt(), which just frees the HPT, from kvmppc_release_hpt() which unsets it as an active HPT, then frees it. We also move the logic to fall back to smaller HPT sizes if the first try fails into the single caller which used that behaviour, kvmppc_hv_setup_htab_rma(). This introduces a slight semantic change, in that previously if the initial attempt at CMA allocation failed, we would fall back to attempting smaller sizes with the page allocator. Now, we try first CMA, then the page allocator at each size. As far as I can tell this change should be harmless. To match, we make kvmppc_free_hpt() just free the actual HPT itself. The call to kvmppc_free_lpid() that was there, we move to the single caller. Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 David Gibson 提交于
Currently the kvm_hpt_info structure stores the hashed page table's order, and also the number of HPTEs it contains and a mask for its size. The last two can be easily derived from the order, so remove them and just calculate them as necessary with a couple of helper inlines. Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Reviewed-by: NThomas Huth <thuth@redhat.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
由 Paul Mackerras 提交于
This adds a field in struct kvm_arch and an inline helper to indicate whether a guest is a radix guest or not, plus a new file to contain the radix MMU code, which currently contains just a translate function which knows how to traverse the guest page tables to translate an address. Signed-off-by: NPaul Mackerras <paulus@ozlabs.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 27 9月, 2016 1 次提交
-
-
由 Balbir Singh 提交于
Remove duplicate setting of the the "B" field when doing a tlbie(l). In compute_tlbie_rb(), the "B" field is set again just before returning the rb value to be used for tlbie(l). Signed-off-by: NBalbir Singh <bsingharora@gmail.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 09 9月, 2016 1 次提交
-
-
由 Paul Mackerras 提交于
This replaces a 2-D search through an array with a simple 8-bit table lookup for determining the actual and/or base page size for a HPT entry. The encoding in the second doubleword of the HPTE is designed to encode the actual and base page sizes without using any more bits than would be needed for a 4k page number, by using between 1 and 8 low-order bits of the RPN (real page number) field to encode the page sizes. A single "large page" bit in the first doubleword indicates that these low-order bits are to be interpreted like this. We can determine the page sizes by using the low-order 8 bits of the RPN to look up a 256-entry table. For actual page sizes less than 1MB, some of the upper bits of these 8 bits are going to be real address bits, but we can cope with that by replicating the entries for those smaller page sizes. While we're at it, let's move the hpte_page_size() and hpte_base_page_size() functions from a KVM-specific header to a header for 64-bit HPT systems, since this computation doesn't have anything specifically to do with KVM. Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
-
- 01 8月, 2016 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
Replace opencoding of the same at multiple places with the helper. No functional change with this patch. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 01 5月, 2016 3 次提交
-
-
由 Aneesh Kumar K.V 提交于
This helps to make following hash only pte bits easier. We have kept _PAGE_CHG_MASK, _HPAGE_CHG_MASK and _PAGE_PROT_BITS as it is in this patch eventhough they use hash specific bits. Using them in radix as it is should be ok, because with radix we expect those bit positions to be zero. Only renames in this patch, no change in functionality. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
PowerISA 3.0 introduces two pte bits with the below meaning for radix: 00 -> Normal Memory 01 -> Strong Access Order (SAO) 10 -> Non idempotent I/O (Cache inhibited and guarded) 11 -> Tolerant I/O (Cache inhibited) We drop the existing WIMG bits in the Linux page table in favour of the above constants. We loose _PAGE_WRITETHRU with this conversion. We only use writethru via pgprot_cached_wthru() which is used by fbdev/controlfb.c which is Apple control display and also PPC32. With respect to _PAGE_COHERENCE, we have been marking hpte always coherent for some time now. htab_convert_pte_flags() always added HPTE_R_M. NOTE: KVM changes need closer review. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Michael Ellerman 提交于
We have five locations in 64-bit hash MMU code that do a cmpxchg() of a PTE. Currently doing it inline OK, but in a future patch we will be converting the PTEs to __be64 in some configs. In that case we will need casts at every cmpxchg() site in order to keep sparse happy. So move the logic into a helper, this is a reasonably nice cleanup on its own. Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 16 2月, 2016 1 次提交
-
-
由 Alexey Kardashevskiy 提交于
SPAPR_TCE_SHIFT is used in few places only and since IOMMU_PAGE_SHIFT_4K can be easily used instead, remove SPAPR_TCE_SHIFT. Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 05 6月, 2015 1 次提交
-
-
由 Paolo Bonzini 提交于
Only two ioctls have to be modified; the address space id is placed in the higher 16 bits of their slot id argument. As of this patch, no architecture defines more than one address space; x86 will be the first. Reviewed-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 21 4月, 2015 3 次提交
-
-
由 Paul Mackerras 提交于
This replaces the assembler code for kvmhv_commence_exit() with C code in book3s_hv_builtin.c. It also moves the IPI sending code that was in book3s_hv_rm_xics.c into a new kvmhv_rm_send_ipi() function so it can be used by kvmhv_commence_exit() as well as icp_rm_set_vcpu_irq(). Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Paul Mackerras 提交于
This creates a debugfs directory for each HV guest (assuming debugfs is enabled in the kernel config), and within that directory, a file by which the contents of the guest's HPT (hashed page table) can be read. The directory is named vmnnnn, where nnnn is the PID of the process that created the guest. The file is named "htab". This is intended to help in debugging problems in the host's management of guest memory. The contents of the file consist of a series of lines like this: 3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196 The first field is the index of the entry in the HPT, the second and third are the HPT entry, so the third entry contains the real page number that is mapped by the entry if the entry's valid bit is set. The fourth field is the guest's view of the second doubleword of the entry, so it contains the guest physical address. (The format of the second through fourth fields are described in the Power ISA and also in arch/powerpc/include/asm/mmu-hash64.h.) Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
由 Aneesh Kumar K.V 提交于
This adds helper routines for locking and unlocking HPTEs, and uses them in the rest of the code. We don't change any locking rules in this patch. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAlexander Graf <agraf@suse.de>
-
- 17 4月, 2015 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
For THP that is marked trans splitting, we return the pte. This require the callers to handle the pmd_trans_splitting scenario, if they care. All the current callers are either looking at pfn or write_ok, hence we don't need to update them. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Aneesh Kumar K.V 提交于
pte can get updated from other CPUs as part of multiple activities like THP split, huge page collapse, unmap. We need to make sure we don't reload the pte value again and again for different checks. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-