- 09 1月, 2017 14 次提交
-
-
由 Junaid Shahid 提交于
MMIO SPTEs currently set both bits 62 and 63 to distinguish them as special PTEs. However, bit 63 is used as the SVE bit in Intel EPT PTEs. The SVE bit is ignored for misconfigured PTEs but not necessarily for not-Present PTEs. Since MMIO SPTEs use an EPT misconfiguration, so using bit 63 for them is acceptable. However, the upcoming fast access tracking feature adds another type of special tracking PTE, which uses not-Present PTEs and hence should not set bit 63. In order to use common bits to distinguish both type of special PTEs, we now use only bit 62 as the special bit. Signed-off-by: NJunaid Shahid <junaids@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Junaid Shahid 提交于
mmu_spte_update() tracks changes in the accessed/dirty state of the SPTE being updated and calls kvm_set_pfn_accessed/dirty appropriately. However, in some cases (e.g. when aging the SPTE), this shouldn't be done. mmu_spte_update_no_track() is introduced for use in such cases. Signed-off-by: NJunaid Shahid <junaids@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Junaid Shahid 提交于
This simplifies mmu_spte_update() a little bit. The checks for clearing of accessed and dirty bits are refactored into separate functions, which are used inside both mmu_spte_update() and mmu_spte_clear_track_bits(), as well as kvm_test_age_rmapp(). The new helper functions handle both the case when A/D bits are supported in hardware and the case when they are not. Signed-off-by: NJunaid Shahid <junaids@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Junaid Shahid 提交于
This change adds retries into the Fast Page Fault path. Without the retries, the code still works, but if a retry does end up being needed, then it will result in a second page fault for the same memory access, which will cause much more overhead compared to just retrying within the original fault. This would be especially useful with the upcoming fast access tracking change, as that would make it more likely for retries to be needed (e.g. due to read and write faults happening on different CPUs at the same time). Signed-off-by: NJunaid Shahid <junaids@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Junaid Shahid 提交于
This change renames spte_is_locklessly_modifiable() to spte_can_locklessly_be_made_writable() to distinguish it from other forms of lockless modifications. The full set of lockless modifications is covered by spte_has_volatile_bits(). Signed-off-by: NJunaid Shahid <junaids@google.com> Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Junaid Shahid 提交于
This change adds some symbolic constants for VM Exit Qualifications related to EPT Violations and updates handle_ept_violation() to use these constants instead of hard-coded numbers. Signed-off-by: NJunaid Shahid <junaids@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 David Matlack 提交于
When using two-dimensional paging, the mmu_page_hash (which provides lookups for existing kvm_mmu_page structs), becomes imbalanced; with too many collisions in buckets 0 and 512. This has been seen to cause mmu_lock to be held for multiple milliseconds in kvm_mmu_get_page on VMs with a large amount of RAM mapped with 4K pages. The current hash function uses the lower 10 bits of gfn to index into mmu_page_hash. When doing shadow paging, gfn is the address of the guest page table being shadow. These tables are 4K-aligned, which makes the low bits of gfn a good hash. However, with two-dimensional paging, no guest page tables are being shadowed, so gfn is the base address that is mapped by the table. Thus page tables (level=1) have a 2MB aligned gfn, page directories (level=2) have a 1GB aligned gfn, etc. This means hashes will only differ in their 10th bit. hash_64() provides a better hash. For example, on a VM with ~200G (99458 direct=1 kvm_mmu_page structs): hash max_mmu_page_hash_collisions -------------------------------------------- low 10 bits 49847 hash_64 105 perfect 97 While we're changing the hash, increase the table size by 4x to better support large VMs (further reduces number of collisions in 200G VM to 29). Note that hash_64() does not provide a good distribution prior to commit ef703f49 ("Eliminate bad hash multipliers from hash_32() and hash_64()"). Signed-off-by: NDavid Matlack <dmatlack@google.com> Change-Id: I5aa6b13c834722813c6cca46b8b1ed6f53368ade Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 David Matlack 提交于
Report the maximum number of mmu_page_hash collisions as a per-VM stat. This will make it easy to identify problems with the mmu_page_hash in the future. Signed-off-by: NDavid Matlack <dmatlack@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
The check in kvm_set_pic_irq() and kvm_set_ioapic_irq() was just a temporary measure until the code improved enough for us to do this. This changes APIC in a case when KVM_SET_GSI_ROUTING is called to set up pic and ioapic routes before KVM_CREATE_IRQCHIP. Those rules would get overwritten by KVM_CREATE_IRQCHIP at best, so it is pointless to allow it. Userspaces hopefully noticed that things don't work if they do that and don't do that. Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
We don't treat kvm->arch.vpic specially anymore, so the setup can look like ioapic. This gets a bit more information out of return values. Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Radim Krčmář 提交于
irqchip_in_kernel() tried to save a bit by reusing pic_irqchip(), but it just complicated the code. Add a separate state for the irqchip mode. Reviewed-by: NDavid Hildenbrand <david@redhat.com> [Used Paolo's version of condition in irqchip_in_kernel().] Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Radim Krčmář 提交于
Split irqchip cannot be created after creating the kernel irqchip, but we forgot to restrict the other way. This is an API change. Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 05 1月, 2017 5 次提交
-
-
由 Jan Dakinevich 提交于
Declaration of VMX_VPID_EXTENT_SUPPORTED_MASK occures twice in the code. Probably, it was happened after unsuccessful merge. Signed-off-by: NJan Dakinevich <jan.dakinevich@gmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 James Hogan 提交于
Flush the KVM entry code from the icache on all CPUs, not just the one that built the entry code. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 3.16.x- Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 James Hogan 提交于
On 64-bit kernels, MIPS KVM will clear CP0_Status.UX to prevent the guest (running in user mode) from accessing the 64-bit memory segments. However the previous value of CP0_Status.UX is never restored when exiting from the guest. If the user process uses 64-bit addressing (the n64 ABI) this can result in address error exceptions from the kernel if it needs to deliver a signal before returning to user mode, as the kernel will need to write a sigframe to high user addresses on the user stack which are disallowed by CP0_Status.UX=0. This is fixed by explicitly setting SX and UX again when exiting from the guest, and explicitly clearing those bits when returning to the guest. Having the SX and UX bits set when handling guest exits (rather than only when exiting to userland) will be helpful when we support VZ, since we shouldn't need to directly read or write guest memory, so it will be valid for cache management IPIs to access host user addresses. Signed-off-by: NJames Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # 4.8.x- Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Mark Rutland 提交于
Commit c02433dd ("arm64: split thread_info from task stack") inverted the relationship between get_current() and current_thread_info(), with sp_el0 now holding the current task_struct rather than the current thead_info. The new implementation of get_current() prevents the compiler from being able to optimize repeated calls to either, resulting in a noticeable penalty in some microbenchmarks. This patch restores the previous optimisation by implementing get_current() in the same way as our old current_thread_info(), using a non-volatile asm statement. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Reported-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Mark Rutland 提交于
Recent changes made KERN_CONT mandatory for continued lines. In the absence of KERN_CONT, a newline may be implicit inserted by the core printk code. In show_pte, we (erroneously) use printk without KERN_CONT for continued prints, resulting in output being split across a number of lines, and not matching the intended output, e.g. [ff000000000000] *pgd=00000009f511b003 , *pud=00000009f4a80003 , *pmd=0000000000000000 Fix this by using pr_cont() for all the continuations. Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NMark Rutland <mark.rutland@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 04 1月, 2017 3 次提交
-
-
由 Kevin Hilman 提交于
Signed-off-by: NKevin Hilman <khilman@baylibre.com>
-
由 Neil Armstrong 提交于
Add Video Processing Unit and CVBS Output nodes, and enable CVBS on selected boards. Reviewed-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: NNeil Armstrong <narmstrong@baylibre.com> Signed-off-by: NKevin Hilman <khilman@baylibre.com>
-
由 Kevin Hilman 提交于
Signed-off-by: NKevin Hilman <khilman@baylibre.com>
-
- 03 1月, 2017 3 次提交
-
-
由 Fabio Estevam 提交于
Commit 1be81ea5 ("ARM: dts: imx6: Add imx-weim parameters to dtsi's") causes the following probe error when the weim node is not present on the board dts (such as imx6q-sabresd): imx-weim 21b8000.weim: Invalid 'ranges' configuration imx-weim: probe of 21b8000.weim failed with error -22 There is no need to always enable the "weim" node on mx6. Do the same as in the other i.MX dtsi files where "weim" is disabled and only gets enabled on a per dts basis. All the imx6 weim dts users explicitily provide 'status = "okay"', so this change has no impact on current imx6 weim users. If a board does not use the weim driver it will not describe its 'ranges' property, so simply disable the 'weim' node in the imx6 dtsi files to avoid such probe error message. Fixes: 1be81ea5 ("ARM: dts: imx6: Add imx-weim parameters to dtsi's") Signed-off-by: NFabio Estevam <fabio.estevam@nxp.com> Signed-off-by: NShawn Guo <shawnguo@kernel.org>
-
由 Helge Deller 提交于
Add a leading line break else printed line gets too long. Signed-off-by: NHelge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.9
-
由 Bjorn Andersson 提交于
As per the device tree binding the apq8064 scm node requires the core clock to be specified, so add this. Signed-off-by: NBjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: NAndy Gross <andy.gross@linaro.org>
-
- 02 1月, 2017 9 次提交
-
-
由 Alexandre Bailon 提交于
Everytime the usb20 phy is enabled, there is a "sleeping function called from invalid context" BUG. In addition, there is a recursive locking happening because of the recurse call to clk_enable(). clk_enable() from arch/arm/mach-davinci/clock.c uses spin_lock_irqsave() before to invoke the callback usb20_phy_clk_enable(). usb20_phy_clk_enable() uses clk_get() and clk_enable_prepapre() which may sleep. Replace clk_prepare_enable() by davinci_clk_enable(). Signed-off-by: NAlexandre Bailon <abailon@baylibre.com> Suggested-by: NDavid Lechner <david@lechnology.com> [nsekhar@ti.com: minor commit description adjustment] Signed-off-by: NSekhar Nori <nsekhar@ti.com>
-
由 Alexandre Bailon 提交于
In some cases, there is a need to enable a clock as part of clock enable callback of a different clock. For example, USB 2.0 PHY clock enable requires USB 2.0 clock to be enabled. In this case, it is safe to instead call __clk_enable() since the clock framework lock is already taken. Calling clk_enable() causes recursive locking error. A similar case arises in the clock disable path. To enable such usage, make __clk_{enable,disable} functions publicly available outside of clock.c. Also, call them davinci_clk_{enable|disable} now to be consistent with how other davinci-specific clock functions are named. Note that these functions are not exported to drivers. They are meant for usage in platform specific clock management code. Signed-off-by: NAlexandre Bailon <abailon@baylibre.com> Suggested-by: NDavid Lechner <david@lechnology.com> Signed-off-by: NSekhar Nori <nsekhar@ti.com>
-
由 Bartosz Golaszewski 提交于
Similarly to the aemif clock - this screws up the linked list of clock children. Create a separate clock for mdio inheriting the rate from emac_clk. Cc: <stable@vger.kernel.org> # 3.12.x- Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com> [nsekhar@ti.com: add a comment over mdio_clk to explaing its existence + commit headline updates] Signed-off-by: NSekhar Nori <nsekhar@ti.com>
-
由 Bartosz Golaszewski 提交于
The aemif clock is added twice to the lookup table in da850.c. This breaks the children list of pll0_sysclk3 as we're using the same list links in struct clk. When calling clk_set_rate(), we get stuck in propagate_rate(). Create a separate clock for nand, inheriting the rate of the aemif clock and retrieve it in the davinci_nand module. Cc: <stable@vger.kernel.org> # 4.9.x Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: NSekhar Nori <nsekhar@ti.com>
-
由 Vladimir Murzin 提交于
There is no need to define map_io only for debug_ll_io_init() since it is already called in devicemaps_init() if map_io is NULL. Apart from that, for NOMMU build debug_ll_io_init() is a nop which leads to following error: CC arch/arm/mach-imx/mach-imx1.o arch/arm/mach-imx/mach-imx1.c:40:13: error: 'debug_ll_io_init' undeclared here (not in a function) .map_io = debug_ll_io_init, ^ make[1]: *** [arch/arm/mach-imx/mach-imx1.o] Error 1 Cc: Alexander Shiyan <shc_work@mail.ru> Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: NShawn Guo <shawnguo@kernel.org>
-
由 Andreas Färber 提交于
Found while reviewing Marvell dsa bindings usage. Fixes: f283745b ("arm: vf610: zii devel b: Add support for switch interrupts") Cc: Andrew Lunn <andrew@lunn.ch> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: NAndreas Färber <afaerber@suse.de> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NShawn Guo <shawnguo@kernel.org>
-
由 Gary Bisson 提交于
The NANDF_CS2 pad is also part of the wlan-vmmcgrp iomux group. Removing is from the usdhc2grp group avoids the following error: imx6q-pinctrl 20e0000.iomuxc: pin MX6Q_PAD_NANDF_CS2 already requested by regulators:regulator@4; cannot claim for 2194000.usdhc imx6q-pinctrl 20e0000.iomuxc: pin-187 (2194000.usdhc) status -22 imx6q-pinctrl 20e0000.iomuxc: could not request pin 187 (MX6Q_PAD_NANDF_CS2) from group usdhc2grp on device 20e0000.iomuxc Signed-off-by: NGary Bisson <gary.bisson@boundarydevices.com> Signed-off-by: NShawn Guo <shawnguo@kernel.org>
-
由 Vladimir Zapolskiy 提交于
On i.MX31 AVIC interrupt controller base address is at 0x68000000. The problem was shadowed by the AVIC driver, which takes the correct base address from a SoC specific header file. Fixes: d2a37b3d ("ARM i.MX31: Add devicetree support") Signed-off-by: NVladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Reviewed-by: NFabio Estevam <fabio.estevam@nxp.com> Signed-off-by: NShawn Guo <shawnguo@kernel.org>
-
由 Stafford Horne 提交于
The build robot reports: .tmp_kallsyms1.o: In function `kallsyms_relative_base': >> (.rodata+0x8a18): undefined reference to `_text' This is when using 'make alldefconfig'. Adding this _text symbol to mark the start of the kernel as in other architecture fixes this. Signed-off-by: NStafford Horne <shorne@gmail.com> Acked-by: NJonas Bonn <jonas@southpole.se>
-
- 31 12月, 2016 1 次提交
-
-
由 Kishon Vijay Abraham I 提交于
Add 'gpios' property to pcie1 dt node and populate it with GPIO3_23 in order to drive PCIE_RESETn high. This gets PCIe cards to be detected in AM572X IDK board. Signed-off-by: NKishon Vijay Abraham I <kishon@ti.com> Signed-off-by: NTony Lindgren <tony@atomide.com>
-
- 30 12月, 2016 5 次提交
-
-
由 Sudeep Holla 提交于
The GICv2 CPU interface registers span across 8K, not 4K as indicated in the DT. Only the GICC_DIR register is located after the initial 4K boundary, leaving a functional system but without support for separately EOI'ing and deactivating interrupts. After this change the system supports split priority drop and interrupt deactivation. This patch is based on similar one from Christoffer Dall: commit 368400e2 ("ARM: dts: vexpress: Support GICC_DIR operations") Signed-off-by: NSudeep Holla <sudeep.holla@arm.com>
-
由 Christoffer Dall 提交于
The GICv2 CPU interface registers span across 8K, not 4K as indicated in the DT. Only the GICC_DIR register is located after the initial 4K boundary, leaving a functional system but without support for separately EOI'ing and deactivating interrupts. After this change the system supports split priority drop and interrupt deactivation. Acked-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org> [sudeep.holla@arm.com: included same fix for tc1 platform too] Signed-off-by: NSudeep Holla <sudeep.holla@arm.com>
-
由 Helge Deller 提交于
Commit 7e781418 ("signal: consolidate {TS,TLF}_RESTORE_SIGMASK code") introduced code with which the "restore sigmask" flag lives in task_struct instead of ti->flags. Let's use this optimization on parisc too. Signed-off-by: NHelge Deller <deller@gmx.de>
-
由 Helge Deller 提交于
The cr16 interval timer of each CPU is not syncronized to other cr16 timers in other CPUs in a SMP system. So, delay the registration of the cr16 clocksource until all CPUs have been detected and then - if we are on a SMP machine - mark the cr16 clocksource as unstable and lower it's rating before registering it at the clocksource framework. This patch fixes the stalled CPU warnings which we have seen since introduction of the cr16 clocksource. Signed-off-by: NHelge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.8+
-
由 Linus Torvalds 提交于
In commit 62906027 ("mm: add PageWaiters indicating tasks are waiting for a page bit") Nick Piggin made our page locking no longer unconditionally touch the hashed page waitqueue, which not only helps performance in general, but is particularly helpful on NUMA machines where the hashed wait queues can bounce around a lot. However, the "clear lock bit atomically and then test the waiters bit" sequence turns out to be much more expensive than it needs to be, because you get a nasty stall when trying to access the same word that just got updated atomically. On architectures where locking is done with LL/SC, this would be trivial to fix with a new primitive that clears one bit and tests another atomically, but that ends up not working on x86, where the only atomic operations that return the result end up being cmpxchg and xadd. The atomic bit operations return the old value of the same bit we changed, not the value of an unrelated bit. On x86, we could put the lock bit in the high bit of the byte, and use "xadd" with that bit (where the overflow ends up not touching other bits), and look at the other bits of the result. However, an even simpler model is to just use a regular atomic "and" to clear the lock bit, and then the sign bit in eflags will indicate the resulting state of the unrelated bit #7. So by moving the PageWaiters bit up to bit #7, we can atomically clear the lock bit and test the waiters bit on x86 too. And architectures with LL/SC (which is all the usual RISC suspects), the particular bit doesn't matter, so they are fine with this approach too. This avoids the extra access to the same atomic word, and thus avoids the costly stall at page unlock time. The only downside is that the interface ends up being a bit odd and specialized: clear a bit in a byte, and test the sign bit. Nick doesn't love the resulting name of the new primitive, but I'd rather make the name be descriptive and very clear about the limitation imposed by trying to work across all relevant architectures than make it be some generic thing that doesn't make the odd semantics explicit. So this introduces the new architecture primitive clear_bit_unlock_is_negative_byte(); and adds the trivial implementation for x86. We have a generic non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)" combination) which can be overridden by any architecture that can do better. According to Nick, Power has the same hickup x86 has, for example, but some other architectures may not even care. All these optimizations mean that my page locking stress-test (which is just executing a lot of small short-lived shell scripts: "make test" in the git source tree) no longer makes our page locking look horribly bad. Before all these optimizations, just the unlock_page() costs were just over 3% of all CPU overhead on "make test". After this, it's down to 0.66%, so just a quarter of the cost it used to be. (The difference on NUMA is bigger, but there this micro-optimization is likely less noticeable, since the big issue on NUMA was not the accesses to 'struct page', but the waitqueue accesses that were already removed by Nick's earlier commit). Acked-by: NNick Piggin <npiggin@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Andrew Lutomirski <luto@kernel.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-