- 18 October 2013, 3 commits
-
-
By Christoffer Dall

Support huge pages in KVM/ARM and KVM/ARM64. The pud_huge check on the unmap path may feel a bit silly, as pud_huge is always defined to false there, but the compiler should be smart about this.

Note: This deals only with VMAs marked as huge, which are allocated by users through hugetlbfs only. Transparent huge pages can only be detected by looking at the underlying pages (or the page tables themselves), and this patch so far simply maps these on a page-by-page level in the Stage-2 page tables.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
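A simplified sketch of the hugetlbfs path this adds to the Stage-2 fault handler (modelled on the arm kvm mmu code; treat names such as stage2_set_pmd_huge as approximations of the patch, not verbatim, and note that icache coherency handling is elided):

    if (is_vm_hugetlb_page(vma)) {
        /* Back the fault with a single stage-2 PMD (2MB)
         * instead of 512 individual PTEs. */
        pmd_t new_pmd = pfn_pmd(pfn, PAGE_S2);

        new_pmd = pmd_mkhuge(new_pmd);
        if (writable)
            kvm_set_s2pmd_writable(&new_pmd);
        ret = stage2_set_pmd_huge(kvm, memcache,
                                  fault_ipa & PMD_MASK, &new_pmd);
    }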
-
By Christoffer Dall

Update comments to reflect what is really going on, and add the TWE bit to the comments in kvm_arm.h. Also rename the function to kvm_handle_wfx, as is done on arm64, for consistency and uber-correctness.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
By Marc Zyngier

On an (even slightly) oversubscribed system, spinlocks quickly become a bottleneck, as some vcpus are spinning, waiting for a lock to be released, while the vcpu holding the lock may not be running at all. This creates contention, and the observed slowdown is 40x for hackbench. No, this isn't a typo.

The solution is to trap blocking WFEs and tell KVM that we're now spinning. This ensures that other vcpus will get a scheduling boost, allowing the lock to be released more quickly. Also, using CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance when the VM is severely overcommitted.

Quick test to estimate the performance: hackbench 1 process 1000

    2xA15 host (baseline):  1.843s
    2xA15 guest w/o patch:  2.083s
    4xA15 guest w/o patch:  80.212s
    8xA15 guest w/o patch:  could not be bothered to find out
    2xA15 guest w/ patch:   2.102s
    4xA15 guest w/ patch:   3.205s
    8xA15 guest w/ patch:   6.887s

So we go from a 40x degradation to 1.5x in the 2x overcommit case, which is vaguely more acceptable.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
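The handler itself is tiny; a sketch close to the arch/arm/kvm exit-handler code this patch touches (HSR_WFI_IS_WFE distinguishes a trapped WFE from a WFI in the fault syndrome):

    static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
    {
        if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE)
            kvm_vcpu_on_spin(vcpu);   /* spinning: boost other vcpus */
        else
            kvm_vcpu_block(vcpu);     /* WFI: put the vcpu to sleep */

        return 1;
    }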
-
- 14 October 2013, 1 commit
-
-
By Christoffer Dall

The KVM_HPAGE defines are a little artificial on ARM, since the huge page size is statically defined at compile time and there is only a single huge page size. Now that the main kvm code relying on these defines has been moved to the x86-specific part of the world, we can get rid of them.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
-
- 13 October 2013, 2 commits
-
-
By Jonathan Austin

This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts. As Cortex-A7 is architecturally compatible with A15, this patch is largely just generalising existing code. Code for areas where 'implementation defined' behaviour is identical on A7 and A15 is moved so that it can be used by both cores. The check to ensure that coprocessor register tables are sorted correctly is also moved into 'common' code to avoid each new cpu doing its own check (and possibly forgetting to do so!)

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
-
By Jonathan Austin

The T{0,1}SZ fields of TTBCR are 3 bits wide when using the long descriptor format. Likewise, the T0SZ field of the HTCR is 3 bits. KVM currently defines TTBCR_T{0,1}SZ as 3, not 7.

The T0SZ mask is used to calculate the value for the HTCR, both to pick out TTBCR.T0SZ and to mask off the equivalent field in the HTCR during the read-modify-write. The incorrect mask size causes the (UNKNOWN) reset value of HTCR.T0SZ to leak into the calculated HTCR value, and Linux will hang when initializing KVM if HTCR's reset value has bit 2 set (sometimes the case on A7/TC2).

Fixing T0SZ allows A7 cores to boot; T1SZ is also fixed for completeness.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
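In other words (a sketch of the shape of the fix; kvm_arm.h holds the real constants, and read_htcr here is a hypothetical stand-in for reading the HTCR):

    /* A 3-bit field needs a 3-bit mask: 0b111 (7), not 0b011 (3). */
    #define TTBCR_T0SZ  (7 << 0)    /* was 3: bit 2 was missing */
    #define TTBCR_T1SZ  (7 << 16)   /* was (3 << 16) */

    /* With the short mask, bit 2 of HTCR.T0SZ - UNKNOWN at reset -
     * survives this read-modify-write and corrupts the result: */
    htcr = (read_htcr() & ~TTBCR_T0SZ) | (ttbcr & TTBCR_T0SZ);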
-
- 03 October 2013, 1 commit
-
-
By Anup Patel

This patch implements the kvm_vcpu_preferred_target() function for KVM ARM, which will help us implement the KVM_ARM_PREFERRED_TARGET ioctl for user space.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
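From user space the pairing looks roughly like this (hedged sketch; error handling abbreviated, fields beyond the target left zeroed):

    struct kvm_vcpu_init init;

    memset(&init, 0, sizeof(init));
    /* Ask the VM which vcpu target best matches the host core... */
    if (ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init) < 0)
        err(1, "KVM_ARM_PREFERRED_TARGET");
    /* ...and initialise each vcpu with it. */
    if (ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init) < 0)
        err(1, "KVM_ARM_VCPU_INIT");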
-
- 02 September 2013, 3 commits
-
-
By Dan Aloni

Signed-off-by: Dan Aloni <alonid@stratoscale.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
By Douglas Anderson

It appears that gcc may put some code in ".text.unlikely" or ".text.hot" sections. Right now those aren't accounted for in unwind tables; add them. I found some docs about this at: http://gcc.gnu.org/onlinedocs/gcc-4.6.2/gcc.pdf

Without this, if you have slub_debug turned on, you can get messages that look like this:

    unwind: Index not found 7f008c50

Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Uwe Kleine-König

The newly introduced function is to be used as the .restart callback for ARMv7-M machines. The register used is architecturally defined, so it should work for all M-class machines.

Acked-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
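The sequence is architecturally defined via SCB->AIRCR; a bare-metal C sketch of it (the kernel's actual callback uses its own register definitions and is written against kernel infrastructure):

    #include <stdint.h>

    #define V7M_SCB_AIRCR     (*(volatile uint32_t *)0xE000ED0CUL)
    #define AIRCR_VECTKEY     (0x05FAUL << 16)  /* mandatory write key */
    #define AIRCR_SYSRESETREQ (1UL << 2)        /* request a system reset */

    static void v7m_restart(void)
    {
        __asm__ volatile ("dsb" ::: "memory");  /* drain outstanding writes */
        V7M_SCB_AIRCR = AIRCR_VECTKEY | AIRCR_SYSRESETREQ;
        for (;;)
            ;                                   /* wait for the reset */
    }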
-
- 31 August 2013, 1 commit
-
-
By Christoffer Dall

The kvm_set_pte function was actually assigning the entire struct to the structure member, which should work because the structure only has that one member, but it is still not very nice.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
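The cleanup amounts to assigning like for like (a sketch; pte_t and pte_val() are the kernel's page-table types):

    static inline void kvm_set_pte(pte_t *pte, pte_t new_pte)
    {
        /* Was: pte_val(*pte) = new_pte; - the whole pte_t struct
         * assigned to its single raw-value member. */
        *pte = new_pte;
    }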
-
- 27 August 2013, 1 commit
-
-
By Marek Szyprowski

This patch cleans up the initialization of the dma contiguous framework. The all-in-one dma_declare_contiguous() function is now separated into dma_contiguous_reserve_area(), which only steals the memory from the memblock allocator, and dma_contiguous_add_device(), which assigns the given device to the specified reserved memory area. This improves the flexibility in defining contiguous memory areas and assigning devices to them, because it is now possible to assign more than one device to a given contiguous memory area. Such a split in the initialization procedure is also required for upcoming device tree support.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Tomasz Figa <t.figa@samsung.com>
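Usage then splits into a boot-time reservation and per-device assignment; a hypothetical sketch (the exact signatures are defined by the patch and may differ from this approximation):

    /* Early boot: carve a 16MB contiguous area out of memblock
     * (base/limit of 0 let the allocator choose the placement). */
    struct cma *camera_area;

    dma_contiguous_reserve_area(SZ_16M, 0, 0, &camera_area);

    /* Device registration: several devices may now share one area. */
    dma_contiguous_add_device(&camera0_dev, camera_area);
    dma_contiguous_add_device(&camera1_dev, camera_area);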
-
- 26 August 2013, 7 commits
-
-
By Russell King

Now that the PL01X debug include can mostly stand alone, without requiring platforms to provide any macros, move it into the debug directory so it can be directly included. This allows us to get rid of a lot of debug-macros include files. The autodetect cases for Versatile Express and the ux500 are left alone; these are more complicated implementations.

Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Ryan Mallon <rmallon@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Russell King

Move the definition of the UART register addresses out of the platform-specific header files into the Kconfig files.

Acked-by: Ryan Mallon <rmallon@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Russell King

Now that the 8250 debug include can stand alone, without requiring platforms to provide any macros, move it into the debug directory so it can be directly included. This allows us to get rid of a lot of debug-macros include files.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Russell King

Move the definition of the UART register addresses out of the platform-specific header file into the Kconfig files.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Russell King

Move the definition of the UART register shift out of the platform-specific header file into the Kconfig files.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Russell King

Move the definition out of the machine class debug-macro.S header into the Kconfig files.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Ard Biesheuvel

The C99 types uintXX_t that are usually defined in 'stdint.h' are not as unambiguous on ARM as you would expect. For the types below, there is a difference on ARM between GCC built for bare metal, GCC built for glibc, and the kernel itself, which results in build errors if you try to build with -ffreestanding and include 'stdint.h' (such as when you include 'arm_neon.h' in order to use NEON intrinsics).

As the typedefs for these types in 'stdint.h' are based on builtin defines supplied by GCC, we can tweak these to align with the kernel's idea of those types, so 'linux/types.h' and 'stdint.h' can be safely included from the same source file (provided that -ffreestanding is used).

                    int32_t   uint32_t       uintptr_t
    bare metal GCC  long      unsigned long  unsigned int
    glibc GCC       int       unsigned int   unsigned int
    kernel          int       unsigned int   unsigned long

Acked-by: Dave Martin <dave.martin@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
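The mechanism is to override the builtin macros that stdint.h derives its typedefs from; a sketch of the idea (assuming GCC's __INT32_TYPE__-style builtins, which the freestanding stdint.h consumes):

    /* Make a freestanding stdint.h agree with linux/types.h. */
    #undef  __INT32_TYPE__
    #define __INT32_TYPE__      int
    #undef  __UINT32_TYPE__
    #define __UINT32_TYPE__     unsigned int
    #undef  __UINTPTR_TYPE__
    #define __UINTPTR_TYPE__    unsigned long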
-
- 23 August 2013, 1 commit
-
-
By Rob Herring

Move the outer_cache declaration out of the CONFIG_OUTER_CACHE ifdef, so that outer_cache can be used inside an IS_ENABLED() condition.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
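This enables the usual IS_ENABLED() idiom, where the disabled branch is discarded by the compiler but must still parse and type-check (a sketch; .disable is one of the outer_cache_fns callbacks):

    extern struct outer_cache_fns outer_cache;  /* now visible unconditionally */

    if (IS_ENABLED(CONFIG_OUTER_CACHE) && outer_cache.disable)
        outer_cache.disable();  /* dead code when CONFIG_OUTER_CACHE=n,
                                 * but it still has to compile */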
-
- 20 August 2013, 2 commits
-
-
By Will Deacon

The flush_cache_user_range macro takes a pair of addresses describing the start and end of the virtual address range to flush. Due to an accidental oversight when flush_cache_range_user was introduced, the address range was rounded up so that the start and end addresses were page-aligned.

For historical reference, the interesting commits in history.git are:

    10eacf1775e1 ("[ARM] Clean up ARM cache handling interfaces (part 1)")
    71432e79b76b ("[ARM] Add flush_cache_user_page() for sys_cacheflush()")

This patch removes the alignment code, reducing the amount of flushing required for ranges that are not an exact multiple of PAGE_SIZE.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Will Deacon

Flushing a large, non-faulting VMA from userspace can potentially result in a long time spent flushing the cache line-by-line without preemption occurring (in the case of CONFIG_PREEMPT=n). Whilst this doesn't affect the stability of the system, it can certainly affect the responsiveness and CPU availability for other tasks.

This patch splits up the user cacheflush code so that it flushes in chunks of a page. After each chunk has been flushed, we may reschedule if appropriate and, before processing the next chunk, we allow any pending signals to be handled before resuming from where we left off.

Signed-off-by: Will Deacon <will.deacon@arm.com>
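A simplified sketch of the chunked loop (signal restart handling is more involved in the real patch, which resumes via the restart block):

    static int __do_cache_op(unsigned long start, unsigned long end)
    {
        do {
            unsigned long chunk = min(PAGE_SIZE, end - start);

            if (signal_pending(current))
                return -ERESTART_RESTARTBLOCK;  /* resume from 'start' later */

            if (flush_cache_user_range(start, start + chunk))
                return -EFAULT;

            cond_resched();         /* preemption point between pages */
            start += chunk;
        } while (start < end);

        return 0;
    }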
-
- 16 August 2013, 1 commit
-
-
By Linus Torvalds

Ben Tebulin reported:

    "Since v3.7.2 on two independent machines a very specific Git
    repository fails in 9/10 cases on git-fsck due to an SHA1/memory
    failures. This only occurs on a very specific repository and can be
    reproduced stably on two independent laptops. Git mailing list ran
    out of ideas and for me this looks like some very exotic kernel issue"

and bisected the failure to the backport of commit 53a59fc6 ("mm: limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").

That commit itself is not actually buggy, but what it does is to make it much more likely to hit the partial TLB invalidation case, since it introduces a new case in tlb_next_batch() that previously only ever happened when running out of memory.

The real bug is that the TLB gather virtual memory range setup is subtly buggered. It was introduced in commit 597e1c35 ("mm/mmu_gather: enable tlb flush range in generic mmu_gather"), and the range handling was already fixed at least once in commit e6c495a9 ("mm: fix the TLB range flushed when __tlb_remove_page() runs out of slots"), but that fix was not complete.

The problem with the TLB gather virtual address range is that it isn't set up by the initial tlb_gather_mmu() initialization (which didn't get the TLB range information), but it is set up ad-hoc later by the functions that actually flush the TLB. And so any such case that forgot to update the TLB range entries would potentially miss TLB invalidates.

Rather than try to figure out exactly which particular ad-hoc range setup was missing (I personally suspect it's the hugetlb case in zap_huge_pmd(), which didn't have the same logic as zap_pte_range() did), this patch just gets rid of the problem at the source: make the TLB range information available to tlb_gather_mmu(), and initialize it when initializing all the other tlb gather fields.

This makes the patch larger, but conceptually much simpler. And the end result is much more understandable; even if you want to play games with partial ranges when invalidating the TLB contents in chunks, now the range information is always there, and anybody who doesn't want to bother with it won't introduce subtle bugs.

Ben verified that this fixes his problem.

Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
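Concretely, callers now hand the full range to tlb_gather_mmu() up front instead of patching it in later (a sketch of the 2013-era API; the upstream signature has since changed again):

    struct mmu_gather tlb;

    tlb_gather_mmu(&tlb, mm, start, end);   /* range recorded at init... */
    unmap_vmas(&tlb, vma, start, end);
    tlb_finish_mmu(&tlb, start, end);       /* ...so the final flush can't miss it */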
-
- 14 August 2013, 5 commits
-
-
By Rob Herring

In order to specify a DMA zone size of 4GB on LPAE systems, the sizes need to be 64-bit. So make machine_desc.dma_zone_size and arm_dma_zone_size be phys_addr_t instead of unsigned long.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
-
By Christoffer Dall

The L_PTE_USER define actually has nothing to do with stage 2 mappings, and the L_PTE_S2_RDWR value sets the readable bit, which was what L_PTE_USER was used for before proper handling of stage 2 memory defines.

Changelog:
    [v3]: Drop call to kvm_set_s2pte_writable in mmu.c
    [v2]: Change default mappings to be r/w instead of r/o, as per Marc Zyngier's suggestion.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Stephen Warren

Architectures should fully validate whether kexec is possible as part of machine_kexec_prepare(), so that user-space's kexec_load() operation can report any problems. Performing validation in machine_kexec() itself is too late, since it is not allowed to return.

Prior to this patch, ARM's machine_kexec() was testing after-the-fact whether machine_kexec_prepare() was able to disable all but one CPU. Instead, modify machine_kexec_prepare() to validate all conditions necessary for machine_kexec() to succeed, and BUG if the validation succeeded yet disabling the CPUs didn't actually work.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
By Will Deacon

Commit 15e7e5c1 ("ARM: 7749/1: spinlock: retry trylock operation if strex fails on free lock") modified our arch_spin_trylock to retry the acquisition if the lock appeared uncontended, but the strex failed. This patch does the same for rwlocks, which were missed by the original patch.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
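For the write lock, the shape of the change is a retry loop around the ldrex/strex pair (a sketch close to the patched arch_write_trylock; 0x80000000 is the writer bit): if ldrex saw the lock free but strex failed anyway (e.g. the exclusive monitor was cleared), try again instead of reporting a free lock as busy.

    static inline int arch_write_trylock(arch_rwlock_t *rw)
    {
        unsigned long contended, res;

        do {
            __asm__ __volatile__(
            "   ldrex   %0, [%2]\n"     /* contended = rw->lock */
            "   mov     %1, #0\n"
            "   teq     %0, #0\n"       /* free? */
            "   strexeq %1, %3, [%2]"   /* claim it; res = strex status */
            : "=&r" (contended), "=&r" (res)
            : "r" (&rw->lock), "r" (0x80000000)
            : "cc");
        } while (res);                  /* strex failed on a free lock: retry */

        if (!contended) {
            smp_mb();                   /* acquire barrier */
            return 1;
        }
        return 0;
    }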
-
By Will Deacon

The res variable is written before we've finished with the input operands (namely the lock address), so ensure that we mark it as 'early clobber' to avoid unintended register sharing.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
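A minimal illustration of the constraint (hypothetical helper; the '&' modifier is the point):

    static inline int try_inc(unsigned long *addr)
    {
        unsigned long tmp, res;

        __asm__ __volatile__(
        "   ldrex   %0, [%2]\n"     /* tmp written while addr still live */
        "   add     %0, %0, #1\n"
        "   strex   %1, %0, [%2]"   /* res written while addr still live */
        : "=&r" (tmp), "=&r" (res)  /* "&": never share a register with inputs */
        : "r" (addr)
        : "cc", "memory");

        return res == 0;            /* 1 on success, 0 if the monitor was lost */
    }

Without the '&', the compiler may legally allocate an early-written output to the same register as a still-needed input, corrupting the address mid-sequence.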
-
- 12 August 2013, 8 commits
-
-
By Thomas Petazzoni

Some PCI drivers may need to adjust the pci_bus structure after it has been allocated by the Linux PCI core. The PCI core allows architectures to implement pcibios_add_bus() and pcibios_remove_bus() for this purpose. This commit therefore extends the hw_pci and pci_sys_data structures of the ARM PCI core to allow PCI drivers to register ->add_bus() and ->remove_bus() callbacks in hw_pci, which will get called when a bus is added to or removed from the system. This will be used, for example, by the Marvell PCIe driver to connect a particular PCI bus with its corresponding MSI chip to handle Message Signaled Interrupts.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Thierry Reding <thierry.reding@gmail.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Daniel Price <daniel.price@gmail.com>
Tested-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
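A host driver would hook in roughly like this (hypothetical names throughout; the msi_chip attachment mirrors the Marvell use case described above):

    static void my_add_bus(struct pci_bus *bus)
    {
        bus->msi = &my_msi_chip;        /* route MSIs for this bus */
    }

    static struct hw_pci my_pci __initdata = {
        .nr_controllers = 1,
        .setup          = my_setup,
        .map_irq        = my_map_irq,
        .add_bus        = my_add_bus,   /* invoked from pcibios_add_bus() */
    };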
-
By Will Deacon

flush_cache_vmap contains a dsb to ensure that any cacheflushing operations to flush out newly written ptes have completed. This patch adds the -ishst option to the dsb, since that is all that is required for completing cacheflushing in the inner-shareable domain.

Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Will Deacon

When unlocking a spinlock, we use the sev instruction to signal other CPUs waiting on the lock. Since sev is not a memory access instruction, we require a dsb in order to ensure that the sev is not issued ahead of the store placing the lock in an unlocked state. However, as sev is only concerned with other processors in a multiprocessor system, we can restrict the scope of the preceding dsb to the inner-shareable domain. Furthermore, we can restrict the scope to consider only stores, since there are no independent loads on the unlock path.

A side-effect of this change is that a spin_unlock operation no longer forces completion of pending TLB invalidation, something which we rely on when unlocking runqueues to ensure that CPU migration during TLB maintenance routines doesn't cause us to continue before the operation has completed. This patch adds the -ishst suffix to the ARMv7 definition of dsb_sev() and adds an inner-shareable dsb to the context-switch path when running a preemptible, SMP, v7 kernel.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
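After the change, the ARMv7 flavour looks roughly like this (sketch):

    static inline void dsb_sev(void)
    {
        __asm__ __volatile__(
        "   dsb ishst\n"    /* order the unlock store, inner-shareable, stores only */
        "   sev"            /* wake CPUs waiting in wfe */
        : : : "memory");
    }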
-
By Will Deacon

Our TLB invalidation routines may require a barrier before the maintenance (in order to ensure pending page table writes are visible to the hardware walker) and barriers afterwards (in order to ensure completion of the maintenance and visibility in the instruction stream). Whilst this is expensive, the cost can be reduced somewhat by reducing the scope of the barrier instructions:

    - The barrier before only needs to apply to stores (pte writes)
    - Local ops are required only to affect the non-shareable domain
    - Global ops are required only to affect the inner-shareable domain

This patch makes these changes for the TLB flushing code.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Will Deacon

On ARMv7, the memory barrier instructions take an optional 'option' field which can be used to constrain the effects of a memory barrier based on shareability and access type. This patch allows the caller to pass these options if required, and updates the smp_*() barriers to request inner-shareable barriers, affecting only stores for the _wmb variant. wmb() is also changed to use the -st version of dsb.

Reported-by: Albin Tonnerre <albin.tonnerre@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
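A sketch close to the resulting barrier definitions (simplified from the kernel's barrier.h; the option token is pasted straight into the instruction):

    #define dsb(option) __asm__ __volatile__ ("dsb " #option : : : "memory")
    #define dmb(option) __asm__ __volatile__ ("dmb " #option : : : "memory")

    #define smp_mb()    dmb(ish)    /* inner-shareable, loads and stores */
    #define smp_wmb()   dmb(ishst)  /* inner-shareable, stores only */
    #define wmb()       dsb(st)     /* full system, stores only */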
-
By Will Deacon

Now that the ASID allocator doesn't require inner-shareable maintenance, we can convert the local_flush_bp_all function to perform only non-shareable flushing, in a similar manner to the TLB invalidation routines.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Will Deacon

Branch predictor maintenance is only required when we are either changing the kernel's view of memory (switching tables completely) or dealing with ASID rollover. Both of these use-cases require subsequent TLB invalidation, which has the relevant barrier instructions to ensure completion and visibility of the maintenance, so this patch removes the instruction barrier from [local_]flush_bp_all.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Will Deacon

Inner-shareable TLB invalidation is typically more expensive than local (non-shareable) invalidation, so performing the broadcast for local_flush_tlb_* operations is a waste of cycles and needlessly clobbers entries in the TLBs of other CPUs.

This patch introduces __flush_tlb_* versions for many of the TLB invalidation functions, which only respect inner-shareable variants of the invalidation instructions when presented with the TLB_V7_UIS_FULL flag. The local version is also inlined to prevent SMP_ON_UP kernels from missing flushes, where the __flush variant would be called with the UP flags.

This gains us around 0.5% in hackbench scores for a dual-core A15, but I would expect this to improve as more cores (and clusters) are added to the equation.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
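The split looks roughly like this (illustrative only: the CP15 encodings are architectural, but the real code goes through the kernel's tlb_op/flag machinery rather than raw asm):

    static inline void local_flush_tlb_all(void)    /* this CPU only */
    {
        asm volatile("mcr p15, 0, %0, c8, c7, 0" : : "r" (0));  /* TLBIALL */
        dsb(nsh);               /* non-shareable completion is enough */
    }

    static inline void flush_tlb_all(void)          /* broadcast variant */
    {
        asm volatile("mcr p15, 0, %0, c8, c3, 0" : : "r" (0));  /* TLBIALLIS */
        dsb(ish);
        isb();
    }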
-
- 03 August 2013, 1 commit
-
-
By Russell King

Olof reports that noMMU builds error out with:

    arch/arm/kernel/signal.c: In function 'setup_return':
    arch/arm/kernel/signal.c:413:25: error: 'mm_context_t' has no member named 'sigpage'

This shows one of the evilnesses of IS_ENABLED(). Get rid of it here and replace it with #ifdef's - and as no noMMU platform can make use of sigpage, depend on CONFIG_MMU not CONFIG_ARM_MPU.

Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
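The failure mode, in miniature (sketch; the sigpage member only exists when CONFIG_MMU=y):

    /* IS_ENABLED() keeps the dead branch in the parse tree, so this
     * must still compile when CONFIG_MMU is off - and it can't: */
    if (IS_ENABLED(CONFIG_MMU))
        addr = current->mm->context.sigpage;

    /* The preprocessor removes the reference outright: */
    #ifdef CONFIG_MMU
        addr = current->mm->context.sigpage;
    #endif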
-
- 01 August 2013, 1 commit
-
-
By Russell King

If kuser helpers are not provided by the kernel, disable user access to the vectors page. With the kuser helpers gone, there is no reason for this page to be visible to userspace.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-