1. 26 3月, 2015 1 次提交
  2. 25 3月, 2015 7 次提交
  3. 14 3月, 2015 1 次提交
    • C
      arm64: Invalidate the TLB corresponding to intermediate page table levels · 285994a6
      Catalin Marinas 提交于
      The ARM architecture allows the caching of intermediate page table
      levels and page table freeing requires a sequence like:
      
      	pmd_clear()
      	TLB invalidation
      	pte page freeing
      
      With commit 5e5f6dc1 (arm64: mm: enable HAVE_RCU_TABLE_FREE logic),
      the page table freeing batching was moved from tlb_remove_page() to
      tlb_remove_table(). The former takes care of TLB invalidation as this is
      also shared with pte clearing and page cache page freeing. The latter,
      however, does not invalidate the TLBs for intermediate page table levels
      as it probably relies on the architecture code to do it if required.
      When the mm->mm_users < 2, tlb_remove_table() does not do any batching
      and page table pages are freed before tlb_finish_mmu() which performs
      the actual TLB invalidation.
      
      This patch introduces __tlb_flush_pgtable() for arm64 and calls it from
      the {pte,pmd,pud}_free_tlb() directly without relying on deferred page
      table freeing.
      
      Fixes: 5e5f6dc1 arm64: mm: enable HAVE_RCU_TABLE_FREE logic
      Reported-by: NJon Masters <jcm@redhat.com>
      Tested-by: NJon Masters <jcm@redhat.com>
      Tested-by: NSteve Capper <steve.capper@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      285994a6
  4. 28 2月, 2015 2 次提交
    • L
      arm64: cpuidle: add asm/proc-fns.h inclusion · af4819af
      Lorenzo Pieralisi 提交于
      ARM64 CPUidle driver requires the cpu_do_idle function so that it can
      be used to enter the shallowest idle state, and it is declared in
      asm/proc-fns.h.
      
      The current ARM64 CPUidle driver does not include asm/proc-fns.h
      explicitly and it has so far relied on implicit inclusion from other
      header files.
      
      Owing to some header dependencies reshuffling this currently triggers
      build failures when CONFIG_ARM64_64K_PAGES=y:
      
      drivers/cpuidle/cpuidle-arm64.c: In function "arm64_enter_idle_state"
      drivers/cpuidle/cpuidle-arm64.c:42:3: error: implicit declaration of
      function "cpu_do_idle" [-Werror=implicit-function-declaration]
         cpu_do_idle();
         ^
      
      This patch adds the explicit inclusion of the asm/proc-fns.h header file
      in the arm64 asm/cpuidle.h header file, so that the build breakage is fixed
      and the required header inclusion is added to the appropriate arch back-end
      CPUidle header, already included by the CPUidle arm64 driver, where
      CPUidle arch related function declarations belong.
      Reported-by: NLaura Abbott <lauraa@codeaurora.org>
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      af4819af
    • C
      arm64: Increase the swiotlb buffer size 64MB · a1e50a82
      Catalin Marinas 提交于
      With commit 3690951f (arm64: Use swiotlb late initialisation), the
      swiotlb buffer size is limited to MAX_ORDER_NR_PAGES. However, there are
      platforms with 32-bit only devices that require bounce buffering via
      swiotlb. This patch changes the swiotlb initialisation to an early 64MB
      memblock allocation. In order to get the swiotlb buffer correctly
      allocated (via memblock_virt_alloc_low_nopanic), this patch also defines
      ARCH_LOW_ADDRESS_LIMIT to the maximum physical address capable of 32-bit
      DMA.
      Reported-by: NKefeng Wang <wangkefeng.wang@huawei.com>
      Tested-by: NKefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      a1e50a82
  5. 27 2月, 2015 2 次提交
  6. 23 2月, 2015 2 次提交
  7. 13 2月, 2015 1 次提交
    • A
      all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski 提交于
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56141e3
  8. 12 2月, 2015 1 次提交
  9. 11 2月, 2015 1 次提交
  10. 06 2月, 2015 1 次提交
    • P
      kvm: add halt_poll_ns module parameter · f7819512
      Paolo Bonzini 提交于
      This patch introduces a new module parameter for the KVM module; when it
      is present, KVM attempts a bit of polling on every HLT before scheduling
      itself out via kvm_vcpu_block.
      
      This parameter helps a lot for latency-bound workloads---in particular
      I tested it with O_DSYNC writes with a battery-backed disk in the host.
      In this case, writes are fast (because the data doesn't have to go all
      the way to the platters) but they cannot be merged by either the host or
      the guest.  KVM's performance here is usually around 30% of bare metal,
      or 50% if you use cache=directsync or cache=writethrough (these
      parameters avoid that the guest sends pointless flush requests, and
      at the same time they are not slow because of the battery-backed cache).
      The bad performance happens because on every halt the host CPU decides
      to halt itself too.  When the interrupt comes, the vCPU thread is then
      migrated to a new physical CPU, and in general the latency is horrible
      because the vCPU thread has to be scheduled back in.
      
      With this patch performance reaches 60-65% of bare metal and, more
      important, 99% of what you get if you use idle=poll in the guest.  This
      means that the tunable gets rid of this particular bottleneck, and more
      work can be done to improve performance in the kernel or QEMU.
      
      Of course there is some price to pay; every time an otherwise idle vCPUs
      is interrupted by an interrupt, it will poll unnecessarily and thus
      impose a little load on the host.  The above results were obtained with
      a mostly random value of the parameter (500000), and the load was around
      1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU.
      
      The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll,
      that can be used to tune the parameter.  It counts how many HLT
      instructions received an interrupt during the polling period; each
      successful poll avoids that Linux schedules the VCPU thread out and back
      in, and may also avoid a likely trip to C1 and back for the physical CPU.
      
      While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second.
      Of these halts, almost all are failed polls.  During the benchmark,
      instead, basically all halts end within the polling period, except a more
      or less constant stream of 50 per second coming from vCPUs that are not
      running the benchmark.  The wasted time is thus very low.  Things may
      be slightly different for Windows VMs, which have a ~10 ms timer tick.
      
      The effect is also visible on Marcelo's recently-introduced latency
      test for the TSC deadline timer.  Though of course a non-RT kernel has
      awful latency bounds, the latency of the timer is around 8000-10000 clock
      cycles compared to 20000-120000 without setting halt_poll_ns.  For the TSC
      deadline timer, thus, the effect is both a smaller average latency and
      a smaller variance.
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      f7819512
  11. 03 2月, 2015 1 次提交
  12. 30 1月, 2015 4 次提交
  13. 28 1月, 2015 1 次提交
  14. 27 1月, 2015 3 次提交
    • L
      arm64: kernel: remove ARM64_CPU_SUSPEND config option · af3cfdbf
      Lorenzo Pieralisi 提交于
      ARM64_CPU_SUSPEND config option was introduced to make code providing
      context save/restore selectable only on platforms requiring power
      management capabilities.
      
      Currently ARM64_CPU_SUSPEND depends on the PM_SLEEP config option which
      in turn is set by the SUSPEND config option.
      
      The introduction of CPU_IDLE for arm64 requires that code configured
      by ARM64_CPU_SUSPEND (context save/restore) should be compiled in
      in order to enable the CPU idle driver to rely on CPU operations
      carrying out context save/restore.
      
      The ARM64_CPUIDLE config option (ARM64 generic idle driver) is therefore
      forced to select ARM64_CPU_SUSPEND, even if there may be (ie PM_SLEEP)
      failed dependencies, which is not a clean way of handling the kernel
      configuration option.
      
      For these reasons, this patch removes the ARM64_CPU_SUSPEND config option
      and makes the context save/restore dependent on CPU_PM, which is selected
      whenever either SUSPEND or CPU_IDLE are configured, cleaning up dependencies
      in the process.
      
      This way, code previously configured through ARM64_CPU_SUSPEND is
      compiled in whenever a power management subsystem requires it to be
      present in the kernel (SUSPEND || CPU_IDLE), which is the behaviour
      expected on ARM64 kernels.
      
      The cpu_suspend and cpu_init_idle CPU operations are added only if
      CPU_IDLE is selected, since they are CPU_IDLE specific methods and
      should be grouped and defined accordingly.
      
      PSCI CPU operations are updated to reflect the introduced changes.
      Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      af3cfdbf
    • C
      arm64: Remove asm/syscalls.h · 96486069
      Catalin Marinas 提交于
      This patch moves the sys_rt_sigreturn_wrapper prototype to
      arch/arm64/kernel/sys.c and removes the asm/syscalls.h header.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      96486069
    • C
      arm64: Implement the compat_sys_call_table in C · 0156411b
      Catalin Marinas 提交于
      Unlike the sys_call_table[], the compat one was implemented in sys32.S
      making it impossible to notice discrepancies between the number of
      compat syscalls and the __NR_compat_syscalls macro, the latter having to
      be defined in asm/unistd.h as including asm/unistd32.h would cause
      conflicts on __NR_* definitions. With this patch, incorrect
      __NR_compat_syscalls values will result in a build-time error.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Suggested-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NMark Rutland <mark.rutland@arm.com>
      0156411b
  15. 24 1月, 2015 6 次提交
  16. 23 1月, 2015 1 次提交
    • M
      arm64: Fix overlapping VA allocations · aa03c428
      Mark Rutland 提交于
      PCI IO space was intended to be 16MiB, at 32MiB below MODULES_VADDR, but
      commit d1e6dc91 ("arm64: Add architectural support for PCI")
      extended this to cover the full 32MiB. The final 8KiB of this 32MiB is
      also allocated for the fixmap, allowing for potential clashes between
      the two.
      
      This change was masked by assumptions in mem_init and the page table
      dumping code, which assumed the I/O space to be 16MiB long through
      seaparte hard-coded definitions.
      
      This patch changes the definition of the PCI I/O space allocation to
      live in asm/memory.h, along with the other VA space allocations. As the
      fixmap allocation depends on the number of fixmap entries, this is moved
      below the PCI I/O space allocation. Both the fixmap and PCI I/O space
      are guarded with 2MB of padding. Sites assuming the I/O space was 16MiB
      are moved over use new PCI_IO_{START,END} definitions, which will keep
      in sync with the size of the IO space (now restored to 16MiB).
      
      As a useful side effect, the use of the new PCI_IO_{START,END}
      definitions prevents a build issue in the dumping code due to a (now
      redundant) missing include of io.h for PCI_IOBASE.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Liviu Dudau <liviu.dudau@arm.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      [catalin.marinas@arm.com: reorder FIXADDR and PCI_IO address_markers_idx enum]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      aa03c428
  17. 22 1月, 2015 3 次提交
  18. 21 1月, 2015 2 次提交