1. 19 Feb, 2015 2 commits
    • B
      x86/intel/quark: Add Intel Quark platform support · 8bbc2a13
      Bryan O'Donoghue committed
      Add Intel Quark platform support. Quark needs to pull down all
      unlocked IMRs to ensure agreement with the EFI memory map post
      boot.
      
      This patch adds an entry in Kconfig for Quark as a platform and
      makes IMR support mandatory if selected.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Tested-by: Ong, Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
      Reviewed-by: Andy Shevchenko <andy.schevchenko@gmail.com>
      Reviewed-by: Darren Hart <dvhart@linux.intel.com>
      Reviewed-by: Ong, Boon Leong <boon.leong.ong@intel.com>
      Cc: dvhart@infradead.org
      Link: http://lkml.kernel.org/r/1422635379-12476-3-git-send-email-pure.logic@nexus-software.ie
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • B
      x86/intel/quark: Add Isolated Memory Regions for Quark X1000 · 28a375df
      Bryan O'Donoghue committed
      Intel's Quark X1000 SoC contains a set of registers called
      Isolated Memory Regions. IMRs are accessed over the IOSF mailbox
      interface. IMRs are areas carved out of memory that define
      read/write access rights to the various system agents within the
      Quark system. For a given agent in the system it is possible to
      specify if that agent may read or write an area of memory
      defined by an IMR with a granularity of 1 KiB.
      
      Quark_SecureBootPRM_330234_001.pdf section 4.5 details the
      concept of IMRs; quark-x1000-datasheet.pdf section 12.7.4 details
      the implementation of IMRs in silicon.
      
      eSRAM flush, CPU Snoop write-only, CPU SMM Mode, CPU non-SMM
      mode, RMU and PCIe Virtual Channels (VC0 and VC1) can have
      individual read/write access masks applied to them for a given
      memory region in Quark X1000. This enables IMRs to treat each
      memory transaction type listed above on an individual basis and
      to filter appropriately based on the IMR access mask for the
      memory region. Quark supports eight IMRs.
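
      The access model described above can be sketched roughly as
      follows. This is not the kernel's IMR driver: the struct, the
      function names and the agent bit positions are all hypothetical,
      and the real register layout is defined in the datasheet. The
      sketch only shows per-agent read/write masks and the 1 KiB
      alignment rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, one bit per system agent. */
enum imr_agent_sketch {
	AGENT_ESRAM_FLUSH = 1u << 0,
	AGENT_CPU_SNOOP   = 1u << 1,
	AGENT_CPU_SMM     = 1u << 2,
	AGENT_CPU_NON_SMM = 1u << 3,
	AGENT_RMU         = 1u << 4,
	AGENT_PCIE_VC0    = 1u << 5,
	AGENT_PCIE_VC1    = 1u << 6,
};

#define IMR_SKETCH_ALIGN 1024u	/* 1 KiB granularity, per the text above */

struct imr_region_sketch {
	uint32_t base;	/* first byte covered, must be 1 KiB aligned */
	uint32_t size;	/* length in bytes, must be a multiple of 1 KiB */
	uint32_t rmask;	/* agents allowed to read */
	uint32_t wmask;	/* agents allowed to write */
};

/* A region is well-formed if it is non-empty and 1 KiB aligned. */
static bool imr_region_valid(const struct imr_region_sketch *r)
{
	return r->size != 0 &&
	       (r->base % IMR_SKETCH_ALIGN) == 0 &&
	       (r->size % IMR_SKETCH_ALIGN) == 0;
}

/*
 * Returns true if this region does not forbid the access: either the
 * address lies outside the region, or the agent's bit is set in the
 * relevant mask.
 */
static bool imr_access_allowed(const struct imr_region_sketch *r,
			       uint32_t addr, uint32_t agent, bool write)
{
	if (addr < r->base || addr - r->base >= r->size)
		return true;
	return ((write ? r->wmask : r->rmask) & agent) != 0;
}
```

      With rmask and wmask both set to AGENT_CPU_NON_SMM, a write
      arriving on AGENT_PCIE_VC0 inside the region is rejected, which
      is the DMA-protection case described in the next paragraph.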
      
      Since all of the DMA capable SoC components in the X1000 are
      mapped to VC0 it is possible to define sections of memory as
      invalid for DMA write operations originating from Ethernet, USB,
      SD and any other DMA capable south-cluster component on VC0.
      Similarly it is possible to mark kernel memory as non-SMM mode
      read/write only or to mark BIOS runtime memory as SMM mode
      accessible only depending on the particular memory footprint on
      a given system.
      
      On an IMR violation Quark SoC X1000 systems are configured to
      reset the system, so ensuring that the IMR memory map is
      consistent with the EFI provided memory map is critical to
      ensure no IMR violations reset the system.
      
      The API for accessing IMRs is based on MTRR code but doesn't
      provide a /proc or /sys interface to manipulate IMRs. Defining
      the size and extent of IMRs is exclusively the domain of
      in-kernel code.
      
      Quark firmware sets up a series of locked IMRs around pieces of
      memory that firmware owns such as ACPI runtime data. During boot
      a series of unlocked IMRs are placed around items in memory to
      guarantee no DMA modification of those items can take place.
      Grub also places an unlocked IMR around the kernel boot params
      data structure and compressed kernel image. It is necessary for
      the kernel to tear down all unlocked IMRs in order to ensure
      that the kernel's view of memory passed via the EFI memory map
      is consistent with the IMR memory map. Without tearing down all
      unlocked IMRs on boot, transitory IMRs such as those used to
      protect the compressed kernel image will cause IMR violations
      and system reboots.
      
      The IMR init code tears down all unlocked IMRs and sets a
      protective IMR around the kernel .text and .rodata as one
      contiguous block. This sanitizes the IMR memory map with respect
      to the EFI memory map and protects the read-only portions of the
      kernel from unwarranted DMA access.
      Tested-by: Ong, Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
      Reviewed-by: Andy Shevchenko <andy.schevchenko@gmail.com>
      Reviewed-by: Darren Hart <dvhart@linux.intel.com>
      Reviewed-by: Ong, Boon Leong <boon.leong.ong@intel.com>
      Cc: andy.shevchenko@gmail.com
      Cc: dvhart@infradead.org
      Link: http://lkml.kernel.org/r/1422635379-12476-2-git-send-email-pure.logic@nexus-software.ie
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 09 Feb, 2015 1 commit
  3. 06 Feb, 2015 2 commits
  4. 04 Feb, 2015 3 commits
  5. 03 Feb, 2015 1 commit
    • W
      ARM: 8299/1: mm: ensure local active ASID is marked as allocated on rollover · 8e648066
      Will Deacon committed
      Commit e1a5848e ("ARM: 7924/1: mm: don't bother with reserved ttbr0
      when running with LPAE") removed the use of the reserved TTBR0 value
      for LPAE systems, since the ASID is held in the TTBR and can be updated
      atomically with the pgd of the next mm.
      
      Unfortunately, this patch forgot to update flush_context, which
      deliberately avoids marking the local active ASID as allocated, since we
      used to switch via ASID zero and didn't need to allocate the ASID of
      the previous mm. The side-effect of this is that we can allocate the
      same ASID to the next mm and, between flushing the local TLB and updating
      TTBR0, we can perform speculative TLB fills for userspace nG mappings
      using the page table of the previous mm.
      
      The consequence of this is that the next mm can erroneously hit some
      mappings of the previous mm. Note that this was made significantly
      harder to hit by a391263c ("ARM: 8203/1: mm: try to re-use old ASID
      assignments following a rollover") but is still theoretically possible.
      
      This patch fixes the problem by removing the code from flush_context
      that forces the allocated ASID to zero for the local CPU. Many thanks
      to the Broadcom guys for tracking this one down.
      
      Fixes: e1a5848e ("ARM: 7924/1: mm: don't bother with reserved ttbr0 when running with LPAE")
      
      Cc: <stable@vger.kernel.org> # v3.14+
      Reported-by: Raymond Ngun <rngun@broadcom.com>
      Tested-by: Raymond Ngun <rngun@broadcom.com>
      Reviewed-by: Gregory Fong <gregory.0xf0@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  6. 02 Feb, 2015 1 commit
  7. 01 Feb, 2015 3 commits
    • A
      x86_64, entry: Remove the syscall exit audit and schedule optimizations · 96b6352c
      Andy Lutomirski committed
      We used to optimize rescheduling and audit on syscall exit.  Now
      that the full slow path is reasonably fast, remove these
      optimizations.  Syscall exit auditing is now handled exclusively by
      syscall_trace_leave.
      
      This adds something like 10ns to the previously optimized paths on
      my computer, presumably due mostly to SAVE_REST / RESTORE_REST.
      
      I think that we should eventually replace both the syscall and
      non-paranoid interrupt exit slow paths with a pair of C functions
      along the lines of the syscall entry hooks.
      
      Link: http://lkml.kernel.org/r/22f2aa4a0361707a5cfb1de9d45260b39965dead.1421453410.git.luto@amacapital.net
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
    • A
      x86_64, entry: Use sysret to return to userspace when possible · 2a23c6b8
      Andy Lutomirski committed
      The x86_64 entry code currently jumps through complex and
      inconsistent hoops to try to minimize the impact of syscall exit
      work.  For a true fast-path syscall, almost nothing needs to be
      done, so returning is just a check for exit work and sysret.  For a
      full slow-path return from a syscall, the C exit hook is invoked if
      needed and we join the iret path.
      
      Using iret to return to userspace is very slow, so the entry code
      has accumulated various special cases to try to do certain forms of
      exit work without invoking iret.  This is error-prone, since it
      duplicates assembly code paths, and it's dangerous, since sysret
      can malfunction in interesting ways if used carelessly.  It's
      also inefficient, since a lot of useful cases aren't optimized
      and therefore force an iret out of a combination of paranoia and
      the fact that no one has bothered to write even more asm code
      to avoid it.
      
      I would argue that this approach is backwards.  Rather than trying
      to avoid the iret path, we should instead try to make the iret path
      fast.  Under a specific set of conditions, iret is unnecessary.  In
      particular, if RIP==RCX, RFLAGS==R11, RIP is canonical, RF is not
      set, and both SS and CS are as expected, then
      movq 32(%rsp),%rsp;sysret does the same thing as iret.  This set of
      conditions is nearly always satisfied on return from syscalls, and
      it can even occasionally be satisfied on return from an irq.
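
      The set of conditions listed above can be written as a single
      predicate. The following is a sketch under stated assumptions,
      not the assembly this patch adds: the struct and function names
      are hypothetical, the selector values are the usual x86_64
      __USER_CS and __USER_DS, and the canonical check assumes 48-bit
      virtual addresses.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_EFLAGS_RF	(1ull << 16)	/* resume flag, bit 16 of RFLAGS */
#define SK_USER_CS	0x33		/* assumed 64-bit user code selector */
#define SK_USER_DS	0x2b		/* assumed user stack selector */

/* Hypothetical subset of the saved user register frame. */
struct regs_sketch {
	uint64_t ip, cx, flags, r11, cs, ss;
};

/* Canonical with 48-bit addressing: bits 63..47 are all 0 or all 1. */
static bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* True when sysret would restore the same state as iret. */
static bool can_use_sysret(const struct regs_sketch *r)
{
	return r->ip == r->cx &&		/* sysret loads RIP from RCX */
	       r->flags == r->r11 &&		/* and RFLAGS from R11 */
	       is_canonical48(r->ip) &&
	       !(r->flags & SK_EFLAGS_RF) &&
	       r->cs == SK_USER_CS &&
	       r->ss == SK_USER_DS;
}
```

      Any frame that fails one of these checks would simply take the
      iret path, so the predicate only needs to be conservative.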
      
      Even with the careful checks for sysret applicability, this cuts
      nearly 80ns off of the overhead from syscalls with unoptimized exit
      work.  This includes tracing and context tracking, and any return
      that invokes KVM's user return notifier.  For example, the cost of
      getpid with CONFIG_CONTEXT_TRACKING_FORCE=y drops from ~360ns to
      ~280ns on my computer.
      
      This may allow the removal and even eventual conversion to C
      of a respectable amount of exit asm.
      
      This may require further tweaking to give the full benefit on Xen.
      
      It may be worthwhile to adjust signal delivery and exec to try to hit
      the sysret path.
      
      This does not optimize returns to 32-bit userspace.  Making the same
      optimization for CS == __USER32_CS is conceptually straightforward,
      but it will require some tedious code to handle the differences
      between sysretl and sysexitl.
      
      Link: http://lkml.kernel.org/r/71428f63e681e1b4aa1a781e3ef7c27f027d1103.1421453410.git.luto@amacapital.net
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
    • A
      x86, traps: Fix ist_enter from userspace · b926e6f6
      Andy Lutomirski committed
      context_tracking_user_exit() has no effect if in_interrupt() returns true,
      so ist_enter() didn't work.  Fix it by calling exception_enter(), and thus
      context_tracking_user_exit(), before incrementing the preempt count.
      
      This also adds an assertion that will catch the problem reliably if
      CONFIG_PROVE_RCU=y to help prevent the bug from being reintroduced.
      
      Link: http://lkml.kernel.org/r/261ebee6aee55a4724746d0d7024697013c40a08.1422709102.git.luto@amacapital.net
      Fixes: 95927475 x86, traps: Track entry into and exit from IST context
      Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
  8. 31 Jan, 2015 5 commits
    • J
      MIPS: fork: Fix MSA/FPU/DSP context duplication race · 39148e94
      James Hogan committed
      There is a race in the MIPS fork code which allows the child to get a
      stale copy of parent MSA/FPU/DSP state that is active in hardware
      registers when the fork() is called. This is because copy_thread() saves
      the live register state into the child context only if the hardware is
      currently in use, apparently on the assumption that the hardware state
      cannot have been saved and disabled since the initial duplication of the
      task_struct. However preemption is certainly possible during this
      window.
      
      An example sequence of events is as follows:
      
      1) The parent userland process puts important data into saved floating
         point registers ($f20-$f31), which are then dirty compared to the
         process' stored context.
      
      2) The parent process calls fork() which does a clone system call.
      
      3) In the kernel, do_fork() -> copy_process() -> dup_task_struct() ->
         arch_dup_task_struct() (which uses the weakly defined default
         implementation). This duplicates the parent process' task context,
         which includes a stale version of its FP context from when it was
         last saved, probably some time before (1).
      
      4) At some point before copy_process() calls copy_thread(), such as when
         duplicating the memory map, the process is descheduled. Perhaps it is
         preempted asynchronously, or perhaps it sleeps while blocked on a
         mutex. The dirty FP state in the FP registers is saved to the parent
         process' context and the FPU is disabled.
      
      5) When the process is rescheduled again it continues copying state
         until it gets to copy_thread(), which checks whether the FPU is in
         use, so that it can copy that dirty state to the child process' task
         context. Because of the deschedule however the FPU is not in use, so
         the child process' context is left with stale FP context from the
         last time the parent saved it (some time before (1)).
      
      6) When the new child process is scheduled it reads the important data
         from the saved floating point register, and ends up doing a NULL
         pointer dereference as a result of the stale data.
      
      This use of saved floating point registers across function calls can be
      triggered fairly easily by explicitly using inline asm with a current
      (MIPS R2) compiler, but is far more likely to happen unintentionally
      with a MIPS R6 compiler where the FP registers are more likely to get
      used as scratch registers for storing non-fp data.
      
      It is easily fixed, in the same way that other architectures do it, by
      overriding the implementation of arch_dup_task_struct() to sync the
      dirty hardware state to the parent process' task context *prior* to
      duplicating it, rather than copying straight to the child process' task
      context in copy_thread(). Note, the FPU hardware is not disabled so the
      parent process may continue executing with the live register context,
      but now the child process is guaranteed to have an identical copy of it
      at that point.
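
      The ordering of the fix can be modelled in a few lines of plain
      C. This is a toy model, not the MIPS implementation: the types,
      the hw_fpu variable standing in for the live hardware registers,
      and the function names are all hypothetical. Clearing the child's
      fpu_live flag is an assumption of the model, since the child does
      not own the hardware state.

```c
#include <stdbool.h>

struct fp_ctx_sketch {
	double regs[32];
};

struct task_sketch {
	struct fp_ctx_sketch saved_fp;	/* context stored in the task */
	bool fpu_live;			/* task state currently lives in hw_fpu */
};

/* Stand-in for the live hardware FP registers. */
static struct fp_ctx_sketch hw_fpu;

/* Write dirty hardware state back to the task without disabling the FPU. */
static void sync_live_fp_state(struct task_sketch *t)
{
	if (t->fpu_live)
		t->saved_fp = hw_fpu;
}

/*
 * Sync first, then duplicate: the child's copy is taken from a parent
 * context that already matches the hardware, so a deschedule later in
 * fork can no longer leave the child with stale FP state.
 */
static void dup_task_sketch(struct task_sketch *dst, struct task_sketch *src)
{
	sync_live_fp_state(src);
	*dst = *src;
	dst->fpu_live = false;	/* model assumption: child starts without the FPU */
}
```

      The parent keeps fpu_live set after the call, matching the note
      above that the FPU hardware is not disabled.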
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Reported-by: Matthew Fortune <matthew.fortune@imgtec.com>
      Tested-by: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9075/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • D
      MIPS: Fix C0_Pagegrain[IEC] support. · 9ead8632
      David Daney committed
      The following commits:
      
        5890f70f (MIPS: Use dedicated exception handler if CPU supports RI/XI exceptions)
        6575b1d4 (MIPS: kernel: cpu-probe: Detect unique RI/XI exceptions)
      
      break the kernel for *all* existing MIPS CPUs that implement the
      CP0_PageGrain[IEC] bit.  They cause the TLB exception handlers to be
      generated without the legacy execute-inhibit handling, but never set
      the CP0_PageGrain[IEC] bit to activate the use of dedicated exception
      vectors for execute-inhibit exceptions.  The result is that upon
      detection of an execute-inhibit violation, we loop forever in the TLB
      exception handlers instead of sending SIGSEGV to the task.
      
      If we are generating TLB exception handlers expecting separate
      vectors, we must also enable the CP0_PageGrain[IEC] feature.
      
      The bug was introduced in kernel version 3.17.
      Signed-off-by: David Daney <david.daney@cavium.com>
      Cc: <stable@vger.kernel.org>
      Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: http://patchwork.linux-mips.org/patch/8880/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • J
      MIPS: traps: Fix inline asm ctc1 missing .set hardfloat · d76e9b9f
      James Hogan committed
      Commit 842dfc11 ("MIPS: Fix build with binutils 2.24.51+") in v3.18
      enabled -msoft-float and sprinkled ".set hardfloat" where necessary to
      use FP instructions. However it missed enable_restore_fp_context() which
      since v3.17 does a ctc1 with inline assembly, causing the following
      assembler errors on Mentor's 2014.05 toolchain:
      
      {standard input}: Assembler messages:
      {standard input}:2913: Error: opcode not supported on this processor: mips32r2 (mips32r2) `ctc1 $2,$31'
      scripts/Makefile.build:257: recipe for target 'arch/mips/kernel/traps.o' failed
      
      Fix that to use the new write_32bit_cp1_register() macro so that ".set
      hardfloat" is automatically added when -msoft-float is in use.
      
      Fixes 842dfc11 ("MIPS: Fix build with binutils 2.24.51+")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.18+, depends on "MIPS: mipsregs.h: Add write_32bit_cp1_register()"
      Patchwork: https://patchwork.linux-mips.org/patch/9173/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • J
      MIPS: mipsregs.h: Add write_32bit_cp1_register() · 5e32033e
      James Hogan committed
      Add a write_32bit_cp1_register() macro to complement the
      read_32bit_cp1_register() macro. This is to abstract whether .set
      hardfloat needs to be used based on GAS_HAS_SET_HARDFLOAT.
      
      The implementation of _read_32bit_cp1_register() .sets mips1 due to
      failure of gas v2.19 to assemble cfc1 for Octeon (see commit
      25c30003 ("MIPS: Override assembler target architecture for
      octeon.")). I haven't copied this over to _write_32bit_cp1_register() as
      I'm uncertain whether it applies to ctc1 too, or whether anybody cares
      about that version of binutils any longer.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: David Daney <david.daney@cavium.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9172/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • G
      arc: mm: Fix build failure · e262eb93
      Guenter Roeck committed
      Fix misspelled define.
      
      Fixes: 33692f27 ("vm: add VM_FAULT_SIGSEGV handling support")
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 30 Jan, 2015 8 commits
    • R
      KVM: x86: check LAPIC presence when building apic_map · df04d1d1
      Radim Krčmář committed
      We forgot to re-check LAPIC after splitting the loop in commit
      173beedc (KVM: x86: Software disabled APIC should still deliver
      NMIs, 2014-11-02).
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Fixes: 173beedc
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • H
      MIPS: Fix kernel lockup or crash after CPU offline/online · c7754e75
      Hemmo Nieminen committed
      As printk() invocation can cause e.g. a TLB miss, printk() cannot be
      called before the exception handlers have been properly initialized.
      This can happen e.g. when netconsole has been loaded as a kernel module
      and the TLB table has been cleared when a CPU was offline.
      
      Call cpu_report() in start_secondary() only after the exception handlers
      have been initialized to fix this.
      
      Without the patch the kernel will randomly either lockup or crash
      after a CPU is onlined and the console driver is a module.
      Signed-off-by: Hemmo Nieminen <hemmo.nieminen@iki.fi>
      Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: stable@vger.kernel.org
      Cc: David Daney <david.daney@cavium.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/8953/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • A
      MIPS: OCTEON: fix kernel crash when offlining a CPU · 63a87fe0
      Aaro Koskinen committed
      octeon_cpu_disable() will unconditionally enable interrupts when called.
      We can assume that the routine is always called with interrupts disabled,
      so just delete the incorrect local_irq_disable/enable().
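
      Why the disable/enable pair is wrong when the caller already has
      interrupts off can be shown with a toy model. Nothing here is
      OCTEON code: the irqs_enabled flag stands in for the CPU's
      interrupt-enable state, and every function name is hypothetical.

```c
#include <stdbool.h>

/* Stand-in for the CPU interrupt-enable state. */
static bool irqs_enabled;

static void sketch_irq_disable(void) { irqs_enabled = false; }
static void sketch_irq_enable(void)  { irqs_enabled = true; }

/*
 * Pattern being removed: the trailing enable switches interrupts on
 * regardless of the state the caller entered with.
 */
static void cpu_disable_buggy(void)
{
	sketch_irq_disable();
	/* ... take the CPU out of service ... */
	sketch_irq_enable();
}

/*
 * Pattern after the fix: rely on the caller having interrupts
 * disabled and leave the interrupt state untouched.
 */
static void cpu_disable_fixed(void)
{
	/* ... take the CPU out of service ... */
}
```

      Entering cpu_disable_buggy() with irqs_enabled false leaves it
      true on return, which is the state the hotplug path shown in the
      trace below does not expect.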
      
      The patch fixes the following crash when offlining a CPU:
      
      [   93.818785] ------------[ cut here ]------------
      [   93.823421] WARNING: CPU: 1 PID: 10 at kernel/smp.c:231 flush_smp_call_function_queue+0x1c4/0x1d0()
      [   93.836215] Modules linked in:
      [   93.839287] CPU: 1 PID: 10 Comm: migration/1 Not tainted 3.19.0-rc4-octeon-los_b5f0 #1
      [   93.847212] Stack : 0000000000000001 ffffffff81b2cf90 0000000000000004 ffffffff81630000
      	  0000000000000000 0000000000000000 0000000000000000 000000000000004a
      	  0000000000000006 ffffffff8117e550 0000000000000000 0000000000000000
      	  ffffffff81b30000 ffffffff81b26808 8000000032c77748 ffffffff81627e07
      	  ffffffff81595ec8 ffffffff81b26808 000000000000000a 0000000000000001
      	  0000000000000001 0000000000000003 0000000010008ce1 ffffffff815030c8
      	  8000000032cbbb38 ffffffff8113d42c 0000000010008ce1 ffffffff8117f36c
      	  8000000032c77300 8000000032cbba50 0000000000000001 ffffffff81503984
      	  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      	  0000000000000000 ffffffff81121668 0000000000000000 0000000000000000
      	  ...
      [   93.912819] Call Trace:
      [   93.915273] [<ffffffff81121668>] show_stack+0x68/0x80
      [   93.920335] [<ffffffff81503984>] dump_stack+0x6c/0x90
      [   93.925395] [<ffffffff8113d58c>] warn_slowpath_common+0x94/0xd8
      [   93.931324] [<ffffffff811a402c>] flush_smp_call_function_queue+0x1c4/0x1d0
      [   93.938208] [<ffffffff811a4128>] hotplug_cfd+0xf0/0x108
      [   93.943444] [<ffffffff8115bacc>] notifier_call_chain+0x5c/0xb8
      [   93.949286] [<ffffffff8113d704>] cpu_notify+0x24/0x60
      [   93.954348] [<ffffffff81501738>] take_cpu_down+0x38/0x58
      [   93.959670] [<ffffffff811b343c>] multi_cpu_stop+0x154/0x180
      [   93.965250] [<ffffffff811b3768>] cpu_stopper_thread+0xd8/0x160
      [   93.971093] [<ffffffff8115ea4c>] smpboot_thread_fn+0x1ec/0x1f8
      [   93.976936] [<ffffffff8115ab04>] kthread+0xd4/0xf0
      [   93.981735] [<ffffffff8111c4f0>] ret_from_kernel_thread+0x14/0x1c
      [   93.987835]
      [   93.989326] ---[ end trace c9e3815ee655bda9 ]---
      [   93.993951] Kernel bug detected[#1]:
      [   93.997533] CPU: 1 PID: 10 Comm: migration/1 Tainted: G        W      3.19.0-rc4-octeon-los_b5f0 #1
      [   94.006591] task: 8000000032c77300 ti: 8000000032cb8000 task.ti: 8000000032cb8000
      [   94.014081] $ 0   : 0000000000000000 0000000010000ce1 0000000000000001 ffffffff81620000
      [   94.022146] $ 4   : 8000000002c72ac0 0000000000000000 00000000000001a7 ffffffff813b06f0
      [   94.030210] $ 8   : ffffffff813b20d8 0000000000000000 0000000000000000 ffffffff81630000
      [   94.038275] $12   : 0000000000000087 0000000000000000 0000000000000086 0000000000000000
      [   94.046339] $16   : ffffffff81623168 0000000000000001 0000000000000000 0000000000000008
      [   94.054405] $20   : 0000000000000001 0000000000000001 0000000000000001 0000000000000003
      [   94.062470] $24   : 0000000000000038 ffffffff813b7f10
      [   94.070536] $28   : 8000000032cb8000 8000000032cbbc20 0000000010008ce1 ffffffff811bcaf4
      [   94.078601] Hi    : 0000000000f188e8
      [   94.082179] Lo    : d4fdf3b646c09d55
      [   94.085760] epc   : ffffffff811bc9d0 irq_work_run_list+0x8/0xf8
      [   94.091686]     Tainted: G        W
      [   94.095613] ra    : ffffffff811bcaf4 irq_work_run+0x34/0x60
      [   94.101192] Status: 10000ce3	KX SX UX KERNEL EXL IE
      [   94.106235] Cause : 40808034
      [   94.109119] PrId  : 000d9301 (Cavium Octeon II)
      [   94.113653] Modules linked in:
      [   94.116721] Process migration/1 (pid: 10, threadinfo=8000000032cb8000, task=8000000032c77300, tls=0000000000000000)
      [   94.127168] Stack : 8000000002c74c80 ffffffff811a4128 0000000000000001 ffffffff81635720
      	  fffffffffffffff2 ffffffff8115bacc 80000000320fbce0 80000000320fbca4
      	  80000000320fbc80 0000000000000002 0000000000000004 ffffffff8113d704
      	  80000000320fbce0 ffffffff81501738 0000000000000003 ffffffff811b343c
      	  8000000002c72aa0 8000000002c72aa8 ffffffff8159cae8 ffffffff8159caa0
      	  ffffffff81650000 80000000320fbbf0 80000000320fbc80 ffffffff811b32e8
      	  0000000000000000 ffffffff811b3768 ffffffff81622b80 ffffffff815148a8
      	  8000000032c77300 8000000002c73e80 ffffffff815148a8 8000000032c77300
      	  ffffffff81622b80 ffffffff815148a8 8000000032c77300 ffffffff81503f48
      	  ffffffff8115ea0c ffffffff81620000 0000000000000000 ffffffff81174d64
      	  ...
      [   94.192771] Call Trace:
      [   94.195222] [<ffffffff811bc9d0>] irq_work_run_list+0x8/0xf8
      [   94.200802] [<ffffffff811bcaf4>] irq_work_run+0x34/0x60
      [   94.206036] [<ffffffff811a4128>] hotplug_cfd+0xf0/0x108
      [   94.211269] [<ffffffff8115bacc>] notifier_call_chain+0x5c/0xb8
      [   94.217111] [<ffffffff8113d704>] cpu_notify+0x24/0x60
      [   94.222171] [<ffffffff81501738>] take_cpu_down+0x38/0x58
      [   94.227491] [<ffffffff811b343c>] multi_cpu_stop+0x154/0x180
      [   94.233072] [<ffffffff811b3768>] cpu_stopper_thread+0xd8/0x160
      [   94.238914] [<ffffffff8115ea4c>] smpboot_thread_fn+0x1ec/0x1f8
      [   94.244757] [<ffffffff8115ab04>] kthread+0xd4/0xf0
      [   94.249555] [<ffffffff8111c4f0>] ret_from_kernel_thread+0x14/0x1c
      [   94.255654]
      [   94.257146]
      Code: a2423c40  40026000  30420001 <00020336> dc820000  10400037  00000000  0000010f  0000010f
      [   94.267183] ---[ end trace c9e3815ee655bdaa ]---
      [   94.271804] Fatal exception: panic in 5 seconds
      Reported-by: Hemmo Nieminen <hemmo.nieminen@iki.fi>
      Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Acked-by: David Daney <david.daney@cavium.com>
      Cc: stable@vger.kernel.org # v3.18+
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/8952/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • M
      arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault · 0d3e4d4f
      Marc Zyngier committed
      When handling a fault in stage-2, we need to resync I$ and D$, just
      to be sure we don't leave any old cache line behind.
      
      That's very good, except that we do so using the *user* address.
      Under heavy load (swapping like crazy), we may end up in a situation
      where the page gets mapped in stage-2 while being unmapped from
      userspace by another CPU.
      
      At that point, the DC/IC instructions can generate a fault, which
      we handle with kvm->mmu_lock held. The box quickly deadlocks, user
      is unhappy.
      
      Instead, perform this invalidation through the kernel mapping,
      which is guaranteed to be present. The box is much happier, and so
      am I.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • M
      arm/arm64: KVM: Invalidate data cache on unmap · 363ef89f
      Marc Zyngier committed
      Let's assume a guest has created an uncached mapping, and written
      to that page. Let's also assume that the host uses a cache-coherent
      IO subsystem. Let's finally assume that the host is under memory
      pressure and starts to swap things out.
      
      Before this "uncached" page is evicted, we need to make sure
      we invalidate potential speculated, clean cache lines that are
      sitting there, or the IO subsystem is going to swap out the
      cached view, losing the data that has been written directly
      into memory.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • M
      arm/arm64: KVM: Use set/way op trapping to track the state of the caches · 3c1e7165
      Marc Zyngier committed
      Trying to emulate the behaviour of set/way cache ops is fairly
      pointless, as there are too many ways we can end up missing stuff.
      Also, there are some system caches out there that simply ignore
      set/way operations.
      
      So instead of trying to implement them, let's convert it to VA ops,
      and use them as a way to re-enable the trapping of VM ops. That way,
      we can detect the point when the MMU/caches are turned off, and do
      a full VM flush (which is what the guest was trying to do anyway).
      
      This allows a 32bit zImage to boot on the APM thingy, and will
      probably help bootloaders in general.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • L
      arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device() · eab8d653
      Laurent Pinchart committed
      Commit 4bb25789 ("arm: dma-mapping: plumb our iommu mapping ops
      into arch_setup_dma_ops") moved the setting of the DMA operations from
      arm_iommu_attach_device() to arch_setup_dma_ops() where the DMA
      operations to be used are selected based on whether the device is
      connected to an IOMMU. However, the IOMMU detection scheme requires the
      IOMMU driver to be ported to the new IOMMU of_xlate API. As no driver
      has been ported yet, this effectively breaks all IOMMU ARM users that
      depend on the IOMMU being handled transparently by the DMA mapping API.
      
      Fix this by restoring the setting of DMA IOMMU ops in
      arm_iommu_attach_device() and splitting the rest of the function into a
      new internal __arm_iommu_attach_device() function, called by
      arch_setup_dma_ops().
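
      The shape of that split can be sketched as below. It is an
      illustration only: the types and names are hypothetical stand-ins
      rather than the ARM dma-mapping API, and the internal helper does
      far less than the real attach path.

```c
#include <stddef.h>

struct dma_ops_sketch {
	const char *name;
};

static const struct dma_ops_sketch iommu_ops_sketch = { "iommu" };

struct device_sketch {
	const struct dma_ops_sketch *ops;	/* DMA operations in use */
	const void *mapping;			/* attached IOMMU mapping */
};

/*
 * Internal helper: attaches the mapping only. In this sketch it is
 * what the arch setup path would call, since that path selects the
 * DMA ops itself.
 */
static int attach_internal_sketch(struct device_sketch *dev,
				  const void *mapping)
{
	if (mapping == NULL)
		return -1;
	dev->mapping = mapping;
	return 0;
}

/*
 * Public entry point: attaches the mapping and, as before the
 * regression, also installs the IOMMU DMA ops for callers that rely
 * on this happening transparently.
 */
static int attach_sketch(struct device_sketch *dev, const void *mapping)
{
	int err = attach_internal_sketch(dev, mapping);

	if (err)
		return err;
	dev->ops = &iommu_ops_sketch;
	return 0;
}
```

      Keeping the ops assignment in the public wrapper lets existing
      callers keep working while the internal helper stays usable from
      the setup path.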
      Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • L
      vm: add VM_FAULT_SIGSEGV handling support · 33692f27
      Linus Torvalds committed
      The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
      "you should SIGSEGV" error, because the SIGSEGV case was generally
      handled by the caller - usually the architecture fault handler.
      
      That results in lots of duplication - all the architecture fault
      handlers end up doing very similar "look up vma, check permissions, do
      retries etc" - but it generally works.  However, there are cases where
      the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
      
      In particular, when accessing the stack guard page, libsigsegv expects a
      SIGSEGV.  And it usually got one, because the stack growth is handled by
      that duplicated architecture fault handler.
      
      However, when the generic VM layer started propagating the error return
      from the stack expansion in commit fee7e49d ("mm: propagate error
      from stack expansion even for guard page"), that now exposed the
      existing VM_FAULT_SIGBUS result to user space.  And user space really
      expected SIGSEGV, not SIGBUS.
      
      To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
      duplicate architecture fault handlers about it.  They all already have
      the code to handle SIGSEGV, so it's about just tying that new return
      value to the existing code, but it's all a bit annoying.
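
      Tying the new return value to existing code amounts to one more
      branch in each architecture's handler. The sketch below shows the
      idea only: the flag values, the outcome enum and the function
      name are hypothetical and do not match the kernel's definitions.

```c
/* Hypothetical fault-result bits returned by the generic VM layer. */
#define SK_VM_FAULT_OOM		0x1u
#define SK_VM_FAULT_SIGBUS	0x2u
#define SK_VM_FAULT_SIGSEGV	0x4u	/* the newly added result */

/* What the architecture handler should do with the result. */
enum fault_outcome_sketch {
	SK_NO_SIGNAL,
	SK_OUT_OF_MEMORY,
	SK_SEND_SIGBUS,
	SK_SEND_SIGSEGV,
};

/*
 * Map a fault result to an action. The SIGSEGV branch is the one this
 * kind of patch adds; it would jump to the handler's existing
 * bad-area path instead of falling through to SIGBUS.
 */
static enum fault_outcome_sketch fault_outcome(unsigned int fault)
{
	if (fault & SK_VM_FAULT_OOM)
		return SK_OUT_OF_MEMORY;
	if (fault & SK_VM_FAULT_SIGSEGV)
		return SK_SEND_SIGSEGV;
	if (fault & SK_VM_FAULT_SIGBUS)
		return SK_SEND_SIGBUS;
	return SK_NO_SIGNAL;
}
```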
      
      This is the mindless minimal patch to do this.  A more extensive patch
      would be to try to gather up the mostly shared fault handling logic into
      one generic helper routine, and long-term we really should do that
      cleanup.
      
      Just from this patch, you can generally see that most architectures just
      copied (directly or indirectly) the old x86 way of doing things, but in
      the meantime that original x86 model has been improved to hold the VM
      semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
      "newer" things, so it would be a good idea to bring all those
      improvements to the generic case and teach other architectures about
      them too.
      Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
      Tested-by: Jan Engelhardt <jengelh@inai.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
      Cc: linux-arch@vger.kernel.org
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 29 Jan, 2015 8 commits
  11. 28 Jan, 2015 4 commits
  12. 27 Jan, 2015 1 commit
  13. 26 Jan, 2015 1 commit