1. 19 Feb, 2015 2 commits
    • x86/intel/quark: Add Intel Quark platform support · 8bbc2a13
      Bryan O'Donoghue committed
      Add Intel Quark platform support. Quark needs to pull down all
      unlocked IMRs to ensure agreement with the EFI memory map post
      boot.
      
      This patch adds an entry in Kconfig for Quark as a platform and
      makes IMR support mandatory if selected.
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Tested-by: Ong, Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
      Reviewed-by: Andy Shevchenko <andy.schevchenko@gmail.com>
      Reviewed-by: Darren Hart <dvhart@linux.intel.com>
      Reviewed-by: Ong, Boon Leong <boon.leong.ong@intel.com>
      Cc: dvhart@infradead.org
      Link: http://lkml.kernel.org/r/1422635379-12476-3-git-send-email-pure.logic@nexus-software.ie
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/intel/quark: Add Isolated Memory Regions for Quark X1000 · 28a375df
      Bryan O'Donoghue committed
      Intel's Quark X1000 SoC contains a set of registers called
      Isolated Memory Regions. IMRs are accessed over the IOSF mailbox
      interface. IMRs are areas carved out of memory that define
      read/write access rights to the various system agents within the
      Quark system. For a given agent in the system it is possible to
      specify if that agent may read or write an area of memory
      defined by an IMR with a granularity of 1 KiB.
      
      Quark_SecureBootPRM_330234_001.pdf section 4.5 details the
      concept of IMRs; quark-x1000-datasheet.pdf section 12.7.4 details
      the implementation of IMRs in silicon.
      
      eSRAM flush, CPU Snoop write-only, CPU SMM Mode, CPU non-SMM
      mode, RMU and PCIe Virtual Channels (VC0 and VC1) can have
      individual read/write access masks applied to them for a given
      memory region in Quark X1000. This enables IMRs to treat each
      memory transaction type listed above on an individual basis and
      to filter appropriately based on the IMR access mask for the
      memory region. Quark supports eight IMRs.
      
      Since all of the DMA capable SoC components in the X1000 are
      mapped to VC0 it is possible to define sections of memory as
      invalid for DMA write operations originating from Ethernet, USB,
      SD and any other DMA capable south-cluster component on VC0.
      Similarly it is possible to mark kernel memory as non-SMM mode
      read/write only or to mark BIOS runtime memory as SMM mode
      accessible only depending on the particular memory footprint on
      a given system.
      
      On an IMR violation Quark SoC X1000 systems are configured to
      reset the system, so ensuring that the IMR memory map is
      consistent with the EFI provided memory map is critical to
      ensure no IMR violations reset the system.
      
      The API for accessing IMRs is based on MTRR code but doesn't
      provide a /proc or /sys interface to manipulate IMRs. Defining
      the size and extent of IMRs is exclusively the domain of
      in-kernel code.
      
      Quark firmware sets up a series of locked IMRs around pieces of
      memory that firmware owns such as ACPI runtime data. During boot
      a series of unlocked IMRs are placed around items in memory to
      guarantee no DMA modification of those items can take place.
      Grub also places an unlocked IMR around the kernel boot params
      data structure and compressed kernel image. It is necessary for
      the kernel to tear down all unlocked IMRs in order to ensure
      that the kernel's view of memory passed via the EFI memory map
      is consistent with the IMR memory map. Without tearing down all
      unlocked IMRs on boot, transitory IMRs such as those used to
      protect the compressed kernel image will cause IMR violations
      and system reboots.
      
      The IMR init code tears down all unlocked IMRs and sets a
      protective IMR around the kernel .text and .rodata as one
      contiguous block. This sanitizes the IMR memory map with respect
      to the EFI memory map and protects the read-only portions of the
      kernel from unwarranted DMA access.
      Tested-by: Ong, Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
      Reviewed-by: Andy Shevchenko <andy.schevchenko@gmail.com>
      Reviewed-by: Darren Hart <dvhart@linux.intel.com>
      Reviewed-by: Ong, Boon Leong <boon.leong.ong@intel.com>
      Cc: andy.shevchenko@gmail.com
      Cc: dvhart@infradead.org
      Link: http://lkml.kernel.org/r/1422635379-12476-2-git-send-email-pure.logic@nexus-software.ie
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
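The IMR behaviour described in the commit above (1 KiB granularity, per-agent read/write masks, eight regions, tear-down of unlocked regions at boot) can be modelled in a few lines of C. This is only a sketch: every name below (`struct imr`, `imr_set`, `imr_write_allowed`, `imr_clear_unlocked`, the `AGENT_*` bits) is hypothetical and does not reflect the kernel's IMR API or the real register layout, and a real violation resets the system rather than returning `false`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical; this is not the kernel's IMR API. */
#define IMR_ALIGN 1024u /* 1 KiB granularity, per the commit message */
#define IMR_COUNT 8     /* "Quark supports eight IMRs" */

/* Illustrative agent bits; the real register layout is in the datasheet. */
enum {
	AGENT_CPU_NON_SMM = 1u << 0,
	AGENT_CPU_SMM     = 1u << 1,
	AGENT_PCIE_VC0    = 1u << 2, /* DMA-capable components sit on VC0 */
};

struct imr {
	uint32_t base;  /* inclusive start address, 1 KiB aligned */
	uint32_t size;  /* length in bytes, multiple of 1 KiB; 0 = unused */
	uint32_t rmask; /* agents allowed to read */
	uint32_t wmask; /* agents allowed to write */
	bool locked;    /* locked regions are left alone at boot */
};

/* A range is valid if it is non-empty, 1 KiB aligned and does not wrap. */
bool imr_range_valid(uint32_t base, uint32_t size)
{
	return size != 0 && base % IMR_ALIGN == 0 && size % IMR_ALIGN == 0 &&
	       size - 1 <= UINT32_MAX - base;
}

/* Program one slot of the model table. */
void imr_set(struct imr *slot, uint32_t base, uint32_t size,
	     uint32_t rmask, uint32_t wmask, bool locked)
{
	assert(imr_range_valid(base, size));
	slot->base = base;
	slot->size = size;
	slot->rmask = rmask;
	slot->wmask = wmask;
	slot->locked = locked;
}

/* A write is allowed only if every IMR covering addr permits the agent. */
bool imr_write_allowed(const struct imr *tbl, uint32_t addr, uint32_t agent)
{
	for (int i = 0; i < IMR_COUNT; i++) {
		const struct imr *r = &tbl[i];

		if (r->size == 0)
			continue;
		if (addr >= r->base && addr - r->base < r->size &&
		    !(r->wmask & agent))
			return false;
	}
	return true;
}

/* Boot-time sanitisation: clear every unlocked IMR, return the count. */
int imr_clear_unlocked(struct imr *tbl)
{
	int cleared = 0;

	for (int i = 0; i < IMR_COUNT; i++) {
		if (tbl[i].size != 0 && !tbl[i].locked) {
			tbl[i] = (struct imr){0};
			cleared++;
		}
	}
	return cleared;
}
```

In this model, a region whose write mask contains only `AGENT_CPU_NON_SMM` rejects writes from `AGENT_PCIE_VC0`, which is the "no DMA writes into kernel memory" case the commit message describes.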
  2. 02 Feb, 2015 1 commit
  3. 01 Feb, 2015 3 commits
    • x86_64, entry: Remove the syscall exit audit and schedule optimizations · 96b6352c
      Andy Lutomirski committed
      We used to optimize rescheduling and audit on syscall exit.  Now
      that the full slow path is reasonably fast, remove these
      optimizations.  Syscall exit auditing is now handled exclusively by
      syscall_trace_leave.
      
      This adds something like 10ns to the previously optimized paths on
      my computer, presumably due mostly to SAVE_REST / RESTORE_REST.
      
      I think that we should eventually replace both the syscall and
      non-paranoid interrupt exit slow paths with a pair of C functions
      along the lines of the syscall entry hooks.
      
      Link: http://lkml.kernel.org/r/22f2aa4a0361707a5cfb1de9d45260b39965dead.1421453410.git.luto@amacapital.net
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
    • x86_64, entry: Use sysret to return to userspace when possible · 2a23c6b8
      Andy Lutomirski committed
      The x86_64 entry code currently jumps through complex and
      inconsistent hoops to try to minimize the impact of syscall exit
      work.  For a true fast-path syscall, almost nothing needs to be
      done, so returning is just a check for exit work and sysret.  For a
      full slow-path return from a syscall, the C exit hook is invoked if
      needed and we join the iret path.
      
      Using iret to return to userspace is very slow, so the entry code
      has accumulated various special cases to try to do certain forms of
      exit work without invoking iret.  This is error-prone, since it
      duplicates assembly code paths, and it's dangerous, since sysret
      can malfunction in interesting ways if used carelessly.  It's
      also inefficient, since a lot of useful cases aren't optimized
      and therefore force an iret out of a combination of paranoia and
      the fact that no one has bothered to write even more asm code
      to avoid it.
      
      I would argue that this approach is backwards.  Rather than trying
      to avoid the iret path, we should instead try to make the iret path
      fast.  Under a specific set of conditions, iret is unnecessary.  In
      particular, if RIP==RCX, RFLAGS==R11, RIP is canonical, RF is not
      set, and both SS and CS are as expected, then
      movq 32(%rsp),%rsp;sysret does the same thing as iret.  This set of
      conditions is nearly always satisfied on return from syscalls, and
      it can even occasionally be satisfied on return from an irq.
      
      Even with the careful checks for sysret applicability, this cuts
      nearly 80ns off of the overhead from syscalls with unoptimized exit
      work.  This includes tracing and context tracking, and any return
      that invokes KVM's user return notifier.  For example, the cost of
      getpid with CONFIG_CONTEXT_TRACKING_FORCE=y drops from ~360ns to
      ~280ns on my computer.
      
      This may allow the removal and even eventual conversion to C
      of a respectable amount of exit asm.
      
      This may require further tweaking to give the full benefit on Xen.
      
      It may be worthwhile to adjust signal delivery and exec to try to
      hit the sysret path.
      
      This does not optimize returns to 32-bit userspace.  Making the same
      optimization for CS == __USER32_CS is conceptually straightforward,
      but it will require some tedious code to handle the differences
      between sysretl and sysexitl.
      
      Link: http://lkml.kernel.org/r/71428f63e681e1b4aa1a781e3ef7c27f027d1103.1421453410.git.luto@amacapital.net
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
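The conditions listed in the commit above (RIP==RCX, RFLAGS==R11, canonical RIP, RF clear, expected CS and SS) can be written as a single predicate. The sketch below is an illustration in C, not the patch itself, which does this check in assembly. `struct saved_regs`, `is_canonical48` and `can_use_sysret` are hypothetical names, the selector constants are assumed values standing in for the kernel's `__USER_CS` and `__USER_DS`, and the canonical check assumes 48-bit virtual addresses.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_RF (1ull << 16) /* Resume Flag is bit 16 of RFLAGS */

/* Assumed selector values for 64-bit user code and stack segments. */
#define EXPECTED_USER_CS 0x33
#define EXPECTED_USER_SS 0x2b

/* Hypothetical snapshot of the registers about to be restored. */
struct saved_regs {
	uint64_t rcx, r11;
	uint64_t rip, cs, rflags, rsp, ss;
};

/* With 48-bit virtual addresses, bits 63..47 must all be equal. */
bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47; /* the 17 most significant bits */

	return top == 0 || top == 0x1ffff;
}

/* True when sysret would restore the same state that iret would. */
bool can_use_sysret(const struct saved_regs *r)
{
	return r->rip == r->rcx &&        /* sysret loads RIP from RCX */
	       r->rflags == r->r11 &&     /* sysret loads RFLAGS from R11 */
	       is_canonical48(r->rip) &&  /* non-canonical RIP is unsafe */
	       !(r->rflags & X86_EFLAGS_RF) &&
	       r->cs == EXPECTED_USER_CS &&
	       r->ss == EXPECTED_USER_SS;
}
```

When the predicate is false, the return falls back to iret, so the check only ever trades a few comparisons for a cheaper exit.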
    • x86, traps: Fix ist_enter from userspace · b926e6f6
      Andy Lutomirski committed
      context_tracking_user_exit() has no effect if in_interrupt() returns true,
      so ist_enter() didn't work.  Fix it by calling exception_enter(), and thus
      context_tracking_user_exit(), before incrementing the preempt count.
      
      This also adds an assertion that will catch the problem reliably if
      CONFIG_PROVE_RCU=y to help prevent the bug from being reintroduced.
      
      Link: http://lkml.kernel.org/r/261ebee6aee55a4724746d0d7024697013c40a08.1422709102.git.luto@amacapital.net
      Fixes: 95927475 x86, traps: Track entry into and exit from IST context
      Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
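The ordering problem fixed by the commit above can be shown with a toy model. This is a deliberately simplified sketch, not kernel code: `struct cpu_model` and the `*_model` functions are hypothetical, the interrupt state is reduced to a single counter, and the real `exception_enter()` does more than call `context_tracking_user_exit()`.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state, reduced to the two facts that matter here. */
struct cpu_model {
	int hardirq_count;     /* stands in for the preempt count */
	bool tracking_in_user; /* context tracking still thinks "user mode" */
};

bool in_interrupt_model(const struct cpu_model *cpu)
{
	return cpu->hardirq_count != 0;
}

/* Mirrors the described behaviour: a no-op when in_interrupt() is true. */
void context_tracking_user_exit_model(struct cpu_model *cpu)
{
	if (in_interrupt_model(cpu))
		return;
	cpu->tracking_in_user = false;
}

/* Old order: bump the count first, so the user exit is silently skipped. */
void ist_enter_buggy_model(struct cpu_model *cpu)
{
	cpu->hardirq_count++;
	context_tracking_user_exit_model(cpu);
}

/* Fixed order: leave user context first, then bump the count. */
void ist_enter_fixed_model(struct cpu_model *cpu)
{
	context_tracking_user_exit_model(cpu);
	cpu->hardirq_count++;
}
```

Starting from `{0, true}`, the buggy variant leaves `tracking_in_user` set, while the fixed variant clears it before the count is raised.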
  4. 30 Jan, 2015 2 commits
    • KVM: x86: check LAPIC presence when building apic_map · df04d1d1
      Radim Krčmář committed
      We forgot to re-check LAPIC after splitting the loop in commit
      173beedc (KVM: x86: Software disabled APIC should still deliver
      NMIs, 2014-11-02).
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Fixes: 173beedc
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • vm: add VM_FAULT_SIGSEGV handling support · 33692f27
      Linus Torvalds committed
      The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
      "you should SIGSEGV" error, because the SIGSEGV case was generally
      handled by the caller - usually the architecture fault handler.
      
      That results in lots of duplication - all the architecture fault
      handlers end up doing very similar "look up vma, check permissions, do
      retries etc" - but it generally works.  However, there are cases where
      the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
      
      In particular, when accessing the stack guard page, libsigsegv expects a
      SIGSEGV.  And it usually got one, because the stack growth is handled by
      that duplicated architecture fault handler.
      
      However, when the generic VM layer started propagating the error return
      from the stack expansion in commit fee7e49d ("mm: propagate error
      from stack expansion even for guard page"), that now exposed the
      existing VM_FAULT_SIGBUS result to user space.  And user space really
      expected SIGSEGV, not SIGBUS.
      
      To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
      duplicate architecture fault handlers about it.  They all already have
      the code to handle SIGSEGV, so it's about just tying that new return
      value to the existing code, but it's all a bit annoying.
      
      This is the mindless minimal patch to do this.  A more extensive patch
      would be to try to gather up the mostly shared fault handling logic into
      one generic helper routine, and long-term we really should do that
      cleanup.
      
      Just from this patch, you can generally see that most architectures just
      copied (directly or indirectly) the old x86 way of doing things, but in
      the meantime that original x86 model has been improved to hold the VM
      semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
      "newer" things, so it would be a good idea to bring all those
      improvements to the generic case and teach other architectures about
      them too.
      Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
      Tested-by: Jan Engelhardt <jengelh@inai.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
      Cc: linux-arch@vger.kernel.org
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
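What "tying that new return value to the existing code" means for an architecture fault handler can be sketched as a small dispatch function. This is an illustration only: `fault_to_signal` is a hypothetical helper, the flag values are assumptions chosen for the example, and real handlers branch to existing labels for the out-of-memory, bad-area and SIGBUS cases rather than returning a number.

```c
#include <signal.h>

/* Flag values are assumed for illustration; see the kernel headers. */
#define VM_FAULT_OOM     0x0001u
#define VM_FAULT_SIGBUS  0x0002u
#define VM_FAULT_SIGSEGV 0x0040u
#define VM_FAULT_ERROR   (VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV)

/*
 * Hypothetical helper: map the core VM's fault result to the signal an
 * architecture handler would deliver. Returns 0 when no error bit is set
 * and -1 for out-of-memory, which is not handled by a plain signal.
 */
int fault_to_signal(unsigned int fault)
{
	if (!(fault & VM_FAULT_ERROR))
		return 0;
	if (fault & VM_FAULT_OOM)
		return -1;
	if (fault & VM_FAULT_SIGSEGV)
		return SIGSEGV; /* new case: the VM itself asks for SIGSEGV */
	if (fault & VM_FAULT_SIGBUS)
		return SIGBUS;
	return 0;
}
```

Checking `VM_FAULT_SIGSEGV` before `VM_FAULT_SIGBUS` is what lets a stack guard page access surface as SIGSEGV, the result user space expects.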
  5. 29 Jan, 2015 1 commit
  6. 28 Jan, 2015 4 commits
  7. 27 Jan, 2015 1 commit
  8. 24 Jan, 2015 1 commit
  9. 23 Jan, 2015 12 commits
  10. 22 Jan, 2015 13 commits