1. 05 May, 2014 1 commit
  2. 01 May, 2014 1 commit
    • H
      x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack · 3891a04a
      H. Peter Anvin committed
      The IRET instruction, when returning to a 16-bit segment, only
      restores the bottom 16 bits of the user space stack pointer.  This
      causes some 16-bit software to break, but it also leaks kernel state
      to user space.  We have a software workaround for that ("espfix") for
      the 32-bit kernel, but it relies on a nonzero stack segment base which
      is not available in 64-bit mode.
      
      In checkin:
      
          b3b42ac2 x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
      
      we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
      the logic that 16-bit support is crippled on 64-bit kernels anyway (no
      V86 support), but it turns out that people are doing stuff like
      running old Win16 binaries under Wine and expect it to work.
      
      This works around this by creating percpu "ministacks", each of which
      is mapped 2^16 times 64K apart.  When we detect that the return SS is
      on the LDT, we copy the IRET frame to the ministack and use the
      relevant alias to return to userspace.  The ministacks are mapped
      readonly, so if IRET faults we promote #GP to #DF which is an IST
      vector and thus has its own stack; we then do the fixup in the #DF
      handler.
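
      The alias selection can be sketched in user-space C. This is a reading of
      the description above, not the kernel code: it assumes the ministack
      region starts at a 4 GiB-aligned base, that aliases sit 64 KiB apart, and
      that the "relevant alias" is the one whose address bits 31:16 match the
      user's %esp. The names espfix_alias, base, and slot_offset are
      hypothetical.

```c
#include <stdint.h>

/* Assumption: consecutive aliases of a ministack are 64 KiB (2^16 bytes) apart. */
#define ESPFIX_ALIAS_STRIDE (UINT64_C(1) << 16)

/*
 * Return the address of the ministack alias to use for a given user %esp.
 *
 * base        - hypothetical 4 GiB-aligned start of the alias region
 * slot_offset - offset of this CPU's ministack slot within one 64 KiB window
 * user_esp    - the 32-bit user stack pointer being returned to
 *
 * IRET restores only the low 16 bits of the stack pointer, so bits 31:16
 * stay whatever the kernel stack pointer held. Choosing the alias whose
 * bits 31:16 equal the user's own makes the stale bits harmless.
 */
uint64_t espfix_alias(uint64_t base, uint16_t slot_offset, uint32_t user_esp)
{
    uint64_t alias_index = user_esp >> 16; /* one of 2^16 aliases */

    return base + alias_index * ESPFIX_ALIAS_STRIDE + slot_offset;
}
```

      With 2^16 aliases at a 2^16-byte stride, every possible value of bits
      31:16 has exactly one alias, which is why the mapping count and the
      spacing in the description match.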
      
      (Making #GP an IST exception would make the msr_safe functions unsafe
      in NMI/MC context, and quite possibly have other effects.)
      
      Special thanks to:
      
      - Andy Lutomirski, for the suggestion of using very small stack slots
        and copy (as opposed to map) the IRET frame there, and for the
        suggestion to mark them readonly and let the fault promote to #DF.
      - Konrad Wilk for paravirt fixup and testing.
      - Borislav Petkov for testing help and useful comments.
      Reported-by: Brian Gerst <brgerst@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Lutomriski <amluto@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Dirk Hohndel <dirk@hohndel.org>
      Cc: Arjan van de Ven <arjan.van.de.ven@intel.com>
      Cc: comex <comexk@gmail.com>
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <stable@vger.kernel.org> # consider after upstream merge
  3. 19 March, 2014 1 commit
  4. 16 January, 2014 1 commit
  5. 14 January, 2014 1 commit
  6. 09 January, 2014 1 commit
    • D
      arch: x86: New MailBox support driver for Intel SOC's · 46184415
      David E. Box committed
      Current Intel SOC cores use a MailBox Interface (MBI) to provide access to
      configuration registers on devices (called units) connected to the system
      fabric. This is a support driver that implements access to this interface on
      those platforms that can enumerate the device using PCI. Initial support is for
      BayTrail, for which port definitions are provided. This is a requirement for
      implementing platform specific features (e.g. RAPL driver requires this to
      perform platform specific power management using the registers in PUNIT).
      Dependent modules should select IOSF_MBI in their respective Kconfig
      configuration. Serialized access is handled by all exported routines with
      spinlocks.
      
      The API includes 3 functions for access to unit registers:
      
      int iosf_mbi_read(u8 port, u8 opcode, u32 offset, u32 *mdr)
      int iosf_mbi_write(u8 port, u8 opcode, u32 offset, u32 mdr)
      int iosf_mbi_modify(u8 port, u8 opcode, u32 offset, u32 mdr, u32 mask)
      
      port:	indicating the unit being accessed
      opcode:	the read or write port specific opcode
      offset:	the register offset within the port
      mdr:	the register data to be read, written, or modified
      mask:	bit locations in mdr to change
      
      Returns nonzero on error
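
      The mask parameter of iosf_mbi_modify() suggests a read-modify-write
      in which only the masked bits take their value from mdr. A minimal sketch
      of that bit arithmetic follows; the helper name mbi_apply_mask is
      hypothetical and the exact semantics are an assumption drawn from the
      parameter description above, not from the driver source.

```c
#include <stdint.h>

/*
 * Combine the current register value with new data under a mask.
 * Bits set in `mask` are taken from `mdr`; all other bits keep the
 * value they had in `old`. (Assumed semantics, see the lead-in.)
 */
uint32_t mbi_apply_mask(uint32_t old, uint32_t mdr, uint32_t mask)
{
    return (old & ~mask) | (mdr & mask);
}
```

      Under this reading, a modify call amounts to a read, the combination
      above, and a write back, all done under the driver's spinlock.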
      
      Note: GPU code handles access to the GFX unit. Therefore access to that unit
      with this driver is disallowed to avoid conflicts.
      Signed-off-by: David E. Box <david.e.box@linux.intel.com>
      Link: http://lkml.kernel.org/r/1389216471-734-1-git-send-email-david.e.box@linux.intel.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
  7. 29 December, 2013 1 commit
    • D
      x86: Export x86 boot_params to sysfs · 5039e316
      Dave Young committed
      kexec-tools use boot_params for getting the 1st kernel hardware_subarch,
      the kexec kernel EFI runtime support also needs to read the old efi_info
      from boot_params. Currently it exists in debugfs which is not a good
      place for such information. Per HPA, we should avoid "sploit debugfs".
      
      In this patch /sys/kernel/boot_params are exported, also the setup_data is
      exported as a subdirectory. kexec-tools is using debugfs for hardware_subarch
      for a long time now so we're not removing it yet.
      
      Structure is like below:
      
      /sys/kernel/boot_params
      |__ data                /* boot_params in binary*/
      |__ setup_data
      |   |__ 0               /* the first setup_data node */
      |   |   |__ data        /* setup_data node 0 in binary*/
      |   |   |__ type        /* setup_data type of setup_data node 0, hex string */
      [snip]
      |__ version             /* boot protocol version (in hex, "0x" prefixed)*/
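
      A user-space reader for the version file could look like the sketch
      below. It relies only on what the tree above states (the path and the
      "0x"-prefixed hex format); the function names are hypothetical, and the
      file will be absent on kernels without this patch.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a "0x"-prefixed hex string such as "0x020c\n".
 * Returns 0 on success and stores the value in *version, -1 otherwise.
 */
int parse_boot_version(const char *text, unsigned long *version)
{
    char *end;

    if (text[0] != '0' || (text[1] != 'x' && text[1] != 'X'))
        return -1;

    errno = 0;
    *version = strtoul(text, &end, 16);
    if (errno != 0)
        return -1;

    /* Only a newline or the end of the string may follow the digits. */
    if (*end != '\0' && *end != '\n')
        return -1;
    return 0;
}

/* Read and parse the first line of `path`; returns 0 on success. */
int read_boot_version(const char *path, unsigned long *version)
{
    char buf[32];
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_boot_version(buf, version);
}
```

      A caller would pass "/sys/kernel/boot_params/version" as the path and
      treat a nonzero return as "not available on this kernel".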
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Tested-by: Toshi Kani <toshi.kani@hp.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
  8. 25 September, 2013 1 commit
  9. 03 August, 2013 2 commits
    • D
      x86: sysfb: move EFI quirks from efifb to sysfb · 2995e506
      David Herrmann committed
      The EFI FB quirks from efifb.c are useful for simple-framebuffer devices
      as well. Apply them by default so we can convert efifb.c to use
      efi-framebuffer platform devices.
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Link: http://lkml.kernel.org/r/1375445127-15480-5-git-send-email-dh.herrmann@gmail.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • D
      x86: provide platform-devices for boot-framebuffers · e3263ab3
      David Herrmann committed
      The current situation regarding boot-framebuffers (VGA, VESA/VBE, EFI) on
      x86 causes troubles when loading multiple fbdev drivers. The global
      "struct screen_info" does not provide any state-tracking about which
      drivers use the FBs. request_mem_region() theoretically works, but
      unfortunately vesafb/efifb ignore it due to quirks for broken boards.
      
      Avoid this by creating platform framebuffer devices with a pointer
      to the "struct screen_info" as platform-data. Drivers can now create
      platform-drivers and the driver-core will refuse multiple drivers being
      active simultaneously.
      
      We keep the screen_info available for backwards-compatibility. Drivers
      can be converted in follow-up patches.
      
      Different devices are created for VGA/VESA/EFI FBs to allow multiple
      drivers to be loaded on distro kernels. We create:
       - "vesa-framebuffer" for VBE/VESA graphics FBs
       - "efi-framebuffer" for EFI FBs
       - "platform-framebuffer" for everything else
      This allows to load vesafb, efifb and others simultaneously and each
      picks up only the supported FB types.
      
      Apart from platform-framebuffer devices, this also introduces a
      compatibility option for "simple-framebuffer" drivers which recently got
      introduced for OF based systems. If CONFIG_X86_SYSFB is selected, we
      try to match the screen_info against a simple-framebuffer supported
      format. If we succeed, we create a "simple-framebuffer" device instead
      of a platform-framebuffer.
      This allows to reuse the simplefb.c driver across architectures and also
      to introduce a SimpleDRM driver. There is no need to have vesafb.c,
      efifb.c, simplefb.c and more just to have architecture specific quirks
      in their setup-routines.
      
      Instead, we now move the architecture specific quirks into x86-setup and
      provide a generic simple-framebuffer. For backwards-compatibility (if
      strange formats are used), we still allow vesafb/efifb to be loaded
      simultaneously and pick up all remaining devices.
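
      The device-naming rules described above can be summarized as a small
      decision function. This is only a sketch: the enum and function names
      are hypothetical, and the two boolean inputs stand in for
      "CONFIG_X86_SYSFB is selected" and "screen_info matches a
      simple-framebuffer supported format".

```c
#include <stdbool.h>

/* Hypothetical classification of the boot framebuffer. */
enum sketch_fb_type {
    SKETCH_FB_VESA,  /* VBE/VESA graphics framebuffer */
    SKETCH_FB_EFI,   /* EFI framebuffer */
    SKETCH_FB_OTHER  /* anything else, e.g. VGA text mode */
};

/* Return the platform-device name that would be created. */
const char *sketch_fb_device_name(enum sketch_fb_type type,
                                  bool sysfb_selected,
                                  bool simple_format)
{
    /* A compatible format takes priority when the option is enabled. */
    if (sysfb_selected && simple_format)
        return "simple-framebuffer";

    switch (type) {
    case SKETCH_FB_VESA:
        return "vesa-framebuffer";
    case SKETCH_FB_EFI:
        return "efi-framebuffer";
    default:
        return "platform-framebuffer";
    }
}
```

      Because each boot framebuffer yields exactly one device name, only the
      driver bound to that name can claim it.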
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Link: http://lkml.kernel.org/r/1375445127-15480-4-git-send-email-dh.herrmann@gmail.com
      Tested-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  10. 21 June, 2013 2 commits
    • S
      trace,x86: Move creation of irq tracepoints from apic.c to irq.c · 83ab8514
      Steven Rostedt (Red Hat) committed
      Compiling without CONFIG_X86_LOCAL_APIC set, apic.c will not be
      compiled, and the irq tracepoints will not be created via the
      CREATE_TRACE_POINTS macro. When CONFIG_X86_LOCAL_APIC is not set,
      we get the following build error:
      
        LD      init/built-in.o
      arch/x86/built-in.o: In function `trace_x86_platform_ipi_entry':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_entry'
      arch/x86/built-in.o: In function `trace_x86_platform_ipi_exit':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_exit'
      arch/x86/built-in.o: In function `trace_irq_work_entry':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_entry'
      arch/x86/built-in.o: In function `trace_irq_work_exit':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_exit'
      arch/x86/built-in.o:(__jump_table+0x8): undefined reference to `__tracepoint_x86_platform_ipi_entry'
      arch/x86/built-in.o:(__jump_table+0x14): undefined reference to `__tracepoint_x86_platform_ipi_exit'
      arch/x86/built-in.o:(__jump_table+0x20): undefined reference to `__tracepoint_irq_work_entry'
      arch/x86/built-in.o:(__jump_table+0x2c): undefined reference to `__tracepoint_irq_work_exit'
      make[1]: *** [vmlinux] Error 1
      make: *** [sub-make] Error 2
      
      As irq.c is always compiled for x86, it is a more appropriate location
      to create the irq tracepoints.
      
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • S
      x86, trace: Add irq vector tracepoints · cf910e83
      Seiji Aguchi committed
      [Purpose of this patch]
      
      As Vaibhav explained in the thread below, tracepoints for irq vectors
      are useful.
      
      http://www.spinics.net/lists/mm-commits/msg85707.html
      
      <snip>
      The current interrupt traces from irq_handler_entry and irq_handler_exit
      provide when an interrupt is handled.  They provide good data about when
      the system has switched to kernel space and how it affects the currently
      running processes.
      
      There are some IRQ vectors which trigger the system into kernel space,
      which are not handled in generic IRQ handlers.  Tracing such events gives
      us the information about IRQ interaction with other system events.
      
      The trace also tells where the system is spending its time.  We want to
      know which cores are handling interrupts and how they are affecting other
      processes in the system.  Also, the trace provides information about when
      the cores are idle and which interrupts are changing that state.
      <snip>
      
      My use case, on the other hand, is tracing just the local timer event and
      getting the value of the instruction pointer.
      
      I previously suggested adding an argument to the local timer event to get the instruction pointer.
      But there is another way to get it, with an external module like systemtap.
      So I don't need to add any argument to the irq vector tracepoints now.
      
      [Patch Description]
      
      Vaibhav's patch shared a single tracepoint pair, irq_vector_entry/irq_vector_exit, across all events.
      But the use case above needs to trace a specific irq vector rather than all events.
      In that case, we are concerned about the overhead of unwanted events.
      
      So, instead of introducing irq_vector_entry/exit, add the following tracepoints
      so that we can enable them independently.
         - local_timer_vector
         - reschedule_vector
         - call_function_vector
         - call_function_single_vector
         - irq_work_entry_vector
         - error_apic_vector
         - thermal_apic_vector
         - threshold_apic_vector
         - spurious_apic_vector
         - x86_platform_ipi_vector
      
      Also, introduce logic that switches the IDT at enabling/disabling time so that the time
      penalty is zero when tracepoints are disabled. Detailed explanations are as follows.
       - Create trace irq handlers with entering_irq()/exiting_irq().
       - Create a new IDT, trace_idt_table, at boot time by adding logic to
         _set_gate(). It is just a copy of the original IDT.
       - Register the new handlers for tracepoints in the new IDT by introducing
         macros to alloc_intr_gate(), which is called when irq vector handlers are registered.
       - Add a check of whether irq vector tracing is on or off to load_current_idt().
         This has to be done below the debug check for these reasons.
         - Switching to the debug IDT may be kicked while tracing is enabled.
         - On the other hand, switching to the trace IDT is kicked only when debugging
           is disabled.
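
      The ordering constraint in the last item (debug check first, trace
      check second) can be sketched as a pure selection function. The enum and
      function names are hypothetical; the real load_current_idt() loads a
      descriptor table rather than returning a value.

```c
#include <stdbool.h>

/* Which interrupt descriptor table should be active. */
enum sketch_idt { SKETCH_IDT_NORMAL, SKETCH_IDT_DEBUG, SKETCH_IDT_TRACE };

/*
 * Pick the IDT to load. The debug IDT wins whenever debugging is
 * active, even if tracing is enabled; the trace IDT is used only
 * when debugging is off.
 */
enum sketch_idt sketch_choose_idt(bool debug_enabled, bool trace_enabled)
{
    if (debug_enabled)
        return SKETCH_IDT_DEBUG;
    if (trace_enabled)
        return SKETCH_IDT_TRACE;
    return SKETCH_IDT_NORMAL;
}
```

      With tracing off, the normal IDT and its untraced handlers stay in
      place, which is how the disabled case avoids any per-interrupt cost.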
      
      In addition, the new IDT is created only when CONFIG_TRACING is enabled to avoid being
      used for other purposes.
      Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
  11. 31 May, 2013 1 commit
  12. 14 May, 2013 1 commit
  13. 01 February, 2013 1 commit
  14. 22 January, 2013 2 commits
  15. 14 November, 2012 1 commit
  16. 24 October, 2012 1 commit
  17. 01 October, 2012 1 commit
  18. 23 August, 2012 1 commit
  19. 10 August, 2012 1 commit
    • J
      perf: Unified API to record selective sets of arch registers · c5e63197
      Jiri Olsa committed
      This brings a new API to help the selective dump of registers on event
      sampling, and its implementation for x86 arch.
      
      Added HAVE_PERF_REGS config option to determine if the architecture
      provides perf registers ABI.
      
      The information about desired registers will be passed in u64 mask.
      It's up to the architecture to map the registers into the mask bits.
      
      For the x86 arch implementation, both 32 and 64 bit registers bits are
      defined within single enum to ensure 64 bit system can provide register
      dump for compat task if needed in the future.
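
      How a u64 mask selects registers can be illustrated with a short
      sketch. The enum below is a made-up four-register layout, not the
      kernel's x86 enum, and sketch_sample_regs is a hypothetical helper: it
      copies the value of every register whose bit is set in the mask, in
      ascending bit order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices; each index is also a bit position. */
enum sketch_reg {
    SKETCH_REG_AX,
    SKETCH_REG_BX,
    SKETCH_REG_CX,
    SKETCH_REG_DX,
    SKETCH_REG_MAX
};

/*
 * Copy the registers selected by `mask` from `regs` into `out`.
 * Returns the number of values written (at most `out_len`).
 */
size_t sketch_sample_regs(const uint64_t *regs, uint64_t mask,
                          uint64_t *out, size_t out_len)
{
    size_t n = 0;

    for (int bit = 0; bit < SKETCH_REG_MAX; bit++) {
        if (!(mask & (UINT64_C(1) << bit)))
            continue;          /* register not requested */
        if (n == out_len)
            break;             /* output buffer is full */
        out[n++] = regs[bit];
    }
    return n;
}
```

      Keeping 32-bit and 64-bit registers in one enum, as the commit does,
      means a single mask layout can describe either kind of task.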
      Original-patch-by: Frederic Weisbecker <fweisbec@gmail.com>
      [ Added missing linux/errno.h include ]
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: "Frank Ch. Eigler" <fche@redhat.com>
      Cc: Arun Sharma <asharma@fb.com>
      Cc: Benjamin Redelings <benjamin.redelings@nescent.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/1344345647-11536-2-git-send-email-jolsa@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  20. 18 May, 2012 1 commit
    • P
      MCA: delete all remaining traces of microchannel bus support. · bb8187d3
      Paul Gortmaker committed
      Hardware with MCA bus is limited to 386 and 486 class machines
      that are now 20+ years old and typically with less than 32MB
      of memory.  A quick search on the internet, and you see that
      even the MCA hobbyist/enthusiast community has lost interest
      in the early 2000 era and never really even moved ahead from
      the 2.4 kernels to the 2.6 series.
      
      This deletes anything remaining related to CONFIG_MCA from core
      kernel code and from the x86 architecture.  There is no point in
      carrying this any further into the future.
      
      One complication to watch for is inadvertently scooping up
      stuff relating to machine check, since there is overlap in
      the TLA name space (e.g. arch/x86/boot/mca.c).
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: James Bottomley <JBottomley@Parallels.com>
      Cc: x86@kernel.org
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  21. 17 May, 2012 1 commit
  22. 09 May, 2012 3 commits
  23. 05 May, 2012 1 commit
  24. 06 March, 2012 1 commit
  25. 17 February, 2012 1 commit
    • S
      uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints · 2b144498
      Srikar Dronamraju committed
      Add uprobes support to the core kernel, with x86 support.
      
      This commit adds the kernel facilities, the actual uprobes
      user-space ABI and perf probe support comes in later commits.
      
      General design:
      
      Uprobes are maintained in an rb-tree indexed by inode and offset
      (the offset here is from the start of the mapping). For a unique
      (inode, offset) tuple, there can be at most one uprobe in the
      rb-tree.
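
      The ordering that such a tree needs can be sketched as a comparison on
      the (inode, offset) pair. The struct and function names are
      hypothetical, and an inode number stands in for the kernel's inode
      pointer.

```c
#include <stdint.h>

/* Hypothetical lookup key: which file, and where in it. */
struct sketch_uprobe_key {
    uint64_t inode;   /* stand-in for the inode identity */
    uint64_t offset;  /* offset of the probed instruction */
};

/*
 * Three-way comparison: order by inode first, then by offset.
 * Returns a negative value, zero, or a positive value.
 */
int sketch_uprobe_key_cmp(const struct sketch_uprobe_key *l,
                          const struct sketch_uprobe_key *r)
{
    if (l->inode != r->inode)
        return l->inode < r->inode ? -1 : 1;
    if (l->offset != r->offset)
        return l->offset < r->offset ? -1 : 1;
    return 0;
}
```

      A result of zero means the key is already present, which is what
      enforces "at most one uprobe" per tuple.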
      
      Since the (inode, offset) tuple identifies a unique uprobe, more
      than one user may be interested in the same uprobe. This provides
      the ability to connect multiple 'consumers' to the same uprobe.
      
      Each consumer defines a handler and a filter (optional). The
      'handler' is run every time the uprobe is hit, if it matches the
      'filter' criteria.
      
      The first consumer of a uprobe causes the breakpoint to be
      inserted at the specified address and subsequent consumers are
      appended to this list.  On subsequent probes, the consumer gets
      appended to the existing list of consumers. The breakpoint is
      removed when the last consumer unregisters. For all other
      unregistrations, the consumer is removed from the list of
      consumers.
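
      The consumer life cycle above (first registration installs the
      breakpoint, last unregistration removes it) can be sketched with a
      singly linked list. All names are hypothetical, locking is omitted, and
      new consumers are pushed at the head for brevity rather than appended.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_consumer {
    struct sketch_consumer *next;
};

struct sketch_probe {
    struct sketch_consumer *consumers;  /* NULL when nobody is attached */
    bool breakpoint_installed;
};

/* Attach a consumer; the first one triggers breakpoint insertion. */
void sketch_probe_register(struct sketch_probe *p, struct sketch_consumer *c)
{
    bool first = (p->consumers == NULL);

    c->next = p->consumers;
    p->consumers = c;
    if (first)
        p->breakpoint_installed = true;
}

/* Detach a consumer; the last one triggers breakpoint removal. */
bool sketch_probe_unregister(struct sketch_probe *p, struct sketch_consumer *c)
{
    struct sketch_consumer **link;

    for (link = &p->consumers; *link != NULL; link = &(*link)->next) {
        if (*link != c)
            continue;
        *link = c->next;       /* unlink from the list */
        c->next = NULL;
        if (p->consumers == NULL)
            p->breakpoint_installed = false;
        return true;
    }
    return false;              /* consumer was not registered */
}
```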
      
      Given an inode, we get a list of the mms that have mapped the
      inode. Do the actual registration if mm maps the page where a
      probe needs to be inserted/removed.
      
      We use a temporary list to walk through the vmas that map the
      inode.
      
      - The number of maps that map the inode, is not known before we
        walk the rmap and keeps changing.
      - extending vm_area_struct wasn't recommended, it's a
        size-critical data structure.
      - There can be more than one maps of the inode in the same mm.
      
      We add callbacks to the mmap methods to keep an eye on text vmas
      that are of interest to uprobes.  When a vma of interest is mapped,
      we insert the breakpoint at the right address.
      
      Uprobe works by replacing the instruction at the address defined
      by (inode, offset) with the arch specific breakpoint
      instruction. We save a copy of the original instruction at the
      uprobed address.
      
      This is needed for:
      
       a. executing the instruction out-of-line (xol).
       b. instruction analysis for any subsequent fixups.
       c. restoring the instruction back when the uprobe is unregistered.
      
      We insert or delete a breakpoint instruction, and this
      breakpoint instruction is assumed to be the smallest instruction
      available on the platform. For fixed size instruction platforms
      this is trivially true, for variable size instruction platforms
      the breakpoint instruction is typically the smallest (often a
      single byte).
      
      Writing the instruction is done by COWing the page and changing
      the instruction during the copy, this even though most platforms
      allow atomic writes of the breakpoint instruction. This also
      mirrors the behaviour of a ptrace() memory write to a PRIVATE
      file map.
      
      The core worker is derived from KSM's replace_page() logic.
      
      In essence, similar to KSM:
      
       a. allocate a new page and copy over contents of the page that
          has the uprobed vaddr
       b. modify the copy and insert the breakpoint at the required
          address
       c. switch the original page with the copy containing the
          breakpoint
       d. flush page tables.
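
      Steps a and b can be mimicked in user space with an ordinary buffer.
      This sketch assumes a 4096-byte page and the x86 single-byte breakpoint
      opcode 0xCC (int3); the function name is hypothetical, and steps c and d
      (switching the mapping and flushing) have no user-space equivalent here,
      so they are left to the caller.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_X86_INT3  0xCC  /* one-byte x86 breakpoint instruction */

/*
 * Return a freshly allocated copy of `old_page` with a breakpoint at
 * `offset`, or NULL on failure. The byte that was overwritten is stored
 * in *saved so it can be restored when the probe is removed.
 * The original page is never modified.
 */
uint8_t *sketch_copy_with_breakpoint(const uint8_t *old_page, size_t offset,
                                     uint8_t *saved)
{
    uint8_t *new_page;

    if (offset >= SKETCH_PAGE_SIZE)
        return NULL;

    new_page = malloc(SKETCH_PAGE_SIZE);
    if (new_page == NULL)
        return NULL;

    memcpy(new_page, old_page, SKETCH_PAGE_SIZE);  /* step a: copy */
    *saved = new_page[offset];                     /* keep the original byte */
    new_page[offset] = SKETCH_X86_INT3;            /* step b: insert */
    return new_page;                               /* caller frees */
}
```

      Writing to a private copy is what keeps other mappings of the same
      file from seeing the breakpoint.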
      
      replace_page() is being replicated here because of some minor
      changes in the type of pages and also because Hugh Dickins had
      plans to improve replace_page() for KSM specific work.
      
      Instruction analysis on x86 is based on instruction decoder and
      determines if an instruction can be probed and determines the
      necessary fixups after singlestep.  Instruction analysis is done
      at probe insertion time so that we avoid having to repeat the
      same analysis every time a probe is hit.
      
      A lot of code here is due to the improvement/suggestions/inputs
      from Peter Zijlstra.
      
      Changelog:
      
      (v10):
       - Add code to clear REX.B prefix as suggested by Denys Vlasenko
         and Masami Hiramatsu.
      
      (v9):
       - Use insn_offset_modrm as suggested by Masami Hiramatsu.
      
      (v7):
      
       Handle comments from Peter Zijlstra:
      
       - Dont take reference to inode. (expect inode to uprobe_register to be sane).
       - Use PTR_ERR to set the return value.
       - No need to take reference to inode.
       - use PTR_ERR to return error value.
       - register and uprobe_unregister share code.
      
      (v5):
      
       - Modified del_consumer as per comments from Peter.
       - Drop reference to inode before dropping reference to uprobe.
       - Use i_size_read(inode) instead of inode->i_size.
       - Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called.
       - Includes errno.h as recommended by Stephen Rothwell to fix a build issue
         on sparc defconfig
       - Remove restrictions while unregistering.
       - Earlier code leaked inode references under some conditions while
         registering/unregistering.
       - Continue the vma-rmap walk even if the intermediate vma doesnt
         meet the requirements.
       - Validate the vma found by find_vma before inserting/removing the
         breakpoint
       - Call del_consumer under mutex_lock.
       - Use hash locks.
       - Handle mremap.
       - Introduce find_least_offset_node() instead of close match logic in
         find_uprobe
       - Uprobes no more depends on MM_OWNER; No reference to task_structs
         while inserting/removing a probe.
       - Uses read_mapping_page instead of grab_cache_page so that the pages
         have valid content.
       - pass NULL to get_user_pages for the task parameter.
       - call SetPageUptodate on the new page allocated in write_opcode.
       - fix leaking a reference to the new page under certain conditions.
       - Include Instruction Decoder if Uprobes gets defined.
       - Remove const attributes for instruction prefix arrays.
       - Uses mm_context to know if the application is 32 bit.
      Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Also-written-by: Jim Keniston <jkenisto@us.ibm.com>
      Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linux-mm <linux-mm@kvack.org>
      Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com
      [ Made various small edits to the commit log ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  26. 05 December, 2011 1 commit
    • D
      x86, NMI: Add NMI IPI selftest · 99e8b9ca
      Don Zickus committed
      The previous patch modified the stop cpus path to use NMI
      instead of IRQ as the way to communicate to the other cpus to
      shutdown.  There were some concerns that various machines may
      have problems with using an NMI IPI.
      
      This patch creates a selftest to check if NMI is working at
      boot. The idea is to help catch any issues before the machine
      panics and we learn the hard way.
      
      Loosely based on the locking-selftest.c file, this separate file
      runs a couple of simple tests and reports the results.  The
      output looks like:
      
      ...
      Brought up 4 CPUs
      ----------------
      | NMI testsuite:
      --------------------
        remote IPI:  ok  |
         local IPI:  ok  |
      --------------------
      Good, all   2 testcases passed! |
      ---------------------------------
      Total of 4 processors activated (21330.61 BogoMIPS).
      ...
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: seiji.aguchi@hds.com
      Cc: vgoyal@redhat.com
      Cc: mjg@redhat.com
      Cc: tony.luck@intel.com
      Cc: gong.chen@intel.com
      Cc: satoru.moriya@hds.com
      Cc: avi@redhat.com
      Cc: Andi Kleen <andi@firstfloor.org>
      Link: http://lkml.kernel.org/r/1318533267-18880-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  27. 18 November, 2011 1 commit
    • H
      x86: Generate system call tables and unistd_*.h from tables · 303395ac
      H. Peter Anvin committed
      Generate system call tables and unistd_*.h automatically from the
      tables in arch/x86/syscalls.  All other information, like NR_syscalls,
      is auto-generated, some of which is in asm-offsets_*.c.
      
      This allows us to keep all the system call information in one place,
      and allows for kernel space and user space to see different
      information; this is currently used for the ia32 system call numbers
      when building the 64-bit kernel, but will be used by the x32 ABI in
      the near future.
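
      The benefit of a single source table can be shown with a
      C-preprocessor analogy. This is not the mechanism of the commit, which
      generates headers from the tables in arch/x86/syscalls; it is a sketch
      of the same idea, with one hypothetical list expanded into both the
      numbers and the call table. All SKETCH_ and sketch_ names are invented
      for the example.

```c
#include <stddef.h>

/* The single source of truth: (number, name) pairs. */
#define SKETCH_SYSCALLS(X) \
    X(0, read)             \
    X(1, write)            \
    X(2, open)

/* Expansion 1: the numbers, as user space would see them. */
#define SKETCH_AS_NUMBER(nr, name) SKETCH_NR_##name = nr,
enum { SKETCH_SYSCALLS(SKETCH_AS_NUMBER) };

/* Stub handlers so the table has something to point at. */
#define SKETCH_AS_STUB(nr, name) \
    long sketch_sys_##name(void) { return nr; }
SKETCH_SYSCALLS(SKETCH_AS_STUB)

/* Expansion 2: the dispatch table, as the kernel would see it. */
typedef long (*sketch_call_t)(void);
#define SKETCH_AS_ENTRY(nr, name) [nr] = sketch_sys_##name,
const sketch_call_t sketch_call_table[] = {
    SKETCH_SYSCALLS(SKETCH_AS_ENTRY)
};

/* Derived rather than hand-maintained, like NR_syscalls. */
#define SKETCH_NR_SYSCALLS \
    (sizeof(sketch_call_table) / sizeof(sketch_call_table[0]))
```

      Adding an entry to the list updates the numbers, the table, and the
      count together, so the views cannot drift apart.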
      
      This also removes some gratuitous differences between i386, x86-64
      and ia32; in particular, now all system call tables are generated with
      the same mechanism.
      
      Cc: H. J. Lu <hjl.tools@gmail.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  28. 10 October, 2011 1 commit
  29. 11 August, 2011 1 commit
  30. 15 July, 2011 1 commit
  31. 21 June, 2011 2 commits
  32. 07 June, 2011 1 commit
    • A
      x86-64: Emulate legacy vsyscalls · 5cec93c2
      Andy Lutomirski committed
      There's a fair amount of code in the vsyscall page.  It contains
      a syscall instruction (in the gettimeofday fallback) and who
      knows what will happen if an exploit jumps into the middle of
      some other code.
      
      Reduce the risk by replacing the vsyscalls with short magic
      incantations that cause the kernel to emulate the real
      vsyscalls. These incantations are useless if entered in the
      middle.
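
      One way to picture "useless if entered in the middle" is a check that
      accepts only exact entry addresses. The sketch below assumes the
      conventional x86-64 vsyscall layout (page at 0xffffffffff600000, three
      entries spaced 1024 bytes apart), which is not stated in this commit
      message; the function name is hypothetical and the patch's actual trap
      mechanism is not modeled.

```c
#include <stdint.h>

/* Assumed legacy layout, see the lead-in. */
#define SKETCH_VSYSCALL_BASE  UINT64_C(0xffffffffff600000)
#define SKETCH_VSYSCALL_SLOT  UINT64_C(0x400)  /* 1024 bytes per entry */
#define SKETCH_VSYSCALL_COUNT 3                /* gettimeofday, time, getcpu */

/*
 * Map an address to a vsyscall slot number.
 * Returns 0..2 for an exact entry address and -1 for anything else,
 * including addresses in the middle of a slot.
 */
int sketch_vsyscall_slot(uint64_t addr)
{
    uint64_t off;

    if (addr < SKETCH_VSYSCALL_BASE)
        return -1;

    off = addr - SKETCH_VSYSCALL_BASE;
    if (off % SKETCH_VSYSCALL_SLOT != 0)
        return -1;                         /* not at the start of a slot */
    if (off / SKETCH_VSYSCALL_SLOT >= SKETCH_VSYSCALL_COUNT)
        return -1;                         /* past the last entry */
    return (int)(off / SKETCH_VSYSCALL_SLOT);
}
```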
      
      This causes vsyscalls to be a little more expensive than real
      syscalls.  Fortunately sensible programs don't use them.
      The only exception is time() which is still called by glibc
      through the vsyscall - but calling time() millions of times
      per second is not sensible. glibc has this fixed in the
      development tree.
      
      This patch is not perfect: the vread_tsc and vread_hpet
      functions are still at a fixed address.  Fixing that might
      involve making alternative patching work in the vDSO.
      Signed-off-by: Andy Lutomirski <luto@mit.edu>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jesper Juhl <jj@chaosbits.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
      Cc: Mikael Pettersson <mikpe@it.uu.se>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
      Cc: Valdis.Kletnieks@vt.edu
      Cc: pageexec@freemail.hu
      Link: http://lkml.kernel.org/r/e64e1b3c64858820d12c48fa739efbd1485e79d5.1307292171.git.luto@mit.edu
      [ Removed the CONFIG option - it's simpler to just do it unconditionally. Tidied up the code as well. ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  33. 28 May, 2011 1 commit
    • S
      x86: Put back -pg to tsc.o and add no GCOV to vread_tsc_64.o · 89e1be50
      Steven Rostedt committed
      The commit 44259b1a
          Author: Andy Lutomirski <luto@MIT.EDU>
          x86-64: Move vread_tsc into a new file with sensible options
      
      Removed the -pg from tsc.o which caused the function graph tracer
      to go into an infinite function call recursion as it uses the tsc
      internally outside its recursion protection, thus tracing the tsc
      breaks the function graph tracer.
      
      This commit also added the file vread_tsc_64.c that gets used
      by vdso but failed to prevent GCOV from monkeying with it,
      causing userspace to try to access kernel data when GCOV was
      enabled.
      
      Thanks to Thomas Gleixner for pointing out GCOV as the likely
      culprit that added strange kernel accesses into the vread_tsc()
      call.
      
      Cc: Author: Andy Lutomirski <luto@MIT.EDU>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  34. 24 May, 2011 1 commit