1. 24 2月, 2016 6 次提交
    • J
      x86/amd: Set ELF function type for vide() · de642faf
      Josh Poimboeuf 提交于
      vide() is a callable function, but is missing the ELF function type,
      which confuses tools like stacktool.
      
      Properly annotate it to be a callable function.  The generated code is
      unchanged.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris J Arges <chris.j.arges@canonical.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/a324095f5c9390ff39b15b4562ea1bbeda1a8282.1453405861.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      de642faf
    • J
      x86/paravirt: Create a stack frame in PV_CALLEE_SAVE_REGS_THUNK · 87b240cb
      Josh Poimboeuf 提交于
      A function created with the PV_CALLEE_SAVE_REGS_THUNK macro doesn't set
      up a new stack frame before the call instruction, which breaks frame
      pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
      a bad stack trace.  Also, the thunk functions aren't annotated as ELF
      callable functions.
      
      Create a stack frame when CONFIG_FRAME_POINTER is enabled and add the
      ELF function type.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris J Arges <chris.j.arges@canonical.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/a2cad74e87c4aba7fd0f54a1af312e66a824a575.1453405861.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      87b240cb
    • J
      x86/paravirt: Add stack frame dependency to PVOP inline asm calls · bb93eb4c
      Josh Poimboeuf 提交于
      If a PVOP call macro is inlined at the beginning of a function, gcc can
      insert the call instruction before setting up a stack frame, which
      breaks frame pointer convention if CONFIG_FRAME_POINTER is enabled and
      can result in a bad stack trace.
      
      Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
      listing the stack pointer as an output operand for the PVOP inline asm
      statements.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris J Arges <chris.j.arges@canonical.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/6a13e48c5a8cf2de1aa112ae2d4c0ac194096282.1453405861.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      bb93eb4c
    • J
      x86/asm/xen: Create stack frames in xen-asm.S · 8be0eb7e
      Josh Poimboeuf 提交于
      xen_irq_enable_direct(), xen_restore_fl_direct(), and check_events() are
      callable non-leaf functions which don't honor CONFIG_FRAME_POINTER,
      which can result in bad stack traces.
      
      Create stack frames for them when CONFIG_FRAME_POINTER is enabled.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris J Arges <chris.j.arges@canonical.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/a8340ad3fc72ba9ed34da9b3af9cdd6f1a896e17.1453405861.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8be0eb7e
    • J
      x86/asm/xen: Set ELF function type for xen_adjust_exception_frame() · 9fd21606
      Josh Poimboeuf 提交于
      xen_adjust_exception_frame() is a callable function, but is missing the
      ELF function type, which confuses tools like stacktool.
      
      Properly annotate it to be a callable function.  The generated code is
      unchanged.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris J Arges <chris.j.arges@canonical.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/b1851bd17a0986472692a7e3a05290d891382cdd.1453405861.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9fd21606
    • J
      x86/xen: Add stack frame dependency to hypercall inline asm calls · 0e8e2238
      Josh Poimboeuf 提交于
      If a hypercall is inlined at the beginning of a function, gcc can insert
      the call instruction before setting up a stack frame, which breaks frame
      pointer convention if CONFIG_FRAME_POINTER is enabled and can result in
      a bad stack trace.
      
      Force a stack frame to be created if CONFIG_FRAME_POINTER is enabled by
      listing the stack pointer as an output operand for the hypercall inline
      asm statements.
      Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: NDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris J Arges <chris.j.arges@canonical.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Pedro Alves <palves@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: live-patching@vger.kernel.org
      Link: http://lkml.kernel.org/r/c6face5a46713108bded9c4c103637222abc4528.1453405861.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0e8e2238
  2. 18 2月, 2016 3 次提交
    • T
      x86/cpufeature: Create a new synthetic cpu capability for machine check recovery · 0f68c088
      Tony Luck 提交于
      The Intel Software Developer Manual describes bit 24 in the MCG_CAP
      MSR:
      
         MCG_SER_P (software error recovery support present) flag,
         bit 24 — Indicates (when set) that the processor supports
         software error recovery
      
      But only some models with this capability bit set will actually
      generate recoverable machine checks.
      
      Check the model name and set a synthetic capability bit. Provide
      a command line option to set this bit anyway in case the kernel
      doesn't recognise the model name.
      Signed-off-by: NTony Luck <tony.luck@intel.com>
      Reviewed-by: NBorislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/2e5bfb23c89800a036fb8a45fa97a74bb16bc362.1455732970.git.tony.luck@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0f68c088
    • I
      3a2f2ac9
    • T
      x86/mm: Fix vmalloc_fault() to handle large pages properly · f4eafd8b
      Toshi Kani 提交于
      A kernel page fault oops with the callstack below was observed
      when a read syscall was made to a pmem device after a huge amount
      (>512GB) of vmalloc ranges was allocated by ioremap() on a x86_64
      system:
      
           BUG: unable to handle kernel paging request at ffff880840000ff8
           IP: vmalloc_fault+0x1be/0x300
           PGD c7f03a067 PUD 0
           Oops: 0000 [#1] SM
           Call Trace:
              __do_page_fault+0x285/0x3e0
              do_page_fault+0x2f/0x80
              ? put_prev_entity+0x35/0x7a0
              page_fault+0x28/0x30
              ? memcpy_erms+0x6/0x10
              ? schedule+0x35/0x80
              ? pmem_rw_bytes+0x6a/0x190 [nd_pmem]
              ? schedule_timeout+0x183/0x240
              btt_log_read+0x63/0x140 [nd_btt]
               :
              ? __symbol_put+0x60/0x60
              ? kernel_read+0x50/0x80
              SyS_finit_module+0xb9/0xf0
              entry_SYSCALL_64_fastpath+0x1a/0xa4
      
      Since v4.1, ioremap() supports large page (pud/pmd) mappings in
      x86_64 and PAE.  vmalloc_fault() however assumes that the vmalloc
      range is limited to pte mappings.
      
      vmalloc faults do not normally happen in ioremap'd ranges since
      ioremap() sets up the kernel page tables, which are shared by
      user processes.  pgd_ctor() sets the kernel's PGD entries to
      user's during fork().  When allocation of the vmalloc ranges
      crosses a 512GB boundary, ioremap() allocates a new pud table
      and updates the kernel PGD entry to point it.  If user process's
      PGD entry does not have this update yet, a read/write syscall
      to the range will cause a vmalloc fault, which hits the Oops
      above as it does not handle a large page properly.
      
      Following changes are made to vmalloc_fault().
      
      64-bit:
      
       - No change for the PGD sync operation as it handles large
         pages already.
       - Add pud_huge() and pmd_huge() to the validation code to
         handle large pages.
       - Change pud_page_vaddr() to pud_pfn() since an ioremap range
         is not directly mapped (while the if-statement still works
         with a bogus addr).
       - Change pmd_page() to pmd_pfn() since an ioremap range is not
         backed by struct page (while the if-statement still works
         with a bogus addr).
      
      32-bit:
       - No change for the sync operation since the index3 PGD entry
         covers the entire vmalloc range, which is always valid.
         (A separate change to sync PGD entry is necessary if this
          memory layout is changed regardless of the page size.)
       - Add pmd_huge() to the validation code to handle large pages.
         This is for completeness since vmalloc_fault() won't happen
         in ioremap'd ranges as its PGD entry is always valid.
      Reported-by: NHenning Schild <henning.schild@siemens.com>
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: <stable@vger.kernel.org> # 4.1+
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: linux-mm@kvack.org
      Cc: linux-nvdimm@lists.01.org
      Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f4eafd8b
  3. 17 2月, 2016 12 次提交
    • A
      x86/entry/compat: Keep TS_COMPAT set during signal delivery · 4e79e182
      Andy Lutomirski 提交于
      Signal delivery needs to know the sign of an interrupted syscall's
      return value in order to detect -ERESTART variants.  Normally this
      works independently of bitness because syscalls internally return
      long.  Under ptrace, however, this can break, and syscall_get_error
      is supposed to sign-extend regs->ax if needed.
      
      We were clearing TS_COMPAT too early, though, and this prevented
      sign extension, which subtly broke syscall restart under ptrace.
      Reported-by: NRobert O'Callahan <robert@ocallahan.org>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org # 4.3.x-
      Fixes: c5c46f59 ("x86/entry: Add new, comprehensible entry and exit handlers written in C")
      Link: http://lkml.kernel.org/r/cbce3cf545522f64eb37f5478cb59746230db3b5.1455142412.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4e79e182
    • A
      selftests/x86: Add a test for syscall restart under ptrace · 40361343
      Andy Lutomirski 提交于
      This catches a regression from the compat syscall rework.  The
      32-bit variant of this test currently fails.  The issue is that, for
      a 32-bit tracer and a 32-bit tracee, GETREGS+SETREGS with no changes
      should be a no-op.  It currently isn't a no-op if RAX indicates
      signal restart, because the high bits get cleared and the kernel
      loses track of the restart state.
      Reported-by: NRobert O'Callahan <robert@ocallahan.org>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/c4040b40b5b4a37ed31375a69b683f753ec6788a.1455142412.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      40361343
    • A
      selftests/x86: Fix some error messages in ptrace_syscall · adcfd23e
      Andy Lutomirski 提交于
      I had some obvious typos.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert O'Callahan <robert@ocallahan.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e5e6772d4802986cf7df702e646fa24ac14f2204.1455142412.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      adcfd23e
    • M
      hpet: Drop stale URLs · 4e7f9df2
      Michael S. Tsirkin 提交于
      Looks like the HPET spec at intel.com got moved.
      It isn't hard to find so drop the link, just mention
      the revision assumed.
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Clemens Ladisch <clemens@ladisch.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1455145462-3877-1-git-send-email-mst@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4e7f9df2
    • T
      x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache() · a82eee74
      Toshi Kani 提交于
      Data corruption issues were observed in tests which initiated
      a system crash/reset while accessing BTT devices.  This problem
      is reproducible.
      
      The BTT driver calls pmem_rw_bytes() to update data in pmem
      devices.  This interface calls __copy_user_nocache(), which
      uses non-temporal stores so that the stores to pmem are
      persistent.
      
      __copy_user_nocache() uses non-temporal stores when a request
      size is 8 bytes or larger (and is aligned by 8 bytes).  The
      BTT driver updates the BTT map table, which entry size is
      4 bytes.  Therefore, updates to the map table entries remain
      cached, and are not written to pmem after a crash.
      
      Change __copy_user_nocache() to use non-temporal store when
      a request size is 4 bytes.  The change extends the current
      byte-copy path for a less-than-8-bytes request, and does not
      add any overhead to the regular path.
      Reported-and-tested-by: NMicah Parrish <micah.parrish@hpe.com>
      Reported-and-tested-by: NBrian Boylston <brian.boylston@hpe.com>
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: linux-nvdimm@lists.01.org
      Link: http://lkml.kernel.org/r/1455225857-12039-3-git-send-email-toshi.kani@hpe.com
      [ Small readability edits. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a82eee74
    • T
      x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable · ee9737c9
      Toshi Kani 提交于
      Add comments to __copy_user_nocache() to clarify its procedures
      and alignment requirements.
      
      Also change numeric branch target labels to named local labels.
      
      No code changed:
      
       arch/x86/lib/copy_user_64.o:
      
          text    data     bss     dec     hex filename
          1239       0       0    1239     4d7 copy_user_64.o.before
          1239       0       0    1239     4d7 copy_user_64.o.after
      
       md5:
          58bed94c2db98c1ca9a2d46d0680aaae  copy_user_64.o.before.asm
          58bed94c2db98c1ca9a2d46d0680aaae  copy_user_64.o.after.asm
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: brian.boylston@hpe.com
      Cc: dan.j.williams@intel.com
      Cc: linux-nvdimm@lists.01.org
      Cc: micah.parrish@hpe.com
      Cc: ross.zwisler@linux.intel.com
      Cc: vishal.l.verma@intel.com
      Link: http://lkml.kernel.org/r/1455225857-12039-2-git-send-email-toshi.kani@hpe.com
      [ Small readability edits and added object file comparison. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      ee9737c9
    • B
      x86/msr: Document msr-index.h rule for addition · 053080a9
      Borislav Petkov 提交于
      In order to keep this file's size sensible and not cause too much
      unnecessary churn, make the rule explicit - similar to pci_ids.h - that
      only MSRs which are used in multiple compilation units, should get added
      to it.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: alex.williamson@redhat.com
      Cc: gleb@kernel.org
      Cc: joro@8bytes.org
      Cc: kvm@vger.kernel.org
      Cc: sherry.hurwitz@amd.com
      Cc: wei@redhat.com
      Link: http://lkml.kernel.org/r/1455612202-14414-5-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      053080a9
    • B
      x86/ftrace, x86/asm: Kill ftrace_caller_end label · f1b92bb6
      Borislav Petkov 提交于
      One of ftrace_caller_end and ftrace_return is redundant so unify them.
      Rename ftrace_return to ftrace_epilogue to mean that everything after
      that label represents, like an afterword, work which happens *after* the
      ftrace call, e.g., the function graph tracer for one.
      
      Steve wants this to rather mean "[a]n event which reflects meaningfully
      on a recently ended conflict or struggle." I can imagine that ftrace can
      be a struggle sometimes.
      
      Anyway, beef up the comment about the code contents and layout before
      ftrace_epilogue label.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1455612202-14414-4-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f1b92bb6
    • A
      selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS · 4f6c8938
      Andy Lutomirski 提交于
      This tests the two ABI-preserving cases that DOSEMU cares about, and
      it also explicitly tests the new UC_SIGCONTEXT_SS and
      UC_STRICT_RESTORE_SS flags.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Stas Sergeev <stsp@list.ru>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/f3d08f98541d0bd3030ceb35e05e21f59e30232c.1455664054.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4f6c8938
    • A
      x86/signal/64: Re-add support for SS in the 64-bit signal context · 6c25da5a
      Andy Lutomirski 提交于
      This is a second attempt to make the improvements from c6f20629
      ("x86/signal/64: Fix SS handling for signals delivered to 64-bit
      programs"), which was reverted by 51adbfbba5c6 ("x86/signal/64: Add
      support for SS in the 64-bit signal context").
      
      This adds two new uc_flags flags.  UC_SIGCONTEXT_SS will be set for
      all 64-bit signals (including x32).  It indicates that the saved SS
      field is valid and that the kernel supports the new behavior.
      
      The goal is to fix a problems with signal handling in 64-bit tasks:
      SS wasn't saved in the 64-bit signal context, making it awkward to
      determine what SS was at the time of signal delivery and making it
      impossible to return to a non-flat SS (as calling sigreturn clobbers
      SS).
      
      This also made it extremely difficult for 64-bit tasks to return to
      fully-defined 16-bit contexts, because only the kernel can easily do
      espfix64, but sigreturn was unable to set a non-flag SS:ESP.
      (DOSEMU has a monstrous hack to partially work around this
      limitation.)
      
      If we could go back in time, the correct fix would be to make 64-bit
      signals work just like 32-bit signals with respect to SS: save it
      in signal context, reset it when delivering a signal, and restore
      it in sigreturn.
      
      Unfortunately, doing that (as I tried originally) breaks DOSEMU:
      DOSEMU wouldn't reset the signal context's SS when clearing the LDT
      and changing the saved CS to 64-bit mode, since it predates the SS
      context field existing in the first place.
      
      This patch is a bit more complicated, and it tries to balance a
      bunch of goals.  It makes most cases of changing ucontext->ss during
      signal handling work as expected.
      
      I do this by special-casing the interesting case.  On sigreturn,
      ucontext->ss will be honored by default, unless the ucontext was
      created from scratch by an old program and had a 64-bit CS
      (unfortunately, CRIU can do this) or was the result of changing a
      32-bit signal context to 64-bit without resetting SS (as DOSEMU
      does).
      
      For the benefit of new 64-bit software that uses segmentation (new
      versions of DOSEMU might), the new behavior can be detected with a
      new ucontext flag UC_SIGCONTEXT_SS.
      
      To avoid compilation issues, __pad0 is left as an alias for ss in
      ucontext.
      
      The nitty-gritty details are documented in the header file.
      
      This patch also re-enables the sigreturn_64 and ldt_gdt_64 selftests,
      as the kernel change allows both of them to pass.
      Tested-by: NStas Sergeev <stsp@list.ru>
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/749149cbfc3e75cd7fcdad69a854b399d792cc6f.1455664054.git.luto@kernel.org
      [ Small readability edit. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6c25da5a
    • A
      x86/signal/64: Fix SS if needed when delivering a 64-bit signal · 8ff5bd2e
      Andy Lutomirski 提交于
      Signals are always delivered to 64-bit tasks with CS set to a long
      mode segment.  In long mode, SS doesn't matter as long as it's a
      present writable segment.
      
      If SS starts out invalid (this can happen if the signal was caused
      by an IRET fault or was delivered on the way out of set_thread_area
      or modify_ldt), then IRET to the signal handler can fail, eventually
      killing the task.
      
      The straightforward fix would be to simply reset SS when delivering
      a signal.  That breaks DOSEMU, though: 64-bit builds of DOSEMU rely
      on SS being set to the faulting SS when signals are delivered.
      
      As a compromise, this patch leaves SS alone so long as it's valid.
      
      The net effect should be that the behavior of successfully delivered
      signals is unchanged.  Some signals that would previously have
      failed to be delivered will now be delivered successfully.
      
      This has no effect for x32 or 32-bit tasks: their signal handlers
      were already called with SS == __USER_DS.
      
      (On Xen, there's a slight hole: if a task sets SS to a writable
       *kernel* data segment, then we will fail to identify it as invalid
       and we'll still kill the task.  If anyone cares, this could be fixed
       with a new paravirt hook.)
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stas Sergeev <stsp@list.ru>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/163c6e1eacde41388f3ff4d2fe6769be651d7b6e.1455664054.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8ff5bd2e
    • A
      x86/signal/64: Add a comment about sigcontext->fs and gs · e54fdcca
      Andy Lutomirski 提交于
      These fields have a strange history.  This tries to document it.
      
      This borrows from 9a036b93 ("x86/signal/64: Remove 'fs' and 'gs'
      from sigcontext"), which was reverted by ed596cde ("Revert x86
      sigcontext cleanups").
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stas Sergeev <stsp@list.ru>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/baa78f3c84106fa5acbc319377b1850602f5deec.1455664054.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e54fdcca
  4. 16 2月, 2016 5 次提交
    • I
      Merge tag 'efi-urgent' of... · 02a5f765
      Ingo Molnar 提交于
      Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent
      
      Pull EFI bug fixes from Matt Fleming:
      
       * Fix bugs in our code that converts ucs2 strings to utf8 where we
         unintentionally drop bits from the original string (Jason Andryuk)
      
       * Add the efi-pstore variables to the variable whitelist so that
         users can continue to delete them via efivarfs without needing to
         manipulate the immutable flag (Matt Fleming)
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      02a5f765
    • J
      lib/ucs2_string: Correct ucs2 -> utf8 conversion · a6807590
      Jason Andryuk 提交于
      The comparisons should be >= since 0x800 and 0x80 require an additional bit
      to store.
      
      For the 3 byte case, the existing shift would drop off 2 more bits than
      intended.
      
      For the 2 byte case, there should be 5 bits bits in byte 1, and 6 bits in
      byte 2.
      Signed-off-by: NJason Andryuk <jandryuk@gmail.com>
      Reviewed-by: NLaszlo Ersek <lersek@redhat.com>
      Cc: Peter Jones <pjones@redhat.com>
      Cc: Matthew Garrett <mjg59@coreos.com>
      Cc: "Lee, Chun-Yi" <jlee@suse.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      a6807590
    • M
      efi: Add pstore variables to the deletion whitelist · e246eb56
      Matt Fleming 提交于
      Laszlo explains why this is a good idea,
      
       'This is because the pstore filesystem can be backed by UEFI variables,
        and (for example) a crash might dump the last kilobytes of the dmesg
        into a number of pstore entries, each entry backed by a separate UEFI
        variable in the above GUID namespace, and with a variable name
        according to the above pattern.
      
        Please see "drivers/firmware/efi/efi-pstore.c".
      
        While this patch series will not prevent the user from deleting those
        UEFI variables via the pstore filesystem (i.e., deleting a pstore fs
        entry will continue to delete the backing UEFI variable), I think it
        would be nice to preserve the possibility for the sysadmin to delete
        Linux-created UEFI variables that carry portions of the crash log,
        *without* having to mount the pstore filesystem.'
      
      There's also no chance of causing machines to become bricked by
      deleting these variables, which is the whole purpose of excluding
      things from the whitelist.
      
      Use the LINUX_EFI_CRASH_GUID guid and a wildcard '*' for the match so
      that we don't have to update the string in the future if new variable
      name formats are created for crash dump variables.
      Reported-by: NLaszlo Ersek <lersek@redhat.com>
      Acked-by: NPeter Jones <pjones@redhat.com>
      Tested-by: NPeter Jones <pjones@redhat.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: "Lee, Chun-Yi" <jlee@suse.com>
      Signed-off-by: NMatt Fleming <matt@codeblueprint.co.uk>
      e246eb56
    • I
      Merge tag 'efi-urgent' of... · 4682c211
      Ingo Molnar 提交于
      Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent
      
      Pull EFI fixes from Matt Fleming:
      
       * Prevent accidental deletion of EFI variables through efivarfs that
         may brick machines. We use a whitelist of known-safe variables to
         allow things like installing distributions to work out of the box, and
         instead restrict vendor-specific variable deletion by making
         non-whitelist variables immutable (Peter Jones)
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4682c211
    • B
      x86/cpufeature: Speed up cpu_feature_enabled() · f2cc8e07
      Borislav Petkov 提交于
      When GCC cannot do constant folding for this macro, it falls back to
      cpu_has(). But static_cpu_has() is optimal and it works at all times
      now. So use it and speedup the fallback case.
      
      Before we had this:
      
        mov    0x99d674(%rip),%rdx        # ffffffff81b0d9f4 <boot_cpu_data+0x34>
        shr    $0x2e,%rdx
        and    $0x1,%edx
        jne    ffffffff811704e9 <do_munmap+0x3f9>
      
      After alternatives patching, it turns into:
      
      		  jmp    0xffffffff81170390
      		  nopl   (%rax)
      		  ...
      		  callq  ffffffff81056e00 <mpx_notify_unmap>
      ffffffff81170390: mov    0x170(%r12),%rdi
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1455578358-28347-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f2cc8e07
  5. 15 2月, 2016 14 次提交