1. 04 Apr, 2017 3 commits
  2. 03 Apr, 2017 2 commits
  3. 01 Apr, 2017 2 commits
  4. 31 Mar, 2017 9 commits
    • x86/mm: Make in_compat_syscall() work during exec · ada26481
      Dmitry Safonov committed
      The x86 mmap() code selects the mmap base for an allocation depending on
      the bitness of the syscall. For 64bit syscalls it selects mm->mmap_base and
      for 32bit mm->mmap_compat_base.
      
      On execve the registers of the task invoking exec() are copied to the child
      pt_regs. So child->pt_regs->orig_ax contains the execve syscall number of the
      parent.
      
      exec() calls mmap() which in turn uses in_compat_syscall() to check whether
      the mapping is for a 32bit or a 64bit task. The decision is made on the
      following criteria:
      
        ia32	  child->thread.status & TS_COMPAT
         x32	  child->pt_regs.orig_ax & __X32_SYSCALL_BIT
        ia64	  !ia32 && !x32 
      
      child->thread.status is correctly set up in set_personality_*(), but the
      syscall number in child->pt_regs.orig_ax is left unmodified.
      
      Therefore the parent/child combinations work or fail in the following way:
      
      Parent Child Child->thread_status  child->pt_regs.orig_ax  in_compat()  Works
      ia64    ia64   TS_COMPAT == 0	   __X32_SYSCALL_BIT == 0     false       Y
      ia64    ia32   TS_COMPAT == 1	   __X32_SYSCALL_BIT == 0     true        Y
      ia64     x32   TS_COMPAT == 0	   __X32_SYSCALL_BIT == 0     false       N
      ia32    ia64   TS_COMPAT == 0	   __X32_SYSCALL_BIT == 0     false       Y
      ia32    ia32   TS_COMPAT == 1	   __X32_SYSCALL_BIT == 0     true        Y
      ia32     x32   TS_COMPAT == 0	   __X32_SYSCALL_BIT == 0     false       N
       x32    ia64   TS_COMPAT == 0	   __X32_SYSCALL_BIT == 1     true        N
       x32    ia32   TS_COMPAT == 1	   __X32_SYSCALL_BIT == 1     true        Y
       x32     x32   TS_COMPAT == 0	   __X32_SYSCALL_BIT == 1     true        Y
      
      Make set_personality_*() store the syscall number incl. __X32_SYSCALL_BIT
      which corresponds to the newly started ELF executable in the child's
      pt_regs, i.e. pretend that the exec was invoked from a task with the same
      executable format.
      
      So both thread.status and pt_regs.orig_ax correspond to the new ELF format
      and in_compat_syscall() returns the correct result.
      
      [ tglx: Rewrote changelog ]
      
      Fixes: commit 1b028f78 ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
      Reported-by: Adam Borowski <kilobyte@angband.pl>
      Suggested-by: H. Peter Anvin <hpa@zytor.com>
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: 0x7f454c46@gmail.com
      Cc: linux-mm@kvack.org
      Cc: Andrei Vagin <avagin@gmail.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170331111137.28170-1-dsafonov@virtuozzo.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
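The decision criteria and the parent/child table in the x86/mm changelog above can be modeled in a few lines of user-space C. This is a minimal sketch, not kernel code: `task_model`, `elf_kind`, `exec_before_fix()` and `exec_after_fix()` are hypothetical names, the two flag values are what the x86 headers are understood to define and should be treated as illustrative, and the "after" case only sets or clears the x32 bit rather than storing a real execve syscall number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative copies of TS_COMPAT and __X32_SYSCALL_BIT (assumed values). */
#define MODEL_TS_COMPAT        0x0002u
#define MODEL_X32_SYSCALL_BIT  0x40000000u

/* "ia64" follows the changelog's wording for a native 64-bit binary. */
enum elf_kind { ELF_IA64, ELF_IA32, ELF_X32 };

/* Hypothetical stand-in for the two fields the check reads. */
struct task_model {
    uint32_t status;   /* models child->thread.status */
    uint64_t orig_ax;  /* models child->pt_regs.orig_ax */
};

/* The three criteria from the changelog: compat means ia32 or x32. */
static bool in_compat_model(const struct task_model *t)
{
    bool ia32 = (t->status & MODEL_TS_COMPAT) != 0;
    bool x32  = (t->orig_ax & MODEL_X32_SYSCALL_BIT) != 0;
    return ia32 || x32;
}

/* Before the fix: status follows the new binary, orig_ax is inherited. */
static struct task_model exec_before_fix(uint64_t parent_orig_ax,
                                         enum elf_kind child)
{
    struct task_model t;
    t.status  = (child == ELF_IA32) ? MODEL_TS_COMPAT : 0;
    t.orig_ax = parent_orig_ax;
    return t;
}

/* After the fix (simplified): orig_ax also follows the new binary. */
static struct task_model exec_after_fix(uint64_t parent_orig_ax,
                                        enum elf_kind child)
{
    struct task_model t = exec_before_fix(parent_orig_ax, child);
    t.orig_ax &= ~(uint64_t)MODEL_X32_SYSCALL_BIT;
    if (child == ELF_X32)
        t.orig_ax |= MODEL_X32_SYSCALL_BIT;
    return t;
}
```

In this model the rows marked "N" in the table are exactly the cases where `exec_before_fix()` leaves `in_compat_model()` disagreeing with the format of the new binary, and `exec_after_fix()` makes the two agree.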
    • x86/boot: Include missing header file · 6b1cc946
      Zhengyi Shen committed
      Sparse complains about missing forward declarations:
      
      arch/x86/boot/compressed/error.c:8:6:
      	warning: symbol 'warn' was not declared. Should it be static?
      arch/x86/boot/compressed/error.c:15:6:
      	warning: symbol 'error' was not declared. Should it be static?
      
      Include the missing header file.
      Signed-off-by: Zhengyi Shen <shenzhengyi@gmail.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Link: http://lkml.kernel.org/r/1490770820-24472-1-git-send-email-shenzhengyi@gmail.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs · 29f72ce3
      Yazen Ghannam committed
      MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
      However, MCA bank 3 is defined on Fam17h systems and can be accessed
      using legacy MSRs. Without a name we get a stack trace on Fam17h systems
      when trying to register sysfs files for bank 3 on kernels that don't
      recognize Scalable MCA.
      
      Call MCA bank 3 "decode_unit" since this is what it represents on
      Fam17h. This will allow kernels without SMCA support to see this bank on
      Fam17h+ and prevent the stack trace. This will not affect older systems
      since this bank is reserved on them, i.e. it'll be ignored.
      
      Tested on AMD Fam15h and Fam17h systems.
      
        WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
        kobject: (ffff88085bb256c0): attempted to be registered with empty name!
        ...
        Call Trace:
         kobject_add_internal
         kobject_add
         kobject_create_and_add
         threshold_create_device
         threshold_init_device
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/boot/32: Flip the logic in test_wp_bit() · 952a6c2c
      Borislav Petkov committed
      ... to have a natural "likely()" in the code flow, so that the success
      case goes through a branch which is not taken 99.999% of the time and
      the function return follows it directly instead of being jumped to each time.
      
      This puts the panic() call at the end of the function - it is going to
      be practically unreachable anyway.
      
      The C code is a bit more readable too.
      
      No functionality change.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: thgarnie@google.com
      Link: http://lkml.kernel.org/r/20170330080101.ywsf5rg6ilzu4itk@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • ARC: fix build warnings with !CONFIG_KPROBES · 4c6fabda
      Vineet Gupta committed
      |   CC      lib/nmi_backtrace.o
      | In file included from ../include/linux/kprobes.h:43:0,
      |                  from ../lib/nmi_backtrace.c:17:
      | ../arch/arc/include/asm/kprobes.h:57:13: warning: 'trap_is_kprobe' defined but not used [-Wunused-function]
      |  static void trap_is_kprobe(unsigned long address, struct pt_regs *regs)
      |              ^~~~~~~~~~~~~~
      
      The warning started with 7d134b2c ("kprobes: move kprobe declarations
      to asm-generic/kprobes.h") which started including <asm/kprobes.h>
      unconditionally into <linux/kprobes.h>, exposing a stub function for
      !CONFIG_KPROBES to the rest of the world. Fix that by making the stub a macro.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
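The "stub as a macro" idea from the ARC changelog above can be sketched as follows. This is one possible shape under stated assumptions, not the text of the actual patch: the macro body and the caller `handle_trap_model()` are hypothetical, and `CONFIG_KPROBES` is simply left undefined to model the !CONFIG_KPROBES build.

```c
struct pt_regs;  /* opaque here; only the pointer type is needed */

#ifdef CONFIG_KPROBES
/* Real implementation lives elsewhere when kprobes are enabled. */
void trap_is_kprobe(unsigned long address, struct pt_regs *regs);
#else
/*
 * A macro leaves no function definition behind in files that include the
 * header without calling the stub, so -Wunused-function cannot fire.
 * The arguments are still evaluated to avoid unused-variable warnings.
 */
#define trap_is_kprobe(address, regs) \
    do { (void)(address); (void)(regs); } while (0)
#endif

/* Hypothetical caller: call sites compile unchanged in both configurations. */
static int handle_trap_model(unsigned long address, struct pt_regs *regs)
{
    trap_is_kprobe(address, regs);
    return 0;
}
```

A `static inline` stub would also avoid the warning; the macro form is shown here because it is the approach the changelog names.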
    • ARCv2: SLC: Make sure busy bit is set properly on SLC flushing · c70c4733
      Alexey Brodkin committed
      As reported in STAR 9001165532, an SLC control reg read (for checking
      busy state) right after SLC invalidate command may incorrectly return
      NOT busy causing software to NOT spin-wait while operation is underway.
      (and for some reason this only happens if L1 cache is also disabled - as
      required by IOC programming model)
      
      Suggested workaround is to do an additional Control Reg read, which
      ensures the 2nd read gets the right status.
      
      Cc: stable@vger.kernel.org  #4.10
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      [vgupta: rewrote changelog a bit]
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
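The SLC workaround described above (one extra control register read before the busy-wait) can be illustrated with a small simulation. Everything here is a hypothetical model rather than ARC driver code: `slc_model` fakes a device whose first status read after a command is stale, and `SLC_CTRL_BUSY_MODEL` is an illustrative bit value.

```c
#include <stdint.h>

#define SLC_CTRL_BUSY_MODEL 0x100u  /* illustrative busy bit */

/* Fake device: first read after a command wrongly reports "not busy". */
struct slc_model {
    int reads_since_cmd;
    int busy_reads_left;
};

static void slc_issue_invalidate(struct slc_model *s)
{
    s->reads_since_cmd = 0;
    s->busy_reads_left = 3;  /* operation stays busy for three reads */
}

static uint32_t slc_read_ctrl(struct slc_model *s)
{
    if (s->reads_since_cmd++ == 0)
        return 0;                      /* stale status right after command */
    if (s->busy_reads_left > 0) {
        s->busy_reads_left--;
        return SLC_CTRL_BUSY_MODEL;
    }
    return 0;
}

/* Returns how many times the loop spun; 0 means the wait was skipped. */
static int slc_invalidate_and_wait(struct slc_model *s, int extra_read)
{
    int spins = 0;

    slc_issue_invalidate(s);
    if (extra_read)
        (void)slc_read_ctrl(s);        /* workaround: discard the first read */
    while (slc_read_ctrl(s) & SLC_CTRL_BUSY_MODEL)
        spins++;
    return spins;
}
```

Without the extra read the model falls straight through the loop while the operation is still in flight; with it, the loop spins until the busy bit clears.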
    • arm64: drop non-existing vdso-offsets.h from .gitignore · 9b3403ae
      Masahiro Yamada committed
      Since commit a66649da ("arm64: fix vdso-offsets.h dependency"),
      include/generated/vdso-offsets.h is directly generated without
      arch/arm64/kernel/vdso/vdso-offsets.h.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: remove redundant header file in current.h · 34d04f25
      Shaokun Zhang committed
      Commit 9d84fb27 ("arm64: restore get_current() optimisation") has
      removed read_sysreg(), so asm/sysreg.h is now redundant.
      
      This patch removes asm/sysreg.h header file.
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: fix NULL dereference in have_cpu_die() · 335d2c2d
      Mark Salter committed
      Commit 5c492c3f ("arm64: smp: Add function to determine if cpus are
      stuck in the kernel") added a helper function to determine if die() is
      supported in cpu_ops. This function assumes a cpu will have a valid
      cpu_ops entry, but that may not be the case for cpu0 if spin-table or
      parking protocol is used to boot secondary cpus. In that case, there
      is a NULL dereference if have_cpu_die() is called by cpu0. So add a
      check for a valid cpu_ops before dereferencing it.
      
      Fixes: 5c492c3f ("arm64: smp: Add function to determine if cpus are stuck in the kernel")
      Signed-off-by: Mark Salter <msalter@redhat.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
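The NULL check described in the have_cpu_die() changelog above amounts to testing the per-CPU ops pointer before testing its die callback. A minimal user-space sketch, assuming a reduced hypothetical `cpu_ops_model` structure and table in place of the kernel's real cpu_ops:

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4

/* Hypothetical, reduced stand-in for the kernel's per-CPU operations. */
struct cpu_ops_model {
    void (*cpu_die)(unsigned int cpu);
};

static void die_noop(unsigned int cpu) { (void)cpu; }

static const struct cpu_ops_model ops_with_die    = { die_noop };
static const struct cpu_ops_model ops_without_die = { NULL };

/* One slot per CPU; a NULL slot models cpu0 having no cpu_ops entry. */
static const struct cpu_ops_model *cpu_ops_table[NR_CPUS_MODEL];

static bool have_cpu_die_model(unsigned int cpu)
{
    const struct cpu_ops_model *ops = cpu_ops_table[cpu];

    /* Check the entry itself before dereferencing it. */
    return ops != NULL && ops->cpu_die != NULL;
}
```

With the table left zero-initialized for cpu0, the function reports "no die support" instead of dereferencing a NULL pointer.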
  5. 30 Mar, 2017 6 commits
    • x86/build: Mostly disable '-maccumulate-outgoing-args' · 3f135e57
      Josh Poimboeuf committed
      The GCC '-maccumulate-outgoing-args' flag is enabled for most configs,
      mostly because of issues which are no longer relevant.  For most
      configs, and with most recent versions of GCC, it's no longer needed.
      
      Clarify which cases need it, and only enable it for those cases.  Also
      produce a compile-time error for the ftrace graph + mcount + '-Os' case,
      which will otherwise cause runtime failures.
      
      The main benefit of '-maccumulate-outgoing-args' is that it prevents an
      ugly prologue for functions which have aligned stacks.  But removing the
      option also has some benefits: more readable argument saves, smaller
      text size, and (presumably) slightly improved performance.
      
      Here are the object size savings for 32-bit and 64-bit defconfig
      kernels:
      
            text	   data	    bss	     dec	    hex	filename
        10006710	3543328	1773568	15323606	 e9d1d6	vmlinux.x86-32.before
         9706358	3547424	1773568	15027350	 e54c96	vmlinux.x86-32.after
      
            text	   data	    bss	     dec	    hex	filename
        10652105	4537576	 843776	16033457	 f4a6b1	vmlinux.x86-64.before
        10639629	4537576	 843776	16020981	 f475f5	vmlinux.x86-64.after
      
      That comes out to a 3% text size improvement on x86-32 and a 0.1% text
      size improvement on x86-64.
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170316193133.zrj6gug53766m6nn@treble
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
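The text size percentages quoted in the x86/build changelog above follow directly from the `text` columns of the two tables. A small helper to reproduce the arithmetic (the function name is hypothetical):

```c
/* Percentage by which `after` is smaller than `before`. */
static double text_saving_percent(unsigned long before, unsigned long after)
{
    return 100.0 * (double)(before - after) / (double)before;
}
```

Called with the x86-32 figures (10006710 and 9706358) it yields about 3.0%, and with the x86-64 figures (10652105 and 10639629) about 0.12%, which the changelog rounds to 3% and 0.1%.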
    • x86/boot/32: Rewrite test_wp_bit() · 4af17110
      Andy Lutomirski committed
      This code seems to be very old and has gotten only minor updates.
      It's overcomplicated and has a bunch of comments that are, at best,
      of purely historical interest.  Nowadays we have a shiny function
      probe_kernel_write() that does more or less exactly what we need.
      Use it.
      
      I switched the page that we test from swapper_pg_dir to
      empty_zero_page because writing zero to empty_zero_page is more
      obviously safe than writing to the paging structures.  (It's
      extremely unlikely that any of this would cause problems in practice
      because the write will fail on any supported CPU.)
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/0b9e64ab0236de30e7572213cea77bf95ae2e990.1490831211.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/dump_pagetables: Add support for 5-level paging · fdd3d8ce
      Kirill A. Shutemov committed
      Simple extension to support one more page table level.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20170328104806.41711-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • parisc: Avoid stalled CPU warnings after system shutdown · 476e75a4
      Helge Deller committed
      Commit 73580dac ("parisc: Fix system shutdown halt") introduced an endless
      loop for systems which don't provide a software power off function.  But the
      soft lockup detector will detect this and report stalled CPUs after some time.
      Avoid those unwanted warnings by disabling the soft lockup detector.
      
      Fixes: 73580dac ("parisc: Fix system shutdown halt")
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # 4.9+
    • parisc: Clean up fixup routines for get_user()/put_user() · d19f5e41
      Helge Deller committed
      Al Viro noticed that userspace accesses via get_user()/put_user() can be
      simplified a lot with regard to usage of the exception handling.
      
      This patch implements a fixup routine for get_user() and put_user() such
      that the exception handler will automatically load -EFAULT into the register
      %r8 (the error value) in case of a fault on a userspace access.  Additionally
      the fixup routine will zero the target register on fault in case of a
      get_user() call.  The target register is extracted out of the faulting
      assembly instruction.
      
      This patch brings a few benefits over the old implementation:
      1. Exception handling gets much cleaner, easier and smaller in size.
      2. Helper functions like fixup_get_user_skip_1 (all of fixup.S) can be dropped.
      3. No need to hardcode %r9 as target register for get_user() any longer. This
         helps the compiler register allocator and thus creates fewer assembler
         statements.
      4. No dependency on the exception_data contents any longer.
      5. Nested faults will be handled cleanly.
      Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: <stable@vger.kernel.org> # v4.9+
      Signed-off-by: Helge Deller <deller@gmx.de>
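The fixup behavior described in the parisc get_user()/put_user() changelog above can be sketched as a small model. This is a sketch under assumptions, not the kernel routine: `regs_model` and `fixup_user_access()` are hypothetical names, the target register is assumed to sit in the low five bits of the faulting instruction word, and whether the access was a get_user() is passed in as a flag rather than decoded from the instruction.

```c
#include <stdint.h>

#define EFAULT_MODEL 14  /* value of EFAULT on Linux */

/* Hypothetical, reduced register state at the time of the fault. */
struct regs_model {
    long     gr[32];  /* general registers; gr[8] carries the error value */
    uint32_t iir;     /* faulting instruction word */
};

/* Assumption: the load's target register is in the low five bits. */
static unsigned int target_reg(uint32_t insn)
{
    return insn & 0x1f;
}

static void fixup_user_access(struct regs_model *regs, int is_get_user)
{
    regs->gr[8] = -EFAULT_MODEL;               /* report the fault in %r8 */
    if (is_get_user)
        regs->gr[target_reg(regs->iir)] = 0;   /* zero the would-be result */
}
```

Because the target register is read out of the instruction itself, nothing in this scheme forces get_user() to use a fixed register such as %r9, which is benefit 3 in the list above.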
    • parisc: Fix access fault handling in pa_memcpy() · 554bfece
      Helge Deller committed
      pa_memcpy() is the major memcpy implementation in the parisc kernel which is
      used to do any kind of userspace/kernel memory copies.
      
      Al Viro noticed various bugs in the implementation of pa_memcpy(), most notably
      that in case of faults it may report back to have copied more bytes than it
      actually did.
      
      Fixing those bugs is quite hard in the C-implementation, because the compiler
      is messing around with the registers and we are not guaranteed that specific
      variables are always in the same processor registers. This makes proper fault
      handling complicated.
      
      This patch implements pa_memcpy() in assembler. That way we have correct fault
      handling and adding a 64-bit copy routine was quite easy.
      
      Runtime tested with 32- and 64bit kernels.
      Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: <stable@vger.kernel.org> # v4.9+
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Signed-off-by: Helge Deller <deller@gmx.de>
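The bug class named in the pa_memcpy() changelog above (claiming more bytes were copied than actually were) is easier to see against the contract a user copy routine is expected to honor: return the number of bytes that were not copied. The following is a byte-wise user-space model of that contract only; it is not the parisc assembler routine, and the `readable` parameter is a hypothetical way to simulate a fault at a given offset.

```c
#include <stddef.h>

/*
 * Copy up to len bytes, treating every offset >= readable as a faulting
 * access. Returns the number of bytes NOT copied (0 on full success).
 */
static size_t copy_with_fault_model(char *dst, const char *src,
                                    size_t len, size_t readable)
{
    size_t done = 0;

    while (done < len) {
        if (done >= readable)
            break;            /* simulated fault: stop and report honestly */
        dst[done] = src[done];
        done++;
    }
    return len - done;
}
```

An implementation that copies in larger units has to track exactly how far it got when the fault hit, which is the part the changelog says was hard to guarantee in C.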
  6. 29 Mar, 2017 7 commits
  7. 28 Mar, 2017 2 commits
  8. 27 Mar, 2017 6 commits
  9. 25 Mar, 2017 1 commit
  10. 24 Mar, 2017 2 commits
    • x86/mm/KASLR: Exclude EFI region from KASLR VA space randomization · a46f60d7
      Baoquan He committed
      Currently KASLR is enabled on three regions: the direct mapping of physical
      memory, vmalloc and vmemmap. However, the EFI region is also mistakenly
      included for VA space randomization because of misusing the EFI_VA_START
      macro and assuming EFI_VA_START < EFI_VA_END.
      
      (This breaks kexec and possibly other things that rely on stable addresses.)
      
      The EFI region is reserved for EFI runtime services virtual mapping which
      should not be included in KASLR ranges. In Documentation/x86/x86_64/mm.txt,
      we can see:
      
        ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
      
      EFI uses the space from -4G to -64G, thus EFI_VA_START > EFI_VA_END.
      Here EFI_VA_START = -4G and EFI_VA_END = -64G.
      
      Changing EFI_VA_START to EFI_VA_END in mm/kaslr.c fixes this problem.
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
      Acked-by: Dave Young <dyoung@redhat.com>
      Acked-by: Thomas Garnier <thgarnie@google.com>
      Cc: <stable@vger.kernel.org> #4.8+
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1490331592-31860-1-git-send-email-bhe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
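The ordering of EFI_VA_START and EFI_VA_END in the KASLR changelog above is counterintuitive, so a short check may help. This sketch uses the values given in the changelog (-4G and -64G) as 64-bit two's-complement addresses; the `_MODEL` macros and `kaslr_range_covers_efi()` are hypothetical names, and the randomized range is assumed to start below the EFI region.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/* Values as stated in the changelog, wrapped into unsigned 64-bit space. */
#define EFI_VA_START_MODEL ((uint64_t)0 - 4 * GiB)   /* -4G,  higher address */
#define EFI_VA_END_MODEL   ((uint64_t)0 - 64 * GiB)  /* -64G, lower address  */

/*
 * The EFI region occupies [EFI_VA_END, EFI_VA_START). A randomization
 * range that starts below it and ends at vaddr_end overlaps the region
 * whenever vaddr_end lies above the region's lowest address.
 */
static bool kaslr_range_covers_efi(uint64_t vaddr_end)
{
    return vaddr_end > EFI_VA_END_MODEL;
}
```

Ending the range at the START macro swallows the whole EFI region, while ending it at the END macro stops just short of it, which is the one-word change the changelog describes.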
    • KVM: VMX: Fix enable VPID conditions · 08d839c4
      Wanpeng Li committed
      This can be reproduced by running L2 on L1 with VPID disabled on L0.
      Without commit "KVM: nVMX: Fix nested VPID vmx exec control", L2
      crashes as below:
      
      KVM: entry failed, hardware error 0x7
      EAX=00000000 EBX=00000000 ECX=00000000 EDX=000306c3
      ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
      EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
      ES =0000 00000000 0000ffff 00009300
      CS =f000 ffff0000 0000ffff 00009b00
      SS =0000 00000000 0000ffff 00009300
      DS =0000 00000000 0000ffff 00009300
      FS =0000 00000000 0000ffff 00009300
      GS =0000 00000000 0000ffff 00009300
      LDT=0000 00000000 0000ffff 00008200
      TR =0000 00000000 0000ffff 00008b00
      GDT=     00000000 0000ffff
      IDT=     00000000 0000ffff
      CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
      DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
      DR6=00000000ffff0ff0 DR7=0000000000000400
      EFER=0000000000000000
      
      Reference SDM 30.3 INVVPID:
      
      Protected Mode Exceptions
      - #UD
        - If not in VMX operation.
        - If the logical processor does not support VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=0).
        - If the logical processor supports VPIDs (IA32_VMX_PROCBASED_CTLS2[37]=1) but does
          not support the INVVPID instruction (IA32_VMX_EPT_VPID_CAP[32]=0).
      
      So we should check both VPID enable bit in vmx exec control and INVVPID support bit
      in vmx capability MSRs to enable VPID. This patch adds the guarantee to not enable
      VPID if either INVVPID or single-context/all-context invalidation is not exposed in
      vmx capability MSRs.
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
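The enable condition described in the KVM changelog above can be written as a single predicate over the two capability MSR values. This is a minimal sketch, not the KVM code: the function and macro names are hypothetical, bits 37 and 32 are taken from the SDM excerpt quoted in the changelog, and bits 41 and 42 for single-context and all-context invalidation are an assumption that is not stated in the changelog.

```c
#include <stdbool.h>
#include <stdint.h>

/* From the SDM excerpt quoted in the changelog. */
#define PROCBASED_CTLS2_VPID     (UINT64_C(1) << 37)
#define EPT_VPID_CAP_INVVPID     (UINT64_C(1) << 32)
/* Assumed bit positions for the supported INVVPID types. */
#define EPT_VPID_CAP_SINGLE_CTX  (UINT64_C(1) << 41)
#define EPT_VPID_CAP_ALL_CTX     (UINT64_C(1) << 42)

/*
 * VPID may be enabled only if the VPID control is available, INVVPID
 * is supported, and at least one of single-context or all-context
 * invalidation is exposed.
 */
static bool may_enable_vpid(uint64_t procbased_ctls2, uint64_t ept_vpid_cap)
{
    bool has_vpid    = (procbased_ctls2 & PROCBASED_CTLS2_VPID) != 0;
    bool has_invvpid = (ept_vpid_cap & EPT_VPID_CAP_INVVPID) != 0;
    bool has_type    = (ept_vpid_cap &
                        (EPT_VPID_CAP_SINGLE_CTX | EPT_VPID_CAP_ALL_CTX)) != 0;

    return has_vpid && has_invvpid && has_type;
}
```

Checking only the VPID control bit corresponds to the pre-fix behavior, where a guest hypervisor could end up executing INVVPID on a configuration that raises #UD.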