1. 22 7月, 2016 1 次提交
  2. 21 7月, 2016 1 次提交
    • I
      x86/boot: Reorganize and clean up the BIOS area reservation code · edce2121
      Ingo Molnar 提交于
      So the reserve_ebda_region() code has accumulated a number of
      problems over the years that make it really difficult to read
      and understand:
      
      - The calculation of 'lowmem' and 'ebda_addr' is an unnecessarily
        interleaved mess of first lowmem, then ebda_addr, then lowmem tweaks...
      
      - 'lowmem' here means 'super low mem' - i.e. 16-bit addressable memory. In other
        parts of the x86 code 'lowmem' means 32-bit addressable memory... This makes it
        super confusing to read.
      
      - It does not help at all that we have various memory range markers, half of which
        are 'start of range', half of which are 'end of range' - but this crucial
        property is not obvious in the naming at all ... gave me a headache trying to
        understand all this.
      
      - Also, the 'ebda_addr' name sucks: it highlights that it's an address (which is
        obvious, all values here are addresses!), while it does not highlight that it's
        the _start_ of the EBDA region ...
      
      - 'BIOS_LOWMEM_KILOBYTES' says a lot of things, except that this is the only value
        that is a pointer to a value, not a memory range address!
      
      - The function name itself is a misnomer: it says 'reserve_ebda_region()' while
        its main purpose is to reserve all the firmware ROM typically between 640K and
        1MB, while the 'EBDA' part is only a small part of that ...
      
      - Likewise, the paravirt quirk flag name 'ebda_search' is misleading as well: this
        too should be about whether to reserve firmware areas in the paravirt case.
      
      - In fact thinking about this as 'end of RAM' is confusing: what this function
        *really* wants to reserve is firmware data and code areas! Once the thinking is
        inverted from a mixed 'ram' and 'reserved firmware area' notion to a pure
        'reserved area' notion everything becomes a lot clearer.
      
      To improve all this rewrite the whole code (without changing the logic):
      
      - Firstly invert the naming from 'lowmem end' to 'BIOS reserved area start'
        and propagate this concept through all the variable names and constants.
      
      	BIOS_RAM_SIZE_KB_PTR		// was: BIOS_LOWMEM_KILOBYTES
      
      	BIOS_START_MIN			// was: INSANE_CUTOFF
      
      	ebda_start			// was: ebda_addr
      	bios_start			// was: lowmem
      
      	BIOS_START_MAX			// was: LOWMEM_CAP
      
      - Then clean up the name of the function itself by renaming it
        to reserve_bios_regions() and renaming the ::ebda_search paravirt
        flag to ::reserve_bios_regions.
      
      - Fix up all the comments (fix typos), harmonize and simplify their
        formulation and remove comments that become unnecessary due to
        the much better naming all around.
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      edce2121
  3. 10 7月, 2016 1 次提交
    • T
      x86/mm: Do not reference phys addr beyond kernel · 4ff53087
      Thomas Garnier 提交于
      The new physical address randomized KASLR implementation can cause the
      kernel to be aligned close to the end of physical memory. In this case,
      _brk_end aligned to PMD will go beyond what is expected safe and hit
      the assert in __phys_addr_symbol():
      
      	VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
      
      Instead, perform an inclusive range check to avoid incorrectly triggering
      the assert:
      
      	kernel BUG at arch/x86/mm/physaddr.c:38!
      	invalid opcode: 0000 [#1] SMP
      	...
      	RIP: 0010:[<ffffffffbe055721>] __phys_addr_symbol+0x41/0x50
      	...
      	Call Trace:
      	[<ffffffffbe052eb9>] cpa_process_alias+0xa9/0x210
      	[<ffffffffbe109011>] ? do_raw_spin_unlock+0xc1/0x100
      	[<ffffffffbe051eef>] __change_page_attr_set_clr+0x8cf/0xbd0
      	[<ffffffffbe201a4d>] ? vm_unmap_aliases+0x7d/0x210
      	[<ffffffffbe05237c>] change_page_attr_set_clr+0x18c/0x4e0
      	[<ffffffffbe0534ec>] set_memory_4k+0x2c/0x40
      	[<ffffffffbefb08b3>] check_bugs+0x28/0x2a
      	[<ffffffffbefa4f52>] start_kernel+0x49d/0x4b9
      	[<ffffffffbefa4120>] ? early_idt_handler_array+0x120/0x120
      	[<ffffffffbefa4423>] x86_64_start_reservations+0x29/0x2b
      	[<ffffffffbefa4568>] x86_64_start_kernel+0x143/0x152
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sai Praneeth <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Link: http://lkml.kernel.org/r/20160615190545.GA26071@www.outflux.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4ff53087
  4. 08 7月, 2016 14 次提交
    • T
      x86/mm: Add memory hotplug support for KASLR memory randomization · 90397a41
      Thomas Garnier 提交于
      Add a new option (CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING) to define
      the padding used for the physical memory mapping section when KASLR
      memory is enabled. It ensures there is enough virtual address space when
      CONFIG_MEMORY_HOTPLUG is used. The default value is 10 terabytes. If
      CONFIG_MEMORY_HOTPLUG is not used, no space is reserved increasing the
      entropy available.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-10-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      90397a41
    • T
      x86/mm: Enable KASLR for vmalloc memory regions · a95ae27c
      Thomas Garnier 提交于
      Add vmalloc to the list of randomized memory regions.
      
      The vmalloc memory region contains the allocation made through the vmalloc()
      API. The allocations are done sequentially to prevent fragmentation and
      each allocation address can easily be deduced especially from boot.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-8-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a95ae27c
    • T
      x86/mm: Enable KASLR for physical mapping memory regions · 021182e5
      Thomas Garnier 提交于
      Add the physical mapping in the list of randomized memory regions.
      
      The physical memory mapping holds most allocations from boot and heap
      allocators. Knowing the base address and physical memory size, an attacker
      can deduce the PDE virtual address for the vDSO memory page. This attack
      was demonstrated at CanSecWest 2016, in the following presentation:
      
        "Getting Physical: Extreme Abuse of Intel Based Paged Systems":
        https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/blob/master/Presentation/CanSec2016_Presentation.pdf
      
      (See second part of the presentation).
      
      The exploits used against Linux worked successfully against 4.6+ but
      fail with KASLR memory enabled:
      
        https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/tree/master/Demos/Linux/exploits
      
      Similar research was done at Google leading to this patch proposal.
      
      Variants exists to overwrite /proc or /sys objects ACLs leading to
      elevation of privileges. These variants were tested against 4.6+.
      
      The page offset used by the compressed kernel retains the static value
      since it is not yet randomized during this boot stage.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-7-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      021182e5
    • T
      x86/mm: Implement ASLR for kernel memory regions · 0483e1fa
      Thomas Garnier 提交于
      Randomizes the virtual address space of kernel memory regions for
      x86_64. This first patch adds the infrastructure and does not randomize
      any region. The following patches will randomize the physical memory
      mapping, vmalloc and vmemmap regions.
      
      This security feature mitigates exploits relying on predictable kernel
      addresses. These addresses can be used to disclose the kernel modules
      base addresses or corrupt specific structures to elevate privileges
      bypassing the current implementation of KASLR. This feature can be
      enabled with the CONFIG_RANDOMIZE_MEMORY option.
      
      The order of each memory region is not changed. The feature looks at the
      available space for the regions based on different configuration options
      and randomizes the base and space between each. The size of the physical
      memory mapping is the available physical memory. No performance impact
      was detected while testing the feature.
      
      Entropy is generated using the KASLR early boot functions now shared in
      the lib directory (originally written by Kees Cook). Randomization is
      done on PGD & PUD page table levels to increase possible addresses. The
      physical memory mapping code was adapted to support PUD level virtual
      addresses. This implementation on the best configuration provides 30,000
      possible virtual addresses in average for each memory region.  An
      additional low memory page is used to ensure each CPU can start with a
      PGD aligned virtual address (for realmode).
      
      x86/dump_pagetable was updated to correctly display each region.
      
      Updated documentation on x86_64 memory layout accordingly.
      
      Performance data, after all patches in the series:
      
      Kernbench shows almost no difference (-+ less than 1%):
      
      Before:
      
      Average Optimal load -j 12 Run (std deviation): Elapsed Time 102.63 (1.2695)
      User Time 1034.89 (1.18115) System Time 87.056 (0.456416) Percent CPU 1092.9
      (13.892) Context Switches 199805 (3455.33) Sleeps 97907.8 (900.636)
      
      After:
      
      Average Optimal load -j 12 Run (std deviation): Elapsed Time 102.489 (1.10636)
      User Time 1034.86 (1.36053) System Time 87.764 (0.49345) Percent CPU 1095
      (12.7715) Context Switches 199036 (4298.1) Sleeps 97681.6 (1031.11)
      
      Hackbench shows 0% difference on average (hackbench 90 repeated 10 times):
      
      attemp,before,after 1,0.076,0.069 2,0.072,0.069 3,0.066,0.066 4,0.066,0.068
      5,0.066,0.067 6,0.066,0.069 7,0.067,0.066 8,0.063,0.067 9,0.067,0.065
      10,0.068,0.071 average,0.0677,0.0677
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-6-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0483e1fa
    • T
      x86/mm: Separate variable for trampoline PGD · b234e8a0
      Thomas Garnier 提交于
      Use a separate global variable to define the trampoline PGD used to
      start other processors. This change will allow KALSR memory
      randomization to change the trampoline PGD to be correctly aligned with
      physical memory.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-5-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b234e8a0
    • T
      x86/mm: Add PUD VA support for physical mapping · faa37933
      Thomas Garnier 提交于
      Minor change that allows early boot physical mapping of PUD level virtual
      addresses. The current implementation expects the virtual address to be
      PUD aligned. For KASLR memory randomization, we need to be able to
      randomize the offset used on the PUD table.
      
      It has no impact on current usage.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-4-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      faa37933
    • T
      x86/mm: Update physical mapping variable names · 59b3d020
      Thomas Garnier 提交于
      Change the variable names in kernel_physical_mapping_init() and related
      functions to correctly reflect physical and virtual memory addresses.
      Also add comments on each function to describe usage and alignment
      constraints.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-3-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      59b3d020
    • T
      x86/mm: Refactor KASLR entropy functions · d899a7d1
      Thomas Garnier 提交于
      Move the KASLR entropy functions into arch/x86/lib to be used in early
      kernel boot for KASLR memory randomization.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Alexander Popov <alpopov@ptsecurity.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-doc@vger.kernel.org
      Link: http://lkml.kernel.org/r/1466556426-32664-2-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d899a7d1
    • I
      9e7f7f54
    • B
      x86/KASLR: Fix boot crash with certain memory configurations · 6daa2ec0
      Baoquan He 提交于
      Ye Xiaolong reported this boot crash:
      
      |
      |  XZ-compressed data is corrupt
      |
      |   -- System halted
      |
      
      Fix the bug in mem_avoid_overlap() of finding the earliest overlap.
      Reported-and-tested-by: NYe Xiaolong <xiaolong.ye@intel.com>
      Signed-off-by: NBaoquan He <bhe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6daa2ec0
    • D
      selftests/x86: Add vDSO mremap() test · f80fd3a5
      Dmitry Safonov 提交于
      Should print this on vDSO remapping success (on new kernels):
      
       [root@localhost ~]# ./test_mremap_vdso_32
      	AT_SYSINFO_EHDR is 0xf773f000
       [NOTE]	Moving vDSO: [f773f000, f7740000] -> [a000000, a001000]
       [OK]
      
      Or print that mremap() for vDSOs is unsupported:
      
       [root@localhost ~]# ./test_mremap_vdso_32
      	AT_SYSINFO_EHDR is 0xf773c000
       [NOTE]	Moving vDSO: [0xf773c000, 0xf773d000] -> [0xf7737000, 0xf7738000]
       [FAIL]	mremap() of the vDSO does not work on this kernel!
      Suggested-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Acked-by: NAndy Lutomirski <luto@kernel.org>
      Cc: 0x7f454c46@gmail.com
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kselftest@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160628113539.13606-3-dsafonov@virtuozzo.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f80fd3a5
    • D
      x86/vdso: Add mremap hook to vm_special_mapping · b059a453
      Dmitry Safonov 提交于
      Add possibility for 32-bit user-space applications to move
      the vDSO mapping.
      
      Previously, when a user-space app called mremap() for the vDSO
      address, in the syscall return path it would land on the previous
      address of the vDSOpage, resulting in segmentation violation.
      
      Now it lands fine and returns to userspace with a remapped vDSO.
      
      This will also fix the context.vdso pointer for 64-bit, which does
      not affect the user of vDSO after mremap() currently, but this
      may change in the future.
      
      As suggested by Andy, return -EINVAL for mremap() that would
      split the vDSO image: that operation cannot possibly result in
      a working system so reject it.
      
      Renamed and moved the text_mapping structure declaration inside
      map_vdso(), as it used only there and now it complements the
      vvar_mapping variable.
      
      There is still a problem for remapping the vDSO in glibc
      applications: the linker relocates addresses for syscalls
      on the vDSO page, so you need to relink with the new
      addresses.
      
      Without that the next syscall through glibc may fail:
      
        Program received signal SIGSEGV, Segmentation fault.
        #0  0xf7fd9b80 in __kernel_vsyscall ()
        #1  0xf7ec8238 in _exit () from /usr/lib32/libc.so.6
      Signed-off-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Acked-by: NAndy Lutomirski <luto@kernel.org>
      Cc: 0x7f454c46@gmail.com
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160628113539.13606-2-dsafonov@virtuozzo.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b059a453
    • J
      x86/mm/pat, /dev/mem: Remove superfluous error message · 39380b80
      Jiri Kosina 提交于
      Currently it's possible for broken (or malicious) userspace to flood a
      kernel log indefinitely with messages a-la
      
      	Program dmidecode tried to access /dev/mem between f0000->100000
      
      because range_is_allowed() is case of CONFIG_STRICT_DEVMEM being turned on
      dumps this information each and every time devmem_is_allowed() fails.
      
      Reportedly userspace that is able to trigger contignuous flow of these
      messages exists.
      
      It would be possible to rate limit this message, but that'd have a
      questionable value; the administrator wouldn't get information about all
      the failing accessess, so then the information would be both superfluous
      and incomplete at the same time :)
      
      Returning EPERM (which is what is actually happening) is enough indication
      for userspace what has happened; no need to log this particular error as
      some sort of special condition.
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1607081137020.24757@cbobk.fhfr.pmSigned-off-by: NIngo Molnar <mingo@kernel.org>
      39380b80
    • I
  5. 04 7月, 2016 3 次提交
  6. 03 7月, 2016 7 次提交
    • V
      ovl: warn instead of error if d_type is not supported · e7c0b599
      Vivek Goyal 提交于
      overlay needs underlying fs to support d_type. Recently I put in a
      patch in to detect this condition and started failing mount if
      underlying fs did not support d_type.
      
      But this breaks existing configurations over kernel upgrade. Those who
      are running docker (partially broken configuration) with xfs not
      supporting d_type, are surprised that after kernel upgrade docker does
      not run anymore.
      
      https://github.com/docker/docker/issues/22937#issuecomment-229881315
      
      So instead of erroring out, detect broken configuration and warn
      about it. This should allow existing docker setups to continue
      working after kernel upgrade.
      Signed-off-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 45aebeaf ("ovl: Ensure upper filesystem supports d_type")
      Cc: <stable@vger.kernel.org> 4.6
      e7c0b599
    • L
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 4f302921
      Linus Torvalds 提交于
      Pull MIPS fix from Ralf Baechle:
       "Only a single fix for 4.7 pending at this point.  It fixes an issue
        that may lead to corruption of the cache mode bits in the page table"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: Fix possible corruption of cache mode by mprotect.
      4f302921
    • L
      Merge tag 'powerpc-4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 70bd68d7
      Linus Torvalds 提交于
      Pull powerpc fixes from Michael Ellerman:
      
       - tm: Always reclaim in start_thread() for exec() class syscalls from
         Cyril Bur
      
       - tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 from Michael
         Neuling
      
       - eeh: Fix wrong argument passed to eeh_rmv_device() from Gavin Shan
      
       - Initialise pci_io_base as early as possible from Darren Stevens
      
      * tag 'powerpc-4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: Initialise pci_io_base as early as possible
        powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
        powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()
        powerpc/tm: Always reclaim in start_thread() for exec() class syscalls
      70bd68d7
    • L
      Merge tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~airlied/linux · 99b0f54e
      Linus Torvalds 提交于
      Pull drm fixes frlm Dave Airlie:
       "Just some AMD and Intel fixes, the AMD ones are further production
        Polaris fixes, and the Intel ones fix some early timeouts, some PCI ID
        changes and a couple of other fixes.
      
        Still a bit Internet challenged here, hopefully end of next week will
        solve it"
      
      * tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: Fix missing unlock on error in i915_ppgtt_info()
        drm/amd/powerplay: workaround for UVD clock issue
        drm/amdgpu: add ACLK_CNTL setting for polaris10
        drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.
        drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.
        drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
        drm/i915: Add more Kabylake PCI IDs.
        drm/i915: Avoid early timeout during AUX transfers
        drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
        drm/i915/lpt: Avoid early timeout during FDI PHY reset
        drm/i915/bxt: Avoid early timeout during PLL enable
        drm/i915: Refresh cached DP port register value on resume
        drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation
        drm/amd/powerplay: disable FFC.
        drm/amd/powerplay: add some definition for FFC feature on polaris.
      99b0f54e
    • L
      Merge tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 467ce769
      Linus Torvalds 提交于
      Pull spi fixes from Mark Brown:
       "A few small driver-specific fixes for SPI, all in the normal important
        if you hit them category especially the rockchip driver fix which
        addresses a race which has been exposed more frequently with some
        recent performance improvements"
      
      * tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: sunxi: fix transfer timeout
        spi: sun4i: fix FIFO limit
        spi: rockchip: Signal unfinished DMA transfers
        spi: spi-ti-qspi: Suspend the queue before removing the device
      467ce769
    • L
      Merge tag 'regulator-fix-v4.7-rc5' of... · a2b0db5b
      Linus Torvalds 提交于
      Merge tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fixes from Mark Brown:
       "Two small fixes for the regulator subsystem - one fixing a crash with
        one of the devices supported by the max77620 driver, another fixing
        startup for the anatop regulator when it starts up with the regulator
        in bypass mode"
      
      * tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: max77620: check for valid regulator info
        regulator: anatop: allow regulator to be in bypass mode
      a2b0db5b
    • L
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 44385120
      Linus Torvalds 提交于
      Pull clk fixes from Stephen Boyd:
       "A small fix for the newly added oxnas clk driver and a handful of
        rockchip clk driver fixes for newly added rk3399 support"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: Fix return value check in oxnas_stdclk_probe()
        clk: rockchip: release io resource when failing to init clk on rk3399
        clk: rockchip: fix cpuclk registration error handling
        clk: rockchip: Revert "clk: rockchip: reset init state before mmc card initialization"
        clk: rockchip: fix incorrect parent for rk3399's {c,g}pll_aclk_perihp_src
        clk: rockchip: mark rk3399 GIC clocks as critical
        clk: rockchip: initialize flags of clk_init_data in mmc-phase clock
      44385120
  7. 02 7月, 2016 13 次提交
    • D
      Merge tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel into drm-fixes · 88c08710
      Dave Airlie 提交于
      here's a batch of i915 fixes for 4.7.
      
      * tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel:
        drm/i915: Fix missing unlock on error in i915_ppgtt_info()
        drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
        drm/i915: Add more Kabylake PCI IDs.
        drm/i915: Avoid early timeout during AUX transfers
        drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
        drm/i915/lpt: Avoid early timeout during FDI PHY reset
        drm/i915/bxt: Avoid early timeout during PLL enable
        drm/i915: Refresh cached DP port register value on resume
      88c08710
    • D
      Merge branch 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 40793e85
      Dave Airlie 提交于
      Just a few more late fixes for Polaris cards.
      
      * 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux:
        drm/amd/powerplay: workaround for UVD clock issue
        drm/amdgpu: add ACLK_CNTL setting for polaris10
        drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.
        drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.
        drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation
        drm/amd/powerplay: disable FFC.
        drm/amd/powerplay: add some definition for FFC feature on polaris.
      40793e85
    • R
      MIPS: Fix possible corruption of cache mode by mprotect. · 6d037de9
      Ralf Baechle 提交于
      The following testcase may result in a page table entries with a invalid
      CCA field being generated:
      
      static void *bindstack;
      
      static int sysrqfd;
      
      static void protect_low(int protect)
      {
      	mprotect(bindstack, BINDSTACK_SIZE, protect);
      }
      
      static void sigbus_handler(int signal, siginfo_t * info, void *context)
      {
      	void *addr = info->si_addr;
      
      	write(sysrqfd, "x", 1);
      
      	printf("sigbus, fault address %p (should not happen, but might)\n",
      	       addr);
      	abort();
      }
      
      static void run_bind_test(void)
      {
      	unsigned int *p = bindstack;
      
      	p[0] = 0xf001f001;
      
      	write(sysrqfd, "x", 1);
      
      	/* Set trap on access to p[0] */
      	protect_low(PROT_NONE);
      
      	write(sysrqfd, "x", 1);
      
      	/* Clear trap on access to p[0] */
      	protect_low(PROT_READ | PROT_WRITE | PROT_EXEC);
      
      	write(sysrqfd, "x", 1);
      
      	/* Check the contents of p[0] */
      	if (p[0] != 0xf001f001) {
      		write(sysrqfd, "x", 1);
      
      		/* Reached, but shouldn't be */
      		printf("badness, shouldn't happen but does\n");
      		abort();
      	}
      }
      
      int main(void)
      {
      	struct sigaction sa;
      
      	sysrqfd = open("/proc/sysrq-trigger", O_WRONLY);
      
      	if (sigprocmask(SIG_BLOCK, NULL, &sa.sa_mask)) {
      		perror("sigprocmask");
      		return 0;
      	}
      
      	sa.sa_sigaction = sigbus_handler;
      	sa.sa_flags = SA_SIGINFO | SA_NODEFER | SA_RESTART;
      	if (sigaction(SIGBUS, &sa, NULL)) {
      		perror("sigaction");
      		return 0;
      	}
      
      	bindstack = mmap(NULL,
      			 BINDSTACK_SIZE,
      			 PROT_READ | PROT_WRITE | PROT_EXEC,
      			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      	if (bindstack == MAP_FAILED) {
      		perror("mmap bindstack");
      		return 0;
      	}
      
      	printf("bindstack: %p\n", bindstack);
      
      	run_bind_test();
      
      	printf("done\n");
      
      	return 0;
      }
      
      There are multiple ingredients for this:
      
       1) PAGE_NONE is defined to _CACHE_CACHABLE_NONCOHERENT, which is CCA 3
          on all platforms except SB1 where it's CCA 5.
       2) _page_cachable_default must have bits set which are not set
          _CACHE_CACHABLE_NONCOHERENT.
       3) Either the defective version of pte_modify for XPA or the standard
          version must be in used.  However pte_modify for the 36 bit address
          space support is no affected.
      
      In that case additional bits in the final CCA mode may generate an invalid
      value for the CCA field.  On the R10000 system where this was tracked
      down for example a CCA 7 has been observed, which is Uncached Accelerated.
      
      Fixed by:
      
       1) Using the proper CCA mode for PAGE_NONE just like for all the other
          PAGE_* pte/pmd bits.
       2) Fix the two affected variants of pte_modify.
      
      Further code inspection also shows the same issue to exist in pmd_modify
      which would affect huge page systems.
      
      Issue in pte_modify tracked down by Alastair Bridgewater, PAGE_NONE
      and pmd_modify issue found by me.
      
      The history of this goes back beyond Linus' git history.  Chris Dearman's
      commit 35133692 ("[MIPS] Allow setting of
      the cache attribute at run time.") missed the opportunity to fix this
      but it was originally introduced in lmo commit
      d523832cf12007b3242e50bb77d0c9e63e0b6518 ("Missing from last commit.")
      and 32cc38229ac7538f2346918a09e75413e8861f87 ("New configuration option
      CONFIG_MIPS_UNCACHED.")
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      Reported-by: NAlastair Bridgewater <alastair.bridgewater@gmail.com>
      6d037de9
    • L
      Merge tag 'acpi-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · dbdc3bb7
      Linus Torvalds 提交于
      Pull ACPI fix from Rafael Wysocki:
       "Fix an expression in the ACPI PCI IRQ management code added by a
        recent commit that overlooked missing parens in it, so the result of
        the computation is incorrect in some cases (Sinan Kaya)"
      
      * tag 'acpi-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI,PCI,IRQ: correct operator precedence
      dbdc3bb7
    • L
      Merge tag 'pm-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 81dbd6f5
      Linus Torvalds 提交于
      Pull power management fixes from Rafael Wysocki:
       "Three cpufreq fixes, one in the core (stable-candidate) and two in
        drivers (intel_pstate and cpufreq-dt).
      
        Specifics:
      
         - Fix a recent intel_pstate regression that caused the number of
           wakeups to increase significantly on an idle system in some cases
           due to excessive synchronize_sched() invocations (Rafael Wysocki).
      
         - Fix unnecessary invocations of WARN_ON() in the cpufreq core after
           cpufreq has been suspended introduced during the 4.6 cycla (Rafael
           Wysocki).
      
         - Fix an error code path in the cpufreq-dt-platdev driver that
           forgets to drop a reference to a DT node (Masahiro Yamada)"
      
      * tag 'pm-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: Avoid false-positive WARN_ON()s in cpufreq_update_policy()
        cpufreq: dt: call of_node_put() before error out
        intel_pstate: Do not clear utilization update hooks on policy changes
      81dbd6f5
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 48c4565e
      Linus Torvalds 提交于
      Pull vfs fixes from Al Viro:
       "Tmpfs readdir throughput regression fix (this cycle) + some -stable
        fodder all over the place.
      
        One missing bit is Miklos' tonight locks.c fix - NFS folks had already
        grabbed that one by the time I woke up ;-)"
      
      [ The locks.c fix came through the nfsd tree just moments ago ]
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        namespace: update event counter when umounting a deleted dentry
        9p: use file_dentry()
        ceph: fix d_obtain_alias() misuses
        lockless next_positive()
        libfs.c: new helper - next_positive()
        dcache_{readdir,dir_lseek}(): don't bother with nested ->d_lock
      48c4565e
    • L
      Merge tag 'nfsd-4.7-3' of git://linux-nfs.org/~bfields/linux · 2728c57f
      Linus Torvalds 提交于
      Pull lockd/locks fixes from Bruce Fields:
       "One fix for lockd soft lookups in an error path, and one fix for file
        leases on overlayfs"
      
      * tag 'nfsd-4.7-3' of git://linux-nfs.org/~bfields/linux:
        locks: use file_inode()
        lockd: unregister notifier blocks if the service fails to come up completely
      2728c57f
    • L
      Merge tag 'mfd-fixes-4.7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 0d064a7b
      Linus Torvalds 提交于
      Pull more MFD fixes from Lee Jones:
       "Apologies for missing these from the first pull request.
      
        Final patches fixing Reset API change"
      
      * tag 'mfd-fixes-4.7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
        usb: dwc3: st: Use explicit reset_control_get_exclusive() API
        phy: phy-stih407-usb: Use explicit reset_control_get_exclusive() API
        phy: miphy28lp: Inform the reset framework that our reset line may be shared
      0d064a7b
    • L
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · f3683ccd
      Linus Torvalds 提交于
      Pull libnvdimm fixes from Dan Williams:
       "1/ Two regression fixes since v4.6: one for the byte order of a sysfs
           attribute (bz121161) and another for QEMU 2.6's NVDIMM _DSM (ACPI
           Device Specific Method) implementation that gets tripped up by new
           auto-probing behavior in the NFIT driver.
      
        2/ A fix tagged for -stable that stops the kernel from
           clobbering/ignoring changes to the configuration of a 'pfn'
           instance ("struct page" driver).  For example changing the
           alignment from 2M to 1G may silently revert to 2M if that value is
           currently stored on media.
      
        3/ A fix from Eric for an xfstests failure in dax.  It is not
           currently tagged for -stable since it requires an 8-exabyte file
           system to trigger, and there appear to be no user visible side
           effects"
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        nfit: fix format interface code byte order
        dax: fix offset overflow in dax_io
        acpi, nfit: fix acpi_check_dsm() vs zero functions implemented
        libnvdimm, pfn, dax: fix initialization vs autodetect for mode + alignment
      f3683ccd
    • L
      Merge tag 'staging-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 6e5c4f13
      Linus Torvalds 提交于
      Pull staging and IIO fixes from Greg KH:
       "Here are a few small staging and iio driver fixes for 4.7-rc6.
      
        Nothing major here, just a number of small fixes, all have been in
        linux-next for a while, and the full details are in the shortlog"
      
      * tag 'staging-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        iio:ad7266: Fix probe deferral for vref
        iio:ad7266: Fix support for optional regulators
        iio:ad7266: Fix broken regulator error handling
        iio: accel: kxsd9: fix the usage of spi_w8r8()
        staging: iio: accel: fix error check
        staging: iio: ad5933: fix order of cycle conditions
        staging: iio: fix ad7606_spi regression
        iio: inv_mpu6050: Fix use-after-free in ACPI code
      6e5c4f13
    • L
      Merge tag 'tty-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 756c0aec
      Linus Torvalds 提交于
      Pull tty fixes from Greg KH:
       "Here are two tty fixes for some reported issues.  One resolves a crash
        in devpts, and the other resolves a problem with the fbcon cursor
        blink causing lockups.
      
        Both have been in linux-next with no reported problems"
      
      * tag 'tty-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        devpts: fix null pointer dereference on failed memory allocation
        tty: vt: Fix soft lockup in fbcon cursor blink timer.
      756c0aec
    • L
      Merge tag 'usb-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 0232b23d
      Linus Torvalds 提交于
      Pull USB and PHY fixes from Greg KH:
       "Here are a number of small USB and PHY driver fixes for 4.7-rc6.
      
        Nothing major here, all are described in the shortlog below.  All have
        been in linux-next with no reported issues"
      
      * tag 'usb-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        USB: don't free bandwidth_mutex too early
        USB: EHCI: declare hostpc register as zero-length array
        phy-sun4i-usb: Fix irq free conditions to match request conditions
        phy: bcm-ns-usb2: checking the wrong variable
        phy-sun4i-usb: fix missing __iomem *
        phy: phy-sun4i-usb: Fix optional gpios failing probe
        phy: rockchip-dp: fix return value check in rockchip_dp_phy_probe()
        phy: rcar-gen3-usb2: fix unexpected repeat interrupts of VBUS change
        usb: common: otg-fsm: add license to usb-otg-fsm
      0232b23d
    • L
      Merge tag 'iommu-fixes-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · aa7a6c8e
      Linus Torvalds 提交于
      Pull IOMMU fixes from Joerg Roedel:
       "Three fixes:
      
         - Fix use of smp_processor_id() in preemptible code in the IOVA
           allocation code.  This got introduced with the scalability
           improvements in this release cycle.
      
         - A VT-d fix for out-of-bounds access of the iommu->domains array.
           The bug showed during suspend/resume.
      
         - AMD IOMMU fix to print the correct device id in the ACPI parsing
           code"
      
      * tag 'iommu-fixes-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/amd: Initialize devid variable before using it
        iommu/vt-d: Fix overflow of iommu->domains array
        iommu/iova: Disable preemption around use of this_cpu_ptr()
      aa7a6c8e