1. 16 5月, 2016 1 次提交
  2. 13 5月, 2016 1 次提交
  3. 12 5月, 2016 13 次提交
  4. 11 5月, 2016 2 次提交
  5. 10 5月, 2016 8 次提交
    • K
      x86/KASLR: Clarify purpose of each get_random_long() · d2d3462f
      Kees Cook 提交于
      KASLR will be calling get_random_long() twice, but the debug output
      won't distinguishing between them. This patch adds a report on when it
      is fetching the physical vs virtual address. With this, once the virtual
      offset is separate, the report changes from:
      
       KASLR using RDTSC...
       KASLR using RDTSC...
      
      into:
      
       Physical KASLR using RDTSC...
       Virtual KASLR using RDTSC...
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462825332-10505-7-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      d2d3462f
    • B
      x86/KASLR: Add virtual address choosing function · 071a7493
      Baoquan He 提交于
      To support randomizing the kernel virtual address separately from the
      physical address, this patch adds find_random_virt_addr() to choose
      a slot anywhere between LOAD_PHYSICAL_ADDR and KERNEL_IMAGE_SIZE.
      Since this address is virtual, not physical, we can place the kernel
      anywhere in this region, as long as it is aligned and (in the case of
      kernel being larger than the slot size) placed with enough room to load
      the entire kernel image.
      
      For clarity and readability, find_random_addr() is renamed to
      find_random_phys_addr() and has "size" renamed to "image_size" to match
      find_random_virt_addr().
      Signed-off-by: NBaoquan He <bhe@redhat.com>
      [ Rewrote changelog, refactored slot calculation for readability. ]
      [ Renamed find_random_phys_addr() and size argument. ]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462825332-10505-6-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      071a7493
    • K
      x86/KASLR: Return earliest overlap when avoiding regions · 06486d6c
      Kees Cook 提交于
      In preparation for being able to detect where to split up contiguous
      memory regions that overlap with memory regions to avoid, we need to
      pass back what the earliest overlapping region was. This modifies the
      overlap checker to return that information.
      
      Based on a separate mem_min_overlap() implementation by Baoquan He.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462825332-10505-5-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      06486d6c
    • B
      x86/KASLR: Add 'struct slot_area' to manage random_addr slots · c401cf15
      Baoquan He 提交于
      In order to support KASLR moving the kernel anywhere in physical memory
      (which could be up to 64TB), we need to handle counting the potential
      randomization locations in a more efficient manner.
      
      In the worst case with 64TB, there could be roughly 32 * 1024 * 1024
      randomization slots if CONFIG_PHYSICAL_ALIGN is 0x1000000. Currently
      the starting address of candidate positions is stored into the slots[]
      array, one at a time. This method would cost too much memory and it's
      also very inefficient to get and save the slot information into the slot
      array one by one.
      
      This patch introduces 'struct slot_area' to manage each contiguous region
      of randomization slots. Each slot_area will contain the starting address
      and how many available slots are in this area. As with the original code,
      the slot_areas[] will avoid the mem_avoid[] regions.
      
      Since setup_data is a linked list, it could contain an unknown number
      of memory regions to be avoided, which could cause us to fragment
      the contiguous memory that the slot_area array is tracking. In normal
      operation this level of fragmentation will be extremely rare, but we
      choose a suitably large value (100) for the array. If setup_data forces
      the slot_area array to become highly fragmented and there are more
      slots available beyond the first 100 found, the rest will be ignored
      for KASLR selection.
      
      The function store_slot_info() is used to calculate the number of slots
      available in the passed-in memory region and stores it into slot_areas[]
      after adjusting for alignment and size requirements.
      Signed-off-by: NBaoquan He <bhe@redhat.com>
      [ Rewrote changelog, squashed with new functions. ]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462825332-10505-4-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c401cf15
    • K
      x86/boot: Add missing file header comments · cb18ef0d
      Kees Cook 提交于
      There were some files with missing header comments. Since they are
      included from both compressed and regular kernels, make note of that.
      Also corrects a typo in the mem_avoid comments.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462825332-10505-3-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cb18ef0d
    • K
      x86/KASLR: Initialize mapping_info every time · 434a6c9f
      Kees Cook 提交于
      As it turns out, mapping_info DOES need to be initialized every
      time, because pgt_data address could be changed during kernel
      relocation. So it can not be build time assigned.
      
      Without this, page tables were not being corrected updated, which
      could cause reboots when a physical address beyond 2G was chosen.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462825332-10505-2-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      434a6c9f
    • B
      x86/boot: Comment what finalize_identity_maps() does · 36a39ac9
      Borislav Petkov 提交于
      So it is not really obvious that finalize_identity_maps() doesn't do any
      finalization but it *actually* writes CR3 with the ident PGD. Comment
      that at the call site.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: bhe@redhat.com
      Cc: dyoung@redhat.com
      Cc: jkosina@suse.cz
      Cc: linux-tip-commits@vger.kernel.org
      Cc: luto@kernel.org
      Cc: vgoyal@redhat.com
      Cc: yinghai@kernel.org
      Link: http://lkml.kernel.org/r/20160507100541.GA24613@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      36a39ac9
    • T
      x86/topology: Set x86_max_cores to 1 for CONFIG_SMP=n · 8d415ee2
      Thomas Gleixner 提交于
      Josef reported that the uncore driver trips over with CONFIG_SMP=n because
      x86_max_cores is 16 instead of 12.
      
      The reason is, that for SMP=n the extended topology detection is a NOOP and
      the cache leaf is used to determine the number of cores. That's wrong in two
      aspects:
      
      1) The cache leaf enumerates the maximum addressable number of cores in the
         package, which is obviously not correct
      
      2) UP has no business with topology bits at all.
      
      Make intel_num_cpu_cores() return 1 for CONFIG_SMP=n
      Reported-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-team <Kernel-team@fb.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/761b4a2a-0332-7954-f030-c6639f949612@fb.com
      8d415ee2
  6. 07 5月, 2016 6 次提交
    • T
      x86/topology: Handle CPUID bogosity gracefully · 56402d63
      Thomas Gleixner 提交于
      Joseph reported that a XEN guest dies with a division by 0 in the package
      topology setup code. This happens if cpu_info.x86_max_cores is zero.
      
      Handle that case and emit a warning. This does not fix the underlying XEN bug,
      but makes the code more robust.
      Reported-and-tested-by: NJoseph Salisbury <joseph.salisbury@canonical.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1605062046270.3540@nanosSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      56402d63
    • K
      x86/KASLR: Build identity mappings on demand · 3a94707d
      Kees Cook 提交于
      Currently KASLR only supports relocation in a small physical range (from
      16M to 1G), due to using the initial kernel page table identity mapping.
      To support ranges above this, we need to have an identity mapping for the
      desired memory range before we can decompress (and later run) the kernel.
      
      32-bit kernels already have the needed identity mapping. This patch adds
      identity mappings for the needed memory ranges on 64-bit kernels. This
      happens in two possible boot paths:
      
      If loaded via startup_32(), we need to set up the needed identity map.
      
      If loaded from a 64-bit bootloader, the bootloader will have already
      set up an identity mapping, and we'll start via the compressed kernel's
      startup_64(). In this case, the bootloader's page tables need to be
      avoided while selecting the new uncompressed kernel location. If not,
      the decompressor could overwrite them during decompression.
      
      To accomplish this, we could walk the pagetable and find every page
      that is used, and add them to mem_avoid, but this needs extra code and
      will require increasing the size of the mem_avoid array.
      
      Instead, we can create a new set of page tables for our own identity
      mapping instead. The pages for the new page table will come from the
      _pagetable section of the compressed kernel, which means they are
      already contained by in mem_avoid array. To do this, we reuse the code
      from the uncompressed kernel's identity mapping routines.
      
      The _pgtable will be shared by both the 32-bit and 64-bit paths to reduce
      init_size, as now the compressed kernel's _rodata to _end will contribute
      to init_size.
      
      To handle the possible mappings, we need to increase the existing page
      table buffer size:
      
      When booting via startup_64(), we need to cover the old VO, params,
      cmdline and uncompressed kernel. In an extreme case we could have them
      all beyond the 512G boundary, which needs (2+2)*4 pages with 2M mappings.
      And we'll need 2 for first 2M for VGA RAM. One more is needed for level4.
      This gets us to 19 pages total.
      
      When booting via startup_32(), KASLR could move the uncompressed kernel
      above 4G, so we need to create extra identity mappings, which should only
      need (2+2) pages at most when it is beyond the 512G boundary. So 19
      pages is sufficient for this case as well.
      
      The resulting BOOT_*PGT_SIZE defines use the "_SIZE" suffix on their
      names to maintain logical consistency with the existing BOOT_HEAP_SIZE
      and BOOT_STACK_SIZE defines.
      
      This patch is based on earlier patches from Yinghai Lu and Baoquan He.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462572095-11754-4-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3a94707d
    • Y
      x86/boot: Split out kernel_ident_mapping_init() · cf4fb15b
      Yinghai Lu 提交于
      In order to support on-demand page table creation when moving the
      kernel for KASLR, we need to use kernel_ident_mapping_init() in the
      decompression code.
      
      This splits it out into its own file for use outside of init_64.c.
      Additionally, checking for __pa/__va defines is added since they
      need to be overridden in the decompression code.
      
      [kees: rewrote changelog]
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462572095-11754-3-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cf4fb15b
    • K
      x86/boot: Clean up indenting for asm/boot.h · 8665e6ff
      Kees Cook 提交于
      Before adding more defines to asm/boot.h, this cleans up the existing
      indenting for readability.
      Suggested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462572095-11754-2-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8665e6ff
    • K
      x86/KASLR: Improve comments around the mem_avoid[] logic · ed09acde
      Kees Cook 提交于
      This attempts to improve the comments that describe how the memory
      range used for decompression is avoided. Additionally uses an enum
      instead of raw numbers for the mem_avoid[] indexing.
      Suggested-by: NBorislav Petkov <bp@alien8.de>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/20160506194459.GA16480@www.outflux.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ed09acde
    • B
      x86/boot: Simplify pointer casting in choose_random_location() · 549f90db
      Borislav Petkov 提交于
      Pass them down as 'unsigned long' directly and get rid of more casting and
      assignments.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: bhe@redhat.com
      Cc: dyoung@redhat.com
      Cc: linux-tip-commits@vger.kernel.org
      Cc: luto@kernel.org
      Cc: vgoyal@redhat.com
      Cc: yinghai@kernel.org
      Link: http://lkml.kernel.org/r/20160506115015.GI24044@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      549f90db
  7. 06 5月, 2016 4 次提交
    • C
      x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO · 886123fb
      Chen Yu 提交于
      Currently we read the tsc radio: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;
      
      Thus we get bit 8-12 of MSR_PLATFORM_INFO, however according to the SDM
      (35.5), the ratio bits are bit 8-15.
      
      Ignoring the upper bits can result in an incorrect tsc ratio, which causes the
      TSC calibration and the Local APIC timer frequency to be incorrect.
      
      Fix this problem by masking 0xff instead.
      
      [ tglx: Massaged changelog ]
      
      Fixes: 7da7c156 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
      Signed-off-by: NChen Yu <yu.c.chen@intel.com>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: stable@vger.kernel.org
      Cc: Bin Gao <bin.gao@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      886123fb
    • Y
      x86/KASLR: Consolidate mem_avoid[] entries · 9dc1969c
      Yinghai Lu 提交于
      The mem_avoid[] array is used to track positions that should be avoided (like
      the compressed kernel, decompression code, etc) when selecting a memory
      position for the randomly relocated kernel. Since ZO is now at the end of
      the decompression buffer and the decompression code (and its heap and
      stack) are at the front, we can safely consolidate the decompression entry,
      the heap entry, and the stack entry. The boot_params memory, however, could
      be elsewhere, so it should be explicitly included.
      Signed-off-by: NYinghai Lu <yinghai@kernel.org>
      Signed-off-by: NBaoquan He <bhe@redhat.com>
      [ Rwrote changelog, cleaned up code comments. ]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462486436-3707-3-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9dc1969c
    • K
      x86/boot: Clean up pointer casting · 2bc1cd39
      Kees Cook 提交于
      Currently extract_kernel() defines the input and output buffer pointers
      as "unsigned char *" since that's effectively what they are. It passes
      these to the decompressor routine and to the ELF parser, which both
      logically deal with buffer pointers too. There is some casting ("unsigned
      long") done to validate the numerical value of the pointers, but it is
      relatively limited.
      
      However, choose_random_location() operates almost exclusively on the
      numerical representation of these pointers, so it ended up carrying
      a lot of "unsigned long" casts. With the future physical/virtual split
      these casts were going to multiply, so this attempts to solve the
      problem by doing all the casting in choose_random_location()'s entry
      and return instead of through-out the code. Adjusts argument names to
      be more meaningful, and changes one us of "choice" to "output" to make
      the future physical/virtual split more clear (i.e. "choice" should be
      strictly a function return value and not used as an intermediate).
      Suggested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: lasse.collin@tukaani.org
      Link: http://lkml.kernel.org/r/1462486436-3707-2-git-send-email-keescook@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2bc1cd39
    • A
      mm: thp: kvm: fix memory corruption in KVM with THP enabled · 127393fb
      Andrea Arcangeli 提交于
      After the THP refcounting change, obtaining a compound pages from
      get_user_pages() no longer allows us to assume the entire compound page
      is immediately mappable from a secondary MMU.
      
      A secondary MMU doesn't want to call get_user_pages() more than once for
      each compound page, in order to know if it can map the whole compound
      page.  So a secondary MMU needs to know from a single get_user_pages()
      invocation when it can map immediately the entire compound page to avoid
      a flood of unnecessary secondary MMU faults and spurious
      atomic_inc()/atomic_dec() (pages don't have to be pinned by MMU notifier
      users).
      
      Ideally instead of the page->_mapcount < 1 check, get_user_pages()
      should return the granularity of the "page" mapping in the "mm" passed
      to get_user_pages().  However it's non trivial change to pass the "pmd"
      status belonging to the "mm" walked by get_user_pages up the stack (up
      to the caller of get_user_pages).  So the fix just checks if there is
      not a single pte mapping on the page returned by get_user_pages, and in
      turn if the caller can assume that the whole compound page is mapped in
      the current "mm" (in a pmd_trans_huge()).  In such case the entire
      compound page is safe to map into the secondary MMU without additional
      get_user_pages() calls on the surrounding tail/head pages.  In addition
      of being faster, not having to run other get_user_pages() calls also
      reduces the memory footprint of the secondary MMU fault in case the pmd
      split happened as result of memory pressure.
      
      Without this fix after a MADV_DONTNEED (like invoked by QEMU during
      postcopy live migration or balloning) or after generic swapping (with a
      failure in split_huge_page() that would only result in pmd splitting and
      not a physical page split), KVM would map the whole compound page into
      the shadow pagetables, despite regular faults or userfaults (like
      UFFDIO_COPY) may map regular pages into the primary MMU as result of the
      pte faults, leading to the guest mode and userland mode going out of
      sync and not working on the same memory at all times.
      
      Any other secondary MMU notifier manager (KVM is just one of the many
      MMU notifier users) will need the same information if it doesn't want to
      run a flood of get_user_pages_fast and it can support multiple
      granularity in the secondary MMU mappings, so I think it is justified to
      be exposed not just to KVM.
      
      The other option would be to move transparent_hugepage_adjust to
      mm/huge_memory.c but that currently has all kind of KVM data structures
      in it, so it's definitely not a cut-and-paste work, so I couldn't do a
      fix as cleaner as this one for 4.6.
      Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: "Li, Liang Z" <liang.z.li@intel.com>
      Cc: Amit Shah <amit.shah@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      127393fb
  8. 05 5月, 2016 5 次提交