1. 13 1月, 2016 1 次提交
  2. 12 1月, 2016 9 次提交
    • M
      x86/reboot/quirks: Add iMac10,1 to pci_reboot_dmi_table[] · 2f0c0b2d
      Mario Kleiner 提交于
      Without the reboot=pci method, the iMac 10,1 simply
      hangs after printing "Restarting system" at the point
      when it should reboot. This fixes it.
      Signed-off-by: NMario Kleiner <mario.kleiner.de@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1450466646-26663-1-git-send-email-mario.kleiner.de@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2f0c0b2d
    • R
      lguest: Map switcher text R/O · e27d90e8
      Rusty Russell 提交于
      Pavel noted that lguest maps the switcher code executable and
      read-write.  This is a bad idea for any kernel text, but
      particularly for text mapped at a fixed address.
      
      Create two vmas, one for the text (PAGE_KERNEL_RX) and another
      for the stacks (PAGE_KERNEL).  Use VM_NO_GUARD to map them
      adjacent (as expected by the rest of the code).
      Reported-by: NPavel Machek <pavel@ucw.cz>
      Tested-by: NPavel Machek <pavel@ucw.cz>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e27d90e8
    • B
      x86/boot: Hide local labels in verify_cpu() · aa042141
      Borislav Petkov 提交于
      ... from the final ELF image's symbol table as they're not
      really needed there.
      
      Before:
      
      $ readelf -a vmlinux | grep verify_cpu
          43: ffffffff810001a9     0 NOTYPE  LOCAL  DEFAULT    1 verify_cpu
          45: ffffffff8100028f     0 NOTYPE  LOCAL  DEFAULT    1 verify_cpu_no_longmode
          46: ffffffff810001de     0 NOTYPE  LOCAL  DEFAULT    1 verify_cpu_noamd
          47: ffffffff8100022b     0 NOTYPE  LOCAL  DEFAULT    1 verify_cpu_check
          48: ffffffff8100021c     0 NOTYPE  LOCAL  DEFAULT    1 verify_cpu_clear_xd
          49: ffffffff81000263     0 NOTYPE  LOCAL  DEFAULT    1 verify_cpu_sse_test
          50: ffffffff81000296     0 NOTYPE  LOCAL  DEFAULT    1 verify_cpu_sse_ok
      
      After:
      
      $ readelf -a vmlinux | grep verify_cpu
          43: ffffffff810001a9     0 NOTYPE  LOCAL  DEFAULT    1 verify_cpu
      
      No functionality change.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1451860733-21163-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      aa042141
    • Y
      x86/fpu: Disable AVX when eagerfpu is off · 394db20c
      yu-cheng yu 提交于
      When "eagerfpu=off" is given as a command-line input, the kernel
      should disable AVX support.
      
      The Task Switched bit used for lazy context switching does not
      support AVX. If AVX is enabled without eagerfpu context
      switching, one task's AVX state could become corrupted or leak
      to other tasks. This is a bug and has bad security implications.
      
      This only affects systems that have AVX/AVX2/AVX512 and this
      issue will be found only when one actually uses AVX/AVX2/AVX512
      _AND_ does eagerfpu=off.
      
      Reference: Intel Software Developer's Manual Vol. 3A
      
      Sec. 2.5 Control Registers:
      TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the
      x87 FPU/ MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch
      to be delayed until an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4
      instruction is actually executed by the new task.
      
      Sec. 13.4.1 Using the TS Flag to Control the Saving of the X87
      FPU and SSE State
      When the TS flag is set, the processor monitors the instruction
      stream for x87 FPU, MMX, SSE instructions. When the processor
      detects one of these instructions, it raises a
      device-not-available exeception (#NM) prior to executing the
      instruction.
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/1452119094-7252-5-git-send-email-yu-cheng.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      394db20c
    • Y
      x86/fpu: Disable MPX when eagerfpu is off · a5fe93a5
      yu-cheng yu 提交于
      This issue is a fallout from the command-line parsing move.
      
      When "eagerfpu=off" is given as a command-line input, the kernel
      should disable MPX support. The decision for turning off MPX was
      made in fpu__init_system_ctx_switch(), which is after the
      selection of the XSAVE format. This patch fixes it by getting
      that decision done earlier in fpu__init_system_xstate().
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/1452119094-7252-4-git-send-email-yu-cheng.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a5fe93a5
    • Y
      x86/fpu: Disable XGETBV1 when no XSAVE · eb7c5f87
      yu-cheng yu 提交于
      When "noxsave" is given as a command-line input, the kernel
      should disable XGETBV1. This issue currently does not cause any
      actual problems. XGETBV1 is only useful if we have something
      using the 'init optimization' (i.e. xsaveopt, xsaves). We
      already clear both of those in fpu__xstate_clear_all_cpu_caps().
      But this is good for completeness.
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Reviewed-by: NDave Hansen <dave.hansen@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/1452119094-7252-3-git-send-email-yu-cheng.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      eb7c5f87
    • Y
      x86/fpu: Fix early FPU command-line parsing · 4f81cbaf
      yu-cheng yu 提交于
      The function fpu__init_system() is executed before
      parse_early_param(). This causes wrong FPU configuration. This
      patch fixes this issue by parsing boot_command_line in the
      beginning of fpu__init_system().
      
      With all four patches in this series, each parameter disables
      features as the following:
      
      eagerfpu=off: eagerfpu, avx, avx2, avx512, mpx
      no387: fpu
      nofxsr: fxsr, fxsropt, xmm
      noxsave: xsave, xsaveopt, xsaves, xsavec, avx, avx2, avx512,
      mpx, xgetbv1 noxsaveopt: xsaveopt
      noxsaves: xsaves
      Signed-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/1452119094-7252-2-git-send-email-yu-cheng.yu@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4f81cbaf
    • K
      x86/mm: Use PAGE_ALIGNED instead of IS_ALIGNED · b500f77b
      Kefeng Wang 提交于
      Use PAGE_ALIGEND macro in <linux/mm.h> to simplify code.
      Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
      Cc: <guohanjun@huawei.com>
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1452565170-11083-1-git-send-email-wangkefeng.wang@huawei.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b500f77b
    • D
      x86/mm/pat: Make split_page_count() check for empty levels to fix /proc/meminfo output · c9e0d391
      Dave Jones 提交于
      In CONFIG_PAGEALLOC_DEBUG=y builds, we disable 2M pages.
      
      Unfortunatly when we split up mappings during boot,
      split_page_count() doesn't take this into account, and
      starts decrementing an empty direct_pages_count[] level.
      
      This results in /proc/meminfo showing crazy things like:
      
        DirectMap2M:    18446744073709543424 kB
      Signed-off-by: NDave Jones <davej@codemonkey.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c9e0d391
  3. 11 1月, 2016 2 次提交
    • H
      x86/boot: Double BOOT_HEAP_SIZE to 64KB · 8c31902c
      H.J. Lu 提交于
      When decompressing kernel image during x86 bootup, malloc memory
      for ELF program headers may run out of heap space, which leads
      to system halt.  This patch doubles BOOT_HEAP_SIZE to 64KB.
      
      Tested with 32-bit kernel which failed to boot without this patch.
      Signed-off-by: NH.J. Lu <hjl.tools@gmail.com>
      Acked-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      8c31902c
    • A
      x86/mm: Add barriers and document switch_mm()-vs-flush synchronization · 71b3c126
      Andy Lutomirski 提交于
      When switch_mm() activates a new PGD, it also sets a bit that
      tells other CPUs that the PGD is in use so that TLB flush IPIs
      will be sent.  In order for that to work correctly, the bit
      needs to be visible prior to loading the PGD and therefore
      starting to fill the local TLB.
      
      Document all the barriers that make this work correctly and add
      a couple that were missing.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      71b3c126
  4. 09 1月, 2016 1 次提交
  5. 07 1月, 2016 6 次提交
    • R
      dts: vt8500: Add SDHC node to DTS file for WM8650 · 0f090bf1
      Roman Volkov 提交于
      Since WM8650 has the same 'WMT' SDHC controller as WM8505, and the driver
      is already in the kernel, this node enables the controller support for
      WM8650
      Signed-off-by: NRoman Volkov <rvolkov@v1ros.org>
      Reviewed-by: NAlexey Charkov <alchark@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      0f090bf1
    • T
      ARM: Fix broken USB support in multi_v7_defconfig for sunxi devices · 5b1a6181
      Timo Sigurdsson 提交于
      Commit 69fb4dca ("power: Add an axp20x-usb-power driver") introduced a
      new driver for the USB power supply used on various Allwinner based SBCs.
      However, the driver was not added to multi_v7_defconfig which breaks USB
      support for some boards (e.g. LeMaker BananaPi) as the kernel will now
      turn off the USB power supply during boot by default if the driver isn't
      present. (This was not the case in linux 4.3 or lower where the USB power
      was always left on.)
      
      Hence, add the driver to multi_v7_defconfig in order to keep USB support
      working on those boards that require it.
      Signed-off-by: NTimo Sigurdsson <public_timo.s@silentcreek.de>
      Tested-by: NTimo Sigurdsson <public_timo.s@silentcreek.de>
      Acked-by: NMaxime Ripard <maxime.ripard@free-electrons.com>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      5b1a6181
    • P
      kvm: x86: only channel 0 of the i8254 is linked to the HPET · e5e57e7a
      Paolo Bonzini 提交于
      While setting the KVM PIT counters in 'kvm_pit_load_count', if
      'hpet_legacy_start' is set, the function disables the timer on
      channel[0], instead of the respective index 'channel'. This is
      because channels 1-3 are not linked to the HPET.  Fix the caller
      to only activate the special HPET processing for channel 0.
      Reported-by: NP J P <pjp@fedoraproject.org>
      Fixes: 0185604cSigned-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      e5e57e7a
    • L
      ARM: versatile: fix MMC/SD interrupt assignment · 20f12758
      Linus Walleij 提交于
      Commit 0976c946
      "arm/versatile: Fix versatile irq specifications"
      has an off-by-one error on the Versatile AB that has
      been regressing the Versatile AB hardware for some time.
      
      However it seems like the interrupt assignments have
      never been correct and I have now adjusted them according
      to the specification. The masks for the valid interrupts
      made it impossible to assign the right SIC interrupt
      for the MMCI, so I went in and fixed these to correspond
      to the specifications, and added references if anyone
      wants to double-check.
      
      Due to the Versatile PB including the Versatile AB
      as a base DTS file, we need to override and correct
      some values to correspond to the actual changes in the
      hardware.
      
      For the Versatile PB I don't think the IRQ line
      assignment for MMCI has ever been correct for either of
      the two MMCI blocks. It would be nice if someone with the
      physical PB board could test this.
      
      Patch tested on the Versatile AB, QEMU for Versatile AB
      and QEMU for Versatile PB.
      
      Cc: Rob Herring <robh@kernel.org>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: stable@vger.kernel.org
      Fixes: 0976c946 ("arm/versatile: Fix versatile irq specifications")
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      20f12758
    • L
      ARM: nomadik: set latencies to 8 cycles · a461a3ec
      Linus Walleij 提交于
      The Nomadik has sporadic crashes because of these latencies, setting
      them to max makes the platform work nicely, so use this values for
      now.
      
      These latencies were set to 2 since the Nomadik platform was merged,
      but I suspect they never took effect until the right size and
      associativity for the cache was specified in the device tree and
      that is why the crash comes now.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: NOlof Johansson <olof@lixom.net>
      a461a3ec
    • T
      ARM: OMAP2+: Fix onenand rate detection to avoid filesystem corruption · e7b11dc7
      Tony Lindgren 提交于
      Commit 63aa945b ("memory: omap-gpmc: Add Kconfig option for debug")
      unified the GPMC debug for the SoCs with GPMC. The commit also left out
      the option for HWMOD_INIT_NO_RESET as we now require proper timings for
      GPMC to be able to remap GPMC devices out of address 0.
      
      Unfortunately on Nokia N900, onenand now only partially works with the
      device tree provided timings. It works enough to get detected but the
      clock rate supported by the onenand chip gets misdetected. This in turn
      causes the GPMC timings to be miscalculated and this leads into file
      system corruption on N900.
      
      Looks like onenand needs CS_CONFIG1 bit 27 WRITETYPE set for for sync
      write. This is needed also for async timings when we write to onenand
      with omap2_onenand_set_async_mode(). Without sync write bit set, the
      async read for the onenand ONENAND_REG_VERSION_ID will return 0xfff.
      
      Let's exit with an error if onenand rate is not detected. And let's
      remove the extra call to omap2_onenand_set_async_mode() as we only need
      to do this once at the end of omap2_onenand_setup_async().
      
      Fixes: 63aa945b ("memory: omap-gpmc: Add Kconfig option for debug")
      Cc: stable@vger.kernel.org # v4.2+
      Reported-by: NIvaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
      Tested-by: NIvaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
      Tested-by: NAaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: NTony Lindgren <tony@atomide.com>
      e7b11dc7
  6. 06 1月, 2016 18 次提交
  7. 05 1月, 2016 3 次提交
    • C
      tile: provide CONFIG_PAGE_SIZE_64KB etc for tilepro · c1b27ab5
      Chris Metcalf 提交于
      This allows the build system to know that it can't attempt to
      configure the Lustre virtual block device, for example, when tilepro
      is using 64KB pages (as it does by default).  The tilegx build
      already provided those symbols.
      
      Previously we required that the tilepro hypervisor be rebuilt with
      a different hardcoded page size in its headers, and then Linux be
      rebuilt using the updated hypervisor header.  Now we allow each of
      the hypervisor and Linux to be built independently.  We still check
      at boot time to ensure that the page size provided by the hypervisor
      matches what Linux expects.
      Signed-off-by: NChris Metcalf <cmetcalf@ezchip.com>
      Cc: stable@vger.kernel.org [3.19+]
      c1b27ab5
    • T
      x86/mm/pat: Change free_memtype() to support shrinking case · 2039e6ac
      Toshi Kani 提交于
      Using mremap() to shrink the map size of a VM_PFNMAP range causes
      the following error message, and leaves the pfn range allocated.
      
       x86/PAT: test:3493 freeing invalid memtype [mem 0x483200000-0x4863fffff]
      
      This is because rbt_memtype_erase(), called from free_memtype()
      with spin_lock held, only supports to free a whole memtype node in
      memtype_rbroot.  Therefore, this patch changes rbt_memtype_erase()
      to support a request that shrinks the size of a memtype node for
      mremap().
      
      memtype_rb_exact_match() is renamed to memtype_rb_match(), and
      is enhanced to support EXACT_MATCH and END_MATCH in @match_type.
      Since the memtype_rbroot tree allows overlapping ranges,
      rbt_memtype_erase() checks with EXACT_MATCH first, i.e. free
      a whole node for the munmap case.  If no such entry is found,
      it then checks with END_MATCH, i.e. shrink the size of a node
      from the end for the mremap case.
      
      On the mremap case, rbt_memtype_erase() proceeds in two steps,
      1) remove the node, and then 2) insert the updated node.  This
      allows proper update of augmented values, subtree_max_end, in
      the tree.
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: stsp@list.ru
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1450832064-10093-3-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      2039e6ac
    • T
      x86/mm/pat: Add untrack_pfn_moved for mremap · d9fe4fab
      Toshi Kani 提交于
      mremap() with MREMAP_FIXED on a VM_PFNMAP range causes the following
      WARN_ON_ONCE() message in untrack_pfn().
      
        WARNING: CPU: 1 PID: 3493 at arch/x86/mm/pat.c:985 untrack_pfn+0xbd/0xd0()
        Call Trace:
        [<ffffffff817729ea>] dump_stack+0x45/0x57
        [<ffffffff8109e4b6>] warn_slowpath_common+0x86/0xc0
        [<ffffffff8109e5ea>] warn_slowpath_null+0x1a/0x20
        [<ffffffff8106a88d>] untrack_pfn+0xbd/0xd0
        [<ffffffff811d2d5e>] unmap_single_vma+0x80e/0x860
        [<ffffffff811d3725>] unmap_vmas+0x55/0xb0
        [<ffffffff811d916c>] unmap_region+0xac/0x120
        [<ffffffff811db86a>] do_munmap+0x28a/0x460
        [<ffffffff811dec33>] move_vma+0x1b3/0x2e0
        [<ffffffff811df113>] SyS_mremap+0x3b3/0x510
        [<ffffffff817793ee>] entry_SYSCALL_64_fastpath+0x12/0x71
      
      MREMAP_FIXED moves a pfnmap from old vma to new vma.  untrack_pfn() is
      called with the old vma after its pfnmap page table has been removed,
      which causes follow_phys() to fail.  The new vma has a new pfnmap to
      the same pfn & cache type with VM_PAT set.  Therefore, we only need to
      clear VM_PAT from the old vma in this case.
      
      Add untrack_pfn_moved(), which clears VM_PAT from a given old vma.
      move_vma() is changed to call this function with the old vma when
      VM_PFNMAP is set.  move_vma() then calls do_munmap(), and untrack_pfn()
      is a no-op since VM_PAT is cleared.
      Reported-by: NStas Sergeev <stsp@list.ru>
      Signed-off-by: NToshi Kani <toshi.kani@hpe.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/1450832064-10093-2-git-send-email-toshi.kani@hpe.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      d9fe4fab