1. 30 September 2017 (1 commit)
  2. 29 September 2017 (1 commit)
    • x86/asm: Fix inline asm call constraints for GCC 4.4 · 520a13c5
      Committed by Josh Poimboeuf
      The kernel test bot (run by Xiaolong Ye) reported that the following commit:
      
        f5caf621 ("x86/asm: Fix inline asm call constraints for Clang")
      
      is causing double faults in a kernel compiled with GCC 4.4.
      
      Linus subsequently diagnosed the crash pattern, identified the buggy
      commit, and found that the issue is with this code:
      
        register unsigned int __asm_call_sp asm("esp");
        #define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)
      
      Even on a 64-bit kernel, it's using ESP instead of RSP.  That causes GCC
      to produce the following bogus code:
      
        ffffffff8147461d:       89 e0                   mov    %esp,%eax
        ffffffff8147461f:       4c 89 f7                mov    %r14,%rdi
        ffffffff81474622:       4c 89 fe                mov    %r15,%rsi
        ffffffff81474625:       ba 20 00 00 00          mov    $0x20,%edx
        ffffffff8147462a:       89 c4                   mov    %eax,%esp
        ffffffff8147462c:       e8 bf 52 05 00          callq  ffffffff814c98f0 <copy_user_generic_unrolled>
      
      Despite the absurdity of the compiler backing up and restoring the
      stack pointer for no reason, the actual bug is that it only backs up
      and restores the lower 32 bits of the stack pointer.  The upper 32
      bits get cleared out, corrupting the stack pointer.
      
      So change the '__asm_call_sp' register variable to be associated with
      the actual full-size stack pointer.
      
      This also requires changing the __ASM_SEL() macro to be based on the
      actual compiled arch size, rather than the CONFIG value, because
      CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
      Otherwise Clang fails to build the kernel because it complains about the
      use of a 64-bit register (RSP) in a 32-bit file.
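      
      A hedged sketch of the resulting definitions (names follow the upstream
      patch, but treat the details as illustrative):
      
        /* _ASM_SP now selects "esp"/"rsp" from __x86_64__, i.e. the arch
         * actually being compiled, so -m32 objects built under
         * CONFIG_X86_64 (realmode, vdso) still get the 32-bit register. */
        register unsigned long current_stack_pointer asm(_ASM_SP);
        #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)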
      Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye@intel.com>
      Diagnosed-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: f5caf621 ("x86/asm: Fix inline asm call constraints for Clang")
      Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 25 September 2017 (1 commit)
  4. 23 September 2017 (1 commit)
    • x86/asm: Fix inline asm call constraints for Clang · f5caf621
      Committed by Josh Poimboeuf
      For inline asm statements which have a CALL instruction, we list the
      stack pointer as a constraint to convince GCC to ensure the frame
      pointer is set up first:
      
        static inline void foo()
        {
      	register void *__sp asm(_ASM_SP);
      	asm("call bar" : "+r" (__sp));
        }
      
      Unfortunately, that pattern causes Clang to corrupt the stack pointer.
      
      The fix is easy: convert the stack pointer register variable to a global
      variable.
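      
      As a hedged sketch, the pattern introduced by this patch looks roughly
      like this (the register variable and the constraint macro live in a
      header at file scope; this is the same code quoted in the GCC 4.4 fix
      above):
      
        register unsigned int __asm_call_sp asm("esp");
        #define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)
      
        static inline void foo(void)
        {
                asm("call bar" : ASM_CALL_CONSTRAINT);
        }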
      
      It should be noted that the end result is different based on the GCC
      version.  With GCC 6.4, this patch has exactly the same result as
      before:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9820389		9491555		8816046		8516940
       after	9820389		9491555		8816046		8516940
      
      With GCC 7.2, however, GCC's behavior has changed.  It now changes its
      behavior based on the conversion of the register variable to a global.
      That somehow convinces it to *always* set up the frame pointer before
      inserting *any* inline asm.  (Therefore, listing the variable as an
      output constraint is a no-op and is no longer necessary.)  It's a bit
      overkill, but the performance impact should be negligible.  And in fact,
      there's a nice improvement with frame pointers disabled:
      
      	defconfig	defconfig-nofp	distro		distro-nofp
       before	9796316		9468236		9076191		8790305
       after	9796957		9464267		9076381		8785949
      
      So in summary, while listing the stack pointer as an output constraint
      is no longer necessary for newer versions of GCC, it's still needed for
      older versions.
      Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: Matthias Kaehlcke <mka@chromium.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 18 September 2017 (2 commits)
    • x86/mm/64: Stop using CR3.PCID == 0 in ASID-aware code · 52a2af40
      Committed by Andy Lutomirski
      Putting the logical ASID into CR3's PCID bits directly means that we
      have two cases to consider separately: ASID == 0 and ASID != 0.
      This means that bugs that only hit in one of these cases trigger
      nondeterministically.
      
      There were some bugs like this in the past, and I think there's
      still one in current kernels.  In particular, we have a number of
      ASID-unaware code paths that save CR3, write some special value, and
      then restore CR3.  This includes suspend/resume, hibernate, kexec,
      EFI, and maybe other things I've missed.  This is currently
      dangerous: if ASID != 0, then this code sequence will leave garbage
      in the TLB tagged for ASID 0.  We could potentially see corruption
      when switching back to ASID 0.  In principle, an
      initialize_tlbstate_and_flush() call after these sequences would
      solve the problem, but EFI, at least, does not call this.  (And it
      probably shouldn't -- initialize_tlbstate_and_flush() is rather
      expensive.)
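      
      A hedged sketch of the approach (paraphrasing the patch; names and
      feature checks are illustrative):
      
        /* ASID-aware code puts ASID+1 into CR3's PCID bits, reserving
         * hardware PCID 0 for ASID-unaware save/modify/restore sequences.
         * Bugs that mix the two now trigger deterministically. */
        static inline unsigned long build_cr3(pgd_t *pgd, u16 asid)
        {
                if (static_cpu_has(X86_FEATURE_PCID))
                        return __sme_pa(pgd) | (asid + 1);
                VM_WARN_ON_ONCE(asid != 0);  /* without PCID, only ASID 0 */
                return __sme_pa(pgd);
        }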
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/cdc14bbe5d3c3ef2a562be09a6368ffe9bd947a6.1505663533.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Factor out CR3-building code · 47061a24
      Committed by Andy Lutomirski
      Currently, the code that assembles a value to load into CR3 is
      open-coded everywhere.  Factor it out into helpers build_cr3() and
      build_cr3_noflush().
      
      This makes one semantic change: __get_current_cr3_fast() was wrong
      on SME systems.  No one noticed because the only caller is in the
      VMX code, and there are no CPUs with both SME and VMX.
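      
      A hedged sketch of the factored-out helpers (the noflush constant name
      is illustrative; the hardware bit is bit 63 of CR3):
      
        #define CR3_NOFLUSH     (1UL << 63)
      
        static inline unsigned long build_cr3(pgd_t *pgd, u16 asid)
        {
                return __sme_pa(pgd) | asid;
        }
      
        static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid)
        {
                return __sme_pa(pgd) | asid | CR3_NOFLUSH;
        }
      
      Note __sme_pa() rather than __pa(): folding the SME encryption mask in
      here is the semantic change that fixes __get_current_cr3_fast() on SME
      systems.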
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
      Link: http://lkml.kernel.org/r/ce350cf11e93e2842d14d0b95b0199c7d881f527.1505663533.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 14 September 2017 (1 commit)
  7. 13 September 2017 (3 commits)
  8. 11 September 2017 (1 commit)
  9. 09 September 2017 (4 commits)
    • x86: implement memset16, memset32 & memset64 · 4c512485
      Committed by Matthew Wilcox
      These are single instructions on x86.  There's no 64-bit instruction for
      x86-32, but we don't yet have any user for memset64() on 32-bit
      architectures, so don't bother to implement it.
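      
      A hedged sketch of the x86-64 flavor: each variant is a single rep-stos
      of the element width (the in-kernel versions differ in minor details):
      
        static inline void *memset16(uint16_t *s, uint16_t v, size_t n)
        {
                long d0, d1;
      
                /* rcx = count, rdi = dest, ax = fill value */
                asm volatile("rep stosw"
                             : "=&c" (d0), "=&D" (d1)
                             : "a" (v), "1" (s), "0" (n)
                             : "memory");
                return s;
        }
      
      memset32() and memset64() are identical except for "rep stosl" /
      "rep stosq" and the pointer/value types.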
      
      Link: http://lkml.kernel.org/r/20170720184539.31609-4-willy@infradead.org
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: soft-dirty: keep soft-dirty bits over thp migration · ab6e3d09
      Committed by Naoya Horiguchi
      The soft-dirty bit is designed to be preserved across page migration.  This
      patch makes it work the same way for thp migration too.
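      
      A hedged sketch of the x86 helpers this adds, mirroring the existing
      PTE variants (the soft-dirty bit is carried in the non-present PMD
      format used by migration entries):
      
        static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd)
        {
                return pmd_set_flags(pmd, _PAGE_SWP_SOFT_DIRTY);
        }
      
        static inline int pmd_swp_soft_dirty(pmd_t pmd)
        {
                return pmd_flags(pmd) & _PAGE_SWP_SOFT_DIRTY;
        }
      
        static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
        {
                return pmd_clear_flags(pmd, _PAGE_SWP_SOFT_DIRTY);
        }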
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Nellans <dnellans@nvidia.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: thp: enable thp migration in generic path · 616b8371
      Committed by Zi Yan
      Add thp migration's core code, including conversions between a PMD
      entry and a swap entry, setting a PMD migration entry, removing a PMD
      migration entry, and waiting on PMD migration entries.
      
      This patch makes it possible to support thp migration.  If you fail to
      allocate a destination page as a thp, you just split the source thp as
      we do now and then enter normal page migration.  If you succeed in
      allocating a destination thp, you enter thp migration.  Subsequent
      patches actually enable thp migration for each caller of page migration
      by allowing its get_new_page() callback to allocate thps.
      
      [zi.yan@cs.rutgers.edu: fix gcc-4.9.0 -Wmissing-braces warning]
        Link: http://lkml.kernel.org/r/A0ABA698-7486-46C3-B209-E95A9048B22C@cs.rutgers.edu
      [akpm@linux-foundation.org: fix x86_64 allnoconfig warning]
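      
      A hedged sketch of the PMD <-> swap-entry conversions at the core of
      this change (paraphrased; the arch-level __pmd_to_swp_entry() and
      friends do the actual bit packing):
      
        static inline swp_entry_t pmd_to_swp_entry(pmd_t pmd)
        {
                swp_entry_t arch_entry = __pmd_to_swp_entry(pmd);
      
                return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry));
        }
      
        static inline pmd_t swp_entry_to_pmd(swp_entry_t entry)
        {
                swp_entry_t arch_entry = __swp_entry(swp_type(entry),
                                                     swp_offset(entry));
      
                return __swp_entry_to_pmd(arch_entry);
        }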
      Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Nellans <dnellans@nvidia.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1 · eee4818b
      Committed by Naoya Horiguchi
      _PAGE_PSE is used to distinguish between a truly non-present
      (_PAGE_PRESENT=0) PMD, and a PMD which is undergoing a THP split and
      should be treated as present.
      
      But _PAGE_SWP_SOFT_DIRTY currently uses the _PAGE_PSE bit, which would
      cause confusion between one of those PMDs undergoing a THP split and a
      soft-dirty PMD.  Dropping the _PAGE_PSE check in pmd_present() does not
      work well, because it would hurt the TLB-handling optimization in THP
      split.
      
      Thus, we need to move the bit.
      
      In the current kernel, bits 1-4 are not used in the non-present format
      since commit 00839ee3 ("x86/mm: Move swap offset/type up in PTE to work
      around erratum").  So let's move _PAGE_SWP_SOFT_DIRTY to bit 1.  Bit 7
      is kept reserved (always clear), so please don't use it for any other
      purpose.
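      
      A hedged sketch of the x86 definition after the move (comments are
      editorial):
      
        #ifdef CONFIG_MEM_SOFT_DIRTY
        /* bit 1 (_PAGE_RW) is free in the non-present format; bit 7
         * (_PAGE_PSE) stays reserved and always clear. */
        #define _PAGE_SWP_SOFT_DIRTY    _PAGE_RW
        #else
        #define _PAGE_SWP_SOFT_DIRTY    (_AT(pteval_t, 0))
        #endif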
      
      Link: http://lkml.kernel.org/r/20170717193955.20207-3-zi.yan@sent.com
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
      Acked-by: Dave Hansen <dave.hansen@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: David Nellans <dnellans@nvidia.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 07 September 2017 (3 commits)
  11. 01 September 2017 (3 commits)
    • KVM: update to new mmu_notifier semantic v2 · fb1522e0
      Committed by Jérôme Glisse
      Calls to mmu_notifier_invalidate_page() were replaced by calls to
      mmu_notifier_invalidate_range() and are now bracketed by calls to
      mmu_notifier_invalidate_range_start()/end().
      
      Remove the now-useless invalidate_page callback.
      
      Changed since v1 (Linus Torvalds)
          - remove the now-useless kvm_arch_mmu_notifier_invalidate_page()
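      
      A hedged sketch of the new calling convention at an invalidation site
      (the function name is hypothetical; locking and the actual teardown
      are elided):
      
        static void example_invalidate(struct mm_struct *mm,
                                       unsigned long start, unsigned long end)
        {
                mmu_notifier_invalidate_range_start(mm, start, end);
                /* ... tear down the mappings for [start, end) ... */
                mmu_notifier_invalidate_range(mm, start, end);
                mmu_notifier_invalidate_range_end(mm, start, end);
        }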
      Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
      Tested-by: Mike Galbraith <efault@gmx.de>
      Tested-by: Adam Borowski <kilobyte@angband.pl>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • libnvdimm, nd_blk: remove mmio_flush_range() · 5deb67f7
      Committed by Robin Murphy
      mmio_flush_range() suffers from a lack of clearly-defined semantics and
      is ambiguous to port to other architectures, where the scope of the
      writeback implied by "flush", and its ordering, might matter; "MMIO"
      would tend to imply non-cacheable anyway. Per the rationale
      in 67a3e8fe ("nd_blk: change aperture mapping from WC to WB"), the
      only existing use is actually to invalidate clean cache lines for
      ARCH_MEMREMAP_PMEM type mappings *without* writeback. Since the recent
      cleanup of the pmem API, that also now happens to be the exact purpose
      of arch_invalidate_pmem(), which would be a far more well-defined tool
      for the job.
      
      Rather than risk potentially inconsistent implementations of
      mmio_flush_range() for the sake of one callsite, streamline things by
      removing it entirely and instead move the ARCH_MEMREMAP_PMEM related
      definitions up to the libnvdimm level, so they can be shared by NFIT
      as well. This allows NFIT to be enabled for arm64.
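      
      A hedged sketch of the substitution at the lone call site (the wrapper
      name is hypothetical):
      
        static void nd_blk_invalidate(void *addr, size_t size)
        {
                /* was: mmio_flush_range(addr, size); invalidate clean
                 * lines for the WB-mapped aperture, no writeback needed */
                arch_invalidate_pmem(addr, size);
        }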
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    • x86/xen: Get rid of paravirt op adjust_exception_frame · 5878d5d6
      Committed by Juergen Gross
      When running as a Xen pv-guest, the exception frame on the stack
      contains %r11 and %rcx in addition to the other data pushed by the
      processor.
      
      Instead of having a paravirt op called for each exception type, prepend
      the Xen-specific code to each exception entry.  When running as a Xen
      pv-guest, use the exception entry with the prepended instructions;
      otherwise use the entry without the Xen-specific code.
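      
      A hedged sketch of one prepended entry stub (file-scope asm standing in
      for the assembler macro upstream uses to generate these): Xen pushes
      %rcx and %r11 on top of the hardware exception frame, so pop them and
      fall through to the normal entry point.
      
        asm(".globl xen_divide_error\n"
            "xen_divide_error:\n\t"
            "pop %rcx\n\t"
            "pop %r11\n\t"
            "jmp divide_error\n");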
      
      [ tglx: Merged through tip to avoid ugly merge conflict ]
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: xen-devel@lists.xenproject.org
      Cc: boris.ostrovsky@oracle.com
      Cc: luto@amacapital.net
      Link: http://lkml.kernel.org/r/20170831174249.26853-1-jg@pfupf.net
  12. 31 August 2017 (6 commits)
  13. 29 August 2017 (13 commits)