1. 24 12月, 2017 2 次提交
    • L
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · caf9a826
      Linus Torvalds 提交于
      Pull x86 PTI preparatory patches from Thomas Gleixner:
       "Todays Advent calendar window contains twentyfour easy to digest
        patches. The original plan was to have twenty three matching the date,
        but a late fixup made that moot.
      
         - Move the cpu_entry_area mapping out of the fixmap into a separate
           address space. That's necessary because the fixmap becomes too big
           with NRCPUS=8192 and this caused already subtle and hard to
           diagnose failures.
      
           The top most patch is fresh from today and cures a brain slip of
           that tall grumpy german greybeard, who ignored the intricacies of
           32bit wraparounds.
      
         - Limit the number of CPUs on 32bit to 64. That's insane big already,
           but at least it's small enough to prevent address space issues with
           the cpu_entry_area map, which have been observed and debugged with
           the fixmap code
      
         - A few TLB flush fixes in various places plus documentation which of
           the TLB functions should be used for what.
      
         - Rename the SYSENTER stack to CPU_ENTRY_AREA stack as it is used for
           more than sysenter now and keeping the name makes backtraces
           confusing.
      
         - Prevent LDT inheritance on exec() by moving it to arch_dup_mmap(),
           which is only invoked on fork().
      
         - Make vysycall more robust.
      
         - A few fixes and cleanups of the debug_pagetables code. Check
           PAGE_PRESENT instead of checking the PTE for 0 and a cleanup of the
           C89 initialization of the address hint array which already was out
           of sync with the index enums.
      
         - Move the ESPFIX init to a different place to prepare for PTI.
      
         - Several code moves with no functional change to make PTI
           integration simpler and header files less convoluted.
      
         - Documentation fixes and clarifications"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit
        init: Invoke init_espfix_bsp() from mm_init()
        x86/cpu_entry_area: Move it out of the fixmap
        x86/cpu_entry_area: Move it to a separate unit
        x86/mm: Create asm/invpcid.h
        x86/mm: Put MMU to hardware ASID translation in one place
        x86/mm: Remove hard-coded ASID limit checks
        x86/mm: Move the CR3 construction functions to tlbflush.h
        x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what
        x86/mm: Remove superfluous barriers
        x86/mm: Use __flush_tlb_one() for kernel memory
        x86/microcode: Dont abuse the TLB-flush interface
        x86/uv: Use the right TLB-flush API
        x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack
        x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation
        x86/mm/64: Improve the memory map documentation
        x86/ldt: Prevent LDT inheritance on exec
        x86/ldt: Rework locking
        arch, mm: Allow arch_dup_mmap() to fail
        x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode
        ...
      caf9a826
    • T
      x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit · f6c4fd50
      Thomas Gleixner 提交于
      The loop which populates the CPU entry area PMDs can wrap around on 32bit
      machines when the number of CPUs is small.
      
      It worked wonderful for NR_CPUS=64 for whatever reason and the moron who
      wrote that code did not bother to test it with !SMP.
      
      Check for the wraparound to fix it.
      
      Fixes: 92a0f81d ("x86/cpu_entry_area: Move it out of the fixmap")
      Reported-by: Nkernel test robot <fengguang.wu@intel.com>
      Signed-off-by: NThomas "Feels stupid" Gleixner <tglx@linutronix.de>
      Tested-by: NBorislav Petkov <bp@alien8.de>
      f6c4fd50
  2. 23 12月, 2017 30 次提交
    • L
      Merge tag 'powerpc-4.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 9c294ec0
      Linus Torvalds 提交于
      Pull powerpc fixes from Michael Ellerman:
       "This is all fairly boring, except that there's two KVM fixes that
        you'd normally get via Paul's kvm-ppc tree. He's away so I picked them
        up. I was waiting to see if he would apply them, which is why they
        have only been in my tree since today. But they were on the list for a
        while and have been tested on the relevant hardware.
      
        Of note is two fixes for KVM XIVE (Power9 interrupt controller). These
        would normally go via the KVM tree but Paul is away so I've picked
        them up.
      
        Other than that, two fixes for error handling in the IMC driver, and
        one for a potential oops in the BHRB code if the hardware records a
        branch address that has subsequently been unmapped, and finally a
        s/%p/%px/ in our oops code.
      
        Thanks to: Anju T Sudhakar, Cédric Le Goater, Laurent Vivier, Madhavan
        Srinivasan, Naveen N. Rao, Ravi Bangoria"
      
      * tag 'powerpc-4.15-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp()
        KVM: PPC: Book3S: fix XIVE migration of pending interrupts
        powerpc/kernel: Print actual address of regs when oopsing
        powerpc/perf: Fix kfree memory allocated for nest pmus
        powerpc/perf/imc: Fix nest-imc cpuhotplug callback failure
        powerpc/perf: Dereference BHRB entries safely
      9c294ec0
    • L
      Merge tag 'for-linus-4.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 9ad95bda
      Linus Torvalds 提交于
      Pull xen fixes from Juergen Gross:
       "This contains two fixes for running under Xen:
      
         - a fix avoiding resource conflicts between adding mmio areas and
           memory hotplug
      
         - a fix setting NX bits in page table entries copied from Xen when
           running a PV guest"
      
      * tag 'for-linus-4.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/balloon: Mark unallocated host memory as UNUSABLE
        x86-64/Xen: eliminate W+X mappings
      9ad95bda
    • L
      Merge tag 'xfs-4.15-fixes-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · fca0e39b
      Linus Torvalds 提交于
      Pull xfs fixes from Darrick Wong:
       "Here are some XFS fixes for 4.15-rc5. Apologies for the unusually
        large number of patches this late, but I wanted to make sure the
        corruption fixes were really ready to go.
      
        Changes since last update:
      
         - Fix a locking problem during xattr block conversion that could lead
           to the log checkpointing thread to try to write an incomplete
           buffer to disk, which leads to a corruption shutdown
      
         - Fix a null pointer dereference when removing delayed allocation
           extents
      
         - Remove post-eof speculative allocations when reflinking a block
           past current inode size so that we don't just leave them there and
           assert on inode reclaim
      
         - Relax an assert which didn't accurately reflect the way locking
           works and would trigger under heavy io load
      
         - Avoid infinite loop when cancelling copy on write extents after a
           writeback failure
      
         - Try to avoid copy on write transaction reservation overflows when
           remapping after a successful write
      
         - Fix various problems with the copy-on-write reservation automatic
           garbage collection not being cleaned up properly during a ro
           remount
      
         - Fix problems with rmap log items being processed in the wrong
           order, leading to corruption shutdowns
      
         - Fix problems with EFI recovery wherein the "remove any rmapping if
           present" mechanism wasn't actually doing anything, which would lead
           to corruption problems later when the extent is reallocated,
           leading to multiple rmaps for the same extent"
      
      * tag 'xfs-4.15-fixes-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: only skip rmap owner checks for unknown-owner rmap removal
        xfs: always honor OWN_UNKNOWN rmap removal requests
        xfs: queue deferred rmap ops for cow staging extent alloc/free in the right order
        xfs: set cowblocks tag for direct cow writes too
        xfs: remove leftover CoW reservations when remounting ro
        xfs: don't be so eager to clear the cowblocks tag on truncate
        xfs: track cowblocks separately in i_flags
        xfs: allow CoW remap transactions to use reserve blocks
        xfs: avoid infinite loop when cancelling CoW blocks after writeback failure
        xfs: relax is_reflink_inode assert in xfs_reflink_find_cow_mapping
        xfs: remove dest file's post-eof preallocations before reflinking
        xfs: move xfs_iext_insert tracepoint to report useful information
        xfs: account for null transactions in bunmapi
        xfs: hold xfs_buf locked between shortform->leaf conversion and the addition of an attribute
        xfs: add the ability to join a held buffer to a defer_ops
      fca0e39b
    • L
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 0fc0f18b
      Linus Torvalds 提交于
      Pull crypto fixes from Herbert Xu:
       "This fixes the following issues:
      
         - fix chacha20 crash on zero-length input due to unset IV
      
         - fix potential race conditions in mcryptd with spinlock
      
         - only wait once at top of algif recvmsg to avoid inconsistencies
      
         - fix potential use-after-free in algif_aead/algif_skcipher"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: af_alg - fix race accessing cipher request
        crypto: mcryptd - protect the per-CPU queue with a lock
        crypto: af_alg - wait for data at beginning of recvmsg
        crypto: skcipher - set walk.iv for zero-length inputs
      0fc0f18b
    • L
      Merge tag 'pinctrl-v4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 6ed16756
      Linus Torvalds 提交于
      Pull pin control fix from Linus Walleij:
       "A single pin control fix for Intel machines, affecting a bunch of
        Chromebooks. Nothing else collected up amazingly"
      
      * tag 'pinctrl-v4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: cherryview: Mask all interrupts on Intel_Strago based systems
      6ed16756
    • L
      Merge tag 'drm-fixes-for-v4.15-rc5' of git://people.freedesktop.org/~airlied/linux · e7ae59cb
      Linus Torvalds 提交于
      Pull drm fixes from Dave Airlie:
       "I've got most of two weeks worth of fixes here due to being on
        holidays last week.
      
        The main things are:
      
        - Core:
           * Syncobj fd reference count fix
           * Leasing ioctl misuse fix
      
         - nouveau regression fixes
      
         - further amdgpu DC fixes
      
         - sun4i regression fixes
      
        I'm not sure I'll see many fixes over next couple of weeks, we'll see
        how we go"
      
      * tag 'drm-fixes-for-v4.15-rc5' of git://people.freedesktop.org/~airlied/linux: (27 commits)
        drm/syncobj: Stop reusing the same struct file for all syncobj -> fd
        drm: move lease init after validation in drm_lease_create
        drm/plane: Make framebuffer refcounting the responsibility of setplane_internal callers
        drm/sun4i: hdmi: Move the mode_valid callback to the encoder
        drm/nouveau: fix obvious memory leak
        drm/i915: Protect DDI port to DPLL map from theoretical race.
        drm/i915/lpe: Remove double-encapsulation of info string
        drm/sun4i: Fix error path handling
        drm/nouveau: use alternate memory type for system-memory buffers with kind != 0
        drm/nouveau: avoid GPU page sizes > PAGE_SIZE for buffer objects in host memory
        drm/nouveau/mmu/gp10b: use correct implementation
        drm/nouveau/pci: do a msi rearm on init
        drm/nouveau/imem/nv50: fix refcount_t warning
        drm/nouveau/bios/dp: support DP Info Table 2.0
        drm/nouveau/fbcon: fix NULL pointer access in nouveau_fbcon_destroy
        drm/amd/display: Fix rehook MST display not light back on
        drm/amd/display: fix missing pixel clock adjustment for dongle
        drm/amd/display: set chroma taps to 1 when not scaling
        drm/amd/display: add pipe locking before front end programing
        drm/sun4i: validate modes for HDMI
        ...
      e7ae59cb
    • L
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 7edc3f20
      Linus Torvalds 提交于
      Pull clk fixes from Stephen Boyd:
       "Here's a trio of fixes:
      
         - The runtime PM clk patches that landed this merge window forgot to
           runtime resume devices that may be off while recalculating and
           setting rates of child clks of whatever clk is changing rates.
      
         - We had a NULL pointer deref in an old clk tracepoint when
           clk_set_parent() is called with a NULL parent pointer. This
           shouldn't really happen, but it's best to avoid this regardless.
      
         - The sun9i-mmc clk driver didn't provide 'reset' support, just
           'assert' and 'deassert' so the MMC driver stopped probing when the
           probe was changed to do a reset instead of assert/deassert pair.
           This implements the reset so things work again"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: sunxi: sun9i-mmc: Implement reset callback for reset controls
        clk: fix a panic error caused by accessing NULL pointer
        clk: Manage proper runtime PM state in clk_change_rate()
      7edc3f20
    • T
      init: Invoke init_espfix_bsp() from mm_init() · 613e396b
      Thomas Gleixner 提交于
      init_espfix_bsp() needs to be invoked before the page table isolation
      initialization. Move it into mm_init() which is the place where pti_init()
      will be added.
      
      While at it get rid of the #ifdeffery and provide proper stub functions.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      613e396b
    • T
      x86/cpu_entry_area: Move it out of the fixmap · 92a0f81d
      Thomas Gleixner 提交于
      Put the cpu_entry_area into a separate P4D entry. The fixmap gets too big
      and 0-day already hit a case where the fixmap PTEs were cleared by
      cleanup_highmap().
      
      Aside of that the fixmap API is a pain as it's all backwards.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      92a0f81d
    • T
      x86/cpu_entry_area: Move it to a separate unit · ed1bbc40
      Thomas Gleixner 提交于
      Separate the cpu_entry_area code out of cpu/common.c and the fixmap.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      ed1bbc40
    • P
      x86/mm: Create asm/invpcid.h · 1a3b0cae
      Peter Zijlstra 提交于
      Unclutter tlbflush.h a little.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      1a3b0cae
    • D
      x86/mm: Put MMU to hardware ASID translation in one place · dd95f1a4
      Dave Hansen 提交于
      There are effectively two ASID types:
      
       1. The one stored in the mmu_context that goes from 0..5
       2. The one programmed into the hardware that goes from 1..6
      
      This consolidates the locations where converting between the two (by doing
      a +1) to a single place which gives us a nice place to comment.
      PAGE_TABLE_ISOLATION will also need to, given an ASID, know which hardware
      ASID to flush for the userspace mapping.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      dd95f1a4
    • D
      x86/mm: Remove hard-coded ASID limit checks · cb0a9144
      Dave Hansen 提交于
      First, it's nice to remove the magic numbers.
      
      Second, PAGE_TABLE_ISOLATION is going to consume half of the available ASID
      space.  The space is currently unused, but add a comment to spell out this
      new restriction.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      cb0a9144
    • D
      x86/mm: Move the CR3 construction functions to tlbflush.h · 50fb83a6
      Dave Hansen 提交于
      For flushing the TLB, the ASID which has been programmed into the hardware
      must be known.  That differs from what is in 'cpu_tlbstate'.
      
      Add functions to transform the 'cpu_tlbstate' values into to the one
      programmed into the hardware (CR3).
      
      It's not easy to include mmu_context.h into tlbflush.h, so just move the
      CR3 building over to tlbflush.h.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      50fb83a6
    • P
      x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what · 3f67af51
      Peter Zijlstra 提交于
      Per popular request..
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      3f67af51
    • P
      x86/mm: Remove superfluous barriers · b5fc6d94
      Peter Zijlstra 提交于
      atomic64_inc_return() already implies smp_mb() before and after.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      b5fc6d94
    • P
      x86/mm: Use __flush_tlb_one() for kernel memory · a501686b
      Peter Zijlstra 提交于
      __flush_tlb_single() is for user mappings, __flush_tlb_one() for
      kernel mappings.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a501686b
    • P
      x86/microcode: Dont abuse the TLB-flush interface · 23cb7d46
      Peter Zijlstra 提交于
      Commit:
      
        ec400dde ("x86/microcode_intel_early.c: Early update ucode on Intel's CPU")
      
      ... grubbed into tlbflush internals without coherent explanation.
      
      Since it says its a precaution and the SDM doesn't mention anything like
      this, take it out back.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: fenghua.yu@intel.com
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      23cb7d46
    • P
      x86/uv: Use the right TLB-flush API · 3e46e0f5
      Peter Zijlstra 提交于
      Since uv_flush_tlb_others() implements flush_tlb_others() which is
      about flushing user mappings, we should use __flush_tlb_single(),
      which too is about flushing user mappings.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NAndrew Banman <abanman@hpe.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      3e46e0f5
    • D
      x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack · 4fe2d8b1
      Dave Hansen 提交于
      If the kernel oopses while on the trampoline stack, it will print
      "<SYSENTER>" even if SYSENTER is not involved.  That is rather confusing.
      
      The "SYSENTER" stack is used for a lot more than SYSENTER now.  Give it a
      better string to display in stack dumps, and rename the kernel code to
      match.
      
      Also move the 32-bit code over to the new naming even though it still uses
      the entry stack only for SYSENTER.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4fe2d8b1
    • P
      x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation · e8ffe96e
      Peter Zijlstra 提交于
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: daniel.gruss@iaik.tugraz.at
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      e8ffe96e
    • A
      x86/mm/64: Improve the memory map documentation · 5a7ccf47
      Andy Lutomirski 提交于
      The old docs had the vsyscall range wrong and were missing the fixmap.
      Fix both.
      
      There used to be 8 MB reserved for future vsyscalls, but that's long gone.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5a7ccf47
    • T
      x86/ldt: Prevent LDT inheritance on exec · a4828f81
      Thomas Gleixner 提交于
      The LDT is inherited across fork() or exec(), but that makes no sense
      at all because exec() is supposed to start the process clean.
      
      The reason why this happens is that init_new_context_ldt() is called from
      init_new_context() which obviously needs to be called for both fork() and
      exec().
      
      It would be surprising if anything relies on that behaviour, so it seems to
      be safe to remove that misfeature.
      
      Split the context initialization into two parts. Clear the LDT pointer and
      initialize the mutex from the general context init and move the LDT
      duplication to arch_dup_mmap() which is only called on fork().
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: dan.j.williams@intel.com
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      a4828f81
    • P
      x86/ldt: Rework locking · c2b3496b
      Peter Zijlstra 提交于
      The LDT is duplicated on fork() and on exec(), which is wrong as exec()
      should start from a clean state, i.e. without LDT. To fix this the LDT
      duplication code will be moved into arch_dup_mmap() which is only called
      for fork().
      
      This introduces a locking problem. arch_dup_mmap() holds mmap_sem of the
      parent process, but the LDT duplication code needs to acquire
      mm->context.lock to access the LDT data safely, which is the reverse lock
      order of write_ldt() where mmap_sem nests into context.lock.
      
      Solve this by introducing a new rw semaphore which serializes the
      read/write_ldt() syscall operations and use context.lock to protect the
      actual installment of the LDT descriptor.
      
      So context.lock stabilizes mm->context.ldt and can nest inside of the new
      semaphore or mmap_sem.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: dan.j.williams@intel.com
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c2b3496b
    • T
      arch, mm: Allow arch_dup_mmap() to fail · c10e83f5
      Thomas Gleixner 提交于
      In order to sanitize the LDT initialization on x86 arch_dup_mmap() must be
      allowed to fail. Fix up all instances.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eduardo Valentin <eduval@amazon.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: aliguori@amazon.com
      Cc: dan.j.williams@intel.com
      Cc: hughd@google.com
      Cc: keescook@google.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: linux-mm@kvack.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c10e83f5
    • A
      x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode · 4831b779
      Andy Lutomirski 提交于
      If something goes wrong with pagetable setup, vsyscall=native will
      accidentally fall back to emulation.  Make it warn and fail so that we
      notice.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4831b779
    • A
      x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy · 49275fef
      Andy Lutomirski 提交于
      The kernel is very erratic as to which pagetables have _PAGE_USER set.  The
      vsyscall page gets lucky: it seems that all of the relevant pagetables are
      among the apparently arbitrary ones that set _PAGE_USER.  Rather than
      relying on chance, just explicitly set _PAGE_USER.
      
      This will let us clean up pagetable setup to stop setting _PAGE_USER.  The
      added code can also be reused by pagetable isolation to manage the
      _PAGE_USER bit in the usermode tables.
      
      [ tglx: Folded paravirt fix from Juergen Gross ]
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      49275fef
    • T
      x86/mm/dump_pagetables: Make the address hints correct and readable · 146122e2
      Thomas Gleixner 提交于
      The address hints are a trainwreck. The array entry numbers have to kept
      magically in sync with the actual hints, which is doomed as some of the
      array members are initialized at runtime via the entry numbers.
      
      Designated initializers have been around before this code was
      implemented....
      
      Use the entry numbers to populate the address hints array and add the
      missing bits and pieces. Split 32 and 64 bit for readability sake.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      146122e2
    • T
      x86/mm/dump_pagetables: Check PAGE_PRESENT for real · c0534494
      Thomas Gleixner 提交于
      The check for a present page in printk_prot():
      
             if (!pgprot_val(prot)) {
                      /* Not present */
      
      is bogus. If a PTE is set to PAGE_NONE then the pgprot_val is not zero and
      the entry is decoded in bogus ways, e.g. as RX GLB. That is confusing when
      analyzing mapping correctness. Check for the present bit to make an
      informed decision.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c0534494
    • T
      x86/Kconfig: Limit NR_CPUS on 32-bit to a sane amount · 7bbcbd3d
      Thomas Gleixner 提交于
      The recent cpu_entry_area changes fail to compile on 32-bit when BIGSMP=y
      and NR_CPUS=512, because the fixmap area becomes too big.
      
      Limit the number of CPUs with BIGSMP to 64, which is already way to big for
      32-bit, but it's at least a working limitation.
      
      We performed a quick survey of 32-bit-only machines that might be affected
      by this change negatively, but found none.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      7bbcbd3d
  3. 22 12月, 2017 8 次提交
    • L
      KVM: PPC: Book3S HV: Fix pending_pri value in kvmppc_xive_get_icp() · 7333b5ac
      Laurent Vivier 提交于
      When we migrate a VM from a POWER8 host (XICS) to a POWER9 host
      (XICS-on-XIVE), we have an error:
      
      qemu-kvm: Unable to restore KVM interrupt controller state \
                (0xff000000) for CPU 0: Invalid argument
      
      This is because kvmppc_xics_set_icp() checks the new state
      is internaly consistent, and especially:
      
      ...
         1129         if (xisr == 0) {
         1130                 if (pending_pri != 0xff)
         1131                         return -EINVAL;
      ...
      
      On the other side, kvmppc_xive_get_icp() doesn't set
      neither the pending_pri value, nor the xisr value (set to 0)
      (and kvmppc_xive_set_icp() ignores the pending_pri value)
      
      As xisr is 0, pending_pri must be set to 0xff.
      
      Fixes: 5af50993 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: NLaurent Vivier <lvivier@redhat.com>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7333b5ac
    • C
      KVM: PPC: Book3S: fix XIVE migration of pending interrupts · dc1c4165
      Cédric Le Goater 提交于
      When restoring a pending interrupt, we are setting the Q bit to force
      a retrigger in xive_finish_unmask(). But we also need to force an EOI
      in this case to reach the same initial state : P=1, Q=0.
      
      This can be done by not setting 'old_p' for pending interrupts which
      will inform xive_finish_unmask() that an EOI needs to be sent.
      
      Fixes: 5af50993 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
      Cc: stable@vger.kernel.org # v4.12+
      Suggested-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NCédric Le Goater <clg@kaod.org>
      Reviewed-by: NLaurent Vivier <lvivier@redhat.com>
      Tested-by: NLaurent Vivier <lvivier@redhat.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      dc1c4165
    • C
      drm/syncobj: Stop reusing the same struct file for all syncobj -> fd · e7cdf5c8
      Chris Wilson 提交于
      The vk cts test:
      dEQP-VK.api.external.semaphore.opaque_fd.export_multiple_times_temporary
      
      triggers a lot of
      VFS: Close: file count is 0
      
      Dave pointed out that clearing the syncobj->file from
      drm_syncobj_file_release() was sufficient to silence the test, but that
      opens a can of worm since we assumed that the syncobj->file was never
      unset. Stop trying to reuse the same struct file for every fd pointing
      to the drm_syncobj, and allocate one file for each fd instead.
      
      v2: Fixup return handling of drm_syncobj_fd_to_handle
      v2.1: [airlied: fix possible syncobj ref race]
      Reported-by: NDave Airlie <airlied@redhat.com>
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Tested-by: NDave Airlie <airlied@redhat.com>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      e7cdf5c8
    • D
      Merge tag 'drm-misc-fixes-2017-12-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 12e412d7
      Dave Airlie 提交于
      drm-misc-fixes before holidays:
      
      - fixup for the lease fixup (Keith)
      - fb leak in the ww mutex fallback code (Maarten)
      - sun4i fixes (Maxime, Hans)
      
      * tag 'drm-misc-fixes-2017-12-21' of git://anongit.freedesktop.org/drm/drm-misc:
        drm: move lease init after validation in drm_lease_create
        drm/plane: Make framebuffer refcounting the responsibility of setplane_internal callers
        drm/sun4i: hdmi: Move the mode_valid callback to the encoder
        drm/sun4i: Fix error path handling
        drm/sun4i: validate modes for HDMI
      12e412d7
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ead68f21
      Linus Torvalds 提交于
      Pull networking fixes from David Miller"
       "What's a holiday weekend without some networking bug fixes? [1]
      
         1) Fix some eBPF JIT bugs wrt. SKB pointers across helper function
            calls, from Daniel Borkmann.
      
         2) Fix regression from errata limiting change to marvell PHY driver,
            from Zhao Qiang.
      
         3) Fix u16 overflow in SCTP, from Xin Long.
      
         4) Fix potential memory leak during bridge newlink, from Nikolay
            Aleksandrov.
      
         5) Fix BPF selftest build on s390, from Hendrik Brueckner.
      
         6) Don't append to cfg80211 automatically generated certs file,
            always write new ones from scratch. From Thierry Reding.
      
         7) Fix sleep in atomic in mac80211 hwsim, from Jia-Ju Bai.
      
         8) Fix hang on tg3 MTU change with certain chips, from Brian King.
      
         9) Add stall detection to arc emac driver and reset chip when this
            happens, from Alexander Kochetkov.
      
        10) Fix MTU limitng in GRE tunnel drivers, from Xin Long.
      
        11) Fix stmmac timestamping bug due to mis-shifting of field. From
            Fredrik Hallenberg.
      
        12) Fix metrics match when deleting an ipv4 route. The kernel sets
            some internal metrics bits which the user isn't going to set when
            it makes the delete request. From Phil Sutter.
      
        13) mvneta driver loop over RX queues limits on "txq_number" :-) Fix
            from Yelena Krivosheev.
      
        14) Fix double free and memory corruption in get_net_ns_by_id, from
            Eric W. Biederman.
      
        15) Flush ipv4 FIB tables in the reverse order. Some tables can share
            their actual backing data, in particular this happens for the MAIN
            and LOCAL tables. We have to kill the LOCAL table first, because
            it uses MAIN's backing memory. Fix from Ido Schimmel.
      
        16) Several eBPF verifier value tracking fixes, from Edward Cree, Jann
            Horn, and Alexei Starovoitov.
      
        17) Make changes to ipv6 autoflowlabel sysctl really propagate to
            sockets, unless the socket has set the per-socket value
            explicitly. From Shaohua Li.
      
        18) Fix leaks and double callback invocations of zerocopy SKBs, from
            Willem de Bruijn"
      
      [1] Is this a trick question? "Relaxing"? "Quiet"? "Fine"? - Linus.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits)
        skbuff: skb_copy_ubufs must release uarg even without user frags
        skbuff: orphan frags before zerocopy clone
        net: reevalulate autoflowlabel setting after sysctl setting
        openvswitch: Fix pop_vlan action for double tagged frames
        ipv6: Honor specified parameters in fibmatch lookup
        bpf: do not allow root to mangle valid pointers
        selftests/bpf: add tests for recent bugfixes
        bpf: fix integer overflows
        bpf: don't prune branches when a scalar is replaced with a pointer
        bpf: force strict alignment checks for stack pointers
        bpf: fix missing error return in check_stack_boundary()
        bpf: fix 32-bit ALU op verification
        bpf: fix incorrect tracking of register size truncation
        bpf: fix incorrect sign extension in check_alu_op()
        bpf/verifier: fix bounds calculation on BPF_RSH
        ipv4: Fix use-after-free when flushing FIB tables
        s390/qeth: fix error handling in checksum cmd callback
        tipc: remove joining group member from congested list
        selftests: net: Adding config fragment CONFIG_NUMA=y
        nfp: bpf: keep track of the offloaded program
        ...
      ead68f21
    • D
      Merge branch 'net-zerocopy-fixes' · c50b7c47
      David S. Miller 提交于
      Saeed Mahameed says:
      
      ===================
      Mellanox, mlx5 fixes 2017-12-19
      
      The follwoing series includes some fixes for mlx5 core and etherent
      driver.
      
      Please pull and let me know if there is any problem.
      
      This series doesn't introduce any conflict with the ongoing mlx5 for-next
      submission.
      
      For -stable:
      
      kernels >= v4.7.y
          ("net/mlx5e: Fix possible deadlock of VXLAN lock")
          ("net/mlx5e: Add refcount to VXLAN structure")
          ("net/mlx5e: Prevent possible races in VXLAN control flow")
          ("net/mlx5e: Fix features check of IPv6 traffic")
      
      kernels >= v4.9.y
          ("net/mlx5: Fix error flow in CREATE_QP command")
          ("net/mlx5: Fix rate limit packet pacing naming and struct")
      
      kernels >= v4.13.y
          ("net/mlx5: FPGA, return -EINVAL if size is zero")
      
      kernels >= v4.14.y
          ("Revert "mlx5: move affinity hints assignments to generic code")
      
      All above patches apply and compile with no issues on corresponding -stable.
      ===================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c50b7c47
    • W
      skbuff: skb_copy_ubufs must release uarg even without user frags · b90ddd56
      Willem de Bruijn 提交于
      skb_copy_ubufs creates a private copy of frags[] to release its hold
      on user frags, then calls uarg->callback to notify the owner.
      
      Call uarg->callback even when no frags exist. This edge case can
      happen when zerocopy_sg_from_iter finds enough room in skb_headlen
      to copy all the data.
      
      Fixes: 3ece7826 ("sock: skb_copy_ubufs support for compound pages")
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b90ddd56
    • W
      skbuff: orphan frags before zerocopy clone · 268b7906
      Willem de Bruijn 提交于
      Call skb_zerocopy_clone after skb_orphan_frags, to avoid duplicate
      calls to skb_uarg(skb)->callback for the same data.
      
      skb_zerocopy_clone associates skb_shinfo(skb)->uarg from frag_skb
      with each segment. This is only safe for uargs that do refcounting,
      which is those that pass skb_orphan_frags without dropping their
      shared frags. For others, skb_orphan_frags drops the user frags and
      sets the uarg to NULL, after which sock_zerocopy_clone has no effect.
      
      Qemu hangs were reported due to duplicate vhost_net_zerocopy_callback
      calls for the same data causing the vhost_net_ubuf_ref_>refcount to
      drop below zero.
      
      Link: http://lkml.kernel.org/r/<CAF=yD-LWyCD4Y0aJ9O0e_CHLR+3JOeKicRRTEVCPxgw4XOcqGQ@mail.gmail.com>
      Fixes: 1f8b977a ("sock: enable MSG_ZEROCOPY")
      Reported-by: NAndreas Hartmann <andihartmann@01019freenet.de>
      Reported-by: NDavid Hill <dhill@redhat.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      268b7906