1. 19 May, 2018 15 commits
    • radix tree: fix multi-order iteration race · 9f418224
      Ross Zwisler committed
      Fix a race in the multi-order iteration code which causes the kernel to
      hit a GP fault.  This was first seen with a production v4.15 based
      kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
      order 9 PMD DAX entries.
      
      The race has to do with how we tear down multi-order sibling entries
      when we are removing an item from the tree.  Remember for example that
      an order 2 entry looks like this:
      
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
      
      where 'entry' is in some slot in the struct radix_tree_node, and the
      three slots following 'entry' contain sibling pointers which point back
      to 'entry.'
      
      When we delete 'entry' from the tree, we call:
      
        radix_tree_delete()
          radix_tree_delete_item()
            __radix_tree_delete()
              replace_slot()
      
      replace_slot() first removes the siblings in order from the first to the
      last, and then replaces 'entry' with NULL.  This means that for a
      brief period of time we end up with one or more of the siblings removed,
      so:
      
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
      
      This causes an issue if you have a reader iterating over the slots in
      the tree via radix_tree_for_each_slot() while only under
      rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
      mm/filemap.c.
      
      The issue is that when __radix_tree_next_slot() => skip_siblings() tries
      to skip over the sibling entries in the slots, it currently does so with
      an exact match on the slot directly preceding our current slot.
      Normally this works:
      
                                            V preceding slot
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                                    ^ current slot
      
      This lets you find the first sibling, and you skip them all in order.
      
      But in the case where one of the siblings is NULL, that slot is skipped
      and then our sibling detection is interrupted:
      
                                                   V preceding slot
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                          ^ current slot
      
      This means that the sibling pointers aren't recognized since they point
      all the way back to 'entry', so we think that they are normal internal
      radix tree pointers.  This causes us to think we need to walk down to a
      struct radix_tree_node starting at the address of 'entry'.
      
      In a real running kernel this will crash the thread with a GP fault when
      you try and dereference the slots in your broken node starting at
      'entry'.
      
      We fix this race by changing the way that skip_siblings() detects sibling
      entries.  Instead of testing against the preceding slot, we look for
      siblings via is_sibling_entry(), which checks whether a pointer falls
      within the struct radix_tree_node.slots[] array.  This ensures that sibling
      entries are properly identified, even if they are no longer contiguous
      with the 'entry' they point to.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
      Fixes: 148deab2 ("radix-tree: improve multiorder iterators")
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reported-by: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
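The two detection strategies described in this commit can be compared with a small user-space model. This is a simplified sketch, not the kernel code: `struct node`, `MAP_SIZE`, `points_into_slots()`, `skip_by_prev()`, `skip_by_range()` and `fill_order2()` are hypothetical names, and the model ignores RCU and the kernel's tagged internal pointers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 4	/* hypothetical node width for this model */

struct node {
	void *slots[MAP_SIZE];
};

/* A sibling is any pointer that lands inside this node's own slots[] array. */
static bool points_into_slots(const struct node *n, const void *p)
{
	uintptr_t v = (uintptr_t)p;
	uintptr_t lo = (uintptr_t)&n->slots[0];
	uintptr_t hi = (uintptr_t)&n->slots[MAP_SIZE];

	return v >= lo && v < hi;
}

/*
 * Old approach: skip NULL slots and slots that point exactly at the slot
 * preceding 'start'.  'start' must be at least 1.
 */
static size_t skip_by_prev(const struct node *n, size_t start)
{
	const void *sib = &n->slots[start - 1];
	size_t i = start;

	while (i < MAP_SIZE && (n->slots[i] == NULL || n->slots[i] == sib))
		i++;
	return i;	/* first slot treated as a real entry, or MAP_SIZE */
}

/* Fixed approach: skip NULL slots and anything that points into slots[]. */
static size_t skip_by_range(const struct node *n, size_t start)
{
	size_t i = start;

	while (i < MAP_SIZE &&
	       (n->slots[i] == NULL || points_into_slots(n, n->slots[i])))
		i++;
	return i;
}

/* Build [entry][sibling][sibling][sibling], i.e. an order 2 entry. */
static void fill_order2(struct node *n, void *entry)
{
	n->slots[0] = entry;
	for (size_t i = 1; i < MAP_SIZE; i++)
		n->slots[i] = &n->slots[0];
}
```

With an intact entry both helpers skip all three siblings when started at slot 1. Once slot 1 has been set to NULL and the reader resumes at slot 2, `skip_by_prev()` stops at slot 2 and would treat the sibling pointer as a real entry, while `skip_by_range()` still skips to the end of the node.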
    • radix tree test suite: multi-order iteration race · fd8f58c4
      Ross Zwisler committed
      Add a test which shows a race in the multi-order iteration code.  This
      test reliably hits the race in under a second on my machine, and is the
      result of a real bug report against a production v4.15 based
      kernel (4.15.6-300.fc27.x86_64).  With a real kernel this issue is hit
      when using order 9 PMD DAX radix tree entries.
      
      The race has to do with how we tear down multi-order sibling entries
      when we are removing an item from the tree.  Remember that an order 2
      entry looks like this:
      
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
      
      where 'entry' is in some slot in the struct radix_tree_node, and the
      three slots following 'entry' contain sibling pointers which point back
      to 'entry.'
      
      When we delete 'entry' from the tree, we call:
      
        radix_tree_delete()
          radix_tree_delete_item()
            __radix_tree_delete()
              replace_slot()
      
      replace_slot() first removes the siblings in order from the first to the
      last, and then replaces 'entry' with NULL.  This means that for a
      brief period of time we end up with one or more of the siblings removed,
      so:
      
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
      
      This causes an issue if you have a reader iterating over the slots in
      the tree via radix_tree_for_each_slot() while only under
      rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
      mm/filemap.c.
      
      The issue is that when __radix_tree_next_slot() => skip_siblings() tries
      to skip over the sibling entries in the slots, it currently does so with
      an exact match on the slot directly preceding our current slot.
      Normally this works:
      
                                            V preceding slot
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                                    ^ current slot
      
      This lets you find the first sibling, and you skip them all in order.
      
      But in the case where one of the siblings is NULL, that slot is skipped
      and then our sibling detection is interrupted:
      
                                                   V preceding slot
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                          ^ current slot
      
      This means that the sibling pointers aren't recognized since they point
      all the way back to 'entry', so we think that they are normal internal
      radix tree pointers.  This causes us to think we need to walk down to a
      struct radix_tree_node starting at the address of 'entry'.
      
      In a real running kernel this will crash the thread with a GP fault when
      you try and dereference the slots in your broken node starting at
      'entry'.
      
      In the radix tree test suite this will be caught by the address
      sanitizer:
      
        ==27063==ERROR: AddressSanitizer: heap-buffer-overflow on address
        0x60c0008ae400 at pc 0x00000040ce4f bp 0x7fa89b8fcad0 sp 0x7fa89b8fcac0
        READ of size 8 at 0x60c0008ae400 thread T3
            #0 0x40ce4e in __radix_tree_next_slot /home/rzwisler/project/linux/tools/testing/radix-tree/radix-tree.c:1660
            #1 0x4022cc in radix_tree_next_slot linux/../../../../include/linux/radix-tree.h:567
            #2 0x4022cc in iterator_func /home/rzwisler/project/linux/tools/testing/radix-tree/multiorder.c:655
            #3 0x7fa8a088d50a in start_thread (/lib64/libpthread.so.0+0x750a)
            #4 0x7fa8a03bd16e in clone (/lib64/libc.so.6+0xf516e)
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-5-ross.zwisler@linux.intel.com
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • radix tree test suite: add item_delete_rcu() · 3e252fa7
      Ross Zwisler committed
      Currently the lifetime of "struct item" entries in the radix tree is
      not controlled by RCU; the entries are instead freed inline as they are
      removed from the tree.
      
      In the following patches we add a test which has threads iterating over
      items pulled from the tree and verifying them in an
      rcu_read_lock()/rcu_read_unlock() section.  This means that though an
      item has been removed from the tree it could still be being worked on by
      other threads until the RCU grace period expires.  So, we need to
      actually free the "struct item" structures at the end of the grace
      period, just as we do with "struct radix_tree_node" items.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-4-ross.zwisler@linux.intel.com
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
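The idea of freeing an item only at the end of the grace period can be illustrated with a single-threaded model of a deferred callback queue. This is a sketch under stated assumptions, not the test suite's code: `struct cb_head`, `defer_call()`, `grace_period_elapsed()`, `item_free_cb()` and `item_delete_deferred()` are hypothetical names, and the "grace period" is simulated by an explicit function call rather than by real RCU.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical single-threaded stand-in for an RCU callback head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

struct item {
	struct cb_head cb;	/* embedded so the callback can find the item */
	unsigned long index;
};

static struct cb_head *pending;	/* callbacks waiting for the grace period */
static unsigned int items_freed;

/* Queue a callback instead of running it now (models a call_rcu()-style API). */
static void defer_call(struct cb_head *head, void (*func)(struct cb_head *))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Run every queued callback (models the end of a grace period). */
static void grace_period_elapsed(void)
{
	while (pending) {
		struct cb_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Recover the enclosing item from its embedded callback head and free it. */
static void item_free_cb(struct cb_head *head)
{
	struct item *it = (struct item *)((char *)head - offsetof(struct item, cb));

	free(it);
	items_freed++;
}

/* Deleting an item only schedules the free; readers may still be using it. */
static void item_delete_deferred(struct item *it)
{
	defer_call(&it->cb, item_free_cb);
}
```

In this model an item stays readable after `item_delete_deferred()` and is released only when `grace_period_elapsed()` runs, which mirrors the ordering the commit message asks for.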
    • radix tree test suite: fix compilation issue · dcbbf25a
      Ross Zwisler committed
      Pulled from a patch from Matthew Wilcox entitled "xarray: Add definition
      of struct xarray":
      
      > From: Matthew Wilcox <mawilcox@microsoft.com>
      > Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      
        https://patchwork.kernel.org/patch/10341249/
      
      These defines fix this compilation error:
      
        In file included from ./linux/radix-tree.h:6:0,
                         from ./linux/../../../../include/linux/idr.h:15,
                         from ./linux/idr.h:1,
                         from idr.c:4:
        ./linux/../../../../include/linux/idr.h: In function `idr_init_base':
        ./linux/../../../../include/linux/radix-tree.h:129:2: warning: implicit declaration of function `spin_lock_init'; did you mean `spinlock_t'? [-Wimplicit-function-declaration]
          spin_lock_init(&(root)->xa_lock);    \
          ^
        ./linux/../../../../include/linux/idr.h:126:2: note: in expansion of macro `INIT_RADIX_TREE'
          INIT_RADIX_TREE(&idr->idr_rt, IDR_RT_MARKER);
          ^~~~~~~~~~~~~~~
      
      by providing a spin_lock_init() wrapper for the v4.17-rc* version of the
      radix tree test suite.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-3-ross.zwisler@linux.intel.com
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
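A user-space spin_lock_init() wrapper of the kind this commit describes could look roughly like the following. This is an illustrative assumption about the shape of such a shim, not the contents of the patch: backing the lock with a pthread mutex, and the names `struct tree_root` and `INIT_TREE_ROOT()`, are choices made for this sketch.

```c
#include <pthread.h>
#include <stddef.h>

/* User-space stand-ins: a "spinlock" backed by a pthread mutex. */
typedef pthread_mutex_t spinlock_t;
#define spin_lock_init(lock)	pthread_mutex_init((lock), NULL)
#define spin_lock(lock)		pthread_mutex_lock((lock))
#define spin_unlock(lock)	pthread_mutex_unlock((lock))

/* Hypothetical root structure that embeds a lock, as the error output suggests. */
struct tree_root {
	spinlock_t xa_lock;
	void *rnode;
};

/* Initializer macro that needs spin_lock_init() to be declared before use. */
#define INIT_TREE_ROOT(root)			\
do {						\
	spin_lock_init(&(root)->xa_lock);	\
	(root)->rnode = NULL;			\
} while (0)
```

Once the wrapper is defined ahead of the initializer macro, code that expands `INIT_TREE_ROOT()` compiles without the implicit-declaration warning shown above.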
    • radix tree test suite: fix mapshift build target · 8d9fa88e
      Ross Zwisler committed
      Commit c6ce3e2f ("radix tree test suite: Add config option for map
      shift") introduced a phony makefile target called 'mapshift' that ends
      up generating the file generated/map-shift.h.  This phony target was
      then added as a dependency of the top level 'targets' build target,
      which is what is run when you go to tools/testing/radix-tree and just
      type 'make'.
      
      Unfortunately, this phony target doesn't actually work as a dependency,
      so you end up getting:
      
        $ make
        make: *** No rule to make target 'generated/map-shift.h', needed by 'main.o'.  Stop.
        make: *** Waiting for unfinished jobs....
      
      Fix this by making the file generated/map-shift.h our real makefile
      target, and adding it as a dependency of the top level build target.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-2-ross.zwisler@linux.intel.com
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/mm.h: add new inline function vmf_error() · d97baf94
      Souptick Joarder committed
      In many places in drivers and filesystems, errors are handled in a common
      way, like below:
      
      	ret = (ret == -ENOMEM) ? VM_FAULT_OOM : VM_FAULT_SIGBUS;
      
      vmf_error() replaces this pattern and returns a vm_fault_t error code.
      
      A lot of drivers and filesystems currently have a rather complex mapping
      of errno-to-VM_FAULT code.  We have been able to eliminate a lot of it
      by just returning VM_FAULT codes directly from functions which are
      called exclusively from the fault handling path.
      
      Some functions can be called both from the fault handler and from other
      contexts which expect an errno, so they have to continue to return
      an errno.  Some users still need to choose different behaviour for
      different errnos, but vmf_error() captures the essential error
      translation that's common to all users, and those that need to handle
      additional errors can handle them first.
      
      Link: http://lkml.kernel.org/r/20180510174826.GA14268@jordon-HP-15-Notebook-PC
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
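The translation described in this commit can be sketched as a small stand-alone helper. The type and the VM_FAULT_* values below are stand-ins defined only so the example compiles in user space, and `example_fault()` is a hypothetical caller that handles one extra errno before falling back to the common mapping.

```c
#include <errno.h>

/* Stand-ins for the kernel's definitions; the values here are illustrative. */
typedef unsigned int vm_fault_t;
#define VM_FAULT_OOM	0x0001u
#define VM_FAULT_SIGBUS	0x0002u
#define VM_FAULT_NOPAGE	0x0100u

/* Map an errno to a fault code: -ENOMEM becomes OOM, anything else SIGBUS. */
static inline vm_fault_t vmf_error(int err)
{
	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	return VM_FAULT_SIGBUS;
}

/*
 * Hypothetical fault-path caller: it deals with the cases it cares about
 * first and leaves every other error to vmf_error().
 */
static vm_fault_t example_fault(int err)
{
	if (err == 0)
		return 0;
	if (err == -EBUSY)
		return VM_FAULT_NOPAGE;
	return vmf_error(err);
}
```

The caller keeps its special cases local, so the shared helper only has to encode the mapping that is common to all users.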
    • lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly · 1e3054b9
      Matthew Wilcox committed
      I had neglected to increment the error counter when the tests failed,
      which made the tests noisy when they failed, but they did not actually
      return an error code.
      
      Link: http://lkml.kernel.org/r/20180509114328.9887-1-mpe@ellerman.id.au
      Fixes: 3cc78125 ("lib/test_bitmap.c: add optimisation tests")
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: Yury Norov <ynorov@caviumnetworks.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
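The class of bug fixed here, a check that prints a failure but never counts it, can be illustrated with a minimal sketch. All names below (`total_tests`, `failed_tests`, `expect_eq_uint()`, `test_result()`) are hypothetical and are not taken from lib/test_bitmap.c.

```c
#include <stdio.h>

static unsigned int total_tests;
static unsigned int failed_tests;

/*
 * Compare two values.  On mismatch, report it and count it, so that the
 * overall result reflects the failure instead of only logging it.
 */
static int expect_eq_uint(unsigned int expected, unsigned int actual, int line)
{
	total_tests++;
	if (expected == actual)
		return 1;

	fprintf(stderr, "line %d: expected %u, got %u\n", line, expected, actual);
	failed_tests++;		/* omitting this makes failures noisy but invisible */
	return 0;
}

/* Return 0 when every check passed, -1 otherwise. */
static int test_result(void)
{
	return failed_tests ? -1 : 0;
}
```

A runner that returns `test_result()` reports an error as soon as any single check has failed.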
    • Merge tag 'powerpc-4.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 2c71d338
      Linus Torvalds committed
      Pull powerpc fixes from Michael Ellerman:
       "Just three commits.
      
        The two cxl ones are not fixes per se, but they modify code that was
        added this cycle so that it will work with a recent firmware change.
      
        And then a fix for a recent commit that added sleeps in the NVRAM
        code, which needs to be more careful and not sleep if eg. we're called
        in the panic() path.
      
        Thanks to Nicholas Piggin, Philippe Bergheaud, Christophe Lombard"
      
      * tag 'powerpc-4.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/powernv: Fix NVRAM sleep in invalid context when crashing
        cxl: Report the tunneled operations status
        cxl: Set the PBCQ Tunnel BAR register when enabling capi mode
    • Merge tag 'acpi-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · d3154821
      Linus Torvalds committed
      Pull ACPI fix from Rafael Wysocki:
       "Fix an ACPICA regression introduced in this cycle and related to the
        handling of package objects loaded by the Load and loadTable AML
        operators that are not initialized properly after recent changes (Bob
        Moore)"
      
      * tag 'acpi-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPICA: Add deferred package support for the Load and loadTable operators
    • Merge tag 'pm-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 477e2c6f
      Linus Torvalds committed
      Pull power management fix from Rafael Wysocki:
       "Fix Kconfig dependencies of the armada-37xx cpufreq driver (Miquel
        Raynal)"
      
      * tag 'pm-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: armada-37xx: driver relies on cpufreq-dt
    • Merge tag 'usb-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 0e273f9e
      Linus Torvalds committed
      Pull USB fixes from Greg KH:
       "Here are some USB driver fixes for 4.17-rc6.
      
        They resolve some reported bugs in the musb driver, the xhci driver,
        and a number of small fixes for the usbip driver.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usbip: usbip_host: fix bad unlock balance during stub_probe()
        usbip: usbip_host: fix NULL-ptr deref and use-after-free errors
        usbip: usbip_host: run rebind from exit when module is removed
        usbip: usbip_host: delete device from busid_table after rebind
        usbip: usbip_host: refine probe and disconnect debug msgs to be useful
        usb: musb: fix remote wakeup racing with suspend
        xhci: Fix USB3 NULL pointer dereference at logical disconnect.
    • Merge tag 'for-linus-20180518' of git://git.kernel.dk/linux-block · 61c2ad9a
      Linus Torvalds committed
      Pull block fix from Jens Axboe:
       "Single fix this time, from Coly, fixing a failure case when
        CONFIG_DEBUG_FS isn't enabled"
      
      * tag 'for-linus-20180518' of git://git.kernel.dk/linux-block:
        bcache: return 0 from bch_debug_init() if CONFIG_DEBUG_FS=n
    • Merge tag 'spi-fix-v4.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 8ccaecd0
      Linus Torvalds committed
      Pull spi fixes from Mark Brown:
       "A small collection of fixes accumulated since the merge window, all
        fairly small and driver specific"
      
      * tag 'spi-fix-v4.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: bcm2835aux: ensure interrupts are enabled for shared handler
        spi: bcm-qspi: Always read and set BSPI_MAST_N_BOOT_CTRL
        spi: bcm-qspi: Avoid setting MSPI_CDRAM_PCS for spi-nor master
        spi: pxa2xx: Allow 64-bit DMA
        spi: cadence: Add usleep_range() for cdns_spi_fill_tx_fifo()
        spi: sh-msiof: Fix bit field overflow writes to TSCR/RSCR
        spi: imx: Update MODULE_DESCRIPTION to "SPI Controller driver"
    • Merge tag 'mtd/fixes-for-4.17-rc6' of git://git.infradead.org/linux-mtd · 163ced61
      Linus Torvalds committed
      Pull mtd fixes from Boris Brezillon:
       "NAND fixes:
         - Fix read path of the Marvell NAND driver
         - Make sure we don't pass a u64 to ndelay()
      
        CFI fix:
         - Fix the map_word_andequal() implementation"
      
      * tag 'mtd/fixes-for-4.17-rc6' of git://git.infradead.org/linux-mtd:
        mtd: rawnand: Fix return type of __DIVIDE() when called with 32-bit
        mtd: rawnand: marvell: Fix read logic for layouts with ->nchunks > 2
        mtd: Fix comparison in map_word_andequal()
    • Merge tag 'drm-fixes-for-v4.17-rc6' of git://people.freedesktop.org/~airlied/linux · d90eb183
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "Pretty quiet week again: one vmwgfx regression fix, one core buffer
        overflow fix, one vc4 leak fix and three i915 fixes"
      
      * tag 'drm-fixes-for-v4.17-rc6' of git://people.freedesktop.org/~airlied/linux:
        drm/dumb-buffers: Integer overflow in drm_mode_create_ioctl()
        drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk
        drm/vmwgfx: Set dmabuf_size when vmw_dmabuf_init is successful
        drm/vc4: Fix leak of the file_priv that stored the perfmon.
        drm/i915/execlists: Use rmb() to order CSB reads
        drm/i915/userptr: reject zero user_size
        drm: Match sysfs name in link removal to link creation
  2. 18 May, 2018 8 commits
    • Merge tag 'drm-intel-fixes-2018-05-17' of... · 1827cad9
      Dave Airlie committed
      Merge tag 'drm-intel-fixes-2018-05-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      - Userptr IOCTL zero size check (Matt)
      - Two hardware quirk fixes (Michel & Chris)
      
      * tag 'drm-intel-fixes-2018-05-17' of git://anongit.freedesktop.org/drm/drm-intel:
        drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk
        drm/i915/execlists: Use rmb() to order CSB reads
        drm/i915/userptr: reject zero user_size
    • Merge tag 'hwmon-for-linus-v4.17-rc6' of... · 3acf4e39
      Linus Torvalds committed
      Merge tag 'hwmon-for-linus-v4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
       "Two k10temp fixes:
      
         - fix race condition when accessing System Management Network
           registers
      
         - fix reading critical temperatures on F15h M60h and M70h
      
        Also add PCI IDs for the AMD Raven Ridge root bridge"
      
      * tag 'hwmon-for-linus-v4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (k10temp) Use API function to access System Management Network
        x86/amd_nb: Add support for Raven Ridge CPUs
        hwmon: (k10temp) Fix reading critical temperature register
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 58ddfe6c
      Linus Torvalds committed
      Pull kvm fixes from Paolo Bonzini:
      
       - ARM/ARM64 locking fixes
      
       - x86 fixes: PCID, UMIP, locking
      
       - improved support for recent Windows versions that have a 2048 Hz APIC
         timer
      
       - rename KVM_HINTS_DEDICATED CPUID bit to KVM_HINTS_REALTIME
      
       - better behaved selftests
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvm: rename KVM_HINTS_DEDICATED to KVM_HINTS_REALTIME
        KVM: arm/arm64: VGIC/ITS save/restore: protect kvm_read_guest() calls
        KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock
        KVM: arm/arm64: VGIC/ITS: Promote irq_lock() in update_affinity
        KVM: arm/arm64: Properly protect VGIC locks from IRQs
        KVM: X86: Lower the default timer frequency limit to 200us
        KVM: vmx: update sec exec controls for UMIP iff emulating UMIP
        kvm: x86: Suppress CR3_PCID_INVD bit only when PCIDs are enabled
        KVM: selftests: exit with 0 status code when tests cannot be run
        KVM: hyperv: idr_find needs RCU protection
        x86: Delay skip of emulated hypercall instruction
        KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs
    • Merge tag 'sound-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 7c9a0fc7
      Linus Torvalds committed
      Pull sound fixes from Takashi Iwai:
       "We have a core fix in the compat code for covering a potential race
        (double references), but it's a very minor change.
      
        The rest are all small device-specific quirks, as well as a correction
        of the new UAC3 support code"
      
      * tag 'sound-4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: usb-audio: Use Class Specific EP for UAC3 devices.
        ALSA: hda/realtek - Clevo P950ER ALC1220 Fixup
        ALSA: usb: mixer: volume quirk for CM102-A+/102S+
        ALSA: hda: Add Lenovo C50 All in one to the power_save blacklist
        ALSA: control: fix a redundant-copy issue
    • kvm: rename KVM_HINTS_DEDICATED to KVM_HINTS_REALTIME · 633711e8
      Michael S. Tsirkin committed
      KVM_HINTS_DEDICATED seems to be somewhat confusing:
      
      Guest doesn't really care whether it's the only task running on a host
      CPU as long as it's not preempted.
      
      And there are more reasons for Guest to be preempted than host CPU
      sharing, for example, with memory overcommit it can get preempted on a
      memory access, post copy migration can cause preemption, etc.
      
      Let's call it KVM_HINTS_REALTIME which seems to better
      match what guests expect.
      
      Also, the flag must be set on all vCPUs - current guests assume this.
      Note so in the documentation.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3e9245c5
      Linus Torvalds committed
      Pull s390 fixes from Martin Schwidefsky:
      
       - a fix for the vfio ccw translation code
      
       - update an incorrect email address in the MAINTAINERS file
      
       - fix a division by zero oops in the cpum_sf code found by trinity
      
       - two fixes for the error handling of the qdio code
      
       - several spectre related patches to convert all left-over indirect
         branches in the kernel to expoline branches
      
       - update defconfigs to avoid warnings due to the netfilter Kconfig
         changes
      
       - avoid several compiler warnings in the kexec_file code for s390
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/qdio: don't release memory in qdio_setup_irq()
        s390/qdio: fix access to uninitialized qdio_q fields
        s390/cpum_sf: ensure sample frequency of perf event attributes is non-zero
        s390: use expoline thunks in the BPF JIT
        s390: extend expoline to BC instructions
        s390: remove indirect branch from do_softirq_own_stack
        s390: move spectre sysfs attribute code
        s390/kernel: use expoline for indirect branches
        s390/ftrace: use expoline for indirect branches
        s390/lib: use expoline for indirect branches
        s390/crc32-vx: use expoline for indirect branches
        s390: move expoline assembler macros to a header
        vfio: ccw: fix cleanup if cp_prefetch fails
        s390/kexec_file: add declaration of purgatory related globals
        s390: update defconfigs
        MAINTAINERS: update s390 zcrypt maintainers email address
    • Merge tag 'selinux-pr-20180516' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 305bb552
      Linus Torvalds committed
      Pull SELinux fixes from Paul Moore:
       "A small pull request to fix a few regressions in the SELinux/SCTP code
        with applications that call bind() with AF_UNSPEC/INADDR_ANY.
      
        The individual commit descriptions have more information, but the
        commits themselves should be self explanatory"
      
      * tag 'selinux-pr-20180516' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: correctly handle sa_family cases in selinux_sctp_bind_connect()
        selinux: fix address family in bind() and connect() to match address/port
        selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()
    • proc: do not access cmdline nor environ from file-backed areas · 7f7ccc2c
      Willy Tarreau committed
      proc_pid_cmdline_read() and environ_read() directly access the target
      process' VM to retrieve the command line and environment. If this
      process remaps these areas onto a file via mmap(), the requesting
      process may experience various issues such as extra delays if the
      underlying device is slow to respond.
      
      Let's simply refuse to access file-backed areas in these functions.
      For this we add a new FOLL_ANON gup flag that is passed to all calls
      to access_remote_vm(). The code already takes care of such failures
      (including unmapped areas). Accesses via /proc/pid/mem were not
      changed though.
      
      This was assigned CVE-2018-1120.
      
      Note for stable backports: the patch may apply to kernels prior to 4.11
      but silently miss one location; it must be checked that no call to
      access_remote_vm() keeps zero as the last argument.
      Reported-by: Qualys Security Advisory <qsa@qualys.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 17 May, 2018 7 commits
    • bcache: return 0 from bch_debug_init() if CONFIG_DEBUG_FS=n · 1c1a2ee1
      Coly Li committed
      Commit 539d39eb ("bcache: fix wrong return value in bch_debug_init()")
      returns the return value of debugfs_create_dir() to bcache_init(). When
      CONFIG_DEBUG_FS=n, bch_debug_init() always returns 1 and makes
      bcache_init() fail.
      
      This patch makes bch_debug_init() always return 0 if CONFIG_DEBUG_FS=n,
      so bcache can continue to work for kernels which don't have debugfs
      enabled.
      
      Changelog:
      v4: Add Acked-by from Kent Overstreet.
      v3: Use IS_ENABLED(CONFIG_DEBUG_FS) to replace #ifdef DEBUG_FS.
      v2: Remove a warning message.
      v1: Initial version.
      
      Fixes: Commit 539d39eb ("bcache: fix wrong return value in bch_debug_init()")
      Cc: stable@vger.kernel.org
      Signed-off-by: Coly Li <colyli@suse.de>
      Reported-by: Massimo B. <massimo.b@gmx.net>
      Reported-by: Kai Krakow <kai@kaishome.de>
      Tested-by: Kai Krakow <kai@kaishome.de>
      Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
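The behaviour change in this commit can be modelled in user space as follows. This is a simplified sketch, not the bcache code: `DEBUG_FS_ENABLED` is a hypothetical stand-in for IS_ENABLED(CONFIG_DEBUG_FS), `create_debug_dir()` is a hypothetical stub for a debugfs that is compiled out, and returning NULL from it is a simplification of what the kernel stub returns.

```c
#include <stddef.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_DEBUG_FS); 0 models =n. */
#define DEBUG_FS_ENABLED 0

/* Hypothetical stub: with debugfs compiled out, no usable directory exists. */
static void *create_debug_dir(const char *name)
{
	(void)name;
	return NULL;
}

/* Before the fix: a missing directory is reported as an init failure. */
static int debug_init_old(void)
{
	void *dir = create_debug_dir("bcache");

	return dir == NULL;	/* always 1 when debugfs is disabled */
}

/* After the fix: with debugfs disabled there is nothing to set up. */
static int debug_init_fixed(void)
{
	if (!DEBUG_FS_ENABLED)
		return 0;	/* not an error, so the caller can continue */

	return create_debug_dir("bcache") == NULL;
}
```

Using a constant expression in an ordinary `if`, rather than an `#ifdef` block, keeps both branches visible to the compiler while still letting it drop the dead one.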
    • powerpc/powernv: Fix NVRAM sleep in invalid context when crashing · c1d2a313
      Nicholas Piggin committed
      Similarly to opal_event_shutdown, opal_nvram_write can be called in
      the crash path with irqs disabled. Special case the delay to avoid
      sleeping in invalid context.
      
      Fixes: 3b807033 ("powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops")
      Cc: stable@vger.kernel.org # v3.2
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • D
      Merge branch 'vmwgfx-fixes-4.17' of git://people.freedesktop.org/~thomash/linux into drm-fixes · bc91d181
      Dave Airlie authored
      A single fix for a recent regression.
      
      * 'vmwgfx-fixes-4.17' of git://people.freedesktop.org/~thomash/linux:
        drm/vmwgfx: Set dmabuf_size when vmw_dmabuf_init is successful
      bc91d181
    • D
      Merge tag 'drm-misc-fixes-2018-05-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 3d3aa969
      Dave Airlie authored
      - core: Fix regression in dev node offsets (Haneen)
      - vc4: Fix memory leak on driver close (Eric)
      - dumb-buffers: Prevent overflow in DIV_ROUND_UP() (Dan)
      
      Cc: Haneen Mohammed <hamohammed.sa@gmail.com>
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      
      * tag 'drm-misc-fixes-2018-05-16' of git://anongit.freedesktop.org/drm/drm-misc:
        drm/dumb-buffers: Integer overflow in drm_mode_create_ioctl()
        drm/vc4: Fix leak of the file_priv that stored the perfmon.
        drm: Match sysfs name in link removal to link creation
      3d3aa969
    • L
      Merge tag 'trace-v4.17-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · e6506eb2
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
       "Some of the ftrace internal events use a zero for a data size of a
        field event. This is increasingly important for the histogram trigger
        work that is being extended.
      
        While auditing trace events, I found that a couple of the xen events
        were used as just marking that a function was called, by creating a
        static array of size zero. This can play havoc with the tracing
        features if these events are used, because a zero size of a static
        array is denoted as a special nul terminated dynamic array (this is
        what the trace_marker code uses). But since the xen events have no
        size, they are not nul terminated, and unexpected results may occur.
      
        As trace events were never intended on being a marker to denote that a
        function was hit or not, especially since function tracing and kprobes
        can trivially do the same, the best course of action is to simply
        remove these events"
      
      * tag 'trace-v4.17-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all}
      e6506eb2
    • L
      Merge tag 'trace-v4.17-rc5-vsprintf' of... · 9d38cd06
      Linus Torvalds authored
      Merge tag 'trace-v4.17-rc5-vsprintf' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull memory barrier fix from Steven Rostedt:
       "The memory barrier usage in updating the random ptr hash for %p in
        vsprintf is incorrect.
      
        Instead of adding the read memory barrier into vsprintf() which will
        cause a slight degradation to a commonly used function in the kernel
        just to solve a very unlikely race condition that can only happen at
        boot up, change the code from using a variable branch to a
        static_branch.
      
        Not only does this solve the race condition, it actually will improve
        the performance of vsprintf() by removing the conditional branch that
        is only needed at boot"
      
      * tag 'trace-v4.17-rc5-vsprintf' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        vsprintf: Replace memory barrier with static_key for random_ptr_key update
      9d38cd06
    • S
      usbip: usbip_host: fix bad unlock balance during stub_probe() · c171654c
      Shuah Khan (Samsung OSG) authored
      stub_probe() calls put_busid_priv() in an error path when the device
      isn't found in the busid_table. Fix it by making put_busid_priv() safe
      to be called with a null struct bus_id_priv pointer.
      
      This problem happens when "usbip bind" is run without the usbip_host
      driver loaded and the driver is then loaded with modprobe. The first
      failed bind attempt unbinds the device from the original driver, and when
      usbip_host is modprobed, stub_probe() runs, doesn't find the device in
      its busid table, and calls put_busid_priv() with a null bus_id_priv pointer.
      
      usbip-host 3-10.2: 3-10.2 is not in match_busid table...  skip!
      
      [  367.359679] =====================================
      [  367.359681] WARNING: bad unlock balance detected!
      [  367.359683] 4.17.0-rc4+ #5 Not tainted
      [  367.359685] -------------------------------------
      [  367.359688] modprobe/2768 is trying to release lock (
      [  367.359689]
      ==================================================================
      [  367.359696] BUG: KASAN: null-ptr-deref in print_unlock_imbalance_bug+0x99/0x110
      [  367.359699] Read of size 8 at addr 0000000000000058 by task modprobe/2768
      
      [  367.359705] CPU: 4 PID: 2768 Comm: modprobe Not tainted 4.17.0-rc4+ #5
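
The null-safe release described above can be sketched in userspace C. This is only an illustration under stated assumptions: struct busid_entry, get_entry() and put_entry() are hypothetical stand-ins for struct bus_id_priv and its get/put helpers, and a plain counter stands in for the per-entry lock.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct bus_id_priv. */
struct busid_entry {
	int id;
	int lock_depth;	/* stand-in for the per-entry lock: 1 while held */
};

/* Lookup returns the entry "locked", or NULL when id is not in the table. */
static struct busid_entry *get_entry(struct busid_entry *tbl, size_t n, int id)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].id == id) {
			tbl[i].lock_depth++;	/* take the lock */
			return &tbl[i];
		}
	}
	return NULL;
}

/* Release is a no-op for NULL, so error paths may call it unconditionally. */
static void put_entry(struct busid_entry *e)
{
	if (e)
		e->lock_depth--;	/* drop the lock only if one was taken */
}
```

With this shape, a probe routine can jump to a single cleanup label that calls put_entry() whether or not the lookup succeeded, and the lock is never released without having been taken.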
      
      Fixes: 22076557 ("usbip: usbip_host: fix NULL-ptr deref and use-after-free errors") in usb-linus
      Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
      Cc: stable <stable@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c171654c
  4. 16 May, 2018 7 commits
    • D
      drm/dumb-buffers: Integer overflow in drm_mode_create_ioctl() · 2b620729
      Dan Carpenter authored
      There is a comment here which mentions DIV_ROUND_UP(), and that's where
      the problem comes from.  Say you pick:
      
      	args->bpp = UINT_MAX - 7;
      	args->width = 4;
      	args->height = 1;
      
      The integer overflow in DIV_ROUND_UP() means "cpp" is UINT_MAX / 8 and
      because of how we picked args->width that means cpp < UINT_MAX / 4.
      
      I've fixed it by preventing the integer overflow in DIV_ROUND_UP().  I
      removed the check for !cpp because it's not possible after this change.
      I also changed all the 0xffffffffU references to U32_MAX.
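
A minimal userspace sketch of the kind of guard described above, not the code from the patch itself. The helper names bpp_to_cpp() and stride_bytes() are hypothetical, and the exact bound checked (bpp > UINT32_MAX - 8) is an assumption chosen so that the addition inside DIV_ROUND_UP() cannot wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel macro; n + d - 1 can wrap for large n. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: convert bits per pixel to bytes per pixel,
 * rejecting any bpp for which the rounding addition would overflow.
 */
static bool bpp_to_cpp(uint32_t bpp, uint32_t *cpp)
{
	if (bpp > UINT32_MAX - 8)
		return false;	/* reject before the addition can wrap */
	*cpp = DIV_ROUND_UP(bpp, 8u);
	return true;
}

/* Hypothetical helper: cpp * width, rejecting products above 32 bits. */
static bool stride_bytes(uint32_t cpp, uint32_t width, uint32_t *stride)
{
	if (width == 0 || cpp > UINT32_MAX / width)
		return false;
	*stride = cpp * width;
	return true;
}
```

Checking the operand before calling DIV_ROUND_UP() keeps the macro itself unchanged, and once bpp is known to be in range the later multiplication can be guarded with a single division.
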
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180516140026.GA19340@mwanda
      2b620729
    • S
      vsprintf: Replace memory barrier with static_key for random_ptr_key update · 85f4f12d
      Steven Rostedt (VMware) authored
      Reviewing Tobin's patches for getting pointers out early before
      entropy has been established, I noticed that there's a lone smp_mb() in
      the code. As with most lone memory barriers, this one appears to be
      incorrectly used.
      
      We currently basically have this:
      
      	get_random_bytes(&ptr_key, sizeof(ptr_key));
      	/*
      	 * have_filled_random_ptr_key==true is dependent on get_random_bytes().
      	 * ptr_to_id() needs to see have_filled_random_ptr_key==true
      	 * after get_random_bytes() returns.
      	 */
      	smp_mb();
      	WRITE_ONCE(have_filled_random_ptr_key, true);
      
      And later we have:
      
      	if (unlikely(!have_filled_random_ptr_key))
      		return string(buf, end, "(ptrval)", spec);
      
      /* Missing memory barrier here. */
      
      	hashval = (unsigned long)siphash_1u64((u64)ptr, &ptr_key);
      
      As the CPU can perform speculative loads, we could have a situation
      with the following:
      
      	CPU0				CPU1
      	----				----
      				   load ptr_key = 0
         store ptr_key = random
         smp_mb()
         store have_filled_random_ptr_key
      
      				   load have_filled_random_ptr_key = true
      
      				    BAD BAD BAD! (you're so bad!)
      
      Because nothing prevents CPU1 from loading ptr_key before loading
      have_filled_random_ptr_key.
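
For illustration only, the ordering that the reader side lacks can be sketched in userspace C11, where a release store pairs with an acquire load. This is an analogue under stated assumptions, not kernel code and not the approach this commit takes: key and key_ready are hypothetical stand-ins for ptr_key and have_filled_random_ptr_key.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint64_t key;		/* stand-in for ptr_key */
static atomic_bool key_ready;	/* stand-in for have_filled_random_ptr_key */

/* Writer: fill the key, then publish the flag with release ordering. */
static void publish_key(uint64_t value)
{
	key = value;
	atomic_store_explicit(&key_ready, true, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so once
 * the flag is seen as true the following read of key sees the new value.
 */
static bool read_key(uint64_t *out)
{
	if (!atomic_load_explicit(&key_ready, memory_order_acquire))
		return false;
	*out = key;
	return true;
}
```

The point of the pairing is that a barrier on the writer alone orders nothing for the reader; both sides have to take part.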
      
      This race is very unlikely, but we can't keep an incorrect smp_mb() in
      place. Instead, replace the have_filled_random_ptr_key with a static_branch
      not_filled_random_ptr_key, that is initialized to true and changed to false
      when we get enough entropy. If the update happens in early boot, the
      static_key is updated immediately. Otherwise it has to wait until entropy
      is filled, which happens in an interrupt handler that can't enable a
      static_key, as that requires a preemptible context. In that case, a
      work_queue is used to enable it: as entropy already took too long to
      establish in the first place, waiting a little more shouldn't hurt anything.
      
      The benefit of using the static key is that the unlikely branch in
      vsprintf() now becomes a nop.
      
      Link: http://lkml.kernel.org/r/20180515100558.21df515e@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Fixes: ad67b74d ("printk: hash addresses printed with %p")
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      85f4f12d
    • M
      drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk · b579f924
      Michel Thierry authored
      Factor in clear values wherever required while updating destination
      min/max.
      
      References: HSDES#1604444184
      Signed-off-by: Michel Thierry <michel.thierry@intel.com>
      Cc: mesa-dev@lists.freedesktop.org
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180510200708.18097-1-michel.thierry@intel.com
      Cc: stable@vger.kernel.org
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180514165445.9198-1-michel.thierry@intel.com
      (backported from commit 0c79f9cb)
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      b579f924
    • D
      drm/vmwgfx: Set dmabuf_size when vmw_dmabuf_init is successful · 91ba9f28
      Deepak Rawat authored
      The SOU primary plane prepare_fb hook depends upon dmabuf_size to pin up
      the BO (and not call a new vmw_dmabuf_init) when the new fb size is the
      same as the current fb. This was changed in a recent commit, which is
      causing page_flip to fail on VMs with low display memory and multi-mon
      failures when cycling monitors from the secondary display.
      
      Cc: <stable@vger.kernel.org> # 4.14, 4.16
      Fixes: 20fb5a63 ("drm/vmwgfx: Unpin the screen object backup buffer when not used")
      Signed-off-by: Deepak Rawat <drawat@vmware.com>
      Reviewed-by: Sinclair Yeh <syeh@vmware.com>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      91ba9f28
    • L
      Merge tag 'afs-fixes-20180514' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 21b9f1c7
      Linus Torvalds authored
      Pull AFS fixes from David Howells:
       "Here's a set of patches that fix a number of bugs in the in-kernel AFS
        client, including:
      
         - Fix directory locking to not use individual page locks for
           directory reading/scanning but rather to use a semaphore on the
           afs_vnode struct as the directory contents must be read in a single
           blob and data from different reads must not be mixed as the entire
           contents may be shuffled about between reads.
      
         - Fix address list parsing to handle port specifiers correctly.
      
         - Only give up callback records on a server if we actually talked to
           that server (we might not be able to access a server).
      
         - Fix some callback handling bugs, including refcounting,
           whole-volume callbacks and when callbacks actually get broken in
           response to a CB.CallBack op.
      
         - Fix some server/address rotation bugs, including giving up if we
           can't probe a server; giving up if a server says it doesn't have a
           volume, but there are more servers to try.
      
         - Fix the decoding of fetched statuses to be OpenAFS compatible.
      
         - Fix the handling of server lookups in Cache Manager ops (such as
           CB.InitCallBackState3) to use a UUID if possible and to handle no
           server being found.
      
         - Fix a bug in server lookup where not all addresses are compared.
      
         - Fix the non-encryption of calls that prevents some servers from
           being accessed (this also requires an AF_RXRPC patch that has
           already gone in through the net tree).
      
        There's also a patch that adds tracepoints to log Cache Manager ops
        that don't find a matching server, either by UUID or by address"
      
      * tag 'afs-fixes-20180514' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Fix the non-encryption of calls
        afs: Fix CB.CallBack handling
        afs: Fix whole-volume callback handling
        afs: Fix afs_find_server search loop
        afs: Fix the handling of an unfound server in CM operations
        afs: Add a tracepoint to record callbacks from unlisted servers
        afs: Fix the handling of CB.InitCallBackState3 to find the server by UUID
        afs: Fix VNOVOL handling in address rotation
        afs: Fix AFSFetchStatus decoder to provide OpenAFS compatibility
        afs: Fix server rotation's handling of fileserver probe failure
        afs: Fix refcounting in callback registration
        afs: Fix giving up callbacks on server destruction
        afs: Fix address list parsing
        afs: Fix directory page locking
      21b9f1c7
    • L
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · eeba2dfa
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two small driver fixes: aacraid to fix an unknown IU type on task
        management functions which causes a firmware fault and vmw_pvscsi to
        change a return code to retry the operation instead of causing an
        immediate error"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: aacraid: Correct hba_send to include iu_type
        scsi: vmw-pvscsi: return DID_BUS_BUSY for adapter-initated aborts
      eeba2dfa
    • L
      Merge tag 'drm-fixes-for-v4.17-rc6-urgent' of git://people.freedesktop.org/~airlied/linux · ee4b65c2
      Linus Torvalds authored
      Pull drm fix from Dave Airlie:
       "This fixes the mmap regression reported to me on irc by an i686 kernel
        user today, he's tested the fix works, and I've audited all the drm
        drivers for the bad mmap usage and since we use the mmap offset as a
        lookup in a table we aren't inclined to have anything bad in there"
      
      [ See commit be83bbf8 ("mmap: introduce sane default mmap limits")
        for details and the note on why the GPU drivers were expected to be a
        special case.    - Linus ]
      
      * tag 'drm-fixes-for-v4.17-rc6-urgent' of git://people.freedesktop.org/~airlied/linux:
        drm: set FMODE_UNSIGNED_OFFSET for drm files
      ee4b65c2
  5. 15 May, 2018 3 commits