1. 18 4月, 2017 2 次提交
    • M
      powerpc/64: Fix HMI exception on LE with CONFIG_RELOCATABLE=y · be5c5e84
      Michael Ellerman 提交于
      Prior to commit 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi
      interrupts"), the branch from hmi_exception_early() to hmi_exception_realmode()
      was just a bl hmi_exception_realmode, which the linker would turn into a bl to
      the local entry point of hmi_exception_realmode. This was broken when
      CONFIG_RELOCATABLE=y because hmi_exception_realmode() is not in the low part of
      the kernel text that is copied down to 0x0.
      
      But in fixing that, we added a new bug on little endian kernels. Because the
      branch is now a bctrl when CONFIG_RELOCATABLE=y, we branch to the global entry
      point of hmi_exception_realmode(). The global entry point must be called with
      r12 containing the address of hmi_exception_realmode(), because it uses that
      value to calculate the TOC value (r2).
      
      This may manifest as a checkstop, because we take a junk value from r12 which
      came from HSRR1, add a small constant to it and then use that as the TOC
      pointer. The HSRR1 value will have 0x9 as the top nibble, which puts it above
      RAM and somewhere in MMIO space.
      
      Fix it by changing the BRANCH_LINK_TO_FAR() macro to always use r12 to load the
      label we're branching to. This means r12 will be setup correctly on LE, fixing
      this bug, and r12 is also volatile across function calls on BE so it's a good
      choice anyway.
      
      Fixes: 2337d207 ("powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts")
      Reported-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Acked-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      be5c5e84
    • R
      powerpc/kprobe: Fix oops when kprobed on 'stdu' instruction · 9e1ba4f2
      Ravi Bangoria 提交于
      If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
      OOPS:
      
        Bad kernel stack pointer cd93c840 at c000000000009868
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        ...
        GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
        ...
        NIP [c000000000009868] resume_kernel+0x2c/0x58
        LR [c000000000006208] program_check_common+0x108/0x180
      
      On a 64-bit system when the user probes on a 'stdu' instruction, the kernel does
      not emulate actual store in emulate_step() because it may corrupt the exception
      frame. So the kernel does the actual store operation in exception return code
      i.e. resume_kernel().
      
      resume_kernel() loads the saved stack pointer from memory using lwz, which only
      loads the low 32-bits of the address, causing the kernel crash.
      
      Fix this by loading the 64-bit value instead.
      
      Fixes: be96f633 ("powerpc: Split out instruction analysis part of emulate_step()")
      Cc: stable@vger.kernel.org # v3.18+
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: NNaveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Reviewed-by: NAnanth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      [mpe: Change log massage, add stable tag]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9e1ba4f2
  2. 07 4月, 2017 1 次提交
    • M
      powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable() · 4749228f
      Michael Ellerman 提交于
      In crc32c_vpmsum() we call enable_kernel_altivec() without first
      disabling preemption, which is not allowed:
      
        WARNING: CPU: 9 PID: 2949 at ../arch/powerpc/kernel/process.c:277 enable_kernel_altivec+0x100/0x120
        Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto ...
        CPU: 9 PID: 2949 Comm: docker Not tainted 4.11.0-rc5-compiler_gcc-6.3.1-00033-g308ac756 #381
        ...
        NIP [c00000000001e320] enable_kernel_altivec+0x100/0x120
        LR [d000000003df0910] crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
        Call Trace:
          0xc138fd09 (unreliable)
          crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
          crc32c_vpmsum_update+0x3c/0x60 [crc32c_vpmsum]
          crypto_shash_update+0x88/0x1c0
          crc32c+0x64/0x90 [libcrc32c]
          dm_bm_checksum+0x48/0x80 [dm_persistent_data]
          sb_check+0x84/0x120 [dm_thin_pool]
          dm_bm_validate_buffer.isra.0+0xc0/0x1b0 [dm_persistent_data]
          dm_bm_read_lock+0x80/0xf0 [dm_persistent_data]
          __create_persistent_data_objects+0x16c/0x810 [dm_thin_pool]
          dm_pool_metadata_open+0xb0/0x1a0 [dm_thin_pool]
          pool_ctr+0x4cc/0xb60 [dm_thin_pool]
          dm_table_add_target+0x16c/0x3c0
          table_load+0x184/0x400
          ctl_ioctl+0x2f0/0x560
          dm_ctl_ioctl+0x38/0x50
          do_vfs_ioctl+0xd8/0x920
          SyS_ioctl+0x68/0xc0
          system_call+0x38/0xfc
      
      It used to be sufficient just to call pagefault_disable(), because that
      also disabled preemption. But the two were decoupled in commit 8222dbe2
      ("sched/preempt, mm/fault: Decouple preemption from the page fault
      logic") in mid 2015.
      
      So add the missing preempt_disable/enable(). We should also call
      disable_kernel_fp(), although it does nothing by default, there is a
      debug switch to make it active and all enables should be paired with
      disables.
      
      Fixes: 6dd7a82c ("crypto: powerpc - Add POWER8 optimised crc32c")
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      4749228f
  3. 05 4月, 2017 2 次提交
  4. 04 4月, 2017 1 次提交
    • P
      powerpc: Don't try to fix up misaligned load-with-reservation instructions · 48fe9e94
      Paul Mackerras 提交于
      In the past, there was only one load-with-reservation instruction,
      lwarx, and if a program attempted a lwarx on a misaligned address, it
      would take an alignment interrupt and the kernel handler would emulate
      it as though it was lwzx, which was not really correct, but benign since
      it is loading the right amount of data, and the lwarx should be paired
      with a stwcx. to the same address, which would also cause an alignment
      interrupt which would result in a SIGBUS being delivered to the process.
      
      We now have 5 different sizes of load-with-reservation instruction. Of
      those, lharx and ldarx cause an immediate SIGBUS by luck since their
      entries in aligninfo[] overlap instructions which were not fixed up, but
      lqarx overlaps with lhz and will be emulated as such. lbarx can never
      generate an alignment interrupt since it only operates on 1 byte.
      
      To straighten this out and fix the lqarx case, this adds code to detect
      the l[hwdq]arx instructions and return without fixing them up, resulting
      in a SIGBUS being delivered to the process.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      48fe9e94
  5. 28 3月, 2017 1 次提交
    • B
      powerpc: Disable HFSCR[TM] if TM is not supported · 7ed23e1b
      Benjamin Herrenschmidt 提交于
      On Power8 & Power9 the early CPU inititialisation in __init_HFSCR()
      turns on HFSCR[TM] (Hypervisor Facility Status and Control Register
      [Transactional Memory]), but that doesn't take into account that TM
      might be disabled by CPU features, or disabled by the kernel being built
      with CONFIG_PPC_TRANSACTIONAL_MEM=n.
      
      So later in boot, when we have setup the CPU features, clear HSCR[TM] if
      the TM CPU feature has been disabled. We use CPU_FTR_TM_COMP to account
      for the CONFIG_PPC_TRANSACTIONAL_MEM=n case.
      
      Without this a KVM guest might try use TM, even if told not to, and
      cause an oops in the host kernel. Typically the oops is seen in
      __kvmppc_vcore_entry() and may or may not be fatal to the host, but is
      always bad news.
      
      In practice all shipping CPU revisions do support TM, and all host
      kernels we are aware of build with TM support enabled, so no one should
      actually be able to hit this in the wild.
      
      Fixes: 2a3563b0 ("powerpc: Setup in HFSCR for POWER8")
      Cc: stable@vger.kernel.org # v3.10+
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Tested-by: NSam Bobroff <sam.bobroff@au1.ibm.com>
      [mpe: Rewrite change log with input from Sam, add Fixes/stable]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      7ed23e1b
  6. 27 3月, 2017 1 次提交
    • M
      selftests/powerpc: Fix standalone powerpc build · 2db2c250
      Michael Ellerman 提交于
      The changes to enable building with a separate output directory, in
      commit a8ba798b ("selftests: enable O and KBUILD_OUTPUT") broke
      building the powerpc selftests on their own, eg:
      
       $ cd tools/testing/selftests/powerpc; make
      
      It was partially fixed in commit e53aff45 ("selftests: lib.mk Fix
      individual test builds"), which defined OUTPUT for standalone tests. But
      that only defines OUTPUT within the Makefile, the value is not exported
      so sub-shells can't see it. We could export OUTPUT, but it's actually
      cleaner to just expand the value of OUTPUT before we invoke the shell.
      
      Fixes: a8ba798b ("selftests: enable O and KBUILD_OUTPUT")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      2db2c250
  7. 21 3月, 2017 2 次提交
  8. 20 3月, 2017 8 次提交
    • N
      powerpc/64s: Fix idle wakeup potential to clobber registers · 6d98ce0b
      Nicholas Piggin 提交于
      We concluded there may be a window where the idle wakeup code could get
      to pnv_wakeup_tb_loss() (which clobbers non-volatile GPRs), but the
      hardware may set SRR1[46:47] to 01b (no state loss) which would result
      in the wakeup code failing to restore non-volatile GPRs.
      
      I was not able to trigger this condition with trivial tests on real
      hardware or simulator, but the ISA (at least 2.07) seems to allow for
      it, and Gautham says that it can happen if there is an exception pending
      when the sleep/winkle instruction is executed.
      
      Fixes: 17065671 ("powerpc/kvm: make hypervisor state restore a function")
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Acked-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6d98ce0b
    • V
      cxl: Route eeh events to all slices for pci_channel_io_perm_failure state · 07f5ab60
      Vaibhav Jain 提交于
      Fix a boundary condition where in some cases an eeh event with state ==
      pci_channel_io_perm_failure wont be passed on to a driver attached to
      the virtual PCI device associated with a slice. This will happen in case
      the slice just before (n-1) doesn't have any vPHB bus associated with
      it, that results in an early return from cxl_pci_error_detected()
      callback.
      
      With state == pci_channel_io_perm_failure, the adapter will be removed
      irrespective of the return value of cxl_vphb_error_detected(). So we now
      always return PCI_ERS_RESULT_DISCONNECTED for this case i.e even if
      the AFU isn't using a vPHB (currently returns PCI_ERS_RESULT_NONE).
      
      Fixes: e4f5fc00("cxl: Do not create vPHB if there are no AFU configuration records")
      Signed-off-by: NVaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Reviewed-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Acked-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      07f5ab60
    • L
      Linux 4.11-rc3 · 97da3854
      Linus Torvalds 提交于
      97da3854
    • L
      mm/swap: don't BUG_ON() due to uninitialized swap slot cache · 452b94b8
      Linus Torvalds 提交于
      This BUG_ON() triggered for me once at shutdown, and I don't see a
      reason for the check.  The code correctly checks whether the swap slot
      cache is usable or not, so an uninitialized swap slot cache is not
      actually problematic afaik.
      
      I've temporarily just switched the BUG_ON() to a WARN_ON_ONCE(), since
      I'm not sure why that seemingly pointless check was there.  I suspect
      the real fix is to just remove it entirely, but for now we'll warn about
      it but not bring the machine down.
      
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      452b94b8
    • L
      Merge tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · a07a6e41
      Linus Torvalds 提交于
      Pull more powerpc fixes from Michael Ellerman:
       "A couple of minor powerpc fixes for 4.11:
      
         - wire up statx() syscall
      
         - don't print a warning on memory hotplug when HPT resizing isn't
           available
      
        Thanks to: David Gibson, Chandan Rajendra"
      
      * tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/pseries: Don't give a warning when HPT resizing isn't available
        powerpc: Wire up statx() syscall
      a07a6e41
    • L
      Merge branch 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 4571bc5a
      Linus Torvalds 提交于
      Pull parisc fixes from Helge Deller:
      
       - Mikulas Patocka added support for R_PARISC_SECREL32 relocations in
         modules with CONFIG_MODVERSIONS.
      
       - Dave Anglin optimized the cache flushing for vmap ranges.
      
       - Arvind Yadav provided a fix for a potential NULL pointer dereference
         in the parisc perf code (and some code cleanups).
      
       - I wired up the new statx system call, fixed some compiler warnings
         with the access_ok() macro and fixed shutdown code to really halt a
         system at shutdown instead of crashing & rebooting.
      
      * 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Fix system shutdown halt
        parisc: perf: Fix potential NULL pointer dereference
        parisc: Avoid compiler warnings with access_ok()
        parisc: Wire up statx system call
        parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range
        parisc: support R_PARISC_SECREL32 relocation in modules
      4571bc5a
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 8aa34172
      Linus Torvalds 提交于
      Pull SCSI target fixes from Nicholas Bellinger:
       "The bulk of the changes are in qla2xxx target driver code to address
        various issues found during Cavium/QLogic's internal testing (stable
        CC's included), along with a few other stability and smaller
        miscellaneous improvements.
      
        There are also a couple of different patch sets from Mike Christie,
        which have been a result of his work to use target-core ALUA logic
        together with tcm-user backend driver.
      
        Finally, a patch to address some long standing issues with
        pass-through SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices,
        which will make folks using physical (or virtual) magnetic tape happy"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits)
        qla2xxx: Update driver version to 9.00.00.00-k
        qla2xxx: Fix delayed response to command for loop mode/direct connect.
        qla2xxx: Change scsi host lookup method.
        qla2xxx: Add DebugFS node to display Port Database
        qla2xxx: Use IOCB interface to submit non-critical MBX.
        qla2xxx: Add async new target notification
        qla2xxx: Export DIF stats via debugfs
        qla2xxx: Improve T10-DIF/PI handling in driver.
        qla2xxx: Allow relogin to proceed if remote login did not finish
        qla2xxx: Fix sess_lock & hardware_lock lock order problem.
        qla2xxx: Fix inadequate lock protection for ABTS.
        qla2xxx: Fix request queue corruption.
        qla2xxx: Fix memory leak for abts processing
        qla2xxx: Allow vref count to timeout on vport delete.
        tcmu: Convert cmd_time_out into backend device attribute
        tcmu: make cmd timeout configurable
        tcmu: add helper to check if dev was configured
        target: fix race during implicit transition work flushes
        target: allow userspace to set state to transitioning
        target: fix ALUA transition timeout handling
        ...
      8aa34172
    • L
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 1b8df619
      Linus Torvalds 提交于
      Pull device-dax fixes from Dan Williams:
       "The device-dax driver was not being careful to handle falling back to
        smaller fault-granularity sizes.
      
        The driver already fails fault attempts that are smaller than the
        device's alignment, but it also needs to handle the cases where a
        larger page mapping could be established. For simplicity of the
        immediate fix the implementation just signals VM_FAULT_FALLBACK until
        fault-size == device-alignment.
      
        One fix is for -stable to address pmd-to-pte fallback from the
        original implementation, another fix is for the new (introduced in
        4.11-rc1) pud-to-pmd regression, and a typo fix comes along for the
        ride.
      
        These have received a build success notification from the kbuild
        robot"
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        device-dax: fix debug output typo
        device-dax: fix pud fault fallback handling
        device-dax: fix pmd/pte fault fallback handling
      1b8df619
  9. 19 3月, 2017 22 次提交