1. 06 11月, 2015 3 次提交
    • E
      mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage · b0f205c2
      Eric B Munson 提交于
      The previous patch introduced a flag that specified pages in a VMA should
      be placed on the unevictable LRU, but they should not be made present when
      the area is created.  This patch adds the ability to set this state via
      the new mlock system calls.
      
      We add MLOCK_ONFAULT for mlock2 and MCL_ONFAULT for mlockall.
      MLOCK_ONFAULT will set the VM_LOCKONFAULT modifier for VM_LOCKED.
      MCL_ONFAULT should be used as a modifier to the two other mlockall flags.
      When used with MCL_CURRENT, all current mappings will be marked with
      VM_LOCKED | VM_LOCKONFAULT.  When used with MCL_FUTURE, the mm->def_flags
      will be marked with VM_LOCKED | VM_LOCKONFAULT.  When used with both
      MCL_CURRENT and MCL_FUTURE, all current mappings and mm->def_flags will be
      marked with VM_LOCKED | VM_LOCKONFAULT.
      
      Prior to this patch, mlockall() will unconditionally clear the
      mm->def_flags any time it is called without MCL_FUTURE.  This behavior is
      maintained after adding MCL_ONFAULT.  If a call to mlockall(MCL_FUTURE) is
      followed by mlockall(MCL_CURRENT), the mm->def_flags will be cleared and
      new VMAs will be unlocked.  This remains true with or without MCL_ONFAULT
      in either mlockall() invocation.
      
      munlock() will unconditionally clear both vma flags.  munlockall()
      unconditionally clears for VMA flags on all VMAs and in the mm->def_flags
      field.
      Signed-off-by: NEric B Munson <emunson@akamai.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b0f205c2
    • R
      arch/powerpc/mm/numa.c: do not allocate bootmem memory for non existing nodes · c118baf8
      Raghavendra K T 提交于
      With the setup_nr_nodes(), we have already initialized
      node_possible_map.  So it is safe to use for_each_node here.
      
      There are many places in the kernel that use hardcoded 'for' loop with
      nr_node_ids, because all other architectures have numa nodes populated
      serially.  That should be reason we had maintained the same for
      powerpc.
      
      But, since sparse numa node ids possible on powerpc, we unnecessarily
      allocate memory for non existent numa nodes.
      
      For e.g., on a system with 0,1,16,17 as numa nodes nr_node_ids=18 and
      we allocate memory for nodes 2-14.  This patch we allocate memory for
      only existing numa nodes.
      
      The patch is boot tested on a 4 node tuleta, confirming with printks
      that it works as expected.
      Signed-off-by: NRaghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c118baf8
    • A
      uaccess: reimplement probe_kernel_address() using probe_kernel_read() · 0ab32b6f
      Andrew Morton 提交于
      probe_kernel_address() is basically the same as the (later added)
      probe_kernel_read().
      
      The return value on EFAULT is a bit different: probe_kernel_address()
      returns number-of-bytes-not-copied whereas probe_kernel_read() returns
      -EFAULT.  All callers have been checked, none cared.
      
      probe_kernel_read() can be overridden by the architecture whereas
      probe_kernel_address() cannot.  parisc, blackfin and um do this, to insert
      additional checking.  Hence this patch possibly fixes obscure bugs,
      although there are only two probe_kernel_address() callsites outside
      arch/.
      
      My first attempt involved removing probe_kernel_address() entirely and
      converting all callsites to use probe_kernel_read() directly, but that got
      tiresome.
      
      This patch shrinks mm/slab_common.o by 218 bytes.  For a single
      probe_kernel_address() callsite.
      
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ab32b6f
  2. 28 10月, 2015 1 次提交
  3. 22 10月, 2015 1 次提交
    • V
      powerpc/rtas: Validate rtas.entry before calling enter_rtas() · 8832317f
      Vasant Hegde 提交于
      Currently we do not validate rtas.entry before calling enter_rtas(). This
      leads to a kernel oops when user space calls rtas system call on a powernv
      platform (see below). This patch adds code to validate rtas.entry before
      making enter_rtas() call.
      
        Oops: Exception in kernel mode, sig: 4 [#1]
        SMP NR_CPUS=1024 NUMA PowerNV
        task: c000000004294b80 ti: c0000007e1a78000 task.ti: c0000007e1a78000
        NIP: 0000000000000000 LR: 0000000000009c14 CTR: c000000000423140
        REGS: c0000007e1a7b920 TRAP: 0e40   Not tainted  (3.18.17-340.el7_1.pkvm3_1_0.2400.1.ppc64le)
        MSR: 1000000000081000 <HV,ME>  CR: 00000000  XER: 00000000
        CFAR: c000000000009c0c SOFTE: 0
        NIP [0000000000000000]           (null)
        LR [0000000000009c14] 0x9c14
        Call Trace:
        [c0000007e1a7bba0] [c00000000041a7f4] avc_has_perm_noaudit+0x54/0x110 (unreliable)
        [c0000007e1a7bd80] [c00000000002ddc0] ppc_rtas+0x150/0x2d0
        [c0000007e1a7be30] [c000000000009358] syscall_exit+0x0/0x98
      
      Cc: stable@vger.kernel.org # v3.2+
      Fixes: 55190f88 ("powerpc: Add skeleton PowerNV platform")
      Reported-by: NNAGESWARA R. SASTRY <nasastry@in.ibm.com>
      Signed-off-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com>
      [mpe: Reword change log, trim oops, and add stable + fixes]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8832317f
  4. 21 10月, 2015 2 次提交
    • P
      powerpc/powernv: Handle irq_happened flag correctly in off-line loop · 53c656c4
      Paul Mackerras 提交于
      This fixes a bug where it is possible for an off-line CPU to fail to go
      into a low-power state (nap/sleep/winkle), and to become unresponsive to
      requests from the KVM subsystem to wake up and run a VCPU. What can
      happen is that a maskable interrupt of some kind (external, decrementer,
      hypervisor doorbell, or HMI) after we have called local_irq_disable() at
      the beginning of pnv_smp_cpu_kill_self() and before interrupts are
      hard-disabled inside power7_nap/sleep/winkle(). In this situation, the
      pending event is marked in the irq_happened flag in the PACA. This
      pending event prevents power7_nap/sleep/winkle from going to the
      requested low-power state; instead they return immediately. We don't
      deal with any of these pending event flags in the off-line loop in
      pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case,
      so we will have srr1 == 0, and none of the processing to clear
      interrupts or doorbells will be done.
      
      Usually, the most obvious symptom of this is that a KVM guest will fail
      with a console message saying "KVM: couldn't grab cpu N".
      
      This fixes the problem by making sure we handle the irq_happened flags
      properly. First, we hard-disable before the off-line loop. Once we have
      hard-disabled, the irq_happened flags can't change underneath us. We
      unconditionally clear the DEC and HMI flags: there is no processing of
      timer interrupts while off-line, and the necessary HMI processing is all
      done in lower-level code. We leave the EE and DBELL flags alone for the
      first iteration of the loop, so that we won't fail to respond to a
      split-core request that came in just before hard-disabling. Within the
      loop, we handle external interrupts if the EE bit is set in irq_happened
      as well as if the low-power state was interrupted by an external
      interrupt. (We don't need to do the msgclr for a pending doorbell in
      irq_happened, because doorbells are edge-triggered and don't remain
      pending in hardware.) Then we clear both the EE and DBELL flags, and
      once clear, they cannot be set again (until this CPU comes online again,
      that is).
      
      This also fixes the debug check to not be done when we just ran a KVM
      guest or when the sleep didn't happen because of a pending event in
      irq_happened.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      53c656c4
    • P
      powerpc: Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8" · 23316316
      Paul Mackerras 提交于
      This reverts commit 9678cdaa ("Use the POWER8 Micro Partition
      Prefetch Engine in KVM HV on POWER8") because the original commit had
      multiple, partly self-cancelling bugs, that could cause occasional
      memory corruption.
      
      In fact the logmpp instruction was incorrectly using register r0 as the
      source of the buffer address and operation code, and depending on what
      was in r0, it would either do nothing or corrupt the 64k page pointed to
      by r0.
      
      The logmpp instruction encoding and the operation code definitions could
      be corrected, but then there is the problem that there is no clearly
      defined way to know when the hardware has finished writing to the
      buffer.
      
      The original commit attempted to work around this by aborting the
      write-out before starting the prefetch, but this is ineffective in the
      case where the virtual core is now executing on a different physical
      core from the one where the write-out was initiated.
      
      These problems plus advice from the hardware designers not to use the
      function (since the measured performance improvement from using the
      feature was actually mostly negative), mean that reverting the code is
      the best option.
      
      Fixes: 9678cdaa ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8")
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      23316316
  5. 20 10月, 2015 1 次提交
  6. 14 10月, 2015 1 次提交
  7. 09 10月, 2015 2 次提交
    • D
      powerpc/powernv: Panic on unhandled Machine Check · f2dd80ec
      Daniel Axtens 提交于
      All unrecovered machine check errors on PowerNV should cause an
      immediate panic. There are 2 reasons that this is the right policy:
      it's not safe to continue, and we're already trying to reboot.
      
      Firstly, if we go through the recovery process and do not successfully
      recover, we can't be sure about the state of the machine, and it is
      not safe to recover and proceed.
      
      Linux knows about the following sources of Machine Check Errors:
      - Uncorrectable Errors (UE)
      - Effective - Real Address Translation (ERAT)
      - Segment Lookaside Buffer (SLB)
      - Translation Lookaside Buffer (TLB)
      - Unknown/Unrecognised
      
      In the SLB, TLB and ERAT cases, we can further categorise these as
      parity errors, multihit errors or unknown/unrecognised.
      
      We can handle SLB errors by flushing and reloading the SLB. We can
      handle TLB and ERAT multihit errors by flushing the TLB. (It appears
      we may not handle TLB and ERAT parity errors: I will investigate
      further and send a followup patch if appropriate.)
      
      This leaves us with uncorrectable errors. Uncorrectable errors are
      usually the result of ECC memory detecting an error that it cannot
      correct, but they also crop up in the context of PCI cards failing
      during DMA writes, and during CAPI error events.
      
      There are several types of UE, and there are 3 places a UE can occur:
      Skiboot, the kernel, and userspace. For Skiboot errors, we have the
      facility to make some recoverable. For userspace, we can simply kill
      (SIGBUS) the affected process. We have no meaningful way to deal with
      UEs in kernel space or in unrecoverable sections of Skiboot.
      
      Currently, these unrecovered UEs fall through to
      machine_check_expection() in traps.c, which calls die(), which OOPSes
      and sends SIGBUS to the process. This sometimes allows us to stumble
      onwards. For example we've seen UEs kill the kernel eehd and
      khugepaged. However, the process killed could have held a lock, or it
      could have been a more important process, etc: we can no longer make
      any assertions about the state of the machine. Similarly if we see a
      UE in skiboot (and again we've seen this happen), we're not in a
      position where we can make any assertions about the state of the
      machine.
      
      Likewise, for unknown or unrecognised errors, we're not able to say
      anything about the state of the machine.
      
      Therefore, if we have an unrecovered MCE, the most appropriate thing
      to do is to panic.
      
      The second reason is that since e784b649 ("powerpc/powernv: Invoke
      opal_cec_reboot2() on unrecoverable machine check errors."), we
      attempt a special OPAL reboot on an unhandled MCE. This is so the
      hardware can record error data for later debugging.
      
      The comments in that commit assert that we are heading down the panic
      path anyway. At the moment this is not always true. With UEs in kernel
      space, for instance, they are marked as recoverable by the hardware,
      so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't
      panic() but would simply die() and OOPS. It doesn't make sense to be
      staggering on if we've just tried to reboot: we should panic().
      
      Explicitly panic() on unrecovered MCEs on PowerNV.
      Update the comments appropriately.
      
      This fixes some hangs following EEH events on cxlflash setups.
      Signed-off-by: NDaniel Axtens <dja@axtens.net>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: NIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      f2dd80ec
    • C
      powerpc: Fix checkstop in native_hpte_clear() with lockdep · fdf880a6
      Cyril Bur 提交于
      native_hpte_clear() is called in real mode from two places:
      - Early in boot during htab initialisation if firmware assisted dump is
        active.
      - Late in the kexec path.
      
      In both contexts there is no need to disable interrupts are they are
      already disabled. Furthermore, locking around the tlbie() is only required
      for pre POWER5 hardware.
      
      On POWER5 or newer hardware concurrent tlbie()s work as expected and on pre
      POWER5 hardware concurrent tlbie()s could result in deadlock. This code
      would only be executed at crashdump time, during which all bets are off,
      concurrent tlbie()s are unlikely and taking locks is unsafe therefore the
      best course of action is to simply do nothing. Concurrent tlbie()s are not
      possible in the first case as secondary CPUs have not come up yet.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      fdf880a6
  8. 08 10月, 2015 1 次提交
  9. 07 10月, 2015 2 次提交
  10. 03 10月, 2015 1 次提交
  11. 30 9月, 2015 1 次提交
  12. 29 9月, 2015 1 次提交
    • M
      powerpc/configs: Re-enable CONFIG_SCSI_DH · 6b98f2be
      Michael Ellerman 提交于
      Commit 086b91d0 ("scsi_dh: integrate into the core SCSI code")
      changed CONFIG_SCSI_DH from tristate to bool.
      
      Our defconfigs have CONFIG_SCSI_DH=m, which the kconfig machinery warns
      us is invalid, but instead of converting it to =y it leaves it unset.
      This means we loose the CONFIG_SCSI_DH code and everything that depends
      on it.
      
      So convert the values in the defconfigs to =y.
      
      Fixes: 086b91d0 ("scsi_dh: integrate into the core SCSI code")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6b98f2be
  13. 25 9月, 2015 1 次提交
  14. 21 9月, 2015 4 次提交
    • M
      powerpc: Wire up sys_membarrier() · 793b8bf9
      Michael Ellerman 提交于
      The selftest passes on 64-bit LE & BE, and 32-bit.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      793b8bf9
    • T
      KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store() · 3eb4ee68
      Thomas Huth 提交于
      Access to the kvm->buses (like with the kvm_io_bus_read() and -write()
      functions) has to be protected via the kvm->srcu lock.
      The kvmppc_h_logical_ci_load() and -store() functions are missing
      this lock so far, so let's add it there, too.
      This fixes the problem that the kernel reports "suspicious RCU usage"
      when lock debugging is enabled.
      
      Cc: stable@vger.kernel.org # v4.1+
      Fixes: 99342cf8Signed-off-by: NThomas Huth <thuth@redhat.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      3eb4ee68
    • G
      KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit · 7e022e71
      Gautham R. Shenoy 提交于
      In guest_exit_cont we call kvmhv_commence_exit which expects the trap
      number as the argument. However r3 doesn't contain the trap number at
      this point and as a result we would be calling the function with a
      spurious trap number.
      
      Fix this by copying r12 into r3 before calling kvmhv_commence_exit as
      r12 contains the trap number.
      
      Cc: stable@vger.kernel.org # v4.1+
      Fixes: eddb60fbSigned-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      7e022e71
    • P
      KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs · 5fc3e64f
      Paul Mackerras 提交于
      This fixes a bug which results in stale vcore pointers being left in
      the per-cpu preempted vcore lists when a VM is destroyed.  The result
      of the stale vcore pointers is usually either a crash or a lockup
      inside collect_piggybacks() when another VM is run.  A typical
      lockup message looks like:
      
      [  472.161074] NMI watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [qemu-system-ppc:7039]
      [  472.161204] Modules linked in: kvm_hv kvm_pr kvm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ses enclosure shpchp rtc_opal i2c_opal powernv_rng binfmt_misc dm_service_time scsi_dh_alua radeon i2c_algo_bit drm_kms_helper ttm drm tg3 ptp pps_core cxgb3 ipr i2c_core mdio dm_multipath [last unloaded: kvm_hv]
      [  472.162111] CPU: 24 PID: 7039 Comm: qemu-system-ppc Not tainted 4.2.0-kvm+ #49
      [  472.162187] task: c000001e38512750 ti: c000001e41bfc000 task.ti: c000001e41bfc000
      [  472.162262] NIP: c00000000096b094 LR: c00000000096b08c CTR: c000000000111130
      [  472.162337] REGS: c000001e41bff520 TRAP: 0901   Not tainted  (4.2.0-kvm+)
      [  472.162399] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24848844  XER: 00000000
      [  472.162588] CFAR: c00000000096b0ac SOFTE: 1
      GPR00: c000000000111170 c000001e41bff7a0 c00000000127df00 0000000000000001
      GPR04: 0000000000000003 0000000000000001 0000000000000000 0000000000874821
      GPR08: c000001e41bff8e0 0000000000000001 0000000000000000 d00000000efde740
      GPR12: c000000000111130 c00000000fdae400
      [  472.163053] NIP [c00000000096b094] _raw_spin_lock_irqsave+0xa4/0x130
      [  472.163117] LR [c00000000096b08c] _raw_spin_lock_irqsave+0x9c/0x130
      [  472.163179] Call Trace:
      [  472.163206] [c000001e41bff7a0] [c000001e41bff7f0] 0xc000001e41bff7f0 (unreliable)
      [  472.163295] [c000001e41bff7e0] [c000000000111170] __wake_up+0x40/0x90
      [  472.163375] [c000001e41bff830] [d00000000efd6fc0] kvmppc_run_core+0x1240/0x1950 [kvm_hv]
      [  472.163465] [c000001e41bffa30] [d00000000efd8510] kvmppc_vcpu_run_hv+0x5a0/0xd90 [kvm_hv]
      [  472.163559] [c000001e41bffb70] [d00000000e9318a4] kvmppc_vcpu_run+0x44/0x60 [kvm]
      [  472.163653] [c000001e41bffba0] [d00000000e92e674] kvm_arch_vcpu_ioctl_run+0x64/0x170 [kvm]
      [  472.163745] [c000001e41bffbe0] [d00000000e9263a8] kvm_vcpu_ioctl+0x538/0x7b0 [kvm]
      [  472.163834] [c000001e41bffd40] [c0000000002d0f50] do_vfs_ioctl+0x480/0x7c0
      [  472.163910] [c000001e41bffde0] [c0000000002d1364] SyS_ioctl+0xd4/0xf0
      [  472.163986] [c000001e41bffe30] [c000000000009260] system_call+0x38/0xd0
      [  472.164060] Instruction dump:
      [  472.164098] ebc1fff0 ebe1fff8 7c0803a6 4e800020 60000000 60000000 60420000 8bad02e2
      [  472.164224] 7fc3f378 4b6a57c1 60000000 7c210b78 <e92d0000> 89290009 792affe3 40820070
      
      The bug is that kvmppc_run_vcpu does not correctly handle the case
      where a vcpu task receives a signal while its guest vcpu is executing
      in the guest as a result of being piggy-backed onto the execution of
      another vcore.  In that case we need to wait for the vcpu to finish
      executing inside the guest, and then remove this vcore from the
      preempted vcores list.  That way, we avoid leaving this vcpu's vcore
      on the preempted vcores list when the vcpu gets interrupted.
      
      Fixes: ec257165Reported-by: NThomas Huth <thuth@redhat.com>
      Tested-by: NThomas Huth <thuth@redhat.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      5fc3e64f
  15. 17 9月, 2015 2 次提交
  16. 16 9月, 2015 9 次提交
    • T
      genirq: Remove irq argument from irq flow handlers · bd0b9ac4
      Thomas Gleixner 提交于
      Most interrupt flow handlers do not use the irq argument. Those few
      which use it can retrieve the irq number from the irq descriptor.
      
      Remove the argument.
      
      Search and replace was done with coccinelle and some extra helper
      scripts around it. Thanks to Julia for her help!
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      bd0b9ac4
    • T
      powerpc/mpc8xx: Use irq_set_handler_locked() · 9ca86b20
      Thomas Gleixner 提交于
      Use irq_set_handler_locked() as it avoids a redundant lookup of the
      irq descriptor.
      
      Search and replacement was done with coccinelle:
      
      @@
      struct irq_data *d;
      expression E1;
      @@
      
      -__irq_set_handler_locked(d->irq, E1);
      +irq_set_handler_locked(d, E1);
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      9ca86b20
    • T
      powerpc/ipic: Use irq_set_handler_locked() · 9758a7b0
      Thomas Gleixner 提交于
      Use irq_set_handler_locked() as it avoids a redundant lookup of the
      irq descriptor.
      
      Search and replacement was done with coccinelle:
      
      @@
      struct irq_data *d;
      expression E1;
      @@
      
      -__irq_set_handler_locked(d->irq, E1);
      +irq_set_handler_locked(d, E1);
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      9758a7b0
    • T
      powerpc/cpm2: Use irq_set_handler_locked() · e9e879a3
      Thomas Gleixner 提交于
      Use irq_set_handler_locked() as it avoids a redundant lookup of the
      irq descriptor.
      
      Search and replacement was done with coccinelle:
      
      @@
      struct irq_data *d;
      expression E1;
      @@
      
      -__irq_set_handler_locked(d->irq, E1);
      +irq_set_handler_locked(d, E1);
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      e9e879a3
    • T
      powerpc/mpc52xx: Use irq_set_handler_locked() · 6b83bd94
      Thomas Gleixner 提交于
      Use irq_set_handler_locked() as it avoids a redundant lookup of the
      irq descriptor.
      
      Search and replacement was done with coccinelle:
      
      @@
      struct irq_data *d;
      expression E1;
      @@
      
      -__irq_set_handler_locked(d->irq, E1);
      +irq_set_handler_locked(d, E1);
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Anatolij Gustschin <agust@denx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      6b83bd94
    • A
      powerpc/mm: Recompute hash value after a failed update · 36b35d5d
      Aneesh Kumar K.V 提交于
      If we had secondary hash flag set, we ended up modifying hash value in
      the updatepp code path. Hence with a failed updatepp we will be using
      a wrong hash value for the following hash insert. Fix this by
      recomputing hash before insert.
      
      Without this patch we can end up with using wrong slot number in linux
      pte. That can result in us missing an hash pte update or invalidate
      which can cause memory corruption or even machine check.
      
      Fixes: 6d492ecc ("powerpc/THP: Add code to handle HPTE faults for hugepages")
      Cc: stable@vger.kernel.org # v3.11+
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      36b35d5d
    • B
      powerpc/boot: Specify ABI v2 when building an LE boot wrapper · 655471f5
      Benjamin Herrenschmidt 提交于
      The kernel does it, not the boot wrapper, which breaks with some
      cross compilers that still default to ABI v1.
      
      Fixes: 147c0516 ("powerpc/boot: Add support for 64bit little endian wrapper")
      Cc: stable@vger.kernel.org # v3.16+
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      655471f5
    • P
      KVM: add halt_attempted_poll to VCPU stats · 62bea5bf
      Paolo Bonzini 提交于
      This new statistic can help diagnosing VCPUs that, for any reason,
      trigger bad behavior of halt_poll_ns autotuning.
      
      For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
      like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
      10+20+40+80+160+320+480 = 1110 microseconds out of every
      479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
      is consuming about 30% more CPU than it would use without
      polling.  This would show as an abnormally high number of
      attempted polling compared to the successful polls.
      
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com<
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      62bea5bf
    • B
      PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code" · 237865f1
      Bjorn Helgaas 提交于
      Revert dff22d20 ("PCI: Call pci_read_bridge_bases() from core instead
      of arch code").
      
      Reading PCI bridge windows is not arch-specific in itself, but there is PCI
      core code that doesn't work correctly if we read them too early.  For
      example, Hannes found this case on an ARM Freescale i.mx6 board:
      
        pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff]
        pci 0000:00:00.0: PCI bridge to [bus 01-ff]
        pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window)
        pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000]
        pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000]
        pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100]
      
      The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs
      0x204100 of space, and mem windows are megabyte-aligned.
      
      Bus sizing can increase a bridge window size, but never *decrease* it (see
      d65245c3 ("PCI: don't shrink bridge resources")).  Prior to
      dff22d20, ARM didn't read bridge windows at all, so the "original size"
      was zero, and we assigned a 3MB window.
      
      After dff22d20, we read the bridge windows before sizing the bus.  The
      firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since
      we never decrease the size, we kept 16MB even though we only needed 3MB.
      But 16MB doesn't fit in the host bridge aperture, so we failed to assign
      space for the window and the downstream devices.
      
      I think this is a defect in the PCI core: we shouldn't rely on the firmware
      to assign sensible windows.
      
      Ray reported a similar problem, also on ARM, with Broadcom iProc.
      
      Issues like this are too hard to fix right now, so revert dff22d20.
      Reported-by: NHannes <oe5hpm@gmail.com>
      Reported-by: NRay Jui <rjui@broadcom.com>
      Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com
      Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.comSigned-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      237865f1
  17. 15 9月, 2015 1 次提交
  18. 14 9月, 2015 3 次提交
    • T
      powerpc/cell: Prepare irq handler for irq argument removal · 391de7f9
      Thomas Gleixner 提交于
      The irq argument of most interrupt flow handlers is unused or merily
      used instead of a local variable. The handlers which need the irq
      argument can retrieve the irq number from the irq descriptor.
          
      Search and update was done with coccinelle and the invaluable help of
      Julia Lawall.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      391de7f9
    • T
      powerpc/85xx: Prepare irq handlers for irq argument removal · 0a0dbd92
      Thomas Gleixner 提交于
      The irq argument of most interrupt flow handlers is unused or merily
      used instead of a local variable. The handlers which need the irq
      argument can retrieve the irq number from the irq descriptor.
          
      Search and update was done with coccinelle and the invaluable help of
      Julia Lawall.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Scott Wood <scottwood@freescale.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      0a0dbd92
    • T
      powerpc/mpc5121_ads_cpld: Prepare irq handler for irq argument removal · 5aac2d33
      Thomas Gleixner 提交于
      The irq argument of most interrupt flow handlers is unused or merily
      used instead of a local variable. The handlers which need the irq
      argument can retrieve the irq number from the irq descriptor.
          
      Search and update was done with coccinelle and the invaluable help of
      Julia Lawall.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Anatolij Gustschin <agust@denx.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      5aac2d33
  19. 13 9月, 2015 3 次提交
    • S
      perf/core: Drop PERF_EVENT_TXN · 8f3e5684
      Sukadev Bhattiprolu 提交于
      We currently use PERF_EVENT_TXN flag to determine if we are in the middle
      of a transaction. If in a transaction, we defer the schedulability checks
      from pmu->add() operation to the pmu->commit() operation.
      
      Now that we have "transaction types" (PERF_PMU_TXN_ADD, PERF_PMU_TXN_READ)
      we can use the type to determine if we are in a transaction and drop the
      PERF_EVENT_TXN flag.
      
      When PERF_EVENT_TXN is dropped, the cpuhw->group_flag on some architectures
      becomes unused, so drop that field as well.
      
      This is an extension of the Powerpc patch from Peter Zijlstra to s390,
      Sparc and x86 architectures.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1441336073-22750-11-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8f3e5684
    • S
      powerpc, perf/powerpc/hv-24x7: Use PMU_TXN_READ interface · 88a48613
      Sukadev Bhattiprolu 提交于
      The 24x7 counters in Powerpc allow monitoring a large number of counters
      simultaneously. They also allow reading several counters in a single
      HCALL so we can get a more consistent snapshot of the system.
      
      Use the PMU's transaction interface to monitor and read several event
      counters at once. The idea is that users can group several 24x7 events
      into a single group of events. We use the following logic to submit
      the group of events to the PMU and read the values:
      
      	pmu->start_txn()		// Initialize before first event
      
      	for each event in group
      		pmu->read(event);	// Queue each event to be read
      
      	pmu->commit_txn()		// Read/update all queuedcounters
      
      The ->commit_txn() also updates the event counts in the respective
      perf_event objects.  The perf subsystem can then directly get the
      event counts from the perf_event and can avoid submitting a new
      ->read() request to the PMU.
      
      Thanks to input from Peter Zijlstra.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1441336073-22750-10-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      88a48613
    • S
      perf/core: Add a 'flags' parameter to the PMU transactional interfaces · fbbe0701
      Sukadev Bhattiprolu 提交于
      Currently, the PMU interface allows reading only one counter at a time.
      But some PMUs like the 24x7 counters in Power, support reading several
      counters at once. To leveage this functionality, extend the transaction
      interface to support a "transaction type".
      
      The first type, PERF_PMU_TXN_ADD, refers to the existing transactions,
      i.e. used to _schedule_ all the events on the PMU as a group. A second
      transaction type, PERF_PMU_TXN_READ, will be used in a follow-on patch,
      by the 24x7 counters to read several counters at once.
      
      Extend the transaction interfaces to the PMU to accept a 'txn_flags'
      parameter and use this parameter to ignore any transactions that are
      not of type PERF_PMU_TXN_ADD.
      
      Thanks to Peter Zijlstra for his input.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      [peterz: s390 compile fix]
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1441336073-22750-3-git-send-email-sukadev@linux.vnet.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fbbe0701