1. 12 10月, 2015 2 次提交
  2. 09 10月, 2015 2 次提交
  3. 06 10月, 2015 2 次提交
  4. 05 10月, 2015 6 次提交
  5. 02 10月, 2015 2 次提交
  6. 01 10月, 2015 6 次提交
    • M
      powerpc: Add ppc64le_defconfig · 2adc48a6
      Michael Ellerman 提交于
      Based directly on ppc64_defconfig using merge_config.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      2adc48a6
    • A
      powerpc/mm: Add virt_to_pfn and use this instead of opencoding · 65d3223a
      Aneesh Kumar K.V 提交于
      This add helper virt_to_pfn and remove the opencoded usage of the
      same.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      65d3223a
    • M
      powerpc/vdso: Avoid link stack corruption in __get_datapage() · c974809a
      Michael Neuling 提交于
      powerpc has a link register (lr) used for calling functions. We "bl
      <func>" to call a function, and "blr" to return back to the call site.
      
      The lr is only a single register, so if we call another function from
      inside this function (ie. nested calls), software must save away the
      lr on the software stack before calling the new function. Before
      returning (ie. before the "blr"), the lr is restored by software from
      the software stack.
      
      This makes branch prediction quite difficult for the processor as it
      will only know the branch target just before the "blr".
      
      To help with this, modern powerpc processors keep a (non-architected)
      hardware stack of lr called a "link stack". When a "bl <func>" is
      run, the lr is pushed onto this stack. When a "blr" is called, the
      branch predictor pops the lr value from the top of the link stack, and
      uses it to predict the branch target. Hence the processor pipeline
      knows a lot earlier the branch target.
      
      This works great but there are some cases where you call "bl" but
      without a matching "blr". Once such case is when trying to determine
      the program counter (which can't be read directly). Here you "bl+4;
      mflr" to get the program counter. If you do this, the link stack will
      get out of sync with reality, causing the branch predictor to
      mis-predict subsequent function returns.
      
      To avoid this, modern micro-architectures have a special case of bl.
      Using the form "bcl 20,31,+4", ensures the processor doesn't push to
      the link stack.
      
      The 32 and 64 bit variants of __get_datapage() use a "bl; mflr" to
      determine the loaded address of the VDSO. The current versions of
      these attempt to use this special bl variant.
      
      Unfortunately they use +8 rather than the required +4. Hence the
      current code results in the link stack getting out of sync with
      reality and hence the resulting performance degradation.
      
      This patch moves it to bcl+4 by moving __kernel_datapage_offset out of
      __get_datapage().
      
      With this patch, running a gettimeofday() (which uses
      __get_datapage()) microbenchmark we get a decent bump in performance
      on POWER7/8.
      
      For the benchmark in tools/testing/selftests/powerpc/benchmarks/gettimeofday.c
        POWER8:
          64bit gets ~4% improvement
          32bit gets ~9% improvement
        POWER7:
          64bit gets ~7% improvement
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Reported-by: NAaron Sawdey <sawdey@us.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      c974809a
    • M
      powerpc/slb: Use a local to avoid multiple calls to get_slb_shadow() · 26cd835e
      Michael Ellerman 提交于
      For no reason other than it looks ugly.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      26cd835e
    • A
      powerpc/slb: Define an enum for the bolted indexes · 1d15010c
      Anshuman Khandual 提交于
      This patch defines macros for the three bolted SLB indexes we use.
      Switch the functions that take the indexes as an argument to use the
      enum.
      Signed-off-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      1d15010c
    • M
      powerpc/vdso: Emit GNU & SysV hashes · 787b393c
      Michael Ellerman 提交于
      Andy Lutomirski says:
      
        Some dynamic loaders may be slightly faster if a GNU hash is
        available.
      
        This is unlikely to have any measurable effect on the time it takes
        to resolve vdso symbols (since there are so few of them).  In some
        contexts, it can be a win for a different reason: if every DSO has a
        GNU hash section, then libc can avoid calculating SysV hashes at
        all. Both musl and glibc appear to have this optimization.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      787b393c
  7. 29 9月, 2015 2 次提交
  8. 25 9月, 2015 1 次提交
  9. 21 9月, 2015 4 次提交
    • M
      powerpc: Wire up sys_membarrier() · 793b8bf9
      Michael Ellerman 提交于
      The selftest passes on 64-bit LE & BE, and 32-bit.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      793b8bf9
    • T
      KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store() · 3eb4ee68
      Thomas Huth 提交于
      Access to the kvm->buses (like with the kvm_io_bus_read() and -write()
      functions) has to be protected via the kvm->srcu lock.
      The kvmppc_h_logical_ci_load() and -store() functions are missing
      this lock so far, so let's add it there, too.
      This fixes the problem that the kernel reports "suspicious RCU usage"
      when lock debugging is enabled.
      
      Cc: stable@vger.kernel.org # v4.1+
      Fixes: 99342cf8Signed-off-by: NThomas Huth <thuth@redhat.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      3eb4ee68
    • G
      KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit · 7e022e71
      Gautham R. Shenoy 提交于
      In guest_exit_cont we call kvmhv_commence_exit which expects the trap
      number as the argument. However r3 doesn't contain the trap number at
      this point and as a result we would be calling the function with a
      spurious trap number.
      
      Fix this by copying r12 into r3 before calling kvmhv_commence_exit as
      r12 contains the trap number.
      
      Cc: stable@vger.kernel.org # v4.1+
      Fixes: eddb60fbSigned-off-by: NGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      7e022e71
    • P
      KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs · 5fc3e64f
      Paul Mackerras 提交于
      This fixes a bug which results in stale vcore pointers being left in
      the per-cpu preempted vcore lists when a VM is destroyed.  The result
      of the stale vcore pointers is usually either a crash or a lockup
      inside collect_piggybacks() when another VM is run.  A typical
      lockup message looks like:
      
      [  472.161074] NMI watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [qemu-system-ppc:7039]
      [  472.161204] Modules linked in: kvm_hv kvm_pr kvm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ses enclosure shpchp rtc_opal i2c_opal powernv_rng binfmt_misc dm_service_time scsi_dh_alua radeon i2c_algo_bit drm_kms_helper ttm drm tg3 ptp pps_core cxgb3 ipr i2c_core mdio dm_multipath [last unloaded: kvm_hv]
      [  472.162111] CPU: 24 PID: 7039 Comm: qemu-system-ppc Not tainted 4.2.0-kvm+ #49
      [  472.162187] task: c000001e38512750 ti: c000001e41bfc000 task.ti: c000001e41bfc000
      [  472.162262] NIP: c00000000096b094 LR: c00000000096b08c CTR: c000000000111130
      [  472.162337] REGS: c000001e41bff520 TRAP: 0901   Not tainted  (4.2.0-kvm+)
      [  472.162399] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24848844  XER: 00000000
      [  472.162588] CFAR: c00000000096b0ac SOFTE: 1
      GPR00: c000000000111170 c000001e41bff7a0 c00000000127df00 0000000000000001
      GPR04: 0000000000000003 0000000000000001 0000000000000000 0000000000874821
      GPR08: c000001e41bff8e0 0000000000000001 0000000000000000 d00000000efde740
      GPR12: c000000000111130 c00000000fdae400
      [  472.163053] NIP [c00000000096b094] _raw_spin_lock_irqsave+0xa4/0x130
      [  472.163117] LR [c00000000096b08c] _raw_spin_lock_irqsave+0x9c/0x130
      [  472.163179] Call Trace:
      [  472.163206] [c000001e41bff7a0] [c000001e41bff7f0] 0xc000001e41bff7f0 (unreliable)
      [  472.163295] [c000001e41bff7e0] [c000000000111170] __wake_up+0x40/0x90
      [  472.163375] [c000001e41bff830] [d00000000efd6fc0] kvmppc_run_core+0x1240/0x1950 [kvm_hv]
      [  472.163465] [c000001e41bffa30] [d00000000efd8510] kvmppc_vcpu_run_hv+0x5a0/0xd90 [kvm_hv]
      [  472.163559] [c000001e41bffb70] [d00000000e9318a4] kvmppc_vcpu_run+0x44/0x60 [kvm]
      [  472.163653] [c000001e41bffba0] [d00000000e92e674] kvm_arch_vcpu_ioctl_run+0x64/0x170 [kvm]
      [  472.163745] [c000001e41bffbe0] [d00000000e9263a8] kvm_vcpu_ioctl+0x538/0x7b0 [kvm]
      [  472.163834] [c000001e41bffd40] [c0000000002d0f50] do_vfs_ioctl+0x480/0x7c0
      [  472.163910] [c000001e41bffde0] [c0000000002d1364] SyS_ioctl+0xd4/0xf0
      [  472.163986] [c000001e41bffe30] [c000000000009260] system_call+0x38/0xd0
      [  472.164060] Instruction dump:
      [  472.164098] ebc1fff0 ebe1fff8 7c0803a6 4e800020 60000000 60000000 60420000 8bad02e2
      [  472.164224] 7fc3f378 4b6a57c1 60000000 7c210b78 <e92d0000> 89290009 792affe3 40820070
      
      The bug is that kvmppc_run_vcpu does not correctly handle the case
      where a vcpu task receives a signal while its guest vcpu is executing
      in the guest as a result of being piggy-backed onto the execution of
      another vcore.  In that case we need to wait for the vcpu to finish
      executing inside the guest, and then remove this vcore from the
      preempted vcores list.  That way, we avoid leaving this vcpu's vcore
      on the preempted vcores list when the vcpu gets interrupted.
      
      Fixes: ec257165Reported-by: NThomas Huth <thuth@redhat.com>
      Tested-by: NThomas Huth <thuth@redhat.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      5fc3e64f
  10. 17 9月, 2015 2 次提交
  11. 16 9月, 2015 9 次提交
    • T
      genirq: Remove irq argument from irq flow handlers · bd0b9ac4
      Thomas Gleixner 提交于
      Most interrupt flow handlers do not use the irq argument. Those few
      which use it can retrieve the irq number from the irq descriptor.
      
      Remove the argument.
      
      Search and replace was done with coccinelle and some extra helper
      scripts around it. Thanks to Julia for her help!
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      bd0b9ac4
    • T
      powerpc/mpc8xx: Use irq_set_handler_locked() · 9ca86b20
      Thomas Gleixner 提交于
      Use irq_set_handler_locked() as it avoids a redundant lookup of the
      irq descriptor.
      
      Search and replacement was done with coccinelle:
      
      @@
      struct irq_data *d;
      expression E1;
      @@
      
      -__irq_set_handler_locked(d->irq, E1);
      +irq_set_handler_locked(d, E1);
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      9ca86b20
    • T
      powerpc/ipic: Use irq_set_handler_locked() · 9758a7b0
      Thomas Gleixner 提交于
      Use irq_set_handler_locked() as it avoids a redundant lookup of the
      irq descriptor.
      
      Search and replacement was done with coccinelle:
      
      @@
      struct irq_data *d;
      expression E1;
      @@
      
      -__irq_set_handler_locked(d->irq, E1);
      +irq_set_handler_locked(d, E1);
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      9758a7b0
    • T
      powerpc/cpm2: Use irq_set_handler_locked() · e9e879a3
      Thomas Gleixner 提交于
      Use irq_set_handler_locked() as it avoids a redundant lookup of the
      irq descriptor.
      
      Search and replacement was done with coccinelle:
      
      @@
      struct irq_data *d;
      expression E1;
      @@
      
      -__irq_set_handler_locked(d->irq, E1);
      +irq_set_handler_locked(d, E1);
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      e9e879a3
    • T
      powerpc/mpc52xx: Use irq_set_handler_locked() · 6b83bd94
      Thomas Gleixner 提交于
      Use irq_set_handler_locked() as it avoids a redundant lookup of the
      irq descriptor.
      
      Search and replacement was done with coccinelle:
      
      @@
      struct irq_data *d;
      expression E1;
      @@
      
      -__irq_set_handler_locked(d->irq, E1);
      +irq_set_handler_locked(d, E1);
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Cc: Anatolij Gustschin <agust@denx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      6b83bd94
    • A
      powerpc/mm: Recompute hash value after a failed update · 36b35d5d
      Aneesh Kumar K.V 提交于
      If we had secondary hash flag set, we ended up modifying hash value in
      the updatepp code path. Hence with a failed updatepp we will be using
      a wrong hash value for the following hash insert. Fix this by
      recomputing hash before insert.
      
      Without this patch we can end up with using wrong slot number in linux
      pte. That can result in us missing an hash pte update or invalidate
      which can cause memory corruption or even machine check.
      
      Fixes: 6d492ecc ("powerpc/THP: Add code to handle HPTE faults for hugepages")
      Cc: stable@vger.kernel.org # v3.11+
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      36b35d5d
    • B
      powerpc/boot: Specify ABI v2 when building an LE boot wrapper · 655471f5
      Benjamin Herrenschmidt 提交于
      The kernel does it, not the boot wrapper, which breaks with some
      cross compilers that still default to ABI v1.
      
      Fixes: 147c0516 ("powerpc/boot: Add support for 64bit little endian wrapper")
      Cc: stable@vger.kernel.org # v3.16+
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      655471f5
    • P
      KVM: add halt_attempted_poll to VCPU stats · 62bea5bf
      Paolo Bonzini 提交于
      This new statistic can help diagnosing VCPUs that, for any reason,
      trigger bad behavior of halt_poll_ns autotuning.
      
      For example, say halt_poll_ns = 480000, and wakeups are spaced exactly
      like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes
      10+20+40+80+160+320+480 = 1110 microseconds out of every
      479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then
      is consuming about 30% more CPU than it would use without
      polling.  This would show as an abnormally high number of
      attempted polling compared to the successful polls.
      
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com<
      Reviewed-by: NDavid Matlack <dmatlack@google.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      62bea5bf
    • B
      PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code" · 237865f1
      Bjorn Helgaas 提交于
      Revert dff22d20 ("PCI: Call pci_read_bridge_bases() from core instead
      of arch code").
      
      Reading PCI bridge windows is not arch-specific in itself, but there is PCI
      core code that doesn't work correctly if we read them too early.  For
      example, Hannes found this case on an ARM Freescale i.mx6 board:
      
        pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff]
        pci 0000:00:00.0: PCI bridge to [bus 01-ff]
        pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window)
        pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000]
        pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000]
        pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100]
      
      The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs
      0x204100 of space, and mem windows are megabyte-aligned.
      
      Bus sizing can increase a bridge window size, but never *decrease* it (see
      d65245c3 ("PCI: don't shrink bridge resources")).  Prior to
      dff22d20, ARM didn't read bridge windows at all, so the "original size"
      was zero, and we assigned a 3MB window.
      
      After dff22d20, we read the bridge windows before sizing the bus.  The
      firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since
      we never decrease the size, we kept 16MB even though we only needed 3MB.
      But 16MB doesn't fit in the host bridge aperture, so we failed to assign
      space for the window and the downstream devices.
      
      I think this is a defect in the PCI core: we shouldn't rely on the firmware
      to assign sensible windows.
      
      Ray reported a similar problem, also on ARM, with Broadcom iProc.
      
      Issues like this are too hard to fix right now, so revert dff22d20.
      Reported-by: NHannes <oe5hpm@gmail.com>
      Reported-by: NRay Jui <rjui@broadcom.com>
      Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com
      Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.comSigned-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Acked-by: NYinghai Lu <yinghai@kernel.org>
      Acked-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      237865f1
  12. 15 9月, 2015 1 次提交
  13. 14 9月, 2015 1 次提交
    • T
      powerpc/cell: Prepare irq handler for irq argument removal · 391de7f9
      Thomas Gleixner 提交于
      The irq argument of most interrupt flow handlers is unused or merily
      used instead of a local variable. The handlers which need the irq
      argument can retrieve the irq number from the irq descriptor.
          
      Search and update was done with coccinelle and the invaluable help of
      Julia Lawall.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      391de7f9