1. 08 Oct 2016, 1 commit
    • parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels · 690d097c
      Helge Deller committed
      Increase the initial kernel default page mapping size for SMP kernels to 32MB
      and add a runtime check which panics early if the kernel is bigger than the
      initial mapping size.
      
      This fixes boot crashes of 32-bit SMP kernels. Due to the introduction
      of huge page support in kernel 4.4 and its required initial kernel
      layout in memory, a 32-bit SMP kernel usually got bigger (in layout,
      not size) than 16MB.
      
      Cc: stable@vger.kernel.org #4.4+
      Signed-off-by: Helge Deller <deller@gmx.de>
      690d097c
  2. 07 Oct 2016, 2 commits
  3. 06 Oct 2016, 8 commits
    • xen/x86: Update topology map for PV VCPUs · a6a198bc
      Boris Ostrovsky committed
      Early during boot topology_update_package_map() computes
      logical_pkg_ids for all present processors.
      
      Later, when processors are brought up, identify_cpu() updates
      these values based on phys_pkg_id which is a function of
      initial_apicid. On PV guests the latter may point to a
      non-existing node, causing logical_pkg_ids to be set to -1.
      
      Intel's RAPL uses logical_pkg_id (as topology_logical_package_id())
      to index its arrays and therefore in this case will use index
      65535 (since logical_pkg_id is a u16). This could lead to a crash,
      or it may actually access a random memory location.
      
      As a workaround, we recompute topology during CPU bringup to reset
      logical_pkg_id to a valid value.
      
      (The reason for initial_apicid being bogus is because it is
      initial_apicid of the processor from which the guest is launched.
      This value is CPUID(1).EBX[31:24])
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      a6a198bc
    • ARM: fix delays · fb833b1f
      Russell King committed
      Commit 215e362d ("ARM: 8306/1: loop_udelay: remove bogomips value
      limitation") tried to increase the bogomips limitation, but in doing
      so messed up udelay such that it always gives about a 5% error in the
      delay, even if we use a timer.
      
      The calculation is:
      
      	loops = UDELAY_MULT * us_delay * ticks_per_jiffy >> UDELAY_SHIFT
      
      Originally, UDELAY_MULT was ((UL(2199023) * HZ) >> 11) and UDELAY_SHIFT
      30.  Assuming HZ=100, us_delay of 1000 and ticks_per_jiffy of 1660000
      (eg, 166MHz timer, 1ms delay) this would calculate:
      
      	((UL(2199023) * HZ) >> 11) * 1000 * 1660000 >> 30
      		=> 165999
      
      With the new values of 2047 * HZ + 483648 * HZ / 1000000 and 31, we get:
      
      	(2047 * HZ + 483648 * HZ / 1000000) * 1000 * 1660000 >> 31
      		=> 158269
      
      which is incorrect.  This is due to a typo - correcting it gives:
      
      	(2147 * HZ + 483648 * HZ / 1000000) * 1000 * 1660000 >> 31
      		=> 165999
      
      In other words, the original value.
      
      Fixes: 215e362d ("ARM: 8306/1: loop_udelay: remove bogomips value limitation")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      fb833b1f
    • sparc: fixing ident and beautifying code · 98e98eb6
      netmonk@netmonk.org committed
      Good evening,
      
      Following the LinuxCodingStyle documentation and with the help of Sam,
      this fixes several indentation issues in the code, plus a few other
      cosmetic changes.

      And last and, I hope, least: fixing my name :)

      Signed-off-by: Dominique Carrel <netmonk@netmonk.org>
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      98e98eb6
    • sparc64: Enable setting "relaxed ordering" in IOMMU mappings · aa7bde1a
      Chris Hyser committed
      Enable relaxed ordering for memory writes in IOMMU TSB entry from
      dma_4v_alloc_coherent(), dma_4v_map_page() and dma_4v_map_sg() when
      dma_attrs DMA_ATTR_WEAK_ORDERING is set. This requires PCI IOMMU I/O
      Translation Services version 2.0 API.
      
      Many PCIe devices allow enabling relaxed ordering (memory writes
      bypassing other memory writes) for various DMA buffers. A notable
      exception is the Mellanox mlx4 IB adapter. Due to the nature of x86
      hardware this appears to have little performance impact there. On
      SPARC hardware, however, it results in major performance degradation,
      achieving only about 3 Gbps. Enabling RO in the IOMMU entries
      corresponding to mlx4 data buffers increases the throughput to
      about 13 Gbps.
      
      Orabug: 19245907
      Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aa7bde1a
    • sparc64: Enable PCI IOMMU version 2 API · 8914391b
      Chris Hyser committed
      Enable Version 2 of the PCI IOMMU API needed for advanced features
      such as PCI Relaxed Ordering and greater than 2 GB DMA address
      space per root complex.
      Signed-off-by: Chris Hyser <chris.hyser@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8914391b
    • sparc: migrate exception table users off module.h and onto extable.h · cdd4f4c7
      Paul Gortmaker committed
      These files were only including module.h for exception table
      related functions. We've now separated that content out into its
      own file, "extable.h", so move over to that and avoid all the
      extra header content in module.h that we don't really need to
      compile these files.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cdd4f4c7
    • parisc: Add cfi_startproc and cfi_endproc to assembly code · f39cce65
      Helge Deller committed
      Add ENTRY_CFI() and ENDPROC_CFI() macros for dwarf debug info and
      convert assembly users to new macros.
      Signed-off-by: Helge Deller <deller@gmx.de>
      f39cce65
    • parisc: Move hpmc stack into page aligned bss section · 2929e738
      Helge Deller committed
      Do not reserve space in the data section for the HPMC stack; instead
      move it into the page-aligned bss section.
      Signed-off-by: Helge Deller <deller@gmx.de>
      2929e738
  4. 05 Oct 2016, 2 commits
  5. 04 Oct 2016, 27 commits
    • powerpc/bpf: Add support for bpf constant blinding · b7b7013c
      Naveen N. Rao committed
      In line with similar support for other architectures by Daniel Borkmann.
      
      'MOD Default X' from test_bpf without constant blinding:
      84 bytes emitted from JIT compiler (pass:3, flen:7)
      d0000000058a4688 + <x>:
         0:	nop
         4:	nop
         8:	std     r27,-40(r1)
         c:	std     r28,-32(r1)
        10:	xor     r8,r8,r8
        14:	xor     r28,r28,r28
        18:	mr      r27,r3
        1c:	li      r8,66
        20:	cmpwi   r28,0
        24:	bne     0x0000000000000030
        28:	li      r8,0
        2c:	b       0x0000000000000044
        30:	divwu   r9,r8,r28
        34:	mullw   r9,r28,r9
        38:	subf    r8,r9,r8
        3c:	rotlwi  r8,r8,0
        40:	li      r8,66
        44:	ld      r27,-40(r1)
        48:	ld      r28,-32(r1)
        4c:	mr      r3,r8
        50:	blr
      
      ... and with constant blinding:
      140 bytes emitted from JIT compiler (pass:3, flen:11)
      d00000000bd6ab24 + <x>:
         0:	nop
         4:	nop
         8:	std     r27,-40(r1)
         c:	std     r28,-32(r1)
        10:	xor     r8,r8,r8
        14:	xor     r28,r28,r28
        18:	mr      r27,r3
        1c:	lis     r2,-22834
        20:	ori     r2,r2,36083
        24:	rotlwi  r2,r2,0
        28:	xori    r2,r2,36017
        2c:	xoris   r2,r2,42702
        30:	rotlwi  r2,r2,0
        34:	mr      r8,r2
        38:	rotlwi  r8,r8,0
        3c:	cmpwi   r28,0
        40:	bne     0x000000000000004c
        44:	li      r8,0
        48:	b       0x000000000000007c
        4c:	divwu   r9,r8,r28
        50:	mullw   r9,r28,r9
        54:	subf    r8,r9,r8
        58:	rotlwi  r8,r8,0
        5c:	lis     r2,-17137
        60:	ori     r2,r2,39065
        64:	rotlwi  r2,r2,0
        68:	xori    r2,r2,39131
        6c:	xoris   r2,r2,48399
        70:	rotlwi  r2,r2,0
        74:	mr      r8,r2
        78:	rotlwi  r8,r8,0
        7c:	ld      r27,-40(r1)
        80:	ld      r28,-32(r1)
        84:	mr      r3,r8
        88:	blr
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      b7b7013c
    • powerpc/bpf: Implement support for tail calls · ce076141
      Naveen N. Rao committed
      Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF
      programs. This can be achieved either by:
      (1) retaining the stack setup by the first eBPF program and having all
      subsequent eBPF programs re-using it, or,
      (2) by unwinding/tearing down the stack and having each eBPF program
      deal with its own stack as it sees fit.
      
      To ensure that this does not create loops, there is a limit to how many
      tail calls can be done (currently 32). This requires the JIT'ed code to
      maintain a count of the number of tail calls done so far.
      
      Approach (1) is simple, but requires every eBPF program to have (almost)
      the same prologue/epilogue, regardless of whether they need it. This is
      inefficient for small eBPF programs, which may sometimes not need a
      prologue at all. As such, to minimize the impact of the tail call
      implementation, we use approach (2) here, which needs each eBPF program
      in the chain to use its own prologue/epilogue. This is not ideal when
      many tail calls are involved and when all the eBPF programs in the chain
      have similar prologue/epilogue. However, the impact is restricted to
      programs that do tail calls. Individual eBPF programs are not affected.
      
      We maintain the tail call count in a fixed location on the stack, and
      updated tail call count values are passed in through it. The very
      first eBPF program in a chain initializes this count to 0 (its first 2
      instructions). Subsequent tail calls skip the first two eBPF JIT
      instructions to maintain the count. For programs that don't do tail
      calls themselves, the first two instructions are NOPs.
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ce076141
    • powerpc/bpf: Introduce accessors for using the tmp local stack space · 7b847f52
      Naveen N. Rao committed
      While at it, ensure that the location of the local save area is
      consistent whether or not we setup our own stackframe. This property is
      utilised in the next patch that adds support for tail calls.
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      7b847f52
    • powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n · 2685f826
      Michael Ellerman committed
      The fadump code calls vmcore_cleanup() which only exists if
      CONFIG_PROC_VMCORE=y. We don't want to depend on CONFIG_PROC_VMCORE,
      because it's user selectable, so just wrap the call in an #ifdef.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2685f826
    • powerpc: tm: Enable transactional memory (TM) lazily for userspace · 5d176f75
      Cyril Bur committed
      Currently the MSR TM bit is always set if the hardware is TM capable.
      This adds extra overhead as it means the TM SPRs (TFHAR, TEXASR and
      TFIAR) must be swapped for each process regardless of whether they
      use TM.
      
      For processes that don't use TM the TM MSR bit can be turned off
      allowing the kernel to avoid the expensive swap of the TM registers.
      
      If a thread does use TM, a TM unavailable exception will occur and
      the kernel will enable MSR_TM, leaving it on for some time afterwards.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      5d176f75
    • powerpc/tm: Add TM Unavailable Exception · 172f7aaa
      Cyril Bur committed
      If the kernel disables transactional memory (TM) and userspace still
      tries TM related actions (TM instructions or TM SPR accesses) TM aware
      hardware will cause the kernel to take a facility unavailable
      exception.
      
      Add checks for the exception being caused by illegal TM access in
      userspace.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      [mpe: Rewrite comment entirely, bugs in it are mine]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      172f7aaa
    • powerpc: Remove do_load_up_transact_{fpu,altivec} · d986d6f4
      Cyril Bur committed
      The previous rework of the TM code left these functions unused.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d986d6f4
    • powerpc: tm: Rename transct_(*) to ck(\1)_state · 000ec280
      Cyril Bur committed
      Name the structures used for checkpointed state consistently with
      pt_regs/ckpt_regs.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      000ec280
    • powerpc: tm: Always use fp_state and vr_state to store live registers · dc310669
      Cyril Bur committed
      There is currently an inconsistency as to how the entire CPU register
      state is saved and restored when a thread uses transactional memory
      (TM).
      
      Using transactional memory results in the CPU having duplicated
      (almost) all of its register state. This duplication results in a set
      of registers which can be considered 'live', those being currently
      modified by the instructions being executed and another set that is
      frozen at a point in time.
      
      On context switch, both sets of state have to be saved and (later)
      restored. These two states are often called a variety of different
      things. Common terms for the state which only exists after the CPU has
      entered a transaction (performed a TBEGIN instruction) in hardware are
      'transactional' or 'speculative'.
      
      Between a TBEGIN and a TEND or TABORT (or an event that causes the
      hardware to abort), regardless of the use of TSUSPEND the
      transactional state can be referred to as the live state.
      
      The second state is often referred to as the 'checkpointed' state,
      and is a duplication of the live state made when the TBEGIN
      instruction is executed. This state is kept in the hardware and will
      be rolled back to on transaction failure.
      
      Currently all the registers stored in pt_regs are ALWAYS the live
      registers, that is, when a thread has transactional registers their
      values are stored in pt_regs and the checkpointed state is in
      ckpt_regs. A strange opposite is true for fp_state/vr_state. When a
      thread is non-transactional, fp_state/vr_state holds the live
      registers. When a thread has initiated a transaction,
      fp_state/vr_state holds the checkpointed state and
      transact_fp/transact_vr become the structures which hold the live
      state (at this point a transactional state).
      
      This creates confusion as to where the live state is; in some
      circumstances it requires extra work to determine where to put the
      live state, and it prevents the use of common functions designed
      (probably before TM) to save the live state.
      
      With this patch pt_regs, fp_state and vr_state all represent the
      same thing and the other structures [pending rename] are for
      checkpointed state.
      Acked-by: Simon Guo <wei.guo.simon@gmail.com>
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      dc310669
    • powerpc: signals: Stop using current in signal code · d1199431
      Cyril Bur committed
      Much of the signal code takes a pt_regs on which it operates. Over
      time the signal code has needed to know more about the thread than
      what pt_regs can supply; this information is obtained as needed by
      using 'current'.
      
      This approach is not strictly incorrect, but it does mean there is
      now a hard requirement that the pt_regs being passed around belongs
      to current, and this is never checked. A safer approach is for the
      majority of the signal functions to take a task_struct from which
      they can obtain pt_regs and any other information they need. The
      caveat that the task_struct they are passed must be current doesn't
      go away, but it can more easily be checked for.
      
      Functions called from outside powerpc signal code are passed a
      pt_regs; they can confirm that the pt_regs is that of current and
      pass current to other functions. Furthermore, powerpc signal
      functions can check that the task_struct they are passed is the same
      as current, avoiding possible corruption of current (or the task they
      are passed) if this assertion ever fails.
      
      Cc: paulus@samba.org
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d1199431
    • powerpc: Never giveup a reclaimed thread when enabling kernel {fp, altivec, vsx} · e909fb83
      Cyril Bur committed
      After a thread is reclaimed from its active or suspended transactional
      state, the checkpointed state exists on the CPU; this state (along
      with the live/transactional state) has been saved in its entirety by
      the reclaiming process.
      
      There exists a sequence of events that would cause the kernel to call
      one of enable_kernel_fp(), enable_kernel_altivec() or
      enable_kernel_vsx() after a thread has been reclaimed. These functions
      save away any user state on the CPU so that the kernel can use the
      registers. Not only is this saving away unnecessary at this point, it
      is actually incorrect. It causes a save of the checkpointed state to
      the live structures within the thread struct thus destroying the true
      live state for that thread.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      e909fb83
    • powerpc: Return the new MSR from msr_check_and_set() · 3cee070a
      Cyril Bur committed
      msr_check_and_set() always performs an mfmsr() to determine whether it
      needs to perform an mtmsr(). Since mfmsr() can be a costly operation,
      msr_check_and_set() can return the MSR now on the CPU, so that its
      callers avoid having to make their own mfmsr() call.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      3cee070a
    • powerpc: Add check_if_tm_restore_required() to giveup_all() · b0f16b46
      Cyril Bur committed
      giveup_all() causes the FPU/VMX/VSX facilities to be disabled in a
      thread's MSR. If the thread performing the giveup was transactional,
      the kernel must record which facilities were in use before the giveup,
      as the thread must have these facilities re-enabled on return to
      userspace.
      
      From process.c:
       /*
        * This is called if we are on the way out to userspace and the
        * TIF_RESTORE_TM flag is set.  It checks if we need to reload
        * FP and/or vector state and does so if necessary.
        * If userspace is inside a transaction (whether active or
        * suspended) and FP/VMX/VSX instructions have ever been enabled
        * inside that transaction, then we have to keep them enabled
        * and keep the FP/VMX/VSX state loaded while ever the transaction
        * continues.  The reason is that if we didn't, and subsequently
        * got a FP/VMX/VSX unavailable interrupt inside a transaction,
        * we don't know whether it's the same transaction, and thus we
        * don't know which of the checkpointed state and the transactional
        * state to use.
        */
      
      Calling check_if_tm_restore_required() will set TIF_RESTORE_TM and
      save the MSR if needed.
      
      Fixes: c2085059 ("powerpc: create giveup_all()")
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      b0f16b46
    • powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use · dc16b553
      Cyril Bur committed
      Comment from arch/powerpc/kernel/process.c:967:
       If userspace is inside a transaction (whether active or
       suspended) and FP/VMX/VSX instructions have ever been enabled
       inside that transaction, then we have to keep them enabled
       and keep the FP/VMX/VSX state loaded while ever the transaction
       continues.  The reason is that if we didn't, and subsequently
       got a FP/VMX/VSX unavailable interrupt inside a transaction,
       we don't know whether it's the same transaction, and thus we
       don't know which of the checkpointed state and the transactional
       state to use.
      
      restore_math(), restore_fp() and restore_altivec() currently may not
      restore the registers. It doesn't appear that this is more serious
      than a performance penalty. If the math registers aren't restored,
      the userspace thread will still be run with the facility disabled.
      Userspace will not be able to read invalid values. On the first
      access it will take a facility unavailable exception and the kernel
      will detect an active transaction, at which point it will abort the
      transaction. There is the possibility of a pathological case
      preventing any progress by transactions; however, transactions are
      never guaranteed to make progress.
      
      Fixes: 70fe3d98 ("powerpc: Restore FPU/VEC/VSX if previously used")
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      dc16b553
    • powerpc/powernv: Fix data type for @r in pnv_ioda_parse_m64_window() · 0e7736c6
      Gavin Shan committed
      This fixes warning reported from sparse:
      
        pci-ioda.c:451:49: warning: incorrect type in argument 2 (different base types)
      
      Fixes: 262af557 ("powerpc/powernv: Enable M64 aperatus for PHB3")
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      0e7736c6
    • powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data() · 5adaf862
      Gavin Shan committed
      This fixes the warnings reported from sparse:
      
        pci.c:312:33: warning: restricted __be64 degrades to integer
        pci.c:313:33: warning: restricted __be64 degrades to integer
      
      Fixes: cee72d5b ("powerpc/powernv: Display diag data on p7ioc EEH errors")
      Cc: stable@vger.kernel.org # v3.3+
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      5adaf862
    • powerpc/powernv: Specify proper data type for PCI_SLOT_ID_PREFIX · 066bcd78
      Gavin Shan committed
      This fixes the warning reported from sparse:
      
        eeh-powernv.c:875:23: warning: constant 0x8000000000000000 is so big it is unsigned long
      
      Fixes: ebe22531 ("powerpc/powernv: Support PCI slot ID")
      Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      066bcd78
    • powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag() · a7032132
      Gavin Shan committed
      The hub diag-data type is filled with big-endian data by the OPAL
      call opal_pci_get_hub_diag_data(). We need to convert it to a
      CPU-endian value before using it. The issue is reported by sparse,
      as pointed out by Michael Ellerman:
      
        eeh-powernv.c:1309:21: warning: restricted __be16 degrades to integer
      
      This converts hub diag-data type to CPU-endian before using it in
      pnv_eeh_get_and_dump_hub_diag().
      
      Fixes: 2a485ad7 ("powerpc/powernv: Drop PHB operation next_error()")
      Cc: stable@vger.kernel.org # v4.1+
      Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      a7032132
    • powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear() · d63e51b3
      Gavin Shan committed
      The PE number (@frozen_pe_no), filled by opal_pci_next_error(), is in
      big-endian format. It should be converted to CPU-endian before it is
      passed to opal_pci_eeh_freeze_clear() when clearing the frozen state
      if the PE is an invalid one. As Michael Ellerman pointed out, the
      issue is also detected by sparse:
      
        eeh-powernv.c:1541:41: warning: incorrect type in argument 2 (different base types)
      
      This passes the CPU-endian PE number to opal_pci_eeh_freeze_clear();
      it should have been part of commit <0f36db77> ("powerpc/eeh: Fix
      wrong printed PE number"), which was merged in the 4.3 kernel.
      
      Fixes: 71b540ad ("powerpc/powernv: Don't escalate non-existing frozen PE")
      Cc: stable@vger.kernel.org # v4.3+
      Suggested-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d63e51b3
    • powerpc: Set default CPU type to POWER8 for little endian builds · e2ad477c
      Anton Blanchard committed
      We supported POWER7 CPUs for bootstrapping little endian, but the
      target was always POWER8. Now that POWER7 specific issues are
      impacting performance, change the default target to POWER8.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      e2ad477c
    • powerpc: Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian · 8a18cc0c
      Anton Blanchard committed
      POWER8 handles unaligned accesses in little endian mode, but commit
      0b5e6661 ("powerpc: Don't set HAVE_EFFICIENT_UNALIGNED_ACCESS on
      little endian builds") disabled it for all.
      
      The issue with unaligned little endian accesses is specific to POWER7,
      so update the Kconfig check to match. Using the stat() testcase from
      commit a75c380c ("powerpc: Enable DCACHE_WORD_ACCESS on ppc64le"),
      performance improves 15% on POWER8.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      8a18cc0c
    • powerpc: Remove static branch prediction in atomic{, 64}_add_unless · 61e98ebf
      Anton Blanchard committed
      I see quite a lot of static branch mispredictions on a simple
      web serving workload. The issue is in __atomic_add_unless(), called
      from _atomic_dec_and_lock(). There is no obvious common case, so it
      is better to let the hardware predict the branch.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      61e98ebf
    • powerpc: During context switch, check before setting mm_cpumask · bb85fb58
      Anton Blanchard committed
      During context switch, switch_mm() sets our current CPU in mm_cpumask.
      We can avoid this atomic sequence in most cases by checking before
      setting the bit.
      
      Testing on a POWER8 using our context switch microbenchmark:
      
      tools/testing/selftests/powerpc/benchmarks/context_switch \
      	--process --no-fp --no-altivec --no-vector
      
      Performance improves 2%.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      bb85fb58
    • powerpc/eeh: Quieten EEH message when no adapters are found · 91ac730b
      Anton Blanchard committed
      No real need for this to be pr_warn(), reduce it to pr_info().
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      91ac730b
    • powerpc/configs: Enable Intel i40e on 64 bit configs · 9eda65fb
      Anton Blanchard committed
      We are starting to see i40e adapters in recent machines, so enable
      it in our configs.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      9eda65fb
    • powerpc/configs: Change a few things from built in to modules · d3eb34a3
      Anton Blanchard committed
      Change a few devices and filesystems that are seldom used any more
      from built-in to modules. This reduces our vmlinux by about 500kB.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d3eb34a3
    • powerpc/configs: Bump kernel ring buffer size on 64 bit configs · 32eab6c9
      Anton Blanchard committed
      When we issue a system reset, every CPU in the box prints an Oops,
      including a backtrace. Each of these can be quite large (over 4kB)
      and we may end up wrapping the ring buffer and losing important
      information.
      
      Bump the base size from 128kB to 256kB and the per CPU size from
      4kB to 8kB.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      32eab6c9