1. 28 October 2015 (7 commits)
  2. 23 October 2015 (1 commit)
    • powerpc/85xx: Load all early TLB entries at once · d9e1831a
      Authored by Scott Wood
      Use an AS=1 trampoline TLB entry to allow all normal TLB1 entries to
      be loaded at once.  This avoids the need to keep the translation that
      code is executing from in the same TLB entry in the final TLB
      configuration as during early boot, which in turn is helpful for
      relocatable kernels (e.g. kdump) where the kernel is not running from
      what would be the first TLB entry.
      
      On e6500, we limit map_mem_in_cams() to the primary hwthread of a
      core (the boot cpu is always considered primary, as a kdump kernel
      can be entered on any cpu).  Each TLB only needs to be set up once,
      and when we do, we don't want another thread to be running when we
      create a temporary trampoline TLB1 entry.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      d9e1831a
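      A minimal sketch of the primary-hwthread restriction described above,
      assuming the powerpc cpu_thread_in_core() helper and boot_cpuid;
      setup_mem_cams() is a hypothetical stand-in for map_mem_in_cams():

        /* Only the primary hardware thread of a core (or the boot CPU,
         * which may be a secondary thread when entering a kdump kernel)
         * loads the TLB1 array; other threads must not be running while
         * the temporary AS=1 trampoline entry exists. */
        static void setup_mem_cams(unsigned long ram, int cpu)
        {
                if (cpu != boot_cpuid && cpu_thread_in_core(cpu) != 0)
                        return;
                /* ... load all TLB1 entries via the trampoline entry ... */
        }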
  3. 15 October 2015 (3 commits)
  4. 12 October 2015 (1 commit)
    • powerpc/mm: Differentiate between hugetlb and THP during page walk · 891121e6
      Authored by Aneesh Kumar K.V
      We need to properly identify whether a hugepage is an explicit or
      a transparent hugepage in follow_huge_addr(). We used to depend
      on the hugepage shift argument to do that, but in some cases that
      produces the wrong result. For example:
      
      On finding a transparent hugepage we set the hugepage shift to
      PMD_SHIFT, but we can end up clearing the THP pte via
      pmdp_huge_get_and_clear(). We do prevent the pfn page from being
      reused, via kick_all_cpus_sync(), but that happens after the pte
      has been updated to 0. Hence follow_huge_addr() can find the
      hugepage shift set while the transparent huge page check fails
      for that THP pte.
      
      NOTE: We fixed a variant of this race against thp split in commit
      691e95fd
      ("powerpc/mm/thp: Make page table walk safe against thp split/collapse")
      
      Without this patch, we may hit the BUG_ON(flags & FOLL_GET) in
      follow_page_mask occasionally.
      
      In the long term, we may want to switch the ppc64 64k page size
      config to enable CONFIG_ARCH_WANT_GENERAL_HUGETLB.
      Reported-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      891121e6
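      A hedged sketch (not the actual patch) of the distinction the commit
      relies on: a page-table walker must not infer "transparent hugepage"
      from the hugepage shift alone, but re-check the pmd itself, since it
      may have been cleared concurrently. pmd_present() and pmd_trans_huge()
      are standard kernel predicates; the helper name is illustrative:

        /* Illustrative only: a THP is only live if the pmd is still a
         * present transparent-huge entry; pmdp_huge_get_and_clear() may
         * have zeroed it after the shift was observed. */
        static bool pmd_is_live_thp(pmd_t pmd)
        {
                return pmd_present(pmd) && pmd_trans_huge(pmd);
        }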
  5. 02 October 2015 (2 commits)
  6. 01 October 2015 (2 commits)
    • powerpc/vdso: Avoid link stack corruption in __get_datapage() · c974809a
      Authored by Michael Neuling
      powerpc has a link register (lr) used for calling functions. We "bl
      <func>" to call a function, and "blr" to return back to the call site.
      
      The lr is only a single register, so if we call another function from
      inside this function (i.e. nested calls), software must save the lr
      on the software stack before calling the new function. Before
      returning (i.e. before the "blr"), the lr is restored by software from
      the software stack.
      
      This makes branch prediction quite difficult for the processor as it
      will only know the branch target just before the "blr".
      
      To help with this, modern powerpc processors keep a (non-architected)
      hardware stack of lr called a "link stack". When a "bl <func>" is
      run, the lr is pushed onto this stack. When a "blr" is called, the
      branch predictor pops the lr value from the top of the link stack, and
      uses it to predict the branch target. Hence the processor pipeline
      knows a lot earlier the branch target.
      
      This works great, but there are some cases where you execute a "bl"
      without a matching "blr". One such case is when trying to determine
      the program counter (which can't be read directly): here you use
      "bl +4; mflr" to get the program counter. If you do this, the link
      stack gets out of sync with reality, causing the branch predictor to
      mis-predict subsequent function returns.
      
      To avoid this, modern micro-architectures have a special case of bl:
      using the form "bcl 20,31,+4" ensures the processor doesn't push to
      the link stack.
      
      The 32 and 64 bit variants of __get_datapage() use a "bl; mflr"
      sequence to determine the loaded address of the VDSO. The current
      versions attempt to use this special bl variant, but unfortunately
      use +8 rather than the required +4. Hence the current code still
      leaves the link stack out of sync with reality, with the resulting
      performance degradation.
      
      This patch moves it to bcl+4 by moving __kernel_datapage_offset out of
      __get_datapage().
      
      With this patch, a gettimeofday() microbenchmark (which uses
      __get_datapage()) gets a decent bump in performance on POWER7/8.
      
      For the benchmark in tools/testing/selftests/powerpc/benchmarks/gettimeofday.c
        POWER8:
          64bit gets ~4% improvement
          32bit gets ~9% improvement
        POWER7:
          64bit gets ~7% improvement
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reported-by: Aaron Sawdey <sawdey@us.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      c974809a
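      A minimal sketch of the link-stack-safe "get PC" idiom described
      above, written as GCC inline assembly; it is illustrative and not the
      actual VDSO code, which applies the same bcl+4 trick around
      __kernel_datapage_offset:

        /* bcl 20,31 to the very next instruction does not push the link
         * stack, so the following mflr reads the current address without
         * desynchronising the return-address predictor. */
        static inline unsigned long current_address(void)
        {
                unsigned long pc;

                asm volatile("bcl 20,31,1f\n"
                             "1:    mflr %0" : "=r" (pc) : : "lr");
                return pc;
        }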
    • powerpc/vdso: Emit GNU & SysV hashes · 787b393c
      Authored by Michael Ellerman
      Andy Lutomirski says:
      
        Some dynamic loaders may be slightly faster if a GNU hash is
        available.
      
        This is unlikely to have any measurable effect on the time it takes
        to resolve vdso symbols (since there are so few of them).  In some
        contexts, it can be a win for a different reason: if every DSO has a
        GNU hash section, then libc can avoid calculating SysV hashes at
        all. Both musl and glibc appear to have this optimization.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      787b393c
  7. 17 September 2015 (2 commits)
  8. 16 September 2015 (1 commit)
    • PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code" · 237865f1
      Authored by Bjorn Helgaas
      Revert dff22d20 ("PCI: Call pci_read_bridge_bases() from core instead
      of arch code").
      
      Reading PCI bridge windows is not arch-specific in itself, but there is PCI
      core code that doesn't work correctly if we read them too early.  For
      example, Hannes found this case on an ARM Freescale i.mx6 board:
      
        pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff]
        pci 0000:00:00.0: PCI bridge to [bus 01-ff]
        pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window)
        pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000]
        pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000]
        pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100]
      
      The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs
      0x204100 bytes of space, and mem windows are megabyte-aligned.
      
      Bus sizing can increase a bridge window size, but never *decrease* it (see
      d65245c3 ("PCI: don't shrink bridge resources")).  Prior to
      dff22d20, ARM didn't read bridge windows at all, so the "original size"
      was zero, and we assigned a 3MB window.
      
      After dff22d20, we read the bridge windows before sizing the bus.  The
      firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since
      we never decrease the size, we kept 16MB even though we only needed 3MB.
      But 16MB doesn't fit in the host bridge aperture, so we failed to assign
      space for the window and the downstream devices.
      
      I think this is a defect in the PCI core: we shouldn't rely on the firmware
      to assign sensible windows.
      
      Ray reported a similar problem, also on ARM, with Broadcom iProc.
      
      Issues like this are too hard to fix right now, so revert dff22d20.
      Reported-by: Hannes <oe5hpm@gmail.com>
      Reported-by: Ray Jui <rjui@broadcom.com>
      Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com
      Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.com
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      237865f1
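      A small standalone example of the window arithmetic in the log above:
      the 01:00.0 BARs need 0x200000 + 0x4000 + 0x100 = 0x204100 bytes, and
      bridge memory windows are megabyte-aligned, so the minimum window is
      3MB, which fits the 15MB root bus aperture, while the firmware's 16MB
      window does not:

        #include <stdio.h>

        #define SZ_1M 0x100000UL

        int main(void)
        {
                unsigned long need = 0x200000 + 0x4000 + 0x100;  /* 0x204100 */
                unsigned long win  = (need + SZ_1M - 1) & ~(SZ_1M - 1);

                /* prints 0x300000 (3 MB) */
                printf("minimum bridge window: 0x%lx (%lu MB)\n",
                       win, win / SZ_1M);
                return 0;
        }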
  9. 15 September 2015 (1 commit)
  10. 28 August 2015 (1 commit)
    • powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail() · 25980013
      Authored by Gavin Shan
      The config space of some PCI devices can't be accessed while their
      PEs are in the frozen state; doing so can fence the PHB. Those PEs
      are identified by the flag EEH_PE_CFG_RESTRICTED, meaning
      EEH_PE_CFG_BLOCKED is set automatically when the PE is put into the
      frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores
      PCI device BARs with eeh_pe_restore_bars(), which then calls
      eeh_ops->restore_config() to reinitialize the PCI device in (OPAL)
      firmware. Those restore_config() PCI config accesses are what fence
      the PHB. The problem was reported on the adapter below:
      
         0001:01:00.0 0200: 14e4:168e (rev 10)
         0001:01:00.0 Ethernet controller: Broadcom Corporation \
                      NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
      
      This fixes the issue by skipping eeh_pe_restore_bars() in
      eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE.
      
      Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE")
      Cc: stable@vger.kernel.org # v4.0+
      Reported-by: Manvanthara B. Puttashankar <mputtash@in.ibm.com>
      Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      25980013
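      A hedged sketch of the guard this commit describes; the flag and
      helper names follow the commit text, but the snippet is illustrative
      rather than the literal patch:

        /* In eeh_slot_error_detail(): leave config space alone while the
         * PE has config access blocked, otherwise the PHB may be fenced. */
        if (!(pe->state & EEH_PE_CFG_BLOCKED))
                eeh_pe_restore_bars(pe);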
  11. 26 August 2015 (1 commit)
    • powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case · 4d9aac39
      Authored by Guilherme G. Piccoli
      Since commit 1851617c ("PCI/MSI: Disable MSI at enumeration even if
      kernel doesn't support MSI"), the setup of dev->msi_cap/msix_cap and the
      disable of MSI/MSI-X interrupts isn't being done at PCI probe time, as
      the logic responsible for this was moved in the aforementioned commit
      from pci_device_add() to pci_setup_device(). The latter function is not
      reachable on PowerPC pseries platform during Open Firmware PCI probing
      time.
      
      This manifests as drivers not being able to enable MSI, e.g.:
      
        bnx2x 0000:01:00.0: no msix capability found
      
      This patch calls pci_msi_setup_pci_dev() explicitly to disable MSI/MSI-X
      during PCI probe time on pSeries platform.
      
      Fixes: 1851617c ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
      [mpe: Flesh out change log and clarify comment]
      Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      4d9aac39
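      A hedged sketch of the fix described above: call pci_msi_setup_pci_dev()
      from the pseries Open Firmware device-creation path so msi_cap/msix_cap
      are recorded and MSI/MSI-X are disabled, as pci_setup_device() would
      normally do. The wrapper shown is hypothetical:

        static void of_pci_msi_fixup(struct pci_dev *dev)
        {
                /* pci_setup_device() is never reached when devices are
                 * created from the Open Firmware tree, so do its MSI
                 * setup here instead. */
                pci_msi_setup_pci_dev(dev);
        }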
  12. 22 August 2015 (2 commits)
    • KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8 · b4deba5c
      Authored by Paul Mackerras
      This builds on the ability to run more than one vcore on a physical
      core by using the micro-threading (split-core) modes of the POWER8
      chip.  Previously, only vcores from the same VM could be run together,
      and (on POWER8) only if they had just one thread per core.  With the
      ability to split the core on guest entry and unsplit it on guest exit,
      we can run up to 8 vcpu threads from up to 4 different VMs, and we can
      run multiple vcores with 2 or 4 vcpus per vcore.
      
      Dynamic micro-threading is only available if the static configuration
      of the cores is whole-core mode (unsplit), and only on POWER8.
      
      To manage this, we introduce a new kvm_split_mode struct which is
      shared across all of the subcores in the core, with a pointer in the
      paca on each thread.  In addition we extend the core_info struct to
      have information on each subcore.  When deciding whether to add a
      vcore to the set already on the core, we now have two possibilities:
      (a) piggyback the vcore onto an existing subcore, or (b) start a new
      subcore.
      
      Currently, when any vcpu needs to exit the guest and switch to host
      virtual mode, we interrupt all the threads in all subcores and switch
      the core back to whole-core mode.  It may be possible in future to
      allow some of the subcores to keep executing in the guest while
      subcore 0 switches to the host, but that is not implemented in this
      patch.
      
      This adds a module parameter called dynamic_mt_modes which controls
      which micro-threading (split-core) modes the code will consider, as a
      bitmap.  In other words, if it is 0, no micro-threading mode is
      considered; if it is 2, only 2-way micro-threading is considered; if
      it is 4, only 4-way, and if it is 6, both 2-way and 4-way
      micro-threading mode will be considered.  The default is 6.
      
      With this, we now have secondary threads which are the primary thread
      for their subcore and therefore need to do the MMU switch.  These
      threads will need to be started even if they have no vcpu to run, so
      we use the vcore pointer in the PACA rather than the vcpu pointer to
      trigger them.
      
      It is now possible for thread 0 to find that an exit has been
      requested before it gets to switch the subcore state to the guest.  In
      that case we haven't added the guest's timebase offset to the
      timebase, so we need to be careful not to subtract the offset in the
      guest exit path.  In fact we just skip the whole path that switches
      back to host context, since we haven't switched to the guest context.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      b4deba5c
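      A hedged sketch of how the dynamic_mt_modes bitmap described above is
      consumed; the module-parameter declaration mirrors the text, while the
      helper is illustrative:

        /* Bit value 2 permits 2-way split-core, bit value 4 permits 4-way;
         * the default of 6 permits both, 0 disables dynamic micro-threading. */
        static int dynamic_mt_modes = 6;
        module_param(dynamic_mt_modes, int, S_IRUGO | S_IWUSR);

        static bool split_mode_allowed(int n_subcores)
        {
                return n_subcores == 1 || (dynamic_mt_modes & n_subcores);
        }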
    • KVM: PPC: Book3S HV: Make use of unused threads when running guests · ec257165
      Authored by Paul Mackerras
      When running a virtual core of a guest that is configured with fewer
      threads per core than the physical cores have, the extra physical
      threads are currently unused.  This makes it possible to use them to
      run one or more other virtual cores from the same guest when certain
      conditions are met.  This applies on POWER7, and on POWER8 to guests
      with one thread per virtual core.  (It doesn't apply to POWER8 guests
      with multiple threads per vcore because they require a 1-1 virtual to
      physical thread mapping in order to be able to use msgsndp and the
      TIR.)
      
      The idea is that we maintain a list of preempted vcores for each
      physical cpu (i.e. each core, since the host runs single-threaded).
      Then, when a vcore is about to run, it checks to see if there are
      any vcores on the list for its physical cpu that could be
      piggybacked onto this vcore's execution.  If so, those additional
      vcores are put into state VCORE_PIGGYBACK and their runnable VCPU
      threads are started as well as the original vcore, which is called
      the master vcore.
      
      After the vcores have exited the guest, the extra ones are put back
      onto the preempted list if any of their VCPUs are still runnable and
      not idle.
      
      This means that vcpu->arch.ptid is no longer necessarily the same as
      the physical thread that the vcpu runs on.  In order to make it easier
      for code that wants to send an IPI to know which CPU to target, we
      now store that in a new field in struct vcpu_arch, called thread_cpu.
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Tested-by: Laurent Vivier <lvivier@redhat.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
      ec257165
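      A hedged sketch of the thread_cpu field mentioned above: code that
      needs to interrupt a vcpu targets the physical thread it is actually
      running on rather than deriving it from ptid. Apart from
      vcpu->arch.thread_cpu, the names are illustrative:

        static void kick_vcpu_thread(struct kvm_vcpu *vcpu)
        {
                int cpu = vcpu->arch.thread_cpu;

                /* ptid no longer maps 1:1 to a physical thread */
                if (cpu >= 0)
                        smp_send_reschedule(cpu);
        }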
  13. 20 August 2015 (1 commit)
  14. 19 August 2015 (1 commit)
  15. 18 August 2015 (4 commits)
  16. 14 August 2015 (1 commit)
    • powerpc/eeh: Probe after unbalanced kref check · e642d11b
      Authored by Daniel Axtens
      In the complete hotplug case, EEH PEs are supposed to be released
      and set to NULL. Normally, this is done by eeh_remove_device(),
      which is called from pcibios_release_device().
      
      However, if something is holding a kref to the device, it will not
      be released, and the PE will remain. eeh_add_device_late() has
      a check for this which will explicitly destroy the PE in this case.
      
      This check in eeh_add_device_late() occurs after a call to
      eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(),
      which will exit without probing if there is an existing PE.
      
      This means that on PowerNV, devices with outstanding krefs will not
      be rediscovered by EEH correctly after a complete hotplug. This is
      affecting CXL (CAPI) devices in the field.
      
      Put the probe after the kref check so that the PE is destroyed
      and affected devices are correctly rediscovered by EEH.
      
      Fixes: d91dafc0 ("powerpc/eeh: Delay probing EEH device during hotplug")
      Cc: stable@vger.kernel.org
      Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Daniel Axtens <dja@axtens.net>
      Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      e642d11b
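      A hedged sketch of the reordering described above; the call names are
      only illustrative, the point being that the stale-PE check now runs
      before eeh_ops->probe():

        /* eeh_add_device_late(), illustrative ordering only */
        if (edev->pe) {
                /* an unbalanced kref kept the device (and PE) alive across
                 * the hot unplug: destroy the stale PE first, otherwise the
                 * PowerNV probe bails out early and EEH never rediscovers
                 * the device */
                eeh_rmv_from_parent_pe(edev);
        }
        eeh_ops->probe(pdn, NULL);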
  17. 12 August 2015 (2 commits)
  18. 11 August 2015 (1 commit)
    • cleanup IORESOURCE_CACHEABLE vs ioremap() · 92b19ff5
      Authored by Dan Williams
      Quoting Arnd:
          I was thinking the opposite approach and basically removing all uses
          of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
          them, and we can probably replace them all with hardcoded
          ioremap_cached() calls in the cases they are actually useful.
      
      All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
      ioremap_nocache() if the resource is cacheable, however ioremap() is
      uncached by default. Clearly none of the existing usages care about the
      cacheability. Particularly devm_ioremap_resource() never worked as
      advertised since it always fell back to plain ioremap().
      
      Clean this up as the new direction we want is to convert
      ioremap_<type>() usages to memremap(..., flags).
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      92b19ff5
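      A hedged sketch of the caller pattern this cleanup removes, per the
      commit text; resource and variable names are illustrative:

        /* The flag never changed the mapping type: ioremap() is uncached
         * by default, so both branches produced an uncached mapping. */
        void __iomem *addr;

        if (res->flags & IORESOURCE_CACHEABLE)
                addr = ioremap(res->start, resource_size(res));
        else
                addr = ioremap_nocache(res->start, resource_size(res));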
  19. 10 August 2015 (1 commit)
    • powerpc/time: Migrate to new 'set-state' interface · 37a13e78
      Authored by Viresh Kumar
      Migrate powerpc driver to the new 'set-state' interface provided by
      clockevents core, the earlier 'set-mode' interface is marked obsolete
      now.
      
      This also enables us to implement callbacks for new states of clockevent
      devices, for example: ONESHOT_STOPPED.
      
      We weren't doing anything in ->set_mode(ONESHOT) and so
      set_state_oneshot() isn't implemented.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      37a13e78
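      A hedged sketch of the interface shape being migrated to: per-state
      callbacks replace the single set_mode hook. The device and callback
      names are illustrative of the powerpc decrementer case described
      above:

        static int decrementer_shutdown(struct clock_event_device *dev)
        {
                /* stop the decrementer; there is no set_state_oneshot
                 * callback, matching the empty set_mode(ONESHOT) handler
                 * noted above */
                return 0;
        }

        static struct clock_event_device decrementer_clockevent = {
                .name               = "decrementer",
                .features           = CLOCK_EVT_FEAT_ONESHOT,
                .set_state_shutdown = decrementer_shutdown,
        };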
  20. 08 August 2015 (1 commit)
    • powerpc/fsl: Force coherent memory on e500mc derivatives · c6023202
      Authored by Scott Wood
      In CoreNet systems it is not allowed to mix M and non-M mappings to the
      same memory, and coherent DMA accesses are considered to be M mappings
      for this purpose.  Ignoring this has been observed to cause hard
      lockups in non-SMP kernels on e6500.
      
      Furthermore, e6500 implements the LRAT (logical to real address table)
      which allows KVM guests to control the WIMGE bits.  This means that
      KVM cannot force the M bit on the way it usually does, so the guest had
      better set it itself.
      Signed-off-by: Scott Wood <scottwood@freescale.com>
      c6023202
  21. 07 August 2015 (1 commit)
    • signal: fix information leak in copy_siginfo_from_user32 · 3c00cb5e
      Authored by Amanieu d'Antras
      This function can leak kernel stack data when the user siginfo_t has a
      positive si_code value.  The top 16 bits of si_code describe which fields
      in the siginfo_t union are active, but they are treated inconsistently
      between copy_siginfo_from_user32, copy_siginfo_to_user32 and
      copy_siginfo_to_user.
      
      copy_siginfo_from_user32 is called from rt_sigqueueinfo and
      rt_tgsigqueueinfo, in which the user has full control over the top 16
      bits of si_code.
      
      This fixes the following information leaks:
      x86:   8 bytes leaked when sending a signal from a 32-bit process to
             itself. This leak grows to 16 bytes if the process uses x32.
             (si_code = __SI_CHLD)
      x86:   100 bytes leaked when sending a signal from a 32-bit process to
             a 64-bit process. (si_code = -1)
      sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
             64-bit process. (si_code = any)
      
      parisc and s390 have similar bugs, but they are not vulnerable because
      rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
      to a different process.  These bugs are also fixed for consistency.
      Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3c00cb5e
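      A hedged sketch (not the actual patch) of the layout selector the
      commit describes: the top 16 bits of a kernel-generated si_code say
      which siginfo union member is live, so the copy routines must agree
      on them and must never trust a positive, user-supplied si_code. The
      mask below is illustrative:

        #define SI_LAYOUT_MASK  0xffff0000u     /* illustrative constant */

        static inline unsigned int si_layout(int si_code)
        {
                /* e.g. the __SI_CHLD layout selects the _sigchld member */
                return si_code & SI_LAYOUT_MASK;
        }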
  22. 06 August 2015 (3 commits)