提交 · 405e44f2e312dd5dd63e5a9f459bffcbcd4368ef · openeuler / raspberrypi-kernel

03 11月, 2011 3 次提交

powerpc: get_hugepte() don't put_page() the wrong page · 405e44f2

由 Andrea Arcangeli 提交于 11月 02, 2011

"page" may have changed to point to the next hugepage after the loop
completed, The references have been taken on the head page, so the
put_page must happen there too.

This is a longstanding issue pre-thp inclusion.

It's totally unclear how these page_cache_add_speculative and
pte_val(pte) != pte_val(*ptep) checks are necessary across all the
powerpc gup_fast code, when x86 doesn't need any of that: there's no way
the page can be freed with irq disabled so we're guaranteed the
atomic_inc will happen on a page with page_count > 0 (so not needing the
speculative check).

The pte check is also meaningless on x86: no need to rollback on x86 if
the pte changed, because the pte can still change a CPU tick after the
check succeeded and it won't be rolled back in that case.  The important
thing is we got a reference on a valid page that was mapped there a CPU
tick ago.  So not knowing the soft tlb refill code of ppc64 in great
detail I'm not removing the "speculative" page_count increase and the
pte checks across all the code, but unless there's a strong reason for
it they should be later cleaned up too.

If a pte can change from huge to non-huge (like it could happen with
THP) passing a pte_t *ptep to gup_hugepte() would also require to repeat
the is_hugepd in gup_hugepte(), but that shouldn't happen with hugetlbfs
only so I'm not altering that.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

405e44f2

powerpc: remove superfluous PageTail checks on the pte gup_fast · 2839bdc1

由 Andrea Arcangeli 提交于 11月 02, 2011

This part of gup_fast doesn't seem capable of handling hugetlbfs ptes,
those should be handled by gup_hugepd only, so these checks are
superfluous.

Plus if this wasn't a noop, it would have oopsed because, the insistence
of using the speculative refcounting would trigger a VM_BUG_ON if a tail
page was encountered in the page_cache_get_speculative().
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2839bdc1

mm: thp: tail page refcounting fix · 70b50f94

由 Andrea Arcangeli 提交于 11月 02, 2011

Michel while working on the working set estimation code, noticed that
calling get_page_unless_zero() on a random pfn_to_page(random_pfn)
wasn't safe, if the pfn ended up being a tail page of a transparent
hugepage under splitting by __split_huge_page_refcount().

He then found the problem could also theoretically materialize with
page_cache_get_speculative() during the speculative radix tree lookups
that uses get_page_unless_zero() in SMP if the radix tree page is freed
and reallocated and get_user_pages is called on it before
page_cache_get_speculative has a chance to call get_page_unless_zero().

So the best way to fix the problem is to keep page_tail->_count zero at
all times.  This will guarantee that get_page_unless_zero() can never
succeed on any tail page.  page_tail->_mapcount is guaranteed zero and
is unused for all tail pages of a compound page, so we can simply
account the tail page references there and transfer them to
tail_page->_count in __split_huge_page_refcount() (in addition to the
head_page->_mapcount).

While debugging this s/_count/_mapcount/ change I also noticed get_page is
called by direct-io.c on pages returned by get_user_pages.  That wasn't
entirely safe because the two atomic_inc in get_page weren't atomic.  As
opposed to other get_user_page users like secondary-MMU page fault to
establish the shadow pagetables would never call any superflous get_page
after get_user_page returns.  It's safer to make get_page universally safe
for tail pages and to use get_page_foll() within follow_page (inside
get_user_pages()).  get_page_foll() is safe to do the refcounting for tail
pages without taking any locks because it is run within PT lock protected
critical sections (PT lock for pte and page_table_lock for
pmd_trans_huge).

The standard get_page() as invoked by direct-io instead will now take
the compound_lock but still only for tail pages.  The direct-io paths
are usually I/O bound and the compound_lock is per THP so very
finegrined, so there's no risk of scalability issues with it.  A simple
direct-io benchmarks with all lockdep prove locking and spinlock
debugging infrastructure enabled shows identical performance and no
overhead.  So it's worth it.  Ideally direct-io should stop calling
get_page() on pages returned by get_user_pages().  The spinlock in
get_page() is already optimized away for no-THP builds but doing
get_page() on tail pages returned by GUP is generally a rare operation
and usually only run in I/O paths.

This new refcounting on page_tail->_mapcount in addition to avoiding new
RCU critical sections will also allow the working set estimation code to
work without any further complexity associated to the tail page
refcounting with THP.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Reported-by: NMichel Lespinasse <walken@google.com>
Reviewed-by: NMichel Lespinasse <walken@google.com>
Reviewed-by: NMinchan Kim <minchan.kim@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: <stable@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

70b50f94

19 7月, 2011 1 次提交

powerpc/mm: Fix output of total_ram. · f7ba2991

由 Tony Breeds 提交于 7月 04, 2011

On 32bit platforms that support >= 4GB memory total_ram was truncated.
This creates a confusing printk:
	Top of RAM: 0x100000000, Total RAM: 0x0
Fix that:
	Top of RAM: 0x100000000, Total RAM: 0x100000000
Signed-off-by: NTony Breeds <tony@bakeyournoodle.com>
Acked-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f7ba2991

12 7月, 2011 4 次提交

powerpc/47x: allow kernel to be loaded in higher physical memory · 9661534d

由 Dave Kleikamp 提交于 7月 04, 2011

The 44x code (which is shared by 47x) assumes the available physical memory
begins at 0x00000000.  This is not necessarily the case in an AMP
environment.

Support CONFIG_RELOCATABLE for 476 in order to allow the kernel to be
loaded into a higher memory range.
Signed-off-by: NTony Breeds <tony@bakeyournoodle.com>
Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>

9661534d

powerpc/44x: don't use tlbivax on AMP systems · 91b191c7

由 Dave Kleikamp 提交于 7月 04, 2011

Since other OS's may be running on the other cores don't use tlbivax
Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: NTony Breeds <tony@bakeyournoodle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>

91b191c7

KVM: PPC: book3s_hv: Add support for PPC970-family processors · 9e368f29

由 Paul Mackerras 提交于 6月 29, 2011

This adds support for running KVM guests in supervisor mode on those
PPC970 processors that have a usable hypervisor mode.  Unfortunately,
Apple G5 machines have supervisor mode disabled (MSR[HV] is forced to
1), but the YDL PowerStation does have a usable hypervisor mode.

There are several differences between the PPC970 and POWER7 in how
guests are managed.  These differences are accommodated using the
CPU_FTR_ARCH_201 (PPC970) and CPU_FTR_ARCH_206 (POWER7) CPU feature
bits.  Notably, on PPC970:

* The LPCR, LPID or RMOR registers don't exist, and the functions of
  those registers are provided by bits in HID4 and one bit in HID0.

* External interrupts can be directed to the hypervisor, but unlike
  POWER7 they are masked by MSR[EE] in non-hypervisor modes and use
  SRR0/1 not HSRR0/1.

* There is no virtual RMA (VRMA) mode; the guest must use an RMO
  (real mode offset) area.

* The TLB entries are not tagged with the LPID, so it is necessary to
  flush the whole TLB on partition switch.  Furthermore, when switching
  partitions we have to ensure that no other CPU is executing the tlbie
  or tlbsync instructions in either the old or the new partition,
  otherwise undefined behaviour can occur.

* The PMU has 8 counters (PMC registers) rather than 6.

* The DSCR, PURR, SPURR, AMR, AMOR, UAMOR registers don't exist.

* The SLB has 64 entries rather than 32.

* There is no mediated external interrupt facility, so if we switch to
  a guest that has a virtual external interrupt pending but the guest
  has MSR[EE] = 0, we have to arrange to have an interrupt pending for
  it so that we can get control back once it re-enables interrupts.  We
  do that by sending ourselves an IPI with smp_send_reschedule after
  hard-disabling interrupts.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

9e368f29

powerpc, KVM: Split HVMODE_206 cpu feature bit into separate HV and architecture bits · 969391c5

由 Paul Mackerras 提交于 6月 29, 2011

This replaces the single CPU_FTR_HVMODE_206 bit with two bits, one to
indicate that we have a usable hypervisor mode, and another to indicate
that the processor conforms to PowerISA version 2.06.  We also add
another bit to indicate that the processor conforms to ISA version 2.01
and set that for PPC970 and derivatives.

Some PPC970 chips (specifically those in Apple machines) have a
hypervisor mode in that MSR[HV] is always 1, but the hypervisor mode
is not useful in the sense that there is no way to run any code in
supervisor mode (HV=0 PR=0).  On these processors, the LPES0 and LPES1
bits in HID4 are always 0, and we use that as a way of detecting that
hypervisor mode is not useful.

Where we have a feature section in assembly code around code that
only applies on POWER7 in hypervisor mode, we use a construct like

END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)

The definition of END_FTR_SECTION_IFSET is such that the code will
be enabled (not overwritten with nops) only if all bits in the
provided mask are set.

Note that the CPU feature check in __tlbie() only needs to check the
ARCH_206 bit, not the HVMODE bit, because __tlbie() can only get called
if we are running bare-metal, i.e. in hypervisor mode.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

969391c5

08 7月, 2011 1 次提交

powerpc: Create next_tlbcam_idx percpu variable for FSL_BOOKE · 3160b097

由 Becky Bruce 提交于 6月 28, 2011

This is used to round-robin TLBCAM entries.
Signed-off-by: NBecky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

3160b097

01 7月, 2011 1 次提交

perf: Remove the nmi parameter from the swevent and overflow interface · a8b0ca17

由 Peter Zijlstra 提交于 6月 27, 2011

The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

  - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
  - tracepoint: nmi=0; since tracepoint could be from NMI context.
  - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>

a8b0ca17

30 6月, 2011 2 次提交

powerpc: Add printk companion for ppc_md.progress · a9c0f41b

由 Dave Carroll 提交于 6月 18, 2011

This patch adds a printk companion to replace the udbg progress function
when initmem is freed.
Suggested-by: NMilton Miller <miltonm@bga.com>
Suggested-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NDave Carroll <dcarroll@astekcorp.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a9c0f41b

powerpc: Move free_initmem to common code · 2773fcc8

由 Dave Carroll 提交于 6月 18, 2011

The free_initmem function is basically duplicated in mm/init_32,
and init_64, and is moved to the common 32/64-bit mm/mem.c.

All other sections except init were removed in v2.6.15 by
6c45ab99 (powerpc: Remove section
free() and linker script bits), and therefore the bulk of the executed
code is identical.

This patch also removes updating ppc_md.progress to NULL in the powermac
late_initcall.
Suggested-by: NMilton Miller <miltonm@bga.com>
Suggested-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NDave Carroll <dcarroll@astekcorp.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

2773fcc8

29 6月, 2011 3 次提交

powerpc: mem_init should call memblock_is_reserved with phys_addr_t · 3d41e0f6

由 Becky Bruce 提交于 6月 28, 2011

This has been broken for a while but hasn't been an issue until
now because nobody was reserving regions at high addresses.
Signed-off-by: NBecky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

3d41e0f6

powerpc/book3e-64: use a separate TLB handler when linear map is bolted · f67f4ef5

由 Scott Wood 提交于 6月 22, 2011

On MMUs such as FSL where we can guarantee the entire linear mapping is
bolted, we don't need to worry about linear TLB misses.  If on top of
that we do a full table walk, we get rid of all recursive TLB faults, and
can dispense with some state saving.  This gains a few percent on
TLB-miss-heavy workloads, and around 50% on a benchmark that had a high
rate of virtual page table faults under the normal handler.

While touching the EX_TLB layout, remove EX_TLB_MMUCR0, EX_TLB_SRR0, and
EX_TLB_SRR1 as they're not used.

[BenH: Fixed build with 64K pages (wsp config)]
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f67f4ef5

arch/powerpc: use printk_ratelimited instead of printk_ratelimit · 76462232

由 Christian Dietrich 提交于 6月 04, 2011

Since printk_ratelimit() shouldn't be used anymore (see comment in
include/linux/printk.h), replace it with printk_ratelimited.
Signed-off-by: NChristian Dietrich <christian.dietrich@informatik.uni-erlangen.de>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

76462232

17 6月, 2011 1 次提交

powerpc/book3e: Clarify HW table walk enable/disable message · 32d206eb

由 Kumar Gala 提交于 5月 19, 2011

Before if we didn't support or enable HW table walk we'd get a messaage
like:

MMU: Book3E Page Tables Disabled

Which is a bit misleading.  Now it will say:

MMU: Book3E HW tablewalk not supported
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

32d206eb

09 6月, 2011 1 次提交

powerpc: Force page alignment for initrd reserved memory · 307cfe71

由 Benjamin Herrenschmidt 提交于 6月 09, 2011

When using 64K pages with a separate cpio rootfs, U-Boot will align
the rootfs on a 4K page boundary. When the memory is reserved, and
subsequent early memblock_alloc is called, it will allocate memory
between the 64K page alignment and reserved memory. When the reserved
memory is subsequently freed, it is done so by pages, causing the
early memblock_alloc requests to be re-used, which in my case, caused
the device-tree to be clobbered.

This patch forces the reserved memory for initrd to be kernel page
aligned, and will move the device tree if it overlaps with the range
extension of initrd. This patch will also consolidate the identical
function free_initrd_mem() from mm/init_32.c, init_64.c to mm/mem.c,
and adds the same range extension when freeing initrd. free_initrd_mem()
is also moved to the __init section.

Many thanks to Milton Miller for his input on this patch.

[BenH: Fixed build without CONFIG_BLK_DEV_INITRD]
Signed-off-by: NDave Carroll <dcarroll@astekcorp.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

307cfe71

25 5月, 2011 2 次提交

mm, powerpc: move the RCU page-table freeing into generic code · 26723911

由 Peter Zijlstra 提交于 5月 24, 2011

In case other architectures require RCU freed page-tables to implement
gup_fast() and software filled hashes and similar things, provide the
means to do so by moving the logic into generic code.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Requested-by: NDavid Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

26723911

powerpc: mmu_gather rework · d6bf29b4

由 Peter Zijlstra 提交于 5月 24, 2011

Fix up powerpc to the new mmu_gather stuff.

PPC has an extra batching queue to RCU free the actual pagetable
allocations, use the ARCH extentions for that for now.

For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the
hardware hash-table, keep using per-cpu arrays but flush on context switch
and use a TLF bit to track the lazy_mmu state.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d6bf29b4

19 5月, 2011 2 次提交

powerpc: Remove ioremap_flags · 40f1ce7f

由 Anton Blanchard 提交于 5月 08, 2011

We have a confusing number of ioremap functions. Make things just a
bit simpler by merging ioremap_flags and ioremap_prot.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

40f1ce7f

powerpc: Add ioremap_wc · be135f40

由 Anton Blanchard 提交于 5月 08, 2011

Add ioremap_wc so drivers can request write combining on kernel
mappings.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

be135f40

06 5月, 2011 1 次提交

powerpc: Fix compile with icwsx support · 79af2187

由 Stephen Rothwell 提交于 5月 06, 2011

Due to a collision between NO_CONTEXT->MMU_NO_CONTEXT change and
Anton's patch.
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

79af2187

04 5月, 2011 3 次提交

powerpc: Convert old cpumask API into new one · 104699c0

由 KOSAKI Motohiro 提交于 4月 28, 2011

Adapt new API.

Almost change is trivial. Most important change is the below line
because we plan to change task->cpus_allowed implementation.

-       ctx->cpus_allowed = current->cpus_allowed;
Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

104699c0

powerpc: Add Initiate Coprocessor Store Word (icswx) support · 851d2e2f

由 Tseng-Hui (Frank) Lin 提交于 5月 02, 2011

Icswx is a PowerPC instruction to send data to a co-processor. On Book-S
processors the LPAR_ID and process ID (PID) of the owning process are
registered in the window context of the co-processor at initialization
time. When the icswx instruction is executed the L2 generates a cop-reg
transaction on PowerBus. The transaction has no address and the
processor does not perform an MMU access to authenticate the transaction.
The co-processor compares the LPAR_ID and the PID included in the
transaction and the LPAR_ID and PID held in the window context to
determine if the process is authorized to generate the transaction.

The OS needs to assign a 16-bit PID for the process. This cop-PID needs
to be updated during context switch. The cop-PID needs to be destroyed
when the context is destroyed.
Signed-off-by: NSonny Rao <sonnyrao@linux.vnet.ibm.com>
Signed-off-by: NTseng-Hui (Frank) Lin <thlin@linux.vnet.ibm.com>
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

851d2e2f

powerpc: Use new CPU feature bit to select 2.06 tlbie · a32e252f

由 Michael Neuling 提交于 4月 06, 2011

This removes MMU_FTR_TLBIE_206 as we can now use CPU_FTR_HVMODE_206.  It
also changes the logic to select which tlbie to use to be based on this
new CPU feature bit.

This also duplicates the ASM_FTR_IF/SET/CLR defines for CPU features
(copied from MMU features).
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a32e252f

27 4月, 2011 3 次提交

powerpc: Free up some CPU feature bits by moving out MMU-related features · 44ae3ab3

由 Matt Evans 提交于 4月 06, 2011

Some of the 64bit PPC CPU features are MMU-related, so this patch moves
them to MMU_FTR_ bits. All cpu_has_feature()-style tests are moved to
mmu_has_feature(), and seven feature bits are freed as a result.
Signed-off-by: NMatt Evans <matt@ozlabs.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

44ae3ab3

powerpc/numa: Look for ibm, associativity-reference-points at the root · e70606eb

由 Michael Ellerman 提交于 4月 10, 2011

If we don't find ibm,associativity-reference-points as a child of
/rtas, look for it at the root of the tree instead. We use this on
Book3E where we have no RTAS but still use the sPAPR conventions
for NUMA.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

e70606eb

powerpc: Add TLB size detection for TYPE_3E MMUs · bd491781

由 Benjamin Herrenschmidt 提交于 4月 14, 2011

Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

bd491781

20 4月, 2011 3 次提交

powerpc: Replace open coded instruction patching with patch_instruction/patch_branch · b68a70c4

由 Anton Blanchard 提交于 4月 04, 2011

There are a few places we patch instructions without using
patch_instruction and patch_branch, probably because they
predated it. Fix it.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

b68a70c4

powerpc/nohash: Allocate stale_map[cpu] on CPU_UP_PREPARE not CPU_ONLINE · f5be2dc0

由 Michael Ellerman 提交于 4月 04, 2011

Currently we allocate the stale_map for a cpu when it comes online,
this leaves open a small window where a process can be scheduled
on the cpu before the stale_map is allocated. Instead allocate
the stale_map at CPU_UP_PREPARE time, that way it will be always
available before tasks start running.

It is possible the cpu fails to come up, in which case we should free
the stale_map, so add a CPU_UP_CANCELED case to do that.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f5be2dc0

powerpc/mm: Standardise on MMU_NO_CONTEXT · 5e8e7b40

由 Michael Ellerman 提交于 4月 12, 2011

Use MMU_NO_CONTEXT as the initialiser for mm_context.id on
nohash and hash64.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

5e8e7b40

05 4月, 2011 1 次提交

Documentation: fix minor typos/spelling · f65e51d7

由 Sylvestre Ledru 提交于 4月 04, 2011

Fix some minor typos:
 * informations => information
 * there own => their own
 * these => this
Signed-off-by: NSylvestre Ledru <sylvestre.ledru@scilab.org>
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f65e51d7

31 3月, 2011 1 次提交

Fix common misspellings · 25985edc

由 Lucas De Marchi 提交于 3月 30, 2011

Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>

25985edc

30 3月, 2011 1 次提交

powerpc: Implement dma_mmap_coherent() · 6090912c

由 Benjamin Herrenschmidt 提交于 3月 24, 2011

This is used by Alsa to mmap buffers allocated with dma_alloc_coherent()
into userspace. We need a special variant to handle machines with
non-coherent DMAs as those buffers have "special" virt addresses and
require non-cachable mappings
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

6090912c

10 3月, 2011 1 次提交

powerpc/pseries: Disable VPNH feature · 36e8695c

由 Benjamin Herrenschmidt 提交于 3月 09, 2011

This feature triggers nasty races in the scheduler between the
rebuilding of the topology and the load balancing code, causing
the machine to hang.

Disable it for now until the races are fixed.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

36e8695c

02 3月, 2011 2 次提交

powerpc: Fix memory limits when starting at a non-zero address · 6dd22700

由 Scott Wood 提交于 1月 27, 2011

memblock_enforce_memory_limit() takes the desired maximum quantity of memory
to end up with, not an address above which memory will not be used.
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

6dd22700

powerpc/mm: Make hpte_need_flush() safe for preemption · f342552b

由 Peter Zijlstra 提交于 2月 24, 2011

hpte_need_flush() might be called outside of a preempt section
when manipulating the kernel page tables, so we need to use the
appopriate variants of per-cpu variable accesses. There should
be no risk of being in the middle of a batch and a context
switch will flush any pending batch.

[Patch extracted from a larger patch in Peter's preemptible
 mmu_gather series]
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NHugh Dickins <hughd@google.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f342552b

07 2月, 2011 3 次提交

powerpc/numa: Fix bug in unmap_cpu_from_node · 429f4d8d

由 Anton Blanchard 提交于 1月 29, 2011

When converting to the new cpumask code I screwed up:

-       if (cpu_isset(cpu, numa_cpumask_lookup_table[node])) {
-               cpu_clear(cpu, numa_cpumask_lookup_table[node]);
+       if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
+               cpumask_set_cpu(cpu, node_to_cpumask_map[node]);

This was introduced in commit 25863de0 (powerpc/cpumask: Convert NUMA code
to new cpumask API)

Fix it.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

429f4d8d

powerpc/numa: Disable VPHN on dedicated processor partitions · fe5cfd63

由 Anton Blanchard 提交于 1月 29, 2011

There is no need to start up the timer and monitor topology changes on a
dedicated processor partition, so disable it.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

fe5cfd63

powerpc/numa: Add length when creating OF properties via VPHN · c0e5e46f

由 Anton Blanchard 提交于 1月 29, 2011

The rest of the NUMA code expects an OF associativity property with
the first cell containing the length. Without this fix all topology changes
cause us to misparse the property and put the cpu into node 0.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

c0e5e46f