提交 · b35a35b556f5e6b7993ad0baf20173e75c09ce8c · openanolis / cloud-kernel

03 11月, 2011 7 次提交

thp: share get_huge_page_tail() · b35a35b5

由 Andrea Arcangeli 提交于 11月 02, 2011

This avoids duplicating the function in every arch gup_fast.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b35a35b5

powerpc: gup_huge_pmd() return 0 if pte changes · cf592bf7

由 Andrea Arcangeli 提交于 11月 02, 2011

powerpc didn't return 0 in that case, if it's rolling back the *nr pointer
it should also return zero to avoid adding pages to the array at the wrong
offset.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cf592bf7

powerpc: gup_hugepte() support THP based tail recounting · 3526741f

由 Andrea Arcangeli 提交于 11月 02, 2011

Up to this point the code assumed old refcounting for hugepages (pre-thp).
This updates the code directly to the thp mapcount tail page refcounting.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3526741f

powerpc: gup_hugepte() avoid freeing the head page too many times · 85964684

由 Andrea Arcangeli 提交于 11月 02, 2011

We only taken "refs" pins on the head page not "*nr" pins.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

85964684

powerpc: get_hugepte() don't put_page() the wrong page · 405e44f2

由 Andrea Arcangeli 提交于 11月 02, 2011

"page" may have changed to point to the next hugepage after the loop
completed, The references have been taken on the head page, so the
put_page must happen there too.

This is a longstanding issue pre-thp inclusion.

It's totally unclear how these page_cache_add_speculative and
pte_val(pte) != pte_val(*ptep) checks are necessary across all the
powerpc gup_fast code, when x86 doesn't need any of that: there's no way
the page can be freed with irq disabled so we're guaranteed the
atomic_inc will happen on a page with page_count > 0 (so not needing the
speculative check).

The pte check is also meaningless on x86: no need to rollback on x86 if
the pte changed, because the pte can still change a CPU tick after the
check succeeded and it won't be rolled back in that case.  The important
thing is we got a reference on a valid page that was mapped there a CPU
tick ago.  So not knowing the soft tlb refill code of ppc64 in great
detail I'm not removing the "speculative" page_count increase and the
pte checks across all the code, but unless there's a strong reason for
it they should be later cleaned up too.

If a pte can change from huge to non-huge (like it could happen with
THP) passing a pte_t *ptep to gup_hugepte() would also require to repeat
the is_hugepd in gup_hugepte(), but that shouldn't happen with hugetlbfs
only so I'm not altering that.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

405e44f2

powerpc: remove superfluous PageTail checks on the pte gup_fast · 2839bdc1

由 Andrea Arcangeli 提交于 11月 02, 2011

This part of gup_fast doesn't seem capable of handling hugetlbfs ptes,
those should be handled by gup_hugepd only, so these checks are
superfluous.

Plus if this wasn't a noop, it would have oopsed because, the insistence
of using the speculative refcounting would trigger a VM_BUG_ON if a tail
page was encountered in the page_cache_get_speculative().
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2839bdc1

mm: thp: tail page refcounting fix · 70b50f94

由 Andrea Arcangeli 提交于 11月 02, 2011

Michel while working on the working set estimation code, noticed that
calling get_page_unless_zero() on a random pfn_to_page(random_pfn)
wasn't safe, if the pfn ended up being a tail page of a transparent
hugepage under splitting by __split_huge_page_refcount().

He then found the problem could also theoretically materialize with
page_cache_get_speculative() during the speculative radix tree lookups
that uses get_page_unless_zero() in SMP if the radix tree page is freed
and reallocated and get_user_pages is called on it before
page_cache_get_speculative has a chance to call get_page_unless_zero().

So the best way to fix the problem is to keep page_tail->_count zero at
all times.  This will guarantee that get_page_unless_zero() can never
succeed on any tail page.  page_tail->_mapcount is guaranteed zero and
is unused for all tail pages of a compound page, so we can simply
account the tail page references there and transfer them to
tail_page->_count in __split_huge_page_refcount() (in addition to the
head_page->_mapcount).

While debugging this s/_count/_mapcount/ change I also noticed get_page is
called by direct-io.c on pages returned by get_user_pages.  That wasn't
entirely safe because the two atomic_inc in get_page weren't atomic.  As
opposed to other get_user_page users like secondary-MMU page fault to
establish the shadow pagetables would never call any superflous get_page
after get_user_page returns.  It's safer to make get_page universally safe
for tail pages and to use get_page_foll() within follow_page (inside
get_user_pages()).  get_page_foll() is safe to do the refcounting for tail
pages without taking any locks because it is run within PT lock protected
critical sections (PT lock for pte and page_table_lock for
pmd_trans_huge).

The standard get_page() as invoked by direct-io instead will now take
the compound_lock but still only for tail pages.  The direct-io paths
are usually I/O bound and the compound_lock is per THP so very
finegrined, so there's no risk of scalability issues with it.  A simple
direct-io benchmarks with all lockdep prove locking and spinlock
debugging infrastructure enabled shows identical performance and no
overhead.  So it's worth it.  Ideally direct-io should stop calling
get_page() on pages returned by get_user_pages().  The spinlock in
get_page() is already optimized away for no-THP builds but doing
get_page() on tail pages returned by GUP is generally a rare operation
and usually only run in I/O paths.

This new refcounting on page_tail->_mapcount in addition to avoiding new
RCU critical sections will also allow the working set estimation code to
work without any further complexity associated to the tail page
refcounting with THP.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Reported-by: NMichel Lespinasse <walken@google.com>
Reviewed-by: NMichel Lespinasse <walken@google.com>
Reviewed-by: NMinchan Kim <minchan.kim@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: <stable@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

70b50f94

01 11月, 2011 1 次提交

Cross Memory Attach · fcf63409

由 Christopher Yeoh 提交于 10月 31, 2011

The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call.  There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.

- Use of /proc/pid/mem has been considered, but there are issues with
  using it:
  - Does not allow for specifying iovecs for both src and dest, assuming
    preadv or pwritev was implemented either the area read from or
  written to would need to be contiguous.
  - Currently mem_read allows only processes who are currently
  ptrace'ing the target and are still able to ptrace the target to read
  from the target. This check could possibly be moved to the open call,
  but its not clear exactly what race this restriction is stopping
  (reason  appears to have been lost)
  - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
  domain socket is a bit ugly from a userspace point of view,
  especially when you may have hundreds if not (eventually) thousands
  of processes  that all need to do this with each other
  - Doesn't allow for some future use of the interface we would like to
  consider adding in the future (see below)
  - Interestingly reading from /proc/pid/mem currently actually
  involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems.  Since you need the reader and writer working co-operatively if
the pipe is not drained then you block.  Which requires some wrapping to
do non blocking on the send side or polling on the receive.  In all to all
communication it requires ordering otherwise you can deadlock.  And in the
example of many MPI tasks writing to one MPI task vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could.  For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a mathops (say the reduce is doing a sum) as
this would save us doing a copy.  We don't need to keep a copy of the data
from the source.  I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI).  This interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication.  And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc, other architecture should
mainly (I think) just need to add syscall numbers for the process_vm_readv
and process_vm_writev. There are 32 bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgzSigned-off-by: NChris Yeoh <yeohc@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: <linux-man@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fcf63409

28 10月, 2011 1 次提交

compat: sync compat_stats with statfs. · 1448c721

由 Eric W. Biederman 提交于 10月 17, 2011

This was found by inspection while tracking a similar
bug in compat_statfs64, that has been fixed in mainline
since decemeber.

- This fixes a bug where not all of the f_spare fields
  were cleared on mips and s390.
- Add the f_flags field to struct compat_statfs
- Copy f_flags to userspace in case someone cares.
- Use __clear_user to copy the f_spare field to userspace
  to ensure that all of the elements of f_spare are cleared.
  On some architectures f_spare is has 5 ints and on some
  architectures f_spare only has 4 ints.  Which makes
  the previous technique of clearing each int individually
  broken.

I don't expect anyone actually uses the old statfs system
call anymore but if they do let them benefit from having
the compat and the native version working the same.
Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

1448c721

14 10月, 2011 1 次提交

Kconfig: remove redundant CONFIG_ prefix on two symbols · 6af677ea

由 Paul Bolle 提交于 10月 10, 2011

Signed-off-by: NPaul Bolle <pebolle@tiscali.nl>
Acked-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

6af677ea

05 10月, 2011 1 次提交

drivers/video: fsl-diu-fb: only DIU modes 0 and 1 are supported · c4e5a023

由 Timur Tabi 提交于 9月 28, 2011

The Freescale DIU video controller supports five video "modes", but only
the first two are used by the driver.  The other three are special modes
that don't make sense for a framebuffer driver.  Therefore, there's no
point in keeping a global variable that indicates which mode we're
supposed to use.
Signed-off-by: NTimur Tabi <timur@freescale.com>
Signed-off-by: NFlorian Tobias Schandinat <FlorianSchandinat@gmx.de>

c4e5a023

30 9月, 2011 1 次提交

powerpc: Fix device-tree matching for Apple U4 bridge · 16fa42af

由 Benjamin Herrenschmidt 提交于 9月 29, 2011

Apple Quad G5 has some oddity in it's device-tree which causes the new
generic matching code to fail to relate nodes for PCI-E devices below U4
with their respective struct pci_dev.  This breaks graphics on those
machines among others.

This fixes it using a quirk which copies the node pointer from the host
bridge for the root complex, which makes the generic code work for the
children afterward.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

16fa42af

28 9月, 2011 1 次提交

doc: fix broken references · 395cf969

由 Paul Bolle 提交于 8月 15, 2011

There are numerous broken references to Documentation files (in other
Documentation files, in comments, etc.). These broken references are
caused by typo's in the references, and by renames or removals of the
Documentation files. Some broken references are simply odd.

Fix these broken references, sometimes by dropping the irrelevant text
they were part of.
Signed-off-by: NPaul Bolle <pebolle@tiscali.nl>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

395cf969

26 9月, 2011 14 次提交

KVM: PPC: Implement H_CEDE hcall for book3s_hv in real-mode code · 19ccb76a

由 Paul Mackerras 提交于 7月 23, 2011

With a KVM guest operating in SMT4 mode (i.e. 4 hardware threads per
core), whenever a CPU goes idle, we have to pull all the other
hardware threads in the core out of the guest, because the H_CEDE
hcall is handled in the kernel. This is inefficient.

This adds code to book3s_hv_rmhandlers.S to handle the H_CEDE hcall
in real mode. When a guest vcpu does an H_CEDE hcall, we now only
exit to the kernel if all the other vcpus in the same core are also
idle. Otherwise we mark this vcpu as napping, save state that could
be lost in nap mode (mainly GPRs and FPRs), and execute the nap
instruction. When the thread wakes up, because of a decrementer or
external interrupt, we come back in at kvm_start_guest (from the
system reset interrupt vector), find the `napping' flag set in the
paca, and go to the resume path.

This has some other ramifications. First, when starting a core, we
now start all the threads, both those that are immediately runnable and
those that are idle. This is so that we don't have to pull all the
threads out of the guest when an idle thread gets a decrementer interrupt
and wants to start running. In fact the idle threads will all start
with the H_CEDE hcall returning; being idle they will just do another
H_CEDE immediately and go to nap mode.

This required some changes to kvmppc_run_core() and kvmppc_run_vcpu().
These functions have been restructured to make them simpler and clearer.
We introduce a level of indirection in the wait queue that gets woken
when external and decrementer interrupts get generated for a vcpu, so
that we can have the 4 vcpus in a vcore using the same wait queue.
We need this because the 4 vcpus are being handled by one thread.

Secondly, when we need to exit from the guest to the kernel, we now
have to generate an IPI for any napping threads, because an HDEC
interrupt doesn't wake up a napping thread.

Thirdly, we now need to be able to handle virtual external interrupts
and decrementer interrupts becoming pending while a thread is napping,
and deliver those interrupts to the guest when the thread wakes.
This is done in kvmppc_cede_reentry, just before fast_guest_return.

Finally, since we are not using the generic kvm_vcpu_block for book3s_hv,
and hence not calling kvm_arch_vcpu_runnable, we can remove the #ifdef
from kvm_arch_vcpu_runnable.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

19ccb76a

KVM: PPC: book3s_pr: Simplify transitions between virtual and real mode · 02143947

由 Paul Mackerras 提交于 7月 23, 2011

This simplifies the way that the book3s_pr makes the transition to
real mode when entering the guest.  We now call kvmppc_entry_trampoline
(renamed from kvmppc_rmcall) in the base kernel using a normal function
call instead of doing an indirect call through a pointer in the vcpu.
If kvm is a module, the module loader takes care of generating a
trampoline as it does for other calls to functions outside the module.

kvmppc_entry_trampoline then disables interrupts and jumps to
kvmppc_handler_trampoline_enter in real mode using an rfi[d].
That then uses the link register as the address to return to
(potentially in module space) when the guest exits.

This also simplifies the way that we call the Linux interrupt handler
when we exit the guest due to an external, decrementer or performance
monitor interrupt.  Instead of turning on the MMU, then deciding that
we need to call the Linux handler and turning the MMU back off again,
we now go straight to the handler at the point where we would turn the
MMU on.  The handler will then return to the virtual-mode code
(potentially in the module).

Along the way, this moves the setting and clearing of the HID5 DCBZ32
bit into real-mode interrupts-off code, and also makes sure that
we clear the MSR[RI] bit before loading values into SRR0/1.

The net result is that we no longer need any code addresses to be
stored in vcpu->arch.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

02143947

KVM: PPC: Assemble book3s{,_hv}_rmhandlers.S separately · 177339d7

由 Paul Mackerras 提交于 7月 23, 2011

This makes arch/powerpc/kvm/book3s_rmhandlers.S and
arch/powerpc/kvm/book3s_hv_rmhandlers.S be assembled as
separate compilation units rather than having them #included in
arch/powerpc/kernel/exceptions-64s.S.  We no longer have any
conditional branches between the exception prologs in
exceptions-64s.S and the KVM handlers, so there is no need to
keep their contents close together in the vmlinux image.

In their current location, they are using up part of the limited
space between the first-level interrupt handlers and the firmware
NMI data area at offset 0x7000, and with some kernel configurations
this area will overflow (e.g. allyesconfig), leading to an
"attempt to .org backwards" error when compiling exceptions-64s.S.

Moving them out requires that we add some #includes that the
book3s_{,hv_}rmhandlers.S code was previously getting implicitly
via exceptions-64s.S.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NAlexander Graf <agraf@suse.de>

177339d7

KVM: PPC: Add sanity checking to vcpu_run · af8f38b3

由 Alexander Graf 提交于 8月 10, 2011

There are multiple features in PowerPC KVM that can now be enabled
depending on the user's wishes. Some of the combinations don't make
sense or don't work though.

So this patch adds a way to check if the executing environment would
actually be able to run the guest properly. It also adds sanity
checks if PVR is set (should always be true given the current code
flow), if PAPR is only used with book3s_64 where it works and that
HV KVM is only used in PAPR mode.
Signed-off-by: NAlexander Graf <agraf@suse.de>

af8f38b3

KVM: PPC: Enable the PAPR CAP for Book3S · 930b412a

由 Alexander Graf 提交于 8月 08, 2011

Now that Book3S PV mode can also run PAPR guests, we can add a PAPR cap and
enable it for all Book3S targets. Enabling that CAP switches KVM into PAPR
mode.
Signed-off-by: NAlexander Graf <agraf@suse.de>

930b412a

KVM: PPC: Support SC1 hypercalls for PAPR in PR mode · a668f2bd

由 Alexander Graf 提交于 8月 08, 2011

PAPR defines hypercalls as SC1 instructions. Using these, the guest modifies
page tables and does other privileged operations that it wouldn't be allowed
to do in supervisor mode.

This patch adds support for PR KVM to trap these instructions and route them
through the same PAPR hypercall interface that we already use for HV style
KVM.
Signed-off-by: NAlexander Graf <agraf@suse.de>

a668f2bd

KVM: PPC: Stub emulate CFAR and PURR SPRs · aacf9aa3

由 Alexander Graf 提交于 8月 08, 2011

Recent Linux versions use the CFAR and PURR SPRs, but don't really care about
their contents (yet). So for now, we can simply return 0 when the guest wants
to read them.
Signed-off-by: NAlexander Graf <agraf@suse.de>

aacf9aa3

KVM: PPC: Add PAPR hypercall code for PR mode · 0254f074

由 Alexander Graf 提交于 8月 08, 2011

When running a PAPR guest, we need to handle a few hypercalls in kernel space,
most prominently the page table invalidation (to sync the shadows).

So this patch adds handling for a few PAPR hypercalls to PR mode KVM. I tried
to share the code with HV mode, but it ended up being a lot easier this way
around, as the two differ too much in those details.
Signed-off-by: NAlexander Graf <agraf@suse.de>

---

v1 -> v2:

  - whitespace fix

0254f074

KVM: PPC: Add support for explicit HIOR setting · a15bd354

由 Alexander Graf 提交于 8月 08, 2011

Until now, we always set HIOR based on the PVR, but this is just wrong.
Instead, we should be setting HIOR explicitly, so user space can decide
what the initial HIOR value is - just like on real hardware.

We keep the old PVR based way around for backwards compatibility, but
once user space uses the SREGS based method, we drop the PVR logic.
Signed-off-by: NAlexander Graf <agraf@suse.de>

a15bd354

KVM: PPC: Read out syscall instruction on trap · 77e675ad

由 Alexander Graf 提交于 8月 08, 2011

We have a few traps where we cache the instruction that cause the trap
for analysis later on. Since we now need to be able to distinguish
between SC 0 and SC 1 system calls and the only way to find out which
is which is by looking at the instruction, we also read out the instruction
causing the system call.
Signed-off-by: NAlexander Graf <agraf@suse.de>

77e675ad

KVM: PPC: Interpret SDR1 as HVA in PAPR mode · 04fcc11b

由 Alexander Graf 提交于 8月 08, 2011

When running a PAPR guest, the guest is not allowed to set SDR1 - instead
the HTAB information is held in internal hypervisor structures. But all of
our current code relies on SDR1 and walking the HTAB like on real hardware.

So in order to not be too intrusive, we simply set SDR1 to the HTAB we hold
in host memory. That way we can keep the HTAB in user space, but use it from
kernel space to map the guest.
Signed-off-by: NAlexander Graf <agraf@suse.de>

04fcc11b

KVM: PPC: Check privilege level on SPRs · 317a8fa3

由 Alexander Graf 提交于 8月 08, 2011

We have 3 privilege levels: problem state, supervisor state and hypervisor
state. Each of them can access different SPRs, so we need to check on every
SPR if it's accessible in the respective mode.
Signed-off-by: NAlexander Graf <agraf@suse.de>

317a8fa3

KVM: PPC: Add papr_enabled flag · 9432ba60

由 Alexander Graf 提交于 8月 08, 2011

When running a PAPR guest, some things change. The privilege level drops
from hypervisor to supervisor, SDR1 gets treated differently and we interpret
hypercalls. For bisectability sake, add the flag now, but only enable it when
all the support code is there.
Signed-off-by: NAlexander Graf <agraf@suse.de>

9432ba60

KVM: PPC: move compute_tlbie_rb to book3s common header · db507c30

由 Alexander Graf 提交于 7月 08, 2011

We need the compute_tlbie_rb in _pr and _hv implementations for papr
soon, so let's move it over to a common header file that both
implementations can leverage.
Signed-off-by: NAlexander Graf <agraf@suse.de>

db507c30

15 9月, 2011 1 次提交

treewide: typo 'interrrupt' word corrections. · fb914ebf

由 Vitaliy Ivanov 提交于 6月 23, 2011

Signed-off-by: NJustin P. Mattock <justinmattock@gmail.com>
Signed-off-by: NVitaliy Ivanov <vitalivanov@gmail.com>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

fb914ebf

13 9月, 2011 1 次提交

locking, powerpc: Annotate uic->lock as raw · bccc2f7b

由 Thomas Gleixner 提交于 4月 06, 2010

uic->lock is protecting the interrupt controller hardware. This lock
can not be preempted on -rt.

In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.
Reported-by: NDarcy L. Watkins <dwatkins@tranzeo.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bccc2f7b

31 8月, 2011 3 次提交

powerpc/p1023rds: Fix the error of bank-width of nor flash · 0c81e4b4

由 Chunhe Lan 提交于 8月 12, 2011

In the p1023rds, a physical bus of nor flash is 16 bits width.
The bank-width is width (in bytes) of the bus width. So, the
value of bank-width of nor flash is not one, and it should be
two.
Signed-off-by: NChunhe Lan <Chunhe.Lan@freescale.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

0c81e4b4

powerpc/85xx: enable caam crypto driver by default · e09e2fb5

由 Kim Phillips 提交于 7月 22, 2011

corenet based SoCs have SEC4 h/w, so enable the SEC4 driver,
caam, and the algorithms it supports, and disable the
SEC2/3 driver, talitos.
Signed-off-by: NKim Phillips <kim.phillips@freescale.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

e09e2fb5

powerpc/85xx: enable the audio drivers in the defconfigs · 39c428f7

由 Timur Tabi 提交于 8月 16, 2011

Enable the audio drivers in the non-corenet 85xx defconfigs so that audio
is enabled on the Freescale P1022DS reference board.
Signed-off-by: NTimur Tabi <timur@freescale.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

39c428f7

30 8月, 2011 1 次提交

remove remaining references to nfsservctl · d4d7b2a1

由 Stephen Rothwell 提交于 8月 29, 2011

These were missed in commit f5b94099 "All Arch: remove linkage
for sys_nfsservctl system call" due to them having no sys_ prefix
(presumably).

Cc: NeilBrown <neilb@suse.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-parisc@vger.kernel.org
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Acked-by: NJames Bottomley <James.Bottomley@hansenpartnership.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d4d7b2a1

26 8月, 2011 1 次提交

arch/powerpc/sysdev/fsl_rio.c: correct IECSR register clear value · 671ee7f0

由 Liu Gang-B34182 提交于 8月 25, 2011

This bug causes the IECSR register clear failure.  In this case, the RETE
(retry error threshold exceeded) interrupt will be generated and cannot be
cleared.  So the related ISR may be called persistently.

The RETE bit in IECSR is cleared by writing a 1 to it.
Signed-off-by: NLiu Gang <Gang.Liu@freescale.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

671ee7f0

24 8月, 2011 1 次提交

tty/powerpc: introduce the ePAPR embedded hypervisor byte channel driver · dcd83aaf

由 Timur Tabi 提交于 7月 08, 2011

The ePAPR embedded hypervisor specification provides an API for "byte
channels", which are serial-like virtual devices for sending and receiving
streams of bytes.  This driver provides Linux kernel support for byte
channels via three distinct interfaces:

1) An early-console (udbg) driver.  This provides early console output
through a byte channel.  The byte channel handle must be specified in a
Kconfig option.

2) A normal console driver.  Output is sent to the byte channel designated
for stdout in the device tree.  The console driver is for handling kernel
printk calls.

3) A tty driver, which is used to handle user-space input and output.  The
byte channel used for the console is designated as the default tty.
Signed-off-by: NTimur Tabi <timur@freescale.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

dcd83aaf

19 8月, 2011 1 次提交

net: fix IBM EMAC driver after rename. · 3b3bceef

由 Tony Breeds 提交于 8月 18, 2011

In commit 9aa32835 (ehea/ibm*: Move the
IBM drivers) the IBM_NEW_EMAC* were renames to IBM_EMAC*

The conversion was incomplete so that even if the driver was added to
the .config it wasn't built, but there were no errors).  In this commit
we also update the various defconfigs that use EMAC to use the new
Kconfig symbol, and explicitly add the NET_VENDOR_IBM guard.

We do not explicitly select the Kconfig dependencies, as this would force
EMAC on.  Doing it in the defconfig allows more flexibility.

Tested on a canyondlands board.
Signed-off-by: NTony Breeds <tony@bakeyournoodle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b3bceef

18 8月, 2011 2 次提交

flexcan: Add flexcan device support for p1010rdb. · b489da87

由 holt@sgi.com 提交于 8月 16, 2011

Allow the p1010 processor to select the flexcan network driver.
Signed-off-by: NRobin Holt <holt@sgi.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>,
Acked-by: Wolfgang Grandegger <wg@grandegger.com>,
Cc: U Bhaskar-B22300 <B22300@freescale.com>
Cc: socketcan-core@lists.berlios.de,
Cc: netdev@vger.kernel.org,
Cc: PPC list <linuxppc-dev@lists.ozlabs.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b489da87

flexcan: Fix up fsl-flexcan device tree binding. · 243abbf2

由 holt@sgi.com 提交于 8月 16, 2011

This patch cleans up the documentation of the device-tree binding for
the Flexcan devices on Freescale's PowerPC and ARM cores. Extra
properties are not used by the driver so we are removing them.
Signed-off-by: NRobin Holt <holt@sgi.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>,
Acked-by: Wolfgang Grandegger <wg@grandegger.com>,
Cc: U Bhaskar-B22300 <B22300@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: socketcan-core@lists.berlios.de,
Cc: netdev@vger.kernel.org,
Cc: PPC list <linuxppc-dev@lists.ozlabs.org>
Cc: devicetree-discuss@lists.ozlabs.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

243abbf2

10 8月, 2011 1 次提交

powerpc: Really fix build without CONFIG_PCI · a85fe3fc

由 Benjamin Herrenschmidt 提交于 8月 11, 2011

Brown paper bag day, previous commit wouldn't work very well with modules
enabled. Move the exports into the ifdef.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a85fe3fc

05 8月, 2011 1 次提交

powerpc: Fix build without CONFIG_PCI · 81210c20

由 Benjamin Herrenschmidt 提交于 8月 05, 2011

Commit fea80311
"iomap: make IOPORT/PCI mapping functions conditional"

Broke powerpc build without CONFIG_PCI as we would still define
pci_iomap(), which overlaps with the new empty inline in the headers.

Make our implementation conditional on CONFIG_PCI
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

81210c20

openanolis / cloud-kernel 大约 2 年 前同步成功

openanolis / cloud-kernel
大约 2 年前同步成功