Commits · ece838b6257412647197c072fe59dfc6615df144 · openeuler / Kernel

11 Nov, 2011 5 commits

x86, mm: Use max_low_pfn for ZONE_NORMAL on 64-bit · ece838b6

Authored by Pekka Enberg on Nov 01, 2011

64-bit has no highmem so max_low_pfn is always the same as
'max_pfn'.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1320155902-10424-5-git-send-email-penberg@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, mm: Wrap ZONE_DMA32 with CONFIG_ZONE_DMA32 · 80b3cac9

Authored by Pekka Enberg on Nov 01, 2011

In preparation for unifying 32-bit and 64-bit zone_sizes_init()
make sure ZONE_DMA32 is wrapped in CONFIG_ZONE_DMA32.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Arun Sharma <asharma@fb.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1320155902-10424-4-git-send-email-penberg@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, mm: Use max_pfn instead of highend_pfn · e4794640

Authored by Pekka Enberg on Nov 01, 2011

The 'highend_pfn' variable is always set to 'max_pfn' so just
use the latter directly.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1320155902-10424-3-git-send-email-penberg@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, mm: Move zone init from paging_init() on 64-bit · 4c0b2e5f

Authored by Pekka Enberg on Nov 01, 2011

This patch introduces a zone_sizes_init() helper function on
64-bit to make it more similar to 32-bit init.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1320155902-10424-2-git-send-email-penberg@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, mm: Use MAX_DMA_PFN for ZONE_DMA on 32-bit · ff14c1d0

Authored by Pekka Enberg on Nov 01, 2011

Use MAX_DMA_PFN which represents the 16 MB ISA DMA limit on
32-bit x86 just like we do on 64-bit.

Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1320155902-10424-1-git-send-email-penberg@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
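
Read together, the five x86 zone commits above move 32-bit and 64-bit toward one table of upper page-frame limits per zone. The block below is a minimal user-space sketch of that idea, not the kernel's code: `zone_limits()`, `struct mem_layout`, and the `has_dma32` / `has_highmem` flags are hypothetical stand-ins for `zone_sizes_init()` and the `CONFIG_ZONE_DMA32` / `CONFIG_HIGHMEM` options, and a 4 KiB page size is assumed.

```c
#include <stdbool.h>
#include <string.h>

#define PAGE_SHIFT    12                          /* assumption: 4 KiB pages */
#define MAX_DMA_PFN   (1UL << (24 - PAGE_SHIFT))  /* 16 MB ISA DMA limit */
#define MAX_DMA32_PFN (1UL << (32 - PAGE_SHIFT))  /* 4 GB limit */

enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Hypothetical description of a machine's memory layout. */
struct mem_layout {
    unsigned long max_low_pfn;  /* end of directly mapped memory */
    unsigned long max_pfn;      /* end of all memory */
    bool has_dma32;             /* models CONFIG_ZONE_DMA32 */
    bool has_highmem;           /* models CONFIG_HIGHMEM */
};

/* Fill in the upper page-frame limit of each zone; absent zones stay 0. */
static void zone_limits(const struct mem_layout *m,
                        unsigned long max_zone_pfns[MAX_NR_ZONES])
{
    memset(max_zone_pfns, 0, MAX_NR_ZONES * sizeof(max_zone_pfns[0]));

    max_zone_pfns[ZONE_DMA] = MAX_DMA_PFN;
    if (m->has_dma32)
        max_zone_pfns[ZONE_DMA32] = MAX_DMA32_PFN;
    /* With no highmem, max_low_pfn equals max_pfn, so one rule fits both. */
    max_zone_pfns[ZONE_NORMAL] = m->max_low_pfn;
    if (m->has_highmem)
        max_zone_pfns[ZONE_HIGHMEM] = m->max_pfn;
}
```

With 4 KiB pages the 16 MB limit corresponds to page frame 4096, which is the value the sketch uses for `ZONE_DMA` in both the 32-bit and the 64-bit case.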

07 Nov, 2011 1 commit

mrst pmu: update comment · 22f4521d

Authored by Len Brown on Aug 12, 2011

referenced MeeGo, in particular, but really means Linux, in general.

Signed-off-by: Len Brown <len.brown@intel.com>

03 Nov, 2011 2 commits

thp: share get_huge_page_tail() · b35a35b5

Authored by Andrea Arcangeli on Nov 02, 2011

This avoids duplicating the function in every arch gup_fast.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: thp: tail page refcounting fix · 70b50f94

Authored by Andrea Arcangeli on Nov 02, 2011

Michel while working on the working set estimation code, noticed that
calling get_page_unless_zero() on a random pfn_to_page(random_pfn)
wasn't safe, if the pfn ended up being a tail page of a transparent
hugepage under splitting by __split_huge_page_refcount().

He then found the problem could also theoretically materialize with
page_cache_get_speculative() during the speculative radix tree lookups
that uses get_page_unless_zero() in SMP if the radix tree page is freed
and reallocated and get_user_pages is called on it before
page_cache_get_speculative has a chance to call get_page_unless_zero().

So the best way to fix the problem is to keep page_tail->_count zero at
all times.  This will guarantee that get_page_unless_zero() can never
succeed on any tail page.  page_tail->_mapcount is guaranteed zero and
is unused for all tail pages of a compound page, so we can simply
account the tail page references there and transfer them to
tail_page->_count in __split_huge_page_refcount() (in addition to the
head_page->_mapcount).

While debugging this s/_count/_mapcount/ change I also noticed get_page is
called by direct-io.c on pages returned by get_user_pages.  That wasn't
entirely safe because the two atomic_inc in get_page weren't atomic.  As
opposed to other get_user_page users like secondary-MMU page fault to
establish the shadow pagetables would never call any superfluous get_page
after get_user_page returns.  It's safer to make get_page universally safe
for tail pages and to use get_page_foll() within follow_page (inside
get_user_pages()).  get_page_foll() is safe to do the refcounting for tail
pages without taking any locks because it is run within PT lock protected
critical sections (PT lock for pte and page_table_lock for
pmd_trans_huge).

The standard get_page() as invoked by direct-io instead will now take
the compound_lock but still only for tail pages.  The direct-io paths
are usually I/O bound and the compound_lock is per THP so very
fine-grained, so there's no risk of scalability issues with it.  A simple
direct-io benchmark with all lockdep prove locking and spinlock
debugging infrastructure enabled shows identical performance and no
overhead.  So it's worth it.  Ideally direct-io should stop calling
get_page() on pages returned by get_user_pages().  The spinlock in
get_page() is already optimized away for no-THP builds but doing
get_page() on tail pages returned by GUP is generally a rare operation
and usually only run in I/O paths.

This new refcounting on page_tail->_mapcount in addition to avoiding new
RCU critical sections will also allow the working set estimation code to
work without any further complexity associated to the tail page
refcounting with THP.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: <stable@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
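
The refcounting rule in the commit above (keep the tail page's `_count` at zero, account pins in `_mapcount`, and transfer them when the huge page is split) can be modelled in user space with C11 atomics. This is a simplified sketch under stated assumptions, not the kernel implementation: `struct page_model` and the `*_model` functions are hypothetical names, locking is omitted, and the split step only moves the pinned references from head to tail.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-in for struct page: only the two counters discussed. */
struct page_model {
    atomic_int count;     /* models page->_count */
    atomic_int mapcount;  /* models page->_mapcount, reused for tail pins */
};

/* Take a reference only if the count is not already zero. */
static bool get_page_unless_zero_model(struct page_model *p)
{
    int old = atomic_load(&p->count);

    while (old != 0) {
        /* On failure the CAS reloads 'old', so the loop simply retries. */
        if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Pin a tail page: the head's count and the tail's mapcount both go up,
 * while the tail's own count stays at zero. */
static void get_tail_page_model(struct page_model *head,
                                struct page_model *tail)
{
    atomic_fetch_add(&head->count, 1);
    atomic_fetch_add(&tail->mapcount, 1);
}

/* On split, move the accumulated tail pins into the tail's own count. */
static void split_transfer_model(struct page_model *head,
                                 struct page_model *tail)
{
    int pins = atomic_exchange(&tail->mapcount, 0);

    atomic_fetch_sub(&head->count, pins);
    atomic_fetch_add(&tail->count, pins);
}
```

In this model a speculative `get_page_unless_zero_model()` on a tail page fails for as long as the page belongs to the compound page, however many pins it holds, and can only succeed once the split has handed the pins over.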

02 Nov, 2011 24 commits

um: Fix kmalloc argument order in um/vdso/vma.c · 0d65ede0

Authored by Dave Jones on Oct 24, 2011

kmalloc size is 1st arg, not second.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

Cc: <stable@kernel.org> # 3.0.x
[richard@nod.at: on 3.0 the to be patched file is
arch/um/sys-x86_64/vdso/vma.c]
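
A swapped `kmalloc()` call compiles without complaint because both parameters are integer-typed. The block below is a user-space illustration of that class of bug only: `kmalloc_model()`, `gfp_model_t`, and `GFP_KERNEL_MODEL` (with an arbitrary value) are hypothetical stand-ins, and the real call site in `vma.c` is not reproduced here.

```c
#include <stdlib.h>

typedef unsigned int gfp_model_t;             /* stand-in for the kernel's gfp_t */
#define GFP_KERNEL_MODEL ((gfp_model_t)0xd0)  /* arbitrary illustrative value */

static size_t last_size;  /* records the size the allocator was asked for */

/* Same parameter order as kmalloc(): size first, allocation flags second. */
static void *kmalloc_model(size_t size, gfp_model_t flags)
{
    (void)flags;          /* flags are irrelevant in this user-space model */
    last_size = size;
    return malloc(size);
}
```

Calling `kmalloc_model(sizeof(void *), GFP_KERNEL_MODEL)` requests one pointer's worth of memory, whereas the swapped form `kmalloc_model(GFP_KERNEL_MODEL, sizeof(void *))` silently requests `0xd0` bytes and passes the size as the flags.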

um: we need sys/user.h only on i386 · 38b64aed

Authored by Richard Weinberger on Aug 18, 2011

Signed-off-by: Richard Weinberger <richard@nod.at>

um: merge delay_{32,64}.c · d0af6cbf

Authored by Richard Weinberger on Aug 18, 2011

Signed-off-by: Richard Weinberger <richard@nod.at>

um: kill system-um.h · a34978cb

Authored by Al Viro on Aug 18, 2011

most of it belonged in irqflags.h, actually

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: segment.h is x86-only and needed only there · 46ecca8a

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: unify ptrace_user.h · 966e803a

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: unify KSTK_... · a10c95d8

Authored by Al Viro on Aug 18, 2011

... and switch get_thread_register() to HOST_... for register numbers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: fix gcov build breakage · 4d211093

Authored by Al Viro on Aug 18, 2011

a) exports in gmon_syms.c duplicate kernel/gcov/* ones
b) excluding -pg in vdso compile is not enough - -fprofile-arcs
and -ftest-coverage also needs to be excluded

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: irq_vectors.h just shadows x86 one · 3fb77d72

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: required-features.h is there only to shadow x86 one... · ff9586e9

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: asm/apic.h is there only to shadow the x86 one... · 8807c1d5

Authored by Al Viro on Aug 18, 2011

... so take it to arch/um/x86/asm.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: take ldt.h to arch/x86/um/asm/mm_context.h · b3ee571e

Authored by Al Viro on Aug 18, 2011

it's x86-only and we have no business playing with it in asm/mmu.h; make
the latter have
	struct uml_arch_mm_context arch;
instead of
	struct uml_ldt ldt;
and let arch/<subarch>/um/asm/mm_context.h decide what'll be in there.
While we are at it, kill host_ldt.h - it's not needed in part of places
that include it (we want asm/ldt.h in those) and it can be trivially
expanded into the single remaining one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: merge signal_{32,64}.c · f67aa2ff

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: no need to play with save_sp in signal frame setup anymore · fbe98686

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: increase stack growth cushion in pagefault · c7ea591c

Authored by Al Viro on Aug 18, 2011

analog of [PATCH] i386: let usermode execute the "enter" instruction from
circa 2006.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: merge HOST_... of registers common on i386 and amd64 · 3579a389

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: sanitize paths in sys_call_table* includes · 8edc4147

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: merge os-Linux/tls.c into arch/x86/um/os-Linux/tls.c · 1bbd5f21

Authored by Al Viro on Aug 18, 2011

it's i386-specific; moreover, analogs on other targets have
incompatible interface - PTRACE_GET_THREAD_AREA does exist
elsewhere, but struct user_desc does *not*

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: move asm/desc.h into arch/x86/um/asm · c5cc32fe

Authored by Al Viro on Aug 18, 2011

its only purpose is to shadow the x86 asm/desc.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: merge host_ldt_{32,64}.h · 2014d018

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: merge tls_{32,64}.h · 09e129a6

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: kill shared/task.h and HOST_TASK_REGS · 5ade8878

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: bury unused macros around ptrace.h · 0acdbbeb

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: take arch/um/sys-x86 to arch/x86/um · 5c48b108

Authored by Al Viro on Aug 18, 2011

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

01 Nov, 2011 8 commits

i7core_edac: Drop the edac_mce facility · 4140c542

Authored by Borislav Petkov on Jul 18, 2011

Remove edac_mce pieces and use the normal MCE decoder notifier chain by
retaining the same functionality with considerably less code.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Cross Memory Attach · fcf63409

Authored by Christopher Yeoh on Oct 31, 2011

The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call.  There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.

- Use of /proc/pid/mem has been considered, but there are issues with
  using it:
  - Does not allow for specifying iovecs for both src and dest, assuming
    preadv or pwritev was implemented either the area read from or
    written to would need to be contiguous.
  - Currently mem_read allows only processes who are currently
    ptrace'ing the target and are still able to ptrace the target to read
    from the target. This check could possibly be moved to the open call,
    but it's not clear exactly what race this restriction is stopping
    (reason appears to have been lost)
  - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
    domain socket is a bit ugly from a userspace point of view,
    especially when you may have hundreds if not (eventually) thousands
    of processes that all need to do this with each other
  - Doesn't allow for some future use of the interface we would like to
    consider adding in the future (see below)
  - Interestingly reading from /proc/pid/mem currently actually
    involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems.  Since you need the reader and writer working co-operatively if
the pipe is not drained then you block.  Which requires some wrapping to
do non blocking on the send side or polling on the receive.  In all to all
communication it requires ordering otherwise you can deadlock.  And in the
example of many MPI tasks writing to one MPI task vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could.  For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a mathops (say the reduce is doing a sum) as
this would save us doing a copy.  We don't need to keep a copy of the data
from the source.  I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI).  This interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication.  And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc, other architecture should
mainly (I think) just need to add syscall numbers for the process_vm_readv
and process_vm_writev. There are 32 bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: <linux-man@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lguest: add export.h to lguest files for THIS_MODULE/EXPORT_SYMBOL · 39a0e33d

Authored by Paul Gortmaker on Jul 21, 2011

We need this in advance of the module.h cleanup, or we'll
get compile errors like this:

  CC      drivers/lguest/lguest_device.o
drivers/lguest/lguest_device.c: In function ‘lguest_devices_init’:
drivers/lguest/lguest_device.c:490: error: ‘THIS_MODULE’ undeclared (first use in this function)

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

x86: efi_32.c is implicitly getting asm/desc.h via module.h · 783ac47c

Authored by Paul Gortmaker on May 27, 2011

We want to clean up the chain of includes stumbling through
module.h, and when we do that, we'll see:

  CC      arch/x86/platform/efi/efi_32.o
  efi/efi_32.c: In function ‘efi_call_phys_prelog’:
  efi/efi_32.c:80: error: implicit declaration of function ‘get_cpu_gdt_table’
  efi/efi_32.c:82: error: implicit declaration of function ‘load_gdt’
  make[4]: *** [arch/x86/platform/efi/efi_32.o] Error 1

Include asm/desc.h so that there are no implicit include assumptions.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

x86: fix up files really needing to include module.h · 7c52d551

Authored by Paul Gortmaker on May 27, 2011

These files aren't just exporting symbols -- they are also defining
a MODULE_LICENSE etc. so give them the full module.h file.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

x86: Fix files explicitly requiring export.h for EXPORT_SYMBOL/THIS_MODULE · 69c60c88

Authored by Paul Gortmaker on May 26, 2011

These files were implicitly getting EXPORT_SYMBOL via device.h
which was including module.h, but that will be fixed up shortly.

By fixing these now, we can avoid seeing things like:

arch/x86/kernel/rtc.c:29: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
arch/x86/kernel/pci-dma.c:20: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
arch/x86/kernel/e820.c:69: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL_GPL’

[ with input from Randy Dunlap <rdunlap@xenotime.net> and also
  from Stephen Rothwell <sfr@canb.auug.org.au> ]

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

x86: fix implicit include of <linux/topology.h> in vsyscall_64 · 29574022

Authored by Paul Gortmaker on May 26, 2011

In removing the presence of <linux/module.h> from some of the
more common <linux/something.h> files, this implicit include
of <linux/topology.h> was uncovered.

  CC      arch/x86/kernel/vsyscall_64.o
  arch/x86/kernel/vsyscall_64.c: In function ‘vsyscall_set_cpu’:
  arch/x86/kernel/vsyscall_64.c:259: error: implicit declaration of function ‘cpu_to_node’

Explicitly call it out so the cleanup can take place.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

x86: drop unused Kconfig symbol · 4f4d7a9b

Authored by Paul Bolle on Oct 24, 2011

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>

openeuler / Kernel · synced successfully more than 1 year ago
