Commits · 0c87197142427063e096f11603543ca874045952 · gsplhtlxg / clone-Linux

19 Jun, 2009 1 commit

perf_counter, x86: Improve interactions with fast-gup · 0c871971

Committed by Ingo Molnar on Jun 15, 2009

Improve a few details in perfcounter call-chain recording that
makes use of fast-GUP:

- Use ACCESS_ONCE() to observe the pte value. ptes are fundamentally
  racy and can be changed on another CPU, so we have to be careful
  about how we access them. The PAE branch is already careful with
  read-barriers - but the non-PAE and 64-bit side needs an
  ACCESS_ONCE() to make sure the pte value is observed only once.

- make the checks a bit stricter so that we can feed it any kind of
  cra^H^H^H user-space input ;-)
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

05 Feb, 2009 1 commit

x86: uaccess: use errret as error value in __put_user_size() · 18114f61

Committed by Hiroshi Shimamoto on Jan 30, 2009

Impact: cleanup

In the __put_user_size() macro, errret is used as the error value.
But if size is 8, errret isn't passed to __put_user_asm_u64().
This behavior is inconsistent.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

30 Jan, 2009 1 commit

x86: uaccess: fix compilation error on CONFIG_M386 · 019a1369

Committed by Hiroshi Shimamoto on Jan 29, 2009

When CONFIG_X86_WP_WORKS_OK is not set, __put_user_size_ex() is not defined.
Add macros for the !CONFIG_X86_WP_WORKS_OK case.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

24 Jan, 2009 1 commit

x86: uaccess: introduce try and catch framework · fe40c0af

Committed by Hiroshi Shimamoto on Jan 23, 2009

Impact: introduce new uaccess exception handling framework

Introduce {get|put}_user_try and {get|put}_user_catch as a new uaccess
exception handling framework.
{get|put}_user_try begins an exception block, and {get|put}_user_catch(err)
ends the block and sets err if an exception occurred in {get|put}_user_ex()
within the block. The exception is stored in thread_info->uaccess_err.

Example usage of this framework:
int func()
{
	int err = 0;

	get_user_try {
		get_user_ex(...);
		get_user_ex(...);
		:
	} get_user_catch(err);

	return err;
}

Note: get_user_ex() does not clear the value when an exception occurs, which
differs from the behavior of __get_user(), but I think it doesn't matter.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

21 Jan, 2009 2 commits

x86: uaccess: rename __put_user_u64() to __put_user_asm_u64() · cc86c9e0

Committed by Hiroshi Shimamoto on Jan 19, 2009

Impact: cleanup

Rename __put_user_u64() to __put_user_asm_u64(), matching __get_user_asm_u64().
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: uaccess: fix style problems · 4d5d7838

Committed by Hiroshi Shimamoto on Jan 19, 2009

Impact: cleanup

Fix coding style problems in arch/x86/include/asm/uaccess.h.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

12 Dec, 2008 1 commit

x86: uaccess: return value of __{get|put}_user() can be int · 16855f87

Committed by Hiroshi Shimamoto on Dec 08, 2008

Impact: cleanup

The return type of __{get|put}_user() can be int.
No caller uses the return value of __{get|put}_user() as a long.
This reduces code size a bit on 64-bit.

 $ size vmlinux.*
     text	   data	    bss	    dec	    hex	filename
  4509265	 479988	 673588	5662841	 566879	vmlinux.new
  4511462	 479988	 673588	5665038	 56710e	vmlinux.old
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

23 Oct, 2008 2 commits

x86: Fix ASM_X86__ header guards · 1965aae3

Committed by H. Peter Anvin on Oct 22, 2008

Change header guards named "ASM_X86__*" to "_ASM_X86_*" since:

a. the double underscore is ugly and pointless.
b. no leading underscore violates namespace constraints.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

x86, um: ... and asm-x86 move · bb898558

Committed by Al Viro on Aug 17, 2008

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

12 Sep, 2008 1 commit

x86: some lock annotations for user copy paths, v3 · 1d18ef48

Committed by Ingo Molnar on Sep 11, 2008

- add annotation back to clear_user()
- change probe_kernel_address() to _inatomic*() method
Signed-off-by: Ingo Molnar <mingo@elte.hu>

11 Sep, 2008 1 commit

x86: some lock annotations for user copy paths, v2 · 3ee1afa3

Committed by Nick Piggin on Sep 10, 2008

 - introduce might_fault()
 - handle the atomic user copy paths correctly

[ mingo@elte.hu: move might_sleep() outside of in_atomic(). ]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

10 Sep, 2008 1 commit

x86: some lock annotations for user copy paths · c10d38dd

Committed by Nick Piggin on Sep 10, 2008

copy_to/from_user and all its variants (except the atomic ones) can take a
page fault and perform non-trivial work like taking mmap_sem and entering
the filesystem/pagecache.

Unfortunately, this often escapes lockdep because a common pattern is to
use it to read in some arguments just set up from userspace, or write data
back to a hot buffer. In those cases, it will be unlikely for page reclaim
to get a window in to cause copy_*_user to fault.

With the new might_lock primitives, add some annotations to x86. I don't
know if I caught all possible faulting points (it's a bit of a maze, and I
didn't really look at 32-bit). But this is a starting point.

Boots and runs OK so far.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

27 Jul, 2008 1 commit

x86: lockless get_user_pages_fast() · 8174c430

Committed by Nick Piggin on Jul 25, 2008

Implement get_user_pages_fast without locking in the fastpath on x86.

Do an optimistic lockless pagetable walk, without taking mmap_sem or any
page table locks.  Page table existence is guaranteed by
turning interrupts off (combined with the fact that we're always looking
up the current mm, means we can do the lockless page table walk within the
constraints of the TLB shootdown design).  Basically we can do this
lockless pagetable walk in a similar manner to the way the CPU's pagetable
walker does not have to take any locks to find present ptes.

This patch (combined with the subsequent ones to convert direct IO to use
it) was found to give about 10% performance improvement on a 2 socket 8
core Intel Xeon system running an OLTP workload on DB2 v9.5:

 "To test the effects of the patch, an OLTP workload was run on an IBM
  x3850 M2 server with 2 processors (quad-core Intel Xeon processors at
  2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel.  Comparing
  runs with and without the patch resulted in an overall performance
  benefit of ~9.8%.  Correspondingly, oprofiles showed that samples from
  __up_read and __down_read routines that is seen during thread contention
  for system resources was reduced from 2.8% down to .05%.  Monitoring the
  /proc/vmstat output from the patched run showed that the counter for
  fast_gup contained a very high number while the fast_gup_slow value was
  zero."

(fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a
counter we had for the number of times the slowpath was invoked).

The main reason for the improvement is that DB2 has multiple threads each
issuing direct-IO.  Direct-IO uses get_user_pages, and thus the threads
contend the mmap_sem cacheline, and can also contend on page table locks.

I would anticipate larger performance gains on larger systems, however I
think DB2 uses an adaptive mix of threads and processes, so it could be
that thread contention remains pretty constant as machine size increases.
In which case, we are stuck with "only" a 10% gain.

The downside of using get_user_pages_fast is that if there is not a pte
with the correct permissions for the access, we end up falling back to
get_user_pages and so the get_user_pages_fast is a bit of extra work.
However this should not be the common case in most performance critical
code.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: Kconfig fix]
[akpm@linux-foundation.org: Makefile fix/cleanup]
[akpm@linux-foundation.org: warning fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

23 Jul, 2008 1 commit

x86: consolidate header guards · 77ef50a5

Committed by Vegard Nossum on Jun 18, 2008

This patch is the result of an automatic script that consolidates the
format of all the headers in include/asm-x86/.

The format:

1. No leading underscore. Names with leading underscores are reserved.
2. Pathname components are separated by two underscores. So we can
   distinguish between mm_types.h and mm/types.h.
3. Everything except letters and numbers is turned into a single
   underscore.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>

09 Jul, 2008 9 commits

x86: define architectural characteristics in uaccess.h. · 22cac167

Committed by Glauber Costa on Jun 25, 2008

Remove them from the arch-specific file.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: put movsl_mask into uaccess.h. · 8bc7de0c

Committed by Glauber Costa on Jun 25, 2008

x86_64 does not need it, but it won't have X86_INTEL_USERCOPY
defined either.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: move __get_user and __put_user into uaccess.h. · 8cb834e9

Committed by Glauber Costa on Jun 25, 2008

We also carry the unaligned version with us. Only x86_64 uses
it, but there's no problem in defining it.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: merge put_user. · e30a44fd

Committed by Glauber Costa on Jun 25, 2008

Move both versions, which are highly similar, to uaccess.h.
Note that, for x86_64, X86_WP_WORKS_OK is always defined.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: merge __get_user_asm and its users. · 3f168221

Committed by Glauber Costa on Jun 25, 2008

Move __get_user_asm and __get_user_size and __get_user_nocheck
to uaccess.h. This requires us to define a macro at __get_user_size
for the 64-bit access case.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: merge __put_user_asm and its user. · dc70ddf4

Committed by Glauber Costa on Jun 25, 2008

Move both __put_user_asm and __put_user_size to
uaccess.h. i386 already had a special function for 64-bit access,
so for x86_64, we just define a macro with the same name.
Note that for X86_64, CONFIG_X86_WP_WORKS_OK will always
be defined, so the #else part will never even be compiled in.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: move __addr_ok to uaccess.h. · 002ca169

Committed by Glauber Costa on Jun 25, 2008

Take it out of uaccess_32.h. Since there appear to be no users
of the x86_64 version, we simply pick the i386 version.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: merge getuser. · 865e5b76

Committed by Glauber Costa on Jun 25, 2008

Merge versions of getuser from uaccess_32.h and uaccess_64.h into
uaccess.h. There is a part which is 64-bit only (for now), and for
that, we use a __get_user_8 macro.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: merge common parts of uaccess. · ca233862

Committed by Glauber Costa on Jun 13, 2008

Common parts of uaccess_32.h and uaccess_64.h
are put in uaccess.h. Bits in uaccess_32.h and
uaccess_64.h that come to this file are equal
except for comment and whitespace differences.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

11 Oct, 2007 1 commit

i386/x86_64: move headers to include/asm-x86 · 96a388de

Committed by Thomas Gleixner on Oct 11, 2007

Move the headers to include/asm-x86 and fix up the
header install make rules.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>