提交 · 21eac81f252fe31c3cf64b805a1e8652192f3a3b · openeuler / raspberrypi-kernel

09 1月, 2006 16 次提交

[PATCH] Swap Migration V5: LRU operations · 21eac81f

由 Christoph Lameter 提交于 1月 08, 2006

This is the start of the `swap migration' patch series.

Swap migration allows the moving of the physical location of pages between
nodes in a numa system while the process is running.  This means that the
virtual addresses that the process sees do not change.  However, the system
rearranges the physical location of those pages.

The main intent of page migration patches here is to reduce the latency of
memory access by moving pages near to the processor where the process
accessing that memory is running.

The patchset allows a process to manually relocate the node on which its
pages are located through the MF_MOVE and MF_MOVE_ALL options while
setting a new memory policy.

The pages of process can also be relocated from another process using the
sys_migrate_pages() function call.  Requires CAP_SYS_ADMIN.  The migrate_pages
function call takes two sets of nodes and moves pages of a process that are
located on the from nodes to the destination nodes.

Manual migration is very useful if for example the scheduler has relocated a
process to a processor on a distant node.  A batch scheduler or an
administrator can detect the situation and move the pages of the process
nearer to the new processor.

sys_migrate_pages() could be used on non-numa machines as well, to force all
of a particualr process's pages out to swap, if someone thinks that's useful.

Larger installations usually partition the system using cpusets into sections
of nodes.  Paul has equipped cpusets with the ability to move pages when a
task is moved to another cpuset.  This allows automatic control over locality
of a process.  If a task is moved to a new cpuset then also all its pages are
moved with it so that the performance of the process does not sink
dramatically (as is the case today).

Swap migration works by simply evicting the page.  The pages must be faulted
back in.  The pages are then typically reallocated by the system near the node
where the process is executing.

For swap migration the destination of the move is controlled by the allocation
policy.  Cpusets set the allocation policy before calling sys_migrate_pages()
in order to move the pages as intended.

No allocation policy changes are performed for sys_migrate_pages().  This
means that the pages may not faulted in to the specified nodes if no
allocation policy was set by other means.  The pages will just end up near the
node where the fault occurred.

There's another patch series in the pipeline which implements "direct
migration".

The direct migration patchset extends the migration functionality to avoid
going through swap.  The destination node of the relation is controllable
during the actual moving of pages.  The crutch of using the allocation policy
to relocate is not necessary and the pages are moved directly to the target.
Its also faster since swap is not used.

And sys_migrate_pages() can then move pages directly to the specified node.
Implement functions to isolate pages from the LRU and put them back later.

This patch:

An earlier implementation was provided by Hirokazu Takahashi
<taka@valinux.co.jp> and IWAMOTO Toshihiro <iwamoto@valinux.co.jp> for the
memory hotplug project.

From: Magnus

This breaks out isolate_lru_page() and putpack_lru_page().  Needed for swap
migration.
Signed-off-by: NMagnus Damm <magnus.damm@gmail.com>
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

21eac81f

[PATCH] add schedule_on_each_cpu() · 15316ba8

由 Christoph Lameter 提交于 1月 08, 2006

swap migration's isolate_lru_page() currently uses an IPI to notify other
processors that the lru caches need to be drained if the page cannot be
found on the LRU.  The IPI interrupt may interrupt a processor that is just
processing lru requests and cause a race condition.

This patch introduces a new function run_on_each_cpu() that uses the
keventd() to run the LRU draining on each processor.  Processors disable
preemption when dealing the LRU caches (these are per processor) and thus
executing LRU draining from another process is safe.

Thanks to Lee Schermerhorn <lee.schermerhorn@hp.com> for finding this race
condition.
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

15316ba8

[PATCH] mm: free_pages opt · 48db57f8

由 Nick Piggin 提交于 1月 08, 2006

Try to streamline free_pages_bulk by ensuring callers don't pass in a
'count' that exceeds the list size.

Some cleanups:
Rename __free_pages_bulk to __free_one_page.
Put the page list manipulation from __free_pages_ok into free_one_page.
Make __free_pages_ok static.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

48db57f8

[PATCH] mm: cleanup zone_pcp · 23316bc8

由 Nick Piggin 提交于 1月 08, 2006

Use zone_pcp everywhere even though NUMA code "knows" the internal details
of the zone.  Stop other people trying to copy, and it looks nicer.

Also, only print the pagesets of online cpus in zoneinfo.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

23316bc8

[PATCH] Make high and batch sizes of per_cpu_pagelists configurable · 8ad4b1fb

由 Rohit Seth 提交于 1月 08, 2006

As recently there has been lot of traffic on the right values for batch and
high water marks for per_cpu_pagelists.  This patch makes these two
variables configurable through /proc interface.

A new tunable /proc/sys/vm/percpu_pagelist_fraction is added.  This entry
controls the fraction of pages at most in each zone that are allocated for
each per cpu page list.  The min value for this is 8.  It means that we
don't allow more than 1/8th of pages in each zone to be allocated in any
single per_cpu_pagelist.

The batch value of each per cpu pagelist is also updated as a result.  It
is set to pcp->high/4.  The upper limit of batch is (PAGE_SHIFT * 8)
Signed-off-by: NRohit Seth <rohit.seth@intel.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

8ad4b1fb

[PATCH] drop-pagecache · 9d0243bc

由 Andrew Morton 提交于 1月 08, 2006

Add /proc/sys/vm/drop_caches.  When written to, this will cause the kernel to
discard as much pagecache and/or reclaimable slab objects as it can.  THis
operation requires root permissions.

It won't drop dirty data, so the user should run `sync' first.

Caveats:

a) Holds inode_lock for exorbitant amounts of time.

b) Needs to be taught about NUMA nodes: propagate these all the way through
   so the discarding can be controlled on a per-node basis.

This is a debugging feature: useful for getting consistent results between
filesystem benchmarks.  We could possibly put it under a config option, but
it's less than 300 bytes.
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

9d0243bc

[PATCH] slab: remove nested #ifdef CONFIG_NUMA · bec6b0c8

由 Christoph Lameter 提交于 1月 08, 2006

For some reason there is an #ifdef CONFIG_NUMA within another #ifdef
CONFIG_NUMA in the page allocator.  Remove innermost #ifdef CONFIG_NUMA
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

bec6b0c8

[PATCH] slab: fix code formatting · b28a02de

由 Pekka Enberg 提交于 1月 08, 2006

The slab allocator code is inconsistent in coding style and messy.  For this
patch, I ran Lindent for mm/slab.c and fixed up goofs by hand.
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b28a02de

[PATCH] slab: extract slab order calculation to separate function · 4d268eba

由 Pekka Enberg 提交于 1月 08, 2006

This patch moves the ugly loop that determines the 'optimal' size (page order)
of cache slabs from kmem_cache_create() to a separate function and cleans it
up a bit.

Thanks to Matthew Wilcox for the help with this patch.
Signed-off-by: NMatthew Dobson <colpatch@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4d268eba

[PATCH] slab: extract slabinfo header printing to separate function · 85289f98

由 Pekka Enberg 提交于 1月 08, 2006

This patch extracts slabinfo header printing to a separate function
print_slabinfo_header() to make s_start() more readable.
Signed-off-by: NMatthew Dobson <colpatch@us.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

85289f98

[PATCH] slab: remove unused align parameter from alloc_percpu · f9f75005

由 Pekka Enberg 提交于 1月 08, 2006

__alloc_percpu and alloc_percpu both take an 'align' argument which is
completely ignored.  snmp6_mib_init() in net/ipv6/af_inet6.c attempts to use
it, but it will be ignored.  Therefore, remove the 'align' argument and fixup
the lone caller.
Signed-off-by: NMatthew Dobson <colpatch@us.ibm.com>
Acked-by: NManfred Spraul <manfred@colorfullife.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f9f75005

[PATCH] Fix compilation with CONFIG_MEMORY_HOTPLUG=y and gcc41. · b792de39

由 Olaf Hering 提交于 1月 08, 2006

Fix compilation with CONFIG_MEMORY_HOTPLUG=y and gcc41.
Also remove unneeded declations, add a public function.

drivers/base/memory.c:53: error: static declaration of 'register_memory_notifier' follows non-static declaration
include/linux/memory.h:85: error: previous declaration of 'register_memory_notifier' was here
drivers/base/memory.c:58: error: static declaration of 'unregister_memory_notifier' follows non-static declaration
include/linux/memory.h:86: error: previous declaration of 'unregister_memory_notifier' was here
drivers/base/memory.c:68: error: static declaration of 'register_memory' follows non-static declaration
include/linux/memory.h:73: error: previous declaration of 'register_memory' was here
Signed-off-by: NOlaf Hering <olh@suse.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b792de39

[PATCH] ARM Netwinder watchdog wdt977 update · 4dab06fa

由 Woody Suwalski 提交于 1月 08, 2006

Cleanup for the ARM-only watchdog driver wdt977.

This is probably the last update, since we want to merge with w83977f_wdt.
Jose Goncalves has ported this driver to i386, so probably we can iron out
configuration differences.
Signed-off-by: NWoody Suwalski <woodys@xandros.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4dab06fa

[PATCH] small hp_sdc_rtc cleanup: use no_llseek · 70c00ba0

由 Marcelo Tosatti 提交于 1月 08, 2006

Use no_llseek function.
Signed-off-by: NMarcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: "Brian S. Julin" <bri@calyx.com>
Acked-by: NVojtech Pavlik <vojtech@suse.cz>
Cc: Dmitry Torokhov <dtor_core@ameritech.net>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

70c00ba0

[PATCH] asm-generic/atomic.h needs types.h · 5998bf1d

由 Andrew Morton 提交于 1月 08, 2006

For BITS_PER_LONG
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

5998bf1d

[PATCH] revert "mm: page_state fixes" · 84c2008a

由 Andrew Morton 提交于 1月 08, 2006

Hugh says:

page_alloc_cpu_notify() specifically contains code to

 		/* Add dead cpu's page_states to our own. */

which handles this more efficiently.

Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

84c2008a

08 1月, 2006 24 次提交

[IPV6]: small cleanups · 9f5336e2

由 Adrian Bunk 提交于 1月 07, 2006

This patch contains the following cleanups:
- addrconf.c: make addrconf_dad_stop() static
- inet6_connection_sock.c should #include <net/inet6_connection_sock.h>
  for getting the prototypes of it's global functions
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9f5336e2

[IPV4]: make ip_fragment() static · 97dc627f

由 Adrian Bunk 提交于 1月 07, 2006

Since there's no longer any external user of ip_fragment() we can make 
it static.
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

97dc627f

D
[NETFILTER]: Add dummy nf_hook{_thresh}() when NETFILTER is disabled. · f53b61d8
由 David S. Miller 提交于 1月 07, 2006
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
f53b61d8
J
[NETFILTER]: ip_conntrack_proto_sctp.c needs linux/interrupt.h · da7bc6ee
由 Joe Kappus 提交于 1月 06, 2006
```
Signed-off-by: NJoe Kappus <joecool1029@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
da7bc6ee

[AX25/MKISS]: unbalanced spinlock_bh in ax_encaps() · b3413872

由 Francois Romieu 提交于 1月 06, 2006

The unlocking disappeared during commit
5793f4be.
Signed-off-by: NFrancois Romieu <romieu@fr.zoreil.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b3413872

[NETFILTER]: Add ipt_policy/ip6t_policy matches · e16a8f0b

由 Patrick McHardy 提交于 1月 06, 2006

Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e16a8f0b

[NETFILTER]: Handle NAT in IPsec policy checks · eb9c7ebe

由 Patrick McHardy 提交于 1月 06, 2006

Handle NAT of decapsulated IPsec packets by reconstructing the struct flowi
of the original packet from the conntrack information for IPsec policy
checks.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eb9c7ebe

[NETFILTER]: Keep conntrack reference until IPsec policy checks are done · b59c2701

由 Patrick McHardy 提交于 1月 06, 2006

Keep the conntrack reference until policy checks have been performed for
IPsec NAT support. The reference needs to be dropped before a packet is
queued to avoid having the conntrack module unloadable.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b59c2701

[NETFILTER]: Redo policy lookups after NAT when neccessary · 5c901daa

由 Patrick McHardy 提交于 1月 06, 2006

When NAT changes the key used for the xfrm lookup it needs to be done
again. If a new policy is returned in POST_ROUTING the packet needs
to be passed to xfrm4_output_one manually after all hooks were called
because POST_ROUTING is called with fixed okfn (ip_finish_output).
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5c901daa

[NETFILTER]: Use conntrack information to determine if packet was NATed · 4e8e9de7

由 Patrick McHardy 提交于 1月 06, 2006

Preparation for IPsec support for NAT:
Use conntrack information instead of saving the saving and comparing the
addresses to determine if a packet was NATed and needs to be rerouted to
make it easier to extend the key.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4e8e9de7

[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder · 3e3850e9

由 Patrick McHardy 提交于 1月 06, 2006

ip_route_me_harder doesn't use the port numbers of the xfrm lookup and
uses ip_route_input for non-local addresses which doesn't do a xfrm
lookup, ip6_route_me_harder doesn't do a xfrm lookup at all.

Use xfrm_decode_session and do the lookup manually, make sure both
only do the lookup if the packet hasn't been transformed already.

Makeing sure the lookup only happens once needs a new field in the
IP6CB, which exceeds the size of skb->cb. The size of skb->cb is
increased to 48b. Apparently the IPv6 mobile extensions need some
more room anyway.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3e3850e9

[IPV4]: reset IPCB flags when neccessary · 8cdfab8a

由 Patrick McHardy 提交于 1月 06, 2006

Reset IPSKB_XFRM_TUNNEL_SIZE flags in ipip and ip_gre hard_start_xmit
function before the packet reenters IP. This is neccessary so the
encapsulated packets are checked not to be oversized in xfrm4_output.c
again. Reset all flags in sit when a packet changes its address family.

Also remove some obsolete IPSKB flags.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8cdfab8a

[IPV4/6]: Netfilter IPsec input hooks · b05e1066

由 Patrick McHardy 提交于 1月 06, 2006

When the innermost transform uses transport mode the decapsulated packet
is not visible to netfilter. Pass the packet through the PRE_ROUTING and
LOCAL_IN hooks again before handing it to upper layer protocols to make
netfilter-visibility symetrical to the output path.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b05e1066

[IPV6]: Move nextheader offset to the IP6CB · 951dbc8a

由 Patrick McHardy 提交于 1月 06, 2006

Move nextheader offset to the IP6CB to make it possible to pass a
packet to ip6_input_finish multiple times and have it skip already
parsed headers. As a nice side effect this gets rid of the manual
hopopts skipping in ip6_input_finish.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

951dbc8a

[XFRM]: Netfilter IPsec output hooks · 16a6677f

由 Patrick McHardy 提交于 1月 06, 2006

Call netfilter hooks before IPsec transforms. Packets visit the
FORWARD/LOCAL_OUT and POST_ROUTING hook before the first encapsulation
and the LOCAL_OUT and POST_ROUTING hook before each following tunnel mode
transform.

Patch from Herbert Xu <herbert@gondor.apana.org.au>:

Move the loop from dst_output into xfrm4_output/xfrm6_output since they're
the only ones who need to it. xfrm{4,6}_output_one() processes the first SA
all subsequent transport mode SAs and is called in a loop that calls the
netfilter hooks between each two calls.

In order to avoid the tail call issue, I've added the inline function
nf_hook which is nf_hook_slow plus the empty list check.
Signed-off-by: NPatrick McHardy <kaber@trash.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

16a6677f

[XFRM]: Fix sparse warning. · ee2e6841

由 Luiz Capitulino 提交于 1月 06, 2006

security/selinux/xfrm.c:155:10: warning: Using plain integer as NULL pointer
Signed-off-by: NLuiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ee2e6841

D
[DCCP]: ipv6.c needs net/ip6_checksum.c · aa0e4e4a
由 David S. Miller 提交于 1月 06, 2006
```
Reported by Dave Jones.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
aa0e4e4a
L

Merge git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · b9abaa3f
由 Linus Torvalds 提交于 1月 07, 2006

b9abaa3f
L

Merge master.kernel.org:/home/rmk/linux-2.6-arm · 8995b161
由 Linus Torvalds 提交于 1月 07, 2006

8995b161
L

Merge master.kernel.org:/home/rmk/linux-2.6-serial · cc918c7a
由 Linus Torvalds 提交于 1月 07, 2006

cc918c7a
L

Merge master.kernel.org:/home/rmk/linux-2.6-mmc · f9c5d045
由 Linus Torvalds 提交于 1月 07, 2006

f9c5d045

[PATCH] fbcon: don´t call set_par() in fbcon_init() if vc_mode == KD_GRAPHICS · d354d9af

由 Knut Petersen 提交于 1月 07, 2006

Nothing prevents a user to modprobe a framebuffer driver from e.g.  the
xterm prompt.  As a result, the set_par() function of the driver will be
called from fbcon_init().

This is fatal as a lot of X / framebuffer combinations are unable to
recover from set_par() reprogramming the graphics controller in
KD_GRAPHICS mode.

It is also unnecessary as the set_par() function will be called during a
switch to KD_TEXT anyway.  Because of this no side effects are possible.
Signed-off-by: NKnut Petersen <Knut_Petersen@t-online.de>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

d354d9af

R
[ARM] byteorder.h needs linux/compiler.h · fe5dd7c7
由 Russell King 提交于 1月 07, 2006
```
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
```
fe5dd7c7

Input: ibmasm - convert to dynamic input_dev allocation · 736ce432

由 Vernon Mauery 提交于 1月 07, 2006

Update the ibmasm driver to use the dynamic allocation of input_dev
structs to work with the sysfs subsystem.

Vojtech: Fixed some problems/bugs in the patch.
Dmitry: Fixed some more.
Signed-off-by: NVernon Mauery <vernux@us.ibm.com>
Signed-off-by: NVojtech Pavlik <vojtech@suse.cz>
Signed-off-by: NDmitry Torokhov <dtor@mail.ru>

736ce432