提交 · 4cae59d2e85c1ee2ab1ee284db1945c5394cd965 · openanolis / cloud-kernel

26 3月, 2006 17 次提交

[PATCH] amiga: fix driver_register() return handling, remove zorro_module_init() · 33d8675e

由 Bjorn Helgaas 提交于 3月 25, 2006

Remove the assumption that driver_register() returns the number of devices
bound to the driver.  In fact, it returns zero for success or a negative
error value.

zorro_module_init() used the device count to automatically unregister and
unload drivers that found no devices.  That might have worked at one time,
but has been broken for some time because zorro_register_driver() returned
either a negative error or a positive count (never zero).  So it could only
unregister on failure, when it's not needed anyway.

This functionality could be resurrected in individual drivers by counting
devices in their .probe() methods.
Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

33d8675e

[PATCH] hp300: fix driver_register() return handling, remove dio_module_init() · e51c01b0

由 Bjorn Helgaas 提交于 3月 25, 2006

Remove the assumption that driver_register() returns the number of devices
bound to the driver.  In fact, it returns zero for success or a negative
error value.

dio_module_init() used the device count to automatically unregister and
unload drivers that found no devices.  That might have worked at one time,
but has been broken for some time because dio_register_driver() returned
either a negative error or a positive count (never zero).  So it could only
unregister on failure, when it's not needed anyway.

This functionality could be resurrected in individual drivers by counting
devices in their .probe() methods.
Signed-off-by: NBjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Philip Blundell <philb@gnu.org>
Cc: Jochen Friedrich <jochen@scram.de>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e51c01b0

[PATCH] reiserfs: cleanups · cd02b966

由 Vladimir V. Saveliev 提交于 3月 25, 2006

Clean up several places where gcc issues warnings when -W is specified.
Thanks to Neil for finding that.
Signed-off-by: NVladimir V. Saveliev <vs@namesys.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: NHans Reiser <reiser@namesys.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

cd02b966

[PATCH] parport: move PP_MAJOR from ppdev.h to major.h · e6a67846

由 Rene Herman 提交于 3月 25, 2006

Today I wondered about /dev/parport<n> after not seeing anything in
drivers/parport register char-major-99.  Having PP_MAJOR in
include/linux/major.h would've allowed me to more quickly determine that it
was the ppdev driver driving these.
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e6a67846

[PATCH] inotify: lock avoidance with parent watch status in dentry · c32ccd87

由 Nick Piggin 提交于 3月 25, 2006

Previous inotify work avoidance is good when inotify is completely unused,
but it breaks down if even a single watch is in place anywhere in the
system. Robin Holt notices that udev is one such culprit - it slows down a
512-thread application on a 512 CPU system from 6 seconds to 22 minutes.

Solve this by adding a flag in the dentry that tells inotify whether or not
its parent inode has a watch on it. Event queueing to parent will skip
taking locks if this flag is cleared. Setting and clearing of this flag on
all child dentries versus event delivery: this is no in terms of race
cases, and that was shown to be equivalent to always performing the check.

The essential behaviour is that activity occuring _after_ a watch has been
added and _before_ it has been removed, will generate events.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Cc: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c32ccd87

[PATCH] kernel/params.c: make param_array() static · 9871728b

由 Adrian Bunk 提交于 3月 25, 2006

param_array() in kernel/params.c can now become static.
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

9871728b

[PATCH] Remove MODULE_PARM · 8d3b33f6

由 Rusty Russell 提交于 3月 25, 2006

MODULE_PARM was actually breaking: recent gcc version optimize them out as
unused.  It's time to replace the last users, which are generally in the
most unloved drivers anyway.
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

8d3b33f6

[PATCH] constify tty flip buffer handling · 1aef821a

由 Thomas Koeller 提交于 3月 25, 2006

Add a couple of 'const' qualifiers to the TTY flip buffer APIs, where
appropriate.
Signed-off-by: NThomas Koeller <thomas@koeller.dyndns.org>
Acked-by: NAlan Cox <alan@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

1aef821a

[PATCH] Introduce FMODE_EXEC file flag · b500531e

由 Oleg Drokin 提交于 3月 25, 2006

Introduce FMODE_EXEC file flag, to indicate that file is being opened for
execution.  This is useful for distributed filesystems to maintain
consistent behavior for returning ETXTBUSY when opening for write and
execution happens on different nodes.

akpm:

  Needed by Lustre at present.  I assume their objective to to work towards
  being able to install Lustre on an unmodified distro kernel, which seems
  sane.  It should have zero runtime cost.

  Trond and Chuck indicate that NFS4 can probably use this too, for the same
  thing.

  Steven says it's also on the GFS todo list.
Signed-off-by: NOleg Drokin <green@linuxhacker.ru>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

b500531e

[PATCH] reiserfs: fix transaction overflowing · 23f9e0f8

由 Alexander Zarochentzev 提交于 3月 25, 2006

This patch fixes a bug in reiserfs truncate.  A transaction might overflow
when truncating long highly fragmented file.  The fix is to split
truncation into several transactions to avoid overflowing.
Signed-off-by: NVladimir V. Saveliev <vs@namesys.com>
Cc; Charles McColgan <cm@chuck.net>
Cc: Alexander Zarochentsev <zam@namesys.com>
Cc: Hans Reiser <reiser@namesys.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

23f9e0f8

[PATCH] fs/inode.c: make iprune_mutex static · bdfc3266

由 Adrian Bunk 提交于 3月 25, 2006

There's no reason for iprune_mutex being global.
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

bdfc3266

[PATCH] Small cleanup in quota.h · ca5734db

由 Jan Kara 提交于 3月 25, 2006

Remove unused quota flag.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

ca5734db

[PATCH] jbd: embed j_commit_timer in journal struct · e3df1898

由 Andrew Morton 提交于 3月 25, 2006

The kjournald timer is currently on the kernel thread's stack and the journal
structure points at it.  Save a pointer hop by moving the timer into the
journal structure.
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

e3df1898

[PATCH] slab: optimize constant-size kzalloc calls · 40c07ae8

由 Pekka Enberg 提交于 3月 25, 2006

As suggested by Eric Dumazet, optimize kzalloc() calls that pass a
compile-time constant size.  Please note that the patch increases kernel
text slightly (~200 bytes for defconfig on x86).
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

40c07ae8

[PATCH] slab: introduce kmem_cache_zalloc allocator · a8c0f9a4

由 Pekka Enberg 提交于 3月 25, 2006

Introduce a memory-zeroing variant of kmem_cache_alloc.  The allocator
already exits in XFS and there are potential users for it so this patch
makes the allocator available for the general public.
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

a8c0f9a4

[PATCH] slab: implement /proc/slab_allocators · 871751e2

由 Al Viro 提交于 3月 25, 2006

Implement /proc/slab_allocators.   It produces output like:

idr_layer_cache: 80 idr_pre_get+0x33/0x4e
buffer_head: 2555 alloc_buffer_head+0x20/0x75
mm_struct: 9 mm_alloc+0x1e/0x42
mm_struct: 20 dup_mm+0x36/0x370
vm_area_struct: 384 dup_mm+0x18f/0x370
vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
vm_area_struct: 1 split_vma+0x5a/0x10e
vm_area_struct: 11 do_brk+0x206/0x2e2
vm_area_struct: 2 copy_vma+0xda/0x142
vm_area_struct: 9 setup_arg_pages+0x99/0x214
fs_cache: 8 copy_fs_struct+0x21/0x133
fs_cache: 29 copy_process+0xf38/0x10e3
files_cache: 30 alloc_files+0x1b/0xcf
signal_cache: 81 copy_process+0xbaa/0x10e3
sighand_cache: 77 copy_process+0xe65/0x10e3
sighand_cache: 1 de_thread+0x4d/0x5f8
anon_vma: 241 anon_vma_prepare+0xd9/0xf3
size-2048: 1 add_sect_attrs+0x5f/0x145
size-2048: 2 journal_init_revoke+0x99/0x302
size-2048: 2 journal_init_revoke+0x137/0x302
size-2048: 2 journal_init_inode+0xf9/0x1c4

Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
DESC
slab-leaks3-locking-fix
EDESC
From: Andrew Morton <akpm@osdl.org>

Update for slab-remove-cachep-spinlock.patch

Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

871751e2

[PATCH] sys_alarm() unsigned signed conversion fixup · c08b8a49

由 Thomas Gleixner 提交于 3月 25, 2006

alarm() calls the kernel with an unsigend int timeout in seconds.  The
value is stored in the tv_sec field of a struct timeval to setup the
itimer.  The tv_sec field of struct timeval is of type long, which causes
the tv_sec value to be negative on 32 bit machines if seconds > INT_MAX.

Before the hrtimer merge (pre 2.6.16) such a negative value was converted
to the maximum jiffies timeout by the timeval_to_jiffies conversion.  It's
not clear whether this was intended or just happened to be done by the
timeval_to_jiffies code.

hrtimers expect a timeval in canonical form and treat a negative timeout as
already expired.  This breaks the legitimate usage of alarm() with a
timeout value > INT_MAX seconds.

For 32 bit machines it is therefor necessary to limit the internal seconds
value to avoid API breakage.  Instead of doing this in all implementations
of sys_alarm the duplicated sys_alarm code is moved into a common function
in itimer.c
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c08b8a49

24 3月, 2006 23 次提交

[PATCH] strndup_user() · 96840aa0

由 Davi Arnaut 提交于 3月 24, 2006

This patch series creates a strndup_user() function to easy copying C strings
from userspace.  Also we avoid common pitfalls like userspace modifying the
final \0 after the strlen_user().
Signed-off-by: NDavi Arnaut <davi.arnaut@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

96840aa0

[PATCH] timer-irq-driven soft-watchdog, cleanups · 6687a97d

由 Ingo Molnar 提交于 3月 24, 2006

Make the softlockup detector purely timer-interrupt driven, removing
softirq-context (timer) dependencies.  This means that if the softlockup
watchdog triggers, it has truly observed a longer than 10 seconds
scheduling delay of a SCHED_FIFO prio 99 task.

(the patch also turns off the softlockup detector during the initial bootup
phase and does small style fixes)
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

6687a97d

[PATCH] ide: Allow IDE interface to specify its not capable of 32-bit operations · 208a08f7

由 Kumar Gala 提交于 3月 24, 2006

In some embedded systems the IDE hardware interface may only support 16-bit
or smaller accesses.  Allow the interface to specify if this is the case
and don't allow the drive or user to override the setting.
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
Acked-by: NBartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

208a08f7

[PATCH] Secure Digital Host Controller id and regs · 97f2478d

由 Pierre Ossman 提交于 3月 24, 2006

Class code and register definitions for the Secure Digital Host Controller
standard.
Signed-off-by: NPierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

97f2478d

[PATCH] fsync: extract internal code · 18e79b40

由 Andrew Morton 提交于 3月 24, 2006

Pull the guts out of do_fsync() - we can use it elsewhere.

Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

18e79b40

[PATCH] set_page_dirty() return value fixes · 4741c9fd

由 Andrew Morton 提交于 3月 24, 2006

We need set_page_dirty() to return true if it actually transitioned the page
from a clean to dirty state.  This wasn't right in a couple of places.  Do a
kernel-wide audit, fix things up.

This leaves open the possibility of returning a negative errno from
set_page_dirty() sometime in the future.  But we don't do that at present.
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4741c9fd

[PATCH] balance_dirty_pages_ratelimited: take nr_pages arg · fa5a734e

由 Andrew Morton 提交于 3月 24, 2006

Modify balance_dirty_pages_ratelimited() so that it can take a
number-of-pages-which-I-just-dirtied argument.  For msync().
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

fa5a734e

[PATCH] fadvise(): write commands · ebcf28e1

由 Andrew Morton 提交于 3月 24, 2006

Add two new linux-specific fadvise extensions():

LINUX_FADV_ASYNC_WRITE: start async writeout of any dirty pages between file
offsets `offset' and `offset+len'. Any pages which are currently under
writeout are skipped, whether or not they are dirty.

LINUX_FADV_WRITE_WAIT: wait upon writeout of any dirty pages between file
offsets `offset' and `offset+len'.

By combining these two operations the application may do several things:

LINUX_FADV_ASYNC_WRITE: push some or all of the dirty pages at the disk.

LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE: push all of the currently dirty
pages at the disk.

LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE, LINUX_FADV_WRITE_WAIT: push all
of the currently dirty pages at the disk, wait until they have been written.

It should be noted that none of these operations write out the file's
metadata. So unless the application is strictly performing overwrites of
already-instantiated disk blocks, there are no guarantees here that the data
will be available after a crash.

To complete this suite of operations I guess we should have a "sync file
metadata only" operation. This gives applications access to all the building
blocks needed for all sorts of sync operations. But sync-metadata doesn't fit
well with the fadvise() interface. Probably it should be a new syscall:
sys_fmetadatasync().

The patch also diddles with the meaning of `endbyte' in sys_fadvise64_64().
It is made to represent that last affected byte in the file (ie: it is
inclusive). Generally, all these byterange and pagerange functions are
inclusive so we can easily represent EOF with -1.

As Ulrich notes, these two functions are somewhat abusive of the fadvise()
concept, which appears to be "set the future policy for this fd".

But these commands are a perfect fit with the fadvise() impementation, and
several of the existing fadvise() commands are synchronous and don't affect
future policy either. I think we can live with the slight incongruity.

Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

ebcf28e1

[PATCH] abstract type/size specification for assembly · ab7efcc9

由 Jan Beulich 提交于 3月 24, 2006

Provide abstraction for generating type and size information of assembly
routines and data, while permitting architectures to override these
defaults.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Cc: "Russell King" <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "Andi Kleen" <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

ab7efcc9

[PATCH] cpuset memory spread slab cache optimizations · c61afb18

由 Paul Jackson 提交于 3月 24, 2006

The hooks in the slab cache allocator code path for support of NUMA
mempolicies and cpuset memory spreading are in an important code path.  Many
systems will use neither feature.

This patch optimizes those hooks down to a single check of some bits in the
current tasks task_struct flags.  For non NUMA systems, this hook and related
code is already ifdef'd out.

The optimization is done by using another task flag, set if the task is using
a non-default NUMA mempolicy.  Taking this flag bit along with the
PF_SPREAD_PAGE and PF_SPREAD_SLAB flag bits added earlier in this 'cpuset
memory spreading' patch set, one can check for the combination of any of these
special case memory placement mechanisms with a single test of the current
tasks task_struct flags.

This patch also tightens up the code, to save a few bytes of kernel text
space, and moves some of it out of line.  Due to the nested inlines called
from multiple places, we were ending up with three copies of this code, which
once we get off the main code path (for local node allocation) seems a bit
wasteful of instruction memory.
Signed-off-by: NPaul Jackson <pj@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

c61afb18

[PATCH] cpuset memory spread slab cache implementation · 101a5001

由 Paul Jackson 提交于 3月 24, 2006

Provide the slab cache infrastructure to support cpuset memory spreading.

See the previous patches, cpuset_mem_spread, for an explanation of cpuset
memory spreading.

This patch provides a slab cache SLAB_MEM_SPREAD flag.  If set in the
kmem_cache_create() call defining a slab cache, then any task marked with the
process state flag PF_MEMSPREAD will spread memory page allocations for that
cache over all the allowed nodes, instead of preferring the local (faulting)
node.

On systems not configured with CONFIG_NUMA, this results in no change to the
page allocation code path for slab caches.

On systems with cpusets configured in the kernel, but the "memory_spread"
cpuset option not enabled for the current tasks cpuset, this adds a call to a
cpuset routine and failed bit test of the processor state flag PF_SPREAD_SLAB.

For tasks so marked, a second inline test is done for the slab cache flag
SLAB_MEM_SPREAD, and if that is set and if the allocation is not
in_interrupt(), this adds a call to to a cpuset routine that computes which of
the tasks mems_allowed nodes should be preferred for this allocation.

==> This patch adds another hook into the performance critical
    code path to allocating objects from the slab cache, in the
    ____cache_alloc() chunk, below.  The next patch optimizes this
    hook, reducing the impact of the combined mempolicy plus memory
    spreading hooks on this critical code path to a single check
    against the tasks task_struct flags word.

This patch provides the generic slab flags and logic needed to apply memory
spreading to a particular slab.

A subsequent patch will mark a few specific slab caches for this placement
policy.
Signed-off-by: NPaul Jackson <pj@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

101a5001

[PATCH] cpuset memory spread page cache implementation and hooks · 44110fe3

由 Paul Jackson 提交于 3月 24, 2006

Change the page cache allocation calls to support cpuset memory spreading.

See the previous patch, cpuset_mem_spread, for an explanation of cpuset memory
spreading.

On systems without cpusets configured in the kernel, this is no change.

On systems with cpusets configured in the kernel, but the "memory_spread"
cpuset option not enabled for the current tasks cpuset, this adds a call to a
cpuset routine and failed bit test of the processor state flag PF_SPREAD_PAGE.

On tasks in cpusets with "memory_spread" enabled, this adds a call to a cpuset
routine that computes which of the tasks mems_allowed nodes should be
preferred for this allocation.

If memory spreading applies to a particular allocation, then any other NUMA
mempolicy does not apply.
Signed-off-by: NPaul Jackson <pj@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

44110fe3

[PATCH] cpuset memory spread basic implementation · 825a46af

由 Paul Jackson 提交于 3月 24, 2006

This patch provides the implementation and cpuset interface for an alternative
memory allocation policy that can be applied to certain kinds of memory
allocations, such as the page cache (file system buffers) and some slab caches
(such as inode caches).

The policy is called "memory spreading." If enabled, it spreads out these
kinds of memory allocations over all the nodes allowed to a task, instead of
preferring to place them on the node where the task is executing.

All other kinds of allocations, including anonymous pages for a tasks stack
and data regions, are not affected by this policy choice, and continue to be
allocated preferring the node local to execution, as modified by the NUMA
mempolicy.

There are two boolean flag files per cpuset that control where the kernel
allocates pages for the file system buffers and related in kernel data
structures. They are called 'memory_spread_page' and 'memory_spread_slab'.

If the per-cpuset boolean flag file 'memory_spread_page' is set, then the
kernel will spread the file system buffers (page cache) evenly over all the
nodes that the faulting task is allowed to use, instead of preferring to put
those pages on the node where the task is running.

If the per-cpuset boolean flag file 'memory_spread_slab' is set, then the
kernel will spread some file system related slab caches, such as for inodes
and dentries evenly over all the nodes that the faulting task is allowed to
use, instead of preferring to put those pages on the node where the task is
running.

The implementation is simple. Setting the cpuset flags 'memory_spread_page'
or 'memory_spread_cache' turns on the per-process flags PF_SPREAD_PAGE or
PF_SPREAD_SLAB, respectively, for each task that is in the cpuset or
subsequently joins that cpuset. In subsequent patches, the page allocation
calls for the affected page cache and slab caches are modified to perform an
inline check for these flags, and if set, a call to a new routine
cpuset_mem_spread_node() returns the node to prefer for the allocation.

The cpuset_mem_spread_node() routine is also simple. It uses the value of a
per-task rotor cpuset_mem_spread_rotor to select the next node in the current
tasks mems_allowed to prefer for the allocation.

This policy can provide substantial improvements for jobs that need to place
thread local data on the corresponding node, but that need to access large
file system data sets that need to be spread across the several nodes in the
jobs cpuset in order to fit. Without this patch, especially for jobs that
might have one thread reading in the data set, the memory allocation across
the nodes in the jobs cpuset can become very uneven.

A couple of Copyright year ranges are updated as well. And a couple of email
addresses that can be found in the MAINTAINERS file are removed.
Signed-off-by: NPaul Jackson <pj@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

825a46af

[PATCH] kill include/linux/platform.h, default_idle() cleanup · cdb04527

由 Adrian Bunk 提交于 3月 24, 2006

include/linux/platform.h contained nothing that was actually used except
the default_idle() prototype, and is therefore removed by this patch.

This patch does the following with the platform specific default_idle()
functions on different architectures:
- remove the unused function:
  - parisc
  - sparc64
- make the needlessly global function static:
  - arm
  - h8300
  - m68k
  - m68knommu
  - s390
  - v850
  - x86_64
- add a prototype in asm/system.h:
  - cris
  - i386
  - ia64
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Acked-by: NPatrick Mochel <mochel@digitalimplant.org>
Acked-by: NKyle McMartin <kyle@parisc-linux.org>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

cdb04527

[PATCH] Represent dirty_*_centisecs as jiffies internally · f6ef9438

由 Bart Samwel 提交于 3月 24, 2006

Make that the internal values for:

/proc/sys/vm/dirty_writeback_centisecs
/proc/sys/vm/dirty_expire_centisecs

are stored as jiffies instead of centiseconds.  Let the sysctl interface do
the conversions with full precision using clock_t_to_jiffies, instead of
doing overflow-sensitive on-the-fly conversions every time the values are
used.

Cons: apparent precision loss if HZ is not a multiple of 100, because of
conversion back and forth.  This is a common problem for all sysctl values
that use proc_dointvec_userhz_jiffies.  (There is only one other in-tree
use, in net/core/neighbour.c.)
Signed-off-by: NBart Samwel <bart@samwel.tk>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

f6ef9438

[PATCH] bitmap: region cleanup · 87e24802

由 Paul Jackson 提交于 3月 24, 2006

Paul Mundt <lethal@linux-sh.org> says:

This patch set implements a number of patches to clean up and restructure the
bitmap region code, in addition to extending the interface to support
multiword spanning allocations.

The current implementation (before this patch set) is limited by only being
able to allocate pages <= BITS_PER_LONG, as noted by the strategically
positioned BUG_ON() at lib/bitmap.c:752:

        /* We don't do regions of pages > BITS_PER_LONG.  The
	 * algorithm would be a simple look for multiple zeros in the
	 * array, but there's no driver today that needs this.  If you
	 * trip this BUG(), you get to code it... */
        BUG_ON(pages > BITS_PER_LONG);

As I seem to have been the first person to trigger this, the result ends up
being the following patch set with the help of Paul Jackson.

The final patch in the series eliminates quite a bit of code duplication, so
the bitmap code size ends up being smaller than the current implementation as
an added bonus.

After these are applied, it should already be possible to do multiword
allocations with dma_alloc_coherent() out of ranges established by
dma_declare_coherent_memory() on x86 without having to change any of the code,
and the SH store queue API will follow up on this as the other user that needs
support for this.

This patch:

Some code cleanup on the lib/bitmap.c bitmap_*_region() routines:

 * spacing
 * variable names
 * comments

Has no change to code function.
Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
Signed-off-by: NPaul Jackson <pj@sgi.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

87e24802

[PATCH] vfs: MS_VERBOSE should be MS_SILENT · 9b04c997

由 Theodore Ts'o 提交于 3月 24, 2006

The meaning of MS_VERBOSE is backwards; if the bit is set, it really means,
"don't be verbose".  This is confusing and counter-intuitive.

In addition, there is also no way to set the MS_VERBOSE flag in the
mount(8) program in util-linux, but interesting, it does define options
which would do the right thing if MS_SILENT were defined, which
unfortunately we do not:

#ifdef MS_SILENT
  { "quiet",    0, 0, MS_SILENT    },   /* be quiet  */
  { "loud",     0, 1, MS_SILENT    },   /* print out messages. */
#endif

So the obvious fix is to deprecate the use of MS_VERBOSE and replace it
with MS_SILENT.
Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

9b04c997

[PATCH] add sys_unshare to syscalls.h · 6961ec82

由 Arnd Bergmann 提交于 3月 24, 2006

All architecture independent system calls should be declared
in syscalls.h, add the one that is missing.
Signed-off-by: NArnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

6961ec82

[PATCH] libata: Remove dependence on host_set->dev for SAS · 2f1f610b

由 Brian King 提交于 3月 23, 2006

Remove some of the dependence on the host_set struct
in preparation for supporting SAS HBAs. Adds a struct device
pointer to the ata_port struct.
Signed-off-by: NBrian King <brking@us.ibm.com>
Signed-off-by: NJeff Garzik <jeff@garzik.org>

2f1f610b

[PATCH] libata: add ata_dev_pair helper · ebdfca6e

由 Alan Cox 提交于 3月 23, 2006

Signed-off-by: NAlan Cox <alan@redhat.com>
Signed-off-by: NJeff Garzik <jeff@garzik.org>

ebdfca6e

[PATCH] Make libata not powerdown drivers on PM_EVENT_FREEZE. · 082776e4

由 Nigel Cunningham 提交于 3月 23, 2006

At the moment libata doesn't pass pm_message_t down ata_device_suspend.
This causes drives to be powered down when we just want a freeze,
causing unnecessary wear and tear. This patch gets pm_message_t passed
down so that it can be used to determine whether to power down the
drive.
Signed-off-by: NNigel Cunningham <nigel@suspend2.net>

 drivers/scsi/libata-core.c |    5 +++--
 drivers/scsi/libata-scsi.c |    4 ++--
 drivers/scsi/scsi_sysfs.c  |    2 +-
 include/linux/libata.h     |    4 ++--
 include/scsi/scsi_host.h   |    2 +-
 5 files changed, 9 insertions(+), 8 deletions(-)
Signed-off-by: NJeff Garzik <jeff@garzik.org>

082776e4

[PATCH] libata: add per-dev pio/mwdma/udma_mask · acf356b1

由 Tejun Heo 提交于 3月 24, 2006

Add per-dev pio/mwdma/udma_mask.  All transfer mode limits used to be
applied to ap->*_mask which unnecessarily restricted other devices
sharing the port.  This change will also benefit later EH speed down
and hotplug.
Signed-off-by: NTejun Heo <htejun@gmail.com>
Signed-off-by: NJeff Garzik <jeff@garzik.org>

acf356b1

[PATCH] PCI: fix pci_request_region[s] arg · 3c990e92

由 Jeff Garzik 提交于 3月 04, 2006

    Add missing 'const' to pci_request_region[s] 'res_name' arg,
    since we pass it directly to __request_region(), whose 'name' arg
    is also const.
Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>

3c990e92

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功