提交 · 3806204ca9017a2cf2d849430f26e2b62666f5b2 · openanolis / cloud-kernel

17 10月, 2007 40 次提交

Remove workaround for unimmunized rcu_dereference from mce_log() · 3806204c

由 Paul E. McKenney 提交于 10月 16, 2007

Remove the rmb() from mce_log(), since the immunized version of
rcu_dereference() makes it unnecessary.
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3806204c

Immunize rcu_dereference() against crazy compiler writers · 97b43032

由 Paul E. McKenney 提交于 10月 16, 2007

Turns out that compiler writers are a bit more aggressive about optimizing
than one might expect.  This patch prevents a number of such optimizations
from messing up rcu_deference().  This is not merely a theoretical problem, as
evidenced by the rmb() in mce_log().
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: NJosh Triplett <josh@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

97b43032

Make unregister_binfmt() return void · f6b450d4

由 Alexey Dobriyan 提交于 10月 16, 2007

list_del() hardly can fail, so checking for return value is pointless
(and current code always return 0).

Nobody really cared that return value anyway.
Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f6b450d4

Use list_head in binfmt handling · e4dc1b14

由 Alexey Dobriyan 提交于 10月 16, 2007

Switch single-linked binfmt formats list to usual list_head's. This leads
to one-liners in register_binfmt() and unregister_binfmt(). The downside
is one pointer more in struct linux_binfmt. This is not a problem, since
the set of registered binfmts on typical box is very small -- (ELF +
something distro enabled for you).

Test-booted, played with executable .txt files, modprobe/rmmod binfmt_misc.
Signed-off-by: NAlexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e4dc1b14

fs/reiserfs/: cleanups · deba0f49

由 Adrian Bunk 提交于 10月 16, 2007

- remove the following no longer used functions:
  - bitmap.c: reiserfs_claim_blocks_to_be_allocated()
  - bitmap.c: reiserfs_release_claimed_blocks()
  - bitmap.c: reiserfs_can_fit_pages()

- make the following functions static:
  - inode.c: restart_transaction()
  - journal.c: reiserfs_async_progress_wait()
Signed-off-by: NAdrian Bunk <bunk@stusta.de>
Acked-by: NVladimir V. Saveliev <vs@namesys.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

deba0f49

writeback: don't propagate AOP_WRITEPAGE_ACTIVATE · e4230030

由 Andrew Morton 提交于 10月 16, 2007

This is a writeback-internal marker but we're propagating it all the way back
to userspace!.

Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e4230030

mm: document tree_lock->zone.lock lockorder · 7a405079

由 Nick Piggin 提交于 10月 16, 2007

zone->lock is quite an "inner" lock and mostly constrained to page alloc as
well, so like slab locks, it probably isn't something that is critically
important to document here. However unlike slab locks, zone lock could be
used more widely in future, and page_alloc.c might possibly have more
business to do tricky things with pagecache than does slab. So... I don't
think it hurts to document it.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7a405079

mm: test and set zone reclaim lock before starting reclaim · d773ed6b

由 David Rientjes 提交于 10月 16, 2007

Introduces new zone flag interface for testing and setting flags:

int zone_test_and_set_flag(struct zone *zone, zone_flags_t flag)

Instead of setting and clearing ZONE_RECLAIM_LOCKED each time shrink_zone() is
called, this flag is test and set before starting zone reclaim. Zone reclaim
starts in __alloc_pages() when a zone's watermark fails and the system is in
zone_reclaim_mode. If it's already in reclaim, there's no need to start again
so it is simply considered full for that allocation attempt.

There is a change of behavior with regard to concurrent zone shrinking. It is
now possible for try_to_free_pages() or kswapd to already be shrinking a
particular zone when __alloc_pages() starts zone reclaim. In this case, it is
possible for two concurrent threads to invoke shrink_zone() for a single zone.

This change forbids a zone to be in zone reclaim twice, which was always the
behavior, but allows for concurrent try_to_free_pages() or kswapd shrinking
when starting zone reclaim.

Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d773ed6b

oom: convert zone_scan_lock from mutex to spinlock · ae74138d

由 David Rientjes 提交于 10月 16, 2007

There's no reason to sleep in try_set_zone_oom() or clear_zonelist_oom() if
the lock can't be acquired; it will be available soon enough once the zonelist
scanning is done.  All other threads waiting for the OOM killer are also
contingent on the exiting task being able to acquire the lock in
clear_zonelist_oom() so it doesn't make sense to put it to sleep.

Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ae74138d

oom: add header file to Kbuild as unifdef · 9aad369e

由 David Rientjes 提交于 10月 16, 2007

Preprocess include/linux/oom.h before exporting it to userspace.

Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9aad369e

oom: prevent including sched.h in header file · 172acf60

由 David Rientjes 提交于 10月 16, 2007

It's not necessary to include all of linux/sched.h in linux/oom.h.  Instead,
simply include prototypes for the relevant structs and include linux/types.h
for gfp_t.

Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Acked-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

172acf60

oom: do not take callback_mutex · 3ff56696

由 David Rientjes 提交于 10月 16, 2007

Since no task descriptor's 'cpuset' field is dereferenced in the execution of
the OOM killer anymore, it is no longer necessary to take callback_mutex.

[akpm@linux-foundation.org: restore cpuset_lock for other patches]
Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3ff56696

oom: compare cpuset mems_allowed instead of exclusive ancestors · bbe373f2

由 David Rientjes 提交于 10月 16, 2007

Instead of testing for overlap in the memory nodes of the the nearest
exclusive ancestor of both current and the candidate task, it is better to
simply test for intersection between the task's mems_allowed in their task
descriptors.  This does not require taking callback_mutex since it is only
used as a hint in the badness scoring.

Tasks that do not have an intersection in their mems_allowed with the current
task are not explicitly restricted from being OOM killed because it is quite
possible that the candidate task has allocated memory there before and has
since changed its mems_allowed.

Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bbe373f2

oom: suppress extraneous stack and memory dump · 7213f506

由 David Rientjes 提交于 10月 16, 2007

Suppresses the extraneous stack and memory dump when a parallel OOM killing
has been found.  There's no need to fill the ring buffer with this information
if its already been printed and the condition that triggered the previous OOM
killer has not yet been alleviated.

Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7213f506

oom: add oom_kill_allocating_task sysctl · fe071d7e

由 David Rientjes 提交于 10月 16, 2007

Adds a new sysctl, 'oom_kill_allocating_task', which will automatically kill
the OOM-triggering task instead of scanning through the tasklist to find a
memory-hogging target.  This is helpful for systems with an insanely large
number of tasks where scanning the tasklist significantly degrades
performance.

Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fe071d7e

oom: serialize out of memory calls · ff0ceb9d

由 David Rientjes 提交于 10月 16, 2007

A final allocation attempt with a very high watermark needs to be attempted
before invoking out_of_memory().  OOM killer serialization needs to occur
before this final attempt, otherwise tasks attempting to OOM-lock all zones in
its zonelist may spin and acquire the lock unnecessarily after the OOM
condition has already been alleviated.

If the final allocation does succeed, the zonelist is simply OOM-unlocked and
__alloc_pages() returns the page.  Otherwise, the OOM killer is invoked.

If the task cannot acquire OOM-locks on all zones in its zonelist, it is put
to sleep and the allocation is retried when it gets rescheduled.  One of its
zones is already marked as being in the OOM killer so it'll hopefully be
getting some free memory soon, at least enough to satisfy a high watermark
allocation attempt.  This prevents needlessly killing a task when the OOM
condition would have already been alleviated if it had simply been given
enough time.

Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ff0ceb9d

oom: add per-zone locking · 098d7f12

由 David Rientjes 提交于 10月 16, 2007

OOM killer synchronization should be done with zone granularity so that memory
policy and cpuset allocations may have their corresponding zones locked and
allow parallel kills for other OOM conditions that may exist elsewhere in the
system. DMA allocations can be targeted at the zone level, which would not be
possible if locking was done in nodes or globally.

Synchronization shall be done with a variation of "trylocks." The goal is to
put the current task to sleep and restart the failed allocation attempt later
if the trylock fails. Otherwise, the OOM killer is invoked.

Each zone in the zonelist that __alloc_pages() was called with is checked for
the newly-introduced ZONE_OOM_LOCKED flag. If any zone has this flag present,
the "trylock" to serialize the OOM killer fails and returns zero. Otherwise,
all the zones have ZONE_OOM_LOCKED set and the try_set_zone_oom() function
returns non-zero.

098d7f12

oom: change all_unreclaimable zone member to flags · e815af95

由 David Rientjes 提交于 10月 16, 2007

Convert the int all_unreclaimable member of struct zone to unsigned long
flags.  This can now be used to specify several different zone flags such as
all_unreclaimable and reclaim_in_progress, which can now be removed and
converted to a per-zone flag.

Flags are set and cleared as follows:

	zone_set_flag(struct zone *zone, zone_flags_t flag)
	zone_clear_flag(struct zone *zone, zone_flags_t flag)

Defines the first zone flags, ZONE_ALL_UNRECLAIMABLE and ZONE_RECLAIM_LOCKED,
which have the same semantics as the old zone->all_unreclaimable and
zone->reclaim_in_progress, respectively.  Also converts all current users that
set or clear either flag to use the new interface.

Helper functions are defined to test the flags:

	int zone_is_all_unreclaimable(const struct zone *zone)
	int zone_is_reclaim_locked(const struct zone *zone)

All flag operators are of the atomic variety because there are currently
readers that are implemented that do not take zone->lock.

[akpm@linux-foundation.org: add needed include]
Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e815af95

oom: move constraints to enum · 70e24bdf

由 David Rientjes 提交于 10月 16, 2007

The OOM killer's CONSTRAINT definitions are really more appropriate in an
enum, so define them in include/linux/oom.h.

Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

70e24bdf

oom: move prototypes to appropriate header file · 5a3135c2

由 David Rientjes 提交于 10月 16, 2007

Move the OOM killer's extern function prototypes to include/linux/oom.h and
include it where necessary.

[clg@fr.ibm.com: build fix]
Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NCedric Le Goater <clg@fr.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5a3135c2

Slab API: remove useless ctor parameter and reorder parameters · 4ba9b9d0

由 Christoph Lameter 提交于 10月 16, 2007

Slab constructors currently have a flags parameter that is never used.  And
the order of the arguments is opposite to other slab functions.  The object
pointer is placed before the kmem_cache pointer.

Convert

        ctor(void *object, struct kmem_cache *s, unsigned long flags)

to

        ctor(struct kmem_cache *s, void *object)

throughout the kernel

[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4ba9b9d0

SLUB: simplify IRQ off handling · b811c202

由 Christoph Lameter 提交于 10月 16, 2007

Move irq handling out of new slab into __slab_alloc.  That is useful for
Mathieu's cmpxchg_local patchset and also allows us to remove the crude
local_irq_off in early_kmem_cache_alloc().
Signed-off-by: NChristoph Lameter <clameter@sgi.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b811c202

mm: dirty balancing for tasks · 3e26c149

由 Peter Zijlstra 提交于 10月 16, 2007

Based on ideas of Andrew:
  http://marc.info/?l=linux-kernel&m=102912915020543&w=2

Scale the bdi dirty limit inversly with the tasks dirty rate.
This makes heavy writers have a lower dirty limit than the occasional writer.

Andrea proposed something similar:
  http://lwn.net/Articles/152277/

The main disadvantage to his patch is that he uses an unrelated quantity to
measure time, which leaves him with a workload dependant tunable. Other than
that the two approaches appear quite similar.

[akpm@linux-foundation.org: fix warning]
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3e26c149

mm: per device dirty threshold · 04fbfdc1

由 Peter Zijlstra 提交于 10月 16, 2007

Scale writeback cache per backing device, proportional to its writeout speed.

By decoupling the BDI dirty thresholds a number of problems we currently have
will go away, namely:

- mutual interference starvation (for any number of BDIs);
- deadlocks with stacked BDIs (loop, FUSE and local NFS mounts).

It might be that all dirty pages are for a single BDI while other BDIs are
idling. By giving each BDI a 'fair' share of the dirty limit, each one can have
dirty pages outstanding and make progress.

A global threshold also creates a deadlock for stacked BDIs; when A writes to
B, and A generates enough dirty pages to get throttled, B will never start
writeback until the dirty pages go away. Again, by giving each BDI its own
'independent' dirty limit, this problem is avoided.

So the problem is to determine how to distribute the total dirty limit across
the BDIs fairly and efficiently. A DBI that has a large dirty limit but does
not have any dirty pages outstanding is a waste.

What is done is to keep a floating proportion between the DBIs based on
writeback completions. This way faster/more active devices get a larger share
than slower/idle devices.

[akpm@linux-foundation.org: fix warnings]
[hugh@veritas.com: Fix occasional hang when a task couldn't get out of balance_dirty_pages]
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

04fbfdc1

lib: floating proportions · 145ca25e

由 Peter Zijlstra 提交于 10月 16, 2007

Given a set of objects, floating proportions aims to efficiently give the
proportional 'activity' of a single item as compared to the whole set. Where
'activity' is a measure of a temporal property of the items.

It is efficient in that it need not inspect any other items of the set
in order to provide the answer. It is not even needed to know how many
other items there are.

It has one parameter, and that is the period of 'time' over which the
'activity' is measured.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

145ca25e

mm: count writeback pages per BDI · 69cb51d1

由 Peter Zijlstra 提交于 10月 16, 2007

Count per BDI writeback pages.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

69cb51d1

mm: count reclaimable pages per BDI · c9e51e41

由 Peter Zijlstra 提交于 10月 16, 2007

Count per BDI reclaimable pages; nr_reclaimable = nr_dirty + nr_unstable.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c9e51e41

mm: scalable bdi statistics counters · b2e8fb6e

由 Peter Zijlstra 提交于 10月 16, 2007

Provide scalable per backing_dev_info statistics counters.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b2e8fb6e

mm: bdi init hooks · e0bf68dd

由 Peter Zijlstra 提交于 10月 16, 2007

provide BDI constructor/destructor hooks

[akpm@linux-foundation.org: compile fix]
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e0bf68dd

lib: percpu_counter_init_irq · dc62a30e

由 Peter Zijlstra 提交于 10月 16, 2007

provide a way to tell lockdep about percpu_counters that are supposed to be
used from irq safe contexts.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

dc62a30e

lib: percpu_counter_init error handling · 833f4077

由 Peter Zijlstra 提交于 10月 16, 2007

alloc_percpu can fail, propagate that error.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

833f4077

lib: percpu_count_sum() · bf1d89c8

由 Peter Zijlstra 提交于 10月 16, 2007

Provide an accurate version of percpu_counter_read.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bf1d89c8

lib: percpu_counter_sum_positive · 52d9f3b4

由 Peter Zijlstra 提交于 10月 16, 2007

 s/percpu_counter_sum/&_positive/

Because its consitent with percpu_counter_read*
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

52d9f3b4

lib: percpu_counter_set · 3a587f47

由 Peter Zijlstra 提交于 10月 16, 2007

Provide a method to set a percpu counter to a specified value.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3a587f47

lib: make percpu_counter_add take s64 · 20e89767

由 Peter Zijlstra 提交于 10月 16, 2007

percpu_counter is a s64 counter, make _add consitent.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

20e89767

lib: percpu_counter variable batch · 252e0ba6

由 Peter Zijlstra 提交于 10月 16, 2007

Because the current batch setup has an quadric error bound on the counter,
allow for an alternative setup.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

252e0ba6

lib: percpu_counter_sub · 3cb4f9fa

由 Peter Zijlstra 提交于 10月 16, 2007

Hugh spotted that some code does:
  percpu_counter_add(&counter, -unsignedlong)

which, when the amount argument is of type s32, sort-of works thanks to
two's-complement. However when we'd change the type to s64 this breaks on 32bit
machines, because the promotion rules zero extend the unsigned number.

Provide percpu_counter_sub() to hide the s64 cast. That is:
  percpu_counter_sub(&counter, foo)
is equal to:
  percpu_counter_add(&counter, -(s64)foo);
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3cb4f9fa

lib: percpu_counter_add · aa0dff2d

由 Peter Zijlstra 提交于 10月 16, 2007

 s/percpu_counter_mod/percpu_counter_add/

Because its a better name, _mod implies modulo.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

aa0dff2d

nfs: remove congestion_end() · c4dc4bee

由 Peter Zijlstra 提交于 10月 16, 2007

These patches aim to improve balance_dirty_pages() and directly address three
issues:
  1) inter device starvation
  2) stacked device deadlocks
  3) inter process starvation

1 and 2 are a direct result from removing the global dirty limit and using
per device dirty limits. By giving each device its own dirty limit is will
no longer starve another device, and the cyclic dependancy on the dirty limit
is broken.

In order to efficiently distribute the dirty limit across the independant
devices a floating proportion is used, this will allocate a share of the total
limit proportional to the device's recent activity.

3 is done by also scaling the dirty limit proportional to the current task's
recent dirty rate.

This patch:

nfs: remove congestion_end().  It's redundant, clear_bdi_congested() already
wakes the waiters.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c4dc4bee

powerpc: add Altivec/VMX state to coredumps · 1f7d6668

由 Mark Nelson 提交于 10月 16, 2007

Update dump_task_altivec() (which has so far never been put to use) so that
it dumps the Altivec/VMX registers (VR[0] - VR[31], VSCR and VRSAVE) in the
same format as the ptrace get_vrregs(), and add the appropriate glue
typedef and #defines to make it work.

A new note type of NT_PPC_VMX was chosen to be 0x100 (arbitrarily) because
it allows the low range values to be used for more generic purposes and
0x100 seems an adequate starting point for PowerPC extensions.
Signed-off-by: NMark Nelson <markn@au1.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1f7d6668

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功