1. 31 Aug 2009, 2 commits
  2. 27 Aug 2009, 2 commits
  3. 19 Aug 2009, 8 commits
  4. 18 Aug 2009, 1 commit
  5. 17 Aug 2009, 4 commits
  6. 13 Aug 2009, 3 commits
  7. 12 Aug 2009, 1 commit
  8. 10 Aug 2009, 4 commits
    • perf_counter: Zero dead bytes from ftrace raw samples size alignment · 1853db0e
      Committed by Frederic Weisbecker
      After aligning the ftrace raw samples, the trailing bytes added by the
      alignment contain random data from the stack. We don't want to leak
      these to userspace, so zero them out.
      
      Before:
      
      	0x2de88 [0x50]: event: 9
      	.
      	. ... raw event: size 80 bytes
      	.  0000:  09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff  ......P........
      	.  0010:  68 01 00 00 68 01 00 00 2c 00 00 00 00 00 00 00  h...h...,......
      	.  0020:  2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00  ,...+...h...h..
      	.  0030:  6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00  kondemand/0....
      	.  0040:  68 01 00 00 40 7f 46 81 ff ff ff ff 00 10 1b 7f  h...@.F........
                                                            ^  ^  ^  ^
                                                               Leak
      
      After:
      
      	0x2d318 [0x50]: event: 9
      	.
      	. ... raw event: size 80 bytes
      	.  0000:  09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff  ......P........
      	.  0010:  68 01 00 00 68 01 00 00 68 14 00 00 00 00 00 00  h...h...h......
      	.  0020:  2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00  ,...+...h...h..
      	.  0030:  6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00  kondemand/0....
      	.  0040:  68 01 00 00 a0 80 46 81 ff ff ff ff 00 00 00 00  h.....F........
                                                            ^  ^  ^  ^
                                                         Fixed
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249915116-5210-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1853db0e
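      A minimal C sketch of the zero-fill idea described above (illustrative only;
      the function and variable names are assumptions, not the actual kernel code):

      	/* Pad the raw ftrace record up to a u64 boundary and zero the padding
      	 * so no stack garbage leaks to userspace. */
      	#include <string.h>
      	#include <stdint.h>

      	static size_t emit_raw_sample(void *buf, const void *entry, uint32_t raw_size)
      	{
      		/* the raw payload is aligned to 8 bytes, like a perf raw sample */
      		uint32_t aligned = (raw_size + 7u) & ~7u;

      		memcpy(buf, entry, raw_size);
      		/* zero the dead bytes introduced by the alignment */
      		memset((char *)buf + raw_size, 0, aligned - raw_size);
      		return aligned;
      	}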
    • perf_counter: Subtract the buffer size field from the event record size · 304703ab
      Committed by Frederic Weisbecker
      We compute the perf raw sample size by aligning the raw ftrace
      event size plus the buffer size field itself. We do that instead
      of aligning only the perf raw sample size, so that we can save
      some space in some cases.
      
      But this buffer size field is not stored in the perf raw sample,
      so we must subtract its size from the buffer once the alignment
      has been computed, otherwise we end up with a useless u32 field
      in the buffer.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090810141129.GA5124@nowhere>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      304703ab
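      A sketch of the size computation the patch describes, assuming the u32 size
      header is aligned together with the payload and then excluded from what is
      stored (names are illustrative):

      	#include <stdint.h>

      	#define ALIGN8(x)	(((x) + 7u) & ~7u)

      	/* Align the ftrace record plus the u32 size field, then subtract the
      	 * size field again: it is emitted separately and must not also occupy
      	 * the raw buffer. */
      	static uint32_t raw_sample_payload_size(uint32_t ftrace_record_size)
      	{
      		return ALIGN8(ftrace_record_size + sizeof(uint32_t)) - sizeof(uint32_t);
      	}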
    • locking, sched: Give waitqueue spinlocks their own lockdep classes · 2fc39111
      Committed by Peter Zijlstra
      Give waitqueue spinlocks their own lockdep classes when they
      are initialised from init_waitqueue_head().  This means that
      struct wait_queue::func functions can operate on other waitqueues.
      
      This is used by CacheFiles to catch the page from a backing fs
      being unlocked and to wake up another thread to take a copy of
      it.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Takashi Iwai <tiwai@suse.de>
      Cc: linux-cachefs@redhat.com
      Cc: torvalds@osdl.org
      Cc: akpm@linux-foundation.org
      LKML-Reference: <20090810113305.17284.81508.stgit@warthog.procyon.org.uk>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2fc39111
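      The usual kernel pattern for giving each initialisation site its own lockdep
      class is a static lock_class_key captured by the init macro; a simplified
      sketch of that pattern (not a verbatim copy of the wait.h change):

      	#define init_waitqueue_head(q)					\
      		do {							\
      			static struct lock_class_key __key;		\
      			__init_waitqueue_head((q), &__key);		\
      		} while (0)

      	void __init_waitqueue_head(wait_queue_head_t *q, struct lock_class_key *key)
      	{
      		spin_lock_init(&q->lock);
      		lockdep_set_class(&q->lock, key);	/* one class per init site */
      		INIT_LIST_HEAD(&q->task_list);
      	}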
    • perf_counter: Correct PERF_SAMPLE_RAW output · a044560c
      Committed by Peter Zijlstra
      PERF_SAMPLE_* output switches should unconditionally output the
      correct format, as they are the only way to unambiguously parse
      the PERF_EVENT_SAMPLE data.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249896447.17467.74.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a044560c
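      Roughly what the unconditional PERF_SAMPLE_RAW output looks like: the u32 size
      header is always emitted, even when a counter carries no raw payload, so the
      record stays parseable (a sketch, field names may differ from the final code):

      	if (sample_type & PERF_SAMPLE_RAW) {
      		if (data->raw) {
      			perf_output_put(&handle, data->raw->size);
      			perf_output_copy(&handle, data->raw->data, data->raw->size);
      		} else {
      			struct {
      				u32	size;
      				u32	data;
      			} raw = {
      				.size = sizeof(u32),	/* minimal, zero-filled raw area */
      				.data = 0,
      			};
      			perf_output_put(&handle, raw);
      		}
      	}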
  9. 09 Aug 2009, 3 commits
    • perf_counter: Fix tracepoint sampling to be part of generic sampling · 3a43ce68
      Committed by Frederic Weisbecker
      Based on Peter's comments, make tracepoint sampling generic
      just like all the other sampling bits are. This is a rename
      with no code changes:
      
      - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW
      - struct perf_tracepoint_record to perf_raw_record
      
      We want the mechanism that transports tracepoint raw sample
      events into the perf ring buffer to be generalized and usable
      by any type of counter.
      
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a43ce68
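      For reference, the generic raw-sample carrier that results from the rename is
      tiny; roughly, as it looked around this time (a sketch from memory, not a
      verbatim copy):

      	struct perf_raw_record {
      		u32	size;
      		void	*data;
      	};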
    • perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Committed by Frederic Weisbecker
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests sampling of ftrace event records. In this case,
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the perf counter
      event buffer as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback has a cost,
         however, even when nobody wants such sampling to occur, and
         this needs to be fixed in the future. For that we need instant
         access to the perf counter attribute. This is a matter of
         adding a flag to struct ftrace_event.
      
       - Take care of event recursion! Don't ever try to record a lock
         event, for example: some locking is used in the profiling fast
         path and leads to tracing recursion. That will be fixed using
         raw spinlocks or recursion protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f413cdb8
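      A hedged C rendering of the header decoded in the "Translation" above, matching
      the byte values at offset 0020 of the dump (field widths follow the 2.6.31-era
      struct trace_entry as best remembered):

      	struct trace_entry {
      		unsigned short	type;		/* 0x002b = 43      */
      		unsigned char	flags;		/* 0x01              */
      		unsigned char	preempt_count;	/* 0x02              */
      		int		pid;		/* 0x0000000a = 10   */
      		int		tgid;		/* 0x0000000a = 10   */
      	};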
    • perf_counter, ftrace: Fix perf_counter integration · 3a659305
      Committed by Peter Zijlstra
      Adds a possible second part to the assign argument of TP_EVENT().
      
        TP_perf_assign(
      	__perf_count(foo);
      	__perf_addr(bar);
        )
      
      Which, when specified, makes the swcounter increment with @foo instead
      of the usual 1, and reports @bar for PERF_SAMPLE_ADDR (the data address
      associated with the event) when this triggers a counter overflow.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a659305
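      A sketch of where TP_perf_assign() sits inside a TRACE_EVENT() definition; the
      event name and fields below are made up for illustration:

      	TRACE_EVENT(foo_bar,
      		TP_PROTO(u64 count, unsigned long addr),
      		TP_ARGS(count, addr),
      		TP_STRUCT__entry(
      			__field(u64,		count)
      			__field(unsigned long,	addr)
      		),
      		TP_fast_assign(
      			__entry->count	= count;
      			__entry->addr	= addr;
      		)
      		TP_perf_assign(
      			__perf_count(count);	/* swcounter increments by count, not 1 */
      			__perf_addr(addr);	/* reported as PERF_SAMPLE_ADDR on overflow */
      		),
      		TP_printk("count=%llu addr=%lx", __entry->count, __entry->addr)
      	);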
  10. 08 Aug 2009, 4 commits
    • bzip2/lzma/gzip: fix comments describing decompressor API · daeb6b6f
      Committed by Phillip Lougher
      Fix and improve comments in decompress/generic.h that describe the
      decompressor API.  Also remove an unused definition, and rename INBUF_LEN
      in lib/decompress_inflate.c to conform to bzip2/lzma naming.
      Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      daeb6b6f
    • mm: make set_mempolicy(MPOL_INTERLEAV) N_HIGH_MEMORY aware · 4bfc4495
      Committed by KAMEZAWA Hiroyuki
      At first, init_task's mems_allowed is initialized as follows:
       init_task->mems_allowed == node_state[N_POSSIBLE]
      
      And cpuset's top_cpuset mask is initialized as follows:
       top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY]
      
      Before 2.6.29:
      a policy's mems_allowed was initialized like this:
      
        1. update task->mems_allowed from its cpuset->mems_allowed.
        2. policy->mems_allowed = nodes_and(task->mems_allowed, user's mask)
      
      The task's mems_allowed was updated with reference to top_cpuset's,
      and cpuset's mems_allowed is always aware of N_HIGH_MEMORY.
      
      In 2.6.30, after commit 58568d2a
      ("cpuset,mm: update tasks' mems_allowed in time"), a policy's mems_allowed
      is initialized like this:
      
        1. policy->mems_allowed = nodes_and(task->mems_allowed, user's mask)
      
      Here, if the task is in top_cpuset, task->mems_allowed is not updated from
      init's one.  Assume the user executes a command such as #numactl --interleave=all
      ,....
      
        policy->mems_allowed = nodes_and(N_POSSIBLE, ALL_SET_MASK)
      
      Then, the policy's mems_allowed can include a possible node which has no pgdat.
      
      MPOL_INTERLEAVE just scans the nodemask of task->mems_allowed and accesses
      
        NODE_DATA(nid)->zonelist
      
      directly, even if NODE_DATA(nid) == NULL.
      
      So what we need is to make policy->mems_allowed aware of
      N_HIGH_MEMORY.  This patch does that.  But to do so, an extra nodemask
      would end up on the stack.  Because cpumask has the new CPUMASK_ALLOC()
      interface, I added an equivalent for nodemasks.
      
      This patch keeps the old behavior.  But I feel this fix itself is just a
      Band-Aid: to do a fundamental fix, we have to take care of memory
      hotplug, and that takes time.  (task->mems_allowed should be N_HIGH_MEMORY, I
      think.)
      
      mpol_set_nodemask() should be aware of N_HIGH_MEMORY, and the policy's nodemask
      should include only online nodes.
      
      In the old behavior, this was guaranteed by frequent references into cpuset
      code.  Now, most of those are removed and mempolicy has to check it by
      itself.
      
      To do the check, a few nodemask_t values are used for calculating the nodemask.
      But the size of nodemask_t can be big and it's not good to allocate them on the
      stack.
      
      cpumask_t already has CPUMASK_ALLOC/FREE, an easy way to get a scratch area;
      NODEMASK_ALLOC/FREE should exist as well.
      
      [akpm@linux-foundation.org: cleanups & tweaks]
      Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Miao Xie <miaox@cn.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Paul Menage <menage@google.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4bfc4495
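      A hedged sketch of the intersection the patch wants, keeping the scratch
      nodemask off the stack in the spirit of the NODEMASK_ALLOC/FREE helpers
      mentioned above (the surrounding code and field names are approximate):

      	/* Intersect the user-supplied mask with nodes that actually have
      	 * (high) memory; nodemask_t can be large, so allocate the scratch
      	 * mask instead of putting it on the stack. */
      	nodemask_t *scratch = kmalloc(sizeof(*scratch), GFP_KERNEL);
      	if (!scratch)
      		return -ENOMEM;

      	nodes_and(*scratch, *user_nodes, node_states[N_HIGH_MEMORY]);
      	pol->v.nodes = *scratch;	/* only online, memory-bearing nodes remain */

      	kfree(scratch);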
    • vfs: add __destroy_inode · 2e00c97e
      Committed by Christoph Hellwig
      When we want to tear down an inode that lost the add to the cache race
      in XFS we must not call into ->destroy_inode because that would delete
      the inode that won the race from the inode cache radix tree.
      
      This patch provides the __destroy_inode helper needed to fix this;
      the actual fix will be in the next patch.  As XFS was the only reason
      destroy_inode was exported, we shift the export to the new __destroy_inode.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
      2e00c97e
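      A simplified sketch of the split this patch introduces: the generic teardown
      moves into __destroy_inode(), and destroy_inode() keeps the freeing path, so
      a caller can dispose of a losing inode without going through ->destroy_inode:

      	/* Tear down generic inode state without freeing or calling into the fs. */
      	void __destroy_inode(struct inode *inode)
      	{
      		BUG_ON(inode_has_buffers(inode));
      		security_inode_free(inode);
      	}
      	EXPORT_SYMBOL(__destroy_inode);

      	void destroy_inode(struct inode *inode)
      	{
      		__destroy_inode(inode);
      		if (inode->i_sb->s_op->destroy_inode)
      			inode->i_sb->s_op->destroy_inode(inode);
      		else
      			kmem_cache_free(inode_cachep, inode);
      	}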
    • vfs: fix inode_init_always calling convention · 54e34621
      Committed by Christoph Hellwig
      Currently inode_init_always calls into ->destroy_inode if the additional
      initialization fails.  That's not only counter-intuitive because
      inode_init_always did not allocate the inode structure, but in case of
      XFS it's actively harmful, as ->destroy_inode might delete the inode from
      a radix-tree that it has never been added to.  This in turn might end up
      deleting the inode for the same inum that has been instantiated by
      another process and cause lots of subtle problems.
      
      Also in the case of re-initializing a reclaimable inode in XFS it would
      free an inode we still want to keep alive.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
      54e34621
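      A sketch of the new calling convention: inode_init_always() now reports
      failure, and the code that allocated the inode decides how to dispose of it
      (a simplified view of the generic allocation path; details may differ):

      	/* inside the generic allocator: initialize after allocation, and on
      	 * failure free the inode ourselves instead of relying on the callee */
      	if (sb->s_op->alloc_inode)
      		inode = sb->s_op->alloc_inode(sb);
      	else
      		inode = kmem_cache_alloc(inode_cachep, GFP_KERNEL);
      	if (!inode)
      		return NULL;

      	if (unlikely(inode_init_always(sb, inode))) {
      		if (sb->s_op->destroy_inode)
      			sb->s_op->destroy_inode(inode);
      		else
      			kmem_cache_free(inode_cachep, inode);
      		return NULL;
      	}
      	return inode;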
  11. 06 Aug 2009, 2 commits
  12. 05 Aug 2009, 3 commits
  13. 04 Aug 2009, 3 commits