提交 · 8ebc423238341b52912c7295b045a32477b33f09 · gsplhtlxg / clone-Linux

14 9月, 2009 1 次提交

由 Frederic Weisbecker 提交于 4月 07, 2009

This patch is an attempt to remove the Bkl based locking scheme from
reiserfs and is intended.

It is a bit inspired from an old attempt by Peter Zijlstra:

   http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html

The bkl is heavily used in this filesystem to prevent from
concurrent write accesses on the filesystem.

Reiserfs makes a deep use of the specific properties of the Bkl:

- It can be acqquired recursively by a same task
- It is released on the schedule() calls and reacquired when schedule() returns

The two properties above are a roadmap for the reiserfs write locking so it's
very hard to simply replace it with a common mutex.

- We need a recursive-able locking unless we want to restructure several blocks
  of the code.
- We need to identify the sites where the bkl was implictly relaxed
  (schedule, wait, sync, etc...) so that we can in turn release and
  reacquire our new lock explicitly.
  Such implicit releases of the lock are often required to let other
  resources producer/consumer do their job or we can suffer unexpected
  starvations or deadlocks.

So the new lock that replaces the bkl here is a per superblock mutex with a
specific property: it can be acquired recursively by a same task, like the
bkl.

For such purpose, we integrate a lock owner and a lock depth field on the
superblock information structure.

The first axis on this patch is to turn reiserfs_write_(un)lock() function
into a wrapper to manage this mutex. Also some explicit calls to
lock_kernel() have been converted to reiserfs_write_lock() helpers.

The second axis is to find the important blocking sites (schedule...(),
wait_on_buffer(), sync_dirty_buffer(), etc...) and then apply an explicit
release of the write lock on these locations before blocking. Then we can
safely wait for those who can give us resources or those who need some.
Typically this is a fight between the current writer, the reiserfs workqueue
(aka the async commiter) and the pdflush threads.

The third axis is a consequence of the second. The write lock is usually
on top of a lock dependency chain which can include the journal lock, the
flush lock or the commit lock. So it's dangerous to release and trying to
reacquire the write lock while we still hold other locks.

This is fine with the bkl:

      T1                       T2

lock_kernel()
    mutex_lock(A)
    unlock_kernel()
    // do something
                            lock_kernel()
                                mutex_lock(A) -> already locked by T1
                                schedule() (and then unlock_kernel())
    lock_kernel()
    mutex_unlock(A)
    ....

This is not fine with a mutex:

      T1                       T2

mutex_lock(write)
    mutex_lock(A)
    mutex_unlock(write)
    // do something
                           mutex_lock(write)
                              mutex_lock(A) -> already locked by T1
                              schedule()

    mutex_lock(write) -> already locked by T2
    deadlock

The solution in this patch is to provide a helper which releases the write
lock and sleep a bit if we can't lock a mutex that depend on it. It's another
simulation of the bkl behaviour.

The last axis is to locate the fs callbacks that are called with the bkl held,
according to Documentation/filesystem/Locking.

Those are:

- reiserfs_remount
- reiserfs_fill_super
- reiserfs_put_super

Reiserfs didn't need to explicitly lock because of the context of these callbacks.
But now we must take care of that with the new locking.

After this patch, reiserfs suffers from a slight performance regression (for now).
On UP, a high volume write with dd reports an average of 27 MB/s instead
of 30 MB/s without the patch applied.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: NIngo Molnar <mingo@elte.hu>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Bron Gondwana <brong@fastmail.fm>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
LKML-Reference: <1239070789-13354-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

8ebc4232

10 9月, 2009 1 次提交
- L
  
  Linux 2.6.31 · 74fca6a4
  由 Linus Torvalds 提交于 9月 09, 2009
  
  74fca6a4
09 9月, 2009 3 次提交

aoe: allocate unused request_queue for sysfs · 7135a71b

由 Ed Cashin 提交于 9月 09, 2009

Andy Whitcroft reported an oops in aoe triggered by use of an
incorrectly initialised request_queue object:

  [ 2645.959090] kobject '<NULL>' (ffff880059ca22c0): tried to add
		an uninitialized object, something is seriously wrong.
  [ 2645.959104] Pid: 6, comm: events/0 Not tainted 2.6.31-5-generic #24-Ubuntu
  [ 2645.959107] Call Trace:
  [ 2645.959139] [<ffffffff8126ca2f>] kobject_add+0x5f/0x70
  [ 2645.959151] [<ffffffff8125b4ab>] blk_register_queue+0x8b/0xf0
  [ 2645.959155] [<ffffffff8126043f>] add_disk+0x8f/0x160
  [ 2645.959161] [<ffffffffa01673c4>] aoeblk_gdalloc+0x164/0x1c0 [aoe]

The request queue of an aoe device is not used but can be allocated in
code that does not sleep.

Bruno bisected this regression down to

  cd43e26f

  block: Expose stacked device queues in sysfs

"This seems to generate /sys/block/$device/queue and its contents for
 everyone who is using queues, not just for those queues that have a
 non-NULL queue->request_fn."

Addresses http://bugs.launchpad.net/bugs/410198
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13942

Note that embedding a queue inside another object has always been
an illegal construct, since the queues are reference counted and
must persist until the last reference is dropped. So aoe was
always buggy in this respect (Jens).
Signed-off-by: NEd Cashin <ecashin@coraid.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Bruno Premont <bonbons@linux-vserver.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NJens Axboe <jens.axboe@oracle.com>

7135a71b

i915: disable interrupts before tearing down GEM state · e6890f6f

由 Linus Torvalds 提交于 9月 08, 2009

Reinette Chatre reports a frozen system (with blinking keyboard LEDs)
when switching from graphics mode to the text console, or when
suspending (which does the same thing). With netconsole, the oops
turned out to be

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
	IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915]

and it's due to the i915_gem.c code doing drm_irq_uninstall() after
having done i915_gem_idle(). And the i915_gem_idle() path will do

  i915_gem_idle() ->
    i915_gem_cleanup_ringbuffer() ->
      i915_gem_cleanup_hws() ->
        dev_priv->hw_status_page = NULL;

but if an i915 interrupt comes in after this stage, it may want to
access that hw_status_page, and gets the above NULL pointer dereference.

And since the NULL pointer dereference happens from within an interrupt,
and with the screen still in graphics mode, the common end result is
simply a silently hung machine.

Fix it by simply uninstalling the irq handler before idling rather than
after. Fixes

    http://bugzilla.kernel.org/show_bug.cgi?id=13819Reported-and-tested-by: NReinette Chatre <reinette.chatre@intel.com>
Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e6890f6f

drm/i915: fix mask bits setting · 7c8460db

由 Zhenyu Wang 提交于 9月 08, 2009

eDP is exclusive connector too, and add missing crtc_mask
setting for TV.

This fixes

	http://bugzilla.kernel.org/show_bug.cgi?id=14139Signed-off-by: NZhenyu Wang <zhenyuw@linux.intel.com>
Reported-and-tested-by: NCarlos R. Mafra <crmafra2@gmail.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7c8460db

08 9月, 2009 5 次提交

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · 3ff323f8

由 Linus Torvalds 提交于 9月 07, 2009

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register.

3ff323f8

Merge branch 'for-linus' of... · 755ae761

由 Linus Torvalds 提交于 9月 07, 2009

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  IMA: update ima_counts_put

755ae761

L
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · 4886b5b4
由 Linus Torvalds 提交于 9月 07, 2009
```
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  gianfar: Fix build.
```
4886b5b4
L
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6 · cbeb2864
由 Linus Torvalds 提交于 9月 07, 2009
```
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
  pcmcia: add CNF-CDROM-ID for ide
```
cbeb2864

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel · f69fb9c3

由 Linus Torvalds 提交于 9月 07, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
  agp/intel: support for new chip variant of IGDNG mobile
  drm/i915: Unref old_obj on get_fence_reg() error path
  drm/i915: increase default latency constant (v2 w/comment)

f69fb9c3

07 9月, 2009 2 次提交

drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register. · a54775c8

由 Dave Airlie 提交于 9月 07, 2009

This adds some rv350+ register for LTE/GTE discard,
and enables the rv515 two sided stencil register.
It also disables the DEPTHXY_OFFSET register which
can be used to workaround the CS checker.
Moves rs690 to proper place in rs600 and uses correct
table on rs600.
Signed-off-by: NDave Airlie <airlied@redhat.com>

a54775c8

IMA: update ima_counts_put · acd0c935

由 Mimi Zohar 提交于 9月 04, 2009

- As ima_counts_put() may be called after the inode has been freed,
verify that the inode is not NULL, before dereferencing it.

- Maintain the IMA file counters in may_open() properly, decrementing
any counter increments on subsequent errors.
Reported-by: NCiprian Docan <docan@eden.rutgers.edu>
Reported-by: NJ.R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: NMimi Zohar <zohar@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com
Signed-off-by: NJames Morris <jmorris@namei.org>

acd0c935

06 9月, 2009 28 次提交

gianfar: Fix build. · d9d8e041

由 David S. Miller 提交于 9月 06, 2009

Reported by Michael Guntsche <mike@it-loops.com>

--------------------
Commit
38bddf04 gianfar: gfar_remove needs to call unregister_netdev()

breaks the build of the gianfar driver because "dev" is undefined in
this function. To quickly test rc9 I changed this to priv->ndev but I do
not know if this is the correct one.
--------------------
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d9d8e041

L

Linux 2.6.31-rc9 · e07cccf4
由 Linus Torvalds 提交于 9月 05, 2009

e07cccf4

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 · f815c335

由 Linus Torvalds 提交于 9月 05, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: sbp2: fix freeing of unallocated memory
  firewire: ohci: fix Ricoh R5C832, video reception
  firewire: ohci: fix Agere FW643 and multiple cameras
  firewire: core: fix crash in iso resource management

f815c335

powerpc: Fix i8259 interrupt driver kernel crash on ML510 · 74a01180

由 Roderick Colenbrander 提交于 9月 03, 2009

This patch fixes a null pointer exception caused by removal of
'ack()' for level interrupts in the Xilinx interrupt driver.  A recent
change to the xilinx interrupt controller removed the ack hook for
level irqs.
Signed-off-by: NRoderick Colenbrander <thunderbird2k@gmail.com>
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

74a01180

Merge git://git.infradead.org/~dwmw2/mtd-2.6.31 · 5136a6c0

由 Linus Torvalds 提交于 9月 05, 2009

* git://git.infradead.org/~dwmw2/mtd-2.6.31:
  JFFS2: add missing verify buffer allocation/deallocation
  mtd: nftl: fix offset alignments
  mtd: nftl: write support is broken
  mtd: m25p80: fix null pointer dereference bug

5136a6c0

L
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · e505a8d5
由 Linus Torvalds 提交于 9月 05, 2009
```
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Allow changing max_sectors_kb above the default 512
```
e505a8d5

Merge branch 'fix/oxygen' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 · b71b7dc0

由 Linus Torvalds 提交于 9月 05, 2009

* 'fix/oxygen' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  sound: oxygen: handle cards with missing EEPROM
  sound: oxygen: fix MCLK rate for 192 kHz playback

b71b7dc0

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · 59430c2f

由 Linus Torvalds 提交于 9月 05, 2009

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  tc: Fix unitialized kernel memory leak
  pkt_sched: Revert tasklet_hrtimer changes.
  net: sk_free() should be allowed right after sk_alloc()
  gianfar: gfar_remove needs to call unregister_netdev()
  ipw2200: firmware DMA loading rework

59430c2f

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · e9ee3a54

由 Linus Torvalds 提交于 9月 05, 2009

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: skcipher - Fix skcipher_dequeue_givcrypt NULL test

e9ee3a54

L
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq · 3bb314f0
由 Linus Torvalds 提交于 9月 05, 2009
```
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Re-enable cpufreq suspend and resume code
```
3bb314f0

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 · 535e0c17

由 Linus Torvalds 提交于 9月 05, 2009

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] fix csum_ipv6_magic()
  [IA64] Fix warning in dma-mapping.c

535e0c17

L
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · 0edfa2b1
由 Linus Torvalds 提交于 9月 05, 2009
```
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: actually enable the swapext compat handler
```
0edfa2b1

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 · 5a09adf1

由 Linus Torvalds 提交于 9月 05, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key

5a09adf1

L
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu · 931f7035
由 Linus Torvalds 提交于 9月 05, 2009
```
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: don't assume existence of cpu0
```
931f7035
L
Merge branch 'slab/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 · e305fc5e
由 Linus Torvalds 提交于 9月 05, 2009
```
* 'slab/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
```
e305fc5e

Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm · 154f807e

由 Linus Torvalds 提交于 9月 05, 2009

* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm snapshot: fix on disk chunk size validation
  dm exception store: split set_chunk_size
  dm snapshot: fix header corruption race on invalidation
  dm snapshot: refactor zero_disk_area to use chunk_io
  dm log: userspace add luid to distinguish between concurrent log instances
  dm raid1: do not allow log_failure variable to unset after being set
  dm log: remove incorrect field from userspace table output
  dm log: fix userspace status output
  dm stripe: expose correct io hints
  dm table: add more context to terse warning messages
  dm table: fix queue_limit checking device iterator
  dm snapshot: implement iterate devices
  dm multipath: fix oops when request based io fails when no paths

154f807e

L
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 · 9b6a3df3
由 Linus Torvalds 提交于 9月 05, 2009
```
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI SR-IOV: correct broken resource alignment calculations
```
9b6a3df3

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 · d3acd16c

由 Linus Torvalds 提交于 9月 05, 2009

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Fix bootup with mcount in some configs.
  sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.

d3acd16c

Merge branch 'perfcounters-fixes-for-linus' of... · 93697a3c

由 Linus Torvalds 提交于 9月 05, 2009

Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf_counter/powerpc: Fix cache event codes for POWER7
  perf_counter: Fix /0 bug in swcounters
  perf_counters: Increase paranoia level

93697a3c

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 63995344

由 Linus Torvalds 提交于 9月 05, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: atkbd - add Compaq Presario R4000-series repeat quirk
  Input: i8042 - add Acer Aspire 5536 to the nomux list

63995344

ext2: fix unbalanced kmap()/kunmap() · 9de6886e

由 Nicolas Pitre 提交于 9月 05, 2009

In ext2_rename(), dir_page is acquired through ext2_dotdot().  It is
then released through ext2_set_link() but only if old_dir != new_dir.
Failing that, the pkmap reference count is never decremented and the
page remains pinned forever.  Repeat that a couple times with highmem
pages and all pkmap slots get exhausted, and every further kmap() calls
end up stalling on the pkmap_map_wait queue at which point the whole
system comes to a halt.
Signed-off-by: NNicolas Pitre <nico@marvell.com>
Acked-by: NTheodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9de6886e

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 · ac7ac9f2

由 Linus Torvalds 提交于 9月 05, 2009

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
  ocfs2: ocfs2_write_begin_nolock() should handle len=0
  ocfs2: invalidate dentry if its dentry_lock isn't initialized.

ac7ac9f2

pty: don't limit the writes to 'pty_space()' inside 'pty_write()' · ac89a917

由 Linus Torvalds 提交于 9月 05, 2009

The whole write-room thing is something that is up to the _caller_ to
worry about, not the pty layer itself.  The total buffer space will
still be limited by the buffering routines themselves, so there is no
advantage or need in having pty_write() artificially limit the size
somehow.

And what happened was that the caller (the n_tty line discipline, in
this case) may have verified that there is room for 2 bytes to be
written (for NL -> CRNL expansion), and it used to then do those writes
as two single-byte writes.  And if the first byte written (CR) then
caused a new tty buffer to be allocated, pty_space() may have returned
zero when trying to write the second byte (LF), and then incorrectly
failed the write - leading to a lost newline character.

This should finally fix

	http://bugzilla.kernel.org/show_bug.cgi?id=14015Reported-by: NMikael Pettersson <mikpe@it.uu.se>
Acked-by: NAlan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ac89a917

n_tty: do O_ONLCR translation as a single write · 37f81fa1

由 Linus Torvalds 提交于 9月 05, 2009

When translating CR to CRNL in the n_tty line discipline, we did it as
two tty_put_char() calls. Which works, but is stupid, and has caused
problems before too with bad interactions with the write_room() logic.
The generic USB serial driver had that problem, for example.

Now the pty layer had similar issues after being moved to the generic
tty buffering code (in commit d945cb9c:
"pty: Rework the pty layer to use the normal buffering logic").

So stop doing the silly separate two writes, and do it as a single write
instead. That's what the n_tty layer already does for the space
expansion of tabs (XTABS), and it means that we'll now always have just
a single write for the CRNL to match the single 'tty_write_room()' test,
which hopefully means that the next time somebody screws up buffering,
it won't cause weeks of debugging.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

37f81fa1

exec: do not sleep in TASK_TRACED under ->cred_guard_mutex · a2a8474c

由 Oleg Nesterov 提交于 9月 05, 2009

Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since

	"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
	04b836cbf19e885f8366bccb2e4b0474346c02d

commit in 2.6.31.

But the root of the problem lies in the fact that do_execve() path calls
tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.

The tracee must not sleep in TASK_TRACED holding this mutex.  Even if we
remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
another task doing PTRACE_ATTACH should not hang until it is killed or the
tracee resumes.

With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:

	- introduce prepare_bprm_creds() helper, it locks the mutex
	  and calls prepare_exec_creds() to initialize bprm->cred.

	- install_exec_creds() drops the mutex after commit_creds(),
	  and thus before tracehook_report_exec()->ptrace_stop().

	  or, if exec fails,

	  free_bprm() drops this mutex when bprm->cred != NULL which
	  indicates install_exec_creds() was not called.
Reported-by: NTom Horsley <tom.horsley@att.net>
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NDavid Howells <dhowells@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a2a8474c

page-allocator: always change pageblock ownership when anti-fragmentation is disabled · dd5d241e

由 Mel Gorman 提交于 9月 05, 2009

On low-memory systems, anti-fragmentation gets disabled as fragmentation
cannot be avoided on a sufficiently large boundary to be worthwhile.  Once
disabled, there is a period of time when all the pageblocks are marked
MOVABLE and the expectation is that they get marked UNMOVABLE at each call
to __rmqueue_fallback().

However, when MAX_ORDER is large the pageblocks do not change ownership
because the normal criteria are not met.  This has the effect of
prematurely breaking up too many large contiguous blocks.  This is most
serious on NOMMU systems which depend on high-order allocations to boot.
This patch causes pageblocks to change ownership on every fallback when
anti-fragmentation is disabled.  This prevents the large blocks being
prematurely broken up.

This is a fix to commit 49255c61 [page
allocator: move check for disabled anti-fragmentation out of fastpath] and
the problem affects 2.6.31-rc8.
Signed-off-by: NMel Gorman <mel@csn.ul.ie>
Tested-by: NPaul Mundt <lethal@linux-sh.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: NGreg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

dd5d241e

nommu: fix error handling in do_mmap_pgoff() · a190887b

由 David Howells 提交于 9月 05, 2009

Fix the error handling in do_mmap_pgoff().  If do_mmap_shared_file() or
do_mmap_private() fail, we jump to the error_put_region label at which
point we cann __put_nommu_region() on the region - but we haven't yet
added the region to the tree, and so __put_nommu_region() may BUG
because the region tree is empty or it may corrupt the region tree.

To get around this, we can afford to add the region to the region tree
before calling do_mmap_shared_file() or do_mmap_private() as we keep
nommu_region_sem write-locked, so no-one can race with us by seeing a
transient region.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
Acked-by: NPaul Mundt <lethal@linux-sh.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: NGreg Ungerer <gerg@snapgear.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a190887b

workqueues: introduce __cancel_delayed_work() · 4e49627b

由 Oleg Nesterov 提交于 9月 05, 2009

cancel_delayed_work() has to use del_timer_sync() to guarantee the timer
function is not running after return.  But most users doesn't actually
need this, and del_timer_sync() has problems: it is not useable from
interrupt, and it depends on every lock which could be taken from irq.

Introduce __cancel_delayed_work() which calls del_timer() instead.

The immediate reason for this patch is
http://bugzilla.kernel.org/show_bug.cgi?id=13757
but hopefully this helper makes sense anyway.

As for 13757 bug, actually we need requeue_delayed_work(), but its
semantics are not yet clear.

Merge this patch early to resolves cross-tree interdependencies between
input and infiniband.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4e49627b