Commits · a5f28ae4df291d81d9d23066f88c55ca45e388d3 · openeuler / raspberrypi-kernel

09 Feb, 2010 2 commits

ocfs2/cluster: Make o2net connect messages KERN_NOTICE · 6efd8066

Authored by Sunil Mushran on Feb 05, 2010

Connect and disconnect messages are more than informational as they are required
during root cause analysis for failures. This patch changes them from KERN_INFO
to KERN_NOTICE.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2/dlm: Fix printing of lockname · 86a06aba

Authored by Sunil Mushran on Feb 05, 2010

The debug call printing the name of the lock resource was chopping
off the last character. This patch fixes the problem.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
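
One common way for a debug print to drop the final character of a name is an off-by-one in the precision passed to a `%.*s` format. The sketch below illustrates that pattern only; it is not the ocfs2/dlm code, and `format_lockname` and its parameters are hypothetical names.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the ocfs2 source): format a lock name that
 * is stored as a (pointer, length) pair and may not be NUL-terminated.
 * Returns the number of characters written, excluding the trailing NUL.
 */
static int format_lockname(char *out, size_t outsz,
                           const char *name, size_t len)
{
    /*
     * Pass the full length as the precision. Passing len - 1 here would
     * silently chop off the last character of the name.
     */
    return snprintf(out, outsz, "%.*s", (int)len, name);
}
```

Calling it with a 6-byte name taken from a longer buffer yields exactly those 6 characters.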


08 Feb, 2010 1 commit

Fix race in tty_fasync() properly · 80e1e823

Authored by Linus Torvalds on Feb 07, 2010

This reverts commit 70362511 ("tty: fix race in tty_fasync") and
commit b04da8bf ("fnctl: f_modown should call write_lock_irqsave/
restore") that tried to fix up some of the fallout but was incomplete.

It turns out that we really cannot hold 'tty->ctrl_lock' over calling
__f_setown, because not only did that cause problems with interrupt
disables (which the second commit fixed), it also causes a potential
ABBA deadlock due to lock ordering.

Thanks to Tetsuo Handa for following up on the issue, and running
lockdep to show the problem.  It goes roughly like this:

 - f_getown gets filp->f_owner.lock for reading without interrupts
   disabled, so an interrupt that happens while that lock is held can
   cause a lockdep chain from f_owner.lock -> sighand->siglock.

 - at the same time, the tty->ctrl_lock -> f_owner.lock chain that
   commit 70362511 introduced, together with the pre-existing
   sighand->siglock -> tty->ctrl_lock chain means that we have a lock
   dependency the other way too.

So instead of extending tty->ctrl_lock over the whole __f_setown() call,
we now just take a reference to the 'pid' structure while holding the
lock, and then release it after having done the __f_setown.  That still
guarantees that 'struct pid' won't go away from under us, which is all
we really ever needed.
Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Américo Wang <xiyou.wangcong@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
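
The approach in the last paragraph (pin the object while holding the lock, then call out with the lock dropped) can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the tty code: `struct pid_obj`, `struct tty_like`, `set_owner()` and `tty_like_fasync()` are hypothetical stand-ins, and the reference count here is protected by the mutex, whereas the kernel uses its own reference-counting helpers for `struct pid`.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for 'struct pid'. */
struct pid_obj {
    int refcount;   /* protected by the owner's ctrl_lock in this sketch */
    int nr;
};

/* Hypothetical stand-in for the tty. */
struct tty_like {
    pthread_mutex_t ctrl_lock;
    struct pid_obj *pgrp;
};

static int owner_nr;  /* records what set_owner() was last called with */

/* Stand-in for __f_setown(): it may take other locks, so run it unlocked. */
static void set_owner(struct pid_obj *pid)
{
    owner_nr = pid->nr;
}

static void tty_like_fasync(struct tty_like *tty)
{
    struct pid_obj *pid;

    /* Hold ctrl_lock only long enough to pin the object. */
    pthread_mutex_lock(&tty->ctrl_lock);
    pid = tty->pgrp;
    pid->refcount++;
    pthread_mutex_unlock(&tty->ctrl_lock);

    /* The reference keeps 'pid' alive; no lock is held across this call. */
    set_owner(pid);

    /* Drop the temporary reference. */
    pthread_mutex_lock(&tty->ctrl_lock);
    pid->refcount--;
    pthread_mutex_unlock(&tty->ctrl_lock);
}
```

Because `ctrl_lock` is never held across `set_owner()`, no lock-ordering edge from `ctrl_lock` to whatever `set_owner()` locks internally is created.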


07 Feb, 2010 6 commits

Take ima_file_free() to proper place. · 89068c57

Authored by Al Viro on Feb 07, 2010

Hooks: Just Say No.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ima: rename ima_path_check to ima_file_check · 9bbb6cad

Authored by Mimi Zohar on Jan 26, 2010

ima_path_check actually deals with files!  call it ima_file_check instead.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


fix ima breakage · 8eb988c7

Authored by Mimi Zohar on Jan 20, 2010

The "Untangling ima mess, part 2 with counters" patch messed
up the counters.  Based on conversations with Al Viro, this patch
streamlines ima_path_check() by removing the counter maintenance.
The counters are now updated independently, from measuring the file,
in __dentry_open() and alloc_file() by calling ima_counts_get().
ima_path_check() is called from nfsd and do_filp_open().
It also did not measure all files that should have been measured.
Reason: ima_path_check() got bogus value passed as mask.
[AV: mea culpa]
[AV: add missing nfsd bits]
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


Take ima_path_check() in nfsd past dentry_open() in nfsd_open() · 1e41568d

Authored by Al Viro on Jan 26, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

freeze_bdev: don't deactivate successfully frozen MS_RDONLY sb · 4b06e5b9

Authored by Jun'ichi Nomura on Jan 29, 2010

Thanks Thomas and Christoph for testing and review.
I removed 'smp_wmb()' before up_write from the previous patch,
since up_write() should have necessary ordering constraints.
(I.e. the change of s_frozen is visible to others after up_write)
I'm quite sure the change is harmless but if you are uncomfortable
with Tested-by/Reviewed-by on the modified patch, please remove them.

If MS_RDONLY, freeze_bdev should just up_write(s_umount) instead of
deactivate_locked_super().
Also, keep sb->s_frozen consistent so that remount can check the frozen state.

Otherwise a crash reported here can happen:
http://lkml.org/lkml/2010/1/16/37
http://lkml.org/lkml/2010/1/28/53

This patch should be applied for 2.6.32 stable series, too.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Thomas Backlund <tmb@mandriva.org>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


befs: fix leak · 8dd5ca53

Authored by Al Viro on Jan 28, 2010

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

06 Feb, 2010 1 commit

ocfs2: Fix contiguousness check in ocfs2_try_to_merge_extent_map() · bd6b0bf8

Authored by Roel Kluin on Feb 05, 2010

The wrong member was compared in the contiguousness check.
Acked-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
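
For orientation, a contiguity check over extent records can be sketched as below. The commit message does not say which member was wrong, so this is only an illustration of the general shape: `struct extent`, its field names and `extents_contiguous()` are assumptions, not the ocfs2 definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extent record; field names are illustrative assumptions. */
struct extent {
    uint32_t cpos;      /* logical start, in clusters */
    uint32_t phys;      /* physical start, in clusters */
    uint32_t clusters;  /* length, in clusters */
};

/*
 * True if 'b' starts exactly where 'a' ends, both logically and physically.
 * Each end is compared against the matching member of 'b'; mixing them up
 * (logical end against b->phys, for example) makes the check meaningless.
 */
static bool extents_contiguous(const struct extent *a, const struct extent *b)
{
    return a->cpos + a->clusters == b->cpos &&
           a->phys + a->clusters == b->phys;
}
```

Two extents should be merged only when both comparisons hold.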


05 Feb, 2010 6 commits

Btrfs: apply updated fallocate i_size fix · 23b5c509

Authored by Aneesh Kumar K.V on Feb 04, 2010

This version of the i_size fix for fallocate makes sure we only update
the i_size when the current fallocate is really operating outside of
i_size.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
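
The rule described above, growing the size only when the preallocated range actually extends past it, can be sketched as a pure function. This is a simplified illustration, not the Btrfs code; `isize_after_fallocate()` and its `keep_size` parameter are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the file size after preallocating
 * [offset, offset + len). The size grows only when the range ends beyond
 * the current size and the caller did not ask to keep the size unchanged.
 */
static int64_t isize_after_fallocate(int64_t i_size, int64_t offset,
                                     int64_t len, int keep_size)
{
    int64_t end = offset + len;

    if (keep_size || end <= i_size)
        return i_size;   /* range lies inside the file: leave size alone */
    return end;          /* range extends the file */
}
```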


Btrfs: do not try and lookup the file extent when finishing ordered io · efd049fb

Authored by Josef Bacik on Feb 02, 2010

When running the following fio job

[torrent]
filename=torrent-test
rw=randwrite
size=4g
filesize=4g
bs=4k
ioengine=sync

you would see long stalls where no work was being done.  That is because we were
doing all this extra work to read in the file extent outside of the transaction,
however in the random io case this ends up hurting us because the file extents
are not there to begin with.  So axe this logic, since we end up reading in the
file extent when we go to update it anyway.  This took the fio job from 11 mb/s
with several ~10 second stalls to 24 mb/s with a couple of 1-2 second stalls.
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Fix oopsen when dropping empty tree. · 7a7965f8

Authored by Yan, Zheng on Feb 01, 2010

When dropping an empty tree, walk_down_tree() skips checking
extent information for the tree root. This triggers a
BUG_ON in walk_up_proc().
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: remove BUG_ON() due to mounting bad filesystem · d7ce5843

Authored by Miao Xie on Feb 02, 2010

Mounting a bad filesystem caused a BUG_ON(). The following steps
reproduce it.
 # mkfs.btrfs /dev/sda2
 # mount /dev/sda2 /mnt
 # mkfs.btrfs /dev/sda1 /dev/sda2
 (the program says that /dev/sda2 was mounted, and then exits. )
 # umount /mnt
 # mount /dev/sda1 /mnt

At the third step, mkfs.btrfs exited partway through making the filesystem, so
its initialization never finished. The filesystem was therefore bad, and
mounting it triggered the BUG_ON(). But a BUG_ON() should be triggered only by
wrong code, not by a user's operation, so I think this is a bug in btrfs.

This patch fixes it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: make error return negative in btrfs_sync_file() · 014e4ac4

Authored by Roel Kluin on Jan 29, 2010

It appears the error return should be negative
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
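
Kernel functions conventionally return 0 on success and a negative errno value on failure, and callers test for `ret < 0`. The sketch below shows why the sign matters; `sync_file_like()` and `call_failed()` are hypothetical, and the choice of `EIO` is an assumption rather than the error code used in btrfs_sync_file().

```c
#include <errno.h>

/*
 * Hypothetical routine following the kernel convention:
 * 0 on success, negative errno on failure.
 */
static int sync_file_like(int device_ok)
{
    if (!device_ok)
        return -EIO;   /* returning a positive EIO would not look like an error */
    return 0;
}

/* Typical caller-side check: only negative values count as failures. */
static int call_failed(int ret)
{
    return ret < 0;
}
```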


Btrfs: fix race between allocate and release extent buffer. · f044ba78

Authored by Yan, Zheng on Feb 04, 2010

Increase extent buffer's reference count while holding the lock.
Otherwise it can race with try_release_extent_buffer.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


04 Feb, 2010 2 commits

ocfs2/dlm: Remove BUG_ON in dlm recovery when freeing locks of a dead node · cda70ba8

Authored by Sunil Mushran on Feb 01, 2010

During recovery, the dlm frees the locks for the dead node. If it finds a
lock in a resource for the dead node, it expects that node to also have a
ref in that lock resource. If not, it BUGs.

ossbz#1175 was filed with the above BUG. Now, while it is correct that we
should be expecting the ref, I see no reason why we have to BUG. After all,
we are freeing up the lock and clearing the ref.

This patch replaces the BUG_ON with a printk(). Hopefully, that will give
us more clues next time this happens.

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1175
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2: Plugs race between the dc thread and an unlock ast message · 079b8057

Authored by Sunil Mushran on Feb 03, 2010

This patch plugs a race between the downconvert thread and an unlock ast message.
Specifically, after the downconvert worker has done its task, the dc thread needs
to check whether an unlock ast made the downconvert moot.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


03 Feb, 2010 16 commits

NFS: Don't clobber the attribute type in nfs_update_inode() · 9b4b3513

Authored by Trond Myklebust on Feb 03, 2010

If the NFS_ATTR_FATTR_TYPE field isn't set in fattr->valid, then we should
not set the S_IFMT part of inode->i_mode.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


NFS: Fix a umount race · 387c149b

Authored by Trond Myklebust on Feb 03, 2010

Ensure that we unregister the bdi before kill_anon_super() calls
ida_remove() on our device name.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org


NFS: Fix an Oops when truncating a file · 9f557cd8

Authored by Trond Myklebust on Feb 03, 2010

The VM/VFS does not allow mapping->a_ops->invalidatepage() to fail.
Unfortunately, nfs_wb_page_cancel() may fail if a fatal signal occurs.
Since the NFS code assumes that the page stays mapped for as long as the
writeback is active, we can end up Oopsing (among other things).

The only safe fix here is to convert nfs_wait_on_request(), so as to make
it uninterruptible (as is already the case with wait_on_page_writeback()).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org


GFS2: Extend umount wait coverage to full glock lifetime · 8f05228e

Authored by Steven Whitehouse on Jan 29, 2010

Although all glocks are, by the time of the umount glock wait,
scheduled for demotion, some of them haven't made it far
enough through the process for the original set of waiting
code to wait for them.

This extends the ref count to the whole glock lifetime in order
to ensure that the waiting does catch all glocks. It does make
it a bit more invasive, but it seems the only sensible solution
at the moment.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>


GFS2: Wait for unlock completion on umount · e402746a

Authored by Steven Whitehouse on Jan 25, 2010

This patch adds a wait on umount between the point at which we
dispose of all glocks and the point at which we unmount the
lock protocol. This ensures that we've received all the replies
to our unlock requests before we stop the locking.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Fabio M. Di Nitto <fdinitto@redhat.com>


ocfs2: Remove overzealous BUG_ON during blocked lock processing · db0f6ce6

Authored by Sunil Mushran on Feb 01, 2010

During blocked lock processing, we should consider the possibility that the
lock is no longer blocking.

Joel Becker <joel.becker@oracle.com> assisted in fixing this issue.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2: Do not downconvert if the lock level is already compatible · 0d74125a

Authored by Sunil Mushran on Jan 29, 2010

During upconvert, if the master were to send a BAST, dlmglue will detect the
upconversion in process and send a cancel convert to the master. Upon receiving
the AST for the cancel convert, it will re-process the lock resource to determine
whether it needs downconverting. Say, the up was from PR to EX and the BAST was
for EX. After the cancel convert, it will need to downconvert to NL.

However, if the node was originally upconverting from NL to EX, then there would
be no reason to downconvert (assuming the same message sequence).

This patch makes dlmglue consider the possibility that the current lock level
is already compatible and that downconverting is not required.

Joel Becker <joel.becker@oracle.com> assisted in fixing this issue.

Fixes ossbz#1178
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1178
Reported-by: Coly Li <coly.li@suse.de>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2: Prevent a livelock in dlmglue · a1912826

Authored by Sunil Mushran on Jan 21, 2010

There is a possibility of a livelock in __ocfs2_cluster_lock(). If a node were
to get an ast for an upconvert request, followed immediately by a bast,
there is a small window where the fs may downconvert the lock before the
process requesting the upconvert is able to take the lock.

This patch adds a new flag to indicate that the upconvert is still in
progress and that the dc thread should not downconvert it right now.

Wengang Wang <wen.gang.wang@oracle.com> and Joel Becker
<joel.becker@oracle.com> contributed heavily to this patch.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2: Fix setting of OCFS2_LOCK_BLOCKED during bast · 0b94a909

Authored by Wengang Wang on Jan 21, 2010

During bast, set the OCFS2_LOCK_BLOCKED flag only if the lock needs to be
downconverted.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2: Use compat_ptr in reflink_arguments. · 34e6c59a

Authored by Tao Ma on Jan 27, 2010

Although we use u64 to pass userspace pointers to the kernel
to avoid compat_ioctl, it doesn't work on some ppc platforms.
So wrap them with compat_ptr and add compat_ioctl.

The detailed discussion about compat_ptr can be found in thread
http://lkml.org/lkml/2009/10/27/423.

We indeed hit a bug when testing on ppc (-EFAULT is returned
when using old_path). This patch tries to fix it.
I have tested on ppc64 (with 32-bit reflink) and x86_64 (with i686
reflink); both work.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2/dlm: Handle EAGAIN for compatibility - v2 · cd34edd8

Authored by Sunil Mushran on Jan 25, 2010

Mainline commit aad1b153 made the
dlm_begin_reco_handler() return -EAGAIN instead of EAGAIN.

As this error is transmitted over the wire, we want the receiver,
dlm_send_begin_reco_message(), to understand both the older EAGAIN and
the newer -EAGAIN, to allow rolling upgrade of the cluster nodes.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
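
A receiver that tolerates both encodings during a rolling upgrade can be sketched as a single predicate. `peer_wants_retry()` is a hypothetical name used for illustration; it is not a function from the ocfs2/dlm source.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical check on a status received over the wire: older peers send
 * the positive EAGAIN, newer peers send -EAGAIN, and both mean "retry".
 */
static bool peer_wants_retry(int status)
{
    return status == -EAGAIN || status == EAGAIN;
}
```

Any other status, including 0 and unrelated negative errno values, is left for the caller's normal handling.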


ocfs2: Add parenthesis to wrap the check for O_DIRECT. · 60c48674

Authored by Tao Ma on Feb 03, 2010

Add parenthesis to wrap the check for O_DIRECT.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2: Only bug out when page size is larger than cluster size. · 0a1ea437

Authored by Tao Ma on Feb 01, 2010

In CoW, we have to make sure that the page is already written
out to the disk. So we have a BUG_ON(PageDirty(page)).

On ppc platforms the page size is 64K, so with a cluster size (cs) of 4K,
a file with fragmented clusters will have the same page mapped many times.
See this file as an example.
Tree Depth: 0   Count: 19   Next Free Rec: 14
	## Offset        Clusters       Block#          Flags
	0  0             4              2164864         0x2 Refcounted
	1  4             2              9302792         0x2 Refcounted
...

We have to replace the extent recs one by one, so the page with index 0
will be mapped and dirtied twice.

I'd like to leave the BUG_ON there while adding a check so that in
case we meet with an error in other platforms, we can find it easily.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


ocfs2: Fix memory overflow in cow_by_page. · d622b89a

Authored by Tao Ma on Jan 30, 2010

In ocfs2_duplicate_clusters_by_page, we calculate map_end
by shifting page_index. But when we meet a large offset
(say on an i686 box, where pgoff_t is only 32 bits,
with page_index=2056240), the shift overflows. So change the
type of page_index to loff_t.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
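
The overflow can be reproduced in userspace with fixed-width types. The sketch assumes 4 KiB pages (a shift of 12) and uses `uint32_t` and `uint64_t` as stand-ins for a 32-bit page index and a 64-bit offset; the function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12  /* assumption: 4 KiB pages */

/* Shift performed in 32-bit arithmetic: bits above 2^32 are lost. */
static uint64_t byte_offset_narrow(uint32_t page_index)
{
    return (uint32_t)(page_index << PAGE_SHIFT_4K);
}

/* Widen to 64 bits first, then shift: the full offset is preserved. */
static uint64_t byte_offset_wide(uint32_t page_index)
{
    return (uint64_t)page_index << PAGE_SHIFT_4K;
}
```

For the page index quoted above, 2056240 * 4096 = 8422359040, which is larger than 2^32 = 4294967296, so the narrow version wraps around to 4127391744 while the wide version returns the correct byte offset.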


mm: flush dcache before writing into page to avoid alias · 931e80e4

Authored by anfei zhou on Feb 02, 2010

A cache alias problem occurs if changes made through a user shared mapping
are not flushed before copying: the user and kernel mappings may then be
mapped into two different cache lines, and it is impossible to guarantee
coherence after iov_iter_copy_from_user_atomic.  So the right steps should be:

	flush_dcache_page(page);
	kmap_atomic(page);
	write to page;
	kunmap_atomic(page);
	flush_dcache_page(page);

More precisely, we might create two new APIs flush_dcache_user_page and
flush_dcache_kern_page to replace the two flush_dcache_page accordingly.

Here is a snippet tested on omap2430 with VIPT cache, and I think it is
not ARM-specific:

	int val = 0x11111111;
	fd = open("abc", O_RDWR);
	addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	*(addr+0) = 0x44444444;
	tmp = *(addr+0);
	*(addr+1) = 0x77777777;
	write(fd, &val, sizeof(int));
	close(fd);

The results are not always 0x11111111 0x77777777 at the beginning as expected.  Sometimes we see 0x44444444 0x77777777.
Signed-off-by: Anfei <anfei.zhou@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: <linux-arch@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


Fix 'flush_old_exec()/setup_new_exec()' split · 7ab02af4

Authored by Linus Torvalds on Feb 02, 2010

Commit 221af7f8 ("Split 'flush_old_exec' into two functions") split
the function at the point of no return - ie right where there were no
more error cases to check.  That made sense from a technical standpoint,
but when we then also combined it with the actual personality setting
going in between flush_old_exec() and setup_new_exec(), it needs to be a
bit more careful.

In particular, we need to make sure that we really flush the old
personality bits in the 'flush' stage, rather than later in the 'setup'
stage, since otherwise we might be flushing the _new_ personality state
that we're just setting up.

So this moves the flags and personality flushing (and 'flush_thread()',
which is the arch-specific function that generally resets lazy FP state
etc) of the old process into flush_old_exec(), so that it doesn't affect
any state that execve() is setting up for the new process environment.

This was reported by Michal Simek as breaking his Microblaze qemu
environment.
Reported-and-tested-by: Michal Simek <michal.simek@petalogix.com>
Cc: Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


01 Feb, 2010 3 commits

GFS2: Use GFP_NOFS for alloc structure · ea8d62da

Authored by Steven Whitehouse on Jan 29, 2010

This is called under a glock, so it's a good plan to use GFP_NOFS.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>


GFS2: Fix previous patch · 7fe3ec6f

Authored by Steven Whitehouse on Jan 29, 2010

The do_div() call needs to remain.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>


GFS2: Don't withdraw on partial rindex entries · 55f0b4c5

Authored by Benjamin Marzinski on Jan 25, 2010

Since gfs2 writes the rindex file a block at a time, and releases the
exclusive lock after each block, it is possible that another process
will grab the lock in the middle of the write. Since rindex entries are
not an even divisor of blocks, that other process may see partial
entries. On grows, this is fine. The process can simply ignore the
partial entries. Previously, the code withdrew when it saw partial
entries. Now it simply ignores them.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
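
Ignoring a trailing partial entry amounts to integer division of the file size by the entry size. The sketch below shows that arithmetic only; `ENTRY_SIZE` is an assumed value chosen so that entries do not divide a 4096-byte block evenly, and the function names are hypothetical rather than taken from GFS2.

```c
#include <stddef.h>

#define ENTRY_SIZE 96u  /* assumption: fixed on-disk entry size in bytes */

/* Number of complete entries; any trailing partial entry is ignored. */
static size_t complete_entries(size_t file_size)
{
    return file_size / ENTRY_SIZE;
}

/* Bytes belonging to a not-yet-complete entry at the end of the file. */
static size_t partial_tail_bytes(size_t file_size)
{
    return file_size % ENTRY_SIZE;
}
```

A reader that processes only `complete_entries()` records sees a consistent prefix of the index even while a writer is mid-update, instead of treating the leftover bytes as corruption.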


31 Jan, 2010 2 commits

nilfs2: fix potential leak of dirty data on umount · 3256a055

Authored by Ryusuke Konishi on Jan 31, 2010

This fixes incorrect usage of nilfs_segctor_confirm() test function in
nilfs_segctor_destroy(); nilfs_segctor_confirm() returns zero if the
filesystem is not clean, so its use in nilfs_segctor_destroy() needs
inversion.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>


block: fix bugs in bio-integrity mempool usage · 9e9432c2

Authored by Chuck Ebbert on Jan 30, 2010

Fix two bugs in the bio integrity code:

 - use_bip_pool() always returns 0 because it checks against the wrong limit,
   causing the mempool to be used only when regular allocation fails.

 - When the mempool is used as a fallback we don't free the data properly.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>


30 Jan, 2010 1 commit

Split 'flush_old_exec' into two functions · 221af7f8

Authored by Linus Torvalds on Jan 28, 2010

'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed.  It doesn't just flush the old executable
environment, it also starts up the new one.

Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.

As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.

This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()).  All callers are changed
to trivially comply with the new world order.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
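
The split described above, a first phase that can fail followed by a second phase that cannot, can be sketched in a few lines. This is a toy model, not the exec code: `struct task_like`, `flush_old()`, `setup_new()`, `exec_like()` and the 32/64 layout values are all hypothetical.

```c
#include <errno.h>

/* Hypothetical per-task state. */
struct task_like {
    int personality;  /* 0 = native, 1 = 32-bit compat (illustrative) */
    int layout;       /* derived from personality during setup */
};

/* Phase 1: tear down old state. May fail, and must not touch new state. */
static int flush_old(struct task_like *t, int simulate_failure)
{
    if (simulate_failure)
        return -ENOMEM;
    t->layout = 0;
    return 0;
}

/* Phase 2: build the new environment from the new personality. Cannot fail. */
static void setup_new(struct task_like *t)
{
    t->layout = t->personality ? 32 : 64;
}

static int exec_like(struct task_like *t, int new_personality,
                     int simulate_failure)
{
    int ret = flush_old(t, simulate_failure);

    if (ret < 0)
        return ret;                    /* old personality still intact */

    t->personality = new_personality;  /* set between the two phases */
    setup_new(t);
    return 0;
}
```

Because the personality is assigned only after the fallible phase succeeds, a failed exec leaves the old state untouched, while a successful one lets the new personality influence setup directly.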
