Commits · d9104d1ca9662498339c0de975b4666c30485f4e · xiphi1978 / linux

12 Sep, 2013 1 commit

mm: track vma changes with VM_SOFTDIRTY bit · d9104d1c

Committed by Cyrill Gorcunov on Sep 11, 2013

Pavel reported that if a vma area gets unmapped and then mapped (or
expanded) in place, the soft dirty tracker won't be able to recognize the
situation, since it works at the pte level and ptes get zapped on unmap,
losing the soft dirty bit.

To resolve this we need to track actions at the vma level, which is where
the VM_SOFTDIRTY flag comes in.  When a new vma area is created (or an old
one expanded) we set this bit and keep it there until the application asks
to clear the soft dirty bit.

Thus a user space application that tracks memory changes can now detect
whether a vma area has been renewed.

Reported-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d9104d1c
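
For context on how user space consumes soft-dirty tracking: the kernel's soft-dirty documentation describes writing "4" to /proc/PID/clear_refs to clear the bits and reading bit 55 of each 64-bit /proc/PID/pagemap entry to test them. The sketch below shows only those two steps. The macro and function names are made up for illustration and are not taken from the patch, and the clear step fails on kernels built without soft-dirty support.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative name: bit 55 of a pagemap entry is the soft-dirty flag. */
#define SOFT_DIRTY_BIT 55

/* Return 1 if the given 64-bit pagemap entry has the soft-dirty bit set. */
int pagemap_entry_soft_dirty(uint64_t entry)
{
	return (int)((entry >> SOFT_DIRTY_BIT) & 1);
}

/*
 * Ask the kernel to clear the soft-dirty bits of the calling process.
 * Returns 0 on success and -1 on failure.
 */
int clear_soft_dirty_self(void)
{
	FILE *f = fopen("/proc/self/clear_refs", "w");
	int ok;

	if (!f)
		return -1;
	ok = (fputs("4", f) >= 0);
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A tracker would call clear_soft_dirty_self(), let the workload run, then read the pagemap entries of interest and pass each one to pagemap_entry_soft_dirty().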

16 Aug, 2013 1 commit

Fix TLB gather virtual address range invalidation corner cases · 2b047252

Committed by Linus Torvalds on Aug 15, 2013

Ben Tebulin reported:

"Since v3.7.2 on two independent machines a very specific Git
repository fails in 9/10 cases on git-fsck due to an SHA1/memory
failures. This only occurs on a very specific repository and can be
reproduced stably on two independent laptops. Git mailing list ran
out of ideas and for me this looks like some very exotic kernel issue"

and bisected the failure to the backport of commit 53a59fc6 ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").

That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.

The real bug is that the TLB gather virtual memory range setup is subtly
buggered. It was introduced in commit 597e1c35 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a9 ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.

The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB. And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.

Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.

This makes the patch larger, but conceptually much simpler. And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.

Ben verified that this fixes his problem.

Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2b047252

04 Jul, 2013 3 commits

fs/exec.c:de_thread: mt-exec should update ->real_start_time · 266b7a02

Committed by Oleg Nesterov on Jul 03, 2013

924b42d5 ("Use boot based time for process start time and boot time in
/proc") updated copy_process/do_task_stat but forgot about de_thread().
This breaks "ps axOT" if a sub-thread execs.

Note: I think that task->start_time should die.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Tomas Janousek <tjanouse@redhat.com>
Cc: Tomas Smetana <tsmetana@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

266b7a02

fs/exec.c: do_execve_common(): use current_user() · bd9d43f4

Committed by Oleg Nesterov on Jul 03, 2013

Trivial cleanup.  do_execve_common() can use current_user() and avoid the
unnecessary "struct cred *cred" var.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bd9d43f4

fs/exec.c:de_thread(): use change_pid() rather than detach_pid/attach_pid · 3f418548

Committed by Oleg Nesterov on Jul 03, 2013

de_thread() can use change_pid() instead of detach + attach.  This looks
better and this ensures that, say, next_thread() can never see a task with
->pid == NULL.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Sergey Dyasly <dserrg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3f418548

29 Jun, 2013 1 commit

allow build_open_flags() to return an error · f9652e10

Committed by Al Viro on Jun 11, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f9652e10

26 Jun, 2013 1 commit

perf: Disable monitoring on setuid processes for regular users · 2976b10f

Committed by Stephane Eranian on Jun 20, 2013

There was a bug in setup_new_exec(), whereby
the test to disable perf monitoring was not
correct because the new credentials for the
process were not yet committed and therefore
the get_dumpable() test was never firing.

The patch fixes the problem by moving the
perf_event test until after the credentials
are committed.

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

2976b10f

01 May, 2013 2 commits

exec: do not abuse ->cred_guard_mutex in threadgroup_lock() · e56fb287

Committed by Oleg Nesterov on Apr 30, 2013

threadgroup_lock() takes signal->cred_guard_mutex to ensure that
thread_group_leader() is stable.  This doesn't look nice, the scope of
this lock in do_execve() is huge.

And as Dave pointed out this can lead to deadlock, we have the
following dependencies:

	do_execve:		cred_guard_mutex -> i_mutex
	cgroup_mount:		i_mutex -> cgroup_mutex
	attach_task_by_pid:	cgroup_mutex -> cred_guard_mutex

Change de_thread() to take threadgroup_change_begin() around the
switch-the-leader code and change threadgroup_lock() to avoid
->cred_guard_mutex.

Note that de_thread() can't sleep with ->group_rwsem held, this can
obviously deadlock with the exiting leader if the writer is active, so it
does threadgroup_change_end() before schedule().

Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e56fb287
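
The three lock dependencies listed in e56fb287 form a cycle: cred_guard_mutex -> i_mutex -> cgroup_mutex -> cred_guard_mutex. As a rough illustration of why that ordering can deadlock, the sketch below checks a list of "held -> wanted" pairs for a cycle. All type and function names here are hypothetical; in the kernel this kind of checking is the job of lockdep, not of code like this.

```c
#include <stddef.h>

/* Hypothetical identifiers for the three locks named in the commit. */
enum lock_id { CRED_GUARD_MUTEX, I_MUTEX, CGROUP_MUTEX, NR_LOCKS };

/* One observed ordering: `wanted` is taken while `held` is held. */
struct lock_dep {
	enum lock_id held;
	enum lock_id wanted;
};

/* Depth-first search: can `target` be reached from `from`? */
static int reaches(const struct lock_dep *deps, size_t n,
		   enum lock_id from, enum lock_id target, unsigned visited)
{
	size_t i;

	if (visited & (1u << from))
		return 0;	/* already on this path, do not loop */
	visited |= 1u << from;
	for (i = 0; i < n; i++) {
		if (deps[i].held != from)
			continue;
		if (deps[i].wanted == target)
			return 1;
		if (reaches(deps, n, deps[i].wanted, target, visited))
			return 1;
	}
	return 0;
}

/* A deadlock is possible if any lock can be reached from itself. */
int has_lock_cycle(const struct lock_dep *deps, size_t n)
{
	int l;

	for (l = 0; l < NR_LOCKS; l++)
		if (reaches(deps, n, (enum lock_id)l, (enum lock_id)l, 0))
			return 1;
	return 0;
}
```

Feeding it the three pairs from the commit message reports a cycle; dropping the attach_task_by_pid pair (the one the patch removes by no longer taking ->cred_guard_mutex in threadgroup_lock()) leaves no cycle.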

set_task_comm: kill the pointless memset() + wmb() · 12eaaf30

Committed by Oleg Nesterov on Apr 30, 2013

set_task_comm() does memset() + wmb() before strlcpy().  This buys
nothing and to add to the confusion, the comment is wrong.

- We do not need memset() to be "safe from non-terminating string
  reads", the final char is always zero and we never change it.

- wmb() is paired with nothing, it cannot prevent printing
  the mixture of the old/new data unless the reader takes the lock.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12eaaf30
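
The first point in 12eaaf30 can be illustrated in user space. A bounded copy in the style of strlcpy() writes at most size - 1 bytes plus a terminator, so the last byte of a buffer that starts out zeroed is never anything but zero, with or without a preceding memset(). The helper below is a stand-in written for this sketch, not the kernel's strlcpy(), and COMM_LEN is an assumed name for the 16-byte comm buffer size.

```c
#include <stddef.h>
#include <string.h>

#define COMM_LEN 16	/* assumed name; matches the 16-byte task comm buffer */

/*
 * strlcpy()-style copy: copies at most size - 1 bytes from src, always
 * NUL-terminates when size > 0, and returns strlen(src).
 */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size) {
		size_t n = len >= size ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

After any sequence of bounded_copy(comm, name, sizeof(comm)) calls on a buffer whose final byte started as zero, comm[COMM_LEN - 1] is still zero, so a reader can never run off the end of the string.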

30 Apr, 2013 2 commits

mm: allow arch code to control the user page table ceiling · 6ee8630e

Committed by Hugh Dickins on Apr 29, 2013

On architectures where a pgd entry may be shared between user and kernel
(e.g.  ARM+LPAE), freeing page tables needs a ceiling other than 0.
This patch introduces a generic USER_PGTABLES_CEILING that arch code can
override.  It is the responsibility of the arch code setting the ceiling
to ensure the complete freeing of the page tables (usually in
pgd_free()).

[catalin.marinas@arm.com: commit log; shift_arg_pages(), asm-generic/pgtables.h changes]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: <stable@vger.kernel.org>	[3.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6ee8630e
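
An overridable generic default of the kind 6ee8630e describes is usually written with an #ifndef guard, so that an architecture header included earlier can supply its own value. The sketch below shows that pattern only; the default of 0 follows the commit text ("a ceiling other than 0" being the special case), and the accessor function is a hypothetical helper added so the value can be inspected.

```c
/*
 * Generic fallback: used only when no architecture header has already
 * defined USER_PGTABLES_CEILING.
 */
#ifndef USER_PGTABLES_CEILING
#define USER_PGTABLES_CEILING 0UL
#endif

/* Hypothetical accessor, present only to make the sketch observable. */
unsigned long user_pgtables_ceiling(void)
{
	return USER_PGTABLES_CEILING;
}
```

Compiled as is, the accessor returns 0; building with -DUSER_PGTABLES_CEILING=0x40000000UL stands in for an architecture override.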

new helper: read_code() · 3dc20cb2

Committed by Al Viro on Apr 13, 2013

switch binfmts that use ->read() to that (and to kernel_read()
in several cases in binfmt_flat - sure, it's nommu, but still,
doing ->read() into kmalloc'ed buffer...)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3dc20cb2

25 Apr, 2013 1 commit

ARM: 7701/1: mm: Allow arch code to control the user page table ceiling · a0a9434d

Committed by Hugh Dickins on Apr 23, 2013

On architectures where a pgd entry may be shared between user and kernel
(e.g. ARM+LPAE), freeing page tables needs a ceiling other than 0. This
patch introduces a generic USER_PGTABLES_CEILING that arch code can
override. It is the responsibility of the arch code setting the ceiling
to ensure the complete freeing of the page tables (usually in
pgd_free()).

[catalin.marinas@arm.com: commit log; shift_arg_pages(), asm-generic/pgtables.h changes]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

a0a9434d

28 Feb, 2013 1 commit

coredump: remove redundant defines for dumpable states · e579d2c2

Committed by Kees Cook on Feb 27, 2013

The existing SUID_DUMP_* defines duplicate the newer SUID_DUMPABLE_*
defines introduced in 54b50199 ("coredump: warn about unsafe
suid_dumpable / core_pattern combo").  Remove the new ones, and use the
prior values instead.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Chen Gang <gang.chen@asianux.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@linux.intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e579d2c2

26 Feb, 2013 1 commit

fs/exec.c: make bprm_mm_init() static · 9cc64cea

Committed by Yuanhan Liu on Feb 20, 2013

There is only one user of bprm_mm_init, and it's inside the same file.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9cc64cea

23 Feb, 2013 1 commit

new helper: file_inode(file) · 496ad9aa

Committed by Al Viro on Jan 23, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

496ad9aa

12 Jan, 2013 1 commit

fs/exec.c: work around icc miscompilation · 6d92d4f6

Committed by Xi Wang on Jan 11, 2013

The tricky problem is this check:

	if (i++ >= max)

icc (mis)optimizes this check as:

	if (++i > max)

The check now becomes a no-op since max is MAX_ARG_STRINGS (0x7FFFFFFF).

This is "allowed" by the C standard, assuming i++ never overflows,
because signed integer overflow is undefined behavior.  This
optimization effectively reverts the previous commit 362e6663
("exec.c, compat.c: fix count(), compat_count() bounds checking") that
tries to fix the check.

This patch simply moves ++ after the check.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6d92d4f6
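
The shape of the fix in 6d92d4f6 can be sketched as a counting loop that compares first and increments afterwards, so the comparison never involves a value that could have overflowed. This is a simplified user-space illustration, not the kernel's count(): the function name is made up, and it returns -1 where the kernel would return an errno value.

```c
#define MAX_ARG_STRINGS 0x7FFFFFFF

/*
 * Count the entries of a NULL-terminated string array.
 * Returns the count, or -1 if more than `max` entries are present.
 */
int count_strings(const char *const *argv, int max)
{
	int i = 0;

	if (!argv)
		return 0;
	for (;;) {
		if (!argv[i])
			break;
		/* Check before incrementing: i never exceeds max here. */
		if (i >= max)
			return -1;
		++i;
	}
	return i;
}
```

Because i is compared while it is still at most max, the compiler has no overflowing expression to reason about, and the check cannot be folded away even when max is MAX_ARG_STRINGS.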

21 Dec, 2012 1 commit

exec: do not leave bprm->interp on stack · b66c5984

Committed by Kees Cook on Dec 20, 2012

If a series of scripts are executed, each triggering module loading via
unprintable bytes in the script header, kernel stack contents can leak
into the command line.

Normally execution of binfmt_script and binfmt_misc happens recursively.
However, when modules are enabled, and unprintable bytes exist in the
bprm->buf, execution will restart after attempting to load matching
binfmt modules.  Unfortunately, the logic in binfmt_script and
binfmt_misc does not expect to get restarted.  They leave bprm->interp
pointing to their local stack.  This means on restart bprm->interp is
left pointing into unused stack memory which can then be copied into the
userspace argv areas.

After additional study, it seems that both recursion and restart remain
the desirable way to handle exec with scripts, misc, and modules.  As
such, we need to protect the changes to interp.

This changes the logic to require allocation for any changes to the
bprm->interp.  To avoid adding a new kmalloc to every exec, the default
value is left as-is.  Only when passing through binfmt_script or
binfmt_misc does an allocation take place.

For a proof of concept, see DoTest.sh from:

   http://www.halfdog.net/Security/2012/LinuxKernelBinfmtScriptStackDataDisclosure/

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: halfdog <me@halfdog.net>
Cc: P J P <ppandit@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b66c5984

20 Dec, 2012 1 commit

Bury the conditionals from kernel_thread/kernel_execve series · ae903caa

Committed by Al Viro on Dec 14, 2012

All architectures have
	CONFIG_GENERIC_KERNEL_THREAD
	CONFIG_GENERIC_KERNEL_EXECVE
	__ARCH_WANT_SYS_EXECVE
None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers
of kernel_execve() (which is a trivial wrapper for do_execve() now) left.
Kill the conditionals and make both callers use do_execve().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ae903caa

18 Dec, 2012 1 commit

exec: use -ELOOP for max recursion depth · d7402698

Committed by Kees Cook on Dec 17, 2012

To avoid an explosion of request_module calls on a chain of abusive
scripts, fail maximum recursion with -ELOOP instead of -ENOEXEC. As soon
as maximum recursion depth is hit, the error will fail all the way back
up the chain, aborting immediately.

This also has the side-effect of stopping the user's shell from attempting
to reexecute the top-level file as a shell script. As seen in the
dash source:

        if (cmd != path_bshell && errno == ENOEXEC) {
                *argv-- = cmd;
                *argv = cmd = path_bshell;
                goto repeat;
        }

The above logic was designed for running scripts automatically that lacked
the "#!" header, not to re-try failed recursion. On a legitimate -ENOEXEC,
things continue to behave as the shell expects.

Additionally, when tracking recursion, the binfmt handlers should not be
involved. The recursion being tracked is the depth of calls through
search_binary_handler(), so that function should be exclusively responsible
for tracking the depth.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: halfdog <me@halfdog.net>
Cc: P J P <ppandit@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d7402698
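
The behaviour described in d7402698 can be modelled in a few lines: one function owns the depth counter, a chain that is too deep fails with -ELOOP, and that error travels back up unchanged, so a caller that retries only on ENOEXEC (as in the dash snippet) does not retry. Everything here is a hypothetical model; the function names and the MAX_DEPTH value are chosen for illustration and are not the kernel's.

```c
#include <errno.h>

#define MAX_DEPTH 4	/* illustrative limit, not the kernel's value */

/*
 * Model of a chain of interpreters: each level hands off to the next
 * until `chain_len` levels have been passed. Only this function tracks
 * the depth, and the error is returned unchanged to every caller.
 */
int run_chain(int depth, int chain_len)
{
	if (depth > MAX_DEPTH)
		return -ELOOP;	/* too deep: abort the whole chain */
	if (depth == chain_len)
		return 0;	/* reached something directly executable */
	return run_chain(depth + 1, chain_len);
}

/* Mirrors the quoted shell logic: fall back to the shell only on ENOEXEC. */
int shell_would_retry(int err)
{
	return err == -ENOEXEC;
}
```

With these definitions a short chain succeeds, an overly long one yields -ELOOP, and shell_would_retry() is false for that result.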

29 Nov, 2012 5 commits

get rid of pt_regs argument of ->load_binary() · 71613c3b

Committed by Al Viro on Oct 20, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

71613c3b

get rid of pt_regs argument of search_binary_handler() · 3c456bfc

Committed by Al Viro on Oct 20, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

3c456bfc

get rid of pt_regs argument of do_execve_common() · 835ab32d

Committed by Al Viro on Oct 20, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

835ab32d

get rid of pt_regs argument of do_execve() · da3d4c5f

Committed by Al Viro on Oct 20, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

da3d4c5f

make compat_do_execve() static, lose pt_regs argument · d03d26e5

Committed by Al Viro on Oct 20, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d03d26e5

19 Nov, 2012 1 commit

userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped · 3cdf5b45

Committed by Eric W. Biederman on Nov 21, 2011

When performing an exec where the binary lives in one user namespace and
the execing process lives in another user namespace there is the possibility
that the target uids can not be represented.

Instead of failing the exec simply ignore the suid/sgid bits and run
the binary with lower privileges.   We already do this in the case
of MNT_NOSUID so this should be a well tested code path.

As the user and group are not changed this should not introduce any
security issues.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

3cdf5b45

26 Oct, 2012 1 commit

freezer: exec should clear PF_NOFREEZE along with PF_KTHREAD · b40a7959

Committed by Oleg Nesterov on Oct 25, 2012

flush_old_exec() clears PF_KTHREAD but forgets about PF_NOFREEZE.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

b40a7959

13 Oct, 2012 2 commits

vfs: make path_openat take a struct filename pointer · 669abf4e

Committed by Jeff Layton on Oct 10, 2012

...and fix up the callers. For do_file_open_root, just declare a
struct filename on the stack and fill out the .name field. For
do_filp_open, make it also take a struct filename pointer, and fix up its
callers to call it appropriately.

For filp_open, add a variant that takes a struct filename pointer and turn
filp_open into a wrapper around it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

669abf4e

vfs: define struct filename and have getname() return it · 91a27b2a

Committed by Jeff Layton on Oct 10, 2012

getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.

For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.

This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.

Later, we'll add other information to the struct as it becomes
convenient.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

91a27b2a

09 Oct, 2012 2 commits

mm: avoid taking rmap locks in move_ptes() · 38a76013

Committed by Michel Lespinasse on Oct 08, 2012

During mremap(), the destination VMA is generally placed after the
original vma in rmap traversal order: in move_vma(), we always have
new_pgoff >= vma->vm_pgoff, and as a result new_vma->vm_pgoff >=
vma->vm_pgoff unless vma_merge() merged the new vma with an adjacent one.

When the destination VMA is placed after the original in rmap traversal
order, we can avoid taking the rmap locks in move_ptes().

Essentially, this reintroduces the optimization that had been disabled in
"mm anon rmap: remove anon_vma_moveto_tail".  The difference is that we
don't try to impose the rmap traversal order; instead we just rely on
things being in the desired order in the common case and fall back to
taking locks in the uncommon case.  Also we skip the i_mmap_mutex in
addition to the anon_vma lock: in both cases, the vmas are traversed in
increasing vm_pgoff order with ties resolved in tree insertion order.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

38a76013

exec: make de_thread() killable · d5bbd43d

Committed by Oleg Nesterov on Oct 08, 2012

Change de_thread() to use KILLABLE rather than UNINTERRUPTIBLE while
waiting for other threads. The only complication is that we should
clear ->group_exit_task and ->notify_count before we return, and we
should do this under tasklist_lock. -EAGAIN is used to match the
initial signal_group_exit() check/return, it doesn't really matter.

This fixes the (unlikely) race with coredump. de_thread() checks
signal_group_exit() before it starts to kill the subthreads, but this
can't help if another CLONE_VM (but non CLONE_THREAD) task starts the
coredumping after de_thread() unlocks ->siglock. In this case the
killed sub-thread can block in exit_mm() waiting for coredump_finish(),
execing thread waits for that sub-thread, and the coredumping thread
waits for execing thread. Deadlock.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d5bbd43d

06 Oct, 2012 2 commits

coredump: use SUID_DUMPABLE_ENABLED rather than hardcoded 1 · 0f4cfb2e

Committed by Oleg Nesterov on Oct 04, 2012

Cosmetic. Change setup_new_exec() and task_dumpable() to use
SUID_DUMPABLE_ENABLED for /bin/grep.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0f4cfb2e

coredump: update coredump-related headers · 179899fd

Committed by Alex Kelly on Oct 04, 2012

Create a new header file, fs/coredump.h, which contains functions only
used by the new coredump.c.  It also moves do_coredump to the
include/linux/coredump.h header file, for consistency.

Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

179899fd

03 Oct, 2012 1 commit

coredump: move core dump functionality into its own file · 10c28d93

Committed by Alex Kelly on Sep 26, 2012

This prepares for making core dump functionality optional.

The variable "suid_dumpable" and associated functions are left in fs/exec.c
because they're used elsewhere, such as in ptrace.

Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

10c28d93

01 Oct, 2012 2 commits

generic sys_execve() · 38b983b3

Committed by Al Viro on Sep 30, 2012

Selected by __ARCH_WANT_SYS_EXECVE in unistd.h.  Requires
	* working current_pt_regs()
	* *NOT* doing a syscall-in-kernel kind of kernel_execve()
implementation.  Using generic kernel_execve() is fine.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

38b983b3

generic kernel_execve() · 282124d1

Committed by Al Viro on Sep 30, 2012

based mostly on arm and alpha versions.  Architectures can define
__ARCH_WANT_KERNEL_EXECVE and use it, provided that
	* they have working current_pt_regs(), even for kernel threads.
	* kernel_thread-spawned threads do have space for pt_regs
in the normal location.  Normally that's as simple as switching to
generic kernel_thread() and making sure that kernel threads do *not*
go through return from syscall path; call the payload from equivalent
of ret_from_fork if we are in a kernel thread (or just have separate
ret_from_kernel_thread and make copy_thread() use it instead of
ret_from_fork in kernel thread case).
	* they have ret_from_kernel_execve(); it is called after
successful do_execve() done by kernel_execve() and gets normal
pt_regs location passed to it as argument.  It's essentially
a longjmp() analog - it should set sp, etc. to the situation
expected at the return for syscall and go there.  Eventually
the need for that sucker will disappear, but that'll take some
surgery on kernel_thread() payloads.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

282124d1

27 Sep, 2012 3 commits

do_coredump(): make sure that descriptor table isn't shared · 179e037f

Committed by Al Viro on Aug 21, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

179e037f

new helper: replace_fd() · 8280d161

Committed by Al Viro on Aug 21, 2012

analog of dup2(), except that it takes struct file * as source.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8280d161
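
Since 8280d161 describes replace_fd() by analogy with dup2(), a short user-space reminder of the dup2() semantics may help: after dup2(src, dst), dst refers to the same open file as src, and whatever dst referred to before is closed. The sketch shows only the system call being compared against, not the in-kernel helper, which takes a struct file * instead of a source descriptor. The function name is made up for this example.

```c
#include <unistd.h>

/*
 * Redirect a spare descriptor onto the write end of a pipe with dup2()
 * and confirm that a byte written through it arrives on the read end.
 * Returns 0 on success and -1 on any failure.
 */
int dup2_demo(void)
{
	int p[2];
	int dst, ok;
	char c = 0;

	if (pipe(p) != 0)
		return -1;
	dst = dup(p[0]);	/* reserve some descriptor number */
	if (dst < 0 || dup2(p[1], dst) != dst) {
		close(p[0]);
		close(p[1]);
		return -1;
	}
	/* dst now aliases the write end; its old target was closed. */
	ok = write(dst, "x", 1) == 1 && read(p[0], &c, 1) == 1 && c == 'x';
	close(dst);
	close(p[0]);
	close(p[1]);
	return ok ? 0 : -1;
}
```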

take close-on-exec logics to fs/file.c, clean it up a bit · 6a6d27de

Committed by Al Viro on Aug 21, 2012

... and add cond_resched() there, while we are at it.  We can
get large latencies as is...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6a6d27de

20 Sep, 2012 1 commit

the only place that needs to include asm/exec.h is linux/binfmts.h · 826eba4d

Committed by Al Viro on Aug 03, 2012

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

826eba4d

31 Jul, 2012 1 commit

coredump: fix wrong comments on core limits of pipe coredump case · 108ceeb0

Committed by Jovi Zhang on Jul 30, 2012

In commit 898b374a ("exec: replace call_usermodehelper_pipe with use
of umh init function and resolve limit"), the core limits recursive
check value was changed from 0 to 1, but the corresponding comments were
not updated.

Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

108ceeb0