提交 · c143c2333c48f1430231b31a8c17e074b9b504eb · openeuler / raspberrypi-kernel

09 10月, 2014 1 次提交

vfs: Remove d_drop calls from d_revalidate implementations · c143c233

由 Eric W. Biederman 提交于 2月 13, 2014

Now that d_invalidate always succeeds it is not longer necessary or
desirable to hard code d_drop calls into filesystem specific
d_revalidate implementations.

Remove the unnecessary d_drop calls and rely on d_invalidate
to drop the dentries.  Using d_invalidate ensures that paths
to mount points will not be dropped.
Reviewed-by: NMiklos Szeredi <miklos@szeredi.hu>
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c143c233

26 9月, 2014 1 次提交

mm: softdirty: addresses before VMAs in PTE holes aren't softdirty · 87e6d49a

由 Peter Feiner 提交于 9月 25, 2014

In PTE holes that contain VM_SOFTDIRTY VMAs, unmapped addresses before
VM_SOFTDIRTY VMAs are reported as softdirty by /proc/pid/pagemap.  This
bug was introduced in commit 68b5a652 ("mm: softdirty: respect
VM_SOFTDIRTY in PTE holes").  That commit made /proc/pid/pagemap look at
VM_SOFTDIRTY in PTE holes but neglected to observe the start of VMAs
returned by find_vma.

Tested:
  Wrote a selftest that creates a PMD-sized VMA then unmaps the first
  page and asserts that the page is not softdirty. I'm going to send the
  pagemap selftest in a later commit.
Signed-off-by: NPeter Feiner <pfeiner@google.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Jamie Liu <jamieliu@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

87e6d49a

11 8月, 2014 1 次提交

Revert "proc: Point /proc/{mounts,net} at /proc/thread-self/{mounts,net}... · 155134fe

由 Linus Torvalds 提交于 8月 10, 2014

Revert "proc: Point /proc/{mounts,net} at /proc/thread-self/{mounts,net} instead of /proc/self/{mounts,net}"

This reverts commits 344470ca and e8132440.

It turns out that the exact path in the symlink matters, if for somewhat
unfortunate reasons: some apparmor configurations don't allow dhclient
access to the per-thread /proc files. As reported by Jörg Otte:

audit: type=1400 audit(1407684227.003:28): apparmor="DENIED"
operation="open" profile="/sbin/dhclient"
name="/proc/1540/task/1540/net/dev" pid=1540 comm="dhclient"
requested_mask="r" denied_mask="r" fsuid=0 ouid=0

so we had better revert this for now. We might be able to work around
this in practice by only using the per-thread symlinks if the thread
isn't the thread group leader, and if the namespaces differ between
threads (which basically never happens).

We'll see. In the meantime, the revert was made to be intentionally easy.
Reported-by: NJörg Otte <jrg.otte@gmail.com>
Acked-by: NEric W. Biederman <ebiederm@xmission.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

155134fe

09 8月, 2014 19 次提交

sysctl: remove typedef ctl_table · e5eea098

由 Joe Perches 提交于 8月 08, 2014

Remove the final user, and the typedef itself.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e5eea098

fs/proc/vmcore.c:mmap_vmcore: skip non-ram pages reported by hypervisors · 0692dedc

由 Vitaly Kuznetsov 提交于 8月 08, 2014

We have a special check in read_vmcore() handler to check if the page was
reported as ram or not by the hypervisor (pfn_is_ram()).  However, when
vmcore is read with mmap() no such check is performed.  That can lead to
unpredictable results, e.g.  when running Xen PVHVM guest memcpy() after
mmap() on /proc/vmcore will hang processing HVMMEM_mmio_dm pages creating
enormous load in both DomU and Dom0.

Fix the issue by mapping each non-ram page to the zero page.  Keep direct
path with remap_oldmem_pfn_range() to avoid looping through all pages on
bare metal.

The issue can also be solved by overriding remap_oldmem_pfn_range() in
xen-specific code, as remap_oldmem_pfn_range() was been designed for.
That, however, would involve non-obvious xen code path for all x86 builds
with CONFIG_XEN_PVHVM=y and would prevent all other hypervisor-specific
code on x86 arch from doing the same override.

[fengguang.wu@intel.com: remap_oldmem_pfn_checked() can be static]
[akpm@linux-foundation.org: clean up layout]
Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: NVivek Goyal <vgoyal@redhat.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0692dedc

proc: remove INF macro · 8f053ac1

由 Alexey Dobriyan 提交于 8月 08, 2014

If you're applying this patch, all /proc/$PID/* files were converted
to seq_file interface and this code became unused.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8f053ac1

proc: convert /proc/$PID/hardwall to seq_file interface · d962c144

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d962c144

proc: convert /proc/$PID/io to seq_file interface · 19aadc98

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

19aadc98

proc: convert /proc/$PID/oom_score to seq_file interface · 6ba51e37

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6ba51e37

proc: convert /proc/$PID/schedstat to seq_file interface · f6e826ca

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f6e826ca

proc: convert /proc/$PID/wchan to seq_file interface · edfcd606

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

edfcd606

proc: convert /proc/$PID/cmdline to seq_file interface · 2ca66ff7

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2ca66ff7

proc: convert /proc/$PID/syscall to seq_file interface · 09d93bd6

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

09d93bd6

proc: convert /proc/$PID/limits to seq_file interface · 1c963eb1

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1c963eb1

proc: convert /proc/$PID/auxv to seq_file interface · f9ea536e

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f9ea536e

proc: more "const char *" pointers · cedbccab

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cedbccab

proc: remove proc_tty_ldisc variable · 849d20a1

由 Alexey Dobriyan 提交于 8月 08, 2014

/proc/tty/ldisc appear to be unused as a directory and
it had been always that way.

But it is userspace visible thing.

Cowardly remove only in-kernel variable holding it.

[akpm@linux-foundation.org: add comment]
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

849d20a1

proc: make proc_subdir_lock static · 4dcc03fc

由 Alexey Dobriyan 提交于 8月 08, 2014

Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4dcc03fc

proc: faster /proc/$PID lookup · 335eb531

由 Alexey Dobriyan 提交于 8月 08, 2014

Currently lookup for /proc/$PID first goes through spinlock and whole list
of misc /proc entries only to confirm that, yes, /proc/42 can not possibly
match random proc entry.

List is is several dozens entries long (52 entries on my setup).

None of this is necessary.

Try to convert dentry name to integer first.
If it works, it must be /proc/$PID.
If it doesn't, it must be random proc entry.

Based on patch from Al Viro.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

335eb531

proc: add and remove /proc entry create checks · dbcdb504

由 Alexey Dobriyan 提交于 8月 08, 2014

* remove proc_create(NULL, ...) check, let it oops

* warn about proc_create("", ...) and proc_create("very very long name", ...)
  proc code keeps length as u8, no 256+ name length possible

* warn about proc_create("123", ...)
  /proc/$PID and /proc/misc namespaces are separate things,
  but dumb module might create funky a-la $PID entry.

* remove post mortem strchr('/') check
  Triggering it implies either strchr() is buggy or memory corruption.
  It should be VFS check anyway.

In reality, none of these checks will ever trigger,
it is preparation for the next patch.

Based on patch from Al Viro.
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

dbcdb504

proc: constify seq_operations · ccf94f1b

由 Fabian Frederick 提交于 8月 08, 2014

proc_uid_seq_operations, proc_gid_seq_operations and
proc_projid_seq_operations are only called in proc_id_map_open with
seq_open as const struct seq_operations so we can constify the 3
structures and update proc_id_map_open prototype.

   text    data     bss     dec     hex filename
   6817     404    1984    9205    23f5 kernel/user_namespace.o-before
   6913     308    1984    9205    23f5 kernel/user_namespace.o-after
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ccf94f1b

fs/proc/kcore.c: use PAGE_ALIGN instead of ALIGN(PAGE_SIZE) · 108a8a11

由 Fabian Frederick 提交于 8月 08, 2014

Use mm.h definition.
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

108a8a11

07 8月, 2014 2 次提交

mm: softdirty: respect VM_SOFTDIRTY in PTE holes · 68b5a652

由 Peter Feiner 提交于 8月 06, 2014

After a VMA is created with the VM_SOFTDIRTY flag set, /proc/pid/pagemap
should report that the VMA's virtual pages are soft-dirty until
VM_SOFTDIRTY is cleared (i.e., by the next write of "4" to
/proc/pid/clear_refs).  However, pagemap ignores the VM_SOFTDIRTY flag
for virtual addresses that fall in PTE holes (i.e., virtual addresses
that don't have a PMD, PUD, or PGD allocated yet).

To observe this bug, use mmap to create a VMA large enough such that
there's a good chance that the VMA will occupy an unused PMD, then test
the soft-dirty bit on its pages.  In practice, I found that a VMA that
covered a PMD's worth of address space was big enough.

This patch adds the necessary VMA lookup to the PTE hole callback in
/proc/pid/pagemap's page walk and sets soft-dirty according to the VMAs'
VM_SOFTDIRTY flag.
Signed-off-by: NPeter Feiner <pfeiner@google.com>
Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

68b5a652

mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces · cc7452b6

由 Rafael Aquini 提交于 8月 06, 2014

Historically, we exported shared pages to userspace via sysinfo(2)
sharedram and /proc/meminfo's "MemShared" fields.  With the advent of
tmpfs, from kernel v2.4 onward, that old way for accounting shared mem
was deemed inaccurate and we started to export a hard-coded 0 for
sysinfo.sharedram.  Later on, during the 2.6 timeframe, "MemShared" got
re-introduced to /proc/meminfo re-branded as "Shmem", but we're still
reporting sysinfo.sharedmem as that old hard-coded zero, which makes the
"shared memory" report inconsistent across interfaces.

This patch leverages the addition of explicit accounting for pages used
by shmem/tmpfs -- "4b02108a mm: oom analysis: add shmem vmstat" -- in
order to make the users of sysinfo(2) and si_meminfo*() friends aware of
that vmstat entry and make them report it consistently across the
interfaces, as well to make sysinfo(2) returned data consistent with our
current API documentation states.
Signed-off-by: NRafael Aquini <aquini@redhat.com>
Acked-by: NRik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cc7452b6

05 8月, 2014 4 次提交

proc: Point /proc/mounts at /proc/thread-self/mounts instead of /proc/self/mounts · 344470ca

由 Eric W. Biederman 提交于 7月 31, 2014

In oddball cases where the thread has a different mount namespace than
the thread group leader or more likely in cases where the thread
remains and the thread group leader has exited this ensures that
/proc/mounts continues to work.

This should not cause any problems but if it does this patch can just
be reverted.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

344470ca

proc: Point /proc/net at /proc/thread-self/net instead of /proc/self/net · e8132440

由 Eric W. Biederman 提交于 7月 31, 2014

In oddball cases where the thread has a different network namespace
than the primary thread group leader or more likely in cases where
the thread remains and the thread group leader has exited this
ensures that /proc/net continues to work.

This should not cause any problems but if it does this patch can just
be reverted.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

e8132440

proc: Implement /proc/thread-self to point at the directory of the current thread · 0097875b

由 Eric W. Biederman 提交于 7月 31, 2014

/proc/thread-self is derived from /proc/self.  /proc/thread-self
points to the directory in proc containing information about the
current thread.

This funtionality has been missing for a long time, and is tricky to
implement in userspace as gettid() is not exported by glibc.  More
importantly this allows fixing defects in /proc/mounts and /proc/net
where in a threaded application today they wind up being empty files
when only the initial pthread has exited, causing problems for other
threads.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

0097875b

proc: Have net show up under /proc/<tgid>/task/<tid> · 6ba8ed79

由 Eric W. Biederman 提交于 7月 31, 2014

Network namespaces are per task so it make sense for them to show up
in the task directory.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

6ba8ed79

30 7月, 2014 1 次提交

namespaces: Use task_lock and not rcu to protect nsproxy · 728dba3a

由 Eric W. Biederman 提交于 2月 03, 2014

The synchronous syncrhonize_rcu in switch_task_namespaces makes setns
a sufficiently expensive system call that people have complained.

Upon inspect nsproxy no longer needs rcu protection for remote reads.
remote reads are rare.  So optimize for same process reads and write
by switching using rask_lock instead.

This yields a simpler to understand lock, and a faster setns system call.

In particular this fixes a performance regression observed
by Rafael David Tinoco <rafael.tinoco@canonical.com>.

This is effectively a revert of Pavel Emelyanov's commit
cf7b708c Make access to task's nsproxy lighter
from 2007.  The race this originialy fixed no longer exists as
do_notify_parent uses task_active_pid_ns(parent) instead of
parent->nsproxy.
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

728dba3a

24 7月, 2014 2 次提交

CAPABILITIES: remove undefined caps from all processes · 7d8b6c63

由 Eric Paris 提交于 7月 23, 2014

This is effectively a revert of 7b9a7ec5
plus fixing it a different way...

We found, when trying to run an application from an application which
had dropped privs that the kernel does security checks on undefined
capability bits.  This was ESPECIALLY difficult to debug as those
undefined bits are hidden from /proc/$PID/status.

Consider a root application which drops all capabilities from ALL 4
capability sets.  We assume, since the application is going to set
eff/perm/inh from an array that it will clear not only the defined caps
less than CAP_LAST_CAP, but also the higher 28ish bits which are
undefined future capabilities.

The BSET gets cleared differently.  Instead it is cleared one bit at a
time.  The problem here is that in security/commoncap.c::cap_task_prctl()
we actually check the validity of a capability being read.  So any task
which attempts to 'read all things set in bset' followed by 'unset all
things set in bset' will not even attempt to unset the undefined bits
higher than CAP_LAST_CAP.

So the 'parent' will look something like:
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	ffffffc000000000

All of this 'should' be fine.  Given that these are undefined bits that
aren't supposed to have anything to do with permissions.  But they do...

So lets now consider a task which cleared the eff/perm/inh completely
and cleared all of the valid caps in the bset (but not the invalid caps
it couldn't read out of the kernel).  We know that this is exactly what
the libcap-ng library does and what the go capabilities library does.
They both leave you in that above situation if you try to clear all of
you capapabilities from all 4 sets.  If that root task calls execve()
the child task will pick up all caps not blocked by the bset.  The bset
however does not block bits higher than CAP_LAST_CAP.  So now the child
task has bits in eff which are not in the parent.  These are
'meaningless' undefined bits, but still bits which the parent doesn't
have.

The problem is now in cred_cap_issubset() (or any operation which does a
subset test) as the child, while a subset for valid cap bits, is not a
subset for invalid cap bits!  So now we set durring commit creds that
the child is not dumpable.  Given it is 'more priv' than its parent.  It
also means the parent cannot ptrace the child and other stupidity.

The solution here:
1) stop hiding capability bits in status
	This makes debugging easier!

2) stop giving any task undefined capability bits.  it's simple, it you
don't put those invalid bits in CAP_FULL_SET you won't get them in init
and you won't get them in any other task either.
	This fixes the cap_issubset() tests and resulting fallout (which
	made the init task in a docker container untraceable among other
	things)

3) mask out undefined bits when sys_capset() is called as it might use
~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
	This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.

4) mask out undefined bit when we read a file capability off of disk as
again likely all bits are set in the xattr for forward/backward
compatibility.
	This lets 'setcap all+pe /bin/bash; /bin/bash' run
Signed-off-by: NEric Paris <eparis@redhat.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Dan Walsh <dwalsh@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NJames Morris <james.l.morris@oracle.com>

7d8b6c63

sched: Make task->real_start_time nanoseconds based · 57e0be04

由 Thomas Gleixner 提交于 7月 16, 2014

Simplify the only user of this data by removing the timespec
conversion.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NJohn Stultz <john.stultz@linaro.org>

57e0be04

04 7月, 2014 1 次提交

/proc/stat: convert to single_open_size() · f74373a5

由 Heiko Carstens 提交于 7月 02, 2014

These two patches are supposed to "fix" failed order-4 memory
allocations which have been observed when reading /proc/stat.  The
problem has been observed on s390 as well as on x86.

To address the problem change the seq_file memory allocations to
fallback to use vmalloc, so that allocations also work if memory is
fragmented.

This approach seems to be simpler and less intrusive than changing
/proc/stat to use an interator.  Also it "fixes" other users as well,
which use seq_file's single_open() interface.

This patch (of 2):

Use seq_file's single_open_size() to preallocate a buffer that is large
enough to hold the whole output, instead of open coding it.  Also
calculate the requested size using the number of online cpus instead of
possible cpus, since the size of the output only depends on the number
of online cpus.
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
Cc: Andrea Righi <andrea@betterlinux.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f74373a5

07 6月, 2014 3 次提交

fs/proc/vmcore.c: remove NULL assignment to static · a05e16ad

由 Fabian Frederick 提交于 6月 06, 2014

Static values are automatically initialized to NULL.
Signed-off-by: NFabian Frederick <fabf@skynet.be>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a05e16ad

fs/proc/task_mmu.c: replace seq_printf by seq_puts · 17c2b4ee

由 Fabian Frederick 提交于 6月 06, 2014

Signed-off-by: NFabian Frederick <fabf@skynet.be>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

17c2b4ee

mm: add !pte_present() check on existing hugetlb_entry callbacks · d4c54919

由 Naoya Horiguchi 提交于 6月 06, 2014

The age table walker doesn't check non-present hugetlb entry in common
path, so hugetlb_entry() callbacks must check it.  The reason for this
behavior is that some callers want to handle it in its own way.

[ I think that reason is bogus, btw - it should just do what the regular
  code does, which is to call the "pte_hole()" function for such hugetlb
  entries  - Linus]

However, some callers don't check it now, which causes unpredictable
result, for example when we have a race between migrating hugepage and
reading /proc/pid/numa_maps.  This patch fixes it by adding !pte_present
checks on buggy callbacks.

This bug exists for years and got visible by introducing hugepage
migration.

ChangeLog v2:
- fix if condition (check !pte_present() instead of pte_present())
Reported-by: NSasha Levin <sasha.levin@oracle.com>
Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org> [3.12+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
[ Backported to 3.15.  Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> ]
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d4c54919

05 6月, 2014 1 次提交

mm: softdirty: clear VM_SOFTDIRTY flag inside clear_refs_write() instead of clear_soft_dirty() · c86c97ff

由 Cyrill Gorcunov 提交于 6月 04, 2014

clear_refs_write() is called earlier than clear_soft_dirty() and it is
more natural to clear VM_SOFTDIRTY (which belongs to VMA entry but not
PTEs) that early instead of clearing it a way deeper inside call chain.
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c86c97ff

21 5月, 2014 1 次提交

mm, fs: Add vm_ops->name as an alternative to arch_vma_name · 78d683e8

由 Andy Lutomirski 提交于 5月 19, 2014

arch_vma_name sucks. It's a silly hack, and it's annoying to
implement correctly. In fact, AFAICS, even the straightforward x86
implementation is incorrect (I suspect that it breaks if the vdso
mapping is split or gets remapped).

This adds a new vm_ops->name operation that can replace it. The
followup patches will remove all uses of arch_vma_name on x86,
fixing a couple of annoyances in the process.
Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/2eee21791bb36a0a408c5c2bdb382a9e6a41ca4a.1400538962.git.luto@amacapital.netSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

78d683e8

08 4月, 2014 3 次提交

fault-injection: set bounds on what /proc/self/make-it-fail accepts. · 16caed31

由 Dave Jones 提交于 4月 07, 2014

/proc/self/make-it-fail is a boolean, but accepts any number, including
negative ones.  Change variable to unsigned, and cap upper bound at 1.

[akpm@linux-foundation.org: don't make make_it_fail unsigned]
Signed-off-by: NDave Jones <davej@fedoraproject.org>
Reviewed-by: NAkinobu Mita <akinobu.mita@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

16caed31

vmcore: continue vmcore initialization if PT_NOTE is found empty · c4082f36

由 WANG Chao 提交于 4月 07, 2014

Currently when an empty PT_NOTE is detected, vmcore initialization
fails.  It sounds too harsh.  Because PT_NOTE could be empty, for
example, one offlined a cpu but never restarted kdump service, and after
crash, PT_NOTE program header is there but no data contains.  It's
better to warn about the empty PT_NOTE and continue to initialise
vmcore.

And ultimately the multiple PT_NOTE are merged into a single one, all
empty PT_NOTE are discarded naturally during the merge.  So empty
PT_NOTE is not visible to user space and vmcore is as good as expected.
Signed-off-by: NWANG Chao <chaowang@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Greg Pearson <greg.pearson@hp.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c4082f36

include/linux/crash_dump.h: add vmcore_cleanup() prototype · 82e0703b

由 Rashika Kheria 提交于 4月 07, 2014

Eliminate the following warning in proc/vmcore.c:

  fs/proc/vmcore.c:1088:6: warning: no previous prototype for `vmcore_cleanup' [-Wmissing-prototypes]

[akpm@linux-foundation.org: clean up powerpc, remove unneeded EXPORT_SYMBOL]
Signed-off-by: NRashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

82e0703b